Theorem Proving and Algebra

Joseph A. Goguen (1941–2006)
Department of Computer Science and Engineering
University of California, San Diego
9500 Gilman Drive, La Jolla CA 92093-0114 USA

April 2006

© Joseph A. Goguen, 1990–2006.

Edited by:
Kokichi Futatsugi
Narciso Martí-Oliet
José Meseguer

Text reviewed and commented by:
Kokichi Futatsugi
Daniel Mircea Gaina
Narciso Martí-Oliet
José Meseguer
Masaki Nakamura
Miguel Palomino

Book designed and typeset by:
Alberto Verdejo

Contents
Foreword by Editors xiii

1 Introduction 1
Logic for Foundations versus Logic for Applications 2
1.4 Why Equational Logic and Algebra? 3
1.5 What Kind of Algebra? 4
1.6 Term Rewriting 5
1.7 Logical Systems and Proof Scores 5
1.8 Semantics and Soundness 6
1.9 Loose versus Standard Semantics 7
1.10 Human Interface Design 8
1.11 OBJ 9
1.12 Some History and Related Work 10
1.13 Using this Text 12
1.13.1 Synopsis 13
1.13.2 Novel Features 13
1.14 Acknowledgements 14
(⋆) Parse 52
3.8 Literature 55
(⋆) An Alternative Congruence Rule 72
4.5.2 Discussion 73
4.6 Deduction using OBJ 73
4.7 Two More Rules of Deduction 76
4.8 Conditional Deduction and its Completeness 77
4.8.1 Deduction with Conditional Equations in OBJ3 79
4.9 Conditional Subterm Replacement 81
4.10 (⋆) Specification Equivalence 83
4.11 (⋆) A More Abstract Formulation of Deduction 88
4.12 Literature 89
(⋆) Noetherian Orderings 141
5.8.4 Proving Church-Rosser 149
5.9 (⋆) Relation between Abstract and Term Rewriting Systems 152
5.10 Literature 154
(⋆) Initial Horn Models 250
8.3 First-Order Logic 253
10 Order-Sorted Algebra and Term Rewriting 319
11 Generic Modules 365
12 Unification 367
13 Hidden Algebra 369
14 A General Framework (Institutions) 371
A OBJ3 Syntax and Usage 373
B Exiled Proofs 379
B.1 Many-Sorted Algebra 379
B.2 Rewriting 381
B.2.1 (⋆) Orthogonal Term Rewriting Systems 384
B.3 Rewriting Modulo Equations 389
B.4 First-Order Logic 389
B.5 Order-Sorted Algebra 390
C Some Background on Relations 397
C.1 OBJ Theories for Relations 401
D Social Implications 405
Bibliography 407
Editors’ Notes 423
List of Figures

k = F(A) 154
6.1 Proof for Uniqueness of Quotients 164
A cmos Transistor: n on Left, p on Right 214
7.4 A cmos not Gate 220
7.5 A cmos xor Gate 222
7.6 A cmos nor Gate 223
A cmos Cell 226
8.1 A Ripple Carry Adder 284
9.1 Series Connected Inverters 314
9.2 Parity of a Bit Stream 316
10.1 Visualizing Regularity 324
10.2 Condition (2) of Universal Property of Quotient 337
10.3 Subsort Structure for Number System 355
B.1 Factorization of θ 393

Foreword by Editors

Two of us, Futatsugi and Meseguer, had the privilege of working closely with Joseph Goguen, were influenced by his very creative and fundamental ideas, and, on the occasion of the Festschrift organized in his honor for his 65th birthday in San Diego, California, we wrote:
Joseph Goguen is one of the most prominent computer scientists worldwide. His numerous research contributions span many topics and have changed the way we think about many concepts. Our views about data types, programming languages, software specification and verification, computational behavior, logics in computer science, semiotics, interface design, multimedia, and consciousness, to mention just some of the areas, have all been enriched in fundamental ways by his ideas.
Sadly, Joseph Goguen's life was cut short by a fatal illness a few days after the Festschrift Symposium in his honor, which he was still able to attend. He was at that time working on Theorem Proving and Algebra (TPA), a long-term project still unfinished, yet quite advanced. The TPA book provides the definitive mathematical foundation for algebraic theorem proving. Professor Goguen also presents formal methods with the OBJ language system, which is a unique and important feature of the book.

We are convinced that Joseph Goguen's ideas in the TPA book have a fundamental and lasting value and should be made available to the research community. Furthermore, as we explain below, they have influenced subsequent work in several algebraic languages originating in the OBJ language on which two of us, Futatsugi and Meseguer, worked closely with Joseph Goguen, namely, CafeOBJ and Maude. However, the TPA book should be his book, with no efforts to complete parts of the manuscript that were unfinished or in any way modify its contents.

Our approach to this task has been the usual one in editing any part of the nachlass of a scholar: to make only small corrections of typos or small mistakes that, clearly, the author would himself have wished to be done; and to add a few explanatory editorial notes (clearly marked as such, and different from the text itself) to help the reader better understand some specific points in the text: again, making the best guess possible about what the author himself might have wished to add as explanations, given the unfinished nature of the text. Both the small corrections and the editorial notes are based on careful revisions of the original text by the editors with the additional help of Daniel Mircea Gaina, Masaki Nakamura, and Miguel Palomino.
Impact on CafeOBJ and Maude
CafeOBJ (https://cafeobj.org/) and Maude (https://maude.cs.uiuc.edu/) are two sibling languages of OBJ which draw significant inspiration from Joseph Goguen's work presented in the TPA book. In what follows we explain several ways in which the ideas in the TPA book have influenced further developments in both CafeOBJ and Maude.
CafeOBJ inherits from OBJ distinctive features such as user-defined mix-fix syntax, subtyping by ordered sorts, a module system with parameterized module expressions, conditional rewriting with associative/commutative matching, loose and tight (or initial) semantics for modules, and theorem proving with proof scores.

CafeOBJ adds to OBJ new features such as behavioural (or observational) abstraction with hidden algebra, rewriting logic à la Maude for specifying transition systems, and their combinations with order-sorted algebra. This multiparadigm approach has a mathematical semantics based on multiple institutions. Some theorem-proving capabilities are also added, including behavioral rewriting, observational coinduction, and built-in search predicates.

Transition systems can be specified with observational abstraction or with rewriting rules in CafeOBJ. The observational style is more abstract/algebraic, and there is no need to determine the state configurations that are instead needed to define state transitions via rewriting rules. The built-in search predicates facilitate verification of rewriting-based transition systems. Both styles have their own merits and it is worthwhile to support both in CafeOBJ.

In the TPA book's introduction (1 Introduction) Professor Goguen states:
We do not pursue the lofty goal of mechanizing proofs like those of which mathematicians are justly so proud; instead, we seek to take steps towards providing mechanical assistance for proofs that are useful for computer scientists in developing software and hardware. This more modest goal has the advantage of both being achievable and having practical benefits.

He continues (1.7 Logical Systems and Proof Scores):
The first step of our approach is to construct proof scores, which are instructions such that when executed (or "played"), if everything evaluates as expected, then the desired theorem is proved. A proof score is executed by applying proof measures, which progressively transform formulae in a language of goals into expressions which can be directly executed. We will see that equational logic is adequate for implementing versions of first- and second-order logic in this way, as well as many other logical systems.

and (1.8 Semantics and Soundness):
This text justifies proof measures for a logical system by demonstrating their soundness with respect to the notions of model and satisfaction for that system. In this sense, it places Semantics First! In fact, users are primarily concerned with truth; they want to know whether certain properties are true of certain models, which may be realized in software and/or hardware. From this point of view, proof is a necessary nuisance that we tolerate only because we have no other way to effectively demonstrate truth. Moreover, it is usually easier and more intuitive to justify proof measures on semantic rather than syntactic grounds.
Inspired by the above-stated OBJ proof score approach and its potential for realizing well-structured and reusable proof documents, theorem proving with proof scores has been pursued extensively in CafeOBJ. Many case studies have been done in a variety of application areas, and proof scores have been found to be usable for practical theorem proving.

Constructor-based algebra and its proof calculus, which are not included in the TPA book, were formalized as a theoretical foundation for proof scores in CafeOBJ. The Constructor-based Inductive Theorem Prover (CITP) was first implemented in the Maude metalevel, and its variant is now incorporated into CafeOBJ as the Proof Tree Calculus (PTcalc) subsystem. The "Semantics First!" principle played an important role in the design and implementation of PTcalc. As a result, PTcalc encourages model-based analyses and proofs of the properties to be verified, which is an important merit of algebraic theorem proving.

The harmony of (1) model satisfaction semantics, (2) equational deduction, and (3) rewriting execution constitutes the core of algebraic theorem proving, and the TPA book provides the most reliable and comprehensive account of it. A proof score applies proof measures by executing equations as rewriting rules to prove model satisfaction. The harmony of semantics, deduction, and execution makes it possible to formalize effective and transparent proof measures. Major proof measures in CafeOBJ include (i) case split with exhaustive equations and (ii) well-founded induction via term refinement.

CafeOBJ's module system is basically the same as OBJ's, except for succinct notation for inline view definitions and a gradually developed, reliable, and efficient implementation. The module system is an important feature of algebraic language systems and its power has been well appreciated.
CafeOBJ's module system serves not only for constructing specifications and proof scores but also for preparing libraries of generic data structures and proof measures. The PTcalc subsystem of CafeOBJ, where proof nodes are modules, depends on the module system in a significant way.
Maude is a language based on rewriting logic, a simple computational logic to specify and program concurrent systems as initial models of rewrite theories. In Maude they are specified as system modules of the form mod FOO is (Σ, E, R) endm, where the rewrite theory (Σ, E, R) specifies a concurrent system whose concurrent states belong to the algebraic data type (initial algebra) T_{Σ/E} (with Σ a typed signature of function symbols and E a set of equations) and whose local concurrent transitions are specified by the rewrite rules R. When R = ∅, a rewrite theory (Σ, E, R) becomes an equational theory (Σ, E). In Maude this gives rise to a sublanguage of functional modules of the form fmod BAR is (Σ, E) endfm with initial algebra semantics, which is a superset of OBJ, because Maude is based on the more expressive membership equational logic, which contains OBJ's order-sorted equational logic as a special case.

Maude naturally extends OBJ, because membership equational logic is itself a sublogic (the case R = ∅) of rewriting logic. The practical meaning of this extension is that equational logic is well-suited to specify deterministic systems, whereas rewriting logic naturally specifies non-deterministic and concurrent systems.
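This division of labor between equations E (deterministic simplification to a canonical form) and rules R (non-deterministic transitions explored by search) can be illustrated with a small folklore vending-machine example of the kind often used to introduce rewriting logic. What follows is a plain Python sketch, not Maude syntax; the state representation, the particular rules ("a dollar buys a cake, or an apple with a quarter in change"), and all function names are our own illustrative choices:

```python
from collections import Counter, deque

# States are multisets of tokens, canonicalized as sorted tuples so they hash.
# '$' = dollar, 'q' = quarter, 'a' = apple, 'c' = cake.

def simplify(state):
    """Apply the equation E: four quarters equal one dollar."""
    c = Counter(state)
    c['$'] += c['q'] // 4
    c['q'] %= 4
    return tuple(sorted(c.elements()))

def successors(state):
    """Apply the rules R: $ => cake, and $ => apple + quarter in change."""
    c = Counter(state)
    nexts = []
    if c['$']:
        spent = c.copy()
        spent['$'] -= 1
        for purchase in (('c',), ('a', 'q')):
            n = spent.copy()
            n.update(purchase)
            nexts.append(simplify(tuple(n.elements())))
    return nexts

def reachable(start, goal):
    """Breadth-first reachability analysis over R, with states kept in
    E-canonical form throughout."""
    start, goal = simplify(start), simplify(goal)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

The equation is applied functionally, always yielding a single canonical state, while the rules generate a branching transition system that must be searched; this mirrors the deterministic/non-deterministic contrast drawn above.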
Furthermore, Maude has the following additional features: (i) reachability analysis; (ii) LTL model checking; (iii) a strategy language to guide the execution of rewrite theories; (iv) concurrent object-oriented system specification, including external objects that allow Maude objects to interact with any other entities and support distributed implementations; (v) reflection, thanks to the existence of a universal theory that can simulate deduction in all other theories (including itself) and represent functional and system modules as data for meta-programming purposes; and (vi) symbolic computation features such as semantic unification, variants, symbolic reachability analysis, and SMT solving.

The ideas in Joseph Goguen's TPA book have stimulated further developments in the formal verification of Maude modules, including the following: (1) the use of reflection and of symbolic methods to automate constructor-based inductive theorem-proving verification of functional modules; (2) tools to check the confluence, sufficient completeness, and termination of functional modules; and (3) theorem proving of properties of system modules (rewrite theories).

All the above-described advances in CafeOBJ and Maude illustrate some of the ways in which Joseph Goguen's TPA book has stimulated further developments in algebraic verification. But they do not at all exhaust the possible ways in which this fundamental book could stimulate other readers: we are convinced that it will continue to stimulate us and others. We have undertaken the task of making it available to the research community, as it was Joseph Goguen's desire, precisely for this purpose.

December 2020
K. Futatsugi
N. Martí-Oliet
J. Meseguer
1 Introduction

This book can be seen either as a text on theorem proving that uses techniques from general algebra, or else as a text on general algebra illustrated and made concrete by practical exercises in theorem proving. This introductory chapter provides background and motivation, though some points may only become fully clear in light of subsequent chapters, in part because of terminology not yet defined. Section 1.13.1 is a synopsis. The book considers several different logical systems, including first-order logic, Horn clause logic, equational logic, and first-order logic with equality. Similarly, several different proof paradigms are considered. However, we do emphasize equational logic, and for simplicity we use only the OBJ3 software system, though it is used in a rather flexible manner.

We do not pursue the lofty goal of mechanizing proofs like those of which mathematicians are justly so proud; instead, we seek to take steps towards providing mechanical assistance for proofs that are useful for computer scientists in developing software and hardware. This more modest goal has the advantage of both being achievable and having practical benefits.
You can think of theorem proving as game playing where there are very precise rules, initial positions, and goals; you win if you reach the goal from the initial position by correctly following the rules. Different logical systems have different rules and different notions of position, while different problems in the same system have different goals and/or different initial positions. Playing is called inference or deduction, a move is a step of inference or deduction, and positions are (usually) sets of formulae; the formulae in an initial position may be called axioms, assumptions, or hypotheses. Thus a logical system consists of a language whose sentences (which are formulae) are used to state goals and axioms, plus some rules of inference for deriving new sentences from old ones. There may also be notions of model and of satisfaction, and these will play a key role in this text. Models relate to rules of inference much as a chessboard with its pieces relates to the rules of chess; in this setting, satisfaction means that a sentence accurately describes a given position.

The most classical example is Euclidean plane geometry. More recently, first-order predicate calculus (usually called "First-Order Logic" and often abbreviated "FOL") has been the most important game in town, but there are many, many other logical systems, a few of which are seen later in this book.
Mechanical theorem proving has many practical applications:

• The verification of digital hardware circuits, especially VLSI, where the economic cost of design errors can be enormous.

• The design and verification of so-called critical systems, such as nuclear power plants, heart pacemakers, and aircraft guidance systems, where failure can endanger human life or property on a large scale.

• Tools to make programming more reliable and robust, for example, to help with debugging, modifying, and optimizing programs, based on their semantics.

• The technology of theorem proving has been used in a number of modern programming languages that are based upon logic, including OBJ (which is described in Section 1.11 and used throughout this text) and Prolog.

• The technology of theorem proving has also been used in systems for robot vision, motion planning, drug discovery, DNA sequencing, and many similar applications.

Logic for Foundations versus Logic for Applications
Logic has been mainly concerned with the foundations of mathematics since the rude shock of the paradoxes discovered around the turn of the twentieth century by Russell and others. Such foundational work tends to simplify notation, axioms, and inference rules to the bare minimum, in order to facilitate the study of meta-mathematical issues such as consistency and completeness. But logic is used in applications for completely different reasons. In particular, computer scientists and engineers build hardware and software systems that are actually used in the real worlds of science, commerce, and technology, for which very different approaches to logic are more appropriate. In particular, the logical systems used for applications are often far more complex than those used in foundations; there may be many more symbols, axioms, and rules; and some data types may be "built in," such as natural numbers or lists. The ability to add new definitions and notations and then use them is also important, and some applications even require the use of more than one logical system.

1.4 Why Equational Logic and Algebra?

This text gives a privileged role to general algebra (also called universal algebra) and to its logic, which is equational logic. One reason for this is that equational logic is the logic of substituting equals for equals, which is a basic common denominator among many different logics. Also, equational logic is attractive as the foundation for a theorem prover because of the simplicity, familiarity, and elegance of equational reasoning, and because there is a great deal of relevant theory, including the extensive literature on abstract data types. Moreover, equational reasoning can be implemented efficiently by term rewriting, which can then serve as a workhorse for a general-purpose theorem prover. In addition, many other interesting and important logics can be embedded within or built on top of equational logic, as we will see.

(Footnote: The terms "algebra" and "equational logic" in their narrow senses refer to models and deductions, respectively, but in their broad sense, both terms refer to both models and deductions.)

We will also see that equational logic is ideal as a meta-logic for describing other logical systems, because the syntax of a logic is a free algebra, while the rules of deduction can be implemented by (conditional) rewrite rules. Thus, we can use equational logic at the meta level (for describing logical systems and justifying proof scores), as well as at the object level (for proving theorems).

Any computable function over any computable data structure can be defined in equational logic [11], and order-sorted equational logic, which adds subsorts [82], extends this to encompass the partial computable functions. Thus, equational logic is sufficiently powerful to describe any standard model of interest. Although not every property that one might want to prove about some real system can be expressed using just equational logic, much more can be expressed than might at first be thought. In particular, we will see that many typical results about higher-order functional programs and most of the usual digital hardware verification examples fall within this setting, and it seems reasonable to use the simplest possible logic for any given application. However, we do not restrict ourselves to the most traditional kind of equational logic, but rather extend it in various ways, as discussed further in the next section; also, Chapter 8 considers first-order logic.
1.5 What Kind of Algebra?

This text does not view algebra as having a single all-encompassing logic, but rather as having a family of related logics, ranging from the classical unsorted case toward first-order logic with equality, and even second-order logic. The following are brief character sketches of certain versions that are developed in detail later on:
• Many Sorts. Computing applications typically involve more than one sort of data, and it can be awkward, or even impossible, to treat these applications adequately with unsorted algebra. Still, it is not unusual to see papers that treat only the unsorted case, perhaps with a remark that "everything generalizes easily." Although this is true in essence, Section 4.3 shows that significant difficulties can arise if the generalization is not done carefully.

• Conditional. Many applications involve equations that are only true under certain conditions; examples include defining the transitive closure of a relation (see Appendix C) as well as many abstract data types (see Chapter 6), and the rules of inference for first-order logic (see Chapter 8). See Chapter 3 for details.

• Overloaded. Many computing applications involve overloaded operation symbols, where arguments may have different sort patterns. Examples include overloading in ordinary programming languages (such as Ada), polymorphism in functional programming languages (such as ML) and the λ-calculus, and overwriting in object-oriented languages (such as Smalltalk and Eiffel). See Chapter 2 for more detail.

• Ordered Sorts. This rather substantial extension of many-sorted algebra involves having a partial ordering on the set of sorts, called the subsort relation, which is interpreted semantically as a subset relation on the sets that interpret the sorts. This has many interesting applications, including exception handling and partially defined functions.

• Second Order. Another substantial extension allows quantification over functions as well as over elements. This has significant applications to digital hardware verification. Surprisingly, much of general algebra extends without difficulty, as shown in Chapter 9.

• Additional Connectives. The basic formulae of equational logic are universally quantified equations, but we can build more complex formulae from these, using conjunction, implication, disjunction, negation, and existential quantification. Satisfaction of such formulae can be defined in terms of the satisfaction of their constituents. Chapter 8 gives the details.

• Hidden Sorts. Computing applications typically involve states, and it can be awkward to treat these applications in a purely functional style. Hidden sorted algebra substantially extends ordinary algebra by distinguishing sorts used for data from sorts used for states, calling them respectively visible and hidden sorts, and it changes the notion of satisfaction to behavioral (also called observational) satisfaction, so that equations need only appear to be satisfied under all the relevant experiments. Hidden algebra is powerful enough to give a semantics for the object paradigm, including inheritance and concurrency. See Chapter 13 for details.

Putting all this together gives possibilities that are far from classical general algebra. The major difference from the usual first- and second-order logic is that the only relation symbol used is equality. However, other relations can be represented by Boolean-valued functions.
1.6 Term Rewriting

A significant part of this text is devoted to explaining and using term rewriting. The general idea can be expressed as follows: terms (or expressions) over a fixed syntax Σ form an algebra. A rewrite rule is a rule for rewriting some terms into others. Each rewrite rule has a left side (L), which is an expression defining a pattern that may or may not match a given expression. A match of an expression E to L consists of a subexpression E′ of E and an assignment of values to the variables in L such that substituting those values into L yields E′.

A rewrite rule also has a right side (R), which is a second expression containing only variables that already occur in L. If there is a match of L to a subexpression of a given expression, then the matched subexpression is replaced by the corresponding substitution instance of R. This process is called term rewriting, term reduction, or subterm replacement, and is the basis for the OBJ system (see Section 1.11) used in this text.

1.7 Logical Systems and Proof Scores

A very basic question in theorem proving is what logical system to use. The dominant modern logical system is first-order predicate logic, but advances in computer science have spawned a huge array of new logics, e.g., for database systems, knowledge representation, and the semantic web; these include variants of propositional logic, modal logic, intuitionistic logic, higher-order logic, and equational logic, among many others.

The view of this text is that the choice of logical system should be left to the user, and that a mechanical theorem prover should be a
basic engine for applying rewrite rules, so that a wide variety of logical systems can be implemented by supplying appropriate definitions to the rewrite engine. This view is influenced by the theory of institutions [67], and also resembles that of the Edinburgh Logical Framework [98, 1], Paulson's Isabelle system [148], and the use of Maude as a meta-tool for theorem proving [29], in avoiding commitment to any particular logical system. However, it differs in using equational logic and term rewriting as a basis.

The first step of our approach is to construct proof scores, which are instructions such that when executed (or "played"), if everything evaluates as expected, then the desired theorem is proved. A proof score is executed by applying proof measures, which progressively transform formulae in a language of goals into expressions which can be directly executed. We will see that equational logic is adequate for implementing versions of first- and second-order logic in this way, as well as many other logical systems.

This approach can be further mechanized by implementing the meta level of goals and proof measures in OBJ itself, and providing a translator to an object level of computation, also in OBJ. If each proof measure is sound, and the computer implementations are correct, then each resulting proof score is guaranteed to be sound, in the sense that if it executes as desired, then its goal has been proved. But the converse, that a proof score will prove its goal (if it is true), does not hold in general. In addition, the advanced module facilities of OBJ can be used to express the structure of proofs. The Kumo [74, 83, 73] and 2OBJ [170, 84] systems provide even more direct support for such an approach. Equational logic is demonstrably adequate for our purpose, because the syntax, rules of deduction, and rules of translation of a logical system must be computable, and therefore (see Section 1.4) can be expressed with equations.
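The notions of match, substitution, and subterm replacement from Section 1.6, and the way a proof score checks a goal by reduction, can be sketched in a few lines of Python. This is an illustrative toy, not how OBJ3 is implemented; the term representation (tuples headed by an operation name, with variables as capitalized strings), the leftmost-outermost strategy, and all function names are our own choices:

```python
def is_var(t):
    """Variables are strings beginning with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst=None):
    """Match `pattern` against `term`; return a substitution dict or None."""
    if subst is None:
        subst = {}
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and pattern[0] == term[0] and len(pattern) == len(term)):
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

def substitute(t, subst):
    """Instantiate the variables of t; a right side only uses left-side variables."""
    if is_var(t):
        return subst[t]
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])

def rewrite_step(term, rules):
    """One subterm replacement, trying the root first; None if no rule applies."""
    for lhs, rhs in rules:
        s = match(lhs, term)
        if s is not None:
            return substitute(rhs, s)
    for i in range(1, len(term)):
        new = rewrite_step(term[i], rules)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    return None

def normalize(term, rules):
    """Rewrite until no rule applies (assumes the rules terminate)."""
    while True:
        new = rewrite_step(term, rules)
        if new is None:
            return term
        term = new

def proves(goal_lhs, goal_rhs, rules):
    """A one-line 'proof score': both sides reduce to the same normal form."""
    return normalize(goal_lhs, rules) == normalize(goal_rhs, rules)

# Peano addition defined by two rewrite rules (equations read left to right).
PLUS = [
    (("+", ("0",), "N"), "N"),
    (("+", ("s", "M"), "N"), ("s", ("+", "M", "N"))),
]
```

For example, `proves(("+", ("s", ("s", ("0",))), ("s", ("0",))), ("s", ("s", ("s", ("0",)))), PLUS)` reduces both sides of the goal 2 + 1 = 3 and compares their normal forms, which is the "if everything evaluates as expected" check described above.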
1.8 Semantics and Soundness

This text justifies proof measures for a logical system by demonstrating their soundness with respect to the notions of model and satisfaction for that system. In this sense, it places Semantics First!

In fact, users are primarily concerned with truth; they want to know whether certain properties are true of certain models, which may be realized in software and/or hardware. From this point of view, proof is a necessary nuisance that we tolerate only because we have no other way to effectively demonstrate truth. Moreover, it is usually easier and
more intuitive to justify proof measures on semantic rather than syntactic grounds.

The above slogan has many further implications. For example, it suggests that to define some "expressive" (i.e., complex) syntax, write some code that is driven by it, and then call the result a "theorem prover" if it usually prints TRUE when you want it to, is unwise. Similarly, it may not be a good idea to "give a semantics" for some system after it has already been implemented (or designed), because such a semantics may well be too complex to be of much use. Finally, it is dangerous to try to combine several logics, unless precise and reasonably simple notions of model and satisfaction are known for the combination. Unfortunately, many theorem-proving projects have failed to observe these basic rules of logical hygiene. Nevertheless, as emphasized by the metaphor in Section 1.1, theorem proving is syntactic manipulation, and hence syntax is fundamental and unavoidable for the enterprise of this book. We can state our view in a balanced way as follows:

Semantics is fundamental at the meta level (of correctness for proof rules), while syntax is fundamental at the object level (of actual proofs).
1.9 Loose versus Standard Semantics

An important distinction concerns the intended semantics of a logical system: is it meant to capture formulae that are true of all models of some set of axioms, or just formulae that are true of a fixed standard model of those axioms? Let us call the first case loose semantics, and the second case standard (or tight) semantics. The usual first-order logic has loose semantics, and captures properties that are true of all models of some given axioms. This fits many applications. For example, a logic intended to capture group theory must be loose, since it must apply to all groups. On the other hand, a logic for arithmetic should capture properties of a single standard model, consisting of the numbers with their usual operations.

A completeness theorem for a logical system says that all the formulae that are true of all the intended models of a given set of formulae (the axioms) can be proved from those formulae. There are completeness theorems for some well-known loose logics, including first-order logic and equational logic. However, completeness cannot in general be expected for logics under standard semantics, because the class of formulae true of a fixed model is not in general recursively enumerable; Gödel's famous incompleteness theorem shows that this holds even for the natural numbers. On the other hand, the familiar and powerful techniques for induction are not (usually) sound for loose semantics, but only for standard semantics. It seems that completeness has been overemphasized in the theorem-proving literature, because many computer science applications actually concern properties of a standard model of some set of formulae, rather than properties of all its models.

This text treats theorem proving for both standard and loose semantics, as well as combinations of the two. Given a set A of formulae and a formula e, we will write A ⊨ e to indicate that e is true in all models of A, and A ⊫ e to indicate that e is true in the standard model of A. The already well-developed theory of abstract data types is helpful in studying the relation ⊫. In particular, we will see that formalizing the notion of "standard model" with initiality leads to simple proofs at both the object and the meta levels, i.e., of properties of standard models and of theorems about such proofs. We also consider loose extensions of standard models, standard extensions of models of a loose theory, and so on recursively.

(Footnote: These points should not be seen as rejecting constructivist approaches like that of Martin-Löf and others who identify the syntax and semantics of logical systems. In fact, there is much to recommend such approaches, especially for the foundations of mathematics. But note that what we are calling soundness problems reappear in this context as consistency problems.)

(Footnote: Unless perhaps you are prepared to go through several iterations, modifying both the implementation and the logic until they are consistent and elegant.)

1.10 Human Interface Design

Logical systems are so very precise and detailed that human beings often find it difficult and/or unpleasant to use them. Usually mathematics is conducted in a quite informal way, with only infrequent reference to any underlying logical system, much as the inhabitants of a house usually ignore its foundations [51].
Indeed, unlike a house, it is not clear that mathematics really needs foundations, though they may help you sleep better at night.

Computers can greatly lighten the burden of rigorously following the rules of a logical system, and fully automatic theorem proving attempts to entirely eliminate the pain of applying rules, although of course users must still state their axioms and goals precisely. In fact, fully automatic theorem proving has not been very successful, and no new theorems of real interest to mathematics have been proved in this way. One difficulty is that users often have to trick some built-in heuristics into doing what they want. However, fully automatic theorem proving remains an important area for research. An approach at the other extreme is proof checking, where rules are explicitly invoked one at a time by the user, and then actually applied by the machine. This can be quite tedious, but it can detect many errors, and even correct certain errors.

This text avoids both these extremes, taking the view that humans should do the interesting parts of theorem proving, such as inventing proof strategies and inductive hypotheses, while machines should do the tedious parts, mechanically applying sets of rewrite rules (see Section 1.6 above) that lead closer to subgoals. An important advantage of this approach is that partially successful proofs may return useful information about what to try next; for example, the output may suggest a new lemma that would further advance the proof. The reader who does the exercises will see many examples of this phenomenon. A variant of this approach provides a tactic language in which possibly quite complex combinations of proof measures can be expressed, and a tactic interpreter to apply these compound tactics; many theorem-proving systems take this approach, including HOL [93], Isabelle [148], and Kumo [74, 83, 73].

Much research has been done on the use of graphics in theorem proving.
But we have found that for even modest-size proofs, graphical representations of proof trees are not only unhelpful, but actually obstructive and confusing [64, 65]. Instead, we recommend structuring proofs by using modules and other features of OBJ, as illustrated extensively in what follows; we will see that this also supports proof reuse.

1.11 OBJ
OBJ [47, 90] integrates specification, prototyping, and verification into a single system, with a single underlying logic, which is (first-order conditional order-sorted) equational logic. OBJ3, which is the implementation of OBJ used in this text, allows a module P to be either:

1. an object, whose intended interpretation is a standard model of P; or
2. a theory, whose intended interpretation is the variety of all models of P.

In OBJ3, objects are executable, while theories describe properties; both have sets of equations as their bodies, but the former have standard (initial algebra) semantics and are executed as rewrite rules, while the latter have loose semantics, which can be “executed” in a loose sense by applying rules of inference to derive new equations. Although theories have been studied more extensively in the theorem-proving literature, they often play a lesser role in practice, because most real applications require particular data structures and operations upon them. (More precisely, by a standard model we mean an initial model of P, in a sense made precise in Theorem 3.2.1 of Chapter 3. Although much of the terminology in this paragraph may be unfamiliar now, it is all defined later on.)

OBJ also has generic modules, module inheritance, and module expressions which describe interconnections of modules and actually create the described subsystem when evaluated. The OBJ module system is a practical realization of ideas originally developed in the Clear language [22, 23]; these ideas have directly influenced the module systems of the ML, Ada, C++, and Modula-2 languages. OBJ’s user-definable mixfix syntax allows users to tailor their notation to their application, and LaTeX [119] symbols can be used to produce pretty output.
Rewriting modulo associativity and/or commutativity is also supported, and can eliminate a great deal of tedium, and the subsorts provided by order-sorted algebra (see Chapter 10) support error messages and exception handling in a smooth and convenient way.

OBJ can be used directly as a theorem prover for equational logic only because its semantics is the semantics of equational logic. Every OBJ computation is a proof of some theorem. It is not true that any other functional programming language would do just as well. Although most functional languages have an operational semantics that is based on higher-order rewriting, they do not have a declarative, logical semantics for all of their features. It is also important that the OBJ module facility has a rigorous semantics, as explained in Chapter 11.

OBJ began as an algebraic specification language at UCLA about 1976, and has been further developed at SRI International, Oxford [47, 90], UCSD, and several other sites [26, 33, 166] as a declarative specification and rapid prototyping language; Appendix A gives more detail on OBJ, for which see also [90, 77]. The systematic use of OBJ as a theorem prover stems from [59]. The latest members of the OBJ family are CafeOBJ [43], Maude [30], and BOBJ [76, 75]. These systems go beyond OBJ3 in significant ways which are not needed for this text (rewriting logic for Maude, hidden algebra for BOBJ, and both of these for CafeOBJ); they could be used instead of OBJ3, though some syntactic changes would be needed. This book provides everything about the syntax and semantics of OBJ3 needed for theorem proving, including practical details on getting started (for which see Appendix A).
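The claim that every OBJ computation is a proof can be made concrete: an object’s equations, oriented left to right as rewrite rules, compute normal forms, and each rewrite step is a small equational deduction. The following Python sketch is not OBJ itself; the term encoding, the rule format, and all helper names are our own illustrative choices. It reduces Peano-style terms with the usual equations for addition.

```python
# Peano terms as nested tuples: ("0",), ("s", t), ("+", t, u).
# Rule variables are strings beginning with "?". All names here are
# illustrative; this is a toy rewriter, not OBJ.

ZERO = ("0",)

def s(t): return ("s", t)
def plus(t, u): return ("+", t, u)

# Equations for addition, oriented left-to-right as rewrite rules.
RULES = [
    (plus("?n", ZERO), "?n"),                    # n + 0 = n
    (plus("?n", s("?m")), s(plus("?n", "?m"))),  # n + s(m) = s(n + m)
]

def match(pattern, term, subst):
    """Extend subst so that pattern instantiates to term, or return None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(pattern, subst):
    if isinstance(pattern, str) and pattern.startswith("?"):
        return subst[pattern]
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])

def rewrite_once(term):
    """One innermost rewrite step; None if term is in normal form."""
    for i, arg in enumerate(term[1:], start=1):
        new = rewrite_once(arg)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    for lhs, rhs in RULES:
        subst = match(lhs, term, {})
        if subst is not None:
            return instantiate(rhs, subst)
    return None

def normal_form(term):
    while True:
        new = rewrite_once(term)
        if new is None:
            return term
        term = new

# 2 + 1 reduces to 3 in Peano notation, i.e., s(s(s(0))).
assert normal_form(plus(s(s(ZERO)), s(ZERO))) == s(s(s(ZERO)))
```

An innermost-first strategy is used here, though for these two rules any strategy reaches the same normal form; the chain of rewrite steps is exactly a proof in equational logic that 2 + 1 = 3.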
1.12 Some History and Related Work

There is such a large literature on theorem proving that an adequate survey would take several volumes. Consequently, we limit the following discussion to systems that have particularly influenced the approach in this book, or that seem related in some other significant way.

Much of the early work on mechanical theorem proving was done in the context of so-called “Artificial Intelligence” (“AI”), and much of it was not very rigorous. One inspiration was to give computers some ability to reason, and then see how far this could be extended and applied; there were even dreams of replacing mathematicians by programs. The collection
Theorem Proving: After 25 Years [14] summarizes the state of theorem proving as of about 1984. The papers by Loveland and by Wang in this collection contain many interesting historical details. For a long time, the dominant approach for first-order logic was resolution, introduced by Alan Robinson in 1965 [157]. This technique for loose semantics is most suitable for fully automatic theorem proving, because its representations make it hard for users to understand or use to guide proofs. Chang and Lee [27] give a readable exposition, but Leitsch [122] is more precise and up to date. This tradition is well represented by the OTTER [132] and Gandalf [174] systems.

A wide range of interesting results were first verified with the Boyer-Moore Nqthm prover [19], using clever heuristics for induction. Its basic logic is untyped first-order universally quantified with function symbols and with equality as its only predicate; users can define new data structures by induction and recursion. Users influence its behavior by requesting intermediate results to be proved in a certain order, since it recalls what it has already proved; users can also set certain parameters. However, this can be a very awkward way to control the prover. Its successor system, ACL2, is interactive instead of automatic [112].

Another tradition arises from Milner’s LCF system [95, 147], in which a higher-order strongly typed functional language (namely ML) is used for writing tactics that guide the application of elementary steps of deduction for achieving goals. Soundness is guaranteed by having a type “thm” that can only be inhabited by formulae that have actually been proved. One problem with this approach is that, because a proof is described by a single functional expression, for difficult problems this expression can be hard to understand and to edit.
Gordon’s HOL system [93], which has been successful for hardware verification, is an important development in this tradition; HOL is now commonly run on Isabelle [148]. Work by Stickel on the Prolog Technology Theorem Prover [171] should also be mentioned, as should burgeoning work based on type theory, e.g., the ELF [98, 1] and IPE [156] systems from Edinburgh, and Coq from INRIA; another important development in this area is NuPRL [35]. There is also much work using term rewriting techniques generalizing Knuth-Bendix completion, some of which is discussed in
Chapter 12, although we have found that inductive proofs often work better in practice.

Every approach mentioned above has some drawbacks, and so does the one in this book. In fact, every approach must be unsatisfying in some ways, because general theorem proving is recursively unsolvable. Even though there is a completeness theorem for first-order logic with loose semantics, the problem is still only semi-decidable; there is no way to know whether an attempt to prove a given formula will ever halt, although if there is a proof, it will eventually be found (unless the available memory is exceeded). Theorem proving for standard semantics is not in general even semi-decidable, so that any automatic prover for this domain will necessarily fail to find proofs of some true formulae, even if given arbitrarily much time. However, such results no more prevent machines from proving theorems than they do humans.

The view of theorem proving as very strict “game playing” (see Section 1.1) comes from the formalist view of the foundations of mathematics advocated by Hilbert and by Whitehead and Russell, among others. In this view, mathematics is a purely formal activity, rigidly governed by sets of rules. This view is opposed by several other schools of the philosophy of mathematics, including idealists (e.g., Platonists) and intuitionists, both of whom argue that mathematics has some inherent meaning. The mechanization of theorem proving is necessarily consistent with a formalist view, because computer manipulations are necessarily formal; but that does not mean that one has to believe the formalist position to engage in mechanical theorem proving, and in fact the author of this book does not accept the formalist position, but rather subscribes to a view like that of Wittgenstein, that mathematical proofs are a kind of (socially situated) language game [51].
In any case, formal semantics cannot capture the sort of meaning that Platonists and intuitionists talk about, because meaning in their sense is inherently non-formal. Despite all this, we will see that formal semantics can be very useful.

Some discussion of the history and literature of algebra is given in Sections 2.8 and 3.8.
1.13 Using this Text

This text was developed for advanced undergraduates (that is, third or fourth year), but it can also be used at the graduate level by including more of the difficult material. It can be used in courses on general algebra, on the practice of mechanical theorem proving, and on the mathematical foundations of theorem proving. The second choice (which was taken at Oxford) would give precedence to the OBJ exercises, while the first and third choices would give precedence to the mathematics; alternatively, all three goals could be pursued at once. This text could perhaps be used as the basis for a course on discrete structures, but for this purpose it should be supplemented. In any case, the exercises that use OBJ should be done if at all possible, because they give a much more concrete feeling for the more theoretical material; OBJ is available by ftp (see Appendix A for details).

The choice of what to include (or to develop, in the case of new results) has always preferred material that is directly useful in computer science, especially theorem proving, although this does sometimes require including other material that is not itself directly useful. Propositional and predicate logic, and basic set theory, are necessary for understanding this text; some prior exposure to algebra would be helpful, as would some experience with computing. Appendix C reviews some basic mathematical concepts, such as transitive closure, and many others are reviewed within the body of the text.

Results, definitions, and examples are numbered on the same counter, which is reset for each section; exercises are on a separate counter, which is also reset. Material marked “(⋆)” can be skipped without loss of continuity, and probably should be skipped on a first reading. The more difficult proofs have been relegated to Appendix B.
More advanced topics that could be skipped in an introductory class include initial Horn models, order-sorted algebra and rewriting, generic modules, hidden algebra, and the object paradigm.

1.13.1 Synopsis

The following topics are covered: many-sorted signature, algebra and homomorphism; term algebra and substitution; equation and satisfaction; conditional equations; equational deduction and its completeness; deduction for conditional equations; the theorem of constants; interpretation and equivalence of theories; term rewriting, termination, confluence and normal form; abstract rewrite systems; standard models, abstract data types, initiality, and induction; rewriting and deduction modulo equations; first-order logic, models, and proof planning; second-order algebra; order-sorted algebra and rewriting; modules; unification and completion; and hidden algebra. In parallel with these are a gradual introduction to OBJ3, applications to group theory, various abstract data types (such as number systems, lists, and stacks), propositional calculus, hardware verification, the λ-calculus, correctness of functional programs, and other topics. Some social aspects of formal methods are discussed in Appendix D.

1.13.2 Novel Features

Novel features of this book include the following: the use of arrows rather than set-theoretic functions; commutative diagrams; overloaded many-sorted algebra; an emphasis on signatures for logics and term rewriting; an algebraic treatment of first-order logic; second-order general algebra; applications to VLSI (especially CMOS) transistor circuits; use of an executable specification language for proofs; the notion of proof score; systematic use of the theorem of constants; algebraic treatments of termination proofs and rewriting modulo equations.
More advanced novel topics include: an algebraic treatment of parsing; an adjunction between term rewriting systems and abstract rewriting systems; an algebraic treatment of Horn clause logic and its initial models; and results on hierarchical term rewriting systems.
1.14 Acknowledgements

I thank the Computer Science Lab at SRI, where this research began, the Programming Research Group at Oxford University, where this text began, and the Department of Computer Science and Engineering at UCSD, where it was finished. In particular, I thank Frances Page, Joan Arnold, and Sarah Farrer for help with the diagrams and corrections. Special thanks to Dr. José Meseguer, in collaboration with whom many of the ideas behind this text were developed, and to Mr. Timothy Winkler, who implemented most of OBJ3, as well as Drs. Kokichi Futatsugi, Jean-Pierre Jouannaud, Claude Kirchner, Hélène Kirchner, David Plaisted, Joseph Tardo, Patrick Lincoln, and Aristide Megrelis, all of whom helped get OBJ3 where it is today. In addition, I thank Prof. Rod Burstall and Drs. James Thatcher, Eric Wagner, and Jesse Wright, who all helped get the theory to the point where a text like this became possible. I thank Monica Marcus, Răzvan Diaconescu, Grigore Roșu, Yoshihito Toyama, Virgil-Emil Căzănescu, and José Barros for help with Chapter 5, and Răzvan Diaconescu and Grigore Roșu for help with Chapter 8. Paulo Borba, Jason Brown, Steven Eker, Healfdene Goguen, Ranko Lazić, Alexander Leitsch, Dorel Lucanu, Sula Ma, Monica Marcus, Chiyo Matsumiya, Oege de Moor, and Adolfo Socorro have all helped spot bugs and typos. Finally, I thank the students in my classes for their patience and comments. It has been a great pleasure for me to work with all these people.
A Note for Lecturers:
It is not necessary to spend time on the material in this chapter, because most of it arises naturally as the actual content of the course unfolds. Instead, it can just be assigned as reading; it should probably be assigned twice, once at the beginning and once at the end of the course.
Signature and Algebra
In order to prove theorems, we need formulae to express assumptions and goals, and we need precise notions of “model” and of “satisfaction of a formula by a model” to ensure correctness. This chapter develops many-sorted general (or universal) algebras as models, while the next chapter considers equations and their satisfaction by algebras.
We first briefly summarize some notation that will be used throughout this book, assuming that basic set theory is already familiar. ω denotes the set {0, 1, 2, ...} of all natural numbers, and #S denotes the cardinality of a finite set S. We let S* denote the set of all lists (or strings) of elements from S, including the empty list, which we denote []. We write the elements of a list in sequence without any punctuation. For example, if S = {a, b, c, d} then some elements of S* are a, ac, acb, d and []. Notice that in this notation, S ⊆ S*. Given w ∈ S*, we let #w denote the length of w; in particular, #[] = 0, and for s ∈ S as a one-element list, #s = 1. Let S+ = S* − {[]}.

This book makes systematic use of arrows (also called maps), which are functions with given source and target sets (elsewhere these may be called domain and codomain sets). f : A → B designates an arrow from A to B, i.e., a function defined on source A with image contained in target B. For example, the successor map s : ω → ω is defined by s(n) = n + 1. We let [A → B] denote the set of all arrows from A to B. In Appendix C, this approach is compared with the usual set-theoretic approach to functions.

Given arrows f : A → B and g : B → C, then f ; g denotes their composition, which is an arrow A → C (this notation follows the conventions of computing science, rather than of mathematics). An arrow f : A → B is injective iff f(a) = f(a′) implies a = a′, is surjective iff for each b ∈ B there is some a ∈ A such that f(a) = b, and is bijective iff it is both injective and surjective. We let 1_A denote the identity arrow at A, defined by 1_A(a) = a for all a ∈ A. Notice that 1_A ; f = f and f ; 1_B = f for any f : A → B.

[Figure 2.1: Signature for Automata — an ADJ diagram with three nodes Input, State, and Output, an arrow f from Input and State to State, an arrow g from State to Output, and a constant s into State.]

Some further set-theoretic topics are reviewed in Appendix C, which should be consulted by readers for whom the above concepts are not yet entirely comfortable.
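The arrow conventions above lend themselves to a small executable sketch; the Python helpers below (`arrow`, `compose`, `identity`, and so on) are our own names, chosen only to illustrate diagrammatic composition and the identity laws on finite sets.

```python
# Arrows between finite sets as (source, target, mapping) triples.
# Helper names are illustrative only, not notation from the text.

def arrow(source, target, mapping):
    assert set(mapping) == set(source) and set(mapping.values()) <= set(target)
    return (frozenset(source), frozenset(target), dict(mapping))

def compose(f, g):
    """Diagrammatic composition f ; g : first apply f, then g."""
    (A, B, fm), (B2, C, gm) = f, g
    assert B == B2, "target of f must equal source of g"
    return (A, C, {a: gm[fm[a]] for a in A})

def identity(A):
    return (frozenset(A), frozenset(A), {a: a for a in A})

def is_injective(f):
    A, _, fm = f
    return len(set(fm.values())) == len(A)

def is_surjective(f):
    _, B, fm = f
    return set(fm.values()) == B

# Successor on {0, 1, 2, 3}, truncated at 3: neither injective nor surjective.
f = arrow({0, 1, 2, 3}, {0, 1, 2, 3}, {0: 1, 1: 2, 2: 3, 3: 3})
assert compose(identity({0, 1, 2, 3}), f) == f   # 1_A ; f = f
assert compose(f, identity({0, 1, 2, 3})) == f   # f ; 1_B = f
assert not is_injective(f) and not is_surjective(f)
```

Note that `compose(f, g)` applies f first, matching the f ; g convention of computing science rather than the g ∘ f convention of mathematics.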
Any theorem-proving problem needs a vocabulary in which to express its goals and assumptions. This vocabulary will include a sort set S, to classify (or “sort”) the entities (or “data items”) that are involved. Names for the operations involved are also needed, and for each given operation name, the sorts of arguments that it takes, and the sort of value that it returns, should be indicated. A vocabulary with such information is called a signature.

For example, let us consider automata. These have three sorts of entity, namely input, state, and output, so that S = {Input, State, Output}. Also, there are transition and output operations, say f and g, plus an initial state s. It is convenient to present this structure graphically with the “ADJ diagram” in Figure 2.1, which clearly shows that f takes an input and a state as its arguments and returns a state, while g takes a single state as input, returning an output, and s is a constant of sort State.

We wish to formalize the signature concept in such a way that operation symbols can be overloaded, that is, so that an operation symbol can have more than one type; a rather sophisticated word sometimes used to describe this phenomenon is “polymorphism.” For example, S might contain Nat, Bool and
List, and we might want “+” to denote operations for adding natural numbers, for taking exclusive-or of booleans, and for concatenating lists.

Our first step towards formalizing all this is to capture the notion of providing a set of elements for each sort s ∈ S. For example, an automaton A will have three such sets, for its elements of sorts Input, State, and Output; these three sets are denoted A_Input, A_State, and A_Output, respectively, and should be thought of as a “family” of sets of elements. Our mathematical formalization of this concept is as follows:

Definition 2.2.1
Given a set S, whose elements are called sorts (or indices), an S-sorted (or S-indexed) set A is a set-valued map with source S, whose value at s ∈ S is denoted A_s; we will use the notation { A_s | s ∈ S } for this. Also, we let |A| = ⋃_{s∈S} A_s, and we let a ∈ A mean that a ∈ |A|. Finally, we may sometimes write {a}_s for the singleton S-sorted set A with A_s = {a} and with A_{s′} = ∅ for s′ ≠ s; we may also extend this notation, to write {a, a′}_s for {a}_s ∪ {a′}_s, etc. □

It is significant that the sets A_s need not be disjoint, because it is this that supports overloading. For example, in many important examples of automata, the sets A_Input, A_State, and A_Output are all the same, e.g., the natural numbers, or perhaps the Booleans. These sorted sets will also be used just a little later as the basis for our notion of (overloaded) signature. (Our use of the notation { A_s | s ∈ S } should not be confused with the set of all the sets A_s; an S-indexed set A is not a set of sets, but rather a map from S to sets. Notations like ⟨ A_s | s ∈ S ⟩ and { A_s }_{s∈S} might make this distinction clearer, but for this book a notation stressing the analogy of indexed sets with ordinary sets is more desirable.)

A different approach to sorting elements provides an arrow τ : |A| → S, where τ(a) gives the sort of a ∈ A. Unfortunately, this approach does not permit overloading, and therefore does not allow many important kinds of syntactic ambiguity to be studied, because they cannot even exist without overloading. Applications that require overloading include the refinement of data representations, and parsing in modern programming languages, such as Ada; in addition, most object-oriented languages allow a form of overloading that is resolved at run-time by so-called dynamic binding.

In general, concepts extend component-wise from ordinary sets to S-sorted sets.
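As an executable illustration of this component-wise style, here is a small Python sketch; the dict-of-frozensets encoding and the helper names are our own choices, not notation from the text.

```python
# An S-sorted set as a map from sorts to sets: here, a dict of frozensets.
# The encoding and helper names are illustrative only.

S = ("Input", "State", "Output")

def sorted_set(**components):
    return {s: frozenset(components.get(s, ())) for s in S}

def union(A, B):
    return {s: A[s] | B[s] for s in S}            # component-wise

def intersection(A, B):
    return {s: A[s] & B[s] for s in S}            # component-wise

EMPTY = sorted_set()                              # the empty S-sorted set

A = sorted_set(Input={"a", "b"}, State={"q0"})
B = sorted_set(State={"q0", "q1"}, Output={"y"})

# Laws lift component-wise from ordinary sets:
assert union(EMPTY, A) == A                       # empty set is a unit for union
assert intersection(A, B) == intersection(B, A)   # intersection is commutative
```

Because every operation is defined one component at a time, each familiar law of ordinary sets holds for S-sorted sets exactly when it holds at every sort.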
For example, A ⊆ B means that A_s ⊆ B_s for each s ∈ S, the empty S-sorted set ∅ has ∅_s = ∅ for each s ∈ S, and A ∪ B is defined by (A ∪ B)_s = A_s ∪ B_s for each s ∈ S. Because of these component-wise definitions, many laws about sets also extend from simple sets to S-sorted sets. For example, we can show that ∅ ∪ A = A for any S-sorted set A, by checking that it is true for each component, as follows: (∅ ∪ A)_s = ∅_s ∪ A_s = ∅ ∪ A_s = A_s.

Exercise 2.2.1
Define intersection for S-sorted sets, and show that A ∩ B = B ∩ A, and that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), where A, B, C are all S-sorted sets. □

Definition 2.2.2 An S-sorted (or S-indexed) arrow f : A → B between S-sorted sets A and B is an S-sorted family { f_s | s ∈ S } of arrows f_s : A_s → B_s. Given S-sorted arrows f : A → B and g : B → C, their composition is the S-sorted family { f_s ; g_s | s ∈ S } of arrows. Each S-sorted set A has an identity arrow, 1_A = { 1_{A_s} | s ∈ S }. An S-sorted arrow h : M → M′ is injective iff each component h_s : M_s → M′_s is injective, is surjective iff each h_s : M_s → M′_s is surjective, and is bijective iff it is injective and surjective. □

Exercise 2.2.2 If f : A → B is an S-sorted arrow, show that 1_A ; f = f and that f ; 1_B = f. □

We are now ready for the following basic concept:
Definition 2.3.1
Given a sort set S, then an S-sorted signature Σ is an indexed family { Σ_{w,s} | w ∈ S*, s ∈ S } of sets, whose elements are called operation symbols, or possibly function symbols. A symbol σ ∈ Σ_{w,s} is said to have arity w, sort s, and rank (or “type”) ⟨w, s⟩, also written w → s; in particular, any σ ∈ Σ_{[],s} is called a constant symbol. (Operation and constant symbols will later be interpreted as actual operations and constants.)

A symbol σ ∈ |Σ| is overloaded iff σ ∈ Σ_{w,s} ∩ Σ_{w′,s′} for some ⟨w, s⟩ ≠ ⟨w′, s′⟩. Σ is a ground signature iff Σ_{[],s} ∩ Σ_{[],s′} = ∅ whenever s ≠ s′, and Σ_{w,s} = ∅ unless w = [], i.e., iff it consists only of non-overloaded constant symbols. □
Example 2.3.2 (Automata) Because an automaton consists of an input set X, a state set W, an output set Y, an initial state s ∈ W, a transition function f : X × W → W, and an output function g : W → Y, we have S = {Input, State, Output}, with Σ_{[],State} = {s}, Σ_{Input State, State} = {f}, Σ_{State, Output} = {g}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩, as shown in Figure 2.1. □

Example 2.3.3 (Peano Natural Numbers) There is just one sort of interest, say S = {Nat}. To describe all natural numbers, it suffices to have symbols for the constant zero and for the successor operation, say 0 and s, respectively. Then we can describe the number n as n applications of s to 0; thus, 0 is represented by 0, 1 by s(0), 2 by s(s(0)), etc. This is sometimes called Peano notation; we could also speak of “caveman numbers,” since this is counting in base 1. The signature has Σ_{[],Nat} = {0}, Σ_{Nat,Nat} = {s} and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩; or in the singleton notation of Definition 2.2.1, Σ = {0}_{[],Nat} ∪ {s}_{Nat,Nat}. □

Notice that the natural interpretation for Example 2.3.3 is a certain particular standard model, whereas any model provides a suitable interpretation for Example 2.3.2. These two kinds of semantics are called standard (or tight) and loose, respectively. Later, we will make this distinction precise.

[Figure 2.2: Signature for Numerical Expressions — an ADJ diagram with one node Nat, the constant 0, the unary operation s, and the binary operations + and ∗.]

[Figure 2.3: Signature for Graphs — an ADJ diagram with nodes Edge and Node and two parallel arrows ∂0, ∂1 from Edge to Node.]
Example 2.3.4 (Numerical Expressions) Again, there is just one sort of interest, say S = {Nat}, and assuming that we are only interested in the operation symbols shown in Figure 2.2, then Σ_{[],Nat} = {0}, Σ_{Nat,Nat} = {s}, Σ_{Nat Nat,Nat} = {+, ∗}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩. The intended semantics of this example is standard rather than loose, because there is just one intended model for these expressions. □

Example 2.3.5 (Graphs) A (directed, unordered) graph G consists of a set E of edges, a set N of nodes, and two arrows, ∂0, ∂1 : E → N, which give the source and target node of each edge, respectively. Thus S = {Edge, Node} and Σ = {∂0, ∂1}_{Edge,Node} in the notation of Definition 2.2.1. This signature is shown in Figure 2.3. The intended semantics is loose, because there are many possible graphs. □
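The examples above can be rendered as a small Python sketch, with a signature represented as a map from ranks ⟨w, s⟩ to sets of symbols; the encoding and all helper names are ours, purely for illustration.

```python
# A signature assigns to each rank ⟨w, s⟩ a set of operation symbols.
# Ranks are encoded as (arity-tuple, sort) pairs; helper names are ours.

def signature(ranks):
    return {rank: frozenset(syms) for rank, syms in ranks.items()}

# The numerical-expression signature of Example 2.3.4:
NATEXP = signature({
    ((), "Nat"): {"0"},                   # constant
    (("Nat",), "Nat"): {"s"},             # successor
    (("Nat", "Nat"), "Nat"): {"+", "*"},  # addition and multiplication
})

# "+" at two different ranks, as in the Nat/Bool discussion above:
OVER = signature({
    (("Nat", "Nat"), "Nat"): {"+"},
    (("Bool", "Bool"), "Bool"): {"+"},    # exclusive-or on booleans
})

def is_overloaded(sigma, symbol):
    return sum(symbol in syms for syms in sigma.values()) > 1

def is_ground(sigma):
    """Only constant symbols, with no symbol at two different sorts."""
    all_syms = []
    for (w, _sort), syms in sigma.items():
        if w != () and syms:
            return False
        all_syms.extend(syms)
    return len(all_syms) == len(set(all_syms))

assert is_overloaded(OVER, "+") and not is_overloaded(NATEXP, "+")
assert not is_ground(NATEXP)
assert is_ground(signature({((), "Nat"): {"0"}}))
```

Ranks not mentioned are implicitly empty, matching the convention “Σ_{w,s} = ∅ for all other ranks” used in the examples.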
Notation 2.3.6
Because a ground signature X has all its sets X_{w,s} empty unless w = [], we can identify such a signature with the S-indexed set X with X_s = X_{[],s}, and we shall often do so in the following; note that in this case, X_s is disjoint from X_{s′} whenever s ≠ s′, by the definition of ground signature (Definition 2.3.1).

By our conventions about sorted sets, |Σ| = ⋃_{w,s} Σ_{w,s}, and Σ′ ⊆ Σ means that Σ′_{w,s} ⊆ Σ_{w,s} for each w ∈ S* and s ∈ S. Similarly, the union of two signatures is defined by (Σ ∪ Σ′)_{w,s} = Σ_{w,s} ∪ Σ′_{w,s}. A common special case is union with a ground signature X. We will use the notation

Σ(X) = Σ ∪ X

for this, but always assuming that |X| and |Σ| are disjoint, and X is a disjoint family. When X is an S-indexed set, the above equation may be rewritten as

Σ(X)_{[],s} = Σ_{[],s} ∪ X_s
Σ(X)_{w,s} = Σ_{w,s} when w ≠ []. □

2.4 Signatures in OBJ

OBJ modules that are intended to be interpreted loosely begin with the keyword theory (which may be abbreviated th) and close with the keyword endth. Between these two keywords come (optional) declarations for sorts and operations, plus (as discussed in detail later on) variables, equations, and imported modules. For example, the following specifies the theory of automata of Example 2.3.2:

th AUTOM is
  sorts Input State Output .
  op s : -> State .
  op f : Input State -> State .
  op g : State -> Output .
endth

Notice that each of the four internal lines begins with a keyword which tells what kind of declaration it is, and terminates with a period. Any number of sorts can be declared following sorts, and operations are declared with both their arity, between the : and the ->, and their sort, following the ->. Because a constant like s has empty arity, nothing appears between the : and the -> in its declaration.
It is conventional (but not necessary) in OBJ for sort identifiers to begin with an uppercase letter, and for module names to be all uppercase.

Graphs as defined in Example 2.3.5 may be specified as follows:

th GRAPH is
  sorts Edge Node .
  op ∂0 : Edge -> Node .
  op ∂1 : Edge -> Node .
endth

(Actually, OBJ3 can only read ASCII characters; the ∂ symbols that you see were produced by a LaTeX macro whose name consists of all ASCII characters. ASCII provides a certain fixed set of characters with a certain fixed binary encoding.)

Also, the Peano natural numbers of Example 2.3.3 may be specified by the following:

obj NATP is
  sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
endo
Here the keyword pair obj ... endo indicates that standard semantics is intended. Also, notice the use of sort instead of sorts in this example; actually, sort and sorts are synonyms in OBJ3, so that the choice of which to use is just a matter of style.

This example also uses "mixfix" syntax for the successor operation symbol: in the expression before the colon, the underbar character _ is a place holder, showing where the operation's arguments should go; there must be the same number of underbars as there are sorts in the arity; the other symbols before the colon go between or around the arguments. Thus, the notation s_ defines prefix syntax, while _+_ defines infix syntax; similarly, _! is postfix, {_} is outfix, and if_then_else_fi is general mixfix. When there are no underbars, a default prefix-with-parentheses syntax is assumed, as with f and g in AUTOM above. Notice that the formal definition of signature does not specify "fixity", but only arity and rank; this issue is discussed further in Section 3.7 below.

Here is an OBJ specification for the expressions over the natural numbers introduced in Example 2.3.4:

  obj NATEXP is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat .
    op _+_ : Nat Nat -> Nat .
    op _*_ : Nat Nat -> Nat .
  endo
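The role of the underbars can be made concrete with a tiny Python sketch (ours, not OBJ3's actual parser): splitting a declared pattern at its underbars and interleaving the arguments reproduces prefix, infix, postfix, outfix, and general mixfix notation.

```python
# Render a mixfix operation pattern applied to argument strings.
# Each '_' is a place holder for one argument.

def mixfix(pattern, args):
    """Replace each '_' in pattern by the next argument string."""
    parts = pattern.split("_")
    assert len(parts) == len(args) + 1, "need as many underbars as arguments"
    out = parts[0]
    for part, arg in zip(parts[1:], args):
        out += arg + part
    return out

print(mixfix("s_", ["0"]))            # prefix:  s0
print(mixfix("_+_", ["s0", "0"]))     # infix:   s0+0
print(mixfix("_!", ["s0"]))           # postfix: s0!
print(mixfix("{_}", ["0"]))           # outfix:  {0}
```

A pattern with no underbars would get the default prefix-with-parentheses syntax instead, as the text notes for f and g in AUTOM.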
One way to characterize the intuitive difference between loose and standard semantics is to consider what entities are of major interest in each case. For loose semantics (OBJ theories), the entities of greatest interest are the models; for example, for the theory GRAPH, we are interested in graphs, which are algebras. In such cases, we may say that the theory denotes the class of all graphs. On the other hand, for standard semantics (OBJ objects), we are interested in the elements of the standard model; for example, for the specification NATP, we are interested in the natural numbers. Of course, the algebra of all natural numbers is also of great interest, because it contains all the natural numbers, as well as certain operations upon them. In such cases, we may say that the OBJ specification denotes the algebra of natural numbers. (We will later see that this is only defined up to isomorphism.)
Notation 2.4.1: If FOO is the name of an OBJ module, then we let Σ_FOO denote its signature. For example, |Σ_NATP| = {0, s}. □

Algebras

Signatures specify the syntax of theorem-proving problems, but for many problems we are really interested in semantics, that is, in particular entities of the given sorts, and particular functions that interpret the given function symbols. This is formalized by the following basic concept:
Definition 2.5.1: A Σ-algebra M consists of an S-sorted set also denoted M, i.e., a set M_s for each s ∈ S, plus

(0) an element M_σ in M_s for each σ ∈ Σ_{[],s}, interpreting the constant symbol σ as an actual element, and

(1) a function M_σ : M_{s_1} × ··· × M_{s_n} → M_s for each σ ∈ Σ_{w,s} where w = s_1...s_n (for n > 0), interpreting σ as an actual function.

These elements and functions are called the interpretation of Σ in M. Often we will write just σ for M_σ. Also, we may write M_w instead of M_{s_1} × ··· × M_{s_n}. For example, using this notation we can write M_σ : M_w → M_s for σ ∈ Σ_{w,s}. When a symbol σ is overloaded, the notation M_σ is ambiguous, and we may instead write M^{w,s}_σ to explicitly indicate the rank that is intended for a particular interpretation of σ. Finally, we may sometimes write σ_M instead of M_σ, especially in examples. The set M_s is called the carrier of M of sort s. □ Example 2.5.2
For example, we could have S = {Int, Bool} with the symbol 0 in both Σ_{[],Int} and Σ_{[],Bool}. Then we might have a Σ-algebra M in which the two interpretations M^{[],Int}_0 and M^{[],Bool}_0 are distinct. □

Example 2.5.3 (Automata): An automaton is a Σ-algebra where Σ = Σ_AUTOM is the signature of Example 2.3.2, i.e., it consists of an input set X, a state set W, an output set Y, an initial state s_0 ∈ W, a transition function f : X × W → W, and an output function g : W → Y.

Here is a simple Σ_AUTOM-algebra A: let A_Input = A_State = A_Output = ω, the natural numbers; let (s_0)_A = 0, let f_A(m, n) = m + n, and let g_A(n) = n + 1. Then A is an automaton whose state records the sum of the inputs received, and whose output is one more than the sum of the inputs received. AUTOM denotes the class of all automata. □

Figure 2.4: Views of a Graph
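The automaton A just described is concrete enough to execute. Here is a small Python sketch (the encoding is ours, not the book's): all three carriers are the natural numbers, and the interpretations of s0, f, and g are exactly those of Example 2.5.3.

```python
# The automaton A of Example 2.5.3: state records the sum of the inputs
# received, and the output is one more than that sum.

class A:
    s0 = 0                       # initial state: (s0)_A = 0

    @staticmethod
    def f(x, state):             # transition: f_A(m, n) = m + n
        return x + state

    @staticmethod
    def g(state):                # output: g_A(n) = n + 1
        return state + 1

def run(inputs):
    """Feed a list of inputs to A from the initial state; return the output."""
    state = A.s0
    for x in inputs:
        state = A.f(x, state)
    return A.g(state)
```

For instance, running A on the inputs 2, 3, 4 leaves it in state 9 and produces output 10.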
Example 2.5.4 (Expressions): The standard semantics for Example 2.3.4 is a one-sorted algebra E whose carrier consists of all well-formed expressions in 0, s, +, ∗, such as s(+(s(0), s(s(0)))). Here the value of +_E(e_1, e_2) is just the expression +(e_1, e_2); similarly, the value of s_E(e) is s(e), of ∗_E(e_1, e_2) is ∗(e_1, e_2), and the interpretation 0_E of the symbol 0 is 0. Of course, there is another Σ_NATEXP-algebra which has carrier ω, and which interprets 0, s, +, ∗ as the expected operations on natural numbers. However, the intended denotation is the algebra of all well-formed expressions — or more precisely, any isomorphic algebra, as will be discussed later on. □

Example 2.5.5 (Graphs): If we let Σ be the signature Σ_GRAPH of Example 2.3.5, then a Σ-algebra G consists of a set E of edges, a set N of nodes, and two arrows, ∂_0, ∂_1 : E → N, which give the source and target node of each edge, respectively; that is, G is a (directed, unordered) graph, which we may write as (E, N, ∂_0, ∂_1).

A typical graph is shown to the left in Figure 2.4; here E = {a, b, c, d}, N = {1, 2, 3, 4}, ∂_0(a) = ∂_0(c) = 1, ∂_1(a) = ∂_0(b) = 2, ∂_1(c) = ∂_0(d) = 3, and ∂_1(b) = ∂_1(d) = 4. It is usual to draw such a graph as shown in the center of Figure 2.4, omitting the names of nodes and edges, so that labels can be attached instead, as shown in the rightmost diagram of Figure 2.4, and as explained further in Example 2.5.6 below. □
Example 2.5.6 (Labelled Graphs): To the signature of Example 2.3.5, let us add a single new sort Nlabel, and a single new operation symbol l ∈ Σ_{Node,Nlabel}. An algebra over this signature is a node labelled graph, and may be written (E, N, L, ∂_0, ∂_1, l). The most typical interpretations are strict in L, but loose in everything else. An algebra with the underlying graph shown at the left of Figure 2.4 and with node labels from ω is shown at the right of Figure 2.4. We can also label edges, by adding another sort Elabel and another operation symbol, l′ ∈ Σ_{Edge,Elabel}. Thus, in the rightmost diagram of Figure 2.4, l′(a) = l′(d) = f and l′(c) = l′(b) = g. □ Exercise 2.5.1
Write an OBJ specification for the node labelled graphs of Example 2.5.6. □
Example 2.5.7 (Overloading): Now let's consider an example with overloading, given by the following OBJ code:

  th OL is
    sorts Nat Bool .
    ops 0 1 : -> Nat .
    ops 0 1 : -> Bool .
    op s_ : Nat -> Nat .
    op n_ : Bool -> Bool .
    op _+_ : Nat Nat -> Nat .
    op _+_ : Bool Bool -> Bool .
  endth

Here the keyword ops indicates that a number of operations with the same rank will be defined together. Writing this signature out the hard way, we have S = {Nat, Bool}, Σ_{[],Bool} = Σ_{[],Nat} = {0, 1}, Σ_{Bool,Bool} = {n}, Σ_{Nat,Nat} = {s}, Σ_{Bool Bool,Bool} = Σ_{Nat Nat,Nat} = {+}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩. Then 0 and 1 are overloaded, and so is +.

One algebra for this signature, usually denoted T_Σ, has the natural number terms in its carrier of sort Nat, and the Boolean terms in its carrier of sort Bool. Many of these terms are ambiguous in the sense that there is no unique s ∈ S such that they lie in T_{Σ,s}. For example, the terms 0 + 1 and 1 + (0 + 1) are ambiguous, as of course are 0 and 1; but s(0) and 1 + (n(0) + 0) are unambiguous. (Proposition 3.7.2 in Section 3.7 will give a necessary and sufficient condition for non-ambiguity.) We will also see later how to disambiguate terms. □

Term Algebras

The terms over a given signature Σ form a Σ-algebra which will be especially useful and important to us in the following. Indeed, it is a kind of "universal" Σ-algebra, which can serve as a standard model for specifications that do not have any equations.

Definition 2.6.1
Given an S-sorted signature Σ, the S-sorted set T_Σ of all (ground) Σ-terms is the smallest set of lists over the set |Σ| ∪ {(, )} (where ( and ) are special symbols disjoint from Σ) such that

(0) Σ_{[],s} ⊆ T_{Σ,s} for all s ∈ S, and

(1) given σ ∈ Σ_{s_1...s_n,s} and t_i ∈ T_{Σ,s_i} for i = 1, ..., n, then σ(t_1 ... t_n) ∈ T_{Σ,s}. □

When the operations are not constants, parentheses are needed to separate the different operation forms.

Figure 2.5: Trees for Some Terms

Notice that this representation of terms does not use mixfix syntax, but rather uses a default prefix-with-parentheses syntax; nevertheless, we will use mixfix notation in examples, and will later give some theory to support its use. Also, we will usually omit the underbars on the parentheses. For example, using the signature Σ_NATEXP, we can form terms like 0, s(0), s(s(0)), 0 + s(0), and s(0) + s(s(0)). It is common to picture such terms as node labelled trees, as shown in Figure 2.5. This correspondence is made precise in Example 3.6.3 below.

Notice also that the carriers of T_Σ need not be disjoint when Σ is overloaded. For example, if Σ is the signature of Example 2.5.7, then (T_Σ)_Nat and (T_Σ)_Bool both contain 0 and 1.

We can use an operation symbol σ in Σ as a "constructor," that is, as a template into whose argument slots terms of appropriate sorts can be placed, yielding new terms. For example, if t_1 and t_2 are two Σ-terms of sort s, and if + is in Σ_{ss,s}, then +(t_1, t_2) is another Σ-term, constructed by placing t_1 and t_2 into the form +(_,_). Similarly, we can think of a constant symbol σ ∈ Σ_{[],s} as constructing the constant term σ itself. In this way, T_Σ becomes a Σ-algebra. More precisely now,
We can view T_Σ as a Σ-algebra as follows:

(0) interpret σ ∈ Σ_{[],s} in T_Σ as the singleton list σ, and

(1) interpret σ ∈ Σ_{s_1...s_n,s} in T_Σ as the operation which sends t_1, ..., t_n to the list σ(t_1 ... t_n), where t_i ∈ T_{Σ,s_i} for i = 1, ..., n.

Thus, (T_Σ)_σ(t_1, ..., t_n) = σ(t_1 ... t_n), and from here on we usually use the first notation. T_Σ is called the term algebra, or sometimes the word algebra, over Σ. □ Example 2.6.3
Let us consider the term algebra for the signature Σ_AUTOM of Example 2.3.2: since there are no terms of sort Input, the only term of sort State is s_0, and hence the only term of sort Output is g(s_0); that is, T_{Σ_AUTOM,Input} = ∅, while T_{Σ_AUTOM,State} = {s_0}, and T_{Σ_AUTOM,Output} = {g(s_0)}. The moral of this example is that the term algebras of signatures that are intended to be interpreted loosely are not necessarily very interesting. In fact, Example 2.5.4 is a more typical term algebra. □
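Ground terms in the sense of Definition 2.6.1 are easy to prototype. In the sketch below (our own encoding, not the book's), a term is a pair (symbol, subterms), and the sort of a term is computed by checking clauses (0) and (1) of the definition against the signature Σ_NATEXP, which has no overloading, so each symbol has exactly one rank:

```python
# Each symbol of NATEXP mapped to its unique rank (arity, result sort).
NATEXP = {
    "0": ((), "Nat"),
    "s": (("Nat",), "Nat"),
    "+": (("Nat", "Nat"), "Nat"),
    "*": (("Nat", "Nat"), "Nat"),
}

def sort_of(term, sig):
    """Return the sort of a ground term, checking well-formedness
    against clauses (0) and (1) of Definition 2.6.1."""
    sym, subs = term
    arity, result = sig[sym]
    assert len(subs) == len(arity), "wrong number of subterms"
    for sub, expected in zip(subs, arity):
        assert sort_of(sub, sig) == expected, "subterm of wrong sort"
    return result

zero = ("0", ())
t = ("+", (("s", (zero,)), zero))        # the term s(0) + 0
```

Calling sort_of on a badly formed pair raises an error, reflecting that such a list is simply not in T_Σ.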
Most expositions of logic treat the unsorted case, rather than the many-sorted case. Indeed, there has been a belief that many-sorted logic is just a special case of unsorted logic. However, this fails for a variety of reasons, including that many-sorted theorem proving can be much more efficient than unsorted theorem proving, and that different rules of deduction must be used in certain cases.

This subsection describes unsorted algebra, and informally shows that it is equivalent to one-sorted algebra, i.e., to the special case where S has just one sort. Rules of deduction and some other logics are considered later.
Definition 2.7.1: An unsorted signature Σ is a family {Σ_n | n ∈ ω} of sets, whose elements are called operation symbols; σ ∈ Σ_0 is called a constant symbol, and σ ∈ Σ_n is said to have arity n. □

Notice that overloading is impossible for unsorted signatures.
Definition 2.7.2
Given an unsorted signature Σ, then a Σ-algebra M is a set, also denoted M and called the carrier, together with

(0) an element M_σ ∈ M for each σ ∈ Σ_0, and

(1) an arrow M_σ : M^n → M for each σ ∈ Σ_n with n > 0. □

Note that when n = 0, M^0 is (by convention defined to be) a one-point set, and so M_σ : M^0 → M determines an element of M in its image, which can be considered its value.

We now consider the relationship with the one-sorted case. Let us assume that we are given a sort set S = {s}. Then an S-sorted set is a family {M_s | s ∈ S} consisting of a single set M_s, and an S-sorted arrow h : M_s → M′_s is a family {h_s | s ∈ S} consisting of a single arrow h_s : M_s → M′_s. So one-sorted sets and arrows are essentially the same thing as ordinary sets and arrows.

Again assuming that S = {s}, an S-sorted signature is a family {Σ_{⟨w,s⟩} | ⟨w,s⟩ ∈ S* × S}, which can be identified with the unsorted signature Σ′ = {Σ′_n | n ∈ ω} where Σ′_n = Σ_{s^n,s}. Hence, a Σ-algebra is essentially the same thing as a Σ′-algebra. In the following, we will usually identify the unsorted and one-sorted concepts.

The use of many-sorted structures is important in computing science because it can model the way that programming and specification languages keep track of the types of entities. Also, syntax in general forms a many-sorted algebra; this observation will later be helpful in formalizing various logical systems. However, most of the mathematics literature and much of the computing literature treat only unsorted algebra, which is inadequate for the applications on which this book focusses.

Literature

The development and use of algebras in the technical sense was a major advance in mathematics.
Alfred North Whitehead [182] prefigured this revolution in the late nineteenth century. Around 1931, Emmy Noether laid the foundations for what is now called "modern" or "abstract algebra" by systematizing and widely applying the concepts of algebra, and particularly homomorphism; for this reason she has been called "the mother of modern algebra" [20, 127]. In 1935, Garrett Birkhoff [12] gave the now standard definitions for the unsorted case, and proved the important completeness and variety theorems. Perhaps the classic mathematical reference for general (unsorted) algebra is Cohn [32]; this book also discusses some category theory and some applications to theoretical computing science.

Many-sorted algebra seems to have been first studied by Higgins [101] in 1963; Benabou [8] gave an elegant category-theoretic development around 1968. The use of sorted sets for many-sorted algebra seems notationally simpler than alternative approaches, such as [101], [8] and [13]; it was introduced by the author in lectures at the University of Chicago in 1968, and first appeared in print in [52]. The definition of signature with overloading (Definition 2.3.1) was first developed in these early lectures, but the idea only reveals its full potential in order-sorted algebra, which adds subsorts, as discussed in Chapter 10. Ideas in the papers [24] and [137] also contributed to the treatment of many-sorted general algebra that is given in this text.

Our systematic use of arrows is influenced by category theory, for which see, e.g., [126, 63].

"ADJ diagrams" were introduced in [87] as a way to visualize the many-sorted signatures used in the theory of abstract data types by the "ADJ group," which was originally defined to be the set {Goguen, Thatcher, Wagner, Wright}. (See [58] for some historical remarks on ADJ.) The name "ADJ diagram" is due to Cliff Jones.

OBJ began as an algebraic specification language at UCLA about 1976 [53, 55, 85], and was further developed at SRI International [47, 90] and several other sites [26, 166, 33] as a declarative specification and rapid prototyping language; Appendix A gives more detail on OBJ3, following [90, 77]. The use of OBJ as a theorem prover stems from [59], as further developed in [62].
A Note for Lecturers:
When lecturing on the material in this chapter, it may help to begin with examples (automata, natural number expressions, graphs, etc.), first giving their ADJ diagrams, then an intuitive explanation, then their OBJ3 syntax, then some models, and then some computations in those models. This is because some students without a sufficient mathematics background can find the formalities of S-sorted sets, arrows, and so on, rather difficult. After these topics have been treated, then the formal definitions of signature and algebra can be introduced. It helps to motivate the material by reminding students frequently that signatures provide the syntax for a domain within which we wish to prove theorems, and that algebras provide the semantics (models).

Homomorphism, Equation and Satisfaction
Homomorphisms can express many important relationships between algebras, including isomorphism, in which two structures differ only in how they represent their elements, as well as the subalgebra and quotient algebra relationships. In addition, we will use homomorphisms to characterize standard models and to define substitutions, two basic concepts that will play an important role throughout this book.

Homomorphism and Isomorphism

Homomorphisms formalize the idea of interpreting one Σ-algebra into another, by mapping elements to elements in such a way that all sorts, operations, and constants are preserved. This concept may already be familiar from linear transformations, which map vectors to vectors in such a way as to preserve the (constant) vector 0, as well as the operations of vector addition and scalar multiplication. The following equations express this,

  T(0) = 0
  T(a + b) = T(a) + T(b)
  T(r • a) = r • T(a)

where a, b are vectors, r is a scalar, and • is scalar multiplication. The general notion is:

Definition 3.1.1
Given an S-sorted signature Σ and Σ-algebras M, M′, a Σ-homomorphism h : M → M′ is an S-sorted arrow h : M → M′ such that the following homomorphism condition holds:

(0) h_s(M_c) = M′_c for each constant symbol c ∈ Σ_{[],s}, and

(1) h_s(M_σ(m_1, ..., m_n)) = M′_σ(h_{s_1}(m_1), ..., h_{s_n}(m_n)) whenever n > 0, σ ∈ Σ_{s_1...s_n,s} and m_i ∈ M_{s_i} for i = 1, ..., n.
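For finite (or finitely checkable) algebras, the homomorphism condition can be verified mechanically. The following Python sketch (our own encoding, with a one-sorted signature for simplicity) checks conditions (0) and (1) for the map h(n) = n mod 3, from the naturals with zero and successor onto Z_3 with successor mod 3:

```python
# An algebra is given as a dict from operation symbols to constants
# (arity 0) or unary functions (arity 1); is_hom checks the
# homomorphism condition on a finite sample of the carrier.

def is_hom(h, M_ops, M2_ops, carrier, syms):
    ok = True
    for sym, arity in syms:
        f, g = M_ops[sym], M2_ops[sym]
        if arity == 0:
            ok = ok and h(f) == g                 # condition (0)
        else:
            for m in carrier:
                ok = ok and h(f(m)) == g(h(m))    # condition (1)
    return ok

M_ops = {"0": 0, "s": lambda n: n + 1}            # naturals with successor
M2_ops = {"0": 0, "s": lambda n: (n + 1) % 3}     # Z_3 with successor mod 3
h = lambda n: n % 3
```

The check passes for h, but fails for, say, n ↦ min(n, 5), which does not commute with successor.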
The composition of Σ-homomorphisms g : M → M′ and h : M′ → M″ is their composition as S-sorted arrows, denoted g;h : M → M″. If h : M′ → M is an inclusion and a homomorphism, then M′ is said to be a sub-Σ-algebra of M; in this case, h may be called an inclusion homomorphism. □

Exercise 3.1.1: Show that a sub-Σ-algebra M′ of M is a subset (S-indexed, of course) that is closed under all the operations in Σ, i.e., that satisfies: (1) M′_s ⊆ M_s for all s ∈ S; and (2) for every σ ∈ Σ_{s_1...s_n,s}, M_σ(a_1, ..., a_n) ∈ M′_s whenever a_i ∈ M′_{s_i}. □

Note that to cover signatures with overloading, we should really have written the two conditions of Definition 3.1.1 as:

(0) h_s(M^{[],s}_c) = M′^{[],s}_c for each constant symbol c ∈ Σ_{[],s}, and

(1) h_s(M^{w,s}_σ(m_1, ..., m_n)) = M′^{w,s}_σ(h_{s_1}(m_1), ..., h_{s_n}(m_n)) whenever w = s_1...s_n, n > 0, σ ∈ Σ_{w,s} and m_i ∈ M_{s_i} for i = 1, ..., n.

Exercise 3.1.2
Show that a composition of two Σ-homomorphisms is a Σ-homomorphism, and that the identity 1_M on a Σ-algebra M is a Σ-homomorphism. □

It may be interesting to see explicitly what the homomorphisms of graphs and automata, as defined in the previous chapter, are:
Example 3.1.2
Given two graphs (in the sense of Example 2.5.5), say G = (E, N, ∂_0, ∂_1) and G′ = (E′, N′, ∂′_0, ∂′_1), then a homomorphism h : G → G′ consists of two arrows, h_E : E → E′ and h_N : N → N′, that satisfy the homomorphism condition for each σ ∈ Σ. In this case, there are just two σ in Σ, and the corresponding equations are

  h_N(∂_0(e)) = ∂′_0(h_E(e))
  h_N(∂_1(e)) = ∂′_1(h_E(e))

for e ∈ E. These equations say that graph homomorphisms preserve source and target. □

Example 3.1.3: If A = (X, W, Y, f, g, s_0) and A′ = (X′, W′, Y′, f′, g′, s′_0) are two automata (in the sense of Example 2.5.3), then a homomorphism h : A → A′ consists of three arrows, which we may denote h_Input : X → X′, h_State : W → W′, and h_Output : Y → Y′, satisfying the following three equations

  h_State(f(x, s)) = f′(h_Input(x), h_State(s))
  h_Output(g(s)) = g′(h_State(s))
  h_State(s_0) = s′_0

which just say that automaton homomorphisms preserve the operations of automata. □

For those not already familiar with it, this notion is defined in Appendix C.

Thus, Σ-homomorphisms are arrows that preserve the structure of Σ-algebras. One of the most important kinds of homomorphism is the isomorphism, which provides a translation between the data representations of two Σ-algebras that are "abstractly the same." Before giving a formal definition, we illustrate the concept with the following: Example 3.1.4
Let us consider two different ways of representing the natural numbers, each a variant of the Peano representation of Example 2.3.3. For the first, we have a one-sorted algebra P whose carrier consists of the lists 0, s 0, s s 0, ..., in which 0 denotes the list 0, and the operation s maps an expression e to the expression s e. For the second, we have an algebra P′ whose carrier consists of the lists 0, 0′, 0″, .... We can now define h : P → P′ recursively by the equations

  h(0) = 0
  h(s e) = h(e)′.

Intuitively, P and P′ provide two different representations of the same thing, and h describes a translation between these representations. □

Definition 3.1.5: A Σ-homomorphism h : M → M′ is a Σ-isomorphism iff there is another Σ-homomorphism g : M′ → M such that h;g = 1_M and g;h = 1_{M′} (i.e., such that for each s ∈ S, g_s(h_s(m)) = m for all m ∈ M_s and h_s(g_s(m′)) = m′ for all m′ ∈ M′_s). In this case, g is called the inverse of h, and is denoted h^{-1}; also, we write M ≅_Σ M′ if there exists a Σ-isomorphism between M and M′, and we may omit the subscript Σ if it is clear from context. □ Exercise 3.1.3
Prove that h as defined in Example 3.1.4 above really is an isomorphism. □ Example 3.1.6
We now consider the binary representation of the natural numbers, forming a one-sorted algebra B. Its carrier consists of the symbol 0 plus all (finite) lists of 0's and 1's not beginning with 0; 0 denotes the list 0 (i.e., 0_B = 0), and s_B is binary addition of 1. Then there is an arrow h : P → B, with P as defined in Example 3.1.4, such that

  h(0) = 0
  h(s e) = 1 + h(e). □ Exercise 3.1.4
Prove that h as defined in Example 3.1.6 is an isomorphism. (cid:2) Exercise 3.1.5
Prove that a Σ-homomorphism h is an isomorphism iff each h_s is bijective. □

The following summarizes some of the most useful properties of isomorphisms:
Proposition 3.1.7: If f : M → M′ and g : M′ → M″ are Σ-isomorphisms, then

(a) (f^{-1})^{-1} = f.
(b) (f;g)^{-1} = g^{-1};f^{-1}.
(c) (1_M)^{-1} = 1_M.
(d) ≅_Σ is an equivalence relation on the class of all Σ-algebras. □ Exercise 3.1.6
Prove the assertions in Proposition 3.1.7. (cid:2)
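The translations of Examples 3.1.4 and 3.1.6 can both be run directly. Below is a Python sketch (the string encodings are ours): to_ticks maps the Peano list "s s ... s 0" to tick notation, and to_binary maps it to the binary representation, with binary addition of 1 implemented by carry propagation.

```python
def to_ticks(e):
    """h : P -> P' of Example 3.1.4:  h(0) = 0,  h(s e) = h(e)'."""
    if e == "0":
        return "0"
    return to_ticks(e[2:]) + "'"      # e has the form "s e'"

def add1(b):
    """Binary successor on a bit string with no leading zeros."""
    bits = list(b)
    i = len(bits) - 1
    while i >= 0 and bits[i] == "1":  # propagate the carry over trailing 1s
        bits[i] = "0"
        i -= 1
    if i < 0:
        return "1" + "".join(bits)    # overflow: prepend a new leading 1
    bits[i] = "1"
    return "".join(bits)

def to_binary(e):
    """h : P -> B of Example 3.1.6:  h(0) = 0,  h(s e) = 1 + h(e)."""
    if e == "0":
        return "0"
    return add1(to_binary(e[2:]))
```

For instance, the Peano term for 5 translates to "0'''''" and to "101" respectively; both maps are bijective, as the exercises ask you to prove.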
In Chapter 6, we will see that if there is an injective Σ-homomorphism h : M → M′, then M is isomorphic to a subalgebra of M′, and if there is a surjective Σ-homomorphism h : M → M′, then M′ is isomorphic to a quotient algebra of M; the converses also hold. These results are Corollaries 6.1.8 and 6.1.9, respectively, and their converses are given in Exercise 6.1.2.

Definition 3.1.8: An S-sorted arrow f : M → M′ is a left inverse iff there is another S-sorted arrow g : M′ → M such that f;g = 1_M. In this case, we also say that g is a right inverse of f; we may also say that f has a right inverse and that g has a left inverse. □ Exercise 3.1.7
Show that if an S-sorted arrow has a right inverse then it is injective, and if it has a left inverse then it is surjective. □ Exercise 3.1.8

Show that if an S-sorted arrow is injective and has a left inverse, then it is bijective. Similarly, show that if an arrow is surjective and has a right inverse, then it is bijective. □

These results imply the following, which will be very useful in certain proofs later on:
Proposition 3.1.9
An injective Σ-homomorphism with a left inverse is an isomorphism, and so is a surjective Σ-homomorphism with a right inverse. □

We now consider homomorphisms for unsorted algebra in the sense of Section 2.7.
Definition 3.1.10
Given unsorted Σ-algebras M and M′, then a Σ-homomorphism h : M → M′ is an arrow M → M′, also denoted h, such that

(0) h(M_σ) = M′_σ whenever σ ∈ Σ_0, and

(1) h(M_σ(m_1, ..., m_n)) = M′_σ(h(m_1), ..., h(m_n)) whenever σ ∈ Σ_n and m_i ∈ M for i = 1, ..., n with n > 0. □

This concept is defined in Appendix C.

Consistently with the results of Section 2.7, such a Σ-homomorphism is essentially the same thing as a one-sorted Σ′-homomorphism, where Σ_n = Σ′_{s^n,s} with S = {s}.

Initiality of the Term Algebra

For many signatures Σ, the term algebra T_Σ of Section 2.6 has a very special (and important) property: there is a unique way to interpret each of its elements in any Σ-algebra M. For example, if we let Σ = Σ_NATEXP and let M = ω, then t = s(s(s(s(0))) ∗ (s(0) + s(0))) should be interpreted as 7. And if we let M = {true, false}, with + interpreted as "or", with ∗ interpreted as "and", s as "not", and 0 as false, then t should be interpreted as false. This section formalizes this property and explores some of its consequences. The key property of T_Σ is stated below; its proof is given in Appendix B. We later construct a Σ-algebra that has this property for Σ that may be overloaded (Theorem 3.2.10).

Theorem 3.2.1 (Initiality): Given a signature Σ without overloading and any Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M. □

The property that there is a unique Σ-homomorphism to any other Σ-algebra is called initiality, and any such algebra is called an initial algebra. We may think of the operation symbols in a signature Σ as elementary operations or commands (or microinstructions), and then think of T_Σ as the collection of all expressions (or simple programs) formed from Σ, and finally think of a Σ-algebra M as a machine (or microprocessor) that can execute the commands in Σ.
For example, a constant symbol f in Σ can be thought of as an instruction to load the value of f in M. Then Theorem 3.2.1 tells us that each such simple program has one and only one value when executed on M. Thus initiality expresses a very basic intuition about computation on a machine.

We will later see that any two initial algebras are isomorphic, so that initiality defines a "standard model" for a signature that is unique up to the renaming of its elements.

Many interesting arrows arise as unique Σ-homomorphisms from some Σ-algebra; indeed, defining arrows by initiality is essentially the same as defining functions by induction. Let us consider some examples.

Example 3.2.2 (Evaluating Terms over the Naturals): If Σ is the signature Σ_NATEXP of Example 2.3.4, then we can give ω the structure of a Σ-algebra in which the operation symbol 0 is interpreted as the number 0, the operation symbol s is interpreted as the successor operation, the operation symbol + is interpreted as the addition function, and ∗ as multiplication. Then the unique Σ-homomorphism from T_Σ to ω computes the values of the arithmetic expressions in T_Σ in exactly the expected way. □ Exercise 3.2.1

Compute the values of the terms in Figure 2.5 in the algebra of Example 3.2.2. □
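The unique homomorphism from the term algebra is just structural evaluation, and can be sketched in a few lines of Python (term encoding as in our earlier sketch: a term is a pair (symbol, subterms)). Below, the term t from the start of this section is evaluated both in ω and in the two-element Boolean algebra described there:

```python
def evaluate(term, interp):
    """The unique homomorphism T_Sigma -> M: evaluate subterms, then
    apply (or return) the interpretation of the top symbol."""
    sym, subs = term
    vals = [evaluate(sub, interp) for sub in subs]
    f = interp[sym]
    return f(*vals) if callable(f) else f

# t = s(s(s(s(0))) * (s(0) + s(0)))
zero = ("0", ())
one = ("s", (zero,))
three = ("s", (("s", (one,)),))
t = ("s", (("*", (three, ("+", (one, one)))),))

into_omega = {"0": 0, "s": lambda n: n + 1,
              "+": lambda m, n: m + n, "*": lambda m, n: m * n}
into_bool = {"0": False, "s": lambda p: not p,
             "+": lambda p, q: p or q, "*": lambda p, q: p and q}
```

Evaluating t via into_omega gives 7, and via into_bool gives false, matching the two interpretations discussed above; the recursion is exactly "definition by initiality."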
Example 3.2.3: If Σ = Σ_NATEXP, then T_Σ is not isomorphic to ω; but for Σ′ = Σ_NATP, T_Σ′ is isomorphic to ω. Indeed, T_Σ′ is the natural numbers in Peano notation. □

Example 3.2.4 (Depth of a Term): Given an arbitrary signature Σ, we can make the natural numbers into a Σ-algebra Ω by letting Ω_s = ω for each s ∈ S, and by interpreting

(0) each σ ∈ Σ_{[],s} as 0 ∈ Ω_s, and

(1) each σ ∈ Σ_{s_1...s_n,s} for n > 0 as the function sending n natural numbers i_1, ..., i_n to the number 1 + max{i_1, ..., i_n}.

Then the unique Σ-homomorphism d : T_Σ → Ω computes the depth of Σ-terms, that is, the maximum amount of nesting in terms. □

Exercise 3.2.2: Compute the depth of the terms shown in Figure 2.5 (page 25) using the algebra of Example 3.2.4. □

Example 3.2.5 (Size of a Term): Let Σ be the signature Σ_NATEXP of Example 2.3.4, and let ω be the carrier of an algebra A in which 0 ∈ Σ is interpreted as 1, s_A(n) = n + 1, and +_A(m, n) = ∗_A(m, n) = m + n + 1. Then the unique Σ-homomorphism h : T_Σ → A computes the size of a term, that is, the number of operation symbols that occur in it. □

Exercise 3.2.3: Compute the size of the terms shown in Figure 2.5 (page 25) using the algebra of Example 3.2.5. □
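Examples 3.2.4 and 3.2.5 say that depth and size are the unique homomorphisms into two particular algebras on ω, i.e., structural recursions. Here is a hedged Python sketch (same term encoding as before), applied to s(0) + s(s(0)), one of the terms pictured in Figure 2.5:

```python
def depth(term):
    """Evaluate in the algebra Omega of Example 3.2.4:
    constants go to 0, and each operation adds 1 to the max of its arguments."""
    sym, subs = term
    return 0 if not subs else 1 + max(depth(s) for s in subs)

def size(term):
    """Evaluate in the algebra A of Example 3.2.5:
    constants count 1, and each operation adds 1 to the sum of its arguments."""
    sym, subs = term
    return 1 + sum(size(s) for s in subs)

zero = ("0", ())
t = ("+", (("s", (zero,)), ("s", (("s", (zero,)),))))   # s(0) + s(s(0))
```

For this t the depth is 3 and the size is 6, which you can check against the tree picture in Figure 2.5.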
Exercise 3.2.4
Use initiality to define an arrow from Σ-terms which gives the number of interior (i.e., non-leaf) nodes in the corresponding tree. □

This way of defining functions is a special case of a much more general method called initial algebra semantics [52, 88]. This method regards terms in T_Σ as objects to which some meaning is to be assigned, constructs a Σ-algebra M of suitable meanings, and then lets the unique Σ-homomorphism that automatically exists do the work. For example, T_Σ might contain the various syntactic elements of a programming language in its various sorts, such as expressions, procedures, and of course programs, with M containing suitable denotations for these, e.g., in the style of denotational semantics [92, 161]; many examples of this approach are given in [77].

Example 3.2.6 (Final Algebra): There is a trivial but interesting algebra denoted F_Σ that can be constructed for any signature Σ: let (F_Σ)_s = {s} for each sort s ∈ S; and given σ ∈ Σ_{w,s}, let F_σ(s_1, ..., s_n) = s when w = s_1...s_n. Then the unique homomorphism T_Σ → F_Σ gives the sort of a Σ-term. □

The following generalizes this to any signature Σ and any Σ-algebra; the unique Σ-homomorphism again gives the sorts of elements.

Proposition 3.2.7
Given any signature Σ and any Σ-algebra M, there is one and only one Σ-homomorphism h : M → F_Σ.

Proof: Given m ∈ M_s, we have to define h(m) = s, because h(m) must be in (F_Σ)_s = {s}. It is straightforward to check that this gives a Σ-homomorphism. □

This property is called finality; it is dual to initiality.
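The final algebra F_Σ can be run in the same structural-recursion style as our earlier evaluation sketch (the encoding is ours): interpreting each operation symbol as a function that ignores its arguments and returns its result sort, the unique homomorphism into F_Σ computes the sort of a term, as in Example 3.2.6. Here it is for the AUTOM signature of Example 2.6.3:

```python
# Each operation symbol of AUTOM mapped to its result sort; in F_Sigma
# every operation simply returns the result sort of its symbol.
RESULT = {"s0": "State", "f": "State", "g": "Output"}

def sort_of_term(term):
    """Evaluate a term in F_Sigma: recurse on subterms (their sorts are
    computed but not needed), then return the top symbol's result sort."""
    sym, subs = term
    for sub in subs:
        sort_of_term(sub)
    return RESULT[sym]
```

On the two terms of Example 2.6.3, s0 evaluates to State and g(s0) to Output, which is exactly the claimed "sort of a Σ-term."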
Exercise 3.2.5
Let Σ be the signature Σ_NATP of Example 2.3.3, and let D be the Σ-algebra with carrier {0, 1}, with 0_D = 0, and with s_D(0) = s_D(1) = 0. Give a direct proof that there is one and only one Σ-homomorphism T_Σ → D. □

Example 3.2.8: When there is overloading, terms do not always have a unique sort or parse. For example, if Σ is the signature of Example 2.5.7, then 0 and 1 are ambiguous in the sense that there is no unique s ∈ S such that they lie in (T_Σ)_s; the terms 0 + 1, 1 + (0 + 1) and many others are also ambiguous, although for example, the terms s(0) and 1 + (n(0) + 0) are unambiguous. Proposition 3.7.2 in Section 3.7 below gives a necessary and sufficient condition on Σ such that no Σ-terms are ambiguous.

For Σ as in Example 2.5.7, T_Σ is initial even though it has ambiguous terms. However, there are overloaded signatures such that T_Σ is not initial. For example, let S = {A, B, C}, let Σ_{[],A} = Σ_{[],B} contain a single constant symbol, say b, and let Σ_{A,C} = Σ_{B,C} = {f}. Now define a Σ-algebra M as follows: M_A = {0}; M_B = {1}; M_C = {0, 1}; M^{[],A}_b = 0, M^{[],B}_b = 1, M^{A,C}_f(0) = 0, and M^{B,C}_f(1) = 1. Then there can be no Σ-homomorphism h : T_Σ → M, because the term f(b) has two distinct parses of the same sort, which are T^{A,C}_f(T^{[],A}_b) and T^{B,C}_f(T^{[],B}_b). Therefore f(b) must be mapped to two different elements of M,

  h(T^{A,C}_f(T^{[],A}_b)) = M^{A,C}_f(M^{[],A}_b) = 0,
  h(T^{B,C}_f(T^{[],B}_b)) = M^{B,C}_f(M^{[],B}_b) = 1,

which is impossible. □

Because initiality is so important for this book, the above example means that T_Σ is not adequate for our purposes. However, there is a closely related Σ-algebra that is initial for any signature; its terms are annotated with their sorts.
Definition 3.2.9 Given any S-sorted signature Σ, the S-sorted set 𝒯_Σ of all sorted (ground) Σ-terms is the smallest set of lists over the set S ∪ |Σ| ∪ {·, (, )} (where ·, ( and ) are special symbols disjoint from Σ) such that

(0) if σ ∈ Σ_{[],s} then σ·s ∈ 𝒯_{Σ,s} for all s ∈ S, and
(1) if σ ∈ Σ_{s1...sn,s} and t_i ∈ 𝒯_{Σ,si} for i = 1, …, n, then σ·s(t_1 … t_n) ∈ 𝒯_{Σ,s}.

The S-sorted set 𝒯_Σ can be given the structure of a Σ-algebra in the same way that T_Σ was:

(0) interpret σ ∈ Σ_{[],s} in 𝒯_Σ as the singleton list σ·s, and
(1) interpret σ ∈ Σ_{s1...sn,s} in 𝒯_Σ as the function that sends t_1, …, t_n to the list σ·s(t_1 … t_n), where t_i ∈ 𝒯_{Σ,si} for i = 1, …, n with n > 0.

As before, we will usually write σ·s(t_1, …, t_n) instead of σ·s(t_1 … t_n). Call 𝒯_Σ the (sort) annotated term algebra over Σ. □

The following result is proved in Appendix B:
Theorem 3.2.10 (Initiality) Given any signature Σ and any Σ-algebra M, there is a unique Σ-homomorphism 𝒯_Σ → M. □

It follows that 𝒯_Σ is "almost an initial Σ-algebra," because its terms differ from those of T_Σ only in the sort annotations; for many signatures, including most of those that come up in practice, T_Σ actually is initial. This motivates the following

Convention 3.2.11
We will usually write terms without sort annotations, and will usually annotate operations only in so far as is necessary to determine a unique fully annotated term with the given partial annotation. Moreover, we will usually write T_Σ when we really mean 𝒯_Σ. □

Proposition 3.2.12 below characterizes when T_Σ is initial. It uses the following:

Exercise 3.2.6
Show that the arrow h : 𝒯_Σ → T_Σ that strips sort annotations off operation symbols is a Σ-homomorphism. It may be defined as follows:

(0) h_s(σ·s) = σ for σ ∈ Σ_{[],s}, and
(1) h_s(σ·s(t_1 … t_n)) = σ(h_{s1}(t_1), …, h_{sn}(t_n)) for σ ∈ Σ_{s1...sn,s} and t_i ∈ 𝒯_{Σ,si} for i = 1, …, n with n > 0. □

Proposition 3.2.12
The term algebra T_Σ is initial iff any two distinct sorted terms of the same sort remain distinct after the sorts are stripped off operation symbols.

Proof: That the unique Σ-homomorphism h : 𝒯_Σ → T_Σ strips off sorts is due to its homomorphic property, and since it is surjective, it is an isomorphism iff it is injective. □

Recall that a sufficient condition for T_Σ to be initial is that Σ has no overloading. The result above says that T_Σ is initial iff whatever overloading may be present does not produce ambiguity. From this, it follows by induction that it suffices for there to be no overloaded constants.

Structural induction [21] is an important proof technique that is closely related to initiality. To prove that a certain property P is true of all Σ-terms, we (0) prove that P is true of all the constants in Σ, and then (1) prove that if P is true of t_1, …, t_n (of appropriate sorts), then P is true of σ(t_1, …, t_n), for every nonconstant symbol σ in Σ. It then follows that P is true of all (ground) Σ-terms, because they can all be built from the symbols in Σ, working upward from the constants. Somewhat more formally, let P ⊆ T_Σ be the (S-sorted) subset of all Σ-terms for which the desired property holds. Then the two properties to be shown imply that P is a Σ-subalgebra of T_Σ. But it can be shown (see the next paragraph) that T_Σ does not have any proper Σ-subalgebras, from which it follows that P = T_Σ.

More formally, the following steps are required for proving that some S-indexed family P of predicates holds for all Σ-terms by structural induction:

(0) prove that P_s holds for every σ ∈ Σ_{[],s}, for each s ∈ S; and
(1) prove that P_s holds for σ(t_1, …, t_n) for all σ ∈ Σ_{s1...sn,s}, where P_{si} is assumed to hold for each t_i, a Σ-term of sort s_i, for i = 1, …, n.

This proof method is carefully stated and validated in Chapter 6 (Theorem 6.4.4), but the idea is as follows: let i denote the inclusion of P into T_Σ and let h : T_Σ → P be the unique Σ-homomorphism given by initiality. Then h ; i : T_Σ → T_Σ is also a Σ-homomorphism.
But initiality implies that there is only one Σ-homomorphism T_Σ → T_Σ, which is necessarily the identity 1_{T_Σ}, because that is a Σ-homomorphism. Therefore h ; i = 1_{T_Σ}, and so by Exercise 3.1.8, i is a Σ-isomorphism, because it is injective and a right inverse.

3.3 Equation and Satisfaction

This section defines the basic concepts of equation, and of satisfaction of an equation by an algebra. This gives us a semantic notion of truth for equational logic, and hence a standard by which to judge the soundness of rules of deduction for that system. A number of examples are also given.

To discuss equations, we need terms with variables. It can seem quite difficult to say exactly what a variable actually is in some branches of mathematics. But in general algebra, this is not so hard: a Σ-term with variables in X is just an element of T_{Σ(X)}, where X is a ground signature (see Notation 2.3.6) disjoint from Σ; that is, a variable is just a new constant symbol.

Definition 3.3.1 A Σ-equation consists of a ground signature X of variable symbols (disjoint from Σ) plus two Σ(X)-terms of the same sort s ∈ S; we may write such an equation abstractly in the form

(∀X) t = t′

and concretely in the form

(∀x, y, z) t = t′

when (for example) |X| = {x, y, z} and the sorts of x, y, z can be inferred from their uses in t and in t′. A specification is a pair (Σ, A), consisting of a signature Σ and a set A of Σ-equations. A Σ-specification is a specification whose signature is Σ. □

Example 3.3.2 (Semigroups) This specification has just one sort, say
Elt, and just one operation, say _*_ : Elt Elt -> Elt, which must obey the associative law,

(x ∗ y) ∗ z = x ∗ (y ∗ z)

where x, y, z are variables of sort Elt. If we let X be the ground signature with X_Elt = {x, y, z}, then this can be written more accurately as

(∀X) (x ∗ y) ∗ z = x ∗ (y ∗ z)

and slightly less formally as

(∀x, y, z) (x ∗ y) ∗ z = x ∗ (y ∗ z).

In OBJ, we would write

  th SEMIGROUP is
    sort Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq (X * Y)* Z = X *(Y * Z).
  endth
This follows the convention that variable names begin with an uppercase letter; like the convention for sort names, it is not enforced by the OBJ3 system. However, the systematic use of these conventions does help users to distinguish sort and variable names from keywords and operation symbols, and thus helps make specifications more readable. (For a diagram of the signature of this specification, delete the edges labelled by e and by -1 from Figure 3.1.) □

(Recall that a ground signature means that all of the variable symbols in X are distinct.)

[Figure 3.1: Signature for Groups]
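Whether a particular finite algebra satisfies the associative law can be checked by brute force over all assignments of the three variables. A small Python sketch (our own helper names, not OBJ), with max on {0, 1, 2} as one associative interpretation of *:

```python
from itertools import product

def satisfies_assoc(carrier, star):
    """Check M |= (forall x,y,z) (x*y)*z = x*(y*z) by trying every
    assignment of the three variables into the carrier."""
    return all(star(star(x, y), z) == star(x, star(y, z))
               for x, y, z in product(carrier, repeat=3))

carrier = [0, 1, 2]
```

For example, `satisfies_assoc(carrier, max)` holds, while subtraction fails the law and is rejected by the same check.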
Example 3.3.3 (Monoids) Similarly, monoids are specified as follows:

  th MONOID is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq e * X = X .
    eq (X * Y)* Z = X *(Y * Z).
  endth
This theory denotes the class of all monoids. (We no longer give a set-theoretic version.) Our convention names both objects and sorts with the singular version of the structure involved, rather than the plural; thus, we write MONOID and Elt rather than MONOIDS and Elts. □

Exercise 3.3.1
Write out a formal set-theoretic definition of the above OBJ specification of monoids, in the style of Example 2.5.3. □
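A concrete model of this theory is lists under concatenation, with the empty list as e. The following Python sketch (ours) spot-checks the three MONOID equations on a finite sample of elements; this is a test, not a proof:

```python
from itertools import product

# Interpret MONOID in lists: e is the empty list, * is concatenation.
e = []

def star(x, y):
    return x + y

def check_monoid(sample):
    """Spot-check the three MONOID equations on all assignments
    drawn from `sample` (a finite subset of the carrier)."""
    ident = all(star(x, e) == x and star(e, x) == x for x in sample)
    assoc = all(star(star(x, y), z) == star(x, star(y, z))
                for x, y, z in product(sample, repeat=3))
    return ident and assoc
```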
Example 3.3.4 (Groups) It is little more work to specify groups than to specify monoids:

  th GROUP is
    sort Elt .
    op _-1 : Elt -> Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq X *(X -1) = e .
    eq (X * Y)* Z = X *(Y * Z).
  endth

Notice that only half of the usual pairs of equations for the identity and inverse laws are given. We will later prove that the other halves follow from these laws (and vice versa). The signature for this specification is shown in Figure 3.1.

If h : G′ → G is an inclusion homomorphism, then G′ is said to be a subgroup of G. □

Example 3.3.5 (Integers) We can specify the integers as follows:

  obj INT is
    sort Int .
    op 0 : -> Int .
    op s_ : Int -> Int .
    op p_ : Int -> Int .
    var I : Int .
    eq s p I = I .
    eq p s I = I .
  endo
Here s_ is the successor operation, and p_ is the predecessor operation. This specification defines the algebra of integers with the given operations; we may also say that it denotes the class of all standard models of the integers, as initial algebras with these operations.

The following specification also defines addition and negation on the integers:

  obj INT is
    sort Int .
    op 0 : -> Int .
    op s_ : Int -> Int .
    op p_ : Int -> Int .
    op -_ : Int -> Int .
    op _+_ : Int Int -> Int .
    vars I J : Int .
    eq s p I = I .
    eq p s I = I .
    eq - 0 = 0 .
    eq - s I = p - I .
    eq - p I = s - I .
    eq I + 0 = I .
    eq I + s J = s(I + J).
    eq I + p J = p(I + J).
  endo
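The defining equations above can be read, left to right, as a recursive program. The following Python sketch (ours, not OBJ) mirrors them on the standard integer model, recursing on whether the argument was built with s_ or p_:

```python
# The INT equations read as a recursive program on the standard model:
# integers built from 0 by s (successor) and p (predecessor).

def s(i): return i + 1
def p(i): return i - 1

def neg(i):
    # eq - 0 = 0 .   eq - s I = p - I .   eq - p I = s - I .
    if i == 0:
        return 0
    return p(neg(p(i))) if i > 0 else s(neg(s(i)))

def add(i, j):
    # eq I + 0 = I .   eq I + s J = s(I + J) .   eq I + p J = p(I + J) .
    if j == 0:
        return i
    return s(add(i, p(j))) if j > 0 else p(add(i, s(j)))
```

The identity add(i, neg(i)) == 0 checks the inverse law on samples, anticipating the group remark made next.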
Here -_ is the negation operation and _+_ is addition. It is interesting to notice that the integers with 0 as identity, -_ as inverse and _+_ as "multiplication" form a group; however, we do not yet have the tools needed to prove this. □

(As will be discussed in more detail later, neither s_ nor p_ is a "constructor" in the usual sense, because there are non-trivial relations between them.)

Our commitment to semantics requires that we not only formalize what equations are, but also what they mean. This is done by the concept of satisfaction, which will use the following:

Notation 3.3.6
Recall that a Σ-algebra M provides an interpretation for each operation symbol in Σ, and in particular, for each constant symbol in Σ. If X is a ground signature (e.g., a set of variables), then an interpretation for X is just a (many-sorted) arrow a : X → M. Thus a Σ-algebra M and an arrow a : X → M give an interpretation in M for all of Σ(X), allowing M to be seen as a Σ(X)-algebra. Theorem 3.2.1 now gives a unique Σ(X)-homomorphism from the initial Σ(X)-algebra T_{Σ(X)} to M as a Σ(X)-algebra, using a.

In such a situation, we call a : X → M an interpretation or an assignment of the variable symbols in X, and we let ā : T_{Σ(X)} → M denote the unique extension of a to a Σ(X)-homomorphism from the term algebra T_{Σ(X)}. □

Definition 3.3.7 A Σ-algebra M satisfies a Σ-equation (∀X) t = t′ iff for any assignment a : X → M we have ā(t) = ā(t′) in M. In this case we write

M ⊨_Σ (∀X) t = t′.

We call ⊨ the "satisfaction relation," and we generally omit the subscript Σ when it is clear from context. A Σ-algebra M satisfies a set A of Σ-equations iff it satisfies each e ∈ A, and in this case we write M ⊨_Σ A.
We may also say that M is a P-algebra, and write M ⊨ P, where P is a specification (Σ, A). The class of all algebras that satisfy P is called the variety defined by P, and we may also say that the denotation of P is this variety.

Finally, for A a set of Σ-equations, we let A ⊨_Σ (∀X) t = t′ mean that M ⊨_Σ A implies M ⊨_Σ (∀X) t = t′. □

Example 3.3.8 If Σ is the signature Σ_MONOID of Example 3.3.3, then a Σ-algebra is a monoid iff it satisfies the equations in Example 3.3.3, i.e., iff it satisfies the specification MONOID of monoids. The denotation of the theory
MONOID is the variety of all monoids. For example, given a set S, recall that S* denotes the set of all lists of elements from S, including the empty list []. Then S* is a monoid with e = [] and with ∗ interpreted as concatenation of lists: for any choice of x, y, z ∈ S*, it is true that (x ∗ y) ∗ z = x ∗ (y ∗ z), because each term yields the concatenation of the same three lists. □

Example 3.3.9 If Σ is the signature Σ_GROUP of Example 3.3.4, then a Σ-algebra is a group iff it satisfies the equations in Example 3.3.4, i.e., iff it satisfies the specification
GROUP. For example, S* satisfies the first and third axioms of GROUP, because it is a monoid, but there is no way to define an operation i : S* → S* such that x ∗ i(x) = [] for all x ∈ S*, because a concatenation of two lists always yields a list that is at least as long as each of its arguments. Indeed, the only concatenation that yields the value [] is [] ∗ []. This is an example of non-satisfaction. □

Exercise 3.3.2
Let S be a set.

1. Show that the bijections f : S → S form a group under composition (of functions), with identity 1_S.
2. Let G be a group of bijections on a set S, and let F ⊆ S, called a "figure." Show that {f ∈ G | f(F) = F} is a subgroup of G, called the group of symmetries of F. □

Exercise 3.3.3 An endomorphism of a Σ-algebra M is a Σ-homomorphism h : M → M, and an automorphism is an endomorphism that is bijective (i.e., an isomorphism).

1. Show that the set of all endomorphisms of a given Σ-algebra M has the structure of a monoid under composition.
2. Show that the set of all automorphisms of a given Σ-algebra M has the structure of a group under composition. □

Example 3.3.10
A rather cute specification that is not very well known defines pairs of natural numbers using just one constant, two unary operations, and a single equation:

  obj 2NAT is
    sort 2Nat .
    op 0 : -> 2Nat .
    ops (s1_) (s2_) : 2Nat -> 2Nat .
    var P : 2Nat .
    eq s1 s2 P = s2 s1 P .
  endo

We can show that pairs of natural numbers form an initial algebra for 2NAT as follows: Let P = {⟨m, n⟩ | m, n ∈ ω}, let 0_P be ⟨0, 0⟩, let s1 send ⟨m, n⟩ to ⟨sm, n⟩, and let s2 send ⟨m, n⟩ to ⟨m, sn⟩. Now if M is any 2NAT-algebra, then define h : P → M to send ⟨m, n⟩ to (the value that is denoted by the term) s1^m s2^n 0 in M. Then h is a Σ-homomorphism because h(0_P) = h(⟨0, 0⟩) = 0_M, and h(s1⟨m, n⟩) = s1(h(⟨m, n⟩)) because each equals the value of s1^{m+1} s2^n 0 in M, and similarly for preservation of s2. We leave it as an exercise to show that if g : P → M is also a Σ-homomorphism, then necessarily g = h.

Another 2NAT-algebra has as carrier the set consisting of all terms of the form s1^m s2^n 0, with s1(s1^m s2^n 0) = s1^{m+1} s2^n 0 and s2(s1^m s2^n 0) = s1^m s2^{n+1} 0. □

The fact that variables are just unconstrained constants suggests that equations with variables can be regarded as ground equations in which the variables are treated as new constants. The following is a formal statement of this basic intuition:
Theorem 3.3.11 (Theorem of Constants) Given a signature Σ, a ground signature X disjoint from Σ, a set A of Σ-equations, and t, t′ ∈ T_{Σ(X)}, then

A ⊨_Σ (∀X) t = t′  iff  A ⊨_{Σ∪X} (∀∅) t = t′.

Proof: Each condition is equivalent to the condition that ā(t) = ā(t′) for every Σ(X)-algebra M satisfying A and every assignment a : X → M. □

It is very pleasing that this proof is so simple. This is because it is based on the semantics of satisfaction, rather than on some particular rules of deduction, and because it exploits the initiality of the term algebra. The intuition behind this result is again that variables "are" constants about which we do not know anything.

Although Example 3.3.4 gives perhaps the most common way to specify groups, there are many other equivalent ways. Therefore it is interesting to examine what "equivalent" means in this context. Of course, we first define equivalence semantically, and after that give syntactic methods for proving equivalence. This section considers only the special case where the two specifications have the same signature; Section 4.10 extends this to allow different signatures.
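For a finite algebra, satisfaction of the three GROUP equations of Example 3.3.4 can be decided by exhaustive search over assignments. A Python sketch (helper names ours) on the integers mod 3:

```python
from itertools import product

def is_group_model(carrier, e, inv, star):
    """Check the GROUP equations: right identity, right inverse,
    and associativity, over all assignments into `carrier`."""
    right_id = all(star(x, e) == x for x in carrier)
    right_inv = all(star(x, inv(x)) == e for x in carrier)
    assoc = all(star(star(x, y), z) == star(x, star(y, z))
                for x, y, z in product(carrier, repeat=3))
    return right_id and right_inv and assoc

mod3 = [0, 1, 2]
add3 = lambda x, y: (x + y) % 3
neg3 = lambda x: (-x) % 3
```

Addition mod 3 passes; max on the same carrier has a right identity but no right inverse, so it is rejected.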
Definition 3.3.12 Σ-specifications P and P′ are equivalent iff for each Σ-algebra M, M ⊨ P iff M ⊨ P′. We can now define a theory to be an equivalence class of specifications. □

It is usual to identify a specification P with the theory that it represents, that is, with its equivalence class; thus, we may say that GROUP is the theory of groups, rather than saying that GROUP represents (or presents) the theory of groups.

Example 3.3.13 (Left Groups) Example 3.3.4 specified the theory
GROUP of groups with right identity and inverse equations; here is a specification with left-handed versions of these equations:

  th GROUPL is
    sort Elt .
    op _-1 : Elt -> Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq e * X = X .
    eq (X -1) * X = e .
    eq X *(Y * Z) = (X * Y)* Z .
  endth

In Chapter 4 we will prove that GROUPL is equivalent to GROUP. □

3.4 Conditional Equations

There are many cases where an equation (or other formula) holds only under certain conditions. The extension of equations and their satisfaction to the conditional case is straightforward.
Definition 3.4.1 A conditional Σ-equation consists of a ground signature X disjoint from Σ, a finite set C of pairs of Σ(X)-terms, and a pair t, t′ of Σ(X)-terms; we will use the notation

(∀X) t = t′ if C.

Given a Σ-algebra M, define

M ⊨_Σ (∀X) t = t′ if C

to mean that, given any interpretation a : X → M, if ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, then ā(t) = ā(t′). If A is a set of conditional Σ-equations, then we say that M satisfies A iff M satisfies each equation in A; and if e is a conditional equation, then A ⊨_Σ e iff M ⊨_Σ e whenever M ⊨_Σ A. □

Conditional equations make sense even when C is not finite; but without that restriction, neither equational deduction nor term rewriting with such equations would be possible in finite space, and in particular, we could not write down (finite) proof scores that use equations with infinite conditions.

Fact 3.4.2
Given any Σ-equation e = (∀X) t = t′, let e′ = (∀X) t = t′ if ∅. Then for each Σ-algebra M, M ⊨_Σ e iff M ⊨_Σ e′. □

Consequently, we can regard any ordinary equation as a conditional equation with the empty condition, and vice versa; we will feel free to do so hereafter.

The following result gives us a technique for proving conditional equations with equational deduction, and hence with reduction; the application of this result to deduction is given in Theorem 4.8.4.
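Satisfaction of a conditional equation by a finite algebra can likewise be checked by enumerating all assignments. A Python sketch, with our own term encoding (a bare string is a variable; a tuple is an operation application):

```python
from itertools import product

def eval_term(t, algebra, assignment):
    if isinstance(t, str):            # a variable
        return assignment[t]
    op, *args = t                     # an operation application
    return algebra[op](*(eval_term(u, algebra, assignment) for u in args))

def satisfies_cond(carrier, algebra, variables, lhs, rhs, conditions):
    """M |= (forall X) lhs = rhs if C: for every assignment, if all
    pairs in C evaluate equal, then lhs and rhs must evaluate equal."""
    for values in product(carrier, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(eval_term(u, algebra, a) == eval_term(v, algebra, a)
               for u, v in conditions):
            if eval_term(lhs, algebra, a) != eval_term(rhs, algebra, a):
                return False
    return True

# Example: (forall x, y) x = y if { dbl(x) = dbl(y) }, where dbl doubles.
mod3_alg = {"dbl": lambda x: (2 * x) % 3}   # doubling is injective mod 3
mod4_alg = {"dbl": lambda x: (2 * x) % 4}   # but not mod 4
cond = [(("dbl", "x"), ("dbl", "y"))]
```

Mod 3 the conditional equation holds (doubling is injective there); mod 4 it fails, witnessed by x = 0, y = 2.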
Proposition 3.4.3 Given a conditional Σ-equation (∀X) t = t′ if C and a set A of Σ-equations, then

A ⊨_Σ (∀X) t = t′ if C  iff  (A ∪ C′) ⊨_{Σ(X)} (∀∅) t = t′,

where C′ is defined to be {(∀∅) u = v | ⟨u, v⟩ ∈ C}.

Proof: Each condition is equivalent to the following:

for each Σ-algebra M and interpretation a : X → M, if M ⊨ A and if ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, then ā(t) = ā(t′),

where ā : T_{Σ(X)} → M is the unique homomorphism. □

Once again, initiality enables us to give a very simple proof.

[Figure 3.2: Free Algebra Property]
3.5 Substitution

The substitution of terms into other terms will play a basic role in later chapters, especially those on deduction and term rewriting. To help define this concept, we first consider the so-called free algebras, using a technique already employed in setting up Definition 3.3.7. Given a signature Σ and a ground signature X disjoint from Σ, we can form the Σ(X)-algebra T_{Σ(X)} and then view it as a Σ-algebra by just "forgetting" about the constants in X; this works because T_{Σ(X)} already has all the operations it needs to be a Σ-algebra, and it does no harm that it also has some others. Let us denote this Σ-algebra by T_Σ(X). It is called the free Σ-algebra generated by (or over) X, and it has the following characteristic property, called free generation by (or over) X (see also Figure 3.2):

Proposition 3.5.1
Given a signature Σ, a ground signature X disjoint from Σ, a Σ-algebra M, and a map a : X → M, there is a unique Σ-homomorphism ā : T_Σ(X) → M which extends a, in the sense that ā_s(x) = a_s(x) for each s ∈ S and x ∈ X_s. (This property is illustrated in Figure 3.2, where i_X is the S-sorted inclusion.) We may call a an assignment from X to M.

Proof: Let j be the interpretation of Σ in M. Then combining j with a gives an interpretation of Σ(X) in M, and hence makes M into a Σ(X)-algebra. Therefore, by initiality of T_{Σ(X)}, there is a unique Σ(X)-homomorphism from T_{Σ(X)} to M. But this is exactly the same thing as a Σ-homomorphism from T_Σ(X) to M that extends a. □

We have already noted that a Σ-term with variables in an S-sorted ground signature Y is just an element of T_Σ(Y). Then an assignment a : X → T_Σ(Y) assigns Σ-terms with variables from Y to variables from X in a way that respects the sorts involved, and the Σ-homomorphism ā : T_Σ(X) → T_Σ(Y) given by Proposition 3.5.1 substitutes the term a(x) for each variable x ∈ X into each term t in T_Σ(X), yielding a term ā(t) in T_Σ(Y). Hence we have the following:

Definition 3.5.2 A substitution of Σ-terms with variables in Y for variables in X is an arrow a : X → T_Σ(Y); we may also use the notation a : X → Y. The application of a to t ∈ T_Σ(X) is ā(t). Given substitutions a : X → T_Σ(Y) and b : Y → T_Σ(Z), their composition a ; b (as substitutions) is the S-sorted arrow a ; b̄ : X → T_Σ(Z). □

Notation 3.5.3
The following notation makes substitutions look less abstract: given t ∈ T_Σ(X) and a : X → T_Σ(Y) such that |X| = {x_1, …, x_n} and a(x_i) = t_i for i = 1, …, n, then we may write ā(t) in the form

t(x_1 ← t_1, x_2 ← t_2, …, x_n ← t_n),

and whenever t_i is the variable x_i, we may omit the pair x_i ← t_i. □

Exercise 3.5.1
Let Σ be the signature of Example 2.3.4, let X = {x, y, z}_{[],Nat}, let t = x + s(s(y) + z), let Y = {u, v}_{[],Nat}, and define a : X → T_Σ(Y) by a(x) = u + s(v), a(y) = 0, and a(z) = v + s(0). Now compute ā(t). □

Exercise 3.5.2 If i_X : X → T_Σ(X) is the inclusion, show that ī_X(t) = t for each t ∈ T_Σ(X). □

Exercise 3.5.3
Given a substitution a : X → T_Σ(Y), show that i_X ; a = a and a ; i_Y = a. □

Notation 3.5.4
Because i_X serves as an identity for the composition of substitutions, we may write 1_X for i_X in the following. □

It is natural to expect that term substitution is associative, in the sense that given substitutions a : W → T_Σ(X), b : X → T_Σ(Y) and c : Y → T_Σ(Z), we have (a ; b) ; c = a ; (b ; c). We will see in the next section that there is a simple and beautiful proof of this using the free property of term algebras.

Exercise 3.5.4
Is substitution commutative? I.e., given a, b : X → T_Σ(X), does a ; b = b ; a? Give a proof or a counterexample. □

3.6 Pasting and Chasing

There is a very nice way to graphically represent systems of equations, such as those that arise from the homomorphism condition; this is the method of commutative diagrams. We will later see that this method not only allows us to represent systems of equations graphically, but also to reason about them graphically; such reasoning with diagrams is often called diagram chasing because of the characteristic way in which one follows arrows around the diagram. In order to explain this in a precise way, we first need some further concepts from graph theory.

Definition 3.6.1 A path p in a graph G is a list e_1, …, e_m of edges of G such that ∂_1(e_i) = ∂_0(e_{i+1}) for i = 1, …, m − 1; the source of p is ∂_0(p) = ∂_0(e_1) and the target of p is ∂_1(p) = ∂_1(e_m); we will write p : n → n′ when ∂_0(p) = n and ∂_1(p) = n′. If m = 0, then p would be the empty list [], and ∂_0([]) and ∂_1([]) would not be defined; so instead, we make the source and target of [] explicit, writing []_n. Given p : n → n′ and q : n′ → n″, we define their composition p ; q : n → n″ to be their concatenation as lists. Note that []_n is an identity for this composition, in the sense that []_n ; p = p and p ; []_m = p, for any path p with source n and target m; in practice, we may omit the subscripts on [].

A graph G is a tree iff it has a node r, called its root, such that for each node n of G, there is a unique path from r to n in G. □

One motivation for being so formal about all of this is that mechanical theorem proving is necessarily formal to this extent.
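The bookkeeping in this definition — matching each edge's target to the next edge's source, and the explicit empty path []_n — translates directly into code. A Python sketch with our own names, representing each edge as a (source, target) pair:

```python
class Path:
    """A path: a list of edges, each a (source, target) pair, together
    with explicit source and target nodes so that the empty path []_n
    is well defined."""
    def __init__(self, edges, source, target):
        node = source
        for (a, b) in edges:          # d1(e_i) must equal d0(e_{i+1})
            assert a == node, "edges do not compose"
            node = b
        assert node == target
        self.edges, self.source, self.target = edges, source, target

    def then(self, other):
        """Composition p ; q: defined when target(p) = source(q),
        given by concatenation of the edge lists."""
        assert self.target == other.source
        return Path(self.edges + other.edges, self.source, other.target)

def empty(n):
    """The empty path []_n at node n."""
    return Path([], n, n)

p = Path([(1, 2), (2, 3)], 1, 3)
q = Path([(3, 1)], 3, 1)
```

Note that `empty(n).then(p)` and `p.then(empty(target))` both return paths with the same edge list as p, mirroring the identity laws for []_n.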
Exercise 3.6.1 Show that the root of a tree is necessarily unique. □

Exercise 3.6.2 Given a graph G with at most one edge between each pair of nodes, show that a path p = e_1 … e_k in G is uniquely determined by the sequence n_0 n_1 … n_{k−1} n_k of the nodes that it passes through, where n_0 = ∂_0(e_1), n_1 = ∂_1(e_1) = ∂_0(e_2), …, n_{k−1} = ∂_1(e_{k−1}) = ∂_0(e_k) and n_k = ∂_1(e_k). □

Definition 3.6.2 A diagram (of (sorted) sets) is a graph whose nodes are labelled by (sorted) sets, and whose edges are labelled by (sorted) arrows, in such a way that

• if the arrow f : A → A′ labels the edge e : n → n′, then the label of n is A and the label of n′ is A′.

A diagram commutes iff

• whenever p, q are two paths, at least one of which has length at least 2, each with (say) source n and target n′, such that the labels along the edges of p are f_1, …, f_m and along q are g_1, …, g_k, then f_1 ; … ; f_m = g_1 ; … ; g_k.

That is, the arrows obtained by composition along any two (non-trivial) paths from one node to another are equal. □
[Figure 3.3: Length One Paths]

[Figure 3.4: Commutative Diagrams for Graph Homomorphism]

Thus, a commutative diagram is a geometrical presentation of a system of (non-trivial) equations among arrows. The reason for excluding paths of length 1 in the above definition is that we want diagrams of the form in Figure 3.3 to say that f ; g = f ; h, without also saying that g = h.

For example, we can express the two equations which say that an S-sorted arrow h : G → G′ is a graph homomorphism by the two commutative diagrams shown in Figure 3.4, in which h_E and h_N denote the Edge and Node components of h, respectively. Similarly, the three diagrams in Figure 3.5 express the conditions for a sorted arrow to be an automaton homomorphism (here h, i, j denote the Input, State and Output components of the homomorphism, respectively).

[Figure 3.5: Commutative Diagrams for Automaton Homomorphism]

We now give a more complex example (it can be skipped at first reading):

Example 3.6.3 (⋆) (Tree of a Term) Given a signature Σ, we will construct a "Σ-algebra of labelled graphs," some of whose elements will be the trees that represent Σ-terms. Let S be the sort set of Σ, and recall that |Σ| = ⋃_{w,s} Σ_{w,s}. Now define G_Σ to be the Σ-algebra where, for each s ∈ S, G_{Σ,s} is the set of all node-labelled graphs G = (E, N, L, ∂_0, ∂_1, l) having E, N ⊆ ω* and L = |Σ|, with each σ ∈ Σ_{[],s} interpreted as the graph G_σ = (∅, {[]}, L, ∂_0, ∂_1, l_σ) where l_σ([]) = σ, and each σ ∈ Σ_{s1...sm,s} interpreted as the arrow which sends graphs G_1, …, G_m to the graph G = (E, N, L, ∂_0, ∂_1, l) in which

• N = {[]} ∪ ⋃_{i=1}^m i · N_i, where · denotes concatenation for lists of naturals, and N_i is the node set of the graph G_i,
• E = N − {[]},
• ∂_0(i_1 … i_n) = i_1 … i_{n−1},
• ∂_1(i_1 … i_n) = i_1 … i_n,
• l([]) = σ, and
• l(i_1 … i_n) = l_{i_1}(i_2 … i_n) for n > 0, where l_k is the label function of G_k for k = 1, …, m.

Then the unique Σ-homomorphism h : T_Σ → G_Σ sends each Σ-term to its |Σ|-labelled tree representation. □

Exercise 3.6.3 (⋆) Show that the trees shown in Figure 2.5 actually do arise in the manner of Example 3.6.3 from the terms shown after Definition 2.6.1. □
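Commutativity of a diagram of finite sets can be tested pointwise: compose the functions along each of two parallel paths and compare them on every element of the common source set. A Python sketch (toy graph-homomorphism square; all names are ours):

```python
def compose(*fs):
    """Compose functions in diagrammatic order: compose(f, g) = f ; g."""
    def composite(x):
        for f in fs:
            x = f(x)
        return x
    return composite

def paths_agree(source_set, p, q):
    """Do two parallel paths (lists of functions) have equal composites
    on every element of the shared source set?"""
    return all(compose(*p)(x) == compose(*q)(x) for x in source_set)

# Toy square for a graph homomorphism h: edges and nodes of G are
# 0, 1, 2; G' relabels them as 10, 11, 12; d0 takes an edge to its source.
edges = [0, 1, 2]
d0 = lambda e: e            # in G, edge e starts at node e
d0p = lambda e: e           # same shape in G'
hE = lambda e: e + 10       # edge component of h
hN = lambda n: n + 10       # node component of h
```

Here the two paths around the square, [d0, hN] and [hE, d0p], agree on every edge, so the square commutes.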
Commutative diagrams are a well-established proof technique in modern algebra, and are increasingly used in computing science as well. In this technique, a diagram represents a system of simultaneous equations (among compositions of arrows), and geometrical operations on diagrams correspond to algebraic operations on systems of equations. One such operation is called "pasting," because geometrically it amounts to pasting commutative diagrams together, whereas algebraically it amounts to combining systems of equations.

(Later we will see how equations among other kinds of entities fall into the same framework.)

For example, we can prove that the composition of two graph homomorphisms is a graph homomorphism by diagram pasting, rather than by calculation (as in Exercise 3.1.2): assume that we are given homomorphisms h : G → G′ and h′ : G′ → G″. Then Figure 3.4 shows the diagrams for h; those for h′ are similar, but with an additional ′ symbol everywhere. For the operation symbol ∂_0, Figure 3.6 shows the two diagrams, let us call them P_1 and P_2, that we wish to paste together, with their common subdiagram P_0 and their union P. The fact that P
[Figure 3.6: Commutative Diagrams for Graph Homomorphism Proof]

commutes then gives us that the rightmost diagram commutes, which is what we really want. (The case of ∂_1 is similar.)

It is easy to give a formal justification for this assertion using equational reasoning. The leftmost two squares represent the two equations

∂_0 ; h_N = h_E ; ∂_0′
∂_0′ ; h_N′ = h_E′ ; ∂_0″

and so we can prove commutativity of the square in which we are interested as follows:

∂_0 ; (h_N ; h_N′) = (∂_0 ; h_N) ; h_N′ = (h_E ; ∂_0′) ; h_N′ = h_E ; (∂_0′ ; h_N′) = h_E ; (h_E′ ; ∂_0″) = (h_E ; h_E′) ; ∂_0″.

Geometrically, this argument simply says that the functions along each outside path are equal to the function along the path through the central edge, namely h_E ; ∂_0′ ; h_N′.

This argument is typical of those used to justify diagram pasting. It is also a typical (though rather simple) diagram chase. Usually such arguments are done geometrically, preferably on a black (or white) board, and are omitted in written documents.

Example 3.6.4
Although pasting commutative diagrams works just as well withtriangles, pentagons and other polygons as it does with squares, it isworth remarking that there are cases where the union of a collection ofcommutative diagrams is not commutative. Hence some caution must be observed. Consider, for example, the diagram in Figure 3.7, in which N is the natural numbers, Z is the integers, each edge labelled 1 isthe identity on N , each diagonal is the inclusion map N → Z , and thefour outer maps ( a, b, c, d ) are arbitrary except that they restrict to theidentity on N . For example, we might choose the following, for i ∈ Z , a(i) = (cid:40) i if i ∈ N − i (cid:54)∈ N b(i) = (cid:40) i if i ∈ N − i (cid:54)∈ N asting and Chasing Z ZZZ N NNN c dba 1 111 (cid:45) (cid:63)(cid:45)(cid:63) (cid:45) (cid:63)(cid:45)(cid:63)(cid:64)(cid:64)(cid:64)(cid:64)(cid:73) (cid:0)(cid:0)(cid:0)(cid:0)(cid:18)(cid:64)(cid:64)(cid:64)(cid:64)(cid:82)(cid:0)(cid:0)(cid:0)(cid:0)(cid:9)
Figure 3.7: A Non-Commutative Diagram

    c(i) = { i if i ∈ N; −2i if i ∉ N }     d(i) = { i if i ∈ N; −2i if i ∉ N }

Then for all i ∈ Z, (a ; b)(i) = b(i) and (c ; d)(i) = d(i); so in particular,

    (a ; b)(−2) = 2
    (c ; d)(−2) = 4.

So a ; b ≠ c ; d. □

Because "diagram chasing" refers to arguments made using diagrams, diagram pasting may be considered a particular kind of diagram chasing. Another common form involves using initiality (or freeness) to argue that because there are two arrows between two nodes (with some property), they must be equal. Both are illustrated in the following elegant proof of the associativity of substitution:
Proposition 3.6.5 ( Associativity of Substitution ) Given substitutions a : W → T Σ (X) , b : X → T Σ (Y ) , c : Y → T Σ (Z) , then (a ; b) ; c = a ; (b ; c). Proof:
The assertion to be proved translates to (a ; b̄) ; c̄ = a ; (b ; c̄)‾, where ";" indicates composition of ordinary (many-sorted) arrows. By the usual associative law for such arrows, it suffices to show that

    b̄ ; c̄ = (b ; c̄)‾.

By the uniqueness condition of Proposition 3.5.1, the above equation will follow from showing that b̄ ; c̄ is a Σ-homomorphism extending b ; c̄.

Figure 3.8: Associativity of Substitution Proof Diagram
If we let i : X → T_Σ(X) denote the injection, then what we have to show is that

    i ; (b̄ ; c̄) = b ; c̄.

But this follows from i ; b̄ = b, which is just commutativity of the middle bottom triangle. □

This proof looks much simpler and more elegant if done by chasing the right hand two thirds of the diagram in Figure 3.8 on a white or blackboard. By contrast, to prove the result by direct manipulation of the set-theoretic representation of terms would require several pages of very tedious calculation. It is worth drawing out the following key element of the proof, because it is needed later on:
Corollary 3.6.6
Given substitutions a : W → T_Σ(X), b : X → T_Σ(Y), then ā ; b̄ = (a ; b)‾. □

3.7 (⋆) Parse
Because a signature Σ can be overloaded, Σ-terms can also be overloaded, and it is useful to characterize when this can happen.

Definition 3.7.1 An S-sorted signature Σ is regular iff σ ∈ Σ_{w,s} ∩ Σ_{w,s′} implies s = s′. A Σ-term t is overloaded iff there are distinct sorts s, s′ ∈ S such that t ∈ T_{Σ,s} ∩ T_{Σ,s′}. □

Notice that regularity implies in particular that the sets of constant symbols of distinct sorts are disjoint, i.e., that no constant symbol has more than one sort.
Proposition 3.7.2
A signature Σ is regular iff there are no overloaded Σ -terms. Proof:
Assume that Σ is regular. By induction on the depth of terms, we will show that Σ-terms t, t′ of depths ≤ d with distinct sorts s, s′ must be different. For d = 0, suppose that t = σ and t′ = σ′; then w = w′ = [] and the result follows directly from regularity. For depth d > 0, t = σ(t₁, . . . , t_m) and t′ = σ′(t′₁, . . . , tₙ′). Because Σ-terms are lists, and lists have unique factorizations, t = t′ implies σ = σ′, m = n and tᵢ = tᵢ′ for i = 1, . . . , n. Now the induction hypothesis implies that tᵢ and tᵢ′ have the same sort for i = 1, . . . , n. Therefore w = w′, and hence regularity gives s = s′.

Conversely, suppose that t = σ(t₁, . . . , tₙ) ∈ T_{Σ,s} ∩ T_{Σ,s′} is a minimal overloading, in the sense that s ≠ s′ and none of the tᵢ are overloaded. Then necessarily σ ∈ Σ_{w,s} ∩ Σ_{w,s′} where w = s₁ . . . sₙ and sᵢ is the sort of tᵢ (for i = 1, . . . , n). Thus Σ is not regular. □

This result does not consider ambiguities due to mixfix syntax, because it only uses the prefix-with-parentheses syntax. These more elaborate kinds of ambiguities are considered below.
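Regularity and overloading are easy to check mechanically. The following Python sketch is not from the text; it assumes our own encoding of a signature as a dict from operation symbols to sets of ranks (w, s), with w a tuple of argument sorts, and of terms as nested tuples:

```python
def is_regular(sig):
    """A signature is regular iff no symbol has two ranks with the same
    arity w but different result sorts s."""
    for ranks in sig.values():
        seen = {}
        for (w, s) in ranks:
            if w in seen and seen[w] != s:
                return False
            seen[w] = s
    return True

def sorts_of(term, sig):
    """All sorts a term (op, subterms...) can receive; for a regular
    signature this set has at most one element (Proposition 3.7.2)."""
    op, *args = term
    argsorts = [sorts_of(a, sig) for a in args]
    return {s for (w, s) in sig[op]
            if len(w) == len(args)
            and all(wi in ai for wi, ai in zip(w, argsorts))}

# A non-regular signature: the constant 'c' is declared with two sorts,
# so the term c is overloaded; 'good' repairs this.
bad = {'c': {((), 'A'), ((), 'B')}, 'f': {(('A',), 'A')}}
good = {'c': {((), 'A')}, 'f': {(('A',), 'A'), (('B',), 'B')}}
```

Running `sorts_of(('c',), bad)` returns both sorts, exhibiting the overloaded term promised by the proposition.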
Exercise 3.7.1
Give a simple non-regular signature Σ and a simple overloaded Σ-term. □

We now generalize the definitions of signature and term to the case of mixfix syntax:
Definition 3.7.3
Let A be some fixed set of characters that does not include the underbar character "_" or the three special symbols ·, (, and ). Then a form is a list in (A ∪ { _ })*, and the arity of a form is the number of _'s that occur in it. A (many-sorted) mixfix signature is an indexed family { Σ_{w,s} | w ∈ S*, s ∈ S } for some set S of sorts, where each Σ_{w,s} is a set of forms whose arity is the length of w. □

Example 3.7.4
Let A = { a, b, + }. Then the following are all forms:

    a,  a_,  _a,  a_b,  _+_,  __.

The first has arity 0, the next three have arity 1, and the last two have arity 2. The first defines syntax for a constant, while the second through last (respectively) define syntax for prefix, postfix, outfix, infix, and juxtaposition operations. □
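The arity count and the filling of underbars can be animated in a few lines of Python (our own sketch, not part of the text; forms are encoded as ordinary strings):

```python
def arity(form):
    """The arity of a form is the number of underbar place-holders in it."""
    return form.count('_')

def fill(form, args):
    """Fill the underbars of `form` with the strings in `args`, left to
    right, yielding the concatenation k1 t1 k2 t2 ... kn tn k(n+1)."""
    ks = form.split('_')                 # k1, ..., k(n+1); some may be empty
    assert len(args) == len(ks) - 1, "arity mismatch"
    out = [ks[0]]
    for t, k in zip(args, ks[1:]):
        out.append(t)
        out.append(k)
    return ''.join(out)
```

For instance, filling the infix form `_+_` with `a` and `b` yields `a+b`, and filling the juxtaposition form `__` yields `ab`.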
We now give a recursive construction for mixfix Σ-terms:

Definition 3.7.5 If Σ is an S-sorted mixfix signature disjoint from S, then the S-sorted set M_Σ of all mixfix (ground) Σ-terms is the smallest set of lists over the set A ∪ { ·, (, ) } ∪ S such that

(0) if f ∈ Σ_{[],s} then f · s ∈ M_{Σ,s} for all s ∈ S, and

(1) if f ∈ Σ_{s1...sn,s} for n > 0 and tᵢ ∈ M_{Σ,sᵢ} for i = 1, . . . , n, then (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s ∈ M_{Σ,s}, where f = k₁ _ k₂ _ . . . kₙ _ kₙ₊₁ (note that some of the kᵢ may be the empty list).
As with terms in T_Σ, we will usually omit the sort annotation unless it is necessary. □

Every mixfix signature Σ is also an ordinary signature. However, the Σ-terms will look rather different in the two cases. For example, if _+_ ∈ Σ_{ss,s} and a, b ∈ Σ_{[],s}, then a + b is in M_Σ but not in T_Σ, whereas _+_(a, b) is in T_Σ but not in M_Σ. If extra clarity is needed, we will explicitly distinguish the ordinary signature corresponding to a mixfix signature Σ.

Definition 3.7.6
Given a mixfix signature Σ disjoint from S, we can give M_Σ the structure of a Σ-algebra in the following way:

(0) interpret f ∈ Σ_{[],s} in M_Σ as the singleton list f · s, and

(1) interpret f ∈ Σ_{s1...sn,s} with n > 0 in M_Σ as the function sending t₁, . . . , tₙ to the list (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s, where tᵢ ∈ M_{Σ,sᵢ} for i = 1, . . . , n and f = k₁ _ k₂ _ . . . kₙ _ kₙ₊₁.

Thus, we have that (M_Σ)_f(t₁, . . . , tₙ) = (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s, although it will usually be written k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁. □

It follows that there is a unique Σ-homomorphism T_Σ → M_Σ; the following uses this fact in comparing the two kinds of Σ-terms:

Definition 3.7.7
A mixfix signature Σ is sort ambiguous iff the carriers of M_Σ are non-disjoint, and is mixfix ambiguous iff the unique homomorphism h : T_Σ → M_Σ is non-injective. Given m ∈ M_Σ, if h(t) = m then t is said to be a parse of m. □

Exercise 3.7.2 Suppose that a mixfix signature Σ has a single sort A, and also has Σ_{[],A} = { a }, and Σ_{A,A} = { a_, a_a, _a }, with all other Σ_{w,s} = ∅. Then
1. Show that Σ is mixfix ambiguous; in particular, show that aaa has five distinct parses, and write each one out.

2. How many parses does aaaa have?

3. Noting that Σ is not sort ambiguous, construct an ordinary many-sorted signature that is sort ambiguous. □

Exercise 3.7.3
Show that a mixfix signature is mixfix ambiguous iff some term has at least two distinct parses. □

As usual, T_Σ here really means the terms of the ordinary signature corresponding to the mixfix signature Σ.

3.8 Literature

Relatively few books develop many-sorted general algebra in any depth. Bergstra et al. [9], Ehrig and Mahr [45] and van Horenbeck [181] each develop a certain amount for the algebraic specification theory which is their main concern. I am not aware of any mathematics text that develops many-sorted general algebra in any detail. The notation and approach of this chapter continues that of the previous chapter, following ideas from [52] as further developed in [137, 78] and other publications.

Initial algebra semantics (as discussed in Section 3.2) originated in [52], and was further developed in [88]. It can be seen as an algebraic formulation of the so-called attribute semantics of Knuth [116].

It is common to treat both variables and substitutions either intuitively, or else with extreme logical formalism; the approach given here tries to find a middle ground. Our Theorem of Constants (Theorem 3.3.11) is analogous to a well-known result in first-order logic. This result is not usually treated in the computing or the general algebra literature, although it is not difficult, and it plays an important role in justifying proofs by term rewriting.

It is known that conditional equations have more expressive power than unconditional equations, in the sense that there are algebras that are initial models of a specification having conditional equations but are not initial models of any specification having only unconditional equations [176].

The proof of associativity for substitution (Proposition 3.6.5) follows [89], and is the same as the more abstract proof which a category theorist would call "associativity of composition in a Kleisli category" (the necessary concepts are beyond the scope of this book, but may be found, for example, in [126]).
Structural induction was introduced to computer science by Burstall [21] in 1969.

The formulation of theory equivalence at the end of Section 3.3 is a more concrete and many-sorted version of Lawvere's category-theoretic formulation for theories [121]; see [130] and [5] for further information on Lawvere theories, and see Section 4.10 for some techniques for proving equivalence.
The emphasis on models and satisfaction in this and the previous chapter (as well as in subsequent chapters) was influenced by the theory of institutions [67], which axiomatizes the notion of "logical system" using satisfaction. But as Wittgenstein is said to have remarked,
Is a proof not also part of an institution? and as Thomas Jefferson said on July 12, 1816,
Laws and institutions must go hand in hand with the progress of the human mind.
And indeed, laws and proofs play a major role in the rest of this book, as they must in any study of theorem proving.
A Note for Lecturers:
Emphasizing the "microprocessor" interpretation of initiality in the discussion after Theorem 3.2.1 can considerably sharpen students' intuitions, and some concrete examples with drawings can strengthen this process.

The proof of Proposition 3.6.5 should be done as a live diagram chase on the board; this is a lot of fun, and it is also the best way to bring out the essential simplicity of this proof.
Equational Deduction
This chapter considers how to correctly deduce new equations from old ones. We give several finite sets of rules for equational deduction that are both sound, i.e., truth preserving, and complete for loose semantics, in the sense that every equation that is true in all models of a given set of equations can be deduced from that set using these rules. Such results are important because they say that we can find out what is true by using formal, finitary manipulations of finite syntactic objects, whereas the semantic definition of truth (by satisfaction, Definition 3.3.7 of the previous chapter) in general requires examining infinite sets of infinite objects (since algebras in general have infinite carriers); obviously, such an examination cannot be done on any real computer, which has only a finite amount of memory. Nonetheless, satisfaction remains fundamental, because it provides the standard of correctness for deduction. Equational deduction also has many important applications in computer science and elsewhere (see Section 4.12).
4.1 Rules of Deduction

Equational deduction is reasoning with just the properties of equality. Basic properties of equality include the following:

(1) Anything is equal to itself; this is the reflexivity of equality.

(2) If t equals t′, then t′ equals t; this is the symmetry of equality.

(3) If t equals t′ and t′ equals t″, then t equals t″; this is the transitivity of equality.

(4) If t₁ equals t₁′, and t₂ equals t₂′, . . . , and tₙ equals tₙ′, and if t has variables x₁, . . . , xₙ, then the result of substituting tᵢ for xᵢ in t equals the result of substituting tᵢ′ for xᵢ in t; this is called the congruence property of equality, and may be paraphrased as saying that substituting equal expressions into the same expression yields equal expressions.

(5) If t equals t′ where t and t′ involve variables x₁, . . . , xₙ, and if t₁, . . . , tₙ are terms, then the result of substituting tᵢ for xᵢ in t equals the result of substituting tᵢ for xᵢ in t′; this is called the substitutivity (or instantiation) property of equality, and may be paraphrased as saying that any substitution instance of an equation is an equation.

In these properties, the various t's are terms, which may involve variables; furthermore, both (4) and (5) involve substituting terms for variables. We will see that it is necessary to keep careful track of variables during equational deduction, or else soundness can be lost. This motivates the following:

Notation 4.1.1
We will use the notation of Definition 3.3.1 for equations that appear in deduction, writing (∀X) t = t′, where all variables in t and t′ are taken from X. Also, we will write θ : X → T_Σ(Y) for a substitution of terms from T_Σ(Y) for variables in X, as in Definition 3.5.2. Finally, we adopt the convention that if t ∈ T_Σ(X) is a Σ-term with variables in X, then the result of substituting θ(x) for each x in X into t may be written θ(t), rather than θ̄(t) as in Definition 3.5.2. □

The simple example below reviews notation and concepts, in preparation for more complex material to come.
Example 4.1.2
Suppose there is just one sort, say Elt, and that X has three variables of that sort, say x, y, z. Let Y = { x, w } and define θ : X → T_Σ(Y) by

    θ(x) = x⁻¹⁻¹,  θ(y) = w⁻¹,  θ(z) = x ∗ x⁻¹.

Now if t = (x ∗ y)⁻¹, then θ(t) = (x⁻¹⁻¹ ∗ w⁻¹)⁻¹. □

We can now give the following formal versions of the above properties of equality:
Definition 4.1.3
Given a signature Σ and a set A of Σ -equations, called the ax-ioms or assumptions , the following rules of deduction define the Σ -equations that are deducible (or provable or inferable ) ( from A ):(0) Assumption . Each equation in A is deducible.(1) Reflexivity . Each equation of the form ( ∀ X) t = t is deducible.(2) Symmetry . If ( ∀ X) t = t (cid:48) is deducible, then so is ( ∀ X) t (cid:48) = t .(3) Transitivity . If the equations ( ∀ X) t = t (cid:48) , ( ∀ X) t (cid:48) = t (cid:48)(cid:48) are deducible, then so is ( ∀ X) t = t (cid:48)(cid:48) . ules of Deduction (4) Congruence . If θ, θ (cid:48) : Y → T Σ (X) are substitutions such that foreach y ∈ Y , the equation ( ∀ X) θ(y) = θ (cid:48) (y) is deducible, then given any t ∈ T Σ (Y ) , the equation ( ∀ X) θ(t) = θ (cid:48) (t) is also deducible.(5) Instantiation . If ( ∀ Y ) t = t (cid:48) is in A , and if θ : Y → T Σ (X) is a substitution, then the equation ( ∀ X) θ(t) = θ(t (cid:48) ) is deducible. (cid:2) The next section will give a formal definition of equational deductionusing these rules, and will define a relation A (cid:96) e indicating that e can be deduced from A . But first, we illustrate what it is that we wish toformalize: Example 4.1.4 ( Left Groups ) Suppose we want to prove that the right inverselaw ( ∀ x) x ∗ x − = e holds in the specification GROUPL of Example 3.3.4, which is reproducedbelow, except that the variables A , B , and C are used instead of X , Y ,and Z , and a precedence declaration has been added for the inverseoperation -1 .Precedence provides a way to declare that some operation symbolsare “stronger” or “more binding” than others. For example, the usualconventions for mathematical notation assume that x ∗ x − means x ∗ (x − ) rather than (x ∗ x) − , because − binds more tightly than ∗ . In OBJ3, precedence is defined by giving a natural number p as an“attribute” of an operation, in the form [prec p] following the opera-tion’s sort. 
Lower precedence means tighter binding, and a binary infix operation symbol like * has a default precedence of 41. (We do not give a formal treatment of precedence in this text, although it is not difficult to do so; see [90] for further discussion.)

th GROUPL is
  sort Elt .
  op _*_ : Elt Elt -> Elt .
  op e : -> Elt .
  op _-1 : Elt -> Elt [prec 2] .
  var A B C : Elt .
  eq e * A = A .
  eq A -1 * A = e .
  eq A * (B * C) = (A * B) * C .
endth

Let G.1, G.2 and G.3 denote the three equations in the specification
GROUPL above, in their order of appearance. Equational Deduction
As an illustration, let us apply rule (4) to the equation

    (∀x) e = x⁻¹⁻¹ ∗ x⁻¹,

with X = { x } and Y = { z, x } and t = z ∗ (x ∗ x⁻¹). Then we want θ(z) = e and θ′(z) = x⁻¹⁻¹ ∗ x⁻¹, while θ(x) = θ′(x) = x, so that the result is the equation

    (∀x) e ∗ (x ∗ x⁻¹) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹).

The following is a deduction for the right inverse law, using the rules of Definition 4.1.3. In this display, some of the deduced equations are named by bracketed numbers given to their left. Also, each step of deduction is annotated to the right of its equation by the rule used, together with the names of any equations used (in addition to the one on the preceding line) after the word "on".

        (∀x) e ∗ (x ∗ x⁻¹) = x ∗ x⁻¹                                  (5) on G.1
    [1] (∀x) x ∗ x⁻¹ = e ∗ (x ∗ x⁻¹)                                  (2)
        (∀x) x⁻¹⁻¹ ∗ x⁻¹ = e                                          (5) on G.2
    [2] (∀x) e = x⁻¹⁻¹ ∗ x⁻¹                                          (2)
        (∀x) e ∗ (x ∗ x⁻¹) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹)                (4) on [2] with t = z ∗ (x ∗ x⁻¹)
    [3] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹)                      (3) on [1]
        (∀x) (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹) = ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹    (5) on G.3
    [4] (∀x) x ∗ x⁻¹ = ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹                      (3) on [3]
        (∀x) x⁻¹⁻¹ ∗ (x⁻¹ ∗ x) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ x                    (5) on G.3
        (∀x) (x⁻¹⁻¹ ∗ x⁻¹) ∗ x = x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)                    (2)
        (∀x) ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹ = (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹    (4) with t = z ∗ x⁻¹
    [5] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹                      (3) on [4]
        (∀x) x⁻¹ ∗ x = e                                              (5) on G.2
        (∀x) (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹ = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹            (4) with t = (x⁻¹⁻¹ ∗ z) ∗ x⁻¹
    [6] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹                              (3) on [5]
        (∀x) x⁻¹⁻¹ ∗ (e ∗ x⁻¹) = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹                    (5) on G.3
        (∀x) (x⁻¹⁻¹ ∗ e) ∗ x⁻¹ = x⁻¹⁻¹ ∗ (e ∗ x⁻¹)                    (2)
    [7] (∀x) x ∗ x⁻¹ = x⁻¹⁻¹ ∗ (e ∗ x⁻¹)                              (3) on [6]
        (∀x) e ∗ x⁻¹ = x⁻¹                                            (5) on G.1
        (∀x) x⁻¹⁻¹ ∗ (e ∗ x⁻¹) = x⁻¹⁻¹ ∗ x⁻¹                          (4) with t = x⁻¹⁻¹ ∗ z
    [8] (∀x) x ∗ x⁻¹ = x⁻¹⁻¹ ∗ x⁻¹                                    (3) on [7]
        (∀x) x⁻¹⁻¹ ∗ x⁻¹ = e                                          (5) on G.2
    [9] (∀x) x ∗ x⁻¹ = e                                              (3) on [8]

This rather tedious proof is illustrated in the "proof tree" shown in Figure 4.1, in which each arrow indicates the application of the rule of deduction with which it is labelled (the last step is omitted). Section 4.5 will give a more powerful rule that will allow us to give a much easier proof of this result. □
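The mechanical character of these rules can be seen by animating two of the steps above in Python. The sketch below is our own (terms are encoded as nested tuples, and the postfix ⁻¹ as a hypothetical 'inv' symbol; none of this is OBJ3 syntax):

```python
def subst(theta, t):
    """Apply a substitution (dict from variables to terms) to a term;
    variables are strings, other terms are tuples (op, subterms...)."""
    if isinstance(t, str):
        return theta.get(t, t)       # variables not in theta map to themselves
    return (t[0],) + tuple(subst(theta, s) for s in t[1:])

def mul(a, b): return ('*', a, b)
def inv(a): return ('inv', a)        # stands for the postfix _-1
E = ('e',)

G1 = (mul(E, 'A'), 'A')              # G.1:  e * A = A

# Rule (5), Instantiation: a substitution instance of an axiom is deducible.
theta = {'A': mul('x', inv('x'))}
first_step = tuple(subst(theta, side) for side in G1)

# Rule (4), Congruence: substituting the provably equal terms e and
# x-1-1 * x-1 for z in t = z * (x * x-1), as in the illustration above.
t = mul('z', mul('x', inv('x')))
lhs = subst({'z': E}, t)
rhs = subst({'z': mul(inv(inv('x')), inv('x'))}, t)
```

Note that the left side of the instantiated G.1 coincides with the left side of the congruence step, which is exactly what lets transitivity chain the two deduced equations together.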
4.2 Equational Proof

The rules of equational deduction are used to prove or deduce a new equation e from a given set A of equations, by repeatedly applying the rules (0–5) to previously deduced equations. We let A ⊢ e mean that e is deducible from A. A proof for the assertion A ⊢ e is a sequence of rule applications that really proves e from A. A fully annotated proof provides all of the information that is used in each step of deduction, including the name of the rule involved, and any substitutions that are used. We formalize this as follows:

Definition 4.2.1
Given a signature Σ and a set A of Σ-equations, a (bare) proof from A is a sequence e₁, . . . , eₙ of Σ-equations where each eᵢ is deducible from A ∪ { e₁, . . . , eᵢ₋₁ } by a single application of a single rule; then we say that e₁, . . . , eₙ is a (bare) proof of eₙ from A. A (fully) annotated proof is a sequence a₁, . . . , aₙ, where each aᵢ has one of the following forms:

(0) ⟨eᵢ, (0)⟩ where eᵢ ∈ A.

(1) ⟨eᵢ, (1)⟩ where eᵢ is of the form (∀X) t = t.

(2) ⟨eᵢ, eⱼ, (2)⟩ where j < i and eᵢ is of the form (∀X) t = t′ and eⱼ is of the form (∀X) t′ = t.

(3) ⟨eᵢ, eⱼ, eₖ, (3)⟩ where j, k < i, and eᵢ is of the form (∀X) t = t″ and eⱼ, eₖ are of the forms (∀X) t = t′, (∀X) t′ = t″ respectively.

(4) ⟨eᵢ, θ, θ′, ϕ, t, (4)⟩ where t ∈ T_Σ(Y), where ϕ : Y → ω, and where θ, θ′ : Y → T_Σ(X) are substitutions such that for each y ∈ Y, each equation (∀X) θ(y) = θ′(y) is some e_{ϕ(y)} where ϕ(y) < i, and eᵢ is of the form (∀X) θ(t) = θ′(t).
Figure 4.1: A Proof Tree

(5) ⟨eᵢ, θ, e, t, t′, (5)⟩ where t, t′ ∈ T_Σ(Y), and where θ : Y → T_Σ(X) is a substitution such that e ∈ A has the form (∀Y) t = t′ and eᵢ is of the form (∀X) θ(t) = θ(t′).

Let A ⊢_Σ e, or usually A ⊢ e when Σ is clear from context, indicate that there is a proof of e from A. We will also use notations like "A ⊢(0–4) e" or "A ⊢(1–3,5) e" to indicate that e is deducible using only the rules (0–4), or (1–3) plus (5), respectively; by default, A ⊢ e will mean A ⊢(0–5) e. Also, the set of all equations deducible from A using rules (0–5) is called the deductive closure or theory of A. □

Notice that if a₁, . . . , aₙ is a fully annotated proof, then the sequence e₁, . . . , eₙ of the first components of the aᵢ is a bare proof.

We now illustrate these concepts by proving that any equation that can be deduced using (0) can also be deduced using (5). First, suppose that (∀X) t = t′ is in A, let Y = X, and define θ : X → T_Σ(X) by θ(x) = x for all x ∈ X. Then (by Exercise 3.5.2) θ(t) = t and θ(t′) = t′, so (5) tells us that (∀X) t = t′ is deducible, as desired. The following is a formal statement of what we have just shown:

Fact 4.2.2
Given a set A of Σ-equations and a Σ-equation e, then e ∈ A implies A ⊢(5) e. □

From this we get the following:
Fact 4.2.3
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(0–5) e iff A ⊢(1–5) e.

Proof:
Let P be a proof of e from A using (0–5). Then for each use of the rule (0) in P, substitute the corresponding use of (5) according to Fact 4.2.2, resulting in a proof P′ of e from A that does not use rule (0). □

Thus we can get by with a set of five rules, instead of six. We will see later on that there are many other rule sets for equational deduction, and that the number of rules can be further reduced. A more abstract formulation of deduction is given in (the optional) Section 4.11.
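Fact 4.2.2 itself can be checked mechanically: applying rule (5) with the identity substitution reproduces the axiom unchanged. The Python sketch below uses our own nested-tuple encoding of terms (not from the text):

```python
def subst(theta, t):
    """Apply a substitution to a term; variables are strings,
    non-variable terms are tuples (op, subterms...)."""
    if isinstance(t, str):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, s) for s in t[1:])

def rule5(axiom, theta):
    """Instantiation: from an axiom (Y, t, t') and a substitution theta,
    deduce the instance (theta(t), theta(t'))."""
    Y, t, t2 = axiom
    return (subst(theta, t), subst(theta, t2))

# An axiom (forall {x}) f(x) = x, and the identity substitution on {x};
# the instance is the axiom itself, so rule (0) is redundant.
axiom = (['x'], ('f', 'x'), 'x')
identity = {x: x for x in axiom[0]}
```

This is exactly the transformation used in the proof of Fact 4.2.3 to eliminate every use of rule (0).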
4.3 Soundness and Counterexamples

This section shows that equational deduction is sound, in the sense that if we can deduce e from A, then e is true in all models of A. To this end, the following lemmas show the soundness of each rule separately:
Lemma 4.3.1 Given any Σ-algebra M such that M ⊨ A, and given e ∈ A, then M ⊨ e.

Proof: This is immediate from the definition of satisfaction. □

Lemma 4.3.2 Given any Σ-algebra M and given t ∈ T_Σ(X), then M ⊨ (∀X) t = t.

Proof: Let a : X → M. Then certainly ā(t) = ā(t). □

Lemma 4.3.3 Given any Σ-algebra M and given t, t′ ∈ T_Σ(X), then M ⊨ (∀X) t = t′ implies M ⊨ (∀X) t′ = t.

Proof: Let a : X → M. Then ā(t) = ā(t′) implies ā(t′) = ā(t). □

Lemma 4.3.4 Given any Σ-algebra M and given t, t′, t″ ∈ T_Σ(X), then M ⊨ (∀X) t = t′ and M ⊨ (∀X) t′ = t″ imply M ⊨ (∀X) t = t″.

Proof: Let a : X → M. Then ā(t) = ā(t′) and ā(t′) = ā(t″) imply ā(t) = ā(t″). □

Lemma 4.3.5 Given any Σ-algebra M, given t ∈ T_Σ(Y), and given θ, θ′ : Y → T_Σ(X) such that M ⊨ (∀X) θ(y) = θ′(y) for each y ∈ Y, then M ⊨ (∀X) θ(t) = θ′(t).

Proof: Let a : X → M. Then ā(θ(y)) = ā(θ′(y)), i.e., (θ ; ā)(y) = (θ′ ; ā)(y) for each y ∈ Y. But now the freeness of T_Σ(Y) implies that (θ ; ā)(t) = (θ′ ; ā)(t), i.e., that ā(θ(t)) = ā(θ′(t)). □

Lemma 4.3.6 Given any Σ-algebra M, given t, t′ ∈ T_Σ(Y) such that M ⊨ (∀Y) t = t′, and given θ : Y → T_Σ(X), then M ⊨ (∀X) θ(t) = θ(t′).

Proof: Let a : X → M. Then θ ; ā : Y → M, and so M ⊨ (∀Y) t = t′ implies (θ ; ā)(t) = (θ ; ā)(t′), i.e., ā(θ(t)) = ā(θ(t′)), i.e., M ⊨ (∀X) θ(t) = θ(t′). □

Exercise 4.3.1
The notation used in the proof above conceals a use of Corollary 3.6.6. Identify the gap and show how to fill it using this result. □
We can now use induction on proof length to show soundness of equational deduction:
Proposition 4.3.7 (Soundness) Given a set A of Σ-equations, a Σ-equation e, and a Σ-algebra M, then M ⊨ A and A ⊢(0–5) e imply M ⊨ e.

Proof:
Let M be a Σ-algebra such that M ⊨ A.

If e has a proof of length 1 from A, then e is derived using exactly one instance of exactly one of the rules (0–5); then Lemmas 4.3.1–4.3.6 show that M ⊨ e for each of these six cases, thus concluding the base of the induction.

For the inductive step, assume that if e has a proof of length n then M ⊨ e, and let e′ have a proof e₁, . . . , eₙ₊₁ of length n + 1. The inductive hypothesis gives us that M ⊨ eᵢ for i = 1, . . . , n, and from this we can conclude that M ⊨ eₙ₊₁ by applying one of Lemmas 4.3.1–4.3.6. □

Suppose we are given a set A of axioms and an equation e, all over the same signature Σ, and we want to prove that e cannot be deduced from A. (For example, we may have put some effort into proving A ⊢ e without success, and now suspect that it is not possible.) The impossibility of giving a proof can be demonstrated by giving a counterexample, which is a Σ-algebra M that satisfies A but does not satisfy e. The proof that counterexamples work only depends on the soundness of deduction, because if A ⊢ e then for any Σ-algebra M, if M ⊨ A then M ⊨ e. Therefore if M ⊨ A but M ⊨ e is false, we cannot have A ⊢ e.

In order to show that M ⊨ e is false, we need only give a single assignment where e fails: if e is (∀X) t = t′, then we need only exhibit θ : X → M such that θ̄(t) ≠ θ̄(t′). Thus, the way to show that an equation cannot be proved is to give an algebra M, an assignment θ into that algebra, and a proof that the assignment has different values on the two terms of the equation. We will use this in the next subsection and elsewhere.

The most common formulations of equation and equational deduction do not involve explicit universal quantifiers for variables. However, we will show that explicit quantifiers are necessary for an adequate treatment of satisfaction. Our demonstration will use the following specification:

th FOO is
  sorts B A .
  ops T F : -> B .
  ops (_∨_) (_&_) : B B -> B .
  op ¬_ : B -> B [prec 2] .
  op foo : A -> B .
  var B : B .
  var A : A .
  eq B ∨ ¬ B = T .
  eq B & ¬ B = F .
  eq B ∨ B = B .
  eq B & B = B .
  eq ¬ F = T .
  eq ¬ T = F .
  eq ¬ foo(A) = foo(A) .
endth

The OBJ3 keyword ops allows two or more operation symbols having the same rank to be declared together; for non-constant operation symbols, parentheses must be used to separate the different operation forms. The notation T, F, ∨, &, ¬, and the first four equations should be familiar from Boolean algebra, and we can think of foo as a kind of "test" on elements of sort A. This example therefore resembles specifications found in many applications, except perhaps for the last equation.

Now consider the Σ_FOO-algebra I with I_A = ∅ and I_B = { T, F }, where T, F are distinct, and where &, ∨, ¬ are interpreted as expected for the booleans (F ∨ F = F, etc.), and where foo is the empty function. (This is actually the initial Σ_FOO-algebra.) It is easy to check that I satisfies the equation (∀x) F = T where x is of sort A, and that I does not satisfy the equation (∀∅) F = T. Since these two equations have different meanings, they cannot be identified, and therefore the quantifier really is necessary.

Example 4.3.8 below will show that with unsorted equational deduction, the unquantified equation F = T can be proved from the equations in FOO, from which, given the above discussion of I, it follows that unsorted equational deduction is in general not sound. This refutes the apparently common misconception that unsorted and many-sorted equational deduction are equivalent; see [78] for a detailed discussion of this issue.

To make our discussion precise, we need an explicit formulation of unsorted equational deduction. Recall that a Σ-equation consists of a ground signature X disjoint from Σ, plus two terms t, t′ ∈ T_Σ(X)_s for some sort s; that is, a Σ-equation is a triple ⟨X, t, t′⟩, by convention written (∀X) t = t′.
By contrast, equations in unsorted equational logic do not have explicit quantifiers; they are just pairs ⟨t, t′⟩, conventionally written in the form t = t′. The unsorted rules of deduction are exactly the same as the many-sorted rules (1–5) of Definition 4.1.3 except that all quantifiers (e.g., (∀X) and (∀Y)) are omitted.
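The role of the empty carrier is easy to check mechanically. The Python sketch below is our own encoding (not from the text): satisfaction is decided by enumerating assignments, so a variable whose sort has an empty carrier makes an equation hold vacuously, exactly as for the algebra I above:

```python
from itertools import product

def ev(t, ops, env):
    """Value of a term: variables are strings, other terms tuples (op, args...)."""
    if isinstance(t, str):
        return env[t]
    return ops[t[0]](*[ev(a, ops, env) for a in t[1:]])

def satisfies(carriers, ops, eq):
    """M |= (forall X) t = t': t and t' agree under every assignment of
    carrier elements to the variables in X (a dict from variables to sorts)."""
    X, t, t2 = eq
    domains = [carriers[s] for s in X.values()]
    return all(ev(t, ops, dict(zip(X, vals))) == ev(t2, ops, dict(zip(X, vals)))
               for vals in product(*domains))

# The algebra I: I_A empty, I_B = {T, F}; only the constants matter here.
carriers = {'A': [], 'B': ['T', 'F']}
ops = {'T': lambda: 'T', 'F': lambda: 'F'}

quantified = ({'x': 'A'}, ('F',), ('T',))   # (forall x) F = T, x of sort A
unquantified = ({}, ('F',), ('T',))         # (forall 0) F = T
```

Since there is no assignment of sort A at all, the quantified equation is satisfied, while the unquantified one is evaluated once and fails; the two equations really do have different meanings.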
Example 4.3.8 (An Unsound Deduction) We will show that unsorted equational deduction can prove an equation that is untrue in some models of the specification FOO above. We apply the unsorted versions of the rules (1–5), letting F.1, ..., F.7 denote the equations in FOO in the order of their appearance, and letting x be a new variable symbol:

      ¬foo(x) = foo(x)                     (5) on F.7
  [0] foo(x) = ¬foo(x)                     (2)
      foo(x) ∨ ¬foo(x) = T                 (5) on F.1
      foo(x) ∨ foo(x) = foo(x) ∨ ¬foo(x)   (4) on [0] with t = foo(x) ∨ z
      foo(x) ∨ foo(x) = T                  (3)
      foo(x) ∨ foo(x) = foo(x)             (5) on F.3
      foo(x) = foo(x) ∨ foo(x)             (2)
  [1] foo(x) = T                           (3)
      foo(x) & foo(x) = foo(x)             (5) on F.4
      foo(x) = foo(x) & foo(x)             (2)
      foo(x) & foo(x) = foo(x) & ¬foo(x)   (4) on [0] with t = foo(x) & z
  [2] foo(x) = foo(x) & ¬foo(x)            (3)
      foo(x) & ¬foo(x) = F                 (5) on F.2
      foo(x) = F                           (3) on [2]
      F = foo(x)                           (2)
  [3] F = T                                (3) on [1]

The algebra I is a counterexample to the equation F = T that was proved above. But since the proof really does use the unsorted rules of deduction correctly, we must conclude that these rules are not sound for this many-sorted algebra. It should however be noted that the unsorted rules of deduction are sound and complete for the classical case (studied by Birkhoff and others) where only unsorted (i.e., one-sorted) algebras are used as models. □

We will see later that by adding quantifiers to the proof, we get a proof of (∀x) F = T, and we will also see that this does not mean that F = T is satisfied by all models of FOO. The counterexample is only possible because I_A = ∅, and indeed, it can be shown that F = T does hold in every model of FOO that has all of its carriers non-empty. Moreover, it can be shown that unsorted equational deduction is sound if restricted to models that have all their carriers non-empty. Hence, it might seem that the way out is just to restrict signatures so that no carrier can possibly be empty; for example, this approach is advocated in [109]. But such a restriction would exclude many important examples, such as the theory of partially ordered sets. Another possible way out (and this is the approach of classical logic) is simply to require that all models have all their carriers non-empty.
However, we do not want to abandon the possibility of empty carriers, because then not all specifications would have initial models, as demonstrated by the above example, and many others. It therefore follows that we cannot use the unsorted rules of deduction with their unsorted notation for equations, and instead must use a version of many-sorted equational deduction in which equations have explicit quantifiers.
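The effect of the quantifier can be checked mechanically. The following is a small Python sketch (ours, not part of the text; the representation is illustrative) of satisfaction in the algebra I above: an equation quantified over a variable of a sort with empty carrier holds vacuously, while the same equation with empty quantifier fails.

```python
from itertools import product

# A finite model mirroring the initial FOO-algebra I of the text:
# the carrier of sort A is empty, the carrier of sort B is {"T", "F"}.
I = {"A": [], "B": ["T", "F"]}

def satisfies(model, quantified_vars, lhs, rhs):
    """Check (forall quantified_vars) lhs = rhs in `model`.

    quantified_vars maps variable names to sorts; lhs/rhs are functions
    from an assignment (dict) to a carrier element.  Satisfaction means
    lhs and rhs agree under EVERY assignment of the variables.
    """
    names = list(quantified_vars)
    carriers = [model[quantified_vars[n]] for n in names]
    for values in product(*carriers):
        a = dict(zip(names, values))
        if lhs(a) != rhs(a):
            return False
    return True  # vacuously true if some quantified carrier is empty

# (forall x : A) F = T: there are no assignments for x, so it holds vacuously.
vac = satisfies(I, {"x": "A"}, lambda a: "F", lambda a: "T")

# (forall {}) F = T: exactly one (empty) assignment, under which F != T.
ground = satisfies(I, {}, lambda a: "F", lambda a: "T")

print(vac, ground)  # True False
```

The two calls differ only in the quantifier, exactly as the two equations above differ; this is why the quantifier cannot be dropped from the notation.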
Completeness

The main result about equational deduction is that it is complete for loose semantics. The following extensions of the notation for satisfaction enable us to state this in a simple way:
Definition 4.4.1
Let A and A′ be sets of Σ-equations, and let e be a Σ-equation. Then we write A ⊨_Σ e iff for all Σ-algebras M, M ⊨_Σ A implies M ⊨_Σ e, that is, iff every Σ-algebra that satisfies A also satisfies e. Also, we write A ⊨_Σ A′ iff A ⊨_Σ e′ for all e′ ∈ A′. Similarly, we write A ⊢_Σ A′ iff A ⊢_Σ e′ for all e′ ∈ A′. □

Now the main result:
Theorem 4.4.2 (Completeness) Given a signature Σ and a set A of Σ-equations, then for any Σ-equation e, A ⊢ e iff A ⊨ e. □

One direction of this equivalence is the soundness of the rules, which has already been proved (Proposition 4.3.7); it says that anything that can be proved by equational deduction really is true of all models. The other direction, which is completeness in the narrow sense, is much more difficult, and is proved in Appendix B (actually, the more general case of conditional order-sorted equations is proved there). Theorem 4.4.2 is very comforting, because it says every equation e that is true in all models of A can be deduced using our rules. For example, we can conclude that every equation that is true of all groups can be proved from the group axioms.

We will soon see that there are other rule sets that can make proofs much easier than they are with (1–5). In fact, the particular rules (1–5) were chosen because each rule is relatively simple and intuitive, and because this formulation facilitates proving the completeness theorem.

The following slightly more general formulation of completeness follows from Theorem 4.4.2 and Definition 4.4.1:

Corollary 4.4.3
Let A and A′ be sets of Σ-equations. Then A ⊢ A′ iff A ⊨ A′. □

Before leaving this section, we show transitivity for the extended notion of satisfaction given in Definition 4.4.1:
Fact 4.4.4
Let
A, A′, A′′ be sets of Σ-equations. Then A ⊨ A′ and A′ ⊨ A′′ imply A ⊨ A′′.

Proof:
We are assuming that M ⊨ A implies M ⊨ A′ and that M ⊨ A′ implies M ⊨ A′′. Therefore, by transitivity of implication, M ⊨ A implies M ⊨ A′′. □

Subterm Replacement

A specialized rule of inference using subterm replacement is the basis for term rewriting, a powerful technique for mechanical inference that is discussed in the next chapter. We will develop this rule gradually, starting with a special case of rule (4) in which only one variable is substituted for.

Suppose (using the notation of Definition 2.2.1) that X = Y ∪ {z}_s where z ∉ Y, and that θ, θ′ : X → T_Σ(Y) are substitutions such that θ(y) = θ′(y) = y for all y ∈ Y and such that the equation (∀Y) θ(z) = θ′(z) is deducible. Since (∀Y) y = y is deducible for all y ∈ Y, rule (4) implies that, for any t ∈ T_Σ(Y ∪ {z}_s), (∀Y) t(z ← t₁) = t(z ← t₂) is also deducible, where t₁ = θ(z) and t₂ = θ′(z), noting that t₁, t₂ have the same sort s as z. Therefore the following rule is sound, because we have shown that it is a special case of (4):

(4₁) One Variable Congruence. Given t ∈ T_Σ(Y ∪ {z}) where z ∉ Y, if (∀Y) t₁ = t₂ is of sort s and is deducible, then (∀Y) t(z ← t₁) = t(z ← t₂) is also deducible.

Example 4.5.1
Let us use the specification
FOO of Example 4.3.8. Consider the equation (∀x) foo(x) ∨ ¬foo(x) = T, which is shown deducible in Example 4.3.8. Now let t = foo(x) ∨ z. Then rule (4₁) gives us that (∀x) foo(x) ∨ (foo(x) ∨ ¬foo(x)) = foo(x) ∨ T is also deducible. □

We can get the effect of (4) by repeated applications of (4₁) (i.e., the formal proof is by induction on the number of variables in X, using the transitivity of equality). Notice that in (4₁), t₁ (respectively, t₂) is substituted for all occurrences of z in t; there may be many such occurrences, or none. We will see later that OBJ3 implements the case where there is exactly one occurrence of z.

Proposition 4.5.2
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(3,4) e iff A ⊢(3,4₁) e. □

That is, (4) and (4₁) are interchangeable so long as (3) is present. This gives the following:

Corollary 4.5.3
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(1–5) e iff A ⊢(1–3,4₁,5) e. □

And of course, both rule sets are complete, by Theorem 4.4.2. Our next step is to combine (4₁) and (5) into the following rule:

(6) Forward Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if (∀Y) t₁ = t₂ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.
Exercise 4.5.1
Show that rule (0) is a special case of rule (6). □
Exercise 4.5.2
Show that if A ≠ ∅ then rule (1) is a special case of rule (6). □

Exercise 4.5.3
Show that (5) is a special case of (6), but (4₁) is not. □

The following symmetrical variant of (6) is just as useful:

(–6)
Reverse Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if (∀Y) t₁ = t₂ is of sort s and is in A, then (∀X) t(z ← θ(t₂)) = t(z ← θ(t₁)) is also deducible.

The soundness of (–6) follows from that of (6), by applying (6) and then using the symmetry rule (2) on the resulting equation. For clarity and emphasis, we may write (+6) instead of (6). We now combine (+6) and (–6) into a single rule, as follows:

(±6) Bidirectional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t₁ = t₂ or (∀Y) t₂ = t₁ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

This rule is sound because it is the disjunction of two sound rules. It includes (5) and basic cases of (2) and (4). In fact, the following can be shown (see Appendix B):

Theorem 4.5.4
For any set A of Σ-equations and any (unconditional) Σ-equation e, A ⊢ e iff A ⊢(1,3,±6) e. □

This result says that ⊢ is the reflexive and transitive closure of ⊢(±6); consequently, we might write (±6*) instead of (1,3,±6). It is equivalent to take the reflexive, symmetric and transitive closure of (6), which justifies writing (≡6). Based on this, we could get a single rule of deduction based on (6) that is complete all by itself. However, this rule would be rather complex, and we do not give it here.

Theorem 4.5.4 has the important consequence that the reflexive, transitive closure of (±6) is complete, by Theorem 4.4.2. (The "basic cases" mentioned above are those where the deduced equation in the premise is actually in A; Appendix C contains a brief review of closure concepts.)
Given a set A of Σ-equations, show that for any Σ-equation e, A ⊢ e iff A ⊢(1,2,3,6) e. Hint: Show A ⊢(±6) e iff A ⊢(2,6) e. □

Example 4.5.5 (Groups) Now let us use this to prove the right inverse law (∀x) x ∗ x⁻¹ = e for the specification GROUPL. By the Theorem of Constants (Theorem 3.3.11), it suffices to introduce a new constant a and then prove the equation (∀∅) a ∗ a⁻¹ = e. Let GL.1, GL.2, GL.3 denote the three equations in
GROUPL. Then:

  [1] a ∗ a⁻¹ = e ∗ (a ∗ a⁻¹)                  (–6) on GL.1
  [2]         = (a⁻¹⁻¹ ∗ a⁻¹) ∗ (a ∗ a⁻¹)      (–6) on GL.2 with A = a⁻¹
  [3]         = ((a⁻¹⁻¹ ∗ a⁻¹) ∗ a) ∗ a⁻¹      (6) on GL.3
  [4]         = (a⁻¹⁻¹ ∗ (a⁻¹ ∗ a)) ∗ a⁻¹      (–6) on GL.3
  [5]         = (a⁻¹⁻¹ ∗ e) ∗ a⁻¹              (6) on GL.2
  [6]         = a⁻¹⁻¹ ∗ (e ∗ a⁻¹)              (–6) on GL.3
  [7]         = a⁻¹⁻¹ ∗ a⁻¹                    (6) on GL.1
  [8]         = e                              (6) on GL.2

This proof is much simpler than that given in Section 4.1. Also, notice that each step builds on the one before it, which makes the proof much easier to understand. □
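Since each step of this chain is an instance of (+6) or (–6) on a group axiom, soundness says that every model of GROUPL must make all nine terms equal. As a quick sanity check (ours, not part of the text), the following Python sketch evaluates the whole chain in one concrete group, the integers mod 5 under addition, with e = 0, inverse given by negation, and ∗ given by +:

```python
# Spot-check the Example 4.5.5 proof chain in the group Z/5 under addition.
# This only checks soundness in one model; it does not replace the formal
# deduction.  All naming is ours.
N = 5
e = 0
mul = lambda x, y: (x + y) % N   # interpretation of _*_
inv = lambda x: (-x) % N         # interpretation of _-1

for a in range(N):
    chain = [
        mul(a, inv(a)),                                 # a * a-1
        mul(e, mul(a, inv(a))),                         # step [1]
        mul(mul(inv(inv(a)), inv(a)), mul(a, inv(a))),  # step [2]
        mul(mul(mul(inv(inv(a)), inv(a)), a), inv(a)),  # step [3]
        mul(mul(inv(inv(a)), mul(inv(a), a)), inv(a)),  # step [4]
        mul(mul(inv(inv(a)), e), inv(a)),               # step [5]
        mul(inv(inv(a)), mul(e, inv(a))),               # step [6]
        mul(inv(inv(a)), inv(a)),                       # step [7]
        e,                                              # step [8]
    ]
    assert len(set(chain)) == 1, (a, chain)
print("all steps agree in Z/5")
```

Of course, agreement in one model proves nothing by itself; it is the deduction above, plus soundness, that guarantees agreement in every model.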
The rules (+6), (–6) and (±6) can all be specialized to the case where t has exactly one occurrence of z, which corresponds to what OBJ3 implements. In particular, the specialized form of (±6) is the following rule:

(±6₁) Bidirectional One Occurrence Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with exactly one occurrence of z, where z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t₁ = t₂ or (∀Y) t₂ = t₁ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

It is a bit tricky to formalize the concept that t ∈ T_Σ(X ∪ {z}_s) has exactly one occurrence of z. One way is to use the initial algebra approach of Section 3.2. Letting Σ′ = Σ(X ∪ {z}_s), we define a Σ′-homomorphism #z : T_Σ′ → ω which counts the number of occurrences of z in terms, by giving ω a Σ′-structure, using the convention that ω denotes an S-sorted set of copies of the natural numbers, where S is the sort set of Σ: if σ ∈ Σ has arity n > 0, define σ on ω by σ(i₁, ..., iₙ) = i₁ + ··· + iₙ; if σ is a constant in Σ, define σ on ω to be 0; and finally, define x ∈ X to be 0, and z to be 1, in ω.

Now we can state the main result of this section:

Theorem 4.5.6
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(1–5) e iff A ⊢(1,3,±6₁) e.

Proof: We have already shown the soundness of rules (1), (3) and (±6₁). Therefore A ⊢(1,3,±6₁) e implies A ⊨ e, and then the completeness of ⊢ gives us that A ⊢ e. For the converse, by Theorem 4.5.4 it suffices to show that we can derive the rule (±6) from rules (1), (3), and (±6₁). This can be done by using induction and rule (3) for the two cases (6) and (–6) separately. □

Because of Theorem 4.4.2, this result implies that any equation that is true of all groups can be proved using just (±6₁) plus transitivity and reflexivity.

Exercise 4.5.5
Show that if we weaken (6₁) to "at most one occurrence," then (1) is a special case of this weaker rule. □
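The occurrence-counting homomorphism #z described above (z counts 1, every other leaf counts 0, and an application sums the counts of its arguments) is easy to realize concretely. Here is a Python rendering of ours, with terms represented as nested tuples; the representation and names are illustrative, not OBJ3's:

```python
# Terms: a string is a variable or constant; a tuple (op, arg1, ..., argn)
# is an operation applied to arguments.  occ(t, z) mirrors the
# homomorphism #z of the text.

def occ(t, z):
    """Count the occurrences of variable z in term t."""
    if isinstance(t, str):
        return 1 if t == z else 0   # z counts 1, other leaves count 0
    _op, *args = t
    return sum(occ(a, z) for a in args)  # applications sum their arguments

t1 = ("or", ("foo", "x"), "z")       # foo(x) v z : one occurrence
t2 = ("and", "z", ("not", "z"))      # z & -z     : two occurrences
t3 = ("foo", "x")                    # foo(x)     : no occurrence

print(occ(t1, "z"), occ(t2, "z"), occ(t3, "z"))  # 1 2 0
```

The side condition of (6₁)-style rules is then simply occ(t, "z") == 1.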
Corollary 4.5.7
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢ e iff A ⊢(1,2,3,6₁) e.

Proof: This follows from Theorem 4.5.6, since (2) and (6₁) are together equivalent to (±6₁). □

Completeness of ⊢(1,2,3,6₁) means that every equation valid for A can be proved from A without ever having to apply a rule backwards. This perhaps surprising result motivates and justifies term rewriting, a computational method based on ⊢(1,3,6₁), which is the topic of the next chapter.

(★) An Alternative Congruence Rule
This section shows that an apparently weaker congruence rule is in fact equivalent to the original formulation (4) on page 58. The new rule is:

(4′) Congruence. Given σ ∈ Σ_{s1...sn,s} and given deducible equations (∀X) t_i = t_i′ of sort s_i for i = 1, ..., n, then (∀X) σ(t₁, ..., tₙ) = σ(t₁′, ..., tₙ′) is also deducible.

Proposition 4.5.8 Given a set A of Σ-equations, then for any Σ-equation e,

  A ⊢(1–5) e iff A ⊢(1–3,4′,5) e.
  A ⊢(4) e iff A ⊢(4′) e.

Proof: Since the first assertion follows from the second by induction on the length of proofs, it suffices to prove the second assertion.

If A ⊢(4′) e, then A ⊢(4) e, because e necessarily has the form (∀X) σ(t₁, ..., tₙ) = σ(t₁′, ..., tₙ′), and in rule (4) we can take t = σ(y₁, ..., yₙ) with θ(y_i) = t_i and θ′(y_i) = t_i′ for i = 1, ..., n.

For the converse, assume A ⊢(4) e. We use structural induction (see Section 3.2.1) on the form of t to show that A ⊢(4′) e. For the base, if t ∈ Y, then (∀X) θ(t) = θ′(t) is deducible by hypothesis. Now suppose that t = σ(t₁, ..., tₙ) and that (∀X) θ(t_i) = θ′(t_i) is deducible for i = 1, ..., n. Then (4′) gives us that (∀X) σ(θ(t₁), ..., θ(tₙ)) = σ(θ′(t₁), ..., θ′(tₙ)) is deducible, i.e., that (∀X) θ(t) = θ′(t) is deducible. □

Notice that this result implies that ⊢(1–3,4′,5) is complete.

Discussion

We have given many rules of deduction for many-sorted equational logic, and shown that various subsets are complete for loose semantics. The first variant, consisting of the rules (1–5) in Definition 4.1.3, is not very convenient for calculation, but each rule is relatively intuitive, and this system is convenient for proving the completeness theorem. The final variant, consisting of rules (1), (3) and (±6₁), is much more convenient for calculation, although the rule (±6₁) may seem somewhat complex at first. Some intermediate variants helped to bridge the gap between these two.

Deduction using OBJ

OBJ3 not only supports writing theories (such as that of groups), but also deducing new equations from theories, by applying subterm replacement. This section introduces some features of OBJ3 that are useful for such proofs, through an example proving the right inverse law for the following left-handed theory
GROUPL for groups (it is the same as the theory GROUPL in Example 4.1.4 on page 59):

th GROUPL is sort Elt .
  op _*_ : Elt Elt -> Elt .
  op e : -> Elt .
  op _-1 : Elt -> Elt [prec 2] .
  var A B C : Elt .
  eq e * A = A .
  eq A -1 * A = e .
  eq A *(B * C) = (A * B)* C .
endth
OBJ3 automatically assigns numbers to equations. If there are no other equations in the current environment, then the above equations will be numbered, starting from 1, in the order that they occur, and can be referred to as "GROUPL.1", "GROUPL.2" and "GROUPL.3", or more compactly, as ".1", ".2" and ".3", provided that GROUPL is the module currently in focus.

You can also give your own name to an equation, by placing that name in square brackets in front of the equation. For example, if you had written

  [e] eq e * A = A .
  [i] eq A -1 * A = e .
  [a] eq A *(B * C) = (A * B)* C .

in GROUPL, then you could refer to these equations with the names "GROUPL.e", "GROUPL.i" and "GROUPL.a", or more compactly, with ".e", ".i" and ".a". When there are multiple modules around, this can be much more convenient, because introducing new modules can cause the numbers of old equations to change, whereas the user-assigned names will not change.

As in Example 4.5.5, the proof given below exploits the Theorem of Constants to get rid of a quantifier, and instead reason with a new constant. Assuming that GROUPL has just been read into OBJ3, the command "open ." permits us to begin working within the module
GROUPL, and "op a : -> Elt" temporarily adds a new constant symbol "a" of sort Elt, so that we can form terms that involve this symbol (it represents the universally quantified variable). The command "start a * a -1 ." declares an initial term to which subterm replacement can be applied, yielding a series of equal new terms.

The lines beginning with "***>" are comments, in this case used to say what term we expect the command above it to produce. Each apply command applies the rule (6₁) or (–6₁) to the term produced by the command above it. In each case, an equation in GROUPL is mentioned, either in the form .n or in the form -.n, depending on whether (6₁) or (–6₁) is to be used. A substitution θ is indicated in the form "with A = a -1", and "at term" indicates that t = z in rule (±6₁). Other subterms (i.e., other, non-trivial choices of t) can be selected using so-called "occurrence notation." For example, the left subterm of (a ∗ b) ∗ (c ∗ d) is selected with (1), and the right subterm with (2); moreover, b is selected with (1 2), c with (2 1), and d with (2 2). Finally, "close" exits this special mode of OBJ3, forgetting any operations and equations that may have been added. Further details about apply appear in Section 7.2.2.

open .
  op a : -> Elt .
  start a * a -1 .
  apply -.1 at term .
  ***> should be: e * (a * a -1)
  apply -.2 with A = (a -1) at (1) .
  ***> should be: (a -1 -1 * a -1) * (a * a -1)
  apply .3 at term .
  ***> should be: ((a -1 -1 * a -1)* a)* a -1
  apply -.3 at (1) .
  ***> should be: (a -1 -1 * (a -1 * a)) * a -1
  apply .2 at (1 2) .
  ***> should be: (a -1 -1 * e) * a -1
  apply -.3 at term .
  ***> should be: a -1 -1 * (e * a -1)
  apply .1 at (2) .
  ***> should be: a -1 -1 * a -1
  apply .2 at term .
  ***> should be: e
close

Exercise 4.6.1
Try this yourself. □
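The occurrence notation used by apply is just a path of argument positions into the term. The following Python sketch (ours; the nested-tuple representation is illustrative) makes the lookup precise, using 1-based positions as OBJ3 does:

```python
# select(t, pos) returns the subterm of t at the 1-based position path pos,
# mirroring OBJ3's occurrence notation, e.g. (2 1) means "first argument
# of the second argument".  Terms are tuples (op, arg1, ..., argn).

def select(t, pos):
    for i in pos:
        t = t[i]   # index 0 holds the operator, so argument i is t[i]
    return t

# (a * b) * (c * d) as a nested tuple:
t = ("*", ("*", "a", "b"), ("*", "c", "d"))

print(select(t, ()))      # the whole term
print(select(t, (1,)))    # ('*', 'a', 'b'), the left subterm
print(select(t, (1, 2)))  # 'b'
print(select(t, (2, 1)))  # 'c'
print(select(t, (2, 2)))  # 'd'
```

The 1-based argument positions line up with tuple indices here only because index 0 is reserved for the operation symbol.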
In conjunction with Theorem 4.5.6, the completeness theorem implies that every equation that is true in the theory of groups can be proved using OBJ3 in the style of Example 4.5.5; therefore, we know that the proofs requested in Exercises 4.6.2–4.6.4 are possible (provided the equations are true).

The following additional features of OBJ3 are also useful in doing such proofs: To add an equation that you have just proved, while still inside an open...close environment, you can just type (for example)

  [ri] eq A * A -1 = e .

and then use this equation in proving another. You can also add more variables, e.g.,

  vars C D : Elt .

And of course you can add more operations. At any time, you can see what the current environment contains just by typing show . Note that open must be followed by a period, while close must not be followed by a period!
You may be surprised to see that a version of the Booleans has been automatically included; it will not be included if before the specification you type

  set include BOOL off .
The command "show rules ." causes all the equations currently known to OBJ to be printed together with their numbers and names (if any). In examples that involve importing other modules, it can be hard to predict what ordering OBJ will give to the rules. Therefore, it is important to check what order they actually have, if you want to apply them by their numbers. (A potentially confusing point is that in the terminology of OBJ, "equations" are also called "rules," because they are applied as rewrite rules in OBJ computations; thus, one must be careful to distinguish whether a given instance of the word "rule" means "rule of deduction" in talk about the theory of OBJ, or "rewrite rule" in talk about computations in OBJ.)
Exercise 4.6.2 Prove the right identity law for the specification GROUPL. □

Exercise 4.6.3 Prove the left identity law for the specification GROUP. □

Exercise 4.6.4 Prove the left inverse law for the specification GROUP. □

Exercise 4.6.5 Use OBJ3 to prove that T = F for the specification FOO of Example 4.3.8. What equation have you really proved? Does it follow that T = F in every FOO-algebra? Why? □
Two More Rules of Deduction

This section gives two more rules of deduction for equational logic; they throw an interesting light on cases like that of Example 4.3.8.

(7) Abstraction. If (∀X) t = t′ is deducible from A, and if Y is a ground signature disjoint from X, then (∀X ∪ Y) t = t′ is also deducible from A.

(This rule also applies when X = ∅, where there are originally no variables and some are added.)

For the next rule, we need a preliminary concept: let us say that a sort s ∈ S is void in a signature Σ iff (T_Σ)_s = ∅.

(8) Concretion. If (∀X ∪ Y) t = t′ is deducible from A, if no sort of a variable in Y is void in Σ, and if t, t′ ∈ T_Σ(X), then (∀X) t = t′ is also deducible from A.

Exercise 4.7.1
Show the soundness of rule (7). □
Fact 4.7.1
Rule (8) is sound.

Proof: We have to show that if M ⊨ (∀X ∪ Y) t = t′, then M ⊨ (∀X) t = t′; i.e., that if a(t) = a(t′) for all assignments a : X ∪ Y → M, then b(t) = b(t′) for all b : X → M. This will follow if we can extend any b : X → M to some a : X ∪ Y → M. But we can always pick an arbitrary element m_y ∈ M_s for each y ∈ Y_s, and then set a(y) = m_y, unless there are some s ∈ S and y ∈ Y_s such that M_s = ∅. However, the non-voidness of each s ∈ S such that there is some y ∈ Y_s guarantees that this cannot happen. □

From the proof, we see that it is unsound to remove a quantifier over a void sort, because there really can exist models where the carrier of that sort is void. For example, in Example 4.3.8, we are unable to apply the concretion rule to remove the variable in the equation (∀x) F = T, because the sort A is void. We have seen that the resulting equation F = T is not satisfied by the model I, although the quantified version is satisfied.

By contrast, the abstraction rule (7) is sound if some, or even all, sorts in Y are void.

Conditional Deduction and its Completeness

We can deduce unconditional equations from conditional equations using rules of deduction very similar to those in Definition 4.1.3, except that rule (5) must be modified to account for conditional equations in A. (Recall that conditional equations and their satisfaction have already been defined in Section 3.4.) We will see later that the resulting rule set is complete. Here is the modified rule:

(5C) Conditional Instantiation. If (∀Y) t = t′ if C is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) θ(t) = θ(t′) is deducible.
We will write concrete instances of conditional rules in forms like

  (∀x, y, z, w) x + z = y + w if x = y, z = w,

separating pairs in the condition by commas, and using the equality sign. We now show that the rule (5C) is sound:

Lemma 4.8.1
Given a Σ-algebra M satisfying A and a substitution θ : Y → T_Σ(X), if M ⊨ (∀X) θ(u) = θ(v) for all ⟨u, v⟩ ∈ C, then also M ⊨ (∀X) θ(t) = θ(t′).

Proof: Let a : X → M and assume that M ⊨ (∀X) θ(u) = θ(v) for each ⟨u, v⟩ ∈ C. Then (θ ; a)(u) = (θ ; a)(v), and so by the definition of conditional satisfaction, (θ ; a)(t) = (θ ; a)(t′), i.e., a(θ(t)) = a(θ(t′)), by Corollary 3.6.6. □

The proof above shows that the rule (5C) is sound even if C is infinite; but of course, in that case we could never write a finite proof score using the rule. Here are some examples of the use of rule (5C):

Example 4.8.2
In the context of a specification for the natural numbers with a Boolean-valued inequality function >, consider the conditional equation

  (∀x, y, z) x = y if z ∗ x = z ∗ y, z > 0 = true,

and suppose that at some point we have deduced that 5 ∗ a = 5 ∗ b. Then we can use the above conditional equation and rule (5C) to deduce that a = b, since 5 > 0 = true.

In the context of the same specification, now consider the equation

  (∀x, y, z) x > z = true if x > y = true, y > z = true.

Then if at some point we have deduced that a + b > a + c = true and a + c > d = true, then we can use rule (5C) and the above to deduce that a + b > d = true. □

Let us write A ⊢C_Σ e if e is deducible from A using the rules (1, 2, 3, 4, 5C); also as usual, let us omit the subscript Σ and the superscript C if they are clear from context. As with Proposition 4.3.7 in Section 4.3, it now follows by induction on the length of derivations that ⊢C is sound. As with deduction for unconditional equations, we have a completeness theorem:

Theorem 4.8.3 (Completeness) Given a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢C e iff A ⊨ e. □

The proof is given in Appendix B. (Actually, Appendix B proves the more general Theorem 10.3.2 for the case where the equations in A may be order-sorted.)

The following result gives the most generally useful approach to proving a conditional equation from a given set of (possibly conditional) equations:

Theorem 4.8.4
Given a set A of (possibly conditional) Σ-equations and a conditional Σ-equation (∀X) t = t′ if C, let A′ = A ∪ { (∀∅) u = v | ⟨u, v⟩ ∈ C }. Then A ⊨_Σ (∀X) t = t′ if C iff A′ ⊢C_{Σ(X)} (∀∅) t = t′.

Proof: By letting C′ = { (∀∅) u = v | ⟨u, v⟩ ∈ C } and using Proposition 3.4.3, A ⊨_Σ (∀X) t = t′ if C is equivalent to A ∪ C′ ⊨_{Σ(X)} (∀∅) t = t′, which by Theorem 4.8.3 is in turn equivalent to A ∪ C′ ⊢C_{Σ(X)} (∀∅) t = t′, as desired. □

This result, although not very difficult, gives an important completeness theorem for conditional equations, since it says that any conditional equation e that is satisfied by all models of A can be proved by equational deduction from A plus the conditions of e. In practice, it is often possible to do the proof using just (conditional) rewriting.

Exercise 4.8.1
Show that the result of modifying Theorem 4.8.4 by defining A′ to be A ∪ { (∀X) u = v | ⟨u, v⟩ ∈ C }, and then asserting A ⊨_Σ (∀X) t = t′ if C iff A′ ⊢C_Σ (∀X) t = t′, is false. □

Conditional equations in OBJ are a special case of conditional equations as defined above: there is only one pair ⟨u, v⟩ in the set C of conditions, and it must have sort Bool with v = true. As a result, conditional equations in OBJ3 can have the simplified syntactic form

  eq t = t′ if u .

where u is a term of sort Bool. However, this is not really much of a restriction, because equalities (as well as inequalities) can (usually) be expressed as Boolean terms, and u can also be a conjunction of conditions.

For example, the transitive law for a relation > viewed as a Boolean-valued function takes the following form:

  eq X > Z = true if X > Y and Y > Z .
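Rule (5C) can be mimicked by checking the instantiated conditions against already-deduced facts before asserting the instantiated conclusion. The following is a minimal Python sketch (our own modeling, not OBJ3's machinery), using the transitivity law above; the term representation and all names are illustrative:

```python
# deduced: ground equations already known, as (lhs, rhs) pairs.
# A conditional rule is given by its conditions and conclusion, with
# variables as strings; theta is the substitution of rule (5C).

def apply_5C(theta, conditions, conclusion, deduced):
    """Return the instantiated conclusion, or None if a condition fails."""
    def inst(term):  # apply theta to a term (nested tuples of strings)
        return tuple(inst(x) for x in term) if isinstance(term, tuple) \
               else theta.get(term, term)
    for u, v in conditions:
        if (inst(u), inst(v)) not in deduced:
            return None          # condition not deducible: rule blocked
    u, v = conclusion
    return (inst(u), inst(v))

# x > z = true if x > y = true, y > z = true
trans_conds = [((">", "x", "y"), "true"), ((">", "y", "z"), "true")]
trans_concl = ((">", "x", "z"), "true")

# Already deduced: a+b > a+c = true and a+c > d = true.
deduced = {((">", "a+b", "a+c"), "true"), ((">", "a+c", "d"), "true")}
theta = {"x": "a+b", "y": "a+c", "z": "d"}

print(apply_5C(theta, trans_conds, trans_concl, deduced))
# (('>', 'a+b', 'd'), 'true'), i.e. a+b > d = true
```

This mirrors the second part of Example 4.8.2: both instantiated conditions are found among the deduced facts, so the instantiated conclusion a + b > d = true may be asserted.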
The OBJ3 commands for explicitly applying conditional equations have exactly the same syntax as for unconditional equations. However, the rewrite will not actually be done unless there is not only a match of the leftside, but OBJ3 is also able to reduce the condition to true. There are two modes within which this reduction might be accomplished:

1. If reduce conditions is off (which is the default) then the focus for application shifts to the condition, and the user can explicitly invoke apply to try to prove that the condition equals true.

2. If reduce conditions is set on (which must be done explicitly, using the set command) then OBJ3 will just compute the normal form of the condition, and apply the equation iff that form is true.

If a conditional equation is applied when the focus is on the condition of a previously applied equation, then the focus shifts to the condition of the latest equation; these foci can be nested arbitrarily deeply, and a given focus is abandoned in favour of the previous one iff the proof that it is true has been completed. OBJ3 does all this automatically when reduce conditions is on, as can be seen by setting trace on.

Exercise 4.8.2
Recall that a function f : A → B is injective iff it satisfies the conditional equation (∀x, y) x = y if f(x) = f(y), and has a right inverse iff there is a function g : B → A such that g ; f = 1_B, i.e., such that (∀x) f(g(x)) = x is satisfied. Now do the following:

(a) Write an OBJ theory which expresses the two assumptions above.

(b) Write an OBJ proof score for showing that (∀y) g(f(y)) = y holds under these assumptions.

(c) Explain why this proof score proves the equation in (b).

Note that this proves Exercise 3.1.8 in Section 3.1 for the unsorted case. □
Given the following code

obj INT is sort Int .
  ops (inc_)(dec_) : Int -> Int .
  op 0 : -> Int .
  vars X Y : Int .
  eq inc dec X = X .
  eq dec inc X = X .
  op _+_ : Int Int -> Int .
  eq 0 + Y = Y .
  eq (inc X)+ Y = inc(X + Y).
  eq (dec X)+ Y = dec(X + Y).
endo

give an OBJ proof score for the conditional equation (∀x, y) x = y if inc x = inc y, and justify it. □

Conditional Subterm Replacement

This section develops subterm replacement for the case of conditional equations. The basic rule, generalizing rule (6), is as follows:

(6C)
Forward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if

  (∀Y) t₁ = t₂ if C

is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

Exercise 4.9.1
Show that rule (6C) is sound. □

Exercise 4.9.2 Show that rule (0) is a special case of (6C). □

Exercise 4.9.3 Show that if A ≠ ∅ then rule (1) is a special case of (6C). □

Exercise 4.9.4 Show that (5C) is a special case of (6C). □

Exercise 4.9.5 Show that (4₁) is not a special case of (6C). □

As with unconditional rewriting, there is also a very useful symmetrical variant of (6C):

(–6C)
Reverse Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if

  (∀Y) t₁ = t₂ if C

is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₂)) = t(z ← θ(t₁)) is also deducible.

The soundness of (–6C) follows from that of (6C), by applying (6C) and then using the symmetry rule (2) on the resulting equation. For clarity and emphasis, we may write (+6C) instead of (6C). We now combine (+6C) and (–6C) into a single rule, as follows:

(±6C) Bidirectional Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if either (∀Y) t₁ = t₂ if C or (∀Y) t₂ = t₁ if C is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

This rule is sound because it is the disjunction of two sound rules. It includes (5C) and basic cases (where the given equation is in A) of (2) and (4). In fact, the following can be shown:

Theorem 4.9.1 (Completeness of Subterm Replacement) For any set A of (possibly conditional) Σ-equations and any unconditional Σ-equation e, A ⊢C e iff A ⊢(1,3,±6C) e. □

This result follows from the more general Theorem 10.3.3 proved in Appendix B, for the case where the equations in A may be order-sorted. Note also that Theorem 4.5.4 is a special case of Theorem 4.9.1, and hence also of Theorem 10.3.3.

The rules (+6C), (–6C) and (±6C) can each be specialized to the case where t has exactly one occurrence of z. In particular, the specialized form of (±6C) is the following rule:

(±6C₁) Bidirectional Conditional One Occurrence Subterm Replacement.
Given t ∈ T_Σ(X ∪ {z}_s) with exactly one occurrence of z, where z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t1 = t2 if C or (∀Y) t2 = t1 if C is of sort s and is in A, then

  (∀X) t(z ← θ(t1)) = t(z ← θ(t2))

is deducible if for each pair ⟨u, v⟩ ∈ C, (∀X) θ(u) = θ(v) is also deducible.

We can now obtain the following:

Corollary 4.9.2
Given a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,

  A ⊢_{(1–6C)} e  iff  A ⊢_{(1,3,±6C)} e  iff  A ⊢_{(1,3,±6C1)} e .

Proof:
We have already shown the soundness of rules (1), (3) and (±6C1). Therefore A ⊢_{(1,3,±6C1)} e implies A ⊨ e, and then the completeness of ⊢_C gives us that A ⊢_C e. For the converse, by Theorem 4.9.1 it suffices to show that we can derive the rule (±6C) from rules (1), (3), and (±6C1). This can be done by using induction and rule (3) for the two cases (+6C) and (–6C) separately. The second "iff" follows as in Corollary 4.5.7. □

Results in this section provide a foundation for conditional term rewriting, discussed in Section 5.8 of the next chapter. In particular, the above result, extending Corollary 4.5.7, gives completeness of the transitive, reflexive, symmetric closure of forward conditional subterm replacement, which becomes conditional term rewriting when symmetry is dropped.

4.10 (⋆) Specification Equivalence
This section discusses the equivalence of specifications that may involve different signatures. For example, the following gives a rather different specification of groups:
Example 4.10.1
If we define a/b = a * b⁻¹ in the theory GROUPL of groups in Example 4.1.4, and then try to find enough properties of this operation to define groups, we might get the following:

th GROUPD- is sort Elt .
  op _/_ : Elt Elt -> Elt .
  var A B C : Elt .
  eq A /(B / B) = A .
  eq (A / A)/(B / C) = C / B .
  eq (A / C)/(B / C) = A / B .
endth
Even though these equations are enough, this specification is not equivalent to GROUPL, because the empty set is a model of GROUPD-, whereas it is not a model of GROUPL. However, if we add an identity to the specification, we do get an equivalent theory:

th GROUPD is sort Elt .
  op _/_ : Elt Elt -> Elt .
  op e : -> Elt .
  var A B C : Elt .
  eq A /(B / B) = A .
  eq (A / A)/(B / C) = C / B .
  eq (A / C)/(B / C) = A / B .
  eq (A / A) = e .
endth
The last axiom says that e is an identity. □

But how can we prove that GROUPD and GROUPL are equivalent? We will certainly need a more general definition of equivalence than the one given in Section 3.3, because the signatures of the two specifications are different.

Before we can give this definition, we need some more notation. Each Σ-algebra has an interpretation of each operation symbol σ ∈ Σ as an actual operation; we show how this extends to an interpretation for Σ-terms with variables. Given w = s1 ... sn ∈ S*, we let wX denote an S-sorted ground signature disjoint from Σ such that #(wX_s) = #{ i | s_i = s }. One way to construct such a signature is to let |wX| = {x1, ..., xn} where n = |w|, and then let wX_s = {x_i | s_i = s}. For example, if S = {a, b, c} and w = abbac, then wX has wX_a = {x1, x4}, wX_b = {x2, x3} and wX_c = {x5}. We shall use this construction in the following:

Definition 4.10.2
Given a signature Σ, the signature of all derived Σ-operations is the S-sorted signature Der(Σ) with Der(Σ)_{w,s} = T_Σ(wX)_s for all w ∈ S* and s ∈ S.
Any t ∈ Der(Σ)_{w,s} defines an actual operation M_t : M^w → M_s on any Σ-algebra M as follows: given a ∈ M^w, there is a naturally corresponding S-indexed map a : wX → M, which lets us view M as a Σ(wX)-algebra; hence there is a unique Σ(wX)-homomorphism ā : T_Σ(wX) → M, which lets us define M_t(a) to be ā(t). This is called the derived operation defined by t. In this way, we can view any Σ-algebra M as a Der(Σ)-algebra, also denoted M. □

Definition 4.10.3
Given signatures Σ and Σ′ with sort sets S and S′ respectively, a signature morphism (or map) ϕ : Σ → Σ′ consists of a map f : S → S′ and an S* × S-indexed map g with components g_{w,s} : Σ_{w,s} → Σ′_{f(w),f(s)}, where f is extended to lists by f([]) = [] and f(s1 ... sn) = f(s1) ... f(sn). Given s ∈ S and w ∈ S*, we may write ϕ(s) and ϕ(w) instead of f(s) and f(w), respectively; and given σ ∈ Σ_{w,s}, we may write ϕ(σ) instead of g(σ).

Given a signature morphism ϕ : Σ → Σ′ and a Σ′-algebra M, we get a Σ-algebra, called the reduct of M under ϕ and denoted ϕM, as follows:

• Given s ∈ S, let (ϕM)_s = M_{ϕ(s)};
• Given σ ∈ Σ_{w,s}, let (ϕM)_σ = M_{ϕ(σ)} : M_{ϕ(w)} → M_{ϕ(s)}.

In particular, given a signature morphism ϕ : Σ → Der(Σ′) and a Σ′-algebra M, we can view M as a Der(Σ′)-algebra by Definition 4.10.2, and then get a Σ-algebra denoted ϕM from the construction above. We will call a signature morphism ϕ : Σ → Der(Σ′) a derivor from Σ to Σ′. □

It follows that any derivor ϕ : Σ → Der(Σ′) induces a unique Σ-homomorphism T_Σ → ϕT_{Σ′}, because ϕT_{Σ′} is a Σ-algebra by the above. Let us denote this homomorphism ϕ.

Definition 4.10.4 An interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) is a derivor ϕ : Σ → Der(Σ′) such that for every Σ′-algebra M′, M′ ⊨ A′ implies ϕM′ ⊨ A.
Specifications (Σ, A) and (Σ′, A′) are equivalent iff there exist interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ, A) such that

  ϕ(ψM) = M
  ψ(ϕM′) = M′

for all (Σ, A)-algebras M and (Σ′, A′)-algebras M′. □

If specifications (Σ, A) and (Σ′, A′) are equivalent, then any (Σ, A)-algebra can be seen as a (Σ′, A′)-algebra, and vice versa. We will see that the specifications GROUPL and GROUPD are equivalent in this sense. Also, Exercises 4.6.1–4.6.4 show that the specifications GROUP and GROUPL are equivalent in the sense of the less general definition of equivalence given in Section 3.3, which suffices because the signatures are the same.

It is worth remarking that we can get an even more general definition of equivalence by weakening the conditions in the above definition to the following:

  ϕ(ψM) ⊨ e  iff  M ⊨ e
  ψ(ϕM′) ⊨ e′  iff  M′ ⊨ e′

for all Σ-equations e and Σ′-equations e′. Another generalization (which however involves some category theory) is to require the two categories of models to be isomorphic.

Exercise 4.10.1
Use OBJ3 to prove that the specifications GROUPL and GROUPD (of Example 4.10.1) are equivalent in the sense of Definition 4.10.4. □
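One direction of this exercise can be set up as an OBJ3 proof score; the following is only a sketch under our own assumptions: the constants a, b, c are fresh, the equation A / B = A * (B -1) defines division as in Example 4.10.1, and the auxiliary group lemmas added below (right identity, right inverse, e -1 = e, inverse of a product, double inverse, and left cancellation) are assumed to have been proved first, e.g., along the lines of Exercises 4.6.1–4.6.4.

```
open GROUPL .
  op _/_ : Elt Elt -> Elt .
  ops a b c : -> Elt .
  vars A B : Elt .
  eq A / B = A * (B -1) .
  *** assumed lemmas, each provable from GROUPL:
  eq A * e = A .
  eq A * (A -1) = e .
  eq e -1 = e .
  eq (A * B) -1 = (B -1) * (A -1) .
  eq (A -1) -1 = A .
  eq (A -1) * (A * B) = B .
  ***> the GROUPD axioms; each reduction should yield true:
  red a / (b / b) == a .
  red (a / a) / (b / c) == c / b .
  red (a / c) / (b / c) == a / b .
  red a / a == e .
close
```

Sketching the converse direction, with A * B defined as A / (e / B) and A -1 as e / A in GROUPD, would complete the equivalence.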
Exercise 4.10.2
Show that if two specifications are equivalent in the sense of Section 3.3, then they are equivalent in the sense of Definition 4.10.4. □
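When every operation is mapped to an operation (rather than to a general derived term), a signature morphism in the sense of Definition 4.10.3 can be written down directly in OBJ3 as a view. The following sketch assumes a theory MONOID and a module NAT providing a sort Nat with _+_ and 0; all the names here are our own:

```
th MONOID is sort M .
  op _*_ : M M -> M .
  op e : -> M .
  vars A B C : M .
  eq (A * B) * C = A * (B * C) .
  eq e * A = A .
  eq A * e = A .
endth

*** the signature morphism sends M to Nat, _*_ to _+_, and e to 0;
*** because NAT satisfies the translated axioms, it is in fact an
*** interpretation of specifications in the sense of Definition 4.10.4:
view PLUS from MONOID to NAT is
  sort M to Nat .
  op _*_ to _+_ .
  op e to 0 .
endv
```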
Exercise 4.10.3
Show that replacing the three equations in GROUPD- by the single equation

  eq A /((((A / A)/ B)/ C)/(((A / A)/ A)/ C)) = B .

yields an equivalent specification. □
Definition 4.10.5
A derivor ϕ : Σ → Der(Σ′) induces a signature morphism ϕ* : Der(Σ) → Der(Σ′) as follows: suppose that ϕ = ⟨f, g⟩, and note that g_{w,s} : Σ_{w,s} → T_{Σ′}(f(w)X)_{f(s)} for each pair ⟨w, s⟩. Define ϕ*_w : T_Σ(wX) → ϕT_{Σ′}(f(w)X) for each w ∈ S* to send x_i (of sort w_i) in wX to x_i (of sort f(w_i)) in f(w)X, noting that T_Σ(wX) is the free Σ-algebra generated by wX and that ϕT_{Σ′}(f(w)X) is also a Σ-algebra. Then, noting that ϕ*_w is an S-sorted map, the collection ϕ*_{w,s} forms a signature morphism ϕ* : Der(Σ) → Der(Σ′).

Now given interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ″, A″), we can define their composition as interpretations to be the signature morphism ϕ ; ψ* : Σ → Der(Σ″). □

Exercise 4.10.4
Show that the composition of two interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ″, A″) is also an interpretation. □

Exercise 4.10.5
Let i_Σ : Σ → Der(Σ) send σ ∈ Σ_{w,s} to the term σ(x1, ..., xn) ∈ T_Σ(wX)_s, where w has length n. Then show that (i_Σ)* = 1_{Der(Σ)} and that i_Σ serves as an identity for the composition of interpretations. □

In Section 5.9 we will need a more syntactical formulation of interpretation. We first give some auxiliary notions:
Definition 4.10.6
Let X be an S-sorted variable set and let f : S → S′ be a map. Then the S′-sorted variable set denoted fX is defined for s′ ∈ S′ by

  (fX)_{s′} = ⋃ { X_s | f(s) = s′ } .

Note that fX is again a variable set because the variable symbols in X are all distinct.

Now let t be a Σ-term with variables in X and let ϕ = (f, g) : Σ → Σ′ be a signature morphism. Then we extend ϕ to a function ϕ : T_Σ(X) → ϕT_{Σ′}(fX) as follows:

  ϕ(x) = x  for x ∈ X
  ϕ(σ) = g_{[],s}(σ)  for σ ∈ Σ_{[],s}
  ϕ(σ(t1, ..., tn)) = g_{w,s}(σ)(ϕt1, ..., ϕtn)  for σ ∈ Σ_{w,s}, where w = s1 ... sn.

Finally, let e be a Σ-equation (∀X) t = t′ and let ϕ = (f, g) : Σ → Σ′ be a signature morphism. Then the Σ′-equation denoted ϕe is defined to be (∀fX) ϕt = ϕt′. In this context, we may also write ϕX instead of fX.

Note that ϕ : T_Σ(X) → ϕT_{Σ′}(ϕX) is a Σ-homomorphism, and that it could also have been defined using the freeness of T_Σ(X). □

We will need the following result:

Theorem 4.10.7 (Satisfaction Condition) Given a signature morphism ϕ : Σ → Σ′ and a Σ′-algebra M′, then for any Σ-equation e,

  M′ ⊨_{Σ′} ϕe  iff  ϕM′ ⊨_Σ e . □

A proof may be found in [66]; it is not trivial. A general discussion of the importance of this kind of result for abstract model theory is given in [67]. We can now state the result that we have been working towards:
Theorem 4.10.8
A derivor ϕ : Σ → Der(Σ′) is an interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) iff for each Σ-equation e, A ⊨_Σ e implies A′ ⊨_{Σ′} ϕe.
Call the conditions of Definition 4.10.4 and of this theorem (A) and (B), respectively. To show (A) implies (B), we assume (A) and A ⊨ e, and then show that A′ ⊨ ϕe, i.e., that M′ ⊨ A′ implies M′ ⊨ ϕe. So assuming M′ ⊨ A′, (A) gives us ϕM′ ⊨ A, and then A ⊨ e gives us ϕM′ ⊨ e. Now we apply the satisfaction condition to obtain M′ ⊨ ϕe, as desired.

For the converse, we assume (B) and M′ ⊨ A′, and wish to show that ϕM′ ⊨ A. If we let e ∈ A, then A ⊨ e, so that (B) reduces to M′ ⊨ A′ implies M′ ⊨ ϕe. Our assumption then gives M′ ⊨ ϕe, and the satisfaction condition gives ϕM′ ⊨ e. Therefore ϕM′ ⊨ A. □

The following is now a direct consequence of the completeness of equational deduction:
Corollary 4.10.9
A derivor ϕ : Σ → Der(Σ′) is an interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) iff for each equation e ∈ A, A ⊢ e implies A′ ⊢ ϕe. Furthermore, a pair of derivors ϕ : Σ → Der(Σ′) and ψ : Σ′ → Der(Σ) is an equivalence of the specifications (Σ, A) and (Σ′, A′) iff

  A ⊢ e implies A′ ⊢ ϕe
  A′ ⊢ e′ implies A ⊢ ψe′

for each e ∈ A and e′ ∈ A′, and in addition

  ϕ(ψM) = M
  ψ(ϕM′) = M′

for each (Σ, A)-algebra M and (Σ′, A′)-algebra M′. □
Exercise 4.10.6
Given a specification (Σ, A), show that its closure (Σ, Ā) is given by

  Ā = { e | M ⊨ A implies M ⊨ e, for all Σ-algebras M } ;

that is, Ā equals the set of all equations that are true of all models of A. Now show that if ϕ and ψ are an equivalence of specifications (Σ, A) and (Σ′, A′), then

  ϕĀ = Ā′
  ψĀ′ = Ā . □

4.11 (⋆) A More Abstract Formulation of Deduction
This section gives a more abstract formulation of deduction; it is rather specialized, and can safely be skipped on a first reading or in an introductory course.
Definition 4.11.1
Let Sen be a set whose elements are called sentences. Then an inference rule over Sen is a pair ⟨H, c⟩ where H ⊆ Sen is finite and c ∈ Sen; we call H the hypothesis of the rule, and c its conclusion. A subset C ⊆ Sen is closed under ⟨H, c⟩ iff H ⊆ C implies c ∈ C. Given a set R of inference rules, a set C of sentences is closed under R iff C is closed under each rule in R. Two sets R, R′ of inference rules are called equivalent iff they have the same closed sets of sentences.

Given a set R of inference rules, a proof for a sentence e is a finite sequence of sentences e1 e2 ... en with en = e, such that for i = 1, ..., n there exists a rule ⟨H, e_i⟩ with H ⊆ {e1, ..., e_{i−1}}; in this case, we say that e can be proved using R. Let Th(R) denote the set of all sentences that can be proved using R. □

In the case of equational logic,
Sen would be the set of all equations over a given signature, and a rule (e.g., congruence) is represented by the set of its instances for all ground equations. A rule with no hypothesis, such as reflexivity, has H = ∅.

Fact 4.11.2 If C_i is closed under a set R of inference rules for each i ∈ I, then ⋂_{i∈I} C_i is also closed under R. □

Proposition 4.11.3
Given any set R of inference rules, Th(R) is the least set of sentences that is closed under R.

Proof:
We first show that Th(R) is closed under R. Let ⟨H, c⟩ be in R with H = {e1, ..., en} such that H ⊆ Th(R). Then for i = 1, ..., n there exists a proof p_i of e_i. It now follows that the sequence p1 ... pn c is a proof of c, so that c ∈ Th(R).

Now assuming that T ⊆ Sen is closed under R, we will show that Th(R) ⊆ T. Let e ∈ Th(R) and let e1 ... en be a proof of e = en. We have to show e ∈ T. For this purpose, we show by induction that e_i ∈ T for i = 1, ..., n. Suppose that e_j ∈ T for each j < i. Because e1 ... e_i is a proof, there exists a rule ⟨H, e_i⟩ such that H ⊆ {e1, ..., e_{i−1}}. Since H ⊆ T and T is closed under R, we have that e_i ∈ T. □

This result can be used to prove a property of Th(R) by showing that the set of sentences having that property is closed under R. This proof technique may be called "structural induction over proof rules" (see also the discussion in Section 3.2.1).

4.12 Literature

It is typical of logical systems that there are many different variants of their rules of deduction, each variant more suitable for some purposes than for others. Although the intuitions behind equational logic are very familiar (basically, substituting equals for equals in equals), the details can be surprisingly subtle, and there are many errors in the published literature. In fact, it is also typical that reasoning about deducibility (such reasoning is called "proof theory") can be very subtle, and the reasons for emphasizing semantics in this text include avoiding such reasoning as much as possible, as well as checking its soundness. (The proof of Theorem 4.5.4 in Appendix B is a good example of proof-theoretic reasoning.)

In 1935 Birkhoff [12] first proved a completeness theorem for equational logic, in the unsorted case.
Example 4.3.8, showing that the unsorted rules can be unsound for many-sorted algebras that may have empty carriers, is from [78] and [137], which first gave rules of deduction that are sound and complete for the general case. The rules in Section 4.7 are also from [78]. The discussion of completeness (especially Theorem 4.4.2) follows [136], although the proof in Appendix B is from [82] for the order-sorted case. The discussion in Section 4.3 follows that in [80]. A more detailed historical discussion of various versions of completeness for equational logic is given in [136], along with further discussion of the non-equivalence of one-sorted and many-sorted equational logic that was demonstrated in Section 4.3.2.

Subterm replacement is the logical basis for term rewriting, which is the topic of Chapter 5. Some historical remarks on term rewriting are also given there. The extension of OBJ3 to permit applying rules one at a time was done at Oxford by Timothy Winkler, mostly during September 1989.

The discussion of deduction with conditional equations in Sections 4.8 and 4.9 parallels that in the preceding sections for unconditional equations. The conditional Completeness Theorem (4.8.3) is a special case of the corresponding theorem for the order-sorted case, first proved in (the first version of) [82]. Theorems 4.9.1 and 4.8.4 seem not to appear in the literature, though they are certainly important, and will probably not surprise experts in the field.

Lawvere's categorical formulation of algebraic theories for the unsorted case [121] embodies the same notion of specification equivalence as that discussed in Section 4.10; however, the formulation that we give is new. Readers with a categorical background may be interested to note that
Der is a left adjoint to the forgetful functor from Lawvere theories to signatures; many properties in Section 4.10 follow from this property. The composition operation for interpretations is the Kleisli category composition. The one-equation specification of groups in Section 4.10 is due to Higman and Neumann [102]. The satisfaction condition (Theorem 4.10.7) plays a central role in the theory of institutions, which is an abstract theory of the relationship between syntax and semantics that has been used for a number of computing science applications, including specification [67].

Equational deduction has many applications in computer science and other areas. For example, it is applied to modelling and verifying several different kinds of hardware circuits in Chapter 5, and it has even been used directly as a model of computation [70, 135].

I thank Prof. Virgil Emil Cazanescu for the formulations in Section 4.11, and for several other valuable suggestions, including finding several bugs in this chapter and suggesting fixes for them.
A Note to Lecturers:
When lecturing on Section 4.5, it would make sense to treat all the preceding rules and results as leading up to rule (±6) and Theorem 4.5.6. In this case, the variant rules (such as (6C)) and the results about them can be regarded as parts of the proof of Theorem 4.5.6, and assigned as reading rather than covered in lectures.

Much of the material on conditional equations can be covered quickly by relying on the analogy with the unconditional case. However, rule (5C) and the two completeness theorems (4.8.3 and 4.8.4) should be covered explicitly.

The material in Section 4.10 is difficult and should not be attempted in an introductory course. Section 4.11 is rather specialized and somewhat abstract, and should also be omitted in an introductory course. Section 4.5.1 is also rather specialized.
5 Rewriting
This chapter studies term rewriting, the restricted form of equational deduction in which equations are applied in the forward direction only, starting from a given term and "chaining" with the transitivity of equality, that is, repeatedly applying the rule (+6), or (+6C) for the conditional case, in the style of Example 4.5.5. Corollary 4.5.7 (or Corollary 4.9.2) shows that term rewriting with symmetry is complete for equational logic. Without symmetry, completeness is lost, but we will see that certain natural assumptions restore it, giving a decision procedure for equality of ground terms under loose semantics, and a computational semantics for sets of equations. Term rewriting has important applications in algebraic specification, computer algebra, the λ-calculus, implementation of declarative languages, and much else.

OBJ implements term rewriting with a command which, given a term, searches for a match of a rule to a subterm of that term, applies the rule, and iterates this process until there are no more matches; the final term (if one exists) is considered the result of the computation. This process is sound, even when not complete.

This chapter mainly presents and proves results that are useful for theorem proving; many are new, especially those on termination and conditional term rewriting. Abstract rewrite systems are also discussed, with optional sections (5.8.3 and 5.9) on Noetherian orderings and on the relationship between term rewriting and abstract rewriting systems.

5.1 Term Rewriting

Syntactically, rewrite rules are a special kind of equation; their definition (5.1.3 below) uses:
Definition 5.1.1
Given t ∈ T_Σ(X), the set of variables in t, denoted var(t), is the least subsignature Y ⊆ X such that t ∈ T_Σ(Y). □

We can define var(t) formally using initial algebra semantics (Section 3.2). Let V be the S-sorted set with each V_s the set of all finite subsets of the elements of X, given Σ(X)-structure by V_x = {x} ∈ V_s for x ∈ X_s, V_σ = ∅ ∈ V_s for σ ∈ Σ_{[],s}, and
Notation 5.1.2
From now on, we will usually just say "Σ-term" for what we were previously careful to call a "Σ-term with variables," i.e., for t ∈ T_Σ(X) for some ground signature X without overloading. A Σ-term without variables will be called a ground term. □

Notice that t is a ground term iff var(t) = ∅. Of course, every Σ-term t is a ground term over any signature that contains Σ(var(t)). The usual literature on term rewriting is not very careful with bookkeeping for the variables involved, but we have seen in Section 4.3.2 that this can be very important for theorem proving.

Definition 5.1.3 A Σ-rewrite rule is a Σ-equation (∀X) t1 = t2 with var(t2) ⊆ var(t1) = X. It follows that the notation t1 → t2 is unambiguous, because X is determined by t1. A Σ-term rewriting system (abbreviated Σ-TRS) is a set of Σ-rewrite rules; we may denote such a system (Σ, A), and we may omit Σ here and elsewhere if it is clear from context. □

Most equations that users write in OBJ are rewrite rules, with variables exactly those that occur in their leftsides. For example, all the equations in all of our group specifications are rewrite rules in this way. Notice that if some equation is not a rewrite rule, then its converse (with its left and right sides reversed) may be a rewrite rule.

The rule (+6₁) of Chapter 4 replaces exactly one subterm, moving in the forward direction only. We now further restrict it to equations that are rewrite rules, to get the following:

(rw) Rewriting. Given t ∈ T_Σ({z}_s ∪ Y) with exactly one occurrence of z, and given a substitution θ : X → T_Σ(Y), if t1 → t2 is a Σ-rewrite rule of sort s in A with var(t1) = X, then

  (∀Y) t(z ← θ(t1)) = t(z ← θ(t2))

is deducible.

As usual, it is assumed that {z}_s and Y are disjoint. This rule is sound because it is a restriction of a rule that we have already proved to be sound.
The successive application of this rule to a term gives a method of reasoning that is formalized in the following:

Definition 5.1.4
Given a Σ-TRS A, the one-step rewriting relation is defined for Σ-terms t, t′ by t ⇒_A t′ iff there exist a rule t1 → t2 of sort s in A, a term t0 ∈ T_Σ({z}_s ∪ Y) with exactly one occurrence of the variable z, and a substitution θ : X → T_Σ(Y) where X = var(t1), such that t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)).

In this case, the pair (t0, θ) is called a match to (a subterm of) t by (the term t1 of) the rule t1 → t2. The term rewriting relation is the transitive, reflexive closure of the one-step rewriting relation, for which we write t ⇒*_A t′, and we say that t rewrites to t′ (under A). We may also write t ⇒+_A t′ if t rewrites to t′ in one or more steps (i.e., ⇒+_A is the transitive closure of ⇒_A), and ⇔*_A for the transitive, reflexive, symmetric closure of ⇒_A. We may omit the subscript A if it is clear from context.

(The footnote on var from Definition 5.1.1 concludes: V_σ(v1, ..., vn) = ⋃_{i=1}^{n} v_i for σ ∈ Σ_{w,s}, where w = s1 ... sn and v_i ∈ V_{s_i}. Then var is the restriction to T_Σ(X) of the unique Σ(X)-homomorphism T_Σ(X) → V.)

It is common to assume that the leftside of a rewrite rule is not just a single variable, because rules of this kind, which we call lapse rules, are not very useful and moreover have some bad properties. (The term "lapse" is a joke based on the facts that a rule with its rightside a variable is called a collapse rule, and that "co" indicates duality.) However, few results actually need the no lapse assumption, and we will invoke it only where necessary. A TRS that has no lapse rules will be called lapse free.

Also, the variables of a rightside need not be exactly those of its leftside; for example, the variables that occur in the two sides could even be disjoint.
The subterm θ(t1) of t is sometimes called the redex (for reducible expression) of the rewrite t ⇒ t′, and the subterm θ(t2) of t′ is sometimes called the contractum of the rewrite, while t is called the source and t′ the target or the result of the rewrite. □

For reasons that will soon become clear, it is worth emphasizing that the above definitions apply only to ground terms, i.e., to T_Σ. We consider rewriting terms with variables later.

Example 5.1.5
Consider the following specification for the natural numbers with addition:

obj NATP+ is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 2] .
  op _+_ : Nat Nat -> Nat .
  var N M : Nat .
  eq N + 0 = N .
  eq N + s M = s(N + M) .
endo
We can tell OBJ to regard this specification as a TRS and apply its equations as rewrite rules to a term t, just by giving the command

  reduce t .

where the final period is required, and must be separated from the term by a space unless the last character of t is a parenthesis. Note that, by default, reductions are executed in the context of the most recently preceding module. Also, "reduce" can be abbreviated "red". Here are two examples:

  red s s 0 + s s 0 .
  red s(s s 0 + s s s 0)+ 0 .

(The concept of a TRS is reviewed in Appendix C.)
The results are as you would expect, namely s s s s 0 and s s s s s s 0, respectively, i.e., 2 + 2 = 4 and (1 + (2 + 3)) + 0 = 6. The OBJ output from the first reduction looks as follows,

==========================================
reduce in NATP+ : s (s 0) + s (s 0)
rewrites: 3
result Nat: s (s (s (s 0)))
==========================================

and the steps of this reduction are

  s s 0 + s s 0 ⇒ s(s s 0 + s 0) ⇒ s s(s s 0 + 0) ⇒ s s(s s 0) = s s s s 0 .

If the trace mode is on, then OBJ will display each rewriting step of each reduction it executes. The commands

  set trace on .
  set trace off .

respectively turn trace mode on and off. □
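A reduction can also name its module explicitly, in the same form that already appears in the output header above; for example (a small sketch):

```
red in NATP+ : s 0 + s 0 .
```

Here the expected result is s s 0, by the two steps s 0 + s 0 ⇒ s(s 0 + 0) ⇒ s(s 0).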
Exercise 5.1.1
Show the rewriting steps for the second reduction in Example 5.1.5. □
Examples like the above suggest how term rewriting can be considered a model of computation: given a Σ-term t0 in T_Σ(X), each rewrite in a sequence

  t0 ⇒ t1 ⇒ t2 ⇒ ...

is considered a step of computation, and a term that cannot be rewritten any further is considered a result. Note that given t with var(t) = X, all these computations occur in the Σ-algebra T_Σ(X), or T_Σ(Y) for any X ⊆ Y. The following formalizes this notion of a computation result:
Given a Σ-TRS A, a Σ-term t is irreducible, also called a normal or reduced form (under A), iff there is no match to t by any rule in A. If t ⇒*_A t′ and t′ is a normal form, then we say that t′ is a normal (or reduced) form of t (under A). □

Here is another example of computation with rewrite rules:
Example 5.1.7
The following simple version of the Booleans is sufficient for computing the values of ground terms that involve only true, false, and, and not:

obj ANDNOT is sort Bool .
  ops true false : -> Bool .
  op _and_ : Bool Bool -> Bool .
  op not_ : Bool -> Bool [prec 2] .
  var X : Bool .
  eq true and X = X .
  eq false and X = false .
  eq not true = false .
  eq not false = true .
endo

The following are some sample reductions using this code:

  red not true and not false .
  red not (true and not false) .
  red (not not true and true) and not false .
We can also run reductions that involve variables, e.g.,

  red X and not X .
  red not not X .

(We don't need to declare X here because it is already declared in ANDNOT, but any other variables would need to be declared.) The results of these two reductions show that this specification is not powerful enough to prove every true Boolean equation with variables. □
Exercise 5.1.2
Show the rewrites and the results for each reduction in Example 5.1.7. □
Although this book contains many "natural" examples of TRS's, artificial examples like the two below can be illuminating, for example, as counterexamples to conjectured general results, or to illustrate certain concepts:
Example 5.1.8
Consider a TRS A having just one sort, with a, b, c, d constants of that sort, and with the following rewrite rules: a → b, a → c, b → a, and b → d. □
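For concreteness, the abstract TRS of Example 5.1.8 can be transcribed into OBJ syntax (a sketch; the module name is ours):

```
obj ABCD is sort Elt .
  ops a b c d : -> Elt .
  eq a = b .
  eq a = c .
  eq b = a .
  eq b = d .
endo
```

But one should not ask OBJ to reduce a here: since a ⇒ b and b ⇒ a, reduction need not terminate, which is exactly why such artificial systems are useful as counterexamples rather than as programs.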
Show that a ⇒+ c, a ⇒+ d, and also that a ⇒+ a, for the TRS of Example 5.1.8. □

The rest of this section explores the crucial relationship between term rewriting and equational deduction. We first extend rewriting to terms with variables X, which we define as ground term rewriting in T_Σ(X), and denote by ⇒_{A,X} when rules in A are used.
Proposition 5.1.9
Given t, t′ ∈ T_Σ(Y), Y ⊆ X, and a Σ-TRS A, then t ⇒_{A,X} t′ iff t ⇒_{A,Y} t′, and in both cases var(t′) ⊆ var(t).

Proof:
The converse implication is easy, so we assume t ⇒_{A,X} t′ with t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)) for some rule t1 → t2 in A with θ : var(t1) → T_Σ(X) and t0 ∈ T_Σ(X ∪ {z}). Since t, t′ ∈ T_Σ(Y), we must have t0 ∈ T_Σ(Y ∪ {z}) as well as θ(t1), θ(t2) ∈ T_Σ(Y), so that θ : var(t1) → T_Σ(Y). Therefore t ⇒_{A,Y} t′, and var(t′) ⊆ var(t) since var(t2) ⊆ var(t1). □
Given t, t′ ∈ T_Σ(Y), Y ⊆ X, and a Σ-TRS A, then t ⇒*_{A,X} t′ iff t ⇒*_{A,Y} t′, and in both cases var(t′) ⊆ var(t).

Proof:
By induction using Proposition 5.1.9. □
Thus both ⇒_{A,X} and ⇒*_{A,X} restrict and extend well over X, so we can drop the variable set subscript and write just t ⇒*_A t′, with the understanding that any X such that var(t) ⊆ X may be used.
Show that for any finite X, there are A, t, t′ such that t ⇒_A t′ but t ⇒_{A,X} t′ fails. □
Proposition 5.1.11
Given a Σ-TRS A and t1, t2 ∈ T_Σ(X), then t1 ⇒*_A t2 implies A ⊢ (∀X) t1 = t2.

Proof:
Each single step of rewriting is an application of the rule (rw), so soundness follows from the fact that (rw) is a special case of the rule (+6₁) of Chapter 4, which we have already shown sound. Induction then extends this from ⇒_A to ⇒*_A. □
Given a Σ-TRS A and t1, t2 ∈ T_Σ(X), write t1 ↓_{A,X} t2 iff there is some Σ-term t such that t1 ⇒*_A t and t2 ⇒*_A t; in this case, we say that t1 and t2 are convergent, or converge (to t). We may refer to a configuration t1 ⇒*_A t, t2 ⇒*_A t as a join or a V. □
Given Σ -TRS A and t , t ∈ T Σ (Y ) with Y ⊆ X , then t ↓ A,X t iff t ↓ A,Y t , so the variable set subscript can be dropped. Moreover, t ↓ A t implies A (cid:96) ( ∀ X) t = t . (cid:2) Exercise 5.1.5
Prove Proposition 5.1.13. (cid:2)
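The joinability relation of Definition 5.1.12 is easy to compute when the rewrite relation is finite. The following Python sketch is an editorial illustration (not part of OBJ, and the element names are made up): it works with an abstract one-step relation given as a dictionary, and decides whether two elements converge by intersecting their sets of ⇒*-reachable elements.

```python
# One-step rewrite relation on a finite set, as a dict from each
# element to the set of its one-step successors.  This is an
# abstract rewrite system, with just enough structure to
# illustrate joins; the element names are hypothetical.
REL = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set(), "e": set()}

def reachable(x, rel):
    """All y such that x =>* y (reflexive-transitive closure)."""
    seen, todo = {x}, [x]
    while todo:
        for y in rel[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def joinable(x, y, rel):
    """x and y converge iff some z has x =>* z and y =>* z."""
    return bool(reachable(x, rel) & reachable(y, rel))
```

Here joinable("b", "c", REL) holds, with join term "d", while "e" converges with nothing but itself.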
Proposition 5.1.14
Given a Σ-TRS A and t, t′ ∈ TΣ(X), then t ⇔*A t′ iff there are t1, ..., tn ∈ TΣ(X) such that t ↓A t1, ti ↓A ti+1 for i = 1, ..., n−1, and tn ↓A t′.

Proof: If R denotes the transitive closure of ↓A, then we wish to show that t ⇔*A t′ iff t R t′. Since ⇒A ⊆ ↓A ⊆ ⇔*A, it follows that ⇒*A ⊆ R ⊆ ⇔*A. But R is also reflexive and symmetric because ↓A is. Therefore R = ⇔*A. □

Corollary 5.1.10 and Proposition 5.1.13 show that ⇒*A,X and ↓A,X restrict and extend reasonably over variables, whereas ⇔*A,X does not, because in Example 5.1.15 below, T ⇔*FOO,{x} F holds but T ⇔*FOO,∅ F fails. Nevertheless, it makes sense to let t ⇔*A t′ mean that there exists an X such that t ⇔*A,X t′, and we shall do so.

Example 5.1.15
Using the specification of Example 4.3.8 and letting A = FOO, then over the signature Σ({x}), where x has sort A, we have

T ↓ (foo(x) ∨ ¬foo(x)) ↓ foo(x) ↓ (foo(x) & ¬foo(x)) ↓ F,

which implies the valid equation (∀x) T = F, but not the invalid equation (∀∅) T = F. □

In fact, ⇔*A is complete, even though ⇔*A,X may not be when X is finite:

Theorem 5.1.16
Given a Σ-TRS A and t, t′ ∈ TΣ(X), then t ⇔*A t′ if and only if A ⊢ (∀X) t = t′.

Proof:
By Proposition 5.1.11, we need only prove the converse direction, so assume we have a proof of A ⊢ (∀X) t = t′. Any such proof necessarily starts with (1) and then chains forward using the other rules until an application of (2) occurs. Unless this chain gives t′ or is a dead end, its final term must also be the final term of another chain, in which case we have a join. Similarly, the whole proof is a set of joins, which can only be put together without dead ends if they form a sequence as in Proposition 5.1.14, which then gives t ⇔*A t′. □

One might think that Proposition 5.1.14 could give a decision procedure for ⇔*A, and hence for equality under A, since only term rewriting is involved, but this is not the case, because it can be difficult to find the appropriate ti. In fact, the problem is unsolvable, for reasons discussed in Section 5.10.

5.2 Canonical Form

When we compute, we usually hope to get a unique well-defined answer in the end. The following gives one necessary condition for this to occur for every term using a given set of rules; we also give a version that applies to all terms of a particular sort.
Definition 5.2.1  A Σ-TRS A is terminating (also called Noetherian) iff there is no infinite sequence t1, t2, t3, ... of Σ-terms such that t1 ⇒ t2 ⇒ t3 ⇒ ... . Similarly, A is ground terminating iff there is no such infinite sequence of ground terms, and A is terminating (or ground terminating) for sort s iff there is no such sequence (of ground terms), all of sort s. □

For example, if we add a commutative law for addition to the specification NATP+ of Example 5.1.5, then there are computations like the following that do not terminate:

0 + s 0 ⇒ s 0 + 0 ⇒ 0 + s 0 ⇒ ... .

If A is terminating, then every Σ-term has a normal form, but some Σ-terms may have more than one normal form. However, the following condition will guarantee the uniqueness of normal forms for terminating TRS's; again we also give a version for terms of a particular sort.

Definition 5.2.2  A Σ-TRS A is Church-Rosser (also called confluent) iff for every Σ-term t, whenever t ⇒* t1 and t ⇒* t2, there is some Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Similarly, A is Church-Rosser for sort s iff this condition holds for all t of sort s. A Σ-TRS A is canonical (or sometimes convergent or complete) iff it is terminating and Church-Rosser, and is canonical for sort s iff it is terminating and Church-Rosser for sort s. In these cases, normal forms may also be called canonical forms.

A Σ-TRS A is ground Church-Rosser (also called ground confluent) iff for every ground Σ-term t, whenever t ⇒* t1 and t ⇒* t2, there is a ground Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Similarly, A is ground canonical iff it is ground terminating and ground Church-Rosser, and is ground Church-Rosser for sort s iff the condition holds for all t of sort s. In these cases, normal forms may also be called ground canonical forms.

Similarly, a TRS A is locally Church-Rosser (or locally confluent) iff for every Σ-term t, whenever t ⇒ t1 and t ⇒ t2, there is a Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Also, a TRS A is ground locally Church-Rosser iff the above condition holds for all ground terms t, and is locally Church-Rosser for sort s (or ground locally Church-Rosser for sort s) iff it holds for all (ground) terms t of sort s. □

The generalizations to a particular sort are needed because rewriting with a many-sorted TRS may well have different properties for different sorts.
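The gap between the Church-Rosser and locally Church-Rosser properties can be seen already in an abstract setting, with no term structure at all. The Python sketch below is an editorial illustration using Kleene's classic four-element example b ← a ⇄ c → d, which is locally confluent but not confluent; the dictionary encoding is an assumption of the sketch.

```python
# Kleene's example:  a => b,  a => c,  c => a,  c => d.
# Not terminating (a and c rewrite to each other), locally
# Church-Rosser, but not Church-Rosser: b and d are distinct
# normal forms of a.
REL = {"a": {"b", "c"}, "b": set(), "c": {"a", "d"}, "d": set()}

def reachable(x, rel):
    """All y such that x =>* y."""
    seen, todo = {x}, [x]
    while todo:
        for y in rel[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def locally_church_rosser(rel):
    """Every one-step peak t1 <= t => t2 is joinable."""
    return all(bool(reachable(y1, rel) & reachable(y2, rel))
               for ys in rel.values() for y1 in ys for y2 in ys)

def church_rosser(rel):
    """Every many-step peak t1 <=* t =>* t2 is joinable."""
    return all(bool(reachable(y1, rel) & reachable(y2, rel))
               for x in rel
               for y1 in reachable(x, rel)
               for y2 in reachable(x, rel))
```

Note that Newman's Lemma (local confluence plus termination implies confluence) does not apply here, precisely because this system is not terminating.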
Figures 5.1(a) and 5.1(b) illustrate the Church-Rosser and local Church-Rosser properties graphically.*

[Figure 5.1: Church-Rosser Properties. (a) Church-Rosser; (b) Locally Church-Rosser.]

For example, our discussion of FOO from Example 4.3.8 shows that the resulting TRS is not Church-Rosser, because if we let t = foo(x) & ¬foo(x), then t ⇒* F and t ⇒* foo(x), each of which is irreducible; however, this system is terminating. The TRS of Example 5.1.8 is not terminating; it is also not Church-Rosser, because both c and d are normal forms of a. The following result is immediate from Definitions 5.2.1 and 5.2.2:

Fact 5.2.3  If (Σ, A) is Church-Rosser, then it is ground Church-Rosser; if it is terminating, then it is ground terminating; and if it is canonical, then it is ground canonical. □

However, a ground Church-Rosser TRS is not necessarily Church-Rosser, and a ground canonical TRS is not necessarily canonical, as the following shows:

*But the French school does not use the terms "Church-Rosser" and "confluent" synonymously, e.g., [109].
Example 5.2.4
Consider the following variant of the theory of monoids, in which the direction of the associative law has been reversed:

  th RMON is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq (X * Y) * Z = X * (Y * Z) .
  endth
Viewed as a TRS, this has only one reduced ground term, namely e, so it is certainly ground Church-Rosser, ground locally Church-Rosser, ground terminating, and ground canonical. However, it is not Church-Rosser; for example, the term (X * e) * X rewrites to both X * X and X * (e * X), each of which is reduced. □
Exercise 5.2.1
Show that the following specification of the Peano natural numbers with addition gives a ground canonical TRS that is not Church-Rosser, and hence not canonical:

  obj RNATP+ is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 2] .
    op _+_ : Nat Nat -> Nat .
    vars X Y Z : Nat .
    eq 0 + X = X .
    eq (s X) + Y = s(X + Y) .
    eq X + (Y + Z) = (X + Y) + Z .
  endo
□
Exercise 5.2.2
Show that if a TRS A has a lapse rule of sort s with its rightside a ground term, then A is Church-Rosser for sort s. □

Exercise 5.2.3  Show that the TRS of Example 5.1.8 is locally Church-Rosser. □
We call the next result a "Theorem" and give its proof in detail, even though it is trivial, because it is such a fundamental result about term rewriting:
Theorem 5.2.5
If a Σ-TRS A is canonical, then every Σ-term t has a unique normal form, denoted [[t]]A, or just [[t]] if A is clear from context, and called the canonical form of t.

Proof: Each Σ-term t has at least one normal form, by the Noetherian property. Suppose that t1 and t2 are two normal forms for t. Then by the Church-Rosser property, because t ⇒* t1 and t ⇒* t2, there is a term t3 such that t1 ⇒* t3 and t2 ⇒* t3. But because t1 and t2 are both normal forms, we get that t1 = t3 and t2 = t3, and hence that t1 = t2. □

For example, it will follow from later results that the TRS of Example 5.1.5 is canonical, and that its ground normal forms all have the form s s ... 0, with zero or more s's. The results below bring out the very important consequence of Theorem 5.2.5 that every canonical TRS has a natural procedure for deciding the equality of ground terms; the proposition below is proved in Section 5.7 as a consequence of the more abstract Proposition 5.7.6 there.

Proposition 5.2.6  If A is a Church-Rosser TRS, then A ⊨ (∀X) t = t′ iff t ↓ t′. □

Corollary 5.2.7  If A is a canonical TRS, then A ⊨ (∀X) t = t′ iff [[t]]A = [[t′]]A, where the last equality is syntactical identity.

Proof: By Proposition 5.2.6, because t ↓ t′ iff [[t]]A = [[t′]]A when A is canonical. □

This says that when A is canonical, we can decide whether or not an equation (∀X) t = t′ is satisfied by all models of A just by comparing the canonical forms of its two sides. Equivalently, it says that we can decide whether or not terms t, t′ can be proved equal using the equations in A, just by checking whether or not their canonical forms are identical. OBJ provides a function that does exactly this: t == t′ computes the normal forms of t and t′, and then returns true if these terms are identical, and false otherwise.

For a simple illustration using the TRS of Example 5.1.5, if we show that the terms s s 0 + s s s 0 and s(s s 0 + s s 0) each reduce to the same thing (namely s s s s s 0), then the equation

  (∀∅) s s 0 + s s s 0 = s(s s 0 + s s 0)

holds for all models; this is very conveniently done by executing

  red s s 0 + s s s 0 == s(s s 0 + s s 0) .

However, this method cannot be used to prove either (∀x) x + s s s 0 = s s s 0 + x, or the more general equation (∀x, y) x + y = y + x, and, in fact, both of these are false in some models of NATP+.

The situation is as follows: Term rewriting over a canonical Σ-TRS gives a decision procedure for equality of Σ-terms with respect to loose semantics. But often we are really interested in the initial semantics of a specification, e.g., NATP+. In such a case, we can decide the equality of any two ground Σ-terms, i.e., of any two elements of the initial Σ-algebra, but we cannot decide the equality of two Σ-terms whose variables are restricted to range over ground terms only, i.e., over the initial algebra. For example, the commutative law is true for every pair x, y of ground terms over NATP+, and hence it is true for the initial algebra, but this cannot be proved just by reduction; it requires induction. In fact, the commutative law is not true for every choice of elements from every model of NATP+. Thus it is important to remember that this kind of decision procedure only decides equality for loose semantics.
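The decision procedure behind == can be sketched in a few lines of Python (an editorial illustration; the tuple encoding of Peano terms is an assumption, and OBJ itself is not involved). Ground terms are reduced to canonical form with the rules 0 + X → X and (s X) + Y → s(X + Y), and two terms are provably equal iff their canonical forms are identical.

```python
# Ground Peano terms as nested tuples:
#   ('0',)  |  ('s', t)  |  ('+', t, u)
ZERO = ('0',)

def s(t):
    return ('s', t)

def nf(t):
    """Canonical form under  0 + X -> X  and  (s X) + Y -> s(X + Y)."""
    if t[0] == '0':
        return t
    if t[0] == 's':
        return s(nf(t[1]))
    a, b = nf(t[1]), nf(t[2])   # reduce both arguments first
    while a[0] == 's':          # (s X) + Y  =>  s(X + Y)
        a, b = a[1], s(b)
    return b                    # 0 + Y  =>  Y

def eq(t, u):
    """The analogue of OBJ's ==: compare canonical forms."""
    return nf(t) == nf(u)
```

For instance, with two = s(s(ZERO)) and three = s(two), the call eq(('+', two, three), s(('+', two, two))) succeeds, mirroring the red ... == ... command above; but no amount of reduction of this kind can decide (∀x, y) x + y = y + x.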
Exercise 5.2.4
1. Use the Completeness Theorem to prove that there are models of NATP+ where the commutative law fails, without explicitly giving such a model.
2. Now give such a model. □
Clearly it is useful to know if a specification is canonical as a TRS. Sections 5.5 and 5.6 will give several useful tools for proving termination and confluence, respectively. However, it is important (and perhaps surprising) to note that for proving equality, it is not necessary for a TRS A to be Church-Rosser or even terminating: if t and t′ have equal normal forms under A, then the equation (∀X) t = t′ is provable from A, whether or not A is canonical; canonicity is only needed to guarantee that if [[t]]A ≠ [[t′]]A then the equation (∀X) t = t′ is not provable from A. This explains why OBJ does not require checking confluence or termination before it accepts code, and why we may use the notation [[t]]A to denote an arbitrary normal form of t even when A is not canonical. We have found that in practice, it is more irritating than useful to prove canonicity, although it would be desirable to have algorithms that could help with this on an optional basis. Experience also shows that OBJ specifications written for many application areas, including programming, are essentially always canonical.

OBJ also provides a function =/= that is the negation of ==. However, it can be dangerous to use when A is not Church-Rosser for the sort of the terms involved, because even if the terms are provably equal, OBJ could compute different normal forms for them. But the result will always be correct if A is canonical for the common sort of t, t′ and the subset of rules actually used in computing t =/= t′ (however, complications can arise for conditional rules, as discussed in Section 5.8).
Example 5.2.8 (Groups)  This OBJ code gives a canonical TRS for the theory of groups:

  th GROUPC is
    sort Elt .
    op _*_ : Elt Elt -> Elt .
    op e : -> Elt .
    op _-1 : Elt -> Elt [prec 2] .
    vars A B C : Elt .
    eq e * A = A .
    eq A -1 * A = e .
    eq A * e = A .
    eq e -1 = e .
    eq (A * B) * C = A * (B * C) .
    eq A -1 -1 = A .
    eq A * A -1 = e .
    eq A * (A -1 * B) = B .
    eq A -1 * (A * B) = B .
    eq (A * B) -1 = B -1 * A -1 .
  endth

(Termination is shown in Exercise 5.5.4, while the Church-Rosser property is shown in Exercise 5.6.7.)

For example, suppose we want to know whether or not the equation

  (∀w, x, y, z) ((w ∗ x) ∗ (y ∗ z))⁻¹ = ((z⁻¹ ∗ y⁻¹) ∗ x⁻¹) ∗ w⁻¹

is true in all groups. In OBJ, we can open the module GROUPC, introduce the variables, and then reduce the left and right sides to see if they have the same normal form:

  open .
    vars W X Y Z : Elt .
    red ((W * X)*(Y * Z))-1 .
    red ((Z -1 * Y -1)* X -1)* W -1 .
  close

We could also use the OBJ built-in operation == for this:

  open .
    vars W X Y Z : Elt .
    red ((W * X)*(Y * Z))-1 == ((Z -1 * Y -1)* X -1)* W -1 .
  close

This tells us whether or not the equation is true of all groups, but it does not tell us what the normal forms are, and that additional information is often useful when trying to build a proof.

Alternatively, using the Theorem of Constants, we can add new constants a, b, c, d, and then reduce the left and right sides to see if they have the same normal form. This is equivalent because variables are really just new constants. Here is how that looks in OBJ:

  open .
    ops a b c d : -> Elt .
    red ((a * b)*(c * d))-1 .
    red ((d -1 * c -1)* b -1)* a -1 .
  close
It is not necessary to use open and close for this example; we could instead define a new module which enriches GROUPC, and then do the reduction in that context, as follows:

  th GROUPC+ is
    inc GROUPC .
    ops a b c d : -> Elt .
  endth
  red ((a * b)*(c * d))-1 == ((d -1 * c -1)* b -1)* a -1 .

This uses a feature of OBJ that we have not yet discussed: the previously defined theory GROUPC is "imported" into the current theory by the declaration "including GROUPC", here abbreviated "inc GROUPC"; the effect is exactly the same as if the code in GROUPC were copied into GROUPC+.

New material introduced within an open...close pair is forgotten after the close. If you want it to be added to the module in focus and retained as part of it for future use, you should instead use the pair openr...close. If you introduced the module GROUPC+ after having done the above proof within an openr...close pair, then you would get parsing errors, because now there would be two copies each of a, b, c, d (you can see the problem by typing "show ."). In order to get around this, you could re-enter the theory GROUPC, which has the effect of restoring it to its original state; OBJ will then warn you that GROUPC is being "redefined," but this should not worry you, because that is exactly what you wanted to do. Another approach is to type

  select GROUPC .
  red ((a * b)*(c * d))-1 == ((d -1 * c -1)* b -1)* a -1 .

This returns focus to the original GROUPC module, which will have retained (one copy of each of) the constants a, b, c, d, provided you previously used openr...close; otherwise you will get a parse error. Thus we see that there is considerable flexibility in how OBJ can be used in proofs of this kind. □
Exercise 5.2.5
Experiment with the new features of OBJ introduced above, including "select", "openr...close" and "include". □

Exercise 5.2.6  Assuming canonicity of the TRS of Example 5.2.8, check whether or not the following equations are true of all groups:
1. (∀x, y, z) (x ∗ y)⁻¹ ∗ (x ∗ z) = y⁻¹ ∗ z .
2. (∀x, y) (x⁻¹ ∗ y⁻¹)⁻¹ = y ∗ x .
3. (∀x, y, z) ((x⁻¹ ∗ e) ∗ (y⁻¹ ∗ z)⁻¹)⁻¹ = ((y⁻¹ ∗ e) ∗ (z ∗ x))⁻¹ . □

Exercise 5.2.7  Given the following theory of monoids,

  th MONOID is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq e * X = X .
    eq (X * Y) * Z = X * (Y * Z) .
  endth

check whether or not the following equations are true of all monoids, assuming canonicity of the rules in MONOID:
1. (∀x, y) (x ∗ e) ∗ y = y ∗ x .
2. (∀x, y, z, w) x ∗ (y ∗ (z ∗ w)) = ((x ∗ y) ∗ z) ∗ w .
3. (∀x, y, z) (x ∗ e) ∗ (y ∗ z) = (x ∗ y) ∗ (z ∗ e) .
(Exercise 5.6.7 shows that MONOID is a Church-Rosser TRS.) □
The following fundamental result tells us that for canonical specifications, the ground canonical forms give an initial algebra. This is another justification for the use of irreducible terms as the results of computations. The proof makes good use of the initiality of TΣ.

Theorem 5.2.9  If a specification P = (Σ, A) is a ground canonical TRS, then the canonical forms of ground terms under A form a P-algebra, called the canonical term algebra of P, denoted NP or NΣ,A or just NA, in the following way:
(0) interpret σ ∈ Σ[],s as [[σ]] in NP,s ; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the function sending (t1, ..., tn) with ti ∈ NP,si to [[σ(t1, ..., tn)]] in NP,s .
Furthermore, if M is any P-algebra, there is one and only one Σ-homomorphism NP → M.

Proof: Since NP is a Σ-algebra by definition, we check that it satisfies A. Given (∀X) t = t′ in A and a : X → NP, we also get b : X → TΣ since NP ⊆ TΣ.* Moreover, a(t) = [[b(t)]] for any t ∈ TΣ(X), because [[b(_)]] is a Σ-homomorphism since [[_]] is, and there is a unique Σ-homomorphism TΣ(X) → NP that extends a. Applying the given rule to t with the substitution b gives b(t) ⇒A b(t′), and so these two terms have the same canonical form, i.e., [[b(t)]] = [[b(t′)]]. Therefore a(t) = a(t′) for every a, and so we are done.

Now let h : TΣ → M be the unique Σ-homomorphism. Noting that NP ⊆ TΣ, let us define g : NP → M to be the restriction of h to NP. We now check that g is a Σ-homomorphism, using structural induction on Σ:
(0) Given σ ∈ Σ[],s, we get g(σNP) = h([[σ]]) by definition. Now Proposition 5.1.16 gives us that h(σ) = h([[σ]]) because σ ⇒*A [[σ]]. Therefore g(σNP) = σM, as desired, because h(σ) = σM since h is a Σ-homomorphism.
(1) Given σ ∈ Σs1...sn,s with n > 0, by definition we get

  g(σNP(t1, ..., tn)) = h([[σ(t1, ..., tn)]]).

Then Proposition 5.1.16 gives us that

  h(σ(t1, ..., tn)) = h([[σ(t1, ..., tn)]]),

and the fact that h is a Σ-homomorphism gives us

  g(σNP(t1, ..., tn)) = σM(h(t1), ..., h(tn)) = σM(g(t1), ..., g(tn)),

as desired.
To show uniqueness, suppose g′ : NP → M is another Σ-homomorphism. Let r : TΣ → NP be the map that sends t to [[t]], and note that it is a Σ-homomorphism by the definition of [[_]]. Next, note that if i : NP → TΣ denotes the inclusion Σ-homomorphism, then i ; r = 1NP, the identity on NP. Finally, note that r ; g = r ; g′ = h, by the uniqueness of h. It now follows that i ; r ; g = i ; r ; g′, which implies g = g′. □

*The assignments a, b are different functions (in the sense of Appendix C) because they have different targets; even though a, b have the same values, their extensions to TΣ(X) have quite different values.

Exercise 5.2.8
Draw a commutative diagram that brings out the simple equational character of the above uniqueness argument. □
Exercise 5.2.9
Show that any terminating TRS is lapse free. □
5.3 Adding Constants

This section discusses the preservation of some basic properties of a TRS when new constants are added to its signature. This is important because it allows us to conclude that a TRS is terminating from a proof that it is ground terminating, and it can also justify using the Theorem of Constants in theorem proving. Although we may speak of adding "variables" to a signature, of course they are really constants. Recalling that (Σ, A) indicates that A is a Σ-TRS, if X is a suitable variable set, it is convenient to let A(X) denote the TRS (Σ(X), A). We begin with a simple but important result:

Proposition 5.3.1  If a TRS A is terminating, then so is A(X), for any signature of constants X for Σ. On the other hand, if A is Church-Rosser or locally Church-Rosser, then so is A(X). □

The intuition is that the variables in X are really just constants; however, the formal justifications in Propositions 5.3.4 and 5.3.1 use abstract rewrite systems, which have not yet been introduced at this point in the chapter. Although proofs of the Church-Rosser property generally apply to the non-ground case, so that one does not have to worry about added constants causing trouble, this is not so for termination, which is usually easier to prove for the ground case. By definition (or Fact 5.2.3), any terminating TRS is also ground terminating, but the converse does not hold, as shown by the following simple but important TRS (and also by Example 5.2.4):

Example 5.3.2
Let Σ have just one sort, one unary function symbol f, and one rule,

  f(Z) → f(f(Z)),

where Z is a variable. Because there are no ground terms, there are no ground rewrite sequences at all, and so this TRS is necessarily ground terminating. However, the term f(X) is the start of an infinite rewrite sequence, so it is not terminating; similarly, if we add a constant to the signature, the resulting TRS also fails to terminate. □

This motivates the following:
Definition 5.3.3
A signature Σ is non-void iff (TΣ)s ≠ ∅ for each sort s. □

That is, Σ is non-void iff each of its sorts is non-void in the sense of Section 4.7. A simple sufficient condition is that each sort has a constant; also, a signature cannot be non-void if it has no constants at all. The next result follows from a more abstract version, Proposition 5.8.10, on page 131:

Proposition 5.3.4  If Σ is non-void, then a TRS (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating, where X is any signature of constants for Σ. Also, (Σ(X), A) is ground terminating if (Σ, A) is ground terminating. If Σ is non-void, then a TRS is ground terminating iff it is terminating. □

Therefore when Σ is non-void, we know that A(X) is terminating if we know that A is ground terminating; also, when Σ(X) is non-void, it is sufficient that A(X) is ground terminating. Although every Church-Rosser TRS is ground Church-Rosser, the converse is false, and the same holds for the local Church-Rosser property. Moreover, these converse implications fail even when the signature is non-void, as shown by the following:
Example 5.3.5
Let Σ have just one sort, with one constant a plus three unary function symbols f, g, h, and with the following rules:

  f(X) → g(X)    g(a) → a
  f(X) → h(X)    h(a) → a

Then every ground term has the same normal form, namely a, so this TRS is certainly ground Church-Rosser and also ground locally Church-Rosser; furthermore, it is terminating and ground terminating. However, it is neither Church-Rosser nor locally Church-Rosser. □

(Example 5.3.5 is also a counterexample to the conjecture that every ground locally Church-Rosser TRS is locally Church-Rosser; it also shows that even assuming termination does not help.) Nevertheless, we have the following:
Proposition 5.3.6
Given a sort set S, let X^ω_S be the signature of constants with (X^ω_S)s = { x^s_i | i ∈ ω } for each s ∈ S (that is, X^ω_S has a countable number of distinct variable symbols for each sort in S, different from those in Σ). Then a TRS (Σ, A) is Church-Rosser iff (Σ(X^ω_S), A) is ground Church-Rosser. Also, (Σ, A) is locally Church-Rosser iff (Σ(X^ω_S), A) is ground locally Church-Rosser. □

The proof is given in Section 5.7, as an application of ideas developed there. The following example and exercise show that the above result is optimal with respect to the number of additional constants.
Example 5.3.7
Let Σ have one sort and three binary function symbols f, g, h, with the rules:

  f(X, Y) → g(X, Y)    g(X, X) → X
  f(X, Y) → h(X, Y)    h(X, X) → X

This TRS is neither Church-Rosser nor locally Church-Rosser, but it is ground Church-Rosser and ground locally Church-Rosser. If we add one constant, the resulting TRS remains ground Church-Rosser and ground locally Church-Rosser, but if we add two constants, that TRS is neither ground Church-Rosser nor ground locally Church-Rosser. So it is not sufficient for A({a}) to have the ground property in order for A to have the non-ground property. □

Exercise 5.3.1
Generalize the above example to show that adding two constants will not suffice. Then show that there is no natural number n such that n additional constants will suffice. □

Exercise 5.3.2
Apply the results in this section to discuss the situation resulting from adding constants to the theory FOO of Example 4.3.8, and then using rewriting of ground terms to prove equations with variables. □
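Example 5.3.5 can also be checked mechanically. The Python sketch below is an editorial illustration (the tuple encoding, and the treatment of a variable x as a fresh constant, are assumptions of the sketch): it computes all normal forms reachable from a term, confirming that every ground term reduces only to a, while f(x) has the two distinct normal forms g(x) and h(x).

```python
# Terms over Example 5.3.5's signature as nested tuples; the
# extra constant ('x',) plays the role of a variable, since a
# variable is really just a new constant.
A = ('a',)
X = ('x',)

def steps(t):
    """One-step rewrites under f(X) -> g(X), f(X) -> h(X),
    g(a) -> a, h(a) -> a, including rewrites of subterms."""
    out = set()
    if t[0] == 'f':
        out |= {('g', t[1]), ('h', t[1])}
    if t == ('g', A) or t == ('h', A):
        out.add(A)
    if len(t) == 2:             # congruence: rewrite inside
        out |= {(t[0], u) for u in steps(t[1])}
    return out

def normal_forms(t):
    """All normal forms reachable from t (the TRS terminates,
    so this search is finite)."""
    seen, todo, nfs = {t}, [t], set()
    while todo:
        u = todo.pop()
        nxt = steps(u)
        if not nxt:
            nfs.add(u)
        for v in nxt - seen:
            seen.add(v)
            todo.append(v)
    return nfs
```

One fresh constant suffices to expose the failure of confluence here; as Example 5.3.7 and Exercise 5.3.1 show, in general no fixed number of extra constants is enough, which is why Proposition 5.3.6 uses countably many.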
5.4 Evaluation Strategies

In general, a large tree will have many different sites where rewrite rules might apply, and the choice of which rules to try at which sites can strongly affect both efficiency and termination. Most modern functional programming languages have a uniform lazy (i.e., top-down, or outermost, or call-by-name) semantics. But because raw lazy evaluation is slow, lazy evaluation enthusiasts have built clever compilers that figure out when an "eager" (i.e., bottom-up, or innermost, or call-by-value) evaluation can be used with exactly the same result; this is called "strictness analysis" (for example, see [141, 110]).

OBJ is much more flexible, because each operator can be given its own evaluation strategy. Syntactically, a local strategy, also called an E-strategy (E is for "evaluation"), is a sequence of integers in parentheses, given as an operator attribute following the keyword strategy, or just strat for short. For example, OBJ's built-in conditional operator has the following declaration

  op if_then_else_fi : Bool Int Int -> Int [strat (1 0)] .

which says its local strategy is to evaluate its first argument until it is reduced, and then apply rules at the top (indicated by "0"). Similarly,

  op _+_ : Int Int -> Int [strat (1 2 0)] .

indicates that _+_ on Int has strategy (1 2 0), which evaluates both arguments before attempting to add them.

Moreover, the flexibility of local evaluation strategies requires minimum effort, because OBJ determines a default strategy for each operator if none is explicitly given. This default strategy is computed very quickly, because only a very simple form of strictness analysis is done, and it is surprisingly effective, though of course it does not fit all possible needs. In OBJ3, the default local strategy for a given operator is determined from its equations by requiring that all argument places that contain a non-variable term in some rule are evaluated before equations are applied at the top. If an operator with a user-supplied local strategy has a tail recursive rule (in the weak sense that the top operator occurs in its rightside), then it may apply an optimization that repeatedly applies that rule, and thus violates the strategy. In those rare cases where it is desirable to prevent this optimization from being applied, you can just give an explicit local strategy that does not have an initial 0.

There are actually two ways to get lazy evaluation. The simplest is to omit a given argument number from the strategy; then that argument is not evaluated unless some rewrite exposes it from underneath the given operator. For example, taking this approach to "lazy cons" gives

  op cons : Sexp Sexp -> Sexp [strat (0)] .
The second approach involves giving a negative number -j in a strategy, which indicates that the jth argument is to be evaluated "on demand," where a "demand" is an attempt to match a pattern to the term that occurs in the jth argument position. Under this approach, lazy cons has the declaration

  op cons : Sexp Sexp -> Sexp [strat (-1 -2)] .

Then a reduce command at the top level of OBJ is interpreted as a top-level demand that may force the evaluation of certain arguments. This second approach cannot be applied to operators with an associative or commutative attribute.

A local strategy is called non-lazy if it requires that all arguments of its operator are reduced in some order, and either the operator has no rules, or the strategy ends with a final "0". In general, for the results of a reduction command to actually be fully reduced, it is necessary that all local strategies be non-lazy. All of the default local strategies computed by the system are non-lazy.

Giving an operator the memo attribute causes the results of evaluating a term headed by this operator to be saved, and then used if this term needs to be reduced later in the same context [139]. In OBJ3, users can give any operator the memo attribute, and memoization is implemented efficiently with hash tables. More precisely, given a memoized operator symbol f and given a term f(t1, ..., tn) to be reduced (possibly as part of some larger term), a table entry for f(t1, ..., tn) giving its fully reduced value is added to the memo table, and entries giving this fully reduced value are also added for each term f(r1, ..., rn) that, according to the evaluation strategy for f, could arise while reducing f(t1, ..., tn) just before a rule for f is applied at the top. This is necessary because at that moment the function symbol f could disappear.
In some cases, memoizing these intermediate reductions is more valuable than memoizing just the original expression. For example, if f has the strategy (2 3 0 1 0), let r be the reduced form of the term f(t1,t2,t3,t4), and let ri be the reduced form of ti for i = 1, 2, 3. Then the memo table will contain the following pairs:

  (f(t1,t2,t3,t4), r)
  (f(t1,r2,r3,t4), r)
  (f(r1,r2,r3,t4), r)
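The core idea of the memo attribute, a table from terms to their fully reduced values, can be sketched in Python (an editorial illustration; the tuple encoding and the restriction to NATP+'s two rules are assumptions, and OBJ3's actual mechanism also records the intermediate instances described above). Because terms are hashable tuples, a plain dictionary serves as the memo table.

```python
# Memo table from ground Peano terms (nested tuples) to their
# canonical forms under  0 + X -> X  and  (s X) + Y -> s(X + Y).
memo = {}

def nf(t):
    """Canonical form of t, with memoization: every subterm
    reduced along the way gets a table entry."""
    if t in memo:
        return memo[t]
    if t[0] == '0':
        r = t
    elif t[0] == 's':
        r = ('s', nf(t[1]))
    else:
        a, b = nf(t[1]), nf(t[2])   # reduce both arguments
        while a[0] == 's':          # (s X) + Y  =>  s(X + Y)
            a, b = a[1], ('s', b)
        r = b                       # 0 + Y  =>  Y
    memo[t] = r
    return r
```

After reducing (1 + 1) + (1 + 1) once, the shared subterm 1 + 1 is found in the table and is not reduced again; memo.clear() plays the role of the do clear memo command discussed below.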
Memoization gives the effect of structure sharing for common subterms, and this can greatly reduce term storage requirements in some problems. Whether or not the memo tables are re-initialized before each reduction can be controlled with the top-level commands

  set clear memo on .
  set clear memo off .

The default is that the tables are not reinitialized. However, they can be reinitialized at any time with the command

  do clear memo .

Of course, none of this has any effect on the result of a reduction, but only on its speed. A possible exception to this is the case where the definitions of operators appearing in the memo table have been altered. (When rules are added to an open module, previous computations may become obsolete. Therefore, you may need to explicitly give the command "do clear memo .") Memoization is an area where term-rewriting-based systems seem to have an advantage over unification-based systems like Prolog.
It is known that (ground) termination is undecidable; that is, there is no algorithm which, given a TRS, can decide whether or not it is terminating. Nonetheless one can often prove termination by assigning a "weight" ρ(t) to each term t, i.e., by giving a function ρ : T_Σ → ω, such that ρ(t) > ρ(t′) whenever t ⇒ t′. Because there are no infinite strictly decreasing sequences of natural numbers, it follows that if such a function exists, then the TRS is terminating. The converse also holds under a rather mild assumption.

Proposition 5.5.1  A Σ-TRS A is ground terminating if there is a function ρ : T_Σ → ω such that for all ground Σ-terms t, t′, if t ⇒_A t′ then ρ(t) > ρ(t′). Moreover, the converse holds when A is globally finite, in the sense that for each term, there are only a finite number of rewrite sequences that begin with it.

Proof: If t0 ⇒ t1 ⇒ t2 ⇒ ··· ⇒ tn, then ρ(t0) > ρ(t1) > ρ(t2) > ··· > ρ(tn), that is, a strictly decreasing sequence of n + 1 natural numbers. Hence, because there are no infinite strictly decreasing natural number sequences, there cannot be infinite proper rewrite sequences; therefore A is terminating.

For the converse, assume A is terminating and let ρ(t) be the maximum of the lengths of all rewrite sequences that reduce t to a normal form; there are only a finite number of these, because of global finiteness. Then t ⇒_A t′ implies ρ(t) ≥ 1 + ρ(t′). ∎

The converse direction of this result is of mainly theoretical interest, since it can be difficult to prove global finiteness without knowing termination. But note that a terminating TRS is globally finite if it has a finite rule set.
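Proposition 5.5.1 can be seen in action mechanically. The following Python sketch is our own illustration (the term encoding and the particular weight are our choices, not from the text): for terms over 0, s, + with the rules N + 0 → N and N + s M → s(N + M), it checks that the weight ρ(0) = 2, ρ(s t) = 1 + ρ(t), ρ(t + t′) = 1 + ρ(t)ρ(t′) strictly decreases under every one-step rewrite of some sample terms.

```python
# Check that a weight function rho strictly decreases on every one-step
# rewrite.  Terms: ("0",), ("s", t), ("+", t1, t2).  Illustrative sketch.

def rho(t):
    if t[0] == "0":
        return 2
    if t[0] == "s":
        return 1 + rho(t[1])
    return 1 + rho(t[1]) * rho(t[2])       # t = t1 + t2

def rewrites(t):
    """Yield all terms reachable from t by one rewrite, at any position."""
    if t[0] == "+":
        m, n = t[1], t[2]
        if n == ("0",):
            yield m                        # N + 0 -> N
        if n[0] == "s":
            yield ("s", ("+", m, n[1]))    # N + s M -> s(N + M)
        for m2 in rewrites(m):
            yield ("+", m2, n)
        for n2 in rewrites(n):
            yield ("+", m, n2)
    elif t[0] == "s":
        for u in rewrites(t[1]):
            yield ("s", u)

zero = ("0",)
one = ("s", zero)
samples = [("+", one, zero), ("+", one, one),
           ("+", ("+", one, one), ("s", one)), ("s", ("+", zero, one))]

decreasing = all(rho(t) > rho(t2) for t in samples for t2 in rewrites(t))
```

Of course such a finite check is evidence rather than proof; the proof requires the symbolic inequalities discussed below.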
Example 5.5.2
Here is a simple TRS showing that the converse of Proposition 5.5.1 does not hold without the additional assumption of global finiteness. The signature Σ has just one sort, say s, with Σ_[],s = ω and Σ_w,s = ∅ for w ≠ []; therefore T_Σ = ω. The rule set A is

  0 → n   for each n > 0
  n → n − 1   for each n > 1.

Note that A is infinite here. (I thank Prof. Yoshihito Toyama for providing this example. In the many-sorted case, the target of ρ is the S-sorted set with each component ω; we will see later that it is sometimes convenient to replace ω by certain other ordered sets.)
Then there is a rewrite sequence 0 ⇒ n ⇒ n − 1 ⇒ ··· ⇒ 1 for every n > 0. Now suppose there is a function ρ : T_Σ → ω such that t ⇒ t′ implies ρ(t) > ρ(t′), and let ρ(0) = K. Because there is a rewrite sequence of length K + 1 beginning 0 ⇒ K + 1, we get that K = ρ(0) > ρ(K + 1) > ρ(K) > ρ(K − 1) > ··· > ρ(1), which is impossible, since a strictly decreasing sequence of K + 2 natural numbers cannot start at K. ∎

There are two difficulties with using Proposition 5.5.1: (1) it can be hard to find an appropriate function ρ; and (2) it can be hard to prove the required inequalities. We discuss the first difficulty a little later; regarding the second, it is natural to reduce it by using the structure of terms, as in the following definition and result:

Definition 5.5.3
Given a poset P and ρ : T_Σ → P, a Σ-rewrite rule r : t → t′ of sort s is strict ρ-monotone iff ρ(θ(t)) > ρ(θ(t′)) for each applicable ground substitution θ. An operation symbol σ ∈ Σ is strict ρ-monotone iff ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for each t, t′ ∈ T_Σ and any t0 ∈ T_Σ({z}s) of the form σ(t1, ..., tn) where each ti except one is ground, and that one is just z. Σ-substitution is strict ρ-monotone iff ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for any t, t′ ∈ T_Σ and t0 ∈ T_Σ({z}s) having a single occurrence of z. In any of the above, we speak of weak ρ-monotonicity if > is replaced by ≥. ∎

The first condition says that every application of the rule is weight decreasing; the second says that replacing one argument of an operation symbol by a term of smaller weight decreases the weight of the result; and the third says the same for an arbitrary context with a single occurrence of the replaced subterm.
Proposition 5.5.4
Given a Σ-TRS A, if there is a function ρ : T_Σ → ω such that each rule in A is strict ρ-monotone, and Σ-substitution is strict ρ-monotone, then A is ground terminating.

Proof: If t1 ⇒ t2 then the two assumptions imply that ρ(t1) > ρ(t2), because t1 = t0(z ← θ(t)) and t2 = t0(z ← θ(t′)) for some rule t → t′, applicable substitution θ, and context t0 with a single occurrence of z. Therefore A is ground terminating by Proposition 5.5.1. ∎

Monotonicity for single operation symbols is a special case of monotonicity of Σ-substitution which is nevertheless sufficient to imply it; this fact can greatly simplify many termination proofs. The proof of the following result is given in Appendix B:
Proposition 5.5.5
Given a Σ-TRS A and a function ρ : T_Σ → ω, Σ-substitution is strict ρ-monotone if every operation symbol in Σ is strict ρ-monotone; the same holds for weak ρ-monotonicity. ∎

We can now directly combine Propositions 5.5.4 and 5.5.5 to get the following useful result:
Proposition 5.5.6
Given a Σ-TRS A, if (1) each rule in A is strict ρ-monotone, and (2) each σ ∈ Σ is strict ρ-monotone, then A is ground terminating. ∎

Note that Section 5.3 gives easy-to-check conditions for a ground terminating TRS to be terminating, so it is not a problem that the above results, and others that follow, only show ground termination.

It is very natural to reduce the tedium of defining an appropriate function ρ by using initial algebra semantics, that is, by giving ω a Σ-algebra structure and letting ρ be the unique Σ-homomorphism. Then to prove that A is terminating, we can prove that each rule is weight decreasing, and that each operation σ on ω is strict monotone in each argument. The two hypotheses can be stated using variables that range over terms of the appropriate sorts, and the resulting inequalities can be proved by rewriting, as illustrated by examples in this section and Section 5.8.2. A subtle point is that if we know that ρ(t) ≥ 1 for all terms t, then we can assume that all variables are ≥ 1 in proving the inequalities over ω that are induced by the two hypotheses, as illustrated in the following examples.

Example 5.5.7
Consider the TRS for Boolean conjunction that corresponds to the following OBJ specification:

  obj AND is sort Bool .
    ops tt ff : -> Bool .
    op _&_ : Bool Bool -> Bool .
    var X : Bool .
    eq X & tt = X .
    eq tt & X = X .
    eq X & ff = ff .
    eq ff & X = ff .
  endo
Let Σ denote its signature, and give ω the structure of a Σ-algebra by defining ω_tt = ω_ff = 1 and ω_&(m, n) = m + n. Then by Proposition 5.5.6, it suffices to prove the following, where ρ is the unique Σ-homomorphism T_Σ → ω:

  ρ(x & tt) > ρ(x)
  ρ(tt & x) > ρ(x)
  ρ(x & ff) > ρ(ff)
  ρ(ff & x) > ρ(ff)

for all x ∈ T_Σ; and

  ρ(t) > ρ(t′) implies ρ(t1 & t) > ρ(t1 & t′)
  ρ(t) > ρ(t′) implies ρ(t & t2) > ρ(t′ & t2)

for all ground Σ-terms t, t′, t1, t2. (The first four inequalities come from (1) and the next two from (2).) Each of these six assertions can be proved straightforwardly, using ρ(t) ≥ 1. For example, the first and third amount to ρ(x) + 1 > ρ(x) and ρ(x) + 1 > 1, and the last two to: i > j implies k + i > k + j. All the other cases are similar, and so this TRS is ground terminating. Therefore it is terminating by Proposition 5.3.4, and since Exercise 5.6.7 shows it is Church-Rosser, it is also canonical. ∎
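The six assertions of Example 5.5.7 can also be spot-checked numerically. This Python sketch is ours, for illustration: it evaluates ρ over all ground conjunction terms up to a small depth and verifies that every rule application, at any position, strictly decreases ρ.

```python
# Numeric spot-check of the interpretation for AND:
# rho(tt) = rho(ff) = 1 and rho(a & b) = rho(a) + rho(b).
# Terms: "tt", "ff", or ("&", a, b).  Illustrative sketch only.

def rho(t):
    return 1 if t in ("tt", "ff") else rho(t[1]) + rho(t[2])

def rewrites(t):
    """One-step rewrites with the four AND rules, at any position."""
    if isinstance(t, tuple):
        a, b = t[1], t[2]
        if b == "tt": yield a            # X & tt -> X
        if a == "tt": yield b            # tt & X -> X
        if b == "ff": yield "ff"         # X & ff -> ff
        if a == "ff": yield "ff"         # ff & X -> ff
        for a2 in rewrites(a): yield ("&", a2, b)
        for b2 in rewrites(b): yield ("&", a, b2)

def terms(depth):
    """All ground terms of nesting depth at most depth."""
    if depth == 0:
        return ["tt", "ff"]
    smaller = terms(depth - 1)
    return smaller + [("&", a, b) for a in smaller for b in smaller]

decreasing = all(rho(t) > rho(u) for t in terms(2) for u in rewrites(t))
```

Again this is evidence, not proof; the proof is the symbolic argument in the example.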
Exercise 5.5.1
Fill in the missing cases and details in Example 5.5.7. (cid:2)
Exercise 5.5.2
Show that the TRS of Example 5.1.5 is terminating by applying Proposition 5.5.6 with ρ the unique homomorphism into ω satisfying the following:

  ρ(0) = 2
  ρ(s t) = 1 + ρ(t)
  ρ(t + t′) = 1 + ρ(t)ρ(t′) . ∎

Exercise 5.5.3
Show that the TRS of Example 5.1.7 is terminating. (cid:2)
Exercise 5.5.4
Show that the group TRS of Example 5.2.8 is terminating by applying Proposition 5.5.6 with ρ the unique homomorphism into ω satisfying the following:

  ρ(e) = 2
  ρ(t -1) = 2^ρ(t)
  ρ(t ∗ t′) = ρ(t)² ρ(t′) .

You may also enjoy mechanizing the proofs using OBJ, in the manner illustrated in Exercise 5.5.5 below. ∎
Termination proofs in the literature tend to use polynomials, but theinitial algebra point of view makes it evident that any monotone func-tion at all can be used, e.g., exponentials, as in the function for group inverse in the above exercise. Moreover, Proposition 5.5.6 generalizesto partially ordered sets other than ω , provided they are Noetherian , inthe sense that they have no infinite sequence of strictly decreasing ele-ments; this is discussed in detail in Section 5.8.3 below. OBJ can oftenbe used to do termination proofs based on Proposition 5.5.6, becauseas we have seen, these boil down to proving a set of inequalities, whichare often rather fun in themselves. We illustrate this in the following:
Exercise 5.5.5
The purpose of this exercise is to show termination for the following specification of the function half(n), which computes the largest natural number k such that 2k ≤ n:

  obj NATPH is sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 2] .
    op _+_ : Nat Nat -> Nat .
    op half : Nat -> Nat .
    var N M : Nat .
    eq N + 0 = N .
    eq N + s M = s(N + M) .
    eq half(0) = 0 .
    eq half(s 0) = 0 .
    eq half(s s M) = s half(M) .
  endo

The first step is to define an appropriate function ρ : T_Σ → ω, where Σ is the signature of NATPH.

1. Give a fourth equation which, when added to the three below, uniquely defines a function ρ, which should furthermore satisfy the properties in items 2–5 below:

     ρ(0) = 2
     ρ(s(t)) = 1 + ρ(t)
     ρ(t + t′) = 1 + ρ(t)ρ(t′) .

   Explain why this uniquely defines ρ.

The following should first be proved by hand, and then proved using OBJ proof scores based on the object NATP+*> given below:

2. ρ(t) > ρ(t′) implies ρ(s(t)) > ρ(s(t′)) for every t, t′ ∈ T_Σ.

3. ρ(t) > ρ(t′) implies ρ(t + t″) > ρ(t′ + t″) for every t, t′, t″ ∈ T_Σ.

4. ρ(t) > ρ(t′) implies ρ(half(t)) > ρ(half(t′)).

5. ρ(θ(t)) > ρ(θ(t′)) for each rule t → t′ in NATPH and each substitution θ.

Hints:
You may assume that adding the equations

  eq L + M > L + N = M > N .
  cq L * M > L * N = M > N if L > 0 .
  cq s M > N = M > N if M > N .
  cq L + M > L = true if M > 0 .
  cq M > 0 = true if M > s 0 .

has already been justified by earlier OBJ proofs; you may also need some other similar lemmas. (We will later see how to prove such results by induction.) You may need to use the fact that ρ(t) > 1.

  obj NATP+*> is sort Nat .
    ops 0 1 2 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    eq 1 = s 0 .
    eq 2 = s 1 .
    vars L M N : Nat .
    op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
    eq M + 0 = M .
    eq M + s N = s(M + N) .
    op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
    eq M * 0 = 0 .
    eq M * s N = M * N + M .
    eq L * (M + N) = L * M + L * N .
    op _>_ : Nat Nat -> Bool .
    eq M > M = false .
    eq s M > 0 = true .
    eq s M > M = true .
    eq 0 > M = false .
    eq s M > s N = M > N .
  endo
6. Explain why the above results prove that
NATPH is terminating. ∎
Exercise 5.5.6
Give mechanical proofs for the other termination examples inthis section. (cid:2)
Like termination, the Church-Rosser property is undecidable for term rewriting systems. But once again, there are useful techniques for many special cases. We start with the following important result, the clever proof of which is due to Barendregt [3]; this proof is easier to understand when done dynamically on a whiteboard or applet than statically on paper.
Proposition 5.6.1 (Newman Lemma)  If a TRS A is terminating, then it is Church-Rosser if and only if it is locally Church-Rosser.
Proof:
The "only if" direction is trivial. For the converse, let us call a term t ambiguous iff it has (at least) two distinct normal forms. If we can show that when t is ambiguous, there is some ambiguous t′ such that t ⇒ t′, it then follows that if there are any ambiguous terms, then the system is non-terminating. Hence by contradiction, the system cannot have any ambiguous terms. The Church-Rosser property follows from this.

We now prove the auxiliary claim. Assume that t is ambiguous, and let t1 and t2 be two distinct normal forms for t, where t ⇒ t1′ ∗⇒ t1 and t ⇒ t2′ ∗⇒ t2. If t1′ = t2′, let t′ = t1′. If t1′ ≠ t2′, apply local confluence to get t″ such that t1′ ∗⇒ t″ and t2′ ∗⇒ t″, and let t3 be a normal form for t″. Then t3 is also a normal form for t, so that t3 ≠ t1 or t3 ≠ t2. If t3 ≠ t1 we let t′ = t1′, and if t3 ≠ t2 we let t′ = t2′. See Figure 5.2.

Figure 5.2: Barendregt's Proof of the Newman Lemma
∎

Notice that although the TRS of Example 5.1.8 is locally Church-Rosser by Exercise 5.2.3, it is neither terminating nor Church-Rosser, nor does it have unique normal forms for all its terms, by Exercise 5.1.3. This shows that confluence and local confluence are not equivalent concepts.

Theorem 5.6.9 below gives a method for showing the local Church-Rosser property, and then Corollary 5.6.10 applies the Newman Lemma to get a method for showing the Church-Rosser property, and hence the canonicity, of a terminating TRS. Chapter 12 will show how to make this method into a general algorithm.
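The need for termination in the Newman Lemma can be checked mechanically on a small abstract example (anticipating the abstract view of rewriting developed later in this chapter). The following Python sketch is ours: it encodes the classical four-element relation a → b, a → c, c → a, c → d, and verifies that it is locally Church-Rosser but not Church-Rosser; note that it is not terminating, because a → c → a.

```python
# A finite relation that is locally confluent but not confluent:
# a -> b, a -> c, c -> a, c -> d  (non-terminating: a -> c -> a).

succ = {"a": {"b", "c"}, "b": set(), "c": {"a", "d"}, "d": set()}

def reach(x):
    """Everything reachable from x, including x (reflexive-transitive closure)."""
    seen, todo = {x}, [x]
    while todo:
        for y in succ[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def joinable(x, y):
    return bool(reach(x) & reach(y))

locally_confluent = all(joinable(y, z)
                        for x in succ
                        for y in succ[x] for z in succ[x])

confluent = all(joinable(y, z)
                for x in succ
                for y in reach(x) for z in reach(x))
```

Here b and d are distinct normal forms of a, so confluence fails even though every one-step divergence can be joined.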
Definition 5.6.2
A TRS is left linear iff no rule has more than one instance of the same variable in its leftside, and is right linear iff no rule has more than one instance of the same variable in its rightside.

Two rules with leftsides t, t′ overlap iff there are substitutions θ, θ′ such that θ(t0) = θ′(t′) with t0 a subterm of t that is not just a variable, i.e., with t = t1(z ← t0) where t1 has just one occurrence of the new variable z and t0 is not a variable. If the two rules are actually the same, it is additionally required that t1 ≠ z, and the rule is called self-overlapping. A TRS is overlapping iff it has two rules (possibly the same) that overlap, and then the term θ(t0) = θ′(t′) is called an overlap of t, t′; otherwise the TRS is non-overlapping.

A TRS is orthogonal iff it is left linear and non-overlapping. ∎

Example 5.6.3
The idempotent rule, of the form B + B = B, is not left linear, but the associative and commutative rules are left linear.

We now show that the associative and commutative rules overlap. Let their leftsides be t = (A + B) + C and t′ = A + B, respectively. Let t0 be the subterm (A + B) of t; then t0 is not a variable, and t = t1(z ← t0) with t1 = z + C. Define substitutions θ, θ′ as follows:

  θ(A) = a    θ′(A) = a
  θ(B) = b    θ′(B) = b
  θ(C) = C .
Then θ(t0) = θ′(t′) = a + b.

The associative rule is self-overlapping. As before, let its leftside be t = (A + B) + C. Let t0 be the subterm (A + B) of t; then t1 = z + C ≠ z, as required for self-overlap. Now define the substitutions θ and θ′ by:

  θ(A) = a + b    θ′(A) = a
  θ(B) = c        θ′(B) = b
  θ(C) = C        θ′(C) = c .

Then θ(t0) = θ′(t′) = (a + b) + c. ∎

Exercise 5.6.1
Prove that the commutative rule is not self-overlapping. (cid:2)
The rather complex proof of the following theorem is given in Appendix B:
Theorem 5.6.4 (Orthogonality)
A TRS is Church-Rosser if it is lapse free and orthogonal. ∎
Exercise 5.6.2
Show that a lapse rule overlaps with any non-lapse rule. Give a TRS showing that the lapse free hypothesis is needed in Theorem 5.6.4. ∎
Example 5.6.5
The TRS of the object NATP+ of Example 5.1.5 is lapse free and orthogonal, and therefore Church-Rosser. To prove this, we check for overlap of each rule with itself and with the other rule; this gives 4 cases, and the reader can verify that each one fails because of incompatible function symbols. Combining this with Exercise 5.5.2, it follows that NATP+ is canonical. ∎
Exercise 5.6.3
1. Show that the TRS AND of Example 5.5.7 is not orthogonal.

2. Show that the TRS's MONOID of Exercise 5.2.7 and GROUPC of Example 5.2.8 are not orthogonal. ∎
Chapter 12 shows that proofs of canonicity by orthogonality can be fully mechanized, by using an algorithm that checks if a given pair of rules is overlapping, and noting that it is trivial to check left linearity.
Example 5.6.6 (Combinatory logic)  The motivation for this classical logic is similar to that for the lambda calculus, namely to axiomatize a theory of functions, in this case a certain collection of higher-order functions called combinators. Here we give it as an equational theory.
The basic operation is to apply one combinator to another. The traditional notation for this (which might seem a bit confusing at first) is simple juxtaposition, i.e., the syntactic form denoted __ in OBJ. For example, A B means apply A to B; this might be more explicitly written something like App(A,B), or A . B. This calculus has just one sort, which is denoted T in the OBJ code below; it is the type of functions. Thus, particular combinators will be constants of sort T, even though they represent functions.

The attribute gather (E e) of the operation __ makes it parse left associatively ([90] gives a detailed explanation of how these "gathering patterns" work in OBJ3). For example, A B C would be more explicitly written as App(App(A,B),C), or (A . B) . C. Finally, the let construction used below is just a convenient shorthand for first declaring a constant and then letting it equal a given term; OBJ3 computes the sort for the constant by parsing the term.

  obj COMBL is sort T .
    op __ : T T -> T [gather (E e)] .
    ops S K I : -> T .
    vars L M N : T .
    eq K M N = M .
    eq I M = M .
    eq S M N L = (M L)(N L) .
  endo
  open .
    ops m n p : -> T .
    red S K K m == I m .
    red S K S m == I m .
    red S I I I m == I m .
    red K m n == S(S(K S)(S(K K)K))(K(S K K)) m n .
    red S m n p == S(S(K S)(S(K(S(K S)))(S(K(S(K K)))S)))(K(K(S K K))) m n p .
    red S(K K) m n p == S(S(K S)(S(K K)(S(K S)K)))(K K) m n p .
    let X = S I .
    red X X X X m == X(X(X X)) m .
  close
The last reduction takes 27 rewrites, which is more than one would like to do by hand. ∎
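Combinator reduction is also easy to experiment with outside OBJ. This Python sketch is our own illustration (names and representation are ours): it implements the three rules with a leftmost-outermost strategy and a step bound, since terms such as ω ω do not terminate.

```python
# Combinator reduction sketch.  An application M N is the pair (M, N);
# S, K, I and other combinators are strings.  Rules:
#   I x -> x,  K x y -> x,  S x y z -> (x z)(y z).

def step(t):
    """One leftmost-outermost rewrite of t, or None if t is a normal form."""
    if not isinstance(t, tuple):
        return None
    f, a = t
    if f == "I":
        return a                                   # I x -> x
    if isinstance(f, tuple):
        g, b = f
        if g == "K":
            return b                               # K x y -> x
        if isinstance(g, tuple) and g[0] == "S":
            return ((g[1], a), (b, a))             # S x y z -> (x z)(y z)
    f2 = step(f)                                   # otherwise rewrite a subterm
    if f2 is not None:
        return (f2, a)
    a2 = step(a)
    if a2 is not None:
        return (f, a2)
    return None

def nf(t, limit=1000):
    """Reduce to normal form, giving up after limit steps (COMBL can loop)."""
    for _ in range(limit):
        t2 = step(t)
        if t2 is None:
            return t
        t = t2
    raise RuntimeError("no normal form within the step limit")

def app(*ts):
    """Left-associated application, e.g. app('S','K','K','m') = ((S K) K) m."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t
```

For example, nf(app("S", "K", "K", "m")) yields the atom m, matching the first reduction in the example.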
Exercise 5.6.4
The following refer to Example 5.6.6, and if possible should be done with OBJ.

1. Define B = S(K S)K. Then show that B x y z = x(y z), and hence that B x y is the composition of functions x and y.

2. Define C by C x y z = x z y and prove that S I (K K) = C I K.

3. Define ω = S I I and show that ω x = x x, so that ω ω ∗⇒ ω ω, which implies that the TRS COMBL is non-terminating. ∎
Exercise 5.6.5
Show that the TRS COMBL of Example 5.6.6 is orthogonal and so Church-Rosser. ∎
Exercise 5.6.6
Show that the following TRS’s are orthogonal, and since alreadyknown to be terminating, therefore canonical:1.
NATPH of Exercise 5.5.5; and2.
ANDNOT of Example 5.1.7. (cid:2)
Unfortunately, many important Church-Rosser TRS's are not orthogonal, and therefore cannot be checked using Theorem 5.6.4. The following material, which builds on concepts in Definition 5.6.2, and which is further developed in Chapter 12, is much more powerful. The next result is proved in Chapter 12:
Proposition 12.0.1
If terms t, t′ overlap at a subterm t0 of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t0 is a substitution instance of p. ∎

Note that if the leftsides t, t′ of two rules in a TRS have the overlap θ(t0) = θ′(t′), then the term θ(t) can be rewritten in two ways (one for each rule).

Definition 5.6.7
A most general overlap (in the sense of Proposition 12.0.1) is called a superposition of t and t′, and the pair of rightsides resulting from applying the two rules to the term θ(t) is called a critical pair. If the two terms of a critical pair can be rewritten to a common term using rules in A, then that critical pair is said to converge or to be convergent. ∎

Theorem 5.6.9 below is our main result, while the following illustrates the definition above:
Example 5.6.8
The fourth and sixth rules of Example 5.2.8 overlap. Their leftsides are t = A -1 -1 and t′ = e -1, while t0 = A -1, with θ(A) = e and θ′ = ∅ (the empty substitution); the superposition is e -1 -1, and the critical pair is e, e -1, each term of which rewrites to e, so that the two different rewrites of the superposition both yield e. ∎

Theorem 5.6.9 (Critical Pair Theorem)  A TRS is locally Church-Rosser if and only if all its critical pairs are convergent.
Sketch of Proof:
The converse is easy. Suppose that all critical pairs converge, and consider a term with two distinct rewrites. Then their redexes are either disjoint or else one of them is a subterm of the other, since if two subterms of a given term are not disjoint, one must be contained in the other. If the redexes are disjoint, then the result of applying both rewrites is the same in either order. If the redexes are not disjoint, then either the rules overlap (in the sense of Definition 5.6.2), or else the subredex results from substituting for a variable in the leftside of the rule producing the larger redex. In the first case, the result terms of the two rewrites rewrite to a common term by hypothesis, since the overlap is a substitution instance of the superposition giving some critical pair, by Proposition 12.0.1. In the second case, the result of applying both rules is the same in either order, though the subredex may have to be rewritten multiple (or zero) times if the variable involved is non-linear. ∎
The full proof is in Appendix B. This and the Newman Lemma (Proposition 5.6.1) give:
Corollary 5.6.10
A terminating TRS is Church-Rosser if and only if all its critical pairs are convergent, in which case it is also canonical. ∎
Chapter 12 introduces unification, an algorithm that can be used to compute all critical pairs of a TRS, and hence to decide the Church-Rosser property for any terminating TRS.
Exercise 5.6.7
Use Corollary 5.6.10 to show the Church-Rosser property, and hence the canonicity, of the following TRS's:

1. GROUPC of Example 5.2.8;

2. AND of Example 5.5.7; and

3. MONOID of Exercise 5.2.7. ∎
Many important results about term rewriting are actually special cases of much more general results about a binary relation on a set. Although this abstraction of term rewriting to the one-step rewrite relation ignores the structure of terms, it still includes a great deal. The classical approach takes an unsorted view of the elements to be rewritten, but here we generalize to sorted sets of elements, enabling applications to many-sorted term rewriting and equational deduction that appear to be new.
Definition 5.7.1  An abstract rewrite system (abbreviated ARS) consists of a (sorted) set T and a (similarly sorted) binary relation → on T, i.e., → ⊆ T × T. We may denote such a system as a pair (T, →), or possibly as a triple (S, T, →), if the sort set S needs to be emphasized.

An ARS (T, →) is terminating if and only if there is no infinite sequence a1, a2, ... of elements of T such that ai → ai+1 for i = 1, 2, ... (note that ai ∈ Ts and ai →s ai+1 for the same s ∈ S). Also, t ∈ T is called reduced, or a reduced form, or a normal form, iff there is no t′ ∈ T such that t → t′.

Let ∗→ denote the reflexive, transitive closure of →. Then t0 is called a normal form of t ∈ T iff t ∗→ t0 and t0 is a normal form. Let t1 ↓ t2 mean there is some t0 ∈ T such that t1 ∗→ t0 and t2 ∗→ t0; we say t1, t2 are convergent, or converge to t0. An ARS is Church-Rosser (also called confluent) iff for every t ∈ T, whenever t ∗→ t1 and t ∗→ t2 then t1 ↓ t2. An ARS is canonical iff it is terminating and Church-Rosser; in this case, normal forms are called canonical forms. Let ∗↔ denote the reflexive, symmetric, transitive closure of →, and let =→ denote the reflexive closure of →.

An ARS is locally Church-Rosser (or locally confluent) iff for every t ∈ T, whenever t → t1 and t → t2 then t1 ↓ t2. An ARS is globally finite iff for every t ∈ T, there are only a finite number of distinct maximal rewrite sequences (finite or infinite) that begin with t. ∎

We can relativize all these concepts to a single sort s just as in Definitions 5.2.1 and 5.2.2, to take account of the fact that rewriting over different sorts may have different properties. Three of the more useful TRS results that generalize to ARS's are as follows:

Theorem 5.7.2
Given a canonical ARS, every t ∈ T has a unique normal form, denoted [[t]] and called the canonical form of t. ∎

Proposition 5.7.3
Given an ARS (T, →) and t, t′ ∈ T, then t ∗↔ t′ iff there are t1, ..., tn ∈ T such that t ↓ t1, ti ↓ ti+1 for i = 1, ..., n − 1, and tn ↓ t′. ∎

Proposition 5.7.4 (Newman Lemma)  A terminating ARS is Church-Rosser iff it is locally Church-Rosser. Hence any ARS that is terminating and locally Church-Rosser is canonical. ∎
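When T is finite, the ARS notions above are directly computable. The following Python sketch is ours, for illustration: it computes normal forms via the reflexive-transitive closure for a small terminating, confluent relation, so that every element has a unique canonical form as in Theorem 5.7.2.

```python
# Normal forms in a finite ARS.  succ[x] is the set of one-step successors;
# the relation a -> b, a -> c, b -> d, c -> d is terminating and
# Church-Rosser, so every element has exactly one normal form.

succ = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}

def reach(x):
    """Reflexive-transitive closure: everything reachable from x."""
    seen, todo = {x}, [x]
    while todo:
        for y in succ[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def normal_forms(x):
    """The normal forms of x: reachable elements with no successor."""
    return {y for y in reach(x) if not succ[y]}

canonical_form = {x: normal_forms(x) for x in succ}
```

Replacing succ by a non-confluent relation makes some normal_forms(x) contain more than one element, which is exactly the failure of canonicity.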
These results are proved essentially the same way as the corresponding TRS results (the second generalizes Proposition 5.1.14). Alternatively, they can be proved directly from the TRS results, by using the connection between ARS's and TRS's that we now describe; a more detailed discussion of this connection appears in Section 5.9.

Given a TRS T = (Σ, A) where Σ has sort set S, we get an ARS (S, T, →) by letting Ts = (T_Σ)s and, for t1, t2 ∈ Ts, defining t1 →s t2 iff t1 ⇒_A t2; denote this ARS by R(T). It is suitable for dealing with ground properties of its TRS.

Exercise 5.7.1
Prove that a TRS T is ground terminating iff the ARS R(T) is terminating. Prove that a TRS T is ground Church-Rosser iff the ARS R(T) is Church-Rosser. Also prove corresponding results for the local Church-Rosser and canonicity properties. ∎

Next, an ARS A = (T, →) gives rise to a TRS F(A) = (Σ_T, A_→) as follows: define Σ_T by letting (Σ_T)[],s = Ts and (Σ_T)w,s = ∅ for all other pairs w, s, with the rules in A_→ the equations (∀∅) t1 = t2 such that t1 →s t2 in A for some sort s.

Exercise 5.7.2
Prove that an ARS A is terminating iff the TRS F(A) is ground terminating. Do the same for the Church-Rosser, local Church-Rosser, and canonicity properties. ∎

It is usually easier to prove results about ARS's than about TRS's, but like most bridges, this one can be used in either direction, as illustrated in the following:
Exercise 5.7.3
Prove Theorem 5.7.2 and Proposition 5.7.4 by reducing them to the corresponding results for TRS’s. Also do the reverse for the groundcase. (cid:2)
The following gives another tool for showing the Church-Rosser prop-erty:
Proposition 5.7.5 ( Hindley-Rosen Lemma ) For each i ∈ I , let (T , → i ) be a Church-Rosser ARS, and assume that for all i, j ∈ I the relations → i and → j commute in the sense that for all a, b, c ∈ T , if a ∗ → i b and a ∗ → j c thenthere is some d ∈ T such that b ∗ → j d and c ∗ → i d . Now define → on T by a → b iff there is some i ∈ I such that a → i b . Then (T , → ) isChurch-Rosser. Proof:
We begin by showing that it suffices to prove this result for the case where I has just two indices. First, notice that since any particular rewrites a ∗→ b and a ∗→ c can only involve a finite set of relations →i, it suffices to consider finite sets I of indices. Now assuming that Hindley-Rosen holds for index sets of cardinality 2, we show that it holds for any finite cardinality k, by induction on k. For k = 1, there is nothing to prove. Now assume Hindley-Rosen for some k ≥ 1, and suppose we are given relations →i for i = 1, ..., k + 1. Then →′ = →1 ∪ ··· ∪ →k is Church-Rosser by the induction hypothesis. Therefore if we show that →′ and →k+1 commute, we are done by Hindley-Rosen for two indices.

To show that →′ and →k+1 commute, let a ∗→′ b and a ∗→k+1 c. The proof is by induction on the length n of the rewrite sequence for a ∗→′ b. If n = 0 there is nothing to show, and if n = 1 the commutation of the single relation involved with →k+1 gives the result. Now assuming the hypothesis for some n ≥ 1, we prove it for a ∗→′ b of length n + 1: let the first step of a ∗→′ b be a →i b1. Then by commutation of →i with →k+1, there is some d1 such that c ∗→i d1 and b1 ∗→k+1 d1. We now conclude the proof by applying the induction hypothesis to b1 ∗→′ b and b1 ∗→k+1 d1, noting that the former has length n, to get d such that b ∗→k+1 d and d1 ∗→′ d, and hence also c ∗→′ d (see Figure 5.3, in which every arrow has an omitted ∗, and each downward arrow is →k+1).

Figure 5.3: Hindley-Rosen Proof Reduction

We now prove Hindley-Rosen for the case of two indices. Let →1 and →2 be commuting Church-Rosser relations, let → = →1 ∪ →2, and let →3 be arbitrarily many applications of →1 followed by arbitrarily many applications of →2, i.e., →3 = ∗→1 ∘ ∗→2. Then the argument suggested by Figure 5.4 shows that →3 is Church-Rosser, where all leftside horizontal arrows are ∗→1, all rightside horizontal arrows are ∗→2, all top downward arrows are ∗→1, and all bottom downward arrows are ∗→2. Next, one can check that → ⊆ →3 ⊆ ∗→, which implies that ∗→3 = ∗→. Therefore ∗→ is also Church-Rosser, and hence so is → (by Exercise 5.7.6). ∎

Figure 5.4: Hindley-Rosen Proof for k = 2

Exercise 5.7.4
Show that there is no analog of the Hindley-Rosen Lemma for termination: give terminating ARS's (T, →1) and (T, →2) which commute in the sense of Proposition 5.7.5, such that (T, →) is not terminating, where → = →1 ∪ →2. ∎

Exercise 5.7.5
Prove a more convenient version of Hindley-Rosen, which replaces commutation with the following notion of strong commutation: if a →i b and a →j c then there is some d ∈ T such that b =→j d and c ∗→i d, where =→j indicates the reflexive closure of →j. ∎

Although the definition of strong commutation is asymmetric, in practice it is used in situations where it holds symmetrically, in both orders.
Prove that an ARS (T, →) is Church-Rosser iff (T, ∗→) is Church-Rosser. ∎

There are also ARS versions of Proposition 5.2.6 and its Corollary 5.2.7. For the first of these, recall that ∗↔ denotes the reflexive, symmetric, transitive closure of the relation →.

Proposition 5.7.6  If (T, →) is a Church-Rosser ARS, then t ∗↔ t′ iff t ↓ t′.

Proof:
We use induction on the number n of rewrites involved in t ∗↔ t′. If n = 0 then t = t′, so that t ↓ t′ trivially. Now suppose that t ∗↔ t′ with n + 1 rewrites. There are two cases: (1) t ∗↔ t″ ← t′, and (2) t ∗↔ t″ → t′, where in both cases t ∗↔ t″ in n rewrites, so that t ↓ t″ by the induction hypothesis, say with t1 such that t ∗→ t1 and t″ ∗→ t1. For case (1), since t″ ∗→ t1 and t′ → t″, we get t ↓ t′. For case (2), from t″ ∗→ t1 and t″ → t′, the Church-Rosser property gives t2 such that t1 ∗→ t2 and t′ ∗→ t2, from which it follows that t ∗→ t2, so that t ↓ t′. ∎

Corollary 5.7.7  If (T, →) is a canonical ARS, then t ∗↔ t′ iff t ↓ t′ iff [[t]] = [[t′]].

Proof:
This is because t ↓ t′ iff [[t]] = [[t′]] for a canonical ARS, noting that the equality used here is syntactic identity. ∎

Propositions 5.8.16 on page 134 and 5.8.19 on page 135 give ways to prove termination of ARS's.

So far we have related ARS's to ground term rewriting; extending the relationship to non-ground rewriting can be somewhat tricky, because we must take account not only of the sorts of terms, but also of the sets of variables that appear in terms through universal quantification. As a first example, we provide the proof promised in Section 5.3 for the following result:
Proposition 5.3.6
Given a sort set S, let XωS be the signature of constants with (XωS)s = { xi,s | i ∈ ω }, i.e., with a countable set of distinct variable symbols for each s ∈ S. Then a TRS (Σ, A) is Church-Rosser iff (Σ(XωS), A) is ground Church-Rosser. Also, (Σ, A) is locally Church-Rosser iff (Σ(XωS), A) is ground locally Church-Rosser.

Proof:
Given a term t with var(t) = X, let T = TΣ(X) and let G = TΣ(XωS). That the properties for (Σ, A) imply the corresponding ground properties for (Σ(XωS), A) is direct. For the converse, form the ARS's T = (T, →T) and G = (G, →G) using rewriting with A on T and on G, respectively. Let f : X → XωS be an injection, which we can without loss of generality assume is an inclusion, and let f also denote its free extension to terms, which is again an inclusion, T → G. Then t →T t′ iff t →G t′, and hence t →T* t′ iff t →G* t′. Now suppose that (Σ(XωS), A) is ground Church-Rosser and let t →T* t1 and t →T* t2. Then t →G* t1 and t →G* t2, with X = var(t). Therefore there exists t3 such that t1 →G* t3 and t2 →G* t3, and hence t1 →T* t3 and t2 →T* t3, from which it follows that T is Church-Rosser, and hence that (Σ, A) is Church-Rosser, since t was arbitrary. An analogous proof works for the local Church-Rosser property. □

The above proof involves a rewrite relation explicitly indexed over the sorts in S, and implicitly indexed over variable sets X. To make the latter explicit, we could index over I = PF(XωS) × S, where PF(U) denotes the set of all finite subsets of U, and where for simplicity all variables are assumed drawn from the fixed signature XωS introduced above.

For the next result, we need the following construction: Given a TRS T = (Σ, A), define the ARS N(T) = (T, →) by Ts = (TΣ(XωS))s and t →s t′ iff t ⇒A t′, for t, t′ ∈ Ts. We now apply this machinery to get the proofs that were promised for some results in Section 5.2:

Proposition 5.2.6 If T = (Σ, A) is a Church-Rosser TRS, then A ⊨ (∀X) t = t′ iff t ↓ t′.

Proof:
First form the ARS N(T) as described above, and notice that A ⊢ (∀X) t = t′ iff t ↔s* t′ where t, t′ both have sort s. Now apply Proposition 5.7.6 to N(T), and finally appeal to the Completeness Theorem. □

Definition 5.7.8
Let (T, →) and (T′, →′) be ARS's. Then (T′, →′) is a sub-ARS of (T, →) iff T′ ⊆ T and t1 →′ t2 implies t1 → t2. Also, an ARS isomorphism of (T, →) and (T′, →′) is a bijective function f : T → T′ such that t1 → t2 iff f(t1) →′ f(t2). □

Exercise 5.7.7
Show that if (T, →) is a terminating ARS and if (T′, →′) is a sub-ARS of (T, →), then (T′, →′) is also terminating. Show that, by contrast, a sub-ARS of a Church-Rosser ARS need not be Church-Rosser. Also show that if two ARS's are isomorphic, then each of them is terminating, or Church-Rosser, or locally Church-Rosser iff the other one is. □

We now prove the result stated in Section 5.3 about adding constants to a TRS:
Proposition 5.3.4 If Σ is non-void, then a TRS (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating, where X is any signature of constants for Σ. Also, (Σ(X), A) is ground terminating if (Σ, A) is ground terminating, E15 and if Σ is non-void, then a TRS is ground terminating iff it is terminating.

Proof:
The reader should first check that a TRS (Σ, A) is terminating iff N(Σ, A) is. Next, given an S-sorted set Y of constants and a signature isomorphism f : XωS → Y, we can show that N(Σ, A) and (TΣ(Y), →A) are isomorphic ARS's, from which it follows by Exercise 5.7.7 that one of them is terminating iff the other is. Finally, since X is countable, XωS and XωS ∪ X are isomorphic, from which it follows that the TRS (Σ, A) is terminating iff (Σ(X), A) is, since N(Σ(X), A) = (TΣ(X ∪ XωS), →A). □

Proposition 5.3.1
If a TRS A is Church-Rosser or locally Church-Rosser, then so is A(X) , for any suitable countable set X of constants. Proof:
Let P stand for either of the above properties. The reader should check that a TRS (Σ, A) is P iff N(Σ, A) is P. Next, given an S-sorted set Y of constants and a signature isomorphism f : XωS → Y, N(Σ, A) and (TΣ(Y), →A) are isomorphic ARS's, from which it follows by Exercise 5.7.7 that one of them is P iff the other is. Finally, since X is countable, XωS and XωS ∪ X are isomorphic, from which it follows that the TRS (Σ, A) is P iff (Σ(X), A) is P, since N(Σ(X), A) = (TΣ(X ∪ XωS), →A). □

5.8 Conditional Term Rewriting

Conditional term rewriting arises naturally from the desire to implement algebraic specifications that have conditional equations in the same way that unconditional rewriting implements unconditional equational specifications. There are many examples of such specifications, and they are very useful in practice, as well as strictly more expressive [176]. Just as unconditional rewrite rules are a special kind of unconditional equation, so conditional rewrite rules are a special kind of conditional equation:
Definition 5.8.1 A conditional Σ-rewrite rule is a conditional Σ-equation (∀X) t = t′ if C such that var(t′) ∪ var(C) ⊆ var(t) = X, where var(C) = ⋃⟨u,v⟩∈C (var(u) ∪ var(v)).

A conditional Σ-term rewriting system (abbreviated Σ-CTRS) is a set of (possibly) conditional Σ-rewrite rules; we denote such a system by (Σ, A), and we may omit Σ here and elsewhere if it is clear from context. □
Notation and terminology for conditional term rewriting follow those for the unconditional case. Instead of (∀Y) t = t′ if C, we usually write (∀Y) t → t′ if C, and in concrete cases we may write (∀Y) t → t′ if u = v, or (∀Y) t → t′ if u = v, u′ = v′, etc. The notation t → t′ if C is unambiguous because X is determined by t. Also, when t ⇒ t′ using a rule in A having leftside ℓ, with substitution θ where t = t0(z ← θ(ℓ)), the pair (t0, θ) is called a match to (a subterm of) t by that rule. Unconditional rules are the special case where C = ∅.

Unfortunately, there is no easy way to generalize the rule (rw) for term rewriting to the conditional case, e.g., by specializing the rule (+6C) from Section 4.9 to replace exactly one subterm using a substitution instance of a conditional rewrite rule. This is because the conditions must be checked, which may lead to further conditional term rewriting, including further condition checking, and so on recursively. We therefore need a recursive definition of the conditional term rewriting relation, and so will define (sorted) relations temporarily denoted Rk and R̄k, with the rewriting relation the union of the Rk and with the R̄k used for evaluating conditions.

Definition 5.8.2
Given a CTRS (Σ, A) and a set X of variables, let R0 = R̄0 = {⟨t, t⟩ | t ∈ TΣ(X)}, and for k ≥ 0, let ⟨t, t′⟩ ∈ Rk+1 iff there exist a conditional rule (∀Y) t1 = t2 if C of sort s in A and a substitution θ : Y → TΣ(X) such that t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)) for some t0 ∈ TΣ(X ∪ {z}s), and such that for each ⟨u, v⟩ ∈ C there is some r such that ⟨θ(u), r⟩, ⟨θ(v), r⟩ ∈ R̄k. Also let R̄k+1 = (Rk+1 ∪ R̄k)* and let R = ⋃k≥0 Rk. Then R is the conditional term rewriting relation, hereafter denoted t ⇒A t′. As usual, ⇒A* denotes its transitive, reflexive closure, and when X = ∅ we get the ground case. □

Note that it is possible to go into an infinite loop when evaluating the condition of an instance of a rule, in which case the corresponding head is simply not included in R. Note also that there exists r such that ⟨θ(u), r⟩, ⟨θ(v), r⟩ ∈ R̄k iff θ(u) and θ(v) converge under ⋃j≤k Rj. E16
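The stratification in Definition 5.8.2 can be sketched concretely, at least for the degenerate case of ground rules over constants only, where substitution and subterm replacement disappear. The following Python toy (the rule set and all function names are invented for illustration) shows how a conditional rewrite only becomes available one level above the level at which its condition first converges:

```python
# Toy illustration of the stratified relations of Definition 5.8.2, for
# ground rules over constants only (no substitutions or subterms).
# Each rule is (leftside, rightside, conditions); a condition (u, v)
# must converge at the previous level.
RULES = [
    ("a", "b", [("c", "d")]),   # a -> b if c = d
    ("c", "d", []),             # c -> d  (unconditional)
]

def step(t, k):
    """One-step rewrites of t in R_{k+1}; conditions checked at level k."""
    return {rhs for lhs, rhs, conds in RULES
            if t == lhs and all(joinable(u, v, k) for u, v in conds)}

def reducts(t, k):
    """Terms reachable from t using the steps available at level k+1."""
    seen, todo = {t}, [t]
    while todo:
        for v in step(todo.pop(), k):
            if v not in seen:
                seen.add(v); todo.append(v)
    return seen

def joinable(u, v, k):
    """u and v converge at level k (level 0 is just syntactic identity)."""
    if k == 0:
        return u == v
    return bool(reducts(u, k - 1) & reducts(v, k - 1))

# c -> d is unconditional, so it appears at the lowest level; a -> b
# needs c and d to join, which first holds one level higher.
assert step("a", 0) == set()
assert step("a", 1) == {"b"}
```

The levels thus play the role of a recursion depth for condition evaluation: a rewrite never depends on condition checks at its own level or above.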
It is not hard to check that t ⇒A t′ iff ⟨t, t′⟩ ∈ Rk for some k > 0, and that R* = R̄, where R̄ = ⋃k≥0 R̄k is the union of the auxiliary relations used for evaluating conditions. The soundness results in Proposition 5.8.8 are also straightforward. However, we do not prove these results here, because they follow from more general results in Section 7.7.

Many results developed earlier in this chapter for unconditional term rewriting extend to the conditional case. The easiest extensions use the fact that CTRS's give rise to ARS's just as in the unconditional case, so we can directly apply general ARS definitions and results to conditional term rewriting. More specifically, if P = (Σ, A) is a CTRS, we let R(P) be the ARS (T, →) with Ts = TΣ(X)s and with t →s t′ iff t ⇒A t′ for t, t′ of sort s. This gives us the correct notions of termination, normal form, Church-Rosser, local Church-Rosser, and canonicity for CTRS's. For example, P is terminating iff R(P, s), the restriction of R(P) to sort s, is terminating for each s ∈ S. As with ordinary TRS's, we let X = ∅ for the ground case, and we choose X large enough for the general case, e.g., XωS as defined in Proposition 5.3.6. The ARS results Theorem 5.7.2, Proposition 5.7.3, and Proposition 5.7.4 give the following:

Theorem 5.8.3
Given a canonical CTRS, every t ∈ T has a unique normal form, denoted [[t]] and called the canonical form of t. □

Proposition 5.8.4
Given a CTRS (Σ, A) and t, t′ ∈ TΣ(X), then t ↔A* t′ iff there are terms t1, . . . , tn ∈ TΣ(X) such that t ↓A t1, ti ↓A ti+1 for i = 1, . . . , n − 1, and tn ↓A t′. □

Proposition 5.8.5 (Newman Lemma) A terminating CTRS is Church-Rosser if and only if it is locally Church-Rosser. □
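These ARS-level notions are easy to experiment with on small finite examples. The following Python sketch (the ARS data is invented for illustration) computes joinability and canonical forms for a terminating, locally Church-Rosser ARS, where the Newman Lemma guarantees that normal forms are unique:

```python
# Joinability and canonical forms on a small finite ARS (illustrative
# data): the system is terminating and locally Church-Rosser, so by the
# Newman Lemma every term has a unique normal form.
STEPS = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set(),
         "e": {"f"}, "f": set()}

def reducts(t):
    """All terms reachable from t in zero or more rewrite steps."""
    seen, todo = {t}, [t]
    while todo:
        for v in STEPS[todo.pop()]:
            if v not in seen:
                seen.add(v); todo.append(v)
    return seen

def canonical(t):
    """[[t]]: the unique normal form of t."""
    nfs = {u for u in reducts(t) if not STEPS[u]}
    assert len(nfs) == 1            # uniqueness, by the Newman Lemma
    return nfs.pop()

def joinable(t, u):                 # t joins u: a common reduct exists
    return bool(reducts(t) & reducts(u))

assert canonical("a") == "d"
assert joinable("b", "c") and not joinable("a", "e")
```

Here a and e lie in different connected components, so they are neither joinable nor convertible, while b and c join at the shared normal form d.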
Other results generalize, not through ARS's, but because their proofs generalize. We begin with two results from Section 5.1 that connect rewriting with deduction:
Proposition 5.8.6
For t, t′ ∈ TΣ(Y), Y ⊆ X, and (Σ, A) a CTRS, t ⇒A,X t′ iff t ⇒A,Y t′, and in both cases var(t′) ⊆ var(t). □
For t, t′ ∈ TΣ(Y), Y ⊆ X, and (Σ, A) a CTRS, t ⇒A,X* t′ iff t ⇒A,Y* t′, and in both cases var(t′) ⊆ var(t); moreover, t ↓A,X t′ iff t ↓A,Y t′. □

As before, this shows that ⇒A,X* and ↓A,X restrict and extend well over variables, which permits dropping the variable set subscripts. On the other hand, noting that TRS's are a special case of CTRS's, Example 5.1.15 shows that ↔A,X* does not restrict and extend well over variables. The next result gives soundness:
Proposition 5.8.8
Given a CTRS (Σ, A) and t, t′ ∈ TΣ(X), then t ⇒A* t′ implies A ⊢ (∀X) t = t′. Also t ↓A t′ and t ↔A* t′ both imply A ⊢ (∀X) t = t′. □

We cannot hope for completeness here, because it is possible, for a condition t1 = t2, that A ⊢ t1 = t2 but t1 ↓ t2 fails. The literature includes several different ways to define conditional term rewriting; the one in Definition 5.8.2 is called join conditional rewriting. OBJ implements a special case of this, where there is just one condition in each rule, with its leftside a Bool-sorted term, and its implicit rightside the constant true. Although conditional rewriting can be difficult, the special case implemented in OBJ is much more efficient, because the convergence of a condition can be checked just by rewriting its leftside term.
Perhaps surprisingly, OBJ's restrictions do not limit its power in practice. In particular, evaluation of an OBJ equation of the form t = t′ if u == v agrees with Definition 5.8.2, despite the implicit true on the rightside of the condition, because of the operational semantics of ==. In any case, soundness implies that any rewriting computation is a proof, so if you get the result you want, then you have proved the result you wanted to prove, whether or not the CTRS that you used was Church-Rosser or terminating. Also, it is rare in practice that when OBJ evaluates u == v, a term r exists such that u ⇒A* r and v ⇒A* r but OBJ does not find this r, because u and v reduce to different normal forms, or because at least one of them does not terminate. There is an important obstacle to soundness for non-canonical CTRS's: if == occurs in a negative position (such as =/=) in a condition, then failure of == to find a common reduced form may lead to its negation returning an unsound true. However, soundness can be guaranteed for such conditional rules if A is canonical for the sorts of terms that occur in such positions, using the subset of rules that are actually applied in the particular computation. As noted before, some uses of == have to be considered carefully, because they can take one outside the mathematical semantics of OBJ.

Theorem 5.2.9 on initiality of the algebra of normal forms also generalizes; we do not prove it here, because it is a special case of Theorem 7.7.8, which is proved in Section 7.7.

Theorem 5.8.9
If a (conditional) specification P = (Σ, A) is a ground canonical CTRS, then the canonical forms of ground terms under A form a P-algebra, called the canonical term algebra of P and denoted NP, in the following way:

(0) interpret σ ∈ Σ[],s as [[σ]] in NP,s; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the map sending (t1, . . . , tn) with ti ∈ NP,si to [[σ(t1, . . . , tn)]] in NP,s.

Furthermore, if M is any P-algebra, there is one and only one Σ-homomorphism NP → M. □

This subsection extends results from Section 5.3 from TRS's to CTRS's, on how properties can change when new constants are added. As before, these results are important because they help us conclude that rewriting systems terminate, are Church-Rosser, locally Church-Rosser, or canonical; moreover, they can justify using the Theorem of Constants in theorem proving. Proposition 5.3.1 extends to the conditional case to support this assertion; the proof appears in Appendix B:
Proposition 5.8.10
A CTRS (Σ(X), A) is ground terminating, where X is a signature of constants, if (Σ, A) is ground terminating. E17 Moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating. □

As with the unconditional case, proofs of the (local) Church-Rosser property usually cover the general case, not just the ground case, so that constants can be added without worry. The following result is still of interest:
Proposition 5.8.11
A CTRS (Σ, A) is (locally) Church-Rosser if and only if the CTRS (Σ(XωS), A) is (locally) ground Church-Rosser, where XωS is as defined in Proposition 5.3.6. □

The proof is omitted, since it is the same as that for Proposition 5.3.6 on page 108 for the unconditional case.
Proving termination of a CTRS can be much more difficult than for the unconditional case. But we can often reduce to the unconditional case, and then apply the techniques of Section 5.5. In the following result, the "unconditional version" of a conditional rule is defined to be the rule obtained by deleting its condition:
Proposition 5.8.12
Given a CTRS C, let CU be the TRS whose rules are those of C with their conditions (if any) removed. Then C is terminating (or ground terminating) if CU is.

Proof: Any rewrite sequence of C is also a rewrite sequence of CU, and is therefore finite. □

Notice that the normal forms of C may be different from those of CU, because in general CU has more rewrites than C. The following illustrates the use of this result to prove termination of a CTRS, and because we give rather a lot of detail, it can also serve as a review of the technique of Proposition 5.5.6:

Example 5.8.13
The function max, which gives the maximum of two natural numbers, is often defined using conditional equations as follows:

obj NATMAX is sort PNat .
  op 0 : -> PNat .
  op s_ : PNat -> PNat .
  op _<=_ : PNat PNat -> Bool .
  op max : PNat PNat -> PNat .
  vars N M : PNat .
  eq 0 <= N = true .
  eq s N <= 0 = false .
  eq s N <= s M = N <= M .
  cq max(N,M) = N if M <= N .
  cq max(N,M) = M if N <= M .
endo
We will show that this CTRS is terminating. It suffices to prove that the corresponding unconditional TRS is ground terminating, by Propositions 5.8.10 and 5.8.12. (It is interesting to notice that this TRS is not Church-Rosser, although the original CTRS is Church-Rosser.)

We define ρ : TΣ → ω by initiality, by making ω a Σ-algebra as follows: ωtrue = ωfalse = ω0 = 1; ωs(N) = N + 1; ω<=(N, M) = 1 + N + M; and ωmax(N, M) = 2 + N + M. To apply Proposition 5.5.6, we must check a number of inequalities. The following arise from condition (1), and must hold for any x, y ∈ TΣ,PNat:

ρ(0 <= x) > ρ(true)
ρ(s x <= 0) > ρ(false)
ρ(s x <= s y) > ρ(x <= y)
ρ(max(x, y)) > ρ(x) if (y <= x) = true
ρ(max(x, y)) > ρ(y) if (x <= y) = true

For condition (2′), we must check the following for any x, y, z ∈ TΣ,PNat under the assumption that ρ(x) > ρ(y):

ρ(s x) > ρ(s y)
ρ(x <= z) > ρ(y <= z)
ρ(z <= x) > ρ(z <= y)
ρ(max(x, z)) > ρ(max(y, z))
ρ(max(z, x)) > ρ(max(z, y))

All of these translate to inequalities over the natural numbers that are easily checked mechanically, e.g., with appropriate reductions under the following definition, noting that we must introduce only the syntax of NATMAX, not its equations, and that the version of NAT used, in this case NATP+*>, must contain enough facts about addition and > to make the proofs work:

obj NATMAXPF is sort PNat .
  pr NATP+*> .
  op 0 : -> PNat .
  op s_ : PNat -> PNat .
  op _<=_ : PNat PNat -> Bool .
  op max : PNat PNat -> PNat .
  op r : PNat -> Nat .
  op r : Bool -> Nat .
  vars X Y : PNat .
  eq r(0) = 1 .
  eq r(true) = 1 .
  eq r(false) = 1 .
  eq r(s X) = s r(X) .
  eq r(X <= Y) = s(r(X) + r(Y)) .
  eq r(max(X,Y)) = s s(r(X) + r(Y)) .
endo

Thus the following proves the first set of inequalities, where we have introduced constants to eliminate the universal quantifiers, and then a lemma that was not already in NATP+*>:

openr .
  ops x y : -> PNat .
  vars N M : Nat .
  eq s(N + M) > N = true .
  red r(0 <= x) > r(true) .
  red r(s x <= 0) > r(false) .
  red r(s x <= s y) > r(x <= y) .
  red r(max(x,y)) > r(x) .
  red r(max(x,y)) > r(y) .
close

The last two inequalities are true without their conditions, although the conditions could have been added as assumptions for the proofs if they had been needed. □
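The first set of inequalities can also be spot-checked numerically outside OBJ. The sketch below mirrors the weight function r of NATMAXPF in plain Python (the helper names are mine, and a finite sample of weight values is of course only a sanity check, not a proof):

```python
# Numeric spot-check (not a proof) of the weight function r used in
# NATMAXPF: every NATMAX rule should strictly decrease r, for any
# weights x, y >= 1 of the argument terms.
def r_zero():        return 1          # r(0)
def r_bool():        return 1          # r(true) and r(false)
def r_s(n):          return n + 1      # r(s X)
def r_le(n, m):      return 1 + n + m  # r(X <= Y)
def r_max(n, m):     return 2 + n + m  # r(max(X,Y))

for x in range(1, 30):
    for y in range(1, 30):
        assert r_le(r_zero(), x) > r_bool()        # eq 0 <= N = true
        assert r_le(r_s(x), r_zero()) > r_bool()   # eq s N <= 0 = false
        assert r_le(r_s(x), r_s(y)) > r_le(x, y)   # eq s N <= s M = N <= M
        assert r_max(x, y) > x                     # cq max(N,M) = N ...
        assert r_max(x, y) > y                     # cq max(N,M) = M ...
```

Since every ground term has weight at least 1, sampling weights from 1 upward covers the shape of all the inequalities; the genuine proof is the symbolic one given above.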
Exercise 5.8.1
Give mechanical proofs for the second set of inequalities in Example 5.8.13, similar to those given for the first set of inequalities there. □
The conclusion of Proposition 5.8.12 is that infinite rewrite sequences cannot occur. However, a related phenomenon can occur for terminating CTRS's, whereby a process of determining whether a conditional rewrite applies does not stop. This is illustrated in the following:
Example 5.8.14
Let Σ have just one sort plus four constants, a, b, c, d, and let A contain the following two conditional rewrite rules:

a → b if c = d
c → d if a = b

Then given the term a, to check whether the first rule applies, we must consult the second, which in turn requires that we consult the first rule again, etc., etc. According to the formal definition of conditional term rewriting, the result of such an infinite regress is simply that the original rule does not apply to the given term; so this does not lead to non-termination in the sense of Definition 5.2.1, and in fact this CTRS is terminating, as is easily seen using Proposition 5.8.12. Intuitively, neither rule applies, because neither condition can ever be satisfied. What does occur here is that a certain algorithm that might be used to implement conditional term rewriting fails to terminate. In fact, each of a, b, c, d is a reduced form under this CTRS (however, only b and d are reduced under its unconditional version).

It is possible to get the same phenomenon with just one rule. Let Σ now have one sort plus two constants, a, b, and one unary function s. Then the rule

a → b if s(a) = s(b)

leads to an infinite condition evaluation regress similar to that of the above two-rule example. Here too the CTRS is terminating, and a, b are both reduced, for similar reasons. □

Exercise 5.8.2
Write out the details of the infinite regress and of the termination proof for the one-rule CTRS above. □
Example 5.8.15
The following OBJ code for the two examples above aborts, producing the error message "Value stack overflow.", because of infinite conditional evaluation regress:

obj CTRS1 is sort S .
  ops a b c d : -> S .
  cq a = b if c == d .
  cq c = d if a == b .
endo
red a .

obj CTRS2 is sort S .
  ops a b : -> S .
  op s : S -> S .
  cq a = b if s(a) == s(b) .
endo
red a .

□
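The regress can be reproduced outside OBJ with a naive recursive condition evaluator. In the Python sketch below (illustrative code, not OBJ's actual algorithm), checking the condition of a → b recurses into the condition of c → d and back again until the stack overflows:

```python
# A naive recursive implementation of condition checking loops forever
# on the two-rule CTRS of Example 5.8.14: checking the condition of
# a -> b requires checking the condition of c -> d, and vice versa.
import sys

RULES = {"a": ("b", ("c", "d")),   # a -> b if c = d
         "c": ("d", ("a", "b"))}   # c -> d if a = b

def reduce_(t):
    """Reduce constant t, recursively evaluating rule conditions."""
    if t in RULES:
        rhs, (u, v) = RULES[t]
        if reduce_(u) == reduce_(v):    # recurse into the condition
            return reduce_(rhs)
    return t

sys.setrecursionlimit(100)              # fail fast instead of hanging
try:
    reduce_("a")
    outcome = "terminated"
except RecursionError:
    outcome = "infinite condition-evaluation regress"
print(outcome)   # -> infinite condition-evaluation regress
```

As the text notes, a smarter algorithm could detect this particular loop, but no algorithm can decide condition-evaluation termination in general.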
Of course, it is interesting to know when condition evaluation terminates, as well as when rewriting terminates, but we do not address that problem here.

Proposition 5.5.1 on page 111 generalizes to abstract rewrite systems, and hence applies to the conditional case just as well as to the unconditional case, and Example 5.5.2 again shows the necessity of global finiteness for the converse.
Proposition 5.8.16
An ARS A on a set T is terminating if there is a function ρ : T → ω such that for all t, t′ ∈ T, if t →A t′ then ρ(t) > ρ(t′). Furthermore, if A is globally finite, then A is terminating iff such a function exists. □

(Although it is easy to design an algorithm that does terminate on simple examples of this kind, just by checking for loops, it is impossible to write an algorithm that works for all examples, because the problem is unsolvable.)
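On a finite ARS, the weight-function criterion of Proposition 5.8.16 can be checked mechanically. The sketch below (the ARS and the weights are invented for illustration) verifies that every step strictly decreases ρ, and observes that the weights then bound the length of any rewrite sequence:

```python
# A mechanical check of the weight-function criterion of Proposition
# 5.8.16 on a small finite ARS (the ARS and weights are illustrative).
STEPS = {("a", "b"), ("b", "c"), ("a", "c")}
RHO   = {"a": 2, "b": 1, "c": 0}

# Every rewrite step strictly decreases rho, so the ARS terminates:
assert all(RHO[s] > RHO[t] for s, t in STEPS)

# Indeed, no rewrite sequence can be longer than max(RHO.values()):
def longest(t):
    succs = [v for u, v in STEPS if u == t]
    return 0 if not succs else 1 + max(longest(v) for v in succs)

assert max(longest(t) for t in RHO) <= max(RHO.values())
```

The converse direction of the proposition is where global finiteness matters: without it, a terminating ARS can have rewrite sequences of unbounded length from a single element, so no ω-valued weight exists.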
With this we can generalize Proposition 5.5.6 to CTRS's, using the following terminology:
Definition 5.8.17
Given a poset P and ρ : TΣ → P, a conditional Σ-rewrite rule t → t′ if C is strict ρ-monotone iff ρ(θ(t)) > ρ(θ(t′)) for each applicable ground substitution θ such that θ(u) ↓ θ(v) for each ⟨u, v⟩ ∈ C; we speak of weak ρ-monotonicity if > is replaced by ≥ above. See Definition 5.5.3 on page 112 for related concepts. □
Given a CTRS (Σ, A), if there is a function ρ : TΣ → ω such that

(1) each rule in A is strict ρ-monotone, and
(2′) each σ ∈ Σ is strict ρ-monotone,

then A is ground terminating. □

The proof is like that of Proposition 5.5.6 on page 113, and is therefore omitted. Rather than give an example using this result now, we will further generalize it to the very common case of a (C)TRS that we know terminates, to which we add some new rules, and then want to show that the resulting system also terminates. The following easy but useful ARS result is the basis for this generalization:
Proposition 5.8.19
Let A be an ARS on a set T, let B be a terminating "base" ARS contained in A, and let N denote the "new" rewrites of A on T, i.e., let →N = →A − →B. Then A is terminating if there is a function ρ : T → ω such that

(1) if t →B t′ then ρ(t) ≥ ρ(t′), and
(2) if t →N t′ then ρ(t) > ρ(t′).

Proof:
Any A-rewrite sequence can be put in the form t0 →B* t1 →N t2 →B* t3 →N ⋯, from which it follows that ρ(t0) ≥ ρ(t1) > ρ(t2) ≥ ρ(t3) > ⋯. Hence there is some k such that no N-rewrite applies to tk. Because B is terminating, there can be only a finite number of rewrites after tk, so the sequence must be finite. □

The two levels of this result can be iterated to form a multi-level hierarchy, in which one proves the termination of each layer assuming the one below it. The following is a conditional hierarchical version of Proposition 5.5.6; of course, it also applies to unconditional TRS's. The proof is not entirely trivial. (Recall that the inequality only needs to hold when all the conditions of the rule converge.)
Theorem 5.8.20
Let (Σ, A) be a CTRS with Σ non-void, let (Σ, B) be a terminating sub-CTRS of (Σ, A), and let N = A − B. If there is a function ρ : TΣ → ω such that

(1) every rule in B is weak ρ-monotone,
(2) every rule in N is strict ρ-monotone,
(3) every σ ∈ Σ is strict ρ-monotone,

then A is ground terminating.

Proof:
We will use Proposition 5.8.19. Let A, B, N be the ARS's for A, B, N respectively, on the (indexed) set T = TΣ, and let Σ′ be the minimal signature for B. Notice that rules in B may apply to terms with operations in Σ − Σ′; therefore B must apply such rewrites. This means we cannot assume termination for B, and hence to apply Proposition 5.8.19, we must first establish that assumption for B on TΣ. We will do this by induction on the depth of nesting of new operation symbols in a Σ-term t.

For the base case, a Σ-term t has depth zero iff it contains no operations in Σ − Σ′, and then we have termination by our assumption that B is ground terminating on TΣ′.

Next, suppose t is a Σ-term with depth d > 0 of the form g(t1, . . . , tn) with g ∈ Σ − Σ′ and with each of t1, . . . , tn of depth less than d. Then by the inductive assumption, rewriting with B is terminating on each ti, and hence is terminating on t, because only a lapse rule in B could be applied at the top of t, and any such application reduces us to the case of the previous paragraph, because the rightside of the lapse rule must be a ground term, or else B would not be terminating.

Now consider the general case of a Σ-term t with depth d > 0, which will have the form t = t0(z1 ← t1, . . . , zn ← tn) with t0 involving only operations in Σ′, with each of t1, . . . , tn of depth d or less, and with the top operation of each ti in Σ − Σ′. Then any rewrite of t must be either inside some ti or else inside t0. There can be only a finite number of rewrites of the first kind, by the argument of the previous paragraph, and there can be only a finite number of rewrites of the second kind, noting that our signature is non-void and applying Proposition 5.8.10 on page 131 of Section 5.8.1 (which generalizes Proposition 5.3.4, about the effect on termination of adding constants, to the conditional case), because the zi are just new constants. Hence rewriting with B terminates on any such term t, and we have therefore proved termination of B.

Next, observe that our assumptions (1) and (3) above imply assumption (1) of Proposition 5.8.19, by the same reasoning that was used to prove Proposition 5.5.5 in Appendix B. Similarly, our assumptions (2) and (3) imply assumption (2) of Proposition 5.8.19. □

In many cases, ρ is already defined on B, and we only need check the conditions for the new rules and new operations. Notice that if a CTRS A has a terminating sub-CTRS B such that the new rules in N = A − B cannot be used in evaluating the conditions of rules in N, then infinite condition evaluation regress cannot occur.

We first apply Theorem 5.8.20 to a case where all the new rules are unconditional; here the hierarchical specification greatly simplifies the termination proof, using the fact that termination was previously shown for the base system.

Example 5.8.21
Suppose we are given some (C)TRS B for the natural numbers that we already know is terminating, such as NATP+, and then define the Fibonacci numbers over B by:

obj FIBO is pr NATP+ .
  op f : Nat -> Nat .
  var N : Nat .
  eq f(0) = 0 .
  eq f(s 0) = s 0 .
  eq f(s s N) = f(s N) + f(N) .
endo

Letting Σ be the signature for the union TRS A, and Σ′ the signature for B (which is NATP+), we define ρ : TΣ → ω by letting each σ ∈ Σ′ have its usual meaning in ω, and letting ωf(N) = 2^N. Then for t ∈ TΣ′, the value of ρ(t) is the number that it denotes, and so all the B-rules are weak monotone (in fact, with equality). For condition (2), strict monotonicity of the three N-rules for the Fibonacci function follows from the corresponding inequalities, of which the most interesting is the third, 4 · 2^N > 3 · 2^N. Condition (3) of Theorem 5.8.20 is easy to check from the definitions of the functions defined on ω. Hence this TRS is terminating. □

Notice that proving termination for the specification of a function like Fibonacci gives much more than just termination of the underlying algorithm, because it applies to terms with any number of occurrences of the function, in any combination with functions from the base rewriting system, to any level of nesting.

We next do an example with a conditional rule such that the method of Proposition 5.8.12 cannot be used, because the unconditional version of this CTRS fails to terminate; this example is the bubblesort algorithm.
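Before moving on, the Fibonacci inequalities can be spot-checked numerically. The sketch below assumes the weight interpretation ρ(f(t)) = 2^ρ(t), with numerals and + given their usual meanings (the natural choice for this example, stated here as an assumption):

```python
# Spot-check of the weight argument for FIBO, assuming rho(f(t)) =
# 2**rho(t) and the usual meanings of numerals and +: each of the three
# rules for f strictly decreases rho.
def rho_f(n):
    return 2 ** n          # weight of f(t), given rho(t) = n

assert rho_f(0) > 0        # eq f(0) = 0      : 2**0 > 0
assert rho_f(1) > 1        # eq f(s 0) = s 0  : 2**1 > 1

# eq f(s s N) = f(s N) + f(N) : 4 * 2**n > 2 * 2**n + 2**n = 3 * 2**n
for n in range(50):
    assert rho_f(n + 2) > rho_f(n + 1) + rho_f(n)
```

The third assertion is exactly the inequality 4 · 2^N > 3 · 2^N cited in the example, since 2^(N+2) = 4 · 2^N and 2^(N+1) + 2^N = 3 · 2^N.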
Example 5.8.22
Assume that the following specification for lists of natural numbers has been shown to be terminating as a TRS:

obj NATLIST is sorts Nat List .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  op _<_ : Nat Nat -> Bool .
  op nil : -> List .
  op _._ : Nat List -> List .
  *** vars and eqs omitted ...
endo
Now add to this the following new operation and rule, which define the so-called bubblesort algorithm for sorting lists of naturals:

obj BSORT is pr NATLIST .
  op sort : List -> List .
  vars N M : Nat .
  var L : List .
  cq sort(N . (M . L)) = sort(M . (N . L)) if M < N .
endo
The conditional rewrite rule above switches two adjacent list elements iff they are out of order. We want to show that this hierarchical CTRS is terminating. Notice that the above equation without the condition is definitely not terminating; for example, the term sort(0 . (s 0 . nil)) can be rewritten to sort(s 0 . (0 . nil)), which can be rewritten back to the original term, etc., etc. Even though we only sketch the proof, the specification really needs to have an operation and equation such as

op sorted : List -> Bool .
cq sort(L) = L if sorted(L) .

to get rid of the sort function symbol when the list is finally sorted. However, the essence of bubblesort is the conditional rule in the BSORT module above.

The Σ-algebra structure of ω is defined by interpreting the operations on the naturals as themselves, interpreting true, false, and nil as 0, letting ω<(N, M) = N + M, letting ωsort(L) = L + 1, and letting ω.(N, L) = d(N, L) + d(L), where d is the "displacement" function, i.e., the number of pairs that are out of order, defined by

d(nil) = 0
d(N . L) = d(N, L) + d(L)
d(N, nil) = 0
d(N, M . L) = 1 + d(N, L) if N > M
d(N, M . L) = d(N, L) if N ≤ M

Proving strict monotonicity of the new rule depends on the lemma

d(N . (M . L)) = 1 + d(M . (N . L)) if M < N,

which is not hard to prove by case analysis. The strict monotonicity of ω. can be checked from the definition of d. It is easy to check the other monotonicity conditions for both rules and operations, and so we are done. By the way, we can actually write the above definition of d in OBJ and then define

eq sorted(L) = d(L) == 0 .  □

Exercise 5.8.3
Show that the equations defining d in Example 5.8.22 above are terminating, when viewed as rewrite rules over NATLIST. Hint: Show that the unconditional version is terminating with Theorem 5.8.20, and then apply Proposition 5.8.12. □
Exercise 5.8.4
Give OBJ proofs for the results of Example 5.8.22 and Exercise 5.8.3. □
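The displacement function d of Example 5.8.22 is just an inversion count, and the termination argument can be animated directly; in the plain-Python rendering below (list representation and function names are mine), each application of the BSORT rule decreases d by exactly one, which is the content of the lemma:

```python
# The displacement d of Example 5.8.22, rendered on Python lists: d
# counts the pairs that are out of order, and one application of the
# BSORT rule (swapping an adjacent out-of-order pair) decreases d by
# exactly one, which is the lemma behind the termination proof.
def d(lst):
    """Number of pairs (i, j) with i < j and lst[i] > lst[j]."""
    return sum(1 for i in range(len(lst))
                 for j in range(i + 1, len(lst)) if lst[i] > lst[j])

def bubble_step(lst):
    """Apply the conditional rule once; None if no pair is out of order."""
    for i in range(len(lst) - 1):
        if lst[i] > lst[i + 1]:
            out = list(lst)
            out[i], out[i + 1] = out[i + 1], out[i]
            return out
    return None

l = [3, 1, 2, 0]
while True:
    nxt = bubble_step(l)
    if nxt is None:
        break
    assert d(nxt) == d(l) - 1      # the measure strictly decreases
    l = nxt
assert l == [0, 1, 2, 3] and d(l) == 0
```

Since d is a natural number that strictly decreases with each rule application, rewriting must stop, and it stops exactly when d reaches 0, i.e., when the list is sorted.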
It should not be thought that proving termination of conditional term rewriting systems is always an easy task. While the results given in this subsection seem adequate for the most common examples, there are many others for which they are not. The following two examples constitute a somewhat entertaining partial digression on non-proofs of termination.
Example 5.8.23
We give two TRS's that are ground terminating separately but combine to give a TRS that is not; this is called "Toyama's example" [177]. We also prove that Theorem 5.8.20 could never be used to demonstrate the termination of this TRS.

obj B is sort S .
  ops 0 1 : -> S .
  op f : S S S -> S .
  var X : S .
  eq f(0,1,X) = f(X,X,X) .
endo

obj A is pr B .
  op g : S S -> S .
  vars X Y : S .
  eq g(X,Y) = X .
  eq g(X,Y) = Y .
endo
A term that demonstrates that this TRS is not ground terminating is

t = f(g(0,1), g(0,1), g(0,1)),

which rewrites first to f(0, g(0,1), g(0,1)), then to f(0, 1, g(0,1)), and then back to the initial term t. Note that the equation in B could also have been given as the conditional equation

cq f(X,Y,Z) = f(Z,Z,Z) if X == 0 and Y == 1 .
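The cycle can be traced concretely. In the sketch below, terms are encoded as nested Python tuples and each rewrite choice is applied by hand (this is only a trace of the cycle, not a rewrite engine):

```python
# Toyama's cycle, with terms as nested tuples and each rewrite choice
# made by hand.
def g(x, y):       return ("g", x, y)
def f(x, y, z):    return ("f", x, y, z)

t0 = f(g(0, 1), g(0, 1), g(0, 1))
t1 = f(0, g(0, 1), g(0, 1))          # g(X,Y) -> X in the first argument
t2 = f(0, 1, g(0, 1))                # g(X,Y) -> Y in the second argument
t3 = f(g(0, 1), g(0, 1), g(0, 1))    # f(0,1,X) -> f(X,X,X), X = g(0,1)
assert t3 == t0                      # back to the start: a cycle, so the
                                     # combined TRS is not terminating
```

The first two steps use the new rules of A and the last step uses the rule of B, which is exactly the pattern of inequalities exploited in the argument that follows.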
Now suppose we have ρ : T_Σ → ω (where Σ is the signature of A) that is weak ρ-monotone on the rule in B and strict ρ-monotone on the new rules of A, such that all operations in Σ are strict ρ-monotone. Let us write [t] for ρ(t). Then

[t] = [f(g(0,1), g(0,1), g(0,1))]
    > [f(0, g(0,1), g(0,1))]
    > [f(0, 1, g(0,1))]
    ≥ [t],

which is a contradiction. Therefore Theorem 5.8.20 could never be used to prove ground termination of this TRS (which is of course consistent with the fact that this TRS is not ground terminating). □

Example 5.8.24
Using the same technique as in Example 5.8.23, we sketch a proof that Theorem 5.8.20 cannot be used to prove ground termination of the specification for the greatest common divisor given below (which is essentially Euclid's algorithm), viewed as a hierarchical CTRS over some suitable terminating specification NAT of the natural numbers with subtraction and >, where ρ is defined homomorphically. Termination of this CTRS is proved in Example 5.8.32 using much more sophisticated methods.

obj GCD is pr NAT .
  op gcd : Nat Nat -> Nat .
  vars M N : Nat .
  eq gcd(M,0) = M .
  eq gcd(0,N) = N .
  cq gcd(M,N) = gcd(M - N, N) if M >= N and N > 0 .
  cq gcd(M,N) = gcd(M, N - M) if N >= M and M > 0 .
endo

The proof will be by contradiction, so we assume that there are a Σ-algebra structure on ω and a weight function ρ : T_Σ → ω satisfying all the conditions of Theorem 5.8.20, where A is GCD plus NAT, Σ is the signature of A, and B is NAT. We first prove a lemma, that
M(x, y) ≥ x for all x, y in ω, where M(x, y) denotes the function ω_−(x, y) on ω. The proof is by contradiction, so we suppose that there exist x, y such that M(x, y) < x, which yields an infinite strictly decreasing sequence; but this is impossible because ω is Noetherian (i.e., has no infinite strictly decreasing sequences). By the same reasoning, the analogous inequality holds for the function G(x, y) = ω_gcd(x, y).

Now we are ready for the main part of the proof, in which we write [t] for ρ(t), as well as M for ω_− and G for ω_gcd as above. Let x, y be natural number terms (i.e., ground terms in the base rewriting system NAT) with x > y. Then

[gcd(x, y)] > [gcd(x − y, y)]
            = G([x − y], [y])
            = G(M([x], [y]), [y])
            ≥ G([x], [y])
            = [gcd(x, y)],

which is a contradiction (the first step results from monotonicity when applying the first conditional rule, and the next to last step uses the lemma for M and then monotonicity for G). □

The final calculation in the above example suggests that the reason this termination proof method fails for gcd is the monotonicity requirement for operations combined with homomorphicity. Because these assumptions are not needed for Proposition 5.8.19, the possibility remains of applying something like that result directly, as is done in the next subsection.
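The rules of GCD compute Euclid's algorithm by repeated subtraction; a direct Python transcription (an illustrative sketch, not OBJ) mirrors the four rules, with the conditions N > 0 and M > 0 guarding the recursive calls:

```python
def gcd_sub(m, n):
    """gcd by repeated subtraction, one clause per rule of GCD."""
    if n == 0:                 # eq gcd(M,0) = M
        return m
    if m == 0:                 # eq gcd(0,N) = N
        return n
    if m >= n:                 # cq gcd(M,N) = gcd(M-N,N) if M >= N and N > 0
        return gcd_sub(m - n, n)
    return gcd_sub(m, n - m)   # cq gcd(M,N) = gcd(M,N-M) if N >= M and M > 0
```

Semantically, termination is clear because m + n strictly decreases at each recursive call; the point of the example above is that the homomorphic weight functions of Theorem 5.8.20 cannot express this, which is why Example 5.8.32 needs a more elaborate ordering.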
Exercise 5.8.5 Assuming a suitable terminating specification NAT for the natural numbers with inequality >, prove termination of the following CTRS for binary search trees:

obj BTREE is sort BTree .
  pr NAT .
  op empty : -> BTree .
  op make : BTree Nat BTree -> BTree .
  op insert : Nat BTree -> BTree .
  vars T1 T2 T3 : BTree .
  vars N M : Nat .
  eq insert(M,empty) = make(empty,M,empty) .
  cq insert(M,make(T1,N,T2)) = make(insert(M,T1),N,T2) if N > M .
  cq insert(M,make(T1,N,T2)) = make(T1,N,insert(M,T2)) if M > N .
endo

□

(⋆) Noetherian Orderings

This subsection develops the remark after Example 5.5.4 that it is useful to allow weight functions that take values in Noetherian partial orderings other than ω (see Appendix C for a review of partially ordered sets, also called posets), where a poset is Noetherian (also called well founded) iff it has no infinite sequence of strictly decreasing elements. The key observation is that Proposition 5.8.19 and Theorem 5.8.20 generalize to any Noetherian poset, because their proofs depend only on the Noetherian property; note also that a different Noetherian ordering could be used for each sort, since we are really dealing with a sorted set of posets. As with ω, the key intuition is that rewrites should strictly decrease weight. Some examples, including the greatest common divisor as computed by Euclid's algorithm, need rather complicated orderings. To help with this, we introduce some ways to build new orderings out of old ones, such that if the old orderings are Noetherian then so are the new ones. Unfortunately, much of the material in this subsection is rather technical.

Definition 5.8.25 Let P, Q be posets, with both their orderings denoted ≥. Then their (Cartesian) product poset, denoted P × Q, has as its elements the pairs (p, q) with p ∈ P and q ∈ Q, and has (p, q) ≥ (p′, q′) iff p ≥ p′ and q ≥ q′.
Their lexicographic product, here denoted P ⋉ Q, again has as elements the pairs (p, q) with p ∈ P and q ∈ Q, but now ordered by (p, q) ≥ (p′, q′) iff p > p′, or else p = p′ and q ≥ q′. To avoid confusion with pairs (p, q) ∈ P × Q, we will hereafter use the notation p ⋉ q for elements of P ⋉ Q. The sum of posets P₁, P₂, denoted P₁ + P₂, has as its elements pairs (i, p) with p ∈ Pᵢ for i = 1, 2, ordered by (i, p) ≥ (i′, p′) iff i = i′ and p ≥ p′ in Pᵢ. A poset Q is a subposet of a poset P iff Q ⊆ P and q ≥ q′ in Q iff q ≥ q′ in P, for all q, q′ ∈ Q. □

The following result is rather straightforward to prove:

Proposition 5.8.26 If P, Q are both Noetherian posets, then so are P × Q, P ⋉ Q and P + Q. Moreover, the discrete ordering on any set X, defined by x ≥ y iff x = y, is also a Noetherian poset, and any subposet of a Noetherian poset is Noetherian. □

Example 5.8.27 Motivated by the applications of term rewriting to verifying hardware circuits that are developed in Section 7.4, a system T of Σ(X)-equations is said to be triangular iff X is finite, there is a subset of X called input variables, say i₁, ..., iₙ, and there is an ordering of the non-input variables, say p₁, ..., pₘ, such that the equations in T have the form

p_k = t_k(i₁, ..., iₙ, p₁, ..., p_{k−1}) for k = 1, ..., m,

where each t_k is a Σ(X)-term involving only input variables and those non-input variables p_j with j < k (in particular, t₁ must contain only input variables).

We first prove that any triangular system T is terminating as a TRS. Let P = ⋉ᵐᵢ₌₁ ω, the m-fold lexicographic product of ω with itself, and define ρ : T_Σ → P by letting ρ(t) = (ℓₘ, ...
, ℓ₁), where ℓ_k is the number of occurrences of p_k in t. (If P₁ and P₂ are disjoint, then the elements of P₁ + P₂ can be taken as just those in P₁ ∪ P₂; the purpose of the construction with the pairs (i, p) is just to enforce disjointness in case P₁, P₂ were not already disjoint.) Rewriting a Σ(X)-term t with any equation in T will decrease ρ(t), because it will decrease the number ℓ_k of occurrences of the non-input variable p_k in the rule's leftside by one, while possibly increasing the numbers ℓ_j of occurrences of variables p_j with j < k. Therefore T is terminating by Proposition 5.5.1 generalized to Noetherian posets.

Next, we use the Newman Lemma (Proposition 5.7.4) to show that T is Church-Rosser, by proving that the local Church-Rosser property holds. For this purpose, we first note that if a Σ(X)-term t can be rewritten in two distinct ways, it must have the form t₀(z₁ ← p_i, z₂ ← p_j), where z₁, z₂ are distinct new variables, each occurring just once in t₀. To prove this, pick one of the rewrites and note that, since its redex is a non-input variable, t must have the form t′(z₁ ← p_i). Because there is just one rule for each non-input variable, the redex for the second rewrite is disjoint from that for the first, so that t′, and hence t, has the form t₀(z₁ ← p_i, z₂ ← p_j), for which we will use the shorter notation t₀(p_i, p_j). It now follows that the two rewrites have the forms t ⇒ t₀(t_i, p_j) and t ⇒ t₀(p_i, t_j). Therefore each target term can be rewritten to t₀(t_i, t_j) by applying the other rule once. We now conclude that any triangular system is canonical.

Finally, we show that the only variables that can occur in a normal form of a triangular system are input variables, by proving the contrapositive: if a term t contains a non-input variable, then it can be rewritten using the rule with that variable as its leftside, and hence it is not reduced.
□

A more complex construction of a new Noetherian poset from an old one is given by multisets. Intuitively, multisets generalize ordinary sets by allowing elements to occur multiple times. A multiset is often defined to be a function A : D → ω₊, where D is the domain and A(d) is the multiplicity of d ∈ D. It is common to use set notation for multisets, so that for example the multiset denoted by {1, 1, 2} would have D = {1, 2}, with A(1) = 2 and A(2) = 1, indicating two instances of 1 and one of 2. Then the most natural notion of a submultiset of A would be a subset D′ of D and a function A′ : D′ → ω₊ such that A′(d) ≤ A(d) for all d ∈ D′; for example, {1, 2} ≤ {1, 1, 2}.

However, this approach is inadequate for our applications, which require multisets of elements drawn from a Noetherian poset P, with an ordering such that, for example where P is ω with the usual ordering, {1} < {2} < {3} and {1, 1} < {1, 2}. Also, in term rewriting theory, the phrase "multiset ordering" usually refers to an ordering that allows even more possibilities, such as {1, 1, 1} < {2} and {1, 1, 1, 1} < {2, 2}; but because our applications do not need this extra sophistication, we will develop only a somewhat simplified special case.

Our mathematical formulation of multisets involves a possibly surprising reversal of the approach sketched above, in that we dispense with ω₊, and instead rely on abstract sets whose elements represent instances of elements of P. For example, {1, 1, 2} is represented by the function A : {x, y, z} → P with A(x) = 1, A(y) = 2, and A(z) = 1.

Definition 5.8.28 Given a poset P with an ordering ≥, a multiset over P is a function A : X → P with underlying set X; call a multiset A : X → P finite iff its underlying set is finite; the empty multiset, denoted ∅, has the empty underlying set. Given multisets A : X → P and B : Y → P, define A ≥ B iff there is an injective function f : Y → X such that A(f(y)) ≥ B(y) for all y ∈ Y.
Let M(P) denote the class of all finite multisets, where all multisets A, B such that A ≥ B and B ≥ A are identified. (In order to avoid set-theoretic worries, it is desirable to restrict the underlying sets that are used, for example, to finite subsets of ω; so technically speaking, we have an ordering on the quotient set.) □

Exercise 5.8.6 Prove the following, where P is ω with the usual ordering:

{1, 1} > {1} > ∅
{1, 2} > {1, 1} > {1}
{2, 2} > {2, 1} > {1, 1},

where (as in Appendix C) A > B means A ≥ B and A ≠ B (which, because of the equivalence on multisets, means A ≥ B and not B ≥ A). However, it is not possible to show (for example) that {2, 2} > {1, 1, 1, 1}, which would be required by the more usual and powerful multiset ordering. □

Although this multiset ordering is weaker than the usual one, it is easier to reason about, and it is sufficient for the applications in this chapter. (To obtain the multiset ordering that is more usual in term rewriting, the restriction to injective functions should be relaxed to asserting of f : Y → X that if f(y) = f(y′) with y ≠ y′ then A(f(y)) > B(y) and A(f(y′)) > B(y′).)

Proposition 5.8.29 If P is a Noetherian poset, then so is M(P).

Proof: Reflexivity is easy. For anti-symmetry, use the lemma that A ≥ B and B ≥ A iff there is some bijective f : Y → X such that A(f(y)) = B(y) for all y ∈ Y. For transitivity, given A ≥ B ≥ C with underlying sets X, Y, Z and injections f : Z → Y and g : Y → X, then f ; g : Z → X is also injective and satisfies A(g(f(z))) ≥ B(f(z)) ≥ C(z) for all z ∈ Z.

For the Noetherian property, suppose that A₁ > A₂ > ··· > Aₙ > ··· is an infinite strictly decreasing sequence, where Aᵢ has underlying set Xᵢ. This gives rise to an infinite sequence of injections X₁ ← X₂ ← ··· ← Xₙ ← ···. Then because X₁ is finite, there must exist some n such that (up to isomorphism) Xₙ = X_{n+k} for all k ≥ 1. Then for each
k ≥ 1 there is some x ∈ Xₙ such that A_{n+k}(x) > A_{n+k+1}(x). But for each particular x ∈ Xₙ, there can only be a finite number of such k, because P is Noetherian. Now because Xₙ is finite, there can only be a finite number of pairs (k, x) such that the above inequality holds, which contradicts our initial assumption. □

We now make one further identification, of p ∈ P with {p} ∈ M(P), noting that p ≥ p′ in P iff {p} ≥ {p′} in M(P), so that the inclusion map P ⊆ M(P) is order preserving. Therefore, defining M^{n+1}(P) = M(M^n(P)) with of course M¹(P) = M(P), we get

P ⊆ M(P) ⊆ M²(P) ⊆ ··· ⊆ M^n(P) ⊆ ···,

and can therefore form the union of all these to get M^ω(P) = ∪ₙ M^n(P), the union ordering on which is called the nested multiset ordering.

Fact 5.8.30 The nested multiset ordering M^ω(P) = ∪ₙ M^n(P) is Noetherian if P is.

Proof: Each M^n(P) is Noetherian by induction using Proposition 5.8.29, and each element of the union lies in M^n(P) for some least n, as do all elements less than any given element. (Those who know some category theory may recognize this as a colimit construction; the result just proved is that this colimit of an increasing sequence of Noetherian posets is Noetherian. We also note that the product and sum constructions of Definition 5.8.25 are the categorical product and coproduct.) □

Similar constructions are used in Exercise 5.8.8 and Example 5.8.32 below.

Exercise 5.8.7 Given a poset P and an equivalence relation ≡ on the carrier of P (which is also denoted P), let P/≡ denote the set P/≡ ordered by [p] ≤ [q] for p, q ∈ P iff p′ ≤ q′ for some p′ ≡ p and q′ ≡ q. Show that if P/≡ is a poset, and if it has only a finite number of non-trivial equivalence classes, then it is Noetherian if P is. Give an example showing that the hypothesis about a finite number of non-trivial equivalence classes is necessary. Hint: Let P have a₁ > a₂, a′₂ > a₃, a′₃ > a₄, ..., and then identify aᵢ with a′ᵢ for i = 2, 3, 4, ....
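The simplified multiset ordering of Definition 5.8.28 is easy to test by brute force; the following Python sketch (illustrative only, with finite multisets over ω represented as lists) searches for an injection f with A(f(y)) ≥ B(y), and reproduces comparisons in the style of Exercise 5.8.6:

```python
from itertools import permutations

def ms_ge(A, B):
    """A >= B iff some injection maps each element of B to a
    distinct element of A that dominates it (Definition 5.8.28)."""
    if len(B) > len(A):
        return False          # no injection Y -> X can exist
    return any(all(A[p[i]] >= B[i] for i in range(len(B)))
               for p in permutations(range(len(A)), len(B)))

def ms_gt(A, B):
    """Strict order: A >= B but not B >= A."""
    return ms_ge(A, B) and not ms_ge(B, A)

assert ms_gt([1, 1], [1]) and ms_gt([1], [])
assert ms_gt([2, 2], [2, 1]) and ms_gt([2, 1], [1, 1])
# unlike the usual multiset ordering, {2,2} > {1,1,1,1} fails here:
assert not ms_ge([2, 2], [1, 1, 1, 1])
```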
□

Exercise 5.8.8 Given a poset P, let ⊥ be a new element not already in P, and let P_⊥ denote the poset having underlying set P ∪ {⊥} with the ordering of P plus ⊥ < p for all p ∈ P. Show that P_⊥ is Noetherian if P is. Now given a Noetherian poset P with a unique least element ⊥, form ⋉P = P ⋉ P, and identify p ∈ P with the element p ⋉ ⊥ ∈ ⋉P, so that there is an order-preserving inclusion P ⊆ ⋉P. Iterate this to obtain ⋉ⁿP ⊆ ⋉ⁿ⁺¹P, and note that each ⋉ⁿP is Noetherian by induction and Proposition 5.8.26. Now form ⋉^ωP = ∪ₙ ⋉ⁿP, show that ⋉^ωP corresponds to the usual lexicographic ordering on the set of finite strings from P, and give an example showing that ⋉^ωP in general is not Noetherian. Hint: If b > a, then b > ab > aab > aaab > ···. □

A straightforward generalization of Proposition 5.8.19 requires defining a weight function ρ : T_Σ → P where P is a Noetherian poset, and then showing that each new rewrite is strict ρ-monotone and each old rewrite is weak ρ-monotone. A less straightforward generalization weakens the assumption that P is Noetherian to assuming that each particular item, and everything to which it can be rewritten, lies within some Noetherian subposet of P.

Proposition 5.8.31 Let A be an ARS on an (S-indexed) set T, let B be a terminating "base" ARS contained in A, let N denote the "new" rewrites of A on T (i.e., →_N = →_A − →_B), and let P be a poset.
Then A is terminating if there is a function ρ : T → P such that

(1) if t →_B t′ then ρ(t) ≥ ρ(t′),
(2) if t →_N t′ then ρ(t) > ρ(t′), and
(3) P is Noetherian, or if not, then for each t ∈ T_s there is a Noetherian poset P_{ts} ⊆ P_s such that t →*_A t′ implies ρ(t′) ∈ P_{ts}.

(When T is S-indexed, P is really a family {P_s | s ∈ S} of posets, although in practice they are often all the same.)

Proof: By exactly the same reasoning that was used for Proposition 5.8.19. □

Example 5.8.32 (⋆) We show termination of the GCD CTRS of Example 5.8.24 using Proposition 5.8.31 with a rather complex ordering. To define this ordering, for a given poset P, let

N(P) = (P × P) + ((ω × ω) ⋉ (P × P)) + M(P),

with its ordering given by Definition 5.8.25. Then N(P) is Noetherian if P is, by Propositions 5.8.26 and 5.8.29, and because P ⊆ M(P) is an order-preserving inclusion, so is P ⊆ N(P). Therefore, defining N^{n+1}(P) = N(N^n(P)) with N⁰(P) = P, we have that each N^n(P) is Noetherian if P is. However, the union N^ω(P) of the chain

P ⊆ N(P) ⊆ N²(P) ⊆ ··· ⊆ N^n(P) ⊆ ···

is in general not Noetherian, for the reasons considered in Exercise 5.8.8.

The case in which we are most interested takes P = ω_⊥. For each N^n(ω_⊥), identify p ∈ P with p ⋉ ⊥ ∈ ⋉P, and with p ⋉ ⊥ ⋉ ⊥ ∈ ⋉²P, etc., recursively as in Exercise 5.8.8, and also identify ⊥ in P with (⊥, ⊥) in (P × P) and with ∅ in M(P); then the result of these identifications is Noetherian for each n, by Exercise 5.8.7. Finally, for all p, q, add the inequalities

1. (p, q) > p, q if p, q ≠ ⊥
2. (m, n) ⋉ (p, q) > p, q if p, q ≠ ⊥.

To simplify notation, denote the resulting poset at level n by N_n and the union by N; let N₀ = {∅}.
We leave the reader to check that the above new inequalities do not violate the poset axioms or the Noetherian condition for each N_n.

In order to apply Proposition 5.8.31, let T be T_Σ where Σ is the total signature of GCD, let A consist of all rewrites induced on T by the rules in GCD, let B consist of all rewrites induced on T by the rules in NAT, and let N = A − B. We must define a weight function ρ : T → N such that each rewrite in N is strict ρ-monotone, such that each rewrite in B is weak ρ-monotone, and such that for each Σ-term t, everything to which it can be rewritten lies within some fixed Noetherian subposet N_t of N. We take N_t to be the poset that was denoted N_n above, with n = d, where d is the maximum depth of nesting of gcd's inside of t. In the following, we let T_d denote the set of ground Σ-terms of maximum nesting depth not greater than d, we let Σ′ denote the signature of NAT, and we let g abbreviate gcd. Notice that for any d, rewriting on T_d with A always remains within T_d, because none of the rules in GCD or NAT can increase the depth of nesting of gcd's in terms.

We give a recursive definition for ρ, in which a subterm of t ∈ T is called top if it is a maximal subterm of t having g as its head, and as before we write [t] for ρ(t):

(a) [n] = ∅ (the empty multiset) if n is a Σ′-term
(b) [g(t, t′)] = (m, n) ⋉ ([t], [t′]), where t, t′ reduce to Peano terms m, n
(c) [t] = {[t₁], ..., [tₙ]} if t is not top and t₁, ..., tₙ are its top subterms.

By a "Peano term" we mean a term of the form s ... s 0.
We can show that rewriting on T_{Σ′} always terminates with such a term using an argument like that given in Example 5.5.5, and then we can show that B is terminating on T_Σ with an argument like that given in Theorem 5.8.20 on page 136, noting that Σ is non-void and using Proposition 5.3.4. To apply (b), we need to know that the Σ-terms t and t′ reduce to Peano terms under A, whereas in general we don't even know whether rewriting with A terminates for arbitrary Σ-terms. Therefore we should demonstrate termination with a Peano term result along with conditions (1) and (2) of Proposition 5.8.31, as part of our induction on the maximum depth of nesting of gcd's in Σ-terms. (Generalizing the proof in Example 5.8.24 to poset weights shows that the generalization of Proposition 5.8.19 to poset weights, stated below as Theorem 5.8.33, cannot be made to work for this example.)

Our induction hypothesis is the conjunction of four subsidiary hypotheses: (A_d) rewriting on T_d with B preserves weight; (B_d) rewriting on T_d with N is strict monotone; (C_d) the weights of terms in T_d always lie in N_d; and (D_d) rewriting on T_d with A always terminates with a Peano term. Notice that (A_d) implies that rewriting with B is weak monotone.

The base case takes d = 0 and t ∈ T₀ = T_{Σ′}. By (a) of the definition of ρ, we have [t] = ∅; therefore rewrites with old rules are weight preserving, and weights remain within N₀ = {∅} because rewriting remains within T₀. Also, because no new rules can be applied to t ∈ T_{Σ′} and no operations from Σ − Σ′ can be introduced by rewriting Σ′-terms, rewrites using new rules are vacuously strict monotone, because there aren't any.

The induction step assumes the four induction hypotheses (A_d, B_d, C_d, D_d) for some d ≥ 0. We first prove a preliminary lemma, which says that any rewrite induced by applying a new rule at the top of a term in T_{d+1} is strict monotone.
For the first rule, g(M, 0) = M: because any t ∈ T_d reduces to a Peano term (say) m by (D_d), we get [g(t, 0)] = (m, 0) ⋉ ([t], ⊥), while for the rightside we get just [t]. Therefore [g(t, 0)] > [t] by inequality 2. The argument for the second rule, g(0, N) = N, is the same. Of the two conditional rules in GCD, we check only the first, because the second follows the same way. This rule is g(M, N) = g(M − N, N) if M ≥ N and N > 0. By (D_d), t, t′ reduce to Peano terms, say m, n; then [g(t, t′)] = (m, n) ⋉ ([t], [t′]), while for the rightside we have [g(t − t′, t′)] = (m − n, n) ⋉ ([t − t′], [t′]), and the desired inequality follows because n > 0, so that (m, n) > (m − n, n), and hence (m, n) ⋉ ([t], [t′]) > (m − n, n) ⋉ ([t − t′], [t′]), by the definitions of the product and lexicographic orderings.

The induction step for the first two inductive assertions has two cases. The first case considers t = g(t₁, t₂) with t₁, t₂ ∈ T_d. Then by (b), [t] = (n₁, n₂) ⋉ ([t₁], [t₂]), where the reduced forms of t₁, t₂ are respectively n₁, n₂, which are Peano terms by (D_d). For assertion (A_{d+1}), any application of an old rule is weight preserving because it rewrites either t₁ or t₂, which preserves the weight of t by (A_d). For assertion (B_{d+1}), any application of a new rule at the top is strict monotone by our lemma, and otherwise is strict monotone by (B_d).

The second case of the induction step considers a Σ-term t having depth d + 1 of the form t₀(g₁, ..., g_k), where k > 0, each g_i has the form g(t_{i,1}, t_{i,2}) with t_{i,j} ∈ T_d, and t₀ ∈ T_{Σ′}({z₁, ..., z_k}). Then [g_i] = (n_{i,1}, n_{i,2}) ⋉ ([t_{i,1}], [t_{i,2}]) as in the first case, and so we have

[t] = {(n_{1,1}, n_{1,2}) ⋉ ([t_{1,1}], [t_{1,2}]), ...
, (n_{k,1}, n_{k,2}) ⋉ ([t_{k,1}], [t_{k,2}])}

by (c). For assertion (A_{d+1}), once again any application of an old rule preserves the weight of t, because it either rewrites some t_{i,j}, which preserves the weight of t because it preserves the weight of g_i by (A_d), or else it rewrites within t₀, which also preserves the weight of t, because of (c). For assertion (B_{d+1}), any application of a new rule is strict monotone on any such t, either by (B_d), or else by the lemma.

Finally, we consider the remaining two inductive assertions. For (C_{d+1}), when rewriting with A on T_{d+1}, all weights remain within N_{d+1} = N(N_d) because of the form of [t] and (C_d). For (D_{d+1}), rewriting always terminates for terms in T_{d+1} by Proposition 5.8.31 with T = T_{d+1} and P = N_{d+1}, plus the induction hypotheses; moreover, the result must be a Peano term because of the form of [t], using (D_d) and (a).

It now follows that (A_d, B_d, C_d, D_d) hold for every d ≥ 0, and in particular that rewriting on any ground Σ-term t ∈ T = ∪_d T_d necessarily terminates with a Peano term as its result. □

The main result of Section 5.8.2, Theorem 5.8.20 on page 136, generalizes to Noetherian orderings, though we do not use this result in this book:

Theorem 5.8.33 Let (Σ, A) be a CTRS with Σ non-void, let (Σ, B) be a terminating sub-CTRS of (Σ, A), let P be a poset, and let N = A − B. If there is a function ρ : T_{Σ,B} → P such that

(1) each rule in B is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least, for each t ∈ (T_{Σ,B})_s there is some Noetherian poset P_{ts} ⊆ P_s such that t ⇒*_A t′ implies ρ(t′) ∈ P_{ts},

then (Σ, A, B) is ground terminating. □

The proof depends on the straightforward generalization from ρ : T_Σ → ω to ρ : T_Σ → P of results in Section 5.8.2, and hence is omitted here.
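The hint in Exercise 5.8.8 can be checked directly: Python's built-in string comparison is the usual lexicographic ordering, and with b > a it exhibits the infinite strictly decreasing chain b > ab > aab > ···, so that ordering is not Noetherian (an illustration, not from the text):

```python
# Lexicographic order on strings over {a, b}, with 'a' < 'b'.
# Prepending copies of the smaller letter keeps strictly decreasing:
chain = ['b', 'ab', 'aab', 'aaab', 'aaaab']
for hi, lo in zip(chain, chain[1:]):
    assert hi > lo   # Python compares strings lexicographically

# The chain extends forever, so there is no lower bound:
assert all('a' * k + 'b' > 'a' * (k + 1) + 'b' for k in range(100))
```

This is exactly why the constructions above stop at each finite level N_n rather than using the full union N, which need not be Noetherian.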
Proving Church-Rosser

When applying a conditional rule t = t′ if u == v over some theory A in OBJ, it is possible that a term r exists such that θ(u) ⇒*_A r and θ(v) ⇒*_A r, but rewriting does not find this r, because θ(u) and θ(v) reduce to different normal forms. In fact, it cannot be guaranteed that OBJ will always evaluate a condition u == v to true when θ(u) ↓ θ(v), unless the set of rules that can be used for evaluating conditions is both Church-Rosser and terminating. This provides some motivation for checking the Church-Rosser property in the conditional case. But as we continue to emphasize, because of soundness, any rewriting computation is a proof; so if you do get the result that you want, then you have proved the result that you wanted to prove, whether or not the CTRS is terminating or Church-Rosser. In our experience, practical examples can usually be handled without bothering to check canonicity, though of course it is comforting.

This subsection extends the techniques presented for proving confluence in Section 5.6 to the conditional case. Many basic results extend just because they follow directly from the corresponding results about ARS's, as discussed at the beginning of Section 5.8. The situation for Proposition 5.3.6 is slightly different; its proof, which involves passing to two different ARS's, generalizes to the conditional case without any change.

Proposition 5.8.34 Given a sort set S, let X^ω_S be the ground signature with (X^ω_S)_s = {x_{is} | i ∈ ω} for each s ∈ S. Then a CTRS (Σ, A) is Church-Rosser iff the CTRS (Σ(X^ω_S), A) is ground Church-Rosser. Similarly, (Σ, A) is locally Church-Rosser iff the CTRS (Σ(X^ω_S), A) is ground locally Church-Rosser. □

Example 5.8.35 The analog of Proposition 5.8.12 for confluence is not true, not even for orthogonal CTRS's. Let C be the CTRS corresponding to the following equations (with x a variable):

f(x) = a if x = f(x)
b = f(b).

Then b ⇒ f(b) ⇒ a and f(b) ⇒* f(a).
However, it is not true that f(a) ↓ a; hence C is not Church-Rosser. However, C_U is Church-Rosser. Note also that C is orthogonal. (This example is due to Bergstra and Klop [10].) □

The Orthogonality Theorem (Theorem 5.6.4) can be generalized to CTRS's by generalizing the notion of non-overlapping, but we do not do so here, because orthogonality is a rather strong property, and in any case the generalization of the Newman Lemma handles most examples of practical interest; to maximize practicality, we give a hierarchical version that allows checking the key properties "incrementally," that is, one level at a time, as was previously done for termination.

Proposition 5.8.36 Let A be a terminating Σ-CTRS, let B be a "base" Church-Rosser CTRS contained in A, and let N = A − B. Then A is Church-Rosser (and hence canonical) if:

(1) N is locally Church-Rosser, i.e., if t ⇒_N t₁ and t ⇒_N t₂ then there is some t′ such that t₁ ⇒*_N t′ and t₂ ⇒*_N t′; and
(2) if t ⇒_B t₁ and t ⇒_N t₂ then there is some t′ such that t₁ ⇒*_N t′ and t₂ ⇒*_B t′.

Proof: (1) and (2) imply that A is locally Church-Rosser, and then the Newman Lemma gives the full Church-Rosser property. □

Note that by the original Newman Lemma, it would be equivalent to assume that B is locally Church-Rosser. Condition (2) is a local version of the Hindley-Rosen property from Proposition 5.7.5. We now give some applications of the above result.

Example 5.8.37 We first consider the maximum function of Example 5.8.13, which has already been shown terminating. Assume that the natural number part of this CTRS, here denoted B, has been shown Church-Rosser. Then it remains to check conditions (1) and (2). Condition (1) can be checked completely mechanically by using the Knuth-Bendix algorithm [117] (see Chapter 12), and condition (2) can be checked by a variant of the same algorithm. But here we give a rather informal argument, which will serve to motivate the more formal developments of Chapter 12. The idea is to determine which rules could give rise to the two given rewrites, and then show the existence of a suitable t′ for each such case. Note that unless the two rewrites overlap, it is straightforward to see that t′ exists.

We will abbreviate max by just m in the detailed arguments below. For (1), we suppose that t ⇒_N t₁ and t ⇒_N t₂. The only way that two new rules can overlap is if the redex has the form m(u, u) for some Σ-term u with t₁ = t₂ = u, so that we have t₁ ⇒*_N t′ and t₂ ⇒*_N t′ with t′ = u. For (2), we consider t ⇒_B t₁ and t ⇒_N t₂. But it is impossible for a new rule to overlap with a base rule in this specification, because the leftsides of the rules in B and N have disjoint function symbols; so there is nothing to check. Thus Proposition 5.8.36 implies that this specification is Church-Rosser, and hence canonical.

Now we consider the greatest common divisor function of Example 5.8.24, which was already shown terminating in Example 5.8.32. For (1), note that in this specification there is no overlap between new rules, so there is nothing to check. Similarly for (2), there is no overlap between new and old rules, because the leftsides involve disjoint function symbols, and so again there is nothing to check. Therefore Proposition 5.8.36 shows that this specification is Church-Rosser, and hence canonical. □

There is also a version of Proposition 5.8.36 that does not assume termination, but instead requires stronger confluence conditions; however, we are usually less interested in the Church-Rosser property if termination does not hold, so we do not give this result here.

Exercise 5.8.9 Given a CTRS C, let C_U be the TRS whose rules are those of C with their conditions (if any) removed. Then C is Church-Rosser (or ground Church-Rosser) if C_U is.
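The Newman-Lemma reasoning behind Proposition 5.8.36 can be illustrated on a small finite abstract rewriting system; this Python sketch (not from the book) computes normal forms by exhaustive search and confirms that a terminating, locally confluent relation gives every element a unique normal form:

```python
def normal_forms(x, step):
    """All normal forms reachable from x, where step(x) is the set
    of one-step rewrites of x (the relation is assumed terminating)."""
    nexts = step(x)
    if not nexts:
        return {x}
    return set().union(*(normal_forms(y, step) for y in nexts))

# A terminating, locally confluent ARS on {0, 1, 2, 3}:
#   3 -> 1, 3 -> 2, 1 -> 0, 2 -> 0
edges = {3: {1, 2}, 1: {0}, 2: {0}, 0: set()}
step = lambda x: edges[x]

# Local confluence holds (the peak 1 <= 3 => 2 is joined at 0), and
# every element has a unique normal form, as Newman's Lemma predicts:
assert all(len(normal_forms(x, step)) == 1 for x in edges)
assert normal_forms(3, step) == {0}
```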
□

(⋆) Relation between Abstract and Term Rewriting Systems

This section describes a relationship between abstract rewriting systems and term rewriting systems, including a construction of each from the other, and proves that these two constructions are the best possible with respect to each other, in a sense that is made precise by saying that they form an "adjoint pair of functors." However, the notion of adjointness is not needed to understand our statement of this result, and in fact no category theory at all is used in this section, although we do mention categories and functors in some exercises. (A category is just a collection of objects, such as TRS's, and maps between them, also often called morphisms, satisfying certain axioms. There are many places to learn about these concepts, including [91, 6] and [126]; [63] discusses the intuitive meanings of these and other categorical concepts.)

We can say much more about the relationship between TRS's and ARS's than in Section 5.7 after we introduce morphisms of TRS's and ARS's. First, we add a little more information to TRS's, by including a sort s from the signature. Then TRS morphisms are interpretations (in the sense of Definition 4.10.4) that preserve the designated sort and all one-step rewrites.

Definition 5.9.1 A TRS morphism (Σ, A) → (Σ′, A′) is a signature morphism h : Σ → Der(Σ′) such that whenever t₁ ⇒_A t₂ then h̄(t₁) ⇒_{A′} h̄(t₂), where h̄ is as defined just after Definition 4.10.3: the unique Σ-homomorphism to T_{Σ′} obtained by looking at T_{Σ′} first as a Der(Σ′)-algebra, and then, through h, as the reduct Σ-algebra hT_{Σ′} (see page 84 for this). Given TRS morphisms h : (Σ, A) → (Σ′, A′) and h′ : (Σ′, A′) → (Σ″, A″), their composition h ; h′ : (Σ, A) → (Σ″, A″) is defined to be their composition as derivors in the sense of Definition 4.10.5. □

Exercise 5.9.1 Show that the composition of TRS morphisms is a TRS morphism. Use Exercise 4.10.5 to show that i_Σ : Σ → Der(Σ) (sending σ ∈ Σ_{w,s} to the term σ(x₁, ...
, xₙ) ∈ T_Σ(X_w)_s) is the identity for TRS morphism composition. Show that TRS morphism composition is associative (whenever the compositions involved are defined). These results show that TRS's form a category;¹ let us denote it TRS. ∎

Definition 5.9.2 An ARS morphism (S, T, →) → (S′, T′, →′) is a pair (f, g) where f: S → S′ and g_s: T_s → T′_{f(s)} for each s ∈ S, such that t₁ → t₂ implies g(t₁) →′ g(t₂). Given ARS morphisms (f, g): (S, T, →) → (S′, T′, →′) and (f′, g′): (S′, T′, →′) → (S″, T″, →″), their composition is the pair (f;f′, {g_s; g′_{f(s)} | s ∈ S}): (S, T, →) → (S″, T″, →″). ∎

¹A category is just a collection of objects (such as TRS's) and maps (also often called morphisms) between them, such that certain axioms are satisfied. There are many places to learn about these concepts, including [91, 6] and [126]; [63] discusses the intuitive meanings of these and other categorical concepts.

Exercise 5.9.2 Show that the composition of ARS morphisms is an ARS morphism. Show that the pair of identity maps (1_S, 1_T) serves as an identity for (S, T, →) under ARS morphism composition. Show that ARS morphism composition is associative (when the compositions involved are defined). These results show ARS's form a category; let us denote it ARS. ∎

Now we are ready for the first of our two main constructions:

Definition 5.9.3 Let R send a TRS (Σ, A) to the ARS (T_Σ, ⇒_A), and send a TRS morphism h: (Σ, A) → (Σ′, A′), i.e., h: Σ → Der(Σ′), to R(h) = (f, h̄): T_Σ → T_{Σ′}, where f is the sort component of the signature morphism h.
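On finite data, the conditions of Definition 5.9.2 can be checked mechanically. The encoding below is ours, not the book's: an ARS is a triple of a sort set, a sorted carrier, and a set of steps (sort, t₁, t₂); a morphism is the pair (f, g) and must send every step to a step.

```python
def is_ars_morphism(A, B, f, g):
    """Check the ARS morphism conditions for (f, g): A -> B on finite data."""
    sorts_A, carriers_A, rel_A = A
    sorts_B, carriers_B, rel_B = B
    for s in sorts_A:
        if f[s] not in sorts_B:
            return False
        # g_s must land in the carrier of the image sort f(s)
        if any(g[s][t] not in carriers_B[f[s]] for t in carriers_A[s]):
            return False
    # every step t1 -> t2 must be sent to a step g(t1) ->' g(t2)
    return all((f[s], g[s][t1], g[s][t2]) in rel_B for (s, t1, t2) in rel_A)

A = ({'n'}, {'n': {0, 1, 2}}, {('n', 2, 1), ('n', 1, 0)})
B = ({'m'}, {'m': {'a', 'b'}}, {('m', 'b', 'a'), ('m', 'a', 'a')})
f = {'n': 'm'}
g = {'n': {0: 'a', 1: 'a', 2: 'b'}}
assert is_ars_morphism(A, B, f, g)
# collapsing 2 -> 1 onto a non-step breaks the condition:
assert not is_ars_morphism(A, B, f, {'n': {0: 'b', 1: 'a', 2: 'a'}})
```

The same checker, with the membership test replaced by reachability in rel_B, gives the generalized morphisms of Exercise 5.9.5, where a step may map to a many-step rewrite.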
∎

Exercise 5.9.3 Show that R(h) is an ARS morphism, and that R: TRS → ARS preserves composition and identities, i.e., that it is a functor. Hint: To show that R(h) is an ARS morphism, check that t₁ ⇒_A t₂ implies h̄(t₁) ⇒_{A′} h̄(t₂). ∎

Here is our second construction:

Definition 5.9.4 Let F send an ARS (S, T, →) to (Σ^T, A_→), where Σ^T is defined by Σ^T_{[],s} = T_s and Σ^T_{w,s} = ∅ for all other w, s, and where A_→ contains a rewrite rule (∀∅) t₁ = t₂ iff t₁ → t₂ in (S, T, →). Also, if (f, g): (S, T, →) → (S′, T′, →′) is an ARS morphism, then define F(f, g): (Σ^T, A_→) → (Σ^{T′}, A_{→′}) to be h: Σ^T → Der(Σ^{T′}) defined by h_s(t) = g_s(t) ∈ Σ^{T′}_{[],f(s)} = T′_{f(s)} for t ∈ Σ^T_{[],s} = T_s. ∎

Exercise 5.9.4 Show that F(f, g) is a TRS morphism, and that F: ARS → TRS preserves composition and identities, i.e., that it is a functor. Hint: To show that F(f, g) is a TRS morphism, show that t₁ ⇒_{A_→} t₂ implies h̄(t₁) ⇒_{A_{→′}} h̄(t₂), where h = F(f, g). ∎

Before stating the main result, we need the following:

Fact 5.9.5 R(F(A)) = A, for any ARS A.

Proof: If A = (S, T, →), then F(A) = (Σ^T, A_→), where Σ^T_{[],s} = T_s and Σ^T_{w,s} = ∅ for all other w, s, and where the rewrite rule (∀∅) t₁ = t₂ is in A_→ iff t₁ → t₂ in A. Then R(F(A)) = (T_{Σ^T}, ⇒_{A_→}) = (T, →). ∎

The theorem below says that the functor F is left adjoint to R, but it is stated as a so-called "universal property" that does not use any category theory. (Figure 5.5 shows the traditional commutative diagram for this property.)

[Figure 5.5: Universal Property of F(A) — commutative diagram with (f, g): A → R(T), R(u): R(F(A)) → R(T), and u: F(A) → T]

Theorem 5.9.6 For every ARS A, TRS T, and ARS morphism (f, g): A → R(T), there is a unique TRS morphism u: F(A) → T such that R(u) = (f, g).
Proof: Let A be (S, T, →) and let T be (Σ′, A′). Then F(A) = (Σ^T, A_→). If we assume that R(u) = (f, g) as morphisms (S, T, →) → (T_{Σ′}, ⇒_{A′}), then for u: (Σ^T, A_→) → (Σ′, A′), which is really u: Σ^T → Der(Σ′), we must have u_{[],s}(t) = g_s(t) for each t ∈ Σ^T_{[],s} = T_s, and that all the other maps Σ^T_{w,s} → T_{Σ′}(X_w)_s are empty. Furthermore, with this definition of u, we have that R(u) = (f, ū): T_{Σ^T} = T → T_{Σ′}, where ū_s(t) = g_s(t) for t ∈ T_s, so that indeed R(u) = (f, g). Finally, to show that u is a TRS morphism, we must show that t₁ ⇒_{A_→} t₂ implies ū(t₁) ⇒_{A′} ū(t₂). But t₁ ⇒_{A_→} t₂ iff t₁ → t₂, and since (f, g) is an ARS morphism, from this we get g_s(t₁) ⇒_{A′} g_s(t₂), which occurs iff ū(t₁) ⇒_{A′} ū(t₂). ∎

Exercise 5.9.5 Substitute "t₁ ⇒_A t₂ implies h̄(t₁) ⇒*_{A′} h̄(t₂)" for "t₁ ⇒_A t₂ implies h̄(t₁) ⇒_{A′} h̄(t₂)" in Definition 5.9.1, substitute "t₁ → t₂ implies g(t₁) →′* g(t₂)" for "t₁ → t₂ implies g(t₁) →′ g(t₂)" in Definition 5.9.2, and then show that Theorem 5.9.6 still holds. Give an interpretation for this result. Show that the local Church-Rosser property is not preserved by either F or R when morphisms are generalized in this way. ∎

5.10 Literature

Term rewriting captures a basic computational aspect of equational logic, and is fundamental for theorem proving. However, expositions of term rewriting typically have a combinatorial, syntactic flavor, rather than an algebraic, semantic flavor. This is due in part to the historical fact that term rewriting arose as an abstraction of the lambda calculus, especially the so-called Normalization (i.e., Church-Rosser) Theorem, which was first proved by Church and Rosser, and which is the origin of the "Church-Rosser property."

There is a very large literature on term rewriting.
This chapter does not faithfully represent that literature, because it emphasizes results that are of practical value for theorem proving, as opposed to results that are largely of theoretical interest, and it leans heavily towards algebra. Huet and Oppen gave a good survey that developed some of the connections with algebra [109]. Klop [114, 115], Dershowitz and Jouannaud [39], and Plaisted [151] have also written useful surveys; the latter two describe some more recent developments. A nice self-contained introductory textbook has been written by Baader and Nipkow [2]. Newman proved his lemma for the unsorted case in 1942 [143]; the elegant proof given here is due to Barendregt [3], but generalized to overloaded many-sorted rewriting. Theorem 5.2.9 is from [56]; it expresses a fundamental connection between term rewriting and algebra. The Noetherian condition is named after Emmy Noether, the great pioneer in abstract algebra mentioned in Section 2.8.

Combinatory logic was developed by Schönfinkel [160] to eliminate bound variables from predicate logic, and was later independently developed further by Haskell Curry as a foundation for mathematics that he called "Illative Combinatory Logic" [37, 38]. Combinatory logic also plays an important role in implementing functional programming languages, as described in [4], [178] and many other places. The so-called categorical combinators developed more recently by Curien and others have played a similar role [36].

Although not discussed here, the lambda calculus is a TRS closely related to combinatory logic. It was developed by Alonzo Church [28] as a calculus of functions, again as part of a foundational programme for mathematics. This TRS played a key role in formalizing the notion of computability (the so-called Church-Turing thesis), and following work of Landin [120] and Strachey [172], it became the basis for the "denotational semantics" [162] method for defining the meaning of programming languages.
Lambda calculus has also been an important influence on the design of programming languages, including Lisp [131] and more recently, higher-order functional languages like ML [99], Miranda [179], and Haskell [107]. Term rewriting plays a basic role in proving properties of abstract data types, including their correctness and implementation, and also in the study of their computability properties [137]. In addition, term rewriting has played an important role in developing languages that combine the functional and logic paradigms, through an operational semantics based on so-called narrowing [79, 40, 105]. As far back as 1951, Evans [46] used term rewriting to prove the decidability of the equational theory called "loops."

Most expositions of term rewriting do not make the signature explicit, so that the distinction between (for example) confluence and ground confluence can seem mysterious, and various confusions can easily arise. Similarly, it is not usual to be careful about the variables and constants involved in rewriting a given term. The results of Propositions 5.3.4 and 5.3.6, and of Corollary ??, which address these issues, do not seem to be in the literature, nor do the corresponding results for the conditional case, Propositions 5.8.10 and 5.8.11.
This is presumably because they cannot even be stated without the additional care for variables and constants that we have taken.

Section 5.4 on evaluation strategies has been largely taken from [90]. There is an interesting literature on proving termination of rewriting when operations have local strategies; for example, see [50], which cites many other papers.

The unsolvability of equality mentioned in connection with Proposition 5.1.14 is shown by the unsolvability of the so-called word problem for groups, posed by Max Dehn in 1911: given a group presentation, determine whether or not two terms over the generators are equal in that equational theory; this was shown unsolvable by Pyotr Novikov [144] and William Boone [16] in the 1950s. Unsolvability of equality also follows from the word problem for semigroups, posed by Axel Thue in 1914 and shown unsolvable by Emil Post in 1947 [153].

That orthogonality implies Church-Rosser (Proposition 5.6.4) has been proved many times, perhaps first by Rosen [158], but our version may be the first that goes beyond the unsorted case, and our proof also appears to be novel. The term "orthogonal" is due to Dershowitz. Many other results in this chapter are also new, in the same limited sense that they are proved for overloaded many-sorted rewriting. Theorem 5.6.9 and the notions of superposition and critical pair are part of a larger story about unification and the Knuth-Bendix method covered in Chapter 12; that material appears in this chapter because of its value for showing the Church-Rosser property. Hindley's original proof of the Hindley-Rosen Lemma appears in [103].

The literature includes several different notions of conditional rewriting; e.g., see the survey of Klop [115]. The join conditional rewriting approach of our Definition 5.8.2 is the most satisfactory for OBJ because it includes the computations done in the common case when == occurs in a condition.
The alternative notions are either less general, or else too general, for example, going beyond term rewriting by requiring conditions to be evaluated using the full power of equational deduction.

Although Propositions 5.5.1 and 5.8.16 are very simple, they express the fundamental relationship between termination and weight functions; they have not been emphasized in the literature, and may even be partially new. Results 5.8.18, 5.8.19, 5.8.20, and 5.8.33 all appear to be new, and have practical value for proving termination, especially Theorems 5.8.20 and 5.8.33, which also handle conditional rules.

The results on constructing Noetherian orderings in Section 5.8.3 are standard, although the particular constructions given for the multiset and lexicographic orderings may be new. The observation that colimits appear in several places is new, as is the termination criterion in Theorem 5.8.31 and its application in Example 5.8.32. Though the proof in this example is a bit elaborate for a result that is intuitively relatively obvious, it does provide a fairly thorough illustration of the machinery introduced in Section 5.8.3. The material on the Church-Rosser property in Section 5.8.4 may be new; although the special case of Proposition 5.8.36 with B = ∅ is of course familiar, the generalization to hierarchical CTRS's is very useful in practice.

The use of ARS's to study TRS's is standard in the literature, although terminology and definitions vary. Klop [114, 115] considers sets with an indexed family of relations (as in Proposition 5.7.5), calling them "abstract reduction systems," and using the name "replacement system" for the case of just one relation, which we call an abstract rewrite system; actually, our formulation is a bit more general, because it is S-indexed, which enables some novel applications to many-sorted term rewriting and equational deduction.
The results in Section 5.9 are new, especially Theorem 5.9.6 on the adjoint relation between ARS's and TRS's. This material suggests many questions for further research, such as exploring properties of the two categories involved, and more ambitiously, reformulating term rewriting theory in a more categorical style.

José Meseguer [134] has developed rewriting logic, which gives sound and complete rules of inference for term rewriting; these rules are the same as those for equational deduction, except that the symmetry law is omitted. This logic can also be seen as a logic for the term rewriting model of computation, and as such has many interesting applications, including a comprehensive unification of different theories of concurrency, a nice operational semantics for inference systems, and a uniform meta-logic in which inference systems can be described and implemented [29].

I thank Prof. Virgil-Emil Căzănescu for his help with the proofs of Propositions 5.2.6 and 5.3.4, and Dr. Răzvan Diaconescu for help with the proof of Theorem 5.9.6. I also thank José Barros and Răzvan Diaconescu for their help with some of the examples, Kai Lin for several very useful discussions, as well as for significant help with the examples in Section 5.8.2, and Grigore Roşu for the proof of the Orthogonality Theorem (Theorem 5.6.4) in Appendix B, and for several valuable suggestions. Finally, I thank Ms. Chiyo Matsumiya and especially Prof. Yoshihito Toyama, and Dr. Monica Marcus, for their valuable comments and corrections to this chapter. The proof of Theorem 5.6.9 in Appendix B is due to Dr. Marcus.

A Note to Lecturers: This chapter contains a great deal of material, some of which is rather difficult. Except in the case of an advanced course of some duration, the lecturer will have to omit a fair amount, certainly including all the starred sections.
Beyond that, the material to be covered may be determined by the taste of the lecturer and the choice of material to be covered from later chapters. In particular, it is safe to omit most of the detailed material on proving termination and the Church-Rosser property, since little of that is needed for later chapters. It could also be a good idea to interleave material from this chapter with parts of Chapter 6, to create a bit more variety.

6 Initial Algebras, Standard Models and Induction

This chapter shows that every equational specification has an initial algebra, gives further characterizations for these structures, and justifies and illustrates the use of induction for verifying their properties. It also investigates abstract data types and standard models for equational specifications, showing that they are initial models. Congruence and quotients are important technical tools, and we prove some of their main properties.

6.1 Quotient and Initiality

This section discusses congruences, quotients, initial and free algebras satisfying equations, and then substitutions modulo equations. Main results include the so-called homomorphism theorem, the universal characterization of quotients, and the existence of initial algebras.

Initial algebras for specifications with equations are constructed as quotients of term algebras, a construction that relies upon the following:

Definition 6.1.1 A Σ-congruence relation on a Σ-algebra M is an S-sorted equivalence relation ≡ = {≡_s | s ∈ S} on M, for S the sort set of Σ, where each ≡_s is an equivalence relation on M_s, such that whenever σ ∈ Σ_{s₁…sₙ,s}, then a_i ≡_{s_i} a′_i for i = 1, …, n implies M_σ(a₁, …, aₙ) ≡_s M_σ(a′₁, …, a′ₙ), for a_i, a′_i ∈ M_{s_i} for i = 1, …, n. ∎

Example 6.1.2 Define a signature Σ by the OBJ fragment

  sorts Nat Bool .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  ops T F : -> Bool .
  op odd : Nat -> Bool .
and let N be the Σ-algebra with N_Nat = ω, with N_Bool = {T, F}, and with the operations interpreted as expected. Then we can define a Σ-congruence Q₁ on N as follows: n Q₁,Nat n′ iff n − n′ is divisible by 8; and b Q₁,Bool b′ iff b = b′. We can also define another Σ-congruence Q₂ on N as follows: n Q₂,Nat n′ iff n − n′ is divisible by 2; and b Q₂,Bool b′ iff b = b′. ∎

Exercise 6.1.1 In the context of Example 6.1.2, prove that Q₁ and Q₂ are Σ-congruences on N. Now define Q₃ to mean having the same remainder under division by 3, and show that Q₃ is not a Σ-congruence on N. ∎

Proposition 6.1.3 Given a Σ-algebra M and a Σ-congruence ≡ on M, the quotient of M by ≡, denoted M/≡, is a Σ-algebra, interpreting constant symbols σ ∈ Σ_{[],s} as [M_σ], and operations σ ∈ Σ_{s₁…sₙ,s} with n > 0 as sending [a₁], …, [aₙ] to [M_σ(a₁, …, aₙ)], for a_i ∈ M_{s_i}.

Proof: We have to show that [M_σ(a₁, …, aₙ)] is well defined. So let us assume, for a_i, a′_i ∈ M_{s_i}, that a_i ≡_{s_i} a′_i for i = 1, …, n, i.e., that [a_i] = [a′_i]. Then the definition of congruence gives us that [M_σ(a₁, …, aₙ)] = [M_σ(a′₁, …, a′ₙ)]. ∎

Example 6.1.4 The equivalence classes of ground Σ-terms under a set A of Σ-equations form a nice Σ-algebra, which is in fact a quotient of the term algebra by a congruence based on equational deduction, as follows: Given Σ-terms t, t′ with variables in X, let

  t ≃^X_A t′  iff  A ⊢ (∀X) t = t′,

where ≃^X_A for sort s has t, t′ also of sort s. That ≃^X_A is an equivalence relation follows directly from rules (1), (2), (3) of Definition 4.1.3 on page 58, which are the reflexivity, symmetry and transitivity of equational deduction, respectively. In the special case where X = ∅, we write just ≃_A instead of ≃^∅_A.
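Example 6.1.2 and Exercise 6.1.1 can be sanity-checked mechanically. The following sketch (ours, not the book's) tests, on a finite sample of naturals, whether the relation "n − n′ divisible by k" commutes with the operations s and odd; the interesting constraint comes from odd, which is well defined on the classes exactly when k is even.

```python
def respects(k, limit=50):
    """Does 'divisible by k' on naturals commute with s and odd (up to limit)?"""
    pairs = [(n, m) for n in range(limit) for m in range(limit)
             if (n - m) % k == 0]
    # compatibility with s: n ~ m implies s(n) ~ s(m)  (always true here)
    s_ok = all((n + 1 - (m + 1)) % k == 0 for n, m in pairs)
    # compatibility with odd: n ~ m implies odd(n) = odd(m)
    odd_ok = all(n % 2 == m % 2 for n, m in pairs)
    return s_ok and odd_ok

assert respects(8)        # Q1 is a congruence
assert respects(2)        # Q2 is a congruence
assert not respects(3)    # Q3 is not: e.g. 0 ~ 3 but odd(0) != odd(3)
```

Of course a finite test only refutes, never proves, the congruence property; the positive cases still need the easy proof asked for in Exercise 6.1.1.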
Now we take the quotient T_Σ/≃_A as an S-sorted set,¹ and let [t]_A, or (usually) just [t], denote the equivalence class of a term t under ≃_A.

For these equivalence classes to form a Σ-algebra, we need to give interpretations for the constant and operation symbols in Σ. It seems clear that we should interpret the constant symbol σ ∈ Σ_{[],s} as [σ], and interpret σ ∈ Σ_{s₁…sₙ,s} with n > 0 as sending [t₁], …, [tₙ] to [σ(t₁, …, tₙ)], where t_i ∈ T_{Σ,s_i} for i = 1, …, n. But it may not be clear that this definition makes sense. For, if we had picked some other t′₁, …, t′ₙ such that [t_i] = [t′_i], then we would need to know that

  [σ(t₁, …, tₙ)] = [σ(t′₁, …, t′ₙ)]

in order to know that the proposed interpretation for σ gives the same result, no matter which representatives we happen to have chosen for the equivalence classes. Translating back to the notation of equational deduction, the property we need is

  A ⊢ (∀∅) t_i = t′_i for i = 1, …, n implies A ⊢ (∀∅) σ(t₁, …, tₙ) = σ(t′₁, …, t′ₙ).

But this follows directly from the rule of deduction (4) of Definition 4.1.3: let X = {x₁, …, xₙ} with x_i of sort s_i, let Y = ∅, let θ(x_i) = t_i, let θ′(x_i) = t′_i, and let t = σ(x₁, …, xₙ); then (∀∅) σ(t₁, …, tₙ) = σ(t′₁, …, t′ₙ) is deducible, because θ(t) = σ(t₁, …, tₙ) and θ′(t) = σ(t′₁, …, t′ₙ). Thus T_Σ/≃_A is a Σ-algebra, in fact an initial (Σ, A)-algebra, though we do not prove it directly in this way.

¹Appendix C reviews this construction for S-sorted sets.
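As an unofficial concrete illustration of Example 6.1.4, take Σ with 0, s and +, and let A be the usual defining equations x + 0 = x and x + s(y) = s(x + y) (our choice of specification, not the book's). Each class of ground terms then contains exactly one numeral sⁿ(0), so evaluating a term to a number computes a canonical representative of its class; in particular, the interpretation of + on classes does not depend on which representatives are chosen.

```python
def eval_term(t):
    """Terms are nested tuples: ('0',), ('s', t), ('+', t1, t2).
    Returns the unique numeral (as an int) in the class of t."""
    if t[0] == '0':
        return 0
    if t[0] == 's':
        return 1 + eval_term(t[1])
    return eval_term(t[1]) + eval_term(t[2])     # the '+' case

Z = ('0',)
def S(t): return ('s', t)

t1 = ('+', S(Z), S(S(Z)))     # s(0) + s(s(0))
t2 = ('+', S(S(Z)), S(Z))     # s(s(0)) + s(0)
# two syntactically different terms in the same class of T_Sigma / ~_A:
assert eval_term(t1) == eval_term(t2) == 3
```

The well-definedness argument in the text is exactly what licenses this shortcut: since A proves t1 = t2 for ground terms with the same value, applying an operation to either representative lands in the same class.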
∎

Definition 6.1.5 Given a Σ-homomorphism h: M → M′, the kernel of h is the S-sorted family of equivalence relations ≡_h on M, defined on M_s by a ≡_{h,s} a′ iff h_s(a) = h_s(a′); the kernel of h may be denoted ker(h). The image of h, denoted im(h) or h(M), is the Σ-subalgebra of M′ with h(M)_s = h_s(M_s) for each s ∈ S, and with operations those of M′ suitably restricted. ∎

Proposition 6.1.6 The kernel of a Σ-homomorphism h: M → M′ is a Σ-congruence, and its image is a Σ-algebra.

Proof: Each ≡_{h,s} is an equivalence relation, for any S-indexed function h: M → M′. To prove the congruence property, let σ ∈ Σ_{w,s} with w = s₁…sₙ, and assume a_i ≡_{h,s_i} a′_i, i.e., that h_{s_i}(a_i) = h_{s_i}(a′_i) for i = 1, …, n. Then

  h_s(M_σ(a₁, …, aₙ)) = M′_σ(h_{s₁}(a₁), …, h_{sₙ}(aₙ)) = M′_σ(h_{s₁}(a′₁), …, h_{sₙ}(a′ₙ)) = h_s(M_σ(a′₁, …, a′ₙ)),

so that M_σ(a₁, …, aₙ) ≡_{h,s} M_σ(a′₁, …, a′ₙ), as desired.

For the second assertion, we first check condition (2) of the definition of subalgebra given in Exercise 3.1.1: let σ ∈ Σ_{w,s} with w = s₁…sₙ, let b_i ∈ h(M)_{s_i} for i = 1, …, n, and let a_i ∈ M_{s_i} be such that b_i = h_{s_i}(a_i) for i = 1, …, n. Then M′_σ(b₁, …, bₙ) ∈ h(M)_s, since M′_σ(b₁, …, bₙ) = h_s(M_σ(a₁, …, aₙ)). ∎

The following is one of the most important elementary results of general algebra. Due to Emmy Noether in its original form, called the "first isomorphism theorem," it relates homomorphisms, quotients, and subalgebras in a very elegant (and useful) way.

Theorem 6.1.7 (Homomorphism Theorem) For any Σ-homomorphism h: M → M′, there is a Σ-isomorphism M/ker(h) ≅_Σ im(h).
Proof: Let ≡ denote ker(h), let Q denote M/≡, and define f: Q → h(M) as follows: given some ≡-class c, let f(c) be h(m), where m is any element of M such that [m] = c; by the definition of ≡, if m₁, m₂ are two such elements, then h(m₁) = h(m₂), so that f is well-defined. Also, f is surjective, since for any h(m) ∈ h(M), we have f([m]) = h(m). So it remains to show that f is a Σ-homomorphism and is injective.

For the first, if σ ∈ Σ_{[],s} then f(Q_σ) = f([M_σ]) = h(M_σ) = M′_σ. Also, if σ ∈ Σ_{w,s} with w = s₁…s_k then

  f(Q_σ([m₁], …, [m_k])) = f([M_σ(m₁, …, m_k)]) = h(M_σ(m₁, …, m_k)) = M′_σ(h(m₁), …, h(m_k)) = M′_σ(f([m₁]), …, f([m_k])).

To show that f is injective, assume that [m₁] ≠ [m₂] but f([m₁]) = f([m₂]), which by definition of f means that h(m₁) = h(m₂), which by definition of ≡ means [m₁] = [m₂], contradicting our assumption. ∎

The following two corollaries and one exercise spell out some easy consequences of the above:

Corollary 6.1.8 If h: M → M′ is an injective Σ-homomorphism, then M is isomorphic to the subalgebra h(M) of M′. ∎

Corollary 6.1.9 If h: M → M′ is a surjective Σ-homomorphism, then M′ is isomorphic to the quotient M/ker(h) of M. ∎

Exercise 6.1.2 Show that the converses of the above two corollaries also hold, i.e., show that M is isomorphic to a subalgebra of M′ iff there is an injective Σ-homomorphism h: M → M′, and show that M′ is isomorphic to a quotient of M iff there is a surjective Σ-homomorphism h: M → M′. ∎

Example 6.1.10 There is a nice example of the homomorphism theorem in automaton theory. Define a state system to consist of an input set X, a state set Z, and a transition function t: X × Z → Z; it is conventional to use a tuple notation (X, Z, t) for such systems.
Recall that X* is the set of all finite sequences from X, with the empty sequence denoted []. We can extend t to a function t: X* × Z → Z by defining t([], z) = z and t(wx, z) = t(x, t(w, z)) for x ∈ X, w ∈ X*, z ∈ Z; this gives the state reached from z after a sequence of inputs. State systems with input set X are Σ-algebras with Σ the one-sorted signature having Σ₁ = X and Σₙ = ∅ for all n ≠ 1, where the "action" of x ∈ X on z ∈ Z is defined to be t(x, z). It is conventional to write x·z instead of t(x, z), and also to extend this notation to write w·z instead of t(w, z) for w ∈ X*.

Next, define an automaton to be a state system plus a function o: Z → Y, and define the behavior of an automaton A = (X, Z, t, o) at state z ∈ Z to be the function b_z: X* → Y defined by b_z(w) = o(t(w, z)). Now let B be the Σ-algebra of all possible behaviors for A, with carrier [X* → Y], by defining (x·b)(w) = b(xw) for x ∈ X, b ∈ B, w ∈ X*. Then the function b that sends z ∈ Z to b_z ∈ B is a Σ-homomorphism, and thus Theorem 6.1.7 gives the Σ-isomorphism

  A/ker(b) ≅_Σ im(b).

The Σ-algebra A/ker(b) is called the minimal realization of A; let us denote it M. The above isomorphism says that M is the state system with the minimal set of states that realizes the same behaviors as A. We can extend M to an automaton by defining M_o([z]) = o(z); the reader may check that this is well defined, in that if [z] = [z′] then o(z) = o(z′).

The literature often adds an initial state σ₀ ∈ Z to state systems and/or automata; then the signature Σ is extended by adding Σ₀ = {σ₀}, and the algebra B is extended by adding B_{σ₀} = b_{σ₀}. Attention is often restricted to automata that are reachable, in the sense that for every z ∈ Z there is some w ∈ X* such that w·σ₀ = z, and minimal realizations for such automata are again obtained by taking the quotient by the kernel of the behavior map.
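The quotient by the kernel of the behavior map can be sketched in Python on a toy machine of our own devising. Here behaviors are compared only on words up to a length bound, which suffices for this small automaton (in general a bound related to the number of states is enough).

```python
from itertools import product

# A toy automaton (our example): inputs X, states Z, transition t, output o.
X = ['a', 'b']
Z = [0, 1, 2, 3]
t = lambda x, z: (z + 1) % 4 if x == 'a' else z   # 'a' steps, 'b' idles
o = lambda z: z % 2                               # output set Y = {0, 1}

def run(w, z):
    """The state w . z reached from z after the input word w."""
    for x in w:
        z = t(x, z)
    return z

def behavior(z, max_len=4):
    """b_z restricted to words of length <= max_len."""
    return tuple(o(run(w, z))
                 for n in range(max_len + 1) for w in product(X, repeat=n))

# kernel classes of the behavior map = states of the minimal realization
classes = {}
for z in Z:
    classes.setdefault(behavior(z), []).append(z)

# states 0 and 2 behave alike, as do 1 and 3: two minimal states
assert sorted(map(sorted, classes.values())) == [[0, 2], [1, 3]]
```

The classes computed here are exactly the Nerode-style equivalence classes: the minimal realization has one state per class, with transitions and output inherited from any representative, which is well defined by the argument in the example.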
The kernel of b is often called the Nerode equivalence, after Anil Nerode, who first defined it and the minimal realization of a machine, though in a different way. ∎

A slightly more general notion of quotient than that developed above starts with an arbitrary relation on a Σ-algebra:

Definition 6.1.11 Given a Σ-algebra M and a subset R_s of M_s × M_s for each sort s of Σ, let ≡_R be the Σ-congruence generated by R on M, which is the least Σ-congruence on M that contains R, and let M/R denote the quotient M/≡_R. ∎

The relation ≡_R exists because any intersection of Σ-congruences on M containing R is another such, necessarily the least, and the intersection is non-empty because M × M is a Σ-congruence on M containing R. The following states a fundamental property of quotients:

Proposition 6.1.12 Given a Σ-algebra M and a relation R on M, the quotient map q: M → M/R satisfies the following:

[Figure 6.1: Proof for Uniqueness of Quotients]

(1) R ⊆ ker(q); and

(2) if h: M → B is a Σ-homomorphism such that R ⊆ ker(h), then there is a unique Σ-homomorphism u: M/R → B such that q;u = h.

Proof: For (1), it suffices to note that ker(q) is the least congruence containing R. For (2), Theorem 6.1.7 gives M/ker(h) ≅ im(h) ⊆ B, so R ⊆ ker(h) implies ker(q) ⊆ ker(h), which by Lemma 6.1.13 below implies that h factors as q;q′;i, where q′: Q → M/ker(h) with Q = M/R, and where i: M/ker(h) → B is given by the isomorphism M/ker(h) ≅ im(h) followed by the inclusion of im(h) in B. Therefore we get u = q′;i: Q → B such that q;u = h. To show uniqueness, if u′: Q → B is such that q;u′ = h, then q;u′ = q;u, and so the surjectivity of q implies u = u′.
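The least congruence ≡_R of Definition 6.1.11 can be computed by fixpoint iteration on a finite algebra. The sketch below (ours) handles a one-sorted algebra with unary operations, closing R under reflexivity, symmetry, transitivity, and compatibility with each operation.

```python
def congruence_closure(carrier, ops, R):
    """The least congruence containing R on a finite one-sorted algebra
    whose operations (all unary here) are the functions in ops."""
    rel = {(a, a) for a in carrier} | set(R) | {(b, a) for a, b in R}
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, b) in rel:
            for (c, d) in rel:
                if b == c and (a, d) not in rel:
                    new.add((a, d))                 # transitivity
            for f in ops:
                if (f(a), f(b)) not in rel:
                    new.add((f(a), f(b)))           # compatibility with f
        if new:
            rel |= new
            changed = True
    return rel

carrier = range(6)
succ = lambda n: (n + 1) % 6
rel = congruence_closure(carrier, [succ], {(0, 2)})
# identifying 0 with 2 forces n ~ n+2 for all n: classes {0,2,4} and {1,3,5}
assert (1, 3) in rel and (4, 0) in rel and (0, 1) not in rel
```

Symmetry is preserved automatically: the seed is symmetric, and both the transitivity and compatibility steps send symmetric relations to symmetric ones, so the fixpoint is an equivalence relation compatible with the operations, i.e. a congruence.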
∎

Lemma 6.1.13 Given Σ-congruences ≡, ≡′ on a Σ-algebra M with ≡ ⊆ ≡′, let Q, Q′ and q, q′ be the respective quotients and quotient maps for M. Then q′ factors as q;q″, where q″ is also surjective.

Proof: Define q″([m]) = [m]′, where [m]′ is the ≡′ congruence class of m ∈ M. This is well defined because if [m] = [m′] then [m]′ = [m′]′, since ≡ ⊆ ≡′. Then q″(q(m)) = [m]′ = q′(m). ∎

An assertion that a unique map exists satisfying certain conditions is often called a universal property; the above is an example, as are initiality assertions (e.g., Theorems 3.2.1, 3.2.10 and 6.1.15). In each case, the universal property characterizes a structure uniquely up to isomorphism. Proposition 6.2.1 shows this for initial algebras, and the following proves it for quotients:

Proposition 6.1.14 If both q: M → Q and q′: M → Q′ satisfy conditions (1) and (2) of Proposition 6.1.12, then Q and Q′ are isomorphic.

Proof: First notice that taking B = Q and h = q in Proposition 6.1.12, uniqueness implies that any u with q;u = q must be 1_Q, since 1_Q satisfies this condition. Now under our assumptions, we get Σ-homomorphisms u, u′ such that q;u = q′ and q′;u′ = q, from which it follows that q′;u′;u = q′ and q;u;u′ = q, which by our initial remark implies that u′;u = 1_{Q′} and u;u′ = 1_Q, so that Q and Q′ are isomorphic. See Figure 6.1. ∎

[Figure 6.2: Proof for Initiality with Equations]

The following is the main result of this section.
It says that given a set A of Σ-equations, there is a Σ-algebra T_{Σ,A}, also denoted T_P when P = (Σ, A), with the property that given any Σ-algebra M satisfying A, there is a unique Σ-homomorphism T_{Σ,A} → M; i.e., it says that every equational specification has an initial model. Note that, strictly speaking, we are dealing with sorted (or annotated) terms here, in the sense of Definition 3.2.9.

Theorem 6.1.15 (Initiality) Given a set A of (possibly conditional) Σ-equations, let ≡ be the Σ-congruence on T_Σ generated by the relation R having the components

  R_s = { ⟨t, t′⟩ | A ⊢ (∀∅) t = t′, where t, t′ are of sort s }.

Then T_Σ/R, denoted T_{Σ,A}, is an initial (Σ, A)-algebra.

Proof: Given any (Σ, A)-algebra M, let v: T_Σ → M be the unique homomorphism, and note that R ⊆ ker(v), because M ⊨ A implies M ⊨ (∀∅) t = t′ for every ⟨t, t′⟩ ∈ R, by the soundness of equational deduction. Now let v = q;u with u: T_{Σ,A} → M be the factorization of v given by Proposition 6.1.12; see Figure 6.2. This shows existence. For uniqueness, if also u′: T_{Σ,A} → M, then q;u′ = v by the initiality of T_Σ, and R ⊆ ker(v) because M ⊨ A. Therefore u = u′ by (2) of Proposition 6.1.12. ∎

(In fact, Example 4.3.8 shows that R is a Σ-congruence, there denoted ≃_A.)

Theorem 6.1.15 is very fundamental in algebraic specification; it is basic to the theory of abstract data types developed later in this chapter, as well as to the theory of rewriting modulo equations in Chapter 7, and several other topics. Moreover, it has a satisfying intuitive interpretation similar to that given for T_Σ in Section 3.2: we can view T_{Σ,A} as a kind of "universal language" of (simple expression-like) programs for the Σ-algebras that satisfy A, which are the "processors" that are able to correctly evaluate the programs in T_{Σ,A}.
Initiality says that every such program has a unique result on each such processor. Let's consider an example, which will help to motivate our approach to substitutions modulo equations in Section 6.1.3:

Example 6.1.16 Let Σ = Σ_GROUPL({a, b, c}) and let A contain just the associative law. Then expressions like a ∗ b ∗ c and (a ∗ a ∗ b ∗ b)⁻¹ have unique interpretations in any group in which a, b, c have been given interpretations. For example, if M is the group of non-zero rational numbers under multiplication with a = b = 2 and c = 3, then the first expression has value 12, while the second has value 1/16. □

We now generalize freeness to the case where there are equations. Given an S-sorted set X, define T_{Σ,A}(X) to be the algebra T_{Σ∪X}/≃_A viewed as a Σ-algebra, with the A-equivalence class of t denoted [t]_A. This algebra is called the free (Σ, A)-algebra generated by X, and as was the case with Theorem 3.2.1, it has the following important universal property:

Theorem 6.1.17 Given a set A of Σ-equations, a Σ-algebra M satisfying A, and an assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M that extends a, i.e., such that ā(x) = a(x) for all x ∈ X.

Proof: A (Σ∪X, A)-algebra M is exactly the same thing as a (Σ, A)-algebra M and an assignment a : X → M. For T_{Σ,A}(X), the assignment is the injective morphism i_X : X → T_{Σ,A}(X) that sends x to [x]_A. Theorem 6.1.15 implies that there is a unique (Σ∪X)-homomorphism ā : T_{Σ∪X,A} → M, which is exactly the same as saying that there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M that extends a. □

Exercise 6.1.3 Show that any two Σ-algebras that satisfy the freeness universal property of Theorem 6.1.17 are Σ-isomorphic, and show that any Σ-algebra that is Σ-isomorphic to T_{Σ,A}(X) also satisfies the same universal property.
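The "programs and processors" reading can be made concrete in a small sketch (Python rather than OBJ, with a term encoding and function names of our own devising): ground terms over the group signature are the "programs", any particular group is a "processor", and evaluation is the unique homomorphism into it. Two A-equivalent programs, such as different associations of a product, must then get the same result on every processor. The generator values below (a = b = 2, c = 3) are one possible choice, picked so that a ∗ b ∗ c evaluates to the value 12 mentioned in Example 6.1.16.

```python
from fractions import Fraction

# Ground terms over the group signature: a generator name, ("inv", t), or
# ("mul", t1, t2).  A "processor" is any group interpreting the generators;
# here, the non-zero rationals under multiplication.
# (Illustrative encoding, not from the text.)

def evaluate(term, env):
    """The unique homomorphism from terms into the model: evaluate recursively."""
    if isinstance(term, str):
        return env[term]
    op = term[0]
    if op == "inv":
        return 1 / evaluate(term[1], env)
    if op == "mul":
        return evaluate(term[1], env) * evaluate(term[2], env)
    raise ValueError(f"unknown operation {op}")

env = {"a": Fraction(2), "b": Fraction(2), "c": Fraction(3)}

# a * (b * c) and (a * b) * c are distinct terms but A-equivalent (associativity),
# so they must get the same value in every model satisfying A:
t1 = ("mul", "a", ("mul", "b", "c"))
t2 = ("mul", ("mul", "a", "b"), "c")
assert evaluate(t1, env) == evaluate(t2, env) == Fraction(12)

# (a * a * b * b)^-1, associated arbitrarily:
t3 = ("inv", ("mul", ("mul", "a", "a"), ("mul", "b", "b")))
assert evaluate(t3, env) == Fraction(1, 16)
```

Evaluation is defined on terms, but because it is a homomorphism into a model of A, it is constant on each A-equivalence class, so it is well defined on T_{Σ,A}.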
□

The following consequence of the above free algebra theorem and the homomorphism theorem will be used in Chapter 7:

Proposition 6.1.18 Every (Σ, A)-algebra is a quotient of a free (Σ, A)-algebra.

Proof: Let M be a (Σ, A)-algebra, and let |M| denote its underlying (S-indexed) carrier set. We will show that M is a quotient of the free algebra P = T_{Σ,A}(|M|). Let h be the unique Σ-homomorphism ā from P to M given by Theorem 6.1.17, with a the identity map on |M|. Notice that h is surjective, because a is already surjective. Then by Corollary 6.1.9, P/≡_h is isomorphic to M. □

We conclude this section with substitution modulo A, generalizing Section 3.5; this also is needed in Chapter 7.

Definition 6.1.19 Given a set A of Σ-equations, a substitution modulo A of (Σ, A)-terms with variables in Y for variables in X is an arrow a : X → T_{Σ,A}(Y); we may use the notation a : X → Y. The application of a to t ∈ T_{Σ,A}(X) is ā(t). Given substitutions a : X → T_{Σ,A}(Y) and b : Y → T_{Σ,A}(Z), then their composition (as substitutions), denoted a ; b, is the S-sorted arrow a ; b̄ : X → T_{Σ,A}(Z). □

Again as in Section 3.5, an alternative notation makes this look more familiar: Given t ∈ T_{Σ,A}(X) and a : X → T_{Σ,A}(Y) such that |X| = {x_1, …, x_n} and a(x_i) = [t_i]_A for i = 1, …, n, then ā(t) can also be written t(x_1 ← t_1, x_2 ← t_2, …, x_n ← t_n), and whenever t_i is just x_i, the pair x_i ← t_i can be omitted from this notation.

Exercise 6.1.4 The following assume a set A of Σ-equations.

1. If i_X : X → T_{Σ,A}(X) is the inclusion, show that ī_X([t]_A) = [t]_A for each [t]_A ∈ T_{Σ,A}(X).

2. Given a substitution a : X → T_{Σ,A}(Y), show that i_X ; a = a and that a ; i_Y = a; as before, this justifies writing 1_X for i_X.

3.
Show that substitution modulo A is associative, in the sense that given substitutions a : W → T_{Σ,A}(X), b : X → T_{Σ,A}(Y) and c : Y → T_{Σ,A}(Z), then (a ; b) ; c = a ; (b ; c). Hint: The "magical" proof for the ordinary case (Proposition 3.6.5) generalizes, using the free property of term algebras modulo A (Theorem 6.1.17).

4. Show that Corollary 3.6.6, asserting ā ; b̄ = (a ; b)‾, generalizes to substitution modulo A. □

6.2 Abstract Data Types

This section motivates abstract data types from the viewpoint of software engineering, then gives a precise definition for this concept, and finally proves some of its most basic properties, especially that an abstract data type is uniquely determined by its specification as an initial algebra, and that abstract data types are indeed abstract. Section 6.6 discusses some limitations of this chapter's approach, placing it along a path having important further developments.

It is well known that most of the effort in programming goes into debugging and maintenance (i.e., into improving and updating programs) [15]. Therefore anything that can be done to ease these processes has enormous economic potential. One step in this direction is to "encapsulate data representations"; this means to make the actual structure of data invisible, and to provide access to it only through a given set of operations which retrieve and modify the hidden data structure. Then the implementing code can be changed without having any effect on other code that uses it. On the other hand, if client code relies on properties of the representation, it may be extremely hard to track down all the consequences of modifying a given data structure (say, changing a doubly linked list to an array), because the client code may be scattered all over the program, without any clear identifying marks.
The so-called Y2K problem is one relatively dramatic example of this phenomenon.

An encapsulated data structure with its accompanying operations is called an abstract data type. The crucial advance was to recognize that operations should be associated with data representations; this is exactly the same insight that advanced algebra from mere sets to algebras, which are sets with their associated operations. In software engineering this insight seems to have been due to David Parnas [146], and in algebra to Emmy Noether [20, 127]. The parallel between developments in software engineering and in abstract algebra is a major subtheme of this chapter.

A theory of abstract data types should enable us to check whether or not implementations are correct, by verifying their properties. This chapter presents some of the basics of such a theory. Abstract data types also provide the foundation for many theorem-proving problems: before we can prove something about the natural numbers, or about lists, we need a precise characterization of the structure that is involved. Even results about groups often use the natural numbers. More elaborate problems in computer science, such as proving the correctness of a compiler, usually involve more elaborate data structures, such as queues, stacks, arrays, or lists of stacks of integers. We usually want such proofs to be independent of how the underlying data types happen to be represented; for example, we are usually not interested in properties of the decimal or binary representations of natural numbers, but instead are interested in abstract properties of the natural numbers, like the commutativity of addition.

We have already seen several examples where the Σ-term algebra T_Σ serves as a standard model for a specification P = (Σ, ∅) with no equations.
For example, if Σ = Σ_NATP (from Example 2.3.3) then T_Σ is the natural numbers in Peano notation, and if Σ = Σ_NATEXP (from Example 2.3.4) then T_Σ consists of all expressions formed from the operation symbols 0, s, + and *.

There are also many examples that need equations, such as Example 5.1.5, the natural numbers with addition, for which we now repeat the OBJ code:

  obj NATP+ is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat .
    op _+_ : Nat Nat -> Nat .
    vars N M : Nat .
    eq N + 0 = N .
    eq N + s M = s(N + M) .
  endo

Theorem 6.1.15 tells us that such specifications do indeed have initial models, in which the elements of the carriers are the equivalence classes of terms modulo the equations. However, Theorem 5.2.9 gives a different initial algebra for specifications that are also canonical as term rewriting systems, namely as the normal forms of terms under reduction. Moreover, a specification like NATP+ may well have still other representations that are preferred, such as natural numbers in the usual decimal positional notation. The choice of representation is just a matter of convenience, because all initial algebras are "essentially the same" in the sense that they are isomorphic algebras, as shown by the following:

Proposition 6.2.1 Given a specification P = (Σ, A), any two initial P-algebras are Σ-isomorphic; in fact, if M and M′ are two initial P-algebras, then the unique Σ-homomorphisms M → M′ and M′ → M are both isomorphisms, and indeed, are inverse to each other.

[Figure 6.3: Uniqueness of Initial Algebras]

Proof: The diagram in Figure 6.3 pertains to this proof.
Because M and M′ are both initial, there are Σ-homomorphisms f : M → M′ and g : M′ → M. Thus there are Σ-homomorphisms f ; g : M → M and g ; f : M′ → M′. But because the identity on M is a Σ-homomorphism and there is a unique Σ-homomorphism from M to M by the initiality of M, we necessarily have f ; g = 1_M. Similarly, g ; f = 1_M′. □

For example, if P = (Σ, A) is NATP+, then the Σ-algebra N_P of normal forms under A (of Theorem 5.2.9) and the Σ-algebra T_Σ/≃_A of equivalence classes of ground terms under A are isomorphic, and in fact, both are isomorphic to ω.

The following result shows that satisfaction of an equation by an algebra is an "abstract" property, in the sense that it is independent of how the algebra happens to be represented; more precisely, it is invariant under isomorphism. This is fortunate, because these are usually the properties in which we are most interested. This and Proposition 6.2.1 imply that exactly the same equations are true of any one initial P-algebra as any other.

Proposition 6.2.2 Given isomorphic Σ-algebras M and M′, and given a Σ-equation e, then M ⊨ e iff M′ ⊨ e.

Proof: Let h : M → M′ be an isomorphism, let e be (∀X) t = t′ and let a : X → M be an interpretation of X in M. Then ā(t) = ā(t′) implies h(ā(t)) = h(ā(t′)). Moreover, any interpretation b : X → M′ is of the form a ; h for some a : X → M, namely a = b ; g, where g is the inverse of h. Hence ā(t) = ā(t′) for all a : X → M implies b̄(t) = b̄(t′) for all b : X → M′. The converse implication follows by symmetry. □

The word "abstract" in the phrase "abstract algebra" means "uniquely defined up to isomorphism." In abstract group theory, we are not interested in properties of representations of groups, but only in those that hold up to isomorphism.
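Proposition 6.2.2 can be illustrated concretely (a Python sketch with encodings of our own devising, not from the text): take two representations of the NATP+ initial algebra, Peano normal forms and machine integers, related by an isomorphism h, and observe on a few instantiations how satisfaction of an equation transfers along h. Checking finitely many points is of course only an illustration, not a proof.

```python
# Two representations of the NATP+ initial algebra.
# M:  Peano normal forms, encoded as strings "0", "s0", "ss0", ...
# M': Python ints.
# (Illustrative encodings, not from the text.)

def plus_m(x, y):          # addition on Peano strings: s^m 0 + s^n 0 = s^(m+n) 0
    return "s" * (len(x) + len(y) - 2) + "0"

def plus_m2(x, y):         # addition on ints
    return x + y

def h(x):                  # the isomorphism h : M -> M', counting the s's
    return len(x) - 1

# h is a homomorphism: it maps 0 to 0 and commutes with +.
assert h("0") == 0
for x in ["0", "s0", "sss0"]:
    for y in ["0", "ss0"]:
        assert h(plus_m(x, y)) == plus_m2(h(x), h(y))

# Satisfaction transfers along h: commutativity holds for an instantiation
# a in M iff it holds for the corresponding instantiation a;h in M'.
for x in ["0", "s0", "ssss0"]:
    for y in ["0", "sss0"]:
        holds_in_m  = plus_m(x, y) == plus_m(y, x)
        holds_in_m2 = plus_m2(h(x), h(y)) == plus_m2(h(y), h(x))
        assert holds_in_m == holds_in_m2
```

The inverse g of h (sending n to "s" * n + "0") witnesses that the two representations differ only in notation, which is exactly the sense in which satisfaction is an "abstract" property.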
Because Proposition 6.2.1 implies that all the initial models of a specification P = (Σ, E) are abstractly the same in precisely this sense, the word "abstract" in "abstract data type" has exactly the same meaning. This is not a mere pun, but a significant fact about software engineering.

Another fact which strongly suggests that we are on the right track is that any computable abstract data type has an equational specification; moreover, this specification tends to be reasonably simple and intuitive in practice. The following result from [137] somewhat generalizes the original version due to Bergstra and Tucker [11] (M is reachable iff the unique Σ-homomorphism T_Σ → M is surjective):

Theorem 6.2.3 (Adequacy of Initiality) Given any computable reachable Σ-algebra M with Σ finite, there is a finite specification P = (Σ′, A′) such that Σ ⊆ Σ′, such that Σ′ has the same sorts as Σ, and such that M is Σ-isomorphic to T_P viewed as a Σ-algebra. □

We do not here define the concept of a "computable algebra," but it corresponds to what one would intuitively expect: all carrier sets are decidable and all operations are total computable functions; see [137]. What this result tells us is that all of the data types that are of interest in computer science can be defined using initiality, although sometimes it may be necessary to add some auxiliary functions. All of this motivates the following:

Definition 6.2.4 The abstract data type (abbreviated ADT) defined by a specification P is the class of all initial P-algebras. □

6.3 Standard Models are Initial Models

We now address the basic question of what a standard model is by giving two intuitively motivated properties of a standard model, and then showing that any model satisfying these properties is in fact an initial model; because we already know that initial models are unique up to isomorphism, this settles the question.
Suppose we are given a theorem-proving problem that involves a signature Σ and a set A of equations that characterize the operations in Σ. Suppose further that M is a standard model of P = (Σ, A), and let h denote the unique Σ-homomorphism T_Σ → M. Then the two properties are as follows:

1. No Junk. For each m ∈ M, there is some t ∈ T_Σ such that m = h(t).
2. No Confusion. Given t, t′ ∈ T_Σ, then h(t) = h(t′) iff A ⊢ (∀∅) t = t′.

The intuitive justification for these principles is as follows: Because the elements of M are supposed to represent the entities that exist in the "world" of the problem, it would be wrong to allow entirely new entities. Similarly, it is necessary that all entities are distinct unless it follows from the statement of the problem that they must be the same. For example, consider the "Missionaries and Cannibals" problem, in which n Missionaries and n Cannibals are on one shore of a river, and all of them wish to get to the other shore, using a boat which can hold at most k people. If ever there are more Missionaries than Cannibals, either on one shore or the other, or in the boat, then all the Cannibals present in that place are converted to Christianity. The problem is to get everyone to the other shore without any conversions.

Clearly, it would not be legitimate to postulate a bridge over which everyone could just walk, or a second larger boat into which everyone could fit; this would be "junk." Similarly, it would not be legitimate to postulate some number of extra Cannibals to stand guard.
A different kind of illegitimate solution would simply assume that all the Missionaries are actually the same individual, with a number of different names; this would be a "confusion" of identities.

We can also give a "fair mystery story" interpretation of these two conditions: the first says that the butler didn't do it unless he was actually introduced into the story as a suspect ("no deus ex machina"), while the second says that all the characters are distinct unless the author has explicitly said otherwise ("no artificial aliases"). Thus, if the clues point to two different characters, the author would be cheating if he resolved the apparent conflict by saying that these two characters are really the same. Rather, he should give sufficient evidence to narrow the suspects down to just one.

[Figure 6.4: No Junk, No Confusion Proof]

Theorem 6.3.1 Given a specification P = (Σ, A), then a Σ-algebra M has no junk and no confusion relative to P iff it is an initial P-algebra.

Proof: The diagram in Figure 6.4 pertains to this proof.

If M is an initial P-algebra, then M ≅ T_Σ/≃_A, and the no junk and no confusion conditions are obvious for this algebra.

For the converse, we first show that if a Σ-algebra M has no junk and no confusion relative to P = (Σ, A), then it satisfies A. Let (∀X) t = t′ be in A, and let θ : X → M be a substitution; then we must show that θ̄(t) = θ̄(t′) in M. Let h : T_Σ → M be the unique Σ-homomorphism, and let |X| = {x_1, …, x_n}. By no junk, we may assume that θ(x_i) = h(t_i) for some t_i ∈ T_Σ. Because A ⊢ (∀∅) t(x_1 ← t_1, …, x_n ← t_n) = t′(x_1 ← t_1, …, x_n ← t_n) follows by equational deduction, and because θ̄(t) = h(t(x_1 ← t_1, …, x_n ← t_n)) and θ̄(t′) = h(t′(x_1 ← t_1, …, x_n ← t_n)), the no confusion condition gives us that θ̄(t) = θ̄(t′), as desired.

To show that M is initial, let M′ be another P-algebra, and let h : T_Σ → M and h′ : T_Σ → M′ be the unique homomorphisms. Now given m ∈ M, by no junk we may assume that m = h(t) for t ∈ T_Σ, and then define u : M → M′ by u(m) = h′(t). To show that this is well-defined, let us also assume that h(t) = h(t′); then by no confusion, A ⊢ (∀∅) t = t′, and so M′ ⊨ (∀∅) t = t′, which implies that h′(t) = h′(t′), as desired. Moreover, h ; u = h′ by construction, and u is unique because if it exists, it must satisfy the equation h ; u = h′, which as we have just seen determines its value. □

6.4 Initial Truth and Subalgebras

This section defines initial satisfaction, and proves the fundamental result that an initial algebra has no proper subalgebras. The proof is, I think, surprisingly simple and beautiful, and many important results about induction will follow from it in subsequent sections of this chapter.

Definition 6.4.1 Given a specification P = (Σ, A) and a Σ-equation e, we say that P initially satisfies e iff T_P ⊨ e; in this case we write P |≈ e or A |≈_Σ e, and we may omit the subscript Σ when it is clear from context. □

Notice that this is a semantic property. Because anything that is true of all models is certainly true of initial models, we have that P ⊨ e implies P |≈ e (where P ⊨ e means that A ⊨_Σ e). However, the converse does not hold:

Example 6.4.2 Let Σ contain a constant 0, a unary function symbol s, and a binary function symbol +, which we will write with infix notation; let A contain the equations

  (∀n) 0 + n = n
  (∀m, n) s(m) + n = s(m + n).

Then the commutative equation

  (∀m, n) m + n = n + m

holds in T_P but does not hold in every P-algebra.
For example, it does not hold in the Σ-algebra M with carrier all strings of a's and b's, with 0 ∈ Σ denoting the empty string in M, with m + n denoting the concatenation of the string n after the string m, and with s sending a string m to the string a + m. For example, a + b ≠ b + a, because ab ≠ ba. □

Theorem 6.4.4 below is another fundamental property of initial algebras, with a very simple proof. We will soon see that this result provides the foundation for proofs by induction. But first, we need the following: Recall that given a Σ-algebra M, another Σ-algebra M′ is a subalgebra (or sub-Σ-algebra) of M iff there is an inclusion M′ → M that is a Σ-homomorphism. In this case, we may write M′ ⊆ M. A subalgebra M′ of M is proper iff M′ ≠ M (i.e., iff M′_s ≠ M_s for some s ∈ S).

Example 6.4.3 If ω denotes the natural numbers, and Z denotes the integers (positive, negative and zero), and if Σ = Σ_NATP, then ω is a sub-Σ-algebra of Z. □

Exercise 6.4.1 Given a Σ-algebra M and given a subset M′_s ⊆ M_s for each s ∈ S, show that M′ gives the carriers of a subalgebra of M if and only if M_σ(m_1, …, m_k) ∈ M′_s for each σ ∈ Σ, where m_i ∈ M′_{s_i} for i = 1, …, k and where σ has arity s_1 … s_k and sort s. □

Exercise 6.4.2 Show that if M is a (Σ, A)-algebra and if M′ ⊆ M is a sub-Σ-algebra, then M′ is also a (Σ, A)-algebra. □

[Figure 6.5: No Proper Subalgebra Proof]

Theorem 6.4.4 If P = (Σ, A) is a specification, then an initial P-algebra has no proper sub-Σ-algebras.

Proof: The diagram in Figure 6.5 pertains to this proof.
Let h : T_Σ → T_P be the unique Σ-homomorphism, which is surjective by hypothesis, let j : M → T_P be the inclusion for a sub-Σ-algebra M, and let u : T_Σ → M be the unique Σ-homomorphism. Then by initiality of T_Σ and because j is a Σ-homomorphism, u ; j = h. Hence j is also surjective, and so T_P has no proper sub-Σ-algebra. Therefore, no other initial P-algebra can have a proper sub-Σ-algebra, because they are all isomorphic. □

6.5 Induction

In general, pure equational deduction is inadequate for proving properties of standard models, and many properties require the use of induction. Fortunately, there exist powerful induction principles for initial models, that let us prove that some predicate holds for all values by proving that it holds for each constructor whenever it holds for all arguments of that constructor. These generalize Peano induction from the natural numbers to arbitrary data types, and can be considered forms of "structural induction" [21], as discussed in Section 3.2.1.

The results in this section follow [137], and justify using induction in proof scores for unsorted signatures; Section 6.5.2 extends this to the many-sorted case; much more general results on induction, including its use on first-order formulae, are given in Chapter 8. The following basic definition applies to both the unsorted and many-sorted cases:

Definition 6.5.1 Given a signature Σ and a Σ-algebra M, then a subsignature Φ ⊆ Σ is a signature of constructors for M iff the unique Φ-homomorphism
Ev-ery specification has a signature of constructors and at least one min-imal signature of constructors. Clearly, using a minimal signature ofconstructors will require less effort in proofs. Although specificationsneed not in general have signatures of unique constructors, the specifi-cations that arise in practice often have a unique minimal signature of unique constructors. The properties of reachability and induction cor-respond to the “no junk” and “no confusion” conditions that togetherare equivalent to initiality [24, 137]. Example 6.5.2 If Σ = Σ NATP + , then Φ = Σ NATP is a minimal signature of construc-tors and also a signature of unique constructors for T NATP + . (cid:2) You may wish to review Section 2.7 on unsorted algebra before read-ing the next result. Theorem 6.5.3 ( Structural Induction I ) Given an unsorted specification P = ( Σ , A) and a signature Φ of constructors for P , let V be a subset of T P . Then V = T P if(0) c ∈ Φ implies [c] ∈ V , and(1) f ∈ Φ n for n > [t i ] ∈ V for i = , . . . , n imply [f (t , . . . , t n )] ∈ V . Proof: Because T P has no proper subalgebras by Proposition 6.4.4 and because V ⊆ T P , we need only show that V is closed under Φ ; but that is exactlywhat conditions (0) and (1) say. (cid:2) This very simple proof is possible because we have taken initial alge-bra semantics as our starting point. Note that a complete inductiveproof using a signature Φ of constructors must include a proof that Φ in fact is a signature of constructors, i.e., that there is a surjective Φ -homomorphism h : T Φ → T P . Often this surjective property will beproved using structural induction. But if A is ground canonical and allits normal forms are Φ -terms, then h is just the isomorphism of N P with T P that is discussed in Section 5.2.9. 
The usual formulation of induction follows easily from Theorem 6.5.3:

Corollary 6.5.4 (Structural Induction II) Given an unsorted specification P = (Σ, A) and a signature Φ of constructors for P, let Q(x) be a Σ({x})-sentence. Then A |≈_Σ (∀x) Q(x) if

(0) c ∈ Φ_0 implies A |≈_Σ Q(c), and
(1) f ∈ Φ_n for n > 0, t_i ∈ T_Σ for i = 1, …, n, and A |≈_Σ Q(t_i) for i = 1, …, n imply A |≈_Σ Q(f(t_1, …, t_n)).

Proof: This follows from Theorem 6.5.3 by letting V = { h(x) | A |≈_Σ Q(x) }, where h : T_Φ → T_P is the surjective Φ-homomorphism that exhibits Φ as a signature of constructors. □

Chapter 8 is much more precise about the notion of "sentence," but for now it suffices to think of sentences as including equations and their combinations under conjunction and implication. Corollary 6.5.4 justifies the use of simple induction in proof scores, and is used in examples below.

Example 6.5.5 When Σ = Σ_NATP, the above result states exactly the usual principle of induction for the natural numbers. □

Induction is basic to many theorem-proving systems, including the Boyer-Moore theorem prover [19], although not in the same form as above. Experience [59] shows that it is often easier to prove results by structural induction, as justified by the above results, than by so-called inductionless induction [56, 137, 140] using Knuth-Bendix completion, because structural induction arguments do not require showing the termination of new rule sets, and do not produce uncontrollable explosions of strange new rules that may gradually become less and less relevant; Garland and Guttag [49] report a similar experience.

Note that inductive proof techniques are not valid for loose semantics, because (in general) the results proved by induction are not true of all models, but only of the standard (initial) models.
Also, it is usually much easier to directly exploit the close connection between rewrite rules, initiality, and induction than to try to remain within a "loose" semantics framework by axiomatizing a standard model with explicit reachability and induction schemata, because the first of these requires existential quantification (e.g., Skolem functions) and the second requires second-order quantification.

The following proof scores use the inductive proof techniques introduced above. The first two examples import the following code for the natural numbers with addition:

  obj NAT is sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    op _+_ : Nat Nat -> Nat [prec 3] .
    vars M N : Nat .
    eq M + 0 = M .
    eq M + s N = s(M + N) .
  endo

Example 6.5.6 (Associativity of Addition) The following score proves that addition of natural numbers is associative:

  open NAT .
  ops l m n : -> Nat .
  ***> base case, n=0: l+(m+0)=(l+m)+0
  reduce l + (m + 0) == (l + m) + 0 .
  ***> induction step
  eq l + (m + n) = (l + m) + n .
  reduce l + (m + s n) == (l + m) + s n .
  close

Therefore we have proved the equation (∀L, M, N) L + (M + N) = (L + M) + N.
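The mechanics of such a score — reduce both sides of the base case, assert the induction hypothesis as an additional rewrite rule, then reduce both sides of the step case — can be replayed in a deliberately naive sketch (Python rather than OBJ; the term encoding and rule functions are of our own devising):

```python
# Ground terms over NAT plus three fresh constants l, m, n:
# "l" | "m" | "n" | "0" | ("s", t) | ("+", t1, t2).

def rewrite(t, rules):
    """Innermost rewriting to normal form with the given rules."""
    if isinstance(t, str):
        return t
    t = (t[0],) + tuple(rewrite(s, rules) for s in t[1:])
    for rule in rules:
        r = rule(t)
        if r is not None:
            return rewrite(r, rules)
    return t

def plus_zero(t):            # M + 0 = M
    if t[0] == "+" and t[2] == "0":
        return t[1]

def plus_succ(t):            # M + s N = s(M + N)
    if t[0] == "+" and isinstance(t[2], tuple) and t[2][0] == "s":
        return ("s", ("+", t[1], t[2][1]))

def hypothesis(t):           # induction hypothesis: l + (m + n) = (l + m) + n
    if t == ("+", "l", ("+", "m", "n")):
        return ("+", ("+", "l", "m"), "n")

nat = [plus_zero, plus_succ]

# Base case: l + (m + 0) and (l + m) + 0 reduce to the same normal form.
assert rewrite(("+", "l", ("+", "m", "0")), nat) == \
       rewrite(("+", ("+", "l", "m"), "0"), nat)

# Step case, with the hypothesis available as an extra rule:
lhs = ("+", "l", ("+", "m", ("s", "n")))
rhs = ("+", ("+", "l", "m"), ("s", "n"))
assert rewrite(lhs, nat + [hypothesis]) == rewrite(rhs, nat + [hypothesis])
```

Both pairs reduce to a common normal form, mirroring the two reduce … == … checks in the score above; the fresh constants l, m, n play the role of the arbitrary values over which the conclusion is universally quantified.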
□

Example 6.5.7 (Commutativity of Addition) The following proof score shows that addition of natural numbers is commutative:

  open NAT .
  ops m n : -> Nat .
  ***> first lemma0: 0 + n = n, by induction on n
  ***> base for lemma0, n=0
  reduce 0 + 0 == 0 .
  ***> induction step
  eq 0 + n = n .
  reduce 0 + s n == s n .
  *** thus we can assert
  eq 0 + N = N .
  ***> show lemma1: s m + n = s(m + n), again by induction on n
  ***> base for lemma1, n=0
  reduce s m + 0 == s(m + 0) .
  ***> induction step
  eq s m + n = s(m + n) .
  reduce s m + s n == s(m + s n) .
  *** thus we can assert
  eq s M + N = s(M + N) .
  ***> show m + n = n + m, again by induction on n
  ***> base case, n=0
  reduce m + 0 == 0 + m .
  ***> induction step
  eq m + n = n + m .
  reduce m + s n == s n + m .
  close

Of course, we should not assert commutativity as a rewrite rule, or we may get non-terminating behavior. □

We will see in Chapter 7 that the above results imply that we can use associative-commutative rewriting for addition in doing more complex examples.

It is interesting to contrast the above proofs with corresponding proofs due to Paulson in Cambridge LCF [147]. The LCF proofs are much more complex, in part because LCF allows partial functions, and then must prove them total, whereas functions are automatically total (on their domain) in equational logic.

Exercise 6.5.1 Use OBJ3 to prove that the equation (∀N) 0 + N = N holds for the specification NAT above. □

Exercise 6.5.2 Given the following code:

  obj INT is sort Int .
    ops (inc_)(dec_) : Int -> Int .
    op 0 : -> Int .
    vars X Y : Int .
    eq inc dec X = X .
    eq dec inc X = X .
    op _+_ : Int Int -> Int .
    eq 0 + Y = Y .
    eq (inc X) + Y = inc(X + Y) .
    eq (dec X) + Y = dec(X + Y) .
  endo

(a) What set of algebras does it denote? What are its signatures of constructors (if any)? What are its minimal signatures of constructors (if any)?
What are its signatures of unique constructors (if any)?

(b) Give an OBJ proof score for the equation (∀Y) Y + 0 = Y, and justify it. □

Exercise 6.5.3 This question refers to the same code as Exercise 6.5.2.

(a) Explain how to represent (−2) + 4, and explain how OBJ would reduce it.
(b) Give an OBJ proof score for (∀X, Y) X + (dec Y) = dec(X + Y), and justify it.
(c) Give an OBJ proof score for (∀X, Y) X + Y = Y + X, and justify it. □

6.5.2 Many-Sorted Induction

This section extends the results of Section 6.5 to the many-sorted case; hence, Σ is an S-sorted signature throughout. Again, much more general results may be found in Chapter 8.

Definition 6.5.8 Given a Σ-algebra M and s ∈ S, then a subsignature ∆ ⊆ Σ is inductive for s over M iff

(0) each δ ∈ ∆ has sort s, and
(1) Σ′ = { σ ∈ Σ | sort(σ) ≠ s } ∪ ∆ is a signature of constructors for M.

A signature ∆ that is inductive for s over M is minimal iff it is an inductive signature for s over M such that no proper subsignature is inductive for s over M. Given a specification P = (Σ, A), a signature ∆ is inductive for s over P iff it is inductive for s over T_P. □

Of course, we want to do as little work as possible in an inductive proof. A minimal inductive signature allows this.

Exercise 6.5.4 Show that a signature ∆ is a minimal inductive signature for s over M iff ∆ ⊆ Φ where Φ is a minimal signature of constructors for M. □

Theorem 6.5.9 (Structural Induction I′) Given a specification P = (Σ, A) and a signature ∆ that is inductive for sort s over P, let V ⊆ T_{P,s}. Then V = T_{P,s} if

(0) c ∈ ∆_{[],s} implies [c] ∈ V, and
(1) f ∈ ∆_{s1…sn,s} for n > 0 and t_i ∈ T_{Σ,s_i} for i = 1, …, n with [t_i] ∈ V if s_i = s imply [f(t_1, …, t_n)] ∈ V.

Proof: We first define an S-sorted set M by M_s = V and M_{s′} = T_{P,s′} for s′ ≠ s. Then (0) and (1) tell us that M is a Σ′-algebra, where Σ′ is as in Definition 6.5.8.
Now because T_P has no proper sub-Σ′-algebras by Theorem 6.4.4, we conclude that V = T_{P,s}. □

As before, the more familiar formulation of induction follows immediately:

Corollary 6.5.10 (Structural Induction II′) Given a specification P = (Σ, A) and a signature ∆ that is inductive for sort s over P, let Q(x) be a Σ({x})-sentence where x is a variable of sort s. Then A |≈_Σ (∀x) Q(x) if

(0) c ∈ ∆_{[],s} implies A |≈_Σ Q(c), and
(1) f ∈ ∆_{s1…sn,s} for n > 0, t_i ∈ T_{Σ,s_i} for i = 1, …, n, and A |≈_Σ Q(t_i) when s_i = s imply A |≈_Σ Q(f(t_1, …, t_n)).

Proof: This follows directly from Theorem 6.5.9 by letting V = { x ∈ T_{P,s} | A |≈_Σ Q(x) }. □

Actually, "if" can be replaced by "iff" in both Theorem 6.5.9 and Corollary 6.5.10; however, these converse "completeness" results do not seem to be useful in practice. Also, note that we can generalize all this a bit further by considering an S-sorted set V = { V_s | s ∈ S } defined by Q = { Q_s(x) | s ∈ S }.

Exercise 6.5.5 Consider the following specification:

  obj SET is sort Set .
    pr NAT .
    op {} : -> Set .
    op ins : Nat Set -> Set .
    op _U_ : Set Set -> Set [id: ({})] .
    vars N N' : Nat .
    vars S S' : Set .
    eq ins(N,ins(N',S)) = ins(N',ins(N,S)) .
    eq ins(N,ins(N,S)) = ins(N,S) .
    eq ins(N,S) U S' = ins(N,S U S') .
  endo

where NAT is the Peano natural numbers. Then do the following:

(a) Write a specification for SET that involves neither module importation (the line "pr NAT") nor attributes ("[id: ({})]"). What would result from executing the following in OBJ3?

  open SET .
  op s0 : -> Set .
  red ins(0,{}) .
  red {} U {} .
  red ins(0,ins(0,{})) .
  red ins(0,ins(0,s0)) .

(b) Give an inductive signature for the sort Set over SET which is minimal, and one which is not.
Explain why.

(c) Give a manual proof that the equation (∀S, S′) S ∪ S′ = S′ ∪ S holds for SET.

(d) Give an OBJ proof score for this equation. □

We can formulate induction as a general rule of inference. Let the notation A ⊩_Σ e indicate that e can be proved from A using the new rule given below plus the usual rules for ⊢_Σ, and assume that Δ is inductive for sort s over P = (Σ, A). Then the new rule is:

(I) Given t, t′ ∈ T_Σ({x}) with x of sort s, if A ⊩_Σ (∀∅) t(x ← c) = t′(x ← c) for each c ∈ Δ_{[],s}, and if A ⊩_Σ (∀∅) t(x ← t_i) = t′(x ← t_i) for i = 1, ..., n and f ∈ Δ_{s1...sn,s} imply A ⊩_Σ (∀∅) t(x ← f(t_1, ..., t_n)) = t′(x ← f(t_1, ..., t_n)), then A ⊩_Σ (∀x) t = t′.

Corollary 6.5.10 and the soundness of ⊢ show that ⊩_Σ is sound for initial truth, i.e., that A ⊩_Σ e implies A ⊨_Σ e. However, ⊩_Σ cannot be complete, i.e., in general the converse implication does not hold [129]. In fact, the set of equations satisfied by T_P is not in general even recursively enumerable, for reasons discussed in Section 5.10.

A Closer Look at State, Encapsulation and Implementation

Initial semantics works very well for static data structures like integers, lists, booleans, and even vectors and matrices, which are typically passed as values in programs unless they are very large. But such an approach is more awkward for dynamic data structures that have dedicated storage and commands that change the internal representation, which is not viewed directly, but only through "attributes." For example, it is usually more appropriate to view stacks as "state machines" with an encapsulated internal state, and with "top" as an attribute.
Although initial models exist for any reasonable specification of stacks, real stacks are more likely to be implemented by a model that is not initial, such as a pointer plus an array. This means that a notion of implementation is needed that differs from initial models. Moreover, in considering (for example) stacks of integers, the sorts for stacks and for integers have a different character, since the latter can still be modelled initially as data. Although an initial framework has been successfully used for some applications of this kind (e.g., see [86]), it is really better to take a viewpoint that explicitly distinguishes between "visible" sorts for data and "hidden" sorts for states, and that defines an implementation to be any model satisfying certain natural constraints; the hidden algebra developed in Chapter 13 takes such an approach.

Literature

The notions of quotient and image algebra, like the notions of homomorphism and isomorphism, developed as abstractions of the corresponding basic notions for groups and rings, mainly from lectures by Emmy Noether in the late 1920s; the same holds for the Homomorphism Theorem (Theorem 6.1.7). The development of minimal automata in Example 6.1.10 using the homomorphism theorem may be original, but algebraic treatments of automata go back to Anil Nerode and others in the earliest days of theoretical computer science [142, 154]. The universal properties of quotient and free algebras (Proposition 6.1.12 and Theorem 6.1.17) are notable for the smooth way that consequences can be drawn from them, as illustrated by the proof of Theorem 6.1.15 and the results at the end of Section 6.1; several proofs in the chapter also make good use of diagrams.
The concept of universal property comes from category theory, where it takes the elegant form of an adjoint pair of functors (see [128] or the end of [126]); it was also developed by Bourbaki in a more concrete form closer to the one in this text, which is also similar to that in the excellent abstract algebra text of Mac Lane and Birkhoff [128].

The importance of initiality for computing has developed gradually. The term "initial algebra semantics" and its first applications, to Knuthian attribute semantics, appear in [52], while the first applications to abstract data types are in [87]; a more complete and rigorous exposition is given in [86]. The terminology "no junk" and "no confusion" and Theorem 6.3.1 are from [24]. Many examples of initiality can be found in [89] and [137]; the latter especially develops connections with induction and computability. Results on the computational adequacy of initiality first appeared in the work of Bergstra and Tucker [11]. The principle that standard models are initial models extends beyond equational logic, for example, to standard models of Horn clause logic as used in (pure) Prolog; [79] discusses this and some other applications of the principle. A significant generalization of algebraic induction is given in Section 8.7.

Deduction and Rewriting Modulo Equations

This chapter generalizes term rewriting to the situation where some equations B are "built in" as part of matching, rather than used as rewrite rules; the phrase "modulo B" refers to this. We also introduce terms, equations, and deduction modulo B, give a decision procedure for the propositional calculus (using rewriting modulo associativity and commutativity), generalize theory from Chapter 5, including ways to prove termination and the Church-Rosser property, and apply all this to several kinds of digital hardware, among other things.
The reader may well have felt that the repeated use of the associative law in Examples 4.1.4 and 4.5.5 (as well as Exercises 4.6.1–4.6.4) was rather tedious, and perhaps even unnecessary. A more precise way to express this feeling is to say that because the associative law tells us that "parentheses are unnecessary," it should be unnecessary to move parentheses around in deductions that assume this law. For example, the two expressions

  (a⁻¹⁻¹ ∗ a⁻¹) ∗ (a ∗ a⁻¹)   and   (a⁻¹⁻¹ ∗ (a⁻¹ ∗ a)) ∗ a⁻¹

are "obviously" equal to each other, because they have the same unparenthesized form, namely a⁻¹⁻¹ ∗ a⁻¹ ∗ a ∗ a⁻¹.

Example 7.1.1 Eliminating parentheses in the proof of Example 4.5.5 gives the following:

  [1]  a ∗ a⁻¹ = e ∗ a ∗ a⁻¹          backwards by (lid) on GL
  [2]  = a⁻¹⁻¹ ∗ a⁻¹ ∗ a ∗ a⁻¹        backwards by (linv) on GL, with A = a⁻¹
  [3]  = a⁻¹⁻¹ ∗ e ∗ a⁻¹              by (linv) on GL
  [4]  = a⁻¹⁻¹ ∗ a⁻¹                  by (lid) on GL
  [5]  = e                            by (linv) on GL  □

Because we want to be precise throughout this book, we must ask what it means to "build in" associativity this way. One approach is to consider expressions that differ only in their parenthesization to be equivalent; the equivalence classes should then form an algebra such that we can do deduction on the classes in essentially the same way that we do deduction on terms. For example, the following is an equivalence class of terms modulo associativity:

  { (a ∗ b) ∗ (c ∗ d),  a ∗ (b ∗ (c ∗ d)),  a ∗ ((b ∗ c) ∗ d),  ((a ∗ b) ∗ c) ∗ d,  (a ∗ (b ∗ c)) ∗ d } .

We also want the unparenthesized expression a ∗ b ∗ c ∗ d to serve as surface syntax for this class, representing it to users in the familiar way. Because there are applications other than associativity, it is worthwhile to develop the theory at a general level.
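The equivalence class displayed above can also be checked mechanically. The following sketch (in Python, for concreteness; it is an illustration outside the book's OBJ framework, and all names in it are ours) generates every parenthesization of a flattened word as a nested pair; for the word a b c d it yields exactly the five terms of the class.

```python
# Enumerate the equivalence class of a term modulo associativity by
# generating every parenthesization (binary tree) over its flattened form.
# This is an illustrative sketch, not part of the OBJ system.

def parenthesizations(xs):
    """All binary trees over the sequence xs, as nested pairs."""
    if len(xs) == 1:
        return [xs[0]]
    out = []
    for i in range(1, len(xs)):           # split point between left and right
        for left in parenthesizations(xs[:i]):
            for right in parenthesizations(xs[i:]):
                out.append((left, right))
    return out

cls = parenthesizations(["a", "b", "c", "d"])
print(len(cls))    # 5, the Catalan number C_3
```

The count 5 is the Catalan number C_3, which in general bounds the size of an associativity class for a word of n factors.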
For example, if we want to study commutative groups, which in addition to the usual group laws also satisfy the commutative law,

  (∀X, Y) X ∗ Y = Y ∗ X ,

then we need to avoid using this equation as a rewrite rule, because it can lead to non-terminating computations, such as

  a ∗ b ⇒ b ∗ a ⇒ a ∗ b ⇒ · · · .

This observation means it is impossible to give a canonical term rewriting system for the theory of commutative groups. However, by regarding terms that differ only in the order of their factors as equivalent, we can build in commutativity and thus avoid non-termination. For example, (a ∗ b) ∗ c has the following class of equivalent terms

  { (a ∗ b) ∗ c,  (b ∗ a) ∗ c,  c ∗ (a ∗ b),  c ∗ (b ∗ a) } ,

and moreover, the class { a ∗ b, b ∗ a } is a normal form under rewriting modulo commutativity.

There are also many examples, including commutative groups, where it is useful to identify terms that are the same up to the order of their factors and their parenthesization. Thus, there are at least three interesting cases: associativity, commutativity, and both together, which are often abbreviated A, C, and AC, respectively.

Exercise 7.1.1 Prove that there are 12 terms in the equivalence class modulo AC of the term (a ∗ b) ∗ c, and write them all out. Also prove that there are 5 terms in the equivalence class of (a ∗ b) ∗ (c ∗ d) modulo associativity. □

Deduction Modulo Equations

The "Semantics First" slogan of Chapter 1 implies that a discussion of deduction should be preceded by a discussion of satisfaction, as a standard against which to test soundness and completeness. We now do this for deduction with a set A of equational axioms, modulo a set B of equations. The first step is to define the kind of equation involved; we begin with B-equivalence classes of Σ-terms, i.e., with elements of T_{Σ,B}(X), as defined in Section 6.1.1.
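The count asked for in Exercise 7.1.1 can be checked mechanically. The sketch below (in Python, as an illustration outside the book's framework; all names are ours) enumerates an AC-class as every parenthesization of every permutation of the factors.

```python
# Enumerate the equivalence class of a term modulo AC: take every
# permutation of the factors, and every parenthesization of each.
# An illustrative sketch, not part of the OBJ system.

from itertools import permutations

def parenthesizations(xs):
    """All binary trees over the sequence xs, as nested pairs."""
    if len(xs) == 1:
        return [xs[0]]
    out = []
    for i in range(1, len(xs)):
        for left in parenthesizations(xs[:i]):
            for right in parenthesizations(xs[i:]):
                out.append((left, right))
    return out

def ac_class(leaves):
    """All terms AC-equal to a product of the given (distinct) leaves."""
    terms = set()
    for p in permutations(leaves):
        terms.update(parenthesizations(list(p)))
    return terms

print(len(ac_class(["a", "b", "c"])))    # 12, as Exercise 7.1.1 claims
```

For three distinct factors there are 3! = 6 orders and 2 shapes each, giving the 12 terms of the exercise.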
Definition 7.2.1 Given a set B of (possibly conditional) Σ-equations, a (conditional) Σ-equation modulo B, or (Σ, B)-equation, is a 4-tuple ⟨X, t, t′, C⟩ with t, t′ ∈ T_{Σ,B}(X) and C a finite set of pairs from T_{Σ,B}(X), usually written (∀X) t =_B t′ if C; we may use the same notation with t, t′, C all Σ-terms, to represent their B-equivalence classes, and we may drop the B subscripts. We write just (∀X) t =_B t′ when C = ∅, and call it an unconditional equation.

Given a (Σ, B)-algebra M, Σ-satisfaction modulo B, written M ⊨_{Σ,B} (∀X) t =_B t′ if C, is defined by: ā(t) = ā(t′) whenever ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, for all a : X → M, where Theorem 6.1.17 provides the unique Σ-homomorphism ā : T_{Σ,B}(X) → M extending a. Given a set A of (Σ, B)-equations, let A ⊨_{Σ,B} e mean that M ⊨_{Σ,B} A implies M ⊨_{Σ,B} e for all B-models M. We may drop the subscripts Σ and/or B if they are clear from context. □

We can get class deduction versions of the rules for equational inference in Chapter 4, just by replacing each occurrence of T_Σ by T_{Σ,B} and each occurrence of = by =_B, assuming that all axioms in A are (Σ, B)-equations. We denote the B-class deduction version of rule (i) by (i_B), and we let A ⊢_{Σ,B} e indicate deduction modulo B, also called class deduction, which is deduction using the class versions of the rules of Chapter 4, including the rule (C_B) below, to deduce e from A modulo B; as above, we may drop either or both subscripts Σ and B if they are clear from context. For example, here is the class deduction version of rule (C):

(C_B) Forward Conditional Subterm Replacement.
Given t₀ ∈ T_{Σ,B}(X ∪ {z}_s) with z ∉ X, if (∀Y) t₁ =_B t₂ if C is of sort s and is in A, and if θ : Y → T_{Σ,B}(X) is a substitution such that (∀X) θ(u) =_B θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then

  (∀X) t₀(z ← θ(t₁)) =_B t₀(z ← θ(t₂))

is also deducible.

Note that this uses substitution modulo B, Definition 6.1.19 of Chapter 6.

The next result lets us carry over soundness and completeness results from Chapter 4 to class deduction. It says that deduction from A modulo B is equivalent to deduction from A ∪ B on representatives, and that satisfaction of an equation modulo B is equivalent to ordinary satisfaction by a representative of the equation. To express this more precisely, given a Σ-equation e of the form (∀X) t = t′, let [e] denote its modulo B version, (∀X) [t] =_B [t′], and similarly for conditional equations. Given a set A of Σ-equations, let [A] denote the set of modulo B versions of equations in A; for more clarity, we may also write [t]_B, [e]_B and [A]_B.

Proposition 7.2.2 (Bridge) Given sets A, B of Σ-equations and another Σ-equation e (with A, B and e possibly conditional), then

  [A] ⊢_B [e]  iff  A ∪ B ⊢ e .

Furthermore, given any (Σ, B)-algebra M and a (possibly conditional) Σ-equation e, then M ⊨_{Σ,B} [e] iff M ⊨_Σ e.

Proof: For the first assertion, if e₁, ..., e_n is a proof of e from A ∪ B, then the subsequence [e_{i1}], ..., [e_{ik}] formed by omitting all steps that used B, and then taking the B-classes of those equations that remain, is a proof of [e] from [A]; and conversely, any proof [e₁], ..., [e_k] of [e] from [A] can be filled out to become a proof of e from A ∪ B, by choosing representatives for each [e_i] and adding intermediate steps that use B.

For the second assertion, let q : T_Σ(X) → T_{Σ,B}(X) be the quotient map, let e be the equation (∀X) t = t′ if C, and (just for now) let ã denote the extension of a : X → M to a Σ-homomorphism T_{Σ,B}(X) → M, with ā the usual Σ-homomorphism T_Σ(X) → M. Then

  (∗)  q ; ã = ā ,

by the universal property of Theorem 6.1.17, because both sides are Σ-homomorphisms T_Σ(X) → M that agree on X, since q(x) = [x] for x ∈ X implies ã(q(x)) = a(x) for x ∈ X. Then M ⊨ e iff for all a : X → M, ā(t) = ā(t′) whenever ā(u) = ā(v) for all ⟨u, v⟩ ∈ C, and composing with q gives us that for all a : X → M, ã([t]) = ã([t′]) whenever ã([u]) = ã([v]) for all ⟨u, v⟩ ∈ C, because of (∗). But this says that M ⊨_B [e]. □

Theorem 7.2.3 (Completeness) Given sets A, B of Σ-equations and another Σ-equation e (all possibly conditional), then the following are equivalent:

  (1) [A] ⊢_B [e]
  (2) [A] ⊨_B [e]
  (3) A ∪ B ⊢ e
  (4) A ∪ B ⊨ e

Proof: The first part of Proposition 7.2.2 gives the equivalence of (1) with (3), the Completeness Theorem (4.8.4) gives the equivalence of (3) with (4), and the second part of Proposition 7.2.2 gives the equivalence of (2) with (4). □

We also have the following completeness result, which (with the Theorem of Constants) justifies the calculation in Example 7.1.1, since each step there is an instance of rule (±C_B):

Theorem 7.2.4 Given sets A, B of (possibly conditional) Σ-equations and an unconditional Σ-equation e, then [A] ⊢_B [e] iff [e] is deducible from [A] using only the class versions of the rules of Theorem 4.9.1, with (±C_B) in place of (±C). Moreover, either holds iff M ⊨_B [e] for all (Σ, A ∪ B)-algebras M.
Proof: The two predicates in the first assertion are equivalent to (A ∪ B) ⊢ e and to deducibility of e from A ∪ B using only the rules of Theorem 4.9.1, respectively, the latter by reasoning analogous to that in Proposition 7.2.2; hence they are equivalent by Theorem 4.9.1. The second assertion now follows by Theorem 7.2.3. □

When B consists of just the associative law, every equivalence class in any T_Σ(X)/≃_B is finite; however, there is no upper bound on the number of terms that may be in these classes. Moreover, there are specifications where the equivalence classes are actually infinite. For example, if B contains an identity law (x ∗ e = x), then equivalence classes modulo B are infinite; the same holds for an idempotent law (x ∗ x = x). Consequently, equivalence classes are not feasible representations for systems like OBJ, either for surface syntax seen by users, or for concrete data structures used internally for calculation. However, they are fundamental for semantics.

When B consists of the associative law, B-equivalence classes have a simple natural representation based on omitting parentheses; for example, the equivalence class of (a ∗ b) ∗ c can be represented to users as a ∗ b ∗ c and internally as ∗(a, b, c), which avoids the ambiguity of the simpler list representation (a, b, c) if more than one binary operation is declared associative. Similarly, bags and sets of terms can represent terms modulo AC, and AC plus idempotency, respectively, and these can be implemented using standard concrete data structures. However, there is no optimal linear representation for built-in commutativity, since any linear representation must choose some ordering for subterms.
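The representation idea just described can be sketched as follows (in Python, for illustration; the function names and data layout are our assumptions, not OBJ3's actual internals): flattening an associative operator's arguments gives the internal form ∗(a, b, c), and additionally sorting the flattened arguments gives a canonical form modulo AC.

```python
# Canonical representations modulo A and modulo AC: terms are nested
# tuples ("op", arg1, arg2). An illustrative sketch, not OBJ3 internals.

def flatten(op, t):
    """Canonical form modulo associativity: (op, arg, arg, ...)."""
    if isinstance(t, tuple) and t[0] == op:
        out = [op]
        for arg in t[1:]:
            out.extend(flatten(op, arg)[1:])   # splice in nested op-terms
        return tuple(out)
    return (op, t)    # anything else is treated as an opaque argument

def canon_assoc(op, t):
    return flatten(op, t)

def canon_ac(op, t):
    head = flatten(op, t)
    # sort arguments by their printed form, so AC-equal terms coincide
    return (op,) + tuple(sorted(head[1:], key=repr))

t1 = ("*", ("*", "a", "b"), "c")    # (a * b) * c
t2 = ("*", "c", ("*", "b", "a"))    # c * (b * a)
print(canon_assoc("*", t1))                      # ('*', 'a', 'b', 'c')
print(canon_ac("*", t1) == canon_ac("*", t2))    # True
```

Note that the AC form must pick some ordering of the arguments (here, by printed form), which is exactly the arbitrary choice mentioned above for linear representations of commutativity.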
Theorem 7.2.3 shows that equivalence classes can be represented by terms, and in fact this is what OBJ3 does, for both calculation and surface syntax, except for associativity, where omitting parentheses is the default; this default can be changed with the command "set print with parens on".

Another difficulty involved with implementing deduction modulo B with equivalence classes is that occurrences of a variable in [t]_B may not be well defined, since different representatives of the class may have different numbers of occurrences. For example, if B contains an idempotent law for a binary operation ∗, then the terms

  x,  x ∗ x,  (x ∗ x) ∗ x,  x ∗ (x ∗ x),  . . .

are all B-equivalent, so that the class [x]_B contains terms with n occurrences of x for every n > 0. Similarly, if B contains a zero law (x ∗ 0 = 0), then [x ∗ 0]_B contains infinitely many terms, e.g., with n occurrences of x for every n ≥ 0. In such cases subterm replacement cannot make sense at a single instance of a variable, i.e., the rule (±) does not generalize to arbitrary equational theories B; this is unfortunate because this rule is the basis for term rewriting.

However, we can make single-occurrence rewriting work on B-classes with two additional assumptions. Recall that an equation (∀X) t = t′ is balanced iff var(t) = var(t′) = X, and note that if an equation in B does not have the same variables on its two sides, then deduction modulo B may require finding values for the unmatched variables, which in general cannot be done automatically. Also, recall that an equation (∀X) t = t′ is linear iff t and t′ each have at most one occurrence of each variable in X. For example, associative, commutative and identity laws are all both linear and balanced, while an idempotent law is balanced but not linear, and a zero law is linear but not balanced.
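The two side conditions just defined are easy to check mechanically. The sketch below (in Python, for illustration only; the term encoding is our assumption: variables are strings beginning with an uppercase letter, compound terms are tuples (op, ...)) classifies the three example laws exactly as in the text.

```python
# Check whether an equation lhs = rhs is balanced (same variable sets on
# both sides) and linear (each side uses each variable at most once).
# An illustrative sketch; the term encoding is an assumption.

from collections import Counter

def var_count(t, acc=None):
    """Multiset of variable occurrences in a term."""
    acc = Counter() if acc is None else acc
    if isinstance(t, tuple):
        for arg in t[1:]:          # t[0] is the operation symbol
            var_count(arg, acc)
    elif t[:1].isupper():          # our convention for variables
        acc[t] += 1
    return acc

def balanced(lhs, rhs):
    """Both sides use exactly the same set of variables."""
    return set(var_count(lhs)) == set(var_count(rhs))

def linear(lhs, rhs):
    """Each side uses each variable at most once."""
    return all(n <= 1 for side in (lhs, rhs)
               for n in var_count(side).values())

comm = (("*", "X", "Y"), ("*", "Y", "X"))    # commutative law
idem = (("*", "X", "X"), "X")                # idempotent law
zero = (("*", "X", "0"), "0")                # zero law
print(balanced(*comm), linear(*comm))    # True True
print(balanced(*idem), linear(*idem))    # True False
print(balanced(*zero), linear(*zero))    # False True
```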
Fact 7.2.5 Given B linear and balanced and [t] ∈ T_{Σ,B}(X), then every term t′ that is B-equivalent to t has the same number of occurrences of any x ∈ X as t does.

Proof: This is because the number of occurrences of a variable symbol is the same in the result of applying a linear balanced equation to a term as it was in the original term. □

Therefore, if B is linear and balanced, the following makes sense:

(±,B) Bidirectional Single Subterm Replacement Modulo B. Given t₀ ∈ T_{Σ,B}({z}_s ∪ Y) with exactly one occurrence of z, where z ∉ Y, and given a substitution θ : X → T_{Σ,B}(Y), if (∀X) t₁ =_B t₂ or (∀X) t₂ =_B t₁ is of sort s and is in A, then

  (∀Y) t₀(z ← θ(t₁)) =_B t₀(z ← θ(t₂))

is deducible.

This suggests the following notion of class rewriting modulo B, based on single subterm replacement: if B is linear and balanced and A is a set of modulo B rewrite rules, define an abstract rewriting system on the (S-sorted) set T_{Σ,B} of B-equivalence classes of ground Σ-terms by c ⇒_{[A/B]} c′ iff there is some c₀ ∈ T_{Σ,B}({z}_s) such that c = c₀(z ← θ(t₁)) and c′ = c₀(z ← θ(t₂)), for some (modulo B) substitution θ and some rule t₁ → t₂ of sort s in A. Later in this chapter, we show that class rewriting can be defined without restricting B to be linear or balanced. As noted above, class rewriting is impractical, because classes can be very large, even infinite; nevertheless, our later general version provides a semantic standard for the correctness of efficient implementations like that in OBJ3, which rewrites representative terms rather than classes. This is discussed in detail in Section 7.3, and also appears in the next subsection.

OBJ3 implements deduction modulo any combination of A, C and I (where "I" stands for identity), for any subset of binary operations in the signature; the equations in B are declared using attributes of operations.
For example, an operation modulo CI is declared by

  op _*_ : S S -> S [comm id: e] .

Note that the identity constant "e" must be declared explicitly, because there could be other constants of an appropriate sort. (For the non-commutative case, an identity attribute asserts both the left and right identity laws.) The keyword "assoc" is used for associativity. We illustrate OBJ's rewriting modulo associativity with the following proof of the right inverse law for left groups, as in Example 7.1.1; the reader may wish to first review Section 4.6.

Example 7.2.6 (Right Identity for Left Groups) We must first give a new version of the specification that treats associativity as a built-in equation rather than as a rewrite rule. Then we do the proof itself, beginning with a constant for the universal quantifier. The "range" notation, as in "[2 .. 3]", is explained after the proof.

  th GROUPLA is sort Elt .
    op _*_ : Elt Elt -> Elt [assoc] .
    op e : -> Elt .
    op _-1 : Elt -> Elt [prec 2] .
    var A : Elt .
    [lid] eq e * A = A .
    [linv] eq A -1 * A = e .
  endth

  open .
  ***> first prove the right inverse law:
  op a : -> Elt .
  start a * a -1 .
  apply -.lid at term .            ***> should be: e * a * a -1
  apply -.linv with A = (a -1) at [1] .
                                   ***> should be: a -1 -1 * a -1 * a * a -1
  apply .linv at [2 .. 3] .        ***> should be: a -1 -1 * e * a -1
  apply .lid at [2 .. 3] .         ***> should be: a -1 -1 * a -1
  apply .linv at term .            ***> should be: e
  [rinv] eq A * A -1 = e .         ***> add the proven equation:
  start a * e .                    ***> now prove the right identity law:
  apply -.linv with A = a at [2] . ***> should be: a * a -1 * a
  apply .rinv at [1 .. 2] .        ***> should be: e * a
  apply .lid at term .             ***> should be: a
  close

The keyword "term" indicates application of the rule at the top of the current term (i.e., with t₀ = z in rule (±,B)), while the notation "[2]" indicates application at its second subterm, and "[2 .. 3]" indicates the subterm consisting of the second and third subterms; the selected rule is applied at most once, and fails if the selected subterm does not match. The next section shows how this example can be simplified even further by using term rewriting modulo equations, in addition to deduction modulo equations. □

We now discuss apply in somewhat more detail; a complete description is given in [90]. The notations "[k]" and "[k .. n]" are used for binary operations that are associative only; incidentally, "[k .. k]" is equivalent to "[k]", and "[]" is not allowed. Because OBJ3 represents terms modulo A, C, and AC with ordinary terms, if you know how the representing term is parenthesized, then in each case you can select subterms using the parenthetic occurrence notation of Section 4.6. Thus, instead of "[2]" above, we could have written "(2)"; however, the square bracket range notation is preferable. You can see the parenthesization with the command "apply print at term", provided "print with parens" is on. The default parenthesization takes the rightmost subterm as the innermost; note that applying an equation may cause re-parenthesization closer to the default form, even in subterms disjoint from the redex.

Occurrence notation must be used for selecting subterms of operations that are commutative only. Note that "()" is a valid occurrence, and is equivalent to "term"; another synonym is "top". The "set" notations "{k}" and "{k , . . . , n}" are used for AC binary operations, analogously to "[k]" and "[k .. n]" for associative operations.

Because specifications can have multiple binary operations with varying combinations of modulo attributes, a notation is needed for composing the three selection methods discussed above. For example, suppose ∗, +, ∘ are respectively associative, AC, and without attributes, and consider the term

  t = (a ∗ b) ∘ (a + b + (a ∗ c)) .
Then the selector

  [1] of {3} of (2)

selects the subterm a, while

  {3,1} of (2)

will select the subterm (a ∗ c) + a, also causing the representing term to be rearranged. A final "of term" is optional for composite selectors. If we knew that the representation of t was parenthesized as (a ∗ b) ∘ (a + (b + (a ∗ c))), then we could also do the first selection above using the occurrence notation (2 3 1); however, the second selection above cannot be done with occurrence notation. Note the reversal of order between "[1] of {3} of (2)" and "(2 3 1)". Note also that the command

  apply print at {3,1} of (2)

will cause the representation of t to be rearranged as (a ∗ b) ∘ (((a ∗ c) + a) + b), even though no deduction is involved.

Instead of "at", the keyword "within" can be used, indicating a single application at some proper subterm of the selected term; this can be convenient when there is a unique subterm that matches. A summary of the syntax for apply is produced by the command "apply ?". To better understand this material, in addition to the two exercises below, the reader should also look at Example 7.3.6 on page 194.

Exercise 7.2.1 (1) Do the proof in Example 7.2.6, everywhere replacing range notation with occurrence notation. (2) Do the proof of Example 7.2.6 using "within" wherever possible. □

Exercise 7.2.2 The following is part of the calculus of relations (Appendix C has some basics):

  th REL is sort Rel .
    op I : -> Rel .
    op _U_ : Rel Rel -> Rel [assoc comm] .
    op _;_ : Rel Rel -> Rel [assoc] .
    vars R R1 R2 : Rel .
    eq I ; R = R .
    eq R ; I = R .
    eq R ;(R1 U R2) = (R ; R1) U (R ; R2) .
    eq (R1 U R2); R = (R1 ; R) U (R2 ; R) .
  endth

These operations and laws constitute a so-called semi-ring. Now add the recursive definition

  op _*_ : Rel Nat -> Rel .
  var R : Rel . var N : Nat .
  eq R * 0 = I .
  eq R * s N = (R * N);(I U R) .

and prove that if R * N = R * s N, then also

  (R U I);(R * N) = (R * N) .
(In this case, R * N is the transitive reflexive closure of R.) Use the OBJ3 apply feature to do all the calculations, although the induction itself remains outside the OBJ3 framework. Hint: Use induction to prove that R ;(R * N) = (R * N); R for all R and all N. □

Term Rewriting Modulo Equations

Term rewriting is the main computational engine for theorem proving in this book, and this section develops rewriting modulo B, which has significant advantages over the class rewriting sketched in Section 7.2.1: (1) it uses ordinary rules instead of modulo B rules; (2) B need be neither balanced nor linear; (3) infinite classes of terms are avoided; and (4) occurrences make sense. We assume throughout that B is unconditional.

Definition 7.3.1 A modulo term rewriting system, abbreviated MTRS, consists of a signature Σ, a set B of Σ-equations, and a Σ-term rewriting system A, written (Σ, A, B).

Given an MTRS (Σ, A, B), define an ARS on T_{Σ,B} by c ⇒_{[A/B]} c′ iff there are t, t′ ∈ T_Σ such that c = [t], c′ = [t′], and t ⇒_A t′. This relation is called one-step class rewriting, and its transitive, reflexive closure is class rewriting.

Given an MTRS (Σ, A, B), define an ARS on T_Σ by t ⇒_{A/B} t′ iff there are t₁, t₂ ∈ T_Σ such that t ≃_B t₁, t′ ≃_B t₂, and t₁ ⇒_A t₂. This relation is called one-step rewriting with A modulo B, and its transitive, reflexive closure is rewriting modulo B.

We extend to rewriting terms with variables by extending Σ to Σ(X), and in this case write c ⇒_{[A/B],X} c′ and t ⇒_{A/B,X} t′, which are defined on T_{Σ(X),B} and T_{Σ(X)} respectively. □

The proof of the following is similar to those of Proposition 5.1.9 and Corollary 5.1.10:

Proposition 7.3.2 Given t, t′ ∈ T_Σ(Y), Y ⊆ X and MTRS (Σ, A, B), then t ⇒_{A/B,X} t′ iff t ⇒_{A/B,Y} t′, and in both cases var(t′) ⊆ var(t).
Also t ⇒*_{A/B,X} t′ iff t ⇒*_{A/B,Y} t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒_{A/B,X} and ⇒*_{A/B,X} restrict and extend reasonably over variables, so that we can drop the subscript X, with the understanding that any X such that var(t) ⊆ X may be used. On the other hand, ⇔*_{A/B,X} does not restrict and extend reasonably, as shown by Example 5.1.15. Thus, we define t ⇔*_{A/B} t′ to mean that there exists an X such that t ⇔*_{A/B,X} t′. Example 5.1.15 also shows bad behavior for ≃^X_{A/B} (defined by t ≃^X_{A/B} t′ iff A ∪ B ⊨ (∀X) t = t′), although the concretion rule (8) of Chapter 4 (extended to rewriting modulo) implies that ≃^X_{A/B} does behave reasonably when the signature is non-void. Defining ↓_{A/B,X} as usual from the ARS of an MTRS (Σ, A, B), we can generalize Proposition 5.1.13 to show that ↓_{A/B,X} also restricts and extends reasonably, again allowing the subscript X to be dropped:

Proposition 7.3.3 Given t₁, t₂ ∈ T_Σ(Y), Y ⊆ X and MTRS (Σ, A, B), then we have t₁ ↓_{A/B,X} t₂ if and only if t₁ ↓_{A/B,Y} t₂, and moreover, these imply A ∪ B ⊢ (∀X) t₁ = t₂. □

There are good implementations of rewriting modulo B for those B that are actually available through attributes in OBJ. Although rewriting modulo B accurately describes what OBJ does, it is not how OBJ actually does it, because this would require much needless search for matches; Section 7.3.3 gives some details of what OBJ3 really does, which of course is equivalent to rewriting modulo B.
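The relation t ⇒_{A/B} t′ can be animated on a toy scale. The sketch below (in Python, for illustration; it is a brute-force search over representatives, not OBJ3's matching algorithm, and its names are ours) applies the single rule X + 0 → X at the top of a term by searching the term's commutativity class for a representative that matches.

```python
# One-step rewriting modulo commutativity of '+': to rewrite t, look for
# some t1 with t =_B t1 such that the rule X + 0 -> X applies to t1.
# An illustrative brute-force sketch, not OBJ3's algorithm.

def c_variants(t):
    """All terms B-equal to t, where B is commutativity of '+'."""
    if isinstance(t, tuple) and t[0] == "+":
        _, l, r = t
        out = set()
        for lv in c_variants(l):
            for rv in c_variants(r):
                out.add(("+", lv, rv))
                out.add(("+", rv, lv))
        return out
    return {t}

def rewrite_plus_zero(t):
    """One step of X + 0 -> X at the top, modulo commutativity;
    returns t unchanged if no representative matches."""
    for v in c_variants(t):
        if isinstance(v, tuple) and v[0] == "+" and v[2] == "0":
            return v[1]
    return t

print(rewrite_plus_zero(("+", "0", "a")))    # 'a'
print(rewrite_plus_zero(("+", "a", "b")))    # unchanged: ('+', 'a', 'b')
```

The needless search over B-variants is exactly why a practical implementation matches modulo B directly rather than enumerating representatives.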
The following result (generalizing 5.1.11 and 5.1.16) says that rewriting modulo B is equivalent to class rewriting, and that both are semantically sound and complete:

Theorem 7.3.4 (Completeness) Given an MTRS (Σ, A, B) and t, t′ ∈ T_Σ(X), then

  t ⇒_{A/B} t′ iff [t] ⇒_{[A/B]} [t′] ,
  t ⇒*_{A/B} t′ iff [t] ⇒*_{[A/B]} [t′] ,
  [t] ⇒*_{[A/B]} [t′] implies [A] ⊢ (∀X) [t] = [t′] ,
  t ⇔*_{A/B} t′ iff A ∪ B ⊢ (∀X) t = t′ .

Thus ⇔*_{A/B} is complete for satisfaction of A ∪ B, and ⇔*_{[A/B]} is complete for satisfaction of A modulo B. Moreover, ⇔*_{A/B} and ≃^X_{A∪B} are equal relations on terms with variables in X.

Proof: The first assertion follows from the definitions of ⇒_{A/B} and ⇒_{[A/B]} (Definition 7.3.1), and the second follows from the first by induction. The third follows from ⇒_{[A/B]} being a rephrasing of the rule (+_B). The forward direction of the fourth follows from the second and third, plus (1) implies (4) of Theorem 7.2.3, while the converse follows from Theorem 5.1.16 for A ∪ B and the definition of ⇒*_{A/B}. The fifth and sixth follow from the fourth plus Theorem 7.2.3. □

Example 7.3.5 (Right Identity for Left Groups) Although the proof in Example 7.2.6 using built-in associativity is simpler than the original proof in Section 4.6, it can be simplified even further by using rewriting modulo associativity, replacing the last two apply commands in both the first and the second subproof by "apply reduction within term .".

  open GROUPLA .
  op a : -> Elt .
  start a * a -1 .              ***> first prove the right inverse law:
  apply -.lid at term .         ***> should be: e * a * a -1
  apply -.linv with A = (a -1) within term .
                                ***> should be: a -1 -1 * a -1 * a * a -1
  apply reduction within term . ***> should be: e
  [rinv] eq A * A -1 = e .      ***> add the proven equation
  start a * e .                 ***> now prove the right identity law
  apply -.linv with A = a within term .
                                ***> should be: a * a -1 * a
  apply reduction within term . ***> should be: a
  close

Now the first subproof takes only three apply commands, and the second only two. But it is something of an accident that this works, because a different ordering of rewrites could have blocked this proof by undoing some prior backward applications; thus, the proof style of Example 7.2.6 represents a more safe and sure way to proceed. □

Example 7.3.6 The definition of ring is as follows (e.g., see [128], Chapter IV), noting that the multiplication * need not be commutative:

  th RING is sort R .
    ops 0 1 : -> R .
    op _+_ : R R -> R [assoc comm idr: 0 prec 5] .
    op _*_ : R R -> R [assoc idr: 1 prec 3] .
    op -_ : R -> R [prec 1] .
    vars A B C : R .
    [ri] eq A + (- A) = 0 .
    [ld] eq A *(B + C) = A * B + A * C .
    [rd] eq (B + C) * A = B * A + C * A .
  endth

We will prove that (∀A) A ∗ 0 = 0. For this, we should turn on the print with parens feature, so that when rules are shown, we can check how they are parenthesized.

  open .
  ops a b c : -> R .
  show rules .
  start a * 0 .
  apply -.6 at top .
  apply -.ri with A = a * a at 1 .
  apply -.ld with A = a at [1 .. 3] .
  apply red at term .
  close

Rule .6 is (0 + X_id) = X_id, which was generated by the "idr: 0" declaration (we learn this from the output of the show rules command). The result of the final reduction is 0, and so the proof is done. □

Modulo B versions of the basic term rewriting concepts are obtained from the ARS definitions applied to the class rewriting relation:

Definition 7.3.7 Given an MTRS M = (Σ, A, B), then A is terminating modulo B iff ⇒_{[A/B]} is terminating, and is Church-Rosser modulo B iff ⇒_{[A/B]} is Church-Rosser. Ground terminating, ground Church-Rosser, canonical, ground canonical, etc. modulo B are defined similarly, and we also say that M is terminating, Church-Rosser, canonical, etc.
□

From this and Theorem 7.3.4 we get the following:

Proposition 7.3.8 An MTRS M = (Σ, A, B) is terminating iff ⇒_{A/B} is terminating, i.e., iff there is no infinite sequence t_1, t_2, t_3, ... of Σ-terms such that t_1 ⇒_{A/B} t_2 ⇒_{A/B} t_3 ⇒_{A/B} ···. Similarly, M is ground terminating iff there is no such infinite sequence of ground Σ-terms. Also, M is Church-Rosser iff whenever t ⇒*_{A/B} t_1 and t ⇒*_{A/B} t_1′, there exist t_2, t_2′ equivalent modulo B such that t_1 ⇒*_{A/B} t_2 and t_1′ ⇒*_{A/B} t_2′, and is ground Church-Rosser iff this property holds for all ground terms t. Moreover, a Σ-term t is a normal form for M iff there is no t′ such that t ⇒_{A/B} t′, and is a normal form for a Σ-term t′ iff t is a normal form and t′ ⇒*_{A/B} t. Finally, t is a normal form for ⇒_{A/B} iff [t]_B is a normal form for ⇒_{[A/B]}. □

The following generalizes Theorem 5.2.9 to rewriting modulo B:

Theorem 7.3.9 Given a ground canonical MTRS (Σ, A, B), if t_1, t_2 are two normal forms of a ground term t under ⇒_{A/B} then t_1 ≃_B t_2. Moreover, the B-equivalence classes of ground normal forms under ⇒_{A/B} form an initial (Σ, A ∪ B)-algebra, denoted N_{Σ,A/B} or N_{A/B}, as follows, where [[t]] denotes any arbitrary normal form of t, and [[t]]_B denotes the B-equivalence class of [[t]]:
(0) interpret σ ∈ Σ_{[],s} as [[σ]]_B in N_{Σ,A/B,s}; and
(1) interpret σ ∈ Σ_{s1...sn,s} with n > 0 as the map sending ([[t_1]]_B, ..., [[t_n]]_B) with t_i ∈ T_{Σ,si} to [[σ(t_1, ..., t_n)]]_B in N_{Σ,A/B,s}.
Finally, N_{Σ,A/B} is Σ-isomorphic to T_{Σ,A∪B}.

Proof: For convenience, write N for N_{Σ,A/B}. The first assertion follows from the ARS result Theorem 5.7.2, using Theorem 7.3.4. Note that σ_N is well defined by (1), because of the first assertion, plus the fact that ≃_B is a Σ-congruence relation.

Next, we check[E26] that N satisfies A ∪ B. Satisfaction of B is by definition of N as consisting of B-equivalence classes of normal forms. Now let (∀X) t = t′ be in A; we need a(t) = a(t′) for all a : X → N. Let b : X → T_{Σ,B} denote the extension of a from target N to T_{Σ,B}. Then a(t) = [[b(t)]]_B for any t ∈ T_{Σ,B}(X), because [[b(_)]]_B is a Σ-homomorphism since [[_]]_B is, and there is a unique Σ-homomorphism T_{Σ,B}(X) → N that extends a. Now applying the given rule to t with the substitution b gives b(t) ⇒_{A/B} b(t′), so these two terms have the same canonical form, i.e., [[b(t)]]_B = [[b(t′)]]_B and thus a(t) = a(t′), as desired.

Next, let M be an arbitrary (Σ, A ∪ B)-algebra, and let h : T_{Σ,B} → M be the unique Σ-homomorphism. Noting that N ⊆ T_{Σ,B}, let g : N → M be the restriction of h to N. We now prove that g is a Σ-homomorphism by structural induction over Σ:
(0) Given σ ∈ Σ_{[],s}, we get g(σ_N) = h([[σ]]_B) by definition. Then Theorem 7.3.4 gives h([σ]) = h([[σ]]_B) because [σ] ⇒*_{[A/B]} [[σ]]_B, and then h([σ]) = σ_M because h is a Σ-homomorphism. Therefore g(σ_N) = σ_M, as desired.
(1) Given σ ∈ Σ_{s1...sn,s} with n > 0, we get g(σ_N([[t_1]]_B, ..., [[t_n]]_B)) = h([[σ(t_1, ..., t_n)]]_B) by definition. Then Theorem 7.3.4 gives h([σ(t_1, ..., t_n)]) = h([[σ(t_1, ..., t_n)]]_B), so that h([σ(t_1, ..., t_n)]) = σ_M(h([t_1]), ..., h([t_n])) = σ_M(g([[t_1]]_B), ..., g([[t_n]]_B)) because h is a Σ-homomorphism. Therefore g(σ_N([[t_1]]_B, ..., [[t_n]]_B)) = σ_M(g([[t_1]]_B), ..., g([[t_n]]_B)), as desired.

For uniqueness, suppose g′ : N → M is another Σ-homomorphism. Let r : T_{Σ,B} → N be the map sending [t] to [[t]]_B, and note that it is a Σ-homomorphism by the definition of [[_]]_B. Next, note that if i : N → T_{Σ,B} denotes the inclusion Σ-homomorphism, then i ; r = 1_N. Finally, note that r ; g = r ; g′ = h, by the uniqueness of h. It now follows that i ; r ; g = i ; r ; g′, which implies g = g′. The last assertion follows since both are initial (Σ, A ∪ B)-algebras. □

Theorem 7.3.10 Given a canonical MTRS (Σ, A, B), then [A] ⊢_B (∀X) t = t′ iff A ∪ B ⊢ (∀X) t = t′ iff [[t]] ≃^X_B [[t′]], where as before, [[t]] denotes an arbitrary normal form of t under ⇒_{A/B}.

Proof: The first "iff" is Theorem 7.2.3. The "if" direction of the second "iff" is straightforward. For its "only if," let h and h′ be the unique Σ(X)-homomorphisms from T_{Σ(X)} to T_{Σ(X),A∪B} and N_{Σ(X),A/B}, respectively. Then A ∪ B ⊢ (∀X) t = t′ implies h(t) = h(t′). But T_{Σ(X),A∪B} is Σ(X)-isomorphic to N_{Σ(X),A/B} by Theorem 7.3.9. Therefore h′(t) = h′(t′), i.e., t and t′ have the same normal form modulo B. □

An important consequence of the above theorem is that we can define a function == that works for canonical MTRS's the same way that the function == described in Section 5.2 works for ordinary canonical TRS's, namely, t == t′ returns true over an MTRS (Σ, A, B) iff t, t′ are provably equal under A ∪ B as an equational theory. This function is implemented in OBJ3 by computing the normal forms of t, t′ and checking whether they are equal modulo B. Note that even when (Σ, A, B) is not canonical, if t == t′ does return true then t and t′ are equal under (Σ, A, B), again just as for ordinary TRS's. The function =/= is also available for MTRS's in OBJ3, but just as for ordinary TRS's, it is dangerous if the system is not canonical for the sort involved. The use of == is illustrated by proofs in the following subsection.
This section gives more inductive proofs along the lines of those in Section 6.5.1, but using associative-commutative rewriting. Example 6.5.7 implies we can use AC rewriting for addition, and Exercises 7.3.1 and 7.3.2 below imply we can also use AC rewriting for multiplication.

Example 7.3.11 (Formula for 1 + 2 + ··· + n) We give an inductive proof of a formula for the sum of the first n positive numbers,

  1 + 2 + ··· + n = n(n + 1)/2,

using Exercises 7.3.1 and 7.3.2 by giving + and * the attributes assoc and comm. This saves us from having to worry about the ordering and grouping of subterms within expressions. The second module defines the function sum(n) = 1 + 2 + ··· + n. (What we actually prove is that sum(n) + sum(n) = n(n + 1).)

obj NAT is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 1] .
  op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
  vars M N : Nat .
  eq M + 0 = M .
  eq M + s N = s(M + N) .
  op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
  eq M * 0 = 0 .
  eq M * s N = M * N + M .
endo

obj SUM is protecting NAT .
  op sum : Nat -> Nat .
  var N : Nat .
  eq sum(0) = 0 .
  eq sum(s N) = s N + sum(N) .
endo

open .
  ops m n : -> Nat .
  ***> base case
  reduce sum(0) + sum(0) == 0 * s 0 .
  ***> induction step
  eq sum(n) + sum(n) = n * s n .
  reduce sum(s n) + sum(s n) == s n * s s n .
close

The line "protecting NAT" indicates that the natural numbers are imported in "protecting" mode, which means that they are supposed to have no junk and no confusion for the sort Nat in the models of SUM. We can also use this example to illustrate how unsuccessful proof scores can yield hints about lemmas to prove. If we try the same proof score as above, but without the assoc and comm attributes for multiplication, then the base case works, but the induction step fails, with the two sides being s(s(n + n + (n * n) + n)) and s(s(n + (s n * n) + n)). (The reduction actually evaluates to a rather noninformative false.
However, we can get the desired information either by reducing the two sides separately, or else by replacing _==_ by a Boolean-valued operation _eq_ satisfying the single equation N eq N = true.) The difference between these comes from the terms n + (n * n) and (s n * n), the equality of which differs from the second law for * by commutativity. This might suggest either proving the lemma

  s N * N = N + N * N,

or else proving the commutativity of *. The induction step goes through either way, and we also discover that associativity of multiplication is not needed here. The two lemmas in the proof of the commutativity of addition (Example 6.5.7) were arrived at in the same way. □

Exercise 7.3.1 Use OBJ3 to show the associativity of multiplication of natural numbers. □

Exercise 7.3.2 Use OBJ3 to show the commutativity of multiplication of natural numbers. □

Example 7.3.12 (Fermat's Little Theorem for p = 3) The "little Fermat theorem" says that

  x^p ≡ x (mod p) for any prime p,

i.e., that the remainder of x^p after division by p equals the remainder of x after division by p. The following OBJ3 proof score for the case p = 3 illustrates inductive proof where there are non-trivial relations among the constructors; for in this example, unlike the usual natural numbers, s s s 0 = 0.

obj NAT3 is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 1] .
  op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
  vars L M N : Nat .
  eq M + 0 = M .
  eq M + s N = s(M + N) .
  op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
  eq M * 0 = 0 .
  eq M * s N = M * N + M .
  eq L * (M + N) = L * M + L * N .
endo

open .
  var M : Nat .
  eq M + M + M = 0 .
  op x : -> Nat .
  ***> base case, x = 0
  red 0 * 0 * 0 == 0 .
  ***> induction step
  eq x * x * x = x .
  red s x * s x * s x == s x .
close

The first equation after the open quotients the natural numbers to the naturals modulo 3. □

Exercise 7.3.3 Let (A, ⊕) be an Abelian semigroup, i.e., suppose that ⊕ is a binary associative, commutative operation on A.
Now use OBJ for the following:

1. Define ⊕_{1≤i≤n} a(i), where a(i) ∈ A for 1 ≤ i ≤ n and n > 0.

2. Given a(i), b(i) ∈ A for 1 ≤ i ≤ n, prove that

  ⊕_{1≤i≤n} (a(i) ⊕ b(i)) = (⊕_{1≤i≤n} a(i)) ⊕ (⊕_{1≤i≤n} b(i)) .

Hint: use the following declarations in OBJ:

  op _+_ : A A -> A [assoc comm] .
  ops a b : Nat -> A .

3. Give an example of this formula where A is the integers and ⊕ is addition. Explain how to extend ⊕_{1≤i≤n} a(i) to the case where n = 0, and generalize this to the case of an arbitrary Abelian semigroup. □

[Footnote: I thank Dr. Immanuel Kounalis for doubting that OBJ3 could handle non-trivial relations on constructors, and then presenting the challenge to prove this result.]

A very nice application of term rewriting modulo equations is a decision procedure for the propositional calculus. One way to define the propositional calculus is with the following equational theory, which is written in OBJ3; we override the default inclusion of the Booleans in order to avoid ambiguous parsing for and, or, etc.; the imported module TRUTH provides OBJ's built-in sort Bool with just the two constants true and false, and basic built-in operations like ==.
set include BOOL off .
obj PROPC is protecting TRUTH .
  op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
  op _xor_ : Bool Bool -> Bool [assoc comm prec 3] .
  vars P Q R : Bool .
  eq P and false = false .
  eq P and true = P .
  eq P and P = P .
  eq P xor false = P .
  eq P xor P = false .
  eq P and (Q xor R) = (P and Q) xor (P and R) .
  op _or_ : Bool Bool -> Bool [assoc comm prec 7] .
  op not_ : Bool -> Bool [prec 1] .
  op _implies_ : Bool Bool -> Bool [prec 9] .
  op _iff_ : Bool Bool -> Bool [assoc prec 11] .
  eq P or Q = (P and Q) xor P xor Q .
  eq not P = P xor true .
  eq P implies Q = (P and Q) xor P xor true .
  eq P iff Q = P xor Q xor true .
endo

The main part of this specification involves only the connectives and and xor; the second part defines the remaining propositional connectives in terms of these two, plus the constant true. Because it is already known that these equations (including those for AC) are one way to define the propositional calculus, we know that the above really is a theory of the propositional calculus. The following result (due to Hsiang [106]) explains why PROPC is important:

Theorem 7.3.13 As a term rewriting system, PROPC is canonical modulo B, where B consists of the associative and commutative laws for xor and and. □

This is proved in Exercise 12.1.3. Note also that this B is linear balanced. Moreover,

Fact 7.3.14 The initial algebra of PROPC has just two elements, namely true and false.

Proof: Given Theorem 7.3.13, it suffices to determine the reduced forms, by Theorem 7.3.9. The terms true and false are reduced because no rules apply to them, and a case analysis of the eight terms built from true and false using and and xor shows that they all reduce to either true or false. (We can ignore the other operations because they are defined in terms of and and xor.)
□

It follows from this that PROPC really does protect its imported module TRUTH, as its specification claims (the "protecting" notion was defined[E27] in Chapter 6).

Our commitment to semantics demands that before going further, we should make it clear what we mean by saying that an equation is "true" in the propositional calculus:

Definition 7.3.15 An equation e is a theorem of the propositional calculus iff T_PROPC ⊨ e, and a T_{Σ_PROPC}(X)-term t is a tautology iff (∀X) t = true is a theorem of the propositional calculus. Formulae t, t′ are equivalent iff (∀X) t = t′ is a theorem of the propositional calculus. □

We can prove T_PROPC ⊨ (∀X) t = t′ directly, by checking whether a(t) = a(t′) for all a : X → T_PROPC. Because T_PROPC has exactly two elements, true and false, by Fact 7.3.14, there are exactly 2^N cases to check when X has N variables, one case for each possible assignment a. This is essentially Ludwig Wittgenstein's well-known method of truth tables, also called (Boolean) case analysis; it is easy to apply this method by hand when N is small.

Example 7.3.16 To illustrate the method of truth tables, let's check whether or not the equation (∀P, Q) P and (P or Q) = P or Q is a theorem of the propositional calculus. Since there are two variables, there are four possible assignments; these appear in the left two columns, serving as labels for the four rows of the table below:

  P      Q      P or Q   P and (P or Q)   equal?
  true   true   true     true             yes
  true   false  true     true             yes
  false  true   true     false            no
  false  false  false    false            yes

Thus we see that the equation is false, and that P = false, Q = true is a counterexample. Of course, such calculations do not require an elaborate LaTeX tabular format. □

Theorem 7.3.13 plus some results from Chapter 5 that are generalized to rewriting modulo B later in this chapter will imply that PROPC(X), the enrichment of PROPC by X, is canonical for any X.
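The method of truth tables is easy to mechanize. The following Python sketch (our own illustration; the function names are not from OBJ) checks an equation of the propositional calculus by Boolean case analysis over all 2^N assignments, in the style of Example 7.3.16:

```python
from itertools import product

def equivalent(f, g, variables):
    """Check (forall variables) f = g by trying all 2**N Boolean assignments."""
    for values in product([True, False], repeat=len(variables)):
        a = dict(zip(variables, values))
        if f(a) != g(a):
            return False, a          # a is a counterexample
    return True, None

# P and (P or Q)  versus  P or Q, as in Example 7.3.16:
f = lambda a: a["P"] and (a["P"] or a["Q"])
g = lambda a: a["P"] or a["Q"]
ok, counterexample = equivalent(f, g, ["P", "Q"])
print(ok, counterexample)   # False {'P': False, 'Q': True}
```

The counterexample found is exactly the failing row of the truth table, P = false, Q = true.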
However, the canonical forms of this MTRS are not as well known as they should be. The following is needed to describe those forms:

Definition 7.3.17 A formula of the propositional calculus is an exclusive normal form (abbreviated ENF) iff it has the form

  E_1 xor E_2 xor ... xor E_n,

where each E_i (called an exjunct) has the form

  C_{i,1} and C_{i,2} and ... and C_{i,k_i},

where each C_{i,j} (called a conjunct) either has the form P or else the form not P, where P is a variable of sort Prop; by convention, we say that the empty ENF (n = 0) is false, and that the empty exjunct (k_i = 0) is true. Those P that occur in an exjunct with a preceding not are said to occur negatively, and those that occur without it positively. Given a set X of variable symbols, an exjunct E is complete (with respect to X) iff each variable in X appears in E, and an ENF is complete (with respect to X) iff each of its exjuncts is complete. □

The result mentioned above may now be stated as follows:

Proposition 7.3.18 Every formula of the propositional calculus is equivalent to a unique (modulo B) irredundant positive ENF, defined to be an ENF having only positive conjuncts, involving (at most) the same variables, with no repeated conjuncts and no repeated exjuncts; these irredundant positive ENFs are the canonical forms of PROPC. Moreover, every formula of the propositional calculus is equivalent to a unique (modulo B) complete ENF involving (exactly) the same variables that it has.

Proof: The first assertion follows from noticing that no rule of PROPC(X) applies to any irredundant positive ENF, so that these forms are reduced, and noticing that any formula using only xor and and that is not an irredundant positive ENF can be rewritten using one of the rules in the first part of PROPC, and so cannot be canonical.
Therefore the irredundant positive ENFs must be its canonical forms.

For the second assertion, the complete ENF of a formula can be obtained from its irredundant positive ENF as follows: for each exjunct, if some variable x does not appear in it, conjoin to it the term x xor not x (which equals true), and then simplify the resulting term using only the distributive and idempotent laws; the result will be a complete ENF that is equivalent to the original irredundant positive ENF, because only rules from PROPC were used. This equivalence and the first assertion imply that distinct complete ENFs are inequivalent, and that every term is equivalent to a unique (modulo B) complete ENF. □

Under the correspondence of the above proof, a term t has the irredundant positive ENF true iff its complete ENF contains all 2^N exjuncts. More generally, the exjuncts in the complete ENF of t correspond to those rows in its truth table where it is true. As an example, we find the complete ENF of the term x xor y, using a simplified notation with + for xor, juxtaposition for and, and overbar for negation: calculating with PROPC, modulo AC for both binary operations, we have x = x(y + ȳ) = xy + xȳ and y = y(x + x̄) = yx + yx̄, so that the complete ENF for x + y is xȳ + x̄y (the two copies of xy cancel under xor). Similarly, the complete ENF for xȳ + xz is xyz + xȳz̄.

Corollary 7.3.19 Two propositional calculus formulae over variables X are provably equal in PROPC(X) iff they yield the same Boolean value for every assignment of Boolean values for the variables in X.

Proof: Two terms are provably equal iff they have the same complete ENF, the exjuncts of which give exactly the Boolean assignments to X for which the terms are true.[E28] □

Proposition 7.3.20 Given a set X of N Boolean variables, PROPC has a free algebra on X generators, and it has 2^(2^N) elements.
Proof: Recall that the free PROPC-algebra on X generators is the initial algebra of PROPC(X) viewed as a PROPC-algebra. By the proof of Proposition 7.3.18, the normal forms of PROPC(X) are in bijective correspondence with the complete exclusive normal forms. Each complete exclusive normal form can be seen as a set of complete exjuncts, and then it is easy to see that there are 2^N different complete exjuncts, and therefore 2^(2^N) different sets of complete exjuncts. □

The elements of this free algebra can be seen as all of the possible Boolean functions on N variables, noting that N variables can take 2^N configurations, each of which can have one of 2 values, again giving 2^(2^N) functions.

Exercise 7.3.4 Let B be the set containing true and false, let Σ be the signature of PROPC(X), and let M be the Σ-algebra with carrier [[X → B] → B], with operations from PROPC defined "pointwise" on functions from Boolean operations on B (e.g., with xor_M(f, g)(a) = f(a) xor g(a) for a : X → B), and with x ∈ X interpreted as x_M(a) = a(x). Show that M is a free PROPC-algebra on X generators. □

The PROPC MTRS has the very special property that we can decide whether or not non-ground equations hold in the initial algebra just by comparing the canonical forms of their left- and right sides; this property is unfortunately as rare as it is useful. Note that canonicity alone only allows showing that an equation does hold in the initial algebra; reduction completeness decides whether or not the equation holds. The following provides a precise formulation of what it means to say that PROPC gives a decision procedure for the propositional calculus:

Definition 7.3.21 A TRS (Σ, A) is reduction complete iff it is canonical and for any Σ-equation e, say (∀X) t_1 = t_2, we have T_{Σ,A} ⊨ e iff [[t_1]] = [[t_2]]. An MTRS (Σ, A, B) is reduction complete iff it is canonical and for any Σ-equation e, say (∀X) t_1 = t_2, we have T_{Σ,A∪B} ⊨ e iff [[t_1]] ≃^X_B [[t_2]].
□

Exercise 7.3.5 Show that the TRS's from Examples 5.1.7 and 5.5.7 are not reduction complete. □

Theorem 7.3.22 The MTRS PROPC is reduction complete.

Proof: Let E = A ∪ B. By the completeness theorem, it will suffice to prove that, for any Σ-equation e,

  T_{Σ,E} ⊨ e iff M ⊨ e for every (Σ, E)-algebra M.

That the second condition implies the first is immediate. For the converse, we first treat the free (Σ, E)-algebras. We will use contradiction, and so we suppose that T_{Σ,E} satisfies e but that T_{Σ,E}(Z) does not satisfy e. Then there exists an a : X → T_{Σ,E}(Z) such that ā(t) ≠ ā(t′). By Exercises 7.3.4 and 6.1.3, we can take T_{Σ,E}(Z) to be [[Z → B] → B] with pointwise operations, where B = {true, false}; similarly, we can take T_{Σ,E} to be B. Let u = ā(t) and let u′ = ā(t′). Since u ≠ u′, there exists some b : Z → B such that u(b) ≠ u′(b). Now defining c = a ; b̂ : X → B, we get c̄ = ā ; b̂ : T_{Σ,E}(X) → B, by 4. of Exercise 6.1.4. Next, if we define b̂ : [[Z → B] → B] → B by b̂(u) = u(b), then the reader can check that b̂ is a Σ-homomorphism such that b̂(z) = b(z), where the first z is the function in [[Z → B] → B] defined in Exercise 7.3.4. Then b̂ = b̄ since there is just one such Σ-homomorphism extending b. Therefore c̄(t) = b̂(ā(t)) = b̂(u) = u(b), and similarly c̄(t′) = u′(b). Therefore c̄(t) ≠ c̄(t′), contradicting our assumption that B satisfies e. We next show the desired implication for any (Σ, E)-algebra M. By Proposition 6.1.18, there is some Z such that q : T_{Σ,E}(Z) → M is surjective. But then T_{Σ,E}(Z) ⊨ e implies M ⊨ e, and so we are done.
□

It is easy to apply this result in OBJ, because its built-in operation == returns true iff its two arguments have normal forms that are equivalent modulo the attributes declared for the operations involved; an alternative, which is justified in Exercise 7.3.8, is just to reduce the expression t iff t′.

Exercise 7.3.6 Use OBJ3 to determine whether or not the following are tautologies of the propositional calculus:
1. P implies (P implies P) .
2. P implies (P implies not P) .
3. not P implies (P implies not P) .
4. (P implies Q) implies Q .
5. P iff P iff P .
6. P iff P iff P iff P .
7. (P implies Q) implies (Q implies Q) .
Now use truth tables to check at least three of the above. □

Exercise 7.3.7 Use OBJ3 to determine whether or not the following are theorems of the propositional calculus:
1. (∀P) P = not not P .
2. (∀P, Q) P or Q = not P xor not Q .
3. (∀P) P = P iff P .
4. (∀P, Q, R) P implies (Q and R) = (P implies Q) and (P implies R) .
5. (∀P, Q) not (P and Q) = not P or not Q .
6. (∀P, Q, R) P implies (Q or R) = (P implies Q) or (P implies R) .
Also use truth tables to check at least three of them. □

Exercise 7.3.8 Show that (∀X) t = t′ is a theorem of the propositional calculus iff t iff t′ is a tautology, iff not (t xor t′) is a tautology. □

Exercise 7.3.9 Show that if Σ_PROPC-formulae t, t′ are equivalent, then t is a tautology iff t′ is. □

Exercise 7.3.10 Show that if X ⊆ Y then (∀X) t = t′ is a theorem of the propositional calculus iff (∀Y) t = t′ is. □

Definition 7.3.23 A formula of the propositional calculus is a disjunctive normal form (abbreviated DNF) iff it has the form

  D_1 or D_2 or ... or D_n,

where each D_i (called a disjunct) has the form

  C_{i,1} and C_{i,2} and ...
and C_{i,k_i}, where each C_{i,j} (called a conjunct) either has the form P or else the form not P, where P is a variable of sort Prop; by convention, we say that the empty DNF (n = 0) is false and the empty disjunct (k_i = 0) is true. Those P that occur in a disjunct with a preceding not occur negatively, and those that occur without it occur positively. Given a set X of variable symbols, a disjunct C is complete (with respect to X) iff each variable in X appears in C, and a DNF is complete (with respect to X) iff each of its disjuncts is complete. □

It follows from the above conventions that both true and false are DNFs. It also follows that if X has N elements, then each complete disjunct has N conjuncts. The following is well known:

Proposition 7.3.24 Every formula of the propositional calculus is equivalent to a DNF having (at most) the same variables, and to a unique (modulo B) complete DNF having (exactly) the same variables. □

A nice proof of the above uses the MTRS in Exercise 7.3.11 below to rewrite formulae to disjunctive normal form; this MTRS is shown terminating in Example 7.5.10, and Church-Rosser in Exercise 12.1.1; hence this MTRS is canonical. Exercise 12.1.2 shows that its reduced forms are DNFs.

Exercise 7.3.11 Choose five non-trivial formulae of the propositional calculus and use the TRS below to find their DNFs; explain why the reduced forms of this TRS are necessarily correct if they are DNFs, without using the as yet unproved result that the TRS is canonical.
obj DNF is protecting TRUTH .
  op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
  op _or_ : Bool Bool -> Bool [assoc comm prec 3] .
  op not_ : Bool -> Bool [prec 1] .
  vars P Q R : Bool .
  eq P and false = false .
  eq P and true = P .
  eq P and P = P .
  eq P or false = P .
  eq P or true = true .
  eq P or P = P .
  eq not false = true .
  eq not true = false .
  eq P or not P = true .
  eq not not P = P .
  eq not(P and Q) = not P or not Q .
  eq not(P or Q) = not P and not Q .
  eq P and (Q or R) = (P and Q) or (P and R) .
  op _xor_ : Bool Bool -> Bool [assoc comm prec 7] .
  op _implies_ : Bool Bool -> Bool [prec 9] .
  op _iff_ : Bool Bool -> Bool [assoc prec 11] .
  eq P xor Q = (P and not Q) or (not P and Q) .
  eq P implies Q = not P or Q .
  eq P iff Q = (P and Q) or (not P and not Q) .
  eq P and not P = false .
endo

Please note that once again, BOOL should not be included. Although this TRS is similar to PROPC, it has a quite different purpose; in particular, the formula (p and q) or (p and not q) is reduced under DNF but not under PROPC, where it has the canonical form p. Similarly, not p is reduced under DNF but not under PROPC, where it has canonical form p xor true. □

This section brings us closer to how OBJ3 actually implements term rewriting modulo equations, with the following weaker relation, which overcomes the inefficiency of Definition 7.3.1 because it only requires matching on subterms of the source term:

Definition 7.3.25 Given an MTRS (Σ, A, B), for t, t′ ∈ T_Σ(X), we say t weakly rewrites to t′ under (or with) A modulo B in one step iff there exist a rule t_1 → t_2 of sort s in A with variables Y, a term t_0 ∈ T_Σ({z}_s ∪ X), and a substitution θ : Y → T_Σ(X) such that t = t_0(z ← t*) and t* ≃_B θ(t_1) and t′ = t_0(z ← θ(t_2)). In this case we write t ⇒_{A,B} t′. The relation weakly rewrites under A modulo B is the reflexive, transitive closure of ⇒_{A,B}, denoted ⇒*_{A,B}.
□

As usual, ⇒_{A,B} gives an abstract rewriting system, so we automatically get the appropriate notions of termination, Church-Rosser, canonical, and local Church-Rosser for weak term rewriting modulo B, in both the general and the ground cases; and of course we also get the usual collection of results by specializing the general results about abstract rewrite systems.

It is clear that weak rewriting modulo B implies rewriting modulo B, i.e., ⇒_{A,B} ⊆ ⇒_{A/B}. But the following example shows that weak rewriting modulo B is strictly weaker than rewriting modulo B, so that its reflexive, transitive, symmetric closure cannot be complete.

Example 7.3.26 Let Σ have one sort with a binary operation + and constants 0, a, b, let A contain the left zero law, 0 + X = X, and let B contain the associative law. Then (a + 0) + b ⇒_{A/B} a + b because (a + 0) + b ≃_B a + (0 + b) and a + (0 + b) →_A a + b, but (a + 0) + b is a normal form for ⇒_{A,B}. Therefore ⇒_{A,B} really is weaker than ⇒_{A/B}. □

Despite this incompleteness, many B have "completion procedures," which given a set A of rewrite rules, produce another set A′ such that rewriting with ⇒*_{A/B} and with ⇒*_{A′,B} always yields B-equivalent terms. In fact this is how OBJ3 actually implements rewriting modulo some equations [90, 113]. This allows users to think of computation as being done with ⇒*_{A/B} even though it is really done with ⇒*_{A′,B}; in OBJ3, the new rules generated by completion can be seen with the "show all rules ." command. Completion will be discussed in Chapter 12. Hereafter, we study ⇒_{A/B} since it describes what OBJ3 does, though not how.
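Example 7.3.26 can be replayed in code. The sketch below is our own illustration (assuming a single associative operation + with terms as nested pairs): it applies the left zero rule 0 + X = X first only at subterms of the given term, as in weak rewriting, and then at subterms of every term in the associativity class, as in class rewriting, showing that (a + 0) + b is a weak normal form yet class-rewrites to a + b.

```python
# Terms over one binary operation + : nested pairs ("+", l, r) or constant strings.
RULE = lambda t: t[2] if isinstance(t, tuple) and t[1] == "0" else None  # 0 + X = X

def rewrites(t):
    """All one-step weak rewrites of t (rule applied at t or at a subterm)."""
    out = []
    if isinstance(t, tuple):
        r = RULE(t)
        if r is not None:
            out.append(r)
        _, left, right = t
        out += [("+", l2, right) for l2 in rewrites(left)]
        out += [("+", left, r2) for r2 in rewrites(right)]
    return out

def leaves(t):
    return [t] if isinstance(t, str) else leaves(t[1]) + leaves(t[2])

def all_trees(xs):
    """All re-bracketings of the leaf sequence xs: the associativity class."""
    if len(xs) == 1:
        return [xs[0]]
    return [("+", l, r) for i in range(1, len(xs))
            for l in all_trees(xs[:i]) for r in all_trees(xs[i:])]

t = ("+", ("+", "a", "0"), "b")          # (a + 0) + b
print(rewrites(t))                        # [] : a weak normal form
class_results = {s2 for s in all_trees(leaves(t)) for s2 in rewrites(s)}
print(class_results)                      # {('+', 'a', 'b')} : class rewriting succeeds
```

Enumerating the whole B-equivalence class, as done here, is exactly the inefficiency that weak rewriting (and completion of A modulo B) is designed to avoid.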
Verification of Hardware Circuits

Hardware verification is a natural application for equational logic, because both circuits and their behaviors are described by sets of equations in a very natural way; moreover, equational logic is simple and well understood, with efficient algorithms for many relevant decision problems. We first treat so-called combinatorial (or combinational) circuits, which have no loops or memory, illustrated by combinations of "logic gates" and also by simple CMOS transistor circuits. Bidirectional logic circuits are then treated in Section 7.4.5; these may have loops and memory. Chapter 9 extends equational logic with second-order universal quantification, enabling us to verify so-called sequential circuits, which have time-dependent behavior. Much more could be said about solving the equations that arise from hardware circuits, but our intention here is not to develop a complete theory, but rather to provide a collection of enticing applications for term rewriting modulo equations.

[Figure 7.1: Power and Ground]

Our circuit models involve two voltage levels, called "power" and "ground," diagrammed as shown in Figure 7.1, and identified with true and false, respectively. Wires in circuits are assumed to have either the value power or else the value ground, and are modeled by Boolean variables. Wires that are directly connected must share the same voltage level, and are therefore represented by the same variable. These assumptions allow us to use the term rewriting decision procedure for the propositional calculus in our proofs.

The simplest kind of combinatorial circuit features a direct "flow" from some given input wires, through some logic gates, to some output wires.
The gates are modeled by the corresponding Boolean functions, and in this situation, all the computation can be done by the propositional calculus decision procedure, as illustrated in the following:

Example 7.4.1 Figure 7.2 is a circuit diagram for a 1-bit full adder, built from the usual logic gates, including xor (exclusive or).

[Footnote: Although this approach ignores issues such as load (i.e., current flow), resistance, timing, and capacitance, it does fully capture the logical aspects of circuits. Moreover, it seems likely that many other issues can be handled by using larger sets of values on wires, for example, in the style of Winskel [183], and that these larger value sets can also be implemented with term rewriting.]

[Figure 7.2: 1-bit Full Adder, with inputs i1, i2, cin, internal wires p1, ..., p5, and outputs cout, sout]

The equations in the OBJ module FADD below say the same thing as this diagram, using constants for the values of wires: i1, i2, cin are inputs, and cout, sout are the outputs. To verify that this circuit has the logical behavior of a full adder, we must prove the following first-order formula (Section 7.4.3 explains why we prove this particular formula):

  (∀Z) (T ⇒ e_1 ∧ e_2),

where T specifies the circuit, Z consists of its variables, and e_1, e_2 are the two equations

  cout = (i1 and i2) or (i1 and cin) or (i2 and cin)
  sout = (i1 and i2 and cin) or (i1 and not i2 and not cin) or
         (not i1 and i2 and not cin) or (not i1 and not i2 and cin)

with the equations in T as in the module FADD below.
The following OBJ proof score for this verification first introduces constants for the variables representing the wires, then gives the equations that describe the circuit, and finally checks whether the output variables satisfy their specifications by using reduction over the PROPC Boolean decision procedure. Proposition 7.4.3 below and familiar results on first-order logic (fully explained in Chapter 8) justify that this score actually proves the desired formula.

    th FADD is extending PROPC .
      ops i1 i2 cin p1 p2 p3 p4 p5 cout sout : -> Bool .
      eq p1 = i1 and i2 .
      eq p2 = i1 and cin .
      eq p3 = p1 or p2 .
      eq p4 = cin and i2 .
      eq p5 = cin xor i2 .
      eq cout = p3 or p4 .
      eq sout = i1 xor p5 .
    endth
    reduce cout iff (i1 and i2) or (i1 and cin) or (i2 and cin) .
    reduce sout iff (i1 and i2 and cin) or
                    (i1 and not i2 and not cin) or
                    (not i1 and i2 and not cin) or
                    (not i1 and not i2 and cin) .

No manual application of rules is needed, since OBJ does all the work. By contrast, [25] gives a six-step, one-and-a-half-page outline of a proof for just the sout formula. □

The equations for this circuit have a very special form, which is described in the following, as a first step towards an algebraic theory of hardware circuits:

Definition 7.4.2  A set T of Σ(Z)-equations is an unconditional triangular propositional system iff Σ is the signature of PROPC (the propositional calculus specification given in Section 7.3.2), Z is a finite set of constants of sort Bool called variables, there is a subset X of Z, say x_1, …, x_n, called the input variables, and there is an ordering of the rest of Z, say y_1, …, y_m, called the dependent variables, such that T consists of equations having the form y_k = t_k for k = 1, …, m, where each t_k is a Σ(Z)-term involving only input variables and those non-input variables y_j with j < k; there must be exactly one equation for each k.
In addition, some of the non-input variables may be designated as output variables, with the rest being called internal (or "test point") variables. □

Hereafter we may omit "unconditional," and we may also use the phrases "combinatorial system" and "triangular propositional system" interchangeably, often omitting the word "propositional." The following display demonstrates why we chose the term "triangular":

    y_1 = t_1(x_1, …, x_n)
    ⋮
    y_k = t_k(x_1, …, x_n, y_1, …, y_{k−1})
    ⋮
    y_m = t_m(x_1, …, x_n, y_1, …, y_{m−1})

Note that each equation has sort Bool since that is the only sort in Σ(Z), and that t_1 can only contain input variables. Also, the equations in a triangular system are usually considered (implicitly) quantified by (∀∅), which means that, although we call them variables, all the x_i and y_i are technically constants; however, they will sometimes be universally quantified in formulae that describe the intended behavior of hardware circuits. In particular, we are interested in solving the equations in the triangular system over the initial model of PROPC, so that the variables in the triangular system are constrained to be either true or false, i.e., power or ground; we will see that these solutions correspond to certain Σ(X)-models of the system. Here PROPC has initial semantics, while triangular systems over it have loose semantics.

[Footnote: In practice we may let Σ and PROPC contain some propositional functions not in the original version of PROPC that could have been defined in it, such as p nor q = not (p or q).]

Triangular systems are not in general term rewriting systems over Σ, because the rightsides in general contain variables that do not occur in the leftsides. However, they are MTRS's over Σ(Z), since for this signature the "variables" are really constants.
Example 5.8.27 showed that unconditional triangular systems (in the sense of that example) are canonical as TRS's, and that the only variables in their normal forms are input variables. We want similar results for triangular systems over PROPC. We will generalize techniques from Chapter 5 to rewriting modulo B to show that enriching PROPC with a triangular system T again yields a canonical system, modulo the same B used for PROPC (Theorem 7.7.24). We use this in the following:

Proposition 7.4.3  Given an unconditional triangular system T with variables Z, let B be the associative and commutative laws for and and xor, P the equations of PROPC except B, and A = T ∪ P. Then the following are equivalent, for t, t′ any Σ(Z)-terms:

1. PROPC ⊨_Σ (∀Z) (T ⇒ t = t′);
2. (A ∪ B) ⊨_{Σ(Z)} (∀∅) t = t′;
3. (t iff t′) ⇒*_{A/B} true;
4. [[t]]_A ≃_B [[t′]]_A;
5. t ↓_{A/B} t′;
6. (t == t′) ⇒*_{A/B} true.

Proof: We omit subscripts B from ≃_B, =_B, ⇒_{A/B}, ⇒_{T/B} and ⇒_{P/B}. Conditions 1. and 2. are equivalent by rules of first-order logic. Conditions 2. and 4. are equivalent by Theorem 7.3.10. Conditions 3. and 4. are equivalent because ([[t]]_T iff [[t′]]_T) ⇒*_P true iff [[[[t]]_T]]_P ≃ [[[[t′]]_T]]_P by Exercise 7.3.8, and [[[[t]]_T]]_P ≃ [[t]]_A because both T modulo B and A modulo B are canonical, by Theorem 7.7.24. Finally, 4. and 5. are equivalent by Corollary 5.7.7, and 4. and 6. are equivalent by the definition of ==.

[Footnote: Chapter 8 gives full details, including the first-order version of the Theorem of Constants, which gives us that P ⊨_{Σ(Z)} (∀∅) (T ⇒ t = t′), and an implication elimination rule which moves T over to conjoin with P.]
□

This result justifies proving universally quantified implications for triangular systems as in Example 7.4.1, by reducing t iff t′ to true, to conclude that (∀Z) (T ⇒ t = t′). We call this the "method of reduction." Note that the variables in T, t, t′ are constants for reduction, while those in PROPC are not.

Fact 7.4.4  The canonical forms of an unconditional triangular propositional system contain only input variables.

Proof: We prove the contrapositive: if a term t contains an occurrence of a non-input variable y_k, then t is not reduced, because the rewrite rule y_k = t_k can be applied to it. □

Proof scores like that in Example 7.4.1 can be generated completely automatically from the circuit diagram and the sentences to be proved, because there is an exact correspondence between circuit diagrams like that in Figure 7.2 and triangular propositional systems. Although it would be too tedious to spell out this correspondence in detail here, we note that if a circuit cannot be put in triangular form, then either it is not combinatorial because it has some loops, or else it has internal variables that should instead have been declared as input variables, or vice versa.

Exercise 7.4.1  Design a circuit with inputs a_1, a_2, a_3, and with one output z which is true iff exactly two of the inputs are true; you may use any (2-input) logic gates you like. Prove that your design is correct using the method of reduction and OBJ3. □

Exercise 7.4.2  Design a circuit using only not, nand and or gates, having inputs a_1, a_2, a_3, a_4, and an output z which is true iff exactly two of the inputs are true. Use as few gates as you can (there is a solution with just 19). Prove the correctness of your design using OBJ3. □

Definition 7.4.5  A Boolean (or ground) solution to a triangular system is an assignment of Boolean values to all its variables such that all its equations are satisfied.
A system is consistent iff each assignment of Boolean values to input variables extends to a Boolean solution. A system is underdetermined iff it is consistent and some assignment of Boolean values to input variables extends to more than one Boolean solution. Two systems are Boolean equivalent iff they have exactly the same Boolean solutions. A model of a triangular system is a Σ(Z)-model that satisfies PROPC and the equations of the system (considered quantified by (∀∅)), and a Boolean (or protected) model is a model having the set {true, false} as its carrier. □

We are interested in Boolean solutions because these correspond to possible behaviors of the circuit. They are bijective with Boolean models, by letting a : Z → {true, false} correspond to the model that interprets each z ∈ Z as a(z); we might say that Boolean models satisfy the so-called Law of the Excluded Middle, in that the only values allowed are true and false, with everything else excluded. It follows that two systems are Boolean equivalent iff they have exactly the same Boolean models.

Underdetermination is similar to the situation for a system of linear equations where there are more variables than (independent) equations. The next subsection will show that transistors are consistent and underdetermined; this is possible because these are conditional rather than unconditional systems. A circuit consisting of an inverter (i.e., negation) with its output connected to its input is inconsistent, because its equation

    p = not p

has no solutions; but no system containing such an equation can be triangular. It is also possible for an unwise choice of input variables to produce inconsistency. For example, a system that contains the equation

    i1 = not i2

is unsolvable if both i1 and i2 are input variables. To avoid this, one of these two variables could instead be declared internal.
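These notions are easy to explore by brute-force enumeration. The following Python sketch (the representation and names are mine) enumerates the Boolean solutions of small systems, confirming that the inverter loop p = not p has no solutions, while a one-equation triangular system extends each input assignment to exactly one solution:

```python
from itertools import product

def solutions(equations, variables):
    """Enumerate all Boolean solutions of a propositional system:
    assignments satisfying every equation (lhs, rhs), where lhs and
    rhs are functions from an assignment dict to a Boolean."""
    sols = []
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(lhs(a) == rhs(a) for lhs, rhs in equations):
            sols.append(a)
    return sols

# The (non-triangular) system { p = not p } is inconsistent: no solutions.
inverter_loop = [(lambda a: a["p"], lambda a: not a["p"])]
assert solutions(inverter_loop, ["p"]) == []

# The triangular system { y = not x } is consistent and not
# underdetermined: each value of the input x extends to exactly one
# Boolean solution.
sys_ = [(lambda a: a["y"], lambda a: not a["x"])]
assert len(solutions(sys_, ["x", "y"])) == 2
```

Enumerating all assignments takes 2^|Z| steps, which is why the symbolic methods developed below matter for larger circuits.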
Proposition 7.4.6  Every unconditional triangular propositional system is consistent, and no unconditional triangular propositional system is underdetermined.

Proof: The proof is by induction. Let a be a Boolean assignment to the input variables. Then by the first equation, t_1(a(x_1), …, a(x_n)) gives a value for y_1; let's denote it a(y_1). Similarly, t_2(a(x_1), …, a(x_n), a(y_1)) gives a value for y_2, denoted a(y_2). And so on, until we get a value

    a(y_m) = t_m(a(x_1), …, a(x_n), a(y_1), …, a(y_{m−1}))

for y_m. This assignment a on Z is a solution by construction, and since its values are computed directly from the equations, it is the only possible solution extending the original assignment. □

[Footnote: This is more than an analogy, because systems of linear equations over the field Z_2 (the two-element field of the integers modulo 2), with 0 representing false and 1 representing true, describe certain kinds of circuit. Our systems are more general since their equations may be non-linear (they are multi-linear) and/or conditional.]

[Figure 7.3: The Two Kinds of mos Transistor: n on Left, p on Right]

This section considers conditional triangular propositional systems and their solutions. A key difference between conditional and unconditional systems is that a conditional system may fail to determine the values of some of its variables under some conditions. The most basic (and most important) example of such a system is a transistor.

Example 7.4.7  The two main types of transistor are the n-transistor and the p-transistor, diagramed as shown in Figure 7.3, and having logical behavior given by conditional equations of the respective forms

    a = b if g = true
    a = b if g = false

where the variables a, b, g are Boolean, with a, g as inputs and b as output.
The system consisting of a single n-transistor is consistent and underdetermined, because if g = false, then according to its equation (the first above), b can have any value, no matter what value a has; the same holds for p-transistors with g = true. (These single-equation systems are clearly triangular.) □

More complex systems can be built by putting several transistors together, as illustrated in examples in Section 7.4.4 and thereafter. We now generalize Definition 7.4.2 to conditional equations, and then develop some theory for such systems.

Definition 7.4.8  A system T of (possibly conditional) Σ(Z)-equations is a (conditional) triangular (propositional) system iff Σ is the signature of PROPC (the propositional calculus specification of Section 7.3.2), Z is a finite set of constants of sort Bool called variables, there is a subset X of Z, say x_1, …, x_n, called the input variables, and there is an ordering of the rest of Z, say y_1, …, y_m, called the dependent variables, such that T consists of equations of the form y_k = t if C, where each t is a Σ(Z)-term involving only input variables and variables y_j with j < k, where each C is a finite set of pairs of such terms, and where there may be any number (including zero) of equations for each k. In addition, some of the dependent variables may be designated as output
A triangular system has disjoint conditions iff whenever C, C (cid:48) arethe conditions of distinct equations with the same dependent variableas leftside, then C ∧ C (cid:48) is provably false for each assignment of Booleanvalues to input variables. A triangular system is total iff every depen-dent variable has equations, and for each k and each choice of Boolean values for its input variables, the disjunction of the conditions in itsequations with leftside y k , say C k = C k, ∨ C k, ∨ · · · ∨ C k,(cid:96) , is prov-ably true, where each C k,i is considered the conjunction of its pairs asequations; otherwise the system is called partial . (cid:2) When a conditional triangular system has all C k = ∅ , it is equivalentto an unconditional system. We may assume that triangular systemshave Boolean conditions, since this is convenient and entails no loss ofgenerality. As with unconditional triangular systems, the equations areusually considered to be quantified by ( ∀∅ ) , with all variables consid-ered as constants, although they may sometimes appear universallyquantified, e.g., in formulae that describe intended circuit behavior.As in the unconditional case, conditional triangular systems are notin general rewriting systems over Σ , but are over Σ (Z) . However, unlikethe unconditional case, conditional triangular systems are not alwaysChurch-Rosser, which motivates the next result. Since the concepts inDefinition 7.4.5, including solution of a system, consistent system, andunderdetermined system, carry over completely unchanged to condi-tional triangular systems, we do not repeat this material here. Proposition 7.4.9 A conditional triangular system is consistent if its conditionsare consistent. Moreover, a conditional triangular system with disjoint conditions has consistent conditions, and also is Church-Rosser. Proof: Let a be an assignment to the input variables. 
Then if the condition of the first equation is true for that assignment, its rightside gives a value t_1(a(x_1), …, a(x_n)) for y_1; let's denote it a(y_1). Otherwise, if there is a subsequent equation with leftside y_1 the condition of which is true, let the value of its rightside be a(y_1); if there is no such equation, pick an arbitrary value for a(y_1). Similarly, we get a value a(y_2) for y_2, and so on, until we get a value a(y_m) for y_m. The resulting assignment a is by construction a solution.

The second and third assertions follow since for each assignment, each dependent variable can be rewritten in at most one way. □

The next result generalizes Proposition 7.4.3 to conditional triangular systems, and applying its equivalence of 1. and 2. to conditional rewrite rules reassures us that our join semantics for conditional term rewriting modulo equations is adequate for our hardware applications (see Section 7.7 for the technical details of this semantics).

Proposition 7.4.10  Using the notation of Proposition 7.4.3, the following are equivalent for any conditional triangular system T with variables Z that is Church-Rosser as a rewrite system, for t, t′ any Σ(Z)-terms:

1. t ↓_{A/B} t′;
2. [[t]]_A ≃_B [[t′]]_A;
3. (t == t′) ⇒*_{A/B} true;
4. (A ∪ B) ⊢_{Σ(Z)} (∀∅) t = t′;
5. PROPC ⊨_Σ (∀Z) (T ⇒ t = t′);
6. (t iff t′) ⇒*_{A/B} true.

Proof: We omit subscripts B from ≃_B, =_B, ⇒_{A/B}, ⇒_{T/B} and ⇒_{P/B}. A is terminating by Proposition 7.7.19 and Church-Rosser by hypothesis, so it is canonical. Therefore 1. and 2. are equivalent by Corollary 5.7.7, and 2. and 3. are equivalent by definition of ==. Also, 1. implies 4. and 4. implies 2., while 4. and 5. are equivalent by the Completeness Theorem and first-order logic (see the footnote to the proof of Proposition 7.4.3). Finally, 2. and 6.
are equivalent, since ([[t]]_T iff [[t′]]_T) ⇒*_P true iff [[[[t]]_T]]_P ≃ [[[[t′]]_T]]_P by Exercise 7.3.8, and [[[[t]]_T]]_P ≃ [[t]]_A because T modulo B and A modulo B are canonical. □

This result justifies proving sentences of the form (∀Z) (T ⇒ t = t′) by reducing t iff t′ to true when T has conditional equations. However, for hardware verification problems, this reduction will not in general work without using case analysis on the input variables, for reasons that are discussed in Section 7.4.3.

For underdetermined systems, it is often convenient to use parameters in solutions.

[Footnote: The requirement that a general solution include all dependent variables, not just the output variables, is reasonable, because a circuit designer should know what all his wires are supposed to do; indeed, the redundancy involved in checking the mutual consistency of solutions for all internal variables is desirable in itself.]

Definition 7.4.11  A general (Boolean) solution of a conditional triangular system T with dependent variables y_1, …, y_m is a family f_k(x_1, …, x_n, w_1, …, w_ℓ) of terms for k = 1, …, m such that for every assignment of Boolean values a_1, …, a_n to the input variables x_1, …, x_n and of Boolean values b_1, …, b_ℓ to the parameter variables w_1, …, w_ℓ, the values of f_k(a_1, …, a_n, b_1, …, b_ℓ) are a Boolean solution extending the original input variable assignment. A most general (Boolean) solution of T is a general Boolean solution such that the Boolean ground solutions of T are exactly its Boolean substitution instances with their corresponding input assignments. A set F of equations has the form of an unparameterized general solution iff its equations are y_k = f_k(x_1, …, x_n) for k = 1, …, m, and has the form of a parameterized general solution iff its equations are y_k = f_k(x_1, …
, x_n, w_1, …, w_ℓ) for k = 1, …, m. □

To check if a family of terms for dependent variables is a general solution of a system T, it is by definition sufficient to substitute the terms for the corresponding variables in each equation of T, and check if the two sides are equal for all Boolean values of the input and parameter variables. Notice that equations having the form of general solutions are in particular unconditional triangular systems.

Example 7.4.12  Consider the following conditional triangular system T,

    y1 = x1 or x2
    y2 = not y1 if not x1

which is underdetermined, since y2 can have any value when x1 is true. The following is a proposed general solution F for T,

    y1 = x1 or x2
    y2 = (not x1 and not x2) or (w and x1)

where w is a parameter variable. We can check that F is a most general solution of T by enumerating the solutions of T and then checking that they all are substitution instances of F. Using the format (x1, x2, y1, y2), and representing true by 1 and false by 0, the solutions of T are (0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 1, ⋆), and (1, 1, 1, ⋆), where ⋆ can be either 0 or 1. The reader may now verify that exactly the same set of six Boolean 4-tuples arises from F. □

Proposition 7.4.13  Every unconditional triangular system has a most general solution, obtained by progressively substituting its equations into later equations; this solution has no parameters, and is unique in the sense that the corresponding terms for each y_i are equal under PROPC. Such systems have exactly 2^n Boolean solutions, where n is the number of input variables.

Proof: The construction is like that of Proposition 7.4.9, except that we do the substitutions with the terms in the triangular system, instead of with Boolean values. First, let f_1 be t_1. Next, substitute f_1 for each instance of y_1 in t_2 and call the result f_2, noting that both f_1 and f_2 contain no non-input variables.
Then, in t_3, substitute f_1 for each instance of y_1 and f_2 for each instance of y_2, and call the result f_3, noting that it too contains no non-input variables. Continuing in this way, after the appropriate substitutions t_k becomes f_k, which by induction also contains no non-input variables. Since this construction involves only equational reasoning, the result is sound, and because each f_k involves only input variables, the result is indeed a solution with no parameters.

Moreover, this process is reversible, i.e., we can also derive the original triangular system from this solution. Therefore the two sets of equations are equivalent as theories. Because equivalent theories have exactly the same models, they also have exactly the same Boolean models, and hence exactly the same Boolean solutions. For uniqueness, Corollary 7.3.19 implies that two Σ(X)-terms are equivalent if they are equal for all Boolean values of the input variables.

Finally, the form of a general solution ensures that it produces exactly one Boolean solution for each Boolean assignment to the input variables, and since there are 2^n such distinct assignments, that is also the number of Boolean solutions. □

Throughout this subsection, Σ denotes the signature of PROPC, and all terms are over Σ(Z) for some variable symbols Z containing input variables X = {x_1, …, x_n} and parameter variables W = {w_1, …, w_ℓ} (if any). If T is a finite set of equations (perhaps conditional), we also write T for the conjunction of the equations in T, without quantification and with conditional equations represented as implications. Note that an assignment a : Z → M to a Σ(Z)-model M is a solution of T iff a(T) is true in M, where a(T) denotes the truth value of T in M under a.
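The enumeration check used in Example 7.4.12 can be mechanized directly; here is a Python sketch (the encoding is mine) that compares the Boolean solutions of that T with the substitution instances of its proposed solution F, as sets of (x1, x2, y1, y2) tuples:

```python
from itertools import product

B = [False, True]

def T_holds(x1, x2, y1, y2):
    """The conditional triangular system T of Example 7.4.12:
    y1 = x1 or x2, and y2 = not y1 whenever not x1."""
    if y1 != (x1 or x2):
        return False
    if (not x1) and y2 != (not y1):
        return False
    return True

# Ground solutions of T, as (x1, x2, y1, y2) tuples.
T_solutions = {(x1, x2, y1, y2)
               for x1, x2, y1, y2 in product(B, repeat=4)
               if T_holds(x1, x2, y1, y2)}

# Instances of the proposed general solution F, with parameter w:
#   y1 = x1 or x2,  y2 = (not x1 and not x2) or (w and x1)
F_instances = {(x1, x2,
                x1 or x2,
                (not x1 and not x2) or (w and x1))
               for x1, x2, w in product(B, repeat=3)}

assert T_solutions == F_instances   # F is a most general solution of T
assert len(T_solutions) == 6        # the six 4-tuples listed in the text
```

Set equality here expresses exactly the "most general" condition: every ground solution of T is an instance of F, and every instance of F solves T.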
Proposition 7.4.14  If T is a (conditional) triangular system, then a set F of equations having the form of an unparameterized general solution is a most general solution for T if the formula (∀Z)(F ⇔ T) can be proved assuming PROPC.

Proof: The formula says that the two sets of equations are equivalent as theories extending PROPC, which implies that they have exactly the same models, and therefore in particular, have exactly the same Boolean models, and hence exactly the same Boolean solutions. □

In examples, this formula can be proved by checking it for all possible Boolean values of the input and parameter variables, since values of the other variables are determined by these using reduction, due to the forms of T and F. There are 2^{n+ℓ} cases to check, which is manageable in comparison with 2^{n+ℓ+m}, which could be much larger for a complex circuit.

The formula (∀Z)(F ⇒ T) says that if the y_k are defined according to F, then they satisfy T. Its converse, (∀Z)(T ⇒ F), says that if some assignment of values to Z satisfies T, then it also satisfies F; this is the "most general" part of "most general solution." Together these give the equivalence of the two theories. Note that the converse can be satisfied without F actually being a solution: for example, if T is inconsistent (i.e., has no solutions), then the converse formula is valid for any F whatsoever. Nevertheless, Proposition 7.4.16 shows that under some mild assumptions, it suffices to prove just the "most general" part of the equivalence; this explains why we only proved that direction in Example 7.4.1, and why we do the same in several examples below. Recall that B = {true, false}.

Lemma 7.4.15  If T is a total triangular system with consistent conditions, then every Boolean assignment i : X → B to input variables extends to a unique solution i* : Z → B for T.
Proof: Since T is total and has consistent conditions, the conditions of at least one equation with leftside y_1 evaluate to true, and all the rightsides of such equations evaluate to the same value. Therefore the value of y_1 is uniquely determined by the values of i. Similarly, the value of y_2 is uniquely determined by the values of i and y_1, and so on by induction, so that all the dependent variables in a solution are uniquely determined. □

Proposition 7.4.16  If T is a total triangular system with consistent conditions and if F has the form of an unparameterized general solution, then (∀Z)(T ⇒ F) implies (∀Z)(T ⇔ F), and hence implies that F is a most general solution.

Proof: Given a Boolean assignment i : X → B for input variables, there are unique extensions i*_T and i*_F that are solutions to T and F, by Lemma 7.4.15, noting that F is also total with consistent conditions. The formula in the hypothesis implies that i*_T = i*_F for any i : X → B, so we can write just i*. Uniqueness implies that a : Z → B is a solution to T iff a = i* when the restriction of a to X is i; the same holds for F. Hence, a : Z → B is a solution to T iff it is a solution to F, from which (∀Z)(F ⇔ T) follows by Corollary 7.3.19. □

In practice, it suffices to prove (∀X)(T ⇒ F), regarding the input variables as quantified and the dependent variables as constants, whose values are determined by X; this is considerably easier by case analysis, since there are considerably fewer cases than required by Z.

[Footnote: Using the inference rule ((∀Z)P) ∧ ((∀Z)Q) = (∀Z)(P ∧ Q), which is discussed in Chapter 8.]

The situation is more complex for parameterized solutions, where for Z′ = Z ∪ W, with W the parameter variables, the relevant formula is (∀Z)(((∃W)F) ⇔ T), for which it often suffices to prove just
(∀Z)(T ⇒ (∃W)F), noting that (∀Z)(((∃W)F) ⇒ T) is equivalent to (∀Z′)(F ⇒ T). We omit details, which are similar to those for Proposition 7.4.16.

The method of reduction of Proposition 7.4.3, and its extension to conditional equations in Proposition 7.4.10, provide a way to verify universally quantified implications of the kind considered here. (But note that rewriting with conditional rules modulo equations is not treated until Section 7.7.) Although often not applicable, the method of reduction is efficient when it is applicable, and Boolean case analysis is available when it is not. Also note that while proving a conditional equation by checking that its two sides reduce to the same thing, we should assume that its condition is true when doing the reduction, as is of course familiar mathematical practice (Chapter 8 gives a formal treatment). All this together gives a powerful and flexible tool set for verifying hardware circuits.

This subsection gives some examples of conditional triangular systems and their solutions, mainly so-called "cmos" circuits, which use n- and p-transistors in balanced pairs.

[Figure 7.4: A cmos not Gate]

Example 7.4.17 (not Gate)  We prove that the cmos circuit shown in Figure 7.4 implements a not gate (also called a negation, or an inverter gate), i.e., we prove that o = not i is a most general solution for T, which contains the two conditional equations of the module NOT below, describing the behavior of the two transistors:

    th NOT is extending PROPC .
      ops i o : -> Bool .
      cq o = true if not i .
      cq o = false if i .
    endth

We show that negation is a most general solution by case analysis on the variable i (the validity of case analysis was shown in Section 7.4.3):

    open NOT .
      eq i = true .
      red o iff not i .
    close
    open NOT .
      eq i = false .
      red o iff not i .
    close

Since both reductions give true and the circuit is easily seen to be total and consistent, the proof is done by Proposition 7.4.16. Although it is now unnecessary, we can also show directly that negation is a solution, by assuming it as an equation and then checking that the two conditional equations of the circuit hold:

    th BEH is ex PROPC .
      ops i o : -> Bool .
      eq o = not i .
    endth
    open .
      eq not i = true .
      red o == true .
    close
    open .
      eq i = true .
      red o == false .
    close

Example 7.4.18 (xor Gate)  We show that the six-transistor circuit of Figure 7.5 realizes the exclusive or (xor) function, i.e., that

    (∀Z) (T ⇔ (o = i1 xor i2)),

where T consists of the equations in the module XOR below, describing the behavior of the circuit, and where Z contains the four variables i1, i2, p1, o. The variables i1 and i2 are inputs, while p1 is internal and o is the output. To model this circuit, we write one conditional equation for each transistor:

    th XOR is extending PROPC .
      ops i1 i2 p1 o : -> Bool .
      cq p1 = false if i1 .
      cq p1 = true if not i1 .
      cq o = i1 if not i2 .
      cq o = p1 if i2 .
      cq o = i2 if not i1 .
      cq o = i2 if p1 .
    endth

Example 7.4.17 shows that the internal variable p1 is the negation of i1, but we nevertheless prove this again in the present context. Because this circuit is total, Proposition 7.4.16 implies that it suffices to show that xor is most general:

[Figure 7.5: A cmos xor Gate]

    open XOR .
      eq i1 = true . eq i2 = true .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = true . eq i2 = false .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = false . eq i2 = true .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = false . eq i2 = false .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close

Since all reductions give true, the proof is done.
□

We can summarize our method for verifying a proposed unparameterized most general solution for a total consistent conditional system as follows: use case analysis on the input variables and then reduction to prove that the circuit equations imply the solution equations. This does not require human intervention: the OBJ proof score can be generated automatically from a circuit diagram and a proposed solution. Moreover, it is a kind of decision procedure under the conditions of Proposition 7.4.16, because an unparameterized system fails to be a most general solution for a circuit iff reduction fails to give true for some case of some equation (but the system could still be a solution, even though not most general). In some cases, this method can be more efficient than more traditional approaches, and it works for combinatorial circuits built from any (combinatorial) components, including, e.g., both transistors and logic gates.

[Figure 7.6: A cmos nor Gate]

Exercise 7.4.3  Use OBJ to prove correctness of the nor circuit shown in Figure 7.6 (where x nor y = not (x or y)). □

The method for checking solutions described above extends to partial solutions, where the desired behavior of a circuit is not determined for some input conditions. This can arise in specifications where it is known that certain inputs will never occur, so it doesn't matter what the circuit does in these cases. Partial specifications are preferable to choosing arbitrary values for these inputs, since this allows engineers more freedom to produce better designs.

Definition 7.4.19  A partial solution is a family of conditional equations, each of the form

    y_k = f_k(x_1, …, x_n) if c_k(x_1, …, x_n),

where y_k is a non-input variable, where x_1, …, x_n are the input variables, and where f_k and c_k are propositional terms. The cases where the predicates c_k(x_1, …
., xn) are not true correspond to what are often called don't care conditions. □

Although we do not give all the details here, when verifying a partial solution, it is necessary to determine when the condition of a conditional equation is true on Boolean models: first, substitute the expressions of the proposed solution into the condition; then compute the complete disjunctive normal form (see Proposition 7.3.24) of the result; next, consider each disjunct as a "case," in which the input variables that occur positively (i.e., those that are not negated) must be true, while those that occur negatively must be false; and finally, check that for each case the two sides evaluate to the same reduced form, using the values for the variables that belong to that case. For example, the condition of the sixth equation in Example 7.4.18 is p1; substituting the equation for p1 yields not i1, which is already in disjunctive normal form, with only one disjunct and hence only one case, in which i1 = false; the last open above sets up this assumption before evaluating the equation. More complex conditions can easily arise, and can be handled the same way. It is also valid to use exclusive normal form in the same manner.

The problem of verifying a proposed solution is inherently simpler than the problem of finding a solution, because we can look at the latter as having the form

  (∀X)(∃N) PROPC ⊨Σ T,

where T is a propositional system and X, N are its input and non-input variables, respectively. Although standard solution methods like Gaussian elimination require linearity for their completeness, essentially the same method of successive substitution and simplification can often be used to find solutions for general propositional systems; sometimes even most general solutions can be found this way. Note that the above formula only concerns Boolean solutions; a formula defining general solutions would require second-order variables.
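The don't-care discipline just described can also be prototyped by brute-force enumeration of Boolean models instead of computing disjunctive normal forms. The Python encoding below, including the helper name check_partial_equation, is ours and only illustrates the idea on the xor circuit of Example 7.4.18:

```python
from itertools import product

# Proposed (total) solution for the xor circuit of Example 7.4.18;
# this dictionary encoding is an illustration, not the book's notation.
solution = {
    "p1": lambda i1, i2: not i1,
    "o":  lambda i1, i2: i1 != i2,
}

def check_partial_equation(lhs, rhs, cond):
    """Return the input cases where 'lhs = rhs if cond' fails under the
    proposed solution; cases where the condition is false are don't-cares
    and are skipped, mirroring the case analysis described in the text."""
    failures = []
    for i1, i2 in product([False, True], repeat=2):
        env = {name: f(i1, i2) for name, f in solution.items()}
        env.update(i1=i1, i2=i2)
        if cond(env) and lhs(env) != rhs(env):
            failures.append((i1, i2))
    return failures

# Sixth circuit equation, o = i2 if p1: substituting the solution into
# the condition gives not i1, so only cases with i1 = false are checked.
assert check_partial_equation(lambda e: e["o"], lambda e: e["i2"],
                              lambda e: e["p1"]) == []
```

Enumerating models where the condition holds gives exactly the cases that the complete disjunctive normal form would produce, one assignment per disjunct.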
Exercise 7.4.4 Design and verify a circuit that implements the conditional equation a = b if c = d. (Note that this is a partial specification.) □

Recall that bidirectional circuits may involve feedback loops, in the sense that they are not triangular. It may not be obvious that a function-based formalism can deal with bidirectional logic circuits, due to the input/output character of functions. But equational logic is based on the relation of equality, which is symmetric, and conditional equations provide additional expressive power. Section 7.4.4 showed that this framework is sufficient for mos transistor circuits. We now generalize beyond the simple input-output flow-oriented structure of triangular systems, to circuits described by arbitrary conditional propositional systems, but with a designated subset of "external" variables, instead of designated input and output variables as in Definition 7.4.8. The transistor and the cell (Example 7.4.21 below) are examples.

Definition 7.4.20 A propositional system (or propositional theory) over a variable set Z is a finite set of (possibly conditional) Σ(Z)-equations, where Σ is the signature of PROPC, with a designated subset of Z called the external variables, some of which may be input variables, and with the remainder called the internal variables. □

The notions of Boolean solution, consistency, underdetermination, Boolean equivalence, and Boolean model for propositional systems are the same as in Definition 7.4.5, and the bijection between Boolean solutions and Boolean models also carries over, so that two propositional systems are Boolean equivalent iff they have the same Boolean models.
Moreover, the notions of general solution and most general solution in Definition 7.4.11 also carry over.

The task of proving that some equations E are satisfied if the equations T describing a circuit are satisfied has the form (∀Z)(T ⇒ E), and can therefore be proved by showing PROPC ⊨Σ(Z) (T ⇒ E), which in turn can be proved by showing PROPC ∪ T ⊨Σ(Z) E, in which the variables in the equations of PROPC remain variables, while those in T and E become constants. The method of Proposition 7.4.16 is not available, because neither set of equations is in triangular form. Therefore to show that a proposed solution is most general, it is necessary both to prove that it is a solution by proving the formula (∀Z)(T ⇒ E) as above, and to prove the converse formula, (∀Z)(E ⇒ T); of course, we hope these can be done by reduction based on the forms PROPC ∪ T ⊨Σ(Z) E and PROPC ∪ E ⊨Σ(Z) T, but in general some case analysis is required.

Example 7.4.21 (Cell) Figure 7.7 is a cmos circuit for a simple 1-bit memory cell. This circuit is underdetermined, and in fact is bistable, i.e., it has exactly two distinct Boolean solutions, which are its possible stably persistent memory states. The system describing this circuit is

  p1 = true if not p2
  p1 = false if p2
  p2 = true if not p1
  p2 = false if p1

where p1 and p2 are external variables (so there are no internal variables). This system is equivalent to

  p1 = not p2
  p2 = not p1

as well as to p1 = not p2. This circuit has no designated inputs or outputs because the wires p1 and p2 are used bidirectionally, for both reading and writing.

[Figure 7.7: A 1-bit cmos Cell]
Because solutions are Boolean expressions in the internal variables, of which there are none, and because there are only two Boolean expressions that depend on no variables, namely true and false, it is clear that the two systems

  p1 = true          p1 = false
  p2 = false   and   p2 = true

are the only possible unparameterized solutions for this circuit, and it is also clear that they do indeed satisfy the equations.

We can introduce a parameter q for a most general parameterized solution

  p1 = not q
  p2 = q

and observe that it is indeed a most general solution because it has exactly the above two Boolean solutions as its Boolean instances.

We can also give a mechanical proof that the above parameterized system is indeed a solution. The OBJ3 proof score for this is a bit more subtle than previous examples because the cases when the condition of a conditional equation is true must be expressed in terms of the parameter variable q.

  th BEH is extending PROPC .
    ops p1 p2 q : -> Bool .
    eq p1 = not q .
    eq p2 = q .
  endth
  *** p1 = true if not p2 .
  open BEH . eq q = false .
  reduce p1 iff true .
  close
  *** p1 = false if p2 .
  open BEH . eq q = true .
  reduce p1 iff false .
  close
  *** p2 = true if not p1 .
  open BEH . eq q = true .
  reduce p2 iff true .
  close
  *** p2 = false if p1 .
  open BEH . eq q = false .
  reduce p2 iff false .
  close

Since all these reductions give true, we are done. □

Exercise 7.4.5 Design and verify a cmos circuit that can store and read two bits. □

7.5 Proving Termination Modulo Equations

This section considers ways to prove termination modulo equations, generalizing results from Chapter 5 on rewriting with unconditional rules, and illustrating their use on examples from earlier in this book. We first note that any TRS result proved from an ARS result will of course generalize to MTRS's, although this doesn't get us very far with termination proofs. We begin by showing that a terminating TRS need not be terminating modulo B.
Example 7.5.1 Let A contain the rules a + b → c and c → b + a, where a, b, c are constants, + is binary, and there is just one sort. It is not hard to see that A is terminating. Now let B contain the commutative law for +. Then a + b ⇒A/B c ⇒A/B a + b, because b + a ≃B a + b. Therefore rewriting with A modulo B is nonterminating. □

Nevertheless, there is a simple condition that sometimes allows us to infer termination of A modulo B from termination of A; we will see later that the same condition also works for the Church-Rosser and local Church-Rosser properties.

Definition 7.5.2 A Σ-TRS A and set B of Σ-equations are said to commute iff for any Σ-term t, whenever t ≃B t′ and t′ ⇒A t1, there is a Σ-term t′1 such that t ⇒A t′1 and t′1 ≃B t1. □

Lemma 7.5.3 Given a Σ-TRS A commuting with Σ-equations B, if t0 ⇒A/B t1 ··· ⇒A/B tn then there exist t′1, ..., t′n such that t0 ⇒A t′1 ··· ⇒A t′n and t′n ≃B tn. The same result also holds for weak rewriting modulo B.

Proof: The first step, t0 ⇒A t′1 (with t′1 ≃B t1), is direct from commutativity; next, because t1 ⇒A/B t2 and t′1 ≃B t1, we get t′1 ⇒A t′2 with t′2 ≃B t2; and we can continue in this same way until we get t′n-1 ⇒A t′n with t′n ≃B tn. The result for weak rewriting modulo B now follows because ⇒A,B is a subrelation of ⇒A/B. □

Proposition 7.5.4 Given a Σ-TRS A commuting with Σ-equations B, if A is terminating, then A is also terminating modulo B; this also holds for weak rewriting modulo B.

Proof: We prove the contrapositive. Suppose there is an infinite sequence t0 ⇒A/B t1 ··· ⇒A/B tn ⇒A/B ···. Then we can construct an infinite sequence t0 ⇒A t′1 ··· ⇒A t′n ⇒A ··· by repeatedly using Lemma 7.5.3.
This also holds for weak rewriting because ⇒A,B is a subrelation of ⇒A/B. □

We now generalize some results from Chapter 5 to rewriting modulo B. For the results that arise from ARS's, we can simply re-apply the ARS result, without doing any new work. For example, instead of generalizing the proof of Proposition 5.5.1 on page 111, we can simply apply its ARS version, which is Proposition 5.8.16 on page 134, to obtain the following, in which we also generalize from ω to an arbitrary Noetherian poset P:

Proposition 7.5.5 An MTRS M = (Σ, A, B) is ground terminating if there is a function ρ : TΣ,B → P, where P is a Noetherian poset, such that for all ground (Σ, B)-terms t, t′, if t ⇒A/B t′ then ρ(t) > ρ(t′). Moreover, the converse holds provided M is globally finite. □

Note that if we want to define ρ using the initiality of TΣ,B, then we have to check that the Σ-structure given to P actually satisfies B.

Of course, we also want to generalize Proposition 5.5.4, which gives a termination criterion that is easier to apply in practice, by taking account of the structure of terms. Here we cannot rely on an ARS result, but fortunately the proof generalizes from terms to B-classes of terms; again we generalize to an arbitrary Noetherian poset P:

Proposition 7.5.6 Given an MTRS M = (Σ, A, B) and ρ : TΣ,B → P with P Noetherian, if

(1) ρ(θ(t)) > ρ(θ(t′)) for each t → t′ in A and applicable substitution θ : X → TΣ,B, and

(2) ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for each t, t′ ∈ (TΣ,B)s and any t0 ∈ TΣ,B({z}s) having a single occurrence of z,

then M is ground terminating. □

Note that expressions of the form ρ(t) really mean ρ([t]) above. The proof is the same as that of Proposition 5.5.4, with ⇒A/B substituted for ⇒A.
We next generalize Proposition 5.5.6 in the same way, noting that all the concepts in Definition 5.5.3 generalize from Σ-terms to B-classes of Σ-terms by substituting ρ([__]) for ρ(__), and that the proof of Proposition 5.5.5 in Appendix B also generalizes, using induction on the Σ-structure of Σ-terms that represent (Σ, B)-terms:

Proposition 7.5.7 Given an MTRS M = (Σ, A, B) and ρ : TΣ,B → P with P Noetherian, if

(1) each rule in A is strict ρ-monotone, and

(2′) each σ ∈ Σ is strict ρ-monotone,

then M is ground terminating. □

Here the rules must be seen as class rewriting rules, and monotonicity must be interpreted on classes. As before, it is often easiest to define the function ρ by giving its target a (Σ, B)-algebra structure and then letting initiality define ρ, as in the following:

Example 7.5.8 We can use Proposition 7.5.7 with P = ω to simplify the ground termination proof of Example 5.5.7 by building in commutativity, i.e., putting it into B. The resulting simplified specification is as follows:

  obj ANDCOMM is sort Bool .
    ops tt ff : -> Bool .
    op _&_ : Bool Bool -> Bool [comm] .
    var X : Bool .
    eq X & tt = X .
    eq X & ff = ff .
  endo

Letting Σ denote the signature of ANDCOMM, we give ω the structure of a Σ-algebra by defining

  ωtt = ωff = 1 and ω&(m, n) = m + n.

Note that ω with this structure is a (Σ, B)-algebra (because addition is commutative), and let ρ denote the resulting unique Σ-homomorphism TΣ,B → ω. Then by Proposition 7.5.7, we only need to prove

  ρ(x & tt) > ρ(x)
  ρ(x & ff) > ρ(ff)

for all x ∈ TΣ, plus (from condition (2′)) that ρ(t) > ρ(t′) implies

  ρ(t2 & t) > ρ(t2 & t′)

for all t, t′, t2 ∈ TΣ. (Note again that there is an implicit [_] inside each instance of ρ above.) As in Example 5.5.7, the proofs are all trivial, but there are only half as many of them.
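The inequalities for ANDCOMM can be checked mechanically. The Python sketch below is our own illustration, with nested tuples standing for ground terms and assuming the interpretation ωtt = ωff = 1 and ω&(m, n) = m + n stated above:

```python
import random

# rho interprets tt and ff as 1, and x & y as rho(x) + rho(y); since
# addition is commutative, rho is well defined on B-equivalence classes.
def rho(t):
    if t in ("tt", "ff"):
        return 1
    _, x, y = t          # t has the form ("&", x, y)
    return rho(x) + rho(y)

def random_term(depth):
    """Generate a random ground ANDCOMM term of bounded depth."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(["tt", "ff"])
    return ("&", random_term(depth - 1), random_term(depth - 1))

random.seed(0)
for _ in range(200):
    x = random_term(4)
    assert rho(("&", x, "tt")) > rho(x)        # rule x & tt -> x
    assert rho(("&", x, "ff")) > rho("ff")     # rule x & ff -> ff
    t, t2 = random_term(4), random_term(4)
    if rho(t) > rho(t2):                       # condition (2'): & is
        assert rho(("&", x, t)) > rho(("&", x, t2))  # strict rho-monotone
```

Random sampling is of course no substitute for the (trivial) proofs, but it is a quick sanity check on a proposed interpretation before doing them.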
□

Example 7.5.9 We can use Proposition 7.5.7 with P = ω to prove first ground termination and then (non-ground) termination of the basic propositional calculus specification that we have been using as a decision procedure:

  obj BPROPC is sort Bool .
    ops true false : -> Bool .
    op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
    op _xor_ : Bool Bool -> Bool [assoc comm prec 3] .
    vars P Q R : Bool .
    eq P and false = false .
    eq P and true = P .
    eq P and P = P .
    eq P xor false = P .
    eq P xor P = false .
    eq P and (Q xor R) = (P and Q) xor (P and R) .
  endo

We let B contain the associative and commutative laws for and and xor, and we define a B-algebra structure on ω as follows, where p, q range over ω:

  ωtrue = ωfalse = 2
  ωand(p, q) = pq
  ωxor(p, q) = p + q + 1.

Now observe that ω with this structure satisfies B, because product and addition are both associative and commutative, and that the resulting unique (Σ, B)-homomorphism ρ satisfies

  ρ(true) = ρ(false) = 2
  ρ(P and Q) = ρ(P)ρ(Q)
  ρ(P xor Q) = ρ(P) + ρ(Q) + 1,

and also (by induction) ρ(P) > 1 for all P. It is now easy to check that all rules and both operations are strict ρ-monotone, the least trivial being the distributive law. It then follows that BPROPC is ground terminating. If constants, such as p, q, r, are added to the signature, then by defining ρ on them to be 2, the above results still hold, and termination again follows; but since variables are just such constants, we get termination, not just ground termination. □

Exercise 7.5.1 Use OBJ3 to check all equalities and inequalities needed in Example 7.5.9. □

Example 7.5.10 We can prove termination of the MTRS DNF of Example 7.3.11 in much the same way. We first consider just the operations and, or and not, with the first fourteen equations, and we let B contain the associative and commutative laws for and and or.
Then as in Example 7.5.9 above, we can give a B-algebra structure for ω:

  ωtrue = ωfalse = 3
  ωand(p, q) = pq
  ωor(p, q) = p + q + 1
  ωnot(p) = p + 1.

Because ω with this structure satisfies B (since product and addition are both associative and commutative), the unique (Σ, B)-homomorphism ρ satisfies

  ρ(true) = ρ(false) = 3
  ρ(P and Q) = ρ(P)ρ(Q)
  ρ(P or Q) = ρ(P) + ρ(Q) + 1
  ρ(not P) = ρ(P) + 1,

and also (by induction) ρ(P) > 2 for all P. It is easy to check that all fourteen rewrite rules and all three operations are strict ρ-monotone, the least trivial again being the distributive law. It follows that DNF is ground terminating, and because after adding constants, such as p, q, r, to the signature and defining ρ on them to be 3, all of the above results still hold, we get termination.

To extend this to the full MTRS of Example 7.3.11, we define each additional operation on ω to be one more than ρ of the rightside of its defining equation. For example,

  ωxor(p, q) = p(q + 1) + q(p + 1) + 2
  ωimplies(p, q) = (p + 1) + q + 2.

It is easy to see that these operations and their rules are strict monotone, so the full MTRS is ground terminating by Proposition 7.5.7, and then termination follows using the same trick. □

The above gives a general method for proving ground termination of an extension of an MTRS by derived operations when the MTRS without them has already been proved terminating using Proposition 7.5.7.

Exercise 7.5.2 Prove termination of the entire PROPC MTRS using the above method for handling derived operations. □

7.6 Proving Church-Rosser Modulo Equations

This section generalizes results from Chapter 5 for proving the Church-Rosser property to unconditional rewriting modulo equations. We defer some proofs and even statements of results to Section 7.7.3, which treats the more general case of conditional rewriting modulo equations.
We first show that a Church-Rosser TRS is not necessarily Church-Rosser modulo equations:

Example 7.6.1 Let Σ be one-sorted with constants a, b, 0, let A have the rules (a + 0) + b → 0 and a + b → a, and let B have the equations 0 + X = X and X + (Y + Z) = (X + Y) + Z. Then (a + 0) + b ⇒A/B 0, and (a + 0) + b ≃B a + b ⇒A/B a, where 0 and a are both reduced. Therefore A modulo B is not Church-Rosser, although A without B is Church-Rosser. □

The same simple commutativity condition that let us infer termination modulo B from termination without B also works for the Church-Rosser property; the result below is a special case of Proposition 7.7.23, which is proved in Section 7.7.3.

Proposition 7.6.2 If A is a Church-Rosser (or locally Church-Rosser) Σ-TRS commuting with Σ-equations B, then A is Church-Rosser (or locally Church-Rosser) modulo B. □

The following is a straightforward application of the above plus Proposition 7.5.4:

Proposition 7.6.3 Any triangular propositional system T is canonical modulo the associative and commutative laws for and and xor.

Proof: Associative and commutative laws commute with any rule having a constant as leftside, and Example 5.8.27 showed that any such T is terminating and Church-Rosser (without B). □

Exercise 7.6.1 Give a proof or counterexample for the assertion that if A is Church-Rosser and commutes with B then weak rewriting modulo B is also Church-Rosser. □

Results in Chapter 5 for proving the Church-Rosser property that follow from ARS results immediately extend to MTRS's. One such is the Hindley-Rosen Lemma, which is stated in even greater generality in Section 7.7.3. Here we state another modulo B result, which follows directly from Proposition 5.7.4:

Proposition 7.6.4 (Newman Lemma) A terminating MTRS is Church-Rosser iff it is locally Church-Rosser.
□

Since Example 7.5.9 shows termination of PROPC, it would be very desirable to use Proposition 7.6.4 to prove Hsiang's Theorem (Theorem 7.3.13) by proving the local Church-Rosser property for PROPC. This provides strong motivation for generalizing the Critical Pair Theorem (Theorem 5.6.9) to MTRS's. Unfortunately, this is far from straightforward; some aspects of the problem are treated in Chapter 12. Here we content ourselves with some simple, specialized results, and with showing that the Critical Pair and Orthogonality Theorems do not generalize straightforwardly to the modulo B case. First we generalize the relevant definitions:

Definition 7.6.5 Two rules of an MTRS with leftsides ℓ, ℓ′ overlap iff there exist a subterm ℓ0 of ℓ that is not just a variable, and substitutions θ, θ′ such that θ(ℓ0) ≃B θ′(ℓ′). If the two rules are the same, it is required in addition that the corresponding substitution instances of the rightsides are not equivalent modulo B, and in this case the rule is called self-overlapping. An MTRS is overlapping iff it has two rules (possibly the same) that overlap, and then (the B-class of) θ(ℓ0) is called an overlap of the rules; otherwise the MTRS is called non-overlapping. A most general overlap p of ℓ, ℓ′ at ℓ0 is an overlap of ℓ, ℓ′ at ℓ0 such that any other is equal (modulo B) to a substitution instance of p, and a complete overlap set for ℓ, ℓ′ at ℓ0 is a set of overlaps of ℓ, ℓ′ at ℓ0 such that any other is equal (modulo B) to a substitution instance of some overlap in the set. □

Note that the subredex need not be proper in the self-overlapping case, as was required in Definition 5.6.2 for ordinary term rewriting.
However, it is still true that if the leftsides ℓ, ℓ′ of two rules in A overlap at θ(ℓ0), then that overlap can be rewritten in two different ways (one for each rule). The following, due to Dr. Monica Marcus, refutes the straightforward modulo B generalization of the TRS orthogonality theorem that replaces all concepts by their modulo B counterparts:

Example 7.6.6 Let Σ be one-sorted with a binary operation + and constants a, b, let A have the rules a + b → b and b + a → a, and let B have the associative law. Then A is non-overlapping modulo B, but a + b + a ⇒A/B a + a, and a + b + a ⇒A/B b + a ⇒A/B a, which are both reduced modulo B. Therefore A modulo B is not Church-Rosser. (And it is not hard to see that this MTRS terminates.) □

The following is proved in Chapter 12, recalling that the associative, commutative, and identity laws are abbreviated A, C, I, respectively:

Proposition 12.0.2 Given an MTRS (Σ, A, B), if B consists of any combination of A, C, I laws for operations in Σ, except A, I and AI, and if the leftsides ℓ, ℓ′ of two rules in A overlap at a subterm ℓ0 of ℓ, then there is a finite complete overlap set for ℓ, ℓ′ at ℓ0. □

Note that any finite complete overlap set contains a minimal such set, in the sense that no proper subset of it is a complete overlap set; however, there may be more than one such subset.

Definition 7.6.7 An MTRS (Σ, A, B) is said to have complete overlaps iff whenever leftsides ℓ, ℓ′ of rules in A overlap at a subterm ℓ0 of ℓ, they have a finite complete overlap set. Each such overlap is called a superposition of ℓ, ℓ′, and the pair of rightsides resulting from applying the two rules to the overlap θ(ℓ0) is called a critical pair.
If the two terms of a critical pair can be rewritten modulo B to a common term using A, then that critical pair is said to converge or to be convergent. □

The following illustrates the definitions above:

Example 7.6.8 The first two rules of PROPC (see Example 7.5.9), with leftsides t = P and false and t′ = P and true, have the overlap true and false modulo B, with θ(P) = true and θ′(P) = false. Then the term θ(t) ≃B θ′(t′), namely true and false, rewrites to false in two different ways. (Only the commutative law for and is actually used here.) □

For the Critical Pair Theorem (Theorem 5.6.9) to generalize to the modulo B case would mean that an MTRS with complete overlaps is locally Church-Rosser if all its critical pairs are convergent. This does not hold, and in fact Example 7.6.6 is a counterexample. Chapter 12 discusses some algorithms that generalize unification for computing an analog of critical pairs for MTRS's over certain sets of equations, and thus deciding the local Church-Rosser property.

The following weak modulo B orthogonality result, which follows from the ordinary version, is the best we can do here:

Proposition 7.6.9 (Weak Orthogonality Modulo B) Given an MTRS M = (Σ, A, B), let R = A ∪ B ∪ B⌣ (where B⌣ denotes the converse of B). If R is lapse free and orthogonal, and if B is balanced, then M is Church-Rosser.

Proof: (Σ, R) is Church-Rosser by Theorem 5.6.4, and since B is balanced and lapse free, any rewrite sequence t *⇒A/B t′ expands to a rewrite sequence t *⇒R t′, which implies that M is also Church-Rosser.
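The divergence in Example 7.6.6 can be confirmed by machine. In the Python sketch below (our own encoding, not the book's), associativity is built into the representation by flattening +-terms into sequences of constants, so rewriting modulo B becomes contiguous subsequence replacement:

```python
def rewrites(term):
    """One-step A/B-rewrites of a +-term, for A = { a+b -> b, b+a -> a };
    associativity (the set B) is built in by flattening terms to tuples."""
    rules = [(("a", "b"), ("b",)), (("b", "a"), ("a",))]
    out = set()
    for i in range(len(term) - 1):
        for lhs, rhs in rules:
            if term[i:i + 2] == lhs:
                out.add(term[:i] + rhs + term[i + 2:])
    return out

def normal_forms(term):
    """All A/B-normal forms reachable from term; terminates because
    every rewrite step strictly shortens the term."""
    succ = rewrites(term)
    if not succ:
        return {term}
    return set().union(*(normal_forms(s) for s in succ))

# a + b + a has two distinct normal forms modulo associativity, so the
# system of Example 7.6.6 is terminating but not Church-Rosser modulo B.
assert normal_forms(("a", "b", "a")) == {("a",), ("a", "a")}
```

The two normal forms a and a + a are exactly the endpoints of the two rewrite sequences displayed in the example.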
□

Unfortunately, this is not very useful: the associative law is disqualified because it is self-overlapping; although the commutative law satisfies the assumptions for B, it is unlikely to be non-overlapping with interesting rule sets A; moreover, B cannot contain identity or idempotent laws, since these (or else their converses) are not lapse free rewrite rules.

Exercise 7.6.2 Use Propositions 7.6.9 and 7.6.2 for an alternative proof of Proposition 7.6.3. □

The results of Section 5.3 on adding new constants generalize straightforwardly to the modulo B setting, and as before, are important for theorem proving; however, we do not explicitly state them here, but refer the reader to Section 7.7.1, which gives more general results for conditional rules modulo B.

7.7 Conditional Term Rewriting Modulo Equations

We first develop an ARS version of join conditional term rewriting, which will let us define conditional rewriting modulo equations more easily than by developing it directly; we will also use it again for order-sorted term rewriting in Chapter 10. An unconditional ARS (Definition 5.7.1) consists of a set of "rules," which are really just pairs of elements of the same sort from an indexed set T, by convention written t → t′. We generalize this as follows:

Definition 7.7.1 A join conditional ARS, abbreviated JCARS or just CARS, is a pair (T, W), where T is S-sorted and W is a set of conditional rules on T, which are (n + 1)-tuples, for n ≥ 0, of pairs of elements of T of the same sort, by convention written in one of the forms

  t → t′ if t1 = t′1, ..., tn = t′n
  (⋀i=1..n ti = t′i) ⇒ t → t′

where the first pair, or head, of the tuple is t → t′. Note that unconditional rules are the special case where n = 0.
Now given a CARS (T, W), define an ordinary ARS on T by

  R0 = {⟨t, t⟩ | t ∈ T}
  Rk+1 = {⟨t, t′⟩ | (⋀i=1..n ti = t′i) ⇒ t → t′ in W and ti ↓ t′i by Rk for i = 1, ..., n} ∪ R*k

for each k ≥ 0, where R*k denotes the transitive, reflexive closure of Rk, and then let R = ⋃k≥0 Rk.

We often write W⋄ for the relation R in the following. Now define an ordinary ARS on T by t →W t′ iff there is a rule (⋀i=1..n ti = t′i) ⇒ t → t′ in W such that ti ↓ t′i using W⋄ for i = 1, ..., n. We call this the ARS defined by W, and we may write (T, →W) or even just (T, →) for it. We say that a relation R on T is join closed under W iff whenever t → t′ if t1 = t′1, ..., tn = t′n is in W and ti ↓ t′i by R for i = 1, ..., n, then ⟨t, t′⟩ is in R. (When n = 0, this just means ⟨t, t′⟩ ∈ R.) □

Proposition 7.7.2 Given a CARS (T, W), then W⋄ (as above) is the least transitive, reflexive relation on T that is join closed under W. Moreover, the relation *→W is equal to W⋄.

Proof: We write R for W⋄. Reflexivity of R follows from the inclusion of R0. To show transitivity, suppose ⟨t1, t2⟩, ⟨t2, t3⟩ ∈ R. Then there is some k such that ⟨t1, t2⟩, ⟨t2, t3⟩ ∈ Rk, so that ⟨t1, t3⟩ is in R*k and hence in R. To show join closure under W, suppose t → t′ if t1 = t′1, ..., tn = t′n is in W and that ti ↓ t′i by R for i = 1, ..., n. Then there is some k such that ti ↓ t′i by Rk for i = 1, ..., n, so that ⟨t, t′⟩ is in Rk+1 and hence is in R. To show minimality, suppose R′ is join closed under W.
Then R0 ⊆ R′, and also Rk ⊆ R′ implies Rk+1 ⊆ R′. Therefore R ⊆ R′.

For the second assertion, we first show →W ⊆ W⋄, which implies *→W ⊆ W⋄ since W⋄ is transitive and reflexive. So we suppose t →W t′. Then there exists k such that ti ↓ t′i by Rk, which implies that ⟨t, t′⟩ is in Rk+1 and hence in W⋄. To prove the converse, we show Rk ⊆ *→W for all k. R0 ⊆ *→W since *→ is reflexive by definition. Next suppose ⟨t, t′⟩ is in Rk+1 but not in R*k. Then there is a rule t → t′ if t1 = t′1, ..., tn = t′n in W with ti ↓ t′i by Rk for i = 1, ..., n. Therefore t →W t′, since ti ↓ t′i also by W⋄, since Rk ⊆ W⋄. □

This result can be used to prove properties of *→W. We now apply the CARS machinery to conditional term rewriting modulo equations:

Definition 7.7.3 A conditional modulo term rewriting system, abbreviated CMTRS, is (Σ, A, B), where A is a set of (possibly) conditional Σ-rewrite rules, and B is a set of unconditional Σ-equations. From a given CMTRS (Σ, A, B), we define two different CARS's, where t → t′ if t1 = t′1, ..., tn = t′n has sort s and is in A, Y = var(t), θ : Y → TΣ, and u ∈ TΣ({z}s):

1. For class rewriting, let W be the set of rules of the form

  c → c′ if c1 = c′1, ..., cn = c′n

where c = [u(z ← θ(t))], c′ = [u(z ← θ(t′))], ci = [θ(ti)] and c′i = [θ(t′i)] for i = 1, ..., n.

2. For term rewriting, let W be the set of rules of the form

  v → v′ if v1 = v′1, ..., vn = v′n

where v ≃B u(z ← θ(t)), v′ ≃B u(z ← θ(t′)), vi = θ(ti) and v′i = θ(t′i) for i = 1, ..., n.
Definition 7.7.1 now yields an ARS for each of these CARS's. We write ⇒[A/B] for the first and ⇒A/B for the second; these are conditional class rewriting modulo equations and conditional term rewriting modulo equations, respectively. The pair (u, θ) is called a match. As before, rewriting extends to terms with variables by extending Σ to Σ(X), in which case we write c ⇒[A/B],X c′ and t ⇒A/B,X t′, defined on TΣ(X),B and TΣ(X) respectively. □

All ARS results, e.g., the Newman lemma and the multi-level termination results in Section 5.8.2, apply, because ⇒[A/B] and ⇒A/B are defined by ARS's; Theorem 7.7.7 below shows an equivalence of ⇒[A/B] and ⇒A/B. The following is proved similarly to Proposition 7.3.2:

Proposition 7.7.4 Given t, t′ ∈ TΣ(Y), Y ⊆ X and CMTRS (Σ, A, B), then t ⇒A/B,X t′ iff t ⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). Therefore t *⇒A/B,X t′ iff t *⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒A/B,X and *⇒A/B,X restrict and extend reasonably over variables, so we can drop the subscript X and use any X with var(t) ⊆ X; also as before, *⇔A/B,X does not restrict and extend reasonably, as shown by Example 5.1.15, so we define t *⇔A/B t′ to mean that there exists an X such that t *⇔A/B,X t′. Example 5.1.15 also shows bad behavior for ≃^X_{A/B} (defined by t ≃^X_{A/B} t′ iff A ∪ B ⊨ (∀X) t = t′), although again, rule (8) in Chapter 4 (extended to rewriting modulo B) implies that ≃^X_{A,B} does behave reasonably when the signature is non-void.
Defining ↓A/B,X from the ARS, we generalize Proposition 5.1.13, again allowing the subscript X to be dropped:

Proposition 7.7.5 Given t, t′ ∈ TΣ(Y), Y ⊆ X and CMTRS (Σ, A, B), then we have t ↓A/B,X t′ if and only if t ↓A/B,Y t′, and moreover, these imply A ∪ B ⊢ (∀X) t = t′. □

We next give a cute proof to illustrate the approach (some additional theory needed for its justification is discussed after Theorem 7.7.10):

Example 7.7.6 We continue Example 7.3.6 by showing that a ring has no zero divisors (i.e., non-zero elements a, b such that a * b = 0) if it satisfies the left cancellation law, that a * b = a * c and a ≠ 0 imply b = c. For this proof, we turn on the "reduce conditions" feature, so that when the conditional rewrite rule for the cancellation law is applied, its condition is automatically checked by reduction; because this rule involves Boolean "and" it is also important that include BOOL be turned on. The result of Example 7.3.6 is used as a lemma.

  set reduce conditions on .
  open RING . vars-of .
  ops a b c : -> R .
  eq A * 0 = 0 .    *** the lemma
  [lc] cq B = C if A * B == A * C and A =/= 0 .
  eq a * b = 0 .    *** the assumption
  show rules .
  start b .
  apply .lc with C = 0, A = a at top .
  close

Since the result is 0, as expected, the proof is done. Note that both == and =/= are used in the conditional equation lc. □

Exercise 7.7.1 Prove the converse of the last result of Example 7.3.6, that a ring satisfies the left cancellation law if it has no zero divisors. □

We now generalize some basic semantic results on term rewriting modulo equations to the conditional case, beginning with Theorem 7.3.4, noting that Proposition 7.2.2 and Theorem 7.2.3 were already stated for conditional equations. As in the unconditional case, we cannot hope for completeness due to the semantics of join conditional rewriting.
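The iterative construction of W⋄ in Definition 7.7.1 can be prototyped on a finite carrier. The Python sketch below, including the toy rule set and the helper names, is our own illustration: it computes the limit of the Rk by fixpoint iteration.

```python
from itertools import product

def star(rel, carrier):
    """Transitive, reflexive closure of a relation on a finite carrier."""
    closure = set(rel) | {(t, t) for t in carrier}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def joinable(rel, a, b, carrier):
    """a and b rewrite to a common element under rel."""
    return any((a, c) in rel and (b, c) in rel for c in carrier)

def cars_relation(rules, carrier):
    """Iterate R0 = identity, R(k+1) = fired rules together with Rk*,
    until the sequence is stationary; on a finite carrier the limit is
    the relation written W-diamond in Definition 7.7.1."""
    r = {(t, t) for t in carrier}
    while True:
        fired = {(t, u) for (t, u, conds) in rules
                 if all(joinable(r, x, y, carrier) for x, y in conds)}
        nxt = fired | star(r, carrier)
        if nxt == r:
            return r
        r = nxt

# Toy conditional system: 1 -> 2 unconditionally, and 2 -> 3 provided
# 1 and 2 are joinable, which becomes true once 1 -> 2 is available.
carrier = {1, 2, 3}
rules = [(1, 2, []), (2, 3, [(1, 2)])]
w = cars_relation(rules, carrier)
assert (1, 3) in w and (2, 3) in w
```

The conditional rule 2 → 3 only fires at the second stage of the iteration, exactly as in the stratified definition of the Rk.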
Theorem 7.7.7 (Soundness) Given a CMTRS (Σ, A, B) and t, t′ ∈ T_Σ(X), then

  t ⇒A/B t′ iff [t] ⇒[A/B] [t′],
  t ⇒*A/B t′ iff [t] ⇒*[A/B] [t′],
  [t] ⇒*[A/B] [t′] implies [A] ⊢_B (∀X) [t] = [t′],
  t *⊣A/B t′ implies A ∪ B ⊢ (∀X) t = t′.

Also *⊣A/B is sound for satisfaction of A ∪ B, and *⊣[A/B] is sound for satisfaction of A modulo B. Moreover, *⊣A/B ⊆ ≃^X_{A∪B} on terms with variables in X.

Proof: The first assertion follows from the definitions of ⇒A/B and ⇒[A/B] (Definition 7.7.3), and the second follows from the first by induction. The third follows because ⇒[A/B] rephrases the rule (+C B). The fourth follows from the second and third, plus Theorem 7.2.3. The fifth and sixth follow from the third and fourth plus Theorem 7.2.3. □

The following generalizes Theorem 7.3.9 to the conditional case:

Theorem 7.7.8 Given a ground canonical CMTRS (Σ, A, B), if t′, t′′ are both normal forms of a ground term t under ⇒A/B then t′ ≃_B t′′. Moreover, the B-equivalence classes of ground normal forms under ⇒A/B form an initial (Σ, A ∪ B)-algebra, denoted N_{Σ,A/B}, in the following way, where [[t]] denotes any arbitrary normal form of t, and where [[t]]_B denotes the B-equivalence class of [[t]]:

(0) interpret σ ∈ Σ_{[],s} as [[σ]]_B in N_{Σ,A/B,s}; and
(1) interpret σ ∈ Σ_{s1...sn,s} with n > 0 as sending ([[t_1]]_B, …, [[t_n]]_B) with t_i ∈ T_{Σ,s_i} to [[σ(t_1, …, t_n)]]_B in N_{Σ,A/B,s}.

Finally, N_{Σ,A/B} is Σ-isomorphic to T_{Σ,A∪B}.

Proof: For convenience, write N for N_{Σ,A/B}. The first assertion follows from the ARS result Theorem 5.7.2, using also Theorem 7.7.7. Note that σ_N is well defined by (1), by the first assertion, plus the fact that ≃_B is a Σ-congruence. Next, we check that N satisfies A ∪ B.
Satisfaction of B is by definition of N as consisting of B-equivalence classes of normal forms. Now let (∀X) t = t′ if C be in A; we need to prove that a(t) = a(t′) for all a : X → N that satisfy C. The proof follows that of Theorem 7.3.9, except that we must restrict to assignments that satisfy the condition, and that uses of Theorem 7.3.4 must be replaced by uses of Theorem 7.7.7. □

Definition 7.7.9 A CMTRS (Σ, A, B) is join condition canonical iff the CMTRS (Σ′, A′, B) is canonical, where: (1) Σ′ ⊆ Σ is least such that if t_i = t′_i is a condition in some rule r in A, then θ(t_i) and θ(t′_i) are in T_{Σ′} for all θ : X → T_Σ, where X = var(t) and t is the leftside of the head of rule r; and (2) A′ ⊆ A is least such that all conditional rules are in A′, and all unconditional rules that can be used in evaluating the conditions of rules in A are also in A′. □

It is of course possible that Σ′ = Σ, but in many real examples, conditions are relatively simple tests on data types that involve relatively few operations and rules. Consequently, (Σ′, A′, B) is often significantly simpler than (Σ, A, B). In OBJ, the operations used in conditions may be builtin rather than defined by explicit equations, but in this case, they should be considered defined by a canonical TRS.

Theorem 7.7.10 (Completeness) Given a join condition canonical CMTRS (Σ, A, B), the following four conditions are equivalent for any t, t′ ∈ T_Σ(X):

  t *⊣A/B t′
  A ∪ B ⊢ (∀X) t = t′
  [A] ⊢_B (∀X) t = t′
  t ≃_{A∪B} t′

Moreover, if (Σ, A, B) is Church-Rosser, then t ↓A/B t′ is also equivalent to the above. Finally, if (Σ, A, B) is canonical, then [[t]]_A ≃_B [[t′]]_A is also equivalent.
Proof: Equivalence of the last three of the first four assertions has already been proved. The first implies the second by the soundness of rewriting. For the converse, because (±C*B) is complete and is equivalent to bidirectional rewriting, we are done if the conditions of rules can always be evaluated; but this is given by the join condition canonical assumption. Equivalence of the fifth condition assuming Church-Rosser is a general ARS result, and equivalence of the final condition with the fourth is immediate when ⇒A/B is canonical. □

This generalization of Theorem 7.3.10 implies that we can define an operation == that works for a canonical CMTRS (Σ, A, B) the same way as for a TRS, CTRS, or MTRS that is canonical: t == t′ returns true over (Σ, A, B) iff t, t′ have the same normal form modulo B iff they are provably equal under A ∪ B as an equational theory. As before, even when (Σ, A, B) is not canonical, if t == t′ returns true then t and t′ are equal modulo B. Also as before, if (Σ, A, B) is non-canonical, rewriting can be unsound if == occurs in a negative position (such as =/= in a positive position) in a condition; however, it is sound if the system is canonical for the sorts of terms that occur in negative positions, with respect to the subset of rules actually used.

Exercise 7.7.2 Replace ↓ in Definition 7.7.1 by ↔* to obtain equality condition abstract rewrite systems, apply this definition to term rewriting modulo B to obtain equality condition term rewriting modulo equations, and then prove the analogs of Proposition 7.7.2 and Theorem 7.7.10, where the latter asserts completeness, not just soundness. □

The OBJ implementation does not use equality condition rewriting, because it is too inefficient for use in a practical system.
On the other hand, most of the literature on conditional rewriting, e.g., [113], uses equality condition rewriting, because it allows stronger theorems to be proved.

Fact 7.7.11 If a CMTRS has no variables in its rules, then it is terminating iff it is ground terminating, Church-Rosser iff ground Church-Rosser, and hence canonical iff ground canonical.

Proof: Let t be a term with variables. Because only subterms (modulo the equations) without variables can be redexes, and because rewriting on these ground subterms of t is terminating (or Church-Rosser), so is rewriting on all of t. Therefore the ground properties imply the general properties. The converse is immediate. □

Applying this result to triangular systems tells us that when variables are treated as constants, ground canonicity is equivalent to general canonicity.

The results of Section 5.3 on adding new constants generalize straightforwardly to conditional rewriting modulo B; we state these generalizations explicitly because of their importance for theorem proving, and because they appear to be new in this context.

Proposition 7.7.12 If a CMTRS (Σ, A, B) is terminating, or Church-Rosser, or locally Church-Rosser, then so is (Σ(X), A, B), for any suitable countable set X of variable symbols. □

The ARS proof of Proposition 5.3.1 generalizes, using the ARS (T_{Σ,B}(X^ω_S), ⇒[A/B]), where the reader should recall that for a given set S of sorts, X^ω_S denotes the ground signature with (X^ω_S)_s = { x_is | i ∈ ω } for each s ∈ S, a countable set of new variable symbols distinct from the symbols in Σ. The proof of Proposition 5.3.4 in Appendix B also generalizes to conditional rewriting modulo B, giving the following:

Proposition 7.7.13 A CMTRS (Σ, A, B) is ground terminating if (Σ(X), A, B) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A, B) is ground terminating iff (Σ(X), A, B) is ground terminating.
□

Corollary 7.7.14 If Σ is non-void, then a CMTRS (Σ, A, B) is ground terminating iff it is terminating. □

Exercise 7.7.3 Show that adding any set of constants to DNF gives a terminating MTRS. □

Proposition 5.3.6 in Section 5.7 can be generalized to the following:

Proposition 7.7.15 A CMTRS (Σ, A, B) is Church-Rosser if and only if (Σ(X^ω_S), A, B) is ground Church-Rosser, and (Σ, A, B) is locally Church-Rosser if and only if (Σ(X^ω_S), A, B) is ground locally Church-Rosser.

Proof: If we let T = T_{Σ,B}(X) and G = T_{Σ,B}(X^ω_S), then the proof given for Proposition 5.3.6 on page 125 goes through as it stands. □

Exercise 7.7.4 Use Corollary 7.7.14 and Proposition 7.7.15 to show that PROPC(X) is canonical if PROPC is canonical. □

Definition 7.5.2 (of commutativity) generalizes to conditional rules, as do Lemma 7.5.3 and Proposition 7.5.4; these results are stated below, and their proofs are exactly the same as for the unconditional case, except that Proposition 7.7.17 uses Lemma 7.7.16.

Lemma 7.7.16 Given a Σ-CMTRS A commuting with Σ-equations B, if t ⇒A/B t_1 ⇒A/B ⋯ ⇒A/B t_n then there exist t′_1, …, t′_n such that t ⇒A t′_1 ⇒A ⋯ ⇒A t′_n and t′_n ≃_B t_n. The same result also holds for weak rewriting modulo B. □

Proposition 7.7.17 If a CTRS (Σ, A) commuting with Σ-equations B is terminating, then the CMTRS (Σ, A, B) is also terminating, and the same holds under weak rewriting. □

We also have the following:

Proposition 7.7.18 Given a CMTRS C, let C_U be the MTRS whose rules are those of C with their conditions (if any) removed. Then C is terminating (or ground terminating) if C_U is.

Proof: Any rewrite sequence of C is also a rewrite sequence of C_U and therefore finite.
□

The following is a nice application of several results earlier in this chapter:

Theorem 7.7.19 Given a conditional triangular propositional system, let T be its set of equations, let A = T ∪ P where P is the set of equations in PROPC, and let B be the associative and commutative laws for and and xor. Then T and A are terminating modulo B.

Proof: Let T′, A′ be the unconditional non-modulo versions of T, A, respectively, so that A′ = T′ ∪ P. Example 5.8.27 showed that unconditional triangular propositional systems are terminating under rewriting modulo no equations, so T′ is terminating. Therefore T is terminating by Proposition 5.8.12, and Proposition 7.7.17 implies that T modulo B is terminating, since B commutes with T because it commutes with any rule with a constant as its leftside.

An argument similar to that of Example 5.8.27 shows that A′ is terminating. Let N = ω^{m+1}, where m is the number of dependent variables in T, and note that N is Noetherian. For t a Σ(Z)-term, let ψ_i(t) be the number of occurrences of y_i in t, let ρ(t) be as in Example 7.5.9 with ρ(z) = z for z ∈ Z, and let τ(t) = (ψ_1(t), ψ_2(t), …, ψ_m(t), ρ(t)). Then τ satisfies the hypotheses of Proposition 7.5.7, and thus A′ is terminating. Therefore Proposition 5.8.12 implies A is terminating, and so A is also terminating modulo B by Proposition 7.7.17. □

Given a poset P, we can define weak and strong ρ-monotonicity of conditional rewrite rules modulo B, of substitutions modulo B, and of operations in Σ, just as in Definition 5.5.3, except that T_Σ and T_Σ({z}_s) are replaced by T_{Σ,B} and T_{Σ,B}({z}_s), respectively. Note that as before, the inequalities for a rule are only required to hold when all the conditions of the rule converge (modulo B).
The following is the modulo B generalization of the most powerful termination result (Theorem 5.8.33) for conditional rules in Chapter 5:

Theorem 7.7.20 Let (Σ, A, B) be a CMTRS with Σ non-void and with (Σ′, A′, B) a CMTRS with Σ′ ⊆ Σ and A′ ⊆ A ground terminating; let P be a poset, and let N = A − A′ with Σ′ the minimal signature. If there is a ρ : T_{Σ,B} → P such that

(1) each rule in A′ is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or if not, then for each t ∈ T_{Σ,s} there is some Noetherian poset P^t_s ⊆ P_s such that t ⇒*[A/B] t′ implies ρ(t′) ∈ P^t_s,

then (Σ, A, B) is ground terminating. □

The proof, which is much like that of Theorem 5.8.33, is sketched in Appendix B.

Exercise 7.7.5 Given a CMTRS C, let C_U be the MTRS whose rules are those of C with their conditions (if any) removed. Show that C is Church-Rosser (or ground Church-Rosser) if C_U is. □

As always, ARS results apply directly, including the Newman Lemma and the Hindley-Rosen Lemma (Exercise 5.7.5), so we do not state them here. The results stated here are actually rather weak. Perhaps the most generally useful methods for proving Church-Rosser are based on the Newman Lemma, because it is usually much easier to prove the local Church-Rosser property. As discussed in Section 7.6, and in more detail in Chapter 12, although the Critical Pair Theorem (5.6.9) does not generalize to modulo B rewriting, in many cases the local Church-Rosser property can still be checked by a variant of an algorithm introduced by Knuth and Bendix [117] for the unsorted unconditional non-modulo case.
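On a finite abstract rewrite system, the local Church-Rosser property that the Newman Lemma takes as input can be checked by brute force. The following Python sketch is our own illustration (the relation, encoded as a dictionary of one-step successors, is an assumed toy example, not from the book):

```python
# Hedged sketch: brute-force check of the local Church-Rosser property
# on a finite ARS, given as a map from each element to its one-step successors.

def reachable(step, x):
    """All y with x =>* y (reflexive-transitive closure from x)."""
    seen, todo = {x}, [x]
    while todo:
        for y in step(todo.pop()):
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def locally_church_rosser(step, elems):
    """True iff every one-step peak y1 <= x => y2 has a common reduct."""
    return all(reachable(step, y1) & reachable(step, y2)
               for x in elems
               for y1 in step(x) for y2 in step(x))

# A terminating example: a => b, a => c, b => d, c => d.
# Locally Church-Rosser, hence Church-Rosser by the Newman Lemma.
DIAMOND = {'a': {'b', 'c'}, 'b': {'d'}, 'c': {'d'}, 'd': set()}
```

Deleting the two steps into d destroys the property, since the peak b ⇐ a ⇒ c then has no common reduct.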
The Hindley-Rosen Lemma applied to conditional rewriting modulo equations gives the following:

Proposition 7.7.21 Given Church-Rosser CMTRS's M_i = (Σ, A_i, B) for i ∈ I, which strongly commute in the sense that

  if t ⇒A_i/B t_1 and t ⇒A_j/B t_2 for some i, j ∈ I, then there is some t_3 such that t_1 ⇒⁼A_j/B t_3 and t_2 ⇒*A_i/B t_3,

then M = (Σ, ∪_{i∈I} A_i, B) is Church-Rosser, where ⇒⁼ indicates reflexive closure. □

Proposition 5.2.6 also generalizes, since it follows from the ARS result Proposition 5.7.6:

Proposition 7.7.22 If (Σ, A, B) is a Church-Rosser CMTRS, A ⊨ (∀X) t =_B t′ iff t ↓A/B t′. □

We next prove the following generalization of Proposition 7.6.2:

Proposition 7.7.23 Given a CTRS (Σ, A) commuting with B, if A is (locally) Church-Rosser, then the CMTRS (Σ, A, B) is also (locally) Church-Rosser.

Proof: For the Church-Rosser property, suppose t ⇒*A/B t_1 and t ⇒*A/B t_2. Then by Lemma 7.7.16, we can find t′_1 and t′_2 such that t ⇒*A t′_1 and t ⇒*A t′_2 with t′_1 ≃_B t_1 and t′_2 ≃_B t_2. Then by the Church-Rosser property, there is a term t_3 such that t′_1 ⇒*A t_3 and t′_2 ⇒*A t_3. Hence t_1 ⇒*A/B t_3 and t_2 ⇒*A/B t_3. So again by Lemma 7.7.16, there exist terms t′_3 and t′′_3 such that t_1 ⇒*A t′_3 and t_2 ⇒*A t′′_3 with t′_3 ≃_B t_3 and t′′_3 ≃_B t_3, so that t′_3 ≃_B t′′_3. Thus t_1 ⇒*A/B t′_3 and t_2 ⇒*A/B t′′_3 with t′_3 ≃_B t′′_3, and we are done. A similar proof works for the local Church-Rosser property.
□

The next result uses the above and Proposition 7.7.18:

Theorem 7.7.24 Given a conditional triangular propositional system T with consistent conditions, let B be the associative and commutative laws for and and xor, and let A = T ∪ P where P is the equations in PROPC excluding B. Then T and A are ground Church-Rosser modulo B, and hence ground canonical.

Proof: Theorem 7.7.19 shows termination modulo B for T and A. We now show T locally Church-Rosser modulo B, using the fact that only the symbols y_k can be rewritten. Suppose t ⇒A/B t_1 and t ⇒A/B t_2, with redexes y_i and y_j respectively. If y_i = y_j and they are at the same occurrence in t, then t_1 =_B t_2 by the consistent conditions hypothesis. Otherwise, we can rewrite y_j in t_1 and y_i in t_2 to get the same term t′. This gives the local Church-Rosser property, so that the Newman lemma implies that T modulo B is Church-Rosser and thus canonical. Since P modulo B is Church-Rosser, and it is not difficult to check that ⇒P/B and ⇒T/B strongly commute in the sense of Proposition 7.7.21 (the Hindley-Rosen Lemma), it follows that A is Church-Rosser modulo B. Therefore A is also canonical. □

7.8 Literature

The treatment of equational deduction modulo equations in Section 7.2 may be novel, but it is similar to the treatment of term rewriting modulo equations in Section 7.3, which follows the work of Huet [108] and others, although our exposition is more semantic and algebraic than the standard literature. Basic semantic results on rewriting modulo equations include its equivalence with class rewriting and its relationship with equational deduction (Theorem 7.3.4 and Proposition 7.2.2). Theorem 7.3.9 is also very fundamental, and although it will not surprise experts, it does not appear in the literature. Hsiang's Theorem (7.3.13) was first proved by Hsiang [106].
The exclusive normal form results such as Proposition 7.3.18 are not as well known as they should be, although they are direct consequences of Hsiang's Theorem. Theorem 7.3.22 says that reduction gives a decision procedure for the propositional calculus; it is of basic importance to this book, and the proof given here, which appears to be new, is a nice example of algebraic techniques in term rewriting theory.

Our algebraic approach to hardware verification was first outlined in [59]. Although influenced by Mike Gordon's clear expositions of hardware verification using higher-order logic, e.g., [93, 94], we disagree with Gordon's claim that higher-order logic is necessary for hardware verification. A key insight for this chapter is that equality is already a "bidirectional" (i.e., symmetric) relation, which of course is axiomatized by equational logic. Bidirectionality is needed for many important circuits, and conditional equational logic adds important expressive power. The results about triangular propositional systems seem to be new, and Proposition 7.4.3 and Theorem 7.4.10 justify the use of reduction to prove properties of combinatorial circuits. Proposition 7.4.16 and the subsequent discussion in Section 7.4.3 are very useful. Several of the main techniques from this chapter are illustrated in the proofs of Theorems 7.7.19 and 7.7.24. It is reasonable to conjecture that every triangular system has a most general partial solution, and that general conditional solutions are unique (even though most general unconditional solutions are not unique).

This approach to hardware verification has been applied to many examples, including some that are non-trivial [168, 169, 84], using the 2OBJ theorem prover. An early application of OBJ to hardware specification and testing appears in [159].
Most general solutions may remind some readers of unifiers, and indeed, they are a special case of unification understood in a sufficiently broad sense, as for example in [60], though this is not the place to discuss such abstract notions.

Many results from Section 7.5 onward are novel, in that few have been proved for many-sorted rewriting, let alone overloaded many-sorted rewriting, and several are new even for the unsorted case. The results about rules commuting with equations in Proposition 7.6.2 and its generalization Proposition 7.7.23 do not appear to be in the literature. Conditional abstract rewrite systems (Definition 7.7.1) appear to be a new and useful concept. The construction of W⋄ can be described more abstractly using concepts from Section 8.2: because the relation of the ARS is defined using only Horn clauses, an initiality theorem for Horn clause theories gives the least relation satisfying those sentences. The CARS approach could also have been applied to ordinary CTRS's, simplifying Section 5.8. Theorems 7.7.7, 7.7.8 and 7.7.10 extend the main semantic results of Chapter 5 to conditional term rewriting modulo equations. In particular, Theorem 7.7.8 makes explicit the connection with initial algebra semantics. Proposition 7.7.17 (about commutativity) and the very general Theorem 7.7.20 also appear to be useful new results for proving termination modulo equations.

I thank Prof. Mitsuhiro Okada and Dr. Monica Marcus for many valuable comments on the material in this chapter; the latter has read many parts of the chapter carefully and offered many corrections, though she is of course not responsible for any remaining errors.

A Note to Lecturers: This chapter contains a great deal of difficult material, much of which would have to be skipped in a one semester or one quarter course.
I have found it possible to present the hardware examples with only a pointer to the theory, since intuitions about hardware are strong enough to make the computations convincing. Of course, many proofs can be skipped in lectures and even in readings, especially since this chapter has the structure of a sequence of generalizations, in which similar results appear in gradually increasing generality.

First-Order Logic and Proof Planning

This chapter extends our algebraic approach to full first-order logic. Section 8.2 treats the special case of Horn clause logic, which we show is essentially the same as equational logic. Section 8.3 presents first-order logic syntax and some basics of its model theory; this development is unusual in treating the many-sorted case and in allowing (partially) empty models. Section 8.4 discusses proof planning. Proof rules for existential quantifiers, case analysis, and induction are given in Sections 8.5, 8.6, and 8.7, respectively. The notion of induction is unusually general.

First-order signatures provide symbols for building first-order sentences, but actually defining these sentences and their satisfaction is put off to Section 8.3.

Definition 8.1.1 Given a set S of sorts, an S-sorted first-order signature Φ is a pair (Σ, Π), where Σ is an S-sorted algebraic signature, i.e., an indexed set of the form { Σ_{w,s} | w ∈ S*, s ∈ S } whose elements are called function (or operation) symbols, and Π is an indexed set of the form { Π_w | w ∈ S* } whose elements are called predicate symbols, where π ∈ Π_w is said to have arity w. □

Example 8.1.2 Let Σ be the algebraic signature Σ_NATP of Example 2.3.3, with one sort, Nat, and operations 0, s; let Π have Π_Nat = { pos }, Π_NatNat = { geq }, and Π_w = ∅ otherwise. The signature Φ_NAT = (Σ, Π) is adequate for expressing many simple properties of natural numbers.
□

Our discussion of semantics for a given signature starts with its models:

Definition 8.1.3 Given a first-order signature Φ = (Σ, Π), a first-order Φ-model M consists of a Σ-algebra, also denoted M, together with, for each π ∈ Π_w, a subset M_π ⊆ M_w, where M_{s1...sn} denotes the set M_{s1} × ⋯ × M_{sn}. A model M is nonempty iff each M_s is nonempty. □

Think of M_π as the set of values where π is true in M. When w = [], then π ∈ Π_w should represent a relation that is constant, i.e., a truth value. Because M_[] is a one-point set, say { ⋆ }, there are only two possible values for M_π ⊆ { ⋆ }, namely ∅ and { ⋆ }. We let the first case mean π is false, and the second mean it is true.

Example 8.1.4 If we let Φ be the signature Φ_NAT of Example 8.1.2 above, then the standard Φ-model M has M_Nat = T_Σ (the natural numbers in Peano notation, with 0 and s interpreted as usual in T_Σ), with M_pos = { s(0), s(s(0)), s(s(s(0))), … } and with M_geq = { ⟨m, n⟩ | m ≥ n } (where ≥ has the usual meaning, and n, m are Peano numbers). Of course, there are many other Φ-models, most of which are not isomorphic to the natural numbers. For example, there are Φ-models with just one element. □

Exercise 8.1.1 For Φ the signature of Example 8.1.2 above, how many Φ-models M are there with M_Nat = { 0 }? How many are there with M_Nat = { 0, 1 }? □

Exercise 8.1.2 (a) Give a first-order signature Φ that is adequate for partially ordered sets, and interpret the natural numbers as a Φ-model. (b) Give a first-order signature Φ that is adequate for equivalence relations, and interpret the natural numbers as a Φ-model. □

Definition 8.1.5 Given a first-order signature Φ = (Σ, Π) and given Φ-models M, M′, then a Φ-morphism h : M → M′ is a Σ-homomorphism h : M → M′ such that for each π ∈ Π_{s1...sn},

  (m_1, …, m_n) ∈ M_π implies (h_{s1}(m_1), …
, h_{sn}(m_n)) ∈ M′_π

for all m_i ∈ M_{si}. The composition of two Φ-morphisms is their composition as Σ-homomorphisms. The identity Φ-morphism on M, denoted 1_M, is the identity on M.

A Φ-morphism h : M → M′ is a Φ-isomorphism iff there is a Φ-morphism g : M′ → M such that h ; g = 1_M and g ; h = 1_{M′}. Such a morphism g is called an inverse to h. □

Exercise 8.1.3 (a) Show that the composition of two Φ-morphisms is also a Φ-morphism. (b) Show that 1_M satisfies the identity law for composition of Φ-morphisms. (c) Show that if h is a Φ-isomorphism then it has a unique inverse. (d) Show that a bijective Φ-morphism is not necessarily a Φ-isomorphism. □

8.2 Horn Clause Logic

Horn clause logic is a sublogic of first-order logic that is essentially the same as conditional equational logic. Although Horn clause notation uses first-order logic symbols, this section will develop its syntax and satisfaction independently.

Definition 8.2.1 Given a first-order signature Φ = (Σ, Π), a Φ-Horn clause is an expression of the form

  (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p

where X is an S-sorted set of variable symbols, and the p_i, called atoms, are each of the form π(t_1, …, t_k) such that π ∈ Π_{s1...sk} and t_j ∈ T_Σ(X)_{sj} for j = 1, …, k. We may say that p is the head of the clause, and that p_1, …, p_n is its body. As usual, we assume that the components X_s of X are mutually disjoint, and are also disjoint from the symbols in Σ and Π. For n = 0, we include Horn clauses of the form (∀X) p, which are just universally quantified atoms. □

Note that the symbols ∀, ∧ and ⇒ in a Horn clause do not have any separate meanings, but are parts of one single mixfix symbol, as are the symbols ∀ and if in conditional equations.
Example 8.2.2 For Φ the signature of Example 8.1.2 above, the following are all Horn clauses:

  (∀n) geq(n, 0)
  (∀n, m) geq(n, m) ⇒ geq(s(n), m)
  (∀n, m) geq(n, m) ⇒ geq(s(n), s(m))
  (∀n) pos(s(n)) . □

Exercise 8.2.1 Are all the axioms for partially ordered sets Horn clauses? What about those for equivalence relations? □

Section 3.5 discussed extending an assignment θ : X → M, where X is a variable set and M is a Σ-algebra, to a Σ-homomorphism θ : T_Σ(X) → M, where θ(t) is the result of simultaneously substituting θ(x) for each x into t ∈ T_Σ(X). The following uses this for Φ-models M when Σ is the algebraic part of Φ.

Definition 8.2.3 Given a first-order signature Φ = (Σ, Π), a first-order Φ-model M, and a Φ-Horn clause hc of the form (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p_0, then we say M satisfies hc, and write M ⊨_Φ hc, iff for every assignment a : X → M,

  a(t_i) ∈ M_{π_i} for i = 1, …, n implies a(t_0) ∈ M_{π_0},

where p_i = π_i(t_i) with t_i = (t_{i1}, …, t_{ik(i)}), and where a(t_i) = (a(t_{i1}), …, a(t_{ik(i)})), for i = 0, …, n.

A Horn specification consists of a first-order signature Φ = (Σ, Π) and a set H of Φ-Horn clauses; we may write (Σ, Π, H), or (Φ, H), or even just H. A Φ-model M satisfies (Φ, H) iff it satisfies each clause in H, and then we write M ⊨_Φ H and call M an H-model. □

Let HCL denote the institution of Horn clause logic, consisting of its signatures (first-order signatures), its sentences (Horn clauses), its models and morphisms (first-order models and morphisms), and the notion of satisfaction in Definition 8.2.3 above.

Exercise 8.2.2 Show that each Horn clause in Example 8.2.2 is satisfied by the model M of Example 8.1.4, or else that it is not.
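Definition 8.2.3 is directly executable on a finite model: satisfaction can be decided by trying every assignment. The Python sketch below is our own illustration; since the standard model of Example 8.1.4 is infinite, we use an assumed finite stand-in with carrier {0, …, 4} and the successor capped at 4, which still satisfies all four clauses of Example 8.2.2.

```python
# Hedged sketch: brute-force Horn-clause satisfaction on a finite model.
from itertools import product

N = range(5)                                   # a finite carrier {0..4}
ops = {'0': lambda: 0, 's': lambda n: min(n + 1, 4)}   # capped successor
preds = {'geq': {(m, n) for m in N for n in N if m >= n},
         'pos': {(n,) for n in N if n > 0}}

def ev(term, a):
    """Evaluate a term such as ('s', 'n') under assignment a."""
    if isinstance(term, str):                  # a variable
        return a[term]
    op, *args = term
    return ops[op](*(ev(t, a) for t in args))

def holds(atom, a):
    pi, *args = atom
    return tuple(ev(t, a) for t in args) in preds[pi]

def satisfies(variables, body, head):
    """M |= (forall variables) body_1 /\\ ... /\\ body_n => head."""
    return all(holds(head, a)
               for vals in product(N, repeat=len(variables))
               for a in [dict(zip(variables, vals))]
               if all(holds(p, a) for p in body))

clauses = [  # the Horn clauses of Example 8.2.2
    (['n'], [], ('geq', 'n', ('0',))),
    (['n', 'm'], [('geq', 'n', 'm')], ('geq', ('s', 'n'), 'm')),
    (['n', 'm'], [('geq', 'n', 'm')], ('geq', ('s', 'n'), ('s', 'm'))),
    (['n'], [], ('pos', ('s', 'n'))),
]
```

By contrast, the non-clause (∀n) pos(n) fails at the assignment n ↦ 0, which the same check detects.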
□

(This chapter uses the word "institution" for the signatures, sentences, models, model morphisms, and satisfaction associated with some logical system. This is useful because we work with many different logical systems. Institutions are formalized in [67]; see also the discussion of the satisfaction condition in Section 4.10.)

Example 8.2.4 Letting Φ be the signature of Example 8.1.2 and H the Horn clauses of Example 8.2.2 gives a Horn specification for the natural numbers with predicates for inequality and positivity. The intended model is the initial model, which exists by Theorem 8.2.6 below. □

(⋆) Initial Horn Models

In a precise analogy with the equational case, we have the following definition and theorem:

Definition 8.2.5 Given a Horn specification (Φ, H), then an H-model I is initial iff given any H-model M, there is a unique Φ-morphism from I to M. □

Theorem 8.2.6 (Initiality) Every Horn specification has an initial model.

Proof: Given a Horn specification (Σ, Π, H), we construct an initial model T_H as follows:

1. If S is the sort set of Φ = (Σ, Π), let Ŝ = S ∪ { B }, where B ∉ S (think of B as the Booleans).

2. Define an algebraic Ŝ-sorted signature Π̂ by Π̂_{w,B} = Π_w for all w ∈ S⁺, and Π̂_{[],B} = { true } ∪ Π_[], and Π̂_{w,s} = ∅ otherwise. Let Φ̂ = (Ŝ, Σ ∪ Π̂).

3. Given a Φ-model M, let M̂ be the Φ̂-algebra constructed as follows:
(a) M̂_s = M_s for all s ∈ S;
(b) M̂_B = { π(m_1, …, m_n) | π ∈ Π and ⟨m_1, …, m_n⟩ ∉ M_π } ∪ { true };
(c) M̂_σ = M_σ for all σ ∈ Σ;
(d) M̂_π(m_1, …, m_n) = true if ⟨m_1, …, m_n⟩ ∈ M_π; and
(e) M̂_π(m_1, …, m_n) = π(m_1, …, m_n) if ⟨m_1, …, m_n⟩ ∉ M_π.

4. Conversely, any Φ̂-algebra A gives a Φ-model Ǎ by dropping everything involving the sort B, and defining ⟨m_1, …, m_n⟩ ∈ Ǎ_π iff A_π(m_1, …, m_n) = A_true.

5.
Now let Ĥ be the set of Φ̂-conditional equations of the form

  (∀X) p = true if { p_1 = true, …, p_n = true },

one for each Horn clause in H of the form (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p.

6. Finally, define T_H to be Ť_Ĥ, where T_Ĥ is the initial (Φ̂, Ĥ)-algebra.

We now check that this construction works, i.e., that T_H really is an initial (Φ, H)-model. Let M be any (Φ, H)-model. Then we can use 3 and 5 to check that the algebra M̂ satisfies Ĥ. Hence there is a unique Φ̂-homomorphism ĥ : T_Ĥ → M̂, which then gives us a Φ-morphism h : T_H = Ť_Ĥ → (M̂)ˇ = M. Uniqueness of this morphism follows from the fact that the translations described in 3 and 4 above define a bijective correspondence between Φ-morphisms and Φ̂-homomorphisms. □

This proof translates Horn clauses into conditional equations and Horn models into algebras, and then exploits the existence of initial algebras. It is noteworthy that Π̂ contains true but not false, and that the truth values of false atoms are just the atoms themselves. Note that by 3(a) and 6, the carrier (T_H)_s for s ∈ S is the same as that of T_Σ, because there are no equations among Σ-terms.

Exercise 8.2.3 Give the details of the argument that M̂ satisfies Ĥ for the above proof. □

Exercise 8.2.4 Show that if there are no atoms in H then (T_H)_π = ∅ for each π ∈ Π. □

Theorem 8.2.6 justifies defining relations "by induction" with Horn clauses: if we define a relation π with Horn clauses H and if M is another H-model having the same carriers as the initial H-model T_H, then the unique Φ-morphism h : T_H → M must be the identity, and so we must have (T_H)_π ⊆ M_π. The same argument works for any initial H-model; moreover, for any H-model M, the image of (T_H)_π under the unique homomorphism h to M is contained in M_π, that is, h((T_H)_π) ⊆ M_π. Thus (T_H)_π is the smallest relation satisfying the given formulae.
Example 8.2.7 We use Theorem 8.2.6 to define the transitive closure of a relation, i.e., the least transitive relation containing the given relation. Here Σ has just one sort, Elt, with Σ_{[],Elt} = X for some set X, and with all other Σ_{w,s} empty; and Π has Π_{Elt Elt} = { R, R* }, with all other Π_w empty. Assuming R is already defined on some set X, the following Horn clauses define the transitive closure R* of R, where the variables x, x′, x′′ range over X:

  x R x′ ⇒ x R* x′
  x R* x′ ∧ x′ R* x′′ ⇒ x R* x′′ .

We can use the construction in the proof of Theorem 8.2.6 to justify an OBJ specification for the transitive closure R* of a relation R. Because this specification works for any R, we give a theory for R; this is preceded by a specification for the auxiliary sort B and its one truth value, denoted tt to distinguish it from OBJ's builtin Boolean value true.

  obj B is sort B .
    op tt : -> B .
  endo

  th R is sort Elt .
    ex B .
    op _R_ : Elt Elt -> B .
  endth

  obj R* is pr R . ex B .
    op _R*_ : Elt Elt -> B .
    vars X Y Z : Elt .
    cq X R* Y = tt if X R Y == tt .
    cq X R* Z = tt if X R* Y == tt and Y R* Z == tt .
  endo

Notice that the module B has initial semantics, R has loose semantics, and R* has again initial semantics. The notation "pr R . ex B ." in R* indicates that the sort Elt is protected but the sort B is only extended; although the single truth value of sort B will not be corrupted, it is likely that other terms of sort B will be added to serve as "false" values for relations. On the other hand, the module R is protected in R*, because the equations in R* do not affect the relation R. What all this means is that given any relation R, a unique R* is defined for it; this can be seen by considering the models of R*, and a more formal approach involving second-order quantifiers is also given in Chapter 9.
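The initial-model reading of R*, the least relation closed under the two Horn clauses, can be computed for a finite relation by iterating the clauses to a fixed point. This Python sketch is our own illustration (the sample relation is assumed, not from the book):

```python
# Hedged sketch: the transitive closure R* as the least relation closed
# under the two Horn clauses above, computed by fixed-point iteration.

def transitive_closure(r):
    """r is a finite set of pairs; returns the least transitive
    relation containing r (the initial-model interpretation of R*)."""
    rstar = set(r)                    # clause 1: x R x' => x R* x'
    while True:
        # clause 2: x R* y and y R* z => x R* z
        new = {(x, z) for (x, y1) in rstar for (y2, z) in rstar if y1 == y2}
        if new <= rstar:
            return rstar              # closed under both clauses
        rstar |= new

R = {(1, 2), (2, 3), (3, 4)}
```

The loop adds only pairs forced by the clauses, so the result contains no "junk" pairs, mirroring the initiality argument that (T_H)_π is the smallest relation satisfying the formulae.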
Note that the second equation of the module R* cannot be used for reduction because its condition has variables that are not in its left side; however, OBJ3's apply can be used in equational deduction.

We can give a more idiomatic version of the above specification using builtin truth values; this is valid because if BOOL is protected in the module R, then it is necessarily also protected in R*, because BOOL is protected in the module R below iff the relation R is correctly defined in whatever model is chosen.

  th R is sort Elt .
    op _R_ : Elt Elt -> Bool .
  endth
  obj R* is pr R .
    op _R*_ : Elt Elt -> Bool .
    vars X Y Z : Elt .
    cq X R* Y = true if X R Y .
    cq X R* Z = true if X R* Y and Y R* Z .
  endo

(In OBJ, it is more natural to parameterize the object R* by the theory R, but since parameterization is not discussed until Chapter 11, we take a simpler approach here.) □

Example 8.2.7 illustrates a general method for replacing relations and Horn clauses by functions and equations. This implies we don't need to add relations and Horn clauses to an implementation of equational logic like OBJ.

Exercise 8.2.5 (a) Define a relation R on the Peano numbers by

  eq N R M = s N == M .

and use (one of) the above object(s) R* and OBJ3's apply to show that s 0 R* s s s 0. Now explain what R* is.

(b) Show that any relation is contained in a least equivalence relation, and give corresponding OBJ code defining that relation. Now use OBJ to compute three values of the equivalence closure of some simple (but not wholly trivial) relation. □

Exercise 8.2.6 Show that X R* Y in the second equation of the specification R* in Example 8.2.7 can be replaced by X R Y without changing the meaning; say what "without changing the meaning" means in this context.
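The initial-model reading of Example 8.2.7, that R* is the least relation closed under the two Horn clauses, can also be sketched outside OBJ as a fixed-point computation. The following Python sketch is illustrative only (the function name and the representation of a relation as a set of pairs are our own choices, not from the text): it simply iterates the two clauses until nothing new is added, which is exactly the "smallest relation satisfying the given formulae" of Theorem 8.2.6.

```python
# Illustrative Python sketch (not OBJ): the transitive closure of r is the
# least relation containing r and closed under the second Horn clause,
# computed by iterating that clause to a fixed point.

def transitive_closure(r):
    """Least transitive relation containing r, as a set of pairs."""
    closure = set(r)                       # clause 1: x R x' implies x R* x'
    while True:
        new = {(x, z)                      # clause 2: x R* y and y R* z
               for (x, y) in closure       #           imply x R* z
               for (y2, z) in closure if y == y2}
        if new <= closure:                 # nothing added: least fixed point
            return closure
        closure |= new

# A successor-like relation, as in Exercise 8.2.5(a) where N R M iff s N == M:
r = {(1, 2), (2, 3)}
print(sorted(transitive_closure(r)))   # [(1, 2), (1, 3), (2, 3)]
```

Because the iteration only ever adds pairs forced by the clauses, the result is contained in every relation satisfying them, mirroring the minimality argument after Exercise 8.2.4.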
□

8.3 First-Order Logic

This section gives an algebraic treatment of first-order logic syntax, and then defines first-order satisfaction; after that, adding equality predicates and fixing a "data" model are considered; Gödel's completeness and incompleteness theorems are also informally discussed.

We define the first-order sentences over a first-order signature Φ = (Σ, Π) by first defining Σ-terms and then defining Φ-formulae; we will have to be careful about variables. Let S be the sort set of Φ, let 𝒳 be an S-sorted set of variable symbols disjoint from Σ and Π, such that each sort has an infinite number of symbols, and let X be a fixed S-indexed subset of 𝒳.

A (Φ, X)-term is an element of T_{Σ∪X}. Recall that the (S-indexed) function Var is defined on T_{Σ∪X} as follows:

0. Var_s(σ) = ∅ if σ ∈ Σ_{[],s};
1. Var_s(x) = {x} if x ∈ X_s;
2. Var_s(σ(t_1, …, t_n)) = ⋃_{i=1}^n Var_s(t_i) for n > 0.

(Var is the unique (Σ∪X)-homomorphism T_{Σ∪X} → P(X), where the S-indexed set P(X) of all subsets of X is given an appropriate (Σ∪X)-structure.) We are now ready for the syntax of first-order logic:

Definition 8.3.1 A (well-formed) (Φ, X)-formula is an element of the carrier of the (one-sorted) free algebra WFF_X(Φ) defined to have the following as its (one-sorted) signature, which we denote Ω and call the metasignature:

0. a constant true,
1. a unary prefix operation ¬, called negation,
2. a binary infix operation ∧, called conjunction,
3. a unary prefix operation (∀x) for each x in X, called universal quantification over x,

plus as its generators (i.e., constants not in Ω), the atomic (Φ, X)-formulae, which are the elements of

  G_X = {π(t_1, …, t_n) | π ∈ Π_{s_1…s_n} and t_i ∈ (T_{Σ∪X})_{s_i} for i = 1, …, n}.
Note that Ω is infinite if X is, because then there is an infinite number of unary operations (∀x), but of course all first-order formulae are finite. The symbols in Ω are called logical symbols, whereas those in Φ are called non-logical symbols. Let WFF(Φ) = WFF_𝒳(Φ); it contains every WFF_X(Φ), and its elements are called Φ-formulae.

The functions Var and Free, giving the sets of all variables, and of all free variables, of Φ-formulae, are defined by the following:

0. Var(true) = Free(true) = ∅,
1. Var(π(t_1, …, t_n)) = Free(π(t_1, …, t_n)) = ⋃_{i=1}^n Var(t_i),
2. Var(¬P) = Var(P), and Free(¬P) = Free(P),
3. Var(P ∧ Q) = Var(P) ∪ Var(Q), and Free(P ∧ Q) = Free(P) ∪ Free(Q), and
4. Var((∀x)P) = Var(P) ∪ {x}, and Free((∀x)P) = Free(P) − {x}.

A variable that is not free is called bound; let Bound(P) = Var(P) − Free(P). A Φ-sentence is a Φ-formula P with no free variables, i.e., with Free(P) = ∅; Φ-sentences are also called closed Φ-formulae. A formula that is not closed is called open. Let FoSen(Φ) denote the set of all Φ-sentences. □

Exercise 8.3.1 Show that the functions Var and Free are Ω-homomorphisms, by giving P(𝒳) appropriate Ω-algebra structures. □

We introduce the remaining logical connectives, false, ∨, ⇒, ⇔, and (∃x) (the last four are called disjunction, implication, equivalence, and existential quantification, respectively), as abbreviations for certain terms over the operations already introduced, as follows:

  false = ¬true
  P ∨ Q = ¬(¬P ∧ ¬Q)
  P ⇒ Q = ¬P ∨ Q
  P ⇔ Q = (P ⇒ Q) ∧ (Q ⇒ P)
  (∃x)P = ¬((∀x)¬P).

The symbols P, Q above are variables over formulae, and the five operation symbols on the left sides of the equations extend the metasignature Ω to a new metasignature Ω̄.
Given any Ω-algebra M, the five equations above extend M in a unique way to an Ω̄-algebra; moreover, any Ω-homomorphism is automatically an Ω̄-homomorphism between its extended algebras. In particular, the Ω-algebras WFF_X(Φ) extend to Ω̄-algebras, and the Ω-homomorphisms Var and Free extend to Ω̄-homomorphisms that correctly handle the new logical symbols. Without these symbols, many of our theorem-proving applications would be much more awkward; this illustrates the conflict between logics for foundations and logics for applications discussed in Section 1.3.

Exercise 8.3.2 Extend the recursive definitions of Var and Free so that they directly handle the new symbols in Ω̄. □

It might first appear that formulae like the following are ill-formed or ambiguous:

  (∃x)(∀x) geq(x, x)
  (∃x)(pos(x) ∧ (∀x) geq(x, x))
  (∃x)(((∀x) pos(x)) ∧ geq(x, x)).

But because we defined quantifiers as unary operations on expressions, every quantifier has a unique argument, a subformula called its scope. Every free instance of the quantifying variable within its scope is said to be bound (or captured) by that quantifier. Thus, in the first formula above, the two x's in geq(x, x) are bound to the universal quantifier, not to the existential. In the second formula, the first x is bound to the existential quantifier, and the other two are bound to the universal. In the third formula, the first x is bound to the universal and the next two are bound to the existential quantifier. It is poor style to keep reusing the same variable for quantifiers, and the following equivalent formulae would have been clearer:

  (∃y)(∀x) geq(x, x)
  (∃x)(pos(x) ∧ (∀y) geq(y, y))
  (∃y)(((∀x) pos(x)) ∧ geq(y, y)).
However, the original formulae still have definite structure and meaning, due to our algebraic notion of formula, which does not require any prior definition of scope.

Of course, it is still very possible to write ambiguous formulae, such as

  (∃x) pos(x) ∧ geq(x, x),

where the argument (scope) of the existential quantifier cannot be determined; however, this is a parsing problem, not a problem in first-order logic as such (see Section 3.7). In fact, it is rather common to write ambiguous formulae when it doesn't matter which parse is taken. For example, in the formula

  (∃x) pos(x) ∧ (∀x) geq(x, x),

it is not clear whether the existential quantifier acts on the universally quantified subformula, but it does not matter because that subformula is closed (see E17 of Exercise 8.3.10). This situation is much the same as in arithmetic when we write x + y + z instead of x + (y + z) or (x + y) + z, since we know it doesn't matter because addition is associative. In OBJ, precedence declarations can make quantifiers bind however tightly we wish.

In summary, our algebraic approach to first-order logic syntax has the advantage over the more ad hoc approaches usually found in the literature of a clean separation between structure and parsing, which simply avoids complex and confusing definitions of scope.

Many different systems of deduction have been given for first-order logic. Kurt Gödel first showed completeness for one of these with respect to a "semantic definition of truth" given by Tarski; this is a notion of satisfaction of first-order sentences by first-order models. All sound and complete systems are equivalent in the sense that they give rise to the same theorems for any theory.
This chapter uses (a version of) Tarski's semantics to justify a set of rules that transform complex proof tasks into Boolean combinations of simpler proof tasks; we call these proof planning rules; we do not attempt a completeness proof. More technically, the definitions in this subsection first extend assignments a : X → M from terms to first-order formulae P, and then define the "meaning" or denotation [[P]] of a formula P to be the set of all assignments that make P true.

Definition 8.3.2 Given a first-order signature Φ = (Σ, Π), a Φ-model M, and an assignment (of values in M to variables in X), i.e., an S-indexed function a : X → M, we define a : WFF_X(Φ) → B, where B = {true, false}, by the following:

0. a(true) = true.
1. a(¬P) = ¬a(P).
2. a(P ∧ Q) = a(P) ∧ a(Q).
3. a((∀x)P) = true iff b(P) = true for all b : X → M such that b(z) = a(z) if z ≠ x.
4. a(π(t_1, …, t_k)) = true iff (a(t_1), …, a(t_k)) ∈ M_π.

When X is small, it may be convenient to use the notation P[x_1 ← m_1, x_2 ← m_2, …, x_n ← m_n] instead of a(P), with X = {x_1, …, x_n} and a(x_i) = m_i for i = 1, …, n.

We now define the denotation of a (Φ, X)-formula P, written [[P]]^M_X, or just [[P]], to be

  {a : X → M | a(P) = true}.

Then M satisfies P ∈ WFF_X(Φ), written M ⊨_Φ P, iff [[P]]^M_X = [X → M], i.e., iff all assignments from X to M make P true. Given a set A of well-formed Φ-formulae, let M ⊨_Φ A mean M ⊨_Φ P for each P ∈ A, and let A ⊨_Φ P mean that M ⊨_Φ A implies M ⊨_Φ P for all Φ-models M; when the symbol ⊨ is used in this way, it may be called semantic entailment. Note that A need not be finite. A set A of formulae is closed iff all its elements are closed. As usual, we may omit the subscript Φ on ⊨_Φ when it is clear from context. Let us write FOL for the institution of first-order logic with this notion of satisfaction.
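For a finite model, the clauses of Definition 8.3.2 can be executed directly, since the quantifier clause only needs to range over finitely many assignments. The following Python sketch is illustrative only: the tuple encoding of formulae, the dictionary representation of a model, and the restriction of terms to bare variables are all our own simplifying assumptions, not from the text.

```python
# Illustrative Python sketch of Definition 8.3.2 over a finite model.
# Formulae: ('true',), ('not', P), ('and', P, Q), ('forall', x, P),
# ('atom', pi, v1, ..., vk) where the v's are variable names.

from itertools import product

def holds(phi, model, a):
    """a(phi), for an assignment a given as a dict from variables to elements."""
    tag = phi[0]
    if tag == 'true':                                   # clause 0
        return True
    if tag == 'not':                                    # clause 1
        return not holds(phi[1], model, a)
    if tag == 'and':                                    # clause 2
        return holds(phi[1], model, a) and holds(phi[2], model, a)
    if tag == 'forall':                                 # clause 3: all b that
        _, x, body = phi                                # agree with a off x
        return all(holds(body, model, {**a, x: m}) for m in model['carrier'])
    if tag == 'atom':                                   # clause 4
        _, pi, *terms = phi
        return tuple(a[t] for t in terms) in model['rel'][pi]
    raise ValueError(tag)

def satisfies(phi, model, variables):
    """M |= phi: every assignment from the variables into M makes phi true."""
    carrier = model['carrier']
    return all(holds(phi, model, dict(zip(variables, vals)))
               for vals in product(carrier, repeat=len(variables)))

# geq on the carrier {0, 1, 2}: M |= (forall x) geq(x, x) holds,
# but geq(x, y) fails for the assignment x = 0, y = 1.
M = {'carrier': [0, 1, 2],
     'rel': {'geq': {(i, j) for i in range(3) for j in range(3) if i >= j}}}
print(satisfies(('forall', 'x', ('atom', 'geq', 'x', 'x')), M, ['x', 'y']))
```

Note how `satisfies` quantifies over *all* assignments, exactly as [[P]]^M_X = [X → M] requires, while the `forall` clause only varies the one quantified variable.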
□

Intuitively, 3 in Definition 8.3.2 says [[(∀x)P]] is the set of assignments that make P true no matter what value they assign to x. This suggests that when P has no free variables, a(P) should be independent of the values a(x) for all x ∈ X. This is made precise in the following:

Proposition 8.3.3 Given a (Φ, X)-formula P and assignments a, a′ : X → M, if a(x) = a′(x) for all x ∈ Free(P), then a(P) = a′(P).

Proof: We use induction on the structure of (Φ, X)-formulae. The two base cases are 0 and 4 of Definition 8.3.2. For 0, if P = true, then a(P) = a′(P) for all a, a′. For 4, if P = π(t_1, …, t_k), then Free(P) = Var(P) = ⋃_{i=1}^k Var(t_i), and so if a(z) = a′(z) for all z ∈ Free(P) then a(t_i) = a′(t_i) for i = 1, …, k, and so a(P) = a′(P).

There are three "step" cases, of which 1 and 2 are easy, because Free(¬P) = Free(P) and Free(P ∧ Q) = Free(P) ∪ Free(Q). For 3, assume a(P) = a′(P) if a(z) = a′(z) for all z ∈ Free(P), and let Q = (∀x)P. Then Free(Q) = Free(P) − {x}, call it Z, and suppose a(z) = a′(z) for all z ∈ Z. Now a(Q) = a((∀x)P) = true iff b(P) = true for all b such that b(z) = a(z) if z ≠ x; also a′(Q) = a′((∀x)P) = true iff b′(P) = true for all b′ such that b′(z) = a′(z) if z ≠ x. Now suppose a(Q) = true and let b′ : X → M be such that b′(z) = a′(z) if z ≠ x. Define b : X → M by b(z) = a(z) if z ≠ x and b(x) = b′(x). Then b(z) = b′(z) for each z ∈ Free(P), and so the induction hypothesis gives us b′(P) = b(P) = true, and hence a′(Q) = true. Similarly, we can show that a′(Q) = true implies a(Q) = true. Therefore a(Q) = a′(Q) if a(z) = a′(z) for all z ∈ Z.
□

Corollary 8.3.4 If P is a closed (Φ, X)-formula and M is a Φ-model, then either [[P]]^M_X = ∅ or else [[P]]^M_X = [X → M].

Proof: Since Free(P) = ∅, we have a(z) = a′(z) for all z ∈ Free(P) for any a, a′ at all. Therefore a(P) = a′(P) for all a, a′. Hence a(P) = true for all a, or else a(P) = false for all a. In the first case, [[P]] = [X → M], while in the second [[P]] = ∅. □

That is, any closed formula is either true or else false of any given model.

As usual, [[_]]^M_X is an Ω-homomorphism to a suitable target algebra, in this case with carrier A^M_X = P([X → M]); we usually drop the superscript M and subscript X. The Ω-algebra structure for A is given as follows, for A, B ⊆ [X → M]:

0. A_true = [X → M].
1. A_¬(A) = [X → M] − A.
2. A_∧(A, B) = A ∩ B.
3. A_{(∀x)}(A) = {a : X → M | a′(y) = a(y) for all y ≠ x implies a′ ∈ A}.

If we now define α : G_X → A_X by

  α(π(t_1, …, t_n)) = {a | (a(t_1), …, a(t_n)) ∈ M_π},

then [[_]] is the unique Ω-homomorphism WFF_X(Φ) → A_X extending α; i.e., [[P]] = α(P).

Definition 8.3.5 First-order (Φ, X)-formulae P, Q are (semantically) equivalent, written P ≡ Q, iff [[P]]^M_X = [[Q]]^M_X for all M. □

Note that ≡ is neither a logical nor a non-logical symbol, but a metalogical symbol, used for talking about the satisfaction of formulae. Equivalent formulae are true under exactly the same circumstances and hence can be substituted for each other without changing the truth value of any formula of which they are part. (The ubiquity of this concept reflects the obsession of classical logic with truth.)

Exercise 8.3.3 Given a first-order signature Φ, a Φ-model M, and (Φ, X)-formulae P, Q, show the following:

(a) M ⊨ P ⇒ Q iff [[P]]^M_X ⊆ [[Q]]^M_X.
(b) P ≡ Q iff M ⊨ (P ⇔ Q) for all M.
(c) P ≡ Q implies (M ⊨ P iff M ⊨ Q) for all M.
(d) "implies" cannot be replaced by "iff" in (c) above.
(e) [[(∀x)P]]^M_X ⊆ [[P]]^M_X.

Note that (c) implies that P ≡ Q implies ([[P]] = [X → M] iff [[Q]] = [X → M]). □

Exercise 8.3.4 Given a first-order signature Φ, and Φ-formulae P, Q, R, show the following:

E1. P ∧ Q ≡ Q ∧ P.
E2. P ∧ (Q ∧ R) ≡ (P ∧ Q) ∧ R.
E3. P ∧ P ≡ P.

Also, for Φ-formulae A, A′, P, P′, show that

E4. A ≡ A′ and P ≡ P′ imply (A ⊨ P iff A′ ⊨ P′). □

E4 is a (weak) version of Leibniz's principle, that equal things may be substituted for each other; this supports the importance of ≡, and underlines the somewhat strange view of traditional logic that all true sentences are equal (as are all false sentences). The following builds on E1, E2 and E3:

Notation 8.3.6 In the notation "A ⊨ …" where A is a set, we may write A, P or A ∧ P for A ∪ {P}, and write A, A′ or A ∧ A′ for A ∪ A′. For example, A, P, Q ⊨ P, Q. □

This notation makes sense because both set notation and conjunction are commutative, associative and idempotent. Any finite set A can be regarded as the conjunction of its sentences, although this does not work if A is infinite.

Exercise 8.3.5 Let Φ be the signature of Example 8.1.2 above, let M be the standard Φ-model of Example 8.1.4, and let X = {x, y}. Now describe [[P]], for P each of the following:

  geq(s(s(s(0))), x)
  (∀x) geq(s(s(s(0))), x)
  (∀x)(∀y) geq(x, y) ∧ pos(x)
  (∃y) geq(x, y) ∧ pos(x)
  (∀y)(∃x) geq(x, y) ∧ pos(x)
  (∀x) pos(x) ⇒ geq(y, x)
  (geq(x, s(s(0))) ∧ geq(s(s(s(0))), x) ∧ geq(x, y)) ∨ (geq(x, y) ∧ geq(y, x)) □

Proposition 8.3.7 Given a signature Φ, a Φ-model M, and (Φ, X)-formulae P, Q, then:

P1. M ⊨ P ∧ Q iff M ⊨ P and M ⊨ Q.
P2. M ⊨ P ∨ Q if M ⊨ P or M ⊨ Q.
P3. M ⊨ P ∨ Q iff (M ⊨ P or M ⊨ Q), if P or Q is closed.
P4. M ⊨ P ⇒ Q iff (M ⊨ P implies M ⊨ Q), if P is closed.
P5. M ⊨ ¬¬P iff M ⊨ P.
P5a. M ⊨ ¬P iff M ⊨ P is false, if P is closed and M nonempty (recall that this means that all its carriers are non-empty).
P6. M ⊨ (∀x)P iff M ⊨ P.
P7. M ⊨ P if M ⊨ Q and M ⊨ Q ⇒ P.

Proof:

P1. Since [[P ∧ Q]] = [[P]] ∩ [[Q]], we have [[P ∧ Q]] = [X → M] iff [[P]] = [[Q]] = [X → M].

P2. Since [[P ∨ Q]] = [[P]] ∪ [[Q]], we have [[P ∨ Q]] = [X → M] if [[P]] = [X → M] or [[Q]] = [X → M].

P3. If P is closed then [[P]] = [X → M] or else [[P]] = ∅ by Corollary 8.3.4. Therefore [[P ∨ Q]] = [X → M] iff [[P]] = [X → M] or [[Q]] = [X → M]. The argument is the same for the case where Q is closed.

P4. This follows from Corollary 8.3.4 plus the fact that [[P ⇒ Q]] = ([X → M] − [[P]]) ∪ [[Q]].

P6. We need [[(∀x)P]] = [X → M] iff [[P]] = [X → M], for x ∈ X. By (e) of Exercise 8.3.3, [[(∀x)P]] = [X → M] implies [[P]] = [X → M]. Conversely, if [[P]] = [X → M], then for each a : X → M and each b : X → M such that b(z) = a(z) for all z ≠ x, we have b(P) = true, so that a ∈ [[(∀x)P]]. Therefore [[(∀x)P]] = [X → M].

P5, P5a and P7 are left as exercises. □

P5 is a semantic version of ¬¬P ≡ P, and is often called the law of double negation. P7 is called modus ponens and goes back to the ancient Greeks (though the name is Latin); it is closely related to the corresponding rule of deduction.

Example 8.3.8 The "if" in P2 cannot be strengthened to "iff." Let Φ be the signature of Example 8.1.2, let M be its standard model, and let P, Q be the formulae pos(x), ¬geq(x, s(0)), respectively. Then M ⊨ P ∨ Q holds, but both M ⊨ P and M ⊨ Q are false.
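Example 8.3.8 can be checked mechanically over a small finite carrier. The following Python fragment is an illustrative sketch (with {0, 1, 2} standing in for an initial segment of the standard model, a simplification of our own): it shows P ∨ Q satisfied by every assignment while neither P nor Q is, so the "if" of P2 is indeed not an "iff".

```python
# Illustrative check of Example 8.3.8 / P2 on the carrier {0, 1, 2}:
# P = pos(x) means x >= 1, and Q = not geq(x, s(0)) means x < 1.

carrier = [0, 1, 2]
P = lambda x: x >= 1          # pos(x)
Q = lambda x: x < 1           # not geq(x, s(0))

sat_P_or_Q = all(P(x) or Q(x) for x in carrier)   # every x satisfies a disjunct
sat_P = all(P(x) for x in carrier)                # fails at x = 0
sat_Q = all(Q(x) for x in carrier)                # fails at x = 1

print(sat_P_or_Q, sat_P, sat_Q)                   # True False False
```

The same shape of counterexample shows why P3 needs one disjunct to be closed: a disjunction of open formulae can hold "pointwise" without either disjunct holding for all assignments.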
□

Exercise 8.3.6 The following refer to Proposition 8.3.7 above:

(a) Give a signature Φ, a Φ-model M and Φ-formulae P, Q which show that P4 does not hold without the restriction that P is closed.
(b) Show that M ⊨ P ⇒ Q implies (M ⊨ P implies M ⊨ Q), with neither P, Q required to be closed.
(c) Prove P5 and P5a.
(d) Prove P7. □

The following results for semantic entailment are analogous to those in Proposition 8.3.7:

Proposition 8.3.9 Let Φ be a first-order signature, let A be a set of (Φ, X)-formulae, and let P, Q be (Φ, X)-formulae. Then

R1. A, P ⊨ P.
R2. A ⊨ P ∧ Q iff A ⊨ P and A ⊨ Q.
R3. A ⊨ P ∨ Q if A ⊨ P or A ⊨ Q.
R4. A ⊨ P ⇒ Q iff A, P ⊨ Q, if P is closed.
R4a. A ⊨ ¬P iff A, P ⊨ false, if P is closed.
R5a. A ⊨_Φ (∀x)P iff A ⊨_Φ P, if x is not free in A.
R5. A ⊨_Φ (∀x)P iff A ⊨_{Φ({x})} P, if x is not free in A.
R5b. A, (∀x)P ⊨ Q iff A, P ⊨ Q.
R6. A ⊨ P if A ⊨ Q and A ⊨ Q ⇒ P.
R6a. A ⊨ P ⇒ R if A ⊨ P ⇒ Q and A ⊨ Q ⇒ R.

Proof: R1, R2 and R3 follow directly from the definitions, using P1 and P2 of Proposition 8.3.7. For R4,

  A ⊨ P ⇒ Q iff
  for all models M, (M ⊨ A) ⇒ (M ⊨ P ⇒ Q) iff
  for all models M, ¬(M ⊨ A) ∨ ¬(M ⊨ P) ∨ (M ⊨ Q) iff
  for all models M, ¬(M ⊨ A ∧ P) ∨ (M ⊨ Q) iff
  for all models M, (M ⊨ A ∧ P) ⇒ (M ⊨ Q),

which is equivalent to A, P ⊨ Q, where the first ⇒ in the first line and all the iffs are in the metalanguage, while the second ⇒ is in first-order logic, and where the first iff uses P4. R4a follows from R4, by substituting false for Q and ¬P for P, and using ¬P ≡ (P ⇒ false).
R5a, R5b and R6 follow from the corresponding cases of Proposition 8.3.7. For R5, by R5a it suffices to show that A ⊨_Φ P iff A ⊨_{Φ({x})} P, i.e., to show that the following are equivalent:

  for all models M, (M ⊨_Φ A ⇒ M ⊨_Φ P)
  for all models M′, (M′ ⊨_{Φ({x})} A ⇒ M′ ⊨_{Φ({x})} P),

noting that the M in the first assertion are Φ-models, while the M′ in the second are Φ({x})-models. Since x is not free in A, it is sufficient to show

  for all models M, M ⊨_Φ P iff for all models M′, M′ ⊨_{Φ({x})} P.

These two expressions respectively equal

  for all models M, [[P]]^M_X = [X → M]
  for all models M′, [[P]]^{M′}_{X−{x}} = [(X − {x}) → M′],

which are equivalent because an assignment a : X → M to a Φ-model M is the same thing as an assignment a′ : (X − {x}) → M′ to a Φ({x})-model M′. R6a is left as an exercise. □

R4 is a semantic version of the Theorem of Deduction. R4a justifies proof by contradiction. R5 is a semantic version of the Theorem of Constants, and the heart of its proof is similar to that for the equational case; see also (d) of Exercise 8.3.7 below. Note that in forming Φ({x}), Φ and {x} are disjoint because Φ and X are. Strictly speaking, we have changed the variable set from X to X − {x}, so that occurrences of x in Φ({x})-formulae are constants, not variables. R6 is a semantic version of modus ponens. R5b implies that outermost universal quantifiers can be removed from the left of a turnstile; however, this should not be done automatically, because it precludes substituting for the variable involved (see Section 8.3.4). R6a expresses the transitivity of implication.
Exercise 8.3.7 The following refer to Proposition 8.3.9:

(a) Give Φ, A, P showing that the "if" in R3 cannot be strengthened to "iff".
(b) Prove R4 with "only if" replacing "iff" and without the clause "if P is closed."
(c) Give Φ, A, P showing that neither direction of the assertion A ⊨ ¬P iff A ⊭ P is correct.
(d) Generalize R5, replacing x by a variable set X.
(e) Prove R6a.

Hint: Very simple choices will work for (a) and (c). □

Mathematics, and especially logic, is often said to deal with absolute or eternal truths, true under all possible interpretations in all possible models (or "worlds"). Mathematical truths are also said to be formal truths, "trivially" true for formal, non-empirical reasons (though of course establishing such truths can be non-trivial); the word "tautological" is also used.

Definition 8.3.10 A Φ-formula P is a tautology iff M ⊨ P for every Φ-model M. □

Exercise 8.3.8 For Φ the signature of Example 8.1.2, show that each of the following is a tautology, or else show that it is not:

  (∀x)(geq(x, 0) ⇒ pos(x))
  geq(x, x)
  geq(x, y) ∨ ¬geq(x, y)
  geq(x, y) ∨ geq(y, x)
  (∀x) geq(x, x)
  (∀x)(geq(x, y) ⇒ geq(x, x))
  (∀x)(pos(x) ⇒ pos(x)).

Hint: Don't forget that there is no fixed Φ-model here. □

Exercise 8.3.9 Prove the following equivalences for P, Q first-order formulae:

E5. true ∧ P ≡ P.
E6. false ∧ P ≡ false.
E7. true ∨ P ≡ true.
E8. false ∨ P ≡ P.
E9. P ∧ ¬P ≡ false.
E10. P ∨ ¬P ≡ true.
E11. ¬¬P ≡ P.
E12. (∀x)(∀y)P ≡ (∀y)(∀x)P.
E13. (∀x)(∀x)P ≡ (∀x)P. □

Notation 8.3.11 Let X be a variable set with elements x_1, …, x_n. Then we may write (∀X)P for (∀x_1)…(∀x_n)P. By E12 and E13, ordering and repetition of variables do not matter. Note that (∀X) does not make sense if X is infinite. We extend existential quantifiers in the same way, to (∃X) where X is a finite set of variables.
□

Results about quantifiers generally extend by induction on the number of quantified variables. Let Φ(X) denote the first-order signature (Σ(X), Π) when Φ = (Σ, Π).

Proposition 8.3.12 Let Φ be a first-order signature, A a set of Φ-sentences, and P a Φ-formula. Then

R5aX. A ⊨_Φ (∀X)P iff A ⊨_Φ P.
R5X. A ⊨_Φ (∀X)P iff A ⊨_{Φ(X)} P.

R5X is the classical Theorem of Constants. □

Exercise 8.3.10 Prove the following equivalences for P, P′, Q, R first-order formulae:

E14. (P ⇒ Q) ∧ (P′ ⇒ Q) ≡ (P ∨ P′) ⇒ Q.
E15. (∀X)(P ∧ Q) ≡ (∀X)P ∧ (∀X)Q.
E16. (∀X)P ≡ P if P is closed.
E17. (∃X)(P ∨ Q) ≡ (∃X)P ∨ (∃X)Q.
E18. (∃X)(P ∧ Q) ≡ ((∃X)P) ∧ Q if Q is closed.
E19. (∃X)P ≡ P if P is closed.
E20. (∀X)(∀Y)P ≡ (∀Y)(∀X)P.
E21. (∃X)(∃Y)P ≡ (∃Y)(∃X)P.
E22. (P ∨ Q) ∧ R ≡ (P ∧ R) ∨ (Q ∧ R).

E22, as well as E5–E11, are examples of the very general principle that every equational law of Boolean algebra holds as an equivalence of first-order formulae, i.e., is a tautology; see also E1–E3. □

Exercise 8.3.11 Let Φ be a first-order signature and let P, Q be Φ-formulae.

(a) Show that P is a tautology iff true ⊨_Φ P.
(b) Give an example showing that the condition "Q is closed" is necessary in E18.
(c) Show that if Q is closed then (∀X)(P ∨ Q) ≡ ((∀X)P) ∨ Q. □

We noted earlier that in our syntax for Horn clause logic, the symbols ∀, ∧, and ⇒ are not themselves logical symbols, but instead together constitute a single mixfix logical symbol.
However, this notation does suggest a simple translation into first-order logic, where each symbol is taken as the corresponding logical symbol in first-order logic. This can perhaps be made clearer by adding parentheses, so that the translation of the Horn clause

  h = (∀X) p_1 ∧ ··· ∧ p_n ⇒ p

is the first-order formula

  h′ = (∀X)((p_1 ∧ ··· ∧ p_n) ⇒ p),

where of course the same signature Φ is used in each case. The following enables us to regard HCL as a "subinstitution" of FOL:

Fact 8.3.13 Let M be a Φ-model, let h be a Horn clause, and let h′ be its first-order translation. Then M ⊨_Φ h iff M ⊨_Φ h′. □

Exercise 8.3.12 Prove Fact 8.3.13 from the appropriate definitions of satisfaction. □

The following demonstrates the very important fact that initial models do not always exist for theories over full first-order logic; this implies that it is not (in general) valid to do induction over models defined by sets of first-order sentences, even if they are supposed to be initial.

Example 8.3.14 Let Σ have one sort and two constants, a, b, let Φ have just one relation symbol, π, and let A consist of the axiom π(a) ∨ π(b). This specification has no initial model: clearly the carrier must be {a, b}, but there is no way to get a smallest subset for π; in fact, there are two different equally good (and equally bad) minimal choices, namely {a} and {b}. □

Proposition 8.3.15 Given a (Φ, X)-sentence (∃x)P with Free(P) = {x} and a Φ-model M with all carriers nonempty, then M ⊨ (∃x)P iff there is an assignment a : X → M such that a(P) = true.

Proof: By the following computation:

  M ⊨ (∃x)P iff (by definition of ∃)
  M ⊨ ¬((∀x)¬P) iff (by P5a)
  not (M ⊨ (∀x)¬P) iff (by P6)
  not (M ⊨ ¬P) iff
  not ([[¬P]] = [X → M]) iff
  not ([[P]] = ∅) iff
  there exists a : X → M with a(P) = true.
□

Exercise 8.3.13 Give examples showing how Proposition 8.3.15 fails if either M has empty carriers, or P has free variables other than x. □

This section extends substitution from terms to first-order formulae, and gives the so-called Substitution Theorem, which will be important for several later developments, including that of quantifiers.

Definition 8.3.16 Let Φ = (Σ, Π) be a first-order signature and let θ : X → T_Σ(X) be a substitution. Now define θ̂ : WFF_X(Φ) → WFF_X(Φ) recursively as follows:

0. θ̂(π(t_1, …, t_n)) = π(θ(t_1), …, θ(t_n));
1. θ̂(true) = true;
2. θ̂(¬P) = ¬θ̂(P);
3. θ̂(P ∧ Q) = θ̂(P) ∧ θ̂(Q);
4. θ̂((∀x)P) = (∀x)θ̂_x(P), where θ_x is the substitution that agrees with θ everywhere on X except x, and θ_x(x) = x.

We may write θ(P), or sometimes more elegantly Pθ, for θ̂(P), and call it the result of applying θ to P, or of substituting θ(x) in P for each x ∈ X. When X is small, the notation P[x_1 ← t_1, …, x_n ← t_n] may be more convenient than Pθ. □

The simplicity of this definition, which as usual is recursive over Ω, may come as a pleasant surprise. Notice that θ̂ automatically avoids substituting for bound variables. However, there is a subtle difficulty:

Example 8.3.17 Let Φ be the signature of Example 8.1.2, let X = {x, y, z}, let θ(x) = θ(y) = s(x), and θ(z) = z. Then

  θ(geq(y, x)) = geq(s(x), s(x)).
  θ((∀x)(geq(x, 0) ⇒ pos(x))) = (∀x)(geq(x, 0) ⇒ pos(x)).
  θ((∀x)(geq(x, y) ⇒ geq(x, x))) = (∀x)(geq(x, s(x)) ⇒ geq(x, x)).

Note the capture of the x in s(x) by the quantifier in the last formula, although the variable y that it replaced was free in this formula. This phenomenon is called variable capture.

Now define a substitution τ by τ(x) = τ(y) = τ(z) = z.
Then (θ ; τ)(x) = (θ ; τ)(y) = s(z), and (θ ; τ)(z) = z, so that if we let P denote the third formula above, then

  P(θ ; τ) = (∀x)(geq(x, s(z)) ⇒ geq(x, x)),

whereas

  (Pθ)τ = (∀x)(geq(x, s(x)) ⇒ geq(x, x)).

Thus variable capture thwarts the compositionality of substitution. This motivates Definition 8.3.18 below. □

Exercise 8.3.14 We can extend the notation θ_x to θ_Z for Z ⊆ X, by defining θ_Z(y) = θ(y) for y ∉ Z and θ_Z(y) = y for y ∈ Z. Show the following for any P in WFF_X(Φ) and substitution θ:

1. θ = θ_Z iff θ is the identity on (at least) Z.
2. θ((∀Z)P) = (∀Z)θ_Z(P).
3. θ(P) = P if P is closed.
4. More generally, θ_{Free(P)}(P) = P.
5. Even more generally, θ(P) = τ(P) if θ(y) = τ(y) for y ∈ Free(P). □

Definition 8.3.18 Given a (Φ, X)-formula P and a substitution θ, define θ to be capture free for P as follows:

0. θ is capture free for P if P is atomic;
1. θ is capture free for true;
2. θ is capture free for ¬P if it is for P;
3. θ is capture free for P ∧ Q if it is for P and for Q; and
4. θ is capture free for (∀x)P if θ_x is capture free for P and, if y ≠ x is a free variable of P, then x is not free in θ(y).

Capture freedom extends from the operations in Ω to those in Ω̄; for example, θ is capture free for (∃x)P under exactly the same conditions as those for (∀x)P. □

Proposition 8.3.19 Let θ, τ be two substitutions such that θ is capture free for P. Then

1. (Pθ)τ = P(θ ; τ), and
2. if τ is capture free for Pθ, then θ ; τ is capture free for P.

Proof: We prove 1 by structural induction over Ω. We leave the reader to check the result for P atomic or true, and for negation and conjunction. Suppose P = (∀x)Q. Then (Pθ)τ = (∀x)((Qθ_x)τ_x) and P(θ ; τ) = (∀x)Q((θ ; τ)_x); because θ is capture free for P, so is θ_x for Q; thus by the induction hypothesis, (Qθ_x)τ_x = Q(θ_x ; τ_x).

Now we claim Q(θ_x ; τ_x) = Q((θ ; τ)_x). By 5
of Exercise 8.3.14, it suffices to show (θ_x ; τ_x)(y) = (θ ; τ)_x(y) for all y ∈ Free(Q). If x ∈ Free(Q) then (θ_x ; τ_x)(x) = x = (θ ; τ)_x(x); if y ≠ x is in Free(Q), then (θ ; τ)_x(y) = (θ ; τ)(y) and also (θ_x ; τ_x)(y) = τ_x(θ_x(y)) = τ_x(θ(y)) = τ(θ(y)), the last equality because θ is capture free for P, since x does not occur in θ(y).

We also prove 2 by structural induction over Ω. Because capture freedom commutes with negation and conjunction, as does substitution, it suffices to check the induction step for P = (∀x)Q. But (θ ; τ)_x is capture free for Q because θ_x is capture free for Q and τ_x is capture free for Qθ_x, plus the induction hypothesis. Now let y ≠ x be a free variable in Q; because x ∉ Var(θ(y)) and x ∉ Var(τ(z)) for any free variable z ≠ x of Qθ_x, we have x ∉ Var(τ(θ(y))). Therefore θ ; τ is capture free for P. □

Definition 8.3.20 Given a substitution θ : X → T_Σ(X) and a model M, define [[θ]]_M : [X → M] → [X → M] by [[θ]]_M(a) = θ ; a. As usual, we omit the subscript M when context makes it unnecessary. □

The next result does most of the work involved in proving the main result of this subsection; its rather technical proof has been placed in Appendix B to avoid distraction.

Proposition 8.3.21 If θ is capture free for P, then [[θ(P)]]_M = [[θ]]_M^{-1}([[P]]_M) for any model M. □

The main result now says that any substitution instance of a valid formula is valid:

Theorem 8.3.22 (Substitution) Let Φ = (Σ, Π) be a first-order signature, A a set of Φ-sentences, P a (Φ, X)-formula, and θ : X → T_Σ(X) a substitution that is capture free for P. Then A ⊨ P implies A ⊨ Pθ.

Proof: Fix a model M of A. It suffices to show that [[P]] = [X → M] implies [[Pθ]] = [X → M], which follows directly from Proposition 8.3.21. □

Corollary 8.3.23 Let Φ = (Σ, Π) be a first-order signature, A a set of Φ-sentences, and P in WFF_X(Φ).
Let Y ⊆ X be a finite variable set and let θ : X → T_Σ(X) be a substitution such that θ_Y = θ (i.e., θ can only be non-identity outside Y). Then A ⊨ (∀Y)P implies A ⊨ (∀Y)Pθ.

Proof: Let Q = (∀Y)P and apply Theorem 8.3.22 to get A ⊨ (∀Y)P implies A ⊨ θ((∀Y)P). Now by 2. of Exercise 8.3.14 and because θ_Y = θ, we get A ⊨ (∀Y)P implies A ⊨ (∀Y)Pθ. □

From this, it further follows that A ⊨ (∀Y)P implies A ⊨ (∀Y′)Pθ when Y′ ⊆ Y and θ = θ_(Y−Y′). This says we can substitute values for variables in Z = Y − Y′ and eliminate their quantifiers.

The second result below is needed in Section 8.5.

Lemma 8.3.24 Let P be a Φ-formula, let θ : X → T_Σ(X) be a substitution that is capture free for P, and let a : X → M be an interpretation in a Φ-model M. Then a(Pθ) = (θ ; a)(P).

Proof: More precisely, we need to show that θ̂ ; a = θ ; a, which follows by showing that (θ̂ ; a)(x) = (θ ; a)(x) for all x ∈ X, and that θ̂ ; a satisfies the conditions of Definition 8.3.2. □

Lemma 8.3.25 Let P be a Φ-formula and let substitutions θ, θ′ : X → T_Σ(X) be capture free for P. Given interpretations a, a′ : X → M in a Φ-model M, then a(Pθ) = a′(Pθ′) whenever a(θ(z)) = a′(θ′(z)) for all z ∈ Free(P).

Proof: By Lemma 8.3.24, (θ ; a)(P) = a(Pθ) and (θ′ ; a′)(P) = a′(Pθ′). Because (θ ; a)(x) = (θ′ ; a′)(x) for all x ∈ Free(P), Proposition 8.3.3 gives (θ ; a)(P) = (θ′ ; a′)(P). Thus a(Pθ) = a′(Pθ′).
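As an aside, the recursive clauses of Definition 8.3.18 are easy to animate. The following Python sketch is not part of the book's development: the term and formula encodings are invented for illustration, but the capture-freedom check follows the clauses of the definition directly.

```python
# Sketch of Definition 8.3.18 (capture-free substitution).
# Terms: a variable is a string; an application is ('app', f, [args]).
# Formulas: ('atom', [terms]), 'true', ('not', P), ('and', P, Q),
# ('all', x, P). These constructors are illustrative, not the book's.

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*[term_vars(a) for a in t[2]]) if t[2] else set()

def free_vars(p):
    if p == 'true':
        return set()
    op = p[0]
    if op == 'atom':
        return set().union(*map(term_vars, p[1])) if p[1] else set()
    if op == 'not':
        return free_vars(p[1])
    if op == 'and':
        return free_vars(p[1]) | free_vars(p[2])
    return free_vars(p[2]) - {p[1]}          # ('all', x, P)

def capture_free(theta, p):
    """theta is a dict, read as the identity off its domain."""
    if p == 'true' or p[0] == 'atom':
        return True                           # clauses 0 and 1
    if p[0] == 'not':
        return capture_free(theta, p[1])      # clause 2
    if p[0] == 'and':                         # clause 3
        return capture_free(theta, p[1]) and capture_free(theta, p[2])
    x, q = p[1], p[2]                         # clause 4: (all x) q
    theta_x = {y: t for y, t in theta.items() if y != x}
    return capture_free(theta_x, q) and all(
        x not in term_vars(theta.get(y, y))
        for y in free_vars(q) if y != x)

# Substituting s(x) for y under (all x) would capture x:
p = ('all', 'x', ('atom', ['x', 'y']))
# capture_free({'y': ('app', 's', ['x'])}, p)  -> False
# capture_free({'y': ('app', 's', ['z'])}, p)  -> True
```

The rejected case is exactly the compositionality failure of the example above: the substituted term mentions the bound variable x.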
□

The syntax of first-order logic with equality is exactly the same as that of first-order logic, but its signatures are required to have binary (infix) equality predicates, exactly one for each sort s ∈ S, denoted =_s; more precisely, we assume that {=_s} ⊆ Π_ss for each s ∈ S. Semantically, first-order logic with equality restricts its models to those where the equality predicates are interpreted as actual identities; that is, for each model M and s ∈ S,

  M_{=_s} = { ⟨m, m⟩ | m ∈ M_s }.

Satisfaction is as usual. Let us denote this institution FOLEQ. It is important to notice that all our definitions and results for FOL carry over to FOLEQ. This is because the proofs are the same, the only difference being that there are fewer models.

Similarly, Horn clause logic with equality is Horn clause logic with the same given equality predicates, interpreted the same way as above; let us denote this institution HCLEQ.

The first-order logic of equality is the special case of first-order logic with equality where equalities are the only predicates. Since Φ is completely determined by Σ, we may write ⊨_Σ instead of ⊨_Φ in this context. Again, the definitions, results, and proofs for FOL carry over. Let us denote this institution FOLQ.

Similarly, the Horn clause logic of equality is Horn clause logic with equality where the only predicates are the equalities. In fact, the Horn clause logic of equality is the same as conditional equational logic (see the exercise below); therefore it is another way to view the logic of OBJ. Of course, our algebraic orientation prefers the conditional equational formulation to the Horn clause formulation.

Exercise 8.3.15 Let Φ = (Σ, Π) be a first-order signature with exactly one predicate symbol for each sort, namely the equality.
Now define a translation from conditional Σ-equations e to Φ-Horn clauses h_e, and prove that M ⊨_Φ e iff M ⊨_Φ h_e, for any Φ-model M. □

It now follows that the institution CEQL of conditional equational logic is a subinstitution of FOLEQ. Hence the rules of deduction for CEQL are also valid for FOLEQ, of course restricted to the sentences that correspond to conditional equations.

Going further along the line of Section 8.3.5, we can give fixed "standard" interpretations not only for equality symbols, but also for any desired sorts and non-logical symbols. For example, if Ψ is the signature Φ_Nat of Example 8.1.2 and if Φ is some first-order signature with Ψ ⊆ Φ, then we can fix the interpretation of Ψ to be the standard natural numbers. Define a Φ-model over D to be a Φ-model M such that the restriction (reduct) M|_Ψ of M to Ψ is the fixed model D. We denote this institution FOL/D; then all our definitions and results for FOL carry over to FOL/D, because the same proofs work on the reduced collection of models. If A is a set of Φ-axioms, then a (Φ, A)-model over D is a Φ-model over D satisfying A. Note that for some A there may be no such models, for example, if A implies a Ψ-sentence that is false in D (such as 1 = 0). Similarly, we obtain the institution FOLEQ/D of first-order logic with equality over some fixed Ψ-model D.

If we add a few more arithmetic operations to the signature Ψ = Φ_Nat of the natural numbers, we get a system to which Gödel's incompleteness theorem applies. This famous result says that any first-order theory rich enough to talk about a certain fragment of arithmetic will always have true sentences that cannot be proved; in other words, no finite set of axioms can be complete for arithmetic. The situations that arise in our applications are often of this kind, since we need to reason about some fixed data types, e.g., natural numbers, integers, lists of natural numbers, etc.
In practice, when we stumble over a result that cannot be proved by equational reasoning from the axioms in our theory, we try to prove it using induction. Induction is a second-order axiom, not a first-order axiom, but even so, there is no guarantee that we will find the proof we want by using it.

We can also consider the institution of the first-order logic of equality over a fixed model D, denoted FOLQ/D. The definitions and results for FOL again carry over, because the proofs are the same; and the above discussion about incompleteness also applies. FOLQ/D is fundamental for this book, because our method is to state proof tasks using formulae in this logic, and then reduce them to a combination of equational proof tasks that can be handled with reduction (see Section 8.4 below). Because D is usually defined by initiality with respect to some given equational theory, induction can usually be used to prove additional properties of D that are needed (such properties are traditionally called "lemmas").

For proof planning, we will use the 2-bar semantic entailment turnstile, ⊨, in a new way, reading "A ⊨ P" as indicating the task of proving the goal P from the assumptions A. With this in mind, we can reformulate the assertions of Proposition 8.3.9 as "proof planning rules," rewrite rules that transform complex proof tasks into combinations of simpler proof tasks. Given a first-order signature Φ, a set A of Φ-sentences, and Φ-formulae P, Q, these rules are as follows:

T0. A, P ⊨_Φ P ⇝ true.
T1. A ⊨_Φ P ∧ Q ⇝ A ⊨_Φ P and A ⊨_Φ Q.
T2. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ P or A ⊨_Φ Q.
T3. A ⊨_Φ P ⇒ Q ⇝ A, P ⊨_Φ Q if P is closed.
T4. A ⊨_Φ ¬P ⇝ A, P ⊨_Φ false if P is closed.
T5. A ⊨_Φ (∀X)P ⇝ A ⊨_Φ(X) P.

Thus, T1 says that to prove P ∧ Q from A, we can prove P from A and Q from A.
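To make the flavor of these transformations concrete, here is a small Python sketch; it is not from the book, and the task and formula representations are invented for illustration. It applies rules of exactly this kind, reducing a proof task to a Boolean combination of atomic tasks:

```python
# Sketch of the proof-planning rules as rewrites on proof tasks.
# Formulas: ('and',p,q), ('or',p,q), ('imp',p,q), ('not',p),
# ('all',x,p), or an atom (a string). A task (A, phi, sig) is read
# as "prove phi from assumptions A over working signature sig".
# The representation and helper names are illustrative only.

def is_closed(p):
    # Assumption of this sketch: we treat every atom as closed;
    # a real implementation would track free variables.
    return True

def plan(A, phi, sig):
    """Transform a task into a Boolean combination of atomic tasks."""
    if phi in A:                               # the true rule
        return True
    if isinstance(phi, tuple):
        op = phi[0]
        if op == 'and':                        # conjunction rule
            return ('and', plan(A, phi[1], sig), plan(A, phi[2], sig))
        if op == 'or':                         # disjunction rule (sound only)
            return ('or', plan(A, phi[1], sig), plan(A, phi[2], sig))
        if op == 'imp' and is_closed(phi[1]):  # implication rule
            return plan(A | {phi[1]}, phi[2], sig)
        if op == 'not' and is_closed(phi[1]):  # negation rule
            return plan(A | {phi[1]}, 'false', sig)
        if op == 'all':                        # quantifier rule: grow signature
            return plan(A, phi[2], sig + (phi[1],))
    return ('atom', frozenset(A), phi, sig)    # leave for reduction elsewhere

task = plan(frozenset({'p'}), ('imp', 'q', ('and', 'p', 'q')), ())
# -> ('and', True, True): the implication and conjunction rules fire,
#    then both conjuncts are discharged by the true rule.
```

Note that each clause only replaces a goal by sufficient subgoals, mirroring the left-to-right orientation of the rules above.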
These really are rewrite rules rather than equations, because they have a definite left to right orientation. In particular, T2 can only be applied left to right, not vice versa, because R2 is an "if," not an "iff." Rule T4 is not yet very useful, because so far we have no way to prove false; we must supplement this rule later. Note that the signature subscripts on ⊨ are important for T5, but not for T0 to T4. We will call the signature that appears as a subscript on the turnstile the working signature of the proof task A ⊨_Φ P in this context.

We say a proof planning rule is sound if its rightside is a sufficient condition for its leftside; note that this is opposite to soundness for rules of deduction, where the rightside is a necessary condition for the leftside. For example, if we weaken rule T2 to say that to prove P ∨ Q from A, it suffices to prove P, and it also suffices to prove Q, then the right sides are sufficient for their left sides, but far from necessary:

T2a. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ P.
T2b. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ Q.

We will see that rules like these are adequate for many interesting problems, including the ripple carry adder discussed in Section 8.4.2 below. We will also see that these rules can themselves be expressed and executed in OBJ.

As a first step in making the above intuitions more precise, let us consider the language used for expressing proof tasks. We can see that all the terms in T0 to T5 involve expressions of the form A ⊨_Φ B, where A, B are Φ-formulae. Since our proof-planning applications involve atomic sentences from the institution FOLQ/D, we may write ⊨_Σ instead of ⊨_Φ. The terms in T0 to T5 are metasentences about FOLQ/D: they make assertions about combinations of proof tasks involving sentences in FOLQ/D. Of course, most assertions of this form are false.

Our paradigm takes a proof task A ⊨_Φ P and transforms it into a Boolean combination of proof tasks that can be checked by reduction with OBJ.
Proposition 8.3.9 then implies that if we use T0 to T5 to reduce a task to true, then A ⊨_Φ P is true, i.e., the proof score consisting of those reductions really does prove P from A. If some reductions don't do what we want, then we have failed to prove the result, but in general, this does not mean it isn't true (though there are some cases where failure does imply that the original proof task is false).

The rules in the object META below encode the transformations T0 to T5. (Strictly speaking, the equations here are really rewrite rules, so that the full power of equational logic cannot be used, but only term rewriting.) Ground terms of sort Meta are metasentences that describe structures of possible proofs; they could also be called "proof terms," because they are possible proofs expressed as terms. This module uses order-sorted algebra; for example, the line "subsort BType < Type" means that every BType (for "basic type") is also a Type. Order-sorted algebra is developed in Chapter 10, but the OBJ code below should be understandable without a detailed knowledge of Chapter 10.

obj META is sorts Meta Sen Sig Type .
  pr QID .
  dfn BType is QID .
  subsort BType < Type .
  subsort Bool < Sen Meta .
  op _|=[_]_ : Sen Sig Sen -> Meta [prec 11] .
  op (_][_:_) : Sig Id Type -> Sig .
  op _and_ : Meta Meta -> Meta [assoc comm prec 2] .
  op _and_ : Sen Sen -> Sen [assoc comm prec 2] .
  op _or_ : Meta Meta -> Meta [assoc comm prec 7] .
  op _or_ : Sen Sen -> Sen [assoc comm prec 7] .
  op _=>_ : Meta Meta -> Meta [prec 9] .
  op _=>_ : Sen Sen -> Sen [prec 9] .
  op not_ : Meta -> Meta [prec 1] .
  op not_ : Sen -> Sen [prec 1] .
  op (all_:_ _) : Id Type Sen -> Sen .
  vars A P Q : Sen . var X : Id .
  var T : Type .
  var S : Sig .
  [ass] eq A and P |=[S] P = true .
  [and] eq A |=[S] (P and Q) = (A |=[S] P) and (A |=[S] Q) .
  [or] eq A |=[S] (P or Q) = (A |=[S] P) or (A |=[S] Q) .
  [imp] eq A |=[S] (P => Q) = (A and P) |=[S] Q .
  [not] eq A |=[S] (not P) = (A and P) |=[S] false .
  [all] eq A |=[S] (all X : T P) = A |=[S][X : T] P .
endo

Note that the Boolean operations and, or, =>, and not are triply overloaded, because they are defined for both sentences and metasentences, as well as for OBJ's builtin Booleans. Since we have shown these operations to be associative and commutative, we can include these laws as attributes.

Strictly speaking, the rule all should require that the variable X not occur in A, and most texts on first-order logic do give such a "side condition" for this rule. However, it is more natural in our setting to consider this a condition on signatures, since forming Σ(X) already requires X to be disjoint from Σ. This well-formedness condition is easily expressed using so-called "error supersorts" in order-sorted algebra, but because we have not treated that topic, we omit this from the above specification.

Now let's use this machinery to plan some proofs:

Example 8.4.1 Below is a simple reduction of a compound proof task to a simpler proof task. This computation tells us that if we want to prove a sentence of the form A ⊨_Σ (∀ w1, w2 : Bus) P1 ⇒ P2, then it suffices to prove A, P1 ⊨_Σ(w1,w2:Bus) P2. Here is the OBJ code:

open .
  ops A1 P1 P2 : -> Sen .
  op Sigma : -> Sig .
  red A1 |=[Sigma] (all 'w1 : 'Bus (all 'w2 : 'Bus P1 => P2)) .
  ***> should be: A1 and P1 |=[Sigma]['w1 : 'Bus]['w2 : 'Bus] P2
close

Of course, it works; OBJ does just three rewrites, each an application of a proof rule. This reduction justifies the proof score used in the example of Section 8.4.2. □

Example 8.4.2 We can use META to plan a proof that the intersection of two transitive relations is transitive.
Our proof task has the form

(∀X) P1 ⇒ Q1, (∀X) P2 ⇒ Q2 ⊨_Σ (∀X) P12 ⇒ Q12

where X has variables x, y, z of sort Elt, and where the first clause expresses transitivity of a relation R1, with

P1 = (x R1 y) ∧ (y R1 z)
Q1 = x R1 z,

the second clause expresses transitivity of R2, and the third expresses transitivity of their intersection, which is defined to be R1 ∧ R2. This definition justifies adding the two lemmas

P12 = P1 ∧ P2
Q12 = Q1 ∧ Q2.

Now we can write the proof task in OBJ, and reduce it to get a proof plan:

open .
  op all-X_ : Sen -> Sen .
  op Sigma : -> Sig .
  vars-of .
  eq all-X A = (all 'x : 'Elt (all 'y : 'Elt (all 'z : 'Elt A))) .
  ops P1 P2 P12 Q1 Q2 Q12 : -> Sen .
  eq P12 = P1 and P2 .
  eq Q12 = Q1 and Q2 .
  red ((all-X (P1 => Q1)) and (all-X (P2 => Q2))) |=[Sigma] (all-X (P12 => Q12)) .
close

OBJ3 does 10 rewrites and produces a rather large term, suggesting that setting up this proof is not completely trivial. □

Exercise 8.4.1 Execute the reduction in the example above in OBJ3 and interpret the result. Does it make sense? What does it say? Now use OBJ to actually do the proof that has been planned, and interpret the results. □

Exercise 8.4.2 In a way similar to the above example and exercise:
(a) Use OBJ to plan and carry out a proof that the union of two symmetric relations (on the same set) is symmetric.
(b) Use OBJ to plan and carry out a proof that the intersection of two equivalence relations (on the same set) is an equivalence relation. □

The transformation corresponding to modus ponens (R6 on page 261) is

T6. A ⊨_Φ P ⇝ A ⊨_Φ Q and A ⊨_Φ Q ⇒ P,

which is not a rewrite rule, because its rightside contains a variable not in its leftside; hence it cannot be applied automatically by rewriting. But it is still very important for proofs.

We mentioned earlier that to use the rule T4 we need some way to prove false. Rule T7 below does this, by exhibiting a sentence Q that can be both proved and disproved.
Pure equational logic can never prove disequalities (i.e., negations of equations). But initiality gives us a way forward. If a specification is canonical, then different reduced ground terms necessarily denote distinct elements of its initial model. For example, we know that 0 ≠ 1 and false ≠ true are satisfied in the standard models, so if we can prove 0 = 1 or false = true, then we have the desired contradiction.

The second rule below, T8, justifies introducing a "lemma" Q to help prove P from A; of course, Q itself must also be valid for A. In practice, lemmas are often results about D that require induction, such as the associative and commutative laws for addition. We will see some more substantial lemmas in the proof of the next section.

T7. A ⊨_Φ false ⇝ A ⊨_Φ Q and A ⊨_Φ ¬Q.
T8. A ⊨_Φ P ⇝ A ⊨_Φ Q and A, Q ⊨_Φ P.

The following justifies the two rules above:

Proposition 8.4.3 Let Φ be a first-order signature, let A be a set of Φ-sentences, and let P, Q also be Φ-sentences. Then

R7. A ⊨_Φ false iff A ⊨_Φ Q and A ⊨_Φ ¬Q.
R8. A ⊨_Φ P if A ⊨_Φ Q and A, Q ⊨_Φ P.

Proof: The first assertion holds because each side is equivalent to A having no models; the second is immediate from the definitions. □

Exercise 8.4.3
(a) Show that "if" in R8 above cannot be replaced by "iff."
(b) Use R7 and R8 to derive a similar rule for the case where Q is closed. □

Another useful rule allows us to "strengthen" the axioms (or assumptions) used for a proof. Intuitively, if we can prove something from stronger (i.e., more restrictive) assumptions, then it is also valid under the weaker assumptions:

R9. A ⊨ P if A ⊨ A′ and A′ ⊨ P.

The transformational form of this rule is of course

T9. A ⊨ P ⇝ A ⊨ A′ and A′ ⊨ P.

This is not a rewrite rule because its rightside contains a variable not in its leftside.
This rule may be called the "wmawlog" rule, because it justifies the "we may assume without loss of generality" steps that occur at the beginning of many proofs, replacing the original assumptions with others that are stronger or equivalent. (Some proofs that say "we may assume without loss of generality" are actually case analyses, where a relatively easy special case is eliminated; e.g., in showing n² ≥ n, we may assume n ≥ 1.)

Exercise 8.4.4 Prove soundness of R9. □

The module META2 below expresses T6 to T9 in the same style as the META module. None of these are rewrite rules, because each has a variable on its rightside that is not on its left. Hence they must be applied "by hand," e.g., with OBJ3's apply feature. This makes sense, because creativity is required in choosing suitable Q, and this creativity can never be fully automated.

obj META2 is pr META .
  vars A A' P Q : Sen . var S : Sig .
  [modp] eq A |=[S] P = (A |=[S] Q) and (A |=[S] Q => P) .
  [contd] eq A |=[S] false = (A |=[S] Q) and (A |=[S] not Q) .
  [lemma] eq A |=[S] P = (A |=[S] Q) and (A and Q |=[S] P) .
  [astr] eq A |=[S] P = (A' |=[S] P) and (A |=[S] A') .
endo

Two special cases of T9 deserve attention. The first,

R9a. A, P ⇒ Q ⊨ R if A, P′ ⇒ Q ⊨ R and A ⊨ P′ ⇒ P,

is the special case of R9 where A, P ⇒ Q is substituted for A, where A, P′ ⇒ Q is substituted for A′, where R is substituted for P, and then the result is simplified using the rule

(⋆) A, P ⇒ Q ⊨ P′ ⇒ Q if A ⊨ P′ ⇒ P.

The resulting transformation rule is

T9a. A, P ⇒ Q ⊨ R ⇝ A, P′ ⇒ Q ⊨ R and A ⊨ P′ ⇒ P.

In case P′ is closed, we can use R3 to put T9a in the form

T9a. A, P ⇒ Q ⊨ R ⇝ A, P′ ⇒ Q ⊨ R and A, P′ ⊨ P.

Note that this rule applies in particular to the condition of a conditional equation.

Exercise 8.4.5 Prove soundness of (⋆). Show how R9a justifies strengthening the condition of a conditional equation.
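Because each of these rules consumes a user-chosen sentence, in a conventional programming language they would be functions from a task plus a chosen sentence to subtasks, rather than automatic rewrites. A minimal Python sketch, with an invented task representation (not the book's OBJ encoding):

```python
# Sketch: rules like the lemma and modus ponens transformations need
# a user-chosen sentence Q, so they are applied "by hand" as functions
# from one task to a pair of tasks. A task is (assumptions, goal),
# with assumptions a frozenset; this representation is illustrative.

def lemma_rule(task, q):
    """Lemma rule: A |= P  ~>  A |= Q  and  A,Q |= P, for chosen Q."""
    A, p = task
    return (A, q), (A | {q}, p)

def modus_ponens_rule(task, q):
    """Modus ponens: A |= P  ~>  A |= Q  and  A |= Q => P."""
    A, p = task
    return (A, q), (A, ('imp', q, p))

t1, t2 = lemma_rule((frozenset({'ax'}), 'goal'), 'assoc')
# t1 asks us to prove the lemma, t2 lets us assume it:
# t1 == (frozenset({'ax'}), 'assoc')
# t2 == (frozenset({'ax', 'assoc'}), 'goal')
```

The point mirrored here is that no rewriting engine can pick q: that choice is exactly the creative step that cannot be automated.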
□

For our second special case of T9, recall from Corollary 8.3.23 that if Y′ ⊆ Y ⊆ X are variable sets and θ : X → T_Σ(X) is a substitution such that θ_(X−(Y−Y′)) = θ (i.e., θ is non-identity at most outside Y − Y′), then A ⊨ (∀Y)P implies A ⊨ (∀Y′)θP. From this and T9 we get

T9b. A, (∀Y)P ⊨ Q ⇝ A, (∀Y′)θP ⊨ Q,

with Y, Y′ and θ as above.

The first nine rules below are, like the attributes declared for and and or, simple facts about the extended Boolean connectives; the next three rules help us conclude proofs, and the last rule lets us do modus ponens on the leftside of the turnstile. These rules are often useful in simplifying proof plans; it can be shown that applying them never prevents a proof from being found if one exists.

obj META3 is pr META2 .
  vars A P Q R : Sen . var S : Sig .
  eq A and A = A .
  eq A and true = A .
  eq A and false = false .
  eq A and not A = false .
  eq A or false = A .
  eq A or true = true .
  eq A => true = true .
  eq A => false = not A .
  eq not not A = A .
  eq P |=[S] P = true .
Proof: This follows by induction on the length of rule application sequences,using soundness of the individual rules, as stated in assertions R R (cid:2) In fact, we can set things up so that the Boolean combination evaluatesto true iff each atom is true, by using the rules T a and T b instead of T 2. Then an OBJ proof score based on this proof plan will succeed iffeach OBJ evaluation is true .The rule below says that if we can reduce the two sides of an equa- tion to the same thing, then the equation is true: TRW . A (cid:238) Φ ( ∀ X) t = t (cid:48) (cid:45) → t ↓ Φ (X),A R t (cid:48) , where the notation t ↓ Φ (X),A R t (cid:48) means that the terms t, t (cid:48) can be rewrit-ten to the same term using the set A R of rewrite rules of A . Althoughpossible, it is not worthwhile expressing this rule in our META frame-work, because this would require specifying equations, rewriting, etc.in OBJ. Instead, we just note that all atomic clauses should be passed onelsewhere for evaluation, after proof planning is completed. When theinstitution is FOLQ/D , these will all be equations, and rule TRW can beused; competition techniques are also possible (see Chapter 12). More roof Planning interestingly, there are decision procedures for atoms over certain spe-cial domains, e.g., Presburger arithmetic. Example 8.4.6 We explore some ways that things can go wrong in proofs; fail-ures are unpredictable, irregular, and very common. We first try to planthe easy part of the proof of Exercise 8.2.6, that if a relation R satisfiesthe equations cq X R* Y = true if X R Y .cq X R* Z = true if X R* Y and Y R* Z . then it also satisfies the equation cq X R* Z = true if X R Y and Y R* Z . 
Our proof task has the form A ∧ A (cid:238) Σ ( ∀ X) (P ⇒ P ) , and we can generate a proof score for it with the following: open META3 .op all-X_ : Sen -> Sen .ops A1 A2 P1 P2 : -> Sen .op Phi : -> Sig .var A : Sen .eq all-X A = (all ’x : ’Elt (all ’y : ’Elt (all ’z : ’Elt A))).red (A1 and A2) |=[Phi] (all-X (P1 => P2)) .close which yields the proof plan A1 and A2 and P1 |=[Phi] [’x : ’Elt][’y : ’Elt][’z : ’Elt] P2 . But if we translate this into an OBJ proof score, it fails because the second conditional equation ( A2 ) is not a rewrite rule, since the variable Y occurs in its condition but not in its leftside. We could get around thisby using OBJ’s apply feature for A2 ; but it seems easier to make part ofthe necessary substitution by hand (the entire substitution would haveto be entered by hand to use apply anyway), add the resulting rule,and then use reduction. Substituting y for Y in A2 is justified by T a ,yielding the first equation in the proof score below: This is a decidable fragment of arithmetic, usually taken to be so-called extendedquantifier free Presburger arithmetic for the rationals and integers, with unary minus,addition, subtraction, multiplication by constants, equality, disequality, and the rela-tions <, ≤ , ≥ and > [164]. First-Order Logic and Proof Planning open R* .vars-of .ops x y z : -> Elt .cq X R* Z = true if X R* y and y R* Z .eq x R y = true .eq y R* z = true .red x R* z .close However, this does not work either! This is because OBJ goes into aninfinite loop, applying the first equation to itself over and over, withthe substitution X = x , Z = y . We can circumvent this by preventingthe substitution of y for Z by adding to the condition of the rule. Thisis justified by T b , and yields the following rule: cq X R* Z = true if Z =/= y and X R* y and y R* Z . 
However, this still causes an infinite loop, because and does not knowthat if its first argument is false then the whole conjunction is nec-essarily false; hence we define and use a more clever conjunction, thatuses a partially lazy evaluation (see Section 5.4): open R* .vars-of .ops x y z : -> Elt .var B : Bool .op _cand_ : Bool Bool -> Bool [strat (1 0)] .eq false cand B = false .eq true cand B = B .cq X R* Z = true if Z =/= y cand (X R* y and y R* Z) .eq x R y = true .eq y R* z = true .red x R* z .close But this does not work either, since OBJ finds a different infinite loop!This one can be prevented by also prohibiting the instantiation of X by y , by adding another conjunct: cq X R* Z = true if (Z =/= y and X =/= y)cand (X R* y and y R* Z) . This (finally!) works, and the proof is done. (However, OBJ3 failsto parse the condition if either pair of the parentheses is omitted; thiscould be circumvented by declaring a non-default precedence for cand ,but it is not worth the trouble.)It was not so easy to get OBJ3 to execute this simple proof score:four different things went wrong and had to be worked around! These Of course, there were also some typographical errors during the development ofthis proof; these were caught in the usual way by the OBJ parser, and fixed by the user. roof Planning workarounds were: (a) instantiate an equation that was not a rewriterule to make it one; (b) add conditions to an equation to ensure ter-mination (this was done twice); (c) change the order of evaluation byforcing a rule to fail if one conjunct in its condition fails; and (d) addparentheses to help the parser. All of these are standard “tricks ofthe trade” for an experienced OBJ user, and I hope this example willhelp you to use them in the future. 
In particular, please note how ter-mination was handled: we did not attempt to prove that the rule setwas terminating; instead, we discovered experimentally that it was not terminating, and then we just strengthened the rules to prevent theparticular loop that we found, while preserving correctness. The sameapproach applies to the Church-Rosser property: when we failed to getthe order of evaluation we wanted, we just changed OBJ’s evaluation strategy. Our emphasis is on getting a correct proof, rather than ongetting a canonical specification.Things are worse for the other half of the proof of Exercise 8.2.6:the proof score that is automatically generated from the proof task isvery little help; some entirely new ideas are needed, and initiality mustbe used. We omit the details, but underline the moral: the proof scoreautomatically generated by our META rules is only adequate for simpleproof tasks; for slightly more difficult tasks, small modifications may besufficient, but in general, some real creativity must be supplied by theuser. Nevertheless, close adherence to the transformational approachwill guarantee correctness of the proof score, and hence of the proof,if the proof score executes correctly. (cid:2) Our approach to first-order logic has been a bit eccentric: After an al-gebraic treatment of syntax, we developed a number of properties ofsemantic entailment, and then applied them to proof planning; we havenot considered rules of deduction in the traditional sense at all.Rules of deduction are used to deduce (infer) something new fromsomething old, such that if the old is true then so is the new. Ourpurpose in this chapter has been just the opposite: to reduce something we hope is true to some new thing(s), such that if the new are true, thenso is the old. 
Hence, what we call an "elimination rule" corresponds to what is called an "introduction rule" in the standard literature, but applied backwards.

To illustrate this, let's consider the traditional rule for conjunction ("and") introduction:

   P    Q
  ---------
   P ∧ Q

This says that if we have proved P and Q, then we are entitled to say we can prove their conjunction P ∧ Q. Since it is awkward to work with tautologies, a more useful formulation is

  A ⊢ P    A ⊢ Q
  --------------
    A ⊢ P ∧ Q

where A is some set of axioms, and "⊢" indicates first-order proof; this is very much like what we did for equational logic. But in our present context, where we have the task of proving P ∧ Q (from A), the above tells us it is sufficient to prove P and Q separately: that is, we can apply the above rule backwards to eliminate the conjunction from our goal; this is why we call it "conjunction elimination."

The rule that is usually called conjunction elimination is completely different: it says that if we have proved P ∧ Q, then we are entitled to say we have proved P. This may be written:

  A ⊢ P ∧ Q
  ---------
    A ⊢ P

(Of course, there is a similar elimination rule for Q.)

The most important property that a rule of deduction can have is soundness; an unsound rule cannot guarantee correct proofs. A sequent rule is sound iff the result of replacing ⊢ by ⊨ is a valid implication. For example, soundness of the rule that we call conjunction elimination depends upon the result

  A ⊨ P and A ⊨ Q imply A ⊨ P ∧ Q.

Using the traditional rules in the forward direction gives a bottom up proof, starting with what is known, and gradually building up more. By contrast, our proof planning rules build proofs top down, starting with what we want, and working down towards what we know. This second kind of proof organization corresponds (roughly) to what is called natural deduction in the logic literature.
More generally, a rule that transforms what appears on the right of the turnstile is doing top down or backwards inference, and one that transforms what appears on the left of the turnstile is doing bottom up or forwards inference. (A proof calculus where sentences involve ⊢ is sometimes called a sequent calculus.)

It is important to notice that "real" proofs (e.g., from textbooks, research papers, lectures, blackboards, etc.) are usually neither top down nor bottom up! In fact, reading a proof written in either of these styles can be pretty tedious. A bottom up version of a complex proof would first present a long list of assumptions and low level results that are completely unrelated to each other; it would then build on top of these a layer of loosely related low level results; and so on upwards; the result would be incomprehensible until the very end (and probably even then). A top down proof would be easier to follow, but would prohibit the use of lemmas, which can make proofs much easier to follow. Thus, natural deduction is not really very natural after all. A brief discussion of the naturalness of proofs appears in Section 8.8.

Proof planning rules that are rewrite rules can and generally should be applied automatically, but other kinds of rule require more attention. Therefore only the most routine aspects of proof planning can be completely automated by rewriting; the most interesting rules, such as proof by contradiction, adding lemmas, and induction, require some (often considerable!) ingenuity.

We are now in a position to understand what OBJ "proof scores" really are, and why they work: An OBJ proof score contains the equational reductions that result from applying proof planning rules to an original proof task; such a proof score can be proved valid by appeal to the proof planning rules that produced it.

This section verifies a ripple carry adder of arbitrary width, i.e., proves that it really does add.
The verification makes heavy use of OBJ’s ab-stract data type capabilities. Figure 8.1 shows the structure of thisdevice; it is a cascade connection of n “full adders.” In addition, thedevice has two input buses and one output bus, each n bits wide, plusa final carry bit output. (For those not already familiar with hardware,all of these terms are made precise in the OBJ specifications below.)The ADT of natural numbers with addition is needed to handle thecorrectness condition, and the use of multiplication and exponentiationshould not be surprising given the nature of binary numbers; but itis interesting to notice how convenient the integers with subtractionreally are for this example. n -bit wide busses are represented by listsof Booleans of length n ; this abstract data type is defined using order-sorted algebra, so that inductive proofs have the 1-bit case for theirbase, and the operation of postpending a bit for their induction step.The result to be verified is that ( ∀ w , w ) ( | w | = | w | ) ⇒ ( sout ∗ (w , w ) + | w | ∗ cout (w , w ) = w + w ) , where w and w range over buses, where | w | and w are respectivelythe length, and the number denoted by, a bus w , where sout ∗ (w , w ) represents the content of the output bus, and where cout (w , w ) isthe carry bit. In words, this formula says that given two input buses ofthe same width, the number on the output bus together with the carry(as highest bit) equals the sum of the numbers on the input buses.Despite the somewhat complex structure of its terms, this formulais of exactly the form treated in Example 8.4.1. Hence we can prove itby introducing new constants for w and w , then assuming P , and First-Order Logic and Proof Planning FA1 FA2 FAnfalse (cid:116) >> b > b ∧ s >c > b > b ∧ s . . .c >. . . > b n > b n ∧ s n >c n Figure 8.1: A Ripple Carry Adderproving P by checking equality of the reduced forms of its two sides. 
The proofs of the lemmas are by straightforward case analysis and/or induction, and are omitted here. The proof of the main result is by induction on the width of the input buses, starting from width 1.

The first OBJ module below uses order-sorted algebra to specify the integers; thus "subsort Nat < Int" says every natural number is also an integer (see Chapter 10 for details on order-sorted algebra). A number of inductive lemmas are included in this module, such as the distributive law.

  obj INT is sorts Int Nat .
    subsort Nat < Int .
    ops 0 1 2 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    ops (s_)(p_) : Int -> Int [prec 1] .
    op (_+_) : Nat Nat -> Nat [assoc comm prec 3] .
    op (_+_) : Int Int -> Int [assoc comm prec 3] .
    op (_*_) : Nat Nat -> Nat [assoc comm prec 2] .
    op (_*_) : Int Int -> Int [assoc comm prec 2] .
    op (_-_) : Int Int -> Int [prec 4] .
    op -_ : Int -> Int [prec 1] .
    vars I J K : Int .
    eq 1 = s 0 .  eq 2 = s 1 .
    eq s p I = I .
    eq p s I = I .
    eq I + 0 = I .
    eq I + s J = s(I + J) .
    eq I + p J = p(I + J) .
    eq I * 0 = 0 .
    eq I * s J = I * J + I .
    eq I * p J = I * J - I .
    eq I * (J + K) = I * J + I * K .
    eq - 0 = 0 .
    eq - - I = I .
    eq - s I = p - I .
    eq - p I = s - I .
    eq I - J = I + - J .
    eq I + - I = 0 .
    eq -(I + J) = - I - J .
    eq I * - J = -(I * J) .
    op 2**_ : Nat -> Nat [prec 1] .
    var N : Nat .
    eq 2** 0 = 1 .
    eq 2** s N = 2** N * 2 .
  endo

  obj BUS is sort Bus .
    extending PROPC + INT .
    subsort Prop < Bus .
    op __ : Prop Bus -> Bus .
    op |_| : Bus -> Nat .
    var B : Prop . var W : Bus .
    eq | B | = 1 .
    eq | B W | = s | W | .

In reducing the above expression to true, OBJ3 did 158 rewrites, many of which were associative-commutative (and many more rewrites were tried but failed), so one would certainly prefer to have this calculation done mechanically, rather than do it oneself by hand! This proof may be about two orders of magnitude easier using induction in OBJ than it would be in a fully manual proof system. In addition, OBJ produced a validated proof score.
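The equations s p I = I and p s I = I in the INT module give every ground term of sort Int a canonical form, either sⁿ 0 or pⁿ 0. As an illustrative sketch outside the OBJ development (the helper names are ours), the following Python code normalizes such constructor words and checks that cancellation preserves the integer denoted.

```python
def normalize(word):
    """Normalize a string of 's'/'p' constructors applied to 0, using the
    rewrite rules s p I = I and p s I = I; adjacent opposite constructors
    cancel, so the canonical form is all-'s' or all-'p'."""
    n = word.count('s') - word.count('p')
    return 's' * n if n >= 0 else 'p' * (-n)

def value(word):
    """Integer denoted by a constructor word applied to 0."""
    return word.count('s') - word.count('p')

# Normalization preserves the denoted integer, and yields canonical forms.
for w in ['', 'sp', 'ps', 'ssp', 'psps', 'ppps']:
    assert value(normalize(w)) == value(w)
assert normalize('ssp') == 's'    # s s p 0  ->  s 0
assert normalize('ppps') == 'pp'  # p p p s 0  ->  p p 0
```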
Exercise 8.4.6 Prove the three lemmas in the object LEMMAS above. □

To summarize, we have used OBJ and reduction in two different ways, at two different levels: first at the meta level, to reduce the original proof task to a form that OBJ can directly handle; and then at the "object" level, to do the actual "dirty work" of the proof. Hence, this proof was completely automatic. (Needless to say, this is not always possible.)

After seeing the kind of tricks with signatures that are used to eliminate quantifiers (rules R10a and R10b in the next subsection), a reader may worry that the truth of a goal depends on the signature. The following shows that is not the case.

Proposition 8.4.7 Let A be a set of Φ-formulae, let P be a Φ-formula, and let Φ′ ⊆ Φ be such that a sort of Φ′ is void iff it is also void in Φ. If A, P are also Φ′-formulae, then A ⊨Φ P iff A ⊨Φ′ P.

Proof: First note that if M′ = M|Φ′ for M a Φ-model, then M ⊨Φ A iff M′ ⊨Φ′ A. Also note that by the non-void assumption, any Φ′-model M′ extends to a Φ-model M* such that M′ = M*|Φ′.

Now suppose A ⊨Φ P and M′ ⊨Φ′ A; we will prove that M′ ⊨Φ′ P. Choose M* such that M′ = M*|Φ′. Then M* ⊨Φ A. Therefore M* ⊨Φ P, and hence M′ ⊨Φ′ P. For the converse, suppose A ⊨Φ′ P and M ⊨Φ A. Let M′ = M|Φ′. Then M′ ⊨Φ′ A. Therefore M′ ⊨Φ′ P and hence M ⊨Φ P. □

As long as all formulae parse, and you don't populate an old void sort or depopulate an old non-void sort, the working signature can be as large or as small as you please.
This implies that a mechanical theorem prover can effectively ignore the working signature, as our OBJ proof scores in fact do; the above result also helps to justify our frequent practice of dropping the signature subscript from ⊨.

Sentences that involve existential quantifiers can occur either on the assumption or the goal side of the symbol ⊨, and must be handled differently in each case. We begin with the assumption case. Because it can be difficult to use assumptions that contain existential quantifiers, it is useful to transform them into a more constructive form. For example, the proof task

  A, (∃ a, b : Pos)(c = a/b) ⊨Φ Q

can be transformed to

  A, c = a/b ⊨Φ(a,b : Pos) Q.

The intuition here is that since we know a, b exist, in trying to prove Q we may as well assume that a, b have been given to us in the signature; in this case a, b are called Skolem constants.

More generally, an existential quantifier may lie within the scope of one or more universal quantifiers, as in

  A, (∀ x : Nat)(∃ y : Nat) f(x, y) = 0 ⊨Φ Q.

In such a case, the choice of y must depend on the value of x, so that what is added to the signature must be a function of x. Hence the result of transforming the above should be

  A, (∀ x : Nat) f(x, y(x)) = 0 ⊨Φ(y : Nat → Nat) Q′,

where Q′ denotes the result of substituting y(x) for y in Q. Transformations of this kind are justified by Proposition 8.5.1 below.

Proposition 8.5.1 Given a set A of Φ-formulae plus Φ-formulae P, Q where Free(P) = X ∪ {y} with X = {x1 : s1, ..., xn : sn} and with y of sort s, then

R10a. A, P′ ⊨Φ(Y) Q implies A, (∃ y : s)P ⊨Φ Q,

where P′ denotes the result of substituting y(x1, ..., xn) for y in P and where Y is the declaration y : s1 ... sn → s. Moreover, under the same assumptions,

R10b.
A, ( ∀ X)P (cid:48) (cid:238) Φ (Y ) Q implies A, ( ∀ X)( ∃ y : s)P (cid:238) Φ Q . Proof: We first prove the implication R a . Let θ be the substitution thattakes y to y(x , . . . , x n ) and is the identity on other variables. Then P (cid:48) = P θ . Let M be a Φ -model satisfying ( ∃ y : s)P . Then for every a : X → M there is some a (cid:48) : X ∪ { y } → M such that a (cid:48) | X = a and a (cid:48) (P ) = true . Let M (cid:48) be a Φ (Y ) -model that extends M with a new func-tion M (cid:48) y : M s ...s n → M s defined by M (cid:48) y (m , . . . , m n ) = a (cid:48) (y) where a (cid:48) : X ∪ { y } → M is an interpretation obtained as above from a : X → M defined by a(x i ) = m i for i = , . . . , n . Note that M (cid:48) | Φ = M because M (cid:48) only adds the new operation M (cid:48) y to M . Also note that M (cid:48) (cid:238) Φ (Y ) P (cid:48) :indeed, each interpretation a : X → M (cid:48) is actually an interpretation a : X → M that takes each x i in X to an m i in M s i such that there isan a (cid:48) : X ∪ { y } → M such that a (cid:48) | X = a , M (cid:48) y (m , . . . , m n ) = a (cid:48) (y) and a (cid:48) (P ) = true . Because a(y(x , . . . , x n )) = M (cid:48) y (a(x ), . . . , a(x n )) = a (cid:48) (y) , then a(θ(z)) = a (cid:48) (id(z)) for all z ∈ Free (P ) , where id is the identity substitution; now Lemma 8.3.25 implies a(P (cid:48) ) = a (cid:48) (P ) = true . Thus M (cid:48) is a Φ (Y ) -model of P (cid:48) . Therefore if M is a Φ -model of both A and ( ∃ y : s)P , then M (cid:48) is a Φ (Y ) -model of both A and P (cid:48) . Hence M (cid:48) is a Φ (Y ) -model of Q , and thus M is a Φ -model of Q . R b now follows from R a by n applications of R b . (cid:2) The new function symbol y is called a Skolem constant or a Skolemfunction , depending on whether the quantifier ( ∀ X) is present. Thecorresponding transformation rules, called Skolemization rules, are T a. 
A, (∃ y : s)P ⊨Φ Q  ↦  A, P′ ⊨Φ(Y) Q′.

T10b. A, (∀ X)(∃ y : s)P ⊨Φ Q  ↦  A, (∀ X)P′ ⊨Φ(Y) Q′.

Note that these rules only apply to formulae on the left of the turnstile; a different approach is needed for establishing goals that contain existential quantifiers. Of course, not all existential quantifiers are so polite as to occur only within the scope of universal quantifiers. But since a first-order formula has only a finite number of quantifiers, by patiently applying T10 wherever it can be applied (outermost first), all of its existential quantifiers will eventually be eliminated. (Second-order existential quantifiers can be Skolemized by adding further arguments to the quantified function, as discussed in Chapter 9.)

Below is OBJ3 (meta-)code for Skolem constants; Skolem functions can be done in a similar way, but this would require modifying some previous meta code.

  obj META4 is pr META3 .
    vars A P Q : Sen . var X : Id . var T : BType .
    var S : Sig .
    op (exist_:_ _) : Id BType Sen -> Sen .
    eq A and (exist X : T P) |=[S] Q = A and P |=[S][X : T] Q .
  endo

To prove a goal that involves an existential quantifier, it is necessary to show that a suitable value actually exists in all models that satisfy the assumptions. In general, the suitable value will depend upon a choice of other values, because the existential quantifier occurs within the scope of some universal quantifiers in the goal. For example, the sentence (∀ x)(∃ y) x + y = 0 holds for the integers because we can take y = −x. This suggests that if we can find a term expressing the dependency of the existential variable on the universal variables, then we can prove our goal. This proof method is supported by the following:

Proposition 8.5.2 Given a Φ-sentence (∀ X)(∃ y : s)P with Free(P) = X ∪ {y}, then

R11. A ⊨Φ (∀ X)P[y ← t] implies A ⊨Φ (∀ X)(∃ y)P,

where t is some term over X of sort s.
Proof: We first assume A (cid:238) Φ ( ∀ X)P [y ← t] , which by the Theorem of Con-stants ( R M (cid:238) Φ (X) A implies M (cid:238) Φ (X) P [y ← t] for all Φ (X) -models M . Then we want to show A (cid:238) Φ ( ∀ X)( ∃ y)P , i.e., that M (cid:238) Φ (X) A implies M (cid:238) Φ (X) ( ∃ y)P , for all Φ (X) -models M . So weassume M (cid:238) Φ (X) A , and from this conclude by the assumption that M (cid:238) Φ (X) P [y ← t] , and hence E38 by Proposition 8.3.15, that M (cid:238) Φ (X) ( ∃ y)P . (cid:2) First-Order Logic and Proof Planning The corresponding transformation rule is: T . A (cid:238) Φ ( ∀ X)( ∃ y)P (cid:45) → A (cid:238) Φ ( ∀ X)P [y ← t] . This rule cannot be expressed in our current meta level formalism, be-cause terms are not specified in it. However, it is easy to express theessence of the rule, by expressing substitutions for variables as newequations, where terms will be handled the usual way in concrete ex-amples. obj META5 is pr META4 .vars A P : Sen . var y : Id . var T : BType .var S : Sig .op all-X_ : Sen -> Sen .op Eqt : Id -> Sen .eq A |=[S] (all-X (exist y : T P)) =A and Eqt(y) |=[S][y : T] (all-X P).*** where Eqt(y) is the equation y = t*** and all-X is one or more universal quantifierendo Notice that in this formulation of the rule, the status of y is changedfrom being a variable to being a constant; this is needed so that the newequation will do the substitution.Proposition 8.5.2 can be applied iteratively to eliminate nested exis-tential quantifiers. For example, a sentence of the form ( ∀ X)( ∃ y)( ∀ Z)( ∃ w) P can be transformed first to ( ∀ X ∪ Z)( ∃ w) P [y ← t] and then to ( ∀ X ∪ Z) P [y ← t][w ← t (cid:48) ] , provided the restrictions on free variables are satisfied — but of coursethese are very natural. Below is a simple example. Example 8.5.3 Suppose we want to prove NAT (cid:238) ( ∀ x, y : Int )( ∃ z, w : Int ) P and P , where P , P are linear equations. 
The goal of this proof task says that these two equations can always be solved for w, z given values for x, y. Below we use META5 to plan a proof for this goal; OBJ3's call-that feature is used to delay applying the equation that defines the universal quantifiers; without this trick, these quantifiers are turned into constants, and then the quantifier elimination rule in META5 cannot be applied.

  open META5 .
    ops INT P1 P2 : -> Sen .
    op Phi : -> Sig .
    op Int : -> BType .
    var A : Sen .
    red INT |=[Phi] (all-X (exist 'z : Int (exist 'w : Int (P1 and P2)))) .
    call-that t .
    eq all-X A = (all 'x : Int (all 'y : Int A)) .
    red t .
  close

The proof plan that results from this is

  (INT and Eqt('z) and Eqt('w)
    |=[Phi]['z : Int]['w : Int]['x : Int]['y : Int] P2) and
  (INT and Eqt('z) and Eqt('w)
    |=[Phi]['z : Int]['w : Int]['x : Int]['y : Int] P1)

For a particular instance, suppose that P1 and P2 are respectively the equations

  x − w + 2y = 3  and  x + w = 2z − 5,

for which the solutions are

  w = x + 2y − 3  and  z = x + y + 1.

Then the original goal is proved with the above proof plan by the following, provided the two reductions give true (which they do):

  open INT .
    ops x y z w : -> Int .
    eq w = x + (2 * y) - 3 .
    eq z = x + y + 1 .
    red x - w + (2 * y) == 3 .
    red x + w == (2 * z) - 5 .
  close  □

Sometimes it is easier to prove a result by breaking the proof (or a part of it) into "cases." For example, in trying to prove a sentence of the form (where n is a natural number variable)

  (∀ n) (n > 0 ⇒ Q(n)),

it might be easier to prove the following two cases separately,

  (∀ n) (n = 1 ⇒ Q(n)),
  (∀ n) (n > 1 ⇒ Q(n)),

than to prove the assertion in its original form. In general, there are many different ways to break a condition like n > 0 into cases; another is

  (∀ n) (n even ∧ n > 0 ⇒ Q(n)),
  (∀ n) (n odd ⇒ Q(n)),

and still another is

  (∀ n) (n = 1 ⇒ Q(n)),
  (∀ n) (n prime ⇒ Q(n)),
  (∀ n) (n composite ⇒ Q(n)).

Such examples suggest that "cases" are "predicates" (i.e., open formulae) P1, ...
, PN such that P1 ∨ ··· ∨ PN and P are equivalent, where the sentence to be proved has the form P ⇒ Q; and they further suggest that a proof by "case analysis" consists of proving Pi ⇒ Q for i = 1, ..., N. We make this more precise as follows:

Proposition 8.6.1 To prove A ⊨ P ⇒ Q, it suffices to give predicates P1, ..., PN such that A ⊨ P ⇒ (P1 ∨ ··· ∨ PN), and then to prove A ⊨ Pi ⇒ Q for i = 1, ..., N.

Proof: Soundness of this proof method follows from the calculation:

  A ⊨ (P1 ⇒ Q) ∧ ··· ∧ (PN ⇒ Q)
  iff A ⊨ (P1 ∨ ··· ∨ PN) ⇒ Q
  implies A ⊨ (P ⇒ Q),

using E14 for the first step and R a for the second. □

This justifies the deduction rule

R12. A ⊨ P ⇒ Q if A ⊨ Pi ⇒ Q for i = 1, ..., N and A ⊨ P ⇒ P1 ∨ ··· ∨ PN,

which in turn justifies the transformation rule

T12. A ⊨ P ⇒ Q  ↦  A ⊨ Pi ⇒ Q for i = 1, ..., N and A ⊨ P ⇒ P1 ∨ ··· ∨ PN.

Example 8.6.2 Often a case analysis succeeds because it makes use of new information. For example, we cannot prove A |≡ (∀ B) not not B = B, where A is a ground specification of the Booleans, by proving

  A ⊨ (∀ B : Bool) not not B = B,

because it is not true of all models of A. However, if we work in the institution FOLQ/D where D includes the Booleans, then we can prove

  A |≡ (∀ B : Bool) B = true ∨ B = false,

using initiality, so that it suffices to prove the two cases,

  A ⊨ not not true = true
  A ⊨ not not false = false.

In fact, exactly this kind of Boolean case analysis justifies the method of truth tables (as in Example 7.3.16). □

Some typical case analyses are given below, for x an integer, y a non-zero integer, and z a positive integer, respectively:

  (x = 0) ∨ (x > 0) ∨ (x < 0);
  (y > 0) ∨ (y < 0);
  (z = 1) ∨ (z > 1).

It is also typical that proving the validity of these disjunctions requires induction.
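The exhaustiveness obligation A ⊨ P ⇒ P1 ∨ ··· ∨ PN for the splits above can be sanity-checked on an initial segment of the numbers with a short Python script. This is only a finite test (as just noted, the actual proofs of these disjunctions require induction), and is_prime is our own helper, not part of the formal development.

```python
def is_prime(n):
    """Trial-division primality test for small n."""
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# The three ways of splitting the condition n > 0 are exhaustive:
for n in range(1, 500):
    assert (n == 1) or (n > 1)                           # first split
    assert (n % 2 == 0 and n > 0) or (n % 2 == 1)        # even/odd split
    assert (n == 1) or is_prime(n) \
        or (n > 1 and not is_prime(n))                   # 1 / prime / composite

# Typical case analyses for integers: trichotomy.
for x in range(-50, 51):
    assert (x == 0) or (x > 0) or (x < 0)
```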
Exercise 8.6.1 Prove validity of the following case analysis: for all natural numbers n, either n = 0 or (∃ j) n = s(j). □

Example 8.6.3 The proof that √2 is irrational begins by assuming √2 = a/b with a, b positive relatively prime integers. This step can be justified using case analysis. The initial assumption is √2 = a/b with a, b positive. Let gcd(a, b) = g (where gcd denotes the greatest common divisor). Then either g = 1 or g > 1. In the first case, a, b are already relatively prime, while in the second case we have a = a′g and b = b′g with a′, b′ positive and relatively prime. Then √2 = a′/b′. So in either case the initial assumption is justified, and we can proceed with the proof. The rest of the top level proof planning can be done by OBJ:

  open META4 + NAT .
    op NAT : -> Sen .
    op NATSIG : -> Sig .
    ops 'a 'b : -> NzNat .
    op eq : Nat Nat -> Sen .
    let P = eq('a * 'a * 2, 'b * 'b) .
    red NAT |=[NATSIG] not (exist 'a : 'NzNat (exist 'b : 'NzNat P)) .
  close

The result of the reduction is as follows:

  result Meta: NAT and eq('a * 'a * 2,'b * 'b)
    |=[(NATSIG]['a : 'NzNat)]['b : 'NzNat] false

which says we should assume the negation of the goal, Skolemize 'a and 'b, and then try to derive a contradiction; of course, this leaves out the most difficult parts of the proof, which cannot be automated so easily. □

Exercise 8.6.2 Give a complete proof plan and OBJ3 proof score for showing that √2 is irrational. □

This section shows how to generalize, justify and use the familiar form of induction that checks base and step cases; this includes so-called "structural induction" but not well-founded induction and similar potentially transfinite methods. Example 8.3.14 showed that first-order specifications in general do not have initial models; therefore induction is not in general valid for proving sentences about structures defined by first-order theories.
However, induction is valid for proving sen-tences about structures that are defined (or definable) by initial algebrasemantics. Our applications usually have some underlying data valuesthat have been defined in this way, such as the integers or the nat-urals, and simple inductive results about them are usually needed inall but the simplest proofs. Using the language of Section 8.3.6, thismeans we are working in the institution FOLQ /D , where D is an ini-tial ( Ψ , E) -algebra. For this institution, satisfaction differs from that ofthe ordinary first-order logic institution FOL , in that for P a first-order Ψ -formula, ∅ (cid:238) FOLQ /D Ψ P iff D (cid:238) FOLQ Ψ P . We can now state the basicjustification for inductive reasoning as follows: R . ∅ (cid:238) FOLQ /D Ψ P iff D (cid:238) FOLQ Ψ P iff E |(cid:155) Ψ P , where we extend the notation |(cid:155) of Section 6.4 to first-order sentences,so that E |(cid:155) Ψ P means P is satisfied by an initial model of E . More gener- ally, we have A (cid:238) FOLQ /D Ψ P iff E |(cid:155) Ψ P , provided P is a Ψ -sentence and A is consistent with D . We should not neglect to mention a very simple,but still useful further rule, where P is an arbitrary first-order sentence: R . A |(cid:155) Σ P if A (cid:238) Σ P . This rule is sound for FOLQ /D because if P holds for every model of A ,it certainly holds for an initial model of A (a special case was alreadymentioned on page 181 in Chapter 6).In line with R 13, we have the following result, for which as so oftenhappens, there is a nice semantic proof: lgebraic Induction Proposition 8.7.1 If M, M (cid:48) are isomorphic Ψ -models and if P is a first-order ( Ψ , X) -formula, then M (cid:238) Ψ P iff M (cid:48) (cid:238) Ψ P . Proof: Let ψ : M → M (cid:48) be a Ψ -isomorphism with inverse ρ : M (cid:48) → M , andassume M (cid:238) Ψ P . Let θ : X → M (cid:48) be an assignment. E39 Then θ ; ρ : X → M is an assignment, so that P (θ ; ρ) = true . 
Therefore P(θ ; ρ) ; ψ = (true)ψ = true. The proof of the converse is similar. □

Formulae that can be proved by induction include sentences of the form (∀ x : v)P where x is free in P; in this case, we often write (∀ x)P(x), and call x the induction variable. Then M ⊨ (∀ x)P means M ⊨ θm(P) for all m ∈ Mv where θm is the substitution with θm(x) = m; we may write P(m) for θm(P). If the inductive goal has the form (∀ x1)(∀ x2) ... (∀ xn)P, then (by E20) we can reorder the quantifiers to put any one of x1, ..., xn first, say xi, and use it as the induction variable for proving (∀ X − {xi})P where X = {x1, x2, ..., xn}.

Usually more than one induction scheme can be used for a given initial specification; even the natural numbers have many different induction schemes. The most familiar scheme proves (∀ x)P(x) by proving P(0) and then proving that P(n) implies P(sn); let's call this Peano induction. But we could also prove (∀ x)P(x) by proving P(0) and P(s0), and then proving that P(n) implies P(ssn); let's call this even-odd induction. These two schemes correspond to two different choices of generators: the first uses 0, s_, while the second uses 0, s0, ss_. These are the first two of an infinite family of induction schemes: for each n > 0, the n-jump Peano induction scheme has the n constants 0, ..., s^(n−1)0 and the one operation s^n_.

We insist that before an induction scheme is used, it should be proved sound; this requires formalizing the notion of induction scheme for (initial models of) an equational specification (Ψ, E); the formalization will involve a signature Γ of generators defined over (Ψ, E) by some new equations that may involve some new auxiliary function symbols.
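The different generator choices can be illustrated concretely: a scheme for the naturals can only be adequate if its generators reach every number. The following Python sketch (reachable is our own illustrative helper, not part of the formal development) checks this coverage property on an initial segment for the Peano, even-odd, and 5-jump generators.

```python
def reachable(constants, steps, limit):
    """Set of naturals <= limit generated from the given constants by
    repeatedly applying the given unary step functions."""
    seen = set(c for c in constants if c <= limit)
    frontier = list(seen)
    while frontier:
        m = frontier.pop()
        for f in steps:
            v = f(m)
            if v <= limit and v not in seen:
                seen.add(v)
                frontier.append(v)
    return seen

N = 100
full = set(range(N + 1))
# Peano: constant 0, operation s (successor).
assert reachable([0], [lambda n: n + 1], N) == full
# Even-odd: constants 0 and s0, operation ss.
assert reachable([0, 1], [lambda n: n + 2], N) == full
# n-jump with n = 5: constants 0, ..., 4, operation s^5.
assert reachable(list(range(5)), [lambda n: n + 5], N) == full
```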
Definition 8.7.2 An inductive goal for a signature (V , Ψ ) is a V -sorted family P of first-order Ψ -formulae P v , each with exactly one free variable x v ofsort v ∈ V ; we may write P v (x v ) and call x v the induction variable ofsort v . Usually P v (x v ) = true for all v except one, say u ∈ V , in which case we identify P with P u and write x for x u .Given a specification T = ( Ψ , E) , an algebraic induction scheme for T is a V -sorted extension theory T (cid:48) = ( Ψ (cid:48) , E (cid:48) ) of T and a subsignature Γ of Ψ (cid:48) , written ( Ψ (cid:48) , E (cid:48) , Γ ) . Then inductive reasoning for an inductivegoal P over T using the scheme ( Ψ (cid:48) , E (cid:48) , Γ ) says: first show P v (c) for eachconstant c ∈ Γ [],v ; and then show P v (g(t , . . . , t k )) for each g ∈ Γ w,v assuming P v i (t i ) for each i = , . . . , k (where w = v . . . v k and each t i is a ground Ψ -term of sort v i ) using the equations in E (cid:48) . (cid:2) We want to use inductive reasoning to prove E |(cid:155) Ψ ( ∀ x v )P v (x v ) foreach sort v ∈ V , which we may write for short as E |(cid:155) ( ∀ x)P (x) . The First-Order Logic and Proof Planning result below follows from Theorem 6.4.4, that initial algebras have noproper subalgebras: Theorem 8.7.3 Given a specification T = ( Ψ , E) and an algebraic inductionscheme ( Ψ (cid:48) , E (cid:48) , Γ ) over T , then inductive reasoning with ( Ψ (cid:48) , E (cid:48) , Γ ) over T is sound , in the sense that if the steps of inductive reasoning for P using the scheme are carried out, then T |(cid:155) ( ∀ x)P (x) , provided Γ is inductive for ( Ψ (cid:48) , E (cid:48) ) over ( Ψ , E) , in the sense that:(I1) two ground Ψ -terms are equal under E iff they are equal under E (cid:48) ;(I2) every ground Ψ (cid:48) -term equals some Ψ -term under E (cid:48) ; and(I3) every ground Ψ -term equals some Γ -term under E (cid:48) . In this case we may also say that ( Ψ (cid:48) , E (cid:48) , Γ ) is inductive over ( Ψ , E) . 
Proof: We need to show D (cid:238) ( ∀ x v )P v (x v ) for each v ∈ V , where D is aninitial model for T . Because (I1) and (I2) imply that initial models of T and T (cid:48) are Ψ -isomorphic, by Proposition 8.7.1 it suffices to show D (cid:48) (cid:238) ( ∀ x v )P v (x v ) for each v ∈ V , where D (cid:48) is an initial model for T (cid:48) .To this end, define a V -sorted subset M of the initial algebra D (cid:48) = T Ψ (cid:48) ,E (cid:48) by M v = { [t] | D (cid:48) (cid:238) P v (t), t ∈ T Ψ ,v } , for each v ∈ V . Then (I1) and (I2) imply D (cid:48) (cid:238) ( ∀ x) P iff M = D (cid:48) . Toprove M = D (cid:48) , it suffices to show that M is a Ψ (cid:48) -algebra, because D (cid:48) hasno proper Ψ (cid:48) -subalgebras (Theorem 6.4.4). By (I3), it suffices to showthat M is a Γ -algebra. But successfully carrying out the steps of theinductive reasoning shows exactly this. (cid:2) Conditions (I1) and (I2) above say that T (cid:48) is a protecting initial exten-sion of T , i.e., that after enriching T , there are no new ground terms;and (I3) says that all ground terms are (equal to) Γ -terms. Theorem8.7.3 not only justifies the most familiar induction schemes, but alsomany others, as shown in the examples and exercises below. It is worth noticing that the above theorem applies to any reachable model of T ,not just to an initial model of T (recall that D is reachable iff the unique Ψ -homomorphism I → D is surjective, where I is an initial model of T ,i.e., iff D satisfies the “no junk” condition). Example 8.7.4 The following take ( Ψ , E) to be the usual Peano specificationfor the natural numbers, with Ψ having just the sort Nat and functionsymbols 0 , s , and with E = ∅ .1. Of course, Peano induction takes Γ = Ψ (cid:48) = Ψ with E (cid:48) = ∅ . In thiscase, inductivity is trivial. lgebraic Induction 2. Letting Γ contain 0 , s , s , with Ψ (cid:48) = Ψ ∪ Γ and E (cid:48) = { s (n) = s(s(n)) } gives the even-odd induction scheme. 
Then (I1)–(I3) are easy to prove. □

Exercise 8.7.1 Show that the n-jump Peano induction scheme is sound for each n > 0. □

Example 8.7.5 A more sophisticated algebraic induction scheme lets Γ contain 0, s0, and _×p for each prime p, with (Ψ′, E′) defining the usual binary multiplication and (to help define that) the usual binary addition. The resulting scheme, which we call prime induction, says that to prove P(n) for all natural numbers n, prove P(0), P(s0), and that P(n) implies P(n × p) for each prime p.

This scheme is inductive, because in this case, (I1) and (I2) just say that after enriching T with addition, multiplication, and primes, the natural numbers are still its ground terms, while (I3) says that every positive number is a product of primes, which is the so-called Fundamental Theorem of Arithmetic, which was first proved by Gauss. □

Example 8.7.6 Prime induction can be used to prove some pretty facts about the so-called Euler function ϕ, where ϕ(n) is the number of positive integers less than n that are relatively prime to n. One of these is the following, which is rather well known as the Euler formula,

  ϕ(n) = n · ∏_{p | n} (1 − 1/p),

where p varies over primes. We can define the Euler function inductively over the prime induction scheme as follows:

  ϕ(1) = 1
  ϕ(np) = ϕ(n)(p − 1)  if p prime and not p | n
  ϕ(np) = ϕ(n) p       if p prime and p | n.

Or alternatively, we can consider the above as three properties of ϕ that can be proved from its definition as the number of relatively prime numbers less than its argument.

The following is an OBJ proof score for the Euler formula (the specifications NAT and ListOfNat have been omitted):

  obj PRIME is pr NAT .
    op _|_ : NzNat NzNat -> Bool .
    op prime : NzNat -> Bool [memo] .
    op prime : NzNat NzNat -> Bool [memo] .
    vars N M : NzNat .
    eq N | M = gcd(N,M) == N .
    eq prime(s 0) = false .
    cq prime(N) = prime(N,p N) if N > s 0 .
First-Order Logic and Proof Planning eq prime(N, s 0) = true .cq prime(N,M) = false if M > s 0 and M | N .cq prime(N,M) = prime(N,p M) if M > s 0 and not M | N .endoobj PRIME-DIVISORS is pr PRIME + ListOfNat .op pr-div : NzNat -> List .op pr-div : NzNat Nat -> List .vars N P M : NzNat . var L : List .eq pr-div(s 0) = nil .cq pr-div(P) = P if prime(P) .cq pr-div(N * P) = pr-div(N) if P | N .cq pr-div(N * P) = P pr-div(N) if prime(P) and not P | N .cq pr-div(M) = pr-div(M,M) if not prime(M) .eq pr-div(N,s 0) = nil .cq pr-div(N,P) = P pr-div(N,p P) if P > s 0 and prime(P)and P | N .cq pr-div(N,P) = pr-div(N,p P) if P > s 0 andnot (prime(P) and P | N).ops Pi Pip : List -> NzNat .eq Pi(nil) = s 0 .eq Pi(N L) = N * Pi(L) .eq Pip(nil) = s 0 .eq Pip(N L) = (p N) * Pip(L) .endoobj EULER is pr PRIME-DIVISORS .op phi : NzNat -> NzNat .vars N P : NzNat .eq phi(s 0) = s 0 .cq phi(N * P) = phi(N) * P if prime(P) and P | N .cq phi(N * P) = phi(N) * p P if prime(P) and not P | N .endo***> Prove phi(N) * Pi(pr-div(N)) == N * Pip(pr-div(N))***> for each N:NzNat***> First show the formula for N = 1red phi(1) * Pi(pr-div(1)) == 1 * Pip(pr-div(1)) .***> and introduce the basic constants and assumptionsopenr .ops n q pq : -> NzNat .eq prime(q) = true .eq p q = pq .close***> Then suppose the property for n and prove it for n * qopenr .eq phi(n) * Pi(pr-div(n)) = n * Pip(pr-div(n)) .close lgebraic Induction ***> Case where q | nopen .eq q | n = true .red phi(n * q) * Pi(pr-div(n * q)) ==n * q * Pip(pr-div(n * q)) .close***> Case where not q | nopen .eq q | n = false .red phi(n * q) * Pi(pr-div(n * q)) ==n * q * Pip(pr-div(n * q)) .close The unary function p is the predecessor function, which is inverse to the successor function s ; the function pr-div gives the list of primedivisors of a number; the functions Pi and Pip give the products of alist of numbers, and of the list of predecessors of a list of numbers,respectively. 
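The Euler formula itself can be cross-checked numerically, independently of the OBJ proof score. In the following Python sketch (the helper names are ours), phi_count follows the counting definition of ϕ and phi_formula follows the product formula; exact rational arithmetic avoids any floating-point doubt.

```python
from math import gcd
from fractions import Fraction

def phi_count(n):
    """Euler's function by direct count of 1 <= k <= n coprime to n
    (this agrees with the usual convention phi(1) = 1)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def prime_divisors(n):
    """Set of prime divisors of n, by trial division."""
    ps, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            ps.add(d)
            n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def phi_formula(n):
    """Euler formula: phi(n) = n * product over p | n of (1 - 1/p)."""
    r = Fraction(n)
    for p in prime_divisors(n):
        r *= Fraction(p - 1, p)
    return r

for n in range(1, 200):
    assert phi_formula(n) == phi_count(n)
```

Again, this is a finite check; the OBJ score above carries the actual inductive proof.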
The equation p q = pq tells OBJ that the predecessor of p is positive, an important fact that needs to be proved separately. Atotal of 148 rewrites were executed in doing this proof. (Special thanksto Grigore Ro¸su for help with this example.) (cid:2) Exercise 8.7.2 Prove soundness of the induction scheme for positive naturalnumbers that shows P ( ) and P (p) for each prime p , and then showsthat P (m) and P (n) imply P (mn) for any positive naturals m, n . (cid:2) Example 8.7.7 We now consider another somewhat sophisticated inductionscheme, this one for pairs of natural numbers. The underlying datatype is a simple extension of the naturals, with functions for pairingand unpairing, <_,_> : Nat Nat -> 2Nat .p1,p2 : 2Nat -> Nat . subject to the equations p1(< M, N >) = M .p2(< M, N >) = N . (It is interesting to compare this with the specification for pairs ofnatural numbers in Example 3.3.10.)Our induction scheme for this data type uses the functions a,b : Nat -> 2Natf,g : 2Nat -> 2Nat where First-Order Logic and Proof Planning a(M) = < M, 0 >b(N) = < 0, N >f(< M, N >) = < M + N, N >g(< M, N >) = < M, N + N > where we can think of the a, b as each providing an infinite family ofconstants.Then to prove ( ∀ p : ) P (p) for some first-order formula P , itsuffices to prove the following, where the first two are base cases andthe last two are induction steps, ( ∀ m : Nat ) P ( (cid:104) m, (cid:105) )( ∀ n : Nat ) P ( (cid:104) , n (cid:105) )( ∀ m, n : Nat ) P ( (cid:104) m, n (cid:105) ) ⇒ P ( (cid:104) m + n, n (cid:105) )( ∀ m, n : Nat ) P ( (cid:104) m, n (cid:105) ) ⇒ P ( (cid:104) m, n + m (cid:105) ) provided the scheme is inductive, which is Exercise 8.7.3 below. Noticethat in the last two steps, we can assume that both m, n are positivewithout loss of generality (because 0 is covered by the base cases).To illustrate this induction scheme, we give below an OBJ3 proofscore showing that gcd (m, n) = gcd (n, m) by considering gcd as a function → Nat . 
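The symmetry property, and the subtractive equations used to define gcd in the OBJ score that follows, can be checked on small numbers with a direct Python transcription (an illustrative, finite test only; gcd_sub is our own name).

```python
from math import gcd

def gcd_sub(m, n):
    """Subtractive gcd, transcribing the equations of the GCD module:
    gcd<0,N> = N, gcd<M,0> = M, gcd<M,M> = M,
    gcd<M,N> = gcd<M-N,N> if M > N, and gcd<M,N> = gcd<M,N-M> if N > M."""
    if m == 0:
        return n
    if n == 0:
        return m
    if m == n:
        return m
    if m > n:
        return gcd_sub(m - n, n)
    return gcd_sub(m, n - m)

# Symmetry, and agreement with the built-in gcd, on small arguments.
for m in range(40):
    for n in range(40):
        assert gcd_sub(m, n) == gcd_sub(n, m) == gcd(m, n)
```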
We first define the inequality relation, and then introduce some lemmas.

    openr INT .
      op _>_ : Int Int -> Bool .
      vars I J : Int .
      eq s I > I = true .
      eq I > p I = true .
      eq s I > s J = I > J .
      vars M N : Nat .
      eq 0 > M = false .
      eq s M > 0 = true .
      *** some lemmas
      eq I + J + (- J) = I .
      eq I + J + (- I) = J .
      cq I + J > I = true if I > 0 and J > 0 .
      cq I + J > J = true if I > 0 and J > 0 .
      cq I + J > 0 = true if I > 0 and J > 0 .
    close

    obj 2NAT is sorts 2Nat 2Int .
      pr INT .
      subsort 2Nat < 2Int .
      op <_,_> : Nat Nat -> 2Nat .
      op <_,_> : Int Int -> 2Int .
      ops p1 p2 : 2Nat -> Nat .
      ops p1 p2 : 2Int -> Int .
      vars M N : Int .
      eq p1(< M, N >) = M .
      eq p2(< M, N >) = N .
    endo

    obj GCD is pr 2NAT .
      op gcd_ : 2Int -> Int .
      vars M N : Int .
      eq gcd < 0, N > = N .
      eq gcd < M, 0 > = M .
      eq gcd < M, M > = M .
      cq gcd < M, N > = gcd(< M - N, N >) if M > N and N > 0 .
      cq gcd < M, N > = gcd(< M, N - M >) if N > M and M > 0 .
    endo

    openr GCD .
      ops m n : -> Nat .
      *** base cases:
      red gcd < m,0 > == gcd < 0,m > .
      red gcd < 0,n > == gcd < n,0 > .
      *** for the induction steps:
      eq m > 0 = true .
      eq n > 0 = true .
    close
    *** induction step computations:
    open .
      eq gcd < m, n > = gcd < n, m > . *** induction hypothesis
      red gcd < m, m + n > == gcd < m + n, m > .
    close
    open .
      eq gcd < n, m > = gcd < m, n > . *** induction hypothesis
      red gcd < m + n, n > == gcd < n, m + n > .
    close

These computations require a total of 108 rewrites, many of which check the conditions of equations. □

Exercise 8.7.3 This involves the material introduced in Example 8.7.7.

(a) Show that the scheme of this example is inductive.

(b) Prove that the equation < p1(P), p2(P) > = P holds in any initial model of 2NAT, and use this to conclude that the universal quantification (∀ P : 2Nat) is equivalent to (∀ M, N : Nat). □

Notice that in proving that a sentence P holds for some constructor term t, we are entitled to assume P(t′) for every subterm t′ of t. This is because when t = σ(t₁, …, tₙ), we assume P(t₁), …, P(tₙ), which in turn were proved assuming that P holds for the top-level subterms of t₁, …, tₙ, etc. Some proofs require these additional assumptions. In the case of the usual Peano induction for the natural numbers, the richer induction principle which includes these additional assumptions is called strong induction, or complete induction, or course-of-values induction, and it means that in proving P(n), we can assume P(k) for all k < n. We will use the same names for the corresponding enrichment of our much more general notion of induction. Below is a simple example for the natural numbers:

Example 8.7.8 We define a sequence f of natural numbers by f(0) = 0, f(1) = 0, and f(n + 1) = 3 * f(div2(n)) + 2 for n ≥ 1. The following OBJ proof score shows that f is always even. The module SEQ first defines the auxiliary functions even and div2, which respectively tell if a number is even, and give its quotient when divided by 2. Next four lemmas are stated, a constant n is introduced to eliminate the universal quantifier from the formula to be proved, which is

    (∀ N) even(f(N)) = true ,

and then n is assumed to be at least 2 (we also need to assume it is at least 1, which is easier than introducing another lemma which will deduce that fact). Finally the strong induction hypothesis is stated, the base cases are checked, and then the inductive step is checked, which requires 50 rewrites to get true.
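The evenness claim itself can be checked numerically before giving the score; here is a quick Python sketch of our own (separate from the OBJ development) of the same recursive definition:

```python
def f(n):
    # f(0) = 0, f(1) = 0, f(n + 1) = 3 * f(div2(n)) + 2 for n >= 1
    if n <= 1:
        return 0
    return 3 * f((n - 1) // 2) + 2

# f(n) is even for every n we try; the proof score below shows it for all n
assert all(f(n) % 2 == 0 for n in range(1000))
```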
    obj SEQ is pr NAT .
      op even_ : Nat -> Bool .
      var N : Nat .
      eq even 0 = true .
      eq even s 0 = false .
      eq even s s N = even N .
      op div2_ : Nat -> Nat .
      eq div2 0 = 0 .
      eq div2 s 0 = 0 .
      eq div2 s s N = s div2 N .
      op f_ : Nat -> Nat .
      eq f 0 = 0 .
      eq f s 0 = 0 .
      cq f s N = 3 * f(div2 N) + 2 if N > 0 .
    endo

    openr SEQ .
      vars N M : Nat .
      cq s N > M = true if N > M .
      cq even(N + M) = true if even N and even M .
      cq even(N * M) = true if even M .
      cq N > div2 N = true if N > 0 .
      op n : -> Nat .
      eq n > s 0 = true .
      eq n > 0 = true .
      cq even f N = true if s n > N .
    close
    red even f 0 .
    red even f 1 .
    red even f s n .

□

Exercise 8.7.4 Write a specification for finite sets of naturals having the constructors ∅ and add : Nat Set → Set. Now define union, and prove it is associative, commutative, and idempotent. This is a nice example of induction over a specification that is not anarchic. E41 □

Exercise 8.7.5 (⋆) Explore the idea that if specifications (Ψ, E) and (Ψ′, E′) are equivalent in the (loose) sense of Section 4.10, then they are ground equivalent, in that they have the same initial models (after reducing to a common signature); therefore either one can be used as an induction scheme for the other. Going even further down this road, it might be interesting to consider the two ways of specifying pairs of natural numbers that are given in Examples 3.3.10 and 8.7.3. □

Literature

Many mathematicians, especially logicians, would say that first-order logic is the most important of all logics, because it is the foundation for set theory, and hence for all of mathematics. It is certainly one of the most intensively studied of all mathematical systems, and from the mid twentieth century has been considered the most classic of all logics.
Introductory textbooks on logic include [163, 133] and [48]; there are many more textbooks, and a truly enormous literature of advanced texts and papers.

Most logic texts emphasize the completeness of some set of rules; we have taken a different, some would say eccentric, approach, by developing and using whatever rules we need, subject to proving their soundness based on satisfaction; in this sense, we consider logic to be a kind of open system (see the discussion in Section 1.3). In any case, pure first-order logic is not sufficient for our applications, because of our need for built-in data types, induction, and (in the next chapter) second-order logic.

Many mathematicians would also say that the rules of inference of first-order logic are "self evident" logical truths. But this has been challenged by the so-called intuitionism of Brouwer and others, as well as some even more radical approaches, such as Martin-Löf type theory. In particular, the rules of double negation (P5) and proof by contradiction (R4) have been questioned. However, these challenges have less force for reasoning about relatively finitistic applications like numbers, lists, and circuits, which are our main interest in this book.

The material in Section 8.3 builds on the algebraic exposition of first-order logic given in [67]. There are many works on algebraic approaches to logic; some early books are by Paul Halmos [96], Roger Lyndon [125], and Helena Rasiowa [155]; some of the earliest work in this area was done in Poland, and Rasiowa was one of the pioneers. Halmos and Givant [97] give a nice introductory treatment. More advanced approaches involve cylindric algebra, category theory, sheaves, and topoi [100, 44].

Horn clauses are named after Alfred Horn, who for many years was professor of logic at UCLA.
Horn clauses are the basis for the syntax of so-called "logic programming" languages, such as Prolog [34]; however, most of the syntax of Prolog does not correspond to Horn clause logic, and the part of its syntax that does correspond has a semantics that differs from the model theory of Horn clause logic.

Our definition of well-formed formulae follows the "initial algebra semantics" advocated by [52] and [89] in using the freeness of certain syntactic algebras. We consider this to be both simpler and clearer than the approaches usually taken in logic.

The semantic definition of truth for first-order logic is originally due to Tarski [175]. This important conceptual advance ushered in the era of so-called "model theory" in mathematical logic, and is the original source for the emphasis on semantics in this book.

The proof of Theorem 8.2.6 is due to Diaconescu, and follows [41]. Those familiar with logic programming may be interested to note that the initial H-model T_H is what is usually called a "Herbrand Universe" in that field.

Properties (a) and (b) of Exercise 8.1.3 show that Φ-models with Φ-morphisms form (what is called) a category; then property (c) follows automatically, along with many other useful properties. Category theory also suggests that for some structures bijective morphisms may not be isomorphisms, because bijectivity cannot even be stated for abstract categories (see (d) of Exercise 8.1.3). Such "facts for free" and hints about generalizability motivate the study of category theory, and some basics are given in Chapter 12, along with a few relevant references.

Institutions [67] axiomatize the Tarskian model-theoretic formulation of mathematical logic, by axiomatizing the satisfaction relation between syntax and semantics, indexed by the ambient signature; the main axiom is called the satisfaction condition, a special case of which was treated in Section 4.10.
The theory of institutions is developed to some extent in Chapter 14; this theory has been used for many computing science applications, including specification and modularization. The Eqlog system [79, 42] implements the institution HCLEQ; because of its use of general equality, Eqlog goes well beyond what standard logic programming languages like Prolog provide, though at some cost in efficiency. The institution FOLQ/D is closely related to the hidden algebra institution developed in Chapter 13, which was designed to handle dynamic situations, such as the sequential (i.e., state dependent, time varying) circuits that are discussed in Chapter 9.

Proof planning is a venerable topic in Artificial Intelligence, and there is nothing especially novel about our treatment here, except that we have been very precise about the institution(s) involved, have put a strong emphasis on equational reasoning, and have accepted the need for human involvement. Inference and proof planning are not equational deduction, because not all rules are reversible; instead they are rewrite rules. If a more logical formulation is really desired, then Meseguer's rewriting logic [134] is applicable.

The formulation of the Substitution Theorem (Theorem 8.3.22) appears to be new. Special thanks to Grigore Roşu for help with its proof, especially Proposition 8.3.21, on which it rests. Skolem constants and functions were introduced by the Norwegian logician Thoralf Skolem in his studies of first-order proof theory in the late 1920s.

The material on algebraic induction in Section 8.7 appears to be new. In general, mathematicians have not seen much value in formalizing exotic variants of ordinary induction, though they have done considerable work in formalizing variants of transfinite induction. On the other hand, computer scientists have been very concerned with inductive techniques for complex data structures; Burstall's work on structural induction pioneered this important area [21].
Some related work has been done in computer science under the names of "cover set" and "test set"; for some recent work, with citations of older literature, see [17, 18]; this work concerns semi-automated inductive proofs of equations for the one-sorted case, largely restricted to anarchic constructors. This material, like much else in this book, is also more general than most of the existing literature in that it treats the overloaded many-sorted case.

Humans have a deep-seated desire for the phenomena that they experience to appear coherent, e.g., through some kind of causal explanation. In particular, we want to know why a particular step is taken in a proof, not just that it happens to work. Unfortunately, formal proofs usually leave out the motivations for their steps. However, a proof should be more fun (or at least, relatively easier) to read if it is organized to tell a story, explaining how various difficulties arise and are overcome. Thus it seems likely that much can be learned about how to effectively present proofs by studying narratives, particularly oral stories as they are actually told, and perhaps also movies. References on narratology include [118, 7, 123, 145, 124], and more information on proofs as narratives can be found in [64]. Semiotics, which is the study of signs, should also be able to help make proofs easier to understand; references on semiotics include [149, 104, 64, 83].

A Note for Lecturers: Although much of the material in this chapter looks rather technical, there are usually simple underlying intuitions that can be brought out through discussing the examples. (However the proof of Proposition 8.3.21 in Appendix B really is rather technical.)

Some students are troubled by the seemingly circular use of first-order logic to reason about first-order logic.
Therefore it should be explained that informal first-order (and occasionally higher-order) logic is the language of mathematics, and that the project of this chapter is to formalize first-order logic, that is, to make of it a mathematical object, which can then be reasoned about; this formal system has much the same status as the formalizations of the natural numbers that we have been dealing with all along, but of course it is much more complex. The major purpose of this formalization, for this book, is to justify the various ways that we use to mechanize reasoning.

Second-Order Equational Logic

This chapter generalizes ordinary equational logic, which involves only universal quantification over constants, to second-order equational logic, which in addition permits universal quantification over operations. We then develop full second-order logic with the same techniques used for first-order logic in Chapter 8, and in particular, we extend first-order quantification to second-order quantification using the techniques of Section 8.3, emphasizing the special case where the only predicates are the equality predicates. This treatment of second-order quantification is a natural extension of our treatment of first-order quantification, and the algebraic techniques used are essentially the same as for the first-order case.

These generalizations are crucial for our approach to verifying sequential hardware circuits, the behavior of which changes over time, due to the effects of internal memory and/or external inputs. Although the resulting logic is much simpler than higher-order logic, it is entirely adequate for applications to hardware. These applications extend the approach of Section 7.4 with infinite sequences of Boolean values on wires, instead of single Boolean values, so as to model states, inputs, etc. that vary with (discrete, unbounded) time, as represented by the sequence of natural numbers.
The following illustrates some of what can be done in this framework:

Example 9.0.1 The fact that the union of a relation with its converse is its symmetric closure, i.e., is the smallest symmetric relation containing it, is stated by the second-order formula below, where R, S are relation variables, i.e., they have rank DD → Bool for some sort D (which you should think of as the "domain" of R and S), and x, y are ordinary variables of sort D:

    (∀ R, S) ([(∀ x, y) (S(x, y) = S(y, x)) ∧ (R(x, y) ⇒ S(x, y))]
              ⇒ [(∀ x, y) (R(x, y) ∨ R(y, x) ⇒ S(x, y))]) .

After the relevant formal definitions are presented, Example 9.1.9 will give an OBJ proof score that verifies the above formula. □

We begin our formal development with the syntax of equations:

Definition 9.0.2 A (second-order) Σ-equation is a signature X of variable symbols (disjoint from Σ) plus two (Σ ∪ X)-terms; we write such equations abstractly in the form

    (∀ X) t = t′

and concretely in forms like

    (∀ x, y, z, f, g) f(x, y, z) = g(x, y) + z ,

when X = {x, y, z, f, g} and their sorts can be inferred from their uses in the terms. This definition and notation extend to conditional equations in the usual way. □

To define satisfaction for second-order equations, we extend the notion of an assignment of values in a Σ-algebra from ground signatures to arbitrary signatures. Such an assignment on X is just an interpretation of X in M, that is, an X-algebra structure for M in addition to the Σ-algebra structure it already has. Since Σ and X are disjoint, this means that M gets the structure of a (Σ ∪ X)-algebra. Therefore the Σ-algebra M and the interpretation a : X → M in M of the variable symbols in X determine a unique (Σ ∪ X)-homomorphism a̅ : T_{Σ∪X} → M by the initiality of T_{Σ∪X}. Note that operation symbols in X are interpreted as functions on M in exactly the same way as are operation symbols in Σ.
Now we are ready for

Definition 9.0.3 A Σ-algebra M satisfies a second-order Σ-equation (∀ X) t = t′ iff for any interpretation a : X → M we have a̅(t) = a̅(t′) in M. In this case we may write

    M ⊨_Σ (∀ X) t = t′ ,

omitting some Σ subscripts for simplicity. A Σ-algebra M satisfies a set A of (first- and second-order) Σ-equations iff it satisfies each e ∈ A, and in this case we write M ⊨_Σ A. We may also say that M is a P-algebra, and write M ⊨ P, where P = (Σ, A). Once again, this extends to conditional equations in the obvious way. □

It is very pleasing that this is such a straightforward generalization of first-order equations and their satisfaction, obtained by using an arbitrary signature instead of a ground signature; in particular, note that initiality plays exactly the same role here as it did in the first-order case. Perhaps surprisingly, specifications with second-order equations have initial models (see Theorem 9.1.10 below); the proof is nearly the same as for the first-order case (Theorem 6.1.15), but is deferred to the next section because it requires a result given there. The following illustrates the satisfaction of second-order equations, and also shows how easy it is to write second-order equations that identify all elements of any model that satisfies them:

Example 9.0.4 Many simple second-order equations only have trivial models. For example, given a one-sorted signature, the equation

    (∀ f, x, y) f(x, y) = f(y, x)

implies (∀ x, y) x = y, by taking the function f(x, y) = x. Therefore any model satisfying this equation must have just one element. Some other simple equations that have the same effect are (∀ f, x) f(x) = a, where a is a constant, and (∀ f, g, x) f(x) = g(x).
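This collapsing effect can be checked exhaustively for small carriers. The following Python sketch (ours, purely illustrative) decides satisfaction of the first equation above by quantifying f over all binary operations on a finite set, which is feasible only for very small sets:

```python
from itertools import product

def satisfies(M):
    """Does the carrier M satisfy (forall f, x, y) f(x, y) = f(y, x)?"""
    pairs = list(product(M, repeat=2))
    # enumerate every binary operation f : M x M -> M as a finite table
    for values in product(M, repeat=len(pairs)):
        f = dict(zip(pairs, values))
        if any(f[(x, y)] != f[(y, x)] for x in M for y in M):
            return False   # this interpretation of f is a counterexample
    return True

assert satisfies({0})          # the one-point algebra satisfies the equation
assert not satisfies({0, 1})   # any carrier with two elements does not
```

The failing witness for {0, 1} is exactly the projection f(x, y) = x used in the argument above.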
□

Theorem of Constants

The development of first-order equational logic in Chapters 3 and 4 defined variables to be new constants, and used the Theorem of Constants (Theorem 3.3.11) to justify proving equations with variables by regarding the variables as constants, so that ground term deduction could be used; recall that this was necessary for applications such as inductive proofs. We now extend it to our second-order setting, where it will play the same role:

Theorem 9.1.1 (Theorem of Constants) Given disjoint signatures Σ and X, given a set A of Σ-equations, and given t, t′ ∈ T_{Σ∪X}, then

    A ⊨_Σ (∀ X) t = t′   iff   A ⊨_{Σ∪X} (∀∅) t = t′ .

Proof: Each condition is equivalent to the condition that a̅(t) = a̅(t′) for every (Σ ∪ X)-algebra M satisfying A and every a : X → M, where a̅ : T_{Σ∪X} → M is the unique homomorphism. □

This is exactly the same proof that we gave for the first-order case; as before, its simplicity arises from using satisfaction and initiality, rather than using some set of rules of deduction. This and the Completeness Theorem for first-order equational logic give the following:

Corollary 9.1.2 Given disjoint signatures Σ and X, given a set A of Σ-equations, and given t, t′ ∈ T_{Σ∪X}, then

    A ⊨_Σ (∀ X) t = t′   iff   A ⊢_{Σ∪X} (∀∅) t = t′ . □

Results below on expanding and contracting signatures, which generalize results in Section 4.7, use the following:

Definition 9.1.3 A signature Ψ is non-void over another signature Σ iff Ψ is disjoint from Σ and (T_{Σ∪Ψ})_s is non-empty for every sort s of Ψ. Similarly, a signature Ψ is non-void relative to a subsignature Φ over another signature Σ if and only if Ψ is disjoint from Σ, and (T_{Σ∪Ψ})_s is non-empty for every sort s of Ψ iff (T_{Σ∪Φ})_s is non-empty for every sort s of Φ.
□

Fact 9.1.4 If Φ ⊆ Ψ, if Φ is non-void over Σ, and if Ψ is non-void relative to Φ over Σ, then Ψ is non-void over Σ.

Proof: By the relative non-void hypothesis, (T_{Σ∪Ψ})_s ≠ ∅ for every sort s of Ψ iff (T_{Σ∪Φ})_s ≠ ∅ for every sort s of Φ, and the latter is true by the non-voidness of Φ. Therefore the former is also true. □

Proposition 9.1.5 If (∀ Φ) t = t′ is a (second-order) Σ-equation and if Φ ⊆ Ψ with Ψ non-void relative to Φ over Σ, then (∀ Ψ) t = t′ is also a Σ-equation, and for every Σ-algebra M,

    M ⊨_Σ (∀ Φ) t = t′   iff   M ⊨_Σ (∀ Ψ) t = t′ .

Proof: Let a : Φ → M be such that a̅(t) = a̅(t′). Then Φ must be non-void over Σ, and hence so is Ψ, by the non-voidness of Ψ relative to Φ. Therefore we can extend a to b : Ψ → M such that b̅(t) = b̅(t′), by choosing interpretations for operations in Ψ − Φ. On the other hand, if there are no such interpretations a, then Ψ must be void, so that Φ is also void, by the non-voidness of Ψ relative to Φ, and hence there are no such interpretations b. Therefore the first condition implies the second.

Conversely, if b : Ψ → M is such that b̅(t) = b̅(t′), then Ψ is non-void over Σ, so we can restrict b to a : Φ → M such that a̅(t) = a̅(t′). On the other hand, if there is no such b, then Ψ must be void, and then relative non-voidness implies Φ is void too, so there is also no such a. □

The above result justifies adding extra constant and operation symbols to a proof score under appropriate conditions. Necessity of the non-voidness hypothesis is shown by the specification given in Example 4.3.8, but the following is also interesting:

Exercise 9.1.1 Give signatures Σ, Φ, Ψ plus a Σ-model M and a Σ-sentence (∀ Φ) t = t′ such that Φ ⊆ Ψ, where Ψ adds only non-constant operations to Φ, and M satisfies (∀ Φ) t = t′ but does not satisfy (∀ Ψ) t = t′.
□

The Theorem of Constants also generalizes in a way similar to that of Proposition 9.1.5:

Corollary 9.1.6 Given a (second-order) Σ-equation (∀ Φ) t = t′, a signature Δ disjoint from both Φ and Σ, and a set A of Σ-equations, let Ψ = Φ ∪ Δ. Then

    A ⊨_Σ (∀ Ψ) t = t′   iff   A ⊨_{Σ∪Φ} (∀ Δ) t = t′ ,

provided Ψ is non-void relative to Φ over Σ.

Proof: If M ⊨_Σ A then M ⊨_Σ (∀ Φ) t = t′ iff M ⊨_Σ (∀ Ψ) t = t′ by Proposition 9.1.5, so that

    A ⊨_Σ (∀ Φ) t = t′  iff  A ⊨_Σ (∀ Ψ) t = t′  iff  A ⊨_{Σ∪Φ} (∀ Δ) t = t′ ,

the last step by Theorem 9.1.1. □

(Theorem 9.1.1 is the special case where Δ = ∅.)

Theorem 9.1.1 also justifies a key rule of deduction for second-order equational logic, generalizing the first-order universal quantifier elimination rule (and the transformation T5 of Chapter 8) to the second-order case. Let ⊢ denote the syntactic derivability relation defined by some complete set R of rules for first-order equations, and let ⊢′ be defined by R plus the following new rule, which we will call dropping:

    (D)  A ⊢_{Σ∪Φ} (∀∅) t = t′  implies  A ⊢′_Σ (∀ Φ) t = t′ .

Fact 9.1.7 The rule (D) is sound.

Proof: If A ⊢_{Σ∪Φ} (∀∅) t = t′, then A ⊨_Σ (∀ Φ) t = t′ by Corollary 9.1.2, so it is sound to infer (∀ Φ) t = t′. □

We also have the following:

Theorem 9.1.8 (Completeness) If R is a set of rules of deduction for first-order equational logic, defining a relation ⊢ that is complete, then R′ = R ∪ {(D)} is complete for unconditional second-order equations, in the sense that it defines a relation ⊢′ such that

    A ⊢′_Σ (∀ Φ) t = t′   iff   A ⊨_Σ (∀ Φ) t = t′ ,

for any signature Σ and any set A of first-order Σ-equations.
Proof: The direct implication is soundness of (D), which is Fact 9.1.7, plus soundness of ⊢. For the converse, if A ⊨_Σ (∀ Φ) t = t′, then Corollary 9.1.2 gives A ⊢_{Σ∪Φ} (∀∅) t = t′, and then (D) gives A ⊢′_Σ (∀ Φ) t = t′. □

This result supports deducing a single second-order equation from a set of first-order equations; it does not provide a complete inference system for second-order equational logic, which would instead infer second-order equations from other second-order equations. However, when all sorts involved are non-void, Corollary 9.1.6 allows dropping second-order equations to first-order equations, which could then be used in proofs, although in a limited way, because we cannot substitute for the second-order variables that have been dropped. Nevertheless, the inferences that are supported by the above result are all we need for our applications to the verification of sequential circuits, and many other applications, such as the following:

Example 9.1.9 We use the machinery developed above to prove the formula of Example 9.0.1. As usual, relations are translated into Boolean-valued functions. The quantifiers for R and S may be considered eliminated by their declarations; note that commutativity of S is given as an attribute in its declaration, instead of an equation. The main implication in the formula is eliminated by applying the rule (D). The quantifiers for x and y are eliminated in the usual way, as are the implications within their two scopes. The case split in the proof is justified by applying the disjunction elimination rule (T2 of Chapter 8) to the formula R(x, y) ∨ R(y, x), and then translating the disjuncts to equations.

    th SETUP is sort D .
      op R : D D -> Bool .
      op S : D D -> Bool [comm] .
      vars X Y : D .
      cq S(X,Y) = true if R(X,Y) .
      ops x y : -> D .
    endth
    open . *** first case
      eq R(x,y) = true .
      red S(x,y) .
    close
    open . *** second case
      eq R(y,x) = true .
      red S(x,y) .
    close

As expected, both reductions give true. Of course, this is a very simple example. □

Exercise 9.1.2 Extend Example 8.2.7 by proving that for any relation R, its transitive closure R* as defined there is the least transitive relation containing R. (To do this in OBJ, some ingenuity will be needed in handling the equation for transitivity.) □

Theorem 9.1.10 (Initiality) Given a set A of Σ-equations, possibly conditional, which may be either first- or second-order, let ≡ be the Σ-congruence on T_Σ generated by the relation R having the components

    R_s = { ⟨t, t′⟩ | A ⊢ (∀∅) t = t′, where t, t′ are of sort s } .

Then T_Σ/≡, denoted T_{Σ,A}, is an initial (Σ, A)-algebra.

Proof: E42 Given any (Σ, A)-algebra M, let v : T_Σ → M be the unique homomorphism, and notice that R ⊆ ker(v), because M ⊨ A implies M ⊨ (∀∅) t = t′ for every ⟨t, t′⟩ ∈ R by Theorem 9.1.8. Now let v = q;u with u : T_{Σ,A} → M be the factorization of v given by Proposition 6.1.12; see Figure 6.2. For uniqueness, if also u′ : T_{Σ,A} → M, then q;u′ = v by the initiality of T_Σ, and R ⊆ ker(v) because M ⊨ A. Therefore u = u′ by (2) of Proposition 6.1.12. □

Second-Order Logic

The algebraic development of first-order logic in Chapter 8 extends straightforwardly to second-order quantification. As before, we assume a fixed first-order signature Φ = (Σ, Π) with sort set S, but now we let 𝒳 be an (S* × S)-indexed set of variable symbols, disjoint from Σ and Φ, such that each sort has an infinite number of symbols. For X ⊆ 𝒳, a (Φ, X)-term is an element of T_{Σ∪X}, and the (S* × S)-indexed function Var on these terms is defined by:

0. Var_{w,s}(σ) = ∅ if σ ∈ Σ_{[],s};
1. Var_{w,s}(x) = ∅ if x ∈ X_{[],s} and w ≠ [];
2. Var_{[],s}(x) = {x} if x ∈ X_{[],s};
3. Var_{w,s}(σ(t₁, …, tₙ)) = ⋃ᵢ₌₁ⁿ Var_{w,s}(tᵢ) if n > 0 and σ ∉ X_{w,s};
4. Var_{w,s}(σ(t₁, …, tₙ)) = {σ} ∪ ⋃ᵢ₌₁ⁿ Var_{w,s}(tᵢ) if n > 0 and σ ∈ X_{w,s}.

(As before, Var can be seen as a (Σ ∪ X)-homomorphism T_{Σ∪X} → P(X).)

Then the well-formed (Φ, X)-formulae are defined just as in Definition 8.3.1, as elements of the (one-sorted) algebra WFF_X(Φ), free over the metasignature Ω, except that the universal quantification operations (∀x) will now include non-constant function symbols, and the atomic (Φ, X)-formula generators should be

    G_X = { π(t₁, …, tₙ) | π ∈ Π_{s₁…sₙ} and tᵢ ∈ (T_{Σ∪X})_{sᵢ} for i = 1, …, n } .

A Φ-formula is a (Φ, X)-formula for some X, and the functions Var and Free, giving all variables, and all free variables, of Φ-formulae are defined just as in Definition 8.3.1; the notions of bound variable, closed formula, scope, etc. are also the same.

The semantics of first-order formulae in Section 8.3.3 generalizes directly to second-order formulae, along with the results in that section. In particular, E43 Definition 8.3.2 does not need to be changed at all, except to note that X is an arbitrary signature, not just a ground signature, so that interpretations of X also need to be general; in particular, all the rules given in that section remain sound. The material on substitutions in Section 8.3.4 should be modified a bit, but we will not do so, because we don't need it for this chapter.

Figure 9.1: Series Connected Inverters

Let us denote the institution of second-order logic that results from the above by TOL, and denote the special case where the only predicates are the equality predicates by TOLQ, calling it the second-order logic of equality.
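For intuition, the clauses defining Var above can be transcribed into executable form. The following Python sketch is ours and deliberately simplified: it ignores the rank indexing (w, s) and simply collects all occurrences of symbols drawn from a set X of variable symbols, including head symbols of applications, which is the second-order twist of clause 4; the term representation is hypothetical:

```python
# Terms are pairs (symbol, [subterms]); variables and constants have an
# empty subterm list. X is the set of variable symbols. Clauses 0-2 handle
# symbols with no arguments; clauses 3-4 handle applications, adding the
# head symbol exactly when it is itself a (second-order) variable.
def var(t, X):
    sym, args = t
    occurring = {sym} if sym in X else set()
    for a in args:
        occurring |= var(a, X)
    return occurring

# f is a second-order variable applied to x and to g(y, c)
t = ('f', [('x', []), ('g', [('y', []), ('c', [])])])
assert var(t, {'f', 'x', 'y'}) == {'f', 'x', 'y'}
assert var(t, {'x'}) == {'x'}
```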
Also, as in Section 8.3.6, if we fix a signature Ψ and a Ψ-model D, then we can define the institution TOLQ/D to be TOLQ with the additional requirement that all its signatures must contain Ψ, and that all its models M must be such that their reduct M|_Ψ to Ψ is D.

Verification of Sequential Circuits

Whereas combinational circuits can be described by equations that do not involve time, sequential circuits have behaviors that vary with time, and thus require modeling wires that have time-varying values. One common approach is to model such wires as streams of Boolean values. We can still apply the method of Section 7.4 to obtain a system of equations from a circuit diagram, but the variables that model wires now take values that are functions from Nat (where the natural numbers represent moments of time) to truth values. As before, we use PROPC to represent the values on wires, rather than just BOOL, so that we can exploit its decision procedure for propositional logic. The following simple example illustrates the approach:

Example 9.3.1 (Series Connected Inverters) We prove that the series connection of two NOT gates (i.e., inverters), each with one unit delay, has the same effect as a two unit delay; see Figure 9.1. The system of equations involved here is

    f1(t + 1) = not f0(t)
    f2(t + 1) = not f1(t)

each of which is (implicitly) universally quantified over f0, f1, f2 and t, where each variable fᵢ has rank ⟨Nat, Prop⟩, and where t has sort Nat. We think of f0 as an input variable, f2 as an output variable, and f1 as an internal variable. The behavior that we expect this circuit to have is described by the equation

    f2(t + 2) = f0(t) ,

i.e., it functions as a two unit delay. Moreover, its internal behavior (at f1) is described by the first equation of the system, f1(t + 1) = not f0(t). An OBJ proof score showing that these two terms indeed solve the two inverter system will be given below.
The assertion to be verified has the form

    A ⊨_Σ (∀ Φ) r

where Σ is the union of the signatures of the objects PROPC and NAT, A is the union of their equations, Φ is the signature containing three function symbols f0, f1, f2 of rank ⟨Nat, Prop⟩, and r is of the form (e₁ ∧ e₂) ⇒ e, where

    e₁ = (∀ t) f1(s t) = not f0(t)
    e₂ = (∀ t) f2(s t) = not f1(t)
    e  = (∀ t) f2(s s t) = f0(t) .

We use the transformation rules of Section 8.4 plus R14. By R14 and T6, it suffices to prove that

    A ⊨_{Σ∪Φ} (∀∅) r ,

which by rule T1 is equivalent to

    A ∪ {e₁ ∧ e₂} ⊨_{Σ∪Φ} e ,

which by rule T6 again can be verified by proving

    A ∪ {e₁ ∧ e₂} ⊨_{Σ∪Φ∪{t}} f2(s s t) = f0(t) ,

which by rule T3 is equivalent to

    A ∪ {e₁, e₂} ⊨_{Σ∪Φ∪{t}} f2(s s t) = f0(t) ,

which is exactly what the proof score below does. This series of deductions at the meta level can be automated using the same techniques as were used in Section 8.4, but we do not give details here.

    open NAT + PROPC .
      ops (f0_)(f1_)(f2_) : Nat -> Prop [prec 9] .
      var T : Nat .
      eq f1 s T = not f0 T .
      eq f2 s T = not f1 T .
      op t : -> Nat .
      red f2 s s t iff f0 t .
    close

Notice that the output values of the inverters at time 0 are not determined by the equations given above, and do not enter into the verification. If desired, the following equations could be added

    eq f1 0 = false .
    eq f2 0 = false .

to give them fixed values, but this is not necessary. □

The sentence proved here is typical of a very large class of sequential hardware verification problems, which have the form

    A ⊨_Σ (∀ Φ) (C ⇒ e)

where A defines the abstract data types of the problem, where C is a conjunction of equations defining the circuit, where e is an equation to be proved, and where Φ may involve second-order quantification.

Figure 9.2: Parity of a Bit Stream
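The verified behavior can also be simulated concretely. The following Python sketch (ours, not part of the OBJ development) models the three wires as finite Boolean streams and checks the two unit delay on a random input; the index-0 entries stand for the undetermined initial outputs discussed above:

```python
import random

random.seed(0)
T = 50
f0 = [random.choice([False, True]) for _ in range(T + 2)]
# one unit delay per inverter; the index-0 values are arbitrary, matching
# the fact that the equations leave the outputs at time 0 undetermined
f1 = [False] + [not f0[t] for t in range(T + 1)]   # f1(t+1) = not f0(t)
f2 = [False] + [not f1[t] for t in range(T + 1)]   # f2(t+1) = not f1(t)

# the series connection acts as a two unit delay: f2(t+2) = f0(t)
assert all(f2[t + 2] == f0[t] for t in range(T))
```

Simulation checks only one finite input, of course; the proof score establishes the property for all inputs and all times at once.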
Example 9.3.2 We consider a simple circuit to compute the parity of a stream of bits, using just one T (for "toggle") type flip-flop. In Figure 9.2, f is the input stream, c is the clock stream, and y is the output stream; the clock stream just marks cycles to stabilize the flip-flop, and can be ignored for our (logical) purposes. This flip-flop satisfies the following equations,

    y(t + 1) = f(t) + y(t)
    y(0) = false

from which it follows by induction that

    (∀ f, y, t)  y(t + 1) = ∑ᵢ₌₀ᵗ f(i) .

The proof is as follows: The base case, with t = 0, is

    y(1) = f(0) + y(0) = f(0) + false = f(0) .

For the induction step, we assume the above equation, and then prove it with t + 1 in place of t, by first noting that y(t + 2) = f(t + 1) + y(t + 1), and then applying the above equation. □

Exercise 9.3.1 Prove correctness of the circuit of Example 9.3.2 using OBJ. □

It is worth remarking that any verification of a combinational circuit "lifts" to a verification of the same circuit viewed as a sequential circuit, by replacing each wire variable, whether an input i_k or a non-input p_k, by a function Nat -> Prop, e.g., in the form f_k(t); the reason is that the same proof works for the lifted system, with the same verified property holding at each instant.

Literature and Discussion

The material in this chapter is based on [59], although Proposition 9.1.5 and Theorems 9.1.10 and 9.1.8 are not there, and appear to be new, as does the exposition of second-order logic. The proofs involving universal properties of quotient and freedom seem especially elegant and simple.

It is interesting to compare our method for representing sequential circuits with the more familiar method which represents components using higher-order relations, and represents connections using existential quantification (as in the usual definition of the composition of relations) [93, 94, 25]; see also [167].
By contrast, the representation suggested here uses no relations (except equality, in an implicit way), and it represents interconnection by equality of wires. For sequential circuits, both methods represent wires as variables that range over functions, and both methods use second-order quantifiers. However, the results of this chapter show that existential quantification and higher-order relations can be avoided in favor of a simple extension of first-order equational logic by universal quantification over functions, contrary to claims made in [25]. The higher-order logic approach to hardware verification of [25] was claimed to have many benefits, including the following:

1. natural definitions of data types (using Peano style induction principles);
2. the possibility of leaving certain values undefined (such as the initial output of a delay);
3. dealing with bidirectional devices.

But all these benefits can be realized more simply using just second-order equational logic:

1. Chapter 6 showed that initial algebra semantics supports abstract data type definitions in a very natural way, and also supports the use of structural induction principles for such definitions.
2. It is very easy to leave values undefined, such as the values of inverters at time 0; conditional equations can also be used for this purpose.
3. Although it is often advantageous to exploit causality (in the form of an input/output distinction), we are not limited to that case, because equality is bidirectional.
4. Moreover, because equational logic is simpler than higher-order logic, in general its proofs are also simpler.

Of course, our method also has its limits, but fortunately, the problems of greatest interest for hardware verification fall well within its capabilities.

Although the rules and transformations of Chapter 8 were proved for first-order logic, they extend to second-order quantification.
We did not use such extended rules in this chapter, because (e.g., in Example 9.3.1) we first applied rule (D) to get rid of the second-order universal quantifiers. This is sufficient for assertions of the form A ⊫_Σ (∀Φ) r where r contains only first-order quantifiers, but it is not sufficient if r contains second-order quantifiers. Many of the rules in Chapter 8 actually hold for a wide variety of logics, as can be shown using material on institutions in Chapter 14.

It is worth mentioning that the use of a loose extension of a fixed data theory in the institution TOLQ/D is closely related to the hidden algebra institution developed in Chapter 13; this should not be surprising, because hidden algebra was designed to handle dynamic situations, of which sequential circuits are a prime example.

A Note for Lecturers: This short chapter contains some nice examples and some relatively easy theory, which should be included in a course if possible, after the relevant parts of Chapters 7 and 8 have been covered. Proposition 9.1.5 can be skipped, as can the details in Section 9.2.

Order-Sorted Algebra and Term Rewriting

There are many examples where all items of one sort are necessarily also items of some other sort. For example, every natural number is an integer, and every integer is a rational. We can write this symbolically in the form

  Natural ≤ Integer ≤ Rational ,

where Natural, Integer, and Rational are names for the sorts of entity involved. If we associate to each such name a meaning (i.e., a semantic denotation, also called an extension) which is the set of all items of that sort (e.g., the set of all integers), then the subsort relations appear as set-theoretic inclusions of the corresponding extensions. For example, if the usual extensions of Natural, Integer, and Rational are denoted N, Z, and Q, respectively, then we have N ⊆ Z ⊆ Q. Sort names like Natural and Rational are syntactic, while their extensions are semantic.
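In OBJ notation (introduced formally below), such a subsort chain is declared with subsort and overloaded operation declarations along the following lines; the object name NUMBERS and the choice of operations here are illustrative assumptions, not taken from this text:

```
obj NUMBERS is
  sorts Natural Integer Rational .
  subsorts Natural < Integer < Rational .
  *** one addition symbol, declared at each sort;
  *** the three instances must agree on shared arguments
  op _+_ : Natural Natural -> Natural .
  op _+_ : Integer Integer -> Integer .
  op _+_ : Rational Rational -> Rational .
endo
```

The requirement that the three declarations of _+_ agree wherever their arguments overlap is exactly the subsort polymorphism discussed in what follows.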
This distinction is formalized below with order-sorted signatures, which include a set of sort names with a partial order relation, and a family of operation symbols with sorted arities; an order-sorted algebra for a given signature is then an interpretation for these sort and operation names that respects the subsorts and arities. This area of mathematics is called order-sorted algebra (hereafter abbreviated OSA).

A closely related topic is overloading, which allows a single symbol to be used for several different operations. In applying an overloaded operation symbol, we may not even be aware that we are moving among various sorts and operations. For example, we can add a rational and an integer, or a natural and a rational, or two rationals; and addition operations on naturals, integers, and rationals, together with a subsort relation among them, can be defined in such a way that whichever addition is used, we always get the same result from the same arguments, provided they make sense. We may describe this by saying that + is subsort polymorphic.

However, this is only one of several ways that the term "polymorphic" is used. The term was introduced by Christopher Strachey to express the use of the same operation symbol with different meanings in programming languages. He distinguished two main forms of polymorphism, which he called ad hoc and parametric. In his own words [173]:

    In ad hoc polymorphism there is no simple systematic way of determining the type of the result from the type of the arguments. There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content. All the ordinary arithmetic operations and functions come into this category.
    It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this class.

    Parametric polymorphism is more regular, as illustrated by the following example: Suppose f is a function whose argument is of type α and whose result is of type β (so that the type of f might be written α → β), and that L is a list whose elements are all of type α (so that the type of L is α list). We can imagine a function, say Map, which applies f in turn to each member of L and makes a list of the results. Thus Map[f, L] will produce a β list. We would like Map to work on all types of list provided f was a suitable function, so that Map would have to be polymorphic. However its polymorphism is of a particularly simple parametric type which could be written (α → β, α list) → β list, where α and β stand for any types.

Strachey's distinction is based on the kind of semantic relationship that holds between the different interpretations of an operation symbol, and it suggests a more detailed distinction among the different kinds of polymorphism, in which the less ad hoc the relationship is, the easier it is to do type inference, and the closer it is to parametric polymorphism:

• In strongly ad hoc polymorphism, an operation symbol has semantically unrelated uses, such as + for both integer addition and Boolean disjunction. (But even in this case, the two instances of + share the associative, commutative, and identity properties.)

• In multiple representation, the uses are related semantically, but their representations may be different. For example, in an arithmetic system we may have several representations for the number 2, in integer, decimal, and fractional notations. Polar and Cartesian coordinate representations of points in the plane are another example.
• Subsort polymorphism is where the different instances of an operation symbol are related by subset inclusion, such that the result does not depend on the instance used, as with + for natural, integer, and rational numbers. This sense is developed in this chapter.

• Parametric polymorphism, as in Strachey's Map function, appears in many higher-order functional programming languages, including ML [99, 180], Haskell [107], and Miranda [179].

OBJ supports all four kinds of polymorphism. Strongly ad hoc polymorphism is supported by signatures in which the same operation symbol has sorts that are unrelated in the subsort hierarchy. We have already explained that subsort polymorphism is inherent in the nature of OSA. Strachey's implementation of arithmetic involved "transfer functions" (which would now be called "coercions") to change the representation of numbers; but coercions are not needed for subsort polymorphic operations, because subsorts appear as subset inclusions of the data items. Also, for regular signatures (in the sense of Definition 10.2.5 below), expressions involving subsort polymorphism always have a smallest sort. OSA also accommodates coercions and multiple representation polymorphism [138], although we do not treat this topic here. Parametric polymorphism in OBJ is supported by parameterized objects, such as LIST[X], that provide higher-order capabilities in a first-order algebraic setting [61].
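A parameterized object has roughly the following shape in OBJ3; here TRIV is the standard trivial theory with one sort Elt, while the object names PLIST and NAT-PLIST, and the instantiation by a default view, are our own illustrative sketch rather than the book's own list module:

```
obj PLIST[X :: TRIV] is
  sort List .
  op nil : -> List .
  op cons : Elt List -> List .
endo

*** an instantiation, playing the role of Strachey's "α list" at α = Nat
obj NAT-PLIST is pr PLIST[NAT] . endo
```

The parameter X ranges over interpretations of the theory TRIV, which is how a first-order algebraic setting captures the parametric polymorphism of Map-like functions.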
So it seems fair to conclude that Strachey was excessively pessimistic about the amount of structure in polymorphism that is not parametric in his sense, since OSA is a simple but rich mathematical theory that is easily implemented and is far from ad hoc in the pejorative sense of being arbitrary.

This chapter first generalizes our treatment of many-sorted algebra (MSA) to OSA, with illustrative examples, and then treats retracts, which enable error handling, and order-sorted term rewriting, which provides an operational semantics; details similar to MSA are sometimes omitted.

Signature, Algebra, and Homomorphism

This section introduces and briefly illustrates the three most basic concepts of order-sorted algebra; each is a straightforward extension of the corresponding MSA concept.

Definition 10.1.1 An order-sorted signature (S, ≤, Σ) consists of

1. a many-sorted signature (S, Σ), with
2. a partial ordering ≤ on S such that the following monotonicity condition is satisfied,

  σ ∈ Σ_{w1,s1} ∩ Σ_{w2,s2} and w1 ≤ w2 imply s1 ≤ s2 . □

In OBJ notation, the sort set S and the operation set Σ are just the same for OSA signatures as for MSA signatures. The new ingredient is the partial ordering on S, which is declared by giving a set of subsort pairs, of the form S1 < S2; these can be strung together in declarations of the form S1 < S2 < ... < Sn, which abbreviates S1 < S2, S2 < S3, etc. In each case, the declaration must be preceded by the keyword subsort or subsorts, and terminated with a period (preceded by a space, as usual). The partial ordering defined on S is the least such containing the given set of pairs. This syntax is illustrated in the following:

Example 10.1.2 Below is the signature part of a specification for lists of natural numbers in a Lisp-like syntax:

  sorts Nat NeList List .
  subsorts NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : Nat List -> NeList .
  op car : NeList -> Nat .
  op cdr : NeList -> List .
Here cons is a list constructor which adds a new number at the head of a list; car and cdr are the corresponding selectors, which select the head and tail (also called the front and the rest) of non-empty lists; and nil is the empty list. However, equations are needed to express these relationships between cons and its selectors, for which see Example 10.2.19 below. □

Definition 10.1.3 Given an order-sorted signature (S, ≤, Σ), an order-sorted (S, ≤, Σ)-algebra is a many-sorted (S, Σ)-algebra M such that

1. s1 ≤ s2 implies M_{s1} ⊆ M_{s2}, and
2. σ ∈ Σ_{w1,s1} ∩ Σ_{w2,s2} and w1 ≤ w2 imply that M_σ^{w1,s1} = M_σ^{w2,s2} on M_{w1}. □

The second condition says that overloaded operations are consistent under restriction; this expresses subsort polymorphism.

Example 10.1.4 Letting Σ be the signature of Example 10.1.2, define a Σ-algebra F as follows:

  F_Nat = {0}
  F_NeList = {0}
  F_List = {0, nil}

with s(0) = 0, cons(0, L) = 0, car(L) = 0, and cdr(L) = nil. Of course, this is not the "intended" standard or initial model, which instead is defined as follows:

  L_Nat = {0, s0, ss0, ...}
  L_NeList = L_Nat^+
  L_List = L_NeList ∪ {nil}

where (-)^+ denotes the nonempty finite sequence constructor. Finally, the operations of L are defined by

  cons(N, N1 ... Nn) = N N1 ... Nn
  car(N1 ... Nn) = N1
  cdr(N1 ... Nn) = N2 ... Nn   if n > 1
  cdr(N1 ... Nn) = nil         if n = 1  □

Definition 10.1.5 Given order-sorted (S, ≤, Σ)-algebras M, M′, an order-sorted (S, ≤, Σ)-homomorphism h : M → M′ is a many-sorted Σ-homomorphism h : M → M′ such that s1 ≤ s2 implies h_{s1} = h_{s2} on M_{s1}. (This is also called the monotonicity condition.) □

Exercise 10.1.1 Adopting the notation of Example 10.1.4, show that there is a unique Σ-homomorphism L → F. In fact, F is a final Σ-algebra, and we will see later that L is an initial (Σ, E)-algebra for the most reasonable and expected equation set E.
□

Term and Equation

The main topic of this section is order-sorted equations and their satisfaction. This requires that we first treat OSA terms and substitutions. We will see that there are some slightly subtle points about parsing terms, for which we later introduce the notion of a regular signature.

Definition 10.2.1 Given an order-sorted signature Σ, the S-indexed set T_Σ of Σ-terms is defined recursively by the following:

1. Σ_{[],s} ⊆ T_{Σ,s} for s ∈ S,
2. s1 ≤ s2 implies T_{Σ,s1} ⊆ T_{Σ,s2},
3. σ ∈ Σ_{w,s} and t_i ∈ T_{Σ,s_i} for i = 1, ..., n imply σ(t1, ..., tn) ∈ T_{Σ,s}, where w = s1 ... sn and n > 0. □

Example 10.2.2 Let Σ denote the signature of Example 10.1.2 with car and cdr removed. Then the following are terms of sort List:

  nil
  cons(0, nil), cons(s0, nil), ...
  cons(0, cons(0, nil)), cons(s0, cons(0, nil)), cons(s0, cons(s0, nil)), ...
  ...

All except nil are also of sort NeList. □

Notice that conditions 1 and 3 in Definition 10.2.1 are the same as for MSA; condition 2 is needed to satisfy the first condition in Definition 10.1.3. Strictly speaking, we should have used underlined parentheses, σ(t1 ... tn), in the above, as we did for MSA terms in Definition 3.7.5. We make this indexed family of Σ-terms into a Σ-algebra as follows:

Definition 10.2.3 Given σ ∈ Σ_{w,s} with w = s1 ... sn ≠ [], define (T_Σ)_σ : (T_Σ)_w → (T_Σ)_s by (T_Σ)_σ(t1, ..., tn) = σ(t1, ..., tn); and when w = [], define (T_Σ)_σ = σ. The resulting Σ-algebra is called the Σ-term (or sometimes word) algebra, and denoted T_Σ. □

[Figure 10.1: Visualizing Regularity]

Example 10.2.4 The terms listed in Example 10.2.2, plus others hinted at there, form the carrier of sort List of the term algebra T_Σ for the signature of Example 10.1.2; all but nil are also in the carrier of sort NeList, while the carrier of sort Nat contains the usual Peano numbers. □
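Such sorts can be inspected mechanically, since OBJ3 provides a parse command that reports the parse of a term together with its sort. Here is a sketch, with the cons fragment of the signature of Example 10.1.2 wrapped in an object whose name LISTSIG is our own:

```
obj LISTSIG is
  sorts Nat NeList List .
  subsorts NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : Nat List -> NeList .
endo

parse nil .           *** sort List
parse cons(0, nil) .  *** sort NeList
```

The exact output format depends on the OBJ3 implementation, but the sorts reported should agree with the carriers described in Example 10.2.4.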
Exercise 10.2.1 Check that T_Σ of Definition 10.2.3 is an order-sorted Σ-algebra, by checking the two conditions in Definition 10.1.3. □

The next definition is motivated by wanting each Σ-term to have a unique parse of least sort. Notice that this is precisely what happens in the example arithmetic system that we discussed in the introduction to this chapter: we want each term to have the most specific possible sort; for example, the least sort of -2 should be Integer, rather than Rational, and it is not a Natural. It is natural to achieve this by requiring that the set of ranks that each overloaded operation might have (in a given context) has a least element:

Definition 10.2.5 An order-sorted signature (S, ≤, Σ) is regular iff for each σ ∈ Σ_{w1,s1} and each w0 ≤ w1 there is a unique least element in the set {(w, s) | σ ∈ Σ_{w,s} and w ≥ w0}. □

This says that the set of possible ranks for σ with arity greater than any fixed w0 has a smallest element; see Figure 10.1, in which the vertical lines indicate subsort relations. To explore the consequences of this condition, we consider some examples where it is not satisfied:

Example 10.2.6 The signature

  sorts s1 s2 s3 s4 s5 .
  subsort s1 < s3 .
  subsort s2 < s4 .
  op a : -> s1 .
  op b : -> s2 .
  op f : s1 s4 -> s5 .
  op f : s3 s2 -> s5 .

is non-regular, because the set of ranks for f with arity at least s1 s2 consists of the two tuples (s1 s4, s5) and (s3 s2, s5), neither of which is less than the other. Therefore the term f(a,b) does not have a least sort, but instead has two incompatible parses. □

Exercise 10.2.2 Show that the following is a non-regular signature:

  sorts Nat NeList List .
  subsort Nat < NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : NeList List -> NeList .
  op cons : List NeList -> NeList .

This defines a complex kind of list of natural numbers, in which lists can be elements of other lists, with non-empty lists distinguished from the empty list nil.
But its non-regularity shows that this distinction is not sufficiently careful; see Exercise 10.2.4. □

Non-regular signatures can often be made regular by adding some new subsort declarations and/or by changing the ranks of some operations.

Exercise 10.2.3 Show that adding the operation declaration f : s1 s2 -> s5 to the signature of Example 10.2.6 gives a regular signature. □

Exercise 10.2.4 Show how to modify the signature of Exercise 10.2.2 to make it regular. Hint: Add a new operation declaration to further overload cons. □

Proposition 10.2.7 If Σ is regular, then for each t ∈ T_Σ there is a least sort s ∈ S such that t ∈ T_{Σ,s}. This sort is denoted LS(t).

Proof: We proceed by induction on the depth of terms in T_Σ. If t ∈ T_Σ has depth 0, then t = σ for some σ ∈ Σ_{[],s}, and so by regularity with w0 = w1 = [], there is a least s ∈ S such that σ ∈ Σ_{[],s}; this is the least sort of σ. Now consider a well-formed term t = σ(t1 ... tn) ∈ T_{Σ,s} of depth n + 1. Then each t_i has depth ≤ n and therefore by the induction
The following example shows that if Σ is not regular, then T Σ is not necessarily initial: Exercise 10.2.5 Define an algebra M over the signature of Example 10.2.6 asfollows: M s1 = M s2 = M s3 = M s4 = { } , M s5 = { , } , M a = M b = , M s1 s4 , s5f ( , ) = , M s3 s2 , s5f ( , ) = 2. Now show that if T is the termalgebra for the signature given in Example 10.2.6, then T s = { f ( a , b ) } ,and conclude from this that T is not initial. (cid:2) However, [68] shows that initial Σ -algebras do exist even when Σ is notregular. Rather than just Σ -terms, the construction uses terms anno-tated with sort information; the notation T Σ may be used. We omitdetails, which are very similar to those for the many-sorted case (seeSection 3.2). Again as for MSA, we may sometimes write T Σ when wereally mean T Σ , and we may ignore the sort annotations, even thoughthey are necessary, because they are implicit in parsing; this is consis-tent with what is done in implementations of OBJ.As in the many-sorted case, it is convenient (but not always nec-essary) for each variable symbol to have just one sort; therefore weassume that any S -indexed set X = { X s | s ∈ S } used to provide vari-ables is such that X s and X s are disjoint whenever s ≠ s , and suchthat all symbols in X are distinct from those in Σ ; we may use theterm variable set for such indexed sets. Then as in the many-sortedcase, we define the signature Σ (X) by Σ (X) w,s = Σ w,s for w ≠ [] , and Σ (X) [],s = Σ [],s ∪ X s . We can now form the Σ (X) -term algebra T Σ (X) and view it as a Σ -algebra, which is then denoted T Σ (X) . The following isproved in Appendix B: Theorem 10.2.9 If (S, ≤ , Σ ) is regular, then T Σ (X) is a free Σ -algebra on X , inthe sense that for each Σ -algebra M and each assignment a : X → M ,there is a unique Σ -homomorphism a : T Σ (X) → M such that a(x) = a(x) for all x in X . 
□

This result also generalizes to non-regular signatures, using the annotated term algebra in place of T_Σ(X), though we omit the details. The same applies to the following, the proof of which is just the same as for Proposition 3.5.1 for MSA:

Proposition 10.2.10 Given an OSA signature Σ, a ground signature X disjoint from Σ, an OSA Σ-algebra M, and a map a : X → M, there is a unique Σ-homomorphism ā : T_Σ(X) → M which extends a, in the sense that ā_s(x) = a_s(x) for each s ∈ S and x ∈ X_s. □

Substitutions and their composition can now be defined just as in MSA, simply replacing the many-sorted term algebra T_Σ(Y) by its order-sorted counterpart:

Definition 10.2.11 A substitution of Σ-terms with variables in Y for variables in X is an arrow a : X → T_Σ(Y); the notation a : X → Y may also be used. The application of a to t ∈ T_Σ(X) is ā(t). Given substitutions a : X → T_Σ(Y) and b : Y → T_Σ(Z), their composition a ; b (as substitutions) is the S-sorted arrow a ; b̄ : X → T_Σ(Z). □

We also use Notation 3.5.3 for OSA substitutions: Given t ∈ T_Σ(X) and a : X → T_Σ(Y) with |X| = {x1, ..., xn} and a(x_i) = t_i for i = 1, ..., n, write ā(t) as t(x1 ← t1, x2 ← t2, ..., xn ← tn), omitting x_i ← t_i when t_i is x_i.

Proposition 10.2.12 OSA substitutions are sort decreasing, in that LS(θ(x)) ≤ s for any x ∈ X_s, and more generally, LS(θ̄(t)) ≤ LS(t) for any Σ-term t.

Proof: The first assertion follows because θ(x) ∈ T_Σ(Y)_s and LS(t) ≤ s for all t ∈ T_Σ(Y)_s. The second assertion can be proved by an induction similar to that used for Proposition 10.2.7, by applying condition 2 of Definition 10.1.1. □

The composition of substitutions is associative, by exactly the same proof used for the MSA case (Proposition 3.6.5, except using Proposition 10.2.10 instead of 3.5.1):

Proposition 10.2.13 Given substitutions a : W → T_Σ(X), b : X → T_Σ(Y), and c : Y → T_Σ(Z), then (a ; b) ; c = a ; (b ; c).
□

The following consequence of the proof is important for calculations in several proofs:

Corollary 10.2.14 Given substitutions a : W → T_Σ(X) and b : X → T_Σ(Y), then ā ; b̄ = (a ; b)‾ . □

Definition 10.2.15 below introduces concepts which help define the terms for which our order-sorted equational satisfaction makes sense. Their motivation is that the two terms in an equation must have sorts that are somehow connected. For example, an equation such as

  (∀n) 5n = true

really cannot make sense; on the other hand, the equation

  (∀n) 5n = i ,

where n is a natural and i is the square root of minus one, does make sense, even though it is not satisfied by the standard model of the number system. This is because 5n and i are in the same connected component of S, while 5n and true are not, where we say that s1 and s2 are in the same connected component of S iff s1 ≡ s2, where ≡ is the least equivalence relation on S that contains ≤. Here is the formalization:

Definition 10.2.15 A partial ordering (S, ≤) is filtered iff for all s1, s2 ∈ S, there is some s ∈ S such that s1 ≤ s and s2 ≤ s. A partial ordering (S, ≤) is locally filtered iff every connected component of it is filtered. An order-sorted signature (S, ≤, Σ) is locally filtered iff (S, ≤) is locally filtered, and it is a coherent signature iff it is both locally filtered and regular. A partial ordering (S, ≤) has a top iff it contains a (necessarily unique) maximum element u ∈ S, such that s ≤ u for all s ∈ S. □

Hereafter we assume that all OSA signatures are coherent unless otherwise stated. Assuming local filtration is not at all restrictive in practice, because we can always add top elements to connected components, or even to the whole partial ordering (see Exercise 10.2.6). Indeed, OBJ3 does just this, with its Universal sort.
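For example, a top can be added to a component by hand with a declaration of roughly the following form, where the object and sort names are our own illustrative choices:

```
obj TOPPED is
  sorts A B Top .
  *** Top is an upper bound for A and B, so their component is filtered
  subsorts A B < Top .
endo
```

After this declaration, any two sorts in the component have Top as a common upper bound, which is exactly the filtration condition of Definition 10.2.15.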
The need for local filtration is shown in Example 10.2.17 below, and local filtration is also used in the quotient construction of Definition 10.4.6 and in the application of that construction to order-sorted rewriting modulo equations in Section 10.7.

Exercise 10.2.6 Show that any filtered partial order is locally filtered, and that any partial order with a top is filtered. Give examples showing that the converse assertions are false. □

Definition 10.2.16 An order-sorted Σ-equation is a triple ⟨X, t1, t2⟩ where X is a variable set, and t1, t2 ∈ T_Σ(X) are such that LS(t1) and LS(t2) are in the same connected component of (S, ≤); we shall of course write (∀X) t1 = t2, or in concrete cases, things like (∀x, y, z) t1 = t2. A Σ-algebra M satisfies a Σ-equation (∀X) t1 = t2 iff for all assignments a : X → M we have ā(t1) = ā(t2).

A conditional Σ-equation is a quadruple ⟨X, t1, t2, C⟩, where ⟨X, t1, t2⟩ is a Σ-equation and C is a finite set of pairs ⟨u, v⟩ such that ⟨X, u, v⟩ is a Σ-equation; we shall write (∀X) t1 = t2 if C, or more concretely, things like (∀x, y) t1 = t2 if u1 = v1, u2 = v2. A Σ-algebra M satisfies a conditional Σ-equation (∀X) t1 = t2 if C iff for all assignments a : X → M, whenever ā(u) = ā(v) for all ⟨u, v⟩ ∈ C, then ā(t1) = ā(t2).

Finally, we say that a Σ-algebra M satisfies a set A of Σ-equations (conditional or not) iff it satisfies each one of them. In this case, we write M ⊨ A, or possibly M ⊨_Σ A. □

Although satisfaction makes sense when the set of conditions is infinite, we have restricted the definition to finite C because this is needed for both deduction and rewriting. The following gives another reason why local filtration is necessary:

Example 10.2.17 We show that without local filtration, equational satisfaction is not invariant under isomorphism.
Given the specification

  th NON-LF is
    sorts A B C .
    subsorts B < A C .
    op a : -> A .
    op b : -> B .
    op c : -> C .
    eq a = c .
  endth

let Σ denote its signature. Then the initial order-sorted Σ-algebra T_Σ has (T_Σ)_A = {a, b}, (T_Σ)_B = {b}, and (T_Σ)_C = {b, c}, and it does not satisfy the equation, whereas the Σ-isomorphic algebra A with A_A = {b, d}, A_B = {b}, and A_C = {b, d} does satisfy the equation, where both a and c are interpreted as d in A. (See Exercise 10.7.1 for some further related discussion.) □

The undesirable phenomenon of Example 10.2.17 is impossible for locally filtered signatures:

Proposition 10.2.18 If Σ is a coherent OSA signature and A, B are Σ-isomorphic algebras, then A satisfies an equation (∀X) t = t′ iff B does.

Proof: By symmetry of the isomorphism relation, it suffices to prove just one direction. So assume A satisfies the equation, let f : A → B be a Σ-isomorphism, and let β : X → B be an assignment. Then β = α ; f for some assignment α : X → A. Therefore β̄ = ᾱ ; f, so that if s ≥ LS(t), LS(t′), then

  β̄_s(t) = f_s(ᾱ_s(t)) = f_s(ᾱ_s(t′)) = β̄_s(t′)

as desired. □

Exercise 10.2.7 Generalize Proposition 10.2.18 from unconditional to conditional equations. □

Example 10.2.19 (Errors for Lists) Example 10.1.2 noted that equations are needed to give car and cdr the desired meanings. The following gives those equations in an appropriate object (for convenience, we import the natural numbers instead of defining them from scratch):

  obj LIST is sorts List NeList .
    pr NAT .
    subsorts NeList < List .
    op nil : -> List .
    op cons : Nat List -> NeList .
    op car : NeList -> Nat .
    op cdr : NeList -> List .
    var L : List . var N : Nat .
    eq car(cons(N,L)) = N .
    eq cdr(cons(N,L)) = L .
  endo

The initial algebra of this specification is what one would expect, noting that the terms car(nil) and cdr(nil) do not parse, and hence are not in it.
However, because these terms, and the many others of which they are subterms, represent errors, what we really want is for them to be proper terms, but of a different sort, so that we can "handle" or "trap" them as errors, without having to invoke any nasty imperative features, as is done in most functional programming languages. The following shows that OSA provides an elegant solution for this problem, and it has even been shown that MSA cannot provide a satisfactory solution [138].

  obj ELIST is sorts List ErrList ErrNat .
    pr NAT .
    subsort List < ErrList .
    subsort Nat < ErrNat .
    op nil : -> List .
    op cons : Nat List -> List .
    op car : List -> ErrNat .
    op cdr : List -> ErrList .
    var N : Nat . var L : List .
    eq car(cons(N,L)) = N .
    eq cdr(cons(N,L)) = L .
    op nohead : -> ErrNat .
    eq car(nil) = nohead .
    op notail : -> ErrList .
    eq cdr(nil) = notail .
  endo

Now car(nil) is a proper term of sort ErrNat rather than Nat, and similarly for cdr(nil). Therefore we can write equations that have such "error expressions" in their leftsides, as above. Of course, more than this is needed to get the right behavior in realistic situations, as further discussed in Section 10.6. (Note that expressions like cons(2, notail) fail to parse, and hence are not terms for this specification, although if they are executed in OBJ3, some interesting things happen with retracts, as explained in Example 10.6.1.) □

Exercise 10.2.8 Define appropriate error supersorts and error messages for operations applied to the empty stack, using the basic specification below as your starting point:

  obj STACK is pr NAT .
    sort Stack .
    op empty : -> Stack .
    op push : Nat Stack -> Stack .
    op top_ : Stack -> Nat .
    op pop_ : Stack -> Stack .
    var X : Nat .
    var S : Stack .
    eq top push(X,S) = X .
    eq pop push(X,S) = S .
  endo

You should also define and run some test cases for your code. □

Deduction

Equational deduction also generalizes to order-sorted algebra.
The following rules use the same notation as Section 8.4.1, in which the hypotheses of a rule are above a horizontal line, while the conclusion is given below the line.

Definition 10.3.1 (Order-sorted equational deduction) Let A be a set of Σ-equations. If an equation e can be deduced using the rules (1–5C) below from a given set A of (possibly conditional) equations, then we write A ⊢ e, or possibly A ⊢_Σ e.

(1) Reflexivity:

      -------------
      (∀X) t = t

(2) Symmetry:

      (∀X) t = t′
      -------------
      (∀X) t′ = t

(3) Transitivity:

      (∀X) t = t′ ,  (∀X) t′ = t″
      ----------------------------
      (∀X) t = t″

(4) Congruence:

      (∀X) θ(y) = θ′(y)  for each y ∈ Y
      ----------------------------------
      (∀X) θ̄(t) = θ̄′(t)

    where θ, θ′ : Y → T_Σ(X) and where t ∈ T_Σ(Y).

(5C) Conditional Instantiation:

      (∀X) θ̄(v) = θ̄(v′)  for each v = v′ in C
      -----------------------------------------
      (∀X) θ̄(t) = θ̄(t′)

    where θ : Y → T_Σ(X) and where (∀Y) t = t′ if C is in A.

There is also an unconditional version of (5C):

(5) Instantiation:

      -------------------
      (∀X) θ̄(t) = θ̄(t′)

    where θ : Y → T_Σ(X) and where (∀Y) t = t′ is in A.

We can use the same notation for deduction using (1–5) as for using (1–5C), because (5) is a special case of (5C). □

These rules of deduction are essentially the same as for MSA, except for the restriction on substitutions given in Proposition 10.2.12, and the same results as in Chapter 4 for MSA deduction generally carry over. In particular, these rules are sound and complete:

Theorem 10.3.2 (Completeness) Given a coherent order-sorted signature Σ, an unconditional equation e can be deduced from a given set A of (possibly conditional) equations iff it is true in every model of A; that is, A ⊢ e iff A ⊨ e. □

The proof is given in Appendix B, but it uses results from the next section, because for expository purposes, we have stated this result earlier than required by the logical flow of proof.
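Simple consequences of such specifications can also be checked mechanically by reduction, which implements a restricted form of this deduction; here is a sketch using the LIST object of Example 10.2.19, where the constant l is our own addition:

```
open LIST .
op l : -> List .
red car(cons(2, l)) .  *** reduces to 2 by the first equation
red cdr(cons(2, l)) .  *** reduces to l by the second equation
close
```

Each red command applies instances of the equations of LIST left to right, which corresponds to restricted uses of rules (1), (3), and (5) above; the general apply feature of OBJ3 gives finer control, including backward application.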
Exercise 10.3.1 Show that rule (0) (Assumption) of Chapter 4 (but for OSA) is a special case of rule (5) above. □

It is straightforward to generalize the material on subterm replacement for conditional equations in Section 4.9 to the order-sorted case. The basic rule is as follows:

(+6C) Forward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t1)) = t(z ← θ(t2))
is also deducible.

The substitutions t(z ← θ(t_i)) are valid because LS(θ(t_i)) ≤ LS(t_i) by Proposition 10.2.12 and LS(t_i) ≤ s by assumption.

Exercises 4.9.1–4.9.4 also generalize, as does the reversed version of (+6C):

(−6C) Backward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t2)) = t(z ← θ(t1))
is also deducible.

Soundness of this rule follows as in the MSA case, by applying the symmetry rule, and so we also get:

(±6C) Bidirectional Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C or (∀Y) t2 = t1 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t1)) = t(z ← θ(t2))
is also deducible.

We now have the following important completeness result, which is proved in Appendix B:

Theorem 10.3.3 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,
      A ⊢_{(1–5C)} e  iff  A ⊢_{(1,3,±6C)} e.
□ As in Chapter 4, the rules (+6C), (−6C), and (±6C) can each be specialized to the case where t has exactly one occurrence of z, and these variants are indicated by writing 6¹ instead of 6; we do not write them out here (but see Definition 10.7.1 in Section 10.7 below). The following completeness result can now be proved in much the same way as Corollary 4.9.2:

Corollary 10.3.4 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,
      A ⊢_{(1–6C)} e  iff  A ⊢_{(1,3,±6¹C)} e  iff  A ⊢_{(1,2,3,6¹C)} e. □

Exercise 10.3.2 Use order-sorted equational deduction to prove the equation f(a) = f(b) for the following specification:

  th OSRW-EQ is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    op f : A -> C .
    eq c = a .
    eq c = b .
  endth

You may use OBJ's apply feature. Hint: First prove a = b as a lemma. □

This section develops some more theoretical topics in order-sorted algebra; in general, they are straightforward extensions of the corresponding MSA topics, the main exception being the treatment of subsorts in quotients. As in MSA, the completeness of deduction (Theorem 10.3.2) can be used to construct initial and free algebras when there are equations, by defining an S-sorted relation ≃_{A,X} on T_Σ(X), for X a variable set, by
      t ≃_{A,X} t′  iff  A ⊢ (∀X) t = t′
using the rules in Definition 10.3.1. Since this relation is an order-sorted congruence in the sense of Definition 10.4.1 immediately below, we can define T_{Σ,A}(X) to be the quotient of T_Σ(X) by ≃_{A,X}, using the quotient construction given in Definition 10.4.6 below.
In preparation for the OSA notion of congruence, one should first recall from Definition 6.1.1 that, given a many-sorted signature (S, Σ), a many-sorted Σ-congruence ≡ on a many-sorted Σ-algebra M is an S-sorted family {≡_s | s ∈ S} of equivalence relations, with ≡_s on M_s, such that

(1) given σ ∈ Σ_{w,s} with w = s1...sn and a_i, a′_i ∈ M_{si} for i = 1,...,n such that a_i ≡_{si} a′_i, then M_σ(a1,...,an) ≡_s M_σ(a′1,...,a′n).

Definition 10.4.1 For (S, ≤, Σ) an order-sorted signature and M an order-sorted Σ-algebra, an order-sorted Σ-congruence ≡ on M is a many-sorted Σ-congruence ≡ such that

(2) if s ≤ s′ in S and a, a′ ∈ M_s, then a ≡_s a′ iff a ≡_{s′} a′.

An order-sorted (S, ≤, Σ)-algebra M′ is an order-sorted subalgebra of another such algebra M iff it is a many-sorted subalgebra such that M′_s ⊆ M′_{s′} whenever s ≤ s′ in S. □

Exercise 10.4.1 Show that the intersection of any set of order-sorted Σ-congruences on an order-sorted Σ-algebra M is also an order-sorted Σ-congruence on M. Conclude from this that any S-sorted family R of binary relations R_s on M_s for s ∈ S is contained in a least order-sorted Σ-congruence on M. Hint: The set of congruences that contain R is non-empty because it contains the relation that identifies everything (for each sort). □

Example 10.4.2 Let M be the initial Σ-algebra T_Σ, where Σ is the ELIST signature from Example 10.2.19, and let ≡ be the congruence generated by its equations, i.e., the least Σ-congruence that contains all ground instances of the equations in ELIST. Then cdr(cons(0,nil)) ≡ nil on both sorts List and ErrList, consistent with List < ErrList, while cdr(cdr(cons(0,nil))) ≡ notail for the sort ErrList. □

Fact 10.4.3 ≃_{A,X} is an order-sorted congruence relation.
Proof: It is easy to see that ≃_{A,X} is reflexive, symmetric, and transitive from rules (1), (2) and (3) of Definition 10.3.1, respectively, and the Σ-congruence property follows from rule (4). To prove (2) of Definition 10.4.1, suppose that s ≤ s′ and that t ≃_{A,X} t′ for t, t′ ∈ T_Σ(X)_s. Then also t, t′ ∈ T_Σ(X)_{s′}, and the same proof that showed A ⊢ (∀X) t = t′ for sort s also works for sort s′, and vice versa. □

Recall from Definition 6.1.5 that, given a many-sorted Σ-homomorphism f : M → M′, the kernel of f, denoted ker(f), is the S-sorted family of equivalence relations ≡_f defined by
      a ≡_{f,s} a′  iff  f_s(a) = f_s(a′),
and the image of f is the subalgebra f(M) with f(M)_s = f(M_s) for each s ∈ S; it may also be denoted im(f). The following shows that these concepts extend easily from MSA to OSA:

Proposition 10.4.4 If f : M → M′ is an order-sorted Σ-homomorphism, then

1. ker(f) is an order-sorted Σ-congruence on M;
2. f(M) is an order-sorted subalgebra of M′.

Proof: Proposition 6.1.6 showed that each ≡_{f,s} is an equivalence relation satisfying the congruence property (i.e., (1) above). Property (2) above follows from the fact that f_s(a) = f_{s′}(a) and f_s(a′) = f_{s′}(a′) for any s ≤ s′ in S and any a, a′ ∈ M_s.

Assertion 2. was proved in Proposition 6.1.6 for MSA, so we need only check the order-sorted subalgebra condition of Definition 10.4.1, which is an easy (set-theoretic) consequence of the fact that f is order-sorted. □

Example 10.4.5 Let LELIST denote the result of making the specification ELIST of Example 10.2.19 entirely loose, including the imported natural numbers, say with the Peano signature, although we will use ordinary decimal notation for convenience.
Let M be the ELIST-algebra with elements of sort List just the lists of natural numbers; with elements of sort ErrList those of sort List plus notail; and with elements of sort ErrNat the natural numbers plus nohead. Note that expressions like cons(0, notail) and cons(nohead, (0,1,2)) are simply not in this algebra.

Now let h : M → M′ be the S-sorted map that sends each natural number to that number modulo 3, each list to the corresponding list of numbers modulo 3, and notail and nohead to themselves. Then h is a Σ-homomorphism, and h(M) is the LELIST-algebra M′ with M′_Nat = {0, 1, 2}, with M′_List the lists of numbers from {0, 1, 2}, with M′_ErrNat = M′_Nat ∪ {nohead}, and M′_ErrList = M′_List ∪ {notail}.

If we let R denote the kernel of h, then n R_Nat n′ iff n − n′ is divisible by 3, and ℓ R_List ℓ′ iff ℓ and ℓ′ have the same length N and ℓ_i − ℓ′_i is divisible by 3 for i = 1,...,N. Also, n R_ErrNat n′ iff n R_Nat n′ or n = n′ = nohead, and ℓ R_ErrList ℓ′ iff ℓ R_List ℓ′ or ℓ = ℓ′ = notail. □

Exercise 10.4.2 If f : M → M′ is an OSA Σ-homomorphism and M0 ⊆ M is a Σ-subalgebra, show that f(M0) is a Σ-subalgebra of f(M).
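The kernel computations in Example 10.4.5 are easy to check concretely. The following is a minimal Python sketch (an illustration only, with an invented encoding: lists as tuples, and the error constants as strings):

```python
# Sketch of the homomorphism h of Example 10.4.5: h reduces every natural
# number mod 3, maps each list pointwise, and fixes the error constants
# nohead and notail. The kernel of h is then just "h gives equal values".
def h(x):
    if x in ("nohead", "notail"):
        return x
    if isinstance(x, tuple):            # a list of naturals
        return tuple(n % 3 for n in x)
    return x % 3                        # a natural number

def in_kernel(x, y):                    # x ker(h) y  iff  h(x) = h(y)
    return h(x) == h(y)

print(in_kernel(4, 7))                  # True: 4 - 7 is divisible by 3
print(in_kernel((0, 1, 2), (3, 4, 5))) # True: same length, pointwise mod 3
print(in_kernel((0, 1), (0, 1, 2)))    # False: different lengths
```

Note how the error constants form their own singleton kernel classes, matching the clauses for R_ErrNat and R_ErrList above.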
□ We now define the quotient of an order-sorted algebra by a congruence relation; following [82], the construction exploits local filtration to enable identifications across subsorts:

Definition 10.4.6 For (S, ≤, Σ) a locally filtered order-sorted signature, M an order-sorted Σ-algebra, and ≡ an order-sorted Σ-congruence on M, the quotient of M by ≡ is the order-sorted Σ-algebra M/≡ defined as follows: for each connected component C, let M_C = ⋃_{s∈C} M_s, and define the congruence relation ≡_C by a ≡_C a′ iff there is a sort s ∈ C such that a ≡_s a′. Then ≡_C is clearly reflexive and symmetric. It is transitive because a ≡_s a′ and a′ ≡_{s′} a″ yield a ≡_{s″} a″ for s″ ≥ s, s′, which exists by local filtration. The inclusion M_s ⊆ M_C induces an injective map M_s/≡_s → M_C/≡_C, because for a, a′ ∈ M_s we have that a ≡_s a′ implies a ≡_C a′ by construction, and conversely a ≡_C a′ implies a ≡_{s′} a′ for some s′ ∈ C, and taking s″ ≥ s, s′ it also implies a ≡_{s″} a′, and therefore it implies a ≡_s a′ by property (2) of the definition of order-sorted congruence. Denoting by q_C the natural projection q_C : M_C → M_C/≡_C of each element a to its ≡_C-equivalence class, we define the carrier (M/≡)_s of sort s in the quotient algebra to be the image q_C(M_s). The order-sorted algebra M/≡ comes equipped with a surjective order-sorted Σ-homomorphism q : M → M/≡, defined to be the restriction of q_C to each of its sorts, and called the quotient map associated to the congruence ≡. The operations are defined by
      (M/≡)_σ([a1],...,[an]) = [M_σ(a1,...,an)],
and are well defined because ≡ is an order-sorted Σ-congruence.
□ The following illustrates the above construction:

Example 10.4.7 For a given Σ-theory B, consider the relation ≡ on T_Σ defined for LS(t), LS(t′) ≤ s by t ≡_s t′ iff there is a proof that t = t′ in which every term used has least sort ≤ s; it is not difficult to check that ≡ is a Σ-congruence. Now define B by the following:

  th OSTH is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    eq c = a .
    eq c = b .
  endth

[Figure 10.2: Condition (2) of Universal Property of Quotient — the triangle q : A → A/R, f : A → B, v : A/R → B with q;v = f.]

Then under the ordinary quotient construction (as in Appendix C), the ≡-equivalence class [a]_A of a for sort A is {a}, and also [b]_A = {b}, whereas [a]_C = [b]_C = {a, b, c}. However, under the construction of Definition 10.4.6, the equivalence classes of sort s collect all terms of sort s or less that can be proved equal, no matter what other sorts may be involved. So in this case, [a]_A = {a, b} and [a]_C = {a, b, c}. This shows that the construction of Definition 10.4.6 does useful additional work for certain relations, although ≃_B is not one of these. □

Exercise 10.4.3 Show that the relation ≡ in Example 10.4.7 is a Σ-congruence. □

The following is straightforward from the definitions:

Fact 10.4.8 Under the assumptions of Definition 10.4.6, ker(q) = ≡. □

Exercise 10.4.1 allows us to extend the construction of Definition 10.4.6 to quotients by an arbitrary relation on an algebra.

Definition 10.4.9 Given an arbitrary S-sorted family R of binary relations R_s on M_s for s ∈ S, the quotient of M by R, denoted M/R, is the quotient of M by the smallest order-sorted Σ-congruence on M containing R.
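The contrast drawn in Example 10.4.7 can be checked mechanically. The following hypothetical Python sketch (not part of the OBJ text; the encoding is invented) computes the component-level closure of the OSTH equations with a union-find, and then forms the classes in the style of Definition 10.4.6:

```python
# Sorts A < C, constants a b : A and c : C, equations c = a and c = b,
# as in the theory OSTH of Example 10.4.7.
SORT = {"a": "A", "b": "A", "c": "C"}      # least sort of each constant
EQNS = [("c", "a"), ("c", "b")]

# Union-find over the ground terms of the single connected component.
parent = {t: t for t in SORT}
def find(t):
    while parent[t] != t:
        t = parent[t]
    return t
for l, r in EQNS:
    parent[find(l)] = find(r)

def component_class(t):                    # all terms provably equal to t
    return {u for u in SORT if find(u) == find(t)}

# Definition 10.4.6: the class of t at sort s collects all provably equal
# terms of sort <= s, whatever sorts the proof passes through. (Under the
# ordinary construction, a and b would stay separate at sort A, since
# every proof of a = b passes through c of sort C.)
def cls(t, s):
    leq = {("A", "A"), ("A", "C"), ("C", "C")}
    return {u for u in component_class(t) if (SORT[u], s) in leq}

print(sorted(cls("a", "A")))   # ['a', 'b']: b joins a despite the detour via c
print(sorted(cls("a", "C")))   # ['a', 'b', 'c']
```

This reproduces [a]_A = {a, b} and [a]_C = {a, b, c} from the example.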
□ Proposition 10.4.10 (Universal Property of Quotient) If Σ is a locally filtered order-sorted signature, if M is an order-sorted Σ-algebra, and if R is an S-sorted family of binary relations R_s on M_s for s ∈ S, then the quotient map q : M → M/R satisfies the following:

(1) R ⊆ ker(q), and
(2) if f : M → B is any order-sorted Σ-homomorphism such that R ⊆ ker(f), then there is a unique Σ-homomorphism v : M/R → B such that q;v = f (see Figure 10.2).

Proof: (1) follows from Fact 10.4.8 and the definition of ≡ as the smallest congruence that contains R.

For (2), let f : M → M′ be an order-sorted Σ-homomorphism such that R ⊆ ker(f). Then ker(q) ⊆ ker(f), and both are congruences, so that for each connected component C we have ker(q)_C ⊆ ker(f)_C, and there is a unique function v_C : (M/R)_C → M′_C such that v_C ∘ q_C = f_C, for f_C : M_C → M′_C defined by f_C(a) = f_s(a) if a ∈ M_s (this is well defined by local filtering). It remains only to check that, restricting v_C to each one of the sorts s ∈ C, the family {v_s | s ∈ S} thus obtained is an order-sorted Σ-homomorphism. Property (2) for order-sorted homomorphisms follows by construction. Let σ ∈ Σ_{w,s} with w = s1...sn, and let a_i ∈ M_{si} for i = 1,...,n. Then (omitting sort qualifications throughout) we have
      v((M/R)_σ([a1],...,[an])) = v([M_σ(a1,...,an)]) = f(M_σ(a1,...,an))
                                = M′_σ(f(a1),...,f(an)) = M′_σ(v([a1]),...,v([an])).
The case w = [] is left for the reader to check. □

The proof of Theorem 10.3.2 in Appendix B shows that the relation ≃_{A,X} (defined on page 334) is an order-sorted Σ-congruence. So we now define T_{Σ,A}(X) to be the quotient of T_Σ(X) by ≃_{A,X}. Also, we denote T_{Σ,A}(∅) by T_{Σ,A}.
The following is also proved in Appendix B:

Theorem 10.4.11 If Σ is coherent and A is a set of (possibly conditional) Σ-equations, then T_{Σ,A} is an initial (Σ, A)-algebra, and T_{Σ,A}(X) is a free (Σ, A)-algebra on X, in the sense that for each Σ-algebra M and each assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M such that ā(x) = a(x) for each x in X. □

Example 10.4.12 Theorem 10.4.11 implies that the algebra M of Example 10.4.5 is an initial model for LELIST, and hence a model for the original specification ELIST of Example 10.2.19. □

The following theorem generalizes Noether's first isomorphism theorem (Theorem 6.1.7) to OSA:

Theorem 10.4.13 (Homomorphism Theorem) For any Σ-homomorphism h : M → M′, there is a Σ-isomorphism M/ker(h) ≅_Σ im(h).

Proof: Let h′ : M → h(M) denote the corestriction (Appendix C reviews this concept) of h to h(M). Then Proposition 10.4.10 with R = ker(h′) = ker(h) gives a (unique) Σ-homomorphism v : M/ker(h) → h(M) such that q;v = h′, which is surjective because h′ is. We will be done if we can show that v is also injective. To this end (and omitting sort subscripts), suppose that v([a1]) = v([a2]); then h(a1) = h(a2), so that [a1] = [a2]. □

Exercise 10.4.4 For M, M′, R as in Example 10.4.5, check that M/R is Σ-isomorphic to M′. □

It is worthwhile making explicit the following consequence of the proof given in Appendix B of the Completeness Theorem (10.3.2):

Corollary 10.4.14 Given a coherent order-sorted signature Σ and a set A of (possibly conditional) Σ-equations, an equation (∀X) t = t′ is satisfied by every Σ-algebra that satisfies A iff it is satisfied by T_{Σ,A}(X). □

The theory of class deduction in Section 7.2 easily generalizes to OSA.
Definition 10.5.1 Given an order-sorted signature Σ and a set B of (possibly conditional) Σ-equations, a (conditional) Σ-equation modulo B, or (Σ, B)-equation, is a 4-tuple ⟨X, t, t′, C⟩ where t, t′ ∈ T_{Σ,B}(X) have sorts in the same connected component of Σ, and C is a finite set of pairs from T_{Σ,B}(X), again with sorts in the same connected components. Usually we write (∀X) t =_B t′ if C, and may use the same notation with t, t′, C all Σ-terms that represent their B-equivalence classes; we may also drop the B subscripts.

Given a (Σ, B)-algebra M, Σ-satisfaction modulo B, written M ⊨_{Σ,B} (∀X) t =_B t′ if C, is defined by ā(t) = ā(t′) whenever ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, for all a : X → M, where ā : T_{Σ,B}(X) → M is the unique Σ-homomorphism extending a. Given a set A of (Σ, B)-equations, A ⊨_{Σ,B} e means that M ⊨_{Σ,B} A implies M ⊨_{Σ,B} e for all B-models M. □

As in Section 7.2, class deduction versions of inference rules are obtained just by substituting T_{Σ,B} for T_Σ and =_B for =, assuming that A contains (Σ, B)-equations; we also use [A] and [e] as in Section 7.2, and the following three results have essentially the same proofs as the corresponding results there. The B-class version of rule (i) is denoted (i_B), and A ⊢_{Σ,B} [e] denotes class deduction modulo B of [e] from A, using rules (1_B)–(5C_B).

Proposition 10.5.2 (Bridge) Given sets A, B of Σ-equations and another Σ-equation e (with A, B and e possibly conditional), then
      [A] ⊢_B [e]  iff  A ∪ B ⊢ e.
Furthermore, given any (Σ, B)-algebra M and a (possibly conditional) Σ-equation e, then M ⊨_{Σ,B} [e] iff M ⊨_Σ e.
□ From this, it is not difficult to prove the following:

Theorem 10.5.3 (Completeness) Given sets A, B of Σ-equations and another Σ-equation e, all possibly conditional, then the following are equivalent:

      (1) [A] ⊢_B [e]
      (2) [A] ⊨_B [e]
      (3) A ∪ B ⊢ e
      (4) A ∪ B ⊨ e □

The above result connects OSA class inference and satisfaction with ordinary OSA inference and satisfaction.

Theorem 10.5.4 (Completeness) Given sets A, B of (possibly conditional) Σ-equations and an unconditional Σ-equation e, then [A] ⊢_B [e] iff [A] ⊢_{(1_B,3_B,±6C_B)} [e]. Moreover, [A] ⊢_{(1_B,3_B,±6C_B)} [e] iff M ⊨_B [e] for all (Σ, A ∪ B)-algebras M. □

The above result says that (6C_B), i.e., class rewriting, is complete for class deduction when combined with the reflexive, symmetric (±), and transitive rules of inference.

In strongly typed languages, some expressions may not type check, even though they have a meaningful value. For example, given a factorial function defined only on natural numbers, the expression ((- 6) / (- 2))! is not well-formed, because the parser can only determine that the argument of the factorial is a rational number, possibly negative. It is desirable to give such expressions the "benefit of the doubt," because they could evaluate to a natural (e.g., the argument above evaluates to 3). Retract functions provide this capability, by lowering the sort of a subexpression to the subsort needed for parsing. In this example, the retract function symbol
      r_{Rational,Natural} : Rational -> Natural
is automatically inserted by OBJ during rewriting, in a process called retract rewriting, to fill the gap between the parsed high sort and the required low sort, yielding the expression (r_{Rational,Natural}((- 6) / (- 2)))!, which does type check.
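The retract-insertion step just described can be sketched outside OBJ. The following hypothetical Python illustration uses the sort names and the r:s>s' notation from this chapter, but the code and its helper names are invented for illustration:

```python
# Sketch of retract insertion during sort checking: if an argument parses
# at a sort strictly above the required one (in the same connected
# component), wrap it in a retract term; if the sorts lie in different
# components, reject the expression as truly nonsensical.
SUBSORT = {("Natural", "Rational")}            # Natural < Rational

def leq(s1, s2):
    return s1 == s2 or (s1, s2) in SUBSORT

def same_component(s1, s2):
    return leq(s1, s2) or leq(s2, s1)

def coerce(term, have, want):
    if leq(have, want):
        return term                            # already well-sorted
    if same_component(have, want):
        return ("r:%s>%s" % (have, want), term)  # insert a retract
    raise TypeError("cannot parse: %s vs %s" % (have, want))

print(coerce("(- 6)/(- 2)", "Rational", "Natural"))
# -> ('r:Rational>Natural', '(- 6)/(- 2)')
```

The third branch corresponds to rejecting expressions like factorial(false) at compile time, as discussed below.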
Then we can use retract elimination equations, of the form
      r_{s,s′}(x) = x
where s′ ≤ s and x is a variable of sort s′, to eliminate retracts when their arguments do have the required sorts. When the argument's least sort is not ≤ s′, the retract remains, providing an error message that pinpoints exactly where the problem occurs and exactly what its sort gap was. For example, such a reduction can end with (r_{Rational,Natural}(1 / 3))!, indicating that the argument to factorial is the rational 1/3. Similar situations arise with the function |_|^2 in Section 10.8 below. And unlike the untyped case, truly nonsensical expressions are detected and rejected at compile time, while any expression that could possibly recover is allowed to be evaluated. By "truly nonsensical" is meant expressions like factorial(false) that contain subexpressions in the wrong connected component (assuming that booleans and natural numbers are in different connected components of the sort poset) and therefore cannot be parsed by inserting retracts. A precise semantics for retracts is given in Section 10.6.1, while the rest of this section is devoted to examples in OBJ.

Example 10.6.1 (Lists with Fewer Errors) As already noted, without retracts, terms like
      car(cdr(cons(1,cons(2,cons(3,nil)))))
do not parse in the context of the ELIST theory of Example 10.2.19, because the subterm beginning with cdr has sort ErrList, while car requires sort List as its argument. However, the correct answer (which is 2) is obtained by inserting a retract and then reducing the result. The term
      car(cdr(cdr(cons(1,nil))))
has a somewhat different behavior when retracts are added: it is temporarily accepted as
      car(r_{ErrList,List}(cdr(r_{ErrList,List}(cdr(cons(1,nil))))))
which is then reduced to the form
      car(r_{ErrList,List}(nil))
which serves as a very informative error message. □

One might think that, since this is a kind of run-time type checking, it is just operational semantics.
But our approach requires that the operational semantics agrees with the logical semantics, and retracts have a very nice logical semantics (see Section 10.6.1 below), as well as an operational semantics, which is developed in Section 10.7 below. Moreover, this kind of run-time type checking is relatively inexpensive, and in combination with the polymorphism given by subsorts and by parameterized modules, it provides the syntactic flexibility of untyped languages with all the advantages of strong typing.

The following shows that if deduction is not treated carefully, it can yield unsound results, and that naive attempts to fix this problem can greatly weaken deduction; the discussion following the example also shows that retracts again provide a nice solution.

Example 10.6.2 Consider the term f(a) for the following object:

  obj NON-DED is sorts A B .
    subsorts A < B .
    op a : -> A .
    op b : -> B .
    ops f g : A -> A .
    var X : A .
    eq f(X) = g(X) .
    eq a = b .
  endo

The first equation can deduce g(a) from f(a), and then the second equation can apparently deduce g(b) from f(a); but g(b) is not a well-formed term! The problem is that, although replacing a by b is sound in itself, it is not sound in the context of g. □

The easiest way to avoid this problem is to prohibit deductions that do not decrease sorts. But this would eliminate many important examples, such as the square norm in Section 10.8 below. A better approach is to prohibit applications yielding terms that don't parse; in fact, Definition 10.3.1 takes this approach, because its rules implicitly assume that every term occurring in them is well-formed. Unfortunately, this prohibits many correct computations, such as that above with factorial, and it also fails to inform the user what went wrong. Retracts allow us to avoid all these difficulties. The result of running OBJ3, which implements retracts, on
      red f(a) .
in the context of NON-DED is the following:

  reduce in NON-DED : f(a)
  rewrites: 2
  result A: g(r:B>A(b))

which is not only a valid deduction, but also an informative error message.

Example 10.6.2 might raise suspicions that the rules of deduction as stated in Definition 10.3.1 are unsound; but recall that those rules assume that all terms in them are well-formed, so without retracts, they disallow the deduction in Example 10.6.2. Once we formalize retracts as an order-sorted theory, the soundness of deduction using retracts follows from Theorem 10.3.2, because deduction with retracts follows exactly the same rules as deduction without them.

Raising and handling exceptions can also be given a nice semantics using retracts. This is significant because exceptions have both inadequate semantic foundations and insufficient flexibility in many programming and specification languages. Some algebraic specification languages use partial functions, which are simply undefined under exceptional conditions. This can be developed rigorously, e.g., in [111], but it has the disadvantage that neither error messages nor error recovery are possible. OSA with retracts supports both, and is fully implemented in OBJ3 and BOBJ. The following illustrates some capabilities of retracts in this respect:

Example 10.6.3 (Lists with Fewer Errors) Again in the context of the ELIST theory of Example 10.2.19, in processing large lists, explicit error messages that pinpoint exceptions might be difficult to understand. In this case, we can add some equations to simplify such expressions,

  var EN : ErrNat .
  var EL : ErrList .
  eq cons(r:ErrNat>Nat(EN), notail) = notail .
  eq cons(nohead, r:ErrList>List(EL)) = notail .
  eq car(r:ErrList>List(EL)) = nohead .
  eq cdr(r:ErrList>List(EL)) = notail .

in which case
      red car(cdr(cdr(cdr(cons(1,nil))))) .
gives just nohead as its reduced form.
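The behavior of this example can be imitated outside OBJ. The following hypothetical Python sketch treats the error constants as ordinary first-class values that propagate through car and cdr, mirroring the extra equations above (the encoding of lists as nested pairs is invented for illustration):

```python
# Error values nohead and notail live in "supersorts" and propagate as
# ordinary values, instead of raising imperative exceptions.
NOHEAD, NOTAIL = "nohead", "notail"    # the ErrNat / ErrList constants

def cons(n, lst):
    return (n, lst)                    # lists as nested pairs; nil is ()

def car(lst):                          # car : List -> ErrNat
    if lst == () or lst == NOTAIL:     # error cases yield nohead
        return NOHEAD
    return lst[0]

def cdr(lst):                          # cdr : List -> ErrList
    if lst == () or lst == NOTAIL:     # error cases yield notail
        return NOTAIL
    return lst[1]

t = cons(1, ())
print(car(cdr(cdr(cdr(t)))))           # nohead, with no exception raised
```

This reproduces the reduced form nohead reported for the OBJ reduction above.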
□ The following somewhat open-ended exercise gives a similar but more complex application of retracts:

Exercise 10.6.1 (★) Write a suitable theory for a relational database in which lists represent tuples, and so-called "null values," such as
      op nullNat : -> ErrNat .
are treated as exceptions when doing arithmetic, but do not collapse tuples to a single error message. Show that it is also possible to have both kinds of exception in a single theory, by declaring two different error supersorts of Nat. □

Retracts also support situations where data is represented in more than one way, and representations are converted to whatever form is most convenient or efficient for a given context. This kind of multiple representation is rather common, but is rarely given a semantics. A good example is Cartesian and polar coordinates for points in the plane, as developed in Section 10.9. There are also many cases involving conversion from one sort of data to another in an irreversible way; for example, to apply integer addition to two rational numbers, one might first truncate them; this is called coercion. In both multiple representation and coercion, applying functions defined on one representation to data of another is mediated by functions that change the representation; but only in the case of multiple representation are the conversions reversible.

A basic theorem about retracts asserts their consistency: the theory that results from adding retract function symbols and retract equations to an order-sorted specification is a conservative extension of the original, in the sense that the equational deduction and initial models of the original theory are not disturbed.
Thus retracts not only combine the flexibility of untyped languages with the discipline of strong typing, and give satisfactory treatments of exception handling and multiple representation, but the semantics of retracts, both deductive and model-theoretic, is just a special case of order-sorted algebra.

Example 10.6.4 The following is similar to Example 10.6.2, but more subtle.

  obj MORE-NON-DED is sorts A B C .
    subsorts A < B < C .
    op f : C -> C .
    ops f h : A -> A .
    op g : B -> B .
    op a : -> A .
    var X : B .
    eq f(X) = g(X) .
  endo

Here f(X) has sort C, which looks reasonable because the sort C is greater than the sort B of g(X). But the term h(f(a)) rewrites to h(r:B>A(g(a))), because the equation obtained from the original by specializing the variable X of sort B to a variable of sort A is not sort decreasing. Therefore, not just the original rules, but also their specializations to rules having variables of smaller sorts should be considered, as discussed in [113]; see also Definition 10.7.10. □

Exercise 10.6.2 Use OBJ to verify the assertions about its computations in Examples 10.6.1, 10.6.3, and 10.6.4. □

We have shown that strong typing is not flexible enough in practice, and have also suggested how OSA can provide the necessary flexibility with retracts. We now develop the formal semantics and prove that retracts are sound under certain mild assumptions. The first step is to extend an order-sorted signature Σ to another order-sorted signature Σ⊗ having the same sorts as Σ, and the same operation symbols as Σ, plus some new ones called retracts, of the form
      r_{s′,s} : s′ → s
for each pair s′, s with s′ > s. The semantics of retracts is then given by retract equations
      (∀x) r_{s′,s}(x) = x
for all s′ > s, where x is a variable of sort s.
Given an order-sorted signature Σ and a set A of conditional Σ-equations, extend Σ to the signature Σ⊗ by adding the retract operations, and extend A to the set of equations A⊗ by adding the retract equations. Our requirement for retracts to be well-behaved is that the theory extension (Σ, A) ⊆ (Σ⊗, A⊗) should be conservative, in the sense that for all t, t′ ∈ T_Σ(X),
      t ≃_{A,X} t′  iff  t ≃_{A⊗,X} t′.
This is equivalent in model-theoretic terms to requiring that the unique order-sorted Σ-homomorphism ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X), which leaves the elements of X fixed, is injective. We prove this under the very natural assumption on the algebras T_{Σ,A}(X) that given X ⊆ X′, then the unique Σ-homomorphism ι_{X,X′} : T_{Σ,A}(X) → T_{Σ,A}(X′), induced by the composite map X ⊆ X′ → T_{Σ,A}(X′) (first the inclusion, then the natural mapping of each variable to the class of terms equivalent to it), is injective. We will say that a presentation (Σ, A) is faithful if it satisfies this injectivity condition. Pathological, unfaithful presentations do exist, and for them the extension with retracts is not conservative, as shown by the following example from [78]:

Example 10.6.5 Let Σ have sorts a, b, u with a, b ≤ u, an operation f : a → b, no constants of sort a, constants 0, 1 of sort b, plus binary infix operations + and & and a unary prefix operation ¬, all of sort b. Let A have the equations
      ¬(f(x)) = f(x),  y + y = y,  y & y = y,
      y + (¬y) = 1,  (¬y) + y = 1,  y & (¬y) = 0,  (¬y) & y = 0,
      ¬0 = 1,  ¬1 = 0.
Then A ⊢ (∀x) 0 = 1, where x is a variable of sort a (since 1 = f(x) + ¬f(x) = f(x) + f(x) = f(x) and 0 = f(x) & ¬f(x) = f(x) & f(x) = f(x)), although (∀∅) 0 = 1 is not deducible from A. Thus (Σ, A) is not faithful. Note that T_{Σ,A} satisfies 1 ≠ 0, but T_{Σ⊗,A⊗} satisfies 1 = 0, because Σ⊗ has ground terms of sort a, such as r_{u,a}(0) and r_{u,a}(1). Thus, the extension (Σ, A) ⊆ (Σ⊗, A⊗) is not conservative.
□ There are simple conditions on both the signature Σ and on the equations A that guarantee faithfulness of a presentation (Σ, A). For arbitrary A, it is necessary and sufficient that Σ has no quasi-empty models, which are algebras B such that B_s = ∅ for some s but B_{s′} ≠ ∅ for some other sort s′ [78]. For arbitrary Σ, it is sufficient that A is Church-Rosser as a term rewriting system [136]. A proof of the following conservative extension result is given in Appendix B:

Theorem 10.6.6 If the signature Σ is coherent and (Σ, A) is faithful, then the extension (Σ, A) ⊆ (Σ⊗, A⊗) is conservative. □

This gives soundness, and Theorem 10.7.18 gives completeness.

Order-sorted rewriting arises from order-sorted equational logic in much the same way that many-sorted rewriting arises from many-sorted equational logic. We first consider unconditional rewriting, and add the assumption that B contains no conditional equations to our prior assumption that all our order-sorted signatures are coherent (Definition 10.2.15).

Definition 10.7.1 Given an order-sorted signature Σ, an unconditional order-sorted Σ-rewrite rule is an order-sorted Σ-equation (∀X) t1 = t2 such that var(t2) ⊆ var(t1) = X; we will write t1 → t2. An order-sorted Σ-term rewriting system (or Σ-OSTRS) is a set A of order-sorted Σ-rewrite rules; we may also write (Σ, A). □

Recall that t1 and t2 must lie in the same connected component of the sort set of Σ (by Definition 10.2.16), so that, by coherence, there always exists a sort s such that LS(t_i) ≤ s for i = 1, 2. We begin by restricting the rule of deduction (+6¹), which replaces one subterm in the forward direction, to equations that are rewrite rules:

(rw) Order-Sorted Rewriting.
Given t1 → t2 in A with LS(t1), LS(t2) ≤ s and var(t1) = Y, and given t0 ∈ TΣ(X ∪ {z}s) with exactly one occurrence of z, where z ∉ X, if θ : Y → TΣ(X) is a substitution, then

  (∀X) t0(z ← θ(t1)) = t0(z ← θ(t2))

is deducible.

This rule is sound because it is a restriction of (+6), which is already known to be sound. Therefore the following is also sound (in a sense stated formally in Proposition 10.7.9 below):

Definition 10.7.2 Given a Σ-OSTRS A, one-step rewriting is defined for Σ-terms t, t′ by t ⇒A t′ iff there exist a rule t1 → t2 in A with var(t1) = Y, a term t0 ∈ TΣ(X ∪ {z}s) with exactly one occurrence of z, where z ∉ X, and a substitution θ : Y → TΣ(X), such that LS(t1), LS(t2) ≤ s and

  t = t0(z ← θ(t1))  and  t′ = t0(z ← θ(t2)).

The rewrite relation is the transitive, reflexive closure of the one-step rewrite relation; we use the notation t ∗⇒A t′ and say t rewrites to t′ (under A). □

The words "match," "redex," etc. are used the same way as in the many-sorted case, and the one-step order-sorted rewrite relation gives an abstract rewrite system, so that termination, Church-Rosser, local Church-Rosser, reduced term, and so on all make sense; moreover, the usual proof methods are available for proving these properties. We do not elaborate this, because we will soon generalize order-sorted rewriting to the conditional and modulo equation cases.

Example 10.7.3 The following illustrates non-trivial overloaded order-sorted rewriting:

  th OSRW is sorts A B .
    subsort A < B .
    op a : -> A .
    op b : -> B .
    ops f g : B -> B .
    op g : A -> A .
    eq b = a .
    var A : A .
    eq f(A) = A .
  endth
  red f(g(b)) .

The result is g(a), after applying each rule once. □

The relation ∗⇔A, the reflexive, symmetric, and transitive closure of ⇒A, is order-sorted replacement of equals by equals.
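The sort-sensitivity of order-sorted rewriting can be animated with a small interpreter. The following Python sketch (our own toy term representation, not OBJ3's implementation) runs Example OSRW above: the rule f(A) = A can only fire after b ⇒ a lowers the least sort of the argument from B to A.

```python
# Toy sketch of order-sorted rewriting for the OSRW example: terms are
# tuples (operator, arg, ...); sorts A < B; the rule f(A) = A fires
# only when the argument's least sort is <= A.

SUBSORT = {("A", "B"), ("A", "A"), ("B", "B")}  # reflexive-transitive <=

def leq(s, t):
    return (s, t) in SUBSORT

def least_sort(t):
    op = t[0]
    if op == "a":
        return "A"
    if op == "b":
        return "B"
    if op == "g":                       # overloaded: g : A -> A and g : B -> B
        return least_sort(t[1])
    if op == "f":                       # f : B -> B only
        return "B"
    raise ValueError(op)

def rewrite_once(t):
    """Apply one rule at the outermost possible position, else recurse."""
    if t == ("b",):                                   # rule: b -> a
        return ("a",)
    if t[0] == "f" and leq(least_sort(t[1]), "A"):    # rule: f(A) -> A
        return t[1]
    for i, arg in enumerate(t[1:], 1):                # try inside a subterm
        r = rewrite_once(arg)
        if r != arg:
            return t[:i] + (r,) + t[i + 1:]
    return t

def normalize(t):
    while True:
        r = rewrite_once(t)
        if r == t:
            return t
        t = r

# f(g(b)) => f(g(a)) => g(a): the outer rule waits for the sort to drop.
print(normalize(("f", ("g", ("b",)))))
```

Note how `least_sort` for the overloaded `g` depends on the argument's sort, exactly the overloading that the example calls non-trivial.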
An important result for many-sorted rewriting (Theorem 7.3.4) is that t ∗⇔A t′ iff A ⊢ (∀X) t = t′, or otherwise put, that ∗⇔A and ≃A,X (provability) are equal relations on TΣ(X). By the completeness of many-sorted equational deduction, this also implies that ∗⇔A is complete. Unfortunately, these nice results fail for order-sorted rewriting:

Exercise 10.7.1 Use the specification OSRW-EQ in Exercise 10.3.2 to show that the relation ∗⇔A is incomplete: f(a) ∗⇔A f(b) does not hold, even though a ∗⇔A b is true, and f(a) and f(b) are provably equal by Exercise 10.3.2. (This observation is due to Gert Smolka.) Examples with this undesirable behavior can be excluded by imposing the conditions in Definition 10.7.10 below, but it is better to use retracts, relying on Theorem 10.7.18; see also Example 10.7.4 below. □

Exercise 10.7.2 It is also interesting to compare the equivalence classes for ∗⇔A under the ordinary quotient construction with those under the construction of Definition 10.4.6. Use the fact that the relation ≡ defined in Example 10.4.7 equals ∗⇔A (though the specification there is different from the one here) to show that the ordinary equivalence classes for ∗⇔A differ from those of Definition 10.4.6. □

Example 10.7.4 If we write the two equations in the theory OSRW-EQ of Exercise 10.3.2 in the converse order, then OBJ can prove the result with just one reduction, due to the way that retracts work:

  th OSRW-EQ-CONV is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    op f : A -> C .
    eq a = c .
    eq b = c .
  endth
  red f(a) == f(b) .
  red f(a) .

The first reduction gives true, even though the normal form of f(a) is f(r:C>A(c)), as shown by the second reduction, because f(b) has exactly the same normal form. In fact, the relation ∗⇔ is complete for OSA with retracts (by Theorem 10.7.18 below).
□

The following is the natural extension of Definition 10.7.1:

Definition 10.7.5 A conditional order-sorted rewrite rule is an order-sorted conditional equation (∀X) t1 = t2 if C (in the sense of Definition 10.2.16) such that var(t1) = X, var(t2) ⊆ X, and for each ⟨u, v⟩ ∈ C, var(u) ⊆ X and var(v) ⊆ X. We use the notation t1 → t2 if C. □

As with the many-sorted case, it is not straightforward to define the one-step rewrite relation for conditional rules. However, we can use the (join) Conditional Abstract Rewrite Systems (CARS, Definition 7.7.1), just as we did in Section 7.7 for the many-sorted case; we will include rewriting modulo equations at the same time. For this purpose, we state the order-sorted modulo version of Forward Conditional Subterm Replacement Modulo Equations:

(C6B) Given (∀Y) t1 =B t2 if C in A with LS(t1), LS(t2) ≤ s, and given t0 ∈ TΣ(X ∪ {z}s) with z ∉ X, if θ : Y → TΣ(X) is a substitution such that (∀X) θ(u) =B θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, if t′i = t0(z ← θ(ti)) and t″i ≃B t′i for i = 1, 2, then (∀X) t″1 = t″2 is also deducible.

Now the main concepts:

Definition 10.7.6 An order-sorted conditional term rewriting system modulo equations (MCOSTRS) is (Σ, A, B) where A is a set of (possibly conditional) Σ-rewrite rules, and B is a set of (unconditional) Σ-equations. Given (Σ, A, B), define two CARS's as follows, where t → t′ if t1 = t′1, . . . , tn = t′n is in A with LS(t), LS(t′) ≤ s, Y = var(t), θ : Y → TΣ, u ∈ TΣ({z}s) and z ∉ X:

1. For term rewriting, let W be the set of rules of the form v → v′ if v1 = v′1, . . . , vn = v′n where v ≃B u(z ← θ(t)), v′ ≃B u(z ← θ(t′)), vi = θ(ti) and v′i = θ(t′i) for i = 1, . . . , n.

2.
For class rewriting, let W be the set of rules of the form c → c′ if c1 = c′1, . . . , cn = c′n where c = [u(z ← θ(t))], c′ = [u(z ← θ(t′))], ci = [θ(ti)] and c′i = [θ(t′i)] for i = 1, . . . , n.

The classes in the second item are those defined by the quotient construction of Definition 10.4.6 from the provability relation ≃B. Now Definition 7.7.1 (page 235) yields an ARS W⋄ for each of these. Write ⇒A/B for the first, which is order-sorted conditional term rewriting modulo B, and ⇒[A/B] for the second, which is order-sorted conditional class rewriting modulo B. Also, we will use standard terminology ("match," "redex," etc.) in the usual way. □

As discussed in Section 7.7, OBJ does not use equality semantics for evaluating conditions, even though it is common in the literature (e.g., [113]): it uses join condition semantics for its efficiency and pragmatic adequacy. Because ⇒A/B and ⇒[A/B] are ARS's, all ARS results apply directly, such as the Newman lemma and the multi-level termination results in Section 5.8.2. Also, the following has essentially the same proof as Proposition 7.3.2:

Proposition 10.7.7 Given t, t′ ∈ TΣ(Y), Y ⊆ X and MCOSTRS (Σ, A, B), then t ⇒A/B,X t′ iff t ⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). Therefore t ∗⇒A/B,X t′ iff t ∗⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒A/B,X and ∗⇒A/B,X restrict and extend well over variables, so we can drop the subscript X and use any X with var(t) ⊆ X; also as before, ∗⇔A/B,X does not restrict and extend well, as shown by Example 5.1.15, so we define t ∗⇔A t′ to mean that there exists an X such that t ∗⇔A,X t′.
Example 5.1.15 also shows bad behavior for ≃X A∪B (defined by t ≃X A∪B t′ iff A ∪ B ⊢ (∀X) t = t′), although again, the concretion rule (8) of Chapter 4 (generalized to order-sorted rewriting modulo B) implies that ≃X A∪B does behave reasonably when the signature is non-void. Defining ↓A/B,X from the ARS, we generalize Proposition 5.1.13, which again allows the subscript X to be dropped:

Proposition 10.7.8 Given terms t, t′ ∈ TΣ(Y), Y ⊆ X and MCOSTRS (Σ, A, B), then t ↓A/B,X t′ iff t ↓A/B,Y t′, and moreover, these imply A ∪ B ⊢ (∀X) t = t′. □

Because unconditional order-sorted rewriting modulo no equations is a special case, Exercise 10.7.1 also shows that ∗⇔A/B is not complete for satisfaction of A ∪ B. However, we do have:

Proposition 10.7.9 (Soundness) Given an MCOSTRS (Σ, A, B) and t, t′ ∈ TΣ(X), then

  t ⇒A/B t′ iff [t] ⇒[A/B] [t′],
  t ∗⇒A/B t′ iff [t] ∗⇒[A/B] [t′],
  [t] ∗⇒[A/B] [t′] implies [A] ⊢B (∀X) [t] = [t′],
  [t] ∗⇔[A/B] [t′] implies A ∪ B ⊢ (∀X) t = t′.

Therefore ∗⇔A/B is sound for satisfaction of A ∪ B and ∗⇔[A/B] is sound for satisfaction of A modulo B. Moreover, ∗⇔A/B ⊆ ≃X A∪B on TΣ(X).

Proof: The first assertion follows from the definitions of ⇒A/B and ⇒[A/B] (Definition 10.7.6); then the second follows by induction. The third follows from ⇒[A/B] being a rephrasing of (C6B), and the fourth follows from the third plus Proposition 10.5.2. □

We will consider two ways to render bidirectional term rewriting complete: the first assumes the conditions below, while the second uses retracts (Theorem 10.7.18).

Definition 10.7.10 An MCOSTRS (Σ, A, B) is sort decreasing iff t ⇒A/B t′ implies LS(t) ≥ LS(t′).
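The sort decreasing condition of Definition 10.7.10 is decidable because only the sorts of the substituted terms matter, so it suffices to try every assignment of sorts to a rule's variables. A Python sketch of that check over a toy signature of our own (sorts A ≤ B, an overloaded f, constants c : A and d : B; not an example from the text):

```python
# Decidability sketch for the sort-decreasing check: enumerate every
# sort assignment for the rule's variables and compare least sorts.
# Variables are strings; an operation term is (name, [args]).

from itertools import product

SORTS = ["A", "B"]
LEQ = {("A", "A"), ("A", "B"), ("B", "B")}   # the subsort order A <= B

CONSTANT_SORT = {"c": "A", "d": "B"}

def least_sort(t, asg):
    """Least sort of t when each variable gets the sort asg assigns it."""
    if isinstance(t, str):
        return asg[t]
    op, args = t
    if op in CONSTANT_SORT:
        return CONSTANT_SORT[op]
    if op == "f":            # overloaded f : A -> A and f : B -> B,
        return least_sort(args[0], asg)   # so the least rank matches
    raise ValueError(op)

def sort_decreasing(lhs, rhs, variables):
    """Check LS(theta(lhs)) >= LS(theta(rhs)) for every sort assignment."""
    return all(
        (least_sort(rhs, dict(zip(variables, ss))),
         least_sort(lhs, dict(zip(variables, ss)))) in LEQ
        for ss in product(SORTS, repeat=len(variables))
    )

print(sort_decreasing(("f", ["x"]), "x", ["x"]))   # f(x) -> x: True
print(sort_decreasing(("c", []), ("d", []), []))   # c -> d: False
```

The rule c → d fails the check because its rightside has least sort B while its leftside has least sort A, exactly the situation that retracts are later used to repair.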
A Σ-rule t → t′ if C is sort decreasing iff for any substitution θ : X → TΣ(Y) where X = var(t), we have LS(θ(t)) ≥ LS(θ(t′)). An order-sorted Σ-equation t = t′ is sort preserving iff for any substitution θ : X → TΣ(Y) where X = var(t), we have LS(θ(t)) = LS(θ(t′)). □

Notice that these conditions are decidable (provided Σ is finite). We will see that they also improve the properties of order-sorted rewriting. The next two results follow [113]. Proposition 10.7.11 is straightforward using induction. Since Theorem 10.7.18 gives completeness with retracts but without the sort decreasing assumption, we do not prove Theorem 10.7.11; however, the proofs in [113] carry over to join condition rewriting.

Theorem 10.7.11 If (Σ, B) is sort preserving, then t ≃B t′ iff t ∗⇔B t′. Moreover, (Σ, B) is sort preserving iff t ≃B t′ implies LS(t) = LS(t′). An MCOSTRS (Σ, A, B) is sort decreasing if A is sort decreasing and B is sort preserving. □

Definition 10.7.12 An MCOSTRS (Σ, A, B) is join condition canonical if and only if (Σ′, A′, B) is canonical, where: (1) Σ′ ⊆ Σ is least such that if ti = t′i is a condition of some rule r in A, then θ(ti) and θ(t′i) are in TΣ′ for all θ : X → TΣ where X = var(t) and t is the leftside of the head of rule r; and (2) A′ ⊆ A is least such that all conditional rules are in A′, and all unconditional rules that can be used in evaluating the conditions of rules in A are in A′.
□

The following is proved essentially the same way as Theorem 7.7.10:

Theorem 10.7.13 (Completeness) Given a join condition canonical MCOSTRS (Σ, A, B), the following four conditions are equivalent for any t, t′ ∈ TΣ(X):

  t ∗⇔A/B t′
  A ∪ B ⊢ (∀X) t = t′
  [A] ⊢B (∀X) t =B t′
  t ≃A∪B t′

Moreover, if (Σ, A, B) is Church-Rosser, then t ↓A/B t′ is also equivalent to the above. Finally, if (Σ, A, B) is canonical, then [[t]]A ≃B [[t′]]A is also equivalent. □

The next result is proved essentially the same way as Theorem 7.3.9:

Theorem 10.7.14 Given a ground canonical MCOSTRS (Σ, A, B) with A sort decreasing and B sort preserving, if t1, t2 are two normal forms of a ground term t under ⇒A/B then t1 ≃B t2. Moreover, the B-equivalence classes of ground normal forms under ⇒A/B form an initial (Σ, A ∪ B)-algebra, denoted NΣ,A/B or just NA/B, as follows, where [[t]] denotes any arbitrary normal form of t, and [[t]]B denotes the B-equivalence class of [[t]]:

(0) interpret σ ∈ Σ[],s as [[σ]]B in NΣ,A/B,s; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the map sending ([[t1]]B, . . . , [[tn]]B) with ti ∈ TΣ,si to [[σ(t1, . . . , tn)]]B in NΣ,A/B,s.

Finally, NΣ,A/B is Σ-isomorphic to TΣ,A∪B. □

As with previous similar results, this justifies the use of rewrite rules and normal forms to represent abstract data types, but now in the very rich setting of conditional order-sorted rewriting modulo equations; Sections 10.8 and 10.9 give examples showing how powerfully expressive this setting can be.

The important Theorem 10.7.18 below shows why OBJ3 works well even without the sort decreasing assumption.
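The normal-form algebra construction of Theorem 10.7.14 can be seen concretely in a toy unsorted instance with B empty (our own illustration, using the usual Peano addition rules, not a system from the text): an operation is interpreted on normal forms by building the term and normalizing again.

```python
# Concrete sketch of interpreting an operation in the algebra N of
# ground normal forms (toy single-sorted case, B empty).

def normalize(t):
    """Rewrite with +(0,M) -> M and +(s N, M) -> s +(N, M) to normal form."""
    op = t[0]
    if op == "+":
        n, m = normalize(t[1]), normalize(t[2])
        if n == ("0",):
            return m                                  # +(0,M) -> M
        if n[0] == "s":
            return ("s", normalize(("+", n[1], m)))   # +(s N, M) -> s +(N, M)
        return ("+", n, m)
    if op == "s":
        return ("s", normalize(t[1]))
    return t

def interp_plus(n1, n2):
    """Interpret + in the normal-form algebra: build the term, renormalize."""
    return normalize(("+", n1, n2))

def numeral(k):
    return ("0",) if k == 0 else ("s", numeral(k - 1))

# The normal forms are exactly the Peano numerals, and + acts on them
# as expected: [[s s 0]] + [[s 0]] = [[s s s 0]].
print(interp_plus(numeral(2), numeral(1)) == numeral(3))
```

Canonicity of the two rules is what makes `interp_plus` well defined: any order of rewriting reaches the same numeral.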
But first, we make precise the notion of retract rewriting:

Definition 10.7.15 Given an MCOSTRS (Σ, A, B) with B sort preserving, the retract insertion rule is defined as follows: suppose a term t of least sort s can be rewritten at the top by applying a (possibly conditional) rule u → v to yield a term t′ (i.e., there exists θ such that t ≃B θ(u) and t′ ≃B θ(v)), and suppose the least sort s′ of t′ is not less than or equal to s; now let w(z) be a context with variable z of sort s, noting that s ≱ s′; then replace t by r:s″>s(t′), where s″ ≥ s, s′, which exists by local filtration. Retract rewriting consists of rewriting with (Σ, A⊗, B) plus the retract insertion rule. Let ∗⇒A⊗//B denote retract rewriting. □

The retract insertion rule is sound, because it can be decomposed into two sound deductions: first substitute t = t′ into w(r:s″>s(z′)), for z′ a variable of sort s″, to obtain w(r:s″>s(t)) = w(r:s″>s(t′)), and then apply retract elimination to obtain w(t) = w(r:s″>s(t′)). Note that, as an optimization, a non-sort decreasing rule can be replaced by the corresponding rule with a retract on its rightside; OBJ3 in fact does this. Also note that a match with B cannot produce terms that require inserting or deleting retracts (although it may manipulate such terms).

Although the relation ∗⇔A/B is not complete for Σ-terms, the relation ∗⇔A⊗//B is complete (and sound) for Σ-terms. We illustrate this with the specification of Exercise 10.7.1, which was originally used to show the incompleteness of ∗⇔A/B:

Example 10.7.16 Example 10.7.4 successfully used the equations of Exercise 10.3.2 backwards, but Exercise 10.7.1 was restricted to forward rewriting.
Can we show f(a) = f(b) using ∗⇔A/B? No, but we can with ∗⇔A⊗//B: rewrite the retract term f(r:C>A(c)) in two different ways: first to f(r:C>A(a)) and second to f(r:C>A(b)), using respectively the first and second rules. Then by retract elimination, the first equals f(a) and the second equals f(b). □

Lemma 10.7.17 If B is sort preserving, then ≃B = ≃B⊗ for Σ-terms, so that ≃(A∪B)⊗ = ≃A⊗∪B, again for Σ-terms.

Proof: The first assertion follows because equational reasoning with B cannot insert or delete retracts, since B is sort preserving. The second assertion uses (A ∪ B)⊗ = A⊗ ∪ B⊗. □

Recall that t ≃E t′ means E ⊢ (∀X) t = t′.

Theorem 10.7.18 (Completeness) If (Σ, A, B) is an MCOSTRS with B sort preserving and A ∪ B faithful, then t ∗⇔A⊗//B t′ iff A ∪ B ⊢Σ (∀X) t = t′ for any t, t′ ∈ TΣ.

Proof: Because order-sorted rewriting is sound, t ∗⇔A⊗//B t′ implies t ≃A⊗∪B t′ for t, t′ ∈ TΣ, which by Lemma 10.7.17 is equivalent to t ≃(A∪B)⊗ t′. Theorem 10.6.6 now gives equivalence of this to t ≃A∪B t′.

For the converse, suppose t ≃A∪B t′ for t, t′ ∈ TΣ. By the above, this is equivalent to t ≃A⊗∪B t′. We will show t ∗⇔A⊗//B t′ by simulating the proof for t ≃A∪B t′ using ⇔A⊗//B; the essential difference is that the first can only substitute equals for equals, whereas the second allows first proving and then using lemmas. Using a lemma u ≃ v is the same as applying a rule u → v (or v → u, which is treated the same way), unless the rule is non-sort decreasing, in which case, when v′ = θ(v) is substituted for u′ = θ(u) in context w, an ill-formed Σ-term w(v′) may result.
Then retract rewriting will substitute r:s′>s(v′) for u′, thus obtaining the well-formed Σ⊗-term w(r:s′>s(v′)). If the proof of u ≃ v involves other lemmas, the same is done recursively, substituting the simulated proof of u ≃ v using ∗⇔A⊗//B for the use of u → v. Doing this recursively for all lemmas finally yields a rewriting sequence for t ∗⇔A⊗//B t′. □

The implementation of order-sorted rewriting in OBJ3 achieves almost the efficiency of many-sorted rewriting, using clever techniques described in detail in [113]. In addition, retracts are efficiently handled by builtin Lisp code, rather than by interpreting the theory of retracts.

Exercise 10.7.3 Check whether Example 10.6.4 is sort decreasing according to Definition 10.7.10. Now enrich the theory MORE-NON-DED so that it permits a non-trivial deduction similar to that in Example 10.6.2, involving the intermediate use of retracts, and then show how to accomplish this deduction using the relation ∗⇔A/B. □

Exercise 10.7.4 Use OBJ's apply commands to prove the equation f(a) = f(b) for the specification OSRW-EQ in Exercise 10.3.2. □

The results of Section 7.7.1 on adding new constants generalize straightforwardly to conditional order-sorted rewriting modulo equations; we state these generalizations explicitly because of their importance for theorem proving, and because they appear to be new in this context.

Proposition 10.7.19 If an MCOSTRS (Σ, A, B) is terminating, or Church-Rosser, or locally Church-Rosser, then so is (Σ(X), A, B), for any suitable countable variable symbol set X. □

Proposition 10.7.20 An MCOSTRS (Σ, A, B) is ground terminating if (Σ(X), A, B) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A, B) is ground terminating iff (Σ(X), A, B) is ground terminating.
□

Corollary 10.7.21 If Σ is non-void, then an MCOSTRS (Σ, A, B) is ground terminating iff it is terminating. □

Proposition 10.7.22 An MCOSTRS (Σ, A, B) is Church-Rosser iff (Σ(XωS), A, B) is ground Church-Rosser, and is locally Church-Rosser iff (Σ(XωS), A, B) is ground locally Church-Rosser. □

This subsection generalizes termination results for MCTRS's in Section 7.7.

Exercise 10.7.5 Generalize Propositions 7.5.6 and 7.5.7 to MCOSTRS's, and give proofs. □

Exercise 10.7.6 Apply the generalizations of Propositions 7.5.6 and 7.5.7 to MCOSTRS's (that are not MCTRS's) to prove their termination. □

Given a poset P, we can define weak and strong ρ-monotonicity of order-sorted conditional rewrite rules modulo B, of order-sorted substitution modulo B, just as in Definition 5.5.3, and of operations in Σ, except that TΣ and TΣ({z}s) are replaced by TΣ,B and TΣ,B({z}s), respectively; note that as before, the inequalities for a rule are only required to hold when all the conditions of the rule converge (modulo B). The following generalizes Theorem 7.7.20:

Theorem 10.7.23 Let (Σ, A, B) be an MCOSTRS with Σ non-void and A′ ⊆ A unconditional and ground terminating; let P be a poset and let N = A − A′. If there is ρ : TΣ,B → P such that

(1) each rule in A′ is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least for each t ∈ (TΣ,B)s there is some Noetherian poset Pt,s ⊆ Ps such that t ∗⇒[A/B] t′ implies ρ(t′) ∈ Pt,s,

then (Σ, A, B) is ground terminating. □

Exercise 10.7.7 Prove Theorem 10.7.23. □

Exercise 10.7.8 Show that if C = (Σ, A, B) is an MCOSTRS and CU = (Σ, AU, B), where AU contains the rules in A with their conditions removed, then C is Church-Rosser (or ground Church-Rosser) if CU is.
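The measure argument behind Theorem 10.7.23 can be sketched concretely by taking ρ to be term size, which maps into the Noetherian poset of natural numbers and is strictly monotone in every operation. The Python sketch below (our own sample-based check over two NAT-style rules, an illustration rather than a proof) tests that sampled rule instances strictly decrease ρ:

```python
# Sketch of a measure-based termination check: rho(t) = number of
# operation symbols, a Noetherian measure that every context preserves
# strictly, so strict decrease at the redex lifts to the whole term.

def size(t):
    """rho(t): count the operation symbols in t (variables count as 1)."""
    if isinstance(t, str):
        return 1
    op, args = t
    return 1 + sum(size(a) for a in args)

RULES = [
    (("p", [("s", ["N"])]), "N"),        # p s N -> N
    (("+", ["N", ("0", [])]), "N"),      # N + 0 -> N
]

def substitute(t, binding):
    if isinstance(t, str):
        return binding.get(t, t)
    op, args = t
    return (op, [substitute(a, binding) for a in args])

def decreases_on(instances):
    """True iff every sampled instance of every rule strictly shrinks."""
    return all(
        size(substitute(lhs, b)) > size(substitute(rhs, b))
        for lhs, rhs in RULES
        for b in instances
    )

samples = [{"N": ("0", [])}, {"N": ("s", [("s", [("0", [])])])}]
print(decreases_on(samples))   # True: each sampled instance shrinks
```

For these two rules the decrease in fact holds for every substitution, since each rightside is a proper subterm of its leftside; the sampling is only a spot check of that fact.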
□

As always, ARS results apply directly, including the Newman Lemma, the Hindley-Rosen Lemma (Exercise 5.7.5) and Proposition 7.7.22, so we do not state these here; the results that we do state are actually rather weak. Perhaps the most generally useful methods for proving Church-Rosser are based on the Newman Lemma, since it is usually much easier to prove the local Church-Rosser property. As mentioned in Sections 7.6 and 7.7.3, and in more detail in Chapter 12, although the Critical Pair Theorem (5.6.9) does not generalize to modulo B rewriting, the local Church-Rosser property can still in many cases be checked by a variant of the Knuth-Bendix algorithm [117].

Exercise 10.7.9 Does Proposition 7.6.9 generalize to MCOSTRS's? Give a proof or a counterexample. □

Exercise 10.7.10 Generalize Proposition 7.7.23 to MCOSTRS's, give a proof, and then apply it to an example (not an MCTRS) to prove the Church-Rosser property. Hint: Consider a variant of PROPC where the truth values are a subsort of the propositional expressions. □

10.8 Number System

[Figure 10.3 (subsort diagram): the sorts Zero, NzNat, Nat, NzInt, Int, NzRat, Rat, NzImag, Imag, NzCpx, Cpx, NzJ, J, NzQuat, and Quat, ordered by the subsort declarations of the objects below.]

Figure 10.3: Subsort Structure for Number System

This section presents an extended example, a rather complete number hierarchy, from the naturals up to the quaternions, including also the integer, rational, and complex numbers, with many of the usual operations upon them; however, it does not include the real numbers, and the complex numbers and quaternions are based on the rationals instead of the reals. A number of test cases are given.
This example is from [82], and much of the work on it was done by Prof. José Meseguer and Mr. Tim Winkler. It is interesting to notice that multiplication is not commutative on quaternions, although it is commutative on the subsorts of complexes, rationals, etc., and that this situation is allowed by our notion of overloading, as well as supported by the OBJ implementation. This example is also used in [113], where it is annotated with much information about how its features are efficiently implemented in OBJ3 using techniques that include rule specializations and general variables.

  obj NAT is sorts Nat NzNat Zero .
    subsorts Zero NzNat < Nat .
    op 0 : -> Zero .
    op s_ : Nat -> NzNat .
    op p_ : NzNat -> Nat .
    op _+_ : Nat Nat -> Nat [assoc comm] .
    op _*_ : Nat Nat -> Nat .
    op _*_ : NzNat NzNat -> NzNat .
    op _>_ : Nat Nat -> Bool .
    op d : Nat Nat -> Nat [comm] .
    op quot : Nat NzNat -> Nat .
    op gcd : NzNat NzNat -> NzNat [comm] .
    vars N M : Nat .  vars N' M' : NzNat .
    eq p s N = N .
    eq N + 0 = N .
    eq (s N) + (s M) = s s (N + M) .
    eq N * 0 = 0 .
    eq 0 * N = 0 .
    eq (s N) * (s M) = s (N + (M + (N * M))) .
    eq 0 > M = false .
    eq N' > 0 = true .
    eq s N > s M = N > M .
    eq d(0,N) = N .
    eq d(s N, s M) = d(N,M) .
    eq quot(N,M') = if ((N > M') or (N == M')) then
      s quot(d(N,M'),M') else 0 fi .
    eq gcd(N',M') = if N' == M' then N' else (if N' > M' then
      gcd(d(N',M'),M') else gcd(N',d(N',M')) fi) fi .
  endo

  obj INT is sorts Int NzInt .
    protecting NAT .
    subsorts NzNat < Nat NzInt < Int .
    op -_ : Int -> Int .
    op -_ : NzInt -> NzInt .
    op _+_ : Int Int -> Int [assoc comm] .
    op _*_ : Int Int -> Int .
    op _*_ : NzInt NzInt -> NzInt .
    op quot : Int NzInt -> Int .
    op gcd : NzInt NzInt -> NzNat [comm] .
    vars I J : Int .  vars I' J' : NzInt .
    vars N' M' : NzNat .
    eq - - I = I .
    eq - 0 = 0 .
    eq I + 0 = I .
    eq M' + (- N') = if N' == M' then 0 else
      (if N' > M' then - d(N',M') else d(N',M') fi) fi .
    eq (- I) + (- J) = -(I + J) .
    eq I * 0 = 0 .
    eq 0 * I = 0 .
    eq I * (- J) = -(I * J) .
    eq (- J) * I = -(I * J) .
    eq quot(0,I') = 0 .
    eq quot(- I',J') = - quot(I',J') .
    eq quot(I',- J') = - quot(I',J') .
    eq gcd(- I',J') = gcd(I',J') .
  endo

  obj RAT is sorts Rat NzRat .
    protecting INT .
    subsorts NzInt < Int NzRat < Rat .
    op _/_ : Rat NzRat -> Rat .
    op _/_ : NzRat NzRat -> NzRat .
    op -_ : Rat -> Rat .
    op -_ : NzRat -> NzRat .
    op _+_ : Rat Rat -> Rat [assoc comm] .
    op _*_ : Rat Rat -> Rat .
    op _*_ : NzRat NzRat -> NzRat .
    vars I' J' : NzInt .  vars R S : Rat .  vars R' S' : NzRat .
    eq R / (R' / S') = (R * S') / R' .
    eq (R / R') / S' = R / (R' * S') .
    ceq J' / I' = quot(J',gcd(J',I')) / quot(I',gcd(J',I'))
      if gcd(J',I') =/= s 0 .
    eq R / s 0 = R .
    eq 0 / R' = 0 .
    eq R / (- R') = (- R) / R' .
    eq -(R / R') = (- R) / R' .
    eq R + (S / R') = ((R * R') + S) / R' .
    eq R * (S / R') = (R * S) / R' .
    eq (S / R') * R = (R * S) / R' .
  endo

  obj CPX-RAT is sorts Cpx Imag NzImag NzCpx .
    protecting RAT .
    subsort Rat < Cpx .
    subsort NzRat < NzCpx .
    subsorts NzImag < NzCpx Imag < Cpx .
    subsorts Zero < Imag .
    op _i : Rat -> Imag .
    op _i : NzRat -> NzImag .
    op -_ : Cpx -> Cpx .
    op -_ : NzCpx -> NzCpx .
    op _+_ : Cpx Cpx -> Cpx [assoc comm] .
    op _+_ : NzRat NzImag -> NzCpx [assoc comm] .
    op _*_ : Cpx Cpx -> Cpx .
    op _*_ : NzCpx NzCpx -> NzCpx .
    op _/_ : Cpx NzCpx -> Cpx .
    op _
    eq (R i) + (S i) = (R + S) i .
    eq -(R' + (S' i)) = (- R') + ((- S') i) .
    eq -(S' i) = (- S') i .
    eq R * (S i) = (R * S) i .
    eq (S i) * R = (R * S) i .
    eq (R i) * (S i) = -(R * S) .
    eq C * (A + B) = (C * A) + (C * B) .
    eq (A + B) * C = (C * A) + (C * B) .
    eq R
    eq Q / (C + (C' j)) = Q * (((C

The equation that defines the squared norm function |_|^2 is interesting, because given a non-zero rational as input, it should return a non-zero rational, but the rightside of
the equation does not parse as a non-zero rational, although it can be proved that it always yields one. The attribute [memo] on the constants . . . causes OBJ to cache the normal forms of these terms, and then use these cached values instead of recomputing them each time they are needed.

Exercise 10.8.1 Show that INT, RAT, CPX, and QUAT are rings, where the theory of rings is given by the following theories, where the first defines commutative (also called Abelian) groups with additive notation:

  th ABGP is sort Elt .
    op 0 : -> Elt .
    op _+_ : Elt Elt -> Elt [assoc comm id: 0] .
    op -_ : Elt -> Elt .
    var X : Elt .
    eq X + (- X) = 0 .
  endth

  th RING is us ABGP .
    op 1 : -> Elt .
    op _*_ : Elt Elt -> Elt [assoc id: 1 prec 30] .
    vars X Y Z : Elt .
    eq X * 0 = 0 .  eq 0 * X = 0 .
    eq X * (Y + Z) = (X * Y) + (X * Z) .
    eq (Y + Z) * X = (Y * X) + (Z * X) .
  endth

Show that INT, RAT, and CPX are commutative rings, where the theory of commutative rings is as above, except that a comm attribute is added for * and the last equation is omitted. Show that QUAT is not a commutative ring. □

Exercise 10.8.2 Show that RAT, CPX, and QUAT are fields, where the theory of fields is given by the following:

  th FIELD is us RING .
    sort NzElt .
    subsort NzElt < Elt .
    op _− : NzElt -> NzElt [prec 2] .
    var X : NzElt .
    eq X * X − = 1 .  eq X − * X = 1 .
  endth

Show that RAT and CPX are commutative fields, where the theory of commutative fields is as above, except that it imports the theory of commutative rings. Show that QUAT is not a commutative field.
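The key point of the two exercises above, that QUAT satisfies the ring and field equations but not commutativity, can be spot-checked numerically. A Python sketch using Hamilton's quaternion product on tuples (a, b, c, d) representing a + bi + cj + dk (our own illustration, independent of the OBJ specification):

```python
# Hamilton's quaternion product: associative and unital, but not
# commutative, e.g. i * j = k while j * i = -k.

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

print(qmul(i, j))   # k:  (0, 0, 0, 1)
print(qmul(j, i))   # -k: (0, 0, 0, -1)
```

This is exactly why the comm attribute can be added to _*_ for INT, RAT, and CPX but not for QUAT, while associativity (an attribute of _*_ in RING) survives.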
□

It is surprising that fields can be specified so simply with order-sorted algebra, but it is not difficult to show that there is an isomorphism between the classes of models of FIELD, and of fields in the ordinary sense, given by the function U, where if F is a model of FIELD, then U(F) is the partial algebra with X − undefined when X = 0.

Exercise 10.8.3 Use OBJ to prove in the theory FIELD that if X, Y are non-zero, then so is X * Y. Conclude from this that the overload declaration

  op _*_ : NzElt NzElt -> NzElt [assoc id: 1] .

can be added to FIELD. □

Exercise 10.8.4 Use OBJ to prove (∀X, Y : Rat) (X + i * Y) * (X − i * Y) = |X|² + |Y|². Hint: Do not neglect the cases where X = Y = 0. □

10.9 Multiple Representation and Coercion

This section gives an example based on one in [81, 57], showing how OSA handles multiple representations of a single abstract data type, by providing automatic coercions among the representations. The type here is points in the plane (or vectors from the origin), and the representations are Cartesian and polar coordinates. The specification below uses the module FLOAT, which is OBJ's approximation to the real number field (which cannot be fully implemented on a computer). BOBJ [71, 72], the newest version of OBJ, is needed for its sort constraints, which allow users to declare new subsorts of old sorts. The keyword for sort constraints is mb, after the syntax of Maude [30]; the first defines a subsort NNeg of non-negatives for Float, and the second defines a further subsort for angles. Automatic coercions are defined by the three equations with retracts; the first two convert a point in polar coordinates to Cartesian coordinates when the context requires it, and the third does the opposite. Since both the sum and distance functions are only defined for the Cartesian representation, applying them to polar points requires coercion, as illustrated in the three reductions below the specification.
  obj POINT is pr FLOAT .
    sorts NNeg Point Cart Polar Angle .
    subsorts Angle < NNeg < Float .
    var N : NNeg .  var A : Angle .  var F : Float .
    mb F : NNeg if F >= 0 .
    op _**2 : Float -> NNeg [prec 2] .
    eq F **2 = F * F .
    mb F : Angle if 0 <= F and F < 2 * pi .
    subsorts Cart Polar < Point .
    op <_,_> : Float Float -> Cart .
    op _+_ : Cart Cart -> Cart .
    vars F1 F2 F3 F4 : Float .
    eq < F1, F2 > + < F3, F4 > = < F1 + F3, F2 + F4 > .
    op [_,_] : Angle NNeg -> Polar .
    eq r:Point>Cart([A, N]) =

Exercise 10.9.1 Add a function to the above code that rotates a point in polar coordinates about the origin, and use it to define a negation function for addition. Now run several test cases on these functions that require coercion. You can download the latest version of BOBJ from ftp://ftp.cs.ucsd.edu/pub/fac/goguen/bobj/ □

10.10 Literature

Order-sorted algebra has evolved through several stages. The older versions OBJT and OBJ1 of OBJ used error algebras [53], which can fail to have initial models [150]. OSA began in 1978 in [54], and was further developed in papers including [82, 113, 152] and [165]. This chapter summarizes many basic results from OSA, mainly following [82]. A rather comprehensive survey up to 1992 is given in [68], and [77] is a less formal introduction with many examples.

Order-sorted rewriting was first treated in depth in [69], and then further developed in [113]. Both papers focus on operational semantics, i.e., on how to efficiently implement order-sorted term rewriting in OBJ: the first translates to many-sorted rewriting, while the second computes a set of rules that work on a term data structure that keeps track of the ranks of operations and the sorts of subterms; this is more efficient, and can be considered a weak form of compilation. Neither of these papers treats join conditional rewriting, so all the results about conditional order-sorted rewriting in Section 10.7 are new (we have argued in previous chapters that join semantics is the most appropriate for OBJ).
The semantics of retracts in [78] does not cover rewriting modulo equations or retract rewriting, although this is discussed in [90] under the name "safe rewriting," again without conditional rules or modulo equations. Theorem 10.7.18 is an important and perhaps surprising new result. The example in Section 10.9 is also new in this simple form.

A Note to Lecturers: This chapter is a culmination of this book, in the sense that we have been gradually building up towards a complete, rigorous treatment of the full operational and algebraic semantics of OBJ3, which is conditional order-sorted rewriting with retracts, modulo equations, with its theories of equational deduction and of algebras as models, together with practical methods for specification and verification, and practical ways to check key properties such as Church-Rosser and termination. This chapter seems to be the only place where such an exposition is available, and many of its results are new, e.g., Theorem 10.7.18. On the other hand, many other results are fairly straightforward generalizations of results in prior chapters. In my opinion, any course on algebraic specification should at least state and illustrate the main theorems of this chapter, making clear the great expressive power that results from combining all these features, although it is not necessary to go over all the counterexamples, auxiliary results, and proofs.

Generic Modules

Insert material from [59] here. With this technique, we can verify the correctness of generic objects in the sense of OBJ, as well as of higher-order functions as used in functional programming.

Unification

Use the approach of "What is Unification?" [60], introducing some basic category theory to define unification; see also the discussion at the end of Chapter 8.

The discussion of the Church-Rosser property in Section 5.6 included a claim that it is decidable for terminating TRSs. We now discharge that claim.
First, recall from Chapter 5 that, given a TRS A, if the left sides t, t′ of two rules have an overlap (in the sense of Definition 5.6.2) θ(t₀) = θ′(t′), where t₀ is a subterm of t, then θ(t) can be rewritten in two ways (one for each rule). The following, which is needed in Chapter 5, follows from the existence of most general unifiers, which can be computed as described above:

Proposition 12.0.1  If terms t, t′ overlap at a subterm t₀ of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t₀ is a substitution instance of p.  □

Recall that such a most general overlap is called a superposition, and that the pair of terms resulting from applying the two rules to the term θ(t) is called a critical pair. If the two terms in a critical pair can be rewritten to a common term using rules in A, then that critical pair is said to converge or to be convergent. The following was proved in Chapter 5, and is important since it covers non-orthogonal TRSs that cannot be checked with Theorem 5.6.4:

Theorem 5.6.9  A TRS is locally Church-Rosser if all its critical pairs are convergent.  □

With the Newman Lemma (Proposition 5.6.1), this implies:

Corollary 5.6.10  A terminating TRS is Church-Rosser iff all its critical pairs are convergent, in which case it is also canonical.  □

Proofs of canonicity by orthogonality can be mechanized by using an algorithm that checks if each pair of rules is overlapping, and checks left linearity of each rule (which is trivial).

Section 5.6 promised an algorithm from the above corollary; also the above just repeats material in that section. Section 5.6 also promises the unification algorithm. Discuss the Church-Rosser property for the TRSs GROUPC of Example 5.2.8, AND of Example 5.5.7, and MONOID of Exercise 5.2.7.
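The unification algorithm promised for Section 5.6 can be sketched concretely. The following is a minimal Robinson-style procedure in Python, not the categorical treatment planned for this chapter; the term representation (compound terms as ("op", [args]) pairs, variables as plain strings) and all function names are assumptions made for this illustration.

```python
# Minimal syntactic unification with occurs check: a sketch only.
# Compound terms are ("op", [args]) tuples; any bare string is a variable.

def is_var(t):
    return isinstance(t, str)

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear in term t under subst?"""
    t = walk(t, subst)
    if is_var(t):
        return t == v
    return any(occurs(v, a, subst) for a in t[1])

def unify(t1, t2, subst=None):
    """Return a most general unifier as a dict, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if is_var(t2):
        return unify(t2, t1, subst)
    op1, args1 = t1
    op2, args2 = t2
    if op1 != op2 or len(args1) != len(args2):
        return None     # clash of operation symbols or arities
    for a, b in zip(args1, args2):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst
```

For example, unifying f(X, g(Y)) with f(g(Z), X) yields the most general unifier {X ↦ g(Z), Y ↦ Z}, while unifying X with f(X) fails by the occurs check. Bindings in the returned substitution may chain through other variables, so they should be read up to application of walk.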
Are these already in Chapter 5? Discuss Knuth-Bendix [117], and proof by consistency using completion, which in general is not a very satisfying technique. Do not prove the correctness of Knuth-Bendix completion. Do the MSA case first, then OSA. Do Example 5.8.37 in algorithmic detail; can copy and edit. Also discuss completion procedures for matching modulo, as promised in Section 7.3.3. Check carefully against Chapters 5, 7, 10.

We also need the following:

Proposition 12.0.2  For B containing any combination of the commutative, associative, and identity laws, if terms t, t′ overlap at a subterm t₀ of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t₀ is a substitution instance of p.  □

The following can now be done using the machinery developed above:

Exercise 12.1.1  Recalling that Example 7.5.10 shows termination of DNF, use Corollary 5.6.10 to show that the MTRS DNF is locally Church-Rosser, and therefore canonical.  □

Exercise 12.1.2  Show that the normal forms of the MTRS DNF are exactly the disjunctive normal forms of the propositional formulae.  □

Exercise 12.1.3  Recalling that Example 7.5.9 shows termination of PROPC, use Corollary 5.6.10 to complete the proof of Hsiang's Theorem (Theorem 7.3.13) that PROPC is a canonical MTRS.  □

Hidden Algebra

Not really sure whether to do this; anyway, the following is just a rough sketch of some preliminary ideas for introductory remarks.

Computing applications typically involve states, and it can be awkward, or even impossible, to treat these applications in a purely functional style. Hidden algebra substantially extends ordinary algebra by distinguishing sorts used for data from sorts used for states, calling them respectively visible and hidden sorts. It also changes the notion of satisfaction to behavioral (also called observational) satisfaction, so that equations need not be literally satisfied, but need only appear to be satisfied under all possible experiments.
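Behavioral satisfaction can be made concrete with a toy example; the following Python fragment is an illustration invented for this sketch, not an example from the book. A stack state is a (pointer, storage) pair; push leaves junk in the storage above the pointer, so the equation pop(push(n, s)) = s fails literally, yet every visible experiment (some number of pops followed by top) gives the same result on both sides.

```python
# Toy illustration of behavioral satisfaction (names are hypothetical).
# States are (pointer, storage) pairs with storage a dict; only top is visible.

def push(n, s):
    p, mem = s
    return (p + 1, {**mem, p + 1: n})

def pop(s):
    p, mem = s                  # junk above the pointer is not erased
    return (p - 1, mem) if p >= 0 else s

def top(s):
    p, mem = s
    return mem.get(p, 0)        # the visible observation; 0 when empty

def behaviorally_equal(s1, s2, depth=5):
    """Compare two states under all experiments pop^k; top, up to a depth."""
    for _ in range(depth):
        if top(s1) != top(s2):
            return False
        s1, s2 = pop(s1), pop(s2)
    return True
```

Here behaviorally_equal only samples experiments up to a fixed depth, which suffices for this finite illustration; a genuine hidden-algebra argument quantifies over all experiments.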
Hidden algebra is powerful enough to give a semantics for the object paradigm, including inheritance and concurrency.

Whereas initial algebra semantics takes a somewhat static view of data structures and systems, as reflected in the central result that the initial model is unique up to isomorphism, hidden algebra takes a more dynamic view, directly addressing behavior and abstracting away from implementation, with its notion of behavioral satisfaction for equations. In philosophical terms, the evolution from initial algebra to hidden algebra is similar to the evolution from Plato's theory of static eternal ideals, to Aristotle's attempts to confront the kinds of change and development that can be observed especially in the biological realm.

A General Framework (Institutions)

This book takes the notions of model and satisfaction as basic, and checks the soundness of each proposed rule of deduction with respect to them before accepting it for use in theorem proving. Since we do not assume a single pre-determined logical system, the concepts of model and satisfaction cannot be fixed once and for all. Instead, we adopt a more general approach, influenced by the theory of institutions [67], and gradually enrich our language and models, working upward from simple equational logic.

Assume a concept of signature. For each signature Σ, assume that there are a class M_Σ of Σ-models, an algebra T_Σ of Σ-terms, an algebra F_Σ of Σ-formulae, a concept of Σ-substitution, and a relation ⊨_Σ of Σ-satisfaction of formulae by terms. Each concept should be defined recursively, and later concepts can be defined recursively over earlier ones.

Let R be a language of sentences that can be directly checked by OBJ, e.g., conjunctions of reductions, let G be a language for goals, and let L be a "meta" language that includes both G and R.
The justification for a proof score is a sequence of applications of proof measures which transforms a goal into an R sentence, which can then be translated into an OBJ program and run.

Elements of G express semantic facts which we wish to verify; a typical goal sentence is E ⊢ e, where E is a conjunction (perhaps represented as a set) of formulae and e is a single formula. We will call sentences of this form (atomic) turnstile sentences. Our aim is then to transform (possibly rather exotic) turnstile sentences into R sentences that can be checked by OBJ.

(The above can be seen as an attempt to give a more down to earth exposition of the theory of institutions [67].)

MORE TO COME HERE.

OBJ3 Syntax and Usage

This appendix gives a formal description of OBJ3 syntax, followed by some practical advice on how to use it. OBJ3 is the latest implementation of OBJ; it is an equational language with a mathematical semantics given by order-sorted equational logic, and a powerful type system featuring subtypes and overloading; these latter features allow us to define and handle errors in a precise way. In addition, OBJ3 has user-definable abstract data types with user-definable mixfix syntax, and a powerful parameterized module facility that includes views and module expressions. OBJ is declarative in the sense that its statements assert properties that solutions should have; i.e., they describe the problem. A subset of OBJ is executable using term rewriting, and all of it is provable. See [90] for more details.

OBJ3 syntax is described using the following extended BNF notation: the symbols { and } are used as meta-parentheses; the symbol | is used to separate alternatives; [ ] pairs enclose optional syntax; ... indicates zero or more repetitions of the preceding unit; and "x" denotes x literally. As an application of this notation, A {, A}... indicates a non-empty list of A's separated by commas.
Finally, --- indicates comments in this syntactic description, as opposed to comments in OBJ3 code.

--- top-level ---
⟨OBJ-Top⟩ ::= { ⟨Object⟩ | ⟨Theory⟩ | ⟨View⟩ | ⟨Make⟩ | ⟨Reduction⟩ |
    in ⟨FileName⟩ | quit | eof | start ⟨Term⟩ . |
    open [ ⟨ModExp⟩ ] . | openr [ ⟨ModExp⟩ ] . | close |
    ⟨Apply⟩ | ⟨OtherTop⟩ }...
⟨Make⟩ ::= make ⟨Interface⟩ is ⟨ModExp⟩ endm
⟨Reduction⟩ ::= reduce [ in ⟨ModExp⟩ : ] ⟨Term⟩ .
⟨Apply⟩ ::= apply {reduction | red | print | retr | -retr with sort ⟨Sort⟩ |
    ⟨RuleSpec⟩ [ with ⟨VarId⟩ = ⟨Term⟩ {, ⟨VarId⟩ = ⟨Term⟩}... ]}
    {within | at} ⟨Selector⟩ {of ⟨Selector⟩}...
⟨RuleSpec⟩ ::= [ - ][ ⟨ModId⟩ ] . ⟨RuleId⟩
⟨RuleId⟩ ::= ⟨Nat⟩ | ⟨Id⟩
⟨Selector⟩ ::= term | top | ( ⟨Nat⟩ ... ) | [ ⟨Nat⟩ [ .. ⟨Nat⟩ ] ] |
    "{" ⟨Nat⟩ {, ⟨Nat⟩}... "}"
    --- note that "()" is a valid selector
⟨OtherTop⟩ ::= ⟨RedLoop⟩ | ⟨Commands⟩ | call-that ⟨Id⟩ . |
    test reduction [ in ⟨ModExp⟩ : ] ⟨Term⟩ expect: ⟨Term⟩ . | ⟨Misc⟩
    --- "call that ⟨Id⟩ ." is an abbreviation for "let ⟨Id⟩ = ."
⟨RedLoop⟩ ::= rl {. | ⟨ModId⟩} { ⟨Term⟩ .}... .
⟨Commands⟩ ::= cd ⟨Sym⟩ | pwd | ls | do ⟨DoOption⟩ . | select [ ⟨ModExp⟩ ] . |
    set ⟨SetOption⟩ . | show [ ⟨ShowOption⟩ ] .
    --- in select, can use "open" to refer to the open module
⟨DoOption⟩ ::= clear memo | gc | save ⟨Sym⟩ ... | restore ⟨Sym⟩ ... | ?
⟨SetOption⟩ ::= {abbrev quals | all eqns | all rules | blips | clear memo |
    gc show | include BOOL | obj2 | verbose | print with parens |
    reduce conditions | show retracts | show var sorts | stats |
    trace | trace whole} ⟨Polarity⟩ | ?
⟨Polarity⟩ ::= on | off
⟨ShowOption⟩ ::= {abbrev | all | eqs | mod | name | ops | params |
    principal-sort | [ all ] rules | select | sign | sorts | subs | vars}
    [ ⟨ParamSpec⟩ | ⟨SubmodSpec⟩ ] [ ⟨ModExp⟩ ] |
    [ all ] modes | modules | pending | op ⟨OpRef⟩ |
    [ all ] rule ⟨RuleSpec⟩ | sort ⟨SortRef⟩ | term | time | verbose |
    ⟨ModExp⟩ | ⟨ParamSpec⟩ | ⟨SubmodSpec⟩ | ?
    --- can use "open" to refer to the open module
⟨ParamSpec⟩ ::= param ⟨Nat⟩
⟨SubmodSpec⟩ ::= sub ⟨Nat⟩
⟨Misc⟩ ::= eval ⟨Lisp⟩ | eval-quiet ⟨Lisp⟩ | parse ⟨Term⟩ . | ⟨Comment⟩
⟨Comment⟩ ::= *** ⟨Rest-of-line⟩ | ***> ⟨Rest-of-line⟩ |
    *** ( ⟨Text-with-balanced-parentheses⟩ ) ⟨Rest-of-line⟩
    --- ⟨Rest-of-line⟩ is the remaining text of the current line

--- modules ---
⟨Object⟩ ::= obj ⟨Interface⟩ is { ⟨ModElt⟩ | ⟨Builtins⟩ }...
    endo
⟨Theory⟩ ::= th ⟨Interface⟩ is ⟨ModElt⟩ ... endth
⟨Interface⟩ ::= ⟨ModId⟩ [ [ ⟨ModId⟩ ... :: ⟨ModExp⟩ {, ⟨ModId⟩ ... :: ⟨ModExp⟩}... ] ]
⟨ModElt⟩ ::= {protecting | extending | including | using} ⟨ModExp⟩ . |
    using ⟨ModExp⟩ with ⟨ModExp⟩ {and ⟨ModExp⟩}... |
    define ⟨SortId⟩ is ⟨ModExp⟩ . |
    principal-sort ⟨Sort⟩ . |
    sort ⟨SortId⟩ ... . |
    subsort ⟨Sort⟩ ... { < ⟨Sort⟩ ... }... . |
    as ⟨Sort⟩ : ⟨Term⟩ if ⟨Term⟩ . |
    op ⟨OpForm⟩ : ⟨Sort⟩ ... -> ⟨Sort⟩ [ ⟨Attr⟩ ] . |
    ops { ⟨Sym⟩ | ( ⟨OpForm⟩ )}... : ⟨Sort⟩ ... -> ⟨Sort⟩ [ ⟨Attr⟩ ] . |
    op-as ⟨OpForm⟩ : ⟨Sort⟩ ... -> ⟨Sort⟩ for ⟨Term⟩ if ⟨Term⟩ [ ⟨Attr⟩ ] . |
    [ ⟨RuleLabel⟩ ] let ⟨Sym⟩ [ : ⟨Sort⟩ ] = ⟨Term⟩ . |
    var ⟨VarId⟩ ... : ⟨Sort⟩ . |
    vars-of [ ⟨ModExp⟩ ] . |
    [ ⟨RuleLabel⟩ ] eq ⟨Term⟩ = ⟨Term⟩ . |
    [ ⟨RuleLabel⟩ ] cq ⟨Term⟩ = ⟨Term⟩ if ⟨Term⟩ . |
    ⟨Misc⟩
⟨Attr⟩ ::= [ {assoc | comm | {id: | idr:} ⟨Term⟩ | idem | memo |
    strat ( ⟨Int⟩ ... ) | prec ⟨Nat⟩ | gather ( {e | E | &}...
    ) | poly ⟨Lisp⟩ | intrinsic}... ]
⟨RuleLabel⟩ ::= ⟨Id⟩ ... {, ⟨Id⟩ ...}...
⟨ModId⟩ --- simple identifier, by convention all caps
⟨SortId⟩ --- simple identifier, by convention capitalised
⟨VarId⟩ --- simple identifier, typically capitalised
⟨OpName⟩ ::= ⟨Sym⟩ {"_" | " " | ⟨Sym⟩}...
⟨Sym⟩ --- any operator syntax symbol (blank delimited)
⟨OpForm⟩ ::= ⟨OpName⟩ | ( ⟨OpName⟩ )
⟨Sort⟩ ::= ⟨SortId⟩ | ⟨SortId⟩ . ⟨SortQual⟩
⟨SortQual⟩ ::= ⟨ModId⟩ | ( ⟨ModExp⟩ )
⟨Lisp⟩ --- a Lisp expression
⟨Nat⟩ --- a natural number
⟨Int⟩ --- an integer
⟨Builtins⟩ ::= bsort ⟨SortId⟩ ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] bq ⟨Term⟩ = ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] beq ⟨Term⟩ = ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] cbeq ⟨Term⟩ = ⟨Lisp⟩ if ⟨BoolTerm⟩ . |
    [ ⟨RuleLabel⟩ ] cbq ⟨Term⟩ = ⟨Lisp⟩ if ⟨BoolTerm⟩ .

--- views ---
⟨View⟩ ::= view [ ⟨ModId⟩ ] from ⟨ModExp⟩ to ⟨ModExp⟩ is ⟨ViewElt⟩ ... endv |
    view ⟨ModId⟩ of ⟨ModExp⟩ as ⟨ModExp⟩ is ⟨ViewElt⟩ ...
    endv

--- terms ---
⟨Term⟩ ::= ⟨Mixfix⟩ | ⟨VarId⟩ | ( ⟨Term⟩ ) |
    ⟨OpName⟩ ( ⟨Term⟩ {, ⟨Term⟩}... ) | ( ⟨Term⟩ ) . ⟨OpQual⟩
    --- precedence and gathering rules used to eliminate ambiguity
⟨OpQual⟩ ::= ⟨Sort⟩ | ⟨ModId⟩ | ( ⟨ModExp⟩ )
⟨Mixfix⟩ --- mixfix operator applied to arguments

--- module expressions ---
⟨ModExp⟩ ::= ⟨ModId⟩ | ⟨ModId⟩ is ⟨ModExpRenm⟩ |
    ⟨ModExpRenm⟩ + ⟨ModExp⟩ | ⟨ModExpRenm⟩
⟨ModExpRenm⟩ ::= ⟨ModExpInst⟩ * ( ⟨RenameElt⟩ {, ⟨RenameElt⟩}... ) | ⟨ModExpInst⟩
⟨ModExpInst⟩ ::= ⟨ParamModExp⟩ [ ⟨Arg⟩ {, ⟨Arg⟩}... ] | ( ⟨ModExp⟩ )
⟨ParamModExp⟩ ::= ⟨ModId⟩ | ( ⟨ModId⟩ * ( ⟨RenameElt⟩ {, ⟨RenameElt⟩}... ) )
⟨RenameElt⟩ ::= sort ⟨SortRef⟩ to ⟨SortId⟩ | op ⟨OpRef⟩ to ⟨OpForm⟩
⟨Arg⟩ ::= ⟨ViewArg⟩ | ⟨ModExp⟩ | [ sort ] ⟨SortRef⟩ | [ op ] ⟨OpRef⟩
    --- may need to precede ⟨SortRef⟩ by "sort" and ⟨OpRef⟩ by "op" to
    --- distinguish from general case (i.e., from a module name)
⟨ViewArg⟩ ::= view [ from ⟨ModExp⟩ ] to ⟨ModExp⟩ is ⟨ViewElt⟩ ...
    endv
⟨ViewElt⟩ ::= sort ⟨SortRef⟩ to ⟨SortRef⟩ . |
    var ⟨VarId⟩ ... : ⟨Sort⟩ . |
    op ⟨OpExpr⟩ to ⟨Term⟩ . |
    op ⟨OpRef⟩ to ⟨OpRef⟩ .
    --- priority given to ⟨OpExpr⟩ case
    --- vars are declared with sorts from source of view (a theory)
⟨SortRef⟩ ::= ⟨Sort⟩ | ( ⟨Sort⟩ )
⟨OpRef⟩ ::= ⟨OpSpec⟩ | ( ⟨OpSpec⟩ ) | ( ⟨OpSpec⟩ ) . ⟨OpQual⟩ | ( ( ⟨OpSpec⟩ ) . ⟨OpQual⟩ )
    --- in views, if have (op).(M) it must be enclosed in (), i.e., ((op).(M))
⟨OpSpec⟩ ::= ⟨OpName⟩ | ⟨OpName⟩ : ⟨SortId⟩ ... -> ⟨SortId⟩
⟨OpExpr⟩ --- a ⟨Term⟩ consisting of a single operator applied to variables

--- equivalent forms ---
assoc = associative       comm = commutative
cq = ceq                  dfn = define
ev = eval                 evq = eval-quiet
jbo = endo                ht = endth
endv = weiv = endview     ex = extending
gather = gathering        id: = identity:
idem = idempotent         idr: = identity-rules:
in = input                inc = including
obj = object              poly = polymorphic
prec = precedence         psort = principal-sort
pr = protecting           q = quit
red = reduce              rl = red-loop
sh = show                 sorts = sort
strat = strategy          subsorts = subsort
th = theory               us = using
vars = var                *** = ---
***> = --->

--- Lexical analysis ---
--- Tokens are sequences of characters delimited by blanks.
--- "(", ")", and "," are always treated as single character symbols.
--- Tabs and returns are equivalent to blanks (except inside comments).
--- Normally, "[", "]", "_", ",", "{", and "}" are also treated as
--- single character symbols.
Although OBJ provides a fully interactive user interface, in practice this is an awkward way to use the system, because users nearly always make mistakes, and mistakes can be very troublesome to correct in an interactive mode. It is much easier to first make a file, then start OBJ, and read the file with the in command; then OBJ will report the bugs it finds, based on which you can re-edit and then re-run the file; for complex examples, this cycle can be repeated many times. The author has found it convenient to edit OBJ files in one Emacs buffer, while another Emacs buffer contains a live OBJ; then you can switch between these by switching windows (or buffers); moreover, the results of execution are easily available for consultation and archiving.

OBJ3 can be obtained by ftp from pages linked to the following URL:

Once OBJ3 is installed, you can invoke it with the command obj. A later version of OBJ called BOBJ is also available via the above URL; whereas OBJ3 is implemented in Lisp, BOBJ is implemented in Java, and provides some additional features, including hidden algebra (as in Chapter 13). BOBJ is almost completely upward compatible with OBJ3, except that apply commands may need some reorganization, because the internal ordering of rules is different in the two systems.

CafeOBJ [43] is another algebraic specification language that could be used in connection with this text, although syntactical conversion will be needed, since its syntax tends to follow that of C. Information on how to obtain CafeOBJ is also available via the above URL.

Also discuss the conversion script once it is available.

Exiled Proofs

This appendix contains proofs considered too distracting to put in the main body of the text.

B.1  Many-Sorted Algebra

Most of the proofs for the main results on many-sorted algebra were omitted in Chapter 4, and are also omitted here, because they follow from the more general results on order-sorted algebra that are restated and proved in Section B.5 below.
An exception is Theorem 4.9.1, which we prove here as a sort of "warm up" for the proof of Theorem 10.3.3 in Section B.5.

Theorem 3.2.1 (Initiality)  Given a signature Σ without overloading and a Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M.  □

Theorem 3.2.10 (Initiality)  Given any signature Σ and any Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M.  □

Theorem 4.5.4  For any set A of unconditional Σ-equations and any unconditional Σ-equation e, A ⊢ e iff A ⊢(1,3,±) e.  □

Theorem 4.8.3 (Completeness)  Given a signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢C e iff A ⊨ e, where ⊢C denotes deduction using the rules (1,2,3,4,5C).  □

Note that Theorem 4.4.2 is the special case of the above where all equations in A are unconditional.

Theorem 4.9.1 (Completeness of Subterm Replacement)  For any set A of (possibly conditional) Σ-equations and any unconditional Σ-equation e, A ⊢C e iff A ⊢(1,3,±) e.

Proof (★): Let X be a fixed but arbitrary set of variable symbols over the sort set of Σ. We will show that for any e quantified by X, A ⊢C e iff A ⊢(1,3,±) e. For this purpose, we define two binary relations on T_Σ(X), for s ∈ S and t, t′ ∈ T_Σ(X)_s, by

  t ≡_s t′   iff  A ⊢C (∀X) t = t′,  and
  t ≡R_s t′  iff  A ⊢(1,3,±) (∀X) t = t′,

and then show they are equal. (The superscript "R" comes from "Replacement" in the name of rule (±).)

Soundness of (±) and completeness of ⊢C give us that A ⊢(1,3,±) e implies A ⊢C e, which gives us ≡R ⊆ ≡.

To show the opposite inclusion, we note that ≡ is the smallest Σ-congruence satisfying a certain property, and then prove that ≡R is another Σ-congruence satisfying that property.
The property is closure under (5C), in the sense that if (∀Y) t = t′ if C is in A and if θ : Y → T_Σ(X) is a substitution such that θ(u) ≡ θ(v) for each pair (u, v) ∈ C, then θ(t) ≡ θ(t′). That ≡ is the least congruence closed under (5C) follows from its definition.

To facilitate proofs about ≡R, we define a family of relations on T_Σ(X), for s ∈ S, by

  ≡R_{0,s} = { (t, t) | t ∈ T_Σ(X)_s },

and for each n > 0,

  ≡R_{n,s} = { (t, t′) | t, t′ ∈ T_Σ(X)_s and A ⊢(3,±) (∀X) t = t′ via a proof of length ≤ n }.

Then ≡R = ⋃_{n∈ω} ≡R_n.

The relation ≡R is reflexive and transitive by definition. To prove its symmetry, we show by induction on n that each relation ≡R_n is symmetric. For the induction step, suppose that t ≡R_{n+1} t′; we show t′ ≡R_{n+1} t using symmetry of ≡R_n. There are just two cases, since the last step in proving (∀X) t = t′ must use either the rule (3) or the rule (±). If the last step used (3), then there exists t″ such that t ≡R_n t″ and t″ ≡R_n t′. By the induction hypothesis, we have that t″ ≡R_n t and t′ ≡R_n t″, which imply that t′ ≡R_{n+1} t. In the second case, where (±) is used, we again conclude t′ ≡R_{n+1} t, this time by symmetry of (±). Thus each ≡R_n is symmetric, and symmetry of ≡R follows from the fact that any union of symmetric relations is symmetric.

To prove ≡R is a congruence, we must show that for each operation σ in Σ, σ(t₁, ..., t_k) ≡R σ(t′₁, ..., t′_k) whenever t_i ≡R t′_i for i = 1, ..., k. For simplicity of presentation (and in fact without loss of generality), we do the proof for k = 2, showing by induction on n that if t₁ ≡R_n t′₁ and t₂ ≡R_n t′₂ then σ(t₁, t₂) ≡R σ(t′₁, t′₂).
For this purpose, we show that σ(t₁, t₂) ≡R σ(t′₁, t₂) and σ(t′₁, t₂) ≡R σ(t′₁, t′₂), and then use transitivity of ≡R. Since these two subgoals are entirely analogous, we concentrate on the first, σ(t₁, t₂) ≡R σ(t′₁, t₂).

The base case (n = 0) is trivial. For the induction step, as in the symmetry proof for ≡R, there are two cases, where the last step in proving t₁ ≡R t′₁ is either (3) or else (±).

In the first case, there exists t₀ ∈ T_Σ(X) such that t₁ ≡R_n t₀ and t₀ ≡R_n t′₁. Then

  σ(t₁, t₂) ≡R σ(t₀, t₂) ≡R σ(t′₁, t₂)

by the induction hypothesis, and transitivity of ≡R gives the desired result.

For the case where (±) is applied, A contains an equation e′, (∀Y) t = t′ if C, such that there is a substitution ψ : Y → T_Σ(X) such that ψ(u) ≡R_n ψ(v) for each pair (u, v) ∈ C, and such that t₀(z ← ψ(t)) = t₁ and t₀(z ← ψ(t′)) = t′₁, for some t₀ ∈ T_Σ(X ∪ {z}) with z ∉ X. By applying (±) with e′ to the term σ(t₀, t₂) instead of t₀, we obtain σ(t₁, t₂) ≡R_{n+1} σ(t′₁, t₂), which concludes our proof that ≡R is a congruence.

We still have to show that ≡R is closed under (5C). But this follows from the fact that (±) includes (5C), which is Exercise 4.9.4.  □

This elegant proof is due to Răzvan Diaconescu. Note that Theorem 4.5.4 is the special case of the above result where all equations in A are unconditional, and that the above result is in turn a special case of Theorem 10.3.3, which covers the order-sorted case.

B.2  Rewriting

The results in this section concern overloaded many-sorted term rewriting, beginning with the following:

Proposition 5.3.4  A TRS (Σ, A) is ground terminating if (Σ(X), A) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating.
Proof: It is clear that ground termination of (Σ(X), A) implies that of (Σ, A). For the converse, suppose that (Σ, A) is ground terminating and that Σ is non-void, so that for each sort s there is some term a_s ∈ T_{Σ,s}. Now assume that t₀ ⇒ t₁ ⇒ t₂ ⇒ ··· is a non-terminating rewrite sequence for (Σ(X), A), where the rewrite t_i ⇒ t_{i+1} uses the rule l_i → r_i in A with var(l_i) = X_i for i = 0, 1, ..., with u_i and θ_i : X_i → T_Σ(X) such that t_i = u_i(z ← θ_i(l_i)) and t_{i+1} = u_i(z ← θ_i(r_i)). Define g : X ∪ {z} → T_Σ({z}) by g(z) = z and g(x) = a_s whenever x ∈ X has sort s, and let g also denote the free extension T_Σ(X ∪ {z}) → T_Σ({z}). Then t′₀ ⇒ t′₁ ⇒ t′₂ ⇒ ··· is a non-terminating rewrite sequence for (Σ, A), where each rewrite t′_i ⇒ t′_{i+1} uses the rule l_i → r_i with t′_i = u′_i(z ← θ′_i(l_i)) and t′_{i+1} = u′_i(z ← θ′_i(r_i)), where θ′_i : X_i → T_Σ is the composition θ_i ; g, where u′_i = g(u_i), and where t′_i = g(t_i). This works because t′_i = g(t_i) for i = 0, 1, ...; intuitively, the t′_i rewrite sequence is the image under g of the t_i rewrite sequence.  □

The following is used to prove Proposition 5.5.6:

Proposition  Given a Σ-TRS A and a function ρ : T_Σ → ω, Σ-substitution is strictly ρ-monotone if every operation symbol in Σ is strictly ρ-monotone; the same holds for weak ρ-monotonicity.

Proof (★): We use induction on the structure of Σ-terms. First notice that a term u ∈ T_Σ({z}) having a single occurrence of z is either z or else is of the form σ(u₁, ..., u_n), where each u_i except one is ground, and that one has a single occurrence of z. Now define the z-depth d of a term u ∈ T_Σ({z}) having a single occurrence of z as follows:

  d(z) = 0,
  d(σ(u₁, ..., u_n)) = 1 + d(u_i), where u_i contains the occurrence of z.

Notice in particular that d(σ(u₁, ..., u_n)) = 1 iff the u_i containing z is z itself. Now assume that ρ(t) > ρ(t′).
Then substitution is strictly ρ-monotone for all terms u of z-depth 0 or 1; this covers the base cases. For the induction step, assume that strict ρ-monotonicity of substitution holds for all terms u of z-depth less than m > 0, and that we are given some u ∈ T_Σ({z}) with z-depth m and a single occurrence of z. Then u has the form σ(u₁, ..., u_n) for some σ ∈ Σ and some n > 0; furthermore, if u_i is the subterm containing z, then d(u_i) = m − 1. Therefore strict ρ-monotonicity of substitution holds for u_i. We now calculate:

  u(z ← t) = σ(u₁, ..., u_i(z ← t), ..., u_n) = σ(u₁, ..., z, ..., u_n)(z ← v_i),

where v_i = u_i(z ← t); and similarly,

  u(z ← t′) = σ(u₁, ..., z, ..., u_n)(z ← v′_i),

where v′_i = u_i(z ← t′). Now because strict ρ-monotonicity of substitution holds for u_i, we have ρ(u_i(z ← t)) > ρ(u_i(z ← t′)), i.e., ρ(v_i) > ρ(v′_i), and therefore it follows that ρ(u(z ← t)) > ρ(u(z ← t′)), as desired. The argument for the second assertion is essentially the same.  □

Theorem 5.6.9 (Critical Pair Theorem)  A TRS is locally Church-Rosser if and only if all its critical pairs are convergent.

Sketch of Proof: The converse is easy. Suppose that all critical pairs converge, and consider a term with two distinct rewrites. Then their redexes are either disjoint, or else one of them is a subterm of the other, since if two subterms of a given term are not disjoint, one must be contained in the other. If the redexes are disjoint, then the result of applying both rewrites is the same in either order. If the redexes are not disjoint, then either the rules overlap (in the sense of Definition 5.6.2), or else the subredex results from substituting for a variable in the left side of the rule producing the larger redex.
In the first case, the result terms of the two rewrites rewrite to a common term by hypothesis, since the overlap is a substitution instance of the overlap of some critical pair by Proposition 12.0.1. In the second case, the result of applying both rules is the same in either order, though the subredex may have to be rewritten multiple (or zero) times if the variable involved is non-linear. (This sketch duplicates the argument already given in Section 5.6.) □

Proposition 5.8.10 A CTRS (Σ, A) is ground terminating if (Σ(X), A) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating.

Proof: We extend the proof of Proposition 5.3.4 above. If a rewrite t_i ⇒ t_{i+1} in (Σ(X), A) uses the rule l_i → r_i if C_i, where var(l_i) = X_i and where C_i contains conditions u_{ij} = v_{ij}, then to get the corresponding rewrite g(t_i) ⇒ g(t_{i+1}) in (Σ, A), we apply g to the conditions as well as to the left and right sides, noting that θ_i(u_{ij}) ↓_A θ_i(v_{ij}) on T_Σ(X) implies θ′_i(u_{ij}) ↓_A θ′_i(v_{ij}) on T_Σ. □

B.2.1 (⋆) Orthogonal Term Rewriting Systems

The proof below that orthogonal term rewriting systems are Church-Rosser was provided by Grigore Roșu, following a suggestion of Joseph Goguen to use the Hindley-Rosen Lemma (Proposition 5.7.5). As in the classic proof of Gérard Huet [108], we use "parallel rewriting" (Definition B.2.1); this should not be confused with the concurrent rewriting of [70, 134], as it is a technical notion especially created for this result. Indeed, the entire proof is rather technical, and suggestions for further simplification would be of interest.

Definition B.2.1 Given a Σ-TRS A and Σ-terms t, t′ ∈ T_Σ(Y), the one-step parallel rewriting relation, written t ⇒⇒_A t′, holds iff there exists a Σ-term t_0 ∈ T_Σ({z_1, . . .
, z n } ∪ Y ) having exactly one occurrence of each vari- able z i , and there exist n Σ -rules α i → β i in A and n substitutions θ i : X i → T Σ (Y ) where X i = var (α i ) for 1 ≤ i ≤ n , such that t = t [z ← θ (α ), . . . , z n ← θ n (α n )] and t (cid:48) = t [z ← θ (β ), . . . , z n ← θ n (β n )] . The parallel rewriting relation is the transitive closure of ⇒⇒ A , denoted t ⇒⇒ ∗ A t (cid:48) . (cid:2) The relation ⇒⇒ is reflexive, as can be seen by taking n = t contain exactly one occurrence of eachvariable z i only for technical reasons. Note that the Σ -rules α i → β i are not required to be distinct. We may omit the subscript A when it isclear from context, writing t ⇒⇒ ∗ t (cid:48) instead of t ⇒⇒ ∗ A t (cid:48) , and also writing ⇒⇒ instead of ⇒⇒ A . Exercise B.2.1 Given a Σ -TRS A , terms t , t (cid:48) , . . . , t n , t (cid:48) n ∈ T Σ (Y ) such that t ⇒⇒ A t (cid:48) , . . . , t n ⇒⇒ A t (cid:48) n , and t ∈ T Σ ( { z , . . . , z n } ∪ Y ) , show t [z ← t , . . . , z n ← t n ] ⇒⇒ A t [z ← t (cid:48) , . . . , z n ← t (cid:48) n ] . (cid:2) Three lemmas precede the main part of the proof. The first justifiesusing parallel rewriting to prove results about ordinary rewriting. Lemma B.2.2 Given a Σ -TRS A , ⇒⇒ ∗ A = ⇒ ∗ A . Proof: The inclusion ⇒ ∗ A ⊆ ⇒⇒ ∗ A follows from the fact that one-step rewritingis the special case of one-step parallel rewriting where n = z .Thus it suffices to prove the opposite inclusion, ⇒⇒ A ⊆ ⇒ ∗ A . Supposethat t ⇒⇒ A t (cid:48) and let t ∈ T Σ ( { z , . . . , z n } ∪ Y ) , as in the definition ofone-step parallel rewriting. Let t i ∈ T Σ ( { z } ∪ Y ) denote the term t [z ← θ (β ), . . . , z i − ← θ i − (β i − ), z i ← z, z i + ← θ i + (α i + ), . . . , z n ← θ n (α n )] ewriting and let t i denote the terms t i [z ← θ i (β i )] for 1 ≤ i ≤ n .Because t = t [z ← θ (α )] and t = t [z ← θ (β )] , we get t ⇒ A t by the definition of one-step (non-parallel) rewriting. 
Also because t i = t i + [z ← θ i + (α i + )] and t i + = t i + [z ← θ i + (β i + )] , we get t i ⇒ A t i + for 1 ≤ i < n . Finally, since t n = t (cid:48) , we get the chain of one-steprewrites t ⇒ A · · · ⇒ A t i ⇒ A t i + ⇒ A · · · ⇒ A t (cid:48) , and therefore t ⇒ ∗ A t (cid:48) . (cid:2) From now on, we assume A is a fixed Σ -TRS with Σ -rules α i → β i for1 ≤ i ≤ N . Let A i denote the Σ -TRS containing a single Σ -rule α i → β i ,let ⇒ i denote the relation ⇒ A i , let ⇒⇒ i denote the relation ⇒⇒ A i , and let X i = var (α i ) , the set of variables of α i . The next lemma is the only place where orthogonality of A is used.In reading its proof, it may help to visualize the various constructionsusing the picture below. Lemma B.2.3 If A is orthogonal and if ϕ : X i → T Σ (Y ) is a substitution suchthat ϕ(α i ) ⇒⇒ j t for some 1 ≤ i, j ≤ N , then there is some t (cid:48) ∈ T Σ (Y ) such that ϕ(β i ) ⇒⇒ j t (cid:48) and t ⇒ i t (cid:48) . Proof: Because ϕ(α i ) ⇒⇒ j t , Definition B.2.1 implies there exist a Σ -term t ∈ T Σ ( { z , . . . , z n }∪ Y ) and substitutions θ k : X j → T Σ (Y ) such that ϕ(α i ) = t [z ← θ (α j ), . . . , z n ← θ n (α j )] and t = t [z ← θ (β j ), . . . , z n ← θ n (β j )] . Therefore θ k (α j ) is a subterm of ϕ(α i ) for each 1 ≤ k ≤ n .But because A is nonoverlapping, the terms α i and α j do not over-lap, i.e., there does not exist a non-variable subterm α ki of α i such that θ k (α j ) = ϕ(α ki ) . Consequently, the only possibility for θ k (α j ) to be asubterm of ϕ(α i ) , is to be a subterm (not necessarly proper) of ϕ(x) where x is a variable in X i . Hence for each 1 ≤ k ≤ n , there is a variable x k in X i such that θ k (α j ) is a subterm of ϕ(x k ) .The variables x k need not be distinct for distinct indices k . 
Because t_0 is the term of "positions" of the θ_k(α_j) in ϕ(α_i) for 1 ≤ k ≤ n, and because each θ_k(α_j) is a subterm of ϕ(x_k) and A is left linear, that is, α_i has no more than one occurrence of any variable x in X_i, we can conclude that for each x in X_i there is a subterm t_x of t_0 such that ϕ(x) = t_x[z_1 ← θ_1(α_j), . . . , z_n ← θ_n(α_j)]. The term t_x is the subterm of t_0 that contains the "positions" of each θ_k(α_j) in ϕ(x) for 1 ≤ k ≤ n. It is possible that ϕ(x) does not contain all the θ_k(α_j) as subterms, or even contains none of them, but we still keep the notation t_x[z_1 ← θ_1(α_j), . . . , z_n ← θ_n(α_j)], which means that one substitutes only for those variables z_k that appear in t_x, that is, the variables z_k for those 1 ≤ k ≤ n with x_k = x.

[Figure: the term ϕ(α_i) drawn as a triangle for α_i, with the subtree ϕ(x) hanging at the position of the variable x = x_k = x_{k′}; within ϕ(x), the subterm t_x contains the redexes θ_k(α_j) and θ_{k′}(α_j) at the positions of the variables z_k and z_{k′} of the "top" term t_0.]

Let ε : X_i → T_Σ({z_1, . . . , z_n} ∪ Y) denote the function for which ε(x) = t_x.
Since A is left linear, no α i has more than one instanceof any variable x in X i , and therefore (cid:15)(α i ) = t .Now let θ α , θ β : { z , . . . , z n } → T Σ (Y ) be the substitutions such that θ α (z k ) = θ k (α j ) and θ β (z k ) = θ k (β j ) for all 1 ≤ k ≤ n . Then we have (cid:15) ; θ α = ϕ , because for each x in X i ((cid:15) ; θ α )(x) = θ α ((cid:15)(x)) = θ α (t x ) = t x [z ← θ (α j ), . . . , z n ← θ n (α j )] = ϕ(x) . Let t (cid:48) be the term ((cid:15) ; θ β )(β i ) , that is t (cid:48) = (cid:15)(β i )[z ← θ (β j ), . . . , z n ← θ n (β j )] . Then we can show that ϕ(β i ) = ((cid:15) ; θ α )(β i ) = θ α ((cid:15)(β i )) = (cid:15)(β i )[z ← θ (α j ), . . . , z n ← θ n (α j )] , and by the definition of parallel rewriting, we get ϕ(β i ) ⇒⇒ j t (cid:48) . Although (cid:15)(β i ) may contain multiple occurences of variables z , . . . , z n , this doesnot modify the one-step parallel rewriting relation (the reader shouldprove this). On the other hand, because t = t [z ← θ (β j ), . . . , z n ← θ n (β j )] = (cid:15)(α i )[z ← θ (β j ), . . . , z n ← θ n (β j )] = θ β ((cid:15)(α i )) = ((cid:15) ; θ β )(α i ) , ewriting it follows that t ⇒ i t (cid:48) , by the definition of one-step ordinary rewriting. (cid:2) The above lemma holds even when n = 0, that is, when there are noparallel rewrites. Lemma B.2.4 If A is orthogonal and if t, t , t are Σ -terms such that t ⇒⇒ i t and t ⇒⇒ j t then there is some t (cid:48) such that t ⇒⇒ j t (cid:48) and t ⇒⇒ i t (cid:48) . Proof: There exist Σ -terms t i ∈ T Σ ( { p , . . . , p m } ∪ Y ) , t j ∈ T Σ ( { z , . . . , z n } ∪ Y ) and substitutions ϕ , . . . , ϕ m : X i → T Σ (Y ) and θ , . . . , θ n : X j → T Σ (Y ) such that t = t i [p ← ϕ (α i ), . . . , p m ← ϕ m (α i )]t = t i [p ← ϕ (β i ), . . . , p m ← ϕ m (β i )] , and t = t j [z ← θ (α j ), . . . , z n ← θ n (α j )]t = t j [z ← θ (β j ), . . . , z n ← θ n (β j )] . 
In the picture below, t_i and t_j appear as two different "tops" for the term t:

[Figure: the term t drawn as a triangle, with t_i and t_j two overlapping "top" contexts for t.]

Let δ_{α,α} : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ(Y) be a substitution such that δ_{α,α}(p_l) = ϕ_l(α_i) for all 1 ≤ l ≤ m and δ_{α,α}(z_k) = θ_k(α_j) for all 1 ≤ k ≤ n. We get δ_{α,α}(t_i) = t and also δ_{α,α}(t_j) = t, that is, δ_{α,α} is a unifier of the terms t_i and t_j. Because t_i, t_j are unifiable, they have a most general unifier, say ψ : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ({p_1, . . . , p_m, z_1, . . . , z_n} ∪ Y). (The notions of unifier and most general unifier are defined in Chapter 12.) In our case, because t_i and t_j have exactly one occurrence of each of the variables p_1, . . . , p_m and z_1, . . . , z_n respectively, each ψ(p_l) is either equal to p_l or else is a subterm of t_j, and each ψ(z_k) is either equal to z_k or else is a subterm of t_i. The reader can now check that ψ ; δ_{α,α} = δ_{α,α}.

Next we introduce two more substitutions, δ_{α,β}, δ_{β,α} : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ(Y), such that δ_{α,β}(p_l) = ϕ_l(α_i) and δ_{α,β}(z_k) = θ_k(β_j), and δ_{β,α}(p_l) = ϕ_l(β_i) and δ_{β,α}(z_k) = θ_k(α_j), respectively. We now claim that ϕ_l(α_i) ⇒⇒_j (ψ ; δ_{α,β})(p_l) for each 1 ≤ l ≤ m. This is because ϕ_l(α_i) = δ_{α,α}(p_l) = (ψ ; δ_{α,α})(p_l), and because either ψ(p_l) equals p_l, in which case (ψ ; δ_{α,β})(p_l) = ϕ_l(α_i) and then we use the reflexivity of ⇒⇒_j, or else ψ(p_l) contains only distinct variables in {z_1, . . . , z_n} and then ϕ_l(α_i) = δ_{α,α}(ψ(p_l)) = ψ(p_l)[z_1 ← θ_1(α_j), . . .
, z n ← θ n (α j )] and (ψ ; δ α,β )(p l ) = δ α,β (ψ(p l )) = ψ(p l )[z ← θ (β j ), . . . , z n ← θ n (β j )]. Similarly, θ k (α j ) ⇒⇒ i (ψ ; δ β,α )(z k ) for each 1 ≤ k ≤ n .Suppose that p , . . . , p M are all the variables of t i such that p l ∉ var (ψ(z k )) for 1 ≤ k ≤ n , and that z , . . . , z N are the variables of t j such that z k ∉ var (ψ(p l )) for 1 ≤ l ≤ m . Then there is a term t ∈ T Σ ( { p , . . . , p M } ∪ { z , . . . , z N } ∪ Y ) such that ψ(t ) = ψ(t i ) = ψ(t j ) .In the above picture, t is the “intersection” of t i and t j . The readershould now check that t = t [p ← ϕ (α i ), . . . , p M ← ϕ M (α i ),z ← θ (α j ), . . . , z N ← θ N (α j )]t = t [p ← ϕ (β i ), . . . , p M ← ϕ M (β i ),z ← (ψ ; δ β,α )(z ), . . . , z N ← (ψ ; δ β,α )(z N )]t = t [p ← (ψ ; δ α,β )(p ), . . . , p M ← (ψ ; δ α,β )(p M ),z ← θ (β j ), . . . , z N ← θ N (β j )] . Because ϕ l (α i ) ⇒⇒ j (ψ ; δ α,β )(p l ) , we conclude by Lemma B.2.3 thatthere is some u l such that ϕ l (β i ) ⇒⇒ j u l and (ψ ; δ α,β )(p l ) ⇒ i u l for all 1 ≤ l ≤ M . Similarly, there is some v k such that θ k (β j ) ⇒⇒ i v k and (ψ ; δ β,α )(z k ) ⇒ j v k for all 1 ≤ k ≤ N .Finally, let t (cid:48) = t [p ← u , . . . , p M ← u M , z ← v , . . . , z N ← v N ] .Then by Exercise B.2.1, we have t ⇒⇒ j t (cid:48) and t ⇒⇒ i t (cid:48) . (cid:2) Theorem 5.6.4 A Σ -TRS A is Church-Rosser if it is orthogonal and lapse free. Proof: To prove that ⇒ i and ⇒ j commute for 1 ≤ i, j ≤ N , by Lemma B.2.2 itsuffices to prove that ⇒⇒ i and ⇒⇒ j commute.First, we prove by induction on the length of rewriting with ⇒⇒ ∗ j thatwhenever t ⇒⇒ i t and t ⇒⇒ ∗ j t there exists a t (cid:48) such that t ⇒⇒ ∗ j t (cid:48) ewriting Modulo Equations and t ⇒⇒ i t (cid:48) . If the length of the rewrite sequence t ⇒⇒ ∗ j t is zero,then let t (cid:48) = t . If it is more than zero, let t (cid:48) be a Σ -term such that t ⇒⇒ ∗ j t (cid:48) ⇒⇒ j t . 
That is, suppose t ⇒⇒_i t_1 and t ⇒⇒*_j t′ ⇒⇒_j t_2. By the induction hypothesis, there exists t″ such that t_1 ⇒⇒*_j t″ and t′ ⇒⇒_i t″. Now Lemma B.2.4 gives us t‴ such that t″ ⇒⇒_j t‴ and t_2 ⇒⇒_i t‴. Therefore t_1 ⇒⇒*_j t‴ and t_2 ⇒⇒_i t‴.

Now we prove, by induction on the length of rewriting with ⇒⇒*_i, that whenever t ⇒⇒*_i t_1 and t ⇒⇒*_j t_2 there exists t_3 such that t_1 ⇒⇒*_j t_3 and t_2 ⇒⇒*_i t_3; that is, ⇒⇒_i and ⇒⇒_j commute. If the length of the rewrite sequence is zero, let t_3 = t_2, and otherwise let t′ be a Σ-term such that t ⇒⇒*_i t′ ⇒⇒_i t_1. By the induction hypothesis, there exists a term t″ such that t′ ⇒⇒*_j t″ and t_2 ⇒⇒*_i t″. By the induction above, there exists t_3 such that t_1 ⇒⇒*_j t_3 and t″ ⇒⇒_i t_3. It now follows that t_1 ⇒⇒*_j t_3 and t_2 ⇒⇒*_i t_3. Therefore ⇒⇒_i and ⇒⇒_j commute, and so the Hindley-Rosen Lemma (Proposition 5.7.5) gives us that A is Church-Rosser. □

B.3 Rewriting Modulo Equations

Theorem 7.7.20 Let (Σ, A, B) be a CMTRS with Σ non-void, let (Σ, A′, B) be a ground terminating sub-CMTRS of (Σ, A, B), let P be a poset, and let N = A − B. If there is a ρ : T_{Σ,B} → P such that
(1) each rule in B is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least, for each t ∈ T_{Σ,B} of sort s there is some Noetherian poset P_{t,s} ⊆ P_s such that t ⇒*_{[A/B]} t′ implies ρ(t′) ∈ P_{t,s},
then (Σ, A, B) is ground terminating.

Proof: See the proof of Theorem 5.8.20, page 136. □

B.4 First-Order Logic

This section restates and proves Proposition 8.3.21, which does most of the work for proving the Substitution Theorem (Theorem 8.3.22).
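The capture-free hypothesis of Proposition 8.3.21 is exactly what blocks naive substitution from binding a free variable. As a concrete warm-up (my own illustrative sketch, not part of the text: formulas are nested tuples, and terms are simplified to bare variable names), the following Python fragment exhibits a substitution that is not capture free, and shows naive substitution turning a free occurrence into a bound one:

```python
# Illustrative sketch (not from the text).  Formulas are nested tuples:
# ("forall", x, Q) binds x in Q; ("pred", v1, ..., vn) is atomic.
# Terms are simplified to bare variable names (strings).
def free_vars(P):
    if P[0] == "forall":
        return free_vars(P[2]) - {P[1]}
    return set(P[1:])

def subst_naive(theta, P):
    # Substitutes under binders without renaming, so a free variable of
    # theta(y) can be captured by a quantifier of P -- wrong in general.
    if P[0] == "forall":
        inner = {y: t for y, t in theta.items() if y != P[1]}
        return ("forall", P[1], subst_naive(inner, P[2]))
    return (P[0],) + tuple(theta.get(v, v) for v in P[1:])

def capture_free(theta, P):
    # theta is capture free for P if no quantifier (forall x) of P can
    # bind a variable occurring in theta(y) for y free below it.  Since
    # terms here are bare variables, "x occurs in t" is just x == t.
    if P[0] == "forall":
        x, Q = P[1], P[2]
        inner = {y: t for y, t in theta.items() if y != x}
        return all(x != t for y, t in inner.items() if y in free_vars(Q)) \
               and capture_free(inner, Q)
    return True

P = ("forall", "x", ("pred", "y"))           # (forall x) p(y), y is free
assert not capture_free({"y": "x"}, P)       # sending y to x captures x
assert capture_free({"y": "z"}, P)           # sending y to z is fine
# naive substitution binds the formerly free occurrence of y:
assert free_vars(subst_naive({"y": "x"}, P)) == set()
```

On the hypothetical representation above, `[[θ(P)]] = [[θ]]⁻¹([[P]])` can fail precisely for the first substitution, which is why the proposition assumes capture-freeness.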
Proposition 8.3.21 If θ is capture free for P , then for any model M , [[θ(P )]] M = [[θ]] − M ([[P ]] M ) . Exiled Proofs Proof: We first show by structural induction over Ω that the required equalityholds for every substitution τ that is capture free for P , with τ Free (P) the identity. The reader is left to check the base cases (where P isa generator in G X or true ) and the inductive steps for negation andconjunction. Now suppose P = ( ∀ x)Q . The assertion a ∈ [[τ(( ∀ x)Q)]] is equivalent to b ∈ [[τ x (Q)]] for each b : X → M with b(y) = a(y) for y ≠ x , that is, τ x ; b ∈ [[Q]] , because (τ x ) Free (Q) is the identity, τ x is capture free for Q , plus the induction hypothesis. Similarly, a ∈ [[τ]] − ([[( ∀ x)Q]]) is equivalent to τ ; a ∈ [[( ∀ x)Q]] , that is, b (cid:48) ∈ [[Q]] for each b (cid:48) : X → M with b (cid:48) (y) = a(τ(y)) for y ≠ x .Suppose a ∈ [[τ(( ∀ x)Q)]] and let b (cid:48) : X → M such that b (cid:48) (y) = a(τ(y)) for y ≠ x . Define b : X → M by b(x) = b (cid:48) (x) and b(y) = a(y) for y ≠ x ; then τ x ; b ∈ [[Q]] . We now claim b (cid:48) = τ x ; b . In- deed, (τ x ; b)(x) = b(x) = b (cid:48) (x) , and if y ≠ x then (τ x ; b)(y) = b(τ x (y)) = b(τ(y)) . But x ∉ Var (τ(y)) , because if y ∈ Free (P ) then x ∉ Var (τ(y)) because τ is capture free for P , and if y ∉ Free (P ) then τ(y) = y . Then b(τ(y)) = a(τ(y)) = b (cid:48) (y) , that is, b (cid:48) = τ x ; b .Therefore b (cid:48) ∈ [[Q]] , that is, τ ; a ∈ [[( ∀ x)Q]] .Conversely, suppose τ ; a ∈ [[( ∀ x)Q]] and let b : X → M such that b(y) = a(y) for y ≠ x and let b (cid:48) be τ x ; b . Then b (cid:48) (y) = a(τ(y)) (asabove). Therefore τ x ; b ∈ [[Q]] , that is, a ∈ [[τ(( ∀ x)Q)]] .Now let θ be any substitution and let τ be the substitution θ X − Free (P) .Then τ is capture free for P , and τ Free (P) is the identity; therefore [[τ(P )]] = [[τ]] − ([[P ]]) . By 5. 
of Exercise 8.3.14, θ(P ) = τ(P ) ; there-fore it suffices to prove θ ; a ∈ [[P ]] iff τ ; a ∈ [[P ]] for each a : X → M . ByProposition 8.3.3, it is enough to show that (θ ; a)(y) = (τ ; a)(y) for y ∈ Free (P ) , which is true because θ(y) = τ(y) for y ∈ Free (P ) , byconstruction of τ . (cid:2) B.5 Order-Sorted Algebra This section provides the omitted proofs for results on order-sorted algebra in Chapter 10. Theorem 10.2.8 ( Initiality ) If Σ is regular and if M is any Σ -algebra, then thereis one and only one Σ -homomorphism from T Σ to M . Proof: In this proof we write T for T Σ . Let M be an arbitrary order-sorted Σ -algebra; then we must show that there is a unique order-sorted Σ -homomorphism h : T → M . We will (1) construct h , then (2) show it isan order-sorted Σ -homomorphism, and finally (3) show it is unique.(1) We construct h by induction on the depth of terms in T . Thereare two cases: rder-Sorted Algebra (1a) If t ∈ T has depth 0, then t = σ for some constant σ in Σ . Byregularity, σ has a least sort s . Then for any s (cid:48) ≥ s we define h s (cid:48) (σ ) = M [],sσ (1b) If t = σ (t . . . t n ) ∈ T has depth n + 1, then by regularity thereare least w and s with σ ∈ Σ w,s where w = s . . . s n ≠ [] and LS(t i ) ≤ s i for i = , . . . , n . Then for any s (cid:48) ≥ s we define h s (cid:48) (t) = M w,sσ (h s (t ), . . . , h s n (t n )) , noting that h s (t ), . . . , h s n (t n ) are already defined.(2) We now show that h is an order-sorted Σ -homomorphism. Byconstruction h satisfies the restriction condition E48 of Definition 10.1.5.To see that it also satisfies the homomorphism condition of Defini- tion 10.1.5, we again consider two cases:(2a) σ ∈ Σ [],s is a constant. By regularity and monotonicity, s is theleast sort of σ , and we have already defined h s (σ ) = M [],sσ as needed.(2b) We now consider a term t of depth greater than 0, and let σ ∈ Σ w (cid:48) ,s (cid:48) with w (cid:48) = s (cid:48) . . . 
s (cid:48) n ≠ [] be such that t = σ (t . . . t n ) =T w (cid:48) ,s (cid:48) σ (t , . . . , t n ) . By regularity and Proposition 10.2.7 there are least w = s . . . s n and s = LS(t) such that t = σ (t . . . t n ) = T w,sσ (t , . . . , t n ) .Then w ≤ w (cid:48) and s ≤ s (cid:48) so that (2) of Definition 10.1.3 gives M w (cid:48) ,s (cid:48) σ = M w,sσ on M w . Thus, using the already established fact that h satisfiesthe restriction condition, we have h s (cid:48) (σ (t . . . t n )) = M w,sσ (h s (t ), . . . , h s n (t n )) = M w (cid:48) ,s (cid:48) σ (h s (cid:48) (t ), . . . , h s (cid:48) n (t n )). (3) Finally, we show the uniqueness of h . In fact, we will show thatif h (cid:48) : T → M is an order-sorted Σ -homomorphism, then h = h (cid:48) , byinduction on the depth of terms. For depth 0 consider σ ∈ Σ [],s . Then s is the least sort of σ , and for any s ≥ s (cid:48) , we must have h (cid:48) s (cid:48) (σ ) = h (cid:48) s (σ ) = M [],sσ = h s (σ ) = h s (cid:48) (σ ) , as desired. Now assume the result for depth ≤ n , and consider a term t = σ (t . . . t n ) = T w (cid:48) ,s (cid:48) σ (t , . . . , t n ) of depth n + σ ∈ Σ w (cid:48) ,s (cid:48) and w (cid:48) = s (cid:48) . . . s (cid:48) n . As in (2b), there are least w = s . . . s n and s = LS(t) such that t = σ (t , . . . , t n ) = T w,sσ (t , . . . , t n ) and M w (cid:48) ,s (cid:48) σ = M w,sσ on M w . Then h (cid:48) s (cid:48) (t) = M w (cid:48) ,s (cid:48) σ (h (cid:48) s (cid:48) (t ), . . . , h (cid:48) s (cid:48) n (t n )) = M w (cid:48) ,s (cid:48) σ (h s (cid:48) (t ), . . . , h s (cid:48) n (t n )) ( by the induction hypothesis ) = M w,sσ (h s (t ), . . . , h s n (t n )) = h s (cid:48) (t) as needed. (cid:2) Exiled Proofs Theorem 10.2.9 ( Freeness ) If (S, ≤ , Σ ) is regular, then T Σ (X) is a free Σ -algebraon X , in the sense that for each Σ -algebra M and each assignment a : X → M , there is a unique Σ -homomorphism a : T Σ (X) → M such that a(x) = a(x) for all x in X . 
Proof: The Σ -algebras M with an assignment a : X → M are in bijective cor-respondence with Σ (X) -algebras M . Now the initiality of T Σ (X) amongall Σ (X) -algebras A (Theorem 10.2.8) gives the desired result. (cid:2) Theorem 10.3.2 ( Completeness ) Given a coherent order-sorted signature Σ , given t, t (cid:48) in T Σ (X) , and given a set A of conditional Σ -equations, then the fol-lowing assertions are equivalent:(C1) ( ∀ X) t = t (cid:48) is derivable from A using rules (1)–(4) and (5C).(C2) ( ∀ X) t = t (cid:48) is satisfied by every order-sorted Σ -algebra thatsatisfies A .When all equations in A are unconditional, the same holds replacingrule (5C) by rule (5). Proof: We leave the reader to check soundness , i.e., that (C1) implies (C2); thisfollows as usual by induction from the soundness of each rule of de-duction separately. Here we show completeness , i.e., that (C2) implies(C1). The structure of this proof is as follows: We are given a Σ -equation e = ( ∀ X) t = t (cid:48) that is satisfied by every Σ -algebra that satisfies A , andwe wish to show that e is derivable from A ; to this end, we constructa particular Σ -algebra M such that if M satisfies e then e is derivablefrom A ; then we show that M satisfies A .First, we show that the following property of terms t, t (cid:48) ∈ T Σ (X) s for some sort s , defines an order-sorted Σ -congruence on T Σ (X) :(D) ( ∀ X) t = t (cid:48) is derivable from A using rules (1–4) plus (5C).Let us denote this relation ≡ . Then rules (1–3) say that ≡ is an equiva-lence relation on T Σ (X) s for each sort s . By applying rule (4) to terms t of the form σ (x , . . . , x n ) for σ ∈ Σ , we see that ≡ is a many-sorted Σ -congruence. Finally, ≡ is also an order-sorted Σ -congruence, becauseproperty (D) does not depend upon s .Now we can form the order-sorted quotient of T Σ (X) by ≡ , whichwe denote by T Σ ,A (X) , or within this proof, just M . 
Then by the construction of M, for each t, t′ ∈ T_Σ(X) we have

(*) [t] = [t′] in M iff (D) holds,

where [t] denotes the ≡-equivalence class of t. We next show the key property of M, that

(**) (∀X) t = t′ satisfied in M implies that (D) holds.

[Figure B.1: Factorization of θ — the assignment θ : Y → M factors through the quotient map [_] : T_Σ(X) → M via the substitution ϕ : Y → T_Σ(X).]

Since the equation (∀X) t = t′ is satisfied in M, we can use the inclusion i_X : X → M sending x to [x] as an S-sorted assignment to see that [t] = [t′] in M; then (D) holds by (*).

We now prove that M satisfies A. Let (∀Y) t = t′ if C be a conditional equation in A, and let θ : Y → M be an S-sorted assignment such that θ(u) = θ(v) for each condition u = v in C. Then for each s ∈ S and each y ∈ Y_s, we can choose a representative t_y ∈ T_Σ(X)_s such that θ(y) = [t_y] in M. Now let ϕ : Y → T_Σ(X) be the substitution sending y to t_y. Then θ(y) = [ϕ(y)] for each y ∈ Y, and therefore θ(t) = [ϕ(t)] in M for any t ∈ T_Σ(Y), by the freeness of T_Σ(Y) over Y. See Figure B.1.

Therefore [ϕ(u)] = [ϕ(v)] holds in M, and by the property (*), the equation (∀X) ϕ(u) = ϕ(v) is derivable from A using (1–4) plus (5C) for each u = v in C. Therefore by rule (5C), the equation (∀X) ϕ(t) = ϕ(t′) is derivable from A, and hence by (*), θ(t) = θ(t′) holds in M, and thus the conditional equation (∀Y) t = t′ if C holds in M.

Since an unconditional equation is just a conditional equation whose set C of conditions is empty, when every equation in A is unconditional we are reduced to the simplified special case of the above argument where only the rule (5) is needed. □

This result also gives completeness for ordinary MSA, and of course for unsorted algebra, as special cases.
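The key property (*) above — two terms are identified in the quotient algebra iff the corresponding equation is derivable — becomes effectively computable in the special case of finitely many ground, unconditional equations, by closing a union-find structure under the congruence rule. The following Python sketch is illustrative only (one-sorted, ground, unconditional; all names are mine, not the text's):

```python
# Congruence-closure sketch (illustrative): terms are nested tuples
# ("op", arg1, ..., argn); equations are pairs of ground terms.
def subterms(t, acc=None):
    acc = set() if acc is None else acc
    acc.add(t)
    for a in t[1:]:
        subterms(a, acc)
    return acc

def congruence_classes(equations, extra_terms=()):
    # Returns a `find` function: find(s) == find(t) iff s = t is derivable
    # from the equations by reflexivity/symmetry/transitivity/congruence,
    # restricted to the subterms supplied.
    univ = set()
    for l, r in equations:
        subterms(l, univ); subterms(r, univ)
    for t in extra_terms:
        subterms(t, univ)
    parent = {t: t for t in univ}
    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t
    def union(a, b):
        parent[find(a)] = find(b)
    for l, r in equations:
        union(l, r)
    changed = True
    while changed:                          # close under congruence
        changed = False
        ts = list(univ)
        for i, s in enumerate(ts):
            for t in ts[i + 1:]:
                if find(s) != find(t) and s[0] == t[0] and len(s) == len(t) \
                   and all(find(a) == find(b) for a, b in zip(s[1:], t[1:])):
                    union(s, t)
                    changed = True
    return find

# ground equations: f(a) = a and b = a
a, b = ("a",), ("b",)
fa, ffa, fb = ("f", a), ("f", ("f", a)), ("f", b)
find = congruence_classes([(fa, a), (b, a)], extra_terms=[ffa, fb])
assert find(ffa) == find(a)    # f(f(a)) = a is derivable
assert find(fb) == find(fa)    # congruence: b = a gives f(b) = f(a)
```

This is only the decidable ground fragment; with variables or conditions, derivability is in general undecidable, which is why the text works with the quotient algebra abstractly.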
Now the initiality and freeness results:

Theorem 10.4.11 (Initiality) If Σ is coherent and A is a set of (possibly conditional) Σ-equations, then T_{Σ,A} is an initial (Σ, A)-algebra, and T_{Σ,A}(X) is a free (Σ, A)-algebra on X, in the sense that for each (Σ, A)-algebra M and each assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M such that ā(x) = a(x) for each x in X.

Proof: First notice that the freeness of T_{Σ,A}(X) specializes to the initiality of T_{Σ,A} when X = ∅, so that it suffices to show the freeness of T_{Σ,A}(X). Let M be an order-sorted algebra satisfying A, and let a : X → M be an assignment for M. Then we have to show that there is a unique order-sorted Σ-homomorphism a& : T_{Σ,A}(X) → M extending a, i.e., such that a&(q(x)) = a(x) for each x ∈ X, where q denotes the quotient homomorphism q : T_Σ(X) → T_{Σ,A}(X). The existence of a& follows from completeness (Theorem 10.3.2), because the fact that M satisfies A implies that a*(t) = a*(t′) for every equation (∀X) t = t′ that is derivable from A with the rules (1–4) plus (5C); this implies that ≡ ⊆ ker(a*), and thus by the universal property of quotients (Proposition 10.4.10), there is a unique order-sorted homomorphism a& : T_{Σ,A}(X) → M with a* = a& ∘ q.

The uniqueness of a& now follows by combining the universal property of T_Σ(X) as a free order-sorted algebra on X with the universal property of q as a quotient, as follows: Let h : T_{Σ,A}(X) → M be another order-sorted homomorphism such that h(q(x)) = a(x) for each x ∈ X. Since T_Σ(X) is a free order-sorted algebra on X, we have a* = h ∘ q, and by the universal property of q as a quotient we have h = a&, as desired. □

Theorem 10.3.3 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢_C e iff A ⊢_( , ,±) e.
Proof: See page 8 of [82] (OSA1), and Theorem 4.9.1, page 82. □

The model-theoretic proof of Theorem 10.6.6 below uses naturality of the family ψ_X of morphisms, which in particular implies commutativity of the following diagram for X ⊆ X′, where μ_{X,X′} is the unique Σ⊗-homomorphism induced by the composite map X ↪ X′ → T_{Σ⊗,A⊗}(X′):

[Diagram: a commutative square with top edge ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X), left edge ι_{X,X′} : T_{Σ,A}(X) → T_{Σ,A}(X′), right edge μ_{X,X′} : T_{Σ⊗,A⊗}(X) → T_{Σ⊗,A⊗}(X′), and bottom edge ψ_{X′} : T_{Σ,A}(X′) → T_{Σ⊗,A⊗}(X′).]

Theorem 10.6.6 If Σ is coherent and (Σ, A) is faithful, then the extension (Σ, A) ⊆ (Σ⊗, A⊗) is conservative.

Proof: We have to show that ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X) is injective. By the above naturality diagram plus faithfulness, it suffices to show that ψ_{X′} : T_{Σ,A}(X′) → T_{Σ⊗,A⊗}(X′) is injective, where X′ ⊇ X is obtained from X by adding a new variable symbol of sort s for each sort s with X_s = ∅. Now pick an arbitrary variable symbol x_s ∈ X′_s for each s ∈ S. The key step is to make the (Σ, A)-algebra T_{Σ,A}(X′) into a (Σ⊗, A⊗)-algebra by defining r_{s′,s} : T_{Σ,A}(X′)_{s′} → T_{Σ,A}(X′)_s to be the function that sends [t] ∈ T_{Σ,A}(X′)_{s′} to [t] if t also has sort s, and otherwise sends it to x_s. It is now easy to see that the retract equations are satisfied. Thus the freeness of T_{Σ⊗,A⊗}(X′) implies that the natural inclusion X′ → T_{Σ,A}(X′) induces a unique Σ⊗-homomorphism q : T_{Σ⊗,A⊗}(X′) → T_{Σ,A}(X′) such that q ∘ ψ_{X′} is the identity. Therefore ψ_{X′} is injective. □

Some Background on Relations

The first part of this appendix reviews some basic material that is assumed in the body of this text.
The approach is oriented towards use in algebra and OBJ, and differs from traditional set-theoretic formalizations like that in Definition C.0.1. Section C.1 implements some of this material in OBJ.

Definition C.0.1 A set-theoretic relation, from a set A to a set B, is a subset R ⊆ A × B; we let aRb mean that ⟨a, b⟩ ∈ R. The image of R ⊆ A × B is the set {b | aRb for some a ∈ A}, a subset of B, and the coimage of R is {a | aRb for some b ∈ B}, a subset of A. A set-theoretic function from A to B is a set-theoretic relation f from A to B that satisfies the following two properties:
(1) for each a ∈ A there is some b ∈ B such that ⟨a, b⟩ ∈ f, and
(2) if ⟨a, b⟩, ⟨a, b′⟩ ∈ f then b = b′.
When f is a function, we usually write f(a) for the unique b such that ⟨a, b⟩ ∈ f. □

The above is not very satisfactory in some respects. For example, consider the case where A ⊆ B and we want f to be the inclusion function from A to B. As a set-theoretic relation, this is {⟨a, a⟩ | a ∈ A}, which is exactly the same as the set-theoretic relation for the identity function on A (for a specific example, let A = ω+ and B = ω*). But these two functions are not the same; although they have the same graph, image, and coimage, they have different target sets. Indeed, there are many proofs in this text that use inclusion functions, and that would fail if inclusion and identity functions were the same! Hence, the above formalization of the relation concept is not suitable for our purposes. Instead, we use the following:

Definition C.0.2 A relation from A to B is a triple ⟨A, R, B⟩, where R is a set-theoretic relation from A to B, called the graph of the relation. We write aRb if ⟨a, b⟩ ∈ R. If B = A, then R is said to be a relation on A.
A and B are called the source and target of R, respectively, or sometimes the domain and codomain of R, respectively, and we may write R : A → B. We may also use the terms image and coimage, as defined in Definition C.0.1, for the graph R of the relation. If the graph R satisfies (1) and (2) of Definition C.0.1, then the relation is called a function, an arrow, or a map from A to B. □

This differs from Definition C.0.1 in that source and target sets are explicitly given; this allows us to distinguish inclusions from identities, but we may still abbreviate R : A → B by just R.

Definition C.0.3 Given a relation R : A → B and A′ ⊆ A, then {b | ⟨a, b⟩ ∈ R for some a ∈ A′} is called the image of A′ under R, written R(A′). Also, given B′ ⊆ B, then {a | ⟨a, b⟩ ∈ R for some b ∈ B′} is called the inverse image of B′ under R, written R⁻¹(B′). □

The following equivalent formalization of relations is more suitable for mechanization in OBJ (the equivalence is discussed in Chapter 8):

Definition C.0.4 A relation from A to B is an arrow A × B → {true, false}. Its graph is the set {⟨a, b⟩ | R(a, b) = true}, and A, B are called its source and target sets, respectively. □

Here are some further concepts associated with functions that are used in this book:

Definition C.0.5 A function f : A → B is injective iff f(a) = f(a′) implies a = a′ for all a, a′ ∈ A, is surjective iff for all b ∈ B there is some a ∈ A such that f(a) = b, and is bijective iff it is both injective and surjective. □

Exercise C.0.1 Given a function f : A → B, show that f⁻¹(B) = A. □

Exercise C.0.2 Show that a function f : A → B is surjective iff its image is B.
□

Definition C.0.6 Given a function f : A → B and A′ ⊆ A, then the restriction of f to A′ is the function f|A′ : A′ → B with graph {⟨a, b⟩ | a ∈ A′ and ⟨a, b⟩ ∈ f}. Also, given B′ ⊆ B, the corestriction of f to B′ is the function f⁻¹(B′) → B′ with graph {⟨a, b⟩ | b ∈ B′ and ⟨a, b⟩ ∈ f}. □

We now consider a number of different kinds of relation on a set. (Note that we can always recover the source of a set-theoretic function, because of condition (1), though not its target.)

Definition C.0.7 A relation R on a set A is:
• reflexive iff aRa for all a ∈ A,
• symmetric iff aRa′ implies a′Ra for all a, a′ ∈ A,
• anti-reflexive iff aRa for no a ∈ A,
• anti-symmetric iff aRa′ and a′Ra imply a = a′ for all a, a′ ∈ A,
• transitive iff aRa′ and a′Ra′′ imply aRa′′ for all a, a′, a′′ ∈ A,
• a partial ordering iff it is reflexive, anti-symmetric, and transitive,
• a quasi ordering iff it is anti-reflexive and transitive, and
• an equivalence relation iff it is reflexive, symmetric, and transitive.

It is customary to let ≥ and > denote partial and quasi orderings, respectively, and to call a set with a partial ordering a poset, in which case the underlying set A may be called the carrier of the poset. □

Example C.0.8 Let A be the set of all people. Then the relation "ancestor-of" is transitive and anti-reflexive, and is thus a quasi ordering; but it is not symmetric or reflexive. The "cousin-of" relation is symmetric, but not transitive or reflexive, although the "cousin-of-or-equal" relation is reflexive, symmetric and transitive, and thus is an equivalence relation. The "child-of" relation is anti-reflexive, but has none of the other properties in Definition C.0.7.
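On a finite carrier, each property in Definition C.0.7 can be checked by brute force. Here is an illustrative Python sketch (not from the text; the carrier and helper names are my own), using divisibility on a small set of numbers as a concrete partial ordering:

```python
# Brute-force checks of the properties in Definition C.0.7.
# A relation is represented as a set R of pairs over a carrier A.

def reflexive(A, R):      return all((a, a) in R for a in A)
def anti_reflexive(A, R): return all((a, a) not in R for a in A)
def symmetric(A, R):      return all((b, a) in R for (a, b) in R)
def anti_symmetric(A, R): return all(a == b for (a, b) in R if (b, a) in R)
def transitive(A, R):
    return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

def is_partial_order(A, R):
    return reflexive(A, R) and anti_symmetric(A, R) and transitive(A, R)

def is_quasi_order(A, R):
    return anti_reflexive(A, R) and transitive(A, R)

A = {1, 2, 3, 4, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}
print(is_partial_order(A, divides))   # divisibility is a partial ordering

strict = {(a, b) for (a, b) in divides if a != b}
print(is_quasi_order(A, strict))      # proper divisibility is a quasi ordering
```

The "divides" relation plays the role of ≥ here, and its strict version the role of >, matching the notational convention stated after Definition C.0.7.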
□

If X is a set, then P(X) denotes the set of all subsets of X, called the power set of X.

Exercise C.0.3 Let A = P(ω). Then what properties does the "subset-of" relation on A have? What about the "proper-subset-of" relation? If A = P(X), do the answers to these questions vary with X? If so, how? If not, why? □

There is an important bijective correspondence between the partial and the quasi orderings on a set. This is expressed precisely in the following:

Proposition C.0.9 Given a set A and a relation R on A, define R^Q by a R^Q a′ iff aRa′ and a ≠ a′, and define R^P by a R^P a′ iff aRa′ or a = a′. Then R^Q is a quasi order if R is a partial order, and R^P is a partial order if R is a quasi order. Moreover, if R is a quasi order, then (R^P)^Q = R, and if R is a partial order then (R^Q)^P = R.

Proof: The reader can check the following assertions, which complete the proof: if R is transitive and anti-symmetric, then R^Q is transitive; if R is transitive then so is R^P; if R is any relation, then R^Q is anti-reflexive, and R^P is reflexive and anti-symmetric; if R is a partial order, then aRa′ iff a R^Q a′ or a = a′ iff a (R^Q)^P a′; and if R is a quasi order, then aRa′ iff a R^P a′ and a ≠ a′ iff a (R^P)^Q a′. □

We can also have operations and relations on relations. For example:

Definition C.0.10 If R and R′ are relations on A, then R ⊆ R′ means that aRa′ implies aR′a′ for all a, a′ ∈ A. Also, we define the union of R and R′, denoted R ∪ R′, by a(R ∪ R′)a′ iff aRa′ or aR′a′, and their intersection, denoted R ∩ R′, by a(R ∩ R′)a′ iff aRa′ and aR′a′.
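The correspondence of Proposition C.0.9 is easy to observe concretely: passing from a partial order to its quasi (strict) version just removes the diagonal, and passing back adds it. A small illustrative Python sketch (not from the text), checking both round trips on the usual ordering of {1, 2, 3}:

```python
# Sketch of Proposition C.0.9: R^Q removes the diagonal, R^P adds it,
# and the two constructions are mutually inverse.

def to_quasi(R):
    # R^Q: keep only pairs with distinct components
    return {(a, b) for (a, b) in R if a != b}

def to_partial(R, A):
    # R^P: adjoin the diagonal over the carrier A
    return set(R) | {(a, a) for a in A}

A = {1, 2, 3}
leq = {(a, b) for a in A for b in A if a <= b}   # a partial order
lt = to_quasi(leq)                               # the corresponding quasi order
print(to_partial(lt, A) == leq)    # (R^Q)^P = R for a partial order R
print(to_quasi(to_partial(lt, A)) == lt)   # (R^P)^Q = R for a quasi order R
```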
□

Proposition C.0.11 Every relation R on A is contained in a least transitive relation on A, denoted R⁺ and called the transitive closure of R.

Proof: We define R⁺ as follows: aR⁺a′ iff there exists a finite (possibly empty) list a₀, a₁, . . . , aₙ of elements of A such that aRa₀ and a₀Ra₁ and . . . aₙRa′. Then R ⊆ R⁺, and it is not hard to check that R⁺ is transitive.

Now suppose that R ⊆ S and that S is transitive. If aR⁺a′, then there exist a₀, . . . , aₙ such that aRa₀ and a₀Ra₁ and . . . aₙRa′. Thus aSa₀ and a₀Sa₁ and . . . aₙSa′ (because R ⊆ S), and so transitivity of S gives aSa′. Therefore R⁺ ⊆ S, and hence R⁺ is least. □

Example C.0.12 If A is the set of all people and R is the "parent-of" relation, then R⁺ is the "ancestor-of" relation. □

Proposition C.0.13 Every relation R on a set A is contained in a least transitive and reflexive relation on A, denoted R∗ and called the transitive, reflexive closure of R.

Proof: Let us define aR∗a′ iff a = a′ or aR⁺a′. Then R∗ is transitive, reflexive, and contains R. If S is another such relation, then R⁺ ⊆ S by Proposition C.0.11 and hence R∗ ⊆ S (by reflexivity). □

Definition C.0.14 Given a relation R on A, its converse is denoted R⌣, and is defined by aR⌣a′ iff a′Ra. □

Example C.0.15 If A is the set of people and R is the "parent-of" relation, then R⌣ is the "child-of" relation. □

Proposition C.0.16 Every relation R on A is contained in a least symmetric relation on A, namely R ∪ R⌣, called the symmetric closure of R and denoted R±. □

Proposition C.0.17 Every relation R on A is contained in a least equivalence relation on A, namely (R±)∗, called the equivalence relation generated by R, and denoted R≡. □

Exercise C.0.4 Prove Propositions C.0.16 and C.0.17.
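On a finite carrier the closures of Propositions C.0.11 through C.0.17 can all be computed by iteration. The following Python sketch (illustrative only, not from the text; the example carrier is my own) builds each one, and recovers the "ancestor-of" relation of Example C.0.12 from a tiny "parent-of" relation:

```python
# Sketch of the closure constructions: R+ (transitive), R* (transitive,
# reflexive), R± (symmetric), and R≡ (generated equivalence relation).

def transitive_closure(R):
    # R+: add composites (a, c) until nothing new appears
    R = set(R)
    while True:
        new = {(a, c) for (a, b) in R for (b2, c) in R if b == b2} - R
        if not new:
            return R
        R |= new

def refl_trans_closure(R, A):
    # R* = R+ together with the diagonal (Proposition C.0.13)
    return transitive_closure(R) | {(a, a) for a in A}

def symmetric_closure(R):
    # R± = R union its converse (Proposition C.0.16)
    return set(R) | {(b, a) for (a, b) in R}

def equivalence_closure(R, A):
    # R≡ = (R±)* (Proposition C.0.17)
    return refl_trans_closure(symmetric_closure(R), A)

people = {"ann", "bob", "cal"}
parent_of = {("ann", "bob"), ("bob", "cal")}
ancestor_of = transitive_closure(parent_of)   # Example C.0.12
print(("ann", "cal") in ancestor_of)          # True: grandparent is ancestor
```

Each function adds exactly the pairs the corresponding proposition requires, and nothing more, so each result is the least closure of its kind.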
□

Exercise C.0.5 Show that a function f : A → B is bijective iff the converse of its graph is the graph of a function from B to A. □

Definition C.0.18 If ≡ is an equivalence relation on a set A, we let [a] denote the ≡-equivalence class of a ∈ A, defined to be [a] = { a′ ∈ A | a′ ≡ a }, and we define the quotient of A by ≡, denoted A/≡, to be { [a] | a ∈ A }. □

Exercise C.0.6 If ≡ is an equivalence relation, and if [a] and [b] are two distinct ≡-equivalence classes, then show that [a] ∩ [b] = ∅. Also, show that ⋃{ C | C ∈ A/≡ } = A. □

Example C.0.19 If A is the set of all people, alive now or in the past, and if R is again the "parent-of" relation, then the hypothesis that all people are descended from Adam and Eve implies that there is just one equivalence class under the relation R≡. On the other hand, if there is more than one equivalence class, there may be aliens among us, another species, or some other non-interbreeding population. □

Everything in this section extends to S-sorted relations and functions in the style of Section 2.2, using the following set-theoretic representation for S-sorted sets: Let S be a set, and let Set be some set of sets that includes all sets that are candidates for use in indexed sets within the current context; then an S-sorted set A is a function A : S → Set.

C.1 OBJ Theories for Relations

This section gives OBJ3 code for some of the concepts discussed above. It is intended to be read later than the above material, after the relevant concepts from OBJ have been studied.

To get started, here is OBJ3 code for the theory of relations:

  th REL is
    sort Elt .
    op _R_ : Elt Elt -> Bool .
  endth

Notice that nothing at all is assumed about R. We will enrich this theory in various ways in the following development.

Example C.1.1 The theory of partial ordering relations is given below; a set with a partial ordering is called a "poset."

  th POSET is
    sort Elt .
    op _=>_ : Elt Elt -> Bool .
    vars A B C : Elt .
    eq A => A = true .
    cq A = B if A => B and B => A .
    cq A => C = true if A => B and B => C .
  endth

Any initial algebra of the following specification of the natural numbers with their natural ordering will satisfy the above theory:

  th NAT is
    sort Nat .
    op 0 : -> Nat .
    op s : Nat -> Nat .
    op _=>_ : Nat Nat -> Bool .
    vars A B : Nat .
    eq A => A = true .
    eq s(A) => 0 = true .
    eq s(A) => s(B) = A => B .
  endth

□

Exercise C.1.1 Write an OBJ3 theory for quasi orders and show how to get a partial order from a quasi order, and vice versa. Give three examples of a quasi order. □

Example C.1.2 (⋆) The transitive closure of a relation R is specified by the following, in which the imported module REL is assumed to define R:

  obj TRCL is
    pr REL .
    op _R+_ : Elt Elt -> Bool .
    vars A B C : Elt .
    cq A R+ B = true if A R B .
    cq A R+ C = true if A R B and B R+ C .
  endo

However, it is more elegant to define transitive closure as a parameterized theory, as follows (Chapter 11 discusses this concept):

  obj TRCL[R :: REL] is
    op _R+_ : Elt Elt -> Bool .
    vars A B C : Elt .
    cq A R+ B = true if A R B .
    cq A R+ C = true if A R B and B R+ C .
  endo

There are some peculiar points about the OBJ3 specifications above. First, the second conditional equation is not a rewrite rule, because the variable B occurs in the condition but not in the leftside. Therefore OBJ3 will not accept it in an object module, although it will accept it in a theory module. However, initial semantics is needed, because the transitive closure is supposed to be the least transitive relation containing the given one. This means that the above code will not actually run in OBJ3. This is because a variable that is not in the leftside acts as if it were existentially quantified, and OBJ3 cannot handle existential quantifiers.
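To see why the extra variable B behaves existentially, note that the second TRCL equation says: A R+ C holds if there exists some B with A R B and B R+ C. Finding such a witness B requires search rather than rewriting. An illustrative Python sketch of that search on a finite relation (not from the text; the function name and cycle guard are my own):

```python
# The condition "A R B and B R+ C" with B not in the leftside amounts to
# an existential: search for a witness B. The 'seen' set guards against
# cycles in R so the search terminates on finite relations.

def r_plus(R, a, c, seen=None):
    seen = seen if seen is not None else set()
    if (a, c) in R:          # first TRCL equation: A R C gives A R+ C
        return True
    # second TRCL equation: try each b with a R b as a witness --
    # this witness search is the step that plain rewriting cannot perform
    return any(b not in seen and r_plus(R, b, c, seen | {a})
               for (a2, b) in R if a2 == a)

R = {(1, 2), (2, 3)}
print(r_plus(R, 1, 3))   # True, via the witness b = 2
```

This is essentially what a language with initial semantics and existential queries, such as Eqlog, would do for us automatically.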
However, the specifications make perfect semantic sense, and in fact would run in Eqlog [79, 42].

Note that under initial semantics, when a R+ b does not equal true, it also does not equal false, but rather equals the term a R+ b, because that term is itself reduced. This means that if we want a version of transitive closure that does equal false when it is not true, then we should replace occurrences of the expression a R+ b by the expression a R+ b == true. □

Exercise C.1.2 (⋆) Write an OBJ3 parameterized theory specifying the transitive, reflexive closure of a relation. □

Example C.1.3 Here is an OBJ3 theory for equivalence relations:

  th EQV is
    sort Elt .
    op _≡_ : Elt Elt -> Bool .
    vars A B C : Elt .
    eq A ≡ A = true .
    eq A ≡ B = B ≡ A .
    cq A ≡ C = true if A ≡ B and B ≡ C .
  endth

□

Exercise C.1.3 (⋆) Write an OBJ3 parameterized theory specifying the transitive, reflexive, symmetric closure of a relation; this is called its equivalence closure. □

Social Implications

Practical interest in verification technology is particularly acute for so-called "critical systems," which have the property that incorrect operation may result in loss of human life, compromise of national security, massive loss of property, etc. Typical examples are heart pacemakers, flight control systems, automobile brake controllers, encryption systems, nuclear power plants, and electronic fund transfer systems. In this context, it is especially important to understand the limitations of verification, both those limitations that are inherent in the nature of verification, and those that are due to the current state of the art. Unfortunately, there is a temptation to play down, or even cover up, these limitations, due to the lure of fame and fortune.
When verification is just an academic exercise, this does little harm; but there is cause for serious concern when manufacturers make advertising claims about the reliability of a critical system (or component thereof) based on its having been verified. As Dr. Avra Cohn [31] says,

  The use of the word 'verified' must under no circumstances be allowed to confer a false sense of security.

This is because it could lead to unintentionally and unnecessarily taking severe risks. Indeed, one might well argue that to knowingly make false or misleading claims about the reliability of a critical system should be a criminal offense. It is certainly a grave moral offense.

We should realize that nothing in the real world can have the certainty of mathematical truth. Although we might like to think that '8 × 7 = 56' is always and incontrovertibly true in some mathematical heaven, an actual human being can sometimes misremember the multiplication table, and an actual machine can sometimes drop a bit, break a connector, or burn out a power supply. The most that can truthfully be asserted of a "verified chip" is that certain kinds of design errors have been made far less likely. Of course, this is very much worth pursuing, but there remains a long chain of assumptions that must be satisfied in order that an actual physical instance of a chip whose logic has been verified will operate as intended in situ, including the following: the chip must be correctly fabricated; it must be correctly installed; it must be fed correct data and given correct power; the electronic circuits that realize its logic must be correctly designed, and used only within their design limits; the analog circuits that support communication must operate correctly; there must not be excessive electromagnetic radiation around the chip; etc., etc.
In addition, human factors are often involved; for example, the user should not override warning signals, and must correctly interpret the output.

However, I wish to concentrate on certain logical issues that are involved. A major point, emphasized by Cohn [31], is that verification is a relationship between two models, that is, between two mathematical abstractions, one of the chip, and the other of the designers' intentions. However, such models can never capture either the totality of any particular chip, or even of the designers' intentions for that chip. This is due to a number of factors, including errors made in constructing these necessarily complex abstractions, and deliberately ignoring certain factors (which is of course the very nature of abstraction), such as fluctuations in power levels, aging of physical components, overheating, etc. Moreover, since the languages in which designs and specifications are usually expressed are relatively informal, there must be a translation into some formal language, and errors may be introduced when either the designers or the verifiers misunderstand these informal languages. In addition, the theorem prover must correctly implement some logical system, and the formalism for representing the chip in logic must be correct with respect to that system. Unless a theorem prover is rigorously based on a precise and well-understood logical system, there is little basis for confidence in its "proofs." For example, it is seductive but dangerous to "throw together" several different logical systems, since the combination may fail to have any obvious notions of model and satisfaction, even though the components do have them.
Even a theorem prover with a sound logical basis is likely to have some bugs in its code, because it is after all a complex real system itself. Moreover, it is always possible that the assumptions about how formal sentences represent physical devices are flawed in some subtle ways, for example, relating to signal strength, and it is all too easy to use a theorem prover incorrectly, for example, to give it erroneous input, or to interpret its output incorrectly. Finally, we must note that the current state of the art is not adequate to support the verification of really large or complex systems, although recent advances have been both rapid and significant, and the future looks promising.

Bibliography

[1] Arnon Avron, Furio Honsell, and Ian Mason. Using typed lambda calculus to implement formal systems on a computer. Technical Report ECS-LFCS-87-31, Laboratory for Computer Science, University of Edinburgh, 1987.
[2] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge, 1998.
[3] Henk Barendregt. The Lambda Calculus, its Syntax and Semantics. North-Holland, 1984. Second Revised Edition.
[4] Henk Barendregt. Functional programming and lambda calculus. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science. North-Holland, 1989.
[5] Michael Barr and Charles Wells. Toposes, Triples and Theories. Springer, 1985. Grundlehren der Mathematischen Wissenschaften, Volume 278.
[6] Michael Barr and Charles Wells. Category Theory for Computing Science. Prentice-Hall, 1990.
[7] Roland Barthes. S/Z: An Essay. Hill and Wang, 1974. Trans. Richard Miller.
[8] Jean Bénabou. Structures algébriques dans les catégories. Cahiers de Topologie et Géométrie Différentielle, 10:1–126, 1968.
[9] Jan Bergstra, Jan Heering, and Paul Klint. Algebraic Specification. Association for Computing Machinery, 1989.
[10] Jan Bergstra and Jan Willem Klop. Conditional rewrite rules: Confluence and termination.
Journal of Computer and System Sciences, 32:323–362, 1986.
[11] Jan Bergstra and John Tucker. Characterization of computable data types by means of a finite equational specification method. In Jaco de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, Seventh Colloquium, pages 76–90. Springer, 1980. Lecture Notes in Computer Science, Volume 81.
[12] Garrett Birkhoff. On the structure of abstract algebras. Proceedings of the Cambridge Philosophical Society, 31:433–454, 1935.
[13] Garrett Birkhoff and J. Lipson. Heterogeneous algebras. Journal of Combinatorial Theory, 8:115–133, 1970.
[14] Woodrow Wilson Bledsoe and Donald Loveland, editors. Automated Theorem Proving: After 25 Years. American Mathematical Society, 1984. Volume 29 of Contemporary Mathematics Series.
[15] Barry Boehm. Software Engineering Economics. Prentice-Hall, 1981.
[16] William Boone. The word problem. Ann. Math., 70:207–265, 1959.
[17] Adel Bouhoula. Automated theorem proving by test set induction. Journal of Symbolic Computation, 23(1):47–77, 1997.
[18] Adel Bouhoula and Jean-Pierre Jouannaud. Automata-driven automated induction. In Proceedings, 12th Symposium on Logic in Computer Science, pages 14–25. IEEE, 1997.
[19] Robert Boyer and J Moore. A Computational Logic. Academic, 1980.
[20] J.W. Brewer and Martha K. Smith, editors. Emmy Noether: A Tribute to her Life and Work. Dekker, 1981.
[21] Rod Burstall. Proving properties of programs by structural induction. Computer Journal, 12(1):41–48, 1969.
[22] Rod Burstall and Joseph Goguen. Putting theories together to make specifications. In Raj Reddy, editor, Proceedings, Fifth International Joint Conference on Artificial Intelligence, pages 1045–1058. Department of Computer Science, Carnegie-Mellon University, 1977.
[23] Rod Burstall and Joseph Goguen. The semantics of Clear, a specification language.
In Dines Bjørner, editor, Proceedings, 1979 Copenhagen Winter School on Abstract Software Specification, pages 292–332. Springer, 1980. Lecture Notes in Computer Science, Volume 86.
[24] Rod Burstall and Joseph Goguen. Algebras, theories and freeness: An introduction for computer scientists. In Martin Wirsing and Gunther Schmidt, editors, Theoretical Foundations of Programming Methodology, pages 329–350. Reidel, 1982. Proceedings, 1981 Marktoberdorf NATO Summer School, NATO Advanced Study Institute Series, Volume C91.
[25] Albert Camilleri, Michael J.C. Gordon, and Tom Melham. Hardware verification using higher-order logic. Technical Report 91, University of Cambridge, Computer Laboratory, June 1986.
[26] Carlo Cavenathi, Marco De Zanet, and Giancarlo Mauri. MC-OBJ: a C interpreter for OBJ, 1987.
[27] Chin-Liang Chang and Richard Char-Tung Lee. Symbolic Logic and Mechanical Theorem Proving. Academic, 1973.
[28] Alonzo Church, editor. The Calculi of Lambda-Conversion. Princeton, 1941. Annals of Mathematics Studies, No. 6.
[29] Manuel Clavel, Francisco Durán, Steven Eker, José Meseguer, and M.-O. Stehr. Maude as a formal meta-tool. In Proceedings, FM'99 - Formal Methods, Volume II, pages 1684–1701. Springer, 1999. Lecture Notes in Computer Science, Volume 1709.
[30] Manuel Clavel, Steven Eker, Patrick Lincoln, and José Meseguer. Principles of Maude. In José Meseguer, editor, Proceedings, First International Workshop on Rewriting Logic and its Applications. Elsevier Science, 1996. Volume 4, Electronic Notes in Theoretical Computer Science.
[31] Avra Cohn. Correctness properties of the Viper block model: The second level. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 1–91. Springer, 1989.
[32] Paul M. Cohn. Universal Algebra. Harper and Row, 1965. Revised edition 1980.
[33] Derek Coleman, Robin Gallimore, and Victoria Stavridou.
The design of a rewrite rule interpreter from algebraic specifications. IEE Software Engineering Journal, July:95–104, 1987.
[34] Alain Colmerauer, H. Kanoui, and M. van Caneghem. Etude et réalisation d'un système Prolog. Technical report, Groupe d'Intelligence Artificielle, U.E.R. de Luminy, Université d'Aix-Marseille II, 1979.
[35] Robert Constable et al. Implementing Mathematics with the Nuprl Proof Development System. Prentice-Hall, 1986.
[36] Pierre-Louis Curien. Categorical Combinators, Sequential Algorithms, and Functional Programming. Pitman and Wiley, 1986. Research Notes in Theoretical Computer Science.
[37] Haskell Curry and R. Feys. Combinatory Logic, Volume I. North-Holland, 1958.
[38] Haskell Curry, J.R. Hindley, and J.P. Seldin. Combinatory Logic, Volume II. North-Holland, 1972. Studies in Logic 65.
[39] Nachum Dershowitz and Jean-Pierre Jouannaud. Rewriting systems. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume B: Formal Methods and Semantics, pages 243–320. North-Holland, 1990.
[40] Nachum Dershowitz and David Plaisted. Equational programming. In John Hayes, Donald Michie, and J. Richards, editors, Machine Intelligence 11, pages 21–56. Oxford, 1987.
[41] Răzvan Diaconescu. The logic of Horn clauses is equational. Technical Report PRG–TR–3–93, Programming Research Group, University of Oxford, 1993. Written 1990.
[42] Răzvan Diaconescu. Category-based Semantics for Equational and Constraint Logic Programming. PhD thesis, Programming Research Group, Oxford University, 1994.
[43] Răzvan Diaconescu and Kokichi Futatsugi. CafeOBJ Report: The Language, Proof Techniques, and Methodologies for Object-Oriented Algebraic Specification. World Scientific, 1998. AMAST Series in Computing, Volume 6.
[44] Aubert Daigneault, editor. Studies in Algebraic Logic. Mathematical Association of America, 1974. MAA Studies in Mathematics, Vol. 9.
[45] Hartmut Ehrig and Bernd Mahr.
Fundamentals of Algebraic Specification 1: Equations and Initial Semantics. Springer, 1985. EATCS Monographs on Theoretical Computer Science, Vol. 6.
[46] Trevor Evans. On multiplicative systems defined by generators and relations, I. Proceedings of the Cambridge Philosophical Society, 47:637–649, 1951.
[47] Kokichi Futatsugi, Joseph Goguen, Jean-Pierre Jouannaud, and José Meseguer. Principles of OBJ2. In Brian Reid, editor, Proceedings, Twelfth ACM Symposium on Principles of Programming Languages, pages 52–66. Association for Computing Machinery, 1985.
[48] Jean H. Gallier. Logic for Computer Scientists. Harper and Row, 1986.
[49] Stephen Garland and John Guttag. Inductive methods for reasoning about abstract data types. In Proceedings, Fifteenth Symposium on Principles of Programming Languages, pages 219–229. Association for Computing Machinery, January 1988.
[50] Jürgen Giesl and Aart Middeldorp. Innermost termination of context-sensitive rewriting. In Proceedings, Sixth International Conference on Developments in Language Theory. Springer, 2002. Lecture Notes in Computer Science.
[51] Joseph Goguen. Reality and human values in mathematics. Submitted for publication.
[52] Joseph Goguen. Semantics of computation. In Ernest Manes, editor, Proceedings, First International Symposium on Category Theory Applied to Computation and Control, pages 151–163. Springer, 1975. (San Francisco, February 1974.) Lecture Notes in Computer Science, Volume 25.
[53] Joseph Goguen. Abstract errors for abstract data types. In Eric Neuhold, editor, Proceedings, First IFIP Working Conference on Formal Description of Programming Concepts, pages 21.1–21.32. MIT, 1977. Also in Formal Description of Programming Concepts, Peter Neuhold, Ed., North-Holland, pages 491–522, 1979.
[54] Joseph Goguen. Order-sorted algebra. Technical Report 14, UCLA Computer Science Department, 1978. Semantics and Theory of Computation Series.
[55] Joseph Goguen.
Some design principles and theory for OBJ-0, a language for expressing and executing algebraic specifications of programs. In Edward Blum, Manfred Paul, and Satoru Takasu, editors, Proceedings, Conference on Mathematical Studies of Information Processing, pages 425–473. Springer, 1979. Lecture Notes in Computer Science, Volume 75.
[56] Joseph Goguen. How to prove algebraic inductive hypotheses without induction, with applications to the correctness of data type representations. In Wolfgang Bibel and Robert Kowalski, editors, Proceedings, Fifth Conference on Automated Deduction, pages 356–373. Springer, 1980. Lecture Notes in Computer Science, Volume 87.
[57] Joseph Goguen. Modular algebraic specification of some basic geometrical constructions. Artificial Intelligence, pages 123–153, 1988. Special Issue on Computational Geometry, edited by Deepak Kapur and Joseph Mundy; also, Report CSLI-87-87, Center for the Study of Language and Information at Stanford University, March 1987.
[58] Joseph Goguen. Memories of ADJ. Bulletin of the European Association for Theoretical Computer Science, 36:96–102, October 1989. Guest column in the 'Algebraic Specification Column.' Also in Current Trends in Theoretical Computer Science: Essays and Tutorials, World Scientific, 1993, pages 76–81.
[59] Joseph Goguen. OBJ as a theorem prover, with application to hardware verification. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 218–267. Springer, 1989.
[60] Joseph Goguen. What is unification? A categorical view of substitution, equation and solution. In Maurice Nivat and Hassan Aït-Kaci, editors, Resolution of Equations in Algebraic Structures, Volume 1: Algebraic Techniques, pages 217–261. Academic, 1989.
[61] Joseph Goguen. Higher-order functions considered unnecessary for higher-order programming. In David Turner, editor, Research Topics in Functional Programming, pages 309–352.
Addison-Wesley, 1990. University of Texas at Austin Year of Programming Series; preliminary version in SRI Technical Report SRI-CSL-88-1, January 1988.
[62] Joseph Goguen. Proving and rewriting. In Hélène Kirchner and Wolfgang Wechler, editors, Proceedings, Second International Conference on Algebraic and Logic Programming, pages 1–24. Springer, 1990. Lecture Notes in Computer Science, Volume 463.
[63] Joseph Goguen. A categorical manifesto. Mathematical Structures in Computer Science, 1(1):49–67, March 1991.
[64] Joseph Goguen. An introduction to algebraic semiotics, with applications to user interface design. In Chrystopher Nehaniv, editor, Computation for Metaphors, Analogy and Agents, pages 242–291. Springer, 1999. Lecture Notes in Artificial Intelligence, Volume 1562.
[65] Joseph Goguen. Social and semiotic analyses for theorem prover user interface design. Formal Aspects of Computing, 11:272–301, 1999. Special issue on user interfaces for theorem provers.
[66] Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for computer science. Technical Report CSLI-85-30, Center for the Study of Language and Information, Stanford University, 1985. A preliminary version appears in Proceedings, Logics of Programming Workshop, Edmund Clarke and Dexter Kozen, editors, Springer Lecture Notes in Computer Science, Volume 164, pages 221–256, 1984.
[67] Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for specification and programming. Journal of the Association for Computing Machinery, 39(1):95–146, January 1992.
[68] Joseph Goguen and Răzvan Diaconescu. An Oxford survey of order-sorted algebra. Mathematical Structures in Computer Science, 4:363–392, 1994.
[69] Joseph Goguen, Jean-Pierre Jouannaud, and José Meseguer. Operational semantics of order-sorted algebra. In Wilfried Brauer, editor, Proceedings, 1985 International Conference on Automata, Languages and Programming, pages 221–231. Springer, 1985. Lecture Notes in Computer Science, Volume 194.
[70] Joseph Goguen, Claude Kirchner, and José Meseguer. Concurrent term rewriting as a model of computation. In Robert Keller and Joseph Fasel, editors, Proceedings, Graph Reduction Workshop, pages 53–93. Springer, 1987. Lecture Notes in Computer Science, Volume 279.
[71] Joseph Goguen and Kai Lin. Behavioral verification of distributed concurrent systems with BOBJ. In Hans-Dieter Ehrich and T.H. Tse, editors, Proceedings, Conference on Quality Software, pages 216–235. IEEE Press, 2003.
[72] Joseph Goguen and Kai Lin. Specifying, programming and verifying with equational logic. In Sergei Artemov, Howard Barringer, Artur d'Avila Garcez, Luis Lamb, and John Woods, editors, We Will Show Them! Essays in honour of Dov Gabbay, Vol. 2, pages …
[73] … Proceedings, Automated Software Engineering, pages 55–62. IEEE, 1997.
[74] Joseph Goguen, Kai Lin, Akira Mori, Grigore Roşu, and Akiyoshi Sato. Tools for distributed cooperative design and validation. In Proceedings, CafeOBJ Symposium. Japan Advanced Institute for Science and Technology, 1998. Numazu, Japan, April 1998.
[75] Joseph Goguen, Kai Lin, and Grigore Roşu. Circular coinductive rewriting. In Automated Software Engineering '00, pages 123–131. IEEE, 2000. Proceedings of a workshop held in Grenoble, France.
[76] Joseph Goguen, Kai Lin, Grigore Roşu, Akira Mori, and Bogdan Warinschi. An overview of the Tatami project. In Kokichi Futatsugi, Ataru Nakagawa, and Tetsuo Tamai, editors, Cafe: An Industrial-Strength Algebraic Formal Method, pages 61–78. Elsevier, 2000.
[77] Joseph Goguen and Grant Malcolm. Algebraic Semantics of Imperative Programs. MIT, 1996.
[78] Joseph Goguen and José Meseguer. Completeness of many-sorted equational logic. Houston Journal of Mathematics, 11(3):307–334, 1985.
[79] Joseph Goguen and José Meseguer. Eqlog: Equality, types, and generic modules for logic programming. In Douglas DeGroot and Gary Lindstrom, editors, Logic Programming: Functions, Relations and Equations, pages 295–363.
Prentice-Hall, 1986. An earlier version appears in Journal of Logic Programming, Volume 1, Number 2, pages 179–210, September 1984.
[80] Joseph Goguen and José Meseguer. Remarks on remarks on many-sorted equational logic. Bulletin of the European Association for Theoretical Computer Science, 30:66–73, October 1986. Also in SIGPLAN Notices, Volume 22, Number 4, pages 41–48, April 1987.
[81] Joseph Goguen and José Meseguer. Order-sorted algebra solves the constructor selector, multiple representation and coercion problems. In Proceedings, Second Symposium on Logic in Computer Science, pages 18–29. IEEE Computer Society, 1987. Also Report CSLI-87-92, Center for the Study of Language and Information, Stanford University, March 1987; revised version in Information and Computation, 103, 1993.
[82] Joseph Goguen and José Meseguer. Order-sorted algebra I: Equational deduction for multiple inheritance, overloading, exceptions and partial operations. Theoretical Computer Science, 105(2):217–273, 1992. Drafts exist from as early as 1985.
[83] Joseph Goguen, Akira Mori, and Kai Lin. Algebraic semiotics, ProofWebs and distributed cooperative proving. In Yves Bertot, editor, UITP'97, User Interfaces for Theorem Provers, pages 25–34. INRIA, 1999. (Sophia Antipolis, 1–2 September 1997).
[84] Joseph Goguen, Andrew Stevens, Keith Hobley, and Hendrik Hilberdink. 2OBJ, a metalogical framework based on equational logic. Philosophical Transactions of the Royal Society, Series A, 339:69–86, 1992. Also in Mechanized Reasoning and Hardware Design, edited by C.A.R. Hoare and Michael J.C. Gordon, Prentice-Hall, 1992, pages 69–86.
[85] Joseph Goguen and Joseph Tardo. OBJ-0 preliminary users manual. Semantics and theory of computation report 10, UCLA, 1977.
[86] Joseph Goguen, James Thatcher, and Eric Wagner. An initial algebra approach to the specification, correctness and implementation of abstract data types.
In Raymond Yeh, editor, Current Trends in Programming Methodology, IV, pages 80–149. Prentice-Hall, 1978.

[87] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Abstract data types as initial algebras and the correctness of data representations. In Alan Klinger, editor, Computer Graphics, Pattern Recognition and Data Structure, pages 89–93. IEEE, 1975.

[88] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Initial algebra semantics and continuous algebras. Journal of the Association for Computing Machinery, 24(1):68–95, January 1977.

[89] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Initial algebra semantics and continuous algebras. Journal of the Association for Computing Machinery, 24(1):68–95, January 1977. An early version is "Initial Algebra Semantics", by Joseph Goguen and James Thatcher, IBM T.J. Watson Research Center, Report RC 4865, May 1974.

[90] Joseph Goguen, Timothy Winkler, José Meseguer, Kokichi Futatsugi, and Jean-Pierre Jouannaud. Introducing OBJ. In Joseph Goguen and Grant Malcolm, editors, Software Engineering with OBJ: Algebraic Specification in Action, pages 3–167. Kluwer, 2000. Also Technical Report SRI-CSL-88-9, August 1988, SRI International.

[91] Robert Goldblatt. Topoi, the Categorial Analysis of Logic. North-Holland, 1979.

[92] Michael J.C. Gordon. The Denotational Description of Programming Languages. Springer, 1979.

[93] Michael J.C. Gordon. HOL: A machine oriented formulation of higher-order logic. Technical Report 85, University of Cambridge, Computer Laboratory, July 1985.

[94] Michael J.C. Gordon. Why higher-order logic is a good formalism for specifying and verifying hardware. In George Milne and P.A. Subrahmanyam, editors, Formal Aspects of VLSI Design. North-Holland, 1986.

[95] Michael J.C. Gordon, Robin Milner, and Christopher Wadsworth. Edinburgh LCF. Springer, 1979. Lecture Notes in Computer Science, Volume 78.

[96] Paul Halmos. Algebraic Logic.
Van Nostrand, 1962.

[97] Paul Halmos and Steven Givant. Logic as Algebra. Mathematical Association of America, 1998. Dolciani Expositions No. 21.

[98] Robert Harper, Furio Honsell, and Gordon Plotkin. A framework for defining logics. In Proceedings, Second Symposium on Logic in Computer Science, pages 194–204. IEEE Computer Society, 1987.

[99] Robert Harper, David MacQueen, and Robin Milner. Standard ML. Technical Report ECS-LFCS-86-2, Department of Computer Science, University of Edinburgh, 1986.

[100] Leon Henkin, Donald Monk, and Alfred Tarski. Cylindric Algebras. North Holland, 1971.

[101] Phillip J. Higgins. Algebras with a scheme of operators. Mathematische Nachrichten, 27:115–132, 1963.

[102] G. Higman and B.H. Neumann. Groups as groupoids with one law. Publ. Math. Debrecen, 2:215–221, 1952.

[103] J.R. Hindley. The Church-Rosser Property and a Result in Combinatory Logic. PhD thesis, University of Newcastle-upon-Tyne, 1964.

[104] Masako K. Hiraga. Diagrams and metaphors: Iconic aspects in language. Journal of Pragmatics, 22:5–21, 1994.

[105] S. Hölldobler, editor. Foundations of Equational Logic Programming. Springer, 1989. Lecture Notes in Artificial Intelligence, Volume 353.

[106] Jieh Hsiang. Refutational Theorem Proving using Term Rewriting Systems. PhD thesis, University of Illinois at Champaign-Urbana, 1981.

[107] Paul Hudak, Simon Peyton Jones, Philip Wadler, Arvind, et al. Report on the functional programming language Haskell. ACM SIGPLAN Notices, 27, May 1992. Version 1.2.

[108] Gérard Huet. Confluent reductions: Abstract properties and applications to term rewriting systems. Journal of the Association for Computing Machinery, 27(4):797–821, 1980. Preliminary version in Proceedings, 18th IEEE Symposium on Foundations of Computer Science, IEEE, 1977, pages 30–45.

[109] Gérard Huet and Derek Oppen. Equations and rewrite rules: A survey. In Ron Book, editor, Formal Language Theory: Perspectives and Open Problems, pages 349–405.
Academic, 1980.

[110] John Hughes. Abstract interpretations of first-order polymorphic functions. In Cordelia Hall, John Hughes, and John O'Donnell, editors, Proceedings of the 1988 Glasgow Workshop on Functional Programming, pages 68–86. Computing Science Department, University of Glasgow, 1989.

[111] Heinz Kaphengst and Horst Reichel. Initial algebraic semantics for non-context-free languages. In Marek Karpinski, editor, Fundamentals of Computation Theory, pages 120–126. Springer, 1977. Lecture Notes in Computer Science, Volume 56.

[112] Matt Kaufmann, Panagiotis Manolios, and J. Strother Moore. Computer-Aided Reasoning: An Approach. Kluwer, 2000.

[113] Claude Kirchner, Hélène Kirchner, and José Meseguer. Operational semantics of OBJ3. In T. Lepistö and Arto Salomaa, editors, Proceedings, 15th International Colloquium on Automata, Languages and Programming, pages 287–301. Springer, 1988. (Tampere, Finland, 11–15 July 1988.) Lecture Notes in Computer Science, Volume 317.

[114] Jan Willem Klop. Term rewriting systems: A tutorial. Bulletin of the European Association for Theoretical Computer Science, 32:143–182, June 1987.

[115] Jan Willem Klop. Term rewriting systems: from Church-Rosser to Knuth-Bendix and beyond. In Samson Abramsky, Dov Gabbay, and Tom Maibaum, editors, Handbook of Logic in Computer Science, pages 1–117. Oxford, 1992.

[116] Donald Knuth. Semantics of context-free languages. Mathematical Systems Theory, 2:127–145, 1968.

[117] Donald Knuth and Peter Bendix. Simple word problems in universal algebra. In J. Leech, editor, Computational Problems in Abstract Algebra. Pergamon, 1970.

[118] William Labov. The transformation of experience in narrative syntax. In Language in the Inner City, pages 354–396. University of Pennsylvania, 1972.

[119] Leslie Lamport. LaTeX User Guide and Reference Manual. Addison-Wesley, 1985.

[120] Peter Landin. A correspondence between ALGOL 60 and Church's lambda notation.
Communications of the Association for Computing Machinery, 8(2):89–101, 1965.

[121] F. William Lawvere. Functorial semantics of algebraic theories. Proceedings, National Academy of Sciences, U.S.A., 50:869–872, 1963. Summary of Ph.D. Thesis, Columbia University.

[122] Alexander Leitsch. The Resolution Calculus. Springer, 1997. Texts in Theoretical Computer Science.

[123] Charlotte Linde. The organization of discourse. In Timothy Shopen and Joseph M. Williams, editors, Style and Variables in English, pages 84–114. Winthrop, 1981.

[124] Charlotte Linde. Private stories in public discourse. Poetics, …

[125] … Notes on Logic. Van Nostrand, 1966. Mathematical Studies, Volume 6.

[126] Saunders Mac Lane. Categories for the Working Mathematician. Springer, 1971.

[127] Saunders Mac Lane. Abstract algebra uses homomorphisms. American Mathematical Monthly, 103(4):330–331, April 1996.

[128] Saunders Mac Lane and Garrett Birkhoff. Algebra. Macmillan, 1967.

[129] David MacQueen and Donald Sannella. Completeness of proof systems for equational specifications. IEEE Transactions on Software Engineering, SE-11(5):454–461, May 1985.

[130] Ernest Manes. Algebraic Theories. Springer, 1976. Graduate Texts in Mathematics, Volume 26.

[131] John McCarthy, Michael Levin, et al. LISP 1.5 Programmer's Manual. MIT, 1966.

[132] William McCune. Otter 3.0 Users Guide, 1995. Technical Report, Argonne National Laboratory.

[133] Elliott Mendelson. Introduction to Mathematical Logic. Academic, 1979. Second edition.

[134] José Meseguer. Conditional rewriting logic: Deduction, models and concurrency. In Stéphane Kaplan and Mitsuhiro Okada, editors, Conditional and Typed Rewriting Systems, pages 64–91. Springer, 1991. Lecture Notes in Computer Science, Volume 516.

[135] José Meseguer. Conditional rewriting logic as a unified model of concurrency. Theoretical Computer Science, 96(1):73–155, 1992.

[136] José Meseguer and Joseph Goguen. Deduction with many-sorted rewrite rules.
Technical Report CSLI-85-42, Center for the Study of Language and Information, Stanford University, December 1985.

[137] José Meseguer and Joseph Goguen. Initiality, induction and computability. In Maurice Nivat and John Reynolds, editors, Algebraic Methods in Semantics, pages 459–541. Cambridge, 1985.

[138] José Meseguer and Joseph Goguen. Order-sorted algebra solves the constructor selector, multiple representation and coercion problems. Information and Computation, 103(1):114–158, March 1993. Revision of a paper presented at LICS 1987.

[139] Donald Michie. 'Memo' functions and machine learning. Nature, …

[140] … Proceedings, 7th Symposium on Principles of Programming Languages. Association for Computing Machinery, 1980.

[141] Alan Mycroft. Abstract Interpretation and Optimising Transformations for Applicative Programs. PhD thesis, University of Edinburgh, 1981.

[142] Anil Nerode. Linear automaton transformations. Proceedings, American Math. Society, 9:541–544, 1958.

[143] M.H.A. Newman. On theories with a combinatorial definition of 'equivalence'. Annals of Mathematics, 43(2):223–243, 1942.

[144] Petr Novikov. On the algorithmic unsolvability of the word problem in group theory. Trudy Mat. Inst. Steklov, 44:143, 1955.

[145] Julian Orr. Narratives at work: Story telling as cooperative diagnostic activity. In Proceedings, Conference on Computer Supported Cooperative Work (SIGCHI). Association for Computing Machinery, 1986.

[146] David Parnas. Information distribution aspects of design methodology. Information Processing '72, 71:339–344, 1972. Proceedings of 1972 IFIP Congress.

[147] Lawrence Paulson. Logic and Computation: Interactive Proof with Cambridge LCF. Cambridge, 1987. Cambridge Tracts in Theoretical Computer Science, Volume 2.

[148] Lawrence C. Paulson. The foundation of a generic theorem prover. Technical Report 130, University of Cambridge, Computer Laboratory, March 1988.

[149] Charles Sanders Peirce. Collected Papers. Harvard, 1965.
In 6 volumes; see especially Volume 2: Elements of Logic.

[150] David Plaisted. An initial algebra semantics for error presentations. SRI International, Computer Science Laboratory, 1982.

[151] David Plaisted. Equational reasoning and term rewriting systems. In Dov Gabbay and Jörg Siekmann, editors, Handbook of Logic in AI and Logic Programming. Oxford, 1993.

[152] Axel Poigné. Parameterization for order-sorted algebraic specification. Journal of Computer and System Sciences, 40(3):229–268, 1990.

[153] Emil Post. Recursive unsolvability of a problem of Thue. J. Symbolic Logic, 12:1–11, 1947.

[154] Michael Rabin and Dana Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125, 1959.

[155] Helena Rasiowa and R. Sikorski. The Mathematics of Metamathematics. PAN (Warsaw), 1963.

[156] Brian Ritchie and Paul Taylor. The interactive proof editor: An experiment in interactive theorem proving. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 303–322. Springer, 1989.

[157] J. Alan Robinson. A machine-oriented logic based on the resolution principle. Journal of the Association for Computing Machinery, 12:23–41, 1965.

[158] Barry Rosen. Tree-manipulating systems and Church-Rosser theorems. Journal of the Association for Computing Machinery, pages 160–187, January 1973.

[159] A.B.C. Sampaio and Kamran Parsaye. The formal specification and testing of expanded hardware building blocks. In Proceedings, ACM Computer Science Conference. Association for Computing Machinery, 1981.

[160] M. Schönfinkel. Über die Bausteine der mathematischen Logik. Mathematische Annalen, 92:305–316, 1924. In From Frege to Gödel, Jean van Heijenoort (editor), Harvard, 1967, pages 355–366.

[161] Dana Scott. Lattice theory, data types and semantics. In Randall Rustin, editor, Formal Semantics of Algorithmic Languages, pages 65–106.
Prentice-Hall, 1972.

[162] Dana Scott and Christopher Strachey. Towards a mathematical semantics for computer languages. In Proceedings, 21st Symposium on Computers and Automata, pages 19–46. Polytechnic Institute of Brooklyn, 1971. Also Programming Research Group Technical Monograph PRG-6, Oxford.

[163] Joseph R. Shoenfield. Mathematical Logic. Addison-Wesley, 1967.

[164] Rob Shostak. Deciding combinations of theories. Journal of the ACM, 31(1):1–12, 1984.

[165] Gert Smolka, Werner Nutt, Joseph Goguen, and José Meseguer. Order-sorted equational computation. In Maurice Nivat and Hassan Aït-Kaci, editors, Resolution of Equations in Algebraic Structures, Volume 2: Rewriting Techniques, pages 299–367. Academic, 1989.

[166] S. Sridhar. An implementation of OBJ2: An object-oriented language for abstract program specification. In K.V. Nori, editor, Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science, pages 81–95. Springer, 1986. Lecture Notes in Computer Science, Volume 241.

[167] Victoria Stavridou. Specifying in OBJ, verifying in REVE, and some ideas about time. Technical report, Department of Computer Science, University of Manchester, 1987.

[168] Victoria Stavridou, Joseph Goguen, Steven Eker, and Serge Aloneftis. FUNNEL: A CHDL with formal semantics. In Proceedings, Advanced Research Workshop on Correct Hardware Design Methodologies, pages 117–144. IEEE, 1991.

[169] Victoria Stavridou, Joseph Goguen, Andrew Stevens, Steven Eker, Serge Aloneftis, and Keith Hobley. FUNNEL and 2OBJ: towards an integrated hardware design environment. In Theorem Provers in Circuit Design, volume IFIP Transactions, A-10, pages 197–223. North-Holland, 1992.

[170] Andrew Stevens and Joseph Goguen. Mechanised theorem proving with 2OBJ: A tutorial introduction. Technical report, Programming Research Group, University of Oxford, 1993.

[171] Mark Stickel. A Prolog technology theorem prover.
In First International Symposium on Logic Programming. Association for Computing Machinery, February 1984.

[172] Christopher Strachey. Towards a formal semantics. In Steel, editor, Formal Language Description Languages, pages 198–220. North-Holland, 1966.

[173] Christopher Strachey. Fundamental concepts in programming languages. Lecture Notes from International Summer School in Computer Programming, Copenhagen, 1967.

[174] Tanel Tammet. Gandalf. Journal of Automated Reasoning, 18(2):199–204, 1997.

[175] Alfred Tarski. The semantic conception of truth. Philos. Phenomenological Research, 4:13–47, 1944.

[176] James Thatcher, Eric Wagner, and Jesse Wright. Data type specification: Parameterization and the power of specification techniques. In Proceedings, Sixth Symposium on Principles of Programming Languages. Association for Computing Machinery, 1979. Also in TOPLAS 4, pages 711–732, 1982.

[177] Yoshihito Toyama. Counterexamples to termination for the direct sum of term rewriting systems. Information Processing Letters, …

[178] … Software – Practice and Experience, 9:31–49, 1979.

[179] David Turner. Miranda: A non-strict functional language with polymorphic types. In Jean-Pierre Jouannaud, editor, Functional Programming Languages and Computer Architectures, pages 1–16. Springer, 1985. Lecture Notes in Computer Science, Volume 201.

[180] Jeffrey Ullman. Elements of ML Programming. Prentice Hall, 1998.

[181] Ivo van Horebeck. Formal Specifications Based on Many-Sorted Initial Algebras and their Applications to Software Engineering. University of Leuven, 1988.

[182] Alfred North Whitehead. A Treatise on Universal Algebra, with Applications, I. Cambridge, 1898. Reprinted 1960.

[183] Glynn Winskel. Relating two models of hardware. In David Pitt, Axel Poigné, and David Rydeheard, editors, Proceedings, Second Summer Conference on Category Theory and Computer Science, pages 98–113. Laboratory for Computer Science, University of Edinburgh, 1987.
Editors' Notes

Notes for Chapter 1

E1. [Page 8] The sentence about no new theorems being proved by automated theorem provers is not accurate, because some new theorems in algebra were first proved by automatic first-order provers, like the Robbins conjecture, for example.

E2. [Page 10] The sentence about OBJ semantics can be confusing, because the expression "only because" might suggest that it happens "almost by chance."

E3. [Page 13] The author used to reset counters for each chapter, but the editors have decided to reset them for each section in order to improve readability, since some chapters are quite long.

Notes for Chapter 3

E4. [Page 41] After defining the concept of satisfaction in Definition 3.3.7, it would be convenient to add a discussion about algebras with empty sorts (M_s = ∅ for some sort s ∈ S) and the satisfaction of equations for such algebras: M ⊨ (∀X) t = t′ if there is no assignment a : X → M due to M_s = ∅. See also Section 4.3.2.

E5. [Page 43] The proof of Theorem 3.3.11 is a bit too fast, and it is obscured by using the same notation M for the algebra and its underlying S-sorted set (see Definition 2.5.1).

E6. [Page 44] After defining satisfaction of conditional equations in Definition 3.4.1, it would be helpful to comment that M ⊨ (∀X) t = t′ if C when M does not satisfy the conditions C.

E7. [Page 52] In the proof of Proposition 3.7.2, the second paragraph proves the contrapositive of the first paragraph instead of its converse. In fact, the converse implication does not hold. Consider, for example, the signature ({s, s1, s2}, {f : s → s1, f : s → s2}). There are no overloaded terms because there are no terms, but the signature is not regular.

E8.
[Page 53] In Definition 3.7.5 it seems that M_Σ denotes the annotated form when there is confusion and the non-annotated form when there is no confusion (as stated at the end, "As with terms in T_Σ, we will usually omit the sort annotation unless it is necessary"). However, it also seems that the author's intention is to denote by M_Σ the non-annotated form. Otherwise, Definition 3.7.7 and Exercise 3.7.2 don't make sense. For example, a a a (Exercise 3.7.2) doesn't have five distinct parses, because it should be written either (a(a a.A).A).A or (a(a.A a).A).A or ((a a.A).A a).A or ((a.A a).A a).A or (a a.A a).A. Suggestion: just as there are annotated and non-annotated notations for terms, there could be two notations for M_Σ, corresponding to the non-annotated and annotated forms, respectively.

Notes for Chapter 4

E9. [Page 65] This section on explicit quantification has as title "The Need for Quantifiers." This can be a bit confusing, because what is really necessary is not the quantifiers per se, but the explicit annotation of the set of variables involved in an equation, which instead of being defined as a pair ⟨t, t′⟩ is now defined as a triple ⟨X, t, t′⟩. In books like Johnstone's "Notes on Logic and Set Theory" and Lambek & Scott's "Introduction to Higher-Order Categorical Logic", other notations not involving quantifiers are used for this, such as t =_X t′. It is interesting to compare with the discussion on page 249 about Horn clause symbols.

E10. [Page 72] In Exercise 4.5.5, it seems necessary to add the hypothesis that A ≠ ∅.

E11. [Page 86] In Definition 4.10.6, if Σ′ contains symbols from X, then the translation of (∀X) t = t′ along ϕ is not correctly defined.
This is a serious issue which has been considered in Diaconescu's book "Institution-Independent Model Theory": variables are triples of the form (x, s, Σ), where x is the name of the variable, s is the sort of the variable, and Σ is the signature for which the variable is considered. Then we have ϕ(x, s, Σ) = (x, ϕ(s), Σ′). In this way, variable name clashes are successfully avoided.

Notes for Chapter 5

E12. [Page 107] The second sentence in the statement of Proposition 5.3.4 is not true: if (Σ, A) is ground terminating, then (Σ(X), A) may fail to be ground terminating; consider for example the rule f(Z) → f(f(Z)) in Example 5.3.2. The second sentence should be read as "(Σ, A) is ground terminating if (Σ(X), A) is ground terminating," for the author writes so in the new statement of Proposition 5.3.4 on page 381.

E13. [Page 112] The explanatory paragraph after Definition 5.5.3 is confusing because it does not correspond to the definition statement. Instead, the second definition says that σ preserves the decrease of the weight, and the third definition says that all operations preserve weight decreasingness.

E14. [Page 118] After Theorem 5.6.4 (Orthogonality), an example showing why the left-linear condition is needed would help the reader (in the same way that Exercise 5.6.2 shows why the lapse free condition is needed). The following TRS is non-overlapping and lapse free but not Church-Rosser because it is not left linear: f(x, x) → a, f(x, g(x)) → b, c → g(c).

E15. [Page 126] Proposition 5.3.4 reappears on this page, so note E12 on page 107 also applies here.

E16. [Page 128] In the first paragraph after Definition 5.8.2, it seems unclear whether R_{k−1} should instead be R_k.

E17. [Page 131] Proposition 5.8.10 is the conditional generalization of Proposition 5.3.4 (on pages 107 and 126), and the same comment applies: if (Σ, A) is ground terminating then (Σ(X), A) may fail to be ground terminating. See notes E12 and E15 above.

E18.
[Page 147] It would be necessary to check whether this argument is right.

E19. [Page 152] The comment before Definition 5.9.1 about adding "little more information" seems inconsistent because the definition does not take that into account.

E20. [Page 156] This dangling reference probably corresponds to a corollary which may have been removed.

Notes for Chapter 6

E21. [Page 165] In the proof of the initiality Theorem 6.1.15 there is an important gap, because the author does not prove T_{Σ,A} ⊨ A, and so we don't know whether T_{Σ,A} is a (Σ, A)-algebra or not. Compare with note E42 for Theorem 9.1.10 on page 312.

E22. [Page 174] Exercise 6.4.1 coincides with Exercise 3.1.1 in Section 3.1.

E23. [Page 177] Example 6.5.7 on commutativity of addition could be used to show how lemmas come up in a proof attempt.

E24. [Page 178] Exercise 6.5.1 coincides with lemma0 in the previous Example 6.5.7.

E25. [Page 180] The protecting notion (pr NAT in Exercise 6.5.5) has not been explained before. The notion is quickly explained later in Example 7.3.11 (page 197).

Notes for Chapter 7

E26. [Page 195] In the proof of Theorem 7.3.9, it seems that there is a small gap in the proof of N ⊨ A ∪ B. Since [[_]]_B : T_Σ → N is a surjection, we should consider b : X → T_Σ such that b ; [[_]]_B = a. (Commutative diagram relating X, T_Σ, T_Σ(X), and N omitted.) Since a = b ; [[_]]_B and there exists a unique homomorphism T_Σ(X) → N extending a, we have b̄ ; [[_]]_B = ā. Now applying the given rule to t with substitution b : X → T_Σ gives b(t) ⇒_{A/B} b(t′), so these two terms have the same canonical form, i.e., [[b(t)]]_B = [[b(t′)]]_B and thus a(t) = a(t′).

E27. [Page 201] Here it is stated that the protecting notion was defined in Chapter 6, but this is not really the case, as pointed out about Exercise 6.5.5 on page 180 (see note E25 above). The notion is quickly introduced (in Chapter 7) in Example 7.3.11 (page 197).

E28.
[Page 203] Corollary 7.3.19 deserves a more detailed proof.

E29. [Page 211] For the statement of Proposition 7.4.3, it needs to be explained how the triangular system T becomes a first-order formula.

E30. [Page 216] As above (see note E29), for the statement of Proposition 7.4.10, it needs to be explained how the conditional triangular system T becomes a first-order formula.

E31. [Page 241] Lemma 7.7.16 and Proposition 7.7.17 refer to weak rewriting modulo, which has not been defined in the conditional case.

Notes for Chapter 8

E32. [Page 257] In Definition 8.3.2 the author should denote by â : WFF_X(Φ) → B the extension of a : X → M to well-formed formulae, because he denoted by a : T_Σ → M the extension of a to terms. As we may notice later, in the definition of substitution (Definition 8.3.16 on page 266), he denotes by θ̄ : T_Σ(X) → T_Σ(X) the extension of θ : X → T_Σ(X) to terms, and by θ̂ : WFF_X(Φ) → WFF_X(Φ) the extension of θ to (Φ, X)-formulae. So there seems to be an inconsistency in the notations.

E33. [Page 257] Since formulas and their meaning are defined with respect to a set X of variables, for satisfaction M ⊨_Φ P to be well defined (in Definition 8.3.2), independence from X must be proved. However, the definition of satisfaction is clearly ambiguous because it depends on the set of variables X. Assume, for example, a Φ-model M such that M_s = ∅ for some sort s ∈ S, and [[P]]_{M,∅} = ∅, i.e., M ⊭_Φ P, for some sentence (closed formula) P ∈ WFF_∅(Φ). Let X = {x : s} be the variable set with just one element x of sort s. We have P ∈ WFF_X(Φ) and [X → M] = ∅ = [[P]]_{M,X}. Hence M ⊨_Φ P, which is a contradiction with our initial assumption M ⊭_Φ P. This satisfaction relation should be indexed not only by the signature Φ but also by the set of variables X. For example, the notation ⊨_Φ^X clears up the above ambiguity.

E34.
[Page 257] The concept of institution, used at the end of Definition 8.3.2, has only been mentioned previously in footnote 1 on page 250.

E35. [Page 265] In the statement of Proposition 8.3.15 it could be added that, if P does not contain variables with empty sort, then the nonempty assumption on the carriers of M may be omitted.

E36. [Page 269] In the proof of Corollary 8.3.23, Theorem 8.3.22 is applied to Q = (∀Y) P, which requires that θ is capture free for Q.

E37. [Page 288] The hypothesis Bound(P) ∩ X = ∅ must be added to the statement of Proposition 8.5.1; otherwise, it could be possible for θ not to be capture free for P, and then Lemma 8.3.25 could not be applied. Furthermore, the notation for the function y : M_{s1,...,sn} → M_s should be different from the variable y : s. The reason is that θ has source X ∪ {y} and target T_{Σ(Y)}(X ∪ {y}). Thus, the target contains y regarded both as a variable and as an operation symbol.

E38. [Page 289] Proposition 8.3.15 is used in the proof of Proposition 8.5.2 without checking the nonempty carriers condition.

E39. [Page 295] The notation for assignment application to a formula seems different in the proof of Proposition 8.7.1.

E40. [Page 295] In the paragraph after Proposition 8.7.1, θ_m is not a substitution, but an assignment in general. Then θ_m(P) is not a sentence.

E41. [Page 303] The concept "anarchic" used in Exercise 8.7.4 has not been defined.

Notes for Chapter 9

E42. [Page 312] In the proof of Theorem 9.1.10, the author does not prove T_{Σ,A} ⊨ A. Hence, we don't know whether T_{Σ,A} is a (Σ, A)-algebra or not. Compare with note E21 for Theorem 6.1.15 on page 165.

E43. [Page 313] Here the author invokes Definition 8.3.2, meaning that we have to deal here with the same problems as in Chapter 8 (see above notes E32–E33 about this definition).

Notes for Chapter 10

E44. [Page 326] It seems that the conclusions of Theorem 10.2.9 and Proposition 10.2.10 coincide.
However, Theorem 10.2.9 requires signatures to be regular while Proposition 10.2.10 does not. This is rather strange even with the additional explanations.

E45. [Page 329] The calligraphic notation for algebras in Proposition 10.2.18 and the paragraphs above it has not been used before.

E46. [Page 336] In Example 10.4.7, a function op f : A -> A should be added, in order to point out that [f(a)] = {f(a), f(b)} and f(c) ∉ [f(a)].

E47. [Page 345] In the paragraph before Theorem 10.6.6, the statement "for arbitrary A, it is necessary and sufficient that Σ has no quasi-empty models" needs a proof.

E48. [Page 391] In the proof of Theorem 10.2.8, the restriction condition in fact seems to correspond to what is called the monotonicity condition in Definition 10.1.5.

E49. [Page 393] The star notation used in the proof of Theorem 10.4.11 (Initiality) has not been used before. Instead, the overbar notation was preferred.
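The non-left-linear counterexample given in note E14 is easy to check mechanically. The following Python sketch (the tuple encoding of terms and the function name are ours, not the book's) applies the three rules f(x, x) → a, f(x, g(x)) → b, c → g(c) to the term f(c, c) and exhibits two distinct normal forms, confirming that the system is not Church-Rosser:

```python
# Sketch of note E14's counterexample: the TRS
#   f(x, x)    -> a
#   f(x, g(x)) -> b
#   c          -> g(c)
# is non-overlapping and lapse free but not left linear,
# and f(c, c) has the two distinct normal forms a and b.

# Terms are nested tuples: ('f', t1, t2), ('g', t), ('c',), ('a',), ('b',).

def rewrite_top(t):
    """Try each rule at the root of t; return the reduct, or None if none applies."""
    if t[0] == 'f' and t[1] == t[2]:
        return ('a',)                              # f(x, x) -> a
    if t[0] == 'f' and t[2][0] == 'g' and t[2][1] == t[1]:
        return ('b',)                              # f(x, g(x)) -> b
    if t == ('c',):
        return ('g', ('c',))                       # c -> g(c)
    return None

start = ('f', ('c',), ('c',))                      # the term f(c, c)

# Reduction 1: apply f(x, x) -> a directly at the root.
left = rewrite_top(start)

# Reduction 2: first rewrite the second argument c -> g(c) (a non-root step),
# then apply f(x, g(x)) -> b at the root.
mid = ('f', start[1], rewrite_top(start[2]))       # the term f(c, g(c))
right = rewrite_top(mid)

print(left, right)   # ('a',) ('b',) -- two distinct normal forms
```

Since no rule applies to either a or b, both are normal forms, so f(c, c) has no unique normal form and confluence fails, exactly as the note claims.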