A Type and Scope Safe Universe of Syntaxes with Binding: Their Semantics and Proofs


arXiv [cs.PL]. ZU064-05-FPR jfp19 30 January 2020 2:24

Under consideration for publication in J. Functional Programming

GUILLAUME ALLAIS, ROBERT ATKEY
University of Strathclyde (UK)

JAMES CHAPMAN
Input Output HK Ltd. (HK)

CONOR MCBRIDE
University of Strathclyde (UK)

JAMES MCKINNA
University of Edinburgh (UK)

Abstract

Almost every programming language's syntax includes a notion of binder and corresponding bound occurrences, along with the accompanying notions of α-equivalence, capture-avoiding substitution, typing contexts, runtime environments, and so on. In the past, implementing and reasoning about programming languages required careful handling to maintain the correct behaviour of bound variables. Modern programming languages include features that enable constraints like scope safety to be expressed in types. Nevertheless, the programmer is still forced to write the same boilerplate over again for each new implementation of a scope safe operation (e.g., renaming, substitution, desugaring, printing, etc.), and then again for correctness proofs.

We present an expressive universe of syntaxes with binding and demonstrate how to (1) implement scope safe traversals once and for all by generic programming; and (2) how to derive properties of these traversals by generic proving. Our universe description, generic traversals and proofs, and our examples have all been formalised in Agda and are available in the accompanying material available online at https://github.com/gallais/generic-syntax.

In modern typed programming languages, programmers writing embedded DSLs (Hudak (1996)) and researchers formalising them can now use the host language's type system to help them. Using Generalised Algebraic Data Types (GADTs) or the more general indexed families of Type Theory (Dybjer (1994)) to represent syntax, programmers can statically enforce some of the invariants in their languages. For example, managing variable scope is a popular use case in LEGO, Idris, Coq, Agda and Haskell (Altenkirch & Reus (1999); Brady & Hammond (2006); Hirschowitz & Maggesi (2012); Keuchel & Jeuring (2012); Bach Poulsen et al. (2018); Wadler & Kokke (2018); Eisenberg (2018)) as directly manipulating raw de Bruijn indices is notoriously error-prone.
Solutions have been proposed that range from enforcing well scopedness of variables to ensuring full type correctness. In short, these techniques use the host languages' types to ensure that "illegal states are unrepresentable", where illegal states correspond to ill scoped or ill typed terms in the object language.

(This paper is typeset in colour.)

Despite the large body of knowledge in how to use types to define well formed syntax (see the related work in Section 10), it is still necessary for the working DSL designer or formaliser to redefine essential functions like renaming and substitution for each new syntax, and then to reprove essential lemmas about those functions. To reduce the burden of such repeated work and boilerplate, in this paper we apply the methodology of datatype-genericity to programming and proving with syntaxes with binding.

To motivate our approach, let us look at the formalisation of an apparently straightforward program transformation: the inlining of let-bound variables by substitution together with a soundness lemma proving that reductions in the source language can be simulated by reductions in the target one. There are two languages: the source (S), which has let-bindings, and the target (T), which only differs in that it does not:

    S ::= x | S S | λx. S | let x = S in S
    T ::= x | T T | λx. T

Breaking the task down, an implementer needs to define an operational semantics for each language, define the program transformation itself, and prove a correctness lemma that states each step in the source language is simulated by zero or more steps of the transformed terms in the target language. In the course of doing this, they will discover that there is actually a large amount of work:

1. To define the operational semantics, one needs to define substitution, and hence renaming. This needs to be done separately for both the source and target languages, even though they are very similar;
2. In the course of proving the correctness lemma, one needs to prove eight lemmas about the interactions of renaming, substitution, and transformation that are all remarkably similar, but must be stated and proved separately (e.g., as observed by Benton, Hur, Kennedy and McBride (2012)).

Even after doing all of this work, they have only a result for a single pair of source and target languages. If they were to change their languages S or T, they would have to repeat the same work all over again (or at least do a lot of cutting, pasting, and editing).

The main contribution of this paper is this: using the universe of syntaxes with binding we present in this paper, we are able to solve this repetition problem once and for all.

Content and Contributions.

To introduce the basic ideas that this paper builds on, we start with primers on scoped and sorted terms (Section 2), scope and sort safe programs acting on them (Section 3), and programmable descriptions of data types (Section 4). These introductory sections help us build an understanding of the problem at hand as well as a toolkit that leads us to the novel content of this paper: a universe of scope safe syntaxes with binding (Section 5) together with a notion of scope safe semantics for these syntaxes (Section 6). This gives us the opportunity to write generic implementations of renaming and substitution (Section 6.2), a generic let-binding removal transformation (generalising the problem stated above) (Section 7.5), and normalisation by evaluation (Section 7.7).


Further, we show how to construct generic proofs by formally describing what it means for one semantics to simulate another (Section 9.2), or for two semantics to be fusible (Section 9.3). This allows us to prove the lemmas required above for renaming, substitution, and desugaring of let binders generically, for every syntax in our universe.

Our implementation language is Agda (Norell (2009)). However, our techniques are language independent: any dependently typed language at least as powerful as Martin-Löf Type Theory (Martin-Löf (1982)) equipped with inductive families (Dybjer (1994)) such as Coq (The Coq Development Team (2017)), Lean (de Moura et al. (2015)) or Idris (Brady (2013)) ought to do.

Changes with respect to the ICFP 2018 version

This paper is a revised and expanded version of a paper of the same title that appeared at ICFP 2018. This extended version of the paper includes many more examples of the use of our universe of syntax with binding for writing generic programs in Section 7: pretty printing with human readable names (Section 7.1), scope checking (Section 7.2), type checking (Section 7.3), elaboration (Section 7.4), inlining of single use let-bound expressions (shrinking reductions) (Section 7.6), and normalisation by evaluation (Section 7.7). We have also included a discussion of how to define generic programs for deciding equality of terms. Additionally, we have elaborated our descriptions and examples throughout, and expanded our discussion of related work in Section 10.

A reasonable way to represent the abstract syntax of the untyped λ-calculus in a typed functional programming language is to use an inductive type:

    data Lam : Set where
      ‘var : ℕ → Lam
      ‘lam : Lam → Lam
      ‘app : Lam → Lam → Lam

We have used de Bruijn (1972) indices to represent variables by the number of ‘lam binders one has to pass up through to reach the binding occurrence. The de Bruijn representation has the advantage that terms are automatically represented up to α-equivalence. If the index goes beyond the number of binders enclosing it, then we assume that it is referring to some context, left implicit in this representation.

This representation works well enough for writing programs, but the programmer must constantly be vigilant to guard against the accidental construction of ill scoped terms. The implicit context that accompanies each represented term is prone to being forgotten or muddled with another, leading to confusing behaviour when variables either have dangling pointers or point to the wrong thing.

To improve on this situation, previous authors have proposed to use the host language's type system to make the implicit context explicit, and to enforce well scopedness of variables. Scope safe terms follow the discipline that every variable is either bound by some binder or is explicitly accounted for in a context. Bellegarde and Hook (1994), Bird and Patterson (1999), and Altenkirch and Reus (1999) introduced the classic presentation of scope safety using inductive families (Dybjer (1994)) instead of plain inductive types to represent abstract syntax. Indeed, using a family indexed by a Set, we can track scoping information at the type level. The empty Set represents the empty scope. The type constructor 1 + (_) extends the running scope with an extra variable.

    data Lam : Set → Set where
      ‘var : X → Lam X
      ‘lam : Lam (1 + X) → Lam X
      ‘app : Lam X → Lam X → Lam X

Implicit generalisation of variables in Agda

The careful reader may have noticed that we use a seemingly out-of-scope variable X of type Set. The latest version of Agda allows us to declare variables that the system should implicitly quantify over if it happens to find them used in types. This allows us to lighten the presentation by omitting a large number of prenex quantifiers. The reader will hopefully be familiar enough with ML-style polymorphic types that this will seem natural to them.

The Lam type is now a family of types, indexed by the set of variables in scope. Thus, the context for each represented term has been made visible to the type system, and the types enforce that only variables that have been explicitly declared can be referenced in the ‘var constructor. We have made illegal terms unrepresentable.

Since Lam is defined to be a function Set → Set, it makes sense to ask whether it is also a functor and a monad. Indeed it is, as Altenkirch and Reus showed. The functorial action corresponds to renaming, the monadic 'return' corresponds to the use of variables (the ‘var constructor), and the monadic 'bind' corresponds to substitution. The functor and monad laws correspond to well known properties from the equational theories of renaming and substitution. We will revisit these properties, for our whole universe of syntax with binding, in Section 9.3.
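The correspondence between the functor and monad structure of Lam and renaming and substitution can be sketched in a few lines of Agda. This is a minimal illustration rather than the paper's development: Maybe is assumed here as a stand-in for the type constructor 1 + (_), Lam is declared with a (non-uniform) parameter instead of an index, and the names mapMaybe, rename, lift and subst are hypothetical.

```agda
module ScopedLamSketch where

-- Assumption: Maybe plays the role of 1 + (_); `nothing` is the new variable.
data Maybe (X : Set) : Set where
  nothing : Maybe X
  just    : X → Maybe X

-- Scope safe untyped λ-terms over a set X of free variables.
data Lam (X : Set) : Set where
  ‘var : X → Lam X
  ‘lam : Lam (Maybe X) → Lam X
  ‘app : Lam X → Lam X → Lam X

mapMaybe : {X Y : Set} → (X → Y) → Maybe X → Maybe Y
mapMaybe f nothing  = nothing
mapMaybe f (just x) = just (f x)

-- Functorial action: renaming of free variables.
rename : {X Y : Set} → (X → Y) → Lam X → Lam Y
rename f (‘var x)   = ‘var (f x)
rename f (‘lam b)   = ‘lam (rename (mapMaybe f) b)
rename f (‘app g t) = ‘app (rename f g) (rename f t)

-- Pushing a substitution under a binder: the bound variable is kept,
-- every other image is weakened by renaming with `just`.
lift : {X Y : Set} → (X → Lam Y) → Maybe X → Lam (Maybe Y)
lift σ nothing  = ‘var nothing
lift σ (just x) = rename just (σ x)

-- Monadic bind (with its arguments flipped): substitution.
-- The monadic return is the ‘var constructor.
subst : {X Y : Set} → (X → Lam Y) → Lam X → Lam Y
subst σ (‘var x)   = σ x
subst σ (‘lam b)   = ‘lam (subst (lift σ) b)
subst σ (‘app g t) = ‘app (subst σ g) (subst σ t)
```

Note that subst needs rename in order to go under a binder, the same dependency of substitution on renaming that reappears later in the Semantics framework.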

A Typed Variant of Altenkirch and Reus’ Calculus

There is no reason to restrict this technique to inductive families indexed by Set. The more general case of inductive families in Set^J can be endowed with similar functorial and monadic operations by using Altenkirch, Chapman and Uustalu's relative monads (2015b; 2014).

We pick as our index type J the category whose objects are inhabitants of List I (I is a parameter of the construction) and whose morphisms are thinnings (permutations that may forget elements, see Section 7). Values of type List I are intended to represent the list of the sorts (or kinds, or types, depending on the application) of the de Bruijn variables in scope. We can recover an unsorted approach by picking I to be the unit type. Given this sorted setting, our functors take an extra I argument corresponding to the sort of the expression being built. This is captured by the large type I −Scoped:

    _−Scoped : Set → Set₁
    I −Scoped = I → List I → Set

We use Agda's mixfix operator notation where underscores denote argument positions.

To lighten the presentation, we exploit the observation that the current scope is either passed unchanged to subterms (e.g. in the application case) or extended (e.g. in the λ-abstraction case) by introducing combinators to build indexed types. We conform to the convention (see e.g. Martin-Löf (1982)) of mentioning only context extensions when presenting judgements. That is to say that we aim to write sequents with an implicit ambient context. Concretely: we would rather use the rule appᵢ than appₑ as the inference rule for application in STLC.

    f : σ → τ    t : σ
    ------------------- appᵢ
          f t : τ

    Γ ⊢ f : σ → τ    Γ ⊢ t : σ
    --------------------------- appₑ
           Γ ⊢ f t : τ

In this discipline, the turnstile is used in rules which are binding fresh variables. It separates the extension applied to the ambient context on its left and the judgment that lives in the thus extended context on its right. Concretely: we would rather use the rule lamᵢ than lamₑ as the inference rule for λ-abstraction in STLC.

    x : σ ⊢ b : τ
    ----------------- lamᵢ
    λx. b : σ → τ

    Γ, x : σ ⊢ b : τ
    --------------------- lamₑ
    Γ ⊢ λx. b : σ → τ

This observation that an ambient context is either passed around as is or extended for subterms is critical to our whole approach to syntax with binding, and will arise again in our generic formulation of syntax traversals in Section 6.

    _⇒_ : (P Q : A → Set) → (A → Set)
    (P ⇒ Q) x = P x → Q x

    _⊢_ : (A → B) → (B → Set) → (A → Set)
    (f ⊢ P) x = P (f x)

    const : Set → (A → Set)
    const P x = P

    ∀[_] : (A → Set) → Set
    ∀[_] P = ∀ {x} → P x

Fig. 1. Combinators to build indexed Sets

We lift the function space pointwise with _⇒_, silently threading the underlying scope. The _⊢_ makes explicit the adjustment made to the index by a function, a generalisation of the idea of extension. We write f ⊢ T where f is the adjustment and T the indexed Set it operates on. Although it may seem surprising at first to define binary infix operators as having arity three, they are meant to be used partially applied, surrounded by ∀[_] which turns an indexed Set into a Set by implicitly quantifying over the index. Lastly, const is the constant combinator, which ignores the index.

We make _⇒_ associate to the right as one would expect and give it the highest precedence level as it is the most used combinator. These combinators lead to more readable type declarations. For instance, the compact expression ∀[ (const P ⇒ s ⊢ Q) ⇒ R ] desugars to the more verbose type ∀ {i} → (P → Q (s i)) → R i.

As the context argument comes second in the definition of _−Scoped, we can readily use these combinators to thread, modify, or quantify over the scope when defining such families, as for example in Figure 2.

The inductive family Var represents well scoped and well sorted de Bruijn indices. Its z (for zero) constructor refers to the nearest binder in a non-empty scope. The s (for successor) constructor lifts a variable in a given scope to the extended scope where an extra variable has been bound. Both of the constructors' types have been written using the combinators defined above.

    data Var : I −Scoped where
      z : ∀[ (σ ∷_) ⊢ Var σ ]
      s : ∀[ Var σ ⇒ (τ ∷_) ⊢ Var σ ]

Fig. 2. Scope and Kind Aware de Bruijn Indices

They respectively normalise to:

    z : ∀ {σ Γ} → Var σ (σ ∷ Γ)
    s : ∀ {σ τ Γ} → Var σ Γ → Var σ (τ ∷ Γ)

We will reuse the Var family to represent variables in all the syntaxes defined in this paper.

    data Type : Set where
      α    : Type
      _‘→_ : Type → Type → Type

    data Lam : Type −Scoped where
      ‘var : ∀[ Var σ ⇒ Lam σ ]
      ‘app : ∀[ Lam (σ ‘→ τ) ⇒ Lam σ ⇒ Lam τ ]
      ‘lam : ∀[ (σ ∷_) ⊢ Lam τ ⇒ Lam (σ ‘→ τ) ]

Fig. 3. Simple Types and Intrinsically Typed definition of STLC

The Type −Scoped family Lam is Altenkirch and Reus' intrinsically typed representation of the simply typed λ-calculus, where Type is the Agda type of simple types. We can readily write well scoped-and-typed terms such as e.g. application, a closed term of type ((σ ‘→ τ) ‘→ (σ ‘→ τ)) ({- and -} delimit comments meant to help the reader see which binder the de Bruijn indices are referring to):

    apply : Lam ((σ ‘→ τ) ‘→ (σ ‘→ τ)) []
    apply = ‘lam {- f -} (‘lam {- x -} (‘app (‘var (s z) {- f -}) (‘var z {- x -})))

The scope- and type-safe representation described in the previous section is naturally only a start. Once the programmer has access to a good representation of the language they are interested in, they will want to write programs manipulating terms. Renaming and substitution are the two typical examples that are required for almost all syntaxes. Now that well typedness and well scopedness are enforced statically, all of these traversals have to be implemented in a type and scope safe manner. These constraints show up in the types of renaming and substitution defined in Figure 4.

We have intentionally hidden technical details behind some auxiliary definitions left abstract here: var and extend. Their implementations are distinct for ren and sub but they serve the same purpose: var is used to turn a value looked up in the evaluation environment into a term and extend is used to alter the environment when going under a binder. This presentation highlights the common structure between ren and sub which we will exploit later in this section, particularly in Section 3.2 where we define an abstract notion of semantics and the corresponding generic traversal.


    ren : (Γ −Env) Var Δ → Lam σ Γ → Lam σ Δ
    ren ρ (‘var k)   = varᵣ (lookup ρ k)
    ren ρ (‘app f t) = ‘app (ren ρ f) (ren ρ t)
    ren ρ (‘lam b)   = ‘lam (ren (extendᵣ ρ) b)

    sub : (Γ −Env) Lam Δ → Lam σ Γ → Lam σ Δ
    sub ρ (‘var k)   = varₛ (lookup ρ k)
    sub ρ (‘app f t) = ‘app (sub ρ f) (sub ρ t)
    sub ρ (‘lam b)   = ‘lam (sub (extendₛ ρ) b)

Fig. 4. Type and Scope Preserving Renaming and Substitution
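To make the auxiliary definitions left abstract in Figure 4 concrete, here is one possible self-contained rendering. It is a sketch under simplifying assumptions and not the paper's code: environments are plain functions taking the sort explicitly (the paper uses a record, introduced next), the List type is redefined locally to avoid any library dependency, and the names Env, extendᵣ and extendₛ are chosen for this illustration. In this version var for renaming is simply the ‘var constructor and var for substitution is the identity, so both are inlined.

```agda
module RenSubSketch where

infixr 5 _∷_
data List (A : Set) : Set where
  []  : List A
  _∷_ : A → List A → List A

infixr 6 _‘→_
data Type : Set where
  α    : Type
  _‘→_ : Type → Type → Type

data Var : Type → List Type → Set where
  z : ∀ {σ Γ} → Var σ (σ ∷ Γ)
  s : ∀ {σ τ Γ} → Var σ Γ → Var σ (τ ∷ Γ)

data Lam : Type → List Type → Set where
  ‘var : ∀ {σ Γ} → Var σ Γ → Lam σ Γ
  ‘app : ∀ {σ τ Γ} → Lam (σ ‘→ τ) Γ → Lam σ Γ → Lam τ Γ
  ‘lam : ∀ {σ τ Γ} → Lam τ (σ ∷ Γ) → Lam (σ ‘→ τ) Γ

-- Assumption: an environment is a function from variables in Γ to values in Δ.
Env : List Type → (Type → List Type → Set) → List Type → Set
Env Γ V Δ = ∀ σ → Var σ Γ → V σ Δ

-- Under a binder the new variable maps to itself; the others are shifted.
extendᵣ : ∀ {Γ Δ τ} → Env Γ Var Δ → Env (τ ∷ Γ) Var (τ ∷ Δ)
extendᵣ ρ _ z     = z
extendᵣ ρ _ (s k) = s (ρ _ k)

ren : ∀ {Γ Δ σ} → Env Γ Var Δ → Lam σ Γ → Lam σ Δ
ren ρ (‘var k)   = ‘var (ρ _ k)
ren ρ (‘app f t) = ‘app (ren ρ f) (ren ρ t)
ren ρ (‘lam b)   = ‘lam (ren (extendᵣ ρ) b)

-- For substitution, the terms already in the environment must be weakened,
-- which is done by renaming every variable with s.
extendₛ : ∀ {Γ Δ τ} → Env Γ Lam Δ → Env (τ ∷ Γ) Lam (τ ∷ Δ)
extendₛ ρ _ z     = ‘var z
extendₛ ρ _ (s k) = ren (λ _ → s) (ρ _ k)

sub : ∀ {Γ Δ σ} → Env Γ Lam Δ → Lam σ Γ → Lam σ Δ
sub ρ (‘var k)   = ρ _ k
sub ρ (‘app f t) = ‘app (sub ρ f) (sub ρ t)
sub ρ (‘lam b)   = ‘lam (sub (extendₛ ρ) b)
```

The two traversals differ only in what is stored in the environment and in how the environment is extended, which is the shared structure the text sets out to abstract over.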

Both renaming and substitution are defined in terms of environments. We typically call Γ-environment an environment that associates values to each variable in Γ. This informs our notation choice: we write ((Γ −Env) V Δ) for an environment that associates a value V (variables for renaming, terms for substitution) well scoped and typed in Δ to every entry in Γ. Formally, we have the following record structure (using a record helps Agda's type inference reconstruct the type family V of values for us):

    record _−Env (Γ : List I) (V : I −Scoped) (Δ : List I) : Set where
      constructor pack
      field lookup : Var i Γ → V i Δ

Fig. 5. Well Typed and Scoped Environments of Values

Record syntax in Agda

As with (all) other record structures defined in this paper, we are able to profit from Agda's copattern syntax, as introduced in (Abel et al. (2013)) and showcased in (Thibodeau et al. (2016)). That is, when defining an environment ρ, we may either use the constructor pack, packaging a function r as an environment ρ = pack r, or else define ρ in terms of the underlying function obtained from it by projecting out the (in this case, unique) lookup field, as lookup ρ = r. Examples of definition in this style are given in Figure 6 below, and throughout the rest of the paper. A value of a record type with more than one field requires each of its fields to be given, either by a named constructor (or else Agda's default record syntax), or in copattern style. By analogy with record/object syntax in other languages, Agda further supports 'dot' notation, so that an equivalent definition here could be expressed as ρ .lookup = r.

We can readily define some basic building blocks for environments in Figure 6. The empty environment (ε) is implemented by remarking that there can be no variable of type (Var σ []) and to correspondingly dismiss the case with the impossible pattern (). The function _•_ extends an existing Γ-environment with a new value of type σ thus returning a (σ ∷ Γ)-environment. We also include the definition of _<$>_, which lifts in a pointwise manner a function acting on values into a function acting on environment of such values.

As we have already observed, the definitions of renaming and substitution have very similar structure. Abstracting away this shared structure would allow for these definitions to be refactored, and their common properties to be proved in one swift move.

Previous efforts in dependently typed programming (Benton et al. (2012); Allais et al.
(2017)) have achieved this goal and refactored renaming and substitution, but also normalisation by evaluation, printing with names or CPS conversion as various instances of a more general traversal. As we will show in Section 7.3, typechecking in the style of Atkey (2015) also fits in that framework. To make sense of this body of work, we need to introduce three new notions: Thinning, a generalisation of renaming; Thinnables, which are types that permit thinning; and the □ functor, which freely adds Thinnability to any indexed type. We use □, and our compact notation for the indexed function space between indexed types, to crisply encapsulate the additional quantification over environment extensions which is typical of Kripke semantics.

    ε : ([] −Env) V Δ
    lookup ε ()

    _•_ : (Γ −Env) V Δ → V σ Δ → ((σ ∷ Γ) −Env) V Δ
    lookup (ρ • v) z     = v
    lookup (ρ • v) (s k) = lookup ρ k

    _<$>_ : (∀ {i} → V i Δ → W i Θ) → (Γ −Env) V Δ → (Γ −Env) W Θ
    lookup (f <$> ρ) k = f (lookup ρ k)

Fig. 6. Combinators to Build Environments

    Thinning : List I → List I → Set
    Thinning Γ Δ = (Γ −Env) Var Δ

Fig. 7. Thinnings: A Special Case of Environments

The Special Case of Thinnings

Thinnings subsume more structured notions such as the Category of Weakenings (Altenkirch et al. (1995)) or Order Preserving Embeddings (Chapman (2009)), cf. Figure 8 for some examples of combinators. In particular, they do not prevent the user from defining arbitrary permutations or from introducing contractions although we will not use such instances. However, such extra flexibility will not get in our way, and permits a representation as a function space which grants us monoid laws "for free" as per Jeffrey's observation (2011).

    identity : Thinning Γ Γ
    lookup identity k = k

    extend : Thinning Γ (σ ∷ Γ)
    lookup extend v = s v

    select : Thinning Γ Δ → (Δ −Env) V Θ → (Γ −Env) V Θ
    lookup (select ren ρ) k = lookup ρ (lookup ren k)

Fig. 8. Identity Thinning, context extension, and (generalised) transitivity

The □ combinator turns any (List I)-indexed Set into one that can absorb thinnings. This is accomplished by abstracting over all possible thinnings from the current scope, akin to an S4-style necessity modality. The axioms of S4 modal logic incite us to observe that the functor □ is a comonad: extract applies the identity Thinning to its argument, and duplicate is obtained by composing the two Thinnings we are given. The expected laws hold trivially thanks to Jeffrey's trick mentioned above.

The notion of Thinnable is the property of being stable under thinnings; in other words Thinnables are the coalgebras of □. It is a crucial property for values to have if one wants to be able to push them under binders. From the comonadic structure we get that the □ combinator freely turns any (List I)-indexed Set into a Thinnable one.

    □ : (List I → Set) → (List I → Set)
    (□ T) Γ = ∀[ Thinning Γ ⇒ T ]

    Thinnable : (List I → Set) → Set
    Thinnable T = ∀[ T ⇒ □ T ]

    extract : ∀[ □ T ⇒ T ]
    extract t = t identity

    duplicate : ∀[ □ T ⇒ □ (□ T) ]
    duplicate t ρ σ = t (select ρ σ)

    th^□ : Thinnable (□ T)
    th^□ = duplicate

Fig. 9. The □ comonad, Thinnable, and the cofree Thinnable.

As we show in our previous work (ACMM) (2017), equipped with these new notions we can define an abstract concept of semantics for our scope- and type-safe language. Provided that a set of constraints on two (Type −Scoped) families V and C is satisfied, we will obtain a traversal of the following type:

    semantics : (Γ −Env) V Δ → (Lam σ Γ → C σ Δ)

Broadly speaking, a semantics turns our deeply embedded abstract syntax trees into the shallow embedding of the corresponding parametrised higher order abstract syntax term. We get a choice of useful scope- and type-safe traversals by using different 'host languages' for this shallow embedding.

Semantics, specified in terms of a record Semantics, are defined in terms of a choice of values V and computations C. A semantics must satisfy constraints on the notions of values V and computations C at hand.

In the following paragraphs, we interleave the definition of the record of constraints Semantics with explanations of our choices. It is important to understand that all of the indented Agda snippets are part of the record's definition. Some correspond to record fields (highlighted in pink) while others are mere auxiliary definitions (highlighted in blue) as permitted by Agda.

    record Semantics (V C : Type −Scoped) : Set where

First of all, values V should be Thinnable so that semantics may push the environment under binders. We call this constraint th^V, using a caret to generate a mnemonic name: th refers to thinnable and V clarifies the family which is proven to be thinnable.

      th^V : Thinnable (V σ)

Footnote: We use this convention consistently throughout the paper, using names such as vl^Tm for the proof that terms are VarLike.

This constraint allows us to define extend, the generalisation of the two auxiliary definitions we used in Figure 4, in terms of the building blocks introduced in Figure 6. It takes a context extension from Δ to Θ in the form of a thinning, an existing evaluation environment mapping Γ variables to Δ values and a value living in the extended context Θ and returns an evaluation environment mapping (σ ∷ Γ) variables to Θ values.

      extend : Thinning Δ Θ → (Γ −Env) V Δ → V σ Θ → ((σ ∷ Γ) −Env) V Θ
      extend σ ρ v = ((λ t → th^V t σ) <$> ρ) • v

Second, the set of computations needs to be closed under various combinators which are the semantical counterparts of the language's constructors. For instance in the variable case we obtain a value from the evaluation environment but we need to return a computation. This means that values should embed into computations.

      var : ∀[ V σ ⇒ C σ ]

The semantical counterpart of application is an operation that takes a representation of a function and a representation of an argument and produces a representation of the result.

      app : ∀[ C (σ ‘→ τ) ⇒ C σ ⇒ C τ ]

The interpretation of the λ-abstraction is of particular interest: it is a variant on the Kripke function space one can find in normalisation by evaluation (Berger & Schwichtenberg (1991); Berger (1993); Coquand & Dybjer (1997); Coquand (2002)). In all possible thinnings of the scope at hand, it promises to deliver a computation whenever it is provided with a value for its newly bound variable. This is concisely expressed by the constraint's type:

      lam : ∀[ □ (V σ ⇒ C τ) ⇒ C (σ ‘→ τ) ]

Agda allows us to package the definition of the generic traversal function semantics together with the fields of the record Semantics. This causes the definition to be specialised and brought into scope for any instance of Semantics the user will define. We thus realise the promise made earlier, namely that any given Semantics V C induces a function which, given a value in V for each variable in scope, transforms a Lam term into a computation C.

      semantics : (Γ −Env) V Δ → (Lam σ Γ → C σ Δ)
      semantics ρ (‘var k)   = var (lookup ρ k)
      semantics ρ (‘app f t) = app (semantics ρ f) (semantics ρ t)
      semantics ρ (‘lam b)   = lam (λ σ v → semantics (extend σ ρ v) b)

Fig. 10. Fundamental Lemma of Semantics for Lam, relative to a given Semantics V C
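The record fields and the traversal of Figure 10 are presented piecewise, so it can help to see them assembled into a single module that a type checker can process. The following is a sketch under stated assumptions, not the code of the accompanying repository: the index type is fixed to Type, List is redefined locally, quantifiers are written out instead of using the combinators of Figure 1, environments use a prefix record named Env in place of the mixfix _−Env, the extension of the environment is inlined in the ‘lam case, and th^Var is given a plausible definition by lookup. It also includes the renaming instance discussed in the following paragraphs, together with a small check that renaming the closed term apply by the identity thinning leaves it unchanged.

```agda
module SemanticsSketch where

open import Agda.Builtin.Equality using (_≡_; refl)

infixr 5 _∷_
data List (A : Set) : Set where
  []  : List A
  _∷_ : A → List A → List A

infixr 6 _‘→_
data Type : Set where
  α    : Type
  _‘→_ : Type → Type → Type

Scoped : Set₁
Scoped = Type → List Type → Set

data Var : Scoped where
  z : ∀ {σ Γ} → Var σ (σ ∷ Γ)
  s : ∀ {σ τ Γ} → Var σ Γ → Var σ (τ ∷ Γ)

data Lam : Scoped where
  ‘var : ∀ {σ Γ} → Var σ Γ → Lam σ Γ
  ‘app : ∀ {σ τ Γ} → Lam (σ ‘→ τ) Γ → Lam σ Γ → Lam τ Γ
  ‘lam : ∀ {σ τ Γ} → Lam τ (σ ∷ Γ) → Lam (σ ‘→ τ) Γ

-- Environments: one V-value in scope Δ for each variable of Γ.
record Env (Γ : List Type) (V : Scoped) (Δ : List Type) : Set where
  constructor pack
  field lookup : ∀ {σ} → Var σ Γ → V σ Δ
open Env

Thinning : List Type → List Type → Set
Thinning Γ Δ = Env Γ Var Δ

□ : (List Type → Set) → (List Type → Set)
□ T Γ = ∀ {Δ} → Thinning Γ Δ → T Δ

Thinnable : (List Type → Set) → Set
Thinnable T = ∀ {Γ} → T Γ → □ T Γ

-- Assumed definition: thinning a variable is looking it up.
th^Var : ∀ {σ} → Thinnable (Var σ)
th^Var v ρ = lookup ρ v

identity : ∀ {Γ} → Thinning Γ Γ
lookup identity k = k

extend : ∀ {Γ σ} → Thinning Γ (σ ∷ Γ)
lookup extend k = s k

_•_ : ∀ {Γ Δ V σ} → Env Γ V Δ → V σ Δ → Env (σ ∷ Γ) V Δ
lookup (ρ • v) z     = v
lookup (ρ • v) (s k) = lookup ρ k

record Semantics (V C : Scoped) : Set where
  field
    th^V : ∀ {σ} → Thinnable (V σ)
    var  : ∀ {σ Γ} → V σ Γ → C σ Γ
    app  : ∀ {σ τ Γ} → C (σ ‘→ τ) Γ → C σ Γ → C τ Γ
    lam  : ∀ {σ τ Γ} → □ (λ Δ → V σ Δ → C τ Δ) Γ → C (σ ‘→ τ) Γ

  -- The generic traversal; in the ‘lam case every value already in the
  -- environment is thinned by θ before the new value v is added.
  semantics : ∀ {Γ Δ σ} → Env Γ V Δ → Lam σ Γ → C σ Δ
  semantics ρ (‘var k)   = var (lookup ρ k)
  semantics ρ (‘app f t) = app (semantics ρ f) (semantics ρ t)
  semantics ρ (‘lam b)   =
    lam (λ θ v → semantics (pack (λ k → th^V (lookup ρ k) θ) • v) b)

Renaming : Semantics Var Lam
Renaming = record
  { th^V = th^Var
  ; var  = ‘var
  ; app  = ‘app
  ; lam  = λ b → ‘lam (b extend z)
  }

ren : ∀ {Γ Δ σ} → Thinning Γ Δ → Lam σ Γ → Lam σ Δ
ren = Semantics.semantics Renaming

apply : ∀ {σ τ} → Lam ((σ ‘→ τ) ‘→ (σ ‘→ τ)) []
apply = ‘lam (‘lam (‘app (‘var (s z)) (‘var z)))

-- Sanity check: the identity renaming leaves apply unchanged.
_ : ∀ {σ τ} → ren identity (apply {σ} {τ}) ≡ apply
_ = refl
```

Writing the quantifiers out makes the types longer than in the paper, but it shows what the combinators ∀[_], _⇒_ and □ unfold to at each field.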

Semantics

Recall that each

Semantics is parametrised by two families: V and C . During the eval-uation of a term, variables are replaced by values of type V and the overall result is acomputation of type C . Coming back to renaming and substitution, we see that they bothﬁt in the Semantics framework. The family V of values is respectively the family ofvariables for renaming, and the family of λ -terms for substitution. In both cases C is thefamily of λ -terms because the result of the operation will be a term. We notice that the U064-05-FPR jfp19 30 January 2020 2:24

definition of substitution depends on the definition of renaming: to be able to push terms under binders, we need to have already proven that they are thinnable.

  Renaming : Semantics Var Lam
  Renaming = record
    { th^V = th^Var
    ; var  = `var
    ; app  = `app
    ; lam  = λ b → `lam (b extend z) }

  ren : (Γ −Env) Var Δ → Lam σ Γ → Lam σ Δ
  ren = Semantics.semantics Renaming

  Substitution : Semantics Lam Lam
  Substitution = record
    { th^V = λ t ρ → ren ρ t
    ; var  = id
    ; app  = `app
    ; lam  = λ b → `lam (b extend (`var z)) }

  sub : (Γ −Env) Lam Δ → Lam σ Γ → Lam σ Δ
  sub = Semantics.semantics Substitution

Fig. 11. Renaming and Substitution as Instances of Semantics

In both cases we use (extend), defined in Figure 8 as (pack s) (where pack is the constructor for environments and s, defined in Section 2, is the function lifting an existing de Bruijn variable into an extended scope), as the definition of the thinning embedding Γ into (σ :: Γ).

We also include the definition of a basic printer relying on a name supply to highlight the fact that computations can very well be effectful. The ability to generate fresh names is given to us by a monad that here we decide to call Fresh. Concretely, Fresh is implemented as an instance of the State monad where the state is a stream of distinct strings. The Printing semantics is defined by using Names (i.e. Strings) as values and Printers (i.e. monadic actions in Fresh returning a String) as computations. We use a Wrapper with a type and a context as phantom types in order to help Agda's inference propagate the appropriate constraints. We define a function fresh that fetches a name from the name supply and makes sure it is not available anymore.

  record Wrap (A : Set) (σ : I) (Γ : List I) : Set where
    constructor MkW; field getW : A

  Fresh : Set → Set
  Fresh = State (Stream String _)

  Name : I −Scoped
  Name = Wrap String

  Printer : I −Scoped
  Printer = Wrap (Fresh String)

  fresh : ∀ σ → Fresh (Name σ (σ :: Γ))
  fresh σ = do
    names ← get
    put (tail names)
    pure (MkW (head names))

Fig. 12. Wrapper and fresh name generation

The wrapper Wrap does not depend on the scope Γ so it is automatically a thinnable functor, that is to say that we have the (used but not shown here) definitions map^Wrap witnessing the functoriality of Wrap and th^Wrap witnessing its thinnability. We jump straight to the definition of the printer.

To print a variable, we are handed the Name associated to it by the environment and return it immediately.

  var : ∀[ Name σ ⇒ Printer σ ]
  var = map^Wrap return

To print an application, we produce a string representation, f, of the term in function position, then one, t, of its argument and combine them by putting the argument between parentheses.

  app : ∀[ Printer (σ `→ τ) ⇒ Printer σ ⇒ Printer τ ]
  app mf mt = MkW do
    f ← getW mf
    t ← getW mt
    return (f ++ " (" ++ t ++ ")")

To print a λ-abstraction, we start by generating a fresh name, x, for the newly-bound variable, use that name to generate a string b representing the body of the function, to which we prepend a "λ" binding the name x.

  lam : ∀[ □ (Name σ ⇒ Printer τ) ⇒ Printer (σ `→ τ) ]
  lam {σ} mb = MkW do
    x ← fresh σ
    b ← getW (mb extend x)
    return ("λ" ++ getW x ++ ". " ++ b)

Putting all of these pieces together, we get the Printing semantics shown in Figure 13.

  Printing : Semantics Name Printer
  Printing = record { th^V = th^Wrap; var = var; app = app; lam = lam }

Fig. 13. Printing as an instance of Semantics

We show how one can use this newly-defined semantics to implement print, a printer for closed terms, assuming that we have already defined names, a stream of distinct strings used as our name supply. We show the result of running print on the term apply (first introduced in Figure 2).

  print : Lam σ [] → String
  print t = proj₁ (getW printer names) where

    empty : ([] −Env) Name []
    empty = ε

    printer = semantics Printing empty t

  apply : Lam ((σ `→ τ) `→ (σ `→ τ)) []
  apply = `lam (`lam (`app (`var (s z)) (`var z)))

  _ : print apply ≡ "λa. λb. a (b)"
  _ = refl

Both printing and renaming highlight the importance of distinguishing values and computations: the type of values in their respective environments is distinct from their type of computations.

All of these examples are already described at length by ACMM (2017) so we will not spend any more time on them. In ACMM we have also obtained the simulation and fusion theorems demonstrating that these traversals are well behaved as corollaries of more general results expressed in terms of semantics. We will come back to this in Section 9.2.

One important observation to make is the tight connection between the constraints described in Semantics and the definition of Lam: the semantical counterparts of the Lam constructors are obtained by replacing the recursive occurrences of the inductive family with either a computation or a Kripke function space depending on whether an extra variable was bound. This suggests that it ought to be possible to compute the definition of Semantics from the syntax description. Before doing this in Section 5, we need to look at generic descriptions of datatypes.

Chapman, Dagand, McBride and Morris (CDMM) (2010) defined a universe of data types inspired by Dybjer and Setzer's finite axiomatisation of inductive-recursive definitions (1999) and Benke, Dybjer and Jansson's universes for generic programs and proofs (2003). This explicit definition of codes for data types empowers the user to write generic programs tackling all of the data types one can obtain this way. In this section we recall the main aspects of this construction we are interested in to build up our generic representation of syntaxes with binding.

The first component of the definition of CDMM's universe (Figure 14) is an inductive type of Descriptions of strictly positive functors from Set^J to Set^I. These functors correspond to I-indexed containers of J-indexed payloads. Keeping these index types distinct prevents mistaking one for the other when constructing the interpretation of descriptions. Later of course we can use these containers as the nodes of recursive datastructures by interpreting some payload sorts as requests for subnodes (Altenkirch et al. (2015a)).

The inductive type of descriptions has three constructors: `σ to store data (the rest of the description can depend upon this stored value), `X to attach a recursive substructure indexed by J and `∎ to stop with a particular index value.

The recursive function ⟦_⟧ makes the interpretation of the descriptions formal. The interpretation of a description gives rise to right-nested tuples terminated by equality constraints.

  data Desc (I J : Set) : Set where
    `σ : (A : Set) → (A → Desc I J) → Desc I J
    `X : J → Desc I J → Desc I J
    `∎ : I → Desc I J

  ⟦_⟧ : Desc I J → (J → Set) → (I → Set)
  ⟦ `σ A d ⟧ X i = Σ[ a ∈ A ] (⟦ d a ⟧ X i)
  ⟦ `X j d ⟧ X i = X j × ⟦ d ⟧ X i
  ⟦ `∎ i′  ⟧ X i = i ≡ i′

Fig. 14. Datatype Descriptions and their Meaning as Functors

These constructors give the programmer the ability to build up the data types they are used to. For instance, the functor corresponding to lists of elements in A stores a Boolean which stands for whether the current node is the empty list or not. Depending on its value, the rest of the description is either the "stop" token or a pair of an element in A and a recursive substructure, i.e. the tail of the list. The List type is unindexed; we represent the lack of an index with the unit type ⊤ whose unique inhabitant is tt.

  listD : Set → Desc ⊤ ⊤
  listD A = `σ Bool $ λ isNil →
    if isNil then `∎ tt
    else `σ A (λ _ → `X tt (`∎ tt))

Fig. 15. The Description of the base functor for List A

Indices can be used to enforce invariants. For example, consider the type Vec A n of length-indexed lists. Its description has the same structure as the definition of listD. We start with a Boolean distinguishing the two constructors: either the empty list (in which case the branch's index is enforced to be 0) or a non-empty one, in which case we store a natural number n, the head of type A and a tail of size n (and the branch's index is enforced to be suc n).

  vecD : Set → Desc ℕ ℕ
  vecD A = `σ Bool $ λ isNil →
    if isNil then `∎ 0
    else `σ ℕ (λ n → `σ A (λ _ → `X n (`∎ (suc n))))

Fig. 16. The Description of the base functor for Vec A n

The payoff for encoding our datatypes as descriptions is that we can define generic programs for whole classes of data types. The decoding function ⟦_⟧ acted on the objects of

Set^J, and we will now define the function fmap by recursion over a code d. It describes the action of the functor corresponding to d over morphisms in Set^J. This is the first example of generic programming over all the functors one can obtain as the meaning of a description.

  fmap : (d : Desc I J) → ∀[ X ⇒ Y ] → ∀[ ⟦ d ⟧ X ⇒ ⟦ d ⟧ Y ]
  fmap (`σ A d) f (a , v) = (a , fmap (d a) f v)
  fmap (`X j d) f (r , v) = (f r , fmap d f v)
  fmap (`∎ i)   f t       = t

Fig. 17. Action on Morphisms of the Functor corresponding to a Description

All the functors obtained as meanings of Descriptions are strictly positive. So we can build the least fixpoint of the ones that are endofunctors (i.e. the ones for which I equals J). This fixpoint is called μ and its iterator is given by the definition of fold d.

  data μ (d : Desc I I) : Size → I → Set where
    `con : ⟦ d ⟧ (μ d s) i → μ d (↑ s) i

  fold : (d : Desc I I) → ∀[ ⟦ d ⟧ X ⇒ X ] → ∀[ μ d s ⇒ X ]
  fold d alg (`con t) = alg (fmap d (fold d alg) t)

Fig. 18. Least Fixpoint of an Endofunctor and Corresponding Generic Fold

NB In Figure 18 the Size (Abel (2010)) index added to the inductive definition of μ plays a crucial role in getting the termination checker to see that fold is a total function.

We can see in Figure 19 that we can recover the types we are used to thanks to this least fixpoint. Pattern synonyms let us hide away the encoding: users can use them to pattern-match on lists and Agda conveniently resugars them when displaying a goal. Finally, we can get our hands on the types' eliminators by instantiating the generic fold.

  List : Set → Set
  List A = μ (listD A) ∞ tt

  pattern []'         = (true , refl)
  pattern []          = `con []'
  pattern _::'_ x xs  = (false , x , xs , refl)
  pattern _::_ x xs   = `con (x ::' xs)

  foldr : (A → B → B) → B → List A → B
  foldr c n = fold (listD _) $ λ where
    []'           → n
    (hd ::' rec)  → c hd rec

Fig. 19. List, its constructors, and eliminator

The CDMM approach therefore allows us to generically define iteration principles for all data types that can be described. These are exactly the features we desire for a universe of data types with binding, so in the next section we will see how to extend CDMM's approach to include binding.

The functor underlying any well scoped and sorted syntax can be coded as some Desc (I × List I) (I × List I), with the free monad construction from CDMM uniformly adding the variable case. Whilst a good start, Desc treats its index types as unstructured, so this construction is blind to what makes the List I index a scope. The resulting 'bind' operator demands a function which maps variables in any sort and scope to terms in the same sort and scope. However, the behaviour we need is to preserve sort while mapping between specific source and target scopes which may differ. We need to account for the fact that scopes change only by extension, and hence that our specifically scoped operations can be pushed under binders by weakening.

Our universe of scope safe and well kinded syntaxes (defined in Figures 20, 21) follows the same principle as CDMM's universe of datatypes, except that we are not building endofunctors on

Set^I any more but rather on I −Scoped. We now think of the index type I as the sorts used to distinguish terms in our embedded language. The `σ and `∎ constructors are as in the CDMM Desc type, and are used to represent data and index constraints respectively. What distinguishes this new universe Desc from that of Section 4 is that the `X constructor is now augmented with an additional List I argument that describes the new binders that are brought into scope at this recursive position. This list of the kinds of the newly-bound variables will play a crucial role when defining the description's semantics as a binding structure in Figures 21, 22 and 23.

  data Desc (I : Set) : Set where
    `σ : (A : Set) → (A → Desc I) → Desc I
    `X : List I → I → Desc I → Desc I
    `∎ : I → Desc I

Fig. 20. Syntax Descriptions

The meaning function ⟦_⟧ we associate to a description follows closely its CDMM equivalent. It only departs from it in the `X case and the fact it is not an endofunctor on I −Scoped; it is more general than that. The function takes an X of type List I → I −Scoped to interpret `X Δ j (i.e. substructures of sort j with newly-bound variables in Δ) in an ambient scope Γ as X Δ j Γ.

  ⟦_⟧ : Desc I → (List I → I −Scoped) → I −Scoped
  ⟦ `σ A d   ⟧ X i Γ = Σ[ a ∈ A ] (⟦ d a ⟧ X i Γ)
  ⟦ `X Δ j d ⟧ X i Γ = X Δ j Γ × ⟦ d ⟧ X i Γ
  ⟦ `∎ j     ⟧ X i Γ = i ≡ j

Fig. 21. Descriptions' Meanings

The astute reader may have noticed that ⟦_⟧ is uniform in X and Γ; however refactoring ⟦_⟧ to use the partially applied X _ _ Γ following this observation would lead to a definition harder to use with the combinators for indexed sets described in Section 2 which make our types much more readable.

If we pre-compose the meaning function ⟦_⟧ with a notion of 'de Bruijn scopes' (denoted Scope here) which turns any I −Scoped family into a function of type List I → I −Scoped by appending the two List indices, we recover a meaning function producing an endofunctor on I −Scoped. So far we have only shown the action of the functor on objects; its action on morphisms is given by a function fmap defined by induction over the description just as in Section 4.

  Scope : I −Scoped → List I → I −Scoped
  Scope T Δ i = (Δ ++_) ⊢ T i

Fig. 22. De Bruijn Scopes

The endofunctors thus defined are strictly positive and we can take their fixpoints. As we want to define the terms of a language with variables, instead of considering the initial algebra, this time we opt for the free relative monad (Altenkirch et al. (2014)) (with respect to the functor Var): the `var constructor corresponds to return, and we will define bind (also known as the parallel substitution sub) in the next section.

  data Tm (d : Desc I) : Size → I −Scoped where
    `var : ∀[ Var i ⇒ Tm d (↑ s) i ]
    `con : ∀[ ⟦ d ⟧ (Scope (Tm d s)) i ⇒ Tm d (↑ s) i ]

Fig. 23. Term Trees: The Free Var-Relative Monads on Descriptions

Coming back to our original examples, we now have the ability to give codes for the well scoped untyped λ-calculus and, just as well, the intrinsically typed simply typed λ-calculus. We add a third example to showcase the whole spectrum of syntaxes: a well scoped and well sorted but not well typed bidirectional language. In all examples, the variable case will be added by the free monad construction so we only have to describe the other constructors.
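To make the interplay between Figures 20, 21 and 22 concrete, the following self-contained sketch transcribes the description type, its meaning function and de Bruijn scopes, and asks Agda to confirm by refl how a single layer with one binder unfolds. It is only a sketch under stated assumptions, not the library's actual code: it assumes the Agda standard library, writes ─Scoped with a box-drawing dash, places Desc in Set₁ so that it can store a Set, spells out the _⊢_ combinator by hand, and uses the hypothetical names LamD and unfold-lam.

```agda
open import Data.List using (List; []; _∷_; _++_)
open import Data.Product using (Σ; _×_)
open import Data.Unit using (⊤; tt)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Scoped families: indexed by a sort and by the list of sorts in scope.
_─Scoped : Set → Set₁
I ─Scoped = I → List I → Set

-- Syntax descriptions (Figure 20); Set₁ here because `σ stores a Set.
data Desc (I : Set) : Set₁ where
  `σ : (A : Set) → (A → Desc I) → Desc I
  `X : List I → I → Desc I → Desc I
  `∎ : I → Desc I

-- Meaning of a description (Figure 21).
⟦_⟧ : ∀ {I} → Desc I → (List I → I ─Scoped) → I ─Scoped
⟦ `σ A d   ⟧ X i Γ = Σ A (λ a → ⟦ d a ⟧ X i Γ)
⟦ `X Δ j d ⟧ X i Γ = X Δ j Γ × ⟦ d ⟧ X i Γ
⟦ `∎ j     ⟧ X i Γ = i ≡ j

-- De Bruijn scopes (Figure 22), with the _⊢_ combinator unfolded.
Scope : ∀ {I} → I ─Scoped → List I → I ─Scoped
Scope T Δ i Γ = T i (Δ ++ Γ)

-- Hypothetical example: a single constructor binding one variable.
LamD : Desc ⊤
LamD = `X (tt ∷ []) tt (`∎ tt)

-- One layer is the body, living in a scope extended by the bound
-- variable, paired with the (trivial) index constraint.
unfold-lam : (T : ⊤ ─Scoped) (Γ : List ⊤) →
             ⟦ LamD ⟧ (Scope T) tt Γ ≡ (T tt (tt ∷ Γ) × (tt ≡ tt))
unfold-lam T Γ = refl
```

The proof by refl only goes through because the scope extension (tt ∷ []) ++ Γ computes to tt ∷ Γ: prepending the newly bound variables is what lets the list append reduce definitionally.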

Un(i)typed λ-calculus. For the untyped case, the lack of type translates to picking the unit type (⊤) as our notion of sort. We have two possible constructors: application where we have two substructures which do not bind any extra argument and λ-abstraction which has exactly one substructure with precisely one extra bound variable. A single Boolean is enough to distinguish the two constructors.

  UTLC : Desc ⊤
  UTLC = `σ Bool $ λ isApp →
    if isApp then `X [] tt (`X [] tt (`∎ tt))
    else `X (tt :: []) tt (`∎ tt)

Fig. 24. Description for the Untyped λ-calculus

Bidirectional STLC. Our second example is a bidirectional (Pierce & Turner (2000)) language, hence the introduction of a notion of Mode: each term is either part of the Infer or Check fraction of the language. This language has four constructors which we list in the ad-hoc `Bidi type of constructor tags; its decoding Bidi is defined by a pattern-matching λ-expression in Agda. Application and λ-abstraction behave as expected, with the important observation that λ-abstraction binds an Inferrable term. The two remaining constructors correspond to changes of direction: one can freely Embed inferrable terms as checkable ones whereas we require a type annotation when forming a Cut (we reuse the notion of Type introduced in Figure 3).

  data Mode : Set where
    Check Infer : Mode

  data `Bidi : Set where
    App Lam Emb : `Bidi
    Cut : Type → `Bidi

  Bidi : Desc Mode
  Bidi = `σ `Bidi $ λ where
    App     → `X [] Infer (`X [] Check (`∎ Infer))
    Lam     → `X (Infer :: []) Check (`∎ Check)
    (Cut σ) → `X [] Check (`∎ Infer)
    Emb     → `X [] Infer (`∎ Check)

Fig. 25. Description for the bidirectional STLC

Intrinsically typed STLC. In the typed case (for the same notion of Type defined in Figure 3), we are back to two constructors: the terms are fully annotated and therefore it is not necessary to distinguish between Modes anymore. We need our tags to carry extra information about the types involved so we use once more an ad-hoc datatype `STLC, and define its decoding STLC by a pattern-matching λ-expression.

  data `STLC : Set where
    App Lam : Type → Type → `STLC

  STLC : Desc Type
  STLC = `σ `STLC $ λ where
    (App σ τ) → `X [] (σ `→ τ) (`X [] σ (`∎ τ))
    (Lam σ τ) → `X (σ :: []) τ (`∎ (σ `→ τ))

Fig. 26. Description for the intrinsically typed STLC

For convenience we use Agda's pattern synonyms corresponding to the original constructors in Section 2. These synonyms can be used when pattern-matching on a term and Agda resugars them when displaying a goal. This means that the end user can seamlessly work with encoded terms without dealing with the gnarly details of the encoding. These pattern definitions can omit some arguments by using "_", in which case they will be filled in by unification just like any other implicit argument: there is no extra cost to using an encoding! The only downside is that the language currently does not allow the user to specify type annotations for pattern synonyms. We only include examples of pattern synonyms for the two extreme examples; the definitions for Bidi are similar.

  pattern `app f t = `con (true , f , t , refl)
  pattern `lam b   = `con (false , b , refl)

  pattern `app f t = `con (App _ _ , f , t , refl)
  pattern `lam b   = `con (Lam _ _ , b , refl)

Fig. 27. Respective Pattern Synonyms for UTLC and STLC.

As a usage example of these pattern synonyms, we define the identity function in all three languages in Figure 28, using the same caret-based naming convention we introduced earlier. The code is virtually the same except for Bidi which explicitly records the change of direction from Check to Infer.

  id^U : Tm UTLC ∞ tt []
  id^U = `lam (`var z)

  id^B : Tm Bidi ∞ Check []
  id^B = `lam (`emb (`var z))

  id^S : Tm STLC ∞ (σ `→ σ) []
  id^S = `lam (`var z)

Fig. 28. Identity function in all three languages

It is the third time (the first and second times being the definitions of listD and vecD in Figures 15 and 16) that we use a Bool to distinguish between two constructors. In order to avoid re-encoding the same logic, the next section introduces combinators demonstrating that descriptions are closed under finite sums.

Common Combinators and Their Properties. As seen previously, we can use a dependent pair whose first component is a Boolean to take the coproduct of two descriptions: depending on the value of the first component, we will return one or the other. We can abstract this common pattern as a combinator _`+_ together with an appropriate eliminator case which, given two continuations, picks the one corresponding to the chosen branch.

  _`+_ : Desc I → Desc I → Desc I
  d `+ e = `σ Bool $ λ isLeft →
    if isLeft then d else e

  case : (⟦ d ⟧ X i Γ → A) → (⟦ e ⟧ X i Γ → A) → (⟦ d `+ e ⟧ X i Γ → A)
  case l r (true , t)  = l t
  case l r (false , t) = r t

Fig. 29. Descriptions are closed under Sum

A concrete use case for this combinator will be given in Section 7.5 where we explain how to seamlessly enrich an existing syntax with let-bindings and how to use the Semantics framework to elaborate them away.
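As a small illustration of the sum combinator, the sketch below rebuilds the untyped λ-calculus description of Figure 24 as the sum of one description per constructor and checks that the result coincides with the hand-written, Boolean-tagged version. It is a self-contained sketch under stated assumptions: it relies on the Agda standard library, places Desc in Set₁ so that it can store a Set, and the names AppD, LamD, UTLC′ and same are hypothetical.

```agda
open import Data.Bool using (Bool; if_then_else_)
open import Data.List using (List; []; _∷_)
open import Data.Unit using (⊤; tt)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Syntax descriptions (Figure 20); Set₁ here because `σ stores a Set.
data Desc (I : Set) : Set₁ where
  `σ : (A : Set) → (A → Desc I) → Desc I
  `X : List I → I → Desc I → Desc I
  `∎ : I → Desc I

-- Sum of two descriptions (Figure 29): true selects the left summand.
_`+_ : ∀ {I} → Desc I → Desc I → Desc I
d `+ e = `σ Bool (λ isLeft → if isLeft then d else e)

-- Hypothetical names: one description per constructor.
AppD LamD : Desc ⊤
AppD = `X [] tt (`X [] tt (`∎ tt))
LamD = `X (tt ∷ []) tt (`∎ tt)

-- The untyped λ-calculus as a sum of its two constructors.
UTLC′ : Desc ⊤
UTLC′ = AppD `+ LamD

-- The hand-written Boolean-tagged description of Figure 24.
UTLC : Desc ⊤
UTLC = `σ Bool (λ isApp →
  if isApp then `X [] tt (`X [] tt (`∎ tt))
           else `X (tt ∷ []) tt (`∎ tt))

-- Both definitions unfold to the same description.
same : UTLC′ ≡ UTLC
same = refl
```

Because the two descriptions are definitionally equal, pattern synonyms such as those of Figure 27 would apply unchanged to terms built from UTLC′.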

Based on the Semantics type we defined for the specific example of the simply typed λ-calculus in Section 3, we can define a generic notion of semantics for all syntax descriptions. It is once more parametrised by two I −Scoped families V and C corresponding respectively to values associated to bound variables and computations delivered by evaluating terms. These two families have to abide by three constraints:

• th^V: Values should be thinnable so that we can push the evaluation environment under binders;
• var: Values should embed into computations for us to be able to return the value associated to a variable as the result of its evaluation;
• alg: We should have an algebra turning a term whose substructures have been replaced with computations (possibly under some binders, represented semantically by the Kripke type-valued function defined below) into computations.

  record Semantics (d : Desc I) (V C : I −Scoped) : Set where
    field th^V : Thinnable (V σ)
          var  : ∀[ V σ ⇒ C σ ]
          alg  : ∀[ ⟦ d ⟧ (Kripke V C) σ ⇒ C σ ]

Fig. 30. A Generic Notion of Semantics

Here we crucially use the fact that the meaning of a description is defined in terms of a function interpreting substructures which has the type List I → I −Scoped, i.e. that gets access to the current scope but also the exact list of the kinds of the newly bound variables. We define a function Kripke by case analysis on the number of newly bound variables. It is essentially a subcomputation waiting for a value associated to each one of the fresh variables.

• If it's 0 we expect the substructure to be a computation corresponding to the result of the evaluation function's recursive call;
• But if there are newly bound variables then we expect to have a function space. In any context extension, it will take an environment of values for the newly-bound variables and produce a computation corresponding to the evaluation of the body of the binder.

  Kripke : (V C : I −Scoped) → (List I → I −Scoped)
  Kripke V C [] j = C j
  Kripke V C Δ  j = □ ((Δ −Env) V ⇒ C j)

Fig. 31. Substructures as either Computations or Kripke Function Spaces

It is once more the case that the abstract notion of Semantics comes with a fundamental lemma: all I −Scoped families V and C satisfying the three criteria we have put forward give rise to an evaluation function. We introduce a notion of computation _−Comp analogous to that of environments: instead of associating values to variables, it associates computations to terms.

  _−Comp : List I → I −Scoped → List I → Set
  (Γ −Comp) C Δ = ∀ {s σ} → Tm d s σ Γ → C σ Δ

We can now define the type of the fundamental lemma (called semantics) which takes a semantics and returns a function from environments to computations. It is defined mutually with a function body turning syntactic binders into semantic binders: to each de Bruijn Scope (i.e. a substructure in a potentially extended context) it associates a Kripke (i.e. a subcomputation expecting a value for each newly bound variable).

  semantics : (Γ −Env) V Δ → (Γ −Comp) C Δ
  body      : (Γ −Env) V Δ → ∀ Θ σ → Scope (Tm d s) Θ σ Γ → Kripke V C Θ σ Δ

Fig. 32. Statement of the Fundamental Lemma of Semantics
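The Kripke type that body targets can be unfolded on small cases to see what a semantic binder expects. The sketch below is self-contained and rests on several assumptions that are not taken from the library: it uses the Agda standard library, redefines minimal versions of Var, environments, thinnings and □, writes ─Scoped and ─Env with a box-drawing dash, replaces the _⇒_ combinator by a plain function space, and splits on (σ ∷ Δ) where Figure 31 uses a catch-all clause. The names kripke-nil and kripke-one are hypothetical.

```agda
open import Data.List using (List; []; _∷_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

_─Scoped : Set → Set₁
I ─Scoped = I → List I → Set

-- Well scoped and sorted de Bruijn variables.
data Var {I : Set} : I → List I → Set where
  z : ∀ {σ Γ} → Var σ (σ ∷ Γ)
  s : ∀ {σ τ Γ} → Var σ Γ → Var σ (τ ∷ Γ)

-- Environments: a V-value in scope Δ for every variable of Γ.
record _─Env {I : Set} (Γ : List I) (V : I ─Scoped) (Δ : List I) : Set where
  constructor pack
  field lookup : ∀ {σ} → Var σ Γ → V σ Δ

-- Thinnings are environments of variables.
Thinning : ∀ {I} → List I → List I → Set
Thinning Γ Δ = (Γ ─Env) Var Δ

-- □ T holds in Γ when T holds in every scope Γ can be thinned into.
□ : ∀ {I} → (List I → Set) → (List I → Set)
□ T Γ = ∀ {Θ} → Thinning Γ Θ → T Θ

-- Figure 31, with _⇒_ unfolded into a plain function space.
Kripke : ∀ {I} → (V C : I ─Scoped) → (List I → I ─Scoped)
Kripke V C []      j = C j
Kripke V C (σ ∷ Δ) j = □ (λ Θ → ((σ ∷ Δ) ─Env) V Θ → C j Θ)

-- No binder: the Kripke function space is just a computation.
kripke-nil : ∀ {I} (V C : I ─Scoped) (j : I) (Γ : List I) →
             Kripke V C [] j Γ ≡ C j Γ
kripke-nil V C j Γ = refl

-- One binder of sort σ: given a thinning into Θ and a value for the
-- bound variable in Θ, we obtain a computation in Θ.
kripke-one : ∀ {I} (V C : I ─Scoped) (σ j : I) (Γ : List I) →
             Kripke V C (σ ∷ []) j Γ
             ≡ (∀ {Θ} → Thinning Γ Θ → ((σ ∷ []) ─Env) V Θ → C j Θ)
kripke-one V C σ j Γ = refl
```

Splitting on the empty list first is what spares binder-free substructures a trivial thinning and an empty environment of values.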

The proof of semantics is straightforward now that we have clearly identified the problem structure and the constraints we need to enforce. If the term considered is a variable, we look up the associated value in the evaluation environment and turn it into a computation using var. If it is a non-variable constructor then we call fmap to evaluate the substructures using body and then call the algebra to combine these results.

  semantics ρ (`var k) = var (lookup ρ k)
  semantics ρ (`con t) = alg (fmap d (body ρ) t)

Fig. 33. Proof of the Fundamental Lemma of Semantics – semantics

The auxiliary lemma body distinguishes two cases. If no new variable has been bound in the recursive substructure, it is a matter of calling semantics recursively. Otherwise we are provided with a Thinning and some additional values, and we evaluate the substructure in the thinned and extended evaluation environment (thanks to an auxiliary function _>>_ which, given two environments (Γ −Env) V Θ and (Δ −Env) V Θ, produces an environment ((Γ ++ Δ) −Env) V Θ).

  body ρ []       i t = semantics ρ t
  body ρ (_ :: _) i t = λ σ vs → semantics (vs >> th^Env th^V ρ σ) t

Fig. 34. Proof of the Fundamental Lemma of Semantics – body

Given that fmap introduces one level of indirection between the recursive calls and the subterms they are acting upon, the fact that our terms are indexed by a Size is once more crucial in getting the termination checker to see that our proof is indeed well founded.

We immediately introduce closed, a corollary of the fundamental lemma of semantics for the special case of closed terms, in Figure 35. Given a Semantics with value type V and computation type C, we can evaluate a closed term of type σ and obtain a computation of type (C σ []) by kickstarting the evaluation with an empty environment.

  closed : TM d σ → C σ []
  closed = semantics ε

Fig. 35. Corollary: evaluation of closed terms

Similarly to ACMM (2017), renaming can be defined generically for all syntax descriptions as a semantics with Var as values and Tm as computations. The first two constraints on Var described earlier are trivially satisfied. Observing that renaming strictly respects the structure of the term it goes through, it makes sense for the algebra to be implemented using fmap. When dealing with the body of a binder, we 'reify' the Kripke function by evaluating it in an extended context and feeding it placeholder values corresponding to the extra variables introduced by that context. This is reminiscent both of what we did in Section 3 and the definition of reification in the setting of normalisation by evaluation (see e.g. Catarina Coquand's formal development (2002)).

Substitution is defined in a similar manner with Tm as both values and computations. Of the two constraints applying to terms as values, the first one corresponds to renaming and the second one is trivial. The algebra is once more defined by using fmap and reifying the bodies of binders.

  Ren : Semantics d Var (Tm d ∞)
  Ren .th^V = th^Var
  Ren .var  = `var
  Ren .alg  = `con ∘ fmap d (reify vl^Var)

  ren : (Γ −Env) Var Δ → Tm d ∞ σ Γ → Tm d ∞ σ Δ
  ren ρ t = Semantics.semantics Ren ρ t

  Sub : Semantics d (Tm d ∞) (Tm d ∞)
  Sub .th^V = th^Tm
  Sub .var  = id
  Sub .alg  = `con ∘ fmap d (reify vl^Tm)

  sub : (Γ −Env) (Tm d ∞) Δ → Tm d ∞ σ Γ → Tm d ∞ σ Δ
  sub ρ t = Semantics.semantics Sub ρ t

Fig. 36. Generic Renaming and Substitution for All Scope Safe Syntaxes with Binding

The reification process mentioned in the definition of renaming and substitution can be implemented generically for Semantics families which have VarLike values, i.e. values which are Thinnable and such that we can craft placeholder values in non-empty contexts. It is almost immediate that both Var and Tm are VarLike (with proofs vl^Var and vl^Tm, respectively).

  record VarLike (V : I −Scoped) : Set where
    field th^V : Thinnable (V σ)
          new  : ∀[ (σ ::_) ⊢ V σ ]

Fig. 37. VarLike: Thinnable and with placeholder values

Given a proof that V is VarLike, we can manufacture several useful environments of values V. We provide users with base of type (Γ −Env) V Γ, freshʳ of type (Γ −Env) V (Δ ++ Γ) and freshˡ of type (Γ −Env) V (Γ ++ Δ) by combining the use of placeholder values and thinnings. In the Var case these very general definitions respectively specialise to the identity renaming for a context Γ and the injection of Γ fresh variables to the right or the left of an ambient context Δ. Similarly, in the Tm case, we can show base vl^Tm extensionally equal to the identity environment id^Tm given by lookup id^Tm = `var, which associates each variable to itself (seen as a term). Using these definitions, we can then implement reify as in Figure 38.

  reify : VarLike V → ∀ Δ i → Kripke V C Δ i Γ → Scope C Δ i Γ
  reify vl^V []         i b = b
  reify vl^V Δ@(_ :: _) i b = b (freshʳ vl^Var Δ) (freshˡ vl^V _)

Fig. 38. Generic Reification thanks to VarLike Values

In this section we explore a large part of the spectrum of traversals a compiler writer mayneed when implementing their own language. In Section 7.1 we look at the productionof human-readable representations of internal syntax; in Section 7.2 we write a genericscope checker thus bridging the gap between raw data fresh out of a parser to well scopedsyntax; we then demonstrate how to write a type checker in Section 7.3 and even anelaboration function turning well scoped into well scoped and typed syntax in Section 7.4.We then study type and scope respecting transformations on internal syntax: desugaring inSection 7.5 and size preserving inlining in Section 7.6. We conclude with an unsafe butgeneric evaluator deﬁned using normalisation by evaluation in Section 7.7.

We have seen in Section 3.3 that printing with names is an instance of ACMM's notion of Semantics. We will now show that this observation can be generalised to arbitrary syntaxes with binding. Unlike renaming or substitution, this generic program will require user guidance: there is no way for us to guess how an encoded term should be printed. We can however take care of the name generation (using the monad Fresh introduced in Figure 12), deal with variable binding, and implement the traversal generically. We want our printer to have type:

  print : Display d → Tm d i σ Γ → String

where Display explains how to print one 'layer' of term provided that we are handed the Pieces corresponding to the printed subterm and names for the bound variables:

  Display : Desc I → Set
  Display d = ∀ {i Γ} → ⟦ d ⟧ Pieces i Γ → String

Reusing the notion of Name introduced in Section 3.3, we can make Pieces formal. A subterm has already been printed if we have a string representation of it together with an environment of Names we have attached to the newly-bound variables this structure contains. That is to say:

  Pieces : List I → I −Scoped
  Pieces [] i Γ = String
  Pieces Δ  i Γ = (Δ −Env) Name (Δ ++ Γ) × String

The key observation that will help us define a generic printer is that Fresh composed with Name is VarLike. Indeed, as the composition of a functor and a trivially thinnable Wrapper, Fresh is Thinnable, and fresh (defined in Figure 12) is the proof that we can generate placeholder values thanks to the name supply.

  vl^FreshName : VarLike (λ (σ : I) → Fresh ∘ (Name σ))
  vl^FreshName = record
    { th^V = th^Functor functor^M th^Wrap
    ; new  = fresh _ }

This VarLike instance empowers us to reify in an effectful manner a Kripke function space taking Names and returning a Printer to a set of Pieces.

  reify^Pieces : ∀ Δ i → Kripke Name Printer Δ i Γ → Fresh (Pieces Δ i Γ)

In case there are no newly bound variables, the Kripke function space collapses to a mere Printer which is precisely the wrapped version of the type we expect.

  reify^Pieces [] i p = getW p

Otherwise we proceed in a manner reminiscent of the pure reification function defined in Figure 38. We start by generating an environment of names for the newly-bound variables by using the fact that Fresh composed with Name is VarLike together with the fact that environments are Traversable (McBride & Paterson (2008)), and thus admit the standard Haskell-like mapA and sequenceA traversals. We then run the Kripke function on these names to obtain the string representation of the subterm. We finally return the names we used together with this string.

  reify^Pieces Δ@(_ :: _) i f = do
    ρ ← sequenceA (freshˡ vl^FreshName _)
    b ← getW (f (freshʳ vl^Var Δ) ρ)
    return (ρ , b)

We can put all of these pieces together to obtain the Printing semantics presented in Figure 39. The first two constraints can be trivially discharged. When defining the algebra we start by reifying the subterms, then use the fact that one "layer" of term of our syntaxes with binding is always traversable to combine all of these results into a value we can apply our display function to.

  Printing : Display d → Semantics d Name Printer
  Printing dis .th^V = th^Wrap
  Printing dis .var  = map^Wrap return
  Printing dis .alg  = λ v → MkW $ dis <$> mapA d reify^Pieces v

Fig. 39. Printing with Names as a Semantics

This allows us to write a printer for open terms as demonstrated in Figure 40. We start by using base (defined in Section 6.2) to generate an environment of Names for the free variables, then use our semantics to get a printer which we can run using a stream names of distinct strings as our name supply.

  print : Display d → Tm d i σ Γ → String
  print dis t = proj₁ (printer names) where

    printer : Fresh String
    printer = do
      init ← sequenceA (base vl^FreshName)
      getW (Semantics.semantics (Printing dis) init t)

Fig. 40. Generic Printer for Open Terms

Untyped λ-calculus

Defining a printer for the untyped λ-calculus is now very easy: we define a Display by case analysis. In the application case, we take the string representation of the function, wrap its argument’s representation in parentheses, and concatenate the two. In the lambda abstraction case, we are handed the name the bound variable was assigned together with the body’s representation; it is once more a matter of putting the Pieces together.

    printUTLC : Display UTLC
    printUTLC = λ where
      (‘app’ f t)     → f ++ " (" ++ t ++ ")"
      (‘lam’ (x , b)) → "λ" ++ getW (lookup x z) ++ ". " ++ b

As always, these functions are readily executable and we can check their behaviour by writing tests. First, we print the identity function defined in Figure 28 in an empty context and verify that we do obtain the string "λa. a". Next, we print an open term in a context of size two and can immediately observe that names are generated for the free variables first, and then the expression itself is printed.

    _ : print printUTLC id^U ≡ "λa. a"
    _ = refl

    _ : let tm : Tm UTLC _ _ (_ :: _ :: [])
            tm = ‘app (‘var z) (‘lam (‘var (s (s z))))
        in print printUTLC tm ≡ "b (λc. a)"
    _ = refl
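To make the name-supply discipline concrete outside of the generic framework, the following Python sketch prints untyped λ-terms given in de Bruijn representation. It is only an illustration under simplifying assumptions: the tuple encoding of terms and the helpers names, render and print_open are hypothetical and not part of our Agda library, and the syntax is fixed rather than given by a description. As in print, names are drawn for the free variables before the term itself is traversed.

```python
from itertools import count
from string import ascii_lowercase


def names():
    """Infinite supply of distinct names: a, b, ..., z, a1, b1, ..."""
    for n in count():
        for c in ascii_lowercase:
            yield c if n == 0 else f"{c}{n}"


# Hypothetical term encoding (de Bruijn indices):
#   ("var", k) | ("app", f, t) | ("lam", body)
def render(term, env, supply):
    """env[k] is the name given to the variable with index k."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "app":
        fun = render(term[1], env, supply)
        arg = render(term[2], env, supply)
        return f"{fun} ({arg})"
    if tag == "lam":
        x = next(supply)  # fresh name for the newly-bound variable
        return f"λ{x}. {render(term[1], [x] + env, supply)}"
    raise ValueError(f"unknown node: {tag!r}")


def print_open(term, free_vars):
    """Name the free variables first (outermost first), then print."""
    supply = names()
    outermost_first = [next(supply) for _ in range(free_vars)]
    env = list(reversed(outermost_first))  # index 0 = innermost free variable
    return render(term, env, supply)


identity = ("lam", ("var", 0))
open_term = ("app", ("var", 0), ("lam", ("var", 2)))
print(print_open(identity, 0))   # λa. a
print(print_open(open_term, 2))  # b (λc. a)
```

The two calls mirror the two Agda tests above. Threading a single generator through the traversal plays the role of the Fresh monad in this sketch.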

Converting terms in the internal syntax to strings which can in turn be displayed in a terminal or an editor window is only part of a compiler’s interaction loop. The other direction takes strings as inputs and attempts to produce terms in the internal syntax. The first step is to parse the input strings into structured data, the second is to perform scope checking, and the third step consists of type checking.

Parsing is currently out of scope for our library; users can write safe ad-hoc parsers for their object language by either using a library of total parser combinators (Danielsson (2010); Allais (2018)) or invoking a parser generator oracle whose target is a total language (Stump (2016)). As we will see shortly, we can write a generic scope checker transforming terms in a raw syntax where variables are represented as strings into a well scoped syntax. We will come back to typechecking with a concrete example in Section 7.3 and then discuss related future work in the conclusion.

Our scope checker will be a function taking two explicit arguments: a name for each variable in scope Γ and a raw term for a syntax description d. It will either fail (the Monad Fail granting us the ability to fail is made explicit in Figure 43) or return a well scoped and sorted term for that description.

    toTm : Names Γ → Raw d i σ → Fail (Tm d i σ Γ)

Scope

We can obtain Names, the datastructure associating to each variable in scope its raw name as a string, by reusing the standard library’s All. The inductive family All is a predicate transformer making sure a predicate holds of all the elements of a list. It is defined in a style common in Agda: because All’s constructors are in one-to-one correspondence with those of its index type (List A), the same names are reused: [] is the name of the proof that P trivially holds of all the elements in the empty list []; similarly _::_ is the proof that, provided that P holds of the element a on the one hand and of the elements of the list as on the other, it holds of all the elements of the list (a :: as).

    data All (P : A → Set) : List A → Set where
      []   : All P []
      _::_ : ∀ {a as} → P a → All P as → All P (a :: as)

    Names : List I → Set
    Names = All (const String)

Fig. 41. Associating a raw string to each variable in scope

Raw terms

The definition of WithNames is analogous to Pieces in the previous section: we expect Names for the newly bound variables. Terms in the raw syntax then leverage these definitions. They are either a variable or another “layer” of raw terms. Variables ‘var carry a String and potentially some extra information E (typically a position in a file). The other constructor ‘con carries a layer of raw terms where subterms are raw terms equipped with names for any newly-bound variables.

    WithNames : (I → Set) → List I → I −Scoped
    WithNames T [] j Γ = T j
    WithNames T Δ  j Γ = Names Δ × T j

    data Raw (d : Desc I) : Size → I → Set where
      ‘var : E → String → Raw d (↑ i) σ
      ‘con : ⟦ d ⟧ (WithNames (Raw d i)) σ [] → Raw d (↑ i) σ

Fig. 42. Names and Raw Terms

Error Handling

Various things can go wrong during scope checking: evidently a name can be out of scope but it is also possible that it may be associated to a variable of the wrong sort. We define an enumerated type covering these two cases. The scope checker will return a computation in the Monad Fail thus allowing us to fail and return an error, the string that caused the failure and the extra data of type E that accompanied it.

    data Error : Set where
      OutOfScope : Error
      WrongSort  : (σ τ : I) → σ ≢ τ → Error

    Fail : Set → Set
    Fail A = (Error × E × String) ⊎ A

    fail : Error → E → String → Fail A
    fail err e str = inj₁ (err , e , str)

Fig. 43. Error Type and Scope Checking Monad

Equipped with these notions, we can write down the type of toVar which tackles the core of the problem: variable resolution. The function takes a string and a sort as well as the names and sorts of the variables in the ambient scope. Provided that we have a function _≟I_ to decide equality on sorts, we can check whether the string corresponds to an existing variable and whether that binding is of the right sort. Thus we either fail or return a well scoped and well sorted Var.

If the ambient scope is empty then we can only fail with an OutOfScope error. Alternatively, if the variable’s name corresponds to that of the first one in scope we check that the sorts match up and either return z or fail with a WrongSort error. Otherwise we look for the variable further down the scope and use s to lift the result to the full scope.

    toVar : E → String → ∀ σ Γ → Names Γ → Fail (Var σ Γ)
    toVar e x σ [] [] = fail OutOfScope e x
    toVar e x σ (τ :: Γ) (y :: scp) with x ≟ y | σ ≟I τ
    ... | yes _ | yes refl = pure z
    ... | yes _ | no ¬eq   = fail (WrongSort σ τ ¬eq) e x
    ... | no ¬p | _        = s <$> toVar e x σ Γ scp

Fig. 44. Variable Resolution

Scope checking an entire term then amounts to lifting this action on variables to an action on terms. The error Monad Fail is by definition an Applicative and by design our terms are Traversable (Bird & Paterson (1999); Gibbons & d. S. Oliveira (2009)). The action on terms is defined mutually with the action on scopes. As we can see in the second equation for toScope, thanks to the definition of WithNames, concrete names arrive just in time to check the subterm with newly-bound variables.

    toTm    : Names Γ → Raw d i σ → Fail (Tm d i σ Γ)
    toScope : Names Γ → ∀ Δ σ → WithNames (Raw d i) Δ σ [] → Fail (Scope (Tm d i) Δ σ Γ)

    toTm scp (‘var e v) = ‘var <$> toVar e v _ _ scp
    toTm scp (‘con b)   = ‘con <$> mapA d (toScope scp) b

    toScope scp []         σ b         = toTm scp b
    toScope scp Δ@(_ :: _) σ (bnd , b) = toTm (bnd ++ scp) b

Fig. 45. Generic Scope Checking for Terms and Scopes
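The shape of this scope checker can be sketched in Python as follows. This is a loose, untyped approximation rather than a transcription of our Agda code: the dictionary DESC standing in for a description, the tuple encoding of raw terms and the exception ScopeError are all assumptions made for the example, and failure is signalled by raising an exception instead of returning a value in Fail. Unlike the Agda version, nothing here checks that a constructor is used at its own sort.

```python
class ScopeError(Exception):
    """Carries the error kind, the extra data (e.g. a position) and the name."""

    def __init__(self, kind, position, name):
        super().__init__(f"{kind}: {name!r} at {position!r}")
        self.kind, self.position, self.name = kind, position, name


def to_var(position, name, sort, scope):
    """Resolve a name to a de Bruijn index.

    scope is a list of (name, sort) pairs, most recently bound first.
    The first binding with a matching name wins, as in toVar.
    """
    for index, (bound, bound_sort) in enumerate(scope):
        if bound == name:
            if bound_sort != sort:
                raise ScopeError("WrongSort", position, name)
            return index
    raise ScopeError("OutOfScope", position, name)


# Hypothetical description of a bisorted syntax:
# constructor -> list of (sorts of newly-bound variables, sort of subterm).
DESC = {
    "app": [([], "Infer"), ([], "Check")],
    "lam": [(["Infer"], "Check")],
    "emb": [([], "Infer")],
}


def to_tm(scope, sort, raw):
    """Raw terms are ("var", position, name) or ("con", tag, args),
    where each argument is a pair (names of new binders, raw subterm)."""
    if raw[0] == "var":
        _, position, name = raw
        return ("var", to_var(position, name, sort, scope))
    _, tag, args = raw
    checked = []
    for (binder_sorts, sub_sort), (binder_names, sub) in zip(DESC[tag], args):
        new = list(zip(binder_names, binder_sorts))
        # Names for the new binders arrive together with the subterm.
        checked.append(to_tm(new + scope, sub_sort, sub))
    return ("con", tag, checked)


# λx. x, with the bound variable embedded from Infer into Check.
raw_id = ("con", "lam", [(["x"], ("con", "emb", [([], ("var", 1, "x"))]))])
print(to_tm([], "Check", raw_id))
```

Prepending the new names to the ambient scope corresponds to the (bnd ++ scp) of the second toScope equation.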

Following Atkey (2015), we can consider type checking and type inference as a possible semantics for a bidirectional (Pierce & Turner (2000)) language. We reuse the syntax introduced in Section 5 and the types introduced in Figure 3; it gives us a simply typed bidirectional calculus as a bisorted language using a notion of Mode to distinguish between terms for which we will be able to Infer the type and the ones for which we will have to Check a type candidate.

The values stored in the environment of the typechecking function attach Type information to bound variables whose Mode is Infer, guaranteeing no variable ever uses the Check mode. In contrast, the generated computations will, depending on the mode, either take a type candidate and Check it is valid or Infer a type for their argument. These computations are always potentially failing so we use the Maybe monad. In an actual compiler pipeline we would naturally use a different error monad and generate helpful error messages pointing out where the type error occurred. The interested reader can see a fine-grained analysis of type errors in the extended example of a typechecker in McBride & McKinna (2004).

    data Var- : Mode → Set where
      ‘var : Type → Var- Infer

    Type- : Mode → Set
    Type- Check = Type → Maybe ⊤
    Type- Infer = Maybe Type

Fig. 46. Var- and Type- Relations indexed by Mode

A change of direction from inferring to checking will require being able to check that two types agree so we introduce the function _=?_. Similarly we will sometimes expect a function type but may be handed anything so we will have to check with isArrow that our candidate’s head constructor is indeed an arrow, and collect the domain and codomain.

We can now define typechecking as a Semantics. We describe the algorithm constructor by constructor; in the Semantics definition (omitted here) the algebra will simply perform a dispatch and pick the relevant auxiliary lemma. Note that in the following code, _<$_ is, following classic Haskell notations, the function which takes an A and a Maybe B and returns a Maybe A which has the same structure as its second argument.

    _=?_ : (σ τ : Type) → Maybe ⊤
    α        =? α        = just tt
    (σ ‘→ τ) =? (φ ‘→ ψ) = (σ =? φ) >> (τ =? ψ)
    _        =? _        = nothing

    isArrow : Type → Maybe (Type × Type)
    isArrow (σ ‘→ τ) = just (σ , τ)
    isArrow _        = nothing

Fig. 47. Tests for Type values

Application

When facing an application: infer the type of the function, make sure it is an arrow type, check the argument at the domain’s type and return the codomain.

    app : Type- Infer → Type- Check → Type- Infer
    app f t = do
      arr     ← f
      (σ , τ) ← isArrow arr
      τ <$ t σ

λ-abstraction

For a λ-abstraction: check that the input type arr is an arrow type and check the body b at the codomain type in the extended environment (using bind) where the newly-bound variable is of mode Infer and has the domain’s type.

    lam : Kripke (const ◦ Var-) (const ◦ Type-) (Infer :: []) Check Γ → Type- Check
    lam b arr = do
      (σ , τ) ← isArrow arr
      b (bind Infer) (ε • ‘var σ) τ

Embedding of Infer into Check

The change of direction from Inferrable to Checkable is successful when the inferred type is equal to the expected one.

    emb : Type- Infer → Type- Check
    emb t σ = do
      τ ← t
      σ =? τ

Cut: A Check in an Infer position

So far, our bidirectional syntax only permits the construction of STLC terms in canonical form (Pfenning (2004); Dunfield & Pfenning (2004)). In order to construct non-normal (redex) terms, whose semantics is given logically by the ‘cut’ rule, we need to reverse direction. Our final semantic operation, cut, always comes with a type candidate against which to check the term and to be returned in case of success.

    cut : Type → Type- Check → Type- Infer
    cut σ t = σ <$ t σ

We have defined a bidirectional typechecker for this simple language by leveraging the Semantics framework. We can readily run it on closed terms using the closed corollary defined in Figure 35 and (defining β to be (α ‘→ α)) infer the type of the expression (λx. x : β → β) (λx. x).

    type- : ∀ p → TM Bidi p → Type- p
    type- p = Semantics.closed Typecheck

    _ : type- Infer (‘app (‘cut (β ‘→ β) id^B) id^B) ≡ just β
    _ = refl

Fig. 48. Type- Inference / Checking as a Semantics

The output of this function is not very informative. As we will see shortly, there is nothing stopping us from moving away from a simple computation returning a (Maybe Type) to an evidence-producing function elaborating a term in Bidi to a well scoped and typed term in STLC.
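For readers who prefer to see the four cases side by side, here is a small Python sketch of the same bidirectional algorithm written as two mutually recursive functions rather than as a Semantics. The encoding is an assumption made for this example: types are either a base-type string or a tuple ("->", σ, τ), terms are tuples using de Bruijn indices, and None plays the role of nothing.

```python
def arrow(sigma, tau):
    """Build the function type sigma -> tau (hypothetical encoding)."""
    return ("->", sigma, tau)


def is_arrow(ty):
    """Return (domain, codomain) if ty is an arrow type, None otherwise."""
    if isinstance(ty, tuple) and ty[0] == "->":
        return ty[1], ty[2]
    return None


def infer(term, env):
    """Infer-mode terms: ("var", k), ("app", f, t), ("cut", type, t).
    env[k] is the type of the variable with index k."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "app":
        dom_cod = is_arrow(infer(term[1], env))  # also None if inference failed
        if dom_cod is None:
            return None
        sigma, tau = dom_cod
        return tau if check(term[2], sigma, env) else None
    if tag == "cut":
        _, sigma, t = term
        return sigma if check(t, sigma, env) else None
    return None  # a Check-mode term in an Infer position


def check(term, ty, env):
    """Check-mode terms: ("lam", body), ("emb", t)."""
    tag = term[0]
    if tag == "lam":
        dom_cod = is_arrow(ty)
        if dom_cod is None:
            return False
        sigma, tau = dom_cod
        return check(term[1], tau, [sigma] + env)
    if tag == "emb":
        inferred = infer(term[1], env)
        return inferred is not None and inferred == ty
    return False  # an Infer-mode term in a Check position


beta = arrow("α", "α")
id_b = ("lam", ("emb", ("var", 0)))
example = ("app", ("cut", arrow(beta, beta), id_b), id_b)
print(infer(example, []) == beta)  # True
```

The final line replays the test of Figure 48: the annotated identity applied to the identity is assigned the type β.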

Instead of generating a type or checking that a candidate will do, we can use our language of Descriptions to define not only an untyped source language but also an intrinsically typed internal language. During typechecking we simultaneously generate an expression’s type and a well scoped and well typed term of that type. We use STLC (defined in Section 5) as our internal language.

Before we can jump right in, we need to set the stage: a Semantics for a Bidi term will involve (Mode −Scoped) notions of values and computations but an STLC term is (Type −Scoped). We first introduce a Typing associating types to each of the modes in scope, together with an erasure function ⌞_⌟ extracting the context of types implicitly defined by such a Typing. We will systematically distinguish contexts of modes (typically named ms) and their associated typings (typically named Γ).

    Typing : List Mode → Set
    Typing = All (const Type)

    ⌞_⌟ : Typing ms → List Type
    ⌞ [] ⌟     = []
    ⌞ σ :: Γ ⌟ = σ :: ⌞ Γ ⌟

Fig. 49. Typing: From Contexts of Modes to Contexts of Types

We can then explain what it means for an elaboration process of type σ in a context of modes ms to produce a term of the (Type −Scoped) family T: for any typing Γ of this context of modes, we should get a value of type (T σ ⌞ Γ ⌟).

    Elab : Type −Scoped → Type → (ms : List Mode) → Typing ms → Set
    Elab T σ _ Γ = T σ ⌞ Γ ⌟

Fig. 50. Elaboration of a Scoped Family

Our first example of an elaboration process is our notion of environment values. To each variable in scope of mode Infer we associate an elaboration function targeting Var. In other words: our values are all in scope i.e. provided any typing of the scope of modes, we can assuredly return a type together with a variable of that type.

    data Var- : Mode −Scoped where
      ‘var : (infer : ∀ Γ → Σ[ σ ∈ Type ] Elab Var σ ms Γ) → Var- Infer ms

Fig. 51. Values as Elaboration Functions for Variables

We can for instance prove that we have such an inference function for a newly-bound variable of mode Infer: given that the context has been extended with a variable of mode Infer, the Typing must also have been extended with a type σ. We can return that type paired with the variable z.

    var : Var- Infer (Infer :: ms)
    var = ‘var λ where (σ :: _) → (σ , z)

Fig. 52. Inference Function for the 0-th Variable

The computations are a bit more tricky. On the one hand, if we are in checking mode then we expect that for any typing of the scope of modes and any type candidate we can Maybe return a term at that type in the induced context. On the other hand, in the inference mode we expect that given any typing of the scope, we can Maybe return a type together with a term at that type in the induced context.

    Elab- : Mode −Scoped
    Elab- Check ms = ∀ Γ → (σ : Type) → Maybe (Elab (Tm STLC ∞) σ ms Γ)
    Elab- Infer ms = ∀ Γ → Maybe (Σ[ σ ∈ Type ] Elab (Tm STLC ∞) σ ms Γ)

Fig. 53. Computations as Mode-indexed Elaboration Functions

Because we are now writing a typechecker which returns evidence of its claims, we need more informative variants of the equality and isArrow checks. In the equality checking case we want to get a proof of propositional equality but we only care about the successful path and will happily return nothing when failing. Agda’s support for (dependent!) do-notation makes writing the check really easy. For the arrow type, we introduce a family Arrow constraining the shape of its index to be an arrow type and redefine isArrow as a view targeting this inductive family (Wadler (1987); McBride & McKinna (2004)). We deliberately overload the constructor of the Arrow family by calling it _‘→_. This means that the proof that a given type has the shape (σ ‘→ τ) is literally written (σ ‘→ τ). This allows us to specify in the type whether we want to work with the full set of values in Type or only the subset corresponding to function types and to then proceed to write the same programs a Haskell programmer would, with the added confidence that ours are guaranteed to be total.

    _=?_ : (σ τ : Type) → Maybe (σ ≡ τ)
    α        =? α        = just refl
    (σ ‘→ τ) =? (φ ‘→ ψ) = do
      refl ← σ =? φ
      refl ← τ =? ψ
      return refl
    _        =? _        = nothing

    data Arrow : Type → Set where
      _‘→_ : ∀ σ τ → Arrow (σ ‘→ τ)

    isArrow : ∀ σ → Maybe (Arrow σ)
    isArrow (σ ‘→ τ) = just (σ ‘→ τ)
    isArrow _        = nothing

Fig. 54. Informative Equality Check and Arrow View

We now have all the basic pieces and can start writing elaboration code. We will use lowercase letters for terms in Bidi and uppercase ones for their elaborated counterparts in STLC. We once more start by dealing with each constructor in isolation before putting everything together to get a Semantics. These steps are very similar to the ones in the previous section.

Application

In the application case, we start by elaborating the function and we get its type together with its internal representation. We then check that the inferred type is indeed an Arrow and elaborate the argument using the corresponding domain. We conclude by returning the codomain together with the internal function applied to the internal argument.

    app : ∀[ Elab- Infer ⇒ Elab- Check ⇒ Elab- Infer ]
    app f t Γ = do
      (arr , F) ← f Γ
      (σ ‘→ τ)  ← isArrow arr
      T         ← t Γ σ
      return (τ , ‘app F T)

λ-abstraction

For the λ-abstraction case, we start by checking that the type candidate arr is an Arrow. We can then elaborate the body b of the lambda in a context of modes extended with one Infer variable, and the corresponding Typing extended with the function’s domain. From this we get an internal term B corresponding to the body of the λ-abstraction and conclude by returning it wrapped in a ‘lam constructor.

    lam : ∀[ Kripke Var- Elab- (Infer :: []) Check ⇒ Elab- Check ]
    lam b Γ arr = do
      (σ ‘→ τ) ← isArrow arr
      B        ← b (bind Infer) (ε • var) (σ :: Γ) τ
      return (‘lam B)

Cut: A Check in an Infer position

For cut, we start by elaborating the term with the type annotation provided and return them paired together.

    cut : Type → ∀[ Elab- Check ⇒ Elab- Infer ]
    cut σ t Γ = (σ ,_) <$> t Γ σ

Embedding of Infer into Check

For the change of direction Emb we not only want to check that the inferred type and the type candidate are equal: we need to cast the internal term labelled with the inferred type to match the type candidate. Luckily, Agda’s dependent do-notation makes our job easy once again: when we make the pattern refl explicit, the equality holds in the rest of the block.

    emb : ∀[ Elab- Infer ⇒ Elab- Check ]
    emb t Γ σ = do
      (τ , T) ← t Γ
      refl    ← σ =? τ
      return T

We have almost everything we need to define elaboration as a semantics. Discharging the th^𝓥 constraint is a bit laborious and the proof doesn’t yield any additional insight so we leave it out here. The semantical counterpart of variables (var) is fairly straightforward: provided a Typing, we run the inference and touch it up to return a term rather than a mere variable. Finally we define the algebra (alg) by pattern-matching on the constructor and using our previous combinators.

    Elaborate : Semantics Bidi Var- Elab-
    Elaborate .th^𝓥 = th^Var-
    Elaborate .var   = λ where (‘var infer) Γ → just (map₂ ‘var (infer Γ))
    Elaborate .alg   = λ where
      (‘app’ f t) → app f t
      (‘lam’ b)   → lam b
      (‘emb’ t)   → emb t
      (‘cut’ σ t) → cut σ t

Fig. 55. Elaborate, the elaboration semantics

We can once more define a specialised version of the traversal induced by this Semantics for closed terms: not only can we give a (trivial) initial environment (using the closed corollary defined in Figure 35) but we can also give a (trivial) initial Typing. This leads to the definitions in Figure 56.
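The difference between checking and elaborating can be illustrated with a Python sketch in which every successful step also returns an internal term. The encoding is an assumption made for the example (types as strings or ("->", σ, τ) tuples, terms as tuples with de Bruijn indices, None for failure), and, since Python has no indexed families, the output is merely paired with its type instead of being intrinsically typed as in our Agda development.

```python
def is_arrow(ty):
    """Return (domain, codomain) if ty is an arrow type, None otherwise."""
    if isinstance(ty, tuple) and ty[0] == "->":
        return ty[1], ty[2]
    return None


def elab_infer(term, env):
    """Return (type, internal term) or None.
    Source terms in Infer mode: ("var", k), ("app", f, t), ("cut", type, t)."""
    tag = term[0]
    if tag == "var":
        return env[term[1]], ("var", term[1])
    if tag == "app":
        fun = elab_infer(term[1], env)
        if fun is None:
            return None
        arr, F = fun
        dom_cod = is_arrow(arr)
        if dom_cod is None:
            return None
        sigma, tau = dom_cod
        T = elab_check(term[2], sigma, env)
        return None if T is None else (tau, ("app", F, T))
    if tag == "cut":
        _, sigma, t = term
        T = elab_check(t, sigma, env)
        # The annotation is returned as the type but erased from the term.
        return None if T is None else (sigma, T)
    return None


def elab_check(term, ty, env):
    """Return an internal term of type ty or None.
    Source terms in Check mode: ("lam", body), ("emb", t)."""
    tag = term[0]
    if tag == "lam":
        dom_cod = is_arrow(ty)
        if dom_cod is None:
            return None
        sigma, tau = dom_cod
        B = elab_check(term[1], tau, [sigma] + env)
        return None if B is None else ("lam", B)
    if tag == "emb":
        inferred = elab_infer(term[1], env)
        if inferred is None:
            return None
        tau, T = inferred
        return T if tau == ty else None
    return None


beta = ("->", "α", "α")
id_b = ("lam", ("emb", ("var", 0)))  # identity in the source syntax
id_s = ("lam", ("var", 0))           # identity in the internal syntax
result = elab_infer(("app", ("cut", ("->", beta, beta), id_b), id_b), [])
print(result == (beta, ("app", id_s, id_s)))  # True
```

As in the Agda version, the cut and emb nodes leave no trace in the elaborated term.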

    Type- : Mode → Set
    Type- Check = ∀ σ → Maybe (TM STLC σ)
    Type- Infer = Maybe (∃ λ σ → TM STLC σ)

    type- : ∀ p → TM Bidi p → Type- p
    type- Check t = closed Elaborate t []
    type- Infer t = closed Elaborate t []

Fig. 56. Evidence-producing Type (Checking / Inference) Function

Revisiting the example introduced in Section 7.3, we can check that elaborating the expression (λx. x : β → β) (λx. x) yields the type β together with the term (λx. x) (λx. x) in internal syntax. Type annotations have disappeared in the internal syntax as all the type invariants are enforced intrinsically.

    _ : type- Infer (B.‘app (B.‘cut (β ‘→ β) id^B) id^B) ≡ just (β , S.‘app id^S id^S)
    _ = refl

One of the advantages of having a universe of programming language descriptions is the ability to concisely define an extension of an existing language by using Description transformers grafting extra constructors à la Swierstra (2008). This is made extremely simple by the disjoint sum combinator _‘+_ which we defined in Figure 29. An example of such an extension is the addition of let-bindings to an existing language.

Let bindings allow the user to avoid repeating themselves by naming sub-expressions and then using these names to refer to the associated terms. Preprocessors adding these types of mechanisms to existing languages (from C to CSS) are rather popular. In Figure 57, we introduce a description Let which can be used to extend any language description d to a language with let-bindings (d ‘+ Let).

    Let : Desc I
    Let = ‘σ (I × I) $ uncurry $ λ σ τ →
          ‘X [] σ (‘X (σ :: []) τ (‘∎ τ))

    pattern ‘let’_‘in’_ e t = (_ , e , t , refl)
    pattern ‘let_‘in_   e t = ‘con (‘let’ e ‘in’ t)

Fig. 57. Description of a single let binding, associated pattern synonyms

This description states that a let-binding node stores a pair of types σ and τ and two subterms. First comes the let-bound expression of type σ and second comes the body of the let which has type τ in a context extended with a fresh variable of type σ. This defines a term of type τ.

In a dependently typed language, a type may depend on a value which in the presence of let bindings may be a variable standing for an expression. The user naturally does not want it to make any difference whether they used a variable referring to a let-bound expression or the expression itself. Various typechecking strategies can accommodate this expectation: in Coq (The Coq Development Team (2017)) let bindings are primitive constructs of the language and have their own typing and reduction rules whereas in Agda they are elaborated away to the core language by inlining.

This latter approach to extending a language d with let bindings by inlining them before typechecking can be implemented generically as a semantics over (d ‘+ Let). For this semantics values in the environment and computations are both let-free terms. The algebra of the semantics can be defined by parts thanks to case, the eliminator for _‘+_ defined in Figure 29: the old constructors are kept the same by interpreting them using the generic substitution algebra (Sub); whilst the let-binder precisely provides the extra value to be added to the environment.

    UnLet : Semantics (d ‘+ Let) (Tm d ∞) (Tm d ∞)
    Semantics.th^𝓥 UnLet = th^Tm
    Semantics.var   UnLet = id
    Semantics.alg   UnLet = case (Semantics.alg Sub) $ λ where
      (‘let’ e ‘in’ t) → extract t (ε • e)

Fig. 58. Desugaring as a Semantics
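The idea behind UnLet, an environment of let-free terms that is extended at each let node, can be sketched in Python for one fixed syntax. The sketch is not generic: it assumes an untyped λ-calculus with a ("let", e, body) node, terms encoded as tuples with de Bruijn indices, and a hypothetical helper shift that plays the role of the thinning th^Tm when the traversal goes under a binder.

```python
def shift(term, by, cutoff=0):
    """Weaken a let-free term: add `by` to every variable >= cutoff."""
    tag = term[0]
    if tag == "var":
        return ("var", term[1] + by) if term[1] >= cutoff else term
    if tag == "lam":
        return ("lam", shift(term[1], by, cutoff + 1))
    if tag == "app":
        return ("app", shift(term[1], by, cutoff), shift(term[2], by, cutoff))
    raise ValueError(f"unexpected node in a let-free term: {tag!r}")


def unlet(term, env):
    """Inline every let. env[k] is the let-free term standing for variable k."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "app":
        return ("app", unlet(term[1], env), unlet(term[2], env))
    if tag == "lam":
        # The bound variable maps to itself; older values are weakened.
        weakened = [shift(t, 1) for t in env]
        return ("lam", unlet(term[1], [("var", 0)] + weakened))
    if tag == "let":
        _, e, body = term
        # The let-bound expression is the extra value added to the environment.
        return unlet(body, [unlet(e, env)] + env)
    raise ValueError(f"unknown node: {tag!r}")


def identity_env(n):
    """Placeholder environment mapping each of n free variables to itself."""
    return [("var", k) for k in range(n)]


identity = ("lam", ("var", 0))
# let x = λy. y in x x
term = ("let", identity, ("app", ("var", 0), ("var", 0)))
print(unlet(term, identity_env(0)) == ("app", identity, identity))  # True
```

The old constructors (var, app, lam) are handled exactly as a substitution would handle them, while the let case only extends the environment, which is the division of labour expressed by case in Figure 58.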

The process of removing let binders is then kickstarted with the placeholder environment id^Tm = pack ‘var of type (Γ −Env) (Tm d ∞) Γ.

    unlet : ∀[ Tm (d ‘+ Let) ∞ σ ⇒ Tm d ∞ σ ]
    unlet = Semantics.semantics UnLet id^Tm

Fig. 59. Specialising semantics with an environment of placeholder values

In fewer than 10 lines of code we have defined a generic extension of syntaxes with binding together with a semantics which corresponds to an elaborator translating away this new construct. In previous work (Allais et al. (2017)), we focused on STLC only and showed that it is similarly possible to implement a Continuation Passing Style transformation as the composition of two semantics à la Hatcliff and Danvy (1994). The first semantics embeds STLC into Moggi’s Meta-Language (1991) and thus fixes an evaluation order. The second one translates Moggi’s ML back into STLC in terms of explicit continuations with a fixed return type.

We have demonstrated how easily one can define extensions and combine them on top of a base language without having to reimplement common traversals for each one of the intermediate representations. Moreover, it is possible to define generic transformations elaborating these added features in terms of lower-level ones. This suggests that this setup could be a good candidate to implement generic compilation passes and could deal with a framework using a wealth of slightly different intermediate languages à la Nanopass (Keep & Dybvig (2013)).

Although useful in its own right, desugaring all let bindings can lead to an exponential blow-up in code size. Compiler passes typically try to maintain sharing by only inlining let-bound expressions which appear at most one time.
Unused expressions are eliminated as dead code whilst expressions used exactly one time can be inlined: this transformation is size preserving and opens up opportunities for additional optimisations.

As we will see shortly, we can implement reference counting and size respecting let-inlining as a generic transformation over all syntaxes with binding equipped with let binders. This two-pass simple transformation takes linear time which may seem surprising given the results due to Appel and Jim (1997). Our optimisation only inlines let-bound variables whereas theirs also encompasses the reduction of static β-redexes of (potentially) recursive functions. While we can easily count how often a variable is used in the body of a let binder, the interaction between inlining and β-reduction in theirs creates cascading simplification opportunities thus making the problem much harder.

But first, we need to look at an example demonstrating that this is a slightly subtle matter. Assuming that expensive takes a long time to evaluate, inlining all of the lets in the first expression is a really good idea whilst we only want to inline the one binding y in the second one to avoid duplicating work. That is to say that the contribution of the expression bound to y in the overall count depends directly on whether y itself appears free in the body of the let which binds it.

    _ = let x = expensive in
        let y = (x , x) in
        x

    _ = let x = expensive in
        let y = (x , x) in
        y

Our transformation will consist of two passes: the first one will annotate the tree with accurate count information precisely recording whether let-bound variables are used zero, one, or many times. The second one will inline precisely the let-binders whose variable is used at most once.

During the counting phase we need to be particularly careful not to overestimate the contribution of a let-bound expression. If the let-bound variable is not used then we can naturally safely ignore the associated count.
But if it is used many times then we know we will not inline this let-binding and the count should therefore only contribute once to the running total. We define the control combinator in Figure 64 precisely to explicitly handle this subtle case.

The first step is to introduce the Counter additive monoid (cf. Figure 60). Addition will allow us to combine counts coming from different subterms: if either of the two counters is zero then we return the other, otherwise we know we have many occurrences.

    data Counter : Set where
      zero : Counter
      one  : Counter
      many : Counter

    _+_ : Counter → Counter → Counter
    zero + n    = n
    m    + zero = m
    _    + _    = many

Fig. 60. The (Counter, zero, _+_) additive monoid

The syntax extension CLet defined in Figure 61 is a variation on the Let syntax extension of Section 7.5, attaching a Counter to each Let node. The annotation process can then be described as a function computing a (d ‘+ CLet) term from a (d ‘+ Let) one.

    CLet : Desc I
    CLet = ‘σ Counter $ λ _ → Let

Fig. 61. Counted Lets

We keep a tally of the usage information for the variables in scope. This allows us to know which Counter to attach to each Let node. Following the same strategy as in Section 7.2, we use the standard library’s All to represent this mapping. We say that a scoped value has been Counted if it is paired with a Count.

    Count : List I → Set
    Count = All (const Counter)

    Counted : I −Scoped → I −Scoped
    Counted T i Γ = T i Γ × Count Γ

Fig. 62. Counting i.e. Associating a Counter to each Var in scope.

The two most basic counts are described in Figure 63: the empty one, which is zero everywhere, and the one corresponding to a single use of a single variable v, which is zero everywhere except for v where it is one.

    zeros : ∀[ Count ]
    zeros {[]}     = []
    zeros {σ :: Γ} = zero :: zeros

    fromVar : ∀[ Var σ ⇒ Count ]
    fromVar z     = one :: zeros
    fromVar (s v) = zero :: fromVar v

Fig. 63. Zero Count and Count of One for a Specific Variable

When we collect usage information from different subterms, we need to put the various counts together. The combinators in Figure 64 allow us to easily do so: merge adds up two counts in a pointwise manner while control uses one Counter to decide whether to erase an existing Count. This is particularly convenient when computing the contribution of a let-bound expression to the total tally: the contribution of the let-bound expression will only matter if the corresponding variable is actually used.

    merge : ∀[ Count ⇒ Count ⇒ Count ]
    merge []        []        = []
    merge (m :: cs) (n :: ds) = (m + n) :: merge cs ds

    control : Counter → ∀[ Count ⇒ Count ]
    control zero cs = zeros
    control one  cs = cs -- inlined
    control many cs = cs -- not inlined

Fig. 64. Combinators to Compute Counts

We can now focus on the core of the annotation phase. We define a Semantics whose values are variables themselves and whose computations are the pairing of a term in (d ‘+ CLet) together with a Count. The variable case is trivial: provided a variable v, we return (‘var v) together with the count (fromVar v).

The non-let case is purely structural: we reify the Kripke function space and obtain a scope together with the corresponding Count. We unceremoniously drop the Counters associated to the variables bound in this subterm and return the scope together with the tally for the ambient context.

    reify^Count : ∀ Δ σ → Kripke Var (Counted (Tm (d ‘+ CLet) ∞)) Δ σ Γ →
                  Counted (Scope (Tm (d ‘+ CLet) ∞) Δ) σ Γ
    reify^Count Δ σ kr = let (scp , c) = reify vl^Var Δ σ kr in scp , drop Δ c

Fig. 65. Purely Structural Case

The Let-to-CLet case in Figure 66 is the most interesting one. We start by reifying the body of the let binder which gives us a tally cx for the bound variable and ct for the body’s contribution to the ambient environment’s Count. We annotate the node with cx and use it as a control to decide whether we are going to merge any of the let-bound expression’s contribution ce to form the overall tally.

    clet : ⟦ Let ⟧ (Kripke Var (Counted (Tm (d ‘+ CLet) ∞))) σ Γ →
           Counted (⟦ CLet ⟧ (Scope (Tm (d ‘+ CLet) ∞))) σ Γ
    clet (στ , (e , ce) , body , eq) = case body extend (ε • z) of λ where
      (t , cx :: ct) → (cx , στ , e , t , eq) , merge (control cx ce) ct

Fig. 66. Annotating Let Binders

Putting all of these things together we obtain the Semantics Annotate. We promptly specialise it using an environment of placeholder values to obtain the traversal annotate elaborating raw let-binders into counted ones.

Using techniques similar to the ones described in Section 7.5, we can write an Inline semantics working on (d ‘+ CLet) terms and producing (d ‘+ Let) ones. We make sure to preserve all the let-binders annotated with many and to inline all the other ones. By composing Annotate with Inline we obtain a size-preserving generic optimisation pass.
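The counting pass can be replayed on the two motivating examples with the following Python sketch. It covers only the annotation phase, for one fixed syntax, and rests on assumptions made for the illustration: terms are tuples with de Bruijn indices, an application node stands in for the pair (x , x), a Count is a plain list with one entry per variable in scope, and the free variable with index 0 plays the role of expensive.

```python
ZERO, ONE, MANY = "zero", "one", "many"


def add(m, n):
    """Addition of the Counter monoid."""
    if m == ZERO:
        return n
    if n == ZERO:
        return m
    return MANY


def zeros(n):
    return [ZERO] * n


def from_var(k, n):
    """Count of a single use of variable k among n variables in scope."""
    return [ONE if i == k else ZERO for i in range(n)]


def merge(cs, ds):
    return [add(m, n) for m, n in zip(cs, ds)]


def control(counter, cs):
    """Erase the count cs if the controlling variable is never used."""
    return zeros(len(cs)) if counter == ZERO else cs


def annotate(term, n):
    """Return (annotated term, usage count for the n variables in scope)."""
    tag = term[0]
    if tag == "var":
        return term, from_var(term[1], n)
    if tag == "app":
        f, cf = annotate(term[1], n)
        t, ct = annotate(term[2], n)
        return ("app", f, t), merge(cf, ct)
    if tag == "lam":
        b, cb = annotate(term[1], n + 1)
        return ("lam", b), cb[1:]  # drop the counter of the bound variable
    if tag == "let":
        e, ce = annotate(term[1], n)
        t, (cx, *ct) = annotate(term[2], n + 1)
        return ("clet", cx, e, t), merge(control(cx, ce), ct)
    raise ValueError(f"unknown node: {tag!r}")


expensive = ("var", 0)
pair = ("app", ("var", 0), ("var", 0))           # stands in for (x , x)
first = ("let", expensive, ("let", pair, ("var", 1)))   # ... in x
second = ("let", expensive, ("let", pair, ("var", 0)))  # ... in y

print(annotate(first, 1)[0][1])   # one:  x can be inlined (y is dead code)
print(annotate(second, 1)[0][1])  # many: x must be kept
```

In the first term the unused y is annotated with zero, so control discards the two uses of x inside its definition and x ends up with the count one. In the second term y is used once, its definition is taken into account, and x is correctly reported as used many times.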

    annotate : ∀[ Tm (d ‘+ Let) ∞ σ ⇒ Tm (d ‘+ CLet) ∞ σ ]
    annotate t = let (t’ , _) = Semantics.semantics Annotate identity t in t’

Fig. 67. Specialising semantics to obtain an annotation function

A key type of traversal we have not studied yet is a language’s evaluator. Our universe of syntaxes with binding does not impose any typing discipline on the user-defined languages and as such cannot guarantee their totality. This is embodied by one of our running examples: the untyped λ-calculus. As a consequence there is no hope for a safe generic framework to define normalisation functions.

The clear connection between the Kripke functional space characteristic of our semantics and the one that shows up in normalisation by evaluation suggests we ought to manage to give an unsafe generic framework for normalisation by evaluation. By temporarily disabling Agda’s positivity checker, we can define a generic reflexive domain Dm (cf. Figure 68) in which to interpret our syntaxes. It has three constructors corresponding respectively to a free variable, a constructor’s counterpart where scopes have become Kripke functional spaces on Dm and an error token because the evaluation of untyped programs may go wrong.

    {-# NO_POSITIVITY_CHECK #-}
    data Dm (d : Desc I) : Size → I −Scoped where
      V : ∀[ Var σ ⇒ Dm d s σ ]
      C : ∀[ ⟦ d ⟧ (Kripke (Dm d s) (Dm d s)) σ ⇒ Dm d (↑ s) σ ]
      ⊥ : ∀[ Dm d (↑ s) σ ]

Fig. 68. Generic Reflexive Domain

This datatype definition is utterly unsafe. The more conservative user will happily restrict themselves to particular syntaxes where the typed setting allows for the domain to be defined as a logical predicate or opt instead for a step-indexed approach.

But this domain does make it possible to define a generic nbe semantics which, given a term, produces a value in the reflexive domain. Thanks to the fact we have picked a universe of finitary syntaxes, we can traverse (McBride & Paterson (2008); Gibbons & d. S. Oliveira (2009)) the functor to define a (potentially failing) reification function turning elements of the reflexive domain into terms.
By composing them, we obtain the normalisation function which gives its name to normalisation by evaluation.

The user still has to explicitly pass an interpretation of the various constructors because there is no way for us to know what the binders are supposed to represent: they may stand for λ-abstractions, Σ-types, fixpoints, or anything else.

    reify^Dm : ∀[ Dm d s σ ⇒ Maybe ∘ Tm d ∞ σ ]
    nbe      : Alg d (Dm d ∞) (Dm d ∞) → Semantics d (Dm d ∞) (Dm d ∞)
    norm     : Alg d (Dm d ∞) (Dm d ∞) → ∀[ Tm d ∞ σ ⇒ Maybe ∘ Tm d ∞ σ ]
    norm alg = reify^Dm ∘ Semantics.semantics (nbe alg) (base vl^Dm)

Fig. 69. Generic Normalisation by Evaluation Framework

Using this setup, we can write a normaliser for the untyped λ-calculus by providing an algebra. The key observation that allows us to implement this algebra is that we can turn a Kripke function, f, mapping values of type σ to computations of type τ into an Agda function from values of type σ to computations of type τ. This is witnessed by the application function (_$$_) defined in Figure 70: we first use extract (defined in Figure 9) to obtain a function taking environments of values to computations. We then use the combinators defined in Figure 6 to manufacture the singleton environment (ε • t) containing the value t of type σ.

    _$$_ : ∀[ Kripke V C (σ ∷ []) τ ⇒ (V σ ⇒ C τ) ]
    f $$ t = extract f (ε • t)

Fig. 70. Applying a Kripke Function to an argument

We now define two patterns for semantical values: one for application and the other for lambda abstraction. This should make the case of interest of our algebra (a function applied to an argument) fairly readable.

    pattern LAM  f   = C (false , f , refl)
    pattern APP' f t = (true , f , t , refl)

Fig. 71. Pattern synonyms for UTLC-specific Dm values

We finally define the algebra by case analysis: if the node at hand is an application and its first component evaluates to a lambda, we can apply the function to its argument using _$$_. Otherwise we have either a stuck application or a lambda, in other words we already have a value and can simply return it using C.

    norm^LC : ∀[ Tm UTLC ∞ tt ⇒ Maybe ∘ Tm UTLC ∞ tt ]
    norm^LC = norm $ λ where
      (APP' (LAM f) t) → f $$ t  -- redex
      t                → C t     -- value

Fig. 72. Normalisation by Evaluation for the Untyped λ-Calculus

We have not used the ⊥ constructor so if the evaluation terminates (by disabling totality checking we have lost all guarantees of the sort) we know we will get a term in normal form. See for instance in Figure 73 the evaluation of an untyped yet normalising term: (λx. x) ((λx. x) (λx. x)) normalises to (λx. x).

    _ : norm^LC (`app id^U (`app id^U id^U)) ≡ just id^U
    _ = refl

Fig. 73. Example of a normalising untyped term

Some generic programs of interest do not fit in the Semantics framework. They can still be implemented once and for all, and even benefit from the Semantics-based definitions.

We will first explore existing work on representing cyclic structures using a syntax with binding: a binder is a tree node declaring a pointer giving subtrees the ability to point back to it, thus forming a cycle. Substitution will naturally play a central role in giving these finite terms a semantics as their potentially infinite unfolding. We will then see that many of the standard traversals produced by the ‘deriving’ machinery familiar to Haskell programmers can be implemented on syntaxes too, sometimes with more informative types.
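The idea that a binder is a pointer and that substitution unfolds cycles can be previewed on a small standalone example. The following is only a sketch under stated assumptions: it uses a hand-written datatype CList rather than the generic universe, and the names CList, ext, ren, exts, sub, plug and take are hypothetical helpers introduced for illustration, not definitions taken from the accompanying library.

```agda
open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.List     using (List; []; _∷_)
open import Agda.Builtin.Equality using (_≡_; refl)

-- Well-scoped de Bruijn indices: Fin n has exactly n inhabitants.
data Fin : Nat → Set where
  z : ∀ {n} → Fin (suc n)
  s : ∀ {n} → Fin n → Fin (suc n)

-- Potentially cyclic lists of numbers with n back-pointers in scope.
-- Each cons cell binds one new pointer (to itself) in its tail.
data CList (n : Nat) : Set where
  nil  : CList n
  cons : Nat → CList (suc n) → CList n
  ↶_   : Fin n → CList n

-- Renaming; ext pushes a renaming under the binder of cons.
ext : ∀ {m n} → (Fin m → Fin n) → Fin (suc m) → Fin (suc n)
ext ρ z     = z
ext ρ (s k) = s (ρ k)

ren : ∀ {m n} → (Fin m → Fin n) → CList m → CList n
ren ρ nil         = nil
ren ρ (cons x xs) = cons x (ren (ext ρ) xs)
ren ρ (↶ k)       = ↶ (ρ k)

-- Substitution; exts pushes a substitution under the binder of cons.
exts : ∀ {m n} → (Fin m → CList n) → Fin (suc m) → CList (suc n)
exts σ z     = ↶ z
exts σ (s k) = ren s (σ k)

sub : ∀ {m n} → (Fin m → CList n) → CList m → CList n
sub σ nil         = nil
sub σ (cons x xs) = cons x (sub (exts σ) xs)
sub σ (↶ k)       = σ k

-- plug t closes a tail by substituting the closed list t for its one pointer.
plug : CList 0 → CList 1 → CList 0
plug t = sub (λ _ → t)

-- Read off the first k elements of the unfolding, unrolling cycles on demand.
take : Nat → CList 0 → List Nat
take zero    _           = []
take (suc k) nil         = []
take (suc k) (cons x xs) = x ∷ take k (plug (cons x xs) xs)
take (suc k) (↶ ())

-- The list 0, 1, and then back to the first cons cell.
01↺ : CList 0
01↺ = cons 0 (cons 1 (↶ s z))

_ : take 5 01↺ ≡ 0 ∷ 1 ∷ 0 ∷ 1 ∷ 0 ∷ []
_ = refl
```

The fuel argument of take stands in for the coinductive unfolding: a closed list has no dangling pointer, so the last clause of take is absurd, and every cycle is resolved by plugging the whole list back into its own tail.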

Ghani, Hamana, Uustalu and Vene (2006) have demonstrated how Altenkirch and Reus’ type-level de Bruijn indices (1999) can be used to represent potentially cyclic structures by a finite object. In their representation each bound variable is a pointer to the node that introduced it. Given that we are, at the top-level, only interested in structures with no “dangling pointers”, we introduce the notation TM d to mean closed terms (i.e. terms of type Tm d ∞ []).

A basic example of such a structure is a potentially cyclic list which offers a choice of two constructors: [] which ends the list and _∷_ which combines a head and a tail but also acts as a binder for a self-reference; these pointers can be used via the var constructor which we have renamed ↶ (pronounced “backpointer”) to match the domain-specific meaning. We can see this approach in action in the examples [0,1] and 01↺ (pronounced “0-1-cycle”) which describe respectively a finite list containing 0 followed by 1 and a cyclic list starting with 0, then 1, and then repeating the whole list again by referring to the first cons cell represented here by the de Bruijn variable 1 (i.e. s z).

    CListD : Set → Desc ⊤
    CListD A = `∎ tt
            `+ `σ A (λ _ → `X (tt ∷ []) tt (`∎ tt))

    pattern []       = `con (true , refl)
    pattern _∷_ x xs = `con (false , x , xs , refl)
    pattern ↶_ k     = `var k

    [0,1] : TM (CListD ℕ) tt
    01↺   : TM (CListD ℕ) tt
    [0,1] = 0 ∷ 1 ∷ []
    01↺   = 0 ∷ 1 ∷ ↶ s z

Fig. 74. Potentially Cyclic Lists: Description, Pattern Synonyms and Examples

These finite representations are interesting in their own right and we can use the generic semantics framework defined earlier to manipulate them. A basic building block is the unroll function which takes a closed tree, exposes its top node and unrolls any cycle which has it as its starting point. We can decompose it using the plug function which, given a closed and an open term, closes the latter by plugging the former at each free `var leaf. Noticing that plug’s fundamental nature is that of substituting a term for each leaf, it makes sense to implement it by re-using the Substitution semantics we already have.

    plug : TM d tt → ∀ ∆ i → Scope (Tm d ∞) ∆ i [] → TM d i
    plug t ∆ i = Semantics.semantics Sub (pack (λ _ → t))

    unroll : TM d tt → ⟦ d ⟧ (Const (TM d)) tt []
    unroll t@(`con b) = fmap d (plug t) b

Fig. 75. Plug and Unroll: Exposing a Cyclic Tree’s Top Layer

However, one thing still out of our reach with our current tools is the underlying co-finite trees these finite objects are meant to represent. We start by defining the coinductive type corresponding to them as the greatest fixpoint of a notion of layer. One layer of a co-finite tree is precisely given by the meaning of its description where we completely ignore the binding structure. We show with 01⋯ the infinite list that corresponds to the unfolding of the example 01↺ given above in Figure 74.

    record ∞Tm (d : Desc I) (s : Size) (i : I) : Set where
      coinductive; constructor `con
      field force : {s' : Size< s} → ⟦ d ⟧ (Const (∞Tm d s')) i []

    01⋯ : ∀ {s} → ∞Tm (CListD ℕ) s tt
    10⋯ : ∀ {s} → ∞Tm (CListD ℕ) s tt
    01⋯ .force = false , 0 , 10⋯ , refl
    10⋯ .force = false , 1 , 01⋯ , refl

Fig. 76. Co-finite Trees: Definition and Example

We can then make the connection between potentially cyclic structures and the co-finite trees formal by giving an unfold function which, given a closed term, produces its unfolding. The definition proceeds by unrolling the term’s top layer and co-recursively unfolding all the subterms.

    unfold : TM d tt → ∞Tm d s tt
    unfold t .force = fmap d (λ _ _ → unfold) (unroll t)

Fig. 77. Generic Unfold of Potentially Cyclic Structures

Even if the powerful notion of semantics described in Section 6 cannot encompass all the traversals we may be interested in, it provides us with reusable building blocks: the definition of unfold was made very simple by reusing the generic program fmap and the Substitution semantics whilst the definition of ∞Tm was made easy by reusing ⟦_⟧.

Haskell programmers are used to receiving help from the ‘deriving’ mechanism (Hinze & Peyton Jones (2000); Magalhães et al. (2010)) to automatically generate common traversals for every inductive type they define. Recalling that generic programming is normal programming over a universe in a dependently typed language (Altenkirch & McBride (2002)), we ought to be able to deliver similar functionalities for syntaxes with binding.

We will focus in this section on the definition of an equality test. The techniques used in this concrete example are general enough that they also apply to the definition of an ordering test, a Show instance, etc. In type theory we can do better than an uninformative boolean function claiming that two terms are equal: we can implement a decision procedure for propositional equality (Löh & Magalhães (2011)) which either returns a proof that its two inputs are equal or a proof that they cannot possibly be.

The notion of decidability can be neatly formalised by an inductive family with two constructors: a Set P is decidable if we can either say yes and return a proof of P or no and provide a proof of the negation of P (here, a proof that P implies the empty type ⊥).

    data ⊥ : Set where

    data Dec (P : Set) : Set where
      yes : P → Dec P
      no  : (P → ⊥) → Dec P

Fig. 78. Empty Type and Decidability as an Inductive Family

To get acquainted with these new notions we can start by proving that equality of variables is decidable.

The type of the decision procedure for equality of variables is as follows: given any two variables (of the same type, in the same context), the set of equality proofs between them is Decidable.

    eq^Var : (v w : Var σ Γ) → Dec (v ≡ w)

We can easily dismiss two trivial cases: if the two variables have distinct head constructors then they cannot possibly be equal. Agda allows us to dismiss the impossible premise of the function stored in the no constructor by using an absurd pattern ().

    eq^Var z     (s w) = no (λ ())
    eq^Var (s v) z     = no (λ ())

Otherwise if the two head constructors agree we can be in one of two situations. If they are both z then we can conclude that the two variables are indeed equal to each other.

    eq^Var z z = yes refl

Finally if the two variables are (s v) and (s w) respectively then we need to check recursively whether v is equal to w. If it is the case we can conclude by invoking the congruence rule for s. If v and w are not equal then a proof that (s v) and (s w) are will lead to a direct contradiction by injectivity of the constructor s.

    eq^Var (s v) (s w) with eq^Var v w
    ... | yes p = yes (cong s p)
    ... | no ¬p = no λ where refl → ¬p refl

8.2.2 Deciding Term Equality

The constructor `σ for descriptions gives us the ability to store values of any Set in terms. For some of these Sets (e.g. (ℕ → ℕ)), equality is not decidable. As a consequence our decision procedure will be conditional on the satisfaction of a certain set of Constraints which we can compute from the Desc itself, as shown in Figure 79. We demand that we are able to decide equality for all of the Sets mentioned in a description.

    Constraints : Desc I → Set
    Constraints (`σ A d)   = ((a b : A) → Dec (a ≡ b)) × (∀ a → Constraints (d a))
    Constraints (`X _ _ d) = Constraints d
    Constraints (`∎ _)     = ⊤

Fig. 79. Constraints Necessary for Decidable Equality

Remembering that our descriptions are given a semantics as a big right-nested product terminated by an equality constraint, we realise that proving decidable equality will entail proving equality between proofs of equality. We are happy to assume Streicher’s axiom K (Hofmann & Streicher (1994)) to easily dismiss this case. A more conservative approach would be to demand that equality is decidable on the index type I and to then use the classic Hedberg construction (Hedberg (1998)) to recover uniqueness of identity proofs for I.

Assuming that the constraints computed by (Constraints d) are satisfied, we define the decision procedure for equality of terms together with its equivalent for bodies. The function eq^Tm is a straightforward case analysis dismissing trivially impossible cases where terms have distinct head constructors (`var vs. `con) and using either eq^Var or eq^⟦⟧ otherwise. The latter is defined by induction over e. The somewhat verbose definitions are not enlightening so we leave them out here.

    eq^Tm : (t u : Tm d i σ Γ) → Dec (t ≡ u)
    eq^⟦⟧ : ∀ e → Constraints e → (b c : ⟦ e ⟧ (Scope (Tm d i)) σ Γ) → Dec (b ≡ c)

Fig. 80. Type of Decidable Equality for Terms and Bodies

We now have an informative decision procedure for equality between terms provided that the syntax they belong to satisfies a set of constraints. Other generic functions and decision procedures can be defined following the same approach: implement a similar function for variables first, compute a set of constraints, and demonstrate that they are sufficient to handle any input term.
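To make the shape of such a constraint concrete, here is a standalone sketch of the kind of decision procedure that a `σ ℕ node would require. It redeclares ⊥ and Dec so that it can be checked on its own with only Agda's builtin modules; the names eq^Nat and cong-suc are assumptions made for this illustration and are not taken from the accompanying code.

```agda
open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Equality using (_≡_; refl)

data ⊥ : Set where

data Dec (P : Set) : Set where
  yes : P → Dec P
  no  : (P → ⊥) → Dec P

-- Congruence for suc, proven by matching on the equality proof.
cong-suc : ∀ {m n} → m ≡ n → suc m ≡ suc n
cong-suc refl = refl

-- Decidable equality on natural numbers, following the same four-case
-- structure as the decision procedure for variables.
eq^Nat : (m n : Nat) → Dec (m ≡ n)
eq^Nat zero    zero    = yes refl
eq^Nat zero    (suc n) = no (λ ())
eq^Nat (suc m) zero    = no (λ ())
eq^Nat (suc m) (suc n) with eq^Nat m n
... | yes p = yes (cong-suc p)
... | no ¬p = no (λ { refl → ¬p refl })

-- A closed instance computes to a proof.
_ : eq^Nat 2 2 ≡ yes refl
_ = refl
```

The procedure has exactly the type `(a b : A) → Dec (a ≡ b)` that appears as the first component of the constraint generated for a `σ A d node, here with A instantiated to the natural numbers.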
In ACMM (2017) we have already shown that, for the simply typed λ-calculus, introducing an abstract notion of Semantics not only reveals the shared structure of common traversals, it also allows us to give abstract proof frameworks for simulation or fusion lemmas. This idea naturally extends to our generic presentation of semantics for all syntaxes.

In our exploration of generic proofs about the behaviour of various Semantics, we are going to need to manipulate relations between distinct notions of values or computations. In this section, we introduce the notion of relation we are going to use as well as the key relation transformers.

In Section 3.1 we introduced a generic notion of well typed and scoped environment as a function from variables to values. Its formal definition is given in Figure 5 as a record type. This record wrapper helps Agda’s type inference reconstruct the type family of values whenever it is passed an environment.

For the same reason, we will use a record wrapper for the concrete implementation of our notion of relation over (I −Scoped) families. A Relation between two such families T and U is a function which to any σ and Γ associates a relation between (T σ Γ) and (U σ Γ). Our first example of such a relation is Eqᴿ, the equality relation between an (I −Scoped) family T and itself.

    record Rel (T U : I −Scoped) : Set₁ where
      constructor mkRel
      field rel : ∀ σ → ∀[ T σ ⇒ U σ ⇒ const Set ]

    Eqᴿ : Rel T T
    rel Eqᴿ i = _≡_

Fig. 81. Relation Between I −Scoped Families and Equality Example

Once we know what relations are, we are going to have to lift relations on values and computations to relations on environments, Kripke function spaces or on d-shaped terms whose subterms have been evaluated already. This is what the rest of this section focuses on.

Environment relator

Provided a relation Vᴿ for notions of values Vᴬ and Vᴮ, by point-wise lifting we can define a relation (All Vᴿ Γ) on Γ-environments of values Vᴬ and Vᴮ respectively. We once more use a record wrapper simply to facilitate Agda’s job when reconstructing implicit arguments.

    record All (Vᴿ : Rel Vᴬ Vᴮ) (Γ : List I)
               (ρᴬ : (Γ −Env) Vᴬ ∆) (ρᴮ : (Γ −Env) Vᴮ ∆) : Set where
      constructor packᴿ
      field lookupᴿ : ∀ k → rel Vᴿ σ (lookup ρᴬ k) (lookup ρᴮ k)

Fig. 82. Relating Γ-Environments in a Pointwise Manner

The first example of two environments being related is reflᴿ which, to any environment ρ, associates a trivial proof of the statement (All Eqᴿ Γ ρ ρ). The combinators we introduced in Figure 6 to build environments (ε, _•_, etc.) have natural relational counterparts. We reuse the same names for them, simply appending an ᴿ suffix.
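The pointwise lifting and its combinators can be sketched in isolation. The following standalone example makes simplifying assumptions that differ from the paper's setup: environments are plain functions out of an untyped variable type Fin n rather than records over typed contexts, and the names Env, Allᴿ, ε, _∙_ and _∙ᴿ_ are hypothetical stand-ins chosen for this illustration.

```agda
open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Equality using (_≡_; refl)

-- Variables in a scope of size n.
data Fin : Nat → Set where
  z : ∀ {n} → Fin (suc n)
  s : ∀ {n} → Fin n → Fin (suc n)

-- Simplification: an environment is a function from variables to values.
Env : Nat → Set → Set
Env n V = Fin n → V

-- Pointwise lifting of a relation on values to a relation on environments.
Allᴿ : ∀ {A B : Set} → (A → B → Set) → ∀ {n} → Env n A → Env n B → Set
Allᴿ R ρᴬ ρᴮ = ∀ k → R (ρᴬ k) (ρᴮ k)

-- Every environment is related to itself by pointwise equality.
reflᴿ : ∀ {n} {A : Set} (ρ : Env n A) → Allᴿ _≡_ ρ ρ
reflᴿ ρ k = refl

-- The empty environment and environment extension.
ε : ∀ {V : Set} → Env 0 V
ε ()

_∙_ : ∀ {n} {V : Set} → Env n V → V → Env (suc n) V
(ρ ∙ v) z     = v
(ρ ∙ v) (s k) = ρ k

-- Relational counterpart of extension: related environments extended
-- with related values are still related.
_∙ᴿ_ : ∀ {n} {A B : Set} {R : A → B → Set}
         {ρᴬ : Env n A} {ρᴮ : Env n B} {a : A} {b : B} →
       Allᴿ R ρᴬ ρᴮ → R a b → Allᴿ R (ρᴬ ∙ a) (ρᴮ ∙ b)
(ρᴿ ∙ᴿ r) z     = r
(ρᴿ ∙ᴿ r) (s k) = ρᴿ k

_ : Allᴿ _≡_ (ε ∙ 1) (ε ∙ 1)
_ = reflᴿ (ε ∙ 1)
```

Because Allᴿ unfolds to a bare function type in this sketch, Agda may be unable to reconstruct ρᴬ and ρᴮ at use sites of _∙ᴿ_; this is the kind of inference problem the record wrappers in the text are meant to avoid.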

Kripke relator

We assume that we have two types of values Vᴬ and Vᴮ as well as a relation Vᴿ for pairs of such values, and two types of computations Cᴬ and Cᴮ whose notion of relatedness is given by Cᴿ. We can define Kripkeᴿ relating Kripke functions of type (Kripke Vᴬ Cᴬ) and (Kripke Vᴮ Cᴮ) respectively by stating that they send related inputs to related outputs. We use the relation transformer All defined in the previous paragraph.

    Kripkeᴿ : ∀ ∆ i → ∀[ Kripke Vᴬ Cᴬ ∆ i ⇒ Kripke Vᴮ Cᴮ ∆ i ⇒ const Set ]
    Kripkeᴿ []        σ kᴬ kᴮ = rel Cᴿ σ kᴬ kᴮ
    Kripkeᴿ ∆@(_ ∷ _) σ kᴬ kᴮ = ∀ {Θ} (ρ : Thinning _ Θ) {vsᴬ vsᴮ} →
                                All Vᴿ ∆ vsᴬ vsᴮ → rel Cᴿ σ (kᴬ ρ vsᴬ) (kᴮ ρ vsᴮ)

Fig. 83. Relational Kripke Function Spaces: From Related Inputs to Related Outputs

Desc relator

The relator (⟦ d ⟧ᴿ) is a relation transformer which characterises structurally equal layers such that their substructures are themselves related by the relation it is passed as an argument. It inherits a lot of its relational argument’s properties: whenever R is reflexive (respectively symmetric or transitive) so is (⟦ d ⟧ᴿ R).

It is defined by induction on the description and case analysis on the two layers which are meant to be equal:

• In the stop token case `∎ i, the two layers are considered to be trivially equal (i.e. the constraint generated is the unit type).
• When facing a recursive position `X ∆ j d, we demand that the two substructures are related by R ∆ j and that the rest of the layers are related by (⟦ d ⟧ᴿ R).
• Two nodes of type `σ A d will be related if they both carry the same payload a of type A and if the rest of the layers are related by (⟦ d a ⟧ᴿ R).

    ⟦_⟧ᴿ : (d : Desc I) → (∀ ∆ σ → ∀[ X ∆ σ ⇒ Y ∆ σ ⇒ const Set ])
         → ∀[ ⟦ d ⟧ X σ ⇒ ⟦ d ⟧ Y σ ⇒ const Set ]
    ⟦ `∎ j     ⟧ᴿ R x y              = ⊤
    ⟦ `X ∆ j d ⟧ᴿ R (r , x) (r' , y) = R ∆ j r r' × ⟦ d ⟧ᴿ R x y
    ⟦ `σ A d   ⟧ᴿ R (a , x) (a' , y) = Σ (a' ≡ a) (λ where refl → ⟦ d a ⟧ᴿ R x y)

Fig. 84. Relator: Characterising Structurally Equal Values with Related Substructures

If we were to take a fixpoint of ⟦_⟧ᴿ, we could obtain a structural notion of equality for terms which we could prove equivalent to propositional equality. Although this is interesting in its own right, this section will focus on more advanced use-cases.

A constraint mentioning all three relation transformers appears naturally when we want to say that a semantics can simulate another one. For instance, renaming is simulated by substitution: we simply have to restrict ourselves to environments mapping variables to terms which happen to be variables. More generally, given a semantics Sᴬ with values Vᴬ and computations Cᴬ and a semantics Sᴮ with values Vᴮ and computations Cᴮ, we want to establish the constraints under which these two semantics yield related computations provided they were called with environments of related values.

These constraints are packaged in a record type called Simulation and parametrised over the semantics as well as the notion of relatedness used for values (given by a relation Vᴿ) and computations (given by a relation Cᴿ).

    record Simulation (d : Desc I)
      (Sᴬ : Semantics d Vᴬ Cᴬ) (Sᴮ : Semantics d Vᴮ Cᴮ)
      (Vᴿ : Rel Vᴬ Vᴮ) (Cᴿ : Rel Cᴬ Cᴮ) : Set where

The first two constraints are self-explanatory: the operations th^V and var defined by each semantics should be compatible with the notions of relatedness used for values and computations.

      thᴿ  : (ρ : Thinning Γ ∆) → rel Vᴿ σ vᴬ vᴮ → rel Vᴿ σ (Sᴬ.th^V vᴬ ρ) (Sᴮ.th^V vᴮ ρ)
      varᴿ : rel Vᴿ σ vᴬ vᴮ → rel Cᴿ σ (Sᴬ.var vᴬ) (Sᴮ.var vᴮ)

The third constraint is similarly simple: the algebras (alg) should take related recursively evaluated subterms of respective types ⟦ d ⟧ (Kripke Vᴬ Cᴬ) and ⟦ d ⟧ (Kripke Vᴮ Cᴮ) to related computations. The difficulty is in defining an appropriate notion of relatedness bodyᴿ for these recursively evaluated subterms.
      algᴿ : (b : ⟦ d ⟧ (Scope (Tm d s)) σ Γ) → All Vᴿ Γ ρᴬ ρᴮ →
             let vᴬ = fmap d (Sᴬ.body ρᴬ) b
                 vᴮ = fmap d (Sᴮ.body ρᴮ) b
             in bodyᴿ vᴬ vᴮ → rel Cᴿ σ (Sᴬ.alg vᴬ) (Sᴮ.alg vᴮ)

We can combine ⟦_⟧ᴿ and Kripkeᴿ to express the idea that two recursively evaluated subterms are related whenever they have an equal shape (which means their Kripke functions can be grouped in pairs) and that all the pairs of Kripke function spaces take related inputs to related outputs.

      bodyᴿ : ⟦ d ⟧ (Kripke Vᴬ Cᴬ) σ ∆ → ⟦ d ⟧ (Kripke Vᴮ Cᴮ) σ ∆ → Set
      bodyᴿ vᴬ vᴮ = ⟦ d ⟧ᴿ (Kripkeᴿ Vᴿ Cᴿ) vᴬ vᴮ

The fundamental lemma of simulations is a generic theorem showing that for each pair of Semantics respecting the Simulation constraint, we get related computations given environments of related input values. In Figure 85, this theorem is once more mutually proven with a statement about Scopes, and Sizes play a crucial role in ensuring that the function is indeed total.

Instantiating this generic simulation lemma, we can for instance prove that renaming is a special case of substitution, or that renaming and substitution are extensional, i.e. that given environments equal in a pointwise manner they produce syntactically equal terms. Of course these results are not new but having them generically over all syntaxes with binding is convenient. We experienced this first hand when tackling the POPLMark Reloaded challenge (2017) where rensub (defined in Figure 86) was actually needed.

    sim  : All Vᴿ Γ ρᴬ ρᴮ → (t : Tm d s σ Γ) →
           rel Cᴿ σ (Sᴬ.semantics ρᴬ t) (Sᴮ.semantics ρᴮ t)
    body : All Vᴿ Γ ρᴬ ρᴮ → ∀ ∆ j → (t : Scope (Tm d s) ∆ j Γ) →
           Kripkeᴿ Vᴿ Cᴿ ∆ j (Sᴬ.body ρᴬ ∆ j t) (Sᴮ.body ρᴮ ∆ j t)

    sim ρᴿ (`var k) = varᴿ (lookupᴿ ρᴿ k)
    sim ρᴿ (`con t) = algᴿ t ρᴿ (liftᴿ d (body ρᴿ) t)

    body ρᴿ []      i t = sim ρᴿ t
    body ρᴿ (_ ∷ _) i t = λ σ vsᴿ → sim (vsᴿ >>ᴿ (thᴿ σ <$>ᴿ ρᴿ)) t

Fig. 85. Fundamental Lemma of Simulations

    RenSub : Simulation d Ren Sub VarTmᴿ Eqᴿ

    rensub : (ρ : Thinning Γ ∆) (t : Tm d ∞ σ Γ) → ren ρ t ≡ sub (`var <$> ρ) t
    rensub ρ = Simulation.sim RenSub (packᴿ λ _ → refl)

Fig. 86. Renaming as a Substitution via Simulation

When studying specific languages, new opportunities to deploy the fundamental lemma of simulations arise. Our solution to the POPLMark Reloaded challenge for instance describes the fact that (sub ρ t) reduces to (sub ρ’ t) whenever for all v, ρ(v) reduces to ρ’(v) as a Simulation. The main theorem (strong normalisation of STLC via a logical relation) is itself an instance of (the unary version of) the simulation lemma.

The Simulation proof framework is the simplest example of the abstract proof frameworks introduced in ACMM (2017). We also explain how a similar framework can be defined for fusion lemmas and deploy it for the renaming-substitution interactions but also their respective interactions with normalisation by evaluation. Now that we are familiar with the techniques at hand, we can tackle this more complex example for all syntaxes definable in our framework.

Results that can be reformulated as the ability to fuse two traversals obtained as Semantics into one abound. When claiming that Tm is a Functor, we have to prove that two successive renamings can be fused into a single renaming where the Thinnings have been composed. Similarly, demonstrating that Tm is a relative Monad (Altenkirch et al. (2014)) implies proving that two consecutive substitutions can be merged into a single one whose environment is the first one, where the second one has been applied in a pointwise manner. The Substitution Lemma central to most model constructions (see for instance Mitchell & Moggi (1991)) states that a syntactic substitution followed by the evaluation of the resulting term into the model is equivalent to the evaluation of the original term with an environment corresponding to the evaluated substitution.

A direct application of these results is our (to be published) entry to the POPLMark Reloaded challenge (2017). By using a Desc-based representation of intrinsically well typed and well scoped terms we directly inherit not only renaming and substitution but also all four fusion lemmas as corollaries of our generic results. This allows us to remove the usual boilerplate and go straight to the point. As all of these statements have precisely the same structure, we can once more devise a framework which will, provided that its constraints are satisfied, prove a generic fusion lemma.

Fusion is more involved than simulation; we will once more step through each one of the constraints individually, trying to give the reader an intuition for why they are shaped the way they are.

The notion of fusion is defined for a triple of Semantics; each Sⁱ being defined for values in Vⁱ and computations in Cⁱ. The fundamental lemma associated to such a set of constraints will state that running Sᴮ after Sᴬ is equivalent to running Sᴬᴮ only.

The definition of fusion is parametrised by three relations: Eᴿ relates triples of environments of values in (Γ −Env) Vᴬ ∆, (∆ −Env) Vᴮ Θ and (Γ −Env) Vᴬᴮ Θ respectively; Vᴿ relates pairs of values Vᴮ and Vᴬᴮ; and Cᴿ, our notion of equivalence for evaluation results, relates pairs of computations in Cᴮ and Cᴬᴮ.

    record Fusion (d : Desc I) (Sᴬ : Semantics d Vᴬ Cᴬ) (Sᴮ : Semantics d Vᴮ Cᴮ)
      (Sᴬᴮ : Semantics d Vᴬᴮ Cᴬᴮ)
      (Eᴿ : ∀ Γ ∆ {Θ} → (Γ −Env) Vᴬ ∆ → (∆ −Env) Vᴮ Θ → (Γ −Env) Vᴬᴮ Θ → Set)
      (Vᴿ : Rel Vᴮ Vᴬᴮ) (Cᴿ : Rel Cᴮ Cᴬᴮ) : Set where

The first obstacle we face is the formal definition of “running Sᴮ after Sᴬ”: for this statement to make sense, the result of running Sᴬ ought to be a term. Or rather, we ought to be able to extract a term from a Cᴬ. Hence the first constraint: the existence of a reifyᴬ function, which we supply as a field of the record Fusion. When dealing with syntactic semantics such as renaming or substitution this function will be the identity. Nothing prevents proofs, such as the idempotence of NbE, from using a bona fide reification function that extracts terms from model values.

      reifyᴬ : ∀ σ → ∀[ Cᴬ σ ⇒ Tm d ∞ σ ]

Then, we have to think about what happens when going under a binder: Sᴬ will produce a Kripke function space where a syntactic value is required. Provided that Vᴬ is VarLike, we can make use of reify to get a Scope back. Hence the second constraint.

      vl^Vᴬ : VarLike Vᴬ

Still thinking about going under binders: if three evaluation environments ρᴬ in (Γ −Env) Vᴬ ∆, ρᴮ in (∆ −Env) Vᴮ Θ, and ρᴬᴮ in (Γ −Env) Vᴬᴮ Θ are related by Eᴿ and we are given a thinning ρ from Θ to Ω then ρᴬ, the thinned ρᴮ and the thinned ρᴬᴮ should still be related.

      th^Eᴿ : Eᴿ Γ ∆ ρᴬ ρᴮ ρᴬᴮ → (ρ : Thinning Θ Ω) →
              Eᴿ Γ ∆ ρᴬ (th^Env Sᴮ.th^V ρᴮ ρ) (th^Env Sᴬᴮ.th^V ρᴬᴮ ρ)

Remembering that _>>_ is used in the definition of body (Figure 34) to combine two disjoint environments (Γ −Env) V Θ and (∆ −Env) V Θ into one of type ((Γ ++ ∆) −Env) V Θ, we mechanically need a constraint stating that _>>_ is compatible with Eᴿ. We demand as an extra precondition that the values ρᴮ and ρᴬᴮ are extended with are related according to Vᴿ. Lastly, for all the types to match up, ρᴬ has to be extended with placeholder variables which is possible because we have already insisted on Vᴬ being VarLike.

      _>>ᴿ_ : Eᴿ Γ ∆ ρᴬ ρᴮ ρᴬᴮ → All Vᴿ Θ vsᴮ vsᴬᴮ →
              let id>>ρᴬ = freshˡ vl^Vᴬ ∆ >> th^Env Sᴬ.th^V ρᴬ (freshʳ vl^Var Θ)
              in Eᴿ (Θ ++ Γ) (Θ ++ ∆) id>>ρᴬ (vsᴮ >> ρᴮ) (vsᴬᴮ >> ρᴬᴮ)

We finally arrive at the constraints focusing on the semantical counterparts of the terms’ constructors. Each constraint essentially states that evaluating a term with Sᴬ, reifying the result and running Sᴮ is equivalent to using Sᴬᴮ straight away. This can be made formal by defining the following relation R.

      R : ∀ σ → (Γ −Env) Vᴬ ∆ → (∆ −Env) Vᴮ Θ → (Γ −Env) Vᴬᴮ Θ → Tm d s σ Γ → Set
      R σ ρᴬ ρᴮ ρᴬᴮ t = rel Cᴿ σ (evalᴮ ρᴮ (reifyᴬ σ (evalᴬ ρᴬ t))) (evalᴬᴮ ρᴬᴮ t)

When evaluating a variable, on the one hand Sᴬ will look up its meaning in the evaluation environment, turn the resulting value into a computation which will get reified and then the result will be evaluated with Sᴮ. Provided that all three evaluation environments are related by Eᴿ this should be equivalent to looking up the value in Sᴬᴮ’s environment and turning it into a computation. Hence the constraint varᴿ:

      varᴿ : Eᴿ Γ ∆ ρᴬ ρᴮ ρᴬᴮ → ∀ v → R σ ρᴬ ρᴮ ρᴬᴮ (`var v)

The case of the algebra follows a similar idea albeit being more complex: a term gets evaluated using Sᴬ and to be able to run Sᴮ afterwards we need to recover a piece of syntax.
This is possible if the Kripke functional spaces are reified by being fed placeholder Vᴬ arguments (which can be manufactured thanks to the vl^Vᴬ we mentioned before) and then quoted. Provided that the result of running Sᴮ on that term is related via ⟦ d ⟧ᴿ (Kripkeᴿ Vᴿ Cᴿ) to the result of running Sᴬᴮ on the original term, the algᴿ constraint states that the two evaluations yield related computations.

      algᴿ : Eᴿ Γ ∆ ρᴬ ρᴮ ρᴬᴮ → (b : ⟦ d ⟧ (Scope (Tm d s)) σ Γ) →
             let bᴬ : ⟦ d ⟧ (Kripke Vᴬ Cᴬ) _ _
                 bᴬ  = fmap d (Sᴬ.body ρᴬ) b
                 bᴮ  = fmap d (λ ∆ i → Sᴮ.body ρᴮ ∆ i ∘ quoteᴬ ∆ i) bᴬ
                 bᴬᴮ = fmap d (Sᴬᴮ.body ρᴬᴮ) b
             in ⟦ d ⟧ᴿ (Kripkeᴿ Vᴿ Cᴿ) bᴮ bᴬᴮ → R σ ρᴬ ρᴮ ρᴬᴮ (`con b)

This set of constraints is enough to prove a fundamental lemma of Fusion stating that from a triple of related environments, one gets a pair of related computations: the composition of Sᴬ and Sᴮ on one hand and Sᴬᴮ on the other. This lemma is once again proven mutually with its counterpart for Semantics’s body’s action on Scopes.

    fusion : Eᴿ Γ ∆ ρᴬ ρᴮ ρᴬᴮ → (t : Tm d s σ Γ) → R σ ρᴬ ρᴮ ρᴬᴮ t

Fig. 87. Fundamental Lemma of Fusion

A direct consequence of this result is the four lemmas collectively stating that any pair of renamings and/or substitutions can be fused together to produce either a renaming (in the renaming-renaming interaction case) or a substitution (in all the other cases). One such example is the fusion of substitution followed by renaming into a single substitution where the renaming has been applied to the environment.

    subren : (t : Tm d i σ Γ) (ρ₁ : (Γ −Env) (Tm d ∞) ∆) (ρ₂ : Thinning ∆ Θ) →
             ren ρ₂ (sub ρ₁ t) ≡ sub (ren ρ₂ <$> ρ₁) t

Fig. 88. A Corollary: Substitution-Renaming Fusion

Another corollary of the fundamental lemma of fusion is the observation that Kaiser, Schäfer, and Stark (2018) make: assuming functional extensionality, all the ACMM (2017) traversals are compatible with variable renaming. We reproduced this result generically for all syntaxes (see accompanying code). The need for functional extensionality arises in the proof when dealing with subterms which have extra bound variables. These terms are interpreted as Kripke functional spaces in the host language and we can only prove that they take equal inputs to equal outputs. An intensional notion of equality will simply not do here. As a consequence, we refrain from using the generic result in practice when an axiom-free alternative is provable. Kaiser, Schäfer and Stark’s observation naturally raises the question of whether the same semantics are also stable under substitution. Our semantics implementing printing with names is a clear counter-example.

Although we were able to use propositional equality when studying syntactic traversals working on terms, it is not the appropriate notion of equality for co-finite trees. What we want is a generic coinductive notion of bisimilarity for all co-finite tree types obtained as the unfolding of a description. Two trees are bisimilar if their top layers have the same shape and their substructures are themselves bisimilar. This is precisely the type of relation ⟦_⟧ᴿ was defined to express. Hence the following coinductive relation.

    record ≈^∞Tm (d : Desc I) (s : Size) (i : I) (t u : ∞Tm d s i) : Set where
      coinductive
      field force : {s′ : Size< s} → ⟦ d ⟧ᴿ (λ _ i → ≈^∞Tm d s′ i) (t .force) (u .force)

Fig. 89. Generic Notion of Bisimilarity for Co-finite Trees


We can then prove by coinduction that this generic definition always gives rise to an equivalence relation by using the relator’s stability properties (if R is reflexive / symmetric / transitive then so is (⟦ d ⟧ᴿ R)) mentioned in Section 9.1.

    refl  : ≈^∞Tm d s i t t
    sym   : ≈^∞Tm d s i t u → ≈^∞Tm d s i u t
    trans : ≈^∞Tm d s i t u → ≈^∞Tm d s i u v → ≈^∞Tm d s i t v

This definition can be readily deployed to prove e.g. that the unfolding of 01↺ defined in Section 8.1 is indeed bisimilar to 01⋯ which was defined in direct style. The proof is straightforward due to the simplicity of this example: the first refl witnesses the fact that both definitions pick the same constructor (a cons cell), the second that they carry the same natural number, and we can conclude by an appeal to the coinduction hypothesis.

    eq-01 : {i : Size} → ≈^∞Tm (CListD ℕ) i tt 01⋯ (unfold 01↺)
    eq-10 : {i : Size} → ≈^∞Tm (CListD ℕ) i tt 10⋯ (unfold (1 :: :: :: x s z))
    eq-01 .force = refl , refl , eq-10 , tt
    eq-10 .force = refl , refl , eq-01 , tt
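The shape of this argument can be replayed on a much simpler type. The following sketch specialises bisimilarity to plain coinductive streams rather than to the generic co-finite trees of a description, and it relies on Agda's guardedness checker instead of sized types. Both choices are simplifying assumptions made for this illustration, as are all the names (Stream, _≈_, zeroOne, unfold, bit, flip). It proves that bisimilarity is an equivalence relation and that the alternating stream defined in direct style is bisimilar to the one obtained by unfolding a two-state machine.

```agda
{-# OPTIONS --guardedness #-}

open import Agda.Builtin.Nat using (Nat)
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

≡-sym : ∀ {A : Set} {a b : A} → a ≡ b → b ≡ a
≡-sym refl = refl

≡-trans : ∀ {A : Set} {a b c : A} → a ≡ b → b ≡ c → a ≡ c
≡-trans refl q = q

-- Infinite streams, defined by their observations.
record Stream (A : Set) : Set where
  coinductive
  field
    head : A
    tail : Stream A
open Stream

-- Two streams are bisimilar if their heads are equal
-- and their tails are themselves bisimilar.
record _≈_ {A : Set} (xs ys : Stream A) : Set where
  coinductive
  field
    head-≡ : head xs ≡ head ys
    tail-≈ : tail xs ≈ tail ys
open _≈_

-- Bisimilarity is an equivalence relation, by coinduction.
≈-refl : ∀ {A} {xs : Stream A} → xs ≈ xs
head-≡ ≈-refl = refl
tail-≈ ≈-refl = ≈-refl

≈-sym : ∀ {A} {xs ys : Stream A} → xs ≈ ys → ys ≈ xs
head-≡ (≈-sym p) = ≡-sym (head-≡ p)
tail-≈ (≈-sym p) = ≈-sym (tail-≈ p)

≈-trans : ∀ {A} {xs ys zs : Stream A} → xs ≈ ys → ys ≈ zs → xs ≈ zs
head-≡ (≈-trans p q) = ≡-trans (head-≡ p) (head-≡ q)
tail-≈ (≈-trans p q) = ≈-trans (tail-≈ p) (tail-≈ q)

-- The stream 0, 1, 0, 1, ... in direct style, together with its tail.
zeroOne oneZero : Stream Nat
head zeroOne = 0
tail zeroOne = oneZero
head oneZero = 1
tail oneZero = zeroOne

-- The same stream as the unfolding of a two-state machine.
unfold : ∀ {S A : Set} → (S → A) → (S → S) → S → Stream A
head (unfold obs next st) = obs st
tail (unfold obs next st) = unfold obs next (next st)

bit : Bool → Nat
bit false = 0
bit true  = 1

flip : Bool → Bool
flip false = true
flip true  = false

-- Each refl checks that the heads agree; the tails are handled
-- by an appeal to the (mutual) coinduction hypothesis.
eq-01 : zeroOne ≈ unfold bit flip false
eq-10 : oneZero ≈ unfold bit flip true
head-≡ eq-01 = refl
tail-≈ eq-01 = eq-10
head-≡ eq-10 = refl
tail-≈ eq-10 = eq-01
```

As in the generic case, the two proofs are mutually defined: each one is productive because the call to the other only happens under a tail observation.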

10 Related Work

The representation of variable binding in formal systems has been a hot topic for decades. Part of the purpose of the first POPLMark challenge (2005) was to explore and compare various methods.

Having based our work on a de Bruijn encoding of variables, and thus a canonical treatment of α-equivalence classes, our work has no direct comparison with permutation-based treatments such as those of Pitts’ and Gabbay’s nominal syntax (2002).

Our generic universe of syntax is based on scoped and typed de Bruijn indices (de Bruijn (1972)) but it is not a necessity. It is for instance possible to give an interpretation of Descriptions corresponding to Chlipala’s Parametric Higher-Order Abstract Syntax (2008) and we would be interested to see what the appropriate notion of Semantics is for this representation.

The binding structure we present here is based on a flat, lexical scoping strategy. There are other strategies and it would be interesting to see whether our approach could be reused in these cases.

Weirich, Yorgey, and Sheard’s work (2011) encompassing a large array of patterns (nested, recursive, telescopic, and n-ary) can inform our design. They do not enforce scoping invariants internally which forces them to introduce separate constructors for a simple binder, a recursive one, or a telescopic pattern. They recover guarantees by giving their syntaxes a nominal semantics thus bolting down the precise meaning of each combinator and then proving that users may only generate well formed terms.

Bach Poulsen, Rouvoet, Tolmach, Krebbers and Visser (2018) introduce notions of scope graphs and frames to scale the techniques typical of well scoped and typed deep embeddings to imperative languages. They showcase the core ideas of their work using STLC extended with references and then demonstrate that they can already handle a large subset of Middleweight Java. We have demonstrated that our framework could be used to define effectful semantics by choosing an appropriate monad stack (Moggi (1991)). This suggests we should be able to model STLC + Ref. It is however clear that the scoping structures handled by scope graphs and frames are, in their full generality, out of reach for our framework. In contrast, our work shines by its generality: we define an entire universe of syntaxes and provide users with traversals and lemmas implemented once and for all.

Many other opportunities to enrich the notion of binder in our library are highlighted by Cheney (2005). As we have demonstrated in Sections 7.5 and 7.6 we can already handle let-bindings generically for all syntaxes. We are currently considering the modification of our system to handle deeply-nested patterns by removing the constraint that the binders’ and variables’ sorts are identical. A notion of binding corresponding to hierarchical namespaces would be an exciting addition.

We have demonstrated how to write generic programs over the potentially cyclic structures of Ghani, Hamana, Uustalu and Vene (2006). Further work by Hamana (2009) yielded a different presentation of cyclic structures which preserves sharing: pointers can not only refer to nodes above them but also across from them in the cyclic tree. Capturing this class of inductive types as a set of syntaxes with binding and writing generic programs over them is still an open problem.

An early foundational study of a general semantic framework for signatures with binding, algebras for such signatures, and initiality of the term algebra, giving rise to a categorical ‘program’ for substitution and proofs of its properties, was given by Fiore, Plotkin and Turi (Fiore et al. (1999)). They worked in the category of presheaves over renamings, (a skeleton of) the category of finite sets. The presheaf condition corresponds to our notion of being Thinnable. Exhibiting algebras based on both de Bruijn level and index encodings, their approach isolates the usual (abstract) arithmetic required of such encodings.

By contrast, we are working in an implemented type theory where the encoding can be understood as its own foundation without appeal to an external mathematical semantics. We are able to go further in developing machine-checked versions of such implementations and proofs, themselves generic with respect to an abstract syntax Desc of syntaxes-with-binding. Moreover, the usual source of implementation anxiety, namely concrete arithmetic on de Bruijn indices, has been successfully encapsulated via the □ coalgebra structure. It is perhaps noteworthy that our type-theoretic constructions, by contrast with their categorical ones, appear to make fewer commitments as to functoriality, thinnability, etc. in our specification of semantics, with such properties typically being provable as a further instance of our framework.


The tediousness of repeatedly proving similar statements has unsurprisingly led to various attempts at automating the pain away via either code generation or the definition of tactics. These solutions can be seen as untrusted oracles driving the interactive theorem prover.

Polonowski’s DBGen (2013) takes as input a raw syntax with comments annotating binding sites. It generates a module defining lifting, substitution as well as a raw syntax using names and a validation function transforming named terms into de Bruijn ones; we refrain from calling it a scopechecker as terms are not statically proven to be well scoped.

Kaiser, Schäfer, and Stark (2018) build on our previous paper to draft possible theoretical foundations for Autosubst, a so-far untrusted set of tactics. The paper is based on a specific syntax: well scoped call-by-value System F. In contrast, our effort has been here to carve out a precise universe of syntaxes with binding and give a systematic account of these syntaxes’ semantics and proofs.

Keuchel, Weirich, and Schrijvers’ Needle (2016) is a code generator written in Haskell producing syntax-specific Coq modules implementing common traversals and lemmas about them.

Keeping in mind Altenkirch and McBride’s observation that generic programming is everyday programming in dependently-typed languages (2002), we can naturally expect generic, provably sound, treatments of these notions in tools such as Agda or Coq.

Keuchel (2011) together with Jeuring (2012) define a universe of syntaxes with binding with a rich notion of binding patterns closed under products but also sums as long as the disjoint patterns bind the same variables. They give their universe two distinct semantics: a first one based on well scoped de Bruijn indices and a second one based on Parametric Higher-Order Abstract Syntax (PHOAS) (Chlipala (2008)) together with a generic conversion function from the de Bruijn syntax to the PHOAS one. Following McBride (2005), they implement both renaming and substitution in one fell swoop. They leave other opportunities for generic programming and proving to future work.

Keuchel, Weirich, and Schrijvers’ Knot (2016) implements as a set of generic programs the traversals and lemmas generated in specialised forms by their Needle program. They see Needle as a pragmatic choice: working directly with the free monadic terms over finitary containers would be too cumbersome. In our experience solving the POPLMark Reloaded challenge, Agda’s pattern synonyms make working with an encoded definition almost seamless.

The GMeta generic framework (2012) provides a universe of syntaxes and offers various binding conventions (locally nameless (Charguéraud (2012)) or de Bruijn indices). It also generically implements common traversals (e.g. computing the sets of free variables, shifting de Bruijn indices or substituting terms for parameters) as well as common predicates (e.g. being a closed term) and provides generic lemmas proving that they are well behaved. It does not offer a generic framework for defining new well scoped-and-typed semantics and proving their properties.

Érdi (2018) defines a universe inspired by a first draft of this paper and gives three different interpretations (raw, scoped and typed syntax) related via erasure. He provides scope- and type-preserving renaming and substitution as well as various generic proofs that they are well behaved but offers neither a generic notion of semantics, nor generic proof frameworks.

Copello (2017) works with named binders and defines nominal techniques (e.g. name swapping) and ultimately α-equivalence over a universe of regular trees with binders inspired by Morris’ (2006).

The careful characterisation of the successive recursive traversals which can be fused together into a single pass in a semantics-preserving way is not new. This transformation is a much needed optimisation principle in a high-level functional language.

Through the careful study of the recursion operator associated to each strictly positive datatype, Malcolm (1990) defined optimising fusion proof principles. Other optimisations such as deforestation (Wadler (1990)) or the compilation of a recursive definition into an equivalent abstract machine-based tail-recursive program (Cortiñas & Swierstra (2018)) rely on similar generic proofs that these transformations are meaning-preserving.

11 Conclusion and Future Work

Recalling our earlier work (2017) we have started from an example of a scope- and type-safe language (the simply typed λ-calculus), have studied common invariant preserving traversals and noticed their similarity. After introducing a notion of semantics and refactoring these traversals as instances of the same fundamental lemma, we have observed the tight connection between the abstract definition of semantics and the shape of the language.

By extending a universe of datatype descriptions to support a notion of binding, we have given a generic presentation of syntaxes with binding. We then described a large class of scope- and type-safe generic programs acting on all of them. We started with syntactic traversals such as renaming and substitution. We then demonstrated how to write a small compiler pipeline: scope checking, type checking and elaboration to a core language, desugaring of new constructors added by a language transformer, dead code elimination and inlining, partial evaluation, and printing with names.

We have seen how to construct generic proofs about these generic programs. We first introduced a Simulation relation showing what it means for two semantics to yield related outputs whenever they are fed related input environments. We then built on our experience to tackle a more involved case: identifying a set of constraints guaranteeing that two semantics run consecutively can be subsumed by a single pass of a third one.

We have put all of these results into practice by using them to solve the (to be published) POPLMark Reloaded challenge which consists of formalising strong normalisation for the simply typed λ-calculus via a logical-relation argument. This also gave us the opportunity to try our framework on larger languages by tackling the challenge’s extensions to sum types and Gödel’s System T.

Finally, we have demonstrated that this formalisation can be re-used in other domains by seeing our syntaxes with binding as potentially cyclic terms. Their unfolding is a non-standard semantics and we provide the user with a generic notion of bisimilarity to reason about them.

Although quite versatile already, our current framework has some limitations which suggest avenues for future work. We list these limitations from easiest to hardest to resolve. Remember that each modification to the universe of syntaxes needs to be given an appropriate semantics.

Closure under Products

Our current universe of descriptions is closed under sums as demonstrated in Section 5. It is however not closed under products: two arbitrary right-nested products conforming to a description may disagree on the sort of the term they are constructing. An approach where the sort is an input from which the description of allowed constructors is computed (à la Dagand (2013) where, for instance, the ‘lam constructor is only offered if the input sort is a function type) would not suffer from this limitation.

Unrestricted Variables

Our current notion of variable can be used to form a term of any kind. We remarked in Sections 7.3 and 7.4 that in some languages we want to restrict this ability to one kind in particular. In that case, we wanted users to only be able to use variables at the kind Infer of our bidirectional language. For the time being we made do by restricting the environment values our Semantics use to a subset of the kinds: terms with variables of the wrong kind will not be given a semantics.

Flat Binding Structure

Our current setup limits us to flat binding structures: variable and binder share the same kinds. This prevents us from representing languages with binding patterns, for instance pattern-matching let-binders which can have arbitrarily nested patterns taking pairs apart.

Closure under Derivation

One-hole contexts play a major role in the theory of programming languages. Just like the one-hole context of a datatype is a datatype (Abbott et al. (2005)), we would like our universe to be closed under derivatives so that the formalisation of e.g. evaluation contexts could benefit directly from the existing machinery.

Closure under Closures

Jander’s work on formalising and certifying continuation passing style transformations (Jander (2019)) highlighted the need for a notion of syntaxes with closures. Recalling that our notion of Semantics is always compatible with precomposition with a renaming (Kaiser et al. (2018)) but not necessarily precomposition with a substitution (printing is for instance not stable under substitution), accommodating terms with suspended substitutions is a real challenge. Preliminary experiments show that a drastic modification of the type of the fundamental lemma of Semantics makes dealing with such closures possible. Whether the resulting traversal has good properties that can be proven generically is still an open problem.


The diverse influences leading to this work suggest many opportunities for future research.

• Our example of the elaboration of an enriched language to a core one, ACMM’s implementation of a Continuation Passing Style conversion function, and Jander’s work (2019) on the certification of an intrinsically typed CPS transformation raise the question of how many such common compilation passes can be implemented generically.

• Our universe only includes syntaxes that allow unrestricted variable use. Variables may be used multiple times or never, with no restriction. We are interested in representing syntaxes that only allow single use of variables, such as term calculi for linear logic (Benton et al. (1993); Barber (1996); Atkey & Wood (2018)), or that annotate variables with usage information (Brunel et al. (2014); Ghica & Smith (2014); Petricek et al. (2014)), or arrange variables into non-list like structures such as bunches (O’Hearn (2003)), or arbitrary algebraic structures (Licata et al. (2017)), and in investigating what form a generic semantics for these syntaxes takes.

• An extension of McBride’s theory of ornaments (2017) could provide an appropriate framework to formalise and mechanise the connection between various languages, some being seen as refinements of others. This is particularly evident when considering the informative typechecker (see the accompanying code) which given a scoped term produces a scoped-and-typed term by type-checking or type-inference.

• Our work on the POPLMark Reloaded challenge highlights a need for generic notions of congruence closure which would come with guarantees (if the original relation is stable under renaming and substitution so should the closure). Similarly, the “evaluation contexts” corresponding to a syntax could be derived automatically by building on the work of Huet (1997) and Abbott, Altenkirch, McBride and Ghani (2005), allowing us to revisit previous work based on concrete instances of ACMM such as McLaughlin, McKinna and Stark (2018).

We now know how to generically describe syntaxes and their well behaved semantics. We can now start asking what it means to define well behaved judgments. Why stop at helping the user write their specific language’s meta-theory when we could study meta-meta-theory?

References

Abbott, Michael Gordon, Altenkirch, Thorsten, McBride, Conor, & Ghani, Neil. (2005). ∂ for data: Differentiating data structures. Fundam. inform., (1-2), 1–28.
Abel, Andreas. (2010). MiniAgda: Integrating Sized and Dependent Types. Pages 14–28 of: Bove, Ana, Komendantskaya, Ekaterina, & Niqui, Milad (eds), Proceedings Workshop on Partiality and Recursion in Interactive Theorem Provers, PAR 2010, Edinburgh, UK, 15th July 2010. EPTCS, vol. 43.
Abel, Andreas, Pientka, Brigitte, Thibodeau, David, & Setzer, Anton. (2013). Copatterns: programming infinite structures by observations. Pages 27–38 of: ACM SIGPLAN Notices, vol. 48. ACM.
Abel, Andreas, Momigliano, Alberto, & Pientka, Brigitte. (2017). POPLMark Reloaded. Proceedings of the Logical Frameworks and Meta-Languages: Theory and Practice workshop.
Allais, Guillaume. (2018). agdarsec – Total parser combinators. Pages 45–59 of: Boldo, Sylvie, & Magaud, Nicolas (eds), JFLA 2018 Journées Francophones des Langages Applicatifs. Banyuls-sur-Mer, France: publié par les auteurs.
Allais, Guillaume, Chapman, James, McBride, Conor, & McKinna, James. (2017). Type-and-scope safe programs and their proofs. Pages 195–207 of: Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs. CPP 2017. ACM.
Altenkirch, Thorsten, & McBride, Conor. (2002). Generic programming within dependently typed programming. Pages 1–20 of: Gibbons, Jeremy, & Jeuring, Johan (eds), Generic Programming, IFIP TC2 / WG2.1 Working Conference on Generic Programming, July 11-12, 2002, Dagstuhl, Germany. IFIP Conference Proceedings, vol. 243. Kluwer.
Altenkirch, Thorsten, & Reus, Bernhard. (1999). Monadic presentations of lambda terms using generalized inductive types. Pages 453–468 of: CSL. Springer.
Altenkirch, Thorsten, Hofmann, Martin, & Streicher, Thomas. (1995). Categorical reconstruction of a reduction free normalization proof. Pages 182–199 of: LNCS, vol. 530. Springer.
Altenkirch, Thorsten, Chapman, James, & Uustalu, Tarmo. (2014). Relative monads formalised. Journal of formalized reasoning, (1), 1–43.
Altenkirch, Thorsten, Ghani, Neil, Hancock, Peter, McBride, Conor, & Morris, Peter. (2015a). Indexed containers. J. funct. program.
Altenkirch, Thorsten, Chapman, James, & Uustalu, Tarmo. (2015b). Monads need not be endofunctors. Logical methods in computer science, (1).
Appel, Andrew W., & Jim, Trevor. (1997). Shrinking lambda expressions in linear time. J. funct. program., (5), 515–540.
Atkey, Robert. (2015). An algebraic approach to typechecking and elaboration. Talk.
Atkey, Robert, & Wood, James. (2018). Context constrained computation.
Aydemir, Brian E., Bohannon, Aaron, Fairbairn, Matthew, Foster, J. Nathan, Pierce, Benjamin C., Sewell, Peter, Vytiniotis, Dimitrios, Washburn, Geoffrey, Weirich, Stephanie, & Zdancewic, Steve. (2005). Mechanized Metatheory for the Masses: The POPLMark Challenge. Pages 50–65 of: Hurd, Joe, & Melham, Tom (eds), Theorem Proving in Higher Order Logics. Springer.
Bach Poulsen, Casper, Rouvoet, Arjen, Tolmach, Andrew, Krebbers, Robbert, & Visser, Eelco. (2018). Intrinsically-typed definitional interpreters for imperative languages. Proc. acm program. lang., (POPL), 16:1–16:34.
Barber, Andrew. (1996). Dual intuitionistic linear logic. Tech. rept. ECS-LFCS-96-347. LFCS, University of Edinburgh.
Bellegarde, Françoise, & Hook, James. (1994). Substitution: A formal methods case study using monads and transformations. Science of computer programming, (2), 287–311.
Benke, Marcin, Dybjer, Peter, & Jansson, Patrik. (2003). Universes for generic programs and proofs in dependent type theory. Nordic j. of computing, (4), 265–289.
Benton, Nick, Hur, Chung-Kil, Kennedy, Andrew J, & McBride, Conor. (2012). Strongly typed term representations in Coq. Jar, (2), 141–159.

Benton, P. N., Bierman, Gavin M., de Paiva, Valeria, & Hyland, Martin. (1993). A term calculus for intuitionistic linear logic. Pages 75–90 of: Bezem, Marc, & Groote, Jan Friso (eds), Typed Lambda Calculi and Applications, International Conference on Typed Lambda Calculi and Applications, TLCA ’93, Utrecht, The Netherlands, March 16-18, 1993, Proceedings. Lecture Notes in Computer Science, vol. 664. Springer.
Berger, Ulrich. (1993). Program extraction from normalization proofs. Pages 91–106 of: TLCA. Springer.
Berger, Ulrich, & Schwichtenberg, Helmut. (1991). An inverse of the evaluation functional for typed λ-calculus. Pages 203–211 of: LICS. IEEE.
Bird, Richard S., & Paterson, Ross. (1999). de Bruijn notation as a nested datatype. Journal of functional programming, (1), 77–91.
Brady, Edwin. (2013). Idris, a general-purpose dependently typed programming language: Design and implementation. Journal of functional programming, (5), 552–593.
Brady, Edwin, & Hammond, Kevin. (2006). A verified staged interpreter is a verified compiler. Pages 111–120 of: Jarzabek, Stan, Schmidt, Douglas C., & Veldhuizen, Todd L. (eds), Generative Programming and Component Engineering, 5th International Conference, GPCE 2006, Portland, Oregon, USA, October 22-26, 2006, Proceedings. ACM.
Brunel, Aloïs, Gaboardi, Marco, Mazza, Damiano, & Zdancewic, Steve. (2014). A Core Quantitative Coeffect Calculus. Pages 351–370 of: Programming Languages and Systems - 23rd European Symposium on Programming, ESOP 2014.
Chapman, James, Dagand, Pierre-Évariste, McBride, Conor, & Morris, Peter. (2010). The gentle art of levitation. Pages 3–14 of: Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming. ICFP ’10. ACM.
Chapman, James Maitland. (2009). Type checking and normalisation. Ph.D. thesis, University of Nottingham (UK).
Charguéraud, Arthur. (2012). The locally nameless representation. Journal of automated reasoning, (3), 363–408.
Cheney, James. (2005). Toward a general theory of names: binding and scope. Pages 33–40 of: Pollack, Randy (ed), ACM SIGPLAN International Conference on Functional Programming, Workshop on Mechanized reasoning about languages with variable binding, MERLIN 2005, Tallinn, Estonia, September 30, 2005. ACM.
Chlipala, Adam. (2008). Parametric higher-order abstract syntax for mechanized semantics. Pages 143–156 of: Hook, James, & Thiemann, Peter (eds), Proceeding of the 13th ACM SIGPLAN international conference on Functional programming, ICFP 2008, Victoria, BC, Canada, September 20-28, 2008. ACM.
Copello, Ernesto. (2017). On the formalisation of the metatheory of the lambda calculus and languages with binders. Ph.D. thesis, Universidad de la República (Uruguay).
Coquand, Catarina. (2002). A formalised proof of the soundness and completeness of a simply typed lambda-calculus with explicit substitutions. Higher-order and symbolic computation, (1), 57–90.
Coquand, Thierry, & Dybjer, Peter. (1997). Intuitionistic model constructions and normalization proofs. Mscs, (01), 75–94.
Cortiñas, Carlos Tomé, & Swierstra, Wouter. (2018). From algebra to abstract machine: a verified generic construction. Pages 78–90 of: Eisenberg, Richard A., & Vazou, Niki (eds), Proceedings of the 3rd ACM SIGPLAN International Workshop on Type-Driven Development, TyDe@ICFP 2018, St. Louis, MO, USA, September 27, 2018. ACM.
Dagand, Pierre-Évariste. (2013). A cosmology of datatypes : reusability and dependent types. Ph.D. thesis, University of Strathclyde, Glasgow, UK.
Danielsson, Nils Anders. (2010). Total parser combinators. Pages 285–296 of: Hudak, Paul, & Weirich, Stephanie (eds), Proceeding of the 15th ACM SIGPLAN international conference on Functional programming, ICFP 2010, Baltimore, Maryland, USA, September 27-29, 2010. ACM.
de Bruijn, Nicolaas Govert. (1972). Lambda Calculus notation with nameless dummies. Pages 381–392 of: Indagationes Mathematicae, vol. 75. Elsevier.
de Moura, Leonardo Mendonça, Kong, Soonho, Avigad, Jeremy, van Doorn, Floris, & von Raumer, Jakob. (2015). The Lean theorem prover (system description). Pages 378–388 of: Felty, Amy P., & Middeldorp, Aart (eds), Automated Deduction - CADE-25 - 25th International Conference on Automated Deduction, Berlin, Germany, August 1-7, 2015, Proceedings. Lecture Notes in Computer Science, vol. 9195. Springer.
Dunfield, Joshua, & Pfenning, Frank. (2004). Tridirectional typechecking. Pages 281–292 of: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. POPL ’04. ACM.
Dybjer, Peter. (1994). Inductive families. Formal aspects of computing, (4), 440–465.
Dybjer, Peter, & Setzer, Anton. (1999). A finite axiomatization of inductive-recursive definitions. Pages 129–146 of: Girard, Jean-Yves (ed), Typed Lambda Calculi and Applications, 4th International Conference, TLCA’99, L’Aquila, Italy, April 7-9, 1999, Proceedings. Lecture Notes in Computer Science, vol. 1581. Springer.
Eisenberg, Richard A. (2018). Stitch: The sound type-indexed type checker. Draft.
Érdi, Gergő. (2018). Generic description of well-scoped, well-typed syntaxes. Unpublished draft, privately communicated.
Fiore, Marcelo P., Plotkin, Gordon D., & Turi, Daniele. (1999). Abstract syntax and variable binding. Pages 193–202 of: 14th Annual IEEE Symposium on Logic in Computer Science, Trento, Italy, July 2-5, 1999. IEEE Computer Society.
Gabbay, Murdoch, & Pitts, Andrew M. (2002). A new approach to abstract syntax with variable binding. Formal asp. comput., (3-5), 341–363.
Ghica, Dan R., & Smith, Alex I. (2014). Bounded linear types in a resource semiring. Pages 331–350 of: Programming Languages and Systems - 23rd European Symposium on Programming, ESOP 2014.
Gibbons, Jeremy, & d. S. Oliveira, Bruno C. (2009). The essence of the Iterator pattern. J. funct. program., (3-4), 377–402.
Hamana, Makoto. (2009). Initial algebra semantics for cyclic sharing structures. Pages 127–141 of: Curien, Pierre-Louis (ed), Typed Lambda Calculi and Applications, 9th International Conference, TLCA 2009, Brasilia, Brazil, July 1-3, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5608. Springer.
Hatcliff, John, & Danvy, Olivier. (1994). A generic account of continuation-passing styles. Pages 458–471 of: Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages. ACM.
Hedberg, Michael. (1998). A coherence theorem for Martin-Löf’s type theory. J. funct. program., (4), 413–436.

Hinze, Ralf, & Peyton Jones, Simon L. (2000). Derivable type classes. Electr. notes theor. comput. sci., (1), 5–35.
Hirschowitz, André, & Maggesi, Marco. (2012). Nested abstract syntax in Coq. J. autom. reasoning, (3), 409–426.
Hofmann, Martin, & Streicher, Thomas. (1994). The groupoid model refutes uniqueness of identity proofs. Pages 208–212 of: Proceedings of the Ninth Annual Symposium on Logic in Computer Science (LICS ’94), Paris, France, July 4-7, 1994. IEEE Computer Society.
Hudak, Paul. (1996). Building domain-specific embedded languages. Acm computing surveys (csur), (4es), 196.
Huet, Gérard. (1997). The zipper. Journal of functional programming, (5), 549–554.
Jander, Piotr. (2019). Verifying type-and-scope safe program transformations. M.Phil. thesis, University of Edinburgh.
Jeffrey, Alan. (2011). Associativity for free! http://thread.gmane.org/gmane.comp.lang.agda/3259.
Kaiser, Jonas, Schäfer, Steven, & Stark, Kathrin. (2018). Binder aware recursion over well-scoped de Bruijn syntax. Pages 293–306 of: Proceedings of the 7th ACM SIGPLAN International Conference on Certified Programs and Proofs. CPP 2018. ACM.
Keep, Andrew W., & Dybvig, R. Kent. (2013). A nanopass framework for commercial compiler development. Sigplan not., (9), 343–350.
Keuchel, Steven. (2011). Generic programming with binders and scope. M.Phil. thesis, Utrecht University.
Keuchel, Steven, & Jeuring, Johan. (2012). Generic conversions of abstract syntax representations. Pages 57–68 of: Löh, Andres, & Garcia, Ronald (eds), Proceedings of the 8th ACM SIGPLAN workshop on Generic programming, WGP@ICFP 2012, Copenhagen, Denmark, September 9-15, 2012. ACM.
Keuchel, Steven, Weirich, Stephanie, & Schrijvers, Tom. (2016). Needle & Knot: Binder boilerplate tied up. Pages 419–445 of: Proceedings of the 25th European Symposium on Programming Languages and Systems - Volume 9632. Springer-Verlag New York, Inc.
Lee, Gyesik, Oliveira, Bruno C. D. S., Cho, Sungkeun, & Yi, Kwangkeun. (2012). GMeta: A generic formal metatheory framework for first-order representations. Pages 436–455 of: Seidl, Helmut (ed), Programming Languages and Systems. Springer.
Licata, Daniel R., Shulman, Michael, & Riley, Mitchell. (2017). A fibrational framework for substructural and modal logics. Pages 25:1–25:22 of: Miller, Dale (ed). LIPIcs, vol. 84. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik.
Löh, Andres, & Magalhães, José Pedro. (2011). Generic programming with indexed functors. Pages 1–12 of: Järvi, Jaakko, & Mu, Shin-Cheng (eds), Proceedings of the seventh ACM SIGPLAN workshop on Generic programming, WGP@ICFP 2011, Tokyo, Japan, September 19-21, 2011. ACM.
Magalhães, José Pedro, Dijkstra, Atze, Jeuring, Johan, & Löh, Andres. (2010). A generic deriving mechanism for haskell. Pages 37–48 of: Gibbons, Jeremy (ed), Proceedings of the 3rd ACM SIGPLAN Symposium on Haskell, Haskell 2010, Baltimore, MD, USA, 30 September 2010. ACM.
Malcolm, Grant. (1990). Data structures and program transformation. Sci. comput. program., (2-3), 255–279.

Martin-Löf, Per. (1982). Constructive mathematics and computer programming. Studies in logic and the foundations of mathematics, 153–175.
The Coq Development Team. (2017). The Coq proof assistant reference manual. πr² Team. Version 8.6.
McBride, Conor. (2005). Type-preserving renaming and substitution. Unpublished draft.
McBride, Conor. (2017). Ornamental algebras, algebraic ornaments. Unpublished draft.
McBride, Conor, & McKinna, James. (2004). The view from the left. J. funct. program., (1), 69–111.
McBride, Conor, & Paterson, Ross. (2008). Applicative programming with effects. Journal of functional programming, (1), 1–13.
McLaughlin, Craig, McKinna, James, & Stark, Ian. (2018). Triangulating context lemmas. Pages 102–114 of: Proceedings of the 7th ACM SIGPLAN Conference on Certified Programs and Proofs. CPP 2018. ACM.
Mitchell, John C, & Moggi, Eugenio. (1991). Kripke-style models for typed lambda calculus. Annals of pure and applied logic, (1-2), 99–124.
Moggi, Eugenio. (1991). Notions of computation and monads. Inf. comput., (1), 55–92.
Morris, Peter, Altenkirch, Thorsten, & McBride, Conor. (2006). Exploring the regular tree types. Pages 252–267 of: Filliâtre, Jean-Christophe, Paulin-Mohring, Christine, & Werner, Benjamin (eds), Types for Proofs and Programs. Springer.
Neil Ghani, Makoto Hamana, Tarmo Uustalu, & Vene, Varmo. (2006). Representing cyclic structures as nested datatypes. Pages 173–188 of: Proceedings of 7th Trends in Functional Programming, 2006. Intellect.
Norell, Ulf. (2009). Dependently typed programming in Agda. Pages 230–266 of: AFP Summer School. Springer.
O’Hearn, Peter W. (2003). On bunched typing. J. funct. program., (4), 747–796.
Petricek, Tomas, Orchard, Dominic A., & Mycroft, Alan. (2014). Coeffects: a calculus of context-dependent computation. Pages 123–135 of: Jeuring, Johan, & Chakravarty, Manuel M. T. (eds), Proceedings of the 19th ACM SIGPLAN international conference on Functional programming, Gothenburg, Sweden, September 1-3, 2014. ACM.
Pfenning, Frank. (2004). Lecture 17: Bidirectional type checking. 15-312: Foundations of Programming Languages.
Pierce, Benjamin C, & Turner, David N. (2000). Local type inference. Acm transactions on programming languages and systems (toplas), (1), 1–44.
Polonowski, Emmanuel. (2013). Automatically generated infrastructure for de Bruijn syntaxes. Pages 402–417 of: Blazy, Sandrine, Paulin-Mohring, Christine, & Pichardie, David (eds), Interactive Theorem Proving. Springer.
Stump, Aaron. (2016). Verified functional programming in Agda. New York, NY, USA: Association for Computing Machinery and Morgan & Claypool.
Swierstra, Wouter. (2008). Data types à la carte. Journal of functional programming, (4), 423–436.
Thibodeau, David, Momigliano, Alberto, & Pientka, Brigitte. (2016). A case-study in programming coinductive proofs: Howe’s method. Tech. rept. Technical report, McGill University.
Wadler, Philip. (1987). Views: A way for pattern matching to cohabit with data abstraction.

Pages 307–313 of: Conference Record of the Fourteenth Annual ACM Symposium on

U064-05-FPR jfp19 30 January 2020 2:24

A Type and Scope Safe Universe of Syntaxes with Binding Principles of Programming Languages, Munich, Germany, January 21-23, 1987 . ACMPress.Wadler, Philip. (1990). Deforestation: Transforming programs to eliminate trees.

Theor.comput. sci. , (2), 231–248.Wadler, Philip, & Kokke, Wen. (2018). Programming language foundations in Agda .Available at http://plfa.inf.ed.ac.uk .Weirich, Stephanie, Yorgey, Brent A., & Sheard, Tim. (2011). Binders unbound.

Pages 333–345 of:

Chakravarty, Manuel M. T., Hu, Zhenjiang, & Danvy, Olivier(eds),