Polymorphic System I
Cristian F. Sottile, Alejandro Díaz-Caro, Pablo E. Martínez López
Instituto de Investigación en Ciencias de la Computación (ICC). CONICET–Universidad de Buenos Aires. Argentina.
Departamento de Ciencia y Tecnología. Universidad Nacional de Quilmes. Argentina.
Abstract
System I is a simply-typed lambda calculus with pairs, extended with an equational theory obtained from considering the type isomorphisms as equalities. In this work we propose an extension of System I to polymorphic types, adding the corresponding isomorphisms. We provide non-standard proofs of subject reduction and strong normalisation, extending those of System I.
Two types A and B are considered isomorphic (≡) if there exist two functions f of type A ⇒ B and g of type B ⇒ A such that the composition g ∘ f is semantically equivalent to the identity in A and the composition f ∘ g is semantically equivalent to the identity in B. Di Cosmo et al. [9] characterised the isomorphic types in different systems: simple types, simple types with pairs, polymorphism, etc. Using this characterisation, System I has been defined [12]. It is a simply-typed lambda calculus with pairs, where isomorphic types are considered equal. In this way, if A and B are isomorphic, every term of type A can be used as a corresponding term of type B. For example, the currying isomorphism (A ∧ B) ⇒ C ≡ A ⇒ B ⇒ C allows passing arguments one by one to a function expecting a pair. Normally, this would require a function f : (A ∧ B) ⇒ C to be transformed through a term t into tf : A ⇒ B ⇒ C. System I goes further, by considering that f has both types (A ∧ B) ⇒ C and A ⇒ B ⇒ C, so the transformation occurs implicitly, without the need for the term t. To make this idea work, System I includes an equivalence between terms; for example, t⟨r, s⟩ ⇄ trs, since if t expects a pair, it can also take each component at a time. Also, β-reduction has to be parametrized by the type: if the expected argument is a pair, then t⟨r, s⟩ β-reduces; otherwise, it does not β-reduce, but trs does. For example, (λx^{A∧B}.u)⟨r, s⟩ β-reduces if r has type A and s has type B. Instead, (λx^A.u)⟨r, s⟩ does not reduce directly, but since it is equivalent to (λx^A.u)rs, which does reduce, it also reduces, modulo this equivalence.

The idea of identifying some propositions has already been investigated, for example, in Martin-Löf's type theory [21], in the Calculus of Constructions [6], and in Deduction modulo theory [16, 17], where definitionally equivalent propositions, for instance A ⊆ B, A ∈ P(B), and ∀x (x ∈ A ⇒ x ∈ B), can be identified.
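The currying situation just described can be mimicked in an ordinary functional language, where the conversion term t must be written explicitly. The following is our own illustrative sketch (not System I itself, where the conversion is implicit):

```python
# The currying isomorphism (A ∧ B) ⇒ C ≡ A ⇒ B ⇒ C, witnessed by explicit
# conversion functions: exactly the "term t" that System I makes unnecessary.

def curry(f):
    """(A ∧ B) ⇒ C  to  A ⇒ B ⇒ C."""
    return lambda a: lambda b: f((a, b))

def uncurry(g):
    """A ⇒ B ⇒ C  to  (A ∧ B) ⇒ C."""
    return lambda pair: g(pair[0])(pair[1])

# f : (A ∧ B) ⇒ C, here with A = B = C = int
f = lambda pair: pair[0] + 2 * pair[1]

# Both round trips are the identity (up to extensionality), as required
# of an isomorphism:
assert curry(f)(1)(3) == f((1, 3)) == 7
assert uncurry(curry(f))((1, 3)) == f((1, 3))
```

In System I the two views of f coexist without `curry`/`uncurry`: the same term can be applied either to a pair or to its components one by one.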
But definitional equality does not handle isomorphisms. For example, A ∧ B and B ∧ A are not identified in these logics. Besides definitional equality, identifying isomorphic types in type theory is also a goal of the univalence axiom [25]. From the programming perspective, isomorphisms capture the computational meaning correspondence between types. Taking currying again, for example, we have a function f of type A ∧ B ⇒ C that can be transformed, because there exists an isomorphism, into a function f′ of type A ⇒ B ⇒ C. These two functions differ in how they can be combined with other terms, but they share a purpose: they both compute the same value of type C given two arguments of types A and B. In this sense, System I's proposal is to allow a programmer to focus on the meaning of programs, combining any term with the ones that are combinable with its isomorphic counterparts (e.g. f x^A y^B and f′⟨x^A, y^B⟩), ignoring the rigid syntax of terms within the safe context provided by type isomorphisms. From the logic perspective, isomorphisms make proofs more natural. For instance, to prove (A ∧ (A ⇒ B)) ⇒ B in natural deduction we need to introduce the conjunctive hypothesis A ∧ (A ⇒ B), which has to be decomposed into A and A ⇒ B, while using currying allows transforming the goal into A ⇒ (A ⇒ B) ⇒ B and directly introducing the hypotheses A and A ⇒ B, completely eliminating the need for the conjunctive hypothesis.

Table 1: Isomorphisms considered in PSI
A ∧ B ≡ B ∧ A (1)
A ∧ (B ∧ C) ≡ (A ∧ B) ∧ C (2)
A ⇒ (B ∧ C) ≡ (A ⇒ B) ∧ (A ⇒ C) (3)
(A ∧ B) ⇒ C ≡ A ⇒ B ⇒ C (4)
If X ∉ FTV(A): ∀X.(A ⇒ B) ≡ A ⇒ ∀X.B (5)
∀X.(A ∧ B) ≡ ∀X.A ∧ ∀X.B (6)

One of the pioneers in using isomorphisms in programming languages was Rittri [24], who used the types, equated by isomorphisms, as search keys in program libraries.

An interpreter of a preliminary version of System I, extended with a recursion operator, has been implemented in Haskell [15]. Such a language has peculiar characteristics. For example, using the existing isomorphism between A ⇒ (B ∧ C) and (A ⇒ B) ∧ (A ⇒ C), we can project a function computing a pair of elements and obtain, through evaluation, a simpler function computing only one of the elements of the pair, discarding the unused code that computes the output that is not of interest to us. In this work we propose an extension of System I to polymorphism, considering the corresponding isomorphisms.

Plan of the paper.
The paper is organized as follows: Section 2 introduces the proposed system, and Section 3 gives examples to better clarify the constructions. Section 4 proves the Subject Reduction property and Section 5 the Strong Normalisation property, which are the main theorems of the paper. Finally, Section 6 discusses some design choices, as well as possible directions for future work.
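The projection behaviour mentioned above, obtaining a simpler function from one that computes a pair, can be imitated in an ordinary language where isomorphism (3) is witnessed by explicit functions. This is our own sketch; in PSI itself the projection π is applied directly to the function, with no conversion:

```python
# Isomorphism (3): A ⇒ (B ∧ C) ≡ (A ⇒ B) ∧ (A ⇒ C).
# A function computing a pair can be viewed as a pair of functions,
# and "projected" to keep only the component of interest.

def split(f):
    """A ⇒ (B ∧ C)  to  the pair of functions (A ⇒ B, A ⇒ C)."""
    return (lambda a: f(a)[0], lambda a: f(a)[1])

def merge(fns):
    """(A ⇒ B, A ⇒ C)  back to  A ⇒ (B ∧ C)."""
    g, h = fns
    return lambda a: (g(a), h(a))

both = lambda x: (x + 1, x * x)   # computes a pair of results
succ_only = split(both)[0]        # the projected, simpler function

assert succ_only(4) == 5
assert merge(split(both))(3) == both(3) == (4, 9)
```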
We define Polymorphic System I (PSI) as an extension of System I [12] to polymorphic types. The syntax of types coincides with that of System F [20, Chapter 11] with pairs:

A := X | A ⇒ A | A ∧ A | ∀X.A

where X ∈ TVar, a set of type variables. The extension with respect to System F with pairs consists of adding a typing rule such that if t has type A and A ≡ B, then t also has type B, which is valid for every pair of isomorphic types A and B. This non-trivial addition induces a modification of the operational semantics of the calculus.

There are eight isomorphisms characterising all the valid isomorphisms of System F with pairs (cf. [9, Table 1.4]). From those eight, we consider the six given as a congruence in Table 1, where FTV(A) is the set of free type variables, defined as usual. The two non-listed isomorphisms are the following:

∀X.A ≡ ∀Y.[X := Y]A (7)
∀X.∀Y.A ≡ ∀Y.∀X.A (8)

The isomorphism (7) is in fact an α-equivalence, and we indeed consider terms and types modulo α-equivalence; we simply do not make this isomorphism explicit, in order to avoid confusion. The isomorphism (8), on the other hand, is not treated in this paper because PSI is presented in Church style (as System I is), and so being able to swap the arguments of a type abstraction would imply swapping the typing arguments, with cumbersome notation and little gain. We discuss this in Section 6.

The added typing rule for isomorphic types induces certain equivalences between terms. In particular, the isomorphism (1) implies that the pairs ⟨t, r⟩ and ⟨r, t⟩ are indistinguishable, since both are typed as A ∧ B and also as B ∧ A, independently of which term has type A and which one type B. Therefore, we consider those two pairs to be equivalent. In the same way, as a consequence of isomorphism (2), ⟨t, ⟨r, s⟩⟩ is equivalent to ⟨⟨t, r⟩, s⟩.

Such an equivalence between terms implies that the usual projection, which is defined with respect to position (i.e. π_i(⟨t_1, t_2⟩) ↪ t_i), is not well-defined in this system. Indeed, π_1(⟨t, r⟩) would reduce to t, but since ⟨t, r⟩ is equivalent to ⟨r, t⟩, it would also reduce to r. Therefore, PSI (as well as System I) defines the projection with respect to a type: if Γ ⊢ t : A, then π_A(⟨t, r⟩) ↪ t.

This rule turns PSI into a non-deterministic (and therefore non-confluent) system. Indeed, if both t and r have type A, then π_A(⟨t, r⟩) reduces non-deterministically to t or to r. This non-determinism, however, can be argued not to be a major problem. If we think of PSI as a proof system, then the non-determinism, as long as we have type preservation, implies that the system identifies different proofs of isomorphic propositions (a form of proof irrelevance). On the other hand, if PSI is thought of as a programming language, then determinism can be recovered by the following encoding: if t and r have the same type, it suffices to encode the deterministic projection of ⟨t, r⟩ into t as π_{B⇒A}(⟨λx^B.t, λx^C.r⟩)s, where B ≢ C and s has type B. Hence, the non-determinism of System I (inherited by PSI) is considered a feature and not a flaw (cf. [12] for a longer discussion).

Thus, PSI (as well as System I) is one of the many non-deterministic calculi in the literature, e.g. [4, 5, 7, 8, 22], and so our pair-construction operator can also be considered as the parallel composition operator of a non-deterministic calculus. In non-deterministic calculi, the non-deterministic choice is such that if r and s are two λ-terms, the term r ⊕ s represents the computation that runs either r or s non-deterministically, that is, such that (r ⊕ s)t reduces either to rt or to st. On the other hand, the parallel composition operator ∥ is such that the term (r ∥ s)t reduces to rt ∥ st and continues running both rt and st in parallel.
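These two behaviours can be modelled in miniature as follows. This is our own encoding, not the paper's calculus: we use a set for the possible outcomes of ⊕ and a tuple for the results kept by ∥:

```python
# Toy model of non-deterministic choice ⊕ versus parallel composition ∥.

def choice(r, s):
    """⊕: (r ⊕ s)(t) may step to r(t) or s(t); we return all possible outcomes."""
    return lambda t: {r(t), s(t)}

def parallel(r, s):
    """∥: (r ∥ s)(t) runs both computations and keeps both results."""
    return lambda t: (r(t), s(t))

r = lambda x: x + 1
s = lambda x: x * 2

assert choice(r, s)(3) == {4, 6}     # either outcome is possible
assert parallel(r, s)(3) == (4, 6)   # both computations are kept
```

In PSI, as explained next, the pair constructor plays the role of ∥, and ⊕ is recovered as a projection after pairing: π_B(⟨r, s⟩t).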
In our case, given r and s of type A ⇒ B and t of type A, the term π_B(⟨r, s⟩t) is equivalent to π_B(⟨rt, st⟩), which reduces to rt or st, while the term ⟨rt, st⟩ itself would run both computations in parallel. Hence, our pair constructor is equivalent to the parallel composition, while the non-deterministic choice ⊕ is decomposed into the pair constructor followed by its destructor.

In PSI and System I, the non-determinism comes from the interaction of two operators, ⟨·, ·⟩ and π. This is also related to the algebraic calculi [1, 2, 3, 26, 11, 14], some of which have been designed to express quantum algorithms. There is a clear link between our pair constructor and projection π, and the superposition constructor + and the measurement π of these algebraic calculi. In those calculi, the pair s + t is not interpreted as a non-deterministic choice, but as a superposition of two processes running s and t, and the operator π is the projection related to the measurement, which is the only non-deterministic operator. In such calculi, the distributivity rule (r + s)t ⇄ rt + st is seen as the point-wise definition of the sum of two functions.

The syntax of terms is then similar to that of System F with pairs, but with the projections depending on types instead of positions, as discussed:

t := x^A | λx^A.t | tt | ⟨t, t⟩ | π_A(t) | ΛX.t | t[A]

where x^A ∈ Var, a set of typed variables. We omit the type of variables when it is evident from the context; for example, we write λx^A.x instead of λx^A.x^A.

The type system of PSI is standard, with only two modifications with respect to that of System F with pairs: the projection (∧e) and the added rule for isomorphisms (≡). The full system is shown in Table 2. We write Γ ⊢ t : A to express that t has type A in context Γ. Notice, however, that since the system is given in Church style (i.e.
variables have their types written), the context is redundant [19, 23]. Hence, we may write "t has type A" with no ambiguity. From now on, except where indicated, we use the first upper-case letters of the Latin alphabet (A, B, C, ...) for types, the last upper-case letters of the Latin alphabet (W, X, Y, Z) for type variables, lower-case Latin letters (r, s, t, ...) for terms, the last lower-case letters of the Latin alphabet (x, y, z) for term variables, and upper-case Greek letters (Γ, ∆, ...) for contexts.

Table 2: Typing rules
- (ax): Γ, x : A ⊢ x : A
- (≡): from Γ ⊢ t : A and A ≡ B, derive Γ ⊢ t : B
- (⇒i): from Γ, x : A ⊢ t : B, derive Γ ⊢ λx^A.t : A ⇒ B
- (⇒e): from Γ ⊢ t : A ⇒ B and Γ ⊢ r : A, derive Γ ⊢ tr : B
- (∧i): from Γ ⊢ t : A and Γ ⊢ r : B, derive Γ ⊢ ⟨t, r⟩ : A ∧ B
- (∧e): from Γ ⊢ t : A ∧ B, derive Γ ⊢ π_A(t) : A
- (∀i): from Γ ⊢ t : A and X ∉ FTV(Γ), derive Γ ⊢ ΛX.t : ∀X.A
- (∀e): from Γ ⊢ t : ∀X.A, derive Γ ⊢ t[B] : [X := B]A

In the same way as isomorphisms (1) and (2) induce the commutativity and associativity of pairs, as well as a modification in the elimination of pairs (i.e. the projection), the isomorphism (3) induces that an abstraction of type A ⇒ (B ∧ C) can be considered as a pair of abstractions of type (A ⇒ B) ∧ (A ⇒ C), and so it can be projected. Therefore, an abstraction returning a pair is identified with a pair of abstractions, and an applied pair distributes its argument; that is, λx^A.⟨t, r⟩ ⇄ ⟨λx^A.t, λx^A.r⟩, and ⟨t, r⟩s ⇄ ⟨ts, rs⟩, where ⇄ is a symmetric relation (and ⇄* its transitive closure).

In addition, isomorphism (4) induces the following equivalence: t⟨r, s⟩ ⇄ trs. However, this equivalence produces an ambiguity with the β-reduction. For example, if t has type A and r has type B, the term (λx^{A∧B}.s)⟨t, r⟩ can β-reduce to [x := ⟨t, r⟩]s, but also, since this term is equivalent to (λx^{A∧B}.s)tr, which β-reduces to ([x := t]s)r, reduction would not be stable under equivalence. To ensure the stability of reduction under equivalence, β-reduction must be performed only when the type of the argument is the same as the type of the abstracted variable: if Γ ⊢ r : A, then (λx^A.t)r ↪ [x := r]t.

The two added isomorphisms for polymorphism ((5) and (6)) also add several equivalences between terms: two induced by (5), and four induced by (6). Summarizing, the operational semantics of PSI is given by the relation ↪ modulo the symmetric relation ⇄. That is, we consider the relation

→ := ⇄* ∘ ↪ ∘ ⇄*

As usual, we write →* for the reflexive and transitive closure of →. We may also write ↪^n to express n steps of the relation ↪, and ↪_R to specify that the rule used is R. Both relations for PSI are given in Table 3.

In this section we present some examples to discuss the uses of and the need for the rules presented.
Example 3.1.
We show the use of term equivalences to allow applications that cannot be built in System F. For instance, the "apply" function λf^{A⇒B}.λx^A.fx can be applied to a pair, e.g. ⟨g, t⟩ with ⊢ g : A ⇒ B and ⊢ t : A, because, due to isomorphism (4), the type derivation in Table 4 is valid. Then we have

(λf^{A⇒B}.λx^A.fx)⟨g, t⟩ ⇄ (λf^{A⇒B}.λx^A.fx)gt ↪²_{β_λ} gt

Example 3.2.
Continuing with the previous example, equivalent applications can be built in other ways. For instance, the term (λf^{A⇒B}.λx^A.fx)tg is well-typed using isomorphisms (1) and (4), and reduces to gt:

(λf^{A⇒B}.λx^A.fx)tg ⇄ (λf^{A⇒B}.λx^A.fx)⟨t, g⟩ ⇄ (λf^{A⇒B}.λx^A.fx)⟨g, t⟩ →* gt

Table 3: Relations defining the operational semantics of PSI

Equivalences:
- ⟨r, s⟩ ⇄ ⟨s, r⟩ (COMM)
- ⟨r, ⟨s, t⟩⟩ ⇄ ⟨⟨r, s⟩, t⟩ (ASSO)
- λx^A.⟨r, s⟩ ⇄ ⟨λx^A.r, λx^A.s⟩ (DIST_λ)
- ⟨r, s⟩t ⇄ ⟨rt, st⟩ (DIST_app)
- r⟨s, t⟩ ⇄ rst (CURRY)
- If X ∉ FTV(A): ΛX.λx^A.r ⇄ λx^A.ΛX.r (P-COMM∀i⇒i)
- If X ∉ FTV(A): (λx^A.r)[B] ⇄ λx^A.r[B] (P-COMM∀e⇒i)
- ΛX.⟨r, s⟩ ⇄ ⟨ΛX.r, ΛX.s⟩ (P-DIST∀i∧i)
- ⟨r, s⟩[A] ⇄ ⟨r[A], s[A]⟩ (P-DIST∀e∧i)
- π_{∀X.A}(ΛX.r) ⇄ ΛX.π_A(r) (P-DIST∀i∧e)
- If r : ∀X.(B ∧ C): (π_{∀X.B}(r))[A] ⇄ π_{[X:=A]B}(r[A]) (P-DIST∀e∧e)

Reductions:
- If Γ ⊢ s : A: (λx^A.r)s ↪ [x := s]r (β_λ)
- (ΛX.r)[A] ↪ [X := A]r (β_Λ)
- If Γ ⊢ r : A: π_A(⟨r, s⟩) ↪ r (π)

Both relations are closed under contexts: if t ⇄ r, then λx^A.t ⇄ λx^A.r, ts ⇄ rs, st ⇄ sr, ⟨t, s⟩ ⇄ ⟨r, s⟩, ⟨s, t⟩ ⇄ ⟨s, r⟩, π_A(t) ⇄ π_A(r), ΛX.t ⇄ ΛX.r, and t[A] ⇄ r[A]; and similarly, if t ↪ r, then λx^A.t ↪ λx^A.r, ts ↪ rs, st ↪ sr, ⟨t, s⟩ ↪ ⟨r, s⟩, ⟨s, t⟩ ↪ ⟨s, r⟩, π_A(t) ↪ π_A(r), ΛX.t ↪ ΛX.r, and t[A] ↪ r[A].
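For reference, the syntax of types and terms can be transcribed as Python classes. This is our own sketch (the names are ours); the key point of the encoding is that projection carries the type it selects, not a position:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str          # type variable X

@dataclass(frozen=True)
class Conj:
    left: object       # A ∧ B
    right: object

@dataclass(frozen=True)
class Pair:
    fst: object        # ⟨t, r⟩
    snd: object

@dataclass(frozen=True)
class Proj:
    ty: object         # π_A(t): projection by TYPE, not by position
    arg: object

# (Arrows, quantifiers, abstractions and applications are analogous and
# omitted here for brevity.)

A, B = TVar("A"), TVar("B")
# Syntactic equality does not capture ≡: A ∧ B and B ∧ A differ as trees,
# even though isomorphism (1) identifies them. That gap is exactly what
# the rule (≡) and the relation ⇄ bridge in PSI.
assert Conj(A, B) != Conj(B, A)
assert Conj(A, B) == Conj(A, B)
```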
Example 3.3.
Concluding with the previous example, the uncurried "apply" function λz^{(A⇒B)∧A}.π_{A⇒B}(z)π_A(z) can be applied to ⊢ g : A ⇒ B and ⊢ t : A as if it were curried:

(λz^{(A⇒B)∧A}.π_{A⇒B}(z)π_A(z))gt ⇄ (λz^{(A⇒B)∧A}.π_{A⇒B}(z)π_A(z))⟨g, t⟩ ↪_{β_λ} π_{A⇒B}(⟨g, t⟩)π_A(⟨g, t⟩) ↪²_π gt

In the three previous examples, the β-reduction cannot occur before the equivalences, because of the typing condition in rule (β_λ).

Example 3.4.
Another use of interest is the one mentioned in Section 2: a function returning a pair can be projected even while not being applied, computing another function. Consider the term π_{A⇒B}(λx^A.⟨t, r⟩), where x : A ⊢ t : B and x : A ⊢ r : C. This term is typable using isomorphism (3), since A ⇒ (B ∧ C) ≡ (A ⇒ B) ∧ (A ⇒ C). The reduction goes as follows:

π_{A⇒B}(λx^A.⟨t, r⟩) ⇄ π_{A⇒B}(⟨λx^A.t, λx^A.r⟩) ↪_π λx^A.t

Table 4: Type derivation of Example 3.1.
1. ⊢ λf^{A⇒B}.λx^A.fx : (A ⇒ B) ⇒ A ⇒ B
2. ⊢ λf^{A⇒B}.λx^A.fx : ((A ⇒ B) ∧ A) ⇒ B, by (≡) from 1, using isomorphism (4)
3. ⊢ ⟨g, t⟩ : (A ⇒ B) ∧ A, by (∧i) from ⊢ g : A ⇒ B and ⊢ t : A
4. ⊢ (λf^{A⇒B}.λx^A.fx)⟨g, t⟩ : B, by (⇒e) from 2 and 3

Table 5: Type derivation of Example 3.5.
1. ⊢ ΛX.λx^A.λf^{A⇒X}.fx : ∀X.(A ⇒ (A ⇒ X) ⇒ X)
2. ⊢ ΛX.λx^A.λf^{A⇒X}.fx : A ⇒ ∀X.((A ⇒ X) ⇒ X), by (≡) from 1, using isomorphism (5)
3. ⊢ (ΛX.λx^A.λf^{A⇒X}.fx)t : ∀X.((A ⇒ X) ⇒ X), by (⇒e) from 2 and ⊢ t : A

Example 3.5.
Rule (P-COMM∀i⇒i) is a consequence of isomorphism (5). For instance, the term (ΛX.λx^A.λf^{A⇒X}.fx)t is well-typed assuming ⊢ t : A and X ∉ FTV(A), as shown in Table 5, and we have

(ΛX.λx^A.λf^{A⇒X}.fx)t ⇄ (λx^A.ΛX.λf^{A⇒X}.fx)t ↪_{β_λ} ΛX.λf^{A⇒X}.ft

Example 3.6.
Rule (P-COMM∀e⇒i) is also a consequence of isomorphism (5). Consider the term

(λx^{∀X.(X⇒X)}.x)[A](ΛX.λx^X.x)

Let B = ∀X.(X ⇒ X). Since B ⇒ B ≡ ∀Y.(B ⇒ (Y ⇒ Y)) (renaming the variable for readability), we have

⊢ (λx^B.x)[A](ΛX.λx^X.x) : A ⇒ A

The reduction goes as follows:

(λx^{∀X.(X⇒X)}.x)[A](ΛX.λx^X.x) ⇄ (λx^{∀X.(X⇒X)}.x[A])(ΛX.λx^X.x) ↪_{β_λ} (ΛX.λx^X.x)[A] ↪_{β_Λ} λx^A.x

Example 3.7.
Rules (
P-DIST ∀ i ∧ i ) and ( P-DIST ∀ i ∧ e ) are both consequences of the same isomorphism: (6).Consider the term π ∀ X. ( X ⇒ X ) (Λ X. h λx X .x, t i )where ⊢ t : A . Since ∀ X. (( X ⇒ X ) ∧ A ) ≡ ( ∀ X. ( X ⇒ X )) ∧ ∀ X.A , we can derive ⊢ π ∀ X. ( X ⇒ X ) (Λ X. h λx X .x, t i ) : ∀ X. ( X ⇒ X )A possible reduction is: π ∀ X. ( X ⇒ X ) (Λ X. h λx X .x, t i ) ⇄ π ∀ X. ( X ⇒ X ) ( h Λ X.λx X .x, Λ X.t i ) ֒ → π Λ X.λx X .x Example 3.8.
Rule (P-DIST∀e∧i) is also a consequence of isomorphism (6). Consider

⟨ΛX.λx^X.λy^A.t, ΛX.λx^X.λz^B.r⟩[C]

where ⊢ t : D and ⊢ r : E. It has type (C ⇒ A ⇒ D) ∧ (C ⇒ B ⇒ E), and reduces as follows:

⟨ΛX.λx^X.λy^A.t, ΛX.λx^X.λz^B.r⟩[C] ⇄ ⟨(ΛX.λx^X.λy^A.t)[C], (ΛX.λx^X.λz^B.r)[C]⟩ ↪²_{β_Λ} ⟨λx^C.λy^A.t, λx^C.λz^B.r⟩

Example 3.9. Rule (
P-DIST∀e∧e) is also a consequence of isomorphism (6). Consider the term

(π_{∀X.(X⇒X)}(ΛX.⟨λx^X.x, r⟩))[A]

with type A ⇒ A, which reduces as follows:

(π_{∀X.(X⇒X)}(ΛX.⟨λx^X.x, r⟩))[A] ⇄ π_{A⇒A}((ΛX.⟨λx^X.x, r⟩)[A]) ↪_{β_Λ} π_{A⇒A}(⟨λx^A.x, [X := A]r⟩) ↪_π λx^A.x

In this section we prove the preservation of typing through reduction. First we need to characterise the equivalences between types; for example, if ∀X.A ≡ B ∧ C, then B ≡ ∀X.B′ and C ≡ ∀X.C′, with A ≡ B′ ∧ C′ (Lemma 4.9). Due to the number of isomorphisms, this kind of lemma is not trivial. To prove these relations, we first define the multiset of prime factors of a type (Definition 4.1): the multiset of types that are not equivalent to a conjunction, such that the conjunction of all its elements is equivalent to the given type. This technique has already been used in System I [12]; however, there it was used with simple types with only one basic type τ. In PSI, instead, we have an infinite number of variables acting as basic types, hence the proof becomes more complex.

We write ~X for X_1, ..., X_n and ∀~X.A for ∀X_1. ... ∀X_n.A, for some n (where, in the second case, if n = 0, ∀~X.A = A). In addition, we write [A_1, ..., A_n] or [A_i]_{i=1..n} for the multiset containing the elements A_1 to A_n. We may also write X_1 : X_2 : ··· : X_n for ~X.

Definition 4.1 (Prime factors).
- PF(X) = [X]
- PF(A ⇒ B) = [∀~X_i.((A ∧ B_i) ⇒ Y_i)]_{i=1..n} where PF(B) = [∀~X_i.(B_i ⇒ Y_i)]_{i=1..n}
- PF(A ∧ B) = PF(A) ⊎ PF(B)
- PF(∀X.A) = [∀X.∀~Y_i.(A_i ⇒ Z_i)]_{i=1..n} where PF(A) = [∀~Y_i.(A_i ⇒ Z_i)]_{i=1..n}

Lemma 4.2 and Corollary 4.3 state the correctness of Definition 4.1. We write ⋀([A_i]_{i=1..n}) for ⋀_{i=1..n} A_i.

Lemma 4.2.
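Definition 4.1 can be transcribed almost literally as a recursive function. The following is our own sketch: a prime factor ∀X_1...X_n.(B ⇒ Y) is represented as a triple (vars, dom, head), where dom is None for a bare variable X, and multisets are modelled as lists:

```python
# Prime factors of a type (Definition 4.1), over types encoded as tuples:
# ("var", x), ("arrow", a, b), ("conj", a, b), ("forall", x, a).

def pf(ty):
    kind = ty[0]
    if kind == "var":                 # PF(X) = [X]
        return [((), None, ty[1])]
    if kind == "arrow":               # PF(A ⇒ B): push A into each factor of B
        _, a, b = ty
        out = []
        for (xs, dom, y) in pf(b):
            new_dom = a if dom is None else ("conj", a, dom)
            out.append((xs, new_dom, y))
        return out
    if kind == "conj":                # PF(A ∧ B) = PF(A) ⊎ PF(B)
        _, a, b = ty
        return pf(a) + pf(b)
    if kind == "forall":              # PF(∀X.A): prefix X to each factor
        _, x, a = ty
        return [((x,) + xs, dom, y) for (xs, dom, y) in pf(a)]
    raise ValueError(kind)

A, B, C = ("var", "A"), ("var", "B"), ("var", "C")
# A ⇒ (B ∧ C) factors as [A ⇒ B, A ⇒ C], in line with isomorphism (3):
assert pf(("arrow", A, ("conj", B, C))) == [((), A, "B"), ((), A, "C")]
# ∀X.(X ⇒ X) is already prime:
X = ("var", "X")
assert pf(("forall", "X", ("arrow", X, X))) == [(("X",), X, "X")]
```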
For all A, there exist ~X, n, B_1, ..., B_n, Y_1, ..., Y_n such that PF(A) = [∀~X_i.(B_i ⇒ Y_i)]_{i=1..n}.

Proof. Straightforward induction on the structure of A.

Corollary 4.3.
For all A, A ≡ ⋀(PF(A)).

Proof. By induction on the structure of A.
- Let A = X. Then PF(X) = [X], and ⋀([X]) = X.
- Let A = B ⇒ C. By Lemma 4.2, PF(C) = [∀~X_i.(C_i ⇒ Y_i)]_{i=1..n}. Hence, by definition, PF(A) = [∀~X_i.((B ∧ C_i) ⇒ Y_i)]_{i=1..n}. By the induction hypothesis, C ≡ ⋀(PF(C)) = ⋀_{i=1..n} ∀~X_i.(C_i ⇒ Y_i). Therefore,
  A = B ⇒ C ≡ B ⇒ ⋀_{i=1..n} ∀~X_i.(C_i ⇒ Y_i) ≡ ⋀_{i=1..n} ∀~X_i.((B ∧ C_i) ⇒ Y_i) = ⋀([∀~X_i.((B ∧ C_i) ⇒ Y_i)]_{i=1..n}) = ⋀(PF(A))
- Let A = B ∧ C. By the induction hypothesis, B ≡ ⋀(PF(B)) and C ≡ ⋀(PF(C)). Hence,
  A = B ∧ C ≡ ⋀(PF(B)) ∧ ⋀(PF(C)) ≡ ⋀(PF(B) ⊎ PF(C)) = ⋀(PF(A))
- Let A = ∀X.B. By Lemma 4.2, PF(B) = [∀~Y_i.(B_i ⇒ Z_i)]_{i=1..n}. Hence, by definition, PF(A) = [∀X.∀~Y_i.(B_i ⇒ Z_i)]_{i=1..n}. By the induction hypothesis, B ≡ ⋀(PF(B)) = ⋀_{i=1..n} ∀~Y_i.(B_i ⇒ Z_i). Therefore,
  A = ∀X.B ≡ ∀X.⋀_{i=1..n} ∀~Y_i.(B_i ⇒ Z_i) ≡ ⋀_{i=1..n} ∀X.∀~Y_i.(B_i ⇒ Z_i) = ⋀([∀X.∀~Y_i.(B_i ⇒ Z_i)]_{i=1..n}) = ⋀(PF(A))

Lemma 4.5 states the stability of prime factors under equivalence, and Lemma 4.6 states a kind of reciprocal result.

Definition 4.4. [A_1, ..., A_n] ∼ [B_1, ..., B_m] if n = m and A_i ≡ B_{p(i)}, for i = 1, ..., n and p a permutation on {1, ..., n}.

Lemma 4.5.
For all A, B such that A ≡ B, we have PF(A) ∼ PF(B).

Proof. First we check that PF(A ∧ B) ∼ PF(B ∧ A), and similarly for the other five isomorphisms. Then we prove by structural induction that if A and B are equivalent in one step, then PF(A) ∼ PF(B). We conclude by induction on the length of the derivation of the equivalence A ≡ B.

Lemma 4.6.
For all multisets R, S such that R ∼ S, we have ⋀(R) ≡ ⋀(S).

Lemma 4.7.
For all ~X, ~Z, A, B, Y, W such that ∀~X.(A ⇒ Y) ≡ ∀~Z.(B ⇒ W), we have ~X = ~Z, A ≡ B, and Y = W.

Proof. By simple inspection of the isomorphisms.
Lemma 4.8.
For all A, B, C_1, C_2 such that A ⇒ B ≡ C_1 ∧ C_2, there exist B_1, B_2 such that C_1 ≡ A ⇒ B_1, C_2 ≡ A ⇒ B_2, and B ≡ B_1 ∧ B_2.

Proof. By Lemma 4.5, PF(A ⇒ B) ∼ PF(C_1 ∧ C_2) = PF(C_1) ⊎ PF(C_2). By Lemma 4.2, let PF(B) = [∀~X_i.(D_i ⇒ Z_i)]_{i=1..n}, PF(C_1) = [∀~Y_j.(E_j ⇒ Z′_j)]_{j=1..k}, and PF(C_2) = [∀~Y_j.(E_j ⇒ Z′_j)]_{j=k+1..m}. Hence, [∀~X_i.((A ∧ D_i) ⇒ Z_i)]_{i=1..n} ∼ [∀~Y_j.(E_j ⇒ Z′_j)]_{j=1..m}. So, by definition of ∼, n = m and, for i = 1, ..., n and a permutation p, we have ∀~X_i.((A ∧ D_i) ⇒ Z_i) ≡ ∀~Y_{p(i)}.(E_{p(i)} ⇒ Z′_{p(i)}). Hence, by Lemma 4.7, we have ~X_i = ~Y_{p(i)}, A ∧ D_i ≡ E_{p(i)}, and Z_i = Z′_{p(i)}. Thus, there exists I with I ∪ Ī = {1, ..., n} such that

PF(C_1) = [∀~Y_{p(i)}.(E_{p(i)} ⇒ Z′_{p(i)})]_{i∈I}
PF(C_2) = [∀~Y_{p(i)}.(E_{p(i)} ⇒ Z′_{p(i)})]_{i∈Ī}

Therefore, by Corollary 4.3,

C_1 ≡ ⋀_{i∈I} ∀~Y_{p(i)}.(E_{p(i)} ⇒ Z′_{p(i)}) ≡ ⋀_{i∈I} ∀~X_i.((A ∧ D_i) ⇒ Z_i)   and   C_2 ≡ ⋀_{i∈Ī} ∀~X_i.((A ∧ D_i) ⇒ Z_i)

Let B_1 = ⋀_{i∈I} ∀~X_i.(D_i ⇒ Z_i) and B_2 = ⋀_{i∈Ī} ∀~X_i.(D_i ⇒ Z_i). So C_1 ≡ A ⇒ B_1 and C_2 ≡ A ⇒ B_2. In addition, also by Corollary 4.3, we have B ≡ ⋀_{i=1..n} ∀~X_i.(D_i ⇒ Z_i) ≡ B_1 ∧ B_2.

The proofs of the following two lemmas are similar to the proof of Lemma 4.8. Full details are given in Appendix A.

Lemma 4.9.
For all X, A, B, C such that ∀X.A ≡ B ∧ C, there exist B′, C′ such that B ≡ ∀X.B′, C ≡ ∀X.C′, and A ≡ B′ ∧ C′.

Lemma 4.10.
For all X, A, B, C such that ∀X.A ≡ B ⇒ C, there exists C′ such that C ≡ ∀X.C′ and A ≡ B ⇒ C′.

Since the calculus is presented in Church style, PSI excluding rule (≡) is syntax-directed. Therefore, the generation lemma (Lemma 4.12) is straightforward, and we have the following unicity lemma (whose proof is given in Appendix A):

Lemma 4.11 (Unicity modulo). For all Γ, r, A, B such that Γ ⊢ r : A and Γ ⊢ r : B, we have A ≡ B.

Lemma 4.12 (Generation). For all Γ, x, r, s, X, A, B:
1. If Γ ⊢ x : A and Γ ⊢ x : B, then A ≡ B.
2. If Γ ⊢ λx^A.r : B, then there exists C such that Γ, x : A ⊢ r : C and B ≡ A ⇒ C.
3. If Γ ⊢ rs : A, then there exists C such that Γ ⊢ r : C ⇒ A and Γ ⊢ s : C.
4. If Γ ⊢ ⟨r, s⟩ : A, then there exist C, D such that A ≡ C ∧ D, Γ ⊢ r : C, and Γ ⊢ s : D.
5. If Γ ⊢ π_A(r) : B, then A ≡ B and there exists C such that Γ ⊢ r : B ∧ C.
6. If Γ ⊢ ΛX.r : A, then there exists C such that A ≡ ∀X.C, Γ ⊢ r : C, and X ∉ FTV(Γ).
7. If Γ ⊢ r[A] : B, then there exists C such that [X := A]C ≡ B and Γ ⊢ r : ∀X.C.

The detailed proofs of Lemma 4.13 (Substitution) and Theorem 4.14 (Subject Reduction) are given in Appendix A.
Lemma 4.13 (Substitution).
1. For all Γ, x, r, s, A, B such that Γ, x : B ⊢ r : A and Γ ⊢ s : B, we have Γ ⊢ [x := s]r : A.
2. For all Γ, r, X, A, B such that Γ ⊢ r : A, we have [X := B]Γ ⊢ [X := B]r : [X := B]A.

Theorem 4.14 (Subject reduction). For all Γ, r, s, A such that Γ ⊢ r : A and r ↪ s or r ⇄ s, we have Γ ⊢ s : A.

In this section we prove the strong normalisation of the relation →, that is, that every reduction sequence starting from a typed term eventually terminates. The set of typed strongly normalising terms with respect to the reduction → is written SN. The size of the longest reduction sequence starting from t is written |t|.

We extend the proof for System I [12] to polymorphism. To prove that every term is in SN, we associate, as usual, a set ⟦A⟧ of strongly normalising terms to each type A. A term ⊢ r : A is said to be reducible when r ∈ ⟦A⟧. We then prove an adequacy theorem stating that every well-typed term is reducible.

The set ⟦A_1 ⇒ A_2 ⇒ ··· ⇒ A_n ⇒ X⟧ can be defined either as the set of terms r such that for all s ∈ ⟦A_1⟧, rs ∈ ⟦A_2 ⇒ ··· ⇒ A_n ⇒ X⟧, or, equivalently, as the set of terms r such that for all s_i ∈ ⟦A_i⟧, rs_1...s_n ∈ ⟦X⟧ = SN. To prove that a term of the form λx^A.t is reducible, we need the so-called CR3 property [20] in the first case, and the property that a term all of whose one-step reducts are in SN is itself in SN, in the second. In PSI, an introduction can be equivalent to an elimination, e.g. ⟨rt, st⟩ ⇄ ⟨r, s⟩t; hence, we cannot define a notion of neutral term and obtain an equivalent of the CR3 property. Therefore, we use the second definition, and since reduction depends on types, the set ⟦A⟧ is defined as a set of typed terms.

Before proving the normalisation of PSI, we reformulate the proof of strong normalisation of System F along these lines.

5.1 Normalisation of System F

Definition 5.1 (Elimination context). Consider an extension of the language where we introduce an extra symbol •_A, called a hole of type A.
We define the set of elimination contexts with a hole •_A as the smallest set such that (where K^B_A denotes an elimination context of type B with a hole of type A):
- •_A is an elimination context of type A;
- if K^{B⇒C}_A is an elimination context of type B ⇒ C with a hole of type A, and r ∈ SN is a term of type B, then K^{B⇒C}_A r is an elimination context of type C with a hole of type A;
- if K^{∀X.B}_A is an elimination context of type ∀X.B with a hole of type A, then K^{∀X.B}_A[C] is an elimination context of type [X := C]B with a hole of type A.

We write K^B_A •t• for [•_A := t](K^B_A), where •_A is the hole of K^B_A. In particular, t may itself be an elimination context. Notice that the shape of every context K^B_A is •_A α_1 ... α_n, where each α_i is either a term or a type argument.

Example 5.2.
Let

K^X_X = •_X
K′^X_{X⇒X} = K^X_X •(•_{X⇒X} x)•
K′′^X_{∀X.(X⇒X)} = K′^X_{X⇒X} •(•_{∀X.(X⇒X)}[X])• = K^X_X •(•_{∀X.(X⇒X)}[X] x)•

Then K′′^X_{∀X.(X⇒X)} •ΛX.λy^X.y• = (ΛX.λy^X.y)[X]x.

Definition 5.3 (Terms occurring in an elimination context). Let K^B_A be an elimination context. The multiset of terms occurring in K^B_A is defined as:

T(•_A) = ∅
T(K^{B⇒C}_A r) = T(K^{B⇒C}_A) ⊎ {r}
T(K^{∀X.B}_A[C]) = T(K^{∀X.B}_A)

We write |K^B_A| for Σ_{i=1..n} |r_i|, where [r_1, ..., r_n] = T(K^B_A).

Definition 5.4 (Reducibility). The set of reducible terms of type A (notation ⟦A⟧) is defined as the set of terms t of type A such that for any elimination context K^X_A where all the terms in T(K^X_A) are in SN, we have K^X_A •t• ∈ SN.

Lemma 5.5.
For all A, ⟦A⟧ ⊆ SN.

Proof. For every A, there exists an elimination context K^X_A, since variables are in SN and they can have any type. Hence, given that if r ∈ ⟦A⟧ then K^X_A •r• ∈ SN, we have r ∈ SN.

Lemma 5.6 (Adequacy of variables). For all A and x^A, we have x^A ∈ ⟦A⟧.

Proof. Let K^X_A = •_A α_1 ... α_n, where for all i such that α_i is a term, we have α_i ∈ SN; then for all x, K^X_A •x• ∈ SN.

Lemma 5.7 (Adequacy of application). For all r, s, A, B such that r ∈ ⟦A ⇒ B⟧ and s ∈ ⟦A⟧, we have rs ∈ ⟦B⟧.

Proof. We need to prove that for every elimination context K^X_B, we have K^X_B •rs• ∈ SN. Since s ∈ ⟦A⟧ ⊆ SN, all the terms of K′^X_{A⇒B} = K^X_B •(•_{A⇒B} s)• are in SN, and since r ∈ ⟦A ⇒ B⟧, we have K′^X_{A⇒B} •r• = K^X_B •rs• ∈ SN.

Lemma 5.8 (Adequacy of abstraction). For all t, r, x, A, B such that t ∈ ⟦A⟧ and [x := t]r ∈ ⟦B⟧, we have λx^A.r ∈ ⟦A ⇒ B⟧.

Proof. We need to prove that for every elimination context K^X_{A⇒B}, we have K^X_{A⇒B} •λx^A.r• ∈ SN, that is, that all its one-step reducts are in SN. By Lemma 5.6, x ∈ ⟦A⟧, so r ∈ ⟦B⟧ ⊆ SN. We conclude by induction on |r| + |K^X_{A⇒B}|.

Lemma 5.9 (Adequacy of type application). For all r, X, A, B such that r ∈ ⟦∀X.A⟧, we have r[B] ∈ ⟦[X := B]A⟧.

Proof. We need to prove that for every elimination context K^Y_{[X:=B]A}, we have K^Y_{[X:=B]A} •r[B]• ∈ SN. Let K′^Y_{∀X.A} = K^Y_{[X:=B]A} •(•_{∀X.A}[B])•; since r ∈ ⟦∀X.A⟧, we have K′^Y_{∀X.A} •r• = K^Y_{[X:=B]A} •r[B]• ∈ SN.

Lemma 5.10 (Adequacy of type abstraction). For all r, X, A, B such that [X := B]r ∈ ⟦[X := B]A⟧, we have ΛX.r ∈ ⟦∀X.A⟧.

Proof. We need to prove that for every elimination context K^Y_{∀X.A}, we have K^Y_{∀X.A} •ΛX.r• ∈ SN, that is, that all its one-step reducts are in SN. Since [X := B]r ∈ ⟦[X := B]A⟧ ⊆ SN and every term in T(K^Y_{∀X.A}) is in SN, all its one-step reducts are in SN.

Definition 5.11 (Adequate substitution).
A substitution θ is adequate for a context Γ (notation θ ⊨ Γ) if for all x : A ∈ Γ, we have θ(x) ∈ ⟦A⟧.

Theorem 5.12 (Adequacy). For all Γ, r, A, and substitutions θ such that Γ ⊢ r : A and θ ⊨ Γ, we have θr ∈ ⟦A⟧.

Proof. By induction on r, using Lemmas 5.6 to 5.10.

Theorem 5.13 (Strong normalisation). For all Γ, r, A such that Γ ⊢ r : A, we have r ∈ SN.

Proof. By Lemma 5.6, the identity substitution is adequate. Thus, by Theorem 5.12 and Lemma 5.5, r ∈ ⟦A⟧ ⊆ SN.

The size of a term is not invariant under the equivalence ⇄: for example, counting the number of lambda abstractions, λx^A.⟨r, s⟩ differs from ⟨λx^A.r, λx^A.s⟩. Hence we introduce a measure M(·) on terms.

Definition 5.14 (Measure on terms).

P(x) = 0
P(λx^A.r) = P(r)
P(rs) = P(r)
P(⟨r, s⟩) = 1 + P(r) + P(s)
P(π_A(r)) = P(r)
P(ΛX.r) = P(r)
P(r[A]) = P(r)

M(x) = 1
M(λx^A.r) = 1 + M(r) + P(r)
M(rs) = M(r) + M(s) + P(r)M(s)
M(⟨r, s⟩) = M(r) + M(s)
M(π_A(r)) = 1 + M(r) + P(r)
M(ΛX.r) = 1 + M(r) + P(r)
M(r[A]) = 1 + M(r) + P(r)
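Definition 5.14 can be transcribed directly, which also lets us spot-check the invariance claims of Lemmas 5.15 and 5.16 on concrete instances. This is our own sketch; terms are tuples, and types and binder names are omitted since P and M never inspect them:

```python
# P and M from Definition 5.14, over terms encoded as tuples:
# ("var",), ("lam", r), ("app", r, s), ("pair", r, s),
# ("proj", r), ("tlam", r), ("tapp", r).

def P(t):
    k = t[0]
    if k == "var":  return 0
    if k == "lam":  return P(t[1])
    if k == "app":  return P(t[1])                    # P(rs) = P(r)
    if k == "pair": return 1 + P(t[1]) + P(t[2])
    if k in ("proj", "tlam", "tapp"): return P(t[1])
    raise ValueError(k)

def M(t):
    k = t[0]
    if k == "var":  return 1
    if k == "lam":  return 1 + M(t[1]) + P(t[1])
    if k == "app":  return M(t[1]) + M(t[2]) + P(t[1]) * M(t[2])
    if k == "pair": return M(t[1]) + M(t[2])
    if k in ("proj", "tlam", "tapp"): return 1 + M(t[1]) + P(t[1])
    raise ValueError(k)

x = ("var",)
# (DIST_λ): λx.⟨x, x⟩ ⇄ ⟨λx.x, λx.x⟩ have equal measure:
assert M(("lam", ("pair", x, x))) == M(("pair", ("lam", x), ("lam", x)))
# (CURRY): (x x)x ⇄ x⟨x, x⟩:
assert M(("app", ("app", x, x), x)) == M(("app", x, ("pair", x, x)))
# P is invariant as well (Lemma 5.15):
assert P(("lam", ("pair", x, x))) == P(("pair", ("lam", x), ("lam", x)))
```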
For all r, s such that r ⇄ s, we have P(r) = P(s).
Proof. We check the case of each rule of Table 3, and then conclude by structural induction to handle the contextual closure.
• (comm): P(⟨r, s⟩) = 1 + P(r) + P(s) = P(⟨s, r⟩)
• (asso): P(⟨⟨r, s⟩, t⟩) = 2 + P(r) + P(s) + P(t) = P(⟨r, ⟨s, t⟩⟩)
• (dist_λ): P(λx^A.⟨r, s⟩) = 1 + P(r) + P(s) = P(⟨λx^A.r, λx^A.s⟩)
• (dist_app): P(⟨r, s⟩t) = 1 + P(r) + P(s) = P(⟨rt, st⟩)
• (curry): P((rs)t) = P(r) = P(r⟨s, t⟩)
• (p-comm ∀i⇒i): P(ΛX.λx^A.r) = P(r) = P(λx^A.ΛX.r)
• (p-comm ∀e⇒i): P((λx^A.r)[B]) = P(r) = P(λx^A.r[B])
• (p-dist ∀i∧i): P(ΛX.⟨r, s⟩) = 1 + P(r) + P(s) = P(⟨ΛX.r, ΛX.s⟩)
• (p-dist ∀e∧i): P(⟨r, s⟩[A]) = 1 + P(r) + P(s) = P(⟨r[A], s[A]⟩)
• (p-dist ∀i∧e): P(π_{∀X.A}(ΛX.r)) = P(r) = P(ΛX.π_A(r))
• (p-dist ∀e∧e): P((π_{∀X.B}(r))[A]) = P(r) = P(π_{[X:=A]B}(r[A]))

Lemma 5.16.
For all r, s such that r ⇄ s, we have M(r) = M(s).
Proof. We check the case of each rule of Table 3, and then conclude by structural induction to handle the contextual closure.
• (comm): M(⟨r, s⟩) = M(r) + M(s) = M(⟨s, r⟩)
• (asso): M(⟨⟨r, s⟩, t⟩) = M(r) + M(s) + M(t) = M(⟨r, ⟨s, t⟩⟩)
• (dist_λ): M(λx^A.⟨r, s⟩) = 2 + M(r) + M(s) + P(r) + P(s) = M(⟨λx^A.r, λx^A.s⟩)
• (dist_app): M(⟨r, s⟩t) = M(r) + M(s) + 2M(t) + P(r)M(t) + P(s)M(t) = M(⟨rt, st⟩)
• (curry): M((rs)t) = M(r) + M(s) + P(r)M(s) + M(t) + P(r)M(t) = M(r⟨s, t⟩)
• (p-comm ∀i⇒i): M(ΛX.λx^A.r) = 2 + M(r) + 2P(r) = M(λx^A.ΛX.r)
• (p-comm ∀e⇒i): M((λx^A.r)[B]) = 2 + M(r) + 2P(r) = M(λx^A.r[B])
• (p-dist ∀i∧i): M(ΛX.⟨r, s⟩) = 2 + M(r) + M(s) + P(r) + P(s) = M(⟨ΛX.r, ΛX.s⟩)
• (p-dist ∀e∧i): M(⟨r, s⟩[A]) = 2 + M(r) + P(r) + M(s) + P(s) = M(⟨r[A], s[A]⟩)
• (p-dist ∀i∧e): M(π_{∀X.A}(ΛX.r)) = 2 + M(r) + 2P(r) = M(ΛX.π_A(r))
• (p-dist ∀e∧e): M((π_{∀X.B}(r))[A]) = 2 + M(r) + 2P(r) = M(π_{[X:=A]B}(r[A]))

Lemma 5.17.
For all r, s, X, A:
M(λx^A.r) > M(r)    M(rs) > M(r)    M(rs) > M(s)    M(⟨r, s⟩) > M(r)
M(⟨r, s⟩) > M(s)    M(π_A(r)) > M(r)    M(ΛX.r) > M(r)    M(r[A]) > M(r)
Proof. For all t, M(t) ≥ 1. We conclude by case inspection.
When typed lambda-calculus is extended with pairs, proving that if r₁ ∈ SN and r₂ ∈ SN then ⟨r₁, r₂⟩ ∈ SN is easy. However, in System I and PSI this property (Lemma 5.20) is harder to prove, as it requires a characterisation of the terms equivalent to the product ⟨r₁, r₂⟩ (Lemma 5.18) and of all the reducts of this term (Lemma 5.19).

Lemma 5.18.
For all r, s, t such that ⟨r, s⟩ ⇄* t, we have either
1. t = ⟨u, v⟩ where either (a) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with r ⇄* ⟨t₁, t₃⟩ and s ⇄* ⟨t₂, t₄⟩, or (b) v ⇄* ⟨w, s⟩ with r ⇄* ⟨u, w⟩, or any of the three symmetric cases, or (c) r ⇄* u and s ⇄* v, or the symmetric case.
2. t = λx^A.a and a ⇄* ⟨a₁, a₂⟩ with r ⇄* λx^A.a₁ and s ⇄* λx^A.a₂.
3. t = av and a ⇄* ⟨a₁, a₂⟩, with r ⇄* a₁v and s ⇄* a₂v.
4. t = ΛX.a and a ⇄* ⟨a₁, a₂⟩ with r ⇄* ΛX.a₁ and s ⇄* ΛX.a₂.
5. t = a[A] and a ⇄* ⟨a₁, a₂⟩, with r ⇄* a₁[A] and s ⇄* a₂[A].
Proof. By a double induction, first on M(t) and then on the length of the relation ⇄*. Full details are given in Appendix B.

Lemma 5.19.
For all r₁, r₂, s, t such that ⟨r₁, r₂⟩ ⇄* s ↪ t, there exist u₁, u₂ such that t ⇄* ⟨u₁, u₂⟩ and either
1. r₁ ↪ u₁ and r₂ ↪ u₂,
2. r₁ ↪ u₁ and r₂ ⇄* u₂, or
3. r₁ ⇄* u₁ and r₂ ↪ u₂.
Proof. By induction on M(⟨r₁, r₂⟩). Full details are given in Appendix B.

Lemma 5.20.
For all r₁, r₂ such that r₁ ∈ SN and r₂ ∈ SN, we have ⟨r₁, r₂⟩ ∈ SN.
Proof. By Lemma 5.19, from a reduction sequence starting from ⟨r₁, r₂⟩ we can extract one starting from r₁, from r₂, or from both. Hence, this reduction sequence is finite.

Definition 5.21 (Elimination context). Consider an extension of the language where we introduce an extra symbol ••_A, called the hole of type A. We define the set of elimination contexts with a hole ••_A as the smallest set such that:
• ••_A is an elimination context of type A,
• if K^{B⇒C}_A is an elimination context of type B ⇒ C with a hole of type A and r ∈ SN is a term of type B, then K^{B⇒C}_A r is an elimination context of type C with a hole of type A,
• if K^{B∧C}_A is an elimination context of type B ∧ C with a hole of type A, then π_B(K^{B∧C}_A) is an elimination context of type B with a hole of type A,
• and if K^{∀X.B}_A is an elimination context of type ∀X.B with a hole of type A, then K^{∀X.B}_A[C] is an elimination context of type [X := C]B with a hole of type A.
We write K^B_A•t• for [••_A := t](K^B_A), where ••_A is the hole of K^B_A. In particular, t may be an elimination context.

Example 5.22.
Let K^X_X = ••_X,
K′^X_{X⇒(X∧X)} = K^X_X•π_X(••_{X⇒(X∧X)} x)•,
K″^X_{∀X.X⇒(X∧X)} = K′^X_{X⇒(X∧X)}•••_{∀X.X⇒(X∧X)}[X]• = K^X_X•π_X(••_{∀X.X⇒(X∧X)}[X] x)•.
Then, K″^X_{∀X.X⇒(X∧X)}•ΛX.λy^X.⟨y, y⟩• = π_X((ΛX.λy^X.⟨y, y⟩)[X] x).

Definition 5.23 (Terms occurring in an elimination context). Let K^B_A be an elimination context. The multiset of terms occurring in K^B_A is defined as
T(••_A) = ∅
T(K^{B⇒C}_A r) = T(K^{B⇒C}_A) ⊎ {r}
T(π_B(K^{B∧C}_A)) = T(K^{B∧C}_A)
T(K^{∀X.B}_A[C]) = T(K^{∀X.B}_A)
We write |K^B_A| for Σ^n_{i=1} |rᵢ| where [r₁, . . . , rₙ] = T(K^B_A).

Example 5.24. T(••_A rs) = [r, s] and T(••_A⟨r, s⟩) = [⟨r, s⟩]. Remark that K^B_A•t• ⇄* K′^B_A•t• does not imply T(K^B_A) ∼ T(K′^B_A).

Definition 5.25 (Reducibility). The set of reducible terms of type A (notation ⟦A⟧) is defined as the set of terms t of type A such that for any elimination context K^X_A where all the terms in T(K^X_A) are in SN, we have K^X_A•t• ∈ SN.
The following lemma is a trivial consequence of the definition of reducibility.

Lemma 5.26.
For all A, B such that A ≡ B, we have ⟦A⟧ = ⟦B⟧.

Lemma 5.27.
For all A, ⟦A⟧ ⊆ SN.
Proof. For all A, there exists an elimination context K^X_A with all its terms in SN, since variables are in SN and they can have any type. Hence, given that if r ∈ ⟦A⟧ then K^X_A•r• ∈ SN, we have r ∈ SN.

We finally prove the adequacy theorem (Theorem 5.36), showing that every typed term is reducible, and the strong normalisation theorem (Theorem 5.37) as a consequence of it.
Lemma 5.28 (Adequacy of variables). For all A and x^A, we have x^A ∈ ⟦A⟧.
Proof. We need to prove that K^X_A•x• ∈ SN. The term K^X_A•x• has the variable x in a position that does not create any redex, hence the only redexes are those in T(K^X_A), which are already in SN. Then, K^X_A•x• ∈ SN.

Lemma 5.29 (Adequacy of projection). For all r, A, B such that r ∈ ⟦A ∧ B⟧, we have π_A(r) ∈ ⟦A⟧.
Proof. We need to prove that K^X_A•π_A(r)• ∈ SN. Take K′^X_{A∧B} = K^X_A•π_A(••_{A∧B})•; since r ∈ ⟦A ∧ B⟧, we have K′^X_{A∧B}•r• = K^X_A•π_A(r)• ∈ SN.

Lemma 5.30 (Adequacy of application). For all r, s, A, B such that r ∈ ⟦A ⇒ B⟧ and s ∈ ⟦A⟧, we have rs ∈ ⟦B⟧.
Proof. We need to prove that K^X_B•rs• ∈ SN. Take K′^X_{A⇒B} = K^X_B•••_{A⇒B} s•; since r ∈ ⟦A ⇒ B⟧, we have K′^X_{A⇒B}•r• = K^X_B•rs• ∈ SN.

Lemma 5.31 (Adequacy of type application). For all r, X, A, B such that r ∈ ⟦∀X.A⟧, we have r[B] ∈ ⟦[X := B]A⟧.
Proof. We need to prove that K^Y_{[X:=B]A}•r[B]• ∈ SN. Take K′^Y_{∀X.A} = K^Y_{[X:=B]A}•••_{∀X.A}[B]• ∈ SN; since r ∈ ⟦∀X.A⟧, we have K′^Y_{∀X.A}•r• = K^Y_{[X:=B]A}•r[B]• ∈ SN.

Lemma 5.32 (Adequacy of product). For all r, s, A, B such that r ∈ ⟦A⟧ and s ∈ ⟦B⟧, we have ⟨r, s⟩ ∈ ⟦A ∧ B⟧.
Proof. We need to prove that K^X_{A∧B}•⟨r, s⟩• ∈ SN. We proceed by induction on the number of projections in K^X_{A∧B}. Since the hole of K^X_{A∧B} has type A ∧ B, and K^X_{A∧B}•t• has type X for any t of type A ∧ B, we can assume, without loss of generality, that the context K^X_{A∧B} has the form K′^X_C•π_C(••_{A∧B} α₁ . . . αₙ)•, where each αᵢ is either a term or a type argument. We prove that K′^X_C•π_C(⟨rα₁ . . . αₙ, sα₁ . . . αₙ⟩)• ∈ SN by showing, more generally, that if r′ and s′ are two reducts of rα₁ . . . αₙ and sα₁ . . . αₙ, then K′^X_C•π_C(⟨r′, s′⟩)• ∈ SN.
For this, we show that all its one-step reducts are in SN, by induction on |K′^X_C| + |r′| + |s′|. The full details are given in Appendix C.

Lemma 5.33 (Adequacy of abstraction). For all t, r, x, A, B such that t ∈ ⟦A⟧ and [x := t]r ∈ ⟦B⟧, we have λx^A.r ∈ ⟦A ⇒ B⟧.
Proof. By induction on M(r).
• If r ⇄* ⟨r₁, r₂⟩, then by Lemma 4.12, we have B ≡ B₁ ∧ B₂ with r₁ of type B₁ and r₂ of type B₂, and so by Lemma 4.13, [x := t]r₁ has type B₁ and [x := t]r₂ has type B₂. Since [x := t]r ∈ ⟦B⟧, we have ⟨[x := t]r₁, [x := t]r₂⟩ ∈ ⟦B⟧. By Lemma 5.29, [x := t]r₁ ∈ ⟦B₁⟧ and [x := t]r₂ ∈ ⟦B₂⟧. By the induction hypothesis, λx^A.r₁ ∈ ⟦A ⇒ B₁⟧ and λx^A.r₂ ∈ ⟦A ⇒ B₂⟧, thus, by Lemma 5.32, λx^A.r ⇄* ⟨λx^A.r₁, λx^A.r₂⟩ ∈ ⟦(A ⇒ B₁) ∧ (A ⇒ B₂)⟧. Finally, by Lemma 5.26, we have ⟦(A ⇒ B₁) ∧ (A ⇒ B₂)⟧ = ⟦A ⇒ B⟧.
• If r is not equivalent to any product ⟨r₁, r₂⟩, we need to prove that for any elimination context K^X_{A⇒B}, we have K^X_{A⇒B}•λx^A.r• ∈ SN. Since r and all the terms in T(K^X_{A⇒B}) are in SN, we proceed by induction on the lexicographical order of (|K^X_{A⇒B}| + |r|, M(r)) to show that all the one-step reducts of K^X_{A⇒B}•λx^A.r• are in SN. Since r is not a product, its only one-step reducts are the following.
– A term where the reduction took place in one of the terms in T(K^X_{A⇒B}) or in r, and so we apply the induction hypothesis.
– K′^X_B•[x := s]r•, with K^X_{A⇒B}•λx^A.r• = K′^X_B•(λx^A.r)s•. As [x := s]r ∈ ⟦B⟧, we have K′^X_B•[x := s]r• ∈ SN.
– K′^X_{A⇒B′}•λx^A.[X := C]r′•, with r ⇄* ΛX.r′, B ≡ ∀X.B′, and K^X_{A⇒B}•λx^A.ΛX.r′• equal to K′^X_{A⇒B′}•(λx^A.ΛX.r′)[C]•. Since M([X := C]r′) < M(ΛX.r′), we apply the induction hypothesis.

Lemma 5.34 (Adequacy of type abstraction). For all r, X, A, B such that [X := B]r ∈ ⟦[X := B]A⟧, we have ΛX.r ∈ ⟦∀X.A⟧.
Proof.
We proceed by induction on M(r), with a proof similar to that of Lemma 5.33. Full details are given in Appendix C.

Definition 5.35 (Adequate substitution). A substitution θ is adequate for a context Γ (notation θ ⊨ Γ) if for all x : A ∈ Γ, we have θ(x) ∈ ⟦A⟧.

Theorem 5.36 (Adequacy). For all Γ, r, A, and substitution θ such that Γ ⊢ r : A and θ ⊨ Γ, we have θr ∈ ⟦A⟧.
Proof. By induction on r.
• If r is a variable x : A ∈ Γ, then, since θ ⊨ Γ, we have θr ∈ ⟦A⟧.
• If r is a product ⟨s, t⟩, then by Lemma 4.12, Γ ⊢ s : B, Γ ⊢ t : C, and A ≡ B ∧ C, thus, by the induction hypothesis, θs ∈ ⟦B⟧ and θt ∈ ⟦C⟧. By Lemma 5.32, ⟨θs, θt⟩ ∈ ⟦B ∧ C⟧, hence by Lemma 5.26, θr ∈ ⟦A⟧.
• If r is a projection π_A(s), then by Lemma 4.12, Γ ⊢ s : A ∧ B, and by the induction hypothesis, θs ∈ ⟦A ∧ B⟧. By Lemma 5.29, π_A(θs) ∈ ⟦A⟧, hence θr ∈ ⟦A⟧.
• If r is an abstraction λx^B.s, with Γ ⊢ s : C, then by Lemma 4.12, A ≡ B ⇒ C, hence by the induction hypothesis, for all θ and for all t ∈ ⟦B⟧, [x := t](θs) ∈ ⟦C⟧. Hence, by Lemma 5.33, λx^B.θs ∈ ⟦B ⇒ C⟧, so, by Lemma 5.26, θr ∈ ⟦A⟧.
• If r is an application st, then by Lemma 4.12, Γ ⊢ s : B ⇒ A and Γ ⊢ t : B, thus, by the induction hypothesis, θs ∈ ⟦B ⇒ A⟧ and θt ∈ ⟦B⟧. Hence, by Lemma 5.30, we have θr = θsθt ∈ ⟦A⟧.
• If r is a type abstraction ΛX.s, with Γ ⊢ s : B, then by Lemma 4.12, A ≡ ∀X.B, hence by the induction hypothesis, for all θ, θs ∈ ⟦B⟧. Hence, by Lemma 5.34, ΛX.θs ∈ ⟦∀X.B⟧, hence, by Lemma 5.26, θr ∈ ⟦A⟧.
• If r is a type application s[C], then by Lemma 4.12, Γ ⊢ s : ∀X.B with A ≡ [X := C]B, thus, by the induction hypothesis, θs ∈ ⟦∀X.B⟧. Hence, by Lemma 5.31, we have θr = θs[C] ∈ ⟦A⟧.

Theorem 5.37 (Strong normalisation). For all Γ, r, A such that Γ ⊢ r : A, we have r ∈ SN.
Proof. By Lemma 5.28, the identity substitution is adequate. Thus, by Theorem 5.36 and Lemma 5.27, r ∈ ⟦A⟧ ⊆ SN.
System I is a simply-typed lambda calculus with pairs, extended with an equational theory obtained from considering the type isomorphisms as equalities. In this way, the system allows a programmer to focus on the meaning of programs, ignoring the rigid syntax of terms, within the safe context provided by type isomorphisms. In this paper we have extended System I with polymorphism and its corresponding isomorphisms, enriching the language with a feature that most programmers expect.
From a logical perspective, System I is a proof system for propositional logic, where isomorphic propositions have the same proofs, and PSI extends System I with the universal quantifier.
The main theorems in this paper prove subject reduction (Theorem 4.14) and strong normalisation (Theorem 5.37). The proof of the latter is a non-trivial adaptation of Girard's proof [20] for System F.
As mentioned in Section 2, two isomorphisms for System F with pairs, as defined by Di Cosmo [9], are not considered explicitly: isomorphisms (7) and (8). However, isomorphism (7) is just α-equivalence, which has been given implicitly, and so it has indeed been considered. The isomorphism that actually was not considered is (8), which allows swapping type abstractions: ∀X.∀Y.A ≡ ∀Y.∀X.A. This isomorphism is analogous to the isomorphism A ⇒ B ⇒ C ≡ B ⇒ A ⇒ C at the first-order level, which is a consequence of isomorphisms (4) and (1). At the first-order level, the isomorphism induces the following equivalence:
(λx^A.λy^B.r) s t ⇄ (λx^A.λy^B.r)⟨s, t⟩ ⇄ (λx^A.λy^B.r)⟨t, s⟩ ⇄ (λx^A.λy^B.r) t s
An alternative approach would have been to introduce an equivalence between λx^A.λy^B.r and λy^B.λx^A.r. However, in any case, to keep subject reduction, the β_λ reduction must verify that the type of the argument matches the type of the variable before reducing. This solution is not easily implementable for the β_Λ reduction, since it involves using the type as a labelling for the term and the variable, to identify which term corresponds to which variable (leaving the possibility of non-determinism if the "labellings" are duplicated), but at the level of types we do not have a natural labelling.
Another alternative solution, in the same direction, is the one implemented by the selective lambda calculus [18], where only arrows, and not conjunctions, were considered, and so only the isomorphism A ⇒ B ⇒ C ≡ B ⇒ A ⇒ C is treated. In the selective lambda calculus the solution is indeed to include external labellings (not types) to identify which argument is being used each time.
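The first-order equivalence chain above can be mimicked in an ordinary language with explicit adapters (a Python sketch with names of our choosing; PSI's point is precisely that these coercions happen silently there):

```python
# Explicit versions of the coercions PSI applies implicitly: currying
# (isomorphism (4)) and pair commutativity (isomorphism (1)).
def curry(f):   return lambda a: lambda b: f((a, b))
def uncurry(g): return lambda p: g(p[0])(p[1])
def comm(f):    return lambda p: f((p[1], p[0]))  # precompose with ⟨s,t⟩ ↦ ⟨t,s⟩

def swap_args(g):
    """A ⇒ B ⇒ C turned into B ⇒ A ⇒ C, following the chain above:
    uncurry to a pair-taking function, commute the pair, curry back."""
    return curry(comm(uncurry(g)))

sub = lambda x: lambda y: x - y
assert sub(10)(3) == 7
assert swap_args(sub)(3)(10) == 7
```

In PSI the argument types (A versus B) determine which component goes to which variable, which is why the same trick has no easy analogue for type abstractions: type arguments carry no such natural labelling.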
We could have added a labelling to type applications, t[A_X], together with the rule r[A_X][B_Y] ⇄ r[B_Y][A_X], modifying β_Λ to (ΛX.r)[A_X] ↪ [X := A]r. Although such a solution seems to work, we found that it does not contribute to the language in any aspect, while it does make the system less readable. Therefore, we have decided to exclude isomorphism (8) from PSI.

Future work

An extended fragment of an early version [10] of System I has been implemented [15] in Haskell. In that implementation, we added some ad-hoc rules in order to have a progression property (that is, having only introductions as normal forms of closed terms). For example: "if s has type B then (λx^A.λy^B.r)s ↪ λx^A.((λy^B.r)s)". Such a rule, among others introduced in this implementation, is a particular case of a more general η-expansion rule. Certainly, with the rule t ↪ λx^A.tx we can derive
(λx^A.λy^B.r)s ↪ λz^A.(λx^A.λy^B.r)sz ⇄* λz^A.(λx^A.λy^B.r)zs ↪ λz^A.((λy^B.[x := z]r)s)
In [13] we have shown that it is indeed the case that all the ad-hoc rules from [10] can be lifted by adding extensional rules.
In addition, the proof of the consistency of PSI as a language of proof-terms for second-order logic has been intentionally left out of this paper. Indeed, as shown in [12], it would require restricting variables to only have "prime types", that is, non-conjunctive types. Such a restriction has also been shown to be unnecessary when the language is extended with eta rules [13]. Therefore, we preferred to delay the proof of consistency to a future version of PSI with η-rules.
The mentioned implementation of an early version of System I included a fixpoint operator and numbers, showing some interesting programming examples. We plan to extend that implementation with polymorphism, following the design of PSI.
References

[1] Pablo Arrighi and Alejandro Díaz-Caro. A System F accounting for scalars. LMCS, 8(1:11):1–32, 2012.
[2] Pablo Arrighi, Alejandro Díaz-Caro, and Benoît Valiron. The vectorial lambda-calculus. Inf. and Comp., 254(1):105–139, 2017.
[3] Pablo Arrighi and Gilles Dowek. Lineal: A linear-algebraic lambda-calculus. LMCS, 13(1:8):1–33, 2017.
[4] Gérard Boudol. Lambda-calculi for (strict) parallel functions. Inf. and Comp., 108(1):51–127, 1994.
[5] Antonio Bucciarelli, Thomas Ehrhard, and Giulio Manzonetto. A relational semantics for parallelism and non-determinism in a functional setting. APAL, 163(7):918–934, 2012.
[6] Thierry Coquand and Gérard Huet. The calculus of constructions. Inf. and Comp., 76(2–3):95–120, 1988.
[7] Ugo de'Liguoro and Adolfo Piperno. Non deterministic extensions of untyped λ-calculus. Inf. and Comp., 122(2):149–177, 1995.
[8] Mariangiola Dezani-Ciancaglini, Ugo de'Liguoro, and Adolfo Piperno. A filter model for concurrent λ-calculus. SIAM JComp., 27(5):1376–1419, 1998.
[9] Roberto Di Cosmo. Isomorphisms of types: from λ-calculus to information retrieval and language design. Progress in Theoretical Computer Science. Birkhäuser, Switzerland, 1995.
[10] Alejandro Díaz-Caro and Gilles Dowek. Non determinism through type isomorphism. EPTCS (LSFA'12), 113:137–144, 2013.
[11] Alejandro Díaz-Caro and Gilles Dowek. Typing quantum superpositions and measurement. LNCS (TPNC'17), 10687:281–293, 2017.
[12] Alejandro Díaz-Caro and Gilles Dowek. Proof normalisation in a logic identifying isomorphic propositions. LIPIcs (FSCD'19), 131:14:1–14:23, 2019.
[13] Alejandro Díaz-Caro and Gilles Dowek. Extensional proofs in a propositional logic modulo isomorphisms. arXiv:2002.03762, 2020.
[14] Alejandro Díaz-Caro, Mauricio Guillermo, Alexandre Miquel, and Benoît Valiron. Realizability in the unitary sphere. In Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2019), pages 1–13, Vancouver, BC, Canada, 2019. IEEE.
[15] Alejandro Díaz-Caro and Pablo E. Martínez López. Isomorphisms considered as equalities: Projecting functions and enhancing partial application through an implementation of λ+. ACM IFL, 2015(9):1–11, 2015.
[16] Gilles Dowek, Thérèse Hardin, and Claude Kirchner. Theorem proving modulo. JAR, 31(1):33–72, 2003.
[17] Gilles Dowek and Benjamin Werner. Proof normalization modulo. JSL, 68(4):1289–1316, 2003.
[18] Jacques Garrigue and Hassan Aït-Kaci. The typed polymorphic label-selective λ-calculus. In Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '94, pages 35–47, New York, NY, USA, 1994. Association for Computing Machinery.
[19] Herman Geuvers, Robbert Krebbers, James McKinna, and Freek Wiedijk. Pure type systems without explicit contexts. In Karl Crary and Marino Miculan, editors, Proceedings of LFMTP 2010, volume 34 of EPTCS, pages 53–67, 2010.
[20] Jean-Yves Girard, Paul Taylor, and Yves Lafont. Proofs and types. Cambridge U.P., UK, 1989.
[21] Per Martin-Löf. Intuitionistic type theory. Bibliopolis, Napoli, Italy, 1984.
[22] Michele Pagani and Simona Ronchi Della Rocca. Linearity, non-determinism and solvability. Fund. Inf., 103(1–4):173–202, 2010.
[23] Jonghyun Park, Jeongbong Seo, Sungwoo Park, and Gyesik Lee. Mechanizing metatheory without typing contexts. Journal of Automated Reasoning, 52(2):215–239, 2014.
[24] Mikael Rittri. Retrieving library identifiers via equational matching of types. In Proceedings of CADE 1990, volume 449 of LNCS, pages 603–617, 1990.
[25] The Univalent Foundations Program. HoTT: Univalent Foundations of Mathematics. Institute for Advanced Study, Princeton, NJ, USA, 2013.
[26] Lionel Vaux. The algebraic lambda calculus. MSCS, 19(5):1029–1059, 2009.
A Detailed proofs of Section 4
Lemma 4.9.
For all
X, A, B, C such that ∀X.A ≡ B ∧ C, there exist B′, C′ such that B ≡ ∀X.B′, C ≡ ∀X.C′, and A ≡ B′ ∧ C′.
Proof. By Lemma 4.5, PF(∀X.A) ∼ PF(B ∧ C) = PF(B) ⊎ PF(C). By Lemma 4.2, let PF(A) = [∀Ỹ_i.(A_i ⇒ Z_i)]^n_{i=1}, PF(B) = [∀W̃_j.(D_j ⇒ Z′_j)]^k_{j=1}, and PF(C) = [∀W̃_j.(D_j ⇒ Z′_j)]^m_{j=k+1}.
Hence, [∀X.∀Ỹ_i.(A_i ⇒ Z_i)]^n_{i=1} ∼ [∀W̃_j.(D_j ⇒ Z′_j)]^m_{j=1}. So, by definition of ∼, n = m and, for i = 1, . . . , n and a permutation p, we have ∀X.∀Ỹ_i.(A_i ⇒ Z_i) ≡ ∀W̃_{p(i)}.(D_{p(i)} ⇒ Z′_{p(i)}). Thus, by Lemma 4.7, we have X, Ỹ_i = W̃_{p(i)}, A_i ≡ D_{p(i)}, and Z_i = Z′_{p(i)}. Therefore, there exists I with I ∪ Ī = {1, . . . , n} such that PF(B) = [∀W̃_{p(i)}.(D_{p(i)} ⇒ Z′_{p(i)})]_{i∈I} and PF(C) = [∀W̃_{p(i)}.(D_{p(i)} ⇒ Z′_{p(i)})]_{i∈Ī}.
Hence, by Corollary 4.3, we have B ≡ ⋀_{i∈I} ∀W̃_{p(i)}.(D_{p(i)} ⇒ Z′_{p(i)}) ≡ ⋀_{i∈I} ∀X.∀Ỹ_i.(A_i ⇒ Z_i) and C ≡ ⋀_{i∈Ī} ∀X.∀Ỹ_i.(A_i ⇒ Z_i).
Let B′ = ⋀_{i∈I} ∀Ỹ_i.(A_i ⇒ Z_i) and C′ = ⋀_{i∈Ī} ∀Ỹ_i.(A_i ⇒ Z_i). So, B ≡ ∀X.B′ and C ≡ ∀X.C′. Hence, also by Corollary 4.3, we have A ≡ ⋀^n_{i=1} ∀Ỹ_i.(A_i ⇒ Z_i) ≡ B′ ∧ C′.

Lemma 4.10.
For all
X, A, B, C such that ∀X.A ≡ B ⇒ C, there exists C′ such that C ≡ ∀X.C′ and A ≡ B ⇒ C′.
Proof. By Lemma 4.5, PF(∀X.A) ∼ PF(B ⇒ C). By Lemma 4.2, let PF(A) = [∀Ỹ_i.(A_i ⇒ Z_i)]^n_{i=1} and PF(C) = [∀W̃_j.(D_j ⇒ Z′_j)]^m_{j=1}. Hence, [∀X.∀Ỹ_i.(A_i ⇒ Z_i)]^n_{i=1} ∼ [∀W̃_j.((B ∧ D_j) ⇒ Z′_j)]^m_{j=1}. So, by definition of ∼, n = m and, for i = 1, . . . , n and a permutation p, we have ∀X.∀Ỹ_i.(A_i ⇒ Z_i) ≡ ∀W̃_{p(i)}.((B ∧ D_{p(i)}) ⇒ Z′_{p(i)}). Hence, by Lemma 4.7, we have X, Ỹ_i = W̃_{p(i)}, A_i ≡ B ∧ D_{p(i)}, and Z_i = Z′_{p(i)}. Hence, by Corollary 4.3,
C ≡ ⋀^n_{j=1} ∀W̃_j.(D_j ⇒ Z′_j) ≡ ⋀^n_{i=1} ∀W̃_{p(i)}.(D_{p(i)} ⇒ Z′_{p(i)}) ≡ ⋀^n_{i=1} ∀X.∀Ỹ_i.(D_{p(i)} ⇒ Z_i)
Let C′ = ⋀^n_{i=1} ∀Ỹ_i.(D_{p(i)} ⇒ Z_i). So, C ≡ ∀X.C′. Hence, also by Corollary 4.3, we have
A ≡ ⋀^n_{i=1} ∀Ỹ_i.(A_i ⇒ Z_i) ≡ ⋀^n_{i=1} ∀Ỹ_i.((B ∧ D_{p(i)}) ⇒ Z_i) ≡ B ⇒ ⋀^n_{i=1} ∀Ỹ_i.(D_{p(i)} ⇒ Z_i) ≡ B ⇒ C′

Lemma 4.11 (Unicity modulo). For all Γ, r, A, B such that Γ ⊢ r : A and Γ ⊢ r : B, we have A ≡ B.
Proof.
• If the last rule of the derivation of Γ ⊢ r : A is (≡), then we have a shorter derivation of Γ ⊢ r : C with C ≡ A, and, by the induction hypothesis, C ≡ B, hence A ≡ B.
• If the last rule of the derivation of Γ ⊢ r : B is (≡), we proceed in the same way.
• All the remaining cases are syntax directed.
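The prime-factor machinery PF(·) used in these proofs can be made concrete for the quantifier-free fragment. Below is a hedged Python sketch (the representation is entirely ours): a type's factors form a multiset of pairs (premises, head), where the premises are themselves a frozen multiset, so that comm, assoc, curry, and dist all collapse to multiset equality:

```python
from collections import Counter

# Types: ("var", X), ("and", A, B), ("imp", A, B).
def freeze(c):
    """A hashable snapshot of a Counter (multiset)."""
    return frozenset(c.items())

def pf(A):
    """Multiset of prime factors; each factor is (frozen premise multiset, head)."""
    tag = A[0]
    if tag == "var":
        return Counter([(freeze(Counter()), A[1])])
    if tag == "and":                       # PF(A ∧ B) = PF(A) ⊎ PF(B)
        return pf(A[1]) + pf(A[2])
    # tag == "imp": push the premise into every factor of the codomain,
    # which realises curry and dist at once.
    prem = pf(A[1])
    out = Counter()
    for (ps, head), n in pf(A[2]).items():
        out[(freeze(Counter(dict(ps)) + prem), head)] += n
    return out

def iso(A, B):
    return pf(A) == pf(B)

a, b, c = ("var", "A"), ("var", "B"), ("var", "C")
# curry: (A ∧ B) ⇒ C ≡ A ⇒ B ⇒ C, and the swap A ⇒ B ⇒ C ≡ B ⇒ A ⇒ C
assert iso(("imp", ("and", a, b), c), ("imp", a, ("imp", b, c)))
assert iso(("imp", a, ("imp", b, c)), ("imp", b, ("imp", a, c)))
# dist: A ⇒ (B ∧ C) ≡ (A ⇒ B) ∧ (A ⇒ C)
assert iso(("imp", a, ("and", b, c)), ("and", ("imp", a, b), ("imp", a, c)))
assert not iso(("imp", a, c), ("imp", c, a))
```

Lemmas 4.9 and 4.10 extend this picture with ∀: a quantifier on a conjunction or an arrow codomain can be pushed onto each prime factor, which is what the proofs above track via the permutation p.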
Lemma 4.13 (Substitution) .
1. For all Γ, x, r, s, A, B such that Γ, x : B ⊢ r : A and Γ ⊢ s : B, we have Γ ⊢ [x := s]r : A.
2. For all Γ, r, X, A, B such that Γ ⊢ r : A, we have [X := B]Γ ⊢ [X := B]r : [X := B]A.
Proof.
1. By structural induction on r.
• Let r = x. By Lemma 4.12, A ≡ B, thus Γ ⊢ s : A. Since [x := s]x = s, we have Γ ⊢ [x := s]x : A.
• Let r = y, with y ≠ x. Since [x := s]y = y, we have Γ ⊢ [x := s]y : A.
• Let r = λx^C.t. We have [x := s](λx^C.t) = λx^C.t, so Γ ⊢ [x := s](λx^C.t) : A.
• Let r = λy^C.t, with y ≠ x. By Lemma 4.12, A ≡ C ⇒ D and Γ, y : C ⊢ t : D. By the induction hypothesis, Γ, y : C ⊢ [x := s]t : D, and so, by rule (⇒_i), Γ ⊢ λy^C.[x := s]t : C ⇒ D. Since λy^C.[x := s]t = [x := s](λy^C.t), using rule (≡), Γ ⊢ [x := s](λy^C.t) : A.
• Let r = tu. By Lemma 4.12, Γ ⊢ t : C ⇒ A and Γ ⊢ u : C. By the induction hypothesis, Γ ⊢ [x := s]t : C ⇒ A and Γ ⊢ [x := s]u : C, and so, by rule (⇒_e), Γ ⊢ ([x := s]t)([x := s]u) : A. Since ([x := s]t)([x := s]u) = [x := s](tu), we have Γ ⊢ [x := s](tu) : A.
• Let r = ⟨t, u⟩. By Lemma 4.12, Γ ⊢ t : C and Γ ⊢ u : D, with A ≡ C ∧ D. By the induction hypothesis, Γ ⊢ [x := s]t : C and Γ ⊢ [x := s]u : D, and so, by rule (∧_i), Γ ⊢ ⟨[x := s]t, [x := s]u⟩ : C ∧ D. Since ⟨[x := s]t, [x := s]u⟩ = [x := s]⟨t, u⟩, using rule (≡), we have Γ ⊢ [x := s]⟨t, u⟩ : A.
• Let r = π_A(t). By Lemma 4.12, Γ ⊢ t : A ∧ C. By the induction hypothesis, Γ ⊢ [x := s]t : A ∧ C, and so, by rule (∧_e), Γ ⊢ π_A([x := s]t) : A. Since π_A([x := s]t) = [x := s](π_A(t)), we have Γ ⊢ [x := s](π_A(t)) : A.
• Let r = ΛX.t. By Lemma 4.12, A ≡ ∀X.C and Γ ⊢ t : C. By the induction hypothesis, Γ ⊢ [x := s]t : C, and so, by rule (∀_i), Γ ⊢ ΛX.[x := s]t : ∀X.C. Since ΛX.[x := s]t = [x := s](ΛX.t), using rule (≡), we have Γ ⊢ [x := s](ΛX.t) : A.
• Let r = t[C]. By Lemma 4.12, A ≡ [X := C]D and Γ ⊢ t : ∀X.D. By the induction hypothesis, Γ ⊢ [x := s]t : ∀X.D, and so, by rule (∀_e), Γ ⊢ ([x := s]t)[C] : [X := C]D.
Since ([x := s]t)[C] = [x := s](t[C]), using rule (≡), we have Γ ⊢ [x := s](t[C]) : A.
2. By induction on the typing relation.
• (ax): Let Γ, x : A ⊢ x : A. Then, using rule (ax), we have [X := B]Γ, x : [X := B]A ⊢ [X := B]x : [X := B]A.
• (≡): Let Γ ⊢ r : A, with A ≡ C. By the induction hypothesis, [X := B]Γ ⊢ [X := B]r : [X := B]C. Since A ≡ C, [X := B]A ≡ [X := B]C. Using rule (≡), we have [X := B]Γ ⊢ [X := B]r : [X := B]A.
• (⇒_i): Let Γ ⊢ λx^C.t : C ⇒ D. By the induction hypothesis, [X := B]Γ, x : [X := B]C ⊢ [X := B]t : [X := B]D. Using rule (⇒_i), [X := B]Γ ⊢ λx^{[X:=B]C}.[X := B]t : [X := B]C ⇒ [X := B]D. Since λx^{[X:=B]C}.[X := B]t = [X := B](λx^C.t), we have [X := B]Γ ⊢ [X := B](λx^C.t) : [X := B](C ⇒ D).
• (⇒_e): Let Γ ⊢ ts : D. By the induction hypothesis, [X := B]Γ ⊢ [X := B]t : [X := B](C ⇒ D) and [X := B]Γ ⊢ [X := B]s : [X := B]C. Since [X := B](C ⇒ D) = [X := B]C ⇒ [X := B]D, using rule (⇒_e), we have [X := B]Γ ⊢ ([X := B]t)([X := B]s) : [X := B]D. Since ([X := B]t)([X := B]s) = [X := B](ts), we have [X := B]Γ ⊢ [X := B](ts) : [X := B]D.
• (∧_i): Let Γ ⊢ ⟨t, s⟩ : C ∧ D. By the induction hypothesis, [X := B]Γ ⊢ [X := B]t : [X := B]C and [X := B]Γ ⊢ [X := B]s : [X := B]D. Using rule (∧_i), [X := B]Γ ⊢ ⟨[X := B]t, [X := B]s⟩ : [X := B]C ∧ [X := B]D. Since ⟨[X := B]t, [X := B]s⟩ = [X := B]⟨t, s⟩ and [X := B]C ∧ [X := B]D = [X := B](C ∧ D), we have [X := B]Γ ⊢ [X := B]⟨t, s⟩ : [X := B](C ∧ D).
• (∧_e): Let Γ ⊢ t : C ∧ D. By the induction hypothesis, [X := B]Γ ⊢ [X := B]t : [X := B](C ∧ D). Since [X := B](C ∧ D) = [X := B]C ∧ [X := B]D, using rule (∧_e) we have [X := B]Γ ⊢ π_{[X:=B]C}([X := B]t) : [X := B]C.
Since π [ X := B ] C ([ X := B ] t ) =[ X := B ] π C ( t ), we have [ X := B ]Γ ⊢ [ X := B ] π C ( t ) : [ X := B ]( C ). • ( ∀ i ): Let Γ ⊢ Λ Y.t : ∀ Y.C , with X F T V (Γ). By the induction hypothesis, [ X := B ]Γ ⊢ [ X := B ] t : [ X := B ] C . Since X F T V (Γ), X F V ([ X := B ]Γ). Using rule ( ∀ i ), wehave [ X := B ]Γ ⊢ Λ Y. [ X := B ] t : Λ Y. [ X := B ] C . Since Λ Y. [ X := B ] t = [ X := B ]Λ Y.t , and ∀ Y. [ X := B ] C = [ X := B ] ∀ Y.C , we have [ X := B ]Γ ⊢ [ X := B ]Λ Y.t : [ X := B ] ∀ Y.C . • ( ∀ e ): Let Γ ⊢ t [ D ] : [ Y := D ] C . By the induction hypothesis, [ X := B ]Γ ⊢ [ X := B ] t : [ X := B ] ∀ Y.C . Since [ X := B ] ∀ Y.C = ∀ Y. [ X := B ] C , using rule ( ∀ e ), we have [ X := B ]Γ ⊢ ([ X := B ] t )[[ X := B ] D ] : [ Y := [ X := B ] D ][ X := B ] C .Since ([ X := B ] t )[[ X := B ] D ] = [ X := B ]( t [ D ]), and [ Y := [ X := B ] D ][ X := B ] C =[ X := B ][ Y := D ] C , we have [ X := B ]Γ ⊢ [ X := B ]( t [ D ]) : [ X := B ][ Y := D ] C . Theorem 4.14 (Subject reduction) . For all Γ , r, s, A such that Γ ⊢ r : A and r ֒ → s or r ⇄ s , we have Γ ⊢ s : A .Proof. By induction on the rewrite relation.(
COMM ) : h t, r i ⇄ h r, t i → ) 1. Γ ⊢ h t, r i : A (Hypothesis)2. A ≡ B ∧ C Γ ⊢ t : B Γ ⊢ r : C (1, Lemma 4.12)3. B ∧ C ≡ C ∧ B (Iso. (1))4. Γ ⊢ r : C Γ ⊢ t : B ( ∧ i )Γ ⊢ h r, t i : C ∧ B [3] ( ≡ )Γ ⊢ h r, t i : B ∧ C [2] ( ≡ )Γ ⊢ h r, t i : A ( ← ) analogous to ( → ).( ASSO ) : h t, h r, s ii ⇄ hh t, r i , s i ( → ) 1. Γ ⊢ h t, h r, s ii : A (Hypothesis)2. A ≡ B ∧ C Γ ⊢ t : B Γ ⊢ h r, s i : C (1, Lemma 4.12)3. C ≡ D ∧ E Γ ⊢ r : D Γ ⊢ s : E (2, Lemma 4.12)4. B ∧ ( D ∧ E ) ≡ ( B ∧ D ) ∧ E (Iso. (2))5. A ≡ B ∧ ( D ∧ E ) (2, 3, congr. ( ≡ ))6. Γ ⊢ t : B Γ ⊢ r : D ( ∧ i )Γ ⊢ h t, r i : B ∧ D Γ ⊢ s : E ( ∧ i )Γ ⊢ hh t, r i , s i : ( B ∧ D ) ∧ E [4] ( ≡ )Γ ⊢ hh t, r i , s i : B ∧ ( D ∧ E )[5] ( ≡ )Γ ⊢ hh t, r i , s i : A ( ← ) analogous to ( → ).( DIST λ ) : λx A . h t, r i ⇄ h λx A .t, λx A .r i ( → ) 1. Γ ⊢ λx A . h t, r i : B (Hypothesis)2. B ≡ A ⇒ C Γ , x : A ⊢ h t, r i : C (1, Lemma 4.12)3. C ≡ D ∧ E Γ , x : A ⊢ t : D Γ , x : A ⊢ r : E (2, Lemma 4.12)4. A ⇒ ( D ∧ E ) ≡ ( A ⇒ D ) ∧ ( A ⇒ E ) (Iso. (3))5. B ≡ A ⇒ ( D ∧ E ) (2, 3, congr. ( ≡ ))6. Γ , x : A ⊢ t : D ( ⇒ i )Γ ⊢ λx A .t : A ⇒ D Γ , x : A ⊢ r : E ( ⇒ i )Γ ⊢ λx A .r : A ⇒ E ( ∧ i )Γ ⊢ h λx A .t, λx A .r i : ( A ⇒ D ) ∧ ( A ⇒ E )[4] ( ≡ )Γ ⊢ h λx A .t, λx A .r i : A ⇒ ( D ∧ E )[5] ( ≡ )Γ ⊢ h λx A .t, λx A .r i : B ( ← ) 22. Γ ⊢ h λx A .t, λx A .r i : B (Hypothesis)2. B ≡ C ∧ D Γ ⊢ λx A .t : C Γ ⊢ λx A .r : D (1, Lemma 4.12)3. C ≡ A ⇒ C ′ Γ , x : A ⊢ t : C ′ (2, Lemma 4.12)4. D ≡ A ⇒ D ′ Γ , x : A ⊢ r : D ′ (2, Lemma 4.12)5. ( A ⇒ C ′ ) ∧ ( A ⇒ D ′ ) ≡ A ⇒ ( C ′ ∧ D ′ ) (Iso. (3))6. B ≡ ( A ⇒ C ′ ) ∧ ( A ⇒ D ′ ) (2, 3, 4, congr. ( ≡ ))7. Γ , x : A ⊢ t : C ′ Γ , x : A ⊢ r : D ′ ( ∧ i )Γ , x : A ⊢ h t, r i : C ′ ∧ D ′ ( ⇒ i )Γ ⊢ λx A . h t, r i : A ⇒ ( C ′ ∧ D ′ )[5] ( ≡ )Γ ⊢ λx A . h t, r i : ( A ⇒ C ′ ) ∧ ( A ⇒ D ′ )[6] ( ≡ )Γ ⊢ λx A . h t, r i : B ( DISTapp ) : h t, r i s ⇄ h ts, rs i ( → ) 1. Γ ⊢ h t, r i s : A (Hypothesis)2. Γ ⊢ h t, r i : B ⇒ A Γ ⊢ s : B (1, Lemma 4.12)3. 
B ⇒ A ≡ C ∧ D Γ ⊢ t : C Γ ⊢ r : D (2, Lemma 4.12)4. C ≡ B ⇒ C ′ D ≡ B ⇒ D ′ A ≡ C ′ ∧ D ′ (3, Lemma 4.8)5. Γ ⊢ t : C [4] ( ≡ )Γ ⊢ t : B ⇒ C ′ Γ ⊢ s : B ( ⇒ e )Γ ⊢ ts : C ′
6. Γ ⊢ r : D [4] ( ≡ )Γ ⊢ r : B ⇒ D ′ Γ ⊢ s : B ( ⇒ e )Γ ⊢ rs : D ′
7. (5)Γ ⊢ ts : C ′ (6)Γ ⊢ rs : D ′ ( ∧ i )Γ ⊢ h ts, rs i : C ′ ∧ D ′ [4] ( ≡ )Γ ⊢ h ts, rs i : A ( ← ) 1. Γ ⊢ h ts, rs i : A (Hypothesis)2. A ≡ B ∧ C Γ ⊢ ts : B Γ ⊢ rs : C (1, Lemma 4.12)3. Γ ⊢ t : D ⇒ B Γ ⊢ s : D (2, Lemma 4.12)4. Γ ⊢ r : E ⇒ B Γ ⊢ s : E (2, Lemma 4.12)23. D ≡ E (3, 4, Lemma 4.11)6. D ⇒ ( B ∧ C ) ≡ ( D ⇒ B ) ∧ ( D ⇒ C ) (Iso. (3))7. E ⇒ C ≡ D ⇒ C (6, congr. ( ≡ ))8. Γ ⊢ t : D ⇒ B Γ ⊢ r : E ⇒ C [7] ( ≡ )Γ ⊢ r : D ⇒ C ( ∧ i )Γ ⊢ h t, r i : ( D ⇒ B ) ∧ ( D ⇒ C )[5] ( ≡ )Γ ⊢ h t, r i : D ⇒ ( B ∧ C ) ( ⇒ e )Γ ⊢ h t, r i s : B ∧ C [2] ( ≡ )Γ ⊢ h t, r i s : A ( CURRY ) : t h r, s i ⇄ trs ( → ) 1. Γ ⊢ t h r, s i : A (Hypothesis)2. Γ ⊢ t : B ⇒ A Γ ⊢ h t, r i : B (1, Lemma 4.12)3. B ≡ C ∧ D Γ ⊢ r : C Γ ⊢ s : D (2, Lemma 4.12)4. B ⇒ A ≡ ( C ∧ D ) ⇒ A (3, congr. ( ≡ ))5. ( C ∧ D ) ⇒ A ≡ C ⇒ ( D ⇒ A ) (Iso. (4))6. Γ ⊢ t : B ⇒ A [4] ( ≡ )Γ ⊢ t : ( C ∧ D ) ⇒ A [5] ( ≡ )Γ ⊢ t : C ⇒ ( D ⇒ A ) Γ ⊢ r : C ( ⇒ e )Γ ⊢ tr : D ⇒ A
7. From Γ ⊢ tr : D ⇒ A (6) and Γ ⊢ s : D, rule (⇒e) gives Γ ⊢ trs : A.

(←) 1. Γ ⊢ trs : A (Hypothesis)
2. Γ ⊢ tr : B ⇒ A, Γ ⊢ s : B (1, Lemma 4.12)
3. Γ ⊢ t : C ⇒ (B ⇒ A), Γ ⊢ r : C (2, Lemma 4.12)
4. C ⇒ (B ⇒ A) ≡ (C ∧ B) ⇒ A (Iso. (4))
5. From Γ ⊢ t : C ⇒ (B ⇒ A), by (≡) with [4], Γ ⊢ t : (C ∧ B) ⇒ A; from Γ ⊢ r : C and Γ ⊢ s : B, rule (∧i) gives Γ ⊢ ⟨r, s⟩ : C ∧ B; rule (⇒e) gives Γ ⊢ t⟨r, s⟩ : A.

(P-COMM∀i⇒i): ΛX.λx^A.t ⇄ λx^A.ΛX.t

(→) 1. X ∉ FTV(A) (Hypothesis)
2. Γ ⊢ ΛX.λx^A.t : B (Hypothesis)
3. B ≡ ∀X.C, Γ ⊢ λx^A.t : C, X ∉ FTV(Γ) (2, Lemma 4.12)
4. C ≡ A ⇒ D, Γ, x : A ⊢ t : D (3, Lemma 4.12)
5. ∀X.(A ⇒ D) ≡ A ⇒ ∀X.D (1, Iso. (5))
6. ∀X.C ≡ ∀X.(A ⇒ D) (4, congr. (≡))
7. From Γ, x : A ⊢ t : D, by [1, 3] rule (∀i) gives Γ, x : A ⊢ ΛX.t : ∀X.D; rule (⇒i) gives Γ ⊢ λx^A.ΛX.t : A ⇒ ∀X.D; by (≡) with [5], Γ ⊢ λx^A.ΛX.t : ∀X.(A ⇒ D); by (≡) with [6], Γ ⊢ λx^A.ΛX.t : ∀X.C; by (≡) with [3], Γ ⊢ λx^A.ΛX.t : B.

(←) 1. X ∉ FTV(A) (Hypothesis)
2. Γ ⊢ λx^A.ΛX.t : B (Hypothesis)
3. B ≡ A ⇒ C, Γ, x : A ⊢ ΛX.t : C (2, Lemma 4.12)
4. C ≡ ∀X.D, Γ, x : A ⊢ t : D, X ∉ FTV(Γ) ∪ FTV(A) (3, Lemma 4.12)
5. ∀X.(A ⇒ D) ≡ A ⇒ ∀X.D (1, Iso. (5))
6. A ⇒ C ≡ A ⇒ ∀X.D (4, congr. (≡))
7. From Γ, x : A ⊢ t : D, rule (⇒i) gives Γ ⊢ λx^A.t : A ⇒ D; by [4] rule (∀i) gives Γ ⊢ ΛX.λx^A.t : ∀X.(A ⇒ D); by (≡) with [5], Γ ⊢ ΛX.λx^A.t : A ⇒ ∀X.D; by (≡) with [6], Γ ⊢ ΛX.λx^A.t : A ⇒ C; by (≡) with [3], Γ ⊢ ΛX.λx^A.t : B.

(P-COMM∀e⇒i): (λx^A.t)[B] ⇄ λx^A.t[B]

(→) 1. X ∉ FTV(A) (Hypothesis)
2. Γ ⊢ (λx^A.t)[B] : C (Hypothesis)
3. C ≡ [X := B]D, Γ ⊢ λx^A.t : ∀X.D (2, Lemma 4.12)
4. ∀X.D ≡ A ⇒ E, Γ, x : A ⊢ t : E (3, Lemma 4.12)
5. E ≡ ∀X.E′, D ≡ A ⇒ E′ (4, Lemma 4.10)
6. A ⇒ [X := B]E′ = [X := B](A ⇒ E′) (1, Def.)
7. [X := B](A ⇒ E′) ≡ [X := B]D (5, congr. (≡))
8. From Γ, x : A ⊢ t : E, by (≡) with [5], Γ, x : A ⊢ t : ∀X.E′; rule (∀e) gives Γ, x : A ⊢ t[B] : [X := B]E′; rule (⇒i) gives Γ ⊢ λx^A.t[B] : A ⇒ [X := B]E′; by [6], Γ ⊢ λx^A.t[B] : [X := B](A ⇒ E′); by (≡) with [7], Γ ⊢ λx^A.t[B] : [X := B]D; by (≡) with [3], Γ ⊢ λx^A.t[B] : C.

(←) 1.
X ∉ FTV(A) (Hypothesis)
2. Γ ⊢ λx^A.t[B] : C (Hypothesis)
3. C ≡ A ⇒ D, Γ, x : A ⊢ t[B] : D (2, Lemma 4.12)
4. D ≡ [X := B]E, Γ, x : A ⊢ t : ∀X.E (3, Lemma 4.12)
5. A ⇒ ∀X.E ≡ ∀X.(A ⇒ E) (1, Iso. (5))
6. [X := B](A ⇒ E) = A ⇒ [X := B]E (1, Def.)
7. A ⇒ [X := B]E ≡ A ⇒ D (4, congr. (≡))
8. From Γ, x : A ⊢ t : ∀X.E, rule (⇒i) gives Γ ⊢ λx^A.t : A ⇒ ∀X.E; by (≡) with [5], Γ ⊢ λx^A.t : ∀X.(A ⇒ E); rule (∀e) gives Γ ⊢ (λx^A.t)[B] : [X := B](A ⇒ E); by [6], Γ ⊢ (λx^A.t)[B] : A ⇒ [X := B]E; by (≡) with [7], Γ ⊢ (λx^A.t)[B] : A ⇒ D; by (≡) with [3], Γ ⊢ (λx^A.t)[B] : C.

(P-DIST∀i∧i): ΛX.⟨t, r⟩ ⇄ ⟨ΛX.t, ΛX.r⟩

(→) 1. Γ ⊢ ΛX.⟨t, r⟩ : A (Hypothesis)
2. A ≡ ∀X.B, Γ ⊢ ⟨t, r⟩ : B, X ∉ FTV(Γ) (1, Lemma 4.12)
3. B ≡ C ∧ D, Γ ⊢ t : C, Γ ⊢ r : D (2, Lemma 4.12)
4. ∀X.(C ∧ D) ≡ ∀X.C ∧ ∀X.D (Iso. (6))
5. ∀X.B ≡ ∀X.(C ∧ D) (3, congr. (≡))
6. From Γ ⊢ t : C, by [2] rule (∀i) gives Γ ⊢ ΛX.t : ∀X.C; from Γ ⊢ r : D, by [2] rule (∀i) gives Γ ⊢ ΛX.r : ∀X.D; rule (∧i) gives Γ ⊢ ⟨ΛX.t, ΛX.r⟩ : ∀X.C ∧ ∀X.D; by (≡) with [4], Γ ⊢ ⟨ΛX.t, ΛX.r⟩ : ∀X.(C ∧ D); by (≡) with [5], Γ ⊢ ⟨ΛX.t, ΛX.r⟩ : ∀X.B; by (≡) with [2], Γ ⊢ ⟨ΛX.t, ΛX.r⟩ : A.

(←) 1. Γ ⊢ ⟨ΛX.t, ΛX.r⟩ : A (Hypothesis)
2. A ≡ B ∧ C, Γ ⊢ ΛX.t : B, Γ ⊢ ΛX.r : C (1, Lemma 4.12)
3. B ≡ ∀X.D, Γ ⊢ t : D, X ∉ FTV(Γ) (2, Lemma 4.12)
4. C ≡ ∀X.E, Γ ⊢ r : E, X ∉ FTV(Γ) (2, Lemma 4.12)
5. ∀X.(D ∧ E) ≡ ∀X.D ∧ ∀X.E (Iso. (6))
6. ∀X.D ∧ ∀X.E ≡ B ∧ C (3, 4, congr. (≡))
7. From Γ ⊢ t : D and Γ ⊢ r : E, rule (∧i) gives Γ ⊢ ⟨t, r⟩ : D ∧ E; by [3] rule (∀i) gives Γ ⊢ ΛX.⟨t, r⟩ : ∀X.(D ∧ E); by (≡) with [5], Γ ⊢ ΛX.⟨t, r⟩ : ∀X.D ∧ ∀X.E; by (≡) with [6], Γ ⊢ ΛX.⟨t, r⟩ : B ∧ C; by (≡) with [2], Γ ⊢ ΛX.⟨t, r⟩ : A.

(P-DIST∀e∧i): ⟨t, r⟩[B] ⇄ ⟨t[B], r[B]⟩

(→) 1. Γ ⊢ ⟨t, r⟩[B] : A (Hypothesis)
2. A ≡ [X := B]C, Γ ⊢ ⟨t, r⟩ : ∀X.C (1, Lemma 4.12)
3. ∀X.C ≡ D ∧ E, Γ ⊢ t : D, Γ ⊢ r : E (2, Lemma 4.12)
4. D ≡ ∀X.D′, E ≡ ∀X.E′, C ≡ D′ ∧ E′ (3, Lemma 4.9)
5. [X := B](D′ ∧ E′) = [X := B]D′ ∧ [X := B]E′ (Def.)
6. [X := B]C ≡ [X := B](D′ ∧ E′) (4, congr. (≡))
7. From Γ ⊢ t : D, by (≡) with [4], Γ ⊢ t : ∀X.D′; rule (∀e) gives Γ ⊢ t[B] : [X := B]D′. From Γ ⊢ r : E, by (≡) with [4], Γ ⊢ r : ∀X.E′; rule (∀e) gives Γ ⊢ r[B] : [X := B]E′. Rule (∧i) gives Γ ⊢ ⟨t[B], r[B]⟩ : [X := B]D′ ∧ [X := B]E′; by [5], Γ ⊢ ⟨t[B], r[B]⟩ : [X := B](D′ ∧ E′); by (≡) with [6], Γ ⊢ ⟨t[B], r[B]⟩ : [X := B]C; by (≡) with [2], Γ ⊢ ⟨t[B], r[B]⟩ : A.

(←) 1. Γ ⊢ ⟨t[B], r[B]⟩ : A (Hypothesis)
2. A ≡ C ∧ D, Γ ⊢ t[B] : C, Γ ⊢ r[B] : D (1, Lemma 4.12)
3. C ≡ [X := B]C′, Γ ⊢ t : ∀X.C′ (2, Lemma 4.12)
4. D ≡ [X := B]D′, Γ ⊢ r : ∀X.D′ (2, Lemma 4.12)
5. ∀X.(C′ ∧ D′) ≡ ∀X.C′ ∧ ∀X.D′ (Iso. (6))
6. [X := B](C′ ∧ D′) = [X := B]C′ ∧ [X := B]D′ (Def.)
7. [X := B]C′ ∧ [X := B]D′ ≡ C ∧ D (3, 4, congr. (≡))
8. From Γ ⊢ t : ∀X.C′ and Γ ⊢ r : ∀X.D′, rule (∧i) gives Γ ⊢ ⟨t, r⟩ : ∀X.C′ ∧ ∀X.D′; by (≡) with [5], Γ ⊢ ⟨t, r⟩ : ∀X.(C′ ∧ D′); rule (∀e) gives Γ ⊢ ⟨t, r⟩[B] : [X := B](C′ ∧ D′); by [6], Γ ⊢ ⟨t, r⟩[B] : [X := B]C′ ∧ [X := B]D′; by (≡) with [7], Γ ⊢ ⟨t, r⟩[B] : C ∧ D; by (≡) with [2], Γ ⊢ ⟨t, r⟩[B] : A.

(P-DIST∀i∧e): π_{∀X.B}(ΛX.t) ⇄ ΛX.π_B(t)

(→) 1. Γ ⊢ π_{∀X.B}(ΛX.t) : A (Hypothesis)
2. A ≡ ∀X.B, Γ ⊢ ΛX.t : (∀X.B) ∧ C (1, Lemma 4.12)
3. (∀X.B) ∧ C ≡ ∀X.D, Γ ⊢ t : D, X ∉ FTV(Γ) (2, Lemma 4.12)
4. C ≡ ∀X.C′, D ≡ B ∧ C′ (3, Lemma 4.9)
5. From Γ ⊢ t : D, by (≡) with [4], Γ ⊢ t : B ∧ C′; rule (∧e) gives Γ ⊢ π_B(t) : B; by [3] rule (∀i) gives Γ ⊢ ΛX.π_B(t) : ∀X.B; by (≡) with [2], Γ ⊢ ΛX.π_B(t) : A.

(←) 1. Γ ⊢ ΛX.π_B(t) : A (Hypothesis)
2. A ≡ ∀X.C, Γ ⊢ π_B(t) : C, X ∉ FTV(Γ) (1, Lemma 4.12)
3. B ≡ C, Γ ⊢ t : C ∧ D (2, Lemma 4.12)
4. ∀X.(C ∧ D) ≡ ∀X.C ∧ ∀X.D (Iso. (6))
5. From Γ ⊢ t : C ∧ D, by [2] rule (∀i) gives Γ ⊢ ΛX.t : ∀X.(C ∧ D); by (≡) with [4], Γ ⊢ ΛX.t : ∀X.C ∧ ∀X.D; rule (∧e) gives Γ ⊢ π_{∀X.B}(ΛX.t) : ∀X.C; by (≡) with [2], Γ ⊢ π_{∀X.B}(ΛX.t) : A.

(P-DIST∀e∧e): (π_{∀X.B}(t))[C] ⇄ π_{[X:=C]B}(t[C])

(→) 1. Γ ⊢ t : ∀X.(B ∧ D) (Hypothesis)
2. Γ ⊢ (π_{∀X.B}(t))[C] : A (Hypothesis)
3. A ≡ [X := C]E, Γ ⊢ π_{∀X.B}(t) : ∀X.E (2, Lemma 4.12)
4. ∀X.E ≡ ∀X.B, Γ ⊢ t : ∀X.E ∧ F (3, Lemma 4.12)
5. E ≡ B (4)
6. [X := C](B ∧ D) = [X := C]B ∧ [X := C]D (Def.)
7. [X := C]B ≡ [X := C]E (5, congr. (≡))
8. From Γ ⊢ t : ∀X.(B ∧ D), rule (∀e) gives Γ ⊢ t[C] : [X := C](B ∧ D); by [6], Γ ⊢ t[C] : [X := C]B ∧ [X := C]D; rule (∧e) gives Γ ⊢ π_{[X:=C]B}(t[C]) : [X := C]B; by (≡) with [7], Γ ⊢ π_{[X:=C]B}(t[C]) : [X := C]E; by (≡) with [3], Γ ⊢ π_{[X:=C]B}(t[C]) : A.

(←) 1. Γ ⊢ t : ∀X.(B ∧ D) (Hypothesis)
2. Γ ⊢ π_{[X:=C]B}(t[C]) : A (Hypothesis)
3. A ≡ [X := C]B, Γ ⊢ t[C] : A ∧ E (2, Lemma 4.12)
4. ∀X.(B ∧ D) ≡ ∀X.B ∧ ∀X.D (Iso. (6))
5. From Γ ⊢ t : ∀X.(B ∧ D), by (≡) with [4], Γ ⊢ t : ∀X.B ∧ ∀X.D; rule (∧e) gives Γ ⊢ π_{∀X.B}(t) : ∀X.B; rule (∀e) gives Γ ⊢ (π_{∀X.B}(t))[C] : [X := C]B; by (≡) with [3], Γ ⊢ (π_{∀X.B}(t))[C] : A.

(βλ): If Γ ⊢ s : A, (λx^A.r)s ↪ [x := s]r
1. Γ ⊢ s : A (Hypothesis)
2. Γ ⊢ (λx^A.r)s : B (Hypothesis)
3. Γ ⊢ λx^A.r : A ⇒ B (2, Lemma 4.12)
4. A ⇒ B ≡ A ⇒ C, Γ, x : A ⊢ r : C (3, Lemma 4.12)
5. B ≡ C (4, congr. (≡))
6. Γ ⊢ [x := s]r : C (1, 4, Lemma 4.13)
7. Γ ⊢ [x := s]r : B (5, 6, rule (≡))

(βΛ): (ΛX.r)[A] ↪ [X := A]r
1. Γ ⊢ (ΛX.r)[A] : B (Hypothesis)
2. B ≡ [X := A]C, Γ ⊢ ΛX.r : ∀X.C (1, Lemma 4.12)
3. ∀X.C ≡ ∀X.D, Γ ⊢ r : D, X ∉ FTV(Γ) (2, Lemma 4.12)
4. C ≡ D (3)
5. Γ ⊢ r : C (4, rule (≡))
6. Γ ⊢ [X := A]r : [X := A]C (5, Lemma 4.13; since X ∉ FTV(Γ), [X := A]Γ = Γ)
7. Γ ⊢ [X := A]r : B (2, 6, rule (≡))

(π): If Γ ⊢ r : A, π_A(⟨r, s⟩) ↪ r
1. Γ ⊢ r : A (Hypothesis)
2. Γ ⊢ π_A(⟨r, s⟩) : B (Hypothesis)
3. B ≡ A, Γ ⊢ ⟨r, s⟩ : A ∧ C (2, Lemma 4.12)
4. Γ ⊢ r : B (1, 3, rule (≡))

B Detailed proofs of Section 5.3
Lemma 5.18.
For all r, s, t such that ⟨r, s⟩ ⇄* t, we have either

1. t = ⟨u, v⟩ where either
(a) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with r ⇄* ⟨t₁, t₃⟩ and s ⇄* ⟨t₂, t₄⟩, or
(b) v ⇄* ⟨w, s⟩ with r ⇄* ⟨u, w⟩, or any of the three symmetric cases, or
(c) r ⇄* u and s ⇄* v, or the symmetric case.
2. t = λx^A.a and a ⇄* ⟨a₁, a₂⟩ with r ⇄* λx^A.a₁ and s ⇄* λx^A.a₂.
3. t = av and a ⇄* ⟨a₁, a₂⟩, with r ⇄* a₁v and s ⇄* a₂v.
4. t = ΛX.a and a ⇄* ⟨a₁, a₂⟩ with r ⇄* ΛX.a₁ and s ⇄* ΛX.a₂.
5. t = a[A] and a ⇄* ⟨a₁, a₂⟩, with r ⇄* a₁[A] and s ⇄* a₂[A].

Proof. By a double induction, first on M(t) and then on the length of the relation ⇄*. Consider an equivalence proof ⟨r, s⟩ ⇄* t′ ⇄ t with a shorter proof ⟨r, s⟩ ⇄* t′. By the second induction hypothesis, the term t′ has the form prescribed by the lemma. We consider the five cases and, in each case, the possible rules transforming t′ into t.

1. Let ⟨r, s⟩ ⇄* ⟨u, v⟩ ⇄ t. The possible equivalences from ⟨u, v⟩ are
• t = ⟨u′, v⟩ or ⟨u, v′⟩ with u ⇄ u′ and v ⇄ v′, and so the term t is in case 1.
• Rules (COMM) and (ASSO) preserve the conditions of case 1.
• t = λx^A.⟨u′, v′⟩, with u = λx^A.u′ and v = λx^A.v′, and so the term t is in case 2.
• t = ⟨u′, v′⟩t′, with u = u′t′ and v = v′t′, and so the term t is in case 3.
• t = ΛX.⟨u′, v′⟩, with u = ΛX.u′ and v = ΛX.v′, and so the term t is in case 4.
• t = ⟨u′, v′⟩[A], with u = u′[A] and v = v′[A], and so the term t is in case 5.

2. Let ⟨r, s⟩ ⇄* λx^A.a ⇄ t, with a ⇄* ⟨a₁, a₂⟩, r ⇄* λx^A.a₁, and s ⇄* λx^A.a₂. Hence, the possible equivalences from λx^A.a to t are
• t = λx^A.a′ with a ⇄* a′, hence a′ ⇄* ⟨a₁, a₂⟩, and so the term t is in case 2.
• t = ⟨λx^A.u, λx^A.v⟩, with ⟨a₁, a₂⟩ ⇄* a = ⟨u, v⟩.
Hence, by the first induction hypothesis (since M(a) < M(t)), either
(a) a₁ ⇄* u and a₂ ⇄* v, and so r ⇄* λx^A.u and s ⇄* λx^A.v, or
(b) v ⇄* ⟨t₁, t₂⟩ with a₁ ⇄* ⟨u, t₁⟩ and a₂ ⇄* t₂, and so λx^A.v ⇄* ⟨λx^A.t₁, λx^A.t₂⟩, r ⇄* ⟨λx^A.u, λx^A.t₁⟩ and s ⇄* λx^A.t₂, or
(c) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with a₁ ⇄* ⟨t₁, t₃⟩ and a₂ ⇄* ⟨t₂, t₄⟩, and so λx^A.u ⇄* ⟨λx^A.t₁, λx^A.t₂⟩, λx^A.v ⇄* ⟨λx^A.t₃, λx^A.t₄⟩, r ⇄* ⟨λx^A.t₁, λx^A.t₃⟩ and s ⇄* ⟨λx^A.t₂, λx^A.t₄⟩
(the symmetric cases are analogous), and so the term t is in case 1.
• t = ΛX.λx^A.a′ with a = ΛX.a′, hence ΛX.a′ ⇄* ⟨a₁, a₂⟩. Since M(⟨a₁, a₂⟩) < M(⟨r, s⟩), by the first induction hypothesis, the term t is in case 4.
• t = (λx^A.a′)[B] with a = a′[B], hence a′[B] ⇄* ⟨a₁, a₂⟩. Since M(⟨a₁, a₂⟩) < M(⟨r, s⟩), by the first induction hypothesis, the term t is in case 5.

3. Let ⟨r, s⟩ ⇄* aw ⇄ t, with a ⇄* ⟨a₁, a₂⟩, r ⇄* a₁w, and s ⇄* a₂w. The possible equivalences from aw to t are
• t = a′w with a ⇄* a′, hence a′ ⇄* ⟨a₁, a₂⟩, and so the term t is in case 3.
• t = aw′ with w ⇄* w′, and so the term t is in case 3.
• t = ⟨uw, vw⟩, with ⟨a₁, a₂⟩ ⇄* a = ⟨u, v⟩. Hence, by the first induction hypothesis (since M(a) < M(t)), either
(a) a₁ ⇄* u and a₂ ⇄* v, and so r ⇄* uw and s ⇄* vw, or
(b) v ⇄* ⟨t₁, t₂⟩ with a₁ ⇄* ⟨u, t₁⟩ and a₂ ⇄* t₂, and so vw ⇄* ⟨t₁w, t₂w⟩, r ⇄* ⟨uw, t₁w⟩ and s ⇄* t₂w, or
(c) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with a₁ ⇄* ⟨t₁, t₃⟩ and a₂ ⇄* ⟨t₂, t₄⟩, and so uw ⇄* ⟨t₁w, t₂w⟩, vw ⇄* ⟨t₃w, t₄w⟩, r ⇄* ⟨t₁w, t₃w⟩ and s ⇄* ⟨t₂w, t₄w⟩
(the symmetric cases are analogous), and so the term t is in case 1.
• t = a′⟨v, w⟩ with a = a′v, thus a′v = a ⇄* ⟨a₁, a₂⟩. Hence, by the first induction hypothesis, a′ ⇄* ⟨a′₁, a′₂⟩, with a₁ ⇄* a′₁v and a₂ ⇄* a′₂v. Therefore, r ⇄* a′₁⟨v, w⟩ and s ⇄* a′₂⟨v, w⟩, and so the term t is in case 3.

4.
Let ⟨r, s⟩ ⇄* ΛX.a ⇄ t, with a ⇄* ⟨a₁, a₂⟩, r ⇄* ΛX.a₁, and s ⇄* ΛX.a₂. Hence, the possible equivalences from ΛX.a to t are
• t = ΛX.a′ with a ⇄* a′, hence a′ ⇄* ⟨a₁, a₂⟩, and so the term t is in case 4.
• t = ⟨ΛX.u, ΛX.v⟩, with ⟨a₁, a₂⟩ ⇄* a = ⟨u, v⟩. Hence, by the first induction hypothesis (since M(a) < M(t)), either
(a) a₁ ⇄* u and a₂ ⇄* v, and so r ⇄* ΛX.u and s ⇄* ΛX.v, or
(b) v ⇄* ⟨t₁, t₂⟩ with a₁ ⇄* ⟨u, t₁⟩ and a₂ ⇄* t₂, and so ΛX.v ⇄* ⟨ΛX.t₁, ΛX.t₂⟩, r ⇄* ⟨ΛX.u, ΛX.t₁⟩ and s ⇄* ΛX.t₂, or
(c) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with a₁ ⇄* ⟨t₁, t₃⟩ and a₂ ⇄* ⟨t₂, t₄⟩, and so ΛX.u ⇄* ⟨ΛX.t₁, ΛX.t₂⟩, ΛX.v ⇄* ⟨ΛX.t₃, ΛX.t₄⟩, r ⇄* ⟨ΛX.t₁, ΛX.t₃⟩ and s ⇄* ⟨ΛX.t₂, ΛX.t₄⟩
(the symmetric cases are analogous), and so the term t is in case 1.
• t = λx^A.ΛX.a′ with a = λx^A.a′, hence λx^A.a′ ⇄* ⟨a₁, a₂⟩. Since M(⟨a₁, a₂⟩) < M(⟨r, s⟩), by the first induction hypothesis, the term t is in case 2.
• t = (ΛX.a′)[B] with a = a′[B], hence a′[B] ⇄* ⟨a₁, a₂⟩. Since M(⟨a₁, a₂⟩) < M(⟨r, s⟩), by the first induction hypothesis, the term t is in case 5.

5. Let ⟨r, s⟩ ⇄* a[A] ⇄ t, with a ⇄* ⟨a₁, a₂⟩, r ⇄* a₁[A], and s ⇄* a₂[A]. The possible equivalences from a[A] to t are
• t = a′[A] with a ⇄* a′, hence a′ ⇄* ⟨a₁, a₂⟩, and so the term t is in case 5.
• t = ⟨u[A], v[A]⟩, with ⟨a₁, a₂⟩ ⇄* a = ⟨u, v⟩.
Hence, by the first induction hypothesis (since M(a) < M(t)), either
(a) a₁ ⇄* u and a₂ ⇄* v, and so r ⇄* u[A] and s ⇄* v[A], or
(b) v ⇄* ⟨t₁, t₂⟩ with a₁ ⇄* ⟨u, t₁⟩ and a₂ ⇄* t₂, and so v[A] ⇄* ⟨t₁[A], t₂[A]⟩, r ⇄* ⟨u[A], t₁[A]⟩ and s ⇄* t₂[A], or
(c) u ⇄* ⟨t₁, t₂⟩ and v ⇄* ⟨t₃, t₄⟩ with a₁ ⇄* ⟨t₁, t₃⟩ and a₂ ⇄* ⟨t₂, t₄⟩, and so u[A] ⇄* ⟨t₁[A], t₂[A]⟩, v[A] ⇄* ⟨t₃[A], t₄[A]⟩, r ⇄* ⟨t₁[A], t₃[A]⟩ and s ⇄* ⟨t₂[A], t₄[A]⟩
(the symmetric cases are analogous), and so the term t is in case 1.
• t = λx^B.(a′[A]) with a = λx^B.a′, hence λx^B.a′ ⇄* ⟨a₁, a₂⟩. Since M(λx^B.a′) < M(⟨r, s⟩), by the first induction hypothesis, the term t is in case 2.
• t = π_{[X:=A]B}(a′[A]) with a = π_{∀X.B}(a′), hence π_{∀X.B}(a′) ⇄* ⟨a₁, a₂⟩. Since M(π_{∀X.B}(a′)) < M(⟨r, s⟩), by the first induction hypothesis, π_{∀X.B}(a′) should have one of the five forms given by the lemma, which is impossible for a projection, so this case cannot occur.

Lemma 5.19. For all r₁, r₂, s, t such that ⟨r₁, r₂⟩ ⇄* s ↪ t, there exist u₁, u₂ such that t ⇄* ⟨u₁, u₂⟩ and either
1. r₁ ↪ u₁ and r₂ ↪ u₂,
2. r₁ ↪ u₁ and r₂ ⇄* u₂, or
3. r₁ ⇄* u₁ and r₂ ↪ u₂.

Proof. By induction on M(⟨r₁, r₂⟩). By Lemma 5.18, s is either a product, an abstraction, an application, a type abstraction or a type application, with the conditions given in the lemma. The different terms s reducible by ↪ are
• (λx^A.a)s′, which reduces by the (βλ) rule to [x := s′]a.
• (ΛX.a)[A], which reduces by the (βΛ) rule to [X := A]a.
• ⟨s₁, s₂⟩, λx^A.a, as′, ΛX.a, a[A], with a reduction in the subterm s₁, s₂, a, or s′.
Notice that rule (π) cannot apply, since s cannot have the form π_C(s′).
We consider each case:
• s = (λx^A.a)s′ and t = [x := s′]a. Using Lemma 5.18 twice, we have a ⇄* ⟨a₁, a₂⟩, r₁ ⇄* (λx^A.a₁)s′ and r₂ ⇄* (λx^A.a₂)s′. Since t ⇄* ⟨[x := s′]a₁, [x := s′]a₂⟩, we take u₁ = [x := s′]a₁ and u₂ = [x := s′]a₂.
• s = (ΛX.a)[A] and t = [X := A]a. Using Lemma 5.18 twice, we have a ⇄* ⟨a₁, a₂⟩, r₁ ⇄* (ΛX.a₁)[A] and r₂ ⇄* (ΛX.a₂)[A].
Since t ⇄* ⟨[X := A]a₁, [X := A]a₂⟩, we take u₁ = [X := A]a₁ and u₂ = [X := A]a₂.
• s = ⟨s₁, s₂⟩, t = ⟨t₁, s₂⟩ or t = ⟨s₁, t₂⟩, with s₁ ↪ t₁ and s₂ ↪ t₂. We only consider the first case, since the other is analogous. One of the following cases happens:
(a) r₁ ⇄* ⟨w₁, w₂⟩, r₂ ⇄* ⟨w₃, w₄⟩, s₁ = ⟨w₁, w₃⟩ and s₂ = ⟨w₂, w₄⟩. Hence, by the induction hypothesis, either t₁ = ⟨w₁′, w₃⟩, or t₁ = ⟨w₁, w₃′⟩, or t₁ = ⟨w₁′, w₃′⟩, with w₁ ↪ w₁′ and w₃ ↪ w₃′. We take, in the first case, u₁ = ⟨w₁′, w₂⟩ and u₂ = ⟨w₃, w₄⟩; in the second case, u₁ = ⟨w₁, w₂⟩ and u₂ = ⟨w₃′, w₄⟩; and in the third, u₁ = ⟨w₁′, w₂⟩ and u₂ = ⟨w₃′, w₄⟩.
(b) We consider two cases, since the other two are symmetric.
– r₁ ⇄* ⟨s₁, w⟩ and s₂ ⇄* ⟨w, r₂⟩, in which case we take u₁ = ⟨t₁, w⟩ and u₂ = r₂.
– r₂ ⇄* ⟨w, s₂⟩ and s₁ = ⟨r₁, w⟩. Hence, by the induction hypothesis, either t₁ = ⟨r₁′, w⟩, or t₁ = ⟨r₁, w′⟩, or t₁ = ⟨r₁′, w′⟩, with r₁ ↪ r₁′ and w ↪ w′. We take, in the first case, u₁ = r₁′ and u₂ = ⟨w, s₂⟩; in the second case, u₁ = r₁ and u₂ = ⟨w′, s₂⟩; and in the third case, u₁ = r₁′ and u₂ = ⟨w′, s₂⟩.
(c) r₁ ⇄* s₁ and r₂ ⇄* s₂, in which case we take u₁ = t₁ and u₂ = s₂.
• s = λx^A.a, t = λx^A.t′, and a ↪ t′, with a ⇄* ⟨a₁, a₂⟩ and s ⇄* ⟨λx^A.a₁, λx^A.a₂⟩. Therefore, by the induction hypothesis, there exist u₁′, u₂′ such that t′ ⇄* ⟨u₁′, u₂′⟩ and either (a₁ ↪ u₁′ and a₂ ↪ u₂′), or (a₁ ⇄* u₁′ and a₂ ↪ u₂′), or (a₁ ↪ u₁′ and a₂ ⇄* u₂′). Therefore, we take u₁ = λx^A.u₁′ and u₂ = λx^A.u₂′.
• s = as′, t = t′s′, and a ↪ t′, with a ⇄* ⟨a₁, a₂⟩ and s ⇄* ⟨a₁s′, a₂s′⟩. Therefore, by the induction hypothesis, there exist u₁′, u₂′ such that t′ ⇄* ⟨u₁′, u₂′⟩ and either (a₁ ↪ u₁′ and a₂ ↪ u₂′), or (a₁ ⇄* u₁′ and a₂ ↪ u₂′), or (a₁ ↪ u₁′ and a₂ ⇄* u₂′). Therefore, we take u₁ = u₁′s′ and u₂ = u₂′s′.
• s = as′, t = at′, and s′ ↪ t′, with a ⇄* ⟨a₁, a₂⟩ and s ⇄* ⟨a₁s′, a₂s′⟩.
Applying Lemma 5.18 several times, one of the following cases happens:
(a) a₁s′ ⇄* ⟨w₁s′, w₂s′⟩, a₂s′ ⇄* ⟨w₃s′, w₄s′⟩, r₁ ⇄* ⟨w₁s′, w₃s′⟩ and r₂ ⇄* ⟨w₂s′, w₄s′⟩. We take u₁ = (⟨w₁, w₃⟩)t′ and u₂ = (⟨w₂, w₄⟩)t′.
(b) a₂s′ ⇄* ⟨w₁s′, w₂s′⟩, r₁ ⇄* ⟨a₁s′, w₁s′⟩ and r₂ ⇄* w₂s′. So we take u₁ = (⟨a₁, w₁⟩)t′ and u₂ = w₂t′; the symmetric cases are analogous.
(c) r₁ ⇄* a₁s′ and r₂ ⇄* a₂s′, in which case we take u₁ = a₁t′ and u₂ = a₂t′; the symmetric case is analogous.
• s = ΛX.a, t = ΛX.t′, and a ↪ t′, with a ⇄* ⟨a₁, a₂⟩ and s ⇄* ⟨ΛX.a₁, ΛX.a₂⟩. Therefore, by the induction hypothesis, there exist u₁′, u₂′ such that t′ ⇄* ⟨u₁′, u₂′⟩ and either (a₁ ↪ u₁′ and a₂ ↪ u₂′), or (a₁ ⇄* u₁′ and a₂ ↪ u₂′), or (a₁ ↪ u₁′ and a₂ ⇄* u₂′). Therefore, we take u₁ = ΛX.u₁′ and u₂ = ΛX.u₂′.
• s = a[A], t = t′[A], and a ↪ t′, with a ⇄* ⟨a₁, a₂⟩ and s ⇄* ⟨a₁[A], a₂[A]⟩. Therefore, by the induction hypothesis, there exist u₁′, u₂′ such that t′ ⇄* ⟨u₁′, u₂′⟩ and either (a₁ ↪ u₁′ and a₂ ↪ u₂′), or (a₁ ⇄* u₁′ and a₂ ↪ u₂′), or (a₁ ↪ u₁′ and a₂ ⇄* u₂′). Therefore, we take u₁ = u₁′[A] and u₂ = u₂′[A].

C Detailed proofs of Section 5.5