Probabilistic Termination by Monadic Affine Sized Typing (Long Version)
Ugo Dal Lago    Charles Grellois

July 10, 2018
Abstract
We introduce a system of monadic affine sized types, which substantially generalises the usual sized types and thereby captures probabilistic higher-order programs which terminate almost surely. Going beyond plain, strong normalisation without losing soundness turns out to be a hard task, which cannot be accomplished without a richer, quantitative notion of types, nor without imposing some affinity constraints. The proposed type system is powerful enough to type classic examples of probabilistically terminating programs such as random walks. The proof that typable programs are almost surely terminating is based on reducibility, but requires a substantial adaptation of the technique.
Probabilistic models are more and more pervasive in computer science [1, 2, 3]. Moreover, the concept of algorithm, which originally assumed determinism, has been relaxed so as to allow probabilistic evolution since the very early days of theoretical computer science [4]. All this has given impetus to research on probabilistic programming languages, which however have been studied at a large scale only in the last twenty years, following advances in randomized computation [5], cryptographic protocol verification [6, 7], and machine learning [8]. Probabilistic programs can be seen as ordinary programs in which specific instructions are provided to make the program evolve probabilistically rather than deterministically. Typical examples are instructions for sampling from a given distribution, or for performing probabilistic choice.

One of the most crucial properties a program should satisfy is termination: the execution process should be guaranteed to end. In (non)deterministic computation, this is easy to formalize, since any possible computation path is only considered qualitatively, and termination is a boolean predicate on programs: any non-deterministic program either terminates, in the may or must sense, or it does not. In probabilistic programs, on the other hand, any terminating computation path is attributed a probability, and termination thus becomes a quantitative property. It is therefore natural to consider a program terminating when its terminating paths form a set of measure one or, equivalently, when it terminates with maximal probability. This is dubbed "almost-sure termination" (AST for short) in the literature [9], and many techniques for automatically and semi-automatically checking programs for AST have been introduced in recent years [10, 11, 12, 13]. All of them, however, focus on imperative programs, while probabilistic functional programming languages are nowadays among the most successful ones in the realm of probabilistic programming [8].
It is not at all clear whether the existing techniques for imperative languages could be easily applied to functional ones, especially when higher-order functions are involved. In this paper, we introduce a system of monadic affine sized types for a simple probabilistic λ-calculus with recursion, and show that it guarantees the AST property for all typable programs. The type system, described in Section 4, can be seen as a non-trivial variation on Hughes et al.'s sized types [14], whose main novelties are the following:
– Types are generalised so as to be monadic, this way encapsulating the kind of information we need to type non-trivial examples. This information, in particular, is taken advantage of when typing recursive programs.
– Typing rules are affine: higher-order variables cannot be freely duplicated. This is quite similar to what happens when characterising polynomial time functions by restricting higher-order languages akin to the λ-calculus [15]. Without affinity, the type system is bound to be unsound for AST.
The necessity of both these variations is discussed in Section 2 below. The main result of this paper is that typability in monadic affine sized types entails AST, a property which is proved using an adaptation of the Girard-Tait reducibility technique [16]. This adaptation is technically involved, as it needs substantial modifications to deal with possibly infinite and probabilistic computations. In particular, every reducibility set must be parametrized by a quantitative parameter p guaranteeing that terms belonging to this set terminate with probability at least p. The idea of parametrizing such sets already appears in work by the first author and Hofmann [17], in which a notion of realizability parametrized by resource monoids is considered. These realizability models are however studied in relation to linear logic and to the complexity of normalisation, and do not fit as such to our setting, even if they provided some crucial inspiration.
In our approach, the fact that recursively-defined terms are AST comes from a continuity argument on this parameter: we can prove, by unfolding such terms, that they terminate with probability p for every p < 1, and continuity then allows us to take the limit and deduce that they are AST. This soundness result is, technically speaking, the main contribution of this paper, and is described in Section 6.
Sized types were originally introduced by Hughes, Pareto, and Sabry [14] in the context of reactive programming. A series of papers by Barthe and colleagues [18, 19, 20] presents sized types in a way similar to the one we adopt here, although still for a deterministic functional language. Contrary to the other works on sized types, their type system is proved to admit a decidable type inference; see the unpublished tutorial [19]. Abel developed, independently of Barthe and colleagues, a similar type system featuring size information [21]. These three lines of work allow polymorphism, arbitrary inductive data constructors, and ordinal sizes, so that data such as infinite trees can be manipulated. These three features will be absent from our system, in order to focus the challenge on the treatment of probabilistic recursive programs. Another interesting approach is that of Xi's Dependent ML [22], in which a system of lightweight dependent types allows a more liberal treatment of the notion of size, to which arithmetic or conditional operations may in particular be applied; see [21] for a detailed comparison. This type system is well-adapted to practical termination checking, but does not handle ordinal sizes either. Some works along these lines are able to deal with coinductive data, as well as inductive ones [14, 18, 21]. They are related to Amadio and Coupet-Grimal's work on guarded types ensuring productivity of infinite structures such as streams [23]. None of these works deals with probabilistic computation, and in particular with almost-sure termination.

There has been a lot of interest, recently, in probabilistic termination as a verification problem in the context of imperative programming [10, 11, 12, 13]. All these works deal, invariably, with some form of while-style language without higher-order functions.
A possible approach is to reduce AST for probabilistic programs to termination of non-deterministic programs [10]. Another one is to extend the concept of ranking function to the probabilistic case. Bournez and Garnier obtained in this way the notion of Lyapunov ranking function [24], but such functions capture a notion more restrictive than AST: positive almost-sure termination, meaning that the program is AST and terminates in expected finite time. To capture AST, the notion of ranking supermartingale [25] has been used. Note that the use of ranking supermartingales allows one to deal with programs which are both probabilistic and non-deterministic [11, 13], and even to reason about programs with real-valued variables [12].

Some recent works by Cappai, the first author, and Parisen Toldin [26, 27] introduce type systems ensuring that all typable programs can be evaluated in probabilistic polynomial time. This is too restrictive for our purposes. On the one hand, we aim at termination, and restricting to polynomial time algorithms would be overkill. On the other hand, the above-mentioned type systems guarantee that the length of all probabilistic branches is uniformly bounded (by the same polynomial). In our setting, this would restrict the focus to terms in which infinite computations are forbidden, while we simply want the set of such computations to have probability 0.
In this section, we justify the design choices that guided us in the development of our type system. As we will see, the nature of AST requires a significant and non-trivial extension of the system of sized types originally introduced to ensure termination in the deterministic case [14].
Sized Types for Deterministic Programs.
The simply-typed λ-calculus endowed with a typed recursion operator letrec and appropriate constructs for the natural numbers, sometimes called PCF, is already Turing-complete, so that there is no hope of proving it strongly normalizing. Sized types [14] refine the simple type system by enriching base types with annotations, so as to ensure the termination of any recursive definition. Let us explain the idea of sizes in the simple, yet informative case in which the base type is Nat. Sizes are defined by the grammar

    s ::= i | ∞ | ŝ

where i is a size variable and ŝ is the successor of the size s, with ∞̂ = ∞. These sizes permit us to consider decorations Nat^s of the base type Nat, whose elements are natural numbers of size at most s. The type system ensures that the only constant value of type Nat^î is 0, that the only constant values of type Nat^î̂ are 0 and 1̄ = S 0, and so on. The type
Nat^∞ comprises all natural numbers, and is therefore often denoted simply as Nat.

The crucial rule of the sized type system, which we present here following Barthe et al. [18], allows one to type recursive definitions as follows:

    Γ, f : Nat^i → σ ⊢ M : Nat^î → σ[î/i]      i pos σ
    ---------------------------------------------------  (1)
    Γ ⊢ letrec f = M : Nat^s → σ[s/i]

This typing rule ensures that, to recursively define the function f = M, the term M taking an input of size î calls f on inputs of strictly smaller size i. This is for instance the case when typing the program

    M_DBL = letrec f = λx. case x of { S → λy. S S (f y) | 0 → 0 }

computing recursively the double of an input integer, as the hypothesis of the fixpoint rule in a typing derivation of M_DBL is

    f : Nat^i → Nat ⊢ λx. case x of { S → λy. S S (f y) | 0 → 0 } : Nat^î → Nat

The fact that f is called on an input y of strictly smaller size i is ensured by the rule typing the case construction:

    Γ ⊢ x : Nat^î      Γ ⊢ λy. S S (f y) : Nat^i → Nat      Γ ⊢ 0 : Nat
    --------------------------------------------------------------------
    Γ ⊢ case x of { S → λy. S S (f y) | 0 → 0 } : Nat

where Γ = f : Nat^i → Nat, x : Nat^î. The soundness of sized types for strong normalization allows us to conclude that M_DBL is indeed SN.
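To make the invariant concrete, the recursion pattern of M_DBL can be transcribed directly (a Python sketch of ours, with natural numbers modelled as non-negative ints and the function name `dbl` chosen for illustration):

```python
# M_DBL transcribed: structural recursion on a natural number, where every
# recursive call receives a strictly smaller argument. This is exactly the
# invariant certified by the premise f : Nat^i -> Nat of the letrec rule.
def dbl(n: int) -> int:
    if n == 0:           # branch "0 -> 0"
        return 0
    y = n - 1            # branch "S -> λy. ...": y has size i, while x = S y has size î
    return 2 + dbl(y)    # S S (f y)

assert dbl(0) == 0
assert dbl(5) == 10
```

Since the argument strictly decreases at each call, the recursion is well-founded for every input, which is what strong normalization amounts to here.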
Naïve Generalization to Probabilistic Terms.
The aim of this paper is to obtain a probabilistic, quantitative counterpart to this soundness result for sized types. Note that, unlike the result for sized types, which held for all reduction strategies of terms, we only consider a call-by-value calculus. Terms can now contain a probabilistic choice operator ⊕_p, such that M ⊕_p N reduces to the term M with probability p ∈ ]0, 1[, and to N with probability 1 − p. The language and its operational semantics will be defined more extensively in Section 3. Suppose for the moment that we type the choice operator in a naïve way:

    Γ ⊢ M : σ      Γ ⊢ N : σ
    -------------------------- Choice
    Γ ⊢ M ⊕_p N : σ

On the one hand, the original system of sized types features subtyping, which allows some flexibility to "unify" the types of M and N to σ. On the other hand, it is easy to realise that all probabilistic branches would have to be terminating, without any hope of capturing interesting AST programs: nothing has been done to capture the quantitative nature of probabilistic termination. An instance of a term which is not strongly normalizing but which is almost-surely terminating, meaning that it normalizes with probability 1, is

    M_BIAS = ( letrec f = λx. case x of { S → λy. f(y) ⊕_{2/3} f(S S y) | 0 → 0 } ) n̄    (2)

simulating a biased random walk which, on x = m + 1, goes to m with probability 2/3 and to m + 2 with probability 1/3. The naïve generalization of the sized type system only allows us to type the body of the recursive definition as follows:

    f : Nat^î̂ → Nat^∞ ⊢ λy. f(y) ⊕_{2/3} f(S S y) : Nat^î → Nat^∞    (3)

and thus does not allow us to deduce any relevant information on the quantitative termination of this term: nothing tells us that the recursive call f(S S y) is performed with a relatively low probability.

A Monadic Type System.
Along the evaluation of M_BIAS, there is indeed a quantity which decreases during each recursive call to the function f: the average size of the input on which the call is performed. Indeed, on an input of size î, f calls itself on an input of smaller size i with probability 2/3, and on an input of greater size î̂ with probability only 1/3. To capture such relevant quantitative information on the recursive calls of f, and with the aim of capturing almost-sure termination, we introduce a monadic type system, in which distributions of types can be used to type the functions to be defined recursively in a finer way. Contexts Γ | Θ consist of a context Γ attributing sized types to any number of variables, together with a context Θ attributing a distribution of sized types to at most one variable, typically the one used to recursively define a function. In such a context, terms are typed with a distribution type, formed by combining the Dirac distributions of types introduced in the Axiom rules using the following rule for probabilistic choice:

    Γ | Θ ⊢ M : µ      Γ | Ψ ⊢ N : ν      ⟨µ⟩ = ⟨ν⟩
    ------------------------------------------------- Choice
    Γ | Θ ⊕_p Ψ ⊢ M ⊕_p N : µ ⊕_p ν

The guard condition ⟨µ⟩ = ⟨ν⟩ ensures that µ and ν are distributions of types decorating the same simple type. Without this condition, there is no hope of obtaining a decidable type inference algorithm.

The Fixpoint Rule.
Using these monadic types, instead of the insufficiently informative typing (3), we can derive the sequent

    f : { (Nat^i → Nat^∞)^{2/3}, (Nat^î̂ → Nat^∞)^{1/3} } ⊢ λy. f(y) ⊕_{2/3} f(S S y) : Nat^î → Nat^∞    (4)

(Please notice that choosing a reduction strategy is crucial in a probabilistic setting; otherwise one risks getting nasty forms of non-confluence [28].)
in which the type of f contains finer information on the sizes of arguments on which it is called recursively, and with which probability. This information enables us to perform a first switch from a qualitative to a quantitative notion of termination: we will adapt the hypothesis

    Γ, f : Nat^i → σ ⊢ M : Nat^î → σ[î/i]    (5)

of the original fixpoint rule (1) of sized types, expressing that f is called on an argument of size one less than the one on which M is called, to a condition meaning that there is probability 1 to call f on arguments of smaller size after enough iterations of recursive calls. We therefore define a random walk associated to the distribution type µ of f, the sized walk associated to µ, which is as follows for the typing (4):
– the random walk starts on 1, corresponding to the size î;
– on an integer n + 1, the random walk jumps to n with probability 2/3 and to n + 2 with probability 1/3;
– on 0, the walk stops.
Such a sized walk is an instance of a one-counter Markov decision problem [29], so that it is decidable in polynomial time whether the walk reaches 0 with probability 1. We therefore replace the hypothesis (5) of the letrec rule by the quantitative counterpart we just sketched, obtaining

    { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } induces an AST sized walk
    Γ | f : { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ν[î/i]
    ---------------------------------------------------------------------- letrec
    Γ, ∆ | Θ ⊢ letrec f = V : Nat^r → ν[r/i]

where we omit two additional technical conditions, to be found in Section 4, which justify the weakening on contexts incorporated in this rule. The resulting type system allows us to type a variety of examples, among which the following program computing the geometric distribution over the natural numbers:

    M_EXP = ( letrec f = λx. x ⊕ S(f x) ) 0̄    (6)

for which the decreasing quantity is the size of the set of probabilistic branches of the term making recursive calls to f. Another example is the unbiased random walk

    M_UNB = ( letrec f = λx. case x of { S → λy. f(y) ⊕ f(S S y) | 0 → 0 } ) n̄    (7)

for which there is no clear notion of decreasing measure during recursive calls, but which nonetheless terminates almost surely, as witnessed by the sized walk associated to an appropriate derivation in the sized type system. We therefore claim that the use of this external guard condition on associated sized walks, allowing us to give a general condition of termination, is satisfactory, as it both captures an interesting class of examples and is computable in polynomial time.

In Section 6, we prove that this shift from a qualitative to a quantitative hypothesis in the type system results in a shift from the soundness for strong normalization of the original sized type system to a soundness for its quantitative counterpart: almost-sure termination.

Why Affinity?
To ensure the soundness of the letrec rule, we need one more structural restriction on the type system. For the sized-walk argument to be adequate, we must ensure that the recursive calls of f are indeed precisely modelled by the sized walk, and this is not the case when considering, for instance, the following term:

    M_NAFF = ( letrec f = λx. case x of { S → λy. f(y) ⊕_{2/3} ( f(S S y) ; f(S S y) ) | 0 → 0 } ) n̄    (8)

where the sequential composition ; is defined in this call-by-value calculus as M ; N = (λx.λy.y) M N. The term M_NAFF calls f recursively twice in the right branch of its probabilistic choice, and is therefore not modelled appropriately by the sized walk associated to its type. In fact, we would need a generalized notion of random walk to model the recursive calls of this process: a random walk on stacks of integers. In the case where n = 1, the recursive calls to f can be represented by a tree of stacks as depicted in Figure 1, where leftmost edges have probability 2/3 and rightmost ones 1/3. The root indicates that the first call to f was on the integer 1. From it, there is either a call of f on 0, which terminates, or two calls on 2 which are put into a stack of calls, and so on. We could prove that, without the affine restriction we are about to formulate, the term M_NAFF is typable with monadic sized types and the fixpoint rule we just designed. However, this term is not almost-surely terminating. Notice, indeed, that the sum of the integers appearing in a stack labelling a node of the tree in Figure 1 decreases by 1 when the left edge, of probability 2/3, is taken, and increases by at least 3 when the right edge, of probability 1/3, is taken. It follows that the expected increase of the sum of the elements of the stack during one step is at least −1 × (2/3) + 3 × (1/3) = 1/3 > 0. This implies that the probability that f is called on an input of size 0 after enough iterations is strictly less than 1, so that the term M_NAFF cannot be almost surely terminating.

[Figure 1: A Tree of Recursive Calls.]

Such general random processes have stacks as states and are rather complex to analyse. To the best of the authors' knowledge, they do not seem to have been considered in the literature. We also believe that the complexity of determining whether 0 can be reached almost surely in such a process, if decidable, would be very high. This leads us to the design of an affine type system, in which the management of contexts ensures that a given probabilistic branch of a term may use a given higher-order symbol at most once. We do not, however, formulate restrictions on variables of simple type
Nat, as affinity is only used in the letrec rule and thus on higher-order symbols. Remark that this is in the spirit of certain systems from implicit computational complexity [15, 30].
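The two walk arguments used in this section can be checked numerically. The sketch below is our illustration, not part of the formal development: for a sized walk stepping down with probability `down` and up with probability `up`, the probability of reaching 0 from 1 is the least fixed point of q = down + up·q², which the iteration computes from below; the second part recomputes the positive drift that rules out AST for M_NAFF.

```python
# A sized walk steps from n+1 down to n with probability `down`, or up to
# n+2 with probability `up`. By homogeneity, the probability q of ever
# reaching 0 from 1 satisfies q = down + up * q**2 (an up-step must be
# undone by two independent unit descents). Iterating from 0 converges to
# the least fixed point; AST corresponds to q = 1.
def reach_prob(down: float, up: float, iters: int = 400_000) -> float:
    q = 0.0
    for _ in range(iters):
        q = down + up * q * q
    return q

assert abs(reach_prob(2/3, 1/3) - 1.0) < 1e-6   # sized walk of M_BIAS: AST
assert abs(reach_prob(1/2, 1/2) - 1.0) < 1e-4   # sized walk of M_UNB: AST, slow convergence
assert abs(reach_prob(1/3, 2/3) - 0.5) < 1e-6   # upward-biased walk: not AST

# Drift argument against M_NAFF: the sum of the stack of pending calls
# decreases by 1 with probability 2/3 and grows by at least 3 otherwise,
# so its expected one-step change is positive.
drift = -1 * (2/3) + 3 * (1/3)
assert abs(drift - 1/3) < 1e-12
```

The contrast between the second and third assertions shows why the quantitative guard matters: the unbiased walk terminates with probability 1 but only in infinite expected time, while tilting the probabilities the other way drops the termination probability to 1/2.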
We consider the language λ⊕, which is an extension of the λ-calculus with recursion, constructors for the natural numbers, and a choice operator. In this section, we introduce this language and its operational semantics, and use them to define the crucial notion of almost-sure termination.

Terms and Values.
Given a set of variables X, terms and values of the language λ⊕ are defined by mutual induction as follows:

    Terms:   M, N, ... ::= V | V V | let x = M in N | M ⊕_p N | case V of { S → W | 0 → Z }
    Values:  V, W, Z, ... ::= x | 0 | S V | λx.M | letrec f = V

where x, f ∈ X and p ∈ ]0, 1[. When p = 1/2, we often write ⊕ as a shorthand for ⊕_{1/2}. The set of terms is denoted Λ⊕ and the set of values is denoted Λ^V⊕. Terms of the calculus are assumed to be in A-normal form [31]. This allows us to formulate crucial definitions in a simpler way, concentrating in the let construct the study of the probabilistic behaviour of terms. We claim that all traditional constructions can be encoded in this formalism. For instance, the usual application M N of two terms can be harmlessly recovered via the encoding let x = M in (let y = N in x y). In the sequel, we write c⃗ V for a value that is either 0 or of the shape S V.

Term Distributions.
The introduction of a probabilistic choice operator in the syntax leads to a probabilistic reduction relation. It is therefore meaningful to consider the (operational) semantics of a term as a distribution of values modelling the outcome of all the finite probabilistic reduction paths of the term. For instance, the term M_EXP defined in (6) evaluates to the term distribution assigning probability 1/2^{n+1} to the value n̄. Let us define this notion more formally:

Definition 1 (Distribution) A distribution on X is a function D : X → [0, 1] satisfying the constraint ∑D = ∑_{x ∈ X} D(x) ≤ 1, where ∑D is called the sum of the distribution D. We say that D is proper precisely when ∑D = 1. We denote by P the set of all distributions, be they proper or not. We define the support S(D) of a distribution D as

    S(D) = { x ∈ X | D(x) > 0 }.

When S(D) consists only of closed terms, we say that D is a closed distribution. When S(D) is finite, we say that D is a finite distribution. We call Dirac a proper distribution D such that S(D) is a singleton. We denote by 0 the null distribution, mapping every term to the probability 0. When X = Λ⊕, we say that D is a term distribution.

In the sequel, we will use a more practical notion of representation of distributions, which enumerates the terms with their probabilities as a family of assignments. For technical reasons, notably related to the subject reduction property, we will also need pseudo-representations, which are essentially multiset-like decompositions of the representation of a distribution.

Definition 2 (Representations and Pseudo-Representations)
Let D ∈ P be of support { x_i | i ∈ I }, where x_i = x_j implies i = j for every i, j ∈ I. The representation of D is the set

    D = { x_i^{D(x_i)} | i ∈ I }

where x_i^{D(x_i)} is just an intuitive way to write the pair (x_i, D(x_i)). A pseudo-representation of D is any multiset [ y_j^{p_j} | j ∈ J ] such that

    ∀j ∈ J, y_j ∈ S(D)      and      ∀i ∈ I, D(x_i) = ∑_{y_j = x_i} p_j.

By abuse of notation, we will simply write D = [ y_j^{p_j} | j ∈ J ] to mean that D admits [ y_j^{p_j} | j ∈ J ] as a pseudo-representation. Any distribution has a canonical pseudo-representation obtained by simply replacing the set-theoretic notation with the multiset-theoretic one.

Definition 3 (ω-CPO of distributions) We define the pointwise order on distributions over X as: D ⊑ E if and only if ∀x ∈ X, D(x) ≤ E(x). This turns (P, ⊑) into a partial order. This partial order is an ω-CPO, but not a lattice, as the join of two distributions does not necessarily exist. The bottom element of this ω-CPO is the null distribution 0.

Definition 4 (Operations on distributions)
Given a distribution D and a real number α ≤ 1, we define the distribution α · D as x ↦ α · D(x). We similarly define the sum D + E of two distributions over a same set X as the function x ↦ D(x) + E(x). Note that this is a total operation on functions X → R, but a partial operation on distributions: it is defined if and only if ∑D + ∑E ≤ 1. When D ⊑ E, we define the partial operation of difference of distributions E − D as the function V ↦ E(V) − D(V). We naturally extend these operations to representations and pseudo-representations of distributions.

    let x = V in M  →_v  { (M[V/x])^1 }
    (λx.M) V  →_v  { (M[V/x])^1 }
    M ⊕_p N  →_v  { M^p, N^{1−p} }

    M →_v { L_i^{p_i} | i ∈ I }
    --------------------------------------------------------
    let x = M in N →_v { (let x = L_i in N)^{p_i} | i ∈ I }

    case S V of { S → W | 0 → Z }  →_v  { (W V)^1 }
    case 0 of { S → W | 0 → Z }  →_v  { Z^1 }
    (letrec f = V) (c⃗ W)  →_v  { ( V[(letrec f = V)/f] (c⃗ W) )^1 }

    D =^VD { M_j^{p_j} | j ∈ J } + D|V      ∀j ∈ J, M_j →_v E_j
    ------------------------------------------------------------
    D →_v ( ∑_{j ∈ J} p_j · E_j ) + D|V

    Figure 2: Call-by-value reduction relation →_v on distributions.

Definition 5 (Value Decomposition of a Term Distribution)
Let D be a term distribution. We write its value decomposition as D =^VD D|V + D|T, where D|V is the maximal subdistribution of D whose support consists of values, and D|T = D − D|V is the subdistribution of "non-values" contained in D.

Operational Semantics.
The semantics of a term will be the value distribution to which it reduces via the probabilistic reduction relation, iterated up to the limit. As a first step, we define the call-by-value reduction relation →_v ⊆ P × R^{Λ⊕} in Figure 2. The relation →_v is in fact a relation on distributions:

Lemma 1
Let D be a distribution such that D →_v E. Then E is a distribution.

Note that, to improve readability, we write Dirac distributions simply as terms on the left side of →_v. As usual, we denote by →_v^n the n-th iterate of the relation →_v, with →_v^0 being the identity relation. We then define the relation ⇛_v^n as follows: let D →_v^n E =^VD E|V + E|T; then D ⇛_v^n E|V. Note that, for every n ∈ N and D ∈ P, there is a unique distribution E such that D →_v^n E. Moreover, E|V is the only distribution such that D ⇛_v^n E|V.

Lemma 2
Let n, m ∈ N with n < m. Let D_n (resp. D_m) be the distribution such that M →_v^n D_n (resp. M →_v^m D_m). Then D_n ⊑ D_m.

Lemma 3
Let n, m ∈ N with n < m. Let D_n (resp. D_m) be the distribution such that M ⇛_v^n D_n (resp. M ⇛_v^m D_m). Then D_n ⊑ D_m.

Definition 6 (Semantics of a Term, of a Distribution)
The semantics of a distribution D is the distribution [[D]] = sup_{n ∈ N} { D_n | D ⇛_v^n D_n }. This supremum exists thanks to Lemma 3, combined with the fact that (P, ⊑) is an ω-CPO. We define the semantics of a term M as [[M]] = [[{ M^1 }]].

Corollary 1 Let n ∈ N and D_n be such that M ⇛_v^n D_n. Then D_n ⊑ [[M]].

We now have all the ingredients required to define the central concept of this paper, that of an almost-surely terminating term:
Definition 7 (Almost-Sure Termination)
We say that a term M is almost-surely terminating precisely when ∑[[M]] = 1.

Before closing this section, let us state the following lemma on the operational semantics of the let construction, which will be used in the proof of typing soundness for monadic affine sized types:
Lemma 4
Suppose that M ⇛_v^n [ V_i^{p_i} | i ∈ I ] and that, for every i ∈ I, N[V_i/x] ⇛_v^m E_i. Then let x = M in N ⇛_v^{n+m+1} ∑_{i ∈ I} p_i · E_i.

Proof.
Easy from the definition of ⇛_v and of →_v in the case of let. □

Following the discussion in Section 2, we introduce in this section a non-trivial lifting of sized types to our probabilistic setting. As a first step, we design an affine simple type system for λ⊕. This means that no higher-order variable may be used more than once in the same probabilistic branch. However, variables of base type Nat may be used freely. In spite of this restriction, the resulting system allows us to type terms corresponding to any probabilistic Turing machine. In Section 4.2, we introduce a more sophisticated type system, which will be monadic and affine, and which will be sound for almost-sure termination, as we prove in Section 6.

The terms of the language λ⊕ can be typed using a variant of the simple types of the λ-calculus, extended to type letrec and ⊕_p, but also restricted to an affine management of contexts. Recall that the constraint of affinity ensures that a given higher-order symbol is used at most once in a probabilistic branch. We define simple types over the base type Nat in the usual way:

    κ, κ′, ... ::= Nat | κ → κ′

where, by convention, the arrow associates to the right. Contexts Γ, ∆, ... are sequences of simply-typed variables x :: κ. We write sequents as Γ ⊢ M :: κ to distinguish them from the sequents using distribution types appearing later in this section. Before giving the rules of the type system, we need to define two policies for contracting contexts: an affine one and a general one.

Context Contraction.
The contraction Γ ∪ ∆ of two contexts is a non-affine operation, partially defined as follows:
• x :: κ ∈ Γ \ ∆ implies x :: κ ∈ Γ ∪ ∆;
• x :: κ ∈ ∆ \ Γ implies x :: κ ∈ Γ ∪ ∆;
• if x :: κ ∈ Γ and x :: κ′ ∈ ∆:
  – if κ = κ′, then x :: κ ∈ Γ ∪ ∆;
  – otherwise the operation is undefined.
This operation will be used to contract contexts in the rule typing the choice operation ⊕_p: indeed, we allow a same higher-order variable f to occur both in M and in N when forming M ⊕_p N, as both terms correspond to different probabilistic branches.

    Var:     Γ, x :: κ ⊢ x :: κ
    Succ:    Γ ⊢ V :: Nat  implies  Γ ⊢ S V :: Nat
    Zero:    Γ ⊢ 0 :: Nat
    λ:       Γ, x :: κ ⊢ M :: κ′  implies  Γ ⊢ λx.M :: κ → κ′
    App:     Γ ⊢ V :: κ → κ′  and  ∆ ⊢ W :: κ  imply  Γ ⊎ ∆ ⊢ V W :: κ′
    Choice:  Γ ⊢ M :: κ  and  ∆ ⊢ N :: κ  imply  Γ ∪ ∆ ⊢ M ⊕_p N :: κ
    Let:     Γ ⊢ M :: κ  and  ∆, x :: κ ⊢ N :: κ′  imply  Γ ⊎ ∆ ⊢ let x = M in N :: κ′
    Case:    Γ ⊢ V :: Nat,  ∆ ⊢ W :: Nat → κ  and  ∆ ⊢ Z :: κ  imply
             Γ ⊎ ∆ ⊢ case V of { S → W | 0 → Z } :: κ
    letrec:  Γ, f :: Nat → κ ⊢ V :: Nat → κ  and  ∀x ∈ Γ, x :: Nat  imply
             Γ ⊢ letrec f = V :: Nat → κ

    Figure 3: Affine simple types for λ⊕.

Affine contraction of contexts.
The affine contraction Γ ⊎ ∆ will be used in all rules but the one for ⊕_p. It is partially defined as follows:
• x :: κ ∈ Γ \ ∆ implies x :: κ ∈ Γ ⊎ ∆;
• x :: κ ∈ ∆ \ Γ implies x :: κ ∈ Γ ⊎ ∆;
• if x :: κ ∈ Γ and x :: κ′ ∈ ∆:
  – if κ = κ′ = Nat, then x :: κ ∈ Γ ⊎ ∆;
  – in any other case, the operation is undefined.
As we explained earlier, only variables of base type Nat may be contracted.
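The difference between the two policies is easy to make concrete. The sketch below is our illustration (not part of the formal system), with contexts modelled as Python dicts from variable names to simple types rendered as strings, "Nat" being the base type:

```python
def contract(gamma: dict, delta: dict) -> dict:
    """Non-affine contraction Γ ∪ Δ (used for ⊕_p): shared variables
    must carry the same type, whatever that type is."""
    out = dict(gamma)
    for x, k in delta.items():
        if x in out and out[x] != k:
            raise TypeError(f"incompatible types for {x}")
        out[x] = k
    return out

def affine_contract(gamma: dict, delta: dict) -> dict:
    """Affine contraction Γ ⊎ Δ (all other binary rules): only variables
    of base type Nat may be shared between the two contexts."""
    out = dict(gamma)
    for x, k in delta.items():
        if x in out and not (out[x] == k == "Nat"):
            raise TypeError(f"higher-order variable {x} used twice")
        out[x] = k
    return out

g = {"x": "Nat", "f": "Nat -> Nat"}
d = {"x": "Nat", "f": "Nat -> Nat"}
assert contract(g, d) == g                    # f may occur in both probabilistic branches
assert affine_contract({"x": "Nat"}, d) == d  # sharing x :: Nat is allowed
try:
    affine_contract(g, d)                     # sharing the higher-order f is rejected
    assert False
except TypeError:
    pass
```

This mirrors the intent of the two definitions: probabilistic branches may both mention a higher-order variable because only one branch is ever taken, whereas every other binary rule combines subterms that could both run.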
The Affine Type System.
The affine simple type system is then defined in Figure 3. All the rules are quite standard. Higher-order variables can occur at most once in any probabilistic branch because all binary typing rules, except probabilistic choice, treat contexts affinely. We set Λ^V⊕(Γ, κ) = { V ∈ Λ^V⊕ | Γ ⊢ V :: κ } and Λ⊕(Γ, κ) = { M ∈ Λ⊕ | Γ ⊢ M :: κ }. We simply write Λ^V⊕(κ) = Λ^V⊕(∅, κ) and Λ⊕(κ) = Λ⊕(∅, κ) when the terms or values are closed. These closed, typable terms enjoy subject reduction and the progress property.

This section is devoted to giving the basic definitions and results about monadic affine sized types (MASTs, for short), which can be seen as decorations of the affine simple types with some size information.

Sized Types.
We consider a set S of size variables, denoted i, j, ..., and define sizes (called stages in [18]) as:

    s, r ::= i | ∞ | ŝ

where ·̂ denotes the successor operation. We denote iterations of ·̂ as follows: the double successor of s is denoted ŝ², the triple successor ŝ³, and so on. By definition, at most one variable i ∈ S appears in a given size s. We call it its spine variable, denoted spine(s). We write spine(s) = ∅ when there is no variable in s. An order ⪯ on sizes is defined by the following rules: s ⪯ s; if s ⪯ r and r ⪯ t then s ⪯ t; s ⪯ ŝ; and s ⪯ ∞. Notice that these rules imply notably that ∞̂ is equivalent to ∞, i.e., ∞̂ ⪯ ∞ and ∞ ⪯ ∞̂. We consider sizes modulo this equivalence.

    i pos Nat^s
    i neg σ  and  i pos µ   ⟹   i pos σ → µ
    ∀j ∈ I, i pos σ_j   ⟹   i pos { σ_j^{p_j} | j ∈ I }
    i ∉ s   ⟹   i neg Nat^s
    i pos σ  and  i neg µ   ⟹   i neg σ → µ
    ∀j ∈ I, i neg σ_j   ⟹   i neg { σ_j^{p_j} | j ∈ I }

Figure 4: Positive and negative occurrences of a size variable in a sized type.

We can now define sized types and distribution types by mutual induction, calling distributions of (sized) types the distributions over the set of sized types:

Definition 8 (Sized Types, Distribution Types)
Sized types and distribution types are defined by mutual induction, contextually with the function ⟨·⟩ which maps any sized or distribution type to its underlying affine type.

    Sized types:          σ, τ ::= σ → µ | Nat^s
    Distribution types:   µ, ν ::= { σ_i^{p_i} | i ∈ I }
    Underlying map:       ⟨σ → µ⟩ = ⟨σ⟩ → ⟨µ⟩,   ⟨Nat^s⟩ = Nat,   ⟨{ σ_i^{p_i} | i ∈ I }⟩ = ⟨σ_j⟩

For distribution types we require additionally that Σ_{i∈I} p_i ≤ 1, that I is a finite non-empty set, and that ⟨σ_i⟩ = ⟨σ_j⟩ for every i, j ∈ I. In the last equation of the underlying map, j is any element of I.

The definition of sized types is monadic in that a higher-order sized type is of the shape σ → µ, where σ is again a sized type and µ is a distribution of sized types. The definition of the fixpoint will refer to the notion of positivity of a size variable in a sized or distribution type. We define positive and negative occurrences of a size variable in such a type in Figure 4.

Contexts and Operations on Them.
Contexts are sequences of variables, each paired with a sized type, together with at most one distinguished variable carrying a distribution type:
Definition 9 (Contexts)
Contexts are of the shape Γ | Θ, with

    Sized contexts:          Γ, ∆, ... ::= ∅ | x : σ, Γ   (x ∉ dom(Γ))
    Distribution contexts:   Θ, Ψ, ... ::= ∅ | x : µ

As usual, we define the domain dom(Γ) of a sized context Γ by induction: dom(∅) = ∅ and dom(x : σ, Γ) = {x} ⊎ dom(Γ). We proceed similarly for the domain dom(Θ) of a distribution context Θ. When a sized context Γ = x₁ : σ₁, ..., x_n : σ_n (n ≥ 0) is such that there is a simple type κ with ⟨σ_i⟩ = κ for every i ∈ {1, ..., n}, we say that Γ is uniform of simple type κ. We write this as ⟨Γ⟩ = κ.

We write Γ, ∆ for the disjoint union of these sized contexts: it is defined whenever dom(Γ) ∩ dom(∆) = ∅. We proceed similarly for Θ, Ψ, but note that, due to the restriction on the cardinality of such contexts, there is the additional requirement that Θ = ∅ or Ψ = ∅. We finally define contexts as pairs Γ | Θ of a sized context and a distribution context, with the constraint that dom(Γ) ∩ dom(Θ) = ∅.

Definition 10 (Probabilistic Sum) Let µ and ν be two distribution types. We define their probabilistic sum µ ⊕_p ν as the distribution type p · µ + (1 − p) · ν. We extend this operation to a partial operation on distribution contexts:
– for two distribution types µ and ν such that ⟨µ⟩ = ⟨ν⟩, we define (x : µ) ⊕_p (x : ν) = x : µ ⊕_p ν,
– (x : µ) ⊕_p ∅ = x : p · µ,
– ∅ ⊕_p (x : µ) = x : (1 − p) · µ,
– in any other case, the operation is undefined.
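The probabilistic sum of Definition 10 can be modelled concretely. In the minimal sketch below (the representation and names are ours), a distribution type is a dictionary from sized types, encoded as strings, to probabilities, and a distribution context holds at most one variable; for shared variables we check only that the variable matches, eliding the underlying-type condition ⟨µ⟩ = ⟨ν⟩.

```python
# Sketch of Definition 10: probabilistic sum of distribution types and
# of (at most one-variable) distribution contexts.  Names are ours.

def dist_sum(mu, nu, p):
    """p · µ + (1 - p) · ν on distribution types."""
    result = {}
    for sigma, q in mu.items():
        result[sigma] = result.get(sigma, 0.0) + p * q
    for sigma, q in nu.items():
        result[sigma] = result.get(sigma, 0.0) + (1 - p) * q
    return result

def ctx_sum(theta, psi, p):
    """⊕_p on distribution contexts, following the four cases of
    Definition 10; returns None when the operation is undefined."""
    if theta and psi:
        (x, mu), (y, nu) = next(iter(theta.items())), next(iter(psi.items()))
        if x != y:
            return None                      # undefined
        return {x: dist_sum(mu, nu, p)}
    if theta:
        x, mu = next(iter(theta.items()))
        return {x: {s: p * q for s, q in mu.items()}}
    if psi:
        x, nu = next(iter(psi.items()))
        return {x: {s: (1 - p) * q for s, q in nu.items()}}
    return {}

mu = {"Nat^s": 1.0}
nu = {"Nat^r": 1.0}
print(ctx_sum({"x": mu}, {"x": nu}, 0.5))
# {'x': {'Nat^s': 0.5, 'Nat^r': 0.5}}
```

Note how the one-sided cases scale the surviving distribution by p or 1 − p, which is what makes the resulting distribution context possibly improper (total mass below 1).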
Definition 11 (Weighted Sum of Distribution Contexts)
Let (Θ_i)_{i∈I} be a non-empty family of distribution contexts and (p_i)_{i∈I} be a family of reals in [0, 1]. We define the weighted sum Σ_{i∈I} p_i · Θ_i as the distribution context x : Σ_{i∈I} p_i · µ_i when the following conditions are met:
1. ∃x, ∀i ∈ I, Θ_i = x : µ_i,
2. ∀(i, j) ∈ I², ⟨Θ_i⟩ = ⟨Θ_j⟩,
3. and Σ_{i∈I} p_i ≤ 1.
In any other case, the operation is undefined.

Definition 12 (Substitution of a Size Variable)
We define the substitution s[r/i] of a size variable i by a size r in a size as follows:

    i[r/i] = r        j[r/i] = j  (where j ≠ i)        ∞[r/i] = ∞        ŝ[r/i] = \widehat{s[r/i]}

We then define the substitution σ[s/i] (resp. µ[s/i]) of a size variable i by a size s in a sized or distribution type as:

    (σ → µ)[s/i] = σ[s/i] → µ[s/i]
    (Nat^r)[s/i] = Nat^{r[s/i]}
    ({ σ_j^{p_j} | j ∈ J })[s/i] = { (σ_j[s/i])^{p_j} | j ∈ J }

We define the substitution of a size variable in a sized or distribution context in the obvious way:

    ∅[s/i] = ∅        (x : σ, Γ)[s/i] = x : σ[s/i], Γ[s/i]        (x : µ)[s/i] = x : µ[s/i]

Lemma 5
1. For distribution types, (µ ⊕_p ν)[s/i] = µ[s/i] ⊕_p ν[s/i].
2. For distribution contexts, (Θ ⊕_p Ψ)[s/i] = Θ[s/i] ⊕_p Ψ[s/i].
3. For weighted sums of distribution contexts, (Σ_{i∈I} p_i · Θ_i)[s/i] = Σ_{i∈I} p_i · Θ_i[s/i].

Proof.
1. Let µ = { σ_i^{p′_i} | i ∈ I } and ν = { τ_j^{p″_j} | j ∈ J }. Then

    µ[s/i] ⊕_p ν[s/i]
    = { σ_i^{p′_i} | i ∈ I }[s/i] ⊕_p { τ_j^{p″_j} | j ∈ J }[s/i]
    = { (σ_i[s/i])^{p′_i} | i ∈ I } ⊕_p { (τ_j[s/i])^{p″_j} | j ∈ J }
    = { (σ_i[s/i])^{p·p′_i} | i ∈ I } + { (τ_j[s/i])^{(1−p)·p″_j} | j ∈ J }
    = ( { σ_i^{p·p′_i} | i ∈ I } + { τ_j^{(1−p)·p″_j} | j ∈ J } )[s/i]
    = (µ ⊕_p ν)[s/i]

2. Suppose that Θ = x : µ and that Ψ = x : ν. Then Θ ⊕_p Ψ = x : µ ⊕_p ν. It follows from (1) that Θ[s/i] ⊕_p Ψ[s/i] = x : µ[s/i] ⊕_p ν[s/i] = x : (µ ⊕_p ν)[s/i] = (Θ ⊕_p Ψ)[s/i].
3. The proof is similar to the previous cases. □

A subtyping relation allows us to lift the order on sizes to monadic sized types:

Definition 13 (Subtyping)
We define the subtyping relation ⊑ on sized types and distribution types by the following rules:

    σ ⊑ σ

    s ⪯ r  ⟹  Nat^s ⊑ Nat^r

    τ ⊑ σ  and  µ ⊑ ν  ⟹  σ → µ ⊑ τ → ν

    { σ_i^{p_i} | i ∈ I } ⊑ { τ_j^{p′_j} | j ∈ J }  whenever there exists f : I → J such that
    ∀i ∈ I, σ_i ⊑ τ_{f(i)}  and  ∀j ∈ J, Σ_{i ∈ f⁻¹(j)} p_i ≤ p′_j

Sized Walks and Distribution Types.
As we explained in Section 2, the rule typing letrec in the monadic, affine type system relies on an external decision procedure, computable in polynomial time. This procedure checks that the sized walk, a particular instance of one-counter Markov decision process (OC-MDP, see [29]) making no use of non-determinism, associated to the type of the recursive function of interest is indeed almost surely terminating. Let us now define the sized walk associated to a distribution type µ. We then make precise the connection with OC-MDPs, from which the polynomial-time computability of almost-sure termination of the sized walk follows.
Let I ⊆_fin ℕ be a finite set of integers and let (p_i)_{i∈I} be such that Σ_{i∈I} p_i ≤ 1. These parameters define a Markov chain whose set of states is ℕ and whose transition relation is defined as follows:
– the state 0 ∈ ℕ is stationary (i.e., one goes from 0 to 0 with probability 1),
– from a state s + 1 ∈ ℕ one moves:
  – to the state s + i with probability p_i, for every i ∈ I;
  – to 0 with probability 1 − Σ_{i∈I} p_i.
We call this Markov chain the sized walk on ℕ associated to (I, (p_i)_{i∈I}). A sized walk is almost surely terminating when it reaches 0 with probability 1 from any initial state.

Notably, checking whether a sized walk is terminating is relatively easy:
Proposition 1 (Decidability of AST for Sized Walks)
It is decidable in polynomial time whether a sized walk is AST.
Proof.
See Section 4.3. □
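Before turning to the exact decision procedure of Section 4.3, a sized walk can also be explored empirically. The following Monte Carlo sketch (the function name, cut-offs, and seeding are ours) estimates the probability of reaching 0 from a given initial state.

```python
# Monte Carlo exploration of the sized walk of Definition 14, given the
# parameters (I, (p_i)).  A sketch for intuition only; AST itself is
# decided exactly, not estimated.
import random

def simulate_sized_walk(I, probs, start, max_steps=10_000, runs=2_000, seed=0):
    """Estimate the probability that the walk reaches 0 from `start`
    within `max_steps` steps."""
    rng = random.Random(seed)
    terminated = 0
    moves = list(zip(I, probs))
    for _ in range(runs):
        state = start
        for _ in range(max_steps):
            if state == 0:               # 0 is stationary: terminated
                terminated += 1
                break
            u, acc = rng.random(), 0.0
            next_state = 0               # residual mass: jump straight to 0
            for i, p in moves:
                acc += p
                if u < acc:
                    next_state = state - 1 + i   # from s+1 move to s+i
                    break
            state = next_state
    return terminated / runs

# The biased walk moving down one step with probability 2/3 (i = 0) and
# up one step with probability 1/3 (i = 2) is AST: the estimate is
# close to 1.
print(simulate_sized_walk([0, 2], [2/3, 1/3], start=5))
```

Note that such a simulation can only suggest almost-sure termination; the point of Proposition 1 is that the question is decided exactly, in polynomial time, via the reduction to OC-MDPs below.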
Definition 15 (From Types to Sized Walks)
Consider a distribution type µ = { (Nat^{s_j} → ν_j)^{p_j} | j ∈ J } such that ∀j ∈ J, spine(s_j) = i. Then µ induces a sized walk, defined as follows. First, by definition, s_j must be of the shape î^{k_j} with k_j ≥ 0 for every j ∈ J. We set I = { k_j | j ∈ J } and q_{k_j} = p_j for every j ∈ J. The sized walk induced by the distribution type µ is then the sized walk associated to (I, (q_i)_{i∈I}).

Example 1
Let µ = { (Nat^i → Nat^∞)^{p₀}, (Nat^{î²} → Nat^∞)^{p₂} }. Then the induced sized walk is the one associated to ({0, 2}, (p₀, p₂)). In other words, it is the random walk on ℕ which is stationary on 0 and which, on a non-null integer i + 1, moves to i with probability p₀, moves to i + 2 with probability p₂, and jumps to 0 with probability 1 − p₀ − p₂. Note that the type µ, and therefore the associated sized walk, models a recursive function which calls itself on a size smaller by one unit with probability p₀, on a size greater by one unit with probability p₂, and which does not call itself with probability 1 − p₀ − p₂.

    Var:     Γ, x : σ | Θ ⊢ x : σ
    Var′:    Γ | x : σ ⊢ x : σ
    Succ:    Γ | Θ ⊢ V : Nat^s  ⟹  Γ | Θ ⊢ S V : Nat^ŝ
    Zero:    Γ | Θ ⊢ 0 : Nat^ŝ
    λ:       Γ, x : σ | Θ ⊢ M : µ  ⟹  Γ | Θ ⊢ λx.M : σ → µ
    Sub:     Γ | Θ ⊢ M : µ  and  µ ⊑ ν  ⟹  Γ | Θ ⊢ M : ν
    App:     Γ, ∆ | Θ ⊢ V : σ → µ,  Γ, Ξ | Ψ ⊢ W : σ  and  ⟨Γ⟩ = Nat
             ⟹  Γ, ∆, Ξ | Θ, Ψ ⊢ V W : µ
    Choice:  Γ | Θ ⊢ M : µ,  Γ | Ψ ⊢ N : ν  and  ⟨µ⟩ = ⟨ν⟩
             ⟹  Γ | Θ ⊕_p Ψ ⊢ M ⊕_p N : µ ⊕_p ν
    Let:     Γ, ∆ | Θ ⊢ M : { σ_i^{p_i} | i ∈ I },  ⟨Γ⟩ = Nat  and  Γ, Ξ, x : σ_i | Ψ_i ⊢ N : µ_i (∀i ∈ I)
             ⟹  Γ, ∆, Ξ | Θ, Σ_{i∈I} p_i · Ψ_i ⊢ let x = M in N : Σ_{i∈I} p_i · µ_i
    Case:    Γ | ∅ ⊢ V : Nat^ŝ,  ∆ | Θ ⊢ W : Nat^s → µ  and  ∆ | Θ ⊢ Z : µ
             ⟹  Γ, ∆ | Θ ⊢ case V of { S → W | 0 → Z } : µ
    letrec:  ⟨Γ⟩ = Nat,  i ∉ Γ,  i positive in ν,  ∀j ∈ J, spine(s_j) = i,
             { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } induces an AST sized walk,
             and  Γ | f : { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ν[î/i]
             ⟹  Γ, ∆ | Θ ⊢ letrec f = V : Nat^r → ν[r/i]

Figure 5: Affine distribution types for λ_⊕.

Typing Rules.
Judgements are of the shape Γ | Θ ⊢ M : µ. When a distribution µ = { σ¹ } is Dirac, we simply write it σ. The type system is defined in Figure 5. As earlier, we define sets of typable terms, and set Λ^{s,V}_⊕(Γ | Θ, σ) = { V | Γ | Θ ⊢ V : σ } and Λ^s_⊕(Γ | Θ, µ) = { M | Γ | Θ ⊢ M : µ }. We abbreviate Λ^{s,V}_⊕(∅ | ∅, σ) as Λ^{s,V}_⊕(σ) and Λ^s_⊕(∅ | ∅, σ) as Λ^s_⊕(σ).

This sized type system is a refinement of the affine simple type system for λ_⊕: if x₁ : σ₁, ..., x_n : σ_n | f : µ ⊢ M : ν, then it is easily checked that x₁ :: ⟨σ₁⟩, ..., x_n :: ⟨σ_n⟩, f :: ⟨µ⟩ ⊢ M :: ⟨ν⟩.

Lemma 6 (Properties of Distribution Types)
– Γ | Θ ⊢ V : µ ⟹ µ is Dirac.
– Γ | Θ ⊢ M : µ ⟹ µ is proper.

Proof.
Immediate by inspection of the rules. □
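The extraction of a sized walk from a distribution type (Definition 15) can be sketched very simply, representing a size î^{k_j} by the integer k_j. In the sketch below (names are ours), components of the type sharing the same k are merged by summing their probabilities, which is an assumption of ours rather than something the definition spells out.

```python
# Sketch of Definition 15: reading the sized-walk parameters (I, (q_i))
# off a distribution type {(Nat^{s_j} -> nu_j)^{p_j} | j in J} whose
# sizes all have spine variable i.  A size i hat^k is just k here.

def sized_walk_of_type(entries):
    """entries: list of (k_j, p_j), one per component of the
    distribution type.  Returns (I, probs) sorted by k, merging
    components that share the same k."""
    weights = {}
    for k, p in entries:
        weights[k] = weights.get(k, 0.0) + p
    ks = sorted(weights)
    return ks, [weights[k] for k in ks]

# Illustrative type (our numbers): the function calls itself on a size
# smaller by one (k = 0) with probability 1/2 and larger by one (k = 2)
# with probability 1/4; with the residual 1/4 it does not recurse.
I, probs = sized_walk_of_type([(0, 0.5), (2, 0.25)])
print(I, probs)   # [0, 2] [0.5, 0.25]
```

The residual probability 1 − Σ q_i is not stored explicitly: in the sized walk of Definition 14 it is exactly the probability of jumping directly to 0.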
We now prove Proposition 1 by reducing sized walks to deterministic one-counter Markov decision processes (DOC-MDPs), and then using a result of [29] to conclude. Please note that the Markov decision processes of [29] are more general, as they allow non-determinism. They are called one-counter Markov decision processes (OC-MDPs), and contain in particular all the DOC-MDPs. We omit this feature in our presentation.
Definition 16 (Markov Decision Process)
A Markov decision process (MDP) is a tuple (V, ↦, Pr) such that V is a finite or countable set of vertices, ↦ ⊆ V × V is a total transition relation, and Pr is a probability assignment mapping each v ∈ V to a probability distribution associating a rational, non-null probability to each edge outgoing from v. These distributions are moreover required to sum to 1.

Definition 17 (Deterministic One-Counter Markov Decision Process) A deterministic one-counter Markov decision process (DOC-MDP) is a tuple (Q, δ_{=0}, δ_{>0}, P_{=0}, P_{>0}) such that:
• Q is a finite set of states,
• δ_{=0} ⊆ Q × {0, 1} × Q and δ_{>0} ⊆ Q × {−1, 0, 1} × Q are the sets of zero and positive transitions, satisfying that every q ∈ Q has at least one zero and one positive outgoing transition,
• P_{=0} (resp. P_{>0}) is a probability assignment mapping every q ∈ Q to a probability distribution over the outgoing transitions of δ_{=0} (resp. δ_{>0}) from q. These distributions are required to attribute a non-null, rational probability to every outgoing transition, and to sum to 1.

Definition 18 (Induced Markov Decision Process)
A DOC-MDP (Q, δ_{=0}, δ_{>0}, P_{=0}, P_{>0}) induces an MDP (Q × ℕ, ↦, Pr) such that, for q ∈ Q and n ∈ ℕ:
• for every state q′ such that (q, m, q′) ∈ δ_{=0}, we have (q, 0) ↦ (q′, m), and the probability of this transition is the one attributed by P_{=0}(q) to the transition (q, m, q′),
• and for every state q′ such that (q, m, q′) ∈ δ_{>0}, we have, for n > 0, (q, n) ↦ (q′, n + m), and the probability of this transition is the one attributed by P_{>0}(q) to the transition (q, m, q′).
This MDP is said to terminate when it reaches counter value 0 in any state q ∈ Q.
1. This is the only restriction to overcome (using intermediatestates) to encode sized walks in DOC-MDPs, so that the MDP they induce coincide with theoriginal sized walk. We will then obtain the result of polynomial time decidability of terminationwith probability 1 using the following proposition:
Proposition 2 ([29], Theorem 4.1)
It is decidable in polynomial time whether the MDP induced by an OC-MDP (and thus by a DOC-MDP) terminates with probability 1.

We now encode sized walks as DOC-MDPs:
Definition 19 (DOC-MDP Corresponding to a Sized Walk)
Consider the sized walk on ℕ associated to (I, (p_i)_{i∈I}). We define the corresponding DOC-MDP (Q, δ_{=0}, δ_{>0}, P_{=0}, P_{>0}) as follows. Let us first consider the following set of states:

    Q = { q_α, q_zero } ∪ { q_1, ..., q_{j−2} | j = max { i ∈ I | i ≥ 3 } }

where q_α is the "main" state of the DOC-MDP and the other ones will be used for encoding purposes. We define the transitions of δ_{>0} as follows:
• we add the transition (q_zero, −1, q_zero) with probability 1,
• for every j ∈ { 2, ..., max { i ∈ I | i ≥ 3 } − 2 }, we add the transition (q_j, 1, q_{j−1}) with probability 1,
• we add the transition (q_1, 1, q_α) with probability 1,
• for i ∈ I ∩ {0, 1, 2}, we add the transition (q_α, i − 1, q_α) and attribute it probability p_i,
• for i ∈ I \ {0, 1, 2}, we add the transition (q_α, 1, q_{i−2}) and attribute it probability p_i,
• if 1 − Σ_{i∈I} p_i > 0, we add the transition (q_α, −1, q_zero) with probability 1 − Σ_{i∈I} p_i.
Finally, we define δ_{=0} as follows: for every state q ∈ Q, we add the transition (q, 0, q) and attribute it probability 1.

It is easily checked that, by construction, these DOC-MDPs induce the same Markov decision processes as sized walks:
Proposition 3
The sized walk on ℕ associated to (I, (p_i)_{i∈I}) coincides with the MDP induced by the corresponding DOC-MDP.

This allows us to deduce from the result of [29] the polynomial-time decidability of AST for sized walks:
Corollary 2 (Proposition 1)
It is decidable in polynomial time whether a sized walk is almost-surely terminating.
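The encoding of Definition 19 can be sketched as a function producing the positive-transition table of the DOC-MDP. In this sketch (the tuple-based representation is ours), the zero transitions, which are just self-loops (q, 0, q), are omitted; larger upward jumps i − 1 > 1 are decomposed into chains of +1 steps through the intermediate states q_{i−2}, ..., q_1.

```python
# Sketch of Definition 19: building the positive transitions of the
# DOC-MDP corresponding to a sized walk (I, (p_i)).  Each entry is
# (state, counter_delta, state', probability) with counter_delta in
# {-1, 0, +1}; zero transitions (self-loops) are omitted.

def doc_mdp_of_sized_walk(I, probs):
    trans = [("q_zero", -1, "q_zero", 1.0)]      # drain the counter to 0
    big = [i for i in I if i > 2]
    # chains of +1 steps realising a jump of i - 1 > 1
    for j in range(2, max(big, default=2) - 1):
        trans.append((f"q_{j}", 1, f"q_{j-1}", 1.0))
    if big:
        trans.append(("q_1", 1, "q_alpha", 1.0))
    for i, p in zip(I, probs):
        if i <= 2:
            trans.append(("q_alpha", i - 1, "q_alpha", p))   # direct move
        else:
            trans.append(("q_alpha", 1, f"q_{i-2}", p))      # start a chain
    residual = 1.0 - sum(probs)
    if residual > 1e-12:
        trans.append(("q_alpha", -1, "q_zero", residual))    # stop recursing
    return trans

for t in doc_mdp_of_sized_walk([0, 2], [2/3, 1/3]):
    print(t)
```

For i = 4, say, the walk's jump of +3 is realised as q_α → q_2 → q_1 → q_α, each step incrementing the counter by one, which respects the |m| ≤ 1 constraint of DOC-MDPs.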
The type system enjoys a form of subject reduction, which can be understood from the following example. Remark that the type system allows us to derive the sequent

    ∅ | ∅ ⊢ 0 ⊕ 0 : { (Nat^ŝ)^{1/2}, (Nat^{r̂²})^{1/2} }    (9)

The distribution type typing 0 ⊕ 0 contains information about the types of the two probabilistic branches of 0 ⊕ 0, which will be separated into two different terms during the reduction; but these two terms will not be distinguished by the operational semantics: [[0 ⊕ 0]] = { 0¹ }. The subject reduction procedure needs to keep track of more information, namely that 0 ⊕ 0 reduced to 0 with type Nat^ŝ in one copy, and again to 0 but with type Nat^{r̂²} in the other copy. To formalize this distinction, we require a few preliminary definitions: the typed term 0 ⊕ 0 will reduce to the following closed distribution of typed terms:

    { (0 : Nat^ŝ)^{1/2}, (0 : Nat^{r̂²})^{1/2} }    (10)

which types the pseudo-representation ⟨0^{1/2}, 0^{1/2}⟩ of [[0 ⊕ 0]]. The quantity which will be preserved during the reduction is the average type of (10):

    (1/2) · { (Nat^ŝ)¹ } + (1/2) · { (Nat^{r̂²})¹ } = { (Nat^ŝ)^{1/2}, (Nat^{r̂²})^{1/2} }

which we call the expectation type of (10), and which coincides with the type of the initial term (9).

Definition 20 (Distributions of Distribution Types, of Typed Terms)
– A distribution of distribution types is a distribution D over the set of distribution types such that µ, ν ∈ S(D) ⟹ ⟨µ⟩ = ⟨ν⟩.
– A distribution of typed terms, or typed distribution, is a distribution of typing sequents which are derivable in the monadic, affine sized type system. The representation of such a distribution thus has the following form: { (Γ_i | Θ_i ⊢ M_i : µ_i)^{p_i} | i ∈ I }.
In the sequel, we restrict to the uniform case, in which all the terms appearing in the sequents are typed with distribution types of the same fixed underlying type. We denote this unique simple type κ as ⟨µ⃗⟩.
– A distribution of closed typed terms, or closed typed distribution, is a typed distribution in which all contexts are ∅ | ∅. In this case, we simply write the representation of the distribution as { (M_i : µ_i)^{p_i} | i ∈ I }, or even as (M_i : µ_i)^{p_i} when the indexing is clear from context. We write pseudo-representations in a similar way.
– The underlying term distribution of a closed typed distribution { (M_i : µ_i)^{p_i} | i ∈ I } is the distribution { (M_i)^{p_i} | i ∈ I }.

Definition 21 (Expectation Types)
Let (M_i : µ_i)^{p_i} be a closed typed distribution. We define its expectation type as the distribution type E((M_i : µ_i)^{p_i}) = Σ_{i∈I} p_i · µ_i.

Lemma 7
Expectation is linear:
• E((M_i : µ_i)^{p_i} + (N_j : ν_j)^{p′_j}) = E((M_i : µ_i)^{p_i}) + E((N_j : ν_j)^{p′_j}),
• E((M_i : µ_i)^{p·p′_i}) = p · E((M_i : µ_i)^{p′_i}).

Lemma 8 (Subtyping Probabilistic Sums)
Suppose that Σ(ν ⊕_p ξ) = 1 and that ν ⊕_p ξ ⊑ µ. Then there exist ν′ and ξ′ such that µ = ν′ ⊕_p ξ′, ν ⊑ ν′, and ξ ⊑ ξ′. Note that this implies that S(ν′) ∪ S(ξ′) = S(µ).

Proof.
Let ν = { σ_i^{p′_i} | i ∈ I } and ξ = { τ_j^{p″_j} | j ∈ J }. We assume, without loss of generality, that I and J are chosen in such a way that, setting K = I ∩ J, we have ∃(i, j) ∈ I × J, σ_i = τ_j ⟺ i = j ∈ K. It follows that

    ν ⊕_p ξ = { σ_i^{p·p′_i} | i ∈ I \ K } + { τ_j^{(1−p)·p″_j} | j ∈ J \ K } + { σ_i^{p·p′_i + (1−p)·p″_i} | i ∈ K }

Set µ = { θ_l^{p‴_l} | l ∈ L }. Since ν ⊕_p ξ ⊑ µ and Σ(ν ⊕_p ξ) = 1, there exists a decomposition

    µ = [ θ_i^{p·p′_i} | i ∈ I \ K ] + [ θ_j^{(1−p)·p″_j} | j ∈ J \ K ] + [ θ_k^{p·p′_k + (1−p)·p″_k} | k ∈ K ]

(note that the supports of these distributions may have a non-empty intersection), and this decomposition is such that ∀i ∈ I, σ_i ⊑ θ_i and ∀j ∈ J, τ_j ⊑ θ_j. We define ν′ = { θ_i^{p′_i} | i ∈ I } and ξ′ = { θ_j^{p″_j} | j ∈ J }, which satisfy ν ⊑ ν′ and ξ ⊑ ξ′ but also, by construction, µ = ν′ ⊕_p ξ′. □

Corollary 3
Suppose that µ = Σ_{i∈I} p_i · µ_i is a distribution such that µ ⊑ ν and that Σµ = 1. Then there exists a family (ν_i)_{i∈I} of distributions such that ν = Σ_{i∈I} p_i · ν_i and, for all i ∈ I, µ_i ⊑ ν_i.

Note that the requirement that Σµ = 1 is not necessary to obtain this result, although it simplifies the reasoning.

Lemma 9 (Generation Lemma for Typing)
1. ∅ | ∅ ⊢ let x = V in N : µ ⟹ ∃(ν, σ), ∅ | ∅ ⊢ V : σ and x : σ | ∅ ⊢ N : ν and ν ⊑ µ.
2. ∅ | ∅ ⊢ V W : µ ⟹ ∃(ν, σ), ∅ | ∅ ⊢ V : σ → ν and ∅ | ∅ ⊢ W : σ and ν ⊑ µ.
3. ∅ | ∅ ⊢ λx.M : σ → µ ⟹ ∃(ν, τ), x : τ | ∅ ⊢ M : ν and σ ⊑ τ and ν ⊑ µ.
4. ∅ | ∅ ⊢ M ⊕_p N : µ ⟹ ∃(ν, ξ), ∅ | ∅ ⊢ M : ν and ∅ | ∅ ⊢ N : ξ with Σ(ν ⊕_p ξ) = 1, ν ⊕_p ξ ⊑ µ and ⟨µ⟩ = ⟨ν⟩ = ⟨ξ⟩.
5. ∅ | ∅ ⊢ let x = M in N : ν ⟹ ∃(I, (σ_i)_{i∈I}, (p_i)_{i∈I}, (µ_i)_{i∈I}) such that
   • Σ_{i∈I} p_i · µ_i ⊑ ν,
   • Σ(Σ_{i∈I} p_i · µ_i) = 1,
   • ∅ | ∅ ⊢ M : { σ_i^{p_i} | i ∈ I },
   • ∀i ∈ I, x : σ_i | ∅ ⊢ N : µ_i.
6. ∅ | ∅ ⊢ case V of { S → W | 0 → Z } : µ ⟹ ∃(s, ν) such that ∅ | ∅ ⊢ V : Nat^ŝ and ∅ | ∅ ⊢ W : Nat^s → ν and ∅ | ∅ ⊢ Z : ν with ν ⊑ µ.
7. ∅ | ∅ ⊢ letrec f = V : µ ⟹ ∃((p_j)_{j∈J}, (s_j)_{j∈J}, i) such that
   • Nat^r → ν[r/i] ⊑ µ,
   • ∀j ∈ J, spine(s_j) = i,
   • i ∉ Γ and i positive in ν,
   • { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } induces an AST sized walk,
   • ∅ | f : { (Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ν[î/i].

Proof.
By inspection of the rules, the key point being that the subtyping rule is the only one which is not syntax-directed, and that by transitivity of ⊑ we can compose several successive subtyping rules. In case (5), we have Σ(Σ_{i∈I} p_i · µ_i) = 1 since it appears that ∅ | ∅ ⊢ let x = M in N : Σ_{i∈I} p_i · µ_i; Lemma 6 then allows us to conclude that this distribution of types has sum 1. □

Definition 22 (Context Extending Another)
We say that a context ∆ | Ψ extends a context Γ | Θ when:
• for every x : σ ∈ Γ, we have x : σ ∈ ∆,
• and either Θ = ∅ or Θ = Ψ.
In other words, ∆ | Ψ extends Γ | Θ when there exist Ξ and Φ such that ∆ = Γ, Ξ and Ψ = Θ, Φ.

Lemma 10
Let M be a closed term such that Γ | Θ ⊢ M : µ . Then for every context ∆ | Ψ extending Γ | Θ , we have ∆ | Ψ ⊢ M : µ . Proof.
We proceed by induction on the structure of M . We set ∆ = Γ , Ξ and Ψ = Θ , Φ. • If M = x is a variable, the result is immediate. • If M = 0, the result is immediate. • If M = S V , we have by typing rules that σ = Nat b s and that Γ | Θ ⊢ V : Nat s . By inductionhypothesis ∆ | Ψ ⊢ V : Nat s from which we conclude using the typing rule for S . • If M = λx.N , we have σ = τ → µ and Γ , x : τ | Θ ⊢ N : µ . By definition, ∆ , x : τ | Ψextends Γ , x : τ | Θ so that we have ∆ , x : τ | Ψ ⊢ N : µ . The result follows using theLambda rule. 18 If M = letrec f = V , the typing rule is of the shape h Γ i = Nat i / ∈ Γ and i positive in ν and ∀ j ∈ J , spine ( s j ) = i (cid:8) ( Nat s j → ν [ s j / i ]) p j (cid:12)(cid:12) j ∈ J (cid:9) induces an AST sized walkΓ | f : (cid:8) ( Nat s j → ν [ s j / i ]) p j (cid:12)(cid:12) j ∈ J (cid:9) ⊢ V : Nat b i → ν [ b i / i ] letrec Γ , Γ | Θ ⊢ letrec f = V : Nat r → ν [ r / i ]Let ∆ = ∆ , ∆ with ∆ the maximal subcontext consisting only of variables of affine type Nat . Then ∆ | f : (cid:8) ( Nat s j → ν [ s j / i ]) p j (cid:12)(cid:12) j ∈ J (cid:9) extends Γ | f : (cid:8) ( Nat s j → ν [ s j / i ]) p j (cid:12)(cid:12) j ∈ J (cid:9) so that by induction hypothesis ∆ | f : (cid:8) ( Nat s j → ν [ s j / i ]) p j (cid:12)(cid:12) j ∈ J (cid:9) ⊢ V : Nat b i → ν [ b i / i ] so that we can conclude using the letrec rule again that∆ , ∆ | Ψ ⊢ letrec f = V : Nat r → ν [ r / i ] . • If M = V W , the typing derivation provides contexts such that Γ = Γ , Γ , Γ and thatΘ = Θ , Θ with Γ , Γ | Θ ⊢ V : σ → µ and Γ , Γ | Θ ⊢ W : σ . By inductionhypothesis, Γ , Γ , Ξ | Θ , Φ ⊢ W : σ from which we conclude using the App rule. • If M = let x = N in L , the typing derivation provides contexts such that Γ = Γ , Γ , Γ and that Θ = Θ , P i ∈I p i · Θ ,i with Γ , Γ | Θ ⊢ M : (cid:8) σ p i i (cid:12)(cid:12) i ∈ I (cid:9) and Γ , Γ , x : σ i | Θ ,i ⊢ N : µ i . 
By induction hypothesis, Γ , Γ , Ξ | Θ , Φ ⊢ M : (cid:8) σ p i i (cid:12)(cid:12) i ∈ I (cid:9) fromwhich we conclude using the Let rule. • If M = N ⊕ p L , then Θ = Θ ⊕ p Θ with Γ | Θ ⊢ M : µ and Γ | Θ ⊢ N : ν . By applyinginduction hypothesis twice, we obtain Γ , Ξ | Θ , Φ ⊢ M : µ and Γ , Ξ | Θ , Φ ⊢ N : ν . Weapply the Choice rule; it remains to prove that (Θ , Φ) ⊕ p (Θ , Φ) = Θ ⊕ p Θ , Φ whichis easily done by definition of ⊕ p . • If M = case V of { S → W | → Z } , the typing derivation provides contexts such thatΓ = Γ , Γ with Γ | ∅ ⊢ V : Nat b s and Γ | Θ ⊢ W : Nat s → µ and Γ | Θ ⊢ Z : µ . Byinduction hypothesis, Γ , Ξ | Θ , Φ ⊢ W : Nat s → µ and Γ , Ξ | Θ , Φ ⊢ Z : µ from which weconclude using the Case rule. (cid:3) Lemma 11 (Closed Value Substitution)
Suppose that Γ , x : σ | Θ ⊢ M : µ and that ∅ | ∅ ⊢ V : σ . Then Γ | Θ ⊢ M [ V /x ] : µ . Proof.
As usual, the proof is by induction on the structure of the typing derivation. We proceedby case analysis on the last rule: – If it is Var, we have two cases. • If the conclusion is Γ , x : σ | Θ ⊢ x : σ then x [ V /x ] = V . By Lemma 10, we obtainthat Γ | Θ ⊢ V : σ . • If the conclusion is Γ , x : σ, y : τ | Θ ⊢ y : τ then y [ V /x ] = y and we obtainΓ , y : τ | Θ ⊢ y : τ using the Var rule. – If it is Var’, the situation is similar to the latter case of the previous one. The conclusion isΓ , x : σ | y : τ ⊢ y : τ and y [ V /x ] = y so that we obtain Γ | y : τ ⊢ y : τ using the Var’rule. 19 If it is Succ, then M = S W and µ = Nat b s . We obtain by induction hypothesis thatΓ | Θ ⊢ W [ V /x ] :
Nat s and we conclude using the Succ rule that Γ | Θ ⊢ ( S W )[ V /x ] :
Nat b s . – If it is Zero, we obtain immediately the result. – If it is λ , suppose that Γ , x : σ | Θ ⊢ λy.M : τ → µ . This comes from Γ , x : σ, y : τ | Θ ⊢ M : µ to which we apply the induction hypothesis, obtaining that Γ , y : τ | Θ ⊢ M [ V /x ] : µ .Then applying the λ rule gives the expected result. – For all the remaining cases, as for the λ rule, the result is obtained in a straightforward wayfrom the induction hypothesis. (cid:3) Lemma 12 (Substitution for distributions)
Suppose that Γ | x : { σ_i^{p_i} | i ∈ I } ⊢ M : µ and that, for every i ∈ I, we have ∅ | ∅ ⊢ V : σ_i. Then Γ | ∅ ⊢ M[V/x] : µ.

Proof.
The proof is by induction on the structure of the typing derivation. We proceed by caseanalysis on the last rule: – If it is Var, we have M = y = x and y ∈ Γ. It follows that y [ V /x ] = y and we obtainΓ | ∅ ⊢ M [ V /x ] : µ simply by the Var rule. – If it is Var’, we have M = x so that M [ V /x ] = V . Moreover, the distribution (cid:8) σ p i i (cid:12)(cid:12) i ∈ I (cid:9) must be Dirac; we denote by σ the unique element of its support. Note that we also obtain σ = µ . As we supposed that ∅ | ∅ ⊢ V : σ , Lemma 10 gives Γ | ∅ ⊢ V : σ from which weconclude. – If it is LetRec, then x does not occur free in M . It follows that M [ V /x ] = M , and we canderive Γ | ∅ ⊢ M [ V /x ] : µ using a LetRec rule with the same hypothesis. – All others cases are treated straightforwardly using the induction hypothesis. (cid:3)
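The size arithmetic used throughout the following lemmas (the substitution s[r/i] of Definition 12 and the order ⪯) admits a very small concrete model: a size is either ∞ or a pair of a spine variable and a number of successor hats. The representation and function names below are ours.

```python
# Minimal model of sizes: None stands for ∞, and a pair (v, k) stands
# for the spine variable v under k successor hats.

INF = None

def subst(s, i, r):
    """s[r/i]: substitute the size r for the variable i in s."""
    if s is INF:
        return INF
    var, k = s
    if var != i:
        return s
    if r is INF:              # hat^k(∞) is equivalent to ∞
        return INF
    return (r[0], r[1] + k)   # hats compose additively

def leq(s, r):
    """The order s ⪯ r: everything is below ∞; otherwise the spine
    variables must agree and the number of hats can only grow."""
    if r is INF:
        return True
    if s is INF:
        return False
    return s[0] == r[0] and s[1] <= r[1]

s = ("i", 1)                           # the size i with one hat
print(subst(s, "i", ("j", 2)))         # ('j', 3)
print(leq(s, ("i", 3)), leq(s, INF))   # True True
```

One can check on this model the monotonicity properties of Lemmas 14 and 15: applying a hat preserves ⪯, and substitution is monotonic in both the substituted size and the target.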
Lemma 13
1. Γ | Θ ⊢ S V : Nat^ŝ ⟹ Γ | Θ ⊢ V : Nat^s,
2. Γ | Θ ⊢ 0 : Nat^s ⟹ ∃r, s = r̂,
3. Γ | Θ ⊢ S V : Nat^s ⟹ ∃r, s = r̂.

Proof.
All points are immediate by inspection of the typing rules introducing 0 and S. Recall that by the subtyping rules ∞̂ = ∞. □

Lemma 14 (Successor and Size Order)
Suppose that s ⪯ r. Then ŝ ⪯ r̂.

Proof.
By definition of ⪯, if s ⪯ r, there are two cases: either r = ∞, or spine(s) = spine(r) = i with s = î^k, r = î^{k′} and k ≤ k′. In both cases the conclusion is immediate. □

Lemma 15 (Size Substitutions are Monotonic)
1. Suppose that s ⪯ r. Then for any size t and size variable i we have s[t/i] ⪯ r[t/i].
2. Suppose that s ⪯ r. Then for any size t and size variable i we have t[s/i] ⪯ t[r/i].

Proof.
1. We proceed by induction on the derivation proving that s ⪯ r, by case analysis on the last rule.
– If it is s ⪯ s, then s = r and the result is immediate.
– If it is the transitivity rule, deriving s ⪯ r from s ⪯ u and u ⪯ r, then by induction hypothesis s[t/i] ⪯ u[t/i] and u[t/i] ⪯ r[t/i], so that we conclude using this same deduction rule.
– If it is s ⪯ ŝ, we have r = ŝ and, using the definition of size substitution, we obtain r[t/i] = ŝ[t/i] = \widehat{s[t/i]}. We conclude using the same deduction rule.
– If it is s ⪯ ∞, we have ∞[t/i] = ∞ and we obtain immediately s[t/i] ⪯ ∞.
2. We proceed by case analysis on t. There are four cases:
– If t = i, then t[s/i] = s ⪯ r = t[r/i].
– If t = j ≠ i, then t[s/i] = j ⪯ j = t[r/i].
– If t = û, we have by induction hypothesis that u[s/i] ⪯ u[r/i]. We conclude using Lemma 14.
– If t = ∞, then t[s/i] = ∞ ⪯ ∞ = t[r/i]. □

Lemma 16 (Size Substitutions and Subtyping)
1. If σ ⊑ τ, then for any size s and size variable i, we have σ[s/i] ⊑ τ[s/i]. If µ ⊑ ν, then for any size s and size variable i, we have µ[s/i] ⊑ ν[s/i].
2. If i pos σ and s ⪯ r, we have σ[s/i] ⊑ σ[r/i]. If i pos µ and s ⪯ r, we have µ[s/i] ⊑ µ[r/i].
3. If i neg σ and s ⪯ r, we have σ[r/i] ⊑ σ[s/i]. If i neg µ and s ⪯ r, we have µ[r/i] ⊑ µ[s/i].

Proof.
1. We prove both statements at the same time by induction on the derivation proving that µ ⊑ ν (or σ ⊑ τ ). – If the last rule is σ ⊑ σ , then µ = ν = σ and the result is immediate. – If the last rule is t r Nat t ⊑ Nat r then by Lemma 15 we have t [ s / i ] r [ s / i ] so that (cid:0) Nat t (cid:1) [ s / i ] = Nat t [ s / i ] ⊑ Nat r [ s / i ] =( Nat r ) [ s / i ]. – If the last rule is τ ⊑ σ µ ⊑ νσ → µ ⊑ τ → ν then by induction hypothesis τ [ s / i ] ⊑ σ [ s / i ] and µ [ s / i ] ⊑ ν [ s / i ] from which we concludeusing the same rule. – If the last rule is ∃ f : I → J , (cid:0) ∀ i ∈ I , σ i ⊑ τ f ( i ) (cid:1) and (cid:16) ∀ j ∈ J , P i ∈ f − ( j ) p i ≤ p ′ j (cid:17)(cid:8) σ p i i (cid:12)(cid:12) i ∈ I (cid:9) ⊑ n τ p ′ j j (cid:12)(cid:12) j ∈ J o we obtain by induction hypothesis that for every i ∈ I σ i [ s / i ] ⊑ τ f ( i ) [ s / i ] from which weconclude using the same rule.2. We prove (2) and (3) by mutual induction on µ (or σ ). Let s r . – If σ = Nat t , – Suppose that i pos Nat t . Note that this does not assume anything on t . Since s r ,we have (cid:0) Nat t (cid:1) [ s / i ] = Nat t [ s / i ] ⊑ Nat t [ r / i ] = (cid:0) Nat t (cid:1) [ r / i ] where we used themonotonicity of size substitution (Lemma 15). – Suppose that i neg Nat t . Then i / ∈ t and (cid:0) Nat t (cid:1) [ s / i ] = (cid:0) Nat t (cid:1) [ r / i ] so that we canconclude. 21 If σ = τ → µ , – Suppose that i pos σ . Then i neg τ and i pos µ . By induction hypothesis, τ [ r / i ] ⊑ τ [ s / i ]and µ [ s / i ] ⊑ µ [ r / i ]. By the subtyping rules, σ [ s / i ] = τ [ s / i ] → µ [ s / i ] ⊑ τ [ r / i ] → µ [ r / i ] = σ [ r / i ]. – Suppose that i neg σ . The reasoning is symmetrical. – If µ = (cid:8) σ p i i (cid:12)(cid:12) i ∈ I (cid:9) , – Suppose that i pos µ . Then for every i ∈ I we have i pos σ i and by inductionhypothesis σ i [ s / i ] ⊑ σ i [ r / i ]. We obtain that µ [ s / i ] ⊑ µ [ r / i ] using the identity asreindexing function. – Suppose that i neg µ . 
The reasoning is symmetrical. □

Lemma 17 (Size Substitution) If Γ | Θ ⊢ M : µ, then for any size variable i and any size s we have Γ[s/i] | Θ[s/i] ⊢ M : µ[s/i].

Proof.
We assume that i / ∈ s , without loss of generality: else we introduce a fresh size variable j , substitute it with s , and then substitute i with j . The proof is by induction on the typingderivation. We proceed by case analysis on the last rule. – If it is Var: we have Γ , x : σ | Θ ⊢ x : σ and deduce immediately using Var rule again thatΓ[ s / i ] , x : σ [ s / i ] | Θ[ s / i ] ⊢ x : σ [ s / i ]. – If it is Var’: we have Γ | x : σ ⊢ x : σ and deduce immediately using Var’ rule again thatΓ[ s / i ] | x : σ [ s / i ] ⊢ x : σ [ s / i ]. – If it is Succ: then M = S V and µ = Nat b r . By induction hypothesis, Γ[ s / i ] | Θ[ s / i ] ⊢ V :( Nat r ) [ s / i ]. But ( Nat r ) [ s / i ] = Nat r [ s / i ] so that by the Succ rule Γ[ s / i ] | Θ[ s / i ] ⊢ S V : Nat [ r [ s / i ] .We use the equality Nat [ r [ s / i ] = (cid:16) Nat b r (cid:17) [ s / i ] to conclude. – If it is Zero: the result is immediate. – If it is λ : we have M = λx.N and µ = σ → ν . By induction hypothesis, Γ[ s / i ] , x : σ [ s / i ] | Θ[ s / i ] ⊢ N : ν [ s / i ]. By application of the λ rule, Γ[ s / i ] | Θ[ s / i ] ⊢ λx.N : σ [ s / i ] → ν [ s / i ]. We conclude using σ [ s / i ] → ν [ s / i ] = ( σ → ν ) [ s / i ]. – If it is Sub: the hypothesis of the rule is Γ | Θ ⊢ M : ν for ν ⊑ µ . By induction hypothesis,Γ[ s / i ] | Θ[ s / i ] ⊢ M : ν [ s / i ]. But by Lemma 16 we have ν [ s / i ] ⊑ µ [ s / i ]. We conclude using theSub rule. – If it is App, we have M = V W and Γ = Γ , Γ , Γ and Θ = Θ , Θ with h Γ i = Nat ,Γ , Γ | Θ ⊢ V : σ → µ and Γ , Γ | Θ ⊢ W : σ . Applying the induction hypothesis twicegives Γ [ s / i ] , Γ [ s / i ] | Θ [ s / i ] ⊢ V : ( σ → µ ) [ s / i ] and Γ [ s / i ] , Γ [ s / i ] | Θ [ s / i ] ⊢ W : σ [ s / i ].Since σ [ s / i ] → µ [ s / i ] = ( σ → µ ) [ s / i ], we can use the Application rule to conclude. – If it is Choice, then M = N ⊕ p L and µ = ν ⊕ p ξ and Θ = Θ ⊕ p Θ with Γ | Θ ⊢ N : ν andΓ | Θ ⊢ L : ξ and h ν i = h ξ i . 
The induction hypothesis, applied twice, gives Γ[s/i] | Θ₁[s/i] ⊢ N : ν[s/i] and Γ[s/i] | Θ₂[s/i] ⊢ L : ξ[s/i], from which we conclude using the Choice rule again and the equality ν[s/i] ⊕_p ξ[s/i] = (ν ⊕_p ξ)[s/i] from Lemma 5.
– If it is Let: then M = (let x = N in L) and μ = Σ_{i∈I} p_i · ν_i and Γ = Γ₁, Γ₂, Γ₃ and Θ = Θ₁, Σ_{i∈I} Θ₂,ᵢ with Γ₁, Γ₂ | Θ₁ ⊢ N : { σ_i^{p_i} | i ∈ I } and, for every i ∈ I, Γ₁, Γ₃ | Θ₂,ᵢ ⊢ L : ν_i, and ⟨Γ₁⟩ = Nat. By repeated applications of the induction hypothesis, Γ₁[s/i], Γ₂[s/i] | Θ₁[s/i] ⊢ N : { σ_i^{p_i} | i ∈ I }[s/i] and, for every i ∈ I, Γ₁[s/i], Γ₃[s/i] | Θ₂,ᵢ[s/i] ⊢ L : ν_i[s/i]. We first use the equality { σ_i^{p_i} | i ∈ I }[s/i] = { (σ_i[s/i])^{p_i} | i ∈ I }, which comes from the definition of size substitution. We conclude using the Let rule again and the equality (Σ_{i∈I} p_i · ν_i)[s/i] = Σ_{i∈I} p_i · ν_i[s/i] from Lemma 5.
– If it is Case: then M = case V of { S → W | 0 → Z } and Γ = Γ₁, Γ₂ with Γ₁ | ∅ ⊢ V : Nat^r̂, Γ₂ | Θ ⊢ W : Nat^r → μ and Γ₂ | Θ ⊢ Z : μ. We apply the induction hypothesis three times, and obtain Γ₁[s/i] | ∅ ⊢ V : (Nat^r̂)[s/i], Γ₂[s/i] | Θ[s/i] ⊢ W : (Nat^r → μ)[s/i] and Γ₂[s/i] | Θ[s/i] ⊢ Z : μ[s/i]. We use the equalities (Nat^r̂)[s/i] = Nat^{(r[s/i])̂} and (Nat^r → μ)[s/i] = Nat^{r[s/i]} → μ[s/i], and then the Case rule, to conclude.
– If it is letrec, we carefully adapt the proof scheme of [18, Lemma 3.8].
We have M = (letrec f = V) and μ = Nat^r → ν[r/j] and Γ = Γ₁, Γ₂ with:
– ⟨Γ₁⟩ = Nat,
– j ∉ Γ, j positive in ν, and ∀j ∈ J, spine(r_j) = j,
– { (Nat^{r_j} → ν[r_j/j])^{p_j} | j ∈ J } induces an AST sized walk,
– and Γ₂ | f : { (Nat^{r_j} → ν[r_j/j])^{p_j} | j ∈ J } ⊢ V : Nat^ĵ → ν[ĵ/j].  (11)
We suppose, without loss of generality (as this can easily be obtained by renaming j to a fresh variable), that i ≠ j and that j ∉ s. Let l be a fresh size variable; it follows in particular that l ∉ Γ₁, Γ₂, ν, s. We apply the induction hypothesis to (11) and obtain
Γ₂[l/j] | f : ({ (Nat^{r_j} → ν[r_j/j])^{p_j} | j ∈ J })[l/j] ⊢ V : (Nat^ĵ → ν[ĵ/j])[l/j]
which, after applying a series of equalities and using the fact that j ∉ Γ, coincides with
Γ₂ | f : { (Nat^{r_j[l/j]} → ν[r_j[l/j]/j])^{p_j} | j ∈ J } ⊢ V : Nat^l̂ → ν[ĵ/j][l/j]
but also with
Γ₂ | f : { (Nat^{r_j[l/j]} → ν[l/j][r_j[l/j]/l])^{p_j} | j ∈ J } ⊢ V : Nat^l̂ → ν[l/j][l̂/l].
We can apply the induction hypothesis again, and obtain after rewriting
Γ₂[s/i] | f : { (Nat^{r_j[l/j]} → ν[l/j][r_j[l/j]/l][s/i])^{p_j} | j ∈ J } ⊢ V : Nat^l̂ → ν[l/j][l̂/l][s/i]
where we used the fact that ∀j ∈ J, spine(r_j) = j ≠ i, so that (Nat^{r_j[l/j]})[s/i] = Nat^{r_j[l/j]}. Since l ∉ s, we can exchange [l̂/l] and [s/i]. For every j ∈ J, we can also exchange [s/i] and [r_j[l/j]/l], since spine(r_j[l/j]) = l ≠ i and l ∉ s.
We obtain:
Γ₂[s/i] | f : { (Nat^{r_j[l/j]} → ν[l/j][s/i][r_j[l/j]/l])^{p_j} | j ∈ J } ⊢ V : Nat^l̂ → ν[l/j][s/i][l̂/l].
Additionally, we have:
– ⟨Γ₁[s/i]⟩ = Nat,
– l ∉ Γ₂[s/i],
– l positive in ν[l/j][s/i], since j was positive in ν,
– ∀j ∈ J, spine(r_j[l/j]) = l, since spine(r_j) = j,
– and { (Nat^{r_j[l/j]} → ν[l/j][s/i][r_j[l/j]/l])^{p_j} | j ∈ J } induces the same sized walk (which is thus AST) as { (Nat^{r_j} → ν[r_j/j])^{p_j} | j ∈ J }. Indeed, only the spine variable changes under the substitution [l/j].
Let t = r[s/i]. Since all these conditions are met, we can apply the letrec rule and obtain
Γ₁[s/i], Γ₂[s/i] | Θ[s/i] ⊢ letrec f = V : Nat^t → ν[l/j][s/i][t/l].
Since i, l ∉ s and l ∉ ν, we can commute [s/i] and [t/l] and compose substitutions to obtain
Γ[s/i] | Θ[s/i] ⊢ letrec f = V : Nat^t → ν[t/j][s/i]
which rewrites to
Γ[s/i] | Θ[s/i] ⊢ letrec f = V : (Nat^r → ν[r/j])[s/i]
and allows us to conclude. □

We can now state the main lemma of subject reduction:
Lemma 18 (Subject Reduction, Fundamental Lemma)
Let M ∈ Λ^s_⊕(μ) and let D be the unique closed term distribution such that M →_v D. Then there exists a closed typed distribution { (L_j : ν_j)^{p_j} | j ∈ J } such that:
– E((L_j : ν_j)^{p_j}) = μ,
– [ (L_j)^{p_j} | j ∈ J ] is a pseudo-representation of D.
Note that the condition on expectations implies that ⋃_{j∈J} S(ν_j) = S(μ).

Proof.
We proceed by induction on M.
• Suppose that M = (let x = V in N), that D = { (N[V/x])^1 }, and that ∅ | ∅ ⊢ let x = V in N : μ. By Lemma 9, there exists (ξ, σ) such that ∅ | ∅ ⊢ V : σ and x : σ | ∅ ⊢ N : ξ with ξ ⊑ μ. By Lemma 11, ∅ | ∅ ⊢ N[V/x] : ξ, and since ξ ⊑ μ we obtain by subtyping that ∅ | ∅ ⊢ N[V/x] : μ. It follows that { (N[V/x] : μ)^1 } is a closed typed distribution satisfying the requirements of the lemma.
• Suppose that M = (λx.N) V, that D = { (N[V/x])^1 }, and that ∅ | ∅ ⊢ (λx.N) V : μ. Applying Lemma 9 twice, we obtain that x : τ | ∅ ⊢ N : ξ and ∅ | ∅ ⊢ V : σ with σ ⊑ τ and ξ ⊑ μ. Applying subtyping to the second judgement gives ∅ | ∅ ⊢ V : τ, and we can apply Lemma 11 to obtain ∅ | ∅ ⊢ N[V/x] : ξ. Since ξ ⊑ μ, we obtain by subtyping that ∅ | ∅ ⊢ N[V/x] : μ. It follows that { (N[V/x] : μ)^1 } is a closed typed distribution satisfying the requirements of the lemma.
• Suppose that M = N ⊕_p L, that D = [ N^p, L^{1−p} ], and that ∅ | ∅ ⊢ N ⊕_p L : μ. By Lemma 9, there exists (ξ, ρ) such that ∅ | ∅ ⊢ N : ξ and ∅ | ∅ ⊢ L : ρ with ξ ⊕_p ρ ⊑ μ and Σ(ξ ⊕_p ρ) = 1. By Lemma 8, there exists (ξ′, ρ′) such that μ = ξ′ ⊕_p ρ′, ξ ⊑ ξ′ and ρ ⊑ ρ′. By subtyping, ∅ | ∅ ⊢ N : ξ′ and ∅ | ∅ ⊢ L : ρ′. We consider the closed typed distribution of pseudo-representation [ (N : ξ′)^p, (L : ρ′)^{1−p} ], which satisfies the requirements of the lemma since its expectation type is p · ξ′ + (1 − p) · ρ′ = ξ′ ⊕_p ρ′ = μ. Note that we use a pseudo-representation to cope with the very specific case in which N = L and ξ′ = ρ′, in which case the representation of the closed typed distribution is { (N : ξ′)^1 }.
• Suppose that M = (let x = N in L), that D = { (let x = P_j in L)^{p′_j} | j ∈ J }, and that ∅ | ∅ ⊢ let x = N in L : μ.
By Lemma 9, there exists (I, (σ_i)_{i∈I}, (p_i)_{i∈I}, (ξ_i)_{i∈I}) such that:
– Σ_{i∈I} p_i · ξ_i ⊑ μ,
– ∅ | ∅ ⊢ N : { σ_i^{p_i} | i ∈ I },
– ∀i ∈ I, x : σ_i | ∅ ⊢ L : ξ_i.
This reduction comes, by definition of →_v, from N →_v { P_j^{p′_j} | j ∈ J }, to which we can apply the induction hypothesis: there exists a closed typed distribution { (R_k : ρ_k)^{p″_k} | k ∈ K } such that
{ σ_i^{p_i} | i ∈ I } = Σ_{k∈K} p″_k · ρ_k
and such that [ (R_k)^{p″_k} | k ∈ K ] is a pseudo-representation of { P_j^{p′_j} | j ∈ J }. It follows that, for every k ∈ K, we can write ρ_k as the pseudo-representation [ σ_i^{p‴_{ki}} | i ∈ I ], where some of the p‴_{ki} (but not all of them) may be 0. This implies that, for all i ∈ I,
p_i = Σ_{k∈K} p″_k p‴_{ki}.
Now, for every k ∈ K, we can derive ∅ | ∅ ⊢ let x = R_k in L : Σ_{i∈I} p‴_{ki} · ξ_i from the rule

∅ | ∅ ⊢ R_k : { σ_i^{p‴_{ki}} | i ∈ I }    x : σ_i | ∅ ⊢ L : ξ_i (∀i ∈ I)
――――――――――――――――――――――――――――――――――――――――
∅ | ∅ ⊢ let x = R_k in L : Σ_{i∈I} p‴_{ki} · ξ_i

so that [ (let x = R_k in L : Σ_{i∈I} p‴_{ki} · ξ_i)^{p″_k} | k ∈ K ] is a pseudo-representation of a closed typed distribution, whose expectation is
Σ_{k∈K} p″_k (Σ_{i∈I} p‴_{ki} · ξ_i) = Σ_{i∈I} (Σ_{k∈K} p″_k p‴_{ki}) · ξ_i = Σ_{i∈I} p_i · ξ_i.
By Lemma 9, the sum of Σ_{i∈I} p_i · ξ_i is 1, and it follows that Σ μ = 1 as well. Since Σ_{i∈I} p_i · ξ_i ⊑ μ, applying Corollary 3 gives us a family (ν_i)_{i∈I} of distribution types such that, by subtyping, we can derive for every k ∈ K the judgement ∅ | ∅ ⊢ let x = R_k in L : Σ_{i∈I} p‴_{ki} · ν_i. This family ν⃗ moreover satisfies Σ_{i∈I} p_i · ν_i = μ. We therefore consider the closed typed distribution of pseudo-representation
[ (let x = R_k in L : Σ_{i∈I} p‴_{ki} · ν_i)^{p″_k} | k ∈ K ]
and of expectation type
Σ_{k∈K} p″_k (Σ_{i∈I} p‴_{ki} · ν_i) = Σ_{i∈I} p_i · ν_i = μ.
Since [ (R_k)^{p″_k} | k ∈ K ] is a pseudo-representation of { P_j^{p′_j} | j ∈ J }, we have that [ (let x = R_k in L)^{p″_k} | k ∈ K ] is a pseudo-representation of { (let x = P_j in L)^{p′_j} | j ∈ J }, which allows us to conclude.
• Suppose that M = case S V of { S → W | 0 → Z }, that D = { (W V)^1 }, and that ∅ | ∅ ⊢ case S V of { S → W | 0 → Z } : μ. By Lemma 9, there exist s and ξ such that ∅ | ∅ ⊢ S V : Nat^ŝ and ∅ | ∅ ⊢ W : Nat^s → ξ with ξ ⊑ μ. Lemma 13 implies that ∅ | ∅ ⊢ V : Nat^s. Using an Application rule, we obtain that ∅ | ∅ ⊢ W V : ξ, and subtyping gives ∅ | ∅ ⊢ W V : μ, allowing us to conclude with { (W V : μ)^1 }.
• Suppose that M = case 0 of { S → W | 0 → Z }, that D = { (Z)^1 }, and that ∅ | ∅ ⊢ case 0 of { S → W | 0 → Z } : μ. By Lemma 9, there exists ξ with ξ ⊑ μ and such that ∅ | ∅ ⊢ Z : ξ. By subtyping, ∅ | ∅ ⊢ Z : μ, which allows us to conclude with { (Z : μ)^1 }.
• Suppose that M = (letrec f = V) (c W⃗), that D = { (V[(letrec f = V)/f] (c W⃗))^1 }, and that ∅ | ∅ ⊢ (letrec f = V) (c W⃗) : μ. We apply Lemma 9 again, but this time, for the sake of clarity, we rather depict the derivation typing M with μ that it induces. This derivation is of the form (modulo composition of subtyping rules) an Application rule whose left premise is

Hyp    ∅ | f : { (Nat^{u_j} → ξ[u_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ξ[î/i]
――――――――――――――――――――――――――――――――――――――――
∅ | ∅ ⊢ letrec f = V : Nat^t → ξ[t/i]
――――――――――――――――――――――――――――――――――――――――
∅ | ∅ ⊢ letrec f = V : Nat^ŝ → μ

and whose right premise is a subderivation π concluding with

∅ | ∅ ⊢ c W⃗ : Nat^r̂
――――――――――――――――
∅ | ∅ ⊢ c W⃗ : Nat^ŝ

where the two sizes appearing in the types for c W⃗ are successors due to Lemma 13, and where:
– Hyp denotes the additional premises of the letrec rule, and notably contains i pos ξ,
– r̂ ⊑ ŝ ⊑ t,
– ξ[t/i] ⊑ μ.
It follows that, for every j ∈ J, we can deduce that the closed value letrec f = V has type Nat^{u_j} → ξ[u_j/i], as proved by the derivation

Hyp    ∅ | f : { (Nat^{u_j} → ξ[u_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ξ[î/i]
――――――――――――――――――――――――――――――――――――――――
∅ | ∅ ⊢ letrec f = V : Nat^{u_j} → ξ[u_j/i]

Since ∅ | f : { (Nat^{u_j} → ξ[u_j/i])^{p_j} | j ∈ J } ⊢ V : Nat^î → ξ[î/i], we obtain by Lemma 12 that
∅ | ∅ ⊢ V[(letrec f = V)/f] : Nat^î → ξ[î/i].
We now apply Lemma 17 to this judgement with the substitution [r/i], and obtain that ∅ | ∅ ⊢ V[(letrec f = V)/f] : Nat^r̂ → ξ[r̂/i]. Using the Application rule, we derive ∅ | ∅ ⊢ V[(letrec f = V)/f] (c W⃗) : ξ[r̂/i]. Since i pos ξ and r̂ ⊑ t, by Lemma 16 we get that ξ[r̂/i] ⊑ ξ[t/i]. By transitivity of ⊑, ξ[r̂/i] ⊑ μ, which allows us to conclude by subtyping that ∅ | ∅ ⊢ V[(letrec f = V)/f] (c W⃗) : μ. The result follows, with { (V[(letrec f = V)/f] (c W⃗) : μ)^1 }. □

Theorem 1 (Subject Reduction for →_v) Let n ∈ N, and let { (M_i : μ_i)^{p_i} | i ∈ I } be a closed typed distribution.
Suppose that { (M_i)^{p_i} | i ∈ I } →^n_v { (N_j)^{p′_j} | j ∈ J }. Then there exists a closed typed distribution { (L_k : ν_k)^{p″_k} | k ∈ K } such that:
• E((M_i : μ_i)^{p_i}) = E((L_k : ν_k)^{p″_k}),
• [ (L_k)^{p″_k} | k ∈ K ] is a pseudo-representation of { (N_j)^{p′_j} | j ∈ J }.

Proof.
The proof is by induction on n. For n = 0, →^0_v is the identity relation and the result is immediate. For n + 1, we have
{ (M_i)^{p_i} | i ∈ I } →^n_v { (P_l)^{p″_l} | l ∈ L } →_v { (N_j)^{p′_j} | j ∈ J }.
We apply the induction hypothesis and obtain a closed typed distribution { (R_g : ξ_g)^{p^{(3)}_g} | g ∈ G } satisfying E((M_i : μ_i)^{p_i}) = E((R_g : ξ_g)^{p^{(3)}_g}) and such that [ (R_g)^{p^{(3)}_g} | g ∈ G ] is a pseudo-representation of { (P_l)^{p″_l} | l ∈ L }. For every g ∈ G:
– if R_g is a value, we set D_g = { (R_g)^1 } and T_g to be the closed typed distribution T_g = { (T_h : ρ_h)^{p^{(4)}_h} | h ∈ H_g } = { (R_g : ξ_g)^1 },
– else R_g →_v D_g. We apply Lemma 18 and obtain a closed typed distribution T_g = { (T_h : ρ_h)^{p^{(4)}_h} | h ∈ H_g } such that E((T_h : ρ_h)^{p^{(4)}_h}) = ξ_g and such that [ (T_h)^{p^{(4)}_h} | h ∈ H_g ] is a pseudo-representation of D_g.
We claim that the closed typed distribution defined as
{ (L_k : ν_k)^{p″_k} | k ∈ K } = Σ_{g∈G} p^{(3)}_g · T_g
satisfies the required conditions. Indeed, the expectation type is preserved:
E((M_i : μ_i)^{p_i}) = E((R_g : ξ_g)^{p^{(3)}_g}) = Σ_{g∈G} p^{(3)}_g · ξ_g = Σ_{g∈G} p^{(3)}_g · E((T_h : ρ_h)^{p^{(4)}_h}) = E(Σ_{g∈G} p^{(3)}_g · T_g) = E({ (L_k : ν_k)^{p″_k} | k ∈ K }).
Moreover, by definition of the family (D_g)_{g∈G},
{ (P_l)^{p″_l} | l ∈ L } = Σ_{g∈G} p^{(3)}_g · { (R_g)^1 } →_v Σ_{g∈G} p^{(3)}_g · D_g = { (N_j)^{p′_j} | j ∈ J }.
The result follows from the fact that [ (T_h)^{p^{(4)}_h} | h ∈ H_g ] is a pseudo-representation of D_g for every g ∈ G. □

Subject Reduction for ⇛_v. Recall that there is an order ≤ on distributions, defined pointwise.

Lemma 19
Suppose that M ⇛_v { V_i^{p_i} | i ∈ I } and that M ∈ Λ^s_⊕(μ). Then there exists a closed typed distribution { (W_j : σ_j)^{p′_j} | j ∈ J } such that:
• E((W_j : σ_j)^{p′_j}) ≤ μ,
• [ (W_j)^{p′_j} | j ∈ J ] is a pseudo-representation of { (V_i)^{p_i} | i ∈ I }.

Proof.
We have M →_v D with D = D|_T + { V_i^{p_i} | i ∈ I }, where D|_T is the restriction of D to terms that are not values. By Lemma 18, there exists a closed typed distribution { (L_k : ν_k)^{p″_k} | k ∈ K } such that E((L_k : ν_k)^{p″_k}) = μ and [ (L_k)^{p″_k} | k ∈ K ] is a pseudo-representation of D. We consider the pseudo-representation [ (W_j)^{p′_j} | j ∈ J ] obtained from [ (L_k)^{p″_k} | k ∈ K ] by removing all the terms which are not values. Note that J ⊆ K. We obtain in this way a pseudo-representation of { V_i^{p_i} | i ∈ I }, which induces a closed typed distribution { (W_j : ν_j)^{p′_j} | j ∈ J } such that E((W_j : ν_j)^{p′_j}) ≤ μ. □

Theorem 2 (Subject Reduction)
Let M ∈ Λ^s_⊕(μ). Then there exists a closed typed distribution { (W_j : σ_j)^{p_j} | j ∈ J } such that:
– E((W_j : σ_j)^{p_j}) ≤ μ,
– [ (W_j)^{p_j} | j ∈ J ] is a pseudo-representation of ⟦M⟧.

Note that E((W_j : σ_j)^{p_j}) ≤ μ, since the semantics of a term may not be a proper distribution at this stage. In fact, it will follow from the soundness theorem of Section 6 that the typability of M implies that Σ ⟦M⟧ = 1, and thus that the previous statement is an equality.

This section is technically the most advanced one of the paper. It proves that the typing discipline we have introduced indeed enforces almost-sure termination. As already mentioned, the technique we employ is a substantial generalisation of Girard and Tait's reducibility. In particular, reducibility must be made quantitative, in that terms can be said to be reducible with a certain probability. This means that reducibility sets will be defined as sets parametrised by a real number p, called the degree of reducibility of the set. As Lemma 20 will emphasize, this degree of reducibility ensures that terms contained in a reducibility set parametrised by p terminate with probability at least p. These "intermediate" degrees of reducibility are required to handle the fixpoint construction, and to show that recursively-defined terms that are typable are indeed AST, that is, that they belong to the appropriate reducibility set, parametrised by 1.

The first preliminary notion we need is that of a size environment:
Definition 23 (Size Environment) A size environment is any function ρ from the set S of size variables to N ∪ {∞}. Given a size environment ρ and a size expression s, there is a naturally defined element of N ∪ {∞}, which we indicate as ⟦s⟧_ρ:
– ⟦î^n⟧_ρ = ρ(i) + n,
– ⟦∞⟧_ρ = ∞.
In other words, the purpose of size environments is to give a semantic meaning to size expressions. Our reducibility sets will be parametrised not only on a probability, but also on a size environment.

Definition 24 (Reducibility Sets)
– For values of the simple type Nat, we define the reducibility sets
VRed^p_{Nat^s,ρ} = { S^n 0 | p > 0 ⇒ n < ⟦s⟧_ρ }.
– Values of higher-order type are in a reducibility set when their applications to appropriate values are reducible terms, with an adequate degree of reducibility:
VRed^p_{σ→μ,ρ} = { V ∈ Λ^V_⊕(⟨σ → μ⟩) | ∀q ∈ (0, 1], ∀W ∈ VRed^q_{σ,ρ}, V W ∈ TRed^{pq}_{μ,ρ} }.
– Distributions of values are reducible with degree p when they consist of values which are themselves globally reducible "enough". Formally, DRed^p_{μ,ρ} is the set of finite distributions of values (in the sense that they have a finite support) admitting a pseudo-representation D = [ (V_i)^{p_i} | i ∈ I ] such that, setting μ = { (σ_j)^{p′_j} | j ∈ J }, there exist a family (p_{ij})_{i∈I,j∈J} ∈ [0, 1]^{|I|×|J|} of probabilities and a family (q_{ij})_{i∈I,j∈J} ∈ [0, 1]^{|I|×|J|} of degrees of reducibility, satisfying:
1. ∀i ∈ I, ∀j ∈ J, V_i ∈ VRed^{q_{ij}}_{σ_j,ρ},
2. ∀i ∈ I, Σ_{j∈J} p_{ij} = p_i,
3. ∀j ∈ J, Σ_{i∈I} p_{ij} = μ(σ_j),
4. p ≤ Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij}.
Note that (2) and (3) imply that Σ D = Σ μ. We say that [ (V_i)^{p_i} | i ∈ I ] witnesses that D ∈ DRed^p_{μ,ρ}.
– A term is reducible with degree p when its finite approximations compute distributions of values of degree of reducibility arbitrarily close to p:
TRed^p_{μ,ρ} = { M ∈ Λ_⊕(⟨μ⟩) | ∀0 ≤ r < p, ∃ν_r ≤ μ, ∃n_r ∈ N, M ⇛^{n_r}_v D_r and D_r ∈ DRed^r_{ν_r,ρ} }.
Note that here, unlike in the case of DRed, the fact that M ∈ Λ_⊕(⟨μ⟩) implies that μ is proper.

The first thing to observe about reducibility sets, as given in Definition 24, is that they only deal with closed terms, and not with arbitrary terms. As such, we cannot rely directly on them when proving AST for typable terms, at least if we want to prove it by induction on the structure of type derivations. We will therefore define in the sequel an extension of these sets to open terms; it will be based on these sets of closed terms, and will therefore enjoy similar properties. Before embarking on the proof that typability implies reducibility, it is convenient to prove some fundamental properties of reducibility sets, which inform us about how these sets are structured and which will be crucial in the sequel. This is the purpose of the following subsections.
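The quantitative side of Definition 24 can be made concrete in a short script. The following sketch is ours and not part of the formal development (the function name and the toy numbers are assumptions): it checks conditions (2)-(4) for a candidate family (p_ij), abstracting the semantic condition (1) into given degrees q_ij.

```python
# Sketch (ours) of conditions (2)-(4) of Definition 24: a distribution of
# values with pseudo-representation probabilities p_i lies in DRed^p w.r.t. a
# distribution type with probabilities mu_j, witnessed by matrices (p_ij) and
# (q_ij), when rows of p_ij sum to p_i, columns sum to mu_j, and the degree p
# is bounded by the expected degree  sum_ij q_ij * p_ij.

def dred_conditions(p, p_i, mu_j, p_ij, q_ij, eps=1e-9):
    I, J = len(p_i), len(mu_j)
    rows = all(abs(sum(p_ij[i]) - p_i[i]) < eps for i in range(I))        # (2)
    cols = all(abs(sum(p_ij[i][j] for i in range(I)) - mu_j[j]) < eps
               for j in range(J))                                          # (3)
    degree = p <= sum(q_ij[i][j] * p_ij[i][j]
                      for i in range(I) for j in range(J)) + eps           # (4)
    return rows and cols and degree

# Two values of probability 1/2 each, matched diagonally against a
# two-element distribution type; degrees of reducibility 1.0 and 0.6.
p_ij = [[0.5, 0.0], [0.0, 0.5]]
q_ij = [[1.0, 0.0], [0.0, 0.6]]
print(dred_conditions(0.8, [0.5, 0.5], [0.5, 0.5], p_ij, q_ij))  # True: 0.8 <= 0.5*1.0 + 0.5*0.6
print(dred_conditions(0.9, [0.5, 0.5], [0.5, 0.5], p_ij, q_ij))  # False: 0.9 > 0.8
```

Conditions (2) and (3) make (p_ij) a transport plan between the pseudo-representation and the distribution type, which is why they force Σ D = Σ μ, as the definition observes.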
The following lemma, relatively easy to prove, is crucial to the understanding of the reducibility sets, in that it shows that the degree of reducibility of a term gives information on the sum of its operational semantics:
Lemma 20 (Reducibility and Termination)–
Let D ∈ DRed^p_{μ,ρ}. Then Σ D ≥ p.
– Let M ∈ TRed^p_{μ,ρ}. Then Σ ⟦M⟧ ≥ p.

Proof.
• Let D ∈ DRed^p_{μ,ρ}. Then there exists a pseudo-representation D = [ (V_i)^{p_i} | i ∈ I ] and families (p_{ij})_{i∈I,j∈J} and (q_{ij})_{i∈I,j∈J} of reals of [0, 1] such that ∀i ∈ I, Σ_{j∈J} p_{ij} = p_i, and such that p ≤ Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij}. We therefore have:
Σ D = Σ_{i∈I} p_i = Σ_{i∈I} Σ_{j∈J} p_{ij} ≥ Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij} ≥ p.
• Since M ∈ TRed^p_{μ,ρ}, for every 0 ≤ r < p there exists n_r with M ⇛^{n_r}_v D_r and D_r ∈ DRed^r_{ν_r,ρ}. From the previous point, we get that Σ D_r ≥ r for every 0 ≤ r < p. It follows from Corollary 1 that Σ ⟦M⟧ ≥ r for every 0 ≤ r < p and, by taking the supremum, Σ ⟦M⟧ ≥ p. □

It follows from this lemma that terms with degree of reducibility 1 are AST:
Corollary 4 (Reducibility and AST)
Let M ∈ TRed^1_{μ,ρ}. Then M is AST.

We now prove two results related to the degrees of reducibility of reducibility sets. First of all, if the degree of reducibility p is 0, then no assumption is made on the probability of termination of terms, distributions or values. It follows that the three kinds of reducibility sets collapse to the sets of all affinely simply typable terms, distributions or values:

Lemma 21 (Candidates of Null Reducibility)
– If V ∈ Λ^V_⊕(κ), then V ∈ VRed^0_{σ,ρ} for every σ such that ⟨σ⟩ = κ and every size environment ρ.
– Let D = { (V_i)^{p_i} | i ∈ I } be a finite distribution of values. If ∀i ∈ I, V_i ∈ Λ^V_⊕(κ), then D ∈ DRed^0_{μ,ρ} for every μ such that ⟨μ⟩ = κ and Σ μ = Σ D, and every ρ.
– If M ∈ Λ_⊕(κ), then M ∈ TRed^0_{μ,ρ} for every μ such that ⟨μ⟩ = κ and every ρ.

Structure of the proof.
In this lemma, as for most lemmas proving properties about VRed, DRed and TRed, we use a proof by induction on types. As the property is defined in a mutual way over VRed, DRed and TRed, we typically prove it for VRed^p_{Nat^s,ρ} for any size s refining Nat, and then for VRed^p_{σ→μ,ρ} by using the associated hypothesis on TRed^p_{μ,ρ}. Then we prove the property for any distribution type for DRed^p_{μ,ρ}, using the induction hypothesis on the VRed^p_{σ,ρ} for σ ∈ S(μ), and we prove it for TRed^p_{μ,ρ} using the induction hypothesis on VRed^p_{σ,ρ}. The point is that these ingredients allow us to give a proof by induction on the simple type underlying the sized type of interest. In the base case, the sized type is necessarily of the form Nat^s for some size s: we prove the statement on VRed^p_{Nat^s,ρ} for all these sized types, without using any induction-like hypothesis. Then we prove the statement for distribution types μ = { (Nat^{s_i})^{p_i} | i ∈ I }, first on DRed^p_{μ,ρ} by using the results for the sets VRed^p_{Nat^{s_i},ρ}. Then we prove the result for TRed^p_{μ,ρ}, typically using the one for DRed^p_{μ,ρ}. We then switch to higher-order types, and give the proof for VRed^p_{σ→μ,ρ}, which may use the results for the other sets on the types σ and μ. Typically, only the results on TRed^p_{μ,ρ} are used. Then the proofs for DRed^p_{σ→μ,ρ} and TRed^p_{σ→μ,ρ} are typically the same as in the case of distributions over sized types refining Nat: we therefore do not write them again. This proof scheme will become clearer with the proof of this lemma on candidates of null reducibility:
Proof.
• Let V ∈ Λ^V_⊕(Nat). Every σ :: Nat is of the shape σ = Nat^s for a size s. Let ρ be a size environment. By inspection of the grammar of values and of the simple type system, we see that V must be of the shape S^n 0 for n ∈ N. Note that V is closed: it cannot be a variable. By definition, V ∈ VRed^0_{σ,ρ}.
• Let κ = κ′ → κ″ be a higher-order type, with σ :: κ′ and μ :: κ″. Let ρ be a size environment, and V ∈ Λ^V_⊕(κ). Let q ∈ (0, 1] and W ∈ VRed^q_{σ,ρ}; we need to prove that V W ∈ TRed^0_{μ,ρ}. But, by definition of VRed^q_{σ,ρ}, W ∈ Λ^V_⊕(κ′). It follows that V W ∈ Λ_⊕(κ″), and we can apply the induction hypothesis to deduce that V W ∈ TRed^0_{μ,ρ}, so that by definition V ∈ VRed^0_{σ→μ,ρ}.
• Let D = { (V_i)^{p_i} | i ∈ I } be a distribution of values and μ = { (σ_j)^{p′_j} | j ∈ J } :: κ be a distribution type. Suppose that ∀i ∈ I, V_i ∈ Λ^V_⊕(κ). Let ρ be a size environment. For every (i, j) ∈ I × J, we set p_{ij} = p_i p′_j / Σ μ and q_{ij} = 0. We consider the canonical pseudo-representation D = [ (V_i)^{p_i} | i ∈ I ] and check the four conditions to be in DRed^0_{μ,ρ}:
1. ∀i ∈ I, ∀j ∈ J, V_i ∈ VRed^{q_{ij}}_{σ_j,ρ}: this is obtained by induction hypothesis,
2. ∀i ∈ I, Σ_{j∈J} p_{ij} = p_i: let i ∈ I; we have Σ_{j∈J} p_{ij} = (p_i / Σ μ) Σ_{j∈J} p′_j = (p_i / Σ μ) × Σ μ = p_i,
3. ∀j ∈ J, Σ_{i∈I} p_{ij} = μ(σ_j): let j ∈ J; we have Σ_{i∈I} p_{ij} = (p′_j / Σ μ) Σ_{i∈I} p_i = (p′_j / Σ μ) × Σ D. But Σ μ = Σ D, so that the sum equals p′_j, as requested,
4. p ≤ Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij}: this amounts to 0 ≤ 0, which holds.
• Let M ∈ Λ_⊕(κ) and μ :: κ. Let ρ be a size environment. Then M ∈ TRed^0_{μ,ρ}: the condition on M in the definition of TRed^0_{μ,ρ} is for any 0 ≤ r < 0, and is therefore vacuously satisfied. □

As p gives us a lower bound on the sum of the semantics of terms, it is easy to guess that a term having degree of reducibility p must also have degree of reducibility q < p. The following lemma makes this statement precise:

Lemma 22 (Downward Closure)
Let σ be a sized type, μ be a distribution type and ρ be a size environment. Let 0 ≤ q < p ≤ 1. Then:
– For any value V, V ∈ VRed^p_{σ,ρ} ⟹ V ∈ VRed^q_{σ,ρ},
– For any finite distribution of values D, D ∈ DRed^p_{μ,ρ} ⟹ D ∈ DRed^q_{μ,ρ},
– For any term M, M ∈ TRed^p_{μ,ρ} ⟹ M ∈ TRed^q_{μ,ρ}.

Proof.
Let σ be a sized type, μ be a distribution type and ρ be a size environment. If q = 0, the result is immediate as a consequence of Lemma 21. Let 0 < q < p ≤ 1.
• Suppose that V ∈ VRed^p_{Nat^s,ρ}. Since by definition p, q > 0 ⇒ VRed^p_{Nat^s,ρ} = VRed^q_{Nat^s,ρ}, the result holds.
• Suppose that V ∈ VRed^p_{σ→μ,ρ}. Then:
V ∈ VRed^p_{σ→μ,ρ} ⟺ ∀q′ ∈ (0, 1], ∀W ∈ VRed^{q′}_{σ,ρ}, V W ∈ TRed^{pq′}_{μ,ρ}
⟹ ∀q′ ∈ (0, 1], ∀W ∈ VRed^{q′}_{σ,ρ}, V W ∈ TRed^{qq′}_{μ,ρ} (by IH, since 0 < qq′ < pq′ ≤ 1)
⟺ V ∈ VRed^q_{σ→μ,ρ}.
• Suppose that D ∈ DRed^p_{μ,ρ}. Then there exist a pseudo-representation D = [ (V_i)^{p_i} | i ∈ I ] and families of reals (p_{ij})_{i∈I,j∈J} and (q_{ij})_{i∈I,j∈J} satisfying conditions (1)-(4). We have D ∈ DRed^q_{μ,ρ}, for the same pseudo-representation, since conditions (1)-(3) are the same, and (4) holds as well since q < p.
• Suppose that M ∈ TRed^p_{μ,ρ}. Then for every 0 ≤ r < p, there exist ν_r ≤ μ and n_r ∈ N with M ⇛^{n_r}_v D_r and D_r ∈ DRed^r_{ν_r,ρ}. So this statement also holds for every 0 ≤ r < q, and M ∈ TRed^q_{μ,ρ}. □

To prove the continuity lemma on the reducibility sets, which says that if an element is in all the reducibility sets for degrees q < p then it is also in the set parametrised by the degree p, we use the following companion lemma, computing a family of probabilities maximizing the degree of reducibility of a distribution:

Lemma 23 (Maximizing the Degree of Reducibility of a Distribution)
Let D = [ (V_i)^{p_i} | i ∈ I ] be a finite distribution of values and μ = { (σ_j)^{p′_j} | j ∈ J } be a distribution type. Set q_{ij} = max { q | V_i ∈ VRed^q_{σ_j,ρ} } for every (i, j) ∈ I × J. Then there exists a family (p_{ij})_{i∈I,j∈J} of reals of [0, 1] satisfying:
1. ∀i ∈ I, Σ_{j∈J} p_{ij} = p_i,
2. ∀j ∈ J, Σ_{i∈I} p_{ij} = μ(σ_j),
and which maximizes Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij}.

Proof.
We use the theory of linear programming in the finite real vector space R^n, taking [32] as a reference. We stick to the notations of this book. The problem then amounts to showing the existence of
max { cx | x ≥ 0⃗, Ax = b }  (12)
where, supposing that we can index vectors and matrices by I × J thanks to a bijection I × J → {1, ..., n} (where n = |I × J|):
• x is the column vector indexed by the finite set I × J, where x_{ij} plays the role of p_{ij},
• c is the row vector indexed by I × J, with c_{ij} = max { q | V_i ∈ VRed^q_{σ_j,ρ} },
• 0⃗ is the null column vector indexed by I × J,
• A is the matrix with columns indexed by I × J and rows indexed by I + J, and such that:
– a_{i′,(i,j)} = 1 if and only if i = i′, and 0 otherwise,
– a_{j′,(i,j)} = 1 if and only if j = j′, and 0 otherwise,
• b is the column vector indexed by I + J and such that b_i = p_i and b_j = μ(σ_j).
Following [32, Section 7.4], the maximum (12) exists if and only if:
• the problem is feasible: its constraints admit at least one solution,
• and it is bounded: there should be an upper bound on (12);
and, moreover, its existence is equivalent to that of the maximum of the following problem:
max { cx | x ≥ 0⃗, Ax ≤ b }  (13)
This reformulation makes feasibility immediate, since the null vector x = 0⃗ satisfies the constraints. Boundedness is immediate as well: the q_{ij} are in [0, 1] and Σ_{i∈I} Σ_{j∈J} p_{ij} = 1, so that Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij} ≤ 1. The existence of the maximum (12) follows, and the lemma therefore holds. □
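The optimisation in Lemma 23 is a transportation linear program: maximise Σ q_ij p_ij over nonnegative (p_ij) with prescribed row sums p_i and column sums μ(σ_j). As an illustration only (ours; the lemma itself relies on the general LP theory of [32]), in the 2×2 case the feasible polytope is a segment parametrised by t = p_11, so the linear objective is maximised at one of its two endpoints:

```python
# Transportation LP of Lemma 23, specialised to |I| = |J| = 2 (sketch, ours).
# With row sums (p1, p2) and column sums (m1, m2) of equal total mass, a plan
# is determined by t = p_11, feasible for t in [max(0, p1-m2), min(p1, m1)];
# a linear objective over a segment attains its maximum at an endpoint.

def maximize_2x2(p, m, q):
    lo, hi = max(0.0, p[0] - m[1]), min(p[0], m[0])
    def plan(t):
        return [[t, p[0] - t], [m[0] - t, m[1] - p[0] + t]]
    def value(t):
        pl = plan(t)
        return sum(q[i][j] * pl[i][j] for i in range(2) for j in range(2))
    t = lo if value(lo) >= value(hi) else hi
    return plan(t), value(t)

# V1 is most reducible at sigma_1 (degree 1.0), V2 at sigma_2 (degree 0.9):
best_plan, w = maximize_2x2((0.5, 0.5), (0.5, 0.5), [[1.0, 0.1], [0.2, 0.9]])
print(best_plan, w)  # diagonal plan [[0.5, 0.0], [0.0, 0.5]], value 0.95
```

In the general case one maximises over the full transportation polytope, e.g. with an off-the-shelf LP solver; feasibility and boundedness are exactly the two conditions invoked in the proof above.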
It follows that a distribution has a maximal degree of reducibility: the supremum of the degrees of reducibility is again a degree of reducibility:
Corollary 5 (Maximizing the Degree of Reducibility of a Distribution II)
Let D be a finite distribution of values, μ be a distribution type and ρ be a size environment. Suppose that D ∈ DRed^p_{μ,ρ} for some real p ∈ [0, 1]. Then there exists a maximal real p_max ∈ [p, 1] such that D ∈ DRed^{p_max}_{μ,ρ} and p′ > p_max ⟹ D ∉ DRed^{p′}_{μ,ρ}.

Proof.
Let D = [ (V_i)^{p_i} | i ∈ I ] be a finite distribution of values and μ = { (σ_j)^{p′_j} | j ∈ J } be a distribution type. By Lemma 23, setting q_{ij} = max { q | V_i ∈ VRed^q_{σ_j,ρ} } for every (i, j) ∈ I × J, there exists a family (p_{ij})_{i∈I,j∈J} of reals of [0, 1] which maximizes w = Σ_{i∈I} Σ_{j∈J} q_{ij} p_{ij}. It is immediate to see that increasing any q_{ij} to some q′ > q_{ij} would contradict the maximality defining q_{ij}, since V_i ∉ VRed^{q′}_{σ_j,ρ}, and that decreasing any q_{ij} actually decreases w. It follows that p_max = w. □

To analyse the letrec construction, we will prove that, for every ε ∈ (0, 1], recursively-defined terms are reducible with degree 1 − ε. We will be able to conclude on the AST nature of recursive constructions using the following continuity lemma, proved using the theory of linear programming:

Lemma 24 (Continuity)
Let σ be a sized type, μ be a distribution type and ρ be a size environment. Let p ∈ (0, 1]. Then:
– VRed^p_{σ,ρ} = ∩_{0<q<p} VRed^q_{σ,ρ},
– DRed^p_{μ,ρ} = ∩_{0<q<p} DRed^q_{μ,ρ},
– TRed^p_{μ,ρ} = ∩_{0<q<p} TRed^q_{μ,ρ}.

Proof. Let σ be a sized type, μ be a distribution type and ρ be a size environment. Let p ∈ (0, 1].
• If σ = Nat^s for some size s, then for every 0 < q < p we have VRed^q_{σ,ρ} = VRed^p_{σ,ρ}, so that VRed^p_{σ,ρ} = ∩_{0<q<p} VRed^q_{σ,ρ}. […]
• The inclusion DRed^p_{μ,ρ} ⊆ ∩_{0<q<p} DRed^q_{μ,ρ} follows from Lemma 22. […]
• The inclusion TRed^p_{μ,ρ} ⊆ ∩_{0<q<p} TRed^q_{μ,ρ} follows from Lemma 22. […] □

Lemma 25 (Commuting Sizes with Environments) Let i be a size variable, let s and r be two sizes, and let ρ be a size environment. Suppose that s = ∞ or that spine(s) ≠ i. Then ⟦r[s/i]⟧_ρ = ⟦r⟧_{ρ[i ↦ ⟦s⟧_ρ]}.

Proof. By case analysis.
• If r = ĵ^n for j ≠ i, then r[s/i] = r and ⟦r⟧_ρ = ρ(j) + n = ⟦r⟧_{ρ[i ↦ ⟦s⟧_ρ]}.
• If r = î^n, then:
– if s = ĵ^m for j ≠ i, then r[s/i] = ĵ^{n+m} and ⟦r[s/i]⟧_ρ = ρ(j) + n + m = ⟦ĵ^m⟧_ρ + n = ⟦s⟧_ρ + n = ⟦î^n⟧_{ρ[i ↦ ⟦s⟧_ρ]} = ⟦r⟧_{ρ[i ↦ ⟦s⟧_ρ]},
– if s = ∞, then r[s/i] = ∞ and ⟦r[s/i]⟧_ρ = ∞ = ⟦r⟧_{ρ[i ↦ ⟦s⟧_ρ]}.
• If r = ∞, then r[s/i] = r and ⟦r⟧_ρ = ∞ = ⟦r⟧_{ρ[i ↦ ⟦s⟧_ρ]}. □

The last fundamental property about reducibility sets, which will be crucial to treat the recursive case, is the following one. It states that the sizes appearing in a sized type may be recovered in the reducibility set by using an appropriate semantics of the size variables, and conversely:

Lemma 26 (Size Commutation) Let i be a size variable, s be a size such that s = ∞ or spine(s) ≠ i, and ρ be a size environment. Then:
– VRed^p_{σ[s/i],ρ} = VRed^p_{σ,ρ[i ↦ ⟦s⟧_ρ]},
– DRed^p_{μ[s/i],ρ} = DRed^p_{μ,ρ[i ↦ ⟦s⟧_ρ]},
– TRed^p_{μ[s/i],ρ} = TRed^p_{μ,ρ[i ↦ ⟦s⟧_ρ]}.

Proof.
• The first case to consider is σ = Nat^r for some size r.
Using Lemma 25, we have that
$\mathsf{VRed}^{p}_{(\mathsf{Nat}^{r})[s/i],\rho} = \mathsf{VRed}^{p}_{\mathsf{Nat}^{r[s/i]},\rho} = \{\, S^{n}\,0 \mid p > 0 \Rightarrow n < [\![ r[s/i] ]\!]_{\rho} \,\} = \{\, S^{n}\,0 \mid p > 0 \Rightarrow n < [\![ r ]\!]_{\rho[i \mapsto [\![ s ]\!]_{\rho}]} \,\} = \mathsf{VRed}^{p}_{\mathsf{Nat}^{r},\rho[i \mapsto [\![ s ]\!]_{\rho}]}$.

• We then consider the case of the sized type $\sigma \to \mu :: \kappa' \to \kappa''$. We have
$\mathsf{VRed}^{p}_{(\sigma \to \mu)[s/i],\rho} = \mathsf{VRed}^{p}_{\sigma[s/i] \to \mu[s/i],\rho}$
$= \{\, V \in \Lambda^{\oplus}_{V}(\langle \sigma[s/i] \to \mu[s/i] \rangle) \mid \forall q \in (0,1],\ \forall W \in \mathsf{VRed}^{q}_{\sigma[s/i],\rho},\ V\,W \in \mathsf{TRed}^{pq}_{\mu[s/i],\rho} \,\}$
$= \{\, V \in \Lambda^{\oplus}_{V}(\langle \sigma \to \mu \rangle) \mid \forall q \in (0,1],\ \forall W \in \mathsf{VRed}^{q}_{\sigma,\rho[i \mapsto [\![ s ]\!]_{\rho}]},\ V\,W \in \mathsf{TRed}^{pq}_{\mu,\rho[i \mapsto [\![ s ]\!]_{\rho}]} \,\}$
$= \mathsf{VRed}^{p}_{\sigma \to \mu,\rho[i \mapsto [\![ s ]\!]_{\rho}]}$
where we used the induction hypothesis twice, once on $\kappa'$ and the other time on $\kappa''$.

• Let $D$ be a finite distribution of values and $\mu = \{\, (\sigma_j)^{p'_j} \mid j \in J \,\}$ be a distribution type. We have that $\mu[s/i] = \{\, (\sigma_j[s/i])^{p'_j} \mid j \in J \,\}$. Suppose that $D \in \mathsf{DRed}^{p}_{\mu[s/i],\rho}$. Then there exist a pseudo-representation $D = [\, (V_i)^{p_i} \mid i \in I \,]$ and families $(p_{ij})_{i \in I, j \in J}$ and $(q_{ij})_{i \in I, j \in J}$ of reals of $[0,1]$ satisfying:
1. $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j[s/i],\rho}$,
2. $\forall i \in I,\ \sum_{j \in J} p_{ij} = p_i$,
3. $\forall j \in J,\ \sum_{i \in I} p_{ij} = \mu(\sigma_j)$,
4. $p \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij}$.
But (1) is equivalent to $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho[i \mapsto [\![ s ]\!]_{\rho}]}$ by induction hypothesis. It follows that $D \in \mathsf{DRed}^{p}_{\mu,\rho[i \mapsto [\![ s ]\!]_{\rho}]}$. The converse direction proceeds in the exact same way.

• Finally, $M \in \mathsf{TRed}^{p}_{\mu[s/i],\rho}$ if and only if $M \in \Lambda^{\oplus}(\langle \mu \rangle)$ and
$\forall\, 0 \leq r < p,\ \exists \nu_r \preceq \mu,\ \exists n_r \in \mathbb{N},\ M \Rrightarrow^{n_r}_{v} D_r \text{ and } D_r \in \mathsf{DRed}^{r}_{\nu_r[s/i],\rho}$
if and only if, by induction hypothesis, $M \in \Lambda^{\oplus}(\langle \mu \rangle)$ and
$\forall\, 0 \leq r < p,\ \exists \nu_r \preceq \mu,\ \exists n_r \in \mathbb{N},\ M \Rrightarrow^{n_r}_{v} D_r \text{ and } D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho[i \mapsto [\![ s ]\!]_{\rho}]}$;
that is, if and only if $M \in \mathsf{TRed}^{p}_{\mu,\rho[i \mapsto [\![ s ]\!]_{\rho}]}$.
$\Box$

The most difficult step in proving all typable terms to be reducible is, unexpectedly, proving that terms involving recursion are reducible whenever their respective unfoldings are. This very natural concept simply expresses that any term of the form $\mathtt{letrec}\ f = W$ is assumed to compute the fixpoint of the function defined by $W$.

Definition 25 ($n$-Unfolding) Suppose that $V = (\mathtt{letrec}\ f = W)$ is closed; then the $n$-unfolding of $V$ is:
– $V$ if $n = 0$;
– $W[Z/f]$ if $n = m + 1$ and $Z$ is the $m$-unfolding of $V$.
We write the set of unfoldings of $V$ as $\mathrm{Unfold}(V)$. Note that if $V$ admits a simple type, then all its unfoldings have this same simple type as well. In the sequel, we implicitly consider that $V$ is simply typed.

Any unfolding of $V = (\mathtt{letrec}\ f = W)$ should behave like $V$ itself: all unfoldings of $V$ should be equivalent. This, however, cannot be proved using the operational semantics alone. It requires some work, and techniques akin to logical relations, to prove this behavioural equivalence between a recursive definition and its unfoldings. The first lemma is technical and lists the unfoldings of terms recursively defined as equal to themselves or to a variable:

Lemma 27
• Let $V = f$ and $W \in \mathrm{Unfold}(\mathtt{letrec}\ f = V)$. Then $W = \mathtt{letrec}\ f = V$.
• Let $V = x \neq f$ and $W \in \mathrm{Unfold}(\mathtt{letrec}\ f = V)$. Then $W = \mathtt{letrec}\ f = V$ or $W = x$. More precisely, the $n$-unfoldings for $n \geq 1$ are all $x$.

The next lemma is the technical core of this section. Think of two terms as related when they are of the shape $M[\vec{Z}/\vec{x}]$ and $M[\vec{Z}'/\vec{x}]$, where $\vec{x}$ is a sequence of "holes" in $M$, filled with unfoldings of a same recursively-defined term. Then their rewritings by $\to_v$ form distributions of pairwise related terms.

Lemma 28 Let $V = (\mathtt{letrec}\ f = W)$ be a closed value. Let $\vec{x}$, $\vec{Z}$, $\vec{Z}'$ be a vector of variables and two vectors of terms of $\mathrm{Unfold}(V)$, all of the same length. Let $M$ be a simply-typed term whose free variables are contained in $\vec{x}$, all typed with the simple type of $V$.
Suppose that $M[\vec{Z}/\vec{x}] \to_v D$. Then there exist $N_1, \ldots, N_n$, a vector of variables $\vec{y}$ and vectors $\vec{Z}_1, \ldots, \vec{Z}_n, \vec{Z}'_1, \ldots, \vec{Z}'_n \in \mathrm{Unfold}(V)$ of the same length as $\vec{y}$, such that $D = \{\, (N_i[\vec{Z}_i/\vec{y}])^{p_i} \,\}$ and moreover $M[\vec{Z}'/\vec{x}] \to_v E = \{\, (N_i[\vec{Z}'_i/\vec{y}])^{p_i} \,\}$.

Proof. We prove the result by induction on the structure of $M$.
• The case where $M$ is a variable cannot fit in this setting: either $M = y \notin \vec{x}$ and there is no reduction from $M[\vec{Z}/\vec{x}]$, or $M = x_i \in \vec{x}$ and there is no reduction either from $M[\vec{Z}/\vec{x}] = Z_i$, since it is a value. We can similarly rule out all the cases where $M$ is a value.
• Suppose that $M = V_1\,V_2$. We proceed by case exhaustion on $V_1$. Three possibilities exist, the other ones contradicting the fact that there should be a reduction step from $M$:
– If $V_1 = x_i \in \vec{x}$, we distinguish four cases:
∗ Suppose that $Z_i = Z'_i$ are both the $0$-unfolding of $V$. Then $M[\vec{Z}/\vec{x}] = M[\vec{Z}'/\vec{x}]$ and the result follows immediately from:
$M[\vec{Z}/\vec{x}] = (\mathtt{letrec}\ f = W)\ V_2[\vec{Z}/\vec{x}] = (\mathtt{letrec}\ f = W)\ (S^{m}\,0) \to_v \{\, ((W[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0))^{1} \,\}$
where in the second line the shape of $V_2[\vec{Z}/\vec{x}]$ needs to be $S^{m}\,0$ by typing constraints. Note that $\vec{y}$ is the empty vector here.
∗ Suppose that $Z_i$ is the $n$-unfolding of $V$ for $n > 0$, and that $Z'_i$ is the $0$-unfolding. We have that
$M[\vec{Z}/\vec{x}] = W[Z''/f]\ V_2[\vec{Z}/\vec{x}]$
where $Z''$ is the $(n-1)$-unfolding of $V$, and that
$M[\vec{Z}'/\vec{x}] = (\mathtt{letrec}\ f = W)\ V_2[\vec{Z}'/\vec{x}] \to_v \{\, ((W[\mathtt{letrec}\ f = W/f])\ V_2[\vec{Z}'/\vec{x}])^{1} \,\}$.
Notice that this reduction is possible since the constraint of simple typing implies that $V_2[\vec{Z}'/\vec{x}]$ is of the shape $S^{m}\,0$ for some $m \geq 0$.
We can therefore rewrite the two terms as
$M[\vec{Z}/\vec{x}] = W[Z''/f]\ (S^{m}\,0)$
and
$M[\vec{Z}'/\vec{x}] \to_v \{\, ((W[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0))^{1} \,\}$.
We need to distinguish four cases, depending on the structure of $W$.
· Suppose that $W$ is a variable different from $f$. Then by Lemma 27 there cannot be a step of reduction from $M[\vec{Z}/\vec{x}]$.
· Suppose that $W = f$. Then by Lemma 27 we have $Z_i = Z'_i = Z''$, so that $M[\vec{Z}/\vec{x}] = M[\vec{Z}'/\vec{x}]$ and the result follows just as in the case where both $Z_i$ and $Z'_i$ were $0$-unfoldings.
· Suppose that $W = \lambda y.L$. Then
$M[\vec{Z}/\vec{x}] = (\lambda y.L[Z''/f])\ (S^{m}\,0) \to_v \{\, (L[Z''/f][S^{m}\,0/y])^{1} \,\} = \{\, ((L[S^{m}\,0/y])[Z''/f])^{1} \,\}$.
Moreover,
$M[\vec{Z}'/\vec{x}] \to_v \{\, ((W[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0))^{1} \,\} = \{\, (((\lambda y.L)\ (S^{m}\,0))[\mathtt{letrec}\ f = W/f])^{1} \,\} \to_v \{\, ((L[(S^{m}\,0)/y])[\mathtt{letrec}\ f = W/f])^{1} \,\}$
so that we can conclude with $\vec{y} = f$ and $N_1 = L[(S^{m}\,0)/y]$.
· Suppose that $W = \mathtt{letrec}\ g = W'$. Then
$M[\vec{Z}/\vec{x}] = ((\mathtt{letrec}\ g = W')[Z''/f])\ (S^{m}\,0) \to_v \{\, (((W'[\mathtt{letrec}\ g = W'/g])[Z''/f])\ (S^{m}\,0))^{1} \,\} = \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0))[Z''/f])^{1} \,\}$.
Moreover,
$M[\vec{Z}'/\vec{x}] \to_v ((\mathtt{letrec}\ g = W')[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0) \to_v \{\, (((W'[\mathtt{letrec}\ g = W'/g])[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0))^{1} \,\} = \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0))[\mathtt{letrec}\ f = W/f])^{1} \,\}$
so that we can conclude with $\vec{y} = f$ and $N_1 = W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0)$.
∗ Suppose that $Z_i$ is the $0$-unfolding of $V$, and that $Z'_i$ is the $n$-unfolding for $n > 0$. By typing constraints, $V_2[\vec{Z}/\vec{x}]$ is of the shape $S^{m}\,0$ for some $m \geq 0$. We have that
$M[\vec{Z}/\vec{x}] \to_v \{\, ((W[\mathtt{letrec}\ f = W/f])\ (S^{m}\,0))^{1} \,\} = \{\, ((W\ (S^{m}\,0))[\mathtt{letrec}\ f = W/f])^{1} \,\}$
and that $M[\vec{Z}'/\vec{x}] = W[Z''/f]\ (S^{m}\,0) = (W\ (S^{m}\,0))[Z''/f]$, where $Z''$ is the $(n-1)$-unfolding of $V$, so that we can conclude with $\vec{y} = f$ and $N_1 = W\ (S^{m}\,0)$.
∗ Suppose that $Z_i$ is the $n$-unfolding of $V$ for $n > 0$, and that $Z'_i$ is the $n'$-unfolding for $n' > 0$.
We have
$M[\vec{Z}/\vec{x}] = W[Z''/f]\ V_2[\vec{Z}/\vec{x}]$
where $Z''$ is the $(n-1)$-unfolding of $V$, and
$M[\vec{Z}'/\vec{x}] = W[Z'''/f]\ V_2[\vec{Z}/\vec{x}]$
where $Z'''$ is the $(n'-1)$-unfolding of $V$. We proceed by case analysis on $W$. As we discussed in the case where $Z_i$ was an $(n''+1)$-unfolding and $Z'_i$ a $0$-unfolding, the case where $W$ is a variable does not lead to a rewriting step. It remains to treat two cases:
· Suppose that $W = \lambda y.L$. Then
$M[\vec{Z}/\vec{x}] = (\lambda y.L[Z''/f])\ V_2[\vec{Z}/\vec{x}] \to_v \{\, (L[Z''/f][V_2[\vec{Z}/\vec{x}]/y])^{1} \,\} = \{\, ((L[V_2[\vec{Z}/\vec{x}]/y])[Z''/f])^{1} \,\}$
and
$M[\vec{Z}'/\vec{x}] = (\lambda y.L[Z'''/f])\ V_2[\vec{Z}/\vec{x}] \to_v \{\, (L[Z'''/f][V_2[\vec{Z}/\vec{x}]/y])^{1} \,\} = \{\, ((L[V_2[\vec{Z}/\vec{x}]/y])[Z'''/f])^{1} \,\}$
so that we can conclude with $\vec{y} = f$ and $N_1 = L[V_2[\vec{Z}/\vec{x}]/y]$.
· Suppose that $W = \mathtt{letrec}\ g = W'$. Then
$M[\vec{Z}/\vec{x}] = ((\mathtt{letrec}\ g = W')[Z''/f])\ V_2[\vec{Z}/\vec{x}] \to_v \{\, (((W'[\mathtt{letrec}\ g = W'/g])[Z''/f])\ V_2[\vec{Z}/\vec{x}])^{1} \,\} = \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ V_2[\vec{Z}/\vec{x}])[Z''/f])^{1} \,\}$
where the reduction is possible because the simple typing constraints imply that $V_2[\vec{Z}/\vec{x}]$ is of the shape $S^{m}\,0$ for some $m \in \mathbb{N}$. Moreover,
$M[\vec{Z}'/\vec{x}] = ((\mathtt{letrec}\ g = W')[Z'''/f])\ V_2[\vec{Z}/\vec{x}] \to_v \{\, (((W'[\mathtt{letrec}\ g = W'/g])[Z'''/f])\ V_2[\vec{Z}/\vec{x}])^{1} \,\} = \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ V_2[\vec{Z}/\vec{x}])[Z'''/f])^{1} \,\}$
so that we can conclude with $\vec{y} = f$ and $N_1 = W'[\mathtt{letrec}\ g = W'/g]\ V_2[\vec{Z}/\vec{x}]$.
– If $V_1 = \lambda y.L$,
$M[\vec{Z}/\vec{x}] = (\lambda y.L[\vec{Z}/\vec{x}])\ V_2[\vec{Z}/\vec{x}] \to_v \{\, (L[\vec{Z}/\vec{x}][V_2[\vec{Z}/\vec{x}]/y])^{1} \,\} = \{\, (L[V_2/y][\vec{Z}/\vec{x}])^{1} \,\}$
and in the same way $M[\vec{Z}'/\vec{x}] \to_v \{\, (L[V_2/y][\vec{Z}'/\vec{x}])^{1} \,\}$, which allows to conclude with $N_1 = L[V_2/y]$.
– If $V_1 = \mathtt{letrec}\ g = W'$, by typing constraints $V_2 = S^{m}\,0$ for some $m \geq 0$. It follows that we can reduce $M[\vec{Z}/\vec{x}]$ and $M[\vec{Z}'/\vec{x}]$ as follows:
$M[\vec{Z}/\vec{x}] = (\mathtt{letrec}\ g = W'[\vec{Z}/\vec{x}])\ (S^{m}\,0) \to_v \{\, ((W'[\vec{Z}/\vec{x}][\mathtt{letrec}\ g = W'[\vec{Z}/\vec{x}]/g])\ (S^{m}\,0))^{1} \,\} = \{\, (((W'[\mathtt{letrec}\ g = W'/g])[\vec{Z}/\vec{x}])\ (S^{m}\,0))^{1} \,\} = \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0))[\vec{Z}/\vec{x}])^{1} \,\}$
and similarly
$M[\vec{Z}'/\vec{x}] \to_v \{\, ((W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0))[\vec{Z}'/\vec{x}])^{1} \,\}$
so that we can conclude with $N_1 = W'[\mathtt{letrec}\ g = W'/g]\ (S^{m}\,0)$.
• Suppose that $M = \mathtt{let}\ y = X\ \mathtt{in}\ P$, where $X$ is a value. Then
$M[\vec{Z}/\vec{x}] = \mathtt{let}\ y = X[\vec{Z}/\vec{x}]\ \mathtt{in}\ P[\vec{Z}/\vec{x}] \to_v \{\, (P[\vec{Z}/\vec{x}][X[\vec{Z}/\vec{x}]/y])^{1} \,\} = \{\, (P[X/y][\vec{Z}/\vec{x}])^{1} \,\}$
and similarly $M[\vec{Z}'/\vec{x}] \to_v \{\, (P[X/y][\vec{Z}'/\vec{x}])^{1} \,\}$, from which we can conclude.
• Suppose that $M = \mathtt{let}\ y = L\ \mathtt{in}\ P$, where $L$ is not a value, and that
$M[\vec{Z}/\vec{x}] = \mathtt{let}\ y = L[\vec{Z}/\vec{x}]\ \mathtt{in}\ P[\vec{Z}/\vec{x}] \to_v \{\, (\mathtt{let}\ y = L'_i[\vec{Z}/\vec{x}]\ \mathtt{in}\ P[\vec{Z}/\vec{x}])^{p_i} \,\} = \{\, (\mathtt{let}\ y = L''_i[\vec{Z}/\vec{z}]\ \mathtt{in}\ P[\vec{Z}/\vec{x}])^{p_i} \,\} = \{\, ((\mathtt{let}\ y = L''_i\ \mathtt{in}\ P)[\vec{Z},\vec{Z}/\vec{z},\vec{x}])^{p_i} \,\}$
where the second step is obtained by $\alpha$-renaming, and where by definition of $\to_v$ we have $L[\vec{Z}/\vec{x}] \to_v \{\, (L'_i[\vec{Z}/\vec{x}])^{p_i} \mid i \in I \,\}$. By induction hypothesis, there exist $\vec{Z}'_1, \ldots, \vec{Z}'_n \in \mathrm{Unfold}(V)$ such that $L[\vec{Z}'/\vec{x}] \to_v \{\, (L''_i[\vec{Z}'_i/\vec{z}])^{p_i} \mid i \in I \,\}$. Now remark that
$M[\vec{Z}'/\vec{x}] \to_v \{\, (\mathtt{let}\ y = L''_i[\vec{Z}'_i/\vec{z}]\ \mathtt{in}\ P[\vec{Z}'/\vec{x}])^{p_i} \,\} = \{\, ((\mathtt{let}\ y = L''_i\ \mathtt{in}\ P)[\vec{Z}',\vec{Z}'_i/\vec{x},\vec{z}])^{p_i} \,\}$.
The result follows for $\vec{y} = \vec{x},\vec{z}$ and $N_i = \mathtt{let}\ y = L''_i\ \mathtt{in}\ P$.
• Suppose that $M = L \oplus_p P$. Suppose first that $L \neq P$. Then
$M[\vec{Z}/\vec{x}] \to_v \{\, (L[\vec{Z}/\vec{x}])^{p},\ (P[\vec{Z}/\vec{x}])^{1-p} \,\}$ and $M[\vec{Z}'/\vec{x}] \to_v \{\, (L[\vec{Z}'/\vec{x}])^{p},\ (P[\vec{Z}'/\vec{x}])^{1-p} \,\}$,
so that the result holds for $N_1 = L$ and $N_2 = P$. If $L = P$,
$M[\vec{Z}/\vec{x}] \to_v \{\, (L[\vec{Z}/\vec{x}])^{1} \,\}$ and $M[\vec{Z}'/\vec{x}] \to_v \{\, (L[\vec{Z}'/\vec{x}])^{1} \,\}$
and the result holds as well. Note that the distinction is necessary so as to avoid the use of pseudo-representations in the statement of the lemma.
• Suppose that $M = \mathtt{case}\ V'\ \mathtt{of}\ \{ S \to X \mid 0 \to Y \}$. By typing constraints, $V' = S^{m}\,0$ or $V'$ is a variable.
– If $V' = 0$, then $M[\vec{Z}/\vec{x}] \to_v \{\, (R[\vec{Z}/\vec{x}])^{1} \,\}$ and $M[\vec{Z}'/\vec{x}] \to_v \{\, (R[\vec{Z}'/\vec{x}])^{1} \,\}$, where $R$ is the term obtained from the selected branch, so that we can conclude.
– If $V' = S^{m}\,0$ with $m > 0$, we can conclude in the same way.
– In the latter case ($V'$ a variable), there is no reduction from $M[\vec{Z}/\vec{x}]$ unless $V'[\vec{Z}/\vec{x}]$ is of the shape $S^{m}\,0$. But such a term is of type $\mathsf{Nat}$ and cannot therefore be an unfolding of $V$, so that this case is impossible. $\Box$

This result can be extended to an $n$-step rewriting process; however, pseudo-representations are required to keep the statement true, as we explain in the proof.

Lemma 29 Let $V = (\mathtt{letrec}\ f = W)$ be a closed value. Let $M$ be a simply-typed term whose free variables are contained in $\vec{x}$, all typed with the simple type of $V$. Let $\vec{Z}, \vec{Z}' \in \mathrm{Unfold}(V)$ and $n \in \mathbb{N}$. Then there exist a distribution of values of pseudo-representation $[\, X_i^{p_i} \mid i \in I \,]$, a vector of variables $\vec{y}$ and families of vectors $(\vec{Z}_i)_{i \in I}$, $(\vec{Z}'_i)_{i \in I}$ of the same length as $\vec{y}$, all such that
$M[\vec{Z}/\vec{x}] \Rrightarrow^{n}_{v} [\, (X_i[\vec{Z}_i/\vec{y}])^{p_i} \mid i \in I \,]$ and $M[\vec{Z}'/\vec{x}] \Rrightarrow^{n}_{v} [\, (X_i[\vec{Z}'_i/\vec{y}])^{p_i} \mid i \in I \,]$.

Proof. By iteration of Lemma 28. The pseudo-representations come from the fact that some terms in different reduction branches may converge to the same value, say, in the reduction from $M[\vec{Z}/\vec{x}]$ but not in the one from $M[\vec{Z}'/\vec{x}]$. $\Box$

The following lemma is of technical interest. It states that, given two pseudo-representations of a distribution — one of the shape exhibited in the previous lemmas and used for relating terms with unfoldings, the other one being a pseudo-representation witnessing membership in a set $\mathsf{DRed}$ — there exists a third one which "combines" both:

Lemma 30 Suppose that $D_r = [\, (X_i[\vec{Z}_i/\vec{y}])^{p_i} \mid i \in I \,] = [\, (X'_j)^{p'_j} \mid j \in J \,]$.
Then there exist a set $K$, two maps $\pi_1 : K \to I$ and $\pi_2 : K \to J$ and a pseudo-representation $D_r = [\, (X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$ such that
• $\forall k \in K,\ X''_k = X_{\pi_1(k)}$, and $\forall i \in I,\ \sum_{k \in \pi_1^{-1}(i)} p''_k = p_i$,
• $\forall k \in K,\ X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}] = X'_{\pi_2(k)}$,
• $\forall j \in J,\ \sum_{k \in \pi_2^{-1}(j)} p''_k = p'_j$.

Proof. Let $D = \{\, (Y_l)^{p''_l} \mid l \in L \,\}$ be the representation of $D$. We build $K$, $\pi_1$ and $\pi_2$ as follows. The construction starts from the empty set and the empty maps, and is iterated over every $l \in L$. First, we set $I_l = \{\, i \in I \mid Y_l = X_i[\vec{Z}_i/\vec{y}] \,\}$ and $J_l = \{\, j \in J \mid Y_l = X'_j \,\}$. We suppose that both these sets are enumerated, and write them $I_l = \{ i_1, \ldots, i_{n_l} \}$ and $J_l = \{ j_1, \ldots, j_{m_l} \}$. We consider the set of reals
$R = \left\{ 0,\ p_{i_1},\ p_{i_1} + p_{i_2},\ \ldots,\ \sum_{r=1}^{n_l} p_{i_r} \right\} \cup \left\{ 0,\ p'_{j_1},\ p'_{j_1} + p'_{j_2},\ \ldots,\ \sum_{r'=1}^{m_l} p'_{j_{r'}} \right\} \subseteq [0, p''_l]$.
This set is ordered, as a set of reals, so that we have a maximal enumeration $0 = \alpha_0 < \alpha_1 < \cdots < \alpha_s = p''_l$, where maximality means that $\beta \in R \Rightarrow \exists t,\ \beta = \alpha_t$. We add $s$ elements to the set $K$ produced during the examination of previous elements of $L$: $K := K \uplus \{ 0, \ldots, s-1 \}$. For every $t \in \{ 0, \ldots, s-1 \}$, we define:
• $p''_t = \alpha_{t+1} - \alpha_t$,
• $\pi_1(t) = i_k \in I_l$, where $\sum_{r=1}^{k-1} p_{i_r} \leq \alpha_t$ and $\sum_{r=1}^{k} p_{i_r} \geq \alpha_{t+1}$,
• $\pi_2(t) = j_k \in J_l$, where $\sum_{r=1}^{k-1} p'_{j_r} \leq \alpha_t$ and $\sum_{r=1}^{k} p'_{j_r} \geq \alpha_{t+1}$.
We claim that the set $K$ resulting from this constructive process satisfies the equalities of the lemma. $\Box$

The series of previous lemmas allows to deduce that a term is reducible if and only if the terms to which it is related are:

Lemma 31 Let $V = (\mathtt{letrec}\ f = W)$ be a closed value. Let $M$ be a simply-typed term whose free variables are contained in $\vec{x}$, all typed with the simple type of $V$. Let $\vec{Z}, \vec{Z}' \in \mathrm{Unfold}(V)$.
Then $M[\vec{Z}/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$ if and only if $M[\vec{Z}'/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$.

Proof. We prove that $M[\vec{Z}/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$ implies $M[\vec{Z}'/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$, the converse direction being exactly symmetrical. The proof proceeds by induction on the simple type refined by $\mu$.

Suppose that $\mu :: \mathsf{Nat}$. Let $r \in [0,p)$. Since $M[\vec{Z}/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$, there exist $n_r$ and $\nu_r \preceq \mu$ such that $M[\vec{Z}/\vec{x}] \Rrightarrow^{n_r}_{v} D_r$ and $D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$. Lemma 29 implies that there exist a distribution of values of pseudo-representation $[\, X_i^{p_i} \mid i \in I \,]$, a vector of variables $\vec{y}$ and families of vectors $(\vec{Z}_i)_{i \in I}$, $(\vec{Z}'_i)_{i \in I}$ of the same length as $\vec{y}$, all such that $D_r = [\, (X_i[\vec{Z}_i/\vec{y}])^{p_i} \mid i \in I \,]$ and $M[\vec{Z}'/\vec{x}] \Rrightarrow^{n_r}_{v} E_r = [\, (X_i[\vec{Z}'_i/\vec{y}])^{p_i} \mid i \in I \,]$. By typing constraints coming from the subject reduction property, all the $X_i[\vec{Z}_i/\vec{y}]$ and $X_i[\vec{Z}'_i/\vec{y}]$ have the simple type $\mathsf{Nat}$. This implies that all these terms are of the shape $S^{m}\,0$, and thus that the $X_i$ cannot contain a variable from $\vec{y}$, as the variables of $\vec{y}$ have simple types of the shape $\mathsf{Nat} \to \kappa$. It follows that, for every index $i \in I$, $X_i[\vec{Z}_i/\vec{y}] = X_i[\vec{Z}'_i/\vec{y}]$. This implies that $E_r = D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$, and thus that $M[\vec{Z}'/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$.

Suppose that $\mu :: \kappa \to \kappa'$. Let $r \in [0,p)$. Since $M[\vec{Z}/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$, there exist $n_r$ and $\nu_r \preceq \mu$ such that $M[\vec{Z}/\vec{x}] \Rrightarrow^{n_r}_{v} D_r$ and $D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$.
Lemma 29 implies that there exist a distribution of values of pseudo-representation $[\, X_i^{p_i} \mid i \in I \,]$, a vector of variables $\vec{y}$ and families of vectors $(\vec{Z}_i)_{i \in I}$, $(\vec{Z}'_i)_{i \in I}$ of the same length as $\vec{y}$, all such that $D_r = [\, (X_i[\vec{Z}_i/\vec{y}])^{p_i} \mid i \in I \,]$ and $M[\vec{Z}'/\vec{x}] \Rrightarrow^{n_r}_{v} E_r = [\, (X_i[\vec{Z}'_i/\vec{y}])^{p_i} \mid i \in I \,]$. Since $D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$, there is a pseudo-representation $D_r = [\, (Z'_j)^{p'_j} \mid j \in J \,]$ witnessing this fact. By Lemma 30, there exists a pseudo-representation $D_r = [\, (X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$ satisfying a series of additional properties. These properties ensure two crucial facts for our purpose:
• $M[\vec{Z}/\vec{x}] \Rrightarrow^{n_r}_{v} [\, (X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$ and $M[\vec{Z}'/\vec{x}] \Rrightarrow^{n_r}_{v} E_r = [\, (X''_k[\vec{Z}'_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$;
• $[\, (X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$ is a pseudo-representation witnessing that $D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$.
Setting $\nu_r = \{\, (\sigma_l)^{p'''_l} \mid l \in L \,\}$, there exist therefore families $(p''_{kl})_{k \in K, l \in L}$ and $(q_{kl})_{k \in K, l \in L}$ of reals of $[0,1]$ satisfying:
1. $\forall k \in K,\ \forall l \in L,\ X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}] \in \mathsf{VRed}^{q_{kl}}_{\sigma_l,\rho}$,
2. $\forall k \in K,\ \sum_{l \in L} p''_{kl} = p''_k$,
3. $\forall l \in L,\ \sum_{k \in K} p''_{kl} = \nu_r(\sigma_l)$,
4. $r \leq \sum_{k \in K}\sum_{l \in L} q_{kl}\,p''_{kl}$.
We now prove that $\forall k \in K,\ \forall l \in L,\ X''_k[\vec{Z}'_{\pi_1(k)}/\vec{y}] \in \mathsf{VRed}^{q_{kl}}_{\sigma_l,\rho}$. Let $k \in K$ and $l \in L$, and let $\sigma_l = \theta \to \nu$.
$X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}] \in \mathsf{VRed}^{q_{kl}}_{\sigma_l,\rho}$
$\iff \forall q \in (0,1],\ \forall Y \in \mathsf{VRed}^{q}_{\theta,\rho},\ X''_k[\vec{Z}_{\pi_1(k)}/\vec{y}]\ Y \in \mathsf{TRed}^{q\,q_{kl}}_{\nu,\rho}$
$\iff \forall q \in (0,1],\ \forall Y \in \mathsf{VRed}^{q}_{\theta,\rho},\ (X''_k\ Y)[\vec{Z}_{\pi_1(k)}/\vec{y}] \in \mathsf{TRed}^{q\,q_{kl}}_{\nu,\rho}$
$\iff \forall q \in (0,1],\ \forall Y \in \mathsf{VRed}^{q}_{\theta,\rho},\ (X''_k\ Y)[\vec{Z}'_{\pi_1(k)}/\vec{y}] \in \mathsf{TRed}^{q\,q_{kl}}_{\nu,\rho}$ (by induction hypothesis)
$\iff \forall q \in (0,1],\ \forall Y \in \mathsf{VRed}^{q}_{\theta,\rho},\ X''_k[\vec{Z}'_{\pi_1(k)}/\vec{y}]\ Y \in \mathsf{TRed}^{q\,q_{kl}}_{\nu,\rho}$
$\iff X''_k[\vec{Z}'_{\pi_1(k)}/\vec{y}] \in \mathsf{VRed}^{q_{kl}}_{\sigma_l,\rho}$.
This implies that $[\, (X''_k[\vec{Z}'_{\pi_1(k)}/\vec{y}])^{p''_k} \mid k \in K \,]$ witnesses that $E_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$, for the same families of reals $p''_{kl}$ and $q_{kl}$. Now, for every $r \in [0,p)$, there exist $n_r$ and $\nu_r \preceq \mu$ such that $M[\vec{Z}'/\vec{x}] \Rrightarrow^{n_r}_{v} E_r$ and $E_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$: we have that $M[\vec{Z}'/\vec{x}] \in \mathsf{TRed}^{p}_{\mu,\rho}$. $\Box$

The following lemma shows that reducible values are reducible terms:

Lemma 32 (Reducible Values are Reducible Terms) Let $V$ be a value. Then $V \in \mathsf{TRed}^{p}_{\{\sigma^{1}\},\rho}$ if and only if $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$.

Note that, conversely, we may have $V \in \mathsf{TRed}^{p}_{\mu,\rho}$ where $\mu$ is not Dirac: for instance, $0 \in \mathsf{TRed}^{1}_{\mu,\rho}$ for $\mu = \{\, (\mathsf{Nat}^{i})^{\frac{1}{2}},\ (\mathsf{Nat}^{\widehat{i}})^{\frac{1}{2}} \,\}$.

Proof.
• Suppose that $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$. Let $r \in [0,p)$. We must prove that there exist $n_r$ and $\nu_r$ such that $V \to^{n_r}_{v} \{ V^{1} \}$ and $\{ V^{1} \} \in \mathsf{DRed}^{r}_{\nu_r,\rho}$. Necessarily $n_r = 0$ and $\nu_r = \{ \sigma^{1} \}$. Since $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$, $\{ V^{1} \} \in \mathsf{DRed}^{r}_{\nu_r,\rho}$: take the canonical pseudo-representation $[\, V^{1} \,]$ and $p_{11} = 1$, $q_{11} = r$.
• Suppose that $V \in \mathsf{TRed}^{p}_{\{\sigma^{1}\},\rho}$. It follows that, for every $r \in [0,p)$, there exist $n_r$ and $\nu_r$ such that $V \to^{n_r}_{v} \{ V^{1} \}$ and $\{ V^{1} \} \in \mathsf{DRed}^{r}_{\nu_r,\rho}$. Again, since $V$ is a value, we necessarily have $n_r = 0$ and $\nu_r = \{ \sigma^{1} \}$. Since $\{ V^{1} \} \in \mathsf{DRed}^{r}_{\nu_r,\rho}$, there is a pseudo-representation $[\, V^{p_1}, \ldots, V^{p_n} \,]$ such that $\sum_{i=1}^{n} p_i = 1$, and a family $(q_i)_{i \in I}$ such that $r \leq \sum_{i \in I} p_i q_i$, where $p_{i1} = p_i$. Suppose that there is no $q_i$ greater than or equal to $r$. Then $\forall i \in I,\ q_i < r$ and
$\sum_{i \in I} p_i q_i < \sum_{i \in I} p_i r = r \sum_{i \in I} p_i = r$,
which is a contradiction. So there exists $q_i \geq r$, and therefore $V \in \mathsf{VRed}^{q_i}_{\sigma,\rho}$. By Lemma 22, $V \in \mathsf{VRed}^{r}_{\sigma,\rho}$. Since the result is true for all $r \in [0,p)$, we obtain by Lemma 24 that $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$. $\Box$

We finally deduce from the two previous lemmas the proposition of interest, relating the reducibility of a recursively-defined term to that of its unfoldings:

Proposition 4 (Reducibility is Stable by Unfolding) Let $n \in \mathbb{N}$ and $V = (\mathtt{letrec}\ f = W)$ be a closed value. Suppose that $Z$ is the $n$-unfolding of $V$. Then $V \in \mathsf{VRed}^{p}_{\mathsf{Nat}^{s} \to \mu,\rho}$ if and only if $Z \in \mathsf{VRed}^{p}_{\mathsf{Nat}^{s} \to \mu,\rho}$.

Proof. A direct consequence of Lemma 31 and Lemma 32. $\Box$

If a distribution obtained as a partial approximation of the semantics $[\![ M ]\!]$ of a term $M$ is reducible for a type $\mu_n$, then all the partial approximations of $[\![ M ]\!]$ obtained by iterating the reduction relation $\Rrightarrow_v$ at least as many times have the same degree of reducibility, for a greater type:

Lemma 33 Suppose that $M \Rrightarrow^{n}_{v} D_n \in \mathsf{DRed}^{p}_{\mu_n,\rho}$ for $\mu_n \preceq \mu$, with $\sum \mu = 1$. Suppose that, for $m \geq n$, $M \Rrightarrow^{m}_{v} D_m$. Then there exists $\mu_n \preceq \mu_m \preceq \mu$ such that $D_m \in \mathsf{DRed}^{p}_{\mu_m,\rho}$.

Proof. Let $\mu_n = \{\, (\sigma_j)^{p'_j} \mid j \in J \,\}$. Since $D_n \in \mathsf{DRed}^{p}_{\mu_n,\rho}$, there exist a pseudo-representation $D_n = [\, V_i^{p_i} \mid i \in I \,]$ and two families of reals $(p_{ij})_{i \in I, j \in J}$ and $(q_{ij})_{i \in I, j \in J}$ such that
1. $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho}$,
2. $\forall i \in I,\ \sum_{j \in J} p_{ij} = p_i$,
3. $\forall j \in J,\ \sum_{i \in I} p_{ij} = p'_j$,
4. $p \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij}$.
By Lemma 3, we have $D_n \preceq D_m$, so that the distribution $D_m$ admits a pseudo-representation $D_m = [\, V_i^{p_i} \mid i \in I \uplus K \,]$ extending the one of $D_n$.
We now need to define appropriate families of reals $(p'_{ij})_{i \in I \uplus K, j \in J}$ and $(q'_{ij})_{i \in I \uplus K, j \in J}$. We set:
• $\forall i \in I,\ \forall j \in J,\ p'_{ij} = p_{ij}$,
• $\forall i \in I,\ \forall j \in J,\ q'_{ij} = q_{ij}$,
• $\forall i \in K,\ \forall j \in J,\ q'_{ij} = 0$,
and we choose the $(p'_{ij})_{i \in K, j \in J}$ arbitrarily in $[0,1]$ under the constraints that $\forall i \in K,\ \sum_{j \in J} p'_{ij} = p_i$ and that $\forall j \in J,\ \sum_{i \in I \uplus K} p'_{ij} \leq \mu(\sigma_j)$. These constraints are feasible since $\sum_{i \in I \uplus K}\sum_{j \in J} p'_{ij} = \sum_{i \in I \uplus K} p_i \leq \sum \mu$. We then set $\mu_m = \{\, (\sigma_j)^{\sum_{i \in I \uplus K} p'_{ij}} \mid j \in J \,\} \preceq \mu$. Let us check that $D_m \in \mathsf{DRed}^{p}_{\mu_m,\rho}$:
1. $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho}$, and $\forall i \in K,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{0}_{\sigma_j,\rho}$, as this set contains all values of simple type $\langle \sigma_j \rangle$ by Lemma 21;
2. $\forall i \in I,\ \sum_{j \in J} p'_{ij} = p_i$ by definition, and $\forall i \in K,\ \sum_{j \in J} p'_{ij} = p_i$ by construction;
3. $\forall j \in J,\ \sum_{i \in I \uplus K} p'_{ij} = \mu_m(\sigma_j)$ by definition of $\mu_m$;
4. $p \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij} = \sum_{i \in I}\sum_{j \in J} q'_{ij}\,p'_{ij} + 0 = \sum_{i \in I}\sum_{j \in J} q'_{ij}\,p'_{ij} + \sum_{i \in K}\sum_{j \in J} q'_{ij}\,p'_{ij} = \sum_{i \in I \uplus K}\sum_{j \in J} q'_{ij}\,p'_{ij}$.
So $D_m \in \mathsf{DRed}^{p}_{\mu_m,\rho}$. $\Box$

When two distributions $D$ and $E$ are reducible, with respective degrees of reducibility $p'$ and $p''$, their probabilistic combination $D \oplus_p E$ is reducible as well, with degree of reducibility $pp' + (1-p)p''$, for the distribution type computed by $\oplus_p$:

Lemma 34 Suppose that $\langle \mu \rangle = \langle \nu \rangle$, that $D \in \mathsf{DRed}^{p'}_{\mu,\rho}$ and that $E \in \mathsf{DRed}^{p''}_{\nu,\rho}$. Then $pD + (1-p)E \in \mathsf{DRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$.

Proof. Let $\mu = \{\, (\sigma_j)^{p'_j} \mid j \in J \,\}$. Since $D \in \mathsf{DRed}^{p'}_{\mu,\rho}$, there exist a pseudo-representation $D = [\, V_i^{p_i} \mid i \in I \,]$ and two families of reals $(p_{ij})_{i \in I, j \in J}$ and $(q_{ij})_{i \in I, j \in J}$ such that
1. $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho}$,
2. $\forall i \in I,\ \sum_{j \in J} p_{ij} = p_i$,
3. $\forall j \in J,\ \sum_{i \in I} p_{ij} = p'_j$,
4. $p' \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij}$.
Let $\nu = \{\, (\tau_l)^{p'''_l} \mid l \in L \,\}$.
Since $E \in \mathsf{DRed}^{p''}_{\nu,\rho}$, there exist a pseudo-representation $E = [\, W_k^{p''_k} \mid k \in K \,]$ and two families of reals $(p'_{kl})_{k \in K, l \in L}$ and $(q'_{kl})_{k \in K, l \in L}$ such that
1. $\forall k \in K,\ \forall l \in L,\ W_k \in \mathsf{VRed}^{q'_{kl}}_{\tau_l,\rho}$,
2. $\forall k \in K,\ \sum_{l \in L} p'_{kl} = p''_k$,
3. $\forall l \in L,\ \sum_{k \in K} p'_{kl} = p'''_l$,
4. $p'' \leq \sum_{k \in K}\sum_{l \in L} q'_{kl}\,p'_{kl}$.
We suppose that $I$ and $K$ are disjoint, and that $j \in J \cap L \Leftrightarrow \sigma_j = \tau_j$. To prove that $pD + (1-p)E \in \mathsf{DRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$, we consider the pseudo-representation
$pD + (1-p)E = [\, V_i^{p\,p_i} \mid i \in I \,] + [\, W_k^{(1-p)\,p''_k} \mid k \in K \,]$   (14)
and we write the distribution type $\mu \oplus_p \nu$ as
$\{\, (\sigma_j)^{p\,p'_j} \mid j \in J \setminus (J \cap L) \,\} + \{\, (\sigma_j)^{p\,p'_j + (1-p)\,p'''_j} \mid j \in J \cap L \,\} + \{\, (\tau_l)^{(1-p)\,p'''_l} \mid l \in L \setminus (J \cap L) \,\}$.
We set $G = I + K$ and $H = J + L$. We now need to define appropriate families of reals $(p''_{gh})_{g \in G, h \in H}$ and $(q''_{gh})_{g \in G, h \in H}$. We proceed as follows:
• if $g \in I$ and $h \in J$, $p''_{gh} = p\,p_{gh}$ and $q''_{gh} = q_{gh}$;
• if $g \in I$ and $h \in L$, $p''_{gh} = 0$ and $q''_{gh} = 0$;
• if $g \in K$ and $h \in J$, $p''_{gh} = 0$ and $q''_{gh} = 0$;
• if $g \in K$ and $h \in L$, $p''_{gh} = (1-p)\,p'_{gh}$ and $q''_{gh} = q'_{gh}$.
Let us prove that (14), together with these two families, provides a witness that $pD + (1-p)E \in \mathsf{DRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$, by checking the four usual conditions. We write $Z_g$ for either $V_i$ or $W_k$, depending on the context, and similarly $\theta_h$ for $\sigma_j$ or $\tau_l$.
1. $\forall g \in G,\ \forall h \in H,\ Z_g \in \mathsf{VRed}^{q''_{gh}}_{\theta_h,\rho}$ is proved by case exhaustion:
• $\forall g \in I,\ \forall h \in J,\ V_g \in \mathsf{VRed}^{q_{gh}}_{\sigma_h,\rho}$ since $D \in \mathsf{DRed}^{p'}_{\mu,\rho}$;
• $\forall g \in K,\ \forall h \in L,\ W_g \in \mathsf{VRed}^{q'_{gh}}_{\tau_h,\rho}$ since $E \in \mathsf{DRed}^{p''}_{\nu,\rho}$;
• in the two remaining cases, $q''_{gh} = 0$ and by Lemma 21 the result holds.
2. We proceed again by case exhaustion.
• If $g \in I$, $\sum_{h \in H} p''_{gh} = \sum_{h \in J} p''_{gh} + \sum_{h \in L} p''_{gh} = \sum_{h \in J} p\,p_{gh} = p\,p_g$.
• If $g \in K$, $\sum_{h \in H} p''_{gh} = \sum_{h \in L} (1-p)\,p'_{gh} = (1-p)\,p''_g$.
3. We proceed again by case exhaustion.
• Suppose that $h \in J \setminus (J \cap L)$. Then $\sum_{g \in G} p''_{gh} = \sum_{g \in I} p''_{gh} = \sum_{g \in I} p\,p_{gh} = p\,p'_h$.
• Suppose that $h \in L \setminus (J \cap L)$. Then $\sum_{g \in G} p''_{gh} = \sum_{g \in K} p''_{gh} = \sum_{g \in K} (1-p)\,p'_{gh} = (1-p)\,p'''_h$.
• Suppose that $h \in J \cap L$. Then $\sum_{g \in G} p''_{gh} = \sum_{g \in I} p''_{gh} + \sum_{g \in K} p''_{gh} = p\,p'_h + (1-p)\,p'''_h$.
4. $\sum_{g \in G}\sum_{h \in H} q''_{gh}\,p''_{gh} = \sum_{g \in I}\sum_{h \in J} q''_{gh}\,p''_{gh} + \sum_{g \in K}\sum_{h \in L} q''_{gh}\,p''_{gh} = \sum_{g \in I}\sum_{h \in J} q_{gh}\,p\,p_{gh} + \sum_{g \in K}\sum_{h \in L} q'_{gh}\,(1-p)\,p'_{gh} = p \sum_{g \in I}\sum_{h \in J} q_{gh}\,p_{gh} + (1-p) \sum_{g \in K}\sum_{h \in L} q'_{gh}\,p'_{gh} \geq p\,p' + (1-p)\,p''$.
It follows that $pD + (1-p)E \in \mathsf{DRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$. $\Box$

This lemma generalises to the $n$-ary case of a weighted sum of distributions:

Lemma 35 Let $(\mu_i)_{i \in I}$ be a family of distribution types with the same underlying type. For every $i \in I$, let $D_i \in \mathsf{DRed}^{q_i}_{\mu_i,\rho}$. Let $(p_i)_{i \in I}$ be a family of reals of $[0,1]$ such that $\sum_{i \in I} p_i \leq 1$. Then $\sum_{i \in I} p_i D_i \in \mathsf{DRed}^{\sum_{i \in I} p_i q_i}_{\sum_{i \in I} p_i \mu_i,\rho}$.

Proof. Similar to the proof of Lemma 34. $\Box$

$\mathsf{TRed}$ is closed under anti-reduction for Dirac distributions, but also in the case corresponding to the reduction of a choice operator:

Lemma 36 (Reductions and Sets of Candidates)
• Suppose that $M \to_v \{ N^{1} \}$ and that $N \in \mathsf{TRed}^{p}_{\mu,\rho}$. Then $M \in \mathsf{TRed}^{p}_{\mu,\rho}$.
• Suppose that $M \to_v \{ N^{p}, L^{1-p} \}$, that $N \in \mathsf{TRed}^{p'}_{\mu,\rho}$ and that $L \in \mathsf{TRed}^{p''}_{\nu,\rho}$. Then $M \in \mathsf{TRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$.

Proof.
• Since $N \in \mathsf{TRed}^{p}_{\mu,\rho}$, for every $0 \leq r < p$ there exist $\nu_r \preceq \mu$ and $n_r \in \mathbb{N}$ such that $N \Rrightarrow^{n_r}_{v} D_r \in \mathsf{DRed}^{r}_{\nu_r,\rho}$. Recall that $\Rrightarrow^{n_r+1}_{v} = \to_v \circ \Rrightarrow^{n_r}_{v}$. It follows that $M \Rrightarrow^{n_r+1}_{v} D_r$, which has the required properties, so that $M \in \mathsf{TRed}^{p}_{\mu,\rho}$.
• Let $0 \leq r < pp' + (1-p)p''$.
Let $(r', r'')$ be such that $r = pr' + (1-p)r''$, $0 \leq r' < p'$ and $0 \leq r'' < p''$. Since $N \in \mathsf{TRed}^{p'}_{\mu,\rho}$, there exist $n_{r'}$ and $\mu_{r'} \preceq \mu$ such that $N \Rrightarrow^{n_{r'}}_{v} D_{r'} \in \mathsf{DRed}^{r'}_{\mu_{r'},\rho}$. Since $L \in \mathsf{TRed}^{p''}_{\nu,\rho}$, there exist $m_{r''}$ and $\nu_{r''} \preceq \nu$ such that $L \Rrightarrow^{m_{r''}}_{v} E_{r''} \in \mathsf{DRed}^{r''}_{\nu_{r''},\rho}$. Suppose that $n_{r'} \leq m_{r''}$, the dual case being exactly symmetrical. By Lemma 33, denoting by $D_{r''}$ the distribution such that $N \Rrightarrow^{m_{r''}}_{v} D_{r''}$, there exists $\mu_{r'} \preceq \mu_{r''} \preceq \mu$ such that $D_{r''} \in \mathsf{DRed}^{r'}_{\mu_{r''},\rho}$. Now $M \Rrightarrow^{m_{r''}+1}_{v} p D_{r''} + (1-p) E_{r''}$, and by Lemma 34 we have $p D_{r''} + (1-p) E_{r''} \in \mathsf{DRed}^{pr'+(1-p)r''}_{\mu_{r''} \oplus_p \nu_{r''},\rho}$. Since by construction $\mu_{r''} \oplus_p \nu_{r''} \preceq \mu \oplus_p \nu$, we can conclude that $M \in \mathsf{TRed}^{pp'+(1-p)p''}_{\mu \oplus_p \nu,\rho}$. $\Box$

Reducibility sets are monotonic with respect to the subtyping order $\sqsubseteq$:

Lemma 37 (Subtyping Soundness)
• Suppose that $\sigma \sqsubseteq \tau$. Then, for every $p \in [0,1]$ and $\rho$, $\mathsf{VRed}^{p}_{\sigma,\rho} \subseteq \mathsf{VRed}^{p}_{\tau,\rho}$.
• Suppose that $\mu \sqsubseteq \nu$ and that $\sum \mu = \sum \nu$. Then, for every $p \in [0,1]$ and $\rho$, $\mathsf{DRed}^{p}_{\mu,\rho} \subseteq \mathsf{DRed}^{p}_{\nu,\rho}$.
• Suppose that $\mu \sqsubseteq \nu$. Then, for every $p \in [0,1]$ and $\rho$, $\mathsf{TRed}^{p}_{\mu,\rho} \subseteq \mathsf{TRed}^{p}_{\nu,\rho}$.

Proof. The proof is by mutual induction on the statements, following the shape of the simple type refined by $\sigma$ and $\mu$, as earlier.
• Suppose that $\sigma :: \mathsf{Nat}$. Then $\sigma = \mathsf{Nat}^{s}$ and $\tau = \mathsf{Nat}^{r}$ with $s \preceq r$. Let $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$. There are three possibilities:
– Either $s = \widehat{i}^{\,k}$ and $r = \widehat{i}^{\,k'}$ with $k \leq k'$. Then $V$ is of the shape $S^{n}\,0$. If $p = 0$ the result is immediate. Else we have $n < [\![ s ]\!]_{\rho} = \rho(i) + k \leq \rho(i) + k' = [\![ r ]\!]_{\rho}$, so that $V \in \mathsf{VRed}^{p}_{\tau,\rho}$.
– Or $s = \widehat{i}^{\,k}$ and $r = \infty$. In this case $V$ is of the shape $S^{n}\,0$ and therefore $V \in \mathsf{VRed}^{p}_{\tau,\rho}$.
– Or $s = r = \infty$. In this case $\sigma = \tau$ and the result is immediate.
• Suppose that $\sigma = \sigma' \to \mu$ and that $\tau = \tau' \to \nu$. Let $p \in [0,1]$, $\rho$ be a size environment, and $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$. We have that $\tau' \sqsubseteq \sigma'$ and $\mu \sqsubseteq \nu$.
It follows, by induction hypothesis, that $\mathsf{VRed}^{p'}_{\tau',\rho} \subseteq \mathsf{VRed}^{p'}_{\sigma',\rho}$ and that $\mathsf{TRed}^{p'}_{\mu,\rho} \subseteq \mathsf{TRed}^{p'}_{\nu,\rho}$ for every $p' \in [0,1]$. Since $V \in \mathsf{VRed}^{p}_{\sigma,\rho}$, for every $q \in (0,1]$ and $W \in \mathsf{VRed}^{q}_{\sigma',\rho}$ we have $V\,W \in \mathsf{TRed}^{pq}_{\mu,\rho} \subseteq \mathsf{TRed}^{pq}_{\nu,\rho}$. As $\mathsf{VRed}^{q}_{\tau',\rho} \subseteq \mathsf{VRed}^{q}_{\sigma',\rho}$, we obtain $V \in \mathsf{VRed}^{p}_{\tau,\rho}$.
• Suppose that $\mu = \{\, \sigma_j^{p'_j} \mid j \in J \,\}$ and that $\nu = \{\, \tau_k^{p''_k} \mid k \in K \,\}$. By definition of subtyping, there exists $f : J \to K$ such that for all $j \in J$, $\sigma_j \sqsubseteq \tau_{f(j)}$, and for all $k \in K$, $\sum_{j \in f^{-1}(k)} p'_j \leq p''_k$. Note that since $\sum \mu = \sum \nu$, this is in fact an equality. Let $D \in \mathsf{DRed}^{p}_{\mu,\rho}$; then there exist a pseudo-representation $D = [\, (V_i)^{p_i} \mid i \in I \,]$ and families $(p_{ij})_{i \in I, j \in J}$ and $(q_{ij})_{i \in I, j \in J}$ of reals of $[0,1]$ satisfying:
1. $\forall i \in I,\ \forall j \in J,\ V_i \in \mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho}$,
2. $\forall i \in I,\ \sum_{j \in J} p_{ij} = p_i$,
3. $\forall j \in J,\ \sum_{i \in I} p_{ij} = p'_j$,
4. $p \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij}$.
By induction hypothesis, for every $j \in J$, $\mathsf{VRed}^{q_{ij}}_{\sigma_j,\rho} \subseteq \mathsf{VRed}^{q_{ij}}_{\tau_{f(j)},\rho}$. We now prove that $[\, (V_i)^{p_i} \mid i \in I \,]$ witnesses that $D \in \mathsf{DRed}^{p}_{\nu,\rho}$. We need to define families of reals $(p'_{ik})_{i \in I, k \in K}$ and $(q'_{ik})_{i \in I, k \in K}$ satisfying the four usual conditions. To this end, for every $i \in I$, $k \in K$, we set
$p'_{ik} = \sum_{j \in f^{-1}(k)} p_{ij}$ and $q'_{ik} = \max_{j \in f^{-1}(k)} q_{ij}$.
Let us check that the four conditions hold:
1. $\forall i \in I,\ \forall k \in K,\ V_i \in \mathsf{VRed}^{q'_{ik}}_{\tau_k,\rho}$, by induction hypothesis and by definition of $q'_{ik}$;
2. $\forall i \in I,\ \sum_{k \in K} p'_{ik} = \sum_{k \in K}\sum_{j \in f^{-1}(k)} p_{ij} = \sum_{j \in J} p_{ij} = p_i$;
3. $\forall k \in K,\ \sum_{i \in I} p'_{ik} = \sum_{i \in I}\sum_{j \in f^{-1}(k)} p_{ij} = \sum_{j \in f^{-1}(k)}\sum_{i \in I} p_{ij} = \sum_{j \in f^{-1}(k)} p'_j = p''_k$;
4. $p \leq \sum_{i \in I}\sum_{j \in J} q_{ij}\,p_{ij} = \sum_{i \in I}\sum_{k \in K}\sum_{j \in f^{-1}(k)} q_{ij}\,p_{ij} \leq \sum_{i \in I}\sum_{k \in K}\sum_{j \in f^{-1}(k)} q'_{ik}\,p_{ij} = \sum_{i \in I}\sum_{k \in K} q'_{ik} \sum_{j \in f^{-1}(k)} p_{ij} = \sum_{i \in I}\sum_{k \in K} q'_{ik}\,p'_{ik}$.
It follows that $D \in \mathsf{DRed}^{p}_{\nu,\rho}$.
• Suppose that µ = {σ_j^{p′_j} | j ∈ J} and that ν = {τ_k^{p′′_k} | k ∈ K}. By definition of subtyping, there exists f : J → K such that σ_j ⊑ τ_{f(j)} for all j ∈ J, and ∑_{j ∈ f⁻¹(k)} p′_j ≤ p′′_k for all k ∈ K. Let M ∈ TRed^p_{µ,ρ}. Then, for every 0 ≤ r < p, there exist µ′_r ⪯ µ and n_r such that M ⇛^{n_r}_v D_r ∈ DRed^r_{µ′_r,ρ}. By definition of µ′_r ⪯ µ, we have µ′_r = {σ_j^{q′_j} | j ∈ J} with q′_j ≤ p′_j for every j ∈ J. We set ν′_r = {τ_{f(j)}^{q′_j} | j ∈ J}, which is such that ∑µ′_r = ∑ν′_r and, by construction, µ′_r ⊑ ν′_r, so that we can apply the induction hypothesis and obtain that M ⇛^{n_r}_v D_r ∈ DRed^r_{ν′_r,ρ}. The result follows, since by construction ν′_r ⪯ ν. □

.9 Reducibility Sets for Open Terms

We are now ready to extend the notion of reducibility set from the realm of closed terms to the one of open terms. This turns out to be subtle. The guiding intuition is that one would like to define a term M with free variables in →x to be reducible iff any closure M[→V/→x] is itself reducible in the sense of Definition 24. What happens, however, to the underlying degree of reducibility p? How do we relate the degrees of reducibility of →V with the one of M[→V/→x]? The answer is contained in the following definition:

Definition 26 (Reducibility Sets for Open Terms) Suppose that Γ is a sized context of the form x_1 : σ_1, . . . , x_n : σ_n, and that y is a variable distinct from x_1, . . . , x_n. Then we define the following sets of terms and values:

OTRed^{Γ | ∅}_{µ,ρ} = { M | ∀(q_i)_i ∈ [0, 1]^n, ∀(V_1, . . . , V_n) ∈ ∏_{i=1}^n VRed^{q_i}_{σ_i,ρ}, M[→V/→x] ∈ TRed^{∏_{i=1}^n q_i}_{µ,ρ} }

OVRed^{Γ | ∅}_{σ,ρ} = { W | ∀(q_i)_i ∈ [0, 1]^n, ∀(V_1, . . . , V_n) ∈ ∏_{i=1}^n VRed^{q_i}_{σ_i,ρ}, W[→V/→x] ∈ VRed^{∏_{i=1}^n q_i}_{σ,ρ} }

OTRed^{Γ | y : {τ_j^{p_j}}_{j∈J}}_{µ,ρ} = { M | ∀(q_i)_i ∈ [0, 1]^n, ∀→V ∈ ∏_{i=1}^n VRed^{q_i}_{σ_i,ρ}, ∀(q′_j)_{j∈J} ∈ [0, 1]^J, ∀W ∈ ⋂_{j∈J} VRed^{q′_j}_{τ_j,ρ}, M[→V, W/→x, y] ∈ TRed^α_{µ,ρ} }

OVRed^{Γ | y : {τ_j^{p_j}}_{j∈J}}_{σ,ρ} = { Z | ∀(q_i)_i ∈ [0, 1]^n, ∀→V ∈ ∏_{i=1}^n VRed^{q_i}_{σ_i,ρ}, ∀(q′_j)_{j∈J} ∈ [0, 1]^J, ∀W ∈ ⋂_{j∈J} VRed^{q′_j}_{τ_j,ρ}, Z[→V, W/→x, y] ∈ VRed^α_{σ,ρ} }

where the degree of reducibility α is defined as

α = (∏_{i=1}^n q_i) (∑_{j∈J} p_j q′_j + 1 − ∑_{j∈J} p_j).

Note that this definition specialises to:

OTRed^{∅ | ∅}_{µ,ρ} = TRed^1_{µ,ρ}
OVRed^{∅ | ∅}_{σ,ρ} = VRed^1_{σ,ρ}
OTRed^{∅ | y : {τ_i^{p_i}}_{i∈I}}_{µ,ρ} = { M | ∀(q_i)_i ∈ [0, 1]^I, ∀V ∈ ⋂_{i∈I} VRed^{q_i}_{τ_i,ρ}, M[V/y] ∈ TRed^{∑_{i∈I} p_i q_i + 1 − ∑_{i∈I} p_i}_{µ,ρ} }
OVRed^{∅ | y : {τ_i^{p_i}}_{i∈I}}_{σ,ρ} = { W | ∀(q_i)_i ∈ [0, 1]^I, ∀V ∈ ⋂_{i∈I} VRed^{q_i}_{τ_i,ρ}, W[V/y] ∈ VRed^{∑_{i∈I} p_i q_i + 1 − ∑_{i∈I} p_i}_{σ,ρ} }

Note also that these sets extend the ones for closed terms: in particular, OTRed^{∅ | ∅}_{µ,ρ} = TRed^1_{µ,ρ}.

As for closed terms (Lemma 32), reducible values are reducible terms:

Lemma 38 (Reducible Values are Reducible Terms) For every Γ, Θ, σ and ρ, V ∈ OVRed^{Γ | Θ}_{σ,ρ} if and only if V ∈ OTRed^{Γ | Θ}_{{σ^1},ρ}. An immediate consequence is that OVRed^{Γ | Θ}_{σ,ρ} ⊆ OTRed^{Γ | Θ}_{{σ^1},ρ}.

Proof. Corollary of Lemma 32 and of the definitions of the candidates for open terms. □

The following easy lemma relates the reducibility of natural numbers; it will be used to treat the cases of the rules Succ and Zero in the proof of typing soundness:

Lemma 39
• If V ∈ VRed^p_{Nat^s,ρ}, then S V ∈ VRed^p_{Nat^ŝ,ρ}.
• For every size s, 0 ∈ VRed^p_{Nat^ŝ,ρ}.

Proof. First point:
• Suppose that s = î^k, that V ∈ VRed^p_{Nat^{î^k},ρ}, and that p > 0. Then V = S^n 0 for some n < ⟦s⟧_ρ.
Then S V = S^{n+1} 0 satisfies n + 1 < ⟦ŝ⟧_ρ = ⟦s⟧_ρ + 1, so that S V ∈ VRed^p_{Nat^{î^{k+1}},ρ}.
• Suppose that V ∈ VRed^p_{Nat^∞,ρ} or that p = 0. By definition, V = S^n 0 for some n ∈ ℕ. It follows that S V ∈ VRed^p_{Nat^ŝ,ρ}.
Second point:
• Suppose that p = 0. Then 0 ∈ VRed^p_{Nat^ŝ,ρ} by definition.
• Else we need to prove that ⟦ŝ⟧_ρ > 0. But ŝ is either ∞, in which case ⟦ŝ⟧_ρ = ∞, or of the shape î^k with k > 0, and ⟦î^k⟧_ρ = ρ(i) + k > 0. □

To handle the fixpoint rule, we need to relate the notion of sized walk, which guards it, with the reducibility sets, and in particular with the degrees of reducibility we can attribute to recursively-defined terms.

Definition 27 (Probabilities of Convergence in Finite Time) Consider a sized walk. We define its associated probabilities of convergence in finite time (Pr_{n,m})_{n∈ℕ, m∈ℕ} as follows: for all n ∈ ℕ and m ∈ ℕ, the real number Pr_{n,m} is the probability that, starting from m, the sized walk reaches 0 in at most n steps.

The point is that, for an AST sized walk, the more we iterate, the closer the probability of reaching 0 within the first n steps gets to 1.

Lemma 40 (Finite Approximations of AST) Let m ∈ ℕ and ε ∈ (0, 1]. Consider a sized walk and its associated probabilities of convergence in finite time (Pr_{n,m})_{n∈ℕ, m∈ℕ}. If the sized walk is AST, there exists n ∈ ℕ such that Pr_{n,m} ≥ 1 − ε.

Proof. Suppose, by contradiction, that there exists ε ∈ (0, 1] such that there is no n ∈ ℕ with Pr_{n,m} ≥ 1 − ε. Then lim_{n→∞} Pr_{n,m} ≤ 1 − ε. But this limit must be 1, since we supposed the sized walk to be AST. □

The following lemma allows us to treat the base case of Lemma 42:

Lemma 41 Suppose that V is a closed value of simple type Nat → κ. Then, for every Nat^i → µ :: Nat → κ, and for every size environment ρ such that ρ(i) = 0, we have V ∈ VRed^1_{Nat^i → µ,ρ}.

Proof.
To prove that V ∈ VRed^1_{Nat^i → µ,ρ}, we need to show that, for every q ∈ (0, 1] and every W ∈ VRed^q_{Nat^i,ρ}, we have V W ∈ TRed^q_{µ,ρ}. This is always the case, as VRed^q_{Nat^i,ρ} is the empty set by definition: there is no value of the shape S^n 0 with n < ρ(i) = 0. □

The following lemma is the crucial result relating sized walks with the reducibility sets. It proves that, when the sized walk is AST, and after substitution of the variables of the context by reducible values in the recursively-defined term, we can prove the degree of reducibility to be any probability Pr_{n,m} of convergence in finite time.

Lemma 42 (Convergence in Finite Time and letrec) Consider the distribution type µ = {(Nat^{s_j} → ν[s_j/i])^{p_j} | j ∈ J}. Let Γ be the sized context x_1 : Nat^{r_1}, . . . , x_l : Nat^{r_l}. Suppose that Γ | f : µ ⊢ V : Nat^î → ν[î/i] and that µ induces an AST sized walk. Denote by (Pr_{n,m})_{n∈ℕ, m∈ℕ} its associated probabilities of convergence in finite time. Suppose that V ∈ OVRed^{Γ | f : µ}_{Nat^î → ν[î/i], ρ} for every ρ. Let →W ∈ ∏_{i=1}^l VRed^1_{Nat^{r_i},ρ}; then, for every (n, m) ∈ ℕ², we have that

letrec f = V[→W/→x] ∈ VRed^{Pr_{n,m}}_{Nat^i → ν, ρ[i ↦ m]}

Proof. We prove the statement by induction on n.
• If n = 0, we have two cases.
  – If m = 0, then Lemma 41 implies that letrec f = V[→W/→x] ∈ VRed^1_{Nat^i → ν, ρ[i ↦ 0]}, so that by downward closure (Lemma 22) we obtain letrec f = V[→W/→x] ∈ VRed^{Pr_{n,0}}_{Nat^i → ν, ρ[i ↦ 0]}.
  – If m ≠ 0, then Pr_{n,m} = 0. The hypotheses of the lemma imply that letrec f = V[→W/→x] :: Nat → ⟨ν⟩, and we conclude using Lemma 21.
• Suppose that n ≥ 1.
  – If m = 0, the result is immediate, as in the previous case.
  – Suppose that m > 0. Then m = m′ + 1. By definition, s_j must be of the shape î^{k_j} with k_j ≥ 0, for every j ∈ J.
We set I = {k_j | j ∈ J} and q_{k_j} = p_j for every j ∈ J. The sized walk induced by the distribution type µ is then the sized walk associated with (I, (q_i)_{i∈I}), which from the state m′ + 1 ∈ ℕ ∖ {0} moves:
– to the state m′ + k_j with probability p_j, for every j ∈ J;
– to 0 with probability 1 − ∑_{j∈J} p_j.
It follows that

Pr_{n+1, m′+1} = ∑_{j∈J} p_j Pr_{n, m′+k_j} + 1 − ∑_{j∈J} p_j    (15)

For every j ∈ J, let us apply the induction hypothesis and obtain

letrec f = V[→W/→x] ∈ VRed^{Pr_{n, m′+k_j}}_{Nat^i → ν, ρ[i ↦ m′+k_j]}

By Lemma 26,

letrec f = V[→W/→x] ∈ VRed^{Pr_{n, m′+k_j}}_{Nat^{î^{k_j}} → ν[î^{k_j}/i], ρ[i ↦ m′]} = VRed^{Pr_{n, m′+k_j}}_{Nat^{s_j} → ν[s_j/i], ρ[i ↦ m′]}

Since this is valid for every j ∈ J, we have that

letrec f = V[→W/→x] ∈ ⋂_{j∈J} VRed^{Pr_{n, m′+k_j}}_{Nat^{s_j} → ν[s_j/i], ρ[i ↦ m′]}

Since V ∈ OVRed^{Γ | f : µ}_{Nat^î → ν[î/i], ρ[i ↦ m′]}, we obtain

V[→W, letrec f = V[→W/→x] / →x, f] ∈ VRed^{∑_{j∈J} p_j Pr_{n, m′+k_j} + 1 − (∑_{j∈J} p_j)}_{Nat^î → ν[î/i], ρ[i ↦ m′]}

which, by (15) and by Lemma 26, gives

V[→W, letrec f = V[→W/→x] / →x, f] ∈ VRed^{Pr_{n+1, m′+1}}_{Nat^i → ν, ρ[i ↦ m′+1]}

But this term is an unfolding of letrec f = V[→W/→x], so that by Corollary 4 we obtain

letrec f = V[→W/→x] ∈ VRed^{Pr_{n+1, m′+1}}_{Nat^i → ν, ρ[i ↦ m′+1]}

and, since Pr_{n,m} ≤ Pr_{n+1,m} and m = m′ + 1, downward closure (Lemma 22) gives letrec f = V[→W/→x] ∈ VRed^{Pr_{n,m}}_{Nat^i → ν, ρ[i ↦ m]}. □

When m = ∞, the previous lemma does not allow us to conclude, and an additional argument is required. Indeed, it does not make sense to consider a sized walk beginning from ∞: the meaning of this size is in fact any integer, not the ordinal ω. Before we justify this understanding, we need the following companion lemma.

Lemma 43 If i pos σ, then
• VRed^p_{σ, ρ[i ↦ n]} ⊆ VRed^p_{σ, ρ[i ↦ ∞]},
• DRed^p_{µ, ρ[i ↦ n]} ⊆ DRed^p_{µ, ρ[i ↦ ∞]},
• TRed^p_{µ, ρ[i ↦ n]} ⊆ TRed^p_{µ, ρ[i ↦ ∞]}.

Proof.
• Let s = î^n. We have ⟦s⟧_{ρ[i ↦ 0]} = n.
Using Lemma 26, we obtain VRed pσ [ s / i ] ,ρ [ i = VRed pσ,ρ [ i J s K ρ [ i ] = VRed pσ,ρ [ i n ] By the same lemma, VRed pσ [ ∞ / i ] ,ρ [ i = VRed pσ,ρ [ i J ∞ K ρ [ i ] = VRed pσ,ρ [ i ] Since i pos σ and s ∞ , Lemma 16 implies that σ [ s / i ] ⊑ σ [ ∞ / i ]. By Lemma 37, we obtain VRed pσ [ s / i ] ,ρ [ i ⊆ VRed pσ [ ∞ / i ] ,ρ [ i and thus VRed pσ,ρ [ i n ] ⊆ VRed pσ,ρ [ i ] . • Let D ∈ DRed pµ,ρ [ i n ] . It follows that D = (cid:2) ( V i ) p i (cid:12)(cid:12) i ∈ I (cid:3) and that, setting µ = n ( σ j ) p ′ j (cid:12)(cid:12) j ∈ J o , there exists families ( p ij ) i ∈I ,j ∈J and ( q ij ) i ∈I ,j ∈J of reals of [0 , 1] sat-isfying:1. ∀ i ∈ I , ∀ j ∈ J , V i ∈ VRed q ij σ j ,ρ [ i n ] ,2. ∀ i ∈ I , P j ∈J p ij = p i ,3. ∀ j ∈ J , P i ∈I p ij = µ ( σ j ),4. p ≤ P i ∈I P j ∈J q ij p ij .Since ∀ i ∈ I , ∀ j ∈ J , V i ∈ VRed q ij σ j ,ρ [ i n ] ⊆ VRed q ij σ j ,ρ [ i ] , we obtain that D ∈ DRed pµ,ρ [ i ] using the same witnesses. 51 Let M ∈ TRed pµ,ρ [ i n ] . It follows that for every 0 ≤ r < p , there exists ν r µ and n r ∈ N such that M ⇛ n r v D r and D r ∈ DRed rν r ,ρ [ i n ] . But DRed rν r ,ρ [ i n ] ⊆ DRed rν r ,ρ [ i ] , so that M ∈ TRed pµ,ρ [ i ] . (cid:3) The following lemma proves that ∞ stands for “every integer”. It proves indeed that, if a termis in a reducibility set for any finite interpretation of a size, then it is also in the set where thesize is interpreted as ∞ : Lemma 44 (Reducibility for Infinite Sizes) Suppose that i pos ν and that W is the value letrec f = V . If W ∈ VRed p Nat i → ν,ρ [ i n ] for every n ∈ N , then W ∈ VRed p Nat i → ν,ρ [ i ] . Proof. Suppose that i pos ν and that, for every n ∈ N , letrec f = V ∈ VRed p Nat i → ν,ρ [ i n ] . Let W ∈ VRed p Nat i ,ρ [ i ] . Then W = S m for some m ∈ N . 
It follows that W ∈ VRed p Nat i ,ρ [ i m +1] .But letrec f = V ∈ VRed p Nat i → ν,ρ [ i m +1] , so that( letrec f = V ) W ∈ TRed pν,ρ [ i m +1] By Lemma 43, since i pos ν , we obtain that( letrec f = V ) W ∈ TRed pν,ρ [ i ] It follows that letrec f = V ∈ VRed p Nat i → ν,ρ [ i ] (cid:3) The following technical lemma will allow us to deal with the Let rule in the proof of typingsoundness. Lemma 45 Let ( q i ) i ∈ (0 , n , ( q ′ j ) j ∈ (0 , m and ( q ′′ k ) k ∈ (0 , l . Let L and G be two setsof indexes. Let ≤ r ′′ < ( Q ni =1 q i ) (cid:16)Q mj =1 q ′ j (cid:17) (cid:16)Q lk =1 q ′′ k (cid:17) . Suppose that, for every ≤ r < Q mj =1 q ′ j , there exists two families ( p rlg ) l ∈L ,g ∈G and ( q rlg ) l ∈L ,g ∈G of reals of [0 , satisfying r ≤ X l ∈L X g ∈G p rlg q rlg (16) and X l ∈L X g ∈G p rlg ≤ Then there exists ≤ r < Q mj =1 q ′ j and a family ( r ′ lg ) l ∈L ,g ∈G satisfying ∀ l ∈ L , ∀ g ∈ G , ≤ r ′ lg < n Y i =1 q i ! l Y k =1 q ′′ k ! q rlg (18) and r ′′ ≤ X l ∈L X g ∈G p rlg r ′ lg (19)52 roof. Since r ′′ < ( Q ni =1 q i ) (cid:16)Q mj =1 q ′ j (cid:17) (cid:16)Q lk =1 q ′′ k (cid:17) , there exists ε > ∀ l ∈ L , ∀ g ∈G , ε lg > < ε + X l ∈L X g ∈G ε lg < n Y i =1 q i ! m Y j =1 q ′ j l Y k =1 q ′′ k ! − r ′′ (20)We pick r such that m Y j =1 q ′ j − ε < r < m Y j =1 q ′ j (21)and this induces families ( p rlg ) l ∈L ,g ∈G and ( q rlg ) l ∈L ,g ∈G of reals of [0 , 1] satisfying (16) and (17).We choose a family ( r ′ lg ) l ∈L ,g ∈G such that ∀ l ∈ L , ∀ g ∈ G , n Y i =1 q i ! l Y k =1 q ′′ k ! q rlg − ε lg < r ′ lg < n Y i =1 q i ! l Y k =1 q ′′ k ! q rlg By (17) and since ( Q ni =1 q i ) (cid:16)Q lk =1 q ′′ k (cid:17) , we obtain from (20) that n Y i =1 q i ! l Y k =1 q ′′ k ! ε + X l ∈L X g ∈G p rlg ε lg < n Y i =1 q i ! m Y j =1 q ′ j l Y k =1 q ′′ k ! − r ′′ Thus r ′′ < n Y i =1 q i ! l Y k =1 q ′′ k ! m Y j =1 q ′ j − ε − X l ∈L X g ∈G p rlg ε lg By (21) and then (16): r ′′ < n Y i =1 q i ! l Y k =1 q ′′ k ! 
X l ∈L X g ∈G p rlg q rlg − X l ∈L X g ∈G p rlg ε lg which rewrites to r ′′ < X l ∈L X g ∈G p rlg n Y i =1 q i ! l Y k =1 q ′′ k ! q rlg − ε lg ! and by definition of ( r ′ lg ) l ∈L ,g ∈G we obtain r ′′ < X l ∈L X g ∈G p rlg r ′ lg as requested. (cid:3) All these fundamental lemmas allow us to prove the following proposition, which expresses thatall typable terms are reducible and is the key step towards the fact that typability implies AST: Proposition 5 (Typing Soundness) If Γ | Θ ⊢ M : µ , then M ∈ OTRed Γ | Θ µ,ρ for every ρ .Similarly, if Γ | Θ ⊢ V : σ , then V ∈ OVRed Γ | Θ σ,ρ for every ρ . roof. We proceed by induction on the derivation of the sequent Γ | Θ ⊢ M : µ . When M = V is a value, we know by Lemma 6 that µ = (cid:8) σ (cid:9) ; and we prove that V ∈ OVRed Γ | Θ σ,ρ for every ρ .By Lemma 38 we obtain that V ∈ OTRed Γ | Θ µ,ρ for every ρ . We proceed by case analysis on the lastrule of the derivation:We suppose in the following that Γ is a sized context which can be enumerated in the form x : σ , . . . , x n : σ n , and that y is a variable distinct from x , . . . , x n . We proceed accordingly tothe last rule of the derivation: – Var: Suppose that Γ , y : τ | Θ ⊢ y : τ . Let ( q i ) i ∈ [0 , n +1 and ( V , . . . , V n , W ) ∈ (cid:0)Q ni =1 VRed q i σ i ,ρ (cid:1) × VRed q n +1 τ,ρ . • If Θ = ∅ , we need to prove that y [ −→ V , W/ −→ x , y ] = W ∈ VRed Q n +1 i =1 q i τ,ρ . This is immediatesince Q n +1 i =1 q i ≤ q n +1 , using Lemma 22. • If Θ = z : (cid:8) θ p j j (cid:12)(cid:12) j ∈ J (cid:9) , let (cid:0) q ′ j (cid:1) j ∈J ∈ [0 , J and Z ∈ T j ∈J VRed q ′ j σ j ,ρ . Weneed to prove that y [ −→ V , W, Z/ −→ x , y, z ] = W ∈ VRed ( Q n +1 i =1 q i )( P j ∈J p j q ′ j ) τ,ρ . But, again, (cid:16)Q n +1 i =1 q i (cid:17) (cid:16)P j ∈J p j q ′ j (cid:17) ≤ q n +1 since q i ≤ i , q ′ j ≤ j and P j ∈J p j =1. We conclude using Lemma 22. – Var’: Suppose that Γ | y : (cid:8) τ (cid:9) ⊢ y : τ . Let ( q i ) i ∈ [0 , n +1 and ( V , . . . 
, V n , W ) ∈ (cid:0)Q ni =1 VRed q i σ i ,ρ (cid:1) × VRed q n +1 τ,ρ . We need to prove that y [ −→ V , W/ −→ x , y ] = W ∈ VRed Q n +1 i =1 q i τ,ρ .This is immediate since Q n +1 i =1 q i ≤ q n +1 , using Lemma 22. – Succ: Suppose that Γ | Θ ⊢ S V : Nat b s . Suppose moreover that Θ = ∅ . Let ( q i ) i ∈ [0 , n and ( W , . . . , W n ) ∈ Q ni =1 VRed q i σ i ,ρ . We need to prove that ( S V ) [ −→ W / −→ x ] ∈ VRed Q ni =1 q i Nat b s ,ρ . But( S V ) [ −→ W / −→ x ] = S (cid:16) V [ −→ W / −→ x ] (cid:17) and, by induction hypothesis, V [ −→ W / −→ x ] ∈ VRed Q ni =1 q i Nat s ,ρ . ByLemma 39, ( S V ) [ −→ W / −→ x ] ∈ VRed Q ni =1 q i Nat b s ,ρ and we can conclude. The case where Θ = ∅ is similar. – Zero: Suppose that Γ | Θ ⊢ : Nat b s . Suppose moreover that Θ = ∅ . Let ( q i ) i ∈ [0 , n and( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ . By Lemma 39, [ −→ V / −→ x ] = 0 ∈ VRed Q ni =1 q i Nat b s ,ρ . The case whereΘ = ∅ is similar. – λ : Suppose that Γ | Θ ⊢ λy.M : σ → µ , with Θ = z : (cid:8) ( τ j ) p j (cid:12)(cid:12) j ∈ J (cid:9) . Let ( q i ) i ∈ [0 , n and ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ . Let (cid:0) q ′ j (cid:1) j ∈J ∈ [0 , J and W ∈ T j ∈J VRed q ′ j σ j ,ρ . We needto prove that( λy.M ) [ −→ V , W/ −→ x , z ] = λy.M [ −→ V , W/ −→ x , z ] ∈ VRed ( Q ni =1 q i )( P j ∈J p j q ′ j ) σ → µ,ρ Therefore, let q ′′ ∈ (0 , 1] and Z ∈ VRed q ′′ σ,ρ . We now have to prove that (cid:16) λy.M [ −→ V , W/ −→ x , z ] (cid:17) Z ∈ TRed q ′′ ( Q ni =1 q i )( P j ∈J p j q ′ j ) µ,ρ (22)But (cid:16) λy.M [ −→ V , W/ −→ x , z ] (cid:17) Z → v M [ −→ V , W, Z/ −→ x , z, y ]Since Γ , x : σ | Θ ⊢ M : µ by typing, the induction hypothesis ensures that M [ −→ V , W, Z/ −→ x , z, y ] ∈ TRed q ′′ ( Q ni =1 q i )( P j ∈J p j q ′ j ) µ,ρ and by Lemma 36 we obtain that (22) holds, which allows to con-clude.The case where Θ = ∅ is similar. 54 Sub: Suppose that Γ | Θ ⊢ M : ν is derived from Γ | Θ ⊢ M : µ where µ ⊑ ν . Supposethat Θ = ∅ . 
Let ( q i ) i ∈ [0 , n and ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ . By induction hypothesis, M [ V / −→ x ] ∈ TRed Q ni =1 q i µ,ρ so that by Lemma 37 we have M [ V / −→ x ] ∈ TRed Q ni =1 q i ν,ρ which allows toconclude.The case where Θ = ∅ is similar. – App: Suppose that Γ , ∆ , Ξ | Θ , Ψ ⊢ V W : µ . Suppose that Θ , Ψ = ∅ . We set Γ = x : σ , . . . , x n : σ n , ∆ = y : τ , . . . , y m : τ m and Ξ = z : θ , . . . , z l : θ l . Let( q i ) i ∈ [0 , n , ( q ′ j ) j ∈ [0 , m , ( q ′′ k ) k ∈ [0 , l , ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ , ( W , . . . , W m ) ∈ Q mj =1 VRed q ′ j τ j ,ρ , and ( Z , . . . , Z l ) ∈ Q lk =1 VRed q k θ k ,ρ . We need to prove that( V W ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] = V [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] W [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] (23)is in TRed ( Q ni =1 q i )( Q mj =1 q ′ j )( Q lk =1 q ′′ k ) µ,ρ . • Suppose that Q ni =1 q i = 0. Then we need to prove that (23) is in TRed µ,ρ , which isimmediate by Lemma 21 as it is of simple type h µ i . • Suppose that Q ni =1 q i = 0. It follows that ∀ i ∈ I , q i = 0. We have that Γ , ∆ | ∅ ⊢ V : σ → µ which, by induction hypothesis, gives that V ∈ OVRed Γ , ∆ | ∅ σ → µ,ρ . Note that for every i ∈ I we have σ i :: Nat ; since q i = 0, we have by definition of the sets of candidatesthat VRed q i σ i ,ρ = VRed σ i ,ρ . It follows that V [ −→ V , −→ W / −→ x , −→ y ] = V [ −→ V , −→ W −→ Z / −→ x , −→ y , −→ z ] ∈ VRed ( Q ni =1 )( Q mj =1 q ′ j ) σ → µ,ρ = VRed Q mj =1 q ′ j σ → µ,ρ . Since Γ , Ξ | Ψ ⊢ W : σ , we obtain similarlyfrom the induction hypothesis that W [ −→ V , −→ W −→ Z / −→ x , −→ y , −→ z ] ∈ VRed Q lk =1 q ′′ k σ,ρ . 
By definitionof VRed Q mj =1 q ′ j σ → µ,ρ , we obtain that V [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] W [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ∈ TRed ( Q mj =1 q ′ j )( Q lk =1 q ′′ k ) µ,ρ and by downwards closure (Lemma 22) we obtain that (23) is in TRed ( Q ni =1 q i )( Q mj =1 q ′ j )( Q lk =1 q ′′ k ) µ,ρ so that we can conclude.The case where Θ , Ψ = ∅ is similar. – Choice: Suppose that Γ | Θ ⊕ p Ψ ⊢ M ⊕ p N : µ ⊕ p ν . Suppose that Θ = ∅ and that Ψ = ∅ .We set Θ = y : (cid:8) τ p j j (cid:12)(cid:12) j ∈ J (cid:9) and Ψ = y : n ( τ ′ k ) p ′ k (cid:12)(cid:12) k ∈ K o where we suppose that j ∈ J ∩ L ⇔ σ j = τ j . We obtain thatΘ ⊕ p Ψ = y : (cid:8) τ pp j j (cid:12)(cid:12) j ∈ J \ ( J ∩ K ) (cid:9) + n ( τ l ) pp l +(1 − p ) p ′ l (cid:12)(cid:12) l ∈ J ∩ K o + n ( τ ′ k ) (1 − p ) p ′ k (cid:12)(cid:12) k ∈ K \ ( J ∩ K ) o Let ( q i ) i ∈ [0 , n , ( q ′ j ) j ∈ [0 , |J \ ( J ∩K ) | , ( q ′′ l ) l ∈ [0 , |J ∩K| , ( q ′′′ k ) k ∈ [0 , |K\ ( J ∩K ) | , ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ , and W ∈ \ j ∈J \ ( J ∩K ) VRed q ′ j τ j ,ρ ∩ \ l ∈J ∩K VRed q ′′ l τ l ,ρ ∩ \ k ∈K\ ( J ∩K ) VRed q ′′′ k τ ′ k ,ρ We need to prove that ( M ⊕ p N ) [ −→ V , W/ −→ x , y ] is in TRed ( Q ni =1 q i )( P j ∈J \ ( J∩K ) pp j q ′ j + P l ∈J∩K ( pp l +(1 − p ) p ′ l ) q ′′ l + P k ∈K\ ( J∩K ) (1 − p ) p ′ k q ′′′ k ) µ ⊕ p ν,ρ = TRed p ( Q ni =1 q i )( P j ∈J\ ( J∩K ) p j q ′ j + P l ∈J∩K p l q ′′ l ) +(1 − p ) ( Q ni =1 q i )( P l ∈J∩K p ′ l q ′′ l + P k ∈K\ ( J∩K ) p ′ k q ′′′ k ) µ ⊕ p ν,ρ | Θ ⊢ M : µ , which by induction hypothesis implies that M [ −→ V , W/ −→ x , y ] ∈ TRed ( Q ni =1 q i )( P j ∈J \ ( J∩K ) p j q ′ j + P l ∈J∩K p l q ′′ l ) µ,ρ Typing also implies that Γ | Ψ ⊢ N : ν , and provides by induction hypothesis N [ −→ V , W/ −→ x , y ] ∈ TRed ( Q ni =1 q i )( P l ∈J∩K p ′ l q ′′ l + P k ∈K\ ( J∩K ) p ′ k q ′′′ k ) µ ⊕ p ν,ρ Since ( M ⊕ p N ) [ −→ V , W/ −→ x , y ] → v (cid:26) (cid:16) M [ −→ V , W/ −→ x , y ] (cid:17) p , (cid:16) N [ −→ V , 
W/ −→ x , y ] (cid:17) − p (cid:27) Lemma 36 allows to conclude.The cases where Θ = ∅ or Ψ = ∅ are treated similarly. – Let: Suppose that Γ , ∆ , Ξ | Θ , (cid:0)P i ∈I p i · Ψ i (cid:1) ⊢ let x = M in N : P i ∈I p i · µ i . LetΓ = x : σ , . . . , x n : σ n , ∆ = y : τ , . . . , y m : τ m , and Ξ = z : θ , . . . , z m : θ l .Let ( q i ) i ∈ [0 , n , ( q ′ j ) j ∈ [0 , m and ( q ′′ k ) k ∈ [0 , l . Let ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ ,( W , . . . , W m ) ∈ Q mj =1 VRed q ′ j τ j ,ρ , and ( Z , . . . , Z l ) ∈ Q lk =1 VRed q ′′ k θ k ,ρ . There are two subcaseshere. • Suppose that M is a value. Then the last typing rule isΓ , ∆ | Θ ⊢ M : σ Γ , Ξ , x : σ | Ψ ⊢ N : µ h Γ i = Nat Γ , ∆ , Ξ | Θ , Ψ ⊢ let x = M in N : µ We treat the case where Θ = Ψ = ∅ , the two other ones being similar. We need to provethat ( let x = M in N ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ∈ TRed ( Q i ∈I q i )( Q j ∈J q ′ j )( Q k ∈K q ′′ k ) µ,ρ (24)We now distinguish two cases. – Suppose that Q i ∈I q i = 0. Then (25) holds immediately since by Lemma 21 all theterms of simple type h µ i are in TRed µ,ρ . – Else for every i ∈ I we have VRed q i σ i ,ρ = VRed σ i ,ρ . Since Γ , ∆ | Θ ⊢ M : σ , we obtainby induction hypothesis that M [ −→ V , −→ W / −→ x , −→ y ] ∈ TRed ( Q i ∈I )( Q j ∈J q ′ j ) σ,ρ . None of the −→ z occur in M , so M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ Z ] ∈ TRed Q j ∈J q ′ j σ,ρ . Since Γ , Ξ , x : σ | Ψ ⊢ N : µ , we obtain by induction hypothesis that N [ −→ V , −→ W , −→ Z , M [ −→ V , −→ Z / −→ x , −→ y , −→ z ] / −→ x , −→ z , x ]is in TRed ( Q i ∈I q i )( Q j ∈J q ′ j )( Q k ∈K q ′′ k ) µ,ρ . 
Since none of the variables of −→ y occur in thisterm, we obtain N [ −→ V , −→ W , −→ Z , M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] / −→ x , −→ y , −→ z , x ] ∈ TRed ( Q i ∈I q i )( Q j ∈J q ′ j )( Q k ∈K q ′′ k ) µ,ρ Now ( let x = M in N ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ]= let x = M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] in N [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] → v (cid:26) (cid:16) N [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ][ M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] /x ] (cid:17) (cid:27) = (cid:26) (cid:16) N [ −→ V , −→ W , −→ Z , M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] / −→ x , −→ y , −→ z , x ] (cid:17) (cid:27) and it follows from Lemma 36 that (25) holds, allowing to conclude.56 Suppose that M is not a value. We treat in a first time the case where Θ = Ψ = ∅ . Thecase where Θ = ∅ is exactly similar, while the case where Ψ = ∅ reveals the reason why asum P j ∈J p j q ′ j appears in the definitions of OTRed and OVRed . The last typing rule isΓ , ∆ | ∅ ⊢ M : (cid:8) σ p h h (cid:12)(cid:12) h ∈ H (cid:9) Γ , Ξ , x : σ h | ∅ ⊢ N : µ h h Γ i = Nat Γ , ∆ , Ξ | ∅ ⊢ let x = M in N : P h ∈H p h · µ h We need to prove that( let x = M in N ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ]= let x = M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] in N [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ∈ TRed ( Q i ∈I q i )( Q j ∈J q ′ j )( Q k ∈K q ′′ k ) P h ∈H p h · µ h ,ρ (25)We now distinguish two cases. – Suppose that (cid:0)Q i ∈I q i (cid:1) (cid:16)Q j ∈J q ′ j (cid:17) (cid:0)Q k ∈K q ′′ k (cid:1) = 0. Then (25) holds immediatelysince by Lemma 21 all the terms of simple type h P h ∈H p h · µ h i are in TRed P h ∈H p h · µ h ,ρ . – Else, we use the induction hypothesis on Γ , ∆ | ∅ ⊢ M : (cid:8) σ p h h (cid:12)(cid:12) h ∈ H (cid:9) . Since h σ i i = Nat , for every i ∈ I we have VRed q i σ i ,ρ = VRed σ i ,ρ . 
Together with the factthat −→ z does not appear in M , we obtain that M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ∈ TRed Q j ∈J q ′ j n σ phh (cid:12)(cid:12) h ∈H o ,ρ By definition, for every 0 ≤ r < Q mj =1 q ′ j , there exists n r and ν r = (cid:8) σ p r,g g (cid:12)(cid:12) g ∈ G r (cid:9) (cid:8) σ p h h (cid:12)(cid:12) h ∈ H (cid:9) with G r ⊆ H such that M [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ⇛ n r v D r = h X p ′′ r,l l (cid:12)(cid:12) l ∈ L r i ∈ DRed rν r ,ρ This implies the existence of two families ( p rlg ) l ∈L r ,g r ∈G and ( q rlg ) l ∈L r ,g r ∈G of reals of[0 , 1] satisfying in particular r ≤ X l ∈L r X g ∈G r p rlg q rlg (26) X l ∈L r X g ∈G r p rlg ≤ ∀ l ∈ L , X g ∈G r p rlg = p ′′ r,l (28) ∀ g ∈ G , X l ∈L r p rlg = p r,g (29)and ∀ l ∈ L r , ∀ g ∈ G r , X l ∈ VRed q rlg σ g ,ρ (30)By (26) and (27), we can apply Lemma 45 and we obtain 0 ≤ r < Q mj =1 q ′ j and afamily ( r ′ lg ) l ∈L r ,g ∈G r satisfying ∀ l ∈ L r , ∀ g ∈ G r , ≤ r ′ lg < n Y i =1 q i ! l Y k =1 q ′′ k ! 
q rlg (31)and r ′′ ≤ X l ∈L X g ∈G p rlg r ′ lg (32)57e now consider r to be fixed to this value given by the lemma, this providing D r , ν r and so on.Since Γ , Ξ , x : σ h | ∅ ⊢ N : µ h , we obtain by induction hypothesis using (30) thatfor every l ∈ L and g ∈ G we have N [ −→ V , −→ W , −→ Z , X l / −→ x , −→ y , −→ z , x ] ∈ TRed ( Q ni =1 q i )( Q lk =1 q ′′ k ) q rlg µ g ,ρ (33)By 31, there exists for every l ∈ L and g ∈ G and index m lg and a type µ ′ lg ⊑ µ g suchthat N [ −→ V , −→ W , −→ Z , X l / −→ x , −→ y , −→ z , x ] ⇛ m lg v E lg ∈ DRed r ′ lg µ ′ lg ,ρ (34)Now set m = max l ∈L ,g ∈G m lg By Lemma 33, we obtain types µ ′ lg µ ′′ lg µ g and distributions E ′ lg such that all thereduction lengths are the same: N [ −→ V , −→ W , −→ Z , X l / −→ x , −→ y , −→ z , x ] ⇛ mv E ′ lg ∈ DRed r ′ lg µ ′′ lg ,ρ (35)Now it follows of (28) that D r = h X p rl,g l (cid:12)(cid:12) l ∈ L r , g ∈ G r i which allows us to use Lemma 4, obtaining that( let x = M in N ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ⇛ n r + m +1 v X l ∈L X g ∈G p rl,g · E ′ lg By (35) and Lemma 35, we obtain that X l ∈L X g ∈G p rl,g · E ′ lg ∈ DRed P l ∈L P g ∈G p rl,g r ′ l,g P l ∈L P g ∈G p rl,g µ ′′ l,g ,ρ By (32) and downward closure (Lemma 22) we obtain X l ∈L X g ∈G p rl,g · E ′ lg ∈ DRed r ′′ P l ∈L P g ∈G p rl,g µ ′′ l,g ,ρ and since by (29) we have P l ∈L P g ∈G p rl,g µ ′′ l,g P h ∈H p h µ h we can conclude that( let x = M in N ) [ −→ V , −→ W , −→ Z / −→ x , −→ y , −→ z ] ∈ TRed ( Q i ∈I q i )( Q j ∈J q ′ j )( Q k ∈K q ′′ k ) P h ∈H p h · µ h ,ρ – Case: Suppose that Γ , ∆ | Θ ⊢ case V of { S → W | → Z } : µ . Suppose that Θ = ∅ . Weset Γ = x : σ , . . . , x n : σ n and ∆ = y : τ , . . . , y m : τ m .Let ( q i ) i ∈ [0 , n , ( q ′ j ) j ∈ [0 , m , ( V , . . . , V n ) ∈ Q ni =1 VRed q i σ i ,ρ and ( V ′ , . . . , V ′ m ) ∈ Q mj =1 VRed q ′ j τ j ,ρ .We need to prove that( case V of { S → W | → Z } ) [ −→ V , −→ V ′ / −→ x , −→ y ] ∈ TRed ( Q ni =1 q i )( Q mj =1 q ′ j ) µ,ρ i.e. 
that case V [ −→ V , −→ V ′ / −→ x , −→ y ] of n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o is in TRed ( Q ni =1 q i )( Q mj =1 q ′ j ) µ,ρ . Since Γ | ∅ ⊢ V : Nat b s , we have by induction hypothesis that V [ −→ V / −→ x ] ∈ TRed Q ni =1 q i n ( Nat b s ) o ,ρ . Since it is a value, we have by Lemma 32 the stronger statementthat V [ −→ V / −→ x ] ∈ VRed Q ni =1 q i Nat b s ,ρ which implies that V [ −→ V / −→ x ] is of the shape S k for k ∈ N satisfying Q ni =1 q i = 0 = ⇒ k < J b s K ρ . The typing also ensures that none of the variables of −→ y occurs in V , so that V [ −→ V / −→ x ] = V [ −→ V , −→ V ′ / −→ x , −→ y ].58 If k = 0, then case 0 of n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o → v (cid:26) (cid:16) Z [ −→ V , −→ V ′ / −→ x , −→ y ] (cid:17) (cid:27) Since ∆ | ∅ ⊢ Z : µ , by induction hypothesis, we have that Z [ −→ V ′ / −→ y ] ∈ TRed Q mj =1 q ′ j µ,ρ and also, by the typing hypothesis, that none of the variables of −→ x is free in Z [ −→ V ′ / −→ y ]so that Z [ −→ V ′ / −→ y ] = Z [ −→ V , −→ V ′ / −→ x , −→ y ]. But Q ni =1 q i ≤ 1, so that the downward-closureproperty of Lemma 22 induces that Z [ −→ V , −→ V ′ / −→ x , −→ y ] ∈ TRed ( Q ni =1 q i )( Q mj =1 q ′ j ) µ,ρ Now the closure by anti-reduction of Lemma 36 ensures that case V [ −→ V , −→ V ′ / −→ x , −→ y ] of n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o is in TRed ( Q ni =1 q i )( Q mj =1 q ′ j ) µ,ρ . 
• If k > 0, then case S k n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o → v (cid:26) (cid:16)(cid:16) W [ −→ V , −→ V ′ / −→ x , −→ y ] (cid:17) (cid:0) S k − (cid:1)(cid:17) (cid:27) By typing hypothesis, we have ∆ | ∅ ⊢ W : Nat s → µ and the induction hypothesisprovides W [ −→ V ′ / −→ y ] ∈ TRed Q mj =1 q ′ j { ( Nat s → µ ) } ,ρ which, since none of the −→ x appears freely in W ,and by Lemma 32, implies that W [ −→ V , −→ V ′ / −→ x , −→ y ] ∈ VRed Q mj =1 q ′ j Nat s → µ,ρ . – Suppose that Q ni =1 q i = 0. It follows that k < J b s K ρ and therefore k − < J s K ρ whichimplies that S k − ∈ VRed Nat s ,ρ . Since W [ −→ V , −→ V ′ / −→ x , −→ y ] ∈ VRed Q mj =1 q ′ j µ,ρ , we obtainthat (cid:16) W [ −→ V , −→ V ′ / −→ x , −→ y ] (cid:17) (cid:0) S k − (cid:1) is in TRed Q mj =1 q ′ j µ,ρ . By closure by anti-reduction(Lemma 36), we have that case V [ −→ V , −→ V ′ / −→ x , −→ y ] of n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o is in TRed Q mj =1 q ′ j µ,ρ and by downward closure (Lemma 22) we obtain that it is also in TRed ( Q ni =1 q i )( Q mj =1 q ′ j ) µ,ρ , from which we conclude. – Suppose that Q ni =1 q i = 0. Then all we need to prove is that case V [ −→ V , −→ V ′ / −→ x , −→ y ] of n S → W [ −→ V , −→ V ′ / −→ x , −→ y ] | → Z [ −→ V , −→ V ′ / −→ x , −→ y ] o is in TRed µ,ρ . But this term has simple type h µ i and by Lemma 21 the result holds.The case where Θ = ∅ is similar. 59 letrec : Suppose that Γ , ∆ | Θ ⊢ letrec f = V : Nat r → ν [ r / i ]. We treat the case where∆ = Θ = ∅ . The general case is easily deduced using the downward-closure of the reducibilitysets (Lemma 22). Let Γ = x : Nat r , . . . , x n : Nat r n . We need to prove that, for everyfamily ( q i ) i ∈ [0 , n and every ( W , . . . 
, W n ) ∈ Q ni =1 VRed q i Nat r i ,ρ , we have( letrec f = V ) [ −→ W / −→ x ] = (cid:16) letrec f = V [ −→ W / −→ x ] (cid:17) ∈ VRed Q ni =1 q i Nat r → ν [ r / i ] ,ρ If there exists i ∈ I such that q i = 0, the result is immediate as the term is simply-typed andLemma 21 applies. Else, for every i ∈ I , we have by definition that VRed q i Nat r i ,ρ = VRed Nat r i ,ρ .Since the sets VRed are downward-closed (Lemma 22), it is in fact enough to prove that forevery ( W , . . . , W n ) ∈ Q ni =1 VRed Nat r i ,ρ , we have letrec f = V [ −→ W / −→ x ] ∈ VRed Nat r → ν [ r / i ] ,ρ Moreover, by size commutation (Lemma 26), VRed Nat r → ν [ r / i ] ,ρ = VRed Nat i → ν,ρ [ i J r K ρ ] Let us therefore prove the stronger fact that, for every integer m ∈ N ∪ {∞} , letrec f = V [ −→ W / −→ x ] ∈ VRed Nat i → ν,ρ [ i m ] Now, the typing derivation gives us that Γ | f : µ ⊢ V : Nat b i → ν [ b i / i ] and that µ induces anAST sized walk. Denote ( Pr n,m ) n ∈ N ,m ∈ N its associated probabilities of convergence in finitetime. By induction hypothesis, V ∈ OVRed Γ | f : µ Nat b i → ν [ b i / i ] ,ρ for every ρ and we can apply Lemma 42.It follows that, for every ( n, m ) ∈ N , letrec f = V [ −→ W / −→ x ] ∈ VRed Pr n,m Nat i → ν,ρ [ i m ] Let ε ∈ (0 , n ∈ N such that Pr n,m ≥ − ε . Using downwardclosure (Lemma 22) and quantifying over all the ε , we obtain letrec f = V [ −→ W / −→ x ] ∈ \ <ε< VRed − ε Nat i → ν,ρ [ i m ] so that, by continuity of VRed (Lemma 24), we obtain letrec f = V [ −→ W / −→ x ] ∈ VRed Nat i → ν,ρ [ i m ] (36)for every m ∈ N , allowing us to conclude. It remains however to treat the case where m = ∞ .Since i pos ν and that (36) holds for every m ∈ N , Lemma 44 applies and we obtain the result. (cid:3) This proposition, together with the definition of OTRed , implies the main result of the paper,namely that typability implies almost-sure termination: Theorem 3 Suppose that M ∈ Λ s ⊕ ( µ ) . Then M is AST. Proof. 
Suppose that $M \in \Lambda^s_\oplus(\mu)$. Then, by Proposition 5, we have $M \in \mathrm{OTRed}^{\emptyset \mid \emptyset}_{\mu,\rho}$ for every $\rho$. By definition, $\mathrm{OTRed}^{\emptyset \mid \emptyset}_{\mu,\rho} = \mathrm{TRed}_{\mu,\rho}$. Corollary 4 then implies that $M$ is AST. □

Conclusions and Perspectives

We presented a type system for an affine, simply-typed λ-calculus enriched with a probabilistic choice operator, constructors for the natural numbers, and recursion. The affinity constraint implies that a given higher-order variable may occur (freely) at most once in any probabilistic branch of a program. The type system we designed decorates the affine simple types with size information, allowing us to incorporate into the types relevant information about the recursive behaviour of the functions contained in the program. A guard condition on the typing rule for letrec, formulated with reference to an appropriate Markov chain, ensures that typable terms are AST. The soundness proof of this type system for AST relies on a quantitative extension of the reducibility method, which adapts the sets of candidates to the infinitary and probabilistic nature of the computations we consider.

A first natural question is that of the decidability of type inference for our system. In the deterministic case, this question was only addressed by Barthe and colleagues in an unpublished tutorial [19], and their solution is technically involved, especially when it comes to dealing with the fixpoint rule. We believe that their approach could be extended to our system of monadic sized types, and hope that it could provide a decidable type inference procedure for it. However, this extension will certainly be challenging, as we need to appropriately infer distribution types associated with AST sized walks in the letrec rule.

Another perspective would be to study the general, non-affine case. This is challenging, for two reasons.
First, the system of size annotations needs to be more expressive in order to distinguish between the various occurrences of the same function symbol in the same probabilistic branch. A solution would be to use the combined power of dependent types – which already allowed Xi to formulate an interesting type system for termination in the deterministic case [22] – and of linearity: we could use linear dependent types [34] to formulate an extension of the monadic sized type system keeping track of how many recursive calls are performed, and of the size of each recursive argument. The second challenge would then be to associate, in the typing rule for letrec, the information contained in linear dependent types with an appropriate random process. This random process should be kept decidable to guarantee that at least derivation checking can be automated, and there will probably be a trade-off between the duplication power we allow in programs and the complexity of deciding AST for the guard in the letrec rule.

The extension of our type system to deal with general inductive datatypes is essentially straightforward. Other perspectives would be to enrich the type system so as to be able to treat coinductive data, polymorphic types, or ordinal sizes, three features present in most systems of sized types dealing with the traditional deterministic case, but which we chose not to address in this paper so as to focus on the already complex task of accommodating sized types to a probabilistic and higher-order framework.

References

[1] Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press (2001)
[2] Pearl, J.: Probabilistic Reasoning in Intelligent Systems - Networks of Plausible Inference. Morgan Kaufmann Series in Representation and Reasoning (1989)
[3] Thrun, S.: Robotic mapping: A survey.
In: Exploring Artificial Intelligence in the New Millennium, Morgan Kaufmann (2002)
[4] de Leeuw, K., Moore, E.F., Shannon, C.E., Shapiro, N.: Computability by probabilistic machines. Automata Studies (1956) 183–212
[5] Motwani, R., Raghavan, P.: Randomized Algorithms. Cambridge University Press (1995)
[6] Barthe, G., Grégoire, B., Béguelin, S.Z.: Formal certification of code-based cryptographic proofs. In Shao, Z., Pierce, B.C., eds.: POPL 2009, ACM (2009) 90–101
[7] Barthe, G., Grégoire, B., Heraud, S., Béguelin, S.Z.: Computer-aided security proofs for the working cryptographer. In Rogaway, P., ed.: CRYPTO 2011. Volume 6841 of LNCS, Springer (2011) 71–90
[8] Goodman, N.D., Mansinghka, V.K., Roy, D.M., Bonawitz, K., Tenenbaum, J.B.: Church: a language for generative models. In McAllester, D.A., Myllymäki, P., eds.: UAI 2008, AUAI Press (2008) 220–229
[9] Bournez, O., Kirchner, C.: Probabilistic rewrite strategies. Applications to ELAN. In Tison, S., ed.: RTA 2002. Volume 2378 of LNCS, Springer (2002) 252–266
[10] Esparza, J., Gaiser, A., Kiefer, S.: Proving termination of probabilistic programs using patterns. In Madhusudan, P., Seshia, S.A., eds.: CAV 2012. Volume 7358 of LNCS, Springer (2012) 123–138
[11] Fioriti, L.M.F., Hermanns, H.: Probabilistic termination: Soundness, completeness, and compositionality. In Rajamani, S.K., Walker, D., eds.: POPL 2015, ACM (2015) 489–501
[12] Chatterjee, K., Fu, H., Novotný, P., Hasheminezhad, R.: Algorithmic analysis of qualitative and quantitative termination problems for affine probabilistic programs. In Bodík, R., Majumdar, R., eds.: POPL 2016, ACM (2016) 327–342
[13] Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz's. In Chaudhuri, S., Farzan, A., eds.: CAV 2016. Volume 9779 of LNCS, Springer (2016) 3–22
[14] Hughes, J., Pareto, L., Sabry, A.: Proving the correctness of reactive systems using sized types.
In Boehm, H., Steele Jr., G.L., eds.: POPL'96, ACM Press (1996) 410–423
[15] Hofmann, M.: A mixed modal/linear lambda calculus with applications to Bellantoni-Cook safe recursion. In Nielsen, M., Thomas, W., eds.: CSL '97. Volume 1414 of LNCS, Springer (1997) 275–294
[16] Girard, J.Y., Taylor, P., Lafont, Y.: Proofs and Types. Cambridge University Press, New York, NY, USA (1989)
[17] Dal Lago, U., Hofmann, M.: Realizability models and implicit complexity. Theor. Comput. Sci. (20) (2011) 2029–2047
[18] Barthe, G., Frade, M.J., Giménez, E., Pinto, L., Uustalu, T.: Type-based termination of recursive definitions. MSCS (1) (2004) 97–141
[19] Barthe, G., Grégoire, B., Riba, C.: A tutorial on type-based termination. In Bove, A., Barbosa, L.S., Pardo, A., Pinto, J.S., eds.: ALFA Summer School 2008. Volume 5520 of LNCS, Springer (2008) 100–152
[20] Barthe, G., Grégoire, B., Riba, C.: Type-based termination with sized products. In Kaminski, M., Martini, S., eds.: CSL 2008. Volume 5213 of LNCS, Springer (2008) 493–507
[21] Abel, A.: Termination checking with types. ITA (4) (2004) 277–319
[22] Xi, H.: Dependent types for program termination verification. Higher-Order and Symbolic Computation (1) (2002) 91–131
[23] Amadio, R.M., Coupet-Grimal, S.: Analysis of a guard condition in type theory. In Nivat, M., ed.: FoSSaCS'98. Volume 1378 of LNCS, Springer (1998) 48–62
[24] Bournez, O., Garnier, F.: Proving positive almost-sure termination. In Giesl, J., ed.: RTA 2005. Volume 3467 of LNCS, Springer (2005) 323–337
[25] Chakarov, A., Sankaranarayanan, S.: Probabilistic program analysis with martingales. In Sharygina, N., Veith, H., eds.: CAV 2013. Volume 8044 of LNCS, Springer (2013) 511–526
[26] Cappai, A., Dal Lago, U.: On equivalences, metrics, and polynomial time. In Kosowski, A., Walukiewicz, I., eds.: FCT 2015. Volume 9210 of LNCS, Springer (2015) 311–323
[27] Dal Lago, U., Parisen Toldin, P.: A higher-order characterization of probabilistic polynomial time. Inf. Comput.
(2015) 114–141
[28] Dal Lago, U., Zorzi, M.: Probabilistic operational semantics for the lambda calculus. RAIRO - Theor. Inf. and Applic. (3) (2012) 413–450
[29] Brázdil, T., Brožek, V., Etessami, K., Kučera, A., Wojtczak, D.: One-counter Markov decision processes. In: 21st ACM-SIAM Symposium on Discrete Algorithms (2010)
[30] Dal Lago, U.: The geometry of linear higher-order recursion. In: LICS 2005, IEEE Computer Society (2005) 366–375
[31] Sabry, A., Felleisen, M.: Reasoning about programs in continuation-passing style. Lisp and Symbolic Computation (3-4) (1993) 289–360
[32] Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons, Inc., New York, NY, USA (1986)
[33] Dal Lago, U., Grellois, C.: Probabilistic termination by monadic affine sized typing (extended version). Available at http://eternal.cs.unibo.it/ptmast.pdf
$p_{\max,n} \Longrightarrow D \notin \mathrm{DRed}^{p'}_{\mu,\rho}$. It follows that all the $p_{\max,n}$ coincide, and that they are greater than $\sup_{n \in \mathbb{N}} q_n = p$. So $D \in \mathrm{DRed}^{p}_{\mu,\rho}$.
$r$, we obtain the desired $\nu_r \sqsubseteq \mu$, $n_r \in \mathbb{N}$, and $D_r$ having the properties of interest. The result follows. □

In this subsection, we show how the sizes appearing in the (sized or distribution) type parametrizing a reducibility set relate to the interpretation of the size variables contained in the size environment which also parametrizes it. We first prove the following lemma, which will be used as a companion for this result: