Strong Call by Value is Reasonable for Time

Abstract

The invariance thesis of Slot and van Emde Boas states that all reasonable models of computation simulate each other with polynomially bounded overhead in time and constant-factor overhead in space. In this paper we show that a family of strong call-by-value strategies in the λ-calculus are reasonable for time. The proof is based on a construction of an appropriate abstract machine, systematically derived using Danvy et al.'s functional correspondence that connects higher-order interpreters with abstract-machine models by a well-established transformation technique. This is the first machine that implements a strong CbV strategy and simulates β-reduction with overhead polynomial in the number of β-steps and in the size of the initial term. We prove this property using a form of amortized cost analysis à la Okasaki.


arXiv [cs.PL], Feb

Małgorzata Biernacka, Witold Charatonik, and Tomasz Drab
Institute of Computer Science, University of Wrocław, Poland
{mabi,wch,tdr}@cs.uni.wroc.pl
https://ii.uni.wroc.pl/~{mabi,wch,tdr}

Keywords: λ-calculus · Abstract machines · Computational complexity · Reduction strategies · Normalization by evaluation.

The invariance thesis of Slot and van Emde Boas [27] states that all reasonable models of computation simulate each other with polynomially bounded overhead in time and constant-factor overhead in space. For a long time it was not known whether there exist variants of the λ-calculus reasonable in this sense. In particular, it was not known whether there are evaluation strategies for the λ-calculus that can be simulated on Turing machines in time bounded by a polynomial in the number of performed β-reductions. Recently this question has been answered positively, for both time and space, with the weak call-by-value (CbV) [23] and, for time, with the strong call-by-name (CbN) [6] strategies. Here, by constructing an appropriate abstract machine, we show that a family of strong CbV strategies are reasonable for time.

It is well known that the λ-calculus provides a foundation for functional programming languages such as OCaml or Haskell and, more recently, for proof assistants such as Coq or ELF. A typical functional programming language uses a weak reduction strategy (e.g., call by value or call by need) that only reduces terms until a weak value is reached, and in particular it does not descend under lambda abstractions. On the other hand, proof assistants (which can be seen as functional languages with a rich system of dependent types) require a strong (i.e., full-reducing) reduction strategy for type checking. The study of strong normalization strategies in the λ-calculus, their properties and efficient implementation models, is therefore directly motivated by the need for efficient tools to handle large-scale, complex verification tasks carried out in such proof assistants [21,12].

Abstract machines provide implementation models for β-reduction that operationalize its key aspects: the process of finding a redex (decomposition) and the meta-operation of substitution (β-contraction). They are low-level devices, pretty close to Turing machines. An abstract machine is typically deterministic and it makes specific choices when searching for redices, i.e., it implements a specific reduction strategy of the calculus. Accattoli et al., in their work on the complexity of abstract machines [3,2,8], advocate the following notion of reasonability: a machine is called reasonable if it simulates the strategy with the time overhead bounded by a polynomial in the number of β-steps and in the size of the initial term.

The main technical goal of our work is to construct a reasonable (in the formal sense defined above) abstract machine for strong CbV. This has been a nontrivial [1,4] open problem (partial results include [7,8,13]) and an important step in the research program aimed at developing the complexity analysis of the Coq main abstract machine formulated by Accattoli [2]. Since there are known simulations of Turing machines in the lambda calculus and of abstract machines by Turing machines, the existence of such a machine implies reasonability (for time) of strong CbV in the sense of Slot and van Emde Boas.

This research is supported by the National Science Centre of Poland, under grant number 2019/33/B/ST6/00289.

Related work.

In the case of weak reduction there is a vast body of work on abstract machines, their efficiency and construction techniques, but in the strong case the territory is largely uncharted. The first machine for strong normalization in the λ-calculus is due to Crégut and it implements strong CbN [17]. A normalization function realizing strong CbV was proposed by Grégoire & Leroy and implemented in their virtual machine extending the ZAM machine [21]. Another virtual machine for strong CbV was derived by Ager et al. [10] from Aehlig and Joachimski's normalization function [9]. Recently, a strong call-by-need strategy has been proposed by Kesner et al. [12], and the corresponding abstract machine has been derived by Biernacka et al. [14]. Finally, in a recent work [13], Biernacka et al. introduced the first abstract machine for strong CbV, and more specifically for the right-to-left strong CbV strategy. That machine has been derived by a series of systematic transformation steps from a standard normalization-by-evaluation function for the CbV lambda calculus. None of the three interpreters of strong CbV [10,13,21] is reasonable in the sense defined above.

Our work builds on previous developments in the derivational approach to the construction and study of semantic artefacts, and in particular we use Danvy et al.'s functional correspondence and standard techniques used in functional programming [11]. The outcome of the derivation is an abstract machine whose control stacks arise as defunctionalized continuations of the CPS-transformed evaluator. In the case of strong CbV (as well as other hybrid strategies) these stacks are not uniform; their structure corresponds directly to multiple-kinded reduction contexts and this correspondence can be described by a suitable shape invariant of the stacks. Hybrid strategies and their connection with machines have been studied by Garcia-Perez & Nogueira, and by Biernacka et al. [20,15].

In between the weak and the strong CbV strategies, one can consider the Open CbV strategy that extends the usual weak CbV strategy in that it works on open terms and generalizes the notion of value accordingly. In a series of articles, Accattoli et al. study Open CbV in the form of the so-called fireball calculus. They also show several abstract machines (the GLAM family) and study their complexity properties with the aim of providing reasonable and efficient implementations of Open CbV [8].

Contributions.

In order to construct a reasonable abstract machine for strong CbV we refine the techniques from [13]. Our starting point is the same NbE evaluator as in [13] but we modify it using the standard memoization technique to store weak values together with their strong normal forms (if they get computed along the way). Next, we apply Danvy et al.'s functional correspondence [11] to transform this new interpreter into a first-order abstract machine. This way the obtained machine is a systematically constructed semantic artefact rather than an ad-hoc modification of an existing one. The commented code of the development is available in [28].

In order to argue about the efficiency of the derived machine, we apply a form of amortized cost analysis that uses a potential function akin to Okasaki's approach [25].

Thus, the contributions of this paper include:

1. a derivation of a reasonable abstract machine for strong CbV,
2. two variants of a reasonable machine: an environment-based one and a substitution-based one that uses delimited substitution,
3. a proof of correctness of the machine and of its reasonability,
4. a corollary that strong CbV is reasonable for time.

Outline.

In Section 2 we introduce the basic concepts of the lambda calculus and the strong CbV strategy. In Section 3 we present the derivation of the machine starting from an NbE evaluator of the CbV lambda calculus. In Section 3.4 we present the resulting abstract machine with environments and in Section 4.2 its substitution-based variant. In Section 6 we prove the soundness of the machine and in Section 7 its reasonability. Section 8 concludes.

λ-calculus

We work with pure lambda terms given by the following grammar:

$$t ::= x \mid t_1\; t_2 \mid \lambda x.\, t$$

where x ranges over some set of identifiers. As usual, in the sequel the use of a nonterminal restricts the range of this meta-variable (and its versions with primes or subscripts) to the defined family.

We define sets of free and bound variables in a term, and the substitution function as follows:

$$\begin{aligned}
FV(x) &= \{x\} & BV(x) &= \emptyset \\
FV(t_1\; t_2) &= FV(t_1) \cup FV(t_2) & BV(t_1\; t_2) &= BV(t_1) \cup BV(t_2) \\
FV(\lambda x.\, t) &= FV(t) \setminus \{x\} & BV(\lambda x.\, t) &= BV(t) \cup \{x\}
\end{aligned}$$

$$\begin{aligned}
x'[x := t] &= \begin{cases} t & : x = x' \\ x' & : x \neq x' \end{cases} \\
(t_1\; t_2)[x := t] &= t_1[x := t]\;\; t_2[x := t] \\
(\lambda x'.\, t')[x := t] &= \begin{cases} \lambda x'.\, t' & : x = x' \\ \lambda x'.\, t'[x := t] & : x \neq x' \end{cases}
\end{aligned}$$

This "raw" form of substitution makes it possible to capture free variable occurrences of a substituted term when it is substituted under lambda (the last case). In order to avoid this problem, it is standard to introduce α-conversion to make sure that bound and free variables are distinct and the substitution is capture-avoiding.

We first define α-contraction sufficient to rename bound variables:

$$\frac{x' \notin FV(t) \cup BV(t)}{\lambda x.\, t \;\rightharpoonup_\alpha\; \lambda x'.\, t[x := x']}$$

In general, a contraction relation is local and can be lifted to a reduction relation defined on any term by contextual closure in the following way. A context is a term with exactly one free occurrence of a special variable □ called hole. Assuming that □ is not used as a bound variable, contexts in the lambda calculus can be defined by the following grammar:

$$C ::= t\; C \mid C\; t \mid \lambda x.\, C \mid \square$$

Now, for any contraction relation ⇀ and any context C we define contextual closure as follows:

$$\frac{t_1 \rightharpoonup t_2}{C[t_1] \xrightarrow{C} C[t_2]}$$

The notation C[t] is a shortcut for C[□ := t] and it denotes a term obtained by plugging the hole of the context with the given term. If the symbol above the arrow is omitted then the contraction can be done at any location in a term.

The reflexive-transitive closure of → is denoted by ↠ and the reflexive-symmetric-transitive closure is denoted by =, and is called conversion. Juxtaposition of two relations denotes their composition, e.g., $s \twoheadrightarrow_\beta =_\alpha t$ means that $\exists t'.\; s \twoheadrightarrow_\beta t' =_\alpha t$.

Reduction semantics in the lambda calculus.

Based on the ingredients introduced so far, we can define operational semantics for the lambda calculus in the form of reduction semantics that specifies β-contraction as the atomic computation step, and a reduction strategy that prescribes locations in a term where β-reduction can take place.

We define β-contraction in the standard way (and its contextual closure determines β-reduction):

$$\frac{t_1 =_\alpha t_1' \qquad BV(t_1') \cap FV(t_2) = \emptyset}{(\lambda x.\, t_1)\; t_2 \;\rightharpoonup_\beta\; t_1'[x := t_2]}$$

In order to define a specific strategy we need to restrict general contexts C. For the CbV strategy we also need to restrict β-contraction.

Weak reduction, i.e., reduction that does not 'go under lambda', can be defined as reduction in weak contexts W defined by the following grammar:

$$W ::= t\; W \mid W\; t \mid \square$$

In a call-by-value strategy, function arguments need to be evaluated before the function is applied. If only closed terms are considered, function arguments are reduced to lambda abstractions, but in the open case we need a more general notion of a weak normal form, which is a normal form of $\xrightarrow{W}_\beta$. Weak normal forms w can be expressed by the following grammar, where the auxiliary category i denotes inert terms:

$$w ::= \lambda x.\, t \mid i \qquad\qquad i ::= i\; w \mid x$$

To prevent the substitution of reducible terms we restrict β-contraction by requiring that the argument is a weak normal form:

$$\frac{t =_\alpha t' \qquad BV(t') \cap FV(w) = \emptyset}{(\lambda x.\, t)\; w \;\rightharpoonup_{\beta_w}\; t'[x := w]}$$

This way we obtain the relation $\xrightarrow{W}_{\beta_w}$, which is exactly the reduction of the fireball calculus, where weak normal forms are called fireballs [7]. This calculus is nondeterministic but strongly confluent and hence all derivations to weak normal forms have the same length. It also restores the property (called harmony in [7]) that its normal forms are substitutable, i.e., there are no stuck terms.

The fireball reduction can be made deterministic by narrowing the space of possible evaluation contexts: redices may be searched for in the left part of an application only when its right-hand side is a weak value:

$$F ::= t\; F \mid F\; w \mid \square$$
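The grammars of weak normal forms and inert terms above translate directly into two mutually recursive predicates. The following is a minimal sketch, assuming a named term representation with constructors `Var`, `App` and `Lam`; the function names `is_inert` and `is_wnf` are illustrative and do not come from the paper.

```ocaml
(* Named lambda terms, following the grammar t ::= x | t t | λx. t. *)
type term =
  | Var of string
  | App of term * term
  | Lam of string * term

(* Inert terms: i ::= i w | x. *)
let rec is_inert (t : term) : bool =
  match t with
  | Var _ -> true
  | App (t1, t2) -> is_inert t1 && is_wnf t2
  | Lam _ -> false

(* Weak normal forms (fireballs): w ::= λx. t | i.
   The body of an abstraction is not inspected: weak reduction
   does not go under lambda. *)
and is_wnf (t : term) : bool =
  match t with
  | Lam _ -> true
  | Var _ | App _ -> is_inert t

let identity = Lam ("y", Var "y")

(* x (λy. y) is inert: a variable applied to a weak normal form. *)
let inert_example = App (Var "x", identity)

(* (λy. y) x is a β_w-redex, hence not a weak normal form. *)
let redex_example = App (identity, Var "x")
```

Note that `Lam ("x", redex_example)` is accepted by `is_wnf` even though it contains a redex, which is exactly the gap that the strong strategy discussed next has to close.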

The relation $\xrightarrow{F}_{\beta_w}$ (which is β_w-contraction in contexts generated by F-contexts) is a restriction of $\xrightarrow{W}_{\beta_w}$ to a right-to-left strategy, and it is a deterministic extension of the closed weak, right-to-left CbV strategy to the open case.

To obtain a conservative extension of a weak CbV strategy to a full-reducing one, we can just iterate reduction under lambdas after reaching a weak normal form. Such iterations of $\xrightarrow{W}_{\beta_w}$ are also strongly confluent and therefore any such strategy is a good representative of the strong CbV strategy. (The applicative order, i.e., leftmost-innermost reduction, where arguments of a function are directly reduced to a strong normal form, is not a conservative extension of weak CbV and therefore we do not consider it as a strong CbV strategy. Moreover, it does not support recursion [26] and may take a different number of steps than strong CbV strategies as defined here.)

In this paper, we study the twice right-to-left call-by-value strategy (in short, rrCbV), that is an example of a deterministic extension of $\xrightarrow{F}_{\beta_w}$ to a fully reducing strategy: it is deterministic [13], uses the same contraction relation, and F-contexts are a subset of the rrCbV context family. (The right-to-left direction of evaluation in applications is traditional and used, e.g., in OCaml.) This strategy is denoted $\xrightarrow{R}_{\beta_w}$ and its grammar of contexts can be defined using three context nonterminals H, R and F (the latter defined as above for the weak strategy):

$$R ::= \lambda x.\, R \mid H \mid F \qquad\qquad H ::= i\; R \mid H\; n$$

where n are normal forms of the strategy $\to_\beta$, and a are neutral terms, both defined as follows:

$$n ::= \lambda x.\, n \mid a \qquad\qquad a ::= a\; n \mid x$$

rrCbV is a hybrid strategy, i.e., it combines different substrategies: the F-contexts define the weak CbV substrategy, R is the starting nonterminal defining the substrategy that either weakly reduces, or fully reduces weak normal forms using the auxiliary H-strategy, which in turn is responsible for fully reducing inert terms.

Our starting point is the higher-order evaluator from [13] presented in Figure 1. It computes β-normal forms by following the principles of normalization by evaluation [19], where the idea is to map a λ-term to an object in the meta-language (here OCaml) from which a syntactic normal form of the input term can subsequently be read off.

    (* syntax of the lambda-calculus with de Bruijn indices *)
    type index = int
    type term = Var of index | Lam of term | App of term * term

    (* semantic domain *)
    type level = int
    type sem = Abs of (sem -> sem) | Neutral of (level -> term)

    (* reification of semantic objects into normal forms *)
    let rec reify (d : sem) (m : level) : term =
      match d with
      | Abs f ->
        Lam (reify (f (Neutral (fun m' -> Var (m'-m-1)))) (m+1))
      | Neutral l ->
        l m

    (* sem -> sem as a retract of sem *)
    let to_sem (f : sem -> sem) : sem = Abs f
    let from_sem (d : sem) : sem -> sem =
      fun d' ->
        match d with
        | Abs f ->
          f d'
        | Neutral l ->
          Neutral (fun m -> let n = reify d' m in App (l m, n))

    (* interpretation function *)
    let rec eval (t : term) (e : sem list) : sem =
      match t with
      | Var n -> List.nth e n
      | Lam t' -> to_sem (fun d -> eval t' (d :: e))
      | App (t1, t2) -> let d2 = eval t2 e
                        in from_sem (eval t1 e) d2

    (* NbE: interpretation followed by reification *)
    let nbe (t : term) : term = reify (eval t []) 0

Fig. 1. An OCaml implementation of the higher-order compositional evaluator from [13]: an instance of normalization by evaluation for a call-by-value β-reduction in the λ-calculus.

The evaluator works with lambda terms in de Bruijn notation. In this notation, lambda terms are generated by the grammar

$$t ::= n \mid t_1\; t_2 \mid \lambda t$$

where n ranges over natural numbers. There are two flavours of the notation: de Bruijn indices, where n denotes the number of lambdas between the represented variable and its binder; and de Bruijn levels, where n denotes the number of lambdas between the root of a term and the binder of the variable. For example, λx. x (λy. y x) is represented as λ 0 (λ 0 1) with de Bruijn indices and as λ 0 (λ 1 0) with levels. Using de Bruijn notation has a clear benefit of avoiding problems with α-conversion, but there is some cost of having less intuitive β-reduction.

The normalization function first evaluates terms into the semantic domain represented by the recursive type sem; it is completely standard and implemented by the function eval. Then the normal form is extracted from the semantic object by the function reify that mediates between syntax and semantics in the way known from Filinski and Rohde's work [19] on NbE for the untyped λ-calculus.

Using Danvy et al.'s functional correspondence [11] between higher-order evaluators and abstract machines, the evaluator of Figure 1 is transformed in [13] to an abstract machine that implements a full-reducing call-by-value strategy for pure λ-calculus. The obtained machine is not reasonable in terms of complexity [8]: it cannot simulate n steps of β-reduction in a number of transitions that is polynomial in n and in the size of the initial term. The reason is that it never reuses constructed structures, so it has to introduce each constructor of the resulting normal form in a separate step. This can lead to an exponential blow-up, as the following example shows. Consider a family of terms

$$\omega := \lambda x.\, x\; x \qquad\qquad e_n := \lambda x.\, c_n\; \omega\; x$$

where $c_n$ denotes the nth Church numeral. Each $e_n$ reduces to its normal form in a number of steps linear in n, but the size of this normal form is exponential in n.

Remark 1. The interpreters of [10,13,21] all suffer from exponential time overhead because of the size explosion problem.
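The blow-up described for the family e_n can be observed directly with the evaluator of Figure 1. The sketch below reproduces that evaluator in condensed form and adds three helpers that are not part of the paper and are named here only for illustration: `size` (number of constructors), `church` (de Bruijn encoding of c_n) and `e` (the term e_n).

```ocaml
(* Evaluator of Figure 1, condensed (de Bruijn indices). *)
type term = Var of int | Lam of term | App of term * term
type sem = Abs of (sem -> sem) | Neutral of (int -> term)

let rec reify (d : sem) (m : int) : term =
  match d with
  | Abs f -> Lam (reify (f (Neutral (fun m' -> Var (m' - m - 1)))) (m + 1))
  | Neutral l -> l m

let from_sem (d : sem) (d' : sem) : sem =
  match d with
  | Abs f -> f d'
  | Neutral l -> Neutral (fun m -> let n = reify d' m in App (l m, n))

let rec eval (t : term) (e : sem list) : sem =
  match t with
  | Var n -> List.nth e n
  | Lam t' -> Abs (fun d -> eval t' (d :: e))
  | App (t1, t2) -> let d2 = eval t2 e in from_sem (eval t1 e) d2

let nbe (t : term) : term = reify (eval t []) 0

(* Hypothetical helper: number of constructors in a term. *)
let rec size : term -> int = function
  | Var _ -> 1
  | Lam t -> 1 + size t
  | App (t1, t2) -> 1 + size t1 + size t2

(* omega = λx. x x *)
let omega : term = Lam (App (Var 0, Var 0))

(* Church numeral c_n = λf. λx. f (f (... (f x))), with n occurrences of f. *)
let church (n : int) : term =
  let rec body k = if k = 0 then Var 0 else App (Var 1, body (k - 1)) in
  Lam (Lam (body n))

(* e_n = λx. c_n omega x *)
let e (n : int) : term = Lam (App (App (church n, omega), Var 0))

let () =
  List.iter
    (fun n ->
       Printf.printf "n = %d: size of e_n = %d, size of its normal form = %d\n"
         n (size (e n)) (size (nbe (e n))))
    [1; 2; 4; 8]
```

Under this size measure the input grows linearly in n, while the normal form is a complete binary tree of applications with 2^n leaves under one lambda, so the evaluator has to allocate 2^(n+1) constructors.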

Consider term families defined as follows (here x, z are free variables):

$$\begin{aligned}
A_0 &:= x & B_0 &:= z \\
A_{n+1} &:= A_n\; (\lambda y.\, y)\; x & B_{n+1} &:= z\; \lambda w.\, B_n \\
Q_n &:= \lambda x.\, (\lambda z.\, B_n)\; A_n
\end{aligned}$$

Note that there are n + 1 free occurrences of variable z in $B_n$, each of which is under a different number of lambdas (on a different de Bruijn level). Terms of family Q are closed and they reach their normal forms in one β-reduction which substitutes $A_n$ for z in $B_n$. The size of $A_n$ is linear in n and it is substituted for linearly many occurrences of z in $B_n$, resulting in a normal form of quadratic size.

Terms $A_n$ can be easily shared in memory and the resulting representation has size linear in n. This is however not possible with the de Bruijn indices or levels representation, because the resulting normal form has quadratically many constructors of distinct subterms. We illustrate this issue with the term $Q_2$ and its normal forms using three different representations: with names, indices and levels.

$$\begin{aligned}
Q_2 &= \lambda x.\, (\lambda z.\, z\; \lambda w.\, z\; \lambda w.\, z)\; (x\; (\lambda y.\, y)\; x\; (\lambda y.\, y)\; x) \\
N_{nam}(Q_2) &= \lambda x.\, (x\; (\lambda y.\, y)\; x\; (\lambda y.\, y)\; x)\; \lambda w.\, (x\; (\lambda y.\, y)\; x\; (\lambda y.\, y)\; x)\; \lambda w.\, (x\; (\lambda y.\, y)\; x\; (\lambda y.\, y)\; x) \\
N_{ind}(Q_2) &= \lambda\, (0\; (\lambda\, 0)\; 0\; (\lambda\, 0)\; 0)\; \lambda\, (1\; (\lambda\, 0)\; 1\; (\lambda\, 0)\; 1)\; \lambda\, (2\; (\lambda\, 0)\; 2\; (\lambda\, 0)\; 2) \\
N_{lev}(Q_2) &= \lambda\, (0\; (\lambda\, 1)\; 0\; (\lambda\, 1)\; 0)\; \lambda\, (0\; (\lambda\, 2)\; 0\; (\lambda\, 2)\; 0)\; \lambda\, (0\; (\lambda\, 3)\; 0\; (\lambda\, 3)\; 0)
\end{aligned}$$

Here we have three occurrences of the term $A_2 = (x\; (\lambda y.\, y)\; x\; (\lambda y.\, y)\; x)$, each on a different de Bruijn level. Therefore, in the index notation, the subterm x has a different representation in each of these occurrences. Similarly, in the level notation, the subterm λy. y has a different representation in each of these occurrences. In consequence, neither of the two notations allows sharing of different occurrences of $A_2$. This example shows that when working with de Bruijn representations it is not possible to bound the amount of work per single β-reduction by a quantity proportional to the size of the initial term. It does not mean that machines working with these representations must be unreasonable, but a proof involving this quadratic dependency may be more complex than one without this problem. Therefore, we choose to work with names.

Now we present a higher-order evaluator that will be transformed into an abstract machine. The implementation is given in OCaml [24]. It is a modified version of the evaluator from Figure 1, with three major changes. First, it uses names instead of de Bruijn indices to represent variables in lambda terms. Second, it abstracts the environments (the second argument of the function eval) in the sense that they are no longer directly implemented as lists, but as different data structures implementing dictionaries. Third, it uses caches as a form of sharing.

Terms.

We start with the syntax of λ-terms with names:

    type identifier = string
    type term = Var of identifier
              | App of term * term
              | Lam of identifier * term

Environments.

Environments are dictionaries storing values assigned to identifiers. To handle open terms, an environment returns an abstract variable (with the same name) for every undefined identifier, and we make sure that such variables will not be captured during the reification of abstractions. To this end we extend the name to mark that the variable is free; as a result, free variables of the initial term appear in the resulting term under the extended name.

    module Dict = Map.Make (struct type t = identifier let compare = compare end)
    type env = sem Dict.t

    let env_lookup (x : identifier) (e : env) : sem =
      match Dict.find_opt x e with
      | Some v -> v
      | None -> abstract_variable (x ^ "_free")

Caches.

To achieve a reasonable implementation for strong CbV we need to introduce a form of sharing in order to avoid the size explosion problem.

We employ a mechanism similar to memothunks. This allows us to reuse already computed subterms in normal forms. An α-cache is a place where a result of type α can be stored and later used to prevent invoking the same delayed computation many times. It is implemented as follows:

    type 'a cache = 'a option ref

    let cached_call (c : 'a cache) (t : unit -> 'a) : 'a =
      match !c with
      | Some y -> y
      | None -> let y = t () in c := Some y; y

Values.

In the original strong CbV evaluator there are two kinds of values: abstractions and delayed neutral terms, which after defunctionalization correspond to weak normal forms and inert terms, respectively. Here we add another constructor to allow annotation of values with caches for their normal forms.

    type sem = Abs of (sem -> sem)
             | Neutral of (unit -> term)
             | Cache of term cache * sem
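The `term cache` carried by the `Cache` constructor behaves like a write-once memo cell. The following self-contained sketch repeats the cache definition and exercises it on an integer payload; the names `runs`, `expensive`, `cell`, `first` and `second` are hypothetical and serve only to make the number of real evaluations visible.

```ocaml
type 'a cache = 'a option ref

let cached_call (c : 'a cache) (t : unit -> 'a) : 'a =
  match !c with
  | Some y -> y
  | None -> let y = t () in c := Some y; y

(* Hypothetical instrumented computation: counts how often it really runs. *)
let runs = ref 0
let expensive () : int = incr runs; 6 * 7

(* A fresh, empty cache cell. *)
let cell : int cache = ref None

(* The first call runs the thunk and fills the cell;
   the second call only reads the stored result. *)
let first = cached_call cell expensive
let second = cached_call cell expensive

let () =
  Printf.printf "results: %d and %d, thunk evaluated %d time(s)\n"
    first second !runs
```

In the evaluator the payload is a term in normal form, so a second request for the normal form of the same value returns the already built term instead of reconstructing it.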

Reification.

In normalization by evaluation, reification plays the role of a readback of concrete syntactic objects from abstract semantic values. It corresponds to full normalization of weak normal forms in the obtained machine.

Here reification uses two auxiliary functions. Fresh names are generated in a standard way using one extra memory cell. In the representation with names, abstract variables are delayed neutral terms which just return a variable for a given identifier.

    let gensym : unit -> int =
      let c = ref 0 in
      fun () ->
        let res = !c in
        c := res + 1;
        res

    let abstract_variable (x : identifier) : sem =
      let vx = Var x in
      Neutral (fun () -> vx)

The reification of abstractions and neutral terms is accomplished just as in the original evaluator: abstractions are called with an abstract variable with a freshly generated name, and the result is reified under lambda with the same name; delayed neutral terms are simply forced. In the case of values with cache lookup, Cache (c, v), if the result of reification of v is known, it is simply read from the cache c; otherwise it is computed and stored in the cache.

    let rec reify : sem -> term =
      function
      | Abs f ->
        let xm = "x_" ^ string_of_int (gensym ()) in
        Lam (xm, reify (f @@ abstract_variable xm))
      | Neutral l ->
        l ()
      | Cache (c, v) -> cached_call c (fun () -> reify v)

Value application.

Values as elements of a λ-calculus model should also play the role of endofunctions on themselves. Abstractions are wrappings of such functions so it is enough to unwrap them. A neutral term applied to a normal form creates a new neutral term, and it can be constructed by delaying the forcing of this neutral term and the reification of the argument. However, if a neutral term is annotated with a cache, the cache should be consulted when the delayed computation is forced. In contrast, application does not normalize abstractions (abstractions are changed by reduction) and therefore their caches are ignored in such a situation.

    let rec from_sem : sem -> (sem -> sem) =
      function
      | Abs f -> f
      | Neutral l -> apply_neutral l
      | Cache (c, Neutral l) -> apply_neutral (fun () -> cached_call c l)
      | Cache (c, v) -> from_sem v
    and apply_neutral (l : unit -> term) (v : sem) : sem =
      Neutral (fun () -> let n = reify v in App (l (), n))

Evaluation.

Evaluation uses an auxiliary function that annotates a value with the empty cache, provided it is not already annotated. Evaluation reads a λ-expression as source code. Source variables merely indicate values in the environment. Source abstractions are translated to abstraction values that evaluate their bodies with environments extended by an argument annotated with a cache. Applications are evaluated right-to-left and the left value is applied to the right one as described earlier.

    let mount_cache (v : sem) : sem =
      match v with
      | Cache (_, _) -> v
      | _ -> Cache (ref None, v)

    let rec eval (t : term) (e : env) : sem =
      match t with
      | Var x -> env_lookup x e
      | Lam (x, t') -> to_sem
          (fun v -> eval t' @@ Dict.add x (mount_cache v) e)
      | App (t1, t2) -> let v2 = eval t2 e
                        in from_sem (eval t1 e) v2

The normalization-by-evaluation function is the composition of evaluation in the empty environment and reification.

    let nbe (t : term) : term = reify (eval t Dict.empty)
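For experimentation, the fragments of this section can be assembled into one compilation unit. The sketch below makes two presentational assumptions that are not in the paper: the definitions are reordered so that each one precedes its uses (in particular `abstract_variable` comes before `env_lookup`), and `to_sem` is inlined as the constructor `Abs`. The closing example term and the name `shared` are hypothetical.

```ocaml
type identifier = string
type term =
  | Var of identifier
  | App of term * term
  | Lam of identifier * term

type 'a cache = 'a option ref
let cached_call (c : 'a cache) (t : unit -> 'a) : 'a =
  match !c with
  | Some y -> y
  | None -> let y = t () in c := Some y; y

type sem =
  | Abs of (sem -> sem)
  | Neutral of (unit -> term)
  | Cache of term cache * sem

module Dict = Map.Make (struct type t = identifier let compare = compare end)
type env = sem Dict.t

let gensym : unit -> int =
  let c = ref 0 in
  fun () -> let res = !c in c := res + 1; res

let abstract_variable (x : identifier) : sem =
  let vx = Var x in
  Neutral (fun () -> vx)

let env_lookup (x : identifier) (e : env) : sem =
  match Dict.find_opt x e with
  | Some v -> v
  | None -> abstract_variable (x ^ "_free")

let rec reify : sem -> term = function
  | Abs f ->
    let xm = "x_" ^ string_of_int (gensym ()) in
    Lam (xm, reify (f @@ abstract_variable xm))
  | Neutral l -> l ()
  | Cache (c, v) -> cached_call c (fun () -> reify v)

let rec from_sem : sem -> (sem -> sem) = function
  | Abs f -> f
  | Neutral l -> apply_neutral l
  | Cache (c, Neutral l) -> apply_neutral (fun () -> cached_call c l)
  | Cache (_, v) -> from_sem v
and apply_neutral (l : unit -> term) (v : sem) : sem =
  Neutral (fun () -> let n = reify v in App (l (), n))

let mount_cache (v : sem) : sem =
  match v with
  | Cache (_, _) -> v
  | _ -> Cache (ref None, v)

let rec eval (t : term) (e : env) : sem =
  match t with
  | Var x -> env_lookup x e
  | Lam (x, t') -> Abs (fun v -> eval t' @@ Dict.add x (mount_cache v) e)
  | App (t1, t2) -> let v2 = eval t2 e in from_sem (eval t1 e) v2

let nbe (t : term) : term = reify (eval t Dict.empty)

(* Hypothetical example: λx. (λz. z z) (x x).
   Both copies of (x x) in the normal form are read from one cache cell,
   so they are physically the same heap object. *)
let example =
  Lam ("x", App (Lam ("z", App (Var "z", Var "z")), App (Var "x", Var "x")))

let shared =
  match nbe example with
  | Lam (_, App (l, r)) -> l == r
  | _ -> false
```

The physical equality check `l == r` is what distinguishes this evaluator from the one in Figure 1, which would rebuild the duplicated subterm for each occurrence.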

Using Danvy et al.'s functional correspondence [11], the evaluator constructed in Section 3.3 is transformed to an abstract machine. The most important steps in this transformation are the same as on the path from the evaluator in Figure 1 to its corresponding abstract machine in [13]: closure conversion, transformation to continuation passing style, defunctionalization of continuations to stacks, entanglement of defunctionalized form to an abstract machine. All these transformations are described in the supplementary materials [28]. The machine obtained by derivation is presented in Figures 2 and 3.

Values.

Machine representations of values are representations of weak normal forms that can additionally be annotated with heap locations. These locations are used to cache full normal forms. The grammar presented here is a bit counterintuitive as it allows nesting of locations like $(v^\ell)^{\ell'}$ or applications of closures to other values. The machine maintains invariants guaranteeing that there are no nested locations and that all values are decoded to weak normal forms (in particular, all applications involve inert terms). It is possible to write a more precise grammar here; however, this would lead to many new syntactic categories and to an increase in the number of transitions of the resulting machine.

$$\begin{array}{rcl}
\text{Identifiers} \ni x \\
\text{Terms} \ni t & ::= & x \mid t_1\; t_2 \mid \lambda x.\, t \\
\text{Locations} \ni \ell \\
\text{Values} \ni v & ::= & V(x) \mid v_1\; v_2 \mid [x, t, E] \mid v^\ell \\
\text{Envs} \ni E & <: & \text{Identifiers} \to \text{Values} \\
\text{Frames} \ni F & ::= & [t, E]\; \square \mid \square\; v \mid v\; \square \mid \square\; t \mid \lambda x.\, \square \mid @[\ell] \\
\text{Stacks} \ni S & ::= & \bullet \mid F :: S \\
\text{Term Optionals} \ni t_? & ::= & \bullet \mid [t] \\
\text{Heaps} \ni H & <: & \text{Locations} \to \text{Term Optionals} \\
\text{Counters} \ni m & \in & \mathbb{N} \\
\text{Confs} \ni K & ::= & \langle t, E, S, m, H \rangle_E \mid \langle S, v, m, H \rangle_C \mid \langle S, t, m, H \rangle_S \mid \langle t_?, \ell, S, v, m, H \rangle_M
\end{array}$$

Fig. 2. Syntactic categories used in the environment-based machine
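The syntactic categories of Figure 2 can be transcribed into OCaml type declarations. This is only a sketch under stated assumptions: locations are taken to be integers, environments and heaps are finite maps from the standard library, all constructor names (`VVar`, `Closure`, `Located`, `FCache`, and so on) are hypothetical, and `load` assumes that the counter starts at 0, as `gensym` does in the evaluator.

```ocaml
type identifier = string
type term =
  | Var of identifier
  | App of term * term
  | Lam of identifier * term

(* Assumption: heap locations are represented by integers. *)
type location = int

module Env = Map.Make (String)
module Heap = Map.Make (struct type t = location let compare = compare end)

type value =
  | VVar of identifier                    (* V(x) *)
  | VApp of value * value                 (* v v *)
  | Closure of identifier * term * env    (* [x, t, E] *)
  | Located of value * location           (* a value annotated with a location *)
and env = value Env.t

type frame =
  | FClosureArg of term * env   (* [t, E] □ *)
  | FHoleValue of value         (* □ v *)
  | FValueHole of value         (* v □ *)
  | FHoleTerm of term           (* □ t *)
  | FLam of identifier          (* λx. □ *)
  | FCache of location          (* @[ℓ] *)

(* The empty stack • is [], and F :: S is list cons. *)
type stack = frame list

(* Term optionals: • is None, [t] is Some t. *)
type heap = term option Heap.t

type conf =
  | E of term * env * stack * int * heap
  | C of stack * value * int * heap
  | S of stack * term * int * heap
  | M of term option * location * stack * value * int * heap

(* Loading a term: empty environment, empty stack, counter 0, empty heap. *)
let load (t : term) : conf = E (t, Env.empty, [], 0, Heap.empty)
```

As the surrounding text notes, these types are deliberately permissive: they admit nested `Located` annotations and applications of closures, which the machine excludes by invariants rather than by syntax.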

Environments.

Environments are dictionaries whose keys are identifiers and values are annotated machine values of the form $v^\ell$. They represent assignments of values (weak normal forms) to variables. They can be implemented as association lists (as it is done explicitly in [13]); other, more efficient options are considered in Section 7.4.

The content of the initial environment init corresponds to the result of lookup in the higher-order evaluator when nothing is assigned to any variable. We represent it by the empty collection, but as stated before, the initial environment in our implementation returns an abstract variable $V(x_{free})$ for variable x.

Stack and Frames.

Stacks are machine representations of contexts; technically they are just sequences of frames. The first five frames in the grammar of F are the same as in the KNV machine of [13]. They are used in representations of rrCbV contexts, maintaining the same invariants as KNV; we discuss it in Section 6.2. The last frame, @[ℓ], has a similar meaning to an extra frame in Crégut's KL and KNL of [17]: it is used to cache a computed normal form under location ℓ. Here we use the heap to indicate explicitly which structures of the machine have to be mutable in an implementation.

The stack is the only mechanism responsible for managing the continuation; in particular the machine has no component that could be recognized as a dump, which is present for example in [8].

Counter.

The machine has a counter that is stored in every configuration and is not duplicated. It can be seen as a register. Its role is to generate fresh names for abstract variables (see transition (9) in Figure 3) and it is only incremented. It could also be decremented in rule (18) to maintain de Bruijn level as it is done in KN and KNV, but then the freshness of variables would be less obvious.

Heap.

The heap can be seen as a dictionary whose keys are locations and (optional) values are terms in normal form. As is the case with the counter, the heap appears in every configuration exactly once and hence it can be implemented in the RAM model as mutable memory. Then locations ℓ can be seen as pointers to such memory locations. In the purely functional setting this can be simulated with a dictionary, causing logarithmic overhead. However, even with the use of mutable state, a configuration of the machine can be seen as a persistent data structure. This is because every mutable location is used only to memoize the normal form of a specific value; once the normal form is computed, the stored value never changes. No mutable pointers are stored in the heap. Therefore one can say that every pointer points to a lower level in the pointing hierarchy. Thus no reference cycles are created and garbage collection can rely solely on reference counting.

In terms of [2] the heap can be recognized as a global environment. It is introduced in the evaluator and preserved by the derivation, so it is present in the machine. Moreover, together with the local environment used in the machine, it can be seen as a derived split environment, because every value in the local environment has to be annotated with a location coming from a distinguished set of identifiers pointing to the global environment.

Configurations.

The machine uses four kinds of configurations corresponding to four modes of operation: in E-configurations the machine evaluates some subterm to a weak normal form; in C-configurations it continues with a computed weak normal form and in S-configurations it continues with a computed strong normal form. M-configurations are used to manipulate access to memory.

Transitions.

The transitions of the machine are presented in Figure 3. The first one loads an input term to the initial configuration; similarly the last one unloads the computed strong normal form from the final configuration.

Transitions (1)–(3) are completely standard. In order to evaluate an application of terms, transition (1) calls the evaluation of the argument and pushes the current context, represented by a closure pairing the calling function with the current environment, to the stack. Note that this implements the right-to-left choice of the order of evaluation of arguments. A lambda abstraction in (2) is already a weak normal form, so we simply change the mode of operation to a C-configuration. Transition (3) simply reads a value of a variable from the environment (which always returns a wnf) and changes the mode of operation.

\[
\begin{array}{rcll}
t & \mapsto & \langle t, \mathit{init}, \bullet, 0, \mathit{emp} \rangle_E & \\
\langle t_1\, t_2, E, S, m, H \rangle_E & \to & \langle t_2, E, [t_1, E]\,\Box :: S, m, H \rangle_E & (1) \\
\langle \lambda x.\, t, E, S, m, H \rangle_E & \to & \langle S, [x, t, E], m, H \rangle_C & (2) \\
\langle x, E, S, m, H \rangle_E & \to & \langle S, E(x), m, H \rangle_C & (3) \\
\langle [t, E]\,\Box :: S, v, m, H \rangle_C & \to & \langle t, E, \Box\, v :: S, m, H \rangle_E & (4) \\
\langle \Box\, v^{\ell} :: S, [x, t, E], m, H \rangle_C & \to & \langle t, E[x := v^{\ell}], S, m, H \rangle_E & (5) \\
\langle \Box\, v :: S, [x, t, E], m, H \rangle_C & \to & \langle \Box\, v^{\ell} :: S, [x, t, E], m, H * [\ell \mapsto \bullet] \rangle_C & (6) \\
\langle \Box\, v :: S, [x, t, E]^{\ell}, m, H \rangle_C & \to & \langle \Box\, v :: S, [x, t, E], m, H \rangle_C & (7) \\
\langle \Box\, v :: S, i, m, H \rangle_C & \to & \langle S, i\, v, m, H \rangle_C & (8) \\
\langle S, [x, t, E], m, H \rangle_C & \to & \langle t, E[x := V(x_m)^{\ell}], \lambda x_m.\,\Box :: S, m + 1, H * [\ell \mapsto \bullet] \rangle_E & (9) \\
\langle S, V(x), m, H \rangle_C & \to & \langle S, x, m, H \rangle_S & (10) \\
\langle S, i\, v, m, H \rangle_C & \to & \langle i\,\Box :: S, v, m, H \rangle_C & (11) \\
\langle S, v^{\ell}, m, H \rangle_C & \to & \langle H(\ell), \ell, S, v, m, H \rangle_M & (12) \\
\langle [n], \ell, S, v, m, H \rangle_M & \to & \langle S, n, m, H \rangle_S & (13) \\
\langle \bullet, \ell, S, v, m, H \rangle_M & \to & \langle @[\ell] :: S, v, m, H \rangle_C & (14) \\
\langle @[\ell] :: S, n, m, H \rangle_S & \to & \langle S, n, m, H[\ell := [n]] \rangle_S & (15) \\
\langle v\,\Box :: S, n, m, H \rangle_S & \to & \langle \Box\, n :: S, v, m, H \rangle_C & (16) \\
\langle \Box\, n :: S, a, m, H \rangle_S & \to & \langle S, a\, n, m, H \rangle_S & (17) \\
\langle \lambda x.\,\Box :: S, n, m, H \rangle_S & \to & \langle S, \lambda x.\, n, m, H \rangle_S & (18) \\
\langle \bullet, t, m, H \rangle_S & \mapsto & t &
\end{array}
\]

Fig. 3. Transition rules of the environment-based machine

Configurations of the form $\langle S, v, m, H \rangle_C$ continue with a wnf v in a context represented by stack S, with heap H and the value of the variable counter equal to m. There are two goals in these configurations: the first is to finish the evaluation (to wnfs) of the closures stored on the stack S according to the weak call-by-value strategy; the second is to reduce v to a strong normal form. This is handled by rules (4)–(12), where rules (4)–(8) are responsible for the first goal, and rules (9)–(12) for the second. The choice of the executed transition is done by pattern-matching on v and S; we always choose the first matching transition. This means, in particular, that $\Box\, v^{\ell}$ is a right application of an annotated value in transition (5), but in transition (6) we have a right application of a value that is not annotated.

In rule (4) the stack contains a closure, so we start evaluating this closure and push the already computed wnf to the stack; when this evaluation reaches a wnf, rule (6) or (7) applies. Rule (6) is responsible for an application of a not-annotated abstraction closure to a not-annotated wnf; in this case a new location ℓ is created on the heap and the wnf is annotated with ℓ (later the computed normal form of the wnf may be stored in ℓ). Rule (5) is responsible for an application of a not-annotated lambda abstraction to an annotated wnf and implements β-contraction by evaluating the body of the abstraction in the appropriately extended environment. Rule (7) is responsible for an application of an annotated abstraction; since β-contraction is going to change the normal form of the computed term, we simply remove the information about the cache; this rule is immediately followed either by transition (5), or by transition (6) and then (5). Rule (8) applies when i is an inert term and v is an arbitrary (annotated or not) wnf; in this case we reconstruct the application of this inert term to the wnf popped from the stack (which gives another wnf).

Rules (9)–(12) are applied when there are no more wnfs on the top of the stack; here we pattern-match on the currently processed wnf v. If it is a closure, transition (9) implements the 'going under lambda' step: it pushes the elementary context $\lambda x_m.\,\Box$ to the stack (note that $x_m$ is a fresh variable here), increments the variable counter, creates a new location ℓ on the heap, adds the (annotated) abstract variable $V(x_m)^{\ell}$ to the environment, and starts the evaluation of the body. If v is an abstract variable, we reach a normal form; rule (10) changes the mode of operation to an S-configuration. If v is an application $i\, v'$, rule (11) delays the normalization of i by pushing it to the stack and continues with $v'$; note that this implements the second of our right-to-left choices of the order of reduction. Finally, if v is an annotated value, we change the mode of operation to an M-configuration to check if the normal form has already been computed.

In configuration $\langle t^?, \ell, S, v, m, H \rangle_M$ the heap H contains location ℓ pointing to $t^?$, which is an optional strong normal form of v. If it contains the normal form n then transition (13) immediately returns n and changes the mode of operation to an S-configuration. Otherwise, if it is empty, rule (14) pushes the location ℓ to the stack and calls the normalization of v. When this normalization is finished, rule (15) stores the computed normal form at location ℓ.

Configurations of the form $\langle S, t, m, H \rangle_S$ continue with a (strong) normal form t in a context represented by S. The goal in these configurations is to finish the evaluation of inert terms stored on the stack and to reconstruct the final term. This is handled by transitions (15)–(18); the choice of the transition is done by pattern-matching on the stack. If there is an inert term v on the top of the stack, rule (16) pushes the already computed normal form on the stack and calls normalization of v by switching the mode of operation to a C-configuration. Otherwise there is a previously computed normal form or a $\lambda x.\,\Box$ frame on the top of the stack; in these cases transitions (17) and (18) reconstruct the term accordingly. Finally, when the stack is empty, the machine stops and unloads the final result from a configuration.
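The interplay of transitions (12)–(15) amounts to memoizing normal forms in heap cells. The following is a minimal OCaml sketch of that idea only, not the machine itself: a location is modelled as an `option ref`, the normalizer is passed in as a parameter, and the names `cell`, `annotated`, `annotate` and `force` are illustrative assumptions that do not come from the paper.

```ocaml
(* A heap location: empty until a normal form is stored, then never changed. *)
type 'nf cell = 'nf option ref

(* A value paired with its location (an annotated value). *)
type ('v, 'nf) annotated = { value : 'v; loc : 'nf cell }

(* As in rules (6) and (9): allocate a fresh, empty location for a value. *)
let annotate (v : 'v) : ('v, 'nf) annotated = { value = v; loc = ref None }

(* As in rules (12)-(15): consult the cache and normalize at most once.
   [normalize] is a hypothetical stand-in for the strong normalizer. *)
let force (normalize : 'v -> 'nf) (a : ('v, 'nf) annotated) : 'nf =
  match !(a.loc) with
  | Some n -> n                  (* rule (13): reuse the cached normal form *)
  | None ->                      (* rule (14): normalization has to run *)
    let n = normalize a.value in
    a.loc := Some n;             (* rule (15): memoize the result *)
    n
```

Because a cell is written at most once and only ever moves from empty to filled, sharing an annotated value between several places of a term is safe, which is the persistence property discussed for the heap.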
Remark 2. The machine works on a term representation that allows sharing, as in OCaml or in [16]. In particular, caching of the computed normal forms allows their reuse in the construction of the output term. For example, exponentially big normal forms of the family $e_n = \lambda x.\, c_n\, \omega\, x$ consume only an amount of memory linear in n and are computed in linear time. We assume that unloading of the final result does not involve unfolding it to the unshared term representation (which might require exponential time).

The machine from Section 3.4 uses environments to represent delayed substitutions. In this section we replace environments with delimited substitutions. This variant provides an intermediate step between machine configurations and their decodings, and makes correctness proofs in Section 6 easier to follow.

Accattoli and Barras in [2] present a technique that represents variables as pointers and enables substitution and lookup of a substituted value in constant time. They use it to obtain an efficient version of Milner Abstract Machine. Here we apply this technique to eliminate local environments.

The technique relies on a modified representation of terms, where identifiers in abstractions and variables are replaced by pointers to mutable memory. With this representation, a substitution can be performed in constant time by simply writing to an appropriate memory cell; similarly the lookup of a variable in an environment boils down to a single reading operation. However, by overwriting a variable, the substitution destroys the original term, which, in consequence, cannot be shared. In order to avoid incorrect modification of shared terms, some subterms must be copied. In [8] it is observed that a new copy is needed only when it comes to β-reduction: only bodies of λ-abstractions must be copied and the only identifier that needs to point to a fresh memory cell is the one of the abstraction to be applied. Intuitively, this corresponds to an α-renaming of a λ-abstraction before a β-reduction step. It is called renaming on β.

In the next section we fuse copying a term and overwrite-based substitution into a substitution function working on purely functional term representation. The tricky bit is that the copying function does not copy already substituted variables but shares them instead. To preserve this property, we use substitution delimiters of the form $\langle v \rangle$ informing that there are no occurrences of the substituted variable in subterm v and that v can be shared.

The new machine is presented in Figures 4 and 5. The transitions in Figures 3 and 5 are in one-to-one correspondence and the two machines bisimulate each other.
The style of the presentation of the new machine is more implicit, e.g., the mechanism of fresh name generation and the heap component, though implicitly present, are not spelled out in configurations. Some new notation is used in Figure 5: $x^*$ denotes a fresh variable, $H(\ell)$ is the content of location ℓ on the implicit heap H and $[\ell := \cdot]$ is an update of location ℓ.

The syntax of terms is extended with a substitution delimiter $\langle\,\rangle$ carrying substituted values. It is the only change in term representation; input terms do not have to be compiled before being loaded to the machine because they fall within the extended grammar. Output terms are free of substitution delimiters.

Every configuration of the environment-based machine can be translated into a substitution-based one by executing delayed substitution for free variables and denoting it with substitution delimiters. The main part of this translation, i.e., translation of closures $\overline{\cdot\,,\cdot}$, is given below, where $E \setminus [x \mapsto -]$ denotes the environment E with removed binding for x. This establishes a strong bisimulation between the machines.

\[
\overline{x, E} = \begin{cases} \langle E(x) \rangle & : x \in E \\ x & : x \notin E \end{cases}
\qquad
\overline{t_1\, t_2, E} = \overline{t_1, E}\;\, \overline{t_2, E}
\qquad
\overline{\lambda x.\, t, E} = \lambda x.\, \overline{t, E \setminus [x \mapsto -]}
\]

Fig. 4. Syntactic categories and substitutions used in the substitution-based machine

\[
\begin{array}{rcll}
t & \mapsto & \langle t, \bullet \rangle_E & \\
\langle t_1\, t_2, S \rangle_E & \to & \langle t_2, t_1\,\Box :: S \rangle_E & (1) \\
\langle \lambda x.\, t, S \rangle_E & \to & \langle S, [x, t] \rangle_C & (2) \\
\langle t, S \rangle_E & \to & \langle S, \mathit{strip}(t) \rangle_C & (3) \\
\langle t\,\Box :: S, v \rangle_C & \to & \langle t, \Box\, v :: S \rangle_E & (4) \\
\langle \Box\, v^{\ell} :: S, [x, t] \rangle_C & \to & \langle t[x := \langle v^{\ell} \rangle], S \rangle_E & (5) \\
\langle \Box\, v :: S, [x, t] \rangle_C & \to & \langle \Box\, v^{\ell} :: S, [x, t] \rangle_C & (6) \\
\langle \Box\, v :: S, [x, t]^{\ell} \rangle_C & \to & \langle \Box\, v :: S, [x, t] \rangle_C & (7) \\
\langle \Box\, v_2 :: S, v_1 \rangle_C & \to & \langle S, v_1\, v_2 \rangle_C & (8) \\
\langle S, [x, t] \rangle_C & \to & \langle t[x := \langle V(x^*)^{\ell} \rangle], \lambda x^*.\,\Box :: S \rangle_E & (9) \\
\langle S, V(x) \rangle_C & \to & \langle S, x \rangle_S & (10) \\
\langle S, v_1\, v_2 \rangle_C & \to & \langle v_1\,\Box :: S, v_2 \rangle_C & (11) \\
\langle S, v^{\ell} \rangle_C & \to & \langle H(\ell), \ell, S, v \rangle_M & (12) \\
\langle [t], \ell, S, v \rangle_M & \to & \langle S, t \rangle_S & (13) \\
\langle \bullet, \ell, S, v \rangle_M & \to & \langle @[\ell] :: S, v \rangle_C & (14) \\
\langle @[\ell] :: S, t \rangle_S & \to & \langle S, t \rangle_S \;\; [\ell := [t]] & (15) \\
\langle v\,\Box :: S, t \rangle_S & \to & \langle \Box\, t :: S, v \rangle_C & (16) \\
\langle \Box\, t_2 :: S, t_1 \rangle_S & \to & \langle S, t_1\, t_2 \rangle_S & (17) \\
\langle \lambda x.\,\Box :: S, t \rangle_S & \to & \langle S, \lambda x.\, t \rangle_S & (18) \\
\langle \bullet, t \rangle_S & \mapsto & t &
\end{array}
\]

Fig. 5. Transitions of the substitution-based machine
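The delimited substitution used in transitions (5) and (9) can be illustrated by a small OCaml sketch. It is a simplification under stated assumptions: values carry no location annotations, the value constructors `Clo`, `AVar` and `VApp` are hypothetical names, and capture avoidance is not handled because the machine guarantees that free variables under delimiters are bound by the stack.

```ocaml
type term =
  | Var of string
  | Lam of string * term
  | App of term * term
  | Subs of value                (* substitution delimiter around a value *)
and value =
  | Clo of string * term         (* closure [x, t] (hypothetical name) *)
  | AVar of string               (* abstract variable V(x) (hypothetical name) *)
  | VApp of value * value        (* application of values (hypothetical name) *)

(* subst x s t replaces the free occurrences of x in t by s.
   It copies the spine of t but never descends below a delimiter. *)
let rec subst (x : string) (s : term) (t : term) : term =
  match t with
  | Var y -> if y = x then s else t
  | App (t1, t2) -> App (subst x s t1, subst x s t2)
  | Lam (y, body) -> if y = x then t else Lam (y, subst x s body)
  | Subs _ -> t                  (* shared as is: x cannot occur below *)
```

The `Subs _` case returns the very same node, so previously substituted values are shared rather than copied, and the cost of one substitution is bounded by the size of the abstraction body taken from the source term.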

Using functional correspondence between evaluators and abstract machines it is also possible to derive a higher-order evaluator corresponding to the substitution-based machine. Here we present a fragment of this evaluator. The first pattern matching reveals that the type of terms is extended with the constructor for substitution delimiters.

    let env_lookup : term -> value =
      function
      | Var x -> abstract_variable (x ^ "_free")
      | Subs v -> v
      | _ -> assert false

    let rec eval (t : term) : value =
      match t with
      | Lam (x, t') -> Abs (x, fun v -> eval (subst x (Subs (mount_cache v)) t'))
      | App (t1, t2) -> let v2 = eval t2
                        in from_sem (eval t1) v2
      | x -> env_lookup x

The soundness of the obtained machines with respect to the rrCbV strategy could be argued by reasoning starting from the soundness of KNV [13] and showing its preservation by the subsequent program transformations. Instead, we present a sketch of a direct proof.

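Below we define a decoding of stacks to contexts and of machine terms, values and configurations to source terms.

\[
\begin{array}{rcl}
\llbracket t_1\, t_2 \rrbracket_t & = & \llbracket t_1 \rrbracket_t\, \llbracket t_2 \rrbracket_t \\
\llbracket \lambda x.\, t \rrbracket_t & = & \lambda x.\, \llbracket t \rrbracket_t \\
\llbracket x \rrbracket_t & = & x \\
\llbracket \langle v \rangle \rrbracket_t & = & \llbracket v \rrbracket_v \\[4pt]
\llbracket v_1\, v_2 \rrbracket_v & = & \llbracket v_1 \rrbracket_v\, \llbracket v_2 \rrbracket_v \\
\llbracket [x, t] \rrbracket_v & = & \lambda x.\, \llbracket t \rrbracket_t \\
\llbracket V(x) \rrbracket_v & = & x \\
\llbracket v^{\ell} \rrbracket_v & = & \llbracket v \rrbracket_v \\[4pt]
\llbracket \bullet \rrbracket_S & = & \Box \\
\llbracket t\,\Box :: S \rrbracket_S & = & \llbracket S \rrbracket_S[\llbracket t \rrbracket_t\, \Box] \\
\llbracket \Box\, v :: S \rrbracket_S & = & \llbracket S \rrbracket_S[\Box\, \llbracket v \rrbracket_v] \\
\llbracket v\,\Box :: S \rrbracket_S & = & \llbracket S \rrbracket_S[\llbracket v \rrbracket_v\, \Box] \\
\llbracket \Box\, t :: S \rrbracket_S & = & \llbracket S \rrbracket_S[\Box\, \llbracket t \rrbracket_t] \\
\llbracket \lambda x.\,\Box :: S \rrbracket_S & = & \llbracket S \rrbracket_S[\lambda x.\,\Box] \\
\llbracket @[\ell] :: S \rrbracket_S & = & \llbracket S \rrbracket_S \\[4pt]
\llbracket \langle t, S \rangle_E \rrbracket_K & = & \llbracket S \rrbracket_S[\llbracket t \rrbracket_t] \\
\llbracket \langle S, v \rangle_C \rrbracket_K & = & \llbracket S \rrbracket_S[\llbracket v \rrbracket_v] \\
\llbracket \langle t^?, \ell, S, v \rangle_M \rrbracket_K & = & \llbracket S \rrbracket_S[\llbracket v \rrbracket_v] \\
\llbracket \langle S, t \rangle_S \rrbracket_K & = & \llbracket S \rrbracket_S[\llbracket t \rrbracket_t]
\end{array}
\]

We state more precise shape invariants to assert that evaluation contexts and values follow the rules of the rrCbV strategy. Such invariants can be derived from evaluators explicitly expressing their own invariants. In the grammars below numerical subscripts will also discriminate grammar symbols. The syntactic categories n and a of normal forms and neutral terms are those defined in Section 2.2.

\[
\begin{array}{rclcrcl}
v_w & ::= & [x, t] \mid v_i \mid v_w^{\ell} & \qquad & n^? & ::= & \bullet \mid [n] \\
v_i & ::= & V(x) \mid v_i\, v_w \mid v_i^{\ell} & \qquad & a^? & ::= & \bullet \mid [a]
\end{array}
\]

S ::= t □ :: S | □ v_w :: S | S
S ::= @[ℓ] :: S | □ n :: S | S
S ::= @[ℓ] :: S | v_i □ :: S | λx.□ :: S | •

Lemma 1. All reachable configurations are well-formed, i.e., are in forms: ⟨t, S⟩_E, ⟨S, v_w⟩_C, ⟨S, v_i⟩_C, ⟨n?, ℓ, S, v_w⟩_M, ⟨a?, ℓ, S, v_i⟩_M, ⟨S, a⟩_S, ⟨S, n⟩_S.
Proof (idea). The initial configuration is well-formed and well-formedness is preserved by all transitions.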

Corollary 1. If S is a reachable stack of the machine then context $\llbracket S \rrbracket_S$ is a rrCbV context.
Proof (sketch). In [13] it is shown that all rrCbV contexts are generated by the (outside-in) grammar of contexts from Section 2.2, with the starting symbol R. It is also shown that this grammar is equivalent to the following (inside-out) grammar with the starting symbol S:

S ::= S[t □] | S[□ w] | S
S ::= S[□ n] | S
S ::= S[i □] | S[λx.□] | □

Decodings of well-formed stacks follow this grammar.

Corollary 2. If v is a reachable value of the machine then term $\llbracket v \rrbracket_v$ is a weak normal form.

The second corollary states one of the invariants mentioned in Subsection 3.4. To capture the second fact that annotations ℓ cannot be stacked, a more precise shape invariant could be established. This, however, would require more grammar symbols and well-formed configurations.

To omit some technical details we focus on the machine soundness for closed input terms. It is sufficient because open terms can be closed by abstractions before processing.

Lemma 2. If $K \xrightarrow{(\iota)} K'$, $\iota \notin \{5, 9, 13\}$ and term $\llbracket K \rrbracket_K$ is closed then $\llbracket K \rrbracket_K = \llbracket K' \rrbracket_K$.
Proof. By case analysis on transition rules.

Lemma 3. If K is a reachable configuration, term $\llbracket K \rrbracket_K$ is closed and $K \xrightarrow{(5)} K'$ then $\llbracket K \rrbracket_K \overset{R}{\to}_{\beta_w} \llbracket K' \rrbracket_K$.
Proof (sketch). Transitions maintain the invariant that all free variables of terms under delimiters are bound by the stack. Hence they are not captured during substitution, and substitution can be delimited: the substituted variable does not occur under a delimiter and β-contraction is simulated properly. From Corollaries 1 and 2 it follows that the evaluation context is an R-context and the substituted value decodes to a weak normal form.

Lemma 4. If K is a reachable configuration, term $\llbracket K \rrbracket_K$ is closed and $K \xrightarrow{(9)} K'$ then $\llbracket K \rrbracket_K \to_{\alpha} \llbracket K' \rrbracket_K$.
Proof (sketch). Thanks to the fresh variable, α-contraction is simulated correctly. As in Lemma 3, the substituted variable (here x) does not occur under delimiters and the free variable $x^*$ is bound by the stack.

Lemma 5 (bypass). If K is a reachable configuration, term $\llbracket K \rrbracket_K$ is closed and $K \xrightarrow{(13)} K'$ then $\llbracket K \rrbracket_K \overset{R}{\twoheadrightarrow}_{\beta_w} =_{\alpha} \llbracket K' \rrbracket_K$.
Proof (idea). A normal form can be memoized only by popping an @[ℓ] frame in transition (15). After the frame has been pushed on the stack by transition (14), the only way to pop it is to compute the full normal form of the given weak normal form. Thus, if the machine had used transition (14) instead of (13), it would have maintained the shape invariant, the evaluation context would still be an R-context, and the computed full normal form would be α-equivalent. By standard properties of α-conversion, its uses in transition (9) can be postponed.

Proposition 1. If K is a reachable configuration, term $\llbracket K \rrbracket_K$ is closed and $K \twoheadrightarrow K'$ then $\llbracket K \rrbracket_K \overset{R}{\twoheadrightarrow}_{\beta_w} =_{\alpha} \llbracket K' \rrbracket_K$.
Proof. This is an immediate consequence of Lemmas 2–5.

Theorem 1 (soundness). If the machine starting from $t_0$ computes $t_1$ (i.e., $\langle t_0, \bullet \rangle_E \twoheadrightarrow \langle \bullet, t_1 \rangle_S$), then $t_0$ reduces in many steps to a normal form $t_1$ (i.e., there exists $t'$ such that $t_0 \overset{R}{\twoheadrightarrow}_{\beta_w} t'$ and $t' \overset{R}{\not\to}_{\beta_w}$ and $t_1 =_{\alpha} t'$).
Proof. By Lemma 1 the terminal configuration decodes to a term in normal form, so $t_1$ is in normal form. By Proposition 1, $t_0$ reduces to $t_1$.

One of the motivations for constructing an efficient machine for strong CbV comes from the conversion problem, which asks if two given terms are β-convertible. In general the problem is undecidable, but it has important applications in proof assistants and thus it is desirable to find efficient partial solutions.

If both input terms normalize in the strong CbV strategy, then one straightforward solution is to normalize input terms by the abstract machine and then to check if they are α-equivalent. Normalization yields a shared representation of terms, avoiding possibly exponential size of the normal forms in explicit representation. As Condoluci et al. state in [16], the α-equivalence of shared terms can be checked in time linear w.r.t. the size of shared representations. Therefore the convertibility check can be done in time proportional to the time of normalization of both terms. In the following we analyse the cost of normalization.

In [13] it is shown that, using a technique called streaming of terms, the convertibility check can be short-circuited if partial results of normalization differ. We do not consider it here as fusion of term streaming and shared equality goes beyond the scope of this paper.

Probably the simplest approach to the complexity analysis of an abstract machine is to find a (constant) upper bound on the number of consecutive administrative steps of the machine.
Then the overall complexity of an execution is this bound times the number of β-transitions times the cost of a single transition.

In this section we estimate the global number of transitions executed by the machine on a given term. Unfortunately, the simple approach outlined above does not work here, as the following example shows. Let $c_n$ be the n-th Church numeral, $\mathit{dub} := \lambda x.\, \lambda p.\, p\, x\, x$ and I the identity. The execution of $c_n\, \mathit{dub}\, I$ starts with two β-reductions substituting dub and I. Then the next n reductions result in a value $r_n$, where $r_0 = [x, x]$ and $r_{n+1} = [p,\; p\, \langle r_n^{\ell} \rangle\, \langle r_n^{\ell} \rangle]$ for some ℓs. Value $r_n$ decodes to a normal form, but it takes the machine $\Omega(n)$ administrative steps to construct this normal form. Thus, sequences of administrative transitions are not bounded by any constant (see also Fig. 6).

To overcome the problem of long sequences of administrative steps we perform a kind of amortized analysis of the execution length. Following [25], we define a potential function $\Phi_K$ of configurations. Then, for most of the transitions (more precisely, for all but $\xrightarrow{(7)}$) the cost of the transition is covered by the change in the potential of the involved configurations (cf. Lemma 8). Transition $\xrightarrow{(7)}$ is a preparatory step for a β-reduction and its cost is covered by the involved β-reduction (cf. Lemma 10).

The potential of a configuration depends on the potentials of its components: term, value, stack and the implicit heap. The potential functions $\Phi_t$, $\Phi_v$ and $\Phi_S$ are defined as follows.

\[
\begin{array}{rcl}
\Phi_t(t_1\, t_2) & = & 6 + \Phi_t(t_1) + \Phi_t(t_2) \\
\Phi_t(\lambda x.\, t) & = & 4 + \Phi_t(t) \\
\Phi_t(x) & = & 4 \\
\Phi_t(\langle v \rangle) & = & 4 \\[4pt]
\Phi_v(v_1\, v_2) & = & 3 + \Phi_v(v_1) + \Phi_v(v_2) \\
\Phi_v([x, t]) & = & 3 + \Phi_t(t) \\
\Phi_v(V(x)) & = & 1 \\
\Phi_v(v^{\ell}) & = & 3 \\[4pt]
\Phi_S(\bullet) & = & 0 \\
\Phi_S(t\,\Box :: S) & = & 5 + \Phi_S(S) + \Phi_t(t) \\
\Phi_S(\Box\, v :: S) & = & 4 + \Phi_S(S) + \Phi_v(v) \\
\Phi_S(v\,\Box :: S) & = & 2 + \Phi_S(S) + \Phi_v(v) \\
\Phi_S(\Box\, t :: S) & = & 1 + \Phi_S(S) \\
\Phi_S(\lambda x.\,\Box :: S) & = & 1 + \Phi_S(S) \\
\Phi_S(@[\ell] :: S) & = & 1 + \Phi_S(S)
\end{array}
\]

Intuitively, these potential functions indicate for how many machine steps a given construct is responsible. For example, $\Phi_t(t_1\, t_2)$ says that if an application $t_1\, t_2$ appears somewhere in a configuration, it may generate 6 transitions of the machine plus the work generated by the two subterms. The crucial observation is that when a normal form of a value v is known and stored under location ℓ, then the normalization of v involves only a constant number of steps (more precisely, 2) and it does not involve recomputation of v; thus the amount of work generated by $v^{\ell}$ is bounded by 3 (one transition involves memoizing the normal form), and $\Phi_v(v)$ is not added to $\Phi_v(v^{\ell})$.

The potential function $\Phi_H$ estimates the cost of maintaining the heap (which is implicitly present in each configuration). It takes into account all values that have their place on the heap (expressed by the condition $v^{\ell} \in K$ below, meaning that $v^{\ell}$ appears somewhere in the current configuration), but are not yet normalized (expressed by $H(\ell) = \bullet$) and are currently not being evaluated (expressed by $@[\ell] \notin S$, meaning that @[ℓ] does not appear in the stack and thus the evaluation of v has not yet started). It also takes into account the moment when the evaluation of $v^{\ell}$ starts, i.e., when the current configuration is of the form $\langle \bullet, \ell, S, v \rangle_M$. Formally, $\Phi_H$ is defined as follows:

\[
\Phi_H(K) \;=\; \sum_{\substack{(\ell, v) \text{ s.t. } K = \langle \bullet, \ell, S, v \rangle_M \;\lor \\ (v^{\ell} \in K \,\land\, H(\ell) = \bullet \,\land\, @[\ell] \notin S)}} \Phi_v(v)
\]

Now we define the potential function $\Phi_K$ for configurations. We use the Iverson bracket in the clause for C-configurations ($[\varphi] = 1$ if $\varphi$ holds and $[\varphi] = 0$ otherwise) to denote advancement of transition (6):

\[
\begin{array}{rcl}
\Phi_K(\langle t, S \rangle_E) & = & \Phi_t(t) + \Phi_S(S) + \Phi_H(K) \\
\Phi_K(\langle S, v \rangle_C) & = & \Phi_v(v) + \Phi_S(S) + \Phi_H(K) - 9 \cdot [\langle S, v \rangle_C \xrightarrow{(5)}] \\
\Phi_K(\langle t^?, \ell, S, v \rangle_M) & = & 2 + \Phi_S(S) + \Phi_H(K) \\
\Phi_K(\langle S, t \rangle_S) & = & \Phi_S(S) + \Phi_H(K)
\end{array}
\]

Lemma 6. If a substitution delimiter $\langle v \rangle$ occurs somewhere in a reachable configuration then it is of the form $\langle v^{\ell} \rangle$.
Proof. The only transitions introducing substitution delimiters are (5) and (9), which ensure location annotation.
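The constants in the definitions of $\Phi_t$ and $\Phi_v$ can be transcribed directly into code. The OCaml sketch below is only an illustration: the datatype and the constructor names `Clo`, `AVar`, `VApp` and `Ann` are assumptions, locations are modelled as integers, and the stack and heap potentials are left out.

```ocaml
type term =
  | Var of string
  | Lam of string * term
  | App of term * term
  | Subs of value                (* substitution delimiter *)
and value =
  | Clo of string * term         (* closure [x, t] *)
  | AVar of string               (* abstract variable V(x) *)
  | VApp of value * value        (* application of values *)
  | Ann of value * int           (* value annotated with a location *)

(* Potential of terms: a delimiter weighs as much as a variable. *)
let rec phi_t = function
  | App (t1, t2) -> 6 + phi_t t1 + phi_t t2
  | Lam (_, t) -> 4 + phi_t t
  | Var _ -> 4
  | Subs _ -> 4

(* Potential of values: an annotated value costs 3, independently of its body. *)
let rec phi_v = function
  | VApp (v1, v2) -> 3 + phi_v v1 + phi_v v2
  | Clo (_, t) -> 3 + phi_t t
  | AVar _ -> 1
  | Ann (_, _) -> 3
```

For instance, the term λx. x x has potential 4 + 6 + 4 + 4 = 18, while the closure [x, x x] has potential 3 + 14 = 17. Since `Var` and `Subs` carry the same weight, replacing a variable by a delimited value leaves `phi_t` unchanged.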

Lemma 7. For any t, x, v we have $\Phi_t(t) = \Phi_t(t[x := \langle v \rangle])$.
Proof. The only constructors that can be replaced by a substitution are variables and $\Phi_t(x) = \Phi_t(\langle v \rangle)$.

Lemma 8 (decrease). If K is a reachable configuration and $K \xrightarrow{(\neq 7)} K'$ then $\Phi_K(K) > \Phi_K(K')$.
Proof. The proof goes by case analysis on machine transitions.

(1)
\[
\begin{aligned}
\Phi_K(\langle t_1\, t_2, S \rangle_E) &= 6 + \Phi_t(t_1) + \Phi_t(t_2) + \Phi_S(S) + \Phi_H(K) \\
&> 5 + \Phi_t(t_1) + \Phi_t(t_2) + \Phi_S(S) + \Phi_H(K) \\
&= \Phi_K(\langle t_2, t_1\,\Box :: S \rangle_E)
\end{aligned}
\]

(2)
\[
\begin{aligned}
\Phi_K(\langle \lambda x.\, t, S \rangle_E) &= 4 + \Phi_t(t) + \Phi_S(S) + \Phi_H(K) \\
&> 3 + \Phi_t(t) + \Phi_S(S) + \Phi_H(K) - 9 \cdot [\langle S, [x, t] \rangle_C \xrightarrow{(5)}] \\
&= \Phi_K(\langle S, [x, t] \rangle_C)
\end{aligned}
\]

(3) Here, by Lemma 6, term t is of the form either x or $\langle v^{\ell} \rangle$, so $\mathit{strip}(t)$ is either $V(x)$ or $v^{\ell}$ and thus
\[
\begin{aligned}
\Phi_K(\langle t, S \rangle_E) &= 4 + \Phi_S(S) + \Phi_H(K) \\
&> 3 + \Phi_S(S) + \Phi_H(K) \\
&\geq \Phi_v(\mathit{strip}(t)) + \Phi_S(S) + \Phi_H(K) - 9 \cdot [\langle S, \mathit{strip}(t) \rangle_C \xrightarrow{(5)}] \\
&= \Phi_K(\langle S, \mathit{strip}(t) \rangle_C)
\end{aligned}
\]

(4)
\[
\begin{aligned}
\Phi_K(\langle t\,\Box :: S, v \rangle_C) &= \Phi_v(v) + 5 + \Phi_S(S) + \Phi_t(t) + \Phi_H(K) - 0 \\
&> \Phi_t(t) + 4 + \Phi_S(S) + \Phi_v(v) + \Phi_H(K) \\
&= \Phi_K(\langle t, \Box\, v :: S \rangle_E)
\end{aligned}
\]

(5)
\[
\begin{aligned}
\Phi_K(\langle \Box\, v^{\ell} :: S, [x, t] \rangle_C) &= \Phi_v([x, t]) + \Phi_S(\Box\, v^{\ell} :: S) + \Phi_H(K) - 9 \\
&= 3 + \Phi_t(t) + 4 + \Phi_S(S) + 3 + \Phi_H(K) - 9 \\
&= \Phi_t(t) + \Phi_S(S) + \Phi_H(K) + 1 \\
&> \Phi_t(t[x := \langle v^{\ell} \rangle]) + \Phi_S(S) + \Phi_H(K) \qquad \text{(Lemma 7)} \\
&= \Phi_K(\langle t[x := \langle v^{\ell} \rangle], S \rangle_E)
\end{aligned}
\]

(6)
\[
\begin{aligned}
\Phi_K(\langle \Box\, v :: S, [x, t] \rangle_C) &= \Phi_v([x, t]) + \Phi_S(\Box\, v :: S) + \Phi_H(K) - 0 \\
&= \Phi_v([x, t]) + 4 + \Phi_S(S) + \Phi_v(v) + \Phi_H(K) - 0 \\
&> \Phi_v([x, t]) + 4 + \Phi_S(S) + 3 + \Phi_v(v) + \Phi_H(K) - 9 \\
&= \Phi_v([x, t]) + 4 + \Phi_S(S) + \Phi_v(v^{\ell}) + \Phi_H(K') - 9 \\
&= \Phi_K(\langle \Box\, v^{\ell} :: S, [x, t] \rangle_C)
\end{aligned}
\]

(8) This is an easy case: 4 occurring in $\Phi_S(\Box\, v :: S)$ is greater than 3 occurring in $\Phi_v(v_1\, v_2)$.

(9)
\[
\begin{aligned}
\Phi_K(\langle S, [x, t] \rangle_C) &= \Phi_v([x, t]) + \Phi_S(S) + \Phi_H(K) - 0 \\
&= 3 + \Phi_t(t) + \Phi_S(S) + \Phi_H(K) \\
&> \Phi_t(t) + 1 + \Phi_S(S) + \Phi_v(V(x^*)) + \Phi_H(K) \\
&= \Phi_t(t[x := \langle V(x^*)^{\ell} \rangle]) + \Phi_S(\lambda x^*.\,\Box :: S) + \Phi_H(K') \qquad \text{(Lemma 7)} \\
&= \Phi_K(\langle t[x := \langle V(x^*)^{\ell} \rangle], \lambda x^*.\,\Box :: S \rangle_E)
\end{aligned}
\]

(10) This is an easy case.

(11) This is an easy case.

(12)
\[
\begin{aligned}
\Phi_K(\langle S, v^{\ell} \rangle_C) &= 3 + \Phi_S(S) + \Phi_H(K) \\
&> 2 + \Phi_S(S) + \Phi_H(K) \\
&= \Phi_K(\langle H(\ell), \ell, S, v \rangle_M)
\end{aligned}
\]

(13)
\[
\begin{aligned}
\Phi_K(\langle [t], \ell, S, v \rangle_M) &= 2 + \Phi_S(S) + \Phi_H(K) \\
&> \Phi_S(S) + \Phi_H(K) \\
&= \Phi_K(\langle S, t \rangle_S)
\end{aligned}
\]

(14)
\[
\begin{aligned}
\Phi_K(\langle \bullet, \ell, S, v \rangle_M) &= 2 + \Phi_S(S) + \Phi_H(K) \\
&> 1 + \Phi_S(S) + \Phi_H(K) \\
&= \Phi_v(v) + 1 + \Phi_S(S) + \Phi_H(K') \\
&= \Phi_K(\langle @[\ell] :: S, v \rangle_C)
\end{aligned}
\]

(15) The pair $(\ell, v)$ is not counted before the transition because @[ℓ] is on the stack, and it is not counted after the transition because $H(\ell) \neq \bullet$.
\[
\begin{aligned}
\Phi_K(\langle @[\ell] :: S, t \rangle_S) &= 1 + \Phi_S(S) + \Phi_H(K) \\
&> \Phi_S(S) + \Phi_H(K) \\
&= \Phi_K(\langle S, t \rangle_S)
\end{aligned}
\]

(16) This is an easy case: 2 occurring in $\Phi_S(v\,\Box :: S)$ is greater than 1 occurring in $\Phi_S(\Box\, t :: S)$. The value counted on the stack before the transition is counted in the configuration after the transition.

(17) This is an easy case.

(18) This is an easy case.

Lemma 9 (subterm). If $[x, t]$ is a reachable value of the machine starting from term $t_0$ then $\Phi_v([x, t]) < \Phi_t(t_0)$.
Proof. Both machines never perform reductions in bodies of abstractions which will be invoked later. In the environment-based machine, for all values $[x, t, E]$ the terms $\lambda x.\, t$ are always subterms of $t_0$. By the translation, in the substitution-based machine these abstractions may be modified only by replacing source variables with values under substitution delimiters, and by Lemma 7 this does not change the potential. Function $\Phi_t$ assigns 4 to the abstraction constructor, which is greater than 3 assigned by $\Phi_v$.

Lemma 10 (increase). If K is a configuration reachable from $t_0$ and $K \xrightarrow{(7)} K'$ then $\Phi_K(K') - \Phi_K(K) < \Phi_t(t_0)$.
Proof.
\[
\begin{aligned}
\Phi_K(\langle \Box\, v :: S, [x, t]^{\ell} \rangle_C) + \Phi_t(t_0) &= \Phi_v([x, t]^{\ell}) + \Phi_S(\Box\, v :: S) + \Phi_H(K) - 0 + \Phi_t(t_0) \\
&= \Phi_t(t_0) + 3 + \Phi_S(\Box\, v :: S) + \Phi_H(K) - 0 \\
&> \Phi_v([x, t]) + \Phi_S(\Box\, v :: S) + \Phi_H(K') - 9 \cdot [\langle \Box\, v :: S, [x, t] \rangle_C \xrightarrow{(5)}] \qquad \text{(Lemma 9)} \\
&= \Phi_K(\langle \Box\, v :: S, [x, t] \rangle_C)
\end{aligned}
\]

Example changes of potential are depicted in Figures 6 and 7 using Matplotlib [22].

Fig. 6. Plot of potential for execution of c dub I performing 217 transitions of which 8 are β-transitions (5)

Lemma 11. Let ρ be a sequence of consecutive machine transitions starting from term t, |ρ| be the number of steps in ρ and $|\rho|_{(7)}$ be the number of steps (7) in ρ. Then $|\rho| \leq (|\rho|_{(7)} + 1) \cdot \Phi_t(t)$.
Proof. This is an immediate consequence of Lemmas 8 and 10.
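The counting behind this bound can be spelled out as follows. This is a sketch under two observations: potentials are integers, and $\Phi_K$ is non-negative on reachable configurations (in the only clause with a negative summand, case (5) of Lemma 8 shows that the potential is still positive). The symbols p, q, $K_0$ and $K_{|\rho|}$ are introduced only for this derivation: p is the number of steps of ρ other than (7), $q = |\rho|_{(7)}$, and $K_0 = \langle t, \bullet \rangle_E$ is the initial configuration, for which $\Phi_K(K_0) = \Phi_t(t)$.

\[
\begin{aligned}
0 \;\leq\; \Phi_K(K_{|\rho|}) \;&\leq\; \Phi_K(K_0) \;-\; p \cdot 1 \;+\; q \cdot (\Phi_t(t) - 1)
&& \text{(Lemmas 8 and 10)} \\
&=\; (q + 1) \cdot \Phi_t(t) \;-\; (p + q), \\[4pt]
\text{hence}\quad |\rho| \;=\; p + q \;&\leq\; (|\rho|_{(7)} + 1) \cdot \Phi_t(t).
\end{aligned}
\]

Each step other than (7) lowers the potential by at least 1 because the decrease in Lemma 8 is strict, and each step (7) raises it by at most $\Phi_t(t) - 1$ because the increase in Lemma 10 is strictly below $\Phi_t(t)$.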

Fig. 7. Plot of potential for execution of c c I performing 817 transitions of which 134 are β-transitions (5)

The upper bound on a length of machine trace from Lemma 11 leads to the completeness of the machine.

Proposition 2. If K is a reachable configuration, term $\llbracket K \rrbracket_K$ is closed and $\llbracket K \rrbracket_K \overset{R}{\to}_{\beta_w} t'$, then there exists $K'$ such that $K \twoheadrightarrow K'$ and $t' \overset{R}{\twoheadrightarrow}_{\beta_w} =_{\alpha} \llbracket K' \rrbracket_K$.
Proof. By Lemma 1 and inspection of all transitions, configurations that decode to a term not in normal form are non-terminal. Since the decoding of the given configuration β-reduces, by Proposition 1 the machine makes steps until (5) or (13) is performed. By Lemma 8 the number of consecutive transitions other than (7) is bounded by the potential. Since transition (7) must be followed by (5) in one or two steps, (5) or (13) must be reached eventually. In both cases Proposition 1 guarantees that, up to α-conversion, the same term is reached or bypassed.

Theorem 2 (completeness). If $t_0$ reduces in many steps to a normal form $t_1$ (i.e., $t_0 \overset{R}{\twoheadrightarrow}_{\beta_w} t_1$ and $t_1 \overset{R}{\not\to}_{\beta_w}$), then the machine starting from $t_0$ computes $t_1$ (i.e., there exists $t'$ such that $\langle t_0, \bullet \rangle_E \twoheadrightarrow \langle \bullet, t' \rangle_S$ and $t_1 =_{\alpha} t'$).
Proof. By Lemma 11 the machine reaches a terminal configuration in a finite number of steps. By Lemma 1 the terminal configuration decodes to a term in normal form and by Proposition 2 this normal form is α-equivalent to $t_1$.

The environment-based machine performs insert (in rules (5) and (9)) and lookup (in rule (3)) operations on environments. When environments are implemented as lists, then the cost of lookup is proportional to the size of the list.

Every environment is paired with a subterm of the initial term constituting a closure. The machine maintains an invariant that the size of any environment is equal to the number of lambda abstractions under which the paired subterm is located in the original term.

Accattoli and Barras in [2] outline two realizations of local environments which improve the asymptotic cost of operations. One of them uses balanced trees, which are a natural choice for named representation of lambda expressions. The other uses random-access lists; this requires precomputing de Bruijn indices of variables in the original term. Both these data structures are persistent, and both improve the cost of lookup to logarithmic in the size of the initial term.

We also note that if identifiers are strings of symbols from a finite alphabet (as binary numbers are strings of bits) then tries can be applied as dictionaries, making lookup and update costs proportional to the length of the involved identifier.
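The balanced-tree realization can be illustrated with the OCaml standard library, whose `Map` module is implemented with balanced binary trees and is persistent. This is a hedged sketch rather than the paper's implementation: environment values are plain integers here, and the names `Env`, `extend` and `lookup` are illustrative.

```ocaml
(* Environments keyed by variable names, backed by balanced binary trees. *)
module Env = Map.Make (String)

(* Insert, as needed by rules (5) and (9): logarithmic, returns a new map. *)
let extend x v env = Env.add x v env

(* Lookup, as needed by rule (3): logarithmic in the number of bindings. *)
let lookup x env = Env.find_opt x env

let e0 = Env.empty
let e1 = extend "x" 1 e0
let e2 = extend "x" 2 e1   (* shadows x in e2; e1 is left untouched *)
```

Persistence matters because closures created earlier keep referring to their own environment (`e1` above) after a later extension produces `e2`.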

We treat arithmetic operations and operations on identifiers as realizable in constant time. In fact, they involve extra logarithmic cost but it does not affect the polynomial complexity. As described in Subsection 3.4, heap operations can be implemented within constant time.

Let $n$ be the number of β-reductions performed in a derivation from term $t$. It can be simulated on both machines with $O((1+n)\cdot|t|)$ transitions.

The environment-based machine has three transitions ((3), (5) and (9)) whose cost is related to the environment; all the other transitions have constant cost. Therefore the overall cost of the execution is $O((1+n)\cdot|t|\cdot E(|t|))$, where $E(|t|)$ is the cost of an operation on an environment of size $|t|$. This cost can be considered logarithmic because an environment maps variables occurring in the input term to values, and the number of such variables is bounded by $|t|$.

In the substitution-based machine, transitions (5) and (9) have cost proportional to $|t|$, while the cost of all the other transitions is constant. Therefore the overall cost of the execution is $O((1+n)\cdot|t|^2)$. However, in the weak case, i.e., before going under λ with transition (9), the machine performs $O((1+n)\cdot|t|)$ transitions with unitary cost and $O(n)$ transitions (5) of cost $O(|t|)$. Therefore the cost of the weak part of the execution is bilinear, i.e., $O((1+n)\cdot|t|)$.

Accattoli and Dal Lago, in [5,18], present a polynomial simulation of Turing Machines relevant for any strategy reducing weak redexes first, which is also true for rrCbV. We have shown a polynomial simulation of the rrCbV strategy. Therefore reasonable machines and the rrCbV strategy can simulate each other with a polynomial overhead, making the latter also a reasonable machine for time. Call-by-value strategies as described in the preliminaries perform the same number of β-reductions to achieve normal form, so this result generalizes to all of them:

Theorem 3. The number of steps performed by a strong call-by-value strategy is a reasonable measure for time.
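The cost bounds behind this result all come from multiplying the number of transitions by the worst-case cost of a single transition. The following sketch spells out that arithmetic; the names $T_{\mathrm{env}}$, $T_{\mathrm{subst}}$ and $T_{\mathrm{weak}}$ are introduced here only for illustration, and $E(|t|) = O(\log|t|)$ is assumed, as with the balanced-tree or random-access-list environments.

```latex
\begin{align*}
T_{\mathrm{env}}   &= O\big((1+n)\cdot|t|\big)\cdot E(|t|)
                    = O\big((1+n)\cdot|t|\cdot\log|t|\big),\\
T_{\mathrm{subst}} &= O\big((1+n)\cdot|t|\big)\cdot O(|t|)
                    = O\big((1+n)\cdot|t|^{2}\big),\\
T_{\mathrm{weak}}  &= O\big((1+n)\cdot|t|\big)\cdot O(1) + O(n)\cdot O(|t|)
                    = O\big((1+n)\cdot|t|\big).
\end{align*}
```

All three bounds are polynomial in $n$ and $|t|$, which is what the theorem requires.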

We presented an abstract machine that realizes the strong CbV strategy (in its right-to-left variant) and we proved its reasonability that makes it a sufficiently good implementation model. The machine uses a form of memoization to store computed normal forms and reuse them when needed. A derivation from an evaluator using memothunks also in weak normalization would lead to a machine performing some kind of strong call-by-need strategy but its study and comparison with other call-by-need machines are beyond the scope of this paper.

References

1. Accattoli, B.: A fresh look at the lambda-calculus (invited talk). In: 4th International Conference on Formal Structures for Computation and Deduction, FSCD 2019. LIPIcs, vol. 131, pp. 1:1–1:20 (2019)
2. Accattoli, B., Barras, B.: Environments and the complexity of abstract machines. In: Proceedings of the 19th International Symposium on Principles and Practice of Declarative Programming (PPDP’17). pp. 4–16 (2017)
3. Accattoli, B., Coen, C.S.: On the relative usefulness of fireballs. In: 30th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2015. pp. 141–155 (2015)
4. Accattoli, B., Condoluci, A., Guerrieri, G., Coen, C.S.: Crumbling abstract machines. In: Proceedings of the 21st International Symposium on Principles and Practice of Programming Languages, PPDP 2019. pp. 4:1–4:15 (2019)
5. Accattoli, B., Dal Lago, U.: On the invariance of the unitary cost model for head reduction. In: 23rd International Conference on Rewriting Techniques and Applications, RTA 2012. LIPIcs, vol. 15, pp. 22–37 (2012)
6. Accattoli, B., Lago, U.D.: (Leftmost-outermost) beta reduction is invariant, indeed. Logical Methods in Computer Science (2016)
7. Accattoli, B., Guerrieri, G.: Open call-by-value. In: 14th Asian Symposium, APLAS 2016, Proceedings. LNCS, vol. 10017, pp. 206–226 (2016)
8. Accattoli, B., Guerrieri, G.: Abstract machines for open call-by-value. Sci. Comput. Program. (2019)
9. Aehlig, K., Joachimski, F.: Operational aspects of untyped normalization by evaluation. Mathematical Structures in Computer Science, 587–611 (2004)
10. Ager, M.S., Biernacki, D., Danvy, O., Midtgaard, J.: From interpreter to compiler and virtual machine: a functional derivation. Tech. Rep. BRICS RS-03-14, DAIMI, Aarhus University, Aarhus, Denmark (Mar 2003)
11. Ager, M.S., Biernacki, D., Danvy, O., Midtgaard, J.: A functional correspondence between evaluators and abstract machines. In: Proceedings of the Fifth ACM-SIGPLAN Conference, PPDP’03. pp. 8–19 (2003)
12. Balabonski, T., Barenbaum, P., Bonelli, E., Kesner, D.: Foundations of strong call by need. PACMPL (ICFP), 20:1–20:29 (2017)
13. Biernacka, M., Biernacki, D., Charatonik, W., Drab, T.: An abstract machine for strong call by value. In: Programming Languages and Systems - 18th Asian Symposium, APLAS 2020, Proceedings. LNCS, vol. 12470, pp. 147–166 (2020)
14. Biernacka, M., Charatonik, W.: Deriving an abstract machine for strong call by need. In: 4th International Conference on Formal Structures for Computation and Deduction, FSCD 2019. LIPIcs, vol. 131, pp. 8:1–8:20 (2019)
15. Biernacka, M., Charatonik, W., Zielinska, K.: Generalized refocusing: From hybrid strategies to abstract machines. In: 2nd International Conference on Formal Structures for Computation and Deduction, FSCD 2017. pp. 10:1–10:17 (2017)
16. Condoluci, A., Accattoli, B., Coen, C.S.: Sharing equality is linear. In: Proceedings of the 21st International Symposium on Principles and Practice of Programming Languages, PPDP 2019. pp. 9:1–9:14 (2019)
17. Crégut, P.: Strongly reducing variants of the Krivine abstract machine. Higher-Order and Symbolic Computation (3), 209–230 (2007)
18. Dal Lago, U., Accattoli, B.: Encoding Turing machines into the deterministic lambda-calculus. CoRR abs/1711.10078 (2017)
19. Filinski, A., Rohde, H.K.: Denotational aspects of untyped normalization by evaluation. Theoretical Informatics and Applications (3), 423–453 (2005)
20. García-Pérez, A., Nogueira, P.: On the syntactic and functional correspondence between hybrid (or layered) normalisers and abstract machines. Science of Computer Programming, 176–199 (2014)
21. Grégoire, B., Leroy, X.: A compiled implementation of strong reduction. In: International Conference on Functional Programming. pp. 235–246. SIGPLAN Notices 37(9) (2002)
22. Hunter, J.D.: Matplotlib: A 2d graphics environment. Computing in Science & Engineering (3), 90–95 (2007)
23. Lago, U.D., Martini, S.: The weak lambda calculus as a reasonable machine. Theor. Comput. Sci. 398