Higher-Order Nonemptiness Step by Step
Paweł Parys
Institute of Informatics, University of Warsaw, [email protected]
Abstract
We show a new simple algorithm that checks whether a given higher-order grammar generates a nonempty language of trees. The algorithm amounts to a procedure that transforms a grammar of order n to a grammar of order n − 1, preserving nonemptiness, and increasing the size only exponentially. After repeating the procedure n times, we obtain a grammar of order 0, whose nonemptiness can be easily checked. Since the size grows exponentially at each step, the overall complexity is n-EXPTIME, which is known to be optimal. More precisely, the transformation (and hence the whole algorithm) is linear in the size of the grammar, assuming that the arity of employed nonterminals is bounded by a constant. The same algorithm makes it possible to check whether an infinite tree generated by a higher-order recursion scheme is accepted by an alternating safety (or reachability) automaton, because this question can be reduced to the nonemptiness problem by taking a product of the recursion scheme with the automaton.
A proof of correctness of the algorithm is formalised in the proof assistant Coq. Our transformation is motivated by a similar transformation of Asada and Kobayashi (2020) changing a word grammar of order n to a tree grammar of order n − 1. The step-by-step approach contrasts with previous algorithms, which solve the nonemptiness problem “in one step” and are necessarily more complicated.
Theory of computation → Rewrite systems
Keywords and phrases
Higher-order grammars, Nonemptiness, Model-checking, Transformation, Order reduction
Supplement Material
Coq formalisation: https://github.com/pparys/ho-transform-sbs
Higher-order grammars, also known as higher-order OI grammars [8, 16], generalize context-free grammars: nonterminals of higher-order grammars are allowed to take arguments. Such grammars have been studied actively in recent years, in the context of automated verification of higher-order programs. In this paper we concentrate on a very basic problem of language nonemptiness: is the language generated by a given higher-order grammar nonempty? This problem, being easy for most devices, is not so easy for higher-order grammars. Indeed, it is n-EXPTIME-complete for grammars of order n [15].

We give a new simple algorithm solving the language nonemptiness problem. The algorithm amounts to a procedure that transforms a grammar of order n to a grammar of order n − 1, preserving nonemptiness, and increasing the size only exponentially. After repeating the procedure n times, we obtain a grammar of order 0, whose nonemptiness can be easily checked. Since the size grows exponentially at each step, we reach the optimal overall complexity of n-EXPTIME. In a more detailed view, the complexity looks even better: the size growth is exponential only in the arity of types appearing in the grammar; if the maximal arity is bounded by a constant, the transformation (and hence the whole algorithm) is linear in the size of the grammar.

While a higher-order grammar is a generator of a language of (finite) trees, virtually the same object can be seen as a generator of a single infinite tree (encompassing the whole language). In this context, the grammars are called higher-order recursion schemes. The nonemptiness problem for grammars is easily seen to be equivalent to the question of whether the tree generated by a given recursion scheme is accepted by a given alternating safety (or reachability) automaton; for the right-to-left reduction, it is enough to take the product of the recursion scheme with the automaton. Thus, our algorithm also solves the latter problem, called the model-checking problem. This problem is decidable and n-EXPTIME-complete not only for safety or reachability automata, but actually for all parity automata, with multiple proofs using game semantics [17], collapsible pushdown automata [10], intersection types [14], or Krivine machines [20], and with several extensions [5, 3, 6, 21, 18]. The problem for safety automata was tackled in particular by Aehlig [1] and by Kobayashi [12]. To those algorithms we add another one. The main difference between our algorithm and all the others is that we solve the problem step by step, repeatedly reducing the order by one, while most previous algorithms work “in one step” and are necessarily more complicated.
The only proofs that reduced the order by one were proofs using collapsible pushdown automata [10, 3, 6], which are very technical (and contained only in unpublished appendices). A reduction of order was also possible for a subclass of recursion schemes, called safe recursion schemes [11], but it was not known how to extend it to all recursion schemes.

Comparing the two variants of the model-checking problem for higher-order recursion schemes (the one involving safety and reachability automata, and the one involving all parity automata), we have to mention two things. First, while most theoretical results can handle all parity automata, actual tools solving this problem in practice mostly deal only with safety and reachability automata (also called trivial and co-trivial automata) [13, 4, 22, 19]. Second, there exists a polynomial-time (although nontrivial) reduction from the variant involving parity automata to the variant involving safety automata [9].

Our transformation is directly motivated by a recent paper of Asada and Kobayashi [2]. They show how to transform a grammar of order n generating a language of words to a grammar of order n − 1 [...]

For a number $k\in\mathbb{N}$ we write $[k]$ for $\{1,\dots,k\}$.

The set of (simple) types is constructed from a unique ground type $o$ using a binary operation $\to$; namely $o$ is a type, and if $\alpha$ and $\beta$ are types, so is $\alpha\to\beta$. By convention, $\to$ associates to the right, that is, $\alpha\to\beta\to\gamma$ is understood as $\alpha\to(\beta\to\gamma)$. We often abbreviate $\underbrace{\alpha\to\dots\to\alpha}_{\ell}\to\beta$ as $\alpha^\ell\to\beta$. The order of a type $\alpha$, denoted $\mathit{ord}(\alpha)$, is defined by induction: $\mathit{ord}(\alpha_1\to\dots\to\alpha_k\to o)=\max(\{0\}\cup\{\mathit{ord}(\alpha_i)+1\mid i\in[k]\})$; for example $\mathit{ord}(o)=0$, $\mathit{ord}(o\to o\to o)=1$, and $\mathit{ord}((o\to o)\to o)=2$.

Having a finite set of typed nonterminals $\mathcal{X}$, and a finite set of typed variables $\mathcal{Y}$, terms over $(\mathcal{X},\mathcal{Y})$ are defined by induction:
- every nonterminal $X\in\mathcal{X}$ of type $\alpha$ is a term of type $\alpha$;
- every variable $y\in\mathcal{Y}$ of type $\alpha$ is a term of type $\alpha$;
- if $K_1,\dots,K_k$ are terms of type $o$, then $\bullet\langle K_1,\dots,K_k\rangle$ and $\oplus\langle K_1,\dots,K_k\rangle$ are terms of type $o$;
- if $K$ is a term of type $\alpha\to\beta$, and $L$ is a term of type $\alpha$, then $K\,L$ is a term of type $\beta$.

The type of a term $K$ is denoted $\mathit{tp}(K)$. The order of a term $K$, written $\mathit{ord}(K)$, is defined as the order of its type. We write $\Omega$ for $\oplus\langle\rangle$, and $\bullet$ for $\bullet\langle\rangle$.

The construction $\oplus\langle K_1,\dots,K_k\rangle$ is an alternative; such a term reduces to one of the terms $K_1,\dots,K_k$. This construction is used to introduce nondeterminism to grammars (defined below). In the special case of $k=0$ (when we write $\Omega$) no reduction is possible; thus $\Omega$ denotes divergence.

The construction $\bullet\langle K_1,\dots,K_k\rangle$ can be seen as a generator of a tree node with $k$ children; subtrees starting in these children are described by the terms $K_1,\dots,K_k$. In a usual presentation, nodes are labeled by letters from some finite alphabet. In this paper, however, we do not care about the exact letters contained in generated trees, only about language nonemptiness, hence we do not write these letters at all (in other words, we use a single-letter alphabet, where $\bullet$ is the only letter). Actually, in the sequel we do not even consider trees; we rather say that $\bullet\langle K_1,\dots,K_k\rangle$ is convergent if all $K_1,\dots,K_k$ are convergent (which can be rephrased as: the language generated from $\bullet\langle K_1,\dots,K_k\rangle$ is nonempty if the languages generated from all $K_1,\dots,K_k$ are nonempty).

A (higher-order) grammar is a tuple $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$, where $\mathcal{X}$ is a finite set of typed nonterminals, $X_0\in\mathcal{X}$ is a starting nonterminal of type $o$, and $\mathcal{R}$ is a function assigning to every nonterminal $X\in\mathcal{X}$ a rule of the form $X\,y_1\dots y_k\to R$, where $\mathit{tp}(X)=(\mathit{tp}(y_1)\to\dots\to\mathit{tp}(y_k)\to o)$, and $R$ is a term of type $o$ over $(\mathcal{X},\{y_1,\dots,y_k\})$.
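The order of a type, as defined earlier in this section, can be computed by a short recursive function. The following Python sketch is only an illustration: the encoding of types as nested pairs and the helper names `arrow` and `order` are assumptions of this example, not notation from the paper.

```python
# Simple types: the ground type is the string "o"; an arrow type
# alpha -> beta is encoded as the pair (alpha, beta). This encoding
# is an assumption made for this sketch.
O = "o"

def arrow(*types):
    """Right-associative arrow: arrow(a, b, c) encodes a -> (b -> c)."""
    *args, result = types
    for alpha in reversed(args):
        result = (alpha, result)
    return result

def order(tp):
    """ord(o) = 0 and ord(alpha -> beta) = max(ord(alpha) + 1, ord(beta))."""
    if tp == O:
        return 0
    alpha, beta = tp
    return max(order(alpha) + 1, order(beta))

# The three example types from the text: o, o -> o -> o, (o -> o) -> o.
examples = [order(O), order(arrow(O, O, O)), order(arrow(arrow(O, O), O))]
```

Unfolding the recursion along the right spine of a type gives exactly the maximum over all argument types used in the inductive definition.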
The order of a grammar is defined as the maximum of orders of its nonterminals.

Having a grammar $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$, for every set of variables $\mathcal{Y}$ we define a reduction relation $\longrightarrow_{\mathcal{G}}$ between terms over $(\mathcal{X},\mathcal{Y})$ and sets of such terms, as the least relation such that
(1) $X\,K_1\dots K_k\longrightarrow_{\mathcal{G}}\{R[K_1/y_1,\dots,K_k/y_k]\}$ if the rule for $X$ is $X\,y_1\dots y_k\to R$, where $R[K_1/y_1,\dots,K_k/y_k]$ denotes the term obtained from $R$ by substituting $K_i$ for $y_i$ for all $i\in[k]$,
(2) $\bullet\langle K_1,\dots,K_k\rangle\longrightarrow_{\mathcal{G}}\{K_1,\dots,K_k\}$, and
(3) $\oplus\langle K_1,\dots,K_k\rangle\longrightarrow_{\mathcal{G}}\{K_i\}$ for every $i\in[k]$.

We say that a term $M$ is $\mathcal{G}$-convergent if $M\longrightarrow_{\mathcal{G}}\mathcal{N}$ for some set $\mathcal{N}$ of $\mathcal{G}$-convergent terms. This is an inductive definition; in particular, the base case is when $M\longrightarrow_{\mathcal{G}}\emptyset$. In other words, $M$ is $\mathcal{G}$-convergent if there is a finite tree labeled by terms where for each node, the node and its children satisfy one of (1)-(3). Moreover, the grammar $\mathcal{G}$ is convergent if its starting nonterminal $X_0$ is $\mathcal{G}$-convergent.

In this section we present a transformation, called order-reducing transformation, resulting in the main theorem of this paper:

Theorem 3.1. For any $n\geq 1$, there exists a transformation from order-$n$ grammars to order-$(n-1)$ grammars, and a polynomial $p_n$ such that, for any order-$n$ grammar $\mathcal{G}$, the resulting grammar $\mathcal{G}^\dagger$ is convergent if and only if $\mathcal{G}$ is convergent, and $|\mathcal{G}^\dagger|\leq 2^{p_n(|\mathcal{G}|)}$.

Intuitions.
Let us first present intuitions behind our transformation. While reducing the order, we have to replace, in particular, order-1 functions by order-0 terms. Consider for example a term $K\,L$ of type $o$, where $K$ has type $o\to o$. Notice that $L$ generates trees that are inserted somewhere in contexts generated by $K$. Thus, when is $K\,L$ convergent? There are two possibilities. First, maybe $K$ is convergent without using its argument at all. Second, maybe $K$ can be convergent but only using its argument, and then $L$ also has to be convergent. Notice that in the first case $K\,\Omega$ is convergent (i.e., $K$ is convergent even if the argument is not convergent), and in the second case $K\,\bullet$ is convergent (i.e., $K$ is convergent if its argument is convergent). In the transformation, we transform $K$ into two order-0 terms, $K_0$ and $K_1$, corresponding to $K\,\Omega$ and $K\,\bullet$, and then we replace $K\,L$ by $\oplus\langle K_0,\bullet\langle K_1,L\rangle\rangle$.

As a full example, consider an order-1 grammar with the following rules:

$X\to Y\,Z$, $\quad Y\,x\to\oplus\langle\bullet,x\rangle$, $\quad Z\to\bullet$.

It will be transformed to the order-0 grammar with the following rules:

$X\to\oplus\langle Y_0,\bullet\langle Y_1,Z\rangle\rangle$, $\quad Y_0\to\oplus\langle\bullet,\Omega\rangle$, $\quad Y_1\to\oplus\langle\bullet,\bullet\rangle$, $\quad Z\to\bullet$.

Notice that the original grammar is convergent “for two reasons”: the $\oplus$ node in the rule for $Y$ may reduce either to the first possibility (i.e., to $\bullet$), or to the second possibility (i.e., to $x$), in which case convergence follows from convergence of the argument $Z$. This is reflected by the two possibilities available for the $\oplus$ node in the new rule for $X$: we either choose the first possibility and we depend only on convergence of $Y_0$, or we choose the second possibility and we depend on convergence of both $Y_1$ and $Z$. Notice that after replacing the (old and new) rule for $Z$ by $Z\to\Omega$, the modified grammars remain convergent thanks to the first possibility above. Likewise, after replacing the original rule for $Y$ by $Y\,x\to x$, the new rules will be $Y_0\to\Omega$ and $Y_1\to\bullet$, and the modified grammars remain convergent thanks to the second possibility above. However, after applying both these replacements simultaneously, the grammars cease to be convergent.

If our term $K$ takes multiple order-0 arguments, say we have $K\,L_1\dots L_k$, while transforming $K$ we need $2^k$ variants of the term: each of the arguments may be either used (replaced by $\bullet$) or not used (replaced by $\Omega$).
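The order-1 case described above can be replayed mechanically. The Python sketch below is a simplified illustration restricted to order-1 grammars (all parameters of type $o$); the tagged-tuple encoding of terms, the function names, and the convention of appending one bit per argument to nonterminal names are assumptions of this example rather than the paper's formal definition.

```python
from itertools import product

# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("nt", name), ("var", name), ("app", K, L),
#   ("node", [K1, ...]) for a bullet node, ("alt", [K1, ...]) for a choice.
OMEGA = ("alt", [])    # empty choice: divergence
BULLET = ("node", [])  # node without children: convergence

def suffix(used):
    return "".join(str(bit) for bit in used)

def tr(used, Z, term):
    """Translate an order-1 term; `used` has one bit per pending argument."""
    tag = term[0]
    if tag == "nt":
        return ("nt", term[1] + suffix(used))
    if tag == "var":  # order-0 parameter: used (bullet) or not used (Omega)
        return BULLET if Z[term[1]] else OMEGA
    if tag in ("node", "alt"):
        return (tag, [tr((), Z, child) for child in term[1]])
    _, K, L = term  # application K L: either K ignores L, or K uses L
    return ("alt", [tr((0,) + used, Z, K),
                    ("node", [tr((1,) + used, Z, K), tr((), Z, L)])])

def transform(rules):
    """rules maps a nonterminal name to (list of order-0 parameters, body)."""
    new_rules = {}
    for name, (params, body) in rules.items():
        for used in product((0, 1), repeat=len(params)):  # 2^k variants
            Z = dict(zip(params, used))
            new_rules[name + suffix(used)] = tr((), Z, body)
    return new_rules

# X -> Y Z,   Y x -> choice(bullet, x),   Z -> bullet
rules = {
    "X": ([], ("app", ("nt", "Y"), ("nt", "Z"))),
    "Y": (["x"], ("alt", [BULLET, ("var", "x")])),
    "Z": ([], BULLET),
}
result = transform(rules)
```

On the example grammar this yields the four rules for X, Y0, Y1 and Z listed above. The loop over `product((0, 1), repeat=len(params))` makes the $2^k$ variants of a nonterminal with $k$ order-0 parameters explicit.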
This is why we have the exponential blow-up.

Let us compare this quickly with the transformation of Asada and Kobayashi [2], which worked for grammars generating words (i.e., trees where every node has at most one child). In their case, at most one of the arguments $L_i$ could be used, so they needed only $k+1$ variants of $K$; this is why their transformation was polynomial.

For higher-order grammars we apply the same idea: functions of order 1 are replaced by terms of order 0, and then the order of any higher-order function drops down by one. For example, consider a grammar with the following rules:

$X\to T\,Y$, $\quad T\,y\to y\,(y\,\bullet)$, $\quad Y\,x\to\oplus\langle\bullet,x\rangle$.

The nonterminal $Y$ is again of type $o\to o$, hence it is replaced by two nonterminals $Y_0$, $Y_1$ of type $o$, describing the situation when the parameter $x$ is either not used or used. Likewise, the corresponding parameter $y$ of $T$ is replaced by two parameters $y_0$, $y_1$. The resulting grammar will have the following rules:

$X\to T\,Y_0\,Y_1$, $\quad T\,y_0\,y_1\to\oplus\langle y_0,\bullet\langle y_1,\oplus\langle y_0,\bullet\langle y_1,\bullet\rangle\rangle\rangle\rangle$, $\quad Y_0\to\oplus\langle\bullet,\Omega\rangle$, $\quad Y_1\to\oplus\langle\bullet,\bullet\rangle$.

Formal definition.
We now formalize the above intuitions. Having a type, we are interested in cutting off its suffix being of order 1. Thus, we use the notation $\alpha_1\to\dots\to\alpha_k\Rightarrow o^\ell\to o$ for a type $\alpha_1\to\dots\to\alpha_k\to o^\ell\to o$ such that either $k=0$ or $\alpha_k\neq o$. Notice that every type $\alpha$ can be uniquely represented in this form. We remark that some among the types $\alpha_1,\dots,\alpha_{k-1}$ (but not $\alpha_k$) may be $o$. For a type $\alpha$ we write $\mathit{gar}(\alpha)$ (“ground arity”) for the number $\ell$ for which we can write $\alpha=(\alpha_1\to\dots\to\alpha_k\Rightarrow o^\ell\to o)$; we also extend this to terms: $\mathit{gar}(M)=\mathit{gar}(\mathit{tp}(M))$.

We transform terms of type $\alpha$ to terms of type $\alpha^\dagger$, which is defined by induction:

$$(\alpha_1\to\dots\to\alpha_k\Rightarrow o^\ell\to o)^\dagger=\bigl((\alpha_1^\dagger)^{2^{\mathit{gar}(\alpha_1)}}\to\dots\to(\alpha_k^\dagger)^{2^{\mathit{gar}(\alpha_k)}}\to o\bigr).$$

Thus, we remove all trailing order-0 arguments, and we multiply (and recursively transform) the remaining arguments.

For a finite set $S$, we write $2^S$ for the set of functions $A\colon S\to\{0,1\}$. Moreover, we assume some fixed order on functions in $2^S$, and we write $P\,(Q_A)_{A\in 2^S}$ for an application $P\,Q_{A_1}\dots Q_{A_{2^{|S|}}}$, where $A_1,\dots,A_{2^{|S|}}$ are all the functions from $2^S$ listed in the fixed order. The only function in $2^\emptyset$ is denoted $\emptyset$.

Fix a grammar $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$. For every nonterminal $X$ and for every function $A\in 2^{[\mathit{gar}(X)]}$ we consider a nonterminal $X^\dagger_A$ of type $(\mathit{tp}(X))^\dagger$. As the new set of nonterminals we take $\mathcal{X}^\dagger=\{X^\dagger_A\mid X\in\mathcal{X},A\in 2^{[\mathit{gar}(X)]}\}$. Likewise, for every variable $y$ and for every function $A\in 2^{[\mathit{gar}(y)]}$ we consider a variable $y^\dagger_A$ of type $(\mathit{tp}(y))^\dagger$, and for a set of variables $\mathcal{Y}$ we denote $\mathcal{Y}^\dagger=\{y^\dagger_A\mid y\in\mathcal{Y},A\in 2^{[\mathit{gar}(y)]}\}$.

We now define a function $\mathit{tr}$ transforming terms. Its value $\mathit{tr}(A,Z,M)$ is defined when $M$ is a term over some $(\mathcal{X},\mathcal{Y})$, and $A\in 2^{[\mathit{gar}(M)]}$, and $Z\colon\mathcal{Y}\rightharpoonup\{0,1\}$ is a partial function such that $\mathrm{dom}(Z)$ contains only variables of type $o$. The intention is that $A$ specifies which among trailing order-0 arguments can be used, and $Z$ specifies which order-0 variables (among those in $\mathrm{dom}(Z)$) can be used. The transformation is defined by induction on the structure of $M$, as follows:
(1) $\mathit{tr}(A,Z,X)=X^\dagger_A$ for $X\in\mathcal{X}$;
(2) $\mathit{tr}(A,Z,y)=y^\dagger_A$ for $y\in\mathcal{Y}\setminus\mathrm{dom}(Z)$;
(3) $\mathit{tr}(A,Z,z)=\Omega$ if $Z(z)=0$;
(4) $\mathit{tr}(A,Z,z)=\bullet$ if $Z(z)=1$;
(5) $\mathit{tr}(\emptyset,Z,\bullet\langle K_1,\dots,K_k\rangle)=\bullet\langle\mathit{tr}(\emptyset,Z,K_1),\dots,\mathit{tr}(\emptyset,Z,K_k)\rangle$;
(6) $\mathit{tr}(\emptyset,Z,\oplus\langle K_1,\dots,K_k\rangle)=\oplus\langle\mathit{tr}(\emptyset,Z,K_1),\dots,\mathit{tr}(\emptyset,Z,K_k)\rangle$;
(7) $\mathit{tr}(A,Z,K\,L)=\oplus\langle\mathit{tr}(A[\ell+1\mapsto 0],Z,K),\bullet\langle\mathit{tr}(A[\ell+1\mapsto 1],Z,K),\mathit{tr}(\emptyset,Z,L)\rangle\rangle$ if $\mathit{tp}(K)=(o^{\ell+1}\to o)$;
(8) $\mathit{tr}(A,Z,K\,L)=(\mathit{tr}(A,Z,K))\,(\mathit{tr}(B,Z,L))_{B\in 2^{[\mathit{gar}(L)]}}$ if $\mathit{tp}(K)=(\alpha_1\to\dots\to\alpha_k\Rightarrow o^\ell\to o)$ with $k\geq 1$.

For every rule $X\,y_1\dots y_k\,z_1\dots z_\ell\to R$ in $\mathcal{R}$, where $\ell=\mathit{gar}(X)$, and for every function $A\in 2^{[\ell]}$, to $\mathcal{R}^\dagger$ we take the rule

$$X^\dagger_A\,(y^\dagger_{1,B})_{B\in 2^{[\mathit{gar}(y_1)]}}\dots(y^\dagger_{k,B})_{B\in 2^{[\mathit{gar}(y_k)]}}\to\mathit{tr}(\emptyset,[z_i\mapsto A(\ell+1-i)\mid i\in[\ell]],R).$$

In the function $A$ it is more convenient to count arguments from right to left (then we do not need to shift the domain in Case (7) above), but it is more natural to have variables $z_1,\dots,z_\ell$ numbered from left to right; this is why in the rule for $X^\dagger_A$ we assign to $z_i$ the value $A(\ell+1-i)$, not $A(i)$.

Finally, the resulting grammar $\mathcal{G}^\dagger$ is $(\mathcal{X}^\dagger,X^\dagger_{0,\emptyset},\mathcal{R}^\dagger)$.

In this section we analyze the complexity of our transformation. First, we formally define the size of a grammar. The size of a term is defined by induction on its structure:

$$|X|=|y|=1,\qquad |K\,L|=1+|K|+|L|,\qquad |\bullet\langle K_1,\dots,K_k\rangle|=|\oplus\langle K_1,\dots,K_k\rangle|=1+|K_1|+\dots+|K_k|.$$
Then $|\mathcal{G}|$, the size of $\mathcal{G}$, is defined as the sum of $|R|+k$ over all rules $X\,y_1\dots y_k\to R$ of $\mathcal{G}$. In Asada and Kobayashi [2] such a size is called Curry-style size; it does not include sizes of types of employed variables.

We say that a type $\alpha$ is a subtype of a type $\beta$ if either $\alpha=\beta$, or $\beta=(\beta_1\to\beta_2)$ and $\alpha$ is a subtype of $\beta_1$ or of $\beta_2$. We write $A_{\mathcal{G}}$ for the largest arity of subtypes of types of nonterminals in a grammar $\mathcal{G}$. Notice that types of other objects appearing in $\mathcal{G}$, namely variables and subterms of right sides of rules, are subtypes of types of nonterminals, hence their arity is also bounded by $A_{\mathcal{G}}$. It is reasonable to consider large grammars, consisting of many rules, where simultaneously the maximal arity $A_{\mathcal{G}}$ is relatively small.

While the exponential bound mentioned in Theorem 3.1 is obtained by applying the order-reducing transformation to an arbitrary grammar, the complexity becomes slightly better if we first apply a preprocessing step. This is in particular necessary if we want to obtain linear dependence on the size of $\mathcal{G}$ (and exponential dependence only on the maximal arity $A_{\mathcal{G}}$). The preprocessing, which makes sure that the grammar is in a simple form (defined below), amounts to splitting large rules into multiple smaller rules. A similar preprocessing is already present in prior work [13, 2, 7]; however, our definition of a simple form is slightly more liberal, so that the order reduction applied to a grammar in a simple form gives again a grammar in a simple form.

The application depth of a term $R$ is defined as the maximal number of applications on a single branch in $R$, where a compound application $K\,L_1\dots L_k$ counts only once. More formally, we define by induction:

$$\mathit{ad}(\bullet\langle K_1,\dots,K_k\rangle)=\mathit{ad}(\oplus\langle K_1,\dots,K_k\rangle)=\max\{\mathit{ad}(K_i)\mid i\in[k]\},$$
$$\mathit{ad}(X\,K_1\dots K_k)=\mathit{ad}(y\,K_1\dots K_k)=\max(\{0\}\cup\{\mathit{ad}(K_i)+1\mid i\in[k]\}).$$
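The two inductive definitions above (term size and application depth) translate directly into recursive functions. The Python sketch below is an illustration only: the tagged-tuple encoding of terms and the function names are assumptions of this example, and the maximum over an empty family of children is taken to be 0, a convention the text does not spell out.

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("nt", name), ("var", name), ("app", K, L),
#   ("node", [K1, ...]) for a bullet node, ("alt", [K1, ...]) for a choice.
BULLET = ("node", [])

def size(term):
    """|X| = |y| = 1, |K L| = 1 + |K| + |L|, and 1 + sum of children otherwise."""
    tag = term[0]
    if tag in ("nt", "var"):
        return 1
    if tag == "app":
        return 1 + size(term[1]) + size(term[2])
    return 1 + sum(size(child) for child in term[1])

def spine(term):
    """Split a compound application f K1 ... Kk into (f, [K1, ..., Kk])."""
    args = []
    while term[0] == "app":
        args.append(term[2])
        term = term[1]
    return term, args[::-1]

def ad(term):
    """Application depth; a compound application counts only once."""
    if term[0] in ("node", "alt"):
        # Assumption: the maximum over no children is 0.
        return max((ad(child) for child in term[1]), default=0)
    _, args = spine(term)
    return max((ad(arg) + 1 for arg in args), default=0)

# Right side of the rule  T y -> y (y bullet)  from the earlier example.
t_body = ("app", ("var", "y"), ("app", ("var", "y"), BULLET))
# Right side of the rule  X -> Y Z.
x_body = ("app", ("nt", "Y"), ("nt", "Z"))
```

For instance, the right side `t_body` has size 5 and application depth 2, so a rule with this right side already satisfies the depth bound required of a simple form.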
We say that a grammar $\mathcal{G}$ is in a simple form if the right side of each of its rules has application depth at most 2.

Any grammar $\mathcal{G}$ can be converted to a grammar in a simple form, as follows. Consider a rule $X\,y_1\dots y_k\to R$, and a subterm of $R$ of the form $f\,K_1\dots K_m$, where $f$ is a nonterminal or a variable, but some $K_i$ already has application depth 2. Then we replace the occurrence of $K_i$ with $Y\,y_1\dots y_k$ (being a term of application depth 1) for a fresh nonterminal $Y$, and we add the rule $Y\,y_1\dots y_k\,x_1\dots x_s\to K_i\,x_1\dots x_s$ (whose right side already has application depth 2; the additional variables $x_1,\dots,x_s$ are added to ensure that the type is $o$). By repeating such a replacement for every “bad” subterm of every rule, we clearly obtain a grammar in a simple form.

Lemma 4.1. Let $\mathcal{G}'$ be the grammar in a simple form obtained by the above simplification procedure from a grammar $\mathcal{G}$. Then $\mathit{ord}(\mathcal{G}')=\mathit{ord}(\mathcal{G})$, and $A_{\mathcal{G}'}\leq 2A_{\mathcal{G}}$, and $|\mathcal{G}'|=O(A_{\mathcal{G}}\cdot|\mathcal{G}|)$. The procedure can be performed in time linear in its output size.

Proof. The parts about the order and about the running time are obvious.

Types of nonterminals originating from $\mathcal{G}$ remain unchanged. The type of a fresh nonterminal $Y$ introduced in the procedure is of the form $\alpha_1\to\dots\to\alpha_k\to\beta_1\to\dots\to\beta_s\to o$, where all $\alpha_i$ and $\beta_i$ are types present also in $\mathcal{G}$. The arity of the whole type is $k+s$, where $k$ is the arity of the original nonterminal $X$ (hence it is bounded by $A_{\mathcal{G}}$), and $s$ is bounded by the arity of the type of $K_i$ (hence also by $A_{\mathcal{G}}$).

In order to bound the size of the resulting grammar, notice that the considered replacement is performed at most once for every subterm of the right side of every rule, hence the number of replacements is bounded by $|\mathcal{G}|$. Each such replacement increases the size of the grammar by at most $O(A_{\mathcal{G}})$. ∎

Lemma 4.2. For every grammar $\mathcal{G}$ in a simple form, the grammar $\mathcal{G}^\dagger$ (i.e., the result of the order-reducing transformation) is also in a simple form, and $\mathit{ord}(\mathcal{G}^\dagger)=\max(0,\mathit{ord}(\mathcal{G})-1)$, and $A_{\mathcal{G}^\dagger}\leq A_{\mathcal{G}}\cdot 2^{A_{\mathcal{G}}}$, and $|\mathcal{G}^\dagger|=O(|\mathcal{G}|\cdot 2^{5\cdot A_{\mathcal{G}}})$. Moreover, the transformation can be performed in time linear in its output size.

Proof. The part about the running time is obvious. It is also easy to see by induction that $\mathit{ord}(\alpha^\dagger)=\max(0,\mathit{ord}(\alpha)-1)$; recall that nonterminals of $\mathcal{G}^\dagger$ have type $\alpha^\dagger$ for $\alpha$ being the type of a corresponding nonterminal of $\mathcal{G}$.

Recall that in the type $\alpha^\dagger$ obtained from $\alpha=(\alpha_1\to\dots\to\alpha_k\to o)$, every $\alpha_i$ either disappears or becomes (transformed and) repeated $2^{\mathit{gar}(\alpha_i)}$ times, that is, at most $2^{A_{\mathcal{G}}}$ times. This implies the inequality concerning $A_{\mathcal{G}^\dagger}$.

Every compound application can be written as $f\,K_1\dots K_k\,L_1\dots L_\ell$, where $f$ is a nonterminal or a variable, and $\ell=\mathit{gar}(f)$. In such a term, every $K_i$ (after transforming) becomes repeated $2^{\mathit{gar}(K_i)}$ times, that is, at most $2^{A_{\mathcal{G}}}$ times. Then, for every $L_i$ we duplicate the outcome and we append a small prefix; this duplication happens $\ell$ times, that is, at most $A_{\mathcal{G}}$ times. In consequence, we easily see by induction that while transforming a term of application depth $d$, its size gets multiplied by at most $O(2^{2d\cdot A_{\mathcal{G}}})$. Moreover, every nonterminal $X$ is repeated $2^{\mathit{gar}(X)}$ times, that is, at most $2^{A_{\mathcal{G}}}$ times. Because the application depth of right sides of rules is at most 2, this bounds the size of the new grammar by $O(|\mathcal{G}|\cdot 2^{5\cdot A_{\mathcal{G}}})$.

Looking again at the above description of the transformation, we can notice that the application depth cannot grow; in consequence the property of being in a simple form is preserved. ∎

Thus, if we want to check nonemptiness of a grammar $\mathcal{G}$ of order $n$, we can first convert it to a simple form, and then apply the order-reducing transformation $n$ times. This gives us a grammar of order 0, whose nonemptiness can be checked in linear time.
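For an order-0 grammar, convergence of the starting nonterminal can be decided by computing the least set of convergent nonterminals. The Python sketch below uses a naive fixed-point iteration, which is simpler than (and not equivalent in running time to) the linear-time check mentioned above; the tagged-tuple encoding of terms and all names are assumptions of this example.

```python
# Order-0 terms as tagged tuples (an encoding assumed for this sketch):
#   ("nt", name), ("node", [K1, ...]) for a bullet node,
#   ("alt", [K1, ...]) for a choice.
OMEGA = ("alt", [])
BULLET = ("node", [])

def convergent_nonterminals(rules):
    """Least fixed point: rules maps each nonterminal to its right side."""
    conv = set()

    def ok(term):
        tag = term[0]
        if tag == "nt":
            return term[1] in conv
        if tag == "node":  # all children must be convergent
            return all(ok(child) for child in term[1])
        return any(ok(child) for child in term[1])  # some alternative suffices

    changed = True
    while changed:  # naive iteration, quadratic in the worst case
        changed = False
        for name, body in rules.items():
            if name not in conv and ok(body):
                conv.add(name)
                changed = True
    return conv

x_rule = ("alt", [("nt", "Y0"), ("node", [("nt", "Y1"), ("nt", "Z")])])

# The order-0 grammar obtained in the example of Section 3.
grammar = {
    "X": x_rule,
    "Y0": ("alt", [BULLET, OMEGA]),
    "Y1": ("alt", [BULLET, BULLET]),
    "Z": BULLET,
}
# Both modifications from Section 3 applied at once: Y x -> x and Z -> Omega.
modified = {"X": x_rule, "Y0": OMEGA, "Y1": BULLET, "Z": OMEGA}

good = convergent_nonterminals(grammar)
bad = convergent_nonterminals(modified)
```

Here `good` contains the starting nonterminal X, while `bad` does not, matching the observation in Section 3 that applying both replacements simultaneously destroys convergence.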
By Lemmata 4.1 and 4.2, the whole algorithm works in time $n$-fold exponential in $A_{\mathcal{G}}$ and linear in $|\mathcal{G}|$.

If the original grammar $\mathcal{G}$ generates a language of words, we can start by applying the polynomial-time transformation of Asada and Kobayashi [2], which converts $\mathcal{G}$ into an equivalent grammar of order $n-1$ [...] $|\mathcal{G}|$, and increases the arity only quadratically; in this case we obtain an algorithm working in time $(n-1)$-fold exponential in $A_{\mathcal{G}}$ and linear in $|\mathcal{G}|$.

In this section we finish the proof of Theorem 3.1 by showing that the grammar $\mathcal{G}^\dagger$ resulting from transforming a grammar $\mathcal{G}$ is convergent if and only if the original grammar $\mathcal{G}$ is convergent. This proof is also formalised in the proof assistant Coq, and available at GitHub (https://github.com/pparys/ho-transform-sbs). The strategy of our proof is similar to that of Asada and Kobayashi [2]. Namely, we first show that reductions performed by $\mathcal{G}$ can be reordered, so that we can postpone substituting for (trailing) variables of order 0. To store such postponed substitutions, called explicit substitutions, we introduce extended terms. Then, we show that such reordered reductions in $\mathcal{G}$ are in a direct correspondence with reductions in $\mathcal{G}^\dagger$.

Extended terms.
In the sequel, terms defined previously are sometimes called non-extended terms, in order to distinguish them from extended terms defined below. Having a finite set of typed nonterminals $\mathcal{X}$, and a finite set $\mathcal{Z}$ of variables of type $o$, extended terms over $(\mathcal{X},\mathcal{Z})$ are defined by induction:
- if $z\notin\mathcal{Z}$ is a variable of type $o$, and $E$ is an extended term over $(\mathcal{X},\mathcal{Z}\uplus\{z\})$, and $L$ is a non-extended term of type $o$ over $(\mathcal{X},\mathcal{Z})$, then $E\langle L/z\rangle$ is an extended term over $(\mathcal{X},\mathcal{Z})$;
- every non-extended term of type $o$ over $(\mathcal{X},\mathcal{Z})$ is an extended term over $(\mathcal{X},\mathcal{Z})$.

The construction of the form $E\langle L/z\rangle$ is called an explicit substitution. Intuitively, it denotes the term obtained by substituting $L$ for $z$ in $E$. Notice that the variable $z$ being free in $E$ becomes bound in $E\langle L/z\rangle$, and that explicit substitutions are allowed only for the ground type $o$.

Of course a (non-extended or extended) term over $(\mathcal{X},\mathcal{Z})$ can be also seen as a term over $(\mathcal{X},\mathcal{Z}')$, where $\mathcal{Z}'\supseteq\mathcal{Z}$. In the sequel, such extending of the set of variables is often performed implicitly.

Having a grammar $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$, for every set $\mathcal{Z}$ of variables of type $o$ we define an ext-reduction relation $\rightsquigarrow_{\mathcal{G}}$ between extended terms over $(\mathcal{X},\mathcal{Z})$ and sets of such terms, as the least relation such that
(1) $X\,K_1\dots K_k\,L_1\dots L_\ell\rightsquigarrow_{\mathcal{G}}\{R[K_1/y_1,\dots,K_k/y_k,z'_1/z_1,\dots,z'_\ell/z_\ell]\langle L_1/z'_1\rangle\dots\langle L_\ell/z'_\ell\rangle\}$ if $\ell=\mathit{gar}(X)$, and $\mathcal{R}(X)=(X\,y_1\dots y_k\,z_1\dots z_\ell\to R)$, and $z'_1,\dots,z'_\ell$ are fresh variables of type $o$ not appearing in $\mathcal{Z}$,
(2) $\bullet\langle K_1,\dots,K_k\rangle\rightsquigarrow_{\mathcal{G}}\{K_1,\dots,K_k\}$,
(3) $\oplus\langle K_1,\dots,K_k\rangle\rightsquigarrow_{\mathcal{G}}\{K_i\}$ for every $i\in[k]$,
(4) $z\langle L/z\rangle\rightsquigarrow_{\mathcal{G}}\{L\}$,
(5) $z'\langle L/z\rangle\rightsquigarrow_{\mathcal{G}}\{z'\}$ if $z'\neq z$, and
(6) $E\langle L/z\rangle\rightsquigarrow_{\mathcal{G}}\{F\langle L/z\rangle\mid F\in\mathcal{F}\}$ whenever $E\rightsquigarrow_{\mathcal{G}}\mathcal{F}$.

We say that an extended term $E$ over $(\mathcal{X},\emptyset)$ is $\mathcal{G}$-ext-convergent if $E\rightsquigarrow_{\mathcal{G}}\mathcal{F}$ for some set $\mathcal{F}$ of $\mathcal{G}$-ext-convergent extended terms. The grammar $\mathcal{G}$ is ext-convergent if its starting nonterminal $X_0$ is $\mathcal{G}$-ext-convergent.

There is an “expand” function from extended terms to non-extended terms, which performs all the explicit substitutions written in front of an extended term:

$$\exp(K\langle L_1/z_1\rangle\dots\langle L_\ell/z_\ell\rangle)=K[L_1/z_1]\dots[L_\ell/z_\ell].$$
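The “expand” function defined above amounts to applying the stored substitutions one after another, innermost first. The Python sketch below illustrates this; representing an extended term as a non-extended term together with a list of `(L, z)` pairs, as well as the names `subst` and `expand`, are assumptions of this example.

```python
# Non-extended terms as tagged tuples (an encoding assumed for this sketch):
#   ("nt", name), ("var", name), ("app", K, L),
#   ("node", [K1, ...]) for a bullet node, ("alt", [K1, ...]) for a choice.
BULLET = ("node", [])

def subst(term, var, repl):
    """term[repl/var]; terms contain no binders, so no capture can occur."""
    tag = term[0]
    if tag == "var":
        return repl if term[1] == var else term
    if tag == "nt":
        return term
    if tag == "app":
        return ("app", subst(term[1], var, repl), subst(term[2], var, repl))
    return (tag, [subst(child, var, repl) for child in term[1]])

def expand(core, substitutions):
    """exp(K<L1/z1>...<Ll/zl>) = K[L1/z1]...[Ll/zl], applied left to right."""
    for repl, var in substitutions:
        core = subst(core, var, repl)
    return core

# bullet<z1, z2> <z2/z1> <bullet/z2>: the term substituted for z1 mentions z2,
# which is bound by the outer explicit substitution.
core = ("node", [("var", "z1"), ("var", "z2")])
expanded = expand(core, [(("var", "z2"), "z1"), (BULLET, "z2")])
```

The order of the substitutions matters: a term substituted by an inner explicit substitution may mention variables bound by outer ones, which is why the list is processed from the innermost substitution outwards.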
We also write $\exp(\mathcal{F})$ for $\{\exp(F)\mid F\in\mathcal{F}\}$ (where $\mathcal{F}$ is a set of extended terms). The following lemma, saying that we can consider ext-convergence instead of convergence, can be proved in a standard way (actually, Asada and Kobayashi have a very similar lemma [2, Lemma 18]); for completeness we attach a proof in Appendix A.

Lemma 5.1. Let $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$ be a grammar. An extended term $E$ over $(\mathcal{X},\emptyset)$ is $\mathcal{G}$-ext-convergent if and only if $\exp(E)$ is $\mathcal{G}$-convergent. In particular $\mathcal{G}$ is ext-convergent if and only if it is convergent.

We extend the transformation function to extended terms, by adding the following rule, where $E\langle L/z\rangle$ is an extended term over $(\mathcal{X},\mathcal{Z})$, and $Z\in 2^{\mathcal{Z}}$ (the first argument is always $\emptyset$, because all extended terms are of type $o$):
(9) $\mathit{tr}(\emptyset,Z,E\langle L/z\rangle)=\oplus\langle\mathit{tr}(\emptyset,Z[z\mapsto 0],E),\bullet\langle\mathit{tr}(\emptyset,Z[z\mapsto 1],E),\mathit{tr}(\emptyset,Z,L)\rangle\rangle$.

(Footnote: Asada and Kobayashi have an additional step in their proof, namely a reduction to the case of recursion-free grammars. This step turns out to be redundant, at least in the case of our transformation.)

Between ext-convergence and convergence of $\mathcal{G}^\dagger$. Once we know that convergence and ext-convergence of $\mathcal{G}$ are equivalent (cf. Lemma 5.1), it remains to prove that ext-convergence of $\mathcal{G}$ is equivalent to convergence of $\mathcal{G}^\dagger$, which is the subject of Lemma 5.2:

Lemma 5.2. Let $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$ be a grammar. An extended term $E$ over $(\mathcal{X},\emptyset)$ is $\mathcal{G}$-ext-convergent if and only if $\mathit{tr}(\emptyset,\emptyset,E)$ is $\mathcal{G}^\dagger$-convergent. In particular $\mathcal{G}$ is ext-convergent if and only if $\mathcal{G}^\dagger$ is convergent.

The remaining part of this section is devoted to a proof of this lemma. Fix a grammar $\mathcal{G}=(\mathcal{X},X_0,\mathcal{R})$. Of course the second part (concerning the grammars) follows from the first part (concerning an extended term) applied to the starting nonterminal $X_0$. It is thus enough to prove the first part. We start with the left-to-right direction (i.e., from $\mathcal{G}$-ext-convergence of $E$ to $\mathcal{G}^\dagger$-convergence of $\mathit{tr}(\emptyset,\emptyset,E)$). We need two simple auxiliary lemmata. The first of them says that the $\mathit{tr}$ function commutes with substitution:

Lemma 5.3.
Let $R[K_1/y_1,\dots,K_k/y_k]$ be a term over $(\mathcal{X},\mathcal{Z})$, let $A\in 2^{[\mathit{gar}(R)]}$, and let $Z\in 2^{\mathcal{Z}}$. Then

$$\mathit{tr}(A,Z,R[K_1/y_1,\dots,K_k/y_k])=(\mathit{tr}(A,Z,R))[\mathit{tr}(B,Z,K_i)/y^\dagger_{i,B}\mid i\in[k],B\in 2^{[\mathit{gar}(K_i)]}].$$

Proof. A straightforward induction on the structure of $R$. ∎

The second lemma says that by increasing values of the function $Z$ we can make the transformed term only more convergent:

Lemma 5.4. Let $E$ be an extended term over $(\mathcal{X},\mathcal{Z}\uplus\{z\})$, and let $Z\in 2^{\mathcal{Z}}$. If $\mathit{tr}(\emptyset,Z[z\mapsto 0],E)$ is $\mathcal{G}^\dagger$-convergent, then also $\mathit{tr}(\emptyset,Z[z\mapsto 1],E)$ is $\mathcal{G}^\dagger$-convergent.

Proof. Denote $P_0=\mathit{tr}(\emptyset,Z[z\mapsto 0],E)$ and $P_1=\mathit{tr}(\emptyset,Z[z\mapsto 1],E)$. Tracing the rules of the transformation function, we can see that $P_0$ and $P_1$ are created in the same way, with the exception that occurrences of $z$ in $E$ are transformed to $\Omega$ in $P_0$, and to $\bullet$ in $P_1$. Thus, $P_1$ can be obtained from $P_0$ by replacing some occurrences of $\Omega$ with $\bullet$. We know that $P_0$ is $\mathcal{G}^\dagger$-convergent, which means that it can be rewritten using the $\longrightarrow_{\mathcal{G}^\dagger}$ relation until reaching empty sets. Moreover, the subterms $\Omega$ (which are present in $P_0$, but not in $P_1$) cannot be reached during this rewriting, because $\Omega$ is not $\mathcal{G}^\dagger$-convergent. Thus, $P_1$ can be rewritten in exactly the same way as $P_0$, so it is also $\mathcal{G}^\dagger$-convergent. ∎

The next lemma shows how ext-reductions of $\mathcal{G}$ are reflected in $\mathcal{G}^\dagger$:

Lemma 5.5.
Let $E$ be an extended term over $(\mathcal{X},\mathcal{Z})$ and let $Z\in 2^{\mathcal{Z}}$. If $E\rightsquigarrow_{\mathcal{G}}\mathcal{F}$ and $\mathit{tr}(\emptyset,Z,F)$ is $\mathcal{G}^\dagger$-convergent for every $F\in\mathcal{F}$, then $\mathit{tr}(\emptyset,Z,E)$ is also $\mathcal{G}^\dagger$-convergent.

Proof. Induction on the definition of $E\rightsquigarrow_{\mathcal{G}}\mathcal{F}$. We analyze particular cases appearing in the definition. Missing details are given in Appendix B.

In Case (1) $E$ consists of an application of arguments to some nonterminal $X$. For simplicity of presentation, suppose that $X$ has two arguments: $y$ of positive order, and $z$ of order 0 (the general case is handled in the appendix). Then $E=X\,K\,L$, and $\mathcal{F}=\{F\}$ for $F=R[K/y,z'/z]\langle L/z'\rangle$, where $\mathcal{R}(X)=(X\,y\,z\to R)$ and $z'$ is a fresh variable of type $o$ not appearing in $\mathcal{Z}$. For $j\in\{0,1\}$ let

$$P_j=\mathit{tr}([1\mapsto j],Z,X\,K),\qquad Q_j=\mathit{tr}(\emptyset,Z[z'\mapsto j],R[K/y,z'/z]).$$

First, we prove that $P_j\longrightarrow_{\mathcal{G}^\dagger}\{Q_j\}$. By definition we have that

$$P_j=X^\dagger_{[1\mapsto j]}\,(\mathit{tr}(B,Z,K))_{B\in 2^{[\mathit{gar}(K)]}},$$

and by Lemma 5.3 we have that

$$Q_j=\mathit{tr}(\emptyset,Z[z'\mapsto j],R[z'/z])[\mathit{tr}(B,Z[z'\mapsto j],K)/y^\dagger_B\mid B\in 2^{[\mathit{gar}(K)]}]=\mathit{tr}(\emptyset,[z\mapsto j],R)[\mathit{tr}(B,Z,K)/y^\dagger_B\mid B\in 2^{[\mathit{gar}(K)]}],$$

where the second equality holds because $z'$ does not appear in $K$ and the variables from $\mathrm{dom}(Z)$ do not appear in $R$. Recalling that the rule for $X^\dagger_{[1\mapsto j]}$ is

$$X^\dagger_{[1\mapsto j]}\,(y^\dagger_B)_{B\in 2^{[\mathit{gar}(y)]}}\to\mathit{tr}(\emptyset,[z\mapsto j],R),$$

we immediately see that indeed $P_j\longrightarrow_{\mathcal{G}^\dagger}\{Q_j\}$. Having this, we recall that

$$\mathit{tr}(\emptyset,Z,E)=\oplus\langle P_0,\bullet\langle P_1,L'\rangle\rangle\quad\text{and}\quad\mathit{tr}(\emptyset,Z,F)=\oplus\langle Q_0,\bullet\langle Q_1,L'\rangle\rangle\qquad(1)$$

for appropriate $L'$ (obtained by transforming $L$). Recall that, by definition, a term $M$ is $\mathcal{G}^\dagger$-convergent if and only if $M\longrightarrow_{\mathcal{G}^\dagger}\mathcal{N}$ for some set $\mathcal{N}$ of $\mathcal{G}^\dagger$-convergent terms. Thus, the only way in which $\mathit{tr}(\emptyset,Z,F)$ can be $\mathcal{G}^\dagger$-convergent (which holds by assumption) is that either $Q_0$ is $\mathcal{G}^\dagger$-convergent, or both $Q_1$ and $L'$ are $\mathcal{G}^\dagger$-convergent. Because of the reduction $P_j\longrightarrow_{\mathcal{G}^\dagger}\{Q_j\}$ we have that either $P_0$ is $\mathcal{G}^\dagger$-convergent, or both $P_1$ and $L'$ are $\mathcal{G}^\dagger$-convergent, which implies that $\mathit{tr}(\emptyset,Z,E)$ is $\mathcal{G}^\dagger$-convergent.

In Cases (2) and (3), when $E=\bullet\langle K_1,\dots,K_k\rangle$ or $E=\oplus\langle K_1,\dots,K_k\rangle$, we have a reduction from $\mathit{tr}(\emptyset,Z,E)$ to $\{\mathit{tr}(\emptyset,Z,F)\mid F\in\mathcal{F}\}$, because $\mathit{tr}$ distributes over $\bullet\langle\dots\rangle$ and $\oplus\langle\dots\rangle$. In Cases (4) and (5) (elimination of explicit substitution) we also have similar reductions.

Finally, in Case (6) we have that $E=E_0\langle L/z\rangle$, $\mathcal{F}=\{E_1\langle L/z\rangle,\dots,E_k\langle L/z\rangle\}$, and $E_0\rightsquigarrow_{\mathcal{G}}\{E_1,\dots,E_k\}$. By definition, for every $i\in\{0,\dots,k\}$ we have that

$$\mathit{tr}(\emptyset,Z,E_i\langle L/z\rangle)=\oplus\langle P_{i,0},\bullet\langle P_{i,1},L'\rangle\rangle,\quad\text{where}\qquad(2)$$
$$P_{i,0}=\mathit{tr}(\emptyset,Z[z\mapsto 0],E_i),\qquad P_{i,1}=\mathit{tr}(\emptyset,Z[z\mapsto 1],E_i),\qquad L'=\mathit{tr}(\emptyset,Z,L).$$

Thus, $\mathit{tr}(\emptyset,Z,E_i\langle L/z\rangle)$ is $\mathcal{G}^\dagger$-convergent if and only if either $P_{i,0}$ is $\mathcal{G}^\dagger$-convergent, or both $P_{i,1}$ and $L'$ are $\mathcal{G}^\dagger$-convergent. By assumption this is the case for all $i\in[k]$, and we have to prove this for $i=0$. If for every $i\in[k]$ we have the former case (i.e., $P_{i,0}$ is $\mathcal{G}^\dagger$-convergent), by the induction hypothesis (used with the function $Z[z\mapsto 0]$) we have that $P_{0,0}$ is $\mathcal{G}^\dagger$-convergent, and we are done. In the opposite case, for some $i\in[k]$ (at least one of them) we have that both $P_{i,1}$ and $L'$ are $\mathcal{G}^\dagger$-convergent, and for the remaining $i\in[k]$ we have that $P_{i,0}$ is $\mathcal{G}^\dagger$-convergent. Using Lemma 5.4 we deduce that if $P_{i,0}$ is $\mathcal{G}^\dagger$-convergent, then also $P_{i,1}$ is $\mathcal{G}^\dagger$-convergent. Thus actually $P_{i,1}$ is $\mathcal{G}^\dagger$-convergent for every $i\in[k]$, and additionally $L'$ is $\mathcal{G}^\dagger$-convergent. By the induction hypothesis (used with the function $Z[z\mapsto 1]$) we have that $P_{0,1}$ is $\mathcal{G}^\dagger$-convergent, and we are also done. ∎

We can now conclude with the left-to-right direction of Lemma 5.2:
Lemma 5.6. Let $E$ be an extended term over $(\mathcal{X}, \emptyset)$. If $E$ is $G$-ext-convergent, then $\mathit{tr}(\emptyset, \emptyset, E)$ is $G^\dagger$-convergent.

Proof. Induction on the fact that $E$ is $G$-ext-convergent. Because $E$ is $G$-ext-convergent, $E \rightsquigarrow_G \mathcal{F}$ for some set $\mathcal{F}$ of $G$-ext-convergent extended terms, for which we can apply the induction hypothesis. The induction hypothesis says that $\mathit{tr}(\emptyset, \emptyset, F)$ is $G^\dagger$-convergent for every $F \in \mathcal{F}$. In such a situation Lemma 5.5 implies that $\mathit{tr}(\emptyset, \emptyset, E)$ is also $G^\dagger$-convergent, as required. ◀

For a proof in the opposite direction we need the following definition. We say that a term $M$ is $G^\dagger$-convergent in $n$ steps if $M \longrightarrow_{G^\dagger} \{N_1, \dots, N_k\}$, every $N_i$ is $G^\dagger$-convergent in $n_i$ steps, and $n = 1 + n_1 + \dots + n_k$ (i.e., we count 1 for the above reduction, and we sum the numbers of steps needed to reduce all $N_i$). Clearly a term $M$ is $G^\dagger$-convergent if and only if it is $G^\dagger$-convergent in $n$ steps for some $n \in \mathbb{N}$. Notice that the number $n$ is not determined by $M$ (i.e., the same term $M$ can be $G^\dagger$-convergent in $n$ steps for multiple values of $n$). We can now state the converse of Lemma 5.5:

▶ Lemma 5.7. Let $E$ be an extended term over $(\mathcal{X}, \mathcal{Z})$ and let $Z \in \{0,1\}^{\mathcal{Z}}$. If $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps and $E$ is not a variable, then there exists a set $\mathcal{F}$ of extended terms such that $E \rightsquigarrow_G \mathcal{F}$ and $\mathit{tr}(\emptyset, Z, F)$ is $G^\dagger$-convergent in fewer than $n$ steps for every $F \in \mathcal{F}$.

Proof.
Induction on the number of explicit substitutions in $E$. Depending on the shape of $E$, we have several cases. Missing details are given in Appendix C.

One case is that $E$ consists of a nonterminal $X$ with some arguments applied. For simplicity of presentation, we again suppose that $X$ has two arguments: $y$ of positive order, and $z$ of order 0. Thus, $E$ is of the form $E = X\,K\,L$. Let $X\,y\,z \to R$ be the rule for $X$, and let $z'$ be a fresh variable of type $o$ not appearing in $\mathcal{Z}$. In such a situation, taking $F = R[K/y, z'/z]\langle L/z'\rangle$ we have that $E \rightsquigarrow_G \{F\}$. Recall the terms $P_j$ and $Q_j$ (for $j \in \{0,1\}$) from the proof of Lemma 5.5. In that proof we have observed that $P_j \longrightarrow_{G^\dagger} \{Q_j\}$. But clearly this is the only way in which $P_j$ can reduce, so if $P_j$ is $G^\dagger$-convergent in $n_j$ steps, then necessarily $Q_j$ is $G^\dagger$-convergent in $n_j - 1$ steps. Now, if $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, then either $P_0$ is $G^\dagger$-convergent in $n_0 = n - 1$ steps, or both $P_1$ and $L'$ are $G^\dagger$-convergent in, respectively, $n_1$ and $n - n_1 - 2$ steps, for some $n_1 \in \mathbb{N}$. In the former case, $Q_0$ is $G^\dagger$-convergent in $n_0 - 1 = n - 2$ steps, so $\mathit{tr}(\emptyset, Z, F)$ is $G^\dagger$-convergent in $n - 1$ steps. In the latter case, $Q_1$ is $G^\dagger$-convergent in $n_1 - 1$ steps, so $\mathit{tr}(\emptyset, Z, F)$ is $G^\dagger$-convergent in $(n_1 - 1) + (n - n_1 - 2) + 2 = n - 1$ steps. Note that the head of $E$ cannot be a variable, because $E$ is not a variable, and because (by definition of an extended term) all free variables of $E$ are of type $o$.

The cases of $E = \bullet\langle K_1, \dots, K_k\rangle$ and $E = \oplus\langle K_1, \dots, K_k\rangle$ are straightforward.

It remains to assume that $E$ is an explicit substitution. If $E = z\langle L/z\rangle$, we take $\mathcal{F} = \{L\}$, and if $E = z'\langle L/z\rangle$ for $z' \neq z$, we take $\mathcal{F} = \{z'\}$ (in these two subcases we cannot use the induction hypothesis, because it does not apply to an extended term that is a single variable). Otherwise $E = E_0\langle L/z\rangle$, where $E_0$ is not a variable. Recall that $\mathit{tr}(\emptyset, Z, E) = \oplus\langle P_{0,0}, \bullet\langle P_{0,1}, L'\rangle\rangle$ for $P_{0,0}, P_{0,1}, L'$ as in the proof of Lemma 5.5. By assumption $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, so either $P_{0,0}$ is $G^\dagger$-convergent in $n_0 = n - 1$ steps, or both $P_{0,1}$ and $L'$ are $G^\dagger$-convergent in, respectively, $n_1$ and $n - n_1 - 2$ steps, for some $n_1 \in \mathbb{N}$. Let $j = 0$ in the former case and $j = 1$ in the latter case. The induction hypothesis gives us a set $\{E_1, \dots, E_k\}$ such that $E_0 \rightsquigarrow_G \{E_1, \dots, E_k\}$ and $\mathit{tr}(\emptyset, Z[z \mapsto j], E_i)$ is $G^\dagger$-convergent in fewer than $n_j$ steps for every $i \in [k]$. We then take $\mathcal{F} = \{E_1\langle L/z\rangle, \dots, E_k\langle L/z\rangle\}$. Equality (2) holds now for all $i \in \{0, \dots, k\}$. For $j = 0$ we use the fact that $\mathit{tr}(\emptyset, Z, E_i\langle L/z\rangle) \longrightarrow_{G^\dagger} \{P_{i,0}\}$, which implies that $\mathit{tr}(\emptyset, Z, E_i\langle L/z\rangle)$ is $G^\dagger$-convergent in fewer than $n_0 + 1 = n$ steps, as required. For $j = 1$ we use the fact that $\mathit{tr}(\emptyset, Z, E_i\langle L/z\rangle) \longrightarrow_{G^\dagger} \{\bullet\langle P_{i,1}, L'\rangle\}$ and $\bullet\langle P_{i,1}, L'\rangle \longrightarrow_{G^\dagger} \{P_{i,1}, L'\}$, which implies that $\mathit{tr}(\emptyset, Z, E_i\langle L/z\rangle)$ is $G^\dagger$-convergent in fewer than $n_1 + (n - n_1 - 2) + 2 = n$ steps, as required. ◀

The next lemma finishes the proof of Lemma 5.2, and thus the proof of correctness of our transformation:

▶
Lemma 5.8. Let $E$ be an extended term over $(\mathcal{X}, \emptyset)$. If $\mathit{tr}(\emptyset, \emptyset, E)$ is $G^\dagger$-convergent then $E$ is $G$-ext-convergent.

Proof. Induction on the (smallest) number $n$ such that $\mathit{tr}(\emptyset, \emptyset, E)$ is $G^\dagger$-convergent in $n$ steps. The term $E$ is not a variable, because it is an extended term over $(\mathcal{X}, \emptyset)$ (no free variables). So, by Lemma 5.7 there exists a set $\mathcal{F}$ of extended terms such that $E \rightsquigarrow_G \mathcal{F}$ and $\mathit{tr}(\emptyset, \emptyset, F)$ is $G^\dagger$-convergent in fewer than $n$ steps for every $F \in \mathcal{F}$. By the induction hypothesis every $F \in \mathcal{F}$ is $G$-ext-convergent, so by definition $E$ is also $G$-ext-convergent. ◀

We have presented a new, simple algorithm checking whether a higher-order grammar generates a nonempty language. One may ask whether this algorithm can be used in practice. Of course the complexity $n$-EXPTIME for grammars of order $n$ is unacceptably large (even if we take into account the fact that we are $n$-fold exponential only in the arity of types, not in the size of a grammar), but one has to recall that there exist tools solving the considered problem with the same worst-case complexity. The reason why these tools work is that the time they spend on "easy" inputs is much smaller than the worst-case complexity (and many "typical inputs" are indeed easy). Unfortunately, this is not the case for our algorithm: the size of the grammar resulting from our transformation is always large, even if the original grammar generated a nonempty (or empty) language for some "easy reason". Thus, our algorithm is mainly of theoretical interest.

The presented transformation preserves nonemptiness, and thus can be used to solve the nonemptiness problem for higher-order grammars. However, it seems plausible that other problems concerning higher-order grammars (higher-order recursion schemes), like model-checking against parity automata or the simultaneous unboundedness problem [7], can be solved using similar transformations. Developing such transformations is a possible direction for further work.

References

[1] Klaus Aehlig. A finite semantics of simply-typed lambda terms for infinite runs of automata. Log. Methods Comput. Sci., 3(3), 2007. doi:10.2168/LMCS-3(3:1)2007.
[2] Kazuyuki Asada and Naoki Kobayashi. Size-preserving translations from order-(n+1) word grammars to order-n tree grammars. In Zena M. Ariola, editor, volume 167 of LIPIcs, pages 22:1–22:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.FSCD.2020.22.
[3] Christopher H. Broadbent, Arnaud Carayol, C.-H. Luke Ong, and Olivier Serre. Recursion schemes and logical reflection. In Proceedings of the 25th Annual IEEE Symposium on Logic in Computer Science, LICS 2010, 11-14 July 2010, Edinburgh, United Kingdom, pages 120–129. IEEE Computer Society, 2010. doi:10.1109/LICS.2010.40.
[4] Christopher H. Broadbent and Naoki Kobayashi. Saturation-based model checking of higher-order recursion schemes. In Simona Ronchi Della Rocca, editor, Computer Science Logic 2013 (CSL 2013), CSL 2013, September 2-5, 2013, Torino, Italy, volume 23 of LIPIcs, pages 129–148. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2013. doi:10.4230/LIPIcs.CSL.2013.129.
[5] Christopher H. Broadbent and C.-H. Luke Ong. On global model checking trees generated by higher-order recursion schemes. In Luca de Alfaro, editor, Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings, volume 5504 of Lecture Notes in Computer Science, pages 107–121. Springer, 2009. doi:10.1007/978-3-642-00596-1_9.
[6] Arnaud Carayol and Olivier Serre. Collapsible pushdown automata and labeled recursion schemes: Equivalence, safety and effective selection. In Proceedings of the 27th Annual IEEE Symposium on Logic in Computer Science, LICS 2012, Dubrovnik, Croatia, June 25-28, 2012, pages 165–174. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.73.
[7] Lorenzo Clemente, Paweł Parys, Sylvain Salvati, and Igor Walukiewicz. The diagonal problem for higher-order recursion schemes is decidable. CoRR, abs/1605.00371, 2016. arXiv:1605.00371.
[8] Werner Damm. The IO- and OI-hierarchies. Theor. Comput. Sci., 20:95–207, 1982. doi:10.1016/0304-3975(82)90009-3.
[9] Matthew Hague, Roland Meyer, Sebastian Muskalla, and Martin Zimmermann. Parity to safety in polynomial time for pushdown and collapsible pushdown systems. In Igor Potapov, Paul G. Spirakis, and James Worrell, editors, volume 117 of LIPIcs, pages 57:1–57:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.MFCS.2018.57.
[10] Matthew Hague, Andrzej S. Murawski, C.-H. Luke Ong, and Olivier Serre. Collapsible pushdown automata and recursion schemes. In Proceedings of the Twenty-Third Annual IEEE Symposium on Logic in Computer Science, LICS 2008, 24-27 June 2008, Pittsburgh, PA, USA, pages 452–461. IEEE Computer Society, 2008. doi:10.1109/LICS.2008.34.
[11] Teodor Knapik, Damian Niwiński, and Paweł Urzyczyn. Higher-order pushdown trees are easy. In Mogens Nielsen and Uffe Engberg, editors, Foundations of Software Science and Computation Structures, 5th International Conference, FOSSACS 2002. Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2002 Grenoble, France, April 8-12, 2002, Proceedings, volume 2303 of Lecture Notes in Computer Science, pages 205–222. Springer, 2002. doi:10.1007/3-540-45931-6_15.
[12] Naoki Kobayashi. Types and higher-order recursion schemes for verification of higher-order programs. In Zhong Shao and Benjamin C. Pierce, editors, Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2009, Savannah, GA, USA, January 21-23, 2009, pages 416–428. ACM, 2009. doi:10.1145/1480881.1480933.
[13] Naoki Kobayashi. Model checking higher-order programs. J. ACM, 60(3):20:1–20:62, 2013. doi:10.1145/2487241.2487246.
[14] Naoki Kobayashi and C.-H. Luke Ong. A type system equivalent to the modal mu-calculus model checking of higher-order recursion schemes. In Proceedings of the 24th Annual IEEE Symposium on Logic in Computer Science, LICS 2009, 11-14 August 2009, Los Angeles, CA, USA, pages 179–188. IEEE Computer Society, 2009. doi:10.1109/LICS.2009.29.
[15] Naoki Kobayashi and C.-H. Luke Ong. Complexity of model checking recursion schemes for fragments of the modal mu-calculus. Log. Methods Comput. Sci., 7(4), 2011. doi:10.2168/LMCS-7(4:9)2011.
[16] Gregory M. Kobele and Sylvain Salvati. The IO and OI hierarchies revisited. Inf. Comput., 243:205–221, 2015. doi:10.1016/j.ic.2014.12.015.
[17] C.-H. Luke Ong. On model-checking trees generated by higher-order recursion schemes. Pages 81–90. IEEE Computer Society, 2006. doi:10.1109/LICS.2006.38.
[18] Paweł Parys. Recursion schemes and the WMSO+U logic. In Rolf Niedermeier and Brigitte Vallée, editors, volume 96 of LIPIcs, pages 53:1–53:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.STACS.2018.53.
[19] Steven J. Ramsay, Robin P. Neatherway, and C.-H. Luke Ong. A type-directed abstraction refinement approach to higher-order model checking. In Suresh Jagannathan and Peter Sewell, editors, The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '14, San Diego, CA, USA, January 20-21, 2014, pages 61–72. ACM, 2014. doi:10.1145/2535838.2535873.
[20] Sylvain Salvati and Igor Walukiewicz. Krivine machines and higher-order schemes. Inf. Comput., 239:340–355, 2014. doi:10.1016/j.ic.2014.07.012.
[21] Sylvain Salvati and Igor Walukiewicz. A model for behavioural properties of higher-order programs. In Stephan Kreutzer, editor, volume 41 of LIPIcs, pages 229–243. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPIcs.CSL.2015.229.
[22] Taku Terao and Naoki Kobayashi. A ZDD-based efficient higher-order model checking algorithm. In Jacques Garrigue, editor, Programming Languages and Systems - 12th Asian Symposium, APLAS 2014, Singapore, November 17-19, 2014, Proceedings, volume 8858 of Lecture Notes in Computer Science, pages 354–371. Springer, 2014. doi:10.1007/978-3-319-12736-1_19.

A Proof of Lemma 5.1
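The proofs in this appendix repeatedly use the expansion $\exp(E)$, which turns explicit substitutions into ordinary ones, as in the identity $\exp(E'\langle L/z\rangle) = (\exp(E'))[L/z]$ used in the proof of Lemma A.2. The following is a minimal Python sketch of that operation, not part of the formal development. The class names `Sym`, `App`, `Node`, `ESub` and the helper `subst` are hypothetical, introduced only for illustration; the sketch assumes that terms contain no binders, so substitution needs no capture avoidance.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Sym:
    """A variable or a nonterminal, identified by its name (illustrative encoding)."""
    name: str


@dataclass(frozen=True)
class App:
    """Application K L."""
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Node:
    """A node with op '•' or '⊕' and a tuple of children."""
    op: str
    kids: Tuple["Term", ...]


@dataclass(frozen=True)
class ESub:
    """Explicit substitution body<arg/var>; only allowed on the outside of a term."""
    body: "Ext"
    arg: "Term"
    var: str


Term = Union[Sym, App, Node]
Ext = Union[Sym, App, Node, ESub]


def subst(term: Term, var: str, repl: Term) -> Term:
    """Ordinary substitution term[repl/var] on a non-extended term."""
    if isinstance(term, Sym):
        return repl if term.name == var else term
    if isinstance(term, App):
        return App(subst(term.fun, var, repl), subst(term.arg, var, repl))
    return Node(term.op, tuple(subst(k, var, repl) for k in term.kids))


def exp(e: Ext) -> Term:
    """Expand explicit substitutions: exp(E'<L/z>) = (exp(E'))[L/z]."""
    if isinstance(e, ESub):
        return subst(exp(e.body), e.var, e.arg)
    return e


# Small illustrative terms (names are arbitrary).
L1 = Node("•", ())
L2 = Node("⊕", (L1,))
E = ESub(ESub(App(App(Sym("X"), Sym("z1")), Sym("z2")), L1, "z1"), L2, "z2")
```

On this encoding, `exp(E)` yields the non-extended term `X L1 L2`, and `exp` of `z⟨L/z⟩` yields `L`, matching cases (4) and (5) of Lemma A.2, where the expansion is unchanged by the ext-reduction step.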
In this section we prove Lemma 5.1. To this end, fix a grammar $G = (\mathcal{X}, X_0, \mathcal{R})$. Of course the second part (about convergence/ext-convergence of the grammar) follows from the first part (about convergence/ext-convergence of an extended term) applied to the starting nonterminal $X_0$. It is thus enough to prove the first part. We start with the left-to-right direction (i.e., from ext-convergence to convergence). We need a simple auxiliary lemma:

▶ Lemma A.1. If $M \longrightarrow_G \mathcal{N}$ and $M[K/x]$ is a valid term, then $M[K/x] \longrightarrow_G \{N[K/x] \mid N \in \mathcal{N}\}$.

Proof. A trivial case analysis on the definition of $M \longrightarrow_G \mathcal{N}$. ◀

The main ingredient of the proof is the following lemma:

▶ Lemma A.2. If $E \rightsquigarrow_G \mathcal{F}$, then either $\exp(E) \longrightarrow_G \exp(\mathcal{F})$, or $\exp(\mathcal{F}) = \{\exp(E)\}$.

Proof. Induction on the definition of $E \rightsquigarrow_G \mathcal{F}$.

(1) If $E = X\,K_1 \dots K_k\,L_1 \dots L_\ell$ and $\mathcal{F} = \{R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]\langle L_1/z'_1\rangle \dots \langle L_\ell/z'_\ell\rangle\}$, where $\ell = \mathit{gar}(X)$, then
$$\exp(E) = E \longrightarrow_G \{R[K_1/y_1, \dots, K_k/y_k, L_1/z_1, \dots, L_\ell/z_\ell]\} = \{\exp(R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]\langle L_1/z'_1\rangle \dots \langle L_\ell/z'_\ell\rangle)\} = \exp(\mathcal{F}).$$

(2) If $E = \bullet\langle K_1, \dots, K_k\rangle$ and $\mathcal{F} = \{K_1, \dots, K_k\}$, then $\exp(E) = E \longrightarrow_G \{K_1, \dots, K_k\} = \{\exp(K_1), \dots, \exp(K_k)\} = \exp(\mathcal{F})$.

(3) If $E = \oplus\langle K_1, \dots, K_k\rangle$ and $\mathcal{F} = \{K_i\}$, then $\exp(E) = E \longrightarrow_G \{K_i\} = \{\exp(K_i)\} = \exp(\mathcal{F})$.

(4) If $E = z\langle L/z\rangle$ and $\mathcal{F} = \{L\}$, then $\exp(\mathcal{F}) = \{\exp(L)\} = \{L\} = \{\exp(E)\}$.

(5) If $E = z'\langle L/z\rangle$ and $\mathcal{F} = \{z'\}$, then $\exp(\mathcal{F}) = \{\exp(z')\} = \{z'\} = \{\exp(E)\}$.

(6) Finally, if $E = E'\langle L/z\rangle$ and $\mathcal{F} = \{F'\langle L/z\rangle \mid F' \in \mathcal{F}'\}$ and $E' \rightsquigarrow_G \mathcal{F}'$, we have $\exp(E) = (\exp(E'))[L/z]$ and $\exp(\mathcal{F}) = \{(\exp(F'))[L/z] \mid F' \in \mathcal{F}'\} = \{N[L/z] \mid N \in \exp(\mathcal{F}')\}$; by the induction hypothesis we have that either $\exp(E') \longrightarrow_G \exp(\mathcal{F}')$, or $\exp(\mathcal{F}') = \{\exp(E')\}$; the latter immediately implies that $\exp(\mathcal{F}) = \{\exp(E)\}$, while the former implies $\exp(E) \longrightarrow_G \exp(\mathcal{F})$ by Lemma A.1. ◀

We can now conclude with the left-to-right direction of Lemma 5.1:

▶
Lemma A.3. If an extended term $E$ over $(\mathcal{X}, \emptyset)$ is $G$-ext-convergent, then $\exp(E)$ is $G$-convergent.

Proof. Induction on the fact that $E$ is $G$-ext-convergent. Because $E$ is $G$-ext-convergent, $E \rightsquigarrow_G \mathcal{F}$ for some set $\mathcal{F}$ of $G$-ext-convergent extended terms (for which we can apply the induction hypothesis). Using the induction hypothesis, for every extended term $F$ in $\mathcal{F}$ we obtain that $\exp(F)$ is $G$-convergent, that is, all terms in $\exp(\mathcal{F})$ are $G$-convergent. By Lemma A.2, we have that either $\exp(E) \longrightarrow_G \exp(\mathcal{F})$, or $\exp(\mathcal{F}) = \{\exp(E)\}$. In the latter case we already know that $\exp(E)$ (as an element of $\exp(\mathcal{F})$) is $G$-convergent. In the former case, we use the definition of $G$-convergence, and we also obtain that $\exp(E)$ is $G$-convergent. ◀

We now come to the opposite direction: from convergence to ext-convergence. We say that an extended term $E$ is simplified if $E$ is not of the form $z\langle L_1/z_1\rangle \dots \langle L_\ell/z_\ell\rangle$ (i.e., if the non-extended term inside all explicit substitutions is not a variable). It turns out that every extended term can be ext-reduced to a simplified one (this is shown in the proof of Lemma A.5). Then, for a simplified extended term we can find an ext-reduction corresponding to a given standard reduction:

▶ Lemma A.4. If $E$ is a simplified extended term over $(\mathcal{X}, \emptyset)$, and if $\exp(E) \longrightarrow_G \mathcal{N}$, then $E \rightsquigarrow_G \mathcal{F}$ for some $\mathcal{F}$ such that $\exp(\mathcal{F}) = \mathcal{N}$.

Proof. The extended term $E$ can be written in the form $E = E'\langle R_1/z_1\rangle \dots \langle R_s/z_s\rangle$, where $E'$ is a non-extended term. For every term $K$ let $\Xi(K) = K[R_1/z_1] \dots [R_s/z_s]$; in particular $\exp(E) = \Xi(E')$. Below, we prove that there exists $\mathcal{F}'$ such that $E' \rightsquigarrow_G \mathcal{F}'$ and $\{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \mathcal{N}$. This already gives the thesis for $\mathcal{F} = \{F'\langle R_1/z_1\rangle \dots \langle R_s/z_s\rangle \mid F' \in \mathcal{F}'\}$. Indeed, on the one hand, because of $E' \rightsquigarrow_G \mathcal{F}'$, due to rule (6) of the definition of $\rightsquigarrow_G$ (applied $s$ times), we have that
$$E = E'\langle R_1/z_1\rangle \dots \langle R_s/z_s\rangle \rightsquigarrow_G \{F'\langle R_1/z_1\rangle \dots \langle R_s/z_s\rangle \mid F' \in \mathcal{F}'\} = \mathcal{F}.$$
On the other hand, due to $\{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \mathcal{N}$, we have that
$$\exp(\mathcal{F}) = \{\exp(F'\langle R_1/z_1\rangle \dots \langle R_s/z_s\rangle) \mid F' \in \mathcal{F}'\} = \{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \mathcal{N}.$$
Thus, it remains to prove existence of the aforementioned set $\mathcal{F}'$. We have four cases depending on the shape of $E'$:

(0) The term $E'$ is of the form $y\,M_1 \dots M_m$ for some variable $y$. Then $y$ has to be one of the variables $z_1, \dots, z_s$, because the whole $E$ has no free variables; in particular $y$ is of order 0. However, this is actually impossible. Indeed, for $m = 0$ this is impossible, because $E$ is simplified (i.e., $E'$ is not a variable), and for $m \geq 1$ the variable $y$ would be of positive order.

(1) The term $E'$ is of the form $X\,K_1 \dots K_k\,L_1 \dots L_\ell$ for a nonterminal $X$, where $\ell = \mathit{gar}(X)$. Let $X\,y_1 \dots y_k\,z'_1 \dots z'_\ell \to R$ be the rule for $X$. Then $\exp(E) = X\,(\Xi(K_1)) \dots (\Xi(K_k))\,(\Xi(L_1)) \dots (\Xi(L_\ell))$, so $\exp(E) \longrightarrow_G \mathcal{N}$ implies that
$$\mathcal{N} = \{R[\Xi(K_1)/y_1, \dots, \Xi(K_k)/y_k, \Xi(L_1)/z'_1, \dots, \Xi(L_\ell)/z'_\ell]\} = \{\Xi(R[K_1/y_1, \dots, K_k/y_k, L_1/z'_1, \dots, L_\ell/z'_\ell])\},$$
where the second equality holds because $R$ does not contain the variables $z_1, \dots, z_s$. We take
$$\mathcal{F}' = \{R[K_1/y_1, \dots, K_k/y_k, z_{s+1}/z'_1, \dots, z_{s+\ell}/z'_\ell]\langle L_1/z_{s+1}\rangle \dots \langle L_\ell/z_{s+\ell}\rangle\}.$$
Then $E' \rightsquigarrow_G \mathcal{F}'$ by rule (1) of the definition of $\rightsquigarrow_G$. Simultaneously $\{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \{\Xi(R[K_1/y_1, \dots, K_k/y_k, L_1/z'_1, \dots, L_\ell/z'_\ell])\} = \mathcal{N}$.

(2) The term $E'$ is of the form $\bullet\langle K_1, \dots, K_k\rangle$. Then $\exp(E) = \bullet\langle \Xi(K_1), \dots, \Xi(K_k)\rangle$, so $\exp(E) \longrightarrow_G \mathcal{N}$ implies that $\mathcal{N} = \{\Xi(K_1), \dots, \Xi(K_k)\}$. We take $\mathcal{F}' = \{K_1, \dots, K_k\}$. Then $E' \rightsquigarrow_G \mathcal{F}'$ by rule (2) of the definition of $\rightsquigarrow_G$. Simultaneously $\exp(K_i) = K_i$ for all $i \in [k]$ (the terms $K_i$ are non-extended), so $\{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \{\Xi(K_1), \dots, \Xi(K_k)\} = \mathcal{N}$.

(3) The term $E'$ is of the form $\oplus\langle K_1, \dots, K_k\rangle$. Then $\exp(E) = \oplus\langle \Xi(K_1), \dots, \Xi(K_k)\rangle$, so $\exp(E) \longrightarrow_G \mathcal{N}$ implies that $\mathcal{N} = \{\Xi(K_i)\}$ for some $i \in [k]$. We take $\mathcal{F}' = \{K_i\}$. Then $E' \rightsquigarrow_G \mathcal{F}'$ by rule (3) of the definition of $\rightsquigarrow_G$. Simultaneously $\exp(K_i) = K_i$, so $\{\Xi(\exp(F')) \mid F' \in \mathcal{F}'\} = \{\Xi(K_i)\} = \mathcal{N}$. ◀

In the last lemma we prove the right-to-left direction of Lemma 5.1:

▶
Lemma A.5. If $\exp(E)$ is $G$-convergent, for an extended term $E$ over $(\mathcal{X}, \emptyset)$, then $E$ is $G$-ext-convergent.

Proof. Induction on the fact that $\exp(E)$ is $G$-convergent, and internally on the number of explicit substitutions in $E$. One case is that $E$ is simplified. Because $\exp(E)$ is $G$-convergent, $\exp(E) \longrightarrow_G \mathcal{N}$ for some set $\mathcal{N}$ of $G$-convergent terms (for which we can apply the induction hypothesis). By Lemma A.4, there is a set $\mathcal{F}$ such that $E \rightsquigarrow_G \mathcal{F}$ and $\exp(\mathcal{F}) = \mathcal{N}$. The latter means that $\mathcal{N} = \{\exp(F) \mid F \in \mathcal{F}\}$. Using the induction hypothesis for every term in $\mathcal{N}$, we obtain that all extended terms $F$ in $\mathcal{F}$ are $G$-ext-convergent. Due to $E \rightsquigarrow_G \mathcal{F}$, this implies that $E$ is $G$-ext-convergent.

The opposite case is that the extended term $E$ is not simplified, that is, it is of the form $E = z\langle L_1/z_1\rangle\langle L_2/z_2\rangle \dots \langle L_k/z_k\rangle$, where $z$ is one of the variables $z_1, \dots, z_k$ (the whole $E$ does not have free variables). Suppose first that $z = z_1$, and take $F = L_1\langle L_2/z_2\rangle \dots \langle L_k/z_k\rangle$. Then
$$\exp(E) = z[L_1/z_1][L_2/z_2] \dots [L_k/z_k] = L_1[L_2/z_2] \dots [L_k/z_k] = \exp(F).$$
Notice that $F$ has fewer explicit substitutions than $E$ (the term $L_1$ is non-extended), so we can use the internal induction hypothesis, obtaining that $F$ is $G$-ext-convergent. Moreover, we have that $z\langle L_1/z_1\rangle \rightsquigarrow_G \{L_1\}$ by rule (4) of the definition of $\rightsquigarrow_G$, thus also $E \rightsquigarrow_G \{F\}$ by rule (6) of this definition (used $k - 1$ times). It follows that $E$ is $G$-ext-convergent.

When $z = z_i$ for $i \geq 2$, we proceed similarly. Taking $F = z\langle L_2/z_2\rangle \dots \langle L_k/z_k\rangle$ we have that
$$\exp(E) = z[L_1/z_1][L_2/z_2] \dots [L_k/z_k] = z[L_2/z_2] \dots [L_k/z_k] = \exp(F).$$
Because $F$ has fewer explicit substitutions than $E$, we can use the internal induction hypothesis, obtaining that $F$ is $G$-ext-convergent. Moreover, we have that $z\langle L_1/z_1\rangle \rightsquigarrow_G \{z\}$ by rule (5) of the definition of $\rightsquigarrow_G$, thus also $E \rightsquigarrow_G \{F\}$ by rule (6) of this definition (used $k - 1$ times). It follows that $E$ is $G$-ext-convergent. ◀

B Additional details for the proof of Lemma 5.5
We now complement the proof of Lemma 5.5 with missing details. Recall that we are given an extended term $E$ over $(\mathcal{X}, \mathcal{Z})$ and a function $Z \in \{0,1\}^{\mathcal{Z}}$. Knowing that $E \rightsquigarrow_G \mathcal{F}$ and that $\mathit{tr}(\emptyset, Z, F)$ is $G^\dagger$-convergent for every $F \in \mathcal{F}$, we have to prove that $\mathit{tr}(\emptyset, Z, E)$ is also $G^\dagger$-convergent.

As already said, we proceed by induction on the definition of $E \rightsquigarrow_G \mathcal{F}$, and we analyze particular cases of this definition.

(1) Suppose that $E = X\,K_1 \dots K_k\,L_1 \dots L_\ell$ and
$$\mathcal{F} = \{R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]\langle L_1/z'_1\rangle \dots \langle L_\ell/z'_\ell\rangle\},$$
where $\ell = \mathit{gar}(X)$, and $\mathcal{R}(X) = (X\,y_1 \dots y_k\,z_1 \dots z_\ell \to R)$, and $z'_1, \dots, z'_\ell$ are fresh variables of type $o$ not appearing in $\mathcal{Z}$. For every $s \in \{0, \dots, \ell\}$ and every function $A \in \{0,1\}^{[\ell - s]}$, let
$$P_{s,A} = \mathit{tr}(A, Z, X\,K_1 \dots K_k\,L_1 \dots L_s),$$
$$Z_{s,A} = Z[z'_i \mapsto A(\ell + 1 - i) \mid i \in \{s+1, s+2, \dots, \ell\}], \text{ and}$$
$$Q_{s,A} = \mathit{tr}(\emptyset, Z_{s,A}, R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]\langle L_1/z'_1\rangle \dots \langle L_s/z'_s\rangle).$$
We prove, by induction on $s$, that if $Q_{s,A}$ is $G^\dagger$-convergent then also $P_{s,A}$ is $G^\dagger$-convergent. For $s = \ell$ and $A = \emptyset$ this gives the thesis (because $\mathit{tr}(\emptyset, Z, E) = P_{\ell,\emptyset}$ and $\{\mathit{tr}(\emptyset, Z, F) \mid F \in \mathcal{F}\} = \{Q_{\ell,\emptyset}\}$).

Suppose first that $s = 0$. Then
$$P_{s,A} = \mathit{tr}(A, Z, X\,K_1 \dots K_k) = X^\dagger_A\,(\mathit{tr}(B, Z, K_1))_{B \in \{0,1\}^{[\mathit{gar}(K_1)]}} \dots (\mathit{tr}(B, Z, K_k))_{B \in \{0,1\}^{[\mathit{gar}(K_k)]}},$$
and, by Lemma 5.3,
$$Q_{s,A} = \mathit{tr}(\emptyset, Z_{s,A}, R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]) = \mathit{tr}(\emptyset, Z_{s,A}, R[z'_1/z_1, \dots, z'_\ell/z_\ell][K_1/y_1, \dots, K_k/y_k])$$
$$= (\mathit{tr}(\emptyset, Z_{s,A}, R[z'_1/z_1, \dots, z'_\ell/z_\ell]))\,[\mathit{tr}(B, Z_{s,A}, K_i)/y^\dagger_{i,B} \mid i \in [k], B \in \{0,1\}^{[\mathit{gar}(K_i)]}].$$
Because the only variables from $\mathrm{dom}(Z_{s,A})$ that appear in $R[z'_1/z_1, \dots, z'_\ell/z_\ell]$ are $z'_1, \dots, z'_\ell$, we have that
$$\mathit{tr}(\emptyset, Z_{s,A}, R[z'_1/z_1, \dots, z'_\ell/z_\ell]) = \mathit{tr}(\emptyset, [z'_i \mapsto A(\ell + 1 - i) \mid i \in [\ell]], R[z'_1/z_1, \dots, z'_\ell/z_\ell]) = \mathit{tr}(\emptyset, [z_i \mapsto A(\ell + 1 - i) \mid i \in [\ell]], R).$$
Likewise, because $z'_1, \dots, z'_\ell$ do not appear in $K_1, \dots, K_k$, we have that $\mathit{tr}(B, Z_{s,A}, K_i) = \mathit{tr}(B, Z, K_i)$ for all $i \in [k]$ and $B \in \{0,1\}^{[\mathit{gar}(K_i)]}$. In consequence,
$$Q_{s,A} = (\mathit{tr}(\emptyset, [z_i \mapsto A(\ell + 1 - i) \mid i \in [\ell]], R))\,[\mathit{tr}(B, Z, K_i)/y^\dagger_{i,B} \mid i \in [k], B \in \{0,1\}^{[\mathit{gar}(K_i)]}].$$
Recall that the rule for $X^\dagger_A$ is
$$X^\dagger_A\,(y^\dagger_{1,B})_{B \in \{0,1\}^{[\mathit{gar}(y_1)]}} \dots (y^\dagger_{k,B})_{B \in \{0,1\}^{[\mathit{gar}(y_k)]}} \to \mathit{tr}(\emptyset, [z_i \mapsto A(\ell + 1 - i) \mid i \in [\ell]], R),$$
thus $P_{s,A} \longrightarrow_{G^\dagger} \{Q_{s,A}\}$; it follows that if $Q_{s,A}$ is $G^\dagger$-convergent then also $P_{s,A}$ is $G^\dagger$-convergent.

Next, suppose that $s \geq 1$. Let us denote
$$P_0 = P_{s-1, A[\ell+1-s \mapsto 0]}, \quad Q_0 = Q_{s-1, A[\ell+1-s \mapsto 0]}, \quad L' = \mathit{tr}(\emptyset, Z, L_s),$$
$$P_1 = P_{s-1, A[\ell+1-s \mapsto 1]}, \quad Q_1 = Q_{s-1, A[\ell+1-s \mapsto 1]}.$$
Simultaneously $L' = \mathit{tr}(\emptyset, Z_{s,A}, L_s)$, because the variables $z'_{s+1}, z'_{s+2}, \dots, z'_\ell$ do not appear in $L_s$. By definition we have that
$$P_{s,A} = \oplus\langle P_0, \bullet\langle P_1, L'\rangle\rangle \quad\text{and}\quad Q_{s,A} = \oplus\langle Q_0, \bullet\langle Q_1, L'\rangle\rangle.$$
Recall that, by definition, a term $M$ is $G^\dagger$-convergent if and only if $M \longrightarrow_{G^\dagger} \mathcal{N}$ for some set $\mathcal{N}$ of $G^\dagger$-convergent terms. We can have $Q_{s,A} \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{Q_0\}$ and for $\mathcal{N} = \{\bullet\langle Q_1, L'\rangle\}$, and we can have $\bullet\langle Q_1, L'\rangle \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{Q_1, L'\}$. By assumption $Q_{s,A}$ is $G^\dagger$-convergent, so either $Q_0$ is $G^\dagger$-convergent, or both $Q_1$ and $L'$ are $G^\dagger$-convergent. In the former case, $P_0$ is $G^\dagger$-convergent by the induction hypothesis, and we have $P_{s,A} \longrightarrow_{G^\dagger} \{P_0\}$, so $P_{s,A}$ is $G^\dagger$-convergent, as required. In the latter case, $P_1$ is $G^\dagger$-convergent by the induction hypothesis, and we have $\bullet\langle P_1, L'\rangle \longrightarrow_{G^\dagger} \{P_1, L'\}$ and $P_{s,A} \longrightarrow_{G^\dagger} \{\bullet\langle P_1, L'\rangle\}$, so also $P_{s,A}$ is $G^\dagger$-convergent, as required.

(2) Suppose that $E = \bullet\langle K_1, \dots, K_k\rangle$ and $\mathcal{F} = \{K_1, \dots, K_k\}$. Then, by definition,
$$\mathit{tr}(\emptyset, Z, E) = \bullet\langle \mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\rangle \longrightarrow_{G^\dagger} \{\mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\}.$$
By assumption, elements of the latter set are $G^\dagger$-convergent, so also $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent.

(3) Suppose that $E = \oplus\langle K_1, \dots, K_k\rangle$ and $\mathcal{F} = \{K_i\}$ for some $i \in [k]$. Then, by definition,
$$\mathit{tr}(\emptyset, Z, E) = \oplus\langle \mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\rangle \longrightarrow_{G^\dagger} \{\mathit{tr}(\emptyset, Z, K_i)\}.$$
By assumption $\mathit{tr}(\emptyset, Z, K_i)$ is $G^\dagger$-convergent, so also $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent.

(4) Suppose that $E = z\langle L/z\rangle$ and $\mathcal{F} = \{L\}$. Then
$$\mathit{tr}(\emptyset, Z, E) = \oplus\langle \Omega, \bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle\rangle \longrightarrow_{G^\dagger} \{\bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle\},$$
$$\bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle \longrightarrow_{G^\dagger} \{\bullet, \mathit{tr}(\emptyset, Z, L)\}, \quad\text{and}\quad \bullet \longrightarrow_{G^\dagger} \emptyset.$$
By assumption $\mathit{tr}(\emptyset, Z, L)$ is $G^\dagger$-convergent, so also $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent.

(5) Suppose that $E = z'\langle L/z\rangle$ and $\mathcal{F} = \{z'\}$, where $z' \neq z$. Denote $P = \mathit{tr}(\emptyset, Z, z')$; we simultaneously have $P = \mathit{tr}(\emptyset, Z[z \mapsto 0], z') = \mathit{tr}(\emptyset, Z[z \mapsto 1], z')$. Then
$$\mathit{tr}(\emptyset, Z, E) = \oplus\langle P, \bullet\langle P, \mathit{tr}(\emptyset, Z, L)\rangle\rangle \longrightarrow_{G^\dagger} \{P\}.$$
By assumption $P$ is $G^\dagger$-convergent, so also $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent.

(6) The last case, when $E = E_0\langle L/z\rangle$, $\mathcal{F} = \{E_1\langle L/z\rangle, \dots, E_k\langle L/z\rangle\}$, and $E_0 \rightsquigarrow_G \{E_1, \dots, E_k\}$, was completely resolved in Section 5, so we do not repeat the proof here.

C Additional details for the proof of Lemma 5.7
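Throughout this appendix, steps are counted as in the definition preceding Lemma 5.7: one step for a reduction $M \longrightarrow_{G^\dagger} \{N_1, \dots, N_k\}$, plus the steps needed by every $N_i$. As a reminder of how this counting behaves, here is a small Python sketch, not part of the proof. It handles only terms built from $\bullet\langle\dots\rangle$ and $\oplus\langle\dots\rangle$ (no nonterminals), and it returns the smallest possible $n$, since the definition allows several values. The names `And`, `Or` and `min_steps` are hypothetical; the sketch assumes that $\bullet$ is the nullary $\bullet\langle\rangle$, that $\Omega$ is the nullary $\oplus\langle\rangle$, and that each listed child is counted once.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class And:
    """•⟨K_1, ..., K_k⟩: reduces in one step to the set of all its children."""
    kids: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Or:
    """⊕⟨K_1, ..., K_k⟩: reduces in one step to any single child."""
    kids: Tuple["Term", ...] = ()


Term = Union[And, Or]


def min_steps(term: Term) -> Optional[int]:
    """Smallest n such that term is convergent in n steps, or None if it never converges."""
    counts = [min_steps(k) for k in term.kids]
    if isinstance(term, And):
        # All children must converge; count 1 for this reduction plus all of theirs.
        if any(c is None for c in counts):
            return None
        return 1 + sum(counts)
    # ⊕: one convergent child suffices; Ω = ⊕⟨⟩ has no reduction at all.
    finite = [c for c in counts if c is not None]
    return 1 + min(finite) if finite else None


BULLET = And()   # • reduces to the empty set, so it converges in 1 step
OMEGA = Or()     # Ω does not converge
L = BULLET       # stand-in for tr(∅, Z, L); any convergent term would do
case4 = Or((OMEGA, And((BULLET, L))))  # shape of tr(∅, Z, z⟨L/z⟩) from case (4)
```

With these stand-ins, `min_steps(case4)` exceeds `min_steps(L)` by exactly 3: one step for $\oplus$, one for $\bullet\langle\bullet, \cdot\rangle$, and one for the inner $\bullet$, which is the accounting used in case (4) below.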
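In this section we give missing details for the proof of Lemma 5.7. Recall that we are given an extended term $E$ over $(\mathcal{X}, \mathcal{Z})$, which is not a variable, and a function $Z \in \{0,1\}^{\mathcal{Z}}$. Knowing that $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, we have to prove that there exists a set $\mathcal{F}$ of extended terms such that $E \rightsquigarrow_G \mathcal{F}$ and $\mathit{tr}(\emptyset, Z, F)$ is $G^\dagger$-convergent in fewer than $n$ steps for every $F \in \mathcal{F}$.

The proof is by induction on the number of explicit substitutions in $E$. Depending on the shape of $E$, we have six cases.

(1) Suppose that $E$ starts with an application. Then necessarily it can be written as $E = X\,K_1 \dots K_k\,L_1 \dots L_\ell$, where $\ell = \mathit{gar}(X)$. In particular, notice that instead of the nonterminal $X$ we cannot have a variable, because (by definition of an extended term) all variables in $\mathcal{Z}$ are of type $o$. Let $X\,y_1 \dots y_k\,z_1 \dots z_\ell \to R$ be the rule for $X$, and let $z'_1, \dots, z'_\ell$ be fresh variables of type $o$ not appearing in $\mathcal{Z}$. In such a situation we have that $E \rightsquigarrow_G \{F\}$ for
$$F = R[K_1/y_1, \dots, K_k/y_k, z'_1/z_1, \dots, z'_\ell/z_\ell]\langle L_1/z'_1\rangle \dots \langle L_\ell/z'_\ell\rangle.$$
For every $s \in \{0, \dots, \ell\}$ and every function $A \in \{0,1\}^{[\ell - s]}$, let us define $P_{s,A}$, $Z_{s,A}$, and $Q_{s,A}$ as in Appendix B. We prove, by induction on $s$, that if $P_{s,A}$ is $G^\dagger$-convergent in $n_s$ steps (for some $n_s \in \mathbb{N}$) then $Q_{s,A}$ is $G^\dagger$-convergent in $n_s - 1$ steps. For $s = \ell$ and $A = \emptyset$ this gives the thesis, taking $\mathcal{F} = \{F\}$ (because $\mathit{tr}(\emptyset, Z, E) = P_{\ell,\emptyset}$ and $\mathit{tr}(\emptyset, Z, F) = Q_{\ell,\emptyset}$).

Suppose first that $s = 0$. Recall that in the proof of Lemma 5.5 we have observed that $P_{s,A} \longrightarrow_{G^\dagger} \{Q_{s,A}\}$. Actually, if $P_{s,A} \longrightarrow_{G^\dagger} \mathcal{N}$ then necessarily $\mathcal{N} = \{Q_{s,A}\}$ (because $P_{s,A}$ is a nonterminal with applied arguments, and in this case the reduction is completely deterministic). By assumption $P_{s,A}$ is $G^\dagger$-convergent in $n_s$ steps, which, by definition, immediately implies that $Q_{s,A}$ is $G^\dagger$-convergent in $n_s - 1$ steps.

Next, suppose that $s \geq 1$. Using the definition of $P_0$, $P_1$, $Q_0$, $Q_1$, and $L'$ from Appendix B, we have that
$$P_{s,A} = \oplus\langle P_0, \bullet\langle P_1, L'\rangle\rangle \quad\text{and}\quad Q_{s,A} = \oplus\langle Q_0, \bullet\langle Q_1, L'\rangle\rangle.$$
We can have $P_{s,A} \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{P_0\}$ and for $\mathcal{N} = \{\bullet\langle P_1, L'\rangle\}$, and we can have $\bullet\langle P_1, L'\rangle \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{P_1, L'\}$. By assumption $P_{s,A}$ is $G^\dagger$-convergent in $n_s$ steps, so either $P_0$ is $G^\dagger$-convergent in $n_{s-1} = n_s - 1$ steps, or both $P_1$ and $L'$ are $G^\dagger$-convergent in, respectively, $n_{s-1}$ and $n_s - n_{s-1} - 2$ steps, for some $n_{s-1} \in \mathbb{N}$. In the former case, $Q_0$ is $G^\dagger$-convergent in $n_{s-1} - 1 = n_s - 2$ steps by the induction hypothesis, and we have $Q_{s,A} \longrightarrow_{G^\dagger} \{Q_0\}$, so $Q_{s,A}$ is $G^\dagger$-convergent in $n_s - 1$ steps. In the latter case, $Q_1$ is $G^\dagger$-convergent in $n_{s-1} - 1$ steps by the induction hypothesis, and we have $\bullet\langle Q_1, L'\rangle \longrightarrow_{G^\dagger} \{Q_1, L'\}$ and $Q_{s,A} \longrightarrow_{G^\dagger} \{\bullet\langle Q_1, L'\rangle\}$, so $Q_{s,A}$ is $G^\dagger$-convergent in $(n_{s-1} - 1) + (n_s - n_{s-1} - 2) + 2 = n_s - 1$ steps.

(2) Suppose that $E$ starts with $\bullet$. Then it can be written as $E = \bullet\langle K_1, \dots, K_k\rangle$, and, by definition, $\mathit{tr}(\emptyset, Z, E) = \bullet\langle \mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\rangle$, and $\mathit{tr}(\emptyset, Z, E) \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{\mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\}$. We know that $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, so necessarily the terms $\mathit{tr}(\emptyset, Z, K_i)$ for $i \in [k]$ are $G^\dagger$-convergent in $n_i$ steps, for some numbers $n_i$ such that $n_1 + \dots + n_k + 1 = n$. In particular all $n_i$ are smaller than $n$, so $\mathcal{F} = \{K_1, \dots, K_k\}$ satisfies the thesis, because $E \rightsquigarrow_G \mathcal{F}$, by definition.

(3) Suppose that $E$ starts with $\oplus$. Then it can be written as $E = \oplus\langle K_1, \dots, K_k\rangle$, and, by definition, $\mathit{tr}(\emptyset, Z, E) = \oplus\langle \mathit{tr}(\emptyset, Z, K_1), \dots, \mathit{tr}(\emptyset, Z, K_k)\rangle$, and $\mathit{tr}(\emptyset, Z, E) \longrightarrow_{G^\dagger} \mathcal{N}$ only when $\mathcal{N} = \{\mathit{tr}(\emptyset, Z, K_i)\}$ for some $i \in [k]$. We know that $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, so necessarily $\mathit{tr}(\emptyset, Z, K_i)$, for some $i \in [k]$, is $G^\dagger$-convergent in $n - 1$ steps. In consequence $\mathcal{F} = \{K_i\}$ satisfies the thesis, because $E \rightsquigarrow_G \mathcal{F}$, by definition.

(4) Suppose that $E = z\langle L/z\rangle$. Then $\mathit{tr}(\emptyset, Z, E) = \oplus\langle \Omega, \bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle\rangle$. Notice that $\mathit{tr}(\emptyset, Z, E) \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{\Omega\}$ and for $\mathcal{N} = \{\bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle\}$. In turn, $\Omega$ is not $G^\dagger$-convergent (in any number of steps), and $\bullet\langle \bullet, \mathit{tr}(\emptyset, Z, L)\rangle \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{\bullet, \mathit{tr}(\emptyset, Z, L)\}$, and $\bullet \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \emptyset$. We know that $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, so, by the above, $\mathit{tr}(\emptyset, Z, L)$ is $G^\dagger$-convergent in $n - 3$ steps. In consequence $\mathcal{F} = \{L\}$ satisfies the thesis, because $E \rightsquigarrow_G \mathcal{F}$, by definition.

(5) Suppose that $E = z'\langle L/z\rangle$ for some $z' \neq z$. Denote $P = \mathit{tr}(\emptyset, Z, z')$; we simultaneously have $P = \mathit{tr}(\emptyset, Z[z \mapsto 0], z') = \mathit{tr}(\emptyset, Z[z \mapsto 1], z')$, so $\mathit{tr}(\emptyset, Z, E) = \oplus\langle P, \bullet\langle P, \mathit{tr}(\emptyset, Z, L)\rangle\rangle$. Notice that $\mathit{tr}(\emptyset, Z, E) \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{P\}$ and for $\mathcal{N} = \{\bullet\langle P, \mathit{tr}(\emptyset, Z, L)\rangle\}$, and $\bullet\langle P, \mathit{tr}(\emptyset, Z, L)\rangle \longrightarrow_{G^\dagger} \mathcal{N}$ only for $\mathcal{N} = \{P, \mathit{tr}(\emptyset, Z, L)\}$. We know that $\mathit{tr}(\emptyset, Z, E)$ is $G^\dagger$-convergent in $n$ steps, so, by the above, $P$ is $G^\dagger$-convergent in fewer than $n$ steps. In consequence $\mathcal{F} = \{z'\}$ satisfies the thesis, because $E \rightsquigarrow_G \mathcal{F}$, by definition.

(6) Finally, suppose that $E = E_0\langle L/z\rangle$, where $E_0$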