Linear-Bounded Composition of Tree-Walking Tree Transducers: Linear Size Increase and Complexity∗
Joost Engelfriet† · Kazuhiro Inaba‡ · Sebastian Maneth§

Abstract
Compositions of tree-walking tree transducers form a hierarchy with respect to the number of transducers in the composition. As main technical result it is proved that any such composition can be realized as a linear-bounded composition, which means that the sizes of the intermediate results can be chosen to be at most linear in the size of the output tree. This has consequences for the expressiveness and complexity of the translations in the hierarchy. First, if the computed translation is a function of linear size increase, i.e., the size of the output tree is at most linear in the size of the input tree, then it can be realized by just one, deterministic, tree-walking tree transducer. For compositions of deterministic transducers it is decidable whether or not the translation is of linear size increase. Second, every composition of deterministic transducers can be computed in deterministic linear time on a RAM and in deterministic linear space on a Turing machine, measured in the sum of the sizes of the input and output tree. Similarly, every composition of nondeterministic transducers can be computed in simultaneous polynomial time and linear space on a nondeterministic Turing machine. Their output tree languages are deterministic context-sensitive, i.e., can be recognized in deterministic linear space on a Turing machine. The membership problem for compositions of nondeterministic translations is nondeterministic polynomial time and deterministic linear space. All the above results also hold for compositions of macro tree transducers. The membership problem for the composition of a nondeterministic and a deterministic tree-walking tree translation (for a nondeterministic IO macro tree translation) is log-space reducible to a context-free language, whereas the membership problem for the composition of a deterministic and a nondeterministic tree-walking tree translation (for a nondeterministic OI macro tree translation) is possibly NP-complete.
∗ Published at https://link.springer.com/article/10.1007/s00236-019-00360-8
† LIACS, Leiden University, P.O. Box 9512, 2300 RA Leiden, the Netherlands; email: [email protected]
‡ Google Japan G.K., Tokyo, Japan; email: [email protected]
§ Department of Mathematics and Informatics, Universität Bremen, P.O. Box 330 440, 28334 Bremen, Germany; email: [email protected]

1 Introduction
Tree transducers are used, e.g., in compiler theory or, more generally, the theory of syntax-directed semantics of context-free languages [39], and in the theory of XML queries and XML document transformation [69, 47]. One of the most basic types of tree transducer is the top-down tree transducer (in short, tt↓). It is a finite-state device that walks top-down on the input tree, from parent to child, possibly branching into parallel copies of itself at each step (thus allowing the transducer to visit all children of the parent). During this process, the output tree is generated top-down. The tt↓ has been generalized in two different ways. By allowing it to walk also bottom-up, from child to parent, still possibly branching at every step and still generating the output tree top-down, one obtains the tree-walking tree transducer (in short, tt). On the other hand, restricting its walk to be top-down but allowing its states to have parameters of type output tree, one obtains the macro tree transducer (in short, mt). In general we consider nondeterministic transducers, with deterministic transducers as an important special case (abbreviated as dtt↓, dtt, and dmt).

To turn the tt↓ into a more flexible model of tree transformation, it was enhanced with the feature of regular look-ahead, which means that it can test whether or not the subtree at the current node of the input tree belongs to a given regular tree language. The mt already has the ability to implement regular look-ahead. Since both the enhanced tt↓ and the mt process the input tree top-down, they can also implement "regular look-around", which means that they can test arbitrary regular properties of the current node of the input tree. More precisely, they can test whether the input tree, in which the current node is marked, belongs to a given regular tree language.
Such regular look-around tests are also called mso tests, because they can be expressed by formulas of monadic second-order logic with one free node variable. The tt, as defined in [63], does not have regular look-ahead or look-around. One of the drawbacks of this is that the tt cannot recognize all regular tree languages without branching [9]. Hence, from now on, we assume that the tt (and the tt↓) is enhanced with regular look-around, i.e., with regular tests of the current input node. The resulting tt formalism is a quite robust, flexible, and intuitive model of tree transformation.

The tt and mt, generalizations of the tt↓, are closely related, in particular in the deterministic case. In fact, every dtt can be simulated by a dmt, whereas every dmt can be simulated by a composition of two dtt's. Thus, every composition of dtt's can be realized by a composition of dmt's, and vice versa. Compositions of dtt's form a proper hierarchy, in an obvious way. A single dtt is at most of exponential size increase, which means that the size of the output tree is at most exponential in the size of the input tree. However, a composition of two dtt's can be of double exponential size increase. In general, compositions of k dtt's are at most, and can be, of k-fold exponential size increase. Compositions of dmt's form a proper hierarchy by a similar argument. For nondeterministic tt's and mt's the situation is similar but more complicated. Every mt can be simulated by a composition of two tt's.

(Footnote: The name "tree-walking tree transducer" was introduced in [26]. The adjective "tree-walking" stands for the fact that the transducer walks on the input tree (just as the tree-walking automaton of [2]). The tt is the generalization to trees of the two-way finite-state string transducer, which walks on its input string in both directions and produces the output string one-way from left to right. Note that "tree-walking" and "two-way" alliterate.)
However, as opposed to tt's, mt's are always finitary, which means that for every given input tree an mt computes finitely many output trees.

In this paper we investigate compositions of tt's (and hence of mt's) with respect to their expressivity and their complexity. Our main technical result is that every composition of tt's can be realized by a linear-bounded composition of tt's, which means that, when computing an output tree from an input tree, the intermediate results can be chosen in such a way that their sizes are at most linear in the size of the output tree. More precisely, a composition of two transducers (for simplicity) is linear-bounded if there is a constant c such that for every pair (t, s) of an input tree t and output tree s in the composed translation there is an intermediate tree r (meaning that (t, r) and (r, s) are in the first and second translation, respectively) such that the size of r is at most c times the size of s. Intuitively, to compute s from t there is no need to consider intermediate results that are much larger than s. If both transducers are deterministic it means that for every input tree t in the domain of the composed translation the size of the unique intermediate tree r is at most linear in the size of the unique output tree s.

To prove that every composition of two tt's can be realized by a linear-bounded composition of two tt's, we first show that every tt can be decomposed into a tt↓ that "prunes" the input tree, followed by a tt that is "productive" on at least one of the intermediate trees generated by the tt↓, which means that it uses each leaf and each monadic node of that intermediate tree in order to generate the output tree. Productivity guarantees that the composition of these two transducers is linear-bounded.
We also prove that the composition of an arbitrary tt with a "pruning" top-down tt can be realized by one tt. Thus, when two tt's are composed, the second tt can split off the pruning tt↓ (to the left), which can be absorbed (to the right) by the first tt. The composition of the resulting two tt's is then linear-bounded. This also holds for deterministic transducers, in which case the pruning tt is also deterministic. Similar results were presented for macro tree transducers in [58, Section 3] and [51, Section 4].

Thus, roughly speaking, our main technical result provides a method to implement compositions of tt's in such a way that the generation of superfluous nodes, i.e., nodes on which a tt just walks around without producing any output, is avoided by pruning those superfluous parts from the intermediate trees. As such it can be viewed as a static garbage collection procedure, and leads, in principle, to algorithms for automatic compiler and XML query optimization. Since tt's are essentially finite-state automata walking on trees, it is not really surprising that only a linearly bounded amount of intermediate information is useful to the final output. However, proving this rigorously requires quite some effort. In particular, the subcomputations of the tt during which it does not produce output will be determined by regular look-around.

The above method can be used to obtain results on both the expressivity and the complexity of compositions of tt's, as discussed in the next paragraphs.

Expressivity.
We have seen above that compositions of tt's can be of k-fold exponential size increase. However, many real world tree transformations are of linear size increase. We prove that the hierarchy of compositions of deterministic tt's collapses when restricted to translations of linear size increase: every composition of dtt's that is of linear size increase can be realized by just one dtt. We also show that it is decidable whether or not a composition of dtt's is of linear size increase. This means that a compiler or XML query, no matter how inefficiently programmed in several phases, can be realized in one efficient phase, provided it is of linear size increase. In fact, as we will see below, that single phase can be executed in linear time. More theoretically, we additionally prove that a function that can be realized by a composition of nondeterministic tt's, can also be realized by a composition of deterministic tt's, and hence by one deterministic tt if that function is of linear size increase. Thus, the only (functional) tree transformations that can be realized by a composition of tt's but not by a single tt, are tree transformations of superlinear size increase.

The proof of the collapse of the hierarchy of compositions of dtt's is based on the known fact that every dtt of linear size increase can be realized by a dtt that is "single-use", which means that it never visits a node of the input tree twice in the same state. In fact, it is proved in [29, 32] that even dmt's of linear size increase can be realized by single-use dtt's. Vice versa, it is obvious that every single-use dtt is of linear size increase. In [7] it is shown that single-use dtt's have the same power as deterministic mso tree transducers, which use formulas of monadic second-order logic to define the output tree in terms of the input tree (see [13, 14]).

By our main technical result, we may always assume that a composition of two dtt's is linear-bounded.
If the composition is of linear size increase, then the first dtt is obviously also of linear size increase, and can therefore be realized by a single-use dtt. We also prove that the composition of a single-use dtt with an arbitrary dtt can be realized by one dtt. Thus, altogether, if the composition of two dtt's is of linear size increase, then it can be realized by a single-use dtt. This argument can easily be turned into an inductive proof for a composition of any number of dtt's.

Complexity.
We first consider deterministic tt's. The translation realized by a deterministic tt can be computed on a RAM in linear time, in the sum of the sizes of the input and output tree. With respect to space, we prove that it can be computed on a deterministic Turing machine in linear space (again, in the sum of the sizes of the input and output tree). Since we may assume by our main technical result that the sizes of the intermediate results are at most linear in the size of the output tree, it should be clear that these facts also hold for compositions of dtt's. We also consider output tree languages, i.e., the images of a regular tree language under a composition of dtt's. Since the regular tree languages are closed under prunings, our technical decomposition result now implies that these output languages are in DSPACE(n), i.e., can be recognized by a Turing machine in deterministic linear space (or, in other words, are deterministic context-sensitive). Since the yield of a tree can be computed by a dtt (representing it by a monadic tree), the output string languages, which are the yields of the output tree languages, are also in DSPACE(n). The languages in the well-known io-hierarchy are examples of such output languages. For compositions of top-down tree transducers (even nondeterministic ones) this result on output languages was proved in [4], using a technical result very similar to ours.

Our results on nondeterministic tt's (and their proofs) are very similar to those for dtt's. The translation realized by a composition of tt's can be computed by a nondeterministic Turing machine in simultaneous polynomial time and linear space (in the sum of the sizes of the input and output tree). The corresponding output languages can be recognized by such a Turing machine and hence are in NPTIME.
Using the results on the membership problem for compositions of tt's discussed in the next paragraph, we generalize the result of [4] and prove that these output languages are even in DSPACE(n), which means that they are deterministic context-sensitive. The languages in the well-known oi-hierarchy are examples of such output languages.

Finally, we consider the membership problem for compositions of tt's, which asks whether or not a given pair (t, s) of input tree t and output tree s belongs to the composed translation. It follows easily from the above complexity results that for (non)deterministic tt's the problem is decidable in (non)deterministic polynomial time and in (non)deterministic linear space. For the special case of the composition of a nondeterministic tt with a deterministic tt we prove that the problem is even in LOGCFL, i.e., log-space reducible to a context-free language, and hence in PTIME and DSPACE(log n). From this we conclude that for nondeterministic tt's the problem is even decidable in deterministic linear space. However, for the special case of the composition of a deterministic tt with a nondeterministic one, the problem can be NP-complete. From the two special cases we obtain that the membership problem for a (single) nondeterministic macro tree transducer is in LOGCFL for io macro tree transducers (strengthening the result in [52] where it was shown to be in PTIME), whereas it can be NP-complete for oi macro tree transducers.

Structure of the paper.
The reader is assumed to be familiar with the basics of formal language theory, in particular tree language theory, and complexity theory. The only formalisms used are tree-walking tree transducers (tt's, of course), top-down tree transducers (tt↓'s, as a special case of tt's), context-free grammars, regular tree grammars, and finite-state tree automata. Results on macro tree transducers are taken from the literature.

The main results are proved in Sections 8 to 12. Section 2 contains a number of preliminary notions, in particular linear-bounded composition, linear size increase, and regular look-around. In Section 3 we define the tree-walking tree transducer (with regular look-around), together with some of its special cases such as top-down and single-use. A tt that does not use regular look-around tests is called "local". A "pruning" tt is a tt↓ that, roughly speaking, removes or relabels each node of the input tree and possibly deletes several of its children (together with their descendants). After giving two examples we present the composition hierarchy of dtt's and end the section with some elementary syntactic properties of tt's. In Section 4 it is shown how to separate the regular look-around from a tt and incorporate it into another tt. For instance, every tt can be decomposed into a deterministic pruning tt↓ that just relabels the nodes of the input tree (and hence does not really "prune"), followed by a local tt. We also state the fact that the domain of a tt is a regular tree language. Consequently, it is possible to define the regular tests of a tt as domains of other tt's, which is a convenient technical tool. Section 5 contains three composition results. We prove that the composition of a tt with a pruning tt↓ can be realized by a tt (such that determinism is preserved).
Together with the above-mentioned decomposition, this implies for instance that in a composition of two tt's, the second tt can always be assumed to be local: the second tt splits off a pruning tt↓ that is absorbed by the first tt. In the deterministic case, we even prove that the composition of a dtt with an arbitrary dtt↓ can be realized by a dtt, and we also prove that the composition of a single-use dtt with a dtt can be realized by a dtt. Section 6 presents the known fact that every dtt of linear size increase can be realized by a single-use dtt, and discusses the relationship between tt's, macro tree transducers, and mso tree transducers. In Section 7 we show that a (partial) function that can be realized by a composition of nondeterministic tt's, can also be realized by a composition of deterministic tt's. To prove this we first prove a lemma: for every tt↓ there is a deterministic tt↓ that realizes a "uniformizer" of the translation realized by the given tt↓, i.e., a function that is a subset of that translation, with the same domain. Section 8 contains our main technical result: every tt can be decomposed into a pruning tt and another tt such that the composition is linear-bounded. It implies (by splitting and absorbing) that a composition of tt's can always be assumed to be linear-bounded. The "uniformizer" lemma of the previous section is applied to the pruning tt↓, proving the same result for deterministic tt's. Section 9 presents the main results on linear size increase, and Sections 10 and 11 present the main results on the complexity of compositions of deterministic and nondeterministic tt's, respectively. In Section 12 we prove the main results on the complexity of the membership problem for the composition of two tt's.
Finally, in Section 13 we show (in a straightforward way) that all main results also hold for transducers that transform unranked trees, or forests, which are a natural model of XML documents.

The reader who is interested only in complexity can disregard all results on single-use tt's, and skip Sections 6.2 and 9. The reader who is interested only in expressivity can just skip Sections 10, 11, and 12.

Remarks on the literature.
Top-down tree transducers were introduced in [66, 72]; regular look-ahead was added in [20]. Macro tree transducers were introduced in [15, 34]. Tree-walking tree transducers were introduced in [63] (where they are called 0-pebble tree transducers), and studied in, e.g., [31, 26, 62]. They were already mentioned in [24, Section 3(7)] (where they are called RT(Tree-walk) transducers). Regular look-around was added to tt's in [14, Section 8.2] (where they are called ms tree-walking transducers); for tree-walking automata that was already done in [6]. However, formal models similar to the tt were introduced and studied before. The tree-walking automaton of [2] translates trees into strings. As explained in [24, Section 3(7)] and [31, Section 3.2], the tt is closely related to the attribute grammar [53], which is a well-known model of syntax-directed semantics (and a compiler construction tool). An attribute grammar translates derivation trees of an underlying context-free grammar into arbitrary values. Tree-valued attribute grammars were considered, e.g., in [27]. The attributed tree transducer, introduced in [38], is an operational version of the tree-valued attribute grammar, without underlying context-free grammar. Regular look-around was added to the attributed tree transducer in [7] (where it is called look-ahead). Attributed tree transducers are a special type of tt's, of which the states are viewed as attributes of the nodes of the input tree. By definition a deterministic attributed tree transducer (like an attribute grammar) has to be noncircular, which means that it should generate an output tree whenever it is started in any state on any node of an input tree. Thus, it is total in a strong sense. This is natural from the point of view of syntax-directed semantics, but quite restrictive and inconvenient from the operational point of view of tree transformation.
Several of the auxiliary results in Sections 3 to 5 are closely related to (and generalizations of) well-known results on attributed tree transducers (see, e.g., [39]). As an example, it is proved in [38, Theorem 4.3] that, for deterministic transducers, the composition of an attributed tree transducer with a top-down tree transducer can be realized by an attributed tree transducer. That does not immediately imply that the same is true for a dtt and a dtt↓, which we show in Section 5, because dtt's are not necessarily total and they have regular look-around. Moreover, we wanted such results also to be understandable for readers unfamiliar with attribute grammars and attributed tree transducers.

The main results of this paper were first presented at FSTTCS '02 [58] (on the complexity of compositions of deterministic mt's), at FSTTCS '03 [59] (on compositions of mt's that realize functions of linear size increase), at FSTTCS '08 [51] (on the complexity of compositions of nondeterministic mt's), at PLAN-X '09 [52] (on the complexity of the membership problem for mt's), and in the Ph.D. Thesis of the second author [48] (on the last two subjects).

Convention: All results stated and/or proved in this paper are effective.
Sets, strings, and relations.
The set of natural numbers is N = {0, 1, 2, ...}. For m, n ∈ N, we denote the interval {k ∈ N | m ≤ k ≤ n} by [m, n]. The cardinality or size of a set A is denoted by #(A). The set of strings over A is denoted by A∗. It consists of all sequences w = a_1 · · · a_m with m ∈ N and a_i ∈ A for every i ∈ [1, m]. The length m of w is denoted by |w|. The empty string (of length 0) is denoted by ε. The concatenation of two strings v and w is denoted by v · w or just vw. Moreover, w^0 = ε and w^{k+1} = w · w^k for k ∈ N.

The domain and range of a binary relation R ⊆ A × B are denoted by dom(R) and ran(R), respectively. For A′ ⊆ A, R(A′) = {b ∈ B | (a, b) ∈ R for some a ∈ A′}. The composition of R with a binary relation S ⊆ B × C is R ◦ S = {(a, c) | ∃b ∈ B : (a, b) ∈ R, (b, c) ∈ S}. The inverse of R is R^{-1} = {(b, a) | (a, b) ∈ R}. Note that dom(R ◦ S) = R^{-1}(dom(S)) and ran(R ◦ S) = S(ran(R)). If A = B then the transitive-reflexive closure of R is R∗ = ⋃_{k ∈ N} R^k, where R^0 = {(a, a) | a ∈ A} and R^{k+1} = R ◦ R^k. The composition of two classes of binary relations R and S is R ◦ S = {R ◦ S | R ∈ R, S ∈ S}. Moreover, R^1 = R and R^{k+1} = R ◦ R^k for k ≥ 1. The relation R is finitary if R(a) is finite for every a ∈ A, where R(a) denotes R({a}). It is a (partial) function from A to B if R(a) is empty or a singleton for every a ∈ A, and it is a total function if, moreover, dom(R) = A.

Trees.
An alphabet is a finite set of symbols. A ranked alphabet Σ is an alphabet together with a mapping rank_Σ : Σ → N (of which the subscript Σ will be dropped when it is clear from the context). The maximal rank of elements of Σ is denoted mx_Σ. For every m ∈ N we denote by Σ^(m) the elements of Σ that have rank m.

Trees over Σ are recursively defined to be strings over Σ, as follows. For every m ∈ N, if σ ∈ Σ^(m) and t_1, ..., t_m are trees over Σ, then σ t_1 · · · t_m is a tree over Σ. For readability we also write the tree σ t_1 · · · t_m as the term σ(t_1, ..., t_m). The set of all trees over Σ is denoted T_Σ; thus T_Σ ⊆ Σ∗. For an arbitrary finite set A, disjoint with Σ, we denote by T_Σ(A) the set T_{Σ∪A}, where each element of A has rank 0.

As usual trees are viewed as directed labeled graphs. The nodes of a tree t are indicated by Dewey notation, i.e., by elements of N∗, which are strings of natural numbers. The root of t is indicated by the empty string ε, but will also be denoted by root_t for readability. The i-th child of a node u of t is indicated by ui, and there is a directed edge from the parent u to the child ui. Formally, the set N(t) of nodes of a tree t = σ t_1 · · · t_m over Σ can be defined recursively by N(t) = {ε} ∪ {iu | i ∈ [1, m], u ∈ N(t_i)}. Thus, N(t) ⊆ [1, mx_Σ]∗. The root of t = σ t_1 · · · t_m has label σ, and the node iu of t has the same label as the node u of t_i. The rank of node u is the rank of its label, i.e., the number of its children. A leaf is a node of rank 0, and a monadic node is a node of rank 1. Every node of t has a child number: each node ui has child number i, and the root ε is given child number 0 for technical convenience. For a node u of t the subtree of t with root u is denoted t|_u; thus, t|_ε = t and t|_{iu} = t_i|_u.
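The Dewey notation and the subtree operation above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not part of the paper: a tree is encoded as a pair (label, children), and a node is a tuple of 1-based child numbers, with the empty tuple playing the role of ε.

```python
# Sketch (our own encoding, not from the paper): a tree over a ranked alphabet
# is a pair (label, children); a node is a Dewey address, i.e., a tuple of
# 1-based child numbers; the root is the empty tuple (the paper's epsilon).

def nodes(t, u=()):
    """Yield the set N(t) of nodes of t as Dewey addresses, in pre-order."""
    label, children = t
    yield u
    for i, child in enumerate(children, start=1):
        yield from nodes(child, u + (i,))

def subtree(t, u):
    """The subtree t|_u of t rooted at node u."""
    for i in u:
        t = t[1][i - 1]
    return t

def label_of(t, u):
    """The label of node u of t (root of t|_u)."""
    return subtree(t, u)[0]

# t = sigma(tau(a), b), with sigma of rank 2, tau of rank 1, a and b of rank 0
t = ('sigma', [('tau', [('a', [])]), ('b', [])])
assert list(nodes(t)) == [(), (1,), (1, 1), (2,)]
assert subtree(t, (1,)) == ('tau', [('a', [])])
assert subtree(t, ()) == t            # t|_epsilon = t
assert label_of(t, (1, 1)) == 'a'
```

Note that the recursive clause of `nodes` mirrors the equation N(t) = {ε} ∪ {iu | i ∈ [1, m], u ∈ N(t_i)}, except that addresses are built left-to-right, which also yields them in pre-order.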
A node v of t is a descendant of a node u of t, and u is an ancestor of v, if there exists w ∈ N∗ such that w ≠ ε and v = uw (thus, u is not a descendant/ancestor of itself).

The size of a tree t is |t|, i.e., its length as a string. Note that |t| = #(N(t)), because the nodes of t correspond one-to-one to the positions in the string t, i.e., for every σ ∈ Σ, each occurrence of σ in t corresponds to a node of t with label σ. The left-to-right linear order on N(t) according to this correspondence is called the pre-order of the nodes of t. The yield of t is the string of labels of its leaves, in pre-order. The height of t is the number of edges of a longest directed path from the root of t to a leaf; thus, it is the maximal length of its nodes (which are strings over N).

A tree language L is a set of trees over Σ, for some ranked alphabet Σ, i.e., L ⊆ T_Σ. A tree translation τ is a binary relation between trees over Σ and trees over ∆, for some ranked alphabets Σ and ∆, i.e., τ ⊆ T_Σ × T_∆.

Linear-bounded composition.
Let Σ, ∆, and Γ be ranked alphabets. For tree translations τ_1 ⊆ T_Σ × T_∆ and τ_2 ⊆ T_∆ × T_Γ, we say that the pair (τ_1, τ_2) is linear-bounded if there is a constant c ∈ N such that for every (t, s) ∈ τ_1 ◦ τ_2 there exists r ∈ T_∆ such that (t, r) ∈ τ_1, (r, s) ∈ τ_2, and |r| ≤ c · |s|. Thus, the intermediate result r can be chosen such that its size is linear in the size of the output s. Note that if τ_1 and τ_2 are functions, this means that |r| ≤ c · |τ_2(r)| for every r ∈ ran(τ_1) ∩ dom(τ_2).

For classes T_1 and T_2 of tree translations, we define T_1 ∗ T_2 to consist of all translations τ_1 ◦ τ_2 such that τ_1 ∈ T_1, τ_2 ∈ T_2, and (τ_1, τ_2) is linear-bounded.

Lemma 1.
Let T_1, T_2, and T_3 be classes of tree translations. Then T_1 ◦ (T_2 ∗ T_3) ⊆ (T_1 ◦ T_2) ∗ T_3 and (T_1 ∗ T_2) ∗ T_3 ⊆ T_1 ∗ (T_2 ◦ T_3).

Proof.
Let τ_i ∈ T_i for i ∈ {1, 2, 3}. If the pair (τ_2, τ_3) is linear-bounded then so is the pair (τ_1 ◦ τ_2, τ_3), with the same constant c. If (τ_1, τ_2) and (τ_1 ◦ τ_2, τ_3) are linear-bounded with constants c_1 and c_2, respectively, then (τ_1, τ_2 ◦ τ_3) is linear-bounded with constant c_1 · c_2. ✷

A function τ : T_Σ → T_∆ is of linear size increase if there is a constant c ∈ N such that |τ(t)| ≤ c · |t| for every t ∈ dom(τ). The class of functions of linear size increase will be denoted by LSIF.

Lemma 2.
Let τ_1 : T_Σ → T_Γ and τ_2 : T_Γ → T_∆ be functions such that ran(τ_1) ⊆ dom(τ_2). If τ_1 ◦ τ_2 ∈ LSIF and (τ_1, τ_2) is linear-bounded, then τ_1 ∈ LSIF.

Proof.
It follows from ran(τ_1) ⊆ dom(τ_2) that dom(τ_1 ◦ τ_2) = dom(τ_1). Since (τ_1, τ_2) is linear-bounded, there is a c such that |τ_1(t)| ≤ c · |τ_2(τ_1(t))| for every t ∈ dom(τ_1). Since τ_1 ◦ τ_2 ∈ LSIF, there is a c′ such that |τ_2(τ_1(t))| ≤ c′ · |t| for every t ∈ dom(τ_1). Hence |τ_1(t)| ≤ c · c′ · |t| for every t ∈ dom(τ_1), which means that τ_1 ∈ LSIF. ✷

Grammars and automata.
Context-free grammars and, in particular, regular tree grammars will be used to define the computations of tree-walking tree transducers, and to define the "regular look-around" used by these transducers. A context-free grammar is specified as a tuple G = (N, T, S, R), where N is the nonterminal alphabet, T the terminal alphabet (disjoint with N), S ⊆ N the set of initial nonterminals, and R the finite set of rules, where each rule is of the form X → ζ with X ∈ N and ζ ∈ (N ∪ T)∗. A sentential form of G is a string v ∈ (N ∪ T)∗ such that S ⇒∗_G v for some S ∈ S, where ⇒_G is the usual derivation relation of G: if X → ζ is in R, then v_1 X v_2 ⇒_G v_1 ζ v_2 for all v_1, v_2 ∈ (N ∪ T)∗. The language L(G) generated by G is the set of all terminal sentential forms, i.e., L(G) = {w ∈ T∗ | ∃S ∈ S : S ⇒∗_G w}. To formally define the derivation trees of G as ranked trees, we need to subscript its nonterminals with ranks, because G can have rules X → ζ_1 and X → ζ_2 with |ζ_1| ≠ |ζ_2|. Let N be the ranked alphabet consisting of all symbols X_m, of rank m, such that G has a rule X → ζ with |ζ| = m. The terminal symbols in T are given rank 0. Then the derivation trees of G are generated by the context-free grammar G_der = (N′, N ∪ T, S′, R_der) such that N′ = {X′ | X ∈ N}, S′ = {S′ | S ∈ S}, and if R contains a rule X → ζ, then R_der contains the rule X′ → X_m ζ′, where m = |ζ| and ζ′ is obtained from ζ by changing every nonterminal Y into Y′. Note that we only consider derivation trees that correspond to derivations S ⇒∗_G w with S ∈ S and w ∈ T∗. Such a derivation tree has yield w, because when taking the yield of a derivation tree we skip the leaves with label X_0, for any nonterminal X. Moreover, when considering a derivation tree of G, we will disregard the subscripts of the nonterminals and we will say that a node has label X rather than X_m.
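The rule part of the G_der construction can be sketched in Python as follows. This is our own illustrative encoding, not from the paper: nonterminals are uppercase letters, terminals are lowercase letters, a primed nonterminal X′ is written `"X'"`, and a ranked symbol X_m is encoded as the pair `(X, m)`.

```python
# Sketch (our own encoding, not from the paper) of the rules of G_der:
# for every rule X -> zeta of G, G_der gets the rule X' -> X_m zeta',
# where m = |zeta| and zeta' primes every nonterminal of zeta.

def der_rules(rules):
    """rules: dict mapping a nonterminal X to its right-hand sides (strings).
    Returns the rules of G_der as pairs (left-hand side, right-hand side list)."""
    out = []
    for x, rhss in rules.items():
        for zeta in rhss:
            # X_m is encoded as (x, len(zeta)); nonterminals Y become Y'.
            rhs = [(x, len(zeta))] + [s + "'" if s.isupper() else s for s in zeta]
            out.append((x + "'", rhs))
    return out

# The grammar with rules S -> aXYb, X -> aY, Y -> ba, Y -> (empty string):
g = {'S': ['aXYb'], 'X': ['aY'], 'Y': ['ba', '']}
assert der_rules(g) == [
    ("S'", [('S', 4), 'a', "X'", "Y'", 'b']),
    ("X'", [('X', 2), 'a', "Y'"]),
    ("Y'", [('Y', 2), 'b', 'a']),
    ("Y'", [('Y', 0)]),
]
```

The assertion spells out the four rules of G_der for this grammar in the encoded form.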
As an example, if G has the rules S → aXYb, X → aY, Y → ba, and Y → ε, then G_der has the rules S′ → S_4 aX′Y′b, X′ → X_2 aY′, Y′ → Y_2 ba, and Y′ → Y_0. The string aabab is generated by G, and the derivation tree S_4 aX_2 aY_0 Y_2 bab = S_4(a, X_2(a, Y_0), Y_2(b, a), b) is generated by G_der; the nodes of this tree are labeled by S, X, Y, a, and b, and its yield is aabab.

A context-free grammar is ε-free if it does not have ε-rules, i.e., rules X → ε. We will mainly deal with ε-free context-free grammars.

A context-free grammar G is finitary if L(G) is finite. We need the following elementary lemma on finitary context-free grammars.

Lemma 3.
Let G = (N, T, S, R) be a finitary context-free grammar. For every string w ∈ L(G) there exists a derivation tree d ∈ L(G_der) such that the yield of d is w and the height of d is at most #(N).

Proof.
Let d be a derivation tree with yield w and suppose that a node u of d and a descendant v of u have the same nonterminal label (disregarding the ranking subscripts). Then the tree d can be pumped in the usual way. But since L(G) is finite, the yield of the pumped tree remains the same. Hence we can remove the pumped part from d. Repeating this, we obtain a derivation tree as required. ✷

A context-free grammar G = (N, T, S, R) is forward deterministic if S is a singleton and distinct rules have distinct left-hand sides. Such a grammar generates at most one string in T∗ and has at most one derivation tree. If L(G_der) = {d}, then the height of d is at most #(N) by Lemma 3.

A regular tree grammar is a context-free grammar G = (N, Σ, S, R) such that Σ is a ranked alphabet, and ζ ∈ T_Σ(N) for every rule X → ζ in R. A regular tree grammar generates trees over Σ, i.e., L(G) ⊆ T_Σ. Note that every regular tree grammar is ε-free. Note also that for every context-free grammar G, the grammar G_der is a regular tree grammar. If, in particular, G is itself a regular tree grammar, as above, then it should be noted that the elements of Σ all have rank 0 in G_der. As an example, if G has the rules S → σ(X, Y), X → τ(Y), Y → τ(a), and Y → a, where σ, τ, and a have ranks 2, 1, and 0, respectively, then G_der has the rules S′ → S(σ, X′, Y′), X′ → X(τ, Y′), Y′ → Y(τ, a), and Y′ → Y(a). The tree σ(τ(τ(a)), a) is generated by G, and the derivation tree S(σ, X(τ, Y(τ, a)), Y(a)) by G_der.

A (total deterministic) bottom-up finite-state tree automaton is specified as a tuple A = (Σ, P, F, δ), where Σ is a ranked alphabet, P is a finite set of states, F ⊆ P is the set of final states, and δ is the state transition function such that δ(σ, p_1, ..., p_m) ∈ P for every σ ∈ Σ and p_1, ..., p_m ∈ P, where m is the rank of σ.
For every t ∈ T_Σ, we define the state δ(t) in which A arrives at the root of t recursively by δ(σ t_1 ··· t_m) = δ(σ, δ(t_1), ..., δ(t_m)). The tree language recognized by A is L(A) = {t ∈ T_Σ | δ(t) ∈ F}.

A regular tree language is a set of trees that can be generated by a regular tree grammar, or equivalently, recognized by a bottom-up finite-state tree automaton. The class of regular tree languages will be denoted by REGT. The basic properties of regular tree languages can be found in, e.g., [43, 44, 11, 19].
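As a concrete illustration (our own sketch, not from the paper: trees are represented as nested tuples, and the particular automaton is an arbitrary choice), a total deterministic bottom-up finite-state tree automaton over the ranked alphabet {σ/2, e/0} that tracks the parity of the number of leaves can be written as:

```python
# Bottom-up tree automaton A = (Sigma, P, F, delta) with P = {0, 1}:
# the state of a subtree is its number of leaves modulo 2.
def delta(symbol, *child_states):
    if symbol == 'e':                      # rank 0: one leaf
        return 1
    if symbol == 'sigma':                  # rank 2: add leaf parities
        return sum(child_states) % 2
    raise ValueError(f'unknown symbol {symbol!r}')

def run(t):
    # delta(sigma t_1 ... t_m) = delta(sigma, delta(t_1), ..., delta(t_m))
    symbol, *children = t
    return delta(symbol, *(run(c) for c in children))

F = {0}                                    # final states: even leaf count

def recognizes(t):
    return run(t) in F                     # t in L(A) iff delta(t) in F
```

For example, σ(e, e) has two leaves and is recognized, while σ(σ(e, e), e) has three and is not.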
Regular look-around.
Let Σ be a ranked alphabet. A node test over
Σ is a set of trees over Σ with a distinguished node, i.e., it is a subset of the set T^•_Σ = {(t, u) | t ∈ T_Σ, u ∈ N(t)}. Intuitively it is a property of a node of a tree. We introduce a new ranked alphabet Σ × {0, 1}, such that the rank of (σ, b) equals that of σ in Σ. For a tree t over Σ and a node u of t we define mark(t, u) to be the tree over Σ × {0, 1} that is obtained from t by changing the label σ of u into (σ, 1) and changing the label σ of every other node into (σ, 0). Thus, mark(t, u) is t with one "marked" node u. A regular (node) test over Σ is a node test T ⊆ T^•_Σ such that its marked representation is a regular tree language, i.e., mark(T) ∈ REGT. Note that ∅ and T^•_Σ are regular tests, and that the class of regular tests over Σ is closed under the boolean operations complement, intersection, and union, because REGT is closed under those operations. Hence every boolean combination of regular tests is again a regular test.

(Footnote: "Forward deterministic" is as opposed to a "backward deterministic" context-free grammar, in which distinct rules have distinct right-hand sides, see, e.g., [26]. A forward deterministic context-free grammar that generates a string is also called a "straight-line" context-free grammar.)

For a tree language L ⊆ T_Σ we define the node test T(L) = {(t, u) ∈ T^•_Σ | t|_u ∈ L} over Σ. Intuitively it is a property of the distinguished node that only depends on the subtree at that node. Clearly, if L is regular then T(L) is regular. A regular test of the form T(L) with L ∈ REGT will be called a regular sub-test. Note that T(T_Σ) = T^•_Σ and T(∅) = ∅. Note also that for regular tree languages L and L′ over Σ, T(L) ∩ T(L′) = T(L ∩ L′) and T^•_Σ \ T(L) = T(T_Σ \ L). This shows that the class of regular sub-tests over Σ is also closed under the boolean operations complement, intersection, and union.

For a given node test T over Σ, we also wish to be able to apply T to a node v of a tree mark(t, u), where v need not be equal to u. Thus, we define the node test µ(T) over Σ × {0, 1} to consist of all (mark(t, u), v) such that (t, v) ∈ T and u ∈ N(t). The test µ(T) just disregards the marking of t.
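The marking operation mark(t, u) can be sketched concretely (again our own encoding, not the paper's: trees as nested tuples, nodes as Dewey paths, i.e. tuples of child indices, with () denoting the root):

```python
# mark(t, u): the label sigma of node u becomes (sigma, 1), the label of
# every other node becomes (sigma, 0), giving a tree over Sigma x {0, 1}.
def mark(t, u):
    symbol, *children = t
    # Pass the remaining path to the child lying on the path to u, and an
    # unreachable path (None) to all other children.
    return ((symbol, 1 if u == () else 0),
            *(mark(c, u[1:] if u and u[0] == i + 1 else None)
              for i, c in enumerate(children)))
```

For instance, marking node 1.1 of σ(τ(a), a) gives (σ,0)((τ,0)((a,1)), (a,0)).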
It is easy to see that if T is regular, then so is µ(T).

The reader familiar with monadic second-order logic (abbreviated mso logic) should realize that it easily follows from the result of Doner, Thatcher and Wright [18, 73] that a node test is regular if and only if it is mso definable (see [6, Lemma 7]). A node test T over Σ is mso definable if there is an mso formula ϕ(x) over Σ, with one free variable x, such that T = {(t, u) | t ⊨ ϕ(u)}, where t ⊨ ϕ(u) means that the formula ϕ(x) holds in t for the node u as value of x. The formulas of mso logic on trees over Σ use the atomic formulas lab_σ(x) and down_i(x, y), for every i ∈ [1, mx_Σ], meaning that node x has label σ ∈ Σ, and that y is the i-th child of x, respectively. In the literature, regular tests are also called mso tests.

In this section we define tree-walking tree transducers, with and without regular look-around, and discuss some of their properties. A tree-walking tree transducer (with regular look-around), in short tt, is a finite state device with one reading head that walks from node to node over its input tree following the edges in either direction. In addition to testing the label and child number of the current node, it can even test any regular property of that node. The output tree is produced recursively, in a top-down fashion. When the transducer produces a node of the output tree, labeled by an output symbol of rank k, it branches into k copies of itself, which then proceed independently, in parallel, to produce the subtrees rooted at the children of that output node.

The tt is specified as a tuple M = (Σ, ∆, Q, Q_0, R), where Σ and ∆ are ranked alphabets of input and output symbols, Q is a finite set of states, Q_0 ⊆ Q is the set of initial states, and R is a finite set of rules. The rules are divided into move rules and output rules.
Each move rule is of the form ⟨q, σ, j, T⟩ → ⟨q′, α⟩ such that q, q′ ∈ Q, σ ∈ Σ, j ∈ [0, mx_Σ], T is a regular test over Σ (specified in some effective way), and α is one of the following instructions: stay, up (provided j ≠ 0), and down_i with 1 ≤ i ≤ rank_Σ(σ). Each output rule is of the form ⟨q, σ, j, T⟩ → δ(⟨q_1, α_1⟩, ..., ⟨q_k, α_k⟩) such that the left-hand side is as above, δ ∈ ∆^(k), q_1, ..., q_k ∈ Q, and α_1, ..., α_k are instructions as above. A rule ⟨q, σ, j, T⟩ → ζ with T = T^•_Σ will be written ⟨q, σ, j⟩ → ζ. The tt M is deterministic, in short a dtt, if Q_0 is a singleton, and T ∩ T′ = ∅ for every two distinct rules ⟨q, σ, j, T⟩ → ζ and ⟨q, σ, j, T′⟩ → ζ′ in R. A dtt with initial state q_0 will be specified as M = (Σ, ∆, Q, q_0, R).

A configuration ⟨q, u⟩ of the tt M on a tree t over Σ is given by the current state q of M and the current position u of the head of M on t. Formally, q ∈ Q and u ∈ N(t). The set of all configurations of M on t is denoted Con(t), i.e., Con(t) = Q × N(t). A rule ⟨q, σ, j, T⟩ → ζ is applicable to a configuration ⟨q′, u⟩ of M on t if q′ = q and u satisfies the tests σ, j, and T, i.e., σ and j are the label and child number of u, and (t, u) ∈ T. For a node u of t and an instruction α we define the node α(u) of t as follows: if α is stay, up, or down_i, then α(u) equals u, is the parent of u, or is the i-th child of u, respectively.

For every input tree t ∈ T_Σ we define the regular tree grammar G_{M,t} = (N, ∆, S, R_{M,t}) where N = Con(t), S = {⟨q_0, root_t⟩ | q_0 ∈ Q_0} and R_{M,t} is defined as follows. Let ⟨q, u⟩ be a configuration of M on t and let ⟨q, σ, j, T⟩ → ζ be a rule of M that is applicable to ⟨q, u⟩. If ζ = ⟨q′, α⟩ then R_{M,t} contains the rule ⟨q, u⟩ → ⟨q′, α(u)⟩, and if ζ = δ(⟨q_1, α_1⟩, ..., ⟨q_k, α_k⟩) then R_{M,t} contains the rule ⟨q, u⟩ → δ(⟨q_1, α_1(u)⟩, ..., ⟨q_k, α_k(u)⟩).
The derivation relation ⇒_{G_{M,t}} will be written as ⇒_{M,t}. The translation realized by M, denoted τ_M, is defined as τ_M = {(t, s) ∈ T_Σ × T_∆ | s ∈ L(G_{M,t})}. In other words, τ_M = {(t, s) ∈ T_Σ × T_∆ | ∃ q_0 ∈ Q_0: ⟨q_0, root_t⟩ ⇒*_{M,t} s}. Two tt's M and N are equivalent if τ_M = τ_N. The domain of M, denoted by dom(M), is defined to be the domain of the translation τ_M, i.e., dom(M) = dom(τ_M) = {t ∈ T_Σ | ∃ s ∈ T_∆: (t, s) ∈ τ_M}. The tt M is total if dom(M) = T_Σ.

The tt M is finitary if τ_M is finitary, which means that τ_M(t) is finite (or equivalently, that G_{M,t} is finitary) for every input tree t ∈ T_Σ. All classical top-down tree transducers (with or without regular look-ahead) and all macro tree transducers are finitary.

If M is deterministic, then at most one rule of M is applicable to a given configuration. Hence G_{M,t} is forward deterministic and L(G_{M,t}) is either empty or a singleton. Thus, τ_M is a partial function from T_Σ to T_∆ (and a total function if M is total). For every (t, s) ∈ τ_M the context-free grammar G_{M,t} has exactly one derivation tree, with root label ⟨q_0, root_t⟩ and yield s.

Intuitively, the derivation relation ⇒_{M,t} of the grammar G_{M,t} formalizes the computation steps of the tt M on the input tree t, the derivations of G_{M,t} are the sequential computations of M on t, and the derivation trees of G_{M,t}, generated by the regular tree grammar G^der_{M,t}, model the parallel computations of the independent copies of M on t. If M is deterministic and t ∈ dom(M), then M has exactly one parallel computation on t.

A sentential form of G_{M,t} will be called an output form of M on t. It is a tree s ∈ T_∆(Con(t)) such that ⟨q_0, root_t⟩ ⇒*_{M,t} s for some q_0 ∈ Q_0.
Intuitively, such an output form s consists on the one hand of ∆-labeled nodes that were produced by M previously in the computation, using output rules, and on the other hand of leaves that represent the independent copies of M into which the computation has branched previously, due to those output rules, where each leaf is labeled by the current configuration of that copy. An output form is initial if it is the configuration ⟨q_0, root_t⟩ for some q_0 ∈ Q_0, where root_t is the root of t, and it is final if it is in T_∆, which means that all copies of M have disappeared.

Intuitively, the computation steps of M lead from one output form to another, as follows. Let s be an output form and let v be a leaf of s with label ⟨q, u⟩ ∈ Con(t). If ⟨q, u⟩ → ⟨q′, α(u)⟩ is a rule of G_{M,t}, resulting from a move rule ⟨q, σ, j, T⟩ → ⟨q′, α⟩ of M that is applicable to configuration ⟨q, u⟩, as defined above, then s ⇒_{M,t} s′ where s′ is obtained from s by changing the label of v into ⟨q′, α(u)⟩. Thus, this copy of M just changes its configuration. Moreover, if ⟨q, u⟩ → δ(⟨q_1, α_1(u)⟩, ..., ⟨q_k, α_k(u)⟩) is a rule of G_{M,t}, resulting from an output rule ⟨q, σ, j, T⟩ → δ(⟨q_1, α_1⟩, ..., ⟨q_k, α_k⟩) of M, as defined above, then s ⇒_{M,t} s′ where s′ is obtained from s by changing the label of v into δ and adding children v1, ..., vk with labels ⟨q_1, α_1(u)⟩, ..., ⟨q_k, α_k(u)⟩, respectively. Thus, M outputs δ, and for each child vi it branches into a new process, a copy of itself started in state q_i at the node α_i(u). In the particular case that k = 0, s′ is obtained from s by changing the label of v into δ; thus, the copy of M corresponding to the node v of s disappears.
The translation τ_M realized by M consists of all pairs of trees t over Σ and s over ∆ such that M has a sequential computation on t that starts with an initial output form and ends with the final output form s.

Before giving an example of a tree-walking tree transducer, we define six properties of tt's that will be used throughout this paper.

The tt M is sub-testing, abbreviated tt^s, if the regular tests used by M are regular sub-tests, i.e., only test the subtree at the current node. Formally, for every rule ⟨q, σ, j, T⟩ → ζ there is a regular tree language L over Σ such that T = T(L). Recall that T(L) = {(t, u) | t|_u ∈ L}. Thus, informally, M is sub-testing if it uses regular look-ahead rather than the more general regular look-around.

The tt M is local, abbreviated tt^ℓ, if it does not use regular tests, i.e., T = T^•_Σ (= {(t, u) | t ∈ T_Σ, u ∈ N(t)}) for every rule ⟨q, σ, j, T⟩ → ζ. So all its rules are written ⟨q, σ, j⟩ → ζ. Recall that T^•_Σ = T(T_Σ); thus, every local tt is sub-testing. Note that in the formalism of the (non-local) tt the tests on σ and j could be dropped from a rule ⟨q, σ, j, T⟩ → ζ, because they can be incorporated in the regular test T.

The tt M is top-down, abbreviated tt_↓, if it does not use the up-instruction in the right-hand sides of its rules. Due to the use of stay-instructions, a tt_↓ need not be finitary. It is straightforward to show that the finitary (deterministic) tt^s_↓ and tt^ℓ_↓ are equivalent to the classical nondeterministic (deterministic) top-down tree transducer, with and without regular look-ahead, respectively; see the end of this section. Note that in the rules of a tt^s_↓ or tt^ℓ_↓ the test on the child number j could be dropped, because j can be stored in the finite state if necessary.

The tt M is single-use, abbreviated tt_su, if it is deterministic and never visits a node of the input tree twice in the same state.
Formally, it should satisfy the following property: for every t ∈ T_Σ, s′ ∈ T_∆(Con(t)), s ∈ T_∆, and ⟨q, u⟩ ∈ Con(t), if ⟨q_0, root_t⟩ ⇒*_{M,t} s′ ⇒*_{M,t} s then ⟨q, u⟩ occurs at most once in s′. In other words, for every t ∈ dom(M), no nonterminal occurs twice in the (unique) derivation tree d of the context-free grammar G = G_{M,t}. Note that, as discussed in the proof of Lemma 3 (and the paragraph following it), the configuration ⟨q, u⟩ cannot occur at two distinct nodes on a path from the root of d to a leaf. The single-use property also forbids ⟨q, u⟩ to occur at two independent nodes of d. It was introduced for attribute grammars in [40, 41, 45].

The tt M is pruning, abbreviated tt_pru, if it is a top-down tt of which each move rule is of the form ⟨q, σ, j, T⟩ → ⟨q′, down_i⟩, and each output rule is of the form ⟨q, σ, j, T⟩ → δ(⟨q_1, down_{i_1}⟩, ..., ⟨q_k, down_{i_k}⟩) such that 1 ≤ i_1 < ··· < i_k ≤ rank(σ). Intuitively, a pruning tt is a tt_↓ without stay-instructions that, when arriving at an input node u, either removes u and all its children except one (together with the descendants of those children), or relabels u and possibly removes some of its children (together with their descendants). Since a tt_pru does not use the stay-instruction, it is finitary (and single-use if it is deterministic). Every tt^s_pru and tt^ℓ_pru is equivalent to a classical linear top-down tree transducer, with and without regular look-ahead, but not vice versa because the latter transducer can generate an arbitrary finite number of output nodes at each computation step, rather than zero or one.

The tt M is relabeling, abbreviated tt_rel, if every rule of M is an output rule of the form ⟨q, σ, j, T⟩ → δ(⟨q_1, down_1⟩, ..., ⟨q_m, down_m⟩) where m = rank_Σ(σ) = rank_∆(δ). Thus, the label σ is replaced by the label δ.
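As a minimal illustration of a deterministic relabeling (our own example, not one from the paper; trees as nested tuples), the following replaces every label by the pair of that label and the node's child number, with 0 for the root:

```python
# A relabeling replaces the label sigma of a node of rank m by a label of
# the same rank m; here sigma becomes (sigma, j) for child number j.
def relabel(t, j=0):
    symbol, *children = t
    return ((symbol, j),
            *(relabel(c, i + 1) for i, c in enumerate(children)))
```

For instance, relabel(σ(e, e)) is (σ,0)((e,1), (e,2)).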
Obviously, every relabeling tt is pruning.

We use the notation TT for the class of translations realized by tree-walking tree transducers, and fTT and dTT for the subclasses realized by finitary and deterministic tt's, respectively. Thus, dTT ⊆ fTT ⊆ TT. The subclasses of TT, fTT, and dTT realized by tt's with the above six properties (and their combinations) are indicated by the superscripts 's' and 'ℓ', and the subscripts '↓', 'su', 'pru', and 'rel', as above. For instance, dTT^s_↓ denotes the class of translations realized by deterministic tree-walking tree transducers that are both sub-testing and top-down. Note that TT^ℓ is a proper subclass of TT^s, because a local tt of which all output symbols have rank 0 can be viewed as a tree-walking automaton, which cannot recognize all regular tree languages by the result of [9].

By [14, Section 8.4], the tt is equivalent to the ms tree-walking transducer of [14, Section 8.2]. As discussed in the Introduction, the tt^ℓ generalizes the attributed tree transducer of [38], which is required to be noncircular and hence finitary; the deterministic attributed tree transducer is also required to be total. In the same way the deterministic tt generalizes the (deterministic) attributed tree transducer with look-ahead of [7]. In [26] all tree-walking tree transducers are local.

Example 4
Let Σ = {σ, e} with rank_Σ(σ) = 2 and rank_Σ(e) = 0, and let ∆ = {σ, e} ∪ [1, mx_Σ] with rank_∆(σ) = 2, rank_∆(e) = 0, and rank_∆(j) = 0 for every j ∈ [1, mx_Σ] = {1, 2}. Moreover, let T be an arbitrary regular node test over Σ. For simplicity we assume that T is not satisfied at the leaves of t, i.e., if (t, u) ∈ T then u is not a leaf of t. For instance, T consists of all (t, u) ∈ T^•_Σ such that u has at least one ancestor that has exactly one child that is a leaf, and at least one descendant with that same property. We consider a total deterministic tt M = (Σ, ∆, Q, q_0, R) that performs T as a query, i.e., for every input tree t it outputs all nodes of t that satisfy T, in pre-order. More precisely, if u_1, ..., u_n are the nodes u of t such that (t, u) ∈ T, in pre-order, then M outputs the tree s = σ(s_1, σ(s_2, ... σ(s_n, e) ···)) where s_i = σ(··· σ(σ(e, j_1), j_2) ..., j_k) if u_i = j_1 j_2 ··· j_k with j_1, j_2, ..., j_k ∈ [1, mx_Σ]. Note that the yield of s is e u_1 e u_2 ··· e u_n e.

(Footnote on noncircularity: A tt M is circular if there exist t ∈ T_Σ, u ∈ N(t), q ∈ Q, and s ∈ T_∆(Con(t)) such that ⟨q, u⟩ ⇒⁺_{M,t} s and ⟨q, u⟩ occurs in s. Thus, M is noncircular if and only if G_{M,t} is nonrecursive for every t ∈ T_Σ, which implies that L(G_{M,t}) is finite. Note that a total deterministic tt is noncircular if and only if for every t ∈ T_Σ, u ∈ N(t), and q ∈ Q there exists s ∈ T_∆ such that ⟨q, u⟩ ⇒*_{M,t} s. It can be shown that for every finitary tt there is an equivalent noncircular tt, but that will not be needed in this paper.)
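The translation specified in Example 4 can also be computed directly, without the walking transducer (our own sketch and encoding: trees as nested tuples, nodes as Dewey paths, and the test T as a predicate on (t, u)):

```python
# Output all nodes u_1, ..., u_n of t with (t, u) in T, in pre-order, as
# s = sigma(s_1, sigma(s_2, ... sigma(s_n, e)...)), where
# s_i = sigma(...sigma(sigma(e, j_1), j_2)..., j_k) encodes u_i = j_1...j_k.

def preorder(t, u=()):
    yield u
    for i, c in enumerate(t[1:]):
        yield from preorder(c, u + (i + 1,))

def subtree(t, u):
    for j in u:
        t = t[j]                 # the children of a node are t[1], ..., t[m]
    return t

def encode(u):                   # the tree s_i for u = (j_1, ..., j_k)
    s = ('e',)
    for j in u:
        s = ('sigma', s, (str(j),))
    return s

def query(t, T):
    s = ('e',)
    for u in reversed([u for u in preorder(t) if T(t, u)]):
        s = ('sigma', encode(u), s)
    return s
```

With the test "u is an internal node" (which indeed fails at leaves), query(σ(σ(e, e), e), T) returns σ(e, σ(σ(e, 1), e)); its yield e e 1 e lists the satisfying nodes ε and 1 in pre-order, each preceded by e.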
The transducer M performs a left-to-right depth-first traversal of the input tree t and applies the test T to every node of t, in pre-order. Whenever M finds a node u_i that satisfies the test, it branches into two copies. The first copy outputs the tree s_i with yield e u_i, walking from u_i to the root, and the second copy continues the traversal.

Formally, M has the set of states Q = {d, u_1, u_2, p, p′} and initial state q_0 = d. Intuitively, d stands for 'down', u_j for 'up from the j-th child', and p for 'print'. It has the following rules, where j′ ∈ [0, mx_Σ], j ∈ [1, mx_Σ], T^c = T^•_Σ \ T, and τ ∈ Σ:

⟨d, σ, j′, T^c⟩ → ⟨d, down_1⟩
⟨d, σ, j′, T⟩ → σ(⟨p, stay⟩, ⟨d, down_1⟩)
⟨u_1, σ, j′⟩ → ⟨d, down_2⟩
⟨u_2, σ, j⟩ → ⟨u_j, up⟩
⟨u_2, σ, 0⟩ → e
⟨d, e, j⟩ → ⟨u_j, up⟩
⟨d, e, 0⟩ → e
⟨p, τ, j⟩ → σ(⟨p, up⟩, j)
⟨p, τ, 0⟩ → e

where the rule ⟨p, τ, j⟩ → σ(⟨p, up⟩, j) abbreviates the two rules ⟨p, τ, j⟩ → σ(⟨p, up⟩, ⟨p′, stay⟩) and ⟨p′, τ, j⟩ → j.

The tt M does not have any of the six properties defined above. Note that M is not single-use because it pays n visits to the root of t in state p. For the example test T it is not clear whether there is a local tt equivalent to M, but that does not seem likely. ✷

Example 5
Let Σ = {σ, e} as in Example 4. We consider a total deterministic local tt M_exp that translates each tree t with n leaves into the full binary tree of height n with 2^n leaves. As in Example 4, it performs a depth-first left-to-right traversal of t, and branches into two copies whenever it visits a leaf of t. Formally, M_exp = (Σ, Σ, Q, q_0, R) with Q = {d, u_1, u_2, q} and q_0 = d. Its rules are similar to those of M in Example 4. In particular, the three rules for states u_1 and u_2 are the same. The rules for state d are the following, with j′ ∈ [0, mx_Σ] and j ∈ [1, mx_Σ]:

⟨d, σ, j′⟩ → ⟨d, down_1⟩
⟨d, e, j⟩ → σ(⟨u_j, up⟩, ⟨u_j, up⟩)
⟨d, e, 0⟩ → σ(e, e)

where the last rule abbreviates the two rules ⟨d, e, 0⟩ → σ(⟨q, stay⟩, ⟨q, stay⟩) and ⟨q, e, 0⟩ → e. ✷

An elementary property of the translation realized by a deterministic tt is that it is of "linear size-height increase", as stated in the next lemma. Since the size of a tree is at most exponential in its height, this implies that it is of exponential size increase. This is well known for attributed tree transducers [38, Lemma 4.1] (see also [39, Lemma 5.40]) and for local tt's [31, Lemma 7], and obviously also holds for tt's. If, moreover, the tt is single-use, then it is of linear size increase.

Lemma 6 For every τ ∈ dTT there is a constant c such that for every (t, s) ∈ τ the height of s is at most c · |t|. Moreover, dTT_su ⊆ LSIF.

Proof.
Let M = (Σ, ∆, Q, q_0, R) be a dtt and let (t, s) ∈ τ_M. Let d be the unique derivation tree generated by G^der_{M,t}. Clearly, since each rule of M outputs at most one node of s, the height of s is at most the height of d. By Lemma 3 the height of d is at most #(Con(t)), which equals #(Q) · |t|. Thus, we can take c = #(Q).

It should also be clear that the size of s is at most the number of nodes of d that are labeled by a configuration. If M is single-use, then no configuration occurs twice in d. Hence |s| ≤ #(Q) · |t|, i.e., the function τ_M is of linear size increase. ✷

Example 5 and Lemma 6 imply that compositions of deterministic tt's form a proper hierarchy. This was proved for attributed tree transducers in [38, Corollary 4.1] (see also [39, Theorem 5.45]), and the proof for tt's is exactly the same.

Proposition 7
For every k ≥ 1, dTT^k ⊊ dTT^{k+1}.

Proof.
Let τ_exp be the translation realized by the dtt M_exp of Example 5. Then τ_exp ∘ τ_exp translates each tree t with n leaves into the full binary tree of height 2^n with 2^{2^n} leaves. Since |t| = 2n − 1, it follows from Lemma 6 that τ_exp ∘ τ_exp is not in dTT. Hence dTT ⊊ dTT^2. In a similar way it can be shown that τ_exp^{k+1} is not in dTT^k. Since the size of a tree is at most exponential in its height, it follows from Lemma 6 that for every τ ∈ dTT^2 there is a constant c such that for every (t, s) ∈ τ the height of s is at most 2^{c·|t|}. Similarly, for τ ∈ dTT^k, the height of s is at most (k − 1)-fold exponential in c · |t|. ✷

Thus, in terms of size increase, a composition of k dtt's can create at most a k-fold exponentially large output tree, whereas a composition of k + 1 dtt's can naturally create an output tree of (k + 1)-fold exponential size. In Section 7 we will prove that compositions of nondeterministic tt's also form a hierarchy, with the same counter-examples. One of our aims is to show that these hierarchies collapse for functions of linear size increase, i.e., that TT^k ∩ LSIF ⊆ dTT for every k ≥ 1.

We conclude this section with some normal forms for tt's. First, for an arbitrary tt it may always be assumed that its output rules only use the stay-instruction: an output rule ⟨q, σ, j, T⟩ → δ(⟨q_1, α_1⟩, ..., ⟨q_k, α_k⟩) can be replaced by the output rule ⟨q, σ, j, T⟩ → δ(⟨p_1, stay⟩, ..., ⟨p_k, stay⟩) and the move rules ⟨p_i, σ, j, T⟩ → ⟨q_i, α_i⟩ for every i ∈ [1, k], where p_1, ..., p_k are new states. This replacement preserves determinism and the sub-testing, local, top-down, and single-use properties (but not pruning or relabeling).

Second, we may always assume that the regular tests of a tt are disjoint. For a tt M, let 𝒯_M be the set of regular tests in the left-hand sides of the rules of M.

Lemma 8
For every tt M there is an equivalent tt M′ such that the tests in 𝒯_{M′} are mutually disjoint. The construction preserves determinism and the sub-testing, local, top-down, single-use, pruning, and relabeling properties.

Proof. If T, T′ ∈ 𝒯_M are distinct and T ∩ T′ ≠ ∅, then every rule ⟨q, σ, j, T⟩ → ζ can be replaced by the two rules ⟨q, σ, j, T ∩ T′⟩ → ζ and ⟨q, σ, j, T \ T′⟩ → ζ. The transducer M′ is obtained by repeating this procedure. ✷

Third, we can extend the definition of a tt M = (Σ, ∆, Q, Q_0, R) by allowing "general rules", which can generate any finite number of output nodes, cf. [31, Lemma 2]. Simple examples of general rules are ⟨p, τ, j⟩ → σ(⟨p, up⟩, j) in Example 4 and ⟨d, e, 0⟩ → σ(e, e) in Example 5. Formally, a general rule is of the form ⟨q, σ, j, T⟩ → ζ such that ζ is a tree in T_∆(Q × I_{σ,j}), where I_{σ,j} is the usual set of instructions: stay, up (provided j ≠ 0), and down_i with i ∈ [1, rank(σ)]. If this rule is applicable to a configuration ⟨q, u⟩ of M on t ∈ T_Σ, then G_{M,t} has the rule ⟨q, u⟩ → ζ_u, where ζ_u is obtained from ζ by changing every label ⟨q′, α⟩ into ⟨q′, α(u)⟩. It is easy to see that a general rule can be replaced by the set of ordinary rules defined as follows. Let p_u be a new state for every u ∈ N(ζ). Then the rules are ⟨q, σ, j, T⟩ → ⟨p_ε, stay⟩, where ε is the root of ζ, and all rules ⟨p_u, σ, j, T⟩ → λ(⟨p_{u1}, stay⟩, ..., ⟨p_{uk}, stay⟩) where λ is the label of u in ζ and k is its rank. The first rule is a move rule that just changes state, and the latter rules output the ∆-labeled nodes of ζ one by one (λ ∈ ∆), and then make the required moves (λ ∈ Q × I_{σ,j}). This construction preserves determinism and the sub-testing, local, top-down, and single-use properties. Note that the classical top-down tree transducer has general rules.

If we allow general rules, then the stay-instruction is not needed any more in finitary tt's.
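Returning to the proof of Lemma 8, its refinement step can be sketched with finite sets standing in for regular tests (an illustrative assumption only; the actual construction relies on the closure of regular tests under intersection and difference):

```python
# Split a rule whose test T properly overlaps a distinct test T' into two
# rules with tests T & T' and T - T'; repeat until all distinct tests are
# pairwise disjoint. A rule is a pair (test, right_hand_side).
def make_disjoint(rules):
    changed = True
    while changed:
        changed = False
        tests = [T for T, _ in rules]
        for i, (T, rhs) in enumerate(rules):
            Tp = next((Tp for Tp in tests
                       if T != Tp and T & Tp and T - Tp), None)
            if Tp is not None:
                rules[i:i + 1] = [(T & Tp, rhs), (T - Tp, rhs)]
                changed = True
                break
    return rules
```

For instance, the overlapping tests {1, 2} and {2, 3} are refined into the pairwise disjoint tests {1}, {2}, {3}, while the set of inputs matched by each original rule is preserved.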
Let us say that a tt is stay-free if it does not use the stay-instruction in its rules. For every tt M (with general rules) we can construct an equivalent stay-free tt M_sf with general rules, with possibly infinitely many rules but such that the right-hand sides of rules with the same left-hand side form a regular tree language. If M is finitary, then we can transform M_sf into an equivalent stay-free tt with finitely many rules. The construction is as follows, where we may assume that the node tests in 𝒯_M are mutually disjoint, by (the proof of) Lemma 8.

For every left-hand side ⟨q, σ, j, T⟩ of a rule of M = (Σ, ∆, Q, Q_0, R) we define a regular tree grammar G_{q,σ,j,T} that simulates the computations of M, starting in a configuration ⟨q, u⟩ to which ⟨q, σ, j, T⟩ is applicable, without leaving the current node u, i.e., executing stay-instructions only. Its set of nonterminals is {⟨q′, stay⟩ | q′ ∈ Q} with initial nonterminal ⟨q, stay⟩. Its set of terminals is ∆ ∪ D_{σ,j}, where D_{σ,j} = Q × (I_{σ,j} \ {stay}), each element of which has rank 0. Finally, if ⟨q′, σ, j, T⟩ → ζ is a rule of M (with q′ ∈ Q and the same σ, j, and T), then G_{q,σ,j,T} has the rule ⟨q′, stay⟩ → ζ.

We now define M_sf = (Σ, ∆, Q, Q_0, R_sf) where R_sf consists of all general rules ⟨q, σ, j, T⟩ → ζ such that ζ ∈ L(G_{q,σ,j,T}), for every left-hand side ⟨q, σ, j, T⟩ of a rule of M. Even if M_sf has infinitely many rules, it should be clear that (with all the definitions as in the finite case) M_sf is equivalent to M.

Note that if M is deterministic, then so is M_sf, because G_{q,σ,j,T} is forward deterministic and hence L(G_{q,σ,j,T}) is empty or a singleton. Thus, M_sf has finitely many rules.

Assume now that M, and hence M_sf, is finitary. Let ⟨q, σ, j, T⟩ be the left-hand side of a rule of M, and let D ⊆ D_{σ,j}. If M_sf has infinitely many rules ⟨q, σ, j, T⟩ → ζ with ζ ∈ T_∆(D), then we remove those rules from R_sf.
In fact, if M_sf would have a computation ⟨q_0, root_t⟩ ⇒*_{M_sf,t} s with q_0 ∈ Q_0 in which one of those rules is applied, then it would have a similar computation (with the same q_0 and t, but, in general, another s) in which any other of those rules is applied. Since s contains at least as many occurrences of symbols in ∆ as ζ, that would contradict the finitariness of M_sf. Removing all these rules, for every D ⊆ D_{σ,j}, we are left with an equivalent version of M_sf with finitely many rules. The construction is effective because L(G_{q,σ,j,T}) ∩ T_∆(D) is a regular tree language and hence its finiteness can be decided.

The above constructions also preserve the sub-testing, local, top-down, and single-use properties. Note that if M is a finitary tt^s_↓ or tt^ℓ_↓, then M_sf is a classical top-down tree transducer (after incorporating the child number in its finite state), with or without regular look-ahead, respectively.

In this section we discuss some basic properties of tt's with respect to the feature of regular look-around. We start with the simple fact that the domain of a tt can always be restricted to a regular tree language, except when the tt is local.

Lemma 9
For every tt M and every L ∈ REGT there is a tt M′ such that τ_{M′} = {(t, s) ∈ τ_M | t ∈ L}. The construction preserves determinism and the sub-testing, top-down, single-use, pruning, and relabeling properties.

Proof.
The tt M′ simulates M, but additionally verifies that the input tree t is in L, by using the regular sub-test T(L) at the root of t. Formally, M′ is obtained from M by changing every rule ⟨q_0, σ, 0, T⟩ → ζ into ⟨q_0, σ, 0, T ∩ T(L)⟩ → ζ, for every initial state q_0. ✷

In the remainder of this section we show how to separate the regular look-around from a tt, by incorporating it into another tt. We first prove that every tt M can be decomposed into a deterministic relabeling tt N and a local tt M′. The relabeling tt N preprocesses the input tree t by adding to the label of each node u of t the truth values of the regular tests of M at that node. This allows M′, during its simulation of M, to inspect the new label of u instead of testing u. The idea is similar to that of removing regular look-ahead in [20, Theorem 2.6]. The translation realized by N is called an mso relabeling in [7, 14] and [29, Section 4].

Lemma 10 TT ⊆ dTT_rel ∘ TT^ℓ, i.e., for every tt M there are a deterministic relabeling tt N and a local tt M′ such that τ_N ∘ τ_{M′} = τ_M. The construction preserves determinism, the top-down property, and the pruning property.

Proof.
Let M = (Σ, ∆, Q, Q_0, R) be a tt, and let 𝒯 be the set of regular tests in the left-hand sides of the rules in R. By Lemma 8 we may assume that the tests in 𝒯 are mutually disjoint. Now let 𝒯_⊥ = 𝒯 ∪ {⊥} where ⊥ is the intersection of the complements of the tests in 𝒯. Thus, for every t ∈ T_Σ and u ∈ N(t), (t, u) belongs to a unique node test in 𝒯_⊥. Let Σ × 𝒯_⊥ be the ranked alphabet such that ⟨σ, T⟩ has the same rank as σ.

We define the relabeling tt N = (Σ, Σ × 𝒯_⊥, {p}, p, R_N) such that for every σ ∈ Σ, j ∈ [0, mx_Σ], and T ∈ 𝒯_⊥, the output rule ⟨p, σ, j, T⟩ → ⟨σ, T⟩(⟨p, down_1⟩, ..., ⟨p, down_m⟩) is in R_N, where m is the rank of σ. Additionally we define the local tt M′ = (Σ × 𝒯_⊥, ∆, Q, Q_0, R′) with the following rules. If ⟨q, σ, j, T⟩ → ζ is a rule in R, then R′ contains the rule ⟨q, ⟨σ, T⟩, j⟩ → ζ. Note that N is total and deterministic. Also, if M is deterministic, then so is M′. It should be clear that τ_{M′}(τ_N(t)) = τ_M(t) for every t ∈ T_Σ, i.e., τ_N ∘ τ_{M′} = τ_M. ✷

We will also need a variant of this lemma, for nondeterministic tt's only.

Lemma 11 TT^s ⊆ TT^ℓ_rel ∘ TT^ℓ and TT^s_pru ⊆ TT^ℓ_rel ∘ TT^ℓ_pru.

Proof.
Let M = (Σ, ∆, Q, Q_0, R) be a sub-testing tt, and let 𝒯 be the set of regular tests in the left-hand sides of the rules in R. As in the proof of Lemma 10 we may assume that the tests in 𝒯 are mutually disjoint (by Lemma 8), and we define 𝒯_⊥ = 𝒯 ∪ {⊥} as in that proof. Let 𝒯_⊥ = {T(L_1), ..., T(L_n)} where L_1, ..., L_n are regular tree languages. Clearly, there is a bottom-up finite-state tree automaton A = (Σ, P, F, δ) (where F is irrelevant) and a partition {F_1, ..., F_n} of P such that for every t ∈ T_Σ and i ∈ [1, n], t ∈ L_i if and only if δ(t) ∈ F_i. We define the local relabeling tt N = (Σ, Σ × 𝒯_⊥, P, P, R_N) such that it nondeterministically simulates A top-down. For every σ ∈ Σ of rank m, every sequence of states p_1, ..., p_m ∈ P, and every j ∈ [0, mx_Σ], if δ(σ, p_1, ..., p_m) = p ∈ F_i, then R_N contains the rule ⟨p, σ, j⟩ → ⟨σ, T(L_i)⟩(⟨p_1, down_1⟩, ..., ⟨p_m, down_m⟩). The local tt M′ is defined as in the proof of Lemma 10. ✷

The next lemma is based on the folklore technique of computing the states of a bottom-up finite-state tree automaton that are "successful" at the current node (see, e.g., the proofs of [7, Theorem 10] and [6, Theorem 8]). The lemma shows that every top-down tt is equivalent to one that is sub-testing, and hence to a classical top-down tree transducer with regular look-ahead if it is finitary. It is a slight generalization of the fact that every mso relabeling can be computed by a top-down tree transducer with regular look-ahead, as shown in [7, Theorem 10] and [31, Theorem 4.4].

Lemma 12 TT_↓ = TT^s_↓. The construction preserves determinism, pruning, and relabeling.

Proof.
Let M = (Σ, ∆, Q, Q_0, R) be a tt_↓ that uses a regular test T over Σ in its rules. For simplicity we first assume that M uses T in each of its rules. Let A = (Σ × {0, 1}, P, F, δ) be a bottom-up finite-state tree automaton that recognizes mark(T). We identify the symbols (σ, 0) and σ; thus, A can also handle trees over Σ. For every tree t ∈ T_Σ and every node u ∈ N(t), we define the set succ_t(u) of successful states of A at u to consist of all states p ∈ P such that A recognizes t when started at u in state p. To be precise, succ_t(root_t) = F and if u has label σ ∈ Σ^(m) and i ∈ [1, m], then succ_t(ui) is the set of all states p ∈ P such that δ(σ, p_1, ..., p_{i−1}, p, p_{i+1}, ..., p_m) ∈ succ_t(u), where p_j = δ(t|_{uj}), i.e., p_j is the state in which A arrives at the j-th child of u, for every j ∈ [1, m] \ {i}. Obviously, mark(t, u) is recognized by A if and only if δ((σ, 1), δ(t|_{u1}), ..., δ(t|_{um})) ∈ succ_t(u), where σ is the label of u.

For every σ ∈ Σ^(m) and every sequence of states p_1, ..., p_m ∈ P let L_{σ,p_1,...,p_m} be the regular tree language consisting of all trees σ(t_1, ..., t_m) ∈ T_Σ such that δ(t_i) = p_i for every i ∈ [1, m]. Thus, the regular sub-test T(L_{σ,p_1,...,p_m}) verifies that A arrives at the i-th child of the current node in state p_i for every i ∈ [1, m].

We construct a sub-testing tt_↓ M′ = (Σ, ∆, Q′, Q′_0, R′) that is equivalent to M. It keeps track of succ_t(u) in its finite state. Its set of states is Q′ = Q × {S | S ⊆ P} with set of initial states Q′_0 = {(q_0, F) | q_0 ∈ Q_0}. The set of rules R′ is defined as follows. Let ⟨q, σ, j, T⟩ → ζ be a rule in R, let S ⊆ P, and let p_1, ..., p_m ∈ P be such that δ((σ, 1), p_1, ..., p_m) ∈ S, where m = rank_Σ(σ). Then R′ contains the rule ⟨(q, S), σ, j, T(L_{σ,p_1,...,p_m})⟩ → ζ′ where ζ′ is obtained from ζ by changing every ⟨q′, stay⟩ into ⟨(q′, S), stay⟩ and every ⟨q′, down_i⟩ into ⟨(q′, S_i), down_i⟩ with S_i = {p ∈ P | δ(σ, p_1, ..., p_{i−1}, p, p_{i+1}, ..., p_m) ∈ S}.

In the general case where M uses regular tests T_1, ..., T_n, the transducer M′ must keep track of succ_t(u) for each of the corresponding bottom-up finite-state tree automata A_1, ..., A_n. ✷

The proof of Lemma 12 also shows that in a rule ⟨q, σ, j, T(L)⟩ → ζ of a sub-testing tt_↓ we may assume that L is of the form L = σ(L_1, ..., L_m) = {σ(t_1, ..., t_m) | t_1 ∈ L_1, ..., t_m ∈ L_m} for regular tree languages L_1, ..., L_m (where m = rank(σ)). This is how regular look-ahead is usually defined for classical top-down tree transducers.

By Lemmas 10 and 12, dTT ⊆ dTT^s_↓ ∘ dTT^ℓ. It is proved in [28, Lemmas 49 and 50] that even dTT ⊆ dTT^ℓ_↓ ∘ dTT^ℓ, but this will not be needed in what follows. Using Lemmas 10 and 12 we can now prove three essential properties of tt's, based on well-known results from the literature.

Lemma 13
The regular tree languages are closed under inverses of tt translations, i.e., if L ∈ REGT and τ ∈ TT, then τ⁻¹(L) ∈ REGT.

Proof.
Since the inverse of a composition is the composition of the inverses, it suffices to show this for dTT_srel and TT_ℓ, by Lemmas 10 and 12. For dTT_srel it follows from [20, Theorem 2.6 and Lemma 1.2], and for TT_ℓ it is proved in [26, Lemma 3]. ✷

Corollary 14
The domain of a tt M is regular, i.e., dom(M) ∈ REGT. More generally, for every k ≥ 1, if τ ∈ TT^k then dom(τ) ∈ REGT.

Corollary 14 was proved for (nondeterministic) attributed tree transducers in [5], from which it is easy to conclude that Lemma 13 holds for attributed tree transducers, as explained in [26, Lemma 3].
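The successful-state sets succ_t(u) from the proof of Lemma 12 can be computed concretely. The following is a minimal Python sketch for a deterministic bottom-up finite-state tree automaton (the marked alphabet is ignored here); the tree encoding and function names are ours, not the paper's.

```python
# A concrete illustration of the sets succ_t(u) from the proof of Lemma 12,
# for a *deterministic* bottom-up finite-state tree automaton.  Trees are
# nested tuples (symbol, child_1, ..., child_m); delta maps
# (symbol, tuple_of_child_states) to a state.

def run(delta, t):
    """State in which the automaton arrives at the root of t (bottom-up)."""
    sym, children = t[0], t[1:]
    return delta[(sym, tuple(run(delta, c) for c in children))]

def succ(delta, F, t, u):
    """succ_t(u): all states p such that the automaton accepts t when
    (re)started at node u in state p.  Node u is a tuple of 1-based child
    indices, () being the root.  Computed top-down, as in the text:
    succ_t(root) = F, and p is successful at child i of node u iff replacing
    the i-th child state by p keeps the parent's state inside succ_t(u)."""
    all_states = {q for k in delta for q in k[1]} | set(delta.values())
    if u == ():
        return set(F)
    parent, i = u[:-1], u[-1]
    S = succ(delta, F, t, parent)
    node = t
    for j in parent:          # walk down to the parent node
        node = node[j]        # children of a node are node[1], ..., node[m]
    sym = node[0]
    child_states = [run(delta, node[j]) for j in range(1, len(node))]
    good = set()
    for p in all_states:
        cs = list(child_states)
        cs[i - 1] = p
        if delta.get((sym, tuple(cs))) in S:
            good.add(p)
    return good

# Example: trees over binary b and leaves a, c; the automaton accepts trees
# containing at least one a-leaf (state 1 = "an a was seen").
delta = {('a', ()): 1, ('c', ()): 0,
         ('b', (0, 0)): 0, ('b', (0, 1)): 1, ('b', (1, 0)): 1, ('b', (1, 1)): 1}
F = {1}
t = ('b', ('c',), ('a',))
```

For the example tree b(c, a), the left child may carry any state (the a-leaf on the right already forces acceptance), while the right child must carry state 1.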
Lemma 15
The regular tree languages are closed under pruning tt translations, i.e., if L ∈ REGT and τ ∈ TT_pru, then τ(L) ∈ REGT.

Proof.
By Lemma 12, TT_pru = TT_spru. As observed before, every τ ∈ TT_spru can be realized by a classical linear top-down tree transducer with regular look-ahead. It is well known that, due to linearity, REGT is closed under such translations, see, e.g., [43, Corollary IV.6.7]. ✷

(In [28], dTT and dTT_ℓ are denoted by dTT_mso and dTT, respectively. We note that an alternative proof is by Lemma 26 (in Section 6) and [34, Theorem 7.4] (see also [65, Section 5]). For the reader familiar with mso translations, see [14], we note that it is proved in [29, Section 4] that dTT_srel is the class of mso (tree) relabelings, and that REGT, which is the class of mso definable tree languages, is closed under inverse mso (tree) transductions by [14, Corollary 7.12].)

Node tests of a tt M can be shown to be regular by defining them in terms of, e.g., the domains of other tt's or of variants of M itself. In other words, a tt can use tt's "to look around". For instance, Lemma 13 is used for this purpose in the proof of Lemma 16 below, where we show the following. In a composition of a d tt with a sub-testing tt, the second transducer can even be assumed to be local, because the first transducer can determine the truth values of the regular sub-tests of the output tree by executing appropriate regular tests on its input tree.

Lemma 16 dTT ∘ TT_s ⊆ dTT ∘ TT_ℓ. The construction preserves determinism (of the second transducer) and the top-down, single-use, pruning, and relabeling properties of both transducers.

Proof.
Let M₁ = (Σ, ∆, Q₁, q₀, R₁) be a d tt and let M₂ be a sub-testing tt with input alphabet ∆. We will construct a d tt M₁′ and a local tt M₂′ that simulate the composition of M₁ and M₂. The construction preserves the top-down, single-use, pruning, and relabeling property of each transducer, i.e., if M₁ has one of these properties, then so has M₁′, and similarly for M₂ and M₂′. Moreover, if M₂ is deterministic, then so is M₂′.

Let (t, s) ∈ τ_{M₁}. The d tt M₁′ simulates M₁ on the input tree t. Simultaneously it executes the sub-tests of M₂ at every node v of the output tree s and preprocesses s by adding to the label of v the truth values of these sub-tests at v, cf. the text before Lemma 10. This allows M₂′, during its simulation of M₂ on s, to inspect the new label of v instead of sub-testing v.

Every node of s is produced by an output rule of M₁ during its computation on t. Let s̄ be an output form of M₁ on t, and let v be a leaf of s̄ with label ⟨q, u⟩. It should be clear that ⟨q, u⟩ ⇒*_{M₁,t} s|_v. Now let L be a regular tree language over ∆ such that M₂ uses the sub-test T′ = T(L). We claim that, in configuration ⟨q, u⟩, M₁′ can test whether (s, v) ∈ T′ by a regular test inv_q(T′). Note that (s, v) ∈ T(L) if and only if s|_v ∈ L. Thus, inv_q(T′) should test whether the output tree generated by the configuration ⟨q, u⟩ is in L. To prove that mark(inv_q(T′)) is regular, we define a d tt N_q such that mark(inv_q(T′)) = τ⁻¹_{N_q}(L), and we use Lemma 13. The transducer N_q first uses a regular test at the root to verify that the input tree is of the form mark(t, u) (to be precise, the regular sub-test T(mark(T•_Σ))). After that it walks to the (unique) marked node u, using move rules to execute a depth-first search of the input tree, and then simulates M₁ starting in state q at u, producing the output tree s|_v. During that simulation it treats each symbol (σ, 0) or (σ, 1) as σ, and for each regular test T of M₁ it instead uses the test μ(T), which is the set of all (mark(t, u), v) such that (t, v) ∈ T and u ∈ N(t), see Section 2.

The construction of M₁′ and M₂′ is similar to the construction of N and M′ in the proof of Lemma 10. Let T₂ be the set of regular tests in the left-hand sides of the rules of M₂. As in the proof of Lemma 10 we may assume that the tests in T₂ are mutually disjoint (by Lemma 8), and we define T⊥ = T₂ ∪ {⊥} as in that proof. Note that the elements of T⊥ are still regular sub-tests. Note also that for every q ∈ Q₁, t ∈ dom(M₁) and u ∈ N(t), (t, u) belongs to a unique regular test in {inv_q(T′) | T′ ∈ T⊥}.
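The preprocessing step just described (attaching to every output node the truth value of a sub-test) can be pictured concretely. A minimal Python sketch, assuming a single sub-test T(L) with L given by a deterministic bottom-up tree automaton; the encoding and names are ours, not the paper's:

```python
# Trees are nested tuples (symbol, child_1, ..., child_m).  The sub-test T(L)
# holds at a node v of s iff the subtree s|_v belongs to L, i.e. iff the
# deterministic bottom-up automaton (delta, F) reaches a final state on s|_v.

def annotate(delta, F, s):
    """Return (state at the root of s, copy of s in which every label sigma
    is replaced by the pair (sigma, truth value of the sub-test at that node))."""
    sym, children = s[0], s[1:]
    results = [annotate(delta, F, c) for c in children]
    state = delta[(sym, tuple(st for st, _ in results))]
    relabeled = ((sym, state in F),) + tuple(tr for _, tr in results)
    return state, relabeled

# Example: L = "trees containing an a-leaf", over binary b and leaves a, c.
delta = {('a', ()): 1, ('c', ()): 0,
         ('b', (0, 0)): 0, ('b', (0, 1)): 1, ('b', (1, 0)): 1, ('b', (1, 1)): 1}
F = {1}
_, r = annotate(delta, F, ('b', ('c',), ('a',)))
# r == (('b', True), (('c', False),), (('a', True),))
```

A single bottom-up pass thus suffices for one sub-test; in the proof the truth values of all sub-tests in T⊥ are attached simultaneously.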
We define the d tt M₁′ = (Σ, ∆ × T⊥, Q₁, q₀, R₁′) such that R₁′ contains all move rules in R₁, and moreover, if ⟨q, σ, j, T⟩ → δ(⟨q₁, α₁⟩, . . . , ⟨q_k, α_k⟩) is an output rule in R₁, then R₁′ contains the rule ⟨q, σ, j, T ∩ inv_q(T′)⟩ → ⟨δ, T′⟩(⟨q₁, α₁⟩, . . . , ⟨q_k, α_k⟩) for every T′ ∈ T⊥. We define the local tt M₂′ with input alphabet ∆ × T⊥ and the following rules. If ⟨q, δ, j, T′⟩ → ζ is a rule of M₂, then M₂′ has the rule ⟨q, ⟨δ, T′⟩, j⟩ → ζ. It should now be clear that τ_{M₂′}(τ_{M₁′}(t)) = τ_{M₂}(τ_{M₁}(t)) for every t ∈ T_Σ, i.e., τ_{M₁′} ∘ τ_{M₂′} = τ_{M₁} ∘ τ_{M₂}. If M₁ is single-use, then M₁′ is also single-use, because M₁′ visits the nodes of the input tree in the same states as M₁; the same is true for M₂ and M₂′. Preservation of the other properties easily follows from the construction of M₁′ and M₂′. ✷

In this section we prove three composition results for tt's. Our first aim is to prove that d tt's are closed under right-composition with top-down d tt's, and hence in particular with pruning d tt's. As already mentioned at the end of the Introduction, this generalizes the result of [38, Theorem 4.3] for attributed tree transducers, because d tt's need not be total and they have regular look-around. By Lemma 12 we may assume that the top-down tt is sub-testing. It may even be assumed to be local by Lemma 16.

Lemma 17 dTT ∘ dTT_ℓ↓ ⊆ dTT. In particular dTT↓ ∘ dTT_ℓ↓ ⊆ dTT↓ and dTT_pru ∘ dTT_ℓpru ⊆ dTT_pru.

Proof.
Since the domain of a tt can always be restricted to dom(M₁) by Lemma 9 and Corollary 14, it suffices to show that for every d tt M₁ and every local top-down d tt M₂, a d tt M can be constructed such that τ_M(t) = τ_{M₂}(τ_{M₁}(t)) for every input tree t ∈ dom(M₁). For the case where M₁ is also local this construction was presented in the proof of [28, Theorem 55], which can easily be adapted to the general case. We repeat it here for completeness sake, and because the proofs of the other two composition closure results will be based on it.

The transducer M is obtained by a straightforward product construction. For every (t, s) ∈ τ_{M₁}, M simulates M₁ on the input tree t until M₁ uses an output rule that generates a node v of s. Then M switches to the simulation of M₂ on v, as long as M₂ executes stay-instructions. When M₂ executes a down_i-instruction, M switches again to the simulation of M₁ in order to generate the i-th child of v.

Formally, let M₁ = (Σ, ∆, P, p₀, R₁) and M₂ = (∆, Γ, Q, q₀, R₂). To simplify the construction of M we assume that M₁ keeps track in its finite state of the child number of the output node to be generated. To be precise, we assume that there is a mapping χ : P → [0, mx_∆] such that for every output form s′ and every leaf v of s′ that is labeled by a configuration ⟨p, u⟩, the child number of v in s′ is χ(p). That is possible because the output tree is generated top-down. If M₁ does not satisfy this assumption, then we change M₁ as follows. The new set of states is P × [0, mx_∆], and we define χ(p, i) = i. The new initial state is (p₀, 0), because M₁ starts by generating the root of the output tree. Each move rule ⟨p, σ, j, T⟩ → ⟨p′, α⟩ of M₁ is changed into the rules ⟨(p, i), σ, j, T⟩ → ⟨(p′, i), α⟩ and each output rule ⟨p, σ, j, T⟩ → δ(⟨p₁, α₁⟩, . . . , ⟨p_k, α_k⟩) into ⟨(p, i), σ, j, T⟩ → δ(⟨(p₁, 1), α₁⟩, . . . , ⟨(p_k, k), α_k⟩), for every i ∈ [0, mx_∆].
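The product construction behind this proof is easiest to see in the stay-free, test-free special case: the composed transducer evaluates the second machine directly on the rules of the first, without ever materializing the intermediate tree. A minimal Python sketch for classical total deterministic top-down tree transducers in the normal form where every output rule produces exactly one output symbol; encoding and names are ours, not the paper's:

```python
# A rule set maps (state, input_symbol) to a right-hand side; a right-hand
# side is a nested tuple over the output alphabet whose leaves may be
# ("call", q, i), meaning <q, down_i>.  No tests, no stay/up instructions.

def apply_td(rules, q, t):
    """Run one top-down transducer from state q on input tree t."""
    def expand(x):
        if x[0] == "call":
            _, q2, i = x
            return apply_td(rules, q2, t[i])     # continue at the i-th child
        return (x[0],) + tuple(expand(c) for c in x[1:])
    return expand(rules[(q, t[0])])

def compose_run(rules1, p, rules2, q, t):
    """Evaluate M2(M1(t)) directly, without building the intermediate tree:
    a state of the product transducer is a pair (p, q)."""
    rhs1 = rules1[(p, t[0])]                     # M1's rule: delta(calls...)
    delta_sym, calls1 = rhs1[0], rhs1[1:]
    def expand(x):
        if x[0] == "call":
            _, q2, l = x                         # M2 descends to child l of
            _, p_l, i_l = calls1[l - 1]          # M1's output node ...
            return compose_run(rules1, p_l, rules2, q2, t[i_l])
        return (x[0],) + tuple(expand(c) for c in x[1:])
    return expand(rules2[(q, delta_sym)])

# Example: M1 relabels f/a to g/e; M2 relabels g/e to h/z.
rules1 = {('p', 'f'): ('g', ('call', 'p', 1), ('call', 'p', 2)),
          ('p', 'a'): ('e',)}
rules2 = {('q', 'g'): ('h', ('call', 'q', 1), ('call', 'q', 2)),
          ('q', 'e'): ('z',)}
t = ('f', ('a',), ('a',))
```

Here compose_run(rules1, 'p', rules2, 'q', t) agrees with running the two transducers in sequence; the general proof additionally handles move rules, stay-instructions, and regular tests.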
For the sake of the proofof Lemma 22 we note that this transformation of M preserves the single-useproperty, because we have only added information to the states of M .The d tt M has input alphabet Σ and output alphabet Γ. Its states are ofthe form ( p, q ) or ( ρ, q ), where p ∈ P , q ∈ Q , and ρ is an output rule of M , i.e., arule of the form h p, σ, j, T i → δ ( h p , α i , . . . , h p k , α k i ). Its initial state is ( p , q ).A state ( p, q ) is used by M to simulate the computation of M that generatesthe next current node of M when M moves down (keeping the state q of M in memory). Initially M simulates the computation of M that generates theroot of the output tree. A state ( ρ, q ) is used by M to simulate the computationof M on the node that M has generated with rule ρ . The rules of M aredefined as follows.First, rules that simulate M . Let ρ : h p, σ, j, T i → ζ be a rule in R . If ζ = h p ′ , α i , then M has the rules h ( p, q ) , σ, j, T i → h ( p ′ , q ) , α i for every q ∈ Q .If ρ is an output rule, then M has the rules h ( p, q ) , σ, j, T i → h ( ρ, q ) , stay i forevery q ∈ Q .Second, rules that simulate M . Let h q, δ, i i → ζ be a rule in R and let ρ : h p, σ, j, T i → δ ( h p , α i , . . . , h p k , α k i ) be an output rule in R , with the same δ and with χ ( p ) = i . Then M has the rule h ( ρ, q ) , σ, j, T i → ζ ′ where ζ ′ is obtainedfrom ζ by changing every h q ′ , stay i into h ( ρ, q ′ ) , stay i , and every h q ′ , down ℓ i into h ( p ℓ , q ′ ) , α ℓ i . Note that the test on σ , j , and T is actually superfluous, becausethat was already tested when M included ρ in its state.It is easy to see that τ M ( t ) = τ M ( τ M ( t )) for every input tree t ∈ dom( M ).If the rules of M do not contain stay-instructions, then M does not need thestates ( ρ, q ). Its rules can then be simplified as follows. Let h p, σ, j, T i → ζ bea rule in R . 
As above, if ζ = ⟨p′, α⟩, then M has the rules ⟨(p, q), σ, j, T⟩ → ⟨(p′, q), α⟩ for every q ∈ Q. If ζ = δ(⟨p₁, α₁⟩, . . . , ⟨p_k, α_k⟩) and ⟨q, δ, i⟩ → ζ′ is a rule in R₂, with the same δ and with χ(p) = i, then M has the rule ⟨(p, q), σ, j, T⟩ → ζ″, where ζ″ is obtained from ζ′ by changing every ⟨q′, down_ℓ⟩ into ⟨(p_ℓ, q′), α_ℓ⟩. This shows that if both M₁ and M₂ are pruning, then M is pruning too. ✷

We obtain our first composition closure result from Lemmas 12, 16, and 17. Note that the closure under composition of dTT↓ already follows from Lemma 12 and [20, Theorem 2.11(2)].

Theorem 18 dTT ∘ dTT↓ ⊆ dTT. In particular, dTT↓ and dTT_pru are closed under composition.

Theorem 18 can be used to show that in a composition of two d tt's we may always assume that the second one is local (thus strengthening Lemma 16): by Lemma 10 the second tt can be decomposed into a top-down tt and a local tt, and then (by Theorem 18) the top-down one can be absorbed by the first tt. Hence dTT ∘ dTT ⊆ dTT ∘ dTT↓ ∘ dTT_ℓ ⊆ dTT ∘ dTT_ℓ. This was already proved in [28, Theorem 53] by means of pebble tree transducers.

Our second composition result generalizes Theorem 18 to nondeterministic tt's, restricted to right-composition with pruning tt's. The proof of the next lemma is similar to that of Lemma 17.

Lemma 19 TT ∘ TT_ℓpru ⊆ TT. In particular TT↓ ∘ TT_ℓpru ⊆ TT↓ and TT_pru ∘ TT_ℓpru ⊆ TT_pru.

Proof.
Let M₁ = (Σ, ∆, P, P₀, R₁) be a tt and M₂ = (∆, Γ, Q, Q₀, R₂) a local pruning tt. The construction of the transducer M such that τ_M = τ_{M₁} ∘ τ_{M₂} is a straightforward variant of the one in the last paragraph of the proof of Lemma 17. This time, we do not verify at the start that the input tree is in the domain of M₂; instead it has to be checked at each step of M that M₁ can produce an output tree, in particular when M₂ deletes part of that output tree (cf. the proof of [20, Lemma 2.9]).

We define M = (Σ, Γ, P × Q, P₀ × Q₀, R) as follows. As in the proof of Lemma 17 we assume that M₁ keeps track in its finite state of the child number of the output node to be generated, through a mapping χ : P → [0, mx_∆]. Let ⟨p, σ, j, T⟩ → ζ be a rule in R₁. As before, if ζ = ⟨p′, α⟩, then M has the rules ⟨(p, q), σ, j, T⟩ → ⟨(p′, q), α⟩ for every q ∈ Q. If ζ = δ(⟨p₁, α₁⟩, . . . , ⟨p_k, α_k⟩) and ⟨q, δ, i⟩ → ζ′ is a rule in R₂, with the same δ and with χ(p) = i, then M has the rule ⟨(p, q), σ, j, T ∩ T′⟩ → ζ″, where ζ″ is obtained (as before) from ζ′ by changing every ⟨q′, down_ℓ⟩ into ⟨(p_ℓ, q′), α_ℓ⟩, and the node test T′ consists of all (t, u) such that for every ℓ ∈ [1, k] there exists a computation ⟨p_ℓ, α_ℓ(u)⟩ ⇒*_{M₁,t} s_ℓ for some s_ℓ ∈ T_∆. Thus, the only difference with the proof of Lemma 17 is the additional test T′. In fact, it suffices that T′ tests every ℓ ∈ [1, k] for which down_ℓ does not occur in ζ′. That guarantees the existence of an output tree of M₁ on which M₂ is simulated by M. It should be clear that T′ is regular by Corollary 14: it can be written as ⋂_{ℓ∈[1,k]} T′_ℓ, where mark(T′_ℓ) is the domain of a tt that walks to node α_ℓ(u) and then simulates M₁ starting in state p_ℓ.

We note that this construction does not work for an arbitrary top-down M₂ without stay-instructions.
If some down ℓ occurs twice in ζ ′ , then there are twooccurrences h ( p ℓ , q ′ ) , α ℓ i and h ( p ℓ , q ′′ ) , α ℓ i in ζ ′′ and it is not guaranteed (as itshould) that from both occurrences the same output subtree of M is generatedby M . We finally note that, as in the proof of Lemma 17, if both M and M are pruning, then so is M . ✷ We obtain our second composition result from Lemma 12, the second inclu-sion of Lemma 11, and two applications of Lemma 19 (taking into account that TT ℓ rel ⊆ TT ℓ pru ). Theorem 20 TT ◦ TT pru ⊆ TT . In particular TT ↓ ◦ TT pru ⊆ TT ↓ , and TT pru isclosed under composition. Hence, also in a composition of two nondeterministic tt ’s we may alwaysassume that the second one is local: TT ◦ TT ⊆ TT ◦ dTT rel ◦ TT ℓ ⊆ TT ◦ TT ℓ by Lemma 10 and Theorem 20, respectively.The range of a deterministic tt M can be restricted to a regular tree lan-guage L by restricting its domain to τ − M ( L ), using Lemmas 9 and 13. For anondeterministic tt we can use the next corollary. Corollary 21
The translation τ′ = {(t, s) ∈ τ | s ∈ L} is in TT for every τ ∈ TT and L ∈ REGT. If τ is in TT↓ or TT_pru, then so is τ′.

Proof. Let Σ be the output alphabet of τ and let A = (Σ, P, F, δ) be a bottom-up finite-state tree automaton such that L(A) = L. Obviously τ′ = τ ∘ τ_L, where τ_L is the identity on L, and obviously τ_L ∈ TT_ℓrel: it is realized by the local relabeling tt (Σ, Σ, P, F, R) where R consists of all rules ⟨p, σ, j⟩ → σ(⟨p₁, down₁⟩, . . . , ⟨p_m, down_m⟩) such that δ(σ, p₁, . . . , p_m) = p. By Theorem 20, τ′ satisfies the requirements. ✷

Our third composition result is that deterministic tt's are closed under left-composition with (deterministic) single-use tt's. This is a variant of one of the main results of [40, 41, 45] for (a variant of) attribute grammars, cf. the last paragraph of [7]. It is proved for attributed tree transducers in [56, Theorem 3] (see also [55, Satz 6.5]).

Lemma 22 dTT_su ∘ dTT_ℓ ⊆ dTT.

Proof.
Let M = (Σ , ∆ , P, p , R ) and M = (∆ , Γ , Q, q , R ) be a single-use d tt and a local d tt , respectively. We extend the proof of Lemma 17 to thecase that M is an arbitrary local d tt . Thus, we have to deal with the factthat now M can also move up on the output tree of M . Let ( t, s ) ∈ τ M , andlet d be the derivation tree of the computation h p , root t i ⇒ ∗ M ,t s . Since M issingle-use, we can identify each node of d that is labeled by a configuration withthat configuration, because a configuration h p, u i of M occurs at most oncein d . Suppose that M , in configuration h p, u i on t , has generated a node v of s .When M executes an up-instruction at node v , the new transducer M has tobacktrack on the computation of M , back to the moment that the parent of v in s was generated by M . Thus, starting with the configuration h p, u i of M , M has to determine the ancestors of h p, u i in d , and stop at the first ancestorthat is a configuration generating an output node. Since M is single-use, eachconfiguration h p, u i has a unique parent configuration h p ′ , u ′ i in d . That allowsus to find h p ′ , u ′ i by a regular test, as follows.For every p, p ′ ∈ P and every instruction α of M , we will define a regulartest T p,p ′ ,α such that for every t ∈ dom( M ) and u ∈ N ( t ), ( t, u ) ∈ T p,p ′ ,α if and only if h p ′ , α ( u ) i is the parent of h p, u i in the derivation tree of thecomputation h p , root t i ⇒ ∗ M ,t τ M ( t ). We will construct a tt N and define T p,p ′ ,α = { ( t, u ) | mark( t, u ) ∈ dom( N ) } . Then T p,p ′ ,α is regular by Corollary 14.To be able to describe N , we change notation and consider the node test T ¯ p, ¯ p ′ , ¯ α for ¯ p, ¯ p ′ ∈ P and instruction ¯ α .Let M ′ = (Σ , ∅ , P, { p } , R ′ ) be the nondeterministic tt obtained from M by changing every output rule h p, σ, j, T i → δ ( h p , α i , . . . , h p k , α k i ) into themove rules h p, σ, j, T i → h p i , α i i for every i ∈ [1 , k ]. 
Intuitively, for an input tree t ∈ dom(M₁), the tree-walking automaton M′ follows an arbitrary path in the unique derivation tree d ∈ L(G^der_{M₁,t}), from the root of d down to the leaves. Whenever M₁ branches, M′ nondeterministically follows one of those branches.

The transducer N, which is a variant of M′, has states (p, p′, α) with p, p′, α as above. The initial state is (p₀, −, −), with the second and third component fixed, but irrelevant (e.g., (p₀, p₀, stay)). On a tree mark(t, u), N uses the state (p, p′, α) to simulate the computations of M′ in state p on t, but additionally it keeps track of the previous configuration ⟨p′, α(u)⟩ of M′ in its finite state, as the pair (p′, α). (For the definition of α(u) see Section 3.) When it arrives at the marked node u in state (p̄, p̄′, ᾱ), it outputs a symbol of rank 0. Formally, let ⟨p, σ, j, T⟩ → ζ be a rule in R′, let p′ ∈ P, let α be an instruction, and let b ∈ {0, 1}. Then N has the rule ⟨(p, p′, α), (σ, b), j, μ(T)⟩ → ζ′, where ⟨p̃, down_i⟩′ = ⟨(p̃, p, up), down_i⟩, ⟨p̃, up⟩′ = ⟨(p̃, p, down_j), up⟩, and ⟨p̃, stay⟩′ = ⟨(p̃, p, stay), stay⟩ for every p̃ ∈ P and i ∈ [1, rank(σ)]. Additionally, N has the rule ⟨(p̄, p̄′, ᾱ), (σ, 1), j, μ(T)⟩ → ⊤, where ⊤ is its unique output symbol, of rank 0. Thus, if the tree-walking automaton N arrives in state (p̄, p̄′, ᾱ) at the marked node u, it can accept mark(t, u). Hence, for every t ∈ dom(M₁), N accepts mark(t, u) if and only if ⟨p̄′, ᾱ(u)⟩ is the parent of ⟨p̄, u⟩ in the derivation tree of the computation ⟨p₀, root_t⟩ ⇒*_{M₁,t} τ_{M₁}(t).

The transducer M is an extension of the one in the proof of Lemma 17. It additionally has states back_{p,q} and back*_{p,q} to simulate the first and the following backward steps of the computation of M₂.
Its rules are obtained as follows.First, it has the same rules that simulate (the forward computation of) M .Second, the rules of M that simulate M are extended in such a way that, to ob-tain ζ ′ from ζ , one has to change additionally every h q ′ , up i into h back p,q ′ , stay i .Third, M additionally has rules that simulate the backward computation of M .For each state back p,q it has all rules h back p,q , σ, j, T p,p ′ ,α i → h back ∗ p ′ ,q , α i (where the tests on σ and j are irrelevant, because M arrived in state back p,q by a stay-instruction). For each state back ∗ p,q it has the following rules. Let ρ : h p, σ, j, T i → ζ be a rule of M . If ρ is a move rule, then M has all rules h back ∗ p,q , σ, j, T ∩ T p,p ′ ,α i → h back ∗ p ′ ,q , α i . If ρ is an output rule, then M has therule h back ∗ p,q , σ, j, T i → h ( ρ, q ) , stay i . ✷ Theorem 23 dTT su ◦ dTT ⊆ dTT . Proof.
It follows from Lemmas 10, 12, and 16 that dTT_su ∘ dTT ⊆ dTT_su ∘ dTT_ℓrel ∘ dTT_ℓ. Thus, by Lemma 22, it suffices to show that dTT_su ∘ dTT_ℓrel ⊆ dTT_su. For a single-use d tt M₁ and a local relabeling d tt M₂, consider the construction of the d tt M in the last paragraph of the proof of Lemma 17. It should be clear that M is single-use: if M₁ visits an input node in state p, then M visits that node in state (p, q) for some q. ✷

It can be proved that dTT_su is closed under composition, which also follows from Proposition 29 in the next section. The inclusion dTT_su ∘ dTT_ℓrel ⊆ dTT_su in the previous proof is a special case of that.

In this section we collect some results on the connection between tt's, macro tree transducers (in short mt's) and mso tree transducers. They are taken from the literature or can easily be proved using results from the literature. This section can be skipped on first reading, except that the reader interested in linear size increase should glance at Corollaries 32 and 33.

6.1 Macro Tree Transducers

Let MT denote the class of translations realized by mt's, with unrestricted or outside-in (oi) derivation mode, let dMT denote the subclass realized by deterministic mt's, and let d_tMT denote the class of total translations in dMT (see [34] where they are denoted by MT_oi, DMT_oi, and D_tMT, respectively). We first consider the relationship between deterministic tt's and mt's. It is proved in [28, Lemma 49 and Corollary 51] that dTT ⊆ dMT, and in [14, Theorem 8.22] (see also [28, Corollary 51]) that dMT = dTT_ℓ↓ ∘ dTT_ℓ. Here we prove the following variant.

Lemma 24 dTT ⊆ dMT = dTT↓ ∘ dTT.

Proof.
We first show that dTT ↓ ◦ dMT ⊆ dMT . By Lemma 12 it suffices toshow that dTT s ↓ ◦ dMT ⊆ dMT . The inclusion dTT ℓ ↓ ◦ dMT ⊆ dMT is provedin [34, Theorem 7.6(3)]. As also argued before [32, Theorem 7.5], this impliesthe inclusion dTT s ↓ ◦ dMT ⊆ dMT as follows. By [20, Theorem 2.6] dTT s ↓ ⊆ DBQREL ◦ dTT ℓ ↓ , where DBQREL is the class of deterministic bottom-up finite-state relabelings. Hence dTT s ↓ ◦ dMT ⊆ DBQREL ◦ dMT . Since dMT is closedunder regular look-ahead by [34, Theorem 6.15], it is straightforward to provethat DBQREL ◦ dMT ⊆ dMT , similar to the proof of [34, Lemma 6.17].By Lemma 10, dTT ⊆ dTT ↓ ◦ dTT ℓ . It is proved in [31, Theorem 35 for n = 0]that dTT ℓ ⊆ dMT . Hence dTT ⊆ dTT ↓ ◦ dTT ℓ ⊆ dTT ↓ ◦ dMT ⊆ dMT , whichimplies that dTT ↓ ◦ dTT ⊆ dTT ↓ ◦ dMT ⊆ dMT . It now remains to show that dMT ⊆ dTT ↓ ◦ dTT . It is proved in [31, Section 5.5] that d t MT ⊆ dTT ℓ ↓ ◦ dTT ℓ .As shown in [34, Theorem 6.18], every translation τ ∈ dMT is the restriction toa regular tree language L of a translation τ ′ ∈ d t MT . Hence τ ′ ∈ dTT ℓ ↓ ◦ dTT ℓ and so τ ∈ dTT ↓ ◦ dTT ℓ , because the first tt can start by verifying that theinput tree is in L with a regular test at the root of t , by Lemma 9. ✷ From Lemma 24, together with Theorem 18, we obtain the following corol-lary on compositions.
Corollary 25
For every k ≥ 1, dTT^k ⊆ dMT^k = dTT↓ ∘ dTT^k ⊆ dTT^{k+1}.

The above two inclusions are proper, cf. [39, Lemma 6.54] and [34, Theorem 4.16]. In fact, the macro tree transducer is, and can be, of exponential height increase [34, Theorem 3.24]. Hence τ^{k+1}_exp is not in dMT^k, cf. the proof of Proposition 7. Also, τ^k_M is not in dTT^k, where M is an mt that translates τⁿa into τ^{2ⁿ}a (with τ of rank 1 and a of rank 0).

The relationship between nondeterministic tt's and mt's is less straightforward. On the one hand, even TT↓ is not included in MT, because all macro tree translations are finitary. But we can express every tt as a composition of two top-down tt's and an mt.

Lemma 26 TT ⊆ TT↓ ∘ TT↓ ∘ MT.

(By mistake, [31, Theorem 35] is stated for n ≥ 1 only; it also holds for n = 0 by [31, Lemma 34 and Theorem 31].)

Proof. By Lemma 10, TT ⊆ TT↓ ∘ TT_ℓ. It follows from [31, Lemmas 34 and 27] that TT_ℓ ⊆ MON ∘ MT, where MON is a specific simple subclass of TT_ℓ↓ defined before [31, Lemma 27]. We note that by Lemma 10, TT ⊆ dTT_rel ∘ TT_ℓ and that it is easy to prove that dTT_rel ∘ TT_ℓ↓ ⊆ TT↓. Hence we even obtain that TT ⊆ TT↓ ∘ MT. ✷

On the other hand, every mt can still be realized by a composition of two (finitary) tt's.

Lemma 27 MT ⊆ dTT↓ ∘ dTT ∘ TT_pru ⊆ dTT↓ ∘ fTT.

Proof.
By [34, Theorem 6.10], MT = d t MT ◦ SET , and by the proof of [34,Theorem 6.10],
SET ⊆ TT ℓ pru . Hence MT ⊆ d t MT ◦ TT ℓ pru ⊆ dTT ↓ ◦ dTT ◦ TT pru by Lemma 24. That is included in dTT ↓ ◦ f TT by Theorem 20. ✷ It can be shown that f TT ⊆ MT = dTT ↓ ◦ f TT , thus generalizing Lemma 24to the finitary case, but that will not be needed in what follows.Finally, let MT io denote the class of translations realized by mt ’s with inside-out ( io ) derivation mode (see [34]), and let mrMT io denote the class of trans-lations realized by the multi-return macro tree transducers of [49, 50], whichgeneralize io macro tree transducers. Lemma 28 MT io ⊆ mrMT io ⊆ f TT ↓ ◦ dTT . Proof.
It is shown in [34, Lemma 5.5] that MT io ⊆ f TT ℓ ↓ ◦ YIELD , and in [31,Lemma 36] that
YIELD ⊆ dTT ℓ , and so MT io ⊆ f TT ℓ ↓ ◦ dTT . It follows from [50,Lemma 4] that mrMT io ⊆ dTT ℓ ↓ ◦ MT io ◦ dTT ℓ ↓ . Hence mrMT io ⊆ dTT ℓ ↓ ◦ f TT ℓ ↓ ◦ dTT ◦ dTT ↓ which is included in f TT s ↓ ◦ dTT by [20, Theorem 2.11(2)] and Theorem 18. ✷ Let dMSOT denote the class of deterministic mso tree translations (see [14,Chapter 8], where it is denoted
DMSOT , and where mso tree translations arecalled ms -transductions of terms). The next result is a variant of the main resultof [7], which concerns attributed tree transducers with look-ahead instead of tt ’s. In its present form it is proved in [14, Theorems 8.6 and 8.7]. Proposition 29 dMSOT = dTT su . The next proposition is the main result of [32].
Proposition 30 d_tMT ∩ LSIF ⊆ dMSOT.

This can be extended to arbitrary deterministic oi macro tree translations as follows.

Lemma 31 dMT ∩ LSIF ⊆ dMSOT.

Proof. Since the domain L of any mt M is regular ([34, Theorem 7.4]), and dMT is closed under regular look-ahead ([34, Theorem 6.15]), there is a total mt M′ that extends M by the identity on the complement of L. Clearly, τ_{M′} is of linear size increase if and only if τ_M is. Hence, by Propositions 29 and 30, if τ_M is of linear size increase, then τ_{M′} is in dTT_su. And so τ_M, which is the restriction of τ_{M′} to the regular tree language L, is also in dTT_su by Lemma 9. ✷

From Lemma 24, Lemma 31, Proposition 29, and Lemma 6 we obtain the following corollary.
Corollary 32 dTT ∩ LSIF = dTT su . It is also shown in [32] that it is decidable for a total deterministic mt whether or not it is of linear size increase. That also holds for arbitrary deter-ministic mt ’s by the proof of Lemma 31, and hence also for d tt ’s by Lemma 24. Corollary 33
It is decidable for a deterministic tt whether or not it is oflinear size increase. Note that since Corollary 32 is effective, if the d tt is indeed of linear sizeincrease, then an equivalent tt su can be constructed. One of our aims is toextend Corollaries 32 and 33 to arbitrary compositions of d tt ’s. In this section we prove that for every nondeterministic top-down tt M a de-terministic top-down tt M ′ can be constructed that realizes a “uniformizer”of τ M , i.e., a subset of τ M with the same domain. This is a generalizationof [21, Lemma], where it is proved for classical nondeterministic top-down treetransducers. Note that, as opposed to the deterministic case, the nondetermin-istic top-down tt is more powerful than the classical nondeterministic top-downtree transducer with regular look-ahead, because, due to the stay-instructions,it may not be finitary, i.e., it possibly translates one input tree into infinitelymany output trees.A uniformizer of a tree translation τ is a function f such that f ⊆ τ anddom( f ) = dom( τ ). Intuitively, f selects for every input tree t ∈ dom( τ ) one ofthe elements of τ ( t ). Lemma 34
Every τ ∈ TT ↓ has a uniformizer τ ′ ∈ dTT ↓ . If τ ∈ TT pru , then τ ′ ∈ dTT pru . Proof.
Let M = (Σ, ∆, Q, Q₀, R) be a nondeterministic tt↓. Without loss of generality we assume that M has exactly one initial state q₀, i.e., Q₀ = {q₀}. We have to construct a deterministic tt↓ M′ that computes one possible output tree in τ_M(t) for every t ∈ dom(M). The idea of the proof of [21, Lemma] is to pick, at the current node of t, one of the rules that lead to the generation of an output tree (which can be checked by a regular test). However, that idea does not work here, because M may have an infinite computation on t (see [24, New Observation 5.10]). Thus, we have to be more careful. Note that an infinite computation is entirely due to the stay-instructions in the rules of M.

The stay-instructions can be removed from M by constructing the equivalent stay-free tt M_sf = (Σ, ∆, Q, {q₀}, R_sf), with general rules, as we did at the end of Section 3. Recall that we assume that the regular tests in T_M are mutually disjoint, and that the set R_sf consists of all general rules ⟨q, σ, j, T⟩ → ζ such that ζ ∈ L(G_{q,σ,j,T}), for every left-hand side ⟨q, σ, j, T⟩ of a rule of M. In this case M_sf is a top-down tt, with possibly infinitely many rules. Since its rules do not contain stay-instructions any more, it does not have infinite computations on the trees in its domain. Thus, the idea above can be applied to M_sf, which means that for every q, σ, j, and T we have to pick one general rule ⟨q, σ, j, T⟩ → ζ from R_sf, under the condition that its application leads to the generation of an output tree. This condition can be checked by a regular sub-test, as follows. Note that ζ ∈ T_∆(D_σ), where D_σ = {⟨q′, down_i⟩ | q′ ∈ Q, i ∈ [1, rank_Σ(σ)]}.

For every σ ∈ Σ, q′ ∈ Q, and i ∈ [1, rank(σ)], let T_{σ,q′,i} be the node test over Σ consisting of all (t, u) such that u has label σ in t and there is a computation ⟨q′, ui⟩ ⇒*_{M,t} s for some s ∈ T_∆.
This node test is regular by Corollary 14, because mark(T_{σ,q′,i}) is the domain of a tt M_{q′,i} that on input mark(t, u) walks to the marked node u, checks that its label is σ, moves to the i-th child of u, and then simulates M on t, starting in state q′. For every σ ∈ Σ and D ⊆ D_σ, let T_{σ,D} be the regular node test that is the intersection of all T_{σ,q′,i} such that ⟨q′, down_i⟩ ∈ D and all T•_Σ \ T_{σ,q′,i} such that ⟨q′, down_i⟩ ∉ D. Obviously the node tests T_{σ,D} are mutually disjoint.

We now define the deterministic tt↓ M′ = (Σ, ∆, Q, q₀, R′), where R′ consists of the following general rules. For every left-hand side ⟨q, σ, j, T⟩ of a rule of M and every D ⊆ D_σ, if L(G_{q,σ,j,T}) ∩ T_∆(D) ≠ ∅, then R′ contains the general rule ⟨q, σ, j, T ∩ T_{σ,D}⟩ → ζ, where ζ is a fixed element of L(G_{q,σ,j,T}) ∩ T_∆(D). It should be clear that M′ satisfies the requirements, i.e., it has the same domain as M_sf and it realizes a subset of τ_{M_sf}. Note that M′ can be constructed effectively, because L(G_{q,σ,j,T}) ∩ T_∆(D) is a regular tree language, and hence its nonemptiness can be decided and, if so, an element can be computed. Finally, the general rules of M′ can be replaced by ordinary rules, as discussed after Lemma 8. ✷

At the end of this section we prove that any function that is realized by a composition of nondeterministic tt's can also be realized by a composition of deterministic tt's. That will (only) be used to show that the results of Section 9 also hold for nondeterministic tt's and mt's. Let F be the class of all partial functions from trees to trees.

Theorem 35
For every k ≥ 0, (TT↓ ∘ TT^k) ∩ F ⊆ dTT↓ ∘ dTT^k.

Proof.
By Lemmas 26 and 27, TT ⊆ TT↓ ∘ TT↓ ∘ dTT↓ ∘ dTT ∘ TT↓. Now let τ ∈ (TT↓ ∘ TT^k) ∩ F. Then τ = τ_1 ∘ ··· ∘ τ_m where m = 5k + 1, τ_{5j} ∈ dTT for every j ∈ [1, k], and τ_i ∈ TT↓ for every i ∈ [1, m] \ {5j | j ∈ [1, k]}. By Corollary 14, the domain of a translation in TT is regular. Hence, we may assume that ran(τ_i) ⊆ dom(τ_{i+1}) for every i ∈ [1, m−1], by changing τ_i into τ̄_i for i = m, …, 1: first, τ̄_m = τ_m; second, for i < m we obtain τ̄_i from τ_i by restricting its range to dom(τ̄_{i+1}), see Corollary 21 and the paragraph preceding it. Since τ is a function, it should be clear that τ = τ′_1 ∘ ··· ∘ τ′_m where τ′_i ∈ dTT↓ is the uniformizer of τ̄_i that exists by Lemma 34 if τ̄_i ∈ TT↓, and τ′_i = τ̄_i if τ̄_i ∈ dTT. Thus, τ ∈ dTT↓ ∘ (dTT↓ ∘ dTT↓ ∘ dTT↓ ∘ dTT ∘ dTT↓)^k and so, by Theorem 18, τ ∈ dTT↓ ∘ dTT^k. ✷

Corollary 36
For every k ≥ 1, MT^k ∩ F ⊆ dMT^k.

Proof.
By the same argument as in the proof of Theorem 35, using Lemma 27 only, we obtain that MT^k ∩ F ⊆ (dTT↓ ∘ dTT ∘ dTT↓)^k. By Theorem 18 that is included in dTT↓ ∘ dTT^k, which equals dMT^k by Corollary 25. ✷

Since the inclusions in Corollary 25 are proper, as discussed after that corollary, Theorem 35 and Corollary 36 imply that TT^k and MT^k are also proper hierarchies, i.e., TT^k ⊊ TT^{k+1} and MT^k ⊊ MT^{k+1} for every k ≥ 1.

In this section we prove that every tt can be decomposed into a pruning tt and another tt such that the composition is linear-bounded. It implies that we may always assume that a composition of two tt's is linear-bounded. Recall from Section 2 that the composition of tree translations τ_1 ⊆ T_Σ × T_Δ and τ_2 ⊆ T_Δ × T_Γ is linear-bounded if there is a constant c ∈ N such that for every (t, s) ∈ τ_1 ∘ τ_2 there exists r ∈ T_Δ such that (t, r) ∈ τ_1, (r, s) ∈ τ_2, and |r| ≤ c·|s|. Formally we say that the pair (τ_1, τ_2) is linear-bounded. Recall also that for classes T_1 and T_2 of tree translations, the class T_1 ∗ T_2 consists of all translations τ_1 ∘ τ_2 such that τ_1 ∈ T_1, τ_2 ∈ T_2, and (τ_1, τ_2) is linear-bounded. Two elementary properties of this class operation were stated in Lemma 1. We will prove the following theorem.

Theorem 37 TT ⊆ TT_pru ∗ TT and dTT ⊆ dTT_pru ∗ dTT.

Since pruning tt's can be absorbed to the right by arbitrary tt's (by Theorems 20 and 18), Theorem 37 can be generalized to compositions of tt's. It implies that we may always assume that a composition of a tt with any number of tt's is linear-bounded.

Corollary 38
Let k ≥ 1.
(1) TT^k ⊆ TT_pru ∗ TT^k and TT ∘ TT^k = TT ∗ TT^k, and
(2) dTT^k ⊆ dTT_pru ∗ dTT^k and dTT ∘ dTT^k = dTT ∗ dTT^k.

Proof. (1) The proof of the inclusion is by induction on k. For k = 1 it is Theorem 37. The induction step is proved as follows:
TT ∘ TT^k ⊆ TT ∘ (TT_pru ∗ TT^k) ⊆ (TT ∘ TT_pru) ∗ TT^k ⊆ TT ∗ TT^k ⊆ (TT_pru ∗ TT) ∗ TT^k ⊆ TT_pru ∗ (TT ∘ TT^k)
where the first inclusion is by the induction hypothesis and the remaining inclusions are by Lemma 1, Theorem 20 (which says that TT ∘ TT_pru ⊆ TT), Theorem 37, and Lemma 1 again. The equation now follows from the inclusions above.
(2) The proof is exactly the same as in (1), using Theorem 18 instead of Theorem 20. ✷

The remainder of this section is devoted to the proof of Theorem 37. It is essentially a variant of the proof of [25, Lemma 4.1], which is the key lemma of [25] and concerns the removal of "superfluous computations" in attribute grammars. In its turn, that proof generalized the proof of [4, Lemma 1], where this was done for top-down tree transducers (and strangely enough, the author of [25] did not mention that).

To prove Theorem 37 it suffices, by Lemma 10, Lemma 1, and Theorems 20 and 18, to consider local tt's, i.e., to prove that TT_ℓ ⊆ TT_pru ∗ TT and that dTT_ℓ ⊆ dTT_pru ∗ dTT. We prove the first and second inclusion in a first and second subsection, respectively. In the first subsection we additionally take care that the construction preserves the determinism of the given tt.

Let M = (Σ, Δ, Q, Q_0, R) be a tt. For a pair (t, s) ∈ τ_M and a computation ⟨q_0, root_t⟩ ⇒*_{M,t} s with q_0 ∈ Q_0, we say that a node u of t is productive (in that computation) if there is a q ∈ Q such that an output rule is applied to the configuration ⟨q, u⟩ in the computation. Obviously, the size of s is at least the number of productive nodes of t. For i ∈ {0, 1} we define the computation to be i-productive if all nodes of t of rank i are productive.
Moreover, the computation is productive if it is both 0-productive and 1-productive, i.e., all leaves and monadic nodes of t are productive. Finally, we define τ⁰_M to consist of all (t, s) ∈ τ_M for which there is a 0-productive computation ⟨q_0, root_t⟩ ⇒*_{M,t} s for some q_0 ∈ Q_0, and we define τ⁰¹_M to consist of all (t, s) ∈ τ_M for which there is a productive computation of that form. Since the size of t is at most twice the number of leaves plus the number of monadic nodes of t, it follows that |t| ≤ 2·|s| for every (t, s) ∈ τ⁰¹_M. (To be precise, |t| ≤ (2·|t|₀ − 1) + |t|₁, where |t|₀ and |t|₁ are the number of leaves and monadic nodes of t, respectively.)

To prove that TT_ℓ ⊆ TT_pru ∗ TT, our goal is to construct, for a given tt_ℓ M, a pruning tt N and a tt_ℓ M′ in such a way that τ_N ∘ τ_{M′} ⊆ τ_M and τ_M ⊆ τ_N ∘ τ⁰¹_{M′}. This obviously implies that τ_N ∘ τ_{M′} = τ_M. The second inclusion says that for every (t, s) ∈ τ_M there exists a tree t′ such that (t, t′) ∈ τ_N and (t′, s) ∈ τ⁰¹_{M′}. Thus, as observed above, |t′| ≤ 2·|s|, and hence (τ_N, τ_{M′}) is linear-bounded (for the constant c = 2).

To this aim, N will remove sufficiently many unproductive nodes from the input tree, and add state transition information of M to the labels of the remaining nodes, thus allowing M′ to simulate M without having to visit those unproductive nodes. Since productivity of a node of the input tree t depends on the computation of M on t, N nondeterministically guesses which nodes to remove, and uses its regular tests to determine the possible behaviour of M on the remaining nodes. To reduce the technical complexity of the proof, the construction of N and M′ will be done in two steps, removing unproductive leaves and monadic nodes in the first and second step, respectively. Recall from Section 2 that the rank of a node is the rank of its label, i.e., the number of its children.

Lemma 39 For every tt_ℓ M there are a tt_pru N and a tt_ℓ M′ such that τ_N ∘ τ_{M′} ⊆ τ_M ⊆ τ_N ∘ τ⁰_{M′}. If M is deterministic, then so is M′.

Lemma 40
For every tt_ℓ M there are a tt_pru N and a tt_ℓ M′ such that τ_N ∘ τ_{M′} ⊆ τ_M and τ⁰_M ⊆ τ_N ∘ τ⁰¹_{M′}. If M is deterministic, then so is M′.

It is easy to see that applying these lemmas one after the other, we have obtained the goal above; note that pruning tt's are closed under composition by Theorem 20. It remains to prove the two lemmas. The constructions in their proofs are similar to the removal of ε-rules and chain rules from a context-free grammar, respectively. As is well known, one should not remove these rules in the reverse order, because the removal of ε-rules can create new chain rules. Similarly in our case, we should remove unproductive leaves and monadic nodes in that order, because the removal of unproductive leaves can create new unproductive monadic nodes. Note also that removing ε-rules and chain rules in one construction is technically more complex.

Proof of Lemma 39.
Let M = (Σ, Δ, Q, Q_0, R) be a tt_ℓ. As discussed in the second paragraph after Proposition 7 (in Section 3), we may assume that the output rules of M only use the stay-instruction. Let us consider (t, s) ∈ τ_M and a computation ⟨q_0, root_t⟩ ⇒*_{M,t} s with q_0 ∈ Q_0. The idea of the construction of the tt_pru N and tt_ℓ M′ is that N (nondeterministically) preprocesses t by removing the maximal subtrees of t that consist of unproductive nodes only, and that M′ simulates M on the rest of t. Let us say that a node u of t is superfluous (in this computation) if it is unproductive and all its descendants are unproductive. Note that the root of t is not superfluous. Thus, N changes t into t′ by pruning all superfluous nodes of t. Moreover, it adds state transition information of M to the labels of the remaining nodes to allow M′ on t′ to simulate the above computation of M on t. In the resulting computation of M′ on t′, the input tree t′ of M′ has no superfluous nodes, which means in particular that all its leaves are productive. Note that, due to the removal of the superfluous nodes, each remaining node loses its superfluous children. Since the pruning tt N does not know which nodes are going to be superfluous in M's computation, it just nondeterministically removes subtrees of the input tree t and adds to the label of each remaining node all possible state transitions of M in computations on the removed subtrees that use move rules only. Whereas N just guesses the superfluous nodes, it uses its regular tests to determine the state transitions of M on those nodes.

As intermediate alphabet we use the ranked alphabet Γ consisting of all symbols ⟨σ, (i_1, …, i_n), γ⟩ such that σ ∈ Σ, n ∈ [0, rank(σ)], 1 ≤ i_1 < i_2 < ··· < i_n ≤ rank(σ), and γ ⊆ Q × Q. The rank of ⟨σ, (i_1, …, i_n), γ⟩ is n. In the case where M is deterministic we require γ to be a partial function from Q to Q.
Intuitively, a node u of t with label σ that is not removed by N will be relabeled by ⟨σ, (i_1, …, i_n), γ⟩ such that the subtrees at its children ui with i ∉ {i_1, …, i_n} are removed by N, and γ is the set of all (q, q̄) such that M has a computation from ⟨q, u⟩ to ⟨q̄, u⟩ (using move rules only) that visits one of the removed subtrees.

Formally, we define N = (Σ, Γ, {p}, {p}, R_N) with one state p. For every symbol ⟨σ, (i_1, …, i_n), γ⟩ in Γ and every j ∈ [0, mx_Σ], it has the rule
⟨p, σ, j, T⟩ → ⟨σ, (i_1, …, i_n), γ⟩(⟨p, down_{i_1}⟩, …, ⟨p, down_{i_n}⟩)
where T is defined as follows. Let t ∈ T_Σ and let u ∈ N(t). The state transition relation γ is uniquely determined by (i_1, …, i_n), and is expressed by T. Let us say that a node v ∈ N(t) is a ghost if v = uiw for some i ∉ {i_1, …, i_n} and w ∈ N*. Moreover, let us say that a computation
⟨q_1, u_1⟩ ⇒_{M,t} ⟨q_2, u_2⟩ ⇒_{M,t} ··· ⇒_{M,t} ⟨q_m, u_m⟩, m ≥ 3,
is a ghost computation from ⟨q_1, u_1⟩ to ⟨q_m, u_m⟩ if u_j is a ghost for every j ∈ [2, m−1], and u_2, …, u_{m−1} all belong to a subtree at the same child ui. Finally, for states q, q̄ ∈ Q we will write q ↪ q̄ if there is a ghost computation from ⟨q, u⟩ to ⟨q̄, u⟩. We now define T to consist of all (t, u) such that γ = {(q, q̄) ∈ Q × Q | q ↪ q̄}. Note that γ is indeed a partial function if M is deterministic. The test T is regular because it is a boolean combination of tests T_{q,q̄} = {(t, u) | q ↪ q̄}, which are regular because the tree language {mark(t, u) | q ↪ q̄} is regular for every (q, q̄) ∈ Q × Q by Corollary 14: it is the domain of a tt that first walks to u, then simulates a ghost computation of M on t from ⟨q, u⟩ to ⟨q̄, u⟩, and finally outputs a symbol of rank 0.

We define M′ = (Γ, Δ, Q, Q_0, R′) with the following rules. Let ρ: ⟨q, σ, j⟩ → ζ be a rule in R, and let ⟨σ, (i_1, …, i_n), γ⟩ be an element of Γ (with the same σ). If ρ is an output rule or ζ = ⟨q′, α⟩ with α ∈ {up, stay}, then R′ contains the rule ⟨q, ⟨σ, (i_1, …, i_n), γ⟩, j⟩ → ζ. If ζ = ⟨q′, down_{i_k}⟩ with k ∈ [1, n], then R′ contains the rule ⟨q, ⟨σ, (i_1, …, i_n), γ⟩, j⟩ → ⟨q′, down_k⟩. Otherwise (i.e., ζ = ⟨q′, down_i⟩ with i ∉ {i_1, …, i_n}), R′ contains the rule ⟨q, ⟨σ, (i_1, …, i_n), γ⟩, j⟩ → ⟨q̄, stay⟩ for every (q, q̄) ∈ γ. Note that if M is deterministic, then so is M′.

It should be clear that τ_N ∘ τ_{M′} ⊆ τ_M, because for every t′ ∈ τ_N(t) the computations of M′ on t′ simulate computations of M on t. To understand that τ_M ⊆ τ_N ∘ τ⁰_{M′}, consider a computation ⟨q_0, root_t⟩ ⇒*_{M,t} s with q_0 ∈ Q_0, and let t′ ∈ τ_N(t) be such that all superfluous nodes of t (in this computation) are removed. Then it should be clear that the computation of M on t can be simulated by a computation ⟨q_0, root_{t′}⟩ ⇒*_{M′,t′} s of M′ on t′.
Infact, if M visits a superfluous child of the current (non-superfluous) node u of t ,then M ′ just stays in the node v corresponding to u in t ′ and changes its stateto the one in which M returns to u . For a completely formal correctness proofone would have to formalize the obvious bijective correspondence f betweenthe non-superfluous nodes of t and the nodes of t ′ . In fact, f ( ε ) = ε , and if u is non-superfluous and ui , . . . , ui n are all the non-superfluous children of u ,then f ( ui k ) = f ( u ) k for every k ∈ [1 , n ]. Note that u and f ( u ) have thesame child number. However, the correctness of the construction should beclear without such a proof. The configurations h q, u i of M on t , for every non-superfluous node u , are simulated by the configurations h q, f ( u ) i of M ′ on t ′ .Finally, the above computation of M ′ on t ′ is 0-productive, because each leaf f ( u ) of t ′ corresponds to a non-superfluous node u of t of which all descendants35re superfluous, i.e., to a productive node. Since M ′ simulates M , it followsthat f ( u ) is a productive node of t ′ . This ends the proof of Lemma 39. Proof of Lemma 40.
This proof is similar to the previous one. Let M = (Σ, Δ, Q, Q_0, R) be a tt_ℓ. Again, we assume that the output rules of M only use the stay-instruction. And again, let us consider (t, s) ∈ τ_M and a computation ⟨q_0, root_t⟩ ⇒*_{M,t} s with q_0 ∈ Q_0. This time we define a node of t to be superfluous if it is unproductive (in this computation) and has rank 1. As before, N changes t into t′ by pruning all superfluous nodes of t, and adds information to the labels of the remaining nodes to allow M′ on t′ to simulate the above computation of M on t. Whereas in the previous case M′ had to shortcut the subcomputations of M on maximal subtrees of superfluous nodes, in the present case M′ has to shortcut the subcomputations of M on maximal sequences u_1, …, u_n of superfluous nodes (n ≥ 1), where u_{i+1} is the unique child of u_i for every i ∈ [1, n−1], the unique child u_{n+1} of u_n is non-superfluous, and either u_1 is the root of t, or the parent u_0 of u_1 is non-superfluous. In the second case, a subcomputation of M on u_1, …, u_n is as follows. When it moves from u_0 down to u_1, it either returns to u_0, or it walks to u_{n+1}. And when it moves from u_{n+1} up to u_n, it either returns to u_{n+1}, or it walks to u_0. In the first case, M can only move from u_{n+1} up to u_n and return to u_{n+1}. Thus, to the label of every non-superfluous node u of t we have to add information both on trips to superfluous nodes above u and trips to superfluous nodes below u. In the first case, u_{n+1} will be the root of t′. In the second case, u_{n+1} will be the i-th child of u_0 in t′, where i is the child number of u_1 in t. Thus, the child number of u_{n+1} changes from 1 to 0, or from 1 to i, respectively. As in the previous proof, the pruning tt N does not know in advance which nodes are going to be superfluous in M's computation.
Thus, it just nondeterministically removes monadic nodes of the input tree t and adds to the label of each remaining node all possible state transitions of M in subcomputations on the removed nodes that use move rules only. Rather than constructing N directly, it is more convenient to realize this pruning of t by two consecutive pruning tt's N_1 and N_2, and use Theorem 20. The local relabeling tt N_1 nondeterministically marks monadic nodes of t, by possibly changing the label σ of a monadic node into σ̂. The (deterministic) tt N_2 then removes the marked nodes, and relabels the unmarked nodes, adding the appropriate state transitions of M (determined by regular tests). Since it is easy to construct N_1, we only discuss N_2.

The intermediate alphabet Γ now consists of all symbols ⟨σ, j, U, γ⟩ such that σ ∈ Σ, j ∈ [0, mx_Σ], U ⊆ {up} ∪ {down_i | i ∈ [1, rank(σ)]}, and γ ⊆ Q × (Q × I), where I is the set of all possible instructions. The rank of ⟨σ, j, U, γ⟩ is the rank of σ. As before, in the case where M is deterministic we require γ to be a partial function from Q to Q × I. Intuitively, a node u of t with label σ that is not marked by N_1 will be relabeled by ⟨σ, j, U, γ⟩ such that j is its child number in t, α ∈ U if and only if α(u) is marked by N_1, and γ is the set of all (q, ⟨q̄, β⟩) such that the following holds: M has a computation from ⟨q, u⟩ to ⟨q̄, ū⟩ (using move rules only) that visits a maximal sequence of marked nodes, for some unmarked node ū such that β(v) = v̄, where v and v̄ are the nodes corresponding to u and ū in the tree t′.

We define N_2 = (Σ ∪ Σ̂, Γ, P, p_0, R_2), where Σ̂ = {σ̂ | σ ∈ Σ^(1)}, P = {p_j | j ∈ [0, mx_Σ]}, and R_2 is defined as follows. For every σ ∈ Σ^(1) and j, j′ ∈ [0, mx_Σ] the transducer N_2 has the rule ⟨p_j, σ̂, j′⟩ → ⟨p_j, down_1⟩. Moreover, for every ⟨σ, j, U, γ⟩ ∈ Γ and j′ ∈ [0, mx_Σ] it has the rule
⟨p_j, σ, j′, T⟩ → ⟨σ, j, U, γ⟩(⟨p_1, down_1⟩, …, ⟨p_m, down_m⟩)
where m = rank(σ) and T is defined as follows. Let t̂ be a tree over Σ ∪ Σ̂ and let u ∈ N(t̂). We define π(t̂) to be the tree over Σ that is obtained from t̂ by changing every label σ̂ into σ. Both U and γ are uniquely determined, and they are expressed by T. Let us say that a node v ∈ N(t̂) is a ghost if its label is in Σ̂. A ghost computation is defined as in the previous proof, for t = π(t̂); note that N(t) = N(t̂). And let us write ⟨q, u⟩ ↪ ⟨q̄, ū⟩ if there is a ghost computation from ⟨q, u⟩ to ⟨q̄, ū⟩. We now define T to consist of all (t̂, u) such that
• up ∈ U if and only if u has a parent and that parent is a ghost,
• down_i ∈ U if and only if ui is a ghost,
• (q, ⟨q̄, stay⟩) ∈ γ if and only if ⟨q, u⟩ ↪ ⟨q̄, u⟩,
• (q, ⟨q̄, up⟩) ∈ γ if and only if ⟨q, u⟩ ↪ ⟨q̄, ū⟩ for some ancestor ū of u,
• (q, ⟨q̄, down_i⟩) ∈ γ if and only if ⟨q, u⟩ ↪ ⟨q̄, ū⟩ for some descendant ū of ui.
As before, if M is deterministic, then γ is indeed a partial function. It is straightforward to prove, using Corollary 14, that T is regular; we leave that to the reader.

We define M′ = (Γ, Δ, Q, Q_0, R′) with the following rules. Let ρ: ⟨q, σ, j⟩ → ζ be a rule of M, and let ⟨σ, j, U, γ⟩ be in Γ (with the same σ and j). If ρ is an output rule or ζ = ⟨q′, α⟩ with α ∉ U, then R′ contains the rule ⟨q, ⟨σ, j, U, γ⟩, j′⟩ → ζ for every j′ ∈ [0, mx_Σ] (except j′ = 0 when α = up). If ζ = ⟨q′, α⟩ with α ∈ U, then R′ contains the rule ⟨q, ⟨σ, j, U, γ⟩, j′⟩ → ⟨q̄, β⟩ for every (q, ⟨q̄, β⟩) ∈ γ and every j′ ∈ [0, mx_Σ] (except j′ = 0 when β = up).

Let τ = τ_{N_1} ∘ τ_{N_2}. It should be clear that τ ∘ τ_{M′} ⊆ τ_M, as in the previous proof. To understand that τ⁰_M ⊆ τ ∘ τ⁰¹_{M′}, consider a 0-productive computation ⟨q_0, root_t⟩ ⇒*_{M,t} s with q_0 ∈ Q_0, and let t′ ∈ τ(t) be obtained from t by removing all superfluous nodes of t.
As in the previous proof, there is an obvious bijective correspondence f between the non-superfluous nodes of t and the nodes of t′. For a node u of t we define g(u) = u if u is non-superfluous, and g(u) is the first (i.e., shortest) non-superfluous descendant of u otherwise. Then f(g(ε)) = ε, and if u is non-superfluous and ui is a child of u, then f(g(ui)) = f(u)i. And as before, there is a computation ⟨q_0, root_{t′}⟩ ⇒*_{M′,t′} s of M′ on t′ that simulates the computation of M on t, such that the configurations ⟨q, u⟩ of M, for every non-superfluous node u of t, are simulated by the configurations ⟨q, f(u)⟩ of M′. Since τ does not remove leaves of t, the computation of M′ is still 0-productive. Moreover, it is also 1-productive because all unproductive monadic nodes were removed by τ. This ends the proof of Lemma 40.

Remark 41
In the Introduction we observed that our main technical result can be viewed as a static garbage collection procedure, which leads, in principle, to algorithms for automatic compiler and XML query optimization. For practical applicability our proof of this result is, however, of restricted value, because the sizes of the involved transducers are blown up exponentially. This is due to the fact that, in the proofs of Lemmas 39 and 40, the pruning tt N uses regular tests to determine the relevant state transition information γ ⊆ Q × Q (or γ ⊆ Q × (Q × I)) of the given tt M, due to its ghost computations. These regular tests are constructed through Corollary 14, applied to variants of M. Naturally, the number of states of the finite-state tree automaton recognizing the domain of such a variant is exponential in the number |Q| of states of M, cf. the proof of [26, Lemma 1]. If one now considers the proof of TT ∘ TT ⊆ TT ∗ TT in Corollary 38 (in which the pruning tt N for the second tt M is incorporated in the first tt by Theorem 20), it can be seen that the number of states of the first constructed tt is 2-fold exponential in the number of states of M. The additional exponential jump is due to Lemma 12, which turns the pruning tt N into one that is sub-testing. This implies that in the construction for the inclusion TT ∘ TT^k ⊆ TT ∗ TT^k of Corollary 38, the size of the first constructed tt can be 2k-fold exponential in the size of the given tt's. This will also hold for the deterministic version. ✷

Let M = (Σ, Δ, Q, q_0, R) be a deterministic tt. For t ∈ dom(M) we say that a node u of t is productive if it is productive in the computation ⟨q_0, root_t⟩ ⇒*_{M,t} τ_M(t), and we say that t is productive (for M) if that computation is productive, i.e., if all leaves and monadic nodes of t are productive. We define L_{M,prod} to be the set of all productive trees t ∈ dom(M). Note that τ⁰¹_M is the restriction of τ_M to L_{M,prod}. The next lemma shows that the set of productive input trees is a regular tree language.
Lemma 42
Let M = (Σ, Δ, Q, q_0, R) be a deterministic tt.
(1) There is a regular test T_{M,prod} over Σ such that for every t ∈ dom(M) and u ∈ N(t), (t, u) ∈ T_{M,prod} if and only if u is productive.
(2) L_{M,prod} is a regular tree language over Σ.

Proof. (1) Let M′ = (Σ × {0, 1}, {⊤}, Q, {q_0}, R′) be the nondeterministic tt such that ⊤ has rank 0, and R′ is defined as follows. If ⟨q, σ, j, T⟩ → ⟨q′, α⟩ is a move rule in R, then ⟨q, (σ, b), j, μ(T)⟩ → ⟨q′, α⟩ is a rule in R′ for every b ∈ {0, 1}. If ⟨q, σ, j, T⟩ → δ(⟨q_1, α_1⟩, …, ⟨q_k, α_k⟩) is an output rule in R, then R′ contains the rules ⟨q, (σ, 0), j, μ(T)⟩ → ⟨q_i, α_i⟩ for every i ∈ [k] and it also contains the rule ⟨q, (σ, 1), j, μ(T)⟩ → ⊤. Intuitively, for an input tree mark(t, u) with t ∈ dom(M), the tree-walking automaton M′ follows an arbitrary path in the unique derivation tree d ∈ L(G^der_{M,t}), from the root of d down to the leaves (cf. M′ and N in the proof of Lemma 22). Whenever M branches at an unmarked node, M′ nondeterministically follows one of those branches. It accepts mark(t, u) when an output rule is applied to the marked node u. It should be clear that T_{M,prod} = mark^{−1}(dom(M′)) satisfies the requirements. It is regular by Corollary 14. (There are several such computations, but they all have the same unique derivation tree in L(G^der_{M,t}). The definition of productivity clearly does not depend on the particular choice of the derivation.)

(2) Let M″ be a d tt that performs a depth-first left-to-right traversal of the input tree t ∈ T_Σ and verifies that (t, u) ∈ T_{M,prod} for every leaf and monadic node u of t. Then L_{M,prod} = dom(M) ∩ dom(M″), which is regular by Corollary 14. ✷

For a given deterministic tt M there are a nondeterministic pruning tt N and a deterministic tt_ℓ M′ such that τ_N ∘ τ_{M′} = τ_M and τ_M ⊆ τ_N ∘ τ⁰¹_{M′}, by Lemmas 39 and 40.
Our aim is to transform N and M′ in such a way that N becomes deterministic. We basically do this by applying Lemma 34 to τ_N, replacing it by one of its uniformizers. But to preserve the above two properties we first restrict the domain of M′ to productive input trees and then restrict the range of N to the new domain, as follows. By Lemma 42, the tree language L_{M′,prod} is regular. Let M″ be the d tt that is obtained from M′ by restricting its domain to L_{M′,prod}, see Lemma 9. Hence τ_{M″} = τ⁰¹_{M′} and so τ_N ∘ τ_{M″} = τ_M. Since M″ behaves in the same way as M′, every tree t′ ∈ dom(M″) is productive (for M″). Next, we change N into the nondeterministic pruning tt N′ by restricting its range to dom(M″), by Corollary 21. Now τ_{N′} ∘ τ_{M″} = τ_M and ran(τ_{N′}) ⊆ dom(τ_{M″}). Finally, we define τ ∈ dTT_pru to be the uniformizer of τ_{N′} according to Lemma 34. Then τ ∘ τ_{M″} = τ_M. Now consider (t, s) ∈ τ_M. Then s = τ_{M″}(r) for r = τ(t). Since r is productive for M″, it follows that |r| ≤ 2·|s| as observed at the end of the second paragraph of Section 8.1. Hence (τ, τ_{M″}) is linear-bounded, which shows that τ_M ∈ dTT_pru ∗ dTT.

In this section we show our first main result: the hierarchy of tt's collapses for functions of linear size increase.

Theorem 43
For every k ≥ 1, dTT^k ∩ LSIF = dTT_su.

Proof.
The proof is by induction on k. For k = 1 it is Corollary 32. To prove that dTT^{k+1} ∩ LSIF ⊆ dTT_su, let τ ∈ dTT^k and let M be a d tt such that τ_M ∘ τ ∈ LSIF. By Corollary 38(2) we may assume that (τ_M, τ) is linear-bounded. Moreover, by restricting the domain of M to dom(τ_M ∘ τ) we may assume that ran(τ_M) ⊆ dom(τ), see Lemma 9 and Corollary 14. Hence τ_M ∈ LSIF by Lemma 2 and so τ_M ∈ dTT_su by Corollary 32. Then τ_M ∘ τ ∈ dTT^k by Theorem 23. Hence τ_M ∘ τ ∈ dTT_su by induction. ✷

Theorem 44
It is decidable for a composition of deterministic tt's whether or not it is of linear size increase.

Proof.
The proof is, again, by induction on k, the number of d tt's in the composition. It goes along the lines of the proof of Theorem 43, using Corollary 33 instead of Corollary 32 for the case k = 1. Assuming that we have an algorithm A_k for a composition of k d tt's, we construct A_{k+1} as follows. Let M, M_1, …, M_k be d tt's, k ≥ 1, and let τ = τ_{M_1} ∘ ··· ∘ τ_{M_k}. Since all our results are effective, we may assume as in the proof of Theorem 43 that (τ_M, τ) is linear-bounded and ran(τ_M) ⊆ dom(τ). To decide whether or not τ_M ∘ τ is of linear size increase, we first decide whether or not τ_M is of linear size increase by Corollary 33. If not, then τ_M ∘ τ is not of linear size increase, by Lemma 2. If so, then a d tt M′ that realizes τ_M ∘ τ_{M_1} can be constructed by Corollary 32 and Theorem 23, and we apply A_k to M′, M_2, …, M_k. ✷

Together with Lemma 24 and Proposition 29 in Section 6, Theorems 43 and 44 imply the following two corollaries on macro tree transducers.
Corollary 45
For every k ≥ 1, dMT^k ∩ LSIF = dMSOT = dTT_su ⊆ dMT.

Corollary 46
It is decidable for a composition of deterministic mt's whether or not it is of linear size increase.

For the class dMT_io of translations realized by deterministic macro tree transducers with inside-out (io) derivation mode, we obtain that dMT^k_io ∩ LSIF ⊆ dTT_su for every k ≥ 1, for the simple reason that dMT_io is a (proper) subclass of dMT by [34, Theorem 7.1(1)]. For the same reason Corollary 46 is also valid for those transducers. However, dTT_su is not included in dMT_io, because not every regular tree language is the domain of a deterministic io macro tree transducer (see [34, Corollary 5.6]).

Since LSIF ⊆ F, it follows from Theorems 43 and 35 that Theorem 43 also holds for nondeterministic tt's, i.e., TT^k ∩ LSIF = dTT_su for every k ≥ 1. (We do not know whether Theorem 44 holds for nondeterministic tt's, i.e., whether it is decidable for a composition of nondeterministic tt's whether or not it realizes a translation in LSIF.) Similarly, it follows from Corollaries 45 and 36 that Corollary 45 also holds for nondeterministic mt's, i.e., MT^k ∩ LSIF = dMSOT = dTT_su ⊆ dMT for every k ≥ 1. This even holds for the so-called stay-macro tree transducers that can use stay-instructions, introduced in [31, Section 5.3], because it is shown in [31, Lemma 37] that the stay-macro tree translations are in TT. For the class MT_io of nondeterministic io macro tree translations we also obtain that MT^k_io ∩ LSIF ⊆ dTT_su for every k ≥ 1, because MT_io ⊆ TT by Lemma 28; the same is true for multi-return macro tree transducers.

The k-pebble tree transducer was introduced in [63] as a model of XML document transformation. It is a tt that additionally can use k distinct pebbles to drop on, and lift from, the nodes of the input tree. The life times of these pebbles must be nested. The tt is the 0-pebble tree transducer. It is shown in [31, Theorem 10] that every (deterministic) k-pebble tree translation can be realized by a composition of (deterministic) k + 1 tt's. Hence Theorems 43 and 44 also hold for deterministic k-pebble tree transducers, while Theorem 43 additionally holds for the nondeterministic case. In [28, Theorems 5 and 55] this is extended to k-pebble tree transducers that, in addition to the k distinct "visible" pebbles, can use an arbitrary number of "invisible" pebbles, still with nested life times: they can be realized by a composition of k + 2 tt's. (A "visible" pebble can be observed by the transducer during its entire life time, as usual for pebbles, whereas an "invisible" pebble p cannot be observed during the life time of a pebble p′ whose life time is nested within the one of p; thus, such a pebble p′ "hides" the pebble p.) Thus, Theorems 43 and 44 also hold for such transducers, cf. [28, Theorem 57].

The high-level tree transducer was introduced in [35] as a generalization of both the top-down tree transducer and the macro tree transducer. It is proved in [35, Theorem 8.1(b)] that nondeterministic high-level tree transducers can be simulated by compositions of nondeterministic mt's. Since every deterministic high-level tree transducer realizes a partial function (as should be clear from the proof of [35, Lemma 5.7]), it follows from Corollary 36 that, similarly, deterministic high-level tree transducers can be simulated by compositions of deterministic mt's. Consequently, Corollaries 45 and 46 also hold for deterministic high-level tree transducers, and Corollary 45 additionally for the nondeterministic case.
10 Deterministic Complexity
Our first main complexity result says that a composition of deterministic tt's can be computed by a RAM program in linear time, more precisely in time O(n) where n is the sum of the sizes of the input and the output tree.

Theorem 47
For every k ≥ 1 and every τ ∈ dTT^k there is an algorithm that computes, for a given input tree t, the output tree s = τ(t) in time O(|t| + |s|).

Proof.
The proof is by induction on k. We first prove the case k = 1, which is a slight generalization of the well-known fact for attribute grammars that the attribute evaluation of an input tree takes linear time (see, e.g., [17, 23]). Let τ ∈ dTT and let t be an input tree of τ. By Corollary 14, dom(τ) is regular and hence can be recognized by a bottom-up finite-state tree automaton. Thus, we can decide whether or not t ∈ dom(τ) in time O(|t|) by running that automaton on t. By Lemmas 10 and 12, τ = τ_1 ∘ τ_2 with τ_1 ∈ dTT_srel and τ_2 ∈ dTT_ℓ. As observed in Section 3, τ_1 can be realized by a classical linear deterministic top-down tree transducer with regular look-ahead. Thus, by (the proof of) [20, Theorem 2.6], it can be realized by a deterministic bottom-up finite-state relabeling (DBQREL) and a local relabeling tt. To run these two relabelings on t ∈ dom(τ) obviously takes time O(|t|). Thus, it remains to consider the case that τ ∈ dTT_ℓ. Let M be a local d tt that realizes τ. To compute τ_M(t), we first construct the regular tree grammar G_{M,t} in time O(|t|), proportional to the number of configurations of M on t. Then we remove the chain rules from the context-free grammar G_{M,t}, i.e., the rules ⟨q, u⟩ → ⟨q′, u′⟩ resulting from the move rules of M. Since G_{M,t} is forward deterministic, this can also be done in time O(|t|), as follows. Viewing the chain rules as edges of a directed graph with configurations as nodes, we compute an evaluation order of the graph by topological sorting, in time O(|t|). Then we compute the new rules by traversing this order from right to left, again in time O(|t|). For an edge ⟨q, u⟩ → ⟨q′, u′⟩, if the (old or new) rule for ⟨q′, u′⟩ is ⟨q′, u′⟩ → δ(⟨q_1, u_1⟩, …, ⟨q_k, u_k⟩), then the new rule for ⟨q, u⟩ is ⟨q, u⟩ → δ(⟨q_1, u_1⟩, …, ⟨q_k, u_k⟩).
Finally, we use this new regular tree grammar, equivalent to G_{M,t}, to generate s = τ_M(t), which takes time O(|s|) because each rule generates a node of s.

Now let τ = τ₁ ◦ τ₂ such that τ₁ ∈ dTT and τ₂ ∈ dTT^k, k ≥ 1. By Corollary 38(2) we may assume that (τ₁, τ₂) is linear-bounded. Let t be an input tree of τ. Since dom(τ) is regular by Corollary 14, we can check that t ∈ dom(τ) in linear time, as above. By the case k = 1, the intermediate tree r = τ₁(t) can be computed in time O(|t| + |r|), and by induction the output tree s = τ(t) = τ₂(r) can be computed in time O(|r| + |s|). Since (τ₁, τ₂) is linear-bounded, there is a constant c ∈ ℕ such that |r| ≤ c · |s|, i.e., |r| = O(|s|). Hence the total time is O(|t| + |r|) + O(|r| + |s|) = O(|t| + |s|). ✷
It should be noted that the constant in the time complexity O(|t| + |s|) can be large in terms of the size of the given transducers due to the use of linear-boundedness, cf. Remark 41.

Since deterministic macro tree transducers, pebble tree transducers, and high-level tree transducers can be realized as compositions of deterministic tt's (see Section 9), Theorem 47 also holds for such transducers. For k-pebble tree transducers this improves the result of [63, Proposition 3.5], where the time bound is O(|t|^k + |s|).

Before we proceed, we need an elementary lemma on leftmost derivations of context-free grammars. For a context-free grammar G = (N, T, 𝒮, R), a leftmost sentential form is a string v ∈ (N ∪ T)* such that S ⇒*_{G,lm} v for some S ∈ 𝒮, where ⇒_{G,lm} is the usual leftmost derivation relation of G: if X → ζ is in R, then v₁Xv₂ ⇒_{G,lm} v₁ζv₂ for all v₁ ∈ T* and v₂ ∈ (N ∪ T)*.

Lemma 48
Let G = (N, T, 𝒮, R) be an ε-free context-free grammar, and let G′ = (N′, T, 𝒮, R′) be the equivalent context-free grammar such that N′ = N ∪ {Z} and R′ = {X → ζZ | X → ζ ∈ R} ∪ {Z → ε}, where Z is a new nonterminal. Let v be a leftmost sentential form of G′, and let S ⇒*_{G′,lm} v ⇒*_{G′,lm} w be a leftmost derivation of G′ with S ∈ 𝒮 and w ∈ L(G). Moreover, let d be the derivation tree corresponding to that derivation. Then the number of occurrences of Z in v is at most the height of d.

Proof.
Each occurrence of a nonterminal Y ∈ N′ in v corresponds to a node of d with label Y in a well-known way. Let u be the node of d corresponding to the leftmost occurrence of Z in v. Clearly the number of occurrences of Z in v is equal to the number of edges on the path from u to the root of d. ✷

By [64, Theorem 2.5] it follows from Theorem 47 that a composition of deterministic tt's can be computed by a deterministic Turing machine in cubic time, more precisely in time O(n³) where n is the sum of the sizes of the input and the output tree. Our second complexity result says that a composition of deterministic tt's can be computed by a deterministic multi-tape Turing machine N in linear space (in the sum of the sizes of the input and output tree). On a work tape of N we will represent the input tree t over Σ by the string ϕ(t) over Σ ∪ {(, )}, where {(, )} is the set consisting of the left and right parenthesis, defined such that if ϕ(t₁) = t′₁, …, ϕ(t_m) = t′_m then ϕ(σt₁⋯t_m) = σ(t′₁⋯t′_m). In other words, we formally insert the parentheses (but not the commas) that are always used informally to denote trees. The parentheses allow N to walk on the tree t, from node to node, because it can recognize a subtree of t by checking that the numbers of left and right parentheses in the corresponding substring of ϕ(t) are equal. In particular, it can determine the child number of a node of t by counting the number of its younger siblings. Obviously, the mapping ϕ is injective, and can be computed in linear space (simulating a one-way push-down transducer). In what follows we identify t and ϕ(t).

Theorem 49
For every k ≥ 1 and every τ ∈ dTT^k there is a deterministic Turing machine that computes, given an input t, the output s = τ(t) in space O(|t| + |s|).

Footnote 14: there is a straightforward one-to-one correspondence between the leftmost derivations of G and G′, and between their derivation trees. Since G is ε-free, the derivation trees have the same height.

Proof. Again, we first show this for k = 1. Let M = (Σ, ∆, Q, q₀, R) be a dtt, and let t ∈ T_Σ be an input tree. As usual we assume that the output rules of M only contain stay-instructions. We describe a deterministic multi-tape Turing machine N that computes τ_M in linear space. By Corollary 14, dom(M) is a regular tree language and hence a context-free language, which can be recognized in deterministic linear space. Thus, N starts by deciding whether or not t ∈ dom(M). Now assume that t ∈ dom(M). To compute s = τ_M(t), the machine N simulates the (unique) leftmost derivation of the forward deterministic context-free grammar G_{M,t}. Every leftmost sentential form of G_{M,t} is of the form w⟨q₁, u₁⟩⋯⟨q_n, u_n⟩ with w ∈ ∆* and ⟨q_i, u_i⟩ ∈ Con(t). If one views the states of M as recursive procedures with one parameter of type 'node of t', then ⟨q₁, u₁⟩⋯⟨q_n, u_n⟩ corresponds to the contents of the stack in the usual implementation of recursive procedures: each configuration ⟨q_i, u_i⟩ is a call of procedure q_i with actual parameter u_i. The machine N uses a one-way output tape on which it prints w (which will finally be s), a work tape with the input tree t (or rather ϕ(t)), and a work tape that contains a stack representing ⟨q₁, u₁⟩⋯⟨q_n, u_n⟩, with the top of the stack to the left. At each moment of time, a reading head of N is at node u₁ of t, and another reading head is at the top of the stack. Note that n ≤ |s| because every configuration ⟨q_i, u_i⟩ will generate at least one symbol of s. If N would represent the parameters u₁, …
, u_n by their Dewey notation, the size of the stack could be |s| · |t|, which is too much. Thus, we need a more compact representation of the nodes u₁, …, u_n. In a rule of G_{M,t} with left-hand side ⟨q, u⟩, every node u′ in the right-hand side is a neighbour of u, or u itself, and so, the "difference" between u and u′ can be expressed by an instruction in I = {up, stay} ∪ {down_i | i ∈ [1, mx_Σ]}. This allows us to represent ⟨q₁, u₁⟩⋯⟨q_n, u_n⟩ by the node u₁ and a stack of the form q₁γ₁q₂γ₂⋯q_nγ_n where γ_i ∈ I* is a sequence of instructions that leads from u_i to u_{i+1} (with u_{n+1} = root_t). Let us now consider in detail how N simulates the leftmost derivation of G_{M,t}.

At each moment of time, the current node of t and the current contents of the output tape and the stack tape represent a leftmost sentential form of G_{M,t}, which is an element of ∆* · Con(t)*. The stack tape contains a string in (Q ∪ I)*⊥, where ⊥ is the bottom stack symbol and I is as above. The current node u of t and the current contents w ∈ ∆* and ξ ∈ (Q ∪ I)*⊥ of the output tape and stack tape, respectively, represent the leftmost sentential form w · µ(u, ξ), where the string µ(u, ξ) ∈ Con(t)* is defined as follows (for every q ∈ Q and β ∈ I): µ(u, qξ) = ⟨q, u⟩ · µ(u, ξ), µ(u, βξ) = µ(β(u), ξ), and µ(u, ⊥) = ε. Initially, N starts at the root of t, with empty output tape and with stack tape q₀⊥, representing the initial output form ⟨q₀, root_t⟩. If the top symbol of the stack is ⊥, then N halts. Otherwise, to compute the next leftmost sentential form, N first pops the top symbol off the stack. If that symbol was q ∈ Q, and the current node u of t has label σ and child number j, then N selects the unique rule ⟨q, σ, j, T⟩ → ζ that is applicable to ⟨q, u⟩. Note that it can test in linear space whether or not (t, u) ∈ T, because mark(T) is a context-free language.
If ζ = ⟨q′, α⟩, then N moves to node α(u) of t and pushes the string q′β on the stack where β is defined as follows: if α is up, stay, or down_i, then β is down_j, stay, or up, respectively. If ζ = δ(⟨q₁, stay⟩, …, ⟨q_k, stay⟩), then N outputs δ, and pushes q₁⋯q_k on the stack (if k > 0). Thus N represents the next leftmost sentential form of G_{M,t}. If the top symbol of the stack was β ∈ I, the machine N moves to node β(u) of t. This does not change the represented leftmost sentential form. Thus, after applying a rule ⟨q, σ, j, T⟩ → δ (with δ of rank 0), N removes instructions from the stack (and moves its reading head on t accordingly) until the top of the stack is a state again. When N halts, the output tape contains s.

It remains to show that the length of the stack is linear in |t| + |s|. As mentioned above, since every configuration ⟨q, u⟩ will generate at least one symbol of s, the number of occurrences of states in the stack is at most |s|. To estimate the number of occurrences of instructions in the stack, we use Lemma 48. In the above case where q is the top stack symbol and ⟨q, σ, j, T⟩ → ⟨q′, α⟩ is the rule applicable to ⟨q, u⟩, the machine N does not apply the rule ⟨q, u⟩ → ⟨q′, α(u)⟩ of G_{M,t}, but rather the rule ⟨q, u⟩ → ⟨q′, α(u)⟩β where β is defined as above. Moreover, when β is the top stack symbol, N applies the rule β → ε. From this it should be clear that, by Lemma 48 and footnote 14, the number of occurrences of instructions in the stack is at most the height of the derivation tree corresponding to the derivation ⟨q₀, root_t⟩ ⇒*_{G_{M,t}} s of G_{M,t}. As observed in Section 2 after Lemma 3, that height is at most the number of configurations of M on t, i.e., at most |Q| · |t|. Thus, the length of the stack is indeed O(|s| + |t|).

The induction step can be proved in exactly the same way as in the proof of Theorem 47, with 'time' replaced by 'space'.
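The stack discipline of N can be illustrated in code. This is a sketch under assumptions, not the Turing machine itself: trees are nested Python pairs (label, children) instead of strings ϕ(t), the current node is kept as a zipper so that moves cost O(1), the transducer is given by a hypothetical rule(state, label, childno) function whose regular tests are omitted, and output rules contain only stay-instructions, as assumed in the proof.

```python
# Instructions are tuples: ("up",), ("stay",), ("down", i) with i >= 1.
def apply_instr(instr, node, anc):
    """Move on the tree; anc is the list of (parent, child_index) pairs."""
    if instr == ("up",):
        parent, _ = anc[-1]
        return parent, anc[:-1]
    if instr == ("stay",):
        return node, anc
    _, i = instr  # ("down", i)
    return node[1][i - 1], anc + [(node, i - 1)]

def run(tree, q0, rule):
    """Simulate N: the stack holds states and instructions, never nodes."""
    out = []                      # the one-way output tape
    node, anc = tree, []          # current node of t (as a zipper)
    stack = [q0]                  # top of stack = end of the list
    while stack:
        x = stack.pop()
        if isinstance(x, tuple):  # an instruction beta: just move on t
            node, anc = apply_instr(x, node, anc)
            continue
        j = anc[-1][1] + 1 if anc else 0      # child number (root has 0)
        rhs = rule(x, node[0], j)
        if rhs[0] == "move":                  # <q', alpha>: push q' above
            _, q2, alpha = rhs                # the inverse instruction beta
            beta = (("down", j) if alpha == ("up",)
                    else ("up",) if alpha[0] == "down" else ("stay",))
            node, anc = apply_instr(alpha, node, anc)
            stack.append(beta)
            stack.append(q2)
        else:                                 # output rule delta(q1 .. qk)
            _, delta, qs = rhs
            out.append(delta)                 # print delta
            stack.extend(reversed(qs))        # q1 ends up on top
    return out                                # preorder of the output tree
```

On t = σ(a, b) with a transducer that outputs d and then walks down to copy the two leaves, run returns the preorder sequence of the output tree d(a, b), while the stack never stores a node address, only states and O(1)-size instructions.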
✷

For a class 𝒯 of tree translations and a class ℒ of tree languages, we denote by 𝒯(ℒ) the class of tree languages τ(L) with τ ∈ 𝒯 and L ∈ ℒ. The elements of 𝒯(REGT) are called the output tree languages (or surface languages) of 𝒯. Since dTT ⊆ dMT by Lemma 24, it follows from the proof of [34, Theorem 7.5] that the output tree languages of dTT^k are recursive. From Theorem 49 we now obtain that they are in DSPACE(n), i.e., can be recognized by a Turing machine in deterministic linear space. This was shown for classical top-down tree transducers in [4].

Theorem 50
For every k ≥ 1, dTT^k(REGT) ⊆ DSPACE(n).

Proof.
Let L ∈ REGT and τ ∈ dTT^k. By Corollary 38(2), τ = τ₁ ◦ τ₂ such that τ₁ ∈ dTT_pru, τ₂ ∈ dTT^k, and (τ₁, τ₂) is linear-bounded for some constant c. Let L′ = τ₁(L), and note that τ(L) = τ₂(L′) and that L′ ∈ REGT by Lemma 15. It is straightforward to show that for every s ∈ τ(L) there exists t ∈ L′ such that (t, s) ∈ τ₂ and |t| ≤ c · |s|. To check whether a given tree s is in τ(L), a deterministic Turing machine systematically enumerates all input trees t (of τ₂) such that |t| ≤ c · |s|. For each such t it first checks that t ∈ L′ in space O(|t|). Then it uses the algorithm of Theorem 49 to compute τ₂(t) in space c′ · (|t| + |τ₂(t)|), but rejects t as soon as the computation takes more than space c′ · (|t| + |s|); thus, the space used is O(|t| + |s|) = O(|s|). Clearly, s ∈ τ(L) if and only if τ₂(t) = s for some such t. ✷

For a tree t we denote its yield by yt, for a tree language L we define yL = {yt | t ∈ L}, and for a class ℒ of tree languages we define yℒ = {yL | L ∈ ℒ}. For a class 𝒯 of tree translations, the languages in y𝒯(REGT) are called the output string languages (or target languages) of 𝒯.

Corollary 51
For every k ≥ 1, y dTT^k(REGT) ⊆ DSPACE(n).

Proof. For an alphabet ∆, let Γ = ∆ ∪ {e} be the ranked alphabet such that e has rank 0 and every element of ∆ has rank 1. For a string w over ∆ we define mon(w) = we ∈ T_Γ, i.e., the string w followed by e, viewed as a monadic tree. It is easy to see that for every ranked alphabet Σ there is a dtt_ℓ M such that τ_M(t) = mon(yt). From this and Theorem 50 the result follows. ✷

We observe here, for k = 1, that dTT(REGT) and y dTT(REGT) are included in
LOGCFL, the class of languages that are log-space reducible to a context-free language. This will be proved in Corollaries 64 and 65. Note that
LOGCFL ⊆ DSPACE(log² n).

We also observe that Theorem 50 and Corollary 51 also hold for nondeterministic tt's, as will be proved in Theorem 67 (and was proved for classical top-down tree transducers in [4]).

As before, Theorems 49 and 50 and Corollary 51 also hold for deterministic macro tree transducers, pebble tree transducers, and high-level tree transducers. It is proved in [30, Theorem 23] that composition of deterministic mt's yields a proper hierarchy of output string languages (called the y dMT-hierarchy), i.e., that y dMT^k(REGT) ⊊ y dMT^{k+1}(REGT) for every k ≥
1. The io-hierarchy consists of the classes of string languages IO(k) generated by level-k grammars, with the inside-out (io) derivation mode (see, e.g., [16]). By [33, Theorem 7.5] the io-hierarchy can be defined as output string languages of tree transformations: IO(k) = y YIELD^k(REGT). Since
YIELD ⊆ dTT by [31, Lemma 36], we obtain that IO(k) ⊆ y dTT^k(REGT). Thus, the next corollary is immediate from Corollary 51. Note that it was already proved in [37, Theorem 3.3.8] that the io languages (i.e., the languages in IO(1)) are in NSPACE(n); in [3] this was improved to LOGCFL. It was proved in [16, Corollary 8.12] that the languages in the io-hierarchy are recursive.

Corollary 52
For every k ≥ 1, IO(k) ⊆ DSPACE(n).

Note that by [30, Theorem 36] the
EDT0L control hierarchy is included in the io-hierarchy.

By Corollary 25, y dTT^k(REGT) ⊆ y dMT^k(REGT) ⊆ y dTT^{k+1}(REGT). It is proved in [30, Theorem 32] that there exists a language in IO(k + 1) that is not in y dMT^k(REGT). Since IO(k + 1) ⊆ y dTT^{k+1}(REGT), that implies the following stronger version of Proposition 7.
Corollary 53
For every k ≥ 1, y dTT^k(REGT) ⊊ y dTT^{k+1}(REGT).
11 Nondeterministic Complexity
We now turn to the complexity of compositions of nondeterministic tt's. We first consider the case where all the transducers in the composition are finitary. The next lemma shows that Theorem 37 and Corollary 38 also hold for fTT.

Lemma 54 fTT^k ⊆ TT_pru ∗ fTT^k and fTT ◦ fTT^k = fTT ∗ fTT^k for every k ≥ 1.

Proof.
To show that fTT ⊆ TT_pru ∗ fTT, let τ ∈ fTT. By Theorem 37, τ = τ₁ ◦ τ₂ such that τ₁ ∈ TT_pru, τ₂ ∈ TT, and (τ₁, τ₂) is linear-bounded. Since ran(τ₁) ∈ REGT by Lemma 15, we may assume that dom(τ₂) ⊆ ran(τ₁) by Lemma 9. Then τ₂ is finitary too.

Theorem 20 implies that fTT ◦ TT_pru ⊆ fTT, because the composition of two finitary translations is finitary. The remainder of the proof is now entirely similar to the one of Corollary 38. ✷

We will prove that a composition of tt's can be computed by a nondeterministic Turing machine in linear space and polynomial time (in the sum of the sizes of the input and output tree), which generalizes Theorem 49. In the next lemma we consider the case where all tt's are finitary.

Lemma 55
For every k ≥ 1 and every τ ∈ fTT^k there is a nondeterministic Turing machine that computes, given an input t, any output s ∈ τ(t) in space O(|t| + |s|) and in time polynomial in |t| + |s|.

Proof.
For the case k = 1 the proof is exactly the same as that of Theorem 49 except, of course, that the Turing machine N nondeterministically simulates any leftmost derivation of G_{M,t}, selecting nondeterministically a rule of M to compute a next leftmost sentential form. It follows from Lemmas 48 and 3 that the number n of occurrences of instruction symbols in the stack is O(|t|). In fact, since M is finitary, it suffices by Lemma 3 to simulate leftmost derivations of G_{M,t} for which the corresponding derivation tree in L(G^der_{M,t}) has height at most |Q| · |t|. As in the proof of Theorem 49, Lemma 48 implies that n is at most that height, i.e., at most |Q| · |t|. Thus, N works in space O(|t| + |s|). Moreover, it works in time O(|t|² · |s|), because the size of such a derivation tree (and hence the length of the leftmost derivation) is at most |Q| · |t| · |s|, and each step in the leftmost derivation takes time O(|t|). Note that regular tree languages (which are context-free languages) can be recognized in nondeterministic linear time.

Now let τ = τ₁ ◦ τ₂ such that τ₁ ∈ fTT and τ₂ ∈ fTT^k, k ≥
1. We may assume by Lemma 54 that (τ₁, τ₂) is linear-bounded. So, there is a constant c ∈ ℕ such that for every (t, s) ∈ τ there exists a tree r such that (t, r) ∈ τ₁, (r, s) ∈ τ₂, and |r| ≤ c · |s|. By the case k = 1, the intermediate tree r can be computed from t in nondeterministic space O(|t| + |r|), and by induction, the output tree s can be computed from r in nondeterministic space O(|r| + |s|). Hence, since |r| = O(|s|), s can be computed from t in nondeterministic space O(|t| + |s|). The time is polynomial in |t| + |r| and |r| + |s|, and hence polynomial in |t| + |s|. ✷

By Lemma 27, MT ⊆ fTT. Consequently Lemma 55 also holds for every τ ∈ MT^k.

We now turn to the output languages of fTT^k. By NSPACE(n) ∧ NPTIME we will denote the class of languages that can be recognized by a nondeterministic Turing machine in simultaneous linear space and polynomial time. Trivially,
NSPACE(n) ∧ NPTIME is included in both
NSPACE(n) and NPTIME.

Lemma 56
For every k ≥ 1, fTT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME.

Proof.
The proof is similar to the one of Theorem 50. Let L ∈ REGT and τ ∈ fTT^k. By Lemma 54, τ = τ₁ ◦ τ₂ where τ₁ ∈ TT_pru, τ₂ ∈ fTT^k, and (τ₁, τ₂) is linear-bounded for some constant c. Let L′ = τ₁(L). Then s ∈ τ(L) if and only if there exists t ∈ L′ such that (t, s) ∈ τ₂ and |t| ≤ c · |s|. To check whether a given tree s is in τ(L), a nondeterministic Turing machine guesses an input tree t such that |t| ≤ c · |s|, it checks that t ∈ L′ in time and space O(|t|) (because L′ is a context-free language), and then computes any s′ ∈ τ₂(t) with |s′| ≤ |s| in space O(|t| + |s′|) and time polynomial in |t| + |s′|, by Lemma 55. Finally it checks that s′ = s in time and space O(|s|). Thus the space used is O(|t| + |s|) = O(|s|), and the time is polynomial in |s|. ✷

Although mt's are finitary, whereas tt's need not be finitary, it is proved in [31, Theorem 38 and Corollary 39] that compositions of mt's have the same output languages as compositions of (local) tt's. This implies that Lemmas 56 and 55 also hold for TT^k.

Theorem 57
For every k ≥ 1, TT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME, and moreover, MT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME.

Proof.
By Lemma 27, MT ⊆ fTT. Thus, by Lemma 56, MT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME. From Lemma 10 and Theorem 20 it follows (by induction on k) that TT^k ⊆ dTT_rel ◦ (TT_ℓ)^k and hence TT^k(REGT) ⊆ (TT_ℓ)^k(REGT) by Lemma 15. Finally, by [31, Theorem 38 and Corollary 39], (TT_ℓ)^k(REGT) ⊆ MT^m(REGT) for some m ≥
1. Hence TT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME, by the above. ✷

As observed already after Corollary 51, the space part of Theorem 57 will be strengthened to
DSPACE(n) in Theorem 67.

Theorem 58
For every k ≥ 1 and every τ ∈ TT^k there is a nondeterministic Turing machine that computes, given an input t, any output s ∈ τ(t) in space O(|t| + |s|) and in time polynomial in |t| + |s|. The same holds for τ ∈ MT^k.

Proof.
For τ ∈ MT^k this was already observed after Lemma 55. Now let τ ∈ TT^k with input alphabet Σ. Let Σ̄ = {σ̄ | σ ∈ Σ} with rank(σ̄) = rank(σ) be a set of new symbols, and let t̄ ∈ T_Σ̄ be obtained from t ∈ T_Σ by changing each label σ into σ̄. Finally, the tree language L_τ = {#(t̄, s) | (t, s) ∈ τ} is in TT^k(REGT): the first transducer additionally copies the input to the output (with bars), and each other transducer copies the first subtree of the input to the output. By Theorem 57, there is a nondeterministic Turing machine N that recognizes L_τ in linear space and polynomial time. We construct the nondeterministic Turing machine N′ that, on input t, guesses a possible output tree s, writing #(t̄, s) on a worktape, uses N as a subroutine to verify that (t, s) ∈ τ, and outputs s. Clearly, N′ satisfies the requirements. ✷

Since io (multi-return) macro tree translations, pebble tree translations, and high-level tree translations can be realized by compositions of tt's (see Section 9), Theorems 57 and 58 also hold for those translations.

By the proof of Corollary 51, we additionally obtain from Theorem 57 that y TT^k(REGT) ⊆ NSPACE(n) ∧ NPTIME for every k ≥
1, and the same is true for y MT^k(REGT). The oi-hierarchy consists of the classes of string languages OI(k) generated by level-k grammars, with the outside-in (oi) derivation mode (see, e.g., [16, 33]). It was shown in [37, Theorem 4.2.8] that OI(1) equals the class of indexed languages of [1], and hence that OI(1) ⊆ NSPACE(n) by [1, Theorem 5.1]. Moreover, it was shown in [67, Proposition 2] that OI(1) ⊆ PTIME. In [16, Corollary 7.26] it was proved that the languages in the oi-hierarchy are recursive. As observed in the last paragraph of [35], OI(k) is included in y MT^m(REGT) for some m.

Corollary 59
For every k ≥ 1, OI(k) ⊆ NSPACE(n) ∧ NPTIME.

It is shown in [67] that there is an NP-complete language in both OI(1) and y fTT↓(REGT), and it is shown in [74] that there even is one in the class
ET0L, which is a subclass of both OI(1) and y fTT↓(REGT). Note that by [75,
ETOL control hierarchy is included in the oi -hierarchy.It will be shown in Corollary 68 that OI ( k ) ⊆ DSPACE ( n ).
12 Translation Complexity
In this section we study the time and space complexity of the membership problem of the tree translations in TT^k, i.e., for a fixed tree translation τ ⊆ T_Σ × T_∆ we want to know, for given trees t ∈ T_Σ and s ∈ T_∆, how hard it is to decide whether or not (t, s) ∈ τ. To formalize this, we denote by L_τ the tree language {#(t, s) | (t, s) ∈ τ}, where we assume that Σ ∩ ∆ = ∅. Otherwise, we replace Σ by Σ̄ = {σ̄ | σ ∈ Σ} as in the proof of Theorem 58. So, L_τ is a tree language over Σ ∪ ∆ ∪ {#}, where # has rank 2. For a class 𝒯 of tree translations and a complexity class 𝒞, we will write 𝒯 ⊆ 𝒞 to mean that L_τ ∈ 𝒞 for every τ ∈ 𝒯. As usual, we denote the class of languages that are accepted by a deterministic Turing machine in polynomial time by PTIME, and the class of languages that are log-space reducible to a context-free language by
LOGCFL. Note that every regular tree language is a context-free language and hence is in
LOGCFL. Note also that
LOGCFL ⊆ PTIME (see [68]) and
LOGCFL ⊆ DSPACE(log² n) (see [57, 68] and [46, Theorem 12.7.4]).

If τ ∈ dTT^k then, on input #(t, s), we can compute τ(t) according to Theorems 47 and 49 (rejecting the input when the computation takes more than time or space c · (|t| + |s|) for the given constant c) and then verify that τ(t) = s, cf. the proof of Theorem 50. Thus, L_τ can be accepted by a RAM program in linear time and by a deterministic Turing machine in linear space. This means that dTT^k ⊆ PTIME and dTT^k ⊆ DSPACE(n). If τ ∈ TT^k, then, as mentioned in the proof of Theorem 58, the tree language L_τ is in the class of output languages TT^k(REGT), and hence in
NSPACE(n) ∧ NPTIME by Theorem 57. This means that TT^k ⊆ NPTIME and TT^k ⊆ NSPACE(n). Due to the presence of both the input tree and the output tree in L_τ, one would expect that better upper bounds can be shown. Indeed, we will prove that TT^k ⊆ DSPACE(n). Our main aim in this section is to prove that TT ◦ dTT ⊆ LOGCFL. We follow the approach of [25], using multi-head automata.

A multi-head tree-walking tree transducer M = (Σ, ∆, Q, Q₀, R) (in short, mhtt) is defined in the same way as a tt, but has an arbitrary, fixed number of reading heads. Each of these heads can walk on the input tree, independent of the other heads. It can test the label and child number of the node that it is currently reading, and additionally apply a regular test to that node. Moreover, we assume that the heads are "sensing", which means that M can test which heads are currently scanning the same node. Thus, if M has ℓ heads, then its move rules are of the form ⟨q, σ₁, j₁, T₁, …, σ_ℓ, j_ℓ, T_ℓ, E⟩ → ⟨q′, α₁, …, α_ℓ⟩ where E ⊆ [1, ℓ] × [1, ℓ] is an equivalence relation. A configuration of M on input tree t is of the form ⟨q, u₁, …, u_ℓ⟩, to which the rule is applicable if M is in state q, each u_i satisfies the tests σ_i, j_i, and T_i, and u_i = u_j for every (i, j) ∈ E. After application the new configuration is ⟨q′, α₁(u₁), …, α_ℓ(u_ℓ)⟩. The output rules are defined in a similar way. Initially all reading heads are at the root of the input tree. This is all similar to how multi-head automata on strings are defined.

We will use the mhtt M as an acceptor of its domain. We will say that it accepts dom(M) in polynomial time if there is a polynomial p(n) such that for every t ∈ dom(M) there is a computation ⟨q₀, root_t, …, root_t⟩ ⇒*_{M,t} s of length at most p(|t|) for some q₀ ∈ Q₀ and s ∈ T_∆. Note that we consider nondeterministic mhtt's only.

Lemma 60
For every multi-head tt M, dom(M) ∈ PTIME. Moreover, if M accepts dom(M) in polynomial time, then dom(M) ∈ LOGCFL.

Proof.
After this paragraph we will show that the domain of a multi-head tt can be accepted by an alternating multi-head finite automaton (in short, amfa), in a straightforward way. Moreover, we will show that if the mhtt accepts in polynomial time, then the corresponding amfa accepts in polynomial tree-size. That proves the lemma because PTIME is the class of languages accepted by amfa's (see [10, 12]) and
LOGCFL is the class of languages accepted by amfa's in polynomial tree-size (see [68, 71]).

It is well known that the domain of a classical local tt can be accepted by an alternating (one-head) tree-walking automaton, see, e.g., [70], [24, Section 4], and [63, Section 4], and the same is true for the multi-head case. Let M = (Σ, ∆, Q, Q₀, R) be an mhtt. The amfa M′ that accepts dom(M) simulates M on the input t ∈ T_Σ, without producing output. The reading heads of M are simulated by reading heads of M′ in the obvious way. Every (initial) state q of M is simulated by the existential (initial) state q of M′, and a move rule of M is simulated by a transition of M′ in an obvious way. If M applies an output rule in state q, then M′ first goes into a universal state q′ and then branches in the same way as M, going into existential states. A regular test T of M is simulated by M′ in a side branch, using an amfa subroutine that accepts the context-free language mark(T), with additional reading heads. Note that since the heads are sensing, the node to be tested is "marked" by a reading head. Similarly, to move a head h from a parent u to its i-th child ui, M′ first moves an auxiliary head h′ nondeterministically to a position to the right of u, then checks in a side branch that the string between h and h′ consists of i − 1 well-parenthesized trees (a context-free language), and finally moves h to h′. In a similar way M′ can move from ui to u, and can determine the child number of u.

If M accepts t in time m, then the size of the corresponding computation tree of M′ is polynomial in m, because each computation step of M takes polynomial tree-size.
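The balanced-parenthesis counting that underlies both N's walking on ϕ(t) and these head movements can be sketched in code. Assumptions for the sketch: labels are single characters, and every node, including a leaf, is written as its label followed by a parenthesized child list, so that for example ϕ(d(a, b)) is the string d(a()b()).

```python
def subtree_end(s, i):
    """Index one past the subtree of phi(t) whose label is at position i:
    scan right until the parentheses opened after the label are balanced."""
    depth, j = 0, i + 1  # s[i + 1] is the '(' of the node's child list
    while True:
        if s[j] == "(":
            depth += 1
        elif s[j] == ")":
            depth -= 1
            if depth == 0:
                return j + 1
        j += 1

def child_number(s, i):
    """Child number of the node at position i, obtained by counting its
    younger siblings: skip their balanced encodings leftwards until the
    parent's '(' is reached. The root has child number 0."""
    if i == 0:
        return 0
    n, j = 1, i - 1
    while s[j] != "(":       # s[j] == ')' closes a younger sibling
        depth = 0
        while True:          # skip that sibling, right to left
            if s[j] == ")":
                depth += 1
            elif s[j] == "(":
                depth -= 1
                if depth == 0:
                    break
            j -= 1
        j -= 2               # step over the sibling's label
        n += 1
    return n
```

On s = "d(a()b())", the subtree starting at position 0 ends at the end of the string, and the nodes a and b (label positions 2 and 5) have child numbers 1 and 2.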
Thus, if M accepts in polynomial time, then M′ accepts in polynomial tree-size.

Note that if we assume that the simulation of a step of M takes constant tree-size, and we assume moreover that M only uses output rules (by eventually replacing the right-hand side ζ of each move rule by δ(ζ), where δ has rank 1), then the output tree of M can be viewed both as the derivation tree of the computation of M and as the computation tree of M′, roughly speaking. ✷

Thus, to prove that TT ◦ dTT ⊆ LOGCFL it suffices to show, for every τ = τ₁ ◦ τ₂ with τ₁ ∈ TT and τ₂ ∈ dTT, that L_τ can be accepted by a multi-head tt M₀ in polynomial time. Let M₁ and M₂ be tt's that realize τ₁ and τ₂. For an input tree t and an output tree s of τ, M₀ will simulate M₁ on t, generating an intermediate tree r, and verify that M₂ translates r into s. Since M₀ cannot store its output tree r, it must verify the translation of r into s on the fly, i.e., while generating r. That can be done because the context-free grammar G_{M₂,r} is forward deterministic, and hence its reduced version has a unique fixed point: during the generation of the nodes v of r, M₀ can guess the values of the nonterminals ⟨q, v⟩ of G_{M₂,r} (which are subtrees of s) and check the fixed point equations for them. However, since G_{M₂,r} need not be reduced, we have to be more careful.

Let G = (N, ∆, {S}, R) be a forward deterministic context-free grammar, and let ⊥ be a symbol not in N ∪ ∆ (which stands for 'undefined'). A string homomorphism h: N → ∆* ∪ {⊥} is a fixed point of G if (1) h(S) ≠ ⊥, (2) h(X) is a substring of h(S) for every X ∈ N such that h(X) ≠ ⊥, and (3) h(X) = h(ζ) for every rule X → ζ in R such that h(X) ≠ ⊥, where h is extended to (N ∪ ∆)* by defining h(a) = a for every a ∈ ∆. In the special case that G is a regular tree grammar, a tree fixed point of G is a fixed point h of G such that h(X) ∈ T_∆ ∪ {⊥} for every X ∈ N and h(X) is a subtree of h(S) for every X ∈ N such that h(X) ≠ ⊥.
Let G = (N, ∆, {S}, R) be a forward deterministic context-free grammar such that L(G) ≠ ∅. For every w ∈ ∆*, L(G) = {w} if and only if there is a fixed point h of G such that h(S) = w. If G is a regular tree grammar, then the same statement holds for w ∈ T_∆ and h a tree fixed point.

Proof.
Let L(G) = {w}, and define h_G(X) to be the unique string generated by X, if that exists and is a substring of w, and otherwise h_G(X) = ⊥. Then h = h_G satisfies the requirements.

Let h be a fixed point of G such that h(S) = w. Then h(v) = w for every sentential form v of G. Since L(G) ≠ ∅, this shows that L(G) = {w}. ✷

Theorem 62 TT ◦ dTT ⊆ LOGCFL.

Proof.
Let M₁ = (Σ, Ω, P, P₀, R₁) be a tt, and let M₂ = (Ω, ∆, Q, q₀, R₂) be a dtt. We will denote τ_{M₁} and τ_{M₂} by τ₁ and τ₂, respectively. Since it is easy to prove (as in the proof of Corollary 38) that TT ◦ dTT = TT ∗ dTT, we may assume that (τ₁, τ₂) is linear-bounded. We may also assume, by Lemma 10 and Theorem 20, that M₂ is local. That does not change the linear-boundedness of the composition: if (τ₁, τ′ ◦ τ″) is linear-bounded and τ′ ∈ TT_rel, then (τ₁ ◦ τ′, τ″) is linear-bounded because τ′ is size-preserving. Similarly, we may assume that ran(τ₁) ⊆ dom(τ₂) by Corollaries 14 and 21. Finally we assume (as in the proofs of Lemmas 17 and 19) that M₁ keeps track in its finite state of the child number of the output node to be generated, through a mapping χ: P → [0, mx_Ω].

On the basis of Lemma 60, we will describe a multi-head tt M₀ that accepts L_τ in polynomial time, where τ = τ₁ ◦ τ₂. Initially M₀ verifies by a regular test that the input tree is of the form #(t, s) with t ∈ T_Σ and s ∈ T_∆. We will denote the root of #(t, s) by its label #. On input #(t, s) the transducer M₀ simulates M₁ on t, generating an output tree r of M₁, which is in the domain of M₂ because ran(τ₁) ⊆ dom(τ₂). It keeps the state p of M₁ in its finite state, uses one of its heads to point at a node of t (which it initially moves to the root of t), and instead of a regular test T applies the regular test {(#(t, s), 1u) | (t, u) ∈ T}. While generating r it guesses a tree fixed point h: Con(r) → T_∆ ∪ {⊥} of the regular tree grammar G_{M₂,r} such that h(⟨q₀, root_r⟩) = s. If that fixed point can be guessed, then τ₂(r) = s by Lemma 61, and hence (t, s) ∈ τ.

Initially, M₀ guesses the values under h of the configurations in Con(r) that contain the root of r, in linear time.
For each q ∈ Q the value of h(⟨q, root_r⟩) is guessed by nondeterministically moving a reading head named (q, stay) to a node x of s, i.e., node 2x of #(t, s), meaning that h(⟨q, root_r⟩) = s|_x, or to the node # (meaning that h(⟨q, root_r⟩) = ⊥, i.e., h(⟨q, root_r⟩) is "undefined"). In particular, the head (q₀, stay) is moved to the root of s, thus guessing that τ₂(r) = s.

Suppose that M₀ is going to produce a node v of r with label ω, by simulating an output rule ⟨p, σ, j, T⟩ → ω(⟨p₁, α₁⟩, …, ⟨p_k, α_k⟩) of M₁. In such a situation, M₀ has already guessed the values under h of the configurations in Con(r) that contain v, and also of those that contain the parent v′ of v (if it has one). For each q ∈ Q the value of h(⟨q, v⟩) is stored using the reading head named (q, stay), as explained above for v = root_r, and the value of h(⟨q, v′⟩) is stored in a similar way using a reading head named (q, up). Now M₀ guesses the values of the configurations that contain the children of v, in linear time. For every q ∈ Q and i ∈ [1, k], the value h(⟨q, vi⟩) is guessed by nondeterministically moving a reading head named (q, down_i) to some node of s or to the node #. Then M₀ checks that these values satisfy requirement (3) of a fixed point of G_{M₂,r} as follows, in linear time. If ⟨q, ω, χ(p)⟩ → ⟨q′, α⟩ is a move rule of M₂ such that head (q, stay) does not point to #, then M₀ checks that the heads (q, stay) and (q′, α) point to nodes with the same subtree. It can do this using two auxiliary heads that simultaneously perform a depth-first left-to-right traversal of those subtrees. Similarly, if ⟨q, ω, χ(p)⟩ → δ(⟨q₁, α₁⟩, …, ⟨q_m, α_m⟩) is an output rule of M₂ such that head (q, stay) does not point to #, then M₀ checks that it points to a node with label δ and that the subtree at the i-th child of that node equals the subtree at the head (q_i, α_i), for every i ∈ [1, m]. After checking the fixed point requirement (3), M₀ outputs the node v and branches in the same way as M₁.
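The fixed-point requirement being checked here can be sketched as a direct verifier. This is a sketch under assumptions: each nonterminal (i.e., configuration ⟨q, v⟩) has exactly one rule, encoded hypothetically as ("chain", Y) for a move rule or ("node", delta, [Y₁, …]) for an output rule; h maps nonterminals to subtrees of s given as nested pairs, with None playing the role of ⊥; requirement (2) of a tree fixed point needs no separate check here because the guessed values can only be subtrees of s.

```python
def is_tree_fixed_point(rules, h, start, s):
    """Check conditions (1) and (3) of a tree fixed point (Lemma 61)."""
    if h.get(start) != s:            # (1), together with the guess h(start) = s
        return False
    for nt, rule in rules.items():
        v = h.get(nt)
        if v is None:                # rules of 'undefined' nonterminals
            continue                 # are unconstrained
        if rule[0] == "chain":       # nt -> nt2 demands h(nt) = h(nt2)
            if h.get(rule[1]) != v:
                return False
        else:                        # nt -> delta(kids...) demands that h(nt)
            _, delta, kids = rule    # has root delta and subtrees h(kids[i])
            if v[0] != delta or len(v[1]) != len(kids):
                return False
            if any(h.get(kid) != v[1][i] for i, kid in enumerate(kids)):
                return False
    return True
```

For the grammar with rules A → B, B → d(C), C → e and s = d(e), the homomorphism h(A) = h(B) = d(e), h(C) = e passes the check, so by Lemma 61 the grammar generates exactly {d(e)}; replacing h(C) by ⊥ makes the check fail at the rule for B.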
In the i-th branch (apart from simulating M1's rule in the obvious way) it moves head (q, up) to the position of head (q, stay) and then moves head (q, stay) to the position of head (q, down_i), for every q ∈ Q, in linear time.

This ends the description of M. It should be clear that τ_M is the set of all pairs (#(t, s), r) such that (t, r) ∈ τ1 (because M simulates M1) and τ2(r) = s (because M computes a tree fixed point h of G_{M2,r} such that h(⟨q0, root_r⟩) = s). Hence dom(M) = {#(t, s) | ∃r : (t, r) ∈ τ1, τ2(r) = s} = L_τ. It remains to show that M accepts L_τ in polynomial time. (Note that a node of t has the same label and child number in t and #(t, s), except when it has child number 1 in #(t, s), in which case it has child number 0 or 1 in t, depending on whether or not its parent in #(t, s) has label #.)

There is a computation of M1 of length at most (|P| · |t| + 1) · |r| that translates t into r, because if the number of move rules applied between two output rules
is more than the number of configurations of M1 on t, then there is a loop in the computation that can be removed. Since (τ1, τ2) is linear-bounded, we may assume that the size of r is at most linear in the size of s. Hence the length of that computation is polynomial in |t| and |s|, and hence in |#(t, s)|. Since M simulates M1, and each simulated computation step takes linear time (as shown above), M accepts #(t, s) in polynomial time. ✷

From Theorem 62 and Lemma 28, which says that mrMT_io ⊆ fTT↓ ∘ dTT, we obtain the following corollary. Note that TT ∘ dTT is larger than fTT↓ ∘ dTT in two respects. First, it contains non-finitary translations. Second, it contains total functions for which the height of the output tree can be double exponential in the height of the input tree, viz. the translation τ in the proof of Proposition 7, whereas that is at most exponential for total functions in TT↓ ∘ dTT by Theorem 35, Lemma 24, and the paragraph after Corollary 25.

Corollary 63 MT_io ⊆ mrMT_io ⊆ LOGCFL.

As another corollary we even obtain an upper bound on the complexity of the output languages of dTT that improves the one of Theorem 50. It was proved for attribute grammars in [25].
Corollary 64 dTT ( REGT ) ⊆ LOGCFL . Proof.
Let L be a regular tree language over Ω and let τ2 ⊆ T_Ω × T_∆ be in dTT. Let Σ = {e} with rank(e) = 0, and let τ1 = {(e, r) | r ∈ L}. The one-state tt_ℓ with rules ⟨p, e, 0⟩ → ω(⟨p, stay⟩, . . . , ⟨p, stay⟩) for every ω ∈ Ω realizes the translation {(e, r) | r ∈ T_Ω}, and hence τ1 ∈ TT by Corollary 21. Let τ = τ1 ∘ τ2. Then L_τ = {#(e, s) | ∃r : r ∈ L, τ2(r) = s} = {#(e, s) | s ∈ τ2(L)}. By Theorem 62, L_τ ∈ LOGCFL, and hence τ2(L) ∈ LOGCFL because τ2(L) is log-space reducible to L_τ. ✷

Theorem 62 and Corollary 64 can be extended to deal with the yields of the output trees, as also proved in [25] for attribute grammars (generalizing the proof in [3] of IO(1) ⊆ LOGCFL). For a ranked alphabet Σ we define the mapping y_Σ : T_Σ → (Σ^(0))* such that y_Σ(t) = y t, the yield of t. Let yield be the class of all such mappings y_Σ. In what follows we will identify each string w with the monadic tree mon(w) as defined in the proof of Corollary 51. Hence, as mentioned in that proof, yield ⊆ dTT_ℓ. This even holds if we assume the existence of special symbols in Σ^(0) that are skipped when taking the yield of t (such as the symbols X in the derivation trees of context-free grammars with ε-rules, cf. Section 2).

Corollary 65 TT ∘ dTT ∘ yield ⊆ LOGCFL and y dTT(REGT) ⊆ LOGCFL.

Proof.
It is straightforward to show that yield ⊆ dTT_pru ∗ yield. In fact, the deterministic pruning tt removes all nodes of rank 1 and, using regular look-ahead, all subtrees of which the yield is the empty string ε (due to the special symbols mentioned above). Consequently, as in the proof of Corollary 38, TT ∘ dTT ∘ yield = TT ∗ (dTT ∘ yield). This allows us to repeat the proof of Theorem 62, this time with respect to the context-free grammar G′_{M2,r} that generates the yields of the trees generated by G_{M2,r}: if X → ζ is a rule of G_{M2,r}, then X → yζ is a rule of G′_{M2,r}. Thus, this time the mh tt M guesses a fixed point h of G′_{M2,r}, rather than a tree fixed point. To do this it uses two heads ⟨q, stay, left⟩ and ⟨q, stay, right⟩ instead of the one head ⟨q, stay⟩, to guess the left- and right-end of the substring generated by the configuration ⟨q, v⟩, and similarly for up and down_i. It should be clear that the fixed point requirement (3) can easily be checked, showing that one such substring equals another one or is the concatenation of several other ones. ✷

The inclusion TT ∘ dTT ⊆ LOGCFL of Theorem 62 has consequences for both space and time complexity. We first consider space complexity. Since
LOGCFL ⊆ DSPACE(n), we obtain from Theorem 62 that TT ∘ dTT ⊆ DSPACE(n). This can easily be generalized to arbitrary compositions of tt's.

Theorem 66
For every k ≥ 1, TT^k ⊆ DSPACE(n).

Proof.
The proof is by induction on k, with an induction step similar to the one in the proof of Theorem 47. Let τ = τ1 ∘ τ2 such that τ1 ∈ TT and τ2 ∈ TT^k, k ≥ 1. For a given input pair (t, s) it has to be checked whether (t, s) ∈ τ. By Corollary 38(1) we may assume that (τ1, τ2) is linear-bounded. Hence there is a constant c ∈ N such that for every (t, s) ∈ τ there is an intermediate tree r such that |r| ≤ c · |s|. To check whether (t, s) ∈ τ, a deterministic Turing machine systematically enumerates all trees r such that |r| ≤ c · |s| (cf. the proof of Theorem 50). For each such r it can check in linear space whether (t, r) ∈ τ1 by the case k = 1. Moreover, by induction it can check in linear space whether (r, s) ∈ τ2. Thus it uses space O(|t| + |r|) + O(|r| + |s|) = O(|t| + |s|). ✷

This result allows us to prove one of our main results, viz. that the output languages of TT^k are in DSPACE(n), originally proved in [48]. It generalizes the main result of [4] from classical top-down tree transducers to tree-walking tree transducers and macro tree transducers.

Theorem 67
For every k ≥ 1, TT^k(REGT) ⊆ DSPACE(n) and MT^k(REGT) ⊆ DSPACE(n).

Proof.
The proof is similar to the one of Theorem 50. Let L ∈ REGT and τ ∈ TT^k. By Corollary 38(1), τ = τ1 ∘ τ2 such that τ1 ∈ TT_pru, τ2 ∈ TT^k, and (τ1, τ2) is linear-bounded for some constant c. Let L′ = τ1(L), and note that τ(L) = τ2(L′) and that L′ ∈ REGT by Lemma 15. It is straightforward to show that for every s ∈ τ(L) there exists t ∈ L′ such that (t, s) ∈ τ2 and |t| ≤ c · |s|. To check whether a given tree s is in τ(L), a deterministic Turing machine enumerates all input trees t (of τ2) such that |t| ≤ c · |s|. For each such t it first checks that t ∈ L′ in space O(|t|) = O(|s|). Then it uses the algorithm of Theorem 66 to check that (t, s) ∈ τ2 in space O(|t| + |s|) = O(|s|).

The inclusion for MT^k is now immediate from Lemma 27. ✷

As before, Theorems 66 and 67 also hold for io (multi-return) macro tree translations, pebble tree translations, and high-level tree translations, which can be realized by compositions of tt's (see Section 9).

By the proof of Corollary 51, Theorem 67 implies that y TT^k(REGT) ⊆ DSPACE(n) and y MT^k(REGT) ⊆ DSPACE(n) for every k ≥ 1. Hence the oi-hierarchy is also contained in DSPACE(n), cf. Corollaries 59 and 52.

Corollary 68 For every k ≥ 0, OI(k) ⊆ DSPACE(n).

Next we consider time complexity. Since
LOGCFL ⊆ PTIME, it follows from Theorem 62 that TT ∘ dTT ⊆ PTIME. This result can be generalized as follows. One way to increase the power of the tt is to give it a more powerful feature of look-around. For a class L of tree or string languages, we define the tt with L look-around by allowing the tt to use node tests T such that mark(T) ∈ L. Similarly we obtain the mh tt with L look-around. We now consider in particular the case where L = PTIME. Obviously, (the proof of) the first sentence of Lemma 60 is still valid for a multi-head tt M with PTIME look-around. Thus, the domain of an mh tt with PTIME look-around is in PTIME, and hence, in particular, the domain of a tt with PTIME look-around is in PTIME. This implies that Lemma 19, and hence Theorem 20, also holds if the first transducer has PTIME look-around. From the proof of Theorem 62 it now easily follows that TT^P ∘ dTT ⊆ PTIME, where the feature of PTIME look-around is indicated by a superscript P. This, in its turn, implies the following variant of Corollary 63 for (multi-return) io macro tree transducers with PTIME look-around (appropriately defined): MT^P_io ⊆ mrMT^P_io ⊆ PTIME. Examples of tree languages in
PTIME that can be used as look-around are those in dTT(REGT), by Corollary 64, and the tree languages defined by bottom-up tree automata with equality and disequality constraints ([8]), which can obviously be accepted by a multi-head tt.

In the remainder of this section we show that there are translations in dTT ∘ TT, even in dTT↓ ∘ TT, for which the membership problem is NP-complete. We will use a reduction of SAT, the satisfiability problem of boolean formulas (see, e.g., [42]), to such a membership problem.

Let ∆ = {∨, ∧, ¬, v, e} with ∆^(2) = {∨, ∧}, ∆^(1) = {¬, v}, and ∆^(0) = {e}. Let B be the set of all trees over ∆ generated by the regular tree grammar with nonterminals F and V, initial nonterminal F, and rules F → ∨(F, F), F → ∧(F, F), F → ¬(F), F → V, V → v(V), and V → v(e). Thus, B is the set of all boolean formulas that use boolean variables of the form v^ℓ e for ℓ ≥ 1. For every formula ϕ we define ν(ϕ) to be the nesting-depth of its boolean operators, i.e., ν(ϕ) = 0 if ϕ is a variable, ν(∨(ϕ1, ϕ2)) = ν(∧(ϕ1, ϕ2)) = max{ν(ϕ1), ν(ϕ2)} + 1, and ν(¬(ϕ)) = ν(ϕ) + 1. For every m ≥ 0 and n ≥ 1, let B(m, n) be the set of all formulas ϕ ∈ B such that ν(ϕ) ≤ m, and ℓ ∈ [1, n] for every v^ℓ e that occurs in ϕ. Thus, the formulas in B(m, n) have nesting-depth at most m and use at most the variables ve, vve, . . . , v^n e.

The proof of the next lemma is essentially a variant of the one of [74, Theorem 3.1]. Let Σ = {c, d, 0, 1, a} with Σ^(1) = {c, d, 0, 1} and Σ^(0) = {a}.

Lemma 69
There is a translation τ ∈ fTT_ℓ↓ such that, for every m ≥ 0 and every string w ∈ {0, 1}* of length n ≥ 1, the set τ(d^m c w a) consists of all boolean formulas ϕ ∈ B(m, n) such that ϕ is true when the value of v^ℓ e is the ℓ-th symbol of w, for every ℓ ∈ [1, n].

Proof.
We construct the top-down local tt M = (Σ, ∆, {q0, q1}, {q1}, R). Note that the initial state is q1. The boolean operations i ∨ j, i ∧ j, and ¬i on {0, 1} are defined as usual, where 0 stands for 'false' and 1 for 'true'. Since the child numbers of the nodes of the input tree will be irrelevant, we omit them from the left-hand sides of the rules of M. The only instruction used in the right-hand sides of the rules is α = down1. The rules are the following, for every i, j ∈ {0, 1}.

⟨q_{i∨j}, d⟩ → ∨(⟨q_i, α⟩, ⟨q_j, α⟩)    ⟨q_i, c⟩ → v(⟨q_i, α⟩)
⟨q_{i∧j}, d⟩ → ∧(⟨q_i, α⟩, ⟨q_j, α⟩)    ⟨q_i, j⟩ → v(⟨q_i, α⟩)
⟨q_{¬i}, d⟩ → ¬(⟨q_i, α⟩)              ⟨q_i, i⟩ → e
⟨q_i, d⟩ → ⟨q_i, α⟩

Let u be the node of the input tree t = d^m c w a with label c. After consuming d^m, the tt M has nondeterministically generated any output form that is a boolean formula ϕ of nesting-depth at most m and with the two configurations ⟨q_i, u⟩ as variables, such that ϕ is true when the value of ⟨q_i, u⟩ is i. For instance, in the first step of that computation M consumes d and changes the initial output form ⟨q1, root_t⟩ into one of the output forms ∨(⟨q1, x⟩, ⟨q1, x⟩), ∨(⟨q1, x⟩, ⟨q0, x⟩), ∨(⟨q0, x⟩, ⟨q1, x⟩), ∧(⟨q1, x⟩, ⟨q1, x⟩), ¬(⟨q0, x⟩), or ⟨q1, x⟩, where x is the child of root_t. After that, each ⟨q_i, u⟩ generates any variable v^ℓ e such that the ℓ-th symbol of w is i. Note that since i and j are not necessarily distinct, M has in particular the rule ⟨q_i, i⟩ → v(⟨q_i, α⟩) for every i ∈ {0, 1}. Thus, q_i can nondeterministically choose any occurrence of i in w to output e and end the computation. ✷

Applying the translation τ of Lemma 69 to the regular tree language L consisting of all trees d^m c w a such that m ≥ 0 and w is a nonempty string over {0, 1}, produces the set τ(L) of all satisfiable formulas in B.
Thus, since the membership problem for that set is NP-complete, we obtain the following corollary that was proved in [67], as already mentioned after Corollary 59. Note that it is easy to prove that fTT↓(REGT) ⊆ y fTT↓(REGT): just change every output rule ⟨q, σ, j, T⟩ → δ(⟨q1, α1⟩, . . . , ⟨qk, αk⟩) into the (general) rule ⟨q, σ, j, T⟩ → ω_{k+1}(δ, ⟨q1, α1⟩, . . . , ⟨qk, αk⟩) where ω_{k+1} has rank k + 1 (and δ now has rank 0).
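As an aside, the definitions of ν and B(m, n), and the valuation used in Lemma 69, can be made concrete with a short sketch. The nested-tuple representation of formulas and all function names below are hypothetical choices for illustration, not notation from the paper.

```python
# Boolean formulas over ∆ as nested tuples: ("or", f, g), ("and", f, g),
# ("not", f), and ("v", l) for the variable v^l e with l >= 1.

def nu(phi):
    """Nesting depth of boolean operators; 0 for a variable."""
    if phi[0] == "v":
        return 0
    if phi[0] == "not":
        return nu(phi[1]) + 1
    return max(nu(phi[1]), nu(phi[2])) + 1

def in_B(phi, m, n):
    """Membership in B(m, n): nesting depth <= m and every variable
    v^l e satisfies l in [1, n]."""
    if phi[0] == "v":
        return 1 <= phi[1] <= n
    return nu(phi) <= m and all(in_B(c, m, n) for c in phi[1:])

def value(phi, w):
    """Truth value of phi when v^l e denotes the l-th symbol of the
    0/1-string w (the valuation of Lemma 69)."""
    if phi[0] == "v":
        return w[phi[1] - 1] == "1"
    if phi[0] == "not":
        return not value(phi[1], w)
    if phi[0] == "or":
        return value(phi[1], w) or value(phi[2], w)
    return value(phi[1], w) and value(phi[2], w)

# Example: (v e) ∧ ¬(v v e) has nesting depth 2 and is true exactly
# for valuations whose first bit is 1 and whose second bit is 0.
phi = ("and", ("v", 1), ("not", ("v", 2)))
```

Under this reading, τ(d^m c w a) of Lemma 69 is the set of all ϕ ∈ B(m, n) with value(ϕ, w) true.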
There is an NP-complete language in fTT↓(REGT), and hence there is one in y fTT↓(REGT).

We now prove the existence of a translation in dTT↓ ∘ TT for which the membership problem is NP-complete. Recall that, for a tree translation τ, we denote by L_τ the tree language {#(t, s) | (t, s) ∈ τ}.

Theorem 71
There is a translation τ ∈ dTT_ℓ↓ ∘ dTT_ℓ ∘ TT_ℓpru ⊆ dTT↓ ∘ fTT such that L_τ is NP-complete.

Proof.
The inclusion dTT_ℓ ∘ TT_ℓpru ⊆ fTT is immediate from Lemma 19. We first describe a translation τ ∈ dTT_ℓ↓ ∘ fTT_ℓ such that L_τ is NP-complete. Let Γ = {a, b, c, d, e} with Γ^(1) = {a, b, c, d} and Γ^(0) = {e}. The translation τ ⊆ T_Γ × T_∆ transforms each tree t = a b^n c d^m e into all satisfiable boolean formulas in B(m, n). This will be realized by the composition of two tt's M1 and M2 such that the deterministic tt M1 transforms t into a tree s of which the path language consists of all strings a w c d^m e with w ∈ {0, 1}* of length n, and M2 nondeterministically chooses a leaf of s and then walks back to the root of s while simulating the transducer M of (the proof of) Lemma 69 on the tree d^m c w a ∈ T_Σ. (The path language of a tree s ∈ T_Ω consists of all strings in Ω* that are obtained by walking along a path from the root of s to one of its leaves, writing down the labels of the nodes of that path from left to right.) Thus, M1 provides all possible valuations of the variables ve, vve, . . . , v^n e, and M2 chooses one such valuation and produces all formulas in B(m, n) that are true for that valuation.

Let Ω = {a, 0, 1, c, d, e} with Ω^(2) = {a, 0, 1}, Ω^(1) = {c, d}, and Ω^(0) = {e}. We define τ = τ_M1 ∘ τ_M2 ⊆ T_Γ × T_∆ where M1 and M2 are the following tt's. The deterministic tt_ℓ↓ M1 = (Γ, Ω, {q, q0, q1, p}, {q}, R1) has the following rules, for i ∈ {0, 1} and α = down1.
⟨q, a, 0⟩ → a(⟨q0, α⟩, ⟨q1, α⟩)     ⟨p, d, 1⟩ → d(⟨p, α⟩)
⟨q_i, b, 1⟩ → i(⟨q0, α⟩, ⟨q1, α⟩)    ⟨p, e, 1⟩ → e
⟨q_i, c, 1⟩ → c(⟨p, α⟩)

It should be clear that for an input tree a b^n c d^m e, the path language of τ_M1(a b^n c d^m e) consists of all strings a w c d^m e with w ∈ {0, 1}* of length n.

The tt_ℓ M2 = (Ω, ∆, Q, Q0, R2) has states Q = {q, q0, q1} and Q0 = {q}. On an input tree τ_M1(a b^n c d^m e), it walks nondeterministically in state q from the root to some leaf (without producing output), moves to the parent of that leaf, and then simulates the transducer M of Lemma 69 on the tree d^m c w a ∈ T_Σ while walking back to the root. It starts that simulation in the state q1 of M, and then uses the rules of M with α = up.

With this definition of M1 and M2, it follows from Lemma 69 that the set τ(a b^n c d^m e) consists of all boolean formulas ϕ ∈ B(m, n) such that ϕ is satisfiable. Thus, for a formula ϕ ∈ B(m, n), ϕ is satisfiable if and only if #(a b^n c d^m e, ϕ) is in L_τ. This shows that satisfiability is reducible to membership in L_τ, because the nesting-depth m of ϕ and the number n of variables it uses can easily be computed from any ϕ ∈ B in polynomial time.

We finally show that τ_M2 ∈ dTT_ℓ ∘ TT_ℓpru, by a standard technique (see, e.g., [34, Section 6.1]). In fact, we will show that τ_M2 ∈ dTT_ℓ ∘ SET, cf. the proof of Lemma 27. Let + be a new symbol of rank 2, and θ a new symbol of rank 0. Let M2′ be the deterministic tt_ℓ with output alphabet ∆ ∪ {+, θ} that is obtained from M2 as follows. For every triple ⟨q, ω, j⟩ such that q ∈ Q, ω ∈ Ω, and j ∈ [0, mx_Ω], if ⟨q, ω, j⟩ → ζ1, . . . , ⟨q, ω, j⟩ → ζr are all the rules of M2 with left-hand side ⟨q, ω, j⟩, then M2′ has the rule ⟨q, ω, j⟩ → +(ζ1, +(ζ2, ζ3)) if r = 3, the rule ⟨q, ω, j⟩ → +(ζ1, ζ2) if r = 2, the rule ⟨q, ω, j⟩ → ζ1 if r = 1, and the rule ⟨q, ω, j⟩ → θ if r = 0. Let M3 be the pruning tt with one state p and rules ⟨p, δ, j⟩ → δ(⟨p, down1⟩, . . . , ⟨p, downk⟩) for every δ ∈ ∆^(k), plus the rules ⟨p, +, j⟩ → ⟨p, down1⟩ and ⟨p, +, j⟩ → ⟨p, down2⟩ (for every child number j). Since M2 first moves from the root to a leaf, and then moves back to the root, it does not have infinite computations. From that it should be clear that τ_M2 = τ_M2′ ∘ τ_M3. ✷
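To illustrate the construction in this proof, here is a sketch of the tree s = τ_M1(a b^n c d^m e) and its path language; the tuple encoding of trees and all helper names are hypothetical choices, not part of the paper's formalism.

```python
def d_chain(m):
    """The monadic tree d^m e, as nested tuples."""
    return ("e",) if m == 0 else ("d", d_chain(m - 1))

def sub(i, n, m):
    """Subtree rooted at a node labeled i: the bit i followed by all
    continuations with n further bits, then c d^m e. The binary
    symbols 0 and 1 always get two children, as in the rules of M1."""
    if n == 0:
        c = ("c", d_chain(m))
        return (str(i), c, c)
    return (str(i), sub(0, n - 1, m), sub(1, n - 1, m))

def s_tree(n, m):
    """The output tree of M1 on a b^n c d^m e, assuming n >= 1."""
    return ("a", sub(0, n - 1, m), sub(1, n - 1, m))

def paths(t):
    """The path language: all root-to-leaf label strings."""
    if len(t) == 1:
        return {t[0]}
    return {t[0] + p for child in t[1:] for p in paths(child)}
```

For n = 2 and m = 1, paths(s_tree(2, 1)) is exactly {a w c d e | w ∈ {0, 1}^2}, matching the claim about the path language of τ_M1(a b^n c d^m e).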
There is a translation τ ∈ MT such that L_τ is NP-complete.

Proof.
By Lemma 24, dTT_ℓ↓ ∘ dTT_ℓ ⊆ dMT. Moreover, by [34, Theorem 7.6(3)], dMT ∘ TT_ℓpru ⊆ MT. Hence the translation τ of Theorem 71 is in MT. ✷

Since MT ⊆ MT_oi by [34, Theorem 6.10], this also shows that there is a translation τ ∈ MT_oi such that L_τ is NP-complete, cf. Corollary 63.

Whereas we have considered ranked trees until now, i.e., trees over a ranked alphabet, XML documents naturally correspond to unranked trees or forests, over an ordinary unranked alphabet. For that reason we now consider transducers that transform forests into forests. Rather than generalizing the tt to a "forest-walking forest transducer", we take the equivalent, natural approach of letting the tt transform representations of forests by (ranked) trees, cf. [63] and [28, Section 11].

For an ordinary (unranked) alphabet Σ the set F_Σ of forests over Σ is the language generated by the context-free grammar with nonterminals F and T, initial nonterminal F, set of terminals Σ ∪ {[, ]}, where {[, ]} is the set consisting of the left and right square bracket, and rules F → ε, F → T F, and T → σ[F] for every σ ∈ Σ. Thus, intuitively, a forest is a sequence of unranked trees, and an unranked tree is of the form σ[t1 · · · tn] where each ti is an unranked tree. Note that every nonempty forest f ∈ F_Σ can be uniquely written as f = σ[f1] f2 with σ ∈ Σ and f1, f2 ∈ F_Σ.

As usual, forests can be encoded as binary trees. With Σ we associate the ranked alphabet Σ_e = Σ ∪ {e} where e has rank 0 and every σ ∈ Σ has rank 2. The mapping enc_Σ : F_Σ → T_{Σe} is defined as follows. The encoding of the empty forest is enc_Σ(ε) = e, and recursively, the encoding of a forest f = σ[f1] f2 is enc_Σ(f) = σ(enc_Σ(f1), enc_Σ(f2)). The mapping enc_Σ is a bijection, and the inverse decoding is denoted by dec_Σ. Let enc and dec denote the classes of encodings enc_Σ and decodings dec_Σ, respectively, for all alphabets Σ. We define FT = enc ∘ TT ∘ dec to be the class of tt forest translations.
Thus, a tt forest translation is of the form τ = enc_Σ ∘ τ_M ∘ dec_∆ where Σ and ∆ are alphabets and M is a tt with input alphabet Σ_e and output alphabet ∆_e, which in this context can be called a tt forest transducer. We first restrict attention to deterministic tt forest transducers, i.e., to the class dFT = enc ∘ dTT ∘ dec.

The next simple lemma shows that the encodings of compositions are the compositions of encodings (of deterministic tt's).

Lemma 73
For every k ≥ 1, dFT^k = enc ∘ dTT^k ∘ dec.

Proof.
The inclusion dFT^k ⊆ enc ∘ dTT^k ∘ dec is obvious, because dec_∆ ∘ enc_∆ is the identity on T_{∆e} for every (unranked) alphabet ∆. To show that enc ∘ dTT^k ∘ dec ⊆ dFT^k, it suffices to prove that dTT ∘ dTT ⊆ dTT ∘ dec ∘ enc ∘ dTT. Let Γ be the (ranked) output alphabet of a first transducer, which is also the input alphabet of the second, and let id_Γ be the identity on T_Γ. By the composition results of Theorems 18 and 23, it now suffices to show that id_Γ ∈ dTT_ℓ↓ ∘ dec ∘ enc ∘ dTT_ℓsu. We do this by encoding the trees over Γ as binary trees, similar to the transformation of the derivation trees of a context-free grammar into those of its Chomsky Normal Form. Let ω be a new symbol, and let ∆ be the unranked alphabet Γ ∪ {ω}. We encode the trees over Γ as trees over the ranked alphabet ∆_e, which are the usual encodings of forests over ∆. The encoding h : T_Γ → T_{∆e} is defined as follows: for every γ ∈ Γ^(k), if h(ti) = ti′ for every i ∈ [1, k], then h(γ(t1, t2, . . . , tk)) = γ(e, ω(t1′, ω(t2′, . . . ω(tk′, e) · · · ))). It should be clear that h is an injection. It should also be clear that h ∈ dTT_ℓ↓ (in fact, h is a tree homomorphism, which can be realized by a classical top-down tree transducer). Finally, it is also easy to construct a local top-down single-use tt M such that τ_M(h(t)) = t for every t ∈ T_Γ. It has the set of states Q = {q_i | i ∈ [0, mx_Γ]} with initial state q0, and the following rules (where γ ∈ Γ^(k), j is any child number, and, in the last rule, i ≥ 2):

⟨q0, γ, j⟩ → γ(⟨q1, down2⟩, . . . , ⟨qk, down2⟩)
⟨q1, ω, j⟩ → ⟨q0, down1⟩
⟨q_i, ω, j⟩ → ⟨q_{i−1}, down2⟩

Note that γ and ω have rank 2 in ∆_e. ✷

We now wish to show that our main results also hold for deterministic tt forest translations.
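As an aside, the mutually inverse mappings enc_Σ and dec_Σ are easy to sketch. The list-of-pairs forest representation and all function names below are hypothetical choices; the last function follows the bracket-rewriting description of footnote 17.

```python
# A forest is a list of unranked trees; an unranked tree is a pair
# (label, children-forest). enc maps forests to binary trees over
# Sigma_e, written as nested tuples with a nullary symbol "e".

def enc(f):
    """enc(eps) = e; enc(sigma[f1] f2) = sigma(enc(f1), enc(f2))."""
    if not f:
        return ("e",)
    (sigma, f1), f2 = f[0], f[1:]
    return (sigma, enc(f1), enc(f2))

def dec(t):
    """The inverse decoding; enc and dec are mutually inverse."""
    if t == ("e",):
        return []
    sigma, t1, t2 = t
    return [(sigma, dec(t1))] + dec(t2)

def enc_string(s):
    """String version of enc, following footnote 17: drop every '[',
    turn every ']' into 'e', and append one final 'e'."""
    return s.replace("[", "").replace("]", "e") + "e"

def preorder(t):
    """Preorder label string of a binary tree, for comparison."""
    return t[0] + "".join(preorder(c) for c in t[1:])
```

For example, the forest a[b[]c[]] d[] is encoded as a(b(e, c(e, e)), d(e, e)), whose preorder string equals the bracket-rewriting of "a[b[]c[]]d[]".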
Let us first consider the complexity results of Section 10. It is easy to see that for every alphabet Σ, the mappings enc_Σ and dec_Σ can be computed by a deterministic Turing machine in linear time and space, simulating a one-way pushdown transducer. This implies, by Lemma 73, that Theorems 47 and 49 also hold for dFT^k. We define a set of forests L ⊆ F_Σ to be a regular forest language if enc_Σ(L) ∈ REGT, and we denote the class of regular forest languages by REGF. Then, for every k ≥ 1, the class dFT^k(REGF) of output forest languages is included in the class dec(dTT^k(REGT)) by Lemma 73. Let L ∈ REGT and τ ∈ dTT^k with output alphabet ∆_e. Then a forest f over ∆ is in dec_∆(τ(L)) if and only if enc_∆(f) is in τ(L). That implies that Theorem 50 also holds for dFT^k, in the sense that dFT^k(REGF) ⊆ DSPACE(n).

Next we consider the results of Section 9, and extend the class LSIF in the obvious way to forest translations. Since it is easy to show that for every forest f ∈ F_Σ, we have |enc_Σ(f)| = |f| + 1 (see footnote 17), a translation τ′ = enc_Σ ∘ τ ∘ dec_∆ is of linear size increase if and only if τ is of linear size increase. Thus, since dFT^k = enc ∘ dTT^k ∘ dec by Lemma 73, it is decidable for a given composition of deterministic tt forest transducers whether or not it is of linear size increase. And if so, an equivalent deterministic tt forest transducer can be constructed: dFT^k ∩ LSIF = enc ∘ (dTT^k ∩ LSIF) ∘ dec = enc ∘ dTT_su ∘ dec ⊆ dFT. Intuitively, enc ∘ dTT_su ∘ dec is the class of translations realized by "single-use forest-walking forest transducers". Since dTT_su = dMSOT by Proposition 29, it is also the class enc ∘ dMSOT ∘ dec. Viewing forests as graphs, and hence as logical structures, in the obvious way (just as trees), every encoding enc_Σ and every decoding dec_Σ is a deterministic (i.e., parameterless) mso translation, as defined in [14, Chapter 7]. Hence, by the closure of mso translations under composition [14, Theorem 7.14], enc ∘ dMSOT ∘ dec equals the (natural) class of deterministic mso translations from forests to forests.

As observed in [65] for macro tree transducers, whereas the encoding of forests as binary trees is quite natural for the input forest of a tt, for the output forest it is less natural, because it forces the tt to generate the output forest f in its unique form f = σ[f1] f2.
It is more natural to additionally allow the tt to generate f as a concatenation f1 f2 of two forests f1 and f2. To formalize this, as in [26, Section 7] and in accordance with [65], we associate with an alphabet ∆ the ranked alphabet ∆@ = ∆ ∪ {@, e} where @ has rank 2, e has rank 0, and every δ ∈ ∆ has rank 1. The mapping flat_∆ : T_{∆@} → F_∆ is a "flattening" defined as follows (for t1, t2 ∈ T_{∆@} and δ ∈ ∆): flat_∆(e) = ε, flat_∆(@(t1, t2)) = flat_∆(t1) flat_∆(t2), the concatenation of flat_∆(t1) and flat_∆(t2), and flat_∆(δ(t1)) = δ[flat_∆(t1)]. The mapping flat_∆ is surjective but, in general, not injective. Let flat denote the class of flattenings flat_∆, for all alphabets ∆. We define FT@ = enc ∘ TT ∘ flat to be the class of extended tt forest translations. An extended tt forest transducer is a tt with input alphabet Σ_e and output alphabet ∆@. Again, we first restrict attention to deterministic transducers, i.e., to the class dFT@ = enc ∘ dTT ∘ flat.

(Footnote 17: In fact, enc_Σ can even be computed without a pushdown: for every forest f ∈ F_Σ, enc_Σ(f) can be obtained from f by removing all left brackets, changing each right bracket into e, and adding one e at the end.)

Let us show that there is an extended tt forest translation in dFT@ that is not in dFT. That was shown for macro tree transducers in [65, Theorem 8] by a similar argument. Let Γ = {σ} and Ω = {δ} be alphabets, and let us identify the forest σ[] with the symbol σ, and similarly δ[] with δ. Then Γ* ⊆ F_Γ and Ω* ⊆ F_Ω. There is a deterministic extended tt forest transducer that translates the string σ^n into the string δ^(2^(n+1)) for every n ∈ N. In fact, let M be the dtt (with general rules) that is obtained from the dtt M_exp of Example 5 by changing its output alphabet into Ω@ = {@, δ, e}, and changing σ into @ and e into δ(e) in the right-hand sides of its rules. Note that the input alphabet Σ of M_exp and M equals Γ_e.
The input tree t_n = enc_Γ(σ^n) = σ(e, σ(e, . . . σ(e, e) · · · )) is translated by M_exp into the full binary tree s_n over Σ with 2^(n+1) leaves. Clearly, M translates t_n into the tree s_n′ that is obtained from s_n by changing every σ into @ and every e into δ(e). Thus, flat_Ω(s_n′) = δ^(2^(n+1)). This forest translation is not in dFT, because |enc_Γ(σ^n)| = |t_n| = 2n + 1 but the height of s_n″ = enc_Ω(δ^(2^(n+1))) is 2^(n+1), and so, by Lemma 6, there is no dtt that translates t_n into s_n″.

We will show that dFT ⊆ dFT@ ⊆ dFT^2. A similar result was proved for macro tree transducers in [65, Theorem 8 and Corollary 12]. To compare dFT and dFT@, and their compositions, we establish two relationships between dec and flat in the next lemma.

Lemma 74 dec ⊆ dTT_ℓ↓ ∘ flat and flat ⊆ dTT_ℓsu ∘ dec.

Proof.
To show the first inclusion, let ∆ be an alphabet and define the mapping h : T_{∆e} → T_{∆@} such that h(e) = e and, if h(t1) = t1′ and h(t2) = t2′, then h(δ(t1, t2)) = @(δ(t1′), t2′). It is straightforward to prove that h ∘ flat_∆ = dec_∆. It is also easy to show that h ∈ dTT_ℓ↓ (as in the proof of Lemma 73, h is a tree homomorphism, which can be realized by a classical top-down tree transducer). Hence dec_∆ ∈ dTT_ℓ↓ ∘ flat.

For the second inclusion, let ∆ be an alphabet. The mapping flat_∆ ∘ enc_∆ can be realized by a local single-use dtt M = (∆@, ∆_e, Q, q0, R) that performs a depth-first left-to-right tree traversal in a special way. Rather than performing this traversal in one branch, it does so in all its branches together, each branch performing a separate piece of the traversal. When M arrives from above at a node u with label δ ∈ ∆, it outputs δ and splits into two branches. The first branch traverses the subtree at u, and the second branch continues the traversal after that subtree. Each branch outputs e when arriving from below at a ∆-labeled node (or at the root, at the end of the traversal). Formally, M has the state set Q = {d, u1, u2} with initial state q0 = d, cf. Examples 4 and 5. It has the following (general) rules, where j′ ∈ [0, 2], j ∈ [1, 2], and δ ∈ ∆:

⟨d, @, j′⟩ → ⟨d, down1⟩                 ⟨d, e, j⟩ → ⟨u_j, up⟩
⟨d, δ, j⟩ → δ(⟨d, down1⟩, ⟨u_j, up⟩)    ⟨d, e, 0⟩ → e
⟨d, δ, 0⟩ → δ(⟨d, down1⟩, e)
⟨u1, @, j′⟩ → ⟨d, down2⟩                ⟨u1, δ, j′⟩ → e
⟨u2, @, j⟩ → ⟨u_j, up⟩
⟨u2, @, 0⟩ → e

Thus, since τ_M = flat_∆ ∘ enc_∆, it follows that flat_∆ = τ_M ∘ dec_∆ ∈ dTT_ℓsu ∘ dec. We note that the mapping flat_∆ ∘ enc_∆ is denoted 'eval' in [65, Section 4], 'APP' in [61], and 'app' in [26, Section 7]. For the reader familiar with mso translations we observe that it is also easy to show that both flat_∆ and enc_∆ are deterministic mso translations, and hence their composition is one.
The second inclusion then follows from Proposition 29. ✷

It follows from the first inclusion of Lemma 74 that dFT ⊆ dFT@. In fact, enc ∘ dTT ∘ dec ⊆ enc ∘ dTT ∘ dTT_ℓ↓ ∘ flat, which is included in enc ∘ dTT ∘ flat by Theorem 18. It follows from the second inclusion that dFT^k@ ⊆ dFT^(k+1) for every k ≥ 1. In fact, dFT^k@ = (enc ∘ dTT ∘ flat)^k ⊆ (enc ∘ dTT ∘ dTT_ℓsu ∘ dec)^k ⊆ enc ∘ (dTT ∘ dTT_ℓsu)^k ∘ dec, which is included in enc ∘ dTT^k ∘ dTT_ℓsu ∘ dec by Theorem 23 and hence in enc ∘ dTT^(k+1) ∘ dec, which equals dFT^(k+1) by Lemma 73.

Corollary 75 dFT^k ⊆ dFT^k@ ⊆ dFT^(k+1) for every k ≥ 1.

From the second inclusion we obtain that our main results also hold for deterministic extended tt forest transducers. It is decidable whether or not a composition of such transducers is of linear size increase, and dFT^k@ ∩ LSIF = enc ∘ dTT_su ∘ dec ⊆ dFT ⊆ dFT@. The complexity results of Theorems 47, 49, and 50 also hold for dFT^k@.

The class of deterministic macro forest translations of [65] can be defined as dMFT@ = enc ∘ dMT ∘ flat. Since dTT ⊆ dMT ⊆ dTT^2 by Lemma 24, we conclude by similar arguments as for dFT@ that dMFT^k@ ⊆ dFT^(2k+1), and hence our main results also hold for deterministic macro forest transducers. It is decidable whether or not a composition of such transducers is of linear size increase, and dMFT^k@ ∩ LSIF = enc ∘ dTT_su ∘ dec ⊆ dFT ⊆ dFT@ ⊆ dMFT@. The complexity results of Theorems 47, 49, and 50 also hold for dMFT^k@.

The main results of Sections 9 and 11 also hold for nondeterministic forest transducers. Instead of Lemma 73 we use the obvious fact that TT ∘ dec ∘ enc ∘ TT ⊆ TT ∘ TT. This implies, together with Lemma 74, that it suffices to prove that the results for TT^k also hold for the class enc ∘ TT^k ∘ dec. For the nondeterministic version of Theorem 43 in Section 9, we note that a translation enc_Σ ∘ τ ∘ dec_∆ is a function if and only if τ is a function. Consequently, (enc ∘ TT^k ∘ dec) ∩ F ⊆ enc ∘ (TT^k ∩ F) ∘ dec ⊆ enc ∘ dTT^(k+1) ∘ dec = dFT^(k+1) by Lemma 73. (It can be shown that the nondeterministic version of Lemma 73 also holds, but we will not do that here.)
Similarly, (enc ∘ TT^k ∘ dec) ∩ LSIF = enc ∘ dTT_su ∘ dec. Obviously, the complexity results of Theorems 57 and 58 in Section 11 hold for enc ∘ TT^k ∘ dec, with the same proof as in the deterministic case. The class of nondeterministic macro forest translations of [65] can be defined as MFT@ = enc ∘ MT ∘ flat. From Lemmas 27 and 74 we obtain that MFT^k@ ⊆ enc ∘ TT^(2k+1) ∘ dec, and hence all these results also hold for macro forest transducers.

We finally show that the results of Section 12 also hold for nondeterministic forest transducers. We first consider enc ∘ TT ∘ dTT ∘ dec and enc ∘ TT ∘ dTT ∘ flat. For a forest translation τ we define the forest language L_τ = {f #[g] | (f, g) ∈ τ}. If τ = enc_Σ ∘ τ′ ∘ dec_∆ with τ′ ∈ TT ∘ dTT, then f #[g] ∈ L_τ if and only if #(enc_Σ(f), enc_∆(g)) ∈ L_τ′. Since enc_Σ(f) can be computed by a deterministic finite-state transducer (see footnote 17), and similarly for enc_∆(g), L_τ is log-space reducible to L_τ′. Hence enc ∘ TT ∘ dTT ∘ dec ⊆ LOGCFL by Theorem 62. Similarly, if τ′ ∈ dTT, then g ∈ τ(L) if and only if enc_∆(g) ∈ τ′(enc_Σ(L)) for every L ∈ REGF, and hence dFT(REGF) ⊆ LOGCFL by Corollary 64. To show the same results for flat instead of dec, we need the following small lemma.
Lemma 76 flat ⊆ dTT ↓ ◦ yield . Proof.
For an alphabet ∆, let Ω be the ranked alphabet ∆ ∪ {[, ]} ∪ {λ, @, ω} such that Ω^(0) = ∆ ∪ {[, ], λ}, Ω^(2) = {@}, and Ω^(4) = {ω}. We define the deterministic tt_ℓ↓ N = (∆@, Ω, {p}, p, R) with the following (general) rules, for every δ ∈ ∆ and every child number j:

⟨p, @, j⟩ → @(⟨p, down1⟩, ⟨p, down2⟩)
⟨p, e, j⟩ → λ
⟨p, δ, j⟩ → ω(δ, [, ⟨p, down1⟩, ])

Assuming that the symbol λ is skipped when taking yields (cf. the sentence before Corollary 65), it should be clear that flat_∆(t) is the yield of τ_N(t) for every t ∈ T_{∆@}. ✷

It follows from Lemma 76 and Theorem 18 that enc ∘ TT ∘ dTT ∘ flat ⊆ enc ∘ TT ∘ dTT ∘ yield and dFT@ = enc ∘ dTT ∘ flat ⊆ enc ∘ dTT ∘ yield. If τ is a forest translation such that τ = enc_Σ ∘ τ′ with τ′ ∈ TT ∘ dTT ∘ yield, then f #[g] ∈ L_τ if and only if #(enc_Σ(f), g) ∈ L_τ′. Hence enc ∘ TT ∘ dTT ∘ flat ⊆ LOGCFL by Corollary 65. Similarly, if τ′ ∈ dTT ∘ yield, then τ(L) = τ′(enc_Σ(L)) for every L ∈ REGF, and so dFT@(REGF) ⊆ LOGCFL by Corollary 65. If we define the class of io macro forest translations to be enc ∘ MT_io ∘ flat, then that class is included in enc ∘ TT ∘ dTT ∘ flat by Lemma 28 and hence in LOGCFL by the above. Thus, Corollary 63 also holds for macro forest transducers.

For a forest translation τ = enc_Σ ∘ τ′ ∘ dec_∆ with τ′ ∈ TT^k it is easy to prove that L_τ ∈ DSPACE(n) and that τ(L) ∈ DSPACE(n) for every L ∈ REGF, as we did above for τ′ ∈ TT ∘ dTT and τ′ ∈ dTT, respectively, thus generalizing Theorems 66 and 67. That also holds for flat_∆ instead of dec_∆, because enc ∘ TT^k ∘ flat ⊆ enc ∘ TT^(k+1) ∘ dec by Lemma 74.

The NP-completeness results of Section 12 also hold for extended forest translations. The translation τ of Theorem 71 can be changed into a translation in enc ∘ dTT↓ ∘ fTT ∘ flat as follows. First, change M1 in the proof of Theorem 71 such that it obtains as input the encodings of the strings a b^n c d^m (viewed as forests).
Second, change M such that it outputs trees over ∆@ rather than ∆ (by changing the rule ⟨q_{i∨j}, d⟩ → ∨(⟨q_i, α⟩, ⟨q_j, α⟩) of M in the proof of Lemma 69 into the general rule ⟨q_{i∨j}, d⟩ → ∨(@(⟨q_i, α⟩, ⟨q_j, α⟩)), and similarly for ∧). As a result, τ outputs boolean expressions as forests rather than ranked trees. Thus we obtain an NP-complete extended forest translation in enc ◦ dTT↓ ◦ fTT ◦ flat, and hence one in MFT@. In a similar way we also obtain an NP-complete forest language in FT@(REGF). The details are left to the reader. It is not clear whether these results hold for dec instead of flat.
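Returning to Lemma 76, the single top-down pass in its proof is easy to check on a small example. The following Python sketch is illustrative only: trees over ∆@ and over Ω are nested tuples, labels are assumed distinct from the special symbols @ and e, and the names tau_N and yld stand for the translation of the transducer N and for the yield operation (which skips λ), respectively.

```python
def tau_N(t):
    # The three rules of N from the proof of Lemma 76, applied top-down:
    #   <p, @>     -> @(<p, down_1>, <p, down_2>)
    #   <p, e>     -> lambda
    #   <p, delta> -> omega(delta, [, <p, down_1>, ])
    if t[0] == '@':
        return ('@', tau_N(t[1]), tau_N(t[2]))
    if t[0] == 'e':
        return ('lam',)
    delta, child = t  # every delta in Delta has rank 1 in Delta@
    return ('omega', (delta,), ('[',), tau_N(child), (']',))

def yld(t):
    # Yield of a tree: concatenate leaf labels left to right, skipping lambda.
    if len(t) == 1:
        return '' if t[0] == 'lam' else t[0]
    return ''.join(yld(child) for child in t[1:])
```

On the encoding of the forest a⟨b⟩ c, i.e., the tree @(a(@(b(e), e)), @(c(e), e)), composing the two functions produces the flattening a[b[]]c[], as the lemma asserts.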
14 Conclusion
Our main technical result transforms a composition of k tt's into a linear-bounded composition of k tt's, cf. Corollary 38. As observed in Remark 41, our proof of this result can involve a 2(k − 1)-fold exponential blow-up of the sizes of the transducers.

We have proved that dTT^k ∩ LSIF ⊆ dTT for every k ≥ 1, i.e., every composition of d tt's that is of linear size increase can be realized by one d tt. Moreover, it is decidable whether or not such a composition is of linear size increase. Do similar results hold for polynomial size increase? For instance, does there exist m ≥ 1 such that every translation in ⋃_{k≥1} dTT^k of quadratic size increase is in dTT^m? The same question can be asked for ℓ-fold exponential size increase, for each fixed ℓ ∈ ℕ.

We have shown in Section 7 that even TT^k ∩ LSIF ⊆ dTT for every k ≥ 1. However, it is not known whether it is decidable for a tt M whether or not τ_M is a function of linear size increase. This would be solved if it was decidable whether or not τ_M is a function. But that is also unknown, whereas it has been proved for classical top-down tree transducers (with regular look-ahead) in [36, Theorem 8]. Note that deciding functionality of τ_M also solves the equivalence problem for d tt's, which is already a long-standing open problem (cf. [22, 60]); in fact, τ_1, τ_2 ∈ dTT are the same if and only if they have the same domain and τ_1 ∪ τ_2 is functional.

Another open question for nondeterministic tt's is whether or not there exists m ≥ 1 such that TT^k ∩ LSIR ⊆ TT^m holds for every k ≥ 1, where LSIR consists of all relations τ ⊆ T_Σ × T_∆ of linear size increase, which means that there is a constant c ∈ ℕ such that |s| ≤ c · |t| for every (t, s) ∈ τ. It follows from (the proof of) [48, Theorem 3.21] (see also [49, 50]) that TT^2 ∩ LSIR is not included in MT, and hence not in TT by the remark following Lemma 27. Similar questions can be asked for macro tree transducers, i.e., for the classes dMT and MT.

We have shown in Lemma 12 that dTT↓ = dTT_s↓, but we do not know whether or not dTT = dTT_s.
In other words, we do not know whether for every tt there is an equivalent sub-testing tt, in which the regular test of a rule only inspects the subtree of the current node. Or, even more informally, can regular look-around be simulated by regular look-ahead?

We have shown in Corollary 59 that the string languages in the oi-hierarchy, which are generated by high-level grammars, are in NSPACE(n) ∧ NPTIME, and in Corollary 68 that they are in DSPACE(n). However, the languages of the oi-hierarchy are generated by so-called "safe" high-level grammars, and it is not known whether the same results hold for unsafe high-level grammars. It is proved in [54] that the languages generated by unsafe level-2 grammars, the unsafe version of OI(2), are in NSPACE(n).

In Section 12 we have shown that dTT^k ⊆ PTIME, that TT ◦ dTT ⊆ LOGCFL ⊆ PTIME, and that dTT ◦ TT contains an NP-complete translation. It remains to find out, for k ≥ 2, whether TT ◦ dTT^k ⊆ PTIME or whether it contains an NP-complete translation.

Acknowledgements.
We are grateful to the reviewers for their constructive comments.
References

[1] Aho AV (1968) Indexed grammars – an extension of context-free grammars. Journal of the ACM 15: 647–671
[2] Aho AV, Ullman JD (1971) Translations on a context-free grammar. Information and Control 19: 439–475
[3] Asveld PRJ (1981) Time and space complexity of inside-out macro languages. International Journal of Computer Mathematics 10: 3–14
[4] Baker BS (1978) Generalized syntax-directed translation, tree transducers, and linear space. SIAM Journal on Computing 7: 376–391
[5] Bartha M (1982) An algebraic definition of attributed transformations. Acta Cybernetica 5: 409–421
[6] Bloem R, Engelfriet J (1997) Monadic second order logic and node relations on graphs and trees. In: Mycielski J, Rozenberg G, Salomaa A (eds) Structures in Logic and Computer Science
[11] Tree Automata Techniques and Applications. Available at http://tata.gforge.inria.fr/
[12] Cook SA (1971) Characterizations of pushdown machines in terms of time-bounded computers. Journal of the ACM 18: 4–18
[13] Courcelle B (1994) Monadic second-order definable graph translations: a survey. Theoretical Computer Science 126: 53–75
[14] Courcelle B, Engelfriet J (2012) Graph Structure and Monadic Second-Order Logic. Cambridge University Press
[15] Courcelle B, Franchi-Zannettacci P (1982) Attribute grammars and recursive program schemes I, II. Theoretical Computer Science 17: 163–191, 235–257
[16] Damm W (1982) The IO- and OI-hierarchies. Theoretical Computer Science 20: 95–207
[17] Deransart P, Jourdan M, Lorho B (1988) Attribute Grammars – Definitions, Systems and Bibliography. Lecture Notes in Computer Science 323, Springer-Verlag
[18] Doner J (1970) Tree acceptors and some of their applications. Journal of Computer and System Sciences 4: 406–451
[19] Engelfriet J (1975) Tree automata and tree grammars. DAIMI FN-10 Lecture Notes, Aarhus University. A slightly revised version is available at arXiv:1510.02036
[20] Engelfriet J (1977) Top-down tree transducers with regular look-ahead. Mathematical Systems Theory 10: 289–303
[21] Engelfriet J (1978) On tree transducers for partial functions. Information Processing Letters 7: 170–172
[22] Engelfriet J (1980) Some open questions and recent results on tree transducers and tree languages. In: Book RV (ed) Formal Language Theory – Perspectives and Open Problems. Academic Press, pp 241–286
[23] Engelfriet J (1984) Attribute grammars: attribute evaluation methods. In: Lorho B (ed) Methods and Tools for Compiler Construction. Cambridge University Press, pp 103–138
[24] Engelfriet J (1986) Context-free grammars with storage. Technical Report 86-11, University of Leiden. A slightly revised version is available at arXiv:1408.0683
[25] Engelfriet J (1986) The complexity of languages generated by attribute grammars. SIAM Journal on Computing 15: 70–86
[26] Engelfriet J (2009) The time complexity of typechecking tree-walking tree transducers. Acta Informatica 46: 139–154
[27] Engelfriet J, Filé G (1981) The formal power of one-visit attribute grammars. Acta Informatica 16: 275–302
[28] Engelfriet J, Hoogeboom HJ, Samwel B (2018) XML navigation and transformation by tree-walking automata and transducers with visible and invisible pebbles. Technical Report, available at arXiv:1809.05730
[29] Engelfriet J, Maneth S (1999) Macro tree transducers, attribute grammars, and MSO definable tree translations. Information and Computation 154: 34–91
[30] Engelfriet J, Maneth S (2002) Output string languages of compositions of deterministic macro tree transducers. Journal of Computer and System Sciences 64: 350–395
[31] Engelfriet J, Maneth S (2003) A comparison of pebble tree transducers with macro tree transducers. Acta Informatica 39: 613–698
[32] Engelfriet J, Maneth S (2003) Macro tree translations of linear size increase are MSO definable. SIAM Journal on Computing 32: 950–1006
[33] Engelfriet J, Schmidt EM (1978) IO and OI, Part II. Journal of Computer and System Sciences 16: 67–99
[34] Engelfriet J, Vogler H (1985) Macro tree transducers. Journal of Computer and System Sciences 31: 71–146
[35] Engelfriet J, Vogler H (1988) High level tree transducers and iterated pushdown tree transducers. Acta Informatica 26: 131–192
[36] Ésik Z (1980) Decidability results concerning tree transducers I. Acta Cybernetica 5: 1–20
[37] Fischer MJ (1968) Grammars with Macro-Like Productions. Ph.D. Thesis, Harvard University
[38] Fülöp Z (1981) On attributed tree transducers. Acta Cybernetica 5: 261–279
[39] Fülöp Z, Vogler H (1998) Syntax-Directed Semantics – Formal Models Based on Tree Transducers. Springer-Verlag
[40] Ganzinger H (1983) Increasing modularity and language-independency in automatically generated compilers. Science of Computer Programming 3: 223–278
[41] Ganzinger H, Giegerich R (1984) Attribute coupled grammars. In: Proc. SIGPLAN '84. SIGPLAN Notices 19: 157–170
[42] Garey MR, Johnson DS (1979) Computers and Intractability – A Guide to the Theory of NP-Completeness. W. H. Freeman and Co.
[43] Gécseg F, Steinby M (1984) Tree Automata. Akadémiai Kiadó, Budapest. A re-edition is available at arXiv:1509.06233
[44] Gécseg F, Steinby M (1997) Tree languages. In: Rozenberg G, Salomaa A (eds) Handbook of Formal Languages, Volume 3. Springer-Verlag, Chapter 1
[45] Giegerich R (1988) Composition and evaluation of attribute coupled grammars. Acta Informatica 25: 355–423
[46] Harrison MA (1978) Introduction to Formal Language Theory. Addison-Wesley
[47] Hosoya H (2011) Foundations of XML Processing – The Tree-Automata Approach. Cambridge University Press
[48] Inaba K (2009) Complexity and Expressiveness of Models of XML Transformations. Ph.D. Thesis, The University of Tokyo
Berechnungsstärken von Teilklassen primitiv-rekursiver Programmschemata
[62] In: Modern Applications of Automata Theory. IISc Research Monographs Series 2, World Scientific, pp 325–372
[63] Milo T, Suciu D, Vianu V (2003) Typechecking for XML transformers. Journal of Computer and System Sciences 66: 66–97
[64] Papadimitriou CH (1994) Computational Complexity. Addison-Wesley
[65] Perst T, Seidl H (2004) Macro forest transducers. Information Processing Letters 89: 141–149
[66] Rounds WC (1970) Mappings and grammars on trees. Mathematical Systems Theory 4: 257–287
[67] Rounds WC (1973) Complexity of recognition in intermediate-level languages. In: Proc. 14th Annual Symposium on Switching and Automata Theory, pp 145–158
[68] Ruzzo WL (1980) Tree-size bounded alternation. Journal of Computer and System Sciences 21: 218–235
[69] Schwentick T (2007) Automata for XML – A survey. Journal of Computer and System Sciences 73: 289–315
[70] Slutzki G (1985) Alternating tree automata. Theoretical Computer Science 41: 305–318
[71] Sudborough IH (1978) On the tape complexity of deterministic context-free languages. Journal of the ACM 25: 405–414
[72] Thatcher JW (1970) Generalized² sequential machine maps. Journal of Computer and System Sciences 4: 339–367