Regular resynchronizability of origin transducers is undecidable
Denis Kuperberg
CNRS, LIP, ENS Lyon, [email protected]
Jan Martens
Eindhoven University of Technology, [email protected]
Abstract
We study the relation of containment up to unknown regular resynchronization between two-way non-deterministic transducers. We show that it constitutes a preorder, and that the corresponding equivalence relation is properly intermediate between origin equivalence and classical equivalence. We give a syntactical characterization for containment of two transducers up to resynchronization, and use it to show that this containment relation is undecidable already for one-way non-deterministic transducers, and for simple classes of resynchronizations. This answers the open problem stated in recent works, asking whether this relation is decidable for two-way non-deterministic transducers.
Theory of computation → Transducers
Keywords and phrases transducers, origin, resynchronisation, MSO, one-way, two-way, undecidability
Digital Object Identifier
Related Version https://arxiv.org/abs/2002.07558
© Denis Kuperberg and Jan Martens; licensed under Creative Commons License CC-BY
45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020).
Editors: Javier Esparza and Daniel Král’; Article No. 51; pp. 51:1–51:21
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

The study of transductions, that is functions and relations from words to words, is a fundamental field of theoretical computer science. Many models of transducers have been proposed, and robust notions such as regular transductions emerged [7, 1]. However, many natural problems on transductions are undecidable, for instance equivalence of one-way non-deterministic transducers [9, 10].

In order to circumvent this, and to obtain a better-behaved model, Bojańczyk introduced transducers with origin information [2], where the semantics takes into account not only the input/output pair of words, but also the way the output is produced from the input. It is shown in [2] that translations between different models of transducers usually preserve the origin semantics, that more problems become decidable, such as the equivalence between two transducers, and that the model of transduction with origins is more amenable to an algebraic approach.

The fact that two transducers are origin-equivalent only if they produce their output in exactly the same way can seem too strict, and prompted the idea of resynchronization. The idea, introduced in [8], where the main focus was the sequential uniformization problem, and developed in [5, 4], is to allow a distortion of the origins in a controlled way, in order to recognize that two transducers have a similar behaviour.

It is shown in [5] that containment of 2-way transducers up to a fixed resynchronization is in PSpace, so no more difficult than classical containment of non-deterministic one-way automata. This covers in particular the case where the resynchronization is trivial, in which case the problem boils down to testing strict origin equivalence.

In [4], the resynchronizer synthesis problem was studied. The goal is now to decide whether there exists a resynchronizer $R$ such that containment or equivalence holds up to $R$. Some results are obtained for two notions of resynchronizers. The first notion, introduced in [8], is called rational resynchronizers; it is specialized for 1-way transducers, and uses an interleaving of input and output letters. The second notion is called (bounded) regular resynchronizers; it is the focus of [5] and is defined for two-way transducers.

For rational resynchronizers, a complete picture is obtained in [4]: the synthesis problem is decidable for $k$-valued transducers, but undecidable in general. For regular resynchronizers, it is shown in [4] that the synthesis problem is decidable for unambiguous two-way transducers, i.e. transducers that have at most one accepting run on each input word. The ambiguous case is left open. It was also shown in [4] that for one-way transducers, the notions of rational and regular resynchronizers do not match. The picture for resynchronizability from previous works is summed up in this table, where the first line describes constraints on the input pair of transducers:

                                   unambiguous   functional/finite-valued   general case
Fixed regular resync. (2-way)      PSpace        PSpace-c                   PSpace-c
Unknown rational resync. (1-way)   decidable     decidable                  undecidable
Unknown regular resync. (2-way)    decidable     ?                          ?
In this work, we tackle the general case (last question mark), and show a stronger result: the synthesis of regular resynchronizers is already undecidable for one-way transducers.

To do so, we introduce the notion of limited traversal, which characterizes whether two transducers verify a containment relation up to some unknown resynchronization. Outside of this undecidability proof, this notion can be used to show that some natural transducers, equivalent in the classical sense, cannot be resynchronized. As a by-product, we also obtain that the resynchronizer synthesis problem is undecidable even if we restrict regular resynchronizers to any natural subclass containing the simple “shifting” resynchronizations, allowing origins to change by at most $k$ positions for a fixed bound $k$. Our proof can also be lifted to show a different statement, emphasizing the difference between rational and regular resynchronization: even in presence of regular resynchronization, synthesis of a rational resynchronizer is undecidable.

Notations

If $i, j \in \mathbb{N}$, we denote $[i, j]$ the set $\{i, i+1, \dots, j\}$. We will note $\mathbb{B} := \{0, 1\}$ the set of booleans. If $X$ is a set, we denote $X^* := \bigcup_{i \in \mathbb{N}} X^i$ the set of words on alphabet $X$. The empty word is denoted $\varepsilon$. We will denote $u \sqsubseteq v$ if $u$ is a prefix of $v$. We will denote $\Sigma$ and $\Gamma$ for arbitrary finite alphabets throughout the paper. If $u \in \Sigma^*$, we will denote $|u|$ its length and $dom(u) = \{1, 2, \dots, |u|\}$ its set of positions.

A one-way non-deterministic transducer (1NT) is a tuple $T = \langle Q, \Sigma, \Gamma, \Delta, I, F \rangle$, where $Q$ is a finite set of states, $\Sigma$ is a finite input alphabet, $\Gamma$ is a finite output alphabet, $\Delta \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma^* \times Q$ is the transition relation, $I$ is the set of initial states, and $F$ the set of final states. A transition $(p, a, v, q)$ of $\Delta$ will be denoted as $p \xrightarrow{a|v} q$. A run of $T$ on an input word $u \in \Sigma^*$ is a sequence of transitions $p_0 \xrightarrow{a_1|v_1} p_1 \xrightarrow{a_2|v_2} \dots \xrightarrow{a_n|v_n} p_n$, such that $u = a_1 a_2 \dots a_n$, $p_0 \in I$ and $p_n \in F$. The output of this run is the word $v = v_1 \dots v_n$. The relation computed by $T$ is $\llbracket T \rrbracket = \{(u, v) \mid \text{there exists a run of } T \text{ on } u \text{ with output } v\} \subseteq \Sigma^* \times \Gamma^*$. To avoid unnecessary special cases, we will always assume throughout the paper that the input word $u$ is not empty. Two transducers $T_1, T_2$ are classically equivalent if $\llbracket T_1 \rrbracket = \llbracket T_2 \rrbracket$. It is known from [9] that classical equivalence of 1NTs is undecidable.

In 1NTs, transitions can either leave the reading head on the same input letter, or move it one step to the right. If the possibility of moving to the left is added, we obtain the model of two-way non-deterministic transducer (2NT). The transition relation is now of the form $\Delta \subseteq Q \times (\Sigma \cup \{\vdash, \dashv\}) \times \Gamma^* \times \{left, right\} \times Q$, where the symbol $\vdash$ (resp. $\dashv$) marks the beginning (resp. end) of the input word. When reading this symbol, we forbid the production of a non-empty output, and the only allowed direction for transitions is right (resp. left). The semantics $\llbracket T \rrbracket \subseteq \Sigma^* \times \Gamma^*$ of a 2NT is defined in a natural way: the output of a run $p_0 \xrightarrow{a_1|v_1,d_1} p_1 \xrightarrow{a_2|v_2,d_2} \dots \xrightarrow{a_n|v_n,d_n} p_n$ is $v_1 v_2 \dots v_n$. See [5] for a formal definition. Notice that $\varepsilon$-transitions are not necessary anymore, since a transition $p \xrightarrow{\varepsilon|v} q$ can be simulated by two transitions going right then left (or left then right if the symbol $\dashv$ is reached).

If the transition relation is deterministic, i.e. if for all $(p, a) \in Q \times (\Sigma \cup \{\vdash, \dashv\})$ there exists at most one $(v, d, q) \in \Gamma^* \times \{left, right\} \times Q$ such that $p \xrightarrow{a|v,d} q$ is a transition in $\Delta$, we say that the transducer is a two-way deterministic transducer (2DT).

Notice that the relation defined by a 2DT $T$ is necessarily a (partial) function: for all $u \in \Sigma^*$ there is at most one $v \in \Gamma^*$ such that $(u, v) \in \llbracket T \rrbracket$. The class of functions definable by 2DTs is called regular string-to-string functions. It has equivalent characterizations, such as MSO transductions [7] and streaming transducers [1].
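The definition of a 1NT and of its computed relation can be made concrete with a small simulator. The following is a minimal sketch, not taken from the paper: the function name `run_outputs`, the encoding of transitions as 4-tuples, and the cut-off `max_eps` on consecutive ε-transitions (added only so that the search terminates) are all assumptions made for illustration, as is the example transducer.

```python
def run_outputs(delta, initial, final, word, max_eps=2):
    """Outputs of all accepting runs of a 1NT on `word`.

    delta: iterable of transitions (p, a, v, q); a == "" encodes an epsilon-transition.
    max_eps: illustrative bound on consecutive epsilon-transitions (an assumption,
             not part of the formal model), so that the exploration is finite.
    """
    results = set()
    # Each entry: (state, letters consumed, output so far, consecutive epsilon steps).
    stack = [(q, 0, "", 0) for q in initial]
    while stack:
        state, pos, out, eps = stack.pop()
        if pos == len(word) and state in final:
            results.add(out)
        for (p, a, v, q) in delta:
            if p != state:
                continue
            if a == "" and eps < max_eps:
                stack.append((q, pos, out + v, eps + 1))
            elif a != "" and pos < len(word) and word[pos] == a:
                stack.append((q, pos + 1, out + v, 0))
    return results


# Hypothetical one-state transducer: each input letter a is copied once or twice.
delta = [("p", "a", "a", "p"), ("p", "a", "aa", "p")]
print(sorted(run_outputs(delta, {"p"}, {"p"}, "aa")))
```

On input `aa`, the pairs `(aa, v)` in the computed relation are exactly those where `v` is one of the printed words.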
The origin semantics was introduced in [2] as an enrichment of the classical semantics for string-to-string transductions. The principle is that the contribution of a run of $T$ to the semantics of $T$ is not only the input/output pair $(u, v)$, but an origin graph describing how $v$ is produced from $u$ during this run.

Formally, an origin graph is a triple $(u, v, orig)$ where $u \in \Sigma^*$, $v \in \Gamma^*$, and $orig : dom(v) \to dom(u)$ associates to each position in $v$ a position in $u$: its origin. An origin graph is associated to a run of a transducer $T$ in a natural way, by mapping to each position $y$ in $v$ the position $orig(y)$ of the reading head in $u$ when writing to this position $y$. If an output is produced by an $\varepsilon$-transition after the whole word has been processed in a 1NT, we take the last input letter as origin. The origin semantics $\llbracket T \rrbracket_o$ of $T$ is the set of origin graphs associated with runs of $T$.
▶ Example 1.
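The two following 2DTs $T_{id}$ and $T_{rev}$ are classically equivalent and compute the identity relation $\{(a^n, a^n) \mid n \in \mathbb{N}\}$, but their origin semantics differ, as witnessed by their unique origin graphs for input $a^6$ given below.
[Figure: transition diagrams of $T_{id}$ (which copies each letter while moving right) and $T_{rev}$ (which first moves right to the end marker without output, then outputs each letter while moving left), together with their origin graphs on input aaaaaa and output aaaaaa.]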
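To see concretely how the two origin graphs of Example 1 differ, here is a minimal sketch that builds them as Python data. The function names, the representation of an origin graph as a triple `(u, v, orig)` with `orig` a dictionary, and the 1-based positions are assumptions chosen for illustration; the origin map of `T_rev` is derived from the description of a machine that writes while moving leftwards from the end of the word.

```python
def origin_graph_id(n):
    """Origin graph of the identity transducer on a^n: output position y comes from input position y."""
    u = "a" * n
    return (u, u, {y: y for y in range(1, n + 1)})


def origin_graph_rev(n):
    """Origin graph of the reversing transducer on a^n.

    The head first goes to the right end, then writes while moving left,
    so output position y is written while reading input position n + 1 - y.
    """
    u = "a" * n
    return (u, u, {y: n + 1 - y for y in range(1, n + 1)})


sigma_id = origin_graph_id(6)
sigma_rev = origin_graph_rev(6)
# Same input/output pair, hence classically indistinguishable...
print(sigma_id[:2] == sigma_rev[:2])
# ...but the origin maps differ, so the transducers are not origin equivalent.
print(sigma_id[2] == sigma_rev[2])
```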
Two transducers are said to be origin equivalent if they have the same origin semantics. It is shown in [2] that origin equivalence is decidable for regular transductions, and in [5] that origin equivalence is PSpace-complete for 2NTs. See Appendix A.1 for an example of two one-way transducers both computing the full relation $\Sigma^* \times \Gamma^*$, but not origin equivalent.

While origin semantics gives a satisfying framework to recover decidability of transducer equivalence, it can be argued that this semantics is too rigid, as origin equivalence requires that the output is produced in exactly the same way in both transducers.

In order to relax this constraint, the intermediate notion of resynchronization has been introduced [8, 5]. The idea is to let origins differ in a controlled way, while preserving the input/output pair. Several notions of resynchronizations have been considered [8, 5, 4]; we will focus in this work on MSO resynchronizers, also called regular resynchronizers.
We recall here how Monadic Second-Order logic (MSO) can be used to define languages. This framework will then be used to represent resynchronizers. Formulas of MSO are defined by the following grammar, where $a$ ranges over the alphabet $\Sigma$:

$\varphi, \psi := a(x) \mid x \le y \mid x \in X \mid \exists x.\varphi \mid \exists X.\varphi \mid \varphi \lor \psi \mid \neg \varphi$

Such formulas are evaluated on structures induced by finite words: the universe is the set of positions of the word, $a(x)$ means that position $x$ is labelled by letter $a$, and $x \le y$ means that position $x$ occurs before position $y$. Lowercase notation is used for first-order variables, ranging over positions of the word, and uppercase notation is used for second-order variables, ranging over sets of positions. Other classical operators such as $\land, \Rightarrow, \forall, =, +1, +2, first, last, \dots$ can be defined from this syntax and will be used freely. Let $\top$ be a tautology, defined for instance as $\exists x.a(x) \lor \neg(\exists x.a(x))$.

If $\varphi$ is an MSO formula and $u \in \Sigma^*$, we will note $u \models \varphi$ if $u$ is a model of $\varphi$, with classical MSO semantics. The language $L(\varphi)$ defined by a closed formula $\varphi$ is $\{u \in \Sigma^* \mid u \models \varphi\}$.

If $\varphi$ contains free variables $X_1, \dots, X_n, x_1, \dots, x_k$, we can still define the language of $\varphi$, using an extended alphabet $\Sigma \times \mathbb{B}^{n+k}$. Extra boolean components at each position are used to convey the values of free variables at this position: it is 1 if the value of the second-order variable contains the position (resp. if the value of the first-order variable matches the position) and 0 otherwise. The language of $\varphi$ is in this case a subset of $(\Sigma \times \mathbb{B}^{n+k})^*$, i.e. a set of words on $\Sigma$ enriched with valuations for the free variables. If $I_1, \dots, I_n, i_1, \dots, i_k$ is an instantiation for the free variables of $\varphi$ in a word $u$, we will also write $(u, I_1, \dots, I_n, i_1, \dots, i_k) \models \varphi$ to signify that $u$ with this instantiation of the free variables satisfies $\varphi$.

For instance if ϕ = ∃ x.
( x ∈ X ∧ a ( x )) uses a free second-order variable X , then the word u = ( a, , ( b, , ( a, ∈ (Σ × B ) ∗ is a model of ϕ , denoted also ( aba, { , } ) | = ϕ , but theword ( a, , ( b, , ( a,
0) is not.

A language $L \subseteq (\Sigma \times \mathbb{B}^n)^*$ is regular if and only if there is a formula $\varphi$ of MSO with $n$ free variables recognizing $L$. This is equivalent to $L$ being recognizable by a deterministic finite automaton (DFA) on alphabet $\Sigma \times \mathbb{B}^n$ [6].

The principle behind MSO resynchronizers as defined in [5] is to describe in a regular way, with MSO formulas, how the origins can be redirected. This will induce a relation between sets of origin graphs: containment up to resynchronization.

The MSO formulas will be allowed to use a finite set of parameters: extra information labelling the input word. This is reminiscent of the model of non-deterministic two-way transducers with common guess [3], where the guessing of extra parameters can be done in a consistent way through different visits of the same position in the input word.
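The encoding of a free second-order variable by an extra boolean component can be illustrated on the formula with one free set variable $X$ stating that some position of $X$ carries the letter $a$. The sketch below is an illustration under assumptions: the function names, the list-of-pairs representation of words over $\Sigma \times \mathbb{B}$, the 1-based positions, and the two sample valuations are chosen here and are not necessarily those of the example in the text.

```python
def decode(annotated):
    """Split a word over Sigma x B into the plain word u and the set X of marked positions (1-based)."""
    u = "".join(letter for letter, _ in annotated)
    X = {pos for pos, (_, bit) in enumerate(annotated, start=1) if bit == 1}
    return u, X


def satisfies_phi(annotated):
    """Evaluate phi = exists x. (x in X and a(x)) on an annotated word."""
    u, X = decode(annotated)
    return any(u[x - 1] == "a" for x in X)


# Illustrative valuations (assumptions): X = {1} marks an a, X = {2} marks only a b.
print(satisfies_phi([("a", 1), ("b", 0), ("a", 0)]))
print(satisfies_phi([("a", 0), ("b", 1), ("a", 0)]))
```

The first word is a model of the formula because the marked position 1 is labelled `a`; the second is not, because the only marked position is labelled `b`.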
We now define a subclass of regular resynchronizers from [5, 4]. We will see that for our purpose of resynchronizer synthesis, this subclass is equivalent to the full class of resynchronizers from [5, 4]. Intuitively, the full definition from [5, 4] allows to further restrict the semantics of a resynchronizer, which is not useful if we are just interested in the existence of a resynchronization between two transducers. This is further explained in Section 4.1 and Appendix A.3.

Given an origin graph $\sigma = (u, v, orig)$, an input parameter is a subset of the input positions, encoded by a word on $\mathbb{B}$. Thus, a valuation for $m$ input parameters is given by a tuple $\bar{I} = (I_1, \dots, I_m)$ where for each $i \in [1, m]$, $I_i \in \mathbb{B}^{|u|}$.

The main differences between the following simplified definition and the one from [5, 4] are that we ignore output parameters (an extra labelling of the output word), and also remove extra formulas constraining the behaviour of the resynchronization with respect to both input and output parameters.
▶
Definition 2.
An MSO (or regular) resynchronizer $R$ with $m$ input parameters is an MSO formula $\gamma$ with $m + 2$ free variables $\gamma(\bar{I}, x, y)$, evaluated over the input word $u$.

Intuitively, $\gamma(\bar{I}, x, y)$ indicates that the origin $x$ of an output position can be redirected to a new origin $y$, as made precise in Definition 3. Although $R$ and $\gamma$ are actually the same object here, we will keep the two notations to maintain coherence with [5], using $R$ for the abstract resynchronizer and $\gamma$ for the MSO formula, which is only one of the components of $R$ in [5]. We now describe formally the semantics of an MSO resynchronizer.
▶ Definition 3. [5] An MSO resynchronizer $R$ induces a relation $\llbracket R \rrbracket$ on origin graphs in the following way. If $\sigma = (u, v, orig)$ and $\sigma' = (u', v', orig')$ are two origin graphs, we have $(\sigma, \sigma') \in \llbracket R \rrbracket$ if and only if $u = u'$, $v = v'$, and there exist input parameters $\bar{I} \in (\mathbb{B}^{|u|})^m$, such that for every output position $z \in dom(v)$, we have $(u, \bar{I}, orig(z), orig'(z)) \models \gamma$.

Plain blue arrows will represent the “old” origins in $\sigma$, and red dotted arrows the “new” origins in $\sigma'$.
▶ Example 4. [5] The resynchronizer without parameters $R_{univ}$, using only a tautology formula $\gamma = \top$, is called the universal resynchronizer, and resynchronizes any two origin graphs that share the same input and output.
▶ Example 5. [5] The resynchronizer without parameters $R_{\pm 1}$ shifts all origins by exactly 1 position left or right. This is achieved using a formula $\gamma(x, y) = (x = y + 1) \lor (y = x + 1)$.
▶ Example 6.
The resynchronizer with one parameter defined by $\gamma = (I = \{x\}) \lor (x = y)$ allows at most one input position to be resynchronized to different origins.
[Figure: two pairs of origin graphs on input aaaaaa and output bbbbbb, illustrating Example 5 and Example 6.]
▶
Definition 7. [5] For a resynchronizer $R$ and two transducers $T_1, T_2$ we note $T_1 \subseteq R(T_2)$ if for every origin graph $\sigma_1 \in \llbracket T_1 \rrbracket_o$, there exists $\sigma_2 \in \llbracket T_2 \rrbracket_o$ such that $(\sigma_2, \sigma_1) \in \llbracket R \rrbracket$.

In other words this means that $\llbracket T_1 \rrbracket_o$ is contained in the resynchronization expansion of $\llbracket T_2 \rrbracket_o$. Examples can be found in Appendix A.2.

For a fixed resynchronizer $R$ and a 2NT $T$, it might not be the case that $T \subseteq R(T)$, as witnessed by the resynchronizer $R_{\pm 1}$ from Example 5. Moreover, if $T_1 \subseteq R(T_2)$ and $T_2 \subseteq R(T_3)$ it might not be the case that $T_1 \subseteq R(T_3)$; again this is exemplified by $R_{\pm 1}$. This means that the containment relation up to a fixed resynchronizer $R$ is neither reflexive nor transitive in general.

Note that the universal resynchronizer $R_{univ}$ from Example 4 relates any two graphs that share the same input and output. This causes the containment relation up to $R_{univ}$ to boil down to classical containment, ignoring the origin information. I.e. we have $T_1 \subseteq R_{univ}(T_2)$ if and only if $\llbracket T_1 \rrbracket \subseteq \llbracket T_2 \rrbracket$. This inclusion relation is undecidable, even in the case of one-way non-deterministic transducers [9]. Thus containment up to a fixed resynchronizer is undecidable in general, if no extra constraint is put on resynchronizers. That is why the natural boundedness restriction is introduced on MSO resynchronizers in [5].
▶ Definition 8. [5] (Boundedness) A regular resynchronizer $R$ has bound $k$ if for all inputs $u$, input parameters $\bar{I}$, and target position $y \in dom(u)$, there are at most $k$ distinct positions $x_1, \dots, x_k \in dom(u)$ such that $(u, \bar{I}, x_i, y) \models \gamma$ for all $i \in [1, k]$. A regular resynchronizer is bounded if it has bound $k$ for some $k \in \mathbb{N}$.

All examples of resynchronizations given in this paper (including Appendix) are bounded, except for $R_{univ}$.
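The relation induced by a resynchronizer and the boundedness condition can be experimented with on small instances. The sketch below is a simplification under stated assumptions: it only handles resynchronizers without parameters, represents the formula by a Python predicate `gamma(u, x, y)` rather than by an MSO formula, and measures the bound on a single non-empty word instead of on all inputs. All names are hypothetical.

```python
def related(gamma, sigma, sigma_new):
    """Check that (sigma, sigma_new) is in the relation induced by a parameterless gamma."""
    u, v, orig = sigma
    u_new, v_new, orig_new = sigma_new
    if u != u_new or v != v_new:
        return False
    # Every old origin must be allowed to move to the corresponding new origin.
    return all(gamma(u, orig[z], orig_new[z]) for z in orig)


def bound_on_word(gamma, u):
    """Largest number of sources x allowed for a single target y on the non-empty word u."""
    positions = range(1, len(u) + 1)
    return max(sum(1 for x in positions if gamma(u, x, y)) for y in positions)


def r_univ(u, x, y):
    """Universal resynchronizer of Example 4: any redirection is allowed."""
    return True


def r_pm1(u, x, y):
    """Resynchronizer of Example 5: origins move by exactly one position."""
    return x == y + 1 or y == x + 1


straight = ("aa", "aa", {1: 1, 2: 2})
swapped = ("aa", "aa", {1: 2, 2: 1})
print(related(r_pm1, straight, swapped))   # origins exchanged: allowed
print(related(r_pm1, straight, straight))  # no move at all: not allowed, so no reflexivity
print(bound_on_word(r_pm1, "aaaaa"), bound_on_word(r_univ, "aaaaa"))
```

On a word of length $n$, the universal predicate allows $n$ sources per target, which grows with $n$, whereas the shift by one never allows more than two.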
In Appendix A.2, we give examples of bounded resynchronizations that displace the origin by a distance that is not bounded.

Boundedness is a decidable property of MSO resynchronizers [5, Prop. 15]. As stated in the next theorem, boundedness guarantees that the containment problem up to a fixed resynchronizer becomes decidable. Moreover, for any fixed bounded MSO resynchronizer, the complexity of this problem matches the complexity of containment with respect to strict origin semantics, or more simply the complexity of inclusion of non-deterministic automata.
▶ Theorem 9. [5, Cor. 17] For a fixed bounded MSO resynchronizer $R$ and given two 2NTs $T_1, T_2$, it is decidable in PSpace whether $T_1 \subseteq R(T_2)$.

We will now be interested in the containment up to an unknown bounded resynchronizer. Let us define the relation $\preceq$ on 2NTs by $T_1 \preceq T_2$ if there exists a bounded resynchronizer $R$ such that $T_1 \subseteq R(T_2)$. This relation has been introduced in [4], along with the same notion with respect to rational resynchronizers.

Focusing on bounded regular resynchronizers, the following result is obtained in [4]:
▶ Theorem 10. [4] The relation $\preceq$ is decidable on unambiguous 2NTs.
The problem is left open in [4] for general 2NTs, and this is the purpose of the presentwork. Now that the necessary notions have been presented, we move to our contributions.
Let us start by making explicit a few properties of $\preceq$. First, let us emphasize that our simplified definition of MSO resynchronizer is justified by the fact that this definition yields the same relation $\preceq$ as the one from [5, 4]. This is detailed in Appendix A.3.

This simplified definition allows us to show basic properties of the $\preceq$ relation, see Appendix A.4 for a detailed proof:
▶
Lemma 11.
The relation $\preceq$ is reflexive and transitive.
Since $\preceq$ is a pre-order, it induces an equivalence relation $\sim$ on 2NTs, defined by $\sim \;=\; \preceq \cap \succeq$. Notice that this equivalence relation is intermediate between classical equivalence and origin equivalence, but it is not immediately clear that it does not coincide with classical equivalence. The following claim presents two pairs of transducers (one pair of 2DTs and one pair of 1NTs) equivalent for the classical semantics, but not $\sim$-equivalent.
▷ Claim 12. The 2NTs $T_{id}$ and $T_{rev}$ from Example 1 are not $\sim$-equivalent. The two following 1NTs $T_{one-two}, T_{two-one}$ have the same classical semantics $\{(a^n, a^m) \mid n \le m \le 2n\}$, but are not $\sim$-equivalent.
[Figure: transducer $T_{one-two}$, which loops on $a|a$, takes an $\varepsilon|\varepsilon$ transition, then loops on $a|aa$; transducer $T_{two-one}$, which loops on $a|aa$, takes an $\varepsilon|\varepsilon$ transition, then loops on $a|a$.]

A variant of the pair $T_{one-two}, T_{two-one}$ is presented in [4, Example 5], where it is claimed without proof that no bounded regular resynchronizer exists. A proof of Claim 12 will be obtained as a by-product of Theorem 17 and made explicit in Corollary 19.

The goal of this section is to exhibit a pattern characterizing families of origin graphs that cannot be resynchronized with a bounded MSO resynchronizer.
▶
Definition 13.
Let $\sigma = (u, v, orig)$ and $\sigma' = (u, v, orig')$ be two origin graphs with same input/output words. Given two input positions $x, z \in dom(u)$, we say $x$ traverses $z$ if there exists an output position $t \in dom(v)$ with $orig(t) = x$ and either:
$x \le z$ and $orig'(t) > z$ (left to right traversal);
$x \ge z$ and $orig'(t) < z$ (right to left traversal).

Intuitively, $x$ traverses $z$ if $x$ is resynchronized to some $y \ne z$, and $z$ is between the two positions $x, y$.
[Figure: two pairs of origin graphs on input aaaaa and output aaaaa, with positions $x$, $z$ and $t$ marked; on the left, $x$ traverses $z$ from left to right; on the right, $x$ traverses $z$ from right to left.]

Let $k \in \mathbb{N}$, a pair of origin graphs $(\sigma, \sigma')$ on input/output words $(u, v)$ is said to have $k$-traversal if for every $z \in dom(u)$, there are at most $k$ distinct positions of $dom(u)$ that traverse $z$. A resynchronizer $R$ is said to have $k$-traversal if every pair of origin graphs $(\sigma, \sigma') \in \llbracket R \rrbracket$ has $k$-traversal. A resynchronizer $R$ has limited traversal if there exists $k \in \mathbb{N}$ such that $R$ has $k$-traversal.

For any $k \in \mathbb{N}$ we want to construct a bounded resynchronizer $R_k$ that relates any pair of origin graphs that have $k$-traversal. We will use $2k$ input parameters: $Right_i$ and $Left_i$ for $i \in [0, k-1]$. Each $Right_i$ (resp. $Left_i$) corresponds to a guessed set of input positions that may be redirected to the right (resp. left), but without traversing a position of the same set. For instance it is not possible for a position of $Right_i$ to traverse another position of $Right_i$ from left to right. Similarly, a position of $Left_i$ cannot traverse another position of $Left_i$ from right to left. We do not a priori require any of these sets to be disjoint from each other. We construct $\gamma(x, y) = (x = y) \lor R_{trav} \lor L_{trav}$ to ensure this fact, where

$R_{trav} = \bigvee_{0 \le i \le k-1} \big( x \in Right_i \land x < y \land (\forall z \in [x+1, y].\ z \notin Right_i) \big)$

verifies that positions labelled by the same $Right_i$ do not traverse each other, and $L_{trav}$ does the same for the $Left_i$ labels. This completes the description of the resynchronizer $R_k$, which will be proved correct in Lemmas 14 and 15.
▶ Lemma 14.
The resynchronizer $R_k$ is bounded. Proof.
For each potential target position $y$, if two sources $x$ were labelled with the same input parameter, either one would traverse the other, or one would be at the left of $y$, which would contradict the definition of the formula. This means that if $\gamma(x, y)$ is valid then either $x = y$ or one of the parameters is used to indicate a single $x$ as source. There are only $2k$ parameters so for every input position $y$ there are at most $2k + 1$ distinct positions $x$ such that $\gamma(x, y)$ is valid. ◀
▶ Lemma 15.
If a pair of origin graphs $(\sigma, \sigma')$ has $k$-traversal, then $(\sigma, \sigma') \in \llbracket R_k \rrbracket$. Proof sketch.
We describe an algorithm performing a left to right pass of the input word, and assigning labels $Right_0, Right_1, \dots, Right_{k-1}$ to positions that are resynchronized to the right. We always assign to a position the minimal index currently available, in order to avoid the right traversal of any position by another position with the same label. We then show that under the hypothesis of $k$-traversal, this algorithm succeeds in finding an assignment of labels witnessing $(\sigma, \sigma') \in \llbracket R_k \rrbracket$. The same algorithm is then run in the other direction (right to left), to assign labels $Left_i$. See Appendix A.5 for the full construction. ◀
▶ Lemma 16.
An MSO resynchronizer R has limited traversal if and only if it is bounded. Proof.
Let $m$ be the number of input parameters used in $R$.

($\Rightarrow$) Assume $R$ is not bounded, and let $k \in \mathbb{N}$; we want to build a pair $(\sigma, \sigma') \in \llbracket R \rrbracket$ in which some input position is traversed by $k$ distinct positions. Since $R$ is not bounded, there exists a word $u \in \Sigma^*$, with input parameters $\bar{I}$, a position $y$, and a set $X$ of $2k + 1$ distinct positions such that for all $x \in X$, we have $(u, \bar{I}, x, y) \models \gamma$. Without loss of generality, we can assume that there are $k$ distinct positions $x_1, \dots, x_k$ in $X$ that are strictly to the left of $y$. Let $a \in \Gamma$ be an arbitrary output letter and $v = a^k$. We define the origin graphs $\sigma, \sigma'$ on $(u, v)$ by setting for each $i \in [1, k]$ the origin of the $i$-th letter of $v$ to $x_i$ in $\sigma$ and to $y$ in $\sigma'$. As witnessed by parameters $\bar{I}$, we have $(\sigma, \sigma') \in \llbracket R \rrbracket$. Moreover, the input position $y - 1$ is traversed by $k$ different sources. Since $k$ is arbitrarily chosen, $R$ does not have limited traversal.

($\Leftarrow$) For the other direction, assume $R$ has no limited traversal. Let $\mathcal{A}$ be a deterministic automaton recognizing $\gamma$, on alphabet $\Sigma_{\mathcal{A}} = \Sigma \times \mathbb{B}^{m+2}$, and $Q$ be the state space of $\mathcal{A}$. Let $k \in \mathbb{N}$ be arbitrary. There exists $(\sigma, \sigma') \in \llbracket R \rrbracket$ a pair of origin graphs on words $(u, v)$, and a position $z \in dom(u)$ such that, without loss of generality, $z$ is traversed by $K = k \cdot |Q|$ positions $x_1 < x_2 < \dots < x_K$ from left to right, i.e. $x_K \le z$. Let $\bar{I}$ be the input parameters witnessing $(\sigma, \sigma') \in \llbracket R \rrbracket$. This means that for each $i \in [1, K]$ there exists $y_i > z$ with $(u, \bar{I}, x_i, y_i) \models \gamma$. Let us split the input sequence $U = (u, \bar{I}) \in \Sigma_{\mathcal{A}}^*$ according to position $z$: $U = wr$, where the last letter of $w$ is in position $z$. For each $i \in [1, K]$, let $w_i \in \Sigma_{\mathcal{A}}^*$ be the word $w$ with two extra boolean components: the source is marked by a bit 1 in position $x_i$, and the target is left to be defined. We know that for each $i$ there exists $r_i \in \Sigma_{\mathcal{A}}^*$ extending $r$ with a target position such that $w_i r_i$ is accepted by $\mathcal{A}$. Let $q_i$ be the state reached by $\mathcal{A}$ after reading $w_i$. By choice of $K$, there exists $q \in Q$ such that $q_i = q$ for $k$ distinct values $i_1, \dots, i_k$ of $i$. Since all the words $w_{i_j}$ lead $\mathcal{A}$ to the same state $q$, this means that for each $j \in [1, k]$, we have $w_{i_j} r_{i_1}$ accepted by $\mathcal{A}$, i.e. $(u, \bar{I}, x_{i_j}, y_{i_1}) \models \gamma$. This completes the proof that $R$ is not bounded. ◀
▶
Theorem 17.
Let $T_1, T_2$ be 2NTs. Then $T_1 \preceq T_2$ if and only if there exists $k \in \mathbb{N}$ such that for every $\sigma' \in \llbracket T_1 \rrbracket_o$, there exists $\sigma \in \llbracket T_2 \rrbracket_o$ with same input/output and $(\sigma, \sigma')$ has $k$-traversal. Proof.
Assume such a bound $k$ exists. By Lemma 15, for every $\sigma' \in \llbracket T_1 \rrbracket_o$ there exists $\sigma \in \llbracket T_2 \rrbracket_o$ such that $(\sigma, \sigma') \in \llbracket R_k \rrbracket$. This implies $T_1 \subseteq R_k(T_2)$, and by Lemma 14 this $R_k$ is bounded, thus witnessing $T_1 \preceq T_2$.

Conversely, assume that no such bound $k$ exists, but that there is a bounded resynchronizer $R$ witnessing $T_1 \preceq T_2$. By Lemma 16, $R$ has $k$-traversal for some $k \in \mathbb{N}$. By assumption, there exists $\sigma' \in \llbracket T_1 \rrbracket_o$ such that for all $\sigma \in \llbracket T_2 \rrbracket_o$, $(\sigma, \sigma')$ does not have $k$-traversal. However, there must exist $\sigma$ such that $(\sigma, \sigma') \in \llbracket R \rrbracket$, contradicting the fact that $R$ has $k$-traversal. ◀
▶ Remark 18.
We have shown here that the resynchronizers $R_k$ are universal: if two transducers can be resynchronized, then this is witnessed by a resynchronizer $R_k$. This gives for instance a bound on the logical complexity of the MSO formulas needed in resynchronizers: the formula for $R_k$ is a disjunction of formulas using only one $\forall$ quantifier.

Notice that unlike the existence of a bounded resynchronizer, the notion of limited traversal is directly visible on pairs of origin graphs, and is therefore useful to prove that two transducers cannot be resynchronized. This is exemplified in the following corollary.
▶ Corollary 19.
The transducers from Claim 12 are not ∼ -equivalent. Indeed, in both cases,for a given input/output pair ( u, v ) in the relation, only one pair ( σ, σ ) of origin graphs iscompatible with ( u, v ) , and these pairs of graphs exhibit traversal of arbitrary size. Here are visualizations of the phenomenon. The first picture shows a pair of graphs with5-traversal for T id , T rev , witnessed by the only origin graphs on words ( a , a ). The secondpicture does the same for the two 1NTs T one − two , T two − one , which has 3-traversal on words( a , a ). In both cases, the input position being traversed is circled, and only origin arrowsrelevant to the traversal of this position are represented. aa aa aa aa aa aa aa aa aa aaa T id , T rev a a a a a a a a aaa a a a a a a a a a a a a a a T one − two , T two − one The aim of this section is to prove our main result: (cid:73)
Theorem 20.
Given two 2NTs $T_1, T_2$, it is undecidable whether $T_1 \preceq T_2$.

The result remains true if $T_1, T_2$ are 1NTs, with equivalence instead of containment, and if we restrict to any class of resynchronization that contains the “shift resynchronizations”: for each $k \in \mathbb{N}$, the $k$-shift resynchronization is defined by $\gamma(x, y) = (y \le x \le y + k)$.

We will proceed by reduction from the problem BoundTape, which asks, given a deterministic Turing Machine $M$, whether it uses a bounded amount of its tape on empty input. For completeness, we prove in Appendix A.7 that this problem is undecidable, by a simple reduction from the Halting problem. To perform the reduction from BoundTape to the $\preceq$ relation, we first describe a classical construction used to encode runs of a Turing machine.
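The contrast between shift resynchronizations and pairs of transducers that cannot be resynchronized can be observed by counting traversals directly on pairs of origin maps, following the definition of traversal given earlier. The sketch below is an illustration under assumptions: origin maps are dictionaries from 1-based output positions to 1-based input positions, the function names are hypothetical, and the two families of examples (identity versus reversal, identity versus a leftward shift by at most $k$) are chosen here for illustration.

```python
def traversing_positions(orig, orig_new, z):
    """Input positions x that traverse z when origins move from orig to orig_new."""
    result = set()
    for t, x in orig.items():
        y = orig_new[t]
        # Left to right traversal, or right to left traversal.
        if (x <= z and y > z) or (x >= z and y < z):
            result.add(x)
    return result


def traversal(n_input, orig, orig_new):
    """Smallest k such that the pair has k-traversal on an input of length n_input."""
    return max(len(traversing_positions(orig, orig_new, z)) for z in range(1, n_input + 1))


def identity(n):
    return {t: t for t in range(1, n + 1)}


def reversal(n):
    return {t: n + 1 - t for t in range(1, n + 1)}


def shift_left(n, k):
    """New origin y = max(1, x - k), which satisfies y <= x <= y + k."""
    return {t: max(1, t - k) for t in range(1, n + 1)}


for n in (5, 10, 20):
    # Identity vs reversal: grows with n. Identity vs 2-shift: stays constant.
    print(n, traversal(n, identity(n), reversal(n)), traversal(n, identity(n), shift_left(n, 2)))
```

In this experiment the traversal of the identity/reversal pair grows with the input length, in line with the characterization by limited traversal, while the shifted pair keeps a traversal that does not depend on the input length.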
Let M be a deterministic Turing Machine with alphabet A , states Q , and transition table δ : Q × A → Q × A × { left , right } . Let q (resp. q f ) be the initial (resp. final) state of M ,and B be the special blank symbol from the alphabet A , initially filling the tape.Let / ∈ A ∪ Q be a new separation symbol, and Γ = A ∪ Q ∪ { } .We sketch here a classical idea of using domino tiles to simulate the run of a TuringMachine, for instance to prove undecidability of the Post Correspondence Problem [11, 12].See Appendix A.6 for the detailed construction of the set of tiles.We encode successive configurations of M by words on Γ ∗ . The full run, or computationhistory of M is encoded by a finite or infinite word Hist M ∈ Γ ∗ ∪ Γ ω . We use a set of tiles D M = { ( u i , v i ) ∈ (Γ ∗ ) | i ∈ Σ } , where Σ is a finite alphabet of tile indexes. These tiles aredesigned to simulate the run of M in the following sense (recall that v stands for prefix): (cid:73) Lemma 21.
Let λ = i . . . i k ∈ Σ ∗ be a sequence of tile indexes. Let u λ = u i . . . u i k , and v λ = q v i . . . v i k . If λ is such that u λ v v λ , then we have v λ v Hist M . We give here an example of how a run of M is encoded, and how it is reflected on tiles: (cid:73) Example 22.
Consider the run of $M$ encoded by $q_0 \#\, q_0 B \#\, a q_1 \#\, a q_1 B \#\, q_2 a b \# \in \Gamma^*$, where $\#$ is the separation symbol. This is reflected by the following sequence of tiles:

$$\begin{array}{r|c|c|c|c|c|c|c|c}
\lambda : & & i_1 & i_2 & i_3 & i_4 & i_5 & i_6 & i_7 \\
u_\lambda : & & q_0 \# & q_0 B & \# & a & q_1 \# & a q_1 B & \# \\
v_\lambda : & q_0 \# & q_0 B \# & a q_1 & \# & a & q_1 B \# & q_2 a b & \#
\end{array}$$

We now build two 1NTs $T_{up}$ and $T_{down}$, based on the tiles of $D_M$. The input alphabet of these transducers is the set $\Sigma$ of indexes of tiles of $D_M$. The output alphabet is $\Gamma$. Roughly, on input $i$, $T_{up}$ outputs $u_i$ and $T_{down}$ outputs $v_i$. Additionally, $T_{up}$ is allowed to non-deterministically start outputting a word that is not a prefix of $u_i$, and from there output anything in $\Gamma^*$. The transducer $T_{up}$ is also allowed to output anything after the end of the input. The transducer $T_{down}$ starts by outputting $q_0 \#$, then on input $\lambda \in \Sigma^*$ it outputs $v_\lambda$.

The transducers $T_{up}, T_{down}$ are pictured here, with $W_i = \{u \in \Gamma^* : |u| \le |u_i|,\ u \not\sqsubseteq u_i\}$:

[Figure: the transducer $T_{up}$ (including the state $p_{fail}$), with transition labels $i \mid u_i$, $i \mid W_i$, $i \mid \varepsilon$, $\varepsilon \mid \Gamma$, $\varepsilon \mid \varepsilon$; and the transducer $T_{down}$, with transition labels $\varepsilon \mid q_0 \#$ and $i \mid v_i$.]

The main idea of this construction is that if $\lambda = i_1 \dots i_k \in \Sigma^*$ is such that $u_\lambda \sqsubseteq v_\lambda$ follow $\mathit{Hist}_M$ as in Example 22, then on input $\lambda$, $T_{down}$ outputs $v_\lambda$, the only matching computation of $T_{up}$ starts by outputting $u_\lambda$, and the bound on traversal will (roughly) match the size of the tape used by $M$ in this prefix of the computation. Indeed, if $T_{up}$ and $T_{down}$ output the encoding of the same configuration of size $K$ on disjoint inputs, it witnesses a traversal of size roughly $K$ ("roughly" because tiles allow up to three output letters on one input letter). The extra part of $T_{up}$ is used to guarantee that $[\![T_{down}]\!] \subseteq [\![T_{up}]\!]$ holds, even in cases when the input $\lambda$ does not correspond to a prefix of the computation of $M$.

▶ Example 23.
Let $\lambda = i_1 i_2 \dots i_7$ be the sequence of tile indexes from Example 22. We show here a 2-traversal exhibited by $T_{up}, T_{down}$ on input $\lambda$. The traversed input position is circled, and only arrows relevant to the traversal of this position are represented.

[Figure: origin graphs of $T_{up}$ and $T_{down}$ on input $i_1 \dots i_7$, with output $q_0 \#\, q_0 B \#\, a q_1 \#\, a q_1 B \#\, q_2 a b \#$.]

▶ Theorem 24.
We have $T_{down} \preceq T_{up}$ if and only if $M \in$ BoundTape.
Proof.
First, assume $M \in$ BoundTape, and let $K$ be the bound on the tape size used by $M$. Let $R$ be the resynchronization that shifts by at most $K + 2$ positions to the left, via $\gamma(x, y) = (y \le x) \wedge (x \le y + K + 2)$. We claim that $T_{down} \subseteq R(T_{up})$. It is clear that $R$ is bounded. Let $\sigma \in [\![T_{down}]\!]_o$ be an origin graph $(\lambda, v, \mathit{orig})$. Notice that by definition of $T_{down}$, we have $v = v_\lambda = q_0 \#\, v_{i_1} \dots v_{i_n}$ on input $\lambda = i_1 \dots i_n$. We now distinguish two cases:

If $u_\lambda \sqsubseteq v_\lambda$, then by Lemma 21, we have $v_\lambda \sqsubseteq \mathit{Hist}_M$. The transducer $T_{up}$ is able to output $v_\lambda$ without going through the state $p_{fail}$, with a shift of one configuration as seen in Example 23. It only needs to pad $u_\lambda$ with the last configuration in state $p$. Let $\sigma'$ be the origin graph for this run. Since the encoding of a configuration has size at most $K + 2$, we have $(\sigma', \sigma) \in [\![R]\!]$.

If $u_\lambda \not\sqsubseteq v_\lambda$, let $\lambda' \sqsubseteq \lambda$ be the longest prefix such that $u_{\lambda'} \sqsubseteq v_{\lambda'}$. Now in order to output $v_\lambda$, the transducer $T_{up}$ has to output $u_{\lambda'}$ in $p$ when processing $\lambda'$. After processing $\lambda'$, the transducer $T_{up}$ is forced to move to state $p_{fail}$ in order to match the output of $T_{down}$. From this state $T_{up}$ is allowed to output anything from any position, so in particular there exists a run where the remaining output of $v_{\lambda'}$ is produced immediately, then $T_{up}$ synchronizes with $T_{down}$ during the next configuration encoding, and finally the rest of the desired output $v_\lambda$ is produced on the same input positions as in $T_{down}$. As before, the shift when processing $\lambda'$ is at most $K + 2$, and therefore this run induces an origin graph $\sigma'$ with $(\sigma', \sigma) \in [\![R]\!]$.

We now assume $M \notin$ BoundTape. We want to use Theorem 17 to conclude that $T_{down} \not\preceq T_{up}$. Let $k \in \mathbb{N}$, and $\lambda \in \Sigma^*$ such that $u_\lambda \sqsubseteq v_\lambda$ and $u_\lambda$ is a prefix of $\mathit{Hist}_M$ witnessing a configuration of size $k + 2$. Let $\sigma$ be the only origin graph of $T_{down}$ on input $\lambda$, with output $v_\lambda$. There is only one way for $T_{up}$ to output $v_\lambda$ on input $\lambda$: it is by using a run avoiding $p_{fail}$. Let $\sigma' \in [\![T_{up}]\!]_o$ be the corresponding origin graph. Since $T_{up}$ is one configuration behind, and since a configuration of size $k + 2$ is produced by at least $k$ inputs, the pair $(\sigma', \sigma)$ has a position traversed $k$ times. This is true for arbitrary $k$, so by Theorem 17, we can conclude that $T_{down} \not\preceq T_{up}$. ◀

Since BoundTape is undecidable, this completes the proof of Theorem 20.

Notice that in the case where $M \in$ BoundTape, the resynchronization does not need parameters, and can be restricted to some simple classes of resynchronizations. This is stated in the following corollary:

▶ Corollary 25.
Given $T_1, T_2$ two 1NTs, it is undecidable whether $T_1 \preceq T_2$. This result still holds when considering any restricted class of resynchronizers that contains the $k$-shift resynchronizers.

We can also strengthen the above proof to show undecidability of equivalence up to some unknown resynchronization:

▶ Theorem 26.
Given $T_1, T_2$ two 1NTs, it is undecidable whether $T_1 \sim T_2$.

Proof.
It suffices to take $T'_{down} = T_{down} \cup T_{up}$ in the above proof. This way we clearly have $T_{up} \preceq T'_{down}$ (every origin graph of $T_{up}$ is also an origin graph of $T'_{down}$), and the other direction $T'_{down} \preceq T_{up}$ is equivalent to $T_{down} \preceq T_{up}$, so it reduces to BoundTape as well. ◀
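The shift resynchronizations used in this section only constrain, for each output position, the pair formed by its old origin and its new origin. As a minimal sketch (the function names `shift_gamma` and `related_by_shift`, and the encoding of origin functions as Python lists, are illustrative assumptions and not notation from the paper), the condition $\gamma(x, y) = (y \le x \le y + k)$ can be checked as follows:

```python
def shift_gamma(x: int, y: int, k: int) -> bool:
    """k-shift formula: origin x may be redirected to y iff y <= x <= y + k."""
    return y <= x <= y + k


def related_by_shift(orig, new_orig, k: int) -> bool:
    """Check the k-shift formula on every output position.

    orig[t] and new_orig[t] are the input positions (integers) that output
    position t points to before and after resynchronization.
    """
    if len(orig) != len(new_orig):
        return False  # both origin graphs must share the same output word
    return all(shift_gamma(x, y, k) for x, y in zip(orig, new_orig))
```

In the proof of Theorem 24, the relevant value is $k = K + 2$, where $K$ bounds the tape used by $M$.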
Finally, let us mention that this proof allows us to recover and strengthen undecidability results on rational transducers from [4]. We recall the definition of rational transducers in Appendix A.8.

Since the shift resynchronizations are rational, and since any rational resynchronization is in particular bounded regular [4, Theorem 3], our reduction can be used as an alternative proof of undecidability of rational resynchronization synthesis, shown in [4] via one-counter automata. This means we directly obtain this corollary:

▶ Corollary 27.
Given two 1NTs $T_1, T_2$ such that $[\![T_1]\!] \subseteq [\![T_2]\!]$, it is undecidable whether there exists a rational resynchronizer $R_{rat}$ such that $T_1 \subseteq R_{rat}(T_2)$.

We can further strengthen the result via the following theorem:

▶ Theorem 28.
Given two 1NTs $T_1, T_2$ and a regular resynchronizer $R_{reg}$ such that $T_1 \subseteq R_{reg}(T_2)$, it is undecidable whether there exists a rational resynchronizer $R_{rat}$ such that $T_1 \subseteq R_{rat}(T_2)$.

Due to space constraints, the proof is presented in Appendix A.8.
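The tile set $D_M$ underlying the reduction of this section (detailed in Appendix A.6) can be generated mechanically from the transition table. The sketch below is only an illustration under stated assumptions: the separation symbol is written `#`, the left expansion tile is taken to be (`#pa`, `#qBb`), and the names `build_tiles`, `words` and `is_prefix` are hypothetical helpers. Words over $\Gamma$ are represented as tuples of symbols so that state names such as `q0` can be longer than one character.

```python
SEP = "#"    # assumed glyph for the separation symbol
BLANK = "B"  # blank tape symbol


def build_tiles(alphabet, states, delta):
    """Return the list of tiles (u_i, v_i); each word is a tuple of symbols."""
    tiles = []
    for a in list(alphabet) + [SEP]:           # copy tiles (a, a)
        tiles.append(((a,), (a,)))
    for q in states:                           # right expansion tiles (q#, qB#)
        tiles.append(((q, SEP), (q, BLANK, SEP)))
    for (p, a), (q, b, move) in delta.items():
        if move == "right":                    # right tile (pa, bq)
            tiles.append(((p, a), (b, q)))
        else:
            for c in alphabet:                 # left tiles (cpa, qcb)
                tiles.append(((c, p, a), (q, c, b)))
            # left expansion tile; its exact form is an assumption
            tiles.append(((SEP, p, a), (SEP, q, BLANK, b)))
    return tiles


def words(tiles, lam, start_state):
    """Compute (u_lambda, v_lambda) for a sequence lam of tile indexes."""
    u, v = (), (start_state, SEP)              # v begins with the start tile output
    for i in lam:
        u, v = u + tiles[i][0], v + tiles[i][1]
    return u, v


def is_prefix(u, v):
    """True iff the tuple u is a prefix of the tuple v."""
    return v[:len(u)] == u
```

For a hypothetical machine with $\delta(q_0, B) = (q_1, a, \mathit{right})$ and $\delta(q_1, B) = (q_2, b, \mathit{left})$, which is consistent with the run of Example 22, the seven tiles of that example yield words $u_\lambda$ and $v_\lambda$ with `is_prefix(u, v)` true.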
In this work we investigated the containment relation on transducers up to unknown regular resynchronization. We showed that this relation forms a preorder, strictly between classical containment and containment with respect to origin semantics. We introduced a syntactical condition called limited traversal, characterizing resynchronizable transducer pairs. Using this tool we proved that resynchronizer synthesis is undecidable already in the case of 1NTs, while the problem was left open for 2NTs in [4].

We leave open the decidability of the resynchronizability relation on functional transducers. Since our construction relies heavily on non-functionality, it seems a different approach is needed.
References

[1] Rajeev Alur and Pavol Černý. Expressiveness of streaming string transducers. In Kamal Lodaya and Meena Mahajan, editors, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2010), volume 8 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1–12, Dagstuhl, Germany, 2010. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[2] Mikołaj Bojańczyk. Transducers with origin information. In International Colloquium on Automata, Languages, and Programming, pages 26–37. Springer, 2014.
[3] Mikołaj Bojańczyk, Laure Daviaud, Bruno Guillon, and Vincent Penelle. Which classes of origin graphs are generated by transducers. Pages 114:1–114:13, 2017.
[4] Sougata Bose, Shankara Narayanan Krishna, Anca Muscholl, Vincent Penelle, and Gabriele Puppis. On Synthesis of Resynchronizers for Transducers. Peter Rossmanith, Pinar Heggernes, and Joost-Pieter Katoen, editors. Volume 138 of Leibniz International Proceedings in Informatics (LIPIcs), pages 69:1–69:14, Dagstuhl, Germany, 2019. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[5] Sougata Bose, Anca Muscholl, Vincent Penelle, and Gabriele Puppis. Origin-equivalence of two-way word transducers is in PSPACE. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018.
[6] J. Richard Büchi. Weak second-order arithmetic and finite automata. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 6:66–92, 1960.
[7] Joost Engelfriet and Hendrik Jan Hoogeboom. MSO definable string transductions and two-way finite-state transducers. ACM Transactions on Computational Logic (TOCL), 2(2):216–254, 2001.
[8] Emmanuel Filiot, Ismaël Jecker, Christof Löding, and Sarah Winter. On equivalence and uniformisation problems for finite transducers. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016.
[9] T. V. Griffiths. The unsolvability of the equivalence problem for Λ-free nondeterministic generalized machines. J. ACM, 15(3):409–413, July 1968.
[10] Oscar H. Ibarra. The unsolvability of the equivalence problem for ε-free NGSM's with unary input (output) alphabet and applications. SIAM Journal on Computing, 7(4):524–532, 1978.
[11] Emil L. Post. A variant of a recursively unsolvable problem. Bull. Amer. Math. Soc., 52(4):264–268, 1946.
[12] Michael Sipser. Introduction to the Theory of Computation. International Thomson Publishing, 1st edition, 1996.
A Appendix

A.1 Examples of transducers

▶ Example 29.
Two equivalent transducers computing the full relation $\Sigma^* \times \Gamma^*$. Notice that $\varepsilon$-transitions are necessary to compute this relation.

[Figure: two transducers; the first has transition labels $\Sigma \mid \varepsilon$, $\varepsilon \mid \varepsilon$, $\varepsilon \mid \Gamma$, the second has transition labels $\varepsilon \mid \Gamma$, $\varepsilon \mid \varepsilon$, $\Sigma \mid \varepsilon$.]

▶ Example 30.
Consider the two transducers from Example 29 with $\Sigma = \{a, b\}$ and $\Gamma = \{c, d\}$. Although they are equivalent in the classical sense as they compute the full relation $\Sigma^* \times \Gamma^*$, their origin semantics is different, as witnessed by the following examples of origin graphs on input $u = abbaba$ and output $v = cdddcc$.

[Figure: the two transducers instantiated with $\Sigma = \{a, b\}$ and $\Gamma = \{c, d\}$, each with one origin graph on input $abbaba$ and output $cdddcc$.]
A.2 Examples of resynchronizers

▶ Example 31.
The resynchronizer without parameters $R_{block}$ behaves as follows: if the origin is the first letter of an $a$-block, then it is moved to the last letter of this $a$-block. If the origin is a $b$ then it does not change.

$$\gamma(x, y) = \big(x \le y \wedge (\forall z \in [x, y].\ a(z)) \wedge \neg a(x - 1) \wedge \neg a(y + 1)\big) \vee \big(b(x) \wedge x = y\big)$$

[Figure: an origin graph on input $aaabaab$ and output $cdcd$, before and after resynchronization.]

Here is an example of behaviour of the same resynchronizer, applied to a two-way transducer $T_{\rightarrow\leftarrow}$ doing two passes of the input word, one left-to-right and one right-to-left, and outputting a new letter at each alternation of input letters $a$ and $b$.

[Figure: an origin graph of $T_{\rightarrow\leftarrow}$ on input $aaabaab$ and output $cdcdcdc$, before and after resynchronization.]
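The formula $\gamma$ of $R_{block}$ from Example 31 can be evaluated directly on a concrete input word. The following is a minimal sketch, assuming 1-indexed positions and that $a(z)$ is false for positions outside the word; the function name `gamma_block` is an illustrative choice.

```python
def gamma_block(u: str, x: int, y: int) -> bool:
    """Evaluate the formula gamma of R_block on input word u (1-indexed)."""
    def is_a(z: int) -> bool:
        # positions outside the word carry no letter, so a(z) is false there
        return 1 <= z <= len(u) and u[z - 1] == "a"

    if not (1 <= x <= len(u) and 1 <= y <= len(u)):
        return False
    # first disjunct: [x, y] is a maximal block of letters a
    whole_block = (
        x <= y
        and all(is_a(z) for z in range(x, y + 1))
        and not is_a(x - 1)
        and not is_a(y + 1)
    )
    # second disjunct: an origin on a letter b stays in place
    fixed_b = u[x - 1] == "b" and x == y
    return whole_block or fixed_b
```

On the input $aaabaab$, the pairs $(1, 3)$, $(4, 4)$, $(5, 6)$ and $(7, 7)$ satisfy the formula, whereas $(1, 2)$ does not, since position 2 is not the last letter of its $a$-block.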
▶ Example 32. [5]
We give the example of $R_{1\text{-to-last}} = (\top, \top, \gamma, \top)$: a resynchronizer without parameters, with $\gamma(x, y) = (x = \mathit{first}) \wedge (y = \mathit{last})$, allowing only the resynchronization of origins from the first input position to the last one, and no other origins in the new origin graph.

[Figure: an origin graph on input $abbaba$ and output $cdddcc$, before and after resynchronization.]

▶ Example 33.
Let $T_{first}, T_{last}$ be the two transducers from Example 30, and $R_{1\text{-to-last}}$ the MSO resynchronizer from Example 32. Then we have $T_{last} \subseteq R_{1\text{-to-last}}(T_{first})$.

▶ Example 34.
Let us give an example of two transducers $T_{fast}, T_{slow}$ with $[\![T_{fast}]\!] = [\![T_{slow}]\!] = \{(a^n, a^m) \mid n, m \in \mathbb{N}\}$, and $T_{slow} \preceq T_{fast}$ but $T_{fast} \not\preceq T_{slow}$.

[Figure: the transducer $T_{fast}$, with transition labels $\varepsilon \mid a$, $\varepsilon \mid \varepsilon$, $a \mid \varepsilon$; and the transducer $T_{slow}$, with transition labels $a \mid a$, $a \mid \varepsilon$, $a \mid \varepsilon$, $\varepsilon \mid a$, $\varepsilon \mid a$.]

Indeed, we have $T_{slow} \subseteq R(T_{fast})$ where $R$ uses only $\gamma(x, y) = (x = \mathit{first})$, which is bounded. However, if we had $T_{fast} \subseteq R'(T_{slow})$, then $R'$ would need to redirect arbitrarily many positions to the first one, and therefore it could not be bounded.

A.3 The original definition of resynchronizers
We now give the original definition of MSO resynchronizers from [5, 4], which we call here extended MSO resynchronizers, to emphasize the difference with our simplified version. In addition to input parameters, extended MSO resynchronizers are also allowed to guess output parameters, labelling the output word.

Given an origin graph $\sigma = (u, v, \mathit{orig})$, an output parameter is a subset of the output positions, encoded by a word over $\mathbb{B}$. Thus, a valuation for $n$ output parameters is given by $\bar{O} = (O_1, \dots, O_n) \in (\mathbb{B}^{|v|})^n$. Given an output alphabet $\Gamma$ and a number $n$ of output parameters, we define the set of output-types as $\Gamma \times \mathbb{B}^n$. The role of an output-type is to describe a possible labelling of an output position, including the value of output parameters. More precisely, given $v \in \Gamma^*$, $\bar{O} = (O_1, \dots, O_n) \in (\mathbb{B}^{|v|})^n$ and $x \in \mathit{dom}(v)$, we call output-type of $x$ the element $\tau = (a, b_1, \dots, b_n) \in \Gamma \times \mathbb{B}^n$ obtained by projecting each coordinate of $(v, O_1, \dots, O_n)$ onto its $x$-th position. Notice that in the absence of output parameters, an output-type is simply a letter from $\Gamma$.

We can now give the definition of extended MSO resynchronizers:

▶ Definition 35. [5] An MSO resynchronizer $R$ with $m$ input parameters and $n$ output parameters is a tuple $(\alpha, \beta, \gamma, \delta)$, where
- $\alpha(\bar{I})$ is an MSO formula over the input word with input parameters $\bar{I} = (I_1, \dots, I_m)$.
- $\beta(\bar{O})$ is an MSO formula over the output word with output parameters $\bar{O} = (O_1, \dots, O_n)$.
- For every output-type $\tau \in \Gamma \times \mathbb{B}^n$, $\gamma(\tau)$ is an MSO formula with $m + 2$ free variables: $\gamma(\tau)(\bar{I}, x, y)$ over the input word $u$, which indicates that the origin $x$ of an output position of type $\tau$ can be redirected to a new origin $y$.
- For every pair of output-types $\tau, \tau'$, $\delta(\tau, \tau')$ is an MSO formula with $m + 2$ free variables: $\delta(\tau, \tau')(\bar{I}, z, z')$ over the input word $u$ is required to hold if $z, z'$ are the new origins of two consecutive output positions $x, x'$ with type $\tau, \tau'$ respectively.

We now describe formally the semantics of an extended MSO resynchronizer.

▶ Definition 36. [5] An MSO resynchronizer $R = (\alpha, \beta, \gamma, \delta)$ induces a relation $[\![R]\!]$ on origin graphs in the following way. If $\sigma = (u, v, \mathit{orig})$ and $\sigma' = (u', v', \mathit{orig}')$ are two origin graphs, we have $(\sigma, \sigma') \in [\![R]\!]$ if and only if $u = u'$, $v = v'$, and there exist input parameters $\bar{I} \in (\mathbb{B}^{|u|})^m$ and output parameters $\bar{O} \in (\mathbb{B}^{|v|})^n$, such that the following requirements hold:
- $(u, \bar{I}) \models \alpha$,
- $(v, \bar{O}) \models \beta$,
- for every output position $x \in \mathit{dom}(v)$ of type $\tau$, we have $(u, \bar{I}, \mathit{orig}(x), \mathit{orig}'(x)) \models \gamma(\tau)$,
- for all consecutive output positions $x, x' \in \mathit{dom}(v)$ of type $\tau, \tau'$ respectively, we have $(u, \bar{I}, \mathit{orig}'(x), \mathit{orig}'(x')) \models \delta(\tau, \tau')$.

For examples making use of all components, see [5].

We also recall the definition of boundedness for extended MSO resynchronizers:

▶ Definition 37. [5] (Boundedness) A regular resynchronizer $R$ has bound $k$ if for all inputs $u$, input parameters $\bar{I}$, output-types $\tau \in \Gamma \times \mathbb{B}^n$, and target position $y \in \mathit{dom}(u)$, there are at most $k$ distinct positions $x_1, \dots, x_k \in \mathit{dom}(u)$ such that $(u, \bar{I}, x_i, y) \models \gamma(\tau)$ for all $i \in [1, k]$. A regular resynchronizer is bounded if it is bounded by $k$ for some $k \in \mathbb{N}$.

Moving to simplified MSO resynchronizers in the present work is justified by the following lemma:

▶ Lemma 38. If $R = (\alpha, \beta, \gamma, \delta)$ is a bounded extended MSO resynchronizer, then there exists a simplified MSO resynchronizer $R'$ that is also bounded, such that $[\![R]\!] \subseteq [\![R']\!]$. So if for two transducers $T_1$ and $T_2$ the relation $T_1 \preceq T_2$ holds, as witnessed by a bounded extended resynchronizer, then it is also witnessed by a bounded simplified resynchronizer.

Proof.
Let $m$ be the number of input parameters of $R$, and $\Theta$ its set of output-types. The simplified resynchronizer $R'$ will use $m$ input parameters as well, and is defined by the formula $\gamma' = \bigvee_{\tau \in \Theta} \gamma(\tau)$. Let $k \in \mathbb{N}$ be such that $R$ is bounded by $k$. Let $K = k \cdot |\Theta|$; we show that $R'$ is bounded by $K$. Indeed, assume there are an input word $u$ labelled with input parameters $\bar{I}$, $K + 1$ distinct positions $x_1, \dots, x_{K+1}$, and a position $y$, such that $(u, \bar{I}, x_i, y) \models \gamma'$ for all $i \in [1, K + 1]$. Then by the pigeonhole principle, there exists $\tau$ such that $(u, \bar{I}, x_i, y) \models \gamma(\tau)$ is true for $k + 1$ distinct values of $i$. This contradicts the fact that $R$ is bounded by $k$.

Finally, the fact that $[\![R]\!] \subseteq [\![R']\!]$ is straightforward from the definition of $R'$: the presence of output parameters forcing $\gamma$ to use one of its disjuncts, and the addition of constraints $\alpha, \beta, \delta$, only restrict the semantics of a resynchronizer. Any pair of origin graphs $(\sigma, \sigma')$ accepted by $R$ is accepted by $R'$ as well, using the same input parameters as witness. This means that if an extended resynchronizer $R = (\alpha, \beta, \gamma, \delta)$ witnesses $T_1 \preceq T_2$, then $R'$ as defined here witnesses it as well. ◀

Therefore, as far as the relation $\preceq$ is concerned, we can assume that all bounded resynchronizers are in simplified form, and we do so throughout the paper.
A.4 Proof of Lemma 11
We want to show that $\preceq$ is reflexive and transitive.

Let $T$ be a 2NT. We have $T \preceq T$, witnessed by the MSO resynchronizer $\gamma(x, y) = (x = y)$. This resynchronizer preserves the strict origin semantics, and is bounded by 1. This shows reflexivity of $\preceq$.

Let $T_1, T_2, T_3$ be 2NTs such that $T_1 \preceq T_2 \preceq T_3$. This means there exist bounded resynchronizers $R_1, R_2$ such that $T_1 \subseteq R_1(T_2)$ and $T_2 \subseteq R_2(T_3)$. Let $m_1, \gamma_1$ (resp. $m_2, \gamma_2$) be the number of input parameters and the MSO formula of $R_1$ (resp. $R_2$). We define a resynchronizer $R$ with $m = m_1 + m_2$ input parameters, by
$$\gamma(\bar{I}, x_3, x_1) = \exists x_2.\ \gamma_2(\bar{I}_2, x_3, x_2) \wedge \gamma_1(\bar{I}_1, x_2, x_1),$$
where $\bar{I}_1$ (resp. $\bar{I}_2$) is obtained from $\bar{I}$ by restriction to the first $m_1$ (resp. last $m_2$) components. The formula $\gamma$ guesses a valid position $x_2$ for the position of the origin according to $T_2$, and uses it to redirect the origin from $x_3$ to $x_1$ directly. The resynchronizer $R$ is bounded: if $R_1$ and $R_2$ have bounds $k_1$ and $k_2$, then for fixed parameters and a fixed target $x_1$ there are at most $k_1$ candidates for $x_2$, each admitting at most $k_2$ sources $x_3$, so $R$ has bound $k_1 k_2$.

It remains to verify that $R$ is a witness that $T_1 \preceq T_3$, i.e. that $T_1 \subseteq R(T_3)$. Let $\sigma_1 = (u, v, \mathit{orig}_1) \in [\![T_1]\!]_o$. We know from $T_1 \subseteq R_1(T_2)$ that there exists $\sigma_2 = (u, v, \mathit{orig}_2) \in [\![T_2]\!]_o$ such that $(\sigma_2, \sigma_1) \in [\![\gamma_1]\!]$, witnessed by parameters $\bar{I}_1$. From $T_2 \subseteq R_2(T_3)$, there exists $\sigma_3 = (u, v, \mathit{orig}_3) \in [\![T_3]\!]_o$ such that $(\sigma_3, \sigma_2) \in [\![\gamma_2]\!]$, witnessed by parameters $\bar{I}_2$. Let us show that $(\sigma_3, \sigma_1) \in [\![R]\!]$. Let $\bar{I}$ be the concatenation $\bar{I}_1 \cdot \bar{I}_2$. Let $x \in \mathit{dom}(v)$ be an output position. We need to show that $(u, \bar{I}, \mathit{orig}_3(x), \mathit{orig}_1(x)) \models \gamma$. For $i \in \{1, 2, 3\}$ let $x_i = \mathit{orig}_i(x)$. We have $(u, \bar{I}_1, x_2, x_1) \models \gamma_1$ and $(u, \bar{I}_2, x_3, x_2) \models \gamma_2$, therefore, by definition of $\gamma$, we have $(u, \bar{I}, x_3, x_1) \models \gamma$. This concludes the proof of $T_1 \subseteq R(T_3)$.

A.5 Proof of Lemma 15
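The proof of Lemma 15 builds the input parameters $\mathit{Right}_i$ by a greedy left-to-right pass. As a rough guide to that procedure, here is a minimal sketch under two labelled assumptions: origin functions are given as Python lists, and "a position $s$ traverses $z$" is read, for rightward redirections, as $s \le z < y$ where $y$ is the rightmost redirection target of $s$. The function name `assign_right_labels` is hypothetical.

```python
def assign_right_labels(orig, new_orig, k):
    """Greedy assignment of the parameters Right_0, ..., Right_{k-1}.

    orig[t] / new_orig[t]: origin of output position t before / after
    resynchronization. Returns a dict mapping each index i to the list of
    input positions labelled by Right_i.
    """
    # Rightmost redirection target of every position redirected to the right.
    reach = {}
    for x, y in zip(orig, new_orig):
        if y > x:
            reach[x] = max(reach.get(x, x), y)

    def traverses(s, z):
        # assumed reading of "s traverses z" for rightward redirections
        return s <= z < reach[s]

    labels = {i: [] for i in range(k)}
    for xj in sorted(reach):  # left-to-right pass over R_dist
        free = [i for i in range(k)
                if not any(traverses(s, xj) for s in labels[i])]
        if not free:
            raise ValueError("error")  # cannot happen under k-traversal
        labels[min(free)].append(xj)
    return labels
```

Recomputing the free indexes at every step mirrors the description in the proof; a version remembering the rightmost target per index would avoid the repeated scan.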
Each input position $x$ that can be redirected to the right (resp. left) is labelled by some $\mathit{Right}_i$ (resp. $\mathit{Left}_i$). Notice that these labels are not exclusive, and a position $x$ can a priori have many such labels. However our construction ensures that every position $x$ has at most one right label and one left label.

We construct an algorithm that builds the input parameters $\mathit{Left}_i, \mathit{Right}_i$ such that it witnesses $(\sigma, \sigma') \in [\![R_k]\!]$. We will describe how to assign the $\mathit{Right}_i$ parameters; the left variant is symmetrical. The parameter variable $\mathit{Right}_i$ starts with value $\emptyset$ for each $i \in [0, k - 1]$. Let $R_{dist} = \{x_1, \dots, x_n\} \subseteq \mathit{dom}(u)$ be the set (indexed in increasing order) of positions $x$ such that there exists an output position $t$ with $\mathit{orig}(t) = x$ and $\mathit{orig}'(t) > x$, i.e. $R_{dist}$ is the set of positions that can be redirected to the right. The algorithm makes a left to right pass of the input positions in $R_{dist}$, starting at $x_1$. When treating $x_j \in R_{dist}$ it does the following:
- Set $\mathit{FreeIndexes} = \{i \mid \forall x \in \mathit{Right}_i,\ x \text{ does not traverse } x_j\}$.
- If $\mathit{FreeIndexes}$ is empty, then output "error" and stop, otherwise let $i_{min}$ be the minimal element of $\mathit{FreeIndexes}$, and add $x_j$ to $\mathit{Right}_{i_{min}}$.

If the algorithm never outputs "error", then by construction these input parameters witness $(\sigma, \sigma') \in [\![R_k]\!]$. Indeed, if a position $x$ traverses a position $z$, the algorithm cannot give the same label $\mathit{Right}_i$ to both $x$ and $z$.

Notice that in the algorithm, the set of free indexes is recomputed from scratch at every step. Equivalently, we could remember for each $i$ the rightmost redirection target $y_i$ of the position $s_i$ currently labelled by $\mathit{Right}_i$, and free index $i$ when we reach position $y_i$.

We prove that "error" will never be output, under the $k$-traversal hypothesis on $(\sigma, \sigma')$. Assume for contradiction that at stage $j$, $\mathit{FreeIndexes}$ is empty. This means that for all $i \in [0, k - 1]$ there exists $s_i \in \mathit{Right}_i$ that traverses $x_j$. These $s_i$ are all distinct, since by construction an input position is only added to at most one input parameter $\mathit{Right}_i$. This shows that position $x_j$ is traversed by $k$ positions strictly before $x_j$, and since it also traverses itself, we have a contradiction with the $k$-traversal assumption.

A.6 Construction of domino tiles

A configuration of $M$ is the data of a tape content, a state, and the position of the head on the tape. Such a configuration will be encoded by a word of $\Gamma^*$ of the form $u \cdot qa \cdot v \#$ with $u, v \in A^*$, $q \in Q$, and $a \in A$. The separation symbol $\#$ marks the end of each configuration of $M$. When necessary, intermediary configurations are interleaved to add blank symbols at the extremity of the tape.

The word $u \cdot qa \cdot v \#$ encodes the tape content $uav$, with a machine in state $q$ currently reading the marked letter $a$.

The full computation history of $M$ on empty input is a finite or infinite sequence of configurations, and can be encoded by a single word $\mathit{Hist}_M \in \Gamma^* \cup \Gamma^\omega$, obtained by concatenation of the encodings of the successive configurations.

We will now associate a finite set of tiles $D_M$ to the machine $M$. Each tile of $D_M$ is indexed by an integer $i$, and consists of a pair of words $(u_i, v_i) \in (\Gamma^*)^2$.

The set $D_M$ contains the following tiles:
- for every $a \in A \cup \{\#\}$, a copy tile $(a, a)$,
- for every right moving transition $\delta(p, a) = (q, b, \mathit{right})$, a right tile $(pa, bq)$,
- for every $q \in Q$, a right expansion tile $(q\#, qB\#)$,
- for every left moving transition $\delta(p, a) = (q, b, \mathit{left})$, and every letter $c \in A$, a left tile $(cpa, qcb)$, as well as a left expansion tile $(\#pa, \#qBb)$.

Notice that we omitted to include a start tile $(\varepsilon, q_0 \#)$ in $D_M$, as we will encode it explicitly in the reduction. Let $\Sigma \subseteq \mathbb{N}$ be the finite set of indexes of tiles from $D_M$. As in the classical proof of undecidability of the Post Correspondence Problem [11], these tiles are designed to simulate the run of $M$ as specified by Lemma 21.

A.7 Undecidability of BoundTape

▶ Lemma 39.
For a deterministic Turing Machine M it is undecidable whether M ∈ BoundTape.
Proof.
We reduce from the halting problem on an empty tape. Consider a deterministic Turing machine $M$; we build a new Turing machine $M'$ which simulates $M$ by writing the full computation history of $M$ on its tape. This new machine $M'$ halts if and only if the computation of $M$ halts. Moreover, $M'$ halts if and only if $M' \in$ BoundTape, regardless of the tape usage of $M$: a non-halting computation of $M$ has an infinite history, which $M'$ keeps appending to its tape. Therefore, $M$ halts on empty input if and only if $M' \in$ BoundTape, which is the wanted reduction. ◀
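The quantity at stake in BoundTape, namely the number of tape cells used on empty input, can be observed on finite prefixes of a run. The sketch below is only an illustration and cannot decide BoundTape, since it stops after `max_steps` steps. It assumes a two-way infinite tape stored as a dictionary and a transition table given as a Python dict; the name `tape_usage` is hypothetical.

```python
def tape_usage(delta, start, final, max_steps, blank="B"):
    """Run a deterministic TM on empty input for at most max_steps steps.

    delta maps (state, letter) to (state, letter, "left" | "right").
    Returns (halted, cells), where cells is the number of distinct tape
    cells visited during the simulated prefix of the run.
    """
    tape, pos, state = {}, 0, start
    visited = {0}
    for _ in range(max_steps):
        key = (state, tape.get(pos, blank))
        if state == final or key not in delta:
            return True, len(visited)
        state, letter, move = delta[key]
        tape[pos] = letter                 # write, then move the head
        pos += 1 if move == "right" else -1
        visited.add(pos)
    # the step budget is exhausted; report whether the machine is now halted
    halted = state == final or (state, tape.get(pos, blank)) not in delta
    return halted, len(visited)
```

A machine that moves right forever makes `cells` grow with `max_steps`, which is the behaviour excluded by membership in BoundTape.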
A.8 Undecidability results for rational transducers
We recall here briefly the definition of rational resynchronizations for 1NTs. See [8] for a full presentation.

The notion of origin graph is replaced here by interleaved word: we assume the input alphabet $\Sigma$ and the output alphabet $\Gamma$ to be disjoint, and we represent the origin information of a pair $(u, v) \in \Sigma^* \times \Gamma^*$ by a word $w \in (\Sigma \cup \Gamma)^*$, such that when keeping only the letters from $\Sigma$ (resp. $\Gamma$) in $w$, we obtain the word $u$ (resp. $v$). The origin of an output letter $v_i \in \Gamma$ is then given by the last letter $u_j$ of $\Sigma$ preceding it in $w$.

Thus, a resynchronization is now a set of pairs of interleaved words $(w, w')$, stating that the origins encoded by $w$ can be changed to those encoded by $w'$. Notice that the lengths of $w$ and $w'$ are always equal, so such a pair can be seen as a word on alphabet $(\Sigma \cup \Gamma)^2$. A resynchronization is rational if it is a regular language on alphabet $(\Sigma \cup \Gamma)^2$.

▶ Example 40.
Let us recall the origin graphs from Example 31.

[Figure: the two origin graphs (blue and red) on input $aaabaab$ and output $cdcd$.]

The blue origin graph would be encoded by the interleaved word $acaabdacabd$, and the red one by $aaacbdaacbd$. So this particular resynchronization pair is represented by the pair of words $(acaabdacabd, aaacbdaacbd)$, which we can represent in columns to visualize the alphabet $(\Sigma \cup \Gamma)^2$:

$$\begin{pmatrix} a\,c\,a\,a\,b\,d\,a\,c\,a\,b\,d \\ a\,a\,a\,c\,b\,d\,a\,a\,c\,b\,d \end{pmatrix}$$

The resynchronizer $R_{block}$ from Example 31 is rational, as witnessed by the following regular expression on alphabet $(\Sigma \cup \Gamma)^2$:

$$(e_{bd})^* \big(e_{block}\,(e_{bd})^+\big)^* e_{block}\,(e_{bd})^*$$

where
$$e_{bd} = \binom{b}{b}\binom{d}{d} \quad\text{and}\quad e_{block} = \binom{a}{a}\left(\binom{c}{c} + \binom{c}{a}\binom{a}{a}^{*}\binom{a}{c}\right).$$

In particular it is shown in [8] that the shift resynchronizations are rational (under the name bounded delay resynchronisers).

As mentioned in Section 5, since the shift resynchronizations are rational, and since any rational resynchronization is in particular bounded regular [4, Theorem 3], our reduction from Section 5 can be used as an alternative proof of undecidability of rational resynchronization synthesis, shown in [4] via one-counter automata. This means we directly obtain Corollary 27:
▶ Corollary 27.
Given two 1NTs $T_1, T_2$ such that $[\![T_1]\!] \subseteq [\![T_2]\!]$, it is undecidable whether there exists a rational resynchronizer $R_{rat}$ such that $T_1 \subseteq R_{rat}(T_2)$.

We can further strengthen the result via Theorem 28:

▶ Theorem 28.
Given two 1NTs $T_1, T_2$ and a regular resynchronizer $R_{reg}$ such that $T_1 \subseteq R_{reg}(T_2)$, it is undecidable whether there exists a rational resynchronizer $R_{rat}$ such that $T_1 \subseteq R_{rat}(T_2)$.

We prove this by a small modification of the construction of $T_{up}$ from the undecidability proof in Section 5. We design $T'_{up}$ such that it either simulates $T_{up}$, or outputs an arbitrary word with origin on the first input letter and then finishes. The transducer $T'_{up}$ is represented below:

[Figure: the transducer $T'_{up}$, containing $T_{up}$ as a component, with transition labels $\varepsilon \mid \varepsilon$, $\varepsilon \mid \Gamma$, $i \mid \varepsilon$, $i \mid \varepsilon$, $\varepsilon \mid \varepsilon$.]

We have $T_{down} \preceq T'_{up}$, witnessed by the bounded resynchronizer $R$ defined by $\gamma(x, y) = \mathit{first}(x)$. In this resynchronizer, any origin pointing to the first input letter can be resynchronized to any input position. However, $R$ is not rational, and the existence of a rational resynchronizer witnessing $T_{down} \preceq T'_{up}$ reduces to BoundTape.

▶ Lemma 41.
There exists a rational resynchronization $R$ such that $T_{down} \subseteq R(T'_{up})$ if and only if $M \in$ BoundTape.
Proof.
Let $R$ be a rational resynchronizer such that for every graph $\sigma \in [\![T_{down}]\!]_o$, there exists a graph $\sigma' \in [\![T'_{up}]\!]_o$ such that $(\sigma', \sigma) \in [\![R]\!]$. We show that when the input word is long enough, the graph $\sigma'$ corresponds to a run of $T'_{up}$ simulating $T_{up}$. Assuming the contrary, we would obtain that the rational resynchronizer $R$ contains arbitrarily long pairs $p_n$ of the form

$$\begin{pmatrix} i_1\, v_1\, v_2 \dots v_n\, i_2 \dots i_n \\ i_1\, v_1\, i_2\, v_2 \dots\dots i_n\, v_n \end{pmatrix},$$

with $i_j \in \Sigma$ and $v_j \in \Gamma$ for all $j$. Let $n$ be bigger than twice the number of states of a DFA $\mathcal{A}$ recognizing the rational resynchronization on alphabet $(\Sigma \cup \Gamma)^2$. We can pump a factor of length at least 2 in the factor $\binom{v_1}{v_1}\binom{v_2}{i_2}\binom{v_3}{v_2} \dots \binom{v_n}{x}$ of the pair $p_n$. This way we can produce pairs of words accepted by $\mathcal{A}$, but whose projections to $\Gamma$ do not match, i.e. the output word is not the same before and after resynchronization. This means that $\mathcal{A}$ is not the automaton of a rational resynchronization, a contradiction. We obtain that there exists a constant $k \in \mathbb{N}$ such that for inputs longer than $k$, $T'_{up}$ behaves as $T_{up}$. Thus the proof of Theorem 24 can now be used to show that a rational resynchronizer exists if and only if $M \in$ BoundTape. This uses the fact that a $k$-shift resynchronization is rational, and that any rational resynchronization is in particular regular [4, Theorem 3]. ◀
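The interleaved-word encoding recalled at the start of this appendix, and used in Example 40, is easy to compute for one-way runs. The following is a minimal sketch, assuming 1-indexed origins given as a non-decreasing Python list (as is the case for 1NTs); the function name `interleave` is an illustrative choice.

```python
def interleave(u: str, v: str, orig) -> str:
    """Encode the origin graph (u, v, orig) of a one-way run as an interleaved word.

    orig[t] is the 1-indexed input position of the t-th output letter; the
    list must be non-decreasing with values in 1..len(u).
    """
    if len(orig) != len(v) or any(a > b for a, b in zip(orig, orig[1:])):
        raise ValueError("origins must be non-decreasing, one per output letter")
    word, t = [], 0
    for j, letter in enumerate(u, start=1):
        word.append(letter)
        # emit every output letter whose origin is the input position j
        while t < len(v) and orig[t] == j:
            word.append(v[t])
            t += 1
    if t != len(v):
        raise ValueError("every origin must be a position of u")
    return "".join(word)
```

With input $aaabaab$ and output $cdcd$, the origin lists `[1, 4, 5, 7]` and `[3, 4, 6, 7]` give the two interleaved words of Example 40, and `list(zip(w, w2))` then lists the letters of the corresponding word over $(\Sigma \cup \Gamma)^2$.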