Scattered one-counter languages have rank less than $\omega^2$
Kitti Gelle, Szabolcs Iván
Department of Computer Science, University of Szeged, Hungary
arXiv [cs.FL]
Abstract
A linear ordering is called context-free if it is the lexicographic ordering of some context-free language and is called scattered if it has no dense subordering. Each scattered ordering has an associated ordinal, called its rank. It is known that scattered context-free (regular, resp.) orderings have rank less than $\omega^\omega$ ($\omega$, resp.). In this paper we confirm the conjecture that scattered one-counter languages have rank less than $\omega^2$.
1. Introduction
If an alphabet $\Sigma$ is equipped with a linear order $<$, this order can be extended to the lexicographic ordering $<_\ell$ on $\Sigma^*$ as follows: $u <_\ell v$ if and only if either $u$ is a proper prefix of $v$, or $u = xay$ and $v = xbz$ for some $x, y, z \in \Sigma^*$ and letters $a < b$. So any language $L \subseteq \Sigma^*$ can be viewed as a linear ordering $(L, <_\ell)$. Since $\{a, b\}^*$ contains the dense ordering $(aa + bb)^*ab$ and every countable linear ordering can be embedded into any countably infinite dense ordering, every countable linear ordering is isomorphic to one of the form $(L, <_\ell)$ for some language $L \subseteq \{a, b\}^*$.

This way, order types can be represented by languages over some alphabet (by a prefix-free encoding of the alphabet by binary strings, one can restrict the alphabet to the binary one). A very natural choice is to use regular or context-free languages as these language classes are well-studied. A linear ordering (or an order type) is called regular or context-free if it is isomorphic to the linear ordering (or, is the order type) of some language of the appropriate type. It is known [2] that an ordinal is regular if and only if it is less than $\omega^\omega$ and is context-free if and only if it is less than $\omega^{\omega^\omega}$. Also, the Hausdorff rank [14] of any scattered regular (context-free, resp.) ordering is less than $\omega$ ($\omega^\omega$, resp.) [9, 5].

It is known [8] that the order type of a well-ordered language generated by a prefix grammar (i.e. one in which each nonterminal generates a prefix-free language) is computable, thus the isomorphism problem of context-free ordinals is decidable if the ordinals in question are given as the lexicographic orderings of prefix grammars. The isomorphism problem of regular orderings is decidable as well [16, 3], even in polynomial time [12]. On the other hand, it is undecidable for a context-free grammar whether it generates a dense language, hence the isomorphism problem of context-free orderings in general is undecidable [4].
It is unknown whether the isomorphism problem of scattered context-free orderings is decidable. A partial result in this direction is that if the rank of such an ordering is at most one (that is, the order type is a finite sum of the terms $\omega$, $-\omega$ and $1$), then the order type is effectively computable from a context-free grammar generating the language [6, 7]. It is also decidable whether a context-free grammar generates a scattered language of rank at most one.

It is a very plausible scenario though that the isomorphism problem of scattered context-free orderings is undecidable in general: the rank handled so far is quite low compared to the upper bound $\omega^\omega$ of the rank of these orderings, and there is no known structural characterization of scattered context-free orderings. Clearly, among the well-orderings, exactly the ordinals smaller than $\omega^{\omega^\omega}$ are context-free, but for scattered orderings the main obstacle is the lack of a finite "normal form": as every $\omega$-indexed sum of the terms $\omega$ and $-\omega$ is scattered of rank two, there are already uncountably many scattered orderings of rank two, and thus only a really small fraction of them can possibly be context-free.

The class of the one-counter languages lies strictly between the classes of regular and context-free languages: these are the languages that can be recognized by a pushdown automaton having only one stack symbol.

(Preprint submitted to Elsevier, July 2, 2020.)

In [11], a family of well-ordered languages $L_n \subseteq \{a, b, c\}^*$ was given for each integer $n \ge 1$ so that the order type of $L_n$ is $\omega^{\omega \times n}$ (thus its rank is $\omega \times n$), and Kuske formulated two conjectures: i) the order type of well-ordered one-counter languages is strictly less than $\omega^{\omega^2}$ and, more generally, ii) the rank of scattered one-counter languages is strictly less than $\omega^2$. Of course the second conjecture implies the first.

In this paper we prove the second conjecture of [11]: $\omega^2$ is a strict upper bound for the rank of scattered one-counter languages.
The paper contains new results only: instead of reproving the results of [6] and the subsequent, more general [7] (these papers already contain full proofs and examples for their respective results), we push the boundaries of the knowledge of scattered context-free orderings by applying some of the tools we developed in the earlier papers to the class of one-counter languages. It turns out that it is enough to study restricted one-counter languages to prove the conjecture, and for this, a crucial step is to reason about the cycles in a generalized sequential machine, so in the end we can again use some graph-theoretic methods.
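As a small illustration of the lexicographic ordering $<_\ell$ defined at the start of this section, the following Python sketch implements the two clauses of the definition (proper prefix, or a smaller letter at the first differing position) and samples the dense language $(aa+bb)^*ab$. The function names are hypothetical, and letters are assumed to be ordered by code point, so that $a < b$.

```python
from itertools import product

def lex_less(u: str, v: str) -> bool:
    """Return True iff u <_l v: either u is a proper prefix of v, or
    u = x a y and v = x b z with letters a < b (compared by code point)."""
    for a, b in zip(u, v):
        if a != b:
            return a < b          # first position where the words differ
    return len(u) < len(v)        # no difference found: u must be a proper prefix

def sample(max_blocks: int) -> list:
    """Words of (aa+bb)^k ab for all k <= max_blocks (a finite sample only)."""
    words = []
    for k in range(max_blocks + 1):
        for blocks in product(("aa", "bb"), repeat=k):
            words.append("".join(blocks) + "ab")
    return words

def witness_between(u: str, v: str, pool: list):
    """Return some word w of the pool with u <_l w <_l v, or None."""
    for w in pool:
        if lex_less(u, w) and lex_less(w, v):
            return w
    return None
```

On such finite samples one can observe the density of $(aa+bb)^*ab$: for any two neighbouring words built from at most $k$ blocks, `witness_between` finds a word with at most $k+1$ blocks strictly between them.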
2. Notation
We assume the reader has some background in formal language theory and linear orderings (e.g. from the textbooks [10, 14]), but we list the notation we use in the paper (which is the same as in [6] and [7]). We assume each alphabet (finite, nonempty set) comes with a fixed total ordering. When $\Sigma$ is a totally ordered set, we use two partial orderings on $\Sigma^*$: the prefix ordering $\le_p$ ($u \le_p v$ if and only if $v = uu'$ for some $u' \in \Sigma^*$), with $<_p$ denoting the strict variant of $\le_p$, and the strict ordering $<_s$ ($u <_s v$ if and only if $u = u_1au_2$ and $v = u_1bu_3$ for some words $u_1, u_2, u_3 \in \Sigma^*$ and letters $a < b$). The union of $\le_p$ and $<_s$ is the lexicographic ordering $\le_\ell$ of $\Sigma^*$, which is a total ordering and whose strict variant is denoted $<_\ell$. This way, each language $L \subseteq \Sigma^*$ can be viewed as a (linearly) ordered set $(L, \le_\ell)$; let $o(L)$ denote the order type of the language $L$. As an example, for the binary alphabet $\{0, 1\}$ with $0 < 1$ we have that $o(0^*)$ is the least infinite ordinal $\omega$, $o(0^*1)$ is the order type $-\omega$ of the negative integers as $\ldots <_\ell 001 <_\ell 01 <_\ell 1$ (note that we use the negative sign to indicate reversal of an order type to avoid confusion with the Kleene star), and $o((00 + 11)^*01)$ is the order type $\eta$ of the rationals.

A linear ordering is called scattered if it has no dense subordering, i.e. the rationals cannot be embedded into it; it is called quasi-dense if it is not scattered, and countable if its domain is countable. Hausdorff associated an ordinal rank to each scattered order (see e.g. [14]), but we use a slightly modified variant (not affecting the main result, as this variant differs from the original one by at most one) introduced in [5] as follows.
For each ordinal $\alpha$ we define a class $H_\alpha$ of linear orderings:
• $H_0$ contains all the finite linear orderings;
• $H_\alpha$ for $\alpha > 0$ is the least class of linear orderings closed under finite sum and isomorphism which contains all the sums of the form $\sum_{i \in \zeta} o_i$, where for each integer $i$, the linear ordering $o_i$ belongs to $H_{\beta_i}$ for some ordinal $\beta_i < \alpha$.

By Hausdorff's theorem, a countable linear ordering is scattered if and only if some class $H_\alpha$ contains it: the least such $\alpha$ is called the rank of the ordering (or of the order type, as the value factors through isomorphism).

We note here that the original definition of Hausdorff includes only the empty ordering and the singletons in $H_0$ and does not require the classes $H_\alpha$ to be closed under finite sum. Since a finite sum of orderings can always be written as a zeta-sum of the same orderings and infinitely many zeros, and a zeta-sum of finite linear orderings is also a zeta-sum of empty and singleton orderings, this slight change can introduce only a difference of one between the ranks, e.g. $\omega + \omega$ has rank one in our rank notion but has rank two in the original one. Since $\alpha < o$ holds for a limit ordinal $o$ and an ordinal $\alpha$ if and only if $\alpha + 1 < o$, and $o = \omega^2$ is a limit ordinal, the main theorem holds for the original notion of rank as well.

For a language $L \subseteq \Sigma^*$, we let $\mathrm{Pref}(L)$ stand for the set $\{u \in \Sigma^* : u \le_p v \text{ for some } v \in L\}$ of the prefixes of the members of $L$. Similarly, let $\mathrm{Suf}(L)$ stand for the set of the suffixes of the members of $L$ (which is formally the reversal of the prefix language of the reversal of $L$, say).

For each word $u$ there is a shortest prefix $v$ of $u$ so that $u \in v^*$; this word $v$ is called the primitive root $\mathrm{root}(u)$ of $u$. The word $u$ is called primitive if $u = \mathrm{root}(u)$.

Let $D \subseteq \{0, 1\}^*$ be the language of proper bracketings, generated by the grammar $S \to 0S1 \mid SS \mid \varepsilon$.
That is, $0$ plays the role of the opening bracket while $1$ plays the closing bracket.

A (nondeterministic) regular transducer for the purposes of this paper is a tuple $M = (Q, \Sigma, \Delta, q_0, F, \mu)$ where $Q$ is the finite set of states, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of final states, $\Sigma$ is the output alphabet, $\Delta \subseteq Q \times \{0, 1\} \times Q$ is the transition relation and for each $(p, a, q) \in \Delta$, $\mu(p, a, q)$, also denoted $R_{p,a,q}$, is a nonempty regular language over $\Sigma$.

To each word $w \in \{0, 1\}^*$ and states $p, q$ we associate a (regular) language $L(M, w, p, q)$ inductively as follows: let $L(M, \varepsilon, p, q) = \{\varepsilon\}$ if $p = q$, and let it be the empty language if $p \ne q$. For each nonempty word $w = ua$, let
$$L(M, ua, p, q) = \bigcup_{(r,a,q) \in \Delta} L(M, u, p, r) \cdot R_{r,a,q}.$$
We define $L(M, w) = \bigcup_{q \in F} L(M, w, q_0, q)$ and $L(M) = \bigcup_{u \in D} L(M, u)$. Observe that we only allow the binary alphabet as input; moreover, the transducer is by definition only applied to the language $D$ of proper bracketings. We make these restrictions to ease notation and to maintain readability of the paper.

A language $L \subseteq \Sigma^*$ is called a restricted one-counter language if $L = L(M)$ for some regular transducer $M$. As an example, consider the transducer given in Figure 1, with $q_0$ being its initial and $q_f$ being its only final state. Clearly, only words of the form $w \in 0^*1^+$ can have a nonempty image $L(M, w)$ under $M$, so as $0^*1^+ \cap D = \{0^n1^n : n \ge 1\}$, we get $L(M) = \bigcup_{n \ge 1} L(M, 0^n1^n) = \bigcup_{n \ge 1} c^n(b^*a)^n$, so this language $L = L(M)$ is a restricted one-counter language. In [11] it has been shown that $o(L) = \omega^\omega$ and $o(L^k) = \omega^{\omega \times k}$. In particular, for each $k \ge 1$, $L^k$ is a scattered language of rank $\omega \times k$. (Note that $L^*$ is not scattered by e.g. Proposition 2, so $L^*$ is not an example of a scattered language of rank $\omega^2$, though it is a one-counter language.)
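The inductive definition of $L(M, w, p, q)$ can be made concrete with a short Python sketch. It assumes that every output language $R_{p,a,q}$ is given as a finite set of strings (so $b^*a$ is truncated here to $\{a, ba\}$), and the transition table below is only our reading of Figure 1, which is not fully legible in this copy. All function and variable names are hypothetical.

```python
def in_D(w: str) -> bool:
    """Membership in D: '0' opens a bracket, '1' closes one."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "0" else -1
        if depth < 0:             # a closing bracket without a partner
            return False
    return depth == 0

def image(delta, mu, w, start, finals):
    """L(M, w) for finite output languages, following the inductive definition:
    langs[q] holds L(M, u, start, q) for the prefix u of w read so far."""
    langs = {start: {""}}         # L(M, eps, p, q) is {eps} iff p = q
    for a in w:
        nxt = {}
        for (p, x, q) in delta:
            if x == a and p in langs:
                outs = {s + r for s in langs[p] for r in mu[(p, x, q)]}
                nxt.setdefault(q, set()).update(outs)
        langs = nxt
    result = set()
    for q in finals:              # union over the final states
        result |= langs.get(q, set())
    return result

# Assumed transitions of the transducer of Figure 1 (b*a truncated to {a, ba}).
DELTA = {("q0", "0", "q0"), ("q0", "1", "qf"), ("qf", "1", "qf")}
MU = {
    ("q0", "0", "q0"): {"c"},
    ("q0", "1", "qf"): {"a", "ba"},
    ("qf", "1", "qf"): {"a", "ba"},
}
```

Only inputs from $D$ contribute to $L(M)$, so under these assumptions $L(M)$ corresponds to the union of `image(DELTA, MU, w, "q0", {"qf"})` over all words `w` with `in_D(w)`.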
[Figure 1: Transducer for $c^n(b^*a)^n$.]

A one-counter language is usually defined by means of pushdown automata operating with a single stack symbol. The characterization from [1] suits our purposes better: the class of one-counter languages is the least language class which contains the restricted one-counter languages and is closed under concatenation and Kleene iteration.

The reason why we use the modified rank variant instead of the original one is the following couple of handy statements:
Proposition 1 ([5]).
Some useful properties of the version of the Hausdorff rank that we use, which hold for scattered languages $K$ and $L$:
• $\mathrm{rank}(L) = \mathrm{rank}(\mathrm{Pref}(L))$
• $\mathrm{rank}(K \cup L) = \max\bigl(\mathrm{rank}(K), \mathrm{rank}(L)\bigr)$
• $\mathrm{rank}(KL) \le \mathrm{rank}(L) + \mathrm{rank}(K)$
• more generally, if $K$ is scattered of rank $\alpha$ and for each $w \in K$, $L_w$ is a scattered language with rank at most $\beta$, then $\bigcup_{w \in K} wL_w$ is scattered of rank at most $\beta + \alpha$.

3. Some properties of scattered languages

In this section we list some propositions regarding some operations (mostly iteration and product) of scattered languages.
Proposition 2.
Assume $L \subseteq \Sigma^*$ is a language such that $L^+$ is scattered. Then $L \subseteq v^*$ for some word $v \in \Sigma^*$.
Proof.
Assume $u, v \in L$ are nonempty words with $\mathrm{root}(u) \ne \mathrm{root}(v)$. Then, by Lyndon's theorem (see e.g. [15], Theorem 2.2), $uv \ne vu$, say $uv <_s vu$ (having the same length, they cannot be in the $<_p$ relation, so it is either $uv <_s vu$ or the other way around). Then the language $\{uvuv, vuvu\}^*uvvu$ forms a dense subset of $L^+$. Thus, if $L^+$ is scattered, then the nonempty members of $L$ share a common primitive root $v$, and hence $L \subseteq v^*$.

Proposition 3. If $L \subseteq \Sigma^*$ is a dense language, then it has a prefix-free dense subset $K \subseteq L$.
Proof.
Let $P \subseteq L$ be the language containing all the words which are members of some infinite prefix chain of $L$. We have two cases.

If $P$ is not dense, then there exist two elements $u, v \in P$ such that $u <_\ell v$ but there is no $w \in P$ with $u <_\ell w <_\ell v$. Then the sublanguage $L' = \{x \in L : u <_\ell x <_\ell v\}$ of $L$ is still dense and has no member in $P$. In $L'$ there can be elements which are in the prefix relation, but all the $<_p$-chains are finite within $L'$ (since if $L'$ contained an infinite $<_p$-chain, its elements would be in $P$). So let $K \subseteq L'$ be the language containing the $<_p$-maximal elements of $L'$ (i.e. those words of $L'$ which are not proper prefixes of any other word of $L'$). Since there is no infinite prefix chain in $L'$, we have $L' \subseteq \bigcup_{w \in K} \mathrm{Pref}(w)$. Since $\mathrm{Pref}(w)$ is finite for each word $w \in K$, while $L'$ is infinite (and dense), $K$ has to be dense as well, and it is prefix-free.

If $P$ is dense, we define a word $x_u \in P$ inductively for each word $u \in \{0, 2\}^*\{\varepsilon, 1\}$ such that $u <_p v$ implies $x_u <_p x_v$ and $u <_s v$ implies $x_u <_s x_v$. First observe that for each $x \in P$, there has to be an infinite number of $\omega$-words $w$ such that $x \in \mathrm{Pref}(w)$ and $\mathrm{Pref}(w) \cap P$ is infinite (that is, there have to be infinitely many different prefix chains containing $x$). Indeed, if there were some $x \in P$ with only a finite number of such $\omega$-words, say $\{w_1, \ldots, w_k\}$, then choosing one of them, say $w_1$, there would be a length $N$ such that if $u \in \mathrm{Pref}(w_1)$ with $|u| \ge N$, then $u \notin \mathrm{Pref}(w_i)$ for $i > 1$. Hence, if $u$ and $v$ were long enough members of $\mathrm{Pref}(w_1)$, then only a finite number of elements of $P$ would fit between them (each of them being a prefix of the same $w_1$), and $P$ would not be a dense set.

So, moving back to the construction, for the base step we choose an arbitrary word from $P$ for $x_\varepsilon$. Having defined $x_u \in P$ with $u \in \{0, 2\}^*$, we define $x_{u0}$, $x_{u1}$ and $x_{u2}$ as follows.
Since there are infinitely many infinite prefix chains in $P$ containing $x_u$, we can choose three different $\omega$-words $w_0$, $w_1$ and $w_2$ with $x_u$ being a prefix of each of them and with $w_0 <_s w_1 <_s w_2$. Since the three $\omega$-words differ, long enough prefixes of $w_i$ are not prefixes of the other two words, and since each $w_i$ is a limit of an infinite prefix chain, we can choose long enough prefixes of each $w_i$ which are in $P$ and are not prefixes of the other two $\omega$-words. We define $x_{u0}$, $x_{u1}$ and $x_{u2}$ to be this prefix of $w_0$, $w_1$ and $w_2$, respectively.

Then the words of the form $x_{u1}$, with $u \in \{0, 2\}^*$, form a dense subset of $P$.

Proposition 4. If $L \subseteq \Sigma^*$ is a scattered language and $uK \subseteq \mathrm{Pref}(L)$ for some word $u \in \Sigma^*$ and language $K \subseteq \Sigma^*$, then $K$ is scattered as well.
Proof.
Since $u^{-1}L$ embeds into $L$ under the mapping $x \mapsto ux$, we get that $u^{-1}L$ is scattered as well and $K \subseteq u^{-1}\mathrm{Pref}(L) = \mathrm{Pref}(u^{-1}L)$. Assume $K$ is not scattered, that is, it has a dense subset $X \subseteq K$. By Proposition 3 there exists a language $X' \subseteq X$ such that $X'$ is prefix-free and still dense. Hence, $X'$ embeds into $\mathrm{Pref}(u^{-1}L)$ as well, which is a contradiction since a dense ordering cannot be embedded into a scattered one. Thus, $K$ has to be scattered.

Corollary 1. If $L = L_1L_2$ is a nonempty scattered language, then so are $L_1$ and $L_2$.

4. Linear and semilinear sets

Let $\mathbb{N}$ stand for the set of nonnegative integers. We call a set $X \subseteq \mathbb{N}^k$ periodic if it has the form $X = \{N + M \cdot t : t \ge 0\}$ for some vectors $N, M \in \mathbb{N}^k$; linear if it has the form $X = \{N_0 + N_1 \cdot t_1 + N_2 \cdot t_2 + \ldots + N_n \cdot t_n : t_1, \ldots, t_n \ge 0\}$ for some integer $n \ge 0$ and vectors $N_0, \ldots, N_n \in \mathbb{N}^k$; semilinear if it is a finite union of linear sets; and ultimately periodic if it is a finite union of periodic sets. (Observe that a singleton set is also periodic, by choosing the vector $M$ in the definition to be the null vector, thus finite sets are ultimately periodic.)

It is known [13] that a subset of $\mathbb{N}$ is ultimately periodic if and only if it is semilinear. Moreover, by Parikh's theorem we know that the Parikh image $\Psi(L) = \{(|u|_0, |u|_1) : u \in L\}$ of any context-free language $L \subseteq \{0, 1\}^*$ is semilinear (the theorem holds for arbitrary alphabets).

Let us define the (net) opening depth of a word $w \in \{0, 1\}^*$ as $\mathrm{open}(w) = |w|_0 - |w|_1$. Clearly, a word $w$ belongs to $\mathrm{Pref}(D)$ if and only if $\mathrm{open}(w') \ge 0$ for each prefix $w'$ of $w$, and to $D$ if additionally $\mathrm{open}(w) = 0$. As an extension, we define $\mathrm{open}' : \mathbb{N}^2 \to \mathbb{N}$ as $(n, m) \mapsto n - m$. Then clearly, $\mathrm{open}(w) = \mathrm{open}'(\Psi(w))$ for each word $w \in \{0, 1\}^*$, and the image of a linear set
$$\{(n_0, m_0) + (n_1, m_1) \cdot t_1 + \ldots + (n_k, m_k) \cdot t_k : t_1, \ldots, t_k \ge 0\} \subseteq \mathbb{N}^2$$
is the linear (thus ultimately periodic) set
$$\Bigl\{(n_0 - m_0) + \sum_{i=1}^{k} (n_i - m_i) \cdot t_i : t_1, \ldots, t_k \ge 0\Bigr\} \subseteq \mathbb{N}.$$
Hence, $\mathrm{open}(L)$ is an ultimately periodic set for any context-free language $L \subseteq \{0, 1\}^*$.

Similarly, let us define the closing depth of a word $w \in \{0, 1\}^*$ as $\mathrm{close}(w) = |w|_1 - |w|_0$. Then, a word $w$ belongs to $\mathrm{Suf}(D)$ if and only if $\mathrm{close}(w') \ge 0$ for each suffix $w'$ of $w$, and belongs to $D$ if and only if additionally $\mathrm{close}(w) = 0$. Again, we define $\mathrm{close}'(n, m) = m - n$. We also get that for any context-free language $L \subseteq \{0, 1\}^*$, $\mathrm{close}(L)$ is ultimately periodic.

Given a transducer $M = (Q, \Sigma, \Delta, q_0, F, \mu)$, we associate to each state $q \in Q$ the following set $N(q) \subseteq \mathbb{N}$ of integers: $n \in N(q)$ if and only if there exist words $u, v \in \{0, 1\}^*$ with $uv \in D$, $q \in q_0u$, $qv \cap F \ne \emptyset$ and $\mathrm{open}(u) = n$. It will be useful to define two additional sets $N_-(q)$ and $N_+(q)$ as follows: let $n \in N_-(q)$ if and only if $q \in q_0u$ for some $u \in \mathrm{Pref}(D)$ with $\mathrm{open}(u) = n$ and similarly, $n \in N_+(q)$ if and only if $qu \cap F \ne \emptyset$ for some $u \in \mathrm{Suf}(D)$ with $\mathrm{close}(u) = n$. Clearly, $N(q) = N_-(q) \cap N_+(q)$.

Proposition 5.
For each state $q$ of a transducer $M$, the set $N(q)$ is ultimately periodic.
Proof. As $N_-(q) = \mathrm{open}(\{u \in \mathrm{Pref}(D) : q \in q_0u\})$ and this language is the intersection of the context-free language $\mathrm{Pref}(D)$ and the regular language $\{u \in \{0, 1\}^* : q \in q_0u\}$, we have that $N_-(q)$ is ultimately periodic.

Similarly, $N_+(q)$ is ultimately periodic as well. As the intersection of finitely many ultimately periodic sets is ultimately periodic [13], so is $N(q)$.

For an example of a transducer (without the output function, as that does not play a role in the sets $N(q)$) and the sets $N(q)$, see Figure 2. The reader is encouraged to verify some of these sets, e.g. for N + ( q ) we have that the words accepted from q are the members of the language (000 + 01) ∗ ∗ + 11) ∩ Suf ( D ) on which if we apply the close function we get the nonnegative numbers belonging to the set {− t − t : t , t ≥ } ∪ {− t − t ≥ } , that is, { t − t : t , t ≥ , t ≥ t } ∪ { } which in turn is simply N , or { t : t ≥ } as each nonnegative integer k can be written as either k = 2 · t − · if k is even and as k = 2 t − · if k is odd.

Proposition 6.
For any transducer $M$, there exists some integer $P > 1$, called a period of $M$, and for each state $q$ of $M$ some subset $\tau(q)$ of $\{0, \ldots, 2P - 1\}$, called the type of $q$, such that
$$N(q) = \bigl(\tau(q) \cap \{0, \ldots, P - 1\}\bigr) \cup \{n \in \mathbb{N} : n \ge P,\ n \equiv r \bmod P \text{ for some } r \ge P,\ r \in \tau(q)\}.$$
Proof.
By Proposition 5, each set $N(q)$ is ultimately periodic, that is, a finite union of sets of the form $\{r + p \cdot t : t \ge 0\}$ for some constants $r, p \ge 0$ (called the remainder and the period; the case $p = 0$ defines a singleton set). Let $P$ be the least integer which is a multiple of each nonzero period, larger than all the remainders, and at least two.

[Figure 2: The sets $N_-(q)$, $N_+(q)$ and $N(q)$, denoted by $-$, $+$ and $\cap$ respectively, for an example transducer; here $P = 6$.]

We claim that $X(q) = \{n : 0 \le n \le 2P - 1\} \cap N(q)$ is a good choice for the type of $q$. To this end, let $\widehat{X}(q)$ stand for the (ultimately periodic) set
$$\bigl(X(q) \cap \{0, \ldots, P - 1\}\bigr) \cup \bigcup_{r \in X(q),\ r \ge P} \{n \ge P : n \equiv r \bmod P\}.$$
So we have to show that $N(q) = \widehat{X}(q)$.

First, observe that $\widehat{X}(q) \cap \{0, \ldots, 2P - 1\} = N(q) \cap \{0, \ldots, 2P - 1\}$ by the definition of $X(q)$, so we have to show that for any integer $n \ge 2P$, $n \in \widehat{X}(q)$ if and only if $n \in N(q)$. Let us write $N(q) = \bigcup_{i \in [k]} \{r_i + p_i \cdot t : t \ge 0\}$. And indeed, for $n \ge 2P$ (and thus $n \ge r_i, p_i$ for each $i \in [k]$) we have
$$\begin{aligned}
n \in \widehat{X}(q) &\iff n \equiv r \bmod P \text{ for some } r \in X(q),\ r \ge P \\
&\iff n \equiv r \bmod P \text{ for some } r \in N(q),\ P \le r < 2P \\
&\iff n \equiv r_i + p_i \cdot t \bmod P \text{ for some } i \in [k],\ 0 \le t \\
&\iff n \equiv r_i + p_i \cdot t \bmod P \text{ for some } i \in [k],\ 0 \le t < P/p_i \\
&\iff n \equiv r_i \bmod p_i,\ n \ge r_i \text{ for some } i \in [k] \\
&\iff n \in N(q).
\end{aligned}$$

Now we create a transducer $M'$ from $M$ by creating copies of each state. We want to construct $M'$ so that each state has a singleton type.
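The choice of the period $P$ and of the type in the proof of Proposition 6 can be checked numerically on small inputs. The Python sketch below assumes that an ultimately periodic set is given as a nonempty list of hypothetical (remainder, period) pairs; it computes $P$, cuts the set below $2P$ to obtain the type, and rebuilds membership from the type alone. The helper names and the example data are ours, not the paper's.

```python
from math import gcd

def member(n, parts):
    """n belongs to the union of the sets {r + p*t : t >= 0}, (r, p) in parts."""
    return any(n == r if p == 0 else (n >= r and (n - r) % p == 0)
               for r, p in parts)

def period(parts):
    """Least P that is a multiple of every nonzero period, larger than every
    remainder, and at least two."""
    base = 1
    for _, p in parts:
        if p > 0:
            base = base * p // gcd(base, p)   # least common multiple so far
    bound = max(r for r, _ in parts)
    P = base
    while P <= bound or P < 2:
        P += base                             # stay a multiple of every period
    return P

def type_of(parts, P):
    """The candidate type: the members of the set that are below 2P."""
    return {n for n in range(2 * P) if member(n, parts)}

def from_type(n, tau, P):
    """Membership of n in the set rebuilt from the type alone."""
    if n < P:
        return n in tau
    return any(r >= P and (n - r) % P == 0 for r in tau)
```

For instance, for the hypothetical set $\{1\} \cup \{2t + 5 : t \ge 0\}$, encoded as `[(1, 0), (5, 2)]`, `period` yields 6, and `from_type` agrees with `member` on every tested integer.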
The states of $M'$ will be triples of the form $(q, n, \sigma)$ with $q \in Q$, $n \in \tau(q)$ and $\sigma \in \{\equiv, \uparrow, \downarrow\}$.

Let $P$ be a period of $M$. From the state $q$ of $M$, we create a state $(q, n, \equiv)$ for each $n \in \tau(q)$ with $n \ge P$, and two states, $(q, n, \uparrow)$ and $(q, n, \downarrow)$, for each $n \in \tau(q)$ with $n < P$. Observe that since $q_0w \cap F \ne \emptyset$ for some $w \in D$, we have $0 \in \tau(q_0)$. In $M'$, let $(q_0, 0, \uparrow)$ be the initial state. Also, if $q_f \in F$, then we can assume that there exists some word $w \in D$ with $q_f \in q_0w$ (otherwise we can remove $q_f$ from $F$, and the resulting transducer will be equivalent to $M$), and so $0 \in N(q_f)$ as well. So let $\{(q_f, 0, \downarrow) : q_f \in F\}$ be the (nonempty) set of accepting states in $M'$.

We define the transitions of $M'$ as follows: let $((p, n, \sigma_1), a, (q, m, \sigma_2)) \in \Delta'$ if and only if $(p, a, q) \in \Delta$ and one of the following conditions holds:
i) $n + 1 = m < P$, $\sigma_1 = \sigma_2$ and $a = 0$;
ii) $n - 1 = m$, $m < P$, $\sigma_2 \in \{\sigma_1, \downarrow\}$ and $a = 1$;
iii) $n + 1 \equiv m \bmod P$, $m \ge P$, $n \ge P - 1$, $a = 0$, $\sigma_2 = {\equiv}$ and $\sigma_1 \ne {\downarrow}$;
iv) $n - 1 \equiv m \bmod P$, $n \ge P$, $m \ge P - 1$, $a = 1$, $\sigma_1 = {\equiv}$ and $\sigma_2 \ne {\uparrow}$.

Moreover, for $((p, n, \sigma_1), a, (q, m, \sigma_2)) \in \Delta'$, let $\mu'((p, n, \sigma_1), a, (q, m, \sigma_2)) = \mu(p, a, q)$. Finally, if there is any non-accessible or non-coaccessible state in $M'$, then let us drop it. Figure 3 shows a part of the transducer $M'$ constructed from the transducer $M$ of Figure 2, with some states missing and without the output function, to maintain readability of the transition diagram.

The idea is that when $M'$ reads some input word, then for a while it uses states labeled by $\uparrow$; then, if the opening depth of the currently read prefix reaches $P$, from that point on it uses states labeled by $\equiv$; then, after reading the longest prefix with opening depth at least $P$, it switches to states labeled by $\downarrow$. In the $\uparrow$- and $\downarrow$-states the exact opening depth is maintained, while in the $\equiv$-states it is maintained only modulo $P$.
(During the switch from an $\equiv$-state to a $\downarrow$-state, nondeterminism is used to guess the end of the longest prefix, and this guess is then checked by the $\downarrow$-states.) Finally, if the depth of the word never reaches $P$, then the transducer switches at some point from an $\uparrow$-state to a $\downarrow$-state by a transition of type ii). Most of these latter transitions are intentionally missing from the diagram of $M'$ in Figure 3.

Proposition 7.
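For each word $u = a_1 \ldots a_n \in D$ and run $q_0 \xrightarrow{a_1/R_1} q_1 \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} q_n$ in $M$ with $q_n \in F$ there is a run $(q_0, 0, \uparrow) \xrightarrow{a_1/R_1} (q_1, t_1, \sigma_1) \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} (q_n, t_n, \sigma_n)$ in $M'$ such that $(q_n, t_n, \sigma_n)$ is an accepting state of $M'$, that is, $q_n \in F$, $t_n = 0$ and $\sigma_n = {\downarrow}$.
Proof.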
Let $u = a_1 \ldots a_n \in D$ be a word and $q_0 \xrightarrow{a_1/R_1} q_1 \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} q_n$ be a run in $M$ with $q_n \in F$. There are two cases: either $\mathrm{open}(v) < P$ for each prefix $v$ of $u$, or $\mathrm{open}(v) \ge P$ for at least one prefix $v$ of $u$. We construct an accepting run $(q_0, 0, \uparrow) \xrightarrow{a_1/R_1} (q_1, t_1, \sigma_1) \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} (q_n, t_n, \sigma_n)$ of $M'$ in both cases.

1. If $\mathrm{open}(v) < P$ for each prefix $v$ of $u$, then let us define $t_i = \mathrm{open}(a_1 \ldots a_i)$ for each $0 \le i \le n$, $\sigma_i = {\uparrow}$ for each $0 \le i < n$ and $\sigma_n = {\downarrow}$. Then the first $n - 1$ transitions are of type i) or type ii), depending on $a_i$, with $\sigma_1 = \sigma_2 = {\uparrow}$, and the last transition is of type ii) with $\sigma_2 = {\downarrow}$, since by $u \in D$ we get $a_n = 1$. Thus this is indeed an accepting run in $M'$.

2. If $\mathrm{open}(v) \ge P$ for at least one prefix $v$ of $u$, then let $i_\uparrow \ge 1$ be the largest index so that for each $j \le i_\uparrow$, $\mathrm{open}(a_1 \ldots a_j) < P$, and let $i_\downarrow$ be the smallest index so that for each $j \ge i_\downarrow$, $\mathrm{open}(a_1 \ldots a_j) < P$. These indices exist since $\mathrm{open}(a_1) = 1 < P$ and $\mathrm{open}(a_1 \ldots a_n) = 0 < P$; moreover, $i_\uparrow < i_\downarrow$ since there exists some $i$ with $\mathrm{open}(a_1 \ldots a_i) \ge P$ and all of these indices $i$ have to fall strictly between $i_\uparrow$ and $i_\downarrow$.

Now let us define
$$t_i = \begin{cases} \mathrm{open}(a_1 \ldots a_i) & \text{if } i \le i_\uparrow \text{ or } i \ge i_\downarrow, \\ (\mathrm{open}(a_1 \ldots a_i) \bmod P) + P & \text{otherwise,} \end{cases} \qquad \sigma_i = \begin{cases} \uparrow & \text{if } i \le i_\uparrow, \\ \equiv & \text{if } i_\uparrow < i < i_\downarrow, \\ \downarrow & \text{if } i_\downarrow \le i. \end{cases}$$

We claim that for each $0 \le i < n$, $((q_i, t_i, \sigma_i), a_{i+1}, (q_{i+1}, t_{i+1}, \sigma_{i+1}))$ is a transition in $M'$. Indeed, $(q_i, a_{i+1}, q_{i+1})$ is a transition of $M$ and
• if $i < i_\uparrow$ and $a_{i+1} = 0$, then $t_i = \mathrm{open}(a_1 \ldots a_i)$, $t_{i+1} = \mathrm{open}(a_1 \ldots a_{i+1}) = t_i + 1 < P$ and $\sigma_i = \sigma_{i+1} = {\uparrow}$, thus the triple is a type i) transition;
• if $i < i_\uparrow$ and $a_{i+1} = 1$, then $t_i = \mathrm{open}(a_1 \ldots a_i)$, $t_{i+1} = \mathrm{open}(a_1 \ldots a_{i+1}) = t_i - 1$, $t_i < P$ and $\sigma_i = \sigma_{i+1} = {\uparrow}$, thus the triple is a type ii) transition;
• if $i = i_\uparrow$, then (by the maximality of $i_\uparrow$) $a_{i+1} = 0$, $\mathrm{open}(a_1 \ldots a_i) = t_i = P - 1$, $\mathrm{open}(a_1 \ldots a_{i+1}) = t_{i+1} = P$ (as $(P \bmod P) + P = P$), $\sigma_i = {\uparrow}$, $\sigma_{i+1} = {\equiv}$ and the triple is a type iii) transition;
• if $i_\uparrow < i < i_\downarrow - 1$ and $a_{i+1} = 0$, then $\sigma_i = \sigma_{i+1} = {\equiv}$, $t_i = (\mathrm{open}(a_1 \ldots a_i) \bmod P) + P \ge P$, $t_{i+1} = ((\mathrm{open}(a_1 \ldots a_i) + 1) \bmod P) + P \ge P$ and the triple is a type iii) transition;
• if $i_\uparrow < i < i_\downarrow - 1$ and $a_{i+1} = 1$, then $\sigma_i = \sigma_{i+1} = {\equiv}$, $t_i = (\mathrm{open}(a_1 \ldots a_i) \bmod P) + P \ge P$, $t_{i+1} = ((\mathrm{open}(a_1 \ldots a_i) - 1) \bmod P) + P \ge P$ and the triple is a type iv) transition;
• if $i = i_\downarrow - 1$, then (by the minimality of $i_\downarrow$) $t_i = \mathrm{open}(a_1 \ldots a_i) = P$, $a_{i+1} = 1$, $t_{i+1} = \mathrm{open}(a_1 \ldots a_{i+1}) = P - 1$, $\sigma_i = {\equiv}$, $\sigma_{i+1} = {\downarrow}$ and the triple is a type iv) transition;
• if $i \ge i_\downarrow$ and $a_{i+1} = 0$, then $t_i = \mathrm{open}(a_1 \ldots a_i)$, $t_{i+1} = \mathrm{open}(a_1 \ldots a_{i+1}) = t_i + 1 < P$ and $\sigma_i = \sigma_{i+1} = {\downarrow}$, thus the triple is a type i) transition;
• if $i \ge i_\downarrow$ and $a_{i+1} = 1$, then $t_i = \mathrm{open}(a_1 \ldots a_i)$, $t_{i+1} = \mathrm{open}(a_1 \ldots a_{i+1}) = t_i - 1$, $t_i < P$ and $\sigma_i = \sigma_{i+1} = {\downarrow}$, thus the triple is a type ii) transition.

[Figure 3: The automaton $M'$.]

Corollary 2. $L(M) = L(M')$ for the transducers $M$ and $M'$ of Proposition 7.
Proof.
From Proposition 7 we have $L(M) \subseteq L(M')$. For the other direction, $L(M') \subseteq L(M)$ also clearly holds, since the mapping $(q, n, \sigma) \mapsto q$ for each $q \in Q$, $n \in \tau(q)$, $\sigma \in \{\uparrow, \downarrow, \equiv\}$ transforms an accepting run in $M'$ into an accepting run in $M$, with the same labels on the transitions.

Hence, we can consider the automaton $M'$ and call those runs of the form $(q_0, t_0, \sigma_0) \xrightarrow{a_1/R_1} (q_1, t_1, \sigma_1) \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} (q_n, t_n, \sigma_n)$ of $M'$ explained in the construction consistent. By Proposition 7, $L(M)$ is then the union of all the languages $R_1 \ldots R_n$ occurring as output sequences on accepting consistent runs of $M'$ on input words belonging to $D$.
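The annotation used in the proof of Proposition 7 depends only on the input word and on $P$, so it can be sketched without fixing a transducer. The Python code below computes the pairs $(t_i, \sigma_i)$ for a nonempty word of $D$ and checks them against the counter and mode parts of conditions i) to iv). The function names are hypothetical, and `allowed` encodes our reading of those conditions, which should be treated as an assumption where the source is hard to read.

```python
UP, MID, DOWN = "↑", "≡", "↓"

def depths(u):
    """open(a_1...a_i) for i = 0..n, where '0' opens and '1' closes."""
    out = [0]
    for ch in u:
        out.append(out[-1] + (1 if ch == "0" else -1))
    return out

def lift(u, P):
    """Annotations (t_i, sigma_i), i = 0..n, for a nonempty word u of D."""
    d = depths(u)
    n = len(u)
    high = [i for i in range(n + 1) if d[i] >= P]
    if not high:                       # case 1: the depth never reaches P
        return [(d[i], UP) for i in range(n)] + [(d[n], DOWN)]
    i_up, i_down = high[0] - 1, high[-1] + 1
    run = []
    for i in range(n + 1):
        if i <= i_up:
            run.append((d[i], UP))             # exact depth, not yet at P
        elif i >= i_down:
            run.append((d[i], DOWN))           # exact depth, below P for good
        else:
            run.append((d[i] % P + P, MID))    # depth kept only modulo P
    return run

def allowed(src, a, dst, P):
    """Counter/mode part of conditions i)-iv), as we read them."""
    (n, s1), (m, s2) = src, dst
    if a == "0":
        return ((m == n + 1 and m < P and s1 == s2) or
                ((n + 1 - m) % P == 0 and m >= P and n >= P - 1
                 and s2 == MID and s1 != DOWN))
    return ((m == n - 1 and m < P and s2 in (s1, DOWN)) or
            ((n - 1 - m) % P == 0 and n >= P and m >= P - 1
             and s1 == MID and s2 != UP))
```

Pairing each annotation with the corresponding state $q_i$ of a run of $M$ gives the lifted run of $M'$; the sketch leaves the states of $M$ out because the annotation does not depend on them.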
5. Cycles in $M'$

Let us fix for this section a transducer $M = (Q, \Sigma, \Delta, q_0, F, \mu)$ generating a scattered language $L(M)$, let $P$ be a period of $M$ and let $M'$ be the transducer constructed in the previous section (see Proposition 7). Viewing $M'$ as a directed graph, we can study the strongly connected components (SCCs) of $M'$. Without loss of generality, as $M$ is nondeterministic, we can assume that $q_0$ is a source state (there are no incoming transitions to $q_0$) and each member of $F$ is a sink state (there are no outgoing transitions from the members of $F$). Hence, the state $(q_0, 0, \uparrow)$ is also a source in $M'$ and each $(q_f, 0, \downarrow)$ with $q_f \in F$ is a sink in $M'$, thus each one of these states lies in its own trivial SCC.

Let $\preceq$ be the usual reachability preorder on the states of $M'$, i.e., $(q, n, \sigma) \preceq (q', n', \sigma')$ if and only if $(q, n, \sigma)u \ni (q', n', \sigma')$ for some $u \in \{0, 1\}^*$, and let $(q, n, \sigma) \approx (q', n', \sigma')$ if and only if $(q, n, \sigma) \preceq (q', n', \sigma')$ and $(q', n', \sigma') \preceq (q, n, \sigma)$. The strongly connected components (SCCs) of $M'$ are its $\approx$-classes. By construction of $M'$ (using the condition $P > 1$) we get that if a component is a singleton set, then it is trivial: no state can have a loop edge, since if $(q, n, \sigma)a \ni (q', n', \sigma')$, then $n \ne n'$. We write $(q, n, \sigma) \prec (q', n', \sigma')$ if $(q, n, \sigma) \preceq (q', n', \sigma')$ holds but not the other way around. This preorder gives rise to the partial order $\prec$ on the SCCs of $M'$: $C \prec C'$ if and only if $C \ne C'$ and $(q, n, \sigma) \prec (q', n', \sigma')$ for some states $(q, n, \sigma) \in C$, $(q', n', \sigma') \in C'$.

A cycle in $M'$ (from a state $(p_0, k_0, \sigma_0)$) is a closed sequence of edges
$$(p_0, k_0, \sigma_0) \xrightarrow{a_1/R_1} (p_1, k_1, \sigma_1) \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} (p_n, k_n, \sigma_n) = (p_0, k_0, \sigma_0)$$
for some $n > 0$. The label of this cycle is $a_1 \ldots a_n$.
Clearly, all the states on a cycle belong to the same SCC of $M'$; moreover, by construction we have that if $u$ is the label of a cycle, then $\mathrm{open}(u) \equiv 0 \bmod P$. We call $\mathrm{open}(u)$ the weight of the cycle: if $\mathrm{open}(u)$ is zero, positive or negative, then the cycle is called zero, positive or negative, respectively. Thus, if $u$ is the label of a cycle from some state $(q, n, \sigma)$ with $n < P$, then the cycle has zero weight.

We begin with a couple of observations:

Proposition 8. In any SCC of $M'$, $\sigma$ is constant, i.e. if $(q, n, \sigma) \approx (q', n', \sigma')$, then $\sigma = \sigma'$.
For each state $(q, n, \sigma)$ of $M'$ with $\sigma \in \{\uparrow, \downarrow\}$ (and thus $0 \le n < P$), it holds that $\tau(q, n, \sigma) = \{n\}$. For each state $(q, n, \equiv)$ of $M'$ (and thus $P \le n < 2P$), it either holds that $\tau(q, n, \equiv) = \{n\}$ or $\tau(q, n, \equiv) = \{n, n - P\}$.
Proof.
Let us introduce the ordering ${\uparrow} \le {\equiv} \le {\downarrow}$. Then, for each transition $((q, n, \sigma), a, (q', n', \sigma'))$ of $M'$ we have $\sigma \le \sigma'$, hence if $(q, n, \sigma) \approx (q', n', \sigma')$, then $\sigma = \sigma'$ has to hold.

In particular, in any accepting run we first visit a positive number of $\uparrow$-states, then a nonnegative number of $\equiv$-states, and finally a positive number of $\downarrow$-states.

It is easy to see via induction on the length of the computation that if $(q_0, 0, \uparrow) \xrightarrow{a_1/R_1} (q_1, k_1, \uparrow) \xrightarrow{a_2/R_2} \ldots \xrightarrow{a_n/R_n} (q_n, k_n, \uparrow)$ is a path in $M'$, then for each $0 \le i \le n$ we have $\mathrm{open}(a_1 \ldots a_i) = k_i$, proving $N_-(q, k, \uparrow) \subseteq \{k\}$ for each $q \in Q$, $k \in \tau(q)$ for which $(q, k, \uparrow)$ is a state of $M'$, and since $N(q, k, \uparrow)$ is nonempty (otherwise we would leave this state out), it has to be the case that $N(q, k, \uparrow) = \{k\}$. The same reasoning applies to states of the form $(q, k, \downarrow)$, considering the suffix of an accepting run that passes solely through $\downarrow$-states.

Finally, by the construction of $\Delta'$ it is clear that if $(q, k, \sigma) \in (q_0, 0, \uparrow) \cdot u$ in $M'$, then $\mathrm{open}(u) \equiv k \bmod P$. Hence, for each state $(q, k, \equiv)$ (thus $P \le k < 2P$) we have $\tau(q, k, \equiv) \subseteq \{k, k - P\}$. Also, since $k \in \tau(q)$ in $M$, for each $t \ge 0$ there is at least one accepting run $\pi$ of $M$ on some word $u \in D$ such that for some prefix $v$ of $u$ with $\mathrm{open}(v) = k + t \cdot P$, $\pi$ is in the state $q$. Then the "lifted" run $\pi'$ of Proposition 7 is in some state $(q, n, \sigma)$, but as $\sigma \in \{\uparrow, \downarrow\}$ cannot happen here (since $\tau(q, n, \sigma)$ would then be $\{n\}$ with $n < P$), it has to be the case that $\sigma = {\equiv}$ and $n = ((k + t \cdot P) \bmod P) + P = k$, thus $k \in \tau(q, k, \equiv)$ as well. Hence, $\tau(q, k, \equiv)$ is either $\{k\}$ or $\{k, k - P\}$.

Proposition 9.
In a $\uparrow$- or a $\downarrow$-component, each cycle is a $0$-cycle.

Proof. If $(q, k, \uparrow) \in (q, k, \uparrow) \cdot u$ and $(q, k, \uparrow) \in (q_0, 0, \uparrow) \cdot v$, then by Proposition 8 we have $\mathrm{open}(v) = \mathrm{open}(vu) = k$, hence $\mathrm{open}(u) = 0$. For the $\downarrow$-states we have to consider the suffix of the computation in the same way.

We introduce a couple of shorthands: let $\mathrm{CycleWords}(q, k, \sigma)$ be the language $\{u : (q, k, \sigma) \in (q, k, \sigma) \cdot u\}$ over the input alphabet, and let
$$\mathrm{CycleOutputs}(q, k, \sigma) = \bigcup_{(q, k, \sigma) \xrightarrow{u/R} (q, k, \sigma)} R.$$

Proposition 10.
For each state $(q, k, \sigma)$ there exists a primitive word $w(q, k, \sigma) \in \Sigma^*$ such that whenever $(q, k, \sigma) \xrightarrow{u/R} (q, k, \sigma)$ with $\mathrm{open}(u) = 0$, then $R \subseteq w(q, k, \sigma)^*$.

Proof.
Let $(q, k, \sigma)$ be a state of $M'$. If there is no cycle of weight $0$ visiting $(q, k, \sigma)$, then the claim is vacuously satisfied. Otherwise, for each such cycle $(q, k, \sigma) \xrightarrow{u/R} (q, k, \sigma)$ there is some input word $w$ and output language $R_0$ with $(q_0, 0, \uparrow) \xrightarrow{w/R_0} (q, k, \sigma)$ such that $wu \in \mathrm{Pref}(D)$. Indeed, if $\sigma \in \{\uparrow, \downarrow\}$, then any such word $w \in \mathrm{Pref}(D)$ leading into $(q, k, \sigma)$ with $\mathrm{open}(w) = k$ will do: in a $\uparrow$- or $\downarrow$-component, $k$ always correctly stores the open value of the input consumed so far within an accepting run, so during the consumption of the cycle of weight $0$ the opening depth remains nonnegative, since we stay within the same component the whole time and there are no states $(q', k', \sigma)$ with negative $k'$. Otherwise, if $\sigma = {\equiv}$, then there is some word $w \in \mathrm{Pref}(D)$ leading into $(q, k, \sigma)$ with $\mathrm{open}(w) = k + |u| \cdot P$, and so no prefix $v$ of $wu$ can have a negative open value, so this run on $wu$ can be extended into some accepting path.

But then $R_0 R^* \subseteq \mathrm{Pref}(L(M))$, so by Proposition 4 we get that $R^*$ is scattered, so by Proposition 2, $R \subseteq w(q, k, \sigma)^*$ for some (primitive) word $w(q, k, \sigma)$.

Now if there is another cycle $(q, k, \sigma) \xrightarrow{v/R'} (q, k, \sigma)$, then by the same reasoning we get that for some prefix $w/R_0$ (opening deeply enough), the language $R_0 (R \cup R')^*$ is a subset of $\mathrm{Pref}(L(M))$ and so $R \cup R'$ has a primitive root, which has to be the primitive root $w(q, k, \sigma)$ as well. Thus this word is the primitive root of all the cycles of zero weight starting from $(q, k, \sigma)$.

Proposition 11. Assume there is some cycle of positive weight in some SCC $C$ of $M'$. Then for each $(q, k, \sigma) \in C$ there exists a (unique, primitive) word $w(q, k, \sigma) \in \Sigma^*$ such that $\mathrm{CycleOutputs}(q, k, \sigma) \subseteq w(q, k, \sigma)^*$. (Clearly, this word has to coincide with the word $w(q, k, \sigma)$ of Proposition 10 for states that also lie on a cycle of weight zero.)

Proof.
Let $(q, k, \sigma)$ be a state of $M'$ with $(q, k, \sigma) \xrightarrow{u_1/R_1} (q, k, \sigma) \xrightarrow{u_2/R_2} (q, k, \sigma)$ (so $u_1, u_2 \in \mathrm{CycleWords}(q, k, \sigma)$ and $R_1 \cup R_2 \subseteq \mathrm{CycleOutputs}(q, k, \sigma)$) such that $\mathrm{open}(u_1) > 0$. This in particular means that $\sigma = {\equiv}$ and $k \ge P$, since by construction of $M'$, $\uparrow$- and $\downarrow$-components can only have cycles with zero weight. Then, as $k \in \tau(q, k, \equiv)$, for each $t \ge 0$ there exists some input word $u$ and language $R \subseteq \Sigma^*$ with $(q, k, \equiv) \xrightarrow{u/R} (q_f, 0, \downarrow)$ for some $q_f \in F$ and $\mathrm{open}(u) = -(k + t \cdot P)$ (that is, $k + t \cdot P$ parentheses can be opened in $q$ for any $t$, and they can still be closed with some word). Also, there is some word $w$ (it can be assumed that $\mathrm{open}(w) \ge k$ is large enough, since $(q, k, \equiv)$ is a $\equiv$-state, so $w u_1 \in \mathrm{Pref}(D)$) and language $R_0$ with $(q_0, 0, \uparrow) \xrightarrow{w/R_0} (q, k, \equiv)$. Thus, since $\mathrm{open}(w u_1^t) = \mathrm{open}(w) + t \cdot \mathrm{open}(u_1)$, which is of the form $k + t' \cdot P$ for some $t' \ge 0$ since $\mathrm{open}(u_1) > 0$, we have that $R_0 R_1^* \subseteq \mathrm{Pref}(L(M'))$. Applying Propositions 4 and 2 we have that $R_1 \subseteq w(q, k, \sigma)^*$ for some primitive word $w(q, k, \sigma)$.

Also, if $(q, k, \equiv) \xrightarrow{u_2/R_2} (q, k, \equiv)$ for some input word $u_2$ and $R_2 \subseteq \Sigma^*$ (with $u_2$ being possibly a negative cycle), then for some large enough $t \ge 0$, the cycle $(q, k, \equiv) \xrightarrow{u_1^t u_2 / R_1^t R_2} (q, k, \equiv)$ satisfies $\mathrm{open}(u_1^t u_2) > 0$. Hence the language $R_1 \cup R_1^t R_2 \subseteq \mathrm{CycleOutputs}(q, k, \equiv)$ consists of words all sharing the same primitive root, which can only be $w(q, k, \sigma)$ (as $R_1 \subseteq R_1 \cup R_1^t R_2$), implying $R_1 \cup R_1^t R_2 \subseteq w(q, k, \sigma)^*$, which implies $R_2 \subseteq w(q, k, \sigma)^*$ as well since $R_1^t \subseteq w(q, k, \sigma)^*$.

Hence, if there is some cycle with positive weight containing a state $(q, k, \equiv)$, then there exists a primitive word $w(q, k, \sigma) \in \Sigma^*$ such that $\mathrm{CycleOutputs}(q, k, \equiv) \subseteq w(q, k, \sigma)^*$.

Also, in a $\equiv$-component, if there exists a cycle with positive weight, then there is such a cycle for each state in the same SCC: if $(q, k, \equiv) \xrightarrow{u/R} (q, k, \equiv)$ is a cycle with $\mathrm{open}(u) > 0$ and $(q', k', \equiv) \approx (q, k, \equiv)$, that is, $(q, k, \equiv) \xrightarrow{u_1/R_1} (q', k', \equiv) \xrightarrow{u_2/R_2} (q, k, \equiv)$, then for some large enough $t$ we have $(q', k', \equiv) \xrightarrow{u_2 u^t u_1 / R_2 R^t R_1} (q', k', \equiv)$ with $\mathrm{open}(u_2 u^t u_1) > 0$, proving the statement.

Proposition 12.
Assume $C$ is a component of $M'$ and there is a state $(q, k, \sigma) \in C$ and an output word $w \in \Sigma^*$ such that the set $\{\mathrm{open}(u) : u \in \mathrm{Pref}(D),\ (q_0, 0, \uparrow) \xrightarrow{u/w} (q, k, \sigma)\}$ is infinite. (Thus, $\sigma = {\equiv}$.) Then for each state $(q', k', \sigma) \in C$ there exists a word $w(q', k', \sigma)$ such that $\mathrm{CycleOutputs}(q', k', \sigma) \subseteq w(q', k', \sigma)^*$.

Proof.
Let $(q', k', \sigma)$ be a state in $C$. If there are no cycles from $(q', k', \sigma)$ (that is, if $C$ is trivial), then the claim is vacuously satisfied. Otherwise, let $(q', k', \sigma) \xrightarrow{u_1/R_1} (q', k', \sigma) \xrightarrow{u_2/R_2} (q', k', \sigma)$ be two cycles (possibly the same). Then, from the condition of the Proposition, there is a word $u \in \mathrm{Pref}(D)$ with $\mathrm{open}(u) \ge \max\{|u_1|, |u_2|\} + |C| + P$, an input word $u'$ of length at most $|C|$ and some language $R'$ (independent of $u_1$ and $u_2$) with $(q_0, 0, \uparrow) \xrightarrow{u/w} (q, k, \sigma) \xrightarrow{u'/R'} (q', k', \sigma)$. By the condition on $u$, both $u u' u_1$ and $u u' u_2$ are in $\mathrm{Pref}(D)$ and still have an opening depth of at least $P$, so both runs can be extended to some accepting run, yielding $w R' (R_1 \cup R_2) \subseteq \mathrm{Pref}(L(M))$ for each choice of $R_1$ and $R_2$ which are output languages of some cycle starting from $(q', k', \sigma)$. Now since if $x$ and $y$ are cycles, then so is $xy$, we get that $w R' (R_1 \cup R_2)^*$ is then also a subset of the scattered language $\mathrm{Pref}(L(M))$, thus by Propositions 4 and 2 we get that $R_1 \cup R_2 \subseteq w(q', k', \sigma)^*$ for some primitive word $w(q', k', \sigma)$. Thus both $R_1$ and $R_2$ have the very same primitive root $w(q', k', \sigma)$, no matter the choice of $R_1$ and $R_2$. Hence all the cycles indeed have the same primitive root, proving the claim.

For each transition $\delta = ((q, k, \sigma), a/R, (q', k', \sigma'))$ in $M'$ with $(q, k, \sigma) \prec (q', k', \sigma')$ (that is, for each intercomponent edge) we define the language $L(\delta) \subseteq \Sigma^*$ as the output language of runs which use $\delta$ as their final transition and can be extended to an accepting run. Formally:
$$L(\delta) = \bigcup_{\substack{(q_0, 0, \uparrow) \xrightarrow{u/R_0} (q, k, \sigma):\\ ua \in \mathrm{Pref}(D),\ \mathrm{open}(ua) \in \tau(q', k', \sigma')}} R_0 R.$$

Proposition 13 below has the most involved proof in the paper and is the central statement on the way to bounding the rank of scattered restricted one-counter languages.
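The side conditions in the definition of L(δ), namely membership of the input in Pref(D) and a constraint on its open value, can be checked mechanically on concrete input words. The following Python sketch is only an illustration under stated assumptions: it writes the two input letters as `(` and `)`, takes open(u) to be the number of opening letters minus the number of closing letters, and reads Pref(D) as the set of words none of whose prefixes has a negative open value. The function names `open_value`, `in_pref_dyck` and `cycle_weight_class` are hypothetical and do not come from the paper.

```python
def open_value(u: str) -> int:
    """Number of opening letters minus number of closing letters in u.

    Assumption: the input alphabet is written as '(' and ')'.
    """
    return u.count("(") - u.count(")")


def in_pref_dyck(u: str) -> bool:
    """Return True if no prefix of u closes more brackets than it has opened.

    Assumption: this is the intended reading of membership in Pref(D).
    """
    depth = 0
    for letter in u:
        if letter == "(":
            depth += 1
        elif letter == ")":
            depth -= 1
        else:
            raise ValueError(f"unexpected letter {letter!r}")
        if depth < 0:
            return False
    return True


def cycle_weight_class(u: str) -> str:
    """Classify a cycle label as 'zero', 'positive' or 'negative' by its open value."""
    weight = open_value(u)
    if weight == 0:
        return "zero"
    return "positive" if weight > 0 else "negative"
```

Under this reading, a cycle labeled `()` has zero weight, a cycle labeled `((` is positive, and a word such as `())(` is not in Pref(D) because its prefix `())` has a negative open value.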
Proposition 13.
For each intercomponent edge $\delta$, $L(\delta)$ is a scattered language of rank smaller than $\omega^2$.

Proof. As $L(\delta) \subseteq \mathrm{Pref}(L(M')) = \mathrm{Pref}(L(M))$ and $L(M)$ is a scattered language, so is $L(\delta)$ by Proposition 4.

Let $\delta = ((q, k, \sigma), a/R, (q', k', \sigma'))$ be an intercomponent edge, $(q, k, \sigma) \in C$ and $(q', k', \sigma') \in C'$ for the components $C \prec C'$ of $M'$. We use induction on the height of $C$ with respect to $\prec$ to prove the statement.

If $C = \{(q_0, 0, \uparrow)\}$ is the smallest component (recall that $q_0$ is assumed to be a source state in $M$), then the claim holds since then $L(\delta) = R$, which is a (scattered) regular language and thus has a finite rank.

If $C$ is not the smallest component, then either $C$ contains a cycle of positive weight, or it does not. In the latter case, either there is an output word $w \in L(\delta')$ for some intercomponent transition $\delta' = ((p, n, \sigma), b/R', (p', n', \sigma))$ leading into $C$ such that $\{\mathrm{open}(u) : (q_0, 0, \uparrow) \xrightarrow{u/w} (p, n, \sigma)\}$ is infinite, or there is not. We treat the first two of these three cases together.

1. If $C$ contains a cycle of positive weight, or if there is some output word $w$ whose open-set described in the previous paragraph is infinite, then by Proposition 11 or 12 respectively, the output language of every cycle within $C$ from a state $(q_i, k_i, \sigma)$ is contained in $w(q_i, k_i, \sigma)^*$. In particular, the order type of such an output language is either $\omega$ or finite, so its rank is at most one.

Now for any run $\pi$ using $\delta$ as its final transition there exist a sequence of distinct states $(q_1, k_1, \sigma), \ldots, (q_n, k_n, \sigma)$ of $C$ and an intercomponent transition $\delta'$ leading into $(q_1, k_1, \sigma)$ such that
• $\pi$ enters $C$ via $\delta'$, reaching $(q_1, k_1, \sigma)$,
• then takes zero or more cycles involving $(q_1, k_1, \sigma)$,
• then, after visiting $(q_1, k_1, \sigma)$ the last time, uses the transition to $(q_2, k_2, \sigma)$ labeled $R_1$, say,
• then takes zero or more cycles involving $(q_2, k_2, \sigma)$ (that do not involve $(q_1, k_1, \sigma)$, but that will not be important),
• then, after visiting $(q_2, k_2, \sigma)$ the last time, uses a transition to $(q_3, k_3, \sigma)$ labeled $R_2$, say, and so on,
• finally, after visiting $(q_n, k_n, \sigma)$ the last time, uses $\delta$.

Now for any fixed $\delta'$, $(q_1, k_1, \sigma), \ldots, (q_n, k_n, \sigma)$, the output language of these runs is contained within
$$L(\delta') \cdot w(q_1, k_1, \sigma)^* \cdot R_1 \cdot w(q_2, k_2, \sigma)^* \cdot R_2 \cdot \ldots \cdot w(q_n, k_n, \sigma)^* \cdot R.$$
By the induction hypothesis, the rank of $L(\delta')$ is strictly smaller than $\omega^2$, the rank of each $w(q_i, k_i, \sigma)^*$ is at most $1$ and the rank of each of the regular languages $R_1, R_2, \ldots$ and $R$ is finite, so by Proposition 1 the rank of this product is at most a finite ordinal plus an ordinal strictly smaller than $\omega^2$, hence it is still smaller than $\omega^2$. This product might contain words which are not in $L(\delta)$, but the intersection of $L(\delta)$ and this product is a subset of the product language, hence the intersection also has a rank smaller than $\omega^2$.

As there are only finitely many options for choosing the transition $\delta'$ and the sequence of distinct states of $C$, the language $L(\delta)$ is thus a finite union of languages, each having a rank smaller than $\omega^2$; applying the equation for finite unions in Proposition 1 we get that $L(\delta)$ also has a rank smaller than $\omega^2$.

2. Otherwise, $C$ might contain cycles of zero weight and negative cycles as well. By Proposition 10, for each state $(q_i, k_i, \sigma)$ in $C$ there exists a primitive word $w(q_i, k_i, \sigma)$ such that if $(q_i, k_i, \sigma) \xrightarrow{u/R''} (q_i, k_i, \sigma)$ is so that $\mathrm{open}(u) = 0$, then $R'' \subseteq w(q_i, k_i, \sigma)^*$.
In this case we also partition $L(\delta)$, but this time into a larger number of clusters. As in the previous case, let $\delta' = ((p_0, n_0, \sigma), b/R_0, (q_1, k_1, \sigma))$ be an intercomponent transition leading into $(q_1, k_1, \sigma) \in C$. Now let $wb \in L(\delta')$ be a possible output word of some run using $\delta'$ as its final step. For any such fixed $wb$, the set $\{\mathrm{open}(u) : (q_0, 0, \uparrow) \xrightarrow{u/w} (p_0, n_0, \sigma)\}$ is finite, since the case when it can be infinite was handled in the previous case. So let $n$ be some integer in this finite set.

Now consider a run that starts with the labels $u/w$ with $\mathrm{open}(u) = n \ge 0$, enters the component $C$ via $\delta'$, which leads into $(q_1, k_1, \sigma) \in C$, where $C$ contains only cycles of nonpositive weight, reads some input word $v$ within the component and leaves the component by the transition $\delta$ in such a way that $uv$ is still a member of $\mathrm{Pref}(D)$ (in order to be a prefix of some accepting path). For a sequence $(q_1, k_1, \sigma), (q_2, k_2, \sigma), \ldots, (q_t, k_t, \sigma)$ of (not necessarily distinct) states with $t \le |C| \cdot (|C| + n + 1)$, consider the paths $\pi$ which
• take zero or more cycles of weight zero from $(q_1, k_1, \sigma)$, the output language of which is a subset of $w(q_1, k_1, \sigma)^*$, and thus has rank at most one,
• then take a transition $((q_1, k_1, \sigma), a_1/R_1, (q_2, k_2, \sigma))$ to $(q_2, k_2, \sigma)$, the output language of which is a regular language of finite rank,
• then take zero or more cycles of weight zero from $(q_2, k_2, \sigma)$, again with rank at most one,
• then move to $(q_3, k_3, \sigma)$, outputting a language of finite rank,
• and so on, finally, after the cycles from $(q_t, k_t, \sigma)$, leave the component using $\delta$ (so $q_t = q$, $k_t = k$), outputting again a language of finite rank.

Now since for each fixed sequence $(q_1, k_1, \sigma), \ldots, (q_t, k_t, \sigma)$ the rank of the product language is finite, and there is a finite number of such sequences since $t \le |C| \cdot (|C| + n + 1)$, their union has a finite rank as well.
We claim that for any run following $u$ from $(q_1, k_1, \sigma)$ which can still be extended to an accepting run there is always such a sequence of states of bounded length. It is clear that some such sequence exists (as, say, taking no cycles at all and modeling the steps of the run inside $C$ is an option), so let us assume, for the sake of contradiction, that the sequence $(q_1, k_1, \sigma), \ldots, (q_t, k_t, \sigma)$ is the shortest one with the property that the run can be written as above (cycles of weight $0$ from $(q_1, k_1, \sigma)$, a transition to $(q_2, k_2, \sigma)$, cycles of weight $0$ from there, one more transition, etc.) and that $t > |C| \cdot (|C| + n + 1)$. As there are only $|C|$ distinct states in $C$, there is at least one state $(p, k_p, \sigma)$ which occurs at least $|C| + n + 1$ times in the sequence. Whenever a state occurs twice in this sequence of minimal length, then between the repetitions the run has to take a cycle of negative weight: there are no cycles of positive weight in $C$, and if it were a cycle of weight zero, then we could collapse the segment between the repetitions and obtain a shorter sequence, contradicting minimality.
Thus, while the state $(p, k_p, \sigma)$ gets repeated $|C| + n + 1$ times, the run takes $|C| + n$ cycles of negative weight, which decrease the open value of the consumed input word by at least $|C| + n$. Now before the first occurrence of $(p, k_p, \sigma)$ the segment of the run might increase the open value of the consumed input word (which is $n$ upon entering $C$), but only by at most $|C| - 1$: the run starts at $(q_1, k_1, \sigma) \in C$, takes some cycles there which have a nonpositive weight, then after the last visit of $(q_1, k_1, \sigma)$ it moves to some other state of $C$ (notice this is another decomposition of the prefix than the one we used), changing the open value by at most one, takes some cycles there having a nonpositive weight (thus not increasing the open value), after the last visit to that state it takes a step to some other state (possibly increasing the open value by one), and so on. As we always move into a new state, there are only $|C| - 1$ such transitions, so overall the opening depth of the input word can be increased to at most $n + |C| - 1$. Then we are taking $n + |C|$ negative cycles, decreasing the open value to a negative number, hence this run cannot be extended to an accepting one, as the input word cannot be in $\mathrm{Pref}(D)$.

Thus, for each output word $w \in L(\delta')$ there are only a finite number of possibilities for the opening depth of the input read so far, and for each such possibility $n$ a finite number of state sequences of length at most $|C| \cdot (|C| + n + 1)$, each defining a language of finite rank. Hence for each word $w \in L(\delta')$ we have a language $L_w$ of finite rank such that $w^{-1} L(\delta) \subseteq L_w$. That is, as $L(\delta')$ is a scattered language of rank smaller than $\omega^2$ by the induction hypothesis, and for each $w \in L(\delta')$ we have the language $L_w$ of finite rank, that is, with rank at most $\omega$, so that
$$L(\delta) \subseteq \bigcup_{\delta'} \bigcup_{w \in L(\delta')} w L_w,$$
applying Proposition 1 we get that the rank of $L(\delta)$ is at most $\omega + \alpha$ for some $\alpha < \omega^2$, thus $\omega + \alpha < \omega^2$ also holds, proving the statement.

Corollary 3. If $L(M)$ is a scattered language for the transducer $M$, then the rank of $L(M)$ is smaller than $\omega^2$.

Proof.
Since $M$ (and so $M'$) can be assumed to only have sinks as final states, we get that
$$L(M) = L(M') = \bigcup_{\substack{\delta = ((q, k, \sigma), a/R, (q_f, k', \sigma'))\\ \text{with } q_f \in F}} L(\delta),$$
which is a finite union of languages, each having rank strictly less than $\omega^2$ by Proposition 13.

We are ready to show the main result of the paper:

Theorem 1.
The rank of each scattered one-counter language is less than $\omega^2$.

Proof.
By Corollary 3, the rank of restricted scattered one-counter languages is less than $\omega^2$. It suffices to see that the property is preserved under concatenation and Kleene plus. For concatenation, due to Proposition 1, if $L = L_1 L_2$ for a scattered nonempty language $L$, then $L_1$ and $L_2$ are also scattered. By Proposition 1 we have $\mathrm{rank}(L) \le \mathrm{rank}(L_2) + \mathrm{rank}(L_1)$ in this case, hence if both $L_1$ and $L_2$ have rank less than $\omega^2$, then so has their product. For the case of iteration, if $L^+$ is scattered, then by Proposition 2, $L \subseteq v^*$ and hence $L^+ \subseteq v^*$ for some word $v$, thus $L^+$ is either finite (if $L \subseteq \{\varepsilon\}$) or has the order type $\omega$, hence $\mathrm{rank}(L^+) \le 1 < \omega^2$ again holds, proving the statement.
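The iteration case above, like Propositions 10 to 12, rests on a language being contained in v* for a single word v. As a small illustration that is not part of the paper's argument, the following Python sketch computes the primitive root of a word and tests whether a finite sample of words shares one root. The function names `primitive_root` and `common_primitive_root` are hypothetical, and a finite sample can of course only refute, never prove, containment of an infinite language in v*.

```python
from typing import Iterable, Optional


def primitive_root(w: str) -> str:
    """Return the shortest word p such that w is a power of p.

    The empty word is returned unchanged.
    """
    n = len(w)
    for d in range(1, n + 1):
        # p = w[:d] is a root of w exactly when d divides n and repeating p rebuilds w.
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    return w  # only reached when w is empty


def common_primitive_root(words: Iterable[str]) -> Optional[str]:
    """Return the primitive root shared by all nonempty words, or None if they differ.

    Empty words are ignored, since the empty word lies in v* for every v.
    If there are no nonempty words, the empty string is returned.
    """
    roots = {primitive_root(w) for w in words if w}
    if len(roots) > 1:
        return None
    return roots.pop() if roots else ""
```

For instance, the sample `["ab", "ababab", ""]` has the common primitive root `"ab"`, whereas `["ab", "ba"]` has none. When all words lie in v*, each shorter power of v is a proper prefix of every longer one, so the lexicographic order on the language is simply the order by length, which is why the order type is finite or ω.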
6. Conclusion
We confirmed the conjecture of [11] that scattered one-counter languages always have a rank strictly smaller than $\omega^2$, thus in particular, well-ordered one-counter languages always have an order type smaller than $\omega^{\omega^2}$. In the proof we used some upper bounds on the rank; it would be an interesting question to turn this into an algorithm which computes the exact rank of the language. Also, since scattered order types lack a Cantor-like normal form, it is not clear whether the order type of a scattered one-counter language is presentable by some expression involving, say, $\omega$, $-\omega$, finite products, sums and powers and, if so, whether such a presentation is computable, or, from the descriptive complexity point of view, whether representing such an expression by a transducer can be more succinct than storing the expression itself. Also, it is still not known whether the order isomorphism problem of two scattered context-free languages is decidable (for the general case of arbitrary context-free languages it is known to be undecidable), and this is open even for one-counter languages. For the case of regular languages the order isomorphism problem is known to be decidable, so to extend decidability, the class of restricted one-counter languages might be a good choice.

Ministry of Human Capacities, Hungary grant 20391-3/2018/FEKUSTRAT is acknowledged. Szabolcs Iván was supported by the János Bolyai Scholarship of the Hungarian Academy of Sciences.

References

[1] Jean Berstel and Luc Boasson. Transductions and context-free languages. Ed. Teubner, pages 1–278, 1979.
[2] Stephen L. Bloom and Zoltán Ésik. Algebraic ordinals. Fundam. Inform., 99(4):383–407, 2010.
[3] Stephen L. Bloom and Zoltán Ésik. The equational theory of regular words. Information and Computation, 197(1):55–89, 2005.
[4] Zoltán Ésik. An undecidable property of context-free linear orders. Information Processing Letters, 111(3):107–109, 2011.
[5] Zoltán Ésik and Szabolcs Iván. Hausdorff rank of scattered context-free linear orders. In David Fernández-Baca, editor, LATIN 2012: Theoretical Informatics, pages 291–302, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.
[6] Kitti Gelle and Szabolcs Iván. On the order type of scattered context-free orderings. In The Tenth International Symposium on Games, Automata, Logics, and Formal Verification, September 2-3, 2019, pages 169–182, 2019.
[7] Kitti Gelle and Szabolcs Iván. The order type of scattered context-free orderings of rank one is computable. In Alexander Chatzigeorgiou, Riccardo Dondi, Herodotos Herodotou, Christos A. Kapoutsis, Yannis Manolopoulos, George A. Papadopoulos, and Florian Sikora, editors, SOFSEM 2020: Theory and Practice of Computer Science - 46th International Conference on Current Trends in Theory and Practice of Informatics, SOFSEM 2020, Limassol, Cyprus, January 20-24, 2020, Proceedings, volume 12011 of Lecture Notes in Computer Science, pages 273–284. Springer, 2020.
[8] Kitti Gelle and Szabolcs Iván. The ordinal generated by an ordinal grammar is computable. Theoretical Computer Science, 793:1–13, 2019.
[9] Stephan Heilbrunner. An algorithm for the solution of fixed-point equations for infinite words. RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications, 14(2):131–141, 1980.
[10] John E. Hopcroft and Jeff D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley Publishing Company, 1979.
[11] Dietrich Kuske. Logical aspects of the lexicographic order on 1-counter languages. In Krishnendu Chatterjee and Jirí Sgall, editors, Mathematical Foundations of Computer Science 2013 - 38th International Symposium, MFCS 2013, Klosterneuburg, Austria, August 26-30, 2013. Proceedings, volume 8087 of Lecture Notes in Computer Science, pages 619–630. Springer, 2013.
[12] Markus Lohrey and Christian Mathissen. Isomorphism of regular trees and words. Information and Computation, 224:71–105, 2013.
[13] Armando B. Matos. Periodic sets of integers. Theoretical Computer Science, 127:287–312, 1994.
[14] J.G. Rosenstein. Linear Orderings. Pure and Applied Mathematics. Elsevier Science, 1982.
[15] Grzegorz Rozenberg and Arto Salomaa, editors. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Springer-Verlag, Berlin, Heidelberg, 1997.
[16] Wolfgang Thomas. On frontiers of regular trees. ITA, 20(4):371–381, 1986.