Automatic sequences: from rational bases to trees

Abstract

The nth term of an automatic sequence is the output of a deterministic finite automaton fed with the representation of n in a suitable numeration system. In this paper, instead of considering automatic sequences built on a numeration system with a regular numeration language, we consider these built on languages associated with trees having periodic labeled signatures and, in particular, rational base numeration systems. We obtain two main characterizations of these sequences. The first one is concerned with r-block substitutions where r morphisms are applied periodically. In particular, we provide examples of such sequences that are not morphic. The second characterization involves the factors, or subtrees of finite height, of the tree associated with the numeration system and decorated by the terms of the sequence.



MICHEL RIGO AND MANON STIPULANTI

Mathematics Subject Classification: 68R15, 68Q45, 11A63.

Keywords: Automatic sequences, abstract numeration systems, rational base numeration systems, alternating morphisms, PD0L systems, Cobham's theorem.

1. Introduction

Motivated by a question of Mahler in number theory, the introduction of rational base numeration systems has brought to light a family of formal languages with a rich combinatorial structure [1]. In particular, the generation of infinite trees with a periodic signature has emerged [17, 18, 19, 20]. Marsault and Sakarovitch very quickly linked the enumeration of the vertices of such trees (called breadth-first serialization) to the concept of abstract numeration system built on the corresponding prefix-closed language: the traversal of the tree is exactly the radix enumeration of the words of the language. In this paper, we study automatic sequences associated with that type of numeration systems. In particular, in the rational base $\frac{p}{q}$, a sequence is $\frac{p}{q}$-automatic if its $n$th term is obtained as the output of a DFAO fed with the base-$\frac{p}{q}$ representation of $n$. Thanks to a result of Lepistö [13] on factor complexity, we observe that we can get sequences that are not morphic.

We obtain several characterizations of these sequences. The first one boils down to translating Cobham's theorem from 1972 into this setting. In Section 4, we show that any automatic sequence built on a tree language with a purely periodic labeled signature is the image under a coding of an alternate fixed point of uniform morphisms not necessarily of the same length. If all the morphisms had the same length, as observed in [11], we would only get classical $k$-automatic sequences. As a consequence, in the rational base $\frac{p}{q}$, if a sequence is $\frac{p}{q}$-automatic, then it is the image under a coding of a fixed point of a $q$-block substitution whose images all have length $p$. In the literature, these substitutions are also called PD0L where a periodic control is applied: $q$ different morphisms are applied depending on the index of the considered letter modulo $q$.

On the other hand, Sturmian trees as studied in [3] also have a rich combinatorial structure where subtrees play a special role analogous to factors occurring in infinite words. In Section 5, we discuss the number of factors, i.e., subtrees of finite height, that may appear in the tree whose paths from the root are labeled by the words of the numeration language and whose vertices are colored according to the sequence of interest. Related to the $k$-kernel of a sequence, we obtain a new characterization of the classical $k$-automatic sequences: a sequence $x$ is $k$-automatic if and only if the labeled tree of the base-$k$ numeration system decorated by $x$ is rational, i.e., it has finitely many infinite subtrees. For numeration systems built on a regular language, the function counting the number of decorated subtrees of height $n$ is bounded, and we get a similar result. This is not the case in the more general setting of rational base numeration systems. Nevertheless, we obtain sufficient conditions for a sequence to be $\frac{p}{q}$-automatic in terms of the number of subtrees.

This paper is organized as follows. In Section 2, we recall basic definitions about abstract numeration systems, tree languages, rational base numeration systems, and alternate morphisms. In Section 3, we give some examples of the automatic sequences that we will consider. The parity of the sum-of-digits in base $\frac{3}{2}$ is such an example. In Section 4, Cobham's theorem is adapted to the case of automatic sequences built on tree languages with a periodic labeled signature in Theorem 20 (so, in particular, to the rational base numeration systems in Corollary 21). In Section 5, we decorate the nodes of the tree associated with the language of a rational base numeration system with the elements of a sequence taking finitely many values. Under a mild assumption (always satisfied when distinct states of the deterministic finite automaton with output producing the sequence have distinct outputs), we obtain a characterization of $\frac{p}{q}$-automatic sequences in terms of the number of trees of some finite height occurring in the decorated tree. In Section 6, we review some usual closure properties of $\frac{p}{q}$-automatic sequences.

The first author dedicates this paper to the memory of his grandmother Marie Wuidar (1923–2020).

2. Preliminaries

We make use of common notions in combinatorics on words, such as alphabet, letter, word, length of a word, language, and usual definitions from automata theory. In particular, we let $\varepsilon$ denote the empty word. For a finite word $w$, we let $|w|$ denote its length. For each $i \in \{0, \ldots, |w| - 1\}$, we let $w_i$ denote the $i$th letter of $w$ (and we thus start indexing letters at 0).

2.1. Abstract numeration systems.

When dealing with abstract numeration systems, it is usually assumed that the language of the numeration system is regular. However, the main feature is that words are enumerated by radix order (also called genealogical order: words are first ordered by increasing length and words of the same length are ordered by lexicographical order). The generalization of abstract numeration systems to context-free languages was, for instance, considered in [5]. Rational base numeration systems discussed below in Section 2.3 are also abstract numeration systems built on non-regular languages.
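The radix order described above can be sketched in a few lines of Python. This is only an illustration: the function names, the membership predicate `in_language`, and the sample language (a block of a's followed by a block of b's) are assumptions chosen for the example, and the brute-force enumeration is only practical for short words.

```python
from itertools import product

def radix_enumeration(alphabet, in_language, max_len):
    """Yield the words of a language in radix order, up to length max_len.

    `alphabet` is a string whose characters are listed in increasing order;
    `in_language` is a membership predicate (a hypothetical helper).
    """
    for length in range(max_len + 1):                     # shorter words first
        for letters in product(alphabet, repeat=length):  # then lexicographic order
            word = "".join(letters)
            if in_language(word):
                yield word

# Illustrative language: a block of a's followed by a block of b's.
def blocks_of_a_then_b(word):
    return "ba" not in word

words = list(radix_enumeration("ab", blocks_of_a_then_b, 3))
print(words)  # ['', 'a', 'b', 'aa', 'ab', 'bb', 'aaa', 'aab', 'abb', 'bbb']
```

In such an enumeration, the position of a word in the list plays the role of its numerical value, and the word at position $n$ plays the role of the representation of $n$.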

Definition 1. An abstract numeration system (or ANS for short) is a triple $S = (L, A, <)$ where $L$ is an infinite language over a totally ordered (finite) alphabet $(A, <)$. We say that $L$ is the numeration language. The map $\mathrm{rep}_S : \mathbb{N} \to L$ is the one-to-one correspondence mapping $n \in \mathbb{N}$ onto the $(n+1)$st word in the radix ordered language $L$, which is then called the $S$-representation of $n$. The $S$-representation of 0 is the first word in $L$. The inverse map is denoted by $\mathrm{val}_S : L \to \mathbb{N}$. For any word $w$ in $L$, $\mathrm{val}_S(w)$ is its $S$-numerical value.

Positional numeration systems, such as integer base numeration systems, the Fibonacci numeration system, and Pisot numeration systems, are based on the greediness of the representations. They all share the following property: $m < n$ if and only if $\mathrm{rep}(m)$ is less than $\mathrm{rep}(n)$ for the radix order. These numeration systems are thus ANS. As a non-standard example of ANS, consider the language $a^*b^*$ over $\{a, b\}$ and assume that $a < b$. Let $S = (a^*b^*, \{a, b\}, <)$. The first few words in the numeration language are $\varepsilon, a, b, aa, ab, bb, \ldots$. For instance, $\mathrm{rep}_S(3) = aa$ and $\mathrm{rep}_S(5) = bb$. One can show that $\mathrm{val}_S(a^p b^q) = \frac{(p+q)(p+q+1)}{2} + q$. For details, we refer the reader to [12] or [23].

In the next definition, we assume that most significant digits are read first. This is not a real restriction (see Section 6).

Definition 2. Let $S = (L, A, <)$ be an abstract numeration system and let $B$ be a finite alphabet. An infinite word $x = x_0 x_1 x_2 \cdots \in B^{\mathbb{N}}$ is $S$-automatic if there exists a deterministic finite automaton with output (DFAO for short) $\mathcal{A} = (Q, q_0, A, \delta, \mu : Q \to B)$ such that $x_n = \mu(\delta(q_0, \mathrm{rep}_S(n)))$ for all $n \ge 0$.

Let $k \ge 2$ be an integer. We let $A_k$ denote the alphabet $\{0, 1, \ldots, k-1\}$. For the usual base-$k$ numeration system built on the language

(2.1) $L_k := \{\varepsilon\} \cup \{1, \ldots, k-1\}\{0, \ldots, k-1\}^*$,

an $S$-automatic sequence is said to be $k$-automatic [2]. We also write $\mathrm{rep}_k$ and $\mathrm{val}_k$ in this context.

2.2. Tree languages.

Prefix-closed languages define labeled trees (also called trie or prefix-tree in computer science) and vice-versa. Let $(A, <)$ be a totally ordered (finite) alphabet and let $L$ be a prefix-closed language over $(A, <)$. The set of nodes of the tree is $L$. If $w$ and $wd$ are words in $L$ with $d \in A$, then there is an edge from $w$ to $wd$ with label $d$. The children of a node are ordered by the labels of the letters in the ordered alphabet $A$. In Figure 1, we have depicted the first levels of the tree associated with the prefix-closed language $a^*b^*$. Nodes are enumerated by breadth-first traversal (or, serialization).

We recall some notions from [18] or [20]. Let $T$ be an ordered tree of finite degree. The (breadth-first) signature of $T$ is a sequence of integers, the sequence of the degrees of the nodes visited by the (canonical) breadth-first traversal of the tree. The (breadth-first) labeling of $T$ is the infinite sequence of the labels of the edges visited by the breadth-first traversal of this tree. As an example, with the tree in Figure 1, its signature is $2, 2, 1, 2, 1, 1, 2, 1, 1, 1, 2, \ldots$ and its labeling is $a, b, a, b, b, a, b, b, b, a, b, \ldots$.

Figure 1. The first few levels of the tree associated with $a^*b^*$.

Remark 3.

As observed by Marsault and Sakarovitch [18], it is usually convenient to consider i-trees: the root is assumed to be a child of itself. It is especially the case for positional numeration systems when one has to deal with leading zeroes as the words $u$ and $0u$ may represent the same integer.

We now present a useful way to describe or generate infinite labeled i-trees. Let $A$ be a finite alphabet containing 0. A labeled signature is an infinite sequence $(w_n)_{n \ge 0}$ of finite words over $A$ providing a signature $(|w_n|)_{n \ge 0}$ and a consistent labeling of a tree (made of the sequence of letters of $(w_n)_{n \ge 0}$). It will be assumed that the letters of each word are in strictly increasing order and that $w_0 = 0x$ with $x \in A^+$. To that aim we let $\mathrm{inc}(A^*)$ denote the set of words over $A$ with increasingly ordered letters. For instance, 025 belongs to $\mathrm{inc}(A^*)$ but 0241 does not. Examples of labeled signatures will be given in the next section.

Remark 4.

Since a labeled signature $s$ generates an i-tree, by abuse, we say that such a signature defines a prefix-closed language denoted by $L(s)$. Moreover, since we assumed the words of $s$ all belong to $\mathrm{inc}(A^*)$ for some finite alphabet $A$, the canonical breadth-first traversal of this tree produces an abstract numeration system. Indeed the enumeration of the nodes $v_0, v_1, v_2, \ldots$ of the tree is such that $v_n$ is the $n$th word in the radix ordered language $L(s)$. The language $L(s)$, the set of nodes of the tree and $\mathbb{N}$ are thus in one-to-one correspondence.

2.3. Rational bases.

The framework of rational base numeration systems [1] is an interesting setting giving rise to a non-regular numeration language. Nevertheless, the corresponding tree has a rich combinatorial structure: it has a purely periodic labeled signature.

Let $p$ and $q$ be two relatively prime integers with $p > q > 1$. Given a positive integer $n$, we define the sequence $(n_i)_{i \ge 0}$ as follows: we set $n_0 = n$ and, for all $i \ge 0$,
$$q n_i = p n_{i+1} + a_i$$
where $a_i$ is the remainder of the Euclidean division of $q n_i$ by $p$. Note that $a_i \in A_p$ for all $i \ge 0$. Since $p > q$, the sequence $(n_i)_{i \ge 0}$ is decreasing and eventually vanishes at some index $\ell + 1$. We obtain
$$n = \sum_{i=0}^{\ell} \frac{a_i}{q} \left(\frac{p}{q}\right)^i.$$
Conversely, for a word $w = w_\ell w_{\ell-1} \cdots w_0 \in A_p^*$, the value of $w$ in base $\frac{p}{q}$ is the rational number
$$\mathrm{val}_{\frac{p}{q}}(w) = \sum_{i=0}^{\ell} \frac{w_i}{q} \left(\frac{p}{q}\right)^i.$$
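The recurrence and the value formula above translate directly into code. The following is a minimal sketch; the function names `rep_rational` and `val_rational` are illustrative choices, and digits are stored as Python lists with the most significant digit first.

```python
from fractions import Fraction

def rep_rational(n, p, q):
    """Digits (most significant first) of n >= 0 in base p/q.

    Follows the recurrence q*n_i = p*n_{i+1} + a_i; returns [] for n = 0.
    """
    digits = []
    while n > 0:
        n, a = divmod(q * n, p)   # quotient n_{i+1} and remainder a_i
        digits.append(a)
    return digits[::-1]

def val_rational(word, p, q):
    """Value in base p/q of a digit list (most significant digit first)."""
    base = Fraction(p, q)
    total = Fraction(0)
    for i, w in enumerate(reversed(word)):
        total += Fraction(w, q) * base**i
    return total

print([rep_rational(n, 3, 2) for n in range(6)])
# [[], [2], [2, 1], [2, 1, 0], [2, 1, 2], [2, 1, 0, 1]]
print(val_rational([2, 1, 0, 1], 3, 2))  # 5
```

Exact rational arithmetic (`Fraction`) is used because the value of an arbitrary digit word need not be an integer.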

Note that $\mathrm{val}_{\frac{p}{q}}(w)$ is not always an integer and $\mathrm{val}_{\frac{p}{q}}(uv) = \mathrm{val}_{\frac{p}{q}}(u) \left(\frac{p}{q}\right)^{|v|} + \mathrm{val}_{\frac{p}{q}}(v)$ for all $u, v \in A_p^*$. We let $N_{\frac{p}{q}}$ denote the value set, i.e., the set of numbers representable in base $\frac{p}{q}$:
$$N_{\frac{p}{q}} = \mathrm{val}_{\frac{p}{q}}(A_p^*) = \left\{ x \in \mathbb{Q} \mid \exists w \in A_p^* : \mathrm{val}_{\frac{p}{q}}(w) = x \right\}.$$
A word $w \in A_p^*$ is a representation of an integer $n \ge 0$ in base $\frac{p}{q}$ if $\mathrm{val}_{\frac{p}{q}}(w) = n$. As for integer bases, representations in rational bases are unique up to leading zeroes [1, Theorem 1]. Therefore we let $\mathrm{rep}_{\frac{p}{q}}(n)$ denote the representation of $n$ in base $\frac{p}{q}$ that does not start with 0. By convention, the representation of 0 in base $\frac{p}{q}$ is the empty word $\varepsilon$. In base $\frac{p}{q}$, the numeration language is the set
$$L_{\frac{p}{q}} = \left\{ \mathrm{rep}_{\frac{p}{q}}(n) \mid n \ge 0 \right\}.$$
Hence, rational base numeration systems are special cases of ANS built on $L_{\frac{p}{q}}$: $m < n$ if and only if $\mathrm{rep}_{\frac{p}{q}}(m) < \mathrm{rep}_{\frac{p}{q}}(n)$ for the radix order. It is clear that $L_{\frac{p}{q}} \subseteq A_p^*$ is a prefix-closed language. As a consequence of the previous section, it can be seen as a tree.

Example 5.

The alphabet for the base $\frac{3}{2}$ is $A_3 = \{0, 1, 2\}$. The first few words in $L_{\frac{3}{2}}$ are $\varepsilon, 2, 21, 210, 212, 2101, 2120, 2122, 21011, 21200, 21202, \ldots$ and the labeling of the associated i-tree is $0, 2, 1, 0, 2, 1, 0, 2, 1, \ldots$. Otherwise stated, the purely periodic labeled signature $(02, 1)^\omega$ gives the i-tree of the language $L_{\frac{3}{2}}$; see Figure 2. For all $n \ge 0$, the $n$th node in the breadth-first traversal is the word $\mathrm{rep}_{\frac{3}{2}}(n)$. Observe that there is an edge labeled by $a \in A_3$ from the node $n$ to the node $m$ if and only if $m = \frac{3}{2} \cdot n + \frac{a}{2}$. This remark is valid for all rational bases.

Figure 2. The first levels of the tree associated with $L_{\frac{3}{2}}$.

Remark 6.

The language $L_{\frac{p}{q}}$ is highly non-regular: it has the bounded left-iteration property; for details, see [17]. In $L_{\frac{p}{q}}$ seen as a tree, no two infinite subtrees are isomorphic, i.e., for any two words $u, v \in L_{\frac{p}{q}}$ with $u \ne v$, the quotients $u^{-1} L_{\frac{p}{q}}$ and $v^{-1} L_{\frac{p}{q}}$ are distinct. As we will see with Lemma 29, this does not prevent the languages $u^{-1} L_{\frac{p}{q}}$ and $v^{-1} L_{\frac{p}{q}}$ from coinciding on words of length bounded by a constant depending on $\mathrm{val}_{\frac{p}{q}}(u)$ and $\mathrm{val}_{\frac{p}{q}}(v)$ modulo a power of $q$. Nevertheless, the associated tree has a purely periodic labeled signature. For example, the bases $\frac{3}{2}$ and $\frac{5}{2}$ have the signatures $(02, 1)^\omega$ and $(024, 13)^\omega$, respectively; two further examples, of periods 3 and 4, have signatures starting with the words $036$ and $048$. Generalizations of these languages (called rhythmic generations of trees) are studied in [20].

Definition 7.

We say that a sequence is $\frac{p}{q}$-automatic if it is $S$-automatic for the ANS built on the language $L_{\frac{p}{q}}$, i.e., $S = (L_{\frac{p}{q}}, A_p, <)$.

2.4. Alternating morphisms.

The Kolakoski–Oldenburger word [24, A000002] is the unique word $k$ over $\{1, 2\}$ starting with 2 and satisfying $\Delta(k) = k$ where $\Delta$ is the run-length encoding map:
$$k = 2211212212211 \cdots.$$
It is a well-known (and challenging) object of study in combinatorics on words. It can be obtained by periodically iterating two morphisms, namely
$$h_0 : 1 \mapsto 2,\ 2 \mapsto 22 \quad \text{and} \quad h_1 : 1 \mapsto 1,\ 2 \mapsto 11.$$
More precisely, in [7], $k = k_0 k_1 k_2 \cdots$ is expressed as the fixed point of the iterated morphisms $(h_0, h_1)$, i.e.,
$$k = h_0(k_0) h_1(k_1) \cdots h_0(k_{2n}) h_1(k_{2n+1}) \cdots.$$
In the literature, one also finds the terminology PD0L for D0L system with periodic control [11, 13].
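The periodic iteration of two morphisms can be tried out with a short Python sketch. It assumes the morphisms $h_0 : 1 \mapsto 2,\ 2 \mapsto 22$ and $h_1 : 1 \mapsto 1,\ 2 \mapsto 11$, which reproduce the prefix of $k$ quoted above; the function names are illustrative.

```python
from itertools import groupby

def alternate_fixed_point(morphisms, start, length):
    """Prefix of the word obtained by applying morphisms[i % r] to the i-th letter.

    `morphisms` is a list of dicts mapping each letter to its image (a string).
    """
    r = len(morphisms)
    word = list(morphisms[0][start])
    # The first morphism must be prolongable on `start`: its image is start + x, x non-empty.
    assert len(word) >= 2 and word[0] == start
    i = 1
    while len(word) < length:
        word.extend(morphisms[i % r][word[i]])
        i += 1
    return "".join(word[:length])

def run_lengths(word):
    """Run-length encoding of a word, written as a string of digits."""
    return "".join(str(len(list(group))) for _, group in groupby(word))

h0 = {"1": "2", "2": "22"}   # applied at even positions
h1 = {"1": "1", "2": "11"}   # applied at odd positions
k = alternate_fixed_point([h0, h1], "2", 13)
print(k)               # 2211212212211
print(run_lengths(k))  # 22112122
```

The second line of output is again a prefix of the first, which is the self-describing property $\Delta(k) = k$ checked on a finite prefix.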

Definition 8.

Let $r \ge 1$ be an integer, let $A$ be a finite alphabet, and let $f_0, \ldots, f_{r-1}$ be $r$ morphisms over $A^*$. An infinite word $w = w_0 w_1 w_2 \cdots$ over $A$ is an alternate fixed point of $(f_0, \ldots, f_{r-1})$ if
$$w = f_0(w_0) f_1(w_1) \cdots f_{r-1}(w_{r-1}) f_0(w_r) \cdots f_{i \bmod r}(w_i) \cdots.$$
As observed by Dekking [8] for the Kolakoski word, an alternate fixed point can also be obtained by an $r$-block substitution.

Definition 9.

Let $r \ge 1$ be an integer and let $A$ be a finite alphabet. An $r$-block substitution $g : A^r \to A^*$ maps a word $w_0 \cdots w_{rn-1} \in A^*$ to
$$g(w_0 \cdots w_{r-1})\, g(w_r \cdots w_{2r-1}) \cdots g(w_{r(n-1)} \cdots w_{rn-1}).$$
If the length of the word is not a multiple of $r$, then the suffix of the word is ignored under the action of $g$. An infinite word $w = w_0 w_1 w_2 \cdots$ over $A$ is a fixed point of the $r$-block substitution $g : A^r \to A^*$ if
$$w = g(w_0 \cdots w_{r-1})\, g(w_r \cdots w_{2r-1}) \cdots.$$

Proposition 10.

Let $r \ge 1$ be an integer, let $A$ be a finite alphabet, and let $f_0, \ldots, f_{r-1}$ be $r$ morphisms over $A^*$. If an infinite word over $A$ is an alternate fixed point of $(f_0, \ldots, f_{r-1})$, then it is a fixed point of an $r$-block substitution.

Proof. For every length-$r$ word $a_0 \cdots a_{r-1} \in A^*$, define the $r$-block substitution $g : A^r \to A^*$ by $g(a_0 \cdots a_{r-1}) = f_0(a_0) \cdots f_{r-1}(a_{r-1})$. □
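The construction in the proof of Proposition 10 is easy to experiment with. The sketch below builds the block substitution from a list of morphisms and iterates it; the helper names are illustrative, and the two morphisms used for the demonstration are the ones assumed for the Kolakoski–Oldenburger word ($h_0 : 1 \mapsto 2,\ 2 \mapsto 22$ and $h_1 : 1 \mapsto 1,\ 2 \mapsto 11$).

```python
from itertools import product

def block_substitution(morphisms, alphabet):
    """Build g with g(a_0 ... a_{r-1}) = f_0(a_0) ... f_{r-1}(a_{r-1})."""
    return {
        "".join(block): "".join(f[a] for f, a in zip(morphisms, block))
        for block in product(alphabet, repeat=len(morphisms))
    }

def apply_block_substitution(g, word, r):
    """Apply g block by block; a trailing block shorter than r is ignored."""
    return "".join(g[word[i:i + r]] for i in range(0, len(word) - r + 1, r))

h0 = {"1": "2", "2": "22"}
h1 = {"1": "1", "2": "11"}
g = block_substitution([h0, h1], "12")
print(g)  # {'11': '21', '12': '211', '21': '221', '22': '2211'}

word = "22"
for _ in range(4):
    word = apply_block_substitution(g, word, 2)
print(word)  # 2211212212211
```

Each iteration extends the previous word, so repeated application converges to the fixed point of the block substitution.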

Thanks to the previous result, the Kolakoski–Oldenburger word $k$ is also a fixed point of the 2-block substitution
$$g : \begin{cases} 11 \mapsto h_0(1) h_1(1) = 21 \\ 12 \mapsto h_0(1) h_1(2) = 211 \\ 21 \mapsto h_0(2) h_1(1) = 221 \\ 22 \mapsto h_0(2) h_1(2) = 2211. \end{cases}$$
Observe that the lengths of images under $g$ are not all equal.

3. Concrete examples of automatic sequences

Let us present how the above concepts are linked with the help of some examples. The first one is our toy example.

Example 11.

Let $(s(n))_{n \ge 0}$ be the sum-of-digits in base $\frac{3}{2}$. This sequence was, in particular, studied in [10]. We have
$$(s(n))_{n \ge 0} = 0, 2, 3, 3, 5, 4, 5, 7, 5, 5, 7, 8, 5, 7, 6, 7, 9, \ldots.$$
We let $t$ denote the sequence $(s(n) \bmod 2)_{n \ge 0}$,
$$t = 00111011111011011 \cdots.$$
The sequence $t$ is $\frac{3}{2}$-automatic as the DFAO in Figure 3 generates $t$ when reading base-$\frac{3}{2}$ representations.

Figure 3. A DFAO generating the sum-of-digits in base $\frac{3}{2}$ modulo 2.

As a consequence of Proposition 16, it will turn out that $t$ is an alternate fixed point of $(f_0, f_1)$ with

(3.1) $f_0 : 0 \mapsto 00,\ 1 \mapsto 11$ and $f_1 : 0 \mapsto 1,\ 1 \mapsto 0$.

With Proposition 10, $t$ is also a fixed point of the 2-block substitution
$$g : \begin{cases} 00 \mapsto f_0(0) f_1(0) = 001 \\ 01 \mapsto f_0(0) f_1(1) = 000 \\ 10 \mapsto f_0(1) f_1(0) = 111 \\ 11 \mapsto f_0(1) f_1(1) = 110. \end{cases}$$
Observe that we have a 2-block substitution with images of length 3. This is not a coincidence, as we will see with Corollary 21.

Automatic sequences in integer bases are morphic words, i.e., images, under a coding, of a fixed point of a prolongable morphism [2]. As shown by the next example, there are $\frac{3}{2}$-automatic sequences that are not morphic. For a word $u \in \{0, 1\}^*$, we let $\bar{u}$ denote the word obtained by applying the involution $i \mapsto 1 - i$, $i \in \{0, 1\}$, to the letters of $u$.
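Returning to Example 11, the two descriptions of $t$ can be compared numerically. The sketch below is only an illustration: it assumes a two-state DFAO whose state is the parity of the digits read so far, and it reads the morphisms off the signature $(02, 1)^\omega$ of $L_{\frac{3}{2}}$ by listing, for each state, the states reached with the letters of each word of the signature. All function and variable names are illustrative.

```python
def rep_three_halves(n):
    """Digits (most significant first) of n in base 3/2."""
    digits = []
    while n > 0:
        n, a = divmod(2 * n, 3)
        digits.append(a)
    return digits[::-1]

def delta(state, digit):
    """Transition of the assumed DFAO: the state is the parity of the digit sum."""
    return (state + digit) % 2

def t_from_dfao(n):
    state = 0
    for d in rep_three_halves(n):
        state = delta(state, d)
    return state  # the output of a state is the state itself

# One morphism per word of the signature (02, 1)^omega:
# the image of state q lists the states reached from q by the letters of that word.
signature = [(0, 2), (1,)]
morphisms = [{q: [delta(q, d) for d in w] for q in (0, 1)} for w in signature]

def alternate_fixed_point(morphisms, start, length):
    word = list(morphisms[0][start])
    i = 1
    while len(word) < length:
        word.extend(morphisms[i % len(morphisms)][word[i]])
        i += 1
    return word[:length]

N = 17
from_dfao = [t_from_dfao(n) for n in range(N)]
from_morphisms = alternate_fixed_point(morphisms, 0, N)
print("".join(map(str, from_dfao)))   # 00111011111011011
print(from_dfao == from_morphisms)    # True
```

The computed morphisms are `{0: [0, 0], 1: [1, 1]}` and `{0: [1], 1: [0]}`, i.e., a 2-uniform and a 1-uniform morphism, so that consecutive pairs of letters have images of total length 3.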

Example 12.

Lepistö considered in [13] the following 2-block substitution
$$h : \begin{cases} 00 \mapsto g(0)\bar{0} = 011 \\ 01 \mapsto g(0)\bar{1} = 010 \\ 10 \mapsto g(1)\bar{0} = 001 \\ 11 \mapsto g(1)\bar{1} = 000 \end{cases} \quad \text{with } g : 0 \mapsto 01,\ 1 \mapsto 00,$$
producing the word $F = 01001100001 \cdots$. He showed that the factor complexity $p_F$ of this word satisfies $p_F(n) > \delta n^t$ for some $\delta > 0$ and $t > 2$. Hence, this word cannot be purely morphic nor morphic (because these kinds of words have a factor complexity in $O(n^2)$ [21]). With Proposition 17, we can show that $F$ is a $\frac{3}{2}$-automatic sequence generated by the DFAO depicted in Figure 4.

Figure 4. A DFAO generating $F$.

Remark 13.

Similarly, the non-morphic word $F_p$ introduced in [13] is $\frac{p+1}{p}$-automatic. It is generated by the $p$-block substitution defined by $h_p(au) = g(a)\bar{u}$ for $a \in \{0, 1\}$ and $u \in \{0, 1\}^{p-1}$, where $g$ is defined in Example 12.

We conclude this section with an example of an automatic sequence associated with a language coming from a periodic signature.

Example 14.

Consider the periodic labeled signature $s = (023, \ldots)^\omega$, made of three words, producing the i-tree in Figure 5. The first words in $L(s)$ are $\varepsilon, 2, 3, \ldots$; the first twelve of them give the representations of the first 12 integers in the abstract numeration system $S = (L(s), A, <)$, where $A$ is the digit alphabet of $s$. For instance, $\mathrm{rep}_S(15) = 2121$ as the path of label 2121 leads to the node 15 in Figure 5. The sum-of-digits in $S$ modulo 2, starting with
$$001100110101 \cdots,$$
is $S$-automatic since it is generated by the DFAO in Figure 6. As a consequence of Proposition 16 and Theorem 20, we will see that this sequence is also the coding of an alternate fixed point of three morphisms.

4. Cobham's theorem

Cobham's theorem from 1972 states that a sequence is $k$-automatic if and only if it is the image under a coding of the fixed point of a $k$-uniform morphism [6] (or see [2, Theorem 6.3.2]). This result has been generalized to various contexts: numeration systems associated with a substitution, Pisot numeration systems, Bertrand numeration systems, ANS with regular languages, and so on [4, 9, 14, 22]. Also see [12] or [23] for a comprehensive presentation. In this section, we adapt it to the case of $S$-automatic sequences built on tree languages with a periodic labeled signature (so, in particular, to the rational base case). We start off with a technical lemma.

Figure 5. The tree associated with the signature $s = (023, \ldots)^\omega$ of Example 14.

Figure 6. A DFAO generating the sum-of-digits modulo 2 in the ANS $S = (L(s), A, <)$ where $s = (023, \ldots)^\omega$ is the signature of Example 14.

Lemma 15.

Let $r \ge 1$ be an integer, let $A$ be a finite alphabet, and let $f_0, \ldots, f_{r-1}$ be morphisms over $A^*$. Let $x = x_0 x_1 x_2 \cdots$ be an alternate fixed point of $(f_0, \ldots, f_{r-1})$. For all $m \ge 0$, we have
$$f_{m \bmod r}(x_m) = x_i \cdots x_{i + |f_{m \bmod r}(x_m)| - 1}$$
where $i = \sum_{j=0}^{m-1} |f_{j \bmod r}(x_j)|$.

Proof. Let $m \ge 0$. From the definition of an alternate fixed point, we have the factorization
$$x = u\, f_{m \bmod r}(x_m)\, f_{(m+1) \bmod r}(x_{m+1}) \cdots$$
where
$$u = f_0(x_0) f_1(x_1) \cdots f_{r-1}(x_{r-1}) f_0(x_r) \cdots f_{(m-1) \bmod r}(x_{m-1}).$$
Now $|u| = \sum_{j=0}^{m-1} |f_{j \bmod r}(x_j)|$, which concludes the proof. □

Given an $S$-automatic sequence associated with the language of a tree with a purely periodic labeled signature, we can turn it into an alternate fixed point of uniform morphisms.

Proposition 16.

Let $r \ge 1$ be an integer and let $A$ be a finite alphabet of digits. Let $w_0, \ldots, w_{r-1}$ be $r$ non-empty words in $\mathrm{inc}(A^*)$. Consider the language $L(s)$ of the i-tree generated by the purely periodic signature $s = (w_0, w_1, \ldots, w_{r-1})^\omega$. Let $\mathcal{A} = (Q, q_0, A, \delta)$ be a DFA. For $i \in \{0, \ldots, r-1\}$, we define the $r$ morphisms from $Q^*$ to itself by
$$f_i : Q \to Q^{|w_i|},\quad q \mapsto \delta(q, w_{i,0}) \cdots \delta(q, w_{i,|w_i|-1}),$$
where $w_{i,j}$ denotes the $j$th letter of $w_i$. The alternate fixed point $x = x_0 x_1 \cdots$ of $(f_0, \ldots, f_{r-1})$ starting with $q_0$ is the sequence of states reached in $\mathcal{A}$ when reading the words of $L(s)$ in increasing radix order, i.e., for all $n \ge 0$, $x_n = \delta(q_0, \mathrm{rep}_S(n))$ with $S = (L(s), A, <)$.

Proof. Up to renaming the letters of $w_0$, without loss of generality we may assume that $w_0 = 0x$ with $x \in A^+$.

We proceed by induction on $n \ge 0$. It is clear that $x_0 = \delta(q_0, \varepsilon) = q_0$. Let $n \ge 1$. Assume that the property holds for all values less than $n$ and we prove it for $n$. Write $\mathrm{rep}_S(n) = a_\ell \cdots a_1 a_0$. This means that in the i-tree generated by $s$, we have a path of label $a_\ell \cdots a_0$ from the root. We identify words in $L(s)$ with vertices of the i-tree.

Since $L(s)$ is prefix-closed, there exists an integer $m < n$ such that $\mathrm{rep}_S(m) = a_\ell \cdots a_1$. Let $i = m \bmod r$. By definition of the periodic labeled signature $s$, in the i-tree generated by $s$, reading $a_\ell \cdots a_1$ from the root leads to a node having $|w_i|$ children that are reached with edges labeled by the letters of $w_i$. Since $w_i \in \mathrm{inc}(A^*)$, the letter $a_0$ occurs exactly once in $w_i$, so assume that $w_{i,j} = a_0$ for some $j \in \{0, \ldots, |w_i| - 1\}$. By construction of the i-tree given by a periodic labeled signature (see Figure 7 for a pictorial description), we have that

(4.1) $n = \sum_{v \in L(s),\ v < \mathrm{rep}_S(m)} \deg(v) + j = \sum_{k=0}^{m-1} |w_{k \bmod r}| + j.$

By the induction hypothesis, we obtain
$$\delta(q_0, \mathrm{rep}_S(n)) = \delta(\delta(q_0, \mathrm{rep}_S(m)), a_0) = \delta(x_m, a_0)$$
and by definition of $f_i$, we get $\delta(x_m, a_0) = [f_i(x_m)]_j = [f_{m \bmod r}(x_m)]_j$. From Lemma 15 and Equation (4.1), this is exactly $x_n$, as desired. □

Given an alternate fixed point of uniform morphisms, we can turn it into an $S$-automatic sequence for convenient choices of a language of a tree with a purely periodic labeled signature and a DFAO.

Proposition 17.

Let $r \ge 1$ be an integer and let $A$ be a finite alphabet. Let $f_0, \ldots, f_{r-1} : A^* \to A^*$ be $r$ uniform morphisms of respective length $\ell_0, \ldots, \ell_{r-1}$ such that $f_0$ is prolongable on some letter $a \in A$, i.e., $f_0(a) = ax$ with $x \in A^+$. Let $x = x_0 x_1 \cdots$ be the alternate fixed point of $(f_0, \ldots, f_{r-1})$ starting with $a$. Consider the language $L(s)$ of the i-tree generated by the purely periodic labeled signature
$$s = \Bigl( 0\,1 \cdots (\ell_0 - 1),\ \ell_0 (\ell_0 + 1) \cdots (\ell_0 + \ell_1 - 1),\ \ldots,\ \Bigl(\sum_{j < r-1} \ell_j\Bigr) \cdots \Bigl(\sum_{j < r} \ell_j - 1\Bigr) \Bigr)^\omega$$
over the alphabet $B = \{0, \ldots, \sum_{j < r} \ell_j - 1\}$. Let $\mathcal{A}$ be the DFA with set of states $A$, initial state $a$, input alphabet $B$, and transition function $\delta$ given by $\delta\bigl(b, \sum_{k < j} \ell_k + t\bigr) = [f_j(b)]_t$ for all $b \in A$, $j \in \{0, \ldots, r-1\}$ and $t \in \{0, \ldots, \ell_j - 1\}$. Then the word $x$ is the sequence of the states reached in $\mathcal{A}$ when reading the words of $L(s)$ by increasing radix order, i.e., $x_n = \delta(a, \mathrm{rep}_S(n))$ with $S = (L(s), B, <)$.

Figure 7. Illustration of Equation (4.1).

Proof. We again proceed by induction on $n \ge 0$. It is clear that $x_0 = a = \delta(a, \varepsilon)$. Let $n \ge 1$. Assume the property holds for all values less than $n$ and we prove it for $n$. Write $\mathrm{rep}_S(n) = a_\ell \cdots a_1 a_0$. This means that in the i-tree with a periodic labeled signature $s$, we have a path of label $a_\ell \cdots a_0$ from the root. We identify words in $L(s) \subseteq B^*$ with vertices of the i-tree.

Since $L(s)$ is prefix-closed, there exists $m < n$ such that $\mathrm{rep}_S(m) = a_\ell \cdots a_1$. Let $j = m \bmod r$. In the i-tree generated by $s$, reading $a_\ell \cdots a_1$ from the root leads to a node having $\ell_j$ children that are reached with edges labeled by
$$\sum_{k \le j-1} \ell_k,\ \sum_{k \le j-1} \ell_k + 1,\ \ldots,\ \sum_{k \le j} \ell_k - 1.$$
Observe that the words in $s$ belong to $\mathrm{inc}(B^*)$. Therefore the letter $a_0$ occurs exactly once in $B$ and in particular amongst those labels, assume that $a_0 = \sum_{k \le j-1} \ell_k + t$ for some $t \in \{0, \ldots, \ell_j - 1\}$. By construction of the i-tree, we have that

(4.2) $n = \sum_{v \in L(s),\ v < \mathrm{rep}_S(m)} \deg(v) + t = \sum_{i=0}^{m-1} \ell_{i \bmod r} + t.$

By the induction hypothesis, we obtain
$$\delta(a, \mathrm{rep}_S(n)) = \delta(\delta(a, \mathrm{rep}_S(m)), a_0) = \delta(x_m, a_0)$$
and by definition of the transition function, $\delta(x_m, a_0) = [f_j(x_m)]_t = [f_{m \bmod r}(x_m)]_t$. From Lemma 15 and Equation (4.2), this is exactly $x_n$. □

Remark 18.

What matters in the above statement is that two distinct words of the signature $s$ do not share any common letter. It mainly ensures that the choice of the morphism to apply when defining $\delta$ is uniquely determined by the letter to be read.

Example 19.

If we consider the morphisms in (3.1), Proposition 17 provides us with the signature $s = (01, 2)^\omega$ instead of the signature $(02, 1)^\omega$ of $L_{\frac{3}{2}}$. We will produce the sequence $t$ using the language $h(L_{\frac{3}{2}})$ where the coding $h$ is defined by $h(0) = 0$, $h(1) = 2$ and $h(2) = 1$, and in the DFAO in Figure 3, the same coding is applied to the labels of the transitions. What matters is the form of the tree (i.e., the sequence of degrees of the vertices) rather than the labels themselves.

Theorem 20.

Let $A, B$ be two finite alphabets. An infinite word over $B$ is the image under a coding $g : A \to B$ of an alternate fixed point of uniform morphisms (not necessarily of the same length) over $A$ if and only if it is $S$-automatic for an abstract numeration system $S$ built on a tree language with a purely periodic labeled signature.

Proof. The forward direction follows from Proposition 17: define a DFAO where the output function $\tau$ is obtained from the coding $g : A \to B$ defined by $\tau(b) = g(b)$ for all $b$ in $A$. The reverse direction directly follows from Proposition 16. □

We are able to say more in the special case of rational bases. The tree language associated with the rational base $\frac{p}{q}$ has a periodic signature of the form $(w_0, \ldots, w_{q-1})^\omega$ with $\sum_{i=0}^{q-1} |w_i| = p$ and $w_i \in A_p^*$ for all $i$. See Remark 6 for examples.

Corollary 21.

If a sequence is $\frac{p}{q}$-automatic, then it is the image under a coding of a fixed point of a $q$-block substitution whose images all have length $p$.

Proof. Let $(w_0, \ldots, w_{q-1})^\omega$ denote the periodic signature in base $\frac{p}{q}$. Proposition 16 provides $q$ morphisms $f_i$ that are respectively $|w_i|$-uniform. By Proposition 10, the alternate fixed point of $(f_0, \ldots, f_{q-1})$ is a fixed point of a $q$-block substitution $g$ such that, for any length-$q$ word $a_0 \cdots a_{q-1}$,
$$|g(a_0 \cdots a_{q-1})| = |f_0(a_0) f_1(a_1) \cdots f_{q-1}(a_{q-1})| = \sum_{i=0}^{q-1} |w_i| = p.$$
□

5. Decorating trees and subtrees

As already observed in Section 2.2, a prefix-closed language $L$ over an ordered (finite) alphabet $(A, <)$ gives an ordered labeled tree $T(L)$ in which edges are labeled by letters in $A$. Labels of paths from the root to nodes provide a one-to-one correspondence between nodes in $T(L)$ and words in $L$. We now add an extra piece of information, such as a color, on every node. This information is provided by a sequence taking finitely many values.

Definition 22.

Let $T = (V, E)$ be a rooted ordered infinite tree, i.e., each node has a finite (ordered) sequence of children. As observed in Remark 4, the canonical breadth-first traversal of $T$ gives an abstract numeration system, namely an enumeration of the nodes: $v_0, v_1, v_2, \ldots$. Let $x = x_0 x_1 \cdots$ be an infinite word over a finite alphabet $B$. A decoration of $T$ by $x$ is a map from $V$ to $B$ associating with the node $v_n$ the decoration (or color) $x_n$, for all $n \ge 0$. We respectively call label and decoration the labeling of the edges and of the nodes of a tree.

Example 23.

In Figure 8 are depicted a prefix of $T(L_{\frac{3}{2}})$ decorated with the sequence $t$ of Example 11 and a prefix of the tree $T(L_2)$ associated with the binary numeration system (see (2.1)) and decorated with the Thue–Morse sequence $0110100110010110 \cdots$. In these trees, the symbol 0 (respectively 1) is denoted by a black (respectively red) decorated node.

Figure 8. Prefixes of height 4 of two decorated trees.

We use the terminology of [3] where Sturmian trees are studied; it is relevant to consider (labeled and decorated) factors occurring in trees.
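The decorated tree of the binary numeration system from Example 23 can be generated level by level. The sketch below is an illustration only: nodes are identified with the integers they represent, so that the child of node $n$ reached by the digit $d$ is $kn + d$, and the function names are illustrative.

```python
def thue_morse(n):
    """n-th term of the Thue-Morse sequence: parity of the binary digit sum."""
    return bin(n).count("1") % 2

def children_base_k(n, k):
    """(label, child) pairs of node n in the tree of the base-k language.

    Appending digit d to the representation of n gives k*n + d; the root has
    no child labeled 0 because representations have no leading zero.
    """
    return [(d, k * n + d) for d in range(k) if (n, d) != (0, 0)]

def decorated_levels(height, k, decoration):
    """Decorations of the nodes, level by level, from the root down to `height`."""
    level, levels = [0], []
    for _ in range(height + 1):
        levels.append([decoration(n) for n in level])
        level = [child for n in level for _, child in children_base_k(n, k)]
    return levels

for row in decorated_levels(4, 2, thue_morse):
    print(row)
# [0]
# [1]
# [1, 0]
# [1, 0, 0, 1]
# [1, 0, 0, 1, 0, 1, 1, 0]
```

Reading the rows one after the other gives back the prefix $0110100110010110$ of the Thue–Morse sequence, since the breadth-first traversal visits the nodes in increasing order of the integers they represent.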

Definition 24.

The domain $\mathrm{dom}(T)$ of a labeled tree $T$ is the set of labels of paths from the root to its nodes. In particular, $\mathrm{dom}(T(L)) = L$ for any prefix-closed language $L$ over an ordered (finite) alphabet. The truncation of a tree at height $h$ is the restriction of the tree to the domain $\mathrm{dom}(T) \cap A^{\le h}$.

Let $L$ be a prefix-closed language over $(A, <)$ and $x = x_0 x_1 \cdots$ be an infinite word over some finite alphabet $B$. (We could use an ad hoc notation like $T_x(L)$ but in any case we only work with decorated trees and it would make the presentation cumbersome.) From now on, we consider the labeled tree $T(L)$ decorated by $x$. For all $n \ge 0$, the $n$th word $w_n$ in $L$ corresponds to the $n$th node of $T(L)$ decorated by $x_n$. Otherwise stated, for the ANS $S = (L, A, <)$ built on $L$, if $w \in L$, the node corresponding to $w$ in $T(L)$ has decoration $x_{\mathrm{val}_S(w)}$.

Definition 25.

Let $w \in L$. We let $T[w]$ denote the subtree of $T$ having $w$ as root. Its domain is $w^{-1}L = \{u \mid wu \in L\}$. We say that $T[w]$ is a suffix of $T$.

For any $h \ge 0$, we let $T[w, h]$ denote the factor of height $h$ rooted at $w$, which is the truncation of $T[w]$ at height $h$. The prefix of height $h$ of $T$ is the factor $T[\varepsilon, h]$. Two factors $T[w, h]$ and $T[w', h]$ of the same height are equal if they have the same domain and the same decorations, i.e., $x_{\mathrm{val}_S(wu)} = x_{\mathrm{val}_S(w'u)}$ for all $u \in \mathrm{dom}(T[w, h]) = \mathrm{dom}(T[w', h])$. We let
$$F_h = \{T[w, h] \mid w \in L\}$$
denote the set of factors of height $h$ occurring in $T$. The tree $T$ is rational if it has finitely many suffixes.

Note that, due to Remark 6, with any decoration, even constant, the tree $T(L_{\frac{p}{q}})$ is not rational.

In Figure 9, we have depicted the factors of height 2 occurring in $T(L_{\frac{3}{2}})$ decorated by $t$. In Figure 10, we have depicted the factors of height 2 occurring in $T(L_2)$ decorated by the Thue–Morse sequence. In this second example, except for the prefix of height 2, observe that a factor of height 2 is completely determined by the decoration of its root.

Figure 9. The 9 factors of height 2 in $T(L_{\frac{3}{2}})$ decorated by $t$. The first one is the prefix occurring only once.

Figure 10. The 3 factors of height 2 in $T(L_2)$ decorated by the Thue–Morse sequence. The first one is the prefix occurring only once.

Since every factor of height $h$ is the prefix of a factor of height $h + 1$, we trivially have $\#F_{h+1} \ge \#F_h$. This is quite similar to factors occurring in an infinite word: any factor has at least one extension. In particular, ultimately periodic words are characterized by a bounded factor complexity.

Lemma 26. [3, Proposition 1]

Let $L$ be a prefix-closed language over $(A, <)$ and let $\mathbf{x} = x_0 x_1 \cdots$ be an infinite word over some finite alphabet $B$. Consider the labeled tree $T(L)$ decorated by $\mathbf{x}$. The tree $T(L)$ is rational if and only if $\#F_h = \#F_{h+1}$ for some $h \ge 0$. In particular, $\#F_h = \#F_{h+n}$ for all $n \ge 0$.
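As a rough illustration of Lemma 26 in the integer-base case, the following Python sketch enumerates factors $T[w,h]$ of $T(L_2)$ decorated by the Thue–Morse sequence and counts them for small heights. The helper names (`thue_morse`, `factor`, `factor_counts`) and the cut-off `n_nodes` are ours and purely illustrative. Since only finitely many nodes are scanned, the counts are in general only lower bounds on $\#F_h$.

```python
from itertools import product

def thue_morse(n):
    # Parity of the number of 1's in the binary expansion of n.
    return bin(n).count("1") % 2

def factor(n, h, k, x):
    """Factor T[rep_k(n), h] of T(L_k) decorated by x.

    Encoded as a tuple of (u, decoration) pairs, u running through the
    domain in radix order, so that equal tuples mean equal factors.
    """
    items = []
    for length in range(h + 1):
        for u in product(range(k), repeat=length):
            # The root (n = 0) has no child labeled 0: no leading zeroes in L_k.
            if n == 0 and length > 0 and u[0] == 0:
                continue
            value = n
            for d in u:
                value = k * value + d  # val_k(wu) = k^{|u|} val_k(w) + val_k(u)
            items.append((u, x(value)))
    return tuple(items)

def factor_counts(k, x, max_h, n_nodes):
    # Number of distinct factors of each height among the first n_nodes nodes.
    return [len({factor(n, h, k, x) for n in range(n_nodes)})
            for h in range(max_h + 1)]

print(factor_counts(2, thue_morse, 3, 200))  # [2, 3, 3, 3]
```

Under these assumptions the value 3 obtained for $h = 2$ agrees with Figure 10, and the equality of consecutive counts is the stabilization asked for in Lemma 26.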

We can characterize $S$-automatic sequences built on a prefix-closed regular language $L$ in terms of the decorated tree $T(L)$. For the sake of presentation, we mainly focus on the case of $k$-automatic sequences. The reader can relate our construction to the $k$-kernel of a sequence. Roughly, each element of the $k$-kernel corresponds to reading one fixed suffix $u$ from each node $w$ of the tree $T(L_k)$. We have $\mathrm{val}_k(wu) = k^{|u|}\,\mathrm{val}_k(w) + \mathrm{val}_k(u)$, and an element of the $k$-kernel is a sequence of the form $(x_{k^{|u|} n + \mathrm{val}_k(u)})_{n \ge 0}$.

Theorem 27.

Let $k \ge 2$ be an integer. A sequence $\mathbf{x}$ is $k$-automatic if and only if the labeled tree $T(L_k)$ decorated by $\mathbf{x}$ is rational.

Proof. Let us prove the forward direction. If $\mathbf{x}$ is $k$-automatic, there exists a DFAO $\mathcal{A} = (Q, q_0, A_k, \delta, \tau)$ producing it when fed with base-$k$ representations of integers. Let $w \in L_k$ be a non-empty base-$k$ representation and let $h \ge 0$. The factor $T[w,h]$ is completely determined by the state $\delta(q_0, w)$. Indeed, it is a full $k$-ary tree of height $h$ and the decorations are given by $\tau(\delta(q_0, wu))$ for $u$ running through $A_k^{\le h}$ in radix order. For the empty word, however, the prefix $T[\varepsilon,h]$ is decorated by $\tau(\delta(q_0, u))$ for $u$ running through $\{\varepsilon\} \cup \{1, \ldots, k-1\} A_k^{<h}$. Hence $\#F_h \le \#Q + 1$ for all $h \ge 0$. Since $h \mapsto \#F_h$ is non-decreasing, there exists $H \ge 0$ such that $\#F_H = \#F_{H+1}$. We conclude by using Lemma 26.

Let us prove the other direction. Assume that the tree $T(L_k)$ is rational. In particular, there exists an integer $h \ge 0$ such that $\#F_h = \#F_{h+1}$. This means that any factor of height $h$ can be extended in a unique way to a factor of height $h+1$, i.e., if $T[w,h] = T[w',h]$ for two words $w, w' \in L_k$, then $T[w,h+1] = T[w',h+1]$. This factor of height $h+1$ is made of a root and $k$ subtrees of height $h$ attached to it. So, for each copy of $T[w,h]$ in the tree $T(L_k)$, the same $k$ trees $T[w0,h], \ldots, T[w(k-1),h]$ are attached to its root. The same observation holds for the prefix of the tree, except that only the $k-1$ trees $T[1,h], \ldots, T[k-1,h]$ are attached to the root.

We thus define a DFAO $\mathcal{F}$ whose set of states is $F_h$ and whose transition function is given by
$$\forall i \in A_k : \quad \delta(T[w,h], i) = T[wi,h].$$
The initial state is given by the prefix $T[\varepsilon,h]$ and we set $\delta(T[\varepsilon,h], 0) = T[\varepsilon,h]$. Finally, the output function maps a factor $T[w,h]$ to the decoration of its root $w$, that is, $x_{\mathrm{val}_k(w)}$. For each $n \ge 0$, $x_n$ is the decoration of the $n$th node in $T(L_k)$ by definition. To conclude the proof of the backward direction, we have to show that $x_n$ is the output of $\mathcal{F}$ when fed with $\mathrm{rep}_k(n)$. This follows from the definition of $\mathcal{F}$: starting from the initial state $T[\varepsilon,h]$, we reach the state $T[\mathrm{rep}_k(n),h]$ and the output is $x_{\mathrm{val}_k(\mathrm{rep}_k(n))} = x_n$. □

We extend the previous result to ANS with a regular numeration language.
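The backward direction of the proof of Theorem 27 is constructive. As a minimal sketch, assuming base 2, the Thue–Morse sequence and the height $h = 1$, the following Python code builds the DFAO whose states are the factors of height $h$ and whose transitions are $\delta(T[w,h], i) = T[wi,h]$. All names (`build_dfao`, `run`, `n_nodes`) are hypothetical, and we assume that every factor of height $h$ already shows up among the first `n_nodes` nodes.

```python
from itertools import product

def thue_morse(n):
    return bin(n).count("1") % 2

def factor(n, h, k, x):
    # Factor T[rep_k(n), h] as a tuple of (u, decoration) pairs in radix order.
    items = []
    for length in range(h + 1):
        for u in product(range(k), repeat=length):
            if n == 0 and length > 0 and u[0] == 0:
                continue  # no leading zeroes below the root
            value = n
            for d in u:
                value = k * value + d
            items.append((u, x(value)))
    return tuple(items)

def build_dfao(k, x, h, n_nodes):
    """Return (initial state, transition dict, output dict) with states in F_h."""
    delta, output = {}, {}
    for n in range(n_nodes):
        state = factor(n, h, k, x)
        output[state] = x(n)  # decoration of the root of the factor
        for i in range(k):
            # For n = 0 and i = 0 this is the loop on the prefix T[eps, h].
            target = factor(k * n + i, h, k, x)
            if delta.setdefault((state, i), target) != target:
                raise ValueError("a factor of height h has two extensions")
    return factor(0, h, k, x), delta, output

def run(dfao, n, k):
    # Feed rep_k(n), most significant digit first, and return the output.
    state, delta, output = dfao
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    for d in reversed(digits):
        state = delta[(state, d)]
    return output[state]

dfao = build_dfao(2, thue_morse, 1, 64)
print(all(run(dfao, n, 2) == thue_morse(n) for n in range(1000)))  # True
```

The `ValueError` branch corresponds to a height $h$ for which $\#F_h < \#F_{h+1}$, in which case the transition function would not be well defined.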

Theorem 28.

Let $S = (L, A, <)$ be an ANS built on a prefix-closed regular language $L$. A sequence $\mathbf{x}$ is $S$-automatic if and only if the labeled tree $T(L)$ decorated by $\mathbf{x}$ is rational.

Proof. The proof follows exactly the same lines as for integer base numeration systems. The only refinement is the following one. A factor $T[w,h]$ of $T(L)$ is determined by $w^{-1}L \cap A^{\le h}$ and $\delta(q_0, w)$. Since $L$ is regular, the set $\{w^{-1}L \cap A^{\le h} \mid w \in A^*\}$ is finite. Thus $\#F_h$ is bounded by $\#Q$ times the number of states of the minimal automaton of $L$. □

Rational bases.

We now turn to rational base numeration systems. The domain of a factor of height $h$ in $T(L_{\frac{3}{2}})$ only depends on the value of its root modulo $2^h$. This result holds for any rational base numeration system.

Lemma 29. [16, Lemme 4.14]

Let $w, w' \in L_{\frac{p}{q}}$ be non-empty words and let $u \in A_p^*$ be a word of length $h$.
• If $\mathrm{val}_{\frac{p}{q}}(w) \equiv \mathrm{val}_{\frac{p}{q}}(w') \bmod q^h$, then $u \in w^{-1}L_{\frac{p}{q}}$ if and only if $u \in (w')^{-1}L_{\frac{p}{q}}$.
• If $u \in w^{-1}L_{\frac{p}{q}} \cap (w')^{-1}L_{\frac{p}{q}}$, then $\mathrm{val}_{\frac{p}{q}}(w) \equiv \mathrm{val}_{\frac{p}{q}}(w') \bmod q^h$.

In the previous lemma, the empty word behaves differently. For a non-empty word $w \in L_{\frac{p}{q}}$ with $\mathrm{val}_{\frac{p}{q}}(w) \equiv 0 \bmod q^h$, a word $u \in A_p^h$ not starting with 0 satisfies $u \in \varepsilon^{-1}L_{\frac{p}{q}}$ if and only if $u \in w^{-1}L_{\frac{p}{q}}$. Therefore the prefix of the tree $T(L_{\frac{p}{q}})$ has to be treated separately.

Lemma 30. [16, Corollaire 4.17] Every word $u \in A_p^*$ is a suffix of a word in $L_{\frac{p}{q}}$.

As a consequence of these lemmas, $\{w^{-1}L_{\frac{p}{q}} \cap A_p^h \mid w \in A_p^+\}$ is a partition of $A_p^h$ into $q^h$ non-empty languages. Otherwise stated, in the tree $T(L_{\frac{p}{q}})$ with no decoration or, equivalently, with a constant decoration for all nodes, there are $q^h + 1$ factors of height $h \ge 1$ (the extra one is the height-$h$ prefix, which has a different shape). For instance, if the decorations in Figure 9 are not taken into account, there are $5 = 2^2 + 1$ height-2 factors occurring in $T(L_{\frac{3}{2}})$.

Except for the height-$h$ prefix, each factor of height $h$ is extended in exactly $q$ ways to a factor of height $h+1$. To the first (leftmost) leaf of a factor of height $h$ are attached children corresponding to one of the $q$ words of the periodic signature. To the next leaves on the same level are periodically attached as many nodes as the lengths of the successive words of the signature. For instance, in the case $\frac{p}{q} = \frac{3}{2}$, the first (leftmost) leaf of a factor of height $h$ becomes a node of degree either 1 (label 1) or 2 (labels 0 and 2) to get a factor of height $h+1$. The next leaves on the same level periodically become nodes of degree 2 or 1 accordingly. An example is depicted in Figure 11.

Lemma 31.

Let $\mathbf{x}$ be a $\frac{p}{q}$-automatic sequence produced by the DFAO $\mathcal{A} = (Q, q_0, A_p, \delta, \tau)$ and let $T(L_{\frac{p}{q}})$ be decorated by $\mathbf{x}$. For all $h \ge 0$, the number $\#F_h$ of height-$h$ factors of $T(L_{\frac{p}{q}})$ is bounded by $1 + q^h \cdot \#Q$.

Proof. Let $w \in L_{\frac{p}{q}}$ be a non-empty base-$\frac{p}{q}$ representation and let $h \ge 1$. We claim that the factor $T[w,h]$ is completely determined by the word $w$. First, from Lemma 29, the labeled tree $T[w,h]$ of height $h$ with root $w$ and, in particular, its domain only depend on $\mathrm{val}_{\frac{p}{q}}(w)$ modulo $q^h$. Indeed, if $w, w' \in L_{\frac{p}{q}}$ are such that $\mathrm{val}_{\frac{p}{q}}(w) \equiv \mathrm{val}_{\frac{p}{q}}(w') \bmod q^h$, then
$$\mathrm{dom}(T[w,h]) = w^{-1}L_{\frac{p}{q}} \cap A_p^{\le h} = w'^{-1}L_{\frac{p}{q}} \cap A_p^{\le h} = \mathrm{dom}(T[w',h]).$$
Second, the decorations of the factor $T[w,h]$ are given by $\tau(\delta(q_0, wu))$ for $u$ running through $\mathrm{dom}(T[w,h]) = w^{-1}L_{\frac{p}{q}} \cap A_p^{\le h}$ enumerated in radix order. So the decorations only depend on the state $\delta(q_0, w)$ of $\mathcal{A}$. Hence the number of such factors is bounded by $q^h \cdot \#Q$.

Similarly, the height-$h$ prefix $T[\varepsilon,h]$ is decorated by $\tau(\delta(q_0, u))$ for $u$ running through $\mathrm{dom}(T[\varepsilon,h]) = L_{\frac{p}{q}} \cap A_p^{\le h}$. Hence $\#F_h$ is bounded by $1 + q^h \cdot \#Q$, for all $h \ge 0$. □

Figure 11. For the rational base $\frac{3}{2}$, each factor of height $h = 2$ gives 2 factors of height $h + 1 = 3$. (The eight trees of height 3 shown have roots $8n, 8n+1, \ldots, 8n+7$.)

Definition 32.

A tree of height $h \ge 0$ has $h + 1$ levels: the level of a node is its distance to the root. Hence, the root is the only node of level 0 and the leaves have level $h$.

For instance, in Figure 11, each tree of height 3 has four levels.

Definition 33.

Let $T$ be a labeled decorated tree and let $h \ge 0$. We let $F^\infty_h \subseteq F_h$ denote the set of factors of height $h$ occurring infinitely often in $T$. For any suitable letter $a$ in the signature of $T$, we let $F^\infty_{h,a} \subseteq F^\infty_h$ denote the set of factors of height $h$ occurring infinitely often in $T$ such that the label of the edge between the first node on level $h-1$ and its first child is $a$. Otherwise stated, the first word of length $h$ in the domain of the factor ends with $a$.

Example 34.

In Figure 11, assuming that they occur infinitely often, the first four trees belong to $F^\infty_{3,c}$ for one of the two first letters $c \in \{0, 1\}$ of the signature words, and the last four, on the second row, belong to $F^\infty_{3,c'}$ for the other one.

Even though the language $L_{\frac{p}{q}}$ is highly non-regular, we can still handle a subset of $\frac{p}{q}$-automatic sequences. Roughly, with the next two theorems, we characterize $\frac{p}{q}$-automatic sequences in terms of the number of factors of a fixed height occurring infinitely often. As mentioned below, the first result applies in particular when distinct states of the DFAO producing the sequence have distinct outputs.

In the remainder of the section, we let $(w_0, \ldots, w_{q-1})$ denote the signature of $T(L_{\frac{p}{q}})$. For all $0 \le j \le q-1$ and $0 \le i \le |w_j| - 1$, we also let $w_{j,i}$ denote the $i$th letter of $w_j$.

Theorem 35.

Let $\mathbf{x}$ be a $\frac{p}{q}$-automatic sequence over a finite alphabet $B$ generated by a DFAO $\mathcal{A} = (Q, q_0, A_p, \delta, \tau : Q \to B)$ with the following property: there exists an integer $h$ such that, for all distinct states $q, q' \in Q$ and all words $w \in L_{\frac{p}{q}}$, there exists a word $u$ in $w^{-1}L_{\frac{p}{q}}$ of length at most $h$ such that $\tau(\delta(q,u)) \neq \tau(\delta(q',u))$. Then in the tree $T(L_{\frac{p}{q}})$ decorated by $\mathbf{x}$, we have, for all $0 \le j \le q-1$,
$$\#F^\infty_{h+1, w_{j,0}} \le \#F^\infty_h.$$

Proof.

Consider a factor of height $h$ occurring infinitely often, i.e., there is a sequence $(u_i)_{i \ge 1}$ of words in $L_{\frac{p}{q}}$ such that $T[u_1,h] = T[u_2,h] = T[u_3,h] = \cdots$. From Lemma 29, all values $\mathrm{val}_{\frac{p}{q}}(u_i)$ are congruent to $r$ modulo $q^h$ for some $0 \le r < q^h$. Thus the values of $\mathrm{val}_{\frac{p}{q}}(u_i)$ modulo $q^{h+1}$ that appear infinitely often take at most $q$ values (among $r, r + q^h, \ldots, r + (q-1)q^h$).

The extra assumption on the DFAO means that if two words $v, w \in L_{\frac{p}{q}}$ with $\mathrm{val}_{\frac{p}{q}}(v) \equiv \mathrm{val}_{\frac{p}{q}}(w) \bmod q^h$ are such that $\delta(q_0, v) \neq \delta(q_0, w)$, then $T[v,h] \neq T[w,h]$. Indeed, by assumption, there exists $u \in v^{-1}L_{\frac{p}{q}} \cap A_p^{\le h} = w^{-1}L_{\frac{p}{q}} \cap A_p^{\le h}$ such that $\tau(\delta(q_0, vu)) \neq \tau(\delta(q_0, wu))$. Hence, by contraposition, since $T[u_i,h] = T[u_j,h]$, we get $\delta(q_0, u_i) = \delta(q_0, u_j)$. Consequently, if $T[u_i,h+1]$ and $T[u_j,h+1]$ have the same domain, then $T[u_i,h+1] = T[u_j,h+1]$ because $\delta(q_0, u_i v) = \delta(q_0, u_j v)$ for all words $v \in \mathrm{dom}(T[u_i,h+1])$.

Consequently, no two distinct factors of height $h+1$ occurring infinitely often and having the same domain can have the same prefix of height $h$. Therefore, each factor $U$ of height $h$ occurring infinitely often gives rise to at most one factor $U'$ of height $h+1$ in every $F^\infty_{h+1, w_{j,0}}$ for $0 \le j \le q-1$ (indeed, $U$ and the first letter $w_{j,0}$ uniquely determine the domain of $U'$). □

Remark 36.

In the case of a $k$-automatic sequence, the assumption of the above theorem is always satisfied. We may apply the usual minimization algorithm based on indistinguishable states to the DFAO producing the sequence: two states $q, q'$ are distinguishable if there exists a word $u$ such that $\tau(\delta(q,u)) \neq \tau(\delta(q',u))$. The pairs $\{q, q'\}$ such that $\tau(q) \neq \tau(q')$ are distinguishable (by the empty word). Then proceed recursively: if a not yet distinguished pair $\{q, q'\}$ is such that $\delta(q,a) = p$ and $\delta(q',a) = p'$ for some letter $a$ and an already distinguished pair $\{p, p'\}$, then $\{q, q'\}$ is distinguished. The process stops when no new pair is distinguished, and we can merge states that belong to undistinguished pairs. In the resulting DFAO, any two states are distinguished by a word whose length is bounded by the number of states of the DFAO. We can thus apply the above theorem. Notice that for a $k$-automatic sequence, there is no restriction on the word distinguishing states since it belongs to $A_k^*$. The extra requirement that $w \in L_{\frac{p}{q}}$ is therefore important in the case of rational bases and is not present for base-$k$ numeration systems.

Remark 37.

For a rational base numeration system, the assumption of the above theorem is always satisfied if the output function $\tau$ is the identity; otherwise stated, if the output function maps distinct states to distinct values. This is for instance the case of our toy example $\mathbf{t}$. However, the assumption is not readily satisfied on examples such as the following one, with the DFAO depicted in Figure 12 reading base-$\frac{3}{2}$ representations.

Figure 12. A DFAO with two distinct outputs but four states.

For instance, the words $u = 212001220110220$ and $v = 212022000012021$ lead from the initial state to two distinct states, and they satisfy $u^{-1}L_{\frac{3}{2}} \cap A_3^4 = v^{-1}L_{\frac{3}{2}} \cap A_3^4 = \{1^4\}$ and $u^{-1}L_{\frac{3}{2}} \cap A_3^5 = v^{-1}L_{\frac{3}{2}} \cap A_3^5 = \{1^4 0, 1^4 2\}$. So $T[u,4] = T[v,4]$, because reading 1's from either of these two states leads to one of these two states, and they have the same output. But $T[u,5] \neq T[v,5]$, because the states reached from $u 1^4$ and from $v 1^4$ by reading one more letter have different outputs.

We can generalize the above example with the suffix $1^h$ for any $h$. From Lemma 30, the word $1^h$ occurs as a suffix of words in $L_{\frac{3}{2}}$. One may thus find words similar to $u$ and $v$ in the above computations. Actually, $\mathrm{val}_{\frac{3}{2}}(u) = 591$ and $\mathrm{val}_{\frac{3}{2}}(v) = 623$ are both congruent to $15 = 2^4 - 1$ modulo $2^4$ (so they can be followed by the suffix $1^4$), and $\mathrm{val}_{\frac{3}{2}}(u 1^4)$ and $\mathrm{val}_{\frac{3}{2}}(v 1^4)$ are both even (so they can be followed by either 0 or 2). To have a situation similar to the one with $u$ and $v$ above, we have to look for numbers $n$ which are congruent to $2^h - 1$ modulo $2^h$ and such that
$$n \left(\tfrac{3}{2}\right)^h + \mathrm{val}_{\frac{3}{2}}(1^h) = n \left(\tfrac{3}{2}\right)^h + \left(\tfrac{3}{2}\right)^h - 1$$
is even, i.e., $n = (2j+1)2^h - 1$ for some $j \ge 0$, and such that the run of the DFAO on $\mathrm{rep}_{\frac{3}{2}}(n)$ ends in one of the two states above.

Theorem 38.

Let $\mathbf{x}$ be a sequence over a finite alphabet $B$, and let the tree $T(L_{\frac{p}{q}})$ be decorated by $\mathbf{x}$. If there exists some $h \ge 1$ such that $\#F^\infty_{h+1, w_{j,0}} \le \#F^\infty_h$ for all $0 \le j \le q-1$, then $\mathbf{x}$ is $\frac{p}{q}$-automatic.

Proof. For the sake of readability, write $T = T(L_{\frac{p}{q}})$. The height-$h$ factors of $T$ occurring only a finite number of times appear in a prefix of the tree. Let $t \ge 1$ be an integer such that all the nodes at any level $\ell \ge t$ are roots of a factor in $F^\infty_h$.

We first define an NFA $\mathcal{T}$ in the following way. An illustration that we hope to be helpful is given below in Example 39. It is made (nodes and edges) of the prefix $T[\varepsilon, t+h-1]$ of height $t+h-1$ together with a copy of every element of $F^\infty_h$. So the set of states is the union of the nodes of the prefix $T[\varepsilon, t+h-1]$ and the nodes in the trees of $F^\infty_h$. Final states are all the nodes of the prefix $T[\varepsilon, t+h-1]$ and the nodes of level exactly $h$ in every element of $F^\infty_h$, i.e., the leaves of every element of $F^\infty_h$. The unique initial state is the root of the prefix $T[\varepsilon, t+h-1]$. We add the following extra transitions.
• If a node $m$ of level $t-1$ in the prefix $T[\varepsilon, t+h-1]$ has a child $n$ reached through an arc with label $d$, then in the NFA we add an extra transition with the same label $d$ from $m$ to the root of the element of $F^\infty_h$ equal to $T[n,h]$. This is well defined because $n$ has level $t$.
• Let $r$ be the root of an element $T[r,h]$ of $F^\infty_h$. Suppose that $r$ has a child $s$ reached through an arc with label $d$. The assumption in the statement means that, for each $c \in \{w_{0,0}, \ldots, w_{q-1,0}\}$, the element $T[r,h]$ in $F^\infty_h$ can be extended in at most one way to an element $U_c$ in $F^\infty_{h+1,c}$. The tree $U_c$ with root $r$ has a subtree of height $h$ with root $rd = s$, denoted by $V_{c,d} \in F^\infty_h$ (as depicted in Figure 13). In the NFA, we add extra transitions with label $d$ from $r$ to the root of $V_{c,d}$ (there are at most $q$ such trees).

Figure 13. Extension of a tree in $F^\infty_h$.

We will make use of the following unambiguity property of $\mathcal{T}$. Every word $u \in L_{\frac{p}{q}}$ is accepted by $\mathcal{T}$ and there is exactly one successful run for $u$ in $\mathcal{T}$. If the length of $u \in L_{\frac{p}{q}}$ is less than $t+h$, there is one successful run and it remains in the prefix $T[\varepsilon, t+h-1]$. If a run uses a transition between a node of level $t-1$ in $T[\varepsilon, t+h-1]$ and the root of an element in $F^\infty_h$, then by construction the word has to be of length at least $t+h$ to reach a final state. Now consider a word $u \in L_{\frac{p}{q}}$ of length $t+h+j$ with $j \ge 0$:
$$u = u_0 \cdots u_{t-1}\, u_t u_{t+1} \cdots u_{t+h-1} \cdots u_{t+h+j-1}.$$
Reading the prefix $u_0 \cdots u_{t-1}$ leads to the root of an element $U$ in $F^\infty_h$. Assume that this element can be extended in (at least) two ways to a tree of height $h+1$. This means that in $\mathcal{T}$, we have two transitions from the root of $U$ labeled by the next letter of $u$: one going to the root of some $V_1 \in F^\infty_{h,c_1}$ and one going to the root of some $V_2 \in F^\infty_{h,c_2}$. Note that $V_1$ and $V_2$ have the same prefix of height $h-1$. The difference appears precisely at level $h$, where the labeling is periodically $(w_e, w_{e+1}, \ldots, w_{q-1}, w_0, \ldots, w_{e-1})$ and $(w_f, w_{f+1}, \ldots, w_{q-1}, w_0, \ldots, w_{f-1})$ respectively, where $w_e$ (respectively $w_f$) starts with $c_1$ (respectively $c_2$) and the two $q$-tuples of words are cyclic shifts of the signature $(w_0, \ldots, w_{q-1})$ of $T$. Nevertheless, if $x$ has length $h-1$ and belongs to the domain of $V_1$ and thus of $V_2$, then $x c_1$ belongs to the domain of $V_1$ if and only if $x c_2$ belongs to the domain of $V_2$. So if we non-deterministically make the wrong choice of transition at step $t$, we will not be able to process the letter at position $t+h$. The choice of a transition determines the words of length $h$ that can be read from that point on. The same reasoning applies to the decision taken at step $t+j$ and the letter at position $t+h+j$.

We still have to turn $\mathcal{T}$ into a DFAO producing $\mathbf{x} \in B^{\mathbb{N}}$. To do so, we determinize $\mathcal{T}$ with the classical subset construction. Thanks to the unambiguity property of $\mathcal{T}$, if a subset of states obtained during the construction contains final states of $\mathcal{T}$, then they are all decorated by the same letter $b \in B$. The output of this state is thus set to $b$. If a subset of states obtained during the construction contains no final state, then its output is irrelevant (it can be set to any value). □

Example 39.

Consider the rational base $\frac{3}{2}$. Our aim is to illustrate the above theorem: we have information about factors of a decorated tree $T(L_{\frac{3}{2}})$ (those occurring infinitely often and those occurring only a finite number of times) and we want to build the corresponding $\frac{3}{2}$-automatic sequence. Assume that $t = h = 1$ and that factors of height 1 can be extended as in Figure 9. We assume that the last eight trees of height 2 occur infinitely often. Hence their four prefixes of height 1 have exactly two extensions. We assume that the prefix given by the first tree in Figure 9 occurs only once.

From this, we build the NFA $\mathcal{T}$ depicted in Figure 14. It is made of the prefix tree of height $t + h - 1 = 1$ and a copy of each of the four elements of $F^\infty_1$. Their respective leaves are final states. Finally, we have to inspect Figure 9 to determine the transitions connecting roots of these trees. For instance, let us focus on state 7 in Figure 14. In Figure 9, the corresponding tree can be extended in two ways: the second and the fourth trees on the first row. In the first of these trees, the tree hanging from the child 0 (respectively 2) of the root corresponds to state 5 (respectively 7). Hence, there is a transition of label 0 (respectively 2) from 7 to 5 (respectively 7) in Figure 14. Similarly, the second tree gives the extra transitions of label 0 from 7 to 7 and of label 2 from 7 to 5.

Figure 14. An NFA $\mathcal{T}$.

Take the word $210 \in L_{\frac{3}{2}}$. Starting from the initial state, its only successful run ends in state 8. With a different non-deterministic choice, the run reaches state 5, but from state 5 there is no transition with label 0.

(Table: the successful runs in $\mathcal{T}$ of the first few words of $L_{\frac{3}{2}}$.)

We apply the classical subset construction to $\mathcal{T}$ to get a DFAO. If a subset of states contains a final state of $\mathcal{T}$ whose decoration is 1 (respectively 0), the output for this state is 1 (respectively 0). Indeed, as explained in the proof, a subset of states of $\mathcal{T}$ obtained during the determinization algorithm cannot contain states with two distinct decorations. After determinization, we obtain the (minimal) DFAO depicted in Figure 15. In the latter figure, we have not set any output for state 2 because it corresponds to a subset of states in $\mathcal{T}$ which does not contain any final state. Otherwise stated, that particular output is irrelevant, as no valid representation will end up in that state.

Figure 15. Determinization of $\mathcal{T}$.

6. Recognizable sets and stability properties

In this short section, our aim is to present some direct closure properties of automatic sequences in ANS built on tree languages. These statements should not surprise the reader used to constructions of automata and automatic sequences.

In [15], a subset $X$ of $N_{\frac{p}{q}}$ is said to be $\frac{p}{q}$-recognizable if there exists a DFA over $A_p$ accepting a language $L$ such that $\mathrm{val}_{\frac{p}{q}}(L) = X$. Since $L_{\frac{p}{q}}$ is not regular, the set $\mathbb{N}$ is not $\frac{p}{q}$-recognizable.

Proposition 40.

A sequence $\mathbf{x} = x_0 x_1 \cdots$ over $A$ is $\frac{p}{q}$-automatic if and only if, for every $a \in A$, there exists a $\frac{p}{q}$-recognizable set $R_a$ such that $\{i \in \mathbb{N} : x_i = a\} = R_a \cap \mathbb{N}$.

Proof. In the DFAO producing the sequence, consider as final the states having output $a$. The set of values of the accepted words is $R_a$. □

For $k$-automatic sequences, the above result can also be expressed in terms of fibers (see, for instance, [2, Lemma 5.2.6]). The $\frac{p}{q}$-fiber of an infinite sequence $\mathbf{x}$ is the language $I_{\frac{p}{q}}(\mathbf{x}, a) = \{\mathrm{rep}_{\frac{p}{q}}(i) : i \in \mathbb{N} \text{ and } x_i = a\}$. A sequence $\mathbf{x} = x_0 x_1 \cdots$ over $A$ is $\frac{p}{q}$-automatic if and only if, for every $a \in A$, there exists a regular language $S_a$ such that $I_{\frac{p}{q}}(\mathbf{x}, a) = S_a \cap L_{\frac{p}{q}}$.

Several robustness or closure properties of automatic sequences carry over verbatim. They use classical constructions of automata such as reversal or composition.

Proposition 41.

Let $S$ be an abstract numeration system built on a tree language with a purely periodic labeled signature. The set of $S$-automatic sequences is stable under finite modifications.

Proof. One has to adapt the DFAO to take into account those finite modifications. Suppose that these modifications occur for representations of length at most $\ell$. Then the DFAO can have a tree-like structure for words of length up to $\ell$, and we enter the original DFAO after passing through this structure encoding the modifications. □

Proposition 42.

Let $S$ be an abstract numeration system built on a tree language with a purely periodic labeled signature. The set of $S$-automatic sequences is stable under codings.

Automatic sequences can be produced by reading least significant digits first. Simply adapt the corresponding result in [22].

Proposition 43.

Let $S = (L, A, <)$ be an abstract numeration system built on a tree language with a purely periodic labeled signature. A sequence $\mathbf{x}$ is $S$-automatic if and only if there exists a DFAO $(Q, q_0, A, \delta, \tau)$ such that, for all $n \ge 0$, $x_n = \tau(\delta(q_0, (\mathrm{rep}_S(n))^R))$.

Adding leading zeroes does not affect automaticity. Simply adapt the proof of [2, Theorem 5.2.1].
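Leading zeroes do not change the value of a word, which is the elementary fact behind the next statement. The following Python sketch checks this numerically in base $\frac{3}{2}$. It assumes the usual definitions from [1], which are not restated in this section: $\mathrm{val}_{\frac{p}{q}}(a_k \cdots a_0) = \frac{1}{q}\sum_{i} a_i (\frac{p}{q})^i$, and $\mathrm{rep}_{\frac{p}{q}}(n)$ obtained by the repeated divisions $qn = pn' + a$ with $0 \le a < p$. The function names `rep` and `val` are ours, and digits are written as single characters, so the sketch is limited to $p \le 10$.

```python
from fractions import Fraction

def rep(n, p, q):
    """Base-p/q representation of n >= 0 (the empty word for n = 0)."""
    digits = []
    while n > 0:
        # q*n = p*n' + a with 0 <= a < p: a is the last letter, n' the rest.
        n, a = divmod(q * n, p)
        digits.append(str(a))
    return "".join(reversed(digits))

def val(word, p, q):
    """Value of a word over A_p, using val(wa) = (p*val(w) + a) / q."""
    v = Fraction(0)
    for c in word:
        v = (p * v + int(c)) / q
    return v

u = "212001220110220"        # the word u of Remark 37
print(val(u, 3, 2))          # 591
print(val("00" + u, 3, 2))   # 591
print(rep(591, 3, 2))        # 212001220110220
```

The value 591 is the one stated in Remark 37, and exact rational arithmetic is used so that words outside $L_{\frac{3}{2}}$, whose value is not an integer, are handled as well.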

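Proposition 44.

A sequence $\mathbf{x}$ is $\frac{p}{q}$-automatic if and only if there exists a DFAO $(Q, q_0, A_p, \delta, \tau)$ such that, for all $n \ge 0$ and all $j \ge 0$, $x_n = \tau(\delta(q_0, 0^j\, \mathrm{rep}_{\frac{p}{q}}(n)))$.

For any finite alphabet $D \subset \mathbb{Z}$ of digits, we let $\chi_D$ denote the digit-conversion map defined as follows: for all $u \in D^*$ such that $\mathrm{val}_{\frac{p}{q}}(u) \in \mathbb{N}$, $\chi_D(u)$ is the unique word $v \in L_{\frac{p}{q}}$ such that $\mathrm{val}_{\frac{p}{q}}(u) = \mathrm{val}_{\frac{p}{q}}(v)$. In [1], it is shown that $\chi_D$ can be realized by a finite letter-to-letter right transducer. As a consequence of this result, multiplication by a constant $a \ge 1$ is realized by a finite letter-to-letter right transducer. Indeed, take a word $u = u_0 \cdots u_t \in L_{\frac{p}{q}}$ and consider the alphabet $D = \{0, a, 2a, \ldots, (p-1)a\}$. Feed the transducer realizing $\chi_D$ with $a u_t, \ldots, a u_0$. The output is the base-$\frac{p}{q}$ representation of $a \cdot \mathrm{val}_{\frac{p}{q}}(u)$. Similarly, translation by a constant $b \ge 0$ is realized by a finite letter-to-letter right transducer. Since the last letter of a word contributes its value divided by $q$, consider the alphabet $D' = \{0, \ldots, p + qb - 1\}$ and feed the transducer realizing $\chi_{D'}$ with $(u_t + qb), u_{t-1}, \ldots, u_0$. The output is the base-$\frac{p}{q}$ representation of $\mathrm{val}_{\frac{p}{q}}(u) + b$. Combining these results with the DFAO producing a $\frac{p}{q}$-automatic sequence, we get the following result.

Corollary 45.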

Let $a \ge 1$, $b \ge 0$ be integers. If a sequence $\mathbf{x}$ is $\frac{p}{q}$-automatic, then the sequence $(x_{an+b})_{n \ge 0}$ is also $\frac{p}{q}$-automatic.

Remark 46.

Ultimately periodic sequences are $k$-automatic for any integer $k \ge 2$ and, more generally, $S$-automatic for any abstract numeration system $S$ based on a regular language [12]. In general, this is not the case for $\frac{p}{q}$-automaticity: the characteristic sequence of multiples of $q$ is not $\frac{p}{q}$-automatic [15, Proposition 5.39]. Nevertheless, when the period length of an ultimately periodic sequence is coprime with $q$, the sequence is $\frac{p}{q}$-automatic [15, Théorème 5.34].

Acknowledgment

Manon Stipulanti is supported by the FNRS Research grant 1.B.397.20.

References

[1] S. Akiyama, Ch. Frougny, and J. Sakarovitch, Powers of rationals modulo 1 and rational base number systems, Israel J. Math. (2008), 53–91.
[2] J.-P. Allouche and J. Shallit, Automatic Sequences: Theory, Applications, Generalizations, Cambridge University Press, Cambridge, (2003).
[3] J. Berstel, L. Boasson, O. Carton, and I. Fagnot, Sturmian Trees, Theoret. Comput. Sci. (2010), 443–478.
[4] V. Bruyère and G. Hansel, Bertrand numeration systems and recognizability, Theoret. Comput. Sci. (1997), 17–43.
[5] É. Charlier, M. Le Gonidec, and M. Rigo, Representing real numbers in a generalized numeration system, J. Comput. System Sci. (2011), no. 4, 743–759.
[6] A. Cobham, Uniform tag sequences, Math. Systems Theory (1972), 164–192.
[7] K. Culik, J. Karhumäki, and A. Lepistö, Alternating iteration of morphisms and the Kolakovski sequence, in Lindenmayer systems, 93–106, Springer, Berlin, (1992).
[8] F. M. Dekking, Regularity and irregularity of sequences generated by automata, Sém. Th. Nombres Bordeaux ...
[9] ... Theoret. Comput. Sci. (1989), 153–169.
[10] T. Edgar, H. Olafson, and J. Van Alstine, Some combinatorics of rational base representations, preprint.
[11] J. Endrullis and D. Hendriks, On periodically iterated morphisms, Proc. CSL-LICS'14 in Vienna (2014), 1–10.
[12] P. Lecomte and M. Rigo, Abstract numeration systems, Ch. 3, in Combinatorics, Automata and Number Theory, Encyclopedia Math. Appl., Cambridge University Press, (2010).
[13] A. Lepistö, On the power of periodic iteration of morphisms, ICALP 1993, 496–506, Lect. Notes Comp. Sci., (1993).
[14] J. Peltomäki, A. Massuir, and M. Rigo, Automatic sequences based on Parry or Bertrand numeration systems, Adv. Appl. Math. (2019), 11–30.
[15] V. Marsault, On $\frac{p}{q}$-recognisable sets, arXiv:1801.08707.
[16] V. Marsault, Énumération et numération, Ph.D. thesis, Télécom-Paristech, 2015.
[17] V. Marsault and J. Sakarovitch, On sets of numbers rationally represented in a rational base number system. Algebraic informatics, Lect. Notes Comp. Sci., 89–100, Springer, Heidelberg, 2013.
[18] V. Marsault and J. Sakarovitch, Breadth-first serialisation of trees and rational languages, Developments in Language Theory - 18th International Conference, 2014, Ekaterinburg, Russia, August 26-29, 2014, Lect. Notes Comp. Sci., 252–259.
[19] V. Marsault and J. Sakarovitch, Trees and languages with periodic signature, Indagationes Mathematicae (2017), 221–246.
[20] V. Marsault and J. Sakarovitch, The signature of rational languages, Theor. Comput. Sci. (2017), 216–234.
[21] J.-J. Pansiot, Complexité des facteurs des mots infinis engendrés par morphismes itérés. Automata, languages and programming (Antwerp, 1984), 380–389, Lect. Notes Comp. Sci., Springer, Berlin, (1984).
[22] M. Rigo and A. Maes, More on generalized automatic sequences, J. Autom. Lang. Comb. (2002), 351–376.
[23] M. Rigo, Formal Languages, Automata and Numeration Systems, ISTE–Wiley, (2014).
[24] N. Sloane et al., The On-Line Encyclopedia of Integer Sequences, http://oeis.org.

Department of Mathematics, University of Liège, Allée de la Découverte 12, 4000 Liège, Belgium, {m.rigo,m.stipulanti}}