Cobham's Theorem and Automaticity
Lucas Mol and Narad Rampersad ∗
Department of Mathematics and Statistics, University of Winnipeg
{l.mol,n.rampersad}@uwinnipeg.ca

Jeffrey Shallit †
School of Computer Science, University of Waterloo

Manon Stipulanti ‡
Department of Mathematics, University of Liège
December 17, 2018
Abstract
We make certain bounds in Krebs' proof of Cobham's theorem explicit and obtain corresponding upper bounds on the length of a common prefix of an aperiodic $a$-automatic sequence and an aperiodic $b$-automatic sequence, where $a$ and $b$ are multiplicatively independent. We also show that an automatic sequence cannot have arbitrarily large factors in common with a Sturmian sequence.

This paper is concerned with the following question: Given a $b$-automatic sequence $f$ and a sequence $g$ from some other family of sequences $G$, how similar can $f$ and $g$ be? By "similar" we could mean several things:

1. $f$ and $g$ are identical;
2. $f$ and $g$ have a long common prefix;
3. $f$ and $g$ have a factor of length $n$ in common for infinitely many $n$;
4. $f$ and $g$ have the same set of factors of length $n$ for all sufficiently large $n$;
5. $f$ and $g$ agree on a set of positions of density 1.

∗ The author is supported by an NSERC Discovery Grant.
† The author is supported by an NSERC Discovery Grant.
‡ The author is supported by FRIA Grant 1.E030.16.

If $G$ is the family of $a$-automatic sequences, where $a$ and $b$ are multiplicatively independent ($a$ and $b$ are not powers of the same integer), then we have some answers. Notably, Cobham's theorem [6] states that $f$ and $g$ can be identical only if $f$ and $g$ are ultimately periodic. Recently, Krebs [8] has given a very short and elegant proof of Cobham's theorem. Much of what we do in the first part of this paper is based on this proof of Cobham's theorem. We also note that Byszewski and Konieczny [4] generalized Cobham's theorem by showing that if $f$ and $g$ coincide on a set of positions of density 1, then they are periodic on a set of positions of density 1.

One of the main results of this paper concerns the "long common prefix" measure of similarity. In particular we give explicit bounds (in terms of the number of states of the automata generating the sequences) on how long $f$ and $g$ can agree before they are forced to agree forever.
As an example of a result of this type, consider the following generalization of the Fine–Wilf theorem [10, Theorem 2.3.5]: If $f \in w\{w,x\}^\omega$ and $g \in x\{w,x\}^\omega$ ($w$ and $x$ are finite words) agree on a prefix of length $|w| + |x| - \gcd(|w|,|x|)$, then $f = g$. (Here the notation $\{w,x\}^\omega$ denotes the set of infinite words of the form $U_1 U_2 U_3 \cdots$, where each $U_i \in \{w,x\}$.) In our setting, where $f$ is an $a$-automatic sequence and $g$ is a $b$-automatic sequence, we obtain our bounds on the length of the common prefix by following the proof of Krebs and making explicit several of the bounds that appear in this proof. Our result answers a question posed by Zamboni (personal communication), who asked how long a sequence generated by a $b$-uniform morphism and one generated by an $a$-uniform morphism can agree before the two sequences are forced to be equal.

This problem of bounding the length of the common prefix of $f$ and $g$ is related to the concept of $b$-automaticity of infinite sequences [9], which measures the minimum number of states of a base-$b$ automaton that computes the length-$n$ prefix of the sequence. In particular, we are able to get a lower bound on the $b$-automaticity of an $a$-automatic sequence.

Regarding the property of having "arbitrarily large factors in common", it is not difficult to see that even distinct aperiodic $a$-automatic and $b$-automatic sequences can have arbitrarily large factors in common. For example, the characteristic sequences of powers of 2 and 3 respectively are 2-automatic and 3-automatic respectively, and clearly have arbitrarily large runs of 0's in common. The problem in this case is to show that in general such large factors necessarily have some simple structure; however, we do not address this question in this paper.

If we now change the family $G$ of sequences from $a$-automatic to Sturmian, then it is somewhat easier to answer these kinds of questions.
Sturmian sequences are those given by the first differences of sequences of the form
\[ (\lfloor n\alpha + \beta \rfloor)_{n \ge 1}, \]
where $0 \le \alpha, \beta < 1$ and $\alpha$ is irrational [3]. The number $\alpha$ is called the slope of the Sturmian sequence and the number $\beta$ is called the intercept. It is well known that a Sturmian sequence cannot be $b$-automatic. This follows from the fact that the limiting frequency of 1's in a Sturmian sequence is $\alpha$, whereas if a letter in a $b$-automatic sequence has a limiting frequency, that frequency must be rational [6, Thm. 6, p. 180].

The problem of determining the maximum length of a common prefix of a $b$-automatic sequence and a Sturmian sequence was examined by Shallit [9]. Upper bounds on the length of the common prefix can be deduced from the automaticity results given by Shallit. In the present paper we answer, in the negative, the question, "Can a Sturmian sequence and a $b$-automatic sequence have arbitrarily large finite factors in common?"

Byszewski and Konieczny [4] examine these questions for the family of generalized polynomial functions (these are sequences defined by expressions involving algebraic operations along with the floor function). This family contains the family of Sturmian sequences as a subset. In recent work [5], they have extended some of the results of this paper to this more general class.

We also mention the work of Tapsoba [11]. Recall that the complexity of a word $s$ is the function counting the number of distinct factors of length $n$ in $s$. It is also well known that Sturmian words have the minimum possible complexity $n + 1$ achievable by an aperiodic infinite word. Tapsoba shows another distinction between automatic sequences and Sturmian words by giving a formula for the minimal complexity function of the fixed point of an injective $k$-uniform binary morphism and comparing this to the complexity function of Sturmian words.
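To make the definitions above concrete, here is a minimal Python sketch. It rests on several assumptions that are not part of the paper: the slope is fixed to $\alpha = (3 - \sqrt{5})/2$ (which gives the Fibonacci word), the intercept is 0, first differences are taken from $n = 1$, and all function names are hypothetical. It builds a Sturmian prefix exactly with integer arithmetic, checks the complexity $n + 1$ on that prefix, and counts the factors it shares with a prefix of the Thue–Morse word. Because only finite prefixes are inspected, this is numerical evidence and not a proof.

```python
from math import isqrt

def floor_n_alpha(n):
    """Exact floor(n * alpha) for the assumed slope alpha = (3 - sqrt(5)) / 2."""
    if n == 0:
        return 0
    # sqrt(5 n^2) is irrational for n >= 1, so the floor can be taken exactly.
    return (3 * n - isqrt(5 * n * n) - 1) // 2

def sturmian_prefix(length):
    """First differences floor((n+1)*alpha) - floor(n*alpha) for n = 1..length."""
    return [floor_n_alpha(n + 1) - floor_n_alpha(n) for n in range(1, length + 1)]

def thue_morse_prefix(length):
    """Thue-Morse word: parity of the number of 1 bits of n."""
    return [bin(n).count("1") % 2 for n in range(length)]

def factors(word, n):
    """Set of distinct factors of length n occurring in a finite word."""
    return {tuple(word[i:i + n]) for i in range(len(word) - n + 1)}

def common_factor_counts(u, v, max_len):
    """Number of common factors of each length 0..max_len."""
    return [len(factors(u, n) & factors(v, n)) for n in range(max_len + 1)]

g = sturmian_prefix(4000)      # Sturmian (Fibonacci) prefix
t = thue_morse_prefix(4000)    # 2-automatic prefix
counts = common_factor_counts(g, t, 12)
print(g[:8], counts)
```

On these prefixes the number of common factors should drop to zero from length 9 onward, which is consistent with the value $C = 8$ reported in Example 7.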
$a$-automatic and $b$-automatic sequences

This section is largely based on the work of Krebs [8], so we mostly keep the notation used in his paper. The reader should read this section in conjunction with Krebs' paper; we occasionally omit details that can be found there.
Let $b \ge 2$ and let $w \in \{0, 1, 2, \ldots\}^*$. Write $w = w_{n-1} w_{n-2} \cdots w_0$, where each $w_i \in \{0, 1, 2, \ldots\}$. We define the number $[w]_b$ by
\[ [w]_b = w_{n-1} b^{n-1} + w_{n-2} b^{n-2} + \cdots + w_1 b + w_0. \]
Typically, one restricts $w$ to be over the canonical digit set $\{0, 1, \ldots, b-1\}$, in which case every natural number $x$ has a unique representation $w$ such that $x = [w]_b$ and $w$ does not begin with a 0 (the number 0 is represented by the empty string). In this case, we use $\langle x \rangle_b$ to denote this representation $w$.

However, Krebs' proof requires the use of a larger digit set. Let $D_b$ denote the digit set $\{0, \ldots, 2b\}$. Over this digit set, numbers may no longer have unique representations, even with the restriction that the representation must begin with a non-zero digit. We use the notation $(x)_{D_b}$ to refer to some particular representation of $x$ over the digit set $D_b$ that does not begin with the digit zero, without necessarily specifying which representation it is. Note also that if some representation $(x)_{D_b}$ has length $n$, then
\[ x \le 2b \sum_{i=0}^{n-1} b^i = \frac{2b(b^n - 1)}{b - 1} \le 2b^{n+1}. \]

A deterministic finite automaton with output (DFAO) is a 6-tuple $(S, D, \delta, s_0, \Delta, F)$, where $S$ is a finite set of states, $D$ is a finite input alphabet, $\delta : S \times D \to S$ is the transition function, $s_0 \in S$ is the initial state, $\Delta$ is a finite output alphabet, and $F : S \to \Delta$ is the output function. See [2] for more details.

Let $D$ be a set of non-negative digits containing $\{0, 1, \ldots, b-1\}$. A sequence $(f_x)_{x \in \mathbb{N}}$ is $(b, D)$-automatic if there is a DFAO $M = (S, D, \delta, s_0, \Delta, F)$ such that $f_{[w]_b} = F(\delta(s_0, w))$ for all $w \in D^*$. Note that for each $x$, the DFAO $M$ must produce the same output for all $w \in D^*$ satisfying $x = [w]_b$. The DFAO $M$ is called a $(b, D)$-DFAO. A sequence is $b$-automatic if it is $(b, \{0, 1, \ldots, b-1\})$-automatic, and the automaton $M$ in this case is called a $b$-DFAO.
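The notation above can be exercised with a short Python sketch. The function names are hypothetical and chosen only for illustration; words are represented as lists of digits with the most significant digit first. The sketch evaluates $[w]_b$, computes $\langle x \rangle_b$, and enumerates all representations of a number over a digit set such as $D_b$ that do not begin with zero, which makes the non-uniqueness over $D_b$ visible.

```python
def value(word, b):
    """[w]_b for a list of digits, most significant digit first."""
    x = 0
    for digit in word:
        x = x * b + digit
    return x

def canonical(x, b):
    """<x>_b over {0, ..., b-1}; the number 0 is the empty word."""
    digits = []
    while x > 0:
        x, d = divmod(x, b)
        digits.append(d)
    return digits[::-1]

def representations(x, b, digits):
    """All words over `digits` with value x that do not begin with 0."""
    if x == 0:
        return [[]]
    result = []
    for e in digits:
        # The last digit e must satisfy e <= x and e = x (mod b).
        if e <= x and (x - e) % b == 0:
            for prefix in representations((x - e) // b, b, digits):
                result.append(prefix + [e])
    return result

D2 = range(5)                       # the digit set D_b = {0, ..., 2b} for b = 2
reps_of_40 = representations(40, 2, D2)
print(canonical(40, 2), len(reps_of_40))
```

For instance, the word 4032 over $D_2$ and the canonical word 101000 both have value 40, so both appear in `reps_of_40`.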
Krebs begins his proof by showing that a sequence $f$ is automatic with respect to representations over the canonical base-$b$ digit set if and only if it is automatic with respect to representations over the digit set $D_b$. The reverse direction can be seen by noting that given a $(b, D_b)$-DFAO generating $f$, one obtains a $b$-DFAO generating $f$ simply by deleting the transitions on all digits other than $\{0, 1, \ldots, b-1\}$. The forward direction is proved using two results: the first is a modification of [2, Theorem 6.8.6] and the second can be found in [7, Proposition 7.1.4]. The first result [2, Theorem 6.8.6] states that if a sequence $f$ is generated by a $b$-DFAO $M$, then so is the sequence obtained by first applying a transducer $T$ to the input and then feeding the output of $T$ to $M$. As presented in [2], this result requires $T$ to map words over the digit set $\{0, 1, \ldots, b-1\}$ to words over the same digit set; however, the proof is easily modified to allow $T$ to map words over any digit set to words over $\{0, 1, \ldots, b-1\}$. Krebs therefore applies this modified version of [2, Theorem 6.8.6] where $T$ is the transducer of [7, Proposition 7.1.4], which converts input over a non-canonical digit set (in our case $D_b$) to the canonical digit set for a given base $b$ (this is called normalization). The result of this operation is therefore a $(b, D_b)$-DFAO computing $f$. We now discuss the details of this construction with the aim of obtaining a reasonably small $(b, D_b)$-DFAO computing $f$.

Let $N$ be the transducer of [7, Lemma 7.1.1], which converts from the digit set $D_b$ to the digit set $\{0, 1, \ldots, b-1\}$ and reads its input from least significant digit to most significant digit. The number of states of $N$ is determined by the quantity
\[ m = \max\{ |e - d| : e \in D_b,\ d \in \{0, 1, \ldots, b-1\} \}; \]
in particular, the state set of $N$ is defined to be $Q = \{ s \in \mathbb{N} : s < m/(b-1) \}$. In our case, we have $m = 2b$, and furthermore, for $b = 2$ we have $2b/(b-1) = 4$ and for $b > 2$ we have $2 < 2b/(b-1) \le 3$. We therefore set $\gamma = 4$ if $b = 2$ and $\gamma = 3$ if $b > 2$, so that $Q = \{ s \in \mathbb{N} : s < \gamma \}$.

The set of transitions of $N$ is
\[ E = \{ s \xrightarrow{e|d} s' : s + e = bs' + d \}. \]
The initial state is 0 and the output function $\omega$ maps each state $s \in Q$ to $\langle s \rangle_b$. Note that $N$ is subsequential, or "input-deterministic". To see this, suppose we have two transitions $s \xrightarrow{e|d'} s'$ and $s \xrightarrow{e|d''} s''$. Then $bs' + d' = bs'' + d''$, which we can rewrite as $(s' - s'')b = d'' - d'$. However, we have $|d'' - d'| < b$, so $|s' - s''| < 1$, which implies $s' = s''$ and $d' = d''$.

Figure 1: The transducer $N$ in base 2 converting $D_2$ into $\{0, 1\}$.

On input $u = e_n e_{n-1} \cdots e_0$ over $D_b$, the transducer $N$ produces output $v = \omega(s) d_n d_{n-1} \cdots d_0$ over $\{0, 1, \ldots, b-1\}$, where $s$ is the state reached by $N$ after reading $u$, and $[u^R]_b = [v^R]_b$.

Example 1.
Throughout this section, we illustrate the proof with the case $b = 2$. In this case, the transducer $N$ is the one given in Figure 1. For instance, on input $u = 4032$ over $D_2$, the transitions of $N$ are
\[ 0 \xrightarrow{2|0} 1 \xrightarrow{3|0} 2 \xrightarrow{0|0} 1 \xrightarrow{4|1} 2, \]
so $N$ outputs $v = \langle 2 \rangle_2\, 1000 = 101000$, which is the canonical base-2 expansion of $u$.

Let $M = (S, \{0, 1, \ldots, b-1\}, \delta, I, \Delta, F)$ be a $b$-DFAO generating a $b$-automatic sequence $f$. Recall that our convention is that a $b$-DFAO reads its input from most significant digit to least significant digit.

Example 1 (Continued). We now consider the Thue–Morse sequence $t = 01101001\cdots$, which is the fixed point of the morphism $\tau : 0 \mapsto 01,\ 1 \mapsto 10$. It is well known that the Thue–Morse sequence $t$ is 2-automatic and can be generated by the 2-DFAO $M = (S, \{0, 1\}, \delta, I, \Delta, F)$ with $S = \{0, 1\} = F$ and $I = 0$ drawn in Figure 2.

Figure 2: The 2-DFAO $M$ generating the Thue–Morse sequence.

Let $M' = (S', D_b, \delta', I', \Delta, F')$ be the $(b, D_b)$-DFAO defined as follows (again, it reads its input from most significant digit to least significant digit). We define
\[ S' = \{ \{ (s_0, 0), (s_1, 1), \ldots, (s_{\gamma-1}, \gamma-1) \} : s_0, s_1, \ldots, s_{\gamma-1} \in S \}, \]
and
\[ I' = \{ (\delta(I, \langle q \rangle_b), q) : 0 \le q < \gamma \}. \]
Clearly we have $I' \in S'$. For any $t \in S'$ and $e \in D_b$, we define
\[ \delta'(t, e) = \bigcup_{(s,q) \in t} \left\{ (\delta(s, d), q') : q' \xrightarrow{e|d} q \text{ in } N \right\}. \]
Finally, for $t \in S'$, define $F'(t) = F(s)$, where $(s, 0) \in t$ (by the definition of $S'$, there is a unique such $s \in S$).

We first show that $\delta'$ is well-defined. Let $t \in S'$ and $e \in D_b$; we will show that $\delta'(t, e) \in S'$. We need to show that for every state $p$ of $N$ (i.e., every $p \in Q$) the set $\delta'(t, e)$ contains a unique element of the form $(s, p)$, where $s \in S$. Let $p \in Q$ be a state of $N$. Since $N$ is input-deterministic, there is exactly one outgoing transition from $p$ in $N$ with input symbol $e$, say $p \xrightarrow{e|d} q$ in $N$. Since $(s, q) \in t$ for exactly one $s \in S$ (by definition of $S'$), we conclude that $(\delta(s, d), p) \in \delta'(t, e)$, and it is the unique element in $\delta'(t, e)$ with second coordinate $p$.

Now we show that $M'$ computes the same automatic sequence as $M$. For any $u = u_m \cdots u_0 \in D_b^*$ that does not begin with 0, there exists exactly one $v = v_n \cdots v_0 \in \{0, 1, \ldots, b-1\}^*$ that does not begin with 0 such that $[u]_b = [v]_b$. Namely, $v = \langle [u]_b \rangle_b$. Note that $m \le n \le m + 2$. We need to show that if $(s, 0) \in \delta'(I', u)$, then $\delta(I, v) = s$. Suppose that $(s, 0) \in \delta'(I', u)$. Then in $N$, we have
\[ 0 \xrightarrow{u_0|v_0} q_0 \xrightarrow{u_1|v_1} q_1 \xrightarrow{u_2|v_2} \cdots \xrightarrow{u_m|v_m} q_m, \]
and $\langle q_m \rangle_b = v_n \cdots v_{m+1}$. Therefore, we have $(\delta(I, v_n \cdots v_{m+1}), q_m) \in I'$, and retracing the steps of $M'$, we conclude that $\delta(I, v) = s$.

Informally, $M'$ works through the transducer $N$ in the reverse direction, while computing the transitions of $M$ on the output. Since we are working through the transducer backwards, there are $\gamma$ possible places to start, each corresponding to a different backwards path through the transducer. Further, if we start working backwards from state $q$ in the transducer, then the output function of the transducer will be $\langle q \rangle_b$.
The output function of the transducer is read first by $M'$, which explains the definition of $I'$. Only when we reach the end of the input string do we know which backwards path through the transducer was correct (the one that started at state 0), so $M'$ computes the transitions of $M$ for all $\gamma$ paths along the way.

We have therefore shown how, given a $b$-DFAO $M$ for $f$, to produce a $(b, D_b)$-DFAO $M'$ that also generates $f$. Furthermore, the $(b, D_b)$-DFAO $M'$ has at most
\[ |S|^\gamma \le |S|^4 \tag{1} \]
states.

Figure 3: The $(2, D_2)$-DFAO $M'$ computing the Thue–Morse sequence (states are shaded "white" or "grey" according to their output).

Example 1 (Continued). In Figure 3, we give the $(2, D_2)$-DFAO $M'$ (omitting all unreachable states) that computes the Thue–Morse sequence. We also give its transition table in Table 1. To that aim, recall that $\gamma = 4$. From Figure 2, we also get
\[ I' = \{ (\delta(I, \langle q \rangle_b), q) : 0 \le q < \gamma \} = \{ (\delta(I, \epsilon), 0), (\delta(I, 1), 1), (\delta(I, 10), 2), (\delta(I, 11), 3) \} = \{ (0,0), (1,1), (1,2), (0,3) \}. \]

| $t \in S'$ | $\delta'(t, 0)$ | $\delta'(t, 1)$ | $\delta'(t, 2)$ |
|---|---|---|---|
| {(0,0),(1,1),(1,2),(0,3)} | {(0,0),(1,1),(1,2),(0,3)} | {(1,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(0,3)} |
| {(1,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(1,0),(0,1),(1,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} | {(0,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(0,0),(1,1),(0,2),(0,3)} | {(0,0),(1,1),(1,2),(0,3)} | {(1,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(1,0),(0,1),(0,2),(1,3)} | {(1,0),(0,1),(0,2),(1,3)} | {(0,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(1,3)} |
| {(0,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(1,0),(0,1),(1,2),(1,3)} | {(1,0),(0,1),(0,2),(1,3)} | {(0,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(1,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(0,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(0,0),(1,1),(0,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} | {(1,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |

| $t \in S'$ | $\delta'(t, 3)$ | $\delta'(t, 4)$ |
|---|---|---|
| {(0,0),(1,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(1,0),(1,1),(0,2),(1,3)} | {(0,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(1,0),(0,1),(1,2),(0,3)} | {(1,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(0,0),(1,1),(0,2),(0,3)} | {(0,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(1,3)} |
| {(1,0),(0,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |
| {(0,0),(0,1),(1,2),(1,3)} | {(1,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(0,3)} |
| {(1,0),(0,1),(1,2),(1,3)} | {(1,0),(1,1),(0,2),(1,3)} | {(1,0),(0,1),(1,2),(0,3)} |
| {(1,0),(1,1),(0,2),(0,3)} | {(0,0),(0,1),(1,2),(0,3)} | {(0,0),(1,1),(0,2),(1,3)} |
| {(0,0),(0,1),(1,2),(0,3)} | {(1,0),(1,1),(0,2),(0,3)} | {(1,0),(0,1),(0,2),(1,3)} |
| {(0,0),(1,1),(0,2),(1,3)} | {(0,0),(0,1),(1,2),(1,3)} | {(0,0),(1,1),(1,2),(0,3)} |

Table 1: The transition function $\delta'$ of $M'$ as a function of $t \in S'$ and $e \in \{0, 1, 2, 3, 4\}$.

We also compute $M'$ on two different words $u \in D_2^*$. Take $u = 4032 \in D_2^*$, whose canonical base-2 expansion is $v = 101000$. The transitions are
\[ I' = \{(0,0),(1,1),(1,2),(0,3)\} \xrightarrow{4} \{(1,0),(0,1),(0,2),(1,3)\} \xrightarrow{0} \{(1,0),(0,1),(0,2),(1,3)\} \xrightarrow{3} \{(1,0),(0,1),(1,2),(1,3)\} \xrightarrow{2} \{(0,0),(1,1),(1,2),(0,3)\}. \]
By definition of $F'$, we have $F'(\{(0,0),(1,1),(1,2),(0,3)\}) = F(0) = 0$. Thus the automaton $M'$ outputs 0 after reading $u$, just as the automaton $M$ does when reading $v$. In each of the five sets above, exactly one ordered pair lies on the "correct path"; in order, these pairs are $(1,2)$, $(0,1)$, $(0,2)$, $(0,1)$, $(0,0)$. Their second coordinates are the states of the "correct path" through the transducer $N$, in reverse:
\[ 2 \xleftarrow{4|1} 1 \xleftarrow{0|0} 2 \xleftarrow{3|0} 1 \xleftarrow{2|0} 0. \]
The first coordinate of the pair taken from $I'$ is $\delta(I, \langle 2 \rangle_2) = \delta(I, 10) = 1$, and the first coordinates of the remaining pairs are determined by starting from state $\delta(I, 10) = 1$ in $M$ and following the transitions of $M$ given by the output labels of the above path through $N$ (again, working backwards through $N$):
\[ \delta(I, 10) = 1 \xrightarrow{1} 0 \xrightarrow{0} 0 \xrightarrow{0} 0 \xrightarrow{0} 0 = \delta(I, v). \]
This illustrates how, on input $u$, $M'$ computes $F(\delta(I, v))$, which is exactly the output of $M$ on input $v$.

As a second illustration, take $u' = 2014 \in D_2^*$, whose canonical base-2 expansion is $v' = 10110$. On the input $u'$, the transitions of $M'$ are
\[ I' = \{(0,0),(1,1),(1,2),(0,3)\} \xrightarrow{2} \{(1,0),(0,1),(1,2),(0,3)\} \xrightarrow{0} \{(1,0),(0,1),(0,2),(1,3)\} \xrightarrow{1} \{(0,0),(0,1),(1,2),(0,3)\} \xrightarrow{4} \{(1,0),(0,1),(0,2),(1,3)\}. \]
Similarly, $F'(\{(1,0),(0,1),(0,2),(1,3)\}) = F(1) = 1$, so the automaton $M'$ outputs 1 after reading $u'$, agreeing with the output of $M$ on input $v'$. Here the ordered pairs corresponding to the "correct path" through the transducer $N$ are, in order, $(1,1)$, $(1,0)$, $(0,1)$, $(1,2)$, $(1,0)$.

We end this section with some remarks on the construction. We hope that the reader is convinced that the construction we have described works for any digit set containing $\{0, 1, \ldots, b-1\}$ and not just the digit set $D_b$. Furthermore, Krebs has pointed out (private communication) that the number of states needed for the construction can be improved by changing the digit set from $D_b$ to $\{0, 1, \ldots, 2b-2\}$. Recall that our construction results in a DFAO with $|S|^\gamma$ states. If $b = 2$, then we have $\gamma = 4$, while if $b > 2$, then we have $\gamma = 3$. However, if we change the digit set as suggested by Krebs, we improve this to $|S|^2$ states. Krebs' proof of Cobham's Theorem works just as well with this new choice of digit set; however, a number of bounds and constants in his proof would have to be modified. We do not present these modifications here; we just note that it is possible to do it.

2.3 Upper bound on longest common prefix

Having dealt with the conversion to the larger digit set required by Krebs, we now proceed with the Diophantine approximation result used by Krebs.
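As a sanity check on that conversion before moving on, the following Python sketch runs the transducer $N$ and builds the reachable part of $M'$ for the Thue–Morse automaton in base 2. It is an illustration under stated assumptions, not the authors' code: all function names are hypothetical, a state of $M'$ is stored as a tuple whose entry $q$ is the state of $M$ paired with the transducer state $q$, and the Thue–Morse transition function is taken to be $\delta(s, d) = s \oplus d$.

```python
from itertools import product

def transducer_step(b, s, e):
    """Transition s --e|d--> s2 of N, defined by s + e = b*s2 + d."""
    s2, d = divmod(s + e, b)
    return s2, d

def normalize(b, u):
    """Run N on u (most significant digit first), reading right to left."""
    state, out = 0, []
    for e in reversed(u):
        state, d = transducer_step(b, state, e)
        out.append(d)
    while state:                 # output function: canonical digits of final state
        state, d = divmod(state, b)
        out.append(d)
    return out[::-1]

def canonical(q, b):
    """<q>_b, most significant digit first; empty for q = 0."""
    digits = []
    while q:
        q, d = divmod(q, b)
        digits.append(d)
    return digits[::-1]

def build_m_prime(b, gamma, delta, initial, digits):
    """Reachable part of M'; returns (initial state, state set, transition table)."""
    def run(word):
        s = initial
        for d in word:
            s = delta(s, d)
        return s

    def step(t, e):
        # For each transition q2 --e|d--> q of N, pair delta(t[q], d) with q2.
        new = []
        for q2 in range(gamma):
            q, d = transducer_step(b, q2, e)
            new.append(delta(t[q], d))
        return tuple(new)

    start = tuple(run(canonical(q, b)) for q in range(gamma))
    seen, todo, table = {start}, [start], {}
    while todo:
        t = todo.pop()
        for e in digits:
            nxt = step(t, e)
            table[(t, e)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return start, seen, table

def thue_morse_delta(s, d):
    return s ^ d                 # assumed transition function of the 2-DFAO M

start, states, table = build_m_prime(2, 4, thue_morse_delta, 0, range(5))

def run_m_prime(u):
    """Output of M' on u: the M-state paired with transducer state 0."""
    t = start
    for e in u:
        t = table[(t, e)]
    return t[0]

print(normalize(2, [4, 0, 3, 2]), len(states), run_m_prime([4, 0, 3, 2]))
```

Under these assumptions the search should find ten reachable states, as many as there are rows in Table 1, and `run_m_prime` should agree with the Thue–Morse value at $[u]_2$ for every word $u$ over $D_2$ that is tried.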
Lemma 2.
Let $a, b \ge 2$ be integers and let $\epsilon$ be a positive real number. Define $\eta := \max\{ \lceil \log_a b \rceil, \lceil \log_b a \rceil \}$. There are non-negative integers $m, n < \eta((b-1)/\epsilon + 1)$ such that
\[ |a^m - b^n| \le \epsilon b^n. \]

Proof.
First suppose that $a \ge b$. Let $(f_x)_{x \in \mathbb{N}}$ be the sequence such that $a^x b^{-f_x} \in [1, b)$ for all $x \in \mathbb{N}$. Then $0 \le (\log_b a)x - f_x$, so $f_x \le (\log_b a)x$. Now by the pigeonhole principle there exist $x < y \le (b-1)/\epsilon + 1$ such that
\[ \left| a^y b^{-f_y} - a^x b^{-f_x} \right| \le \epsilon; \]
i.e.,
\[ \left| a^{y-x} - b^{f_y - f_x} \right| \le \epsilon b^{f_y} a^{-x} \le \epsilon b^{f_y - f_x}. \]
Thus, we have $m = y - x \le y \le (b-1)/\epsilon + 1$ and
\[ n = f_y - f_x \le f_y \le (\log_b a) y \le (\log_b a)((b-1)/\epsilon + 1) \le \eta((b-1)/\epsilon + 1), \]
as required.

Now suppose that $a < b$. Applying the previous argument with $a^{\lceil \log_a b \rceil}$ in place of $a$ (where $\lceil \rho \rceil$ denotes the least integer greater than or equal to $\rho$), we find that
\[ m = \lceil \log_a b \rceil (y - x) \le \lceil \log_a b \rceil y \le \lceil \log_a b \rceil ((b-1)/\epsilon + 1) \le \eta((b-1)/\epsilon + 1), \]
and
\[ n = f_y - f_x \le f_y \le \lceil \log_a b \rceil (\log_b a) y \le \lceil \log_a b \rceil (\log_b a)((b-1)/\epsilon + 1) \le \eta((b-1)/\epsilon + 1), \]
as required (the final inequality above follows from the fact that $\log_b a < 1$ in this case).

As in Lemma 2, define $\eta := \max\{ \lceil \log_a b \rceil, \lceil \log_b a \rceil \}$ and also define $\theta := \max\{a, b\}$. We now define
\[ E(a, b, R, S) := \eta \left[ 6 \left( 2\theta^{(S+1)(R+1)} + 1 \right)(\theta - 1) + 1 \right], \]
\[ A(a, b, R, S) := \left( 2\theta^{(S+1)(R+1)} + 2 \right) \theta^{E(a,b,R,S)}, \]
and note that both these functions are symmetric under exchange of their first two arguments and also under exchange of their last two arguments.

Theorem 3.
Let $a, b \ge 2$ be multiplicatively independent integers. Let $g = (g_x)_{x \in \mathbb{N}}$ be computed by a DFAO $M_a = (S_a, D_a, \delta_a, s_{0,a}, \Delta_a, F_a)$ in base $a$ and let $f = (f_x)_{x \in \mathbb{N}}$ be computed by a DFAO $M_b = (S_b, D_b, \delta_b, s_{0,b}, \Delta_b, F_b)$ in base $b$. Suppose that $f$ and $g$ agree on a prefix of length $A(a, b, |S_a|, |S_b|)$. Then $f$ and $g$ are equal and ultimately periodic.

Proof. Let $S_\infty$ be the subset of states of $M_b$ consisting of all states $s$ with the property that there are infinitely many numbers $x$ such that some representation $(x)_{D_b}$ reaches state $s$ in $M_b$. For each $s \in S_\infty$, we claim that there must exist a state $t \in S_a$ and positive integers $x_{st}$ and $y_{st}$ such that some base-$b$ representations $(x_{st})_{D_b}$ and $(y_{st})_{D_b}$ both lead to state $s$ in $M_b$ and some base-$a$ representations $(x_{st})_{D_a}$ and $(y_{st})_{D_a}$ both lead to state $t$ in $M_a$. We show this by giving an explicit upper bound on $x_{st}$ and $y_{st}$.

If a string $W$ has length at least $|S_b|$, then any computation of $M_b$ on $W$ repeats a state. Since for each $s \in S_\infty$ there are infinitely many $(x)_{D_b}$ that reach state $s$, there must exist some number $x$, some representation $(x)_{D_b}$, and some factorization $(x)_{D_b} = uw$ with the following properties:

• $|(x)_{D_b}| \le |S_b|$.
• There exists a non-empty $v$ such that $|v| \le |S_b|$ and $uv^i w$ reaches $s$ for all $i \ge 0$.

For $0 \le i \le |S_a|$, let $x_i$ be the integer such that $(x_i)_{D_b} = uv^i w$. Then the numbers $x_i$ are all distinct. Now consider the states reached in $M_a$ by some choice of representations $(x_i)_{D_a}$, for $0 \le i \le |S_a|$. There must be two such numbers $x_i$ and $x_j$ such that $(x_i)_{D_a}$ and $(x_j)_{D_a}$ reach the same state $t$ in $M_a$. We choose these as our $x_{st}$ and $y_{st}$. Finally, we note that for $0 \le i \le |S_a|$, we have $|(x_i)_{D_b}| \le |S_b|(|S_a| + 1)$, which gives the bound $x_{st}, y_{st} \le 2b^{|S_b|(|S_a|+1)+1}$.

Let
\[ \xi := \max\{ x_{st}, y_{st} \mid s \in S_\infty \} + 1 \le 2b^{|S_b|(|S_a|+1)+1} + 1. \]
By Lemma 2, there exist
\[ m, n \le \eta[6\xi(b-1) + 1] \le E(a, b, |S_a|, |S_b|) \]
such that
\[ \xi |a^m - b^n| \le \tfrac{1}{6} b^n. \]
As defined in [8], let $p_{st} := (x_{st} - y_{st})(a^m - b^n)$ (swapping $x_{st}$ and $y_{st}$ if necessary, so that $p_{st} > 0$), and note that from [8] we have $p_{st} \le \tfrac{1}{6} b^n$. Let $z$ be any integer such that $z, z + p_{st} \in \left[ \tfrac{1}{3} b^n, \tfrac{5}{3} b^n \right]$. In particular, there exist representations $(z)_{D_b}$ and $(z + p_{st})_{D_b}$ such that $|(z)_{D_b}|, |(z + p_{st})_{D_b}| \le n$. In what follows, we specifically use the representations of $z$ and $z + p_{st}$ that satisfy this condition on their lengths. We also note that by the calculation in [8], we have $z - y_{st}(a^m - b^n) \le 2a^m$, so there is also a representation $(z - y_{st}(a^m - b^n))_{D_a}$ of length at most $m$.

Let $x$ be any integer such that some representation $(x)_{D_b}$ goes to state $s$ in $M_b$. Recall that $(x_{st})_{D_b}$ and $(y_{st})_{D_b}$ go to state $s$ in $M_b$ and $(x_{st})_{D_a}$ and $(y_{st})_{D_a}$ go to state $t$ in $M_a$. If $f$ and $g$ agree on a sufficiently long prefix (to be specified later), then we have

$f_{xb^n + z} = f_{y_{st} b^n + z}$ (since $(x)_{D_b}$ and $(y_{st})_{D_b}$ go to state $s$ in $M_b$)
$= f_{y_{st} a^m + z - y_{st}(a^m - b^n)}$ (rewriting the index)
$= g_{y_{st} a^m + z - y_{st}(a^m - b^n)}$ (since $f$ and $g$ agree)
$= g_{x_{st} a^m + z - y_{st}(a^m - b^n)}$ (since $(y_{st})_{D_a}$ and $(x_{st})_{D_a}$ go to state $t$ in $M_a$)
$= g_{x_{st} b^n + z + p_{st}}$ (rewriting the index)
$= f_{x_{st} b^n + z + p_{st}}$ (since $f$ and $g$ agree)
$= f_{xb^n + z + p_{st}}$ (since $(x_{st})_{D_b}$ and $(x)_{D_b}$ go to state $s$ in $M_b$).

For this calculation to be correct, the two sequences $f$ and $g$ should agree on a prefix of length
\[ \max\{ y_{st}, x_{st} \mid s \in S_\infty \}\, a^m + z - y_{st}(a^m - b^n) \le (\xi - 1)a^m + z - y_{st}(a^m - b^n) \le (\xi - 1)a^m + 2a^m \le (\xi + 1)a^m. \]
Now $\xi \le 2b^{|S_b|(|S_a|+1)+1} + 1$, so we have
\[ (\xi + 1)a^m \le \left( 2b^{|S_b|(|S_a|+1)+1} + 2 \right) a^m \le \left( 2b^{|S_b|(|S_a|+1)+1} + 2 \right) a^{E(a,b,|S_a|,|S_b|)} \le A(a, b, |S_a|, |S_b|). \]
Thus, if $f$ and $g$ agree on a prefix of length $A(a, b, |S_a|, |S_b|)$, then $f$ has a local period
\[ p_{st} \le \tfrac{1}{6} b^n \le \tfrac{1}{6} b^{E(a,b,|S_a|,|S_b|)} \]
on the interval
\[ \left[ \left( x + \tfrac{1}{3} \right) b^n, \left( x + \tfrac{5}{3} \right) b^n \right]. \]
By the same argument as in [8], the sequence $f$ is ultimately periodic. We will show further that the periodicity begins after a prefix of length at most
\[ \left( 2b^{|S_b|+1} + \tfrac{1}{3} \right) b^n \le \left( 2b^{|S_b|+1} + \tfrac{1}{3} \right) b^{E(a,b,|S_a|,|S_b|)}. \]
Any representation $(x)_{D_b}$ of length $|S_b|$ must reach a state in $S_\infty$. Therefore if $x = 2b^{|S_b|+1}$, then for every $y \ge x$, every representation $(y)_{D_b}$ reaches a state in $S_\infty$. Now by the argument of [8], the sequence $f$ has period $p_f := p_{st}$ starting from index
\[ i_f := \left( 2b^{|S_b|+1} + \tfrac{1}{3} \right) b^n. \]

By a similar argument (with the roles of $f$ and $g$ reversed) we find that if $f$ and $g$ agree on a prefix of length $A(a, b, |S_a|, |S_b|)$, then $g$ has period $p_g$ starting from some index $i_g$, where $p_g$ and $i_g$ are defined analogously to $p_f$ and $i_f$. Now, we have
\[ \max\{p_f, p_g\} \le \tfrac{1}{6} \theta^{E(a,b,|S_a|,|S_b|)}, \]
and
\[ \max\{i_f, i_g\} \le \max\left\{ \left( 2b^{|S_b|+1} + \tfrac{1}{3} \right) b^{E(a,b,|S_a|,|S_b|)}, \left( 2a^{|S_a|+1} + \tfrac{1}{3} \right) a^{E(b,a,|S_b|,|S_a|)} \right\} \le \max\left\{ \left( 2b^{|S_b|+1} + \tfrac{1}{3} \right), \left( 2a^{|S_a|+1} + \tfrac{1}{3} \right) \right\} \theta^{E(a,b,|S_a|,|S_b|)}, \]
so
\[ \max\{i_f, i_g\} + p_f + p_g \le \max\left\{ \left( 2b^{|S_b|+1} + \tfrac{2}{3} \right), \left( 2a^{|S_a|+1} + \tfrac{2}{3} \right) \right\} \theta^{E(a,b,|S_a|,|S_b|)} \le A(a, b, |S_a|, |S_b|). \]
Therefore, by the Fine–Wilf theorem [10, Theorem 2.3.5], the sequences $f$ and $g$ are equal.

In the next corollary, let $\exp_r(x)$ denote the function $r^x$.

Corollary 4.
Let $a, b \ge 2$ be multiplicatively independent integers. Let $g = (g_x)_{x \in \mathbb{N}}$ and $f = (f_x)_{x \in \mathbb{N}}$ be sequences over a set $\Delta$ of size $d$. Suppose that $g$ is computed by an $a$-DFAO $M'_a$ with $R$ states and $f$ is computed by a $b$-DFAO $M'_b$ with $S$ states. There is a positive constant $C$ (depending only on $a$ and $b$) such that if $f$ and $g$ agree on a prefix of length
\[ \exp_\theta(\exp_\theta(C R^4 S^4)) \tag{2} \]
then $f$ and $g$ are equal and ultimately periodic.

Proof. We have previously observed that conversion from a $b$-DFAO to a $(b, D_b)$-DFAO increases the number of states to at most the quantity (1). We apply the bound of Theorem 3 with $|S_a| = R^4$ and $|S_b| = S^4$. Simplifying the resulting expression, we find that there is a positive constant $C$ such that the bound of Theorem 3 is at most the quantity (2).

Note that the bound on the length of the common prefix that we obtain seems absurdly large compared to what seems likely to be the optimal bound. It is not too difficult to give an example where the common prefix has length that is (singly) exponential in the size of the defining automata. For instance, let $g$ be the constant (and hence $a$-automatic) sequence $(0, 0, 0, 0, \ldots)$. Fix some positive integer $N$ and let $f$ be the characteristic sequence of the set $\{ b^n - 1 : n \ge N \}$. Then $f$ is an aperiodic $b$-automatic sequence. Indeed, a $b$-DFAO $M$ generating $f$ can be obtained from the $N + 2$ state DFA accepting the regular language $0^*(b-1)^N(b-1)^*$ by making the accepting state output 1 and all other states output 0. Then $M$ has $N + 2$ states and the sequences $f$ and $g$ agree on a prefix of length $b^N - 1$.

Now we examine the connection to automaticity. The $b$-automaticity of a sequence $g$ is the function $A^b_g(n)$ whose value is the least number of states required in a $b$-DFAO that computes a prefix of $g$ of length $n$. Shallit [9, Proposition 1(c)] showed that if $g$ is not $b$-automatic, then there is a constant $c$ such that $A^b_g(n) \ge c \log_b n$ for infinitely many $n$.

Corollary 5.
Let $a, b$ be multiplicatively independent integers with $a, b \ge 2$. There is a positive constant $D$ such that the $b$-automaticity $A^b_g(n)$ of an aperiodic $a$-automatic sequence $g$ satisfies
\[ A^b_g(n) > D (\log \log n)^{1/4}, \]
for all $n \ge$ […].

Proof. Let $M_a$ be an $a$-DFAO computing $g$ and let $M_{b,n}$ be a $b$-DFAO computing a sequence $f$ that agrees with $g$ on a prefix of length $n$. Suppose that $M_a$ has $E$ states and that $M_{b,n}$ has $S_n$ states. Since $g$ is aperiodic, by (2) we have
\[ n < \exp_\theta(\exp_\theta(C E^4 (S_n)^4)). \]
Treating $E$ as a constant, we get
\[ S_n > \left( \frac{1}{C^{1/4} E} \right) (\log_\theta \log_\theta n)^{1/4} = D (\log \log n)^{1/4}, \]
for some positive constant $D$.

Note that while this may seem weaker than the $c \log_b n$ lower bound mentioned previously, the former only holds for infinitely many $n$, whereas our lower bound holds for all $n$. Without the assumption that $g$ is $a$-automatic, the $b$-automaticity of $g$ could potentially be constant for long stretches, and satisfy $A^b_g(n) \ge c \log_b n$ only for very sparsely distributed values of $n$. Our result shows that under the assumption that $g$ is $a$-automatic, the function $A^b_g(n)$ cannot be constant for too long.

On the other hand, our lower bound on the $b$-automaticity does seem to be rather weak compared to what can be proved for specific sequences. Shallit [9] showed that if $p$ is the fixed point of $0 \to$ […], $1 \to$ […], then for $k$ odd, we have $A^k_p(n) = \Omega(n^{[…]}/k)$, and if $t$ is the fixed point of $0 \to 01$, $1 \to 10$ (the Thue–Morse word), then for $k$ odd, we have $A^k_t(n) = \Omega(n^{[…]}/k^{[…]})$.

$b$-automatic and Sturmian sequences

As mentioned in the introduction, the problem of bounding the length of the longest common prefix of a $b$-automatic sequence and a Sturmian sequence was addressed by Shallit [9]. In this section, we show that two such sequences cannot have arbitrarily large factors in common. Our main result is the following:

Theorem 6.
Let $f$ be a $b$-automatic sequence and let $g$ be a Sturmian sequence. There exists a constant $C$ (depending on $f$ and $g$) such that if $f$ and $g$ have a factor in common of length $n$, then $n \le C$.

Note that this result would follow fairly easily from the frequency results mentioned previously, if $f$ were uniformly recurrent (meaning that every factor $z$ of $f$ occurs infinitely often, and with bounded gap size between two consecutive occurrences). However, unlike Sturmian sequences, automatic sequences need not be uniformly recurrent: consider, for example, the 2-automatic sequence that is the characteristic sequence of the powers of 2. Our proof is therefore based on the finiteness of the $b$-kernel of $f$, along with the uniform distribution property of Sturmian sequences (this is similar to the techniques used in [9]).

Proof.
Let f = f f · · · and g = g g · · · , where g has slope α and intercept β. Since the factors of a Sturmian word do not depend on β, without loss of generality, we may suppose that β = 0 (or, in other words, that g is a characteristic word). Then g can be defined by the following rule: $g_n = 1$ if $\{(n+1)\alpha\} < \alpha$, and $g_n = 0$ otherwise. Here {·} denotes the fractional part of a real number.

Suppose that for some integer L, the words f and g have a factor of length L in common: i.e., for some i ≤ j, we have $f_i \cdots f_{i+L-1} = g_j \cdots g_{j+L-1}$. (We may assume i ≤ j since g is recurrent, but this is not important for what follows.)

Suppose that the b-kernel of f, $\{ (f_{n b^r + s})_{n \geq 0} : r \geq 0 \text{ and } 0 \leq s < b^r \}$, has Q distinct elements. Let r satisfy $b^r > Q$. Then there exist integers s , s with ≤ s { d α } . Choose ε > 0 such that ε < { d α } − { d α } . Note that d − d = s − s , so ε does not depend on L (or I). Since $b^r \alpha$ is irrational, if I is sufficiently large, then by Kronecker's theorem (which asserts that the set of points {nα} is dense in (0, 1)) there exists N ∈ I such that { N ( b^r α ) + d α } ∈ [α, α + ε]. By the choice of ε, this implies that { N ( b^r α ) + d α } ≥ α and { N ( b^r α ) + d α } < α, contradicting the assumption that N satisfies one of (3) or (4). The contradiction means that L must be bounded by some constant C, which proves the theorem.

Example 7.
Consider the Thue–Morse word t = 01101001 · · · and the Fibonacci word f = 01001010 · · · given by the fixed point of 0 → 01 and 1 → 0. The latter is Sturmian. The set of common factors is {ε, 0, 1, 00, 01, 10, 001, 010, 100, 101, 0010, 0100, 0101, 1001, 1010, 00101, 01001, 10010, 10100, 010010, 100101, 101001, 0100101, 1010010, 10100101}, so C = 8.

Final thoughts
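The common factors of the Thue–Morse word and the Fibonacci word in Example 7 can be checked by brute force. The following is a minimal Python sketch, not the method used in the paper: it compares the factors of finite prefixes only, so it can confirm that a word is a common factor, but it cannot by itself prove that no longer common factor exists. The function names, the prefix length 512, and the cutoff `max_len = 12` are assumptions chosen for illustration.

```python
def thue_morse(n):
    """First n letters of the Thue-Morse word: t_i is the parity of the binary digit sum of i."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def fibonacci_word(n):
    """First n letters of the fixed point of 0 -> 01, 1 -> 0, built as S_{k+1} = S_k S_{k-1}."""
    prev, cur = "0", "01"
    while len(cur) < n:
        prev, cur = cur, cur + prev
    return cur[:n]


def factors(word, max_len):
    """All factors of `word` of length 0..max_len (the empty word is included)."""
    return {
        word[i:i + k]
        for k in range(max_len + 1)
        for i in range(len(word) - k + 1)
    }


def common_factors(prefix_len=512, max_len=12):
    """Factors of length <= max_len shared by the two prefixes of length prefix_len."""
    t = thue_morse(prefix_len)
    f = fibonacci_word(prefix_len)
    return factors(t, max_len) & factors(f, max_len)


if __name__ == "__main__":
    shared = common_factors()
    # Print the shared factors ordered by length, then lexicographically.
    for w in sorted(shared, key=lambda w: (len(w), w)):
        print(w if w else "(empty word)")
```

Under these assumptions, the longest word returned is the candidate witness for the constant C of Theorem 6 for this particular pair of sequences; increasing `prefix_len` or `max_len` is a cheap sanity check that the set has stabilized.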
As noted at the end of Section 2, the Ω((log log n) / ) lower bound we obtain on the b-automaticity of an aperiodic a-automatic sequence is surely not optimal. Sequences with O(log n) (i.e., “low”) b-automaticity are called b-quasiautomatic in [9]. It seems unlikely that an aperiodic a-automatic sequence can even be b-quasiautomatic. Known examples of b-quasiautomatic sequences strongly resemble b-automatic sequences. For example, the fixed point of the morphism c → cba, a → aa, b → b, starting with c, is -quasiautomatic, but not -automatic [9]. Similarly, the fixed point of → , → is not -automatic [1] but is conjectured to be -quasiautomatic [9]. Curiously, this latter sequence is automatic with respect to the positional numeration system (and a certain choice of canonical representations) whose place values are given by the sequence ((2 n − ( − n ) / n ≥ [1].

We conclude by again mentioning the problem stated in the introduction of characterizing the common factors of a b-automatic sequence and an a-automatic sequence. Can the method of Krebs be applied to this problem?

Acknowledgments
We thank Jean-Paul Allouche for helpful discussions. The normalization construction of Section 2.2 was obtained in discussions with Émilie Charlier, Julien Leroy, and Michel Rigo of the University of Liège. We thank them for their help with this problem.

After we posted an initial version of this paper on the arXiv, Thijmen Krebs contacted us with a number of very helpful comments. He clarified some important points regarding his paper, and gave several suggestions which greatly simplified and improved the presentation of the normalization construction. We are very grateful for his feedback, which significantly improved the exposition in Section 2.2.
References
[1] G. Allouche, J.-P. Allouche, and J. Shallit, “Kolam indiens, dessins sur le sable aux îles Vanuatu, courbe de Sierpinski et morphismes de monoïde”, Ann. Inst. Fourier, Grenoble 56 (2006), 2115–2130.
[2] J.-P. Allouche and J. Shallit, Automatic Sequences: Theory, Applications, Generalizations, Cambridge, 2003.
[3] J. Berstel and P. Séébold, “Sturmian words”, in M. Lothaire, ed., Algebraic Combinatorics on Words, Cambridge University Press, 2002, pp. 40–97.
[4] J. Byszewski and J. Konieczny, “Automatic sequences and generalized polynomials”. Preprint available at https://arxiv.org/abs/1705.08979 .
[5] J. Byszewski and J. Konieczny, “Factors of generalised polynomials and automatic sequences”, Indag. Math. (N.S.) 29 (2018), no. 3, 981–985.
[6] A. Cobham, “Uniform tag sequences”, Math. Systems Theory 6 (1972), 164–192.
[7] M. Lothaire, Algebraic Combinatorics on Words, Cambridge, 2002.
[8] T. Krebs, “A more reasonable proof of Cobham’s Theorem”. Preprint available at https://arxiv.org/abs/1801.06704 .
[9] J. Shallit, “Automaticity IV: sequences, sets, and diversity”, J. Théorie des Nombres de Bordeaux, 8 (1996), 347–367.
[10] J. Shallit,