Lie complexity of words

Abstract

Given a finite alphabet $\Sigma$ and a right-infinite word $\mathbf{w}$ over $\Sigma$, we define the Lie complexity function $L_{\mathbf{w}} : \mathbb{N} \to \mathbb{N}$, whose value at $n$ is the number of conjugacy classes (under cyclic shift) of length-$n$ factors $x$ of $\mathbf{w}$ with the property that every element of the conjugacy class appears in $\mathbf{w}$. We show that the Lie complexity function is uniformly bounded for words with linear factor complexity, and as a result we show that words of linear factor complexity have at most finitely many primitive factors $y$ with the property that $y^n$ is again a factor for every $n$. We then look at automatic sequences and show that the Lie complexity function of a $k$-automatic sequence is again $k$-automatic.


arXiv [cs.FL]

LIE COMPLEXITY OF WORDS

JASON P. BELL AND JEFFREY SHALLIT

Contents

1. Introduction
2. Lie complexity
3. Algebraic interpretation of Lie complexity
4. Proof of Theorems 1.1 and 1.2
5. Construction
6. Automatic sequences
7. Examples
8. Concluding Remarks
References

1. Introduction

Let $\Sigma$ be a finite alphabet and let $\mathbf{w}$ be a right-infinite word over $\Sigma$. The factor complexity function $p_{\mathbf{w}} : \mathbb{N} \to \mathbb{N}$, which counts the number of factors of $\mathbf{w}$ of each length, plays a fundamental role in understanding the behaviour of $\mathbf{w}$ as a word; see, e.g., [9, 16]. (It is also called the subword complexity function.) Often, however, one wishes to understand factors of $\mathbf{w}$ of a special form (e.g., palindromes, bordered, unbordered, squarefree, repetition-free, $k$-power, etc.). To accomplish this task, one requires finer invariants that are designed to count factors of a certain form. Of course, it is generally a very difficult problem to count exactly the factors of a specific form in a given word, and so in practice one settles for invariants that are easier to compute and that give upper and lower bounds for the desired quantities.

Mathematics Subject Classification.

Key words and phrases. Combinatorics on words, automatic sequences, morphic words, linear factor complexity, Lie complexity.

Jason Bell is supported by NSERC grant 2016-03632. Jeffrey Shallit is supported by NSERC grant 2018-04118.

In this paper, we look at the problem of counting factors $y$ of a right-infinite word $\mathbf{w}$ with the property that all cyclic shifts of $y$ remain factors of $\mathbf{w}$. In particular, this includes factors $y$ with unbounded exponent (that is, factors $y$ of $\mathbf{w}$ with the property that $y^n$ is again a factor of $\mathbf{w}$ for every $n \ge 1$). When studying factors $y$ of unbounded exponent, we may restrict our attention to the case when $y$ is itself not a perfect power; that is, when $y$ is primitive. In this case, it is known that the set of primitive factors $y$ of $\mathbf{w}$ having unbounded exponent is a finite set when $\mathbf{w}$ is pure morphic (that is, a fixed point of a morphism).

Since automatic words have linear factor complexity (that is, the number of factors of length $n$ is bounded by a fixed affine function $An + B$ in $n$ for every $n$), it is natural to ask whether a similar phenomenon holds more generally for words of linear factor complexity. To answer this question, we introduce a new complexity function, the Lie complexity function, which is motivated by ideas from the theory of Lie algebras. Given a right-infinite word $\mathbf{w}$, we define its Lie complexity function, $L_{\mathbf{w}} : \mathbb{N} \to \mathbb{N}$, to be the map in which $L_{\mathbf{w}}(n)$ is equal to the number of equivalence classes $[y]$ of length-$n$ factors of $\mathbf{w}$ with the property that every cyclic permutation of $y$ is again a factor of $\mathbf{w}$. Our main theorem is the following estimate.

Theorem 1.1.

Let $\Sigma$ be a finite alphabet, let $\mathbf{w}$ be a right-infinite word over $\Sigma$, and let $L_{\mathbf{w}} : \mathbb{N} \to \mathbb{N}$ be the Lie complexity function of $\mathbf{w}$. Then for each $n \ge 1$ we have
\[ L_{\mathbf{w}}(n) \le p_{\mathbf{w}}(n) - p_{\mathbf{w}}(n-1) + 1. \]
In particular, if $\mathbf{w}$ has linear factor complexity, then $L_{\mathbf{w}}(n)$ is uniformly bounded above by a constant.

Observe that if $\mathbf{w}$ is a right-infinite word and $y$ is a primitive word such that $y^n$ is a factor of $\mathbf{w}$ for every $n$, then for every $n$, all cyclic permutations of $y^n$ are necessarily factors of $\mathbf{w}$ (each such permutation is a factor of $y^{n+1}$). Using this observation, we are able to prove the following result.

Theorem 1.2.

Let $\Sigma$ be a finite alphabet and let $\mathbf{w}$ be a right-infinite word over $\Sigma$. If $\mathbf{w}$ has linear factor complexity, then the set of primitive factors $y$ of $\mathbf{w}$ such that $y^n$ is a factor of $\mathbf{w}$ for every $n$ is a finite set.

We point out that an analogue of Theorem 1.2 was already known to hold for pure morphic sequences [15, Corollary 20]. We are also able to show that the condition that $\limsup_{n \to \infty} p_{\mathbf{w}}(n)/n$ be finite in Theorem 1.2 cannot be relaxed.

Theorem 1.3.

Let $f : \mathbb{N} \to \mathbb{N}$ be a function that tends to infinity as $n \to \infty$ and let $\Sigma$ be a finite alphabet. Then there is a right-infinite recurrent word $\mathbf{w}$ over $\Sigma$ such that $p_{\mathbf{w}}(n) \le n f(n)$ for $n$ sufficiently large and such that $\mathbf{w}$ has infinitely many distinct primitive factors $y$ with the property that $y^n$ is a factor of $\mathbf{w}$ for every $n$.

We next turn our attention to automatic words $\mathbf{w}$. If $k \ge 2$ is a positive integer, $\Delta$ is a finite set, and $f : \mathbb{N} \to \Delta$ is a $k$-automatic sequence, then we can identify $f$ with the right-infinite word $\mathbf{w} := f(0) f(1) f(2) \cdots$ over the alphabet $\Delta$. Thus it makes sense to talk about the Lie complexity function of the automatic sequence $f$, by making this identification with the word $\mathbf{w}$. Our next result shows that the Lie complexity functions of automatic words are particularly well-behaved.

Theorem 1.4.

Let $k \ge 2$ be a positive integer, let $\Delta$ be a finite set, and let $f : \mathbb{N} \to \Delta$ be a $k$-automatic sequence. Then the Lie complexity function of $f$ is again a $k$-automatic sequence.

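The outline of this paper is as follows. In §2 we define Lie complexity and relate it to factors of unbounded exponent. In §3 we give an algebraic interpretation of Lie complexity, which we use in §4 to prove Theorems 1.1 and 1.2. In §5, we give a construction which proves Theorem 1.3. In §6 we turn to automatic sequences and prove Theorem 1.4, and we give examples in §7. Finally, in §8 we give some concluding remarks.

2. Lie complexity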

Let $\Sigma$ be a finite alphabet and $\mathbf{w}$ be a right-infinite word over $\Sigma$. A factor of $\mathbf{w}$ is a finite block of contiguous symbols occurring within $\mathbf{w}$. We let $\mathrm{Fac}(\mathbf{w})$ denote the collection of factors of $\mathbf{w}$ (including the empty word).

We say that two words $v, v' \in \Sigma^*$ are cyclically equivalent, which we write $v \sim_C v'$, if $v$ and $v'$ are cyclic permutations of one another. We then let $[v]_C$ denote the equivalence class of $v$ under $\sim_C$. For example, the equivalence class of the English word tea is {tea, eat, ate}. We define the Lie complexity of $\mathbf{w}$ to be the function
\[ L_{\mathbf{w}}(n) := \#\{ [v]_C : |v| = n \text{ and } [v]_C \subseteq \mathrm{Fac}(\mathbf{w}) \}. \tag{1} \]
That is, $L_{\mathbf{w}}(n)$ counts the number of cyclic equivalence classes of length $n$ with the property that every word in the equivalence class is a factor of $\mathbf{w}$. This can be contrasted with the cyclic complexity function of Cassaigne, Fici, Sciortino, and Zamboni [11], defined as follows:
\[ c_{\mathbf{w}}(n) := \#\{ [v]_C : |v| = n \text{ and } [v]_C \cap \mathrm{Fac}(\mathbf{w}) \ne \emptyset \}, \]
that is, where "every word" in our definition of Lie complexity is replaced by "some word". Observe, in particular, that our definition gives the inequality
\[ L_{\mathbf{w}}(n) \le c_{\mathbf{w}}(n) \quad \text{for } n \ge 0. \tag{2} \]
Similarly, we have
\[ L_{\mathbf{w}}(n) \le a_{\mathbf{w}}(n) \quad \text{for } n \ge 0, \tag{3} \]
where $a_{\mathbf{w}} : \mathbb{N} \to \mathbb{N}$ is the abelian complexity function, which counts factors of length $n$ up to abelian equivalence, where $v$ and $v'$ are abelian equivalent if $v'$ can be obtained from $v$ via some permutation of the letters [20].

We now show the relation between factors $y$ of $\mathbf{w}$ of unbounded exponent in $\mathbf{w}$ and the Lie complexity function. To make this precise, we construct an equivalence relation $\sim$ on the collection of right-infinite words over $\Sigma$ in which two right-infinite words are equivalent if they have the same set of (finite) factors. We then let $\mathrm{Per}(\mathbf{w})$ denote the set of $\sim$-equivalence classes of right-infinite words of the form $v^\omega$ such that $\mathrm{Fac}(v^\omega) \subseteq \mathrm{Fac}(\mathbf{w})$. The following result is the key estimate, which will be used in proving Theorem 1.2.

Proposition 2.1.

Let $\Sigma$ be a finite alphabet and let $\mathbf{w}$ be a right-infinite word over $\Sigma$. Suppose that there is a positive number $\kappa$ such that for each positive integer $b \ge 1$, there is a positive integer $n = n(b)$ such that $L_{\mathbf{w}}(bn) \le \kappa$. Then $\#\mathrm{Per}(\mathbf{w}) \le \kappa$. In particular, if $L_{\mathbf{w}}(n)$ is uniformly bounded then $\#\mathrm{Per}(\mathbf{w})$ is finite.

Proof. Suppose that there exist distinct equivalence classes $[u_1^\omega], \ldots, [u_s^\omega]$ in $\mathrm{Per}(\mathbf{w})$ with $s > \kappa$. Pick $D$ such that $u_i^D$ is not a factor of $u_j^\omega$ whenever $i \ne j$. Let
\[ b := D\,|u_1| \cdot |u_2| \cdots |u_s|. \]
Then by construction, for each $n \ge 1$, the words $u_1^{nb/|u_1|}, \ldots, u_s^{nb/|u_s|}$ are cyclically inequivalent words of length $nb$ with the property that every cyclic permutation occurs as a factor of $\mathbf{w}$. Hence $L_{\mathbf{w}}(bn) \ge s > \kappa$ for every $n \ge 1$, which contradicts the hypothesis that $L_{\mathbf{w}}(bn)$ must be at most $\kappa$ for some positive integer $n$. The result follows. □

3. Algebraic interpretation of Lie complexity

We now give a purely algebraic interpretation of the Lie complexity function, which will be used later in proving Theorem 1.1. To do this, we introduce the factor algebra of a right-infinite word $\mathbf{w}$.

Let $\Sigma$ be a finite alphabet and let $\mathbf{w}$ be a right-infinite word over $\Sigma$. Given a field $k$, we can construct the factor $k$-algebra of $\mathbf{w}$, which we denote by $A_{\mathbf{w}}$. As a vector space, this is just all finite formal $k$-linear combinations of elements of $\mathrm{Fac}(\mathbf{w})$; that is,
\[ A_{\mathbf{w}} = \Big\{ \sum_{v \in \mathrm{Fac}(\mathbf{w})} \lambda_v v \ :\ \lambda_v \in k,\ \lambda_v = 0 \text{ for all but finitely many } v \in \mathrm{Fac}(\mathbf{w}) \Big\}, \]
with multiplication of $v, v' \in \mathrm{Fac}(\mathbf{w})$ defined by declaring that $v \cdot v'$ is the concatenation of $v$ and $v'$ if $vv'$ is again a factor of $\mathbf{w}$, and $v \cdot v'$ is zero otherwise. We can then extend the multiplication to general elements of $A_{\mathbf{w}}$ by linearity, and so
\[ \Big( \sum_{u \in \mathrm{Fac}(\mathbf{w})} \alpha_u u \Big) \Big( \sum_{v \in \mathrm{Fac}(\mathbf{w})} \beta_v v \Big) = \sum_{y \in \mathrm{Fac}(\mathbf{w})} \ \sum_{\{(u,v)\,:\,uv = y\}} \alpha_u \beta_v\, y. \]
We now introduce some notation that we will use in obtaining Theorem 1.1.

Notation 3.1. We make the following assumptions and introduce the following notation.

(1) We let $\Sigma = \{x_1, \ldots, x_d\}$ be a finite alphabet and we let $\mathbf{w}$ be a right-infinite word over $\Sigma$.
(2) We let $A_{\mathbf{w}}$ be the factor algebra of $\mathbf{w}$ with base field $k = \mathbb{Q}$.
(3) We let $V_n$ denote the subspace of $A_{\mathbf{w}}$ spanned by the images of factors of $\mathbf{w}$ of length $n$.
(4) We let $W_n$ denote the subspace of $V_n$ spanned by elements of the form $ab - ba$, where $a, b \in \mathrm{Fac}(\mathbf{w})$ with $|a| + |b| = n$.

Notice that since $V_n$ has a basis given by factors of $\mathbf{w}$ of length $n$, we have
\[ p_{\mathbf{w}}(n) = \dim(V_n), \tag{4} \]
where we are taking the dimension as a $\mathbb{Q}$-vector space.

One important remark is that if we adopt the notation from Notation 3.1 and we let $x = x_1 + \cdots + x_d \in V_1$, then $x^n$ is the sum of all $n$-fold concatenations of $x_1, \ldots, x_d$. Each such concatenation will either be $0$ in the factor algebra or will be equal to a factor of $\mathbf{w}$ of length $n$; moreover, each factor of $\mathbf{w}$ of length $n$ can be realized as a unique concatenation of length $n$ of these elements. Thus, when we work in the factor algebra, we have the formula
\[ (x_1 + \cdots + x_d)^n = \sum_{\{v \in \mathrm{Fac}(\mathbf{w})\,:\,|v| = n\}} v. \tag{5} \]
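The factor algebra and Equation (5) can be illustrated computationally. The following Python sketch is not part of the paper. It represents an element of the factor algebra as a dictionary from factors to rational coefficients and, as an assumption made purely for illustration, replaces the full factor set by the factors of a finite prefix of the Thue-Morse word. The helper names `thue_morse_prefix`, `factors`, `multiply`, and `power_of_letter_sum` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def thue_morse_prefix(length):
    # t[n] is the parity of the number of 1 bits in the binary expansion of n.
    return "".join(str(bin(n).count("1") % 2) for n in range(length))


def factors(word, max_len):
    # All factors of `word` of length at most max_len, including the empty word.
    fac = {""}
    for n in range(1, max_len + 1):
        for i in range(len(word) - n + 1):
            fac.add(word[i:i + n])
    return fac


def multiply(p, q, fac):
    # Product in the factor algebra: u . v is the concatenation uv when uv is
    # a factor, and zero otherwise; the product is extended bilinearly.
    out = defaultdict(Fraction)
    for u, a in p.items():
        for v, b in q.items():
            if u + v in fac:
                out[u + v] += a * b
    return {w: c for w, c in out.items() if c != 0}


def power_of_letter_sum(alphabet, n, fac):
    # Computes (x_1 + ... + x_d)^n, where the x_i are the letters.
    x = {a: Fraction(1) for a in alphabet}
    result = {"": Fraction(1)}  # the empty word is the identity element
    for _ in range(n):
        result = multiply(result, x, fac)
    return result


if __name__ == "__main__":
    fac = factors(thue_morse_prefix(256), 6)
    # 000 is not a factor of the Thue-Morse word, so 00 . 0 vanishes.
    print(multiply({"00": Fraction(1)}, {"0": Fraction(1)}, fac))
    # The support of x^3 is exactly the set of length-3 factors.
    print(sorted(power_of_letter_sum("01", 3, fac)))
```

Under the prefix assumption, only lengths up to the chosen `max_len` are meaningful, since longer products are truncated to zero.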

Lemma 3.2. Adopt the assumptions and notation from Notation 3.1. Then
\[ L_{\mathbf{w}}(n) = \dim(V_n) - \dim(W_n). \]

Proof. We fix $n$ and let $m$ denote the dimension of the quotient space $V_n / W_n$. Let $u_1 + W_n, \ldots, u_m + W_n$ be a basis for $V_n / W_n$ consisting of $W_n$-cosets of factors $u_1, \ldots, u_m$ of $\mathbf{w}$ of length $n$. Observe that every cyclic permutation of $u_i$ must be a factor of $\mathbf{w}$, since otherwise we could find words $a$ and $b$ such that $u_i = ab$ and such that $ba$ is not a factor of $\mathbf{w}$. But this would give that $ba = 0$ and $ab = u_i$ in $A_{\mathbf{w}}$, and so we would have $ab - ba = u_i$, which would mean that $u_i \in W_n$, which is a contradiction, since $u_i + W_n$ is part of a basis for $V_n / W_n$.

Furthermore, the $u_i$ must be cyclically inequivalent, since if there were $i$ and $j$ with $i \ne j$ such that $u_j$ were a cyclic permutation of $u_i$, we could again write $u_i = ab$ and $u_j = ba$ and we would have $u_i - u_j = ab - ba \in W_n$, which again would contradict the independence of $u_1, \ldots, u_m$ mod $W_n$.

Thus $u_1, \ldots, u_m$ are cyclically inequivalent words such that $[u_1]_C, \ldots, [u_m]_C$ are all contained in $\mathrm{Fac}(\mathbf{w})$, and so
\[ L_{\mathbf{w}}(n) \ge m = \dim(V_n / W_n). \]

Now we show that $L_{\mathbf{w}}(n) \le \dim(V_n / W_n)$. Observe that if $L_{\mathbf{w}}(n)$ is strictly greater than $\dim(V_n / W_n)$, then there must exist some word $u_{m+1} \in \mathrm{Fac}(\mathbf{w})$ of length $n$ such that every cyclic permutation of $u_{m+1}$ is also a factor of $\mathbf{w}$ and such that $u_{m+1}$ is not cyclically equivalent to $u_i$ for $i = 1, \ldots, m$.

Since $u_1 + W_n, \ldots, u_m + W_n$ form a basis for $V_n / W_n$, there exist rational constants $\alpha_1, \ldots, \alpha_m$ such that $u_{m+1} - \sum_{i=1}^m \alpha_i u_i \in W_n$. Then by definition of $W_n$ there are words $a_1, \ldots, a_s, b_1, \ldots, b_s$ and rational constants $\beta_1, \ldots, \beta_s$ such that
\[ u_{m+1} - \sum_{i=1}^m \alpha_i u_i = \sum_{i=1}^s \beta_i (a_i b_i - b_i a_i) \tag{6} \]
in the factor algebra $A_{\mathbf{w}}$.

We let $U$ denote the subspace of $V_n$ spanned by images of words that are cyclically equivalent to $u_{m+1}$ and we define a linear map $\pi : V_n \to U$. Since $V_n$ has a basis consisting of factors of $\mathbf{w}$ of length $n$, it suffices to define $\pi$ on such factors and then extend linearly. For a factor $u$ of $\mathbf{w}$ of length $n$, we define $\pi(u) = u$ if $u \sim_C u_{m+1}$ and $\pi(u) = 0$ otherwise.

Then since $u_1, \ldots, u_m, u_{m+1}$ are pairwise cyclically inequivalent, the left side of Equation (6) is sent to $u_{m+1}$ by the map $\pi$; the right side, however, is sent to an element of $W_n$, since for each $i$, either $a_i b_i$ and $b_i a_i$ are both cyclically equivalent to $u_{m+1}$ or neither $a_i b_i$ nor $b_i a_i$ is cyclically equivalent to $u_{m+1}$. It follows that $u_{m+1} \in W_n$.

Thus $u_{m+1}$ is a $\mathbb{Q}$-linear combination of elements of the form $ab - ba$ with each $ab$ and $ba$ cyclic permutations of $u_{m+1}$. But by assumption, each cyclic permutation of $u_{m+1}$ is in $\mathrm{Fac}(\mathbf{w})$, and so if we let $T : U \to \mathbb{Q}$ be the linear map uniquely defined by sending $u$ to $1$ for each cyclic permutation $u$ of $u_{m+1}$, we see that $T \circ \pi$ sends the right side of Equation (6) to zero and $T \circ \pi(u_{m+1}) = 1$, a contradiction. Thus we obtain the reverse inequality and so
\[ L_{\mathbf{w}}(n) = \dim(V_n / W_n) = \dim(V_n) - \dim(W_n). \]
□
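Lemma 3.2 lends itself to a small numerical check. The sketch below is an illustration only, not the authors' code: it builds the commutator vectors $ab - ba$ over $\mathbb{Q}$ using Python's `fractions.Fraction`, computes $\dim(W_n)$ by Gaussian elimination, and compares $\dim(V_n) - \dim(W_n)$ with a direct count of cyclic classes. As an assumption, the factor set of the infinite word is approximated by the factors of a Thue-Morse prefix, and all function names are hypothetical.

```python
from fractions import Fraction


def thue_morse_prefix(length):
    return "".join(str(bin(n).count("1") % 2) for n in range(length))


def factors(word, max_len):
    # All factors of length at most max_len, including the empty word.
    return {word[i:i + n] for n in range(max_len + 1)
            for i in range(len(word) - n + 1)}


def lie_complexity(fac, n):
    # Number of cyclic classes of length-n words lying entirely inside fac.
    seen, count = set(), 0
    for v in (f for f in fac if len(f) == n):
        if v in seen:
            continue
        rotations = {v[i:] + v[:i] for i in range(max(n, 1))}
        seen |= rotations
        if rotations <= fac:
            count += 1
    return count


def rank(rows):
    # Rank over Q by forward Gaussian elimination on lists of Fractions.
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def dim_W(fac, n):
    # W_n is spanned by ab - ba, with a, b factors and |a| + |b| = n;
    # a product that is not a factor is zero in the factor algebra.
    basis = sorted(v for v in fac if len(v) == n)
    index = {v: k for k, v in enumerate(basis)}
    vectors = set()
    for a in fac:
        if not 0 < len(a) < n:
            continue
        for b in fac:
            if len(a) + len(b) != n:
                continue
            vec = [0] * len(basis)
            if a + b in index:
                vec[index[a + b]] += 1
            if b + a in index:
                vec[index[b + a]] -= 1
            if any(vec):
                vectors.add(tuple(vec))
    return rank([[Fraction(c) for c in vec] for vec in vectors])


if __name__ == "__main__":
    fac = factors(thue_morse_prefix(512), 8)
    for n in range(9):
        dim_V = sum(1 for v in fac if len(v) == n)
        print(n, lie_complexity(fac, n), dim_V - dim_W(fac, n))
```

The two printed columns should agree for each $n$, because the proof of Lemma 3.2 only uses the fact that the set of factors is closed under taking factors.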

Remark 3.3. Notice that $W_n$ is spanned by commutators $ab - ba$, and that the algebra $A_{\mathbf{w}}$ becomes a Lie algebra when endowed with the bracket $[a, b] := ab - ba$. It is this fact, together with Lemma 3.2, that motivates the name Lie complexity for the function $L_{\mathbf{w}}$.

4. Proof of Theorems 1.1 and 1.2

We can now use the algebraic framework from the preceding section to prove Theorem 1.1. Our proof adapts an argument with Lie brackets from [4].
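Before turning to the proof, the inequality of Theorem 1.1 can be sanity-checked on an example. The following Python sketch is an illustration under stated assumptions, not part of the paper: it uses a long prefix of the Fibonacci word (the fixed point of the morphism sending 0 to 01 and 1 to 0) in place of the infinite word, which is adequate only for lengths $n$ that are small compared with the prefix. The names `fibonacci_prefix`, `lie_complexity`, and `check_bound` are hypothetical.

```python
def fibonacci_prefix(length):
    # Iterate the morphism 0 -> 01, 1 -> 0 starting from 0; each iterate is a
    # prefix of the Fibonacci word.
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def lie_complexity(fac_n, n):
    # Number of cyclic classes of length-n words lying entirely inside fac_n.
    seen, count = set(), 0
    for v in fac_n:
        if v in seen:
            continue
        rotations = {v[i:] + v[:i] for i in range(max(n, 1))}
        seen |= rotations
        if rotations <= fac_n:
            count += 1
    return count


def check_bound(word, max_len):
    # Returns rows (n, p(n), L(n), p(n) - p(n-1) + 1) for n = 1, ..., max_len.
    rows = []
    previous = 1  # p(0) = 1: the empty word is the only factor of length 0
    for n in range(1, max_len + 1):
        fac_n = {word[i:i + n] for i in range(len(word) - n + 1)}
        p = len(fac_n)
        rows.append((n, p, lie_complexity(fac_n, n), p - previous + 1))
        previous = p
    return rows


if __name__ == "__main__":
    for n, p, lie, bound in check_bound(fibonacci_prefix(3000), 20):
        print(n, p, lie, bound, lie <= bound)
```

Since the Fibonacci word is Sturmian, its factor complexity is $n + 1$, so the bound in Theorem 1.1 evaluates to $2$ for every $n \ge 1$ in this example.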

Proof of Theorem 1.1. Let $x = \sum_{i=1}^d x_i \in V_1$. We have a linear map $\Phi_n : V_n \to W_{n+1}$ defined by
\[ u \mapsto ux - xu = \sum_{i=1}^d (u x_i - x_i u) \]
for $u \in V_n$. By construction, $\Phi_n$ sends a factor of $\mathbf{w}$ of length $n$ into the space $W_{n+1}$, and so $\Phi_n$ does indeed map into $W_{n+1}$.

Recall from Equation (5) that
\[ x^n = \sum_{\{u \in \mathrm{Fac}(\mathbf{w})\,:\,|u| = n\}} u. \]
We claim that the kernel of $\Phi_n$ is spanned by $x^n$. To see this, observe that $\Phi_n(x^n) = x^n \cdot x - x \cdot x^n = 0$ and so $x^n$ is in the kernel of $\Phi_n$.

Suppose that there is
\[ z := \sum_{\{u \in \mathrm{Fac}(\mathbf{w})\,:\,|u| = n\}} \alpha_u u \in \ker(\Phi_n) \]
with $z$ not in the span of $x^n$. Then we can replace $z$ by $z - \alpha x^n$ for some $\alpha \in \mathbb{Q}$ and assume that there is some factor $u$ of $\mathbf{w}$ of length $n$ with $\alpha_u = 0$ but that $z$ is nonzero. Since $z$ is nonzero, there is some factor $v$ of $\mathbf{w}$ of length $n$ such that $\alpha_v \ne 0$.

Then since $v$ and $u$ are both factors of $\mathbf{w}$, there is some factor $y_{u,v}$ of $\mathbf{w}$ that either has $v$ as a prefix and $u$ as a suffix, or has $u$ as a prefix and $v$ as a suffix. Among all factors $y$ of $\mathbf{w}$ having the property that $u$ is either a prefix or suffix and having the property that some word $v'$ of length $n$ with $\alpha_{v'} \ne 0$ is either a prefix or a suffix, we pick one, $y$, of shortest possible length.

By symmetry, it suffices to consider the case when $u$ is a prefix of $y$, and we let $v'$, with $\alpha_{v'} \ne 0$, denote the suffix of $y$ of length $n$. Then
\[ y = ua = bv' \tag{7} \]
for some words $a$ and $b$. Since $\alpha_u = 0 \ne \alpha_{v'}$, we see that $|a| = |b| \ge 1$. Let $j$ be such that $x_j$ is the last letter of $b$ and write $b = b' x_j$. By assumption $\Phi_n(z) = 0$ and so
\[ \sum_{\{s \in \mathrm{Fac}(\mathbf{w})\,:\,|s| = n\}} \ \sum_{i=1}^d \alpha_s (s x_i - x_i s) = 0. \tag{8} \]
We now consider the coefficient of $x_j v'$ in both sides of Equation (8). The coefficient in the right side is equal to zero. On the other hand, we have
\[ x_j v' = v'' x_k \tag{9} \]
for some $k \in \{1, \ldots, d\}$ and some word $v''$ of length $n$, and so the coefficient of $x_j v'$ in the left side of Equation (8) is $-\alpha_{v'} + \alpha_{v''}$, since the only contribution from the terms $\sum_{i=1}^d \alpha_s s x_i$ occurs when $i = k$ and $s = v''$, and the only contribution from the terms $\sum_{i=1}^d -\alpha_s x_i s$ occurs when $i = j$ and $s = v'$.

Hence $0 = -\alpha_{v'} + \alpha_{v''}$ and so in particular $\alpha_{v''}$ is nonzero. Then from Equations (7) and (9) and the fact that $b = b' x_j$, we see
\[ ua = bv' = b' x_j v' = b' v'' x_k. \]
Thus $x_k$ is the last letter of $a$ and so $a = a' x_k$ for some word $a'$ with $|a'| < |a|$. But now
\[ y' := ua' = b' v'' \]
has the property that $u$ is a prefix, $v''$ is a suffix, $\alpha_{v''} \ne 0$, and $|y'| < |y|$, which contradicts the minimality of $|y|$. It follows that the kernel of $\Phi_n$ is spanned by $x^n$, and since $x^n$ is nonzero in the factor algebra, the kernel of $\Phi_n$ is exactly one-dimensional.

Then the rank-nullity theorem for linear maps gives that
\[ \dim(V_n) = \dim(\ker(\Phi_n)) + \dim(\mathrm{Im}(\Phi_n)). \]
Since we have now established that $\dim(\ker(\Phi_n)) = 1$ and since the image of $\Phi_n$ is a subspace of $W_{n+1}$, we have in fact that
\[ \dim(V_n) \le \dim(W_{n+1}) + 1, \]
or, equivalently, $\dim(W_{n+1}) \ge \dim(V_n) - 1$. Consequently, Lemma 3.2 gives
\[ L_{\mathbf{w}}(n+1) = \dim(V_{n+1}) - \dim(W_{n+1}) \le \dim(V_{n+1}) - (\dim(V_n) - 1). \]
Using Equation (4), we then see that
\[ L_{\mathbf{w}}(n+1) \le p_{\mathbf{w}}(n+1) - p_{\mathbf{w}}(n) + 1, \]
and so we obtain the desired inequality. When $\mathbf{w}$ has linear factor complexity, a deep result of Cassaigne [10] shows that $p_{\mathbf{w}}(n+1) - p_{\mathbf{w}}(n)$ is uniformly bounded above by a constant, which then gives that $L_{\mathbf{w}}(n)$ is similarly bounded, and so the proof is complete. □

We now prove Theorem 1.2.

Proof of Theorem 1.2. By Theorem 1.1, $L_{\mathbf{w}}(n)$ is uniformly bounded when $\mathbf{w}$ has linear factor complexity. Proposition 2.1 then gives that $\#\mathrm{Per}(\mathbf{w})$ is finite. Since there are only finitely many primitive words $y'$ such that $(y')^\omega$ has the same set of factors as a fixed periodic right-infinite word, we then obtain the desired result. □

5. Construction

In this section, we give a construction that proves Theorem 1.3. We note that we make use of a similar construction given by the first author and Smoktunowicz [6] in the context of monomial algebras, which we sharpen slightly.


Proof. Let $f : \mathbb{N} \to \mathbb{N}$ be a function that tends to infinity. We shall construct a recurrent binary word $\mathbf{w}$ whose factor complexity function is bounded above by $n f(n)$ for $n$ sufficiently large and such that $\mathrm{Per}(\mathbf{w})$ is infinite.

First observe that by replacing $f(n)$ by $\min(f(j) : j \ge n)$, we may assume that $f$ is weakly increasing. Then for each $j$ there is some largest natural number $m_j$ such that $f(m_j) \le 19j$. Then since $f(n)$ is weakly increasing and tends to infinity, we see that the $m_j$ are weakly increasing and tend to infinity.

To begin, we let $\mathbf{f}$ be the Fibonacci word, which is a Sturmian word, and hence has complexity function $p_{\mathbf{f}}(n) = n + 1$. We fix a prefix $u_1$ of $\mathbf{f}$ of length $2^{m_1}$. Since $\mathbf{f}$ is uniformly recurrent, there are infinitely many occurrences of $u_1$ in $\mathbf{f}$, and hence we can find a prefix $u_2$ of $\mathbf{f}$ of length at least $2^{m_2 + m_1}$ such that $u_1$ is a suffix of $u_2$. We let $d_2$ denote the length of $u_2$. In general, for each $i$ there is a prefix $u_i$ of $\mathbf{f}$ such that $u_{i-1}$ is a suffix and such that $|u_i| > 2^{m_i} |u_{i-1}|$, and we let $d_i$ denote the length of $u_i$. Then $d_i$ is at least $2^{m_i + \cdots + m_1}$. We define $a_{i,j} = \lceil |u_i| / |u_j| \rceil$ for $i, j \ge 1$. We define
\[ v_n = u_n\, u_{n-1}^{a_{n,n-1}} \cdots u_2^{a_{n,2}}\, u_1^{a_{n,1}}\, u_2^{a_{n,2}} \cdots u_{n-1}^{a_{n,n-1}}\, u_n \tag{10} \]
and we define a sequence of words $s_n$ with $s_1 = u_1$ and, for $n \ge 2$,
\[ s_n = s_{n-1} v_n s_{n-1} v_n. \tag{11} \]
Since each $s_i$ is a prefix of $s_{i+1}$, we can define the right-infinite word
\[ \mathbf{w} = \lim_n s_n, \tag{12} \]
and since every factor of $\mathbf{w}$ appears in some prefix $s_i$ and since $s_{i+1} = s_i v_{i+1} s_i v_{i+1}$, we see that $\mathbf{w}$ is recurrent.

This construction first appears in work of the first author and Smoktunowicz [6, §4] (but in that paper the authors used $W$ for $\mathbf{f}$, $W_i$ for the prefixes $u_i$, $V_n$ for the factors $v_n$, $U_n$ for the factors $s_n$, and $U$ for the word $\mathbf{w}$).

Let $n$ be a natural number that is larger than $|u_1|$. Then there is a unique $d$ such that $|u_d| \le n < |u_{d+1}|$. Since $|u_d| \ge 2^{m_d + \cdots + m_1}$, we see that $n \ge 2^{m_d}$. Then we may write
\[ \mathbf{w} = (s_d v_{d+1} s_d v_{d+1})\, v_{d+2}\, (s_d v_{d+1} s_d v_{d+1})\, v_{d+2} \cdots \]
and since $|v_j| > n$ for $j > d$, a factor of $\mathbf{w}$ of length $n$ is either:

(1) a factor of some word of the form $v_j s_d v_k$ with $j, k > d$; or
(2) a factor of length $n$ of some $v_i v_{i+1}$ with $i \ge d + 1$ that overlaps with both a suffix of $v_i$ and a prefix of $v_{i+1}$.

Then since $u_{d+1}$ is both a prefix and suffix of $v_i$ for $i \ge d + 1$ and since $|u_{d+1}| > n$, we see that every factor of $v_j s_d v_k$, with $j, k > d$, of length $n$ is either a factor of $u_{d+1} s_d u_{d+1}$ or a factor of $v_j$ for some $j$. Similarly, a factor of $v_i v_{i+1}$ with $i \ge d + 1$ that overlaps with both a suffix of $v_i$ and a prefix of $v_{i+1}$ must be a factor of $u_{d+1}^2$ that overlaps with both copies of $u_{d+1}$.

We now consider these three types of factors on a case-by-case basis. A factor of $u_{d+1} s_d u_{d+1}$ of length $n$ is either a factor of $u_{d+1}$, or it must overlap with $s_d$. Since $u_{d+1}$ is a factor of a Sturmian word, there are at most $n + 1$ distinct factors of $u_{d+1}$ of length $n$; there are at most $n - 1 + |s_d|$ ways of choosing a factor of $u_{d+1} s_d u_{d+1}$ of length $n$ that overlaps with $s_d$. Thus we see that there are at most $2n + |s_d|$ factors of $u_{d+1} s_d u_{d+1}$ of length $n$.

There are $n - 1$ factors of $u_{d+1}^2$ of length $n$ that overlap with both copies of $u_{d+1}$. Thus we see that factors of $u_{d+1}^2$ that overlap with both copies of $u_{d+1}$ contribute at most $n - 1$ factors.

Finally, there are at most $12dn$ factors of length $n$ of some $v_j$ [6, Lemma 4.4], and so we see that the total number of factors of $\mathbf{w}$ of length $n$ is at most
\[ 3n - 1 + |s_d| + 12dn. \]
Now [6, Equation (4.8)] gives that $|s_d| \le 4d\,|u_d| \le 4dn$, and so the number of factors of $\mathbf{w}$ of length $n$ is at most
\[ 3n - 1 + 4dn + 12dn \le 19nd. \]
We have $n \ge 2^{m_d} > m_d$, and since $f(j) > 19d$ for $j > m_d$, we see that $19nd \le n f(n)$, and so $p_{\mathbf{w}}(n) \le n f(n)$ for $n \ge |u_1|$, which gives the desired bound on the factor complexity of $\mathbf{w}$.

Finally, observe that for a fixed $i$, the word $u_i^{a_{n,i}}$ appears as a factor of $v_n$ and hence as a factor of $\mathbf{w}$. Since $a_{n,i} \ge |u_n| / |u_i| \to \infty$ as $n \to \infty$, we see that arbitrarily large powers of $u_i$ appear as factors of $\mathbf{w}$. Now for each $i$, there is some primitive word $y_i$ such that $u_i = y_i^{e_i}$ for some $e_i \ge 1$. Since $u_i$ is a factor of the Fibonacci word and since the Fibonacci word is 4th-power free [17], we see that $e_i \in \{1, 2, 3\}$ for every $i$. Hence $|y_i| \to \infty$ as $i \to \infty$, and so we have infinitely many primitive words $y$ such that $y^n$ is a factor of $\mathbf{w}$ for every $n$. □

6. Automatic sequences

A sequence $\mathbf{s} = (s_n)_{n \ge 0}$ is $k$-automatic if there exists a finite automaton that, on input the base-$k$ representation of $n$, computes $s_n$ (by arriving at a state whose output is $s_n$). We have the following result [12]:

Theorem 6.1.

Let $\mathbf{s}$ be a $k$-automatic sequence.

(a) There is an algorithm that, given a well-formed first-order logical formula $\varphi$ in $\mathrm{FO}(\mathbb{N}, +, 0, 1, n \to \mathbf{s}[n])$ having no free variables, decides if $\varphi$ is true or false.
(b) Furthermore, if $\varphi$ has free variables, then the algorithm constructs an automaton recognizing the representation of the values of those variables for which $\varphi$ evaluates to true.

A sequence $(a_n)_{n \ge 0}$ taking values in $\mathbb{Z}$ is $k$-regular if there is a linear representation for it, that is, a row vector $v$, a column vector $w$, and a matrix-valued morphism $\zeta : \{0, 1, \ldots, k-1\}^* \to \mathbb{Z}^{d \times d}$ such that $a_n = v \cdot \zeta(x) \cdot w$, where $x$ is the base-$k$ representation of $n$. If $A$ is an automaton accepting the base-$k$ representation of pairs $(i, n)$ in parallel, then the sequence $a_n = \#\{ i : A \text{ accepts } (i, n) \}$ is $k$-regular, and furthermore the matrices $\zeta(a)$ in the linear representation for $(a_n)$ have non-negative integer entries [12]. A $k$-regular sequence taking only finitely many values is $k$-automatic [1, Thm. 16.1.5], and the automaton can be algorithmically produced from the linear representation because the entries of $\zeta(a)$ are in $\mathbb{N}$.

In this section we prove Theorem 1.4: if $\mathbf{w}$ is a $k$-automatic sequence, then the sequence $(L_{\mathbf{w}}(n))_{n \ge 0}$ is also $k$-automatic.

Proof.

We will show that the sequence $(L_{\mathbf{w}}(n))_{n \ge 0}$ is $k$-regular. Since automatic sequences have linear factor complexity [1, Thm. 10.3.1], it follows from Theorem 1.1 that $(L_{\mathbf{w}}(n))_{n \ge 0}$ is bounded, and hence automatic.

We construct a linear representation for $(L_{\mathbf{w}}(n))_{n \ge 0}$ by constructing a first-order logical formula $\mathrm{lie}(i, n)$ for the pairs $(i, n)$ such that

(a) all of the cyclic shifts of $\mathbf{w}[i..i+n-1]$ appear in $\mathbf{w}$;
(b) $\mathbf{w}[i..i+n-1]$ is the lexicographically least of all its cyclic shifts appearing in $\mathbf{w}$; and
(c) $\mathbf{w}[i..i+n-1]$ is the first occurrence of this particular factor.

Then the number of $i$ making $\mathrm{lie}(i, n)$ true equals $L_{\mathbf{w}}(n)$.

We do this in a number of steps:

• $\mathrm{factoreq}(i, j, n)$ asserts that the length-$n$ factor $\mathbf{w}[i..i+n-1]$ equals $\mathbf{w}[j..j+n-1]$.
• $\mathrm{shift}(i, j, n, t)$ asserts that $\mathbf{w}[i..i+n-1]$ is the shift, by $t$ positions, of the factor $\mathbf{w}[j..j+n-1]$.
• $\mathrm{conj}(i, j, n)$ asserts that the factor $\mathbf{w}[i..i+n-1]$ is a cyclic shift of $\mathbf{w}[j..j+n-1]$.
• $\mathrm{lessthan}(i, j, n)$ asserts that the factor $\mathbf{w}[i..i+n-1]$ is lexicographically smaller than $\mathbf{w}[j..j+n-1]$.
• $\mathrm{lessthaneq}(i, j, n)$ asserts that the factor $\mathbf{w}[i..i+n-1]$ is lexicographically $\le$ the factor $\mathbf{w}[j..j+n-1]$.
• $\mathrm{allconj}(i, n)$ asserts that all cyclic shifts of $\mathbf{w}[i..i+n-1]$ appear as factors of $\mathbf{w}$.
• $\mathrm{lexleast}(i, n)$ asserts that $\mathbf{w}[i..i+n-1]$ is lexicographically least among all its cyclic shifts that actually appear in $\mathbf{w}$.
• $\mathrm{lie}(i, n)$ asserts that all cyclic shifts of $\mathbf{w}[i..i+n-1]$ appear in $\mathbf{w}$, that $\mathbf{w}[i..i+n-1]$ is the lexicographically least cyclic shift, and that $\mathbf{w}[i..i+n-1]$ is the first occurrence of this factor in $\mathbf{w}$.

Here are the definitions of the formulas. Recall that the domain of all variables is $\mathbb{N} = \{0, 1, \ldots\}$.

\begin{align*}
\mathrm{factoreq}(i, j, n) &:= \forall u, v\ (i + v = j + u \wedge u \ge i \wedge u < i + n) \implies \mathbf{w}[u] = \mathbf{w}[v] \\
\mathrm{shift}(i, j, n, t) &:= \mathrm{factoreq}(j, i + t, n - t) \wedge \mathrm{factoreq}(i, (j + n) - t, t) \\
\mathrm{conj}(i, j, n) &:= \exists t\ (t \le n) \wedge \mathrm{shift}(i, j, n, t) \\
\mathrm{lessthan}(i, j, n) &:= \exists t\ (t < n) \wedge \mathrm{factoreq}(i, j, t) \wedge \mathbf{w}[i + t] < \mathbf{w}[j + t] \\
\mathrm{lessthaneq}(i, j, n) &:= \mathrm{lessthan}(i, j, n) \vee \mathrm{factoreq}(i, j, n) \\
\mathrm{allconj}(i, n) &:= \forall t\ (t \le n) \implies \exists j\ \mathrm{shift}(i, j, n, t) \\
\mathrm{lexleast}(i, n) &:= \forall j\ \mathrm{conj}(i, j, n) \implies \mathrm{lessthaneq}(i, j, n) \\
\mathrm{lie}(i, n) &:= \mathrm{allconj}(i, n) \wedge \mathrm{lexleast}(i, n) \wedge (\forall j\ \mathrm{factoreq}(i, j, n) \implies (j \ge i))
\end{align*}

From the remarks preceding the proof, we are now done. □

Remark 6.2.

Most of the logical formulas should be self-explanatory, with one exception: in order to specify $\mathrm{allconj}$, why do we use shifts of length $0, 1, \ldots, n$? It is because we want the formula to work even in the case of the empty word.

Corollary 6.3.

Given an automatic sequence $\mathbf{w}$, the quantity $\sup_{n \ge 0} L_{\mathbf{w}}(n)$ is computable.

Remark 6.4.

Theorem 1.4 and Corollary 6.3 also hold for automata based on other kinds of numeration systems, such as Fibonacci numeration [13], Tribonacci numeration [19], and Ostrowski numeration systems [3].

7. Examples

Using the free software Walnut [18], we can implement the algorithm of the previous section to find automata and closed-form expressions for $L_{\mathbf{w}}(n)$ for some classical words of interest.
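Independently of Walnut, the predicates of Section 6 can be transcribed as brute-force checks on a finite prefix. The Python sketch below is only an illustration and is not the automaton-based algorithm: every quantifier ranges over positions of the prefix, so the counts agree with those of the infinite word only when the prefix is long enough to contain all factors of the lengths considered. The lexicographic test uses Python string comparison in place of the formula for $\mathrm{lessthaneq}$, and the function names are hypothetical.

```python
def thue_morse_prefix(length):
    # t[n] is the parity of the number of 1 bits in the binary expansion of n.
    return "".join(str(bin(n).count("1") % 2) for n in range(length))


def lie_count(w, n):
    # Number of positions i in the prefix w for which lie(i, n) holds.
    N = len(w)
    positions = range(N - n + 1)

    def factoreq(i, j, m):
        # w[i..i+m-1] equals w[j..j+m-1], with both windows inside the prefix.
        return i + m <= N and j + m <= N and w[i:i + m] == w[j:j + m]

    def shift(i, j, t):
        # w[i..i+n-1] = AB and w[j..j+n-1] = BA with |A| = t.
        return factoreq(j, i + t, n - t) and factoreq(i, j + n - t, t)

    def conj(i, j):
        return any(shift(i, j, t) for t in range(n + 1))

    def lessthaneq(i, j):
        # Equal-length strings compare lexicographically in Python.
        return w[i:i + n] <= w[j:j + n]

    def allconj(i):
        return all(any(shift(i, j, t) for j in positions) for t in range(n + 1))

    def lexleast(i):
        return all(lessthaneq(i, j) for j in positions if conj(i, j))

    def first_occurrence(i):
        return all(j >= i for j in positions if factoreq(i, j, n))

    return sum(1 for i in positions
               if allconj(i) and lexleast(i) and first_occurrence(i))


if __name__ == "__main__":
    w = thue_morse_prefix(64)
    print([lie_count(w, n) for n in range(5)])
```

The running time is polynomial in the prefix length but grows quickly, so this transcription is useful only for spot checks at small $n$.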

Example 7.1.

Let $\mathbf{t}$ be the Thue-Morse word, the fixed point of the morphism $\mu$ sending 0 to 01 and 1 to 10. Then

$L_{\mathbf{t}}(n) =$ { , if $n = 0$ or $n = 2^k$ for $k \ge$ ; , if $n = 1$, $n = 3 \cdot 2^k$ for $k \ge$ ; , if $n = 2$; $0$, otherwise.

To some extent this is not surprising, since we know that the only squares in $\mathbf{t}$ are of length $2^k$ or $3 \cdot 2^k$. However, $L_{\mathbf{w}}(n)$ can be nonzero even if a sequence has no squares, as the following example shows.

Example 7.2.

Let $\mathbf{vtm}$ be the variant of the Thue-Morse word defined over a ternary alphabet, the fixed point of the morphism sending 2 to 210, 1 to 20, and 0 to 1. It is well-known that $\mathbf{vtm}$ is squarefree [7]. Then

$L_{\mathbf{vtm}}(n) =$ { , if $n = 0$ or $n = 2^k$ for $k \ge$ ; , if $n = 3 \cdot 2^k$ for $k \ge$ ; , if $n = 1$, ; , otherwise.

Example 7.3.

Let us look at an example in a different base, and where there are factors of unbounded exponent. Let $\mathbf{c} = 101000101 \cdots$ be the Cantor sequence, which is the fixed point of the morphism $1 \to 101$ and $0 \to 000$. Then

$L_{\mathbf{c}}(n) =$ { , if $n = 4$; 2, if $n = 0$, , · $k$ for $k \ge$ ; , otherwise.

Example 7.4.

Let $\mathbf{f}$ be the Fibonacci word, the fixed point of the morphism sending 0 to 01 and 1 to 0. Define the Fibonacci numbers by $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for $n \ge 2$. Then

$L_{\mathbf{f}}(n) =$ { , if $n = 0$ or $n = F_k$ for $k \ge$ $n = F_k + F_{k-}$ for $k \ge$ ; , if $n = 1$, ; , otherwise.

Example 7.5.

Let $\mathbf{TR}$ be the Tribonacci word, the fixed point of the morphism sending 0 to 01, 1 to 02, and 2 to 0. Define the Tribonacci numbers by $T_0 = 0$, $T_1 = 1$, $T_2 = 1$, and $T_n = T_{n-1} + T_{n-2} + T_{n-3}$ for $n \ge 3$. Then

$L_{\mathbf{TR}}(n) =$ { , if $n = 0$ or $n = T_k$ for $k \ge$ $n = T_k + T_{k-}$ for $k \ge$ $n = T_k + T_{k-}$ for $k \ge$ ; , if $n = 4$; 3, if $n = 1$, ; , otherwise.

Finally, we give an example where $L_{\mathbf{w}}(n) = 0$ for $n \ge$

Example 7.6.

Let $\Sigma = \{ x , \ldots , x , y , \ldots , y \}$ and let $\Phi : \Sigma^* \to \Sigma^*$ be the morphism given by

x x x y y x x x y y x x x y y x x x y y x x x y y x x x y y y x x y y y x x y y y x x y y y x x y y y x x y y y x x y y .

and let $\mathbf{w} = \Phi^\omega(x)$. Then $\mathbf{w}$ is 2-automatic and Lemma 6.1 of [5] shows that $L_{\mathbf{w}}(n) = 0$ for $n \ge$ $L_{\mathbf{w}}(n) = 0$ for $n \ge i$ has been studied over various alphabets [14].

8. Concluding Remarks

In this final section, we pose a question that is suggested by the computations we've performed.

Question 8.1.

Can the Lie complexity function of a morphic word be unbounded?

We note that the analogue of Theorem 1.2 is known to hold for pure morphic words [15, Corollary 20], and so if Question 8.1 has a negative answer (that is, if the Lie complexity function of every morphic word is bounded), then Proposition 2.1 would give an extension of this result to general morphic words.

Although many classes of morphic words, including primitive morphic and $k$-uniform morphic words, have linear complexity and hence are covered by Theorem 1.2, the factor complexity function of a morphic word need not be linear in general. Pansiot (see [8, Theorem 4.7.1]) has shown that the factor complexity of a pure morphic word is either $O(1)$, $\Theta(n)$, $\Theta(n \log \log n)$, $\Theta(n \log n)$, or $\Theta(n^2)$, and that each of these possibilities can be realized as the factor complexity of a pure morphic word.

References

[1] J.-P. Allouche and J. O. Shallit. Automatic Sequences, Cambridge University Press, 2003.
[2] G. Badkobeh and P. Ochem. Avoiding conjugacy classes on the 5-letter alphabet. RAIRO Theor. Inform. Appl. (2020), 1–4.
[3] A. R. Baranwal. Decision algorithms for Ostrowski-automatic sequences. M. Math. thesis, University of Waterloo, School of Computer Science, 2020. Available at https://uwspace.uwaterloo.ca/handle/10012/15845 .
[4] J. P. Bell. A dichotomy result for prime algebras of Gelfand-Kirillov dimension two. J. Algebra (2010), no. 4, 831–840.
[5] J. P. Bell and B. W. Madill. Iterative algebras. Algebr. Represent. Theory (2015), no. 6, 1533–1546.
[6] J. P. Bell and A. Smoktunowicz. The prime spectrum of algebras of quadratic growth. J. Algebra (2008), no. 1, 414–431.
[7] J. Berstel. Sur la construction de mots sans carré. Séminaire de Théorie des Nombres (1978–1979), 18.01–18.15.
[8] V. Berthé and Michel Rigo (eds.). Combinatorics, automata and number theory, Encyclopedia of Mathematics and its Applications, vol. 135, Cambridge University Press, Cambridge, 2010.
[9] J. Cassaigne. Special factors of sequences with linear subword complexity. In J. Dassow, G. Rozenberg, and A. Salomaa, eds., Developments in Language Theory II. World Scientific, 1996, pp. 25–34.
[10] J. Cassaigne. Complexité et facteurs spéciaux. Journées Montoises (Mons, 1994). Bull. Belg. Math. Soc. Simon Stevin (1997), no. 1, 67–88.
[11] J. Cassaigne, G. Fici, M. Sciortino, and L. Q. Zamboni. Cyclic complexity of words. J. Combin. Theory Ser. A (2017), 36–56.
[12] E. Charlier, N. Rampersad, and J. Shallit. Enumeration and decidable properties of automatic sequences. Internat. J. Found. Comp. Sci. (2012), 1035–1066.
[13] C. Frougny. Fibonacci numeration systems and rational functions. In J. Gruska, B. Rovan, and J. Wiedermann, eds., MFCS 86, Lect. Notes in Comp. Sci., Vol. 233, Springer-Verlag, 1986, pp. 350–359.
[14] G. Gamard, P. Ochem, G. Richomme, and P. Séébold. Avoidability of circular formulas. Theoret. Comput. Sci. (2018), 1–4.
[15] K. Klouda and S. Starosta. An algorithm for enumerating all infinite repetitions in a D0L-system. J. Discrete Algorithms (2015), 130–138.
[16] F. Mignosi. Infinite words with linear subword complexity. Theor. Comput. Sci. (1989), 221–242.
[17] F. Mignosi and G. Pirillo. Repetitions in the Fibonacci infinite word. RAIRO Info. Theor. (1992), 199–204.
[18] H. Mousavi. Automatic theorem proving in Walnut, Arxiv preprint, 2016. Available at http://arxiv.org/abs/1603.06017 .
[19] H. Mousavi and J. Shallit. Mechanical proofs of properties of the Tribonacci word. In F. Manea and D. Nowotka, eds., WORDS 2015, Lect. Notes in Comp. Sci., Vol. 9304, Springer-Verlag, 2015, pp. 170–190.
[20] G. Richomme, K. Saari, and L. Q. Zamboni. Abelian complexity of minimal subshifts. J. London Math. Soc. (2011), 79–95.

Department of Pure Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada

School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada