Lie complexity of words
arXiv [cs.FL]
JASON P. BELL AND JEFFREY SHALLIT
Abstract.
Given a finite alphabet $\Sigma$ and a right-infinite word $w$ over $\Sigma$, we define the Lie complexity function $L_w : \mathbb{N} \to \mathbb{N}$, whose value at $n$ is the number of conjugacy classes (under cyclic shift) of length-$n$ factors $x$ of $w$ with the property that every element of the conjugacy class appears in $w$.

We show that the Lie complexity function is uniformly bounded for words with linear factor complexity, and as a result we show that words of linear factor complexity have at most finitely many primitive factors $y$ with the property that $y^n$ is again a factor for every $n$.

We then look at automatic sequences and show that the Lie complexity function of a $k$-automatic sequence is again $k$-automatic.

Contents
1. Introduction
2. Lie complexity
3. Algebraic interpretation of Lie complexity
4. Proof of Theorems 1.1 and 1.2
5. Construction
6. Automatic sequences
7. Examples
8. Concluding Remarks
References

1. Introduction
Let $\Sigma$ be a finite alphabet and let $w$ be a right-infinite word over $\Sigma$. The factor complexity function $p_w : \mathbb{N} \to \mathbb{N}$, which counts the number of factors of $w$ of each length, plays a fundamental role in understanding the behaviour of $w$ as a word; see, e.g., [9, 16]. (It is also called the subword complexity function.) Often, however, one wishes to understand factors of $w$ of a special form (e.g., palindromes, bordered, unbordered, squarefree, repetition-free, $k$-power, etc.). To accomplish this task, one requires the use of finer invariants, which are designed to count factors of a certain form. Of course, it is generally a very difficult problem to exactly count factors of a specific form in a given word, and so in practice one settles for invariants that are easier to compute, which give upper and lower bounds for the desired quantities.

Mathematics Subject Classification.
Key words and phrases. Combinatorics on words, automatic sequences, morphic words, linear factor complexity, Lie complexity.
Jason Bell is supported by NSERC grant 2016-03632. Jeffrey Shallit is supported by NSERC grant 2018-04118.
In this paper, we look at the problem of counting factors $y$ of a right-infinite word $w$ with the property that all cyclic shifts of $y$ remain factors of $w$. In particular, this includes factors $y$ with unbounded exponent (that is, factors $y$ of $w$ with the property that $y^n$ is again a factor of $w$ for every $n \ge 1$). For factors $y$ of unbounded exponent, we may restrict our attention to the case when $y$ is itself not a perfect power; that is, when $y$ is primitive. In this case, it is known that the set of primitive factors $y$ of $w$ having unbounded exponent is a finite set when $w$ is pure morphic (a fixed point of a morphism).

Since automatic words have linear factor complexity (that is, the number of factors of length $n$ is bounded by a fixed affine function $An + B$ in $n$ for every $n$), it is natural to ask whether a similar phenomenon holds more generally for words of linear factor complexity. To answer this question, we introduce a new complexity function, the Lie complexity function, which is motivated by ideas from the theory of Lie algebras. Given a right-infinite word $w$, we define its Lie complexity function, $L_w : \mathbb{N} \to \mathbb{N}$, to be the map in which $L_w(n)$ is equal to the number of equivalence classes $[y]$ of length-$n$ factors of $w$ with the property that every cyclic permutation of $y$ is again a factor of $w$. Our main theorem is the following estimate.

Theorem 1.1.
Let $\Sigma$ be a finite alphabet, let $w$ be a right-infinite word over $\Sigma$, and let $L_w : \mathbb{N} \to \mathbb{N}$ be the Lie complexity function of $w$. Then for each $n \ge 1$ we have
\[ L_w(n) \le p_w(n) - p_w(n-1) + 1. \]
In particular, if $w$ has linear factor complexity, then $L_w(n)$ is uniformly bounded above by a constant.

Observe that if $w$ is a right-infinite word and $y$ is a primitive word such that $y^n$ is a factor of $w$ for every $n$, then for every $n$, all cyclic permutations of $y^n$ are necessarily factors of $w$. Using this observation, we are able to prove the following result.

Theorem 1.2.
Let $\Sigma$ be a finite alphabet and let $w$ be a right-infinite word over $\Sigma$. If $w$ has linear factor complexity, then the set of primitive factors $y$ of $w$ such that $y^n$ is a factor of $w$ for every $n$ is a finite set.

We point out that an analogue of Theorem 1.2 was already known to hold for pure morphic sequences [15, Corollary 20]. We are also able to show that the condition that $\limsup_{n \to \infty} p_w(n)/n$ be finite in Theorem 1.2 cannot be relaxed.

Theorem 1.3. Let $f : \mathbb{N} \to \mathbb{N}$ be a function that tends to infinity as $n \to \infty$ and let $\Sigma$ be a finite alphabet. Then there is a right-infinite recurrent word $w$ over $\Sigma$ such that $p_w(n) \le n f(n)$ for $n$ sufficiently large and such that $w$ has infinitely many distinct primitive factors $y$ with the property that $y^n$ is a factor of $w$ for every $n$.

We next turn our attention to automatic words $w$. If $k \ge 2$ and $f : \mathbb{N} \to \Delta$ is a $k$-automatic sequence, then we can identify $f$ with the right-infinite word $w := f(0) f(1) f(2) \cdots$ over the alphabet $\Delta$. Thus it makes sense to talk about the Lie complexity function of the automatic sequence $f$, by making this identification with the word $w$. Our next result shows that the Lie complexity functions of automatic words are particularly well-behaved.

Theorem 1.4. Let $k \ge 2$ be a positive integer, let $\Delta$ be a finite set, and let $f : \mathbb{N} \to \Delta$ be a $k$-automatic sequence. Then the Lie complexity function of $f$ is again a $k$-automatic sequence.
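The outline of this paper is as follows. In §2 we define the Lie complexity function and relate it to factors of unbounded exponent. In §3 we give an algebraic interpretation of Lie complexity, and in §4 we prove Theorems 1.1 and 1.2. In §5, we give a construction which proves Theorem 1.3. In §6 we prove Theorem 1.4, and we give examples in §7. Finally, in §8 we pose a question.

2. Lie complexity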
Let $\Sigma$ be a finite alphabet and let $w$ be a right-infinite word over $\Sigma$. A factor of $w$ is a finite block of contiguous symbols occurring within $w$. We let $\mathrm{Fac}(w)$ denote the collection of factors of $w$ (including the empty word).

We say that two words $v, v' \in \Sigma^*$ are cyclically equivalent, which we write $v \sim_C v'$, if $v$ and $v'$ are cyclic permutations of one another. We then let $[v]_C$ denote the equivalence class of $v$ under $\sim_C$. For example, the equivalence class of the English word tea is {tea, eat, ate}. We define the Lie complexity of $w$ to be the function
\[ L_w(n) := \#\{[v]_C : |v| = n \text{ and } [v]_C \subseteq \mathrm{Fac}(w)\}. \tag{1} \]
That is, $L_w(n)$ counts the number of cyclic equivalence classes of length $n$ with the property that every word in the equivalence class is a factor of $w$. This can be contrasted with the cyclic complexity function of Cassaigne, Fici, Sciortino, and Zamboni [11], defined as follows:
\[ c_w(n) := \#\{[v]_C : |v| = n \text{ and } [v]_C \cap \mathrm{Fac}(w) \neq \emptyset\}, \]
that is, where "every word" in our definition of Lie complexity is replaced by "some word". Observe in particular that our definition gives the inequality
\[ L_w(n) \le c_w(n) \quad \text{for } n \ge 0. \tag{2} \]
Similarly, we have
\[ L_w(n) \le a_w(n) \quad \text{for } n \ge 0, \tag{3} \]
where $a_w : \mathbb{N} \to \mathbb{N}$ is the abelian complexity function, which counts factors of length $n$ up to abelian equivalence, where $v$ and $v'$ are abelian equivalent if $v'$ can be obtained from $v$ via some permutation of the letters [20].

We now show the relation between factors $y$ of $w$ of unbounded exponent in $w$ and the Lie complexity function. To make this precise, we construct an equivalence relation $\sim$ on the collection of right-infinite words over $\Sigma$ in which two right-infinite words are equivalent if they have the same set of (finite) factors. We then let $\mathrm{Per}(w)$ denote the set of $\sim$-equivalence classes of right-infinite words of the form $v^\omega$ such that $\mathrm{Fac}(v^\omega) \subseteq \mathrm{Fac}(w)$. The following result is the key estimate, which will be used in proving Theorem 1.2.

Proposition 2.1.
Let $\Sigma$ be a finite alphabet and let $w$ be a right-infinite word over $\Sigma$. Suppose that there is a positive number $\kappa$ such that for each positive integer $b \ge 1$, there is a positive integer $n = n(b)$ such that $L_w(bn) \le \kappa$. Then $\#\mathrm{Per}(w) \le \kappa$. In particular, if $L_w(n)$ is uniformly bounded then $\#\mathrm{Per}(w)$ is finite.

Proof. Suppose that there exist distinct equivalence classes $[u_1^\omega], \ldots, [u_s^\omega]$ in $\mathrm{Per}(w)$ with $s > \kappa$. Pick $D$ such that $u_i^D$ is not a factor of $u_j^\omega$ whenever $i \neq j$. Let
\[ b := D\,|u_1| \cdot |u_2| \cdots |u_s|. \]
Then by construction, for each $n \ge 1$, the words $u_1^{nb/|u_1|}, \ldots, u_s^{nb/|u_s|}$ are cyclically inequivalent words of length $nb$ with the property that every cyclic permutation occurs as a factor of $w$. Hence $L_w(bn) \ge s > \kappa$ for every $n \ge 1$, which contradicts the hypothesis that $L_w(bn)$ must be at most $\kappa$ for some positive integer $n$. The result follows. □

3. Algebraic interpretation of Lie complexity
We now give a purely algebraic interpretation of the Lie complexity function, which will be used later in proving Theorem 1.1. To do this, we introduce the factor algebra of a right-infinite word $w$.

Let $\Sigma$ be a finite alphabet and let $w$ be a right-infinite word over $\Sigma$. Given a field $k$, we can construct the factor $k$-algebra of $w$, which we denote by $A_w$. As a vector space, this is just all finite formal $k$-linear combinations of elements of $\mathrm{Fac}(w)$; that is,
\[ A_w = \Big\{ \sum_{v \in \mathrm{Fac}(w)} \lambda_v v : \lambda_v \in k,\ \lambda_v = 0 \text{ for all but finitely many } v \in \mathrm{Fac}(w) \Big\}, \]
with multiplication of $v, v' \in \mathrm{Fac}(w)$ defined by declaring that $v \cdot v'$ is the concatenation of $v$ and $v'$ if $vv'$ is again a factor of $w$ and $v \cdot v'$ is zero otherwise. We can then extend the multiplication to general elements of $A_w$ by linearity, and so
\[ \Big( \sum_{v \in \mathrm{Fac}(w)} \alpha_v v \Big) \Big( \sum_{u \in \mathrm{Fac}(w)} \beta_u u \Big) = \sum_{y \in \mathrm{Fac}(w)} \Big( \sum_{\{(u,v) \,:\, uv = y\}} \alpha_u \beta_v \Big) y. \]
We now introduce some notation that we will use in obtaining Theorem 1.1.
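The multiplication rule of the factor algebra can be illustrated with a small sketch. It is only an illustration under stated assumptions: elements are stored as dictionaries from words to coefficients, $\mathrm{Fac}(w)$ is approximated by the factors of a finite prefix of the Thue-Morse word, and the helper names `thue_morse_prefix`, `factors_up_to`, and `multiply` are hypothetical, not taken from the paper. Products are only meaningful up to the total length `max_len` that was collected.

```python
from collections import defaultdict
from fractions import Fraction


def thue_morse_prefix(length):
    # t[i] is the parity of the number of 1 bits in the binary expansion of i.
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def factors_up_to(word, max_len):
    # All factors of `word` of length at most max_len, including the empty word.
    return {word[i:i + n]
            for n in range(max_len + 1)
            for i in range(len(word) - n + 1)}


def multiply(x, y, fac):
    # x and y are dicts {factor: coefficient}. Basis words multiply by
    # concatenation when the result lies in `fac`, and to zero otherwise.
    product = defaultdict(Fraction)
    for u, a in x.items():
        for v, b in y.items():
            if u + v in fac:
                product[u + v] += a * b
    return {word: c for word, c in product.items() if c != 0}


# Assumption: a prefix of length 256 already contains every factor of length <= 3.
fac = factors_up_to(thue_morse_prefix(256), 3)
x = {"0": 1, "1": 1}  # the sum of the letters
x_cubed = multiply(multiply(x, x, fac), x, fac)
print(sorted(x_cubed))  # the length-3 factors; 000 and 111 do not occur
```

In this sketch the cube of the sum of the letters is the sum of the six length-3 factors of the Thue-Morse word, each with coefficient 1, while a product such as `00` times `0` vanishes because `000` is not a factor.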
Notation 3.1. We make the following assumptions and introduce the following notation.
(1) We let $\Sigma = \{x_1, \ldots, x_d\}$ be a finite alphabet and we let $w$ be a right-infinite word over $\Sigma$.
(2) We let $A_w$ be the factor algebra of $w$ with base field $k = \mathbb{Q}$.
(3) We let $V_n$ denote the subspace of $A_w$ spanned by the images of factors of $w$ of length $n$.
(4) We let $W_n$ denote the subspace of $V_n$ spanned by elements of the form $ab - ba$, where $a, b \in \mathrm{Fac}(w)$ with $|a| + |b| = n$.

Notice that since $V_n$ has a basis given by factors of $w$ of length $n$, we have
\[ p_w(n) = \dim(V_n), \tag{4} \]
where we are taking the dimension as a $\mathbb{Q}$-vector space.

One important remark is that if we adopt the notation from Notation 3.1 and we let $x = x_1 + \cdots + x_d \in V_1$, then $x^n$ is the sum of all $n$-fold concatenations of $x_1, \ldots, x_d$. Each such concatenation will either be 0 in the factor algebra or will be equal to a factor of $w$ of length $n$; moreover, each factor of $w$ of length $n$ can be realized as a unique concatenation of length $n$ of these elements. Thus, when we work in the factor algebra, we have the formula
\[ (x_1 + \cdots + x_d)^n = \sum_{\{v \in \mathrm{Fac}(w) \,:\, |v| = n\}} v. \tag{5} \]
Lemma 3.2. Adopt the assumptions and notation from Notation 3.1. Then
\[ L_w(n) = \dim(V_n) - \dim(W_n). \]

Proof. We fix $n$ and let $m$ denote the dimension of the quotient space $V_n/W_n$. Let $u_1 + W_n, \ldots, u_m + W_n$ be a basis for $V_n/W_n$ consisting of $W_n$-cosets of factors $u_1, \ldots, u_m$ of $w$ of length $n$. Observe that every cyclic permutation of $u_i$ must be a factor of $w$, since otherwise we could find words $a$ and $b$ such that $u_i = ab$ and such that $ba$ is not a factor of $w$. But this would give that $ba = 0$ and $ab = u_i$ in $A_w$ and so we would have $ab - ba = u_i$, which would mean that $u_i \in W_n$, which is a contradiction, since $u_i + W_n$ is part of a basis for $V_n/W_n$.

Furthermore, the $u_i$ must be cyclically inequivalent, since if there were $i$ and $j$ with $i \neq j$ such that $u_j$ were a cyclic permutation of $u_i$, we could again write $u_i = ab$ and $u_j = ba$ and we would have $u_i - u_j = ab - ba \in W_n$, which again would contradict the independence of $u_1, \ldots, u_m$ mod $W_n$.

Thus $u_1, \ldots, u_m$ are cyclically inequivalent words such that $[u_1]_C, \ldots, [u_m]_C$ are all contained in $\mathrm{Fac}(w)$ and so
\[ L_w(n) \ge m = \dim(V_n/W_n). \]

Now we show that $L_w(n) \le \dim(V_n/W_n)$. Observe that if $L_w(n)$ is strictly greater than $\dim(V_n/W_n)$, then there must exist some word $u_{m+1} \in \mathrm{Fac}(w)$ of length $n$ such that every cyclic permutation of $u_{m+1}$ is also a factor of $w$ and such that $u_{m+1}$ is not cyclically equivalent to $u_i$ for $i = 1, \ldots, m$.

Since $u_1 + W_n, \ldots, u_m + W_n$ form a basis for $V_n/W_n$, there exist rational constants $\alpha_1, \ldots, \alpha_m$ such that $u_{m+1} - \sum_{i=1}^{m} \alpha_i u_i \in W_n$. Then by definition of $W_n$ there are words $a_1, \ldots, a_s, b_1, \ldots, b_s$ and rational constants $\beta_1, \ldots, \beta_s$ such that
\[ u_{m+1} - \sum_{i=1}^{m} \alpha_i u_i = \sum_{i=1}^{s} \beta_i (a_i b_i - b_i a_i) \tag{6} \]
in the factor algebra $A_w$.

We let $U$ denote the subspace of $V_n$ spanned by images of words that are cyclically equivalent to $u_{m+1}$ and we define a linear map $\pi : V_n \to U$. Since $V_n$ has a basis consisting of factors of $w$ of length $n$, it suffices to define $\pi$ on such factors and then extend linearly. For a factor $u$ of $w$ of length $n$, we define $\pi(u) = u$ if $u \sim_C u_{m+1}$ and $\pi(u) = 0$ otherwise.

Then since $u_1, \ldots, u_m, u_{m+1}$ are pairwise cyclically inequivalent, the left side of Equation (6) is sent to $u_{m+1}$ by the map $\pi$; the right side, however, is sent to an element of $W_n$, since for each $i$, either $a_i b_i$ and $b_i a_i$ are both cyclically equivalent to $u_{m+1}$ or neither $a_i b_i$ nor $b_i a_i$ is cyclically equivalent to $u_{m+1}$. It follows that $u_{m+1} \in W_n$.

Thus $u_{m+1}$ is a $\mathbb{Q}$-linear combination of elements of the form $ab - ba$ with each $ab$ and $ba$ cyclic permutations of $u_{m+1}$. But by assumption, each cyclic permutation of $u_{m+1}$ is in $\mathrm{Fac}(w)$ and so if we let $T : U \to \mathbb{Q}$ be the linear map uniquely defined by sending $u$ to 1 for each cyclic permutation $u$ of $u_{m+1}$, we see that $T \circ \pi$ sends the right side of Equation (6) to zero and $T \circ \pi(u_{m+1}) = 1$, a contradiction. Thus we obtain the reverse inequality and so
\[ L_w(n) = \dim(V_n/W_n) = \dim(V_n) - \dim(W_n). \]
□
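Lemma 3.2 can be checked numerically for small $n$. The following is a minimal sketch, assuming that the factors of a finite prefix of the Thue-Morse word stand in for $\mathrm{Fac}(w)$ and that exact rational Gaussian elimination is an acceptable way to compute $\dim(W_n)$; the function names are hypothetical. It compares $\dim(V_n) - \dim(W_n)$ with a direct count of cyclic classes contained in the factor set.

```python
from fractions import Fraction


def thue_morse_prefix(length):
    # t[i] is the parity of the number of 1 bits in the binary expansion of i.
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def length_n_factors(word, n):
    return sorted({word[i:i + n] for i in range(len(word) - n + 1)})


def rank(rows):
    # Rank over Q by Gaussian elimination with exact Fraction arithmetic.
    rows = [[Fraction(c) for c in row] for row in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                scale = rows[r][col] / rows[rk][col]
                rows[r] = [a - scale * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def lie_complexity_algebraic(word, n):
    basis = length_n_factors(word, n)            # basis of V_n
    index = {v: i for i, v in enumerate(basis)}
    rows = []                                     # spanning set of W_n
    for u in basis:
        for t in range(n + 1):
            a, b = u[:t], u[t:]
            row = [0] * len(basis)
            row[index[a + b]] += 1                # ab = u is a factor
            if b + a in index:                    # ba is zero unless it is a factor
                row[index[b + a]] -= 1
            rows.append(row)
    return len(basis) - rank(rows)


def lie_complexity_direct(word, n):
    fac = set(length_n_factors(word, n))
    classes = set()
    for v in fac:
        rotations = {v[t:] + v[:t] for t in range(n + 1)}
        if rotations <= fac:
            classes.add(min(rotations))           # one representative per class
    return len(classes)


prefix = thue_morse_prefix(512)
for n in range(7):
    print(n, lie_complexity_algebraic(prefix, n), lie_complexity_direct(prefix, n))
```

The two columns printed for each $n$ should agree. The spanning set for $W_n$ is generated only from splittings $u = ab$ of factors $u$, which suffices because a commutator $ab - ba$ with neither $ab$ nor $ba$ a factor is zero.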
Remark 3.3. Notice that $W_n$ is spanned by commutators $ab - ba$, and that the algebra $A_w$ becomes a Lie algebra when endowed with the bracket $[a, b] := ab - ba$. It is this fact, together with Lemma 3.2, that motivates the name Lie complexity for the function $L_w$.

4. Proof of Theorems 1.1 and 1.2
We can now use the algebraic framework from the preceding section to prove Theorem 1.1. Our proof adapts an argument with Lie brackets from [4].

Proof of Theorem 1.1. Let $x = \sum_{i=1}^{d} x_i \in V_1$. We have a linear map $\Phi_n : V_n \to W_{n+1}$ defined by
\[ u \mapsto ux - xu = \sum_{i=1}^{d} (u x_i - x_i u) \]
for $u \in V_n$. By construction, $\Phi_n$ sends a factor of $w$ of length $n$ into the space $W_{n+1}$, and so $\Phi_n$ does indeed map into $W_{n+1}$.

Recall from Equation (5) that
\[ x^n = \sum_{\{u \in \mathrm{Fac}(w) \,:\, |u| = n\}} u. \]
We claim that the kernel of $\Phi_n$ is spanned by $x^n$. To see this, observe that $\Phi_n(x^n) = x^n \cdot x - x \cdot x^n = 0$ and so $x^n$ is in the kernel of $\Phi_n$.

Suppose that there is
\[ z := \sum_{\{u \in \mathrm{Fac}(w) \,:\, |u| = n\}} \alpha_u u \in \ker(\Phi_n) \]
with $z$ not in the span of $x^n$. Then we can replace $z$ by $z - \alpha x^n$ for some $\alpha$ in $\mathbb{Q}$ and assume that there is some factor $u$ of $w$ of length $n$ with $\alpha_u = 0$ but that $z$ is nonzero. Since $z$ is nonzero, there is some factor $v$ of $w$ of length $n$ such that $\alpha_v \neq 0$.

Then since $v$ and $u$ are both factors of $w$ there is some factor $y_{u,v}$ of $w$ that either has $u$ as a prefix and $v$ as a suffix or has $v$ as a prefix and $u$ as a suffix. Among all factors $y$ of $w$ having the property that $u$ is either a prefix or suffix and having the property that some word $v'$ of length $n$ with $\alpha_{v'} \neq 0$ is either a prefix or a suffix, we pick one, $y$, of shortest length possible.

By symmetry, it suffices to consider the case when $u$ is a prefix of $y$, and we let $v'$, with $\alpha_{v'} \neq 0$, denote the suffix of $y$ of length $n$. Then
\[ y = ua = bv' \tag{7} \]
for some words $a$ and $b$. Since $\alpha_u = 0 \neq \alpha_{v'}$ we see that $|a| = |b| \ge 1$. Let $j$ be such that $x_j$ is the last letter of $b$ and write $b = b' x_j$. By assumption $\Phi_n(z) = 0$ and so
\[ \sum_{\{s \in \mathrm{Fac}(w) \,:\, |s| = n\}} \sum_{i=1}^{d} \alpha_s (s x_i - x_i s) = 0. \tag{8} \]
We now consider the coefficient of $x_j v'$ in both sides of Equation (8). The coefficient in the right side is equal to zero. On the other hand, we have
\[ x_j v' = v'' x_k \tag{9} \]
for some $k \in \{1, \ldots, d\}$ and some word $v''$ of length $n$, and so the coefficient of $x_j v'$ in the left side of Equation (8) is $-\alpha_{v'} + \alpha_{v''}$, since the only contribution from the terms $\sum_{i=1}^{d} \alpha_s s x_i$ occurs when $i = k$ and $s = v''$ and the only contribution from the terms $\sum_{i=1}^{d} -\alpha_s x_i s$ comes when $i = j$ and $s = v'$.

Hence $0 = -\alpha_{v'} + \alpha_{v''}$ and so in particular $\alpha_{v''}$ is nonzero. Then from Equations (7) and (9) and the fact that $b = b' x_j$, we see
\[ ua = bv' = b' x_j v' = b' v'' x_k. \]
Thus $x_k$ is the last letter of $a$ and so $a = a' x_k$ for some word $a'$ with $|a'| < |a|$. But now
\[ y' := ua' = b' v'' \]
has the property that $u$ is a prefix, $v''$ is a suffix, $\alpha_{v''} \neq 0$, and $|y'| < |y|$, which contradicts the minimality of $|y|$. It follows that the kernel of $\Phi_n$ is spanned by $x^n$, and since $x^n$ is nonzero in the factor algebra, the kernel of $\Phi_n$ is exactly one-dimensional.

Then the rank-plus-nullity theorem for linear maps gives that
\[ \dim(V_n) = \dim(\ker(\Phi_n)) + \dim(\mathrm{Im}(\Phi_n)). \]
Since we have now established that $\dim(\ker(\Phi_n)) = 1$ and since the image of $\Phi_n$ is a subspace of $W_{n+1}$, we have in fact that
\[ \dim(V_n) \le \dim(W_{n+1}) + 1, \]
or, equivalently, $\dim(W_{n+1}) \ge \dim(V_n) - 1$. Consequently, Lemma 3.2 gives
\[ L_w(n+1) = \dim(V_{n+1}) - \dim(W_{n+1}) \le \dim(V_{n+1}) - (\dim(V_n) - 1). \]
Using Equation (4), we then see that
\[ L_w(n+1) \le p_w(n+1) - p_w(n) + 1, \]
and so we obtain the desired inequality. When $w$ has linear factor complexity, a deep result of Cassaigne [10] shows that $p_w(n+1) - p_w(n)$ is uniformly bounded above by a constant, which then gives that $L_w(n)$ is similarly bounded, and so the proof is complete. □

We now get the proof of Theorem 1.2.
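As a sanity check on the inequality of Theorem 1.1, one can compare $L_w(n)$ with $p_w(n) - p_w(n-1) + 1$ on a concrete word. The sketch below is an illustration only: it uses the Fibonacci word (generated by iterating the morphism $0 \to 01$, $1 \to 0$), approximates $\mathrm{Fac}(w)$ by the factors of a finite prefix, and uses hypothetical helper names. Since the Fibonacci word is Sturmian with $p(n) = n + 1$, the bound equals 2 for every $n \ge 1$.

```python
def fibonacci_prefix(length):
    # Iterate the morphism 0 -> 01, 1 -> 0 starting from "0"; each iterate
    # is a prefix of the fixed point.
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factor_set(word, n):
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def lie_complexity(word, n):
    # Number of cyclic classes of length n all of whose members are factors.
    fac = factor_set(word, n)
    classes = set()
    for v in fac:
        rotations = {v[t:] + v[:t] for t in range(n + 1)}
        if rotations <= fac:
            classes.add(min(rotations))
    return len(classes)


# Assumption: a prefix of length 2000 contains every factor of length <= 20.
prefix = fibonacci_prefix(2000)
for n in range(1, 21):
    p_n = len(factor_set(prefix, n))
    p_prev = len(factor_set(prefix, n - 1))
    print(n, lie_complexity(prefix, n), p_n - p_prev + 1)
```

The assumption on the prefix length can itself be checked: the prefix contains all factors of length $n$ exactly when the number of distinct factors found equals $n + 1$.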
Proof of Theorem 1.2. By Theorem 1.1, $L_w(n)$ is uniformly bounded when $w$ has linear factor complexity. Proposition 2.1 then gives that $\#\mathrm{Per}(w)$ is finite. Since there are only finitely many primitive words $y'$ such that $(y')^\omega$ has the same set of factors as a fixed periodic right-infinite word, we then obtain the desired result. □

5. Construction
In this section, we give a construction that proves Theorem 1.3. We note that we make use of a similar construction given by the first author and Smoktunowicz [6] in the context of monomial algebras, which we sharpen slightly.

Proof of Theorem 1.3.
Let f : N → N be a function that tends to infinity. We shall construct arecurrent binary word w whose factor complexity function is bounded above by nf ( n ) for n sufficiently large such that Per( w ) is infinite.First observe that by replacing f ( n ) by min( f ( j ) : j ≥ n ), we may assume that f is weakly increasing. Then for each j there is some largest natural number m j suchthat f ( m j ) ≤ j . Then since f ( n ) is weakly increasing and tends to infinity, wesee that the m j are weakly increasing and tend to infinity.To begin, we let f be the Fibonacci word, which is a Sturmian word, and hencehas complexity function p f ( n ) = n + 1. We fix a prefix u of f of length 2 m . Since f is uniformly recurrent, there are infinitely many occurrences of u in f , and hencewe can find a prefix u of f of length at least 2 m + m such that u is a suffix of u .We let d denote the length of u . In general, for each i there is a prefix u i of f such that u i − is a suffix and such that | u i | > m i | u i − | , and we let d i denote thelength of u i . Then d i is at least 2 m i + ··· + m . We define a i,j = ⌈| u i | / | u j |⌉ for i, j ≥ v n = u n u a n,n − n − · · · u a , u a n, u a n, · · · u a n,n − n − u n (10)and we define a sequence of words s n with s = u and for n ≥ s n = s n − v n s n − v n . (11)Since each s i is a prefix of s i +1 , we can define the right-infinite word w = lim n s n , (12)and since every factor of w appears in some prefix s i and since s i +1 = s i v i +1 s i v i +1 ,we see that w is recurrent.This construction first appears in work of the first author and Smoktunowicz [6, §
4] (but in that paper the authors used W for f , W i for the prefixes u i , V n for thefactors v n , U n for the factors s n , and U for the word w ).Let n be a natural number that is larger than | u | . Then there is a unique d suchthat | u d | ≤ n < | u d +1 | . Since u d ≥ m d + ··· + m , we see that n ≥ m d .Then we may write w = ( s d v d +1 s d v d +1 ) v d +2 ( s d v d +1 s d v d +1 ) v d +2 · · · and since | v j | > n for j > d , a factor of w of length n is either:(1) a factor of some word of the form v j s d v k with j, k > d ; or(2) a factor of length n of some v i v i +1 with i ≥ d + 1 that overlaps with both asuffix of v i and a prefix of v i +1 .Then since u d +1 is both a prefix and suffix of v i for i ≥ d + 1 and since | u d +1 | > n ,we see that every factor of v j s d v k , with j, k > d , of length n is either a factor of u d +1 s d u d +1 or a factor of v j for some j . Similarly, a factor of v i v i +1 with i ≥ d + 1that overlaps with both a suffix of v i and a prefix of v i +1 must be a factor of u d +1 that overlaps with both copies of u d +1 .We now consider these three types of factors in a case-by-case basis. A factor of u d +1 s d u d +1 of length n is either a factor of u d +1 , or it must overlap with s d . Since u d +1 is a factor of a Sturmian word, there are at most n + 1 distinct factors of u d +1 of length n ; there are at most n − | s d | ways of choosing a factor of u d +1 s d u d +1 oflength n that overlaps with s d . Thus we see that there are at most 2 n + | s d | factorsof u d +1 s d u d +1 of length n .There are n − u d +1 of length n that overlaps withboth copies of u d +1 . Thus we see that factors of u d +1 that overlap with both copiesof u d +1 contribute at most n − IE COMPLEXITY OF WORDS 9
Finally, there are at most 12 d n factors of some v j [6, Lemma 4.4] and so we seethat the total number of factors of w of length n is at most3 n − | s d | + 12 d n. Now [6, Equation (4.8)] gives that | s d | ≤ d | u d | ≤ d n and so the numberof factors of w of length n is at most 3 n − d n + 12 d n ≤ nd . We have n ≥ m d and since f ( j ) > d for j > m d , we see that 19 nd ≤ nf ( n ), and so p w ( n ) ≤ nf ( n ) for n ≥ | u | , which gives the desired bound on the factor complexityof w .Finally, observe that for a fixed i , the word u a n,i i appears as a factor of v n andhence as a factor of w . Since a n,i ≥ | u n | / | u i | → ∞ , we see that arbitrarily largepowers of u i appear as factors of w . Now for each i , there is some primitive word y i such that u i = y e i i for some e i ≥
1. Since $u_i$ is a factor of the Fibonacci word and since the Fibonacci word is 4th-power free [17], we see that $e_i \in \{1, 2, 3\}$ for every $i$. Hence $|y_i| \to \infty$ as $i \to \infty$, and so we have infinitely many primitive words $y$ such that $y^n$ is a factor of $w$ for every $n$. □

6. Automatic sequences
A sequence $s = (s_n)_{n \ge 0}$ is $k$-automatic if there exists a finite automaton that, on input the base-$k$ representation of $n$, computes $s_n$ (by arriving at a state whose output is $s_n$). We have the following result [12]:

Theorem 6.1. Let $s$ be a $k$-automatic sequence.
(a) There is an algorithm that, given a well-formed first-order logical formula $\varphi$ in $\mathrm{FO}(\mathbb{N}, +, 0, 1, n \to s[n])$ having no free variables, decides if $\varphi$ is true or false.
(b) Furthermore, if $\varphi$ has free variables, then the algorithm constructs an automaton recognizing the representation of the values of those variables for which $\varphi$ evaluates to true.

A sequence $(a_n)_{n \ge 0}$ taking values in $\mathbb{Z}$ is $k$-regular if there is a linear representation for it, that is, a row vector $v$, a column vector $w$, and a matrix-valued morphism $\zeta : \{0, 1, \ldots, k-1\} \to \mathbb{Z}^{d \times d}$ such that $a_n = v \cdot \zeta(x) \cdot w$, where $x$ is the base-$k$ representation of $n$. If $A$ is an automaton accepting the base-$k$ representation of pairs $(i, n)$ in parallel, then the sequence $a_n = \#\{i : A \text{ accepts } (i, n)\}$ is $k$-regular, and furthermore the matrices $\zeta(a)$ in the linear representation for $(a_n)$ have non-negative integer entries [12]. A $k$-regular sequence taking only finitely many values is $k$-automatic [1, Thm. 16.1.5], and the automaton can be algorithmically produced from the linear representation because the entries of $\zeta(a)$ are in $\mathbb{N}$.

In this section we prove Theorem 1.4: if $w$ is a $k$-automatic sequence, then the sequence $(L_w(n))_{n \ge 0}$ is also $k$-automatic.

Proof.
We will show that the sequence $(L_w(n))_{n \ge 0}$ is $k$-regular. Since automatic sequences have linear factor complexity [1, Thm. 10.3.1], it follows from Theorem 1.1 that $(L_w(n))_{n \ge 0}$ is bounded, and hence automatic.

We construct a linear representation for $(L_w(n))_{n \ge 0}$ by constructing a first-order logical formula $\mathrm{lie}(i, n)$ for the pairs $(i, n)$ such that
(a) all of the cyclic shifts of $w[i..i+n-1]$ appear in $w$;
(b) $w[i..i+n-1]$ is the lexicographically least of all its cyclic shifts appearing in $w$; and
(c) $w[i..i+n-1]$ is the first occurrence of this particular factor.
Then the number of $i$ making $\mathrm{lie}(i, n)$ true equals $L_w(n)$.

We do this in a number of steps:
• $\mathrm{factoreq}(i, j, n)$ asserts that the length-$n$ factor $w[i..i+n-1]$ equals $w[j..j+n-1]$.
• $\mathrm{shift}(i, j, n, t)$ asserts that $w[i..i+n-1]$ is the shift, by $t$ positions, of the factor $w[j..j+n-1]$.
• $\mathrm{conj}(i, j, n)$ asserts that the factor $w[i..i+n-1]$ is a cyclic shift of $w[j..j+n-1]$.
• $\mathrm{lessthan}(i, j, n)$ asserts that the factor $w[i..i+n-1]$ is lexicographically smaller than $w[j..j+n-1]$.
• $\mathrm{lessthaneq}(i, j, n)$ asserts that the factor $w[i..i+n-1]$ is lexicographically $\le$ the factor $w[j..j+n-1]$.
• $\mathrm{allconj}(i, n)$ asserts that all cyclic shifts of $w[i..i+n-1]$ appear as factors of $w$.
• $\mathrm{lexleast}(i, n)$ asserts that $w[i..i+n-1]$ is lexicographically least among all its cyclic shifts that actually appear in $w$.
• $\mathrm{lie}(i, n)$ asserts that all cyclic shifts of $w[i..i+n-1]$ appear in $w$, that $w[i..i+n-1]$ is the lexicographically least cyclic shift, and that $w[i..i+n-1]$ is the first occurrence of this factor in $w$.

Here are the definitions of the formulas. Recall that the domain of all variables is $\mathbb{N} = \{0, 1, \ldots\}$.
\begin{align*}
\mathrm{factoreq}(i, j, n) &:= \forall u, v\ (i + v = j + u \wedge u \ge i \wedge u < i + n) \implies w[u] = w[v] \\
\mathrm{shift}(i, j, n, t) &:= \mathrm{factoreq}(j, i + t, n - t) \wedge \mathrm{factoreq}(i, (j + n) - t, t) \\
\mathrm{conj}(i, j, n) &:= \exists t\ (t \le n) \wedge \mathrm{shift}(i, j, n, t) \\
\mathrm{lessthan}(i, j, n) &:= \exists t\ (t < n) \wedge \mathrm{factoreq}(i, j, t) \wedge w[i + t] < w[j + t] \\
\mathrm{lessthaneq}(i, j, n) &:= \mathrm{lessthan}(i, j, n) \vee \mathrm{factoreq}(i, j, n) \\
\mathrm{allconj}(i, n) &:= \forall t\ (t \le n) \implies \exists j\ \mathrm{shift}(i, j, n, t) \\
\mathrm{lexleast}(i, n) &:= \forall j\ \mathrm{conj}(i, j, n) \implies \mathrm{lessthaneq}(i, j, n) \\
\mathrm{lie}(i, n) &:= \mathrm{allconj}(i, n) \wedge \mathrm{lexleast}(i, n) \wedge (\forall j\ \mathrm{factoreq}(i, j, n) \implies (j \ge i))
\end{align*}
From the remarks preceding the proof, we are now done. □

Remark 6.2.
Most of the logical formulas should be self-explanatory, with one exception: in order to specify allconj, why do we use shifts of length $0, 1, \ldots, n$? It is because we want the formula to work even in the case of the empty word.

Corollary 6.3. Given an automatic sequence $w$, the quantity $\sup_{n \ge 0} L_w(n)$ is computable.

Remark 6.4. Theorem 1.4 and Corollary 6.3 also hold for automata based on other kinds of numeration systems, such as Fibonacci numeration [13]; Tribonacci numeration [19]; and Ostrowski numeration systems [3].

7. Examples

Using the free software Walnut [18], we can implement the algorithm of the previous section to find automata and closed-form expressions for $L_w(n)$ for some classical words of interest.
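Independently of Walnut, the first few values of $L_w(n)$ can be sanity-checked by brute force. The sketch below is not the automaton-based algorithm of the previous section: it evaluates analogues of the conditions allconj, lexleast, and "first occurrence" directly on a finite prefix of the Thue-Morse word, with all quantifiers restricted to that prefix. The function names are hypothetical, and the results are only trustworthy for lengths $n$ small enough that the prefix contains every factor of length $n$.

```python
def thue_morse_prefix(length):
    # t[i] is the parity of the number of 1 bits in the binary expansion of i.
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def lie_positions(w, n):
    # Positions i for which the analogue of lie(i, n) holds inside the prefix w.
    first = {}  # factor of length n -> index of its first occurrence
    for i in range(len(w) - n + 1):
        first.setdefault(w[i:i + n], i)
    positions = []
    for factor, i in first.items():
        # Shifts by t = 0, 1, ..., n, so that the empty word is handled too.
        shifts = [factor[t:] + factor[:t] for t in range(n + 1)]
        allconj = all(s in first for s in shifts)
        lexleast = all(factor <= s for s in shifts if s in first)
        if allconj and lexleast:
            positions.append(i)
    return sorted(positions)


prefix = thue_morse_prefix(1024)
print([len(lie_positions(prefix, n)) for n in range(6)])  # [1, 2, 3, 2, 2, 0]
```

For each length, exactly one position is kept per cyclic class contained in the factor set, namely the first occurrence of the lexicographically least member, so the number of positions returned is the Lie complexity at that length.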
Example 7.1.
Let t be the Thue-Morse word, the fixed point of the morphism µ sending 0 to 01 and 1 to 10. Then L t ( n ) = , if n = 0 or n = 2 k for k ≥ , if n = 1 , n = 3 · k for k ≥ , if n = 2;0 , otherwise.To some extent this is not surprising, since we know that the only squares in t are of length 2 k or 3 · k . However, L w ( n ) can be nonzero even if a sequence has nosquares, as the following example shows. Example 7.2.
Let vtm be the variant of the Thue-Morse word defined over aternary alphabet, the fixed point of the morphism sending 2 to 210, 1 to 20, and 0to 1. It is well-known that vtm is squarefree [7]. Then L vtm ( n ) = , if n = 0 or n = 2 k for k ≥ , if n = 3 · k for k ≥ , if n = 1 , , otherwise. Example 7.3.
Let us look at an example in a different base, and where there arefactors of unbounded exponent. Let c = 101000101 · · · be the Cantor sequence ,which is the fixed point of the morphism 1 →
101 and 0 → L c ( n ) = , if n = 4;2 , if n = 0 , , · k for k ≥ , otherwise. Example 7.4.
Let f be the Fibonacci word, the fixed point of the morphismsending 0 to 01 and 1 to 0. Define the Fibonacci numbers by F = 0, F = 1, and F n = F n − + F n − for n ≥
2. Then L f ( n ) = , if n = 0 or n = F k for k ≥ n = F k + F k − for k ≥ , if n = 1 , , otherwise. Example 7.5.
Let TR be the Tribonacci word, the fixed point of the morphismsending 0 to 01, 1 to 02, and 2 to 0. Define the Tribonacci numbers by T = 0, T = 1, T = 1, and T n = T n − + T n − + T n − for n ≥
3. Then L TR ( n ) = , if n = 0 or n = T k for k ≥ n = T k + T k − for k ≥ n = T k + T k − for k ≥ , if n = 4;3 , if n = 1 , , otherwise.Finally, we give an example where L w ( n ) = 0 for n ≥ Example 7.6.
Let Σ = { x , . . . , x , y , . . . , y } and let Φ : Σ ∗ → Σ ∗ be the mor-phism given by x x x y y x x x y y x x x y y x x x y y x x x y y x x x y y y x x y y y x x y y y x x y y y x x y y y x x y y y x x y y . and let w = Φ ω ( x ). Then w is 2-automatic and Lemma 6.1 of [5] shows that L w ( n ) = 0 for n ≥ L w ( n ) = 0 for n ≥ i has been studied over various alphabets [14].8. Concluding Remarks
In this final section, we pose a question that is suggested by the computations we've performed.

Question 8.1. Can the Lie complexity function of a morphic word be unbounded?

We note that the analogue of Theorem 1.2 is known to hold for pure morphic words [15, Corollary 20], and so if Question 8.1 has a negative answer, that is, if the Lie complexity function of every morphic word is bounded, then Proposition 2.1 would give an extension of this result to general morphic words.

Although many classes of morphic words, including primitive morphic and $k$-uniform morphic words, have linear complexity and hence are covered by Theorem 1.2, the factor complexity function of a morphic word need not be linear in general. Pansiot (see [8, Theorem 4.7.1]) has shown that the factor complexity of a pure morphic word is either $O(1)$, $\Theta(n)$, $\Theta(n \log \log n)$, $\Theta(n \log n)$, or $\Theta(n^2)$, and that each of these possibilities can be realized as the factor complexity of a pure morphic word.

References
[1] J.-P. Allouche and J. O. Shallit. Automatic Sequences, Cambridge University Press, 2003.
[2] G. Badkobeh and P. Ochem. Avoiding conjugacy classes on the 5-letter alphabet. RAIRO Theor. Inform. Appl. (2020), 1–4.
[3] A. R. Baranwal. Decision algorithms for Ostrowski-automatic sequences. M. Math. thesis, University of Waterloo, School of Computer Science, 2020. Available at https://uwspace.uwaterloo.ca/handle/10012/15845.
[4] J. P. Bell. A dichotomy result for prime algebras of Gelfand-Kirillov dimension two. J. Algebra (2010), no. 4, 831–840.
[5] J. P. Bell and B. W. Madill. Iterative algebras. Algebr. Represent. Theory (2015), no. 6, 1533–1546.
[6] J. P. Bell and A. Smoktunowicz. The prime spectrum of algebras of quadratic growth. J. Algebra (2008), no. 1, 414–431.
[7] J. Berstel. Sur la construction de mots sans carré. Séminaire de Théorie des Nombres (1978–1979), 18.01–18.15.
[8] V. Berthé and Michel Rigo (eds.). Combinatorics, automata and number theory, Encyclopedia of Mathematics and its Applications, vol. 135, Cambridge University Press, Cambridge, 2010.
[9] J. Cassaigne. Special factors of sequences with linear subword complexity. In J. Dassow, G. Rozenberg, and A. Salomaa, eds., Developments in Language Theory II. World Scientific, 1996, pp. 25–34.
[10] J. Cassaigne. Complexité et facteurs spéciaux. Journées Montoises (Mons, 1994). Bull. Belg. Math. Soc. Simon Stevin (1997), no. 1, 67–88.
[11] J. Cassaigne, G. Fici, M. Sciortino, and L. Q. Zamboni. Cyclic complexity of words. J. Combin. Theory Ser. A (2017), 36–56.
[12] E. Charlier, N. Rampersad and J. Shallit. Enumeration and decidable properties of automatic sequences. Internat. J. Found. Comp. Sci. (2012), 1035–1066.
[13] C. Frougny. Fibonacci numeration systems and rational functions. In J. Gruska, B. Rovan, and J. Wiedermann, eds., MFCS 86, Lect. Notes in Comp. Sci., Vol. 233, Springer-Verlag, 1986, pp. 350–359.
[14] G. Gamard, P. Ochem, G. Richomme, and P. Séébold. Avoidability of circular formulas. Theoret. Comput. Sci. (2018), 1–4.
[15] K. Klouda and S. Starosta. An algorithm for enumerating all infinite repetitions in a D0L-system. J. Discrete Algorithms (2015), 130–138.
[16] F. Mignosi. Infinite words with linear subword complexity. Theor. Comput. Sci. (1989), 221–242.
[17] F. Mignosi and G. Pirillo. Repetitions in the Fibonacci infinite word. RAIRO Info. Theor. (1992), 199–204.
[18] H. Mousavi. Automatic theorem proving in Walnut, Arxiv preprint, 2016. Available at http://arxiv.org/abs/1603.06017.
[19] H. Mousavi and J. Shallit. Mechanical proofs of properties of the Tribonacci word. In F. Manea and D. Nowotka, eds., WORDS 2015, Lect. Notes in Comp. Sci., Vol. 9304, Springer-Verlag, 2015, pp. 170–190.
[20] G. Richomme, K. Saari, and L. Q. Zamboni. Abelian complexity of minimal subshifts. J. London Math. Soc. (2011), 79–95.

Department of Pure Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
Email address: [email protected]
School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
Email address: