How far away must forced letters be so that squares are still avoidable?

Abstract

We describe a new non-constructive technique to show that squares are avoidable by an infinite word even if we force some letters from the alphabet to appear at certain occurrences. We show that as long as forced positions are at distance at least 19 (resp. 3, resp. 2) from each other then we can avoid squares over 3 letters (resp. 4 letters, resp. 6 or more letters). We can also deduce exponential lower bounds on the number of solutions. For our main Theorem to be applicable, we need to check the existence of some languages and we explain how to verify that they exist with a computer. We hope that this technique could be applied to other avoidability questions where the good approach seems to be non-constructive (e.g., the Thue-list coloring number of the infinite path).


Matthieu Rosenfeld

February 10, 2020


A square is a word of the form $uu$ where $u$ is a non-empty word. We say that a word is square-free (or avoids squares) if none of its factors is a square. For instance, hotshots is a square while minimize is square-free. In 1906, Thue showed that there are arbitrarily long ternary words avoiding squares [8]. This result is often regarded as the starting point of combinatorics on words, and the generalizations of this particular question received a lot of attention.

The authors of [2] study three such questions asked by Harju [5]. They also introduced a stronger version of the third problem.

Problem 1 ([2, Problem 4]). Let $p \ge$ […] be an integer and let $v = v_1 v_2 v_3 \ldots$ be any infinite ternary word. Does there exist an infinite ternary square-free word $w = w_1 w_2 w_3 \ldots$ such that for all $i$, $w_{p \cdot i} = v_i$?

They give a partial solution to this question and they show that the answer is yes for any $v$ if $p \ge$ […]. In fact, they showed something slightly stronger. Let $d(\Sigma)$ be the smallest integer such that for all $v \in \Sigma^{\omega}$ and for every sequence of indices $(p_i)_{1 \le i}$ such that $\forall i$, $p_{i+1} - p_i \ge d(\Sigma)$, there is an infinite square-free word $u \in \Sigma^{\omega}$ such that $v_i = u_{p_i}$. They showed that […] $\le d(\{0,1,2\}) \le$ […]. Moreover, the fact that squares are avoidable over 3 letters can be used to show that $d(\{0,1,2,3\}) \le$ […]. We show that […] $\le d(\{0,1,2\}) \le 19$, $d(\{0,1,2,3\}) = 3$, and $d(\{0,1,2,\ldots,k\}) = 2$ for $k \ge 5$.

The main theorem of this paper gives sufficient conditions for the existence of square-free languages that fulfill some constraints. Kolpakov showed that there are more than […]$^n$ square-free words of length $n$ over a ternary alphabet using a new non-constructive technique [6]. One of the ideas behind Kolpakov's result is roughly to approximate (using a computer) the language of square-free words by the language of words avoiding squares of period less than $l$ for large $l$, and to show that we do not lose too many words if we remove the larger squares from this language.
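To make these notions concrete, here is a minimal Python sketch (it is not taken from the paper or from its ancillary C++ code, and the function names are hypothetical). It tests whether a word contains a square, optionally restricted to squares of period at most `max_period`, so that "period less than $l$" corresponds to `max_period = l - 1`, and it counts words by brute force.

```python
from itertools import product

def has_square(word, max_period=None):
    """Return True if `word` contains a factor uu with 1 <= |u| <= max_period.

    With max_period=None every possible period is tested, so the function
    then decides whether `word` fails to be square-free.
    """
    n = len(word)
    limit = n // 2 if max_period is None else min(max_period, n // 2)
    for period in range(1, limit + 1):
        for start in range(n - 2 * period + 1):
            first = word[start:start + period]
            second = word[start + period:start + 2 * period]
            if first == second:
                return True
    return False

def count_words(n, alphabet="012", max_period=None):
    """Count words of length n over `alphabet` avoiding squares of period <= max_period."""
    return sum(
        1 for w in product(alphabet, repeat=n) if not has_square(w, max_period)
    )
```

For length 4 over three letters, `count_words(4)` gives 18 square-free words, while `count_words(4, max_period=1)` gives 24 words that avoid only squares of period 1. This is a toy instance of approximating the square-free language by a larger language that forbids only short squares.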
We use a similar idea in this paper. We also use ideas from the power series method (see for instance [1, 7]) even if we do not explicitly manipulate any power series. It seems to be a good approach to show that the Thue-list number of paths is […] (see [3, 4] for definitions and conjectures on this topic) or to tackle other problems that might require a non-constructive approach.

This paper is organized as follows. We start by fixing some notations in Section 2. In Section 3, we give a weaker version of Theorem 4 to present the ideas of the theorem without some of the technicalities. Then in Section 4, we give the proof of Theorem 4, our main theorem. In Section 5, we explain how to verify with a computer the existence of some languages that are required to apply Theorem 4. Finally, in Section 6, we use Theorem 4 to bound the values of $d$ for different alphabet sizes.

We denote the set of non-negative integers (resp. positive integers) by $\mathbb{N}$ (resp. $\mathbb{N}_{>0}$). For any word $w \in \Sigma^*$, we denote the $i$-th letter of $w$ by $w_i$ and the length of $w$ by $|w|$. Then for any $w \in \Sigma^*$, $w = w_1 w_2 \ldots w_{|w|}$. For any set of non-empty words $W$, we let $W^*$ (resp. $W^{\omega}$) be the set of words obtained by catenation of finitely many (resp. infinitely many) elements of $W$. A language over an alphabet is a set of finite words over this alphabet. We use the convention that $\prod_{x \in \emptyset} x = 1$ and $\max_{x \in \emptyset} x = 0$ (we could use $-\infty$ for the second one, but it is slightly less convenient for the implementation).

A partial word over $\Sigma$ is a (possibly infinite) word over the alphabet $\Sigma \cup \{\diamond\}$. For any partial word $\mu \in (\Sigma \cup \{\diamond\})^* \cup (\Sigma \cup \{\diamond\})^{\omega}$ and word $v \in \Sigma^* \cup \Sigma^{\omega}$, we say that $v$ is compatible with $\mu$ if $|v| \le |\mu|$ and $\mu_i \ne \diamond \implies \mu_i = v_i$ for all $i$ such that $v_i$ and $\mu_i$ are defined. We denote by $S(\mu)$ the set of square-free words that are compatible with the partial word $\mu$.

The main Theorem of this paper is Theorem 4.
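The notions of partial word and compatibility fixed in the notation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the character `.` stands for the hole symbol, the function names are hypothetical, and the search is exhaustive, so it is usable only for very short partial words.

```python
from itertools import product

HOLE = "."  # illustrative stand-in for the hole symbol of partial words

def is_compatible(v, mu):
    """v is compatible with mu if |v| <= |mu| and v agrees with mu outside the holes."""
    return len(v) <= len(mu) and all(m == HOLE or m == c for c, m in zip(v, mu))

def is_square_free(word):
    """Return True if no factor of `word` has the form uu with u non-empty."""
    n = len(word)
    return not any(
        word[s:s + q] == word[s + q:s + 2 * q]
        for q in range(1, n // 2 + 1)
        for s in range(n - 2 * q + 1)
    )

def square_free_compatible(mu, alphabet):
    """Words of length exactly |mu| that belong to S(mu), by exhaustive search."""
    words = ("".join(t) for t in product(alphabet, repeat=len(mu)))
    return [v for v in words if is_compatible(v, mu) and is_square_free(v)]
```

For example, `square_free_compatible("0..0", "012")` returns `["0120", "0210"]`: the two forced letters leave exactly two square-free completions of that length.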
The main idea of this theorem is that if a language avoids short squares and is large enough then it contains square-free words of any length. The statement and proof of this theorem are rather difficult to follow, so we give in this section a version of the theorem for the case where the set $W$ is a singleton $\{w\}$. We hope that this helps to convey the ideas of the proof of Theorem 4. This is in fact really similar to the ideas of [7], but instead of building the word letter by letter, we construct it factor by factor. For that we fix one size of a factor and look at the number of words whose length corresponds to multiples of this size.

Theorem 2.

Let $\Sigma$ be an alphabet, $w \in (\Sigma \cup \{\diamond\})^*$ be a finite partial word and $p \ge |w|$ such that $|w|$ divides $p$. Suppose that there are $C \in \mathbb{N}_{>0}$ and a language $L$ such that:

(I) $\varepsilon \in L$.
(II) For all $u \in L$, $u$ avoids squares of period less than $p$.
(III) For any $u \in L$ there are at least $C$ different words $v \in \Sigma^{|w|}$ compatible with $w$ such that $uv \in L$.
(IV) There exists $x \in\, ]0,1[$ such that:
$$C\left(1 - \frac{x^{\frac{p}{|w|}-1}\,|w|^2}{1-x}\right) \ge x^{-1}$$

Then $S(w^{\omega})$ is infinite.

Proof. Let $\mu = w^{\omega}$. Let $L(\mu)$ be a set of words from $L$ that are compatible with $\mu$ such that, for any $u \in L(\mu)$ of length divisible by $|w|$, there are exactly $C$ different words $v \in \Sigma^*$ compatible with $w$ with $uv \in L(\mu)$. Conditions (I), (II) and (III) imply that such a set can be obtained by removing words from $L$. For all non-negative $i$, let $s_i = |S(\mu) \cap L(\mu) \cap \{u \in \Sigma^* : |u| = i|w|\}|$ be the number of square-free words of $L(\mu)$ of length $i|w|$.

We will show by induction on $i$ that for all positive $i$, $s_{i+1} \ge x^{-1} s_i$. Let $n$ be a positive integer such that:
$$\forall\, 0 \le i < n,\quad s_{i+1} \ge x^{-1} s_i \qquad \text{(IH1)}$$

By definition of $L(\mu)$, for any word $u$ of $S(\mu) \cap L(\mu)$ of length $n|w|$ there are exactly $C$ different factors $v$ of length $|w|$ such that $uv$ is in $L(\mu)$. Let $F$ be the set of words in $L(\mu) \setminus S(\mu)$ of length $(n+1)|w|$ whose prefix of length $n|w|$ is in $S(\mu) \cap L(\mu)$. Then by definition:
$$s_{n+1} \ge C s_n - |F|. \qquad (1)$$

In order to bound $|F|$, let us introduce for all $i < n+1$, $F_i = \{uvvy \in F : |w|(i-1) < |uv| \le i|w|,\ |y| < |w|\}$. That is, $F_i$ is the set of words of $F$ that contain a square whose midpoint (the middle of the square) is located between the positions $(i-1)|w|$ and $i|w|$ in the word. Clearly $|F| \le \sum_{i=1}^{n} |F_i|$, so our next task is to compute bounds on $|F_i|$ for all $i$.

Lemma 3. We have the following inequalities:
• for all $i > n + 1 - \frac{p}{|w|}$, $|F_i| = 0$,
• for all $i \le n + 1 - \frac{p}{|w|}$, $|F_i| \le s_i C |w|^2$.

Proof. If $i > n + 1 - \frac{p}{|w|}$, then $(i-1)|w| + p \ge (n+1)|w|$. Since $L$ does not contain squares of period less than $p$, $F_i = \emptyset$.

Now, let $i \le n + 1 - \frac{p}{|w|}$. For any $i$ and $z \in S(\mu) \cap L(\mu) \cap \{u \in \Sigma^* : |u| = i|w|\}$ let $F_i(z)$ be the set of words of $F_i$ that admit $z$ as a prefix.

By definition of $F_i$, any word of $F_i(z)$ contains a square whose second half starts in position $a+1$ and ends in position $b$ where $(i-1)|w| < a \le i|w|$ and $n|w| < b \le (n+1)|w|$. Given $z$, $a$ and $b$, we know the first half of the square and thus the word is known at least up to position $n|w|$. By definition of $L(\mu)$ there are at most $C$ possible values for the remaining $|w|$ letters. By summing over all the values of $a$ and $b$ one gets $|F_i(z)| \le C|w|^2$. By summing over all the values of $z$, we finally get $|F_i| \le s_i C |w|^2$.

Now, by (IH1), for all $i$, $|F_i| \le C|w|^2 x^{n-i} s_n$ and thus:
$$|F| \le \sum_{i=1}^{n} |F_i| \le \sum_{i=1}^{n+1-\frac{p}{|w|}} C|w|^2 x^{n-i} s_n \le C|w|^2 s_n \sum_{i=\frac{p}{|w|}-1}^{\infty} x^i$$
$$|F| \le C|w|^2 s_n \frac{x^{\frac{p}{|w|}-1}}{1-x}$$

We can use this bound in inequality (1) and we get:
$$s_{n+1} \ge C s_n - C|w|^2 s_n \frac{x^{\frac{p}{|w|}-1}}{1-x}$$
$$s_{n+1} \ge s_n C\left(1 - \frac{x^{\frac{p}{|w|}-1}\,|w|^2}{1-x}\right)$$
$$s_{n+1} \ge x^{-1} s_n \qquad \text{(by Theorem hypothesis (IV))}$$

This concludes the proof that for all positive $i$, $s_{i+1} \ge x^{-1} s_i$. Since $s_0 = 1$, we deduce that $s_i$ is unbounded and thus $S(w^{\omega})$ is infinite.

This section is devoted to the proof of the main Theorem. As already mentioned, the ideas of the proof are the same as for the proof of Theorem 2. However, this is more technical because $W$ is not a singleton anymore. Moreover, we want the equivalent of condition (IV) to be as general as possible and for that, we need to bound the size of $F$ as tightly as possible. Thus the equivalent of Lemma 3 (Lemma 5) is much more technical and we delay its proof to a later subsection.

Theorem 4.
Let $\Sigma$ be an alphabet, $W \subseteq (\Sigma \cup \{\diamond\})^*$ be a finite set of finite partial words, and $p \ge \max\{|w| : w \in W\}$ be an integer. Suppose that there is a language $L$ and a function $f : \mathbb{N}_{>0} \to \mathbb{N}_{>0}$ such that:

(I) $\varepsilon \in L$.
(II) For any $u \in L$ and $w \in W$ there are at least $f(|w|)$ different words $v \in \Sigma^{|w|}$ compatible with $w$ and such that $uv \in L$.
(III) For all $u \in L$, $u$ avoids squares of period less than $p$.
(IV) For all $u, v \in W$ and integer $1 \le i \le |v|$, let
$$\alpha(|u|, |v|) = \sum_{m=1}^{|u|} \sum_{j=0}^{\lfloor (|v|-1)/m \rfloor} \min\left\{ f(|v|),\ (|\Sigma|-1)^{|v|-1-jm} \right\}$$
and
$$\alpha'(i, |v|) = \sum_{m=0}^{i-1} \min\left\{ f(|v|),\ (|\Sigma|-1)^{m} \right\}.$$
There exist $x_1, x_2, \ldots, x_{\max\{|w| : w \in W\}} \in\, ]0,1[$ and $\beta : \{0, \ldots, p\} \to [0,1]$ solution of the following system:
$$\forall w \in W,\quad f(|w|) - \max_{\substack{u,v \in W \\ 1 \le r \le |v|}} \left\{ \beta(r + p - |w| - |v|) \left( \alpha'(r, |w|) + \frac{x_{|v|}\,\alpha(|u|, |w|)}{1 - x_{|u|}} \right) \right\} \ge x_{|w|}^{-1}$$
$$\forall j \le p,\quad \beta(j) = \max\left\{ \prod_{i \in \{|u| : u \in W\}} x_i^{n_i} \ \middle|\ \forall i \in \{|u| : u \in W\},\ n_i \in \mathbb{N},\ \text{and} \sum_{i \in \{|u| : u \in W\}} i \cdot n_i = j \right\}$$

Then for any infinite partial word $\mu \in W^{\omega}$, $S(\mu)$ is infinite.

Proof. Let $\mu \in W^{\omega}$ and $(\mu_i)_{i \in \mathbb{N}_{>0}} \in W^{\mathbb{N}_{>0}}$ be a sequence of elements of $W$ such that $\mu = \mu_1 \mu_2 \mu_3 \ldots$. For any integer $i$, let $l(i) = |\mu_1 \ldots \mu_i|$. Let $L(\mu)$ be a set of words from $L$ that are compatible with $\mu$ such that, for any $u \in L(\mu)$ of length $|\mu_1 \ldots \mu_j|$, there are exactly $f(|\mu_{j+1}|)$ different words $v \in \Sigma^*$ compatible with $\mu_{j+1}$ with $uv \in L(\mu)$. That is, we remove words from $L$ in order to replace the "at least $f(|\mu_{j+1}|)$" by "exactly $f(|\mu_{j+1}|)$". For all non-negative $i$, let $s_i = |S(\mu) \cap L(\mu) \cap \{v \in \Sigma^* : |v| = i\}|$ be the number of square-free words of $L(\mu)$ of length $i$.

We will show by induction on $i$ that for all positive $i$, $s_{l(i+1)} \ge x_{|\mu_{i+1}|}^{-1} s_{l(i)}$. Let $n$ be a positive integer such that:
$$\forall\, 0 \le i < n,\quad s_{l(i+1)} \ge x_{|\mu_{i+1}|}^{-1} s_{l(i)} \qquad \text{(IH1)}$$

By definition of $L(\mu)$, for any word $w$ of $S(\mu) \cap L(\mu)$ of length $l(n)$ there are exactly $f(|\mu_{n+1}|)$ different factors $v$ of length $|\mu_{n+1}|$ such that $wv$ is in $L(\mu)$. Let $F$ be the set of words in $L(\mu) \setminus S(\mu)$ of length $l(n+1)$ whose prefix of length $l(n)$ is in $S(\mu) \cap L(\mu)$. Then by definition:
$$s_{l(n+1)} \ge f(|\mu_{n+1}|)\, s_{l(n)} - |F|. \qquad (2)$$

In order to bound $|F|$, let us introduce for all $i < n+1$, $F_i = \{uvvw \in F : l(i-1) < |uv| \le l(i),\ |w| < |\mu_{n+1}|\}$. That is, $F_i$ is the set of words of $F$ that contain a square whose midpoint (the middle of the square) is located between the positions $|\mu_1 \ldots \mu_{i-1}|$ and $|\mu_1 \ldots \mu_i|$ in the word. Clearly $|F| \le \sum_{i=1}^{n} |F_i|$, so our next task is to bound the values of $|F_i|$ for all $i$. Let $d$ be the smallest integer such that $|\mu_{d+1} \ldots \mu_{n+1}| \le p$ and let $r = |\mu_d \mu_{d+1} \ldots \mu_{n+1}| - p$. Remark that $r > 0$.

Lemma 5.

We have the following inequalities:
• for all $i > d$, $|F_i| = 0$,
• $|F_d| \le s_{l(d)}\, \alpha'(r, |\mu_{n+1}|)$,
• for all $i \le d$, $|F_i| \le s_{l(i)}\, \alpha(|\mu_i|, |\mu_{n+1}|)$.

The proof of this Lemma is not really informative and is mostly a rather technical counting argument, so we moved it to Section 4.1.

We can use the bounds on the sizes of the $F_i$ to bound $|F|$:

Lemma 6.

We have
$$|F| \le s_{l(d)} \max_{u \in W} \left\{ \alpha'(r, |\mu_{n+1}|) + \frac{x_{|\mu_d|}\, \alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}.$$

Proof.

First, let us show by induction on $i$ that for all $0 \le i < d$:
$$\sum_{j=0}^{i} s_{l(j)}\, \alpha(|\mu_j|, |\mu_{n+1}|) \le s_{l(i)} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}. \qquad \text{(IH2)}$$

Let us first show that this is true with $i = 0$, using the fact that $x_{|\mu_0|} \in\, ]0,1[$:
$$s_{l(0)}\, \alpha(|\mu_0|, |\mu_{n+1}|) \le s_{l(0)} \frac{\alpha(|\mu_0|, |\mu_{n+1}|)}{1 - x_{|\mu_0|}} \le s_{l(0)} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}.$$

Now, let $i+1$ be an integer such that (IH2) is true for $i$. Then
$$\sum_{j=0}^{i+1} s_{l(j)}\, \alpha(|\mu_j|, |\mu_{n+1}|) \le s_{l(i+1)}\, \alpha(|\mu_{i+1}|, |\mu_{n+1}|) + s_{l(i)} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$
$$\le s_{l(i+1)} \left( \alpha(|\mu_{i+1}|, |\mu_{n+1}|) + x_{|\mu_{i+1}|} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\} \right) \qquad \text{(by (IH1))}$$
$$\le s_{l(i+1)} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$

Hence (IH2) holds for all $i < d$, and in particular for $i = d-1$, and we get:
$$|F| \le |F_d| + s_{l(d-1)} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$
$$|F| \le s_{l(d)}\, \alpha'(r, |\mu_{n+1}|) + s_{l(d)}\, x_{|\mu_d|} \max_{u \in W} \left\{ \frac{\alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$
$$|F| \le s_{l(d)} \max_{u \in W} \left\{ \alpha'(r, |\mu_{n+1}|) + \frac{x_{|\mu_d|}\, \alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$

This concludes the proof of this Lemma.

By induction hypothesis (IH1), $s_{l(d)} \le s_{l(n)} \prod_{i=d+1}^{n} x_{|\mu_i|}$.
Let us bound the product on the right hand side:
$$\prod_{i=d+1}^{n} x_{|\mu_i|} \le \max\left\{ \prod_{i \in \{|u| : u \in W\}} x_i^{n_i} \ \middle|\ \forall i \in \{|u| : u \in W\},\ n_i \in \mathbb{N} \text{ and } \sum_{i \in \{|u| : u \in W\}} i \cdot n_i = l(n+1) - l(d) - |\mu_{n+1}| \right\}$$
$$\prod_{i=d+1}^{n} x_{|\mu_i|} \le \beta\big(l(n+1) - l(d) - |\mu_{n+1}|\big)$$
$$\prod_{i=d+1}^{n} x_{|\mu_i|} \le \beta\big(r + p - |\mu_{n+1}| - |\mu_d|\big)$$

Now, using this equation with Lemma 6 gives
$$|F| \le s_{l(n)}\, \beta\big(r + p - |\mu_{n+1}| - |\mu_d|\big) \max_{u \in W} \left\{ \alpha'(r, |\mu_{n+1}|) + \frac{x_{|\mu_d|}\, \alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right\}$$

Now recall that $r = |\mu_d \mu_{d+1} \ldots \mu_{n+1}| - p$ and thus by definition of $d$, $1 \le r \le |\mu_d|$. We deduce:
$$|F| \le s_{l(n)} \max_{\substack{u,v \in W \\ 1 \le r \le |v|}} \left\{ \beta\big(r + p - |\mu_{n+1}| - |v|\big) \left( \alpha'(r, |\mu_{n+1}|) + \frac{x_{|v|}\, \alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right) \right\}$$

We can finally replace $|F|$ by this bound in inequality (2) and we get:
$$s_{l(n+1)} \ge s_{l(n)} \left( f(|\mu_{n+1}|) - \max_{\substack{u,v \in W \\ r \in \{1, \ldots, |v|\}}} \left\{ \beta\big(r + p - |\mu_{n+1}| - |v|\big) \left( \alpha'(r, |\mu_{n+1}|) + \frac{x_{|v|}\, \alpha(|u|, |\mu_{n+1}|)}{1 - x_{|u|}} \right) \right\} \right)$$
$$s_{l(n+1)} \ge s_{l(n)}\, x_{|\mu_{n+1}|}^{-1} \qquad \text{(by Theorem hypothesis (IV))}$$

Moreover $s_0 = 1$ and thus for all $i$, $s_{|\mu_1 \ldots \mu_i|} \ge \prod_{j=1}^{i} x_{|\mu_j|}^{-1}$. For all $j$, $x_{|\mu_j|}^{-1} > 1$, so we conclude that $S(\mu)$ is infinite.

Remark that Theorem 4 is far from sharp. One could improve the bounds given by Lemma 5. This could be done by lowering $\alpha$ and $\alpha'$ or by introducing a third coefficient $\alpha''$ for the second non-empty $F_i$. However, we were not able to obtain significant improvements that were worth the additional technicalities.

In Section 5 we explain how to verify with a computer that there exists a language $L$ that satisfies conditions (I), (II) and (III). We also need a way to verify condition (IV). In order to compute $\beta$, we can use that $\beta(0) = 1$ and, for all $j \in \{1, \ldots, p\}$, $\beta(j) = \max\left\{ x_{|u|}\, \beta(j - |u|) : u \in W,\ |u| \le j \right\}$. Thus given the values of the $x_i$ one can compute $\beta$ using a dynamic programming algorithm, and the rest is straightforward to compute. Thus it is easy to verify with a computer whether or not a given set of values of $x_i$ is a solution. We provide a C++ program that takes as input $|\Sigma|$, $k$, $p$, $f$ and $x_1, \ldots, x_k$ and verifies whether this is a solution of the equations of condition (IV).

This subsection is dedicated to the proof of Lemma 5. Remark that the statement and proof are not self-contained since some of the notations are defined in the proof of Theorem 4.
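The recurrence for $\beta$ stated just before this subsection lends itself to a short dynamic program. The following Python sketch is independent of the C++ program mentioned above; the function name is hypothetical and the numerical values of the $x_i$ are purely illustrative, not values used in the paper.

```python
def beta_table(p, x):
    """Return the list [beta(0), ..., beta(p)].

    `x` maps each length |u| of a word u in W to the value x_{|u|} in ]0, 1[.
    beta(0) = 1 and beta(j) = max{ x_{|u|} * beta(j - |u|) : |u| <= j },
    with the convention that the maximum over an empty set is 0.
    """
    beta = [1.0] + [0.0] * p
    for j in range(1, p + 1):
        beta[j] = max(
            (x_len * beta[j - length] for length, x_len in x.items() if length <= j),
            default=0.0,
        )
    return beta

# Illustrative values only: W is assumed to contain words of lengths 1 and 3.
table = beta_table(4, {1: 0.5, 3: 0.2})
```

Each entry is computed from at most one earlier entry per word length, so the table costs on the order of $p$ times the number of distinct lengths in $W$. Entries for lengths that cannot be written as a sum of lengths from $W$ stay at 0, in line with the convention $\max_{x \in \emptyset} x = 0$.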

Lemma 7.

We have the following inequalities:
• for all $i > d$, $|F_i| = 0$,
• $|F_d| \le s_{l(d)}\, \alpha'(r, |\mu_{n+1}|)$,
• for all $i \le d$, $|F_i| \le s_{l(i)}\, \alpha(|\mu_i|, |\mu_{n+1}|)$.

Proof. If $i > d$ then by definition $|\mu_i \ldots \mu_{n+1}| \le p$. Moreover, $L$ does not contain squares of period less than $p$ and thus $F_i = \emptyset$.

Now, let $i \le d$. By definition, any word from $F_i$ can be written $uvvy$ with $l(i-1) < |uv| \le l(i)$, $|y| < |\mu_{n+1}|$. For any $i$ and $z \in S(\mu) \cap L(\mu) \cap \{u \in \Sigma^* : |u| = l(i)\}$, let $F_i(z)$ be the set of words of $F_i$ that admit $z$ as a prefix. Clearly $|F_i| = \sum_{z \in S(\mu) \cap L(\mu) \cap \{u \in \Sigma^* : |u| = l(i)\}} |F_i(z)|$.

Let $a, b$ (resp. $a', b'$) be integers such that there is an element $z' \in F_i(z)$ that contains a square starting at $a$ (resp. $a'$) and of period $b$ (resp. $b'$) with $l(i-1) + 1 < a' + b' = a + b \le |z| + 1$ and $a > a'$. Because of the square in $z'$, we know that for all $0 \le j \le |z| - a - b$, $z_{a+j} = z_{a+b+j} = z_{a'+b'+j} = z_{a'+j}$. If $a \le a' + |z| - a - b + 1$ then $z$ contains a square, which is not possible. Hence
$$a > a' + |z| - a - b + 1$$
$$a + b + b' > a' + b + b' + |z| - a - b + 1$$
$$a' + 2b' > a + 2b + |z| - a - b + 1 \quad \text{since } a + b = a' + b' \qquad (3)$$

Let $u \in F_i(z)$ be a word that contains a square starting at $a$ and of period $b$; then we know its prefix of size $a + 2b - 1 > l(n)$. Thus there are at most $f(|\mu_{n+1}|)$ possibilities, moreover since the size of the unknown suffix is $l(n+1) + 1 - a - 2b$ […]

p i + | z |− a − b +1 (cid:40) j (cid:88) i =1 min (cid:110) f ( | µ n +1 | ) , ( | Σ | − l ( n +1)+1 − p i (cid:111)(cid:41) = max j ∈ N , ≤ p ,...,p j ≤| µ n +1 |− , ∀ i

[…] $l(d-1) + 1 < a' + b' = a + b \le l(n+1) + 1 - p$ and $a > a'$. By definition of $d$, $l(n+1) - l(d) \le p$. We can use equation (3) again and we get:
$$a' + 2b' > b + |z| + 1 \ge p + |z| + 1 \ge l(n+1) + 1$$
This is a contradiction with the fact that $a' + 2b' \le l(n+1) + 1$. Thus given the value of $a + b$ there is at most one possible value for $a$ and $b$. The number of ways for a fixed $z$ and value of $a + b$ to complete $z$ with a suffix into an element of $F_d$ is at most:
$$\max\left\{ \min\left\{ f(|\mu_{n+1}|),\ (|\Sigma|-1)^{l(n+1)+1-s} \right\} : s \in \mathbb{N}_{>0},\ s \le l(n+1) + 1,\ a + b + p \le s \right\} \le \min\left\{ f(|\mu_{n+1}|),\ (|\Sigma|-1)^{l(n+1)+1-a-b-p} \right\}$$
Then by summing over all the possible values of $a + b$, we get:
$$|F_d(z)| \le \sum_{a+b = l(d-1)+2}^{l(n+1)+1-p} \min\left\{ f(|\mu_{n+1}|),\ (|\Sigma|-1)^{l(n+1)+1-a-b-p} \right\}$$
We can use the variable substitution $m = l(n+1) + 1 - a - b - p$ and remark that $l(n+1) + 1 - p - (l(d-1) + 2) = |\mu_d \mu_{d+1} \ldots \mu_{n+1}| - p - 1 = r - 1$, and we get:
$$|F_d(z)| \le \sum_{m=0}^{r-1} \min\left\{ f(|\mu_{n+1}|),\ (|\Sigma|-1)^{m} \right\} \le \alpha'(r, |\mu_{n+1}|)$$
We conclude that $|F_d| \le s_{l(d)}\, \alpha'(r, |\mu_{n+1}|)$ by summing over all $z$.

[…] $L$ that satisfies Theorem 4

In this section, we explain how to verify the existence of a language that fulfills conditions (I), (II) and (III) of Theorem 4.

We consider some particular directed labeled graphs: $G(V, A)$ is a set $V$ of vertices together with a set $A \subseteq (V \times V \times \Sigma)$ of labeled arcs. For any $u, v \in V$ and $a \in \Sigma$, $(u, v, a) \in A$ is an arc from $u$ to $v$ with label $a$. These graphs could also be seen as finite state machines where all the states are initial and final. The Rauzy graph of length $n$ of a factorial language $L$ over $\Sigma$ is the graph $G(V, A)$ where $V = L \cap \Sigma^n$ and $A = \{(au, ub, b) : aub \in L,\ a, b \in \Sigma\}$. For any graph $G(V, A)$ and any set $X \subseteq V$, we denote by $G[X]$ the subgraph induced by $X$.

Let $R_p(\Sigma)$ be the Rauzy graph of length $2p-3$ of the square-free words over $\Sigma$. Remark that the factors of length $2p-2$ of any walk on $R_p(\Sigma)$ correspond to edges of $R_p(\Sigma)$ and by definition they are square-free. Thus, the sequence of labels of any walk on $R_p(\Sigma)$ avoids squares of period less than $p$, but can contain longer squares. We let $S_p(\Sigma)$ be the set of words that contain no square of period less than $p$ (from the previous remark $S_p(\Sigma)$ can also be seen as the set of walks on $R_p(\Sigma)$).

As an illustration, we give $R_{[\ldots]}(\{0,1,2\})$ in Fig. 1 without the arc labels.

Figure 1: The Rauzy graph $R_{[\ldots]}(\{0,1,2\})$.

For this Section, we abuse the notation and allow ourselves to identify words and sequences.

For any graph $G(V, A)$ and partial word $w \in (\Sigma \cup \{\diamond\})^*$, we define inductively for any integer $i \in \{0, \ldots, |w|\}$ and vertex $v \in V$:
$$p_{i,w,G}(v) = \begin{cases} 1 & \text{if } i = 0, \\ \sum_{\substack{(v,u,a) \in A \\ a \in \Sigma}} p_{i-1,w,G}(u) & \text{if } w_{|w|+1-i} = \diamond, \\ \sum_{(v,u,w_{|w|+1-i}) \in A} p_{i-1,w,G}(u) & \text{otherwise.} \end{cases}$$
Intuitively, $p_{i,w,G}(v)$ gives the number of walks of length $i$ starting from $v$ that are compatible with the $i$ last letters of $w$. Indeed, there is one walk of length $0$ and we always take the transition that is labeled by the current letter of $w$, and any transition if this letter is $\diamond$. Remark that in the third case, there are in fact either $0$ or $1$ summands in the sum.

Lemma 8.

Let $W \subseteq (\Sigma \cup \{\diamond\})^*$, $G(V, A) = R_p(\Sigma)$, $f : \mathbb{N}_{>0} \to \mathbb{N}_{>0}$ and a non-empty set $X \subseteq V$. If for all $v \in X$ and $w \in W$, $p_{|w|,w,G[X]}(v) \ge f(|w|)$, then there exists a language $L$ such that:
• $\varepsilon \in L$,
• for all $u \in L$, $u$ avoids squares of period less than $p$,
• for any $u \in L$ and $w \in W$ there are at least $f(|w|)$ different words $v \in \Sigma^{|w|}$ compatible with $w$ such that $uv \in L$.

Proof. Let $L$ be the set of sequences of labels that correspond to a walk in $G[X]$. By definition, the first two conditions on $L$ are fulfilled.

Let $u \in L$ and $w \in W$. If $|u| \ge 2p-3$ we let $u'$ be the suffix of length $2p-3$ of $u$. Otherwise, we let $u' \in L$ such that $|u'| = 2p-3$ and $u$ is a suffix of $u'$ (there is such an element in $L$). Each walk of length $|w|$ starting in $u'$ gives a unique sequence of labels $u''$ such that $uu''$ contains no square of period less than $p$.

We easily deduce, by induction on $i$, that for all $v$ the number of walks of length $i$ starting at $v$ that are compatible with $w_{|w|-i+1} w_{|w|-i+2} \ldots w_{|w|}$ is at least $p_{i,w,G[X]}(v)$. So, in particular, the number of walks of length $|w|$ starting at $u'$ and compatible with $w$ is at least $p_{|w|,w,G[X]}(u') \ge f(|w|)$. This concludes the proof.

In fact, we need something stronger because for the values of $p$ that we use the graphs $R_p$ are too big to fit in a computer. We can exploit symmetries of $R_p(\Sigma)$ to work on a smaller equivalent graph.

For any square-free word $w \in \Sigma^*$, we let $\Psi(w)$ be the shortest suffix of $w$ such that for all $i \in \{\lceil |\Psi(w)|/2 \rceil + 1, \ldots, p-1\}$ there exists $k \in \{0, 1, \ldots, |\Psi(w)| - i - 1\}$ with $\Psi(w)_{|\Psi(w)|-k} \ne \Psi(w)_{|\Psi(w)|-k-i}$. If $|w| = 2p-3$, then $w$ is such a suffix of itself (since $\{\lceil |w|/2 \rceil + 1, \ldots, p-1\}$ is empty) and thus there is always a shortest suffix.

For instance, with $p = 5$ we have $\Psi(0210120) = 10120$.
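The definition of $\Psi$ can be checked on this example with a small Python sketch. It is not the authors' code: the function names are hypothetical, and it assumes that the defining condition reads $i \in \{\lceil |\Psi(w)|/2 \rceil + 1, \ldots, p-1\}$ and $k \in \{0, \ldots, |\Psi(w)| - i - 1\}$, with letters indexed from 1 as in the notation section.

```python
def is_square_free(word):
    """Return True if no factor of `word` has the form uu with u non-empty."""
    n = len(word)
    return not any(
        word[s:s + q] == word[s + q:s + 2 * q]
        for q in range(1, n // 2 + 1)
        for s in range(n - 2 * q + 1)
    )

def satisfies_psi_condition(s, p):
    """Check the defining condition of Psi on a candidate suffix s."""
    n = len(s)
    for i in range((n + 1) // 2 + 1, p):  # i in {ceil(n/2) + 1, ..., p - 1}
        # We need some k in {0, ..., n-i-1} with s_{n-k} != s_{n-k-i}
        # (1-indexed letters); if every such pair agrees, the condition fails.
        if all(s[n - 1 - k] == s[n - 1 - k - i] for k in range(n - i)):
            return False
    return True

def psi(w, p):
    """Shortest non-empty suffix of w satisfying the condition above."""
    for length in range(1, len(w) + 1):
        suffix = w[len(w) - length:]
        if satisfies_psi_condition(suffix, p):
            return suffix
    raise ValueError("no suffix satisfies the condition; is |w| >= 2p - 3?")
```

With these assumptions, `psi("0210120", 5)` returns `"10120"`, and for each letter `a` in `"012"` the words `"0210120" + a` and `"10120" + a` are either both square-free or both not.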
For any letter $\alpha$, the word $0210120\alpha$ is square-free if and only if $10120\alpha$ is square-free. Indeed, a new square is necessarily a suffix of $0210120\alpha$; it is enough to look at the two letters in bold in $02\mathbf{1}012\mathbf{0}\alpha$ to deduce that there is no square of length $8$, and all the other possible suffixes of even length of $0210120\alpha$ are also suffixes of $10120\alpha$.

In fact, for any $w$ of size $2p-3$, the word $wa$ avoids squares if and only if $\Psi(w)a$ avoids squares (this is proven in the next lemma) and this is the main motivation behind the definition of $\Psi$ (in particular, words with the same image by $\Psi$ can be extended in the same way).

For any graph $G(V, A)$, let $\Psi(G)(\Psi(V), A')$ be the graph such that $A' = \Psi(A) = \{(\Psi(a), \Psi(b), c) : (a, b, c) \in A\}$. The next lemma tells us that we only need to consider the walks on $\Psi(G)$ instead of the walks on $G$.

Lemma 9.

Let $p$ be a positive integer and $w \in (\Sigma \cup \{\diamond\})^*$, $G(V, A) = R_p(\Sigma)$ and $X \subseteq \Psi(V)$. Let $\Psi^{-1}(X) = \{x \in V : \Psi(x) \in X\}$. Then for all $v \in \Psi^{-1}(X)$, $p_{|w|,w,G[\Psi^{-1}(X)]}(v) = p_{|w|,w,\Psi(G)[X]}(\Psi(v))$.

Proof. Let us first show that for any $a \in \Sigma$ and $v \in S_p(\Sigma)$ with $|v| = 2p-3$, if $\Psi(v)a \in S_p(\Sigma)$ then $va \in S_p(\Sigma)$. Let us show that under these assumptions, for any $i$, $va$ avoids squares of period $i$. Since $v$ is square-free, we only need to show that no suffix of $va$ is a square. We have to distinguish between two cases:
• $i \le \frac{|\Psi(v)| + 1}{2}$. Suppose for the sake of contradiction that there is a square of period $i$ in $va$. We deduce that the suffix of length $|\Psi(v)| + 1$ of $va$ contains a square of period $i$. That is, $\Psi(v)a$ contains a square of period $i$, which is a contradiction.
• $i \ge \frac{|\Psi(v)| + 2}{2}$. Since $i$ is an integer we get $i \ge \lceil |\Psi(v)|/2 \rceil + 1$. Moreover $|va| = 2p-2$ and thus $i \le p-1$. Thus by definition of $\Psi(v)$, there exists $k \in \{0, 1, \ldots, |\Psi(v)| - i - 1\}$ such that $\Psi(v)_{|\Psi(v)|-k} \ne \Psi(v)_{|\Psi(v)|-k-i}$. Thus there is $k \in \{1, \ldots, |\Psi(v)a| - i - 1\}$ such that $(\Psi(v)a)_{|\Psi(v)a|-k} \ne (\Psi(v)a)_{|\Psi(v)a|-k-i}$. Remark that $|\Psi(v)a| - i - 1 = |\Psi(v)| - i \le 2i - 2 - i = i - 2$. We conclude that there is $k \in \{1, \ldots, i-2\}$ such that $(va)_{|va|-k} \ne (va)_{|va|-k-i}$. This implies that the suffix of length $2i$ of $va$ is not a square of period $i$.

We deduce that for any $a \in \Sigma$ and $v \in S_p(\Sigma)$ with $|v| = 2p-3$, if $\Psi(v)a \in S_p(\Sigma)$ then $va \in S_p(\Sigma)$.

Let $u \in V$, $v \in \Psi(V)$ and $a \in \Sigma$ such that $(\Psi(u), v, a) \in \Psi(A)$. By definition of $\Psi(A)$ this implies that there is $(u', v', a) \in A$ with $\Psi(u') = \Psi(u)$ and $\Psi(v') = v$. Thus $u'a \in S_p(\Sigma)$ and $\Psi(u)a = \Psi(u')a \in S_p(\Sigma)$. From the previous paragraph, it implies that $ua$ is square-free. Let us show that $\Psi(u_2 u_3 \ldots u_{|u|} a) = v$.
By definition, for all $i \in \{\lceil |\Psi(u')|/2 \rceil + 1, \ldots, p-1\}$ there exists $k \in \{0, 1, \ldots, |\Psi(u')| - i - 1\}$ with $\Psi(u')_{|\Psi(u')|-k} \ne \Psi(u')_{|\Psi(u')|-k-i}$. We easily deduce that for all $i \in \{\lceil |\Psi(u')a|/2 \rceil + 1, \ldots, p-1\}$ there exists $k \in \{0, 1, \ldots, |\Psi(u'a)| - i - 1\}$ with $\Psi(u'a)_{|\Psi(u'a)|-k} \ne \Psi(u'a)_{|\Psi(u'a)|-k-i}$. This implies that $v = \Psi(v')$ is a suffix of $\Psi(u')a$. Since $\Psi(u')a = \Psi(u)a$, we deduce that $v$ is also a suffix of $u_2 u_3 \ldots u_{|u|} a$. Since $v = \Psi(v')$ and $v$ is a suffix of $u_2 u_3 \ldots u_{|u|} a$, we get that $\Psi(u_2 u_3 \ldots u_{|u|} a) = v$.

We showed that if there are $u \in V$, $v \in \Psi(V)$ and $a \in \Sigma$ such that $(\Psi(u), v, a) \in \Psi(A)$, then there exists $v''$ such that $(u, v'', a) \in A$ and $\Psi(v'') = v$. We deduce that for all $u \in V$, $\{(\Psi(u), \Psi(v), a) \in \Psi(A)\} \subseteq \{(\Psi(u), \Psi(v), a) : (u, v, a) \in A\}$. The other inclusion is clear from the definition of $\Psi(A)$ and we get for all $u \in V$:
$$\{(\Psi(u), \Psi(v), a) \in \Psi(A)\} = \{(\Psi(u), \Psi(v), a) : (u, v, a) \in A\} \qquad (4)$$

By definition of $R_p(\Sigma)$, for any $u$ there is at most one outgoing arc for every label in the set on the right. Since the two sets are equal, we deduce that every vertex of the graph $\Psi(G)$ has at most one outgoing arc for any label. Intuitively, (4) implies (by induction on the length of the walk) that for any $u, v \in V$ the set of labeled walks from $u$ to $v$ in $G$ is equal to the set of labeled walks from $\Psi(u)$ to $\Psi(v)$ in $\Psi(G)$.

We are now ready to show by induction on $i$ that for all $i \in \{0, \ldots, |w|\}$ and $v \in \Psi^{-1}(X)$, $p_{i,w,G[\Psi^{-1}(X)]}(v) = p_{i,w,\Psi(G)[X]}(\Psi(v))$.
By definition $p_{0,w,G[\Psi^{-1}(X)]}(v) = 1 = p_{0,w,\Psi(G)[X]}(\Psi(v))$.

Let $n$ be a positive integer such that for all $v \in \Psi^{-1}(X)$,
$$\forall i < n,\quad p_{i,w,G[\Psi^{-1}(X)]}(v) = p_{i,w,\Psi(G)[X]}(\Psi(v)). \qquad \text{(IH)}$$

Then, for $i = n$ and for all $v \in \Psi^{-1}(X)$, if $w_{|w|+1-i} \ne \diamond$ we get:
$$p_{i,w,G[\Psi^{-1}(X)]}(v) = \sum_{\substack{(v,u,w_{|w|+1-i}) \in A \\ u \in \Psi^{-1}(X)}} p_{i-1,w,G[\Psi^{-1}(X)]}(u)$$
$$p_{i,w,G[\Psi^{-1}(X)]}(v) = \sum_{\substack{(v,u,w_{|w|+1-i}) \in A \\ \Psi(u) \in X}} p_{i-1,w,\Psi(G)[X]}(\Psi(u)) \qquad \text{(from (IH))}$$
$$p_{i,w,G[\Psi^{-1}(X)]}(v) = \sum_{\substack{(\Psi(v),\Psi(u),w_{|w|+1-i}) \in \Psi(A) \\ \Psi(u) \in X}} p_{i-1,w,\Psi(G)[X]}(\Psi(u)) \qquad \text{(from (4))}$$
$$p_{i,w,G[\Psi^{-1}(X)]}(v) = \sum_{\substack{(\Psi(v),u,w_{|w|+1-i}) \in \Psi(A) \\ u \in X}} p_{i-1,w,\Psi(G)[X]}(u)$$
$$p_{i,w,G[\Psi^{-1}(X)]}(v) = p_{i,w,\Psi(G)[X]}(\Psi(v))$$

The case where $w_{|w|+1-i} = \diamond$ is similar.

Using Lemma 8 together with Lemma 9, we get the following lemma:

Lemma 10.

Let $W \subseteq (\Sigma \cup \{\diamond\})^*$, $G(V, A) = R_p(\Sigma)$, $f : \mathbb{N}_{>0} \to \mathbb{N}_{>0}$ and $X \subseteq \Psi(V)$ be a non-empty set. If for all $v \in X$ and $w \in W$, $p_{|w|,w,\Psi(G)[X]}(v) \ge f(|w|)$, then there exists a language $L$ such that:
• $\varepsilon \in L$,
• for all $u \in L$, $u$ avoids squares of period less than $p$,
• for any $u \in L$ and $w \in W$ there are at least $f(|w|)$ different words $v \in \Sigma^{|w|}$ compatible with $w$ and such that $uv \in L$.

The graph $\Psi(R_p(\Sigma))$ is much smaller than $R_p(\Sigma)$ and we can use a computer to check the conditions of this lemma for the values of $p$ that we used. One should first find the graph $\Psi(R_p(\Sigma))$. The following fact allows us to easily compute the set of vertices of $\Psi(R_p(\Sigma))$ without computing $R_p(\Sigma)$:

Lemma 11.

Let $w \in S_p(\Sigma)$. Then $w \in \Psi(S_p(\Sigma) \cap \Sigma^{2p-3})$ if and only if $w$ is the smallest non-empty suffix of $w$ such that $\Psi(w) = w$.

Moreover, given a graph $G$, the definition of $p_{|w|,w,G}$ gives a trivial dynamic programming algorithm that computes $p_{|w|,w,G}$ in time $O(|\Sigma| \cdot |w| \cdot |G|)$. Starting with $X = \Psi(R_p(\Sigma))$ and inductively removing from $X$ all the vertices for which $p_{|w|,w,\Psi(R_p(\Sigma))[X]} < f(|w|)$ gives the largest subgraph that meets the conditions of Lemma 10. As long as this subgraph is not empty one can then apply Lemma 10. Algorithm 1 computes the largest subgraph of $\Psi(G)$ with the required property.

Algorithm 1: How to compute the subgraph of $\Psi(G)$.
Input: The graph $\Psi(G)$, the set $W$
Output: The largest set $X \subseteq \Psi(V)$ such that for all $v \in X$ and $w \in W$, $p_{|w|,w,\Psi(G)[X]}(v) \ge f(|w|)$
    X := Ψ(V);
    todo := true;
    while todo do
        todo := false;
        foreach w ∈ W do
            compute p_{|w|,w,Ψ(G)[X]};
            X′ := { v ∈ X : p_{|w|,w,Ψ(G)[X]}(v) ≥ f(|w|) };
            if X ≠ X′ then
                X := X′;
                todo := true;
    return X;

In this section we apply Theorem 4. We provide a C++ implementation of Algorithm 1 that verifies the existence of the language that fulfills conditions (I), (II) and (III) from Theorem 4. Condition (IV) can be easily verified (as long as solutions are given) and we also provide C++ code to do that.

Theorem 12.

For any alphabet $\Sigma$, let $d(\Sigma)$ be the smallest integer such that for all $v \in \Sigma^\omega$ and for every sequence $(p_i)_i$ such that $\forall i$, $p_{i+1} - p_i \ge d(\Sigma)$, there is an infinite square-free word $u \in \Sigma^\omega$ such that $v_i = u_{p_i}$. Then:
• $7 \le d(\{0,1,2\}) \le 19$,
• $d(\{0,1,2,3\}) = 3$,
• $2 \le d(\{0,1,2,3,4\}) \le 3$,
• if $|\Sigma| \ge 6$, $d(\Sigma) = 2$.

Proof. For any alphabet $\Sigma$, we have $d(\Sigma) \ge 2$. Moreover, $d$ is a decreasing function of the size of the alphabet. Thus the third statement can easily be deduced from the second one. We show the remaining statements independently of each other. We will show the upper bounds using Algorithm 1, Lemma 10 and Theorem 4. The lower bounds are verified by exhaustive search.

If $|\Sigma| \ge 6$, $d(\Sigma) = 2$: Let $\Sigma = \{0,1,2,3,4,5\}$ and $W = \{\diamond\} \cup \{\diamond a \diamond : a \in \Sigma\} \cup \{\diamond a \diamond b : a, b \in \Sigma\}$. We can use Algorithm 1 to check that we can apply Lemma 10 with $f(1) = 3$, $f(3) = f(4) = 6$ and $p = 12$ (the codes can be found in the ancillary files of https://arxiv.org/abs/1903.04214). Thus conditions (I), (II) and (III) of Theorem 4 are fulfilled. We can check with a computer that condition (IV) of Theorem 4 is also fulfilled with suitable values of $x_1$ and $x_3 = x_4$. This implies that for any $\mu \in W^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. Since $\{\diamond^i a : i \ge 1, a \in \Sigma\}^\omega \subseteq W^\omega$, we deduce that for any $\mu \in \{\diamond^i a : i \ge 1, a \in \Sigma\}^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. We get $d(\{0,1,2,3,4,5\}) \le 2$.

$d(\{0,1,2,3\}) = 3$: Let $w = (0\ \diamond\ \diamond\ \diamond\ \diamond\ )^\omega$. An exhaustive search confirms that there are only finitely many square-free words over $\{0,1,2,3\}$ compatible with $w$. Thus $d(\{0,1,2,3\}) \ge 3$.

Let $\Sigma = \{0,1,2,3\}$ and $W = \{\diamond\} \cup \{\diamond\diamond a \diamond : a \in \Sigma\} \cup \{\diamond\diamond a \diamond\diamond b : a, b \in \Sigma\}$. We can use Algorithm 1 to check that we can apply Lemma 10 with $f(1) = 2$, $f(4) = 5$, $f(6) = 8$ and $p = 18$.
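As an aside, the check performed by Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions, not the C++ code from the ancillary files: the graph is a plain dictionary mapping each vertex $v$ to its arcs $(u, a)$, the character `*` stands in for $\diamond$, and the names `count_walks`, `largest_valid_subset` and the toy graph are hypothetical. It follows the recursion for $p_{i,w,G[X]}$ (with $p_0 = 1$ and step $i$ reading the letter $w_{|w|+1-i}$) and the pruning loop of Algorithm 1.

```python
FREE = "*"  # stand-in for the free symbol ⋄ (our own convention, not from the paper's code)


def count_walks(graph, X, w):
    """Return {v: p_{|w|,w,G[X]}(v)} for every v in X.

    graph maps each vertex v to a list of arcs (u, a): an arc from v to u
    labelled by the letter a. Only walks whose vertices stay in X are counted.
    """
    counts = {v: 1 for v in X}  # p_0(v) = 1
    # Step i of the recursion reads the letter w_{|w|+1-i}, so w is read backwards.
    for letter in reversed(w):
        counts = {
            v: sum(counts[u] for (u, a) in graph[v]
                   if u in X and (letter == FREE or a == letter))
            for v in X
        }
    return counts


def largest_valid_subset(graph, W, f):
    """Prune X until every v in X has at least f(|w|) compatible walks for every w in W."""
    X = set(graph)
    todo = True
    while todo:
        todo = False
        for w in W:
            counts = count_walks(graph, X, w)
            kept = {v for v in X if counts[v] >= f(len(w))}
            if kept != X:
                X = kept
                todo = True
    return X


# Toy graph (hypothetical): vertex "c" is a dead end and gets pruned.
toy = {"a": [("b", "0"), ("c", "1")], "b": [("a", "1")], "c": []}
print(sorted(largest_valid_subset(toy, [FREE], lambda n: 1)))  # prints ['a', 'b']
```

On the toy graph, requiring one walk compatible with the single word `0` instead empties the set: only `a` survives the first pass, and it is removed in the second pass because its only arc labelled `0` leads to a pruned vertex. The real computation differs mainly in scale, since the graphs $\Psi(R_p(\Sigma))$ used in the proof are far too large for such a naive representation.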
We can then apply Theorem 4 with suitable values of $x_1$, $x_4$ and $x_6$, and we deduce that for any $\mu \in W^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. Moreover, $\{\diamond^i a : i \ge 2, a \in \Sigma\}^\omega \subseteq W^\omega$. We deduce that for any $\mu \in \{\diamond^i a : i \ge 2, a \in \Sigma\}^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. Thus $d(\{0,1,2,3\}) \le 3$.

$7 \le d(\{0,1,2\}) \le 19$: Let $w = (0\ \diamond\ \diamond\ \diamond\ )^\omega$. An exhaustive search confirms that there are only finitely many square-free words over $\{0,1,2\}$ compatible with $w$. Thus $d(\{0,1,2\}) \ge 7$.

Let $\Sigma = \{0,1,2\}$ and $W = \{\diamond\ \} \cup \{\diamond^i a : i \in \{\ , \ldots,\ \}, a \in \Sigma\}$. We can use Algorithm 1 to check that we can apply Lemma 10 with $p = 61$ and the values of $f$ given in Table 1. Thus conditions (I), (II) and (III) of Theorem 4 are fulfilled. We can also check that the values of $x_{|w|}$ given in Table 1 fulfill condition (IV) of Theorem 4.

$|w|$    $f(|w|)$    $x_{|w|}$
Table 1: The values of $f(|w|)$ and $x_{|w|}$ for the computation of $d(\{0,1,2\})$.

We deduce that for any $\mu \in W^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. Moreover, $\{\diamond^i a : i \ge 18, a \in \Sigma\}^\omega \subseteq W^\omega$. We deduce that for any $\mu \in \{\diamond^i a : i \ge 18, a \in \Sigma\}^\omega$ there are infinite square-free words over $\Sigma$ compatible with $\mu$. Thus $d(\{0,1,2\}) \le 19$.

The three applications of Algorithm 1 require between 30 and 100 GB of RAM (and around 5 hours of computation). We had to optimize the way strings are stored in memory in order to be able to compute the graphs for large enough values of $p$. The rest of the computations (finding the solution to the system and the exhaustive search) easily run on a laptop in a few milliseconds.

Note that we showed something slightly stronger, since the results would still hold if an adversary were to tell us at every choice of letter only the next forced letters with their positions (that is, we know the next element of $W$).

Experimental computations suggest that $d(\{0,1,2\})$ is closer to 7 than to 19 and that $d(\{0,1,2,3,4\}) = 2$.

Acknowledgement

Computational resources have been provided by the Consortium des Équipements de Calcul Intensif (CÉCI), funded by the Fonds de la Recherche Scientifique de Belgique (F.R.S.-FNRS) under Grant No. 2.5020.11 and by the Walloon Region.

References

[1] J. P. Bell and T. L. Goh. Exponential lower bounds for the number of words of uniform length avoiding a pattern. Information and Computation, 205(9):1295–1306, 2007.

[2] J. Currie, T. Harju, P. Ochem, and N. Rampersad. Some further results on squarefree arithmetic progressions in infinite words. Theoretical Computer Science, 799:140–148, 2019.

[3] S. Czerwiński and J. Grytczuk. Nonrepetitive colorings of graphs. Electronic Notes in Discrete Mathematics, 28:453–459, 2007. 6th Czech-Slovak International Symposium on Combinatorics, Graph Theory, Algorithms and Applications.

[4] J. Grytczuk, J. Kozik, and P. Micek. New approach to nonrepetitive sequences. Random Structures & Algorithms, 42(2):214–225.

[5] T. Harju. On square-free arithmetic progressions in infinite words. Theoretical Computer Science, 2018.

[6] R. M. Kolpakov. On the number of repetition-free words. Journal of Applied and Industrial Mathematics, 1(4):453–462, 2007.

[7] P. Ochem. Doubled patterns are 3-avoidable. Electronic Journal of Combinatorics, 23(1), 2016.

[8] A. Thue. Über unendliche Zeichenreihen. Norske Vid. Selsk. Skr. I. Mat. Nat. Kl. Christiania