A generalized palindromization map in free monoids
arXiv preprint [cs.DM]
Aldo de Luca (a,∗), Alessandro De Luca (b)
(a) Dipartimento di Matematica e Applicazioni “R. Caccioppoli”, Università degli Studi di Napoli Federico II, via Cintia, Monte S. Angelo, I-80126 Napoli, Italy
(b) Dipartimento di Scienze Fisiche, Università degli Studi di Napoli Federico II, via Cintia, Monte S. Angelo, I-80126 Napoli, Italy
Abstract
The palindromization map ψ in a free monoid A* was introduced in 1997 by the first author in the case of a binary alphabet A, and later extended by other authors to arbitrary alphabets. Acting on infinite words, ψ generates the class of standard episturmian words, including standard Arnoux-Rauzy words. In this paper we generalize the palindromization map, starting with a given code X over A. The new map ψ_X maps X* to the set PAL of palindromes of A*. In this way some properties of ψ are lost and some are saved in a weak form. When X has a finite deciphering delay one can extend ψ_X to X^ω, generating a class of infinite words much wider than standard episturmian words. For a finite and maximal code X over A, we give a suitable generalization of standard Arnoux-Rauzy words, called X-AR words. We prove that any X-AR word is a morphic image of a standard Arnoux-Rauzy word and we determine some suitable linear lower and upper bounds to its factor complexity.
For any code X we say that ψ_X is conservative when ψ_X(X*) ⊆ X*. We study conservative maps ψ_X and conditions on X assuring that ψ_X is conservative. We also investigate the special case of morphic-conservative maps ψ_X, i.e., maps such that ϕ ∘ ψ = ψ_X ∘ ϕ for an injective morphism ϕ. Finally, we generalize ψ_X by replacing palindromic closure with ϑ-palindromic closure, where ϑ is any involutory antimorphism of A*. This yields an extension of the class of ϑ-standard words introduced by the authors in 2006.
Keywords: Palindromic closure, Episturmian words, Arnoux-Rauzy words, Generalized palindromization map, Pseudopalindromes
1. Introduction
∗ Corresponding author
Email addresses: [email protected] (Aldo de Luca), [email protected] (Alessandro De Luca)
Preprint submitted to Elsevier, September 3, 2018
A simple method of constructing all standard Sturmian words was introduced by the first author in [1]. It is based on an operator definable in any free monoid A* and called right palindromic closure, which maps each word w ∈ A* into the shortest palindrome of A* having w as a prefix. Any given word v ∈ A* can suitably ‘direct’ subsequent iterations of the preceding operator according to the sequence of letters in v, as follows: at each step, one concatenates the next letter of v to the right of the already constructed palindrome and then takes the right palindromic closure. Thus, starting with any directive word v, one generates a palindrome ψ(v). The map ψ, called palindromization map, is injective; the word v is called the directive word of ψ(v).
Since for any u, v ∈ A*, ψ(uv) has ψ(u) as a prefix, one can extend the map ψ to right infinite words x ∈ A^ω, producing an infinite word ψ(x). It has been proved in [1] that if each letter of a binary alphabet A occurs infinitely often in x, then one can generate all standard Sturmian words.
The palindromization map ψ has been extended to infinite words over an arbitrary alphabet A by X. Droubay, J. Justin, and G. Pirillo in [2], where the family of standard episturmian words over A has been introduced. In the case that each letter of A occurs infinitely often in the directive word, one obtains the class of standard Arnoux-Rauzy words [3, 4]. A standard Arnoux-Rauzy word over a binary alphabet is a standard Sturmian word.
Some generalizations of the palindromization map have been given. In particular, in [5] a ϑ-palindromization map, where ϑ is any involutory antimorphism of a free monoid, has been introduced. By acting with this operator on any infinite word one obtains a class of words larger than the class of standard episturmian words, called ϑ-standard words; when ϑ is the reversal operator one obtains the class of standard episturmian words. Moreover, the palindromization map has been recently extended to the case of the free group F by C. Kassel and C. Reutenauer in [6].
A recent survey on the palindromization map and its generalizations is in [7].
In this paper we introduce a natural generalization of the palindromization map which is considerably more powerful than the map ψ, since it allows one to generate a class of infinite words much wider than standard episturmian words. The generalization is obtained by replacing the alphabet A with a code X over A and then ‘directing’ the successive applications of the right palindromic closure operator by a sequence of words of the code X. Since any non-empty element of X* can be uniquely factorized by the words of X, one can uniquely map any word of X* to a palindrome. In this way it is possible to associate to every code X over A a generalized palindromization map denoted by ψ_X. If X = A one reobtains the usual palindromization map.
General properties of the map ψ_X are considered in Section 3. Some properties satisfied by ψ are lost and others are saved in a weak form. In general ψ_X is not injective; if X is a prefix code, then ψ_X is injective. Moreover, for any code X, w ∈ X*, and x ∈ X one has that ψ_X(w) is a prefix of ψ_X(wx).
In Section 4 the generalized palindromization map is extended to infinite words of X^ω. In order to define a map ψ_X : X^ω → A^ω one needs that the code X has a finite deciphering delay, i.e., any word of X^ω can be uniquely factorized in terms of the elements of X. For any t ∈ X^ω the word s = ψ_X(t) is trivially closed under reversal, i.e., if u is a factor of s, then so is its reversal u~. If X is a prefix code, the map ψ_X : X^ω → A^ω is injective. Moreover, one can prove that if X is a finite code having a finite deciphering delay, then for any t ∈ X^ω the word ψ_X(t) is uniformly recurrent. We show that one can generate all standard Sturmian words by the palindromization map ψ_X with X = A^2.
Furthermore, one can also construct the Thue-Morse word by using the generalized palindromization map relative to a suitable infinite code.
In Section 5 we consider the case of a map ψ_X : X^ω → A^ω in the hypothesis that X is a maximal finite code. From a basic theorem of Schützenberger the code X must have a deciphering delay equal to 0, i.e., X has to be a maximal prefix code. Given y = x_1 ⋯ x_i ⋯ ∈ X^ω with x_i ∈ X, i ≥ 1, we say that the word s = ψ_X(y) is a generalized Arnoux-Rauzy word relative to X, briefly X-AR word, if for any word x ∈ X there exist infinitely many integers i such that x = x_i. If X = A one obtains the usual definition of standard Arnoux-Rauzy word.
Some properties of the generalized Arnoux-Rauzy words are proved. In particular, any X-AR word s is ω-power free, i.e., any non-empty factor of s has a power which is not a factor of s. We prove that the number S_r(n) of right special factors of s of length n, for a sufficiently large n, has the lower bound given by the number of proper prefixes of X, i.e., (card(X) − 1)/(d − 1), where d = card(A). From this one obtains that for a sufficiently large n, the factor complexity p_s(n) has the lower bound (card(X) − 1)n + c, with c ∈ Z. Moreover, we prove that for all n, p_s(n) has the linear upper bound 2 card(X) n + b with b ∈ Z. The proof of this latter result is based on a theorem which gives a suitable generalization of a formula of Justin [15]. A further consequence of this theorem is that any X-AR word is a morphic image of a standard Arnoux-Rauzy word on an alphabet of card(X) letters. An interesting property showing that any X-AR word s belongs to X^ω is proved in Section 6.
In Section 6 we consider a palindromization map ψ_X satisfying the condition ψ_X(X*) ⊆ X*. We say that ψ_X is conservative. Some general properties of conservative maps are studied and a sufficient condition on X assuring that ψ_X is conservative is given. A special case of conservative map is the following: let ϕ : A* → B* be an injective morphism such that ϕ(A) = X. The map ψ_X is called morphic-conservative if for all w ∈ A*, ϕ(ψ(w)) = ψ_X(ϕ(w)). We prove that if ψ_X is morphic-conservative, then X ⊆ PAL, where PAL is the set of palindromes, and X has to be a bifix code. This implies that ψ_X is injective. Moreover one has that ψ_X is morphic-conservative if and only if X ⊆ PAL, X is prefix, and ψ_X is conservative. Any morphic-conservative map ψ_X can be extended to X^ω, and the infinite words which are generated are images by an injective morphism of epistandard words. An interesting generalization of conservative map to the case of infinite words is the following: a map ψ_X, with X a code having a finite deciphering delay, is weakly conservative if for any t ∈ X^ω, ψ_X(t) ∈ X^ω. If ψ_X is conservative, then it is trivially weakly conservative, whereas the converse is not in general true. We prove that if X is a finite maximal code, then ψ_X is weakly conservative.
In Section 7 we give an extension of the generalized palindromization map ψ_X by replacing the palindromic closure operator with the ϑ-palindromic closure operator, where ϑ is an arbitrary involutory antimorphism in A*. In this way one can define a generalized ϑ-palindromization map ψ_{ϑ,X} : X* → PAL_ϑ, where PAL_ϑ is the set of fixed points of ϑ (ϑ-palindromes). If X is a code having a finite deciphering delay one can extend ψ_{ϑ,X} to X^ω, obtaining a class of infinite words larger than the ϑ-standard words introduced in [5]. We limit ourselves to proving a noteworthy theorem showing that ψ_ϑ = μ_ϑ ∘ ψ = ψ_{ϑ,X} ∘ μ_ϑ, where X = μ_ϑ(A) and μ_ϑ is the injective morphism defined for any a ∈ A as μ_ϑ(a) = a if a = ϑ(a), and μ_ϑ(a) = aϑ(a) otherwise.
2. Notation and preliminaries
Let A be a non-empty finite set, or alphabet. In the following, A* will denote the free monoid generated by A. The elements of A are called letters and those of A* words. The identity element of A* is called the empty word and is denoted by ε. We shall set A+ = A* \ {ε}. A word w ∈ A+ can be written uniquely as a product of letters w = a_1 a_2 ⋯ a_n, with a_i ∈ A, i = 1, …, n. The integer n is called the length of w and is denoted by |w|. The length of ε is conventionally 0.
Let w ∈ A*. A word v is a factor of w if there exist words r and s such that w = rvs. A factor v of w is proper if v ≠ w. If r = ε (resp. s = ε), then v is called a prefix (resp. suffix) of w. If v is a prefix (resp. suffix) of w, then v^{-1}w (resp. wv^{-1}) denotes the word u such that vu = w (resp. uv = w). If v is a prefix of w we shall write v ⪯ w and, if v ≠ w, v ≺ w.
A word w is called primitive if w ≠ v^n, for all v ∈ A* and n > 1. We let PRIM denote the set of all primitive words of A*.
The reversal of a word w = a_1 a_2 ⋯ a_n, with a_i ∈ A, 1 ≤ i ≤ n, is the word w~ = a_n ⋯ a_1. One sets ε~ = ε. A palindrome is a word which equals its reversal. The set of all palindromes over A will be denoted by PAL(A), or PAL when no confusion arises. For any X ⊆ A* we set X~ = {x~ | x ∈ X}. For any word w ∈ A* we let LPS(w) denote the longest palindromic suffix of w. For X ⊆ A*, we set LPS(X) = {LPS(x) | x ∈ X}. A word w is said to be rich in palindromes, or simply rich, if it has the maximal possible number of distinct palindromic factors, namely |w| + 1.
An infinite word w is just an infinite sequence of letters: w = a_1 a_2 ⋯ a_n ⋯, where a_i ∈ A, for all i ≥ 1. For any integer n ≥ 0, w_[n] will denote the prefix a_1 a_2 ⋯ a_n of w of length n. A factor of w is either the empty word or any sequence a_i ⋯ a_j with i ≤ j. If w = uvvv ⋯ v ⋯ = uv^ω with u ∈ A* and v ∈ A+, then w is called ultimately periodic, and periodic if u = ε.
The set of all infinite words over A is denoted by A^ω. We also set A^∞ = A* ∪ A^ω. For any w ∈ A^∞ we denote respectively by Fact w and Pref w the sets of all factors and prefixes of the word w. For X ⊆ A*, Pref X denotes the set of all prefixes of the words of X.
Let w ∈ A^∞. A factor u of w is right special (resp. left special) if there exist two letters a, b ∈ A, a ≠ b, such that ua and ub (resp. au and bu) are factors of w. The factor u is called bispecial if it is right and left special. The order of a right (resp. left) special factor u of w is the number of distinct letters a ∈ A such that ua ∈ Fact w (resp. au ∈ Fact w).
Let w ∈ A^∞ and u a factor of w. An occurrence of u in w is any λ ∈ A* such that λu ⪯ w. If λ_1 and λ_2 are two distinct occurrences of u in w with |λ_1| < |λ_2|, the gap between the occurrences is |λ_2| − |λ_1|.
For any w ∈ A* and letter a ∈ A, |w|_a denotes the number of occurrences of the letter a in w.
The factor complexity p_w of a word w ∈ A^∞ is the map p_w : N → N counting for each n ≥ 0 the distinct factors of w of length n, i.e., p_w(n) = card(A^n ∩ Fact w).
The following recursive formula (see, for instance, [8]) allows one to compute the factor complexity in terms of right special factors: for all n ≥ 0,
p_w(n + 1) = p_w(n) + Σ_{j=0}^{d} (j − 1) s_r(j, n),   (1)
where d = card(A), and s_r(j, n) is the number of right special factors of w of length n and order j.
A morphism (resp. antimorphism) from A* to the free monoid B* is any map ϕ : A* → B* such that ϕ(uv) = ϕ(u)ϕ(v) (resp. ϕ(uv) = ϕ(v)ϕ(u)) for all u, v ∈ A*. A morphism ϕ can be naturally extended to A^ω by setting, for any w = a_1 a_2 ⋯ a_n ⋯ ∈ A^ω, ϕ(w) = ϕ(a_1)ϕ(a_2) ⋯ ϕ(a_n) ⋯.
A code over A is a subset X of A+ such that every word of X+ admits a unique factorization by the elements of X (cf. [9]). A subset of A+ with the property that none of its elements is a proper prefix (resp. suffix) of any other is trivially a code, usually called prefix (resp. suffix). We recall that if X is a prefix (resp. suffix) code, then X* is right unitary (resp. left unitary), i.e., for all p ∈ X* and w ∈ A*, pw ∈ X* (resp. wp ∈ X*) implies w ∈ X*.
A bifix code is a code which is both prefix and suffix. A code X is called infix if no word of X is a proper factor of another word of X. A code X will be called weakly overlap-free if no word x ∈ X can be factorized as x = sp, where s and p are respectively a proper non-empty suffix of a word x′ ∈ X and a proper non-empty prefix of a word x″ ∈ X. Note that the code X = {abb, bbc} is not overlap-free [10], but it is weakly overlap-free.
A code X has a finite deciphering delay if there exists an integer k such that for all x, x′ ∈ X, if xX^kA* ∩ x′X* ≠ ∅ then x = x′.
The minimal k for which the preceding condition is satisfied is called the deciphering delay of X. A prefix code has a deciphering delay equal to 0.
Let X be a set of words over A. We let X^ω denote the set of all infinite words x = x_1 x_2 ⋯ x_n ⋯, with x_i ∈ X, i ≥ 1. As is well known [9], if X is a code having a finite deciphering delay, then any x ∈ X^ω can be uniquely factorized by the elements of X.
We introduce in A* the map (+) : A* → PAL which associates to any word w ∈ A* the palindrome w^(+) defined as the shortest palindrome having the prefix w (cf. [1]). We call w^(+) the right palindromic closure of w. If Q = LPS(w) is the longest palindromic suffix of w = uQ, then one has w^(+) = uQu~.
Let us now define the map ψ : A* → PAL, called right iterated palindromic closure, or simply palindromization map, over A*, as follows: ψ(ε) = ε and for all u ∈ A*, a ∈ A, ψ(ua) = (ψ(u)a)^(+).
The following proposition summarizes some simple but noteworthy properties of the palindromization map (cf., for instance, [1, 2]):
Proposition 2.1.
The palindromization map ψ over A* satisfies the following properties: for u, v ∈ A*,
P1. If u is a prefix of v, then ψ(u) is a palindromic prefix (and suffix) of ψ(v).
P2. If p is a prefix of ψ(v), then p^(+) is a prefix of ψ(v).
P3. Every palindromic prefix of ψ(v) is of the form ψ(u) for some prefix u of v.
P4. The palindromization map is injective.
For any w ∈ ψ(A*) the unique word u such that ψ(u) = w is called the directive word of w.
One can extend ψ to A^ω as follows: let w ∈ A^ω be an infinite word w = a_1 a_2 ⋯ a_n ⋯, a_i ∈ A, i ≥ 1. Since by property P1 of the preceding proposition for all n, ψ(w_[n]) is a prefix of ψ(w_[n+1]), we can define the infinite word ψ(w) as ψ(w) = lim_{n→∞} ψ(w_[n]).
The extended map ψ : A^ω → A^ω is injective. The word w is called the directive word of ψ(w).
The family of infinite words ψ(A^ω) is the class of the standard episturmian words, or simply epistandard words, over A introduced in [2] (see also [11]). When each letter of A occurs infinitely often in the directive word, one has the class of the standard Arnoux-Rauzy words [3, 4]. A standard Arnoux-Rauzy word over a binary alphabet is usually called standard Sturmian word. Epistand_A will denote the class of all epistandard words over A.
An infinite word s ∈ A^ω is called episturmian (resp. Sturmian) if there exists a standard episturmian (resp. Sturmian) word t ∈ A^ω such that Fact s = Fact t.
The words of the set ψ(A*) are the palindromic prefixes of all standard episturmian words over the alphabet A. They are called epicentral words, and simply central [12], in the case of a two-letter alphabet.
Example. Let A = {a, b}. If w = (ab)^ω, then the standard Sturmian word f = ψ((ab)^ω) having the directive word w is the famous Fibonacci word
f = abaababaabaab ⋯
In the case of a three-letter alphabet A = {a, b, c} the standard Arnoux-Rauzy word having the directive word w = (abc)^ω is the so-called Tribonacci word
τ = abacabaabacaba ⋯
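The definitions of w^(+) and ψ given above translate directly into a few lines of code. The following is only an illustrative sketch in Python (the function names `right_palindromic_closure` and `psi` are our own and do not come from the paper); words are represented as ordinary strings, and the closure is computed from the longest palindromic suffix, using w^(+) = uQu~ with Q = LPS(w).

```python
def right_palindromic_closure(w: str) -> str:
    """Return w^(+), the shortest palindrome having w as a prefix."""
    # Scan suffixes from longest to shortest; the first palindromic one
    # is Q = LPS(w). Writing w = uQ, the closure is u Q u~.
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:
            return w + w[:i][::-1]
    return w  # not reached: the empty suffix is always a palindrome


def psi(v: str) -> str:
    """Palindromization map: psi(eps) = eps, psi(ua) = (psi(u) a)^(+)."""
    result = ""
    for letter in v:
        result = right_palindromic_closure(result + letter)
    return result


if __name__ == "__main__":
    print(psi("abab"))  # abaababaaba, a prefix of the Fibonacci word
    print(psi("abca"))  # abacabaabacaba, a prefix of the Tribonacci word
```

Directing ψ by longer prefixes of (ab)^ω or (abc)^ω yields longer palindromic prefixes of the Fibonacci and Tribonacci words, in agreement with property P1.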
3. A generalized palindromization map
Let X be a code over the alphabet A. Any word w ∈ X+ can be uniquely factorized in terms of the elements of X. So we can introduce the map ψ_X : X* → PAL, inductively defined for any w ∈ X* and x ∈ X as:
ψ_X(ε) = ε, ψ_X(x) = x^(+), ψ_X(wx) = (ψ_X(w)x)^(+).
In this way to each word w ∈ X*, one can uniquely associate the palindrome ψ_X(w). We call ψ_X the palindromization map relative to the code X. If X = A, then ψ_A = ψ.
Example. Let A = {a, b}, X = {ab, ba}, and w = abbaab; X is a code so that w can be uniquely factorized as w = x_1 x_2 x_1 with x_1 = ab and x_2 = ba. One has: ψ_X(ab) = aba, ψ_X(abba) = (ababa)^(+) = ababa, and ψ_X(abbaab) = ababaababa.
The properties of the palindromization map ψ stated in Proposition 2.1 are not in general satisfied by the generalized palindromization map ψ_X. For instance, if X = {ab, abb} one has ab ≺ abb, but ψ_X(ab) = aba is not a prefix of ψ_X(abb) = abba. Property P1 can be replaced by the following:
Proposition 3.1.
Let v = x_1 ⋯ x_n with x_i ∈ X, i = 1, …, n. For any v_j = x_1 ⋯ x_j, 1 ≤ j < n, one has ψ_X(v_j) ≺ ψ_X(v). If X is a prefix code, then the following holds: for u, v ∈ X*, if u ⪯ v, then ψ_X(u) ⪯ ψ_X(v).
Proof.
For any j = 1, …, n − 1 one has ψ_X(x_1 ⋯ x_j x_{j+1}) = (ψ_X(x_1 ⋯ x_j) x_{j+1})^(+), so that ψ_X(x_1 ⋯ x_j) ≺ ψ_X(x_1 ⋯ x_{j+1}). From the transitivity of the relation ≺ it follows that ψ_X(v_j) ≺ ψ_X(v).
Let now X be a prefix code and suppose that u, v ∈ X* and u ⪯ v. We can write v = x_1 ⋯ x_n and u = x′_1 ⋯ x′_m with x_i, x′_j ∈ X, i = 1, …, n and j = 1, …, m. Since u ⪯ v, one has v = uζ, with ζ ∈ A*. From the right unitarity of X* it follows that ζ ∈ X* and, therefore, x′_i = x_i for i = 1, …, m. From the preceding result it follows that ψ_X(u) ⪯ ψ_X(v).
Properties P2 and P3 are also in general not satisfied by ψ_X. As regards P2, consider, for instance, the code X = {a, ab, bb} and the word w = abbab. One has ψ_X(w) = abbaabba. Now ψ_X(w) has the prefix ab but not (ab)^(+) = aba. As regards P3, take X = {abab, b}; one has ψ_X(abab) = ababa. Its palindromic prefix aba is not equal to ψ_X(v) for any v ∈ X*.
Differently from ψ, the map ψ_X is not in general injective. For instance, if X is the code X = {ab, aba}, then ψ_X(ab) = ψ_X(aba) = aba. Property P4 can be replaced by the following:
Proposition 3.2.
Let X be a prefix code over A. Then ψ_X is injective.
Proof. Suppose that there exist words x_1, …, x_m, x′_1, …, x′_n ∈ X such that
ψ_X(x_1 ⋯ x_m) = ψ_X(x′_1 ⋯ x′_n).
We shall prove that m = n and that for all 1 ≤ i ≤ n, one has x_i = x′_i.
Without loss of generality, we can suppose m ≤ n. Let us first prove by induction that for all 1 ≤ i ≤ m, one has x_i = x′_i. Let us assume that x_1 = x′_1, …, x_k = x′_k for 0 < k < m and show that x_{k+1} = x′_{k+1}. To this end let us set w = ψ_X(x_1 ⋯ x_m) and w′ = ψ_X(x′_1 ⋯ x′_n), so that w = w′. In view of the preceding proposition, we can write:
w = ψ_X(x_1 ⋯ x_k x_{k+1})ζ = (ψ_X(x_1 ⋯ x_k) x_{k+1})^(+) ζ
and
w′ = ψ_X(x_1 ⋯ x_k x′_{k+1})ζ′ = (ψ_X(x_1 ⋯ x_k) x′_{k+1})^(+) ζ′,
with ζ, ζ′ ∈ A*. Now one has:
(ψ_X(x_1 ⋯ x_k) x_{k+1})^(+) = ψ_X(x_1 ⋯ x_k) x_{k+1} ξ and (ψ_X(x_1 ⋯ x_k) x′_{k+1})^(+) = ψ_X(x_1 ⋯ x_k) x′_{k+1} ξ′,
with ξ, ξ′ ∈ A*. Therefore, we obtain:
w = ψ_X(x_1 ⋯ x_k) x_{k+1} ξζ = ψ_X(x_1 ⋯ x_k) x′_{k+1} ξ′ζ′ = w′.
By cancelling on the left in both sides of the previous equation the common prefix ψ_X(x_1 ⋯ x_k) one derives
x_{k+1} ξζ = x′_{k+1} ξ′ζ′.   (2)
Since X is a prefix code one obtains x_{k+1} = x′_{k+1}. Since an equation similar to (2) holds also in the case k = 0, one has x_1 = x′_1. Therefore, x_i = x′_i for i = 1, …, m. We can write:
ψ_X(x_1 ⋯ x_m) = ψ_X(x_1 ⋯ x_m x′_{m+1} ⋯ x′_n).
Since by Proposition 3.1 ψ_X(x_1 ⋯ x_m) would be a proper prefix of ψ_X(x_1 ⋯ x_m x′_{m+1} ⋯ x′_n) if m < n, it follows that m = n.
A partial converse of the preceding proposition is:
Proposition 3.3.
Let X be a code such that X ⊆ PAL ∩ PRIM. If ψ_X is injective, then X is prefix.
Proof. Let us suppose that X is not a prefix code. Then there exist words x, y ∈ X such that x ≠ y and y = xλ with λ ∈ A+. Since x, y ∈ PAL one has y = xλ = λ~x. We shall prove that the longest palindromic suffix LPS(yyx) of the word yyx = λ~xyx is xyx. This would imply, as x, y ∈ PAL, that
ψ_X(yyx) = (yyx)^(+) = λ~xyxλ = yyy = (yyy)^(+) = ψ_X(yyy),
so that ψ_X would not be injective, a contradiction.
Let us then suppose that y = λ~x = αxβ, α, β ∈ A*, and that LPS(yyx) = xβyx. This implies βy ∈ PAL, so that
βy = βαxβ = yβ~ = αxββ~.
Therefore, one has β = β~ and
β(αxβ) = (αxβ)β.
From a classic result of combinatorics on words [13], there exist w ∈ PRIM and integers h, k ∈ N such that β = w^h and y = αxβ = w^k. Since y ∈ PRIM, it follows that k = 1, y = w, and β = y^h. As |β| < |y|, the only possibility is h = 0, so that β = ε, which implies LPS(yyx) = xyx.
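The examples of this section can be replayed mechanically. The sketch below is an illustration only, not part of the original development: `psi_X` takes the factorization of w ∈ X* as an explicit list of code words (so it does not itself check that X is a code), and `is_prefix_code` is a hypothetical helper testing the prefix condition used in Propositions 3.1 and 3.2.

```python
def pal_closure(w: str) -> str:
    """Right palindromic closure w^(+) = u Q u~, where Q = LPS(w)."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w


def psi_X(code_words: list[str]) -> str:
    """Generalized palindromization: psi_X(w x) = (psi_X(w) x)^(+).

    The argument is the factorization x_1, ..., x_n of a word of X*.
    """
    result = ""
    for x in code_words:
        result = pal_closure(result + x)
    return result


def is_prefix_code(X: set[str]) -> bool:
    """True if no word of X is empty or a proper prefix of another word of X."""
    return all(x and not (x != y and y.startswith(x)) for x in X for y in X)


if __name__ == "__main__":
    # X = {ab, ba}, w = ab.ba.ab
    print(psi_X(["ab", "ba", "ab"]))        # ababaababa
    # X = {ab, aba} is not prefix, and psi_X is not injective on it
    print(is_prefix_code({"ab", "aba"}))    # False
    print(psi_X(["ab"]), psi_X(["aba"]))    # aba aba
    # X = {a, ab, bb}, w = a.bb.ab: property P2 fails (ab is a prefix, aba is not)
    print(psi_X(["a", "bb", "ab"]))         # abbaabba
```

Passing the factorization explicitly keeps the sketch short; for a prefix code the factorization of a word can be recovered greedily from left to right, which is the situation covered by Proposition 3.2.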
4. An extension to infinite words
Let us now consider a code X having a finite deciphering delay. One can extend ψ_X to X^ω as follows: let x = x_1 x_2 ⋯ x_n ⋯, with x_i ∈ X, i ≥ 1. From Proposition 3.1, for any n ≥ 1, ψ_X(x_1 ⋯ x_n) is a proper prefix of ψ_X(x_1 ⋯ x_n x_{n+1}), so that there exists
lim_{n→∞} ψ_X(x_1 ⋯ x_n) = ψ_X(x).
Let us observe that the word ψ_X(x) has infinitely many palindromic prefixes. This implies that ψ_X(x) is closed under reversal, i.e., if w ∈ Fact ψ_X(x), then also w~ ∈ Fact ψ_X(x). If X = A one obtains the usual extension of ψ to the infinite words.
Let us explicitly remark that if X is a code with an infinite deciphering delay one cannot associate by the generalized palindromization map to each word x ∈ X^ω a unique infinite word. For instance, the code X = {a, ab, bb} has an infinite deciphering delay; the word ab^ω admits two distinct factorizations by the elements of X. The first, beginning with ab, is (ab)(bb)^ω; the second, beginning with a, is a(bb)^ω. Using the first decomposition one can generate by the generalized palindromization map the infinite word (ababb)^ω, and using the second the infinite word (abb)^ω.
Let us observe that the previously defined map ψ_X : X^ω → A^ω is not in general injective. For instance, take the code X = {ab, aba}, which has finite deciphering delay equal to 1. As is readily verified, one has ψ_X((ab)^ω) = ψ_X((aba)^ω) = (aba)^ω.
The following proposition holds; we omit its proof, which is very similar to that of Proposition 3.2.
Proposition 4.1.
Let X be a prefix code over A. Then the map ψ_X : X^ω → A^ω is injective.
The class of infinite words that one can generate by means of generalized palindromization maps ψ_X is, in general, strictly larger than the class of standard episturmian words.
Example. Let A = {a, b} and X = {a, bb}. Let x be any infinite word x = abbay with y ∈ X^ω. One has that ψ_X(abba) = abbaabba, so that the word ψ_X(x) will not be balanced (cf. [12]). This implies that ψ_X(x) is not a Sturmian word. Let A = {a, b, c} and X = {a, abca}. Take any word x = abcay with y ∈ X^ω. One has ψ_X(abca) = abcacba. Since the prefix abca is not rich in palindromes, it follows that ψ_X(x) is not an episturmian word.
Theorem 4.2.
For any finite code X having finite deciphering delay and any t ∈ X^ω, the word s = ψ_X(t) is uniformly recurrent.
Proof. Let t = x_1 x_2 ⋯ x_n ⋯ ∈ X^ω, with x_i ∈ X, i ≥ 1, and let w be any factor of s. Let α be the shortest prefix α = x_1 ⋯ x_h of t such that w ∈ Fact u, with u = ψ_X(α). The word s is trivially recurrent since it has infinitely many palindromic prefixes. Hence, w occurs infinitely many times in s. We will show that the gaps between successive occurrences of w in s are bounded above by |u| + 2ℓ_X, where ℓ_X = max_{x∈X} |x|. This is certainly true within the prefix u: even if w occurs in u more than once, the gap between any two such occurrences cannot be longer than |u|.
Let us then assume we proved such a bound on gaps for successive occurrences of w in ψ_X(β), where β = x_1 ⋯ x_k, h ≤ k, and let us prove it for occurrences in ψ_X(βy), where y = x_{k+1}. We can write ψ_X(β) = uρ = ρ~u and ψ_X(βy) = ψ_X(β)λ = λ~ψ_X(β) for some λ, ρ ∈ A*, so that
ψ_X(βy) = ρ~uλ = λ~uρ.   (3)
By inductive hypothesis, the only gap we still need to consider is the one between the last occurrence of w in ρ~u and the first one in uρ as displayed in (3). If |ρ| > |λ|, then both such occurrences of w fall within ρ~u = ψ_X(β), so that by induction we are done. So suppose |λ| > |ρ|. As one easily verifies, the previous gap is at most equal to the gap between the two displayed occurrences of u in (3), namely |λ| − |ρ|. From (3) one has:
|λ| − |ρ| = |ψ_X(βy)| − |ψ_X(β)| − (|ψ_X(β)| − |u|) = |ψ_X(βy)| − 2|ψ_X(β)| + |u|.
Now, as
|ψ_X(βy)| = |(ψ_X(β)y)^(+)| < 2(|ψ_X(β)| + |y|) ≤ 2|ψ_X(β)| + 2ℓ_X,
we have |λ| − |ρ| < |u| + 2ℓ_X. By induction, we can conclude that gaps between successive occurrences of w are bounded by |u| + 2ℓ_X in the whole s, as desired.
Let y = y_1 y_2 ⋯ y_n ⋯ ∈ X^ω, with y_i ∈ X for all i ≥ 1. We say that a word x ∈ X is persistent in y if there exist infinitely many integers i_1 < i_2 < ⋯ < i_k < ⋯ such that x = y_{i_k} for all k ≥ 1. We say that the word y = y_1 y_2 ⋯ y_n ⋯ ∈ X^ω is alternating if there exist distinct letters a, b ∈ A, a word λ ∈ A*, and a sequence of indices i_0 < i_1 < ⋯ < i_n < ⋯, such that λa ⪯ y_{i_{2k}} and λb ⪯ y_{i_{2k+1}} for all k ≥ 0. If there exist two distinct words x_1, x_2 ∈ X, which are persistent in y and such that {x_1, x_2} is a prefix code, then y is alternating. If X is finite, then the two conditions are actually equivalent.
Proposition 4.3.
Let y = y_1 ⋯ y_n ⋯ ∈ X^ω with y_i ∈ X, i ≥ 1. If y is alternating, then ψ_X(y) is not ultimately periodic.
Proof. By hypothesis, there exists an increasing sequence of indices (i_n)_{n≥0} such that for all k ≥ 0, λa ⪯ y_{i_{2k}} and λb ⪯ y_{i_{2k+1}}, for some λ ∈ A* and distinct letters a, b.
For all n ≥ 0, let u_n denote the word ψ_X(y_1 ⋯ y_n). We shall prove that u_nλ is a right special factor of s = ψ_X(y) for any n, thus showing that s cannot be ultimately periodic (cf. [12]).
We can choose an integer h > 0 such that i_{2h} > n. Let us set m = i_{2h} and x = y_{i_{2h}}. Now one has that
u_{m−1}x ⪯ u_m ∈ Pref s.
Since u_n is a prefix and a suffix of u_{m−1} it follows, writing x = λaη for some η ∈ A*, that
u_n x = u_nλaη ∈ Fact s.
Since i_{2h+1} > i_{2h}, setting x′ = y_{i_{2h+1}} = λbη′ for some η′ ∈ A*, one derives by a similar argument that
u_n x′ = u_nλbη′ ∈ Fact s.
From the preceding equations one has that u_nλ is a right special factor of s.
We shall now prove a theorem showing how one can generate all standard Sturmian words by the palindromization map relative to the code X = {a, b}^2. We premise the following lemma, which is essentially a restatement of a well-known characterization of central words (see for instance [1, Proposition 9]).
Lemma 4.4.
Let A = {a, b} and E be the automorphism of A* interchanging the letter a with b. If z ∈ A and w ∈ A* \ z*, then ψ(wz) = ψ(w) z E(z) ψ(w′) for some w′ ∈ Pref w.
Theorem 4.5.
Let A = {a, b} and X = A^2. An infinite word s ∈ A^ω is standard Sturmian if and only if s = ψ_X(t) for some alternating t ∈ X^ω such that
t ∈ ((aa)* ∪ (bb)*){ab, ba}^ω.
Proof.
Let s = ψ_X(t); we can assume without loss of generality that t ∈ (aa)^k{ab, ba}^ω with k ∈ N. Let t_[2n] be the prefix of t of length 2n (which belongs to X*). We shall prove that ψ_X(t_[2n]) is a central word for all n ≥ 0. This is trivial for all prefixes t_[2p] of t with p ≤ k. Let us now assume, by induction, that ψ_X(t_[2n]) is central for a given n ≥ k and prove that ψ_X(t_[2n+2]) is central.
We can write t_[2n+2] = t_[2n]ab or t_[2n+2] = t_[2n]ba. Since by the inductive hypothesis ψ_X(t_[2n]) is central, there exists u_n ∈ A* such that ψ_X(t_[2n]) = ψ(u_n). The words ψ_X(t_[2n])ab and ψ_X(t_[2n])ba are finite standard words and therefore, as is well known, prefixes of standard Sturmian words (cf. [12, Corollary 2.2.28]). By property P2 of Proposition 2.1, their palindromic closures (ψ_X(t_[2n])ab)^(+) = ψ_X(t_[2n]ab) and (ψ_X(t_[2n])ba)^(+) = ψ_X(t_[2n]ba) are both central. Hence, in any case ψ_X(t_[2n+2]) is central, so that there exists u_{n+1} ∈ A* such that ψ_X(t_[2n+2]) = ψ(u_{n+1}). Since ψ(u_n) is a prefix of ψ(u_{n+1}), from Proposition 2.1 one derives that u_n ≺ u_{n+1}.
We have thus proved the existence of a sequence of finite words (u_n)_{n≥0}, with u_i ≺ u_{i+1} for all i ≥ 0, such that for all n ≥ 0, ψ_X(t_[2n]) = ψ(u_n). Letting Δ = lim_{n→∞} u_n, we obtain s = ψ(Δ). Since t is alternating, s is not ultimately periodic by Proposition 4.3, so that it is a standard Sturmian word.
Conversely, let s be a standard Sturmian word, and let Δ be its directive word. Without loss of generality, we can assume that Δ begins in a; let n ≥ 1 be such that a^n b ∈ Pref Δ. If n is even, we have
ψ(a^n b) = ((aa)^{n/2} b)^(+) = ((aa)^{n/2} ba)^(+) = ψ_X((aa)^{n/2} ba),
whereas if n is odd,
ψ(a^n b) = ((aa)^{(n−1)/2} ab)^(+) = ψ_X((aa)^{(n−1)/2} ab).
Let now z ∈ A and uz be a prefix of Δ longer than a^n b. By induction, we can suppose that there exists some w ∈ (aa)*{ab, ba}* such that ψ(u) = ψ_X(w). From Lemma 4.4 and Proposition 2.1, we obtain, setting ẑ = E(z),
(ψ(u)zẑ)^(+) ⪯ ψ(uz) ⪯ (ψ(u)zẑ)^(+).
Hence,
ψ(uz) = (ψ(u)zẑ)^(+) = (ψ_X(w)zẑ)^(+) = ψ_X(wzẑ).   (4)
We have thus shown how to construct arbitrarily long prefixes of the desired infinite word t, starting from the Sturmian word s. Since a and b both occur infinitely often in Δ, by (4) we derive that t is alternating.
Example. In the case of the Fibonacci word f let us take X = {ab, ba}. As is readily verified, one has:
f = ψ_X(ab(abba)^ω).
Let μ be the Thue-Morse morphism, and t = μ^ω(a) the Thue-Morse word [13]. We recall that μ is defined by μ(a) = ab and μ(b) = ba. The next proposition will show that t can be obtained using our generalized palindromization map, relative to a suitable infinite code.
Let us set u_n = μ^{2n}(a) and v_n = E(u_n)b, for all n ∈ N. Thus v_0 = bb, v_1 = baabb, v_2 = baababbaabbabaabb, and so on.
Proposition 4.6.
The set $X = \{a\} \cup \{v_n \mid n \in \mathbb{N}\}$ is a prefix code, and
\[ t = \psi_X(a v_0 v_1 v_2 \cdots). \]

Proof.
As a consequence of [5, Theorem 8.1], we can write $u_{n+1} = \mu^{2n+2}(a) = \left(\mu^{2n+1}(a) b\right)^{(+)}$. Since for any $k \ge 0$, $\mu^{k+1}(a) = \mu^k(a) E\left(\mu^k(a)\right)$, we obtain for all $n \ge 0$
\[ u_{n+1} = (u_n E(u_n) b)^{(+)} = (u_n v_n)^{(+)}. \tag{5} \]
Since $b \prec v_i$ for all $i \ge 0$, by (5) it follows $u_i b \prec u_i v_i \preceq u_{i+1}$, so that $u_i b \prec u_j$ whenever $0 \le i < j$, whence $E(u_i b) = E(u_i) a \prec E(u_j)$. This implies that for $0 \le i < j$, $v_i = E(u_i) b$ is not a prefix of $v_j = E(u_j) b$. Clearly $v_i$ is not a prefix of any $v_k$ with $k < i$, nor of $a$, which in turn is not a prefix of any $v_i$ with $i \in \mathbb{N}$; hence $X$ is a prefix code.

Since $u_0 = a = \psi_X(a)$, from (5) it follows that for all $n > 0$, $u_n = \psi_X(a v_0 \cdots v_{n-1})$. As $t = \lim_{n \to \infty} u_n$, the assertion is proved.
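Proposition 4.6 lends itself to a small numerical sanity check. The Python sketch below computes $\psi_X$ directly from its definition, $\psi_X(x_1 \cdots x_n) = (\psi_X(x_1 \cdots x_{n-1}) x_n)^{(+)}$, and compares $\psi_X(a v_0 \cdots v_{n-1})$ with $\mu^{2n}(a)$ for small $n$. The helper names (`pal_closure`, `psi_X`, `mu_power`, `exchange`, `v`) are illustrative choices of ours, and a word of $X^*$ is passed as an explicit list of code words, so that no factorization over $X$ is needed.

```python
def pal_closure(w: str) -> str:
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:      # first hit = longest palindromic suffix
            return w + w[:i][::-1]


def psi_X(factors) -> str:
    """Generalized palindromization of a sequence of code words."""
    p = ""
    for x in factors:
        p = pal_closure(p + x)
    return p


def mu_power(n: int) -> str:
    """mu^n(a) for the Thue-Morse morphism mu(a) = ab, mu(b) = ba."""
    table = {"a": "ab", "b": "ba"}
    w = "a"
    for _ in range(n):
        w = "".join(table[c] for c in w)
    return w


def exchange(w: str) -> str:
    """The letter-exchange map E on {a, b}."""
    return w.translate(str.maketrans("ab", "ba"))


def v(n: int) -> str:
    """v_n = E(u_n) b with u_n = mu^(2n)(a)."""
    return exchange(mu_power(2 * n)) + "b"


for n in range(4):
    lhs = psi_X(["a"] + [v(i) for i in range(n)])
    print(n, lhs == mu_power(2 * n))
```

Such a check only covers finitely many prefixes and is no substitute for the proof above.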
5. Generalized Arnoux-Rauzy words
Let us suppose that the code $X$ over the alphabet $A$ is finite and maximal, i.e., it is not properly included in any other code on the same alphabet. By a classic result of Schützenberger, either $X$ is prefix or it has an infinite deciphering delay [9]. Therefore, if one wants to define a map $\psi_X : X^\omega \to A^\omega$, one has to suppose that the code is a prefix maximal code.

We shall now introduce a class of infinite words which are a natural generalization in our framework of the standard Arnoux-Rauzy words.

Let $X$ be a finite maximal prefix code over the alphabet $A$ of cardinality $d > 1$. We say that the word $s = \psi_X(y)$, with $y \in X^\omega$, is a standard Arnoux-Rauzy word relative to $X$, or $X$-AR word for short, if every word $x \in X$ is persistent in $y$.

Let us observe that if $X = A$ we have the usual definition of standard Arnoux-Rauzy word. Any $X$-AR word is trivially alternating and therefore, from Proposition 4.3, it is not ultimately periodic. The following proposition extends to $X$-AR words a property satisfied by the classic standard Arnoux-Rauzy words.

Proposition 5.1.
Let $s = \psi_X(y)$ be an $X$-AR word with $y = y_1 \cdots y_n \cdots$, $y_i \in X$, $i \ge 1$. Then for any $n \ge 0$, $u_n = \psi_X(y_1 \cdots y_n)$ is a bispecial factor of $s$ of order $d = \mathrm{card}(A)$. This implies that every prefix of $s$ is a left special factor of $s$ of order $d$.

Proof. Since $X$ is a finite maximal prefix code, it is complete [9], i.e., it is represented by the leaves of a full $d$-ary tree (i.e., each node in the tree is either a leaf or has exactly degree $d$). Hence, $X^{f} = A$, where $X^{f}$ denotes the set formed by the first letter of all words of $X$. Any word $x \in X$ is persistent in $y$, so that, by using an argument similar to that of the proof of Proposition 4.3, one has that for any $n \ge 0$, $u_n X \subseteq \mathrm{Fact}\, s$, which implies
\[ u_n X^{f} = u_n A \subseteq \mathrm{Fact}\, s, \]
i.e., $u_n$ is a right special factor of $s$ of order $d$. Since $s$ is closed under reversal and $u_n$ is a palindrome, one has that $u_n$ is also a left special factor of $s$ of order $d$. Hence, $u_n$ is a bispecial factor of order $d$. Let $u$ be a prefix of $s$. There exists an integer $n$ such that $u \preceq u_n$. From this one has that $u$ is a left special factor of $s$ of order $d$.

An infinite word $s$ over the alphabet $A$ is $\omega$-power free if for every non-empty word $u \in \mathrm{Fact}\, s$ there exists an integer $p > 0$ such that $u^p \notin \mathrm{Fact}\, s$. We recall the following result (see, for instance, [14, Lemma 2.6.2]) which will be useful in the sequel:

Lemma 5.2.
A uniformly recurrent word is either periodic or $\omega$-power free.

Corollary 5.3.
An $X$-AR word is $\omega$-power free.

Proof. An $X$-AR word is not periodic and by Theorem 4.2 it is uniformly recurrent, so that the result follows from the preceding lemma.

Lemma 5.4.
Let $X \subseteq A^*$ be a finite set and set $\ell = \ell_X = \max\{|x| \mid x \in X\}$. Let $w = w_1 \cdots w_m$, $w_i \in A$, $i = 1, \ldots, m$, be a palindrome with $m \ge \ell$. If there exist $u, v \in (\mathrm{Pref}\, X) \setminus X$ such that $|u| = p$, $|v| = q$, $p < q$, and
\[ w_{p+1} \cdots w_m u = w_{q+1} \cdots w_m v, \tag{6} \]
then $w_1 \cdots w_{m-p} = \alpha^k \alpha'$, where $\alpha' \in \mathrm{Pref}\, \alpha$, $\alpha^{\sim}$ is a prefix of $v$ of length $q - p$, and
\[ k \ge \frac{m}{\ell - 1} - 1. \]
Proof.
Let $u = a_1 \cdots a_p$ and $v = b_1 \cdots b_q$ with $a_i, b_j \in A$, $i = 1, \ldots, p$, $j = 1, \ldots, q$. From (6) one derives:
\[ a_i = b_{q-p+i}, \quad i = 1, \ldots, p, \]
and
\[ w_{p+1} \cdots w_q (w_{q+1} \cdots w_m) = (w_{q+1} \cdots w_m) b_1 \cdots b_{q-p}. \]
From a classic result of Lyndon and Schützenberger (cf. [13]), there exist $\lambda, \mu \in A^*$ and an integer $h \ge 0$ such that
\[ w_{p+1} \cdots w_q = \lambda\mu, \quad b_1 \cdots b_{q-p} = \mu\lambda, \quad w_{q+1} \cdots w_m = (\lambda\mu)^h \lambda. \tag{7} \]
Hence,
\[ w_{p+1} \cdots w_m = (\lambda\mu)^{h+1} \lambda. \]
Since $w \in \mathrm{PAL}$, one has for any $i = 1, \ldots, m$, $w_i = w_{m-i+1}$. Hence, by taking the reversals of both sides of the preceding equation, one has:
\[ w_1 \cdots w_{m-p} = w_m \cdots w_{p+1} = (\lambda^{\sim}\mu^{\sim})^{h+1} \lambda^{\sim} = \alpha^k \alpha', \]
having set $k = h + 1$, $\alpha = \lambda^{\sim}\mu^{\sim}$, and $\alpha' = \lambda^{\sim}$. Now from (7), $\alpha^{\sim} = \mu\lambda = b_1 \cdots b_{q-p}$ is a prefix of $v$. From (7) one has that $m - q = h(q - p) + |\lambda|$. Since $|\lambda| \le q - p$, it follows that
\[ m - q \le h(q - p) + (q - p) = (h + 1)(q - p) = k(q - p). \]
Hence, $k \ge \frac{m - q}{q - p}$. As $q - p \le \ell - 1$ and $q \le \ell - 1$, one obtains
\[ k \ge \frac{m - (\ell - 1)}{\ell - 1} = \frac{m}{\ell - 1} - 1. \]

Lemma 5.5.
Let $X$ be a finite maximal prefix code over a $d$-letter alphabet. Then
\[ \mathrm{card}((\mathrm{Pref}\, X) \setminus X) = \frac{\mathrm{card}(X) - 1}{d - 1}. \]

Proof.
The code $X$ is represented by the set of leaves of a full $d$-ary tree. The elements of the set $(\mathrm{Pref}\, X) \setminus X$, i.e., the proper prefixes of the words of $X$, are represented by the internal nodes of the tree. As is well known, the number of internal nodes of a full $d$-ary tree is equal to the number of leaves minus 1, divided by $d - 1$.

In the following, we let $\lambda_X$ be the quantity
\[ \lambda_X = \frac{\mathrm{card}(X) - 1}{d - 1}. \]

Proposition 5.6.
Let $s$ be an $X$-AR word. There exists an integer $e_s$ such that for any non-empty proper prefix $u$ of a word of $X$, one has $u^{e_s} \notin \mathrm{Fact}\, s$. Moreover, also $(u^{\sim})^{e_s} \notin \mathrm{Fact}\, s$.

Proof.
Any word $x \in X$, as well as any prefix of $x$, is a factor of $s$. Let $u$ be any proper non-empty prefix of a word of $X$. From Lemma 5.2 there exists an integer $p$ such that $u^p \notin \mathrm{Fact}\, s$. Let $e_u$ be the smallest $p$ such that this latter condition is satisfied. Let us set
\[ e_s = \max\{e_v \mid v \in (\mathrm{Pref}\, X) \setminus (X \cup \{\varepsilon\})\}. \]
We observe that $e_s$ is finite since $X$ is a finite code. Therefore, for any $u \in (\mathrm{Pref}\, X) \setminus (X \cup \{\varepsilon\})$ one has $u^{e_s} \notin \mathrm{Fact}\, s$. Since $s$ is closed under reversal, it follows that also $(u^{\sim})^{e_s} \notin \mathrm{Fact}\, s$.

Theorem 5.7.
Let $s = \psi_X(y)$, with $y = y_1 \cdots y_n \cdots \in X^\omega$, $y_i \in X$, $i \ge 1$, be an $X$-AR word. There exists an integer $\nu$ such that for all $h \ge \nu$ the number $S_r(h)$ of right special factors of $s$ of length $h$ has the lower bound $\lambda_X$, i.e.,
\[ S_r(h) \ge \lambda_X. \]
Moreover, any such right special factor of $s$ is of degree $d$.

Proof.
In the following we shall set for all $n$, $u_n = \psi_X(y_1 \cdots y_n)$. Let $\ell$ be as in Lemma 5.4, let $m_0$ be the minimal integer such that $\frac{m_0}{\ell - 1} - 1 \ge e_s$, and let $n$ be an integer such that $|u_n| = m \ge m_0$. Let us write $u_n$ as $u_n = w_1 \cdots w_m$ with $w_i \in A$, $i = 1, \ldots, m$. Since any word $x \in X$ is persistent in $y$, it follows that $u_n X \subseteq \mathrm{Fact}\, s$. Therefore, for any proper prefix $u$ of a word $x \in X$ one has that
\[ u_n u = w_1 \cdots w_m u \]
is a right special factor of $s$ of order $d$ and length $m + |u|$. This implies that
\[ w_{|u|+1} \cdots w_m u \tag{8} \]
is a right special factor of length $m$. However, for $u, v \in (\mathrm{Pref}\, X) \setminus X$, $u \ne v$, one cannot have
\[ w_{|u|+1} \cdots w_m u = w_{|v|+1} \cdots w_m v. \]
This is trivial if $|u| = |v|$. If $|u| < |v|$, as $u_n \in \mathrm{PAL}$, by Lemma 5.4 one would derive
\[ w_1 \cdots w_{m-|u|} = \alpha^k \alpha' \]
with $k \ge e_s$ and $\alpha$ equal to the reversal of a proper prefix of a word of $X$, which is absurd in view of Proposition 5.6. Thus all the words of (8) with $u \in (\mathrm{Pref}\, X) \setminus X$ are right special factors of $s$ of length $m$ and order $d$. Since by Lemma 5.5 the number of proper prefixes of the words of $X$ is $\lambda_X$, it follows that the number $S_r(m)$ of right special factors of length $m$ has the lower bound $S_r(m) \ge \lambda_X$. Thus we have proved the result for all $m = |u_n| \ge m_0$.

Let us now take $h$ such that $m < h < m' = |u_{n+1}|$. We can write $u_{n+1} = \zeta w_1 \cdots w_m$ for some word $\zeta$. Since for any $u \in (\mathrm{Pref}\, X) \setminus X$, $u_{n+1} u$ is a right special factor of $s$ of length $m' + |u|$ and order $d$, so is its suffix of length $h$. We wish to prove that all such suffixes of length $h$, for different values of $u$ in $(\mathrm{Pref}\, X) \setminus X$, are distinct. Indeed, if two such suffixes were equal, for instance the ones corresponding to $u, v \in (\mathrm{Pref}\, X) \setminus X$, $u \ne v$, then their suffixes of length $m$ would be equal, i.e.,
\[ w_{|u|+1} \cdots w_m u = w_{|v|+1} \cdots w_m v, \]
which is absurd as shown above. Hence, $S_r(h) \ge \lambda_X$.

Corollary 5.8.
Let $s$ be an $X$-AR word. There exists an integer $\nu$ such that the factor complexity $p_s$ of $s$ has for all $n \ge \nu$ the linear lower bound
\[ (\mathrm{card}(X) - 1) n + c, \quad \text{with } c \in \mathbb{Z}. \]

Proof.
From the preceding theorem, for all $n \ge \nu$, $s$ has at least $\lambda_X$ right special factors of length $n$ and order $d$. Therefore, in view of (1), we can write for all $n \ge \nu$
\[ p_s(n) \ge p_s(\nu) + (n - \nu)\lambda_X(d - 1) = p_s(\nu) + (n - \nu)(\mathrm{card}(X) - 1) = (\mathrm{card}(X) - 1) n + c, \]
having set $c = p_s(\nu) - \nu(\mathrm{card}(X) - 1)$.

In the following we shall show that the factor complexity $p_s$ of an $X$-AR word $s$ is linearly upper bounded (cf. Theorem 5.15). We need some preparatory results and a theorem (cf. Theorem 5.13) which is a suitable extension of a formula of Justin [15] to generalized palindromization maps.

We recall that a positive integer $p$ is a period of the word $w = a_1 \cdots a_n$, $a_i \in A$, $1 \le i \le n$, if the following condition is satisfied: if $i$ and $j$ are any integers such that $1 \le i, j \le n$ and $i \equiv j \pmod{p}$, then $a_i = a_j$. We shall denote by $\pi(w)$ the minimal period of $w$.

Let $X$ be a finite prefix code and $\ell_X$ be the maximal length of the words of $X$. We say that $\psi_X(x_1 \cdots x_m)$ with $x_i \in X$, $i \ge 1$, is full if it satisfies the three following conditions:

F1. For any $x \in X$ there exists at least one integer $j$ such that $1 \le j \le m$ and $x_j = x$.
F2. $\pi(\psi_X(x_1 \cdots x_m)) \ge \ell_X$.
F3. For all $x \in X$ the longest palindromic prefix of $\psi_X(x_1 \cdots x_m)$ followed by $x$ is $\psi_X(x_1 \cdots x_{r_x - 1})$, where $r_x$ is the greatest integer such that $1 \le r_x \le m$ and $x_{r_x} = x$.

Proposition 5.9.
Let $X$ be a finite prefix code, $z \in X^+$, and $y \in X$. If $\psi_X(z)$ is full, then $\psi_X(zy)$ is full.

Proof. It is clear that $\psi_X(zy)$ satisfies property F1. Moreover, one has also that $\pi(\psi_X(zy)) \ge \ell_X$. Indeed, otherwise, since $\psi_X(z)$ is a prefix of $\psi_X(zy)$, one would derive that $\psi_X(z)$ has a period, and then the minimal period, less than $\ell_X$, which is a contradiction.

Let us first prove that $\psi_X(z) = P$, where $P$ is the longest proper palindromic prefix of $\psi_X(zy)$. Indeed, we can write:
\[ \psi_X(zy) = \psi_X(z) y \lambda = P \mu, \]
with $\lambda, \mu \in A^*$ and $\mu \ne \varepsilon$. One has that $|P| \ge |\psi_X(z)|$ and, moreover, $|P| < |\psi_X(z) y|$. This last inequality follows from the minimality of the length of palindromic closure. Let us then suppose, by contradiction, that $|P| > |\psi_X(z)|$, so that
\[ P = \psi_X(z) y' = (y')^{\sim} \psi_X(z), \]
with $y' \prec y$ and $y' \ne \varepsilon$. From the Lyndon and Schützenberger theorem there exist $\alpha, \beta \in A^*$ and $n \in \mathbb{N}$ such that $(y')^{\sim} = \alpha\beta$, $y' = \beta\alpha$, and $\psi_X(z) = (\alpha\beta)^n \alpha$. Since $\psi_X(z)$ is full, from property F1 one has that $|\psi_X(z)| \ge \ell_X$, so that $n > 0$ and $\pi(\psi_X(z)) \le |\alpha\beta| = |y'| < \ell_X$, which is a contradiction. Thus $P = \psi_X(z)$.

From the preceding result one derives that the longest palindromic prefix of $\psi_X(zy)$ followed by $y$ is $\psi_X(z)$. Now let $x \ne y$ and let $Q$ be the longest palindromic prefix of $\psi_X(zy)$ followed by $x$. We can write:
\[ \psi_X(zy) = \psi_X(z) y \lambda = Q x \delta, \]
with $\delta \in A^*$. From the preceding result one has $|Q| \le |\psi_X(z)|$. If $|Q| = |\psi_X(z)|$, then, as $X$ is a prefix code, one gets $x = y$, a contradiction. Hence, $|Q| < |\psi_X(z)|$. We have to consider two cases:

Case 1. $|Qx| > |\psi_X(z)|$. This implies
\[ \psi_X(z) = Q x' = (x')^{\sim} Q, \]
with $x' \prec x$. Hence, one would derive $(x')^{\sim} = uv$, $x' = vu$, and $\psi_X(z) = (uv)^n u$ with $u, v \in A^*$ and $n > 0$. This gives rise to a contradiction, as $\pi(\psi_X(z)) \le |uv| < \ell_X$.

Case 2. $|Qx| \le |\psi_X(z)|$. Let $z = x_1 \cdots x_m$ with $x_i \in X$, $1 \le i \le m$. In this case $Q$ is the longest palindromic prefix of $\psi_X(z)$ followed by $x$, namely $\psi_X(x_1 \cdots x_{r_x - 1})$.

In conclusion, $\psi_X(zy)$ satisfies conditions F1–F3 and is then full.

Lemma 5.10.
Let $s$ be an $X$-AR word and $\psi_X(z)$, with $z \in X^*$, be a prefix of $s$. There exists an integer $\nu_s$ such that if $|\psi_X(z)| \ge \nu_s$, then for any prefix $u = \psi_X(z y x_1 \cdots x_k)$ of $s$ with $k \ge 0$, $y, x_1, \ldots, x_k \in X$, $y \ne x_i$, $1 \le i \le k$, the longest palindromic prefix of $u$ followed by $y$ is $\psi_X(z)$.

Proof. Let us denote by $P$ the longest palindrome such that $Py$ is a prefix of $u$. We wish to prove that for a sufficiently large $\psi_X(z)$ one has that $P = \psi_X(z)$. Let us then suppose by contradiction that $|P| > |\psi_X(z)|$. Setting $x_0 = y$, there exists an integer $i$, $-1 \le i \le k - 1$, such that
\[ |\psi_X(z x_0 \cdots x_i)| \le |P| \le |\psi_X(z x_0 \cdots x_{i+1})|, \tag{9} \]
where for $i = -1$ the left-hand side is understood to be $|\psi_X(z)|$. Let us prove that for $-1 \le i \le k$, $P \ne \psi_X(z x_0 \cdots x_i)$. This is trivial for $i = -1$ and for $i = k$, as $|\psi_X(z)| < |P| < |u|$. For $0 \le i \le k - 1$, $P$ is followed by $y$ whereas $\psi_X(z x_0 \cdots x_i)$ is followed by $x_{i+1}$. As $X$ is a prefix code, if equality held one would obtain $y = x_{i+1}$, which is a contradiction. Hence in (9) the inequalities are strict. If
\[ |\psi_X(z x_0 \cdots x_i) x_{i+1}| \le |P| < |\psi_X(z x_0 \cdots x_{i+1})|, \]
then one would contradict the definition of palindromic closure. Thus the only possibility is that there exists $-1 \le i \le k - 1$ such that
\[ P = \psi_X(z x_0 \cdots x_i) p = p^{\sim} \psi_X(z x_0 \cdots x_i), \]
where $p$ is a proper non-empty prefix of $x_{i+1}$. This implies that there exist words $\lambda, \mu \in A^*$ and an integer $n \ge 0$ such that
\[ p^{\sim} = \lambda\mu, \quad p = \mu\lambda, \quad \psi_X(z x_0 \cdots x_i) = (\lambda\mu)^n \lambda. \tag{10} \]
Let us set $\nu_s = (e_s + 1)\ell_X$, where $e_s$ has been defined in Proposition 5.6 and $\ell_X$ is the maximal length of the words of $X$. Let us suppose that $|\psi_X(z)| \ge \nu_s$. Since
\[ (e_s + 1)\ell_X \le |\psi_X(z)| \le |\psi_X(z x_0 \cdots x_i)| \le (n + 1)\ell_X, \]
one would derive $n \ge e_s$ and $p^n \notin \mathrm{Fact}\, s$, which contradicts (10), and this concludes the proof.

Corollary 5.11.
Let $s = \psi_X(x_1 x_2 \cdots x_n \cdots)$ be an $X$-AR word, with $x_i \in X$, $i \ge 1$. There exists an integer $m \ge 1$ such that for all $n \ge m$, $\psi_X(x_1 \cdots x_n)$ is full.

Proof. Since $s$ is an $X$-AR word, for any $x \in X$ there exist infinitely many integers $j$ such that $x = x_j$. We can take the integer $m$ so large that for any $x \in X$ there exists at least one integer $j$ such that $1 \le j \le m$, $x_j = x$, and, moreover, for each $x \in X$
\[ |\psi_X(x_1 \cdots x_{r_x - 1})| > \nu_s. \]
This assures, in view of the preceding lemma, that for each $x \in X$ the longest palindromic prefix of $\psi_X(x_1 \cdots x_m)$ followed by $x$ is $\psi_X(x_1 \cdots x_{r_x - 1})$. Finally, there exists an integer $m$ such that $\pi(\psi_X(x_1 \cdots x_m)) \ge \ell_X$. Indeed, $s$ is $\omega$-power free, so that there exists an integer $p$ such that for any non-empty factor $u$ of $s$ of length $|u| < \ell_X$ one has $u^p \notin \mathrm{Fact}\, s$. Thus if for all $m$, $\pi(\psi_X(x_1 \cdots x_m)) < \ell_X$, we reach a contradiction by taking $m$ such that $|\psi_X(x_1 \cdots x_m)| \ge (p + 1)\ell_X$. Hence there exists an integer $m$ such that conditions F1–F3 are all satisfied, so that $\psi_X(x_1 \cdots x_m)$ is full. By Proposition 5.9, $\psi_X(x_1 \cdots x_n)$ is also full, for all $n \ge m$.

Lemma 5.12.
Let $z \in X^*$ and $y \in X$. Suppose that $\psi_X(z)$ has some palindromic prefixes followed by $y$, and let $\Delta_y$ be the longest one. Then
\[ \psi_X(zy) = \psi_X(z) \Delta_y^{-1} \psi_X(z). \]

Proof.
Since $\Delta = \Delta_y$ is the longest palindromic prefix of $\psi_X(z)$ followed by $y$, it is also the longest palindromic suffix preceded by $y^{\sim}$, so that $y^{\sim} \Delta y$ is the longest palindromic suffix of $\psi_X(z) y$. Thus, letting $\psi_X(z) = \Delta y \zeta = \zeta^{\sim} y^{\sim} \Delta$ for a suitable $\zeta$, we obtain
\[ \psi_X(zy) = (\psi_X(z) y)^{(+)} = \zeta^{\sim} y^{\sim} \Delta y \zeta = \psi_X(z) \Delta^{-1} \psi_X(z). \]

Let $B$ be a finite alphabet and $\mu : B \to X$ be a bijection to a prefix code $X \subseteq A^*$. For $z \in X^*$, we define a morphism $\varphi_z : B^* \to A^*$ by setting for all $b \in B$
\[ \varphi_z(b) = \psi_X(z\mu(b)) \psi_X(z)^{-1} = \psi_X(z) \Delta^{-1}_{\mu(b)}, \tag{11} \]
where for the last equality we used Lemma 5.12.

Theorem 5.13.
Let $s = \psi_X(x_1 x_2 \cdots x_n \cdots)$ be an $X$-AR word with $x_i \in X$, $i \ge 1$. If $z = x_1 \cdots x_m$ is such that $u_m = \psi_X(z)$ is full and $\mu$, $\varphi_z$ are defined as above, then for any $w \in B^*$ the following holds:
\[ \psi_X(z\mu(w)) = \varphi_z(\psi(w)) \psi_X(z). \]

Proof.
In the following we shall use the readily verified property that if $\gamma : B^* \to A^*$ is a morphism and $v$ is a suffix of $u \in B^*$, then $\gamma(uv^{-1}) = \gamma(u)\gamma(v)^{-1}$.

We will prove the theorem by induction on $|w|$. It is trivial that for $w = \varepsilon$ the claim is true, since $\psi(\varepsilon) = \varepsilon = \varphi_z(\varepsilon)$. Suppose that the statement holds for all the words shorter than $w$. For $|w| > 0$, we set $w = vb$ with $b \in B$, and let $y = \mu(b)$.

First we consider the case $|v|_b \ne 0$. We can then write $v = v_1 b v_2$ with $|v_2|_b = 0$. Since $\psi_X(z)$ is full, so is $\psi_X(z\mu(v))$; hence $\psi_X(z\mu(v_1))$ is the longest palindromic prefix (resp. suffix) followed (resp. preceded) by $y$ (resp. $y^{\sim}$) in $\psi_X(z\mu(v))$. Therefore, by Lemma 5.12 we have
\[ \psi_X(z\mu(v)y) = \psi_X(z\mu(v)) \, \psi_X(z\mu(v_1))^{-1} \, \psi_X(z\mu(v)) \tag{12} \]
and, as $\psi(v_1)$ is the longest palindromic prefix (resp. suffix) followed (resp. preceded) by $b$ in $\psi(v)$,
\[ \psi(vb) = \psi(v)\psi(v_1)^{-1}\psi(v). \tag{13} \]
By induction we have:
\[ \psi_X(z\mu(v)) = \varphi_z(\psi(v))\psi_X(z), \quad \psi_X(z\mu(v_1)) = \varphi_z(\psi(v_1))\psi_X(z). \]
Replacing in (12), and by (13), we obtain
\[ \psi_X(z\mu(v)y) = \varphi_z(\psi(v)) \varphi_z(\psi(v_1))^{-1} \varphi_z(\psi(v)) \psi_X(z) = \varphi_z(\psi(v)\psi(v_1)^{-1}\psi(v)) \psi_X(z) = \varphi_z(\psi(vb)) \psi_X(z), \]
which was our aim.

Now suppose that $|v|_b = 0$. As $\psi_X(z)$ is full, the longest palindromic prefix of $\psi_X(z)$ which is followed by $y$ is $\Delta_y = \psi_X(x_1 \cdots x_{r_y - 1})$, where $r_y$ is the greatest integer such that $1 \le r_y \le m$ and $x_{r_y} = y$. By Lemma 5.12 we obtain
\[ \psi_X(z\mu(v)y) = (\psi_X(z\mu(v))y)^{(+)} = \psi_X(z\mu(v)) \Delta_y^{-1} \psi_X(z\mu(v)). \tag{14} \]
By induction, this implies
\[ \psi_X(z\mu(v)y) = \varphi_z(\psi(v)) \psi_X(z) \Delta_y^{-1} \varphi_z(\psi(v)) \psi_X(z). \tag{15} \]
From (11) it follows that
\[ \varphi_z(b) = \psi_X(zy)(\psi_X(z))^{-1} = \psi_X(z)\Delta_y^{-1}. \]
Moreover, since $\psi(v)$ has no palindromic prefix (resp. suffix) followed (resp. preceded) by $b$, one has
\[ \psi(vb) = \psi(v) b \psi(v). \tag{16} \]
Thus from (15) we obtain
\[ \psi_X(z\mu(v)y) = \varphi_z(\psi(v)) \varphi_z(b) \varphi_z(\psi(v)) \psi_X(z) = \varphi_z(\psi(v) b \psi(v)) \psi_X(z) = \varphi_z(\psi(vb)) \psi_X(z), \]
which completes the proof.

Corollary 5.14.
Every $X$-AR word is a morphic image of a standard Arnoux-Rauzy word over an alphabet $B$ of the same cardinality as $X$.

Proof.
Let $s = \psi_X(x_1 x_2 \cdots x_n \cdots)$ be an $X$-AR word with $x_i \in X$, $i \ge 1$, and let $x_i = \mu(b_i)$ for all $i \ge 1$, where $\mu : B \to X$ is a bijection. By the preceding theorem, there exists an integer $m \ge 1$ such that, setting $z = x_1 \cdots x_m$, for all $w \in B^*$ we have $\psi_X(z\mu(w)) = \varphi_z(\psi(w))\psi_X(z)$. Hence for all $k \ge m$ we have
\[ \psi_X(x_1 \cdots x_k) = \varphi_z(\psi(b_{m+1} \cdots b_k)) \psi_X(z), \]
so that taking the limit of both sides as $k \to \infty$, we get
\[ s = \varphi_z(\psi(b_{m+1} b_{m+2} \cdots b_n \cdots)). \]
The assertion follows, as each letter of $B$ occurs infinitely often in the word $b_{m+1} b_{m+2} \cdots b_n \cdots$.

Example. Let $X = \{aa, ab, b\}$, $B = \{a, b, c\}$, and $\mu : B \to X$ be defined by $\mu(a) = ab$, $\mu(b) = b$, and $\mu(c) = aa$. Let $s$ be the $X$-AR word
\[ s = \psi_X((abbaa)^\omega) = ababaaababaababaaabababaaababaababaaaba \cdots. \]
Setting $z = abbaa$, it is easy to verify that the prefix $\psi_X(z) = ababaaababa$ of $s$ is full, so that $s = \varphi_z(\psi((abc)^\omega))$, where $\varphi_z(a) = ababaaababa$, $\varphi_z(b) = ababaaab$, and $\varphi_z(c) = ababaa$.

Let $s = \psi_X(x_1 \cdots x_n \cdots)$ be an $X$-AR word with $x_i \in X$, $i \ge 1$, and let $m$ be the minimal integer such that $u_m = \psi_X(x_1 \cdots x_m)$ is full. For all $j \ge 0$ we set $\alpha_j = u_{m+j}$ and $n_j = |\alpha_j|$.
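The example with $X = \{aa, ab, b\}$ and $z = abbaa$ can be reproduced mechanically. The following Python sketch recomputes $\psi_X(z)$, the three images $\varphi_z(a)$, $\varphi_z(b)$, $\varphi_z(c)$ via the first equality in (11), and the identity of Theorem 5.13 for a few short words $w$. The function names are illustrative choices of ours, and words of $X^*$ are passed as lists of code words to avoid parsing.

```python
def pal_closure(w: str) -> str:
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:      # longest palindromic suffix found
            return w + w[:i][::-1]


def psi_X(factors) -> str:
    """Generalized palindromization of a sequence of code words."""
    p = ""
    for x in factors:
        p = pal_closure(p + x)
    return p


mu = {"a": "ab", "b": "b", "c": "aa"}   # the bijection mu : B -> X of the example
z = ["ab", "b", "aa"]                   # z = abbaa factorized over X
W = psi_X(z)                            # psi_X(z)


def phi_z(letter: str) -> str:
    """phi_z(b) = psi_X(z mu(b)) psi_X(z)^(-1), cf. (11)."""
    full = psi_X(z + [mu[letter]])
    assert full.endswith(W)             # psi_X(z) must be a suffix here
    return full[: len(full) - len(W)]


images = {b: phi_z(b) for b in mu}
print(W, images)

# Theorem 5.13 on short words w over B: psi_X(z mu(w)) = phi_z(psi(w)) psi_X(z).
for w in ("a", "ab", "abc"):
    lhs = psi_X(z + [mu[c] for c in w])
    rhs = "".join(images[c] for c in psi_X(list(w))) + W
    print(w, lhs == rhs)
```

Here `psi_X(list(w))` is the ordinary map $\psi$, since for $X = B$ the code words are single letters.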
Theorem 5.15.
Let $s$ be an $X$-AR word. Then the factor complexity of $s$ is linearly upper bounded. More precisely, for all $n \ge n_0$
\[ p_s(n) \le 2\,\mathrm{card}(X)\, n - \mathrm{card}(X). \]

Proof.
We shall first prove that for all $j \ge 0$
\[ p_s(n_j) \le \mathrm{card}(X)\, n_j - \mathrm{card}(X). \tag{17} \]
Let $\mu$ be a bijection of an alphabet $B$ and $X$. We set $z_j = x_1 \cdots x_{m+j}$ and consider the morphism $\varphi_{z_j} : B^* \to A^*$ defined, in view of (11), for all $b \in B$ as:
\[ \varphi_{z_j}(b) = \alpha_j \Delta^{-1}_{\mu(b)}, \]
where $\alpha_j = \psi_X(z_j)$ and $\Delta_{\mu(b)}$ is the longest palindrome such that $\Delta_{\mu(b)}\mu(b)$ (resp. $(\mu(b))^{\sim}\Delta_{\mu(b)}$) is a prefix (resp. suffix) of $\alpha_j$.

Since $s$ is uniformly recurrent, there exists an integer $p$ such that all factors of $s$ of length $n_j$ are factors of $\alpha_{j+p}$. Hence, there exist $p$ letters $b_1, \ldots, b_p \in B$ such that $\alpha_{j+p} = \psi_X(z_j\mu(b_1) \cdots \mu(b_p))$. By Theorem 5.13 one has
\[ \psi_X(z_j\mu(b_1) \cdots \mu(b_p)) = \varphi_{z_j}(\psi(b_1 \cdots b_p))\,\alpha_j = \alpha_j\Delta^{-1}_{\mu(c_1)}\, \alpha_j\Delta^{-1}_{\mu(c_2)} \cdots \alpha_j\Delta^{-1}_{\mu(c_q)}\, \alpha_j, \]
where $c_1 c_2 \cdots c_q = \psi(b_1 \cdots b_p)$ and each letter $c_r$, $1 \le r \le q$, belongs to $\{b_1, \ldots, b_p\}$. Thus $\alpha_j$ covers $\alpha_{j+p}$ and the overlaps between two consecutive occurrences of $\alpha_j$ in $\alpha_{j+p}$ are given by $\Delta_{\mu(b_i)}$, $1 \le i \le p$. Any factor of $s$ of length $n_j$ will be a factor of two consecutive overlapping occurrences of $\alpha_j$, i.e., of
\[ \alpha_j \Delta^{-1}_{\mu(b_i)} \alpha_j, \quad i = 1, \ldots, p. \tag{18} \]
For any $1 \le i \le p$ the number of distinct factors in (18) is at most $n_j - |\Delta_{\mu(b_i)}| \le n_j - 1$. Since $\mu(B) = X$ and the number of distinct consecutive overlapping occurrences of $\alpha_j$ in $\alpha_{j+p}$ is at most $\mathrm{card}(X)$, equation (17) is readily derived.

Now let $n$ be any integer $n \ge n_0$ such that $n \ne n_k$ for all $k \ge 0$. There exists an integer $j$ such that $n_j < n < n_{j+1}$. Since $s$ is not periodic, by a classic result of Morse and Hedlund (see [12, Theorem 1.3.13]) the factor complexity $p_s$ is strictly increasing with $n$. Moreover, as $n_{j+1} < 2n_j < 2n$, one has by (17):
\[ p_s(n) < p_s(n_{j+1}) \le \mathrm{card}(X)\, n_{j+1} - \mathrm{card}(X) < 2\,\mathrm{card}(X)\, n - \mathrm{card}(X), \]
which concludes the proof.
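The upper bound of Theorem 5.15 can be compared with factor counts taken on a finite prefix. The sketch below does this for the $X$-AR word $s = \psi_X((abbaa)^\omega)$ with $X = \{aa, ab, b\}$, under the assumption that $n_0 = 11$, i.e., that $\psi_X(abbaa)$ is the shortest full prefix of the form $u_m$ (condition F1 alone forces $m \ge 3$). Counting factors in a finite prefix can only underestimate $p_s(n)$, so the output illustrates the bound and proves nothing; the helper names are ours.

```python
def pal_closure(w: str) -> str:
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:
            return w + w[:i][::-1]


def psi_X(factors) -> str:
    """Generalized palindromization of a sequence of code words."""
    p = ""
    for x in factors:
        p = pal_closure(p + x)
    return p


def factor_count(word: str, n: int) -> int:
    """Number of distinct factors of length n occurring in a finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


card_X = 3                                   # X = {aa, ab, b}
prefix = psi_X(["ab", "b", "aa"] * 3)        # psi_X((abbaa)^3), a prefix of s

for n in range(11, 21):
    observed = factor_count(prefix, n)       # a lower estimate of p_s(n)
    upper = 2 * card_X * n - card_X          # bound of Theorem 5.15
    print(n, observed, upper)
```

Taking more code words in `prefix` refines the lower estimate of $p_s(n)$ at the cost of a longer word.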
6. Conservative maps
Let $A$ be an alphabet of cardinality $d > 1$ and let $X$ be a code over $A$. We say that the palindromization map $\psi_X$ is conservative if
\[ \psi_X(X^*) \subseteq X^*. \tag{19} \]
When $X = A$, the palindromization map $\psi$ is trivially always conservative. In the general case $\psi_X$ may be non-conservative.

Example 6.1. Let $X = \{ab, ba\}$. One has $\psi_X(ab) = aba \notin X^*$, so that $\psi_X$ is not conservative. In the case $Y = \{aa, bb\}$ one easily verifies that $\psi_Y(Y^*) \subseteq Y^*$. If $Z = \{a, ab\}$ one has that for any word $w \in Z^*$, $\psi_Z(w) \in aA^* \setminus A^* bb A^*$, with $A = \{a, b\}$, so that it can be uniquely factorized by the elements of $Z$. This implies that $\psi_Z$ is conservative.

The following result shows that a prefix code having a conservative palindromization map allows a natural generalization of properties P2 and P3 of Proposition 2.1, in addition to the ones for P1 and P4 shown in Propositions 3.1 and 3.2.

Proposition 6.1.
Let $X$ be a prefix code such that $\psi_X$ is conservative, and $p, w \in X^*$ with $p$ a prefix of $\psi_X(w)$. The following hold:
1. $p^{(+)}$ is a prefix of $\psi_X(w)$ and $p^{(+)} \in X^*$.
2. If $p$ is a palindrome, then $p = \psi_X(u)$ for some prefix $u \in X^*$ of $w$.

Proof. Let $w = x_1 x_2 \cdots x_k$ with $x_i \in X$ for $1 \le i \le k$, and let $v$ be the longest prefix of $w$ in $X^*$ such that $\psi_X(v)$ is a prefix of $p$; we can write $v = x_1 \cdots x_n$, or set $n = 0$ if $v = \varepsilon$. Thus $p = \psi_X(v)\zeta$ with $\zeta \in A^*$. Since $\psi_X$ is conservative, one has $\psi_X(v) \in X^*$. Moreover, as $X$ is a prefix code, $X^*$ is right unitary, so that one has $\zeta \in X^*$. If $\zeta = \varepsilon$, then $p = \psi_X(v) = p^{(+)}$ and there is nothing to prove. Let us then suppose $\zeta \ne \varepsilon$. Since $\psi_X(v) x_{n+1}$, as well as $p$, is a prefix of $\psi_X(w)$ and $X$ is a prefix code, one has that $\zeta \in x_{n+1} X^*$. Thus $\psi_X(v) x_{n+1}$ is a prefix of $p$.

From the definition of palindromic closure it follows that $|(\psi_X(v) x_{n+1})^{(+)}| \le |p^{(+)}|$. By the maximality of $n$, we also obtain that $p$ is a (proper) prefix of $(\psi_X(v) x_{n+1})^{(+)} = \psi_X(x_1 \cdots x_{n+1})$, so that $|p^{(+)}| \le |(\psi_X(v) x_{n+1})^{(+)}|$. Thus $|p^{(+)}| = |(\psi_X(v) x_{n+1})^{(+)}|$. Since $p^{(+)}$ is a palindrome of minimal length having $\psi_X(v) x_{n+1}$ as a prefix, from the uniqueness of palindromic closure it follows that $p^{(+)} = \psi_X(v x_{n+1})$. Hence, $p^{(+)}$ is a prefix of $\psi_X(w)$, and $p^{(+)} \in X^*$ as $\psi_X$ is conservative.

If $p$ is a palindrome and $p \ne \psi_X(v)$, then the argument above shows that $p^{(+)} = p = \psi_X(v x_{n+1})$, which is absurd by the maximality of $n$.

The following proposition gives a sufficient condition which assures that $\psi_X$ is conservative.

Proposition 6.2.
Let $X \subseteq \mathrm{PAL}$ be an infix and weakly overlap-free code. Then $\psi_X$ is conservative.

Proof. We shall prove that for all $n \ge 0$, $\psi_X(X^n) \subseteq X^*$. The proof is by induction on the integer $n$. The base of the induction is true. Indeed, the case $n = 0$ is trivial and for $n = 1$, since $X \subseteq \mathrm{PAL}$, one has $\psi_X(X) = X$. Let us then suppose the result true up to $n$ and prove it for $n + 1$. Let $w \in X^n$ and $x \in X$. By induction we can write $\psi_X(w) = x'_1 \cdots x'_m$, with $x'_i \in X$, $1 \le i \le m$. Thus:
\[ \psi_X(wx) = (\psi_X(w) x)^{(+)} = (x'_1 \cdots x'_m x)^{(+)}. \tag{20} \]
Let $Q$ denote the longest palindromic suffix of $x'_1 \cdots x'_m x$. Since $x \in \mathrm{PAL}$ we have $|Q| \ge |x|$. We have to consider two cases:

Case 1. $|Q| = |x|$. From (20) and $X \subseteq \mathrm{PAL}$, it follows:
\[ \psi_X(wx) = (x'_1 \cdots x'_m x)^{(+)} = x'_1 \cdots x'_m x x'_m \cdots x'_1. \]
Thus $\psi_X(wx) \in X^*$ and in this case we are done.

Case 2. $|Q| > |x|$. One has:
\[ x'_1 \cdots x'_m x = \zeta Q. \]
Since $|Q| > |x|$ and $x, Q \in \mathrm{PAL}$, there exists $1 \le j \le m$ such that $x'_j = \lambda\mu$, $\lambda, \mu \in A^*$, and
\[ \mu x'_{j+1} \cdots x'_m x = Q = x\eta, \]
with $\eta \in A^*$. We shall prove that $\lambda = \varepsilon$. Indeed, suppose that $\lambda \ne \varepsilon$. We have to consider the following subcases:

1) $|x| \le |\mu|$. This implies that $x$ is a proper factor of $x'_j$, which is a contradiction, since $X$ is an infix code.
2) $|x| \ge |\mu x'_{j+1}|$. In this case one has that $x'_{j+1}$ is a factor of $x$, which is a contradiction.
3) $|\mu| < |x| < |\mu x'_{j+1}|$. This implies that $x = \mu p$, where $p$ is a proper prefix of $x'_{j+1}$. Since $\mu$ is a proper suffix of $x'_j$, we reach a contradiction with the hypothesis that $X$ is weakly overlap-free.

Hence, $\lambda = \varepsilon$ and $\mu = x = x'_j$. Therefore, one has, as $X \subseteq \mathrm{PAL}$,
\[ \psi_X(wx) = (x'_1 \cdots x'_m x)^{(+)} = x'_1 \cdots x'_{j-1} x x'_{j+1} \cdots x'_m x x'_{j-1} \cdots x'_1 \in X^*, \]
which concludes the proof.

Example. Let $X = \{bab, bcb\}$. One has that $X \subseteq \mathrm{PAL}$. Moreover, $X$ is an infix and weakly overlap-free code. From the preceding proposition one has that $\psi_X$ is conservative.

Let us observe that Proposition 6.2 can be proved by replacing the requirement $X \subseteq \mathrm{PAL}$ with the two conditions $X = X^{\sim}$ and $\psi_X(X) \subseteq X^*$. However, the following lemma shows that if the code $X$ is prefix, these two latter conditions are equivalent to $X \subseteq \mathrm{PAL}$.

Lemma 6.3.
Let $X$ be a prefix code. Then one has:
\[ X \subseteq \mathrm{PAL} \iff X = X^{\sim} \text{ and } \psi_X(X) \subseteq X^*. \]

Proof. If $X \subseteq \mathrm{PAL}$, then trivially $X = X^{\sim}$. Moreover, for any $x \in X$ one has $\psi_X(x) = x^{(+)} = x \in X^*$. Let us prove the converse. Suppose that $x \in X$ is not a palindrome. We can write $x = \lambda Q$, where $Q = \mathrm{LPS}(x)$ is the longest palindromic suffix of $x$ and $\lambda \ne \varepsilon$. One has, by hypothesis:
\[ \psi_X(x) = x^{(+)} = \lambda Q \lambda^{\sim} = x\lambda^{\sim} \in X^*. \]
Since $X$ is a prefix code, from the right unitarity of $X^*$ one has $\lambda^{\sim} \in X^*$. As $X = X^{\sim}$, it follows that $\lambda \in X^*$. Since $x = \lambda Q$ and $X$ is a prefix code, one derives $\lambda = x$ and $Q = \varepsilon$, which is absurd as $|Q| > 0$.

Proposition 6.4.
Let $X \subseteq \mathrm{PAL}$ be a prefix code. Then:
\[ \psi_X \text{ is conservative} \iff \text{for all } x \in X,\ \mathrm{LPS}(\psi_X(X^*) x) \subseteq X^*. \]

Proof. ($\Rightarrow$) Let $w \in X^*$. If $w = \varepsilon$, since $X \subseteq \mathrm{PAL}$, one has $\mathrm{LPS}(x) = x \in X$. Suppose $w \ne \varepsilon$, so that $w = x_1 \cdots x_n$, with $x_i \in X$, $i = 1, \ldots, n$. Let $x \in X$ and $Q$ be the longest palindromic suffix of $\psi_X(x_1 \cdots x_n) x$. We can write $\psi_X(x_1 \cdots x_n) x = \delta Q$ with $\delta \in A^*$, and
\[ \psi_X(x_1 \cdots x_n x) = (\psi_X(x_1 \cdots x_n) x)^{(+)} = \delta Q \delta^{\sim} = \psi_X(x_1 \cdots x_n) x \delta^{\sim}. \]
Since $\psi_X$ is conservative, one has $\psi_X(x_1 \cdots x_n), \psi_X(x_1 \cdots x_n x) \in X^*$, so that, as $X$ is a prefix code, from the preceding equation one derives $\delta^{\sim} \in X^*$ and then $\delta \in X^*$ because $X \subseteq \mathrm{PAL}$. Finally, from the equation $\psi_X(x_1 \cdots x_n) x = \delta Q$ it follows that $Q \in X^*$, as $X$ is a prefix code.

($\Leftarrow$) We shall prove that for all $n \ge 0$, $\psi_X(X^n) \subseteq X^*$. The result is trivial if $n = 0$. For $n = 1$ and any $x \in X$, $\psi_X(x) = x^{(+)} = x$ as $X \subseteq \mathrm{PAL}$, so that $\psi_X(X) \subseteq X^*$. Let us now by induction suppose that $\psi_X(X^n) \subseteq X^*$ and prove that $\psi_X(X^{n+1}) \subseteq X^*$. Let $x_1, \ldots, x_n, x \in X$ and let $Q$ denote the longest palindromic suffix of $\psi_X(x_1 \cdots x_n) x$, so that
\[ \psi_X(x_1 \cdots x_n) x = \delta Q, \]
with $\delta \in A^*$. The code $X$ is bifix because $X$ is a prefix code and $X \subseteq \mathrm{PAL}$. Since by hypothesis $Q, \psi_X(x_1 \cdots x_n) \in X^*$, from the preceding equation and the left unitarity of $X^*$, one gets $\delta \in X^*$. Moreover, $\delta^{\sim} \in X^*$ since $X \subseteq \mathrm{PAL}$. Hence, one has:
\[ \psi_X(x_1 \cdots x_n x) = (\psi_X(x_1 \cdots x_n) x)^{(+)} = \delta Q \delta^{\sim} \in X^*, \]
which concludes the proof.

Let $X$ be a code over the alphabet $B$ and $\varphi : A^* \to B^*$ an injective morphism such that $\varphi(A) = X$. We say that $\psi_X$ is morphic-conservative if for any $w \in A^*$ one has
\[ \varphi(\psi(w)) = \psi_X(\varphi(w)). \tag{21} \]

Example. Let $A = \{a, b\}$, $B = \{a, b, c\}$, $X = \{c, bab\}$, and $\varphi : A^* \to B^*$ be the injective morphism defined by $\varphi(a) = c$ and $\varphi(b) = bab$. Let $w = abaa$; one has $\psi(w) = abaabaaba$, $\varphi(w) = cbabcc$, and
\[ \varphi(\psi(w)) = cbabccbabccbabc = \psi_X(\varphi(w)). \]
As a consequence of Corollary 6.11, one can prove that $\psi_X$ is morphic-conservative.

Lemma 6.5. If $\psi_X$ is morphic-conservative, then it is conservative.

Proof. Let $u \in X^*$. The result is trivial if $u = \varepsilon$. If $u$ is not empty, let us write $u = x_1 \cdots x_n$, with $x_i \in X$, $i = 1, \ldots, n$. Since $\varphi$ is injective, let $a_i \in A$ be the unique letter such that $x_i = \varphi(a_i)$. Therefore, $u = \varphi(a_1 \cdots a_n)$. By (21) one has $\psi_X(u) = \varphi(\psi(a_1 \cdots a_n)) \in X^*$, which proves the assertion.

The converse of the preceding lemma is not true in general. Indeed, from the following proposition, one has that if $\psi_X$ is morphic-conservative, then the words of $X$ have to be palindromes. However, as we have seen in Example 6.1, there are maps $\psi_X$ which are conservative with a code $X$ whose words are not palindromes.

Proposition 6.6.
If $\psi_X$ is morphic-conservative, then $X \subseteq \mathrm{PAL}$ and $X$ has to be a bifix code.

Proof. Let $a$ be any letter of $A$ and set $x = \varphi(a)$. One has from (21) that $\varphi(\psi(a)) = \varphi(a) = x = \psi_X(\varphi(a)) = \psi_X(x) = x^{(+)}$. Hence, $x = x^{(+)} \in \mathrm{PAL}$, so that all the words of $X$ have to be palindromes.

Let us now prove that $X$ is a suffix code. Indeed, suppose by contradiction that there exist words $x, y \in X$ such that $y = \lambda x$ with $\lambda \in B^+$. Let $a, b \in A$ be letters such that $\varphi(a) = x$ and $\varphi(b) = y$. For $w = ba$ one has:
\[ \varphi(\psi(ba)) = \varphi(bab) = yxy, \]
and, recalling that $y \in \mathrm{PAL}$,
\[ \psi_X(\varphi(ba)) = \psi_X(yx) = (yx)^{(+)} = (\lambda xx)^{(+)}. \]
Since $xx \in \mathrm{PAL}$, the longest palindromic suffix $Q$ of $\lambda xx$ has a length $|Q| \ge 2|x|$. Thus
\[ |(\lambda xx)^{(+)}| \le |\lambda xx\lambda^{\sim}| = 2|y| < |yxy|, \]
which is absurd. Hence, $X$ has to be a suffix code, and then bifix as $X \subseteq \mathrm{PAL}$.

Remark 6.7.
As a consequence of Lemma 6.5 and Proposition 6.6, every code $X$ having a morphic-conservative $\psi_X$ satisfies the hypotheses of Propositions 3.1, 3.2, 4.1, and 6.1, so that all properties P1–P4 in Proposition 2.1 admit suitable generalizations for $\psi_X$. Let us highlight in particular the following:

Proposition 6.8. If $\psi_X$ is morphic-conservative, then it is injective.

Proof. From Proposition 6.6 the code $X$ has to be bifix, so that the result follows from Proposition 4.1.

Let us observe that in the preceding proposition one cannot replace morphic-conservative with conservative. Indeed, for instance, if $X = \{a, ab\}$ then $\psi_X$ is conservative (see Example 6.1) but it is not injective, since $\psi_X(aba) = \psi_X(abab)$.

The following theorem relates the two notions of conservative and morphic-conservative palindromization map.

Theorem 6.9.
The map $\psi_X$ is morphic-conservative if and only if $X \subseteq \mathrm{PAL}$, $X$ is prefix, and $\psi_X$ is conservative.

For the proof of the preceding theorem we need the following
Lemma 6.10.
Let ϕ : A ∗ → B ∗ be an injective morphism and ϕ ( A ) = X ⊆ PAL. For any w ∈ A ∗ , ϕ ( w ∼ ) = ( ϕ ( w )) ∼ . Thus for any w ∈ A ∗ , ϕ ( w ) = ( ϕ ( w )) ∼ if and only if w ∈ PAL.Proof.
The result is trivial if $w = \varepsilon$. Let us suppose $w \neq \varepsilon$ and write $w$ as $w = a_1 \cdots a_n$ with $a_i \in A$, $1 \leq i \leq n$. One has $\varphi(w) = \varphi(a_1) \cdots \varphi(a_n) = x_1 \cdots x_n$, having set $x_i = \varphi(a_i) \in X$, $1 \leq i \leq n$. Since $X \subseteq \mathrm{PAL}$, one derives
\[ (\varphi(w))^{\sim} = x_n \cdots x_1 = \varphi(w^{\sim}). \]
As $\varphi$ is injective one obtains: $\varphi(w) = (\varphi(w))^{\sim} = \varphi(w^{\sim})$ if and only if $w = w^{\sim}$, which concludes the proof.

Proof of Theorem 6.9. ($\Rightarrow$) Immediate from Proposition 6.6 and Lemma 6.5.

($\Leftarrow$) Let $\varphi : A^* \to B^*$ be an injective morphism such that $\varphi(A) = X$ is a prefix code and $X \subseteq \mathrm{PAL}$. We wish to prove that for any $w \in A^*$ one has:
\[ \varphi(\psi(w)) = \psi_X(\varphi(w)). \]
The proof is by induction on the length $n$ of $w$. The result is trivial if $n = 0$. If $n = 1$, i.e., $w = a \in A$, one has, as $\varphi(a) \in \mathrm{PAL}$,
\[ \varphi(\psi(a)) = \varphi(a) = \psi_X(\varphi(a)). \]
Let us then suppose the result true up to the length $n$ and prove it for $n + 1$. We can write, by using the induction hypothesis and the fact that $\varphi(w) \in X^*$,
\[ \psi_X(\varphi(wa)) = \psi_X(\varphi(w)\varphi(a)) = (\psi_X(\varphi(w))\varphi(a))^{(+)} = (\varphi(\psi(w))\varphi(a))^{(+)}. \]
Let $z = \psi(w)$; we need to show that $(\varphi(z)\varphi(a))^{(+)} = \varphi(\psi(wa))$. As $\psi_X$ is conservative, by Proposition 6.4 the longest palindromic suffix $Q$ of $\psi_X(\varphi(w))\varphi(a) = \varphi(z)\varphi(a)$ belongs to $X^*$. Since $\varphi(a)$ is a palindrome and $X$ is a suffix code, there exists a suffix $v$ of $z$ such that $Q = \varphi(v)\varphi(a)$. Using Lemma 6.10 one derives that $va$ is the longest palindromic suffix of $za$, so that, letting $z = uv$,
\[ (\varphi(z)\varphi(a))^{(+)} = \varphi(uvau^{\sim}) = \varphi((za)^{(+)}) = \varphi(\psi(wa)), \]
which concludes the proof.

Corollary 6.11.
Let $X$ be a weakly overlap-free and infix code such that $X \subseteq \mathrm{PAL}$. Then $\psi_X$ is morphic-conservative.

Proof. Trivial by Proposition 6.2 and Theorem 6.9.
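
Theorem 6.9 reduces morphic-conservativeness to three conditions that can be probed mechanically on small codes. The following Python sketch is only an illustration and not part of the original text: the helper names (`pal_closure`, `psi_X`, `in_X_star`, `looks_morphic_conservative`) and the bound `max_n` are assumptions introduced here. Conservativeness is tested only on $X^n$ for $n \leq$ `max_n`, so a positive answer is evidence rather than a proof, whereas a negative answer is conclusive.

```python
from itertools import product


def is_pal(w):
    """True if w is a palindrome."""
    return w == w[::-1]


def pal_closure(w):
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        if is_pal(w[i:]):              # w[i:] is the longest palindromic suffix
            return w + w[:i][::-1]


def psi_X(code_words):
    """psi_X(x1 ... xn), given the sequence of code words x1, ..., xn."""
    out = ""
    for x in code_words:
        out = pal_closure(out + x)
    return out


def is_prefix_code(X):
    """No word of X is a proper prefix of another word of X."""
    return not any(x != y and y.startswith(x) for x in X for y in X)


def in_X_star(w, X):
    """Membership of w in X*, by dynamic programming over the prefixes of w."""
    ok = [True] + [False] * len(w)
    for i in range(1, len(w) + 1):
        ok[i] = any(
            len(x) <= i and ok[i - len(x)] and w[i - len(x):i] == x for x in X
        )
    return ok[len(w)]


def looks_morphic_conservative(X, max_n=5):
    """Bounded test of the three conditions of Theorem 6.9 (assumed bound max_n)."""
    X = sorted(X)
    if not all(is_pal(x) for x in X) or not is_prefix_code(X):
        return False
    return all(
        in_X_star(psi_X(ws), X)
        for n in range(1, max_n + 1)
        for ws in product(X, repeat=n)
    )
```

For instance, `looks_morphic_conservative({"c", "bab"})` holds for the code of the example following (21), whereas the palindromic prefix code $\{a, bab\}$ (a hypothetical test case chosen here) fails already for $n = 2$, because $\psi_X(bab \cdot a) = babab \notin X^*$.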
Remark 6.12. The hypotheses in the previous corollary that $X$ is a weakly overlap-free and infix code are not necessary in order that $\psi_X$ is morphic-conservative. For instance, let $X$ be the prefix code $X = \{aa, cbaabc\}$. One has that $bcX^* \cap \mathrm{PAL} = \emptyset$. From this one easily verifies that for all $n \geq 0$, if $\psi_X(X^n) \subseteq X^*$, then for $x \in X$, $\mathrm{LPS}(\psi_X(X^n)x) \subseteq X^*$. Thus by using the same argument as in the sufficiency of Proposition 6.4, one has that $\psi_X(X^{n+1}) \subseteq X^*$. It follows that $\psi_X$ is conservative and then morphic-conservative by Theorem 6.9.

Let $\psi_X$ be a morphic-conservative palindromization map and $\varphi : A^* \to B^*$ the injective morphism such that $X = \varphi(A)$ and $\varphi \circ \psi = \psi_X \circ \varphi$. Since $X$ has to be bifix, $\varphi$ can be extended to a bijection $\varphi : A^{\omega} \to X^{\omega}$. The extension of $\psi_X$ to $X^{\omega}$ is such that for any $x \in X^{\omega}$
\[ \psi_X(x) = \varphi(\psi(\varphi^{-1}(x))). \]
For any $x \in X^{\omega}$ the word $\psi(\varphi^{-1}(x))$ is an epistandard word over $A$, so that
\[ \psi_X(X^{\omega}) = \varphi(\mathrm{Epistand}_A). \]
Therefore, one has:
Proposition 6.13. The infinite words generated by morphic-conservative generalized palindromization maps are images by injective morphisms of the epistandard words.
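
Proposition 6.13 can be made concrete on the code $X = \{c, bab\}$ of the example following (21). The sketch below, in Python, generates the palindromic prefixes $\varphi(\psi(a_1 \cdots a_k))$ of the infinite word from a directive word over $A$ and compares them with a direct computation of $\psi_X(\varphi(a_1) \cdots \varphi(a_k))$. The periodic directive word $(abaa)^{\omega}$ and all function names are choices made for this illustration only.

```python
from itertools import cycle, islice

# The morphism phi of the example: phi(a) = c, phi(b) = bab.
PHI = {"a": "c", "b": "bab"}


def pal_closure(w):
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:       # longest palindromic suffix found
            return w + w[:i][::-1]


def phi(w):
    """Apply the morphism phi letter by letter."""
    return "".join(PHI[c] for c in w)


def prefixes_via_phi(letters, n):
    """Yield phi(psi(a1 ... ak)) for k = 1, ..., n."""
    z = ""
    for a in islice(letters, n):
        z = pal_closure(z + a)         # z = psi(a1 ... ak)
        yield phi(z)


def prefixes_direct(letters, n):
    """Yield psi_X(phi(a1) ... phi(ak)) for k = 1, ..., n."""
    out = ""
    for a in islice(letters, n):
        out = pal_closure(out + PHI[a])
        yield out


via_phi = list(prefixes_via_phi(cycle("abaa"), 8))
direct = list(prefixes_direct(cycle("abaa"), 8))
print(via_phi == direct)
```

Both generators produce the same increasing chain of palindromes here, each being a prefix of the next, so either one can be used to enumerate prefixes of the infinite word $\psi_X(x)$ for this particular code.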
Let us now consider the case when $X$ is a finite and maximal prefix code.

Lemma 6.14. If $X$ is a finite and maximal prefix code over $A$ such that $X \subseteq \mathrm{PAL}$, then $X = A$.

Proof. Let $\ell_X$ be the maximal length of the words of $X$. Since $X$ is represented by a full $d$-ary tree, there exist $d$ distinct words $pa \in X$, with $p$ a fixed word of $A^*$, $a \in A$, and $|pa| = \ell_X$. As for any $a \in A$ the word $pa \in \mathrm{PAL}$, the only possibility is $p = \varepsilon$, so that $X = A$.

Proposition 6.15. Let $X$ be a finite and maximal prefix code over $A$. Then $\psi_X$ is morphic-conservative if and only if $X = A$.

Proof. The proof is an immediate consequence of Theorem 6.9 and Lemma 6.14.

In the case of a finite maximal prefix code the map $\psi_X$ can be non-conservative. For instance, if $X = \{a, ba, bb\}$, then $\psi_X(ba) = bab \notin X^*$. The situation can be quite different if one refers to infinite words over $X$. Let us give the following definition.

Let $X$ be a code having a finite deciphering delay. We say that $\psi_X$ is weakly conservative if for any $t \in X^{\omega}$, one has $\psi_X(t) \in X^{\omega}$; in other terms the map $\psi_X : X^{\omega} \to A^{\omega}$ can be reduced to a map $\psi_X : X^{\omega} \to X^{\omega}$. In general, $\psi_X$ is not weakly conservative. For instance, if $X = \{ab, ba\}$ and $t \in ababX^{\omega}$, then $\psi_X(t) \notin X^{\omega}$.

Trivially, if $\psi_X$ is conservative, then it is also weakly conservative. However, the converse is not in general true, as shown by the following:

Theorem 6.16.
If $X$ is a finite and maximal prefix code, then $\psi_X$ is weakly conservative.

Proof. Let $s = \psi_X(t)$ where $X$ is a finite and maximal prefix code and $t \in X^{\omega}$. We recall [9] that any maximal prefix code is right complete, i.e., for any $f \in A^*$, one has $fA^* \cap X^* \neq \emptyset$. If $X$ is finite, then for any $f \in A^*$ and any letter $a \in A$ one has:
\[ f a^k \in X^*, \]
for a suitable integer $k$, depending on $f$ and on $a$, such that $0 \leq k \leq \ell$, where $\ell = \ell_X$ is the maximal length of the words of $X$. Let $a$ be a fixed letter of $A$. We can write:
\[ s_{[n]} a^{k_n} \in X^*, \]
with $0 \leq k_n \leq \ell$. Setting $p = \lfloor n/\ell \rfloor$, we can write:
\[ s_{[n]} = x_1 x_2 \cdots x_{q_n} \lambda, \]
with $x_i \in X$, $i = 1, \ldots, q_n$, $q_n \geq p$ and $|\lambda| < \ell$. Now $s_{[n]} \prec s_{[n+\ell]}$, so that since $X$ is a prefix code, one has:
\[ s_{[n+\ell]} = x_1 x_2 \cdots x_{q_{n+\ell}} \lambda', \]
with $q_{n+\ell} > q_n$, $x_i \in X$, $i = q_n + 1, \ldots, q_{n+\ell}$, and $|\lambda'| < \ell$. Since $\lim_{n \to \infty} x_1 \cdots x_{q_n} \in X^{\omega}$ and $\lim_{n \to \infty} x_1 \cdots x_{q_n} = \lim_{n \to \infty} s_{[n]}$, the result follows.

Corollary 6.17. Let $s = \psi_X(t)$ with $t \in X^{\omega}$ be an $X$-AR word. Then $s$ is the morphic image by an injective morphism of a word $w \in B^{\omega}$, where $B$ is an alphabet of the same cardinality as $X$.
Proof. By the preceding theorem, since $\psi_X$ is weakly conservative, we can write:
\[ s = x_1 x_2 \cdots x_n \cdots, \]
with $x_i \in X$, $i \geq 1$. Let $B$ be an alphabet having the same cardinality as $X$ and $\varphi : B^* \to X^*$ be the injective morphism induced by an arbitrary bijection of $B$ and $X$. If $\varphi^{-1}$ is the inverse morphism of $\varphi$ one has:
\[ \varphi^{-1}(s) = \varphi^{-1}(x_1) \varphi^{-1}(x_2) \cdots \varphi^{-1}(x_n) \cdots. \]
Setting $\varphi^{-1}(x_i) = w_i \in B$ for all $i \geq 1$, one has $\varphi^{-1}(s) = w_1 w_2 \cdots w_n \cdots = w \in B^{\omega}$ and $s = \varphi(w)$.

Let us observe that in general the word $w \in B^{\omega}$ is not episturmian, as shown by the following:

Example. Let $X = \{a, ba, bb\}$ and $s = \psi_X((ababb)^{\omega})$. One has:
\[ s = ababbabaababbabababbabaababbaba \cdots. \]
Let $B = \{0, 1, 2\}$ and $\varphi$ the morphism of $B^*$ in $X^*$ defined by the bijection $\varphi(0) = a$, $\varphi(1) = ba$, and $\varphi(2) = bb$. One has:
\[ w = \varphi^{-1}(s) = 0120101201120101201 \cdots, \]
and the word $w$ is not episturmian (indeed, for instance, the factor $01201$ is not rich in palindromes).
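
The preceding example can be replayed numerically. The Python sketch below uses hypothetical helper names (`psi_X_prefix`, `greedy_parse`) introduced only for this illustration: it computes the finite palindrome $\psi_X(x_1 \cdots x_n)$ for the sequence of code words $a, ba, bb, a, ba, \ldots$ obtained by factorizing $(ababb)^{\omega}$ over $X = \{a, ba, bb\}$, and then parses it from the left into words of $X$. Because $X$ is a prefix code, at most one code word can match at each position, so a greedy scan suffices.

```python
from itertools import cycle, islice

X = ["a", "ba", "bb"]                       # finite maximal prefix code
PHI_INV = {"a": "0", "ba": "1", "bb": "2"}  # inverse of the bijection B -> X


def pal_closure(w):
    """Right palindromic closure w^(+): shortest palindrome with prefix w."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:            # longest palindromic suffix found
            return w + w[:i][::-1]


def psi_X_prefix(code_words, n):
    """psi_X(x1 ... xn) for the first n words of an (infinite) iterable."""
    out = ""
    for x in islice(code_words, n):
        out = pal_closure(out + x)
    return out


def greedy_parse(s, code):
    """Split s as x1 ... xq + tail with xi in code and tail not starting
    with any code word; valid as a unique parse when code is a prefix code."""
    words, i = [], 0
    while True:
        for x in code:
            if s.startswith(x, i):
                words.append(x)
                i += len(x)
                break
        else:
            return words, s[i:]


s5 = psi_X_prefix(cycle(["a", "ba", "bb"]), 5)
words, tail = greedy_parse(s5, X)
w_prefix = "".join(PHI_INV[x] for x in words)
print(s5, w_prefix, repr(tail))
```

A finite palindrome $\psi_X(x_1 \cdots x_n)$ need not lie in $X^*$ (for example, `greedy_parse("bab", X)` leaves the tail `b`), which is why the unparsed tail is returned separately; weak conservativeness only concerns the limit word.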
7. The pseudo-palindromization map

An involutory antimorphism of $A^*$ is any antimorphism $\vartheta : A^* \to A^*$ such that $\vartheta \circ \vartheta = \mathrm{id}$. The simplest example is the reversal operator $R : A^* \to A^*$ mapping each $w \in A^*$ to its reversal $w^{\sim}$. Any involutory antimorphism $\vartheta$ satisfies $\vartheta = \tau \circ R = R \circ \tau$ for some morphism $\tau : A^* \to A^*$ extending an involution of $A$. Conversely, if $\tau$ is such a morphism, then $\vartheta = \tau \circ R = R \circ \tau$ is an involutory antimorphism of $A^*$.

Let $\vartheta$ be an involutory antimorphism of $A^*$. For any $w \in A^*$ we shall denote $\vartheta(w)$ simply by $\bar{w}$. We call $\vartheta$-palindrome any fixed point of $\vartheta$, i.e., any word $w$ such that $w = \bar{w}$, and let $\mathrm{PAL}_{\vartheta}$ denote the set of all $\vartheta$-palindromes. We observe that $\varepsilon \in \mathrm{PAL}_{\vartheta}$ by definition, and that $R$-palindromes are exactly the usual palindromes. If one makes no reference to the antimorphism $\vartheta$, a $\vartheta$-palindrome is called a pseudo-palindrome.

For any $w \in A^*$, $w^{\oplus_{\vartheta}}$, or simply $w^{\oplus}$, denotes the shortest $\vartheta$-palindrome having $w$ as a prefix. If $Q$ is the longest $\vartheta$-palindromic suffix of $w$ and $w = sQ$, then $w^{\oplus} = sQ\bar{s}$.

Example. Let $A = \{a, b, c\}$ and $\vartheta$ be defined as $\bar{a} = b$, $\bar{c} = c$. If $w = abacabc$, then $Q = cabc$ and $w^{\oplus} = abacabcbab$.

We can define the $\vartheta$-palindromization map $\psi_{\vartheta} : A^* \to \mathrm{PAL}_{\vartheta}$ by $\psi_{\vartheta}(\varepsilon) = \varepsilon$ and
\[ \psi_{\vartheta}(ua) = (\psi_{\vartheta}(u)a)^{\oplus} \]
for $u \in A^*$ and $a \in A$.

The following proposition extends to the $\vartheta$-palindromization map $\psi_{\vartheta}$ the properties of the palindromization map $\psi$ stated in Proposition 2.1 (cf., for instance, [5]):

Proposition 7.1.
The map $\psi_{\vartheta}$ over $A^*$ satisfies the following properties: for $u, v \in A^*$

P1. If $u$ is a prefix of $v$, then $\psi_{\vartheta}(u)$ is a $\vartheta$-palindromic prefix (and suffix) of $\psi_{\vartheta}(v)$.
P2. If $p$ is a prefix of $\psi_{\vartheta}(v)$, then $p^{\oplus}$ is a prefix of $\psi_{\vartheta}(v)$.
P3. Every $\vartheta$-palindromic prefix of $\psi_{\vartheta}(v)$ is of the form $\psi_{\vartheta}(u)$ for some prefix $u$ of $v$.
P4. The map $\psi_{\vartheta}$ is injective.

The map $\psi_{\vartheta}$ can be extended to infinite words as follows: let $x = x_1 x_2 \cdots x_n \cdots \in A^{\omega}$ with $x_i \in A$ for $i \geq 1$. Since for all $n$, $\psi_{\vartheta}(x_{[n]})$ is a prefix of $\psi_{\vartheta}(x_{[n+1]})$, we can define the infinite word $\psi_{\vartheta}(x)$ as:
\[ \psi_{\vartheta}(x) = \lim_{n \to \infty} \psi_{\vartheta}(x_{[n]}). \]
The infinite word $x$ is called the directive word of $\psi_{\vartheta}(x)$, and $s = \psi_{\vartheta}(x)$ the $\vartheta$-standard word directed by $x$. If one does not make reference to the antimorphism $\vartheta$, a $\vartheta$-standard word is also called a pseudostandard word.

The class of pseudostandard words was introduced in [5]. Some interesting results about such words are also in [16, 17]. In particular, we mention the noteworthy result that any pseudostandard word can be obtained, by a suitable morphism, from a standard episturmian word.

More precisely, let $\mu_{\vartheta}$ be the endomorphism of $A^*$ defined for any letter $a \in A$ as $\mu_{\vartheta}(a) = a^{\oplus}$, so that $\mu_{\vartheta}(a) = a$ if $a = \bar{a}$ and $\mu_{\vartheta}(a) = a\bar{a}$ if $a \neq \bar{a}$. We observe that $\mu_{\vartheta}$ is injective since $\mu_{\vartheta}(A)$ is a prefix code. The following theorem, proved in [5], relates the maps $\psi_{\vartheta}$ and $\psi$ through the morphism $\mu_{\vartheta}$.

Theorem 7.2.
For any $w \in A^{\infty}$, one has $\psi_{\vartheta}(w) = \mu_{\vartheta}(\psi(w))$.

An important consequence is that any $\vartheta$-standard word is a morphic image of an epistandard word.

A generalization of the pseudo-palindromization map, similar to that given in Section 3 for the palindromization map, is the following. Let $\vartheta$ be an involutory antimorphism of $A^*$ and $X$ a code over $A$. We define a map
\[ \psi_{\vartheta,X} : X^* \to \mathrm{PAL}_{\vartheta} \]
inductively as: $\psi_{\vartheta,X}(\varepsilon) = \varepsilon$ and for any $w \in X^*$ and $x \in X$,
\[ \psi_{\vartheta,X}(wx) = (\psi_{\vartheta,X}(w)x)^{\oplus}. \]
If $\vartheta = R$, then $\psi_{R,X} = \psi_X$. If $X = A$, then $\psi_{\vartheta,A} = \psi_{\vartheta}$. The map $\psi_{\vartheta,X}$ will be called the $\vartheta$-palindromization map relative to the code $X$.

Example. Let $A = \{a, b, c\}$ and $\vartheta$ be defined as $\bar{a} = b$ and $c = \bar{c}$. Let $X$ be the code $X = \{ab, ba, c\}$ and $w = abcba$. One has: $\psi_{\vartheta,X}(ab) = ab$, $\psi_{\vartheta,X}(abc) = abcab$ and $\psi_{\vartheta,X}(abcba) = abcabbaabcab$.

Let us now consider a code $X$ having a finite deciphering delay. One can extend $\psi_{\vartheta,X}$ to $X^{\omega}$ as follows: let $x = x_1 x_2 \cdots x_n \cdots$, with $x_i \in X$, $i \geq 1$. For any $n \geq 1$, $\psi_{\vartheta,X}(x_1 \cdots x_n)$ is a proper prefix of $\psi_{\vartheta,X}(x_1 \cdots x_n x_{n+1})$, so that there exists
\[ \lim_{n \to \infty} \psi_{\vartheta,X}(x_1 \cdots x_n) = \psi_{\vartheta,X}(x). \]
Let us observe that the word $\psi_{\vartheta,X}(x)$ has infinitely many $\vartheta$-palindromic prefixes. This implies that $\psi_{\vartheta,X}(x)$ is closed under $\vartheta$, i.e., if $w \in \mathrm{Fact}\,\psi_{\vartheta,X}(x)$, then also $\bar{w} \in \mathrm{Fact}\,\psi_{\vartheta,X}(x)$.

We remark that the maps $\psi_{\vartheta,X}$ and their extensions to $X^{\omega}$, when $X$ is a code with finite deciphering delay, are not in general injective. The following proposition, extending Propositions 3.2 and 4.1, can be proved in a similar way.

Proposition 7.3.
Let $X$ be a prefix code over $A$. Then the map $\psi_{\vartheta,X} : X^* \to \mathrm{PAL}_{\vartheta}$ and its extension to $X^{\omega}$ are injective.

Several concepts, such as conservative and morphic-conservative maps, and results considered in the previous sections for the map $\psi_X$ can be naturally extended to the case of the map $\psi_{\vartheta,X}$. We limit ourselves to proving the following interesting theorem relating the maps $\psi_{\vartheta}$ and $\psi_{\vartheta,X}$ where $X = \mu_{\vartheta}(A)$. Combining this result with Theorem 7.2 one will obtain that $\psi_{\vartheta,X}$ is morphic-conservative.

Theorem 7.4. Let $A$ be an alphabet, $\vartheta$ an involutory antimorphism, and $X = \mu_{\vartheta}(A)$. Then for any $w \in A^{\infty}$ one has:
\[ \psi_{\vartheta}(w) = \psi_{\vartheta,X}(\mu_{\vartheta}(w)). \]

Proof.
It is sufficient to prove that the above formula is satisfied for any $w \in A^*$. The proof is by induction on the length of $w$.

Let us first prove the base of the induction. The result is trivially true if $w = \varepsilon$. Let $w = a \in A$. If $a = \bar{a}$, then $a \in X$ and $\psi_{\vartheta}(a) = a = \psi_{\vartheta,X}(\mu_{\vartheta}(a)) = \psi_{\vartheta,X}(a)$. If $a \neq \bar{a}$, one has $\mu_{\vartheta}(a) = a\bar{a} \in X$ and $\psi_{\vartheta}(a) = a\bar{a} = \psi_{\vartheta,X}(\mu_{\vartheta}(a)) = \psi_{\vartheta,X}(a\bar{a})$.

Let us now prove the induction step. For $w \in A^*$ and $a \in A$ we can write, by using the induction hypothesis,
\[ \psi_{\vartheta}(wa) = (\psi_{\vartheta}(w)a)^{\oplus} = (\psi_{\vartheta,X}(\mu_{\vartheta}(w))a)^{\oplus}. \tag{22} \]
Moreover, one has:
\[ \psi_{\vartheta,X}(\mu_{\vartheta}(wa)) = \psi_{\vartheta,X}(\mu_{\vartheta}(w)a^{\oplus}) = (\psi_{\vartheta,X}(\mu_{\vartheta}(w))a^{\oplus})^{\oplus} = (\psi_{\vartheta}(w)a^{\oplus})^{\oplus}. \tag{23} \]
We have to consider two cases. If $a = \bar{a}$, then $a^{\oplus} = a$, so that from the preceding formulas (22) and (23) we obtain the result.

Let us then consider the case $a \neq \bar{a}$. We shall prove that $(\psi_{\vartheta}(w)a)^{\oplus} = \psi_{\vartheta}(wa)$ has the prefix $p = \psi_{\vartheta}(w)a\bar{a}$, so that from property P2 of Proposition 7.1 one will have $p^{\oplus} \preceq \psi_{\vartheta}(wa)$. Since $\psi_{\vartheta}(w)a \preceq p$, one will derive that $|\psi_{\vartheta}(wa)| = |(\psi_{\vartheta}(w)a)^{\oplus}| \leq |p^{\oplus}|$, so that $p^{\oplus} = (\psi_{\vartheta}(w)a)^{\oplus}$, from which the result will follow. We have to consider two cases:

Case 1. $\psi_{\vartheta}(w)$ has no $\vartheta$-palindromic suffix preceded by the letter $\bar{a}$. Thus
\[ (\psi_{\vartheta}(w)a)^{\oplus} = \psi_{\vartheta}(w)a\bar{a}\psi_{\vartheta}(w), \]
so that in this case we are done.

Case 2. $\psi_{\vartheta}(w)$ has a $\vartheta$-palindromic suffix $u$ of maximal length preceded by the letter $\bar{a}$. Since $u$ is also a $\vartheta$-palindromic prefix of $\psi_{\vartheta}(w)$, by property P3 of Proposition 7.1 there exists a prefix $v$ of $w$ such that $u = \psi_{\vartheta}(v)$. Since $\bar{a}u$ is a suffix of $\psi_{\vartheta}(w)$, one has that $ua = \psi_{\vartheta}(v)a$ is a prefix of $\psi_{\vartheta}(w)$. By property P2 of Proposition 7.1, $(\psi_{\vartheta}(v)a)^{\oplus}$ is a prefix of $\psi_{\vartheta}(w)$.

Since $|v| < |w|$ one has $|va| \leq |w|$. By using the inductive hypothesis twice one has:
\[ (\psi_{\vartheta}(v)a)^{\oplus} = \psi_{\vartheta}(va) = \psi_{\vartheta,X}(\mu_{\vartheta}(v)a\bar{a}) = (\psi_{\vartheta,X}(\mu_{\vartheta}(v))a\bar{a})^{\oplus} = (\psi_{\vartheta}(v)a\bar{a})^{\oplus}. \]
Hence, $\psi_{\vartheta}(w)$ has the prefix $ua\bar{a}$ and the suffix $a\bar{a}u$, so that $\psi_{\vartheta}(w) = \lambda a\bar{a}u$ with $\lambda \in A^*$ and
\[ (\psi_{\vartheta}(w)a)^{\oplus} = \lambda a\bar{a}ua\bar{a}\bar{\lambda} = \psi_{\vartheta}(w)a\bar{a}\bar{\lambda}, \]
from which the result follows.

From Theorems 7.2 and 7.4 one derives the noteworthy:

Corollary 7.5.
Let $A$ be an alphabet, $\vartheta$ an involutory antimorphism, and $X = \mu_{\vartheta}(A)$. Then one has:
\[ \psi_{\vartheta} = \mu_{\vartheta} \circ \psi = \psi_{\vartheta,X} \circ \mu_{\vartheta}. \]

Example. Let $A = \{a, b\}$, $\vartheta$ be defined as $\bar{a} = b$, and $X = \mu_{\vartheta}(A) = \{ab, ba\}$. Let $w = aab$. One has $\psi(aab) = aabaa$ and $\psi_{\vartheta}(aab) = ababbaabab = \mu_{\vartheta}(aabaa)$. Moreover, $\mu_{\vartheta}(aab) = ababba$ and $\psi_{\vartheta,X}(ababba) = ababbaabab$.

References

[1] A. de Luca, Sturmian words: structure, combinatorics, and their arithmetics, Theor. Comput. Sci. 183 (1997) 45–82.
[2] X. Droubay, J. Justin, G. Pirillo, Episturmian words and some constructions of de Luca and Rauzy, Theor. Comput. Sci. 255 (2001) 539–553.
[3] P. Arnoux, G. Rauzy, Représentation géométrique de suites de complexité 2n + 1, Bull. Soc. Math. France 119 (1991) 199–215.
[4] G. Rauzy, Mots infinis en arithmétique, in: Automata on infinite words (Le Mont-Dore, 1984), volume 192 of Lecture Notes in Comput. Sci., Springer, Berlin, 1985, pp. 165–171.
[5] A. de Luca, A. De Luca, Pseudopalindrome closure operators in free monoids, Theor. Comput. Sci. 362 (2006) 282–300.
[6] C. Kassel, C. Reutenauer, A palindromization map for the free group, Theor. Comput. Sci. 409 (2008) 461–470.
[7] A. de Luca, A palindromization map in free monoids, Proc. Steklov Institute of Mathematics 274 (2011) 124–135.
[8] A. de Luca, On the combinatorics of finite words, Theor. Comput. Sci. 218 (1999) 13–39.
[9] J. Berstel, D. Perrin, Theory of Codes, Academic Press, 1985.
[10] M. Bucci, A. de Luca, A. De Luca, Characteristic morphisms of generalized episturmian words, Theor. Comput. Sci. 410 (2009) 2840–2859.
[11] J. Justin, G. Pirillo, Episturmian words and episturmian morphisms, Theor. Comput. Sci. 276 (2002) 281–313.
[12] M. Lothaire, Algebraic Combinatorics on Words, Cambridge University Press, 2002.
[13] M. Lothaire, Combinatorics on Words, Addison-Wesley, Reading MA, 1983. Reprinted by Cambridge University Press, Cambridge UK, 1997.
[14] A. de Luca, S. Varricchio, Finiteness and regularity in semigroups and formal languages, Monographs in Theoretical Computer Science. An EATCS Series, Springer-Verlag, Berlin, 1999.
[15] J. Justin, Episturmian morphisms and a Galois theorem on continued fractions, Theor. Inform. Appl. 39 (2005) 207–215.
[16] M. Bucci, A. de Luca, A. De Luca, L. Q. Zamboni, On some problems related to palindrome closure, Theor. Inform. Appl. 42 (2008) 679–700.
[17] M. Bucci, A. de Luca, A. De Luca, L. Q. Zamboni, On di ffff