Binary intersection formalized
Štěpán Holub and Štěpán Starosta

Dept. of Algebra, Faculty of Mathematics and Physics, Charles University, Czech Republic

Dept. of Applied Math., Faculty of Information Technology, Czech Technical University in Prague, Czech Republic
Abstract.
We provide a reformulation and a formalization of the classical result by Juhani Karhumäki characterizing intersections of two languages of the form $\{x, y\}^* \cap \{u, v\}^*$. We use the terminology of morphisms, which allows us to formulate the result in a shorter and more transparent way, and we formalize the result in the proof assistant Isabelle/HOL.

Keywords: binary code; intersection; Isabelle/HOL
Introduction

One of the classical results that deserves to be better known is the description Juhani Karhumäki gave in [3] for the intersection of two free monoids of rank two, that is, for languages of the form $\{x, y\}^* \cap \{u, v\}^*$ where $x$ and $y$, as well as $u$ and $v$, do not commute. The purpose of this article is twofold. First, we reformulate the result in terms of morphisms, which allows an exposition that is much shorter, and hopefully also more transparent. This layer of the article is a slightly modified version of [2]. Second, we complement the improved "human" proof with a formalization in the proof assistant Isabelle/HOL.

It is well known that an intersection of two free submonoids of a free monoid is free. On the other hand, the intersection $\{x, y\}^* \cap \{u, v\}^*$ can have infinite rank. Theorem 2 in [3] gives two possible forms: $\{\beta, \gamma\}^*$ and $(\beta_0 + \beta(\gamma(1 + \delta + \cdots + \delta^t))^*\epsilon)^*$. The original proof spans about fifteen pages (without Preliminaries). The proof often crucially relies on "the way" certain words are "built up" from words $x$ and $y$, and/or $u$ and $v$. This is exactly the kind of argument that is much easier to make if $x$ and $y$ ($u$ and $v$) are seen as images of a binary morphism, which is demonstrated in the present article. An important feature of our reformulation is that it allowed us to identify the difficult core of the proof, namely Lemma 8. Given this lemma, the rest of the proof is fairly straightforward. We refer to [2] for a more detailed comparison of the two approaches.

Our second contribution is a formalization of the result in the proof assistant Isabelle/HOL. To our knowledge, this is the first formalization of a comparable result in Combinatorics on Words. We believe that computer assisted proofs are highly desirable in our field, which typically features a high level of technicality. The verified formalization not only makes sure that the result is correct, but also allows us to outsource tedious and uninspiring work where it belongs, namely to computers. We try to provide a reader without any experience with this kind of research with a rough idea of what it entails. It may perhaps serve as a very modest introduction to some basic features of formalization using Isabelle/HOL. The full working formalization is published in the repository [5].

Preliminaries

Words are lists of letters from a given alphabet. They form a (free) monoid with the operation of concatenation and the neutral element, the empty word, that is denoted $\varepsilon$. If the alphabet is $\Sigma$ then the monoid of lists is typically denoted by $\Sigma^*$ using the Kleene star. There is an ambivalence in this notation. If $Q$ is a subset of a monoid $M$, then $Q^*$ denotes the submonoid generated by $Q$ in $M$, that is, more algebraically, the submonoid $\langle Q \rangle$. However, elements of the alphabet are not words! This is typically ignored, or at best glossed over by identification of letters with words of length one. However, in the context of the formalization, we have to keep in mind the difference. In our convention, the expression $\Sigma^*$ is equivalent to $\langle \Sigma \rangle$, which means that $\Sigma$ is not the set of letters but the set of singleton words, that is, words of length one. In the particular case of the binary alphabet, we shall use the generating set $A = \{\mathbf{0}, \mathbf{1}\}$ where $\mathbf{0}$ is the word [0] and $\mathbf{1}$ the word [1].

The fact that $u$ is a prefix (suffix resp.) of $v$ is denoted $u \le_p v$ ($u \le_s v$ resp.). If $u \le_p v$ ($u \le_s v$ resp.) and $u \ne v$, then $u$ is a proper prefix (suffix resp.) of $v$. We shall denote the longest common prefix (suffix resp.) of $u$ and $v$ by $u \wedge_p v$ ($u \wedge_s v$ resp.). Two words are prefix-comparable, denoted $u \bowtie v$ (suffix-comparable, denoted $\bowtie_s$, resp.), if one of them is a prefix (suffix resp.) of the other. If we want to say that $u$ is a prefix (suffix resp.) of some sufficiently large power of $v$, we say that $u$ is a prefix (suffix resp.) of $v^*$. Concepts of concatenation, prefix and suffix are extended to pairs in the obvious way.

We shall use the standard notation of regular expressions to describe certain sets of words. Note that $\{u, v\}^*$ is an alternative notation for $(u + v)^*$. In regular expressions, the empty word is represented by 1.

If $u$ is a prefix (suffix resp.) of $v$, then $u^{-1}v$ ($vu^{-1}$ resp.) denotes the unique word $z$ such that $v = uz$ ($v = zu$ resp.). The expression $u^{-1}v$ ($vu^{-1}$ resp.) is undefined otherwise.

A pair of noncommuting words is also called a binary code. We need the following properties of binary codes (see [1, Lemma 3.1]). If $u$ and $v$ do not commute, then the word $\alpha = uv \wedge_p vu$ is prefix-comparable with all words in $\{u, v\}^*$. Moreover, there are distinct letters $c_u$ and $c_v$ such that $\alpha c_u$ is prefix-comparable with each word in $u\{u, v\}^*$ and $\alpha c_v$ is prefix-comparable with each word in $v\{u, v\}^*$. We shall use these facts for suffixes analogously. They directly imply a weak version of the Periodicity lemma in the following form:

Lemma 1. If $w$ is a common prefix (suffix resp.) of $u^*$ and $v^*$ and $|u| + |v| \le |w|$, then $u$ and $v$ commute.

A binary morphism $f$ (defined on $\{\mathbf{0}, \mathbf{1}\}^*$) is called marked if $\mathrm{pref}(f(\mathbf{0})) \ne \mathrm{pref}(f(\mathbf{1}))$, where $\mathrm{pref}(u)$ denotes the first letter of $u$. For a general binary morphism $f$, its marked version $f_m$ is the morphism defined by $f_m(u) = \alpha_f^{-1} f(u) \alpha_f$ where $\alpha_f = f(\mathbf{0}\mathbf{1}) \wedge_p f(\mathbf{1}\mathbf{0})$. It is easy to see, from the facts mentioned above, that the definition of $f_m$ is correct, and that $f_m$ is marked.

We remark that, compared to [2], we adopt a more elementary approach, and do not use the powerful technique of the free basis and the Graph lemma.
While using the Graph lemma in general makes certain arguments much more comfortable, in our particular case it turns out that the exposition is only negligibly affected by this choice.

Isabelle is a generic proof assistant allowing a formalization of mathematical formulas and their proofs. Isabelle was originally developed at the University of Cambridge and Technische Universität München, but now includes numerous contributions from institutions and individuals worldwide. The most important instantiation of Isabelle to higher-order logic is Isabelle/HOL; the reader might consult for instance [4] for more details on Isabelle/HOL. The freely available distribution of the proof assistant (https://isabelle.in.tum.de) also contains detailed documentation.

As mentioned in Introduction, one of the goals of this article is to provide a formalization of the presented result (and of its proof). This is done in Isabelle/HOL. The full formalization is available at [5]. In this article, we give an overview of key concepts, with comments suitable for readers not familiar with Isabelle/HOL. If a reader is not interested in this formalization, these sections may be skipped.

We start by introducing the formalization of the main ideas of Preliminaries. The core building blocks of Isabelle are datatypes, terms and formulae. Our basic datatype, used for a word, is a list, which is in Isabelle equipped with many needed tools such as concatenation, denoted as multiplication.

To capture a word over a binary alphabet, we use a custom datatype which allows us to work with all binary words. The following code defines the datatype consisting of two values bin0 and bin1:

    datatype binA = bin0 | bin1

The next declarations set up abbreviations for the two words of length 1, denoted by 𝟬 and 𝟭 (these are the lists of length 1, denoted by [bin0] and [bin1]).

    abbreviation bin-word-0 :: binA list (𝟬) where bin-word-0 ≡ [bin0]
    abbreviation bin-word-1 :: binA list (𝟭) where bin-word-1 ≡ [bin1]

As an example, we exhibit the claim that all lists over the constructed datatype binA are generated by the two words of length 1. The keyword UNIV stands for the set of all elements of a given type (types are inferred automatically).

    lemma A-generates: ⟨{𝟬, 𝟭}⟩ = UNIV
      by (metis A-singletons basis-gen-monoid bin-UNIV lists-UNIV lists-basis words-univ.FMonoid-axioms)

The proof verified by Isabelle is given on the second line. It gives the proof method (here metis) and the names of the claims used (supplied in the full code). This formalization includes most of the concepts mentioned in Preliminaries (in general, when possible, we keep the same notation in the formalization). For instance, let us exhibit the definition of a prefix and its notation ≤p:

    definition Prefix (infixl ≤p 50) where prefdef[simp]: u ≤p v ≡ ∃ z. v = u · z

As morphisms and their marked versions form an important part of the tools used, we next give their formalization details, along with further core concepts of Isabelle.

We formalize the concept of a (general) morphism using a locale, the Isabelle environment used to deal with parametric theories. In particular, a locale allows one to introduce global parameters (introduced by the keyword fixes) and assumptions (introduced by the keyword assumes), and thus prevents unnecessary repetition of assumptions in every lemma. As an illustration, we exhibit a simple claim and its proof using these assumptions, called context in Isabelle, and delimited by the keywords begin and end.

    locale morphism =
      fixes f
      assumes morph: f (u · v) = f u · f v
    begin
    lemma empty-to-empty: f ε = ε
      by (metis morph self-append-conv2)
    end

Such a lemma in the context in fact produces a claim named morphism.empty-to-empty which is equivalent to the following lemma:

    lemma morphism f ⟹ f ε = ε
      by (simp add: morphism.empty-to-empty)

Note that the assumption named morph contains a term with two free variables, u and v, with no quantifiers. As customary, such variables are understood to be universally quantified, that is, the assumption holds for all u and v. Since types are inferred automatically, this assumption implies that u and v are lists, and f is a mapping from lists to lists.

As mentioned in Introduction, we see elements of a binary code as images by a morphism. Accordingly, a binary code is formalized by extending the locale morphism by an additional assumption on the images of singletons as follows. This gives rise to a new locale binary-code:

    locale binary-code = morphism f :: binA list ⇒ 'a list for f +
      assumes bin-code: f 𝟬 · f 𝟭 ≠ f 𝟭 · f 𝟬

The declaration f :: binA list ⇒ 'a list specifies the datatype of the parameter f. The given datatype is a mapping from all lists over the datatype binA to the lists over a generic unspecified datatype 'a, thus setting the domain to be all binary words.

The next formalization step we point out is the definition of α and f_m in the context of binary-code, i.e., for a given morphism f.

    definition α where αLCP[simp]: α = f (𝟬 · 𝟭) ∧p f (𝟭 · 𝟬)
    definition f_m where f_m-def[simp]: f_m = (λ w. (α⁻¹ · (f w) · α))

The definition of f_m is done using a nameless function following λ-calculus conventions.

The next claim is also in the context of binary-code, giving an essential statement on α: α is a prefix of $f(w)\alpha$ for every $w$. We display the formalized proof as well; it is done by induction on the list w (that is, the base case is the empty list, and the induction step proves the claim for the list [a] · w assuming that it holds for w).
    lemma αwα: α ≤p f w · α
    proof (induct w)
      case Nil (* case w = ε *)
      then show ?case by simp
      case (Cons a w) (* induction step: case w' = a w, α ≤p f(w) α *)
      then show ?case
      proof −
        have α ≤p f [a] · α
          using α𝟬α α𝟭α alphabet-or by metis
        show ?thesis
          using pref-prolong[OF ⟨α ≤p f [a] · α⟩ ⟨α ≤p f w · α⟩] hd-word[of a w]
          by (metis append-assoc morph)
      qed
    qed

This proof gives a rough idea about the level of detail contained in the formalization. Note that the induction step uses the validity of the claim for singletons (facts named α𝟬α and α𝟭α) and the simple fact (called pref-prolong) which claims that if $w \le_p zr$ and $r \le_p s$, then $w \le_p zs$. The latter claim illustrates what can be considered a single step in the formalization. Note nevertheless that even this step is based on an auxiliary lemma which is proved elsewhere using even more elementary auxiliary lemmas.

Let $G = \{x, y\}$ and $H = \{u, v\}$ be two binary codes, that is, $xy \ne yx$ and $uv \ne vu$. Our aim is to describe the intersection $I = G^* \cap H^*$. The aim is achieved by a series of reformulations.

First, we shall see the languages $G^*$ and $H^*$ as ranges of the morphisms $\mathfrak{g}$ and $\mathfrak{h}$ over $A^* = \{\mathbf{0}, \mathbf{1}\}^*$, defined by $G = \{\mathfrak{g}(\mathbf{0}), \mathfrak{g}(\mathbf{1})\}$ and $H = \{\mathfrak{h}(\mathbf{0}), \mathfrak{h}(\mathbf{1})\}$. The structure of the intersection of $G^*$ and $H^*$ will follow from a stronger result: a characterization of the coincidence set of $\mathfrak{g}$ and $\mathfrak{h}$, defined by

$$C(\mathfrak{g}, \mathfrak{h}) = \{(r, s) \in A^* \times A^* \mid \mathfrak{g}(r) = \mathfrak{h}(s)\}.$$

Indeed, we have

$$I = \{\mathfrak{g}(r) \mid (r, s) \in C(\mathfrak{g}, \mathfrak{h})\} = \{\mathfrak{h}(s) \mid (r, s) \in C(\mathfrak{g}, \mathfrak{h})\}.$$

Second, instead of $C(\mathfrak{g}, \mathfrak{h})$ we shall investigate

$$C(g, h) = \{(r, s) \in A^* \times A^* \mid g(r) = h(s)\},$$

where $g$ is the marked version of $\mathfrak{g}$, and $h$ is the marked version of $\mathfrak{h}$. The set $C(g, h)$ is easier to investigate since both $g$ and $h$ are marked.
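To make the two objects just introduced concrete, here is a minimal Python sketch that computes the marked version of a binary morphism and a bounded part of the coincidence set by brute force. It is only an illustration, not part of the formalization: words are modelled as Python strings over '0' and '1', the helper names (`lcp`, `morphism`, `marked_version`, `coincidence_set`) are hypothetical, the names `g0` and `h0` for the original morphisms merely echo the locale instance names, and the concrete pair of morphisms is a toy instance chosen as an assumption.

```python
from itertools import product


def lcp(u, v):
    """Longest common prefix of the words u and v."""
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return u[:n]


def morphism(image0, image1):
    """Binary morphism on words over '0'/'1', given by the two letter images."""
    images = {"0": image0, "1": image1}
    return lambda w: "".join(images[c] for c in w)


def marked_version(f):
    """Return (alpha_f, f_m), where f_m(w) = alpha_f^{-1} f(w) alpha_f."""
    alpha = lcp(f("01"), f("10"))

    def f_m(w):
        image = f(w) + alpha
        # alpha is a prefix of f(w) alpha, so the left quotient is defined
        assert image.startswith(alpha)
        return image[len(alpha):]

    return alpha, f_m


def binary_words(max_len):
    """All words over '0'/'1' of length at most max_len."""
    return ["".join(t) for n in range(max_len + 1) for t in product("01", repeat=n)]


def coincidence_set(g, h, max_len):
    """Bounded part of the coincidence set: pairs (r, s) with g(r) == h(s)."""
    words = binary_words(max_len)
    return {(r, s) for r in words for s in words if g(r) == h(s)}


# Toy instance (an assumption chosen for illustration only).
g0 = morphism("a", "aab")
h0 = morphism("a", "baa")
alpha_g, g = marked_version(g0)
alpha_h, h = marked_version(h0)

C0 = coincidence_set(g0, h0, 4)
I_from_g = {g0(r) for r, s in C0}
I_from_h = {h0(s) for r, s in C0}
```

In this toy instance the two descriptions of the intersection agree on the bounded part, and the marked images start with different letters, as the definition of the marked version promises. The bound `max_len` is purely a device of the sketch; the coincidence set itself is infinite whenever it contains a nonempty pair.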
The more difficult part of the result is establishing the relationship between $C(g, h)$ and $C(\mathfrak{g}, \mathfrak{h})$.

Assume that $I$ contains a nonempty word, that is, that there are nonempty words $r$ and $s$ such that $\mathfrak{g}(r) = \mathfrak{h}(s)$. Then both $\alpha_g$ and $\alpha_h$ are prefixes of $\mathfrak{g}(r)^i = \mathfrak{h}(s)^i$ for a sufficiently large $i$, which implies that $\alpha_g$ and $\alpha_h$ are prefix comparable. Without loss of generality we shall suppose $\alpha_h \le_p \alpha_g$. Let $\alpha = \alpha_h^{-1}\alpha_g$. Then $\mathfrak{g}(r) = \mathfrak{h}(s)$ if and only if

$$\alpha g(r) = h(s)\alpha. \qquad (1)$$

Formalization: basic locales and the coincidence set
The morphisms $\mathfrak{g}$ and $\mathfrak{h}$ are formalized as two instances of the locale binary-code, producing a new locale binary-intersection-possibly-empty. This gives access to the words $\alpha_g$ and $\alpha_h$ and to the marked versions of $\mathfrak{g}$ and $\mathfrak{h}$, which obtain their expected names using notation. (It also gives access to the auxiliary claims of binary-code for the two morphisms.) The assumption $\alpha_h \le_p \alpha_g$ is then added in yet another locale.

    locale binary-intersection-possibly-empty =
      g0: binary-code 𝔤 :: binA list ⇒ 'a list +
      h0: binary-code 𝔥 :: binA list ⇒ 'a list
      for 𝔤 𝔥
    begin
    notation h0.α (α_h) (* setting the notation α_h to α from the parent locale representing 𝔥 *)
    notation g0.α (α_g)
    notation h0.f_m (h)
    notation g0.f_m (g)
    end

    locale binary-intersection = binary-intersection-possibly-empty +
      assumes alphas: α_h ≤p α_g
    begin
    definition α where α ≡ α_h⁻¹ · α_g
    end

Using the datatype used for a binary morphism (binA list ⇒ 'a list), we define the coincidence set C as follows:

    definition Coincidence-Set :: (binA list ⇒ 'a list) ⇒ (binA list ⇒ 'a list) ⇒ (binA list × binA list) set (C[-,-])
      where Coincidence-Set g h ≡ {(r, s). g r = h s}

The crucial relation between $C(\mathfrak{g}, \mathfrak{h})$ and $C(g, h)$ is formalized as an equivalence (denoted by ≡):

    lemma solution-marked-version: 𝔤 r = 𝔥 s ≡ α · g r = h s · α
      using gmarked.f_m-conjugates hmarked.f_m-conjugates α-def
      by (smt append-assoc append-same-eq g0.f_m-conjugates h0.f_m-conjugates same-append-eq)

Again, the displayed proof references auxiliary claims that are not present in the excerpt from the whole formalization, which consists of formalizing many "obvious" steps.

$C(\mathfrak{g}, \mathfrak{h})$

We call pairs $(r, s) \in C(\mathfrak{g}, \mathfrak{h})$ solutions.
$C(\mathfrak{g}, \mathfrak{h})$ is a free semigroup and the elements of its minimal generating set are minimal solutions.

The structure of $C(\mathfrak{g}, \mathfrak{h})$ heavily depends on the existence of the following three pairs of words, called blocks. We say that $(p, q)$ is the starting block if $\alpha g(p) = h(q)$, and $\alpha g(p') \ne h(q')$ for any $(p', q') < (p, q)$. Note that $\alpha_g g(p) = \alpha_h h(q)$. We say that $(e, f)$ is the $a$-block if $a \in \{\mathbf{0}, \mathbf{1}\}$ is a prefix of $e$, and $(e, f)$ is a minimal solution of $g$ and $h$. The $\mathbf{0}$-block and the $\mathbf{1}$-block are also called letter blocks.

Since $g$ and $h$ are marked, the process of the construction of a solution is deterministic in the following sense. For any comparable $g(r)$ and $h(s)$ such that $g(r) \ne h(s)$, there is at most one extension of either $r$ or $s$ which keeps the images comparable. This implies the following facts:

– each block (the starting block, the $\mathbf{0}$-block and the $\mathbf{1}$-block) is unique if it exists;
– any solution in $C(g, h)$ has a unique decomposition into letter blocks.

Similarly, we obtain the following characterization of morphisms without the starting block.

Lemma 2.
If the starting block does not exist, then $C(\mathfrak{g}, \mathfrak{h})$ contains at most one minimal solution.

Proof. Note that for $\alpha = \varepsilon$, the pair $(\varepsilon, \varepsilon)$ is the starting block. Therefore, the word $\alpha$ is not empty, and since $g$ and $h$ are marked and there is no starting block, the words $r$ and $s$ satisfying

$$\alpha g(r) = h(s)\alpha$$

are constructed deterministically, using the mentioned procedure, letter by letter and keeping the images prefix comparable. If such a solution exists, then the first one produced by this procedure is a prefix of any other nonempty solution, and using (1), it is thus the unique minimal solution of $C(\mathfrak{g}, \mathfrak{h})$.

Let us further suppose that the starting block $(p, q)$ exists. Then we have the following reduction of elements of $C(\mathfrak{g}, \mathfrak{h})$ to elements of $C(g, h)$.

Lemma 3.
If the starting block $(p, q)$ exists, and $(e, f) \in C(\mathfrak{g}, \mathfrak{h})$, then $(p, q)$ is a prefix of $(ep, fq)$, and $(p^{-1}ep, q^{-1}fq) \in C(g, h)$.

Proof. As $(p, q)$ is the starting block, and using (1), we have $\alpha g(ep) = h(fq)$. Thus, $(p, q)$ is a prefix of $(ep, fq)$. We may write $\alpha g(p)\, g(p^{-1}ep) = h(q)\, h(q^{-1}fq)$ and obtain

$$g(p^{-1}ep) = h(q^{-1}fq).$$

This implies that each solution has a block decomposition, by which we mean the decomposition of $(p^{-1}ep, q^{-1}fq)$ into letter blocks.

However, the structure of $C(\mathfrak{g}, \mathfrak{h})$ does not necessarily mirror the simple structure of $C(g, h)$. Although we may be tempted to conclude that $C(\mathfrak{g}, \mathfrak{h})$ consists of elements $(pep^{-1}, qfq^{-1})$ where $(e, f) \in C(g, h)$, the problem is that $(pep^{-1}, qfq^{-1})$ is ill-defined if $(p, q)$ is not a suffix of $(pe, qf)$. Instead we have the following characterization:

Lemma 4.

$$C(\mathfrak{g}, \mathfrak{h}) = \{(pep^{-1}, qfq^{-1}) \mid (e, f) \in C(g, h) \text{ and } (p, q) \le_s (pe, qf)\}.$$

Proof.
The inclusion $\subseteq$ is Lemma 3.

To see the inclusion $\supseteq$, we first verify, using the properties of the starting block, that $g(e) = h(f)$ implies

$$\alpha g(pep^{-1}) = h(qfq^{-1})\alpha.$$

The claim now follows from (1).

Formalization of minimal solutions and blocks
The definition of a minimal solution (for a morphism g, word r, morphism h, and word s, in this order) is formalized in the following way, introducing a useful short notation g r =m h s:

    definition MinimalSolution :: (binA list ⇒ 'a list) ⇒ binA list ⇒ (binA list ⇒ 'a list) ⇒ binA list ⇒ bool
      ((- -) =m (- -) [80, , , 80] 51)
      where minsoldef: MinimalSolution g r h s ≡ r ≠ ε ∧ s ≠ ε ∧ g r = h s
        ∧ (∀ r' s'. r' ≤np r ∧ s' ≤p s ∧ g r' = h s' ⟶ r' = r ∧ s' = s) (* ≤np stands for nonempty prefix *)

The formalization of Lemma 2, dealing with the case of no starting block, is rewritten and proven as:

    lemma no-pq-one-minimal:
      assumes ⋀ p q. α · g p ≠ h q and 𝔤 r =m 𝔥 s and 𝔤 r' =m 𝔥 s'
      shows (r, s) = (r', s')

The fact that there is at most one starting block is stated (and proven) in the second basic way of writing assumptions and claims in Isabelle, using implications ⟹.

    lemma at-most-one-pq: z ≠ ε ⟹ z · g r = h s ⟹ ∃ p q. z · g p = h q ∧ (∀ r s. z · g r = h s ⟶ (p ≤p r ∧ q ≤p s))

Note that the lemma has two assumptions, namely $z \ne \varepsilon$ and $z g(r) = h(s)$, and the conclusion is a complicated logical formula, which itself contains an implication which is nevertheless written as ⟶. This illustrates two levels on which the formalization operates, and which reflect the composed name "Isabelle/HOL" of the proof assistant we use. While the formula of the conclusion is formulated in the object logic, namely HOL (see [4]), the implication ⟹ is part of the metalogic proper to Isabelle, called Pure. This metalogic is best seen as an abbreviation for the natural language construction "if ... then". That is, the whole claim should be read as: "If $z \ne \varepsilon$, and if $z g(r) = h(s)$, then the following formula holds ...."

Finally, the assumption of existence of such a starting pair is realized using a locale, with two additional assumptions called pq and pq-minimal. Lemma 4 is formalized within this locale.

    locale binary-intersection-pq = binary-intersection- for p q +
      assumes pq: α · g p = h q
        and pq-minimal: α · g p' = h q' ⟹ p ≤p p' ∧ q ≤p q'
    begin
    lemma char-solutions: 𝔤 r = 𝔥 s ⟷ (∃ e f. g e = h f ∧ p ≤s p · e ∧ q ≤s q · f ∧ r = (p · e) · p⁻¹ ∧ s = (q · f) · q⁻¹) (* Lemma 4 *)
    end

4.2 Letter blocks as morphisms

Since the elements of $C(g, h)$ decompose into letter blocks, we define morphisms $e$ and $f$ on $A^*$ where $(e(a), f(a))$ is the $a$-block. The morphisms are partial if some letter block does not exist. The characterization is finally reduced to characterizing the set $T$ satisfying the condition of Lemma 4. Namely we set

$$T = \{\tau \in A^* \mid (p, q) \le_s (p\,e(\tau), q\,f(\tau))\}.$$

Lemma 5. If $\tau_1, \tau_1\tau_2 \in T$, then $\tau_2 \in T$.

Proof. Since $p$ is a suffix of $p\,e(\tau_1)$, we have that $p\,e(\tau_2)$ is a suffix of $p\,e(\tau_1)e(\tau_2)$. Since $p$ is also a suffix of $p\,e(\tau_1)e(\tau_2)$, we deduce that $p$ is a suffix of $p\,e(\tau_2)$. Similarly, we obtain that $q$ is a suffix of $q\,f(\tau_2)$. Hence $\tau_2 \in T$.

We also have the following simple property.

Lemma 6. If $c^i \in T$, where $i$ is positive and $c \in \{\mathbf{0}, \mathbf{1}\}$, then also $c \in T$.

Proof. If $p$ is a suffix of $p\,e(c)^i$, then $p$ is a suffix of $e(c)^*$. It implies that $p$ is a suffix of $p\,e(c)$. Similarly, if $q$ is a suffix of $q\,f(c)^i$, it is a suffix of $q\,f(c)$.

We point out three more auxiliary arguments.

Lemma 7. If $\zeta\mathbf{1}\mathbf{0}^i \in T$, with $0 < i$, then
(1) $p$ is a proper suffix of $e(\mathbf{0})$.
(2) $\alpha_g g(p) \le_s g(e(\mathbf{0}^i))$.
(3) $q \le_s f(\mathbf{0}^i)$.

Proof.
1. If $p$ is not a proper suffix of $e(\mathbf{0})$, then $p \le_s p\,e(\zeta\mathbf{1}\mathbf{0}^i)$ implies that $e(\mathbf{0})$ is a suffix of $p$. From $\alpha g(p) = h(q)$, $g(e(\mathbf{0})) = h(f(\mathbf{0}))$ and $q \le_s q\,f(\zeta\mathbf{1}\mathbf{0}^i)$ we deduce that $(e(\mathbf{0}), f(\mathbf{0}))$ is a suffix of $(p, q)$, contradicting the minimality of $(p, q)$.

2. Recall that $\alpha_g$ is suffix comparable with any $g(w)$, since $g(w) = \alpha_g^{-1}\mathfrak{g}(w)\alpha_g$. This implies that $\alpha_g g(p)$ and $g(e(\mathbf{0}^i))$ are suffix comparable. It is therefore enough to show that $\alpha_g g(p)$ is shorter than $g(e(\mathbf{0}^i))$. From (1) we have $|g(e(\ ^{i}))| + |g(p)| \le |g(e(\ ^{i-1}))|$, and the claim follows from $|\alpha_g| < |g(\ )|$.

3. If $q$ is not a suffix of $f(\mathbf{0}^i)$, then $f(\mathbf{0}^i)$ is a proper suffix of $q$ since $q \le_s q\,f(\zeta\mathbf{1}\mathbf{0}^i)$. This contradicts (2) in view of $\alpha g(p) = h(q)$ and $g(e(\mathbf{0}^i)) = h(f(\mathbf{0}^i))$.

The most challenging part of the proof is the following lemma. It constitutes the real core of the proof.

Lemma 8. If $\zeta c \in T$ for some $c \in \{\mathbf{0}, \mathbf{1}\}$ and $\zeta \in \langle\{\mathbf{0}, \mathbf{1}\}\rangle$, then also $c \in T$.

Proof. Without loss of generality, let $c = \mathbf{0}$, and write $\tau = \zeta c$. The claim follows from Lemma 6 if $\tau \in \mathbf{0}^*$. Let therefore $\tau = \zeta'\mathbf{1}\mathbf{0}^i$, and assume

$$(p, q) \le_s \left(p\,e(\zeta')\,e(\mathbf{1})\,e(\mathbf{0})^i,\; q\,f(\zeta')\,f(\mathbf{1})\,f(\mathbf{0}^i)\right).$$

We want to show that $(p, q)$ is a suffix of $(p\,e(\mathbf{0}), q\,f(\mathbf{0}))$. This is equivalent to showing that $(p, q)$ is a suffix of $(e(\mathbf{0})^*, f(\mathbf{0})^*)$. Assume the contrary.

The equality $\alpha g(p) = h(q)$ and Lemma 7 (1) imply that $g(e(\mathbf{0})p^{-1})$ is a suffix of $\alpha$. Since $|\alpha| < |g(\ )|$, we have that $e(\mathbf{0})p^{-1}$ is $\mathbf{0}^m$ for some $m \ge 1$. Let $\alpha_f = f(\mathbf{0})^* \wedge_s f(\mathbf{1})^*$, and let $c_{\mathbf{0}}$ and $c_{\mathbf{1}}$ be distinct letters such that $c_{\mathbf{0}}\alpha_f \le_s f(\mathbf{0})^*$ and $c_{\mathbf{1}}\alpha_f \le_s f(\mathbf{1})^*$. Let, moreover, $\alpha_h = h(\mathbf{0})^* \wedge_s h(\mathbf{1})^*$. Then $\alpha_h h(\alpha_f)$ is the longest common suffix of $h(f(\mathbf{0})^*)$ and $h(f(\mathbf{1})^*)$.
Since $\alpha$ is a suffix of both $g(e(\mathbf{0})^*)$ and $g(e(\mathbf{1})^*)$, we deduce that $\alpha$ is a suffix of $\alpha_h h(\alpha_f)$ and hence

$$|\alpha| \le |h(\alpha_f)| + |\alpha_h|.$$

Since $q$ is a suffix of $q\,f(\zeta')f(\mathbf{1})f(\mathbf{0})^i$ and not a suffix of $f(\mathbf{0})^*$, we obtain that $c_{\mathbf{1}}\alpha_f f(\mathbf{0})^i$ is a suffix of $q$. From $\alpha g(p) = h(q)$ and $e(\mathbf{0}) = \mathbf{0}^m p$, we now have

$$h(c_{\mathbf{1}}\alpha_f f(\mathbf{0})^{i-1}) \le_s \alpha\, g(\mathbf{0}^m)^{-1},$$

which yields

$$|h(c_{\mathbf{1}}\alpha_f)| + |g(\mathbf{0})| \le |\alpha|.$$

The two inequalities above imply that $|h(c_{\mathbf{1}})| + |g(\mathbf{0})| \le |\alpha|$ and $|h(c_{\mathbf{1}})| + |g(\mathbf{0})| \le |\alpha_h|$. Since $\alpha_h$ is a suffix of $h(c_{\mathbf{1}})^*$, $\alpha$ is a suffix of $g(\mathbf{0})^*$, and $\alpha_h$ and $\alpha$ are suffix comparable, the Periodicity lemma implies that $g(\mathbf{0})$ and $h(c_{\mathbf{1}})$ commute (see Lemma 1). Since both $g$ and $h$ are marked, we obtain that $f(\mathbf{0}) \in c_{\mathbf{1}}^*$, which contradicts $c_{\mathbf{0}}\alpha_f \le_s f(\mathbf{0})^*$.

We can now characterize the slightly surprising possibility when the intersection of two free binary monoids is infinitely generated. This happens when both letter blocks exist, but one of the singletons is not in $T$. By symmetry, we shall therefore suppose, in the following classification lemma, that $\mathbf{0} \in T$ and $\mathbf{1} \notin T$.

Lemma 9.
Assume that both letter blocks exist, $\mathbf{0} \in T$ and $\mathbf{1} \notin T$. Then $\tau$ is a minimal element of $T$ if and only if $\tau = \mathbf{0}$, or $\mathbf{1}$ is a prefix of $\tau$, and $\mathbf{0}^{t+1}$ is its suffix, and there is no other occurrence of $\mathbf{0}^{t+1}$ in $\tau$, where $t$ is the least non negative integer such that $q \le_s q\,f(\mathbf{1}\mathbf{0}^{t+1})$.

Proof. From $(p, q) \le_s (p\,e(\mathbf{0}), q\,f(\mathbf{0}))$, we have that $(p, q)$ is a suffix of $(e(\mathbf{0})^*, f(\mathbf{0})^*)$. Hence there exists a least non negative integer $t$ such that $(p, q)$ is a suffix of $(p\,e(\mathbf{1}\mathbf{0}^{t+1}), f(\mathbf{1}\mathbf{0}^{t+1}))$, that is, such that $\mathbf{1}\mathbf{0}^{t+1} \in T$.

Lemma 7, items (1) and (3), yield that

$$\zeta\mathbf{1}\mathbf{0}^i \in T \text{ if and only if } i \ge t + 1, \qquad (2)$$

which implies that $\zeta\mathbf{1}\mathbf{0}^{t+1} \in T$ for all $\zeta$. We may now characterize the minimal generating set of $T$.

As $\mathbf{0} \in T$, using Lemma 5, we have that the only minimal generating element $\tau \in T$ starting with $\mathbf{0}$ is $\tau = \mathbf{0}$.

Assume now that $\tau$ is a minimal generating element of $T$ starting with $\mathbf{1}$. By (2), $\mathbf{1}\mathbf{0}^i$ is a suffix of $\tau$ with $i \ge t + 1$, hence $\mathbf{0}^{t+1} \le_s \tau$. Let us write $\tau = \zeta\mathbf{0}^{t+1}\zeta'$. As $\zeta\mathbf{0}^{t+1} \in T$, Lemma 5 implies $\zeta' \in T$, and minimality of $\tau$ implies $\zeta' = \varepsilon$. Hence, the only occurrence of $\mathbf{0}^{t+1}$ in $\tau$ is as its suffix.

Assume now that $\tau$ has prefix $\mathbf{1}$, suffix $\mathbf{0}^{t+1}$, and there is no other occurrence of $\mathbf{0}^{t+1}$. Let $\tau = \tau_1\tau_2$ with $\tau_1, \tau_2 \in T$ and $\tau_1$ non-empty. As $\mathbf{1}$ is a prefix of $\tau_1$, we may write $\tau_1 = \zeta\mathbf{1}\mathbf{0}^i$, and thus by (2) we have $i \ge t + 1$, which produces an occurrence of $\mathbf{0}^{t+1}$, and thus $\tau_2 = \varepsilon$. Therefore, there is no decomposition of $\tau$, and it is a minimal element of $T$.

Formalization of letter blocks, the set T and the result

We skip the formal construction of the morphisms $e$ and $f$, as many more Isabelle concepts would need to be introduced in order to explain its technical details. We invite the reader to inspect it in the full code.

The case when only one letter block exists is treated rather implicitly in the human proof. Nevertheless, in the formalization, we have the following explicit claim.

    lemma unique-block:
      assumes g e =m h f
        and ⋀ e' f'. g e' =m h f' ⟹ (e', f') = (e, f)
        and 𝔤 r =m 𝔥 s
      shows (r, s) = (p · e · p⁻¹, q · f · q⁻¹)

The assumption of existence of both letter blocks is introduced as a locale which is used further on.

    locale binary-intersection-blocks = binary-intersection-pq +
      assumes minblock0: g (e 𝟬) =m h (f 𝟬)
        and hdblock0: e 𝟬 ! 0 = bin0 (* e ! e *)
        and minblock1: g (e 𝟭) =m h (f 𝟭)
        and hdblock1: e 𝟭 ! 0 = bin1

The set $T$ is introduced as the predicate of its elements, which is more suitable for further use.

    definition Tpred :: binA list ⇒ bool where Tpred τ ≡ p ≤s p · e τ ∧ q ≤s q · f τ
    definition T where T ≡ {τ. Tpred τ}

The relation between the solutions, the morphisms $e$ and $f$, and the set $T$ (i.e., the predicate Tpred), is now a consequence of a few more straightforward lemmas in Isabelle resulting in the following:

    corollary KeyRelation: C[𝔤, 𝔥] = {((p · e τ) · p⁻¹, (q · f τ) · q⁻¹) | τ. Tpred τ}

Formalizations of Lemmas 5 and 8 are straightforward:

    lemma T-prefix-code: assumes Tpred τ₁ and Tpred (τ₁ · τ₂) shows Tpred τ₂ (* Lemma 5 *)
    lemma last-block: Tpred (z · [c]) ⟹ Tpred [c] (* Lemma 8 *)
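To get a feel for the predicate Tpred, the following Python sketch evaluates it on strings and checks the statement of Lemma 5 by brute force on short words. It is a hedged illustration only: words are modelled as Python strings over '0' and '1', the helper names are hypothetical, and the data (identity morphisms for `e` and `f`, `p` empty, `q = "00"`) are toy values assumed for the example, not taken from the formalization.

```python
from itertools import product


def morphism(image0, image1):
    """Binary morphism on words over '0'/'1', given by the two letter images."""
    images = {"0": image0, "1": image1}
    return lambda w: "".join(images[c] for c in w)


def tpred(tau, p, q, e, f):
    """Tpred tau: p is a suffix of p.e(tau) and q is a suffix of q.f(tau)."""
    return (p + e(tau)).endswith(p) and (q + f(tau)).endswith(q)


def binary_words(max_len):
    """All words over '0'/'1' of length at most max_len."""
    return ["".join(t) for n in range(max_len + 1) for t in product("01", repeat=n)]


# Hypothetical toy data (assumed for illustration only).
e = morphism("0", "1")
f = morphism("0", "1")
p, q = "", "00"


def in_T(tau):
    """Membership in T for the toy data above."""
    return tpred(tau, p, q, e, f)


def lemma5_holds(max_len):
    """Check on short words: tau1 in T and tau1.tau2 in T imply tau2 in T."""
    words = binary_words(max_len)
    return all(
        in_T(t2)
        for t1 in words
        for t2 in words
        if in_T(t1) and in_T(t1 + t2)
    )
```

With these toy values, a word belongs to T exactly when appending it to `q` leaves `q` as a suffix, so the word "0" is in T while the word "1" is not. A finite check of course proves nothing in general; it only illustrates what the formal statement T-prefix-code asserts.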
The human proof of Lemma 8 contains several steps which depend on some level of insight into properties of binary codes. The formalization of this proof is therefore particularly interesting and important (and demanding). The main proof is preceded by a dedicated locale that contains forty three claims, including the claims of Lemma 7. In a sense, therefore, the proof of the lemma is fragmented into forty three smaller steps. It should be made clear, however, that the fragmentation is to a great extent a matter of taste, since a single proof can often be quite naturally divided into several lemmas, or vice versa. Moreover, fourteen lemmas out of the forty three are of purely preparatory nature, allowing us to use other claims formulated for prefixes in a reversed way for suffixes. This is something which in the given human proof is done by a simple appeal to a "mirrored situation", an insight that is hardly possible to formalize in a uniform way.

We do not list the formalized equivalents of Lemmas 7 and 6 as they are split in the code into several lemmas.

The characterization of the set $T$ is concluded in the two following locales: the first, called binary-intersection-blocks-trivial, is for the case $\mathbf{0}, \mathbf{1} \in T$; the second, named binary-intersection-blocks-nontrivial, is for the case $\mathbf{0} \in T$, $\mathbf{1} \notin T$. The term B T stands for a basis of $T$, i.e., the set of its minimal elements, and the term Suc t represents $t + 1$.

    locale binary-intersection-blocks-trivial = binary-intersection-blocks +
      assumes easy-block0: (p ≤s p · e 𝟬 ∧ q ≤s q · f 𝟬)
        and easy-block1: (p ≤s p · e 𝟭 ∧ q ≤s q · f 𝟭)
    begin
    theorem Tpred τ (* i.e., T = ⟨{𝟬, 𝟭}⟩ *)
    end

    locale binary-intersection-blocks-nontrivial = binary-intersection-blocks for t +
      assumes easy-block: (p ≤s p · e 𝟬 ∧ q ≤s q · f 𝟬)
        and t-block: ¬ q ≤s q · f 𝟭 · f 𝟬 ^ t
        and t-block-suc: q ≤s q · f 𝟭 · f 𝟬 ^ Suc t
    begin
    corollary Tbasis: B T = {τ. τ = 𝟬 ∨ (𝟭 ≤p τ ∧ 𝟬 ^ Suc t ≤s τ ∧ ¬ 𝟬 ^ Suc t ≤f butlast τ)} (* Lemma 9 *)
    end

Let us explain the notation in the claim Tbasis: ≤f stands for "is factor of" and the function butlast returns the list without its last element.

Returning from the coincidence set back to the intersection properly speaking, the main claim (Theorem 2) of [3] is that if $\{x, y\}$ and $\{u, v\}$ are binary codes, then the intersection $I = \{x, y\}^* \cap \{u, v\}^*$ has one of the following forms:

$$I = \{\beta, \gamma\}^* \qquad (*)$$

$$I = \left(\beta_0 + \beta(\gamma(1 + \delta + \cdots + \delta^t))^*\epsilon\right)^* \qquad (**)$$

Let us summarize our proof and show that it agrees with the formulation from [3]. Recall that, by definition, we have $\{x, y\} = \{\mathfrak{g}(\mathbf{0}), \mathfrak{g}(\mathbf{1})\}$ and $\{u, v\} = \{\mathfrak{h}(\mathbf{0}), \mathfrak{h}(\mathbf{1})\}$.

1. If $I = \{\varepsilon\}$, then the claim holds for $\beta = \gamma = \varepsilon$.

2. Let therefore $I$ contain a nonempty word. That is, $C(\mathfrak{g}, \mathfrak{h})$ contains at least one minimal solution. Then $\alpha_g$ and $\alpha_h$ are prefix comparable. By symmetry, we assume $\alpha_h \le_p \alpha_g$, and $\alpha = \alpha_h^{-1}\alpha_g$ is well defined.

2.1. If there is no starting block, then the construction of a solution is deterministic, hence $C(\mathfrak{g}, \mathfrak{h})$ contains a unique minimal solution $(r, s)$. Then $I = \{\beta, \gamma\}^*$ with $\beta = \mathfrak{g}(r) = \mathfrak{h}(s)$ and $\gamma = \varepsilon$.

2.2. Let now the starting block exist, i.e., there exist $(p, q)$ such that $\alpha g(p) = h(q)$. Then each solution $(r, s)$ has a block decomposition $\tau$. We define nonerasing morphisms $e, f : \{\mathbf{0}, \mathbf{1}\}^* \to \{\mathbf{0}, \mathbf{1}\}^*$ such that, for a solution $(r, s)$ with the block decomposition $\tau$, we have $g(e(\tau)) = h(f(\tau))$. Let $T$ be the set of block decompositions of all solutions. That is, let

$$C(\mathfrak{g}, \mathfrak{h}) = \{(p\,e(\tau)\,p^{-1}, q\,f(\tau)\,q^{-1}) \mid \tau \in T\}.$$

Note that at this moment we do not guarantee that $g(e(c)) = h(f(c))$, $c \in \{\mathbf{0}, \mathbf{1}\}$, that is, $(e(c), f(c))$ need not be defined. Because of the existence of at least one minimal solution, we may however assume, by symmetry, that $\zeta\mathbf{0} \in T$ for some $\zeta$. Then $\mathbf{0} \in T$ by Lemma 8, in particular $g(e(\mathbf{0})) = h(f(\mathbf{0}))$.
If (e( ), f( )) is not a letter block, then T = *, and I = {β, γ}* with

  β = g(p e( ) p^{-1}) = h(q f( ) q^{-1}),  γ = ε.

.2.2. Suppose that (e( ), f( )) is a letter block.

If ∈ T, then T = { , }*, and I = {β, γ}* with

  β = g(p e( ) p^{-1}) = h(q f( ) q^{-1}),  γ = g(p e( ) p^{-1}) = h(q f( ) q^{-1}).

If ∉ T, then by Lemma 9, there is a nonnegative integer t such that

  T = ( + ( + + ··· + t )* t+1 )*.

Using Lemma 7 (2), we now have

  I = (β + β(γ(1 + δ + ··· + δ^t))* ϵ)*

where

  β = g(p e( ) p^{-1}) = h(q f( ) q^{-1})
  β = α_g g(p) = α_h h(q)
  γ = g(e( )) = h(f( ))
  δ = g(e( )) = h(f( ))
  ϵ = g(e( t+1 ) p^{-1}) α_g^{-1} = h(f( t+1 ) q^{-1}) α_h^{-1}.

This last case, in which the intersection is infinitely generated, is further specified in [3, Theorem 3]. The generating set is of one of the following forms (we keep the notation of words from [3], although it is not compatible with the notation above; however, we modify integer variables):

  βγ + β(γβ)^t (δ(γβ + ··· + (γβ)^t))* δγ                        (†)
  βγ + β(γβ)^{t+m+1} (δ(γβ + ··· + (γβ)^t))* δ (β(γβ)^m)^{-1}    (††)

for some 0 ≤ m, t, where δ and γβ are nonempty and pref(δ) ≠ pref(γβ). Here

  βγ = g(p e( ) p^{-1}) = h(q f( ) q^{-1})
  δ = g(e( )) = h(f( ))
  γβ = g(e( )) = h(f( ))

and

  α_g g(p) = α_h h(q) = β(γβ)^t          for (†),
  α_g g(p) = α_h h(q) = β(γβ)^{t+m+1}    for (††).

The possibility (††) corresponds to the situation when f′ f( )^m is a suffix of f( ), where f′ is a suffix of f( ) such that q = f′ f( )^{m+t+1} (and β = α_h h(f′)).
In other words, the difference between (†) and (††) is whether f( ) contributes to the eventual occurrence of q as a suffix of f( t+1 ).

We finally illustrate the theory by several examples. The first two are from [3].

Example 1.

  g : ↦ a, ↦ a^m b      α_g = α = a^m      g : ↦ a, ↦ b a^m
  h : ↦ a, ↦ b a^m      α_h = ε            h : ↦ a, ↦ b a^m
  e : ↦ , ↦             p = ε
  f : ↦ , ↦             q = m              t = m
  T = ( + ( + + ··· + m−1 )* m )*
  I = a + (a^m b + a^m b a + ··· + a^m b a^{m−1})* a^m b a^m

Example 2.

  g : ↦ aba, ↦ aab      α_g = α = a        g : ↦ baa, ↦ aba
  h : ↦ a, ↦ baaba      α_h = ε            h : ↦ a, ↦ baaba
  e : ↦ 00, 1 ↦         p = ε
  f : ↦ 10, 1 ↦         q =                t = 1
  T = ( + + )*
  I = (abaaba + (aabaab)^+ abaaba)* = (a(abaaba)* baaba)*

The noteworthy property of the following example is that f( ) is a suffix of f( ). The example therefore illustrates the possibility (††) above.

Example 3.

  g : ↦ aa, ↦ a b       α_g = α = a        g : ↦ aa, ↦ ba
  h : ↦ a, ↦ ba         α_h = ε            h : ↦ a, ↦ ba
  e : ↦ , ↦             p = ε
  f : ↦ 00, 1 ↦         q =                t = 2
  T = ( + ( + ( ) ∗ ) ∗
  I = (aa + (a b + a baa)* a baaaa)*

Example 4.
Finally, Table 1 lists various situations in which the intersection is generated by at most one word. The last line is of particular interest: all three blocks exist, yet the intersection contains only the empty word. Note that (p, q) is not a suffix of (p e(τ), q f(τ)) for any nonempty τ in that case.

Acknowledgments

The authors acknowledge support by the Czech Science Foundation grant GAČR 20-20621S.

  g( )    g( )     h( )   h( )   α     (p, q)   (e( ), f( ))   (e( ), f( ))   I
  aabb    ab       aba    bab    a     ( , )    ×              ( , )          ababab*
  aa      ab       aba    ba     a     ( , )    ×              ( , )          {ε}
  aabb    ab       aba    babb   a     ( , )    ×              ×              {ε}
  aab     aba      aba    baa    a     ×        ( , )          ( , )          aba*
  aab     abb      aba    bba    a     ×        ( , )          ( , )          {ε}
  aabb    ab       abaa   bb     a     ×        ×              ×              abaabb*
  aab     abb      aa     bb     a     ×        ×              ×              {ε}
  aab     abb      aab    bba    a     ×        ×              ( , )          aab*
  aab     abb      aba    bab    a     ×        ( , )          ×              {ε}
  abaab   ababab   a      ba     aba   (ε, )    ( , )          ( , )          {ε}

Table 1.
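The closed forms stated in Example 2 and in the first row of Table 1 can be sanity-checked by brute force on short words. The following Python sketch is not part of the formalization; the helper names in_star and intersection_up_to are hypothetical, and the generators are an assumption based on reading the example as {x, y} = {aba, aab}, {u, v} = {a, baaba}, and the table row as {aabb, ab}, {aba, bab}. It only enumerates words up to a fixed length, so it can support, but never prove, the claimed descriptions.

```python
import re
from itertools import product


def in_star(word, gens):
    """Return True if `word` is a concatenation of words from `gens`."""
    # reachable[i] is True when the prefix word[:i] factorizes over gens.
    reachable = [True] + [False] * len(word)
    for i in range(len(word)):
        if reachable[i]:
            for g in gens:
                if word.startswith(g, i):
                    reachable[i + len(g)] = True
    return reachable[len(word)]


def intersection_up_to(gens1, gens2, max_len, alphabet="ab"):
    """List all words of length <= max_len lying in both gens1* and gens2*."""
    found = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if in_star(w, gens1) and in_star(w, gens2):
                found.append(w)
    return found


# Example 2 (assumed reading): {aba, aab}* ∩ {a, baaba}*.
example2 = intersection_up_to(["aba", "aab"], ["a", "baaba"], 12)
# Claimed closed form (a(abaaba)* baaba)*, written as a regular expression.
claimed = re.compile(r"(a(abaaba)*baaba)*")
print(example2)
print(all(claimed.fullmatch(w) for w in example2))

# First row of Table 1 (assumed reading): {aabb, ab}* ∩ {aba, bab}*.
row1 = intersection_up_to(["aabb", "ab"], ["aba", "bab"], 12)
print(row1)
```

Up to length 12, the first search returns the empty word, abaaba, aabaababaaba and abaabaabaaba, all of which match the claimed expression; the second returns only the powers of ababab. The dynamic-programming membership test keeps the check independent of whether the generating sets are codes.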
References
1. Christian Choffrut and Juhani Karhumäki. Handbook of Formal Languages, vol. 1, chapter Combinatorics of Words, pages 329–438. Springer-Verlag, Berlin, Heidelberg, 1997.
2. Štěpán Holub. Binary intersection revisited. In Robert Mercaş and Daniel Reidenbach, editors, Combinatorics on Words, pages 217–225, Cham, 2019. Springer International Publishing.
3. Juhani Karhumäki. A note on intersections of free submonoids of a free monoid. Semigroup Forum, 29(1):183–205, Dec 1984.
4. Lawrence C. Paulson, Tobias Nipkow, and Makarius Wenzel. From LCF to Isabelle/HOL. Formal Aspects of Computing, 31:675–698, 2019.
5. Štěpán Starosta and Štěpán Holub. Combinatorics on Words Formalized - Binary Intersection Formalized. https://gitlab.com/formalcow/combinatorics-on-words-formalized/-/tree/Binary-Intersection-Formalized