Dissecting Power of a Finite Intersection of Context Free Languages
Josef Rukavicka ∗
August 31, 2020
Mathematics Subject Classification: 68Q45
Abstract
Let $\exp^{k,\alpha}$ denote a tetration function defined as follows: $\exp^{1,\alpha} = 2^{\alpha}$ and $\exp^{k+1,\alpha} = 2^{\exp^{k,\alpha}}$, where $k, \alpha$ are positive integers. Let $\Delta_n$ denote an alphabet with $n$ letters. If $L \subseteq \Delta_n^*$ is an infinite language such that for each $u \in L$ there is $v \in L$ with $|u| < |v| \le \exp^{k,\alpha}|u|$, then we call $L$ a language with the growth bounded by $(k,\alpha)$-tetration.

Given two infinite languages $L_1, L_2 \subseteq \Delta_n^*$, we say that $L_1$ dissects $L_2$ if $|L_1 \cap L_2| = \infty$ and $|(\Delta_n^* \setminus L_1) \cap L_2| = \infty$.

Given a context free language $L$, let $\kappa(L)$ denote the size of the smallest context free grammar $G$ that generates $L$. We define the size of a grammar to be the total number of symbols on the right sides of all production rules.

Given positive integers $n, k$ with $k \ge 2$, we show that there are context free languages $L_1, L_2, \dots, L_{3k-3} \subseteq \Delta_n^*$ with $\kappa(L_i) \le 40k$ such that if $\alpha$ is a positive integer and $L \subseteq \Delta_n^*$ is an infinite language with the growth bounded by $(k,\alpha)$-tetration, then there is a regular language $M$ such that $M \cap \left(\bigcap_{i=1}^{3k-3} L_i\right)$ dissects $L$ and the minimal deterministic finite automaton accepting $M$ has at most $k + \alpha + 3$ states.

∗ Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague ([email protected]).

Introduction
In the theory of formal languages, the regular and the context free languages constitute a fundamental concept that has attracted a lot of attention in the past several decades. Recall that every regular language is accepted by some deterministic finite automaton and every context free language is accepted by some pushdown automaton.

In contrast to regular languages, the context free languages are closed neither under intersection nor under complement. The intersection of context free languages has been systematically studied in [4, 6, 9, 10, 11]. Let
$\mathrm{CFL}_k$ denote the family of all languages such that for each $L \in \mathrm{CFL}_k$ there are $k$ context free languages $L_1, L_2, \dots, L_k$ with $L = \bigcap_{i=1}^{k} L_i$. For each $k$, it has been shown that there is a language $L \in \mathrm{CFL}_{k+1}$ such that $L \notin \mathrm{CFL}_k$. Thus the $k$-intersections of context free languages form an infinite hierarchy in the family of all formal languages lying between context free and context sensitive languages [6].

One of the topics in the theory of formal languages that has been studied is the dissection of infinite languages. Let $\Delta_n$ be an alphabet with $n$ letters, and let $L_1, L_2 \subseteq \Delta_n^*$ be infinite languages. We say that $L_1$ dissects $L_2$ if $|L_1 \cap L_2| = \infty$ and $|(\Delta_n^* \setminus L_1) \cap L_2| = \infty$. Let $C$ be a family of languages. We say that a language $L \subseteq \Delta_n^*$ is $C$-dissectible if there is $L_1 \in C$ such that $L_1$ dissects $L$. Let REG denote the family of regular languages. In [12] the
REG-dissectibility has been investigated. Several families of REG-dissectible languages have been presented. Moreover, it has been shown that there are infinite languages that cannot be dissected with a regular language. Some open questions for REG-dissectibility can also be found in [12]. For example, it is not known whether the complement of a context free language is REG-dissectible.

There is a related longstanding open question in [1]: Given two context free languages $L_1, L_2 \subseteq \Delta_n^*$ such that $L_1 \subset L_2$ and $L_2 \setminus L_1$ is an infinite language, is there a context free language $L_3$ such that $L_3 \subset L_2$, $L_1 \subset L_3$, and both the languages $L_3 \setminus L_1$ and $L_2 \setminus L_3$ are infinite? This question was mentioned also in [12].

Some other results concerning the dissection of infinite languages may be found in [5]. A similar topic is the construction of minimal covers of languages [2]. Recall that a language $L \subseteq \Delta_n^*$ is called $C$-immune if there is no infinite language $L_1 \subseteq L$ such that $L_1 \in C$. Immunity is also related to the dissection of languages; some results on this theme can be found in [3, 7, 11].

Let $\mathbb{N}$ denote the set of all positive integers. An infinite language $L \subseteq \Delta_n^*$ is called constantly growing if there is a constant $c_0 \in \mathbb{N}$ and a finite set $K \subset \mathbb{N}$ such that for each $w \in L$ with $|w| \ge c_0$ there is a word $\bar{w} \in L$ and a constant $c \in K$ such that $|\bar{w}| = |w| + c$. We say also that $L$ is $(c_0, K)$-constantly growing. In [12], it has been proved that every constantly growing language $L$ is REG-dissectible.

We define a tetration function (a repeated exponentiation) as follows: $\exp^{1,\alpha} = 2^{\alpha}$ and $\exp^{j+1,\alpha} = 2^{\exp^{j,\alpha}}$, where $j \in \mathbb{N}$. The tetration function is known as a fast growing function. If $k, \alpha$ are positive integers and $L \subseteq \Delta_n^*$ is an infinite language such that for each $u \in L$ there is $v \in L$ with $|u| < |v| \le \exp^{k,\alpha}|u|$, then we call $L$ a language with the growth bounded by $(k,\alpha)$-tetration.

Let $L \subseteq \Delta_n^*$ be an infinite language with the growth bounded by $(k,\alpha)$-tetration, where $k \ge 2$. In the current article we show that there are:
• an alphabet $\Sigma_{2k-1}$ with $|\Sigma_{2k-1}| = 2k - 1$,
• an erasing alphabetical homomorphism $\upsilon : \Sigma_{2k-1}^* \to \Delta_1^*$,
• a nonerasing alphabetical homomorphism $\pi : \Delta_n^* \to \Delta_1^*$, and
• $3k - 3$ context free languages $L_1, L_2, \dots, L_{3k-3} \subseteq \Sigma_{2k-1}^*$
such that the homomorphic image $\upsilon\left(\bigcap_{i=1}^{3k-3} L_i\right)$ dissects the homomorphic image $\pi(L)$.
Thus we may say that the languages with the growth bounded by a $(k,\alpha)$-tetration are $\mathrm{CFL}_{3k-3}$-dissectible.

We sketch the basic ideas of our proof. Recall that a non-associative word on the letter $z$ is a "well parenthesized" word containing a given number of occurrences of $z$. It is known that the number of non-associative words containing $n + 1$ occurrences of $z$ is equal to the $n$-th Catalan number [8]. For example, for $n = 3$ we have 5 distinct non-associative words: (((zz)z)z), ((zz)(zz)), (z(z(zz))), (z((zz)z)), and ((z(zz))z). Every non-associative word contains the prefix $(^k z$ for some $k \in \mathbb{N}$, where $(^k$ denotes the $k$-th power of the opening bracket. It is easy to verify that there are non-associative words such that $k$ equals "approximately" $\log_2 n$. We construct three context free languages whose intersection accepts such words; we call these words balanced non-associative words. By counting the number of opening brackets of a balanced non-associative word with $n$ occurrences of $z$ we can compute a logarithm of $n$.

Let $\log_2^{(1)} n = \log_2 n$ and $\log_2^{(j+1)} n = \log_2^{(j)}(\log_2 n)$. Our construction can be "chained" so that we construct $3k - 3$ context free languages whose intersection accepts words with $n$ occurrences of $z$ and a prefix $x^j \bar{z}$, where $j$ is equal "approximately" to $\log_2^{(k)} n$ and $\bar{z} \ne x$. If $L$ is a language with the growth bounded by a $(k,\alpha)$-tetration, then the language $\bar{L} = \{ x^j \mid j = \lceil \log_2^{(k)} |w| \rceil \text{ and } w \in L \}$ is constantly growing. Less formally, by means of the intersection of $3k - 3$ context free languages we transform the challenge of dissecting a language with the growth bounded by $(k,\alpha)$-tetration into the challenge of dissecting a constantly growing language. This approach allows us to prove our result.

Let $\mathbb{R}^+$ denote the set of all positive real numbers. Let $B_k = \{x_1, x_2, \dots, x_k\}$ be an ordered alphabet (set) of $k$ distinct opening brackets, and let $\bar{B}_k = \{y_1, y_2, \dots, y_k\}$ be an ordered alphabet (set) of $k$ distinct closing brackets. We define the alphabet $\Sigma_{2k-1} = B_k \cup (\bar{B}_k \setminus \{y_1\})$. The alphabet $\Sigma_{2k-1}$ contains all opening brackets $B_k$ and all the closing brackets except the first one, $\bar{B}_k \setminus \{y_1\}$. It follows that $|\Sigma_{2k-1}| = 2k - 1$.

Let $\epsilon$ denote the empty word. Given a finite alphabet $S$, let $S^+$ denote the set of all finite nonempty words over the alphabet $S$ and let $S^* = S^+ \cup \{\epsilon\}$.

Let $\mathrm{Fac}(w)$ denote the set of all factors of a word $w \in S^*$. We define that $\epsilon, w \in \mathrm{Fac}(w)$; i.e. the empty word and the word $w$ are factors of $w$. Let $\mathrm{Pref}(w) \subseteq \mathrm{Fac}(w)$ denote the set of all prefixes of $w \in S^*$. We define that $\epsilon, w \in \mathrm{Pref}(w)$. Let $\mathrm{Suf}(w) \subseteq \mathrm{Fac}(w)$ denote the set of all suffixes of $w \in S^*$. We define that $\epsilon, w \in \mathrm{Suf}(w)$.

Given a finite alphabet $S$, let $\mathrm{occur}(w, t)$ denote the number of occurrences of the nonempty factor $t \in S^+$ in the word $w \in S^*$; formally $\mathrm{occur}(w, t) = |\{ v \in \mathrm{Suf}(w) \mid t \in \mathrm{Pref}(v) \}|$.

Given two finite alphabets $S_1, S_2$, a homomorphism from $S_1^*$ to $S_2^*$ is a function $\tau : S_1^* \to S_2^*$ such that $\tau(ab) = \tau(a)\tau(b)$, where $a, b \in S_1^+$. It follows that in order to define a homomorphism $\tau$, it suffices to define $\tau(z)$ for every $z \in S_1$; such a definition "naturally" extends to every word $a \in S_1^+$. We say that $\tau$ is a nonerasing alphabetical homomorphism if $\tau(z) \in S_2$ for every $z \in S_1$. We say that $\tau$ is an erasing alphabetical homomorphism if $\tau(z) \in S_2 \cup \{\epsilon\}$ for every $z \in S_1$ and there is at least one $z \in S_1$ such that $\tau(z) = \epsilon$.

Balanced non-associative words
Suppose $k, m \in \mathbb{N}$, where $k, m \ge 2$ and $k \ge m$. To simplify the notation we define $x = x_m$, $y = y_m$, and $z = x_{m-1}$; that is, $x$ denotes the $m$-th opening bracket, $y$ denotes the $m$-th closing bracket, and $z$ denotes the $(m-1)$-th opening bracket.

Let $\mu_{k,m} : \Sigma_{2k-1}^* \to \Sigma_{2k-1}^*$ be an erasing alphabetical homomorphism defined as follows:
• $\mu_{k,m}(z) = z$,
• $\mu_{k,m}(x) = x$,
• $\mu_{k,m}(y) = y$,
• $\mu_{k,m}(a) = \epsilon$, where $a \in \Sigma_{2k-1} \setminus \{x, y, z\}$.
Given a language $L \subseteq \Sigma_{2k-1}^*$, we define the language $\mu_{k,m}(L) = \{ \mu_{k,m}(w) \mid w \in L \}$.

Remark 3.1. For given $k, m$ the erasing alphabetical homomorphism $\mu_{k,m}$ sends all opening and closing brackets from $B_k$ and $\bar{B}_k$ to the empty string with the exception of $x$, $y$, and $z$.

Let $\mathrm{Naw}_{k,m} \subseteq \Sigma_{2k-1}^*$ be the context free language generated by the following context free grammar, where $S$ is the start non-terminal symbol, $N$ is a non-terminal symbol, and $x, y, z, a$ are terminal symbols (letters from $\Sigma_{2k-1}$).
• $S \to N x N S S N y N \mid N x N z N y N \mid N x N z N z N y N$,
• $N \to a N \mid \epsilon$, where $a \in \Sigma_{2k-1} \setminus \{x, y, z\}$.
We call the words from $\mathrm{Naw}_{k,m}$ non-associative words over the opening bracket $x$, the closing bracket $y$, and the letter $z$.

Remark 3.2.
Let $M = \mu_{k,m}(\mathrm{Naw}_{k,m})$. To understand the definition of $\mathrm{Naw}_{k,m}$, note that the language $M$ is generated by the context free grammar defined by $S \to x S S y \mid xzy \mid xzzy$. To see this, just remove the non-terminal symbol $N$ in the definition of $\mathrm{Naw}_{k,m}$. The non-terminal symbol $N$ makes it possible to "insert" between any two letters of a word from $\mu_{k,m}(\mathrm{Naw}_{k,m})$ the words from $K = (\Sigma_{2k-1} \setminus \{x, y, z\})^*$; the set $K$ contains the words from $\Sigma_{2k-1}^*$ that have no occurrence of $x, y, z$. It means that if $w = w_1 w_2 \dots w_n \in \mu_{k,m}(\mathrm{Naw}_{k,m})$, then $t_0 w_1 t_1 w_2 t_2 \dots t_{n-1} w_n t_n \in \mathrm{Naw}_{k,m}$, where $w_i \in \{x, y, z\}$ and $t_i \in K$.

The reason for the name "non-associative words" is the obvious similarity between the words from $M$ and the "standard non-associative words" mentioned in the introduction. Our definition guarantees that $w_1 xzy w_2 \in M$ if and only if $w_1 xzzy w_2 \in M$ for every $w_1, w_2 \in \{x, z, y\}^*$.

Recall that a pushdown automaton is a 6-tuple $(Q, \Delta, \Gamma, q_0, S, \delta)$, where
• $Q$ is a set of states,
• $\Delta$ is an input alphabet,
• $\Gamma$ is a stack alphabet,
• $q_0 \in Q$ is the initial state,
• $S \in \Gamma$ is the initial symbol of the stack,
• $\delta : Q \times \Delta \times \Gamma \to Q \times \Gamma^*$ is a transition function.
We define that a pushdown automaton accepts a word by the empty stack, hence we do not need to define the set of final states.
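As a minimal sketch (an illustration added here, not part of the original construction), the grammar $S \to x S S y \mid xzy \mid xzzy$ from Remark 3.2 can be checked with a small recursive-descent recognizer. The function names `parse_S` and `in_M` are hypothetical, and words are assumed to be plain Python strings over the letters x, y, z.

```python
def parse_S(w: str, i: int = 0) -> int:
    """Try to derive S starting at position i of w.

    Returns the position just after the derived factor, or -1 on failure.
    Every alternative of S -> x S S y | x z y | x z z y starts with x, and
    the letters following x determine which alternative applies, so no
    backtracking is needed.
    """
    if i >= len(w) or w[i] != "x":
        return -1
    if w.startswith("xzy", i):       # leaf with one z
        return i + 3
    if w.startswith("xzzy", i):      # leaf with two z
        return i + 4
    # Remaining alternative: x S S y
    j = parse_S(w, i + 1)
    if j < 0:
        return -1
    j = parse_S(w, j)
    if j < 0 or j >= len(w) or w[j] != "y":
        return -1
    return j + 1


def in_M(w: str) -> bool:
    """True iff w is generated by S -> x S S y | x z y | x z z y."""
    return parse_S(w) == len(w)
```

For instance, `in_M("xxzyxzzyy")` holds, while `in_M("xxzzyy")` does not, because the rule $S \to x S S y$ always requires two inner factors.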
Given a pushdown automaton $g$, let $\mathrm{AL}(g) \subseteq \Delta^*$ denote the language accepted by $g$.

Let $\Lambda_{k,m} = \mathrm{AL}(g_{k,m}) \subseteq \Sigma_{2k-1}^*$ denote the context free language accepted by the pushdown automaton $g_{k,m} = (Q, \Sigma_{2k-1}, \Gamma, q_S, S, \delta)$, where:
• $Q = \{q_S, q_B, q, q_x, q_r\}$,
• $\Gamma = \{S, X\}$,
• $\delta(p, a, u) \to (p, u)$, where $p \in Q$, $u \in \Gamma$, and $a \in \Sigma_{2k-1} \setminus \{x, y, z\}$,
• $\delta(q_S, x, u) \to (q_B, u)$, where $u \in \Gamma$,
• $\delta(q_S, z, u) \to (q_S, u)$, where $u \in \Gamma$,
• $\delta(q_S, y, u) \to (q_S, u)$, where $u \in \Gamma$,
• $\delta(q_B, x, u) \to (q_x, uXX)$, where $u \in \Gamma$,
• $\delta(q_B, z, u) \to (q_S, u)$, where $u \in \Gamma$,
• $\delta(q_B, y, u) \to (q_S, u)$, where $u \in \Gamma$,
• $\delta(q, x, u) \to (q_x, u)$, where $u \in \Gamma$,
• $\delta(q, z, u) \to (q, u)$, where $u \in \Gamma$,
• $\delta(q, y, u) \to (q, u)$, where $u \in \Gamma$,
• $\delta(q_x, x, u) \to (q_x, uX)$, where $u \in \Gamma$,
• $\delta(q_x, z, X) \to (q, \epsilon)$,
• $\delta(q_x, z, S) \to (q_r, X)$,
• $\delta(q_x, y, u) \to (q, u)$, where $u \in \Gamma$, and
• $\delta(q_r, a, u) \to (q_r, u)$, where $a \in \Sigma_{2k-1}$ and $u \in \Gamma$.

Remark 3.3.
Note in the definition of $g_{k,m}$ that the letters from $\Sigma_{2k-1} \setminus \{x, y, z\}$ change neither the state of $g_{k,m}$ nor the stack. Hence, to illuminate the behavior of $g_{k,m}$, we can consider only words over the alphabet $\{x, y, z\}$. Then it is easy to see that the pushdown automaton $g_{k,m}$ pushes $XX$ on the stack on the first occurrence of $xx$. For every other occurrence of $xx$ the pushdown automaton $g_{k,m}$ pushes $X$ on the stack. Once the state $q_x$ has been reached, one $X$ is removed from the stack for every occurrence of $xz$. The state $q_r$ works as a refuse state. Note that after reaching the state $q_r$ the stack is not empty, the stack cannot be changed, and no other state can be reached from $q_r$. The states $q_S$ and $q_B$ make it possible to recognize the first occurrence of $xx$. Once the state $q_x$ is reached, the states $q_S$ and $q_B$ cannot be reached any more.

Thus the pushdown automaton $g_{k,m}$ accepts all words where the number of occurrences of $xz$ after the first occurrence of $xx$ is exactly one more than the number of occurrences of $xx$. Formally, if $w \in \mu_{k,m}(\Sigma_{2k-1}^*)$ then we define $\bar{w}$ as follows:
• If $\mathrm{occur}(w, xx) = 0$ then $\bar{w} = \epsilon$.
• If $\mathrm{occur}(w, xx) \ge 1$ then let $\bar{w} \in \mathrm{Suf}(w)$ be such that $xx \in \mathrm{Pref}(\bar{w})$ and $\mathrm{occur}(\bar{w}, xx) = \mathrm{occur}(w, xx)$.
Clearly $\bar{w}$ is uniquely defined. Then we have that $w \in \mu_{k,m}(\Lambda_{k,m})$ if and only if $\bar{w} = \epsilon$ or $\mathrm{occur}(\bar{w}, xx) + 1 = \mathrm{occur}(\bar{w}, xz)$. It follows that the words without any occurrence of $xx$ are accepted. In the following we will consider the words from the intersection $U = \Lambda_{k,m} \cap \mathrm{Naw}_{k,m}$. Note that there are only two nonempty words $xzy, xzzy \in U$ that have no occurrence of $xx$.

Recall that a "standard" non-associative word can be represented as a full binary rooted tree graph, where every inner node represents a corresponding pair of brackets and every leaf represents the letter $z$ [8]. It is known that the number of inner nodes plus one is equal to the number of leaves in a full binary rooted tree graph.
In the case of non-associative words from $\mathrm{Naw}_{k,m}$, let the leaves represent the factors $xzy$ and $xzzy$. Then the number of occurrences of $xz$ is equal to the number of leaves and the number of occurrences of $xx$ is equal to the number of inner nodes. Hence the intersection $M \cap \mathrm{Naw}_{k,m}$ contains non-associative words that have no "unnecessary" brackets; for example $xzzy, xxzzyy, xxxzzyyy \in \mathrm{Naw}_{k,m}$, $xzzy \in M$ and $xxzzyy, xxxzzyyy \notin M$.

Let
$\mathrm{Bal}_{k,m} \subseteq \Sigma_{2k-1}^*$ be the context free language generated by the following context free grammar, where $S$ is the start non-terminal symbol, $N, K, V, P, T$ are non-terminal symbols, and $x, y, z, a$ are terminal symbols (letters from $\Sigma_{2k-1}$).
• $S \to K V P$,
• $V \to V V \mid N z N \mid N z T z N \mid \epsilon$,
• $T \to N y N T N x N \mid \epsilon$,
• $K \to K K \mid N x N \mid \epsilon$,
• $P \to P P \mid N y N \mid \epsilon$,
• $N \to a N \mid \epsilon$, where $a \in \Sigma_{2k-1} \setminus \{x, y, z\}$.
We call the words from $\mathrm{Bal}_{k,m}$ balanced words.

Remark 3.4.
Let $M = \mu_{k,m}(\mathrm{Bal}_{k,m})$. It is easy to see that the words from the language $M$ contain no factor of the form $z y^i x^j z$, where $i, j$ are distinct positive integers; hence the name "balanced" words. The non-terminal symbols $K, P$ allow a word $w \in M$ to have a prefix $x^i$ and a suffix $y^j$ for arbitrary $i, j \in \mathbb{N} \cup \{0\}$.

The non-terminal symbol $N$ in the definition of $\mathrm{Bal}_{k,m}$ has the same purpose as in the definition of $\mathrm{Naw}_{k,m}$.

Let $\Omega_{k,m} = \mathrm{Naw}_{k,m} \cap \mathrm{Bal}_{k,m} \cap \Lambda_{k,m}$. We call the words from $\Omega_{k,m}$ balanced non-associative words over the opening bracket $x$, the closing bracket $y$, and the letter $z$.

Let $\Omega_{k,m}(n) = \{ w \in \Omega_{k,m} \mid \mathrm{occur}(w, z) = n \}$, where $n \in \mathbb{N}$. The set $\Omega_{k,m}(n)$ contains the balanced non-associative words having exactly $n$ occurrences of the letter $z$.

Given a word $w \in \Sigma_{2k-1}^*$ and $a \in \Sigma_{2k-1}$, let $\mathrm{height}(w, a) = \max\{ j \mid a^j \in \mathrm{Fac}(w) \}$. The height of a word $w$ is the maximal power of the letter $a$ that is a factor of $w$. We show that if $w \in \mu_{k,m}(\Omega_{k,m})$ and $h$ is the height of the opening bracket $x$ in $w$, then $x^h$ is a prefix of $w$ and $y^h$ is a suffix of $w$.

Lemma 3.5. If $w \in \mu_{k,m}(\Omega_{k,m})$ and $h = \mathrm{height}(w, x)$ then $x^h \in \mathrm{Pref}(w)$ and $y^h \in \mathrm{Suf}(w)$.

Proof. Note that $\mu_{k,m}(\Omega_{k,m}) \subseteq \Omega_{k,m}$. Since $\Omega_{k,m} \subseteq \mathrm{Naw}_{k,m}$, there is $\bar{h} \in \mathbb{N}$ such that $x^{\bar{h}} z \in \mathrm{Pref}(w)$. To get a contradiction suppose that $\bar{h} < h$. Because $\Omega_{k,m} \subseteq \mathrm{Bal}_{k,m}$ it follows that $w = x^{\bar{h}} w_1 z y^h x^h z w_2$ for some $w_1 \in \mathrm{Fac}(w)$ with $z \in \mathrm{Pref}(w_1 z)$ and $w_2 \in \mathrm{Suf}(w)$.

Consider the prefix $r = x^{\bar{h}} w_1 z y^h$. Obviously $w_1 z \in \mu_{k,m}(\mathrm{Bal}_{k,m})$. It is easy to see that if $v \in \mu_{k,m}(\mathrm{Bal}_{k,m})$, $x \notin \mathrm{Pref}(v)$, and $y \notin \mathrm{Suf}(v)$ then $\mathrm{occur}(v, x) = \mathrm{occur}(v, y)$. Thus $\mathrm{occur}(w_1 z, x) = \mathrm{occur}(w_1 z, y)$. It follows that $\mathrm{occur}(r, x) < \mathrm{occur}(r, y)$.

This is a contradiction, since for every prefix $v \in \mathrm{Pref}(w)$ of a non-associative word $w \in \mathrm{Naw}_{k,m}$ we have that $\mathrm{occur}(v, x) \ge \mathrm{occur}(v, y)$. We conclude that $\bar{h} = h$ and $x^h \in \mathrm{Pref}(w)$. In an analogous way we can show that $y^h \in \mathrm{Suf}(w)$. This completes the proof.

For a word $w \in \mu_{k,m}(\Omega_{k,m})$, we show the relation between the height of $w$ and the number of occurrences of $z$ in $w$.

Proposition 3.6. If $w \in \mu_{k,m}(\Omega_{k,m})$ and $h = \mathrm{height}(w, x)$ then $2^{h-1} \le \mathrm{occur}(w, z) \le 2^h$.

Proof. We prove the proposition for all $h$ by induction:
• If $h = 0$ then $w = \epsilon$.
• If $h = 1$ then $w \in \{xzzy, xzy\}$.
• If $h = 2$ then $w \in \{xxzyxzyy, xxzzyxzyy, xxzyxzzyy, xxzzyxzzyy\}$.
Thus the proposition holds for $h \le 2$. Since $\Omega_{k,m} \subseteq \Lambda_{k,m}$, clearly we have that if $h \ge 2$ then $w = x w_1 w_2 y$, where $w_1, w_2 \in \mu_{k,m}(\Omega_{k,m})$. Suppose the proposition holds for all $\bar{h} < h$. We prove the proposition holds for $h$.

Let $h_1 = \mathrm{height}(w_1, x)$ and $h_2 = \mathrm{height}(w_2, x)$. Lemma 3.5 implies that $x^{h_1} \in \mathrm{Pref}(w_1)$, $y^{h_1} \in \mathrm{Suf}(w_1)$, $x^{h_2} \in \mathrm{Pref}(w_2)$, and $y^{h_2} \in \mathrm{Suf}(w_2)$. Since $w \in \mu_{k,m}(\mathrm{Bal}_{k,m})$ it follows that $h_1 = h_2$. Because $x^{h_1} \in \mathrm{Pref}(w_1)$ we have that $x^{h_1+1} \in \mathrm{Pref}(w)$. Clearly $\mathrm{occur}(w, x^{h_1+1}) = 1$; note that $\mathrm{occur}(w_1 w_2, x^{h_1+1}) = 0$. Thus $h_1 + 1 = h$. Since we assumed that the proposition holds for all $\bar{h} < h$, we can derive that
$\mathrm{occur}(w, z) = \mathrm{occur}(w_1, z) + \mathrm{occur}(w_2, z) \le 2^{h_1} + 2^{h_2} = 2^{h_1+1} = 2^h$
and
$\mathrm{occur}(w, z) = \mathrm{occur}(w_1, z) + \mathrm{occur}(w_2, z) \ge 2^{h_1-1} + 2^{h_2-1} = 2^{h_1} = 2^{h-1}$.
This completes the proof.

Proposition 3.6 has the following obvious corollary.

Corollary 3.7. If $n \in \mathbb{N}$, $w \in \mu_{k,m}(\Omega_{k,m}(n))$, and $h = \mathrm{height}(w, x)$ then $\log_2 n \le h \le 1 + \log_2 n$.

Given $w, u, v \in \Sigma_{2k-1}^+$, let $\mathrm{replace}(w, v, u)$ denote the word built from $w$ by replacing the first occurrence of $v$ in $w$ by $u$. Formally, if $\mathrm{occur}(w, v) = 0$ then $\mathrm{replace}(w, v, u) = w$. If $\mathrm{occur}(w, v) = j > 0$ and $w = w_1 v w_2$, where $\mathrm{occur}(v w_2, v) = j$, then $\mathrm{replace}(w, v, u) = w_1 u w_2$.

We prove that the set of balanced non-associative words $\Omega_{k,m}(n)$ having $n$ occurrences of $z$ is nonempty for each $n \in \mathbb{N}$.

Lemma 3.8. If $n \in \mathbb{N}$ then $\Omega_{k,m}(n) \ne \emptyset$.

Proof. If $n = 1$ then $xzy \in \Omega_{k,m}(1)$. Given $n \in \mathbb{N}$ with $n > 1$, let $j \in \mathbb{N}$ be such that $2^{j-1} < n \le 2^j$. Obviously such $j$ exists and is uniquely determined. Let $w_1 = xzzy$. Let $w_{i+1} = x w_i w_i y$. Clearly $\mathrm{occur}(w_j, z) = 2^j$ and $w_j \in \Omega_{k,m}(2^j)$. Note that $\mathrm{occur}(w_j, xzzy) = 2^{j-1}$. Let $w_{j,0} = w_j$ and $w_{j,i+1} = \mathrm{replace}(w_{j,i}, xzzy, xzy)$, where $i \in \mathbb{N} \cup \{0\}$ and $i \le 2^{j-1}$. Let $\alpha = 2^j - n$.
Then one can easily verify that $\mathrm{occur}(w_{j,\alpha}, z) = n$ and $w_{j,\alpha} \in \Omega_{k,m}(n)$. Less formally, we construct a balanced non-associative word $w_j$ having $2^{j-1}$ occurrences of $xzzy$ and then we replace a given number of occurrences of $xzzy$ with the factor $xzy$ to achieve the required number of occurrences of $z$. This completes the proof.

Let $\Omega_k = \bigcap_{m=2}^{k} \Omega_{k,m}$ and let $\Omega_k(n) = \{ w \in \Omega_k \mid \mathrm{occur}(w, x_1) = n \}$. We show that for all positive integers $n, k$ with $k \ge 2$ there is a word $w \in \Omega_k$ such that $w$ has $n$ occurrences of the opening bracket $x_1$.

Proposition 4.1. If $k, n \in \mathbb{N}$ and $k \ge 2$ then $\Omega_k(n) \ne \emptyset$.

Proof. Let $h(1) = n$. Let $w_i \in \mu_{k,i}(\Omega_{k,i}(h(i-1)))$ and let $h(i) = \mathrm{height}(w_i, x_i)$, where $i \in \{2, 3, \dots, k\}$. Lemma 3.8 implies that such $w_i$ exist.

Let $v_2 = w_2$. Let $v_{j+1} = \mathrm{replace}(v_j, x_j^{h(j)}, w_{j+1})$, where $j \in \mathbb{N}$ and $j \ge 2$. Lemma 3.5 implies that $x_j^{h(j)} \in \mathrm{Pref}(v_j)$. Note that $\mu_{k,j}(v_{j+1}) = \mu_{k,j}(v_j)$. Then it is quite straightforward to see that $v_k \in \Omega_k$ and $\mathrm{occur}(v_k, x_1) = n$. Less formally, with every iteration we construct a non-associative word by "well parenthesizing" the prefix $x_i^{h(i)}$ with the opening bracket $x_{i+1}$ and the closing bracket $y_{i+1}$. This completes the proof.

To clarify the proof of Proposition 4.1, let us see the following example.

Example 4.2.
Let $n = 23$ and $k = 4$. To make the example easy to read, we define $B_4 = \{ z, (, [, < \}$ and $\bar{B}_4 = \{ \bar{z}, ), ], > \}$. It means that $x_1 = z$, $x_2 = ($, $x_3 = [$, $x_4 = <$, $y_1 = \bar{z}$, $y_2 = )$, $y_3 = ]$, and $y_4 = >$.

To fit the example into the width of the page, we define auxiliary words $u_1$ and $u_2$:
• u_1 = z)(z))((z)(z)))(((z)(z))((z)(z)))),
• u_2 = ((((z)(zz))((zz)(zz)))(((zz)(zz))((zz)(zz))))).
Then we have that
• h(1) = 23; w_2 = ((((( u_1 u_2; h(2) = 5; w_3 = [[[(][(]][[(][((]]];
• h(3) = 3; w_4 = <<[><[[>>; h(4) = 2; v_2 = w_2; v_3 = [[[(][(]][[(][((]]] u_1 u_2;
• v_4 = <<[><[[>>(][(]][[(][((]]] u_1 u_2.
This ends the example.

We define two technical functions $\log_2^{(j)} t$ and $\log_2^{[j]} t$ for all $j \in \mathbb{N}$ and $t \in \mathbb{R}^+$ as follows:
• $\log_2^{(1)} t = \log_2 t$ and $\log_2^{(j+1)} t = \log_2^{(j)}(\log_2 t)$.
• $\log_2^{[1]} t = 1 + \log_2 t$ and $\log_2^{[j+1]} t = \log_2^{[j]}(1 + \log_2 t)$.
It is a simple exercise to prove the following lemma. We omit the proof.

Lemma 4.3. If $j \in \mathbb{N}$ then for each $t \in \mathbb{R}^+$ with $t \ge$ […] we have that $\log_2^{(j)} t \le \log_2^{[j]} t \le j + \log_2^{(j)} t$.

Using the function $\log_2^{(k)} t$ we present an upper and a lower bound for the height of words from $\Omega_k$.

Proposition 4.4. If $k \in \mathbb{N}$ and $k \ge 2$ then for each $w \in \Omega_k$, $h = \mathrm{height}(w, x_k)$, and $n = \mathrm{occur}(w, x_1)$ we have $\log_2^{(k)} n \le h \le k + \log_2^{(k)} n$.

Proof. It follows from Corollary 3.7 that $\log_2^{(k)} n \le h \le \log_2^{[k]} n$. Then the proposition follows from Lemma 4.3.

In [12] it was shown that every constantly growing language can be dissected by some regular language.
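To make the notion concrete, recall from the introduction that in a $(c_0, K)$-constantly growing language every sufficiently long word is followed by a word whose length is larger by some $c \in K$. The following sketch is an illustration only and is not taken from [12]. It works with word lengths alone: it generates a hypothetical length sequence whose consecutive gaps lie in $K$ and tallies the lengths modulo $c + 1$, where $c = \max K$. Consecutive lengths differ by at most $c$, so they can never share a residue modulo $c + 1$, which suggests why a test of the length modulo $c + 1$, something a finite automaton can do, may split such a language. The helper names and the sample set `K` are assumptions.

```python
import random
from collections import Counter


def sample_lengths(start, gaps, count, seed=0):
    """Hypothetical length sequence: each length exceeds the previous one by a gap from `gaps`."""
    rng = random.Random(seed)
    ordered = sorted(gaps)
    lengths = [start]
    for _ in range(count - 1):
        lengths.append(lengths[-1] + rng.choice(ordered))
    return lengths


def residue_counts(lengths, gaps):
    """Count how many lengths fall into each residue class modulo max(gaps) + 1."""
    modulus = max(gaps) + 1
    return Counter(n % modulus for n in lengths)


K = {2, 3, 5}                      # assumed finite set of admissible gaps
modulus = max(K) + 1               # c + 1 with c = max K
lengths = sample_lengths(start=7, gaps=K, count=200)
counts = residue_counts(lengths, K)
```

A finite sample cannot show that two residue classes are infinite; it only illustrates that the lengths keep moving between residue classes.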
Lemma 5.1. (see [12, Lemma […]]) Every infinite constantly growing language is REG-dissectible.

From the proof of this lemma in [12] we can formulate the following lemma.

Lemma 5.2. If $n, c_0 \in \mathbb{N}$, $K \subset \mathbb{N}$, $|K| < \infty$, $c = \max\{ j \in K \}$, and $L \subseteq \Delta_n^*$ is a $(c_0, K)$-constantly growing language then there are $j_1, j_2 \in \{0, 1, 2, \dots, c\}$ such that $j_1 \ne j_2$ and both sets $H_1, H_2$ are infinite, where $H_i = \{ w \mid w \in L \text{ and } |w| \equiv j_i \pmod{c + 1} \}$ and $i \in \{1, 2\}$.

Recall that a deterministic finite automaton $g$ is a 5-tuple $(Q, \Delta, q_0, \delta, F)$, where $Q$ is the set of states, $\Delta$ is an input alphabet, $q_0$ is the initial state, $\delta$ is a transition function, and $F$ is the set of accepting states. Let $\mathrm{AL}(g)$ denote the language accepted by $g$; $\mathrm{AL}(g)$ is a regular language.

We prove that if $L \subseteq \Omega_k$ is an infinite language of balanced non-associative words with the number of occurrences of $x_1$ "bounded" by $(k,\alpha)$-tetration then $L$ can be dissected by a regular language.

Proposition 6.1. If $k, \alpha \in \mathbb{N}$, $k \ge 2$, and $L \subseteq \Omega_k$ is an infinite language such that for each $w_1 \in L$ there is $w_2 \in L$ with $\mathrm{occur}(w_1, x_1) < \mathrm{occur}(w_2, x_1)$ and $\mathrm{occur}(w_2, x_1) \le \exp^{k,\alpha} \mathrm{occur}(w_1, x_1)$ then there is a regular language $R$ such that $R$ dissects $L$ and the minimal deterministic finite automaton accepting $R$ has at most $k + \alpha + 3$ states.

Proof. Let $w_1, w_2 \in L$ be such that
$n_2 \le \exp^{k,\alpha} n_1$, (1)
where $n_1 = \mathrm{occur}(w_1, x_1)$ and $n_2 = \mathrm{occur}(w_2, x_1)$.

Let $h_1 = \mathrm{height}(\mu_{k,k}(w_1), x_k)$ and $h_2 = \mathrm{height}(\mu_{k,k}(w_2), x_k)$. Proposition 4.4 implies that
$\log_2^{(k)} n_1 \le h_1$ and $h_2 \le k + \log_2^{(k)} n_2$. (2)
From (1) and (2) we have that
$h_2 \le k + \log_2^{(k)} n_2 \le k + \log_2^{(k)}(\exp^{k,\alpha} n_1)$. (3)
Realize that $\log_2(\exp^{j,\alpha}) = \exp^{j-1,\alpha}$ and that if $a, b \in \mathbb{R}^+$ and $a, b \ge 2$ then $a + b \le ab$. Then we have that
$\log_2^{(j)}(\exp^{j,\alpha} n_1) = \log_2^{(j-1)}(\exp^{j-1,\alpha} + \log_2 n_1) \le \log_2^{(j-1)}(\exp^{j-1,\alpha} \log_2 n_1)$. (4)
From (4) it follows that
$\log_2^{(k)}(\exp^{k,\alpha} n_1) \le \log_2(\exp^{1,\alpha} \log_2^{(k-1)} n_1) = \alpha + \log_2^{(k)} n_1$. (5)
From (2), (3), and (5) we have that
$h_2 \le k + \alpha + \log_2^{(k)} n_1 \le k + \alpha + h_1$. (6)
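As a numerical sanity check of inequality (5), and not as part of the proof, the following sketch evaluates both sides for concrete values. The function names are hypothetical. The step $a + b \le ab$ needs $a, b \ge 2$, so the check is only expected to succeed when the iterated logarithms of $n_1$ that occur in the chain are at least 2; the last call in the usage note shows a small $n_1$ where it fails.

```python
import math


def tetration(k: int, alpha: int) -> int:
    """exp^{k,alpha}: exp^{1,alpha} = 2**alpha and exp^{j+1,alpha} = 2**exp^{j,alpha}."""
    value = 2 ** alpha
    for _ in range(k - 1):
        value = 2 ** value
    return value


def iter_log2(j: int, t) -> float:
    """log_2^{(j)} t, i.e. log_2 applied j times (t must stay positive)."""
    for _ in range(j):
        t = math.log2(t)
    return t


def check_inequality_5(k: int, alpha: int, n: int) -> bool:
    """Check log_2^{(k)}(exp^{k,alpha} * n) <= alpha + log_2^{(k)} n numerically."""
    lhs = iter_log2(k, tetration(k, alpha) * n)
    rhs = alpha + iter_log2(k, n)
    return lhs <= rhs + 1e-9  # small tolerance for floating-point rounding
```

For example, `check_inequality_5(3, 2, 2**16)` holds, whereas `check_inequality_5(2, 1, 2)` does not, because $\log_2 2 = 1 < 2$.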
The equation (6) says that for each $u \in L$ there is $v \in L$ with $|u| < |v|$ and $\mathrm{height}(\mu_{k,k}(v), x_k) \le k + \alpha + \mathrm{height}(\mu_{k,k}(u), x_k)$.

Lemma 5.2 implies that there are distinct non-negative integers $j_1, j_2 \le k + \alpha$ such that both $H_1, H_2$ are infinite sets, where $H_i = \{ v \mid v \in L \text{ and } \mathrm{height}(\mu_{k,k}(v), x_k) \equiv j_i \pmod{k + \alpha + 1} \}$ and $i \in \{1, 2\}$.

Let $c = k + \alpha$. Consider the deterministic finite automaton $g = (Q, \Sigma_{2k-1}, q_0, \delta, F)$, where
• $Q = \{q_0, q_1, \dots, q_c, q_a, q_r\}$,
• $\delta(q, x) \to (q)$, where $q \in Q$ and $x \in \Sigma_{2k-1} \setminus \{x_k, x_{k-1}\}$,
• $\delta(q_i, x_k) \to (q_{i+1 \bmod c+1})$,
• $\delta(q_{j_1}, x_{k-1}) \to (q_a)$,
• $\delta(q_i, x_{k-1}) \to (q_r)$, where $i \ne j_1$,
• $\delta(q, x) \to (q)$, where $q \in \{q_a, q_r\}$ and $x \in \{x_k, x_{k-1}\}$, and
• $F = \{q_{j_1}, q_a\}$.
The deterministic finite automaton $g$ implements the modulo operation on the prefix of the form $x_k^i$. An input letter $x \in \Sigma_{2k-1} \setminus \{x_k, x_{k-1}\}$ does not change the state. The input letter $x_k$ changes the state from $q_i$ to $q_{i+1 \bmod c+1}$. If the input letter equals $x_{k-1}$ then the state changes either to the accept state $q_a$ or to the refuse state $q_r$. Realize that if $w \in \mu_{k,k}(\Omega_k)$, $a \in \{y_k, x_{k-1}\}$, and $x_k a \in \mathrm{Fac}(w)$ then $a = x_{k-1}$; hence we do not need any "special" transition rule for the letter $y_k$. Once in the state $q_a$ or $q_r$, no other states can be reached. The states $q_{j_1}$ and $q_a$ are the accepting states. It is easy to see that $\mathrm{AL}(g) \cap L = H_1$ and in consequence the regular language $R = \mathrm{AL}(g)$ dissects $L$. This completes the proof.

Given $n \in \mathbb{N}$, let $\Delta_n$ be some alphabet with $n$ letters. Let $\Delta_1 = B_1 = \{x_1\}$ be the alphabet with the "first" opening bracket. Let $L \subseteq \Delta_n^*$ be an infinite language with the growth bounded by $(k,\alpha)$-tetration. Let $\upsilon : \Sigma_{2k-1}^* \to \Delta_1^*$ be an erasing alphabetical homomorphism defined by $\upsilon(x_1) = x_1$ and $\upsilon(a) = \epsilon$, where $a \in \Sigma_{2k-1} \setminus \{x_1\}$. Let $\pi : \Delta_n^* \to \Delta_1^*$ be a nonerasing alphabetical homomorphism defined by $\pi(a) = x_1$ for all $a \in \Delta_n$.
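The automaton $g$ from the proof of Proposition 6.1 above can be simulated directly. The following is a minimal sketch under stated assumptions: letters are encoded as pairs such as `("x", 3)` for $x_3$ and `("y", 2)` for $y_2$, integers $0, \dots, c$ stand for the states $q_0, \dots, q_c$, and the strings `"accept"` and `"refuse"` stand for $q_a$ and $q_r$. The function name `run_dfa` and this encoding are hypothetical.

```python
def run_dfa(word, k, alpha, j1):
    """Simulate the automaton g with c = k + alpha and accepting residue j1.

    Returns True iff g ends in an accepting state (q_{j1} or q_a).
    """
    c = k + alpha
    state = 0                              # initial state q_0
    for letter in word:
        if state in ("accept", "refuse"):
            continue                       # q_a and q_r are never left
        if letter == ("x", k):
            state = (state + 1) % (c + 1)  # count x_k modulo c + 1
        elif letter == ("x", k - 1):
            state = "accept" if state == j1 else "refuse"
        # every other letter leaves the state unchanged
    return state == j1 or state == "accept"


# Usage: k = 2, alpha = 1, so the prefix x_2^5 is counted modulo 4.
sample = [("x", 2)] * 5 + [("x", 1), ("y", 2), ("x", 2), ("x", 1)]
```

With this encoding the automaton has the $c + 1$ counting states together with $q_a$ and $q_r$, which matches the bound of $k + \alpha + 3$ states.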
Note that if $w \in \Delta_n^*$ then $|w| = |\pi(w)|$.

We show that there are $3k - 3$ context free languages $L_1, L_2, \dots, L_{3k-3} \subseteq \Sigma_{2k-1}^*$ such that the homomorphic image $\upsilon\left(\bigcap_{i=1}^{3k-3} L_i\right)$ dissects the homomorphic image $\pi(L)$.

Theorem 6.2. If $n, \alpha, k \in \mathbb{N}$, $k \ge 2$, and $L \subseteq \Delta_n^*$ is an infinite language with the growth bounded by $(k,\alpha)$-tetration then there are $3k - 3$ context free languages $L_1, L_2, \dots, L_{3k-3}$ such that $\upsilon\left(\bigcap_{i=1}^{3k-3} L_i\right)$ dissects $\pi(L)$.

Proof. Recall that the language $\Omega_k$ is an intersection of $3k - 3$ context free languages:
$\Omega_k = \bigcap_{m=2}^{k} (\mathrm{Naw}_{k,m} \cap \mathrm{Bal}_{k,m} \cap \Lambda_{k,m})$.
Let us denote these languages $\tilde{L}_1, \tilde{L}_2, \dots, \tilde{L}_{3k-3}$.

Let $\pi(L) = \{ \pi(w) \mid w \in L \} \subseteq \Delta_1^*$ and let $\bar{L} = \{ w \in \Omega_k \mid \upsilon(w) \in \pi(L) \} \subseteq \Omega_k$. Note that $\bar{L}$ contains $w \in \Omega_k$ if and only if there is $\bar{w} \in L$ such that the number of occurrences of $x_1$ in $w$ is equal to the length of $\bar{w}$; formally $\mathrm{occur}(w, x_1) = |\bar{w}|$.

Since $L$ is a language with the growth bounded by $(k,\alpha)$-tetration, we have that for each $w_1 \in \bar{L}$ there is $w_2 \in \bar{L}$ with $\mathrm{occur}(w_2, x_1) \le \exp^{k,\alpha} \mathrm{occur}(w_1, x_1)$. Then Proposition 6.1 implies that there is a regular language $R$ that dissects $\bar{L}$. It is well known that the intersection of a regular language and a context free language is a context free language. Hence let $L_1 = \tilde{L}_1 \cap R$ and let $L_j = \tilde{L}_j$ for all $j \ge 2$ and $j \le 3k - 3$. Then $\bigcap_{i=1}^{3k-3} L_i$ dissects $\bar{L}$. The theorem follows.

Acknowledgments

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS20/183/OHK4/3T/14.
References

[1] W. Bucher, A density problem for context-free languages, Bull. Eur. Assoc. Theor. Comput. Sci. EATCS 10, (1980).
[2] M. Domaratzki, J. Shallit, and S. Yu, Minimal covers of formal languages, in Developments in Language Theory, 2001.
[3] P. Flajolet and J. M. Steyaert, On sets having only hard subsets, in Automata, Languages and Programming, J. Loeckx, ed., Berlin, Heidelberg, 1974, Springer Berlin Heidelberg, pp. 446–457.
[4] S. Ginsburg and S. Greibach, Deterministic context free languages, Information and Control, 9 (1966), pp. 620–648.
[5] J. Julie, J. Baskar Babujee, and V. Masilamani, Dissecting power of certain matrix languages, in Theoretical Computer Science and Discrete Mathematics, S. Arumugam, J. Bagga, L. W. Beineke, and B. Panda, eds., Cham, 2017, Springer International Publishing, pp. 98–105.
[6] L. Liu and P. Weiner, An infinite hierarchy of intersections of context-free languages, Math. Systems Theory, 7 (1973), pp. 185–192.
[7] E. L. Post, Recursively enumerable sets of positive integers and their decision problems, Bull. Amer. Math. Soc., 50 (1944), pp. 284–316.
[8] R. P. Stanley and S. Fomin, Enumerative Combinatorics, vol. 2 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, 1999.
[9] D. Wotschke, The Boolean Closures of the Deterministic and Nondeterministic Context-Free Languages, Springer Berlin Heidelberg, Berlin, Heidelberg, 1973, pp. 113–121.
[10] D. Wotschke, Nondeterminism and boolean operations in pda's, Journal of Computer and System Sciences, 16 (1978), pp. 456–461.
[11] T. Yamakami, Intersection and union hierarchies of deterministic context-free languages and pumping lemmas, in Language and Automata Theory and Applications, A. Leporati, C. Martín-Vide, D. Shapira, and C. Zandron, eds., Cham, 2020, Springer International Publishing, pp. 341–353.
[12]