The complexity of separation for levels in concatenation hierarchies
Thomas Place
LaBRI, Bordeaux University and IUF, France
Marc Zeitoun
LaBRI, Bordeaux University, France
Abstract
We investigate the complexity of the separation problem associated to classes of regular languages. For a class C, C-separation takes two regular languages as input and asks whether there exists a third language in C which includes the first and is disjoint from the second. First, in contrast with the situation for the classical membership problem, we prove that for most classes C, the complexity of C-separation does not depend on how the input languages are represented: it is the same for nondeterministic finite automata and monoid morphisms. Then, we investigate specific classes belonging to finitely based concatenation hierarchies. It was recently proved that the problem is always decidable for levels 1/2 and 1 of any such hierarchy (with inefficient algorithms). Here, we build on these results to show that when the alphabet is fixed, there are polynomial time algorithms for both levels. Finally, we investigate levels 3/2 and 2 of the famous Straubing-Thérien hierarchy. We show that separation is PSpace-complete for level 3/2 and between PSpace-hard and EXPTime for level 2.
Theory of computation → Formal languages and automata theory
Keywords and phrases
Regular languages, separation, concatenation hierarchies, complexity
Digital Object Identifier
Funding
Both authors acknowledge support from the DeLTA project (ANR-16-CE40-0007).
1 Introduction

For more than 50 years, a significant research effort in theoretical computer science was made to solve the membership problem for regular languages. This problem consists in determining whether a class of regular languages is decidable, that is, whether there is an algorithm inputting a regular language and outputting 'yes' if the language belongs to the investigated class, and 'no' otherwise. Many results were obtained in a long and fruitful line of research. The most prominent one is certainly Schützenberger's theorem [19], which gives such an algorithm for the class of star-free languages. For most interesting classes also, we know precisely the computational cost of the membership problem. As can be expected, this cost depends on the way the input language is given. Indeed, there are several ways to input a regular language. For instance, it can be given by a nondeterministic finite automaton (NFA), or, alternately, by a morphism into a finite monoid. While obtaining an NFA representation from a morphism into a monoid has only a linear cost, the converse direction is much more expensive: from an NFA with n states, the smallest monoid recognizing the same language may have an exponential number of elements (the standard construction yields 2^(n²) elements). This explains why the complexity of the membership problem depends on the representation of the input. For instance, for the class of star-free languages, it is PSpace-complete if one starts from NFAs (and actually, even from DFAs [2]) while it is NL when starting from monoid morphisms.

© Thomas Place and Marc Zeitoun; licensed under Creative Commons License CC-BY. 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018). Editors: Sumit Ganguly and Paritosh Pandya; Article No. 47; pp. 47:1–47:36. Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany.

Recently, another problem, called separation, has replaced membership as the cornerstone in the investigation of regular languages. It takes as input two regular languages instead of one, and asks whether there exists a third language from the class under investigation including the first input language and having empty intersection with the second one. This problem has served recently as a major ingredient in the resolution of difficult membership problems, such as the so-called dot-depth two problem [16], which remained open for 40 years (see [13, 18, 6] for recent surveys on the topic). Dot-depth two is a class belonging to a famous concatenation hierarchy which stratifies the star-free languages: the dot-depth [1]. A specific concatenation hierarchy is built in a generic way. One starts from a base class (level 0 of the hierarchy) and builds increasingly growing classes (called levels and denoted by 1/2, 1, 3/2, 2, . . . ) by alternating two standard closure operations: polynomial and Boolean closure. Concatenation hierarchies account for a significant part of the open questions in this research area. The state of the art regarding separation is captured by only three results [17, 9]: in finitely based concatenation hierarchies (i.e., those whose basis is a finite class), levels 1/2, 1 and 3/2 have decidable separation. Moreover, using specific transfer results [15], this can be pushed to the levels 3/2 and 2 for the two most famous finitely based hierarchies: the dot-depth [1] and the Straubing-Thérien hierarchy [21, 22].

Unlike the situation for membership, and despite these recent decidability results for separability in concatenation hierarchies, the complexity of the problems and of the corresponding algorithms has not been investigated so far (except for the class of piecewise testable languages [3, 11, 5], which is level 1 in the Straubing-Thérien hierarchy).
The aim of this paper is to establish such complexity results. Our contributions are the following:
- We present a generic reduction, which shows that for many natural classes, the way the input is given (by NFAs or finite monoids) has no impact on the complexity of the separation problem. This is proved using two LogSpace reductions from one problem to the other. This situation is surprising and opposite to that of the membership problem, where an exponential blow-up is unavoidable when going from NFAs to monoids.
- Building on the results of [17], we show that when the alphabet is fixed, there are polynomial time algorithms for levels 1/2 and 1 in any finitely based hierarchy.
- We investigate levels 3/2 and 2 of the famous Straubing-Thérien hierarchy, and we show that separation is PSpace-complete for level 3/2 and between PSpace-hard and EXPTime for level 2. The upper bounds are based on the results of [17] while the lower bounds are based on independent reductions.
Organization.
In Section 2, we give preliminary terminology on the objects investigated in the paper. Sections 3, 4 and 5 are then devoted to the three above points. Due to space limitations, many proofs are postponed to the appendix.
2 Preliminaries

In this section, we present the key objects of this paper. We define words and regular languages, classes of languages, the separation problem and finally, concatenation hierarchies.
An alphabet is a finite set A of symbols, called letters. Given some alphabet A, we denote by A⁺ the set of all nonempty finite words and by A∗ the set of all finite words over A (i.e., A∗ = A⁺ ∪ {ε}). If u ∈ A∗ and v ∈ A∗ we write u · v ∈ A∗ or uv ∈ A∗ for the concatenation of u and v. A language over an alphabet A is a subset of A∗. Abusing terminology, if u ∈ A∗ is some word, we denote by u the singleton language {u}. It is standard to extend concatenation to languages: given K, L ⊆ A∗, we write KL = {uv | u ∈ K and v ∈ L}. Moreover, we also consider marked concatenation, which is less standard. Given K, L ⊆ A∗, a marked concatenation of K with L is a language of the form KaL, for some a ∈ A.

We consider regular languages, which can be equivalently defined by regular expressions, nondeterministic finite automata (NFAs), finite monoids or monadic second-order logic (MSO). In the paper, we investigate the separation problem, which takes regular languages as input. Since we are focused on complexity, how we represent these languages in our inputs matters. We shall consider two kinds of representations: NFAs and monoids. Let us briefly recall these objects and fix the terminology (we refer the reader to [7] for details).
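As a toy illustration (our own code, not from the paper), the concatenation and marked concatenation operations above can be computed directly on finite languages, with words represented as Python strings:

```python
# Concatenation KL and marked concatenation KaL on finite languages,
# following the definitions above. Purely illustrative sketch.
def concat(K, L):
    """KL = {uv | u in K and v in L}."""
    return {u + v for u in K for v in L}

def marked_concat(K, a, L):
    """KaL for a single letter a: a marked concatenation of K with L."""
    return {u + a + v for u in K for v in L}
```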
NFAs. An NFA is a tuple A = (A, Q, δ, I, F) where A is an alphabet, Q a finite set of states, δ ⊆ Q × A × Q a set of transitions, I ⊆ Q a set of initial states and F ⊆ Q a set of final states. The language L(A) ⊆ A∗ consists of all words labeling a run from an initial state to a final state. The regular languages are exactly those which are recognized by an NFA. Finally, we write "DFA" for deterministic finite automata, which are defined in the standard way.
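This definition translates directly into code. The following sketch (our own toy implementation; the example automaton is an assumption, not taken from the paper) evaluates membership in L(A) by tracking the set of states reachable along the input word:

```python
# A minimal NFA following the tuple definition (A, Q, delta, I, F).
class NFA:
    def __init__(self, alphabet, states, delta, initial, final):
        self.alphabet = alphabet
        self.states = states
        self.delta = delta        # set of transitions (q, a, r)
        self.initial = initial    # set of initial states
        self.final = final        # set of final states

    def accepts(self, word):
        """Does word label a run from an initial state to a final state?"""
        current = set(self.initial)
        for a in word:
            current = {r for (q, b, r) in self.delta if q in current and b == a}
        return bool(current & set(self.final))

# Example: words over {a, b} containing the factor "ab"; the automaton
# guesses nondeterministically where that factor starts.
contains_ab = NFA({"a", "b"}, {0, 1, 2},
                  {(0, "a", 0), (0, "b", 0), (0, "a", 1),
                   (1, "b", 2), (2, "a", 2), (2, "b", 2)},
                  {0}, {2})
```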
Monoids.
We turn to the algebraic definition of regular languages. A monoid is a set M endowed with an associative multiplication (s, t) ↦ s · t (also denoted by st) having a neutral element 1_M, i.e., such that 1_M · s = s · 1_M = s for every s ∈ M. An idempotent of a monoid M is an element e ∈ M such that ee = e. Observe that A∗ is a monoid whose multiplication is concatenation (the neutral element is ε). Thus, we may consider monoid morphisms α : A∗ → M where M is an arbitrary monoid. Given such a morphism, we say that a language L ⊆ A∗ is recognized by α when there exists a set F ⊆ M such that L = α⁻¹(F). It is well-known that the regular languages are also those which are recognized by a morphism into a finite monoid. When representing a regular language L by a morphism into a finite monoid, one needs to give both the morphism α : A∗ → M (i.e., the image of each letter) and the set F ⊆ M such that L = α⁻¹(F).

A class of languages C is a correspondence A ↦ C(A) which, to an alphabet A, associates a set of languages C(A) over A.

Remark.
When two alphabets A, B satisfy A ⊆ B, the definition of classes does not require C(A) and C(B) to be comparable. In fact, it may happen that a particular language L ⊆ A∗ ⊆ B∗ belongs to C(A) but not to C(B) (or the opposite). For example, we may consider the class C defined by C(A) = {∅, A∗} for every alphabet A. When A ⊊ B, we have A∗ ∈ C(A) while A∗ ∉ C(B).

We say that C is a lattice when for every alphabet A, we have ∅, A∗ ∈ C(A) and C(A) is closed under finite union and finite intersection: for any K, L ∈ C(A), we have K ∪ L ∈ C(A) and K ∩ L ∈ C(A). Moreover, a Boolean algebra is a lattice C which is additionally closed under complement: for any L ∈ C(A), we have A∗ \ L ∈ C(A). Finally, a class C is quotienting if it is closed under quotients. That is, for every alphabet A, L ∈ C(A) and word u ∈ A∗, the following properties hold:

u⁻¹L = {w ∈ A∗ | uw ∈ L} and Lu⁻¹ = {w ∈ A∗ | wu ∈ L} both belong to C(A).
All classes that we consider in the paper are (at least) quotienting lattices consisting of regular languages. Moreover, some of them satisfy an additional property called closure under inverse image. Recall that A∗ is a monoid for any alphabet A. We say that a class C is closed under inverse image if for every two alphabets A, B, every monoid morphism α : A∗ → B∗ and every language L ∈ C(B), we have α⁻¹(L) ∈ C(A). A quotienting lattice (resp. quotienting Boolean algebra) closed under inverse image is called a positive variety (resp. variety).

Separation.
Consider a class of languages C. Given an alphabet A and two languages L₁, L₂ ⊆ A∗, we say that L₁ is C-separable from L₂ when there exists a third language K ∈ C(A) such that L₁ ⊆ K and L₂ ∩ K = ∅. In particular, K is called a separator in C. The C-separation problem is now defined as follows:

Input: An alphabet A and two regular languages L₁, L₂ ⊆ A∗.
Output: Is L₁ C-separable from L₂?

Remark.
Separation generalizes the simpler membership problem, which asks whether a single regular language belongs to C. Indeed, L ∈ C if and only if L is C-separable from A∗ \ L (which is also regular and computable from L).

Most papers on separation are mainly concerned about decidability. Hence, they do not go beyond the above presentation of the problem (see [3, 16, 12, 17] for example). However, this paper specifically investigates complexity. Consequently, we shall need to be more precise and take additional parameters into account. First, it will be important to specify whether the alphabet over which the input languages are defined is part of the input (as above) or a constant. When considering separation for some fixed alphabet A, we shall speak of "C(A)-separation". When the alphabet is part of the input, we simply speak of "C-separation".

Another important parameter is how the two input languages are represented. We shall consider NFAs and monoids. We speak of separation for NFAs and separation for monoids. Note that one may efficiently reduce the latter to the former. Indeed, given a language L ⊆ A∗ recognized by some morphism α : A∗ → M, it is simple to efficiently compute an NFA with |M| states recognizing L (see [7] for example). Hence, we have the following lemma.

Lemma 1.
For any class C, there is a LogSpace reduction from C-separation for monoids to C-separation for NFAs.

Getting an efficient reduction for the converse direction is much more difficult since going from NFAs (or even DFAs) to monoids usually involves an exponential blow-up. However, we shall see in Section 3 that for many natural classes C, this is actually possible.

We now briefly recall the definition of concatenation hierarchies. We refer the reader to [18] for a more detailed presentation. A particular concatenation hierarchy is built from a starting class of languages C, which is called its basis. In order to get robust properties, we restrict C to be a quotienting Boolean algebra of regular languages. The basis is the only parameter in the construction. Once fixed, the construction is generic: each new level is built from the previous one by applying generic operators: either Boolean closure, or polynomial closure. Let us first define these two operators.

Definition.
Consider a class C. We denote by Bool(C) the Boolean closure of C: for every alphabet A, Bool(C)(A) is the least set containing C(A) and closed under Boolean operations. Moreover, we denote by Pol(C) the polynomial closure of C: for every alphabet A, Pol(C)(A) is the least set containing C(A) and closed under union and marked concatenation (if K, L ∈ Pol(C)(A) and a ∈ A, then K ∪ L, KaL ∈ Pol(C)(A)).

Consider a quotienting Boolean algebra of regular languages C. The concatenation hierarchy of basis C is defined as follows. Languages are classified into levels of two kinds: full levels (denoted by 0, 1, 2, . . . ) and half levels (denoted by 1/2, 3/2, 5/2, . . . ). Level 0 is the basis (i.e., C) and for every n ∈ N:
- The half level n + 1/2 is the polynomial closure of the previous full level, i.e., of level n.
- The full level n + 1 is the Boolean closure of the previous half level, i.e., of level n + 1/2.

0 --Pol--> 1/2 --Bool--> 1 --Pol--> 3/2 --Bool--> 2 --Pol--> · · ·

We write N = {0, 1/2, 1, 3/2, 2, . . . } for the set of all possible levels in a concatenation hierarchy. Moreover, for any basis C and n ∈ N, we write C[n] for level n in the concatenation hierarchy of basis C. It is known that every half level is a quotienting lattice and every full level is a quotienting Boolean algebra (see [18] for a recent proof).

We are interested in finitely based concatenation hierarchies: if C is the basis, then C(A) is finite for every alphabet A. Indeed, it was shown in [17] that for such hierarchies separation is always decidable for the levels 1/2 and 1 (in fact, while we do not discuss this in the paper, this is also true for level 3/2, see [9] for a preliminary version). In Section 4, we build on the results of [17] and show that when the alphabet is fixed, this can be achieved in polynomial time for both levels 1/2 and 1. Moreover, we shall also investigate the famous Straubing-Thérien hierarchy in Section 5. Our motivation for investigating this hierarchy in particular is that the results of [17] can be pushed to levels 3/2 and 2 in this special case.
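For a concrete feel of a low level, consider the Straubing-Thérien hierarchy, whose basis maps each alphabet A to {∅, A∗}. Its level 1/2 then consists of finite unions of "monomials" A∗a₁A∗a₂ · · · aₙA∗ (this description is standard; the code below is our own illustration). Membership of a word in a single monomial is just a subsequence test:

```python
# Membership of word in the monomial A* a1 A* a2 ... an A*, i.e. does word
# contain letters[0], letters[1], ... as a (scattered) subsequence?
def in_monomial(word, letters):
    it = iter(word)
    # `a in it` scans the iterator until a is found, consuming it, so
    # successive tests look for the required letters from left to right.
    return all(a in it for a in letters)

# Example: L = A* a A* b A*, words containing an 'a' followed later by a 'b'.
```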
3 Separation for NFAs and monoids
In this section, we investigate how the representation of input languages impacts the complexity of separation. We prove that for many natural classes C (including most of those considered in the paper), C-separation has the same complexity for NFAs as for monoids. Because of these results, we shall be able to restrict ourselves to monoids in later sections.

Remark. This result highlights a striking difference between separation and the simpler membership problem. For most classes C, C-membership is strictly harder for NFAs than for monoids. This is because when starting from an NFA A, typical membership algorithms require to either determinize A or compute a monoid morphism recognizing L(A), which involves an exponential blow-up in both cases. Our results show that the situation differs for separation.

We already have a generic efficient reduction from C-separation for monoids to C-separation for NFAs (see Lemma 1). Here, we investigate the opposite direction: given some class C, is it possible to efficiently reduce C-separation for NFAs to C-separation for monoids? As far as we know, there exists no such reduction which is generic to all classes C.

Remark.
There exists an inefficient generic reduction from separation for NFAs to separation for monoids. Given as input two NFAs A₁, A₂, one may compute monoid morphisms recognizing L(A₁) and L(A₂). This approach is not satisfying as it involves an exponential blow-up: we end up with monoids Mᵢ of size 2^(|Qᵢ|²) where Qᵢ is the set of states of Aᵢ.

Here, we present a set of conditions applying to a pair of classes (C, D). When they are satisfied, there exists an efficient reduction from C-separation for NFAs to D-separation for monoids. By themselves, these conditions are abstract. However, we highlight two concrete applications. First, for every positive variety C, the pair (C, C) satisfies the conditions. Second, for every finitely based concatenation hierarchy of basis C, there exists another finite basis D such that for every n ∈ N, the pair (C[n], D[n]) satisfies the conditions.

We first introduce the notions we need to present the reduction and the conditions required to apply it. Then, we state the reduction itself and its applications.

We fix a special two-letter alphabet E = {0, 1}. For the sake of improved readability, we abuse terminology and assume that when considering an arbitrary alphabet A, it always has empty intersection with E. This is harmless as we may work up to bijective renaming. We exhibit conditions applying to a pair of classes (C, D). Then, we prove that they imply the existence of an efficient reduction from C-separation for NFAs to D-separation for monoids. This reduction is based on a construction which takes as input an NFA A (over some arbitrary alphabet A) and builds a modified version of the language L(A) (over A ∪ E) which is recognized by a "small" monoid. Our conditions involve two kinds of hypotheses:
- First, we need properties related to inverse image: "D must be an extension of C".
- The construction is parametrized by an object called "tagging". We need an algorithm which builds special taggings (with respect to D) efficiently.
We now make these two notions more precise. Let us start with extension.
Extensions.
Consider two classes C and D. We say that D is an extension of C when for every alphabet A, the two following conditions hold:
- If γ : (A ∪ E)∗ → A∗ is the morphism defined by γ(a) = a for a ∈ A and γ(b) = ε for b ∈ E, then for every K ∈ C(A), we have γ⁻¹(K) ∈ D(A ∪ E).
- For every u ∈ E∗, if λ_u : A∗ → (A ∪ E)∗ is the morphism defined by λ_u(a) = au for a ∈ A, then for every K ∈ D(A ∪ E), we have λ_u⁻¹(K) ∈ C(A).
Positive varieties give an important example of extension. Since they are closed under inverse image, it is immediate that for every positive variety C, C is an extension of itself.

Taggings. A tagging is a pair P = (τ : E∗ → T, G) where τ is a morphism into a finite monoid and G ⊆ T. We call |G| the rank of P and |T| its size. Moreover, given an NFA A = (A, Q, δ, I, F), P is compatible with A when the rank |G| is larger than |δ|.

For our reduction, we shall require special taggings. Consider a class D and a tagging P = (τ : E∗ → T, G). We say that P fools D when, for every alphabet A and every morphism α : (A ∪ E)∗ → M into a finite monoid M, if all languages recognized by α belong to Bool(D)(A ∪ E), then there exists s ∈ M such that for every t ∈ G, we have w_t ∈ E∗ which satisfies α(w_t) = s and τ(w_t) = t.

Our reduction requires an efficient algorithm for computing taggings which fool the output class D. Specifically, we say that a class D is smooth when, given as input k ∈ N, one may compute in LogSpace (with respect to k) a tagging of rank at least k which fools D.

Main theorem.
We may now state our generic reduction theorem. The statement has two variants, depending on whether the alphabet is fixed or not.
Theorem 2.
Let C, D be quotienting lattices such that D is smooth and extends C. Then the two following properties hold:
- There is a LogSpace reduction from C-separation for NFAs to D-separation for monoids.
- For every fixed alphabet A, there is a LogSpace reduction from C(A)-separation for NFAs to D(A ∪ E)-separation for monoids.

We have two main applications of Theorem 2, which we present at the end of the section. Let us first describe the reduction. As we explained, we use a construction building a language recognized by a "small" monoid out of an input NFA and a compatible tagging.

Consider an NFA A = (A, Q, δ, I, F) and let P = (τ : E∗ → T, G) be a compatible tagging (i.e., |δ| ≤ |G|). We associate a new language L[A, P] over the alphabet A ∪ E and show that one may efficiently compute a recognizing monoid whose size is polynomial with respect to |Q| and the rank of P (i.e., |G|). The construction involves two steps. We first define an intermediary language K[A, P] over the alphabet A × T and then define L[A, P] from it.

We define K[A, P] ⊆ (A × T)∗ as the language recognized by a new NFA A[P] which is built by relabeling the transitions of A. Note that the definition of A[P] depends on arbitrary linear orders on G and δ. We let A[P] = (A × T, Q, δ[P], I, F) where δ[P] is obtained by relabeling the transitions of A as follows. Given i ≤ |δ|, if (qᵢ, aᵢ, rᵢ) ∈ δ is the i-th transition of A, we replace it with the transition (qᵢ, (aᵢ, tᵢ), rᵢ) ∈ δ[P] where tᵢ ∈ G is the i-th element of G (recall that |δ| ≤ |G| by hypothesis).

Remark.
A key property of A[P] is that, by definition, all transitions are labeled by distinct letters in A × T. This implies that K[A, P] = L(A[P]) is recognized by a monoid of size at most |Q|² + 2.

We may now define the language L[A, P] ⊆ (A ∪ E)∗. Observe that we have a natural map µ : (AE∗)∗ → (A × T)∗. Indeed, consider w ∈ (AE∗)∗. Since A ∩ E = ∅ (recall that this is a global assumption), it is immediate that w admits a unique decomposition w = a₁w₁ · · · aₙwₙ with a₁, . . . , aₙ ∈ A and w₁, . . . , wₙ ∈ E∗. Hence, we may define µ(w) = (a₁, τ(w₁)) · · · (aₙ, τ(wₙ)) ∈ (A × T)∗. Finally, we define:

L[A, P] = E∗ · µ⁻¹(K[A, P]) ⊆ (A ∪ E)∗

We may now state the two key properties of L[A, P] upon which Theorem 2 is based: it is recognized by a small monoid, and the construction is connected to separation.

Proposition 3.
Given an NFA A = (A, Q, δ, I, F) and a compatible tagging P of rank n, one may compute in LogSpace a monoid morphism α : (A ∪ E)∗ → M recognizing L[A, P] and such that |M| ≤ n + |A| × n × (|Q|² + 2).

Proposition 4. Let C, D be quotienting lattices such that D extends C. Consider two NFAs A₁ and A₂ over some alphabet A and let P be a compatible tagging that fools D. Then, L(A₁) is C(A)-separable from L(A₂) if and only if L[A₁, P] is D(A ∪ E)-separable from L[A₂, P].

Let us explain why these two propositions imply Theorem 2. Let C, D be quotienting lattices such that D is smooth and extends C. We show that the second assertion in the theorem holds (the first one is proved similarly). Consider two NFAs A_j = (A, Q_j, δ_j, I_j, F_j) for j = 1, 2. We let k = max(|δ₁|, |δ₂|). Since D is smooth, we may compute (in LogSpace) a tagging P = (τ : E∗ → T, G) of rank |G| ≥ k. Then, we may use Proposition 3 to compute (in LogSpace) monoid morphisms recognizing L[A₁, P] and L[A₂, P]. Finally, by Proposition 4, L(A₁) is C(A)-separable from L(A₂) if and only if L[A₁, P] is D(A ∪ E)-separable from L[A₂, P]. Altogether, this construction is a LogSpace reduction to D-separation for monoids, which concludes the proof.

We now present the two main applications of Theorem 2. We start with the simplest one: positive varieties. Indeed, we have the following lemma.
Lemma 5.
Let C be a positive variety. Then, C is an extension of itself. Moreover, if Bool(C) ≠ REG, then C is smooth.

That a positive variety is an extension of itself is immediate (one uses closure under inverse image). The difficulty is to prove smoothness. We may now combine Theorem 2 with Lemma 5 to get the following corollary.
Corollary 6.
Let C be a positive variety such that Bool(C) ≠ REG. There exists a LogSpace reduction from C-separation for NFAs to C-separation for monoids.

Corollary 6 implies that for any positive variety C, the complexity of C-separation is the same for monoids and NFAs. We illustrate this with an example: the star-free languages.

Example 7.
Consider the star-free languages (SF): for every alphabet A, SF(A) is the least set of languages containing all singletons {a} for a ∈ A and closed under Boolean operations and concatenation. It is folklore and simple to verify that SF is a variety. It is known that SF-membership is in NL for monoids (this is immediate from Schützenberger's theorem [19]). On the other hand, SF-membership is PSpace-complete for NFAs. In fact, it is shown in [2] that PSpace-completeness still holds for deterministic finite automata (DFAs). For SF-separation, we may combine Corollary 6 with existing results to obtain that the problem is in EXPTime and PSpace-hard for both NFAs and monoids. Indeed, the EXPTime upper bound is proved in [14] for monoids and we may lift it to NFAs with Corollary 6. Finally, the PSpace lower bound follows from [2]: SF-membership is PSpace-hard for DFAs. This yields that SF-separation is PSpace-hard for both DFAs and NFAs (by reduction from membership to separation, which is easily achieved in LogSpace when starting from a DFA). Using Corollary 6 again, we get that SF-separation is PSpace-hard for monoids as well.
We turn to our second application: finitely based concatenation hierarchies. Consider a finite quotienting Boolean algebra C. We associate another finite quotienting Boolean algebra C_E which we only define for alphabets of the form A ∪ E (this is harmless: C_E is used as the output class of our reduction). Let A be an alphabet and consider the morphism γ : (A ∪ E)∗ → A∗ defined by γ(a) = a for a ∈ A and γ(0) = γ(1) = ε. We define:

C_E(A ∪ E) = {γ⁻¹(L) | L ∈ C(A)}

It is straightforward to verify that C_E remains a finite quotienting Boolean algebra. Moreover, we have the following lemma.

Lemma 8. Let C be a finite quotienting Boolean algebra. For every n ∈ N, C_E[n] is smooth and an extension of C[n].

In view of Theorem 2, we get the following corollary, which provides a generic reduction for levels within finitely based hierarchies.
Corollary 9.
Let C be a finite basis and n ∈ N. There exists a LogSpace reduction from C[n]-separation for NFAs to C_E[n]-separation for monoids.

4 Fixed-alphabet separation for levels 1/2 and 1

In this section, we present generic complexity results for the fixed alphabet separation problem associated to the lower levels in finitely based concatenation hierarchies. More precisely, we show that for every finite basis C and every alphabet A, C[1/2](A)- and C[1](A)-separation are respectively in NL and in P. These upper bounds hold for both monoids and NFAs: we prove them for monoids and lift the results to
NFAs using the reduction of Corollary 9.
Remark.
We do not present new proofs for the decidability of C[1/2]- and C[1]-separation when C is a finite quotienting Boolean algebra. These are difficult results which are proved in [17]. Instead, we recall the (inefficient) procedures which were originally presented in [17] and carefully analyze and optimize them in order to get the above upper bounds.

For the sake of avoiding clutter, we fix an arbitrary finite quotienting Boolean algebra C and an alphabet A for the section.

The algorithms for C[1/2](A)- and C[1](A)-separation presented in [17] are based on a common sub-procedure. This remains true for the improved algorithms which we present in the paper. In fact, this sub-procedure is exactly what we improve to get the announced upper complexity bounds. We detail this point here. Note that the algorithms require considering special monoid morphisms (called "C-compatible") as input. We first define this notion.

C-compatible morphisms. Since C is finite, one associates a classical equivalence ∼_C defined on A∗. Given u, v ∈ A∗, we write u ∼_C v if and only if u ∈ L ⇔ v ∈ L for all L ∈ C(A). Given w ∈ A∗, we write [w]_C ⊆ A∗ for its ∼_C-class. Since C is a finite quotienting Boolean algebra, ∼_C is a congruence of finite index for concatenation (see [18] for a proof). Hence, the quotient A∗/∼_C is a monoid and the map w ↦ [w]_C a morphism.

Consider a morphism α : A∗ → M into a finite monoid M. We say that α is C-compatible when there exists a monoid morphism s ↦ [s]_C from M to A∗/∼_C such that for every w ∈ A∗, we have [w]_C = [α(w)]_C. Intuitively, the definition means that α "computes" the ∼_C-classes of words in A∗. The following lemma is used to compute C-compatible morphisms (note that the LogSpace bound holds because C and A are fixed).

Lemma 10.
Given two morphisms recognizing regular languages L₁, L₂ ⊆ A∗ as input, one may compute in LogSpace a C-compatible morphism which recognizes both L₁ and L₂.

In view of Lemma 10, we shall assume in this section without loss of generality that our input in separation for monoids is a single C-compatible morphism recognizing the two languages that need to be separated.

Sub-procedure.
Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N. We say that a subset of N is good (for β) when it contains β(A∗) and is closed under multiplication. For every good subset S of N, we associate a subset of M × 2^N. We then consider the problem of deciding whether specific elements belong to it (this is the sub-procedure used in the separation algorithms).

Remark. The set M × 2^N is clearly a monoid for the componentwise multiplication. Hence we may multiply its elements and speak of idempotents in M × 2^N.

An (α, β, S)-tree is an unranked ordered tree. Each node x must carry a label lab(x) ∈ M × 2^N and there are three possible kinds of nodes:
- Leaves: x has no children and lab(x) = (α(w), {β(w)}) for some w ∈ A∗.
- Binary: x has exactly two children x₁ and x₂. Moreover, if (s₁, T₁) = lab(x₁) and (s₂, T₂) = lab(x₂), then lab(x) = (s₁s₂, T) with T ⊆ T₁T₂.
- S-Operation: x has a unique child y. Moreover, the following must be satisfied: the label lab(y) is an idempotent (e, E) ∈ M × 2^N, and lab(x) = (e, T) with T ⊆ E · {t ∈ S | [e]_C = [t]_C} · E.
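The naive way to compute all root labels of (α, β, S)-trees is a least-fixpoint saturation over the two kinds of inner nodes. The sketch below is our own simplification: monoid elements are plain Python values, the class maps s ↦ [s]_C are passed as functions cls_M and cls_N, and we only track the maximal admissible set T at each node, whereas the definition allows any subset of it.

```python
from itertools import product

def root_labels(leaves, mult_M, mult_N, cls_M, cls_N, S):
    """Saturate the set of labels (s, T), with T a frozenset over N.

    `leaves` is the set of leaf labels (alpha(w), frozenset({beta(w)})).
    """
    labels = set(leaves)
    changed = True
    while changed:
        changed = False
        new = set()
        # Binary nodes: from (s1, T1) and (s2, T2), derive (s1 s2, T1 T2).
        for (s1, T1), (s2, T2) in product(labels, repeat=2):
            T = frozenset(mult_N(t1, t2) for t1 in T1 for t2 in T2)
            new.add((mult_M(s1, s2), T))
        # S-operation nodes, applicable to idempotent labels (e, E).
        for (e, E) in labels:
            EE = frozenset(mult_N(x, y) for x in E for y in E)
            if mult_M(e, e) == e and EE == E:
                mid = frozenset(t for t in S if cls_N(t) == cls_M(e))
                T = frozenset(mult_N(mult_N(x, t), y)
                              for x in E for t in mid for y in E)
                new.add((e, T))
        if not new <= labels:
            labels |= new
            changed = True
    return labels
```

Tracking all subsets instead of only the maximal sets is exactly the costly fixpoint discussed next: the number of labels may be exponential in |N|.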
We are interested in deciding whether elements of M × 2^N are the root label of some (α, β, S)-tree. Observe that computing all such elements is easily achieved with a least fixpoint procedure: one starts from the set of leaf labels and saturates this set with operations corresponding to the two kinds of inner nodes. This is the approach used in [17] (actually, the set of all root labels is directly defined as a least fixpoint and (α, β, S)-trees are not considered). However, this is costly since the computed set may have exponential size with respect to |N|. Hence, this approach is not suitable for getting efficient algorithms.

Fortunately, solving C[1/2](A)- and C[1](A)-separation does not require to have the whole set of possible root labels in hand. Instead, we shall only need to consider the elements (s, T) ∈ M × 2^N which are the root label of some tree and such that T is a singleton set. It turns out that these specific elements can be computed efficiently. We state this in the next theorem, which is the key technical result and main contribution of this section.

Theorem 11.
Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N and a good subset S ⊆ N. Given s ∈ M and t ∈ N, one may test in NL with respect to |M| and |N| whether there exists an (α, β, S)-tree with root label (s, {t}).

Theorem 11 is proved in appendix. We only present a brief outline which highlights two propositions about (α, β, S)-trees upon which the theorem is based.

We first define a complexity measure for (α, β, S)-trees. Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N as well as a good subset S ⊆ N. Given an (α, β, S)-tree T, we define the operational height of T as the greatest number h ∈ N such that T contains a branch with h S-operation nodes.

Our first result is a weaker version of Theorem 11. It considers the special case when we restrict ourselves to (α, β, S)-trees whose operational heights are bounded by a constant.

▶ Proposition 12.
Let h ∈ N be a constant and consider two C-compatible morphisms α : A∗ → M and β : A∗ → N and a good subset S ⊆ N. Given s ∈ M and t ∈ N, one may test in NL with respect to |M| and |N| whether there exists an (α, β, S)-tree of operational height at most h and with root label (s, {t}).

Our second result complements the first one: in Theorem 11, it suffices to consider (α, β, S)-trees whose operational heights are bounded by a constant (depending only on the class C and the alphabet A, which are fixed here). Let us first define this constant. Given a finite monoid M, we define the J-depth of M as the greatest number h ∈ N such that one may find h pairwise distinct elements s₁, . . . , s_h ∈ M such that for every i < h, s_{i+1} = x s_i y for some x, y ∈ M.

▶ Remark.
The term “J-depth” comes from Green's relations, which are defined on any monoid [4]. We do not discuss this point here.

Recall that the quotient set A∗/∼C is a monoid. Consequently, it has a J-depth. Our second result is as follows.

▶ Proposition 13.
Let h ∈ N be the J-depth of A∗/∼C. Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N, and a good subset S ⊆ N. Then, for every (s, T) ∈ M × 2^N, the following properties are equivalent:
- (s, T) is the root label of some (α, β, S)-tree.
- (s, T) is the root label of some (α, β, S)-tree whose operational height is at most h.

In view of Proposition 13, Theorem 11 is an immediate consequence of Proposition 12 applied in the special case when h is the J-depth of A∗/∼C.

T. Place and M. Zeitoun 47:11

We now combine Theorem 11 with the results of [17] to get the upper complexity bounds for C[1/2](A)- and C[1](A)-separation that we announced at the beginning of the section.

Application to C[1/2]. Let us first recall the connection between C[1/2]-separation and (α, β, S)-trees. The result is taken from [17].

▶ Theorem 14 ([17]). Let α : A∗ → M be a C-compatible morphism and F₁, F₂ ⊆ M. Moreover, let S = α(A∗) ⊆ M. The two following properties are equivalent:
- α⁻¹(F₁) is C[1/2]-separable from α⁻¹(F₂).
- for every s₁ ∈ F₁ and s₂ ∈ F₂, there exists no (α, α, S)-tree with root label (s₁, {s₂}).

By Theorem 11 and the Immerman–Szelepcsényi theorem (which states that NL = co-NL), it is straightforward to verify that checking whether the second assertion in Theorem 14 holds can be done in NL with respect to |M|. Therefore, the theorem implies that C[1/2](A)-separation for monoids is in NL. This is lifted to NFAs using Corollary 9.

▶
Corollary 15.
For every finite basis C and alphabet A, C[1/2](A)-separation is in NL for both NFAs and monoids.
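The J-depth of a concrete finite monoid can be computed directly from the chain definition given before Proposition 13. A small illustrative sketch (our own; exhaustive search, adequate only for small monoids):

```python
def j_depth(M, mult):
    """J-depth of a finite monoid: the length of the longest sequence
    s1, ..., sh of pairwise distinct elements with s_{i+1} = x * s_i * y
    for some x, y. Depth-first search over chains; exponential in the
    worst case, but the point is only to illustrate the definition."""
    M = list(M)
    # below[s]: all elements of the form x * s * y (those J-below s)
    below = {s: {mult(mult(x, s), y) for x in M for y in M} for s in M}
    best = 0

    def dfs(s, seen):
        nonlocal best
        best = max(best, len(seen))
        for t in below[s]:
            if t not in seen:
                dfs(t, seen | {t})

    for s in M:
        dfs(s, {s})
    return best
```

For instance, on the two-element monoid {0, 1} under multiplication, the chain 1, 0 witnesses J-depth 2, while a finite group has J-depth equal to its size since every element is of the form x s y.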
Application to C[1]. We start by recalling the C[1]-separation algorithm, which is again taken from [17]. In this case, we consider an auxiliary sub-procedure which relies on (α, β, S)-trees.

Consider a C-compatible morphism α : A∗ → M. Observe that M × M is a monoid for the componentwise multiplication. We let β : A∗ → M × M be the morphism defined by β(w) = (α(w), α(w)) for every w ∈ A∗. Clearly, β is C-compatible: given (s, t) ∈ M × M, it suffices to define [(s, t)]_C = [s]_C. Using (α, β, S)-trees, we define a procedure which takes as input a good subset S ⊆ M × M (for β) and outputs a subset Red(α, S) ⊆ S:

Red(α, S) = {(s, t) ∈ S | (s, {(t, s)}) ∈ M × 2^(M×M) is the root label of an (α, β, S)-tree} ⊆ S

It is straightforward to verify that
Red(α, S) remains a good subset of M × M. We now have the following theorem, which is taken from [17].

▶ Theorem 16 ([17]). Let α : A∗ → M be a morphism into a finite monoid and F₁, F₂ ⊆ M. Moreover, let S ⊆ M × M be the greatest subset of α(A∗) × α(A∗) such that Red(α, S) = S. Then, the two following properties are equivalent:
- α⁻¹(F₁) is Bool(Pol(C))-separable from α⁻¹(F₂).
- for every s₁ ∈ F₁ and s₂ ∈ F₂, (s₁, s₂) ∉ S.

Observe that Theorem 11 implies that given an arbitrary good subset S of α(A∗) × α(A∗), one may compute Red(α, S) ⊆ S in P with respect to |M|. Therefore, the greatest subset S of α(A∗) × α(A∗) such that Red(α, S) = S can be computed in P using a greatest fixpoint algorithm. Consequently, Theorem 16 yields that C[1](A)-separation for monoids is in P. Again, this is lifted to NFAs using Corollary 9.

▶
Corollary 17.
For every finite basis C and alphabet A, C[1](A)-separation is in P for both NFAs and monoids.
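The greatest-fixpoint computation used above follows a standard pattern: start from the full set α(A∗) × α(A∗) and iterate the shrinking operator until it stabilizes. A generic sketch (names are ours), with Red abstracted as a parameter:

```python
def greatest_fixpoint(universe, red):
    """Computes the greatest S with red(S) = S, assuming red(S) is a
    subset of S for every S and red is monotone (as holds for the
    Red(alpha, S) sub-procedure). Terminates after at most |universe|
    iterations, since S strictly shrinks until it stabilizes."""
    S = set(universe)
    while True:
        S2 = red(S)
        if S2 == S:
            return S
        S = S2
```

With each call to `red` computable in polynomial time (by Theorem 11, restricted to singleton root labels), the whole loop runs in P.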
In this final section, we consider one of the most famous concatenation hierarchies: the Straubing-Thérien hierarchy [21, 22]. We investigate the complexity of separation for the levels 3/2 and 2.
▶ Remark.
Here, the alphabet is part of the input. For fixed alphabets, these levels can be handled with the generic results presented in the previous section (see Theorem 18 below).

The basis of the Straubing-Thérien hierarchy is the trivial variety ST[0] defined by ST[0](A) = {∅, A∗} for every alphabet A. It is known and simple to verify (using induction) that all half levels are positive varieties and all full levels are varieties.

The complexity of separation for the level one (ST[1]) has already been given a lot of attention. Indeed, this level corresponds to a famous class which was introduced independently from concatenation hierarchies: the piecewise testable languages [20]. It was shown independently in [3] and [11] that ST[1]-separation is in P for NFAs (and therefore for DFAs and monoids as well). Moreover, it was also shown in [5] that the problem is actually P-complete for NFAs and DFAs. Additionally, it is shown in [3] that ST[1/2]-separation is in NL.

In the paper, we are mainly interested in the levels ST[3/2] and ST[2]. Indeed, the Straubing-Thérien hierarchy has a unique property: the generic separation results of [17] apply to these two levels as well. Indeed, these are also the levels 1/2 and 1 in another finitely based hierarchy. Consider the class AT of alphabet testable languages. For every alphabet A, AT(A) is the set of all Boolean combinations of languages A∗aA∗ for a ∈ A. One may verify that AT is a variety and that AT(A) is finite for every alphabet A. Moreover, we have the following theorem, which is due to Pin and Straubing [8] (see [18] for a modern proof).

▶ Theorem 18 ([8]). For every n ∈ N, we have AT[n] = ST[n + 1].

The theorem implies that ST[3/2] = AT[1/2] and ST[2] = AT[1]. Therefore, the results of [17] yield the decidability of separation for both ST[3/2] and ST[2] (the latter is the main result of [17]). As expected, this section investigates the complexity of these two problems.
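Since a language of AT(A) is a Boolean combination of the languages A∗aA∗, membership of a word depends only on the set of letters it contains. A tiny sketch (our own illustration):

```python
def alph(w):
    """Set of letters occurring in the word w."""
    return set(w)

def at_member(pred, w):
    """Membership for an alphabet testable language: `pred` encodes a
    Boolean combination of the languages A* a A* as a predicate on
    alphabets, and w belongs to the language iff pred(alph(w)) holds."""
    return pred(alph(w))

# Example: the language A* a A*  minus  A* b A*  (an 'a' occurs, no 'b' does).
contains_a_not_b = lambda letters: 'a' in letters and 'b' not in letters
```

In particular AT(A) is finite: a language in it is determined by the (finitely many) alphabets its words may use.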
We have the following tight complexity bound for ST[3/2]-separation.

▶ Theorem 19. ST[3/2]-separation is PSpace-complete for both NFAs and monoids.
The PSpace upper bound is proved by building on the techniques introduced in the previous section for handling the level 1/2 of an arbitrary finitely based hierarchy. Indeed, we have ST[3/2] = AT[1/2] by Theorem 18. However, let us point out that obtaining this upper bound requires some additional work: the results of Section 4 apply to the setting in which the alphabet is fixed, which is not the case here. In particular, this is why we end up with a PSpace upper bound instead of the generic NL upper bound presented in Corollary 15. The detailed proof is postponed to the appendix.

In this abstract, we focus on proving that ST[3/2]-separation is PSpace-hard. The proof is presented for NFAs: the result can then be lifted to monoids with Corollary 6, since ST[3/2] is a positive variety (similarly, since ST[1] is a variety, P-completeness for ST[1]-separation can also be lifted to monoids using Corollary 6). We use a LogSpace reduction from the quantified Boolean formula problem (QBF), which is among the most famous PSpace-complete problems.

We first describe the reduction. For every quantified Boolean formula Ψ, we explain how to construct two languages L_Ψ and L'_Ψ. It will be immediate from the presentation that given Ψ as input, one may compute NFAs for L_Ψ and L'_Ψ in LogSpace. Then, we show that this construction is the desired reduction: Ψ is true if and only if L_Ψ is not ST[3/2]-separable from L'_Ψ.

Consider a quantified Boolean formula Ψ and let n be the number of variables it involves. We assume without loss of generality that Ψ is in prenex normal form and that the quantifier-free part of Ψ is in conjunctive normal form (QBF remains PSpace-complete when restricted to such formulas). That is,

Ψ = Qₙxₙ · · · Q₁x₁ ϕ

where x₁, . . . , xₙ are the variables of Ψ, Q₁, . . . , Qₙ ∈ {∃, ∀} are quantifiers and ϕ is a quantifier-free Boolean formula involving the variables x₁, . . . , xₙ which is in conjunctive normal form.

We describe the two regular languages L_Ψ, L'_Ψ by providing regular expressions recognizing them. Let us first specify the alphabet over which these languages are defined. For each variable xᵢ occurring in Ψ, we create two letters that we write xᵢ and x̄ᵢ. Moreover, we let,

X = {x₁, . . . , xₙ} and X̄ = {x̄₁, . . . , x̄ₙ}

Additionally, our alphabet also contains the letters #₁, . . . , #ₙ and $. For 0 ≤ i ≤ n, we define an alphabet Bᵢ. We have:

B₀ = X ∪ X̄ and Bᵢ = X ∪ X̄ ∪ {#₁, . . . , #ᵢ, $}

Our languages are defined over the alphabet Bₙ: L_Ψ, L'_Ψ ⊆ Bₙ∗. They are built by induction: for 0 ≤ i ≤ n we describe two languages Lᵢ, L'ᵢ ⊆ Bᵢ∗ (starting with the case i = 0). The languages L_Ψ, L'_Ψ are then defined as Lₙ, L'ₙ.

Construction of L₀, L'₀. The language L₀ is defined as L₀ = (B₀)∗. The language L'₀ is defined from the quantifier-free Boolean formula ϕ.
Recall that by hypothesis ϕ is in conjunctive normal form: ϕ = ϕ₁ ∧ · · · ∧ ϕ_k where each ϕⱼ is a disjunction of literals. For all j ≤ k, we let Cⱼ ⊆ B₀ = X ∪ X̄ be the following alphabet:
- Given x ∈ X, we have x ∈ Cⱼ if and only if x is a literal in the disjunction ϕⱼ.
- Given x̄ ∈ X̄, we have x̄ ∈ Cⱼ if and only if ¬x is a literal in the disjunction ϕⱼ.

Finally, we define L'₀ = C₁C₂ · · · C_k.

Construction of Lᵢ, L'ᵢ for i ≥ 1. We assume that Lᵢ₋₁, L'ᵢ₋₁ are defined and describe Lᵢ and L'ᵢ. We shall use the two following languages in the construction:

Tᵢ = (#ᵢ xᵢ (Bᵢ₋₁ \ {x̄ᵢ})∗ $ xᵢ)∗ and T̄ᵢ = (#ᵢ x̄ᵢ (Bᵢ₋₁ \ {xᵢ})∗ $ x̄ᵢ)∗

The definition of Lᵢ, L'ᵢ from Lᵢ₋₁, L'ᵢ₋₁ now depends on whether the quantifier Qᵢ is existential or universal.

If Qᵢ is an existential quantifier (i.e. Qᵢ = ∃):

Lᵢ = (#ᵢ (xᵢ + x̄ᵢ) Lᵢ₋₁ $ (xᵢ + x̄ᵢ))∗ #ᵢ
L'ᵢ = (#ᵢ (xᵢ + x̄ᵢ) L'ᵢ₋₁ $ (xᵢ + x̄ᵢ))∗ #ᵢ $ (Tᵢ #ᵢ + T̄ᵢ #ᵢ)

If Qᵢ is a universal quantifier (i.e. Qᵢ = ∀):

Lᵢ = (#ᵢ (xᵢ + x̄ᵢ) Lᵢ₋₁ $ (xᵢ + x̄ᵢ))∗ #ᵢ
L'ᵢ = Tᵢ #ᵢ $ (#ᵢ (xᵢ + x̄ᵢ) L'ᵢ₋₁ $ (xᵢ + x̄ᵢ))∗ #ᵢ $ T̄ᵢ #ᵢ

Finally, L_Ψ, L'_Ψ are defined as the languages Lₙ, L'ₙ ⊆ (Bₙ)∗. It is straightforward to verify from the definition that given Ψ as input, one may compute NFAs for L_Ψ and L'_Ψ in LogSpace. Consequently, it remains to prove that this construction is the desired reduction. We do so in the following proposition.
▶ Proposition 20. For every quantified Boolean formula Ψ, Ψ is true if and only if L_Ψ is not ST[3/2]-separable from L'_Ψ.

Proposition 20 is proved by considering a stronger result which states properties of all the languages Lᵢ, L'ᵢ used in the construction of L_Ψ, L'_Ψ (the argument is an induction on i). While we postpone the detailed proof to the appendix, let us provide a sketch which presents this stronger result.

Proof of Proposition 20 (sketch).
Consider a quantified Boolean formula Ψ. Moreover, let B₀, . . . , Bₙ and Lᵢ, L'ᵢ ⊆ (Bᵢ)∗ be the alphabets and languages defined above. The key idea is to prove a property which makes sense for all languages Lᵢ, L'ᵢ. In the special case when i = n, this property implies Proposition 20.

Consider 0 ≤ i ≤ n. We write Ψᵢ for the sub-formula Ψᵢ := Qᵢxᵢ · · · Q₁x₁ ϕ (with the free variables xᵢ₊₁, . . . , xₙ). In particular, Ψ₀ := ϕ and Ψₙ := Ψ. Moreover, we call “i-valuation” a sub-alphabet V ⊆ Bᵢ such that:
1. #₁, . . . , #ᵢ, $ ∈ V and x₁, x̄₁, . . . , xᵢ, x̄ᵢ ∈ V, and,
2. for every j such that i < j ≤ n, one of the two following properties holds: xⱼ ∈ V and x̄ⱼ ∉ V, or, xⱼ ∉ V and x̄ⱼ ∈ V.

Clearly, an i-valuation corresponds to a truth assignment for all variables xⱼ such that j > i (i.e. those that are free in Ψᵢ): when the first (resp. second) assertion in Item 2 holds, xⱼ is assigned to ⊤ (resp. ⊥). Hence, abusing terminology, we shall say that an i-valuation V satisfies Ψᵢ if Ψᵢ is true when replacing its free variables by the truth values provided by V.

Finally, for 0 ≤ i ≤ n, if V ⊆ Bᵢ is an i-valuation, we let [V] ⊆ V∗ be the following language. Given w ∈ V∗, we have w ∈ [V] if and only if for every j > i either xⱼ ∈ alph(w) or x̄ⱼ ∈ alph(w) (by definition of i-valuations, exactly one of these two properties must hold). Proposition 20 is now a consequence of the following lemma.

▶ Lemma 21.
Consider 0 ≤ i ≤ n. Then, given an i-valuation V, the two following properties are equivalent:
- Ψᵢ is satisfied by V.
- Lᵢ ∩ [V] is not ST[3/2]-separable from L'ᵢ ∩ [V].

Lemma 21 is proved by induction on i using standard properties of the polynomial closure operation (see [18] for example). The proof is postponed to the appendix. Let us explain why the lemma implies Proposition 20.

Consider the special case of Lemma 21 when i = n. Observe that V = Bₙ is an n-valuation (the second assertion in the definition of n-valuations is trivially true since there is no j such that n < j ≤ n). Hence, since Ψ = Ψₙ and L_Ψ, L'_Ψ = Lₙ, L'ₙ, the lemma yields that Ψ is satisfied by V (i.e. Ψ is true) if and only if L_Ψ ∩ [V] is not ST[3/2]-separable from L'_Ψ ∩ [V]. Moreover, we have [V] = (Bₙ)∗ by definition. Hence, we obtain that Ψ is true if and only if L_Ψ is not ST[3/2]-separable from L'_Ψ, which concludes the proof of Proposition 20. ◀

For the level two, there is a gap between the lower and upper bound that we are able to prove. Specifically, we have the following theorem. ▶
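The truth of the prenex/CNF formulas handled by the reduction can be evaluated directly by recursion on the quantifier prefix. A self-contained sketch (ours, not from the paper; clauses are sets of signed literals (variable, polarity), quantifiers listed outermost first):

```python
def qbf_true(quants, clauses, assign=None):
    """Evaluates Psi = Q_n x_n ... Q_1 x_1 phi with phi in CNF.
    quants: [(Q_n, x_n), ..., (Q_1, x_1)] with Q in {'∃', '∀'};
    clauses: list of sets of literals, a literal being (var, polarity).
    Plain exponential-time recursion, in line with PSpace-hardness."""
    if assign is None:
        assign = {}
    if not quants:   # quantifier-free: evaluate the CNF under `assign`
        return all(any(assign[v] == pol for (v, pol) in clause)
                   for clause in clauses)
    (q, x), rest = quants[0], quants[1:]
    branches = (qbf_true(rest, clauses, {**assign, x: b}) for b in (True, False))
    return any(branches) if q == '∃' else all(branches)
```

For instance, with ϕ = (x₁ ∨ ¬x₂) ∧ (¬x₁ ∨ x₂), the formula ∀x₂∃x₁ ϕ is true while ∃x₁∀x₂ ϕ is false.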
Theorem 22. ST[2]-separation is in EXPTime and PSpace-hard for both NFAs and monoids.
Similarly to what happened with ST[3/2]-separation, the EXPTime upper bound is obtained by building on the techniques used in the previous section. Proving PSpace-hardness is achieved using a reduction from ST[3/2]-separation (which is PSpace-hard by Theorem 19). The reduction is much simpler than the one we presented for ST[3/2] above. It is summarized by the following proposition.

▶ Proposition 23. Consider an alphabet A and H₁, H₂ ⊆ A∗. Let B = A ∪ {#, $} with #, $ ∉ A, L₁ = (#H₁A∗$#∗)∗ #H₂A∗$#∗ ⊆ B∗ and L₂ = (#H₁A∗$#∗)∗ ⊆ B∗. The two following properties are equivalent:
- H₁ is ST[3/2]-separable from H₂.
- L₁ is ST[2]-separable from L₂.

Proposition 23 is proved using standard properties of the polynomial and Boolean closure operations. The argument is postponed to the appendix. It is clear that given as input NFAs for two languages H₁, H₂, one may compute NFAs for the languages L₁, L₂ defined in Proposition 23 in LogSpace. Consequently, the proposition yields the desired LogSpace reduction from ST[3/2]-separation for NFAs to ST[2]-separation for NFAs. This proves that ST[2]-separation is PSpace-hard for NFAs (the result can then be lifted to monoids using Corollary 6, since ST[2] is a variety).
We showed several results, all of them raising new questions. First we proved that for many important classes of languages (including all positive varieties), the complexity of separation does not depend on how the input languages are represented. A natural question is whether the technique can be adapted to encompass more classes. In particular, one may define more permissive notions of positive varieties by replacing closure under inverse image by weaker notions. For example, many natural classes are length increasing positive varieties: closure under inverse image only has to hold for length increasing morphisms (i.e., morphisms α : A∗ → B∗ such that |α(w)| ≥ |w| for every w ∈ A∗). For example, the levels of another famous concatenation hierarchy, the dot-depth [1] (whose basis is {∅, {ε}, A⁺, A∗}) are length increasing positive varieties. Can our techniques be adapted for such classes? Let us point out that there exists no example of a natural class C for which separation is decidable and strictly harder for NFAs than for monoids. However, there are classes C for which the question is open (see for example the class of locally testable languages in [10]).

We also investigated the complexity of separation for levels 1/2 and 1 in finitely based concatenation hierarchies. We showed that when the alphabet is fixed, the problems are respectively in NL and P for any such hierarchy. An interesting follow-up question would be to push these results to level 3/2, for which separation is also known to be decidable in any finitely based concatenation hierarchy [9]. A rough analysis of the techniques used in [9] suggests that this requires moving above P.

Finally, we showed that in the famous Straubing-Thérien hierarchy, ST[3/2]-separation is PSpace-complete and ST[2]-separation is in EXPTime and PSpace-hard. Again, a natural question is to analyze ST[5/2]-separation.

References

[1] Janusz A. Brzozowski and Rina S. Cohen. Dot-depth of star-free events. Journal of Computer and System Sciences, 5(1):1–16, 1971.
[2] Sang Cho and Dung T. Huynh. Finite automaton aperiodicity is PSPACE-complete. Theoretical Computer Science, 88(1):99–116, 1991.
[3] Wojciech Czerwiński, Wim Martens, and Tomáš Masopust. Efficient separability of regular languages by subsequences and suffixes. In Proceedings of the 40th International Colloquium on Automata, Languages, and Programming (ICALP'13), pages 150–161. Springer-Verlag, 2013.
[4] James Alexander Green. On the structure of semigroups. Annals of Mathematics, 54(1):163–172, 1951.
[5] Tomáš Masopust. Separability by piecewise testable languages is PTIME-complete. Theoretical Computer Science, 711:109–114, 2018.
[6] Jean-Éric Pin. The dot-depth hierarchy, 45 years later. In The Role of Theory in Computer Science - Essays Dedicated to Janusz Brzozowski, pages 177–202, 2017.
[7] Jean-Éric Pin. Mathematical foundations of automata theory. In preparation, 2018.
[8] Jean-Éric Pin and Howard Straubing. Monoids of upper triangular Boolean matrices. In Semigroups. Structure and Universal Algebraic Problems, volume 39 of Colloquia Mathematica Societatis Janos Bolyal, pages 259–272. North-Holland, 1985.
[9] Thomas Place. Separating regular languages with two quantifier alternations. Unpublished, a preliminary version can be found at https://arxiv.org/abs/1707.03295, 2018.
[10] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by locally testable and locally threshold testable languages. In Proceedings of the 33rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS'13), pages 363–375, Dagstuhl, Germany, 2013. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[11] Thomas Place, Lorijn van Rooijen, and Marc Zeitoun. Separating regular languages by piecewise testable and unambiguous languages. In Proceedings of the 38th International Symposium on Mathematical Foundations of Computer Science (MFCS'13), pages 729–740. Springer-Verlag, 2013.
[12] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. In Proceedings of the Joint Meeting of the 23rd EACSL Annual Conference on Computer Science Logic (CSL'14) and the 29th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'14), pages 75:1–75:10. ACM, 2014.
[13] Thomas Place and Marc Zeitoun. The tale of the quantifier alternation hierarchy of first-order logic over words. SIGLOG News, 2(3):4–17, 2015.
[14] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. Logical Methods in Computer Science, 12(1), 2016.
[15] Thomas Place and Marc Zeitoun. Adding successor: A transfer theorem for separation and covering. Unpublished, a preliminary version can be found at http://arxiv.org/abs/1709.10052, 2017.
[16] Thomas Place and Marc Zeitoun. Going higher in the first-order quantifier alternation hierarchy on words. Unpublished, a preliminary version can be found at https://arxiv.org/abs/1404.6832, 2017.
[17] Thomas Place and Marc Zeitoun. Separation for dot-depth two. In Proceedings of the 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'17), pages 202–213. IEEE Computer Society, 2017.
[18] Thomas Place and Marc Zeitoun. Generic results for concatenation hierarchies. Theory of Computing Systems (ToCS), 2018. Selected papers from CSR'17.
[19] Marcel Paul Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8:190–194, 1965.
[20] Imre Simon. Piecewise testable events. Pages 214–222, 1975.
[21] Howard Straubing. A generalization of the Schützenberger product of finite monoids. Theoretical Computer Science, 13(2):137–150, 1981.
[22] Denis Thérien. Classification of finite monoids: The language approach. Theoretical Computer Science, 14(2):195–208, 1981.
A Appendix to Section 3
In this appendix, we present the missing proofs for the statements of Section 3.
A.1 Proof of Proposition 3
We start with Proposition 3, which is used to build morphisms recognizing the languages we associate to NFAs and tagging pairs. Let us recall the statement.

▶ Proposition 3. Given an NFA A = (A, Q, δ, I, F) and a compatible tagging P of size n, one may compute in LogSpace a monoid morphism α : (A ∪ E)∗ → M recognizing L[A, P] and such that |M| ≤ n + |A| × n² × (|Q|² + 2).

Let P = (τ : E∗ → T, G) (so n = |T|). We construct the morphism α : (A ∪ E)∗ → M recognizing L[A, P] ⊆ (A ∪ E)∗. That it has size |M| ≤ n + |A| × n² × (|Q|² + 2) and can be computed in LogSpace is immediate from the construction.

Recall that L[A, P] is defined from an intermediary language K[A, P] ⊆ (A × T)∗ which is recognized by the NFA A[P]. We first prove the following preliminary result about K[A, P], which uses the fact that, by construction, all transitions in A[P] are labeled by distinct letters in A × T.

▶ Lemma 24.
The language K[A, P] is recognized by a morphism β : (A × T)∗ → N such that the monoid N has size |N| ≤ |Q|² + 2.

Proof.
Recall that A[P] = (A × T, Q, δ[P], I, F) where δ[P] is obtained by relabeling the transitions of A. We let N = Q² ∪ {0_N, 1_N} and equip N with the following multiplication. The elements 0_N and 1_N are respectively a zero and a neutral element. For (q₁, r₁), (q₂, r₂) ∈ Q², we define

(q₁, r₁) · (q₂, r₂) = (q₁, r₂) if r₁ = q₂, and 0_N otherwise.

We now define a morphism β : (A × T)∗ → N. Given (a, t) ∈ A × T, we know by definition that there exists at most one transition in δ[P] whose label is (a, t). Therefore, either there is no such transition and we let β((a, t)) = 0_N, or there exists exactly one pair (q, r) ∈ Q² such that (q, (a, t), r) ∈ δ[P] and we define β((a, t)) = (q, r). One may now verify that β recognizes L(A[P]) = K[A, P]. ◀

Let us briefly recall how L[A, P] ⊆ (A ∪ E)∗ is defined from K[A, P]. We have a map μ : (AE∗)∗ → (A × T)∗ defined as follows. Consider w ∈ (AE∗)∗. Since A ∩ E = ∅, w admits a unique decomposition w = a₁w₁ · · · aₙwₙ with a₁, . . . , aₙ ∈ A and w₁, . . . , wₙ ∈ E∗. We define μ(w) = (a₁, τ(w₁)) · · · (aₙ, τ(wₙ)). Finally, recall that,

L[A, P] = E∗ · μ⁻¹(K[A, P]) ⊆ E∗(AE∗)∗ = (A ∪ E)∗

We may now define the morphism α : (A ∪ E)∗ → M. We let β : (A × T)∗ → N be the morphism given by Lemma 24. Consider the following set M:

M = T ∪ (T × N × A × T)

Note that since |N| ≤ |Q|² + 2, we do have |M| ≤ n + |A| × n² × (|Q|² + 2) as desired. We equip M with the following multiplication. Since M is defined as a union, there are two kinds of elements, which means that we have to consider four cases:
- If t, t' ∈ T, then their multiplication as elements of M is the one in T, i.e. tt'.
- If t ∈ T and (t₁, s, a, t₂) ∈ T × N × A × T, we let,
  t · (t₁, s, a, t₂) = (tt₁, s, a, t₂) and (t₁, s, a, t₂) · t = (t₁, s, a, t₂t).
- If (t₁, s, a, t₂), (t₁', s', a', t₂') ∈ T × N × A × T, we let,
  (t₁, s, a, t₂) · (t₁', s', a', t₂') = (t₁, s · β((a, t₂t₁')) · s', a', t₂').

One may verify that this multiplication is associative and that 1_T ∈ T is a neutral element for M. Finally, we define a morphism α : (A ∪ E)∗ → M as follows. For a ∈ A, we let α(a) = (1_T, 1_N, a, 1_T) ∈ T × N × A × T and for b ∈ E, we let α(b) = τ(b) ∈ T. The following fact can be verified from the definition of α.

▶ Fact 25.
Consider a word u ∈ (A ∪ E)∗. Then, one of the two following properties holds:
- u ∈ E∗ and α(u) = τ(u) ∈ T.
- u = u₀u₁au₂ with u₀ ∈ E∗, u₁ ∈ (AE∗)∗, a ∈ A and u₂ ∈ E∗, and we have α(u) = (τ(u₀), β(μ(u₁)), a, τ(u₂)).

It remains to verify that α recognizes L[A, P]. Since K[A, P] is recognized by β, we have H ⊆ N such that K[A, P] = β⁻¹(H). We define H' ⊆ M as the following set:

H' = {(t₁, s, a, t₂) ∈ T × N × A × T | s · β((a, t₂)) ∈ H} if 1_N ∉ H
H' = {(t₁, s, a, t₂) ∈ T × N × A × T | s · β((a, t₂)) ∈ H} ∪ T if 1_N ∈ H

Since L[A, P] = E∗ · μ⁻¹(K[A, P]) by definition, it can be verified from Fact 25 that L[A, P] = α⁻¹(H'), which concludes the proof.

A.2 Proof of Proposition 4
We first recall Proposition 4. ▶
Proposition 4.
Let C, D be quotienting lattices such that D extends C. Consider two NFAs A₁ and A₂ over some alphabet A and let P be a compatible tagging that fools D. Then, L(A₁) is C(A)-separable from L(A₂) if and only if L[A₁, P] is D(A ∪ E)-separable from L[A₂, P].

We fix A₁ = (A, Q₁, δ₁, I₁, F₁) and A₂ = (A, Q₂, δ₂, I₂, F₂) for the proof. Moreover, we let P = (τ : E∗ → T, G) be the tagging pair which fools D.

There are two directions to prove. First, we assume that L(A₁) is C-separable from L(A₂). We prove that L[A₁, P] is D-separable from L[A₂, P]. Note that this direction is independent from the hypothesis that P fools D. Let K ∈ C(A) be a separator for L(A₁) and L(A₂): L(A₁) ⊆ K and L(A₂) ∩ K = ∅. Consider the morphism γ : (A ∪ E)∗ → A∗ defined by γ(a) = a for a ∈ A and γ(b) = ε for b ∈ E. Since D is an extension of C, we have γ⁻¹(K) ∈ D(A ∪ E) by definition. Moreover, it is straightforward to verify from the definitions of γ, L[A₁, P] and L[A₂, P] that γ⁻¹(K) separates L[A₁, P] from L[A₂, P], which concludes this direction of the proof.

Assume now that L[A₁, P] is D-separable from L[A₂, P]. We show that L(A₁) is C-separable from L(A₂). Let K ∈ D(A ∪ E) which separates L[A₁, P] from L[A₂, P]. Clearly, K ∈ Bool(D)(A ∪ E). Moreover, since D is a quotienting lattice, one may verify that Bool(D) is a quotienting Boolean algebra (quotients commute with Boolean operations). Therefore, it follows from standard results about quotienting Boolean algebras that there exists a morphism α : (A ∪ E)∗ → M into a finite monoid M which recognizes K and such that every language recognized by α belongs to Bool(D) (it suffices to choose α as the “syntactic morphism” of K, see [7] for details). By definition of α and since P fools D, we get the following fact.

▶ Fact 26.
There exists s ∈ M such that for every t ∈ G, we have w_t ∈ E∗ satisfying α(w_t) = s and τ(w_t) = t.

Let u = w_t ∈ E∗ for some arbitrary t ∈ G and consider the morphism λ_u : A∗ → (A ∪ E)∗ defined by λ_u(a) = au ∈ (A ∪ E)∗ for every a ∈ A. Finally, we let K' = λ_u⁻¹(K). Since K ∈ D(A ∪ E) and D is an extension of C, it is immediate that K' ∈ C(A). We now show that K' separates L(A₁) from L(A₂), which concludes the argument.

We concentrate on proving that L(A₁) ⊆ K'. That L(A₂) ∩ K' = ∅ is shown symmetrically and left to the reader. Consider some word v = a₁ · · · aₙ ∈ L(A₁). We show that v ∈ K'. By definition of L[A₁, P], it is straightforward to verify that there exist t₁, . . . , tₙ ∈ G (each depending on the whole word v) such that a₁w_{t₁} · · · aₙw_{tₙ} ∈ L[A₁, P]. Moreover, by definition in Fact 26, we know that α(w_t) = α(u) = s for every t ∈ G. Consequently, we get,

α(a₁w_{t₁} · · · aₙw_{tₙ}) = α(a₁u · · · aₙu) = α(λ_u(v))

Since α recognizes K, which contains a₁w_{t₁} · · · aₙw_{tₙ} (recall that L[A₁, P] ⊆ K), it follows that λ_u(v) ∈ K as well. Finally, this yields v ∈ λ_u⁻¹(K) = K', finishing the proof.

A.3 Proof of Lemma 5
We first recall the statement of Lemma 5. ▶
Lemma 5.
Let C be a positive variety. Then, C is an extension of itself. Moreover, if Bool(C) ≠ REG, then C is smooth.

We fix the positive variety C for the proof. Clearly, C is an extension of itself since positive varieties are closed under inverse image by definition. We now assume that Bool(C) ≠ REG and show that C is smooth: given as input k ∈ N, one may compute in LogSpace (with respect to k) a tagging of rank at least k which fools C. We describe how to construct a tagging of rank k and size polynomial in k; that it can be computed in LogSpace is straightforward to verify and left to the reader. Furthermore, we consider the special case when k = 2^h for some h ≥ 1 (if k is not of this form, it suffices to consider the least h such that k ≤ 2^h). The construction is based on the following preliminary lemma.

▶ Lemma 27.
There exist constants ℓ, m ∈ N such that for every h ≥ 1, there exist an alphabet B, a morphism γ : B∗ → T and F ⊆ T such that:
- |B| ≤ h × ℓ, |T| ≤ m^h and |F| ≥ 2^h.
- for every alphabet A and every morphism α : (A ∪ B)∗ → M into a finite monoid M, if all languages recognized by α belong to Bool(C)(A ∪ B), then there exists s ∈ M such that for every t ∈ F, we have w_t ∈ B∗ which satisfies α(w_t) = s and γ(w_t) = t.

Before we prove Lemma 27, let us use it to finish the construction of smooth taggings. We fix h ≥ 1 and construct a tagging of rank 2^h and size polynomial in 2^h. Let γ : B∗ → T and F ⊆ T be as defined in Lemma 27. We fix some binary encoding of the alphabet B over the two-letter alphabet E, given by the morphism η : B∗ → E∗: for every b ∈ B, η(b) is a distinct word of length ⌈log₂(|B|)⌉.

It is straightforward to build a morphism τ : E∗ → T' which recognizes the languages η(γ⁻¹(s)) for s ∈ T. Moreover, one may verify that it is possible to do so with a monoid T' of size polynomial with respect to |T| and |B|. Therefore, the size of T' is polynomial with respect to 2^h since |B| ≤ h × ℓ and |T| ≤ m^h. One may now verify from our hypothesis on γ that there exists F' ⊆ T' such that |F'| ≥ 2^h and (τ : E∗ → T', F') fools C. This concludes the main proof. It remains to handle Lemma 27.

Proof of Lemma 27.
We start by proving the following fact, which handles the special case when h = 1. We shall use this fact to define the constants ℓ, m ∈ N.

▶ Fact 28.
There exist an alphabet D, a morphism η : D∗ → R and G ⊆ R such that |G| = 2 and for every alphabet A and every morphism α : (A ∪ D)∗ → M into a finite monoid M, if all languages recognized by α belong to Bool(C)(A ∪ D), then there exists s ∈ M such that for every r ∈ G, we have w_r ∈ D∗ which satisfies α(w_r) = s and η(w_r) = r.

Proof.
Since Bool(C) ≠ REG, there exist an alphabet D and a regular language L ⊆ D∗ such that L ∉ Bool(C)(D). Since L is regular, we have a morphism η : D∗ → R into a finite monoid R and X ⊆ R such that L = η⁻¹(X). Since L ∉ Bool(C), it is not Bool(C)-separable from D∗ \ L = η⁻¹(R \ X). This implies the existence of r_1 ∈ X and r_2 ∈ R \ X such that η⁻¹(r_1) is not Bool(C)-separable from η⁻¹(r_2). We let G = {r_1, r_2}. It remains to show that the property described in the fact is satisfied.

Consider a morphism α : (A ∪ D)∗ → M such that every language recognized by α belongs to Bool(C)(A ∪ D). We have to exhibit s ∈ M and w_1, w_2 ∈ D∗ such that α(w_1) = α(w_2) = s, η(w_1) = r_1 and η(w_2) = r_2. Let β : D∗ → M be the restriction of α to D∗. Since Bool(C) is a variety, one may verify that every language recognized by β belongs to Bool(C)(D). Since η⁻¹(r_1) ⊆ D∗ is not Bool(C)-separable from η⁻¹(r_2) ⊆ D∗, it follows that there exists s ∈ M such that β⁻¹(s) intersects both η⁻¹(r_1) and η⁻¹(r_2) (otherwise a separator in Bool(C) would be recognized by β). This exactly says that we have w_1, w_2 ∈ D∗ such that β(w_1) = α(w_1) = β(w_2) = α(w_2) = s, η(w_1) = r_1 and η(w_2) = r_2, finishing the proof. ◀

We fix the tagging η : D∗ → R and G for the remainder of the argument. We define ℓ = |D| and m = |R|. We may now prove Lemma 27. We proceed by induction on h. The case h = 1 has already been handled with Fact 28. Assume now that h ≥
2. Induction applied to h − 1 yields a morphism γ' : (B')∗ → T' and F' ⊆ T' satisfying the two assertions in the lemma. Recall that Bool(C) is a variety by hypothesis. Hence, it is closed under bijective renaming of letters and we may assume without loss of generality that D ∩ B' = ∅. We define the alphabet B as the disjoint union B = B' ∪ D. Moreover, we let T be the monoid T = T' × R equipped with the componentwise multiplication. We let γ : B∗ → T be the morphism such that for every b ∈ B,

γ(b) = (γ'(b), 1_R) if b ∈ B', and γ(b) = (1_{T'}, η(b)) if b ∈ D.

FSTTCS 2018
Finally, we let F = F' × G. Observe that by definition, we have |F| = 2 × |F'| ≥ 2^h. Moreover, |B| = |D| + |B'| ≤ h × ℓ and |T| = |T'| × |R| ≤ m^h. It remains to show that the second assertion in Lemma 27 holds.

We consider an alphabet A and a morphism α : (A ∪ B)∗ → M such that every language recognized by α belongs to Bool(C)(A ∪ B). We have to exhibit s ∈ M such that for every t ∈ F, there exists w_t ∈ B∗ satisfying α(w_t) = s and γ(w_t) = t. By hypothesis on η and γ', we have the following fact.

▶ Fact 29.
There exist two elements s_{B'}, s_D ∈ M which satisfy the following properties:
for every t ∈ F', we have w_t ∈ (B')∗ such that α(w_t) = s_{B'} and γ'(w_t) = t.
for every r ∈ G, we have w_r ∈ D∗ such that α(w_r) = s_D and η(w_r) = r.

Proof.
We prove the existence of s_{B'}; the argument for s_D is symmetrical. Recall that B = B' ∪ D and let β : (A ∪ B')∗ → M be the restriction of α to (A ∪ B')∗. Since Bool(C) is a variety and all languages recognized by α belong to Bool(C)(A ∪ B), it is straightforward to verify that all languages recognized by β belong to Bool(C)(A ∪ B'). Hence, by hypothesis on γ' : (B')∗ → T' and F', we obtain s_{B'} ∈ M such that for every t ∈ F', we have w_t ∈ (B')∗ such that α(w_t) = β(w_t) = s_{B'} and γ'(w_t) = t. ◀

We define s = s_{B'} s_D. It remains to show that s satisfies the desired property. Consider t ∈ F = F' × G. We have t = (t', r) with t' ∈ F' and r ∈ G. Let w_t = w_{t'} w_r. By definition of γ, since w_{t'} ∈ (B')∗ and w_r ∈ D∗, we have

γ(w_t) = γ(w_{t'}) γ(w_r) = (γ'(w_{t'}), 1_R) · (1_{T'}, η(w_r)) = (t', 1_R) · (1_{T'}, r) = (t', r) = t

Moreover, α(w_t) = α(w_{t'}) α(w_r) = s_{B'} s_D = s. This concludes the proof. ◀
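The componentwise product T = T' × R and the letter-by-letter definition of γ used above are easy to make concrete. The following Python sketch is illustrative only: the components T' = Z/2Z and R = Z/3Z and all the names are our own assumptions, standing in for the abstract monoids of Lemma 27. It builds γ from γ' and η exactly as in the case distinction above, and checks that γ is multiplicative, i.e. that γ(uv) = γ(u) · γ(v).

```python
from itertools import product

# Illustrative components (assumptions): T' = Z/2Z and R = Z/3Z, written additively.
gamma_prime = {"a": 0, "b": 1}   # gamma' : (B')* -> T',  B' = {a, b}
eta = {"c": 1, "d": 2}           # eta    : D*    -> R,   D  = {c, d}

def mult(x, y):
    # Componentwise multiplication on T = T' x R.
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

IDENTITY = (0, 0)  # the pair (1_{T'}, 1_R)

def gamma_letter(b):
    # gamma(b) = (gamma'(b), 1_R) if b in B', and (1_{T'}, eta(b)) if b in D.
    if b in gamma_prime:
        return (gamma_prime[b], 0)
    return (0, eta[b])

def gamma(w):
    # Extend gamma to words: evaluate letters left to right in T.
    out = IDENTITY
    for b in w:
        out = mult(out, gamma_letter(b))
    return out

# gamma is a morphism: gamma(uv) = gamma(u) . gamma(v) for all words u, v.
for u, v in product(["", "a", "cd", "abd"], repeat=2):
    assert gamma(u + v) == mult(gamma(u), gamma(v))

# A word w_t = w_{t'} w_r with w_{t'} in (B')* and w_r in D* evaluates to
# (gamma'(w_{t'}), eta(w_r)), exactly as in the computation above.
print(gamma("ab" + "cc"))  # prints (1, 2)
```

The design point illustrated here is the one used in the proof: because letters of B' act trivially on the R component and letters of D act trivially on the T' component, the two coordinates of γ(w_{t'} w_r) can be read off independently.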
A.4 Proof of Lemma 8
We now prove Lemma 8. Let us first recall the statement.

▶ Lemma 8.
Let C be a finite quotienting Boolean algebra. For every n ∈ N, C_E[n] is smooth and an extension of C[n].

We fix the finite quotienting Boolean algebra C for the proof. We start by proving that C_E[n] is smooth for every n ∈ N. Let k ∈ N; we describe a tagging of rank k. We let T_k = {t_0, ..., t_{k−1}} be the monoid whose multiplication is defined by t_i t_j = t_{i+j mod k} for i, j ≤ k − 1 (i.e., T_k is isomorphic to Z/kZ). We now consider the morphism τ_k : E∗ → T_k defined by τ_k(0) = τ_k(1) = t_1 (i.e., τ_k counts the length of words modulo k). Clearly, the tagging (τ_k : E∗ → T_k, T_k) has rank k and can be computed in LogSpace. Moreover, the following lemma can be verified from the definition of C_E and that of concatenation hierarchies (the proof is left to the reader).

▶ Lemma 30.
For every k ∈ N and every n ∈ N, the tagging (τ_k : E∗ → T_k, T_k) fools C_E[n].

Altogether, we obtain that C_E[n] is smooth for every n ∈ N. It remains to show that C_E[n] is an extension of C[n] for every n ∈ N. Both conditions involved in extensions are verified using induction on n (this amounts to proving that they are preserved by polynomial and Boolean closure). The arguments are straightforward and left to the reader.

B Appendix to Section 4
In this appendix, we present the missing proofs of Section 4. Let us first take care of Lemma 10. Recall that in this section, an arbitrary alphabet A and a finite quotienting Boolean algebra C are fixed.

B.1 Proof of Lemma 10
Let us first recall the statement of Lemma 10.

▶ Lemma 10.
Given two morphisms recognizing regular languages L_1, L_2 ⊆ A∗ as input, one may compute in LogSpace a C-compatible morphism which recognizes both L_1 and L_2.

We let α_1 : A∗ → M_1 and α_2 : A∗ → M_2 be the morphisms recognizing L_1 and L_2. Recall that the relation ∼_C associated to C is a congruence over A∗ for word concatenation (∼_C compares words which belong to the same languages in C). Therefore, the quotient set A∗/∼_C is a monoid (we write "•" for its multiplication) and the map w ↦ [w]_C, which maps each word to its ∼_C-class, is a monoid morphism.

We let M = M_1 × M_2 × (A∗/∼_C) be the monoid equipped with the componentwise multiplication. Moreover, we let β : A∗ → M be the morphism defined by β(w) = (α_1(w), α_2(w), [w]_C). Clearly, β recognizes both L_1 and L_2. Moreover, β is C-compatible: given s = (s_1, s_2, D) ∈ M, it suffices to define [s]_C = D. It is then immediate that the two axioms in the definition of C-compatibility are satisfied:
Given w ∈ A∗, we have [β(w)]_C = [w]_C.
Given s, s' ∈ M, [ss']_C = [s]_C • [s']_C.
Finally, it is clear that β can be computed in LogSpace from α_1 and α_2.

▶ Remark.
It is important here that the alphabet A is fixed. This implies that the size of the monoid A∗/∼_C is a constant. When A is a parameter, it may not be possible to compute β in LogSpace (this depends on C).

B.2 Proof of Proposition 12
We actually prove a statement which is slightly stronger than Proposition 12 (this is required to use induction in the proof). It is as follows.

▶ Proposition 31.
Let h, m ∈ N be constants. Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N and a good subset S ⊆ N. Given s ∈ M and T ⊆ N such that |T| ≤ m, one may test in NL with respect to |M| and |N| whether there exists an (α, β, S)-tree of operational height at most h and with root label (s, T).

Clearly, Proposition 12 is the special case of Proposition 31 when m = 1. Hence, we may concentrate on proving Proposition 31. Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N and a good subset S ⊆ N. Given h, m ∈ N, we shall write X_{h,m} ⊆ M × 2^N for the set of all elements (s, T) ∈ M × 2^N such that |T| ≤ m and (s, T) is the root label of an (α, β, S)-tree of operational height at most h.

We have to show that when h and m are fixed, one may test in NL with respect to |M| and |N| whether some input pair (s, T) ∈ M × 2^N belongs to X_{h,m}. We proceed by induction on h. When h = 0, (α, β, S)-trees of operational height 0 contain only leaves and binary nodes. Therefore, one may verify from the definition that their labels are always of the form (α(w), {β(w)}) for some w ∈ A∗. Consequently, the problem of deciding whether (s, T) belongs to X_{h,m} amounts to verifying that T is a singleton {t} and that there exists w ∈ A∗ such that α(w) = s and β(w) = t. This is easily achieved in NL. We now assume that h ≥
1. We introduce an auxiliary set Y_{h,m} ⊆ M × 2^N. Given (s, T) ∈ M × 2^N, we have (s, T) ∈ Y_{h,m} when |T| ≤ m and one of the two following conditions holds:
(s, T) ∈ X_{h−1,m}, or
(s, T) is the root label of an (α, β, S)-tree having operational height h and whose root is an S-operation node (i.e., the unique child of the root has operational height h − 1).
We have the following lemma.

▶ Lemma 32.
Let s ∈ M and T ⊆ N. One may test in NL with respect to |M| and |N| whether (s, T) ∈ Y_{h,m}.

Proof.
It suffices to verify that, given as input (s, T) ∈ M × 2^N such that |T| ≤ m, one may check in NL whether one of the two conditions in the definition of Y_{h,m} is satisfied. Testing whether (s, T) ∈ X_{h−1,m} can be achieved in NL by induction on h −
1. For the second condition, we know that the two following properties are equivalent:
(s, T) is the root label of an (α, β, S)-tree having operational height h and whose root is an S-operation node.
there exists an (α, β, S)-tree having operational height h − 1 whose root label (e, E) is an idempotent satisfying e = s and T ⊆ E · {t ∈ S | [e]_C = [t]_C} · E.
Since |T| ≤ m, it is straightforward to verify that the second assertion is satisfied if and only if E can be chosen such that |E| ≤ m (i.e., (e, E) ∈ X_{h−1,m}). Hence, the second condition can be checked in NL by induction, which concludes the proof. ◀

Moreover, the next lemma is immediate from the definition of (α, β, S)-trees of operational height h and a pigeonhole principle argument.

▶ Lemma 33.
Let (s, T) ∈ M × 2^N. Then, (s, T) ∈ X_{h,m} if and only if there exist ℓ ≤ |M| × |N|^m and ℓ elements (r_1, T_1), ..., (r_ℓ, T_ℓ) ∈ Y_{h,m} such that s = r_1 ⋯ r_ℓ and T ⊆ T_1 ⋯ T_ℓ.

It is now immediate from Lemmas 32 and 33 that one may test in NL with respect to |M| and |N| whether some input pair (s, T) ∈ M × 2^N belongs to X_{h,m}. This concludes the proof.

B.3 Proof of Proposition 13
Let us first recall the statement of Proposition 13.

▶ Proposition 13.
Let h ∈ N be the J-depth of A∗/∼_C. Consider two C-compatible morphisms α : A∗ → M and β : A∗ → N, and a good subset S ⊆ N. Then, for every (s, T) ∈ M × 2^N, the following properties are equivalent:
1. (s, T) is the root label of some (α, β, S)-tree.
2. (s, T) is the root label of some (α, β, S)-tree whose operational height is at most h.

We fix h as the J-depth of A∗/∼_C. Moreover, we let α : A∗ → M and β : A∗ → N be two C-compatible morphisms and fix S ⊆ N as a good subset. The direction 2) ⇒ 1) is immediate. We concentrate on 1) ⇒ 2): given (s, T) ∈ M × 2^N and an (α, β, S)-tree T whose root label is (s, T), we explain how to construct a second tree with the same root label and whose operational height is bounded by h. For the proof, we call operational size of an (α, β, S)-tree the total number of operation nodes it contains (clearly, this number is always at least the operational height). The result is a consequence of the following lemma.

▶ Lemma 34.
Consider an (α, β, S)-tree T and assume that it contains a branch with two distinct operation nodes x and x' whose labels (s, T) and (s', T') satisfy [s]_C = [s']_C. Then, there exists a second tree T' with strictly smaller operational size than T and with the same root label.

Starting from an arbitrary (α, β, S)-tree T, one may use Lemma 34 recursively to build T' which has the same label as T and such that for any two operation nodes x and x' on the same branch of T', their labels (s, T) and (s', T') satisfy [s]_C ≠ [s']_C. Clearly, this tree T' has operational height bounded by h (by definition of h as the J-depth of A∗/∼_C). This concludes the proof of the implication 1) ⇒
2) in Proposition 13.

We now concentrate on proving Lemma 34. We let T and x ≠ x' be the nodes defined in the lemma. Since x, x' are on the same branch, one is an ancestor of the other. By symmetry, we assume that x is an ancestor of x'. We let S be the subtree of T which is rooted in x. We let (s, T) be the label (s, T) = lab(S) = lab(x). We build a new tree S' with the same label as S and strictly smaller operational size. It will then be simple to build the desired tree T' by replacing the subtree S with S' in T.

Given two nodes z, z' of S, we write z < z' to denote the fact that z is a (strict) ancestor of z'. By hypothesis, we have x < x', hence we may consider the sequence of operation nodes which are between the two. We let x_1, ..., x_k be the sequence of all nodes which satisfy the following properties:
For all i, x_i is an operation node.
x = x_k < ⋯ < x_1 = x'.
Note that since x_k = x and x_1 = x', we have k ≥
2. For all i ≥
1, we let (f_i, T_i) be the label of x_i. By definition of operation nodes, f_i ∈ M must be an idempotent. Moreover, (f_k, T_k) = (s, T) is the label of S, and we know by hypothesis that [f_1]_C = [f_k]_C. Finally, consider the unique child of x_1 and let (e, E) be the label of this child (which is an idempotent of M × 2^N since x_1 is an operation node). Recall that by definition of operation nodes, we have e = f_1 and T_1 ⊆ E · {t ∈ S | [e]_C = [t]_C} · E.

We now classify the nodes within S in several categories. We call backbone of S the path made of all (strict) ancestors of x_1. Since x_k is the root, there are k − 1 ≥ 1 operation nodes on the backbone (namely x_2, ..., x_k). Furthermore, we call lower nodes all nodes within the subtree rooted in x_1 (including x_1). We denote by m the number of operation nodes which are lower nodes. Finally, all nodes which are neither backbone nor lower nodes are called side nodes. Observe that any side node z has a closest ancestor y on the backbone, which has to be a binary node. We say that z is a left (resp. right) side node when it belongs to the subtree whose root is the left (resp. right) child of y. Finally, we associate a rank to each side node z: the rank of z is the smallest i ≤ k such that x_i is an ancestor of z (i must exist since x_k is the root). For all i ≤ k, we write ℓ_i (resp. r_i) for the number of operation nodes which are left (resp. right) side nodes of rank i. We illustrate these definitions in Figure 1.
Figure 1
Classification of the nodes in S (here, there are no right side nodes of rank 2).

Observe that by definition, backbone nodes, lower nodes and side nodes account for all nodes in the tree. Thus, we have the following fact.

▶ Fact 35.
The total number of operation nodes in S is k − 1 + m + ℓ_1 + ⋯ + ℓ_k + r_1 + ⋯ + r_k.

Essentially, the desired tree S' is built by removing all backbone nodes from S and replacing them with binary nodes. Thus, we obtain a tree S' whose operational size is m + ℓ_1 + ⋯ + ℓ_k + r_1 + ⋯ + r_k, which is strictly smaller than that of S since k − 1 ≥
1. We use an inductive construction which is formalized in the following lemma.

▶ Lemma 36.
For every i ≤ k, there exist two (α, β, S)-trees U_i and V_i of labels (u_i, U_i) and (v_i, V_i), with operational sizes ℓ_1 + ⋯ + ℓ_i and r_1 + ⋯ + r_i respectively. Moreover, there exist u'_i, v'_i ∈ M satisfying the following two conditions:
For q ∈ {u_i, u'_i} and r ∈ {v_i, v'_i}, f_i = q e r.
T_i ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i f_i u'_i e]_C} · E V_i.

Before we show Lemma 36, we use it to build the desired tree S' and finish the proof of Lemma 34. Recall that we need S' to have label lab(S') = (s, T) = (f_k, T_k). We apply Lemma 36 in the special case when i = k. This yields two (α, β, S)-trees U_k and V_k, with labels (u_k, U_k) and (v_k, V_k), which have operational sizes ℓ_1 + ⋯ + ℓ_k and r_1 + ⋯ + r_k. Moreover, we let u'_k, v'_k ∈ M which satisfy the two assertions in the lemma. It follows from the first assertion in Lemma 36 that u_k e v_k = u'_k e v'_k = f_k = s. This implies the following fact.

▶ Fact 37. [e]_C = [e v'_k f_k u'_k e]_C.

Proof.
By definition of C-compatible morphisms, we have

[e v'_k f_k u'_k e]_C = [e]_C • [v'_k]_C • [f_k]_C • [u'_k]_C • [e]_C

Therefore, since [f_k]_C = [f_1]_C = [e]_C, it suffices to prove that [e]_C = [e]_C • [v'_k]_C • [e]_C • [u'_k]_C • [e]_C. By the first assertion in Lemma 36, we have [e]_C = [f_k]_C and f_k = u'_k e v'_k. Hence, [e]_C = [u'_k]_C • [e]_C • [v'_k]_C. Moreover, since e is an idempotent of M, [e]_C = [ee]_C = [e]_C • [e]_C is an idempotent of A∗/∼_C. This yields:

[e]_C = [e]_C • [u'_k]_C • [e]_C • [v'_k]_C • [e]_C
[e]_C = ([e]_C • [u'_k]_C)^ω • [e]_C • ([v'_k]_C • [e]_C)^ω
[e]_C = [e]_C • ([v'_k]_C • [e]_C)^ω
[e]_C = [e]_C • [v'_k]_C • [e]_C • ([v'_k]_C • [e]_C)^{ω−1}

We may now replace the second copy of [e]_C in the above with [e]_C • [u'_k]_C • [e]_C • [v'_k]_C • [e]_C, which yields

[e]_C = [e]_C • [v'_k]_C • [e]_C • [u'_k]_C • [e]_C • ([v'_k]_C • [e]_C)^ω

Finally, since [e]_C = [e]_C • ([v'_k]_C • [e]_C)^ω, this yields [e]_C = [e]_C • [v'_k]_C • [e]_C • [u'_k]_C • [e]_C, as desired. ◀

In view of Fact 37 and the second assertion in Lemma 36, we obtain that

T_k ⊆ U_k E · {t ∈ S | [t]_C = [e]_C} · E V_k     (1)

Finally, we have a tree of root label (e, E) whose operational size is m −
1: the child of x_1. Hence, using one operation node, we may build a tree of operational size m whose root label is (e, E · {t ∈ S | [t]_C = [e]_C} · E). Finally, by (1), we may combine this tree with U_k and V_k using two binary nodes to get a tree S' whose root label is

(s, T) = (f_k, T_k) = (u_k e v_k, T_k)

By definition, this tree S' has operational size m + ℓ_1 + ⋯ + ℓ_k + r_1 + ⋯ + r_k. As desired, this is strictly smaller than S (its operational size is k − 1 + m + ℓ_1 + ⋯ + ℓ_k + r_1 + ⋯ + r_k by Fact 35, and k − 1 ≥ 1). It remains to prove Lemma 36, which we do by induction on i. When i = 1, since x_1 is an operation node whose unique child has label (e, E), we have f_1 = e and T_1 ⊆ E · {t ∈ S | [e]_C = [t]_C} · E. We define both U_1 and V_1 as the same tree made of a single leaf whose label is (1_M, {1_N}) = (α(ε), {β(ε)}). It is then simple to verify that the two assertions in the lemma are satisfied for u_1 = v_1 = 1_M. We now assume that i ≥
2. By definition, x_i has a unique child whose label is an idempotent (f_i, F_i) such that

T_i ⊆ F_i · {t ∈ S | [f_i]_C = [t]_C} · F_i

We use the following fact to choose our new trees U_i and V_i.

▶ Fact 38.
There exist two (α, β, S)-trees P and Q whose operational sizes are respectively bounded by ℓ_i and r_i and whose labels (p, P) and (q, Q) satisfy the following two properties:
f_i = p · f_{i−1} · q
F_i ⊆ P T_{i−1} Q

Proof.
We build P (resp. Q) by combining all subtrees made of left (resp. right) side nodes of rank i into a single one, using binary nodes only. In the degenerate case when there are no left (resp. right) side nodes, P (resp. Q) is a single leaf with label (1_M, {1_N}).

Let us describe this construction in more detail when the sets of left and right side nodes of rank i are nonempty. Consider all nodes between x_i and x_{i−1} (which are all binary by definition). For each such node, one child is an ancestor of x_{i−1} (or x_{i−1} itself) and the other is a side node. We define:
x_i < z_h < ⋯ < z_1 < x_{i−1} as the binary nodes whose left children are side nodes (in particular, these children and all their descendants are left side nodes of rank i).
x_i < z'_{h'} < ⋯ < z'_1 < x_{i−1} as the binary nodes whose right children are side nodes (in particular, these children and all their descendants are right side nodes of rank i).
We may now define P and Q. We start with P. For all j ≤ h, we let (p_j, P_j) be the label of the left child of z_j. Clearly, one may combine all subtrees rooted in the left children of the z_j with binary nodes into a single one whose label is

(p, P) = (p_h, P_h) · ⋯ · (p_1, P_1)

By definition, the operational size of P is ℓ_i: the sum of those of the subtrees we have combined (we only added binary nodes). Symmetrically, one may build Q of operational size r_i whose label is

(q, Q) = (q_1, Q_1) · ⋯ · (q_{h'}, Q_{h'})

where (q_j, Q_j) is the label of the right child of z'_j for all j ≤ h'. One may now verify from the definitions that the two assertions in the fact are satisfied. ◀

We are now ready to define our new trees U_i and V_i. We first use induction to obtain two trees U_{i−1} and V_{i−1} of labels (u_{i−1}, U_{i−1}) and (v_{i−1}, V_{i−1}) which satisfy the conditions of Lemma 36 for i −
1. We define:
U_i as the tree of label (u_i, U_i) = (p · u_{i−1}, P U_{i−1}), obtained by combining P and U_{i−1} with a single binary node.
V_i as the tree of label (v_i, V_i) = (v_{i−1} · q, V_{i−1} Q), obtained by combining V_{i−1} and Q with a single binary node.
It remains to prove that this definition of the trees U_i and V_i satisfies the conditions in Lemma 36. By definition, the operational size of U_i is the sum of that of P (i.e., ℓ_i by definition in Fact 38) and that of U_{i−1} (i.e., ℓ_1 + ⋯ + ℓ_{i−1}, since we obtained U_{i−1} by induction). This exactly says that the operational size of U_i is ℓ_1 + ⋯ + ℓ_i, as desired. Symmetrically, one may verify that the operational size of V_i is r_1 + ⋯ + r_i.

We now have to find u'_i, v'_i ∈ M which satisfy the two assertions in the lemma. Since we obtained U_{i−1} and V_{i−1} by induction, we also have u'_{i−1}, v'_{i−1} ∈ M which satisfy these two assertions for i −
1. We define

u'_i = p f_{i−1} u'_{i−1} and v'_i = v'_{i−1} f_{i−1} q

It remains to verify that the two assertions in Lemma 36 hold for this choice of u'_i, v'_i. We begin with the first one.
We have four equalities to verify. Since the argument is similar for all four, we concentrate on f_i = u_i e v_i and f_i = u'_i e v'_i, whose proofs encompass all arguments. By Fact 38, we know that f_i = p f_{i−1} q. Moreover, since f_{i−1} = u_{i−1} e v_{i−1} by the inductive definition of u_{i−1} and v_{i−1}, we get

f_i = p u_{i−1} e v_{i−1} q = u_i e v_i

Furthermore, f_{i−1} is idempotent. Thus, f_i = p f_{i−1} q = p (f_{i−1})³ q, and since by construction of u'_{i−1} and v'_{i−1} we have f_{i−1} = u'_{i−1} e v'_{i−1}, we obtain

f_i = p f_{i−1} u'_{i−1} e v'_{i−1} f_{i−1} q = u'_i e v'_i
We finish with the second assertion, which is the most involved. In particular, this is where we use the fact that S is good. We need to show that

T_i ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i f_i u'_i e]_C} · E V_i

We start with a simple fact.

▶ Fact 39.
For any (s, T) ∈ M × 2^N which is the label of an (α, β, S)-tree, we have T ⊆ {t ∈ S | [t]_C = [s]_C}.

Proof.
This is immediate by induction on the height of (α, β, S)-trees, using the hypothesis that S is good. ◀

We now start the proof. By definition, (f_i, T_i) is the label of the operation node x_i, whose child has label (f_i, F_i). Hence, T_i ⊆ F_i · {t ∈ S | [t]_C = [f_i]_C} · F_i, and it follows from the second item in Fact 38 that

T_i ⊆ P T_{i−1} Q · {t ∈ S | [t]_C = [f_i]_C} · P T_{i−1} Q

The result is now a consequence of the two following inclusions:
P T_{i−1} Q ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i]_C}
P T_{i−1} Q ⊆ {t ∈ S | [t]_C = [u'_i e]_C} · E V_i     (2)

Indeed, one may combine these two inclusions with the previous one, using the hypothesis that S is good, to obtain the desired inclusion:

T_i ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i]_C} · {t ∈ S | [t]_C = [f_i]_C} · {t ∈ S | [t]_C = [u'_i e]_C} · E V_i
    ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i f_i u'_i e]_C} · E V_i

It remains to prove the two inclusions in (2). As they are based on symmetrical arguments, we concentrate on the first one and leave the other to the reader. Since we built U_{i−1} and V_{i−1} by induction, we have

T_{i−1} ⊆ U_{i−1} E · {t ∈ S | [t]_C = [e v'_{i−1} f_{i−1} u'_{i−1} e]_C} · E V_{i−1}

By Fact 39, E ⊆ {t ∈ S | [t]_C = [e]_C} and V_{i−1} ⊆ {t ∈ S | [t]_C = [v_{i−1}]_C}. Hence, using the fact that S is good, we may simplify the above inclusion as follows:

T_{i−1} ⊆ U_{i−1} E · {t ∈ S | [t]_C = [e v'_{i−1} f_{i−1} u'_{i−1} e v_{i−1}]_C}
Since u'_{i−1} and v'_{i−1} were built by induction, we know that u'_{i−1} e v_{i−1} = f_{i−1}. Hence, since f_{i−1} is an idempotent,

T_{i−1} ⊆ U_{i−1} E · {t ∈ S | [t]_C = [e v'_{i−1} f_{i−1}]_C}

Using Fact 39 again, we have Q ⊆ {t ∈ S | [t]_C = [q]_C}. Thus, using the hypothesis that S is good together with the fact that v'_i = v'_{i−1} f_{i−1} q by definition, this yields

T_{i−1} Q ⊆ U_{i−1} E · {t ∈ S | [t]_C = [e v'_{i−1} f_{i−1} q]_C} ⊆ U_{i−1} E · {t ∈ S | [t]_C = [e v'_i]_C}

Finally, since U_i = P U_{i−1} by definition, we have

P T_{i−1} Q ⊆ P U_{i−1} E · {t ∈ S | [t]_C = [e v'_i]_C} ⊆ U_i E · {t ∈ S | [t]_C = [e v'_i]_C}

This concludes the proof of Lemma 36. ◀
C Appendix to Section 5
This section provides the missing proofs of Section 5. We start by introducing additional terminology and preliminary results that we shall need to present these proofs.
C.1 Stratifications
We present a stratification of ST[3/2] = Pol(AT) into finite quotienting lattices. It was introduced in [17]. We refer the reader to [17] for the proofs of the statements presented here.

For any natural number k ∈ N, we define a finite quotienting lattice Pol_k(AT) ⊆ Pol(AT). The definition uses induction on k:
When k = 0, we simply define Pol_0(AT) = AT.
When k ≥ 1, we define Pol_k(AT) as the smallest lattice which contains Pol_{k−1}(AT) and such that for any L_1, L_2 ∈ Pol_{k−1}(AT) and any a ∈ A, L_1 a L_2 ∈ Pol_k(AT).
One may verify from the definitions that for every k ∈ N, Pol_k(AT) is a finite quotienting lattice and that Pol_k(AT) ⊆ Pol_{k+1}(AT). Moreover, by definition of Pol(AT), we have:

ST[3/2] = Pol(AT) = ⋃_{k ≥ 0} Pol_k(AT).

Given any alphabet A, we associate preorder relations to the strata Pol_k(AT). For every k ∈ N and u, v ∈ A∗, we write u ≤_k v when the following condition is satisfied:
For every L ∈ Pol_k(AT)(A), u ∈ L ⇒ v ∈ L.
It is immediate by definition that ≤_k is a preorder relation on A∗. The key point is that we may use it to characterize separability for Pol(AT) = ST[3/2].

▶ Lemma 40.
Let A be an alphabet and L_1, L_2 ⊆ A∗ two languages. Then, the two following properties are equivalent:
1. L_1 is not ST[3/2]-separable from L_2.
2. For every k ∈ N, there exist w_1 ∈ L_1 and w_2 ∈ L_2 such that w_1 ≤_k w_2.

Moreover, we may also use ≤_k to characterize separability for BPol(AT) = ST[2].

▶ Lemma 41.
Let A be an alphabet and L_1, L_2 ⊆ A∗ two languages. Then, the two following properties are equivalent:
1. L_1 is not ST[2]-separable from L_2.
2. For every k ∈ N, there exist w_1 ∈ L_1 and w_2 ∈ L_2 such that w_1 ≤_k w_2 and w_2 ≤_k w_1.

We finish the presentation with three properties of the relations ≤_k. The first one is simple and states that they are compatible with word concatenation (this is because the strata Pol_k(AT) are closed under quotients).

▶ Lemma 42.
Let A be an alphabet and k ∈ N. For every u_1, u_2, v_1, v_2 ∈ A∗ such that u_1 ≤_k v_1 and u_2 ≤_k v_2, we have u_1 u_2 ≤_k v_1 v_2.

The second lemma holds because
Pol(AT) is a subclass of the star-free languages. It is as follows.

▶ Lemma 43.
Let A be an alphabet and k ∈ N. Consider h_1, h_2 ≥ 2^{k+1} − 1 and any u ∈ A∗. Then, we have u^{h_1} ≤_k u^{h_2}.

Finally, the third lemma states a characteristic property of
Pol(AT). The proof is rather technical (see [17] for details). Given an alphabet A and a word w ∈ A∗, we write alph(w) for the alphabet of w, i.e., the least sub-alphabet B ⊆ A such that w ∈ B∗.

▶ Lemma 44.
Let A be an alphabet and k ∈ N. Consider h, h_1, h_2 ≥ 2^{k+1} − 1 and any u, v ∈ A∗ such that alph(v) ⊆ alph(u). Then, we have u^h ≤_k u^{h_1} v u^{h_2}.

C.2 Upper bound in Theorem 19
We explain why ST[3/2]-separation is in PSpace for monoids (as usual, the result may then be lifted to NFAs using Corollary 6). The argument reuses the results of Section 4 and Appendix B, and the fact that ST[3/2] = Pol(AT). In particular, we adapt Theorem 11 to this setting. We start with some preliminary observations about the class AT.

By definition of AT, it is straightforward to verify that the equivalence ∼_AT compares words having the same alphabet: for u, v ∈ A∗, we have u ∼_AT v if and only if alph(u) = alph(v). Therefore, the monoid A∗/∼_AT corresponds to 2^A (the set of sub-alphabets of A) equipped with union as the multiplication. Moreover, for every w ∈ A∗, we have [w]_AT = alph(w).

We shall consider AT-compatible morphisms. If α : A∗ → M is AT-compatible, given s ∈ M, we shall write alph(s) for [s]_AT. We reuse the notion of (α, β, S)-trees introduced in Section 4 (here, we use them in the special case when C = AT). Consider an alphabet A and two AT-compatible morphisms α : A∗ → M and β : A∗ → N. Given a pair (s, T) ∈ M × 2^N, we say that (s, T) is alphabet safe when alph(s) = alph(t) for every t ∈ T. The following lemma follows from the definitions.

▶ Lemma 45.
Consider an alphabet A and two AT-compatible morphisms α : A∗ → M and β : A∗ → N. Moreover, let S ⊆ N be a good subset of N. Then, every (s, T) ∈ M × 2^N which is the root label of some (α, β, S)-tree is alphabet safe.
Note that in this appendix, the alphabet is one of our parameters, which means that the size of the monoid A∗/∼_AT = 2^A may not be constant. Consequently, building AT-compatible morphisms is costly. Hence, we shall have to manipulate the construction explicitly. Given an arbitrary morphism α : A∗ → M into a finite monoid M, we write α_AT for the AT-compatible morphism α_AT : A∗ → M × 2^A defined by α_AT(w) = (α(w), alph(w)).

We may now adapt Theorem 11 to this setting. This is the key result for proving that ST[3/2]-separation is in PSpace for monoids.

▶
Proposition 46.
Consider two morphisms α : A∗ → M and β : A∗ → N. Moreover, let α_AT : A∗ → M × 2^A and β_AT : A∗ → N × 2^A be the corresponding AT-compatible morphisms. Finally, let S ⊆ N × 2^A be a good subset of N × 2^A for β_AT. Given an alphabet safe pair (s, T) ∈ (M × 2^A) × 2^{N × 2^A}, one may test in PSpace with respect to |A|, |M| and |N| whether there exists an (α_AT, β_AT, S)-tree with root label (s, T).

Proof sketch.
By Lemma 45, the set of possible labels for nodes in (α_AT, β_AT, S)-trees has size at most |M| × 2^{|A|} × 2^{|N|} (every such label is an alphabet safe pair in (M × 2^A) × 2^{N × 2^A}, and once the alphabet component is fixed, the second component ranges over subsets of N). This observation yields an EXPTime least fixpoint algorithm for computing the set of all root labels of (α_AT, β_AT, S)-trees. This can be improved to PSpace by observing that it suffices to consider (α_AT, β_AT, S)-trees whose heights are polynomially bounded with respect to |A|, |M| and |N|. This is a simple consequence of Proposition 13, since the J-depth of A∗/∼_AT = 2^A is easily verified to be |A| + 1. ◀
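The EXPTime algorithm mentioned in the sketch is a standard least fixpoint saturation: start from the labels of leaves and close under the rules for building new nodes until nothing new appears. The following Python sketch is generic and illustrative (the toy monoid and rule below are our own assumptions, not the paper's construction); in the actual algorithm, the elements would be the alphabet safe pairs (s, T) and the rules would correspond to binary nodes and operation nodes of (α, β, S)-trees.

```python
def least_fixpoint(seeds, rules):
    """Saturate `seeds` under `rules`.

    seeds : iterable of hashable base elements (tree leaves).
    rules : functions mapping the current set of known elements to
            newly derivable ones (binary / operation nodes).
    Runs in time polynomial in the size of the final set, which may be
    exponential in the input size, hence the EXPTime bound.
    """
    known = set(seeds)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for x in rule(known):
                if x not in known:
                    known.add(x)
                    changed = True
    return known

# Toy instance (illustrative): elements are integers mod 15, a single
# "binary node" rule multiplies two known elements, seed element is 2.
binary = lambda known: {(a * b) % 15 for a in known for b in known}
print(sorted(least_fixpoint({2}, [binary])))  # prints [1, 2, 4, 8]
```

Since every pass either adds a new element or terminates the loop, the saturation stops after at most as many passes as there are elements in the final set; the PSpace improvement in the proof avoids materializing this (possibly exponential) set altogether by guessing a polynomially deep tree on the fly.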
2] =
Pol (AT), it is now simple to combine Theorem 14 with Proposition 46to get a
PSpace algorithm for ST[3 / C.3 Proof of Lemma 21
Let us recall the statement of Lemma 21 (we refer the reader to Section 5 for the definitionof the relevant notations). (cid:73)
Lemma 21.
Consider ≤ i ≤ n . Then given an i -valuation V , the two following propertiesare equivalent: Ψ i is satisfied by V . L i ∩ [ V ] is not ST[3 / -separable from L i ∩ [ V ] . We proceed by induction on 0 ≤ i ≤ n . Let us start with the base case i = 0. In thatcase, Ψ is the quantifier-free formula ϕ . Consider some 0-valuation V ⊆ ( B ) ∗ . One mayverify the following fact from the definitions of L ⊆ ( B ) ∗ and [ V ]. (cid:73) Fact 47.
The two following properties are equivalent: Ψ is satisfied by V . L ∩ [ V ] = ∅ . Since L = ( B ) ∗ by definition, we have L ∩ [ V ] = [ V ]. Hence, it is immediate that L ∩ [ V ] = [ V ] is not ST[3 / L ∩ [ V ] if and only if L ∩ [ V ] = ∅ . Combinedwith Fact 47, this yields Lemma 21 in the case i = 0.We now assume that i ≥
1. There are two cases depending on whether the quantifier Q i is existential or universal (this is expected since the definitions of L i and L i depend on this . Place and M. Zeitoun 47:33 parameter). Since these two cases are similar, we handle the one when Q i is existential andleaver the other to the reader. Consider an i -valuation V ⊆ ( B i ) ∗ . We have to show that thetwo following properties are equivalent: Ψ i is satisfied by V . L i ∩ [ V ] is not ST[3 / L i ∩ [ V ].Let us start with some terminology that we shall use for both directions. We let V ⊥ and V > as the following ( i − V : V > = V \ { i , x i } ⊆ B i − and V ⊥ = V \ { i , x i } ⊆ B i − We may now prove the equivalence. There are two directions to show.
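The inductive mechanism just described (evaluating Ψ_i on V through the two valuations V_⊤ and V_⊥) is the usual recursive evaluation of a quantified Boolean formula. A minimal sketch, under our own representation choices (a quantifier prefix as a list of pairs and valuations as dictionaries, rather than the letter-set encoding of the paper):

```python
# Recursive evaluation of a quantified Boolean formula, mirroring the split of
# a valuation V into V_top and V_bot in the proof above. The representation
# (prefix as a list of (quantifier, variable) pairs, valuations as dicts) is
# our own choice, not the paper's letter-set encoding.

def satisfies(prefix, matrix, valuation):
    """Decide whether Q_1 x_1 ... Q_m x_m matrix holds under valuation."""
    if not prefix:
        return matrix(valuation)
    quant, x = prefix[0]
    v_top = {**valuation, x: True}    # analogue of V_top: x_i set to true
    v_bot = {**valuation, x: False}   # analogue of V_bot: x_i set to false
    results = (satisfies(prefix[1:], matrix, v_top),
               satisfies(prefix[1:], matrix, v_bot))
    return any(results) if quant == 'exists' else all(results)

# exists x1 forall x2 (x1 or x2) is true; forall x1 exists x2 (x1 and x2) is false.
assert satisfies([('exists', 'x1'), ('forall', 'x2')],
                 lambda v: v['x1'] or v['x2'], {}) is True
assert satisfies([('forall', 'x1'), ('exists', 'x2')],
                 lambda v: v['x1'] and v['x2'], {}) is False
```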
Direction ⇒. Assume that Ψ_i is satisfied by V. We show that L_i^1 ∩ [V] is not ST[3/2]-separable from L_i^2 ∩ [V]. We use Lemma 40: given an arbitrary k ∈ ℕ, we have to exhibit w_1 ∈ L_i^1 ∩ [V] and w_2 ∈ L_i^2 ∩ [V] such that w_1 ≲_k w_2. We fix k for the proof.

Recall that by hypothesis, we have Ψ_i = ∃x_i Ψ_{i−1}. Hence, since Ψ_i is satisfied by V, the definitions yield that either V_⊤ or V_⊥ satisfies Ψ_{i−1}. By symmetry, we assume that we are in the former case: V_⊤ satisfies Ψ_{i−1}. By induction hypothesis, this implies that L_{i−1}^1 ∩ [V_⊤] is not ST[3/2]-separable from L_{i−1}^2 ∩ [V_⊤]. Consequently, Lemma 40 yields u_1 ∈ L_{i−1}^1 ∩ [V_⊤] and u_2 ∈ L_{i−1}^2 ∩ [V_⊤] such that u_1 ≲_k u_2. Note that by definition of V_⊤, we have u_1, u_2 ∈ (B_{i−1} \ {x̄_i})*. We define:

w_1 = (#_i x_i u_1 $ x_i)^{k+1} #_i
y = (#_i x_i u_1 $ x_i)^{k+1} #_i $ (#_i x_i u_1 $ x_i)^{k+1} #_i
w_2 = (#_i x_i u_2 $ x_i)^{k+1} #_i $ (#_i x_i u_1 $ x_i)^{k+1} #_i

Clearly, alph(#_i $) ⊆ alph(#_i x_i u_1 $ x_i). Therefore, Lemma 44 yields that w_1 ≲_k y. Moreover, since u_1 ≲_k u_2, we get from Lemma 42 that y ≲_k w_2. By transitivity, we get w_1 ≲_k w_2. Finally, one may verify from the definitions of L_i^1 and L_i^2 that w_1 ∈ L_i^1 ∩ [V] and w_2 ∈ L_i^2 ∩ [V]. Therefore, Lemma 40 yields that L_i^1 ∩ [V] is not ST[3/2]-separable from L_i^2 ∩ [V], as desired.

Direction ⇐. We actually prove the contrapositive of this implication. Assuming that Ψ_i is not satisfied by V, we show that L_i^1 ∩ [V] is ST[3/2]-separable from L_i^2 ∩ [V]. Since Ψ_i = ∃x_i Ψ_{i−1}, our hypothesis yields that Ψ_{i−1} is neither satisfied by V_⊤ nor by V_⊥. Therefore, induction yields the two following properties:
– L_{i−1}^1 ∩ [V_⊤] is ST[3/2]-separable from L_{i−1}^2 ∩ [V_⊤]. We let K_⊤ ∈ ST[3/2] be a separator. Note that since [V_⊤] ∈ ST[3/2] (actually [V_⊤] ∈ AT), we may assume without loss of generality that K_⊤ ⊆ [V_⊤].
– L_{i−1}^1 ∩ [V_⊥] is ST[3/2]-separable from L_{i−1}^2 ∩ [V_⊥]. We let K_⊥ ∈ ST[3/2] be a separator. Again, we may assume without loss of generality that K_⊥ ⊆ [V_⊥].

We now define a language K ∈ ST[3/2] from K_⊤ and K_⊥, and then show that it separates L_i^1 ∩ [V] from L_i^2 ∩ [V]. We let:

K = {#_i}
  ∪ A* #_i ((A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*)) #_i
  ∪ #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*
  ∪ A* #_i ((A* x̄_i A*) \ (A* #_i A*)) #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*
  ∪ #_i x̄_i K_⊥ $ x̄_i #_i (A \ {x_i})*
  ∪ A* #_i ((A* x_i A*) \ (A* #_i A*)) #_i x̄_i K_⊥ $ x̄_i #_i (A \ {x_i})*

FSTTCS 2018
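As an aside, the witness words of Direction ⇒ can be assembled concretely. A toy sketch (single characters stand in for the letters #_i, x_i, x̄_i and $, and u_1, u_2 are arbitrary sample words; all of these choices are ours):

```python
# Toy construction of the witnesses w_1, y and w_2 from Direction =>.
# Single characters stand in for the letters: '#' for #_i, 'x' for x_i,
# 'X' for bar x_i, '$' for $; u1 and u2 are arbitrary sample words.

def witnesses(u1, u2, k):
    e1 = '#' + 'x' + u1 + '$' + 'x'          # the block #_i x_i u_1 $ x_i
    e2 = '#' + 'x' + u2 + '$' + 'x'          # the block #_i x_i u_2 $ x_i
    w1 = e1 * (k + 1) + '#'
    y  = e1 * (k + 1) + '#' + '$' + e1 * (k + 1) + '#'
    w2 = e2 * (k + 1) + '#' + '$' + e1 * (k + 1) + '#'
    return e1, w1, y, w2

e1, w1, y, w2 = witnesses('ab', 'ba', k=2)
# Hypothesis of Lemma 44: alph(#_i $) is contained in alph(#_i x_i u_1 $ x_i).
assert set('#$') <= set(e1)
# y extends w1 by an inserted suffix; w2 replaces u_1 by u_2 in the first half.
assert y.startswith(w1)
assert len(y) == len(w2)
```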
It is straightforward to verify that K ∈ Pol(AT) = ST[3/2]. It remains to show that K separates L_i^1 ∩ [V] from L_i^2 ∩ [V].

We first show that L_i^1 ∩ [V] ⊆ K. Consider a word w ∈ L_i^1 ∩ [V]; we show that w ∈ K. Recall that we have L_i^1 = (#_i (x_i + x̄_i) L_{i−1}^1 $ (x_i + x̄_i))* #_i. Consequently, there exist k ≥ 0 and w_1, …, w_k ∈ (x_i + x̄_i) L_{i−1}^1 $ (x_i + x̄_i) such that w = #_i w_1 ⋯ #_i w_k #_i.

Observe first that if k = 0, then w = #_i ∈ K and we are finished. Assume now that k ≥ 1. By definition of K, when w_k ∈ (A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*), we also have w ∈ K. Therefore, we assume that w_k ∉ (A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*). Since w_k ∈ (x_i + x̄_i) L_{i−1}^1 $ (x_i + x̄_i), the letter #_i cannot occur in w_k (by definition of L_{i−1}^1). Hence, our hypothesis on w_k implies that one of the two following properties holds:
– x_i ∈ alph(w_k) and x̄_i ∉ alph(w_k), or,
– x̄_i ∈ alph(w_k) and x_i ∉ alph(w_k).
By symmetry, we handle the case when the first property holds and leave the other to the reader. We now assume that x_i ∈ alph(w_k) and x̄_i ∉ alph(w_k).

There are two sub-cases, depending on whether x̄_i ∈ alph(w) or not. Assume first that x̄_i ∉ alph(w). Since w_1 ∈ (x_i + x̄_i) L_{i−1}^1 $ (x_i + x̄_i), it follows that w_1 = x_i u $ x_i where u ∈ L_{i−1}^1. Moreover, recall that w ∈ [V] by definition, which implies that u ∈ [V_⊤]: indeed, alph(u) contains neither x̄_i nor #_i (the latter holds by definition of L_{i−1}^1). Altogether, this yields u ∈ L_{i−1}^1 ∩ [V_⊤], and therefore u ∈ K_⊤ by definition of K_⊤. It follows that w ∈ #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})* ⊆ K, which concludes this case.

Finally, assume that x̄_i ∈ alph(w). Therefore, there exists some factor w_j for j ≤ k such that x̄_i ∈ alph(w_j); we consider the rightmost one. Note that we have j < k by hypothesis on w_k. By definition, we know that x̄_i ∉ alph(#_i w_{j+1} ⋯ #_i w_k #_i). We may now reuse the argument of the previous case to obtain that:

#_i w_{j+1} ⋯ #_i w_k #_i ∈ #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*

Moreover, by definition of w_j, we have w_j ∈ (A* x̄_i A*) \ (A* #_i A*). Therefore, we obtain:

w ∈ A* #_i ((A* x̄_i A*) \ (A* #_i A*)) #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})* ⊆ K

This concludes the proof that L_i^1 ∩ [V] ⊆ K.

It remains to show that L_i^2 ∩ [V] ∩ K = ∅. We proceed by contradiction and assume that there exists w ∈ L_i^2 ∩ [V] ∩ K. Recall that by definition:

T_i^1 = (#_i x_i (B_{i−1} \ {x̄_i})* $ x_i)*   and   T_i^2 = (#_i x̄_i (B_{i−1} \ {x_i})* $ x̄_i)*
L_i^2 = (#_i (x_i + x̄_i) L_{i−1}^2 $ (x_i + x̄_i))* #_i $ (T_i^1 #_i ∪ T_i^2 #_i)

Therefore, since w ∈ L_i^2, we have w = u #_i $ v #_i with u ∈ (#_i (x_i + x̄_i) L_{i−1}^2 $ (x_i + x̄_i))* and v ∈ T_i^1 ∪ T_i^2. By symmetry, we shall assume that v ∈ T_i^1. We obtain k, ℓ ≥ 0, u_1, …, u_k ∈ (x_i + x̄_i) L_{i−1}^2 $ (x_i + x̄_i) and v_1, …, v_ℓ ∈ x_i (B_{i−1} \ {x̄_i})* $ x_i such that:

u = #_i u_1 ⋯ #_i u_k   and   v = #_i v_1 ⋯ #_i v_ℓ

Since K is defined as a union, w belongs to some member of this union. We treat each case independently. If w ∈ {#_i}, we have a contradiction, since w contains the letter $ by definition.

Assume now that w ∈ A* #_i ((A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*)) #_i. If ℓ = 0, this means that $ ∈ (A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*), which is a contradiction. Otherwise, ℓ ≥ 1 and v_ℓ ∈ (A* x_i A* ∩ A* x̄_i A*) \ (A* #_i A*). This is also a contradiction, since v_ℓ ∈ x_i (B_{i−1} \ {x̄_i})* $ x_i cannot contain the letter x̄_i.

We now treat the case when w ∈ #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*. If k = 0, this implies that $ ∈ x_i K_⊤ $ x_i, which is a contradiction. Otherwise, we have u_1 ∈ x_i K_⊤ $ x_i. Recall that u_1 ∈ (x_i + x̄_i) L_{i−1}^2 $ (x_i + x̄_i). Therefore, u_1 ∈ x_i L_{i−1}^2 $ x_i, which implies that L_{i−1}^2 ∩ K_⊤ ≠ ∅. Furthermore, since K_⊤ ⊆ [V_⊤] by definition, we get that L_{i−1}^2 ∩ [V_⊤] ∩ K_⊤ ≠ ∅.
This contradicts the definition of K_⊤. One may handle the case when w ∈ #_i x̄_i K_⊥ $ x̄_i #_i (A \ {x_i})* symmetrically, using the definition of K_⊥.

We turn to the case when w ∈ A* #_i ((A* x̄_i A*) \ (A* #_i A*)) #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*. Since the factors v_j cannot contain the letter x̄_i, it follows that there exists j ≤ k such that u_j ∈ (A* x̄_i A*) \ (A* #_i A*) and:

#_i u_{j+1} ⋯ #_i u_k #_i $ v #_i ∈ #_i x_i K_⊤ $ x_i #_i (A \ {x̄_i})*

One may now reuse the argument of the previous case to derive a contradiction. Finally, one may handle the case when w ∈ A* #_i ((A* x_i A*) \ (A* #_i A*)) #_i x̄_i K_⊥ $ x̄_i #_i (A \ {x_i})* symmetrically, which concludes the proof. ◀

C.4 Proof of Theorem 16
It is straightforward to verify from Proposition 46 and Theorem 16 that ST[2]-separation is in EXPTime for monoids (since ST[2] is a variety, this is also the case for NFAs by Corollary 6). We focus on proving that ST[2]-separation is PSpace-hard for NFAs (again, this is lifted to monoids with Corollary 6). As explained in the main paper, this boils down to proving Proposition 23.
Proposition 23.
Consider an alphabet A and H_1, H_2 ⊆ A*. Let B = A ∪ {#, $} with #, $ ∉ A, and let L_1 = (# H_2 # A* $*)* # H_1 # A* $* ⊆ B* and L_2 = (# H_2 # A* $*)* ⊆ B*. The two following properties are equivalent:
1. H_1 is ST[3/2]-separable from H_2.
2. L_1 is ST[2]-separable from L_2.

We start with the direction 1) ⇒ 2). Assume that H_1 is ST[3/2]-separable from H_2, and let K ⊆ A* be a separator in ST[3/2]. We define S ⊆ B* as S = B* # K # B*. Clearly, S ∈ ST[3/2] ⊆ ST[2]. Moreover, since L_1 = (# H_2 # A* $*)* # H_1 # A* $* and H_1 ⊆ K by definition of K, we have L_1 ⊆ S. Finally, we have H_2 ∩ K = ∅ by definition of K. Moreover, L_2 = (# H_2 # A* $*)*. Since #, $ ∉ A, given w ∈ L_2, the only factors # v # of w with v ∈ A* satisfy v ∈ H_2. Since K ⊆ A*, we get L_2 ∩ S = ∅, which concludes the proof for the direction 1) ⇒ 2).

We turn to the direction 2) ⇒ 1). We prove the contrapositive: assuming that H_1 is not ST[3/2]-separable from H_2, we show that L_1 is not ST[2]-separable from L_2. By Lemma 41, we have to show that for every k ∈ ℕ, there exist w_1 ∈ L_1 and w_2 ∈ L_2 such that w_1 ≲_k w_2 and w_2 ≲_k w_1. We fix k for the proof.
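Before continuing, the separator S = B* # K # B* of direction 1) can be tested on a toy instance (H_1, H_2 and the finite "separator" K below are our own sample choices; a finite K merely illustrates the shape of S):

```python
import re

# Toy instance of the separator S = B* # K # B* from direction 1).
# H1, H2 and the finite "separator" K are sample choices of ours.
A = 'ab'
H1, H2 = {'aa', 'ab'}, {'bb'}
K = {'aa', 'ab', 'ba'}                     # H1 included in K, K disjoint from H2

k_alt = '|'.join(map(re.escape, sorted(K)))
S = re.compile(rf'^[{A}#$]*#(?:{k_alt})#[{A}#$]*$')   # S = B* # K # B*

w_in_L1 = '#bb#aa$$' + '#aa#ab$$'   # ends with a block from # H1 # A* $*
w_in_L2 = '#bb#aa$$' * 2            # every factor # v # with v in A* has v in H2
assert S.match(w_in_L1)
assert not S.match(w_in_L2)
```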
Since H_1 is not ST[3/2]-separable from H_2, Lemma 40 yields u_1 ∈ H_1 and u_2 ∈ H_2 such that u_1 ≲_k u_2. We define:

w_1 = (# u_2 # u_1 $^{k+1})^{k+1} # u_1 # u_1 $^{k+1}
w_2 = (# u_2 # u_1 $^{k+1})^{k+1}

Since u_1 ∈ H_1 and u_2 ∈ H_2, it is clear from the definitions of L_1 and L_2 that w_1 ∈ L_1 and w_2 ∈ L_2. It remains to show that w_1 ≲_k w_2 and w_2 ≲_k w_1. We start with the former. Since u_1 ≲_k u_2, we may use Lemma 42 to obtain the following inequality:

w_1 ≲_k (# u_2 # u_1 $^{k+1})^{k+1} # u_2 # u_1 $^{k+1} = (# u_2 # u_1 $^{k+1})^{k+2}

Moreover, it is immediate from Lemma 43 that we have:

(# u_2 # u_1 $^{k+1})^{k+2} ≲_k w_2

By transitivity, this yields w_1 ≲_k w_2.

We finish with the converse inequality. Clearly, alph(# u_1 # u_1 $^{k+1}) ⊆ alph(# u_2 # u_1 $^{k+1}). Hence, Lemma 44 yields:

# u_2 # u_1 $^{k+1} ≲_k (# u_2 # u_1 $^{k+1}) # u_1 # u_1 $^{k+1}

We may apply Lemma 42 to obtain:

(# u_2 # u_1 $^{k+1})^{k+1} ≲_k (# u_2 # u_1 $^{k+1})^{k+1} # u_1 # u_1 $^{k+1}

This exactly says that w_2 ≲_k w_1, which concludes the proof. ◀
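The membership claims w_1 ∈ L_1 and w_2 ∈ L_2 can be checked mechanically on a toy instance (H_1, H_2, u_1, u_2 and k below are our own sample choices; they merely exercise the definitions of L_1 and L_2):

```python
import re

# Toy check of the witnesses w_1, w_2 against the definitions of L_1 and L_2.
# All concrete sets and words below are sample choices of ours.
A = 'ab'
H1, H2 = {'aa'}, {'bb'}
u1, u2 = 'aa', 'bb'                          # u_1 in H_1 and u_2 in H_2
k = 2

h1 = '|'.join(map(re.escape, H1))
h2 = '|'.join(map(re.escape, H2))
block2 = rf'#(?:{h2})#[{A}]*\$*'             # one block  # H_2 # A* $*
L1 = re.compile(rf'^(?:{block2})*#(?:{h1})#[{A}]*\$*$')   # L_1
L2 = re.compile(rf'^(?:{block2})*$')                      # L_2

X = '#' + u2 + '#' + u1 + '$' * (k + 1)      # the block  # u_2 # u_1 $^{k+1}
w1 = X * (k + 1) + '#' + u1 + '#' + u1 + '$' * (k + 1)
w2 = X * (k + 1)

assert L1.match(w1) and L2.match(w2)
# The suffix appended to w_2 only uses letters of X (the hypothesis of Lemma 44):
assert set('#' + u1 + '$') <= set(X)
```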