On Varieties of Ordered Automata
Ondřej Klíma and Libor Polák
Department of Mathematics and Statistics, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
Abstract.
The Eilenberg correspondence relates varieties of regular languages to pseudovarieties of finite monoids. Various modifications of this correspondence have been found with more general classes of regular languages on one hand and classes of more complex algebraic structures on the other hand. It is also possible to consider classes of automata instead of algebraic structures as a natural counterpart of classes of languages. Here we deal with the correspondence relating positive $\mathcal{C}$-varieties of languages to positive $\mathcal{C}$-varieties of ordered automata and we present various specific instances of this correspondence. These bring certain well-known results from a new perspective and also some new observations. Moreover, complexity aspects of the membership problem are discussed both in the particular examples and in a general setting.

Algebraic theory of regular languages is a well-established field in the theory of formal languages. The basic ambition of this theory is to obtain effective characterizations of various natural classes of regular languages. First examples of significant classes of languages, which were effectively characterized by properties of syntactic monoids, were the star-free languages by Schützenberger [22] and the piecewise testable languages by Simon [23]. A general framework for discovering relationships between properties of regular languages and properties of monoids was provided by Eilenberg [6], who established a one-to-one correspondence between the so-called varieties of regular languages and pseudovarieties of finite monoids. Here varieties of languages are classes closed under quotients, preimages under homomorphisms and Boolean operations. Thus a membership problem for a given variety of regular languages can be translated to a membership problem for the corresponding pseudovariety of finite monoids.
An advantage of this approach is that pseudovarieties of monoids are exactly the classes of finite monoids which have an equational description by pseudoidentities (see Reiterman [21]). For a thorough introduction to that theory we refer to surveys by Pin [17] and by Straubing and Weil [26].

The paper was supported by grant GA15-02862S of the Czech Science Foundation.

Since not every natural class of languages is closed under all the mentioned operations, various generalizations of the notion of varieties of languages have been studied. One possible generalization is the notion of positive varieties of languages introduced by Pin [16]: the classes need not be closed under complementation. Their equational characterization was given by Pin and Weil [20]. Another possibility is to weaken the closure property concerning preimages under homomorphisms: only homomorphisms from a certain fixed class $\mathcal{C}$ are used. In this way, one can consider $\mathcal{C}$-varieties of regular languages which were introduced by Straubing [25] and whose equational description was presented by Kunc [13]. These two generalizations could be combined as suggested by Pin and Straubing [19].

In our contribution we do not use syntactic structures at all. We consider classes of automata as another natural counterpart to classes of regular languages. In fact, we deal with classes of semiautomata, which are exactly automata without the specification of initial and final states. Characterizing classes of languages by properties of minimal automata is quite natural, since usually we assume that an input of a membership problem for a fixed class of languages is given exactly by the minimal deterministic automaton. For example, if we want to test whether an input language is piecewise testable, we do not need to compute its syntactic monoid, which could be quite large (see Brzozowski and Li [2]).
Instead of that, we check a condition which must be satisfied by its minimal automaton and which was also established in [23]. This characterization was used in [24] and [27] to obtain polynomial and quadratic algorithms, respectively, for testing piecewise testability. In [11], Simon's condition was reformulated and the so-called confluent acyclic (semi)automata were defined. In this setting, this characterization can be viewed as an instance of an Eilenberg type theorem between varieties of languages and varieties of semiautomata.

Moreover, each minimal automaton is implicitly equipped with an order in which the final states form an upward closed subset. This leads to a notion of ordered automata. Then positive $\mathcal{C}$-varieties of ordered semiautomata can be defined as classes which are closed under certain natural operations. We recall here the general Eilenberg type theorem, namely Theorem 6.3, which states that positive $\mathcal{C}$-varieties of ordered semiautomata correspond to positive $\mathcal{C}$-varieties of languages.

Summarizing, there are three worlds:
(L) classes of regular languages,
(S) classes of finite monoids, sometimes enriched by an additional structure like the ordered monoids, monoids with distinguished generators, etc.,
(A) classes of semiautomata, sometimes ordered semiautomata, etc.

Most variants of Eilenberg correspondence relate (L) and (S), the relationship between (A) and (S) was studied by Chaubard et al. [4], and finally the transitions between (L) and (A) were initiated by Ésik and Ito [8]. Here we continue in the last approach, to establish Theorem 6.3. In fact, this result is a combination of Theorem 5.1 of [19] (only some hints to a possible proof are given there) and the main result of [4] relating worlds (S) and (A). In contrast, in the present paper, one can find a self-contained proof which does not go through the classes of monoids.

The paper is structured as follows. In Sections 2 and 3 we recall the basic notions.
In Sections 4 and 5 we study ordered semiautomata and some natural algebraic constructions on them. The next section is devoted to the detailed proof of Theorem 6.3. Section 7 explains how the unordered variant of this result can be obtained. Section 8 presents several instances of Theorem 6.3 and Section 9 discusses the membership problem for $\mathcal{C}$-varieties of semiautomata given by a certain type of pseudoidentities.

This paper is a technical report which precedes the short conference paper [12] on the topic. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-13435-8 .

$\mathcal{C}$-Varieties of Languages

First of all, we recall basic definitions. Let $A^*$ be the set of all words over a finite alphabet $A$. We denote by $\lambda$ the empty word. The set $A^*$ equipped with the operation of concatenation forms a free monoid over $A$ with $\lambda$ being a neutral element. A language over alphabet $A$ is a subset of $A^*$. Note that all languages which are considered in the paper are regular. For a language $L \subseteq A^*$ and a pair of words $u, v \in A^*$, we denote by $u^{-1}Lv^{-1}$ the quotient of $L$ by these words, i.e. the set $u^{-1}Lv^{-1} = \{ w \in A^* \mid uwv \in L \}$. In particular, a left quotient is defined by $u^{-1}L = \{ w \in A^* \mid uw \in L \}$ and a right one is defined by $Lv^{-1} = \{ w \in A^* \mid wv \in L \}$.

For the purpose of this paper, following Straubing [25], the category of homomorphisms $\mathcal{C}$ is a category where objects are all free monoids over non-empty finite alphabets and morphisms are certain monoid homomorphisms among them. If the sets $A$ and $B$ are clear from the context, we write briefly $f \in \mathcal{C}$ instead of $f \in \mathcal{C}(A^*, B^*)$. This "categorical" definition means that $\mathcal{C}$ satisfies the following properties:
– For each finite alphabet $A$, the identity mapping $\mathrm{id}_A : A^* \to A^*$ belongs to $\mathcal{C}$.
– If $f : B^* \to A^*$ and $g : C^* \to B^*$ belong to $\mathcal{C}$, then their composition $gf : C^* \to A^*$ (first $g$, then $f$) is also in $\mathcal{C}$.

If $f : B^* \to A^*$ is a homomorphism and $L \subseteq A^*$, then by the preimage of $L$ in the homomorphism $f$ is meant the set $f^{-1}(L) = \{ v \in B^* \mid f(v) \in L \}$.

Definition 1.
Let $\mathcal{C}$ be a category of homomorphisms. A positive $\mathcal{C}$-variety of languages $\mathcal{V}$ associates to every non-empty finite alphabet $A$ a class $\mathcal{V}(A)$ of regular languages over $A$ in such a way that
– $\mathcal{V}(A)$ is closed under unions and intersections of finite families,
– $\mathcal{V}(A)$ is closed under quotients, i.e. $L \in \mathcal{V}(A)$, $u, v \in A^*$ implies $u^{-1}Lv^{-1} \in \mathcal{V}(A)$,
– $\mathcal{V}$ is closed under preimages in morphisms of $\mathcal{C}$, i.e. $f : B^* \to A^*$, $f \in \mathcal{C}$, $L \in \mathcal{V}(A)$ implies $f^{-1}(L) \in \mathcal{V}(B)$.
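The quotient and preimage operations used in the closure conditions above can be made concrete with a minimal sketch. It assumes that a language is represented by a Python membership predicate on strings and that a homomorphism is given by a dictionary from letters to words; the names `quotient`, `preimage` and `contains_ab` are illustrative and do not come from the paper.

```python
def quotient(L, u, v):
    """Membership predicate of u^{-1} L v^{-1}: w belongs to it iff u w v is in L."""
    return lambda w: L(u + w + v)


def preimage(f, L):
    """Membership predicate of f^{-1}(L), where the homomorphism B* -> A*
    is induced by the letter-to-word dictionary f (an assumed encoding)."""
    return lambda w: L("".join(f[b] for b in w))


def contains_ab(w):
    """Example language over {a, b}: words having 'ab' as a factor."""
    return "ab" in w


# Left quotient by 'a': w is accepted iff 'a' + w contains 'ab'.
left = quotient(contains_ab, "a", "")
# Preimage under c -> ab, d -> b (a hypothetical homomorphism {c, d}* -> {a, b}*).
pre = preimage({"c": "ab", "d": "b"}, contains_ab)
```

Here `left("b")` holds because `ab` belongs to the example language, while `pre("dd")` fails because the image `bb` does not.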
Note that the first condition in Definition 1 ensures that the languages $\emptyset$ and $A^*$ belong to $\mathcal{V}(A)$ for every alphabet $A$: $\emptyset$ is the union of the empty system and $A^*$ is the intersection of the empty system. In other words, the first condition can be equivalently formulated as $\emptyset, A^* \in \mathcal{V}(A)$ and $\mathcal{V}(A)$ is closed under binary unions and intersections. In particular, all $\mathcal{V}(A)$'s are nonempty.

If $\mathcal{C}$ consists of all homomorphisms we get exactly the notion of the positive varieties of languages. When adding "each $\mathcal{V}(A)$ is closed under complements", we get exactly the notion of the $\mathcal{C}$-variety of languages.

In this section we fix basic terminology concerning finite automata. First of all, note that all considered automata in the paper are deterministic, complete, finite and over finite alphabets. Moreover, we use the term semiautomaton when the initial and final states are not explicitly given.

A deterministic finite automaton (DFA) over the alphabet $A$ is a five-tuple $\mathcal{A} = (Q, A, \cdot, i, F)$, where $Q$ is a non-empty set of states, $\cdot : Q \times A \to Q$ is a complete transition function, $i \in Q$ is the initial state and $F \subseteq Q$ is the set of final states. The transition function can be extended to a mapping $\cdot : Q \times A^* \to Q$ by $q \cdot \lambda = q$, $q \cdot (ua) = (q \cdot u) \cdot a$, for every $q \in Q$, $u \in A^*$, $a \in A$. The automaton $\mathcal{A}$ accepts a word $u \in A^*$ if and only if $i \cdot u \in F$ and the language recognized by the automaton $\mathcal{A}$ is $L_{\mathcal{A}} = \{ u \in A^* \mid i \cdot u \in F \}$. More generally, for $q \in Q$, we denote $L_{\mathcal{A},q} = \{ u \in A^* \mid q \cdot u \in F \}$. For a fixed $\mathcal{A}$, we denote this language simply by $L_q$.

We recall the construction of the minimal automaton of a regular language which was introduced by Brzozowski [1]. Since this automaton is uniquely determined and it plays a central role in our paper, we use the adjective "canonical" for it.

Definition 1.
The canonical deterministic automaton of a regular language $L$ is $\mathcal{D}_L = (D_L, A, \cdot, L, F_L)$, where $D_L = \{ u^{-1}L \mid u \in A^* \}$, $q \cdot a = a^{-1}q$, for each $q \in D_L$, $a \in A$, and $F_L = \{ q \in D_L \mid \lambda \in q \}$.

A part of Brzozowski's result is the correctness of the previous definition, because one needs to show that $\mathcal{D}_L$ is really a finite deterministic automaton. The minimality of $\mathcal{D}_L$ can be obtained as a consequence of the following lemma. Since the result will be modified later in the paper, the proof of the following lemma is also presented here.

Lemma 2 ([1]).
Let $L$ be a regular language with the canonical automaton $\mathcal{D}_L$ and let $\mathcal{A} = (Q, A, \cdot, i, F)$ be an arbitrary DFA with $L_{\mathcal{A}} = L$. Then the following holds:
(i) For each $q \in D_L$, we have that $L_{\mathcal{D}_L, q} = q$.
(ii) For each $u \in A^*$, we have that $L_{\mathcal{A}, i \cdot u} = u^{-1}L$.
(iii) The rule $\varphi : i \cdot u \mapsto u^{-1}L$, for every $u \in A^*$, correctly defines a surjective mapping from $Q' = \{ i \cdot u \mid u \in A^* \}$ onto $D_L$ satisfying $\varphi((i \cdot u) \cdot a) = (\varphi(i \cdot u)) \cdot a$, for every $u \in A^*$, $a \in A$.

Proof. (i) Let $q$ be a state of $\mathcal{D}_L$, i.e. $q = u^{-1}L$ for some $u \in A^*$. Then $v \in L_{\mathcal{D}_L, q}$ if and only if $\lambda \in (u^{-1}L) \cdot v = (uv)^{-1}L$, which is equivalent to $uv \in L$ and also to $v \in u^{-1}L = q$.
(ii) Let $u \in A^*$. Then, for every $v \in A^*$, we have the following chain of equivalent formulas: $v \in L_{\mathcal{A}, i \cdot u}$, $(i \cdot u) \cdot v \in F$, $uv \in L$, and $v \in u^{-1}L$.
(iii) The correctness of the definition of $\varphi$ follows from (ii) and the surjectivity of $\varphi$ is clear. Moreover, $\varphi((i \cdot u) \cdot a) = \varphi(i \cdot ua) = (ua)^{-1}L = (u^{-1}L) \cdot a = \varphi(i \cdot u) \cdot a$. □

First, we recall some basic terminology from the theory of ordered sets. By an ordered set we mean a set $M$ equipped with an order $\le$, i.e. a reflexive, antisymmetric and transitive relation. A subset $X$ is called upward closed if, for every pair of elements $x, y \in M$, the following property holds: $x \le y$, $x \in X$ implies $y \in X$. For every subset $X$, we denote by $\uparrow X$ the smallest upward closed subset containing the subset $X$, i.e. $\uparrow X = \{ m \in M \mid \exists x \in X : x \le m \}$. In particular, for $x \in M$, we write $\uparrow x$ instead of $\uparrow \{x\}$. A mapping $f : M \to N$ between two ordered sets $(M, \le)$ and $(N, \le)$ is called isotone if, for every pair of elements $x, y \in M$, we have that $x \le y$ implies $f(x) \le f(y)$.

States of the canonical automaton $\mathcal{D}_L$ are languages, and therefore they are ordered naturally by the set-theoretical inclusion.
The action by each letter $a \in A$ is an isotone mapping: for each pair of states $p, q$ such that $p \subseteq q$, we have $p \cdot a = a^{-1}p \subseteq a^{-1}q = q \cdot a$. Moreover, the set $F_L$ of all final states is an upward closed subset with respect to $\subseteq$. These observations motivate the following definition.

Definition 1. An ordered automaton over the alphabet $A$ is a six-tuple $\mathcal{O} = (Q, A, \cdot, \le, i, F)$, where
– $(Q, A, \cdot, i, F)$ is a usual DFA;
– $\le$ is an order on the set $Q$;
– the action by every letter $a \in A$ is an isotone mapping from the ordered set $(Q, \le)$ to itself;
– $F$ is an upward closed subset of $Q$ with respect to $\le$.

The definitions of the acceptance and the recognition are the same as in the case of DFA's. Since a composition of isotone mappings is isotone, it follows from Definition 1 that the action by every word $u \in A^*$ is an isotone mapping from the ordered set of states into itself.

Moreover, the ordered semiautomaton $\mathcal{Q} = (Q, A, \cdot, \le)$ accepts the language $L \subseteq A^*$ if we can complete $\mathcal{Q}$ to an ordered automaton $\mathcal{O} = (Q, A, \cdot, \le, i, F)$ such that $L = L_{\mathcal{O}}$.

The following result states that Brzozowski's construction gives the minimal ordered automaton $\mathcal{O}_L = (D_L, A, \cdot, \subseteq, L, F_L)$.
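Before turning to that result, the two order conditions in the definition of an ordered automaton can be checked mechanically. The following is a minimal sketch under an assumed encoding: the transition function is a dictionary keyed by `(state, letter)`, and the order is a set of pairs `(p, q)` meaning $p \le q$, which the caller is assumed to supply as a genuine partial order. The function name `is_ordered_automaton` is illustrative only.

```python
def is_ordered_automaton(alphabet, delta, leq, final):
    """Check that every letter acts isotonically and that `final` is upward closed.

    alphabet: iterable of letters
    delta:    dict mapping (state, letter) -> state (complete transition function)
    leq:      set of pairs (p, q) meaning p <= q; assumed to be a partial order
    final:    set of final states
    """
    for p, q in leq:
        # Isotone action: p <= q must imply p.a <= q.a for every letter a.
        for a in alphabet:
            if (delta[p, a], delta[q, a]) not in leq:
                return False
        # Upward closed final states: p final and p <= q must imply q final.
        if p in final and q not in final:
            return False
    return True


# Canonical automaton of the language of words over {a, b} containing an 'a':
# state 0 is the language itself, state 1 is A*, and 0 <= 1 by inclusion.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}
leq = {(0, 0), (1, 1), (0, 1)}
```

With these data, `{1}` is an admissible set of final states, whereas `{0}` is not upward closed.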
Lemma 2.
Let $\mathcal{O} = (Q, A, \cdot, \le, i, F)$ be an ordered automaton recognizing the language $L = L_{\mathcal{O}}$. Then, for states $p \le q$, we have $L_p \subseteq L_q$. Moreover, the mapping $\varphi$ from Lemma 2 (iii) on DFAs is an isotone one onto the canonical ordered automaton $\mathcal{O}_L = (D_L, A, \cdot, \subseteq, L, F_L)$.

Proof. Let $p \le q$ hold in the given ordered automaton $\mathcal{O}$. If $w \in L_p$, then $p \cdot w \in F$. Now $p \cdot w \le q \cdot w$ implies $q \cdot w \in F$ and therefore $w \in L_q$. Moreover, having $u, v \in A^*$ such that $p = i \cdot u$, $q = i \cdot v$, we get $L_{i \cdot u} \subseteq L_{i \cdot v}$. By Lemma 2 (ii) on DFAs, it means that $u^{-1}L \subseteq v^{-1}L$, i.e. $\varphi(i \cdot u) \subseteq \varphi(i \cdot v)$. □

When relating languages with algebraic structures (not our task here), the following property of the minimal/canonical ordered automaton is a crucial one.
Lemma 3 (Pin [17, Section 3]).
The transition monoid of the minimal automaton of a regular language $L$ is isomorphic to the syntactic monoid of $L$. Similarly, the ordered transition monoid of the minimal ordered automaton of $L$ is isomorphic to the syntactic ordered monoid of $L$.

The next lemma clarifies how the quotients of a language can be obtained by changing the initial and the final states appropriately.
Lemma 4.
Let $\mathcal{O} = (Q, A, \cdot, \le, i, F)$ be an ordered automaton recognizing the language $L = L_{\mathcal{O}}$. Let $u, v \in A^*$. Then
(i) $u^{-1}L = L_{\mathcal{B}}$ where $\mathcal{B} = (Q, A, \cdot, \le, i \cdot u, F)$,
(ii) $Lv^{-1} = L_{\mathcal{C}}$ where $\mathcal{C} = (Q, A, \cdot, \le, i, F_v)$ and $F_v = \{ q \in Q \mid q \cdot v \in F \}$.

Proof. (i) It follows from Lemma 2 (ii) on DFAs, which gives $L_{\mathcal{O}, i \cdot u} = u^{-1}L$.
(ii) To show that $\mathcal{C}$ is an ordered automaton, we need to prove that $F_v$ is upward closed. Let $p \in F_v$ and $p \le q \in Q$. From $p \in F_v$ we have $p \cdot v \in F$ and from $p \le q$ we obtain $p \cdot v \le q \cdot v$. Since $F$ is upward closed we get $q \cdot v \in F$, which implies $q \in F_v$.
Now, for every $w \in A^*$, the following is a chain of equivalent statements: $w \in Lv^{-1}$, $wv \in L$, $(i \cdot w) \cdot v \in F$, $i \cdot w \in F_v$, and $w \in L_{\mathcal{C}}$. Thus we proved the equality $Lv^{-1} = L_{\mathcal{C}}$. □

The next result characterizes languages which are recognized by changing the final states in the canonical ordered automaton.
Lemma 5.
Let $H \subseteq D_L$ be upward closed. Then $(D_L, A, \cdot, \subseteq, L, H)$ recognizes a language which can be expressed as a finite union of finite intersections of languages of the form $Lv^{-1}$.

Proof. For an arbitrary upward closed subset $X \subseteq D_L$, we define $\mathrm{Past}(X) = \{ w \in A^* \mid L \cdot w \in X \}$. Since $\mathrm{Past}(X) = \bigcup_{p \in X} \mathrm{Past}(\uparrow p)$, it is enough to prove that, for each $p \in D_L$, the set $\mathrm{Past}(\uparrow p)$ can be expressed as a finite intersection of right quotients of the language $L$.

Let $p$ be an arbitrary state in $D_L$. For each $q \in D_L$ such that $p \not\subseteq q$, we have $p = L_p \not\subseteq L_q = q$. This means that there is $v_q \in A^*$ with the property $v_q \in L_p \setminus L_q$. Equivalently, $p \cdot v_q \in F_L$ and $q \cdot v_q \notin F_L$. Now if $w \in \mathrm{Past}(\uparrow p)$ then $L \cdot w \supseteq p$. Therefore $(L \cdot w) \cdot v_q \supseteq p \cdot v_q \in F_L$, from which we get $wv_q \in L$, i.e. $w \in Lv_q^{-1}$. We have shown that $\mathrm{Past}(\uparrow p) \subseteq Lv_q^{-1}$.

Now we claim that
$\mathrm{Past}(\uparrow p) = \bigcap_{p \not\subseteq q} Lv_q^{-1}$.
We already saw the "$\subseteq$"-part. To prove the opposite inclusion, let $w$ be an arbitrary word from $\bigcap_{p \not\subseteq q} Lv_q^{-1}$. Fixing $q$ for a moment, we see that $wv_q \in L$, i.e. $L \cdot wv_q \in F_L$. In the case of $q = L \cdot w$ we would have $q \cdot v_q = (L \cdot w) \cdot v_q \in F_L$, a contradiction. Hence $L \cdot w \ne q$, and this holds for each $q$ with $p \not\subseteq q$. Therefore $L \cdot w \supseteq p$ and we deduce that $w \in \mathrm{Past}(\uparrow p)$. □

There is a natural question how the minimal ordered (semi)automaton can be computed from a given automaton.
Proposition 6.
There exists an algorithm which computes, for a given automaton $\mathcal{A} = (Q, A, \cdot, i, F)$, the minimal ordered automaton of the language $L(\mathcal{A})$.

Proof. Our construction is based on Hopcroft minimization algorithm for DFA's. We may assume that all states of $\mathcal{A}$ are reachable from the initial state $i$. Let $R_0 = (Q \times F) \cup ((Q \setminus F) \times Q)$. Then we construct the relation $R$ from $R_0$ by removing unsuitable pairs of states step by step. At first, we put $R_1 = R_0$. Then for each integer $k$, if we find $(p, q) \in R_k$ and a letter $a \in A$ such that $(p \cdot a, q \cdot a) \notin R_k$, then we remove $(p, q)$ from the current relation $R_k$, that is, we put $R_{k+1} = R_k \setminus \{ (p, q) \}$. This construction stops after, say, $m$ steps. So, $R_{m+1} = R$ satisfies $(p, q) \in R \implies (p \cdot a, q \cdot a) \in R$, for every $p, q \in Q$ and $a \in A$. Now, we observe that $(p, q) \in R$ if and only if, for every $u \in A^*$, $(p \cdot u, q \cdot u) \in R_0$. The condition can be equivalently written as

$(p, q) \in R$ if and only if $(\forall u \in A^* : p \cdot u \in F \implies q \cdot u \in F)$. (1)

It follows that $R$ is a quasiorder on $Q$ and we can consider the corresponding equivalence relation $\rho = \{ (p, q) \mid (p, q) \in R, (q, p) \in R \}$ on the set $Q$. Then the quotient set $Q/\rho = \{ [q]_\rho \mid q \in Q \}$ has a structure of the automaton: the rule $[q]_\rho \cdot_\rho a = [q \cdot a]_\rho$, for each $q \in Q$ and $a \in A$, defines correctly actions by letters using (1). Furthermore, the relation $\le$ on $Q/\rho$ defined by the rule $[p]_\rho \le [q]_\rho$ iff $(p, q) \in R$, is an order on $Q/\rho$ compatible with actions by letters. So, $\mathcal{A}_\rho = (Q/\rho, A, \cdot_\rho, \le, [i]_\rho, F_\rho)$, where $F_\rho = \{ [f]_\rho \mid f \in F \}$, is an ordered automaton recognizing $L(\mathcal{A})$. Moreover, if there are two states $[p]_\rho, [q]_\rho \in Q/\rho$ such that $L(\mathcal{A}_\rho, [p]_\rho) = L(\mathcal{A}_\rho, [q]_\rho)$, then $(p, q) \in \rho$. Thus, the ordered automaton $\mathcal{A}_\rho$ is isomorphic to the minimal ordered automaton of the language $L(\mathcal{A})$.
□

Note also that the classical power-set construction turns a nondeterministic automaton into an ordered deterministic automaton which is ordered by the set-theoretical inclusion. Thus, for the purpose of a construction of the minimal ordered automaton, one may also use Brzozowski's minimization algorithm using power-set construction for the reverse of the given language.
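The refinement procedure from the proof of Proposition 6 can be sketched as follows. This is a naive fixpoint iteration written for readability, not an optimized Hopcroft-style implementation, and the encoding (transition dictionary keyed by `(state, letter)`, equivalence classes as `frozenset`s) as well as the function name `minimal_ordered_automaton` are assumptions made for this illustration.

```python
def minimal_ordered_automaton(alphabet, delta, initial, final):
    """Return (states, delta, leq, initial, final) of the minimal ordered automaton.

    delta: dict mapping (state, letter) -> state of a complete DFA.
    The states of the result are frozensets of mutually equivalent input states.
    """
    # Restrict to the states reachable from the initial state.
    reach, stack = {initial}, [initial]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in reach:
                reach.add(r)
                stack.append(r)

    # Starting relation (Q x F) U ((Q \ F) x Q): p final implies q final.
    R = {(p, q) for p in reach for q in reach if q in final or p not in final}

    # Remove pairs (p, q) whose successors under some letter fall outside R.
    changed = True
    while changed:
        changed = False
        for p, q in list(R):
            if any((delta[p, a], delta[q, a]) not in R for a in alphabet):
                R.discard((p, q))
                changed = True

    # Quotient by the equivalence rho = R intersected with its converse.
    cls = {q: frozenset(p for p in reach if (p, q) in R and (q, p) in R)
           for q in reach}
    new_delta = {(cls[q], a): cls[delta[q, a]] for q in reach for a in alphabet}
    new_leq = {(cls[p], cls[q]) for p, q in R}
    new_final = {cls[q] for q in reach if q in final}
    return set(cls.values()), new_delta, new_leq, cls[initial], new_final
```

For a one-letter DFA recognizing $\{a\}$ with states `0 -> 1 -> 2 -> 2` and final state `1`, the sketch keeps three states, and the sink state, whose language is empty, ends up below the other two in the resulting order.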
To get an Eilenberg correspondence between classes of languages and the classes of semiautomata we need an appropriate definition of a variety of semiautomata. The notion of variety of semiautomata will be given in terms of closure properties with respect to certain constructions on semiautomata.

Positive $\mathcal{C}$-varieties of languages are closed under quotients, therefore the choice of an initial state and final states in ordered automata can be left free due to Lemma 4.

If an ordered automaton $\mathcal{O} = (Q, A, \cdot, \le, i, F)$ is given, then we denote by $\overline{\mathcal{O}}$ the corresponding ordered semiautomaton $(Q, A, \cdot, \le)$. In particular, for the canonical ordered automaton $\mathcal{D}_L = (D_L, A, \cdot, \subseteq, L, F_L)$ of the language $L$, we have $\overline{\mathcal{D}_L} = (D_L, A, \cdot, \subseteq)$.

Since positive $\mathcal{C}$-varieties of languages are closed under taking finite unions and intersections, we include the closedness with respect to direct products of ordered semiautomata.

Definition 1.
Let $n \ge 1$ be a natural number. Let $\mathcal{O}_j = (Q_j, A, \cdot_j, \le_j)$ be an ordered semiautomaton for $j = 1, \dots, n$. We define the ordered semiautomaton $\mathcal{O}_1 \times \dots \times \mathcal{O}_n = (Q_1 \times \dots \times Q_n, A, \cdot, \le)$ as follows:
– for each $a \in A$, we put $(q_1, \dots, q_n) \cdot a = (q_1 \cdot_1 a, \dots, q_n \cdot_n a)$ and
– we have $(p_1, \dots, p_n) \le (q_1, \dots, q_n)$ if and only if, for each $j = 1, \dots, n$, the inequality $p_j \le_j q_j$ is valid.
The ordered semiautomaton $\mathcal{O}_1 \times \dots \times \mathcal{O}_n$ is called a product of the ordered semiautomata $\mathcal{O}_1, \dots, \mathcal{O}_n$.

We would like to know which languages are recognized by a product of ordered semiautomata.
Lemma 2.
Let the ordered semiautomaton $\mathcal{O}$ be the product of the ordered semiautomata $\mathcal{O}_1, \dots, \mathcal{O}_n$. Then the following holds:
(i) If, for each $j = 1, \dots, n$, the language $L_j$ is recognized by $\mathcal{O}_j$, then both $L_1 \cap \dots \cap L_n$ and $L_1 \cup \dots \cup L_n$ are recognized by $\mathcal{O}$.
(ii) If the language $L$ is recognized by $\mathcal{O}$, then $L$ is a finite union of finite intersections of languages recognized by $\mathcal{O}_1, \dots, \mathcal{O}_n$.

Proof. Let $\mathcal{O}_j = (Q_j, A, \cdot_j, \le_j)$, $j = 1, \dots, n$. Denote $Q = Q_1 \times \dots \times Q_n$ and $\mathcal{O} = (Q, A, \cdot, \le)$.
(i) Let $F_1, \dots, F_n$ be the sets of final states used for recognition of the languages $L_1, \dots, L_n$, and take as the initial state of $\mathcal{O}$ the tuple of the corresponding initial states. Put $F = F_1 \times \dots \times F_n$ for the intersection $L_1 \cap \dots \cap L_n$ and $F = \{ (q_1, \dots, q_n) \mid \text{there exists } j \in \{1, \dots, n\} \text{ such that } q_j \in F_j \}$ for the union $L_1 \cup \dots \cup L_n$. It is not hard to see that, in both cases, $F$ is indeed an upward closed subset.
(ii) Let $L$ be a language recognized by $(Q, A, \cdot, \le)$, i.e. let $F$ be an upward closed subset of $Q$ and $i \in Q$ such that $L$ is recognized by $(Q, A, \cdot, \le, i, F)$. Since $F = \bigcup_{p \in F} \uparrow p$, we see that $L = \bigcup_{p \in F} L_p$, where $L_p$ is recognized by the ordered automaton $(Q, A, \cdot, \le, i, \uparrow p)$. Furthermore, for such $p$, we have $p = (p_1, \dots, p_n)$ and we can write $\uparrow p = \uparrow p_1 \times \dots \times \uparrow p_n$. Let $i = (i_1, \dots, i_n)$ and let $L_{(p,j)}$ be the language recognized by the ordered automaton $(Q_j, A, \cdot_j, \le_j, i_j, \uparrow p_j)$. Then one can check that $L_p = L_{(p,1)} \cap \dots \cap L_{(p,n)}$. □

Also the following construction is useful.
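Before that construction, part (i) of the product lemma above can be illustrated on a small instance. The sketch below assumes that each ordered semiautomaton is given as a triple `(states, delta, leq)` over a common alphabet, with `delta` keyed by `(state, letter)` and `leq` a set of pairs; the helper names `product_semiautomaton` and `run` are hypothetical.

```python
from itertools import product as cartesian


def product_semiautomaton(parts, alphabet):
    """Componentwise product of ordered semiautomata.

    parts: list of (states, delta, leq) triples over the same alphabet.
    Returns (states, delta, leq) of the product, with tuples as states.
    """
    states = list(cartesian(*(s for s, _, _ in parts)))
    # Letters act componentwise.
    delta = {(q, a): tuple(d[qj, a] for qj, (_, d, _) in zip(q, parts))
             for q in states for a in alphabet}
    # p <= q holds iff it holds in every component.
    leq = {(p, q) for p in states for q in states
           if all((pj, qj) in l for pj, qj, (_, _, l) in zip(p, q, parts))}
    return states, delta, leq


def run(delta, state, word):
    """Extend the transition function from letters to words."""
    for a in word:
        state = delta[state, a]
    return state


order = {(0, 0), (1, 1), (0, 1)}
# First factor detects the letter 'a', second factor detects the letter 'b'.
has_a = ([0, 1], {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}, order)
has_b = ([0, 1], {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}, order)
states, delta, leq = product_semiautomaton([has_a, has_b], "ab")
F_and = {(1, 1)}                                   # F_1 x F_2
F_or = {q for q in states if 1 in q}               # some component is final
```

Starting from `(0, 0)`, the word `ab` reaches a state in `F_and`, while `aa` reaches a state that lies in `F_or` only, matching the intersection and union of the two component languages.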
Definition 3.
Let $I = \{1, \dots, n\}$ be a non-empty finite set and, for each $j \in I$, let $\mathcal{Q}_j = (Q_j, A, \cdot_j, \le_j)$ be an ordered semiautomaton. We define the disjoint union $\mathcal{Q} = (Q, A, \cdot, \le)$ of ordered semiautomata $\mathcal{Q}_1, \dots, \mathcal{Q}_n$ in the following way:
– $Q = \{ (q, j) \mid j \in I, q \in Q_j \}$,
– for each $a \in A$ and $(q, j) \in Q$, we put $(q, j) \cdot a = (q \cdot_j a, j)$ and
– we put $(q, j) \le (p, k)$ if and only if $j = k$ and $q \le_j p$.

Clearly, $L$ is recognized by a disjoint union of ordered semiautomata if and only if it is recognized by some of them. A further useful notion is a homomorphism of ordered semiautomata.

Definition 4.
Let $(Q, A, \cdot, \le)$ and $(P, A, \circ, \preceq)$ be ordered semiautomata and $\varphi : Q \to P$ be a mapping. Then $\varphi$ is called a homomorphism of ordered semiautomata if it is isotone and $\varphi(q \cdot a) = \varphi(q) \circ a$ for all $a \in A$, $q \in Q$. If there exists a surjective homomorphism of ordered semiautomata from $(Q, A, \cdot, \le)$ to $(P, A, \circ, \preceq)$, then we say that $(P, A, \circ, \preceq)$ is a homomorphic image of $(Q, A, \cdot, \le)$. We say that $\varphi$ is backward order preserving if, for every $p, q \in Q$, the inequality $\varphi(p) \preceq \varphi(q)$ implies $p \le q$. If the homomorphism $\varphi$ is surjective and backward order preserving, then we say that $(Q, A, \cdot, \le)$ is isomorphic to $(P, A, \circ, \preceq)$.

In what follows, we often use simply $(P, A, \cdot, \le)$ instead of $(P, A, \circ, \preceq)$. Note that every backward order preserving mapping is injective: if $\varphi(p) = \varphi(q)$, then $p \le q$ and $q \le p$, hence $p = q$.

In the setting of the previous definition, one can prove by induction with respect to the length of words that, for an arbitrary homomorphism $\varphi$ of semiautomata, the equality $\varphi(q \cdot u) = \varphi(q) \circ u$ holds for every state $q \in Q$ and every word $u \in A^*$.

Lemma 5.
Let an ordered semiautomaton $(P, A, \cdot, \le)$ be a homomorphic image of an ordered semiautomaton $(Q, A, \cdot, \le)$ and $L$ be recognized by $(P, A, \cdot, \le)$. Then $L$ is also recognized by $(Q, A, \cdot, \le)$.

Proof. If $L$ is recognized by an ordered automaton $\mathcal{P} = (P, A, \cdot, \le, i, F)$, with $F$ being an upward closed subset, and $\varphi$ is a surjective homomorphism of a semiautomaton $(Q, A, \cdot, \le)$ onto the semiautomaton $\overline{\mathcal{P}} = (P, A, \cdot, \le)$, then we can choose some $i' \in Q$ such that $\varphi(i') = i$ and we can consider $F' = \{ q \in Q \mid \varphi(q) \in F \}$. Now $F'$ is an upward closed subset in $(Q, \le)$, because $\varphi$ is an isotone mapping and $F$ is upward closed. Moreover, for an arbitrary $u \in A^*$, the following is a chain of equivalent statements: $i' \cdot u \in F'$, $\varphi(i' \cdot u) \in F$, $\varphi(i') \cdot u \in F$, $i \cdot u \in F$, and $u \in L$. The statement of the lemma follows. □

Definition 6.
An ordered semiautomaton $(Q, A, \cdot, \le)$ is trivial if $q \cdot a = q$ for all $q \in Q$ and $a \in A$, and $\le$ is the equality relation on $Q$. In particular, for a natural number $n$, we define the ordered semiautomaton $\mathcal{T}_n(A) = (I_n, A, \cdot, =)$, where $I_n = \{1, \dots, n\}$ and the transition function $\cdot$ is defined by the rule $j \cdot a = j$ for all $j \in I_n$ and $a \in A$.

It follows directly from the definition that every trivial ordered semiautomaton is isomorphic to some $\mathcal{T}_n(A) = (I_n, A, \cdot, =)$.

Lemma 7.
The disjoint union of ordered semiautomata $\mathcal{Q}_j = (Q_j, A, \cdot_j, \le_j)$, $j \in \{1, \dots, n\}$, is a homomorphic image of the product $\mathcal{Q}_1 \times \dots \times \mathcal{Q}_n \times \mathcal{T}_n(A)$.

Proof. Clearly, the mapping defined by the rule $\varphi : (q_1, \dots, q_n, j) \mapsto (q_j, j)$, for every $q_1 \in Q_1, \dots, q_n \in Q_n$, $j \in \{1, \dots, n\}$, is a surjective homomorphism of the considered semiautomata. □

Definition 8.
Let $(Q, A, \cdot, \le)$ be an ordered semiautomaton and $P \subseteq Q$ be a non-empty subset. If $p \cdot a \in P$ for every $p \in P$, $a \in A$, then $(P, A, \cdot, \le)$ is called a subsemiautomaton of $(Q, A, \cdot, \le)$.

In the previous definition, the transition function and order on $P$ are restrictions of the corresponding data on the set $Q$ and so they are denoted by the same symbols.

Using the notions of a subsemiautomaton and a homomorphic image, we can formulate the minimality of the canonical ordered semiautomaton in a more precise way.

Lemma 9.
Let $(Q, A, \cdot, \le)$ be an ordered semiautomaton recognizing the language $L$. Then the canonical ordered semiautomaton $\overline{\mathcal{D}_L}$ is a homomorphic image of some subsemiautomaton of $(Q, A, \cdot, \le)$.

Proof. Let $L$ be recognized by the ordered automaton $\mathcal{A} = (Q, A, \cdot, \le, i, F)$. It is easy to see that the subset $Q' = \{ i \cdot u \mid u \in A^* \}$ considered in Lemma 2 (iii) on DFAs forms a subsemiautomaton of $(Q, A, \cdot, \le)$. Furthermore, we defined there the mapping $\varphi : Q' \to D_L$ by the rule $\varphi(q) = u^{-1}L$, where $q = i \cdot u$. This mapping $\varphi$ is a surjective homomorphism of ordered semiautomata. □

We say that an ordered semiautomaton $(Q, A, \cdot, \le)$ is 1-generated if there exists a state $i \in Q$ such that $Q = \{ i \cdot u \mid u \in A^* \}$.

Lemma 10.
Let $(Q, A, \cdot, \le)$ be a 1-generated ordered semiautomaton. Then this semiautomaton is isomorphic to a subsemiautomaton of a product of the canonical ordered semiautomata of languages recognized by the ordered semiautomaton $(Q, A, \cdot, \le)$.

Proof. Let $\mathcal{A} = (Q, A, \cdot, \le)$ be a 1-generated ordered semiautomaton, i.e. $Q = \{ i \cdot u \mid u \in A^* \}$ for some $i \in Q$. For each $q \in Q$, we consider the ordered automaton $\mathcal{A}_q = (Q, A, \cdot, \le, i, \uparrow q)$. This automaton recognizes the language $L_{\mathcal{A}_q}$, which we denote by $L(q)$. Using Lemma 9, there is a surjective homomorphism $\varphi_q : \mathcal{A} \to \overline{\mathcal{D}_{L(q)}}$ of ordered semiautomata, because $\mathcal{A}$ is 1-generated and thus $Q' = Q$ here.

Assume that $\mathcal{A}$ has $n$ states and denote them by $q_1, \dots, q_n$. We can consider the product of the canonical ordered semiautomata $\overline{\mathcal{D}_{L(q_k)}} = (D_{L(q_k)}, A, \cdot_{q_k}, \subseteq_{q_k})$ for all $k = 1, \dots, n$, i.e. $\overline{\mathcal{D}_{L(q_1)}} \times \dots \times \overline{\mathcal{D}_{L(q_n)}}$. Moreover, since we have the mapping $\varphi_q$ for each $q \in Q$, we can define a mapping $\varphi : Q \to D_{L(q_1)} \times \dots \times D_{L(q_n)}$ by the rule $\varphi(p) = (\varphi_{q_1}(p), \dots, \varphi_{q_n}(p))$, for $p \in Q$. To prove the statement of the lemma it is enough to show that this mapping $\varphi$ is a backward order preserving homomorphism of ordered semiautomata.

Let $p \le q$ hold in $Q$. For each $r \in Q$ the homomorphism $\varphi_r$ is isotone and hence $\varphi_r(p) \subseteq_r \varphi_r(q)$. Thus we get $\varphi(p) \le \varphi(q)$ and we see that $\varphi$ is an isotone mapping. In a similar way one can check that $\varphi(p \cdot a) = \varphi(p) \cdot a$ for every $p \in Q$ and $a \in A$. These facts mean that $\varphi$ is a homomorphism of ordered semiautomata.

Now assume that $p$ and $q$ are two states in $Q$ such that $\varphi(p) \le \varphi(q)$. Then we have $\varphi_p(p) \subseteq_p \varphi_p(q)$ in $\overline{\mathcal{D}_{L(p)}}$. By the definition of the mapping $\varphi_p$ given in Lemma 2 (iii) on DFAs, for each $r \in Q$, we have $\varphi_p(r) = L_{\mathcal{A}_p, r}$. In particular, we can write $L_{\mathcal{A}_p, p} \subseteq L_{\mathcal{A}_p, q}$. Since $p$ is a final state in $\mathcal{A}_p$, we have $\lambda \in L_{\mathcal{A}_p, p}$ and therefore $\lambda \in L_{\mathcal{A}_p, q}$, which means that $q$ is a final state in $\mathcal{A}_p$ too. In other words $q \in \uparrow p$, i.e. $q \ge p$. Thus the mapping $\varphi$ is backward order preserving. □

Lemma 11.
Let $\mathcal{A} = (Q, A, \cdot, \le)$ be an ordered semiautomaton. Then the semiautomaton $\mathcal{A}$ is a homomorphic image of a disjoint union of its 1-generated subsemiautomata.

Proof. For every $q \in Q$, we consider the subset of $Q$ given by $Q_q = \{ q \cdot u \mid u \in A^* \}$ consisting of all states reachable from $q$. Clearly, $Q_q$ forms a 1-generated subsemiautomaton of $\mathcal{A} = (Q, A, \cdot, \le)$. We consider the disjoint union of all these semiautomata. The set of all states of this ordered semiautomaton is $P = \{ (p, q) \mid \text{there exists } u \in A^* \text{ such that } q \cdot u = p \}$. We show that the mapping $\varphi : P \to Q$ given by the rule $\varphi((p, q)) = p$ is a surjective homomorphism of ordered semiautomata. Indeed, for $a \in A$, we have $(p, q) \cdot a = (p \cdot a, q)$, and hence $\varphi((p, q) \cdot a) = \varphi((p \cdot a, q)) = p \cdot a = \varphi((p, q)) \cdot a$. Moreover, $(p, q) \le (p', q')$ in $P$ implies $q = q'$ and $p \le p'$, which means that $\varphi((p, q)) \le \varphi((p', q'))$. Finally, the surjectivity follows from the fact $\{ (q, q) \mid q \in Q \} \subseteq P$. □

Since positive $\mathcal{C}$-varieties of languages are closed under taking preimages in morphisms from the category $\mathcal{C}$, we need a construction on ordered semiautomata which enables the recognition of such languages.

Definition 12.
Let $f : B^* \to A^*$ be a homomorphism and $\mathcal{A} = (Q, A, \cdot, \le)$ be an ordered semiautomaton. By $\mathcal{A}^f$ we denote the semiautomaton $(Q, B, \cdot^f, \le)$ where $q \cdot^f b = q \cdot f(b)$ for every $q \in Q$ and $b \in B$. We speak about the $f$-renaming of $\mathcal{A}$ and we say that $(P, B, \circ, \preceq)$ is an $f$-subsemiautomaton of $(Q, A, \cdot, \le)$ if it is a subsemiautomaton of $\mathcal{A}^f$. In other words, $P \subseteq Q$, the order $\preceq$ is the restriction of $\le$, and $\circ$ is a restriction of $\cdot^f$.

If we consider $f = \mathrm{id}_A : A^* \to A^*$, we can see that $(Q, A, \cdot, \le)^{\mathrm{id}} = (Q, A, \cdot, \le)$ and that $\mathrm{id}$-subsemiautomata of $(Q, A, \cdot, \le)$ are exactly its subsemiautomata.

Lemma 13.
Consider a homomorphism $f : B^* \to A^*$ of monoids.
(i) Let $L$ be a regular language which is recognized by an ordered automaton $(Q, A, \cdot, \le, i, F)$. Then the automaton $\mathcal{B} = (Q, B, \circ, \le, i, F)$, where $q \circ b = q \cdot f(b)$ for every $q \in Q$, $b \in B$, recognizes the language $f^{-1}(L)$.
(ii) Let $\mathcal{A} = (Q, A, \cdot, \le)$ be an ordered semiautomaton. If $K \subseteq B^*$ is recognized by the ordered semiautomaton $\mathcal{A}^f$, then there exists a language $L \subseteq A^*$ recognized by $\mathcal{A}$ such that $K = f^{-1}(L)$.

Proof. (i) For every $u \in B^*$, we have the following chain of equivalent formulas: $u \in f^{-1}(L)$, $f(u) \in L$, $i \cdot f(u) \in F$, $i \circ u \in F$, and $u \in L_{\mathcal{B}}$.
(ii) If $K \subseteq B^*$ is recognized by the ordered semiautomaton $\mathcal{A}^f$ then there is a state $i \in Q$ and an upward closed subset $F \subseteq Q$ such that $K = L_{\mathcal{B}}$, where $\mathcal{B} = (Q, B, \cdot^f, \le, i, F)$. Now we consider $L = L_{\mathcal{A}'}$, where $\mathcal{A}' = (Q, A, \cdot, \le, i, F)$. Then the equality $K = f^{-1}(L)$ follows from part (i). □

Lemma 14.
Let $f : B^* \to A^*$ be an arbitrary homomorphism.
(i) If an ordered semiautomaton $\mathcal{B} = (P, A, \circ, \preceq)$ is a homomorphic image of an ordered semiautomaton $\mathcal{A} = (Q, A, \cdot, \le)$, then $\mathcal{B}^f$ is a homomorphic image of the ordered semiautomaton $\mathcal{A}^f$.
(ii) If an ordered semiautomaton $\mathcal{B} = (P, A, \circ, \preceq)$ is a subsemiautomaton of an ordered semiautomaton $\mathcal{A} = (Q, A, \cdot, \le)$, then $\mathcal{B}^f$ is a subsemiautomaton of $\mathcal{A}^f$.
(iii) If an ordered semiautomaton $\mathcal{B} = \mathcal{A}_1 \times \dots \times \mathcal{A}_n$ is a product of a family of ordered semiautomata $\mathcal{A}_1, \dots, \mathcal{A}_n$, then $\mathcal{B}^f = \mathcal{A}_1^f \times \dots \times \mathcal{A}_n^f$.

Proof. (i) Let $\varphi$ be a surjective homomorphism from an ordered semiautomaton $(Q, A, \cdot, \le)$ onto $(P, A, \circ, \preceq)$. Then $\varphi$ is an isotone mapping from $(Q, \le)$ to $(P, \preceq)$. The states and the order in the semiautomata $(Q, A, \cdot, \le)^f$ and $(P, A, \circ, \preceq)^f$ are unchanged, and hence $\varphi$ is an isotone mapping from $(Q, A, \cdot, \le)^f$ onto $(P, A, \circ, \preceq)^f$. Now let $b \in B$ be an arbitrary letter and $q \in Q$ be an arbitrary state. Then $\varphi(q \cdot^f b) = \varphi(q \cdot f(b)) = \varphi(q) \circ f(b) = \varphi(q) \circ^f b$. Therefore $\varphi$ is a surjective homomorphism of ordered semiautomata.
(ii) This is clear.
(iii) Let $\mathcal{A}_j = (Q_j, A, \cdot_j, \le_j)$ be an ordered semiautomaton for every $j = 1, \dots, n$. Let $(P, A, \circ, \preceq)$ be the product $\mathcal{A}_1 \times \dots \times \mathcal{A}_n$ and $(R, B, \diamond, \sqsubseteq)$ be the product of ordered semiautomata $\mathcal{A}_1^f \times \dots \times \mathcal{A}_n^f$. Directly from the definitions we have that $P = R = Q_1 \times \dots \times Q_n$ and ${\preceq} = {\sqsubseteq}$. Furthermore, for an arbitrary element $(q_1, \dots, q_n)$ from the set $P = R$, we have
$$(q_1, \dots, q_n) \circ^f b = (q_1, \dots, q_n) \circ f(b) = (q_1 \cdot_1 f(b), \dots, q_n \cdot_n f(b)) = (q_1 \cdot_1^f b, \dots, q_n \cdot_n^f b),$$
which is equal to $(q_1, \dots, q_n) \diamond b$. This means that the action by each letter $b$ is defined in the ordered semiautomaton $(P, A, \circ, \preceq)^f$ in the same way as in the ordered semiautomaton $(R, B, \diamond, \sqsubseteq)$.
⊓⊔

C-Varieties of Ordered Semiautomata

Definition 1.
Let C be a category of homomorphisms. A positive C-variety of ordered semiautomata $V$ associates to every non-empty finite alphabet $A$ a class $V(A)$ of ordered semiautomata over $A$ in such a way that
 – $V(A) \ne \emptyset$ is closed under disjoint unions and direct products of non-empty finite families, and under homomorphic images,
 – $V$ is closed under $f$-subsemiautomata for all $(f : B^* \to A^*) \in$ C.

Remark 2. We define $T(A)$ as the class of all trivial ordered semiautomata over an alphabet $A$, i.e. $T(A)$ contains all semiautomata $T_n(A)$ and all their isomorphic copies. By Lemma 7, the first condition in the definition of a positive C-variety of ordered semiautomata can be written equivalently in the following way: $T(A) \subseteq V(A)$ and $V(A)$ is closed under direct products of non-empty finite families and homomorphic images. In particular, the class $T$ of all trivial ordered semiautomata forms the smallest positive C-variety of ordered semiautomata whenever the considered category C contains all isomorphisms.

As mentioned in the introduction, Theorem 6.3 has already been proved in special cases. The technical difference is that Ésik and Ito [8] used disjoint unions of automata, whereas Chaubard, Pin and Straubing [4] avoided this construction by using trivial automata instead.

Now we are ready to state the Eilenberg type correspondence for positive C-varieties of ordered semiautomata. For each positive C-variety of ordered semiautomata $V$, we denote by $\alpha(V)$ the class of regular languages given by the formula
$$(\alpha(V))(A) = \{ L \subseteq A^* \mid D_L \in V(A) \}.$$
For each positive C-variety of regular languages $\mathcal{L}$, we denote by $\beta(\mathcal{L})$ the positive C-variety of ordered semiautomata generated by all ordered semiautomata $D_L$, where $L \in \mathcal{L}(A)$ for some alphabet $A$.

The following result can be obtained as a combination of Theorem 5.1 of [19] and the main result of [4]. We show a direct and detailed proof here.

Theorem 6.3.
Let C be a category of homomorphisms. The mappings $\alpha$ and $\beta$ are mutually inverse isomorphisms between the lattice of all positive C-varieties of ordered semiautomata and the lattice of all positive C-varieties of regular languages.

Proof. First of all, we fix a category of homomorphisms C for the whole proof. The proof will be complete once we show the following statements:
1. $\alpha$ is correctly defined, i.e. for every positive C-variety of ordered semiautomata $V$, the class $\alpha(V)$ is a positive C-variety of languages.
2. $\beta$ is correctly defined, i.e. for every positive C-variety of regular languages $\mathcal{L}$, the class $\beta(\mathcal{L})$ is a positive C-variety of ordered semiautomata.
3. $\beta \circ \alpha = \mathrm{id}$, i.e. for each positive C-variety of ordered semiautomata $V$ we have $\beta(\alpha(V)) = V$.
4. $\alpha \circ \beta = \mathrm{id}$, i.e. for each positive C-variety of languages $\mathcal{L}$, we have $\alpha(\beta(\mathcal{L})) = \mathcal{L}$.
We prove these facts in separate lemmas. The exception is the second item, which follows trivially from the definition of the mapping $\beta$. Before formulating these lemmas we prove some technicalities.

Lemma 4.
For each positive C-variety of ordered semiautomata $V$ and an alphabet $A$ we have
$$(\alpha(V))(A) = \{ L \subseteq A^* \mid \exists\, \mathcal{A} = (Q, A, \cdot, \le, i, F) : L = L_{\mathcal{A}} \text{ and } \mathcal{A} \in V(A) \}.$$
Proof.
The inclusion “$\subseteq$” is trivial, because one can take for the ordered automaton $\mathcal{A}$ the canonical automaton $D_L$. To prove the opposite inclusion, let $L = L_{\mathcal{A}}$, where $\mathcal{A} = (Q, A, \cdot, \le, i, F)$ with $\mathcal{A} \in V(A)$. By Lemma 9 and the assumption that $V$ is closed under taking subsemiautomata and homomorphic images, we have that $D_L \in V(A)$. Therefore $L \in (\alpha(V))(A)$. ⊓⊔

Lemma 5. If $V$ is a positive C-variety of ordered semiautomata, then $\alpha(V)$ is a positive C-variety of regular languages.

Proof. We need to prove that $(\alpha(V))(A)$ is closed under taking intersections, unions and quotients. Secondly, we must show the closure property with respect to taking preimages under morphisms from the category C.
For each $A$, the class $(\alpha(V))(A)$ given by the formula from Lemma 4 is closed under unions and intersections of finite families, since $V(A)$ is closed under products of finite families (see Lemma 2). The class $(\alpha(V))(A)$ is also closed under quotients, since we can change initial and final states freely (see Lemma 4).
Furthermore, Lemma 13 yields the following observation: since $V$ is closed under taking $f$-subsemiautomata for each homomorphism $f$ from C, the class $\alpha(V)$ is closed under preimages under the same homomorphisms. ⊓⊔

All three constructions (direct product, homomorphic image and subsemiautomaton) are standard constructions of universal algebra. From the general theory (see e.g. Burris and Sankappanavar [3]) we want to use only the following fact: if one needs to generate the smallest class closed with respect to all three constructions together and containing a class $X$, then it is enough to consider homomorphic images of subalgebras of products of algebras from $X$. Note that from this point of view, an alphabet $A$ is fixed, and $A$ serves as a set of unary function symbols.
Then a semiautomaton over $A$ is a unary algebra.

For a class of ordered semiautomata $X$ over a fixed alphabet $A$ we denote by
 – $H X$ the class of all homomorphic images of ordered semiautomata from $X$,
 – $I X$ the class of all isomorphic copies of ordered semiautomata from $X$,
 – $S X$ the class of all subsemiautomata of ordered semiautomata from $X$,
 – $P X$ the class of all products of non-empty finite families of ordered semiautomata from $X$,
 – $D X$ the class of all disjoint unions of non-empty finite families of ordered semiautomata from $X$.
It is clear that the operators $H$, $I$ and $S$ are idempotent, i.e. for each class of ordered semiautomata $X$ we have $HH X = H X$ etc. Furthermore, $IPP = IP$ and $IDD = ID$.

Lemma 6.
For each class X of ordered semiautomata over a fixed alphabet A ,we have: D X ⊆ HP ( X ∪ T ( A )) and PH X ⊆ HP X , SH X ⊆ HS X , PS X ⊆ SP X . Proof.
The first property follows from Lemma 7. The other properties are well-known facts from universal algebra, see e.g. [3, Chapter II, Section 9]; a modification for the ordered case is straightforward. ⊓⊔ Lemma 7.
For each positive C -variety of regular languages L we have ( β ( L ))( A ) = HSP ( { D L | L ∈ L ( A ) } ∪ T ( A ) ) . Proof.
For every alphabet $A$, we denote $X = \{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A)$ and we denote by $\mathbb{L}(A)$ the right-hand side of the formula in the statement, i.e. $\mathbb{L}(A) = HSP\,X$.

Since $\beta(\mathcal{L})$ is a positive C-variety of ordered semiautomata, we have $T(A) \subseteq (\beta(\mathcal{L}))(A)$ by Remark 2. Therefore $X \subseteq (\beta(\mathcal{L}))(A)$, and the inclusion $\mathbb{L}(A) \subseteq (\beta(\mathcal{L}))(A)$ follows from the fact that $(\beta(\mathcal{L}))(A)$ is closed under the operators $H$, $S$ and $P$.

To prove the opposite inclusion $(\beta(\mathcal{L}))(A) \subseteq \mathbb{L}(A)$, we first prove that $\mathbb{L}$ is a positive C-variety of ordered semiautomata. By the first property in Lemma 6, and since $T(A) \subseteq X \subseteq HSP\,X$, we get
$$D(\mathbb{L}(A)) = D\,HSP\,X \subseteq HP(HSP\,X).$$
By the other properties of Lemma 6 and idempotency of the operators we get $HP\,HSP\,X \subseteq HSP\,X = \mathbb{L}(A)$. Thus $D(\mathbb{L}(A)) \subseteq \mathbb{L}(A)$. In the same way one can prove the other inclusions $H(\mathbb{L}(A)) \subseteq \mathbb{L}(A)$, $S(\mathbb{L}(A)) \subseteq \mathbb{L}(A)$ and $P(\mathbb{L}(A)) \subseteq \mathbb{L}(A)$. Therefore $\mathbb{L}(A)$ is closed under all four operators $H$, $S$, $P$ and $D$.

It remains to prove that $\mathbb{L}$ is closed under $f$-renaming. So, let $f : B^* \to A^*$ belong to C. We need to show that $(Q, A, \cdot, \le)^f$ belongs to $\mathbb{L}(B)$ whenever $(Q, A, \cdot, \le)$ is from $\mathbb{L}(A)$. For an arbitrary set $Y$ of ordered semiautomata over the alphabet $A$, we denote $Y^f = \{ (Q, A, \cdot, \le)^f \mid (Q, A, \cdot, \le) \in Y \}$. Using this notation, we need to show that $(\mathbb{L}(A))^f \subseteq \mathbb{L}(B)$.

At first, we show the weaker inclusion $X^f \subseteq \mathbb{L}(B)$. Trivially $(T(A))^f = T(B)$. Now let $L$ be an arbitrary language from $\mathcal{L}(A)$. We consider $(D_L, A, \cdot, \subseteq)^f$, which is an ordered semiautomaton over $B$. By Lemma 13, the set $Z$ of all regular languages which are recognized by the ordered semiautomaton $(D_L, A, \cdot, \subseteq)^f$ contains only languages of the form $f^{-1}(K)$, where $K$ is recognized by $(D_L, A, \cdot, \subseteq)$. Since $\mathcal{L}(A)$ is closed under unions, intersections and quotients, every such language $K$ belongs to $\mathcal{L}(A)$ by Lemmas 4 and 5. This means that the set $Z$ is a subset of $\mathcal{L}(B)$, because $\mathcal{L}$ is closed under preimages under the homomorphism $f$. Therefore $\mathbb{L}(B)$ contains all canonical ordered semiautomata of languages from $Z$. Finally, $(D_L, A, \cdot, \subseteq)^f$ can be reconstructed from these canonical ordered semiautomata of languages from $Z$: by Lemma 11, the ordered semiautomaton $(D_L, A, \cdot, \subseteq)^f$ is a homomorphic image of a disjoint union of certain subsemiautomata which are isomorphic, by Lemma 10, to subsemiautomata of products of the canonical ordered semiautomata of languages from $Z$. Hence $(D_L, A, \cdot, \subseteq)^f$ belongs to $\mathbb{L}(B)$, which is closed under homomorphic images, subsemiautomata, products and disjoint unions as we proved above. So, we have proved $X^f \subseteq \mathbb{L}(B)$.

Now Lemma 14 has the following consequences: $(H Y)^f \subseteq H(Y^f)$, $(S Y)^f \subseteq S(Y^f)$ and $(P Y)^f \subseteq P(Y^f)$ for an arbitrary set $Y$ of ordered semiautomata over the alphabet $A$. Using all these properties we get $(\mathbb{L}(A))^f = (HSP\,X)^f \subseteq HSP(X^f) \subseteq HSP(\mathbb{L}(B)) = \mathbb{L}(B)$, because $\mathbb{L}(B)$ is closed under all three operators. Hence $\mathbb{L}$ is closed under taking $f$-subsemiautomata and therefore $\mathbb{L}$ is a positive C-variety of ordered semiautomata.

The inclusion $(\beta(\mathcal{L}))(A) \subseteq \mathbb{L}(A)$ follows from the definition of $\beta(\mathcal{L})$, which is the smallest positive C-variety of ordered semiautomata containing $X$. Since the opposite inclusion has also been proved, the proof of the lemma is finished. ⊓⊔

Lemma 8.
For each positive C-variety of ordered semiautomata $V$ we have $\beta(\alpha(V)) = V$.

Proof. Let $A$ be an arbitrary alphabet. By Lemma 7,
$$(\beta(\alpha(V)))(A) = HSP(\{ D_L \mid L \in (\alpha(V))(A) \} \cup T(A)).$$
We denote $X = \{ D_L \mid L \in (\alpha(V))(A) \}$. If we use the definition of the mapping $\alpha$, then we see that $X = \{ (Q, A, \cdot, \le) \in V(A) \mid \exists L \subseteq A^* : D_L = (Q, A, \cdot, \le) \}$. In particular $X \subseteq V(A)$. Since we also have $T(A) \subseteq V(A)$, we see that $X \cup T(A) \subseteq V(A)$. Hence
$$(\beta(\alpha(V)))(A) = HSP(X \cup T(A)) \subseteq V(A),$$
because $V(A)$ is closed under taking homomorphic images, subsemiautomata and products.

In the proof of Lemma 7 we already saw that, by Lemmas 10 and 11, every ordered semiautomaton $(Q, A, \cdot, \le)$ can be reconstructed from the canonical ordered automata of languages which are recognized by $(Q, A, \cdot, \le)$. Therefore $V(A) \subseteq HSP(X \cup T(A))$ and we have proved the equality $V(A) = HSP(X \cup T(A))$, which means that $\beta(\alpha(V)) = V$. ⊓⊔

Lemma 9.
Let $\mathcal{L}$ be a positive C-variety of languages. Then $\alpha(\beta(\mathcal{L})) = \mathcal{L}$.

Proof. We want to prove that for every $A$ the equality $(\alpha(\beta(\mathcal{L})))(A) = \mathcal{L}(A)$ holds. Let $L \in \mathcal{L}(A)$ be an arbitrary language. By the definition of the mapping $\beta$, we have $D_L \in (\beta(\mathcal{L}))(A)$. Therefore, by the definition of $\alpha$, we have $L \in (\alpha(\beta(\mathcal{L})))(A)$ and we have proved the inclusion “$\supseteq$”.

To prove the opposite one, let $K \in (\alpha(\beta(\mathcal{L})))(A)$ be an arbitrary language. Then there is an ordered automaton $\mathcal{A} = (Q, A, \cdot, \le, i, F)$ such that $K = L_{\mathcal{A}}$ and $\mathcal{A} \in (\beta(\mathcal{L}))(A) = HSP(\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A))$. If $K$ is recognized by $\mathcal{A}$, where $\mathcal{A} \in H X$ for the class of ordered semiautomata $X = SP(\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A))$, then there is an ordered automaton $\mathcal{B}$ such that $\mathcal{A}$ is a homomorphic image of $\mathcal{B} \in X$. By Lemma 5, the language $K$ is recognized by $\mathcal{B}$. Thus we can assume that $\mathcal{A}$ belongs to $SP(\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A))$. In the same way we can also assume that $\mathcal{A}$ belongs to $P(\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A))$. By Lemma 2, we know that $K$ is a finite union of finite intersections of languages which are recognized by ordered semiautomata from the class $\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A)$. Furthermore, trivial semiautomata recognize only the languages $\emptyset$ and $A^*$, which belong to every $\mathcal{L}(A)$; hence we may consider $\{ D_L \mid L \in \mathcal{L}(A) \}$ instead of $\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A)$ in the previous sentence. Since the canonical automaton $D_L$ recognizes only finite unions of finite intersections of quotients of the language $L$ (by Lemmas 4 and 5), and since $\mathcal{L}(A)$ is closed under taking quotients, unions and intersections, we see that $K$ belongs to $\mathcal{L}(A)$. ⊓⊔

The previous lemma finishes the proof of Theorem 6.3. ⊓⊔

C-Varieties of Semiautomata

For an ordered semiautomaton $(Q, A, \cdot, \le)$ we define its dual $(Q, A, \cdot, \le)^d = (Q, A, \cdot, \le^d)$, where $\le^d$ is the dual order to $\le$, i.e. $p \le^d q$ if and only if $q \le p$. Instead of the symbol $\le^d$ we usually use the symbol $\ge$. Trivially, the resulting structure $(Q, A, \cdot, \ge)$ is also an ordered semiautomaton. For a positive C-variety of ordered semiautomata $V$ we denote by $V^d$ its dual, i.e. for every alphabet $A$ we consider $V^d(A) = \{ (Q, A, \cdot, \le)^d \mid (Q, A, \cdot, \le) \in V(A) \}$. It is clear that $(V^d)^d = V$ and that $V^d$ is a positive C-variety of ordered semiautomata.

We say that $V$ is selfdual if $V^d = V$. In other words, $V$ is selfdual if and only if every $V(A)$ is closed under taking duals of its members. An alternative characterization follows.

Lemma 1.
Let $V$ be a positive C-variety of ordered semiautomata. Then $V$ is selfdual if and only if for each alphabet $A$ we have that $(Q, A, \cdot, \le) \in V(A)$ implies $(Q, A, \cdot, =) \in V(A)$.

Proof. If $V^d = V$ and $(Q, A, \cdot, \le) \in V(A)$, then we also have $(Q, A, \cdot, \ge) \in V(A)$. Now the ordered semiautomaton $(Q, A, \cdot, =)$ is isomorphic to a subsemiautomaton of the product of the ordered semiautomata $(Q, A, \cdot, \le)$ and $(Q, A, \cdot, \ge)$, namely the subsemiautomaton with the set of states $\{ (q, q) \mid q \in Q \}$.
To prove the converse, it is enough to see that an arbitrary ordered semiautomaton on $(Q, A, \cdot)$, in particular $(Q, A, \cdot, \ge)$, is a homomorphic image of the semiautomaton $(Q, A, \cdot, =)$: the identity mapping is a homomorphism of the considered ordered semiautomata. ⊓⊔

Recall that a C-variety of regular languages is a positive C-variety of languages which is closed under taking complements. The canonical ordered semiautomaton of the complement of a regular language $L$ is the dual of the canonical ordered semiautomaton of $L$, i.e. $D_{L^c} = (D_L)^d$. This easy observation helps to prove the following statement.

Proposition 2.
There is a one-to-one correspondence between C-varieties of regular languages and selfdual positive C-varieties of ordered semiautomata.

Proof. The mentioned correspondence is given by the pair of mappings $\alpha$ and $\beta$ from Theorem 6.3. For a selfdual positive C-variety of ordered semiautomata $V$, we know that $(\alpha(V))(A)$ is closed under complements. This means that $\alpha(V)$ is a C-variety of regular languages. Therefore, it remains to show that, for an arbitrary C-variety of regular languages $\mathcal{L}$, the positive C-variety of ordered semiautomata $\beta(\mathcal{L})$ is selfdual. By Lemma 7, $(\beta(\mathcal{L}))(A) = HSP(\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A))$, where the set $\{ D_L \mid L \in \mathcal{L}(A) \} \cup T(A)$ is selfdual. However, for every selfdual class of ordered semiautomata $X$, the classes $P X$, $S X$ and $H X$ are selfdual again. Hence $\beta(\mathcal{L})$ is selfdual. ⊓⊔

Since every ordered semiautomaton $(Q, A, \cdot, \le)$ is a homomorphic image of the ordered semiautomaton $(Q, A, \cdot, =)$, we can consider the notion of C-varieties of semiautomata instead of selfdual positive C-varieties of ordered semiautomata: C-varieties of semiautomata are classes of semiautomata which are closed under taking $f$-subsemiautomata, homomorphic images, disjoint unions and finite products.

Let $\mathbb{A}(A)$ be the class of all ordered semiautomata over the alphabet $A$. Notice that $\mathbb{A}$ forms the greatest positive C-variety of ordered semiautomata for each category C.

If we have a C-variety of semiautomata $V$, then we can consider all possible compatible orderings on these semiautomata and define the positive C-variety of ordered semiautomata $V^o$ by
$$V^o(A) = \{ (Q, A, \cdot, \le) \in \mathbb{A}(A) \mid (Q, A, \cdot) \in V(A) \}.$$
Clearly, $V^o$ is selfdual. Conversely, for a selfdual positive C-variety of ordered semiautomata $V$, we can consider
$$V^u(A) = \{ (Q, A, \cdot) \mid \text{there is an order } \le \text{ such that } (Q, A, \cdot, \le) \in V(A) \}.$$
Now the two mappings $V \mapsto V^o$ and $V \mapsto V^u$ are mutually inverse mappings between C-varieties of semiautomata and selfdual positive C-varieties of ordered semiautomata.

Using this easy correspondence, we obtain the following result as a consequence of Proposition 2.

Theorem 7.3.
There is a one-to-one correspondence between C-varieties of regular languages and C-varieties of semiautomata.

Note that this result can also be obtained by composing the results of Pin and Straubing [19] with those of Chaubard, Pin and Straubing [4].
In this section we present several instances of the Eilenberg type correspondence. Some of them are just reformulations of examples already mentioned in the existing literature. In particular, the first three subsections correspond to the pseudovarieties of aperiodic, R-trivial and J-trivial monoids, respectively. Also Subsection 8.4 has a natural counterpart in the pseudovariety of ordered monoids satisfying the inequality $1 \le x$. In all these cases, C is the category of all homomorphisms, denoted by $C_{all}$. Nevertheless, we believe that these correspondences viewed from the perspective of varieties of (ordered) semiautomata are of some interest. The other four subsections work with different categories C, and Subsections 8.6 and 8.8 bring new examples of (positive) C-varieties of (ordered) automata.

The star-free languages were characterized by Schützenberger [22] as the languages having aperiodic syntactic monoids. Here we recall the subsequent characterization of McNaughton and Papert [14] by counter-free automata.
Definition 1.
We say that a semiautomaton $(Q, A, \cdot)$ is counter-free if, for each $u \in A^*$, $q \in Q$ and $n \in \mathbb{N}$ such that $q \cdot u^n = q$, we have $q \cdot u = q$.

Proposition 2.
The class of all counter-free semiautomata forms a variety of semiautomata.

Proof. It is easy to see that disjoint unions, subsemiautomata, products and $f$-renamings of counter-free semiautomata are again counter-free.
Let $\varphi : (Q, A, \cdot) \to (P, A, \circ)$ be a surjective homomorphism of semiautomata and let $(Q, A, \cdot)$ be counter-free. We prove that $(P, A, \circ)$ is also a counter-free semiautomaton. Take $p \in P$, $u \in A^*$ and $n \in \mathbb{N}$ such that $p \circ u^n = p$. Let $q \in Q$ be an arbitrary state such that $\varphi(q) = p$. Then, for each $j \in \mathbb{N}$, we have $\varphi(q \cdot u^{jn}) = p \circ u^{jn} = p$. Since the set $\{ q, q \cdot u^n, q \cdot u^{2n}, \dots \}$ is finite, there exist $k, \ell \in \mathbb{N}$ such that $q \cdot u^{kn} = q \cdot u^{(k+\ell)n}$. If we take $r = q \cdot u^{kn}$, then $r \cdot u^{\ell n} = r$. Since $(Q, A, \cdot)$ is counter-free, we get $r \cdot u = r$. Consequently, $p \circ u = \varphi(r) \circ u = \varphi(r \cdot u) = \varphi(r) = p$. ⊓⊔

The promised link between languages and automata follows.
Proposition 3 (McNaughton, Papert [14]).
Star-free languages are exactly the languages recognized by counter-free semiautomata.

Note that this characterization is effective, although testing whether a regular language given by a DFA is aperiodic is a PSPACE-complete problem, as shown by Cho and Huynh [5].
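The defining condition of counter-free semiautomata can be checked directly, if inefficiently, by enumerating the transition monoid. The following Python sketch is only an illustration and is not taken from the cited literature; the encoding of the action as a nested dictionary `delta[q][a]` and the names `transition_monoid` and `is_counter_free` are assumptions made here. The enumeration may be exponential in the number of states.

```python
def transition_monoid(states, alphabet, delta):
    """Return all maps q -> q.u (u a word), each encoded as a tuple of state indices."""
    index = {q: k for k, q in enumerate(states)}
    letters = [tuple(index[delta[q][a]] for q in states) for a in alphabet]
    identity = tuple(range(len(states)))
    seen, frontier = {identity}, [identity]
    while frontier:
        t = frontier.pop()
        for g in letters:
            s = tuple(g[k] for k in t)  # act by the word of t first, then by one letter
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen


def is_counter_free(states, alphabet, delta):
    """Check that q.u^n = q for some n >= 1 implies q.u = q, for all q and u."""
    for t in transition_monoid(states, alphabet, delta):
        for q in range(len(t)):
            p = t[q]
            for _ in range(len(t)):  # a cycle through q has length at most |Q|
                if p == q:
                    break
                p = t[p]
            if p == q and t[q] != q:  # q returns to itself but is not fixed by u
                return False
    return True
```

For instance, a two-state semiautomaton in which a single letter swaps the states fails the test, while a semiautomaton whose letters only move states into a sink passes it.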
The content c( u ) of a word u ∈ A ∗ is the set of all letters occurring in u . Definition 4.
We say that a semiautomaton $(Q, A, \cdot)$ is acyclic if, for every $u \in A^+$ and $q \in Q$ such that $q \cdot u = q$, we have $q \cdot a = q$ for every $a \in \mathrm{c}(u)$.

Note that one of the conditions in Simon's characterization of piecewise testable languages is that a minimal DFA is acyclic; see [23]. One can prove the following proposition in a similar way as in the case of counter-free semiautomata.
Proposition 5.
The class of all acyclic semiautomata forms a variety of semiautomata.
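Acyclicity can be tested by a simple graph search. The sketch below rests on the following reformulation, which is an observation added here rather than a statement from the paper: a semiautomaton is acyclic exactly when no transition $q \cdot a \ne q$ can be followed by a path leading back to $q$. The dictionary encoding `delta[q][a]` and the function names are hypothetical.

```python
from collections import deque


def reachable(delta, start):
    """Breadth-first search: all states p with start.u = p for some word u."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        for p in delta[q].values():
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return seen


def is_acyclic(delta):
    """Return False iff some letter moves q to a different state that can return to q."""
    for q, row in delta.items():
        for p in row.values():
            if p != q and q in reachable(delta, p):
                return False
    return True
```

This version runs one search per transition, which is quadratic in the worst case; a single computation of strongly connected components would serve the same purpose.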
According to Pin [15, Chapter 4, Section 3], a semiautomaton $(Q, A, \cdot)$ is called extensive if there exists a linear order $\preceq$ on $Q$ such that $q \preceq q \cdot a$ for all $q \in Q$, $a \in A$. Note that such an order need not be compatible with the actions of letters. One can easily show that a semiautomaton is acyclic if and only if it is extensive. We prefer to use the term acyclic, since we consider extensive actions by letters (compatible with the ordering of a semiautomaton) later in the paper. In any case, whether a given semiautomaton is acyclic can be decided using the breadth-first search algorithm.

Proposition 6 (Pin [15]).
The languages over the alphabet $A$ accepted by acyclic semiautomata are exactly the disjoint unions of languages of the form
$$A_0^* a_1 A_1^* a_2 A_2^* \dots A_{n-1}^* a_n A_n^*,$$
where $a_i \notin A_{i-1} \subseteq A$ for $i = 1, \dots, n$.

Note that the languages above are exactly those having R-trivial syntactic monoids.

In our paper [11] concerning piecewise testable languages, we introduced a certain condition on automata motivated by the terminology from the theory of rewriting systems.
Definition 7.
We say that a semiautomaton $(Q, A, \cdot)$ is confluent if, for each state $q \in Q$ and every pair of words $u, v \in A^*$, there is a word $w \in A^*$ such that $\mathrm{c}(w) \subseteq \mathrm{c}(uv)$ and $(q \cdot u) \cdot w = (q \cdot v) \cdot w$.

In [11], this definition was studied in the context of acyclic (semi)automata, in which case several equivalent conditions were described. One of them can be rephrased in the following way.
Lemma 8.
Let $(Q, A, \cdot)$ be an acyclic semiautomaton. Then $(Q, A, \cdot)$ is confluent if and only if, for each $q \in Q$ and $u, v \in A^*$, we have $q \cdot u \cdot (uv)^{|Q|} = q \cdot v \cdot (uv)^{|Q|}$.

Proof. Assume that $(Q, A, \cdot)$ is a confluent acyclic semiautomaton and let $q \in Q$, $u, v \in A^*$ be arbitrary. We consider the sequence of states
$$q \cdot u,\ q \cdot u \cdot (uv),\ q \cdot u \cdot (uv)^2,\ \dots,\ q \cdot u \cdot (uv)^{|Q|}.$$
Since the sequence has more than $|Q|$ members, we have $p = q \cdot u \cdot (uv)^k = q \cdot u \cdot (uv)^\ell$ for some $0 \le k < \ell \le |Q|$. Since $(Q, A, \cdot)$ is acyclic, we have $p \cdot a = p$ for every $a \in \mathrm{c}(uv)$. Therefore $p = q \cdot u \cdot (uv)^k = q \cdot u \cdot (uv)^{k+1} = \dots = q \cdot u \cdot (uv)^{|Q|}$, and we have $p \cdot w = p$ for every $w \in A^*$ such that $\mathrm{c}(w) \subseteq \mathrm{c}(uv)$. Similarly, for $r = q \cdot v \cdot (uv)^{|Q|}$, we obtain the same property $r \cdot w = r$ for the same words $w$. Taking into account that $(Q, A, \cdot)$ is confluent, we obtain a word $w$ such that $\mathrm{c}(w) \subseteq \mathrm{c}(uv)$ and $p \cdot w = r \cdot w$. Hence $p = r$ and the first implication is proved. The second implication is evident. ⊓⊔

Using the condition from Lemma 8, one can prove that the class of all acyclic confluent semiautomata is a variety of semiautomata, similarly as in Proposition 2. It is mentioned in [11] that the defining condition is testable in polynomial time. Finally, the main result from [11] can be formulated in the following way.
Proposition 9 (Klíma and Polák [11]).
The variety of all acyclic confluent semiautomata corresponds to the variety of all piecewise testable languages.
We say that an ordered semiautomaton $(Q, A, \cdot, \le)$ has extensive actions if, for every $q \in Q$, $a \in A$, we have $q \le q \cdot a$. Clearly, the defining condition is testable in polynomial time. The transition ordered monoids of such ordered semiautomata are characterized by the inequality $1 \le x$. It is known [17, Proposition 8.4] that this inequality characterizes the positive variety of all finite unions of languages of the form
$$A^* a_1 A^* a_2 A^* \dots A^* a_\ell A^*, \quad \text{where } a_1, \dots, a_\ell \in A,\ \ell \ge 0.$$
Therefore we call them positive piecewise testable languages. In this way one can obtain the following statement, which we prove directly using the theory presented in this paper.
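The test for extensive actions amounts to one comparison per transition. A minimal Python sketch follows, assuming the action is given as a nested dictionary `delta[q][a]` and the order as a predicate `leq(p, q)`; both conventions, the function name and the state labels `"K"` and `"top"` are hypothetical choices made for this illustration.

```python
def has_extensive_actions(delta, leq):
    """Return True iff q <= q.a holds for every state q and every letter a."""
    return all(leq(q, p) for q, row in delta.items() for p in row.values())


# Illustration: the two quotients of A*aA* over A = {a, b}, namely "K" (= A*aA*)
# and "top" (= A*), ordered by inclusion, with the action given by left quotients.
delta = {"K": {"a": "top", "b": "K"}, "top": {"a": "top", "b": "top"}}
leq = lambda p, q: p == q or (p, q) == ("K", "top")
```

With these definitions, `has_extensive_actions(delta, leq)` holds, in agreement with the fact that $A^* a A^*$ is positive piecewise testable.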
Proposition 10.
The class of all ordered semiautomata with extensive actions is a positive variety of ordered semiautomata and corresponds to the positive variety of all positive piecewise testable languages.

Proof.
It is routine to check that the class of all ordered semiautomata with extensive actions is a positive variety of ordered semiautomata. By Theorem 6.3, we need to show that a language $L$ is positive piecewise testable if and only if its canonical ordered semiautomaton has extensive actions. To prove that the canonical semiautomaton of a positive piecewise testable language has extensive actions, it is enough to prove this fact for languages of the form $A^* a_1 A^* a_2 A^* \dots A^* a_\ell A^*$ with $a_1, \dots, a_\ell \in A$, $\ell \ge 0$. This observation follows from the description of the canonical ordered (semi)automaton of a language given in Section 4. Indeed, for every language $K = A^* b_1 A^* b_2 A^* \dots A^* b_k A^*$ and every letter $b \in A$ we have $K \subseteq b^{-1}K$, because $b^{-1}K = K$ or $b^{-1}K = A^* b_2 A^* \dots A^* b_k A^*$, depending on whether $b \ne b_1$ or $b = b_1$.

Assume that the canonical automaton $O_L = (D_L, A, \cdot, \subseteq, L, F_L)$ of a language $L$ has extensive actions; consequently $(D_L, A, \cdot)$ is an acyclic semiautomaton. We write $i = L$ for the initial state and $F = F_L$. Since $F$ is upward closed, for every $p \in F$ and $a \in A$, we have $p \cdot a \in F$. In other words, for every $p \in F$, we have $L_p = A^*$. However, by Lemma 2 we have $L_p = p$, so $F$ contains just one final state $p = A^*$. Now we consider a simple path in $O_L$ from $i$ to $p$ labeled by a word $u = a_1 a_2 \dots a_n$ with $a_k \in A$, i.e. $i \ne i \cdot a_1 \ne i \cdot a_1 a_2 \ne \dots \ne i \cdot u = p$. If we consider a word $w$ such that $w = w_0 a_1 w_1 a_2 \dots a_n w_n$, where $w_0, w_1, \dots, w_n \in A^*$, then one can easily prove by induction on $k$ that $i \cdot a_1 \dots a_k \le i \cdot w_0 a_1 w_1 a_2 \dots a_k w_k$. For $k = n$, we get $p \le i \cdot w \in F$, thus $i \cdot w = p$. Hence we can conclude that $A^* a_1 A^* a_2 \dots a_n A^* \subseteq L$. We can consider the language $K$ which is the union of such languages $A^* a_1 A^* a_2 \dots a_n A^*$ for all possible simple paths in $O_L$ from $i$ to $p$. Now $K \subseteq L$ follows from the previous argument and $L \subseteq K$ is clear, because every $w \in L$ describes a unique simple path from $i$ to $p$. ⊓⊔

Note that a usual characterization of the class of positive piecewise testable languages is given by a forbidden pattern for DFA (see e.g. [26, page 531]). This pattern consists of two words $v, w \in A^*$ and two states $p$ and $q = p \cdot v$ such that $p \cdot w \in F$ and $q \cdot w \notin F$. In view of (1) from Section 4, the presence of the pattern is equivalent to the existence of two states $[p]_\rho \not\le [q]_\rho$ such that $[p]_\rho \cdot_\rho v = [q]_\rho$ in the minimal automaton of the language.
The membership problem for the class of positive piecewise testable languages is decidable in polynomial time; see [17, Corollary 8.5] or [26, Theorem 2.20].

We recall examples from the paper [8]. We call a semiautomaton $(Q, A, \cdot)$ autonomous if for each state $q \in Q$ and every pair of letters $a, b \in A$, we have $q \cdot a = q \cdot b$. For a positive integer $d$, let $V_d$ be the class of all autonomous semiautomata that are disjoint unions of cycles whose lengths divide $d$. Clearly, the defining conditions are testable in linear time.

Proposition 11 (Ésik and Ito [8]). (i) All autonomous semiautomata form a $C_l$-variety of semiautomata and the corresponding $C_l$-variety of languages consists of regular languages $L$ such that, for all $u, v \in A^*$, if $u \in L$ and $|u| = |v|$, then $v \in L$.
(ii) The class $V_d$ forms a $C_l$-variety of semiautomata and the corresponding $C_l$-variety of languages consists of all unions of $(A^d)^* A^i$, $i \in \{0, \dots, d-1\}$.

Synchronizing automata are intensively studied in the literature. A semiautomaton $(Q, A, \cdot)$ is synchronizing if there is a word $w \in A^*$ such that the set $Q \cdot w$ is a one-element set. We use an equivalent condition, namely, for each pair of states $p, q \in Q$, there exists a word $w \in A^*$ such that $p \cdot w = q \cdot w$ (see e.g. Volkov [28, Proposition 1]). In this paper we consider classes of semiautomata which are closed under taking disjoint unions. So, we need to study disjoint unions of synchronizing semiautomata. Those automata can be equivalently characterized by the following weaker version of confluence. We say that a semiautomaton $(Q, A, \cdot)$ is weakly confluent if, for each state $q \in Q$ and every pair of words $u, v \in A^*$, there is a word $w \in A^*$ such that $(q \cdot u) \cdot w = (q \cdot v) \cdot w$.

Proposition 12.
A semiautomaton is weakly confluent if and only if it is a disjoint union of synchronizing semiautomata.

Proof. It is clear that a disjoint union of synchronizing semiautomata is weakly confluent.
To prove the opposite implication, assume that $(Q, A, \cdot)$ is a weakly confluent semiautomaton. We consider one connected component and an arbitrary pair $p, q$ of its states. Then there exist states $p_1, p_2, \dots, p_n$ and letters $a_1, \dots, a_{n-1}$ such that $p_1 = p$, $p_n = q$ and for each $i \in \{1, \dots, n-1\}$ we have $p_i \cdot a_i = p_{i+1}$ or $p_{i+1} \cdot a_i = p_i$. We claim, for each $i \in \{1, \dots, n\}$, the existence of a word $w_i$ such that $p_1 \cdot w_i = p_i \cdot w_i$. This claim gives, in the case $i = n$, that $p \cdot w_n = q \cdot w_n$, which concludes the proof.
In the rest of the proof we show the claim by induction on $i$. For $i = 1$ one can take any word for $w_1$. Now, assume that the claim is true for $i$, i.e. there are a word $w_i$ and a state $r_1$ such that $r_1 = p_1 \cdot w_i = p_i \cdot w_i$. Furthermore, we denote $r_2 = p_{i+1} \cdot w_i$. In the case $p_i \cdot a_i = p_{i+1}$, we denote $r_3 = p_i$ and we have $r_3 \cdot w_i = r_1$ and $r_3 \cdot a_i w_i = r_2$. In the case $p_{i+1} \cdot a_i = p_i$, we denote $r_3 = p_{i+1}$ and we have $r_3 \cdot a_i w_i = r_1$ and $r_3 \cdot w_i = r_2$. In both cases, since the semiautomaton is weakly confluent, there exists $u \in A^*$ such that $r_1 \cdot u = r_2 \cdot u$. Now for $w_{i+1} = w_i u$ we have $p_1 \cdot w_{i+1} = (p_1 \cdot w_i) \cdot u = r_1 \cdot u = r_2 \cdot u = (p_{i+1} \cdot w_i) \cdot u = p_{i+1} \cdot w_{i+1}$. ⊓⊔

Since the synchronization property can be tested in polynomial time (see [28]), Proposition 12 implies that the weak confluence of a semiautomaton can be tested in polynomial time as well.

In the next result we use the category $C_s$ of all surjective homomorphisms. Note that $f : B^* \to A^*$ is a surjective homomorphism if and only if $A \subseteq f(B)$.

Proposition 13.
The class of all weakly confluent semiautomata is a $C_s$-variety of semiautomata.

Proof. Clearly, the class $V$ of all weakly confluent semiautomata is closed under disjoint unions, subsemiautomata and homomorphic images. We need to check that $V$ is closed under direct products of non-empty finite families.
Let $\mathcal{Q} = (Q, A, \cdot)$ and $\mathcal{P} = (P, A, \circ)$ be a pair of weakly confluent semiautomata. Take a state $(q, p)$ in the product $\mathcal{Q} \times \mathcal{P}$ and let $u, v \in A^*$ be words. Since $\mathcal{Q}$ is weakly confluent, there is $w \in A^*$ such that $q \cdot u \cdot w = q \cdot v \cdot w$. Now we consider the words $uw$ and $vw$. Since $\mathcal{P}$ is weakly confluent, there is $z \in A^*$ such that $p \circ uw \circ z = p \circ vw \circ z$. Hence $(q, p) \cdot u \cdot wz = (q, p) \cdot v \cdot wz$ and we have proved that $\mathcal{Q} \times \mathcal{P}$ is weakly confluent. The general case of a direct product of a non-empty finite family of semiautomata can be proved in the same way.
To finish the proof, assume that $f \in C_s(B^*, A^*)$ is a surjective homomorphism. Let $\mathcal{A} = (Q, A, \cdot)$ be a weakly confluent semiautomaton and let $\mathcal{A}^f = (Q, B, \cdot^f)$ be its $f$-renaming. Taking $q \in Q$ and $u, v \in B^*$, we have $q \cdot^f u = q \cdot f(u)$ and $q \cdot^f v = q \cdot f(v)$. Since $\mathcal{A}$ is weakly confluent, there is $w \in A^*$ such that $q \cdot f(u) \cdot w = q \cdot f(v) \cdot w$. Now we can consider a preimage $w' \in B^*$ of the word $w$ under the surjective homomorphism $f$. Finally, we can conclude that $(q \cdot^f u) \cdot^f w' = (q \cdot^f v) \cdot^f w'$. ⊓⊔

Finite languages do not form a variety, because their complements, the so-called cofinite languages, are not finite. Moreover, the class of all finite languages is not closed under taking preimages under all homomorphisms. However, one can restrict the category of homomorphisms to the so-called non-erasing ones: we say that a homomorphism $f : B^* \to A^*$ is non-erasing if $f^{-1}(\lambda) = \{\lambda\}$. The class of all non-erasing homomorphisms is denoted by $C_{ne}$.
Note that $\mathcal{C}_{ne}$-varieties of languages correspond to +-varieties of languages (see [25]).

We use certain technical terminology for states of a given semiautomaton $(Q, A, \cdot)$: we say that a state $q \in Q$ has a cycle if there is a word $u \in A^+$ such that $q \cdot u = q$, and we say that the state $q$ is absorbing if for each letter $a \in A$ we have $q \cdot a = q$.

Definition 14.
We call a semiautomaton $(Q, A, \cdot)$ strongly acyclic if each state which has a cycle is absorbing.

It is evident that every strongly acyclic semiautomaton is acyclic.
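The condition of Definition 14 can be tested directly on an explicit transition table. The following Python sketch is only an illustration, under the assumption that a complete deterministic semiautomaton is encoded as a dictionary `delta` with `delta[q][a]` equal to $q \cdot a$; this encoding and all function names are hypothetical and do not come from the paper.

```python
from collections import deque


def reachable(delta, start):
    """Return the set of states reachable from `start`, including `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        for target in delta[q].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def has_cycle(delta, q):
    """True if q . u = q for some non-empty word u."""
    # A non-empty word starts with a letter, so q must be reachable
    # from one of its one-letter successors.
    return any(q in reachable(delta, t) for t in delta[q].values())


def is_absorbing(delta, q):
    """True if q . a = q for every letter a."""
    return all(t == q for t in delta[q].values())


def is_strongly_acyclic(delta):
    """True if every state which has a cycle is absorbing."""
    return all(is_absorbing(delta, q) for q in delta if has_cycle(delta, q))


# Hypothetical examples over the one-letter alphabet {a}.
chain = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 2}}  # ends in an absorbing state
loop = {0: {"a": 1}, 1: {"a": 0}}                # a cycle of length two
print(is_strongly_acyclic(chain), is_strongly_acyclic(loop))
```

Each call to `has_cycle` performs at most $|A|$ graph searches, so the whole test runs in time polynomial in the size of the semiautomaton.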
Proposition 15. (i) The class of all strongly acyclic semiautomata forms a $\mathcal{C}_{ne}$-variety.
(ii) The class of all strongly acyclic confluent semiautomata forms a $\mathcal{C}_{ne}$-variety.

Proof. (i) It is easy to see that the class $V$ of all strongly acyclic semiautomata is closed under finite products, disjoint unions and subsemiautomata. Closure under $f$-renaming is also clear whenever $f : B^* \to A^*$ is a non-erasing homomorphism. Finally, one can prove that the class $V$ is closed under homomorphic images in a similar way as in the case of counter-free semiautomata.

(ii) By the first part we know that all strongly acyclic semiautomata form a $\mathcal{C}_{ne}$-variety. We also know that all acyclic confluent semiautomata form a variety of semiautomata, and hence they also form a $\mathcal{C}_{ne}$-variety of semiautomata. Therefore all strongly acyclic confluent semiautomata, as an intersection of two $\mathcal{C}_{ne}$-varieties, form a $\mathcal{C}_{ne}$-variety again. ⊓⊔

Proposition 16.
The $\mathcal{C}_{ne}$-variety of all finite and all cofinite languages corresponds to the $\mathcal{C}_{ne}$-variety of all strongly acyclic confluent semiautomata.

Proof. First, consider an arbitrary finite language $L \subseteq A^*$ and its canonical automaton $(D_L, A, \cdot, L, F_L)$. Since $L$ is finite, there is only one state in $D_L$ which has a cycle, namely the state $\emptyset$. Moreover, this state is absorbing and it is reachable from all other states, because quotients of finite languages are finite. Therefore the semiautomaton $(D_L, A, \cdot)$ is strongly acyclic and confluent at the same time. Of course, if we start with the complement of a finite language $L$, the canonical semiautomaton is the same as for $L$.

Conversely, let $\mathcal{A} = (Q, A, \cdot)$ be a strongly acyclic confluent semiautomaton. For an arbitrary state $q \in Q$, we take some path of length $|Q|$ starting in $q$. On that path there is a state $q'$ which has a cycle, i.e. $q \cdot u = q' = q' \cdot v$ for some $u \in A^*$, $v \in A^+$. Since $\mathcal{A}$ is strongly acyclic, $q'$ is an absorbing state. Since $\mathcal{A}$ is confluent, there is at most one such absorbing state $q'$ reachable from $q$. Now we choose $i \in Q$ and $F \subseteq Q$ arbitrarily and we consider the automaton $(Q, A, \cdot, i, F)$. By the previous considerations there is just one state reachable from $i$ which has a cycle. We denote it by $f$. Note that it is an absorbing state. One can see that, for each state $q \neq f$, the set $\{u \mid i \cdot u = q\}$ is finite and therefore $\{u \mid i \cdot u = f\}$ is the complement of a finite language. Thus, depending on whether $f \in F$, the language recognized by $(Q, A, \cdot, i, F)$ is cofinite or finite. ⊓⊔

Naturally, one can try to describe the corresponding $\mathcal{C}_{ne}$-variety of languages for the $\mathcal{C}_{ne}$-variety of strongly acyclic semiautomata. Following Pin [17, Section 5.3], we call $L \subseteq A^*$ a prefix-testable language if $L$ is a finite union of a finite language and languages of the form $uA^*$, with $u \in A^*$. One can prove the following statement in a similar way as Proposition 16.
Note that one can also find a characterization via syntactic semigroups in [17, Section 5.3].

Proposition 17.
The $\mathcal{C}_{ne}$-variety of all prefix-testable languages corresponds to the $\mathcal{C}_{ne}$-variety of all strongly acyclic semiautomata.

The characterization from Proposition 16 can be modified for the positive $\mathcal{C}_{ne}$-variety of finite languages $\mathcal{F}$, where $\mathcal{F}(A)$ consists of $A^*$ and all finite languages over $A$. To make the characterizing condition more readable, for a given strongly acyclic confluent semiautomaton and its state $q$, we call the uniquely determined state $q'$ mentioned in the proof of Proposition 16 the main follower of the state $q$.

Proposition 18.
The positive $\mathcal{C}_{ne}$-variety of all finite languages corresponds to the positive $\mathcal{C}_{ne}$-variety of all strongly acyclic confluent ordered semiautomata satisfying $q' \le q$ for each state $q$ and its main follower $q'$.

Proof. By the first paragraph of the proof of Proposition 16, every canonical ordered automaton of a finite language satisfies the additional condition $q' \le q$ for each state $q$, because the main follower of $q$ is $\emptyset$.

The second part of that proof is adapted similarly. Let $f$ be the main follower of $i$. Since it is also the main follower of all states reachable from the initial state $i$, we see that $f$ is the minimum among all states reachable from $i$. Now if $f$ is final, then all states reachable from $i$ are final, because the final states form an upward closed subset. Consequently, the language accepted by the ordered automaton is $A^*$ in this case. If $f$ is not final, then the language accepted by the ordered automaton is finite. ⊓⊔

Note that all conditions on semiautomata discussed in this subsection can be checked in polynomial time.
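As an illustration of such a polynomial-time check, the additional condition of Proposition 18 can be sketched in Python as follows. The sketch assumes that the input is already known to be strongly acyclic and confluent, that the transition table is a dictionary `delta` with `delta[q][a]` equal to $q \cdot a$, and that the order is given as a set `leq` of pairs `(x, y)` meaning $x \le y$ (reflexive pairs may be omitted). The encoding and the function names are hypothetical.

```python
def reachable(delta, start):
    """Return the set of states reachable from `start`, including `start`."""
    seen = {start}
    stack = [start]
    while stack:
        q = stack.pop()
        for target in delta[q].values():
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def main_follower(delta, q):
    """Return the unique absorbing state reachable from q, or None.

    In a strongly acyclic confluent semiautomaton this state exists and is
    unique; None signals that the input violates that assumption.
    """
    absorbing = [r for r in reachable(delta, q)
                 if all(t == r for t in delta[r].values())]
    return absorbing[0] if len(absorbing) == 1 else None


def satisfies_follower_condition(delta, leq):
    """Check q' <= q for every state q and its main follower q'."""
    for q in delta:
        f = main_follower(delta, q)
        if f is None or not (f == q or (f, q) in leq):
            return False
    return True


# Hypothetical example: the canonical ordered semiautomaton of L = {a} over
# A = {a}; the states are the quotients L, {lambda} and the empty set,
# ordered by inclusion.
delta = {"L": {"a": "eps"}, "eps": {"a": "empty"}, "empty": {"a": "empty"}}
leq = {("empty", "L"), ("empty", "eps")}
print(satisfies_follower_condition(delta, leq))
```

The check performs one graph search per state, so it is polynomial in the size of the ordered semiautomaton.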
We know that a language $L \subseteq A^*$ is positive piecewise testable if, for every pair of words $u, w \in A^*$ such that $uw \in L$ and for every letter $a \in A$, we have $uaw \in L$. So we can insert an arbitrary letter into each word from the language (at an arbitrary position) and the resulting word stays in the language. Now we consider an analogue, where we insert into the word not just a letter but a word of a given fixed length. The length of a word $v \in A^*$ is denoted by $|v|$, as usual.

For each positive integer $n$, we consider the following property of a given regular language $L \subseteq A^*$:

for every $u, v, w \in A^*$, if $uw \in L$ and $|v| = n$, then $uvw \in L$.
We say that $L$ is closed under $n$-insertions whenever $L$ satisfies this property. We show that the class of all regular languages closed under $n$-insertions forms a positive $\mathcal{C}$-variety of languages by describing the corresponding positive $\mathcal{C}$-variety of ordered semiautomata.

First, we need to describe an appropriate category of homomorphisms. Let $\mathcal{C}_{lm}$ be the category consisting of the so-called length-multiplying homomorphisms (see [25]): $f \in \mathcal{C}_{lm}(B^*, A^*)$ if there exists a positive integer $k$ such that $|f(b)| = k$ for every $b \in B$.

Definition 19.
Let $n$ be a positive integer and $\mathcal{Q} = (Q, A, \cdot, \le)$ be an ordered semiautomaton. We say that $\mathcal{Q}$ has $n$-extensive actions if, for every $q \in Q$ and $u \in A^*$ such that $|u| = n$, we have $q \le q \cdot u$.

Note that the ordered semiautomata from Subsection 8.4 are ordered semiautomata which have 1-extensive actions. Of course, these ordered semiautomata have $n$-extensive actions for every $n$. More generally, if $n$ divides $m$ and an ordered semiautomaton $\mathcal{Q}$ has $n$-extensive actions, then $\mathcal{Q}$ has $m$-extensive actions.

Proposition 20.
Let $n$ be a positive integer. The class of all ordered semiautomata which have $n$-extensive actions forms a positive $\mathcal{C}_{lm}$-variety of ordered semiautomata. The corresponding positive $\mathcal{C}_{lm}$-variety of languages consists of all regular languages closed under $n$-insertions.

Proof. The first part of the statement is easy to show. To establish the second part, let $L$ be a regular language over $A$ closed under $n$-insertions. For $u \in A^*$, we consider the state $K = u^{-1}L$ in the canonical ordered semiautomaton of $L$. Now we show that for every $v \in A^*$ such that $|v| = n$, we have $K \subseteq v^{-1}K$. Indeed, if $w \in K = u^{-1}L$ then $uw \in L$, and since $L$ is closed under $n$-insertions we get $uvw \in L$. Hence $vw \in K = u^{-1}L$, which implies $w \in v^{-1}K$. Therefore the canonical ordered semiautomaton of $L$ has $n$-extensive actions.

Conversely, let $L$ be recognized by $\mathcal{Q} = (Q, A, \cdot, \le, i, F)$ with $n$-extensive actions. For every $u, v, w \in A^*$ such that $uw \in L$ and $|v| = n$, we can consider the state $q = i \cdot u$ in $\mathcal{Q}$. Since $\mathcal{Q}$ has $n$-extensive actions, we have $q \cdot v \ge q$. Hence $i \cdot uvw = q \cdot vw \ge q \cdot w = i \cdot uw \in F$ and, since $F$ is upward closed, we can conclude that $uvw \in L$. Thus $L$ is closed under $n$-insertions. ⊓⊔

For a fixed $n$, it is decidable in polynomial time whether a given ordered semiautomaton has $n$-extensive actions, because the relation $q \le q \cdot u$ has to be checked only for polynomially many words $u$.

$\mathcal{C}$-Varieties of Semiautomata

In the previous section, the membership problem for (positive) $\mathcal{C}$-varieties of semiautomata was always solved by an ad hoc argument. Here we discuss whether it is possible to give a general result in this direction.
For that purpose, recall that an $\omega$-identity is a pair of $\omega$-terms, which are constructed from variables by (repeated) successive application of concatenation and the unary operation $u \mapsto u^\omega$. In a particular monoid, the interpretation of this unary operation assigns to each element $s$ its uniquely determined power which is idempotent.

In the case of the category $\mathcal{C}_{all}$ consisting of all homomorphisms, we mention Theorem 2.19 from [26], which states the following result: if the corresponding pseudovariety of monoids is defined by a finite set of $\omega$-identities, then the membership problem of the corresponding variety of languages is decidable by a polynomial space algorithm in the size of the input automaton. Thus, Theorem 2.19 slightly extends the case when the pseudovariety of monoids is defined by a finite set of identities. The algorithm checks the defining $\omega$-identities in the syntactic monoid $M_L$ of a language $L$ and uses the basic fact that $M_L$ is the transition monoid of the minimal automaton of $L$. This extension is possible because the unary operation $(\ )^\omega$ can be effectively computed from the input automaton.

We should mention that checking a fixed identity in an input semiautomaton can be done in a better way. Such an NL algorithm (a folklore algorithm in the theory) guesses a pair of finite sequences of states for the two sides of a given identity $u = v$, which are visited while reading the word $u$ (and $v$, respectively) letter by letter. These sequences have the same first states and distinct last states. Then the algorithm checks whether, for each variable, there is a transition of the automaton given by a word which transforms all states in the sequences in the right way, when every occurrence of the variable is considered.
If, for every variable used, there is such a word, we obtain a counterexample disproving the identity $u = v$.

Whichever algorithm is used, we immediately get the generalization to the case of positive varieties of languages, because checking inequalities can be done in the same manner as checking identities. However, we want to use the mentioned algorithms to obtain a corresponding result for positive $\mathcal{C}$-varieties of ordered semiautomata for the categories used in this paper. For such a result we need the following formal definition. An $\omega$-inequality $u \le v$ holds in an ordered semiautomaton $\mathcal{O} = (Q, A, \cdot, \le)$ with respect to a category $\mathcal{C}$ if, for every $f \in \mathcal{C}(X^*, A^*)$ with $X$ being the set of variables occurring in $uv$, and for every $p \in Q$, we have $p \cdot f(u) \le p \cdot f(v)$. Here $f(u)$ is equal to $f(u')$, where $u'$ is a word obtained from $u$ if all occurrences of $\omega$ are replaced by an exponent $n$ satisfying the equality $s^\omega = s^n$ in the transition monoid of $\mathcal{O}$ for its arbitrary element $s$.

Theorem 9.1.
Let $\mathcal{O} = (Q, A, \cdot, \le)$ be an ordered semiautomaton, let $u \le v$ be an $\omega$-inequality and let $\mathcal{C}$ be one of the categories $\mathcal{C}_{ne}$, $\mathcal{C}_l$, $\mathcal{C}_s$ and $\mathcal{C}_{lm}$. The problem whether $u \le v$ holds in $\mathcal{O}$ with respect to $\mathcal{C}$ is decidable.

Proof. The result is a consequence of the following propositions.
Proposition 2.
Let $\mathcal{O} = (Q, A, \cdot, \le)$ be an ordered semiautomaton, let $u \le v$ be an $\omega$-inequality and let $\mathcal{C}$ be one of the categories $\mathcal{C}_{ne}$, $\mathcal{C}_l$ and $\mathcal{C}_s$. The problem of deciding whether $u \le v$ holds in $\mathcal{O}$ with respect to $\mathcal{C}$ can be solved by a polynomial space algorithm.

Proof. First of all, we prove the statement formally for $\mathcal{C} = \mathcal{C}_{all}$. We start with the case when the $\omega$ operation is not used. It is mentioned in Section 9 that such an algorithm is folklore in the theory.

Let $y_1 \dots y_s \le z_1 \dots z_t$ be an inequality, where $y_1, \dots, y_s, z_1, \dots, z_t$ are variables from $X$. Recall that this inequality holds in the transition ordered monoid of $\mathcal{O}$ if and only if, for every homomorphism $f : X^* \to A^*$, the inequality of transformations $f(y_1) \circ \dots \circ f(y_s) \le f(z_1) \circ \dots \circ f(z_t)$ is satisfied. This requirement can be reformulated as the inequality $q \cdot f(y_1) \cdots f(y_s) \le q \cdot f(z_1) \cdots f(z_t)$ of states of $\mathcal{O}$, for every state $q \in Q$ and every homomorphism $f : X^* \to A^*$. This means that the inequality is not valid if and only if there exist such $f$ and states $p_0, p_1, \dots, p_s, q_0, q_1, \dots, q_t \in Q$, with $p_0 = q_0$ and $p_s \not\le q_t$, which satisfy $p_{i-1} \cdot f(y_i) = p_i$ and $q_{j-1} \cdot f(z_j) = q_j$ for every $i \in \{1, \dots, s\}$ and $j \in \{1, \dots, t\}$. Since the numbers $s$ and $t$ are constants, one can non-deterministically choose all these states, and then decide whether for this choice of states the required homomorphism $f$ exists. For every variable $x$, denote by $I_x$ the set of all $i \in \{1, \dots, s\}$ such that $y_i = x$, and by $J_x$ the set of all $j \in \{1, \dots, t\}$ such that $z_j = x$. In order to decide the existence of $f$, one has to check whether for every variable $x$ there exists a word $f(x) \in A^*$ such that $p_{i-1} \cdot f(x) = p_i$ for every $i \in I_x$ and $q_{j-1} \cdot f(x) = q_j$ for every $j \in J_x$.
However, the existence of such a word $f(x)$ can be expressed as a condition on the product automaton of $|I_x| + |J_x|$ copies of the automaton $\mathcal{O}$; namely, it is equivalent to reachability of the state with components $p_i$, for $i \in I_x$, and $q_j$, for $j \in J_x$, from the state with components $p_{i-1}$, for $i \in I_x$, and $q_{j-1}$, for $j \in J_x$. Recall that $|I_x| + |J_x|$ is a constant.

Now assume that $u$ and $v$ are $\omega$-terms. We guess the states as in the previous simple case, but we do this inductively with respect to the structure of the $\omega$-terms $u$ and $v$, from the top down. In this way we obtain a more complicated system of states compared with the sequences in the case of (linear) words. To explain the inductive construction, assume that we have guessed states $p$ and $q$ for a certain $\omega$-subterm $w$, assuming that $p \cdot f(w) = q$. If $w = w_1 w_2$ for $\omega$-terms $w_1$ and $w_2$, then we simply guess a state $r$ and assume that $p \cdot f(w_1) = r$ and $r \cdot f(w_2) = q$. The case $w = z^\omega$, with a subterm $z$, is more complicated. It is well known that, for every element $s$ in the transition monoid of the given automaton, the element $s^\omega$ is equal to $s^n$ for some $n \le |Q|$. In particular, $s^n \cdot s^n = s^n$ holds for this $n$. So we guess $n \le |Q|$ and states $r_0, r_1, r_2, \dots, r_{2n}$ such that $r_0 = p$, $r_n = r_{2n} = q$, and we assume that $r_{i-1} \cdot f(z) = r_i$ for every $i = 1, \dots, 2n$. In this way, when we decompose all subterms, we obtain a system of states equipped with assumptions of the form $p \cdot f(x) = q$, where $p$ and $q$ are states and $x$ is a variable. Since the $\omega$-terms $u$ and $v$ are not part of the input, there are only constantly many steps of the algorithm decomposing the terms $u$ and $v$. Thus, at the end, the number of conditions is polynomial with respect to the size of the input automaton. (In fact, the number of conditions can be bounded by the number of all states in $Q$, which is linear.)
The final part of the algorithm is the same: we just check, for each variable $x$, whether it is possible to satisfy all the conditions concerning $f(x)$ at the same time. Note that the number of conditions was constant in the case of an identity $u = v$ in the first part, which gives a logarithmic space algorithm in that case.

Now we are ready to discuss the other categories, where we search for $f \in \mathcal{C}(X^*, A^*)$. The case $\mathcal{C} = \mathcal{C}_{ne}$ is trivial: when we test reachability in the product of a certain number of copies of $\mathcal{O}$, we look for a non-empty path in the graph. The case $\mathcal{C} = \mathcal{C}_l$ is even easier, because we test reachability in one step. Seen from another point of view, this case is easy because there are only polynomially many homomorphisms in $\mathcal{C}_l(X^*, A^*)$ for fixed $X$ and $A$, where only $A$ is a part of the input. The case $\mathcal{C} = \mathcal{C}_s$ is also easy. We are looking for $f \in \mathcal{C}(X^*, A^*)$ such that $A \subseteq f(X)$. So we can additionally guess, for each letter $a \in A$, a variable $x \in X$ such that $f(x) = a$.

We conclude with the remark that it is well known that nondeterministic polynomial space is equivalent to deterministic polynomial space. ⊓⊔

Proposition 3.
Let $\mathcal{O} = (Q, A, \cdot, \le)$ be an ordered semiautomaton and let $u \le v$ be an $\omega$-inequality. The problem whether $u \le v$ holds in $\mathcal{O}$ with respect to $\mathcal{C}_{lm}$ is decidable.

Proof. We proceed as in the general case up to the place where the existence of $f(x)$ is discussed. Before deciding whether there is $f(x) \in A^*$ satisfying all conditions, we first complete the conditions in such a way that, for every $q \in Q$, the condition on $q \cdot f(x)$ is present. This is done by guessing the missing values $q \cdot f(x)$ for all $q$ and $x$. Only then do we test whether there are words $f(x)$ satisfying the conditions.

Only if there are such words do we continue. Next we describe all of them. This is possible because, for every $x$, we know how $f(x)$ transforms the semiautomaton $\mathcal{O}$. So the language of all words which are candidates for $f(x)$ is a regular language recognized by the transition monoid of the semiautomaton $\mathcal{O}$. We denote it by $L_x$. Furthermore, we are able to compute a regular expression $r_x$ describing $L_x$. We need to decide whether for each variable $x$ there is a word $w_x \in L_x$ such that all the words $w_x$ have the same length. Thus, we need to know all possible lengths of words in $L_x$. For this purpose we consider the unique literal mapping $\psi : A^* \to \{a\}^*$, $\psi(A) = \{a\}$. Clearly, the language $\psi(L_x)$ is regular, because it is described by the regular expression $\overline{r_x}$, which is obtained from $r_x$ by replacing every letter from the alphabet $A$ by the letter $a$. Moreover, there is a word $w_x \in L_x$ of length $k$ if and only if $a^k \in \psi(L_x)$. So the existence of an integer $k$ such that $L_x \cap A^k \neq \emptyset$ holds for every $x$ is equivalent to the fact that $\bigcap_{x \in X} \psi(L_x) \neq \emptyset$. The latter inequality is equivalent to non-emptiness of the language given by the generalized regular expression $\bigcap_{x \in X} \overline{r_x}$. So one can decide this question. ⊓⊔

We did not discuss the complexity of the algorithm, because we do not see how to effectively construct the regular expression $\bigcap_{x \in X} \overline{r_x}$.
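The common-length question from the proof of Proposition 3 can also be decided without regular expressions, by a direct search. The Python sketch below is not the construction used in the proof: it is a brute-force illustration that iterates over the sets $S_k$ of all transformations of $Q$ induced by words of length exactly $k$, and it may take exponentially many steps in the worst case. It assumes that the semiautomaton is given as a dictionary `delta` with `delta[q][a]` equal to $q \cdot a$, and that the completed conditions for each variable are given as a dictionary mapping every state $q$ to the required value of $q \cdot f(x)$; the encoding and the name `common_length` are hypothetical.

```python
def common_length(delta, states, targets):
    """Return the least k >= 1 such that every transformation in `targets`
    is induced by some word of length exactly k, or None if no such k exists.

    `states` is a list fixing an order on Q; each element of `targets` is a
    dict mapping every state q to the required state q . f(x).
    """
    index = {q: i for i, q in enumerate(states)}
    letters = sorted({a for q in states for a in delta[q]})
    # One tuple per letter: position i holds states[i] . a
    letter_maps = [tuple(delta[q][a] for q in states) for a in letters]
    wanted = [tuple(t[q] for q in states) for t in targets]

    current = frozenset({tuple(states)})  # S_0: only the identity
    seen = set()
    k = 0
    # S_{k+1} depends only on S_k, so the sequence is eventually periodic
    # and the search can stop at the first repeated set.
    while current not in seen:
        seen.add(current)
        current = frozenset(
            tuple(m[index[r]] for r in t)  # q -> (q . w) . a
            for t in current
            for m in letter_maps
        )
        k += 1
        if all(w in current for w in wanted):
            return k
    return None


# Hypothetical example: one letter swapping two states.
delta = {0: {"a": 1}, 1: {"a": 0}}
identity = {0: 0, 1: 1}
swap = {0: 1, 1: 0}
print(common_length(delta, [0, 1], [identity, identity]))
print(common_length(delta, [0, 1], [identity, swap]))
```

In the example, the identity is induced exactly by words of even length and the swap by words of odd length, so the first query has a common length while the second has none. The search starts at $k = 1$ because a length-multiplying homomorphism maps each variable to a word of positive length.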
10 Further Remarks
Finally, we mention that one can extend the construction in at least two natural directions. First, the theory of tree languages is a field where many fundamental ideas from the theory of deterministic automata were successfully generalized. Second, the recent notion of biautomata (see [10] and [9]) is based on considering both-sided quotients instead of left quotients only. In both cases one can try to apply the previous constructions and consider varieties of (semi)automata. Some papers in this direction already exist [7].
References
1. J. Brzozowski: Canonical regular expressions and minimal state graphs for definite events, Mathematical Theory of Automata, 529–561 (1962)
2. J. Brzozowski and B. Li: Syntactic complexity of R- and J-trivial regular languages, DCFS 2013, LNCS 8031, 160–171 (2013)
3. S. Burris and H.P. Sankappanavar: A course in universal algebra, Springer-Verlag (1981)
4. L. Chaubard, J.-É. Pin and H. Straubing: Actions, wreath products of C-varieties and concatenation product, Theor. Comput. Sci. 356, 73–89 (2006)
5. S. Cho and D.T. Huynh: Finite automaton aperiodicity is PSPACE-complete, Theoretical Computer Science 88, 96–116 (1991)
6. S. Eilenberg: Automata, Languages and Machines, vol. B, Academic Press (1976)
7. Z. Ésik and S. Iván: Some Varieties of Finite Tree Automata Related to Restricted Temporal Logics, Fundam. Inform. 82, 79–103 (2008)
8. Z. Ésik and M. Ito: Temporal Logic with Cyclic Counting and the Degree of Aperiodicity of Finite Automata, Acta Cybern. 16, 1–28 (2003)
9. M. Holzer and S. Jakobi: Minimization and characterizations for biautomata, NCMA 2013, Österreichische Computer Gesellschaft 294, 179–193 (2013)
10. O. Klíma and L. Polák: On biautomata, RAIRO - Theor. Inf. and Applic. 46, 573–592 (2012)
11. O. Klíma and L. Polák: Alternative Automata Characterization of Piecewise Testable Languages, DLT 2013, LNCS 7907, 289–300 (2013)
12. O. Klíma and L. Polák: On varieties of ordered automata, LATA 2019, LNCS 11417, 108–120 (2019), https://doi.org/10.1007/978-3-030-13435-8 \
13. M. Kunc: Equational description of pseudovarieties of homomorphisms, RAIRO - Theoretical Informatics and Applications 37, 243–254 (2003)
14. R. McNaughton and S. Papert: Counter-Free Automata, M.I.T. Press (1971)
15. J.-É. Pin: Varieties of formal languages, Plenum Publishing Co. (1986)
16. J.-É. Pin: A Variety Theorem Without Complementation, Russian Mathematics 39, 80–90 (1995)
17. J.-É. Pin: