Automata Learning: An Algebraic Approach
Henning Urbat ∗ Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Lutz Schröder ∗ Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Abstract
We propose a generic categorical framework for learning unknown formal languages of various types (e.g. finite or infinite words, weighted and nominal languages). Our approach is parametric in a monad T that represents the given type of languages and their recognizing algebraic structures. Using the concept of an automata presentation of T-algebras, we demonstrate that the task of learning a T-recognizable language can be reduced to learning an abstract form of algebraic automaton whose transitions are modeled by a functor. For the important case of adjoint automata, we devise a learning algorithm generalizing Angluin’s L∗. The algorithm is phrased in terms of categorically described extension steps; we provide a termination and complexity analysis based on a dedicated notion of finiteness. Our framework applies to structures like ω-regular languages that were not within the scope of existing categorical accounts of automata learning. In addition, it yields new learning algorithms for several types of languages for which no such algorithms were previously known at all, including sorted languages, nominal languages with name binding, and cost functions.
Keywords
Automata Learning, Monads, Algebras
Active automata learning is the task of inferring a finite representation of an unknown formal language by asking questions to a teacher. Such learning situations naturally arise, e.g., in software verification, where the “teacher” is some reactive system and one aims to construct a formal model of it by running suitable tests [61]. Starting with Angluin’s [8] pioneering work on learning regular languages, active learning algorithms have been developed for countless types of systems and languages, including ω-regular languages [9, 32], tree languages [30], weighted languages [12, 63], and nominal languages [47]. Most of these extensions are tailor-made modifications of Angluin’s L∗ algorithm and thus bear close structural analogies. This has motivated recent work towards a uniform category theoretic understanding of automata learning, based on modelling state-based systems as coalgebras [14, 65]. In the present paper, we propose a novel algebraic approach to automata learning.

∗ The authors acknowledge support by Deutsche Forschungsgemeinschaft (DFG) under project SCHR 1118/8-2.

Our contributions are two-fold. First, we study the problem of learning an abstract form of automata originally introduced by Arbib and Manes [10] in the context of minimization: given an endofunctor F on a category D and objects I, O ∈ D, an F-automaton consists of an object Q of states and morphisms $\delta_Q$, $i_Q$ and $f_Q$ as shown below, representing transitions, initial states and final states (or outputs).

$$FQ \xrightarrow{\delta_Q} Q, \qquad I \xrightarrow{i_Q} Q \xrightarrow{f_Q} O$$

Taking FQ = Σ × Q on Set with I = 1 and O = {0, 1} yields classical deterministic automata, but also several other notions of automata (e.g. weighted automata, residual nondeterministic automata, and nominal automata) arise as instances. As our first main result, we devise a generalized L∗ algorithm for adjoint F-automata, i.e.
automata whose type functor F admits a right adjoint G, based on alternating moves along the initial chain for the functor $I + F$ and the final cochain for the functor $O \times G$. Our generic algorithm subsumes known L∗-type algorithms for all the above classes of automata, and its analysis yields uniform proofs of their correctness and termination. In addition, it also instantiates to a number of new learning algorithms, e.g. for sorted automata and for several versions of nominal automata with name binding.

We subsequently show that learning algorithms for F-automata (including our generalized L∗ algorithm) apply far beyond the realm of automata: they can be used to learn languages representable by monads [7, 59]. Given a monad T on the category D, we model a language as a morphism $L\colon TI \to O$ in D. At this level of generality, one obtains a concept of T-recognizable language (i.e. a language recognized by a finite T-algebra) that uniformly captures numerous automata-theoretic classes of languages. For instance, regular and ω-regular languages (the languages accepted by classical finite automata and Büchi automata, respectively) correspond precisely to T-recognizable languages for the monads representing semigroups and Wilke algebras,

$$TI = I^+ \text{ on } \mathbf{Set} \qquad\text{and}\qquad T(I, J) = (I^+,\; I^{\mathrm{up}} + I^* \times J) \text{ on } \mathbf{Set}^2.$$

Here $I^{\mathrm{up}}$ denotes the set of ultimately periodic infinite words over the alphabet I. For ω-regular languages, Farzan et al. [32] proposed an algorithm that learns a language $L \subseteq I^\omega$ of infinite words by learning the set of lassos in L, i.e. the regular language of finite words given by

$$\mathrm{lasso}(L) = \{\, u\$v : u \in I^*,\ v \in I^+,\ uv^\omega \in L \,\} \subseteq (I + \{\$\})^*.$$

We show that this idea extends to general T-recognizable languages, using the concept of an automata presentation. Such a presentation allows for the linearization of T-recognizable languages, i.e.
a reduction to “regular” languages accepted by finite F-automata for suitable F.

In combination, our results yield a generic strategy for learning an unknown T-recognizable language $L\colon TI \to O$:
(1) find an automata presentation for the free T-algebra TI;
(2) learn the minimal automaton for the linearization of L.
This approach turns out to be applicable to a wide range of languages. In particular, it covers several settings for which no learning algorithms are known, e.g. cost functions [23].

Related work.
A categorical interpretation of several key concepts in Angluin’s L∗ algorithm for classical automata was first given by Jacobs and Silva [37], and later extended to F-automata in a category, i.e. to similar generality as in the present paper, by van Heerdt, Sammartino, and Silva [64]. Their main contribution is an abstract categorical framework (CALF) for correctness proofs of learning algorithms, while a concrete generic algorithm is not given. Van Heerdt et al. [65] also study learning automata with side effects modelled via monads; this use of monads is unrelated to the monad-based abstraction of algebraic recognition in the present paper. Barlocco, Kupke, and Rot [14] develop a learning algorithm for set coalgebras (with all underlying concepts phrased categorically), parametric in a coalgebraic logic. Its scope is quite different from our generalized L∗ algorithm: via genericity over the branching type it covers, e.g., labeled transition systems, but unlike our algorithm it does not apply to, e.g., nominal automata. The connections between the two approaches are further discussed in Remark 4.16.

Automata learning can be seen as an interactive version of automata minimization, which has been extensively studied from a (co-)algebraic perspective [5, 10, 16, 24, 35, 63]. In particular, our chain-based iterative learning algorithm resembles the coalgebraic approach to partition refinement [1].

We proceed to recall concepts from category theory and the theory of nominal sets that we will use throughout the paper. Readers should be familiar with basic notions such as functors, (co-)limits and adjunctions; see, e.g., Mac Lane [41].
Functor (co-)algebras.
Let $H\colon D \to D$ be an endofunctor on a category D. An H-algebra is a pair $(A, \alpha)$ consisting of an object A ∈ D and a morphism $\alpha\colon HA \to A$. A homomorphism $h\colon (A, \alpha) \to (B, \beta)$ between H-algebras is a morphism $h\colon A \to B$ such that $h \cdot \alpha = \beta \cdot Hh$. An H-algebra $(A, \alpha)$ is initial if for every H-algebra $(B, \beta)$ there is a unique homomorphism $(A, \alpha) \to (B, \beta)$; we generally denote the initial algebra of H (unique up to isomorphism if it exists) as $\mu H$. If D is cocomplete and H preserves filtered colimits, $\mu H$ can be constructed as the colimit of the initial ω-chain for H [6]:

$$\mu H = \operatorname{colim}\bigl(\, 0 \xrightarrow{¡} H0 \xrightarrow{H¡} H^2 0 \xrightarrow{H^2 ¡} H^3 0 \to \cdots \,\bigr),$$

where ¡ is the unique morphism from the initial object 0 of D into $H0$, and $H^n$ means H applied n times. Letting $j_n\colon H^n 0 \to \mu H$ ($n \in \mathbb{N}$) denote the colimit cocone, we obtain the H-algebra structure on $\mu H$ as the unique morphism $\alpha\colon H(\mu H) \to \mu H$ satisfying $\alpha \cdot Hj_n = j_{n+1}$ for all $n \in \mathbb{N}$.

Dually, one has notions of a coalgebra for the endofunctor H, a coalgebra homomorphism, and a final coalgebra. Coalgebras provide an abstract notion of state-based transition system: we think of the base object A of an H-coalgebra as an object of states, and of its structure map $\alpha\colon A \to HA$ as assigning to each state a structured collection of successors. Coalgebra homomorphisms are behaviour-preserving maps, and final coalgebras have abstracted behaviours as states.

Monad algebras. A monad $T = (T, \mu, \eta)$ on a category D is given by an endofunctor $T\colon D \to D$ and two natural transformations $\eta\colon \mathrm{Id}_D \to T$ and $\mu\colon TT \to T$ (the unit and multiplication) such that the associativity and unit diagrams commute, i.e.

$$\mu \cdot T\mu = \mu \cdot \mu T, \qquad \mu \cdot T\eta = \mathrm{id}_T = \mu \cdot \eta T.$$

A T-algebra is an algebra $(A, \alpha)$ for the endofunctor T for which the corresponding diagrams commute, i.e.

$$\alpha \cdot T\alpha = \alpha \cdot \mu_A, \qquad \alpha \cdot \eta_A = \mathrm{id}_A.$$

A homomorphism of T-algebras is just a homomorphism of the underlying algebras for the endofunctor T. For each X ∈ D, the T-algebra $(TX, \mu_X)$ is called the free T-algebra on X.

Monads form a categorical abstraction of algebraic theories [43]. In fact, every algebraic theory (given by a finitary signature Γ and a set E of equations between Γ-terms) induces a monad T on Set where TX is the underlying set of the free (Γ, E)-algebra on X (i.e. the set of all Γ-terms over X modulo equations in E), and the maps $\eta_X\colon X \to TX$ and $\mu_X\colon TTX \to TX$ are given by inclusion of variables and flattening of terms, respectively. Then the categories of T-algebras and (Γ, E)-algebras are isomorphic. Conversely, every monad T on Set with T preserving filtered colimits arises from some algebraic theory (Γ, E) in this way.

Similarly, every ordered algebraic theory [17], given by a signature Γ and a set E of inequations s ≤ t between Γ-terms, yields a monad T on the category Pos of posets whose algebras are ordered Γ-algebras (i.e. Γ-algebras on a poset with monotone operations) satisfying the inequations in E.

Free monads.
Let $H\colon D \to D$ be an endofunctor on a category D with coproducts, and suppose that, for each X ∈ D, the initial algebra $\mu(X + H)$ for the functor $X + H$ exists. Then H induces a monad $T_H$, the free monad over H [15]. It is given on objects by $T_H X = \mu(X + H)$; its action on morphisms and the unit and multiplication are defined via initiality of the algebras $\mu(X + H)$. Then the categories of $T_H$-algebras and H-algebras are isomorphic: if

$$B + H(T_H B) \xrightarrow{[i_B,\, \alpha_B]} T_H B$$

denotes the $(B + H)$-algebra structure of $T_H B = \mu(B + H)$, the isomorphism is given on objects by

$$(T_H B \xrightarrow{\beta} B) \;\mapsto\; (HB \xrightarrow{Hi_B} H(T_H B) \xrightarrow{\alpha_B} T_H B \xrightarrow{\beta} B)$$

and on morphisms by $h \mapsto h$.

Factorization systems. A factorization system (E, M) in a category D is given by two classes E and M of morphisms such that (i) E and M are closed under composition and contain all isomorphisms, (ii) every morphism f has a factorization $f = m \cdot e$ with e ∈ E and m ∈ M, and (iii) the diagonal fill-in property holds: given a commutative square $m \cdot f = g \cdot e$ with e ∈ E and m ∈ M, there exists a unique morphism d with $f = d \cdot e$ and $g = m \cdot d$. The morphisms m and e in (ii) are unique up to isomorphism and are called the image and coimage of f. Categories of (co-)algebras typically inherit factorizations from their underlying category:
(1) If $H\colon D \to D$ is an endofunctor with H(E) ⊆ E, the factorization system (E, M) for D lifts to the category of H-algebras, that is, every H-algebra homomorphism uniquely factorizes into a homomorphism in E followed by a homomorphism in M. Dually, if H(M) ⊆ M, then the category of H-coalgebras has a factorization system lifting (E, M).
(2) If T is a monad on D with T(E) ⊆ E, the factorization system (E, M) for D lifts to the category of T-algebras.
A factorization system (E, M) is proper if every morphism in E is epic and every morphism in M is monic.
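For intuition, the (surjective, injective) factorization system on Set can be sketched in a few lines of Python. This is only an illustration under the assumption that finite functions are encoded as dictionaries; the helper names `compose`, `factorize` and `fill_in` are hypothetical and not taken from the paper.

```python
def compose(g, f):
    """Return g . f (first f, then g) for dict-encoded finite functions."""
    return {x: g[y] for x, y in f.items()}

def factorize(f):
    """Split f as f = m . e with e surjective (coimage) and m injective (image)."""
    image = sorted(set(f.values()), key=repr)        # the image of f
    e = {x: image.index(y) for x, y in f.items()}    # X ->> {0, ..., k-1}
    m = {i: y for i, y in enumerate(image)}          # {0, ..., k-1} >-> Y
    return e, m

def fill_in(e, f):
    """Diagonal d with f = d . e for a commutative square m . f = g . e.

    Assumes e is surjective and m is injective; then f is constant on the
    fibres of e, so d(e(x)) = f(x) is well defined.
    """
    d = {}
    for x, a in e.items():
        if a in d and d[a] != f[x]:
            raise ValueError("square does not commute with m injective")
        d[a] = f[x]
    return d

# Image factorization of a non-injective, non-surjective map.
f = {"a": 1, "b": 1, "c": 2}
e, m = factorize(f)

# A commutative square m2 . f2 = g2 . e2 and its diagonal.
e2 = {0: "p", 1: "p", 2: "q"}        # surjective
f2 = {0: "u", 1: "u", 2: "v"}
m2 = {"u": 10, "v": 20, "w": 30}     # injective
g2 = {"p": 10, "q": 20}
d = fill_in(e2, f2)
```

In this encoding, `compose(m, e)` recovers `f`, and the diagonal `d` satisfies both triangle equations `f2 = d . e2` and `g2 = m2 . d` from condition (iii).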
Whenever a proper factorization system (E, M) is fixed, quotients and subobjects in D are represented by morphisms in E and M, respectively. In particular, in the situation of (1) and (2) above, we represent quotient (co-)algebras and sub(co-)algebras by homomorphisms in E and M, respectively.

Closed categories. A symmetric monoidal category is a category D equipped with a functor $\otimes\colon D \times D \to D$ (tensor product), an object $I_D \in D$ (tensor unit), and isomorphisms

$$(X \otimes Y) \otimes Z \cong X \otimes (Y \otimes Z), \qquad X \otimes Y \cong Y \otimes X, \qquad I_D \otimes X \cong X \cong X \otimes I_D,$$

natural in X, Y, Z ∈ D, satisfying coherence laws [41, Chapter VII]. D is closed if the endofunctor $X \otimes (-)\colon D \to D$ has a right adjoint (denoted by $[X, -]$) for every X ∈ D, i.e. there is a natural isomorphism $D(X \otimes Y, Z) \cong D(Y, [X, Z])$.

Nominal sets.
Fix a countably infinite set A of names, and let Perm(A) be the group of all permutations π : A → A with π(a) = a for all but finitely many a. A nominal set [51] is a set X with a group action · : Perm(A) × X → X subject to the following property: for each x ∈ X there is a finite set S ⊆ A (a support of x) such that every π ∈ Perm(A) that leaves all elements of S fixed satisfies π · x = x. This implies that x has a least support supp(x) ⊆ A. The idea is that x is a syntactic object with bound and free variables (e.g. a λ-term modulo α-equivalence), and that supp(x) is its set of free variables. A nominal set X is orbit-finite if the number of orbits (i.e. equivalence classes of the relation x ≡ y iff x = π · y for some π) is finite. A map f : X → Y between nominal sets is equivariant if f(π · x) = π · f(x) for x ∈ X and π ∈ Perm(A).

We next develop the abstract categorical notion of automaton that underlies our generic learning algorithm.
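To make the nominal-set notions recalled above more tangible, here is a small Python sketch for words over the names, a standard example of a nominal set. It is only an illustration: names are modelled by integers, finite permutations by dictionaries that act as the identity outside their keys, and the helpers `act`, `supp`, `orbit_key` and `count_orbits` are hypothetical names introduced here.

```python
from itertools import product

def act(pi, word):
    """Apply a finite permutation pi (a dict, identity elsewhere) letterwise."""
    return tuple(pi.get(a, a) for a in word)

def supp(word):
    """Least support of a word: the set of names occurring in it."""
    return set(word)

def orbit_key(word):
    """Canonical orbit representative: rename names by order of first occurrence.

    Two words lie in the same orbit iff they have the same equality pattern
    between positions, which is exactly what this key records.
    """
    seen = {}
    return tuple(seen.setdefault(a, len(seen)) for a in word)

def count_orbits(n, names):
    """Number of orbits met by words of length n over the finite sample `names`."""
    return len({orbit_key(w) for w in product(names, repeat=n)})

def reverse(word):
    """Word reversal, an example of an equivariant map."""
    return tuple(reversed(word))

swap_01 = {0: 1, 1: 0}     # transposition (0 1)
swap_23 = {2: 3, 3: 2}     # transposition (2 3), fixes every name in supp(w)
w = (0, 1, 0)
```

With this encoding, `swap_23` leaves `w` unchanged because it fixes `supp(w)`, `swap_01` moves `w` within its orbit, and the set of words of a fixed length is orbit-finite: words of length 2 fall into two orbits (equal letters, distinct letters).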
Notation 3.1.
For the rest of this paper, let us fix
(1) a category D with a proper factorization system (E, M),
(2) an endofunctor $F\colon D \to D$, and
(3) two objects I, O ∈ D.

Definition 3.2 (Automaton (cf. [5, 10])). An (F-)automaton is given by an object Q ∈ D of states and three morphisms

$$\delta_Q\colon FQ \to Q, \qquad i_Q\colon I \to Q, \qquad f_Q\colon Q \to O,$$

representing transitions, initial states, and final states (or outputs), respectively. A homomorphism between automata $(Q, \delta_Q, i_Q, f_Q)$ and $(Q', \delta_{Q'}, i_{Q'}, f_{Q'})$ is a morphism $h\colon Q \to Q'$ in D such that the evident diagrams commute, i.e.

$$h \cdot \delta_Q = \delta_{Q'} \cdot Fh, \qquad h \cdot i_Q = i_{Q'}, \qquad f_{Q'} \cdot h = f_Q.$$

Example 3.3 (Σ-automata). Suppose that $(D, \otimes, I_D)$ is a symmetric monoidal closed category. Choosing the data

$$F = \Sigma \otimes (-), \qquad I = I_D, \qquad O \in D \text{ (arbitrary)}$$

for a fixed input alphabet Σ ∈ D yields Goguen’s notion of a Σ-automaton [35]. In our applications, we shall work with the categories Set (sets and functions), Pos (posets and monotone maps), JSL (join-semilattices with ⊥ and semilattice homomorphisms preserving ⊥), K-Vec (vector spaces over a field K and linear maps) and Nom (nominal sets and equivariant maps). The factorization systems and monoidal structures are given in the table below. In the fourth row, ⊗ is the usual tensor product of vector spaces representing bilinear maps. Similarly, in the third row, ⊗ is the tensor product of semilattices representing bimorphisms [13], i.e. semilattice morphisms $h\colon A \otimes B \to C$ correspond to maps $h'\colon A \times B \to C$ preserving ∨ and ⊥ in each component.

D        (E, M)                     ⊗    I_D        O
Set      (surjective, injective)    ×    1          {0, 1}
Pos      (surjective, embedding)    ×    1          {0 < 1}
JSL      (surjective, injective)    ⊗    {0 < 1}    {0 < 1}
K-Vec    (surjective, injective)    ⊗    K          K
Nom      (surjective, injective)    ×    1          {0, 1}

Table 1. Symmetric monoidal closed categories

We choose the input alphabet Σ ∈ D to be a finite set, a discrete finite poset, a free semilattice on a finite set, a finite-dimensional vector space, and the nominal set A of atoms, respectively, and the output object O ∈ D as shown in the last column. Then Σ-automata are precisely classical deterministic automata [53], ordered automata [50], semilattice automata [39], linear weighted automata [31], and nominal automata [18]. See Examples 3.9 and 3.10 for further details.

Example 3.4 (Tree automata). Let Γ be a signature and $F_\Gamma Q = \coprod_{n \in \mathbb{N}} \coprod_{\gamma \in \Gamma_n} Q^n$ on Set the induced polynomial functor, with $\Gamma_n$ the set of n-ary operations in Γ. Choosing I = ∅ and O =
2, an $F_\Gamma$-automaton is a (bottom-up) tree automaton over Γ [25], shortly a Γ-automaton. For the analogous functor $F_\Gamma$ on Pos and O = {0 < 1}, we obtain ordered Γ-automata.

In the following, we focus on adjoint automata, i.e. automata whose transition type F is a left adjoint:

Assumptions 3.5.
For the rest of this section and in Section 4, our data is required to satisfy the following conditions:
(1) D is complete and cocomplete; in particular, D has an initial object 0 and a terminal object 1.
(2) The unique morphism ¡ : 0 → I lies in M, and the unique morphism ! : O → 1 lies in E.
(3) The functor $F\colon D \to D$ has a right adjoint $G\colon D \to D$.
(4) The functor F preserves quotients (F(E) ⊆ E).

Example 3.6.
Every symmetric monoidal closed category D with F = Σ ⊗ − satisfies Assumption (3): closedness asserts precisely that F has the right adjoint G = [Σ, −]. The categories D of Table 1 also satisfy the remaining assumptions.

Remark 3.7.
The key feature of our adjoint setting is that automata can be dually viewed as algebras and coalgebras for suitable endofunctors. In more detail:
(1) An automaton Q corresponds precisely to an algebra

$$(F_I Q \xrightarrow{\alpha_Q} Q) = (I + FQ \xrightarrow{[i_Q,\, \delta_Q]} Q)$$

for the endofunctor $F_I = I + F$ equipped with an output morphism $f_Q\colon Q \to O$. Since $F_I$ preserves filtered colimits (using that the left adjoint F preserves all colimits and the functor $I + (-)$ preserves filtered colimits), the initial algebra $\mu F_I$ for $F_I$ emerges as the colimit of the initial ω-chain:

$$\mu F_I = \operatorname{colim}\bigl(\, 0 \xrightarrow{¡} F_I 0 \xrightarrow{F_I ¡} F_I^2 0 \xrightarrow{F_I^2 ¡} F_I^3 0 \to \cdots \,\bigr).$$

The colimit injections and the $F_I$-algebra structure on $\mu F_I$ are denoted by

$$j_n\colon F_I^n 0 \to \mu F_I \ (n \in \mathbb{N}) \qquad\text{and}\qquad \alpha\colon F_I(\mu F_I) \to \mu F_I.$$

For any automaton Q (viewed as an $F_I$-algebra), we write $e_Q\colon \mu F_I \to Q$ for the unique $F_I$-algebra homomorphism from $\mu F_I$ into Q.
(2) Dually, replacing $\delta_Q\colon FQ \to Q$ by its adjoint transpose $\delta_Q^@\colon Q \to GQ$, an automaton can be presented as a coalgebra

$$(Q \xrightarrow{\gamma_Q} G_O Q) = (Q \xrightarrow{\langle f_Q,\, \delta_Q^@ \rangle} O \times GQ)$$

for the endofunctor $G_O = O \times G$ equipped with an initial state $i_Q\colon I \to Q$. Since $G_O$ preserves cofiltered limits, the final coalgebra $\nu G_O$ arises as the limit of the final $\omega^{\mathrm{op}}$-cochain:

$$\nu G_O = \lim\bigl(\, 1 \xleftarrow{!} G_O 1 \xleftarrow{G_O !} G_O^2 1 \xleftarrow{G_O^2 !} G_O^3 1 \leftarrow \cdots \,\bigr).$$

The limit projections and the $G_O$-coalgebra structure on $\nu G_O$ are denoted by

$$j'_k\colon \nu G_O \to G_O^k 1 \ (k \in \mathbb{N}) \qquad\text{and}\qquad \gamma\colon \nu G_O \to G_O(\nu G_O).$$

For any automaton Q (viewed as a $G_O$-coalgebra), we write $m_Q\colon Q \to \nu G_O$ for the unique $G_O$-coalgebra homomorphism into $\nu G_O$.

Definition 3.8 (Language). (1) A language is a morphism $L\colon \mu F_I \to O$. (2) The language accepted by an automaton Q is defined by

$$L_Q = (\mu F_I \xrightarrow{e_Q} Q \xrightarrow{f_Q} O).$$

Example 3.9 (Σ-automata, continued). (1) In the setting of Example 3.3, the initial algebra $\mu F_I$ and the initial chain for the functor $F_I = I_D + \Sigma \otimes -$ can be described as follows [35].
Let $\Sigma^n = \Sigma \otimes \Sigma \otimes \cdots \otimes \Sigma$ denote the n-th tensor power of Σ (where $\Sigma^0 = I_D$), and put

$$\Sigma^{<n} = \coprod_{m<n} \Sigma^m \ (n \in \mathbb{N}) \qquad\text{and}\qquad \Sigma^* = \coprod_{n \in \mathbb{N}} \Sigma^n.$$

Then $\mu F_I$ is carried by the object $\Sigma^*$ of words, and the initial chain is given by the coproduct injections

$$\Sigma^{<0} \rightarrowtail \Sigma^{<1} \rightarrowtail \Sigma^{<2} \rightarrowtail \Sigma^{<3} \rightarrowtail \cdots.$$

(2) For the functor $G_O = O \times [\Sigma, -]$ the final coalgebra $\nu G_O$ is carried by the object $[\Sigma^*, O]$ of languages and we have the final cochain

$$[\Sigma^{<0}, O] \leftarrow [\Sigma^{<1}, O] \leftarrow [\Sigma^{<2}, O] \leftarrow [\Sigma^{<3}, O] \leftarrow \cdots$$

with connecting morphisms given by restriction. To see this, consider the contravariant functor $P = [-, O]\colon D \to D^{\mathrm{op}}$. It is not difficult to verify that P is a left adjoint (with right adjoint $P^{\mathrm{op}}$) and that there is a natural isomorphism $P F_I \cong G_O^{\mathrm{op}} P$. If $\mathbf{Alg}\,F_I$ and $\mathbf{Coalg}\,G_O$ denote the categories of $F_I$-algebras and $G_O$-coalgebras, it follows [36, Theorem 2.4] that P lifts to a left adjoint $P\colon \mathbf{Alg}\,F_I \to (\mathbf{Coalg}\,G_O)^{\mathrm{op}}$ given by

$$(F_I Q \xrightarrow{\alpha_Q} Q) \;\mapsto\; (PQ \xrightarrow{P\alpha_Q} P F_I Q \cong G_O PQ).$$

Since left adjoints preserve initial objects, P maps the initial algebra $\mu F_I$ to the final coalgebra $\nu G_O$, i.e. one has $\nu G_O = P(\mu F_I)$ with the coalgebra structure

$$\gamma = \bigl(\, \nu G_O = P(\mu F_I) \xrightarrow{P\alpha} P F_I(\mu F_I) \cong G_O P(\mu F_I) = G_O(\nu G_O) \,\bigr).$$

Moreover, applying P to the initial chain for $F_I$ yields the final cochain for $G_O$:

$$(\, 1 \xleftarrow{!} G_O 1 \xleftarrow{G_O !} G_O^2 1 \leftarrow \cdots \,) = (\, P0 \xleftarrow{P¡} P F_I 0 \xleftarrow{P F_I ¡} P F_I^2 0 \leftarrow \cdots \,).$$

Since $\mu F_I = \Sigma^*$ and $P = [-, O]$, we obtain the above description of $\nu G_O$ and of the final cochain for $G_O$.
(3) For the categories of Table 1, the categorical notion of (accepted) language given in Definition 3.8 thus specializes to the familiar ones. For illustration, let us spell out the case D = Set.
A Σ-automaton in Set is precisely a classical deterministic automaton: it is given by a set Q of states, a transition map $\delta_Q\colon \Sigma \times Q \to Q$, a map $i_Q\colon 1 \to Q$ (representing an initial state $q_0 = i_Q(*)$), and a map $f_Q\colon Q \to 2$ (representing the set $f_Q^{-1}[1]$ of final states). From (1) and (2) we obtain the well-known description of the initial algebra for $F_I = 1 + \Sigma \times -$ as the set $\Sigma^*$ of finite words over Σ (with algebra structure $\alpha\colon 1 + \Sigma \times \Sigma^* \to \Sigma^*$ given by $* \mapsto \varepsilon$ and $(a, w) \mapsto wa$) and of the final coalgebra for $G_O = 2 \times [\Sigma, -]$ as the set $[\Sigma^*, 2] \cong \mathcal{P}\Sigma^*$ of all languages $L \subseteq \Sigma^*$ [55]. The unique $F_I$-algebra homomorphism $e_Q\colon \Sigma^* \to Q$ maps a word $w \in \Sigma^*$ to the state of Q reached on input w. Thus, the language $L_Q = f_Q \cdot e_Q$ accepted by Q is the usual concept: w lies in $L_Q$ if and only if Q reaches a final state on input w.

Example 3.10 (Nominal automata). Our notion of automaton (Definition 3.2) has several natural instantiations to the category
Nom of nominal sets and equivariant maps.
(1) The simplest instance was already mentioned in Example 3.3: a Σ-automaton in Nom corresponds precisely to a nominal deterministic automaton [18]. For simplicity, we choose the alphabet Σ = A. A nominal automaton is given by a nominal set Q of states, an equivariant transition map $\delta_Q\colon A \times Q \to Q$, an equivariant map $i_Q\colon 1 \to Q$ (representing an equivariant initial state $q_0 \in Q$), and an equivariant map $f_Q\colon Q \to 2$ (representing an equivariant subset F ⊆ Q of final states). The initial algebra $A^*$ is the nominal set of words over A with group action $\pi \cdot (a_1 \cdots a_n) = (\pi \cdot a_1) \cdots (\pi \cdot a_n)$ for $a_1 \cdots a_n \in A^*$ and π ∈ Perm(A). Thus, a language $L\colon A^* \to 2$ is an equivariant set of words over A. Nominal automata with orbit-finite state space are known to be expressively equivalent to Kaminski and Francez’ [38] deterministic finite memory automata.
(2) Now Nom carries a further symmetric monoidal closed structure, the separated product ∗ given on objects by

$$X * Y = \{\, (x, y) \in X \times Y : x \mathbin{\#} y \,\},$$

where $x \mathbin{\#} y$ means that supp(x) ∩ supp(y) = ∅. The right adjoint of $F = A * (-)$ is the abstraction functor $G = [A](-)$ [51] which maps a nominal set X to the quotient of A × X modulo the equivalence relation ∼ defined by $(a, x) \sim (b, y)$ iff $(a\,c) \cdot x = (b\,c) \cdot y$ for some (equivalently, all) c ∈ A with $c \mathbin{\#} a, b, x, y$. We write ⟨a⟩x for the equivalence class of (a, x), which we think of as the result of binding the name a in x. F-automata are precisely the separated automata recently introduced by Moerman and Rot [46].
(3) By combining the adjunctions of (1) and (2), we obtain the adjoint pair of functors F ⊣ G with

$$F = A \times (-) + A * (-), \qquad G = [A, -] \times [A](-).$$

The ensuing notion of automaton coincides with one used in Kozen et al.’s [40] coalgebraic representation of nominal Kleene algebra [33].
Such automata have two types of transitions, free transitions ($[A, -]$) and bound transitions ($[A](-)$). They accept bar languages [56]: putting $\bar{A} = A \cup \{\, \langle a \mid a \in A \,\}$ (changing the original notation from $|a$ to $\langle a$ for compatibility with dynamic sequences as discussed next), a bar string is just a word over $\bar{A}$. We consider $\langle a$ as binding a to the right. This gives rise to the expected notions of free names and α-equivalence $\equiv_\alpha$. A bar string is clean if its bound names are mutually distinct and distinct from all its free names. Simplifying slightly, we define a bar language to be an equivariant set of bar strings modulo α-equivalence, i.e. an equivariant subset of $\bar{A}^*/{\equiv_\alpha}$. The initial algebra $\mu F_I$ is the nominal set of clean bar strings. A language in our sense is thus an equivariant set of clean bar strings; such languages are in bijective correspondence with bar languages [56].
(4) We note next that $[A](-)$ is itself a left adjoint, our first example of a left adjoint that is not of the form Σ ⊗ − for a closed structure ⊗. The right adjoint R is given on objects by $RX = \{\, f \in [A, X] : a \mathbin{\#} f(a) \text{ for all } a \in A \,\}$ [51]. We extend the above notion of automaton with this feature, i.e. we now work with the adjoint pair F ⊣ G given by

$$F = A \times (-) + A * (-) + [A](-), \qquad G = [A, -] \times [A](-) \times R.$$

The initial algebra $\mu F_I$ now consists of words built from three types of letters; we denote the new type of letters induced by the new summand $[A](-)$ in F by $a\rangle$ (for a ∈ A). Recalling that words grow to the right, we see that $a\rangle$ binds to the left. We read $a\rangle$ as deallocating the name or resource a. Languages in this model consist of dynamic sequences [34]. We associate such languages with a species of nominal automata having three types of transitions: free and bound transitions as above, and deallocating transitions $q \xrightarrow{a\rangle} q'$ with $a \mathbin{\#} q'$.
To the best of our knowledge, this notion of nominal automaton has not appeared in the literature before.

Example 3.11 (Sorted Σ-automata). In our applications in Section 5, we shall encounter a generalized version of Σ-automata where (1) the input object I is arbitrary, not necessarily equal to the tensor unit $I_D$, and (2) the automaton has a sorted object of states and consumes sorted words. This reflects the fact that the algebraic structures arising in algebraic language theory are often sorted. For brevity, we only treat the case of sorted automata in Set. Fix a set S of sorts and a family of sets $\Sigma = (\Sigma_{s,t})_{s,t \in S}$; we think of the elements of $\Sigma_{s,t}$ as letters with domain sort s and codomain sort t. We instantiate our setting to the adjoint pair $F \dashv G\colon \mathbf{Set}^S \to \mathbf{Set}^S$ defined as follows for $Q \in \mathbf{Set}^S$ and s, t ∈ S:

$$(FQ)_t = \coprod_{s \in S} \Sigma_{s,t} \times Q_s, \qquad (GQ)_s = \prod_{t \in S} [\Sigma_{s,t}, Q_t].$$

Choosing $I \in \mathbf{Set}^S$ arbitrary and the output object O = 2 (the S-sorted set with two elements in each component), an F-automaton is a sorted Σ-automaton. It is given by an S-sorted set of states Q, transitions $\delta_{Q,s,t}\colon \Sigma_{s,t} \times Q_s \to Q_t$ (s, t ∈ S), initial states $i_Q\colon I \to Q$ and an output map $f_Q\colon Q \to 2$ (representing an S-sorted set of final states). The initial algebra $\mu F_I$ is the S-sorted set of all well-sorted words over Σ with an additional first letter from I. More precisely, $(\mu F_I)_t$ consists of all words $x a_1 \cdots a_n$ with $x \in \coprod_{s \in S} I_s$ and $a_1, \ldots, a_n \in \coprod_{r,s} \Sigma_{r,s}$ such that the sorts of consecutive letters match, i.e. there exist sorts $s = s_0, s_1, \ldots, s_n = t \in S$ such that $x \in I_s$ and $a_i \in \Sigma_{s_{i-1}, s_i}$ for $i = 1, \ldots, n$. In particular, in the single-sorted case we have $\mu F_I = I \times \Sigma^*$. For any well-sorted input word $w = x a_1 \cdots a_n$ one obtains the run

$$\xrightarrow{x} q_0 \xrightarrow{a_1} q_1 \to \cdots \xrightarrow{a_n} q_n$$

in Q where $q_0 = i_{Q,s}(x)$ and $q_i = \delta_{Q, s_{i-1}, s_i}(a_i, q_{i-1})$ for $i = 1, \ldots, n$, and w is accepted if and only if $q_n$ is a final state.

We conclude with a discussion of minimal automata.

Definition 3.12 (Minimal automaton). An automaton Q is called (1) reachable if the unique $F_I$-algebra homomorphism $e_Q\colon \mu F_I \to Q$ lies in E, and (2) minimal if it is reachable and for every reachable automaton Q′ with $L_Q = L_{Q'}$, there exists a unique automata homomorphism from Q′ to Q.

Theorem 3.13.
For every language L there exists a minimal automaton Min(L) accepting L, unique up to isomorphism.

Proof sketch. We describe the construction of the minimal automaton. By equipping $\mu F_I$ with the final states $L\colon \mu F_I \to O$, we can view $\mu F_I$ as a $G_O$-coalgebra. Consider the (E, M)-factorization of the unique coalgebra homomorphism $m_{\mu F_I}$:

$$m_{\mu F_I} = \bigl(\, \mu F_I \overset{e_{\mathrm{Min}(L)}}{\twoheadrightarrow} \mathrm{Min}(L) \overset{m_{\mathrm{Min}(L)}}{\rightarrowtail} \nu G_O \,\bigr).$$

The object Min(L) can be uniquely equipped with an automaton structure for which $e_{\mathrm{Min}(L)}$ is an $F_I$-algebra homomorphism and $m_{\mathrm{Min}(L)}$ is a $G_O$-coalgebra homomorphism. This automaton is the minimal acceptor for L. □

The minimization theorem and its proof are closely related to the classical work of Arbib and Manes [10] on the minimal realization of dynamorphisms, i.e. F-algebra homomorphisms from $\mu F_I$ into $\nu G_O$. Under different assumptions on the type functor F and the base category D (e.g. co-wellpoweredness), minimization results were also established by Adámek and Trnková [5] and, recently, by van Heerdt et al. [63].

L∗ Algorithm
To motivate our learning algorithm for adjoint automata, we recall Angluin’s classical L∗ algorithm [8] for learning an unknown Σ-automaton Q in Set. The algorithm assumes that the learner has access to an oracle (the teacher) that can be asked two types of questions:
(1) Membership queries: given a word $w \in \Sigma^*$, is $w \in L_Q$?
(2) Equivalence queries: given an automaton H, is $L_H = L_Q$?
If the answer in (2) is “no”, the teacher discloses a counterexample, i.e. a word $w \in (L_Q \setminus L_H) \cup (L_H \setminus L_Q)$, to the learner.

The idea of L∗ is to compute a sequence of approximations of the unknown automaton Q by considering finite (co-)restrictions of the morphism $m_Q \cdot e_Q$, as indicated by the diagram below. Note that the kernel of $m_Q \cdot e_Q$ is precisely the well-known Nerode congruence of $L_Q$.

[Diagram (1): the initial chain $\Sigma^{<0} \rightarrowtail \cdots \rightarrowtail \Sigma^{<N} \rightarrowtail \Sigma^{<N+1} \rightarrowtail \cdots$ leading to $\Sigma^*$; the final cochain $[\Sigma^{<0}, 2] \twoheadleftarrow \cdots \twoheadleftarrow [\Sigma^{<K}, 2] \twoheadleftarrow [\Sigma^{<K+1}, 2] \twoheadleftarrow \cdots$ coming from $[\Sigma^*, 2]$; the composite $\Sigma^* \xrightarrow{e_Q} Q \xrightarrow{m_Q} [\Sigma^*, 2]$; and its restriction $h_{S,T}\colon S \to [T, 2]$ along $S \rightarrowtail \Sigma^{<N}$ and $[\Sigma^{<K}, 2] \twoheadrightarrow [T, 2]$, factorized as $S \overset{e_{S,T}}{\twoheadrightarrow} H_{S,T} \overset{m_{S,T}}{\rightarrowtail} [T, 2]$.]    (1)

In more detail, the algorithm maintains a pair (S, T) of finite sets $S, T \subseteq \Sigma^*$ (“states” and “tests”). For any such pair, the restriction of $m_Q \cdot e_Q$ to the domain S and codomain [T, 2],

$$h_{S,T}\colon S \to [T, 2], \qquad h_{S,T}(s)(t) = L_Q(st) \quad\text{for } s \in S,\ t \in T,$$

is called the observation table for (S, T). It is usually represented as an |S| × |T|-matrix with binary entries. The learner can compute $h_{S,T}$ via membership queries. The pair (S, T) is closed if for each s ∈ S and a ∈ Σ there exists s′ ∈ S with

$$h_{S \cup S\Sigma,\, T}(sa) = h_{S,T}(s').$$

It is consistent if, for all s, s′ ∈ S,

$$h_{S,T}(s) = h_{S,T}(s') \quad\text{implies}\quad h_{S,\, T \cup \Sigma T}(s) = h_{S,\, T \cup \Sigma T}(s').$$

Initially, one puts S = T = {ε}. If at some stage the pair (S, T) is not closed or not consistent, either S or T can be extended by invoking one of the following two procedures:

Extend S
Input:
A pair (S, T) that is not closed.
(0) Choose s ∈ S and a ∈ Σ such that $h_{S \cup S\Sigma,\, T}(sa) \neq h_{S,T}(s')$ for all s′ ∈ S.
(1) Put S := S ∪ {sa}.

Extend T
Input:
A pair ( S , T ) that is not consistent.(0) Choose s , s ′ ∈ S , t ∈ T and a ∈ Σ such that h S , T ( s ) = h S , T ( s ′ ) and h S , T ∪ Σ T ( s )( at ) , h S , T ∪ Σ T ( s ′ )( at ) . (1) Put T : = T ∪ { at } .The two procedures are applied repeatedly until the pair ( S , T ) is closed and consistent. Then, one constructs an au-tomaton H S , T , the hypothesis for ( S , T ) . Its set of states isthe image h S , T [ S ] , the transitions δ S , T : Σ × H S , T → H S , T are given by δ S , T ( a , h S , T ( s )) = h S ∪ S Σ , T ( sa ) for s ∈ S and a ∈ Σ , the initial state is h S , T ( ε ) , and a state h S , T ( s ) is finalif s ∈ L Q (i.e. h S , T ( s )( ε ) = δ S , T is equivalent to ( S , T ) being closed and consistent.The learner now tests whether L H S , T = L Q by askingan equivalence query. If the answer is “yes”, the algorithmterminates successfully; otherwise, the teacher’s counterex-ample and all its prefixes are added to S . In summary: L ∗ AlgorithmGoal:
Learn an automaton equivalent to an unknown au-tomaton Q .(0) Initialize S = T = { ε } .(1) While ( S , T ) is not closed or not consistent:(a) If ( S , T ) is not closed: Extend S .(b) If ( S , T ) is not consistent: Extend T .(2) Construct the hypothesis H S , T .(a) If L H S , T = L Q : Return H S , T .(b) If L H S , T , L Q : Put S : = S ∪ C , where C is the set ofprefixes of the teacher’s counterexample.(3) Go to (1).The algorithm runs in polynomial time w.r.t. the sizeof the minimal automaton Min ( L Q ) and the length ofthe longest counterexample provided by the teacher. Thelearned automaton (i.e. the correct hypothesis returnedin Step (2a)) is isomorphic to Min ( L Q ) . Correctness and termination rest on the invariant that S is prefix-closedand T is suffix-closed. Note that if T ⊆ Σ < K , then T yieldsa quotient [ Σ < K , ] ։ [ T , ] given by restriction. In thefollowing, T is represented via this quotient.We shall now develop all ingredients of L ∗ for adjoint F -automata. This requires additional assumptions, which holdfor all the functors discussed in Example 3.3, 3.10 and 3.11: Assumptions 4.1.
On top of our Assumptions 3.5, we require for the rest of this section that $F_I = I + F$ preserves subobjects ($F_I(\mathcal{M}) \subseteq \mathcal{M}$) and pullbacks of $\mathcal{M}$-morphisms, and that $G_O = O \times G$ preserves quotients ($G_O(\mathcal{E}) \subseteq \mathcal{E}$).

Our categorical learning algorithm generalizes (1) to the diagram shown below, where the upper and lower part are given by the initial chain for $F_I$ and the final cochain for $G_O$:

[Diagram (2): the top row is the initial chain $\cdots \to F_I^N 0 \xrightarrow{F_I^N ¡} F_I^{N+1} 0 \to \cdots$ with colimit injections $j_N : F_I^N 0 \to \mu F_I$; the bottom row is the final cochain $\cdots \leftarrow G_O^K 1 \xleftarrow{G_O^K !} G_O^{K+1} 1 \leftarrow \cdots$ with limit projections $j'_K : \nu G_O \to G_O^K 1$. On the right, $e_Q : \mu F_I \to Q$ and $m_Q : Q \to \nu G_O$. In the middle, $h_{s,t} : S \to T$ with its factorization $e_{s,t} : S \twoheadrightarrow H_{s,t}$, $m_{s,t} : H_{s,t} \rightarrowtail T$, where $s : S \rightarrowtail F_I^N 0$ and $t : G_O^K 1 \twoheadrightarrow T$.]

The algorithm maintains a pair $(s,t)$ of an $F_I$-subcoalgebra and a $G_O$-quotient algebra
\[ s : (S,\sigma) \rightarrowtail (F_I^N 0,\ F_I^N ¡), \qquad t : (G_O^K 1,\ G_O^K !) \twoheadrightarrow (T,\tau), \tag{3} \]
with $N, K > 0$. For $\Sigma$-automata in $\mathbf{Set}$, this means precisely that $S$ is a prefix-closed subset of $\Sigma^{<N}$, and that $T$ represents a suffix-closed subset of $\Sigma^{<K}$.

Initially, one takes $N = K = 1$, $s = \mathrm{id}_I$ and $t = \mathrm{id}_O$, which corresponds to Step (0) of the original $L^*$ algorithm.

Remark 4.2.
By Assumptions 3.5(2) and 4.1, every subcoalgebra $s : (S,\sigma) \rightarrowtail (F_I^N 0, F_I^N ¡)$ induces the two subcoalgebras
\[ (S,\sigma) \xrightarrow{\ F_I^N ¡ \,\cdot\, s\ } (F_I^{N+1} 0,\ F_I^{N+1} ¡) \xleftarrow{\ F_I s\ } (F_I S,\ F_I \sigma). \]
In the case of $\Sigma$-automata in $\mathbf{Set}$, the construction of these two subcoalgebras corresponds to viewing a prefix-closed subset $S \subseteq \Sigma^{<N}$ as a subset of $\Sigma^{<N+1}$, and to extending $S$ to the prefix-closed subset $S\Sigma \cup \{\varepsilon\} = S \cup S\Sigma \subseteq \Sigma^{<N+1}$. A dual remark applies to quotient algebras of $(G_O^K 1, G_O^K !)$.

Definition 4.3 (Observation table). Let $(s,t)$ be a pair as in (3), and let $Q$ be an automaton. The observation table for $(s,t)$ w.r.t. $Q$ is the morphism
\[ h^Q_{s,t} = \big(\, S \xrightarrow{s} F_I^N 0 \xrightarrow{j_N} \mu F_I \xrightarrow{e_Q} Q \xrightarrow{m_Q} \nu G_O \xrightarrow{j'_K} G_O^K 1 \xrightarrow{t} T \,\big). \]
Its $(\mathcal{E},\mathcal{M})$-factorization is denoted by
\[ h^Q_{s,t} = \big(\, S \overset{e^Q_{s,t}}{\twoheadrightarrow} H^Q_{s,t} \overset{m^Q_{s,t}}{\rightarrowtail} T \,\big). \]
In the following, we fix $Q$ (the unknown automaton to be learned) and omit the superscripts $(-)^Q$.

Remark 4.4.
In our categorical setting, membership queries are replaced by the assumption that the learner can compute the observation table $h^Q_{s,t}$ for each pair $(s,t)$. Importantly, this morphism depends only on the language of $Q$: one can show that for every automaton $Q'$ with $L_Q = L_{Q'}$ one has $m_Q \cdot e_Q = m_{Q'} \cdot e_{Q'}$, whence $h^Q_{s,t} = h^{Q'}_{s,t}$.

Definition 4.5 (Closed/Consistent pair). For any pair $(s,t)$ as in (3), let $cl_{s,t}$ and $cs_{s,t}$ be the unique diagonal fill-ins making all parts of the diagram below commute:

[Diagram: three rows $S \overset{e_{s,G_O t}}{\twoheadrightarrow} H_{s,G_O t} \overset{m_{s,G_O t}}{\rightarrowtail} G_O T$, then $S \overset{e_{s,t}}{\twoheadrightarrow} H_{s,t} \overset{m_{s,t}}{\rightarrowtail} T$, then $F_I S \overset{e_{F_I s,t}}{\twoheadrightarrow} H_{F_I s,t} \overset{m_{F_I s,t}}{\rightarrowtail} T$, connected by $\tau : G_O T \to T$, $\sigma : S \to F_I S$ and the fill-ins $cs_{s,t} : H_{s,G_O t} \to H_{s,t}$ and $cl_{s,t} : H_{s,t} \to H_{F_I s,t}$.]

The pair $(s,t)$ is closed if $cl_{s,t}$ is an isomorphism, and consistent if $cs_{s,t}$ is an isomorphism.

If $(s,t)$ is not closed or not consistent, at least one of the two dual procedures below applies. "Extend s" replaces $S \rightarrowtail F_I^N 0$ by a new subcoalgebra $S' \rightarrowtail F_I^{N+1} 0$, i.e. it moves to the right in the initial chain for $F_I$. Analogously, "Extend t" replaces $G_O^K 1 \twoheadrightarrow T$ by a new quotient algebra $G_O^{K+1} 1 \twoheadrightarrow T'$, and thus moves to the right in the final cochain for $G_O$.

Extend s
Input: A pair $(s,t)$ as in (3) that is not closed.
(0) Choose an object $S'$ and $\mathcal{M}$-morphisms $s_0 : S \rightarrowtail S'$ and $s_1 : S' \rightarrowtail F_I S$ such that $\sigma = s_1 \cdot s_0$ and $e_{F_I s,t} \cdot s_1 \in \mathcal{E}$.
(1) Replace $s : (S,\sigma) \rightarrowtail (F_I^N 0, F_I^N ¡)$ by the subcoalgebra
\[ F_I s \cdot s_1 : (S',\ F_I s_0 \cdot s_1) \rightarrowtail (F_I^{N+1} 0,\ F_I^{N+1} ¡). \]

Remark 4.6. (1) One trivial choice in Step (0) is $S' = F_I S$, $s_0 = \sigma$, $s_1 = \mathrm{id}$. To get an efficient implementation of the algorithm, one aims to choose the subobject $s_1 : S' \rightarrowtail F_I S$ as small as possible.
(2) The update of $s$ in Step (1) is well-defined, i.e. $F_I s \cdot s_1$ is a subcoalgebra. Indeed, the commutative diagram below shows that $F_I s \cdot s_1$ is a coalgebra homomorphism:

[Diagram: the coalgebra structures $F_I s_0 \cdot s_1$ on $S'$, $F_I \sigma$ on $F_I S$ and $F_I^{N+1} ¡$ on $F_I^{N+1} 0$, connected vertically by $s_1 : S' \rightarrowtail F_I S$, $F_I s : F_I S \to F_I^{N+1} 0$, $F_I s_1 : F_I S' \to F_I F_I S$ and $F_I F_I s : F_I F_I S \to F_I(F_I^{N+1} 0)$, using $F_I \sigma = F_I s_1 \cdot F_I s_0$.]

Moreover, since $s, s_1 \in \mathcal{M}$ and $F_I$ preserves $\mathcal{M}$ (see Assumptions 4.1), we have $F_I s \cdot s_1 \in \mathcal{M}$.
(3) In the case of $\Sigma$-automata in $\mathbf{Set}$, the condition $\sigma = s_1 \cdot s_0$ states that $S \subseteq S' \subseteq S \cup S\Sigma = S\Sigma \cup \{\varepsilon\}$. The condition $e_{F_I s,t} \cdot s_1 \in \mathcal{E}$ asserts that given $s \in S$ and $a \in \Sigma$ such that $h_{S \cup S\Sigma,\, T}(sa) \neq h_{S,T}(r)$ for all $r \in S$, there exists $s' \in S'$ with $h_{S \cup S\Sigma,\, T}(sa) = h_{S \cup S\Sigma,\, T}(s')$. Thus, "Extend s" subsumes several executions of "Extend S" in the original $L^*$ algorithm.

Extend t
Input: A pair $(s,t)$ as in (3) that is not consistent.
(0) Choose an object $T'$ and $\mathcal{E}$-morphisms $t_0 : G_O T \twoheadrightarrow T'$ and $t_1 : T' \twoheadrightarrow T$ such that $\tau = t_1 \cdot t_0$ and $t_0 \cdot m_{s,G_O t} \in \mathcal{M}$.
(1) Replace $t : (G_O^K 1, G_O^K !) \twoheadrightarrow (T,\tau)$ by the quotient algebra
\[ t_0 \cdot G_O t : (G_O^{K+1} 1,\ G_O^{K+1} !) \twoheadrightarrow (T',\ t_0 \cdot G_O t_1). \]

Remark 4.7. (1) Dually to Remark 4.6, a trivial choice in Step (0) is given by $T' = G_O T$, $t_0 = \mathrm{id}$, $t_1 = \tau$, and Step (1) is well-defined, i.e. $t_0 \cdot G_O t$ is a quotient algebra.
(2) In the case of $\Sigma$-automata in $\mathbf{Set}$, we view the quotients $T$ and $T'$ as subsets of $\Sigma^{<K}$ and $\Sigma^{<K+1}$, respectively, using the above identification between subsets and quotients. The condition $\tau = t_1 \cdot t_0$ then states that $T \subseteq T' \subseteq T \cup \Sigma T$. The condition $t_0 \cdot m_{s,G_O t} \in \mathcal{M}$ states that every inconsistency admits a witness in $T'$: given $s, s' \in S$ with $h_{S,T}(s) = h_{S,T}(s')$ but $h_{S,\, T \cup \Sigma T}(s) \neq h_{S,\, T \cup \Sigma T}(s')$, there exists $t' \in T'$ with $h_{S,T'}(s)(t') \neq h_{S,T'}(s')(t')$. Thus, "Extend t" subsumes several executions of "Extend T" in the original $L^*$ algorithm.

If $(s,t)$ is both closed and consistent, then we can define an automaton structure on $H_{s,t}$:

Definition 4.8 (Hypothesis). Let the pair $(s,t)$ be closed and consistent. The hypothesis for $(s,t)$ is the automaton $(H_{s,t}, \delta_{s,t}, i_{s,t}, f_{s,t})$ with states $H_{s,t}$ and structure defined below.
Here, inl/inr are coproduct injections, outl/outr are product projections, and adjoint transposes are taken along the adjunction $F \dashv G$:
(1) The transitions $\delta_{s,t} : F H_{s,t} \to H_{s,t}$ are given by the diagonal fill-in of the commutative square with top morphism $F e_{s,t} : FS \twoheadrightarrow F H_{s,t}$, bottom morphism $m_{s,t} : H_{s,t} \rightarrowtail T$, left vertical morphism $l_{s,t} : FS \to H_{s,t}$, and right vertical morphism $F H_{s,t} \to T$ the adjoint transpose of $r_{s,t} : H_{s,t} \to GT$, where
\[ l_{s,t} = \big(\, FS \xrightarrow{\mathrm{inr}} I + FS = F_I S \overset{e_{F_I s,t}}{\twoheadrightarrow} H_{F_I s,t} \xrightarrow{cl_{s,t}^{-1}} H_{s,t} \,\big), \]
\[ r_{s,t} = \big(\, H_{s,t} \xrightarrow{cs_{s,t}^{-1}} H_{s,G_O t} \overset{m_{s,G_O t}}{\rightarrowtail} G_O T = O \times GT \xrightarrow{\mathrm{outr}} GT \,\big). \]
(2) The initial states are
\[ i_{s,t} = \big(\, I \xrightarrow{\mathrm{inl}} I + FS = F_I S \overset{e_{F_I s,t}}{\twoheadrightarrow} H_{F_I s,t} \xrightarrow{cl_{s,t}^{-1}} H_{s,t} \,\big). \]
(3) The final states are
\[ f_{s,t} = \big(\, H_{s,t} \xrightarrow{cs_{s,t}^{-1}} H_{s,G_O t} \overset{m_{s,G_O t}}{\rightarrowtail} G_O T = O \times GT \xrightarrow{\mathrm{outl}} O \,\big). \]

Remark 4.9.
The square defining $\delta_{s,t}$ commutes: both legs can be shown to be equal to $FS \xrightarrow{\mathrm{inr}} I + FS = F_I S \xrightarrow{h_{F_I s,t}} T$. The idea of constructing the $F$-algebra structure of a hypothesis via diagonal fill-in originates in the abstract framework of CALF [64]. An important difference is that in the latter the existence of the two vertical morphisms of the corresponding square is postulated, while our present setting features a concrete description of $l_{s,t}$ and $r_{s,t}$.

Recall that in $L^*$, if a hypothesis $H_{S,T}$ is not correct (i.e. $L_{H_{S,T}} \neq L_Q$), the learner receives a counterexample $w \in \Sigma^*$ from the teacher and adds the set $C$ of all its prefixes to $S$. Identifying the word $w$ with this set, the concept of a counterexample has the following categorical version:

Definition 4.10 (Counterexample). Let $(s,t)$ be closed and consistent. A counterexample for $H_{s,t}$ is a subcoalgebra $c : (C,\gamma) \rightarrowtail (F_I^M 0, F_I^M ¡)$ for some $M > 0$ such that $H_{s,t}$ and $Q$ do not agree on inputs from $C$, that is,
\[ L_{H_{s,t}} \cdot j_M \cdot c \neq L_Q \cdot j_M \cdot c. \]

Remark 4.11. (1) If $L_{H_{s,t}} \neq L_Q$, then a counterexample always exists. Indeed, since the colimit injections $j_M : F_I^M 0 \to \mu F_I$ are jointly epimorphic, one has $L_{H_{s,t}} \cdot j_M \neq L_Q \cdot j_M$ for some $M > 0$, and then $(C,\gamma) = (F_I^M 0, F_I^M ¡)$ is a counterexample. To obtain an efficient algorithm, it is often assumed that the teacher delivers a minimal counterexample, i.e. $M$ is minimal and no proper subcoalgebra is a counterexample.
(2) Given a counterexample $c : (C,\gamma) \rightarrowtail (F_I^M 0, F_I^M ¡)$, one can add $c$ to the subcoalgebra $s : (S,\sigma) \rightarrowtail (F_I^N 0, F_I^N ¡)$ as follows: by Remark 4.2, we can assume that $M = N$, and then form the supremum
\[ s \vee c : (S \vee C,\ \sigma \vee \gamma) \rightarrowtail (F_I^N 0,\ F_I^N ¡) \]
of $s$ and $c$ in the lattice of subcoalgebras of $(F_I^N 0, F_I^N ¡)$, viz. the image of the homomorphism $[s,c] : S + C \to F_I^N 0$.

With all these ingredients at hand, we obtain the following abstract learning algorithm for adjoint $F$-automata:

Generalized $L^*$ Algorithm
Goal: Learn an automaton equivalent to an unknown automaton $Q$.
(0) Initialize $N = K = 1$, $s = \mathrm{id}_I$ and $t = \mathrm{id}_O$.
(1) While $(s,t)$ is not closed or not consistent:
    (a) If $(s,t)$ is not closed: Extend s.
    (b) If $(s,t)$ is not consistent: Extend t.
(2) Construct the hypothesis $H_{s,t}$.
    (a) If $L_{H_{s,t}} = L_Q$: Return $H_{s,t}$.
    (b) If $L_{H_{s,t}} \neq L_Q$: Replace the subcoalgebra $s$ by $s \vee c$, where $c$ is the teacher's counterexample.
(3) Go to (1).

To prove the termination and correctness of Generalized $L^*$, we need a finiteness assumption on the unknown automaton $Q$. We call a $\mathcal{D}$-object $Q$ Noetherian if both its poset of subobjects (ordered by $m \leq m'$ iff $m = m' \cdot p$ for some $p$) and that of its quotients (ordered by $e \leq e'$ iff $e = q \cdot e'$ for some $q$) contain no infinite strictly ascending chains.

Theorem 4.12. If $Q$ is Noetherian, then the generalized $L^*$ algorithm terminates and returns $\mathrm{Min}(L_Q)$.

Remark 4.13.
Under a slightly stronger finiteness condition on $Q$, we obtain a complexity bound. Suppose that $Q$ has finite height $n$, that is, $n$ is the maximum length of any strictly ascending chain of subobjects or quotients of $Q$. Then Steps (1a), (1b) and (2b) are executed $O(n)$ times.

Example 4.14. In $\mathcal{D} = \mathbf{Set}$, $\mathbf{Pos}$, $\mathbf{JSL}$, $K$-$\mathbf{Vec}$, and $\mathbf{Nom}$, the Noetherian objects are precisely the finite sets, finite posets, finite semilattices, finite-dimensional vector spaces and orbit-finite nominal sets, respectively. The height of $Q$ is equal to the number of elements of $Q$ (for $\mathcal{D} = \mathbf{Set}, \mathbf{Pos}$) or the dimension (for $\mathcal{D} = K$-$\mathbf{Vec}$). For $\mathcal{D} = \mathbf{Nom}$, the height of an orbit-finite set $Q$ can be shown to be polynomial in the number of orbits of $Q$ and $\max\{\, |\mathrm{supp}(q)| : q \in Q \,\}$, using upper bounds on the length of subgroup chains in symmetric groups [11].

Remark 4.15.
In the generalized $L^*$ algorithm, counterexamples are added to $S$. Dually, one may opt to add them to $T$ instead; for $\Sigma$-automata in $\mathbf{Set}$, this corresponds to a modification of Angluin's algorithm due to Maler and Pnueli [42] that avoids inconsistent observation tables, i.e. all tables constructed in the modified algorithm are consistent. In this dual approach, the accepted language of an automaton $Q$ is defined coalgebraically as the morphism
\[ L'_Q = \big(\, I \xrightarrow{i_Q} Q \xrightarrow{m_Q} \nu G_O \,\big), \]
and a counterexample is a quotient algebra $c : (G_O^M 1, G_O^M !) \twoheadrightarrow (C,\gamma)$ for some $M > 0$ such that $c \cdot j'_M \cdot L'_{H_{s,t}} \neq c \cdot j'_M \cdot L'_Q$. In Step (2b), a counterexample $c$ is added to the quotient algebra $t : (G_O^K 1, G_O^K !) \twoheadrightarrow (T,\tau)$ by forming the supremum of $t$ and $c$. To guarantee termination, our original requirement that $F_I$ preserves pullbacks of $\mathcal{M}$-morphisms (see Assumptions 4.1) needs to be replaced by the dual requirement that $G_O$ preserves pushouts of $\mathcal{E}$-morphisms.

Remark 4.16.
We elaborate on the connection between Generalized $L^*$ and the learning algorithm for coalgebras due to Barlocco et al. [14]. The latter is concerned with coalgebras whose semantics is given in terms of a coalgebraic logic, i.e. a natural transformation $\delta : L^{\mathrm{op}} P \to P B$ where $L : \mathcal{A} \to \mathcal{A}$ and $B : \mathcal{B} \to \mathcal{B}$ are endofunctors and $P : \mathcal{B} \to \mathcal{A}^{\mathrm{op}}$ is a left adjoint (see the left-hand square below).

[Diagram: the left-hand square has $P : \mathcal{B} \to \mathcal{A}^{\mathrm{op}}$ as top and bottom morphisms, $B$ and $L^{\mathrm{op}}$ as vertical morphisms, and is filled by $\delta$. The right-hand square is the instance with $\mathcal{A}^{\mathrm{op}} = (\mathcal{D}^{\mathrm{op}})^{\mathrm{op}}$, $\mathcal{B} = \mathcal{D}$, $P = \mathrm{Id}$, $B = G_O$, $L^{\mathrm{op}} = (G_O^{\mathrm{op}})^{\mathrm{op}}$ and $\delta = \mathrm{id}$.]

Here, $L$ represents the syntax (usually modalities over a propositional base logic embodied by $\mathcal{A}$), and $B$ the behaviour (defining the branching type of coalgebras on $\mathcal{B}$). The coalgebraic semantics of $F$-automata corresponds to the trivial logic shown in the right-hand square. In this sense, $F$-automata are formally covered by the framework of [14]. While Generalized $L^*$ is based on Angluin's $L^*$ algorithm, the coalgebraic learning algorithm in op. cit. generalizes Maler and Pnueli's approach, and thus needs to keep observation tables consistent (Remark 4.15). To this end, tables are required to satisfy a property called sharpness, which entails that the existence of extensions of non-closed tables is nontrivial and can only be guaranteed under strong assumptions on epimorphisms in the base category (e.g., all epimorphisms must split). Thus, the algorithm is effectively limited to coalgebras in $\mathbf{Set}$ and does not apply, e.g., to $\Sigma$-automata in $\mathbf{Nom}$; see Appendix. In our Generalized $L^*$, no such assumptions are needed since table extensions always exist (Remark 4.6). This makes our algorithm applicable in categories beyond $\mathbf{Set}$, including the ones in Example 3.3.

Generalized $L^*$ provides a unifying perspective on known learning algorithms for several notions of deterministic automata, including classical $\Sigma$-automata ($\mathcal{D} = \mathbf{Set}$ [8]), linear weighted automata ($\mathcal{D} = K$-$\mathbf{Vec}$ [12]) and nominal automata ($\mathcal{D} = \mathbf{Nom}$ [21, 47]). For $\mathcal{D} = \mathbf{JSL}$, finite semilattice automata can be interpreted as nondeterministic finite automata by means of an equivalence between the category of finite semilattices and a suitable category of finite closure spaces and relational morphisms [3, 48]. For any regular language $L$, the minimal $\Sigma$-automaton $\mathrm{Min}(L)$ in $\mathbf{JSL}$ corresponds under this equivalence to the minimal residual finite state automaton (RFSA) [28], a canonical nondeterministic acceptor for $L$ whose states are the join-irreducible elements of $\mathrm{Min}(L)$. Consequently, the $NL^*$ algorithm for learning RFSA due to Bollig et al. [20] is also subsumed by our categorical setting. We note that although $NL^*$ learns a minimal RFSA, the intermediate hypotheses arising in the algorithm are not necessarily RFSA, but general nondeterministic finite automata. Our categorical perspective provides an explanation of this phenomenon: it shows that $NL^*$ implicitly computes deterministic finite automata over $\mathbf{JSL}$, and not every such automaton corresponds to an RFSA.

Finally, our algorithm instantiates to new learning algorithms for nominal languages with name binding, including languages of dynamic sequences (Example 3.10), and for sorted languages (Example 3.11). A special instance of sorted automata where all transitions are sort-preserving (i.e. $\Sigma_{s,t} = \emptyset$ for $s \neq t$) appeared in the work of Moerman [45] on learning product automata.

In each of the above settings, in order to turn Generalized $L^*$ into a concrete algorithm, one only needs to provide a suitable data structure for representing observation tables $h_{s,t}$ by finite means, and a strategy for choosing the objects $S'$ and $T'$ in the procedures "Extend s" and "Extend t". We emphasize that these design choices can be non-trivial and depend on the specific structure of the underlying category $\mathcal{D}$. The typical approach is to represent the map $h_{s,t} : S \to T$ by restricting the objects $S$ and $T$ to finite sets of generators. For instance, finite-dimensional vector spaces can be represented by their bases ($\mathcal{D} = K$-$\mathbf{Vec}$), finite semilattices by their join-irreducible elements ($\mathcal{D} = \mathbf{JSL}$) and orbit-finite sets by subgroups of finite symmetric groups ($\mathcal{D} = \mathbf{Nom}$). Our above results demonstrate, however, that the core of our learning algorithm is independent of such implementation details; in particular, its correctness and termination, and parts of the complexity analysis, always come for free as instances of the general results in Theorem 4.12 and Remark 4.13. In this way, the categorical approach provides a clean separation between generic structures and design choices tailored to a specific application. This leads to a simplified derivation of learning algorithms in new settings.
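For the classical instance $\mathcal{D} = \mathbf{Set}$, the two design choices named above can be illustrated by a minimal sketch. It is not the authors' implementation: the observation table is recomputed from membership queries on demand, "Extend S" and "Extend T" each add a single word, and the equivalence oracle is approximated by brute-force comparison on all words up to an assumed length bound `max_len`. All function names (`lstar`, `accepts`, `bounded_teacher`) are hypothetical.

```python
from itertools import product

def accepts(hyp, word):
    """Run a hypothesis (initial state, transition dict, final states) on a word."""
    state, delta, final = hyp
    for a in word:
        state = delta[(state, a)]
    return state in final

def bounded_teacher(alphabet, member, max_len=6):
    """Assumed stand-in for equivalence queries: compare on all words up to max_len."""
    def equiv(hyp):
        for n in range(max_len + 1):
            for letters in product(alphabet, repeat=n):
                w = "".join(letters)
                if accepts(hyp, w) != member(w):
                    return w          # counterexample
        return None                   # no disagreement found within the bound
    return equiv

def lstar(alphabet, member, equiv):
    S, T = [""], [""]                 # prefix-closed states, suffix-closed tests

    def row(w):                       # the row h(w) of the observation table
        return tuple(member(w + t) for t in T)

    while True:
        changed = True
        while changed:
            changed = False
            rows = {row(s) for s in S}
            # Extend S: some row h(sa) does not occur among the rows of S.
            for s, a in product(list(S), alphabet):
                if row(s + a) not in rows:
                    S.append(s + a)
                    changed = True
                    break
            if changed:
                continue
            # Extend T: two equal rows are separated by some test at.
            for s1, s2 in product(S, S):
                if row(s1) == row(s2):
                    for a, t in product(alphabet, T):
                        if member(s1 + a + t) != member(s2 + a + t):
                            T.append(a + t)
                            changed = True
                            break
                if changed:
                    break
        # Hypothesis: states are the distinct rows of S.
        delta = {(row(s), a): row(s + a) for s in S for a in alphabet}
        final = {row(s) for s in S if member(s)}
        hyp = (row(""), delta, final)
        cex = equiv(hyp)
        if cex is None:
            return hyp
        for i in range(len(cex) + 1):  # add the counterexample and its prefixes
            if cex[:i] not in S:
                S.append(cex[:i])

# Example: words over {a, b} whose number of a's is divisible by 3.
mod3 = lambda w: w.count("a") % 3 == 0
hyp = lstar("ab", mod3, bounded_teacher("ab", mod3))
```

In this example the first hypothesis has two states, the teacher answers with a counterexample, and the learner ends with three states, one for each residue class. A real equivalence oracle would replace `bounded_teacher`; the bound is only there to keep the sketch self-contained.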
In this section, we investigate languages recognizable by monad algebras and show that the task of learning them can be reduced to learning $F$-automata.

Notation 5.1. Fix a monad $\mathbf{T} = (T, \mu, \eta)$ on $\mathcal{D}$ that preserves quotients ($T(\mathcal{E}) \subseteq \mathcal{E}$). We continue to work with the fixed objects $I, O \in \mathcal{D}$ of inputs and outputs (with $I$ now thought of as an input alphabet, so not normally the monoidal unit). Finally, we fix a full subcategory $\mathcal{D}_f \subseteq \mathcal{D}$ closed under subobjects and quotients, and call the objects of $\mathcal{D}_f$ the finite objects of $\mathcal{D}$.

Example 5.2.
Choose $\mathbf{Set}_f$, $\mathbf{Pos}_f$, $\mathbf{JSL}_f$, $K$-$\mathbf{Vec}_f$ and $\mathbf{Nom}_f$ to be the class of all Noetherian objects (see Example 4.14). Our monads of interest model formal languages:

    D                   T
    $\mathbf{Set}$      $T_+ X = X^+$
    $\mathbf{Set}^2$    $T_\infty(X,Y) = (X^+,\ X^{up} + X^* Y)$
    $\mathbf{Set}$      $T_\Gamma X = \Gamma$-trees over $X$
    $\mathbf{JSL}$      $T_* X$ = free idempotent semiring on $X$
    $K$-$\mathbf{Vec}$  $T_* X$ = free $K$-algebra on $X$
    $\mathbf{Pos}$      $T_S X$ = free stabilization algebra on $X$
    $\mathbf{Nom}$      $T_* X = X^*$

In the second row, $X^{up} = \{\, v w^\omega : v \in X^*,\ w \in X^+ \,\}$ denotes the set of ultimately periodic words over $X$, and in the third row, $\Gamma$ is a finitary algebraic signature. Finite algebras for the above seven monads correspond to finite semigroups, finite Wilke algebras [66], finite $\Gamma$-algebras, finite idempotent semirings, finite-dimensional $K$-algebras, finite stabilization algebras [26], and orbit-finite nominal monoids [19], respectively.

In the present setting, we shall consider the following generalized concept of a language:

Definition 5.3 (Language). A language is a morphism $L : TI \to O$ in $\mathcal{D}$. It is called recognizable if there exists a $\mathbf{T}$-homomorphism $e : (TI, \mu_I) \to (A, \alpha)$ into a finite $\mathbf{T}$-algebra $(A,\alpha)$ and a morphism $p : A \to O$ in $\mathcal{D}$ with $L = p \cdot e$. In this case, we say that $e$ recognizes $L$ (via $p$).

Remark 5.4.
The above definition generalizes the concepts of the previous sections. Indeed, if $F$ is a functor for which the free monad $\mathbf{T}_F$ (see Section 2) exists, then a language $L : T_F I \to O$ in the sense of Definition 5.3 is precisely a language $L : \mu F_I \to O$ in the sense of Definition 3.8. Moreover, since the categories of $F$-algebras and $\mathbf{T}_F$-algebras are isomorphic, $L$ is $\mathbf{T}_F$-recognizable if and only if $L$ is regular, i.e. accepted by some finite $F$-automaton.

Example 5.5. Many important automata-theoretic classes of languages can be characterized algebraically as recognizable languages for a monad. For the monads of Example 5.2 we obtain the following languages:

    D                   T            T-recognizable languages
    $\mathbf{Set}$      $T_+$        regular languages [50]
    $\mathbf{Set}^2$    $T_\infty$   $\omega$-regular languages [49]
    $\mathbf{Set}$      $T_\Gamma$   tree languages over $\Gamma$ [25]
    $\mathbf{JSL}$      $T_*$        regular languages [52]
    $K$-$\mathbf{Vec}$  $T_*$        recognizable weighted languages [54]
    $\mathbf{Pos}$      $T_S$        regular cost functions [23]
    $\mathbf{Nom}$      $T_*$        monoid-recognizable data languages [19]

In the following, we focus on ($\omega$-)regular languages and cost functions; see [58, 60] for details on the remaining examples.
(1) For the semigroup monad $\mathbf{T}_+$ on $\mathbf{Set}$ we obtain the classical concept of algebraic language recognition: a language $L \subseteq I^+$ is recognizable if there exists a semigroup morphism $e : I^+ \to S$ into a finite semigroup $S$ and a subset $P \subseteq S$ with $L = e^{-1}[P]$. Recognizable languages are exactly the ($\varepsilon$-free) regular languages [50]. In fact, the expressive equivalence between $\Sigma$-automata in $\mathbf{Set}$ and semigroups generalizes to $\Sigma$-automata in symmetric monoidal closed categories [4].
(2) Languages of infinite words can be captured algebraically as follows. A Wilke algebra [66] is a two-sorted set $(S_+, S_\omega)$ with a product $\cdot : S_+ \times S_+ \to S_+$, a mixed product $\cdot : S_+ \times S_\omega \to S_\omega$ and a unary operation $(-)^\omega : S_+ \to S_\omega$ subject to the laws
\[ (st)u = s(tu), \qquad (st)z = s(tz), \qquad s(ts)^\omega = (st)^\omega, \qquad (s^n)^\omega = s^\omega, \]
for all $s, t, u \in S_+$, $z \in S_\omega$ and $n > 0$. The free Wilke algebra generated by the two-sorted set $(X,Y)$ is $T_\infty(X,Y) = (X^+, X^{up} + X^* Y)$ with the two products given by concatenation of words, and $w^\omega = www\cdots$ for $w \in X^+$. In particular, choosing the input object $(I, \emptyset)$ for some set $I$ and the output object $O = (\{0,1\}, \{0,1\})$, we have $T_\infty(I,\emptyset) = (I^+, I^{up})$, and thus a language $L : T_\infty(I,\emptyset) \to O$ specifies a set of finite or ultimately periodic infinite words. Languages recognizable by Wilke algebras correspond to $\omega$-regular languages, i.e. languages accepted by Büchi automata [49, 66].
(3) Regular cost functions were introduced by Colcombet [23] as a quantitative extension of regular languages that provides a unifying framework for studying limitedness problems. A cost function over the alphabet $I$ is a function $f : I^* \to \mathbb{N} \cup \{\infty\}$. Two cost functions $f$ and $g$ are identified if, for every subset $A \subseteq I^*$, the function $f$ is bounded on $A$ iff $g$ is bounded on $A$. Regular cost functions correspond to languages recognizable by finite stabilization algebras. The latter are ordered algebras over a finitary signature $\Gamma$ whose operations include a product $\cdot$ and an $\omega$-power $(-)^\omega$, subject to suitable inequations; see [26, 58]. We let $\mathbf{T}_S$ denote the monad on $\mathbf{Pos}$ induced by this ordered algebraic theory.

Our generic approach to learning $\mathbf{T}$-recognizable languages is based on the idea of presenting the free algebra $\mathbf{T}I = (TI, \mu_I)$ and its finite quotient algebras as automata:

Definition 5.6 ($\mathbf{T}$-refinable). A quotient $e : TI \twoheadrightarrow A$ in $\mathcal{D}$ is $\mathbf{T}$-refinable if there exists a finite quotient algebra $e' : \mathbf{T}I \twoheadrightarrow (B,\beta)$ of $\mathbf{T}I$ and a morphism $f : B \twoheadrightarrow A$ with $e = f \cdot e'$.

Definition 5.7 (Automata presentation).
An automata presentation of the free $\mathbf{T}$-algebra $\mathbf{T}I$ is given by an endofunctor $F$ on $\mathcal{D}$ and an $F$-algebra structure $\delta : FTI \to TI$ such that
(1) $F(\mathcal{E}) \subseteq \mathcal{E}$, the initial algebra $\mu F_I$ exists, and every regular language $L : \mu F_I \to O$ admits a minimal automaton $\mathrm{Min}(L)$;
(2) the $F_I$-algebra $(TI, [\eta_I, \delta])$ is reachable (i.e. $e_{TI} \in \mathcal{E}$);
(3) a $\mathbf{T}$-refinable quotient $e : TI \twoheadrightarrow A$ in $\mathcal{D}$ carries a $\mathbf{T}$-algebra quotient iff $e$ carries an $F$-algebra quotient; that is, there exists $\alpha_A : TA \to A$ with $\alpha_A \cdot Te = e \cdot \mu_I$ iff there exists $\delta_A : FA \to A$ with $\delta_A \cdot Fe = e \cdot \delta$.
If in (3) only the implication "$\Rightarrow$" is required, $(F,\delta)$ is called a weak automata presentation.

Remark 5.8. (1) Examples of functors $F$ for which the first condition is satisfied include all functors satisfying the Assumptions 3.5, see Remark 3.7(1) and Theorem 3.13, and polynomial functors $F = F_\Gamma$ on $\mathbf{Set}$ or $\mathbf{Pos}$ for a signature $\Gamma$. Recall from Example 3.4 that $F_\Gamma$-automata are $\Gamma$-automata.
(2) Presentations of $\mathbf{T}$-algebras as (sorted) $\Sigma$-automata were previously studied by Urbat, Adámek, Chen, and Milius [59] for the special case where $\mathcal{D}$ is a variety of algebras and $\Sigma \in \mathcal{D}$ is a free algebra, and called unary presentations.

Example 5.9.
For all monads of Example 5.2, free algebras admit an automata presentation (in fact, a presentation as (sorted) $\Sigma$-automata [58–60]). Here we consider three cases:
(1) Semigroups. The free semigroup $T_+ I = I^+$ has a $\Sigma$-automata presentation $\delta : \Sigma \times I^+ \to I^+$ given by the alphabet $\Sigma = \{\, \overrightarrow{a} : a \in I \,\} \cup \{\, \overleftarrow{a} : a \in I \,\}$ and the transitions
\[ \delta(\overrightarrow{a}, w) = wa \quad\text{and}\quad \delta(\overleftarrow{a}, w) = aw \qquad \text{for } w \in I^+,\ a \in I. \]
Recall from Example 3.11 that $\mu F_I = I \times \Sigma^*$. The unique homomorphism $e_{I^+} : I \times \Sigma^* \to I^+$ interprets a word in $I \times \Sigma^*$ as a list of instructions for forming a word in $I^+$, e.g.
\[ e_{I^+}(a\,\overrightarrow{a}\,\overrightarrow{b}\,\overleftarrow{b}\,\overrightarrow{a}) = baaba. \]
For a weak automata presentation of $I^+$, it suffices to take the restriction $\delta' : \Sigma' \times I^+ \to I^+$ of $\delta$ where $\Sigma' = \{\, \overrightarrow{a} : a \in I \,\}$.
(2) Wilke algebras. The free Wilke algebra $T_\infty(I,\emptyset) = (I^+, I^{up})$ can be presented as a two-sorted $\Sigma$-automaton with the sorted alphabet $\Sigma = (\Sigma_{+,+}, \Sigma_{+,\omega}, \Sigma_{\omega,\omega}, \emptyset)$ given by
\[ \Sigma_{+,+} = \{\, \overrightarrow{a} : a \in I \,\} \cup \{\, \overleftarrow{a} : a \in I \,\}, \qquad \Sigma_{+,\omega} = \{\omega\} \cup \{\, \overrightarrow{v^\omega} : v \in I^+ \,\}, \qquad \Sigma_{\omega,\omega} = \{\, \overleftarrow{a} : a \in I \,\} \]
and the transitions below, where $v, w \in I^+$, $z \in I^{up}$, $a \in I$:
\[ \delta_{+,+}(\overrightarrow{a}, w) = wa, \quad \delta_{+,+}(\overleftarrow{a}, w) = aw, \quad \delta_{+,\omega}(\omega, w) = w^\omega, \quad \delta_{+,\omega}(\overrightarrow{v^\omega}, w) = w v^\omega, \quad \delta_{\omega,\omega}(\overleftarrow{a}, z) = az. \]
Recall from Example 3.11 that the initial algebra $\mu F_I$ consists of sorted words over $\Sigma$ with an additional first letter from $I$. The homomorphism $e_{(I^+, I^{up})} : \mu F_I \to (I^+, I^{up})$ views such a word as an instruction for forming a word in $(I^+, I^{up})$, e.g.
\[ e_{(I^+, I^{up})}(a\,\overrightarrow{b}\,\overrightarrow{a}\,\omega\,\overleftarrow{a}\,\overleftarrow{a}) = aa(aba)^\omega. \]
To obtain a weak automata presentation, it suffices to restrict $\Sigma_{+,+}$ and $\Sigma_{+,\omega}$ to the finite subalphabets $\Sigma'_{+,+} = \{\, \overrightarrow{a} : a \in I \,\}$ and $\Sigma'_{+,\omega} = \{\omega\}$. A $\Sigma'$-automaton is similar to a family of DFAs, a concept recently employed by Angluin and Fisman [9] for learning $\omega$-regular languages.
(3) Stabilization algebras. Suppose that $\mathbf{T}$ is a monad on $\mathbf{Set}$ or $\mathbf{Pos}$ induced by a finitary signature $\Gamma$ and (in-)equations $E$; see Section 2. Then $\mathbf{T}I$ can be presented as the $\Gamma$-automaton $\delta : F_\Gamma(TI) \to TI$ given by the $\Gamma$-algebra structure on the free $(\Gamma, E)$-algebra $TI$. The initial algebra $\mu(F_\Gamma)_I$ is the algebra $T_\Gamma I$ of $\Gamma$-terms over $I$, and the unique homomorphism $e_{TI} : T_\Gamma I \twoheadrightarrow TI$ interprets $\Gamma$-terms in $TI$. In particular, for the monad $\mathbf{T} = \mathbf{T}_S$ on $\mathbf{Pos}$, the free stabilization algebra $T_S I$ admits a $\Gamma$-automata presentation for the signature $\Gamma$ of Example 5.5(3).

From now on, we fix a weak automata presentation $(F,\delta)$ of the free $\mathbf{T}$-algebra $\mathbf{T}I$.

Definition 5.10 (Linearization). The linearization of a language $L : TI \to O$ is given by
\[ \mathrm{lin}(L) = \big(\, \mu F_I \overset{e_{TI}}{\twoheadrightarrow} TI \xrightarrow{L} O \,\big). \]

Example 5.11. (1)
Semigroups. Take the $\Sigma$-automata presentation of Example 5.9(1). Given $L \subseteq I^+$, the language $\mathrm{lin}(L) \subseteq I \times \Sigma^*$ consists of all possible ways of generating words in $L$ by starting with a letter $a \in I$ and adding letters on the left or on the right. For instance, if $L$ contains the word $abc$, then $\mathrm{lin}(L)$ contains the words $a\,\overrightarrow{b}\,\overrightarrow{c}$, $b\,\overleftarrow{a}\,\overrightarrow{c}$, $b\,\overrightarrow{c}\,\overleftarrow{a}$ and $c\,\overleftarrow{b}\,\overleftarrow{a}$.
(2) Wilke algebras. Take the weak presentation of Example 5.9(2). Given $L \subseteq (I^+, I^{up})$, the language $\mathrm{lin}(L)$ consists of all possible ways of generating words in $L$ by starting with a letter $a \in I$ and repeatedly applying any of the following operations: (i) right concatenation of a finite word with a letter; (ii) left concatenation of an infinite word with a letter; (iii) taking the $\omega$-power of a finite word. For instance, if $L$ contains the word $(ab)^\omega$, then $\mathrm{lin}(L)$ contains the words $a\,\overrightarrow{b}\,\omega$, $b\,\overrightarrow{a}\,\omega\,\overleftarrow{a}$, $a\,\overrightarrow{b}\,\omega\,\overleftarrow{b}\,\overleftarrow{a}$, $b\,\overrightarrow{a}\,\omega\,\overleftarrow{a}\,\overleftarrow{b}\,\overleftarrow{a}$, and so on. Thus, $\mathrm{lin}(L)$ is a two-sorted version of the language $\mathrm{lasso}(L)$ mentioned in the Introduction.
(3) Stabilization algebras. Take the presentation of Example 5.9(3). Given a language $L \subseteq T_S I$, the set $\mathrm{lin}(L) \subseteq T_\Gamma I$ consists of all $\Gamma$-trees whose interpretation in $T_S I$ lies in $L$.

As demonstrated by the above examples, the linearization allows us to identify a language $L : TI \to O$ with a language $\mathrm{lin}(L) : \mu F_I \to O$ of finite words or trees. Since the morphism $e_{TI} : \mu F_I \twoheadrightarrow TI$ is assumed to be epic by Definition 5.7(2), this identification is unique; that is, $\mathrm{lin}(L)$ uniquely determines $L$. In particular, in order to learn $L$, it is sufficient to learn $\mathrm{lin}(L)$. This approach is supported by the following result:

Theorem 5.12. If $L : TI \to O$ is a $\mathbf{T}$-recognizable language, then its linearization $\mathrm{lin}(L) : \mu F_I \to O$ is regular, i.e. accepted by some finite $F$-automaton.

Proof sketch. Let $e : \mathbf{T}I \to (A,\alpha)$ be a $\mathbf{T}$-homomorphism recognizing $L$ via $p : A \to O$. By replacing $e$ with its coimage, we may assume that $e \in \mathcal{E}$. The weak automata presentation yields an $F$-algebra structure on $A$ making $e$ an $F$-algebra homomorphism. Then $A$, viewed as an automaton with initial states $e \cdot \eta_I : I \to A$ and final states $p$, accepts $\mathrm{lin}(L)$. $\square$

In view of this theorem, one can apply any learning algorithm for finite $F$-automata (e.g. Generalized $L^*$ for the case of adjoint automata, or a learning algorithm for tree automata [30] if $F$ is a polynomial functor) to learn the minimal automaton $Q_L$ for $\mathrm{lin}(L)$. This automaton, together with the epimorphism $e_{TI}$, constitutes a finite representation of the unknown language $L : TI \to O$. If the given automata presentation for $\mathbf{T}I$ is non-weak, we can go one step further and infer from $Q_L$ a minimal algebraic representation of $L$:

Definition 5.13 (Syntactic $\mathbf{T}$-algebra). Let $L : TI \to O$ be recognizable. A syntactic $\mathbf{T}$-algebra for $L$ is a quotient $\mathbf{T}$-algebra $e_L : \mathbf{T}I \twoheadrightarrow \mathrm{Syn}(L)$ of $\mathbf{T}I$ such that (1) $e_L$ recognizes $L$, and (2) $e_L$ factorizes through every finite quotient $\mathbf{T}$-algebra $e : \mathbf{T}I \twoheadrightarrow (A,\alpha)$ recognizing $L$.

[Diagram: the triangle formed by $e : \mathbf{T}I \twoheadrightarrow (A,\alpha)$, $e_L : \mathbf{T}I \twoheadrightarrow \mathrm{Syn}(L)$ and the induced dashed morphism $(A,\alpha) \to \mathrm{Syn}(L)$.]

Theorem 5.14.
Let $(F,\delta)$ be an automata presentation for $\mathbf{T}I$. Then every $\mathbf{T}$-recognizable language $L : TI \to O$ has a syntactic $\mathbf{T}$-algebra $\mathrm{Syn}(L)$, and its corresponding $F$-automaton (via the given presentation) is the minimal automaton for $\mathrm{lin}(L)$:
\[ \mathrm{Syn}(L) \cong \mathrm{Min}(\mathrm{lin}(L)). \]
This theorem asserts that we can uniquely equip the learned minimal $F$-automaton $Q_L = \mathrm{Min}(\mathrm{lin}(L))$ with a $\mathbf{T}$-algebra structure $\alpha_L : TQ_L \to Q_L$ for which the unique automata homomorphism $e_L : TI \twoheadrightarrow Q_L$ is a $\mathbf{T}$-algebra homomorphism $e_L : \mathbf{T}I \twoheadrightarrow (Q_L, \alpha_L)$. Then $e_L$ is the syntactic algebra for $L$.

Remark 5.15.
To make the construction of $\mathrm{Syn}(L)$ from the learned automaton $Q_L$ effective, we need to assume that the morphisms $e_{Q_L}$, $e_{TI}$, $Te_{Q_L}$, $Te_{TI}$ and $\mu_I$ can be represented as (sorted families of) computable maps and moreover the maps $e_{TI}$ and $Te_{Q_L}$ admit computable (not necessarily morphic) right inverses $m$ and $n$, respectively. Then the $T$-algebra structure $\alpha_L$ of $\mathrm{Syn}(L)$ can be represented as the computable map
$$\alpha_L = e_{Q_L} \cdot m \cdot \mu_I \cdot Te_{TI} \cdot n;$$
see the commutative diagram below.
$$\begin{array}{ccccccc}
T(\mu F_I) & \xrightarrow{\;Te_{TI}\;} & TTI & \xrightarrow{\;\mu_I\;} & TI & \xleftarrow{\;e_{TI}\;} & \mu F_I \\
 & {\scriptstyle Te_{Q_L}}\searrow & \downarrow{\scriptstyle Te_L} & & \downarrow{\scriptstyle e_L} & \swarrow{\scriptstyle e_{Q_L}} & \\
 & & TQ_L & \xrightarrow{\;\alpha_L\;} & Q_L & &
\end{array}$$

Example 5.16.
This computation strategy works for all monads of Example 5.2. We consider our running examples:

(1) Semigroups. For the $\Sigma$-automata presentation of Example 5.9(1) and $L \subseteq I^+$, we compute the semigroup structure $\bullet : Q_L \times Q_L \to Q_L$ on $Q_L$ from its automaton structure as follows. Given $q, q' \in Q_L$, choose words $w, w' \in I \times \Sigma^*$ with $e_{Q_L}(w) = q$ and $e_{Q_L}(w') = q'$, i.e. witnesses for the reachability of $q$ and $q'$. Next, choose $v \in I \times \Sigma^*$ with $e_{I^+}(v) = e_{I^+}(w)\, e_{I^+}(w') \in I^+$, and put $q \bullet q' := e_{Q_L}(v)$.

(2) Wilke algebras. Analogous to the case of semigroups.

(3) Cost functions. For a monad $T$ on $\mathbf{Set}$ or $\mathbf{Pos}$ given by a signature $\Gamma$ and (in-)equations $E$ and the $\Gamma$-automata presentation of $TI$ in Example 5.9(3), the computation of $\alpha_L$ is trivial: the structure of the $\Gamma$-algebra $\mathrm{Syn}(L)$ is just the automaton structure of $Q_L$. In particular, this applies to the monad $T_S$ on $\mathbf{Pos}$ representing cost functions (Example 5.2(3)). Thus, we obtain the first learning algorithm for this class of languages.
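The recipe of item (1) can be sketched in code. The following is only an illustration under stated assumptions, not the presentation of Example 5.9(1) itself, which is not reproduced in this section: it assumes that the input alphabet consists of letters `("l", a)` and `("r", a)` acting on a nonempty word by multiplication with `a` on the left and on the right, that generators are single characters, and that the given automaton is the minimal automaton of the linearization of a recognizable language. All function names are hypothetical.

```python
from collections import deque

def run(init, delta, word):
    """e_Q: evaluate a word (a0, ops) in I x Sigma* to the state it reaches."""
    a0, ops = word
    q = init[a0]
    for op in ops:
        q = delta[(q, op)]
    return q

def interpret(word):
    """Assumed e_{I^+}: read (a0, ops) as a nonempty string over I."""
    a0, ops = word
    s = a0
    for side, a in ops:
        s = s + a if side == "r" else a + s
    return s

def witnesses(init, delta, sigma):
    """Breadth-first search: one reachability witness per reachable state."""
    wit, queue = {}, deque()
    for a0, q in init.items():
        if q not in wit:
            wit[q] = (a0, ())
            queue.append(q)
    while queue:
        q = queue.popleft()
        a0, ops = wit[q]
        for op in sigma:
            q2 = delta[(q, op)]
            if q2 not in wit:
                wit[q2] = (a0, ops + (op,))
                queue.append(q2)
    return wit

def multiplication(init, delta, sigma):
    """Table of q * q' := e_Q(v), where v interprets to the product of witnesses."""
    wit = witnesses(init, delta, sigma)
    table = {}
    for q, (a0, ops) in wit.items():
        for q2, w2 in wit.items():
            # v = w followed by right multiplications spelling out interpret(w2)
            v = (a0, ops + tuple(("r", c) for c in interpret(w2)))
            table[(q, q2)] = run(init, delta, v)
    return table

# Hypothetical example: L = words over {a, b} containing at least one "a".
letters = ["a", "b"]
sigma = [(side, a) for side in ("l", "r") for a in letters]
init = {"a": "Y", "b": "N"}  # Y: an "a" has been seen, N: not yet
delta = {(q, (side, a)): ("Y" if a == "a" else q)
         for q in ("Y", "N") for (side, a) in sigma}
table = multiplication(init, delta, sigma)
```

In this toy instance the computed table is the two-element semilattice in which `"Y"` is absorbing. The result does not depend on the chosen witnesses only because the automaton is assumed to be minimal for the linearization; for an arbitrary automaton the table need not be well defined.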
We have presented a generic algorithm (Generalized $L^*$) for learning $F$-automata that forms a uniform abstraction of $L^*$-type algorithms, their correctness proofs, and parts of their complexity analysis, and instantiates to several new learning algorithms, e.g. for various notions of nominal automata with name binding. Moreover, we have shown how to extend the scope of Generalized $L^*$, and other learning algorithms for $F$-automata, to languages recognizable by monad algebras. This gives rise to a generic approach to learning numerous types of languages, including cases for which no learning algorithms are known (e.g. cost functions).

The next step is to turn our high-level categorical approach into an implementation-level algorithm, parametric in the monad $T$ and its automata presentation, with corresponding tool support. We expect that the recent work on coalgebraic minimization algorithms and their implementation [27, 29] can provide guidance. It should be illuminating to experimentally compare the performance of the generic algorithm with tailor-made algorithms for specific types of automata.

Our generalized $L^*$ algorithm is concerned with adjoint $F$-automata and applies to a wide variety of automata on finite words (including weighted, residual nondeterministic, and nominal automata), but presently not to tree automata. To deal with the latter, the adjointness of the type functor $F$ needs to be relaxed, which entails that a coalgebraic semantics is no longer directly available. A categorical approach to learning tree automata, assuming a purely algebraic point of view, was recently proposed by van Heerdt et al. [62]. The subtle interplay between the algebraic and coalgebraic aspects underlying learning algorithms is up for further investigation.

References

[1] Jiří Adámek, Filippo Bonchi, Mathias Hülsbusch, Barbara König, Stefan Milius, and Alexandra Silva. 2012. A Coalgebraic Perspective on Minimization and Determinization. In
Foundations of Software Science and Computational Structures, Lars Birkedal (Ed.). Springer Berlin Heidelberg, 58–73.
[2] Jiří Adámek, Horst Herrlich, and George Strecker. 2004. Abstract and Concrete Categories: The Joy of Cats. Dover Publications. 528 pages.
[3] Jiří Adámek, Stefan Milius, Robert S. R. Myers, and Henning Urbat. 2014. On Continuous Nondeterminism and State Minimality. In Proc. Mathematical Foundations of Programming Science (MFPS XXX) (Electron. Notes Theor. Comput. Sci., Vol. 308), Bart Jacobs, Alexandra Silva, and Sam Staton (Eds.). Elsevier, 3–23.
[4] J. Adámek, S. Milius, and H. Urbat. 2015. Syntactic Monoids in a Category. In Proc. CALCO'15 (LIPIcs). Schloss Dagstuhl–Leibniz-Zentrum für Informatik.
[5] Jiří Adámek and Vera Trnková. 1989. Automata and Algebras in Categories. Springer.
[6] Jiří Adámek. 1974. Free algebras and automata realizations in the language of categories. Commentationes Mathematicae Universitatis Carolinae 15, 4 (1974), 589–602. http://eudml.org/doc/16649
[7] Mikołaj Bojańczyk. 2015. Recognisable languages over monads. In Proc. DLT 2015, Igor Potapov (Ed.). LNCS, Vol. 9168. Springer, 1–13. http://arxiv.org/abs/1502.04898
[8] Dana Angluin. 1987. Learning Regular Sets from Queries and Counterexamples. Inf. Comput. 75, 2 (1987), 87–106.
[9] Dana Angluin and Dana Fisman. 2016. Learning regular omega languages. Theoretical Computer Science 650 (2016), 57–72.
[10] Michael A. Arbib and Ernest G. Manes. 1975. Adjoint machines, state-behavior machines, and duality. Journal of Pure and Applied Algebra 6, 3 (1975), 313–344.
[11] László Babai. 1986. On the length of subgroup chains in the symmetric group. Comm. Alg. 14, 9 (1986), 1729–1736. https://doi.org/10.1080/00927878608823393
[12] Borja Balle and Mehryar Mohri. 2015. Learning Weighted Automata. In Algebraic Informatics, Andreas Maletti (Ed.). Springer, 1–21.
[13] Bernhard Banaschewski and Evelyn Nelson. 1976. Tensor products and biomorphisms. Can. Math. Bull. 19, 4 (1976), 385–402. https://doi.org/10.4153/CMB-1976-060-2
[14] Simone Barlocco, Clemens Kupke, and Jurriaan Rot. 2019. Coalgebra Learning via Duality. In Proc. FOSSACS 2019. 62–79.
[15] Michael Barr. 1970. Coequalizers and free triples. Mathematische Zeitschrift
Logic, Language, Information and Computation, Luke Ong and Ruy de Queiroz (Eds.). Springer Berlin Heidelberg, 191–205.
[17] S. L. Bloom. 1976. Varieties of ordered algebras.
J. Comput. Syst. Sci. 2, 13 (1976), 200–212.
[18] Mikołaj Bojańczyk, Bartek Klin, and Sławomir Lasota. 2014. Automata theory in nominal sets. Log. Methods Comput. Sci. 10, 3:4 (2014), 44 pp.
[19] Mikołaj Bojańczyk. 2013. Nominal Monoids. Theory of Computing Systems 53, 2 (2013), 194–222.
[20] Benedikt Bollig, Peter Habermehl, Carsten Kern, and Martin Leucker. 2009. Angluin-Style Learning of NFA. In .
[21] Benedikt Bollig, Peter Habermehl, Martin Leucker, and Benjamin Monmege. 2014. A Robust Class of Data Languages and an Application to Learning. Logical Methods in Computer Science 10, 4 (2014).
[22] Venanzio Capretta, Tarmo Uustalu, and Varmo Vene. 2006. Recursive coalgebras from comonads. Information and Computation
Automata, Languages and Programming, Susanne Albers, Alberto Marchetti-Spaccamela, Yossi Matias, Sotiris Nikoletseas, and Wolfgang Thomas (Eds.). Springer Berlin Heidelberg, 139–150.
[24] Thomas Colcombet and Daniela Petrişan. 2017. Automata Minimization: a Functorial Approach. In , Filippo Bonchi and Barbara König (Eds.). Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 8:1–8:16.
[25] H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. 2007. Tree Automata Techniques and Applications. Available on: .
[26] L. Daviaud, D. Kuperberg, and J.-É. Pin. 2016. Varieties of Cost Functions. In Proc. STACS 2016 (LIPIcs, Vol. 47), N. Ollinger and H. Vollmer (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 30:1–30:14.
[27] Hans-Peter Deifel, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. 2019. Generic Partition Refinement and Weighted Tree Automata. In Formal Methods – The Next 30 Years, Maurice H. ter Beek, Annabelle McIver, and José N. Oliveira (Eds.). Springer International Publishing, 280–297.
[28] François Denis, Aurélien Lemay, and Alain Terlutte. 2001. Residual Finite State Automata. In STACS 2001, Afonso Ferreira and Horst Reichel (Eds.). 144–157.
[29] Ulrich Dorsch, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. 2017. Efficient Coalgebraic Partition Refinement. In Proc. 28th International Conference on Concurrency Theory (CONCUR 2017) (LIPIcs). Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik. https://arxiv.org/abs/1705.08362
[30] Frank Drewes and Johanna Högberg. 2003. Learning a Regular Tree Language from a Teacher. In Developments in Language Theory, Zoltán Ésik and Zoltán Fülöp (Eds.). Springer Berlin Heidelberg, 279–291.
[31] M. Droste, W. Kuich, and H. Vogler (Eds.). 2009. Handbook of weighted automata. Springer.
[32] Azadeh Farzan, Yu-Fang Chen, Edmund M. Clarke, Yih-Kuen Tsay, and Bow-Yaw Wang. 2008. Extending Automated Compositional Verification to the Full Class of Omega-regular Languages. In Proc. TACAS 2008. 2–17.
[33] Murdoch James Gabbay and Vincenzo Ciancia. 2011. Freshness and Name-Restriction in Sets of Traces with Names. In Foundations of Software Science and Computational Structures, FOSSACS 2011 (LNCS, Vol. 6604). Springer, 365–380. https://doi.org/10.1007/978-3-642-19805-2
[34] Murdoch James Gabbay, Dan R. Ghica, and Daniela Petrişan. 2015. Leaving the Nest: Nominal Techniques for Variables with Interleaving Scopes. In Computer Science Logic, CSL 2015 (LIPIcs, Vol. 41). Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 374–389.
[35] Joseph A. Goguen. 1975. Discrete-Time Machines in Closed Monoidal Categories. I. J. Comput. Syst. Sci. 10, 1 (1975), 1–43.
[36] Claudio Hermida and Bart Jacobs. 1998. Structural Induction and Coinduction in a Fibrational Setting. Information and Computation
Automata Learning: A Categorical Perspective. Springer, 384–406.
[38] Michael Kaminski and Nissim Francez. 1994. Finite-memory automata. Theoret. Comput. Sci.
Theoretical Computer Science
Automata, Languages, and Programming, ICALP 2015 (LNCS, Vol. 9135). Springer, 286–298. https://doi.org/10.1007/978-3-662-47666-6
[41] S. Mac Lane. 1998. Categories for the Working Mathematician (2nd ed.). Springer.
[42] Oded Maler and Amir Pnueli. 1995. On the Learnability of Infinitary Regular Sets. Inf. Comput.
Algebraic Theories. Graduate Texts in Mathematics, Vol. 26. Springer.
[44] Stefan Milius, Lutz Schröder, and Thorsten Wißmann. 2016. Regular Behaviours with Names. Appl. Categ. Structures 24, 5 (2016), 663–701.
[45] Joshua Moerman. 2019. Learning Product Automata. In Proc. 14th International Conference on Grammatical Inference 2018 (Proceedings of Machine Learning Research, Vol. 93), Olgierd Unold, Witold Dyrka, and Wojciech Wieczorek (Eds.). PMLR, 54–66.
[46] Joshua Moerman and Jurriaan Rot. 2019. Separation and Renaming in Nominal Sets. CoRR abs/1906.00763 (2019). arXiv:1906.00763
[47] Joshua Moerman, Matteo Sammartino, Alexandra Silva, Bartek Klin, and Michał Szynwelski. 2017. Learning Nominal Automata. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, 613–625.
[48] Robert S. R. Myers, Jiří Adámek, Stefan Milius, and Henning Urbat. 2014. Canonical Nondeterministic Automata. In Proc. Coalgebraic Methods in Computer Science (CMCS'14) (Lecture Notes Comput. Sci., Vol. 8446), Marcello M. Bonsangue (Ed.). Springer, 189–210.
[49] D. Perrin and J.-É. Pin. 2004. Infinite Words. Elsevier.
[50] J.-É. Pin. 2016. Mathematical Foundations of Automata Theory. (November 2016). Available at .
[51] Andrew M. Pitts. 2013. Nominal Sets: Names and Symmetry in Computer Science. Cambridge University Press.
[52] L. Polák. 2001. Syntactic semiring of a language. In Proc. MFCS'01 (LNCS, Vol. 2136), J. Sgall, A. Pultr, and P. Kolman (Eds.). Springer, 611–620.
[53] Michael O. Rabin and Dana S. Scott. 1959. Finite Automata and Their Decision Problems. IBM J. Res. Dev. 3, 2 (April 1959), 114–125.
[54] C. Reutenauer. 1980. Séries formelles et algèbres syntactiques. J. Algebra 66 (1980), 448–483.
[55] Jan J. M. M. Rutten. 2000. Universal coalgebra: a theory of systems. Theoret. Comput. Sci.
Foundations of Software Science and Computation Structures, FOSSACS 2017 (LNCS, Vol. 10203). Springer, 124–142. https://doi.org/10.1007/978-3-662-54458-7
[57] Paul Taylor. 1999. Practical Foundations of Mathematics. Cambridge University Press.
[58] Henning Urbat, Jiří Adámek, Liang-Ting Chen, and Stefan Milius. 2017. Eilenberg Theorems for Free. CoRR abs/1602.05831 (2017). http://arxiv.org/abs/1602.05831
[59] Henning Urbat, Jiří Adámek, Liang-Ting Chen, and Stefan Milius. 2017. Eilenberg Theorems for Free. In Proc. MFCS 2017 (LIPIcs, Vol. 83), Kim G. Larsen, Hans L. Bodlaender, and Jean-François Raskin (Eds.). Schloss Dagstuhl.
[60] Henning Urbat and Stefan Milius. 2019. Varieties of Data Languages. In Proc. 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019) (LIPIcs, Vol. 132), Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi (Eds.). 130:1–130:14. (Presents the first Eilenberg-type correspondence for data languages and a nominal Eilenberg-Schützenberger theorem characterizing pseudovarieties of nominal monoids.)
[61] Frits Vaandrager. 2017. Model Learning. Commun. ACM 60, 2 (2017), 86–95.
[62] Gerco van Heerdt, Tobias Kappé, Jurriaan Rot, Matteo Sammartino, and Alexandra Silva. 2020. A Categorical Framework for Learning Generalised Tree Automata. https://arxiv.org/abs/2001.05786
[63] Gerco van Heerdt, Tobias Kappé, Jurriaan Rot, Matteo Sammartino, and Alexandra Silva. 2019. Tree Automata as Algebras: Minimisation and Determinisation. CoRR abs/1904.08802 (2019). http://arxiv.org/abs/1904.08802
[64] Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva. 2017. CALF: Categorical Automata Learning Framework. In Proc. CSL 2017. 29:1–29:24.
[65] Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva. 2017. Learning Automata with Side-Effects. CoRR abs/1704.08055 (2017). http://arxiv.org/abs/1704.08055
[66] T. Wilke. 1991. An Eilenberg Theorem for ∞-Languages. In Proc. ICALP'91 (LNCS, Vol. 510). Springer, 588–599.
A Appendix: Omitted Proofs and Details
In this appendix, we provide full proofs of all our results and a more detailed treatment of some examples omitted due to space restrictions.
Discussion of the Assumptions 3.5 and 4.1
We comment on some technical consequences of our Assumptions 3.5 and 4.1.

Remark A.1. The assumption $F(\mathcal{E}) \subseteq \mathcal{E}$ implies that the factorization system $(\mathcal{E}, \mathcal{M})$ of $\mathcal{D}$ lifts to automata: given an automata homomorphism $h : Q \to Q'$ and its $(\mathcal{E}, \mathcal{M})$-factorization $h = (Q \overset{e}{\twoheadrightarrow} Q'' \overset{m}{\rightarrowtail} Q')$ in $\mathcal{D}$, there exists a unique automata structure $(Q'', \delta_{Q''}, i_{Q''}, f_{Q''})$ on $Q''$ such that both $e$ and $m$ are automata homomorphisms. Indeed, the transitions $\delta_{Q''}$ are given by diagonal fill-in
$$\begin{array}{ccc}
FQ & \xrightarrow{\;\delta_Q\;} & Q \\
{\scriptstyle Fe}\downarrow & & \downarrow{\scriptstyle e} \\
FQ'' & \stackrel{\delta_{Q''}}{\dashrightarrow} & Q'' \\
{\scriptstyle Fm}\downarrow & & \downarrow{\scriptstyle m} \\
FQ' & \xrightarrow{\;\delta_{Q'}\;} & Q'
\end{array}$$
and the initial and final states by
$$i_{Q''} = (I \xrightarrow{\;i_Q\;} Q \xrightarrow{\;e\;} Q''), \qquad f_{Q''} = (Q'' \xrightarrow{\;m\;} Q' \xrightarrow{\;f_{Q'}\;} O).$$

Remark A.2. The condition $F_I(\mathcal{M}) \subseteq \mathcal{M}$ makes sure that the factorization system $(\mathcal{E}, \mathcal{M})$ lifts from $\mathcal{D}$ to $\mathbf{Coalg}\, F_I$, the category of $F_I$-coalgebras: given an $F_I$-coalgebra homomorphism $h : (C, \gamma) \to (C', \gamma')$ and its $(\mathcal{E}, \mathcal{M})$-factorization $h = (C \overset{e}{\twoheadrightarrow} C'' \overset{m}{\rightarrowtail} C')$ in $\mathcal{D}$, there is a unique $F_I$-coalgebra structure $(C'', \gamma'')$ on $C''$ such that both $e$ and $m$ are coalgebra homomorphisms. The structure $\gamma''$ is defined via diagonal fill-in in analogy to Remark A.1.

Dually, the condition $G_O(\mathcal{E}) \subseteq \mathcal{E}$ implies that $\mathbf{Alg}\, G_O$, the category of $G_O$-algebras, has a factorization system lifting $(\mathcal{E}, \mathcal{M})$.

Details for Example 3.3
We show that for each of the five categories $\mathcal{D}$ of Table 1 and the endofunctors $F$ and $G$ on $\mathcal{D}$ given by $F = \Sigma \otimes (-)$ and $G = [\Sigma, -]$, the Assumptions 3.5(1)–(4) and 4.1 are satisfied.

Clearly, all the categories $\mathcal{D}$ with the corresponding choices of $I$ and $O$ satisfy the Assumptions 3.5(1)(2). Moreover, (3) holds because $\mathcal{D}$ is closed. For (4), note that in all cases $\mathcal{E}$ coincides with the class of all epimorphisms. Since every left adjoint $F$ preserves epimorphisms, it follows that $F(\mathcal{E}) \subseteq \mathcal{E}$. It remains to verify the Assumptions 4.1. We consider the cases $\mathcal{D} = \mathbf{Set}, \mathbf{Pos}, \mathbf{JSL}, K\text{-}\mathbf{Vec}$; for $\mathcal{D} = \mathbf{Nom}$, see the details for Example 3.10.

$F_I$ preserves $\mathcal{M}$ and intersections of $\mathcal{M}$-morphisms. This is clear for $\mathcal{D} = \mathbf{Set}, \mathbf{Pos}$ since in these categories coproducts commute with intersections, i.e. one has
$$(A + B) \cap (C + D) \cong (A \cap C) + (B \cap D).$$
For $\mathcal{D} = \mathbf{JSL}$ recall that we have chosen $\Sigma$ to be the free semilattice $\mathcal{P}_f \Sigma_0$ over a finite set $\Sigma_0$ of generators, i.e. the $\cup$-semilattice of finite subsets of $\Sigma_0$. It follows that
$$F_I X = I + \Sigma \otimes X = I + \Big(\coprod_{a \in \Sigma_0} I\Big) \otimes X \cong I + \coprod_{a \in \Sigma_0} I \otimes X \cong I + \coprod_{a \in \Sigma_0} X,$$
using that $I = \mathcal{P}_f 1$, $I \otimes X \cong X$, and the left adjoint $(-) \otimes X$ preserves coproducts. Now note that the coproduct $X + Y$ of two semilattices coincides with the product $X \times Y$, with injections given by
$$\mathrm{inl} : X \to X \times Y,\ x \mapsto (x, \bot), \qquad \mathrm{inr} : Y \to X \times Y,\ y \mapsto (\bot, y).$$
This implies that monomorphisms in $\mathbf{JSL}$ are stable under coproducts, and that intersections commute with coproducts. It thus follows from the above formula for $F_I X$ that $F_I$ preserves monomorphisms and intersections.

For $\mathcal{D} = K\text{-}\mathbf{Vec}$, the proof is analogous, using again the product/coproduct coincidence.

$G_O$ preserves epimorphisms. We first show that the functor $[\Sigma, -]$ preserves epimorphisms (i.e. surjections). Note first that in $\mathcal{D} = \mathbf{Set}, \mathbf{Pos}, \mathbf{JSL}, K\text{-}\mathbf{Vec}$, the object $[\Sigma, X]$ is carried by the set $\mathcal{D}(\Sigma, X)$ with the $\mathcal{D}$-structure inherited from $X$ (i.e. defined pointwise), and that for any morphism $e : X \to Y$ the morphism $[\Sigma, e] : [\Sigma, X] \to [\Sigma, Y]$ is given by $f \mapsto e \cdot f$. We need to prove that $[\Sigma, e]$ is surjective provided that $e$ is surjective; that is, for every morphism $g : \Sigma \to Y$ there exists a morphism $f : \Sigma \to X$ with $e \cdot f = g$.

This follows from the fact that in each case, $\Sigma$ has been chosen as a projective object of $\mathcal{D}$. For instance, for $\mathcal{D} = \mathbf{JSL}$ we construct $f$ as follows. Recall that $\Sigma$ is the free semilattice on a finite set $\Sigma_0$, and denote by $\eta : \Sigma_0 \to \Sigma$ the universal map. For each $a \in \Sigma_0$, choose $x_a \in X$ with $e(x_a) = g(\eta(a))$, using that $e$ is surjective. This gives a map $f_0 : \Sigma_0 \to X$, $a \mapsto x_a$. Let $f : \Sigma \to X$ be the unique semilattice homomorphism extending $f_0$, i.e. with $f \cdot \eta = f_0$. Then $e \cdot f = g$ since this equation holds when precomposed with the universal map $\eta$:
$$e \cdot f \cdot \eta = e \cdot f_0 = g \cdot \eta.$$
This shows that the functor $[\Sigma, -]$ preserves epimorphisms. Since epimorphisms in our categories $\mathcal{D}$ are stable under products, it follows that also the functor $G_O = O \times [\Sigma, -]$ preserves epimorphisms.
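The lifting construction for free semilattices can be made concrete for finite examples. The sketch below is an illustration only: semilattices are modelled as finite join-semilattices with bottom, the free semilattice on `gens` is represented by finite subsets, and the helper names `powerset` and `lift` as well as the sample data are hypothetical.

```python
from itertools import combinations

def powerset(xs):
    """All finite subsets of xs, i.e. the free semilattice on xs."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

def lift(gens, g, e, X, join, bottom):
    """Given a surjective homomorphism e : X -> Y and a homomorphism
    g : P_f(gens) -> Y, build a homomorphism f : P_f(gens) -> X with e . f = g."""
    # f_0: pick some preimage x_a of g(eta(a)) for every generator a
    chosen = {a: next(x for x in X if e(x) == g(frozenset([a]))) for a in gens}

    def f(S):
        # unique homomorphic extension of f_0: join the chosen preimages
        x = bottom
        for a in S:
            x = join(x, chosen[a])
        return x

    return f

# Hypothetical data: X = subsets of {1, 2} under union, Y = {0, 1} under max.
gens = ["a", "b"]
X = powerset([1, 2])
e = lambda x: 1 if x else 0          # surjective homomorphism X -> Y
g = lambda S: 1 if "a" in S else 0   # homomorphism P_f(gens) -> Y
f = lift(gens, g, e, X, lambda x, y: x | y, frozenset())
```

The choice of preimages is arbitrary (here: the first one found), which reflects that the lift $f$ is in general not unique; only its existence is needed for the argument.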
Details for Example 3.9

(1) The functor $P = [-, O] : \mathcal{D} \to \mathcal{D}^{op}$ is a left adjoint (with right adjoint $P^{op} : \mathcal{D}^{op} \to \mathcal{D}$) because, for each $X, Y \in \mathcal{D}$,
$$\mathcal{D}(X, PY) = \mathcal{D}(X, [Y, O]) \cong \mathcal{D}(X \otimes Y, O) \cong \mathcal{D}(Y \otimes X, O) \cong \mathcal{D}(Y, [X, O]) = \mathcal{D}(Y, PX).$$

(2) We have a natural isomorphism $P F_I \cong G_O^{op} P$. To see this, observe that all parts of the following diagram commute up to isomorphism, where $F_I$ is the composite of the left column and $G_O^{op}$ is the composite of the right column.
$$\begin{array}{ccc}
\mathcal{D} & \xrightarrow{\;P\;} & \mathcal{D}^{op} \\
{\scriptstyle F}\downarrow & & \downarrow{\scriptstyle G^{op}} \\
\mathcal{D} & \xrightarrow{\;P\;} & \mathcal{D}^{op} \\
{\scriptstyle I + -}\downarrow & & \downarrow{\scriptstyle (O \times -)^{op}} \\
\mathcal{D} & \xrightarrow{\;P\;} & \mathcal{D}^{op}
\end{array}$$
The left and right parts commute by definition. The two squares commute because for each $X \in \mathcal{D}$,
$$PFX = [\Sigma \otimes X, O] \cong [\Sigma, [X, O]] = GPX$$
and
$$P(I + X) \cong PI \times PX \cong [I_{\mathcal{D}}, O] \times PX \cong O \times PX.$$
The isomorphism $P(I + X) \cong PI \times PX$ uses that $P$ is a left adjoint, i.e. preserves coproducts.

Details for Example 3.10
We verify that the functors of Example 3.10(1)–(4), see the table below, satisfy our Assumptions 3.5(4) and 4.1. Recall that we have chosen $I = O = 2$, and that the factorization system of $\mathbf{Nom}$ is the one given by epimorphisms (= surjective equivariant maps) and monomorphisms (= injective equivariant maps).
$$\begin{array}{lll}
 & F & G \\
(1) & \mathbb{A} \times (-) & [\mathbb{A}, -] \\
(2) & \mathbb{A} * (-) & [\mathbb{A}](-) \\
(3) & \mathbb{A} \times (-) + \mathbb{A} * (-) & [\mathbb{A}, -] \times [\mathbb{A}](-) \\
(4) & \mathbb{A} \times (-) + \mathbb{A} * (-) + [\mathbb{A}](-) & [\mathbb{A}, -] \times [\mathbb{A}](-) \times R
\end{array}$$

$F$ preserves epimorphisms. This follows from the fact that $F$ is a left adjoint.

$F_I$ preserves monomorphisms. The functors $\mathbb{A} \times (-)$ and $\mathbb{A} * (-)$ preserve monomorphisms by definition, recalling that for an equivariant map $e : X \to Y$ the map $\mathbb{A} * e$ is given by
$$\mathbb{A} * e : \mathbb{A} * X \to \mathbb{A} * Y, \quad (a, x) \mapsto (a, e(x)).$$
The functor $[\mathbb{A}](-)$ preserves monomorphisms because it is a right adjoint. Since coproducts in $\mathbf{Nom}$ are formed at the level of $\mathbf{Set}$, it follows that monomorphisms in $\mathbf{Nom}$ are stable under coproducts. This implies that for all the functors $F$ in (1)–(4), the functor $F_I = I + F$ preserves monomorphisms.

$F_I$ preserves intersections. Note that intersections of subobjects (i.e. equivariant subsets) in $\mathbf{Nom}$ are just set-theoretic intersections. Thus, the functors $\mathbb{A} \times (-)$ and $\mathbb{A} * (-)$ clearly preserve intersections by definition. The functor $[\mathbb{A}](-)$ preserves them because it is right adjoint and thus preserves all limits. Since intersections commute with coproducts in $\mathbf{Set}$ and thus also in $\mathbf{Nom}$, it follows that for all the functors $F$ in (1)–(4), the functor $F_I = I + F$ preserves intersections.

$G_O$ preserves epimorphisms. The functor $[\mathbb{A}](-)$ preserves epimorphisms because it is a left adjoint. Moreover, we have

Lemma A.3.
The functors $[\mathbb{A}, -] : \mathbf{Nom} \to \mathbf{Nom}$ and $R : \mathbf{Nom} \to \mathbf{Nom}$ preserve epimorphisms.

Proof. (1) We first show that $[\mathbb{A}, -]$ preserves epimorphisms (i.e. surjections). This can be deduced from the fact that every polynomial functor on $\mathbf{Nom}$ preserves epimorphisms (like in $\mathbf{Set}$) and that $[\mathbb{A}, -]$ can be expressed as a quotient functor of a polynomial functor [44, Lemma 6.9]. In the following, we give a direct proof for the convenience of the reader.

Recall from [51, Theorem 2.19] that $[\mathbb{A}, X]$ is the nominal set of finitely supported maps $f : \mathbb{A} \to X$; here $f$ is finitely supported if there exists a finite subset $S \subseteq \mathbb{A}$ such that for all permutations $\pi \in \mathrm{Perm}(\mathbb{A})$ that fix $S$ and all $a \in \mathbb{A}$ one has $f(\pi \cdot a) = \pi \cdot f(a)$. In particular, equivariant maps are finitely supported maps with support $S = \emptyset$. For any equivariant map $e : X \to Y$, the map $[\mathbb{A}, e]$ is given by
$$[\mathbb{A}, e] : [\mathbb{A}, X] \to [\mathbb{A}, Y], \quad f \mapsto e \cdot f.$$
We need to show that $[\mathbb{A}, e]$ is surjective provided that $e$ is surjective; in other words, for every finitely supported map $g : \mathbb{A} \to Y$, there exists a finitely supported map $f : \mathbb{A} \to X$ with $e \cdot f = g$.

Fix an arbitrary atom $a \in \mathbb{A} \setminus \mathrm{supp}\, g$. Moreover, choose $x \in X$ with $e(x) = g(a)$, and choose $x_b \in X$ with $e(x_b) = g(b)$ for every $b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x$, using that $e$ is surjective. Define the map $f : \mathbb{A} \to X$ as follows:
$$f(b) = \begin{cases} (b\ a) \cdot x & \text{for } b \in \mathbb{A} \setminus (\mathrm{supp}\, g \cup \mathrm{supp}\, x); \\ x_b & \text{for } b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x. \end{cases}$$
We claim that (i) the map $f$ is finitely supported and (ii) it satisfies $e \cdot f = g$.

Ad (i). We show that the finite set of atoms
$$S = \mathrm{supp}\, g \cup \mathrm{supp}\, x \cup \bigcup_{b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x} \mathrm{supp}\, x_b$$
supports the map $f$. Thus, let $\pi \in \mathrm{Perm}(\mathbb{A})$ be a permutation fixing $S$; we need to prove that $f(\pi \cdot b) = \pi \cdot f(b)$ for all $b \in \mathbb{A}$. For $b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x$, we have
$$f(\pi \cdot b) = f(b) = x_b = \pi \cdot x_b = \pi \cdot f(b).$$
For $b \in \mathbb{A} \setminus (\mathrm{supp}\, g \cup \mathrm{supp}\, x)$, we get
$$f(\pi \cdot b) = (\pi(b)\ a) \cdot x = \pi \cdot (b\ a) \cdot x = \pi \cdot f(b).$$
Here the first and last equation use the definition of $f$. The middle equation holds because the two permutations $(\pi(b)\ a)$ and $\pi \cdot (b\ a)$ are equal on $\mathrm{supp}\, x$. Indeed, both permutations send $a$ to $\pi(b)$, and all elements of $\mathrm{supp}\, x \setminus \{a\}$ are fixed by both permutations because $b, \pi(b) \notin \mathrm{supp}\, x$ and $\pi$ fixes $\mathrm{supp}\, x$.

Ad (ii). We show that $e(f(b)) = g(b)$ for all $b \in \mathbb{A}$. For $b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x$ we have $e(f(b)) = e(x_b) = g(b)$ by definition of $f$ and $x_b$. For $b \in \mathbb{A} \setminus (\mathrm{supp}\, g \cup \mathrm{supp}\, x)$,
$$\begin{aligned}
e(f(b)) &= e((b\ a) \cdot x) && \text{def. } f \\
&= (b\ a) \cdot e(x) && e \text{ equivariant} \\
&= (b\ a) \cdot g(a) && \text{def. } x \\
&= g((b\ a) \cdot a) && a, b \notin \mathrm{supp}\, g \\
&= g(b).
\end{aligned}$$

(2) We show that $R$ preserves surjections. Recall that $R$ is the subfunctor of $[\mathbb{A}, -]$ given by
$$RX = \{\, f \in [\mathbb{A}, X] : a \mathrel{\#} f(a) \text{ for every } a \in \mathbb{A} \,\}.$$
We need to show that $Re : RX \to RY$ is surjective for every surjective equivariant map $e : X \twoheadrightarrow Y$; that is, for every $g \in RY$, there exists $f \in RX$ with $e \cdot f = g$.

The definition of $f$ is the same as in part (1) of the proof, except that the elements $x$ and $x_b$ ($b \in \mathrm{supp}\, g \cup \mathrm{supp}\, x$) are now additionally required to satisfy $a \mathrel{\#} x$ and $b \mathrel{\#} x_b$. Such a choice of $x$ and $x_b$ is always possible: if $x$ is any element of $X$ with $e(x) = g(a)$, choose $a'$ with $a' \mathrel{\#} g(a), x$ and put $x' = (a'\ a) \cdot x$. Then $a \mathrel{\#} x'$ and
$$e(x') = e((a'\ a) \cdot x) = (a'\ a) \cdot e(x) = (a'\ a) \cdot g(a) = g(a),$$
where the last equation uses that $a, a' \mathrel{\#} g(a)$. Thus, we can replace $x$ by $x'$. Analogously for $x_b$.

Part (1) now shows that $f$ is finitely supported and satisfies $e \cdot f = g$. Moreover, we clearly have $b \mathrel{\#} f(b)$ for every $b \in \mathbb{A}$ by definition of $f$ and the above choices of $x$ and $x_b$, i.e. $f \in RX$. □

Since epimorphisms in $\mathbf{Nom}$ are stable under products (which follows from the corresponding property in $\mathbf{Set}$), we conclude that for all the functors $G$ in (1)–(4), the functor $G_O = 2 \times G$ preserves epimorphisms.

Details for Example 3.11
We describe sorted $\Sigma$-automata for the case of general base categories $\mathcal{D}$. Suppose that $(\mathcal{D}, \otimes, I_{\mathcal{D}})$ is a symmetric monoidal closed category satisfying our Assumptions 3.5(1)–(2), and let $S$ be a set of sorts. Then the category $\mathcal{D}^S$ (equipped with the monoidal structure and the factorization system inherited sortwise from $\mathcal{D}$) is also symmetric monoidal closed and satisfies the Assumptions 3.5(1)–(2). Fix an arbitrary object $I \in \mathcal{D}^S$ of inputs (not necessarily the tensor unit), an arbitrary object $O \in \mathcal{D}^S$ of outputs, and a family of objects $\Sigma = (\Sigma_{s,t})_{s,t \in S}$ in $\mathcal{D}$; we think of $\Sigma_{s,t}$ as a set of letters with input sort $s$ and output sort $t$. Take the functors
$$F : \mathcal{D}^S \to \mathcal{D}^S, \quad (FQ)_t = \coprod_{s \in S} \Sigma_{s,t} \otimes Q_s \quad (t \in S),$$
$$G : \mathcal{D}^S \to \mathcal{D}^S, \quad (GQ)_s = \prod_{t \in S} [\Sigma_{s,t}, Q_t] \quad (s \in S).$$
The functor $F$ is a left adjoint of $G$: we have the isomorphisms (natural in $P, Q \in \mathcal{D}^S$)
$$\begin{aligned}
\mathcal{D}^S(FQ, P) &= \prod_{t \in S} \mathcal{D}((FQ)_t, P_t) \\
&= \prod_{t \in S} \mathcal{D}\Big(\coprod_{s \in S} \Sigma_{s,t} \otimes Q_s,\ P_t\Big) \\
&\cong \prod_{t \in S} \prod_{s \in S} \mathcal{D}(\Sigma_{s,t} \otimes Q_s, P_t) \\
&\cong \prod_{s \in S} \prod_{t \in S} \mathcal{D}(\Sigma_{s,t} \otimes Q_s, P_t) \\
&\cong \prod_{s \in S} \prod_{t \in S} \mathcal{D}(Q_s, [\Sigma_{s,t}, P_t]) \\
&\cong \prod_{s \in S} \mathcal{D}\Big(Q_s,\ \prod_{t \in S} [\Sigma_{s,t}, P_t]\Big) \\
&= \prod_{s \in S} \mathcal{D}(Q_s, (GP)_s) \\
&= \mathcal{D}^S(Q, GP).
\end{aligned}$$
Instantiating Definition 3.2 to the above data, we obtain the concept of a sorted $\Sigma$-automaton. It is given by an $S$-sorted object of states $Q \in \mathcal{D}^S$ together with morphisms
$$\delta_{Q,s,t} : \Sigma_{s,t} \otimes Q_s \to Q_t, \qquad i_{Q,t} : I_t \to Q_t, \qquad f_{Q,t} : Q_t \to O_t$$
for $s, t \in S$.

In generalization of the single-sorted case (see Example 3.9), the initial algebra for $F_I$ can be described as follows. For $n \in \mathbb{N}$ and $s, t \in S$ define the object $\Sigma^n_{s,t} \in \mathcal{D}$ inductively by
$$\Sigma^0_{s,t} = I_{\mathcal{D}}, \qquad \Sigma^{n+1}_{s,t} = \coprod_{r \in S} \Sigma_{s,r} \otimes \Sigma^n_{r,t},$$
and put
$$\Sigma^*_{s,t} = \coprod_{n \in \mathbb{N}} \Sigma^n_{s,t}.$$
The initial algebra for the functor $F_I$ is given by
$$(\mu F_I)_t = \coprod_{s \in S} I_s \otimes \Sigma^*_{s,t} \quad (t \in S).$$
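For the base category of sets, a sorted automaton of this kind can be sketched as follows. This is a minimal illustration, not part of the source: the dictionary layout, the function `run_sorted`, and the file-handle example with sorts `closed` and `open` are all hypothetical. A word consists of an initial element of some sort followed by letters whose input sort matches the current sort.

```python
def run_sorted(automaton, start_sort, i, letters):
    """Run a sorted word: an element i of I[start_sort] followed by letters
    (s, t, a) with a in Sigma[(s, t)]. Returns (final sort, output)."""
    sort = start_sort
    q = automaton["init"][sort][i]
    for (s, t, a) in letters:
        if s != sort or a not in automaton["sigma"].get((s, t), ()):
            raise ValueError(f"letter {a!r} is not applicable at sort {sort!r}")
        # delta_{Q,s,t} : Sigma_{s,t} x Q_s -> Q_t
        q = automaton["delta"][(s, t)][(a, q)]
        sort = t
    return sort, automaton["out"][sort][q]

# Hypothetical two-sorted example: the output records whether a write occurred.
A = {
    "sigma": {("closed", "open"): {"open"},
              ("open", "open"): {"write"},
              ("open", "closed"): {"close"}},
    "init": {"closed": {"*": "empty"}, "open": {}},
    "delta": {
        ("closed", "open"): {("open", "empty"): "clean",
                             ("open", "saved"): "dirty"},
        ("open", "open"): {("write", "clean"): "dirty",
                           ("write", "dirty"): "dirty"},
        ("open", "closed"): {("close", "clean"): "empty",
                             ("close", "dirty"): "saved"},
    },
    "out": {"closed": {"empty": 0, "saved": 1},
            "open": {"clean": 0, "dirty": 1}},
}

word = [("closed", "open", "open"),
        ("open", "open", "write"),
        ("open", "closed", "close")]
result = run_sorted(A, "closed", "*", word)
```

Ill-sorted words, such as a `write` letter applied at sort `closed`, are rejected with an error rather than evaluated; this mirrors the fact that only composable sequences of letters occur in the sorted word objects.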
Proof of Theorem 3.13
We first establish some basic observations about automata homomorphisms and languages:

Proposition A.4. For each automata homomorphism $h : Q \to Q'$ one has $L_Q = L_{Q'}$.

Proof. We have $h \cdot e_Q = e_{Q'}$ by initiality of $\mu F_I$, and $f_{Q'} \cdot h = f_Q$ because $h$ preserves final states. Therefore
$$L_Q = f_Q \cdot e_Q = f_{Q'} \cdot h \cdot e_Q = f_{Q'} \cdot e_{Q'} = L_{Q'}. \qquad \Box$$

Remark A.5. Every $F$-algebra homomorphism $h : (Q, \delta) \to (Q', \delta')$ is also a $G$-coalgebra homomorphism $h : (Q, \delta^@) \to (Q', (\delta')^@)$, and vice versa. Indeed, the corresponding commutative squares are just adjoint transposes of each other:
$$h \cdot \delta = \delta' \cdot Fh \quad \Longleftrightarrow \quad Gh \cdot \delta^@ = (\delta')^@ \cdot h.$$

Proposition A.6. For all automata $Q$ and $Q'$, we have
$$L_Q = L_{Q'} \quad \text{iff} \quad m_Q \cdot e_Q = m_{Q'} \cdot e_{Q'}.$$

Proof. (1) For the "if" direction, suppose that $m_Q \cdot e_Q = m_{Q'} \cdot e_{Q'}$. Then the following diagram (where $\mathrm{outl} : G_O = O \times G \to O$ denotes the left product projection) commutes by the definition of $\gamma_Q$ in Remark 3.7 and because $m_Q$ is a $G_O$-coalgebra homomorphism; its top composite $\mathrm{outl} \cdot \gamma_Q$ equals $f_Q$.
$$\begin{array}{ccccc}
Q & \xrightarrow{\;\gamma_Q\;} & G_O Q & \xrightarrow{\;\mathrm{outl}\;} & O \\
{\scriptstyle m_Q}\downarrow & & \downarrow{\scriptstyle G_O m_Q} & \nearrow{\scriptstyle \mathrm{outl}} & \\
\nu G_O & \xrightarrow{\;\gamma\;} & G_O(\nu G_O) & &
\end{array} \qquad (4)$$
Thus $f_Q = \mathrm{outl} \cdot \gamma \cdot m_Q$ and analogously $f_{Q'} = \mathrm{outl} \cdot \gamma \cdot m_{Q'}$. This implies
$$L_Q = f_Q \cdot e_Q = \mathrm{outl} \cdot \gamma \cdot m_Q \cdot e_Q = \mathrm{outl} \cdot \gamma \cdot m_{Q'} \cdot e_{Q'} = \cdots = L_{Q'}.$$

(2) For the "only if" direction, suppose that $L := L_Q = L_{Q'}$. By equipping $\mu F_I$ with final states $L : \mu F_I \to O$, we can view $\mu F_I$ as a $G_O$-coalgebra, and thus $e_Q : \mu F_I \to Q$ as a $G_O$-coalgebra homomorphism (see Remark A.5). It follows that $m_Q \cdot e_Q : \mu F_I \to \nu G_O$ is a $G_O$-coalgebra homomorphism. Analogously, $m_{Q'} \cdot e_{Q'}$ is a coalgebra homomorphism. Thus, $m_Q \cdot e_Q = m_{Q'} \cdot e_{Q'}$ by finality of $\nu G_O$. □

Remark A.7. For every language $L : \mu F_I \to O$ there exists an automaton $Q$ accepting $L$. Indeed, one can choose $Q = \mu F_I$ with output morphism $L : \mu F_I \to O$.

We are prepared to prove the minimization theorem:

Proof of Theorem 3.13.
Fix an arbitrary automaton $Q$ with $L_Q = L$ (see Remark A.7). Viewing $\mu F_I$ as an automaton with output morphism $L_Q = f_Q \cdot e_Q : \mu F_I \to O$, the unique $F_I$-algebra homomorphism $e_Q$ is an automata homomorphism. Analogously, equipping $\nu G_O$ with the initial states $m_Q \cdot i_Q : I \to \nu G_O$ makes the unique $G_O$-coalgebra homomorphism $m_Q : Q \to \nu G_O$ an automata homomorphism. Thus $m_Q \cdot e_Q$ is an automata homomorphism. Form its $(\mathcal{E}, \mathcal{M})$-factorization, see Remark A.1:
$$m_Q \cdot e_Q = \big(\mu F_I \overset{e_{\mathrm{Min}(L)}}{\twoheadrightarrow} \mathrm{Min}(L) \overset{m_{\mathrm{Min}(L)}}{\rightarrowtail} \nu G_O\big).$$
We claim that $\mathrm{Min}(L)$ is the minimal automaton for $L$. To this end, note first that $L_{\mathrm{Min}(L)} = L_Q = L$ by the "if" direction of Proposition A.6. Thus, $\mathrm{Min}(L)$ accepts the language $L$. Moreover, $\mathrm{Min}(L)$ is reachable because $e_{\mathrm{Min}(L)} \in \mathcal{E}$.

To establish the universal property of $\mathrm{Min}(L)$, suppose that $R$ is a reachable automaton accepting $L$; we need to show that there is a unique homomorphism from $R$ into $\mathrm{Min}(L)$. From $L_{\mathrm{Min}(L)} = L_R = L$ it follows that $m_R \cdot e_R = m_{\mathrm{Min}(L)} \cdot e_{\mathrm{Min}(L)}$ by the "only if" direction of Proposition A.6. Thus, diagonal fill-in yields a unique automata homomorphism $h : R \to \mathrm{Min}(L)$ with
$$h \cdot e_R = e_{\mathrm{Min}(L)} \quad \text{and} \quad m_{\mathrm{Min}(L)} \cdot h = m_R.$$
Given another automata homomorphism $h' : R \twoheadrightarrow \mathrm{Min}(L)$, we have $h' \cdot e_R = e_{\mathrm{Min}(L)}$ by initiality of $\mu F_I$. Thus $h' \cdot e_R = h \cdot e_R$, which implies $h' = h$ because $e_R$ is epic. This proves the desired universal property of $\mathrm{Min}(L)$.

The uniqueness of $\mathrm{Min}(L)$ up to isomorphism follows immediately from its universal property. □

The construction of $\mathrm{Min}(L)$ in the above proof also shows:

Corollary A.8. An automaton $Q$ is minimal if and only if it is both reachable ($e_Q \in \mathcal{E}$) and simple ($m_Q \in \mathcal{M}$).

Details for Remark 4.4
That $L_Q = L_{Q'}$ implies $h^Q_{s,t} = h^{Q'}_{s,t}$ follows immediately from the "only if" direction of Proposition A.6 and the definition of $h^{(-)}_{s,t}$.

Details for Definition 4.8
For the diagonal fill-in $\delta_{s,t}$ to exist, we need to verify that for each pair $(s, t)$ as in (3), the following square is commutative: its top edge is $Fe_{s,t} \colon FS \twoheadrightarrow FH_{s,t}$, its left edge is $l_{s,t} \colon FS \to H_{s,t}$, its right edge is $r_{s,t}$, and its bottom edge is $m_{s,t} \colon H_{s,t} \rightarrowtail T$, where

$$l_{s,t} = \bigl( FS \xrightarrow{\mathrm{inr}} I + FS = F_I S \xrightarrow{e_{F_I s,t}} H_{F_I s,t} \xrightarrow{cl_{s,t}^{-1}} H_{s,t} \bigr)$$

and

$$r_{s,t} = \bigl( H_{s,t} \xrightarrow{cs_{s,t}^{-1}} H_{s,G_O t} \xrightarrow{m_{s,G_O t}} G_O T = O \times GT \xrightarrow{\mathrm{outr}} GT \bigr).$$

Proof. By definition of $cl_{s,t}$ and $cs_{s,t}$, the lower path of the square is equal to

$$FS \xrightarrow{\mathrm{inr}} F_I S \xrightarrow{h_{F_I s,t}} T$$

and the upper path is equal to

$$FS \xrightarrow{F h_{s,G_O t}} F G_O T \xrightarrow{\mathrm{outr}} T.$$

We therefore need to verify that the outside of the following diagram commutes:

[Commutative diagram: the outside consists of the two paths above; the interior is subdivided using $Fs$, $Fj_N$, $Fe_Q$, $Fm_Q$, $Fj'_{K+1}$, $F\gamma$, $FG_O t$, the structures $\alpha$ and $\alpha_Q$, and the morphisms $F_I s$, $F_I j_N$, $j_{N+1}$, $e_Q$, $m_Q$, $j'_K$, $G_O^K t$ and $h_{F_I s,t}$; one inner region is labelled (∗).]

All parts except (∗) clearly commute either by definition or by naturality of $\mathrm{inr} \colon F \to F_I$ and $\mathrm{outr} \colon G_O \to G$. For (∗), note that the lower path is the adjoint transpose of

$$\nu G_O \xrightarrow{\gamma} G_O(\nu G_O) \xrightarrow{\mathrm{outr}} G(\nu G_O) \xrightarrow{G j'_K} G G_O^K 1 \xrightarrow{Gt} GT,$$

the upper path is the adjoint transpose of

$$\nu G_O \xrightarrow{j'_{K+1}} G_O^{K+1} 1 \xrightarrow{G_O t} G_O T \xrightarrow{\mathrm{outr}} GT,$$

and these two morphisms are equal: using $j'_{K+1} = G_O j'_K \cdot \gamma$ and the naturality of $\mathrm{outr}$, we get

$$\mathrm{outr} \cdot G_O t \cdot j'_{K+1} = \mathrm{outr} \cdot G_O t \cdot G_O j'_K \cdot \gamma = Gt \cdot \mathrm{outr} \cdot G_O j'_K \cdot \gamma = Gt \cdot G j'_K \cdot \mathrm{outr} \cdot \gamma.$$

This concludes the proof. □
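To make the role of the diagonal fill-in more tangible, the following sketch treats only the classical instance (assumed here: $\mathcal{D} = \mathbf{Set}$, $FQ = \Sigma \times Q$, words as Python strings). In that instance the fill-in should amount to the familiar rule that the successor of a row is the row of the extended word, which is well defined exactly when the table is closed and consistent. The function names `build_hypothesis` and `accepts` and the example language are illustrative assumptions, not notation from the paper.

```python
from itertools import product

def build_hypothesis(S, E, alphabet, member):
    """Build a hypothesis DFA from a closed and consistent observation table.

    S: prefix-closed list of access words (must contain "").
    E: suffix-closed list of test words (must contain "").
    member: membership oracle, word -> bool.
    """
    def row(w):
        return tuple(member(w + e) for e in E)

    states = {row(s) for s in S}
    delta = {}
    for s, a in product(S, alphabet):
        target = row(s + a)
        # Closedness: the row of s+a must already occur as the row of some word in S.
        if target not in states:
            raise ValueError(f"table not closed at {s + a!r}")
        # Consistency: words with equal rows must have equal successor rows.
        if delta.setdefault((row(s), a), target) != target:
            raise ValueError(f"table not consistent at {s!r}, {a!r}")
    initial = row("")
    accepting = {q for q in states if q[E.index("")]}
    return states, delta, initial, accepting

def accepts(dfa, word):
    """Run the hypothesis DFA on a word."""
    _, delta, q, accepting = dfa
    for a in word:
        q = delta[(q, a)]
    return q in accepting

# Hypothetical target language: words over {a} of even length.
even = lambda w: len(w) % 2 == 0
dfa = build_hypothesis(["", "a"], [""], "a", even)
```

The two `ValueError` branches correspond to the two ways in which the transition map could fail to be well defined; for a closed and consistent table neither is triggered.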
Proof of Theorem 4.12
The proof of the correctness and termination of the generalized $L^*$ algorithm requires some preparation. First, recall that for any endofunctor $H$, an $H$-coalgebra $C \xrightarrow{\gamma} HC$ is recursive [57] if for each $H$-algebra $HA \xrightarrow{\alpha} A$ there exists a unique coalgebra-to-algebra homomorphism $h$ from $(C, \gamma)$ into $(A, \alpha)$; that is, a unique morphism $h \colon C \to A$ satisfying

$$h = \alpha \cdot Hh \cdot \gamma.$$

Dually, an $H$-algebra $HA \xrightarrow{\alpha} A$ is corecursive if for each $H$-coalgebra $C \xrightarrow{\gamma} HC$ there exists a unique coalgebra-to-algebra homomorphism $h$ from $(C, \gamma)$ into $(A, \alpha)$.

Lemma A.9 (see [22], Prop. 6). For each recursive coalgebra $C \xrightarrow{\gamma} HC$, the coalgebra $HC \xrightarrow{H\gamma} HHC$ is also recursive.
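As an informal illustration of recursivity, consider the classical case (assumed here: $\mathbf{Set}$, with $F_I X = 1 + X \times \Sigma$; the order of the product components is immaterial). A prefix-closed set of words carries a coalgebra structure that splits off the last letter, and the unique coalgebra-to-algebra homomorphism into an automaton, viewed as an algebra $[\mathit{init}, \mathit{step}]$, computes the state reached by each word. The names `sigma`, `coalgebra_to_algebra` and the parity automaton below are hypothetical.

```python
def sigma(w):
    """Coalgebra S -> 1 + S x Sigma on a prefix-closed set of words.

    The empty word goes to the left summand (None); a nonempty word is
    split into its longest proper prefix and its last letter.
    """
    return None if w == "" else (w[:-1], w[-1])

def coalgebra_to_algebra(init, step):
    """Return the map h determined by h = alpha . F_I h . sigma,
    where alpha = [init, step] is an algebra on some carrier A."""
    def h(w):
        parts = sigma(w)
        if parts is None:
            return init
        prefix, letter = parts
        # The recursion terminates because the prefix is strictly shorter.
        return step(h(prefix), letter)
    return h

# Hypothetical algebra: a two-state parity automaton on A = {0, 1}.
h = coalgebra_to_algebra(0, lambda q, a: 1 - q)
```

The defining equation leaves no freedom in the choice of `h`: its value on a word is forced by its value on the shorter prefix, which is the uniqueness half of recursivity in this instance.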
Barlocco et al. [14] model prefix-closed sets as recursive subcoalgebras of an initial algebra $\mu H$. In our present setting, recursivity comes for free:

Proposition A.10. Every subcoalgebra of $(F_I^N 0, F_I^N ¡)$, $N \geq 0$, is recursive.

In particular, this result applies to the subcoalgebras $(S, \sigma)$ in the generalized $L^*$ algorithm.

Proof. Suppose that $s \colon (S, \sigma) \rightarrowtail (F_I^N 0, F_I^N ¡)$ is a subcoalgebra for some $N \geq 0$. We prove that $(S, \sigma)$ is recursive by induction on $N$.

For $N = 0$, note first that in any category $\mathcal{D}$ the initial object $0$ has no proper subobjects. (Indeed, suppose that $m \colon S \rightarrowtail 0$ is a subobject. Then the unique morphism $¡_S \colon 0 \to S$ satisfies $m \cdot ¡_S = \mathrm{id}_0$ by initiality of $0$, so $m$ is both monic and split epic, i.e. an isomorphism.) Consequently, we have $(S, \sigma) = (0, ¡)$, and this coalgebra is trivially recursive by initiality of $0$.

For the induction step, let $N > 0$, and let $(A, \alpha)$ be an arbitrary $F_I$-algebra. We need to prove that there is a unique coalgebra-to-algebra homomorphism $h \colon (S, \sigma) \to (A, \alpha)$.

(1) Existence. Since $(F_I^N 0, F_I^N ¡)$ is a recursive coalgebra by Lemma A.9, we have a unique coalgebra-to-algebra homomorphism $h'$ from $(F_I^N 0, F_I^N ¡)$ to $(A, \alpha)$. Thus $h = h' \cdot s$ is a coalgebra-to-algebra homomorphism from $(S, \sigma)$ to $(A, \alpha)$.

(2) Uniqueness.
Suppose that $h \colon (S, \sigma) \to (A, \alpha)$ is a coalgebra-to-algebra homomorphism. Form the pullback of $s$ and $F_I^{N-1} ¡$, that is, morphisms $s' \colon S' \rightarrowtail F_I^{N-1} 0$ and $m \colon S' \rightarrowtail S$ forming a pullback square

$$s \cdot m = F_I^{N-1} ¡ \cdot s'.$$

Note that $F_I^{N-1} ¡ \in \mathcal{M}$ because $¡ \colon 0 \to F_I 0 = I$ lies in $\mathcal{M}$ by Assumption 3.5(2) and $F_I$ preserves $\mathcal{M}$ by Assumptions 4.1. Since in any factorization system $(\mathcal{E}, \mathcal{M})$ the class $\mathcal{M}$ is stable under pullbacks [2, Prop. 14.15], it follows that $m, s' \in \mathcal{M}$. Since $F_I$ preserves pullbacks of $\mathcal{M}$-morphisms by Assumptions 4.1, the square $F_I s \cdot F_I m = F_I^N ¡ \cdot F_I s'$ is a pullback, and we have $F_I s \cdot \sigma = F_I^N ¡ \cdot s$ because $s$ is a coalgebra homomorphism. Thus, there is a unique morphism $n \colon S \to F_I S'$ such that

$$F_I s' \cdot n = s \qquad\text{and}\qquad F_I m \cdot n = \sigma.$$

It follows that $m \colon (S', n \cdot m) \rightarrowtail (S, \sigma)$ and $s' \colon (S', n \cdot m) \rightarrowtail (F_I^{N-1} 0, F_I^{N-1} ¡)$ are coalgebra homomorphisms:

$$\sigma \cdot m = F_I m \cdot (n \cdot m) \qquad\text{and}\qquad F_I^{N-1} ¡ \cdot s' = s \cdot m = F_I s' \cdot (n \cdot m).$$

By induction we know that the coalgebra $(S', n \cdot m)$ is recursive, that is, we have a unique coalgebra-to-algebra homomorphism $g \colon (S', n \cdot m) \to (A, \alpha)$. Since $h \cdot m \colon (S', n \cdot m) \to (A, \alpha)$ is also a coalgebra-to-algebra homomorphism (being the composite of a coalgebra homomorphism with a coalgebra-to-algebra homomorphism), we get $h \cdot m = g$. Then

$$h = \alpha \cdot F_I h \cdot \sigma = \alpha \cdot F_I h \cdot F_I m \cdot n = \alpha \cdot F_I g \cdot n, \qquad (5)$$

i.e. $h$ is uniquely determined by $g$. □

Note that the proof of Proposition A.10 uses our assumption that $F_I$ preserves pullbacks of $\mathcal{M}$-morphisms. Since we do not require $G_O$ to preserve pushouts of $\mathcal{E}$-morphisms, the corresponding statement that every $G_O$-quotient algebra of $(G_O^K 1, G_O^K !)$ is corecursive does not hold. However, we have the following weaker result:

Proposition A.11.
At each stage of the generalized $L^*$ algorithm, the algebra $(T, \tau)$ is corecursive.

Proof. Recall that $(T, \tau)$ is a quotient algebra $t \colon (G_O^K 1, G_O^K !) \twoheadrightarrow (T, \tau)$ for some $K > 0$. We need to show that (1) $(T, \tau)$ is corecursive after its initialization in Step 0 of the algorithm, and that (2) every application of "Extend t" preserves corecursivity.

Proof of (1). Initially, we have $(T, \tau) = (G_O 1, G_O !)$. Since the algebra $(1, !)$ is trivially corecursive by terminality of $1$, the dual of Lemma A.9 shows that $(T, \tau)$ is corecursive.

Proof of (2). Suppose that $(T, \tau)$ is corecursive. Applying "Extend t" replaces $(T, \tau)$ by the algebra $(T', t_0 \cdot G_O t_1)$, where $\tau = t_1 \cdot t_0$ with $t_0 \colon G_O T \twoheadrightarrow T'$ and $t_1 \colon T' \twoheadrightarrow T$. Then $t_0 \colon (G_O T, G_O \tau) \to (T', t_0 \cdot G_O t_1)$ and $t_1 \colon (T', t_0 \cdot G_O t_1) \to (T, \tau)$ are $G_O$-algebra homomorphisms, since

$$t_0 \cdot G_O \tau = (t_0 \cdot G_O t_1) \cdot G_O t_0 \qquad\text{and}\qquad t_1 \cdot (t_0 \cdot G_O t_1) = \tau \cdot G_O t_1.$$

To show that $(T', t_0 \cdot G_O t_1)$ is corecursive, let $(C, \gamma)$ be a $G_O$-coalgebra. We need to prove that there is a unique coalgebra-to-algebra homomorphism $h$ from $(C, \gamma)$ into $(T', t_0 \cdot G_O t_1)$.

Existence. Since $(T, \tau)$ is corecursive, the algebra $(G_O T, G_O \tau)$ is also corecursive by the dual of Lemma A.9. Thus, there exists a unique coalgebra-to-algebra homomorphism $h'$ from $(C, \gamma)$ into $(G_O T, G_O \tau)$. It follows that $h = t_0 \cdot h'$ is a coalgebra-to-algebra homomorphism from $(C, \gamma)$ into $(T', t_0 \cdot G_O t_1)$, being the composite of the coalgebra-to-algebra homomorphism $h'$ with the algebra homomorphism $t_0$.

Uniqueness. Let $h$ be a coalgebra-to-algebra homomorphism from $(C, \gamma)$ into $(T', t_0 \cdot G_O t_1)$, and denote by $g$ the unique coalgebra-to-algebra homomorphism from $(C, \gamma)$ into the corecursive algebra $(T, \tau)$. Since $t_1 \cdot h$ is also such a homomorphism (being the composite of a coalgebra-to-algebra homomorphism with an algebra homomorphism), we have $t_1 \cdot h = g$. It then follows that

$$h = t_0 \cdot G_O t_1 \cdot G_O h \cdot \gamma = t_0 \cdot G_O g \cdot \gamma,$$

which shows that $h$ is uniquely determined by $g$. □

Lemma A.12.
Let $(s, t)$ be closed and consistent, and suppose that the algebra $(T, \tau)$ is corecursive. Then the associated hypothesis automaton $H_{s,t}$ (see Definition 4.8) is minimal. Moreover, the following two equations hold:

$$e_{s,t} = e_{H_{s,t}} \cdot j_N \cdot s \qquad\text{and}\qquad m_{s,t} = t \cdot j'_K \cdot m_{H_{s,t}}.$$

In particular, by Proposition A.11, this lemma applies to the pairs $(s, t)$ constructed in the generalized $L^*$ algorithm.

Proof. (1) We first prove the left-hand equation. Consider the $F_I$-algebra structure on $H_{s,t}$ given by $[i_{s,t}, \delta_{s,t}] \colon F_I H_{s,t} \to H_{s,t}$. Then $e_{s,t} \colon (S, \sigma) \to (H_{s,t}, [i_{s,t}, \delta_{s,t}])$ is a coalgebra-to-algebra homomorphism:

$$e_{s,t} = cl_{s,t}^{-1} \cdot e_{F_I s,t} \cdot \sigma = [i_{s,t}, \delta_{s,t}] \cdot F_I e_{s,t} \cdot \sigma.$$

Indeed, the first equality holds by the definition of $cl_{s,t}$, and the second one holds by definition of $i_{s,t}$ and $\delta_{s,t}$ (consider the two coproduct components of $F_I S = I + FS$ separately).

Since $e_{H_{s,t}} \cdot j_N \cdot s \colon (S, \sigma) \to (H_{s,t}, [i_{s,t}, \delta_{s,t}])$ is also a coalgebra-to-algebra homomorphism (being the composite of the $F_I$-coalgebra homomorphism $s$, the coalgebra-to-algebra homomorphism $j_N$ and the $F_I$-algebra homomorphism $e_{H_{s,t}}$) and the coalgebra $(S, \sigma)$ is recursive by Proposition A.10, we conclude that $e_{s,t} = e_{H_{s,t}} \cdot j_N \cdot s$.

(2) The proof of the right-hand equation is completely analogous: one views $H_{s,t}$ as a $G_O$-coalgebra $\langle f_{s,t}, \delta^{@}_{s,t} \rangle \colon H_{s,t} \to G_O H_{s,t}$, where $\delta^{@}_{s,t} \colon H_{s,t} \to G H_{s,t}$ denotes the adjoint transpose of $\delta_{s,t} \colon F H_{s,t} \to H_{s,t}$, and shows that both $m_{s,t}$ and $t \cdot j'_K \cdot m_{H_{s,t}}$ are coalgebra-to-algebra homomorphisms from $(H_{s,t}, \langle f_{s,t}, \delta^{@}_{s,t} \rangle)$ into the corecursive algebra $(T, \tau)$.

(3) Since $e_{s,t} \in \mathcal{E}$ and $m_{s,t} \in \mathcal{M}$, it follows from the two equations that $e_{H_{s,t}} \in \mathcal{E}$ and $m_{H_{s,t}} \in \mathcal{M}$ (see [2, Prop. 14.11]). Thus, the automaton $H_{s,t}$ is minimal by Corollary A.8. □

An important invariant of the generalized $L^*$ algorithm is that the subcoalgebra $s$ is pointed and that the quotient algebra $t$ is co-pointed:

Definition A.13. An $F_I$-coalgebra $(R, \varrho)$ is pointed if there is a morphism $i_R \colon I \to R$ such that $\varrho \cdot i_R = \mathrm{inl} \colon I \to F_I R$. A $G_O$-algebra $(B, \beta)$ is co-pointed if there is a morphism $f_B \colon B \to O$ such that $f_B \cdot \beta = \mathrm{outl} \colon G_O B \to O$.

Note that if $(R, \varrho)$ is a subcoalgebra of $(F_I^M 0, F_I^M ¡)$, then $i_R$ is necessarily unique because $F_I^M ¡$ is monic by Assumptions 3.5(2) and Assumptions 4.1. The dual statement holds for co-pointed quotient algebras of $(G_O^M 1, G_O^M !)$.

Lemma A.14.
At each stage of the generalized $L^*$ algorithm, the coalgebra $(S, \sigma)$ is pointed and the algebra $(T, \tau)$ is co-pointed.

Proof. We proceed by induction on the number of steps of the algorithm required to construct the pair $(s, t)$. Initially, after Step (0), $(S, \sigma)$ is equal to $(I, F_I ¡)$, and thus pointed via $i_S = \mathrm{id}_I$, because $F_I ¡ = \mathrm{inl} \colon I \to F_I I$. Dually, $(T, \tau)$ is co-pointed via $f_T = \mathrm{id}_O$.

Now suppose that at some stage of the algorithm, $(S, \sigma)$ is pointed and $(T, \tau)$ is co-pointed. We need to show that $(S, \sigma)$ remains pointed after executing "Extend s" or adding a counterexample to $s$, and that $(T, \tau)$ remains co-pointed after executing "Extend t".

(1) Extend s. When calling "Extend s", the coalgebra $(S, \sigma)$ is replaced by the coalgebra $(S', F_I s_0 \cdot s_1)$, where $\sigma = s_1 \cdot s_0$ with $s_0 \colon S \rightarrowtail S'$ and $s_1 \colon S' \rightarrowtail F_I S$. This coalgebra is pointed via $i_{S'} = s_0 \cdot i_S$:

$$(F_I s_0 \cdot s_1) \cdot s_0 \cdot i_S = F_I s_0 \cdot \sigma \cdot i_S = F_I s_0 \cdot \mathrm{inl} = \mathrm{inl}.$$

(2) Extend t. Symmetric to (1).

(3) Adding a counterexample. Let $(C, \gamma)$ be the counterexample added to $(S, \sigma)$, and denote by $i \colon (S, \sigma) \rightarrowtail (S \vee C, \sigma \vee \gamma)$ the embedding. Then the coalgebra $(S \vee C, \sigma \vee \gamma)$ is pointed via $i_{S \vee C} = i \cdot i_S$:

$$(\sigma \vee \gamma) \cdot i \cdot i_S = F_I i \cdot \sigma \cdot i_S = F_I i \cdot \mathrm{inl} = \mathrm{inl}.$$

□

Lemma A.15.
Let $A$ be an automaton. For any pointed subcoalgebra $r \colon (R, \varrho) \rightarrowtail (F_I^M 0, F_I^M ¡)$, we have

$$i_A = \bigl( I \xrightarrow{i_R} R \xrightarrow{r} F_I^M 0 \xrightarrow{j_M} \mu F_I \xrightarrow{e_A} A \bigr).$$

Dually, for any co-pointed quotient algebra $b \colon (G_O^M 1, G_O^M !) \twoheadrightarrow (B, \beta)$, we have

$$f_A = \bigl( A \xrightarrow{m_A} \nu G_O \xrightarrow{j'_M} G_O^M 1 \xrightarrow{b} B \xrightarrow{f_B} O \bigr).$$

Proof. The first statement follows from the computation below, all of whose steps hold either trivially or by definition, where $\alpha \colon F_I(\mu F_I) \to \mu F_I$ and $\alpha_A \colon F_I A \to A$ denote the respective $F_I$-algebra structures:

$$e_A \cdot j_M \cdot r \cdot i_R = e_A \cdot j_{M+1} \cdot F_I^M ¡ \cdot r \cdot i_R = e_A \cdot j_{M+1} \cdot F_I r \cdot \varrho \cdot i_R = e_A \cdot \alpha \cdot F_I j_M \cdot F_I r \cdot \mathrm{inl} = \alpha_A \cdot F_I e_A \cdot F_I j_M \cdot F_I r \cdot \mathrm{inl} = \alpha_A \cdot \mathrm{inl} = i_A.$$

The proof of the second statement is dual. □
Proposition A.16. Let $(s, t)$ be a closed and consistent pair as in (3), and suppose that $t$ is co-pointed. Then the hypothesis $H = H_{s,t}$ and the unknown automaton $Q$ have the same observation tables for $(s, t)$:

$$h^H_{s,t} = h^Q_{s,t}.$$

In particular, $H$ and $Q$ agree on inputs from $S$, that is,

$$L_H \cdot j_N \cdot s = L_Q \cdot j_N \cdot s.$$

Proof. (1) For the first equality, recall that $h^Q_{s,t} = t \cdot j'_K \cdot m_Q \cdot e_Q \cdot j_N \cdot s$ by definition of $h_{s,t}$, and that $h_{s,t} = m_{s,t} \cdot e_{s,t}$. By Lemma A.12 we have $e_{s,t} = e_{H_{s,t}} \cdot j_N \cdot s$ and $m_{s,t} = t \cdot j'_K \cdot m_{H_{s,t}}$. It follows that

$$t \cdot j'_K \cdot m_Q \cdot e_Q \cdot j_N \cdot s = m_{s,t} \cdot e_{s,t} = t \cdot j'_K \cdot m_{H_{s,t}} \cdot e_{H_{s,t}} \cdot j_N \cdot s,$$

which gives $h^H_{s,t} = h^Q_{s,t}$.

(2) The second equality follows by postcomposing both sides of the equality $h^H_{s,t} = h^Q_{s,t}$ with $f_T \colon T \to O$ and applying Lemma A.15. □

The key to the termination of the learning algorithm lies in the following result.
Lemma A.17. Let $(s, t)$ be a closed and consistent pair as in (3), and suppose that $t$ is co-pointed. Then for every counterexample $c$ for $H_{s,t}$, the pair $(s \vee c, t)$ is not closed or not consistent.

Proof. Suppose to the contrary that the pair $(s \vee c, t)$ is closed and consistent. Denote by $i \colon S \rightarrowtail S \vee C$ and $i' \colon C \to S \vee C$ the two embeddings, satisfying $(s \vee c) \cdot i = s$ and $(s \vee c) \cdot i' = c$. Via diagonal fill-in we obtain a unique $j \colon H_{s,t} \rightarrowtail H_{s \vee c,t}$ such that

$$j \cdot e_{s,t} = e_{s \vee c,t} \cdot i \qquad\text{and}\qquad m_{s \vee c,t} \cdot j = m_{s,t}.$$

We shall show below that $j$ is an automata homomorphism. In particular, $H_{s,t}$ and $H_{s \vee c,t}$ accept the same language by Proposition A.4. Letting $H = H_{s \vee c,t}$, we compute

$$\begin{aligned}
L_{H_{s,t}} \cdot j_N \cdot c &= L_H \cdot j_N \cdot c && \text{since } L_{H_{s,t}} = L_H \\
&= f_H \cdot e_H \cdot j_N \cdot c && \text{def. } L_H \\
&= f_T \cdot t \cdot j'_K \cdot m_H \cdot e_H \cdot j_N \cdot c && \text{by Lemma A.15} \\
&= f_T \cdot t \cdot j'_K \cdot m_H \cdot e_H \cdot j_N \cdot (s \vee c) \cdot i' && \text{def. } i' \\
&= f_T \cdot h^H_{s \vee c,t} \cdot i' && \text{def. } h^H_{s \vee c,t} \\
&= f_T \cdot h^Q_{s \vee c,t} \cdot i' && \text{by Prop. A.16} \\
&= \cdots = L_Q \cdot j_N \cdot c && \text{compute backwards}
\end{aligned}$$

This contradicts the fact that $c$ is a counterexample for $H_{s,t}$.

To conclude the proof, it only remains to verify our above claim that $j$ is an automata homomorphism.

(1) $j$ preserves transitions. Observe first that we have

$$m_{s,t} \cdot l_{s,t} = m_{s \vee c,t} \cdot l_{s \vee c,t} \cdot Fi, \qquad (6)$$

as shown by the following computation:

$$m_{s,t} \cdot l_{s,t} = m_{F_I s,t} \cdot e_{F_I s,t} \cdot \mathrm{inr} = h_{F_I s,t} \cdot \mathrm{inr} = h_{F_I (s \vee c),t} \cdot F_I i \cdot \mathrm{inr} = h_{F_I (s \vee c),t} \cdot \mathrm{inr} \cdot Fi = m_{s \vee c,t} \cdot l_{s \vee c,t} \cdot Fi.$$

Here the fourth step holds by naturality of $\mathrm{inr}$, the third step holds by definition of $h_{-,t}$ (using that $(s \vee c) \cdot i = s$), and all remaining steps hold by definition. Now, since $\delta_{s,t} \cdot Fe_{s,t} = l_{s,t}$, $\delta_{s \vee c,t} \cdot Fe_{s \vee c,t} = l_{s \vee c,t}$, $Fj \cdot Fe_{s,t} = Fe_{s \vee c,t} \cdot Fi$ and $m_{s \vee c,t} \cdot j = m_{s,t}$ hold by definition, equation (6) yields

$$m_{s \vee c,t} \cdot j \cdot \delta_{s,t} \cdot Fe_{s,t} = m_{s \vee c,t} \cdot \delta_{s \vee c,t} \cdot Fj \cdot Fe_{s,t}.$$

It follows that $j \cdot \delta_{s,t} = \delta_{s \vee c,t} \cdot Fj$, because this equation holds when precomposed with the epimorphism $Fe_{s,t}$ and postcomposed with the monomorphism $m_{s \vee c,t}$. Thus, $j$ preserves transitions.

(2) $j$ preserves the initial state. Observe first that we have

$$m_{s,t} \cdot i_{s,t} = m_{s \vee c,t} \cdot i_{s \vee c,t}, \qquad (7)$$

as shown by the following computation:

$$m_{s,t} \cdot i_{s,t} = m_{F_I s,t} \cdot e_{F_I s,t} \cdot \mathrm{inl} = h_{F_I s,t} \cdot \mathrm{inl} = h_{F_I (s \vee c),t} \cdot F_I i \cdot \mathrm{inl} = h_{F_I (s \vee c),t} \cdot \mathrm{inl} = m_{s \vee c,t} \cdot i_{s \vee c,t}.$$

Combining (7) with $m_{s \vee c,t} \cdot j = m_{s,t}$, which holds by the definition of $j$, we get $m_{s \vee c,t} \cdot j \cdot i_{s,t} = m_{s \vee c,t} \cdot i_{s \vee c,t}$. Thus $j \cdot i_{s,t} = i_{s \vee c,t}$, since $m_{s \vee c,t}$ is a monomorphism. This proves that $j$ preserves the initial state.

(3) $j$ preserves final states. The proof is analogous to (2). □
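The statement of Lemma A.17 can be observed on a small classical observation table (assumed setting: $\mathbf{Set}$, finite words as Python strings, the counterexample added together with its prefixes). The sketch below is only an illustration under these assumptions; the helper names and the target language "words containing aa" are hypothetical.

```python
def row(w, E, member):
    """Row of the observation table for the word w."""
    return tuple(member(w + e) for e in E)

def is_closed(S, E, alphabet, member):
    """Every one-letter extension of S has a row that already occurs in S."""
    rows = {row(s, E, member) for s in S}
    return all(row(s + a, E, member) in rows for s in S for a in alphabet)

def is_consistent(S, E, alphabet, member):
    """Words of S with equal rows have equal rows after every letter."""
    return all(
        row(u + a, E, member) == row(v + a, E, member)
        for u in S for v in S
        if row(u, E, member) == row(v, E, member)
        for a in alphabet
    )

def prefixes(w):
    return [w[:k] for k in range(len(w) + 1)]

# Hypothetical target language: words over {a, b} containing "aa".
contains_aa = lambda w: "aa" in w
S, E, alphabet = [""], [""], "ab"

# The initial table is closed and consistent; its one-state hypothesis
# rejects every word, so "aa" is a counterexample.
before = (is_closed(S, E, alphabet, contains_aa),
          is_consistent(S, E, alphabet, contains_aa))

# Add the counterexample with all its prefixes and check again.
S2 = sorted(set(S) | set(prefixes("aa")))
after = (is_closed(S2, E, alphabet, contains_aa),
         is_consistent(S2, E, alphabet, contains_aa))
```

In this run the extended table stays closed but loses consistency (the empty word and "a" have equal rows, yet their extensions by "a" do not), so a further extension step is forced, in line with the lemma.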
With the above results at hand, we are ready to prove Theorem 4.12:
Proof of Theorem 4.12.
The algorithm only terminates if a hypothesis $H_{s,t}$ constructed in Step (2) is correct (i.e. it accepts the same language as the unknown automaton $Q$), in which case $H_{s,t}$ is returned. This automaton is minimal by Lemma A.12, so $H_{s,t} = \mathrm{Min}(L_Q)$.

Thus, we only need to verify that the algorithm eventually finds a correct hypothesis. For any $F_I$-subcoalgebra $r \colon (R, \varrho) \rightarrowtail (F_I^M 0, F_I^M ¡)$, let $e_r$ and $m_r$ denote the $(\mathcal{E}, \mathcal{M})$-factorization of $e_Q \cdot j_M \cdot r$:

$$e_Q \cdot j_M \cdot r = \bigl( R \overset{e_r}{\twoheadrightarrow} Q_r \overset{m_r}{\rightarrowtail} Q \bigr).$$

Similarly, for any $G_O$-quotient algebra $b \colon (G_O^M 1, G_O^M !) \twoheadrightarrow (B, \beta)$, let $e_b$ and $m_b$ be the $(\mathcal{E}, \mathcal{M})$-factorization of $b \cdot j'_M \cdot m_Q$:

$$b \cdot j'_M \cdot m_Q = \bigl( Q \overset{e_b}{\twoheadrightarrow} Q_b \overset{m_b}{\rightarrowtail} B \bigr).$$

Let $(s, t)$ and $(s', t')$ be two consecutive pairs appearing in an execution of the algorithm. We show below that the following statements hold:

(1) If $(s', t')$ emerges from $(s, t)$ via "Extend s", then $m_s < m_{s'}$ and $e_t = e_{t'}$.
(2) If $(s', t')$ emerges from $(s, t)$ via "Extend t", then $m_s = m_{s'}$ and $e_t < e_{t'}$.
(3) If $(s', t')$ emerges from $(s, t)$ by adding a counterexample, then $m_s \leq m_{s'}$ and $e_t = e_{t'}$.

Letting $(s^{(0)}, t^{(0)}), (s^{(1)}, t^{(1)}), (s^{(2)}, t^{(2)}), \ldots$ denote the sequence of pairs constructed in an execution of the algorithm, it follows that we obtain two ascending chains

$$m_{s^{(0)}} \leq m_{s^{(1)}} \leq m_{s^{(2)}} \leq \cdots \qquad\text{and}\qquad e_{t^{(0)}} \leq e_{t^{(1)}} \leq e_{t^{(2)}} \leq \cdots$$

of subobjects and quotients of $Q$, respectively. By our assumption that $Q$ is Noetherian, both chains must stabilize, i.e. all but finitely many of the relations $\leq$ are equalities. By (1) and (2), this implies that "Extend s" and "Extend t" are called only finitely often. Moreover, whenever a counterexample is added to $s$, this must be immediately followed by a call of "Extend s" or "Extend t" by Lemma A.17. Thus Step (2b) is also executed only finitely often. This proves that the algorithm necessarily terminates.

It remains to establish the above statements (1)–(3).

(1) An application of "Extend s" to $(s, t)$ yields the new pair $(s', t')$ with

$$s' = F_I s \cdot s_1 \qquad\text{and}\qquad t' = t,$$

where $s_0 \colon S \rightarrowtail S'$ and $s_1 \colon S' \rightarrowtail F_I S$ are the morphisms chosen in "Extend s". Thus, we trivially have $e_t = e_{t'}$. Moreover, diagonal fill-in yields a morphism $n_{s,s'} \colon Q_s \to Q_{s'}$ such that

$$n_{s,s'} \cdot e_s = e_{s'} \cdot s_0 \qquad\text{and}\qquad m_{s'} \cdot n_{s,s'} = m_s,$$

and the second equation shows that $m_s \leq m_{s'}$.

To prove $m_s < m_{s'}$, we need to show that $n_{s,s'}$ is not an isomorphism. To this end, consider the unique morphisms $d_s \colon Q_s \twoheadrightarrow H_{s,t}$ and $d_{s'} \colon Q_{s'} \twoheadrightarrow H_{s',t}$ (defined via diagonal fill-in) such that

$$d_s \cdot e_s = e_{s,t}, \quad m_{s,t} \cdot d_s = m_t \cdot e_t \cdot m_s \qquad\text{and}\qquad d_{s'} \cdot e_{s'} = e_{s',t}, \quad m_{s',t} \cdot d_{s'} = m_t \cdot e_t \cdot m_{s'}.$$

Moreover, observe that

$$m_{s',t} \cdot e_{s',t} = h_{s',t} = h_{F_I s,t} \cdot s_1 = m_{F_I s,t} \cdot e_{F_I s,t} \cdot s_1.$$

By the choice of $s_1$ in "Extend s", we have $e_{F_I s,t} \cdot s_1 \in \mathcal{E}$. The uniqueness of $(\mathcal{E}, \mathcal{M})$-factorizations thus implies that, up to isomorphism,

$$H_{s',t} = H_{F_I s,t}, \qquad e_{s',t} = e_{F_I s,t} \cdot s_1, \qquad m_{s',t} = m_{F_I s,t}.$$

We now claim that

$$cl_{s,t} \cdot d_s = d_{s'} \cdot n_{s,s'}, \qquad (8)$$

where $cl_{s,t} \colon H_{s,t} \rightarrowtail H_{F_I s,t} = H_{s',t}$. Indeed, the identities $d_s \cdot e_s = e_{s,t}$, $d_{s'} \cdot e_{s'} = e_{s',t}$, $n_{s,s'} \cdot e_s = e_{s'} \cdot s_0$, $m_{s,t} \cdot e_{s,t} = h_{s,t} = h_{s',t} \cdot s_0$ and $m_{s',t} \cdot cl_{s,t} = m_{s,t}$ hold by definition. Thus (8) holds, since it does when precomposed with the epimorphism $e_s$ and postcomposed with the monomorphism $m_{s',t}$.

We are ready to prove our claim that $n_{s,s'}$ is not an isomorphism. Suppose to the contrary that it is. Since $d_{s'} \in \mathcal{E}$, equation (8) yields $cl_{s,t} \cdot d_s = d_{s'} \cdot n_{s,s'} \in \mathcal{E}$. Thus $cl_{s,t} \in \mathcal{E}$. On the other hand, by definition of $cl_{s,t}$ we have $m_{F_I s,t} \cdot cl_{s,t} = m_{s,t} \in \mathcal{M}$ and thus $cl_{s,t} \in \mathcal{M}$. But from $cl_{s,t} \in \mathcal{E} \cap \mathcal{M}$ it follows that $cl_{s,t}$ is an isomorphism [2, Prop. 14.6], contradicting the fact that the input pair $(s, t)$ of "Extend s" is not closed.

(2) The proof is symmetric to (1).

(3) Adding a counterexample $c$ means replacing the pair $(s, t)$ by the pair $(s', t')$ with

$$s' = s \vee c \qquad\text{and}\qquad t' = t.$$

Thus $e_t = e_{t'}$. Letting $i \colon (S, \sigma) \rightarrowtail (S \vee C, \sigma \vee \gamma) = (S', \sigma')$ denote the embedding with $s = (s \vee c) \cdot i$, diagonal fill-in yields a morphism $n_{s,s'} \colon Q_s \to Q_{s'}$ such that

$$n_{s,s'} \cdot e_s = e_{s'} \cdot i \qquad\text{and}\qquad m_{s'} \cdot n_{s,s'} = m_s.$$

This proves that $m_s \leq m_{s'}$. □

Details for Remark 4.13
Let $m$ and $n$ be the height (i.e. the length of the longest strictly ascending chain) of the poset of subobjects and quotients of $Q$, respectively. The proof of Theorem 4.12 shows that

(1) "Extend s" is executed at most $m$ times;
(2) "Extend t" is executed at most $n$ times;
(3) Step (2b) is executed at most $m + n$ times.

Thus, Steps (1a), (1b) and (2b) are executed at most $2m + 2n = O(m + n)$ times in total.

Details for Example 4.14

(1) The statements for $\mathcal{D} = \mathbf{Set}, \mathbf{Pos}, K\text{-}\mathbf{Vec}$ are clear.

(2) $\mathcal{D} = \mathbf{JSL}$: clearly every finite semilattice is Noetherian. Conversely, if $Q$ is an infinite semilattice, choose a sequence $q_0, q_1, q_2, \ldots$ of elements of $Q$ such that $q_{n+1}$ is not an element of the subsemilattice $\langle q_0, \ldots, q_n \rangle$ of $Q$ generated by $q_0, \ldots, q_n$. Since this subsemilattice is finite (of cardinality at most $2^{n+1}$), such a $q_{n+1}$ can always be chosen. Then

$$\langle q_0 \rangle \subsetneq \langle q_0, q_1 \rangle \subsetneq \langle q_0, q_1, q_2 \rangle \subsetneq \cdots$$

is an infinite strictly ascending chain of subsemilattices of $Q$, showing that $Q$ is not Noetherian.

(3) $\mathcal{D} = \mathbf{Nom}$: We show that orbit-finite sets have the claimed polynomial height. Let $X$ be an orbit-finite nominal set with $n$ orbits. It is clear that chains of subobjects, i.e. equivariant subsets, of $X$ have length at most $n$. It remains to show the polynomial bound on chains of quotients. The number of orbits decreases non-strictly along such a chain, and can strictly decrease at most $n$ times, so it suffices to consider chains of quotients that retain the same number of orbits. Such quotients are sums of quotients of single-orbit sets, so it suffices to consider the case where $X$ has only one orbit. Then, all elements of $X$ have supports of the same size $k$; since this number decreases non-strictly along a chain of quotients, and can strictly decrease at most $k$ times, it suffices to consider chains of quotients that retain the same support size.

We now use the standard fact that $X$ is a quotient of $A^{*k}$, the $k$-fold separated product of $A$; the same, of course, holds for all quotients of $X$. A quotient of $A^{*k}$ whose elements retain supports of size $k$ is determined by a subgroup $G$ of the symmetric group $S_k$. (Specifically, the quotient determined by $G$ identifies $(a_1, \ldots, a_k)$ and $(a_{\pi(1)}, \ldots, a_{\pi(k)})$ for all $(a_1, \ldots, a_k) \in A^{*k}$ and $\pi \in G$. Conversely, from a given quotient $e \colon X \twoheadrightarrow Y$, we obtain $G$ as consisting of all $\pi \in S_k$ such that $e$ identifies $(a_1, \ldots, a_k)$ and $(a_{\pi(1)}, \ldots, a_{\pi(k)})$ for all $(a_1, \ldots, a_k) \in A^{*k}$.) The given chain of quotients thus corresponds to a chain of subgroups of $S_k$, which for $k \geq$ $k -$

Details for Remark 4.16
We demonstrate that the coalgebraic learning algorithm in [14] gets stuck when applied to the setting of $\Sigma$-automata in $\mathbf{Nom}$. In the following, we assume some familiarity with the algorithm and the notation introduced in op. cit.

A coalgebraic logic giving the semantics of nominal automata can be described in complete analogy to the $\mathbf{Set}$ case [14, Example 1]. We instantiate the logical framework to the functors $L$ on $\mathbf{Nom}^{op}$ (as $L^{op}$) and $B$ on $\mathbf{Nom}$, connected by $P \colon \mathbf{Nom} \to \mathbf{Nom}^{op}$ and the natural transformation $\delta$, where

$$LX = 1 + A \times X, \qquad BX = 2 \times [A, X], \qquad P = [-, 2].$$

The right adjoint of $P$ is $Q = [-, 2] \colon \mathbf{Nom}^{op} \to \mathbf{Nom}$. For each $X \in \mathbf{Nom}$, the map

$$\delta_X \colon 1 + A \times [X, 2] \to [2 \times [A, X], 2]$$

sends the unique element of $1$ to the left product projection, and $(a, f) \in A \times [X, 2]$ to $\delta_X(a, f) \in [2 \times [A, X], 2]$ with

$$\delta_X(a, f)(b, g) = f(g(a)) \qquad\text{for } b \in 2,\ g \in [A, X].$$

We have the initial algebra for $L$ given by $\Phi = \mu L = A^*$, and the theory map

$$th_\gamma \colon X \to Q\Phi = [A^*, 2]$$

for a nominal automaton (i.e. $B$-coalgebra) $X$ is just the unique coalgebra homomorphism from $X$ into the final coalgebra $\nu B = [A^*, 2]$ (cf. Example 3.9).

Now consider the nominal language $K \colon A^* \to 2$ with $K(w) = 1$ iff $w$ has even length. We assume that the unknown coalgebra is given by

$$(X \xrightarrow{\gamma} BX) = (A^* \xrightarrow{\langle K, \gamma' \rangle} 2 \times [A, A^*])$$

with $\gamma'(w)(a) = wa$ for $w \in A^*$, $a \in A$. (The state set $X$ is effectively made known to the learner in advance since the learning algorithm computes subobjects of $X$. Thus, in the typical scenario $X$ will be orbit-infinite like in the present example, although of course the language $K$ can be accepted by an orbit-finite automaton.) The algorithm starts with the trivial observation table

$$S = \{\varepsilon\} \rightarrowtail X \qquad\text{and}\qquad \Psi = \emptyset \rightarrowtail \Phi.$$

This table is closed and the induced conjecture is the trivial one-state automaton accepting all words in $A^*$. Since $a \notin K$ for $a \in A$, the teacher provides the (minimal) counterexample $\{\varepsilon\} + A \rightarrowtail \Phi$. After adding it to $\Psi$, the new table is

$$S = \{\varepsilon\} \rightarrowtail X \qquad\text{and}\qquad \Psi = \{\varepsilon\} + A \rightarrowtail \Phi.$$

The next reachability step computes the set $\Gamma(S)$ of elements of $X$ reachable from $S = \{\varepsilon\}$ in a single transition step: $\Gamma(S) = A$. Thus

$$S \vee \Gamma(S) = S \cup \Gamma(S) = \{\varepsilon\} + A \rightarrowtail X.$$

Viewing the elements of $Q\Psi = [\{\varepsilon\} + A, 2]$ as finitely supported subsets of $\Psi = \{\varepsilon\} + A$, we can describe the map

$$S \vee \Gamma(S) \rightarrowtail X \xrightarrow{th_\gamma} Q\Psi$$

as sending $\varepsilon$ to $\{\varepsilon\} \subseteq \Psi$ and every $a \in A$ to $A \subseteq \Psi$, i.e. the image of this map is the discrete nominal set

$$\overline{S} = \{\{\varepsilon\}, A\} \cong 2.$$

In order to close the table, Step 6 of the algorithm now requires to choose a monomorphism $\overline{S} \rightarrowtail X$ subject to certain conditions. But clearly there exists no monomorphism $\overline{S} = 2 \rightarrowtail X = A^*$ in $\mathbf{Nom}$, i.e. the algorithm cannot make the required choice.
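The two rows arising in this example can be recomputed on a finite sample of atoms. The sketch below is only an approximation under stated assumptions: the three-element list `ATOMS` stands in for the infinite set $A$, words are tuples of atoms, the row of a state is the set of tests in $\Psi$ it passes, and the permutation action of $\mathbf{Nom}$ is not modelled. All names are hypothetical.

```python
ATOMS = ["a", "b", "c"]  # finite stand-in for the infinite set A (assumption)

def K(word):
    """The language of words of even length."""
    return len(word) % 2 == 0

# Psi = {epsilon} + A, with the empty tuple playing the role of epsilon.
PSI = [()] + [(x,) for x in ATOMS]

def row(word):
    """Subset of Psi consisting of the tests accepted after reading `word`."""
    return frozenset(psi for psi in PSI if K(word + psi))

# Rows of the elements of S v Gamma(S) = {epsilon} + A.
rows = {(): row(())}
rows.update({(x,): row((x,)) for x in ATOMS})
image = set(rows.values())
```

On this sample the empty word is sent to the singleton containing the empty test and every atom is sent to the set of all atoms, so the image has exactly two elements, matching the description of the image above.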
Details for Example 5.9
Our categorical notion of automata presentation involves quotients of $T$-algebras. For practical purposes, it is sometimes more convenient to work with the equivalent concept of a congruence:

Remark A.18. (1) Recall that for a monad $T$ on $\mathbf{Set}$ given by a finitary signature $\Gamma$ and equations $E$ between $\Gamma$-terms, quotient algebras of a $T$-algebra (i.e. $(\Gamma, E)$-algebra) $A$ correspond bijectively to congruences on $A$. Here a congruence is an equivalence relation $\equiv$ on $A$ respecting all $\Gamma$-operations: for all $a, a' \in A$ with $a \equiv a'$, one has

$$\gamma(a_1, \ldots, a_{i-1}, a, a_{i+1}, \ldots, a_n) \equiv \gamma(a_1, \ldots, a_{i-1}, a', a_{i+1}, \ldots, a_n)$$

for $n > 0$, $\gamma \in \Gamma_n$, $i \in \{1, \ldots, n\}$ and $a_j \in A$ ($j \neq i$). The bijection identifies a quotient $e \colon A \twoheadrightarrow B$ with its kernel, i.e. the congruence given by $a \equiv a' \Leftrightarrow e(a) = e(a')$.

Thus, if the object $TI$ is equipped with some $\Sigma$-automata structure $\Sigma \times TI \xrightarrow{\delta} TI$, the equivalence in Definition 5.7(3) states precisely that an equivalence relation $\equiv$ on $TI$ corresponding to a $T$-refinable quotient is a congruence on $TI$ iff for all $w, w' \in TI$ and $a \in \Sigma$,

$$w \equiv w' \quad\text{implies}\quad \delta(a, w) \equiv \delta(a, w').$$

(2) An analogous remark applies to monads $T$ on $\mathbf{Set}^S$ corresponding to a finitary $S$-sorted signature $\Gamma$ and equations between $\Gamma$-terms: quotient algebras of a $(\Gamma, E)$-algebra $A$ correspond to $S$-sorted congruence relations, i.e. families of equivalence relations $\equiv\, = (\equiv_s\, \subseteq A_s \times A_s)_{s \in S}$ respecting all operations. Thus, if $TI$ is equipped with the structure of a sorted $\Sigma$-automaton $\delta_{s,t} \colon \Sigma_{s,t} \times (TI)_s \to (TI)_t$ ($s, t \in S$), the equivalence in Definition 5.7(3) states precisely that an $S$-sorted equivalence relation $\equiv$ on $TI$ corresponding to a $T$-refinable quotient is a congruence on $TI$ iff for all $w, w' \in (TI)_s$ and $a \in \Sigma_{s,t}$,

$$w \equiv w' \quad\text{implies}\quad \delta_{s,t}(a, w) \equiv \delta_{s,t}(a, w').$$

We will now describe automata presentations for semigroups, Wilke algebras, and general (ordered) $(\Gamma, E)$-algebras, including stabilization algebras. We will see that in all these cases, the equivalence in Definition 5.7(3) holds for arbitrary, not only $T$-refinable, quotients.

Semigroups. The free semigroup $T_+ I = I^+$ has a $\Sigma$-automata presentation $\delta \colon \Sigma \times I^+ \to I^+$ given by the alphabet

$$\Sigma = \{\overrightarrow{a} : a \in I\} \cup \{\overleftarrow{a} : a \in I\}$$

and the transitions

$$\delta(\overrightarrow{a}, w) = wa \quad\text{and}\quad \delta(\overleftarrow{a}, w) = aw \qquad\text{for } w \in I^+,\ a \in I.$$

We show that (1)–(3) of Definition 5.7 (with $F = \Sigma \times -$ on $\mathbf{Set}$) are satisfied. (1) is clear by Remark 5.4. For (2), recall from Example 3.11 that $\mu F_I = I \times \Sigma^*$. The unique homomorphism $e_{I^+} \colon I \times \Sigma^* \to I^+$ interprets a word in $I \times \Sigma^*$ as a list of instructions for forming a word in $I^+$, e.g.

$$e_{I^+}(a \overrightarrow{a} \overrightarrow{b} \overleftarrow{b} \overrightarrow{a}) = baaba.$$

Thus, $e_{I^+}$ is surjective: given $a_1 \ldots a_n \in I^+$ with $a_i \in I$, we have

$$a_1 \ldots a_n = e_{I^+}(a_1 \overrightarrow{a_2} \cdots \overrightarrow{a_n}).$$

To show (3), we use Remark A.18(1): we need to verify that an equivalence relation $\equiv$ on $I^+$ is a semigroup congruence iff, for every $w, w' \in I^+$ and $a \in I$,

$$w \equiv w' \quad\text{implies}\quad wa \equiv w'a,\ aw \equiv aw'.$$

The "only if" direction is clear. For the "if" direction, let $w \equiv w'$ and $v \in I^+$; we need to show that $wv \equiv w'v$ and $vw \equiv vw'$. For the first equivalence, let $v = a_1 \ldots a_n$. Then we get the chain of implications

$$w \equiv w' \ \Rightarrow\ wa_1 \equiv w'a_1 \ \Rightarrow\ \ldots \ \Rightarrow\ wa_1 \ldots a_n \equiv w'a_1 \ldots a_n,$$

i.e. $wv \equiv w'v$. The proof of the second equivalence is symmetric.

Wilke algebras.
The free Wilke algebra $T_\infty(I, \emptyset) = (I^+, I^{\mathrm{up}})$ can be presented as a two-sorted $\Sigma$-automaton with the sorted alphabet $\Sigma = (\Sigma_{+,+}, \Sigma_{+,\omega}, \Sigma_{\omega,\omega}, \emptyset)$ given by
\[ \Sigma_{+,+} = \{\overrightarrow{a} : a \in I\} \cup \{\overleftarrow{a} : a \in I\}, \qquad \Sigma_{+,\omega} = \{\omega\} \cup \{\overrightarrow{v^\omega} : v \in I^+\}, \qquad \Sigma_{\omega,\omega} = \{\overleftarrow{a} : a \in I\} \]
and the transitions below, where $v, w \in I^+$, $z \in I^{\mathrm{up}}$, $a \in I$:
\[ \delta_{+,+}(\overrightarrow{a}, w) = wa, \quad \delta_{+,+}(\overleftarrow{a}, w) = aw, \quad \delta_{+,\omega}(\omega, w) = w^\omega, \quad \delta_{+,\omega}(\overrightarrow{v^\omega}, w) = wv^\omega, \quad \delta_{\omega,\omega}(\overleftarrow{a}, z) = az. \]
We show that (1)–(3) of Definition 5.7 (with $F$ the functor on $\mathbf{Set}^{\{+,\omega\}}$ from Example 3.11) are satisfied. (1) is clear by Remark 5.4. For (2), recall from Example 3.11 that the initial algebra $\mu F_I$ consists of sorted words over $\Sigma$ with an additional first letter from $I$. The homomorphism $e_{(I^+, I^{\mathrm{up}})}\colon \mu F_I \to (I^+, I^{\mathrm{up}})$ views such a word as an instruction for forming a word in $(I^+, I^{\mathrm{up}})$, e.g.
\[ e_{(I^+, I^{\mathrm{up}})}(a\,\overrightarrow{b}\,\overrightarrow{a}\,\omega\,\overleftarrow{a}\,\overleftarrow{a}) = aa(aba)^\omega. \]
Thus $e_{(I^+, I^{\mathrm{up}})}$ is surjective: every finite word $w \in I^+$ is in the image of $e_{(I^+, I^{\mathrm{up}})}$ as in the case of semigroups, and for an ultimately periodic word $(a_1 \dots a_n)(b_1 \dots b_m)^\omega \in I^{\mathrm{up}}$ we have
\[ (a_1 \dots a_n)(b_1 \dots b_m)^\omega = e_{(I^+, I^{\mathrm{up}})}(b_1\,\overrightarrow{b_2} \cdots \overrightarrow{b_m}\,\omega\,\overleftarrow{a_n} \cdots \overleftarrow{a_1}). \]
To show (3), we use Remark A.18(2): we need to verify that a two-sorted equivalence relation $\equiv$ on $(I^+, I^{\mathrm{up}})$ is a congruence w.r.t. the Wilke algebra structure iff, for each $w, w', v \in I^+$ with $w \equiv w'$ and $a \in I$, one has
\[ aw \equiv aw', \quad wa \equiv w'a, \quad w^\omega \equiv (w')^\omega, \quad wv^\omega \equiv w'v^\omega, \]
and for each $z, z' \in I^{\mathrm{up}}$ with $z \equiv z'$ and $a \in I$ one has $az \equiv az'$. The "only if" direction is clear. For the "if" direction, we need to show that for all $v, w, w' \in I^+$ and $z, z' \in I^{\mathrm{up}}$,
• $w \equiv w'$ implies $vw \equiv vw'$, $wv \equiv w'v$, $w^\omega \equiv (w')^\omega$ and $wz \equiv w'z$;
• $z \equiv z'$ implies $wz \equiv wz'$.
Let us show that $w \equiv w'$ implies $wz \equiv w'z$; the proofs of the other statements are similar. We have $z = a_1 \dots a_n y^\omega$ with $a_1, \dots, a_n \in I$ and $y \in I^+$. From $w \equiv w'$ it follows that
\[ wa_1 \equiv w'a_1, \quad wa_1a_2 \equiv w'a_1a_2, \quad \dots, \quad wa_1 \dots a_n \equiv w'a_1 \dots a_n, \]
and thus $wz = wa_1 \dots a_n y^\omega \equiv w'a_1 \dots a_n y^\omega = w'z$.

Stabilization algebras.
Suppose that $T$ is a monad on $\mathbf{Set}$ or $\mathbf{Pos}$ induced by a finitary signature $\Gamma$ and (in-)equations $E$; see Section 2. Then $TI$ can be presented as the $\Gamma$-automaton $\delta\colon F_\Gamma(TI) \to TI$ given by the $\Gamma$-algebra structure on the free $(\Gamma, E)$-algebra $TI$. We show that (1)–(3) of Definition 5.7 are satisfied.

(1) is clear by Remark 5.4. For (2), observe that the initial algebra $\mu (F_\Gamma)_I$ is the algebra $T_\Gamma I$ of $\Gamma$-terms over $I$, and that the unique homomorphism $e_{TI}\colon T_\Gamma I \twoheadrightarrow TI$ interprets $\Gamma$-terms in $TI$. Since the $T$-algebra $TI$ is generated by the set $I$ as a $\Gamma$-algebra, every element of $TI$ can be expressed as a $\Gamma$-term over $I$, i.e. $e_{TI}$ is surjective. (3) is clear: the equivalence just amounts to the statement that if $e$ is a surjective homomorphism of (ordered) $\Gamma$-algebras and its domain satisfies all (in-)equations in $E$, then so does its codomain.

By instantiating to the monad $T = T_S$ on $\mathbf{Pos}$, we see that the free stabilization algebra $T_S I$ has a $\Gamma$-automata presentation for the signature $\Gamma$ of Example 5.5(3).

Proof of Theorem 5.12
Suppose that $L$ is recognized via $e\colon TI \to (A, \alpha)$ and $p\colon A \to O$, where $(A, \alpha)$ is a finite $T$-algebra. We may assume that $e \in \mathcal{E}$. (Otherwise consider the $(\mathcal{E}, \mathcal{M})$-factorization
\[ TI \overset{e'}{\twoheadrightarrow} (A', \alpha') \overset{m}{\rightarrowtail} (A, \alpha) \]
of $e$. Since $\mathcal{D}_f$ is closed under subobjects, $L$ is recognized by the finite $T$-algebra $(A', \alpha')$ via $e'$ and $p \cdot m$, i.e. we can replace $e$ by $e'$.)

Since $(F, \delta)$ forms a weak automata presentation, the object $A$ can be equipped with an $F$-algebra structure $\delta_A\colon FA \to A$ such that $e\colon (TI, \delta) \twoheadrightarrow (A, \delta_A)$ is an $F$-algebra homomorphism. Equipping $TI$ and $A$ with the initial states $\eta_I\colon I \to TI$ and $e \cdot \eta_I\colon I \to A$, respectively, we can view $TI$ and $A$ as $F_I$-algebras and $e$ as an $F_I$-algebra homomorphism. By initiality of $\mu F_I$, it follows that $e_A = e \cdot e_{TI}$. Hence the diagram below commutes, which proves that the automaton $(A, \delta_A, e \cdot \eta_I, p)$ accepts the language $\mathrm{lin}(L) = L \cdot e_{TI}$.
\[
\begin{array}{ccccc}
\mu F_I & \overset{e_{TI}}{\twoheadrightarrow} & TI & \overset{L}{\longrightarrow} & O \\
 & {\scriptstyle e_A}\searrow & \big\downarrow{\scriptstyle e} & \nearrow{\scriptstyle p} & \\
 & & A & &
\end{array}
\tag{9}
\]
Since $A$ is finite, we conclude that $\mathrm{lin}(L)$ is regular. □

Proof of Theorem 5.14
The proof is illustrated by the diagram below, which additionally contains the arrows $\mathrm{lin}(L)\colon \mu F_I \to O$ and $f_A\colon A \to O$:
\[
\begin{array}{ccccc}
\mu F_I & \overset{e_{TI}}{\twoheadrightarrow} & TI & \overset{L}{\longrightarrow} & O \\
{\scriptstyle e_A}\big\downarrow & \swarrow{\scriptstyle e} & \big\downarrow{\scriptstyle e'} & \nearrow{\scriptstyle p'} & \\
A & \overset{h}{\longleftarrow} & B & &
\end{array}
\]
Let $A = \mathrm{Min}(\mathrm{lin}(L))$ be the minimal automaton for the language $\mathrm{lin}(L)$. Equipping $TI$ with the initial states $\eta_I\colon I \to TI$ and the final states $L\colon TI \to O$, we can view $TI$ as an automaton accepting $\mathrm{lin}(L) = L \cdot e_{TI}$. Since $e_{TI} \in \mathcal{E}$ (that is, the automaton $TI$ is reachable) and $A$ is minimal, there exists a unique automata homomorphism $e\colon TI \twoheadrightarrow A$. We now prove the theorem by establishing the following claims:

Claim 1.
For every finite quotient $T$-algebra $e'\colon TI \twoheadrightarrow (B, \beta)$ that recognizes $L$, there exists a unique $h\colon B \to A$ with $e = h \cdot e'$.

Proof. As in the proof of Theorem 5.12, $B$ can be viewed as a reachable automaton recognizing $\mathrm{lin}(L)$. By minimality of $A$, there is an automata homomorphism $h\colon B \to A$. We have
\[ h \cdot e' \cdot e_{TI} = e \cdot e_{TI} \]
because both sides are $F_I$-algebra homomorphisms from $\mu F_I$ to $A$ and $\mu F_I$ is initial. Thus $h \cdot e' = e$ because $e_{TI}$ is epic.
Claim 2.
The automaton $A$ can be equipped with a $T$-algebra structure $(A, \alpha_A)$ such that $e\colon TI \twoheadrightarrow (A, \alpha_A)$ is a $T$-homomorphism.

Proof. Since $L$ is $T$-recognizable, we have $L = p' \cdot e'$ for some finite quotient $T$-algebra $e'\colon TI \twoheadrightarrow (B, \beta)$ and some $p'\colon B \to O$. By Claim 1, $e = h \cdot e'$ for some $h$, which shows that $e$ is $T$-refinable. Since $(F, \delta)$ is an automata presentation, we obtain the desired $\alpha_A$.

Claim 3. $e\colon TI \twoheadrightarrow (A, \alpha_A)$ is a syntactic $T$-algebra for $L$.

Proof.
The homomorphism $e$ recognizes $L$ via $f_A$: we have
\[ L \cdot e_{TI} = \mathrm{lin}(L) = f_A \cdot e_A = f_A \cdot e \cdot e_{TI} \]
and thus $L = f_A \cdot e$ because $e_{TI}$ is epic. The universal property of $e$ follows from Claim 1. □
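
The instruction-word semantics used in the semigroup and Wilke algebra presentations earlier in this appendix can be made concrete with a small executable sketch. The Python code below is an illustration only and not part of the formal development. It assumes that the letters of $I$ are single characters, encodes the alphabet letters by the hypothetical tags `"R"` (append a letter), `"L"` (prepend a letter), `"omega"` ($w \mapsto w^\omega$) and `"Rvomega"` ($w \mapsto wv^\omega$), and represents an ultimately periodic word $uv^\omega$ by the pair `(u, v)`, which is not a canonical representation. All function names are ours.

```python
from typing import List, Tuple, Union

# An instruction is a pair (tag, argument); the tags are illustrative names.
Instr = Tuple[str, str]


def interpret(first: str, instrs: List[Instr]) -> str:
    """Evaluate an element (first, instrs) of I x Sigma* to a word in I+."""
    word = first
    for tag, letter in instrs:
        if tag == "R":      # right-arrow letter: w |-> wa
            word = word + letter
        elif tag == "L":    # left-arrow letter: w |-> aw
            word = letter + word
        else:
            raise ValueError(f"unknown instruction tag: {tag!r}")
    return word


def instructions_for(word: str) -> Tuple[str, List[Instr]]:
    """Return one preimage of a nonempty word, witnessing surjectivity."""
    if not word:
        raise ValueError("I+ contains only nonempty words")
    return word[0], [("R", a) for a in word[1:]]


def interpret_wilke(first: str, instrs: List[Instr]) -> Union[str, Tuple[str, str]]:
    """Evaluate a sorted instruction word.

    Returns a finite word, or a pair (u, v) standing for u v^omega.
    """
    finite = first
    up = None  # set to (prefix, period) once the sort changes from + to omega
    for tag, arg in instrs:
        if up is None:
            if tag == "R":
                finite = finite + arg
            elif tag == "L":
                finite = arg + finite
            elif tag == "omega":      # w |-> w^omega (argument ignored)
                up = ("", finite)
            elif tag == "Rvomega":    # w |-> w v^omega with v = arg
                up = (finite, arg)
            else:
                raise ValueError(f"unknown instruction tag: {tag!r}")
        elif tag == "L":              # z |-> az on infinite words
            up = (arg + up[0], up[1])
        else:
            raise ValueError("only prepending a letter is defined on sort omega")
    return finite if up is None else up


semigroup_example = interpret(
    "a", [("R", "a"), ("R", "b"), ("L", "b"), ("R", "a")]
)
wilke_example = interpret_wilke(
    "a", [("R", "b"), ("R", "a"), ("omega", ""), ("L", "a"), ("L", "a")]
)
```

Under these assumptions, `semigroup_example` evaluates the instruction word from the semigroup example to `"baaba"`, and `wilke_example` yields the pair `("aa", "aba")`, i.e. the word $aa(aba)^\omega$ from the Wilke algebra example. The function `instructions_for` mirrors the surjectivity argument for $e_{I^+}$ by appending the letters of a word one at a time.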