Glushkov's construction for functional subsequential transducers
Aleksander Mendoza-Drosik
Abstract: Glushkov's construction has many interesting properties, and they become even more evident when applied to transducers. This article strives to show the vast range of possible extensions and optimisations for this algorithm. A special flavour of regular expressions is introduced, which can be efficiently converted to $\epsilon$-free functional subsequential weighted finite state transducers. The produced automata are very compact, as they contain only one state for each symbol (from the input alphabet) of the original expression and only one transition for each range of symbols, no matter how large. Such compactified ranges of transitions allow for efficient binary search lookup during automaton evaluation. All the methods and algorithms presented here were used to implement an open-source compiler of regular expressions for multitape transducers.

Keywords: weighted automata, transducers, Glushkov, follow automata, regular expressions
I. INTRODUCTION

There are not many open source solutions available for working with transducers. The most significant and widely used library is OpenFst. Its approach is based on the theory of weighted automata [1][2][3]. Here we propose an alternative approach founded on lexicographic transducers [4] and Glushkov's algorithm [5].

Let $W$ be some set of weight symbols. The free monoid $W^*$ will be our set of weight strings. We assume there is some lexicographic order defined as

$b_1 w_1 > b_2 w_2 \iff w_1 > w_2$ or ($w_1 = w_2$ and $b_1 > b_2$)

where $w_1, w_2 \in W$ and $b_1, b_2 \in W^*$. The order is defined only on strings of equal lengths. Let $\Sigma$ be the input alphabet, $\Sigma^*$ the monoid of input strings and $D$ the monoid of output strings. A lexicographic transducer is defined as a tuple $(Q, I, W, \Sigma, D, \delta, \tau)$ where $Q$ is some finite set of states, $I$ is the set of initial states, $\tau$ is a state output (partial) function $Q \to D \times W$ and lastly $\delta$ represents transitions of the form $\delta \subset Q \times W \times \Sigma \times D \times Q$.

Thanks to $\tau$, such machines are subsequential [6][7][8][9]. As an example consider the simple transducer from figure 1. The states $q_0$, $q_1$ and $q_2$ have no output, which can be denoted with $\tau(q_0) = \emptyset$. The only state which does have output is $q_3$. Every time the automaton finishes reading the input string and reaches $q_3$, it will append $d_5$ to its output and then accept. For instance, on input $\sigma_1 \sigma_2$ it will first read $\sigma_1$, produce output $d_1 d_2$ and go to state $q_1$, then read $\sigma_2$ and append output $d_3$, go to state $q_3$, and finally, reaching the end of input, append $d_5$ and accept. The total output would be $d_1 d_2 d_3 d_5$. Note that the automaton is nondeterministic, as it could take an alternative route passing through $q_2$ and producing $d_4 d_5$. In such scenarios weights are used to disambiguate output. The first route produces weight string $w_1 w_2 w_5$, while the second produces $w_3 w_4 w_5$. According to our definition of lexicographic order we have $w_1 w_2 w_5 > w_3 w_4 w_5$ (assuming that $w_2 > w_4$).
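The lexicographic order defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: weight symbols are modelled as integers, weight strings as tuples, and the name `lex_greater` is hypothetical. Because the definition compares the last symbol first, the sketch walks both strings from right to left.

```python
from typing import Sequence


def lex_greater(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True when weight string a is greater than weight string b.

    Follows the rule: b1 w1 > b2 w2  iff  w1 > w2, or w1 == w2 and b1 > b2.
    Weight symbols are assumed to be integers (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("the order is defined only on strings of equal length")
    # The last symbol is the most significant one, so compare right to left.
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            return x > y
    return False  # the two strings are equal


# Two routes over the same input that end with the same state-output weight.
route_a = (1, 4, 5)
route_b = (3, 2, 5)
a_is_greater = lex_greater(route_a, route_b)  # decided by 4 versus 2
```

The two example routes share their last weight, so the comparison is decided by the second-to-last symbols, mirroring the example from figure 1.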
Throughout this article we will consider smaller weights to be "better". Hence the automaton should choose $d_4 d_5$ as the definitive output for input $\sigma_1 \sigma_2$. There might be situations in which two different routes have the exact same (equally best) weight while also producing different outputs. In such cases, the automaton is ambiguous and produces multiple outputs for one input.

II. EXPRESSIVE POWER
There are some remarks to be made about lexicographic transducers. They recognize relations on languages, unlike "plain" finite state automata (FSA), which recognize languages. If $M$ is some transducer, then we denote its recognized relation with $L(M)$. Those relations are subsets of $\Sigma^* \times D$. The set of strings from $\Sigma^*$ accepted by $M$ must be a regular language (indeed, if we erased output labels, we would obtain an FSA). The weights are erasable [4] in the sense that, given any lexicographic transducer, we can always build an equivalent automaton without weighted transitions. If we didn't have $\tau$, the only output possible to be expressed for empty input would be an empty string as well. With $\tau$ we can express pairs like $(\epsilon, d) \in L(M)$ where $d \neq \epsilon$.

The transducers can return at most finitely many outputs for any given input (see infinite superposition [4]). If we allowed for $\epsilon$-transitions (transitions that have $\epsilon$ as input label) we could build $\epsilon$-cycles and produce infinitely many outputs. However, automata that do so are not very interesting from a practical point of view. Therefore we shall focus only on functional transducers, that is those which produce at most one output for each input. If an automaton does not have any $\epsilon$-cycles and is functional, then it's possible to erase all $\epsilon$-transitions (note that it would not be possible without $\tau$, because $\epsilon$-transitions allow for producing output given empty input). Therefore $\epsilon$-transitions don't increase the power of functional transducers.

We say that a transducer has conflicting states $q_1$ and $q_2$ if it's possible to reach both of them simultaneously (there are two possible routes with the same inputs and weights) given some input string, and there is another state $q_3$ to which both of those states can transition over the same input symbol $\sigma_i$. Alternatively, there might be no third state $q_3$, but instead both $q_1$ and $q_2$ have non-empty $\tau$ output (so $\tau$ can in a sense be treated like $q_3$).
We say that transitions $(q_1, \sigma_i, w, d, q_3)$ and $(q_2, \sigma_i, w', d, q_3)$ are weight-conflicting if they have equal weights $w = w'$. For instance in figure 1 the states $q_1$ and $q_2$ are indeed conflicting because they both transition to $q_3$ over $\sigma_2$, but their transitions are not weight-conflicting.

Fig. 1. Example of lexicographic transducer. State $q_0$ is initial. State $q_3$ is accepting, in the sense that $\tau(q_3) = (w_5, d_5)$. The remaining states have state output $\emptyset$.

It can be shown that transducers without weight-conflicting transitions are functional. Moreover, if a transducer is functional but contains weight-conflicting transitions, then the weights can be reassigned in such a way that eliminates all conflicts [4]. The only requirement is that there are enough symbols in $W$ (for instance, if $W$ had only one symbol, then all transitions of conflicting states would always be weight-conflicting). If there are at least as many weight symbols as there are states, $|W| \geq |Q|$, then every functional transducer on $|Q|$ states can be built without weight-conflicting transitions. For convenience we can assume that $W = \mathbb{N}$, but in practice all algorithms presented here will work with bounded $W$. Hence transducers without weight-conflicting transitions are equivalent in power to functional transducers. This is important because by searching for weight-conflicting transitions we can efficiently test whether a transducer is functional or not.

III. RANGED AUTOMATA
Often when implementing automata, the algorithm behind the $\delta$ function needs to efficiently find the right transition for a given symbol $\sigma$. It's beneficial to optimise UNIX-style ranges like [0-9] or [a-z], as they arise often in practical settings. Even the . wildcard can be treated as one large range spanning the entire $\Sigma$. If the alphabet is large (like ASCII or UNICODE), then checking every symbol in a loop is not feasible. A significant improvement can be made by only checking two inequalities like $\sigma_1 \leq x \leq \sigma_2$, instead of a large number of equalities. The current paper presents a way in which a simplified model of $(S, k)$-automata [10][11] can be used to obtain major improvements. In particular we consider only automata that don't have any registers apart from constant values, that is $k = 0$. Therefore we provide a more specialized definition of "ranged automata".

Let $\Sigma$ be the (not necessarily finite) alphabet of the automaton. Let $\chi$ be the set of subsets of $\Sigma$ that we will call ranges of $\Sigma$. Let $\bar{\chi}$ be the closure of $\chi$ under countable union and complementation (so it forms a sigma algebra). For instance, imagine that there is a total order on $\Sigma$ and $\chi$ is the set of all intervals in $\Sigma$. Now we want to build an automaton whose transitions are not labelled with symbols from $\Sigma$, but rather with ranges from $\chi$. Union $\chi_1 \cup \chi_2$ of two elements from $\chi$ "semantically" corresponds to putting two edges, $(q, \chi_1, q') \in \delta$ (for a moment forget about outputs and weights) and $(q, \chi_2, q') \in \delta$. There is no limitation on the size of $\delta$. It might be countably infinite, hence it's natural that $\bar{\chi}$ should be closed under countable union. Therefore, $\chi$ is the set of allowed transition labels and $\bar{\chi}$ is the set of all possible "semantic" transitions. We could say that $\bar{\chi}$ is discrete if it contains every subset of $\Sigma$.
An example of discrete $\bar{\chi}$ would be a finite set $\Sigma$ with all UNIX-style ranges [$\sigma$-$\sigma'$] included in $\chi$.

Another example would be the set $\Sigma = \mathbb{R}$ with $\chi$ consisting of all ranges whose ends are computable real numbers (a real number $x$ is computable if the predicate $q < x$ is decidable for all rational numbers $q$). If we also restricted $\delta$ to be a finite set, then we could build effective automata that work with real numbers of arbitrary precision.

IV. REGULAR EXPRESSIONS
Here we describe a flavour of regular expressions specifically extended to interplay with lexicographic transducers and ranged automata.

Transducers with input $\Sigma^*$ and output $\Gamma^*$ can be seen as FSA working with single input $\Sigma^* \times \Gamma^*$. Therefore we can treat every pair of symbols $(\sigma, \gamma)$ as an atomic formula of regular expressions for transducers. We can use concatenation $(\sigma, \gamma_1)(\epsilon, \gamma_2)$ to represent $(\sigma, \gamma_1 \gamma_2)$. It's possible to create ambiguous transducers with unions like $(\epsilon, \gamma_1) + (\epsilon, \gamma_2)$. To make notation easier, we will treat every $\sigma$ as $(\sigma, \epsilon)$ and every $\gamma$ as $(\epsilon, \gamma)$. Then instead of writing the lengthy $(\sigma, \epsilon)(\epsilon, \gamma)$ we could introduce the shortened notation $\sigma : \gamma$. Because we would like to avoid ambiguous transducers, we can put a restriction that the right side of : should always be a string of $\Gamma^*$, and writing entire formulas (like $\sigma : \gamma_1 + \gamma_2^*$) is not allowed. This restriction will later simplify Glushkov's algorithm.

We define $A_\Sigma$ to be the set of atomic characters. For instance we could choose $A_\Sigma = \Sigma \cup \{\epsilon\}$ for FSA/transducers and $A_\Sigma = \chi$ for ranged automata.

We call $RE^{\Sigma:D}$ the set of all regular expression formulas with underlying set of atomic characters $A_\Sigma$ and allowed output strings $D$. It's possible that $D$ might be a singleton monoid $\{\epsilon\}$ but it should not be empty, because then no element would belong to $\Sigma^* \times D$. By inductive definition, if $\phi$ and $\psi$ are $RE^{\Sigma:D}$ formulas and $d \in D$, then union $\phi + \psi$, concatenation $\phi \cdot \psi$, Kleene closure $\phi^*$ and output concatenation $\phi : d$ are $RE^{\Sigma:D}$ formulas as well. Define $V^{\Sigma:D}$ to be the valuation function, which assigns to every $RE^{\Sigma:D}$ formula a subset of $\Sigma^* \times D$:

$V^{\Sigma:D}(\phi + \psi) = V^{\Sigma:D}(\phi) \cup V^{\Sigma:D}(\psi)$
$V^{\Sigma:D}(\phi \cdot \psi) = V^{\Sigma:D}(\phi) \cdot V^{\Sigma:D}(\psi)$
$V^{\Sigma:D}(\phi^*) = \{(\epsilon, \epsilon)\} \cup V^{\Sigma:D}(\phi) \cup V^{\Sigma:D}(\phi)^2 \cup \dots$
$V^{\Sigma:D}(\phi : d) = V^{\Sigma:D}(\phi) \cdot \{(\epsilon, d)\}$
$V^{\Sigma:D}(a) = a$ where $a \in A_\Sigma$

Some notable properties are:

$x : y_1 + x : y_2 = x : (y_1 + y_2)$
$x : \epsilon + x : y + x : y^2 + \dots = x : y^*$
$(x : y_1)(\epsilon : y_2) = x : (y_1 y_2)$
$x : (y_1 y') + x : (y_2 y') = (x : y_1 + x : y_2) \cdot (\epsilon : y')$
$x : (y' y_1) + x : (y' y_2) = (\epsilon : y') \cdot (x : y_1 + x : y_2)$

Therefore we can see that expressive power with and without : is the same.

It's also possible to extend regular expressions with weights. Let $RE^{\Sigma:D}_W$ be a superset of $RE^{\Sigma:D}$ and $W$ be the set of weight symbols. If $\phi \in RE^{\Sigma:D}_W$ and $w_1, w_2 \in W$ then $w_1 \phi$ and $\phi w_2$ are in $RE^{\Sigma:D}_W$. This allows for inserting a weight at any place. For instance, the automaton from figure 1 could be expressed using

$((\sigma_1 : d_1 d_2) w_1 (\sigma_2 : d_3) w_2 + (\sigma_1 : d_4) w_3 \sigma_2 w_4) : d_5$

The definition of $V^{\Sigma:D}(\phi w)$ depends largely on $W$, but associativity $(\phi w_1) w_2 = \phi (w_1 + w_2)$ should be preserved, given that $W$ is a multiplicative monoid. This also implies that $w_1 \epsilon w_2 = w_1 w_2$, which is semantically equivalent to the addition $w_1 + w_2$.

We showed that regular expressions for transducers can be expressed using pairs of symbols $(\sigma, \gamma)$. There is an alternative approach. We can encode both input and output strings by interleaving their symbols like $\sigma_1 \gamma_1 \sigma_2 \gamma_2$. Such regular expressions "recognize" relations rather than "generate" them. This approach has one significant problem: we have to keep track of the order. For instance, $(\sigma_1 \gamma_1 \sigma_2 + \sigma_3) \gamma_4$ is a valid interleaved expression but $(\sigma_1 \gamma_1 + \sigma_2) \gamma_2$ is not.

In order to decide whether an interleaved regular expression is valid, we should annotate every symbol with its respective alphabet (like $(\sigma_1^\Sigma \gamma_1^\Gamma \sigma_2^\Sigma + \sigma_3^\Sigma) \gamma_4^\Gamma$). Then we rewrite the expression, treating alphabets themselves as the new symbols (for instance $(\Sigma\Gamma\Sigma + \Sigma)\Gamma$). If the language recognized by such an expression is a subset of $(\Sigma\Gamma)^*$, then the interleaved expression is valid.

This leads us to introduce interleaved alphabets. We should notice that $(\Sigma\Gamma)^*$ is in fact a local language.
What it means is that in order to define an interleaved alphabet we need 3 sets: a set of initial alphabets $U$, a set of allowed 2-factors $V$ and a set of final alphabets $W$. Moreover all the elements of $U$ must be pairwise disjoint alphabets. Similarly for $V$, if $(\Sigma_0, \Sigma_1) \in V$ and $(\Sigma_0, \Sigma_2) \in V$ then $\Sigma_1$ and $\Sigma_2$ must be disjoint. (For instance, in case of $(\Sigma\Gamma)^*$ we have $U = \{\Sigma\}$, $V = \{(\Sigma, \Gamma)\}$ and $W = \{\Gamma\}$.)

With interleaved alphabets we can encode much more complex "multitape automata". In fact it has a certain resemblance to recursive algebraic data structures built from products (like $\{(\Sigma, \Gamma)\}$ in $V$) and coproducts (like $\{(\Sigma, \Gamma_1), (\Sigma, \Gamma_2)\} \subset V$). It's possible to use interleaved alphabets together with $RE^{\Sigma:D}_W$ to express multitape inputs and multitape outputs.

V. EXTENDED GLUSHKOV'S CONSTRUCTION
The core result of this paper is Glushkov's algorithm capable of producing very compact, $\epsilon$-free, weighted, ranged, functional, multitape transducers and of automatically checking whether a regular expression is valid, when given a specification of interleaved alphabets.

Let $\phi$ be some $RE^{\Sigma:D}_W$ formula. We will call $\Sigma$ the universal alphabet. We also admit several subalphabets $\Sigma_1, \Sigma_2, \dots$, all of which are subsets of $\Sigma$. Each $\Sigma_i$ admits its own set of atomic characters $A_{\Sigma_i}$ and we require that $A_{\Sigma_i} \subset A_\Sigma$. Let $U_\Sigma, V_\Sigma, W_\Sigma$ be the interleaved alphabet consisting of all the subalphabets. For example $\Sigma$ could be the set of all 64-bit integers and then $V_\Sigma$ could contain its subsets like ASCII, UNICODE or the binary alphabet $\{0, 1\}$ (possibly with offsets to ensure disjointness). In cases when $D = \Gamma^*$, we can similarly define $U_\Gamma, V_\Gamma, W_\Gamma$, but there might be cases where $D$ is a more exotic set (like real numbers) and interleaved alphabets don't make much sense. Moreover, we require $W$ to be a semiring. For instance, lexicographic weights have concatenation as the multiplicative operation and min is used for addition.

The first step of Glushkov's algorithm is to create a new alphabet $\Omega$ in which every atomic character (including duplicates but excluding $\epsilon$) in $\phi$ is treated as a new individual character. As a result we should obtain a new rewritten formula $\psi \in RE^{\Omega:D}_W$ along with a mapping $\alpha : \Omega \to A_\Sigma$. This mapping will remember the original atomic character, before it was rewritten to a unique symbol in $\Omega$. For example

$\phi = (\epsilon : x_0) x_0 (x_0 : x_1 x_3) x_3 w_0 + (x_1 x_2)^* w_1$

will be rewritten as

$\psi = (\epsilon : x_0) \omega_1 (\omega_2 : x_1 x_3) \omega_3 w_0 + (\omega_4 \omega_5)^* w_1$

with $\alpha = \{(\omega_1, x_0), (\omega_2, x_0), (\omega_3, x_3), (\omega_4, x_1), (\omega_5, x_2)\}$. Every element $x$ of $A_\Sigma$ may also be a member of several subalphabets. For simplicity we can assume that all expressions are annotated and we know exactly which subalphabet a given $x$ belongs to.
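The rewriting step that introduces $\Omega$ and $\alpha$ can be sketched as follows. This is a minimal illustration, not the implementation behind the paper: the formula is assumed to be a nested tuple with hypothetical node tags (`"atom"`, `"eps"`, `"union"`, `"concat"`, `"star"`, `"out"`, `"weight_after"`, `"weight_before"`), and the fresh symbols of $\Omega$ are modelled as integers numbered from 1 in left-to-right order.

```python
def linearize(phi, alpha=None):
    """Rewrite every atom of phi to a fresh symbol of Omega.

    Returns (psi, alpha), where alpha maps each fresh symbol back to the
    original atomic character. The tuple-based tree is an assumption.
    """
    if alpha is None:
        alpha = {}
    tag = phi[0]
    if tag == "atom":
        omega = len(alpha) + 1      # fresh symbol, duplicates get distinct ones
        alpha[omega] = phi[1]       # remember the original atomic character
        return ("atom", omega), alpha
    if tag == "eps":
        return phi, alpha           # epsilon is never rewritten
    if tag in ("union", "concat"):
        left, _ = linearize(phi[1], alpha)
        right, _ = linearize(phi[2], alpha)
        return (tag, left, right), alpha
    # Unary nodes keep the sub-formula at index 1:
    # ("star", f), ("out", f, d), ("weight_after", f, w), ("weight_before", f, w)
    inner, _ = linearize(phi[1], alpha)
    return (tag, inner) + phi[2:], alpha


# (a : "x") a + (a b)*   with the atom "a" occurring three times
phi = ("union",
       ("concat", ("out", ("atom", "a"), "x"), ("atom", "a")),
       ("star", ("concat", ("atom", "a"), ("atom", "b"))))
psi, alpha = linearize(phi)
```

In this example `alpha` maps the symbols 1, 2 and 3 back to `"a"` and the symbol 4 back to `"b"`, so repeated atoms receive distinct states later in the construction.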
In practice, we would try to infer the subalphabet annotation automatically and ask the user to manually annotate symbols only when necessary.

The next step is to define the function $\Lambda : RE^{\Omega:D}_W \rightharpoonup (D \times W)$. It returns the output produced for the empty word $\epsilon$ (if any) and the weight associated with it. (We use the symbol $\rightharpoonup$ to highlight the fact that $\Lambda$ is a partial function and may fail for ambiguous transducers.) For instance in the previous example the empty word can be matched and the returned output and weight is $(\epsilon, w_1)$. Because both $D$ and $W$ are monoids, we can treat $D \times W$ like a monoid defined as $(y_0, w_0) \cdot (y_1, w_1) = (y_0 y_1, w_0 + w_1)$. We also admit $\emptyset$ as multiplicative zero, which means that $(y_0, w_0) \cdot \emptyset = \emptyset$. We denote $W$'s neutral element as $0$. This facilitates the recursive definition:

$\Lambda(\psi_0 + \psi_1) = \Lambda(\psi_0) \cup \Lambda(\psi_1)$ if at least one of the sides is $\emptyset$, otherwise error
$\Lambda(\psi_0 \psi_1) = \Lambda(\psi_0) \cdot \Lambda(\psi_1)$
$\Lambda(\psi_0 : y) = \Lambda(\psi_0) \cdot (y, 0)$
$\Lambda(\psi_0 w) = \Lambda(\psi_0) \cdot (\epsilon, w)$
$\Lambda(w \psi_0) = \Lambda(\psi_0) \cdot (\epsilon, w)$
$\Lambda(\psi_0^*) = (\epsilon, 0)$ if $(\epsilon, w) = \Lambda(\psi_0)$ or $\emptyset = \Lambda(\psi_0)$, otherwise error
$\Lambda(\epsilon) = (\epsilon, 0)$
$\Lambda(\omega) = \emptyset$ where $\omega \in \Omega$

The next step is to define $B : RE^{\Omega:D}_W \to (\Omega \rightharpoonup D \times W)$, which for a given formula $\psi$ returns the set of $\Omega$ characters that can be found as the first in any string of $V^{\Omega:D}(\psi)$, and to each such character we associate the output produced "before" reaching it. For instance, in the previous example of $\psi$ there are two characters that can be found at the beginning: $\omega_1$ and $\omega_4$. Additionally, there is $\epsilon$ which prints output $x_0$ before reaching $\omega_1$. Therefore $(\omega_1, (x_0, 0))$ and $(\omega_4, (\epsilon, 0))$ are the result of $B(\psi)$. For better readability, we admit the operation of multiplication $\cdot : (\Omega \rightharpoonup D \times W) \times (D \times W) \to (\Omega \rightharpoonup D \times W)$ that performs monoid multiplication on all $D \times W$ elements returned by $\Omega \rightharpoonup D \times W$.
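The recursive definition of $\Lambda$ and the monoid structure on $D \times W$ described above can be sketched in Python. The sketch rests on several assumptions that are not part of the paper: formulas are nested tuples with hypothetical tags, outputs are Python strings, weights are integers combined with `+` (neutral element `0`), `None` stands for $\emptyset$, and the "error" cases raise `ValueError`.

```python
def mul(a, b):
    """Monoid product on D x W, with None (the empty set) acting as zero."""
    if a is None or b is None:
        return None
    return (a[0] + b[0], a[1] + b[1])


def empty_word_output(psi):
    """Output and weight produced for the empty word, or None if there is none."""
    tag = psi[0]
    if tag == "atom":                      # a symbol of Omega never matches epsilon
        return None
    if tag == "eps":
        return ("", 0)
    if tag == "union":
        left = empty_word_output(psi[1])
        right = empty_word_output(psi[2])
        if left is not None and right is not None:
            raise ValueError("both branches match the empty word")
        return left if left is not None else right
    if tag == "concat":
        return mul(empty_word_output(psi[1]), empty_word_output(psi[2]))
    if tag == "out":                       # ("out", formula, output string)
        return mul(empty_word_output(psi[1]), (psi[2], 0))
    if tag in ("weight_after", "weight_before"):   # (tag, formula, weight)
        return mul(empty_word_output(psi[1]), ("", psi[2]))
    if tag == "star":
        inner = empty_word_output(psi[1])
        if inner is not None and inner[0] != "":
            raise ValueError("starred formula prints output for the empty word")
        return ("", 0)
    raise ValueError("unknown node: " + str(tag))


# ((eps : "x") omega_1) + ((omega_2 omega_3)* followed by weight 7)
psi = ("union",
       ("concat", ("out", ("eps",), "x"), ("atom", 1)),
       ("weight_after", ("star", ("concat", ("atom", 2), ("atom", 3))), 7))
result = empty_word_output(psi)
```

Only the right branch of the example matches the empty word, so `result` carries the empty output together with the weight attached after the star.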
The recursive definition of $B$ is:

$B(\psi_0 + \psi_1) = B(\psi_0) \cup B(\psi_1)$
$B(\psi_0 \psi_1) = B(\psi_0) \cup \Lambda(\psi_0) \cdot B(\psi_1)$
$B(\psi_0 w) = B(\psi_0)$
$B(w \psi_0) = (\epsilon, w) \cdot B(\psi_0)$
$B(\psi_0^*) = B(\psi_0)$
$B(\psi_0 : d) = B(\psi_0)$
$B(\epsilon) = \emptyset$
$B(\omega) = \{(\omega, (\epsilon, 0))\}$

It's worth noting that $B(\psi_0) \cup B(\psi_1)$ always yields a function (instead of a relation) because every $\Omega$ character appears in $\psi$ only once and it cannot be both in $\psi_0$ and $\psi_1$.

The next step is to define $E : RE^{\Omega:D}_W \to (\Omega \rightharpoonup D \times W)$, which is very similar to $B$, except that $E$ collects characters found at the end of strings. In our example it would be $(\omega_3, (\epsilon, w_0))$ and $(\omega_5, (\epsilon, w_1))$. The recursive definition is as follows:

$E(\psi_0 + \psi_1) = E(\psi_0) \cup E(\psi_1)$
$E(\psi_0 \psi_1) = E(\psi_0) \cdot \Lambda(\psi_1) \cup E(\psi_1)$
$E(\psi_0 w) = E(\psi_0) \cdot (\epsilon, w)$
$E(w \psi_0) = E(\psi_0)$
$E(\psi_0^*) = E(\psi_0)$
$E(\psi_0 : d) = E(\psi_0) \cdot (d, 0)$
$E(\epsilon) = \emptyset$
$E(\omega) = \{(\omega, (\epsilon, 0))\}$

The next step is to use $B$ and $E$ to determine all two-character substrings that can be encountered in $V^{\Omega:D}(\psi)$. Given two functions $b, e : \Omega \rightharpoonup D \times W$ we define the product $b \times e : \Omega \times \Omega \rightharpoonup D \times W$ such that for any $(\omega_0, (y_0, w_0)) \in b$ and $(\omega_1, (y_1, w_1)) \in e$ there is $((\omega_0, \omega_1), (y_0 y_1, w_0 + w_1)) \in b \times e$. Then define $L : RE^{\Omega:D}_W \to (\Omega \times \Omega \rightharpoonup D \times W)$ as:

$L(\psi_0 + \psi_1) = L(\psi_0) \cup L(\psi_1)$
$L(\psi_0 \psi_1) = L(\psi_0) \cup L(\psi_1) \cup E(\psi_0) \times B(\psi_1)$
$L(\psi_0 w) = L(\psi_0)$
$L(w \psi_0) = L(\psi_0)$
$L(\psi_0^*) = L(\psi_0) \cup E(\psi_0) \times B(\psi_0)$
$L(\psi_0 : d) = L(\psi_0)$
$L(\epsilon) = \emptyset$
$L(\omega) = \emptyset$

One should notice that all the partial functions produced by $B$, $E$ and $L$ have finite domains, therefore they are effective objects from a computational point of view.

The last step is to use the results of $L$, $B$, $E$, $\Lambda$ and $\alpha$ to produce the automaton $(Q, q_\epsilon, W, \Sigma, D, \delta, \tau)$ with

$\delta : Q \times \Sigma \to (Q \rightharpoonup D \times W)$
$\tau : Q \rightharpoonup D \times W$
$Q = \{q_\omega : \omega \in \Omega\} \cup \{q_\epsilon\}$
$\tau = E(\psi)$
$(q_{\omega_0}, \alpha(\omega_1), q_{\omega_1}, d, w) \in \delta$ for every $(\omega_0, \omega_1, d, w) \in L(\psi)$
$(q_\epsilon, \alpha(\omega), q_\omega, d, w) \in \delta$ for every $(\omega, d, w) \in B(\psi)$

This concludes Glushkov's construction. Now it's possible to use the specification $U_\Sigma$, $V_\Sigma$ and $W_\Sigma$ of the interleaved alphabet to check if the regular expression was correct. We can treat alphabets $\Sigma_1, \Sigma_2, \dots$ as colours and then colour each state with its respective alphabet. If a transition leads from a state of colour $\Sigma_i$ to a state of colour $\Sigma_j$ then we check that the pair $(\Sigma_i, \Sigma_j)$ is indeed present in $V_\Sigma$. Similarly we check the colours of initial and accepting states.

VI. OPTIMISATIONS
The above construction can detect some obvious cases of ambiguous transducers, but it doesn't give us a complete guarantee. We can check in quadratic time [12] for weight-conflicting transitions to be sure. If there are none, then the transducer must be functional. If we find at least one, it doesn't immediately imply that the transducer is ambiguous. In such cases we can warn the user and demand additional weight annotations in the regular expression.

When $A_\Sigma$ consists of all possible ranges $\chi$, then the obtained $\delta$ is of the form $Q \times W \times \chi \times D \times Q$. While theoretically equivalent to $Q \times W \times \Sigma \times D \times Q$, in practice it allows for more efficient implementations. For instance given two ranges [1-50] and [20-80], we do not need to check equality for all numbers. The only points worth checking are 1, 20, 50, 80. Let's arrange them in some sorted array. Then given any number $x$, we can use binary search to find which of those points is closest to $x$ and then look up the full list of intervals that $x$ is a member of. This approach works even for real numbers. A more precise algorithm can be given as follows. Let $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$ be closed ranges. Build an array $P$ sorted in ascending order that contains all $y_i$ and also, for every $x_i$, contains the largest element of $\Sigma$ smaller than $x_i$ (or more generally the supremum). Every element of $P$ is then the last symbol of a block of symbols that all belong to exactly the same ranges. Build a second array $R$ whose $i$-th element is the list of ranges containing the $i$-th element of $P$. Then in order to find the ranges containing any $x$, run binary search on $P$ that returns the index of the smallest element greater than or equal to $x$ (if there is no such element, $x$ belongs to no range). Then look up the list of ranges in $R$.

In Glushkov's construction epsilons are not rewritten to $\Omega$, which means that there are also no $\epsilon$-transitions. Hence we can use dynamic programming to efficiently evaluate the automaton for any input string $x \in \Sigma^*$.
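Returning to the range lookup, the arrays $P$ and $R$ can be sketched with Python's standard `bisect` module. The sketch assumes integer symbols and closed, non-empty ranges given as `(lo, hi)` pairs; the function names are hypothetical. Each entry of $P$ is the last symbol of a block of symbols sharing the same ranges, so the search used here looks for the smallest entry of $P$ that is greater than or equal to the queried symbol.

```python
from bisect import bisect_left


def build_lookup(ranges):
    """Build the sorted array P and the parallel array R of range lists.

    ranges: list of closed integer intervals (lo, hi) with lo <= hi.
    P holds every hi and, for every lo, its predecessor lo - 1.
    """
    points = sorted({hi for _, hi in ranges} | {lo - 1 for lo, _ in ranges})
    table = [[r for r in ranges if r[0] <= p <= r[1]] for p in points]
    return points, table


def ranges_containing(points, table, x):
    """Return all ranges containing x using one binary search."""
    i = bisect_left(points, x)   # index of the smallest point >= x
    if i == len(points):         # x lies beyond every range
        return []
    return table[i]


P, R = build_lookup([(1, 50), (20, 80)])
hits = ranges_containing(P, R, 25)
```

For the two ranges [1-50] and [20-80] the array `P` becomes `[0, 19, 50, 80]`, and the query for 25 lands on the entry 50, which lies in both ranges.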
The dynamic programming evaluation is as follows. Create a two-dimensional array $c_{i,j}$ of size $|Q| \times (|x| + 1)$ where the $j$-th column represents all nondeterministically reached states after reading the first $j - 1$ symbols. Each cell should hold information about the previously used transition. This also tells us the weight, output and source state of the transition. For instance cell $c_{i,j} = k$ should encode a transition coming from state $k$ to state $i$, after reading the $(j-1)$-th symbol. If state $q_i \in Q$ does not belong to the $j$-th superposition, then $c_{i,j} = \emptyset$. The first column is initialized with an arbitrary value $c_{i,1} = -1$ for $i$ referring to the initial state $q_\epsilon$ and set to $c_{i,1} = \emptyset$ for all other $i$. Then the algorithm progresses, building the next column from the previous one. After filling out the entire array, the last column should be checked for any accepting states according to $\tau$. There might be many of them, but the one with the best weight should be chosen. If we checked that the automaton has no weight-conflicting transitions, then there should always be exactly one best weight. Finally we can backtrack to find out which path "won". This will determine what outputs need to be concatenated together to obtain the path's output. This algorithm runs in $O(|Q| \cdot |x|)$ iterations, but in practice each iteration itself is very efficient, especially when combined with the binary search described above. By observing that states of automata are often sparsely connected, an additional optimisation can be made by representing the two-dimensional array with a list of indices, as is often done for sparse matrices.

VII. CONCLUSION
Interleaved alphabets could find numerous applications with many possible extensions. In the context of natural language processing, they could be used to annotate human sentences with linguistic meta-information like parts of speech. Then transducers could be built to take advantage of those tags. Using grammatical inference methods, one could also train such transducers to act as POS taggers.

This approach cannot fully replace OpenFst, because it lacks its flexibility. The goal of OpenFst is to provide a general and extensible implementation of many different transducers, whereas the approach presented in this paper sacrifices extensibility for a highly integrated design and optimal efficiency. For instance, Glushkov's algorithm could never support such operations as inverses, projections, reverses or composition.

ACKNOWLEDGMENT
The authors would like to thank Piotr Radwan for all the inspiration.

REFERENCES

[1] M. Mohri, "Weighted finite-state transducer algorithms: an overview," AT&T Labs, 2004.
[2] M. Droste, W. Kuich, and H. Vogler, Handbook of Weighted Automata, Jan. 2009.
[3] M. Droste and D. Kuske, "Weighted automata," Institut für Informatik, Universität Leipzig, 2010.
[4] A. Mendoza-Drosik, "Multitape automata and finite state transducers with lexicographic weights," ArXiv, vol. abs/2007.12940, 2020.
[5] V. M. Glushkov, The abstract theory of automata. Russian Mathematics Surveys, 1961.
[6] M. Mohri, F. Pereira, and M. Riley, "Weighted finite-state transducers in speech recognition," AT&T Labs Research, 2008.
[7] M. Mohri, Weighted Finite-State Transducer Algorithms. An Overview. Springer, 2004.
[8] H. I. Akram, C. de la Higuera, and C. Eckert, "Actively learning probabilistic subsequential transducers," JMLR: Workshop and Conference Proceedings, 2012.
[9] C. de la Higuera, Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, 2010.
[10] K. Meer and A. Naif, "Generalized finite automata over real and complex numbers," vol. 591, Apr. 2014.
[11] A. Gandhi, B. Khoussainov, and J. Liu, "Finite automata over structures," in Theory and Applications of Models of Computation, M. Agrawal, S. B. Cooper, and A. Li, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 373–384.
[12] M.-P. Béal, O. Carton, C. Prieur, and J. Sakarovitch, "Squaring transducers: An efficient procedure for deciding functionality and sequentiality of transducers," in