Glushkov's construction for functional subsequential transducers

Abstract

Glushkov's construction has many interesting properties, and they become even more evident when applied to transducers. This article strives to show the vast range of possible extensions and optimisations for this algorithm. A special flavour of regular expressions is introduced, which can be efficiently converted to $\epsilon$-free functional subsequential weighted finite state transducers. The produced automata are very compact, as they contain only one state for each symbol (from the input alphabet) of the original expression and only one transition for each range of symbols, no matter how large. Such compactified ranges of transitions allow for efficient binary search lookup during automaton evaluation. All the methods and algorithms presented here were used to implement an open-source compiler of regular expressions for multitape transducers.

Aleksander Mendoza-Drosik

Keywords: weighted automata, transducers, Glushkov, follow automata, regular expressions

I. INTRODUCTION

There are not many open-source solutions available for working with transducers. The most significant and widely used library is OpenFst. Its approach is based on the theory of weighted automata [1][2][3]. Here we propose an alternative approach founded on lexicographic transducers [4] and Glushkov's algorithm [5].

Let $W$ be some set of weight symbols. The free monoid $W^*$ will be our set of weight strings. We assume there is a lexicographic order defined as

$$b_1 w_1 > b_2 w_2 \iff w_1 > w_2 \text{ or } (w_1 = w_2 \text{ and } b_1 > b_2)$$

where $w_1, w_2 \in W$ and $b_1, b_2 \in W^*$. The order is defined only on strings of equal length. Let $\Sigma$ be the input alphabet, $\Sigma^*$ the monoid of input strings and $D$ the monoid of output strings. A lexicographic transducer is defined as a tuple $(Q, I, W, \Sigma, D, \delta, \tau)$ where $Q$ is a finite set of states, $I$ is the set of initial states, $\tau$ is a (partial) state output function $Q \to D \times W$ and $\delta$ is the set of transitions, $\delta \subset Q \times W \times \Sigma \times D \times Q$.

Thanks to $\tau$, such machines are subsequential [6][7][8][9]. As an example consider the simple transducer from Figure 1. Every state except the accepting one has no output, which can be denoted with $\tau(q) = \emptyset$. Every time the automaton finishes reading the input string in the accepting state, it appends that state's output and then accepts. For instance, on a two-symbol input it first reads the first symbol, produces two output symbols and moves to the next state, then reads the second symbol, appends one more output symbol and moves to the accepting state, and finally, on reaching the end of input, appends the state output and accepts. The total output is therefore four symbols long. Note that the automaton is nondeterministic, as it could take an alternative route through a different state and produce a two-symbol output instead. In such scenarios weights are used to disambiguate the output. Each route produces a weight string of three symbols, and according to our definition of lexicographic order the weight string of the first route is greater than that of the second (under the assumed ordering of the weight symbols involved).
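The lexicographic order on weight strings defined above compares the last symbols first and falls back to the remaining prefixes only on a tie. A minimal Python sketch of that comparison, assuming weight symbols are integers and weight strings are sequences of them (the name `lex_greater` is hypothetical and chosen only for this illustration):

```python
def lex_greater(a, b):
    """Return True if weight string a is greater than weight string b.

    Assumption: weight symbols are integers; a and b are sequences of them.
    """
    if len(a) != len(b):
        raise ValueError("the order is defined only on strings of equal length")
    # The last symbol is the most significant one, so scan from the end.
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            return x > y
    return False  # identical strings: neither is greater


# Usage: the final symbol decides, earlier symbols only break ties.
first_route = (5, 1, 2)
second_route = (0, 9, 1)
print(lex_greater(first_route, second_route))
```

Because earlier symbols matter only when all later ones agree, two routes can be compared by walking their weights backwards and stopping at the first difference.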
Throughout this article we will consider smaller weights to be "better". Hence the automaton should choose the output of the second route as the definitive output for this input. There might be situations in which two different routes have exactly the same (equally highest) weight while also producing different outputs. In such cases the automaton is ambiguous and produces multiple outputs for one input.

II. EXPRESSIVE POWER

There are some remarks to be made about lexicographic transducers. They recognize relations on languages, unlike "plain" finite state automata (FSA), which recognize languages. If $M$ is some transducer, then we denote its recognized relation by $L(M)$. Those relations are subsets of $\Sigma^* \times D$. The set of strings from $\Sigma^*$ accepted by $M$ must be a regular language (indeed, if we erased the output labels, we would obtain an FSA). The weights are erasable [4] in the sense that, given any lexicographic transducer, we can always build an equivalent automaton without weighted transitions. If we didn't have $\tau$, the only output that could be expressed for the empty input would be the empty string as well. With $\tau$ we can express pairs like $(\epsilon, d) \in L$ where $d \neq \epsilon$.

The transducers can return at most finitely many outputs for any given input (see infinite superposition [4]). If we allowed $\epsilon$-transitions (transitions that have $\epsilon$ as input label) we could build $\epsilon$-cycles and produce infinitely many outputs. However, automata that do so are not very interesting from a practical point of view. Therefore we shall focus only on functional transducers, that is, those which produce at most one output. If an automaton does not have any $\epsilon$-cycles and is functional, then it is possible to erase all $\epsilon$-transitions (note that this would not be possible without $\tau$, because $\epsilon$-transitions allow for producing output given empty input). Therefore $\epsilon$-transitions don't increase the power of functional transducers.

We say that a transducer has conflicting states $q_1$ and $q_2$ if it is possible to reach both of them simultaneously (there are two possible routes with the same inputs and weights) given some input, and there is another state $q_3$ to which both of those states can transition over the same input symbol $\sigma_i$. Alternatively, there might be no third state $q_3$, but instead both $q_1$ and $q_2$ have non-empty $\tau$ output (so $\tau$ can in a sense be treated like $q_3$). We say that transitions $(q_1, \sigma_i, w, d, q_3)$ and $(q_2, \sigma_i, w', d, q_3)$ are weight-conflicting if they have equal weights $w = w'$. For instance, in Figure 1 two of the states are indeed conflicting, because they both transition to the same state over the same symbol, but their transitions are not weight-conflicting. It can be shown that transducers without weight-conflicting transitions are functional. Moreover, if a transducer is functional but contains weight-conflicting transitions, then the weights can be reassigned in such a way that all conflicts are eliminated [4]. The only requirement is that there are enough symbols in $W$ (for instance, if $W$ had only one symbol, then all transitions of conflicting states would always be weight-conflicting). If there are at least as many weight symbols as there are states, $|W| \geq |Q|$, then every functional transducer on $|Q|$ states can be built without weight-conflicting transitions. For convenience we can assume that $W = \mathbb{N}$, but in practice all algorithms presented here will work with bounded $W$. Hence transducers without weight-conflicting transitions are equivalent in power to functional transducers. This is important because by searching for weight-conflicting transitions we can efficiently test whether a transducer is functional or not.

Fig. 1. Example of a lexicographic transducer. One state is initial and one state is accepting, in the sense that $\tau$ assigns it a pair $(w, d)$. The remaining states have state output $\emptyset$.

III. RANGED AUTOMATA

Often when implementing automata, the algorithm behind the $\delta$ function needs to efficiently find the right transition for a given symbol $\sigma$. It is beneficial to optimise UNIX-style ranges like [0-9] or [a-z], as they arise often in practical settings. Even the . wildcard can be treated as one large range spanning the entire $\Sigma$. If the alphabet is large (like ASCII or UNICODE), then checking every symbol in a loop is not feasible. A significant improvement can be made by checking only two inequalities like $\sigma_1 \leq x \leq \sigma_2$ instead of a large number of equalities. The current paper presents a way in which a simplified model of $(S, k)$-automata [10][11] can be used to obtain major improvements. In particular we consider only automata that don't have any registers apart from constant values, that is $k = 0$. Therefore we provide a more specialized definition of "ranged automata".

Let $\Sigma$ be the (not necessarily finite) alphabet of the automaton. Let $\chi$ be a set of subsets of $\Sigma$ that we will call ranges of $\Sigma$. Let $\overline{\chi}$ be the closure of $\chi$ under countable union and complementation (so it forms a sigma algebra). For instance, imagine that there is a total order on $\Sigma$ and $\chi$ is the set of all intervals in $\Sigma$. Now we want to build an automaton whose transitions are not labelled with symbols from $\Sigma$, but rather with ranges from $\chi$. The union $\chi_1 \cup \chi_2$ of two elements from $\chi$ "semantically" corresponds to putting two edges, $(q, \chi_1, q') \in \delta$ (for a moment forget about outputs and weights) and $(q, \chi_2, q') \in \delta$. There is no limitation on the size of $\delta$. It might be countably infinite, hence it is natural that $\overline{\chi}$ should be closed under countable union. Therefore, $\chi$ is the set of allowed transition labels and $\overline{\chi}$ is the set of all possible "semantic" transitions. We could say that $\overline{\chi}$ is discrete if it contains every subset of $\Sigma$. An example of a discrete $\overline{\chi}$ would be a finite set $\Sigma$ with all UNIX-style ranges $[\sigma$-$\sigma']$ included in $\chi$.

Another example would be the set $\Sigma = \mathbb{R}$ with $\chi$ consisting of all ranges whose ends are computable real numbers (a real number $x$ is computable if the predicate $q < x$ is decidable for all rational numbers $q$). If we also restricted $\delta$ to be a finite set, then we could build effective automata that work with real numbers of arbitrary precision.

IV. REGULAR EXPRESSIONS

Here we describe a flavour of regular expressions specifically extended to interplay with lexicographic transducers and ranged automata.

Transducers with input $\Sigma^*$ and output $\Gamma^*$ can be seen as FSA working with a single input $\Sigma^* \times \Gamma^*$. Therefore we can treat every pair of symbols $(\sigma, \gamma)$ as an atomic formula of regular expressions for transducers. We can use the concatenation $(\sigma, \gamma_1)(\epsilon, \gamma_2)$ to represent $(\sigma, \gamma_1 \gamma_2)$. It is possible to create ambiguous transducers with unions like $(\epsilon, \gamma_1) + (\epsilon, \gamma_2)$. To make notation easier, we will treat every $\sigma$ as $(\sigma, \epsilon)$ and every $\gamma$ as $(\epsilon, \gamma)$. Then instead of writing the lengthy $(\sigma, \epsilon)(\epsilon, \gamma)$ we can introduce the shortened notation $\sigma : \gamma$. Because we would like to avoid ambiguous transducers, we impose the restriction that the right side of $:$ should always be a string of $\Gamma^*$, and writing entire formulas there (like $\sigma : \gamma_1 + \gamma_2^*$) is not allowed. This restriction will later simplify Glushkov's algorithm.

We define $A_\Sigma$ to be the set of atomic characters. For instance, we could choose $A_\Sigma = \Sigma \cup \{\epsilon\}$ for FSA/transducers and $A_\Sigma = \chi$ for ranged automata.

We call $RE^{\Sigma:D}$ the set of all regular expression formulas with underlying set of atomic characters $A_\Sigma$ and allowed output strings $D$. It is possible that $D$ might be a singleton monoid $\{\epsilon\}$, but it should not be empty, because then no element would belong to $\Sigma^* \times D$. By inductive definition, if $\phi$ and $\psi$ are $RE^{\Sigma:D}$ formulas and $d \in D$, then union $\phi + \psi$, concatenation $\phi \cdot \psi$, Kleene closure $\phi^*$ and output concatenation $\phi : d$ are $RE^{\Sigma:D}$ formulas as well. Define $V^{\Sigma:D} : RE^{\Sigma:D} \to \Sigma^* \times D$ to be the valuation function:

$$
\begin{aligned}
V^{\Sigma:D}(\phi + \psi) &= V^{\Sigma:D}(\phi) \cup V^{\Sigma:D}(\psi) \\
V^{\Sigma:D}(\phi \cdot \psi) &= V^{\Sigma:D}(\phi) \cdot V^{\Sigma:D}(\psi) \\
V^{\Sigma:D}(\phi^*) &= (\epsilon, \epsilon) + V^{\Sigma:D}(\phi) + V^{\Sigma:D}(\phi)^2 + \dots \\
V^{\Sigma:D}(\phi : d) &= V^{\Sigma:D}(\phi) \cdot (\epsilon, d) \\
V^{\Sigma:D}(a) &= a \quad \text{where } a \in A^{\Sigma:D}
\end{aligned}
$$

Some notable properties are:

$$
\begin{aligned}
x : y_1 + x : y_2 &= x : (y_1 + y_2) \\
x : \epsilon + x : y + x : y^2 + \dots &= x : y^* \\
(x : y_1)(\epsilon : y_2) &= x : (y_1 y_2) \\
x : (y_1 y') + x : (y_2 y') &= (x : y_1 + x : y_2) \cdot (\epsilon : y') \\
x : (y' y_1) + x : (y' y_2) &= (\epsilon : y') \cdot (x : y_1 + x : y_2)
\end{aligned}
$$

Therefore we can see that the expressive power with and without $:$ is the same.

It is also possible to extend regular expressions with weights. Let $RE^{\Sigma:D}_W$ be a superset of $RE^{\Sigma:D}$ and $W$ be the set of weight symbols. If $\phi \in RE^{\Sigma \to D}_W$ and $w_1, w_2 \in W$ then $w_1 \phi$ and $\phi w_2$ are in $RE^{\Sigma \to D}_W$. This allows for inserting a weight at any place. For instance, the automaton from Figure 1 could be expressed using

(( σ : d d ) w ( σ : d ) w + ( σ : d ) w σ w ) : d

The definition of $V^{\Sigma:D}(\phi w)$ depends largely on $W$, but associativity $(\phi w_1) w_2 = \phi (w_1 + w_2)$ should be preserved, given that $W$ is a multiplicative monoid. This also implies that $w_1 \epsilon w_2 = w_1 w_2$, which is semantically equivalent to the addition $w_1 + w_2$.

We showed that regular expressions for transducers can be expressed using pairs of symbols $(\sigma, \gamma)$. There is an alternative approach. We can encode both the input and the output string by interleaving their symbols like $\sigma \gamma \sigma \gamma$. Such regular expressions "recognize" relations rather than "generate" them. This approach has one significant problem: we have to keep track of the order. For instance, $(\sigma \gamma \sigma + \sigma)\gamma$ is a valid interleaved expression but $(\sigma \gamma + \sigma)\gamma$ is not.

In order to decide whether an interleaved regular expression is valid, we should annotate every symbol with its respective alphabet (like $(\sigma^{\Sigma}_1 \gamma^{\Gamma}_1 \sigma^{\Sigma}_2 + \sigma^{\Sigma}_3)\gamma^{\Gamma}_4$). Then we rewrite the expression, treating the alphabets themselves as the new symbols (for instance $(\Sigma\Gamma\Sigma + \Sigma)\Gamma$). If the language recognized by such an expression is a subset of $(\Sigma\Gamma)^*$, then the interleaved expression is valid.

This leads us to introduce interleaved alphabets. We should notice that $(\Sigma\Gamma)^*$ is in fact a local language.
What it means is that in order to define an interleaved alphabet we need three sets: a set of initial alphabets $U$, a set of allowed 2-factors $V$ and a set of final alphabets $W$. Moreover, all the elements of $U$ must be pairwise disjoint alphabets. Similarly for $V$, if $(\Sigma_1, \Sigma_2) \in V$ and $(\Sigma_1, \Sigma_3) \in V$ then $\Sigma_2$ and $\Sigma_3$ must be disjoint. (For instance, in the case of $(\Sigma\Gamma)^*$ we have $U = \{\Sigma\}$, $V = \{(\Sigma, \Gamma)\}$ and $W = \{\Gamma\}$.)

With interleaved alphabets we can encode much more complex "multitape automata". In fact, this bears a certain resemblance to recursive algebraic data structures built from products (like $\{(\Sigma, \Gamma)\}$ in $V$) and coproducts (like $\{(\Sigma, \Gamma_1), (\Sigma, \Gamma_2)\} \subset V$). It is possible to use interleaved alphabets together with $RE^{\Sigma:D}_W$ to express multitape inputs and multitape outputs.

V. EXTENDED GLUSHKOV'S CONSTRUCTION

The core result of this paper is a version of Glushkov's algorithm capable of producing very compact, $\epsilon$-free, weighted, ranged, functional, multitape transducers and of automatically checking whether a regular expression is valid, given a specification of interleaved alphabets.

Let $\phi$ be some $RE^{\Sigma:D}_W$ formula. We will call $\Sigma$ the universal alphabet. We also admit several subalphabets $\Sigma_1, \Sigma_2, \dots$, all of which are subsets of $\Sigma$. Each $\Sigma_i$ admits its own set of atomic characters $A_{\Sigma_i}$ and we require that $A_{\Sigma_i} \subset A_\Sigma$. Let $U_\Sigma, V_\Sigma, W_\Sigma$ be the interleaved alphabet consisting of all the subalphabets. For example, $\Sigma$ could be the set of all 64-bit integers and then $V_\Sigma$ could contain its subsets like ASCII, UNICODE or the binary alphabet $\{0, 1\}$ (possibly with offsets to ensure disjointness). In cases when $D = \Gamma^*$, we can similarly define $U_\Gamma, V_\Gamma, W_\Gamma$, but there might be cases where $D$ is a more exotic set (like real numbers) and interleaved alphabets don't make much sense. Moreover, we require $W$ to be a semiring. For instance, lexicographic weights have concatenation as the multiplicative operation and min is used for addition.

The first step of Glushkov's algorithm is to create a new alphabet $\Omega$ in which every atomic character (including duplicates but excluding $\epsilon$) in $\phi$ is treated as a new individual character. As a result we obtain a rewritten formula $\psi \in RE^{\Omega \to D}_W$ along with a mapping $\alpha : \Omega \to A_\Sigma$. This mapping remembers the original atomic character, before it was rewritten to a unique symbol in $\Omega$. For example

$$\phi = (\epsilon : x)\, x\, (x : x\, x)\, x\, w + (x\, x)^* w$$

will be rewritten as

$$\psi = (\epsilon : x)\, \omega_1 (\omega_2 : x\, x)\, \omega_3 w + (\omega_4 \omega_5)^* w$$

with $\alpha = \{(\omega_1, x), (\omega_2, x), (\omega_3, x), (\omega_4, x), (\omega_5, x)\}$, where the characters of $\Omega$ are numbered from left to right.

Every element $x$ of $A_\Sigma$ may also be a member of several subalphabets. For simplicity we can assume that all expressions are annotated and we know exactly which subalphabet a given $x$ belongs to.
In practice, we would try to infer the annotation automatically and ask the user to annotate symbols manually only when necessary.

The next step is to define the function $\Lambda : RE^{\Omega \to D}_W \rightharpoonup (D \times W)$. It returns the output produced for the empty word $\epsilon$ (if any) and the weight associated with it. (We use the symbol $\rightharpoonup$ to highlight the fact that $\Lambda$ is a partial function and may fail for ambiguous transducers.) For instance, in the previous example the empty word can be matched and the returned output and weight is $(\epsilon, w)$. Because both $D$ and $W$ are monoids, we can treat $D \times W$ as a monoid defined by $(y_1, w_1) \cdot (y_2, w_2) = (y_1 y_2, w_1 + w_2)$. We also admit $\emptyset$ as a multiplicative zero, which means that $(y_1, w_1) \cdot \emptyset = \emptyset$. We denote the neutral element of $W$ as $0$. This facilitates the recursive definition:

$$
\begin{aligned}
\Lambda(\psi_1 + \psi_2) &= \Lambda(\psi_1) \cup \Lambda(\psi_2) \text{ if at least one of the sides is } \emptyset, \text{ otherwise error} \\
\Lambda(\psi_1 \psi_2) &= \Lambda(\psi_1) \cdot \Lambda(\psi_2) \\
\Lambda(\psi : y) &= \Lambda(\psi) \cdot (y, 0) \\
\Lambda(\psi w) &= \Lambda(\psi) \cdot (\epsilon, w) \\
\Lambda(w \psi) &= \Lambda(\psi) \cdot (\epsilon, w) \\
\Lambda(\psi^*) &= (\epsilon, 0) \text{ if } \Lambda(\psi) = (\epsilon, w) \text{ or } \Lambda(\psi) = \emptyset, \text{ otherwise error} \\
\Lambda(\epsilon) &= (\epsilon, 0) \\
\Lambda(\omega) &= \emptyset \quad \text{where } \omega \in \Omega
\end{aligned}
$$

The next step is to define $B : RE^{\Omega \to D}_W \to (\Omega \rightharpoonup D \times W)$, which for a given formula $\psi$ returns the set of $\Omega$ characters that can be found as the first character in any string of $V^{\Omega \to D}(\psi)$, and with each such character we associate the output produced "before" reaching it. For instance, numbering the $\Omega$ characters of $\psi$ in the previous example from left to right, there are two characters that can be found at the beginning: $\omega_1$ and $\omega_4$. Additionally, there is $\epsilon$, which prints the output $x$ before reaching $\omega_1$. Therefore $(\omega_1, (x, 0))$ and $(\omega_4, (\epsilon, 0))$ are the result of $B(\psi)$. For better readability, we admit the operation of multiplication $\cdot : (\Omega \rightharpoonup D \times W) \times (D \times W) \to (\Omega \rightharpoonup D \times W)$ that performs monoid multiplication on all $D \times W$ elements returned by $\Omega \rightharpoonup D \times W$.
$$
\begin{aligned}
B(\psi_1 + \psi_2) &= B(\psi_1) \cup B(\psi_2) \\
B(\psi_1 \psi_2) &= B(\psi_1) \cup \Lambda(\psi_1) \cdot B(\psi_2) \\
B(\psi w) &= B(\psi) \\
B(w \psi) &= (\epsilon, w) \cdot B(\psi) \\
B(\psi^*) &= B(\psi) \\
B(\psi : d) &= B(\psi) \\
B(\epsilon) &= \emptyset \\
B(\omega) &= \{(\omega, (\epsilon, 0))\}
\end{aligned}
$$

It is worth noting that $B(\psi_1) \cup B(\psi_2)$ always yields a function (instead of a relation) because every $\Omega$ character appears in $\psi$ only once and it cannot be both in $\psi_1$ and $\psi_2$.

The next step is to define $E : RE^{\Omega \to D}_W \to (\Omega \rightharpoonup D \times W)$, which is very similar to $B$, except that $E$ collects characters found at the end of strings. In our example, numbering the $\Omega$ characters of $\psi$ from left to right, it would be $(\omega_3, (\epsilon, w))$ and $(\omega_5, (\epsilon, w))$. The recursive definition is as follows:

$$
\begin{aligned}
E(\psi_1 + \psi_2) &= E(\psi_1) \cup E(\psi_2) \\
E(\psi_1 \psi_2) &= E(\psi_1) \cdot \Lambda(\psi_2) \cup E(\psi_2) \\
E(\psi w) &= E(\psi) \cdot (\epsilon, w) \\
E(w \psi) &= E(\psi) \\
E(\psi^*) &= E(\psi) \\
E(\psi : d) &= E(\psi) \cdot (d, 0) \\
E(\epsilon) &= \emptyset \\
E(\omega) &= \{(\omega, (\epsilon, 0))\}
\end{aligned}
$$

The next step is to use $B$ and $E$ to determine all two-character substrings that can be encountered in $V^{\Omega \to D}(\psi)$. Given two functions $b, e : \Omega \rightharpoonup D \times W$ we define the product $b \times e : \Omega \times \Omega \rightharpoonup D \times W$ such that for any $(\omega_1, (y_1, w_1)) \in b$ and $(\omega_2, (y_2, w_2)) \in e$ there is $((\omega_1, \omega_2), (y_1 y_2, w_1 + w_2)) \in b \times e$. Then define $L : RE^{\Omega \to D}_W \to (\Omega \times \Omega \rightharpoonup D \times W)$ as:

$$
\begin{aligned}
L(\psi_1 + \psi_2) &= L(\psi_1) \cup L(\psi_2) \\
L(\psi_1 \psi_2) &= L(\psi_1) \cup L(\psi_2) \cup E(\psi_1) \times B(\psi_2) \\
L(\psi w) &= L(\psi) \\
L(w \psi) &= L(\psi) \\
L(\psi^*) &= L(\psi) \cup E(\psi) \times B(\psi) \\
L(\psi : d) &= L(\psi) \\
L(\epsilon) &= \emptyset \\
L(\omega) &= \emptyset
\end{aligned}
$$

One should notice that all the partial functions produced by $B$, $E$ and $L$ have finite domains, therefore they are effective objects from a computational point of view.

The last step is to use the results of $L$, $B$, $E$, $\Lambda$ and $\alpha$ to produce the automaton $(Q, q_\epsilon, W, \Sigma, D, \delta, \tau)$ with

$$
\begin{aligned}
&\delta : Q \times \Sigma \to (Q \rightharpoonup D \times W) \\
&\tau : Q \rightharpoonup D \times W \\
&Q = \{q_\omega : \omega \in \Omega\} \cup \{q_\epsilon\} \\
&\tau = E(\psi) \\
&(q_{\omega_1}, \alpha(\omega_2), q_{\omega_2}, d, w) \in \delta \text{ for every } (\omega_1, \omega_2, d, w) \in L(\psi) \\
&(q_\epsilon, \alpha(\omega), q_\omega, d, w) \in \delta \text{ for every } (\omega, d, w) \in B(\psi)
\end{aligned}
$$

This concludes Glushkov's construction. Now it is possible to use the specification $U_\Sigma$, $V_\Sigma$ and $W_\Sigma$ of the interleaved alphabet to check whether the regular expression was correct. We can treat the alphabets $\Sigma_1, \Sigma_2, \dots$ as colours and then colour each state with its respective alphabet. If a transition leads from a state of colour $\Sigma_i$ to a state of colour $\Sigma_j$, then we check that the pair $(\Sigma_i, \Sigma_j)$ is indeed present in $V_\Sigma$. Similarly we check the colours of initial and accepting states.

VI. OPTIMISATIONS
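Before turning to optimisations, the functions $\Lambda$, $B$, $E$ and $L$ of Section V can be recapped in code. The following Python sketch is only an illustration under stated assumptions: formulas are nested tuples with hypothetical tags (`"eps"`, `"sym"`, `"union"`, `"cat"`, `"star"`, `"out"` for $\psi : d$, `"wr"` for $\psi w$, `"wl"` for $w \psi$), outputs are Python strings, weights are integers combined by addition with neutral element 0, and `None` stands for $\emptyset$. In the concatenation case of $E$ it uses $E(\psi_2)$ for the second operand.

```python
# Values of D x W are (output string, integer weight); None stands for the empty set.
def mul(p, q):
    return None if p is None or q is None else (p[0] + q[0], p[1] + q[1])

def lscale(v, f):  # multiply every value of the partial function f by v on the left
    return {} if v is None else {k: mul(v, x) for k, x in f.items()}

def rscale(f, v):  # multiply every value of the partial function f by v on the right
    return {} if v is None else {k: mul(x, v) for k, x in f.items()}

def product(e, b):  # pairs (last character, first character) with combined values
    return {(p, q): mul(x, y) for p, x in e.items() for q, y in b.items()}

def lam(t):  # Lambda: output and weight for the empty word, or None
    k = t[0]
    if k == "eps": return ("", 0)
    if k == "sym": return None
    if k == "cat": return mul(lam(t[1]), lam(t[2]))
    if k == "out": return mul(lam(t[1]), (t[2], 0))
    if k == "wr": return mul(lam(t[1]), ("", t[2]))
    if k == "wl": return mul(lam(t[2]), ("", t[1]))
    if k == "union":
        a, b = lam(t[1]), lam(t[2])
        if a is not None and b is not None:
            raise ValueError("ambiguous output for the empty word")
        return a if a is not None else b
    if k == "star":
        a = lam(t[1])
        if a is not None and a[0] != "":
            raise ValueError("non-empty output for the empty word under *")
        return ("", 0)
    raise ValueError(k)

def first(t):  # B: characters that can start a string
    k = t[0]
    if k == "eps": return {}
    if k == "sym": return {t[1]: ("", 0)}
    if k == "union": return {**first(t[1]), **first(t[2])}
    if k == "cat": return {**first(t[1]), **lscale(lam(t[1]), first(t[2]))}
    if k == "wl": return lscale(("", t[1]), first(t[2]))
    return first(t[1])  # "wr", "star", "out"

def last(t):  # E: characters that can end a string
    k = t[0]
    if k == "eps": return {}
    if k == "sym": return {t[1]: ("", 0)}
    if k == "union": return {**last(t[1]), **last(t[2])}
    if k == "cat": return {**rscale(last(t[1]), lam(t[2])), **last(t[2])}
    if k == "wr": return rscale(last(t[1]), ("", t[2]))
    if k == "out": return rscale(last(t[1]), (t[2], 0))
    if k == "wl": return last(t[2])
    return last(t[1])  # "star"

def follow(t):  # L: two-character factors
    k = t[0]
    if k in ("eps", "sym"): return {}
    if k == "union": return {**follow(t[1]), **follow(t[2])}
    if k == "cat":
        return {**follow(t[1]), **follow(t[2]), **product(last(t[1]), first(t[2]))}
    if k == "star": return {**follow(t[1]), **product(last(t[1]), first(t[1]))}
    if k == "wl": return follow(t[2])
    return follow(t[1])  # "wr", "out"

def glushkov(t, alpha):
    """Return (delta, tau); the initial state q_epsilon is represented by None."""
    delta = [(None, alpha[o], o, d, w) for o, (d, w) in first(t).items()]
    delta += [(p, alpha[q], q, d, w) for (p, q), (d, w) in follow(t).items()]
    return delta, last(t)

# Usage: the rewritten formula (1 : "A") (2* 3), where 1 and 2 play the role of
# Omega characters standing for the input symbols x and y.
psi = ("cat", ("out", ("sym", 1), "A"), ("wr", ("star", ("sym", 2)), 3))
delta, tau = glushkov(psi, {1: "x", 2: "y"})
```

Transitions are tuples (source, input label, target, output, weight), mirroring the order used in the definition of $\delta$. Representing the partial functions as dictionaries keyed by $\Omega$ characters keeps unions cheap, since every character occurs in the formula only once.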

The above construction can detect some obvious cases of ambiguous transducers, but it doesn't give us a complete guarantee. We can check in quadratic time [12] for weight-conflicting transitions to be sure. If there are none, then the transducer must be functional. If we find at least one, it doesn't immediately imply that the transducer is ambiguous. In such cases we can warn the user and demand additional weight annotations in the regular expression.

When $A_\Sigma$ consists of all possible ranges $\chi$, then the obtained $\delta$ is of the form $Q \times W \times \chi \times D \times Q$. While theoretically equivalent to $Q \times W \times \Sigma \times D \times Q$, in practice it allows for more efficient implementations. For instance, given two ranges [1-50] and [20-80], we do not need to check equality for all numbers. The only points worth checking are the boundaries of those ranges. Let's arrange them in a sorted array. Then, given any number $x$, we can use binary search to find which of those points is closest to $x$ and then look up the full list of intervals that $x$ is a member of. This approach works even for real numbers. A more precise algorithm can be given as follows. Let $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ be closed ranges. Build an array $P$ sorted in ascending order that contains all $y_i$ and also, for every $x_i$, the largest element of $\Sigma$ smaller than $x_i$ (or more generally the supremum). Build a second array $R$ whose $i$-th element is the list of ranges containing the $i$-th element of $P$. Then, in order to find the ranges containing any $x$, run binary search on $P$ to find the index of the smallest element greater than or equal to $x$, and look up the list of ranges stored in $R$ at that index. If no such element exists, $x$ belongs to no range.

In Glushkov's construction epsilons are not rewritten to $\Omega$, which means that there are also no $\epsilon$-transitions. Hence we can use dynamic programming to efficiently evaluate the automaton for any input string $x \in \Sigma^*$.
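The range lookup with arrays $P$ and $R$ described earlier in this section can be sketched in Python as follows. This is a minimal sketch assuming an integer alphabet, so that the largest element smaller than $x_i$ is simply $x_i - 1$; the names `build_index` and `lookup` are hypothetical. Each stored point is the right end of a block of symbols that all belong to the same ranges, so the search looks for the smallest stored point that is not less than the queried symbol.

```python
from bisect import bisect_left

def build_index(ranges):
    """ranges: list of closed integer intervals (lo, hi).

    Returns the sorted array of points P and the array R, where R[i] lists
    the ranges that contain P[i].
    """
    points = sorted({hi for _, hi in ranges} | {lo - 1 for lo, _ in ranges})
    table = [[r for r in ranges if r[0] <= p <= r[1]] for p in points]
    return points, table

def lookup(points, table, x):
    """Return all ranges containing x using a single binary search."""
    i = bisect_left(points, x)  # index of the smallest point >= x
    return table[i] if i < len(points) else []

# Usage with the two ranges from the text.
points, table = build_index([(1, 50), (20, 80)])
print(lookup(points, table, 30))
```

The index is built once per state, after which every transition lookup costs one binary search over the boundary points instead of one comparison per symbol of the alphabet.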
The evaluation algorithm is as follows. Create a two-dimensional array $c_{i,j}$ of size $|Q| \times (|x| + 1)$ where the $j$-th column represents all nondeterministically reached states after reading the first $j - 1$ symbols. Each cell should hold information about the previously used transition. This also tells us the weight, output and source state of the transition. For instance, cell $c_{i,j} = k$ should encode a transition coming from state $k$ to state $i$ after reading the $(j-1)$-th symbol. If state $q_i \in Q$ does not belong to the $j$-th superposition, then $c_{i,j} = \emptyset$. The first column is initialised with an arbitrary non-empty value in the row $i$ referring to the initial state $q_\epsilon$ and with $\emptyset$ in all other rows. Then the algorithm progresses, building each next column from the previous one. After filling out the entire array, the last column should be checked for any accepting states according to $\tau$. There might be many of them, but the one with the largest weight should be chosen. If we checked that the automaton has no weight-conflicting transitions, then there should always be only one maximal weight. Finally we can backtrack to find out which path "won". This will determine what outputs need to be concatenated together to obtain the path's output. This algorithm is quadratic, $O(|Q| \cdot |x|)$, but in practice each iteration itself is very efficient, especially when combined with the binary search described in the previous paragraph. By observing that states of automata are often sparsely connected, an additional optimisation can be made by representing the two-dimensional array with a list of indices, as is often done for sparse matrices.

VII. CONCLUSION

Interleaved alphabets could find numerous applications with many possible extensions. In the context of natural language processing, they could be used to annotate human sentences with linguistic meta-information like parts of speech. Then transducers could be built to take advantage of those tags. Using grammatical inference methods, one could also train such transducers to act as POS taggers.

This approach cannot fully replace OpenFst, because it lacks its flexibility. The goal of OpenFst is to provide a general and extensible implementation of many different transducers, whereas the approach presented in this paper sacrifices extensibility for a highly integrated design and optimal efficiency. For instance, Glushkov's algorithm could never support such operations as inverses, projections, reverses or composition.

ACKNOWLEDGMENT

The authors would like to thank Piotr Radwan for all the inspiration.

REFERENCES

[1] M. Mohri, "Weighted finite-state transducer algorithms an overview," AT&T Labs, 2004.
[2] M. Droste, W. Kuich, and H. Vogler, Handbook of Weighted Automata, 01 2009.
[3] M. Droste and D. Kuske, "Weighted automata," Institut für Informatik, Universität Leipzig, 2010.
[4] A. Mendoza-Drosik, "Multitape automata and finite state transducers with lexicographic weights," ArXiv, vol. abs/2007.12940, 2020.
[5] V. M. Glushkov, The abstract theory of automata. Russian Mathematics Surveys, 1961.
[6] F. P. Mehryar Mohri and M. Riley, "Weighted finite-state transducers in speech recognition," AT&T Labs Research, 2008.
[7] M. Mohri, Weighted Finite-State Transducer Algorithms. An Overview. Springer, 2004.
[8] C. E. Hasan Ibne Akram, Colin de la Higuera, "Actively learning probabilistic subsequential transducers," JMLR: Workshop and Conference Proceedings, 2012.
[9] C. de la Higuera, Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, 2010.
[10] K. Meer and A. Naif, "Generalized finite automata over real and complex numbers," vol. 591, 04 2014.
[11] A. Gandhi, B. Khoussainov, and J. Liu, "Finite automata over structures," in Theory and Applications of Models of Computation, M. Agrawal, S. B. Cooper, and A. Li, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 373–384.
[12] M.-P. Béal, O. Carton, C. Prieur, and J. Sakarovitch, "Squaring transducers: An efficient procedure for deciding functionality and sequentiality of transducers," in