Ambiguity through the lens of measure theory
Olivier Carton
November 23, 2020
Abstract
In this paper, we consider automata accepting irreducible sofic shifts, that is, strongly connected automata where each state is initial and final. We provide a characterization of unambiguity for finite words by means of the measure of sets of infinite sequences labelling two runs. More precisely, we show that such an automaton is unambiguous, in the sense that no finite word labels two runs with the same starting state and the same ending state, if and only if for each state, the set of infinite sequences labelling two runs starting from that state has measure zero.
The relationship between deterministic and non-deterministic machines has been extensively studied since the very beginning of computer science. Despite these efforts, many questions remain wide open. This is of course true in complexity theory for questions like P versus NP but also in automata theory [8, 5]. It is for instance not known whether the simulation of non-deterministic either one-way or two-way automata by deterministic two-way automata requires an exponential blow-up of the number of states [14].

Unambiguous machines are usually defined as non-deterministic machines in which each input has at most one accepting run. These machines are intermediate between the two extreme cases. In the case of automata accepting finite words, non-deterministic automata can be exponentially more succinct than unambiguous automata which can be, in turn, exponentially more succinct than deterministic automata [16]. However, the containment problem for unambiguous automata is solvable in polynomial time [18], as for deterministic automata, while the same problem for non-deterministic automata is PSPACE-complete.

The polynomial time algorithm for the containment of unambiguous automata accepting finite words in [18] is based on a clever counting argument which cannot easily be adapted to infinite words. It is still unknown whether the containment problem for unambiguous Büchi automata can be solved in polynomial time. The problem was solved in [10] for sub-classes of Büchi automata with weak acceptance conditions and in [2] for prophetic Büchi automata introduced in [3] (see also [13, Sec. II.10]), which are strongly unambiguous. These latter results are obtained through reductions of the problem for infinite words to the problem for finite words.

The aim of this paper is to exhibit a strong link between the ambiguity of some automaton for finite words and the ambiguity of the same automaton for infinite words.
For simplicity, the paper focuses on strongly connected automata with all states being initial and final, which accept the so-called irreducible shift spaces [11]. It turns out that unambiguity for infinite words implies unambiguity for finite words but the converse does not hold in general. This converse can however be recovered if unambiguity for infinite words is considered up to a negligible set of inputs.
Negligible should be understood here as having measure zero. If the shift space associated with the automaton is the full shift, that is, the set of all infinite words, the uniform measure or any Bernoulli measure can be used to define the meaning of negligible. If the associated shift space is not the full shift, measures whose support coincides with the set of factors of the shift space must be considered.

This work was motivated by the use of transducers, that is, automata with outputs, to realize functions from infinite words to infinite words. It is a classical result that each function realized by a transducer can be realized by a transducer whose input automaton is unambiguous [4]. The result proved in this paper shows that if all states of a strongly connected unambiguous transducer are made final, the transducer remains unambiguous up to a set of measure zero.
Let $A$ be a finite set of symbols that we refer to as the alphabet. We write $A^{\mathbb{N}}$ for the set of all sequences on the alphabet $A$ and $A^*$ for the set of all (finite) words. The length of a finite word $w$ is denoted by $|w|$. The positions of sequences and words are numbered starting from 1. The empty word is denoted by $\varepsilon$. The cardinality of a finite set $E$ is denoted by $\#E$. A factor of a sequence $a_1 a_2 a_3 \cdots$ is a finite word of the form $a_k a_{k+1} \cdots a_{\ell-1}$ for integers $1 \le k \le \ell$, where $k = \ell$ yields the empty word $\varepsilon$. We let $\mathrm{fact}(X)$ denote the set of factors of a set $X$ of sequences.

The shift map is the function $\sigma$ which maps each sequence $(x_i)_{i \ge 1}$ to the sequence $(x_{i+1})_{i \ge 1}$ obtained by removing its first element. A shift space is a subset $X$ of $A^{\mathbb{N}}$ which is closed for the topology and such that $\sigma(X) = X$. We refer the reader to [11] for a complete introduction to shift spaces. The entropy $h(X)$ of a shift space $X$ is defined by

$$h(X) = \lim_{n \to \infty} \frac{\log \#(\mathrm{fact}(X) \cap A^n)}{n}.$$

We mention two classical shift spaces which are used in many examples. The full shift over some alphabet $A$ is the shift space made of all sequences over $A$ and its entropy is $\log \#A$. The golden mean shift is the set of sequences over $\{0, 1\}$ with no consecutive 1s. Its entropy is $\log \phi$ where $\phi$ is the golden ratio.

A probability measure on $A^*$ is a function $\mu : A^* \to [0, 1]$ such that $\mu(\varepsilon) = 1$ and such that the equality

$$\sum_{a \in A} \mu(wa) = \mu(w)$$

holds for each word $w \in A^*$. The simplest example of a probability measure is a Bernoulli measure. It is a monoid morphism from $A^*$ to $[0, 1]$ (endowed with multiplication) such that $\sum_{a \in A} \mu(a) = 1$. Among the Bernoulli measures is the uniform measure which maps each word $w \in A^*$ to $(\#A)^{-|w|}$. In particular, each symbol $a$ is mapped to $\mu(a) = 1/\#A$.

By the Carathéodory extension theorem, a measure $\mu$ on $A^*$ can be uniquely extended to a probability measure $\hat{\mu}$ on $A^{\mathbb{N}}$ such that $\hat{\mu}(wA^{\mathbb{N}}) = \mu(w)$ holds for each word $w \in A^*$. In the rest of the paper, we use the same symbol for $\mu$ and $\hat{\mu}$. A probability measure $\mu$ is said to be (shift) invariant if the equality

$$\sum_{a \in A} \mu(aw) = \mu(w)$$

holds for each word $w \in A^*$. The support $\mathrm{supp}(\mu)$ of a measure $\mu$ is the set $\{w : \mu(w) > 0\}$ of finite words.

The column vector each of whose entries is 1 is denoted by $\mathbf{1}$. A $P$-vector $\lambda$ is called stochastic (respectively, substochastic) if its entries are non-negative and sum up to 1 (respectively, to at most 1), that is, $0 \le \lambda_p$ for each $p \in P$ and $\lambda\mathbf{1} = 1$ (respectively, $\lambda\mathbf{1} \le 1$). A matrix $M$ is called stochastic (respectively, substochastic) if each of its rows is stochastic (respectively, substochastic), that is, $M\mathbf{1} = \mathbf{1}$ (respectively, $M\mathbf{1} \le \mathbf{1}$). It is called strictly substochastic if it is substochastic but not stochastic. This means that the entries of at least one of its rows sum up to a value which is strictly smaller than 1.

A measure $\mu$ is rational if it is realized by a weighted automaton [15, Chap. 4]. Equivalently, there is an integer $m$, a row $1 \times m$-vector $\pi$, a morphism $\nu$ from $A^*$ into $m \times m$-matrices over real numbers and a column $m \times 1$-vector $\rho$ such that the following equality holds for each word $a_1 \cdots a_k$ [1].

$$\mu(a_1 \cdots a_k) = \pi \nu(a_1 \cdots a_k) \rho = \pi \nu(a_1) \cdots \nu(a_k) \rho$$

The triple $\langle \pi, \nu, \rho \rangle$ is called a representation of the rational measure $\mu$. By the main result in [7], it can always be assumed that both the vector $\pi$ and the matrix $\sum_{a \in A} \nu(a)$ are stochastic and that the vector $\rho$ is the vector $\mathbf{1}$. The triple $\langle \pi, \nu, \mathbf{1} \rangle$ is then called a stochastic representation of $\mu$. The measure $\mu$ is invariant if $\pi \sum_{a \in A} \nu(a) = \pi$.
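The definition of a rational measure through a stochastic representation can be made concrete with a small computation. The sketch below evaluates $\mu(w) = \pi \nu(a_1) \cdots \nu(a_k) \mathbf{1}$ with exact fractions. The two-state representation `PI`, `NU` is a hypothetical example chosen for illustration (it is not taken from the paper); it is built so that $\pi$ and $\nu(0) + \nu(1)$ are stochastic, $\pi$ is invariant, and the words of positive measure are those with no factor 11.

```python
from fractions import Fraction as Fr


def measure(pi, nu, word):
    """Return pi * nu(a_1) * ... * nu(a_k) * 1 for word = a_1 ... a_k."""
    row = list(pi)
    for a in word:
        m = nu[a]
        # Row vector times matrix.
        row = [sum(row[p] * m[p][q] for p in range(len(row)))
               for q in range(len(m[0]))]
    # Multiplying by the all-ones column vector sums the entries.
    return sum(row)


# Hypothetical stochastic representation (an assumption for illustration).
HALF = Fr(1, 2)
NU = {
    "0": [[HALF, Fr(0)], [Fr(1), Fr(0)]],
    "1": [[Fr(0), HALF], [Fr(0), Fr(0)]],
}
PI = [Fr(2, 3), Fr(1, 3)]

if __name__ == "__main__":
    for w in ["", "0", "1", "10", "11"]:
        print(repr(w), measure(PI, NU, w))
```

With these assumed matrices, the two defining identities $\sum_a \mu(wa) = \mu(w)$ and $\sum_a \mu(aw) = \mu(w)$ can be checked word by word, since all arithmetic is exact.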
With each stochastic representation $\langle \pi, \nu, \mathbf{1} \rangle$ is associated an automaton whose state set is $P = \{1, \ldots, m\}$ where $m$ is the common dimension of all matrices $\nu(a)$. For each pair of states $p, q \in P$, there is a transition $p \xrightarrow{a} q$ whenever $\nu(a)_{p,q} > 0$. The initial states are those states $q$ in $P$ such that $\pi_q > 0$.

We refer the reader to [13] for a complete introduction to automata accepting (infinite) sequences of symbols. A (Büchi) automaton $\mathcal{A}$ is a tuple $\langle Q, A, \Delta, I, F \rangle$ where $Q$ is the finite state set, $A$ the alphabet, $\Delta \subseteq Q \times A \times Q$ the transition relation, $I \subseteq Q$ the set of initial states and $F$ the set of final states. A transition is a tuple $\langle p, a, q \rangle$ in $Q \times A \times Q$ and it is written $p \xrightarrow{a} q$. A finite run in $\mathcal{A}$ is a finite sequence of consecutive transitions,

$$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \cdots q_{n-1} \xrightarrow{a_n} q_n.$$

Its label is the word $a_1 a_2 \cdots a_n$. An infinite run in $\mathcal{A}$ is a sequence of consecutive transitions,

$$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \xrightarrow{a_3} q_3 \cdots$$

A run is called initial if its first state $q_0$ is initial, that is, belongs to $I$. A run is called final if it visits a final state infinitely often. An infinite run is accepting if it is both initial and final. A sequence is accepted if it is the label of an accepting run. The set of accepted sequences is said to be accepted by the automaton.

As usual, an automaton is deterministic if it has only one initial state, that is, $\#I = 1$, and if whenever $p \xrightarrow{a} q$ and $p \xrightarrow{a} q'$ are two of its transitions with the same starting state and the same label, then $q = q'$. The leftmost automaton pictured in Figure 1 is deterministic while the rightmost one is not. Both accept the golden mean shift (each state is initial and final).

In this paper, we mainly consider Büchi automata accepting shift spaces. It is easily verified that an automaton in which each state is both initial and final, that is, $I = F = Q$, accepts a shift space. Conversely, if a shift space is accepted by some trim automaton, it is also accepted by the same automaton in which each state is made initial and final. A shift space is called sofic if it is accepted by some automaton.
A sofic shift is called irreducible if it is accepted by a strongly connected automaton.

Figure 1: Two automata accepting the golden mean shift

There is a unique, up to isomorphism, deterministic automaton accepting an irreducible sofic shift with the minimal number of states [11, Thm 3.3.18]. This minimal automaton is also referred to as either its Shannon cover or its Fischer cover. It can be obtained from any automaton accepting the shift space via determinizing and state-minimizing algorithms, e.g., [11, p. 92], [9, p. 68]. The minimal automaton of the golden mean shift is the leftmost automaton pictured in Figure 1. This minimal automaton always has at least one synchronizing word [12], as defined in the next paragraph.

For each state $q$, its future (respectively, bi-future) is the set $F(q)$ (respectively, $F_2(q)$) of sequences labelling a run (respectively, at least two runs) starting from $q$. Its past $P(q)$ is the set of words labelling a run ending in $q$. A synchronizing word of a strongly connected automaton is a word $w$ such that there is a unique state $q$ such that $w \in P(q)$. Words 0 and 1 are synchronizing words of both automata pictured in Figure 1.

If $\mathcal{A}$ is an automaton and $P \subseteq Q$ is a subset of its state set $Q$, we let $P \cdot w$ denote the subset $P' \subseteq Q$ defined by $P' = \{q : \exists p \in P,\ p \xrightarrow{w} q\}$. If $P$ is a singleton set $\{q\}$, we write $q \cdot w$ for $\{q\} \cdot w$. By a slight abuse of notation, we also write $q \cdot w = p$ for $\{q\} \cdot w = \{p\}$. If $\mathcal{A}$ is deterministic, $q \cdot w$ is either the empty set or a singleton set.

An automaton is unambiguous (for finite words) if for each pair of states $p, q \in Q$ and each word $w$, there is at most one run $p \xrightarrow{w} q$ from $p$ to $q$ labelled by $w$. The two automata pictured in Figure 1 are unambiguous because the leftmost one is deterministic while the rightmost one is reverse deterministic. The three automata pictured in Figure 2 are also unambiguous.
The leftmost one is deterministic and therefore $F(1) = F(2) = A^{\mathbb{N}}$ and $F_2(1) = F_2(2) = \emptyset$. For the middle one, $F(1) = 0A^{\mathbb{N}}$, $F(2) = 1A^{\mathbb{N}}$ and $F_2(1) = F_2(2) = \emptyset$. The rightmost one is neither deterministic nor reverse deterministic but it is unambiguous. An ambiguous automaton is pictured in Figure 3 below.

[Figure 2: three unambiguous automata]

Let $\mathcal{A}$ be an automaton with state set $Q$. The adjacency matrix of $\mathcal{A}$ is the $Q \times Q$-matrix $M$ defined by $M_{p,q} = \#\{a \in A : p \xrightarrow{a} q\}$. Its entry $M_{p,q}$ is thus the number of transitions from $p$ to $q$. By a slight abuse of notation, the spectral radius of the adjacency matrix is called the spectral radius of the automaton. If $\mathcal{A}$ is deterministic, the matrix $M/(\#A)$ is substochastic. If $\mathcal{A}$ is deterministic and complete, that is, for each pair $(p, a)$ in $Q \times A$, there exists exactly one state $q$ such that $p \xrightarrow{a} q$ is a transition of $\mathcal{A}$, the matrix $M/(\#A)$ is stochastic. The adjacency matrices of the three automata pictured in Figure 2 are given below.

[adjacency matrices of the three automata of Figure 2; entries not recoverable]

The following theorem provides a characterization of ambiguity using the spectral radius of the adjacency matrix of the automaton.
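Before the statement, the quantities involved can be computed on small examples. The sketch below builds the adjacency matrix, estimates its spectral radius by power iteration on $M + I$ (a standard trick that avoids periodicity issues for non-negative irreducible matrices), and tests unambiguity by exploring pairs of states reached by the same word. The two transition tables are assumptions: they are hypothetical two-state automata for the golden mean shift, written to be consistent with the descriptions in the text, not copied from the figures.

```python
def adjacency(states, transitions):
    """M[p][q] = number of symbols a with a transition p -a-> q."""
    index = {s: k for k, s in enumerate(states)}
    m = [[0] * len(states) for _ in states]
    for (p, _a, q) in set(transitions):
        m[index[p]][index[q]] += 1
    return m


def spectral_radius(m, iterations=500):
    """Power iteration on M + I; assumes M non-negative and irreducible."""
    n = len(m)
    v = [1.0 / n] * n
    growth = 1.0
    for _ in range(iterations):
        w = [v[p] + sum(m[p][q] * v[q] for q in range(n)) for p in range(n)]
        growth = sum(w)  # v sums to 1, so this estimates rho(M) + 1
        v = [x / growth for x in w]
    return growth - 1.0


def _reach(start, succ):
    """Vertices reachable from start in the graph given by succ."""
    seen, stack = set(start), list(start)
    while stack:
        x = stack.pop()
        for y in succ.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def is_unambiguous(states, transitions):
    """No word labels two distinct runs with the same endpoints."""
    succ, pred = {}, {}
    for (p, a, q) in transitions:
        for (p2, b, q2) in transitions:
            if a == b:  # pairs of transitions reading the same symbol
                succ.setdefault((p, p2), set()).add((q, q2))
                pred.setdefault((q, q2), set()).add((p, p2))
    diagonal = [(s, s) for s in states]
    # Pairs lying on a path from some (s, s) to some (t, t).
    both = _reach(diagonal, succ) & _reach(diagonal, pred)
    return all(p == q for (p, q) in both)


# Hypothetical automata for the golden mean shift (assumed transition tables).
DETERMINISTIC = [(1, "0", 1), (1, "1", 2), (2, "0", 1)]
AMBIGUOUS = [(1, "0", 1), (1, "1", 2), (2, "0", 1), (2, "0", 2)]

if __name__ == "__main__":
    for name, t in [("deterministic", DETERMINISTIC), ("ambiguous", AMBIGUOUS)]:
        m = adjacency([1, 2], t)
        print(name, m, spectral_radius(m), is_unambiguous([1, 2], t))
```

The pair exploration is a design choice of this sketch: an off-diagonal pair $(r, r')$ that is both reachable from a diagonal pair and able to reach a diagonal pair witnesses two distinct runs with the same label and the same endpoints.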
Theorem 1.
Let $X$ be a shift space of entropy $h$ and let $\mathcal{A}$ be a strongly connected automaton accepting a shift space contained in $X$. If $\mathcal{A}$ satisfies two of the following properties, it also satisfies the third one.
i) $\mathcal{A}$ is unambiguous,
ii) $\mathcal{A}$ accepts $X$,
iii) the logarithm of the spectral radius of its adjacency matrix is $h$.

The three automata pictured in Figure 2 accept the full shift over $\{0, 1\}$ whose entropy is $\log 2$. It can be verified that the spectral radius of their three adjacency matrices is 2. This confirms that these automata are unambiguous. The adjacency matrix of the automaton pictured in Figure 3 has spectral radius 2 while the entropy of the golden mean shift is $\log \phi$. This confirms that this automaton is ambiguous.

Ambiguity of automata has been defined using finite words: an automaton is ambiguous if some finite word $w$ is the label of two different runs from a state $p$ to a state $q$. If the automaton is trim, this implies that some sequence of the form $wy$ is the label of two different runs from $p$. The converse of this implication does not hold in general. The third automaton pictured in Figure 2 is unambiguous although the sequence $(01)^{\mathbb{N}} = 0101\cdots$ is the label of two different runs starting from state 1. However, the set $F_2(1)$ is contained in $(0 + 1)^*(01)^{\mathbb{N}}$ and it is thus countable and of measure 0 for the uniform measure. Note that if each transition $p \to q$ carrying a given label is replaced by two transitions from $p$ to $q$ with distinct labels, yielding a three-letter alphabet, the set $F_2(1)$ is no longer countable but it is still of measure 0 as a subset of $\{0, 1, 2\}^{\mathbb{N}}$.

The following theorem provides a characterization of ambiguity using measure theory. More precisely, it states that a strongly connected automaton is unambiguous exactly when the set of sequences labelling two runs is negligible, that is, of measure zero.

Theorem 2.
Let $\mathcal{A}$ be a strongly connected automaton accepting a shift space $X$. Let $\mu$ be a rational measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. The automaton $\mathcal{A}$ is unambiguous if and only if for each state $q$, the set $F_2(q)$ of sequences labelling at least two different runs from $q$ satisfies $\mu(F_2(q)) = 0$.

The measure used to quantify this ambiguity must be compatible with the automaton. More precisely, its support must be equal to the set of finite words labelling at least one run in the automaton. If this condition is not fulfilled, the result may not hold, as shown by the following two examples.

Consider again the third automaton pictured in Figure 2. Let $\mu$ be the probability measure putting weight $1/2$ on each of the sequences $(01)^{\mathbb{N}}$ and $(10)^{\mathbb{N}}$ and zero everywhere else. More formally, it is defined by $\mu((01)^{\mathbb{N}}) = \mu((10)^{\mathbb{N}}) = 1/2$ and $\mu(\{0, 1\}^{\mathbb{N}} \setminus \{(01)^{\mathbb{N}}, (10)^{\mathbb{N}}\}) = 0$. Then $\mu(F_2(1)) = 1/2$ is positive although the automaton is unambiguous. The support of this measure $\mu$, which consists of the prefixes of $(01)^{\mathbb{N}}$ and $(10)^{\mathbb{N}}$, is strictly contained in the set of words labelling a run in this automaton. This latter set is actually the set $\{0, 1\}^*$ of all finite words over $\{0, 1\}$.

Figure 3: An ambiguous automaton accepting the golden mean shift

Consider the automaton pictured in Figure 3. It accepts the golden mean shift $X$ and it is ambiguous because the word 00 is the label of two different runs from state 2 to state 1. The uniform measure $\mu(X)$ of the golden mean shift is zero. Therefore, both numbers $\mu(F_2(1))$ and $\mu(F_2(2))$ are zero although the automaton is ambiguous. This comes from the fact that the support $\{0, 1\}^*$ of the uniform measure strictly contains the set $\mathrm{fact}(X)$. This latter set is the set $(0 + 10)^*(1 + \varepsilon)$ of finite words with no consecutive 1s.

Proofs
The following lemma is the key lemma used to prove Theorem 1. It is a direct application of the Perron-Frobenius theorem.
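As a numerical illustration of the strict entropy gap that the lemma below establishes, the golden mean shift is strictly contained in the full shift over $\{0, 1\}$, and its entropy $\log \phi$ is strictly smaller than $\log 2$. The sketch below estimates the entropy from its definition by counting the words of length $n$ with no factor 11. The function names are hypothetical, and the finite $n$ only approximates the limit.

```python
import math


def count_no_11(n):
    """Number of words of length n over {0, 1} with no factor 11."""
    end0, end1 = 1, 0  # the empty word is counted as 'not ending in 1'
    for _ in range(n):
        # A 0 can follow anything; a 1 can only follow a word not ending in 1.
        end0, end1 = end0 + end1, end0
    return end0 + end1


def entropy_estimate(n):
    """Finite-n approximation of log #(fact(X) ∩ A^n) / n."""
    return math.log(count_no_11(n)) / n


if __name__ == "__main__":
    golden = math.log((1 + math.sqrt(5)) / 2)
    for n in (10, 50, 200):
        print(n, entropy_estimate(n), golden, math.log(2))
```

The counts follow the Fibonacci recurrence, which is why the estimate approaches $\log \phi$ from above as $n$ grows.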
Lemma 3.
Let $X$ and $Y$ be two sofic shift spaces such that $X \subseteq Y$. If $h(X) = h(Y)$, then $X = Y$.

Proof. We prove that if $X \subset Y$ but $X \ne Y$, then $h(X) < h(Y)$. Let $\mathcal{A}$ be an unambiguous automaton accepting the shift space $Y$. Let $M$ be the adjacency matrix of $\mathcal{A}$. For each pair of states $p$ and $q$, and each integer $n$, $M^n_{p,q}$ is the number of blocks of $Y$ of length $n$ labelling a run in $\mathcal{A}$ from $p$ to $q$. Let $n$ be the least integer such that there is a word $w$ of length $n$ which is a block of $Y$ but not a block of $X$. Let $N$ be the matrix defined as follows. For each pair of states $p$ and $q$, $N_{p,q}$ is the number of blocks of $X$ of length $n$ labelling a run in $\mathcal{A}$ from $p$ to $q$. Since $X \subset Y$, $N_{p,q} \le M^n_{p,q}$ for each pair of states $p$ and $q$. Furthermore, by the choice of $n$, there are $p$ and $q$ such that $N_{p,q} < M^n_{p,q}$. By Theorem 1.5e in [17], the spectral radius of $M^n$ is strictly greater than that of $N$. This proves the claim.

Now we come to the proof of Theorem 1.

Proof of Theorem 1. Let $M$ be the adjacency matrix of $\mathcal{A}$.

We claim that if the automaton $\mathcal{A}$ is unambiguous, then the logarithm of its spectral radius is less than or equal to $h$. Let $\alpha$ be a positive real number. The $(p, q)$-entry of the matrix $M^n$ is the number of finite runs of length $n$ from $p$ to $q$. Since the automaton is unambiguous, this number is bounded by the number of words of length $n$ in $\mathrm{fact}(X)$. Since the entropy of $X$ is $h$, this number is upper bounded by $e^{(h+\alpha)n}$ for $n$ large enough. By the Perron-Frobenius theorem [17, Thm 1.2], the logarithm of the spectral radius of $M$ is at most $h + \alpha$. Since this is true for each positive real number $\alpha$, the claim is proved.

We also claim that if each word in $\mathrm{fact}(X)$ is the label of a run in $\mathcal{A}$, then the logarithm of its spectral radius is greater than or equal to $h$. Let $\alpha$ be a positive real number and let $\lambda$ be the spectral radius of $M$. By the Perron-Frobenius theorem [17, Thm 1.2], there is a constant $K$ such that $M^n_{p,q} \le K\lambda^n$ for each pair of states $p, q$. The number of finite runs of length $n$ in $\mathcal{A}$ is thus bounded by $K(\#Q)^2\lambda^n$.
This number must be greater than or equal to the number of words of length $n$ in $\mathrm{fact}(X)$. Since the entropy of $X$ is $h$, this number is lower bounded by $e^{(h-\alpha)n}$ for $n$ large enough. It follows that $\log \lambda$ is greater than or equal to $h - \alpha$. Since this is true for each positive real number $\alpha$, the claim is proved.

Combining the two claims shows that if $\mathcal{A}$ satisfies (i) and (ii), then it also satisfies (iii).

Suppose that $\mathcal{A}$ is unambiguous and that the logarithm of its spectral radius is $h$. Since it is unambiguous, the logarithm of the spectral radius of $M$ is equal to the entropy of the shift space accepted by $\mathcal{A}$. Since the entropy of the shift space accepted by $\mathcal{A}$ is the entropy of $X$, it follows from Lemma 3 that $X$ is the shift space accepted by $\mathcal{A}$.

Suppose that each word in $\mathrm{fact}(X)$ is the label of a run in $\mathcal{A}$ and that the logarithm of its spectral radius is $h$. Suppose also, by contradiction, that $\mathcal{A}$ is not unambiguous and that there are two different runs from $p$ to $q$ labelled by the same word $w$ of length $n$. Consider the matrix $M^n$. Subtracting 1 from the $(p, q)$-entry of this matrix yields a matrix $M'$ with the same shift space. The logarithm of the spectral radius of $M'$ is thus equal to $nh$. By Theorem 1.5e in [17], the logarithm of the spectral radius of $M^n$ is strictly greater than $nh$. This contradicts the hypothesis that the logarithm of the spectral radius of $\mathcal{A}$ is $h$.

The next lemma states that some sets induced by the minimal automaton of a sofic shift have a positive measure. As already mentioned, the minimal automaton of an irreducible sofic shift always has synchronizing words.

Lemma 4.
Let $r$ be a state of the minimal automaton of a sofic shift space $X$ and let $w$ be a synchronizing word such that $w \in P(r)$. Let $\mu$ be a measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. Then $\mu(wF(r)) = \mu(wA^{\mathbb{N}}) > 0$.

Proof. Since $X$ is a closed set, its complement $A^{\mathbb{N}} \setminus X$ is equal to the union $\bigcup_{w \notin \mathrm{fact}(X)} wA^{\mathbb{N}}$. Since $\mathrm{supp}(\mu) = \mathrm{fact}(X)$, the equality $\mu(X) = 1$ holds. It follows that $\mu(X \cap wA^{\mathbb{N}}) = \mu(wA^{\mathbb{N}})$ for each word $w$. We claim that if $w$ is synchronizing and $w \in P(r)$, then $wF(r) = X \cap wA^{\mathbb{N}}$. The inclusion $wF(r) \subseteq X \cap wA^{\mathbb{N}}$ follows directly from $w \in P(r)$. The reverse inclusion follows from the fact that $w$ is synchronizing. Combining the two equalities, we get that $\mu(wF(r)) = \mu(wA^{\mathbb{N}})$. This latter number is positive since $w \in \mathrm{fact}(X)$ and $\mathrm{supp}(\mu) = \mathrm{fact}(X)$.

Lemma 5.
Let $\mathcal{A}$ be an automaton accepting a sofic shift space $X$. Let $u$ be a word and $q$ be a state of $\mathcal{A}$ such that $u \in P(q)$. There exists a state $r$ of the minimal automaton of $X$ and a synchronizing word $v$ such that $uv \in P(r)$ and $F(q) \cap vA^{\mathbb{N}} = vF(r)$.

Proof. Let $Q$ be the state set of $\mathcal{A}$. Let us consider the deterministic automaton $\hat{\mathcal{A}}$ whose state set $\hat{Q}$ is the set of non-empty subsets of $Q$ of the form $q \cdot w$ for some word $w$. The transitions of $\hat{\mathcal{A}}$ are of the form $q \cdot w \xrightarrow{a} q \cdot wa$.

Let $C$ be a recurrent strongly connected component of $\hat{\mathcal{A}}$. We claim that $C$ accepts $X$. Let $w$ be a word such that $q \cdot w$ is a state of $\hat{\mathcal{A}}$ in $C$. Let $v \in \mathrm{fact}(X)$ be a factor of $X$. Since $X$ is irreducible, there is a word $u$ such that there is a run in $\mathcal{A}$ starting from $q$ and labelled $wuv$. This shows that $v$ is the label of a path from $q \cdot wu$ to $q \cdot wuv$ in $C$.

Let $\sim$ be the equivalence relation on states of $\hat{\mathcal{A}}$ defined by $P \sim P'$ if and only if $F(P) = F(P')$. The automaton $C/{\sim}$ is the minimal deterministic automaton of $X$. Let $v_1$ be a word such that $q \cdot v_1$ is a state in $C$. As $C/{\sim}$ is a minimal automaton, there is a synchronizing word $v_2$ such that $q \cdot v_1 v_2$ is also a state in $C$.

Let $v$ be the word $v_1 v_2$. Since $v_2$ is synchronizing, $v$ is also synchronizing. The equality $F(q) \cap vA^{\mathbb{N}} = vF(r')$ holds where $r' = q \cdot v$. Since $uv$ is a factor of $X$, there is some state $r$ of $C$ such that $uv \in P(r)$. Since $v$ is synchronizing in $C/{\sim}$, the states $r$ and $r'$ satisfy $F(r) = F(r')$. It follows that $uv \in P(r)$ and $F(q) \cap vA^{\mathbb{N}} = vF(r)$.

Lemma 6.
Let $\mathcal{A}$ be an automaton accepting a sofic shift space $X$. Let $q$ be a state of $\mathcal{A}$ and let $w$ be a word such that $w \in P(q)$. Let $\mu$ be a measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. Then $\mu(wF(q)) > 0$.

Proof. By Lemma 5, there exists a state $r$ of the minimal automaton of $X$ and a synchronizing word $v$ such that $wv \in P(r)$ and $F(q) \cap vA^{\mathbb{N}} = vF(r)$. This implies that $wF(q) \cap wvA^{\mathbb{N}} = wvF(r)$. By Lemma 4, the measure $\mu(wvF(r))$ is positive and thus $\mu(wF(q)) > 0$.

Recall that a regular set of sequences is accepted by a deterministic Büchi automaton if and only if it is a $G_\delta$-set (that is, $\Pi^0_2$) [13, Thm I.9.9]. This implies in particular that regular and closed sets are accepted by deterministic Büchi automata. Regular and closed sets are actually accepted by deterministic Büchi automata in which each state is final [13, Prop III.3.7].

Lemma 7.
Let $X$ be a sofic shift space and let $\mu$ be a rational measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. Let $F$ be a regular and closed set contained in $X$. If $\mu(F) > 0$, there exists a word $w$ and a state $r$ of the minimal automaton of $X$ such that $w \in P(r)$ and $wF(r) \subseteq F$.

Before proceeding to the proof of the lemma, we show that even in the case of the full shift, that is, $X = A^{\mathbb{N}}$, both hypotheses of being regular and closed are necessary. Since the minimal automaton of the full shift has a single state $r$ satisfying $F(r) = A^{\mathbb{N}}$, the lemma can be, in that case, rephrased as follows. If $\mu(F) > 0$ where $\mu$ is the uniform measure, then there exists a word $w$ such that $wA^{\mathbb{N}} \subseteq F$.

Being regular is of course not sufficient because the set $(0^*1)^{\mathbb{N}}$ of sequences having infinitely many occurrences of 1 is regular and has measure 1 but does not contain any cylinder. Being closed is also not sufficient, as shown by the following example. Let $X$ be the set of sequences such that none of their non-empty prefixes of even length is a palindrome. The complement of $X$ is equal to the following union

$$\bigcup_{n \ge 1} Z_n \quad \text{where} \quad Z_n = \bigcup_{|w| = n} w\tilde{w}A^{\mathbb{N}}$$

and where $\tilde{w}$ stands for the reverse of $w$. Suppose for instance that the alphabet is $A = \{0, 1\}$. The measure of $Z_n$ is equal to $2^{-n}$ because there are $2^n$ words of length $n$ and the measure of each cylinder $w\tilde{w}A^{\mathbb{N}}$ is $2^{-2n}$. Furthermore, the set $Z_1 \cup Z_2$ is equal to $00A^{\mathbb{N}} \cup 11A^{\mathbb{N}} \cup 0110A^{\mathbb{N}} \cup 1001A^{\mathbb{N}}$ whose measure is $5/8$. This shows that the measure of the complement of $X$ is bounded by $5/8 + \sum_{n \ge 3} 2^{-n} = 7/8$ (note that the sets $Z_n$ are not pairwise disjoint). Therefore $X$ has a positive measure but it does not contain any cylinder. Indeed, in each cylinder $wA^{\mathbb{N}}$, the cylinder $w\tilde{w}A^{\mathbb{N}}$ is out of $X$.

Proof.
Let $\langle \pi, \nu, \mathbf{1} \rangle$ be a stochastic representation of dimension $m$ of the rational measure $\mu$. Let $P$ be the set $\{1, \ldots, m\}$. For each $p \in P$, we let $\mu_p$ be the measure whose representation is $\langle \delta_p, \nu, \mathbf{1} \rangle$ where the row vector $\delta_p$ is given by $(\delta_p)_{p'} = 1$ if $p' = p$ and 0 otherwise. The measure $\mu$ satisfies the equality $\mu = \sum_{p=1}^{m} \pi_p \mu_p$.

Let $\mathcal{A}$ be a deterministic Büchi automaton accepting $F$ whose state set is $Q$. The unique initial state of $\mathcal{A}$ is $i$. Since $F$ is closed, it can be assumed that all states of $\mathcal{A}$ are final. For each state $q$ of $\mathcal{A}$, the set $F(q)$ is the set of accepted sequences if $q$ is taken as the unique initial state of the automaton.

Footnote: regular and closed sets are not to be confused with regular closed sets, which are equal to the closure of their interior [6, Chap. 4].

We consider a weighted graph $G$ whose vertex set is $P \times Q$. The weight of the edge from the vertex $(p, q)$ to the vertex $(p', q')$ is given by

$$w_{p,q,p',q'} = \sum_{q \xrightarrow{a} q'} \nu(a)_{p,p'}$$

where the summation ranges over all transitions $q \xrightarrow{a} q'$ in the automaton $\mathcal{A}$. Since $\mu(F) > 0$, there exists at least one integer $p$ such that $\mu_p(F) > 0$. Without loss of generality, it can be assumed that this integer $p$ is $p = 1$. The vertex $(1, i)$ where $i$ is the initial state of $\mathcal{A}$ is called the initial vertex of $G$. The graph $G$ is restricted to its accessible part from its initial vertex $(1, i)$, that is, the set of vertices $(p, q)$ such that there is a path from $(1, i)$ to $(p, q)$ made of edges with positive weight. Vertices which are not accessible from $(1, i)$ are ignored in the rest of this proof.

With each vertex $(p, q)$ of $G$ is associated the real number $\alpha_{p,q} = \mu_p(F(q))$. Let $\alpha$ be the column $P \times Q$-vector whose entries are the numbers $\alpha_{p,q}$. Let $M$ be the matrix of weights of the graph $G$: the $((p, q), (p', q'))$-entry of $M$ is the weight $w_{p,q,p',q'}$ defined above. The matrix $M$ and the vector $\alpha$ satisfy the equality $\alpha = M\alpha$. This latter equality comes first from the equality

$$F(q) = \biguplus_{q \xrightarrow{a} q'} aF(q')$$

for each state $q$ of $\mathcal{A}$, where $\uplus$ stands for disjoint union, and second from the equality

$$\mu_p(aF) = \sum_{p'=1}^{m} \nu(a)_{p,p'} \, \mu_{p'}(F)$$

for each $p \in P$, each symbol $a$ and each measurable set $F$.

Let us recall that a strongly connected component $C$ of a graph is called recurrent if no edge leaves it. This means that if $(v, v')$ is an edge and $v$ belongs to $C$, then $v'$ also belongs to $C$.

We claim that if $(p, q)$ belongs to a recurrent strongly connected component $C$ of $G$ and $\alpha_{p,q} > 0$, then $F(p) \subseteq F(q)$. Let $\alpha'$ and $M'$ be the restrictions of $\alpha$ and $M$ respectively to the vertices in $C$. Because $C$ is recurrent, the equality $\alpha' = M'\alpha'$ holds. If the matrix $M'$ is strictly substochastic, this latter equality implies that $\alpha'$ is the zero vector and this would contradict $\alpha_{p,q} > 0$. The matrix $M'$ is then stochastic. The sum of the elements of the $(p, q)$-row of the matrix $M'$ is equal to

$$\sum_{p',q'} w_{p,q,p',q'} = \sum_{q \xrightarrow{a} q'} \sum_{p'=1}^{m} \nu(a)_{p,p'}.$$

Since the automaton $\mathcal{A}$ is deterministic, the subset $q \cdot a$ is either the empty set or a singleton set $\{q'\}$. This means that $q$ and $a$ being fixed, there is at most one choice for $q'$. Let us denote by $\beta_{p,a}$ the sum $\sum_{p'=1}^{m} \nu(a)_{p,p'}$ so that

$$\sum_{p',q'} w_{p,q,p',q'} = \sum_{q \xrightarrow{a} q'} \beta_{p,a}.$$

Since the matrix $\sum_{a \in A} \nu(a)$ is stochastic, the sum $\sum_{a \in A} \beta_{p,a}$ is equal to 1. The sum $\sum_{p',q'} w_{p,q,p',q'}$ is thus equal to 1 if and only if for each symbol $a$, $\beta_{p,a} > 0$ implies that $q \cdot a$ is not empty. We claim that if $M'$ is stochastic, then $F(p) \subseteq F(q)$ for each vertex $(p, q)$ in $C$. Let $x = a_1 a_2 a_3 \cdots$ be a sequence in $F(p)$. Then there exists a sequence $p = p_0, p_1, p_2, \ldots$ in $P^{\mathbb{N}}$ such that $\nu(a_i)_{p_{i-1},p_i} > 0$ for each $i \ge 1$. This last relation implies that $\beta_{p_{i-1},a_i} > 0$. There exists then a (unique) sequence $q = q_0, q_1, q_2, \ldots$ of states of $\mathcal{A}$ such that $q_{i+1} = q_i \cdot a_{i+1}$. This completes the proof of the claim.

Now we complete the proof. Let $V_1$ (respectively, $V_2$) be the set of vertices in a non-recurrent (respectively, recurrent) strongly connected component of $G$. We write $\alpha = (\bar{\alpha}_1, \bar{\alpha}_2)$ where the vectors $\bar{\alpha}_1$ and $\bar{\alpha}_2$ are respectively $\bar{\alpha}_1 = (\alpha_v)_{v \in V_1}$ and $\bar{\alpha}_2 = (\alpha_v)_{v \in V_2}$. The relation $\alpha = M\alpha$ is equivalent to the relations

$$\bar{\alpha}_1 = M_1 \bar{\alpha}_1 + M_{1,2} \bar{\alpha}_2, \qquad \bar{\alpha}_2 = M_2 \bar{\alpha}_2$$

where $M_1$ (respectively, $M_2$) is the restriction of $M$ to rows and columns indexed by $V_1$ (respectively, $V_2$) and $M_{1,2}$ is the restriction of $M$ to rows indexed by $V_1$ and columns indexed by $V_2$. Since there is at least one edge from $V_1$ to $V_2$, the matrix $M_1$ is strictly substochastic and its spectral radius is strictly less than 1. The matrix $I - M_1$ is thus invertible. The first relation is thus equivalent to

$$\bar{\alpha}_1 = (I - M_1)^{-1} M_{1,2} \bar{\alpha}_2.$$

This last equality shows that $\bar{\alpha}_2 = 0$ implies $\bar{\alpha}_1 = 0$ and thus $\alpha = 0$. Since $\alpha_{1,i} = \mu_1(F) > 0$, the vector $\bar{\alpha}_2$ is not zero. Let $(p, q)$ be a vertex in $V_2$ such that $\alpha_{p,q} > 0$. By the claim, $F(p) \subseteq F(q)$. There is a path from $(1, i)$ to the vertex $(p, q)$. There exists then a word $u$ such that $\nu(u)_{1,p} > 0$ and $i \xrightarrow{u} q$ in $\mathcal{A}$. By Lemma 5, there is a word $v$ and a state $r$ of the minimal automaton of $X$ such that $uv \in P(r)$ and $F(p) \cap vA^{\mathbb{N}} = vF(r)$. Thus $vF(r) \subseteq F(q)$ and $uvF(r) \subseteq uF(q) \subseteq F$. Setting $w = uv$ gives the result.

The following result is trivially true when the measure $\mu$ is shift invariant because $\mu(F) = \sum_{|w| = m} \mu(wF)$, but this equality does not hold in general.

Lemma 8.
Let $X$ be a sofic shift space and let $\mu$ be a rational measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. Let $F$ be a regular and closed set contained in $X$. If $\mu(F) = 0$, then $\mu(wF) = 0$ for each word $w$.

Proof. We prove that $\mu(wF) > 0$ implies $\mu(F) > 0$. Suppose that $\mu(wF) > 0$. Since $wF$ is also regular and closed, there exists, by Lemma 7, a word $u$ and a state $r$ of the minimal automaton of $X$ such that $u \in P(r)$ and $uF(r) \subseteq wF$. This latter inclusion implies that either $u$ is a prefix of $w$ or $w$ is a prefix of $u$. In the first case, that is, $w = uv$ for some word $v$, the inclusion is equivalent to $F(r) \subseteq vF$. Let $s$ be a state such that $r \xrightarrow{v} s$. Then $vF(s)$ is contained in $F(r)$ and thus $F(s) \subseteq F$. By Lemma 6, $\mu(F(s)) > 0$ and thus $\mu(F) > 0$. In the second case, that is, $u = wv$ for some $v$, the inclusion is equivalent to $vF(r) \subseteq F$. Again by Lemma 6, $\mu(vF(r)) > 0$ and thus $\mu(F) > 0$.

Lemma 9.
Let $\mathcal{A}$ be a strongly connected automaton accepting a shift space $X$. Let $\mu$ be a rational measure such that $\mathrm{supp}(\mu) = \mathrm{fact}(X)$. Let $q$ and $q'$ be two states of $\mathcal{A}$ such that $\mu(F(q) \cap F(q')) > 0$. Then there exists a word $w$ such that $F(q) \cap wA^{\mathbb{N}} = F(q') \cap wA^{\mathbb{N}}$.

Proof. We claim that there exists a word $w$ and a state $r$ of the minimal automaton of $X$ such that $w \in P(r)$ and

$$F(q) \cap wA^{\mathbb{N}} = F(q') \cap wA^{\mathbb{N}} = wF(r).$$

Let $F$ be the closed set $F(q) \cap F(q')$. By Lemma 7 applied to $F$, there exists a word $u$ and a state $s$ of the minimal automaton of $X$ such that $u \in P(s)$ and

$$uF(s) \subseteq F(q), \qquad uF(s) \subseteq F(q').$$

Let $v$ be a synchronizing word of the minimal automaton of $X$ such that $s \cdot v$ is not empty. Let $w$ be the word $uv$ and let $r$ be the state $s \cdot v$. Since $u \in P(s)$ and $r = s \cdot v$, $w \in P(r)$. We claim that $F(q) \cap wA^{\mathbb{N}} = wF(r)$. Suppose first that $x$ belongs to $F(q) \cap wA^{\mathbb{N}}$. The sequence $x$ is then equal to $wx'$ and it is the label of a run in the minimal automaton of $X$. Since $w = uv$ and $v$ is synchronizing, $x'$ belongs to $F(r)$. Suppose conversely that $x$ belongs to $wF(r)$. It is then equal to $uvx'$ for some $x'$ in $F(r)$. Since $r = s \cdot v$, $vx' \in F(s)$. It follows from the inclusion $uF(s) \subseteq F(q)$ that $x$ belongs to $F(q)$. This completes the proof of the equality $F(q) \cap wA^{\mathbb{N}} = wF(r)$. By symmetry, the equality $F(q') \cap wA^{\mathbb{N}} = wF(r)$ also holds and the proof is completed.

Lemma 10.
Let $\mathcal{A}$ be a strongly connected automaton accepting a shift space $X$. Let $\mu$ be a rational measure such that $\operatorname{supp}(\mu) = \operatorname{fact}(X)$. If there are two runs $p \xrightarrow{u} q$ and $p \xrightarrow{u} q'$ with $q \neq q'$, then $\mu(F(q) \cap F(q')) = 0$.

Proof. Suppose by contradiction that $\mu(F(q) \cap F(q')) > 0$. There exists, by Lemma 9, a word $v$ such that $F(q) \cap vA^{\mathbb{N}} = F(q') \cap vA^{\mathbb{N}}$. Let $q \cdot v$ (respectively $q' \cdot v$) be the set $\{q_1, \ldots, q_r\}$ (respectively $\{q'_1, \ldots, q'_{r'}\}$). Since $F(q) \cap vA^{\mathbb{N}} = F(q') \cap vA^{\mathbb{N}}$, the equality

$$F(q_1) \cup \cdots \cup F(q_r) = F(q'_1) \cup \cdots \cup F(q'_{r'})$$

holds. Since the automaton is strongly connected, there is a run $q_1 \xrightarrow{w} p$ from $q_1$ to $p$. Combining this run with the run $p \xrightarrow{u} q \xrightarrow{v} q_1$ yields the cyclic run $q_1 \xrightarrow{wuv} q_1$. Since $F(q_1) \cup \cdots \cup F(q_r) = F(q'_1) \cup \cdots \cup F(q'_{r'})$, the sequence $(wuv)^{\mathbb{N}} = wuvwuv\cdots$ belongs to a set $F(q'_i)$ for some $1 \le i \le r'$. By symmetry, it can be assumed that $(wuv)^{\mathbb{N}} \in F(q'_1)$. There exists then a run starting from $q'_1$ with label $(wuv)^{\mathbb{N}}$. This run can be decomposed as

$$q'_1 \xrightarrow{wuv} p_1 \xrightarrow{wuv} p_2 \xrightarrow{wuv} p_3 \cdots.$$

Since there are finitely many states, there are two integers $k, \ell > 0$ such that $p_k = p_{k+\ell}$. There are then the following two runs from $p$ to $p_k = p_{k+\ell}$ with the same label $(uvw)^{k+\ell}uv$:

$$p \xrightarrow{u} q \xrightarrow{v} q_1 \xrightarrow{(wuv)^{\ell-1}} q_1 \xrightarrow{w} p \xrightarrow{uv} q'_1 \xrightarrow{(wuv)^{k}} p_k$$

$$p \xrightarrow{u} q' \xrightarrow{v} q'_1 \xrightarrow{(wuv)^{k+\ell}} p_{k+\ell}$$

These two runs are different since $q \neq q'$. This is a contradiction with the fact that $\mathcal{A}$ is unambiguous.

Proof of Theorem 2.
Suppose first that the automaton $\mathcal{A}$ is ambiguous. Suppose for instance that there are two different runs from $p$ to $q$ with the same label $w$. This shows that $wF(q) \subseteq F_2(p)$, where $F_2(p)$ denotes the set of sequences labelling two different runs starting from $p$. Since $w \in P(q)$, the measure $\mu(wF(q))$ satisfies $\mu(wF(q)) > 0$ and thus $\mu(F_2(p)) > 0$.

Suppose now that $\mathcal{A}$ is unambiguous. We show that $\mu(F_2(p)) = 0$ for each state $p$. We start with a decomposition of the set $F_2(p)$. Let $x = a_1 a_2 a_3 \cdots$ be a sequence in $F_2(p)$ and let $\rho$ and $\rho'$ be two different runs labelled by $x$. Suppose that

$$\rho = q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \xrightarrow{a_3} q_3 \cdots$$

$$\rho' = q'_0 \xrightarrow{a_1} q'_1 \xrightarrow{a_2} q'_2 \xrightarrow{a_3} q'_3 \cdots$$

where $q_0 = q'_0 = p$. Let $n$ be the least integer such that $q_n \neq q'_n$. Let $a$ be the symbol $a_n$, $w$ be the finite word $a_1 \cdots a_{n-1}$ and $x'$ be the tail $a_{n+1} a_{n+2} a_{n+3} \cdots$. The sequence $x$ is equal to $wax'$ and there is a finite run $q_0 \xrightarrow{w} q_{n-1}$, two transitions $q_{n-1} \xrightarrow{a} q_n$ and $q_{n-1} \xrightarrow{a} q'_n$, and the tail $x'$ belongs to the intersection $F(q_n) \cap F(q'_n)$. We have actually proved the following equality expressing $F_2(p)$ in terms of a union of intersections of sets $F(q)$:

$$F_2(p) = \bigcup_{\substack{p \xrightarrow{w} p' \\ p' \xrightarrow{a} q,\ p' \xrightarrow{a} q' \\ q \neq q'}} wa\,\bigl(F(q) \cap F(q')\bigr)$$

Since the union is countable, it suffices to prove that if there are two transitions $p \xrightarrow{a} q$ and $p \xrightarrow{a} q'$ with $q \neq q'$, then $\mu(F(q) \cap F(q')) = 0$. Lemma 10 and Lemma 8 allow us to conclude.

References

[1] J. Berstel and Ch. Reutenauer.
Noncommutative Rational Series with Applications. Cambridge University Press, 2010.
[2] N. Bousquet and Ch. Löding. Equivalence and inclusion problem for strongly unambiguous Büchi automata. In LATA 2010, volume 6031 of Lecture Notes in Computer Science, pages 118–129. Springer, 2010.
[3] O. Carton and M. Michel. Unambiguous Büchi automata. Theoret. Comput. Sci., 297:37–81, 2003.
[4] Ch. Choffrut and S. Grigorieff. Uniformization of rational relations. In Jewels are Forever, Contributions on Theoretical Computer Science in Honor of Arto Salomaa, pages 59–71. Springer, 1999.
[5] Th. Colcombet. Unambiguity in automata theory. In Descriptional Complexity of Formal Systems, volume 9118 of Lecture Notes in Computer Science, pages 3–18. Springer, 2015.
[6] P. R. Halmos. Lectures on Boolean algebras. Van Nostrand, 1963.
[7] G. Hansel and D. Perrin. Mesures de probabilité rationnelles. In M. Lothaire, editor, Mots, pages 335–357. Hermes, 1990.
[8] M. Holzer and M. Kutrib. Descriptional complexity of (un)ambiguous finite state machines and pushdown automata. In Reachability Problems, pages 1–23. Springer, 2010.
[9] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
[10] D. Isaak and Ch. Löding. Efficient inclusion testing for simple classes of unambiguous ω-automata. Inf. Process. Lett., 112(14-15):578–582, 2012.
[11] D. Lind and B. Marcus. An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 1995.
[12] B. Marcus, R. Roth, and Siegel. Constrained Systems and Coding for Recording Channels. Technion-I.I.T., Department of Computer Science, 1998.
[13] D. Perrin and J.-É. Pin. Infinite Words. Elsevier, 2004.
[14] G. Pighizzini. Two-way finite automata: Old and recent results. Electronic Proceedings in Theoretical Computer Science, 90:3–20, 2012.
[15] J. Sakarovitch. Elements of Automata Theory. Cambridge University Press, 2009.
[16] E. Schmidt. Succinctness of descriptions of context-free, regular, and finite languages. DAIMI Report Series, 7(84), 1978.
[17] E. Seneta. Non-negative Matrices and Markov Chains. Springer, 2006.
[18] R. E. Stearns and H. B. Hunt III. On the equivalence and containment problems for unambiguous regular expressions, regular grammars and finite automata.