The Big-O Problem for Labelled Markov Chains and Weighted Automata
Dmitry Chistikov
Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, UK
Stefan Kiefer
Department of Computer Science, University of Oxford, UK
Andrzej S. Murawski
Department of Computer Science, University of Oxford, UK
David Purser
Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, UK
Max Planck Institute for Software Systems, Saarland Informatics Campus, Germany
Abstract
Given two weighted automata, we consider the problem of whether one is big-O of the other, i.e., if the weight of every finite word in the first is not greater than some constant multiple of the weight in the second.

We show that the problem is undecidable, even for the instantiation of weighted automata as labelled Markov chains. Moreover, even when it is known that one weighted automaton is big-O of another, the problem of finding or approximating the associated constant is also undecidable.

Our positive results show that the big-O problem is polynomial-time solvable for unambiguous automata, coNP-complete for unlabelled weighted automata (i.e., when the alphabet is a single character) and decidable, subject to Schanuel's conjecture, when the language is bounded (i.e., a subset of $w_1^* \cdots w_m^*$ for some finite words $w_1, \dots, w_m$).

On labelled Markov chains, the problem can be restated as a ratio total variation distance, which, instead of finding the maximum difference between the probabilities of any two events, finds the maximum ratio between the probabilities of any two events. The problem is related to $\varepsilon$-differential privacy, for which the optimal constant of the big-O notation is exactly $\exp(\varepsilon)$.

Theory of computation → Probabilistic computation
Keywords and phrases weighted automata, labelled Markov chains, probabilistic systems
Funding
Dmitry Chistikov : Supported in part by the Royal Society International Exchanges scheme(IEC\R2\170123).
Stefan Kiefer : Supported by a Royal Society Research Fellowship.
Andrzej S. Murawski : Supported by a Royal Society Leverhulme Trust Senior Research Fellowshipand the International Exchanges Scheme (IE161701).
David Purser : Supported by the UK EPSRC Centre for Doctoral Training in Urban Science(EP/L016400/1) and in part by the Royal Society International Exchanges scheme (IEC\R2\170123).
Acknowledgements
The authors would like to thank Engel Lefaucheux, Joël Ouaknine, and James Worrell for discussions during the development of this work.
Weighted automata over finite words are a well-known and powerful model of computation, a quantitative analogue of finite-state automata. Special cases of weighted automata include nondeterministic finite automata and labelled Markov chains, two standard formalisms for modelling systems and processes. Algorithms for analysis of weighted automata have been studied both in the early theory of computing and more recently by the infinite-state systems and algorithmic verification communities.

Given two weighted automata $\mathcal{A}, \mathcal{B}$ over an algebraic structure $(S, +, \times)$, the equivalence problem asks whether the two associated functions $f_{\mathcal{A}}, f_{\mathcal{B}} : \Sigma^* \to S$ are equal: $f_{\mathcal{A}}(w) = f_{\mathcal{B}}(w)$ for all finite words $w$ over the alphabet $\Sigma$. Over the ring $(\mathbb{Q}, +, \times)$, equivalence is decidable in polynomial time by the results of Schützenberger [41] and Tzeng [46]; subsequently, fast parallel (NC and RNC) algorithms have been found for this problem [47, 26]. In contrast, for semirings the equivalence problem is hard: undecidable [27, 1] for the semiring $(\mathbb{Q}, \max, +)$ and PSPACE-hard [35] for the Boolean semiring (for which weighted automata are usual nondeterministic finite automata and equivalence is equality of recognized languages).

Replacing $=$ with $\le$ makes the problem harder: even for the ring $(\mathbb{Q}, +, \times)$ the question of whether $f_{\mathcal{A}}(w) \le f_{\mathcal{B}}(w)$ for all $w \in \Sigma^*$ is undecidable, even if $f_{\mathcal{A}}$ is constant [38]. This problem subsumes the universality problem for (Rabin) probabilistic automata, yet another subclass of weighted automata (see, e.g., [16]).

In this paper, we introduce and study another natural problem, in which the ordering is relaxed from exact (in)equality to (in)equality to within a constant factor.
Given $\mathcal{A}$ and $\mathcal{B}$ as above, is it true that there exists a constant $c > 0$ such that $f_{\mathcal{A}}(w) \le c \cdot f_{\mathcal{B}}(w)$ for all $w \in \Sigma^*$?

Using standard mathematical notation, this condition asserts that $f_{\mathcal{A}}(w) = O(f_{\mathcal{B}}(w))$ as $|w| \to \infty$, and we refer to this problem as the big-O problem accordingly. (There also exists a related but slightly different definition of big-O; see Remark 12 for details on the corresponding version of our big-O problem.) The big-Θ problem (which turns out to be computationally equivalent to the big-O problem), in line with the $\Theta(\cdot)$ notation in analysis of algorithms, asks whether $f_{\mathcal{A}} = O(f_{\mathcal{B}})$ and $f_{\mathcal{B}} = O(f_{\mathcal{A}})$.

We restrict our attention to the ring $(\mathbb{Q}, +, \times)$ and only consider non-negative weighted automata, i.e., those in which all transitions have non-negative weights. We remark that, even under this restriction, weighted automata still form a superclass of (Rabin) probabilistic automata, a non-trivial and rich model of computation. Our initial motivation to study the big-O problem came from yet another formalism, labelled Markov chains (LMCs). One can think of the semantics of LMCs as giving a probability distribution or subdistribution on the set of all finite words. LMCs, often under the name Hidden Markov Models, are widely employed in a diverse range of applications; in computer-aided verification, they are perhaps the most fundamental model for probabilistic systems, with model-checking tools such as Prism [28] or Storm [13] based on analyzing LMCs efficiently. All the results in our paper (including hardness results) hold for LMCs too. Our main findings are as follows.

The big-O problem for non-negative WA and LMCs turns out to be undecidable in general, by a reduction from nonemptiness for probabilistic automata.

For unambiguous automata, i.e., where every word has at most one accepting path, the big-O problem becomes decidable and can be solved in polynomial time.

In the unary case, i.e., if the input alphabet $\Sigma$ is a singleton, the big-O problem is also decidable and, in fact, complete for the complexity class coNP. Unary LMCs are a simple and pure probabilistic model of computation: they run in discrete time and can terminate at any step; the big-O problem refers to this termination probability in two LMCs (or two WA). Our upper bound argument refines an analysis of growth of entries in powers of non-negative matrices by Friedland and Schneider [40], and the lower bound is obtained by a reduction from unary NFA universality [44].
In a more general bounded case, i.e., if the languages of all words $w$ associated with non-zero weight are included in $w_1^* w_2^* \cdots w_m^*$ for some finite words $w_1, \dots, w_m \in \Sigma^*$ (that is, are bounded in the sense of Ginsburg and Spanier; see [21, Chapter 5] and [22]), the big-O problem is decidable subject to Schanuel's conjecture. This is a well-known conjecture in transcendental number theory [29], which implies that the first-order theory of the real numbers with the exponential function is decidable [30]. Intuitively, our reliance on this conjecture is linked to the expressions for the growth rate in powers of non-negative matrices. These expressions are sums of terms of the form $\rho^n \cdot n^k$, where $n$ is the length of a word, $k \in \mathbb{N}$, and $\rho$ is an algebraic number. Our algorithms (however implicitly) need to compare for equality pairs of real numbers of the form $\log \rho_1 / \log \rho_2$, where $\rho_i$ are algebraic, and it is an open problem in number theory whether there is an effective procedure for this task (the four exponentials conjecture asks whether two such ratios can ever be equal; see, e.g., Waldschmidt [48, Sections 1.3 and 1.4]).

Bounded languages form a well-known subclass of regular languages. In fact, a regular (or even context-free) language $L$ is bounded if and only if the number of words of length $n$ in $L$ is at most polynomial in $n$. All other regular languages have, in contrast, exponential growth rate (a fact rediscovered multiple times; see, e.g., references in Gawrychowski et al. [19]). Bounded languages have been studied from combinatorial and algorithmic points of view since the 1960s [22, 19], and have recently been used, e.g., in the analysis of quantitative information flow problems in computer security [34, 33]. In the context of labelled Markov chains, languages that are subsets of $a_1^* a_2^* \cdots a_m^*$ (for individual letters $a_1, \dots, a_m \in \Sigma$) model consecutive arrival of $m$ events in a discrete-time system. It is curious that natural decision problems for such simple systems can lead to intricate algorithmic questions in number theory at the border of decidability.

Further motivation and related work.
In the labelled Markov chain setting, the big-O problem can be reformulated as a boundedness problem for the following function. For two LMCs $\mathcal{A}$ and $\mathcal{B}$, define the (asymmetric) ratio variation function by

$r(\mathcal{A}, \mathcal{B}) = \sup_{E \subseteq \Sigma^*} \bigl( f_{\mathcal{A}}(E) / f_{\mathcal{B}}(E) \bigr),$

where $f_{\mathcal{A}}(E)$ and $f_{\mathcal{B}}(E)$ denote the total probability mass associated with an arbitrary set of finite words $E \subseteq \Sigma^*$ in $\mathcal{A}$ and $\mathcal{B}$, respectively. Here we assume $\frac{0}{0} = 0$ and $\frac{x}{0} = \infty$ for $x > 0$. Observe that, because $\max\bigl(\frac{a}{b}, \frac{c}{d}\bigr) \ge \frac{a+c}{b+d}$ for $a, b, c, d \ge 0$, the supremum over $E \subseteq \Sigma^*$ can be replaced with supremum over $w \in \Sigma^*$. Consequently, the big-O problem for LMCs is equivalent to deciding whether $r(\mathcal{A}, \mathcal{B}) < \infty$.

Finding the value of $r$ amounts to asking for the optimal (minimal) constant in the big-O notation. Further, one can consider a symmetric variant, the ratio distance: $rd(\mathcal{A}, \mathcal{B}) = \max\{ r(\mathcal{A}, \mathcal{B}), r(\mathcal{B}, \mathcal{A}) \}$, in an analogy with big-Θ. Now, $rd$ is a ratio-oriented variant of the classic total variation distance $tv$, defined by $tv(\mathcal{A}, \mathcal{B}) = \sup_{E \subseteq \Sigma^*} \bigl( f_{\mathcal{A}}(E) - f_{\mathcal{B}}(E) \bigr)$, which is a well-established way of comparing two labelled Markov chains [6, 25]. We also consider the problem of approximating $r$ (as well as $rd$) to a given precision and the problem of comparing it with a given constant (threshold problem), showing that both are undecidable.

The ratio distance $rd$ is also equivalent to the exponential of the multiplicative total variation distance defined in [5, 43] in the context of differential privacy. Consider a system $\mathcal{M}$, modelled by a single labelled Markov chain, where output words are observable to the environment but we want to protect the privacy of the starting configuration. Let $R \subseteq Q \times Q$ be a symmetric relation, which relates the starting configurations intended to remain indistinguishable. Given $\varepsilon \ge 0$, we say that $\mathcal{M}$ is $\varepsilon$-differentially private (with respect to $R$) if, for all $(s, s') \in R$, we have $f_s(E) \le e^{\varepsilon} \cdot f_{s'}(E)$ for every observable set of traces $E \subseteq \Sigma^*$ [14, 7]. Here in the subscript of $f$ and elsewhere, references to states $s$ and $s'$ replace references to LMCs/automata: $\mathcal{M}$ stays implicit, and we specify which state it is executed from. Note that there exists such an $\varepsilon$ if and only if $r(s, s') < \infty$ for all $(s, s') \in R$ or, equivalently, (the LMC $\mathcal{M}$ executed from) $s$ is big-O of (the LMC $\mathcal{M}$ executed from) $s'$ for all $(s, s') \in R$. In fact, the minimal such $\varepsilon$ satisfies $e^{\varepsilon} = \max_{(s, s') \in R} r(s, s')$, thus $r$ captures the level of differential privacy between $s$ and $s'$.

Our results show that even deciding whether the multiplicative total variation distance is finite or $+\infty$ is, in general, impossible. Likewise, it is undecidable whether a system modelled by a labelled Markov chain provides any degree of differential privacy, however low.

▶ Definition 1. A weighted automaton $W$ over the $(\mathbb{Q}, +, \times)$ semi-ring is a 4-tuple $\langle Q, \Sigma, M, F \rangle$, where $Q$ is a finite set of states, $\Sigma$ is a finite alphabet, $M : \Sigma \to \mathbb{Q}^{Q \times Q}$ is a transition weighting function, and $F \subseteq Q$ is a set of final states. We consider only non-negative weighted automata, i.e. $M(a)(q, q') \ge 0$ for all $a \in \Sigma$ and $q, q' \in Q$.

In complexity-theoretic arguments, we assume that each weight is given as a pair of integers (numerator and denominator) in binary. The description size is then the number of bits required to represent $\langle Q, \Sigma, M, F \rangle$, including the bit size of the weights.

Each weighted automaton defines functions $f_s : \Sigma^* \to \mathbb{R}$, where for all $s \in Q$

$f_s(w) = \sum_{t \in F} \bigl( M(a_1) \times M(a_2) \times \cdots \times M(a_n) \bigr)_{s,t}$ for $w = a_1 a_2 \dots a_n \in \Sigma^*$,

and $A \times B$ is standard matrix multiplication. We refer to $f_s(w)$ as the weight of $w$ from state $s$. Without loss of generality, a weighted automaton can have a single final state. If not, introduce a new unique final state $t$ s.t. $M(a)(q, t) = \sum_{q' \in F} M(a)(q, q')$ for all $q \in Q$, $a \in \Sigma$.

▶ Definition 2.
We denote by $L_s(W)$ the set of $w \in \Sigma^*$ with $f_s(w) > 0$, that is, with positive weight from $s$. Equivalently, this is the language of $N_s(W)$, the non-deterministic finite automaton (NFA) formed from the same set of states (and final states) as $W$, start state $s$, and transitions $q \xrightarrow{a} q'$ whenever $M(a)(q, q') > 0$.

Given $s, s' \in Q$, we say that $s$ is big-O of $s'$ if there exists $C > 0$ such that $f_s(w) \le C \cdot f_{s'}(w)$ for all $w \in \Sigma^*$. The paper studies the following problem.

▶ Definition 3 (Big-O Problem).
Input: Weighted automaton $\langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$.
Output: Is $s$ big-O of $s'$?

▶ Remark 4. One could consider whether $s$ is big-Θ of $s'$, defined as $s$ is big-O of $s'$ and $s'$ is big-O of $s$; equivalently, whether $rd(s, s') < \infty$ for LMCs. We note that these two notions reduce to each other, justifying our consideration of only the big-O problem (see Appendix C). There is an obvious reduction from big-Θ to big-O making two oracle calls (a Cook reduction), but this can be strengthened to a single call preserving the answer (a Karp reduction). This, however, requires at least two characters. In the other direction, one can ask if $s$ is big-O of $s'$ using big-Θ by asking if a linear combination of $s$ and $s'$ is big-Θ of $s'$.

In the paper we also work with labelled Markov chains. In particular, they will appear in examples and hardness (including undecidability) arguments. As they are a special class of weighted automata, this will imply hardness (resp. undecidability) for weighted automata in general. On the other hand, our decidability results will be phrased using weighted automata, which makes them applicable to labelled Markov chains.

▶ Definition 5. A labelled Markov chain (LMC) is a (non-negative) weighted automaton $\langle Q, \Sigma, M, F \rangle$ such that, for all $q \in Q \setminus F$, we have $\sum_{q' \in Q} \sum_{a \in \Sigma} M(a)(q, q') = 1$ and $M(a)(q, q') = 0$ for all $a \in \Sigma$, $q \in F$ and $q' \in Q$.

Since final states have no outgoing transitions, w.l.o.g., one can assume a unique final state. For LMCs, the function $f_s$ can be extended to a measure on the powerset of $\Sigma^*$ by $f_s(E) = \sum_{w \in E} f_s(w)$, where $E \subseteq \Sigma^*$. The measure is a subdistribution: $\sum_{w \in \Sigma^*} f_s(w) \le 1$.

We also consider unary weighted automata, and similarly LMCs, where $|\Sigma| = 1$. Then we will often omit $\Sigma$ on the understanding that $\Sigma = \{a\}$, and describe transitions with a single matrix $A = M(a)$ so that $f_s(a^n) = A^n_{s,t}$, where $t$ is the unique final state. Note that $A^n_{s,t}$ stands for $(A^n)(s, t)$, and not $(A(s, t))^n$. Using the notation of regular expressions, we can write $L_s(W) \subseteq a^*$. It will turn out fruitful to consider several larger classes of languages:

▶ Definition 6. Let $L \subseteq \Sigma^*$.
$L$ is bounded [22] if $L \subseteq w_1^* w_2^* \cdots w_m^*$ for some $w_1, \dots, w_m \in \Sigma^*$.
$L$ is letter-bounded if $L \subseteq a_1^* a_2^* \cdots a_m^*$ for some $a_1, \dots, a_m \in \Sigma$.
$L$ is plus-letter-bounded if $L \subseteq a_1^+ a_2^+ \cdots a_m^+$ for some $a_1, \dots, a_m \in \Sigma$.

In each case, if the language of an NFA is suitably bounded, one can extract a corresponding bounding regular expression [19].
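The weight function of Definition 1 and the LMC condition of Definition 5 can be made concrete with a small sketch. The code below is illustrative only: the function names `weight` and `is_lmc`, the dictionary encoding of $M$, and the toy unary automaton (a loop of weight 3/4 on `s`, a loop of weight 1/2 on `u`, both exiting to the final state `t`) are assumptions made for this example and are not taken from the paper. Exact rationals are used so that weights are not distorted by floating-point rounding.

```python
from fractions import Fraction

def weight(states, M, final, s, word):
    """f_s(word): total weight of all paths from s to a final state reading word.

    M maps each letter to a dict {(q, q2): weight}; missing pairs have weight 0.
    """
    # vec[q] = total weight of paths from s to q reading the prefix consumed so far
    vec = {q: Fraction(0) for q in states}
    vec[s] = Fraction(1)
    for a in word:
        nxt = {q: Fraction(0) for q in states}
        for (p, q), w in M[a].items():
            nxt[q] += vec[p] * w          # one step of the row-vector / matrix product
        vec = nxt
    return sum((vec[q] for q in final), Fraction(0))

def is_lmc(states, M, final):
    """Check Definition 5: non-final states emit total weight 1, final states emit 0."""
    if any(w < 0 for trans in M.values() for w in trans.values()):
        return False
    for q in states:
        out = sum((w for trans in M.values()
                   for (p, _), w in trans.items() if p == q), Fraction(0))
        if out != (0 if q in final else 1):
            return False
    return True

# Hypothetical unary example (state names are illustrative).
F = Fraction
states = ["s", "u", "t"]
M = {"a": {("s", "s"): F(3, 4), ("s", "t"): F(1, 4),
           ("u", "u"): F(1, 2), ("u", "t"): F(1, 2)}}
final = {"t"}

# Ratio f_s(a^3) / f_u(a^3); it grows with the word length in this example.
ratio = weight(states, M, final, "s", "aaa") / weight(states, M, final, "u", "aaa")
```

In this toy automaton both states accept exactly the non-empty words, yet the ratio of weights keeps growing with the word length, so `s` would not be big-O of `u` even though the two languages coincide.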
We show that the big-O problem is undecidable. We also establish undecidability for several other problems related to computing and approximating the ratio variation distance. Recall that this corresponds to identifying the optimal constant for positive instances of the big-O problem or the level of differential privacy between two states in a labelled Markov chain.

▶ Definition 7.
The asymmetric threshold problem takes an LMC along with two states $s, s'$ and a constant $\theta$, and asks if $r(s, s') \le \theta$. The variant under the promise of boundedness promises that $r(s, s') < \infty$. The strict variant of each problem replaces $\le$ with $<$.
The asymmetric additive approximation task takes an LMC, two states $s, s'$ and a constant $\gamma$, and asks for $x$ such that $|r(s, s') - x| \le \gamma$.
The asymmetric multiplicative approximation task takes an LMC, two states $s, s'$ and a constant $\gamma$, and asks for $x$ such that $1 - \gamma \le \frac{x}{r(s, s')} \le 1 + \gamma$.
In each case, the symmetric variant is obtained by replacing $r$ with $rd$.

▶ Theorem 8.
The big-O problem is undecidable, even for LMCs.
Each variant of the threshold problem (asymmetric/symmetric, non-strict/strict) is undecidable, even under the promise of boundedness.
All variants of the approximation tasks (asymmetric/symmetric, additive/multiplicative) are unsolvable, even under the promise of boundedness.
Probabilistic automata are similar to LMCs, except that $M(a)$ is stochastic for every $a$, rather than $\sum_{a \in \Sigma} M(a)$ being stochastic. Formally, a probabilistic automaton is a non-negative weighted automaton with a distinguished start state $q_s$ such that $\sum_{q' \in Q} M(a)(q, q') = 1$ for all $q \in Q$ and $a \in \Sigma$. The problem Empty asks if $f_{q_s}(w) \le \frac{1}{2}$ for all words $w$. It is known to be undecidable [38, 16].

[Figure 1: Unbounded ratio but language equivalent. Unary automaton with states $s$, $t$, $s'$ and edge weights 3/4, 1/4, 1/2, 1/2.]

Proof sketch of Theorem 8 (see Appendix D). We reduce from Empty. The construction creates two branches of a labelled Markov chain. The first simulates the probabilistic automaton using the original weights multiplied by a scalar ( in the case $|\Sigma| = 2$). The other branch will process each letter from $\Sigma$ with equal weight (also in an infinite loop). Consequently, if there is a word accepted with probability greater than $\frac{1}{2}$, the ratio between the two branches will be greater than 1. The construction will enable words to be processed repeatedly, so that the ratio can then be pumped unboundedly. Certain linear combinations of the branches enable a gap promise, entailing undecidability of the threshold and approximation tasks. ◀

▶ Remark.
The classic non-strict threshold problem for the total variation distance (i.e. whether $tv(s, s') \le \theta$) is known to be undecidable [25], like our distances. However, it is not known if its strict variant (i.e. whether $tv(s, s') < \theta$) is also undecidable. In contrast, in our case, both variants are undecidable. Further note that (additive) approximation of $tv$ is possible [25, 6], but this is not the case for our distances $r$ and $rd$.

▶ Remark. We have shown the undecidability of the big-O problem using the undecidability of the emptiness problem for probabilistic automata. Another proof of undecidability can be obtained using the Value-1 problem (shown to be undecidable in [20]): indeed the big-O problem and the Value-1 problem are interreducible. However, the reduction from big-O to Value-1 does not entail decidability for subclasses of weighted automata (such as those with bounded languages), as the image of these subclasses does not fall into the known decidable fragments of the Value-1 problem. Further details are available in Appendix D.1.
Towards decidability results, we identify a simple necessary (but insufficient) condition for $s$ being big-O of $s'$.

▶ Definition 9 (LC condition). A weighted automaton $W = \langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$ satisfy the language containment condition (LC) if for all words $w$ with $f_s(w) > 0$ we also have $f_{s'}(w) > 0$. Equivalently, $L_s(W) \subseteq L_{s'}(W)$.

The condition can be verified by constructing NFA $N_s(W)$, $N_{s'}(W)$ that accept $L_s(W)$ and $L_{s'}(W)$ respectively and verifying $L(N_s(W)) \subseteq L(N_{s'}(W))$.

▶ Remark 10. Recall that NFA language containment is NL-complete if the automata are in fact deterministic, in P if they are unambiguous [10, Theorem 3], coNP-complete if they are unary [44] and PSPACE-complete in general [35]. In all cases this complexity level will match, or be lower than, that of our respective algorithm for the big-O problem.

We observe that, if $s$ is big-O of $s'$, the LC condition must hold, and so checking the LC condition is the first step in each of our verification routines. Example 11 shows that the condition alone is not sufficient to solve the big-O problem, because two states can admit the same set of words with non-zero weight, yet the weight ratios become unbounded.

▶ Example 11. Consider the unary automaton $W$ in Figure 1. We have $L_s(W) = L_{s'}(W) = \{ a^n \mid n \ge 1 \}$, but

$\frac{f_s(a^n)}{f_{s'}(a^n)} = \frac{(0.75)^{n-1} \cdot 0.25}{(0.5)^{n-1} \cdot 0.5} = 0.5 \cdot 1.5^{n-1} \xrightarrow[n \to \infty]{} \infty.$

▶ Remark 12.
The original big-O notation on $f, g : \mathbb{N} \to \mathbb{N}$ states that $f$ is $O(g)$ if $\exists C, k > 0 \; \forall n > k \; f(n) \le C\, g(n)$. Despite excluding finitely many points, when $g(n) \ge 1$, it is equivalent to $\exists C > 0 \; \forall n > 0 \; f(n) \le C\, g(n)$, by taking $C$ large enough to deal with the finite prefix. In the paper, though, we formally consider $s$ to not be big-O of $s'$ if there exists even a single word $w$ such that $f_s(w) > 0$ and $f_{s'}(w) = 0$. However, for weighted automata, we could amend our definition to "eventually big-O" as follows: $\exists C > 0, k > 0 \; \forall w \in \Sigma^{\ge k} \; f_s(w) \le C \cdot f_{s'}(w)$.

The big-O problem reduces to its eventual variant by checking both the LC condition and the eventually big-O condition. Thus our undecidability (and hardness) results transfer to the eventually big-O problem. The eventually big-O problem can be solved via the big-O problem by "fixing" the LC condition through the addition of a branch from $s'$ that accepts all appropriate words with very low probability (see Appendix E for more details).

In this section, we prove the first decidability result, that is, polynomial-time solvability in the unambiguous case. We say a weighted automaton $W$ is unambiguous from a state $s$ if every word has at most one accepting path in $N_s(W)$.

▶ Lemma 13.
If a weighted automaton $W$ is unambiguous from states $s$ and $s'$, the big-O problem is decidable in polynomial time.

Proof sketch (see Appendix E.1). We construct a product weighted automaton, with edge weights of the form $M'(a)((q_1, q_1'), (q_2, q_2')) = \frac{M(a)(q_1, q_2)}{M(a)(q_1', q_2')}$, and ask if there is a cycle on a path from $(s, s')$ to $(t, t)$ with weight $> 1$, which can be detected in polynomial time using a variation on the Bellman-Ford algorithm. ◀
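The cycle check from the proof sketch can be illustrated as follows. This is a minimal sketch under stated assumptions, not the algorithm of Appendix E.1: it assumes a single final state `t`, that the LC condition has already been verified separately, and that the automaton is unambiguous from both start states. The function name `has_ratio_increasing_cycle` and the input encoding are hypothetical. The Bellman-Ford variation used here relaxes maximal path products over exact rationals instead of minimal path sums.

```python
from fractions import Fraction

def has_ratio_increasing_cycle(M, s, s2, t):
    """Detect a cycle of ratio-weight > 1 on a path from (s, s2) to (t, t).

    M maps each letter to a dict {(q, q2): weight} of positive transition weights.
    """
    # Product edges: both components read the same letter; weight is the ratio.
    edges = []
    for trans in M.values():
        for (q1, q2), w in trans.items():
            for (p1, p2), v in trans.items():
                if w > 0 and v > 0:
                    edges.append(((q1, p1), (q2, p2), w / v))

    def reachable(start, forward):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for x, y, _ in edges:
                src, dst = (x, y) if forward else (y, x)
                if src == u and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    # Keep only product states lying on some path from (s, s2) to (t, t).
    useful = reachable((s, s2), True) & reachable((t, t), False)
    edges = [e for e in edges if e[0] in useful and e[1] in useful]

    # best[v]: largest product over paths ending in v (virtual source of weight 1).
    best = {v: Fraction(1) for v in useful}
    for _ in range(len(useful) - 1):
        changed = False
        for u, v, r in edges:
            if best[u] * r > best[v]:
                best[v] = best[u] * r
                changed = True
        if not changed:
            return False
    # A further improvement after |V| - 1 rounds witnesses a cycle of product > 1.
    return any(best[u] * r > best[v] for u, v, r in edges)

# Hypothetical unary example: loop 3/4 on s, loop 1/2 on u, both exit to t.
F = Fraction
M = {"a": {("s", "s"): F(3, 4), ("s", "t"): F(1, 4),
           ("u", "u"): F(1, 2), ("u", "t"): F(1, 2)}}
unbounded = has_ratio_increasing_cycle(M, "s", "u", "t")
bounded = not has_ratio_increasing_cycle(M, "u", "s", "t")
```

Working with products of `Fraction` values avoids the rounding issues that taking logarithms of the ratios would introduce, at the cost of larger intermediate numbers.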
Note that the relevant behaviours are those on cycles: transitions which are taken at most once are of little significance to the big-O problem. Such transitions have at most a constant multiplicative effect on the ratio. This is the case whether or not the system is unambiguous.

coNP-complete

In this section we show coNP-completeness in the unary case.

▶ Theorem 14. The big-O problem for unary weighted automata is coNP-complete. It is coNP-hard even for unary labelled Markov chains.
For the upper bound, our analysis will refine the analysis of the growth of powers of non-negative matrices of Friedland and Schneider [18, 40], which gives the asymptotic order of growth of $A^n_{s,t} + A^{n+1}_{s,t} + \cdots + A^{n+q}_{s,t} \approx \rho^n n^k$ for some $\rho$, $k$ and $q$, which smooths over the periodic behaviour (see Theorem 18). Our results require a non-smoothed analysis, valid for each $n$. This isn't provided in [18, 40], where the smoothing forces the existence of a single limit, which we don't require. Our big-Θ lemma (Lemma 21) will accurately characterise the asymptotic behaviour of $A^n_{s,t}$ by exhibiting the correct value of $\rho$ and $k$ for every word.
Let $W$ be a unary non-negative weighted automaton with states $Q$, transition matrix $A$ and a unique final state $t$. When we refer to a path in $W$, we mean a path in the NFA of $W$, i.e. paths only use transitions with non-zero weights and states on a path may repeat.

▶ Definition 15.
A state $q$ can reach $q'$ if there is a path from $q$ to $q'$. In particular, any state $q$ can always reach itself.
A strongly connected component (SCC) $\varphi \subseteq Q$ is a maximal set of states such that for each $q, q' \in \varphi$, $q$ can reach $q'$. We denote by $SCC(q)$ the SCC of state $q$ and by $A_\varphi$ the $|\varphi| \times |\varphi|$ transition matrix of $\varphi$. Note every state is in an SCC, even if it is a singleton.
The DAG of $W$ is the directed acyclic graph of strongly connected components. Components $\varphi, \varphi'$ are connected by an edge if there exist $q \in \varphi$ and $q' \in \varphi'$ with $A(q, q') > 0$.
The spectral radius of an $m \times m$ matrix $A$ is the largest absolute value of its eigenvalues. Recall the eigenvalues of $A$ are $\{ \lambda \in \mathbb{C} \mid \text{there exists a vector } \vec{x} \in \mathbb{C}^m, \vec{x} \ne 0, \text{ with } A\vec{x} = \lambda \vec{x} \}$.
The spectral radius of $\varphi$, denoted by $\rho_\varphi$, is the spectral radius of $A_\varphi$. By $\rho(q)$ we denote the spectral radius of the SCC in which $q$ is a member.
We denote by $T_\varphi$ the period of the SCC $\varphi$: the greatest common divisor of return times for some state $s \in \varphi$, i.e. $\gcd\{ t \in \mathbb{N} \mid A^t(s, s) > 0 \}$. It is known that any choice of state in the SCC gives the same value (see e.g. [42, Theorem 1.20]). If $A_\varphi = [0]$ then $T_\varphi = 0$.
Let $P(s, s')$ be the set of paths from the SCC of $s$ to the SCC of $s'$ in the DAG of $W$. Thus a path $\pi \in P(s, s')$ is a sequence of SCCs $\varphi_1, \dots, \varphi_m$.
$T(s, s')$, called the local period between $s$ and $s'$, is defined by $T(s, s') = \operatorname{lcm}_{\pi \in P(s, s')} \gcd_{\varphi \in \pi} T_\varphi$.
The spectral radius between states $s$ and $s'$, written $\rho(s, s')$, is the largest spectral radius of any SCC seen on a path from $s$ to $s'$: $\rho(s, s') = \max_{\pi \in P(s, s')} \rho(\pi)$, where $\rho(\pi) = \max_{\varphi \in \pi} \rho_\varphi$ for $\pi \in P(s, s')$.
The following function captures the number of SCCs which attain the largest spectral radius on the path that has the most SCCs of maximal spectral radius. Let $k(s, s') = \max_{\pi \in P(s, s')} k(\pi) - 1$, where, for $\pi \in P(s, s')$, $k(\pi) = |\{ \varphi \in \pi \mid \rho_\varphi = \rho(s, s') \}|$.

▶ Remark 16.
Since our weighted automata have rational weights, the spectral radius of an SCC is an algebraic number, as the absolute value of a root of a polynomial with rational coefficients. In general, an algebraic number $z \in \mathbb{A}$ can be represented by a tuple $(p_z, a, b, r) \in \mathbb{Q}[x] \times \mathbb{Q}^3$, where $p_z$ is a polynomial over $x$ and $a, b, r$ specify an approximation to distinguish $z$ from all other roots: $z$ is the only root of $p_z(x)$ with $|z - (a + bi)| \le r$. This representation, which admits standard operations (addition, multiplication, absolute value, (in)equality testing, etc.), can be found in polynomial time (see, e.g. [36]). Henceforth, when we refer to the spectral radius we will implicitly mean representation in this form.

The asymptotic behaviours of weighted automata will be characterised using $(\rho, k)$-pairs:

▶ Definition 17. A $(\rho, k)$-pair is an element of $\mathbb{R} \times \mathbb{N}$. The ordering on $\mathbb{R} \times \mathbb{N}$ is lexicographic, i.e. $(\rho_1, k_1) \le (\rho_2, k_2) \iff \rho_1 < \rho_2 \lor (\rho_1 = \rho_2 \land k_1 \le k_2)$.

Friedland and Schneider [18, 40] essentially use $(\rho, k)$-pairs to show the asymptotic behaviour of the powers of non-negative matrices. In particular they find the asymptotic behaviour of the sum of several $A^n_{s,s'}$, smoothing the periodic behaviour of the matrix.

▶ Theorem 18 (Friedland and Schneider [18, 40]). Let $A$ be an $m \times m$ non-negative matrix, inducing a unary weighted automaton $W$ with states $Q = \{1, \dots, m\}$. Given $s, t \in Q$, let $B^n_{s,t} = A^n_{s,t} + A^{n+1}_{s,t} + \cdots + A^{n+T(s,t)-1}_{s,t}$. Then

$\lim_{n \to \infty} \frac{B^n_{s,t}}{\rho(s,t)^n \, n^{k(s,t)}} = c, \quad 0 < c < \infty.$

[Figure 2: Different rates for different phases.]
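The periods $T_\varphi$ from Definition 15, which feed into the local period $T(s, t)$ used in Theorem 18, can be computed with elementary graph algorithms. The sketch below is illustrative: the function name `scc_periods`, the nested-list encoding of $A$, and the four-state example matrix are assumptions made here, and spectral radii are deliberately not computed. The period of an SCC is obtained from breadth-first levels as the gcd of `level[u] + 1 - level[v]` over the edges inside the SCC, a standard technique; an SCC without internal edges gets period 0, matching the convention $T_\varphi = 0$ when $A_\varphi = [0]$.

```python
from collections import deque
from math import gcd

def scc_periods(A):
    """Map each state to (its SCC as a frozenset, the period of that SCC)."""
    n = len(A)
    # reach[i][j]: i can reach j (every state reaches itself, as in Definition 15).
    reach = [[i == j or A[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                      # Warshall-style transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True

    result = {}
    for root in range(n):
        if root in result:
            continue
        comp = [q for q in range(n) if reach[root][q] and reach[q][root]]
        # Breadth-first levels inside the SCC, starting from root.
        level = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in comp:
                if A[u][v] > 0 and v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
        # gcd over internal edges; stays 0 if the SCC has no internal edge.
        period = 0
        for u in comp:
            for v in comp:
                if A[u][v] > 0:
                    period = gcd(period, level[u] + 1 - level[v])
        for q in comp:
            result[q] = (frozenset(comp), period)
    return result

# Hypothetical example: states 0 and 1 form a 2-cycle, state 2 has a self-loop,
# state 3 is a sink with no outgoing transitions.
A = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.0, 0.0]]
periods = scc_periods(A)
```

The cubic-time closure keeps the sketch short; a linear-time SCC algorithm such as Tarjan's could be substituted for larger automata.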
In the case where the local period is 1 ($T(s, t) = T(s', t) = 1$), Theorem 18 can already be used to solve the big-O problem (in particular if the matrix $A$ is aperiodic). In this case $A^n_{s,t} = B^n_{s,t} = \Theta(\rho(s,t)^n n^{k(s,t)})$. Then to establish that $s$ is big-O of $s'$ we check that the language containment condition holds and that $(\rho(s, t), k(s, t)) \le (\rho(s', t), k(s', t))$. However, this is not sufficient if the local period is not 1.

▶ Example 19.
Consider the chains shown in Figure 2 with local period 2. The behaviour for $n \ge$ is $A^n_{s,t} = \Theta(0.\,^n\, n)$ and $A^n_{s',t} = \Theta(0.\,^n)$ when $n$ is odd and $A^n_{s',t} = \Theta(0.\,^n\, n)$ when $n$ is even. However, Theorem 18 tells us $B^n_{s,t} = \Theta(0.\,^n\, n)$ and $B^n_{s',t} = \Theta(0.\,^n\, n)$, suggesting the ratio is bounded, but in fact $s$ is not big-O of $s'$ (although $s'$ is big-O of $s$) because $\frac{A^{2n+1}_{s,t}}{A^{2n+1}_{s',t}} \xrightarrow[n \to \infty]{} \infty$.

coNP

Let $W$ be a unary weighted automaton and suppose we are asked whether $s$ is big-O of $s'$. We assume w.l.o.g. (a) that there is a unique final state $t$ with no outgoing transitions, and (b) that $s, s'$ do not appear on any cycle.

Next we define a 'degree function', which captures the asymptotic behaviour of each word $a^n$ by a $(\rho, k)$-pair, capturing the exponential and polynomial behaviours respectively.

▶ Definition 20.
Given a unary weighted automaton $W$, let $d_{s,t} : \mathbb{N} \to \mathbb{R} \times \mathbb{N}$ be defined by $d_{s,t}(n) = (\rho, k)$, where:
$\rho$ is the largest spectral radius of any vertex visited on any path of length $n$ from $s$ to $t$;
the path from $s$ to $t$ that visits the most SCCs of spectral radius $\rho$ visits $k + 1$ such SCCs;
if there is no length-$n$ path from $s$ to $t$, then $(\rho, k) = (0, 0)$.

Let $s, t \in Q$ be fixed. We are now ready to state the key technical lemma of this subsection (cf. Theorem 18, Friedland and Schneider [18, 40]), where we assume the functions $\rho(n), k(n)$, defined by $d_{s,t}(n) = (\rho(n), k(n))$.

▶ Lemma 21 (The big-Θ lemma). There exist $c, C > 0$ such that, for every $n > |Q|$,

$c \cdot \rho(n)^n n^{k(n)} \le A^n_{s,t} \le C \cdot \rho(n)^n n^{k(n)}.$

Footnote (to assumption (b), that $s, s'$ do not appear on any cycle): if this is not the case, copies of $s, s'$ and their transitions can be taken.

The set of admissible $(\rho, k)$-pairs is the image of $d_{s,t}$. Observe that this set is finite and of size at most $|Q|^2$: there can be no more than $|Q|$ values of $\rho$ (if at worst each state were its own SCC) and the value of $k$ is also bounded by the number of SCCs and thus $|Q|$.

We next define the $(\rho, k)$-annotated version of $W$, i.e. in each state we record the relevant value of $(\rho, k)$ corresponding to the current run to the state.

▶ Definition 22 (The weighted automaton $W^\dagger$). Given $W = \langle Q, \Sigma, A, \{t\} \rangle$ and $s \in Q$, the weighted automaton $W^\dagger$ has states of the form $(q, \rho, k)$ for all $q \in Q$ and all admissible $(\rho, k)$-pairs, the same $\Sigma$ and no final states. For every transition $q \xrightarrow{p} q'$ from $W$, denoting $A(q, q') = p$, include the following transition in $W^\dagger$ for each admissible $(\rho, k)$:
$(q, \rho, k) \xrightarrow{p} (q', \rho, k)$ if $SCC(q) = SCC(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho, k + 1)$ if $SCC(q) \ne SCC(q')$ and $\rho = \rho(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho, k)$ if $SCC(q) \ne SCC(q')$ and $\rho > \rho(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho(q'), 0)$ if $SCC(q) \ne SCC(q')$ and $\rho(q') > \rho$.

$W^\dagger$ is constructable in polynomial time given $W$. Indeed, the spectral radii of all SCCs can be computed and compared to each other in time polynomial in the size of $W$ (see Remark 16).

For the following lemma, recall the language containment (LC) condition from Definition 9 and the ordering on $(\rho, k)$-pairs from Definition 17.

▶ Lemma 23.
A state s is big-O of s if and only if the LC condition holds and, for all butfinitely many n ∈ N , we have d s,t ( n ) ≤ d s ,t ( n ) . Proof sketch.
Whenever d s,t ( n ) ≤ d s ,t ( n ), by Lemma 21, we have f s ( a n ) ≤ ( Cc ( ρρ ) n n k − k ) · f s ( a n ), in which case either d s,t ( n ) = d s ,t ( n ) and ( ρρ ) n n k − k = 1 or lim n →∞ ( ρρ ) n n k − k = 0and so ( ρρ ) n n k − k ≤ n .However, whenever d s,t ( n ) > d s ,t ( n ), Lemma 21 yields f s ( a n ) ≥ ( cC ( ρρ ) n n k − k ) · f s ( a n )but then lim n →∞ ( ρρ ) n n k − k = ∞ . (cid:74) We are going to use the characterisation from Lemma 23 to prove Theorem 14. As alreadydiscussed, the LC condition can be checked via NFA inclusion testing. To tackle the “for allbut finitely many ...” condition, we introduce the concept of eventual inclusion. (cid:73)
Definition 24. Given sets $A, B$, we say $A$ is eventually included in $B$, written $A \overset{\sim}{\subset} B$, if and only if $A \setminus B$ is finite.

The next three lemmas relate deciding the big-O problem using the characterisation of Lemma 23 to eventual inclusion. The missing proofs are available in the Appendix.
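To make Definition 24 concrete, here is a minimal Python sketch, assuming the sets of naturals are ultimately periodic (as the length sets of unary regular languages are). The names `UltimatelyPeriodicSet` and `eventually_included` are illustrative and do not come from the paper, and the check is a brute-force comparison over one common period rather than a coNP procedure.

```python
from dataclasses import dataclass
from math import lcm  # available from Python 3.9


@dataclass(frozen=True)
class UltimatelyPeriodicSet:
    """A set of naturals that is periodic from `threshold` onwards.

    For n < threshold, n is a member iff n is in `prefix`.
    For n >= threshold, n is a member iff (n - threshold) % period
    is in `residues`.
    """
    prefix: frozenset
    threshold: int
    period: int
    residues: frozenset

    def __contains__(self, n: int) -> bool:
        if n < self.threshold:
            return n in self.prefix
        return (n - self.threshold) % self.period in self.residues


def eventually_included(a: UltimatelyPeriodicSet,
                        b: UltimatelyPeriodicSet) -> bool:
    """Return True iff a \\ b is finite (eventual inclusion)."""
    # Beyond both thresholds, membership in a and in b repeats with
    # the common period, so a \ b is finite iff it is empty on one
    # full window of that length.
    start = max(a.threshold, b.threshold)
    window = lcm(a.period, b.period)
    return all(n in b for n in range(start, start + window) if n in a)


# Illustrative sets: the even numbers, and the multiples of 4 from 4
# onwards together with the two exceptional elements 1 and 3.
evens = UltimatelyPeriodicSet(frozenset(), 0, 2, frozenset({0}))
fours_plus = UltimatelyPeriodicSet(frozenset({1, 3}), 4, 4, frozenset({0}))
```

In this example `fours_plus` is eventually included in `evens` although it is not a subset of it, since only the finitely many elements 1 and 3 fall outside.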
Lemma 25. Given unary NFAs $N_1, N_2$, the problem $L(N_1) \overset{\sim}{\subset} L(N_2)$ is in coNP.

Lemma 26. Suppose $d, d' : \mathbb{N} \to X$, with $(X, \le)$ a finite total order. Then $d(n) \le d'(n)$ for all but finitely many $n$ if and only if $\{n \mid d(n) \ge x\} \overset{\sim}{\subset} \{n \mid d'(n) \ge x\}$ for all $x \in X$.

Lemma 27. Given a unary weighted automaton $W$, the associated problem whether $d_{s,t}(n) \le d_{s',t}(n)$ for all but finitely many $n \in \mathbb{N}$ is in coNP.

Proof. Given an admissible pair $x = (\rho, k)$, we construct an NFA $N_{s,x}$ accepting $\{a^n \mid d_{s,t}(n) \ge x\}$ (similarly $N_{s',x}$ for $s'$), by taking the NFA $N_s(W^\dagger)$ (Definitions 2, 22) with a suitable choice of accepting states. Recall that states in $W^\dagger$ are of the form $(q, \rho', k')$, where $q$ is a state from $W$ and $(\rho', k')$ is admissible. If we designate states $(t, \rho', k')$ with $(\rho', k') \ge x$ as accepting, it will accept $\{a^n \mid d_{s,t}(n) \ge x\}$. This is a polynomial-time construction.

Then, by Lemma 26, the problem whether $d_{s,t}(n) \le d_{s',t}(n)$ for all but finitely many $n \in \mathbb{N}$ is equivalent to $L(N_{s,x}) \overset{\sim}{\subset} L(N_{s',x})$ for all admissible $x$. As there are at most $|Q|^2$ values of $x$ and each can be verified non-deterministically in coNP, it suffices to show that $L(N_{s,x}) \overset{\sim}{\subset} L(N_{s',x})$ is in coNP for each $x$. This is the case by Lemma 25. ◀

Remark 10 and Lemma 27 together complete the upper bound result for Theorem 14.
Remark. Lemma 26 may appear simpler using $\{n \mid d(n) = x\} \overset{\sim}{\subset} \{n \mid d'(n) \ge x\}$. However, it does not seem possible to construct an NFA for $\{a^n \mid d_{s,t}(n) = x\}$ in polynomial time. Taking just $(t, \rho, k)$ as accepting would not be correct, as there could be paths of the same length ending in $(t, \rho', k')$ with $(\rho', k') > (\rho, k)$. Using $\ge$ instead of $=$ avoids this problem.

Remark.
An alternative approach for obtaining an upper bound could be to compute the Jordan normal form of the transition matrix and consider its powers. Instead of the interplay of strongly connected components in the transition graph, we would need to consider linear combinations of the $n$th powers of complex numbers (such as roots of unity). It is not clear that this algebraic approach leads to a representation more convenient for our purposes.

coNP-hardness for unary LMC

Given a unary NFA $N$, the NFA universality problem asks if $L(N) = \{a^n \mid n \in \mathbb{N}\}$. This problem is coNP-complete [44]. We exhibit a polynomial-time reduction from (a variant of) the unary universality problem to the big-O problem on unary Markov chains.

In this section we consider the big-O problem for a weighted automaton $W$ and states $s, s'$ such that $L_s(W)$, $L_{s'}(W)$ are bounded. Throughout the section, we assume that the LC condition has already been checked, i.e. $L_s(W) \subseteq L_{s'}(W)$. We will show that the problem is conditionally decidable, subject to Schanuel's conjecture.

Logical theories of arithmetic and Schanuel's conjecture. In first-order logical theories of arithmetic, variables denote numbers (from $\mathbb{Z}$ or $\mathbb{R}$, as appropriate), and atomic predicates are equalities and inequalities between terms built from variables and function symbols. Nullary function symbols are constants, always from $\mathbb{Z}$. If binary addition and multiplication are available, then:
- for $\mathbb{R}$ we obtain the first-order theory of the reals, where the truth value of sentences is decidable due to the celebrated Tarski–Seidenberg theorem [3, Chapter 11 and Theorem 2.77];
- for $\mathbb{Z}$, the first-order theory of the integers is, in contrast, undecidable (see, e.g., [39]).

In the case of $\mathbb{R}$, adding the unary symbol for the exponential function $x \mapsto e^x$ leads to the first-order theory of the real numbers with exponential function ($\mathrm{Th}(\mathbb{R}_{\exp})$). Logarithms base 2, for example, are easily expressible in $\mathrm{Th}(\mathbb{R}_{\exp})$. The decidability of $\mathrm{Th}(\mathbb{R}_{\exp})$ is an open problem and hinges upon Schanuel's conjecture [30].

Schanuel's conjecture [29] is a unifying conjecture of transcendental number theory, saying that for all $z_1, \dots, z_n \in \mathbb{C}$ linearly independent over $\mathbb{Q}$ the field extension $\mathbb{Q}(z_1, \dots, z_n, e^{z_1}, \dots, e^{z_n})$ has transcendence degree at least $n$ over $\mathbb{Q}$, meaning that for some $S \subseteq \{z_1, \dots, z_n, e^{z_1}, \dots, e^{z_n}\}$ of cardinality $n$, say $S = \{s_1, \dots, s_n\}$, the only polynomial $p$ over $\mathbb{Q}$ satisfying $p(s_1, \dots, s_n) = 0$ is $p \equiv 0$. See, e.g., Waldschmidt's book [48, Section 1.4] for further context. If indeed true, this conjecture would generalise several known results, including the Lindemann–Weierstrass theorem and Baker's theorem, and would entail the decidability of $\mathrm{Th}(\mathbb{R}_{\exp})$. Our work follows an exciting line of research that reduces problems from verification [12, 31], linear dynamical systems [2, 8], and symbolic computation [24] to the decision problem for $\mathrm{Th}(\mathbb{R}_{\exp})$.

Figure 3 Relative orderings are the same, but the boundedness question is different.

Theorem 28.
Given a weighted automaton $W = \langle Q, \Sigma, M, F \rangle$, $s, s' \in Q$, with $L_s(W)$ and $L_{s'}(W)$ bounded, it is decidable whether $s$ is big-O of $s'$, subject to Schanuel's conjecture.

In the unary case, it was sufficient to consider the relative order between spectral radii, with careful handling of the periodic behaviour. This approach is insufficient in the bounded case. Example 29 highlights that the actual values of the spectral radii have to be examined.
Example 29 (Relative orderings are insufficient) . Consider the LMC in Figure 3, with 0 . ≤ p ≤ .
62. We have f s ( a m b n ) = Θ(0 . m . n ) and f s ( a m b n ) = Θ( p m . n + 0 . m . n ).Note that neither 0 . m . n nor p m . n dominate, nor are dominated by, 0 . m . n for anyvalue of 0 . ≤ p ≤ .
62. That is, there are values of m, n where 0 . m . n (cid:29) . m . n (in particular large n ) and values of m, n where 0 . m . n (cid:28) . m . n (in particular large m ); similarly for p m . n vs 0 . m . n (but the cases in which n or m needs to be large areswapped). However, the big-O status can be different for different values of p ∈ [0 . , . p = 0 .
62, the ratio turnsout to be bounded: f s ( a m b n ) f s ( a m b n ) ≤ for all m, n (in particular, maximal at m = n = 0). Incontrast, when p = 0 .
61, we have f s ( a m b . m ) f s ( a m b . m ) −−−−→ m →∞ ∞ .

We first prove Theorem 28 for the plus-letter-bounded case, which is the most technically involved; the other bounded cases will be reduced to it. In the plus-letter-bounded case, we will characterise the behaviour of such automata, generalising $(\rho, k)$-pairs of the unary case. We will need to rely upon the first-order theory of the reals with exponentials to compare these behaviours.

We assume $L_{s'}(W) \subseteq a_1^+ \cdots a_m^+$, where $a_1, \cdots, a_m \in \Sigma$ and, because the LC condition holds, we also have $L_s(W) \subseteq a_1^+ \cdots a_m^+$. In the plus-letter-bounded case, without loss of generality, we assume $a_i \ne a_j$ for $i \ne j$ (see Appendix G for a justification). Then any word $w = a_1^{n_1} \dots a_m^{n_m}$ is uniquely specified by a vector $(n_1, \dots, n_m) \in \mathbb{N}^m_{>0}$, where $n_i$ is the number of $a_i$'s in $w$.

Like in Definition 20, we define a degree function $d$, which will be used to study the asymptotic behaviour of words. This time we will associate a separate $(\rho, k)$ pair to each of the $m$ characters and, consequently, words will induce sequences of the form $(\rho_1, k_1) \cdots (\rho_m, k_m)$. Further, as there may be multiple, incomparable behaviours, words will induce sets of such sequences, i.e. $d : \mathbb{N}^m \to \mathcal{P}((\mathbb{R} \times \mathbb{N})^m)$. For the sake of comparisons, it will be convenient to focus on maximal elements with respect to the pointwise order on $(\mathbb{R} \times \mathbb{N})^m$, written $\le$, where the lexicographic order (recall Definition 17) is used to compare elements of $\mathbb{R} \times \mathbb{N}$.

Recall that Lemma 21 does not capture the asymptotics when $n \le |Q|$. In the unary case this is inconsequential, as small words are covered by the finitely many exceptions and the LC condition. However, here, a small number of one character may be used to enable access to a particular part of the automaton in another character. For this case, we introduce a new number $\delta = \frac{1}{2} \min_{\varphi : \rho_\varphi > 0} \rho_\varphi$, which is strictly smaller than the spectral radius of every non-zero SCC (so will not dominate with the partial order), but non-zero.

Definition 30.
Let $\hat{\rho} = (\rho_1, k_1), \cdots, (\rho_m, k_m) \in (\mathbb{R} \times \mathbb{N})^m$. An $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ (to the final state) is compatible with $\hat{\rho}$ if, for each $i = 1, \dots, m$, it visits $k_i + 1$ SCCs with spectral radius $\rho_i$ while reading $a_i$, unless the path visits only singletons with no loops, in which case $(\rho_i, k_i) = (\delta, 0)$. The notation $(\rho, k) \in \hat{\rho}$ is used for '$(\rho, k)$ is an element of $\hat{\rho}$'.

Definition 31. Let $d_s : \mathbb{N}^m \to \mathcal{P}((\mathbb{R} \times \mathbb{N})^m)$ be s.t. $\hat{\rho} \in d_s(n_1, \dots, n_m)$ if and only if
(1) there exists an $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ to the final state compatible with $\hat{\rho}$, and
(2) for every $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ compatible with $\hat{\sigma}$ s.t. $\hat{\rho} \le \hat{\sigma}$, we have $\hat{\rho} = \hat{\sigma}$.

Observe that $\hat{\rho}$ may range over at most $|Q|^{2m}$ possible values. We write $\mathcal{D}$ for the set containing them, so that $d_s : \mathbb{N}^m \to \mathcal{P}(\mathcal{D})$. In this extended setting, the big-Θ lemma (Lemma 21) may be generalised as follows.

Lemma 32.
Denote $z(n_1, \cdots, n_m) = \sum_{\hat{\rho} \in d_s(n_1, \dots, n_m)} \prod_{(\rho_i, k_i) \in \hat{\rho}} \rho_i^{n_i} \cdot n_i^{k_i}$. There exist $c, C > 0$ such that for all $n_1, \dots, n_m \in \mathbb{N}$:
$$c \cdot z(n_1, \cdots, n_m) \le f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) \le C \cdot z(n_1, \cdots, n_m).$$

The following lemma provides the key characterisation of negative instances of the big-O problem, in the plus-letter-bounded case and assuming the LC condition. Here and below, we write $n(t)$ to refer to the $t$th vector in a sequence $n : \mathbb{N} \to \mathbb{N}^m$.

Lemma 33 (Main lemma). Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exists a sequence $n : \mathbb{N} \to \mathbb{N}^m$ and $X \in \mathcal{D}$, $\mathcal{Y} \subseteq \mathcal{D}$ such that
(a) $X \in d_s(n(t))$ and $\mathcal{Y} = d_{s'}(n(t))$ for all $t$, and
(b) for all $j \in h_{\mathcal{Y}}$, the sequence $n$ satisfies
$$\sum_{i=1}^{m} \alpha_{j,i}\, n(t)_i + p_{j,i} \log n(t)_i \xrightarrow[t \to \infty]{} -\infty,$$
where $h_{\mathcal{Y}} \subseteq \{1, \dots, |\mathcal{Y}|\}$, $\alpha_{j,i} \in \mathbb{R}$, $p_{j,i} \in \mathbb{Z}$ ($1 \le i \le m$) are uniquely determined by $X$ and $\mathcal{Y}$ (in a way detailed below), $h_{\mathcal{Y}}$ and the $p_{j,i}$'s are effectively computable and the $\alpha_{j,i}$'s are first-order expressible (with exponential function).

Proof. Observe that $s$ is not big-O of $s'$ iff there exists an infinite sequence of words such that, for all $C > 0$, the sequence contains a word $w$ such that $\frac{f_s(w)}{f_{s'}(w)} > C$. Thanks to Lemma 32, this is equivalent to the existence of a sequence $n : \mathbb{N} \to \mathbb{N}^m$ such that
$$\frac{\sum_{X \in d_s(n(t)_1, \dots, n(t)_m)} \prod_{(\rho_i, k_i) \in X} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}}{\sum_{Y \in d_{s'}(n(t)_1, \dots, n(t)_m)} \prod_{(\sigma_i, \ell_i) \in Y} \sigma_i^{n(t)_i} \cdot n(t)_i^{\ell_i}} \xrightarrow[t \to \infty]{} \infty,$$
where $n(t)_i$ denotes the $i$th component of $n(t)$. Since there are finitely many possible values of $d_s$ and $d_{s'}$, it suffices to look for sequences $n$ such that $d_s(n(t))$ and $d_{s'}(n(t))$ are fixed. Further, because the numerator is a finite sum, it suffices that a single summand $X \in d_s(n(t))$ makes the ratio diverge. Thus, we need to determine whether there exist $X \in \mathcal{D}$, $\mathcal{Y} \subseteq \mathcal{D}$ and $n : \mathbb{N} \to \mathbb{N}^m$ such that $X \in d_s(n(t))$, $d_{s'}(n(t)) = \mathcal{Y}$ (for all $t$) and
$$\frac{\prod_{i=1}^{m} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}}{\sum_{j=1}^{|\mathcal{Y}|} \prod_{i=1}^{m} \sigma_{ji}^{n(t)_i} \cdot n(t)_i^{\ell_{ji}}} \xrightarrow[t \to \infty]{} \infty,$$
where $X = (\rho_1, k_1) \cdots (\rho_m, k_m)$, $\mathcal{Y} = \{Y_1, \cdots, Y_{|\mathcal{Y}|}\}$, and $Y_j = (\sigma_{j1}, \ell_{j1}) \cdots (\sigma_{jm}, \ell_{jm})$ ($1 \le j \le |\mathcal{Y}|$). Taking the reciprocal and requiring each of the summands to go to zero, we obtain
$$\frac{\prod_{i=1}^{m} \sigma_{ji}^{n(t)_i} \cdot n(t)_i^{\ell_{ji}}}{\prod_{i=1}^{m} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}} = \prod_{i=1}^{m} \left(\frac{\sigma_{ji}}{\rho_i}\right)^{n(t)_i} n(t)_i^{\ell_{ji} - k_i} \xrightarrow[t \to \infty]{} 0 \quad \text{for all } 1 \le j \le |\mathcal{Y}|.$$
If we take logarithms, letting $\alpha_{j,i} = \log\left(\frac{\sigma_{ji}}{\rho_i}\right)$ and $p_{j,i} = \ell_{ji} - k_i$, we get
$$\sum_{i=1}^{m} \alpha_{j,i}\, n(t)_i + p_{j,i} \log n(t)_i \xrightarrow[t \to \infty]{} -\infty \quad \text{for all } j \in h_{\mathcal{Y}} = \{1 \le j \le |\mathcal{Y}| \mid \sigma_{ji} > 0 \text{ for all } 1 \le i \le m\}.$$

The number $\alpha_{j,i}$ is the logarithm of the ratio of two algebraic numbers, which are not given explicitly. However, they admit an unambiguous, first-order expressible characterisation (see Remark 16). The logarithm is encoded using the exponential function: $\log(z)$ is $\exists x \in \mathbb{R} : \exp(x) = z$. ◀

Lemma 33 identifies violation of the big-O property using two conditions. In the remainder of this subsection we will handle Condition (a) using automata-theoretic tools (the Parikh theorem and semi-linear sets) and Condition (b) using logics. In summary, the characterisation of Lemma 33 will be expressed in the first-order theory of the reals with exponentiation, which is decidable subject to Schanuel's conjecture.
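The passage from the product form to the linear-logarithmic form of Condition (b) can be checked numerically. The Python sketch below is illustrative only: the function names and the sample values of $\rho_i, k_i, \sigma_{ji}, \ell_{ji}$ are assumptions chosen for the example, and floating-point logarithms stand in for the first-order definable $\alpha_{j,i}$.

```python
import math


def log_ratio_terms(sigma, ell, rho, k):
    """Coefficients of Condition (b) for one j.

    alpha[i] = log(sigma[i] / rho[i]) and p[i] = ell[i] - k[i].
    All sigma[i] and rho[i] are assumed strictly positive.
    """
    alpha = [math.log(s / r) for s, r in zip(sigma, rho)]
    p = [l - kk for l, kk in zip(ell, k)]
    return alpha, p


def condition_b_value(alpha, p, n):
    """Value of sum_i alpha[i] * n[i] + p[i] * log(n[i]) for n[i] >= 1."""
    return sum(a * ni + pi * math.log(ni)
               for a, pi, ni in zip(alpha, p, n))


# Hypothetical data for m = 2 letters (not taken from the paper).
alpha, p = log_ratio_terms(sigma=(0.4, 0.6), ell=(1, 0),
                           rho=(0.5, 0.5), k=(0, 0))

# Along n(t) = (t, 1) the value decreases without bound, while along
# n(t) = (1, t) it increases, so the choice of sequence matters.
for t in (10, 100, 1000):
    print(t, condition_b_value(alpha, p, (t, 1)),
          condition_b_value(alpha, p, (1, t)))
```

With these sample values the first letter has ratio below 1 and the second above 1, which is why the two directions behave differently; deciding whether some sequence drives every $j \in h_{\mathcal{Y}}$ to $-\infty$ simultaneously is the harder question the remainder of the subsection addresses.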
Condition (a) via automata
It turns out that sequences $n$ satisfying Condition (a) in Lemma 33 can be captured by a finite automaton. In more detail, for any $X \in \mathcal{D}$, there exists an automaton $N^s_X$ such that $L(N^s_X) = \{a_1^{n_1} \cdots a_m^{n_m} \mid X \in d_s(n_1, \cdots, n_m)\}$. For any $\mathcal{Y} \subseteq \mathcal{D}$, there exists an automaton $N^{s'}_{\mathcal{Y}}$ such that $L(N^{s'}_{\mathcal{Y}}) = \{a_1^{n_1} \cdots a_m^{n_m} \mid d_{s'}(n_1, \cdots, n_m) = \mathcal{Y}\}$. The relevant automaton capturing $X$ and $\mathcal{Y}$ is then found by taking the intersection of $L(N^s_X)$ and $L(N^{s'}_{\mathcal{Y}})$.

Lemma 34. For any $X \in \mathcal{D}$ and $\mathcal{Y} \subseteq \mathcal{D}$, there exists an automaton $N_{X,\mathcal{Y}}$ such that $L(N_{X,\mathcal{Y}}) = \{a_1^{n_1} \cdots a_m^{n_m} \mid X \in d_s(n_1, \cdots, n_m),\ \mathcal{Y} = d_{s'}(n_1, \cdots, n_m)\}$.

Because of our $a_i \ne a_j$ assumption, the vector $(n_1, \cdots, n_m)$ indicates the number of occurrences of each character. The set of such vectors derived from the language of an automaton is known as the Parikh image of this language [37]. It is well known that the Parikh image of the language of an NFA is a semi-linear set, i.e. a finite union of linear sets (a linear set has the form $\{\vec{b} + \lambda_1 \vec{r}_1 + \cdots + \lambda_s \vec{r}_s \mid \lambda_1, \dots, \lambda_s \in \mathbb{N}\}$, where $\vec{b} \in \mathbb{N}^m$ is the base vector and $\vec{r}_1, \cdots, \vec{r}_s \in \mathbb{N}^m$ are called period vectors). However, since $L(N_{X,\mathcal{Y}}) \subseteq a_1^+ a_2^+ \dots a_m^+$, the linear sets are of a very particular form, where each $\vec{r}_i$ is a constant multiple of the $i$th unit vector.

Lemma 35. The language of $N_{X,\mathcal{Y}}$ can be effectively decomposed as $L(N_{X,\mathcal{Y}}) = \bigcup_{k=1}^{S_{X,\mathcal{Y}}} L_k$, where
$$L_k = \left\{ a_1^{b_{k1} + r_{k1}\lambda_1} \cdots a_m^{b_{km} + r_{km}\lambda_m} \mid \lambda_1, \cdots, \lambda_m \in \mathbb{N} \right\},$$
$S_{X,\mathcal{Y}} \in \mathbb{N}$ and $b_{ki}, r_{ki} \in \mathbb{N}$ ($1 \le k \le S_{X,\mathcal{Y}}$, $1 \le i \le m$).

Lemma 35 captures Condition (a) of Lemma 33 precisely.
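The special shape of the linear sets in Lemma 35 makes membership a coordinate-wise check. The following Python sketch illustrates this under the assumption that each component $L_k$ is given as a pair of tuples (base $b_k$, period $r_k$); the function names and the sample components are hypothetical, and the sketch does not compute the decomposition itself.

```python
def in_linear_component(n, base, period):
    """Check whether n = (n_1, ..., n_m) lies in one component L_k.

    base[i] and period[i] play the roles of b_ki and r_ki: the i-th
    coordinate must equal base[i] + period[i] * lam for some natural lam.
    """
    for n_i, b_i, r_i in zip(n, base, period):
        if n_i < b_i:
            return False
        if r_i == 0:
            # No period in this coordinate: only the base value fits.
            if n_i != b_i:
                return False
        elif (n_i - b_i) % r_i != 0:
            return False
    return True


def in_decomposed_language(n, components):
    """Membership in the finite union of components (base, period)."""
    return any(in_linear_component(n, base, period)
               for base, period in components)


# Two hypothetical components for m = 2 letters:
#   L_1 = { a1^(1 + 2*l1) a2^2 }  and  L_2 = { a1^2 a2^(1 + 3*l2) }.
components = [((1, 2), (2, 0)), ((2, 1), (0, 3))]
```

Because every period vector is a multiple of a unit vector, no integer programming is needed here; for general semi-linear sets the membership test would be more involved.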
Condition (b) via logic
With Lemma 35 in place, we now move on to add Condition (b) to the existing machinery. In fact, the logical formulae in the following lemmas will express the conjunction of both conditions of Lemma 33.

Lemma 36. Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exist $X \in \mathcal{D}$, $\mathcal{Y} \subseteq \mathcal{D}$, $1 \le k \le S_{X,\mathcal{Y}}$ such that
$$\forall C < 0\ \ \exists \vec{\lambda} \in \mathbb{N}^m \ \bigwedge_{j \in h_{\mathcal{Y}}} \ \sum_{i=1}^{m} \alpha_{j,i}(b_{ki} + r_{ki}\lambda_i) + p_{j,i} \log(b_{ki} + r_{ki}\lambda_i) < C,$$
where $h_{\mathcal{Y}}, \alpha_{j,i}, p_{j,i}$ (resp. $b_{ki}, r_{ki}$) satisfy the same conditions as in Lemma 33 (resp. 35).

Note that the formula of Lemma 36 uses quantification over natural numbers. Our next step will be to replace integer variables with real variables. In other words, we will obtain an equivalent condition in the first-order theory of the reals with exponentiation, as follows.

Lemma 37. Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exist $X \in \mathcal{D}$, $\mathcal{Y} \subseteq \mathcal{D}$, $1 \le k \le S_{X,\mathcal{Y}}$ and $U \subseteq \{i \in \{1, \cdots, m\} \mid r_{ki} > 0\}$ such that
$$\forall C < 0\ \ \exists \vec{x} \in \mathbb{R}^{|U|}_{\ge B_k} \ \bigwedge_{j \in h_{\mathcal{Y}}} \ \sum_{i \in U} \alpha_{j,i}\, r_{ki}\, x_i + p_{j,i} \log(x_i) < C,$$
where $B_k = \max_i b_{ki}$ and $h_{\mathcal{Y}}, \alpha_{j,i}, p_{j,i}, b_{ki}, r_{ki}$ are as in Lemma 36.

Proof sketch. Compare the logical characterisation in Lemmas 36 and 37. The first difference to note is that the effect of the $b_{ki}$'s is simply a constant offset, and so the sequence would tend to $-\infty$ with or without its presence. Similar simplifications can be made inside the logarithm: the multiplicative effect of $r_{ki}$ inside the logarithm can be extracted as an additive offset and thus similarly be discarded.

The second crucial difference is to relax the variable domains from integers to reals. If each of the $\lambda_i$ in the satisfying assignment is sufficiently large, we show we can relax the condition to real numbers rather than integers without affecting whether the sequence goes to $-\infty$. To do this, we test sets of indices $U$, where if $i \in U$ then $\lambda_i$ needs to be arbitrarily large over all $C$ (i.e. unbounded). The positions where $\lambda_i$ is always bounded are again a constant offset and are omitted. ◀

By testing the LC condition and the condition from Lemma 37 for each possible $X, \mathcal{Y}, k, U$, in turn using the relevant (conditionally decidable) first-order theory of the reals, we have:

Lemma 38.
Given a weighted automaton $W$ and states $s, s'$ such that $L_s(W)$ and $L_{s'}(W)$ are plus-letter-bounded, it is decidable whether $s$ is big-O of $s'$, subject to Schanuel's conjecture.

Here we consider the case where $L_s(W)$ and $L_{s'}(W)$ are letter-bounded, i.e. $L_s(W)$ and $L_{s'}(W)$ are subsets of $a_1^* \dots a_m^*$ for some $a_1, \dots, a_m \in \Sigma$, which is a relaxation of the preceding case. For the plus-letter-bounded case, we relied on a 1-1 correspondence between numeric vectors and words. This correspondence no longer holds in the letter-bounded case: for example, $a^n$ matches $a^* b^* a^*$, but it could correspond to $(n, 0, 0)$, $(0, 0, n)$, as well as any $(n_1, 0, n_2)$ with $n_1 + n_2 = n$. Still, there is a reduction to the plus-letter-bounded case.

Lemma 39. The big-O problem for $W$, $s, s'$ with $L_s(W)$ and $L_{s'}(W)$ letter-bounded reduces to the plus-letter-bounded case.

Proof. Suppose the LC condition holds and $L_s(W) \subseteq L_{s'}(W) \subseteq a_1^* \cdots a_m^*$. Let $I$ be the set of strictly increasing sequences $\vec{\imath} = i_1 \cdots i_k$ of integers between $1$ and $m$. Given $\vec{\imath} \in I$, let $W_{\vec{\imath}}$ be the weighted automaton obtained by intersecting $W$ with a DFA for $a_{i_1}^+ \cdots a_{i_k}^+$ whose initial state is $q$. Note that $s$ is big-O of $s'$ (in $W$) iff $(s, q)$ is big-O of $(s', q)$ in $W_{\vec{\imath}}$ for all $\vec{\imath} \in I$, because $a_1^* \cdots a_m^* = \bigcup_{\vec{\imath} \in I} a_{i_1}^+ \cdots a_{i_k}^+$. Because the big-O problem for each $W_{\vec{\imath}}$, $(s, q)$, $(s', q)$ falls into the plus-letter-bounded case, the result follows from Lemma 38. ◀

Here we consider the case where $L_s(W)$ and $L_{s'}(W)$ are bounded, which is a relaxation of letter-boundedness (see Definition 6): $L_s(W)$ and $L_{s'}(W)$ are subsets of $w_1^* \dots w_m^*$ for some $w_1, \dots, w_m \in \Sigma^*$. We show a reduction to the letter-bounded case from Section 6.2.

To showcase the difference to the letter-bounded case, consider the language $(abab)^* a^* b^* (ab)^*$. Observe that, for example, the word $(ab)^4$ can be decomposed in a number of ways: $(abab)^0 a^0 b^0 (ab)^4$, $(abab)^1 a^0 b^0 (ab)^2$, $(abab)^2 a^0 b^0 (ab)^0$, $(abab)^0 a^1 b^1 (ab)^3$ or $(abab)^1 a^1 b^1 (ab)^1$. One must be careful to consider all such decompositions.

Lemma 40.
The big-O problem for $W$, $s, s'$ with $L_s(W)$ and $L_{s'}(W)$ bounded reduces to the letter-bounded case.

Proof sketch (see Appendix G.3).
Suppose $W$ is bounded over $w_1^* \dots w_m^*$. We will construct a new weighted automaton $W'$, letter-bounded over $a_1^* \dots a_m^*$ for a new alphabet, with the following property: for every decomposition of a word $w$ as $w_1^{n_1} \dots w_m^{n_m}$, the weight of $a_1^{n_1} \dots a_m^{n_m}$ in $W'$ is equal to the weight of $w$ in $W$. ◀

Despite undecidability results, we have identified several decidable cases of the big-O problem. However, for bounded languages, the result depends on a conjecture from number theory, leaving open the exact borderline between decidability and undecidability.

Natural directions for future work include the analogous problem for infinite words, further analysis on ambiguity (e.g., is the big-O problem decidable for $k$-ambiguous weighted automata?), and the extension to negative edge weights.

References

[1] Shaull Almagor, Udi Boker, and Orna Kupferman. What's decidable about weighted automata? In ATVA, volume 6996 of Lecture Notes in Computer Science, pages 482–491. Springer, 2011.
[2] Shaull Almagor, Dmitry Chistikov, Joël Ouaknine, and James Worrell. O-minimal invariants for linear loops. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, ICALP 2018, volume 107 of LIPIcs, pages 114:1–114:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.114.
[3] Saugata Basu, Richard Pollack, and Marie-Françoise Roy. Algorithms in Real Algebraic Geometry, volume 10 of Algorithms and computation in mathematics. Springer, 2nd edition, 2006.
[4] Rohit Chadha, Dileep Kini, and Mahesh Viswanathan. Decidable problems for unary PFAs. In Gethin Norman and William H. Sanders, editors, Quantitative Evaluation of Systems - 11th International Conference, QEST 2014, volume 8657 of Lecture Notes in Computer Science, pages 329–344. Springer, 2014. doi:10.1007/978-3-319-10696-0_26.
[5] Konstantinos Chatzikokolakis, Daniel Gebler, Catuscia Palamidessi, and Lili Xu. Generalized Bisimulation Metrics. In Paolo Baldan and Daniele Gorla, editors, CONCUR 2014 - Concurrency Theory - 25th International Conference, CONCUR 2014, volume 8704 of Lecture Notes in Computer Science, pages 32–46. Springer, 2014. doi:10.1007/978-3-662-44584-6_4.
[6] Taolue Chen and Stefan Kiefer. On the total variation distance of labelled Markov chains. In Thomas A. Henzinger and Dale Miller, editors, Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS 2014, pages 33:1–33:10. ACM, 2014. URL: http://dl.acm.org/citation.cfm?id=2603088, doi:10.1145/2603088.2603099.
[7] Dmitry Chistikov, Andrzej S. Murawski, and David Purser. Asymmetric distances for approximate differential privacy. In Wan Fokkink and Rob van Glabbeek, editors, CONCUR 2019, volume 140 of LIPIcs, pages 10:1–10:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.CONCUR.2019.10.
[8] Ventsislav Chonev, Joël Ouaknine, and James Worrell. On the Skolem problem for continuous linear dynamical systems. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, ICALP 2016, volume 55 of LIPIcs, pages 100:1–100:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICALP.2016.100.
[9] Marek Chrobak. Finite automata and unary languages. Theor. Comput. Sci., 47(3):149–158, 1986. doi:10.1016/0304-3975(86)90142-8.
[10] Thomas Colcombet. Unambiguity in automata theory. In Jeffrey O. Shallit and Alexander Okhotin, editors, Descriptional Complexity of Formal Systems - 17th International Workshop, DCFS 2015, volume 9118 of Lecture Notes in Computer Science, pages 3–18. Springer, 2015. doi:10.1007/978-3-319-19225-3_1.
[11] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, 1990.
[12] Laure Daviaud, Marcin Jurdzinski, Ranko Lazic, Filip Mazowiecki, Guillermo A. Pérez, and James Worrell. When is containment decidable for probabilistic automata? In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, ICALP 2018, volume 107 of LIPIcs, pages 121:1–121:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.121.
[13] C. Dehnert, S. Junges, J.-P. Katoen, and M. Volk. A Storm is coming: A modern probabilistic model checker. In Proceedings of Computer Aided Verification (CAV), pages 592–600. Springer, 2017.
[14] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In Shai Halevi and Tal Rabin, editors, Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, volume 3876 of Lecture Notes in Computer Science, pages 265–284. Springer, 2006. doi:10.1007/11681878_14.
[15] Shimon Even, Alan L. Selman, and Yacov Yacobi. The complexity of promise problems with applications to public-key cryptography. Information and Control, 61(2):159–173, 1984.
[16] Nathanaël Fijalkow. Undecidability results for probabilistic automata. SIGLOG News, 4(4):10–17, 2017. URL: https://dl.acm.org/citation.cfm?id=3157833.
[17] Nathanaël Fijalkow, Hugo Gimbert, and Youssouf Oualhadj. Deciding the value 1 problem for probabilistic leaktight automata. In Proceedings of the 27th Annual IEEE Symposium on Logic in Computer Science, LICS 2012, pages 295–304. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.40.
[18] Shmuel Friedland and Hans Schneider. The growth of powers of a nonnegative matrix. SIAM J. Matrix Analysis Applications, 1(2):185–200, 1980. doi:10.1137/0601022.
[19] Pawel Gawrychowski, Dalia Krieger, Narad Rampersad, and Jeffrey Shallit. Finding the growth rate of a regular or context-free language in polynomial time. Int. J. Found. Comput. Sci., 21(4):597–618, 2010. doi:10.1142/S0129054110007441.
[20] Hugo Gimbert and Youssouf Oualhadj. Probabilistic automata on finite words: Decidable and undecidable problems. In Samson Abramsky, Cyril Gavoille, Claude Kirchner, Friedhelm Meyer auf der Heide, and Paul G. Spirakis, editors, Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings, Part II, volume 6199 of Lecture Notes in Computer Science, pages 527–538. Springer, 2010. doi:10.1007/978-3-642-14162-1_44.
[21] Seymour Ginsburg. The Mathematical Theory of Context-Free Languages. McGraw-Hill, 1966.
[22] Seymour Ginsburg and Edwin H Spanier. Bounded algol-like languages. Transactions of the American Mathematical Society, 113(2):333–368, 1964.
[23] Thanh Minh Hoang and Thomas Thierauf. The complexity of the characteristic and the minimal polynomial. Theor. Comput. Sci., 295:205–222, 2003. doi:10.1016/S0304-3975(02)00404-8.
[24] Cheng-Chao Huang, Jing-Cao Li, Ming Xu, and Zhi-Bin Li. Positive root isolation for poly-powers by exclusion and differentiation. Journal of Symbolic Computation, 85:148–169, 2018. 41th International Symposium on Symbolic and Algebraic Computation (ISSAC'16). doi:10.1016/j.jsc.2017.07.007.
[25] Stefan Kiefer. On computing the total variation distance of hidden Markov models. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, ICALP 2018, volume 107 of LIPIcs, pages 130:1–130:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.130.
[26] Stefan Kiefer, Andrzej S. Murawski, Joël Ouaknine, Björn Wachter, and James Worrell. On the Complexity of Equivalence and Minimisation for Q-weighted Automata. Logical Methods in Computer Science, 9(1), 2013. doi:10.2168/LMCS-9(1:8)2013.
[27] Daniel Krob. The equality problem for rational series with multiplicities in the tropical semiring is undecidable. International Journal of Algebra and Computation, 4:405–425, 1994.
[28] M. Kwiatkowska, G. Norman, and D. Parker. PRISM 4.0: Verification of probabilistic real-time systems. In Proceedings of Computer Aided Verification (CAV), volume 6806 of LNCS, pages 585–591. Springer, 2011.
[29] Serge Lang. Introduction to transcendental numbers. Addison-Wesley Pub. Co., 1966.
[30] Angus Macintyre and Alex J Wilkie. On the decidability of the real exponential field, 1996.
[31] Rupak Majumdar, Mahmoud Salamati, and Sadegh Soudjani. On decidability of time-bounded reachability in CTMDPs. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, editors, ICALP 2020, volume 168 of Leibniz International Proceedings in Informatics (LIPIcs), pages 133:1–133:19, Dagstuhl, Germany, 2020. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2020.133.
[32] Andrew Martinez. Efficient computation of regular expressions from unary NFAs. In Jürgen Dassow, Maia Hoeberechts, Helmut Jürgensen, and Detlef Wotschke, editors, Fourth International Workshop on Descriptional Complexity of Formal Systems - DCFS 2002, volume Report No. 586, pages 174–187. Department of Computer Science, The University of Western Ontario, Canada, 2002.
[33] David Mestel. Quantifying information flow in interactive systems. In CSF 2019, pages 414–427. IEEE, 2019. doi:10.1109/CSF.2019.00035.
[34] David Mestel. Widths of Regular and Context-Free Languages. In FSTTCS 2019, volume 150 of Leibniz International Proceedings in Informatics (LIPIcs), pages 49:1–49:14, Dagstuhl, Germany, 2019. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: https://drops.dagstuhl.de/opus/volltexte/2019/11611, doi:10.4230/LIPIcs.FSTTCS.2019.49.
[35] Albert R. Meyer and Larry J. Stockmeyer. The equivalence problem for regular expressions with squaring requires exponential space. In Proceedings of the 13th Annual Symposium on Switching and Automata Theory, College Park, Maryland, USA, October 25-27, 1972, pages 125–129. IEEE Computer Society, 1972.
[36] Joël Ouaknine and James Worrell. Positivity problems for low-order linear recurrence sequences. In Chandra Chekuri, editor, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, pages 366–379. SIAM, 2014. doi:10.1137/1.9781611973402.27.
[37] Rohit J Parikh. On context-free languages. Journal of the ACM (JACM), 13(4):570–581, 1966.
[38] Azaria Paz. Introduction to probabilistic automata. Academic Press, 2014.
[39] Bjorn Poonen. Hilbert's tenth problem over rings of number-theoretic interest. Note from the lecture at the Arizona Winter School on "Number Theory and Logic", 2003. URL: https://math.mit.edu/~poonen/papers/aws2003.pdf.
[40] Hans Schneider. The influence of the marked reduced graph of a nonnegative matrix on the Jordan form and on related properties: A survey. Linear Algebra and its Applications, 84:161–189, 1986.
[41] Marcel Paul Schützenberger. On the definition of a family of automata. Information and Control, 4(2-3):245–270, 1961. doi:10.1016/S0019-9958(61)80020-X.
[42] Bruno Sericola. Markov chains: theory and applications. John Wiley & Sons, 2013.
[43] Adam D. Smith. Efficient, Differentially Private Point Estimators. CoRR, abs/0809.4794, 2008. URL: http://arxiv.org/abs/0809.4794.
[44] Larry J. Stockmeyer and Albert R. Meyer. Word problems requiring exponential time: Preliminary report. In Alfred V. Aho, Allan Borodin, Robert L. Constable, Robert W. Floyd, Michael A. Harrison, Richard M. Karp, and H. Raymond Strong, editors, Proceedings of the 5th Annual ACM Symposium on Theory of Computing, 1973, pages 1–9. ACM, 1973. doi:10.1145/800125.804029.
[45] Anthony Widjaja To. Unary finite automata vs. arithmetic progressions. Inf. Process. Lett., 109(17):1010–1014, 2009. doi:10.1016/j.ipl.2009.06.005.
[46] Wen-Guey Tzeng. A polynomial-time algorithm for the equivalence of probabilistic automata. SIAM J. Comput., 21(2):216–227, 1992. doi:10.1137/0221017.
[47] Wen-Guey Tzeng. On path equivalence of nondeterministic finite automata. Inf. Process. Lett., 58(1):43–46, 1996. doi:10.1016/0020-0190(96)00039-7.
[48] Michel Waldschmidt. Diophantine Approximation on Linear Algebraic Groups, volume 326 of Grundlehren der mathematischen Wissenschaften (A Series of Comprehensive Studies in Mathematics). Springer, Berlin, Heidelberg, 2000.

Figure 4 Reductions between big-O and big-Θ: (a) Reduction to big-Θ; (b) Reduction to big-O.
A Additional notation for the appendix
We will typically define weighted automata by listing transitions as q p −→ a q (to mean M ( a )( q, q ) = p ) with the assumption that any unspecified transition has weight 0. B Additional material for Section 1 (cid:73)
Proposition 41.
For r, and rd, on labelled Markov chains, it is sufficient to consider thesupremum over w ∈ Σ ∗ rather than E ⊆ Σ ∗ . Proof of Proposition 41.
We will show we can approximate any event by a finite subset,then we can always simplify an event with more than one word, and not decrease.Suppose a + bc + d > ac and a + bc + d > bd . By the first we have ac + bc > ac + dc = bc > ad = ⇒ bd > ac . By the second we have ad + bd > bc + bd = ad > bc = ⇒ ac > bd . Contradiction.Hence, for the purposes of maximisation, given f, g and a finite set E , such a set canalways be simplified, by repeated application. That is, there exists e such that, P e ∈ E f ( e ) P e ∈ E g ( e ) ≤ f ( e ) g ( e ) . (1)Consider an event E ⊆ Σ ∗ , then for every λ > k such that f s ( E ∩ Σ >k ) ≤ λ .Then f s ( E ∩ Σ ≤ k ) ≤ f s ( E ) ≤ f s ( E ∩ Σ ≤ k ) + λ [25, Lemma 12]. For any (cid:15) , by choice ofsufficiently small λ there is a finite set E such that f s ( E ) f s ( E ) − (cid:15) ≤ f s ( E ) f s ( E ) ≤ f s ( E ) f s ( E ) + (cid:15) .Consider sup E ⊆ Σ ∗ f s ( E ) f s ( E ) , this is equivalent to lim k →∞ sup E ⊆ Σ ∗ ∩ Σ ≤ k f s ( E ) f s ( E ) and by Equa-tion (1) this is equivalent to lim k →∞ sup w ∈ Σ ∗ ∩ Σ ≤ k f s ( w ) f s ( w ) = sup w ∈ Σ ∗ f s ( w ) f s ( w ) . (cid:74) C Additional material for Section 2 (cid:73)
Lemma 42. The big-O problem is interreducible with the big-Θ problem.

Proof. The big-O problem reduces to the big-Θ problem: to ask if $s$ is big-O of $s'$, add states $q, q'$ using the construction of Figure 4a, then ask if $q$ is big-Θ of $q'$.
\[
\frac{f_q(aw)}{f_{q'}(aw)} = \frac{0.5 f_s(w) + 0.5 f_{s'}(w)}{f_{s'}(w)} < C \iff \frac{f_s(w)}{f_{s'}(w)} < 2C - 1
\]
\[
\frac{f_{q'}(aw)}{f_q(aw)} = \frac{f_{s'}(w)}{0.5 f_s(w) + 0.5 f_{s'}(w)} \le 2
\]

The big-Θ problem reduces to the big-O problem: to ask if $s$ is big-Θ of $s'$, add states $q, q'$ using the construction of Figure 4b, then ask if $q$ is big-O of $q'$.
\[
\frac{f_q(aw)}{f_{q'}(aw)} = \frac{0.5 f_s(w)}{0.5 f_{s'}(w)} < C \iff \frac{f_s(w)}{f_{s'}(w)} < C
\]
\[
\frac{f_q(bw)}{f_{q'}(bw)} = \frac{0.5 f_{s'}(w)}{0.5 f_s(w)} < C \iff \frac{f_{s'}(w)}{f_s(w)} < C
\]

Each of the reductions adds a constant number of bits; as such they operate in logarithmic space.
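The first reduction can be checked numerically on a small example. The Python sketch below is only an illustration: it assumes that the Figure 4a gadget sends $q$ to $s$ and to $s'$ with weight 0.5 each and $q'$ to $s'$ with weight 1 (as suggested by the displayed ratios), and the toy automaton, the helper `word_weight` and the table `TRANS` are hypothetical names that do not come from the paper.

```python
from fractions import Fraction


def word_weight(trans, start, final, word):
    """Weight of `word` from `start`: sum over all paths of the product of weights."""
    mass = {start: Fraction(1)}
    for letter in word:
        nxt = {}
        for state, m in mass.items():
            for target, p in trans.get((state, letter), []):
                nxt[target] = nxt.get(target, Fraction(0)) + m * p
        mass = nxt
    return sum((m for state, m in mass.items() if state in final), Fraction(0))


half, quarter = Fraction(1, 2), Fraction(1, 4)

# Hypothetical toy automaton: f_s(b^n c) = (1/2)^(n+1), f_s'(b^n c) = (1/4)^(n+1).
TRANS = {
    ("s", "b"): [("s", half)], ("s", "c"): [("t", half)],
    ("s'", "b"): [("s'", quarter)], ("s'", "c"): [("t", quarter)],
    # Assumed Figure 4a gadget: q splits evenly between s and s', q' goes to s'.
    ("q", "a"): [("s", half), ("s'", half)],
    ("q'", "a"): [("s'", Fraction(1))],
}
FINAL = {"t"}


def ratio(x, y, word):
    """Ratio f_x(word) / f_y(word) in the toy automaton."""
    return word_weight(TRANS, x, FINAL, word) / word_weight(TRANS, y, FINAL, word)


for n in range(4):
    w = "b" * n + "c"
    # f_q(aw)/f_q'(aw) = 0.5 * f_s(w)/f_s'(w) + 0.5, so one ratio is bounded iff the other is.
    assert ratio("q", "q'", "a" + w) == half * ratio("s", "s'", w) + half
    # The reverse direction is bounded by 2 regardless of s and s'.
    assert ratio("q'", "q", "a" + w) <= 2
```

In this toy instance the ratio $f_s(w)/f_{s'}(w)$ doubles with every extra `b`, so $s$ is not big-O of $s'$, and correspondingly $q$ is not big-Θ of $q'$ even though the reverse ratio stays below 2.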
D Additional material for Section 3
In this section we use the notation $P_A(w) = f_{q_s}(w)$, where $q_s$ is the start state of the probabilistic automaton $A$. This is to avoid confusion when there is both the probabilistic automaton being reduced from and the labelled Markov chain being reduced to. Henceforth in this section, the notation $f_s(w)$ refers to the labelled Markov chain.

Undecidability by Emptiness of Probabilistic automata (Theorem 8)
The following lemma plays a key role in proving the result. In its statement, "undecidable to distinguish" means that the corresponding promise problem (see e.g. [15]) is undecidable. In other words, if the input is not in one of the two cases which should be distinguished between, the answer is not specified and can be arbitrary (including non-termination).

Results in this section are presented on ratio total variation distances on labelled Markov chains, and thus apply to the big-O problem in the more general weighted automata.
Lemma 43.
1. Given an LMC along with two states $s, s'$ and a constant $c$, it is undecidable to distinguish between $r(s, s') \le c$ and $r(s, s') = \infty$.
2. Given an LMC along with two states $s, s'$ and two numbers $c$ and $C$ such that $c < C$, it is undecidable to distinguish between $r(s, s') \le c$ and $C \le r(s, s') < \infty$.
Both statements remain true if $r$ is replaced with $rd$.

Proof.
For both cases, we reduce from Empty. We show our construction for $\Sigma = \{a, b\}$, but the procedure can be generalised to arbitrary alphabets.

The construction will create two branches of a labelled Markov chain. The first, from state $q_s$, will simulate the given probabilistic automaton using the original weights multiplied by the same scalar (in this case $\frac14$). The other branch, from state $s_0$, will process each letter from $\Sigma$ with equal weight (also in an infinite loop). Consequently, if there is a word accepted with probability greater than $\frac12$, the ratio between the two branches will be greater than 1. The construction will make it possible to process words repeatedly, so that the ratio can then be pumped unboundedly.

Formally, we are given a probabilistic automaton $A = \langle Q, \Sigma, M, F \rangle$ with start state $q_s$. First observe that w.l.o.g. $q_s$ is not accepting: otherwise the empty word is accepted with probability 1, so there is a word with probability greater than $\frac12$ and a trivial positive instance of the big-O problem can be returned.

We construct the LMC $\langle Q', \Sigma', \delta, F' \rangle$ taking $Q' = Q \uplus \{s, s', s_0, s_1, t\}$, where $\uplus$ denotes disjoint union, $\Sigma' = \{a, b, \mathit{acc}, \mathit{rej}, \vdash\}$, $F' = \{t\}$ and $\delta$ as specified below. First we simulate the probabilistic automaton with a scaling factor of $\frac14$: for all $q, q' \in Q$,
\[
q \xrightarrow{\frac14 M(a)(q,q')}_a q' \qquad q \xrightarrow{\frac14 M(b)(q,q')}_b q'.
\]
Originally accepting runs trigger a restart, while rejecting ones are redirected to $t$:
\[
\text{if } q \in F:\ q \xrightarrow{\frac12}_{\mathit{acc}} q_s \qquad \text{and if } q \notin F:\ q \xrightarrow{\frac12}_{\mathit{rej}} t.
\]
We then add a part of the chain which behaves equally, rather than according to the probabilistic automaton:
\[
s_0 \xrightarrow{\frac14}_a s_0 \qquad s_0 \xrightarrow{\frac14}_b s_0 \qquad s_0 \xrightarrow{\frac14}_{\mathit{acc}} s_0 \qquad s_0 \xrightarrow{\frac14}_{\mathit{rej}} t.
\]
The construction is illustrated in Figure 5.

Figure 5 Reduction, where $q_a$ represents accepting states of the probabilistic automaton, $q_r$ represents rejecting states and $q_s$ represents the start state (assumed to be rejecting).

To complete the reduction, we add the following transitions from $s, s', s_1$:
\[
s \xrightarrow{\frac12}_{\vdash} s_0 \qquad s \xrightarrow{\frac12}_{\vdash} q_s \qquad s' \xrightarrow{1}_{\vdash} s_0 \qquad s_1 \xrightarrow{\frac{99}{100}}_{\vdash} s_0 \qquad s_1 \xrightarrow{\frac{1}{100}}_{\vdash} q_s
\]

We make the following claims:

Claim 44. If $A \notin$ Empty then $r(s, s') = \infty$. If $A \in$ Empty then $r(s, s') \le \frac32$.

Claim 45. If $A \notin$ Empty then $49 < r(s, s_1) \le 51$. If $A \in$ Empty then $r(s, s_1) \le 2$.

The lemma (for $r$) then follows from the undecidability of Empty. Note that the compared states are taken to be certain "linear combinations" of $s_0$ and $q_s$. This ensures that $r(s', s) \le 2$ and $r(s_1, s) \le 2$; consequently the claims for $rd$ will follow.

Proof of Claim 44.
First observe that
\[
\frac{f_s(\vdash w)}{f_{s'}(\vdash w)} = \frac{\frac12 f_{s_0}(w) + \frac12 f_{q_s}(w)}{f_{s_0}(w)} = \frac12 + \frac12 \cdot \frac{f_{q_s}(w)}{f_{s_0}(w)}. \tag{2}
\]

If there is a word $w$ that is accepted by the automaton with probability $> \frac12$, then let $w' = (w\,\mathit{acc})^i\,\mathit{rej}$ and we have
\[
\frac{f_{q_s}(w')}{f_{s_0}(w')} \ge \frac{\left((\frac14)^{|w|} P(w) \cdot \frac12\right)^i}{\left((\frac14)^{|w|} \cdot \frac14\right)^i} = (2P(w))^i. \tag{3}
\]
Since $P(w) > \frac12$, we have $2P(w) > 1$ and
\[
\lim_{i \to \infty} \frac{f_s(\vdash (w\,\mathit{acc})^i\,\mathit{rej})}{f_{s'}(\vdash (w\,\mathit{acc})^i\,\mathit{rej})} = \infty,
\]
so $r(s, s') = rd(s, s') = \infty$.

If there is no such word, i.e. $\forall w \in \Sigma^* : P(w) \le \frac12$, then the probability ratio of all words is bounded. All words start with $\vdash$ and are terminated by $\mathit{rej}$, so in general all words take the form $w = \vdash (w_1\,\mathit{acc}) \dots (w_n\,\mathit{acc})(w_{n+1}\,\mathit{rej})$. Let us consider the probability of $w' = (w_1\,\mathit{acc}) \dots (w_n\,\mathit{acc})(w_{n+1}\,\mathit{rej})$ from $s_0$ and $q_s$. Then:
\begin{align}
\frac{f_{q_s}(w')}{f_{s_0}(w')}
&= \frac{\left(\prod_{i=1}^{n} \frac12 (\frac14)^{|w_i|} P[w_i]\right)\left((\frac14)^{|w_{n+1}|} (1 - P[w_{n+1}]) \frac12\right)}{(\frac14)^{|w_1| + \dots + |w_n|} (\frac14)^{n} (\frac14)^{|w_{n+1}|} \frac14} \tag{5} \\
&\le \frac{\left((\frac14)^{|w_1| + \dots + |w_n|} (\frac12)^{n} (\frac12)^{n}\right)\left((\frac14)^{|w_{n+1}|} \frac12\right)}{(\frac14)^{|w_1| + \dots + |w_n| + n} (\frac14)^{|w_{n+1}|} \frac14} \qquad (\forall i : P[w_i] \le \tfrac12) \notag \\
&= 2. \tag{6}
\end{align}
Then, using Equation (2), for every word $w$ we have $\frac12 \le \frac{f_s(w)}{f_{s'}(w)} \le \frac32$, and so $r(s, s') \le \frac32$ and $rd(s, s') \le 2$.

Proof of Claim 45.
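First observe that the ratio $\frac{f_{s_1}(\vdash w)}{f_s(\vdash w)}$ is always $\le 2$, so the only interesting direction is $\frac{f_s(\vdash w)}{f_{s_1}(\vdash w)}$:
\begin{align*}
\frac{f_{s_1}(\vdash w)}{f_s(\vdash w)}
&= \frac{\frac{99}{100} f_{s_0}(w) + \frac{1}{100} f_{q_s}(w)}{\frac12 f_{s_0}(w) + \frac12 f_{q_s}(w)} \\
&= \frac{\frac{99}{100} f_{s_0}(w)}{\frac12 f_{s_0}(w) + \frac12 f_{q_s}(w)} + \frac{\frac{1}{100} f_{q_s}(w)}{\frac12 f_{s_0}(w) + \frac12 f_{q_s}(w)} \\
&\le \frac{\frac{99}{100} f_{s_0}(w)}{\frac12 f_{s_0}(w)} + \frac{\frac{1}{100} f_{q_s}(w)}{\frac12 f_{q_s}(w)}
= 2 \cdot \frac{99}{100} + 2 \cdot \frac{1}{100} = 2.
\end{align*}

Moreover, for all words $\vdash w$ the ratio in the other direction is bounded, so $r$ and $rd$ are bounded:
\begin{align*}
\frac{f_s(\vdash w)}{f_{s_1}(\vdash w)}
&= \frac{\frac12 f_{s_0}(w) + \frac12 f_{q_s}(w)}{\frac{99}{100} f_{s_0}(w) + \frac{1}{100} f_{q_s}(w)} \\
&= \frac{\frac12 f_{s_0}(w)}{\frac{99}{100} f_{s_0}(w) + \frac{1}{100} f_{q_s}(w)} + \frac{\frac12 f_{q_s}(w)}{\frac{99}{100} f_{s_0}(w) + \frac{1}{100} f_{q_s}(w)} \\
&\le \frac{\frac12 f_{s_0}(w)}{\frac{99}{100} f_{s_0}(w)} + \frac{\frac12 f_{q_s}(w)}{\frac{1}{100} f_{q_s}(w)}
\le \frac{100}{2 \cdot 99} + \frac{100}{2} \le 51.
\end{align*}

If there is a word $w$ that is accepted by the automaton with probability $> \frac12$, then we consider the word $\vdash (w\,\mathit{acc})^i\,\mathit{rej}$; let $w' = (w\,\mathit{acc})^i\,\mathit{rej}$.
\[
\frac{f_s(\vdash (w\,\mathit{acc})^i\,\mathit{rej})}{f_{s_1}(\vdash (w\,\mathit{acc})^i\,\mathit{rej})}
= \frac{\frac12 f_{s_0}(w') + \frac12 f_{q_s}(w')}{\frac{99}{100} f_{s_0}(w') + \frac{1}{100} f_{q_s}(w')}
\ge \frac{\frac12 f_{q_s}(w')}{\frac{99}{100} f_{s_0}(w') + \frac{1}{100} f_{q_s}(w')}
\]
By the previous proof (Equation (3)) we know $\frac{f_{q_s}(w')}{f_{s_0}(w')} \xrightarrow{i \to \infty} \infty$, thus $\frac{f_{s_0}(w')}{f_{q_s}(w')} \xrightarrow{i \to \infty} 0$ and
\[
\frac{\frac{99}{100} f_{s_0}(w') + \frac{1}{100} f_{q_s}(w')}{\frac12 f_{q_s}(w')} = \frac{2}{100} + 2 \cdot \frac{99}{100} \left[\frac{f_{s_0}(w')}{f_{q_s}(w')}\right] \xrightarrow{i \to \infty} \frac{1}{50},
\]
so
\[
\frac{\frac12 f_{q_s}(w')}{\frac{99}{100} f_{s_0}(w') + \frac{1}{100} f_{q_s}(w')} \xrightarrow{i \to \infty} 50.
\]
So for all $\epsilon$ there exists an $i$ such that $\frac{f_s(\vdash (w\,\mathit{acc})^i\,\mathit{rej})}{f_{s_1}(\vdash (w\,\mathit{acc})^i\,\mathit{rej})} \ge 50 - \epsilon$. In particular, $r(s, s_1) > 49$.

If there is no such word, i.e. $\forall w \in \Sigma^* : P(w) \le \frac12$, then we show that the ratio total variation distance will be small. All words start with $\vdash$ and are terminated by $\mathit{rej}$, so in general all words take the form $w = \vdash (w_1\,\mathit{acc}) \dots (w_n\,\mathit{acc})(w_{n+1}\,\mathit{rej})$. Let us consider the probability of such words $w = \vdash w'$ from $s, s_1$:
\begin{align*}
\frac{f_s(w)}{f_{s_1}(w)}
&= \frac{\frac12 f_{s_0}(w') + \frac12 f_{q_s}(w')}{\frac{99}{100} f_{s_0}(w') + \frac{1}{100} f_{q_s}(w')}
\le \frac{\frac12 f_{s_0}(w') + \frac12 f_{q_s}(w')}{\frac{99}{100} f_{s_0}(w')} \\
&\le \frac{100}{99} \cdot \left[\frac12 + \frac12 \cdot \frac{f_{q_s}(w')}{f_{s_0}(w')}\right]
\le \frac{100}{99} \cdot \frac32 \quad \text{(by Equation (6))} \\
&\le 2.
\end{align*}

Hence if $\exists w : P(w) > \frac12$ then $49 < r(s, s_1) \le 51$ and $49 < rd(s, s_1) \le 51$, and if not then $r(s, s_1) \le 2$ and $rd(s, s_1) \le 2$.

Theorem 8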
Lemma 43 implies Theorem 8.

Proof of Theorem 8. We reason by contradiction using Lemma 43. For the big-O problem, it suffices to observe that, if it were decidable, one could use it to solve the first promise problem from the lemma (recall that in a promise problem the input is guaranteed to fall into one of the two cases). This would contradict Lemma 43.

Similarly, the decidability of the (asymmetric) threshold problem would allow us to distinguish between $r(s, s') \le c$ and $C \le r(s, s') < \infty$ (second promise problem from the lemma) by considering the instance $r(s, s') \le \frac{c + C}{2}$ (non-strict variant) or $r(s, s') < \frac{c + C}{2}$ (strict variant). A positive answer (regardless of the variant) implies $r(s, s') < C$, while a negative one yields $r(s, s') > c$, which suffices to distinguish the cases. Note that in both cases $r(s, s')$ is bounded, so the reasoning remains valid if it is known in advance that $r(s, s')$ is bounded.

For additive (asymmetric) approximation, we observe that finding $x$ such that $|r(s, s') - x| \le \frac{C - c}{4}$ and comparing it with $\frac{c + C}{2}$ makes it possible to distinguish between $r(s, s') \le c$ and $C \le r(s, s') < \infty$. This is because $r(s, s') \le c$ then implies $x < \frac{c + C}{2}$ and $C \le r(s, s')$ implies $\frac{c + C}{2} < x$.

In the multiplicative case, finding $x$ such that $1 - \frac{C - c}{4C} \le \frac{x}{r(s, s')} \le 1 + \frac{C - c}{4C}$ and comparing $x$ with $\frac{c + C}{2}$ yields an analogous argument.

Since Lemma 43 also applies to $rd$, all of our results hold when $r$ is replaced by $rd$.

D.1 The relation to the Value-1 Problem
The previous section showed undecidability of the big-O problem via the emptiness problem for probabilistic automata. Another undecidable problem for probabilistic automata is the Value-1 problem [20]. The Value-1 problem asks whether some word is accepted by a probabilistic automaton with probability 1, or at least whether words are accepted with probability arbitrarily close to 1. This section shows that there is a close, but not complete, connection between the Value-1 problem and the big-O problem by reducing in both directions between the two; the results are shown in Lemmas 47 and 48.

Definition 46. The Value-1 problem, given a probabilistic automaton $A$, asks if for all $\delta > 0$ there exists a word $w$ such that $P_A(w) > 1 - \delta$.

Lemma 47. The Value-1 problem reduces to the big-O problem.

Lemma 48. The big-O problem reduces to the Value-1 problem.
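Proof of Lemma 47 (Value-1 reduces to big-O). Given a probabilistic automaton $A = \langle Q, \Sigma, M, F \rangle$ and a dedicated starting state $q_0 \in Q$, which accepts words with probability $P_A(w)$, first construct $\overline{A}$ in which words are accepted with probability $P_{\overline{A}}(w) = 1 - P_A(w)$, by inverting accepting states (so that its set of accepting states is $\overline{F} = Q \setminus F$).

The proof uses a two-letter alphabet, $\Sigma = \{a, b\}$, but the procedure can be generalised to arbitrary alphabets. Construct a Markov chain $\mathcal{M}_A = \langle Q', \Sigma', M', F' \rangle$, where $Q' = Q \cup \{s, s', s_0, \mathit{rej}, \mathit{acc}\}$, $\Sigma' = \{a, b, c\}$ and $F' = \{\mathit{acc}\}$. The probabilistic automaton $\overline{A}$ will be simulated by $\mathcal{M}_A$. The relation $M'$ is described by the notation $\xrightarrow{p}_a$.

For all $q \in Q$:
\[
\forall q' \in Q:\quad q \xrightarrow{\frac13 M(a)(q,q')}_a q' \qquad q \xrightarrow{\frac13 M(b)(q,q')}_b q'
\]
\[
\text{if } q \in \overline{F}:\ q \xrightarrow{\frac13}_c \mathit{acc} \qquad \text{and if } q \notin \overline{F}:\ q \xrightarrow{\frac13}_c \mathit{rej}
\]
\[
s \xrightarrow{1}_c s_0 \qquad s' \xrightarrow{1}_c q_0 \qquad s_0 \xrightarrow{\frac13}_a s_0 \qquad s_0 \xrightarrow{\frac13}_b s_0 \qquad s_0 \xrightarrow{\frac13}_c \mathit{acc}
\]

Note that the only words with positive probability are words of the form $c \Sigma^* c \subseteq \Sigma'^*$. Then given a word $w \in \Sigma^*$, $f_s(cwc) = \left(\frac{1}{|\Sigma| + 1}\right)^{|wc|}$ and $f_{s'}(cwc) = \left(\frac{1}{|\Sigma| + 1}\right)^{|wc|} (1 - P_A(w))$.

Then if there is a sequence of words for which $P_A(w)$ tends to 1, the ratio $\frac{f_s(cwc)}{f_{s'}(cwc)}$ is unbounded. However, if there exists some $\gamma > 0$ such that for all $w \in \Sigma^*$ we have $P_A(w) \le 1 - \gamma$, then $1 - P_A(w) \ge \gamma$, and so $\frac{f_s(cwc)}{f_{s'}(cwc)} \le \frac{1}{\gamma}$.

Proof of Lemma 48 (big-O reduces to Value-1).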
Given $\mathcal{M} = \langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$, construct a probabilistic automaton $A = \langle Q', \Sigma', M', F' \rangle$. Each state of $Q$ will be duplicated, once for $s$ and once for $s'$: $Q_s = \{q_s \mid q \in Q\}$, $Q_{s'} = \{q_{s'} \mid q \in Q\}$. Let $Q' = Q_s \cup Q_{s'} \cup \{q_0, \mathit{acc}, \mathit{rej}, \mathit{sink}\}$, $\Sigma' = \Sigma \cup \{\$\}$ and $F' = \{\mathit{acc}\}$. The reduction can be seen in Figure 6.

Figure 6 Reduction to Value-1. Only the effect of transitions on the \$ symbol is shown in black, with the possibility to transition to the sink state depicted in grey (on symbols in $\Sigma$). All remaining transitions are omitted.
Each transition of $\mathcal{M}$ will be simulated in each of the copies according to the probability in $\mathcal{M}$. For every $q, q' \in Q$, $a \in \Sigma$, let $M'(a)(q_s, q'_s) = M(a)(q, q')$ and $M'(a)(q_{s'}, q'_{s'}) = M(a)(q, q')$.

A probabilistic automaton should be stochastic for every $a \in \Sigma$, so there is unused probability for each character, which will divert to a sink. For every $q \in Q$ and $a \in \Sigma$, let
\[
M'(a)(q_s, \mathit{sink}) = 1 - \sum_{q' \in Q} M(a)(q, q') \quad \text{and} \quad M'(a)(q_{s'}, \mathit{sink}) = 1 - \sum_{q' \in Q} M(a)(q, q').
\]

There will be an additional character \$. From $q_0$ the machine will pick either of the two copies with equal probability: $M'(\$)(q_0, s_s) = M'(\$)(q_0, s'_{s'}) = \frac12$. If in the accepting or rejecting state, the system will stay there forever: $M'(\$)(\mathit{acc}, \mathit{acc}) = 1$ and $M'(\$)(\mathit{rej}, \mathit{rej}) = 1$.

The behaviour on \$ will differ in the two copies of $\mathcal{M}$. In the $s$-copy the system moves to the accepting state when the current state is accepting and otherwise restarts. In the $s'$-copy the system moves to the rejecting state when the current state is accepting and otherwise restarts. Formally,
\[
M'(\$)(q_s, \mathit{acc}) = 1 \text{ when } q \in F \quad \text{and} \quad M'(\$)(q_s, q_0) = 1 \text{ when } q \notin F,
\]
and
\[
M'(\$)(q_{s'}, \mathit{rej}) = 1 \text{ when } q \in F \quad \text{and} \quad M'(\$)(q_{s'}, q_0) = 1 \text{ when } q \notin F.
\]
When in the sink state, the system restarts on \$, $M'(\$)(\mathit{sink}, q_0) = 1$, and stays there on every $a \in \Sigma$, $M'(a)(\mathit{sink}, \mathit{sink}) = 1$.

The idea is that if $f_s(w) \gg f_{s'}(w)$ then, by repeated reading of the word $w$, all of the probability mass will eventually move to 'acc'; otherwise a sufficiently large amount of mass will be lost to 'rej'.

Denote by $P_A(w)$ the probability of a word $w$ in the probabilistic automaton from state $q_0$. However, $f$ will be used to refer to the probability in the labelled Markov chain $\mathcal{M}$. Further, the notation $P[q \xrightarrow{w} q']$ is used to denote $(M'(w_1) \times \dots \times M'(w_{|w|}))_{q,q'}$, i.e. the probability of transitioning from state $q$ to $q'$ after reading $w$ in $A$.

Consider each direction:

Case 1 (Not big-O implies Value-1). The proof shows that for all $\delta$ there exist $C, i \in \mathbb{N}$ and $w \in \Sigma^*$ such that $f_s(w) > C f_{s'}(w)$ and $P_A((\$ w \$)^i) > 1 - \delta$.

Hence, given $\delta$, choose $C$ such that $(1 - \frac{\delta}{2}) \frac{C}{C+1} > 1 - \delta$. Then, since $s$ is not big-O of $s'$, choose a word $w$ such that $f_s(w) = C' f_{s'}(w)$ with $C' > C$. Then $(1 - \frac{\delta}{2}) \frac{C'}{C'+1} > (1 - \frac{\delta}{2}) \frac{C}{C+1} > 1 - \delta$.

The fixed sequence $(\$ w \$)^i$ induces a (unary) Markov chain, represented by the matrix $A$, with states $q_0$, $\mathit{acc}$ and $\mathit{rej}$ in the three positions respectively:
\[
A = \begin{pmatrix}
0.5(1 - f_s(w)) + 0.5(1 - f_{s'}(w)) & 0.5 f_s(w) & 0.5 f_{s'}(w) \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}
\]
Then in the long run, starting from state $q_0$, observe:
\[
\begin{pmatrix} 1 & 0 & 0 \end{pmatrix} A^i \xrightarrow{i \to \infty} \begin{pmatrix} 0 & C' x & x \end{pmatrix} \quad \text{with } C' x + x = 1.
\]
Clearly, $A^i(0,1) + A^i(0,2) + A^i(0,0) = 1$; choose $i$ such that $A^i(0,0) \le \frac{\delta}{2}$. Then $A^i(0,1) + A^i(0,2) \ge 1 - \frac{\delta}{2}$ and, using the fact that $A^i(0,1) = C' A^i(0,2)$, we obtain
\[
A^i(0,1) + \frac{A^i(0,1)}{C'} \ge 1 - \frac{\delta}{2}.
\]
Hence $A^i(0,1) \ge (1 - \frac{\delta}{2}) \frac{C'}{C'+1} > 1 - \delta$, as required.

Case 2 (big-O implies Not Value-1). There exists $C$ such that $\forall w : f_s(w) \le C f_{s'}(w)$, and we should show that there exists $\delta > 0$ such that for all $w \in (\Sigma \cup \{\$\})^*$ we have $P_A(w) \le 1 - \delta$.

To move probability from $q_0$ to $\mathit{acc}$ it is necessary to use words of the form $\$ \Sigma^* \$$, where $\Sigma$ is the alphabet of $\mathcal{M}$. Hence any word can be decomposed into $\$ w_1 \$ \$ w_2 \$ \dots \$ w_m \$$. After reading $\$ w_1 \$$ the probability is such that
\[
x_1 = P[q_0 \xrightarrow{\$ w_1 \$} \mathit{acc}] = \tfrac12 f_s(w_1), \qquad
y_1 = P[q_0 \xrightarrow{\$ w_1 \$} \mathit{rej}] = \tfrac12 f_{s'}(w_1), \qquad
P[q_0 \xrightarrow{\$ w_1 \$} q_0] = 1 - x_1 - y_1.
\]
Since $\exists C\, \forall w_i : f_s(w_i) \le C f_{s'}(w_i)$, we have $x_1 \le C y_1$. Repeating this process, write $z_i = P[q_0 \xrightarrow{\$ w_1 \$ \dots \$ w_i \$} q_0]$ with $z_0 = 1$, and
\begin{align*}
x_i &= P[q_0 \xrightarrow{\$ w_1 \$ \dots \$ w_i \$} \mathit{acc}] = x_{i-1} + \tfrac12 z_{i-1} f_s(w_i), \\
y_i &= P[q_0 \xrightarrow{\$ w_1 \$ \dots \$ w_i \$} \mathit{rej}] = y_{i-1} + \tfrac12 z_{i-1} f_{s'}(w_i), \\
z_i &= \prod_{j=1}^{i} \left(1 - \tfrac12 f_s(w_j) - \tfrac12 f_{s'}(w_j)\right).
\end{align*}
By induction we have $x_i \le C y_i$ for all $i$:
\[
x_i = x_{i-1} + \tfrac12 z_{i-1} f_s(w_i) \le C y_{i-1} + \tfrac12 z_{i-1} C f_{s'}(w_i) = C \left[y_{i-1} + \tfrac12 z_{i-1} f_{s'}(w_i)\right] = C y_i.
\]
In the extreme $x_m + y_m = 1$, then $x_m \le \frac{C}{C+1} < 1$, so the probability of reaching $\mathit{acc}$ is bounded away from 1 for every word.

The Value-1 problem is undecidable in general; however, it is decidable in the unary case in coNP [4] and for leaktight automata [17]. Note, however, that the construction combined with these decidability results does not entail any decidability results for the big-O problem. Firstly, the construction adds an additional character, so a unary instance of the big-O problem always has at least two characters when translated to the Value-1 problem. Further, the construction does not result in a leaktight automaton; to see this, the definition of leaktight automata is recalled from [17]. The following does not, of course, preclude the existence of a construction which does maintain these properties.
Definition 49. A finite word $u$ is idempotent if reading the word $u$ once or twice does not change qualitatively the transition probabilities. That is, $P_A[q \xrightarrow{u} q'] > 0 \iff P_A[q \xrightarrow{uu} q'] > 0$.

Let $u_n$ be a sequence of idempotent words. Assume that the sequence of matrices $P_A(u_n)$ converges to a limit $M$, that this limit is idempotent, and denote by $\mathcal{M}$ the associated Markov chain. The sequence $u_n$ is a leak if there exist $r, q \in Q$ such that the following three conditions hold:
1. $r$ and $q$ are recurrent in $\mathcal{M}$,
2. $\lim_n P_A[r \xrightarrow{u_n} q] = 0$,
3. for all $n$, $P_A[r \xrightarrow{u_n} q] > 0$.
An automaton is leaktight if there is no leak.

If there were no leak in the probabilistic automaton then decidability would follow. However, this is not the case, and the reduction does not solve any cases by reduction to a known decidable fragment of the Value-1 problem.
Claim 50. The resulting automaton from the reduction of the big-O problem to the Value-1 problem has a leak.

Proof. Consider some infinite sequence of words $w_i$ growing in length, such that $f_s(w_i) > 0$ for all $i$. Let $u_i = \$ w_i \$$.

Observe that this word is idempotent. For each starting state, consider the possible states with non-zero probability and from each of these the set of reachable states. Observe that in all cases the set reachable after one application is equal to the set reachable after two.

$\mathit{acc} \xrightarrow{\$ w_i \$} \mathit{acc} \xrightarrow{\$ w_i \$} \mathit{acc}$
$\mathit{rej} \xrightarrow{\$ w_i \$} \mathit{rej} \xrightarrow{\$ w_i \$} \mathit{rej}$
$q_0 \xrightarrow{\$ w_i \$} q_0, \mathit{acc}, \mathit{rej} \xrightarrow{\$ w_i \$} q_0, \mathit{acc}, \mathit{rej}$
For $q$ accepting in $Q_s$: $q \xrightarrow{\$ w_i \$} \mathit{acc} \xrightarrow{\$ w_i \$} \mathit{acc}$
For $q$ rejecting in $Q_s$: $q \xrightarrow{\$ w_i \$} \emptyset \xrightarrow{\$ w_i \$} \emptyset$
For $q$ accepting in $Q_{s'}$: $q \xrightarrow{\$ w_i \$} \mathit{rej} \xrightarrow{\$ w_i \$} \mathit{rej}$
For $q$ rejecting in $Q_{s'}$: $q \xrightarrow{\$ w_i \$} \emptyset \xrightarrow{\$ w_i \$} \emptyset$

Assume that the labelled Markov chain $\mathcal{M}$ has a sink, that is, the decision to terminate the word must be made by probability. Then for all $\lambda > 0$ there exists $n$ such that $f_s(\Sigma^{>n}) < \lambda$ and $f_{s'}(\Sigma^{>n}) < \lambda$ [25, Lemma 12].

Suppose $P_A(u_n)$ converges to a limit $M$ and let $r = q_0$ and $q = \mathit{acc}$. For longer and longer words the probability of reaching $\mathit{acc}$ is diminishing. Thus $\lim P_A[r \xrightarrow{u_n} q] = 0$, and in $\mathcal{M}$ we have $r$ and $q$ in different SCCs. The state $\mathit{acc}$ is clearly recurrent, as it is deterministically looping on every character. Since the probability of reaching $\mathit{acc}$ is diminishing for longer and longer words, whenever \$ is read the state returns to $r$; hence all words return to $r$ with probability 1 in the limit. By the choice of words in the sequence, $f_s(w_n) > 0$ for every word, so we have $P_A[r \xrightarrow{u_n} q] > 0$ for all $n$.

Hence a leak has been defined, even in the case where $\mathcal{M}$ is unary.

E Additional material for Section 4
Here we discuss the relationship between the big-O problem and the eventually big-O problem. Let $W = \langle Q, \Sigma, M, \{t\} \rangle$ be a weighted automaton, $s, s' \in Q$, and $s \ne s'$. Below, whenever we write $f_s$ (resp. $f_{s'}$), this will refer to word weights from $s$ (resp. $s'$) in $W$.

Choose $\delta$ to be a real number such that $0 < \delta < 1$ and $\delta$ is smaller than any positive weight in $W$. Construct $W'$ by adding the following transitions for all $x \in \Sigma$:
\[
s' \xrightarrow{\delta}_x t \qquad s' \xrightarrow{\delta}_x \bullet \qquad \bullet \xrightarrow{\delta}_x \bullet \qquad \bullet \xrightarrow{\delta}_x t,
\]
where $\bullet$ is a new state. Consequently, for any $w \in \Sigma^+$, we get:
the weight of $w$ in $W'$ from $s'$ is $f_{s'}(w) + \delta^{|w|}$,
if $f_{s'}(w) > 0$ then $f_{s'}(w) > \delta^{|w|}$.

Lemma 51. $s$ is eventually big-O of $s'$ in $W$ if and only if $L_s(W) \setminus L_{s'}(W)$ is finite and $s$ is big-O of $s'$ in $W'$.

Proof. ($\Rightarrow$) Suppose $s$ is eventually big-O of $s'$ in $W$, i.e. there exist $C, k$ such that, for all $w \in \Sigma^{\ge k}$, $f_s(w) \le C f_{s'}(w)$. Note that, for $w \in \Sigma^{\ge k}$, this implies that, whenever $f_s(w) > 0$, we must also have $f_{s'}(w) > 0$. Consequently, $L_s(W) \setminus L_{s'}(W) \subseteq \Sigma^{<k}$, so it is finite. Moreover, for $w \in \Sigma^{\ge k}$ we have $f_s(w) \le C f_{s'}(w) \le C(f_{s'}(w) + \delta^{|w|})$, and the finitely many remaining words in $\Sigma^+$ have positive weight from $s'$ in $W'$, so they can be accommodated by enlarging the constant. Hence $s$ is big-O of $s'$ in $W'$.

($\Leftarrow$) Suppose $L_s(W) \setminus L_{s'}(W)$ is finite and $s$ is big-O of $s'$ in $W'$. Let $k$ be such that $L_s(W) \setminus L_{s'}(W) \subseteq \Sigma^{<k}$, so that for $w \in \Sigma^{\ge k}$, $f_s(w) > 0$ implies $f_{s'}(w) > 0$. Because $s$ is big-O of $s'$ in $W'$, there exists $C$ such that $f_s(w) \le C(f_{s'}(w) + \delta^{|w|})$ for any $w \in \Sigma^+$.

Let $w \in \Sigma^{\ge k}$. From $s$ being big-O of $s'$ in $W'$, we get $f_s(w) \le C(f_{s'}(w) + \delta^{|w|})$.
If $f_s(w) > 0$ then $f_{s'}(w) > 0$. By construction of $W'$, we get $f_{s'}(w) > \delta^{|w|}$, so $f_s(w) \le C(f_{s'}(w) + \delta^{|w|}) < C(f_{s'}(w) + f_{s'}(w)) = 2C f_{s'}(w)$.
If $f_s(w) = 0$ then we also have $f_s(w) = 0 \le 2C f_{s'}(w)$.
Consequently, for any $w \in \Sigma^{\ge k}$, $f_s(w) \le 2C f_{s'}(w)$, i.e. $s$ is eventually big-O of $s'$ in $W$.

The above argument relied on completing the automaton so that any word is accepted with some weight. To transfer our decidability results for bounded languages, it will be necessary to complete the automaton with respect to a bound, i.e. the extra weights are added only for words from $a_1^+ \cdots a_m^+$, $a_1^* \cdots a_m^*$, $w_1^* \cdots w_k^*$ respectively. This can be done easily by introducing the extra transitions according to a DFA for the bounding language.

E.1 Unambiguous Automata
Proof of Lemma 13. Let $W = \langle Q, \Sigma, M, F \rangle$ be a weighted automaton. Suppose $s, s' \in Q$, $t$ is a unique final state, and $W$ is unambiguous from $s, s'$.

If $W$ fails the LC condition (recall that it can be checked in polynomial time), we return no. Otherwise, let us construct a weighted automaton $W'$ through a restricted product construction involving two copies of $W$: for all $q_1, q_1', q_2, q_2' \in Q$, we add edges $(q_1, q_2) \xrightarrow{p}_a (q_1', q_2')$ provided $M(a)(q_1, q_1') > 0$ and $M(a)(q_2, q_2') > 0$, where $p = \frac{M(a)(q_1, q_1')}{M(a)(q_2, q_2')}$. Note that there exists a positively-weighted $w$-labelled path from $(s, s')$ to $(t, t)$ in $W'$ iff $w \in L_s(W) \cap L_{s'}(W)$. By the LC condition, this is equivalent to $w \in L_s(W)$, and, to examine the big-O problem, it suffices to consider only such words.

By unambiguity of $W$ from $s$ and $s'$, for any $w \in L_s(W)$, there can be exactly one positively-weighted $w$-labelled path from $(s, s')$ to $(t, t)$ in $W'$. Consequently, the product of weights along this path is equal to $f_s(w) / f_{s'}(w)$. Hence, $s$ is not big-O of $s'$ (for $W$) if and only if there exists a positively-weighted path from $(s, s')$ to $(t, t)$ in $W'$ that contains a cycle such that the product of the weights in that cycle is greater than 1.

Thus, to decide the big-O problem for $s, s'$, it suffices to be able to detect such cycles. This can be done, for instance, by a modified version of the Bellman-Ford algorithm [11] applied to the weighted directed graph consisting of positively-weighted edges of $W'$. The algorithm is normally used to find negative cycles, in the sense that the sum of weights is negative. To adapt it to our setting, we can apply the logarithm function to the weights. However, to preserve rationality of weights and polynomial-time complexity, we cannot afford to do that explicitly. Instead, whenever $\log(x) < \log(y)$ would be tested, we test $x < y$ and, whenever $\log(x) + \log(y)$ would be performed, we compute $xy$ instead.

F Additional material for Section 5
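The cycle-detection step from the proof of Lemma 13 (Section E.1) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the product automaton is already given as a list of edges `(u, v, weight)` with positive rational weights, and the function names `reachable` and `has_amplifying_cycle` are hypothetical. It follows the multiplicative variant of Bellman-Ford described in the proof, comparing and multiplying exact rationals instead of taking logarithms.

```python
from fractions import Fraction


def reachable(adj, start):
    """Set of vertices reachable from `start` in the adjacency map `adj`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def has_amplifying_cycle(edges, source, target):
    """True iff some source-to-target path contains a cycle with weight product > 1."""
    fwd, bwd = {}, {}
    for u, v, _ in edges:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)
    # Keep only vertices lying on some path from source to target.
    useful = reachable(fwd, source) & reachable(bwd, target)
    if source not in useful:
        return False
    pruned = [(u, v, Fraction(w)) for u, v, w in edges if u in useful and v in useful]
    # best[v] = largest product found so far over walks from source to v.
    best = {source: Fraction(1)}
    for _ in range(len(useful) - 1):
        changed = False
        for u, v, w in pruned:
            if u in best and (v not in best or best[u] * w > best[v]):
                best[v] = best[u] * w
                changed = True
        if not changed:
            return False
    # One more improving edge means the products can be pumped along a cycle.
    return any(u in best and (v not in best or best[u] * w > best[v]) for u, v, w in pruned)
```

For instance, `has_amplifying_cycle([("A", "B", 2), ("B", "A", 1), ("B", "T", 1)], "A", "T")` reports a cycle of product 2 on a path to `"T"`, which in the setting of the proof would mean that $s$ is not big-O of $s'$. Exact `Fraction` arithmetic keeps all weights rational, in the spirit of the proof's remark about avoiding explicit logarithms.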
Lemma 52. Given $A_\varphi$, a representation of the value $\rho_\varphi$ can be found in polynomial time. This representation will admit polynomial time testing of $\rho_\varphi > \rho_{\varphi'}$ and $\rho_\varphi = \rho_{\varphi'}$ and can be embedded into the first order theory of the reals.

Proof. An algebraic number $z$ can be represented as a tuple $(p_z, a, b, r) \in \mathbb{Q}[x] \times \mathbb{Q}^3$. Here $p_z$ is a polynomial over $x$ and $a, b, r$ form an approximation such that $z$ is the only root of $p_z(x)$ with $|z - (a + bi)| \le r$.

Then operations such as addition and multiplication of two algebraic numbers, finding $|x|$ and testing if $x > 0$ can be performed in polynomial time on representations of the form $(p, a, b, r)$, yielding the same representation. Additionally, given a polynomial, one can find the representation of each of its roots in polynomial time (see e.g. [36]).

Any coefficient of the characteristic polynomial of an integer matrix can be found in GapL [23]. GapL is the difference of two #L calls, each of which can be found in $\mathrm{NC}^2 \subseteq \mathrm{P}$. Here the matrix will be rational, but it can be normalised to an integer matrix by a scalar, the least common multiple of the denominators of the rationals. This number could be exponential, but representable in polynomial space. The final eigenvalues can be renormalised by this constant.

The characteristic polynomial of an $n \times n$ matrix has degree at most $n$; since each coefficient can be found in polynomial time, the whole characteristic polynomial can be found in this time. Thus by enumerating its roots (at most $n$), taking the modulus of each, and sorting them ($a > b \iff a + (-1) \times b > 0$) we can find the spectral radius in this form $(p_z, a, b, r)$.

Note that the spectral radius is a real number, so that given the spectral radius in the form $(p_z, a, b, r)$ we actually have $b = 0$. Then the number can be encoded exactly in the first order theory of the reals using $\exists z : p_z(z) = 0 \wedge z - a \le r \wedge a - z \le r$.

Proof of Lemma 21. (lower bound)
Let $n \in \mathbb{N}$ and suppose $d_{s,t}(n) = (\rho', k')$. Consider the witnessing path in $W$, i.e. the length-$n$ path from $s$ to $t$ that visits $k' + 1$ SCCs of spectral radius $\rho'$ and no SCC with a larger spectral radius. Let $\pi = \varphi_1 \dots \varphi_k \in P(s, t)$ be the corresponding sequence of SCCs visited by that path and let $s_i, e_i$ ($1 \le i \le k$) be the entry and exit points (respectively into and out of $\varphi_i$) on that path, i.e. $s = s_1$, $SCC(s_i) = SCC(e_i) = \varphi_i$ ($1 \le i \le k$), there is a transition (of positive weight) from $e_i$ to $s_{i+1}$, and $e_k = t$. We write $\vec{s_i, e_i}$ to represent the particular sequence of entry/exit points.

Let us define a new unary weighted automaton $W_{\vec{s_i, e_i}}$ to be a restriction of $W$ so that the only entry points to its SCCs are the $s_i$'s and the only exit points are the $e_i$'s, i.e. the weight is reduced to zero for any violating transition. Let $D$ be the transition matrix of $W_{\vec{s_i, e_i}}$.

Clearly $A^n_{s,t} \ge D^n_{s,t}$, since $W_{\vec{s_i, e_i}}$ is a restriction of $W$. Note that, in $W_{\vec{s_i, e_i}}$, $\rho(s, t) = \rho'$ and $k(s, t) = k'$, because all paths from $s$ to $t$ must visit $k' + 1$ SCCs with spectral radius $\rho'$. Hence, by Theorem 18, $D^n_{s,t} + D^{n+1}_{s,t} + \dots + D^{n+T-1}_{s,t} \ge c_{\vec{s_i, e_i}} (\rho')^n n^{k'}$, for some $c_{\vec{s_i, e_i}} > 0$, where $T$ is the local period from $s$ to $t$ in $W_{\vec{s_i, e_i}}$. Next we shall show that $D^{n+1}_{s,t} + \dots + D^{n+T-1}_{s,t} = 0$, which will imply $D^n_{s,t} \ge c_{\vec{s_i, e_i}} (\rho')^n n^{k'}$ and, hence, $A^n_{s,t} \ge c_{\vec{s_i, e_i}} (\rho')^n n^{k'}$.

Let $L$ be the length of the shortest path from $s$ to $t$ in $W_{\vec{s_i, e_i}}$. Observe that paths from $s$ to $t$ in $W_{\vec{s_i, e_i}}$ can only have lengths from $\{L + n_1 \cdot T_{SCC(s_1)} + \dots + n_k \cdot T_{SCC(s_k)} \mid n_1, \dots, n_k \in \mathbb{N}\}$ and, thus, $\{L + n \cdot \gcd\{T_{SCC(s_1)}, \dots, T_{SCC(s_k)}\} \mid n \in \mathbb{N}\}$. As $P(s, t) = \{\pi\}$ in $W_{\vec{s_i, e_i}}$, $T = \gcd\{T_{SCC(s_1)}, \dots, T_{SCC(s_k)}\}$. Consequently, all paths from $s$ to $t$ in $W_{\vec{s_i, e_i}}$ have lengths of the form $L + nT$. Hence, since $D^n_{s,t}$ is positive, there are no paths which can contribute positive value to $D^{n+1}_{s,t} + \dots + D^{n+T-1}_{s,t}$.

As $c_{\vec{s_i, e_i}}$ depends only on $\vec{s_i, e_i}$, to finish the proof it suffices to take $c$ to be the smallest among the finitely many $c_{\vec{s_i, e_i}}$.

(upper bound) Let $N_{(\rho', k')} = \{n \mid d_{s,t}(n) = (\rho', k')\}$. This gives a finite partition of $\mathbb{N}$ as $\bigcup_{(\rho, k)} N_{(\rho, k)}$. For each $(\rho', k')$, we shall find a value $C_{(\rho', k')}$ so that, for $n \in N_{(\rho', k')}$, we have $A^n_{s,t} \le C_{(\rho', k')} (\rho')^n n^{k'}$. Then, to have $A^n_{s,t} \le C \rho(n)^n n^{k(n)}$ for all $n \in \mathbb{N}$, it will suffice to take $C$ to be the maximum over all $C_{(\rho', k')}$.

Let us fix $(\rho', k')$. Consider $W^\bullet$ to be $W^\dagger$ in which, for every $(\rho, k) \le (\rho', k')$, we merge the states $(t, \rho, k)$ into a single final state $t'$ (recall there are no outgoing edges from $t$). Let us rename the state $(s, 0, 0)$ to $s'$. Let $E$ be the corresponding transition matrix of $W^\bullet$. Note that all paths from $s'$ to $t'$ in $W^\bullet$ go through at most $k' + 1$ SCCs with spectral radius $\rho'$.

Claim 53. For all $n \in N_{(\rho', k')}$, we have $A^n_{s,t} = E^n_{s',t'}$.

Consider any path $s \to q_1 \to \dots \to q_m \to t$ in $W$. There is a corresponding path in $W^\bullet$; however, the states $q_i$ are annotated as $(q_i, \rho, k)$, where $\rho$ is the largest spectral radius seen so far, and $k + 1$ is the number of SCCs of that radius seen so far. The only paths removed are those terminating at $(t, \rho, k)$ with $(\rho, k) > (\rho', k')$. Since $d_{s,t}(n) = (\rho', k')$, we know that no path visits more than $k' + 1$ SCCs of spectral radius $\rho'$, or an SCC of spectral radius greater than $\rho'$. Consequently, no such path is disallowed in $W^\bullet$. No paths were added either. Because every SCC in $W$ remains a strongly connected component in $W^\bullet$ (duplicated with various $(\rho, k)$) and its transition probability matrix (and hence the spectral radius) remains the same, we can conclude that $A^n_{s,t} = E^n_{s',t'}$.

Claim 54. There exists $C_{(\rho', k')}$ such that $A^n_{s,t} \le C_{(\rho', k')} (\rho')^n n^{k'}$.

We have $A^n_{s,t} = E^n_{s',t'} \le E^n_{s',t'} + E^{n+1}_{s',t'} + \dots + E^{n+T(s',t')-1}_{s',t'}$, where $T(s', t')$ is the local period between states $s'$ and $t'$ in $W^\bullet$. By Theorem 18, there exists $C_{(\rho', k')}$ such that this quantity is bounded by $C_{(\rho', k')} (\rho')^n n^{k'}$. Thus, for $n \in N_{(\rho', k')}$, we have $A^n_{s,t} \le C_{(\rho', k')} (\rho')^n n^{k'}$.

Proof of Lemma 23.
First we note some consequences of d s,t ( n ) ≤ d s ,t ( n ). Suppose d s,t ( n ) = ( ρ, k ) and d s ,t ( n ) = ( ρ , k ). Thanks to Lemma 21, we have f s ( a n ) ≤ ( Cc ( ρρ ) n n k − k ) · f s ( a n ). If d s,t ( n ) ≤ d s ,t ( n ) we can distinguish two cases: either ( ρ, k ) = ( ρ , k ) or( ρ, k ) < ( ρ , k ).In the former case, ( ρρ ) n n k − k = 1 and, thus, f s ( a n ) ≤ ( Cc ) · f s ( a n ).In the latter case, we have lim m →∞ ( ρρ ) m m k − k = 0 and, thus, ( ρρ ) m m k − k < m . Consequently, for all but finitely many n , we can conclude f s ( a n ) ≤ ( Cc ) · f s ( a n ).Thanks to the above analysis, if d s,t ( n ) ≤ d s ,t ( n ) holds for all but finitely many n ,it follows that f s ( a n ) ≤ ( Cc ) · f s ( a n ) for all but finitely many n . Moreover, the languagecontainment condition implies that f s ( a n ) ≤ C · f s ( a n ) for some C in the remaining (finitelymany) cases. Hence, s is big-O of s , which shows the right-to-left implication.For the converse, recall that we have already established that “ s is big-O of s ” implies thelanguage containment condition. For the remaining part, we reason by contraposition andsuppose that there are infinitely many n with d s,t ( n ) > d s ,t ( n ). As there are finitely manyvalues in the range of d s,t and d s ,t , there exist ( ρ, k ) and ( ρ , k ) such that ( ρ, k ) > ( ρ , k )and, for infinitely many n , d s,t = ( ρ, k ) and d s ,t = ( ρ , k ). For such n , Lemma 21 yields f s ( a n ) ≥ ( cC ( ρρ ) n n k − k ) · f s ( a n ). But ( ρ, k ) > ( ρ , k ) implieslim m →∞ (cid:18) ρρ (cid:19) m m k − k = ∞ , i.e. ( ρρ ) n n k − k is unbounded. Thus, s cannot be big-O of s . (cid:74) . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 33 Proof of Lemma 25.
Let $M$ be a DFA accepting $L(N_1) \cap \overline{L(N_2)}$ obtained through standard automata constructions, i.e. $|M| \le 2^{|N_1| + |N_2|}$. Note that $L(N_1) \overset{\sim}{\subset} L(N_2)$ if and only if $L(M)$ is finite. Observe that $L(M)$ is infinite if and only if there exists $w \in L(M)$ with $|M| \le |w| \le 2|M|$.
Consequently, violation of eventual inclusion can be detected by guessing $n \in \mathbb{N}$ such that $|M| \le n \le 2|M|$ and verifying $a^n \in L(M)$.
Even though $M$ is of exponential size, it is possible to verify $a^n \in L(M)$ in polynomial time. To this end, we use $N_1$, $N_2$ instead of $M$ and view their transition functions as matrices. Then one can verify the condition using fast matrix exponentiation (by squaring). Because the binary size of $n$ must be polynomial in $|N_1| + |N_2|$, the lemma follows. ◀
Proof of Lemma 26.
The left-to-right implication is clear. For the opposite direction, observe that, because the order on $X$ is total, $d(n) > d'(n)$ implies the existence of $x \in X$ such that $d(n) \ge x$ and $d'(n) < x$ (it suffices to take $x = d(n)$). Because $X$ is finite, $d(n) > d'(n)$ for infinitely many $n$ implies failure of $\{n \mid d(n) \ge x\} \overset{\sim}{\subset} \{n \mid d'(n) \ge x\}$ for some $x$. ◀
F.1 Hardness
▶ Theorem 55.
The big-O problem is coNP-hard on unary Markov chains.
Let us first consider a particular form of unary NFAs.
▶ Definition 56.
A unary NFA $N = \langle Q, \to, q_s, F \rangle$ is in Chrobak normal form [9] if
$Q = S \uplus C_1 \uplus \dots \uplus C_m$ and $q_s \in S$;
$S = \{s_1, \dots, s_k\}$, $q_s = s_1 \in S$ and transitions between states from $S$ form a path $s_1 \xrightarrow{a} s_2 \xrightarrow{a} \dots \xrightarrow{a} s_k$;
$C_i = \{c^i_0, \dots, c^i_{|C_i|-1}\}$ ($1 \le i \le m$) and transitions between states from $C_i$ form a cycle $c^i_0 \xrightarrow{a} c^i_1 \xrightarrow{a} \dots \xrightarrow{a} c^i_{|C_i|-1} \xrightarrow{a} c^i_0$;
the remaining transitions connect the end of the path to each cycle: $s_k \xrightarrow{a} c^i_0$ for all $1 \le i \le m$.
Any unary NFA can be translated to this representation with at most quadratic blow-up in the size of the machine [9]; such a representation can be found in polynomial time [45, 32]. In addition, to simplify our arguments, we introduce a restricted Chrobak normal form, which requires that there is exactly one accepting state in each cycle. This restricted form can be found with at most a further quadratic blow-up over Chrobak normal form, by creating copies of cycles, one for each accepting state in the cycle.
Observe that $S \subseteq F$ is a necessary condition for the universality of a unary NFA in Chrobak normal form. Consequently, the universality problem for unary NFA in restricted Chrobak normal form such that $k = 1$ is already coNP-hard. This is the problem we are going to reduce from in the following.
Proof of Theorem 55.
Let $N = \langle Q, \to, q_s, F \rangle$ be a unary NFA in restricted Chrobak normal form with $k = 1$. We will construct a unary Markov chain $M$, depicted in Figure 7, with states $Q' = Q \cup \{s, u, v, t\}$, where $t$ is final. The branch starting from $s$, defined below, guarantees $f_s(a^n) = \Theta((\tfrac{1}{2})^n)$.
$$s \xrightarrow{1} u \qquad u \xrightarrow{\frac{1}{2}} u \qquad u \xrightarrow{\frac{1}{2}} t$$
Figure 7 Reduction from NFA (left) to LMC (right)
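To make the starting point of the reduction concrete, the following is a minimal sketch of the universality question for a unary NFA in restricted Chrobak normal form with $k = 1$. It assumes a hypothetical encoding in which each cycle is a pair (length, index of its single accepting state) and the initial state feeds directly into position 0 of every cycle; the function names are illustrative and not taken from the paper. The check is brute force over one period $\mathrm{lcm}\{|C_1|, \dots, |C_m|\}$, which can be exponential in the size of the NFA, in line with the coNP-hardness of the problem.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def accepts(n, start_accepting, cycles):
    """Decide whether a^n is accepted.

    Assumed encoding (hypothetical): `cycles` is a list of pairs
    (length, accepting_index), one accepting state per cycle, and the
    single path state s_1 moves to position 0 of each cycle in one step.
    """
    if n == 0:
        return start_accepting
    # After n >= 1 steps the run in cycle i sits at position (n - 1) mod |C_i|.
    return any((n - 1) % length == acc for length, acc in cycles)


def is_universal(start_accepting, cycles):
    """Brute-force universality check over one full period."""
    if not start_accepting:  # S must be contained in F
        return False
    period = reduce(lcm, (length for length, _ in cycles), 1)
    return all(accepts(n, True, cycles) for n in range(1, period + 1))
```

For instance, two cycles of length 2 accepting at positions 0 and 1 together cover every $n \ge 1$, whereas cycles of lengths 2 and 3 that both accept at position 0 miss $n = 2$.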
We take $s' = q_s$ and create a similar branch from $s'$, albeit with a smaller weight, to create paths of weight $\Theta((\tfrac{1}{4})^n)$ when reading $a^n$.
$$s' \xrightarrow{\frac{1}{m+1}} v \qquad v \xrightarrow{\frac{1}{4}} v \qquad v \xrightarrow{\frac{3}{4}} t$$
Moreover, we add weights to the original NFA transitions from $N$ as follows:
$$s' \xrightarrow{\frac{1}{m+1}} c^i_0 \ (1 \le i \le m) \qquad c^i_{j \ominus 1} \xrightarrow{(\frac{1}{2})^{|C_i|}} c^i_j \ (c^i_j \in F) \qquad c^i_{j \ominus 1} \xrightarrow{1 - (\frac{1}{2})^{|C_i|}} t \ (c^i_j \in F) \qquad c^i_{j \ominus 1} \xrightarrow{1} c^i_j \ (c^i_j \notin F)$$
where $j \ominus 1 = (|C_i| + j - 1) \bmod |C_i|$. Note that the weights have been selected as if each letter were read with weight $\tfrac{1}{2}$ except for a bounded number of transitions, where the bound is $\max_i |C_i|$. Consequently, whenever there are accepting paths for $a^n$ in $N$, their overall weight in $M$ will be $\Theta((\tfrac{1}{2})^n)$.
It is easy to check that the reduction produces an LMC and can be carried out in polynomial time. In Appendix F we show that the reduction is correct. ◀
Proof of Theorem 55 continued.
It remains to argue that the reduction is correct.
If $N$ is not universal, there exists $n$ such that $a^n \notin L(N)$. Because of the cyclic structure of Chrobak normal form, $a^{n_k} \notin L(N)$ for $n_k = n + kq$, where $q = \mathrm{lcm}\{|C_1|, \dots, |C_m|\}$ and $k \in \mathbb{N}$. Then, by the earlier observations about growth, there exists $C > 0$ such that
$$\sup_k \frac{f_s(a^{n_k})}{f_{s'}(a^{n_k})} = \sup_k C\,\frac{(1/2)^{n_k}}{(1/4)^{n_k}} = \sup_k C\, 2^{n_k} = \infty,$$
i.e. $s$ is not big-O of $s'$.
If $N$ is universal then, starting from $s'$ in $M$, every word $a^n$ will have a path weighted $\Theta((\tfrac{1}{2})^n)$ as well as paths weighted $\Theta((\tfrac{1}{4})^n)$. Hence, there exists $C > 0$ such that
$$\sup_n \frac{f_s(a^n)}{f_{s'}(a^n)} \le \sup_n C\,\frac{(\frac{1}{2})^n}{(\frac{1}{2})^n + (\frac{1}{4})^n} \le C,$$
i.e. $s$ is big-O of $s'$. ◀
▶ Remark 57.
We note that the branch via state $v$ is not strictly necessary, but it demonstrates that the problem is hard even if the LC condition is satisfied (i.e., "it can be the numbers that make the hardness").
G Additional material for Section 6
G.1 Additional material for Example 29
Let p = 0 .
62 and then note that f s ( a m b n ) f s ( a m b n ) = 1 · . m · . · . n · . . · . m · . · . n · .
61 + 0 . · . m · . · . n · . m = n = 0 then f s ( a m b n ) f s ( a m b n ) = . For all larger m, n the ratio is smaller. To see that, when p = 0 .
61 we have f s ( a n b . n ) f s ( a n b . n ) −−−−→ n →∞ ∞ , observe there is asolution to x with 0 . · . x < . · . x and 0 . · . x < . · . x , e.g. x = 0 .
66, then let $m = xn$. Whilst useful for illustration in this example, this effect is not limited to a linear relation between the characters, and so heavier machinery is required.
G.2 Additional proofs
▷ Claim 58.
In the plus-letter-bounded cases, without loss of generality, we assume $a_i \ne a_j$ (for $i \ne j$).
Proof of Claim 58.
We show how to reduce the big-O problem in the plus-letter-bounded case to the version of the same problem where $a_i \ne a_j$ (for $i \ne j$). Suppose, as above, $L_s(W) \subseteq a_1^+ \cdots a_m^+$. We can assume $a_i \ne a_{i+1}$ ($1 \le i < m$) because $a^+ a^+$ can be replaced with $a^+$. If $W$ had an $a$-labelled transition that can be used in two different blocks $a_i^+$ and $a_j^+$ ($j \ge i + 2$) within $a_1^+ \cdots a_m^+$, then $a = a_i = a_j$ and this transition could be used to "skip" the block $a_{i+1}^+$, i.e., there would exist a word $w \in L_s(W)$ that does not use the $a_{i+1}^+$ block, contradicting $L_s(W) \subseteq a_1^+ \cdots a_m^+$. Therefore, every transition can be associated with exactly one block. Define a fresh alphabet $\Sigma' = \{b_1, \dots, b_m\}$ and relabel each transition associated with the $i$th block by $b_i$. ◀
Proof of Lemma 32.
$$\begin{aligned}
f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) &= \left(M(a_1)^{n_1} \times M(a_2)^{n_2} \times \dots \times M(a_m)^{n_m}\right)_{s,t} \\
&= \sum_{q_1 \in Q} M(a_1)^{n_1}_{s,q_1} \left(M(a_2)^{n_2} \times \dots \times M(a_m)^{n_m}\right)_{q_1,t} \\
&\;\;\vdots \\
&= \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} M(a_1)^{n_1}_{s,q_1} \times M(a_2)^{n_2}_{q_1,q_2} \times \dots \times M(a_m)^{n_m}_{q_{m-1},t}
\end{aligned}$$
By Lemma 21 in the unary case, for each $M(a_i)^{n_i}_{q_{i-1},q_i}$, there are $(\rho_{q_{i-1},q_i}, k_{q_{i-1},q_i})$, $c$, $C$ such that, if $n_i > |Q|$,
$$c\,\rho_{q_{i-1},q_i}^{n_i}\, n_i^{k_{q_{i-1},q_i}} \le M(a_i)^{n_i}_{q_{i-1},q_i} \le C\,\rho_{q_{i-1},q_i}^{n_i}\, n_i^{k_{q_{i-1},q_i}}.$$
Otherwise, if $n_i \le |Q|$, since there are at most $|Q|$ instances it is clear that there exist $c$, $C$ with
$$c\,\delta^{n_i} \le M(a_i)^{n_i}_{q_{i-1},q_i} \le C\,\delta^{n_i}.$$
Take $c$, $C$ so that $C$ is maximised over all such $C$ and $c$ is minimised over all such $c$.
$$c^{m-1} \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} \rho_{s,q_1}^{n_1} n_1^{k_{s,q_1}} \cdot \ldots \cdot \rho_{q_{m-1},t}^{n_m} n_m^{k_{q_{m-1},t}} \;\le\; f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) \;\le\; C^{m-1} \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} \rho_{s,q_1}^{n_1} n_1^{k_{s,q_1}} \cdot \ldots \cdot \rho_{q_{m-1},t}^{n_m} n_m^{k_{q_{m-1},t}} \quad (7)$$
By standard manipulations, if $(\hat\rho_i, \hat k_i) \le (\rho_i, k_i)$ for all $i$, then $\hat\rho_1^{n_1} n_1^{\hat k_1} \cdots \hat\rho_m^{n_m} n_m^{\hat k_m} + \rho_1^{n_1} n_1^{k_1} \cdots \rho_m^{n_m} n_m^{k_m} = \Theta(\rho_1^{n_1} n_1^{k_1} \cdots \rho_m^{n_m} n_m^{k_m})$ and, by sufficient modification of $C$, $c$, paths admitting $(\hat\rho_1, \hat k_1), \dots, (\hat\rho_m, \hat k_m)$ can be omitted.
Since the sum is finite, any two summands with the same $\rho, k$ values can be reduced to a single one, changing $c$, $C$ by a factor of two.
The remaining $(\rho, k)$ paths correspond exactly with $d_s(n_1, \dots, n_m)$. ◀
Proof of Lemma 34.
Let $\hat\rho = (\rho_1, k_1) \cdots (\rho_m, k_m)$. One can construct an automaton $N^s_{\ge \hat\rho}$ with
$$L(N^s_{\ge \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \exists \hat\sigma \in d_s(n_1, \dots, n_m)\ \hat\sigma \ge \hat\rho\}$$
by tracking the current maximum spectral radius seen and the number of different SCCs with this spectral radius. If the only states seen so far have been singletons with no loops (formally having spectral radius 0), the value should be tracked as $(\delta, 0)$ regardless of how many have been seen.
Passage from states reading $a_j$ to states reading $a_{j+1}$ is allowed only if the tracked value is at least $(\rho_j, k_j)$, and states should be final if the tracked value of $a_m$ is at least $(\rho_m, k_m)$.
Similarly, one can construct $N^s_{> \hat\rho}$ with
$$L(N^s_{> \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \exists \hat\sigma \in d_s(n_1, \dots, n_m)\ \hat\sigma > \hat\rho\}.$$
The construction is the same as for $N^s_{\ge \hat\rho}$ except that, in order to accept, we need to be sure that at least one of the 'at least' comparisons was strict. This can be achieved by maintaining an extra bit at run time.
Note that $L(N^s_{\ge \hat\rho}) \setminus L(N^s_{> \hat\rho})$ contains all $a_1^{n_1} \dots a_m^{n_m}$ such that there exists $\hat\sigma \in d_s(n_1, \dots, n_m)$ with $\hat\sigma \ge \hat\rho$ and, for all $\hat\tau \in d_s(n_1, \dots, n_m)$, we do not have $\hat\tau > \hat\rho$. Consequently, we must have $\hat\rho \in d_s(n_1, \dots, n_m)$, which implies (by maximality) that we cannot have $\hat\tau > \hat\rho$ for any $\hat\tau \in d_s(n_1, \dots, n_m)$. Hence,
$$L(N^s_{\ge \hat\rho}) \setminus L(N^s_{> \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \hat\rho \in d_s(n_1, \dots, n_m)\}.$$
Consequently, we can take $N^s_{\hat\rho}$ to be the corresponding automaton.
Given $\mathcal{Y} \subseteq D$, we can then take $N^s_{\mathcal{Y}}$ to be the automaton corresponding to
$$\bigcap_{\hat\rho \in \mathcal{Y}} L(N^s_{\hat\rho}) \;\cap \bigcap_{\hat\rho \in D \setminus \mathcal{Y}} \left(a_1^+ \cdots a_m^+ \setminus L(N^s_{\hat\rho})\right).$$
The relevant automaton $N_{X,\mathcal{Y}}$ is then found by taking the intersection of $L(N^s_X)$ and $L(N^{s'}_{\mathcal{Y}})$. ◀
Proof of Lemma 35.
Consider the machine $N_{X,\mathcal{Y}}$, accepting a language which is a subset of $a_1^+ a_2^+ \dots a_m^+$, with any state not reachable from the starting state or not leading to an accepting state removed. To induce a form with the property we want, we intersect $N_{X,\mathcal{Y}}$ with the standard DFA for $a_1^+ a_2^+ \dots a_m^+$, without changing the language.
Hence every state corresponds to reading from exactly one character block of $a_1, a_2, \dots, a_m$. At each state there can be at most two characters enabled: either the character to remain in the current character block, or the character to move to the next. Every state can be labelled as only having transitions for $a_i$, or also having transitions with $a_{i+1}$.
Consider all possible choices of automaton formed by restricting $N_{X,\mathcal{Y}}$ so that there is a single state which is allowed to transition from $a_i$ to $a_{i+1}$ for each $i$, and any other state which had this property in $N_{X,\mathcal{Y}}$ has its $a_{i+1}$ transitions removed (but keeps its $a_i$ transitions). Each such choice corresponds with a partition of the accepting runs of $N_{X,\mathcal{Y}}$.
Thus $L(N_{X,\mathcal{Y}})$ is the finite union over the languages induced by all such machines. We further show that such machines can be expressed as a finite union of linear sets in the form prescribed.
Let us assume $N^j_{X,\mathcal{Y}}$ is such a machine with a single state capable of transitioning from $a_i$ to $a_{i+1}$ for each $i$, and again remove any state not reachable from the starting state or not leading to an accepting state. The part of the machine reading $a_i$ has a single starting state and a single final state, which is a unary NFA when the transitions to $a_{i+1}$ are discarded.
This unary NFA can be converted to Chrobak normal form; the section of $N^j_{X,\mathcal{Y}}$ corresponding to $a_i$ can be replaced with this unary NFA, and any accepting state additionally has the transitions from $a_i$ to $a_{i+1}$ of the single such state in $N^j_{X,\mathcal{Y}}$.
Let us repeat the process above for all $i$, decomposing $N^j_{X,\mathcal{Y}}$ into the subsets of languages where there is exactly one state transitioning from $a_i$ to $a_{i+1}$. Let $N^j_{X,\mathcal{Y}} = \bigcup_k N^{j,k}_{X,\mathcal{Y}}$, a finite union, where each $k$ corresponds to a selection of accepting states $(q_1, \dots, q_m)$ with $q_l$ being the accepting state in the Chrobak normal form for $a_l$.
Consider such an $N^{j,k}_{X,\mathcal{Y}}$. The steps spent in each block corresponding to $a_i$ are either formed by the finite path or by a single cycle at the end of the path. If the transition occurs in the finite path then $b^k_i$ is the length of the path to that transition and $r^k_i$ is zero. If the transition occurs in the cycle at the end of the path, then $b^k_i$ is the length of the path to that transition from the start of the path and $r^k_i$ is the length of the cycle. In $N^{j,k}_{X,\mathcal{Y}}$ the time spent in block $a_i$ has no influence on the time spent in $a_j$ for $j \ne i$. Then
$$L(N^{j,k}_{X,\mathcal{Y}}) = \{a_1^{n_1} a_2^{n_2} \dots a_m^{n_m} \mid \exists \vec\lambda \in \mathbb{N}^m \text{ s.t. } \forall i \in [m]\ n_i = b^k_i + r^k_i \cdot \lambda_i\}.$$
The language $L(N_{X,\mathcal{Y}})$ is the union over all $L(N^{j,k}_{X,\mathcal{Y}})$. ◀
Proof of Lemma 37.
By Lemma 36 we have $s$ is not big-O of $s'$ if and only if there exist $X \in D$, $\mathcal{Y} \subseteq D$, $1 \le k \le S_{X,\mathcal{Y}}$ such that
$$\forall C < 0\ \exists \vec\lambda \in \mathbb{N}^m \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i=1}^m \alpha_{j,i} (b^k_i + r^k_i \lambda_i) + p_{j,i} \log(b^k_i + r^k_i \lambda_i) < C. \quad (8)$$
First we argue that we can restrict to some subset of the components which enable the satisfying choice of $\lambda$ to be sufficiently large in all components.
Footnote: By DFA we permit a partial transition function, that is, 0 or 1 transitions for each character from every state, rather than exactly 1.
▷ Claim 59.
Equation (8) holds if and only if the following holds for some $U \subseteq [m]$:
$$\forall C < 0\ \exists \vec\lambda \in \mathbb{N}^U_{\ge \max_i b^k_i} \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i \in U} \alpha_{j,i} \cdot (b^k_i + r^k_i \cdot \lambda_i) + p_{j,i} \log(b^k_i + r^k_i \cdot \lambda_i) < C \quad (9)$$
Proof of Claim 59.
First note that Equation (9) immediately implies Equation (8). We show the converse.
Recall we can alternatively characterise the formulation as a sequence $n : \mathbb{N} \to \mathbb{N}^m$. That is, for each negative integer $C$, the choice of $\vec\lambda$ corresponds to $n(C)$ in the sequence.
Note that in the sequence $n$ some components may be bounded, either because $r^k_i = 0$, or because the choice of $n$ makes it so. Suppose there exists a $\theta > 0$ such that $n(t)_x \le \theta$ for some $x \in [m]$; then $\sum_{i=1}^m \alpha_{j,i} \cdot n(t)_i + p_{j,i} \log(n(t)_i) \le \sum_{i=1, i \ne x}^m \alpha_{j,i} \cdot n(t)_i + p_{j,i} \log(n(t)_i) + |\alpha_{j,x}| \cdot \theta + |p_{j,x}| \theta$. Hence the sequence $\sum_{i=1, i \ne x}^m \alpha_{j,i} \cdot n(t)_i + p_{j,i} \log(n(t)_i)$ goes to $-\infty$ as well.
Consider each choice of components $B \subseteq [m]$ which will be bounded. For some components there will be no choice as $r^k_i = 0$. Let us assume that the chosen set is maximal with respect to set-inclusion; that is, there should be no subsequence maintaining the property with fewer components unbounded. Let the remaining unbounded components be $U = [m] \setminus B$.
Since each remaining component is not bounded, there is always a later point in the sequence in which the value is larger; thus one can take a subsequence of $n(t)$ so that $n(t)_i \le n(t+1)_i$ for every $t$. Repeat for every remaining component $i \in U$; this can be done as the minimal choice of unbounded components has been selected. Hence, without loss of generality, if there exists some sequence, then for any $\theta$ there exists a subsequence of $n(t)$ such that $n(t)_i > \theta$ for all $i \in U$. To enable a more succinct analysis later, restrict $n(t)$ to those in which $\lambda_i \ge \max_i b^k_i$, where $n(t)_i = b^k_i + r^k_i \cdot \lambda_i$ for some $\lambda_i$. ◁
Next we argue that the offset component $\vec b$ does not affect whether the formula holds and that we can relax the restriction of $\vec\lambda$ from naturals to positive reals and maintain the satisfiability of the formula. The advantage here is that this relaxation can be solved with the first order theory of the reals with exponential function, which is decidable subject to Schanuel's conjecture.
▷ Claim 60.
$$\forall C < 0\ \exists \vec\lambda \in \mathbb{N}^U_{\ge \max_i b^k_i} \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i \in U} \alpha_{j,i} \cdot (b^k_i + r^k_i \cdot \lambda_i) + p_{j,i} \log(b^k_i + r^k_i \cdot \lambda_i) < C \quad (10)$$
holds if and only if the following holds:
$$\forall C < 0\ \exists \vec x \in \mathbb{R}^U_{\ge \max_i b^k_i} \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot x_i + \sum_{i \in U} p_{j,i} \log(x_i) < C \quad (11)$$
Proof of Claim 60.
Observe that
$$\sum_{i \in U} \alpha_{j,i} \cdot (b^k_i + r^k_i \cdot \lambda_i) = \sum_{i \in U} \alpha_{j,i} \cdot b^k_i + \sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i$$
and that $\sum_{i \in U} \alpha_{j,i} \cdot b^k_i$ is constant, so it does not affect whether the sequence goes to $-\infty$; hence Equation (10) holds if and only if:
$$\forall C < 0\ \exists \vec\lambda \in \mathbb{N}^U_{\ge \max_i b^k_i} \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i + p_{j,i} \log(b^k_i + r^k_i \cdot \lambda_i) < C \quad (12)$$
Now let us extract the log component by using the following rewriting:
$$\log(b^k_i + r^k_i \cdot \lambda_i) = \log\left(\lambda_i \cdot \left(\tfrac{b^k_i}{\lambda_i} + r^k_i\right)\right) = \log(\lambda_i) + \log\left(\tfrac{b^k_i}{\lambda_i} + r^k_i\right).$$
Since $r^k_i \ge 1$ and $\lambda_i \ge b^k_i$ we have $\log\left(\tfrac{b^k_i}{\lambda_i} + r^k_i\right) \le \log(r^k_i + 1)$, which is constant. Hence Equation (10) is equivalent to:
$$\forall C < 0\ \exists \vec\lambda \in \mathbb{N}^U_{\ge \max_i b^k_i} \bigwedge_{j \in h_{\mathcal{Y}}} \sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i + \sum_{i \in U} p_{j,i} \log(\lambda_i) < C \quad (13)$$
We now show that this is equivalent to Equation (11). Clearly Equation (13) implies Equation (11). Now suppose Equation (11) holds; we show that Equation (13) is satisfied by exhibiting a choice of $\vec\lambda \in \mathbb{N}^U_{\ge \max_i b^k_i}$ for every $C$.
Given $C < 0$, let $C' = C - \max_j \sum_{i \in U} |\alpha_{j,i}| r^k_i - \max_j \sum_{i \in U} |p_{j,i}|$, and choose $\vec x \in \mathbb{R}^{|U|}_{\ge \max_i b^k_i}$ satisfying Equation (11) for $C'$.
Now let $x_i = \lambda_i + y_i$, with $y_i < 1$, $\lambda_i = \lfloor x_i \rfloor$. First observe that since $x_i \ge \max_i b^k_i$, an integer, also $\lambda_i \ge \max_i b^k_i$.
Observe that $\left|\sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot y_i\right| \le \sum_{i \in U} |\alpha_{j,i}| r^k_i$. Since
$$\sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i + \sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot y_i + \sum_{i \in U} p_{j,i} \log(\lambda_i + y_i) < C'$$
we have
$$\sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i + \sum_{i \in U} p_{j,i} \log(\lambda_i + y_i) < C' + \sum_{i \in U} |\alpha_{j,i}| r^k_i.$$
Let us again rewrite $\log(\lambda_i + y_i) = \log\left(\lambda_i \left(1 + \tfrac{y_i}{\lambda_i}\right)\right) = \log(\lambda_i) + \log\left(1 + \tfrac{y_i}{\lambda_i}\right)$. Then since $\lambda_i > y_i$, $\log\left(1 + \tfrac{y_i}{\lambda_i}\right) \le 1$, so
$$\left|\sum_{i \in U} p_{j,i} \log\left(1 + \tfrac{y_i}{\lambda_i}\right)\right| \le \sum_{i \in U} |p_{j,i}|.$$
We thus have
$$\sum_{i \in U} \alpha_{j,i} \cdot r^k_i \cdot \lambda_i + \sum_{i \in U} p_{j,i} \log(\lambda_i) < C' + \sum_{i \in U} |\alpha_{j,i}| r^k_i + \sum_{i \in U} |p_{j,i}| \le C$$
and hence Equation (13) holds. ◁ ◀
G.3 The bounded case
Proof of Lemma 40.
Let $W = \langle Q, \Sigma, M, F \rangle$. Then we have $w_1, \dots, w_m$ such that for all $w$ with $f_s(w) > 0$, $w = w_1^{n_1} \dots w_m^{n_m}$ for some $n_1, \dots, n_m \in \mathbb{N}$. Let us assume $w_i = b_{i,1} b_{i,2} \dots b_{i,|w_i|}$.
Given a word $w$, there may be multiple paths $\pi_1, \pi_2, \dots$ from $s$ to $t$ respecting that word. Further, there may be multiple decomposition vectors $\vec n_1, \vec n_2, \dots \in \mathbb{N}^m$ such that $\vec n_i = (n_1, \dots, n_m)$ and $w = w_1^{n_1} \dots w_m^{n_m}$. Our goal will be to construct a weighted automaton $W'$ with states $\hat s$ and $\hat s'$, letter-bounded over $a_1^* \dots a_m^*$, such that, for every word $w$, the weight of $a_1^{n_1} \dots a_m^{n_m}$ in $W'$ from $\hat s$ (resp. $\hat s'$), for every valid decomposition vector $\vec n \in \mathbb{N}^m$ of $w$, will be the sum of the weights of all paths $\pi_1, \pi_2, \dots$ respecting $w$ in $W$ from $s$ (resp. $s'$). To compute $W'$, we will define a transducer and apply it to our automaton $W$.
A nondeterministic finite transducer is an NFA with transitions labelled by pairs from $\Sigma \times (\Sigma' \cup \{\epsilon\})$. In our construction, we only require edges of this form, i.e. we do not consider a definition with transitions labelled with $\epsilon$ in the first component (e.g. $\epsilon/a$). Our transducer induces a translation $T : \Sigma^* \to \Sigma'^*$.
Consider the set of regular expressions $w_{i_1}^+ \dots w_{i_{m'}}^+$, each induced by a sequence $\vec\imath = (i_1, \dots, i_{m'}) \in \mathbb{N}^{m'}$, $m' \le m$, with $1 \le i_1 < \dots < i_{m'} \le m$. Note that two sequences $(i_1, \dots, i_{m'})$, $(i'_1, \dots, i'_{m''})$ may yield the same expression $w_{i_1}^+ \dots w_{i_{m'}}^+$, in which case we need not consider more than one. The transducer $T$ will be defined as follows.
For each $\vec\imath = (i_1, \dots, i_{m'})$ described above, build the following automaton. For each $i_j$, construct the following section, which simply reads the word $w_{i_j}$:
$$f^{\vec\imath}_j \xrightarrow{b_{i_j,1}/\epsilon} s^{\vec\imath}_j \xrightarrow{b_{i_j,2}/\epsilon} \cdot \xrightarrow{b_{i_j,3}/\epsilon} \cdot\ \dots\ \cdot \xrightarrow{b_{i_j,|w_{i_j}|-1}/\epsilon} e^{\vec\imath}_j.$$
Then, on the final character, nondeterministically restart or move to the next word, emitting a character representing the word:
$$e^{\vec\imath}_j \xrightarrow{b_{i_j,|w_{i_j}|}/a_{i_j}} f^{\vec\imath}_j \quad\text{and}\quad e^{\vec\imath}_j \xrightarrow{b_{i_j,|w_{i_j}|}/a_{i_j}} f^{\vec\imath}_{j+1}$$
The transducer $T$ is defined by the union of the above transitions over all $\vec\imath$. We also add a global start state $q_0$, from which we would like to move nondeterministically to $f^{\vec\imath}_1$ for each $\vec\imath$. To achieve this and avoid $\epsilon$ transitions, we duplicate the transitions $f^{\vec\imath}_1 \xrightarrow{x} s^{\vec\imath}_1$ with $q_0 \xrightarrow{x} s^{\vec\imath}_1$.
Observe that the valid output sequences are $(\epsilon^* a_1)^* (\epsilon^* a_2)^* \dots (\epsilon^* a_m)^*$. However, there can be only a finite number of $\epsilon$'s in a row, at most $r = \max_{1 \le i \le m} |w_i| - 1$.
Let $W = \langle Q, \Sigma, M, \{t\} \rangle$ and $T = \langle Q', \Sigma \times (\Sigma' \cup \{\epsilon\}), \to, q_0 \rangle$. Then construct the weighted automaton $T(W) = \langle Q \times Q', \Sigma', M_T, \{t\} \times Q' \rangle$ using a product construction. The probability is associated in the following way: $M_T(a)((s, q), (s', q')) = p$ if there is a transition $q \xrightarrow{b/a} q'$ in $T$ and $s \xrightarrow{p}_b s'$ in $W$. Note that, by this definition, there is a matrix $M_T(\epsilon)$; however, in every run of $T(W)$ at most $r$ many $\epsilon$'s in a row are produced, where $r = \max_{1 \le i \le m} |w_i| - 1$.
Let $W'$ be a copy of $T(W)$ with $\epsilon$ removed: $M'(a_i) = \left(\sum_{x=0}^{r} M_T(\epsilon)^x\right) M_T(a_i)$. Then $f_W(w) = f_{W'}(a_1^{n_1} \dots a_m^{n_m})$ for all $n_1, \dots, n_m$ such that $w = w_1^{n_1} \dots w_m^{n_m}$. Hence, $W'$ is a weighted automaton with letter-bounded languages from $(s, q_0)$ and $(s', q_0)$ such that $(s, q_0)$ is big-O of $(s', q_0)$ in $W'$ if and only if $s$ is big-O of $s'$ in $W$.
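The $\epsilon$-removal step at the end of the proof, $M'(a_i) = \left(\sum_{x=0}^{r} M_T(\epsilon)^x\right) M_T(a_i)$, can be sketched as follows. This is only an illustration under assumed conventions: matrices are plain lists of rows with exact rational entries, and the names `remove_epsilon`, `m_eps` and `m_letters` are hypothetical rather than taken from the paper.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    """n-by-n identity matrix with rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def remove_epsilon(m_eps, m_letters, r):
    """Return M'(a) = (sum_{x=0}^{r} m_eps^x) * M(a) for every letter a.

    m_eps     -- matrix of epsilon-output transitions (M_T(eps))
    m_letters -- dict mapping each output letter a to its matrix M_T(a)
    r         -- bound on the number of consecutive epsilons in any run
    """
    n = len(m_eps)
    prefix = identity(n)   # the x = 0 term
    power = identity(n)
    for _ in range(r):
        power = mat_mul(power, m_eps)
        prefix = mat_add(prefix, power)
    return {a: mat_mul(prefix, m) for a, m in m_letters.items()}
```

As a small usage example, with three states, one $\epsilon$-step of weight $\tfrac12$ from state 0 to state 1 and one $a$-step of weight $\tfrac13$ from state 1 to state 2, taking $r = 1$ gives $M'(a)_{0,2} = \tfrac16$: the $\epsilon$-step is folded into the following letter.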