The Big-O Problem

Abstract

Given two weighted automata, we consider the problem of whether one is big-O of the other, i.e., if the weight of every finite word in the first is not greater than some constant multiple of the weight in the second. We show that the problem is undecidable, even for the instantiation of weighted automata as labelled Markov chains. Moreover, even when it is known that one weighted automaton is big-O of another, the problem of finding or approximating the associated constant is also undecidable. Our positive results show that the big-O problem is polynomial-time solvable for unambiguous automata, coNP-complete for unlabelled weighted automata (i.e., when the alphabet is a single character) and decidable, subject to Schanuel's conjecture, when the language is bounded (i.e., a subset of $w_1^* \cdots w_m^*$ for some finite words $w_1, \ldots, w_m$) or when the automaton has finite ambiguity. On labelled Markov chains, the problem can be restated as a ratio total variation distance, which, instead of finding the maximum difference between the probabilities of any two events, finds the maximum ratio between the probabilities of any two events. The problem is related to $\epsilon$-differential privacy, for which the optimal constant of the big-O notation is exactly $\exp(\epsilon)$.


The Big-O Problem for Labelled Markov Chains and Weighted Automata

Dmitry Chistikov

Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, UK

Stefan Kiefer

Department of Computer Science, University of Oxford, UK

Andrzej S. Murawski

Department of Computer Science, University of Oxford, UK

David Purser

Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, UK
Max Planck Institute for Software Systems, Saarland Informatics Campus, Germany

Theory of computation → Probabilistic computation

Keywords and phrases weighted automata, labelled Markov chains, probabilistic systems

Funding

Dmitry Chistikov: Supported in part by the Royal Society International Exchanges scheme (IEC\R2\170123).

Stefan Kiefer: Supported by a Royal Society Research Fellowship.

Andrzej S. Murawski: Supported by a Royal Society Leverhulme Trust Senior Research Fellowship and the International Exchanges Scheme (IE161701).

David Purser: Supported by the UK EPSRC Centre for Doctoral Training in Urban Science (EP/L016400/1) and in part by the Royal Society International Exchanges scheme (IEC\R2\170123).

Acknowledgements

The authors would like to thank Engel Lefaucheux, Joël Ouaknine, and James Worrell for discussions during the development of this work.

Weighted automata over finite words are a well-known and powerful model of computation, a quantitative analogue of finite-state automata. Special cases of weighted automata include nondeterministic finite automata and labelled Markov chains, two standard formalisms for modelling systems and processes. Algorithms for analysis of weighted automata have been studied both in the early theory of computing and more recently by the infinite-state systems and algorithmic verification communities.

Given two weighted automata $\mathcal{A}, \mathcal{B}$ over an algebraic structure $(S, +, \times)$, the equivalence problem asks whether the two associated functions $f_{\mathcal{A}}, f_{\mathcal{B}} : \Sigma^* \to S$ are equal: $f_{\mathcal{A}}(w) = f_{\mathcal{B}}(w)$ for all finite words $w$ over the alphabet $\Sigma$. Over the ring $(\mathbb{Q}, +, \times)$, equivalence is decidable in polynomial time by the results of Schützenberger [41] and Tzeng [46]; subsequently, fast parallel (NC and RNC) algorithms have been found for this problem [47, 26]. In contrast, for semirings the equivalence problem is hard: undecidable [27, 1] for the semiring $(\mathbb{Q}, \max, +)$ and PSPACE-hard [35] for the Boolean semiring (for which weighted automata are usual nondeterministic finite automata and equivalence is equality of recognized languages).

Replacing $=$ with $\le$ makes the problem harder: even for the ring $(\mathbb{Q}, +, \times)$ the question of whether $f_{\mathcal{A}}(w) \le f_{\mathcal{B}}(w)$ for all $w \in \Sigma^*$ is undecidable, even if $f_{\mathcal{A}}$ is constant [38]. This problem subsumes the universality problem for (Rabin) probabilistic automata, yet another subclass of weighted automata (see, e.g., [16]).

In this paper, we introduce and study another natural problem, in which the ordering is relaxed from exact (in)equality to (in)equality to within a constant factor.
Given $\mathcal{A}$ and $\mathcal{B}$ as above, is it true that there exists a constant $c > 0$ such that $f_{\mathcal{A}}(w) \le c \cdot f_{\mathcal{B}}(w)$ for all $w \in \Sigma^*$?

Using standard mathematical notation, this condition asserts that $f_{\mathcal{A}}(w) = O(f_{\mathcal{B}}(w))$ as $|w| \to \infty$, and we refer to this problem as the big-O problem accordingly. (There also exists a related but slightly different definition of big-O; see Remark 12 for details on the corresponding version of our big-O problem.) The big-Θ problem (which turns out to be computationally equivalent to the big-O problem), in line with the $\Theta(\cdot)$ notation in analysis of algorithms, asks whether $f_{\mathcal{A}} = O(f_{\mathcal{B}})$ and $f_{\mathcal{B}} = O(f_{\mathcal{A}})$.

We restrict our attention to the ring $(\mathbb{Q}, +, \times)$ and only consider non-negative weighted automata, i.e., those in which all transitions have non-negative weights. We remark that, even under this restriction, weighted automata still form a superclass of (Rabin) probabilistic automata, a non-trivial and rich model of computation. Our initial motivation to study the big-O problem came from yet another formalism, labelled Markov chains (LMCs). One can think of the semantics of LMCs as giving a probability distribution or subdistribution on the set of all finite words. LMCs, often under the name Hidden Markov Models, are widely employed in a diverse range of applications; in computer-aided verification, they are perhaps the most fundamental model for probabilistic systems, with model-checking tools such as Prism [28] or Storm [13] based on analyzing LMCs efficiently. All the results in our paper (including hardness results) hold for LMCs too. Our main findings are as follows.

The big-O problem for non-negative WA and LMCs turns out to be undecidable in general, by a reduction from nonemptiness for probabilistic automata.

For unambiguous automata, i.e., where every word has at most one accepting path, the big-O problem becomes decidable and can be solved in polynomial time.

In the unary case, i.e., if the input alphabet $\Sigma$ is a singleton, the big-O problem is also decidable and, in fact, complete for the complexity class coNP. Unary LMCs are a simple and pure probabilistic model of computation: they run in discrete time and can terminate at any step; the big-O problem refers to this termination probability in two LMCs (or two WA). Our upper bound argument refines an analysis of growth of entries in powers of non-negative matrices by Friedland and Schneider [40], and the lower bound is obtained by a reduction from unary NFA universality [44].

In a more general bounded case, i.e., if the languages of all words $w$ associated with non-zero weight are included in $w_1^* w_2^* \cdots w_m^*$ for some finite words $w_1, \ldots, w_m \in \Sigma^*$ (that is, are bounded in the sense of Ginsburg and Spanier; see [21, Chapter 5] and [22]), the big-O problem is decidable subject to Schanuel's conjecture. This is a well-known conjecture in transcendental number theory [29], which implies that the first-order theory of the real numbers with the exponential function is decidable [30]. Intuitively, our reliance on this conjecture is linked to the expressions for the growth rate in powers of non-negative matrices. These expressions are sums of terms of the form $\rho^n \cdot n^k$, where $n$ is the length of a word, $k \in \mathbb{N}$, and $\rho$ is an algebraic number. Our algorithms (however implicitly) need to compare for equality pairs of real numbers of the form $\log \rho_1 / \log \rho_2$, where $\rho_i$ are algebraic, and it is an open problem in number theory whether there is an effective procedure for this task (the four exponentials conjecture asks whether two such ratios can ever be equal; see, e.g., Waldschmidt [48, Sections 1.3 and 1.4]).

Bounded languages form a well-known subclass of regular languages. In fact, a regular (or even context-free) language $L$ is bounded if and only if the number of words of length $n$ in $L$ is at most polynomial in $n$. All other regular languages have, in contrast, exponential growth rate (a fact rediscovered multiple times; see, e.g., references in Gawrychowski et al. [19]). Bounded languages have been studied from combinatorial and algorithmic points of view since the 1960s [22, 19], and have recently been used, e.g., in the analysis of quantitative information flow problems in computer security [34, 33]. In the context of labelled Markov chains, languages that are subsets of $a_1^* a_2^* \cdots a_m^*$ (for individual letters $a_1, \ldots, a_m \in \Sigma$) model consecutive arrival of $m$ events in a discrete-time system. It is curious that natural decision problems for such simple systems can lead to intricate algorithmic questions in number theory at the border of decidability.

Further motivation and related work.

In the labelled Markov chain setting, the big-O problem can be reformulated as a boundedness problem for the following function. For two LMCs $\mathcal{A}$ and $\mathcal{B}$, define the (asymmetric) ratio variation function by

$$r(\mathcal{A}, \mathcal{B}) = \sup_{E \subseteq \Sigma^*} \frac{f_{\mathcal{A}}(E)}{f_{\mathcal{B}}(E)},$$

where $f_{\mathcal{A}}(E)$ and $f_{\mathcal{B}}(E)$ denote the total probability mass associated with an arbitrary set of finite words $E \subseteq \Sigma^*$ in $\mathcal{A}$ and $\mathcal{B}$, respectively. Here we assume $\frac{0}{0} = 0$ and $\frac{x}{0} = \infty$ for $x > 0$. Observe that, because $\max(\frac{a}{b}, \frac{c}{d}) \ge \frac{a+c}{b+d}$ for $a, b, c, d \ge 0$, the supremum over $E \subseteq \Sigma^*$ can be replaced with supremum over $w \in \Sigma^*$. Consequently, the big-O problem for LMCs is equivalent to deciding whether $r(\mathcal{A}, \mathcal{B}) < \infty$.

Finding the value of $r$ amounts to asking for the optimal (minimal) constant in the big-O notation. Further, one can consider a symmetric variant, the ratio distance: $rd(\mathcal{A}, \mathcal{B}) = \max\{r(\mathcal{A}, \mathcal{B}), r(\mathcal{B}, \mathcal{A})\}$, in an analogy with big-Θ. Now, $rd$ is a ratio-oriented variant of the classic total variation distance $tv$, defined by $tv(\mathcal{A}, \mathcal{B}) = \sup_{E \subseteq \Sigma^*} (f_{\mathcal{A}}(E) - f_{\mathcal{B}}(E))$, which is a well-established way of comparing two labelled Markov chains [6, 25]. We also consider the problem of approximating $r$ (as well as $rd$) to a given precision and the problem of comparing it with a given constant (threshold problem), showing that both are undecidable.

The ratio distance $rd$ is also equivalent to the exponential of the multiplicative total variation distance defined in [5, 43] in the context of differential privacy. Consider a system $\mathcal{M}$, modelled by a single labelled Markov chain, where output words are observable to the environment but we want to protect the privacy of the starting configuration. Let $R \subseteq Q \times Q$ be a symmetric relation, which relates the starting configurations intended to remain indistinguishable. Given $\epsilon \ge 0$, we say that $\mathcal{M}$ is $\epsilon$-differentially private (with respect to $R$) if, for all $(s, s') \in R$, we have $f_s(E) \le e^{\epsilon} \cdot f_{s'}(E)$ for every observable set of traces $E \subseteq \Sigma^*$ [14, 7]. Here in the subscript of $f$ and elsewhere, references to states $s$ and $s'$ replace references to LMCs/automata: $\mathcal{M}$ stays implicit, and we specify which state it is executed from. Note that there exists such an $\epsilon$ if and only if $r(s, s') < \infty$ for all $(s, s') \in R$ or, equivalently, (the LMC $\mathcal{M}$ executed from) $s$ is big-O of (the LMC $\mathcal{M}$ executed from) $s'$ for all $(s, s') \in R$. In fact, the minimal such $\epsilon$ satisfies $e^{\epsilon} = \max_{(s, s') \in R} r(s, s')$, thus $r$ captures the level of differential privacy between $s$ and $s'$.

Our results show that even deciding whether the multiplicative total variation distance is finite or $+\infty$ is, in general, impossible. Likewise, it is undecidable whether a system modelled by a labelled Markov chain provides any degree of differential privacy, however low.

Definition 1. A weighted automaton $\mathcal{W}$ over the $(\mathbb{Q}, +, \times)$ semi-ring is a 4-tuple $\langle Q, \Sigma, M, F \rangle$, where $Q$ is a finite set of states, $\Sigma$ is a finite alphabet, $M : \Sigma \to \mathbb{Q}^{Q \times Q}$ is a transition weighting function, and $F \subseteq Q$ is a set of final states. We consider only non-negative weighted automata, i.e. $M(a)(q, q') \ge 0$ for all $a \in \Sigma$ and $q, q' \in Q$.

In complexity-theoretic arguments, we assume that each weight is given as a pair of integers (numerator and denominator) in binary. The description size is then the number of bits required to represent $\langle Q, \Sigma, M, F \rangle$, including the bit size of the weights.

Each weighted automaton defines functions $f_s : \Sigma^* \to \mathbb{R}$, where for all $s \in Q$

$$f_s(w) = \sum_{t \in F} \big(M(a_1) \times M(a_2) \times \cdots \times M(a_n)\big)_{s,t} \quad \text{for } w = a_1 a_2 \ldots a_n \in \Sigma^*,$$

and $A \times B$ is standard matrix multiplication. We refer to $f_s(w)$ as the weight of $w$ from state $s$. Without loss of generality, a weighted automaton can have a single final state. If not, introduce a new unique final state $t$ s.t. $M(a)(q, t) = \sum_{q' \in F} M(a)(q, q')$ for all $q \in Q$, $a \in \Sigma$.

Definition 2.

We denote by $L_s(\mathcal{W})$ the set of $w \in \Sigma^*$ with $f_s(w) > 0$, that is, with positive weight from $s$. Equivalently, this is the language of $\mathcal{N}_s(\mathcal{W})$, the non-deterministic finite automaton (NFA) formed from the same set of states (and final states) as $\mathcal{W}$, start state $s$, and transitions $q \xrightarrow{a} q'$ whenever $M(a)(q, q') > 0$.

Given $s, s' \in Q$, we say that $s$ is big-O of $s'$ if there exists $C > 0$ such that $f_s(w) \le C \cdot f_{s'}(w)$ for all $w \in \Sigma^*$. The paper studies the following problem.

Definition 3 (Big-O Problem).
input: Weighted automaton $\langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$
output: Is $s$ big-O of $s'$?

Remark 4. One could consider whether $s$ is big-Θ of $s'$, defined as $s$ is big-O of $s'$ and $s'$ is big-O of $s$; equivalently, whether $rd(s, s') < \infty$ for LMCs. We note that these two notions reduce to each other, justifying our consideration of only the big-O problem (see Appendix C). There is an obvious reduction from big-Θ to big-O making two oracle calls (a Cook reduction), but this can be strengthened to a single call preserving the answer (a Karp reduction). This, however, requires at least two characters. In the other direction, one can ask if $s$ is big-O of $s'$ using big-Θ by asking if a linear combination of $s$ and $s'$ is big-Θ of $s'$.

In the paper we also work with labelled Markov chains. In particular, they will appear in examples and hardness (including undecidability) arguments. As they are a special class of weighted automata, this will imply hardness (resp. undecidability) for weighted automata in general. On the other hand, our decidability results will be phrased using weighted automata, which makes them applicable to labelled Markov chains.

Definition 5. A labelled Markov chain (LMC) is a (non-negative) weighted automaton $\langle Q, \Sigma, M, F \rangle$ such that, for all $q \in Q \setminus F$, we have $\sum_{q' \in Q} \sum_{a \in \Sigma} M(a)(q, q') = 1$, and $M(a)(q, q') = 0$ for all $a \in \Sigma$, $q \in F$ and $q' \in Q$.

Since final states have no outgoing transitions, w.l.o.g., one can assume a unique final state. For LMCs, the function $f_s$ can be extended to a measure on the powerset of $\Sigma^*$ by $f_s(E) = \sum_{w \in E} f_s(w)$, where $E \subseteq \Sigma^*$. The measure is a subdistribution: $\sum_{w \in \Sigma^*} f_s(w) \le 1$.

We will also consider unary weighted automata, and similarly LMCs, where $|\Sigma| = 1$. Then we will often omit $\Sigma$ on the understanding that $\Sigma = \{a\}$, and describe transitions with a single matrix $A = M(a)$ so that $f_s(a^n) = A^n_{s,t}$, where $t$ is the unique final state. Note that $A^n_{s,t}$ stands for $(A^n)(s, t)$, and not $(A(s, t))^n$. Using the notation of regular expressions, we can write $L_s(\mathcal{W}) \subseteq a^*$. It will turn out to be fruitful to consider several larger classes of languages:

Definition 6. Let $L \subseteq \Sigma^*$.
$L$ is bounded [22] if $L \subseteq w_1^* w_2^* \cdots w_m^*$ for some $w_1, \ldots, w_m \in \Sigma^*$.
$L$ is letter-bounded if $L \subseteq a_1^* a_2^* \cdots a_m^*$ for some $a_1, \ldots, a_m \in \Sigma$.
$L$ is plus-letter-bounded if $L \subseteq a_1^+ a_2^+ \cdots a_m^+$ for some $a_1, \ldots, a_m \in \Sigma$.

In each case, if the language of an NFA is suitably bounded, one can extract a corresponding bounding regular expression [19].
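To make Definitions 1 and 5 concrete, the following Python sketch computes $f_s(w)$ as a vector-matrix product and checks the row-sum condition of an LMC. The dictionary encoding of $M$, the function names `weight` and `is_lmc`, and the two-state example chain are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

# Assumed encoding (not from the paper): M maps each letter a to a dict
# {(q, q2): weight} listing only the non-zero entries of the matrix M(a).

def weight(M, finals, s, word):
    """Return f_s(word) = sum over final t of (M(a_1) x ... x M(a_n))[s][t]."""
    row = {s: Fraction(1)}  # unit row vector selecting the start state s
    for a in word:
        nxt = {}
        for (p, q), w in M[a].items():
            if p in row:
                nxt[q] = nxt.get(q, Fraction(0)) + row[p] * w
        row = nxt
    return sum((x for q, x in row.items() if q in finals), Fraction(0))

def is_lmc(M, states, finals):
    """Row-sum condition of Definition 5: non-final states emit total
    weight 1 over all letters, final states have no outgoing weight."""
    for q in states:
        out = sum((w for t in M.values() for (p, _), w in t.items() if p == q),
                  Fraction(0))
        if out != (0 if q in finals else 1):
            return False
    return True

# Hypothetical example: from s, read 'a' and stay (1/2) or read 'b' and stop (1/2).
half = Fraction(1, 2)
M_example = {"a": {("s", "s"): half}, "b": {("s", "t"): half}}
print(weight(M_example, {"t"}, "s", "aab"))  # weight of the word aab from s
print(is_lmc(M_example, {"s", "t"}, {"t"}))
```

In this example $f_s(aab) = 1/8$, and the weights of the words $a^n b$ sum to 1, so the chain defines a proper distribution on finite words.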

We show that the big-O problem is undecidable. We also establish undecidability for several other problems related to computing and approximating the ratio variation distance. Recall that this corresponds to identifying the optimal constant for positive instances of the big-O problem or the level of differential privacy between two states in a labelled Markov chain.

Definition 7. The asymmetric threshold problem takes an LMC along with two states $s, s'$ and a constant $\theta$, and asks if $r(s, s') \le \theta$. The variant under the promise of boundedness promises that $r(s, s') < \infty$. The strict variant of each problem replaces $\le$ with $<$.

The asymmetric additive approximation task takes an LMC, two states $s, s'$ and a constant $\gamma$, and asks for $x$ such that $|r(s, s') - x| \le \gamma$. The asymmetric multiplicative approximation task takes an LMC, two states $s, s'$ and a constant $\gamma$, and asks for $x$ such that $1 - \gamma \le \frac{x}{r(s, s')} \le 1 + \gamma$.

In each case, the symmetric variant is obtained by replacing $r$ with $rd$.

Theorem 8.
The big-O problem is undecidable, even for LMCs.
Each variant of the threshold problem (asymmetric/symmetric, non-strict/strict) is undecidable, even under the promise of boundedness.
All variants of the approximation tasks (asymmetric/symmetric, additive/multiplicative) are unsolvable, even under the promise of boundedness.

Probabilistic automata are similar to LMCs, except that $M(a)$ is stochastic for every $a$, rather than $\sum_{a \in \Sigma} M(a)$ being stochastic. Formally, a probabilistic automaton is a non-negative weighted automaton with a distinguished start state $q_s$ such that $\sum_{q' \in Q} M(a)(q, q') = 1$ for all $q \in Q$ and $a \in \Sigma$. The problem Empty asks if $f_{q_s}(w) \le \frac{1}{2}$ for all words $w$. It is known to be undecidable [38, 16].

[Figure 1: Unbounded ratio but language equivalent.]

Proof sketch of Theorem 8 (see Appendix D). We reduce from Empty. The construction creates two branches of a labelled Markov chain. The first simulates the probabilistic automaton using the original weights multiplied by a scalar. The other branch will process each letter from $\Sigma$ with equal weight (also in an infinite loop). Consequently, if there is a word accepted with probability greater than $\frac{1}{2}$, the ratio between the two branches will be greater than 1. The construction will enable words to be processed repeatedly, so that the ratio can then be pumped unboundedly. Certain linear combinations of the branches enable a gap promise, entailing undecidability of the threshold and approximation tasks.

Remark.

The classic non-strict threshold problem for the total variation distance (i.e. whether $tv(s, s') \le \theta$) is known to be undecidable [25], like our distances. However, it is not known if its strict variant (i.e. whether $tv(s, s') < \theta$) is also undecidable. In contrast, in our case, both variants are undecidable. Further note that (additive) approximation of $tv$ is possible [25, 6], but this is not the case for our distances $r$ and $rd$.

Remark. We have shown the undecidability of the big-O problem using the undecidability of the emptiness problem for probabilistic automata. Another proof of undecidability can be obtained using the Value-1 problem (shown to be undecidable in [20]): indeed the big-O problem and the Value-1 problem are interreducible. However, the reduction from big-O to Value-1 does not entail decidability for subclasses of weighted automata (such as those with bounded languages), as the image of these subclasses does not fall into the known decidable fragments of the Value-1 problem. Further details are available in Appendix D.1.

Towards decidability results, we identify a simple necessary (but insufficient) condition for $s$ being big-O of $s'$.

Definition 9 (LC condition). A weighted automaton $\mathcal{W} = \langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$ satisfy the language containment condition (LC) if for all words $w$ with $f_s(w) > 0$ we also have $f_{s'}(w) > 0$. Equivalently, $L_s(\mathcal{W}) \subseteq L_{s'}(\mathcal{W})$.

The condition can be verified by constructing NFA $\mathcal{N}_s(\mathcal{W})$, $\mathcal{N}_{s'}(\mathcal{W})$ that accept $L_s(\mathcal{W})$ and $L_{s'}(\mathcal{W})$ respectively and verifying $L(\mathcal{N}_s(\mathcal{W})) \subseteq L(\mathcal{N}_{s'}(\mathcal{W}))$.

Remark 10.

Recall that NFA language containment is NL-complete if the automata are in fact deterministic, in P if they are unambiguous [10, Theorem 3], coNP-complete if they are unary [44] and PSPACE-complete in general [35]. In all cases this complexity level will match, or be lower than, that for our respective algorithm for the big-O problem.

We observe that, if $s$ is big-O of $s'$, the LC condition must hold, and so checking the LC condition is the first step in each of our verification routines. Example 11 shows that the condition alone is not sufficient to solve the big-O problem, because two states can admit the same set of words with non-zero weight, yet the weight ratios become unbounded.

Example 11.

Consider the unary automaton $\mathcal{W}$ in Figure 1. We have $L_s(\mathcal{W}) = L_{s'}(\mathcal{W}) = \{a^n \mid n \ge 1\}$, but

$$\frac{f_s(a^n)}{f_{s'}(a^n)} = \frac{(0.75)^{n-1} \cdot 0.25}{(0.5)^{n-1} \cdot 0.5} = 0.5 \cdot 1.5^{n-1} \xrightarrow[n \to \infty]{} \infty.$$

Remark 12.

The original big-O notation on $f, g : \mathbb{N} \to \mathbb{N}$ states that $f$ is $O(g)$ if $\exists C, k > 0 \ \forall n > k : f(n) \le C\, g(n)$. Despite excluding finitely many points, when $g(n) \ge 1$, it is equivalent to $\exists C > 0 \ \forall n > 0 : f(n) \le C\, g(n)$, by taking $C$ large enough to deal with the finite prefix. In the paper, though, we formally consider $s$ to not be big-O of $s'$ if there exists even a single word $w$ such that $f_s(w) > 0$ and $f_{s'}(w) = 0$. However, for weighted automata, we could amend our definition to "eventually big-O" as follows: $\exists C > 0, k > 0 \ \forall w \in \Sigma^{\ge k} : f_s(w) \le C \cdot f_{s'}(w)$.

The big-O problem reduces to its eventual variant by checking both the LC condition and the eventually big-O condition. Thus our undecidability (and hardness) results transfer to the eventually big-O problem. The eventually big-O problem can be solved via the big-O problem by "fixing" the LC condition through the addition of a branch from $s'$ that accepts all appropriate words with very low probability (see Appendix E for more details).

In this section, we prove the first decidability result, that is, polynomial-time solvability in the unambiguous case. We say a weighted automaton $\mathcal{W}$ is unambiguous from a state $s$ if every word has at most one accepting path in $\mathcal{N}_s(\mathcal{W})$.

Lemma 13.

If a weighted automaton $\mathcal{W}$ is unambiguous from states $s$ and $s'$, the big-O problem is decidable in polynomial time.

Proof sketch (see Appendix E.1). We construct a product weighted automaton, with edge weights of the form $M(a)((q_1, q_1'), (q_2, q_2')) = \frac{M(a)(q_1, q_2)}{M(a)(q_1', q_2')}$, and ask if there is a cycle on a path from $(s, s')$ to $(t, t)$ with weight $> 1$, which can be detected in polynomial time using a variation on the Bellman-Ford algorithm.
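The cycle test in the proof sketch can be illustrated as follows. This is a minimal sketch under stated assumptions: the automaton is unambiguous from both states, the LC condition has already been checked, and a longest-path relaxation on exact fractions stands in for the "variation on the Bellman-Ford algorithm", whose details are in Appendix E.1 and are not reproduced here. The encoding of $M$, the name `ratio_unbounded`, and the state name `u` (standing for $s'$) are hypothetical; the example weights (self-loops 3/4 on $s$ and 1/2 on $s'$) are our reading of the automaton in Example 11.

```python
from fractions import Fraction

def ratio_unbounded(M, finals, s, s2):
    """True if some cycle of product weight > 1 lies on a path from (s, s2)
    to a pair of final states in the product automaton."""
    # Product edges: both components read the same letter; weight is the ratio.
    edges = [((p, p2), (q, q2), w / w2)
             for trans in M.values()
             for (p, q), w in trans.items()
             for (p2, q2), w2 in trans.items()]

    def closure(seeds, neighbours):
        seen, stack = set(seeds), list(seeds)
        while stack:
            u = stack.pop()
            for v in neighbours(u):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    start = (s, s2)
    forward = closure([start], lambda u: [v for x, v, _ in edges if x == u])
    backward = closure([(f, f2) for f in finals for f2 in finals],
                       lambda u: [x for x, v, _ in edges if v == u])
    useful = forward & backward  # pairs lying on some jointly accepting path
    if start not in useful:
        return False
    live = [e for e in edges if e[0] in useful and e[1] in useful]

    best = {start: Fraction(1)}  # best[v]: largest ratio found on a walk to v

    def relax():
        changed = False
        for u, v, w in live:
            if u in best and (v not in best or best[u] * w > best[v]):
                best[v] = best[u] * w
                changed = True
        return changed

    # After |V| - 1 rounds all simple paths are settled; any further
    # improvement must come from a cycle whose weight exceeds 1.
    for _ in range(len(useful) - 1):
        if not relax():
            return False
    return relax()

# Assumed weights for the unary automaton of Example 11 ('u' plays the role of s').
M_fig1 = {"a": {("s", "s"): Fraction(3, 4), ("s", "t"): Fraction(1, 4),
                ("u", "u"): Fraction(1, 2), ("u", "t"): Fraction(1, 2)}}
print(ratio_unbounded(M_fig1, {"t"}, "s", "u"))
print(ratio_unbounded(M_fig1, {"t"}, "u", "s"))
```

Under these assumed weights the product state $(s, s')$ carries a self-loop of weight $(3/4)/(1/2) = 3/2 > 1$, so the first call reports an unbounded ratio, in line with Example 11, while in the reverse direction the loop weight is $2/3$ and the ratio stays bounded.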

Note that the relevant behaviours are those on cycles: transitions which are taken at most once are of little significance to the big-O problem. Such transitions have at most a constant multiplicative effect on the ratio. This is the case whether or not the system is unambiguous.

coNP-complete

In this section we show coNP-completeness in the unary case.

Theorem 14.

The big-O problem for unary weighted automata is coNP-complete. It is coNP-hard even for unary labelled Markov chains.

For the upper bound, our analysis will refine the analysis of the growth of powers of non-negative matrices of Friedland and Schneider [18, 40], which gives the asymptotic order of growth of $A^n_{s,t} + A^{n+1}_{s,t} + \cdots + A^{n+q}_{s,t} \approx \rho^n n^k$ for some $\rho$, $k$ and $q$, which smooths over the periodic behaviour (see Theorem 18). Our results require a non-smoothed analysis, valid for each $n$. This isn't provided in [18, 40], where the smoothing forces the existence of a single limit, which we don't require. Our big-Θ lemma (Lemma 21) will accurately characterise the asymptotic behaviour of $A^n_{s,t}$ by exhibiting the correct value of $\rho$ and $k$ for every word.

Let $\mathcal{W}$ be a unary non-negative weighted automaton with states $Q$, transition matrix $A$ and a unique final state $t$. When we refer to a path in $\mathcal{W}$, we mean a path in the NFA of $\mathcal{W}$, i.e. paths only use transitions with non-zero weights and states on a path may repeat.

Definition 15.

A state $q$ can reach $q'$ if there is a path from $q$ to $q'$. In particular, any state $q$ can always reach itself.

A strongly connected component (SCC) $\varphi \subseteq Q$ is a maximal set of states such that for each $q, q' \in \varphi$, $q$ can reach $q'$. We denote by $SCC(q)$ the SCC of state $q$ and by $A_\varphi$ the $|\varphi| \times |\varphi|$ transition matrix of $\varphi$. Note every state is in an SCC, even if it is a singleton.

The DAG of $\mathcal{W}$ is the directed acyclic graph of strongly connected components. Components $\varphi, \varphi'$ are connected by an edge if there exist $q \in \varphi$ and $q' \in \varphi'$ with $A(q, q') > 0$.

The spectral radius of an $m \times m$ matrix $A$ is the largest absolute value of its eigenvalues. Recall the eigenvalues of $A$ are $\{\lambda \in \mathbb{C} \mid \text{there exists a vector } \vec{x} \in \mathbb{C}^m, \vec{x} \ne 0, \text{ with } A\vec{x} = \lambda\vec{x}\}$.

The spectral radius of $\varphi$, denoted by $\rho_\varphi$, is the spectral radius of $A_\varphi$. By $\rho(q)$ we denote the spectral radius of the SCC in which $q$ is a member.

We denote by $T_\varphi$ the period of the SCC $\varphi$: the greatest common divisor of return times for some state $s \in \varphi$, i.e. $\gcd\{t \in \mathbb{N} \mid A^t(s, s) > 0\}$. It is known that any choice of state in the SCC gives the same value (see e.g. [42, Theorem 1.20]). If $A_\varphi = [0]$ then $T_\varphi = 0$.

Let $P(s, s')$ be the set of paths from the SCC of $s$ to the SCC of $s'$ in the DAG of $\mathcal{W}$. Thus a path $\pi \in P(s, s')$ is a sequence of SCCs $\varphi_1, \ldots, \varphi_m$. $T(s, s')$, called the local period between $s$ and $s'$, is defined by $T(s, s') = \operatorname{lcm}_{\pi \in P(s, s')} \gcd_{\varphi \in \pi} T_\varphi$.

The spectral radius between states $s$ and $s'$, written $\rho(s, s')$, is the largest spectral radius of any SCC seen on a path from $s$ to $s'$: $\rho(s, s') = \max_{\pi \in P(s, s')} \rho(\pi)$, where $\rho(\pi) = \max_{\varphi \in \pi} \rho_\varphi$ for $\pi \in P(s, s')$.

The following function captures the number of SCCs which attain the largest spectral radius on the path that has the most SCCs of maximal spectral radius. Let $k(s, s') = \max_{\pi \in P(s, s')} k(\pi) - 1$, where, for $\pi \in P(s, s')$, $k(\pi) = |\{\varphi \in \pi \mid \rho_\varphi = \rho(s, s')\}|$.

Remark 16.

Since our weighted automata have rational weights, the spectral radius of an SCC is an algebraic number, as the absolute value of a root of a polynomial with rational coefficients. In general, an algebraic number $z \in \mathbb{A}$ can be represented by a tuple $(p_z, a, b, r) \in \mathbb{Q}[x] \times \mathbb{Q}^3$, where $p_z$ is a polynomial over $x$ and $a, b, r$ specify an approximation to distinguish $z$ from all other roots: $z$ is the only root of $p_z(x)$ with $|z - (a + bi)| \le r$. This representation, which admits standard operations (addition, multiplication, absolute value, (in)equality testing, etc.), can be found in polynomial time (see, e.g. [36]). Henceforth, when we refer to the spectral radius we will implicitly mean representation in this form.

The asymptotic behaviours of weighted automata will be characterised using $(\rho, k)$-pairs:

Definition 17. A $(\rho, k)$-pair is an element of $\mathbb{R} \times \mathbb{N}$. The ordering on $\mathbb{R} \times \mathbb{N}$ is lexicographic, i.e. $(\rho_1, k_1) \le (\rho_2, k_2) \iff \rho_1 < \rho_2 \lor (\rho_1 = \rho_2 \land k_1 \le k_2)$.

Friedland and Schneider [18, 40] essentially use $(\rho, k)$-pairs to show the asymptotic behaviour of the powers of non-negative matrices. In particular they find the asymptotic behaviour of the sum of several $A^n_{s,s'}$, smoothing the periodic behaviour of the matrix.

Theorem 18 (Friedland and Schneider [18, 40]). Let $A$ be an $m \times m$ non-negative matrix, inducing a unary weighted automaton $\mathcal{W}$ with states $Q = \{1, \ldots, m\}$. Given $s, t \in Q$, let $B^n_{s,t} = A^n_{s,t} + A^{n+1}_{s,t} + \cdots + A^{n+T(s,t)-1}_{s,t}$. Then

$$\lim_{n \to \infty} \frac{B^n_{s,t}}{\rho(s,t)^n \, n^{k(s,t)}} = c, \qquad 0 < c < \infty.$$

[Figure 2: Different rates for different phases.]
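As a small sanity check of the growth rate $\rho^n n^k$ in Theorem 18, the sketch below uses a hypothetical two-state unary automaton (our choice, not from the paper) in which both states carry a self-loop of weight 1/2 and $s$ moves to $t$ with weight 1/2. Both SCCs are singletons with spectral radius 1/2 and period 1, so by Definition 15 we get $\rho(s,t) = 1/2$, $k(s,t) = 1$ and $T(s,t) = 1$, and the theorem predicts that $A^n_{s,t} / (\rho^n n^k)$ converges to a positive constant.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Standard matrix product of two square matrices given as lists of rows."""
    m = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(m)) for j in range(m)]
            for i in range(m)]

half = Fraction(1, 2)
A = [[half, half],          # row of s: self-loop 1/2, edge to t 1/2
     [Fraction(0), half]]   # row of t: self-loop 1/2
rho, k = half, 1            # values from Definition 15 for this automaton

power = A                   # holds A^n, starting with n = 1
ratios = []
for n in range(1, 21):
    ratios.append(power[0][1] / (rho ** n * n ** k))
    power = mat_mul(power, A)
print(ratios[:5])
```

For this automaton $A^n_{s,t} = n \cdot (1/2)^n$ exactly, so every ratio equals 1 and the limit constant is $c = 1$.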

In the case where the local period is 1 ($T(s, t) = T(s', t) = 1$), Theorem 18 can already be used to solve the big-O problem (in particular if the matrix $A$ is aperiodic). In this case $A^n_{s,t} = B^n_{s,t} = \Theta(\rho(s,t)^n n^{k(s,t)})$. Then to establish that $s$ is big-O of $s'$ we check that the language containment condition holds and that $(\rho(s,t), k(s,t)) \le (\rho(s',t), k(s',t))$. However, this is not sufficient if the local period is not 1.

Example 19.

Consider the chains shown in Figure 2 with local period 2. The behaviourfor n ≥ A ns,t = Θ(0 . n n ) and A ns ,t = Θ(0 . n ) when n is odd and A ns ,t = Θ(0 . n n )when n is even. However, Theorem 18 tells us B ns,t = Θ(0 . n n ) and B ns ,t = Θ(0 . n n )suggesting the ratio is bounded, but in fact s is not big-O s (although s is big-O of s )because A n +1 s,t A n +1 s ,t −−−−→ n →∞ ∞ . coNP Let W be a unary weighted automaton and suppose we are asked whether s is big-O of s .We assume w.l.o.g. (a) that there is a unique ﬁnal state t with no outgoing transitions, and(b) that s, s do not appear on any cycle .Next we deﬁne a ‘degree function’, which captures the asymptotic behaviour of each word a n by a ( ρ, k )-pair, capturing the exponential and polynomial behaviours respectively. (cid:73) Deﬁnition 20.

Given a unary weighted automaton $\mathcal{W}$, let $d_{s,t} : \mathbb{N} \to \mathbb{R} \times \mathbb{N}$ be defined by $d_{s,t}(n) = (\rho, k)$, where:
$\rho$ is the largest spectral radius of any vertex visited on any path of length $n$ from $s$ to $t$;
the path from $s$ to $t$ that visits the most SCCs of spectral radius $\rho$ visits $k + 1$ such SCCs;
if there is no length-$n$ path from $s$ to $t$, then $(\rho, k) = (0, 0)$.

Let $s, t \in Q$ be fixed. We are now ready to state the key technical lemma of this subsection (cf. Theorem 18, Friedland and Schneider [18, 40]), where we assume the functions $\rho(n), k(n)$, defined by $d_{s,t}(n) = (\rho(n), k(n))$.

Lemma 21 (The big-Θ lemma). There exist $c, C > 0$ such that, for every $n > |Q|$,

$$c \cdot \rho(n)^n n^{k(n)} \le A^n_{s,t} \le C \cdot \rho(n)^n n^{k(n)}.$$

(Footnote to the assumption that $s, s'$ do not appear on any cycle: if this is not the case, copies of $s, s'$ and their transitions can be taken.)

The set of admissible $(\rho, k)$-pairs is the image of $d_{s,t}$. Observe that this set is finite and of size at most $|Q|^2$: there can be no more than $|Q|$ values of $\rho$ (if at worst each state were its own SCC) and the value of $k$ is also bounded by the number of SCCs and thus $|Q|$.

We next define the $(\rho, k)$-annotated version of $\mathcal{W}$, i.e. in each state we record the relevant value of $(\rho, k)$ corresponding to the current run to the state.

Definition 22 (The weighted automaton $\mathcal{W}^\dagger$). Given $\mathcal{W} = \langle Q, \Sigma, A, \{t\} \rangle$ and $s \in Q$, the weighted automaton $\mathcal{W}^\dagger$ has states of the form $(q, \rho, k)$ for all $q \in Q$ and all admissible $(\rho, k)$-pairs, the same $\Sigma$ and no final states. For every transition $q \xrightarrow{p} q'$ from $\mathcal{W}$, denoting $A(q, q') = p$, include the following transition in $\mathcal{W}^\dagger$ for each admissible $(\rho, k)$:
$(q, \rho, k) \xrightarrow{p} (q', \rho, k)$ if $SCC(q) = SCC(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho, k + 1)$ if $SCC(q) \ne SCC(q')$ and $\rho = \rho(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho, k)$ if $SCC(q) \ne SCC(q')$ and $\rho > \rho(q')$,
$(q, \rho, k) \xrightarrow{p} (q', \rho(q'), 0)$ if $SCC(q) \ne SCC(q')$ and $\rho(q') > \rho$.

$\mathcal{W}^\dagger$ is constructible in polynomial time given $\mathcal{W}$. Indeed, the spectral radii of all SCCs can be computed and compared to each other in time polynomial in the size of $\mathcal{W}$ (see Remark 16).

For the following lemma, recall the language containment (LC) condition from Definition 9 and the ordering on $(\rho, k)$-pairs from Definition 17.

Lemma 23.

A state $s$ is big-O of $s'$ if and only if the LC condition holds and, for all but finitely many $n \in \mathbb{N}$, we have $d_{s,t}(n) \le d_{s',t}(n)$.

Proof sketch.

Write $d_{s,t}(n) = (\rho, k)$ and $d_{s',t}(n) = (\rho', k')$. Whenever $d_{s,t}(n) \le d_{s',t}(n)$, by Lemma 21, we have $f_s(a^n) \le \left( \frac{C}{c} \left( \frac{\rho}{\rho'} \right)^n n^{k - k'} \right) \cdot f_{s'}(a^n)$, in which case either $d_{s,t}(n) = d_{s',t}(n)$ and $\left( \frac{\rho}{\rho'} \right)^n n^{k - k'} = 1$, or $\lim_{n \to \infty} \left( \frac{\rho}{\rho'} \right)^n n^{k - k'} = 0$ and so $\left( \frac{\rho}{\rho'} \right)^n n^{k - k'} \le 1$ for all sufficiently large $n$.

However, whenever $d_{s,t}(n) > d_{s',t}(n)$, Lemma 21 yields $f_s(a^n) \ge \left( \frac{c}{C} \left( \frac{\rho}{\rho'} \right)^n n^{k - k'} \right) \cdot f_{s'}(a^n)$, but then $\lim_{n \to \infty} \left( \frac{\rho}{\rho'} \right)^n n^{k - k'} = \infty$. ◀

We are going to use the characterisation from Lemma 23 to prove Theorem 14. As already discussed, the LC condition can be checked via NFA inclusion testing. To tackle the "for all but finitely many ..." condition, we introduce the concept of eventual inclusion.
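The growth bound of Lemma 21, on which the proof sketch above relies, can be observed numerically on a small instance. The following is a minimal sketch, not taken from the paper: the four-state automaton, its weights and the names `path_weights`, `RHO`, `K` are all illustrative assumptions. The path from $s$ to $t$ passes through two SCCs of spectral radius 0.5, so $(\rho, k) = (0.5, 1)$, and the normalised weight $A^n_{s,t} / (\rho^n n^k)$ should stay between two positive constants for $n > |Q|$.

```python
def path_weights(matrix, source, target, max_len):
    """Return [A^n[source][target] for n = 0..max_len] via repeated vector-matrix products."""
    size = len(matrix)
    row = [1.0 if q == source else 0.0 for q in range(size)]
    weights = []
    for _ in range(max_len + 1):
        weights.append(row[target])
        row = [sum(row[p] * matrix[p][q] for p in range(size)) for q in range(size)]
    return weights


# Hypothetical unary weighted automaton: s -> q1 -> q2 -> t,
# with self-loops of weight 0.5 on q1 and on q2 (two SCCs of spectral radius 0.5).
S, Q1, Q2, T = range(4)
A = [
    [0.0, 1.0, 0.0, 0.0],  # s  -> q1
    [0.0, 0.5, 0.5, 0.0],  # q1 -> q1, q1 -> q2
    [0.0, 0.0, 0.5, 0.5],  # q2 -> q2, q2 -> t
    [0.0, 0.0, 0.0, 0.0],  # t has no outgoing transitions
]

RHO, K = 0.5, 1  # two SCCs of radius 0.5 are visited, so k + 1 = 2
WEIGHTS = path_weights(A, S, T, 60)


def normalised(n):
    """A^n_{s,t} divided by rho^n * n^k; bounded above and below for n > |Q|."""
    return WEIGHTS[n] / (RHO ** n * n ** K)


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 60):
        print(n, normalised(n))
```

For this particular automaton the weight has the closed form $A^n_{s,t} = (n - 2) \cdot 0.5^{n-1}$, so the normalised value is $2(n-2)/n$, which lies between 1.2 and 2 for all $n \ge 5$.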

Definition 24. Given sets $A, B$, we say $A$ is eventually included in $B$, written $A \mathrel{\widetilde{\subset}} B$, if and only if $A \setminus B$ is finite.

The next three lemmas relate deciding the big-O problem using the characterisation of Lemma 23 to eventual inclusion. The missing proofs are available in the Appendix.
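To make eventual inclusion concrete for unary languages, here is a minimal sketch of a direct check. It is not the coNP procedure of Lemma 25: it uses the subset construction, which may take exponential time, and the NFA encoding `(delta, initial, finals)` is an assumption made for this illustration. Since a unary NFA's reachable-subset sequence is ultimately periodic, $L(N_1) \setminus L(N_2)$ is finite exactly when no position on the joint cycle is accepted by $N_1$ and rejected by $N_2$.

```python
def eventually_included(nfa_a, nfa_b):
    """Decide whether L(nfa_a) \\ L(nfa_b) is finite for unary NFAs.

    Each NFA is a triple (delta, initial, finals), where delta maps a state
    to the set of its successors on the single letter.
    """
    def step(delta, states):
        return frozenset(q2 for q in states for q2 in delta.get(q, ()))

    (delta_a, init_a, fin_a), (delta_b, init_b, fin_b) = nfa_a, nfa_b
    current = (frozenset(init_a), frozenset(init_b))
    first_seen = {}
    trace = []
    # The pair of reachable subsets evolves deterministically, so it must repeat.
    while current not in first_seen:
        first_seen[current] = len(trace)
        trace.append(current)
        current = (step(delta_a, current[0]), step(delta_b, current[1]))
    cycle = trace[first_seen[current]:]
    # A violation on the cycle recurs for infinitely many word lengths.
    return all(not (sub_a & fin_a) or bool(sub_b & fin_b) for sub_a, sub_b in cycle)


# Hypothetical examples: all lengths, lengths >= 3, even lengths.
ALL = ({0: {0}}, {0}, {0})
AT_LEAST_3 = ({0: {1}, 1: {2}, 2: {3}, 3: {3}}, {0}, {3})
EVEN = ({0: {1}, 1: {0}}, {0}, {0})
```

Here `eventually_included(ALL, AT_LEAST_3)` holds even though plain inclusion fails, because only the lengths 0, 1 and 2 are missing from the second language.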

Lemma 25. Given unary NFAs $N_1, N_2$, the problem $L(N_1) \mathrel{\widetilde{\subset}} L(N_2)$ is in coNP.

Lemma 26. Suppose $d_1, d_2 : \mathbb{N} \to X$, with $(X, \le)$ a finite total order. Then $d_1(n) \le d_2(n)$ for all but finitely many $n$ if and only if $\{ n \mid d_1(n) \ge x \} \mathrel{\widetilde{\subset}} \{ n \mid d_2(n) \ge x \}$ for all $x \in X$.

Lemma 27. Given a unary weighted automaton $W$, the associated problem whether $d_{s,t}(n) \le d_{s',t}(n)$ for all but finitely many $n \in \mathbb{N}$ is in coNP.

Proof.

Given an admissible pair $x = (\rho, k)$, we construct an NFA $N_{s,x}$ accepting $\{ a^n \mid d_{s,t}(n) \ge x \}$ (similarly $N_{s',x}$ for $s'$), by taking the NFA $N_s(W^\dagger)$ (Definitions 2, 22) with a suitable choice of accepting states. Recall that states in $W^\dagger$ are of the form $(q, \rho', k')$, where $q$ is a state from $W$ and $(\rho', k')$ is admissible. If we designate states $(t, \rho', k')$ with $(\rho', k') \ge x$ as accepting, it will accept $\{ a^n \mid d_{s,t}(n) \ge x \}$. This is a polynomial-time construction.

Then, by Lemma 26, the problem whether $d_{s,t}(n) \le d_{s',t}(n)$ for all but finitely many $n \in \mathbb{N}$ is equivalent to $L(N_{s,x}) \mathrel{\widetilde{\subset}} L(N_{s',x})$ for all admissible $x$. As there are at most $|Q|^2$ values of $x$ and each can be verified non-deterministically in coNP, it suffices to show that $L(N_{s,x}) \mathrel{\widetilde{\subset}} L(N_{s',x})$ is in coNP for each $x$. This is the case by Lemma 25. ◀

Remark 10 and Lemma 27 together complete the upper bound result for Theorem 14.
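The equivalence of Lemma 26, used in the proof above, can be illustrated on ultimately periodic functions. The sketch below is an illustration under assumptions, not part of the paper's algorithm: a function $\mathbb{N} \to X$ is encoded as a pair `(prefix, cycle)` of lists, and $X$ is taken to be Python tuples $(\rho, k)$, which compare lexicographically. Both the direct pointwise test and the threshold-set test only need to inspect one joint period past both prefixes.

```python
from math import gcd


def value_at(seq, n):
    """Value at position n of an ultimately periodic sequence (prefix, cycle)."""
    prefix, cycle = seq
    return prefix[n] if n < len(prefix) else cycle[(n - len(prefix)) % len(cycle)]


def tail_pairs(seq1, seq2):
    """Pairs (d1(n), d2(n)) over one joint period after both prefixes have ended."""
    start = max(len(seq1[0]), len(seq2[0]))
    period = len(seq1[1]) * len(seq2[1]) // gcd(len(seq1[1]), len(seq2[1]))
    return [(value_at(seq1, n), value_at(seq2, n)) for n in range(start, start + period)]


def eventually_leq(seq1, seq2):
    """d1(n) <= d2(n) for all but finitely many n."""
    return all(a <= b for a, b in tail_pairs(seq1, seq2))


def eventually_leq_by_thresholds(seq1, seq2, domain):
    """For every x: {n | d1(n) >= x} is eventually included in {n | d2(n) >= x}."""
    pairs = tail_pairs(seq1, seq2)
    return all(all(b >= x for a, b in pairs if a >= x) for x in domain)


# Hypothetical (rho, k)-valued functions; D1 exceeds D2 only at n = 0.
D1 = ([(0.9, 0)], [(0.5, 0), (0.5, 1)])
D2 = ([], [(0.5, 1), (0.5, 1)])
DOMAIN = {(0.9, 0), (0.5, 0), (0.5, 1)}
```

On these inputs the two tests agree in both directions, as Lemma 26 predicts: `D1` is eventually below `D2`, while `D2` is not eventually below `D1`.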

Remark. Lemma 26 may appear simpler using $\{ n \mid d_1(n) = x \} \mathrel{\widetilde{\subset}} \{ n \mid d_2(n) \ge x \}$. However, it does not seem possible to construct an NFA for $\{ a^n \mid d_{s,t}(n) = x \}$ in polynomial time. Taking just $(t, \rho, k)$ as accepting would not be correct, as there could be paths of the same length ending in $(t, \rho', k')$ with $(\rho', k') > (\rho, k)$. Using $\ge$ instead of $=$ avoids this problem.

Remark.

An alternative approach for obtaining an upper bound could be to compute the Jordan normal form of the transition matrix and consider its powers. Instead of the interplay of strongly connected components in the transition graph, we would need to consider linear combinations of the $n$th powers of complex numbers (such as roots of unity). It is not clear that this algebraic approach leads to a representation more convenient for our purposes.

coNP-hardness for unary LMC

Given a unary NFA $N$, the NFA universality problem asks if $L(N) = \{ a^n \mid n \in \mathbb{N} \}$. This problem is coNP-complete [44]. We exhibit a polynomial-time reduction from (a variant of) the unary universality problem to the big-O problem on unary Markov chains.

In this section we consider the big-O problem for a weighted automaton $W$ and states $s, s'$ such that $L_s(W)$, $L_{s'}(W)$ are bounded. Throughout the section, we assume that the LC condition has already been checked, i.e. $L_s(W) \subseteq L_{s'}(W)$. We will show that the problem is conditionally decidable, subject to Schanuel's conjecture.

Logical theories of arithmetic and Schanuel's conjecture. In first-order logical theories of arithmetic, variables denote numbers (from $\mathbb{Z}$ or $\mathbb{R}$, as appropriate), and atomic predicates are equalities and inequalities between terms built from variables and function symbols. Nullary function symbols are constants, always from $\mathbb{Z}$. If binary addition and multiplication are available, then: for $\mathbb{R}$ we obtain the first-order theory of the reals, where the truth value of sentences is decidable due to the celebrated Tarski–Seidenberg theorem [3, Chapter 11 and Theorem 2.77]; for $\mathbb{Z}$, the first-order theory of the integers is, in contrast, undecidable (see, e.g., [39]).

In the case of $\mathbb{R}$, adding the unary symbol for the exponential function $x \mapsto e^x$ leads to the first-order theory of the real numbers with exponential function ($\mathrm{Th}(\mathbb{R}_{\exp})$). Logarithms base 2, for example, are easily expressible in $\mathrm{Th}(\mathbb{R}_{\exp})$. The decidability of $\mathrm{Th}(\mathbb{R}_{\exp})$ is an open problem and hinges upon Schanuel's conjecture [30].

Schanuel's conjecture [29] is a unifying conjecture of transcendental number theory, saying that for all $z_1, \dots, z_n \in \mathbb{C}$ linearly independent over $\mathbb{Q}$ the field extension $\mathbb{Q}(z_1, \dots, z_n, e^{z_1}, \dots, e^{z_n})$ has transcendence degree at least $n$ over $\mathbb{Q}$, meaning that for some $S \subseteq \{ z_1, \dots, z_n, e^{z_1}, \dots, e^{z_n} \}$ of cardinality $n$, say $S = \{ s_1, \dots, s_n \}$, the only polynomial $p$ over $\mathbb{Q}$ satisfying $p(s_1, \dots, s_n) = 0$ is $p \equiv 0$. See, e.g., Waldschmidt's book [48, Section 1.4] for further context. If indeed true, this conjecture would generalise several known results, including the Lindemann–Weierstrass theorem and Baker's theorem, and would entail the decidability of $\mathrm{Th}(\mathbb{R}_{\exp})$. Our work follows an exciting line of research that reduces problems from verification [12, 31], linear dynamical systems [2, 8], and symbolic computation [24] to the decision problem for $\mathrm{Th}(\mathbb{R}_{\exp})$.

Figure 3: Relative orderings are the same, but the boundedness question is different.

Theorem 28.

Given a weighted automaton $W = \langle Q, \Sigma, M, F \rangle$, $s, s' \in Q$, with $L_s(W)$ and $L_{s'}(W)$ bounded, it is decidable whether $s$ is big-O of $s'$, subject to Schanuel's conjecture.

In the unary case, it was sufficient to consider the relative order between spectral radii, with careful handling of the periodic behaviour. This approach is insufficient in the bounded case. Example 29 highlights that the actual values of the spectral radii have to be examined.

Example 29 (Relative orderings are insuﬃcient) . Consider the LMC in Figure 3, with 0 . ≤ p ≤ .

62. We have f s ( a m b n ) = Θ(0 . m . n ) and f s ( a m b n ) = Θ( p m . n + 0 . m . n ).Note that neither 0 . m . n nor p m . n dominate, nor are dominated by, 0 . m . n for anyvalue of 0 . ≤ p ≤ .

62. That is, there are values of m, n where 0 . m . n (cid:29) . m . n (in particular large n ) and values of m, n where 0 . m . n (cid:28) . m . n (in particular large m ); similarly for p m . n vs 0 . m . n (but the cases in which n or m needs to be large areswapped). However, the big-O status can be diﬀerent for diﬀerent values of p ∈ [0 . , . p = 0 .

62, the ratio turnsout to be bounded: f s ( a m b n ) f s ( a m b n ) ≤ for all m, n (in particular, maximal at m = n = 0). Incontrast, when p = 0 .

61, we have f s ( a m b . m ) f s ( a m b . m ) −−−−→ m →∞ ∞ .We ﬁrst prove Theorem 28 for the plus-letter-bounded case, which is the most technicallyinvolved; the other bounded cases will be reduced to it. In the plus-letter-bounded case, wewill characterise the behaviour of such automata, generalising ( ρ, k )-pairs of the unary case.We will need to rely upon the ﬁrst-order theory of the reals with exponentials to comparethese behaviours. We assume L s ( W ) ⊆ a +1 · · · a + m , where a , · · · , a n ∈ Σ and because the LC condition holds, wealso have L s ( W ) ⊆ a +1 · · · a + m . In the plus-letter-bounded cases, without loss of generality, weassume a i = a j for i = j (see Appendix G for a justiﬁcation). Then any word w = a n . . . a n m m is uniquely speciﬁed by a vector ( n , . . . , n m ) ∈ N m> , where n i is the number of a i ’s in w . . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 13 Like in Deﬁnition 20, we deﬁne a degree function d , which will be used to study theasymptotic behaviour of words. This time we will associate a separate ( ρ, k ) pair to each of the m characters and, consequently, words will induce sequences of the form ( ρ , k ) · · · ( ρ m , k m ).Further, as there may be multiple, incomparable behaviours, words will induce sets ofsuch sequences, i.e. d : N m → P (( R × N ) m ). For the sake of comparisons, it will be convenientto focus on maximal elements with respect to the pointwise order on ( R × N ) m , written ≤ ,where the lexicographic order (recall Deﬁnition 17) is used to compare elements of R × N .Recall Lemma 21 does not capture the asymptotics when n ≤ | Q | . In the unary case thisis inconsequential as small words are covered by the ﬁnitely many exceptions and the LCcondition. However, here, a small number of one character may be used to enable accessto a particular part of the automaton in another character. 
For this case, we introduce anew number δ = min ϕ : ρ ϕ > ρ ϕ which is strictly smaller than the spectral radius of everynon-zero SCC (so will not dominate with the partial order), but non-zero. (cid:73) Deﬁnition 30.

Let $\hat{\rho} = (\rho_1, k_1), \cdots, (\rho_m, k_m) \in (\mathbb{R} \times \mathbb{N})^m$. An $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ (to the final state) is compatible with $\hat{\rho}$ if, for each $i = 1, \dots, m$, it visits $k_i + 1$ SCCs with spectral radius $\rho_i$ while reading $a_i$, unless the path visits only singletons with no loops, in which case $(\rho_i, k_i) = (\delta, 0)$. The notation $(\rho, k) \in \hat{\rho}$ is used for '$(\rho, k)$ is an element of $\hat{\rho}$'.

Definition 31.

Let $d_s : \mathbb{N}^m \to \mathcal{P}((\mathbb{R} \times \mathbb{N})^m)$ be such that $\hat{\rho} \in d_s(n_1, \dots, n_m)$ if and only if (1) there exists an $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ to the final state compatible with $\hat{\rho}$, and (2) for every $a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}$-labelled path from $s$ compatible with $\hat{\sigma}$ such that $\hat{\rho} \le \hat{\sigma}$, we have $\hat{\rho} = \hat{\sigma}$.

Observe that $\hat{\rho}$ may range over at most $|Q|^{2m}$ possible values. We write $D$ for the set containing them, so that $d_s : \mathbb{N}^m \to \mathcal{P}(D)$. In this extended setting, the big-Θ lemma (Lemma 21) may be generalised as follows.

Lemma 32.

Denote
\[ z(n_1, \cdots, n_m) = \sum_{\hat{\rho} \in d_s(n_1, \dots, n_m)} \; \prod_{(\rho_i, k_i) \in \hat{\rho}} \rho_i^{n_i} \cdot n_i^{k_i} . \]
There exist $c, C > 0$ such that for all $n_1, \dots, n_m \in \mathbb{N}$:
\[ c \cdot z(n_1, \cdots, n_m) \;\le\; f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) \;\le\; C \cdot z(n_1, \cdots, n_m) . \]

The following lemma provides the key characterisation of negative instances of the big-O problem, in the plus-letter-bounded case and assuming the LC condition. Here and below, we write $n(t)$ to refer to the $t$th vector in a sequence $n : \mathbb{N} \to \mathbb{N}^m$.

Lemma 33 (Main lemma). Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exists a sequence $n : \mathbb{N} \to \mathbb{N}^m$ and $X \in D$, $\mathcal{Y} \subseteq D$ such that (a) $X \in d_s(n(t))$ and $\mathcal{Y} = d_{s'}(n(t))$ for all $t$, and (b) for all $j \in h_{\mathcal{Y}}$, the sequence $n$ satisfies
\[ \sum_{i=1}^{m} \alpha_{j,i} \, n(t)_i + p_{j,i} \log n(t)_i \xrightarrow[t \to \infty]{} -\infty , \]
where $h_{\mathcal{Y}} \subseteq \{ 1, \dots, |\mathcal{Y}| \}$, $\alpha_{j,i} \in \mathbb{R}$, $p_{j,i} \in \mathbb{Z}$ ($1 \le i \le m$) are uniquely determined by $X$ and $\mathcal{Y}$ (in a way detailed below), $h_{\mathcal{Y}}$ and the $p_{j,i}$'s are effectively computable and the $\alpha_{j,i}$'s are first-order expressible (with exponential function).

Proof.

Observe that $s$ is not big-O of $s'$ iff there exists an infinite sequence of words such that, for all $C > 0$, the sequence contains a word $w$ such that $\frac{f_s(w)}{f_{s'}(w)} > C$. Thanks to Lemma 32, this is equivalent to the existence of a sequence $n : \mathbb{N} \to \mathbb{N}^m$ such that
\[ \frac{\sum_{X \in d_s(n(t)_1, \dots, n(t)_m)} \prod_{(\rho_i, k_i) \in X} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}}{\sum_{Y \in d_{s'}(n(t)_1, \dots, n(t)_m)} \prod_{(\sigma_i, \ell_i) \in Y} \sigma_i^{n(t)_i} \cdot n(t)_i^{\ell_i}} \xrightarrow[t \to \infty]{} \infty , \]
where $n(t)_i$ denotes the $i$th component of $n(t)$. Since there are finitely many possible values of $d_s$ and $d_{s'}$, it suffices to look for sequences $n$ such that $d_s(n(t))$ and $d_{s'}(n(t))$ are fixed. Further, because of the sum in the numerator, only one $X \in \mathcal{X}$ is required such that $X \in d_s(n_1, \dots, n_m)$. Thus, we need to determine whether there exist $X \in D$, $\mathcal{Y} \subseteq D$ and $n : \mathbb{N} \to \mathbb{N}^m$ such that $X \in d_s(n(t))$, $d_{s'}(n(t)) = \mathcal{Y}$ (for all $t$) and
\[ \frac{\prod_{i=1}^{m} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}}{\sum_{j=1}^{|\mathcal{Y}|} \prod_{i=1}^{m} \sigma_{ji}^{n(t)_i} \cdot n(t)_i^{\ell_{ji}}} \xrightarrow[t \to \infty]{} \infty , \]
where $X = (\rho_1, k_1) \cdots (\rho_m, k_m)$, $\mathcal{Y} = \{ Y_1, \cdots, Y_{|\mathcal{Y}|} \}$, and $Y_j = (\sigma_{j1}, \ell_{j1}) \cdots (\sigma_{jm}, \ell_{jm})$ ($1 \le j \le |\mathcal{Y}|$). Taking the reciprocal and requiring each of the summands to go to zero, we obtain
\[ \frac{\prod_{i=1}^{m} \sigma_{ji}^{n(t)_i} \cdot n(t)_i^{\ell_{ji}}}{\prod_{i=1}^{m} \rho_i^{n(t)_i} \cdot n(t)_i^{k_i}} = \prod_{i=1}^{m} \left( \frac{\sigma_{ji}}{\rho_i} \right)^{n(t)_i} n(t)_i^{\ell_{ji} - k_i} \xrightarrow[t \to \infty]{} 0 \quad \text{for all } 1 \le j \le |\mathcal{Y}| . \]
If we take logarithms, letting $\alpha_{j,i} = \log\left( \frac{\sigma_{ji}}{\rho_i} \right)$ and $p_{j,i} = \ell_{ji} - k_i$, we get
\[ \sum_{i=1}^{m} \alpha_{j,i} \, n(t)_i + p_{j,i} \log n(t)_i \xrightarrow[t \to \infty]{} -\infty \]
for all $j$ in $h_{\mathcal{Y}} = \{ 1 \le j \le |\mathcal{Y}| \mid \sigma_{ji} > 0 \text{ for all } 1 \le i \le m \}$.

The number $\alpha_{j,i}$ is the logarithm of the ratio of two algebraic numbers, which are not given explicitly. However, they admit an unambiguous, first-order expressible characterisation (see Remark 16). The logarithm is encoded using the exponential function: $\log(z)$ is $\exists x \in \mathbb{R} : \exp(x) = z$. ◀

Lemma 33 identifies violation of the big-O property using two conditions. In the remainder of this subsection we will handle Condition (a) using automata-theoretic tools (the Parikh theorem and semi-linear sets) and Condition (b) using logics. In summary, the characterisation of Lemma 33 will be expressed in the first-order theory of the reals with exponentiation, which is decidable subject to Schanuel's conjecture.

Condition (a) via automata

It turns out that sequences $n$ satisfying Condition (a) in Lemma 33 can be captured by a finite automaton. In more detail, for any $X \in D$, there exists an automaton $N^s_X$ such that $L(N^s_X) = \{ a_1^{n_1} \cdots a_m^{n_m} \mid X \in d_s(n_1, \cdots, n_m) \}$. For any $\mathcal{Y} \subseteq D$, there exists an automaton $N^{s'}_{\mathcal{Y}}$ such that $L(N^{s'}_{\mathcal{Y}}) = \{ a_1^{n_1} \cdots a_m^{n_m} \mid d_{s'}(n_1, \cdots, n_m) = \mathcal{Y} \}$. The relevant automaton capturing $X$ and $\mathcal{Y}$ is then found by taking the intersection of $L(N^s_X)$ and $L(N^{s'}_{\mathcal{Y}})$.

Lemma 34.

For any $X \in D$ and $\mathcal{Y} \subseteq D$, there exists an automaton $N_{X,\mathcal{Y}}$ such that $L(N_{X,\mathcal{Y}}) = \{ a_1^{n_1} \cdots a_m^{n_m} \mid X \in d_s(n_1, \cdots, n_m), \; \mathcal{Y} = d_{s'}(n_1, \cdots, n_m) \}$.

Because of our $a_i \ne a_j$ assumption, the vector $(n_1, \cdots, n_m)$ indicates the number of occurrences of each character. The set of such vectors derived from the language of an automaton is known as the Parikh image of this language [37]. It is well known that the Parikh image of an NFA is a semi-linear set, i.e. a finite union of linear sets (a linear set has the form $\{ \vec{b} + \lambda_1 \vec{r}_1 + \cdots + \lambda_s \vec{r}_s \mid \lambda_1, \dots, \lambda_s \in \mathbb{N} \}$, where $\vec{b} \in \mathbb{N}^m$ is the base vector and $\vec{r}_1, \cdots, \vec{r}_s \in \mathbb{N}^m$ are called period vectors). However, since $L(N_{X,\mathcal{Y}}) \subseteq a_1^+ a_2^+ \dots a_m^+$, the linear sets are of a very particular form, where each $\vec{r}_i$ is a constant multiple of the $i$th unit vector.

Lemma 35.

The language of $N_{X,\mathcal{Y}}$ can be effectively decomposed as $L(N_{X,\mathcal{Y}}) = \bigcup_{k=1}^{S_{X,\mathcal{Y}}} L_k$, where
\[ L_k = \left\{ a_1^{b_{k1} + r_{k1} \lambda_1} \cdots a_m^{b_{km} + r_{km} \lambda_m} \;\middle|\; \lambda_1, \cdots, \lambda_m \in \mathbb{N} \right\} , \]
$S_{X,\mathcal{Y}} \in \mathbb{N}$ and $b_{ki}, r_{ki} \in \mathbb{N}$ ($1 \le k \le S_{X,\mathcal{Y}}$, $1 \le i \le m$).

Lemma 35 captures Condition (a) of Lemma 33 precisely.
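The special shape of the linear sets in Lemma 35 makes membership easy to test coordinate by coordinate. The following is a minimal sketch under assumptions: a component $L_k$ is encoded as a pair of tuples `(base, period)` standing for $(b_{k1}, \dots, b_{km})$ and $(r_{k1}, \dots, r_{km})$, and the function names are illustrative rather than taken from the paper.

```python
def in_linear_set(vector, base, period):
    """Check vector[i] == base[i] + period[i] * lam_i for some naturals lam_i."""
    for n, b, r in zip(vector, base, period):
        if n < b:
            return False
        if r == 0:
            # A zero period pins the coordinate to its base value.
            if n != b:
                return False
        elif (n - b) % r != 0:
            return False
    return True


def in_semilinear_set(vector, components):
    """Membership in a finite union of linear sets of the Lemma 35 shape."""
    return any(in_linear_set(vector, base, period) for base, period in components)


# Hypothetical decomposition with m = 2 letters and two components.
COMPONENTS = [
    ((1, 2), (2, 0)),  # words a^(1 + 2*l1) b^2
    ((3, 1), (0, 3)),  # words a^3 b^(1 + 3*l2)
]
```

For instance, the vector `(5, 2)` belongs to the first component (with $\lambda_1 = 2$) and `(3, 7)` to the second (with $\lambda_2 = 2$), while `(2, 2)` belongs to neither.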

Condition (b) via logic

With Lemma 35 in place, we now move on to add Condition (b) to the existing machinery. In fact, the logical formulae in the following lemmas will express the conjunction of both conditions of Lemma 33.

Lemma 36.

Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exist $X \in D$, $\mathcal{Y} \subseteq D$, $1 \le k \le S_{X,\mathcal{Y}}$ such that
\[ \forall C < 0 \;\; \exists \vec{\lambda} \in \mathbb{N}^m \;\; \bigwedge_{j \in h_{\mathcal{Y}}} \; \sum_{i=1}^{m} \alpha_{j,i} (b_{ki} + r_{ki} \lambda_i) + p_{j,i} \log(b_{ki} + r_{ki} \lambda_i) < C , \]
where $h_{\mathcal{Y}}, \alpha_{j,i}, p_{j,i}$ (resp. $b_{ki}, r_{ki}$) satisfy the same conditions as in Lemma 33 (resp. 35).

Note that the formula of Lemma 36 uses quantification over natural numbers. Our next step will be to replace integer variables with real variables. In other words, we will obtain an equivalent condition in the first-order theory of the reals with exponentiation, as follows.
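To see what the inner part of the Lemma 36 formula asks for, one can evaluate it for a fixed threshold $C$ over a bounded range of $\vec{\lambda}$. The sketch below is only an illustration: a bounded search can exhibit a witness for one $C$ but cannot decide the $\forall C$ statement, the coefficient values are hypothetical, and the names `constraint_values` and `find_witness` are not from the paper. It assumes every $b_{ki} \ge 1$, so that the logarithms are defined.

```python
import itertools
import math


def constraint_values(alpha, p, base, period, lam):
    """Evaluate sum_i alpha[j][i]*n_i + p[j][i]*log(n_i) for each j, with n_i = b_i + r_i*lam_i."""
    ns = [b + r * l for b, r, l in zip(base, period, lam)]
    return [
        sum(a * n + q * math.log(n) for a, q, n in zip(alpha_j, p_j, ns))
        for alpha_j, p_j in zip(alpha, p)
    ]


def find_witness(alpha, p, base, period, bound_c, max_lambda):
    """Return some lam in {0..max_lambda}^m making every constraint value < bound_c, or None."""
    m = len(base)
    for lam in itertools.product(range(max_lambda + 1), repeat=m):
        if all(v < bound_c for v in constraint_values(alpha, p, base, period, lam)):
            return lam
    return None


# Hypothetical instance 1: a single constraint that can be driven to -infinity
# by growing the first coordinate.
ALPHA_1, P_1 = [[-0.1, 0.2]], [[1, 0]]
# Hypothetical instance 2: two constraints that pull in opposite directions.
ALPHA_2, P_2 = [[-0.1, 0.2], [0.3, -0.1]], [[0, 0], [0, 0]]
BASE, PERIOD = (1, 1), (1, 1)
```

In the second instance the two constraints require $n_2 < 0.5\,n_1 - 5$ and $n_2 > 3\,n_1 + 10$ simultaneously for $C = -1$, which no positive $n_1, n_2$ satisfy, so no witness exists regardless of the search bound.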

Lemma 37.

Assume $L_s(W) \subseteq L_{s'}(W)$. Then $s$ is not big-O of $s'$ if and only if there exist $X \in D$, $\mathcal{Y} \subseteq D$, $1 \le k \le S_{X,\mathcal{Y}}$ and $U \subseteq \{ i \in \{ 1, \cdots, m \} \mid r_{ki} > 0 \}$ such that
\[ \forall C < 0 \;\; \exists \vec{x} \in \mathbb{R}^{|U|}_{\ge B_k} \;\; \bigwedge_{j \in h_{\mathcal{Y}}} \; \sum_{i \in U} \alpha_{j,i} \, r_{ki} \, x_i + p_{j,i} \log(x_i) < C , \]
where $B_k = \max_i b_{ki}$ and $h_{\mathcal{Y}}, \alpha_{j,i}, p_{j,i}, b_{ki}, r_{ki}$ are as in Lemma 36.

Proof Sketch.

Compare the logical characterisation in Lemmas 36 and 37. The first difference to note is that the effect of the $b_{ki}$'s is simply a constant offset, and so the sequence would tend to $-\infty$ with or without its presence. Similar simplifications can be made inside the logarithm: the multiplicative effect of $r_{ki}$ inside the logarithm can be extracted as an additive offset and thus similarly be discarded.

The second crucial difference is to relax the variable domains from integers to reals. If each of the $\lambda_i$ in the satisfying assignment is sufficiently large, we show we can relax the condition to real numbers rather than integers without affecting whether the sequence goes to $-\infty$. To do this, we test sets of indices $U$, where if $i \in U$ then $\lambda_i$ needs to be arbitrarily large over all $C$ (i.e. unbounded). The positions where $\lambda_i$ is always bounded are again a constant offset and are omitted. ◀

By testing the LC condition and the condition from Lemma 37 for each possible $X, \mathcal{Y}, k, U$, in turn using the relevant (conditionally decidable) first-order theory of the reals, we have:

Lemma 38.

Given a weighted automaton $W$ and states $s, s'$ such that $L_s(W)$ and $L_{s'}(W)$ are plus-letter-bounded, it is decidable whether $s$ is big-O of $s'$, subject to Schanuel's conjecture.

Here we consider the case where $L_s(W)$ and $L_{s'}(W)$ are letter-bounded, that is, $L_s(W)$ and $L_{s'}(W)$ are subsets of $a_1^* \dots a_m^*$ for some $a_1, \dots, a_m \in \Sigma$, which is a relaxation of the preceding case. For the plus-letter-bounded case, we relied on a 1-1 correspondence between numeric vectors and words. This correspondence no longer holds in the letter-bounded case: for example, $a^n$ matches $a^* b^* a^*$, but it could correspond to $(n, 0, 0)$, $(0, 0, n)$, as well as any $(n_1, 0, n_2)$ with $n_1 + n_2 = n$. Still, there is a reduction to the plus-letter-bounded case.

Lemma 39.

The big-O problem for $W$, $s, s'$ with $L_s(W)$ and $L_{s'}(W)$ letter-bounded reduces to the plus-letter-bounded case.

Proof.

Suppose the LC condition holds and $L_s(W) \subseteq L_{s'}(W) \subseteq a_1^* \cdots a_m^*$. Let $I$ be the set of strictly increasing sequences $\vec{\imath} = i_1 \cdots i_k$ of integers between 1 and $m$. Given $\vec{\imath} \in I$, let $W_{\vec{\imath}}$ be the weighted automaton obtained by intersecting $W$ with a DFA for $a_{i_1}^+ \cdots a_{i_k}^+$ whose initial state is $q$. Note that $s$ is big-O of $s'$ (in $W$) iff $(s, q)$ is big-O of $(s', q)$ in $W_{\vec{\imath}}$ for all $\vec{\imath} \in I$, because $a_1^* \cdots a_m^* = \bigcup_{\vec{\imath} \in I} a_{i_1}^+ \cdots a_{i_k}^+$. Because the big-O problem for each $W_{\vec{\imath}}$, $(s, q)$, $(s', q)$ falls into the plus-letter-bounded case, the result follows from Lemma 38. ◀

Here we consider the case where $L_s(W)$ and $L_{s'}(W)$ are bounded, which is a relaxation of letter-boundedness (see Definition 6): $L_s(W)$ and $L_{s'}(W)$ are subsets of $w_1^* \dots w_m^*$ for some $w_1, \dots, w_m \in \Sigma^*$. We show a reduction to the letter-bounded case from Section 6.2.

To showcase the difference to the letter-bounded case, consider the language $(abab)^* a^* b^* (ab)^*$. Observe that, for example, the word $(ab)^4$ can be decomposed in a number of ways: $(abab)^2 a^0 b^0 (ab)^0$, $(abab)^1 a^1 b^1 (ab)^1$, $(abab)^1 a^0 b^0 (ab)^2$, $(abab)^0 a^1 b^1 (ab)^3$ or $(abab)^0 a^0 b^0 (ab)^4$. One must be careful to consider all such decompositions.

Lemma 40.

The big-O problem for $W$, $s, s'$ with $L_s(W)$ and $L_{s'}(W)$ bounded reduces to the letter-bounded case.

Proof sketch (see Appendix G.3).

Suppose $W$ is bounded over $w_1^* \dots w_m^*$. We will construct a new weighted automaton $W'$ letter-bounded over a new alphabet $a_1^* \dots a_m^*$ with the following property. For every decomposition of a word $w$ as $w_1^{n_1} \dots w_m^{n_m}$, the weight of $a_1^{n_1} \dots a_m^{n_m}$ in $W'$ is equal to the weight of $w$ in $W$. ◀

Despite undecidability results, we have identified several decidable cases of the big-O problem. However, for bounded languages, the result depends on a conjecture from number theory, leaving open the exact borderline between decidability and undecidability.

Natural directions for future work include the analogous problem for infinite words, further analysis on ambiguity (e.g., is the big-O problem decidable for $k$-ambiguous weighted automata?), and the extension to negative edge weights.

References

Shaull Almagor, Udi Boker, and Orna Kupferman. What's decidable about weighted automata? In

ATVA , volume 6996 of

Lecture Notes in Computer Science , pages 482–491. Springer, 2011. Shaull Almagor, Dmitry Chistikov, Joël Ouaknine, and James Worrell. O-minimal invariantsfor linear loops. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and DonaldSannella, editors, , volume 107 of

LIPIcs , pages 114:1–114:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.114 . Saugata Basu, Richard Pollack, and Marie-Françoise Roy.

Algorithms in Real AlgebraicGeometry , volume 10 of

Algorithms and computation in mathematics . Springer, 2nd edition,2006. Rohit Chadha, Dileep Kini, and Mahesh Viswanathan. Decidable problems for unary PFAs.In Gethin Norman and William H. Sanders, editors,

Quantitative Evaluation of Systems - 11thInternational Conference, QEST 2014 , volume 8657 of

Lecture Notes in Computer Science ,pages 329–344. Springer, 2014. doi:10.1007/978-3-319-10696-0_26 . Konstantinos Chatzikokolakis, Daniel Gebler, Catuscia Palamidessi, and Lili Xu. GeneralizedBisimulation Metrics. In Paolo Baldan and Daniele Gorla, editors,

CONCUR 2014 - Concur-rency Theory - 25th International Conference, CONCUR 2014 , volume 8704 of

Lecture Notesin Computer Science , pages 32–46. Springer, 2014. doi:10.1007/978-3-662-44584-6_4 . Taolue Chen and Stefan Kiefer. On the total variation distance of labelled Markovchains. In Thomas A. Henzinger and Dale Miller, editors,

Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS2014 , pages 33:1–33:10. ACM, 2014. URL: http://dl.acm.org/citation.cfm?id=2603088 , doi:10.1145/2603088.2603099 . Dmitry Chistikov, Andrzej S. Murawski, and David Purser. Asymmetric distances for approx-imate diﬀerential privacy. In Wan Fokkink and Rob van Glabbeek, editors, , volume 140 of

LIPIcs , pages 10:1–10:17.Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.CONCUR.2019.10 . Ventsislav Chonev, Joël Ouaknine, and James Worrell. On the Skolem problem for continuouslinear dynamical systems. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani,and Davide Sangiorgi, editors, , volume 55 of

LIPIcs , pages 100:1–100:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICALP.2016.100 . Marek Chrobak. Finite automata and unary languages.

Theor. Comput. Sci. , 47(3):149–158,1986. doi:10.1016/0304-3975(86)90142-8 . Thomas Colcombet. Unambiguity in automata theory. In Jeﬀrey O. Shallit and AlexanderOkhotin, editors,

Descriptional Complexity of Formal Systems - 17th International Workshop,DCFS 2015 , volume 9118 of

Lecture Notes in Computer Science , pages 3–18. Springer, 2015. doi:10.1007/978-3-319-19225-3_1 . T. H. Cormen, C. E. Leiserson, and R. L. Rivest.

Introduction to Algorithms . MIT Press,1990. Laure Daviaud, Marcin Jurdzinski, Ranko Lazic, Filip Mazowiecki, Guillermo A. Pérez, andJames Worrell. When is containment decidable for probabilistic automata? In IoannisChatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, , volume 107 of

LIPIcs , pages 121:1–121:14. Schloss Dagstuhl -Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.121 . C. Dehnert, S. Junges, J.-P. Katoen, and M. Volk. A Storm is coming: A modern probabilisticmodel checker. In

Proceedings of Computer Aided Veriﬁcation (CAV) , pages 592–600. Springer,2017. Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating Noiseto Sensitivity in Private Data Analysis. In Shai Halevi and Tal Rabin, editors,

Theory ofCryptography, Third Theory of Cryptography Conference, TCC 2006 , volume 3876 of

LectureNotes in Computer Science , pages 265–284. Springer, 2006. doi:10.1007/11681878_14 . Shimon Even, Alan L. Selman, and Yacov Yacobi. The complexity of promise problems withapplications to public-key cryptography.

Information and Control , 61(2):159–173, 1984. Nathanaël Fijalkow. Undecidability results for probabilistic automata.

SIGLOG News , 4(4):10–17, 2017. URL: https://dl.acm.org/citation.cfm?id=3157833 . Nathanaël Fijalkow, Hugo Gimbert, and Youssouf Oualhadj. Deciding the value 1 problemfor probabilistic leaktight automata. In

Proceedings of the 27th Annual IEEE Symposiumon Logic in Computer Science, LICS 2012 , pages 295–304. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.40 . Shmuel Friedland and Hans Schneider. The growth of powers of a nonnegative matrix.

SIAMJ. Matrix Analysis Applications , 1(2):185–200, 1980. doi:10.1137/0601022 . Pawel Gawrychowski, Dalia Krieger, Narad Rampersad, and Jeﬀrey Shallit. Finding thegrowth rate of a regular or context-free language in polynomial time.

Int. J. Found. Comput.Sci. , 21(4):597–618, 2010. doi:10.1142/S0129054110007441 . Hugo Gimbert and Youssouf Oualhadj. Probabilistic automata on ﬁnite words: Decidable andundecidable problems. In Samson Abramsky, Cyril Gavoille, Claude Kirchner, Friedhelm Meyerauf der Heide, and Paul G. Spirakis, editors,

Automata, Languages and Programming, 37thInternational Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings,Part II , volume 6199 of

Lecture Notes in Computer Science , pages 527–538. Springer, 2010. doi:10.1007/978-3-642-14162-1\_44 . Seymour Ginsburg.

The Mathematical Theory of Context-Free Languages . McGraw-Hill, 1966. Seymour Ginsburg and Edwin H Spanier. Bounded algol-like languages.

Transactions of theAmerican Mathematical Society , 113(2):333–368, 1964. Thanh Minh Hoang and Thomas Thierauf. The complexity of the characteristic and the minimalpolynomial.

Theor. Comput. Sci. , 295:205–222, 2003. doi:10.1016/S0304-3975(02)00404-8 . Cheng-Chao Huang, Jing-Cao Li, Ming Xu, and Zhi-Bin Li. Positive root isolation forpoly-powers by exclusion and diﬀerentiation.

Journal of Symbolic Computation , 85:148–169,2018. 41th International Symposium on Symbolic and Alge-braic Computation (ISSAC’16). doi:https://doi.org/10.1016/j.jsc.2017.07.007 . Stefan Kiefer. On computing the total variation distance of hidden Markov models. InIoannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, ,volume 107 of

LIPIcs , pages 130:1–130:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik,2018. doi:10.4230/LIPIcs.ICALP.2018.130 . Stefan Kiefer, Andrzej S. Murawski, Joël Ouaknine, Björn Wachter, and James Worrell. Onthe Complexity of Equivalence and Minimisation for Q-weighted Automata.

Logical Methodsin Computer Science , 9(1), 2013. doi:10.2168/LMCS-9(1:8)2013 . Daniel Krob. The equality problem for rational series with multiplicities in the tropicalsemiring is undecidable.

International Journal of Algebra and Computation , 4:405–425, 1994. M. Kwiatkowska, G. Norman, and D. Parker. PRISM 4.0: Veriﬁcation of probabilistic real-timesystems. In

Proceedings of Computer Aided Veriﬁcation (CAV) , volume 6806 of

LNCS , pages585–591. Springer, 2011. Serge Lang.

Introduction to transcendental numbers . Addison-Wesley Pub. Co., 1966. Angus Macintyre and Alex J Wilkie. On the decidability of the real exponential ﬁeld, 1996. Rupak Majumdar, Mahmoud Salamati, and Sadegh Soudjani. On decidability of time-bounded reachability in CTMDPs. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 19 editors, , volume 168 of

Leibniz International Proceedings in Informatics (LIPIcs) , pages 133:1–133:19, Dagstuhl, Germany, 2020. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2020.133 . Andrew Martinez. Eﬃcient computation of regular expressions from unary NFAs. In JürgenDassow, Maia Hoeberechts, Helmut Jürgensen, and Detlef Wotschke, editors,

Fourth Interna-tional Workshop on Descriptional Complexity of Formal Systems - DCFS 2002 , volume ReportNo. 586, pages 174–187. Department of Computer Science, The University of Western Ontario,Canada, 2002. David Mestel. Quantifying information ﬂow in interactive systems. In , pages414–427. IEEE, 2019. doi:10.1109/CSF.2019.00035 . David Mestel. Widths of Regular and Context-Free Languages. In ,volume 150 of

Leibniz International Proceedings in Informatics (LIPIcs) , pages 49:1–49:14,Dagstuhl, Germany, 2019. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: https://drops.dagstuhl.de/opus/volltexte/2019/11611 , doi:10.4230/LIPIcs.FSTTCS.2019.49 . Albert R. Meyer and Larry J. Stockmeyer. The equivalence problem for regular expressionswith squaring requires exponential space. In

Proceedings of the 13th Annual Symposium on Switching and Automata Theory, College Park, Maryland, USA, October 25-27, 1972, pages 125–129. IEEE Computer Society, 1972.

Joël Ouaknine and James Worrell. Positivity problems for low-order linear recurrence sequences. In Chandra Chekuri, editor, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, pages 366–379. SIAM, 2014. doi:10.1137/1.9781611973402.27.

Rohit J Parikh. On context-free languages. Journal of the ACM (JACM), 13(4):570–581, 1966.

Azaria Paz. Introduction to probabilistic automata. Academic Press, 2014.

Bjorn Poonen. Hilbert's tenth problem over rings of number-theoretic interest. Note from the lecture at the Arizona Winter School on “Number Theory and Logic”, 2003. URL: https://math.mit.edu/~poonen/papers/aws2003.pdf

Hans Schneider. The influence of the marked reduced graph of a nonnegative matrix on the Jordan form and on related properties: A survey. Linear Algebra and its Applications, 84:161–189, 1986.

Marcel Paul Schützenberger. On the definition of a family of automata. Information and Control, 4(2-3):245–270, 1961. doi:10.1016/S0019-9958(61)80020-X.

Bruno Sericola. Markov chains: theory and applications. John Wiley & Sons, 2013.

Adam D. Smith. Efficient, Differentially Private Point Estimators. CoRR, abs/0809.4794, 2008. URL: http://arxiv.org/abs/0809.4794

Larry J. Stockmeyer and Albert R. Meyer. Word problems requiring exponential time: Preliminary report. In Alfred V. Aho, Allan Borodin, Robert L. Constable, Robert W. Floyd, Michael A. Harrison, Richard M. Karp, and H. Raymond Strong, editors, Proceedings of the 5th Annual ACM Symposium on Theory of Computing, 1973, pages 1–9. ACM, 1973. doi:10.1145/800125.804029.

Anthony Widjaja To. Unary finite automata vs. arithmetic progressions. Inf. Process. Lett., 109(17):1010–1014, 2009. doi:10.1016/j.ipl.2009.06.005.

Wen-Guey Tzeng. A polynomial-time algorithm for the equivalence of probabilistic automata. SIAM J. Comput., 21(2):216–227, 1992. doi:10.1137/0221017.

Wen-Guey Tzeng. On path equivalence of nondeterministic finite automata. Inf. Process. Lett., 58(1):43–46, 1996. doi:10.1016/0020-0190(96)00039-7.

Michel Waldschmidt. Diophantine Approximation on Linear Algebraic Groups, volume 326 of Grundlehren der mathematischen Wissenschaften (A Series of Comprehensive Studies in Mathematics). Springer, Berlin, Heidelberg, 2000.

Figure 4: Reductions between big-O and big-Θ. (a) Reduction to big-Θ. (b) Reduction to big-O.

A Additional notation for the appendix

We will typically define weighted automata by listing transitions as $q \xrightarrow{p}_{a} q'$ (to mean $M(a)(q, q') = p$), with the assumption that any unspecified transition has weight 0.

B Additional material for Section 1
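As an informal companion to the transition-listing convention of Appendix A, the following Python sketch stores only the listed transitions (anything absent weighs 0), computes a word weight $f_s(w)$ by forward propagation, and brute-forces the ratio of word weights over short words. The function names, the dictionary encoding and the toy chain are assumptions made purely for illustration; none of them comes from the paper, and a bounded search of this kind says nothing about the supremum over all words.

```python
from fractions import Fraction
from itertools import product


def word_weight(transitions, final, start, word):
    """Return f_start(word): the total weight of word-labelled paths
    from `start` to a state in `final`. `transitions` maps
    (q, letter, q2) -> weight; unlisted transitions have weight 0."""
    current = {start: Fraction(1)}
    for letter in word:
        nxt = {}
        for (q, a, q2), p in transitions.items():
            if a == letter and q in current:
                nxt[q2] = nxt.get(q2, Fraction(0)) + current[q] * p
        current = nxt
    return sum((w for q, w in current.items() if q in final), Fraction(0))


def max_ratio(transitions, final, s, s2, alphabet, max_len):
    """Largest f_s(w) / f_s2(w) over words of length <= max_len.
    Returns None if some word has positive weight from s but weight 0
    from s2 (the ratio is then unbounded)."""
    best = Fraction(0)
    for n in range(max_len + 1):
        for word in product(alphabet, repeat=n):
            num = word_weight(transitions, final, s, word)
            if num == 0:
                continue
            den = word_weight(transitions, final, s2, word)
            if den == 0:
                return None
            best = max(best, num / den)
    return best


# Hypothetical toy chain: from "s", letter a loops with weight 1/2 and
# b ends in "t" with weight 1/2; from "u" the weights are 1/4 and 3/4.
TOY = {
    ("s", "a", "s"): Fraction(1, 2), ("s", "b", "t"): Fraction(1, 2),
    ("u", "a", "u"): Fraction(1, 4), ("u", "b", "t"): Fraction(3, 4),
}
print(max_ratio(TOY, {"t"}, "s", "u", "ab", 3))  # 8/3, attained by "aab"
```

In this toy chain the ratio doubles with every extra `a`, so "s" is not big-O of "u", whereas the reverse ratio never exceeds 3/2.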

Proposition 41. For $r$ and $rd$ on labelled Markov chains, it is sufficient to consider the supremum over $w \in \Sigma^*$ rather than $E \subseteq \Sigma^*$.

Proof of Proposition 41. We show that any event can be approximated by a finite subset, and that a finite event with more than one word can always be simplified without decreasing the ratio.

Suppose $\frac{a+b}{c+d} > \frac{a}{c}$ and $\frac{a+b}{c+d} > \frac{b}{d}$, with $c, d > 0$. By the first we have $ac + bc > ac + ad \implies bc > ad \implies \frac{b}{d} > \frac{a}{c}$. By the second we have $ad + bd > bc + bd \implies ad > bc \implies \frac{a}{c} > \frac{b}{d}$. Contradiction.

Hence, for the purposes of maximisation, given $f, g$ and a finite set $E$, such a set can always be simplified by repeated application. That is, there exists $e' \in E$ such that

$$\frac{\sum_{e \in E} f(e)}{\sum_{e \in E} g(e)} \le \frac{f(e')}{g(e')}. \qquad (1)$$

Consider an event $E \subseteq \Sigma^*$. For every $\lambda > 0$ there exists $k$ such that $f_s(E \cap \Sigma^{>k}) \le \lambda$. Then $f_s(E \cap \Sigma^{\le k}) \le f_s(E) \le f_s(E \cap \Sigma^{\le k}) + \lambda$ [25, Lemma 12]. For any $\epsilon$, by choice of sufficiently small $\lambda$ there is a finite set $E'$ such that

$$\frac{f_s(E')}{f_{s'}(E')} - \epsilon \le \frac{f_s(E)}{f_{s'}(E)} \le \frac{f_s(E')}{f_{s'}(E')} + \epsilon.$$

Consider $\sup_{E \subseteq \Sigma^*} \frac{f_s(E)}{f_{s'}(E)}$: this is equivalent to $\lim_{k \to \infty} \sup_{E \subseteq \Sigma^* \cap \Sigma^{\le k}} \frac{f_s(E)}{f_{s'}(E)}$ and, by Equation (1), this is equivalent to $\lim_{k \to \infty} \sup_{w \in \Sigma^* \cap \Sigma^{\le k}} \frac{f_s(w)}{f_{s'}(w)} = \sup_{w \in \Sigma^*} \frac{f_s(w)}{f_{s'}(w)}$.

C Additional material for Section 2

Lemma 42. The big-O problem is interreducible with the big-Θ problem.

Proof. The big-O problem reduces to the big-Θ problem: to ask if $s$ is big-O of $s'$, add states $q, q'$ using the construction of Figure 4a, then ask if $q$ is big-Θ of $q'$.

$$\frac{f_q(aw)}{f_{q'}(aw)} = \frac{0.5 f_s(w) + 0.5 f_{s'}(w)}{f_{s'}(w)} < C \iff \frac{f_s(w)}{f_{s'}(w)} < 2C - 1$$

$$\frac{f_{q'}(aw)}{f_q(aw)} = \frac{f_{s'}(w)}{0.5 f_s(w) + 0.5 f_{s'}(w)} \le 2$$

The big-Θ problem reduces to the big-O problem: to ask if $s$ is big-Θ of $s'$, add states $q, q'$ using the construction of Figure 4b, then ask if $q$ is big-O of $q'$.

$$\frac{f_q(aw)}{f_{q'}(aw)} = \frac{0.5 f_s(w)}{0.5 f_{s'}(w)} < C \iff \frac{f_s(w)}{f_{s'}(w)} < C$$

$$\frac{f_q(bw)}{f_{q'}(bw)} = \frac{0.5 f_{s'}(w)}{0.5 f_s(w)} < C \iff \frac{f_{s'}(w)}{f_s(w)} < C$$

Each of the reductions adds a constant number of bits; as such, they operate in logarithmic space.

D Additional material for Section 3

In this section we use the notation $P_A(w) = f_{q_s}(w)$, where $q_s$ is the start state of the probabilistic automaton $A$. This is to avoid confusion when there is both the probabilistic automaton being reduced from and the labelled Markov chain being reduced to. Henceforth in this section, the notation $f_s(w)$ refers to the labelled Markov chain.

Undecidability by Emptiness of Probabilistic automata (Theorem 8)

The following lemma plays a key role in proving the result. In its statement, “undecidable to distinguish” means that the corresponding promise problem (see e.g. [15]) is undecidable. In other words, if the input is not in one of the two cases which should be distinguished between, the answer is not specified and can be arbitrary (including non-termination).

Results in this section are presented on ratio total variation distances on labelled Markov chains, and thus apply to the big-O problem in the more general weighted automata.

Lemma 43.

1. Given an LMC along with two states $s, s'$ and a constant $c$, it is undecidable to distinguish between $r(s, s') \le c$ and $r(s, s') = \infty$.
2. Given an LMC along with two states $s, s'$ and two numbers $c$ and $C$ such that $c < C$, it is undecidable to distinguish between $r(s, s') \le c$ and $C \le r(s, s') < \infty$.

Both statements remain true if $r$ is replaced with $rd$.

Proof.

For both cases, we reduce from

Empty . We show our construction for Σ = { a, b } ,but the procedure can be generalised to arbitrary alphabets.The construction will create two branches of a labelled Markov chain. The ﬁrst, fromstate q s , will simulate the given probabilistic automaton using the original weights multipliedby the same scalar (in this case ). The other branch, from state s , will process eachletter from Σ with equal weight (also in an inﬁnite loop). Consequently, if there is a wordaccepted with probability greater than , the ratio between the two branches will be greaterthan 1. The construction will make it possible to process words repeatedly, so that the ratiocan then be pumped unboundedly.Formally, given a probabilistic automaton A = h Q, Σ , M, F i with start state q s . Firstobserve that w.l.o.g. q s is not accepting, since in this case the empty word is accepted withprobability 1, and thus there is a word with probability greater than and a trivial positiveinstance of the big-O problem can be returned.We construct the LMC h Q , Σ , δ, F i taking Q = Q ] { s, s , s , s , t } where ] denotesdisjoint union, Σ = { a, b, acc , rej , ‘} , F = { t } and δ as speciﬁed below. First we simulatethe probabilistic automaton with a scaling factor of : for all q, q ∈ Q , q M ( a )( q,q ) −−−−−−−−→ a q q M ( b )( q,q ) −−−−−−−−→ b q . s tq s q a q r qrej a acc b rej rej a M ( a )( q s ,q a )4 b M ( b )( q s ,q a )4 a M ( a )( q a ,q a )4 b M ( b )( q a ,q a )4 a M ( a )( q r ,q r )4 b M ( b )( q r ,q r )4 a M ( a )( q s ,q s )4 b M ( b )( q s ,q s )4 a M ( a )( q s ,q r )4 b M ( b )( q s ,q r )4 acc . . . . . .. . . Figure 5

Reduction; where q a represents accepting states of the probabilistic automaton, q r represents rejecting states and q s represents the start state (assumed to be rejecting). Originally accepting runs trigger a restart, while rejecting ones are redirected to t :if q ∈ F : q −−→ acc q s and if q F : q −−→ rej t. We then add a part of the chain which behaves equally, rather than according to theprobabilistic automaton: s −→ a s s −→ b s s −−→ acc s s −−→ rej t. The construction is illustrated in Figure 5. To complete the reduction, we add the followingtransitions from s, s , s . s −→ ‘ s s −→ ‘ q s s −→ ‘ s s −−→ ‘ s s −−→ ‘ q s We make the following claims: (cid:66)

Claim 44. If A 6∈

Empty then r ( s, s ) = ∞ . If A ∈

Empty then r ( s, s ) ≤ (cid:66) Claim 45. If A 6∈

Empty then 49 < r ( s, s ) ≤

51. If

A ∈

Empty then r ( s, s ) ≤ r ) follows from the undecidability of Empty . Note that s, s are taken to be certain “linear combinations” of s and q s . This ensures that r ( s , s ) ≤ r ( s , s ) ≤

2, consequently the claims for rd will follow. Proof of Claim 44.

First observe that f s ( ‘ w ) f s ( ‘ w ) = f s ( w ) + f q s ( w ) f s ( w ) = 12 + 12 f q s ( w ) f s ( w ) (2) . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 23 If there is a word w that is accepted by the automaton with probability > , then let w = ( w acc ) i rej and we have f q s ( w ) f s ( w ) = (( ) | w | P ( w ) ) i (( ) | w | ) i = (2 P ( w )) i (3)Since P ( w ) > then 2 P ( w ) > i →∞ f s ( ‘ ( w acc ) i rej ) f s ( ‘ ( w acc ) i rej ) = ∞ and r ( s, s ) = rd ( s, s ) = ∞ . If there is no such word then ∀ w ∈ Σ ∗ : P ( w ) ≤ , then probability ratio of all wordsis bounded. All words start with ‘ and are terminated by rej , so in general all wordstake the form w = ‘ (( w acc ) . . . ( w n acc )( w n +1 rej ). Let us consider the probability of w = (( w acc ) . . . ( w n acc )( w n +1 rej ) words from s and q s . Then: f q s ( w ) f s ( w ) (4)= ( Q ni =1 12 ( ) | w i | P [ w i ])(( ) | w n +1 | (1 − P [ w n +1 ]) )( ) | w | + ··· + | w n | ( ) n ( ) | w n +1 | (5) ≤ (( ) | w | + ··· + | w n | ( ) n ( ) n )(( ) | w n +1 | )( ) | w | + ··· + | w n | + n ( ) | w n +1 | ( ∀ i : P [ w i ] ≤ )= 2 (6)Then using Equation (2) we have for every word w we have ≤ f s ( w ) f s ( w ) ≤ and r ( s, s ) ≤ and rd ( s, s ) ≤ (cid:67) Proof of Claim 45.

First observe that the direction of f s ( ‘ w ) f s ( ‘ w ) is always ≤

2, resulting in theonly interesting direction being f s ( ‘ w ) f s ( ‘ w ) : f s ( ‘ w ) f s ( ‘ w ) = f s ( w ) + f q s ( w ) f s ( w ) + f q s ( w )= f s ( w ) f s ( w ) + f q s ( w ) + f q s ( w ) f s ( w ) + f q s ( w ) ≤ f s ( w ) f s ( w ) + f q s ( w ) f q s ( w )= 2 · ‘ w , r and rd is bounded: f s ( ‘ w ) f s ( ‘ w ) = f s ( w ) + f q s ( w ) f s ( w ) + f q s ( w )= f s ( w ) f s ( w ) + f q s ( w ) + f q s ( w ) f s ( w ) + f q s ( w ) ≤ f s ( w ) f s ( w ) + f q s ( w ) f q s ( w ) ≤ ·

99 + 1002 ≤ If there is a word w that is accepted by the automaton with probability > , then weconsider the word ‘ ( w acc ) i rej ), let w = ( w acc ) i rej ). f s ( ‘ ( w acc ) i rej ) f s ( ‘ ( w acc ) i rej ) = f s ( w ) + f q s ( w ) f s ( w ) + f q s ( w ) ≥ f q s ( w ) f s ( w ) + f q s ( w )By the previous proof (Equation (3)) we know f qs ( w ) f s ( w ) −−−→ i →∞ ∞ , thus f s ( w ) f qs ( w ) −−−→ i →∞ f s ( w ) + f q s ( w ) f q s ( w ) = 2100 + 2 · (cid:20) f s ( w ) f q s ( w ) (cid:21) −−−→ i →∞ f qs ( w ) f s ( w )+ f qs ( w ) −−−→ i →∞ = 50. So for all (cid:15) there exists an i such that f s ( ‘ ( w acc ) i rej ) f s ( ‘ ( w acc ) i rej ) ≥ − (cid:15) . In particular for example r ( s, s ) ≥ ∀ w ∈ Σ ∗ : P ( w ) ≤ , then we show the total variationdistance will be small. All words start with ‘ and are terminated by rej , so in general allwords take the form w = ‘ (( w acc ) . . . ( w n acc )( w n +1 rej ). Let us consider the probabilityof such words from s, s . f s ( w ) f s ( w ) = f s ( w ) + f q s ( w ) f s ( w ) + f q s ( w ) ≤ f s ( w ) + f q s ( w ) f s ( w ) ≤ · (cid:20)

12 + 12 f q s ( w ) f s ( w ) (cid:21) ≤ ·

32 (by Equation (6)) ≤ ∃ w : P ( w ) > then 49 < r ( s, s ) ≤

51 and49 < rd ( s, s ) ≤

51 and if not then r ( s, s ) ≤ rd ( s, s ) ≤ (cid:67)(cid:74) Theorem 8

Lemma 43 implies Theorem 8.

Proof of Theorem 8.

We reason by contradiction using Lemma 43. For the big-O problem,it suﬃces to observe that, if it were decidable, one could use it to solve the ﬁrst promiseproblem from the Lemma (recall that in a promise problem the input is guaranteed to fallinto one of the two cases). This would contradict Lemma 43.Similarly, the decidability of the (asymmetric) threshold problem would allow us todistinguish between r ( s, s ) ≤ c and C ≤ r ( s, s ) < ∞ (second promise problem from theLemma) by considering the instance r ( s, s ) ≤ c + C (non-strict variant) or r ( s, s ) < c + C (strict variant). A positive answer (regardless of the variant) implies r ( s, s ) < C , while anegative one yields r ( s, s ) > c , which suﬃces to distinguish the cases. Note that in both . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 25 cases r ( s, s ) is bounded, so the reasoning remains valid if it is known in advance that r ( s, s )is bounded.For additive (asymmetric) approximation, we observe that ﬁnding x such that | r ( s, s ) − x | ≤ C − c and comparing it with c + C makes it possible to distinguish between r ( s, s ) ≤ c and C ≤ r ( s, s ) < ∞ . This is because r ( s, s ) ≤ c then implies x < c + C and C ≤ r ( s, s )implies c + C < x .In the multiplicative case, ﬁnding x such that 1 − C − c C ≤ x r ( s,s ) ≤ C − c C and comparing x with c + C yields an analogous argument.Since Lemma 43 also applies to rd , all of our results hold when r is replaced by rd . (cid:74) D.1 The relation to the Value-1 Problem

The previous section showed undecidability of the big-O problem via the emptiness problem for probabilistic automata. Another undecidable problem for probabilistic automata is the Value-1 problem [20]. The Value-1 problem asks whether a probabilistic automaton accepts some word with probability 1, or at least with probability arbitrarily close to 1. This section shows that there is a close, but not complete, connection between the Value-1 problem and the big-O problem by reducing in both directions between the two; the results are stated in Lemmas 47 and 48.

Definition 46. The Value-1 problem, given a probabilistic automaton $A$, asks if for all $\delta > 0$ there exists a word $w$ such that $P_A(w) > 1 - \delta$.

Lemma 47. The Value-1 problem reduces to the big-O problem.

Lemma 48. The big-O problem reduces to the Value-1 problem.

Proof of Lemma 47 (Value-1 reduces to big-O).

Given a probabilistic automaton A = h Q, Σ , M, F i and a dedicated starting state q ∈ Q ,which accepts words with probability P A ( w ), ﬁrst construct A in which words are acceptedwith probability P A ( w ) = 1 − P A ( w ), by inverting accepting states.The proof uses a two letter alphabet, Σ = { a, b } , but the procedure can be generalisedto arbitrary alphabets. Construct a Markov chain M A = h Q , Σ , M , F i , where Q = Q ∪ { s, s , s , rej, acc } , Σ = { a, b, c } and F = { acc } . The probabilistic automaton will besimulated by M A . The relation M is described by the notation p −→ a :For all q ∈ Q : ∀ q ∈ Q : q M ( a )( q,q ) −−−−−−−−→ a q q M ( b )( q,q ) −−−−−−−−→ b q if q ∈ F : q −→ c acc and if q F : q −→ c rejs −→ c q s −→ c s s −→ a s s −→ b s s −→ c acc Note the only words with positive probability are words of the form c Σ ∗ c ⊆ Σ . Thengiven a word w ∈ Σ ∗ , f s ( cwc ) = ( | Σ | +1 ) | wc | and f s ( cwc ) = ( | Σ | +1 ) | wc | (1 − P A ( w )).Then if there is a sequence of words for which P A ( w ) tends to 1 then f s ( cwc ) f s ( cwc ) is unbounded.However, if there exists some γ > w ∈ Σ ∗ we have P A ( w ) ≤ (1 − γ ) then(1 − P A ( w )) ≥ γ , and so f s ( cwc ) f s ( cwc ) ≤ γ . (cid:74) Proof of Lemma 48 (big-O reduces to Value-1).

Given $\mathcal{M} = \langle Q, \Sigma, M, F \rangle$ and $s, s' \in Q$, construct a probabilistic automaton $A = \langle Q', \Sigma', M', F' \rangle$. Each state of $Q$ will be duplicated, once for $s$ and once for $s'$: $Q_s = \{ q_s \mid q \in Q \}$, $Q_{s'} = \{ q_{s'} \mid q \in Q \}$. Let $Q' = Q_s \cup Q_{s'} \cup \{ q_0, acc, rej, sink \}$, $\Sigma' = \Sigma \cup \{ \$ \}$ and $F' = \{ acc \}$. The reduction can be seen in Figure 6.

Figure 6: Reduction to Value-1. Only the effect of transitions on the $ symbol is shown in black, with the possibility to transition to the sink state depicted in grey (on symbols in Σ). All remaining transitions are omitted.

Each transition of M will be simulated in each of the copies according the probability in M .For every q, q ∈ Q, a ∈ Σ, let M ( a )( q s , q s ) = M ( a )( q, q ) and M ( a )( q s , q s ) = M ( a )( q, q ).A probabilistic automaton should be stochastic for every a ∈ Σ, so there is unused probabilityfor each character, which will divert to a sink. For every q ∈ Q and a ∈ Σ, let M ( a )( q s , sink ) = 1 − X q ∈ Q M ( a )( q, q )and M ( a )( q s , sink ) = 1 − X q ∈ Q M ( a )( q, q ) . There will be an additional character $.From q the machine will pick either of the two machines with equal probability; M ($)( q , s s ) = M ($)( q , s s ) = . If in the accepting or rejecting state the system will staythere forever M ($)( acc, acc ) = 1 and M ($)( rej, rej ) = 1 .The behaviour on $ will diﬀer in the two copies of M . If in an s state the system willpreference the accepting state when accepting and otherwise restart. If in an s state thesystem will preference the rejecting state when accepting and otherwise restart. Formally, M ($)( q s , acc ) when q s ∈ F and M ($)( q s , q ) when q s F . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 27 and M ($)( q s , rej ) when q s ∈ F and M ($)( q s , q ) when q s F. When in the sink state, the system restarts on $, M ($)( sink, q ) = 1, or for all a ∈ Σstays there M ( a )( sink, sink ) = 1.The idea is that if f s ( w ) >> f s ( w ) then, by repeated reading of the word w , all of theprobability mass will eventually move to ‘acc’; otherwise a suﬃciently large amount of masswill be lost to ‘rej’.Denote by P A ( w ) the probability of a word w in the probabilistic automaton, from state q , i.e. f q ( w ). However, f will be used to refer to the probability in the labelled Markovchain M . Further the notation P [ q w −→ q ] is used to denote ( M ( w ) × · · · × M ( w | w | ) q,q , i.e.the probability of transitioning from state q to q after reading w in A .Consider each direction: (cid:73) Case 1 (Not big-O implies

Value-1 ) . The proof shows that ∀ δ ∃ C, i ∈ N , w ∈ Σ ∗ such that f s ( w ) > Cf s ( w ) and P A (( w $) i ) > − δ .Hence given δ , choose C such that (1 − δ ) CC +1 > − δ . Then by the big-O property, choosea word such that f s ( w ) = C f s ( w ) , with C > C . Then (1 − δ ) C C +1 > (1 − δ ) CC +1 > − δ .Given the ﬁxed sequence ($ w $) i , this induces a (unary) Markov chain, represented by theMatrix A , representing states q , acc and rej in the three positions respectively: A =  . − f s ( w )) + 0 . − f s ( w )) 0 . f s ( w ) 0 . f s ( w )0 1 00 0 1  Then in the long run, starting from state , observe: [ 1 0 0 ] A i i →∞ −−−→ [ 0 Cx x ] with C x + x = 1 Clearly, A i (0 ,

1) + A i (0 ,

2) + A i (0 ,

0) = 1 , and choose i such that A i (0 , ≤ δ . Then A i (0 ,

1) + A i (0 , ≥ − δ , using the fact that A i (0 ,

1) = C A i (0 , , obtaining A i (0 ,

1) + A i (0 , C ≥ − δ Hence A i (0 , ≥ (1 − δ ) C C +1 > − δ , as required. (cid:73) Case 2 (big-O implies Not Value-1) . We have there exists C such that ∀ w f s ( w ) ≤ Cf s ( w ) and should show there exists δ > such that ∀ w ∈ (Σ ∪ { $ } ) ∗ we have P A ( w ) ≤ − δ To move probability from q to acc it is necessary to use words of the form $Σ ∗ $ where Σ is the alphabet of M . Hence any word can be decomposed into $ w $$ w $ ... $ w m $ .After reading w the probability is such that x = P [ q w $ −−−→ acc ] = f s ( w ) y = P [ q w $ −−−→ rej ] = f s ( w ) P [ q w $ −−−→ q ] = 1 − x − y Since ∃ C ∀ w i : f s ( w i ) ≤ Cf s ( w i ) , we have x ≤ Cy . By induction, repeating this process we have for all i : x i ≤ Cy i . x i = P [ q w $ ... $ w i $ −−−−−−−→ acc ] = (1 − f s ( w i ) − f s ( w i )) x i − + f s ( w i ) y i = P [ q w $ ... $ w i $ −−−−−−−→ acc ] = (1 − f s ( w i ) − f s ( w i )) y i − + f s ( w i ) P [ q w $ ... $ w i $ −−−−−−−→ q ] = i Y j =1 (1 − x j + y j ) . Hence x i = (1 − f s ( w i ) − f s ( w i )) x i − + f s ( w i ) ≤ (1 − f s ( w i ) − f s ( w i )) Cy i − + Cf s ( w i )= C [(1 − f s ( w i ) − f s ( w i )) y i − + f s ( w i )] ≤ Cy i . In the extreme x m + y m = 1 , then x m ≤ CC +1 < , so the probability of reaching acc isbounded away from for every word. (cid:74) The

Value-1 problem is undecidable in general; however, it is decidable in the unary case in coNP [4] and for leaktight automata [17]. Note, however, that the construction combined with these decidability results does not entail any decidability results for the big-O problem. Firstly, the construction adds an additional character, so a unary instance of the big-O problem always has at least two characters when translated to the Value-1 problem. Further, the construction does not result in a leaktight automaton; to see this, the definition of leaktight automata is recalled from [17]. This does not, of course, preclude the existence of a construction which does maintain these properties.
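The absorption argument in Case 1 of the proof of Lemma 48 can be checked numerically. The sketch below iterates the three-state chain over $(q_0, acc, rej)$ induced by repeatedly reading $\$w\$$, assuming that the initial \$ picks each copy with probability 1/2, so that one round moves mass $0.5 f_s(w)$ to acc and $0.5 f_{s'}(w)$ to rej. The name `absorb` and its parameters are hypothetical, and the code is an illustration of the limit behaviour rather than part of the proof.

```python
from fractions import Fraction


def absorb(f_s, f_s2, rounds):
    """Distribution over (q0, acc, rej) after `rounds` readings of $w$,
    starting with all mass in q0. f_s and f_s2 are the weights of the
    fixed word w from s and from s' (assumed inputs, as Fractions)."""
    # Mass that returns to q0 after one round:
    # 0.5 * (1 - f_s) + 0.5 * (1 - f_s2).
    stay = 1 - f_s / 2 - f_s2 / 2
    q0, acc, rej = Fraction(1), Fraction(0), Fraction(0)
    for _ in range(rounds):
        acc += q0 * f_s / 2   # absorbed in acc via the s-copy
        rej += q0 * f_s2 / 2  # absorbed in rej via the s'-copy
        q0 *= stay
    return q0, acc, rej


# With f_s(w) = 9 * f_s'(w), the acc mass approaches 9/10 from below.
q0, acc, rej = absorb(Fraction(9, 100), Fraction(1, 100), 200)
print(float(q0), float(acc), float(rej))
```

The two absorbed masses keep the fixed ratio $f_s(w) : f_{s'}(w)$ at every step, which is why a word with a large ratio $C'$ pushes the acceptance probability towards $C'/(C'+1)$.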

Definition 49. A finite word $u$ is idempotent if reading the word $u$ once or twice does not change qualitatively the transition probabilities. That is, $P_A[q \xrightarrow{u} q'] > 0 \iff P_A[q \xrightarrow{uu} q'] > 0$.

Let $u_n$ be a sequence of idempotent words. Assume that the sequence of matrices $P_A(u_n)$ converges to a limit $M$, that this limit is idempotent, and denote by $\mathcal{M}$ the associated Markov chain. The sequence $u_n$ is a leak if there exist $r, q \in Q$ such that the following three conditions hold:

1. $r$ and $q$ are recurrent in $\mathcal{M}$,
2. $\lim_n P_A[r \xrightarrow{u_n} q] = 0$,
3. for all $n$, $P_A[r \xrightarrow{u_n} q] > 0$.

An automaton is leaktight if there is no leak.

If there were no leak in the probabilistic automaton then decidability would follow. However, this is not the case, and the reduction does not solve any cases by reduction to a known decidable fragment of the Value-1 problem.

Claim 50. The automaton resulting from the reduction of the big-O problem to the Value-1 problem has a leak.

Proof.

Consider some inﬁnite sequence of words w i growing in length, such that f s ( w i ) > i . Let u i = $ w i $.Observe that this word is idempotent. For each starting state, consider the possible stateswith non-zero probability and from each of these the set of reachable states. Observe that inall cases the set reachable after one application is equal to the set reachable after two. . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 29 acc $ w i $ −−−→ acc $ w i $ −−−→ accrej $ w i $ −−−→ rej $ w i $ −−−→ rejq w i $ −−−→ q , acc, rej $ w i $ −−−→ q , acc, rejq w i $ −−−→ q , acc, rej $ w i $ −−−→ q , acc, rej For q accepting in Q s : q $ w i $ −−−→ acc $ w i $ −−−→ acc For q rejecting in Q s : q $ w i $ −−−→ ∅ $ w i $ −−−→ ∅ For q accepting in Q s : q $ w i $ −−−→ rej $ w i $ −−−→ rej For q rejecting in Q s : q $ w i $ −−−→ ∅ $ w i $ −−−→ ∅ Assume that the labelled Markov chain M has a sink, that is the decision to terminatethe word must be made by probability. Then ∀ λ > n such that f s (Σ >n ) < λ and f s (Σ >n ) < λ [25, Lemma 12.].Suppose limit P A ( u n ) converges to a limit M and let r = q and q = acc .Hence for longer and longer words the probability of reaching acc is diminishing. Thuslim P A [ r u n −−→ q ] = 0, and in M we have r and q in diﬀerent SCCs. acc is clearly recurrent asit is deterministically looping on every character. Since the probability of reaching acc isdiminishing for longer and longer words, whenever $ is read the state returns to r , hence allwords return to r with probability 1 in the limit. By the choice of words in the sequence, forevery word f s ( w n ) >

0, we have P A [ r u n −−→ q ] > n .Hence a leak has been deﬁned, even in the case where M is unary. (cid:74) E Additional material for Section 4

Here we discuss the relationship between the big-O problem and the eventually big-O problem. Let $W = \langle Q, \Sigma, M, \{t\} \rangle$ be a weighted automaton, $s, s' \in Q$, and $s \neq s'$. Below, whenever we write $f_s$ (resp. $f_{s'}$), this will refer to word weights from $s$ (resp. $s'$) in $W$.

Choose $\delta$ to be a real number such that $0 < \delta < 1$ and $\delta$ is smaller than any positive weight in $W$. Construct $W'$ by adding the following transitions for all $x \in \Sigma$:

$$s' \xrightarrow{\delta}_{x} t \qquad s' \xrightarrow{\delta}_{x} \bullet \qquad \bullet \xrightarrow{\delta}_{x} \bullet \qquad \bullet \xrightarrow{\delta}_{x} t,$$

where $\bullet$ is a new state. Consequently, for any $w \in \Sigma^+$, we get:

- the weight of $w$ in $W'$ from $s'$ is $f_{s'}(w) + \delta^{|w|}$,
- if $f_{s'}(w) > 0$ then $f_{s'}(w) > \delta^{|w|}$.

Lemma 51. $s$ is eventually big-O of $s'$ in $W$ if and only if $L_s(W) \setminus L_{s'}(W)$ is finite and $s$ is big-O of $s'$ in $W'$.

Proof. ($\Rightarrow$) Suppose $s$ is eventually big-O of $s'$ in $W$, i.e. there exist $C, k$ such that, for all $w \in \Sigma^{\ge k}$, $f_s(w) \le C f_{s'}(w)$. Note that, for $w \in \Sigma^{\ge k}$, this implies that, whenever $f_s(w) > 0$, we must also have $f_{s'}(w) > 0$. Consequently, $L_s(W) \setminus L_{s'}(W) \subseteq \Sigma^{<k}$ [...] $f_s(w) > 0$. Because $s$ is big-O of $s'$ in $W'$, there exists $C$ such that $f_s(w) \le C (f_{s'}(w) + \delta^{|w|})$ for any $w \in \Sigma^*$.

Let $w \in \Sigma^{\ge k}$. From $s$ being big-O of $s'$, we get $f_s(w) \le C (f_{s'}(w) + \delta^{|w|})$.

If $f_s(w) > 0$ then $f_{s'}(w) > 0$. By construction of $W'$, we get $f_{s'}(w) > \delta^{|w|}$, so

$$f_s(w) \le C (f_{s'}(w) + \delta^{|w|}) < C (f_{s'}(w) + f_{s'}(w)) = 2C f_{s'}(w).$$

If $f_s(w) = 0$ then we also have $f_s(w) = 0 \le 2C f_{s'}(w)$.

Consequently, for any $w \in \Sigma^{\ge k}$, $f_s(w) \le 2C f_{s'}(w)$, i.e. $s$ is eventually big-O of $s'$ in $W$.

The above argument relied on completing the automaton so that any word is accepted with some weight. To transfer our decidability results for bounded languages, it will be necessary to complete the automaton with respect to a bound, i.e. the extra weights are added only for words from $a_1^+ \cdots a_m^+$, $a_1^* \cdots a_m^*$, $w_1^* \cdots w_k^*$ respectively. This can be done easily by introducing the extra transitions according to a DFA for the bounding language.

E.1 Unambiguous Automata

Proof of Lemma 13. Let $W = \langle Q, \Sigma, M, F \rangle$ be a weighted automaton. Suppose $s, s' \in Q$, $t$ is a unique final state, and $W$ is unambiguous from $s, s'$.

If $W$ fails the LC condition (recall that it can be checked in polynomial time), we return no. Otherwise, let us construct a weighted automaton $W'$ through a restricted product construction involving two copies of $W$: for all $q_1, q_2, q_1', q_2' \in Q$, we add edges $(q_1, q_1') \xrightarrow{p}_{a} (q_2, q_2')$ provided $M(a)(q_1, q_2) > 0$ and $M(a)(q_1', q_2') > 0$, where $p = \frac{M(a)(q_1, q_2)}{M(a)(q_1', q_2')}$. Note that there exists a positively-weighted $w$-labelled path from $(s, s')$ to $(t, t)$ in $W'$ iff $w \in L_s(W) \cap L_{s'}(W)$. By the LC condition, this is equivalent to $w \in L_s(W)$, and, to examine the big-O problem, it suffices to consider only such words.

By unambiguity of $W$ from $s$ and $s'$, for any $w \in L_s(W)$, there can be exactly one positively-weighted $w$-labelled path from $(s, s')$ to $(t, t)$ in $W'$. Consequently, the product of weights along this path is equal to $f_s(w) / f_{s'}(w)$. Hence, $s$ is not big-O of $s'$ (for $W$) if and only if there exists a positively-weighted path from $(s, s')$ to $(t, t)$ in $W'$ that contains a cycle such that the product of the weights in that cycle is greater than 1.

Thus, to decide the big-O problem for $s, s'$, it suffices to be able to detect such cycles. This can be done, for instance, by a modified version of the Bellman-Ford algorithm [11] applied to the weighted directed graph consisting of the positively-weighted edges of $W'$. The algorithm is normally used to find negative cycles, i.e. cycles in which the sum of weights is negative. To adapt it to our setting, we can apply the logarithm function to the weights. However, to preserve rationality of weights and polynomial-time complexity, we cannot afford to do that explicitly. Instead, whenever $\log(x) < \log(y)$ would be tested, we test $x < y$ and, whenever $\log(x) + \log(y)$ would be computed, we compute $xy$ instead.

F Additional material for Section 5
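Returning briefly to the cycle test used in the proof of Lemma 13 (Appendix E.1), the following Python sketch shows one possible way to run a Bellman-Ford-style relaxation directly on products of rational weights, without taking logarithms. It is an illustration under assumptions: the function name, the edge-list encoding and the preliminary trimming to nodes that lie on some path from the source to the target are choices made here, not details given in the paper.

```python
from fractions import Fraction


def has_amplifying_cycle(edges, source, target):
    """Return True iff some path from `source` to `target` contains a
    cycle whose weights multiply to more than 1.
    `edges` is a list of (u, v, weight) with positive Fraction weights."""

    def reachable(start, adjacency):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adjacency.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    forward, backward = {}, {}
    for u, v, _ in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)

    # Keep only nodes that are on some source-to-target path.
    useful = reachable(source, forward) & reachable(target, backward)
    if source not in useful:
        return False
    edges = [(u, v, w) for u, v, w in edges if u in useful and v in useful]

    # Relax with products: `best[v]` is the largest product found so far.
    best = {source: Fraction(1)}
    for _ in range(len(useful) - 1):
        changed = False
        for u, v, w in edges:
            if u in best and (v not in best or best[u] * w > best[v]):
                best[v] = best[u] * w
                changed = True
        if not changed:
            return False
    # A further improvement after |V| - 1 rounds witnesses a cycle
    # whose product exceeds 1.
    return any(u in best and (v not in best or best[u] * w > best[v])
               for u, v, w in edges)


damped = [("u", "v", Fraction(3)), ("v", "u", Fraction(1, 4)),
          ("v", "t", Fraction(1))]
print(has_amplifying_cycle(damped, "u", "t"))  # False: cycle product is 3/4
```

Comparing and multiplying `Fraction` values plays the role of comparing and adding logarithms, which keeps all intermediate values rational.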

Lemma 52. Given $A_\varphi$, a representation of the value $\rho_\varphi$ can be found in polynomial time. This representation admits polynomial-time testing of $\rho_\varphi > \rho_{\varphi'}$ and $\rho_\varphi = \rho_{\varphi'}$ and can be embedded into the first-order theory of the reals.

Proof.

An algebraic number z can be represented as a tuple ( p z , a, b, r ) ∈ Q [ x ] × Q . Here p z is a polynomial over x and a, b, r form an approximation such that z is the only root of p z ( x ) with | z − ( a + bi ) | ≤ r .Then operations such as addition and multiplication of two algebraic numbers, ﬁnding | x | , testing if x > p, a, b, r ), yielding the same representation. Additionally given a polynomial, one can ﬁndthe representation of each of its roots in polynomial time (see e.g. [36]).Any coeﬃcient of the characteristic polynomial of an integer matrix can be found in GapL [23].

GapL is the diﬀerence of two L calls, each of which can be found in NC ⊆ P . Herethe matrix will be rational; but it can be normalised to an integer matrix by a scaler, the leastcommon multiple of the denominator of each rational. This number could be exponential,but representable in polynomial space. The ﬁnal eigenvalues can be renormalised by thisconstant.The characteristic polynomial of an n × n matrix has degree at most n , since eachcoeﬃcient can be found in polynomial time, the whole characteristic polynomial can befound in this time. Thus by enumerating its roots (at most n ), taking the modulus of each,and sorting them ( a > b ⇐⇒ a + − × b >

0) we can ﬁnd the spectral radius in this form( p z , a, b, r ).Note that the spectral radius is a real number, so that given the spectral radius in theform ( p z , a, b, r ) we actually have b = 0. Then the number can be encoded exactly in theﬁrst order theory of the reals using ∃ z : p z ( z ) = 0 ∧ z − a ≤ r ∧ a − z ≤ r . (cid:74) Proof of Lemma 21. (lower bound)

Let n ∈ N and suppose d s,t ( n ) = ( ρ , k ). Considerthe witnessing path in W , i.e. the length- n path from s to t that visits k + 1 SCCs ofspectral radius ρ and no SCC with a larger spectral radius. Let π = ϕ . . . ϕ k ∈ P ( s, t )be the corresponding sequence of SCCs visited by that path and let s i , e i (1 ≤ i ≤ k )be the entry and exit points (respectively into and out of ϕ i ) on that path. i.e. s = s , SCC ( s i ) = SCC ( e i ) = ϕ i (1 ≤ i ≤ k ), there is a transition (of positive weight) from e i to s i +1 and e k = t . We write ~s i , e i to represent the particular sequence of entry/exit points.Let us deﬁne a new unary weighted automaton W ~s i ,e i to be a restriction of W so thatthe only entry points to its SCCs are s i ’s and the only exit point are e i ’s, i.e. the weight isreduced to zero for any violating transition. Let D be the transition matrix of W ~s i ,e i .Clearly A ns,t ≥ D ns,t , since W ~s i ,e i is a restriction of W . Note that, in W ~s i ,e i , ρ ( s, t ) = ρ and k ( s, t ) = k , because all paths from s to t must visit k + 1 SCC’s with spectral radius ρ .Hence, by Theorem 18, D ns,t + D n +1 s,t + · · · + D n + T − s,t ≥ c ~s i ,e i ( ρ ) n n k , for some c ~s i ,e i >

0, where T is the local period from s to t in W ~s i ,e i . Next we shall show that D n +1 s,t + · · · + D n + T − s,t = 0,which will imply D ns,t ≥ c ~s i ,e i ( ρ ) n n k and, hence, A ns,t ≥ c ~s i ,e i ( ρ ) n n k .Let L be the length of the shortest path from s to t in W ~s i ,e i . Observe that paths from s to t in W ~s i ,e i can only have lengths from { L + n · T SCC ( s ) + · · · + n k · T SCC ( s k ) | n , . . . , n k ∈ N } and, thus, { L + n · gcd { T SCC ( s ) , . . . , T SCC ( s k ) } | n ∈ N } . As P ( s, t ) = { π } in W ~s i ,e i , T = gcd { T SCC ( s ) , . . . , T SCC ( s k ) } . Consequently, all paths from s to t in W ~s i ,e i have lengthsof the form L + nT . Hence, since D ns,t is positive, there are no paths which can contributepositive value to D n +1 s,t + · · · + D n + T − s,t .As c ~s i ,e i depends only on ~s i , e i , to ﬁnish the proof it suﬃces to take c to be the smallestamong the ﬁnitely many c ~s i ,e i . (upper bound) Let N ( ρ ,k ) = { n | d s,t ( n ) = ( ρ , k ) } . This gives a ﬁnite partition of N as S ( ρ,k ) N ( ρ,k ) . For each ( ρ , k ), we shall ﬁnd a value C ( ρ ,k ) so that, for n ∈ N ( ρ ,k ) , we have A ns,t ≤ C ( ρ ,k ) ( ρ ) n n k . Then, to have A ns,t ≤ Cρ ( n ) n n k ( n ) for all n ∈ N , it will suﬃce totake C to be the maximum over all C ( ρ ,k ) .Let us ﬁx ( ρ , k ). Consider W • to be W † in which, for every ( ρ, k ) ≤ ( ρ , k ), we mergethe states ( t, ρ, k ) into a single ﬁnal state t (recall there are no outgoing edges from t ). Letus rename the state ( s, ,

0) to s . Let E be the corresponding transition matrix of W • . Notethat all paths from s to t in W • go through at most k + 1 SCCs with spectral radius ρ . (cid:66) Claim 53.

For all n ∈ N ( ρ ,k ) , we have A ns,t = E ns ,t .Consider any path s → q → · · · → q m → t in W . There is a corresponding path in W • ,however the states q i are annotated as ( q i , ρ, k ), where ρ is the largest spectral radius seenso far, and k + 1 is the number of SCC’s of that radius number seen so far. The only pathsremoved are those terminating at ( t, ρ, k ) with ( ρ, k ) > ( ρ , k ). Since d s,t ( n ) = ( ρ , k ), weknow that no path visits more than k + 1 SCCs of spectral radius ρ , or an SCC of spectralradius greater than ρ . Consequently, no such path is disallowed in W • . No paths wereadded either. Because every SCC in W remains a strongly connected component in W • (duplicated with various ( ρ, k )) and its transition probability matrix (and hence the spectralradius) remains the same, we can conclude that A ns,t = E ns ,t . (cid:66) Claim 54.

There exists C ( ρ ,k ) such that A ns,t ≤ C ( ρ ,k ) ( ρ ) n n k .We have A ns,t = E ns ,t ≤ E ns ,t + E n +1 s ,t + · · · + E n + T ( s ,t ) − s ,t , where T ( s , t ) is the local periodbetween states s and t in W • . By Theorem 18, there exists C ( ρ ,k ) such that this quantityis bounded by C ( ρ ,k ) ( ρ ) n n k . Thus, for n ∈ N ( ρ ,k ) , we have A ns,t ≤ C ( ρ ,k ) ( ρ ) n n k . (cid:74) Proof of Lemma 23.

First we note some consequences of $d_{s,t}(n) \le d_{s',t}(n)$. Suppose $d_{s,t}(n) = (\rho,k)$ and $d_{s',t}(n) = (\rho',k')$. Thanks to Lemma 21, we have $f_s(a^n) \le \left(\frac{C}{c}\left(\frac{\rho}{\rho'}\right)^n n^{k-k'}\right) \cdot f_{s'}(a^n)$. If $d_{s,t}(n) \le d_{s',t}(n)$ we can distinguish two cases: either $(\rho,k) = (\rho',k')$ or $(\rho,k) < (\rho',k')$.

In the former case, $\left(\frac{\rho}{\rho'}\right)^n n^{k-k'} = 1$ and, thus, $f_s(a^n) \le \frac{C}{c} \cdot f_{s'}(a^n)$.

In the latter case, we have $\lim_{m\to\infty} \left(\frac{\rho}{\rho'}\right)^m m^{k-k'} = 0$ and, thus, $\left(\frac{\rho}{\rho'}\right)^m m^{k-k'} < 1$ for all sufficiently large $m$. Consequently, for all but finitely many $n$, we can conclude $f_s(a^n) \le \frac{C}{c} \cdot f_{s'}(a^n)$.

Thanks to the above analysis, if $d_{s,t}(n) \le d_{s',t}(n)$ holds for all but finitely many $n$, it follows that $f_s(a^n) \le \frac{C}{c} \cdot f_{s'}(a^n)$ for all but finitely many $n$. Moreover, the language containment condition implies that $f_s(a^n) \le C' \cdot f_{s'}(a^n)$ for some $C'$ in the remaining (finitely many) cases. Hence, $s$ is big-O of $s'$, which shows the right-to-left implication.

For the converse, recall that we have already established that “$s$ is big-O of $s'$” implies the language containment condition. For the remaining part, we reason by contraposition and suppose that there are infinitely many $n$ with $d_{s,t}(n) > d_{s',t}(n)$. As there are finitely many values in the range of $d_{s,t}$ and $d_{s',t}$, there exist $(\rho,k)$ and $(\rho',k')$ such that $(\rho,k) > (\rho',k')$ and, for infinitely many $n$, $d_{s,t}(n) = (\rho,k)$ and $d_{s',t}(n) = (\rho',k')$. For such $n$, Lemma 21 yields $f_s(a^n) \ge \left(\frac{c}{C}\left(\frac{\rho}{\rho'}\right)^n n^{k-k'}\right) \cdot f_{s'}(a^n)$. But $(\rho,k) > (\rho',k')$ implies
$$\lim_{m\to\infty} \left(\frac{\rho}{\rho'}\right)^m m^{k-k'} = \infty,$$
i.e. $\left(\frac{\rho}{\rho'}\right)^n n^{k-k'}$ is unbounded. Thus, $s$ cannot be big-O of $s'$. ◀

Proof of Lemma 25.

Let $M$ be a DFA accepting $L(\mathcal{N}_1) \cap \overline{L(\mathcal{N}_2)}$ obtained through standard automata constructions, i.e. $|M| \le 2^{|\mathcal{N}_1| + |\mathcal{N}_2|}$. Note that $L(\mathcal{N}_1) \overset{\sim}{\subset} L(\mathcal{N}_2)$ if and only if $L(M)$ is finite. Observe that $L(M)$ is infinite if and only if there exists $w \in L(M)$ with $|M| \le |w| \le 2|M|$. Consequently, violation of eventual inclusion can be detected by guessing $n \in \mathbb{N}$ such that $|M| \le n \le 2|M|$ and verifying $a^n \in L(M)$.

Even though $M$ is of exponential size, it is possible to verify $a^n \in L(M)$ in polynomial time. To this end, we use $\mathcal{N}_1, \mathcal{N}_2$ instead of $M$ and view their transition functions as matrices. Then one can verify the condition using fast matrix exponentiation (by squaring). Because the binary size of $n$ must be polynomial in $|\mathcal{N}_1| + |\mathcal{N}_2|$, the lemma follows. ◀

Proof of Lemma 26.
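Relating to the verification step in the proof of Lemma 25, the following is a minimal sketch of how membership of $a^n$ can be checked by Boolean matrix exponentiation without building the exponential-size DFA. The representation of a unary NFA as a triple (adjacency matrix, start state, set of final states) and all function names are assumptions made for illustration; they are not taken from the paper.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True iff a[i][k] and b[k][j] for some k."""
    size = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def bool_matpow(matrix, n):
    """n-th Boolean power of `matrix`, using O(log n) products (repeated squaring)."""
    size = len(matrix)
    result = [[i == j for j in range(size)] for i in range(size)]  # identity
    base = matrix
    while n > 0:
        if n & 1:
            result = bool_matmul(result, base)
        base = bool_matmul(base, base)
        n >>= 1
    return result


def accepts_unary(adjacency, start, finals, n):
    """True iff the unary NFA has a run of length n from `start` to a final state."""
    power = bool_matpow(adjacency, n)
    return any(power[start][q] for q in finals)


def violates_inclusion_at(nfa1, nfa2, n):
    """True iff a^n is accepted by nfa1 but not by nfa2."""
    return accepts_unary(*nfa1, n) and not accepts_unary(*nfa2, n)


# Hypothetical example automata: one accepts every a^n, the other only even n.
ALL_WORDS = ([[True]], 0, {0})
EVEN_WORDS = ([[False, True], [True, False]], 0, {0})
```

Since the number of matrix products is logarithmic in $n$, the check stays polynomial even when $n$ is exponential in the size of the two NFAs, which is the situation arising for the guessed witness.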

The left-to-right implication is clear. For the opposite direction, observe that, because the order on $X$ is total, $d(n) > d'(n)$ implies the existence of $x \in X$ such that $d(n) \ge x$ and $d'(n) < x$ (it suffices to take $x = d(n)$). Because $X$ is finite, $d(n) > d'(n)$ for infinitely many $n$ implies failure of $\{n \mid d(n) \ge x\} \overset{\sim}{\subset} \{n \mid d'(n) \ge x\}$ for some $x$. ◀

F.1 Hardness

▶ Theorem 55. The big-O problem is coNP-hard on unary Markov chains.

Let us first consider a particular form of unary NFAs.

▶ Definition 56. A unary NFA $\mathcal{N} = \langle Q, \to, q_s, F \rangle$ is in Chrobak normal form [9] if
$Q = S \uplus C_1 \uplus \dots \uplus C_m$ and $q_s \in S$;
$S = \{s_1, \dots, s_k\}$, $q_s = s_1 \in S$ and transitions between states from $S$ form a path $s_1 \xrightarrow{a} s_2 \xrightarrow{a} \dots \xrightarrow{a} s_k$;
$C_i = \{c^i_0, \dots, c^i_{|C_i|-1}\}$ ($1 \le i \le m$) and transitions between states from $C_i$ form a cycle $c^i_0 \xrightarrow{a} c^i_1 \xrightarrow{a} \dots \xrightarrow{a} c^i_{|C_i|-1} \xrightarrow{a} c^i_0$;
the remaining transitions connect the end of the path to each cycle: $s_k \xrightarrow{a} c^i_0$ for all $1 \le i \le m$.

Any unary NFA can be translated to this representation with at most quadratic blow-up in the size of the machine [9], and such a representation can be found in polynomial time [45, 32]. In addition, to simplify our arguments, we introduce a restricted Chrobak normal form, which requires that there is exactly one accepting state in each cycle. This restricted form can be found with at most a further quadratic blow-up over Chrobak normal form, by creating copies of cycles, one for each accepting state in the cycle.

Observe that $S \subseteq F$ is a necessary condition for the universality of a unary NFA in Chrobak normal form. Consequently, the universality problem for unary NFA in restricted Chrobak normal form such that $k = 1$ is already coNP-hard. This is the problem we are going to reduce from in the following.

Proof of Theorem 55.

Let $\mathcal{N} = \langle Q, \to, q_s, F \rangle$ be a unary NFA in restricted Chrobak normal form with $k = 1$. We will construct a unary Markov chain $\mathcal{M}$, depicted in Figure 7, with states $Q' = Q \cup \{s, u, v, t\}$, where $t$ is final. The branch starting from $s$, defined below, guarantees $f_s(a^n) = \Theta((\tfrac{1}{2})^n)$.
$$s \xrightarrow{1} u \qquad u \xrightarrow{1/2} u \qquad u \xrightarrow{1/2} t$$

[Figure 7: Reduction from NFA (left) to LMC (right)]

We take $s' = q_s$ and create a similar branch from $s'$, albeit with a smaller weight, to create paths of weight $\Theta((\tfrac{1}{4})^n)$ when reading $a^n$.
$$s' \xrightarrow{\frac{1}{m+1}} v \qquad v \xrightarrow{1/4} v \qquad v \xrightarrow{3/4} t$$
Moreover, we add weights to the original NFA transitions from $\mathcal{N}$ as follows:
$$s' \xrightarrow{\frac{1}{m+1}} c^i_0 \ (1 \le i \le m) \qquad c^i_{j \ominus 1} \xrightarrow{(\frac{1}{2})^{|C_i|}} c^i_j \ (c^i_j \in F) \qquad c^i_{j \ominus 1} \xrightarrow{1 - (\frac{1}{2})^{|C_i|}} t \ (c^i_j \in F) \qquad c^i_{j \ominus 1} \xrightarrow{1} c^i_j \ (c^i_j \notin F)$$
where $j \ominus 1 = (|C_i| + j - 1) \bmod |C_i|$. Note that the weights have been selected as if each letter were read with weight $\tfrac{1}{2}$ except for a bounded number of transitions, where the bound is $\max |C_i|$. Consequently, whenever there are accepting paths for $a^n$ in $\mathcal{N}$, their overall weight in $\mathcal{M}$ will be $\Theta((\tfrac{1}{2})^n)$.

It is easy to check that the reduction produces an LMC and can be carried out in polynomial time. In Appendix F we show that the reduction is correct. ◀

Proof of Theorem 55 continued.

It remains to argue that the reduction is correct.

If $\mathcal{N}$ is not universal, there exists $n$ such that $a^n$ is not accepted by $\mathcal{N}$. Because of the cyclic structure of Chrobak normal form, $a^{n_k}$ is not accepted either for $n_k = n + kq$, where $q = \operatorname{lcm}\{|C_1|, \dots, |C_m|\}$ and $k \in \mathbb{N}$. Then, by the earlier observations about growth, there exists $C > 0$ such that
$$\sup_k \frac{f_s(a^{n_k})}{f_{s'}(a^{n_k})} = \sup_k C \frac{(1/2)^{n_k}}{(1/4)^{n_k}} = \sup_k C\, 2^{n_k} = \infty,$$
i.e. $s$ is not big-O of $s'$.

If $\mathcal{N}$ is universal then, starting from $s'$ in $\mathcal{M}$, every word $a^n$ will have a path weighted $\Theta((\tfrac{1}{2})^n)$ as well as paths weighted $\Theta((\tfrac{1}{4})^n)$. Hence, there exists $C > 0$ such that
$$\sup_n \frac{f_s(a^n)}{f_{s'}(a^n)} \le \sup_n C \frac{(\frac{1}{2})^n}{(\frac{1}{2})^n + (\frac{1}{4})^n} \le C,$$
i.e. $s$ is big-O of $s'$. ◀

▶ Remark 57. We note that the branch via state $v$ is not strictly necessary, but it demonstrates that the problem is hard even if the LC condition is satisfied (i.e., “it can be the numbers that make the hardness”).

G Additional material for Section 6

G.1 Additional material for Example 29

Let p = 0 .

62 and then note that f s ( a m b n ) f s ( a m b n ) = 1 · . m · . · . n · . . · . m · . · . n · .

61 + 0 . · . m · . · . n · . m = n = 0 then f s ( a m b n ) f s ( a m b n ) = . For all larger m, n the ratio is smaller.To see that, when when p = 0 .

61 we have f s ( a n b . n ) f s ( a n b . n ) −−−−→ n →∞ ∞ , observe there is asolution to x with 0 . · . x < . · . x and 0 . · . x < . · . x , e.g. x = 0 .

66, then let m = xn and observe Whilst useful for illustration in this example, this eﬀect is not limitedto a linear relation between the characters, and so heavier machinery is required. G.2 Additional proofs (cid:66)

Claim 58. In the plus-letter-bounded cases, without loss of generality, we assume $a_i \ne a_j$ (for $i \ne j$).

Proof of Claim 58. We show how to reduce the big-O problem in the plus-letter-bounded case to the version of the same problem where $a_i \ne a_j$ (for $i \ne j$). Suppose, as above, $L_s(\mathcal{W}) \subseteq a_1^+ \cdots a_m^+$. We can assume $a_i \ne a_{i+1}$ ($1 \le i < m$) because $a^+ a^+$ can be replaced with $a^+$. If $\mathcal{W}$ had an $a$-labelled transition that can be used in two different blocks $a_i^+$ and $a_j^+$ ($j \ge i+2$) within $a_1^+ \cdots a_m^+$, then $a = a_i = a_j$ and this transition could be used to “skip” the block $a_{i+1}^+$, i.e., there would exist a word $w \in L_s(\mathcal{W})$ that does not use the $a_{i+1}^+$ block, contradicting $L_s(\mathcal{W}) \subseteq a_1^+ \cdots a_m^+$. Therefore, every transition can be associated with exactly one block. Define a fresh alphabet $\Sigma' = \{b_1, \dots, b_m\}$ and relabel each transition associated with the $i$th block by $b_i$. ◀

Proof of Lemma 32.
$$\begin{aligned}
f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) &= \left(M(a_1)^{n_1} \times M(a_2)^{n_2} \times \dots \times M(a_m)^{n_m}\right)_{s,t} \\
&= \sum_{q_1 \in Q} M(a_1)^{n_1}_{s,q_1} \left(M(a_2)^{n_2} \times \dots \times M(a_m)^{n_m}\right)_{q_1,t} \\
&\ \ \vdots \\
&= \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} M(a_1)^{n_1}_{s,q_1} \times M(a_2)^{n_2}_{q_1,q_2} \times \dots \times M(a_m)^{n_m}_{q_{m-1},t}
\end{aligned}$$
By Lemma 21 in the unary case, for each $M(a_i)^{n_i}_{q_{i-1},q_i}$, there is a $(\rho_{q_{i-1},q_i}, k_{q_{i-1},q_i})$, $c$, $C$, such that if $n_i > |Q|$
$$c\, \rho_{q_{i-1},q_i}^{n_i} n_i^{k_{q_{i-1},q_i}} \le M(a_i)^{n_i}_{q_{i-1},q_i} \le C\, \rho_{q_{i-1},q_i}^{n_i} n_i^{k_{q_{i-1},q_i}}.$$
Otherwise, if $n_i \le |Q|$, since there are at most $|Q|$ instances it is clear there exist $c, C$ with
$$c\, \delta^{n_i} \le M(a_i)^{n_i}_{q_{i-1},q_i} \le C\, \delta^{n_i}.$$
Take $c, C$ so that $C$ is maximised over all such $C$ and $c$ is minimised over all such $c$.
$$c^{m-1} \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} \rho_{s,q_1}^{n_1} n_1^{k_{s,q_1}} \cdot \ldots \cdot \rho_{q_{m-1},t}^{n_m} n_m^{k_{q_{m-1},t}} \le f_s(a_1^{n_1} a_2^{n_2} \dots a_m^{n_m}) \le C^{m-1} \sum_{(q_1,\dots,q_{m-1}) \in Q^{m-1}} \rho_{s,q_1}^{n_1} n_1^{k_{s,q_1}} \cdot \ldots \cdot \rho_{q_{m-1},t}^{n_m} n_m^{k_{q_{m-1},t}} \qquad (7)$$
By standard manipulations, if $(\hat\rho_i, \hat k_i) \le (\rho_i, k_i)$ for all $i$, then $\hat\rho_1^{n_1} n_1^{\hat k_1} \cdots \hat\rho_m^{n_m} n_m^{\hat k_m} + \rho_1^{n_1} n_1^{k_1} \cdots \rho_m^{n_m} n_m^{k_m} = \Theta(\rho_1^{n_1} n_1^{k_1} \cdots \rho_m^{n_m} n_m^{k_m})$ and, by sufficient modification of $C, c$, paths admitting $(\hat\rho_1, \hat k_1), \dots, (\hat\rho_m, \hat k_m)$ can be omitted.

Since the sum is finite, any two sums with the same $\rho, k$ values can be reduced to a single one, changing $c, C$ by a factor of two.

The remaining $(\rho, k)$ paths correspond exactly with $d_s(n_1, \dots, n_m)$. ◀

Proof of Lemma 34.

Let $\hat\rho = (\rho_1, k_1) \cdots (\rho_m, k_m)$. One can construct an automaton $\mathcal{N}^s_{\ge \hat\rho}$ with
$$L(\mathcal{N}^s_{\ge \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \exists \hat\sigma \in d_s(n_1, \dots, n_m)\ \hat\sigma \ge \hat\rho\}$$
by tracking the current maximum spectral radius seen and the number of different SCCs with this spectral radius. If the only states seen so far have been singletons with no loops (formally having spectral radius 0), the value should be tracked as $(\delta, 0)$ regardless of how many have been seen.

Passage from states reading $a_j$ to states reading $a_{j+1}$ is allowed only if the tracked value is at least $(\rho_j, k_j)$, and states should be final if the tracked value of $a_m$ is at least $(\rho_m, k_m)$.

Similarly, one can construct $\mathcal{N}^s_{> \hat\rho}$ with
$$L(\mathcal{N}^s_{> \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \exists \hat\sigma \in d_s(n_1, \dots, n_m)\ \hat\sigma > \hat\rho\}.$$
The construction is the same as for $\mathcal{N}^s_{\ge \hat\rho}$ except that, in order to accept, we need to be sure that at least one of the ‘at least’ comparisons was strict. This can be achieved by maintaining an extra bit at run time.

Note that $L(\mathcal{N}^s_{\ge \hat\rho}) \setminus L(\mathcal{N}^s_{> \hat\rho})$ contains all $a_1^{n_1} \dots a_m^{n_m}$ such that there exists $\hat\sigma \in d_s(n_1, \dots, n_m)$ with $\hat\sigma \ge \hat\rho$ and, for all $\hat\tau \in d_s(n_1, \dots, n_m)$, we do not have $\hat\tau > \hat\rho$. Consequently, we must have $\hat\rho \in d_s(n_1, \dots, n_m)$, which implies (by maximality) that we cannot have $\hat\tau > \hat\rho$ for any $\hat\tau \in d_s(n_1, \dots, n_m)$. Hence,
$$L(\mathcal{N}^s_{\ge \hat\rho}) \setminus L(\mathcal{N}^s_{> \hat\rho}) = \{a_1^{n_1} \dots a_m^{n_m} \mid \hat\rho \in d_s(n_1, \dots, n_m)\}.$$
Consequently, we can take $\mathcal{N}^s_{\hat\rho}$ to be the corresponding automaton.

Given $\mathcal{Y} \subseteq \mathcal{D}$, we can then take $\mathcal{N}^s_{\mathcal{Y}}$ to be the automaton corresponding to
$$\bigcap_{\hat\rho \in \mathcal{Y}} L(\mathcal{N}^s_{\hat\rho}) \cap \bigcap_{\hat\rho \in \mathcal{D} \setminus \mathcal{Y}} \left(a_1^+ \cdots a_m^+ \setminus L(\mathcal{N}^s_{\hat\rho})\right).$$
The relevant automaton $\mathcal{N}_{X,\mathcal{Y}}$ is then found by taking the intersection of $L(\mathcal{N}^s_X)$ and $L(\mathcal{N}^{s'}_{\mathcal{Y}})$. ◀

Proof of Lemma 35.

Consider the machine $\mathcal{N}_{X,\mathcal{Y}}$, accepting a language which is a subset of $a_1^+ a_2^+ \dots a_m^+$, with any state not reachable from the starting state or not leading to an accepting state removed. To induce a form with the property we want, we intersect $\mathcal{N}_{X,\mathcal{Y}}$ with the standard DFA for $a_1^+ a_2^+ \dots a_m^+$, without changing the language.

Hence every state corresponds to reading from exactly one character block of $a_1, a_2, \dots, a_m$. At each state there can be at most two characters enabled: either the character to remain in the current character block, or the character to move to the next. Every state can be labelled as only having transitions for $a_i$, or as also having transitions for $a_{i+1}$.

Consider all possible choices of automaton formed by restricting $\mathcal{N}_{X,\mathcal{Y}}$ so that there is a single state which is allowed to transition from $a_i$ to $a_{i+1}$ for each $i$, and any other state which had this property in $\mathcal{N}_{X,\mathcal{Y}}$ has its $a_{i+1}$ transitions removed (but keeps its $a_i$ transitions). Each such choice corresponds to a part of a partition of the accepting runs of $\mathcal{N}_{X,\mathcal{Y}}$.

Thus $L(\mathcal{N}_{X,\mathcal{Y}})$ is the finite union over the languages induced by all such machines. We further show that such machines can be expressed as a finite union of linear sets in the form prescribed.

Let us assume $\mathcal{N}^j_{X,\mathcal{Y}}$ is such a machine with a single state capable of transitioning from $a_i$ to $a_{i+1}$ for each $i$, and again remove any state not reachable from the starting state or not leading to an accepting state.
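The trimming step used twice above (removing states that are unreachable from the start state or cannot reach an accepting state) can be sketched as follows. This is only an illustration under assumed conventions: an automaton is given as a set of `(source, letter, target)` triples, and the names `reachable` and `trim` are hypothetical.

```python
from collections import deque


def reachable(starts, edges):
    """States reachable from `starts` in the graph `edges` (state -> set of successors)."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        state = queue.popleft()
        for successor in edges.get(state, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def trim(transitions, start, finals):
    """Keep only states that are reachable from `start` and can reach a final state."""
    forward, backward = {}, {}
    for source, _letter, target in transitions:
        forward.setdefault(source, set()).add(target)
        backward.setdefault(target, set()).add(source)
    useful = reachable([start], forward) & reachable(list(finals), backward)
    kept = {(p, a, q) for (p, a, q) in transitions if p in useful and q in useful}
    return useful, kept


# Hypothetical example: state 3 is a dead end and state 4 is unreachable.
EXAMPLE = {(0, "a1", 1), (1, "a2", 2), (0, "a1", 3), (4, "a2", 2)}
USEFUL_STATES, USEFUL_TRANSITIONS = trim(EXAMPLE, start=0, finals={2})
```

Trimming does not change the accepted language, since removed states cannot occur on any accepting run.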
The part of the machine reading $a_i$ has a single starting state and a single final state, which is a unary NFA when the transitions to $a_{i+1}$ are discarded.

This unary NFA can be converted to Chrobak normal form; the section of $\mathcal{N}^j_{X,\mathcal{Y}}$ corresponding to $a_i$ can be replaced with this unary NFA, and any accepting state additionally has the transitions for moving from $a_i$ to $a_{i+1}$ of the single such state in $\mathcal{N}^j_{X,\mathcal{Y}}$.

Let us repeat the process above for all $i$, decomposing $\mathcal{N}^j_{X,\mathcal{Y}}$ into the subsets of languages where there is exactly one state transitioning from $a_i$ to $a_{i+1}$. Let $\mathcal{N}^j_{X,\mathcal{Y}} = \bigcup_k \mathcal{N}^{j,k}_{X,\mathcal{Y}}$, a finite union, where each $k$ corresponds to a selection of accepting states $(q_1, \dots, q_m)$ with $q_l$ being the accepting state in the Chrobak normal form for $a_l$.

Consider such an $\mathcal{N}^{j,k}_{X,\mathcal{Y}}$. The steps spent in each block corresponding to $a_i$ are formed either by the finite path or by the single cycle at the end of the path. If the transition occurs in the finite path then $b^k_i$ is the length of the path to that transition and $r^k_i$ is zero. If the transition occurs in the cycle at the end of the path, then $b^k_i$ is the length of the path to that transition from the start of the path and $r^k_i$ is the length of the cycle. In $\mathcal{N}^{j,k}_{X,\mathcal{Y}}$ the time spent in block $a_i$ has no influence on the time spent in $a_j$ for $j \ne i$. Then
$$L(\mathcal{N}^{j,k}_{X,\mathcal{Y}}) = \{a_1^{n_1} a_2^{n_2} \dots a_m^{n_m} \mid \exists \vec\lambda \in \mathbb{N}^m \text{ s.t. } \forall i \in [m]\ n_i = b^k_i + r^k_i \cdot \lambda_i\}.$$
The language $L(\mathcal{N}_{X,\mathcal{Y}})$ is the union over all $L(\mathcal{N}^{j,k}_{X,\mathcal{Y}})$. ◀

Proof of Lemma 37.

By Lemma 36 we have s is not big-O of s if and only if there exist X ∈ D , Y ⊆ D , 1 ≤ k ≤ S X, Y such that ∀ C ∃ ~λ ∈ N m ^ j ∈ h Y m X i =1 α j,i ( b ki + r ki λ i ) + p j,i log( b ki + r ki λ i ) < C. (8)First we argue that we can restrict to some subset of the components which enable thesatisfying choice of λ to be suﬃciently large in all components. By DFA we permit a partial transition function, that is 0 or 1 transition for each character from everystate, rather than exactly 1. (cid:66)

Claim 59.

Equation (8) holds if and only if the following holds for some U ⊆ [ m ]: ∀ C ∃ ~λ ∈ N U ≥ max i b ki ^ j ∈ h Y X i ∈ U α j,i · ( b ki + r ki · λ i ) + p j,i log( b ki + r ki · λ i ) < C (9) Proof of Claim 59.

First note that Equation (9) immediately implies Equation (8). We showthe converse.Recall we can alternatively characterise the formulation as a sequence n : N → N m . Thatis, for each negative integer C , the choice of ~λ corresponds to n ( C ) in the sequence.Note that in the sequence n some components may be bounded. Either because r ki = 0, orthe choice of n makes it so. Suppose there exists a θ > n ( t ) x ≤ θ for some x ∈ [ m ],then P mi =1 α j,i · n ( t ) i + p j,i log( n ( t ) i ) ≤ P mi =1 ,i = x α j,i · n ( t ) i + p j,i log( n ( t ) i ) + | α j,i | · θ + | p j,i | θ .Hence the sequence P mi =1 ,i = x α j,i · n ( t ) i + p j,i log( n ( t ) i ) goes to −∞ as well.Consider each choice of components B ⊆ [ m ] which will be bounded. For some componentsthere will be no choice as r ki = 0. Let us assume that the chosen set is maximal with respectto set-inclusion; that is, there should be no subsequence maintaining the property with fewercomponents unbounded. Let the remaining unbounded components be U = [ m ] \ B .Since each remaining component is not bounded, there is always a later point in thesequence in which the value is larger; thus one can take a subsequence of n ( t ) so that n ( t ) i ≤ n ( t + 1) i for every t . Repeat for every remaining component i ∈ U ; this can be doneas the minimal choice of unbounded components has been selected. Hence, without loss ofgenerality if there exists some sequence, then for any θ , there exists a subsequence of n ( t ),such that n ( t ) i > θ for all i ∈ U . To enable a more succinct analysis later, restrict n ( t ) tothose in which λ i ≥ max i b ki where n ( t ) i = b ki + r ki · λ i for some λ i . (cid:67) Next we argue that the oﬀset component ~b does not aﬀect whether the formula holdsand that we can relax the restriction of ~λ from naturals to positive reals and maintain thesatisﬁability of the formula. 
The advantage here is that this relaxation can be solved withthe ﬁrst order theory of the reals with exponential function; which is decidable subject toSchanuel’s conjecture. (cid:66) Claim 60. ∀ C ∃ ~λ ∈ N U ≥ max i b ki ^ j ∈ h Y X i ∈ U α j,i · ( b ki + r ki · λ i ) + p j,i log( b ki + r ki · λ i ) < C (10)holds if and only if the following holds: ∀ C ∃ ~x ∈ R U ≥ max i b ki ^ j ∈ h Y X i ∈ U α j,i · r ki · x i + X i ∈ U p j,i log( x i ) < C (11) Proof of Claim 60.

Observe that X i ∈ U α j,i · ( b ki + r ki · λ i ) = X i ∈ U α j,i · b ki + X i ∈ U α j,i · r ki · λ i . Chistikov, S. Kiefer, A. S. Murawski and D. Purser 39 and that P i ∈ U α j,i · b ki is constant so it does not aﬀect whether the sequence goes to −∞ ,hence Equation (10) holds if and only if : ∀ C ∃ ~λ ∈ N U ≥ max i b ki ^ j ∈ h Y X i ∈ U α j,i · r ki · λ i + p j,i log( b ki + r ki · λ i ) < C (12)Now let us extract the log component by using the following rewritinglog( b ki + r ki · λ i ) = log( λ i · ( b ki λ i + r ki )) = log( λ i ) + log( b ki λ i + r ki ) . Since r ki ≥ λ i ≥ b ki we have log( b ki λ i + r ki ) ≤ log( r ki + 1), which is constant. HenceEquation (10) is equivalent to: ∀ C ∃ ~λ ∈ N U ≥ max i b ki ^ j ∈ h Y X i ∈ U α j,i · r ki · λ i + X i ∈ U p j,i log( λ i ) < C (13)We now show that this is equivalent to Equation (11). Clearly Equation (13) impliesEquation (11). Now consider Equation (11) holding, and we show the Equation (13) issatisﬁed, by exhibiting a choice of ~λ ∈ N U ≥ max i b ki for every C .Given C <

0, let C = C − max j P i ∈ U | α j,i | r ki − max j P i ∈ U | p j,i | , and choose ~x ∈ R | U |≥ max i b ki satisfying Equation (11).Now let x i = λ i + y i , with y i < , λ i = b x i c . First observe that since x i ≥ max i b ki , aninteger, also λ i ≥ max i b ki .Observe that | P i ∈ U α j,i · r ki · y i | ≤ P i ∈ U | α j,i | r ki . Since X i ∈ U α j,i · r ki · λ i + X i ∈ U α j,i · r ki · y i + X i ∈ U p j,i log( λ i + y i ) < C we have X i ∈ U α j,i · r ki · λ i + X i ∈ U p j,i log( λ i + y i ) < C + X i ∈ U | α j,i | r ki Let us again rewrite log( λ i + y i ) = log( λ i (1 + y i λ i )) = log( λ i ) + log(1 + y i λ i ). Then since λ i > y i ,log(1 + y i λ i ) ≤

1, so | X i ∈ U p j,i log(1 + y i λ i ) | ≤ X i ∈ U | p j,i | . We thus have X i ∈ U α j,i · r ki · λ i + X i ∈ U p j,i log( λ i ) < C + X i ∈ U | α j,i | r ki + X i ∈ U | p j,i | ≤ C and hence, Equation (13) holds. (cid:67)(cid:74) G.3 The bounded case

Proof of Lemma 40.

Let $\mathcal{W} = \langle Q, \Sigma, M, F \rangle$. Then we have $w_1, \dots, w_m$ such that for all $w$ with $f_s(w) > 0$, $w = w_1^{n_1} \dots w_m^{n_m}$ for some $n_1, \dots, n_m \in \mathbb{N}$. Let us assume $w_i = b_{i,1} b_{i,2} \dots b_{i,|w_i|}$.

Given a word $w$, there may be multiple paths $\pi_1, \pi_2, \dots$ from $s$ to $t$ respecting that word. Further, there may be multiple decomposition vectors $\vec{n}_1, \vec{n}_2, \dots \in \mathbb{N}^m$ such that $\vec{n}_i = (n_1, \dots, n_m)$ and $w = w_1^{n_1} \dots w_m^{n_m}$. Our goal will be to construct a weighted automaton $\mathcal{W}'$ with states $\hat{s}$ and $\hat{s}'$, letter-bounded over $a_1^* \dots a_m^*$, such that, for every word $w$, the weight of $a_1^{n_1} \dots a_m^{n_m}$ in $\mathcal{W}'$ from $\hat{s}$ (resp. $\hat{s}'$), for every valid decomposition vector $\vec{n} \in \mathbb{N}^m$ of $w$, will be the sum of the weights of all paths $\pi_1, \pi_2, \dots$ respecting $w$ in $\mathcal{W}$ from $s$ (resp. $s'$). To compute $\mathcal{W}'$, we will define a transducer and apply it to our automaton $\mathcal{W}$.

A nondeterministic finite transducer is an NFA with transitions labelled by pairs from $\Sigma \times (\Sigma' \cup \{\epsilon\})$. In our construction, we only require edges of this form, i.e. we do not consider a definition with transitions labelled with $\epsilon$ in the first component (e.g. $\epsilon/a$). Our transducer induces a translation $\mathcal{T} : \Sigma^* \to \Sigma'^*$.

Consider the set of regular expressions $w_{i_1}^+ \dots w_{i_{m'}}^+$, each induced by a sequence $\vec{\imath} = (i_1, \dots, i_{m'}) \in \mathbb{N}^{m'}$, $m' \le m$, with $1 \le i_1 < \dots < i_{m'} \le m$. Note that two sequences $(i_1, \dots, i_{m'})$, $(i'_1, \dots, i'_{m''})$ may yield the same expression $w_{i_1}^+ \dots w_{i_{m'}}^+$, in which case we need not consider more than one. The transducer $\mathcal{T}$ will be defined as follows.

For each $\vec{\imath} = (i_1, \dots, i_{m'})$ described above, build the following automaton. For each $i_j$, construct the following section, which simply reads the word $w_{i_j}$:
$$f^{\vec{\imath}}_j \xrightarrow{b_{i_j,1}/\epsilon} s^{\vec{\imath}}_j \xrightarrow{b_{i_j,2}/\epsilon} \cdot \xrightarrow{b_{i_j,3}/\epsilon} \cdot\ \dots\ \cdot \xrightarrow{b_{i_j,|w_{i_j}|-1}/\epsilon} e^{\vec{\imath}}_j.$$
Then, on the final character, nondeterministically restart or move to the next word, emitting a character representing the word:
$$e^{\vec{\imath}}_j \xrightarrow{b_{i_j,|w_{i_j}|}/a_{i_j}} f^{\vec{\imath}}_j \quad\text{and}\quad e^{\vec{\imath}}_j \xrightarrow{b_{i_j,|w_{i_j}|}/a_{i_j}} f^{\vec{\imath}}_{j+1}.$$
The transducer $\mathcal{T}$ is defined by the union of the above transitions over all $\vec{\imath}$. We also add a global start state $q_0$, from which we would like to move nondeterministically to $f^{\vec{\imath}}_1$ for each $\vec{\imath}$. To achieve this and avoid $\epsilon$ transitions, we duplicate the transitions $f^{\vec{\imath}}_1 \xrightarrow{x} s^{\vec{\imath}}_1$ with $q_0 \xrightarrow{x} s^{\vec{\imath}}_1$.

Observe that the valid output sequences are $(\epsilon^* a_1)^* (\epsilon^* a_2)^* \dots (\epsilon^* a_m)^*$. However, there can be only a finite number of $\epsilon$'s in a row, at most $r = \max_{1 \le i \le m} |w_i| - 1$.

Let $\mathcal{W} = \langle Q, \Sigma, M, \{t\} \rangle$ and $\mathcal{T} = \langle Q', \Sigma \times (\Sigma' \cup \{\epsilon\}), \to, q_0 \rangle$. Then construct the weighted automaton $\mathcal{T}(\mathcal{W}) = \langle Q \times Q', \Sigma', M_{\mathcal{T}}, \{t\} \times Q' \rangle$ using a product construction. The probability is associated in the following way: $M_{\mathcal{T}}(a)((s,q),(s',q')) = p$ if there is a transition $q \xrightarrow{b/a} q'$ in $\mathcal{T}$ and $s \xrightarrow{p}_b s'$ in $\mathcal{W}$. Note that, by this definition, there is a matrix $M_{\mathcal{T}}(\epsilon)$; however, in every run of $\mathcal{T}(\mathcal{W})$ at most $r$ many $\epsilon$'s in a row are produced, where $r = \max_{1 \le i \le m} |w_i| - 1$.

Let $\mathcal{W}'$ be a copy of $\mathcal{T}(\mathcal{W})$ with $\epsilon$ removed: $M'(a_i) = \left(\sum_{x=0}^{r} M_{\mathcal{T}}(\epsilon)^x\right) M_{\mathcal{T}}(a_i)$. Then $f_{\mathcal{W}}(w) = f_{\mathcal{W}'}(a_1^{n_1} \dots a_m^{n_m})$ for all $n_1, \dots, n_m$ such that $w = w_1^{n_1} \dots w_m^{n_m}$. Hence, $\mathcal{W}'$ is a weighted automaton with letter-bounded languages from $(s, q_0)$ and $(s', q_0)$ such that $(s, q_0)$ is big-O of $(s', q_0)$ in $\mathcal{W}'$ if and only if $s$ is big-O of $s'$ in $\mathcal{W}$.
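The $\epsilon$-removal step at the end of the proof can be illustrated with a small sketch. It computes $\left(\sum_{x=0}^{r} M_\epsilon^x\right) M_a$ for square matrices with exact rational entries; the matrix representation (lists of lists of `Fraction`), the function names and the three-state example are assumptions for illustration only, not part of the construction in the paper.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices with Fraction entries."""
    size = len(a)
    return [[sum((a[i][k] * b[k][j] for k in range(size)), Fraction(0))
             for j in range(size)] for i in range(size)]


def mat_add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def identity(size):
    return [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]


def remove_epsilon(m_eps, m_letter, r):
    """Return (sum_{x=0}^{r} m_eps^x) * m_letter: up to r epsilon steps, then one letter."""
    size = len(m_eps)
    prefix_sum = identity(size)  # the x = 0 term
    power = identity(size)
    for _ in range(r):
        power = mat_mul(power, m_eps)
        prefix_sum = mat_add(prefix_sum, power)
    return mat_mul(prefix_sum, m_letter)


# Hypothetical example: 0 --eps, 1/2--> 1, 1 --a, 1/3--> 2, 0 --a, 1/4--> 2, with r = 1.
ZERO = Fraction(0)
M_EPS = [[ZERO, Fraction(1, 2), ZERO], [ZERO] * 3, [ZERO] * 3]
M_A = [[ZERO, ZERO, Fraction(1, 4)], [ZERO, ZERO, Fraction(1, 3)], [ZERO] * 3]
M_PRIME_A = remove_epsilon(M_EPS, M_A, r=1)
```

In this example the entry from state 0 to state 2 combines the direct $a$-step with the path that first takes one $\epsilon$-step, giving $\tfrac{1}{4} + \tfrac{1}{2} \cdot \tfrac{1}{3}$. Truncating the sum at $r$ is sound only because, as noted in the proof, no run produces more than $r$ consecutive $\epsilon$'s.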