An Incompressibility Theorem for Automatic Complexity
Bjørn Kjos-Hanssen [email protected]
Department of Mathematics, University of Hawai‘i at Mānoa, Honolulu, HI 96822, USA

January 9, 2020
Abstract
Shallit and Wang showed that the automatic complexity A(x) satisfies A(x) ≥ n/13 for almost all x ∈ {0,1}^n. They also stated that Holger Petersen had informed them that the constant 13 can be reduced to 7. Here we show that it can be reduced to 2 + ǫ for any ǫ > 0. As an application, we show that a natural decision problem associated with automatic complexity is immune to SAC (AC with semi-unbounded fan-in).

Introduction

Kolmogorov's structure function for a word x is intended to provide a statistical explanation for x. We focus here on a computable version, the automatic structure function h_x. For definiteness, suppose x is a word over the alphabet {0,1}. By definition, h_x(m) is the minimum number of states of a finite automaton that accepts x and accepts at most 2^m many words of length |x|. The best explanation for the word x is then an automaton witnessing a value of h_x that is unusually low, compared to the values of h_y for most other words y of the same length. To find such explanations we would like to know the distribution of h_x for random x. In the present paper we take a step in this direction by studying the case h_x(0), known as the automatic complexity of x.

The automatic complexity of Shallit and Wang [10] is the minimal number of states of an automaton accepting only a given word among its equal-length peers. Here we show that this complexity exhibits an incompressibility phenomenon similar to that of Kolmogorov complexity for Turing machines, first studied in [7, 8, 12, 13]. Automatic complexity is denoted A(x), and its nondeterministic version A_N(x).

In the theory of algorithmic randomness, the Levin–Schnorr theorem says that a sequence X ∈ {0,1}^ω is Martin-Löf random iff K(X↾n) ≥ n − O(1). Here we show an analogue of one direction: most words x ∈ {0,1}^n satisfy A_N(x)/(n/2) ≥ 1 − ǫ. However, since it concerns finite words, ours is a closer analogue of the result that most words are incompressible in terms of Kolmogorov complexity C. As Solomonoff and Kolmogorov observed, for each n there is a word σ ∈ {0,1}^n with C(σ) ≥ n.
Indeed, each word σ with C(σ) < n uses up a description of length < n, and there are at most Σ_{k=0}^{n−1} 2^k = 2^n − 1 < 2^n = |{0,1}^n| of those. Similarly, we have:

Lemma 1 (Solomonoff, Kolmogorov). For each nonnegative integer n, there are at least 2^n − (2^{n−k} − 1) binary words σ of length n such that C(σ) ≥ n − k.

Proof. Each word σ with C(σ) < n − k uses up at least one of the at most 2^{n−k} − 1 descriptions of length < n − k, leaving at least |{0,1}^n| − (2^{n−k} − 1) words σ that must have C(σ) ≥ n − k.

As automatic complexity is a kind of length-conditional complexity, it is worth noting that the same argument as in the proof of Lemma 1 works word-for-word for length-conditional complexity: we are only ever considering words of the fixed length n.

Shallit and Wang connect their work with Kolmogorov complexity in the following result.
Theorem 2 (Shallit and Wang [10, proof of Theorem 8]). For all binary words x, C(x) ≤ A(x) + 18 + 3 log(|x|).

They mention ([10, proof of Theorem 8]), without singling it out as a lemma, the following result.
Lemma 3. C(x) ≥ |x| − log|x| for almost all x.

And they deduce:
Theorem 4 (Shallit and Wang [10, Theorem 8]). For almost all x ∈ {0,1}^n we have A(x) ≥ n/13.

Shallit and Wang did not formally define the phrase almost all, so we give it here. It is also known by the phrase natural density 1.

Definition 5.
A set of strings S ⊆ {0,1}* contains almost all x ∈ {0,1}^n if

lim_{n→∞} |S ∩ {0,1}^n| / 2^n = 1.

Proof of Lemma 3.
Let S = {x ∈ {0,1}* : C(x) ≥ |x| − log|x|}. By Lemma 1,

lim_{n→∞} |S ∩ {0,1}^n| / 2^n ≥ lim_{n→∞} (2^n − (2^{n−log n} − 1)) / 2^n = lim_{n→∞} 1 − (1/n − 2^{−n}) = 1.

Our main result, Theorem 34, is that for all ǫ > 0, A(x) ≥ n/(2 + ǫ) for almost all words x ∈ {0,1}^n. Analogously, one way of expressing the Solomonoff–Kolmogorov result is:

Proposition 6.
For each ǫ > 0, the following statement holds: C(x) ≥ |x|(1 − ǫ) for almost all x ∈ {0,1}^n.

Alphabet size.
If we allow the alphabet size to vary with n and be larger than n, then clearly almost all words will be square-free and hence ([10, Theorem 8], [3]) have the greatest possible nondeterministic automatic complexity A_N(x) = ⌊n/2⌋ + 1. Suppose we consider our alphabet to be the set of all real numbers, ℝ. Clearly the square-free words (x_1, …, x_n) ∈ ℝ^n have full Lebesgue measure for each n. However, the non-square-free words have Hausdorff dimension n − 1.

A natural decision problem associated to automatic complexity can be defined as follows.
Definition 7.
The decision problem C has as input a word x, and as output the most significant bit of A_N(x). Recall that
SAC is a semi-unbounded fan-in version of AC, whereby we have unbounded fan-in only for OR gates (and not for AND gates). A literature review for C goes as follows (see Figure 1).

• C ∈ NP (Hyde and Kjos-Hanssen 2015 [3]).
• Shallit and Wang [10] asked in effect whether C ∈ P.
• Kjos-Hanssen [5] showed in effect that C ∉ CFL.

Our results in this paper, showing that most words have complexity close to the maximum, can be used to show C ∉ SAC and C ∉ co-SAC. (SAC = co-SAC was shown in [2].) In fact, we can show:

[Figure 1 diagram: known relations among PP, SAC, co-SAC, CFL, co-CFL, NC, DCFL, and the problem C; not reproduced.]

Figure 1: Known implications denoted by solid arrows, non-implications by dashed arrows.
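Although no efficient algorithm for C is known, A_N itself can be computed by exhaustive search for very short words. The following minimal sketch (our own illustration, independent of the paper's code [4]; all function names are ours) enumerates every small NFA over {0,1} with initial state 0 and a single final state, and returns the least number of states witnessing A_N(x) in the sense of Definition 10 below.

```python
from itertools import combinations, product

def count_accepting_paths(delta, q, n, qf):
    """Number of accepting paths of length n from state 0 to state qf,
    counting a path as a (state sequence, symbol sequence) pair."""
    paths = [0] * q
    paths[0] = 1
    for _ in range(n):
        new = [0] * q
        for s in range(q):
            if paths[s]:
                for b in "01":
                    for t in delta[(s, b)]:
                        new[t] += paths[s]
        paths = new
    return paths[qf]

def accepts(delta, x, qf):
    """Does the NFA (initial state 0, final state qf) accept the word x?"""
    current = {0}
    for b in x:
        current = {t for s in current for t in delta[(s, b)]}
    return qf in current

def A_N(x, max_q=3):
    """Nondeterministic automatic complexity of x by brute force: the least
    q such that some q-state NFA accepts x and has exactly one accepting
    path of length len(x).  Feasible only for tiny max_q."""
    n = len(x)
    for q in range(1, max_q + 1):
        slots = [(s, b) for s in range(q) for b in "01"]
        subsets = [frozenset(c) for r in range(q + 1)
                   for c in combinations(range(q), r)]
        for choice in product(subsets, repeat=len(slots)):
            delta = dict(zip(slots, choice))
            for qf in range(q):
                if (accepts(delta, x, qf)
                        and count_accepting_paths(delta, q, n, qf) == 1):
                    return q
    return None  # A_N(x) > max_q
```

For example, A_N("0000") returns 1 and A_N("011") returns 2, consistent with the general upper bound A_N(x) ≤ ⌊n/2⌋ + 1 mentioned above.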
Theorem 8. C is SAC-immune and co-SAC-immune.

Proof sketch. Suppose one of the disjuncts in our SAC formula gives a way to determine C(x) based on the bits p_1 < ⋯ < p_c of the input, where c is a constant that does not depend on n.

To show C is co-SAC-immune, we show that the bits p_i cannot guarantee low complexity, i.e., C(x) = 0. This follows from our results in the present paper: most words have complexity close to the maximum, so we can randomly fill in the remaining n − c bits so as to, with high probability, force C(x) = 1.

To show C is SAC-immune, we show that the bits p_i cannot guarantee high complexity, i.e., C(x) = 1. For this we modify a construction from Shallit and Wang [10, Theorem 17]. They showed that A(a_1^m ⋯ a_k^m) = O(m^{1−1/k}), where the constant in O may depend on k. Our modification (Theorem 9) involves hardcoding some bits in between the powers a_i^m.

Let d_i = p_{i+1} − p_i. Fix d > 2. We divide the interval [1, n] into cd equally spaced blocks of size n/(cd). There will be up to c many blocks where some bits are hardcoded, namely where d_i < n/(cd). This adds up to

Σ_i |σ_i| ≤ c · n/(cd) = n/d

many hardcoded states. For the rest, we get complexity O((n/(cd))^{1−1/(cd)}), where the constant may depend on k = cd. Since (n/(cd))^{1−1/(cd)} = o(n) for any fixed c and d, the total is at most n/d + o(n), and so in the limit we are forcing C(x) = 0.

Theorem 9.
For any symbols a_1, …, a_k and words σ_0, …, σ_k,

A_N(σ_0 a_1^m σ_1 a_2^m σ_2 ⋯ σ_{k−1} a_k^m σ_k) = O(m^{1−1/k}) + Σ_i |σ_i|,

where the constant in O may depend on k.

Automatic complexity, introduced by [10], is an automata-based and length-conditional analogue of CD complexity ([11]), which is in turn a computable analogue of the noncomputable Kolmogorov complexity. The nondeterministic case was taken up by [3], who gave a table of the number of words of length n of a given complexity q for small values of n.

Definition 10 ([10]). A nondeterministic finite automaton (NFA) M = (Q, Σ, δ, q_0, F) over the alphabet Σ consists of a set of states Q, a transition function δ mapping each (q, b) ∈ Q × Σ to a subset of Q, an initial state q_0, and a set of final states F ⊆ Q, which for our purposes may be assumed to be a singleton F = {q_f}. The set of words accepted by M is L(M). An NFA has the structure of a directed graph, with set of vertices Q and edges (s, t) whenever t ∈ δ(s, b) for some b. For our purposes, b may be assumed to be unique when it exists, given s and t. A path is a sequence of edges in the same direction, (s_i, s_{i+1}), 0 ≤ i < k. The nondeterministic automatic complexity A_N(x) of a word x ∈ Σ^n is the minimal number of states of an NFA M accepting x such that there is only one accepting path in M of length n.

Definition 11. A directed cycle in the NFA M is a nonempty sequence of edges (s_i, s_{i+1}), 0 ≤ i < k, such that s_0 = s_k, the states s_0, …, s_{k−1} are distinct, and for each i there is a b with s_{i+1} ∈ δ(s_i, b). A cycling state sequence in an automaton is a sequence of states s_0, …, s_k such that s_k = s_0 and such that for all i there exists b ∈ Σ with s_{i+1} ∈ δ(s_i, b).

Definition 12.
Two directed cycles are adjacent if they share at least one vertex. A tree of directed cycles is a digraph such that the set of its directed cycles forms a tree under the relation of adjacency.
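As a small illustration (our own sketch, not the paper's code [4]; function names are ours), one can enumerate the directed cycles of a tiny digraph and check the tree condition of Definition 12 mechanically. Here cycles are represented by their vertex sets, which suffices for testing adjacency.

```python
def directed_cycles(edges):
    """Vertex sets of the directed simple cycles of a digraph,
    found by brute-force DFS (fine for tiny digraphs)."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, []).append(t)
    cycles = set()

    def dfs(start, v, path):
        for w in adj.get(v, []):
            if w == start:
                cycles.add(frozenset(path))      # closed a cycle
            elif w > start and w not in path:    # start = minimal vertex
                dfs(start, w, path + [w])

    for start in sorted(adj):
        dfs(start, start, [start])
    return cycles

def is_tree_of_cycles(cycles):
    """Do the cycles form a tree under adjacency (sharing a vertex)?
    A tree on n nodes is connected and has exactly n - 1 edges."""
    cyc = list(cycles)
    n = len(cyc)
    links = {(i, j) for i in range(n) for j in range(i + 1, n)
             if cyc[i] & cyc[j]}
    if n <= 1:
        return True
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for a, b in links:
            for u, v in ((a, b), (b, a)):
                if u == i and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return len(seen) == n and len(links) == n - 1

# Two adjacent cycles (0,1,2) and (2,3), sharing vertex 2, form a tree.
example = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 2)]
```

Note that oppositely oriented cycles on the same vertex set are collapsed by this representation; for the adjacency test that is harmless.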
For an automaton generated by a witnessing path, there is a subtree of ω^{<ω} whose nodes are such cycles, giving an order of traversal as well. The order is depth-first. Cycles completed later are associated with substrings (prefixes) if the cycles are adjacent, and siblings to the right otherwise, in the following manner:

Definition 13.
At the stage when a cycle C is completed, it is given code ⟨k + 1⟩, or code ⟨1⟩ if no codes exist yet, where ⟨1⟩, …, ⟨k⟩ already exist and are disjoint from C. The adjacent cycles of C are then dubbed its children and given new labels σ ↦ ⟨k + 1⟩⌢σ, and similarly for their descendants τ ↦ ⟨k + 1⟩⌢τ.

Stage s of a computation of an NFA on a given input occurs when s symbols have been read. Let f_s(σ) for σ ∈ ω^{<ω} be the directed cycle at σ in the tree at stage s, if defined, as in Definition 13.

Example 14.
In Figure 2, where the numbering of states is written in hexadecimal notation, we have a witnessing automaton for A_N(x) ≤ 11 for a word x with |x| = 22. Powers of short subwords, with fractional exponents, are exploited here. Ignoring the last three moves, we have 19 moves to make to get from q_0 back to q_0, and the possible sequences of states are given by the solutions to the equation

19 = 6x + 6y + 3z + 2w

with the constraints

z > 0 ⇒ y > 0, w > 0 ⇒ y > 0, y > 0 ⇒ x > 0.

The only solution is x = 1, y = 1, z = 1, and w = 2. Figure 2 shows the correspondence between times, states, and the tree of directed cycles. The final tree f_22 consists of directed cycles of lengths 6, 6, 3, and 2.

Definition 15.
A subset S ⊆ ω^{<ω} is left-closed if whenever ⟨x_1, …, x_n⟩ ∈ S and y_i ≤ x_i for all i, then ⟨y_1, …, y_n⟩ ∈ S.

For instance, if S is left-closed and ⟨2, 2⟩ ∈ S, then ⟨1, 2⟩ ∈ S. Let us denote concatenation by ⌢, so that ⟨a, b⟩⌢⟨c, d⟩ = ⟨a, b, c, d⟩.

[Table: for each time t = 0, …, 22 of Example 14, the state visited at time t and the tree labeling f_t at that stage.]

Figure 2: Analysis of a complexity witness for the word of Example 14.

Figure 3: Step-by-step generation of trees of directed cycles for Figure 2.

Definition 16 (Kleene–Brouwer ordering). The Kleene–Brouwer ordering <_KB is defined as follows. If t and s are finite sequences of elements from X, we say that t <_KB s if there is an n such that either:

• t↾n = s↾n and t(n) is defined but s(n) is undefined (i.e., t properly extends s), or
• both s(n) and t(n) are defined, t(n) < s(n), and t↾n = s↾n.

Here, the notation t↾n refers to the prefix of t up to but not including t(n).

In the finite tree case, which we are in, the KB ordering is also called post-order tree traversal.

Lemma 17.
Consider the sequence of states visited during processing by an NFA M of a uniquely accepted word x of length n. Let us call the first visited state 0, the next distinct state 1, and so on. (So, for example, the permitted state sequences of length 3 are only 000, 001, 010, 011, 012.) Then the state sequence starts

0, 1, …, q, q + 1, …, q,

where q is the first state that is visited twice. There will never, at a later point in the state sequence, be a transition (an edge) (q_1, q_2) such that q_2 occurs within the directed cycle q, q + 1, …, q and such that the transition (q_1, q_2) does not occur in that directed cycle.

Proof. Indeed, otherwise our state sequence would start

0, 1, …, q, …, q_2, …, q (first segment), …, q, …, q_2 (second segment),

and then there is a second accepting path of the same length where the first and second segments are switched.

Theorem 18 ([1]). The digraphs representing the witnessing automata for A_N are trees of directed cycles. In Definition 13, cycles are adjacent iff their labels are σ, σ⌢⟨j⟩ in the tree of labels.

Proof. By Lemma 17, the path considered there can only return to states that are not yet in any directed cycles. This leaves only two choices whenever we decide to create a new edge leading to a previously visited state:

• Case 1. Go back to a state that was first visited after the last completed directed cycle so far seen (i.e., after ⟨k + 1⟩), or
• Case 2. Go back to a state that was first visited at some earlier time, before some of the directed cycles so far seen started (i.e., between ⟨i⟩ and ⟨i + 1⟩ for some i ≤ k), and in general after some of them were complete.

This gives a tree of directed cycles where each new directed cycle either (Case 1) creates a new sibling for the previous directed cycle, or (Case 2) creates a new parent for a final segment ⟨i⟩, …, ⟨k + 1⟩ of the top-level siblings seen so far, which instead become ⟨i, 1⟩, …, ⟨i, (k + 1 − i)⟩.
In this tree of directed cycles, only the leaves (the directed cycles that are not anybody's parents) can be traversed more than once by the uniquely accepted path of length n (Theorem 20).

So if the first directed cycle created is l_1, then for the second cycle l_2 we can have two cases:

Case 1: (l_1, l_2): ⟨1⟩ ↦ l_1, ⟨2⟩ ↦ l_2.
Case 2: l_1 → l_2: ⟨1⟩ ↦ l_2, ⟨1, 1⟩ ↦ l_1.

In Case 1, l_1 and l_2 are siblings ordered from first to second. In Case 2, → denotes is a child of. Now for the third directed cycle l_3, we have only the following possibilities:

Subcase 1.1: (l_1, l_2, l_3): ⟨1⟩ ↦ l_1, ⟨2⟩ ↦ l_2, ⟨3⟩ ↦ l_3.
Subcase 1.2: (l_1, l_2 → l_3): ⟨1⟩ ↦ l_1, ⟨2⟩ ↦ l_3, ⟨2, 1⟩ ↦ l_2.
Subcase 1.3: (l_1, l_2) → l_3: ⟨1⟩ ↦ l_3, ⟨1, 1⟩ ↦ l_1, ⟨1, 2⟩ ↦ l_2.
Subcase 2.1: (l_1 → l_2, l_3): ⟨1⟩ ↦ l_2, ⟨1, 1⟩ ↦ l_1, ⟨2⟩ ↦ l_3.
Subcase 2.2: l_1 → l_2 → l_3: ⟨1⟩ ↦ l_3, ⟨1, 1⟩ ↦ l_2, ⟨1, 1, 1⟩ ↦ l_1.

In Subcase 1.2, l_1 and l_3 are siblings and l_2 is a child of l_3. In Subcase 1.3, l_3 is a common parent of l_1 and l_2. In Subcase 2.1, l_3 is a new sibling for l_2, and l_2 still has l_1 as its child. In Subcase 2.2, l_3 is a parent of l_2.

The general pattern is that the only possibilities when constructing l_{k+1} are:

(i) l_{k+1} = f_new(⟨j + 1⟩), where l_k = f_old(⟨j⟩) and f_old(⟨j + 1⟩) is undefined. This happens when the new cycle that is formed has no children.

(ii) l_{k+1} = f_new(⟨i⟩) for some i, where f_old(⟨i⟩) is defined, and f_new(⟨i, j⟩⌢τ) = f_old(⟨i + j⟩⌢τ) for all j and τ for which the latter is defined.

Every left-closed subtree of ω^{<ω} can be generated by these two kinds of moves via a kind of depth-first search. The first node we put on the tree is the longest 0^c that occurs in the tree. If this has siblings 0^{c−1}⌢⟨d⟩, we next add 0^{c−1}⌢⟨e⟩ for the largest e for which this is on the tree. We initially add 0^{c−1}⌢⟨e⟩ as a sibling to 0^c, but later use move (ii) to put the nodes below it in. In fact, we add the nodes according to the Kleene–Brouwer ordering <_KB.
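The agreement between the two clauses of the Kleene–Brouwer ordering and post-order traversal on a finite tree can be checked directly. The following is a minimal sketch (function names are ours), with tree nodes represented as tuples of integers and children indexed 0, 1, ….

```python
from functools import cmp_to_key

def kb_cmp(t, s):
    """Return a negative number iff t <_KB s (Definition 16)."""
    for n in range(min(len(t), len(s))):
        if t[n] != s[n]:
            return -1 if t[n] < s[n] else 1  # clause 2: t(n) < s(n)
    # equal on the common prefix: a proper extension is KB-smaller (clause 1)
    return len(s) - len(t)

def postorder(tree, node=()):
    """Post-order traversal of a prefix-closed set of tuples."""
    out = []
    i = 0
    while node + (i,) in tree:
        out += postorder(tree, node + (i,))
        i += 1
    out.append(node)
    return out

# Root (), children (0,) and (1,), grandchildren (0,0) and (0,1):
tree = {(), (0,), (0, 0), (0, 1), (1,)}
```

Sorting the nodes by kb_cmp yields (0,0), (0,1), (0,), (1,), (), which is exactly the post-order traversal of this tree.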
The KB ordering can be thought of as a traversal of the tree with two moves, "right" and "up", which come from the two clauses in the KB ordering definition; these correspond to (i) and (ii) respectively.

Example 19.
The state sequence 01234567345673456720 has the structure of Subcase 2.2 of Theorem 18, with l_1 being the directed cycle generated from 345673, l_2 being generated from 23456734567345672, and l_3 being generated from the whole sequence 01234567345673456720. The corresponding automaton is shown in an online tool at http://math.hawaii.edu/wordpress/bjoern/complexity-of-0001111011110111111/.

A leaf power inequality

A leaf of the f_s tree shall be a cycle f_s(σ) such that f_s(σ⌢⟨1⟩) is undefined; equivalently, f_s(σ⌢⟨i⟩) is undefined for all i.

Theorem 20.
In Theorem 18, the states that are not part of leaves of the tree are visited at most twice.

Proof. By the uniqueness of the path. If an internal node η of the tree is visited three times t_1, t_2, t_3, then a leaf (witnessing that η is internal) is visited either between t_1, t_2 or between t_2, t_3 (or later, if there are even more visits to η). But then that very visit can be moved to the other interval (making one interval longer and the other shorter), showing the non-uniqueness of the path.

Definition 21.
Two occurrences of words a (starting at position i) and b (starting at position j) in a word x are disjoint if x = uavbw, where u, v, w are words with |u| = i and |uav| = j. If in addition |v| > 0, then we say that these occurrences of a and b are strongly disjoint.

Definition 22.
Let x be a word and M an NFA with q states. If M witnesses that A_N(x) ≤ q, then let

LP(M) = {x_1^{α_1}, …, x_m^{α_m}}

be the set of strongly disjoint leaf powers in x that are used by M, i.e., that are read during the processing of a leaf of the f_s tree. The reduced set of leaf powers is defined by

LP′(M) = {x_i^{α_i} ∈ LP(M) : 1 ≤ i ≤ m and α_i ≥ 2}.

Theorem 23 ([1]). Let x be a word and M an NFA with q states. If M witnesses that A_N(x) ≤ q, then let LP(M) = {x_1^{α_1}, …, x_m^{α_m}} be the set of strongly disjoint leaf powers in x that are used by M. The following uniqueness condition holds, with β_i = ⌊α_i⌋:

Σ_{i=1}^m β_i |x_i| = Σ_{i=1}^m γ_i |x_i|, γ_i ∈ ℤ, γ_i ≥ 0 ⇒ γ_i = β_i, for each i. (1)

Note that leaf powers that are not strongly disjoint are not used by M, as that would violate the uniqueness of the path.

The following lemma has an obvious proof, but may be worth stating explicitly as it is used in Theorem 25.

Lemma 24.
Suppose the equation

Σ_{i=1}^m a_i x_i = Σ_{i=1}^m a_i c_i, x_i ≥ 0, x_i ∈ ℤ, (2)

has only the solution x_i = c_i, 1 ≤ i ≤ m. Then the equation

Σ_{i=1}^{m−1} a_i x_i = Σ_{i=1}^{m−1} a_i c_i (3)

also has a unique solution.

Proof. Suppose b_i, 1 ≤ i < m, is a solution of (3). Then (x_1, …, x_m) = (b_1, …, b_{m−1}, c_m) is a solution of (2). Hence (b_1, …, b_{m−1}, c_m) = (c_1, …, c_m), which means b_i = c_i for each 1 ≤ i < m.

Theorem 25 (Leaf Power Inequality). If M witnesses that A_N(x) ≤ q, then LP(M) satisfies

2q ≥ n + 1 − m − Σ_{i=1}^m (α_i − 2)|x_i|. (4)

Proof.
If the complexity of a word is low, then in particular there is such a tree structure with few states that accepts our word. By Theorem 20, any reduction in the number of states to below n/2 comes from leaf powers x_1^{α_1}, …, x_m^{α_m} in the word x. Recall that there are n + 1 times to be assigned to states, and note that each of the q − Σ_{i=1}^m |x_i| states not belonging to leaves can be visited at most twice. Let L be the number of times when we are occupied with the leaf powers and N the number of times when we are occupied with other states, so N + L = n + 1. Then N ≤ 2(q − Σ_{i=1}^m |x_i|), and so

L = n + 1 − N ≥ n + 1 − 2(q − Σ_{i=1}^m |x_i|).

On the other hand, L = Σ_{i=1}^m (1 + α_i |x_i|), so

Σ_{i=1}^m (1 + α_i |x_i|) ≥ n + 1 − 2(q − Σ_{i=1}^m |x_i|).

Rearranging this,

2q ≥ n + 1 − m − Σ_{i=1}^m (α_i − 2)|x_i|.

Theorem 26. If a word x contains no set of strongly disjoint powers satisfying (1) and (4), then A_N(x) > q.

Theorem 26 is an immediate corollary of Theorem 25.
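Inequality (4) yields a concrete lower bound on q = A_N(x) once the leaf powers x_i^{α_i} are known. A minimal sketch (our own; the function name is ours):

```python
from math import ceil

def leaf_power_lower_bound(n, powers):
    """Lower bound on q from the Leaf Power Inequality (4):
    2q >= n + 1 - m - sum((alpha_i - 2) * |x_i|),
    where `powers` is a list of (alpha_i, base_length_i) pairs."""
    m = len(powers)
    savings = sum((alpha - 2) * length for alpha, length in powers)
    return ceil((n + 1 - m - savings) / 2)
```

For x = 0^n with n = 10, the single leaf power 0^10 gives the bound ceil((11 − 1 − 8)/2) = 1, consistent with A_N(0^n) = 1; a word of length n with no leaf powers at all gets the bound ceil((n + 1)/2) = ⌊n/2⌋ + 1, the maximum possible complexity.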
Theorem 27.
If a word x contains no set of strongly disjoint, at-least-square powers satisfying (1) and (4), then A_N(x) > q.

Proof. Suppose LP′(M) ⊆ L ⊂ LP(M) with LP(M) \ L = {x_j^{α_j}}, a singleton. Then L inherits the uniqueness property from LP(M) by Lemma 24. Let m′ be the parameter m for L. Then m′ = m − 1, and

Σ_{i=1, i≠j}^m (α_i − 2)|x_i| ≥ Σ_{i=1}^m (α_i − 2)|x_i| + 1,

since α_i |x_i| is always an integer, even when α_i is not. Hence overall the right-hand side of (4) for L is at most the right-hand side of (4) for LP(M). Thus Theorem 26 still holds if we require at-least-square powers.

The following corollary is weaker but easier to analyze probabilistically.

Theorem 28.
If a word x contains no set of strongly disjoint, at-least-square powers x_i^{α_i}, 1 ≤ i ≤ m, with

all |x_i|, 1 ≤ i ≤ m, distinct (5)

and satisfying (4), then A_N(x) > q.

Proof. We merely note that unique solvability of the equation (1) implies that all the lengths are distinct: otherwise we can replace γ_i = β_i, γ_j = β_j, where |x_i| = |x_j|, i ≠ j, by γ_i = β_i − 1, γ_j = β_j + 1, or the other way around.

Remark 29.
The Leaf Power Inequality is implemented in Python at [4]. The variable called saveUnique is equal to m + Σ(α_i − 2)|x_i|. For Example 14, the Python script reports saveUnique = 5, and indeed the Leaf Power Inequality (4) then says 2q ≥ n + 1 − 5 = 18, i.e., q ≥ 9.

Theorem 30 states that the Leaf Power Inequality (4) does not hold in general without the m term.

Theorem 30.
It is not true that for all witnessing M as above, LP(M) satisfies

2q ≥ n + 1 − Σ_{i=1}^m (α_i − 2)|x_i|.

Proof. Let x = 0^n, so that LP(M) = {0^n} and Σ_{i=1}^m (α_i − 2)|x_i| = n − 2. Then the displayed inequality gives

2q ≥ n + 1 − (n − 2) = 3,

which is false for an M witnessing that A_N(0^n) = 1.

Example 31.
Note that for x = 00 we have LP(M) = {0^2}, making m = 1 and Σ(α_i − 2)|x_i| = 0. Here the Leaf Power Inequality would not hold without the m term, even though no powers in x contribute to Σ(α_i − 2)|x_i|.

Solomonoff–Kolmogorov result
Definition 32.
The savings associated with powers x_1^{α_1}, …, x_m^{α_m} in a word x is s = Σ(α_i − 1)|x_i|.

The idea of savings is that an automaton may try to exploit the powers x_i^{α_i} by reusing edges s many times.

To use our lower bound to say something about the complexity of a random word: if we do not use uniqueness at all, we can say very little, as a single word may have savings from several overlapping powers at once. If we use uniqueness, however, we can at least say that each possible cycle length occurs in only one term of the equation (Equation (5)). And then, since only powers x^{1+ǫ} with ǫ ≥ 1 need to be considered (Theorem 27), and since a high power with a base much longer than log n is unlikely to occur in a random word, the length of the saving part from a single power is at most about log n. So the total savings should be at most (log n)^2. So 2q ≥ n + 1 − (log n)^2 − log n, or q ≥ n/2 − (log n)^2, roughly speaking. This closely matches an LFSR (linear feedback shift register) result of [6], incidentally.

In Theorem 34 we are inspired by an argument due to [9].

Consider a word with positions numbered 0, …, 7. To say that position m = 5 starts a run with lookback amount k of length 2 means that the symbols at positions 6 and 7 look back at, i.e., repeat, the symbols at positions 6 − k and 7 − k.

Definition 33. Position m starts a run with lookback amount k of length t in the word x = x_1 … x_n, where x_i ∈ {0, 1}, if x_{m+1+u} = x_{m+1+u−k} for each 0 ≤ u < t.

Theorem 34.
For almost all x, A_N(x)/(n/2) → 1. That is, for all ǫ > 0 there is an n_0 such that for all n ≥ n_0, if x of length n is chosen randomly, then |A_N(x)/(n/2) − 1| < ǫ with probability at least 1 − ǫ.

Proof. Let R_{m,k} be the event that position 1 ≤ m ≤ n starts a run with lookback amount 1 ≤ k ≤ m of length at least ⌊d log n⌋. By the union bound,

P(⋃_{m=1}^n ⋃_{k=1}^m R_{m,k}) ≤ Σ_{m=1}^n Σ_{k=1}^m 2^{−d log n} = n^{−d} Σ_{m=1}^n m = (n(n + 1)/2) · n^{−d} → 0

for d > 2. So with high probability, all runs have length strictly smaller than ⌊d log n⌋.

The Leaf Power Inequality can be weakened to a more convenient form as follows. Since each |x_i| ≥ 1, we have m ≤ Σ_{i=1}^m |x_i|, and so (4) implies

2q ≥ n + 1 − Σ_{i=1}^m (α_i − 2)|x_i| − Σ_{i=1}^m |x_i| = n + 1 − Σ_{i=1}^m (α_i − 1)|x_i|.

If A_N(x) ≤ q, then by Theorem 28 there are strongly disjoint powers with α_i ≥ 2 and all base lengths |x_i| distinct satisfying this inequality, i.e.,

Σ_{i=1}^m (α_i − 1)|x_i| ≥ n + 1 − 2q.

And since α_i ≥ 2, the repeated part of each power x_i^{α_i} is a run of length (α_i − 1)|x_i| ≥ |x_i| with lookback amount |x_i|; so with high probability (α_i − 1)|x_i| ≤ d log n, and in particular |x_i| ≤ d log n, else there would be a too-great saving associated with x_i^{α_i}. Since the |x_i| are distinct, also m ≤ d log n. Thus, with high probability the total savings can at most be

(number of base lengths to consider) × (max savings from a single run) ≤ (d log n)^2.

So A_N(x) ≥ (n + 1)/2 − (d log n)^2.

References

[1] Achilles A. Beros, Bjørn Kjos-Hanssen, and Daylan Kaui Yogi. Planar digraphs for automatic complexity. In
Theory and applications of models of computation, volume 11436 of
Lecture Notes in Comput. Sci., pages 59–73. Springer, Cham, 2019.
[2] Allan Borodin, Stephen A. Cook, Patrick W. Dymond, Walter L. Ruzzo, and Martin Tompa. Two applications of inductive counting for complementation problems.
SIAM J. Comput., 18(3):559–578, 1989.
[3] Kayleigh K. Hyde and Bjørn Kjos-Hanssen. Nondeterministic automatic complexity of overlap-free and almost square-free words.
Electron. J. Combin., 22(3), 2015. Paper 3.22, 18 pp.
[4] Sun Young Kim, Bjørn Kjos-Hanssen, and Clyde James Felix. Code for Automatic complexity of Fibonacci and Tribonacci words. https://github.com/bjoernkjoshanssen/tetranacci, 2019.
[5] Bjørn Kjos-Hanssen. On the complexity of automatic complexity.
Theory Comput. Syst., 61(4):1427–1439, 2017.
[6] Bjørn Kjos-Hanssen. Automatic complexity of shift register sequences.
Discrete Math., 341(9):2409–2417, 2018.
[7] A. N. Kolmogorov. Three approaches to the definition of the concept "quantity of information".
Problemy Peredači Informacii, 1(vyp. 1):3–11, 1965.
[8] A. N. Kolmogorov. Three approaches to the quantitative definition of information.
Internat. J. Comput. Math., 2:157–168, 1968.
[9] Anthony Quas. Longest runs and concentration of measure. MathOverflow, 2016. URL: https://mathoverflow.net/q/247929 (version: 2016-08-21).
[10] Jeffrey Shallit and Ming-Wei Wang. Automatic complexity of strings.
J. Autom. Lang. Comb., 6(4):537–554, 2001. 2nd Workshop on Descriptional Complexity of Automata, Grammars and Related Structures (London, ON, 2000).
[11] Michael Sipser. A complexity theoretic approach to randomness. In
Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing, STOC '83, pages 330–335, New York, NY, USA, 1983. ACM.
[12] R. J. Solomonoff. A formal theory of inductive inference. I.
Information and Control, 7:1–22, 1964.
[13] R. J. Solomonoff. A formal theory of inductive inference. II. Information and Control, 7:224–254, 1964.