Pushdown and Lempel-Ziv Depth
Liam Jordon∗
[email protected]
Philippe Moser
[email protected]
Dept. of Computer Science, Maynooth University, Maynooth, Co. Kildare, Ireland

Abstract
This paper expands upon existing, and introduces new, formulations of Bennett's logical depth. In previously published work by Jordon and Moser, notions of finite-state-depth and pushdown-depth were examined and compared. These were based on finite-state transducers and information lossless pushdown compressors respectively. Unfortunately, a full separation between the two notions was not established. This paper introduces a new formulation of pushdown-depth based on restricting how fast a pushdown compressor's stack can grow. This improved formulation allows us to do a full comparison by demonstrating the existence of sequences with high finite-state-depth and low pushdown-depth, and vice versa. A new notion based on the Lempel-Ziv '78 algorithm is also introduced. Its difference from finite-state-depth is shown by demonstrating the existence of a Lempel-Ziv deep sequence that is not finite-state deep, and vice versa. Lempel-Ziv-depth's difference from pushdown-depth is shown by building sequences with high pushdown-depth but low LZ-depth, and vice versa.

1 Introduction

In 1988 Charles Bennett introduced a new method to measure the useful information contained in a piece of data [3]. This measurement tool is called logical depth. Logical depth helps to formalise the difference between complex and non-complex structures. Intuitively, deep structures can be thought of as structures that contain patterns which are incredibly difficult to find. Given more and more time and resources, an algorithm could spot these patterns and exploit them (for instance, to compress a sequence). Non-deep structures are sometimes referred to as being shallow. Random structures are not considered deep as they contain no patterns. Simple structures are not considered deep because, while they contain patterns, those patterns are too easy to spot.

Bennett's original notion is based on Kolmogorov complexity [3], and interacts nicely with fundamental notions of computability theory [8, 19].
Due to the uncomputability of Kolmogorov complexity, several researchers have attempted to adapt Bennett's notion to lower complexity levels. While variations have been based on computable notions [14], more feasible notions based on polynomial time computations [1, 17, 18] have been studied, including notions based on both finite-state transducers and lossless pushdown compressors [7, 12]. Similarly to randomness, there is no absolute notion of depth.

∗Supported by a postgraduate scholarship from the Irish Research Council.

However, each depth notion is expected to satisfy the following fundamental properties:

• Random sequences are not deep (for the appropriate randomness notion).
• Computable sequences are not deep (for the appropriate computability notion).
• A slow growth law: deep sequences cannot be quickly computed from shallow ones.
• Deep sequences exist.

In this paper we continue the study of depth via classes of automata and compression algorithms. For two families of compression algorithms T and T′, we say a sequence S is (T, T′)-deep if for every compressor C of type T, there exists a compressor C′ of type T′ such that on almost every prefix of S (with length denoted n), C′ compresses the prefix by at least αn more bits than C, for some constant α > 0. We refer to α as the (T, T′)-depth level of S. We drop the (T, T′) notation and refer to just T or T′ depending on the context when referring to the depth notions discussed in this paper.

Doty and Moser first presented an infinitely often notion of depth based on finite-state transducers in [7], built on the minimal length of an input to a finite-state transducer that results in the desired output. Further study of finite-state minimal description length can be found in [4, 5]. This led to Jordon and Moser introducing a notion based on lossless pushdown compressors in [12], where it was shown that there exists a finite-state deep sequence which is not pushdown-deep. The contrary result was not established, i.e. the existence of a pushdown-deep sequence that is not finite-state deep.
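As an illustration of the quantifier structure in the (T, T′)-depth definition, the following sketch checks the gap condition |C(S↾n)| − |C′(S↾n)| ≥ αn over the prefixes of a finite string. The two toy compressors here (the identity map and a naive run-length coder) are stand-ins of our own, not the transducer or pushdown models the paper actually uses; the sketch only mirrors the "almost every prefix" comparison on a finite sample.

```python
# Sketch: the prefix comparison behind (T, T')-depth.  S is (T, T')-deep if
# for every C in T there is a C' in T' whose output on almost every prefix
# S|n is at least alpha*n bits shorter.  The compressors below are toy
# stand-ins, not the paper's machine models.

def identity_compress(w):
    # no compression at all: |C(w)| = |w|
    return w

def run_length_compress(w):
    # lossless toy coder: each maximal run becomes its bit + run length in binary
    out, i = [], 0
    while i < len(w):
        j = i
        while j < len(w) and w[j] == w[i]:
            j += 1
        out.append(w[i] + format(j - i, 'b'))
        i = j
    return ''.join(out)

def depth_gap_holds(s, alpha, n0=1):
    # |C(s|n)| - |C'(s|n)| >= alpha*n for every prefix length n >= n0,
    # mirroring the "almost every prefix" quantifier on a finite sample
    return all(
        len(identity_compress(s[:n])) - len(run_length_compress(s[:n])) >= alpha * n
        for n in range(n0, len(s) + 1)
    )
```

On the highly regular string 0^64 the run-length coder beats the identity by a linear margin on all but the first few prefixes, so the gap condition holds; on the alternating string 0101… it never does. (This only illustrates the shape of the definition: an actually deep sequence is neither this simple nor random.)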
In this paper we present a new notion of pushdown-depth which provides a much clearer separation from finite-state-depth, by allowing us to prove the existence of sequences which are pushdown-deep and, if finite-state deep at all, have a very low level of finite-state-depth, and vice versa. This notion of pushdown-depth is based on the output of information lossless pushdown compressors (ILPDCs) on a given input. The model of pushdown compressors used is found in [16]. Specifically, we examine the difference in compression between an ordinary ILPDC and an ILPDC whose stack grows at a bounded rate. That is, for an order function f, an ILPDC has f-stack growth if on prefixes of length n of a sequence, the ILPDC's stack height is never above f(n). We call the resulting notion of pushdown-depth PD_f-depth.

We also introduce a new notion called Lempel-Ziv-depth (LZ-depth), based on the Lempel-Ziv '78 (LZ) compression algorithm introduced in [22]. LZ-depth examines the output of a lossless finite-state transducer against the output of the LZ algorithm on a given input.

For both the pushdown- and Lempel-Ziv-depth notions, we demonstrate that each notion has some of the fundamental depth properties, i.e. both easy and random sequences in the appropriate setting are not deep, and a slow growth law holds.

When comparing the three notions, we examine the depth level of various sequences. To compare pushdown-depth with finite-state-depth, we first show the existence of an i.o. finite-state-deep sequence which is not PD_⌊log⌋-deep. We then show the existence of a PD_⌊log⌋-deep sequence with a high depth level such that, if it is finite-state-deep, it has a finite-state-depth level of roughly 0. To compare finite-state-depth and LZ-depth, we show that there exists a normal sequence (from [15]) that is LZ-deep. Since no normal sequence is finite-state-deep [7], this demonstrates a difference. We also build a sequence which is finite-state deep and infinitely often LZ-deep, but not almost everywhere LZ-deep.
When comparing pushdown-depth with LZ-depth, we first show that for all order functions f, there exists a sequence that is not PD_f-deep but is LZ-deep. We then build a sequence which is PD_⌊log log⌋-deep with a high depth level but low LZ-depth.

2 Preliminaries

We write N to denote the set of all non-negative integers. All logarithms are taken in base 2. A string is an element of {0,1}∗. For a string x, |x| denotes its length. For n ∈ N, {0,1}^n denotes the set of strings of length n. A sequence is an element of {0,1}^ω. Given strings x, y and a sequence S, xy and xS denote the concatenation of x with y and of x with S respectively. For a string x and n ∈ N, x^n denotes the string x concatenated with itself n times. For a string x and sequence S, for i, j ∈ N with i ≤ j, x[i..j] and S[i..j] represent the substrings of x and S composed of their respective i-th through j-th bits. If j < i, then x[i..j] = S[i..j] = λ, where λ is the empty string. x[i] and S[i] represent the i-th bit of x and S respectively. For non-equal strings x, y, by the lexicographic ordering of strings we say that x < y if |x| < |y|, or if |x| = |y| and the first position i where x and y differ is such that x[i] = 0 and y[i] = 1. For a string v = xyz, we say x is a prefix of v, y is a substring of v, and z is a suffix of v. For a string or sequence S and n ∈ N, S↾n denotes S[0..n−1], the length-n prefix of S. We occasionally write x ⪯ v and x ⪯ S if x is a prefix of v or S respectively. We write x ≺ v if x ⪯ v and |x| < |v|. For a string x = x_1x_2…x_n, d(x) denotes x with every bit doubled, i.e. d(x) = x_1x_1x_2x_2…x_nx_n. For a string x, x⁻¹ denotes the reverse of x, i.e. x⁻¹ = x_nx_{n−1}…x_2x_1. Unless stated otherwise, for any n-tuple (x_1, x_2, …, x_n) we take the convention that the tuple can be encoded by the string

1^⌈log n_1⌉ 0 n_1 x_1 1^⌈log n_2⌉ 0 n_2 x_2 … 1^⌈log n_{n−1}⌉ 0 n_{n−1} x_{n−1} x_n,

where n_i is |x_i| written in binary.

We write K(x) to represent the plain Kolmogorov complexity of a string x. That is, for a fixed universal Turing machine U,

K_U(x) = min{|y| : y ∈ {0,1}∗, U(y) = x}.

That is, y is the shortest input to U that results in the output x. The value K_U(x) does not depend on the choice of universal machine up to an additive constant, so we drop the U from the notation. Other authors commonly use C to denote plain complexity (see [9, 20]); however, we reserve C to denote compressors. Note that for all n ∈ N, there exists a string x ∈ {0,1}^n such that K(x) ≥ |x|, by a simple counting argument.

We call a function f : N → N an order function if it is computable, unbounded and non-decreasing.

We use Borel normality [10] to examine the properties of some sequences. We say that a sequence S is normal if every string x ∈ {0,1}∗ occurs with asymptotic frequency 2^{−|x|} as a substring of S.

The following are the two main ways we examine the complexity of sequences. For a sequence S and a function C : {0,1}∗ → {0,1}∗, the C-lower and C-upper compression ratios of S are given by

ρ_C(S) = liminf_{n→∞} |C(S↾n)|/n, and R_C(S) = limsup_{n→∞} |C(S↾n)|/n.

For a sequence S and a set T of functions from {0,1}∗ to {0,1}∗, the T-best-case and T-worst-case compression ratios of S are given by

ρ_T(S) = inf{ρ_C(S) : C ∈ T}, and R_T(S) = inf{R_C(S) : C ∈ T}.

We use the standard finite-state transducer model.
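The self-delimiting tuple encoding described above can be sketched as follows. The exact header convention is an assumption of this sketch (one 1 per bit of the binary length, then a 0 separator), and the helper names `encode_tuple`/`decode_tuple` are hypothetical; as in the text, only the first n−1 components carry headers and the last component is appended raw, so the decoder must know n.

```python
# Sketch of the self-delimiting tuple encoding of (x_1, ..., x_n): each of
# the first n-1 components is prefixed by 1^{|bin(l_i)|} 0 bin(l_i), where
# l_i = |x_i|; the last component x_n is appended raw.

def encode_tuple(parts):
    pieces = []
    for x in parts[:-1]:
        length_bits = format(len(x), 'b')           # l_i in binary
        pieces.append('1' * len(length_bits) + '0'  # unary header + separator
                      + length_bits + x)
    pieces.append(parts[-1])                        # x_n is appended raw
    return ''.join(pieces)

def decode_tuple(s, n):
    parts, i = [], 0
    for _ in range(n - 1):
        k = 0
        while s[i] == '1':                          # read the unary header
            k += 1
            i += 1
        i += 1                                      # skip the 0 separator
        length = int(s[i:i + k], 2)                 # l_i from k binary digits
        i += k
        parts.append(s[i:i + length])
        i += length
    parts.append(s[i:])                             # whatever remains is x_n
    return parts
```

For example, encode_tuple(["0", "1"]) yields "10101": header "10" (length 1 in binary is "1"), then "1" (the length), then "0", then the raw final component "1".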
Definition 2.1. A finite-state transducer (FST) is a 4-tuple T = (Q, q_0, δ, ν), where

• Q is a nonempty, finite set of states,
• q_0 ∈ Q is the initial state,
• δ : Q × {0,1} → Q is the transition function,
• ν : Q × {0,1} → {0,1}∗ is the output function.

For all x ∈ {0,1}∗ and a ∈ {0,1}, the extended transition function δ̂ : Q × {0,1}∗ → Q is defined by the recursion δ̂(λ) = q_0 and δ̂(xa) = δ(δ̂(x), a). For x ∈ {0,1}∗, the output of T on x is the string T(x) defined by the recursion T(λ) = λ, and T(xa) = T(x)ν(δ̂(x), a).

An FST is information lossless (IL) if the function x ↦ (T(x), δ̂(x)) is 1-1; i.e. the output and final state of T on input x uniquely identify x. We call an FST that is IL an ILFST. By the identity FST, we mean the ILFST I_FS such that on every input x ∈ {0,1}∗, I_FS(x) = x. We write (IL)FST to denote the set of all (IL)FSTs.

A map f : {0,1}^ω → {0,1}^ω is said to be (IL)FS computable if there is an (IL)FST T such that for all S ∈ {0,1}^ω, lim_{n→∞} |T(S↾n)| = ∞ and for all n ∈ N, T(S↾n) ⪯ f(S). In this case we say T(S) = f(S).

We often use the following two theorems [11, 13], which demonstrate that any function computed by an ILFST can be inverted and approximately computed by another ILFST.

Theorem 2.2 ([11, 13]). For all T ∈ ILFST, there exists T⁻¹ ∈ ILFST and a constant c ∈ N such that for all x ∈ {0,1}∗, x↾(|x|−c) ⪯ T⁻¹(T(x)) ⪯ x.

Corollary 2.3.
For all T ∈ ILFST, there exists T⁻¹ ∈ ILFST such that for all S ∈ {0,1}^ω, T⁻¹(T(S)) = S.

2.3 Pushdown Compressors

We use the model of pushdown compressors found in [16]. Note that we keep our model feasible by bounding the number of times a pushdown compressor can pop a bit off its stack without reading an input bit. This prevents the compressor spending an arbitrarily long time altering its stack without reading its input.

A pushdown compressor (PDC) is a 7-tuple C = (Q, Γ, δ, ν, q_0, z_0, c) where

1. Q is a non-empty, finite set of states,
2. Γ is the finite stack alphabet,
3. δ : Q × ({0,1} ∪ {λ}) × Γ → Q × Γ∗ is the transition function,
4. ν : Q × ({0,1} ∪ {λ}) × Γ → {0,1}∗ is the output function,
5. q_0 ∈ Q is the start state,
6. z_0 ∈ Γ is the special bottom-of-stack symbol,
7. c is an upper bound on the number of λ-transitions per input bit.

For simplicity we consider only binary PDCs where Γ = {0, 1, z_0}. We assume every state in Q is reachable from q_0. We write δ_Q for the first component of the output of δ and δ_Γ for the second component. The transition function δ accepts λ as an input in addition to {0,1}. This means C has the option of altering its stack while not reading an input character. We call this a λ-transition. In this case, for a ∈ Γ, whenever δ(q, λ, a) = (q′, λ), we pop the top symbol from the top of the stack. To enforce determinism we require that one of the following holds for all q ∈ Q and a ∈ Γ:

1. δ(q, λ, a) = ⊥, or
2. δ(q, b, a) = ⊥ for all b ∈ {0,1}.

For z ∈ Γ⁺, z is ordered such that z[0] is the top of the stack and z[|z|−1] = z_0. δ is restricted so that z_0 cannot be popped off the stack. That is, for every q ∈ Q and b ∈ {0,1} ∪ {λ}, either δ(q, b, z_0) = ⊥, or δ(q, b, z_0) = (q′, vz_0) where q′ ∈ Q and v ∈ Γ∗. Furthermore, at most c λ-transitions can be applied in succession without reading an input bit.

The extended transition function δ̂ : Q × {0,1}∗ × Γ⁺ → Q × Γ∗ is defined by the usual recursion; δ̂(q_0, w, z_0) is abbreviated to δ̂(w). The extended output function ν̂ : Q × {0,1}∗ × Γ⁺ → {0,1}∗ is also defined by the usual recursion. The output of the PDC C on input w ∈ {0,1}∗ is the string C(w) = ν̂(q_0, w, z_0).

For a PDC C and strings x and y, we occasionally use the abusive notation ν̄_C(y) to represent the suffix of C(xy) contributed by the y section of the input, i.e. |ν̄_C(y)| = |C(xy)| − |C(x)|. We note each time it is used.

A PDC is said to be information lossless (IL) if the function w ↦ (C(w), δ̂_Q(w)) is 1-1. A PDC that is IL is called an ILPDC. We write (IL)PDC for the set of all (IL)PDCs. By the identity PDC I_PD we mean the ILPDC that on any input x ∈ {0,1}∗, I_PD(x) = x, and never alters its stack.

Definition 2.4. Let f be an order function. We say a PDC C has f-stack growth if for all x ∈ {0,1}∗ and 0 ≤ i ≤ |x|−1, when C is reading x[i] as part of its computation, C's stack height is bounded above by f(i). We say that a stack containing only z_0 (i.e. an empty stack) has height 0. We write (IL)PDC_f to denote the set of all (IL)PDCs with f-stack growth. Note that I_PD ∈ ILPDC_f for all such f.

2.4 The Lempel-Ziv Algorithm

The Lempel-Ziv algorithm LZ'78 (denoted LZ) [22] is a lossless dictionary-based compression algorithm. Given an input x ∈ {0,1}∗, LZ parses x into phrases x = x_1x_2…x_n such that each phrase x_i is unique in the parsing, except for possibly the last phrase. Furthermore, for each phrase x_i, every prefix of x_i also appears as a phrase in the parsing. That is, if y ≺ x_i, then y = x_j for some j < i. Each phrase is stored in LZ's dictionary. LZ encodes x by encoding each phrase as a pointer to the dictionary entry containing the longest proper prefix of the phrase, along with the final bit of the phrase. Specifically, each phrase can be written x_i = x_{l(i)}b_i for some l(i) < i and b_i ∈ {0,1}. Then for x = x_1x_2…x_n,

LZ(x) = c_{l(1)}b_1 c_{l(2)}b_2 … c_{l(n)}b_n,

where c_i is a prefix-free encoding of the pointer to the i-th element of LZ's dictionary, and x_0 = λ. We restrict LZ's input to binary strings.

For strings w = xy, we let LZ(x|y) denote the output of the LZ algorithm on y after it has already parsed x. For strings of the form w = xy^n, we use the following lemma to get an upper bound for |LZ(x|y^n)|.

Lemma 2.5 ([16]). Let n ∈ N, and x, y ∈ {0,1}∗ where y ≠ λ. Let w = xy^n. Suppose on its computation of the string w that LZ's dictionary contained d ≥ 1 phrases after reading x. Then we have that

|LZ(x|y^n)| ≤ √((|y|+1)|w|) · log(d + √((|y|+1)|w|)).

3 Finite-State-Depth

An infinitely often (i.o.) finite-state-depth notion was introduced by Doty and Moser in [7] based on finite-state transducers. In this section, we state and prove properties of finite-state-depth found in [12] whose proofs were omitted in their original publication for space.
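Returning to the LZ78 algorithm described in the preliminaries, its phrase parsing can be sketched as below: each phrase is a previously seen phrase extended by one new bit, and a possibly-repeated final phrase is allowed. The (pointer, bit) pairs stand in for the paper's prefix-free codes c_{l(i)}b_i; the function names are our own.

```python
# Sketch of LZ78 phrase parsing: each phrase extends a previously seen
# phrase (its longest proper prefix in the dictionary) by one new bit.
# Emits (pointer, bit) pairs rather than prefix-free codes c_i.

def lz78_parse(x):
    dictionary = {'': 0}            # phrase -> index; x_0 = empty string
    pairs, phrase = [], ''
    for b in x:
        if phrase + b in dictionary:
            phrase += b             # keep extending the current phrase
        else:
            pairs.append((dictionary[phrase], b))
            dictionary[phrase + b] = len(dictionary)
            phrase = ''
    if phrase:                      # possibly-repeated final phrase
        pairs.append((dictionary[phrase[:-1]], phrase[-1]))
    return pairs

def lz78_unparse(pairs):
    # invert the parsing: rebuild each phrase from its dictionary pointer
    phrases, out = [''], []
    for ptr, b in pairs:
        phrase = phrases[ptr] + b
        phrases.append(phrase)
        out.append(phrase)
    return ''.join(out)
```

For example, "01011010" parses into the phrases 0, 1, 01, 10, 10 (the last phrase repeats, as permitted), i.e. the pairs (0,0), (0,1), (1,1), (2,0), (2,0), and `lz78_unparse` recovers the input, reflecting losslessness.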
These properties are needed to compare it with pushdown-depth and Lempel-Ziv-depth introduced in later sections. Henceforth, when we say a sequence is finite-state deep, we assume it is the i.o. version.

Before we begin examining depth, we first choose a binary representation of all finite-state transducers.
Definition 3.1. A binary representation of finite-state transducers is a partially computable map σ : D ⊆ {0,1}∗ → FST such that for every FST T, there exists some x ∈ D such that σ(x) fully describes T, i.e. σ is surjective. If σ(x) = T, we call x a σ-description of T.

For a binary representation of FSTs σ, we define |T|_σ = min{|x| : σ(x) = T} to be the size of T with respect to σ. For all k ∈ N, define

FST^{≤k}_σ = {T ∈ FST : |T|_σ ≤ k}

to be the set of FSTs with σ-representation size k or less. For all k ∈ N and x ∈ {0,1}∗, the k-finite-state complexity of x with respect to the binary representation σ is defined as

D^k_σ(x) = min{|y| : T ∈ FST^{≤k}_σ and T(y) = x}.

Here y is a shortest string that gives x as output when input to an FST of size k or less with respect to the binary representation σ. T can be thought of as the FST that decompresses y to reproduce x.

For the purpose of this paper, we fix a binary representation σ of finite-state transducers. Let T = (Q, q_0, δ, ν) be an FST. We define the function ∆ : Q × {0,1} → Q × {0,1}∗ by ∆(q, b) = (δ(q, b), ν(q, b)). This function ∆ completely describes the state transitions and outputs of T. In [4], different encoding schemes are presented to represent each transducer via an encoding of this function ∆. We adapt the first scheme presented for our own binary representation as follows.

For n ∈ N, let bin(n) denote the binary representation of n, i.e. bin(1) = 1, bin(2) = 10, bin(3) = 11 and so on. Note that bin(n) begins with a 1 for all n. string(n) denotes the binary string built by removing the first 1 of bin(n), so bin(n) = 1 · string(n). Note that |string(n)| = ⌊log(n)⌋. For x = x_1x_2…x_l, where x_i ∈ {0,1} for 1 ≤ i ≤ l, we define the following two strings:

1. x† = x_1x_2…x_{l−1}x_l1, and
2. x⋄ = (1x̄)†, where 0̄ = 1 and 1̄ = 0.

Then if Q = {q_1, …, q_m}, ∆ is encoded by the string

π = bin(n_1)‡ · string(n′_1)⋄ · bin(n_2)‡ · string(n′_2)⋄ · · · bin(n_{2m})‡ · string(n′_{2m})⋄,

where ∆(q_i, b) = (q_{n_{2i−b}}, string(n′_{2i−b})), with 1 ≤ n_{2i−b} ≤ m and n′_{2i−b} ≥ 1, for i = 1…m and b ∈ {0,1}. Here, bin(n_j)‡ = λ if the corresponding transition stays in the same state, that is δ(q_j, b) = q_j; otherwise bin(n_j)‡ = bin(n_j)†.

The binary representation σ : D → FST for FSTs we use is as follows. Let

∆_j = {π : π is an encoding of ∆ for an FST with j states}.

The domain D of σ is the set of strings

D = ⋃_{j∈N} ⋃_{1≤i≤j} {d(bin(i))01y : y ∈ ∆_j}.

Then for 1 ≤ i ≤ j and y ∈ ∆_j we set σ(d(bin(i))01y) = T, where T is the FST with states Q = {q_1, …, q_j}, with initial state q_0 = q_i, and whose transition function ∆ is described by y. Clearly σ is surjective and so is a binary representation of all FSTs.

We require a pointer to the start state because relabelling the states of a transducer changes the encoding of ∆, even though the resulting transducers are equivalent. The pointer allows us to easily bound the size of transducers with equivalent transition tables but different start states. This bound allows us to prove Lemma 3.4, which in turn is used in Theorem 4.8 to demonstrate the difference between finite-state-depth and pushdown-depth. However, as shown in [12], Theorem 3.3 demonstrates that if a sequence is finite-state deep when the size of transducers is viewed from the perspective of one binary representation, it is deep when viewed from the perspective of any binary representation. Henceforth, we drop the σ notation and write |T| for |T|_σ, FST^{≤k} for FST^{≤k}_σ, and D^k_FS(x) instead of D^k_σ(x).

A sequence S is finite-state deep if, given any finite-state transducer, we can always build a more powerful finite-state transducer such that, when we examine the finite-state complexity of prefixes of S on each transducer, their difference is always bounded below by the length of the prefix times a fixed constant. Intuitively, the larger transducer is more powerful and can spot patterns of the sequence that the smaller transducer cannot.
As such, the larger transducer requires fewer bits to describe the prefix. Definition 3.2.
A sequence S is (infinitely often) finite-state deep if

(∃α > 0)(∀k ∈ N)(∃k′ ∈ N)(∃∞n ∈ N) D^k_FS(S↾n) − D^{k′}_FS(S↾n) ≥ αn.

A sequence that is not finite-state-deep is called finite-state-shallow. The following theorem states that the binary representation chosen has no effect on the shallowness or depth of a sequence.
Theorem 3.3 ([12]). Let π be a binary representation of FSTs. Let S be a finite-state-deep sequence when the size of the FSTs is viewed with respect to the binary representation π. Then S is finite-state-deep when the size of the FSTs is viewed with respect to every binary representation.

Proof. Let S and π be as in the statement of the theorem. Let τ be any binary representation of all FSTs. Fix k ∈ N. Then there exists a constant c such that FST^{≤k}_τ ⊆ FST^{≤k+c}_π. Therefore for all n ∈ N,

D^{k+c}_π(S↾n) ≤ D^k_τ(S↾n).

As S is finite-state-deep with respect to π, there exist constants α and (k+c)′ such that for infinitely many n,

D^{k+c}_π(S↾n) − D^{(k+c)′}_π(S↾n) ≥ αn.

Let d be a constant such that FST^{≤(k+c)′}_π ⊆ FST^{≤(k+c)′+d}_τ. Therefore for all n,

D^{(k+c)′+d}_τ(S↾n) ≤ D^{(k+c)′}_π(S↾n).

Therefore for infinitely many n,

D^k_τ(S↾n) − D^{k′}_τ(S↾n) ≥ D^{k+c}_π(S↾n) − D^{(k+c)′}_π(S↾n) ≥ αn,

where k′ = (k+c)′ + d.

Lemmas 3.4 and 3.5 examine the k-finite-state complexity of substrings within a string on FSTs of roughly the same size. Lemma 3.4's proof relies on viewing FSTs with respect to our fixed binary representation σ; however, by the above theorem this has no impact on whether a sequence is deep or not. Intuitively, on input vw, if an FST T outputs x when reading v and y when reading w, then if we take the same transducer T but change the start state to the one T finishes in after outputting x, w is a description of y for this modified transducer.

Lemma 3.4 ([12]). For our fixed binary representation σ, ∀∞k ∈ N, ∀n ∈ N, ∀x, y, z ∈ {0,1}∗,

D^k_FS(xy^n z) ≥ D^{2k}_FS(x) + nD^{2k}_FS(y) + D^{2k}_FS(z).

Proof.
Let k, n, x, y, z be as in the lemma. Suppose D^k_FS(xy^n z) = |p_x p_{y,1} … p_{y,n} p_z|, where T ∈ FST^{≤k}, p_x, p_{y,1}, …, p_{y,n}, p_z ∈ {0,1}∗, with T(p_x p_{y,1} … p_{y,n} p_z) = xy^n z, T(p_x p_{y,1} … p_{y,j}) = xy^j for 1 ≤ j ≤ n, and T(p_x) = x.

For all valid inputs w to T, let T_w be the FST whose states, transitions and outputs are the same as T's, the only difference being that the start state of T_w is the state that T ends in on input w. So T_{p_x p_{y,1}…p_{y,j−1}}(p_{y,j}) = y and T_{p_x p_{y,1}…p_{y,n}}(p_z) = z.

Next we bound the binary description length of T_w. Recall that for σ, for an FST C, σ(d(bin(n))01π) = C, where d(bin(n)) is a pointer to C's start state q_0 = q_n and π describes the function ∆ of C's transitions and outputs. As |T| ≤ k, T has at most k states. Hence the pointer to T_w's start state takes at most 2|bin(k)| = 2(⌊log k⌋ + 1) bits. Also, ∆_T can be used to represent the transitions and outputs of T_w. The encoding of ∆_T is bounded above by k bits. Therefore |T_w| ≤ 2(⌊log k⌋ + 1) + 2 + k ≤ 2k for k large.

So D^{2k}_FS(x) ≤ |p_x|, D^{2k}_FS(z) ≤ |p_z| and D^{2k}_FS(y) ≤ |p_y|, where p_y is a minimum-length element of {p_{y,j} : 1 ≤ j ≤ n}. Therefore

D^k_FS(xy^n z) = |p_x p_{y,1} … p_{y,n} p_z| ≥ |p_x| + n|p_y| + |p_z| ≥ D^{2k}_FS(x) + nD^{2k}_FS(y) + D^{2k}_FS(z).

The following lemma states that for almost every pair of strings x and y, given a description of x and a description of y, a transducer T can be built such that, upon reading a padded version of the description of x, a flag, and the description of y, T outputs the string xy.

Lemma 3.5. ∀ε > 0, ∀k ∈ N, ∃k′ ∈ N, ∀∞x ∈ {0,1}∗, ∀y ∈ {0,1}∗,

D^{k′}_FS(xy) ≤ (1 + ε)D^k_FS(x) + D^k_FS(y) + 2.

Proof.
Let ε, x, y and k be as stated in the lemma. Let 0 < ε′ < ε/2 and consider p, q ∈ {0,1}∗ such that D^k_FS(x) = |p| and D^k_FS(y) = |q|, and let A, B ∈ FST^{≤k} where A(p) = x and B(q) = y.

Let b = ⌈1/ε′⌉ ∈ N. Then |p| = nb + r, where 0 ≤ r < b. Let p′ be a new string such that p′ begins with the first nb bits of p, with a 0 placed to separate every b bits starting at the beginning of the string. This is followed by a 1 and the remaining r bits of p doubled. So

p′ = 0p_1…p_b 0p_{b+1}…p_{2b} 0 … 0p_{(n−1)b+1}…p_{nb} 1 p_{nb+1}p_{nb+1}…p_{nb+r}p_{nb+r},

and |p′| = n(b+1) + 2r + 1 = |p| + n + r + 1 ≤ |p| + n + b + 1. |p| ≥ nb means n ≤ ⌈|p|/b⌉, and so for |p| large

|p′| ≤ |p| + ⌈|p|/b⌉ + b + 1 ≤ |p| + 2⌈|p|/b⌉ = |p| + 2⌈|p|/⌈1/ε′⌉⌉ ≤ |p| + 2(ε′|p| + 1) = |p|(1 + 2ε′) + 2 ≤ |p|(1 + ε).

Let T ∈ FST^{≤k′}, where k′ is a number depending only on k and b, be the following FST on input p′10q: T uses p′ to output A(p). T can spot the beginning bits of p from the blocks of size b by the 0s. When T sees the block beginning with 1, it knows that the remaining bits will be the final bits of p doubled. Upon reading 10, T uses the remaining bits to output B(q). Therefore

D^{k′}_FS(xy) ≤ |p′| + |q| + 2 ≤ (1 + ε)D^k_FS(x) + D^k_FS(y) + 2.

The following two lemmas demonstrate the relationship between the k-finite-state complexity of strings x and M(x), where M is an ILFST. Lemma 3.6.
Let M be an ILFST. Then it holds that

(∀ε > 0)(∀k ∈ N)(∃k′ ∈ N)(∀∞x ∈ {0,1}∗) D^{k′}_FS(x) ≤ (1 + ε)D^k_FS(M(x)) + O(1).

Proof.
Let ε, k, x and M be as stated in the lemma, and let 0 < ε′ < ε/2. By Theorem 2.2, there exists an ILFST M⁻¹ and a constant c ∈ N such that for all y ∈ {0,1}∗,

y[0..|y| − c − 1] ⪯ M⁻¹(M(y)) ⪯ y.

Let p be a k-minimal program for M(x), i.e. A(p) = M(x) for some A ∈ FST^{≤k} and D^k_FS(M(x)) = |p|. Let b = ⌈1/ε′⌉ ∈ N. Then |p| = nb + r, where 0 ≤ r < b. Let p′ be a new string such that p′ begins with the first nb bits of p, with a 0 placed to separate every b bits starting at the beginning of the string, followed by a 1 and the remaining r bits of p doubled, i.e.

p′ = 0p_1…p_b 0p_{b+1}…p_{2b} 0 … 0p_{(n−1)b+1}…p_{nb} 1 p_{nb+1}p_{nb+1}…p_{nb+r}p_{nb+r}.

Via the same argument as in Lemma 3.5, whenever |p| is large enough, |p′| ≤ |p|(1 + ε).

Next we build A′ for x. Let y = M⁻¹(M(x)), i.e. x = yz for some z where |z| ≤ c. Let A′ be the machine such that on input p′01z: A′ uses p′ to simulate A(p) to retrieve M(x) and runs this through M⁻¹ to retrieve y. A′ knows where p′ ends due to the 01 separator. After seeing the separator, A′ acts as the identity FST and outputs z. Hence A′(p′01z) = x. Thus

D^{|A′|}_FS(x) ≤ |p′| + 2 + |z| ≤ |p|(1 + ε) + 2 + c = (1 + ε)D^k_FS(M(x)) + O(1).

Lemma 3.7 ([7]). Let M be an ILFST. Then it holds that

(∀k ∈ N)(∃k′ ∈ N)(∀x ∈ {0,1}∗) D^{k′}_FS(M(x)) ≤ D^k_FS(x).

4 Pushdown-Depth

This section develops a new notion of pushdown-depth, based on pushdown compressors, that differs from the notion in [12], and compares it with i.o. finite-state-depth. Our new notion looks at the performance difference between a PDC with f-stack growth and a PDC with no restrictions on its stack.
This is analogous to Bennett's definition of logical depth, which examines the difference between the time-bounded Kolmogorov complexity and ordinary Kolmogorov complexity of strings. Intuitively, a sequence S is f-stack-growth deep if, for an ILPDC with f-stack growth, the bound on the stack restricts how well the compressor can compress the sequence, while an ordinary ILPDC has no restriction on the height of its stack and so can use its stack to identify more patterns in the sequence, thus achieving a smaller compression ratio. Note that the pushdown-depth notion we present is an almost everywhere notion.
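The f-stack-growth restriction of Definition 2.4 itself can be sketched before the formal depth definition: given a recorded trace of a compressor's stack heights, check that the height while reading bit i never exceeds f(i). The trace representation and function names are assumptions of this sketch; for PD_⌊log⌋-depth one takes f(i) = ⌊log i⌋.

```python
# Sketch: verifying the f-stack-growth condition on a recorded trace of
# stack heights.  heights[i] is the stack height while a (hypothetical)
# compressor reads bit i of its input; f is the order function bound.

from math import floor, log2

def has_f_stack_growth(heights, f):
    # the stack height while reading bit i may never exceed f(i)
    return all(h <= f(i) for i, h in enumerate(heights))

def log_bound(i):
    # the bound used for PD_{floor(log)}-depth; 0 for i < 2, where log2
    # is undefined or negative
    return floor(log2(i)) if i >= 2 else 0
```

For instance, a trace that pushes one symbol every other bit, such as [0, 0, 1, 1, 2, 2, 2, 2], satisfies the ⌊log⌋ bound, whereas pushing a symbol on the very first bit already violates it, since f(1) = 0.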
Definition 4.1.
Let S be a sequence and let f be an order function. S is pushdown with f-stack growth deep (PD_f-deep) if (∃α > 0)(∀C ∈ ILPDC_f)(∃C′ ∈ ILPDC)(∀∞n ∈ N),

|C(S↾n)| − |C′(S↾n)| ≥ αn.

Note we can make the previous definition more general by considering classes of bounds f, instead of a single f, to bound the stack growth. As we will not use this more general notion in later results, we present the less general definition above for the sake of simplicity. Lemma 4.2.
Let S be a sequence. Let f, g be order functions such that ∀n ∈ N, f(n) ≥ g(n). Then, if S is PD_f-deep, S is also PD_g-deep.

Proof. This follows from the fact that ILPDC_g ⊆ ILPDC_f.

The following results show that pushdown-depth satisfies the basic depth properties, in the sense that both easy and random sequences cannot be deep. Theorem 4.3.
Let S ∈ {0,1}^ω. Let f be an order function from N to N.

1. If ρ_ILPDC(S) = 1, then S is not PD_f-deep.
2. If R_{ILPDC_f}(S) = 0, then S is not PD_f-deep.

Proof. Let f be an order function and S ∈ {0,1}^ω such that ρ_ILPDC(S) = 1. Then for every α > 0 and C ∈ ILPDC, for almost every n,

|C(S↾n)| > n(1 − α).

Then for almost every n,

|I_PD(S↾n)| − |C(S↾n)| < n − n(1 − α) = αn.

As α is arbitrary and I_PD ∈ ILPDC_f, S is not PD_f-deep.

Next, let S ∈ {0,1}^ω be such that R_{ILPDC_f}(S) = 0. Let C ∈ ILPDC_f be such that

limsup_{n→∞} |C(S↾n)|/n = 0.

Hence for every β > 0 and almost every n, |C(S↾n)| < βn. Therefore for every C′ ∈ ILPDC, it holds that for almost every n,

|C(S↾n)| − |C′(S↾n)| ≤ |C(S↾n)| < βn.

As β is arbitrary and I_PD ∈ ILPDC_f, S is not PD_f-deep.

Before we prove a slow growth law for pushdown-depth, we first demonstrate that the composition of any ILPDC C with any ILFST T can be simulated by another ILPDC C′ which is allowed to perform more λ-transitions than C. Lemma 4.4.
Given C ∈ ILPDC and T ∈ ILFST, we can build an ILPDC N such that ∀x ∈ {0,1}∗, N(x) = C(T(x)). In particular, if C ∈ ILPDC_f, then N ∈ ILPDC_f.

Proof. Let T = (Q_T, q_{0,T}, δ_T, ν_T). Let C = (Q_C, Γ_C, δ_C, ν_C, q_{0,C}, z_0, c). Let d = max{|ν_T(q, b)| : q ∈ Q_T, b ∈ {0,1}}, the longest output possible from a transition in T. Then we build N = (Q_N, Γ_C, δ_N, ν_N, q_{0,N}, z_0, cd), where

• Q_N = Q_C × Q_T × S, where S = {0,1}^{≤cd},
• q_{0,N} = (q_{0,C}, q_{0,T}, λ).

N works as follows. Before reading a bit, N uses λ-transitions to pop the topmost cd bits of its stack, or until the stack only contains z_0, and remembers them in its states. That is, while |y| < cd and a ≠ z_0,

δ_N((q_C, q_T, y), λ, a) = ((q_C, q_T, ya), λ).

On such states, ν_N((q_C, q_T, y), λ, a) = λ. Then for b ∈ {0,1}, if a = z_0 or |y| = cd, N moves to the state representing how C would move on input ν_T(q_T, b), how T would move on input b, and to the state representing not having the topmost stack bits in memory. N's stack then updates to be the same as C's would be had it read ν_T(q_T, b). That is,

δ_N((q_C, q_T, y), b, a) = ((δ̂_{C,Q}(q_C, ν_T(q_T, b), ya), δ_T(q_T, b), λ), xa),

where for some w ∈ {0,1}∗ either

1. x = wy, if C would have pushed w onto its stack reading ν_T(q_T, b),
2. x = wy[i..|y|−1], if C popped off the top i symbols and then pushed w onto its stack reading ν_T(q_T, b),
3. x = y[i..|y|−1], if C popped off the top i symbols from its stack and pushed nothing on when reading ν_T(q_T, b).

As there are only finitely many possibilities, these can all be coded into the states and transitions. On such states, ν_N((q_C, q_T, y), b, a) = ν̂_C(q_C, ν_T(q_T, b), ya).
$N$ is IL since, as $C$ is IL, we can recover $T(x)$ from the output together with the $q_C$ component of the final state, and, as $T$ is IL, we can recover $x$ from $q_T$ and $T(x)$. Note that if $C$ has $f$-stack growth, then so does $N$.

The following result shows that pushdown-depth satisfies a slow growth law.

Theorem 4.5 (Slow Growth Law). Let $S$ be any sequence, let $f$ be an order from $\mathbb{N}$ to $\mathbb{N}$, let $g : \{0,1\}^\omega \to \{0,1\}^\omega$ be ILFS computable and let $S' = g(S)$. If $S'$ is PD$_f$-deep then $S$ is PD$_f$-deep.

Proof. Let
$S, S', f$ and $g$ be as in the statement of the theorem, and let $T$ be the ILFST computing $g$. For all $m \in \mathbb{N}$, let $n_m$ denote the length of the prefix of $S'$ such that $T(S \upharpoonright m) = S' \upharpoonright n_m$. Furthermore, for all $m$, let $m'$ denote the largest integer such that $T(S \upharpoonright m) = T(S \upharpoonright m')$ but $T(S \upharpoonright m) \neq T(S \upharpoonright (m' + 1))$. That is, for all $m \le i \le m'$, $T(S \upharpoonright i) = S' \upharpoonright n_m$. As $T$ is IL, it cannot visit the same state twice without outputting a bit, so there exists a $\beta > 0$ such that for almost every $m$, $n_m \ge \beta m$. Also recall that by Theorem 2.2 there exists an ILFST $T^{-1}$ and a constant $a$ such that for all $x \in \{0,1\}^*$,
$$x[0 \ldots |x| - a - 1] \sqsubseteq T^{-1}(T(x)) \sqsubseteq x.$$
Let $C \in \mathrm{ILPDC}_f$. Let $N$ be the ILPDC$_f$ from Lemma 4.4 such that on input $x$, $N$ simulates $C$ on $T^{-1}(x)$ and outputs the same as $C$. Note that
$$|C(S \upharpoonright m)| \ge |C(T^{-1}(T(S \upharpoonright m)))| = |N(T(S \upharpoonright m))| = |N(T(S \upharpoonright m'))| = |N(S' \upharpoonright n_{m'})|. \quad (1)$$
As $S'$ is PD$_f$-deep, there exist an ILPDC $N'$ and $\alpha > 0$ such that for almost every $m$,
$$|N(S' \upharpoonright n_m)| - |N'(S' \upharpoonright n_m)| \ge \alpha n_m. \quad (2)$$
Let $C'$ be the ILPDC from Lemma 4.4 such that on input $x$, $C'$ simulates $N'$ on $T(x)$ and outputs what $N'$ does. Note that
$$|N'(S' \upharpoonright n_{m'})| = |N'(T(S \upharpoonright m'))| = |C'(S \upharpoonright m')| \ge |C'(S \upharpoonright m)|. \quad (3)$$
Therefore for almost every $m \in \mathbb{N}$,
$$|C(S \upharpoonright m)| - |C'(S \upharpoonright m)| \ge |N(T(S \upharpoonright m'))| - |N'(T(S \upharpoonright m'))| \quad \text{(by (1) \& (3))}$$
$$= |N(S' \upharpoonright n_{m'})| - |N'(S' \upharpoonright n_{m'})| \ge \alpha n_{m'} \quad \text{(by (2))}$$
$$\ge \alpha\beta m' \ge \alpha\beta m.$$
Hence $S$ is PD$_f$-deep.

The following subsection demonstrates a distinction between finite-state-depth and PD$_{\lfloor \log \rfloor}$-depth. This is done by constructing sequences which have low PD$_{\lfloor \log \rfloor}$-depth and high finite-state-depth, and vice versa. First we need the following definitions.

Definition 4.6. Let $S$ be a sequence, let $f$ be an order function, and let $\beta > 0$.
1. We say FS-depth$(S) \ge \beta$ if
$$(\forall k \in \mathbb{N})(\exists k' \in \mathbb{N})(\exists^\infty n \in \mathbb{N})\ D^k_{FS}(S \upharpoonright n) - D^{k'}_{FS}(S \upharpoonright n) \ge \beta n.$$
We say FS-depth$(S) < \beta$ if FS-depth$(S) \ge \beta$ does not hold.

2. We say PD$_f$-depth$(S) \ge \beta$ if
$$(\forall C \in \mathrm{ILPDC}_f)(\exists C' \in \mathrm{ILPDC})(\forall^\infty n \in \mathbb{N})\ |C(S \upharpoonright n)| - |C'(S \upharpoonright n)| \ge \beta n.$$
We say PD$_f$-depth$(S) < \beta$ if PD$_f$-depth$(S) \ge \beta$ does not hold.

The following result demonstrates the existence of a sequence which has a large finite-state-depth level but not even a small PD$_{\lfloor \log \rfloor}$-depth level. The sequence is composed of chunks of random strings whose lengths grow exponentially. Some of these chunks are composed of repetitions of random strings which small FSTs cannot identify while larger FSTs can, resulting in finite-state-depth. The other chunks $x$ are such that $K(x) \ge |x|$, preventing the sequence from being PD$_{\lfloor \log \rfloor}$-deep.

Theorem 4.7.
There exists a sequence $S$ such that for all $0 < \beta < 1$, FS-depth$(S) > (1 - \beta)$ and PD$_{\lfloor \log \rfloor}$-depth$(S) < \beta$.

Proof. Let $0 < \beta <$
$1$. Split $\mathbb{N}$ into intervals $I_1, I_2, I_3, \ldots$ such that $|I_1| = 2$ and $|I_j| = 2^{|I_1| + \cdots + |I_{j-1}|}$. $S$ is constructed in stages $S = S_1 S_2 S_3 \ldots$ as follows. Whenever $j$ is odd, set $S_j$ to be a string of length $|I_j|$ with maximal plain Kolmogorov complexity, in the sense that $K(S_j) \ge |S_j|$. Otherwise, if $j$ is even, $I_j$ is devoted to some FST description bound $k \in \mathbb{N}$. Specifically, each $k$ is devoted to every interval $I_j$ where $j$ is of the form $j = 2^k + n 2^{k+1}$, for $n \ge$
$0$. That is, $k = 1$ is first devoted to $I_2$ and every $4$th interval after that, $k = 2$ is first devoted to $I_4$ and every $8$th interval after that, and so on. This ensures every $k$ is devoted to infinitely many intervals with regular frequency.

For each $k$, let $r_k$ be a string of length $|I_k|$ such that $r_k$ is $3k$-FS random, in the sense that
$$D^{3k}_{FS}(r_k) \ge |r_k| - k. \quad (4)$$
Such a string exists as there are at most $|\mathrm{FST}^{\le 3k}| \cdot 2^{|r_k| - k} < 2^{|r_k|}$ strings contradicting this. If $I_j$ is devoted to $k$, we set $S_j = r_k^{|I_j|/|r_k|}$. Thus, in both the odd and even case, $|S_j| = |I_j|$.

First we show $S$ is finite-state-deep. Let $k$ be large. We examine prefixes of the form $S \upharpoonright m_j = S_1 S_2 \ldots S_j$ of $S$ where $I_j$ is devoted to $k$. By Lemma 3.4 and inequality (4),
$$D^k_{FS}(S \upharpoonright m_j) \ge D^k_{FS}(S_1 \ldots S_{j-1}) + \frac{|S_j|}{|r_k|} D^k_{FS}(r_k) \ge \frac{|S_j|}{|r_k|}(|r_k| - k). \quad (5)$$
Let $T_{r_k}$ be the single-state FST such that on any input $x$, $T_{r_k}(x) = r_k^{|x|}$. Let $k' = |T_{r_k}| + |I_{FS}|$. This means that $D^{k'}_{FS}(S_j) \le |S_j|/|r_k|$ and $D^{k'}_{FS}(S_1 \ldots S_{j-1}) \le |S_1 \ldots S_{j-1}|$, since $T_{r_k}(1^{|S_j|/|r_k|}) = S_j$ and $I_{FS}(S_1 \ldots S_{j-1}) = S_1 \ldots S_{j-1}$. By Lemma 3.
5, there exists a $\hat{k}$ such that for $j$ large,
$$D^{\hat{k}}_{FS}(S \upharpoonright m_j) \le |S_1 S_2 \ldots S_{j-1}| + D^{k'}_{FS}(S_j) + 2 \le |S_1 S_2 \ldots S_{j-1}| + \frac{|S_j|}{|r_k|} + 2. \quad (6)$$
Recalling that $\log|S_j| = |S_1 \ldots S_{j-1}|$, and using that
$$\lim_{k \to \infty} \frac{k + 1}{|r_k|} = 0,$$
for infinitely many prefixes we have that
$$D^k_{FS}(S \upharpoonright m_j) - D^{\hat{k}}_{FS}(S \upharpoonright m_j) \ge \frac{|S_j|}{|r_k|}(|r_k| - k) - 2 - |S_1 S_2 \ldots S_{j-1}| - \frac{|S_j|}{|r_k|}$$
$$\ge |S_j|\left(1 - \frac{\beta}{2}\right) \quad (\text{for } j \text{ and } k \text{ sufficiently large})$$
$$= (m_j - O(\log m_j))\left(1 - \frac{\beta}{2}\right) \quad (\text{as } m_j = |S_j| + \log|S_j|)$$
$$> m_j(1 - \beta).$$
While the above inequality relies on $k$ being large, it in fact holds for all $l$ with $1 \le l \le k$, by noticing that $D^l_{FS}(S \upharpoonright m) \ge D^k_{FS}(S \upharpoonright m)$. Thus FS-depth$(S) > (1 - \beta)$.

Next we show PD$_{\lfloor \log \rfloor}$-depth$(S) < \beta$. Throughout the remainder of the proof we assume $j$ is odd. Recall that for all odd $j$, $S_j$ is a string of length $|I_j|$ that has maximal plain Kolmogorov complexity, in the sense that $K(S_j) \ge |S_j|$. Let $C$ be any ILPDC. Then from an encoding of the tuple $(S_1 \ldots S_{j-1}, q_s, q_e, C, \bar{\nu}_C(S_j))$, where $q_s$ is the state $C$ begins reading $S_j$ in, $q_e$ is the state $C$ ends up in after reading $S_j$, and $\bar{\nu}_C(S_j)$ is the suffix of $C(S_1 \ldots S_j)$ output while reading $S_j$, we can recover $S_j$ as $C$ is lossless. Thus for $j$ large we have that
$$|S_j| \le K(S_j) \le |\bar{\nu}_C(S_j)| + 2\log(|S_1 \ldots S_{j-1}|) + |S_1 \ldots S_{j-1}| + O(|C|).$$
Using that $m_j = |S_j| + |S_1 \ldots S_{j-1}| = |S_j| + \log|S_j|$ and $\log|S_1 \ldots S_{j-1}| = \log\log|S_j|$, for $j$ large we have
$$|C(S \upharpoonright m_j)| \ge |S_j| - 2\log\log|S_j| - \log|S_j| - O(|C|)$$
$$\ge m_j - O(\log|S_j|) \quad (\text{as } m_j - \log|S_j| = |S_j|)$$
$$\ge m_j - O(\log m_j) > m_j(1 - \beta).$$
Hence for all ILPDCs $C$, for infinitely many $m_j$,
$$|I_{PD}(S \upharpoonright m_j)| - |C(S \upharpoonright m_j)| < m_j - m_j(1 - \beta) = \beta m_j.$$
Hence PD$_{\lfloor \log \rfloor}$-depth$(S) < \beta$.

The next result demonstrates the existence of a sequence which achieves a PD$_{\lfloor \log \rfloor}$-depth level of roughly $1/2$ while at the same time having a small finite-state-depth level. This sequence is split into chunks composed of repetitions of strings of the form $R 1^m R^{-1}$, where $1^m$ is a flag and $R$ is a string not containing the flag which has large plain Kolmogorov complexity relative to its length. A large ILPDC $C'$ is built to push $R$ onto its stack and then, when it sees the flag $1^m$, to use its stack to compress $R^{-1}$. However, these $R$ are built such that an ILPDC$_{\lfloor \log \rfloor}$ is unable to push $R$ fully onto its stack due to the stack's height restriction, resulting in no compression. To finite-state transducers the sequence appears random, and so no finite-state-depth is achieved.

Theorem 4.8.
For all $0 < \beta < 1/2$, there exists a sequence $S$ such that PD$_{\lfloor \log \rfloor}$-depth$(S) \ge \frac{1}{2} - \beta$ and FS-depth$(S) < \beta$.

Proof. Let $0 < \beta < 1/2$. Let $m = m(\beta)$ and $v = v(\beta)$ be integers to be determined later. We construct $S$ in stages $S = S_1 S_2 S_3 \ldots$ such that for all $j$, $|S_j| = 2^j$, and for some $i \in \mathbb{N}$, to be determined later, we set $S_1 \cdots S_{i-1} = 00 \cdots 0$. Let $T_j$ denote the set of strings of length $j$ that do not contain $1^m$ as a substring. As $T_j$ contains the strings of the form $\{0,1\}^{m-1} \times \{0\} \times \{0,1\}^{m-1} \times \{0\} \cdots$, we have that $|T_j| \ge 2^{j(1 - \frac{1}{m})}$.

For each $j$, let $R_j \in T_{mj}$ have maximal plain Kolmogorov complexity within $T_{mj}$, in the sense that
$$K(R_j) \ge |R_j|\left(1 - \frac{1}{m}\right). \quad (7)$$
Such an $R_j$ exists as $|T_{mj}| \ge 2^{|R_j|(1 - \frac{1}{m})}$, while fewer than $2^{|R_j|(1 - \frac{1}{m})}$ strings $x$ satisfy $K(x) < |R_j|(1 - \frac{1}{m})$. For $j \ge i$, we set
$$S_j = (R_j 1^m R_j^{-1})^{n_j} 1^{t_j}$$
where $n_j$ is the maximal possible number of occurrences of $R_j 1^m R_j^{-1}$ blocks in $S_j$, and where the leftover flag $1^{t_j}$ satisfies
$$m \le t_j < |R_j 1^m R_j^{-1}| + m = 2m(j + 1).$$
Note that $|R_j 1^m R_j^{-1}| = m(2j + 1)$. Hence the number of bits devoted to $R_j 1^m R_j^{-1}$ blocks in $S_j$ is at least $|S_j| - 2m(j + 1)$, so for $j$ large we can bound $n_j$ from below by
$$n_j \ge \frac{|S_j| - 2m(j + 1)}{2|R_j| + m} > \frac{|S_j|(1 - \frac{1}{m})}{2|R_j|}. \quad (8)$$
Note that $i$ is chosen to be the least integer such that $2m(i + 1) < 2^i$, ensuring $S_i$ contains at least one block.

First we examine how well any ILPDC$_{\lfloor \log \rfloor}$ compresses occurrences of $R_j$. Let $C \in \mathrm{ILPDC}_{\lfloor \log \rfloor}$. Then as $C$ is lossless, having an encoding of $C$, the state $q_s$ that $C$ begins reading $R_j$ in, the state $q_e$ that $C$ ends up in after reading $R_j$, the stack contents $z$ of $C$ as it begins reading $R_j$ in $q_s$, and the output $\hat{\nu}_C(q_s, R_j, z)$ of $C$ on $R_j$, we can recover $R_j$. Note that $|z| \le \lfloor \log(2^{j+1} - 1) \rfloor < j + 1$. Thus if we encode the tuple (
$C, q_s, q_e, z, \hat{\nu}_C(q_s, R_j, z))$ via our encoding of tuples, we have, as $\lceil \log(j + 1) \rceil < j$ for $j$ large, that
$$|R_j|\left(1 - \frac{1}{m}\right) \le K(R_j) \le |\hat{\nu}_C(q_s, R_j, z)| + 2j + O(|C|) + O(1) = |\hat{\nu}_C(q_s, R_j, z)| + O(j).$$
Hence,
$$|\hat{\nu}_C(q_s, R_j, z)| \ge |R_j|\left(1 - \frac{1}{m}\right) - O(j) \ge |R_j|\left(1 - \frac{2}{m}\right). \quad (9)$$
This is similarly true for $R_j^{-1}$ blocks, as $K(R_j) \le K(R_j^{-1}) + O(1)$. As this holds for all $R_j$ and $R_j^{-1}$ blocks of $S_j$, we have by (9) that for $j$ large,
$$|C(S_1 \ldots S_j)| - |C(S_1 \ldots S_{j-1})| = |\bar{\nu}_C(S_j)| \ge 2 n_j |R_j|\left(1 - \frac{2}{m}\right). \quad (10)$$

Next we build an ILPDC $C'$ that performs well on $S_j$. A complete description is given at the end of this proof. On $S_1 \ldots S_{i-1}$, $C'$ outputs its input, looking for the first 1 indicating the beginning of the first $R_i 1^m R_i^{-1}$ block. On $S_j$ for $j \ge i$, while on an $R_j$ block, $C'$ outputs its input and pushes $R_j$ onto its stack. On $R_j$, $C'$ reads its input in chunks of size $m$, trying to find the flag $1^m$. As $|R_j|$ is divisible by $m$, the first time $C'$ reads the chunk $1^m$ it knows it has just read the flag, and it pops $1^m$ from the top of its stack. $C'$ knows that the next bit it reads will be the first bit of $R_j^{-1}$. From here, $C'$ compares each input bit to the top of its stack, making sure they match to ensure it is reading $R_j$ reversed, popping bits as it goes. It compresses $R_j^{-1}$ by a ratio of $v$, where $v \in \mathbb{N}$ can be made arbitrarily large when building $C'$. When its stack is empty, $C'$ checks whether the next $m$ bits are $1^m$. If they are, $C'$ outputs its input until it sees a 0 and begins reading $S_{j+1}$. Otherwise, $C'$ acts as if it were on another $R_j$ block, as described above. If $C'$ sees something it does not expect, it enters an error state and outputs its input from then on to maintain its IL property. Hence for $j$ large,
$$|\bar{\nu}_{C'}(S_j)| = |C'(S_1 \ldots S_j)| - |C'(S_1 \ldots$$
$S_{j-1})|$
$$\le n_j\left(|R_j|\left(1 + \frac{1}{v}\right) + m\right) + t_j + 1$$
$$< n_j\left(|R_j|\left(1 + \frac{1}{v}\right) + m\right) + 2|R_j| + 2m$$
$$< n_j\left(|R_j|\left(1 + \frac{1}{v}\right) + m\right) + 2\left(|R_j|\left(1 + \frac{1}{v}\right) + m\right) \quad (11)$$
$$= (n_j + 2)\left(|R_j|\left(1 + \frac{1}{v}\right) + m\right) \le n_j|R_j|\left(1 + \frac{2}{v}\right). \quad (12)$$
Therefore, choosing $m$ and $v$ large enough that $\frac{4}{m} + \frac{2}{v} < \varepsilon_1$ for some $0 < \varepsilon_1 < \frac{1}{2}$, we have for $j$ large that
$$|\bar{\nu}_C(S_j)| - |\bar{\nu}_{C'}(S_j)| \ge n_j\left(2|R_j|\left(1 - \frac{2}{m}\right) - |R_j|\left(1 + \frac{2}{v}\right)\right) \quad \text{(by (10) \& (12))}$$
$$= n_j|R_j|\left(1 - \frac{4}{m} - \frac{2}{v}\right) > n_j|R_j|(1 - \varepsilon_1) \quad (13)$$
$$> \frac{|S_j|(1 - \frac{1}{m})}{2|R_j|}\,|R_j|(1 - \varepsilon_1) \quad \text{(by (8))}$$
$$\ge |S_j|\left(\frac{1}{2} - \varepsilon_2\right) \quad \left(\text{where } \varepsilon_1 < \varepsilon_2 < \frac{1}{2}\right).$$
Say $|\bar{\nu}_C(S_j)| - |\bar{\nu}_{C'}(S_j)| \ge |S_j|(\frac{1}{2} - \varepsilon_2)$ holds for all $j \ge \hat{j}$. Then for $j$ large,
$$|C(S_1 \ldots S_j)| - |C'(S_1 \ldots S_j)| \ge |S_{\hat{j}} \ldots S_j|\left(\frac{1}{2} - \varepsilon_2\right) - |S_1 \ldots S_{\hat{j}-1}| = |S_{\hat{j}} \ldots S_j|\left(\frac{1}{2} - \varepsilon_2\right) - O(1) \ge |S_1 \ldots S_j|\left(\frac{1}{2} - \varepsilon_3\right) \quad (14)$$
for some $\varepsilon_2 < \varepsilon_3 < \frac{1}{2}$.

Next we examine an arbitrary prefix $S \upharpoonright n$ of $S$. Let $j$ and $0 < s \le n_j$ be maximal such that
$$S_1 \ldots S_{j-1}(R_j 1^m R_j^{-1})^s \sqsubseteq S \upharpoonright n,$$
and let $y \in \{0,1\}^*$ be such that $S_1 \ldots S_{j-1}(R_j 1^m R_j^{-1})^s y = S \upharpoonright n$. Note that $|y| = O(j)$. Then, writing $|\bar{\nu}_{C'}(y)| = |C'(S_1 \ldots (R_j 1^m R_j^{-1})^s y)| - |C'(S_1 \ldots (R_j 1^m R_j^{-1})^s)|$, we have that
$$|C(S \upharpoonright n)| - |C'(S \upharpoonright n)| \ge |S_1 \ldots S_{j-1}|\left(\frac{1}{2} - \varepsilon_3\right) + 2s|R_j|(1 - \varepsilon_1) - |\bar{\nu}_{C'}(y)| \quad \text{(by (13) \& (14))}$$
$$\ge |S_1 \ldots S_{j-1}|\left(\frac{1}{2} - \varepsilon_3\right) + s(|R_j 1^m R_j^{-1}| - m)(1 - \varepsilon_1) - |y|$$
$$\ge |S_1 \ldots S_{j-1}|\left(\frac{1}{2} - \varepsilon_3\right) + s|R_j 1^m R_j^{-1}|\left(\frac{1}{2} - \varepsilon_3\right) - |y|$$
$$= (n - |y|)\left(\frac{1}{2} - \varepsilon_3\right) - |y| \ge n\left(\frac{1}{2} - \varepsilon_4\right) \quad \left(\text{for } \varepsilon_3 < \varepsilon_4 < \frac{1}{2}, \text{ as } n = O(2^j) \text{ and } |y| = O(j)\right).$$
That is, PD$_{\lfloor \log \rfloor}$-depth$(S) \ge \frac{1}{2} - \varepsilon_4$.

Next we examine the finite-state-depth of $S$. For a prefix $S \upharpoonright n$ of $S$, let $j$ and $s \le n_j$ be maximal such that
$$S_1 \ldots S_{j-1}(R_j 1^m R_j^{-1})^s y = S \upharpoonright n.$$
Note that $|y| = O(j)$. For all $k$ and $j$,
$$|R_j|\left(1 - \frac{1}{m}\right) \le K(R_j) \le D^k_{FS}(R_j) + O(1).$$
So for $j$ large (say $j \ge j'$),
$$D^k_{FS}(R_j) > |R_j|\left(1 - \frac{2}{m}\right).$$
$(15)$

This is similarly true for $R_j^{-1}$ blocks, as $K(R_j) \le K(R_j^{-1}) + O(1)$.

So, by Lemma 3.4, we can bound the $k$-finite-state complexity of $S \upharpoonright n$ below by the sum of the $k$-finite-state complexities of all the $R_j$ and $R_j^{-1}$ blocks. So, for $j$ large with $j \ge j'$,
$$D^k_{FS}(S \upharpoonright n) \ge \left(2\sum_{a=j'}^{j-1} n_a D^k_{FS}(R_a)\right) + 2s\,D^k_{FS}(R_j) + D^k_{FS}(y)$$
$$> \left(1 - \frac{2}{m}\right)\left(2\sum_{a=j'}^{j-1} n_a|R_a| + 2s|R_j|\right) \quad \text{(by (15))}$$
$$\ge (1 - \varepsilon_5)\left(\sum_{a=j'}^{j-1} |S_a| + s(|R_j 1^m R_j^{-1}| - m)\right) \quad \left(\text{by (8), where } \frac{3}{m} < \varepsilon_5 < \frac{1}{2}\right)$$
$$= (n - |y| - |S_1 \ldots S_{j'}|)(1 - \varepsilon_5) \ge n(1 - \varepsilon_6) \quad (\text{as } n = O(2^j) \text{ and } |y| = O(j)).$$
So for all $k$ and $k'$, in particular a $k'$ with $I_{FS} \in \mathrm{FST}^{\le k'}$, for almost every $n$,
$$D^k_{FS}(S \upharpoonright n) - D^{k'}_{FS}(S \upharpoonright n) \le n - n(1 - \varepsilon_6) = n\varepsilon_6.$$
That is, FS-depth$(S) \le \varepsilon_6$. Thus, choosing $m, v$ large enough at the start to ensure that $\varepsilon_6 < \beta < \frac{1}{2}$ gives the desired result.

For completeness, the following is a construction of $C'$. Let $Q$ be the following set of states:

1. The start state $q_0$, and $q_1, \ldots, q_{n_0}$, the prefix states that count up to $n_0 = |S_1 \ldots S_{i-1}|$.
2. $q_{R_w}$, the states for the $R_j$ blocks, where $w \in \{0,1\}^{\le m-1}$.
3. $q_{F_i}$, the states for popping the flag, for $0 \le i \le m - 1$.
4. $q_{R^{-1}_i}$, the states for the $R_j^{-1}$ blocks, for $1 \le i \le v$.
5. $q_{H_i}$, the check states used to see if $C'$ is in another $R_j$ block or if an $S_j$ block is coming to an end, for $0 \le i \le m$.
6. $q_e$, the error state.

Next we describe the transition function $\delta : Q \times (\{0,1\} \cup \{\lambda\}) \times \Gamma \to Q \times \Gamma^*$ of $C'$. Starting in $q_0$, $C'$ counts up to $n_0$ on $S_1 \ldots S_{i-1}$, outputting its input. That is, for $i = 0 \ldots n_0 - 1$,
$$\delta(q_i, x, y) = (q_{i+1}, y).$$
After reading $n_0$ symbols, $C'$ enters the first state for reading an $R_j$ block, pushing the input bit onto the stack. That is, for any $x, y$,
$$\delta(q_{n_0}, x, y) = (q_{R_x}, xy).$$
$C'$ then reads its input in chunks of size $m$, trying to find the flag $1^m$.
That is, for $w \in \{0,1\}^{\le m-2}$ and any $x, y$,
$$\delta(q_{R_w}, x, y) = (q_{R_{wx}}, xy),$$
and for $w \in \{0,1\}^{m-1}$,
$$\delta(q_{R_w}, x, y) = \begin{cases} (q_{R_\lambda}, xy) & \text{if } w \neq 1^{m-1}, \text{ or } w = 1^{m-1} \text{ and } x = 0,\\ (q_{F_0}, xy) & \text{if } w = 1^{m-1} \text{ and } x = 1. \end{cases}$$
$C'$ then pops the top $m$ 1s (the flag) from its stack using $\lambda$-transitions, after which it enters the states for the $R_j^{-1}$ blocks. That is, for $0 \le i \le m - 1$,
$$\delta(q_{F_i}, \lambda, 1) = \begin{cases} (q_{F_{i+1}}, \lambda) & \text{if } i \neq m - 1,\\ (q_{R^{-1}_1}, \lambda) & \text{if } i = m - 1. \end{cases}$$
On an $R_j^{-1}$ block, $C'$ reads its input, comparing it with the stack and counting modulo $v$. If the bits match, $C'$ pops the stack. It does this until the stack is empty. If at any stage the top of the stack and the current input bit do not match, $C'$ enters the error state $q_e$. That is, for $y \in \{0,1\}$ and $1 \le i \le v - 1$,
$$\delta(q_{R^{-1}_i}, x, y) = \begin{cases} (q_{R^{-1}_{i+1}}, \lambda) & \text{if } x = y,\\ (q_e, \lambda) & \text{if } x \neq y, \end{cases}$$
and if $i = v$,
$$\delta(q_{R^{-1}_v}, x, y) = \begin{cases} (q_{R^{-1}_1}, \lambda) & \text{if } x = y,\\ (q_e, \lambda) & \text{if } x \neq y. \end{cases}$$
If $y = z_0$, for $1 \le i \le v$,
$$\delta(q_{R^{-1}_i}, \lambda, z_0) = (q_{H_0}, z_0).$$
In the $q_{H_i}$ states, $C'$ checks to see whether it is in another $R_j$ block or at the end of the current $S_j$ block. It does this by checking whether the first $m$ bits it reads are $1^m$. That is, for $i = 0 \ldots m - 1$,
$$\delta(q_{H_i}, x, y) = \begin{cases} (q_{H_{i+1}}, y) & \text{if } x = 1,\\ (q_{R_{1^i 0}}, 01^i y) & \text{if } x = 0, \end{cases}$$
and if $i = m$,
$$\delta(q_{H_m}, x, y) = \begin{cases} (q_{H_m}, y) & \text{if } x = 1,\\ (q_{R_0}, 0y) & \text{if } x = 0. \end{cases}$$
On the error state $q_e$, $C'$ stays in $q_e$. That is, for all $x, y \in \{0,1\}$,
$$\delta(q_e, x, y) = (q_e, y).$$
$C'$ outputs its input on all non-error states except on the states $q_{R^{-1}_i}$, where nothing is output for $1 \le i < v$,
$$\nu(q_{R^{-1}_i}, x, y) = \lambda,$$
and a 0 is output after $v$ stack symbols have been checked,
$$\nu(q_{R^{-1}_v}, x, y) = 0 \quad \text{if } x = y.$$
When an error is seen, $1^i x$ is output; that is, for $1 \le i \le v$,
$$\nu(q_{R^{-1}_i}, x, y) = 1^i x \quad \text{if } x \neq y \text{ and } y \neq z_0.$$
In the error state, $C'$ outputs its input:
$$\nu(q_e, x, y) = x.$$
Now we verify that $C'$ is IL. If the final state is not an error state and not a $q_{R^{-1}_i}$ state, then $S_1 \cdots S_{i-1}$, all $R_j$ blocks, all flags and all trailing flags $1^{t_j}$ are output verbatim, and each compressed $R_j^{-1}$ block is recoverable from the $R_j$ block preceding it. If the final state is $q_{R^{-1}_i}$, then the number $t$ of 0s output after the final flag tells us that the input ends with the first $tv + i - 1$ bits of the last $R_j^{-1}$ block. If the final state is $q_e$, then for some $a, b \in \{0,1\}^*$ the output has the form $a R_j 1^m 0^t 1^i x\, b$: the prefix corresponds to a computation ending in an $R_j^{-1}$ state after matching $tv + i - 1$ bits, so the input read up to the error is $a R_j 1^m$ followed by $R_j^{-1}[1 \ldots tv + i - 1]\,x$, and from $q_e$ onward the output $b$ equals the remaining input.
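The push-then-match idea behind $C'$ can be illustrated with a small Python sketch. This is our own simplification, not the transducer above: the flag is detected by chunking but not pushed, the error state is modelled as an exception, and only a single $R\,1^m\,R^{-1}$ block is handled.

```python
def compress_block(w, m, v):
    """Toy model of the stack idea behind C': copy R to the output while
    pushing it, detect the flag 1^m, then verify the reversed copy against
    the stack, emitting only one bit per v matched bits."""
    out = []
    stack = []
    i = 0
    # Phase 1: read chunks of size m until the flag 1^m appears.
    while True:
        chunk = w[i:i + m]
        i += m
        if not chunk:
            raise ValueError("no flag found")
        if chunk == "1" * m:      # flag found; output it, do not push it
            out.append(chunk)
            break
        stack.extend(chunk)       # push R's bits
        out.append(chunk)         # R itself is output verbatim
    # Phase 2: match the reversed copy against the stack, compressing by ~1/v.
    matched = 0
    while stack:
        bit = w[i]
        i += 1
        if bit != stack.pop():    # mismatch: the real C' enters its error state
            raise ValueError("input is not of the form R 1^m reverse(R)")
        matched += 1
        if matched % v == 0:
            out.append("0")       # one output bit per v matched input bits
    return "".join(out)

R = "0010110010100100"            # contains no 1^m, length divisible by m
m, v = 4, 4
block = R + "1" * m + R[::-1]
print(len(block), len(compress_block(block, m, v)))  # 36 24
```

With $|R| = 16$ and $v = 4$, the $|R|$ reversed bits shrink to $|R|/v = 4$ output bits, matching the $|R|(1 + \frac{1}{v}) + m$ cost per block used in the proof.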
This section develops a notion of Lempel-Ziv-depth (LZ-depth) based on the difference in compression between information lossless finite-state transducers and the Lempel-Ziv '78 (LZ) algorithm. ILFSTs are chosen as part of this notion as LZ is asymptotically better than any ILFST [22]. Intuitively, a sequence is LZ-deep if, for every ILFST, the compression difference between the ILFST and the LZ algorithm is bounded below by a constant times the length of the prefix examined.
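The LZ '78 algorithm referred to throughout parses its input into distinct phrases, each extending a previously seen phrase by one bit, so that the output length is roughly the number of phrases times the logarithm of the dictionary size. The following is a minimal sketch of such an encoder's output length, using a standard textbook pointer-plus-bit coding rather than the paper's exact encoding:

```python
from math import ceil, log2

def lz78_len(x):
    """Length in bits of a standard LZ78 encoding of x: the input is parsed
    into phrases, each new phrase extending a dictionary phrase by one bit,
    and phrase i is coded as (pointer, bit) using ceil(log2(i)) + 1 bits."""
    dictionary = {"": 0}
    total = 0
    phrase = ""
    for bit in x:
        phrase += bit
        if phrase not in dictionary:
            dictionary[phrase] = len(dictionary)
            total += ceil(log2(len(dictionary))) + 1  # pointer + new bit
            phrase = ""
    if phrase:  # leftover partial phrase coded as a pointer alone
        total += ceil(log2(len(dictionary)))
    return total

# Highly repetitive input yields few, long phrases and therefore a short
# encoding; this is what drives the LZ side of the depth comparisons.
print(lz78_len("01" * 256), len("01" * 256))
```

On the 512-bit string $(01)^{256}$ the phrase count grows only like $\sqrt{n}$, so the encoding is far shorter than the input, whereas on a random string the phrase count, and hence the output, stays close to $n$.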
Definition 5.1.
A sequence $S$ is Lempel-Ziv-deep (LZ-deep) if
$$(\exists \alpha > 0)(\forall C \in \mathrm{ILFST})(\forall^\infty n \in \mathbb{N})\ |C(S \upharpoonright n)| - |LZ(S \upharpoonright n)| \ge \alpha n.$$
We say a sequence is infinitely often (i.o.) LZ-deep if the $(\forall^\infty n \in \mathbb{N})$ quantifier in the above definition is replaced with $(\exists^\infty n \in \mathbb{N})$.

The following results demonstrate that LZ incompressible sequences and strongly finite-state compressible sequences cannot be LZ-deep.

Theorem 5.2.
Let $S \in \{0,1\}^\omega$.

1. If $\rho_{LZ}(S) = 1$, then $S$ is not LZ-deep.
2. If $R_{\mathrm{ILFST}}(S) = 0$, then $S$ is not LZ-deep.

Proof. The proof follows the same structure as that of Theorem 4.3.
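The ILFSTs quantified over here must be information lossless: distinct inputs must be distinguishable from the pair (output, final state). For a small transducer this property can be checked by brute force over bounded input lengths; the sketch below uses our own dictionary encoding of the transition and output tables, which is not the paper's encoding of FSTs.

```python
from itertools import product

def is_IL(delta, nu, q0, max_len=8):
    """Bounded brute-force check of information losslessness for an FST given
    by delta[(state, bit)] -> state and nu[(state, bit)] -> output string:
    no two distinct inputs may share both their output and their final state."""
    seen = {}
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            q, out = q0, []
            for b in bits:
                out.append(nu[(q, b)])
                q = delta[(q, b)]
            key = ("".join(out), q)
            if key in seen:
                return False  # two inputs collide: not IL (up to this length)
            seen[key] = bits
    return True

# The identity transducer I_FS: one state, each bit output unchanged. It is IL.
delta = {("q", "0"): "q", ("q", "1"): "q"}
nu = {("q", "0"): "0", ("q", "1"): "1"}
print(is_IL(delta, nu, "q"))  # True

# A transducer that erases its input is lossy and is rejected.
nu_lossy = {("q", "0"): "", ("q", "1"): ""}
print(is_IL(delta, nu_lossy, "q"))  # False
```

A `True` answer here is only evidence up to `max_len`; the IL property proper quantifies over all finite inputs.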
The following theorem demonstrates the existence of a sequence that is LZ-deep but not finite-state-deep. It relies on a result by Lathrop and Strauss [15] which demonstrates the existence of a normal sequence $S$ such that $R_{LZ}(S) < 1$.

Theorem 5.3.
There exists a sequence $S$ that is LZ-deep but not finite-state-deep.

Proof. Let $S$ be Lathrop and Strauss' normal sequence such that $R_{LZ}(S) < 1$ [15]. A result by Doty and Moser shows that no normal sequence is finite-state deep [7]. Thus $S$ is finite-state shallow.

Let $R_{LZ}(S) = \varepsilon < 1$ and choose $\delta > 0$ such that $\varepsilon + 2\delta < 1$. Then for almost every $n$,
$$|LZ(S \upharpoonright n)| \le (\varepsilon + \delta)n.$$
Also, as $S$ is normal it is incompressible by any ILFST, i.e. $\rho_{\mathrm{ILFST}}(S) = 1$ [6, 21, 2]. Thus for all $C \in \mathrm{ILFST}$, for almost every $n$, $|C(S \upharpoonright n)| \ge (1 - \delta)n$. Hence for almost every $n$,
$$|C(S \upharpoonright n)| - |LZ(S \upharpoonright n)| \ge (1 - \varepsilon - 2\delta)n.$$
Thus $S$ is LZ-deep.

Next we demonstrate that the sequence which satisfies Theorem 4.7 is finite-state deep but not LZ-deep. Its long sections of random strings prevent LZ-depth, as was the case with PD$_{\lfloor \log \rfloor}$-depth.

Theorem 5.4.
There exists a sequence $S$ that is finite-state deep but not LZ-deep.

Proof. Let $S$ be the sequence that satisfies Theorem 4.7. Recall that for all $0 < \beta < 1$, $S$ satisfies FS-depth$(S) > (1 - \beta)$, i.e. $S$ is finite-state deep.

We now show $S$ is not LZ-deep. Recall that $S = S_1 S_2 S_3 \ldots$ where $|S_1| = 2$ and for all $j$, $|S_j| = 2^{|S_1 \ldots S_{j-1}|}$, and where for $j$ odd, $S_j$ is a string of maximal plain Kolmogorov complexity, in the sense that $K(S_j) \ge |S_j|$.

For any prefix $S_1 \ldots S_j$, $S_j$ can be recovered from the string $d(S_1 \ldots S_{j-1})\,01\,LZ(S_1 \ldots S_{j-1} \mid S_j)$, where $d(x)$ denotes the encoding of $x$ in which every bit is doubled and $LZ(S_1 \ldots S_{j-1} \mid S_j)$ denotes the portion of $LZ(S_1 \ldots S_j)$ output while reading $S_j$. Therefore for $j$ odd,
$$|S_j| \le K(S_j) \le 2|S_1 \ldots S_{j-1}| + 2 + |LZ(S_1 \ldots S_{j-1} \mid S_j)| + O(1),$$
and so for all $0 < \alpha < 1$ we have, for $j$ large, that
$$|LZ(S_1 \ldots S_{j-1} \mid S_j)| \ge |S_j| - O(\log|S_j|) \ge |S_j|(1 - \alpha).$$
Thus, for infinitely many prefixes $S_1 \ldots S_j$ with $j$ odd,
$$|LZ(S_1 \ldots S_j)| \ge |LZ(S_1 \ldots S_{j-1} \mid S_j)| \ge |S_j|(1 - \alpha) = (|S_1 \ldots S_j| - |S_1 \ldots S_{j-1}|)(1 - \alpha) = (|S_1 \ldots S_j| - \log|S_j|)(1 - \alpha) > |S_1 \ldots S_j|(1 - 2\alpha).$$
Hence we have that
$$\limsup_{n \to \infty} \frac{|LZ(S \upharpoonright n)|}{n} \ge \limsup_{j \to \infty} \frac{|LZ(S_1 \ldots S_j)|}{|S_1 \ldots S_j|} \ge 1 - 2\alpha.$$
As $\alpha$ can be chosen arbitrarily small, this means that $R_{LZ}(S) = 1$. Thus $S$ is not LZ-deep by Theorem 5.2.

The following theorem demonstrates that the sequence from Theorem 4.7, while finite-state deep and not LZ-deep, is in fact infinitely often LZ-deep. This is because the LZ algorithm is able to compress the sections of the sequence composed of repetitions of random strings.

Theorem 5.5.
There exists a sequence which is finite-state deep and i.o. LZ-deep, but not almost everywhere LZ-deep.

Proof. Let $S$ be the sequence from Theorem 4.7. It was shown in Theorem 5.4 that $S$ is finite-state deep but not LZ-deep. All that remains is to show that $S$ is i.o. LZ-deep.

Recall that we split $\mathbb{N}$ into intervals $I_1, I_2, \ldots$ such that $|I_1| = 2$ and $|I_j| = 2^{|I_1| + \cdots + |I_{j-1}|}$ for all $j$. Also recall that for all $k \ge 1$, $k$ is devoted to every interval $I_j$ where $j$ is of the form $j = 2^k + t 2^{k+1}$, for all $t \ge$
$0$. We built $S = S_1 S_2 \ldots$ in stages such that if $k$ was devoted to $I_j$ then we set $S_j = r_k^{|I_j|/|r_k|}$, where $r_k$ was a string of length $|I_k|$ that was $3k$-FS random, in the sense that
$$D^{3k}_{FS}(r_k) \ge |r_k| - k.$$
First we examine how any ILFST compresses $S$. Let $C \in \mathrm{ILFST}$ and suppose $Q_C = \{q_1, \ldots, q_n\}$. For all $1 \le i \le n$, we let $C_i$ denote the ILFST with the same states, transition table and output table as $C$ but with start state $q_i$. That is, for all $x \in \{0,1\}^*$, $C_i(x) = \nu_C(q_i, x)$. Recall from our encoding of FSTs that each $C_i$ has an encoding of essentially the same length as that of $C$.

Next, let $d$ be such that $I_{FS} \in \mathrm{FST}^{\le d}$. Then let $d'$ be large enough that $d'$ satisfies Lemma 3.7 for all $i = 1, \ldots, n$. That is, for all $x$,
$$D^{d'}_{FS}(C_i(x)) \le D^d_{FS}(x). \quad (16)$$
Let $0 < \varepsilon < 1$. Let $l$ be large enough that $l$ satisfies Lemma 3.6 for all $i = 1, \ldots, n$. That is, for almost every $x \in \{0,1\}^*$,
$$D^l_{FS}(x) \le (1 + \varepsilon) D^{d'}_{FS}(C_i(x)) + O(1). \quad (17)$$
Of our set of random strings $\{r_k\}_{k \in \mathbb{N}}$, let $l' \ge l$ be such that $r_{l'}$ is $3l'$-FS random, satisfies (17), and satisfies $|r_{l'}| - l' \ge |r_{l'}|(1 - \varepsilon)$. Such an $l'$ must exist as $\{r_k\}_{k \in \mathbb{N}}$ is a set of strings of increasing length.

Therefore we have that
$$|r_{l'}|(1 - \varepsilon) \le |r_{l'}| - l' \le D^{3l'}_{FS}(r_{l'}) \le D^l_{FS}(r_{l'}) \quad (\text{as } r_{l'} \text{ is } 3l'\text{-FS random and } l \le 3l')$$
$$\le (1 + \varepsilon) D^{d'}_{FS}(C_i(r_{l'})) + O(1) \quad \text{(by (17))}$$
$$\le D^{d'}_{FS}(C_i(r_{l'})) + \varepsilon D^d_{FS}(r_{l'}) + O(1) \quad \text{(by (16))}$$
$$\le |C_i(r_{l'})| + \varepsilon|r_{l'}| + O(1).$$
Thus for all $i = 1, \ldots, n$ we have, for $l'$ chosen large, that
$$|C_i(r_{l'})| \ge |r_{l'}|(1 - \varepsilon) - \varepsilon|r_{l'}| - O(1) \ge |r_{l'}|(1 - 3\varepsilon). \quad (18)$$
That is, $|\nu_C(q_i, r_{l'})| \ge |r_{l'}|(1 - 3\varepsilon)$ for all $i$.

We now calculate a lower bound for the compression of $S_1 \ldots S_j$ by $C$ when $S_j = r_{l'}^{|I_j|/|r_{l'}|}$. We have, for $j$ devoted to $l'$ large, that
$$|C(S_1 \ldots S_j)| \ge \frac{|S_j|}{|r_{l'}|}\,|r_{l'}|(1 - 3\varepsilon) \quad \text{(by (18))}$$
$$= (|S_1 \ldots S_j| - \log|S_j|)(1 - 3\varepsilon) \ge |S_1 \ldots$$
$S_j|(1 - \varepsilon') \quad (19)$

where $3\varepsilon < \varepsilon' < 1$. Next we examine how LZ performs on prefixes of $S$ of the form $S_1 \ldots S_j$ where $I_j$ is devoted to $l'$. Note that after reading $S_1 \ldots S_{j-1}$, LZ's dictionary will have size bounded above by $|S_1 \ldots S_{j-1}|$, i.e. by $\log|S_j|$. Setting $a = |r_{l'}| + 1$, by Lemma 2.5 we have that
$$|LZ(S_1 \ldots S_j)| \le |S_1 \ldots S_{j-1}| + o(|S_1 \ldots S_{j-1}|) + |LZ(S_1 \ldots S_{j-1} \mid S_j)|$$
$$\le \log(|S_j|) + o(\log|S_j|) + \sqrt{a|S_j|}\,\log\left(\log|S_j| + \sqrt{a|S_j|}\right) \quad (20)$$
$$= O\left(\sqrt{|S_j|}\,\log\sqrt{|S_j|}\right). \quad (21)$$
Hence, as infinitely many intervals are devoted to $l'$, there are infinitely many prefixes of the form $S_1 \ldots S_j$ such that
$$|C(S_1 \ldots S_j)| - |LZ(S_1 \ldots S_j)| \ge |S_1 \ldots S_j|(1 - \varepsilon') - O\left(\sqrt{|S_j|}\,\log\sqrt{|S_j|}\right) \quad \text{(by (19) \& (21))}$$
$$\ge |S_1 \ldots S_j|(1 - \beta)$$
for $\varepsilon' < \beta <$
$1$. Thus, as $C$ was an arbitrary ILFST, $S$ is i.o. LZ-deep.

The following results demonstrate the difference between LZ-depth and pushdown-depth. We first demonstrate the existence of a sequence that has high LZ-depth but is not pushdown-deep. We then show that we can build a sequence with a PD$_{\lfloor \log \log \rfloor}$-depth level of roughly $1/2$ which has a small LZ-depth level. Before we begin, we note the following notation, similar to that found in the previous section.

Definition 5.6.
Let $S$ be a sequence and $\beta > 0$. We say LZ-depth$(S) \ge \beta$ if
$$(\forall C \in \mathrm{ILFST})(\forall^\infty n \in \mathbb{N})\ |C(S \upharpoonright n)| - |LZ(S \upharpoonright n)| \ge \beta n.$$
We say LZ-depth$(S) < \beta$ if LZ-depth$(S) \ge \beta$ does not hold.

The following result shows the existence of a sequence with high LZ-depth that does not even have small pushdown-depth. It relies on a result from [16] which builds a sequence $S$ such that $R_{LZ}(S) = 0$ but $\rho_{\mathrm{ILPDC}}(S) = 1$.

Theorem 5.7.
For all order functions $f$ and for all $0 < \varepsilon < 1$, there exists a sequence $S$ such that PD$_f$-depth$(S) < \varepsilon$ but LZ-depth$(S) \ge (1 - \varepsilon)$.

Proof. Let $f$ be an order function and let $0 < \varepsilon < 1$. Let $S$ be the sequence from [16] such that $R_{LZ}(S) = 0$ but $\rho_{\mathrm{ILPDC}}(S) = 1$. For a full proof and construction, see the cited paper. Briefly, however, $S$ is a sequence built to contain repeated Kolmogorov random substrings which LZ can compress but which cannot be compressed by any ILPDC.

For all $C \in \mathrm{ILPDC}$, for almost every $n$ we have that $|C(S \upharpoonright n)| \ge (1 - \varepsilon)n$. Hence, as $I_{PD} \in \mathrm{ILPDC}_f$,
$$|I_{PD}(S \upharpoonright n)| - |C(S \upharpoonright n)| < n - (1 - \varepsilon)n = \varepsilon n.$$
So PD$_f$-depth$(S) < \varepsilon$.

Next, as $\rho_{\mathrm{ILFST}} \ge \rho_{\mathrm{ILPDC}}$ since every ILFST can be simulated by an ILPDC, we have that for all $T \in \mathrm{ILFST}$ and for almost every $n$, $|T(S \upharpoonright n)| \ge (1 - \frac{\varepsilon}{2})n$. As $R_{LZ}(S) = 0$, we have for almost every $n$ that $|LZ(S \upharpoonright n)| \le \frac{\varepsilon}{2}n$. Hence for almost every $n$,
$$|T(S \upharpoonright n)| - |LZ(S \upharpoonright n)| \ge (1 - \varepsilon)n.$$
So LZ-depth$(S) \ge (1 - \varepsilon)$.

Next we demonstrate the existence of a sequence that has a PD$_{\lfloor \log \log \rfloor}$-depth level of roughly $1/2$ while having a very small LZ-depth level. This sequence is from [16] and is built by enumerating strings in such a way that a pushdown compressor can use its stack to compress, but compressors in ILPDC$_{\lfloor \log \log \rfloor}$ cannot use their stacks as they are too small. LZ cannot compress the sequence either, as it resembles a listing of all strings in order of length (i.e. all strings of length 1 followed by all strings of length 2 and so on), and LZ performs poorly on such sequences.

Theorem 5.8.
For all $0 < \beta < 1/2$, there exists a sequence $S$ such that PD$_{\lfloor \log \log \rfloor}$-depth$(S) \ge (\frac{1}{2} - \beta)$ but LZ-depth$(S) < \beta$.

Proof. Let $0 < \beta < 1/2$. We first give a brief description of the sequence from [16] that satisfies the result. Let $\varepsilon$ be such that $0 < \varepsilon < \beta$, and let $k = k(\varepsilon)$ and $v = v(\varepsilon)$ be integers to be determined later. That is, $k$ and $v$ depend on $\varepsilon$, which in turn depends on $\beta$.

For any $n \in \mathbb{N}$, let $T_n$ denote the set of strings $x$ of length $n$ such that $1^j$ does not appear in $x$, for every $j \ge k$. As $T_n$ contains $\{0,1\}^{k-1} \times \{0\} \times \{0,1\}^{k-1} \times \{0\} \cdots$, we have that $|T_n| \ge 2^{\frac{(k-1)n}{k}}$. Note that for all $x \in T_n$, there is a $y \in T_{n-1}$ and a bit $b$ such that $x = yb$. So,
$$|T_n| \le 2|T_{n-1}|. \quad (22)$$
Let $P_n = \{p_1, \ldots, p_l\}$ be the set of palindromes in $T_n$. As fixing the first $\lceil n/2 \rceil$ bits of a palindrome determines it, we have that $|P_n| \le |\{0,1\}^{\lceil n/2 \rceil}|$. We split the remaining strings in $T_n - P_n$ into $v$ pairs of sets $X_{n,i} = \{x_{i,1}, \ldots, x_{i,t}\}$ and $Y_{n,i} = \{y_{i,1}, \ldots, y_{i,t}\}$ with $t = \frac{|T_n - P_n|}{2v}$, where $y_{i,j} = (x_{i,j})^{-1}$ for every $1 \le j \le t$ and $1 \le i \le v$, and where $x_{i,1}$ and $y_{i,t}$ start with a 0. For convenience we write $X_i$ for $X_{n,i}$.

$S$ is constructed in stages. Let $f(k) = 2k$ and $f(n + 1) = f(n) + 1 + v$, so that $n < f(n) = O(n)$. For $n \le k - 1$, $S_n$ is a concatenation of all strings of length $n$ in lexicographic order. For $n \ge k$,
$$S_n = p_1 \ldots p_l\, 1^{f(n)}\, x_{1,1} \ldots x_{1,t}\, 1^{f(n)+1}\, y_{1,t} \ldots y_{1,1}\, x_{2,1} \ldots x_{2,t}\, 1^{f(n)+2}\, y_{2,t} \ldots y_{2,1} \ldots x_{v,1} \ldots x_{v,t}\, 1^{f(n)+v}\, y_{v,t} \ldots y_{v,1},$$
i.e. a concatenation of all strings in $P_n$, followed by a flag of $f(n)$ ones, followed by a concatenation of all strings in the $X_i$ zones and $Y_i$ zones, separated by flags of increasing lengths.

Let
$$S = S_1 S_2 \ldots S_{k-1}\, 1^k 1^{k+1} \cdots 1^{2k-1}\, S_k S_{k+1} \ldots,$$
i.e. a concatenation of all the $S_j$'s with extra flags between $S_{k-1}$ and $S_k$.

Then from [16], for $\varepsilon$ small, choosing $k$ and $v$ appropriately large we have that
$$\rho_{LZ}(S) \ge 1 - \varepsilon \quad \text{and} \quad R_{\mathrm{ILPDC}}(S) \le \frac{1}{2}.$$
We now examine how well an arbitrary $C \in \mathrm{ILPDC}_{\lfloor \log \log \rfloor}$ performs on $S$. Specifically, we examine how well $C$ performs on the strings in $T_n$.

Let $n \ge k$ and suppose $C$ is reading $S_n$. During this stage, $C$'s stack has height bounded above by $\lfloor \log \log |S_1 \ldots S_{k-1} 1^k \cdots 1^{2k-1} S_k \ldots S_n| \rfloor$. Note that $|S_1 \ldots S_{k-1} 1^k \cdots 1^{2k-1} S_k \ldots S_n| < 2^{n+1}(n + 1)$ for $n$ large. Thus $C$'s stack height is bounded above by $\lfloor \log \log(2^{n+1}(n + 1)) \rfloor < \log 2n$ bits.

We examine the proportion of strings in $T_n$ that give a large contribution to the output. The argument is similar to that found in [2]. For simplicity, we write $C(p, x, s) = (q, v)$ to represent that when $C$ is in state $p$ with stack contents $s$, on input $x$, $C$ outputs $v$ and finishes in state $q$, i.e. $C(p, x, s) = (\delta_Q(p, x, s), \hat{\nu}(p, x, s))$.

For each $x \in T_n$, let
$$h_x = \min\{|v| : \exists p, q \in Q,\ \exists s \in \{0,1\}^{\le \log 2n},\ C(p, x, s) = (q, v)\}$$
be the minimum possible addition to the output that could result from reading $x$. Note that restricting $s$ to just the reachable stack contents at $p$ would only result in a larger potential output. Let
$$B_n = \left\{x \in T_n : h_x \ge \frac{(k - 2)n}{k}\right\}$$
be the incompressible strings, those that give a large contribution to the output.

Consider $x' \in T_n - B_n$. There is a computation on $x'$ that results in $C$ outputting fewer than $\frac{(k-2)n}{k}$ bits. As $C$ is lossless, $x'$ can be associated uniquely with a start state $q_{x',s}$, stack contents $s_{x'}$, end state $q_{x',e}$ and output $v_{x'}$ with $|v_{x'}| < \frac{(k-2)n}{k}$ such that $C(q_{x',s}, x', s_{x'}) = (q_{x',e}, v_{x'})$. That is, $g(x') = (q_{x',s}, s_{x'}, v_{x'}, q_{x',e})$. As this map $g$ is injective, we can bound $|T_n - B_n|$ as follows:
$$|T_n - B_n| \le |Q|^2 \cdot 2^{\log(2n)+1} \cdot 2^{\frac{(k-2)n}{k}+1} < |Q|^2 \cdot 8n \cdot 2^{\frac{(k-2)n}{k}}. \quad (23)$$
For $0 < \delta < 1$, whose value is determined later, as $|T_n| \ge 2^{\frac{(k-1)n}{k}}$, we have for $n$ large (when (23) holds) that
$$|B_n| = |T_n| - |T_n - B_n| > |T_n| - |Q|^2 \cdot 8n \cdot 2^{\frac{(k-2)n}{k}} \quad \text{(by (23))}$$
$$> |T_n|(1 - \delta).$$
$(24)$

Similarly, as the flags compose only $O(n)$ bits of each $S_n$ zone, we have for $n$ large that
$$n|T_n| > |S_n|(1 - \delta). \quad (25)$$
Then for $n$ large (say for all $n \ge i_0$ such that (24) and (25) hold),
$$|C(S_1 \ldots S_{i_0} \ldots S_n)| > \sum_{j=i_0}^{n} \sum_{x \in B_j} \frac{(k - 2)j}{k} = \frac{k - 2}{k} \sum_{j=i_0}^{n} j|B_j|$$
$$> \frac{k - 2}{k}(1 - \delta) \sum_{j=i_0}^{n} j|T_j| \quad \text{(by (24))}$$
$$> \frac{k - 2}{k}(1 - \delta)^2 \sum_{j=i_0}^{n} |S_j| \quad \text{(by (25))}$$
$$= \frac{k - 2}{k}(1 - \delta)^2(|S_1 \ldots S_n| - |S_1 \ldots S_{i_0 - 1}|) > \frac{k - 2}{k}(1 - 2\delta)|S_1 \ldots S_n| \quad (26)$$
for $n$ large.

The compression ratio of $S$ on $C \in \mathrm{ILPDC}_{\lfloor \log \log \rfloor}$ is smallest on prefixes of the form $S_1 \ldots S_n x_{n+1}$, where potentially $x_{n+1}$ is a concatenation of all the strings in $T_{n+1} - B_{n+1}$, i.e. the compressible strings of $T_{n+1}$.

Let $x_{n+1}$ be such a potential prefix of $S_{n+1}$. Then if $F_{n+1} = \sum_{i=0}^{v}(f(n + 1) + i)$ is the length of the flags in $S_{n+1}$, we can bound $|x_{n+1}|$ as follows:
$$|x_{n+1}| < |T_{n+1} - B_{n+1}|(n + 1) + F_{n+1}$$
$$< (|T_{n+1}| - |B_{n+1}|)(n + 1) + O(n)$$
$$< \delta|T_{n+1}|(n + 1) + \delta|T_n|(n + 1) \quad \text{(by (24))}$$
$$< 2\delta|T_n|(n + 1) + \delta|T_n|(n + 1) \quad \text{(by (22))}$$
$$= 3\delta|T_n|(n + 1) < 4\delta|S_1 \ldots S_n| \quad (27)$$
for $n$ large.

So for $n$ large,
$$|C(S_1 \ldots S_n x_{n+1})| > \frac{k - 2}{k}(1 - 2\delta)(|S_1 \ldots S_n x_{n+1}| - |x_{n+1}|) \quad \text{(by (26))}$$
$$> \frac{k - 2}{k}(1 - 2\delta)(|S_1 \ldots S_n x_{n+1}| - 4\delta|S_1 \ldots S_n|) \quad \text{(by (27))}$$
$$\ge \frac{k - 2}{k}(1 - 2\delta)(1 - 4\delta)|S_1 \ldots S_n x_{n+1}|$$
when $\delta$ is chosen sufficiently small. Hence $\rho_C(S) \ge \frac{k-2}{k}(1 - 2\delta)(1 - 4\delta)$, and so, taking $\delta$ small enough, for every $\varepsilon' > 0$ and almost every $n$,
$$|C(S \upharpoonright n)| \ge \left(\frac{k - 2}{k} - \varepsilon'\right)n.$$
Let $\hat{C} \in \mathrm{ILPDC}$ be such that $R_{\hat{C}}(S) \le \frac{1}{2}$. Then for all $\varepsilon' > 0$, for almost every $n$ it holds that $|\hat{C}(S \upharpoonright n)| \le (\frac{1}{2} + \varepsilon')n$. Hence, for almost every $n$ and every $C \in \mathrm{ILPDC}_{\lfloor \log \log \rfloor}$,
$$|C(S \upharpoonright n)| - |\hat{C}(S \upharpoonright n)| > \left(\frac{k - 2}{k} - \varepsilon'\right)n - \left(\frac{1}{2} + \varepsilon'\right)n > an \quad \text{where } a = \frac{1}{2} - 2\varepsilon' - \frac{2}{k}.$$
As $\varepsilon'$ can be chosen arbitrarily small, as long as $\varepsilon$ is chosen such that $0 < \frac{1}{k(\varepsilon)} < \beta$, we have that
$$\mathrm{PD}_{\lfloor \log\log \rfloor}\text{-depth}(S) > \frac{1}{2} - \beta.$$

Next we examine LZ-depth. Recall $\rho_{LZ}(S) \geq 1 - \varepsilon$. Thus for $c$ such that $\varepsilon + c < 1$,
$$|LZ(S \upharpoonright n)| \geq (1 - \varepsilon - c) n$$
for almost every $n$. Hence, as $I_{FS} \in \mathrm{ILFST}$, we have that for almost every $n$,
$$|I_{FS}(S \upharpoonright n)| - |LZ(S \upharpoonright n)| \leq n - (1 - \varepsilon - c) n = (\varepsilon + c) n.$$
As $c$ can be chosen arbitrarily small and $\varepsilon < \beta$, we have that LZ-depth$(S) < \beta$.

In conclusion, for all $0 < \beta < \frac{1}{2}$, choosing $\varepsilon$ such that $0 < \varepsilon < \beta$ and $\frac{1}{k(\varepsilon)} < \beta$, a sequence $S$ can be built which satisfies the requirements of the theorem.

References

[1] Luis Antunes, Lance Fortnow, Dieter van Melkebeek, and N. V. Vinodchandran. Computational depth: Concept and applications.
Theor. Comput. Sci., 354(3):391–404, 2006.

[2] Verónica Becher and Pablo Ariel Heiber. Normal numbers and finite automata. Theor. Comput. Sci., 477:109–116, 2013.

[3] C. H. Bennett. Logical depth and physical complexity. The Universal Turing Machine, A Half-Century Survey, pages 227–257, 1988.

[4] Cristian S. Calude, Kai Salomaa, and Tania Roblot. Finite state complexity. Theor. Comput. Sci., 412(41):5668–5677, 2011.

[5] Cristian S. Calude, Ludwig Staiger, and Frank Stephan. Finite state incompressible infinite sequences. Inf. Comput., 247:23–36, 2016.

[6] Jack Jie Dai, James I. Lathrop, Jack H. Lutz, and Elvira Mayordomo. Finite-state dimension. Theor. Comput. Sci., 310(1-3):1–33, 2004.

[7] David Doty and Philippe Moser. Feasible depth. In S. Barry Cooper, Benedikt Löwe, and Andrea Sorbi, editors, Computation and Logic in the Real World, Third Conference on Computability in Europe, CiE 2007, Siena, Italy, June 18-23, 2007, Proceedings, volume 4497 of Lecture Notes in Computer Science, pages 228–237. Springer, 2007.

[8] Rod Downey, Michael McInerney, and Keng Meng Ng. Lowness and logical depth. Theor. Comput. Sci., 702:23–33, 2017.

[9] Rodney G. Downey and Denis R. Hirschfeldt. Algorithmic Randomness and Complexity. Springer, 2010.

[10] M. Émile Borel. Les probabilités dénombrables et leurs applications arithmétiques. Rendiconti del Circolo Matematico di Palermo, 27(1):247–271, 1909.

[11] D. Huffman. Canonical forms for information-lossless finite-state logical machines. IRE Transactions on Information Theory, 5(5):41–59, 1959.

[12] Liam Jordon and Philippe Moser. On the difference between finite-state and pushdown depth. In SOFSEM 2020: Theory and Practice of Computer Science - 46th International Conference on Current Trends in Theory and Practice of Informatics, SOFSEM 2020, Limassol, Cyprus, January 20-24, 2020, Proceedings, volume 12011 of Lecture Notes in Computer Science, pages 187–198. Springer, 2020.

[13] Z. Kohavi. Switching and Finite Automata Theory (second edition). McGraw-Hill, 1978.

[14] James I. Lathrop and Jack H. Lutz. Recursive computational depth. Inf. Comput., 153(1):139–172, 1999.

[15] James I. Lathrop and Martin Strauss. A universal upper bound on the performance of the Lempel-Ziv algorithm on maliciously-constructed data. In Bruno Carpentieri, Alfredo De Santis, Ugo Vaccaro, and James A. Storer, editors, Compression and Complexity of SEQUENCES 1997, Positano, Amalfitan Coast, Salerno, Italy, June 11-13, 1997, Proceedings, pages 123–135. IEEE, 1997.

[16] Elvira Mayordomo, Philippe Moser, and Sylvain Perifel. Polylog space compression, pushdown compression, and Lempel-Ziv are incomparable. Theory Comput. Syst., 48(4):731–766, 2011.

[17] Philippe Moser. On the polynomial depth of various sets of random strings. Theor. Comput. Sci., 477:96–108, 2013.

[18] Philippe Moser. Polylog depth, highness and lowness for E. Inf. Comput., 271:104483, 2020.

[19] Philippe Moser and Frank Stephan. Depth, highness and DNR degrees. Discret. Math. Theor. Comput. Sci., 19(4), 2017.

[20] André Nies. Computability and Randomness. Oxford University Press, 2009.

[21] Claus-Peter Schnorr and H. Stimm. Endliche Automaten und Zufallsfolgen. Acta Informatica, 1:345–359, 1972.

[22] Jacob Ziv and Abraham Lempel. Compression of individual sequences via variable-rate coding.