NEW UPPER BOUNDS FOR TRACE RECONSTRUCTION
ZACHARY CHASE
Abstract.
We improve the upper bound on trace reconstruction to $\exp(\widetilde{O}(n^{1/5}))$.

1. Introduction
Given a string $x \in \{0,1\}^n$, a trace of $x$ is obtained by deleting each bit of $x$ with probability $q$, independently, and concatenating the remaining string. For example, a trace of 11001 could be 101, obtained by deleting the second and third bits. The goal of the trace reconstruction problem is to determine an unknown string $x$, with high probability, by looking at as few independently generated traces of $x$ as possible.

More precisely, fix $\delta, q \in (0,1)$ and take $n$ large. For each $x \in \{0,1\}^n$, let $\mu_x$ be the probability distribution on $\{0,1\}^{\le n}$ given by $\mu_x(w) = (1-q)^{|w|} q^{n-|w|} f(w;x)$, where $f(w;x)$ is the number of times $w$ appears as a subsequence in $x$, that is, the number of strictly increasing tuples $(i_0, \dots, i_{|w|-1})$ such that $x_{i_j} = w_j$ for $0 \le j \le |w|-1$. The problem is to determine the minimum value of $T = T(n)$ for which there exists a function $F : (\{0,1\}^{\le n})^T \to \{0,1\}^n$ satisfying $\mathbb{P}_{\mu_x^T}[F(\widetilde{U}^1, \dots, \widetilde{U}^T) = x] \ge 1-\delta$ for each $x \in \{0,1\}^n$ (where the $\widetilde{U}^j$ denote the $T$ independently generated traces).

Holenstein, Mitzenmacher, Panigrahy, and Wieder [15] established an upper bound, that $\exp(\widetilde{O}(n^{1/2}))$ traces suffice. Nazarov and Peres [20] and De, O'Donnell, and Servedio [12] simultaneously obtained the (previous) best upper bound known, that $\exp(O(n^{1/3}))$ traces suffice. Despite the trace reconstruction problem attracting a great amount of interest, the upper bound has not been improved to date.

In this paper, we improve the upper bound on trace reconstruction to $\exp(O(n^{1/5} \log^5 n))$.

Theorem 1.
For any deletion probability $q \in (0,1)$ and any $\delta > 0$, there exists $C > 0$ so that any unknown string $x \in \{0,1\}^n$ can be reconstructed with probability at least $1-\delta$ from $T = \exp(C n^{1/5} \log^5 n)$ i.i.d. traces of $x$.

Batu et al. [3] proved a lower bound of $n$, which was improved to $\widetilde{\Omega}(n^{5/4})$ by Holden and Lyons [13], which was then improved to $\widetilde{\Omega}(n^{3/2})$ by the author [7].

Date: September 7, 2020.
The author is partially supported by Ben Green's Simons Investigator Grant 376201 and gratefully acknowledges the support of the Simons Foundation.

A variant of the trace reconstruction problem is, instead of being required to reconstruct any string $x$ from traces of it, one must reconstruct a string $x$ chosen uniformly at random from traces of it. For a formal statement of the problem, see Section 1.2 of [13]. Peres and Zhai [21] obtained an upper bound of $\exp(O(\log^{1/2} n))$ for $q < 1/2$, which was then improved to $\exp(O(\log^{1/3} n))$ for all (constant) $q$ by Holden, Pemantle, Peres, and Zhai [14].

It is very possible that the methods of [14] allow us to easily use our methods to improve the upper bound on the random variant just described. If they do, we will update this article to state the improved upper bound.

Holden and Lyons [13] proved a lower bound for this random variant of $\widetilde{\Omega}(\log^{9/4} n)$, which was then improved by the author [7] to $\widetilde{\Omega}(\log^{5/2} n)$.

Several other variants of the trace reconstruction problem have been considered. The interested reader should refer to [1], [2], [11], [10], [4], [18], [16], [19].

In a previous version of this paper, we proved Theorem 1 only for $q \in (0, 1/2]$. Shyam Narayanan found a short argument extending our methods to get all $q \in (0,1)$ [...] $n$ in Theorem 1.

2. Notation
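As a small illustration of the trace distribution $\mu_x$ defined in the Introduction, the following Python sketch computes $f(w;x)$ by dynamic programming and then evaluates $\mu_x(w) = (1-q)^{|w|} q^{n-|w|} f(w;x)$. It is only a sketch for experimenting with short strings: the function names `count_subsequence`, `trace_probability`, and `all_strings_up_to` are illustrative choices and do not appear in the paper, and strings are represented as Python `str` objects over the characters 0 and 1.

```python
from itertools import product


def count_subsequence(w, x):
    """Return f(w; x): the number of strictly increasing index tuples
    (i_0, ..., i_{|w|-1}) with x[i_j] == w[j] for every j."""
    # ways[j] = number of ways to match the first j symbols of w
    # using the symbols of x read so far.
    ways = [1] + [0] * len(w)
    for bit in x:
        # Go right to left so each symbol of x is used at most once per tuple.
        for j in range(len(w), 0, -1):
            if w[j - 1] == bit:
                ways[j] += ways[j - 1]
    return ways[len(w)]


def trace_probability(w, x, q):
    """Return mu_x(w) = (1 - q)^{|w|} * q^{n - |w|} * f(w; x)."""
    n = len(x)
    return (1 - q) ** len(w) * q ** (n - len(w)) * count_subsequence(w, x)


def all_strings_up_to(n):
    """Yield every binary string of length at most n (including the empty one)."""
    for length in range(n + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)
```

For the example from the Introduction, `count_subsequence("101", "11001")` counts the four ways to pick a leading 1, a middle 0, and the final 1, so $\mu_x(101) = 4(1-q)^3 q^2$ for $x = 11001$. Summing `trace_probability(w, x, q)` over `all_strings_up_to(len(x))` should give 1 up to rounding, since every subset of retained positions contributes to exactly one pair of a string $w$ and an index tuple.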
We index starting at 0. For strings $w, x$, we sometimes write $1_{x_{k+i} = w_i}$ as shorthand for $\prod_{i=0}^{|w|-1} 1_{x_{k+i} = w_i}$. Let $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$. For functions $f$ and $g$, we say $f = \widetilde{O}(g)$ if $|f| \le C|g| \log^C |g|$ for some constant $C$. The symbol $\mathbb{E}_x$ denotes the expectation under the probability distribution over traces generated by the string $x$. For a trace $\widetilde{U}$, we define $\widetilde{U}_j = 2$ for $j > |\widetilde{U}|$; this is simply to make "$\widetilde{U}_j = 0$" and "$\widetilde{U}_j = 1$" false. We use $0^0 := 1$. For a positive integer $n$, denote $[n] := \{1, \dots, n\}$. For a function $f$ and a set $E$, denote $\|f\|_E := \max_{z \in E} |f(z)|$.

3. Sketch of Argument
The upper bound of $\exp(O(n^{1/3}))$ was obtained by analyzing the polynomial $\sum_k [x_k - y_k] z^k$, whose value can be well enough approximated from a sufficient number of traces. In this paper, we analyze the polynomial $\sum_k [1_{x_{k+i} = w_i} - 1_{y_{k+i} = w_i}] z^k$, for various (sub)strings $w$; its value can be well enough approximated from a sufficient number of traces, provided $q \le 1/2$. The benefit of this polynomial is that for certain choices of $w$, it is far sparser than the more general $\sum_k [x_k - y_k] z^k$. In the author's paper [8] improving the upper bound on the separating words problem, lower bounds were obtained for these sparser polynomials near 1 on the real axis that were superior to those for the more general $\sum_k [x_k - y_k] z^k$. We use the methods developed in that paper and methods used in [5] to obtain superior lower bounds for points on a small arc of the unit circle centered at 1.

4. Proof of Theorem 1
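The identity that opens this section generalizes the classical single bit statistics identity, which for a deletion probability $q$ and $p = 1 - q$ reads $\mathbb{E}_x\big[p^{-1} \sum_j 1_{\widetilde{U}_j = 1} \big(\tfrac{z-q}{p}\big)^j\big] = \sum_k x_k z^k$. The following Python sketch checks this single-bit case exactly on a short string by enumerating all $2^n$ deletion patterns. It is only a sanity check under the stated assumption (pattern length 1 with $w_0 = 1$); the function names `single_bit_lhs` and `single_bit_rhs` are illustrative and not taken from the paper.

```python
from itertools import product


def single_bit_lhs(x, q, z):
    """Exact value of E_x[ p^{-1} * sum_j 1[U_j == 1] * ((z - q) / p)^j ],
    computed by summing over every deletion pattern of the bit list x."""
    p = 1 - q
    weight = (z - q) / p
    total = 0
    for kept in product([False, True], repeat=len(x)):
        # Probability of this exact pattern of kept / deleted positions.
        prob = 1.0
        for keep in kept:
            prob *= p if keep else q
        # The trace is the subsequence of x at the kept positions.
        trace = [bit for bit, keep in zip(x, kept) if keep]
        statistic = sum(weight ** j for j, bit in enumerate(trace) if bit == 1)
        total += prob * statistic / p
    return total


def single_bit_rhs(x, z):
    """The polynomial sum_k x_k z^k evaluated at z."""
    return sum(z ** k for k, bit in enumerate(x) if bit == 1)
```

Calling both functions with, say, `x = [1, 1, 0, 0, 1]` and any complex `z` should return the same number up to floating-point error, for every $q \in (0,1)$. The identity itself does not require $q \le 1/2$; what degrades for larger $q$ is the size of the weights $\big(\tfrac{z-q}{p}\big)^j$, and hence the variance of the corresponding estimator, which is the issue this section goes on to address.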
Fix $q \in (0,1)$ and let $p = 1 - q$. Our starting point is the following identity, which is just the natural generalization of the 'single bit statistics' identity.

Proposition 4.1.
For any x ∈ { , } n , l ≥ , and z , . . . , z l − ∈ C , we have E x p − l X ≤ j ≤ n − ,..., ∆ l − ≥ e U j = w e U j +∆1+ ··· +∆ i = w i ∀ ≤ i ≤ l − ( z − qp ) j ( z − qp ) ∆ − ( z − qp ) ∆ − . . . ( z l − − qp ) ∆ l − − = X k < ··· By basic combinatorics, the left hand side is p − l X j, ∆ ,..., ∆ l − X k < ··· There is some C > so that for any n ≥ and any p ∈ P n , max | θ |≤ n − / | p ( e iθ ) | ≥ exp( − Cn / log n ) . roposition 4.2. For any distinct x, y ∈ { , } n , if x i = y i for all ≤ i < n / − ,then there are w ∈ { , } n / and z ∈ { e iθ : | θ | ≤ n − / } such that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X k [1 x k + i = w i − y k + i = w i ] z k (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≥ exp( − Cn / log n ) . Proof. Let i ≥ n / − x i = y i . Let w ′ = x i − n / +1 , . . . , x i − .As used in [8], Lemmas 1 and 2 of [22] imply that there is some choice w ∈ { w ′ , w ′ } such that the indices k for which x k + i = w i for all 0 ≤ i ≤ n / − n / -separated, and such that the indices k for which y k + i = w i for all 0 ≤ i ≤ n / − n / -separated. Therefore, if p ( z ) := P k [1 x k + i = w i − y k + i = w i ] z k , then ǫ p ( z ) z m ∈ P n for some ǫ ∈ {− , } and 0 ≤ m ≤ n . Thus, by Theorem 2, there is some θ ∈ [ − n − / , n − / ] such that exp( − Cn / log n ) ≤ | ǫ p ( e iθ ) e imθ | = | p ( e iθ ) | . Take z = e iθ . (cid:3) In a previous version of this paper, we used Proposition 4.1 with z , . . . , z l − = 0and z chosen according to Proposition 4.2 to prove Theorem 1, which only workedfor q ≤ / 2, for if q > / 2, then ( − q/p ) ∆ i − would be too large in magnitude (for∆ i ≈ n ), leading to too large a variance to well-enough approximate P k [1 x k + i = w i − y k + i = w i ] z k with few traces. The idea of Shyam Narayanan was to choose z , . . . , z l − close to 1 so that ( z i − qp ) ∆ i − would no longer be too large in magnitude, while alsokeeping the right hand side of Proposition 4.1 not too small. 
The following corollary,due to him, establishes the existence of such z , . . . , z l − . Corollary 4.1. For any distinct x, y ∈ { , } n , if x i = y i for all ≤ i < n / − ,then there are w ∈ { , } n / , z ∈ { e iθ : | θ | ≤ n − / } , z , . . . , z n / − ∈ [1 − p, such that, for l := 2 n / , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X k < ··· Let w and z be those guaranteed by Proposition 4.2. Let f ( z ) = (cid:18) n n / (cid:19) − X k < ··· Take distinct x, y ∈ { , } n . If x i = y i for some i < n / − x and y can be distinguished with high probabil-ity with exp( O ( n / )) ≤ exp( C ′′ n / log n ) traces. So suppose otherwise. Let w, z , z , . . . , z n / − be those guaranteed by Corollary 4.1. Since z , . . . , z n / − ∈ [1 − p, z i − qp , 1 ≤ i ≤ n / − 1, is between − n | z − | n n / ,which by [20], is upper bounded by n exp( nn / )2 n / . Therefore, since the expressionin brackets in Proposition 4.1 is a function of just the observed traces, by Corol-lary 4.1 and a standard H¨oeffding inequality argument (see [20] for details; note thepigeonhole is not necessary), we see exp( C ′′′ n / log n ) traces suffice to distinguishbetween x and y . As explained in [20], this suffices to establish Theorem 1. (cid:3) Proof of Theorem 2 We may of course assume n is large.Let a = n − / and r = a − / . Let r ∗ ∈ [ r ] be such that r ∗ X j =1 ( j + 3) − r X j = r ∗ +1 ( j + 3) ∈ [20 , r ∗ clearly exists. Let ( ǫ j = +1 if 1 ≤ j ≤ r ∗ ǫ j = − r ∗ + 1 ≤ j ≤ r . Let λ a ∈ (1 , 2) be such that r X j =1 λ a j log ( j + 3) = 1 . Let d j = λ a j log ( j + 3) . Define e h ( z ) = e λ a r X j =1 ǫ j d j z j , where e λ a ∈ (1 , 2) is such that e h (1) = 1. Define h ( z ) = (1 − a ) e h ( z ) . Throughout the paper, we omit floor functions when they don’t meaningfully affect anything. et α = e ia , β = e − ia , and I t = { z ∈ C : arg( α − zz − β ) = t } for t ≥ 0. 
Note that I is the line segment connecting α and β and I a = { e iθ : | θ | ≤ a } is the set on which we wish to lower bound p at some point. Let G a = { z ∈ C : arg( α − zz − β ) ∈ ( a , a ) } be the open region bounded by I a/ and I a .As in [8], we needed our choice of h to satisfy (i) | h ( e πit ) | ≤ − c | t | for | t | > a / (up to logs). In this paper, we need (ii) | h ( e πit ) | ≥ − Ca for | t | ≈ a ; in [8], weinstead had | h ( e πit ) | ≈ − a for | t | ≈ a . Some thought shows that a polynomial withpositive coefficients will not work. We therefore had roughly half of our coefficientsbe − h (1) is basically 1, the negative coefficients make it so that h might no longer mapinto the unit disk, which is highly problematic for later application. Luckily, though, e h , and thus h , does map into the unit disk. We prove that in the appendix. Lemma 1. For any t ∈ [ − π, π ] , e h ( e it ) ∈ D . Lemma 2. There are absolute constants c , c , C > such that the following holdfor a > small enough. First, h ( e πit ) ∈ G a for | t | ≤ c a . Second, | h ( e πit ) | ≤ − c | t | log ( a − ) for t ∈ [ − , ] \ [ − C a / , C a / ] .Proof. Take | t | ≤ a . Then, e h ( e πit ) = e λ a r ∗ X j =1 λ a j log ( j + 3) (1 + 2 πitj − π t j + O ( t j )) − e λ a r X j = r ∗ +1 λ a j log ( j + 3) (1 + 2 πitj − π t j + O ( t j )) . By our choice of r ∗ , h ( e πit ) = 1 − δ + ǫi for δ := c t + a + O ( t r log r ) and ǫ := c t + O ( t r log r ), where c , c are bounded positive constants that are bounded awayfrom 0. By multiplying the denominator by its conjugate, we havearg (cid:18) e ia − (1 − δ + ǫi )(1 − δ + ǫi ) − e − ia (cid:19) = arg (cid:16) (cid:2) e ia − (1 − δ + ǫi ) (cid:3) · (cid:2) (1 − δ − ǫi ) − e ia (cid:3) (cid:17) . The ratio of the imaginary part to the real part of the term inside arg( · ) is2(1 − δ − cos( a )) sin( a ) − cos ( a ) + 2(1 − δ ) cos( a ) − (1 − δ ) + sin ( a ) − ǫ . 
riting cos( a ) = 1 − a + O ( a ) and sin( a ) = a + O ( a ), and using δ = O ( a ), theabove simplifies to a − aδ + O ( a ) a − ǫ + O ( a ) . If | t | ≤ c a , then, as δ = c t + a + O ( t r log r ) , ǫ = c t + O ( t r log r ), the inverse tangentof the above is at least a ; the arctangent is at most a , since, by Lemma 1, h ( e πit )lies in the unit disk (alternatively, one may note 2 aδ > ǫ ).We now establish the second part of the lemma. What [8] shows is (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m X j =1 λ a e πitj j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ − λ a | t | ( m + 3) + λ a m log ( m + 3)for any m ≥ t ∈ [ − , ] \ [ − m − , m − ]. For m = r ∗ , if | t | > C a / , for say C = 100, then certainly 3 | t | − < m , and so we have(1) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r ∗ X j =1 λ a e πitj j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ − c | t | log ( a − ) . We can crudely bound(2) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j = r ∗ +1 λ a e πitj j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ( a − ) 1 r ∗ . Combining (1) and (2), we obtain (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j =1 λ a ǫ j e πitj j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ − c ′ | t | log ( a − )for | t | ≥ C r − , with c ′ > C large enough. Now, since e λ − a = r ∗ X j =1 λ a j log ( j + 3) − r X j = r ∗ +1 λ a j log ( j + 3)= 1 − r X j = r ∗ +1 λ a j log ( j + 3) ≥ − ( a − ) 2 r ∗ ≥ − r log ( a − ) , we see (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)e λ a r X j =1 λ a ǫ j e πitj j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ − c | t | log ( a − )for | t | ≥ C r − , provided C is large enough. Since 1 − a ≤ 1, we are done. (cid:3) et m = c − n / , J = c − n − / m log n , and J = m − J . A minor adapation ofthe relevant proof in [8] proves the following. Lemma 3. Suppose e p ( z ) = 1 − z d for some d ≤ n / . Then Q J − j = J | e p ( h ( e πi j + δm )) | ≤ exp( Cn / log n ) for any δ ∈ [0 , . 
By adapating the proof of the above lemma, we prove the following. Lemma 4. Suppose u ( z ) = z − ζ for some ζ ∈ ∂ D . Then, for any δ ∈ [0 , , wehave Q J − j = J | u ( h ( e πi j + δm )) | ≤ exp( Cn / log n ) .Proof. First note that(3) | u ( h ( e πiθ )) | ≥ − | h ( e πiθ ) | ≥ a . Define g ( t ) = 2 log | u ( h ( e πi ( t + δm ) )) | . For notational ease, we assume δ = 0; theargument about to come works for all δ ∈ [0 , g is C , by themean value theorem we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m J − X j = J g (cid:18) jm (cid:19) − Z J /mJ /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) J − X j = J Z ( j +1) /mj/m (cid:18) g ( t ) − g (cid:18) jm (cid:19)(cid:19) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ J − X j = J Z ( j +1) /mj/m max jm ≤ y ≤ j +1 m | g ′ ( y ) | ! m dt ≤ m J − X j = J max jm ≤ y ≤ j +1 m | g ′ ( y ) | . (4)Since w log | u ( h ( w )) | is harmonic and log | u ( h (0)) | = log | u (0) | = 0, we have Z g ( t ) dt = 2 Z log | u ( h ( e πit )) | dt = 0 , and therefore(5) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z J /mJ /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z J /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)Z J /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12) . Since a ≤ (cid:12)(cid:12) u ( h ( e πit )) (cid:12)(cid:12) ≤ t , we have(6) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z J /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)Z J /m g ( t ) dt (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:18) J m + (1 − J m ) (cid:19) log n ≤ C log nn / . By (4), (5), and (6), we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m J − X j = J g ( jm ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ C log nn / + 1 m J − X j = J max jm ≤ t ≤ j +1 m | g ′ ( t ) | . 
ultiplying through by m , changing C slightly, and exponentiating, we obtain(7) J − Y j = J (cid:12)(cid:12)(cid:12) u ( h ( e πi jm )) (cid:12)(cid:12)(cid:12) ≤ exp Cn / log n + 1 m J − X j = J max jm ≤ t ≤ j +1 m | g ′ ( t ) | ! . Note g ′ ( t ) = ∂∂t h | u ( h ( e πit )) | i(cid:12)(cid:12)(cid:12) t = t | u ( h ( e πit )) | . We first show(8) ∂∂t h | u ( h ( e πit )) | i(cid:12)(cid:12)(cid:12) t = t ≤ t ∈ [0 , e d j = d j for j ≤ r ∗ and e d j = − d j for j > r ∗ so that h ( e πit ) = (1 − a ) P rj =1 e d j e πitj . Then, (cid:12)(cid:12) u (cid:0) h ( e πit ) (cid:1)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (1 − a ) r X j =1 e d j e πijt − ζ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (9) = (1 − a ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j =1 e d j e πijt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − " (1 − a ) ζ r X j =1 e d j e πijt + 1 . The derivative of the first term is(1 − a ) r X j ,j =1 e d j e d j π ( j − j ) e πi ( j − j ) t . Since r X j =1 | e d j | ≤ r X j =1 j | e d j | ≤ , we get an upper bound of 250 for the absolute value of the derivative of the firstterm of (9). The derivative of the second term, if ζ = e iθ , is2(1 − a ) r X j =1 e d j sin(2 πjt + θ )2 πj, which is also clearly upper bounded by (crudely) 250. We’ve thus shown (8).Recall | u ( h ( e πiθ )) | ≥ − | h ( e πiθ ) | . For j ∈ [ J , J ] ⊆ [ C a / m, (1 − C a / ) m ],we use (by Lemma 2) | h ( e πi jm ) | ≤ − c min( jm , − jm )log n o obtain 1 m J − X j = J max jm ≤ t ≤ j +1 m | g ′ ( t ) | ≤ m J − X j = J (cid:16) c jm , − jm )log n ) (cid:17) . Up to a factor of 2, we may deal only with j ∈ [ J , m ]. Then we obtain1 m J − X j = J max jm ≤ t ≤ j +1 m | g ′ ( t ) | ≤ m m/ X j = J m log nc j ≤ m log nc J ≤ Cn / . (cid:3) Let Q n denote all polynomials of the form ( z − α )( z − β ) p ( z ) for p ∈ P n . Corollary 5.1. For any q ∈ Q n and δ ∈ [0 , , Q j ,m − } | q ( h ( e πi j + δm z )) | ≤ exp( Cn / log n ) .Proof. 
Take q ∈ Q n ; say q ( z ) = ( z − α )( z − β ) p ( z ) for p ∈ P n . For j ∈ { , . . . , J − } and for j ∈ { J , . . . , m − } , by Lemma 1 we can bound | q ( h ( e πi jm z )) | ≤ n , toobtain(10) Y j J ,...,J − } | q ( h ( e πi j + δm )) | ≤ (4 n ) J − m − J − ≤ e Cn / log n . By applying Lemma 4 to u ( z ) := z − α and to u ( z ) := z − β and multiplying theresults, we see(11) J − Y j = J | u ( h ( e πi j + δm )) | ≤ e Cn / log n , where u ( z ) := ( z − α )( z − β ). Let e p ( z ) ∈ { , − z d } be the truncation of p to termsof degree less than n / . Then, since Lemma 2 gives | h ( e πi j + δm ) | ≤ − c min (cid:0) jm + δ, − ( jm + δ ) (cid:1) log n ≤ − c ′ n − / log n for j ∈ { J , . . . , J − } , we see(12) (cid:12)(cid:12)(cid:12) p (cid:16) h ( e πi j + δm ) (cid:17) − e p (cid:16) h ( e πi j + δm ) (cid:17)(cid:12)(cid:12)(cid:12) ≤ ne − c ′ log n ≤ e − c log n . Lemma 3 implies(13) J − Y j = J | e p ( h ( e πi j + δm )) | ≤ e Cn / log n . y an easy argument given in [8], (12) and (13) combine to give(14) J − Y j = J | p ( h ( e πi j + δm )) | ≤ e C ′ n / log n . Combining (10) , (11), and (14), the proof is complete. (cid:3) Proposition 5.1. For any q ∈ Q n , it holds that max w ∈ G a | q ( w ) | ≥ exp( − Cn / log n ) .Proof. Let g ( z ) = Q m − j =0 q ( h ( e πi jm z )). For z = e πiθ , with, without loss of generality, θ ∈ [0 , m ), we have by Lemma 2 and Corollary 5.1 | g ( z ) | ≤ (cid:18) max w ∈ G a | q ( w ) | (cid:19) Y j ,m − } | q ( h ( e πi ( jm + θ ) )) | ≤ (cid:18) max w ∈ G a | q ( w ) | (cid:19) exp( Cn / log n ) . Thus, (max w ∈ G a | q ( w ) | ) exp( Cn / log n ) ≥ max z ∈ ∂ D | g ( z ) | ≥ | g (0) | = 1, where thelast inequality used the maximum modulus principle (clearly g is analytic). (cid:3) The following lemma was proven in [5]. Lemma 5. Suppose g is an analytic function in the open region bounded by I and I a , and suppose g is continuous on the closed region between I and I a . 
Then, max z ∈ I a/ | g ( z ) | ≤ (cid:18) max z ∈ I | g ( z ) | (cid:19) / (cid:18) max z ∈ I a | g ( z ) | (cid:19) / . Proof of Theorem 2. Take f ∈ P n , and let g ( z ) = ( z − α )( z − β ) f ( z ). A straight-forward geometric argument yields | g ( z ) | ≤ | ( z − α )( z − β ) | − | z | ≤ a ) ≤ n / for z ∈ I . Letting L = || g || I a , Lemma 5 then givesmax z ∈ I a/ | g ( z ) | ≤ (3 Ln / ) / . Since we then have max z ∈ I a/ ∪ I a | g ( z ) | ≤ max( L, (3 Ln / ) / ) , the maximum modulus principle impliesmax z ∈ G a | g ( z ) | ≤ max( L, (3 Ln / ) / ) . By Proposition 5.1, we concludeexp( − Cn / log n ) ≤ max (cid:0) L, (3 Ln / ) / (cid:1) . hus, || f || I a ≥ || g || I a = L ≥ exp( − C ′ n / log n ) , as desired. (cid:3) Appendix: Proof of Lemma 1 Lemma 1 was surprisingly nontrivial to prove. For very small t , such as t = O ( r − ), we can just use a Taylor expansion. However, for t on the order of r − ,it could be the case that the ǫ j ’s make the complex numbers ǫ j z j all point in thesame direction; the saving grace is that if t is on the order of r − , then there iscancellation within each of the sums P r ∗ j =1 λ a z j j log ( j +3) and P rj = r ∗ +1 λ a z j j log ( j +3) .The above is only a very rough sketch. We need to do some work to make the“small t ” argument work all the way up to 6 r − , and then we actually need toexploit cancellation between the two sums mentioned above in addition to theirself-cancellation.For ease, let B = r X j =1 λ a ǫ j j log ( j + 3)and B = r X j =1 λ a ǫ j log ( j + 3) . By our choice of r ∗ , B ≥ e λ a B . Also note r ∗ = r + o ( r ). Lemma 6. If | t | ≤ r − , then e h ( e it ) ∈ D .Proof. We start withIm he h ( e it ) i = e λ a r X j =1 λ a ǫ j j log ( j + 3) sin( jt )= e λ a r X j =1 λ a ǫ j j log ( j + 3) (cid:0) jt + O ( j t ) (cid:1) = e λ a B t + O ( t r log r ) ndRe he h ( e it ) i = e λ a r X j =1 λ a ǫ j j log ( j + 3) cos( jt )= e λ a r X j =1 λ a ǫ j j log ( j + 3) " − ( jt ) 2! 
+ ∞ X k =2 (cid:18) ( jt ) k (2 k )! − ( jt ) k +2 (2 k + 2)! (cid:19) = 1 − e λ a B t − e λ a λ a ∞ X k =2 t k (2 k )! r X j =1 − ǫ j j k − log ( j + 3) − t k +2 (2 k + 2)! r X j =1 − ǫ j j k log ( j + 3) ! . The summand in the above corresponding to a particular k is non-negative if andonly if (2 k + 2)(2 k + 1) r X j =1 − ǫ j j k − log ( j + 3) ≥ t r X j =1 − ǫ j j k log ( j + 3) . As inf k ≥ (2 k + 2)(2 k + 1) P rj =1 − ǫ j j k − log ( j +3) r − P rj =1 − ǫ j j k log ( j +3) ≥ r large enough, we see that if t r ≤ 36, then all of the summands are non-negative. Thus, for t ≤ r − , (cid:12)(cid:12)(cid:12)e h ( e it ) (cid:12)(cid:12)(cid:12) = Re he h ( e it ) i + Im he h ( e it ) i ≤ (cid:18) − e λ a B t (cid:19) + (cid:18)e λ a B t + O ( t log r ) (cid:19) = 1 − e λ a ( B − e λ a B ) t + O ( t log r ) . And this is indeed at most 1 for | t | ≤ r − , provided r is large enough. (cid:3) In the course of the next two proofs, we will have a dominant main term (ensuring e h lies in D ) and will ignore lower order terms (which is a posteriorily justified at theend of the proof, as we will mention). Lemma 7. If t ∈ [ − π, π ] \ [ − r − , r − ] , then e h ( e it ) ∈ D .Proof. We may of course only deal with t > 0. We will repeatedly use (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m X j =1 e ijt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ t − for small t , up to lower order terms. Summation by parts gives r ∗ X j =1 λ a e ijt j log ( j + 3) = λ a P r ∗ j =1 e ijt r ∗ log ( r ∗ + 3) + 2 λ a Z r ∗ ( P j ≤ x e ijt )(log( x + 3) + xx +3 ) x log ( x + 3) dx. uickly note t = 0 gives(15) r ∗ X j =1 λ a j log ( j + 3) = λ a r ∗ r ∗ log ( r ∗ + 3) + 2 λ a Z r ∗ ⌊ x ⌋ (log( x + 3) + xx +3 ) x log ( x + 3) dx. We bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) λ a P r ∗ j =1 e ijt r ∗ log ( r ∗ + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ λ a t − r ∗ log ( r ∗ + 3) . 
We bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) λ a Z r ∗ ( P j ≤ x e ijt )(log( x + 3) + xx +3 ) x log ( x + 3) dx (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ λ a Z t − ⌊ x ⌋ (log( x + 3) + xx +3 ) x log ( x + 3) dx + 2 λ a Z r ∗ t − t − (log( x + 3) + xx +3 ) x log ( x + 3) dx = r ∗ X j =1 λ a j log ( j + 3) − λ a r ∗ r ∗ log ( r ∗ + 3) − λ a Z r ∗ t − ( ⌊ x ⌋ − t − ) · (log( x + 3) + xx +3 ) x log ( x + 3) dx, where the equality is merely (15). Now, Z r ∗ t − ( ⌊ x ⌋ − t − ) · (log( x + 3) + xx +3 ) x log ( x + 3) dx ≥ Z r ∗ t − x − t − x log ( x + 3) dx ≥ ( r + 3) Z r ∗ t − x − t − x dx is equal to 1log ( r + 3) (cid:18) t − − r ∗ − t − 12 ( 1(2 t − ) − r ∗ ) (cid:19) . Therefore, we obtain the bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r ∗ X j =1 λ a e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ λ a t − r ∗ log ( r ∗ + 3) + r ∗ X j =1 λ a j log ( j + 3) − λ a r ∗ r ∗ log ( r ∗ + 3) − λ a ( r + 3) (cid:18) t − − r ∗ − t − 12 ( 1(2 t − ) − r ∗ ) (cid:19) . The same arguments give (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j = r ∗ +1 λ a e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ λ a t − r log ( r + 3) + r X j = r ∗ +1 λ a j log ( j + 3) − λ a ( r − r ∗ ) r log ( r + 3) − λ a ( r + 3) (cid:18) r ∗ + 2 t − − r − ( r ∗ + 2 t − ) 12 ( 1( r ∗ + 2 t − ) − r ) (cid:19) . ince r ∗ = r up to lower order terms, if we write d := tr and add the two abovebounds, much simplification occurs and we end up with (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j =1 λ a ǫ j e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ − λ a r log ( r + 3) (cid:20) − d + d (cid:21) . Now, e λ − a = 1 − r X j = r ∗ +1 λ a j log ( j + 3)= 1 − λ a r log ( r + 3) , with the last equality holding up to lower order terms. 
The point is that, as long as − d + d > c for some absolute c > 0, the lower order terms will indeed be negligible, and we’llhave (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)e λ a r X j =1 λ a ǫ j e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < . As − d + + d is (clearly) increasing in d , we need only consider d = 12, whichyields 4 + . (cid:3) Lemma 8. For | t | ∈ [6 r − , r − ] , e h ( e it ) ∈ D .Proof. We may of course deal only with t > 0. Summation by parts gives(16) r X j =1 λ a ǫ j e ijt j log ( j + 3) = λ a P rj =1 ǫ j e ijt r log ( r + 3) + 2 λ a Z r ( P j ≤ x ǫ j e ijt ) · (log( x + 3) + xx +3 ) x log ( x + 3) dx. Quickly note t = 0 gives(17) e λ − a = λ a P rj =1 ǫ j r log ( r + 3) + 2 λ a Z r ( P j ≤ x ǫ j ) · (log( x + 3) + xx +3 ) x log ( x + 3) dx. Let η = 10 − . Using (16), we bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j =1 λ a ǫ j e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ λ a | P rj =1 ǫ j e ijt | r log ( r + 3) + 2 λ a Z ηr ⌊ x ⌋ (log( x + 3) + xx +3 ) x log ( x + 3) dx +2 λ a Z rηr | P j ≤ x ǫ j e ijt | · (log( x + 3) + xx +3 ) x log ( x + 3) dx. sing (17), the right hand side of the above is equal to λ a | P rj =1 ǫ j e ijt | r log ( r + 3) + e λ − a − λ a P rj =1 ǫ j r log ( r + 3) − λ a Z rηr ( P j ≤ x ǫ j ) · (log( x + 3) + xx +3 ) x log ( x + 3) dx +2 λ a Z rηr | P j ≤ x ǫ j e ijt | · (log( x + 3) + xx +3 ) x log ( x + 3) dx. Note the above is λ a | P rj =1 ǫ j e ijt | r log ( r + 3) + e λ − a − λ a P rj =1 ǫ j r log ( r + 3) − λ a log r Z rηr P j ≤ x ǫ j x dx + 2 λ a log r Z rηr | P j ≤ x ǫ j e ijt | x dx up to lower order terms. Note, for any x ≥ r ∗ + 1,(18) X j ≤ x ǫ j z j = z − z r ∗ − z − z r ∗ +1 − z ⌊ x ⌋− r ∗ − z = z − z r ∗ + z ⌊ x ⌋ − z . Note | − e it | ≤ t − up to smaller order terms. In particular, (18) implies λ a | P rj =1 ǫ j e ijt | r log ( r + 3) ≤ λ a t − r log ( r + 3) . 
For x ≤ r ∗ , note (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X j ≤ x ǫ j e ijt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X j ≤ x e ijt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ t − | − e it ⌊ x ⌋ | = 2 (cid:12)(cid:12)(cid:12)(cid:12) sin (cid:18) ⌊ x ⌋ t (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) t − . Therefore,2 λ a log r Z r ∗ ηr | P j ≤ x ǫ j e ijt | x dx − λ a log r Z r ∗ ηr P j ≤ x ǫ j x dx ≤ λ a log r Z r ∗ ηr | sin( ⌊ x ⌋ t ) | t − − ⌊ x ⌋ x dx = 2 λ a log r Z r/ ηr | sin( xt ) | t − − xx dx = 2 λ a t log r Z rt/ ηrt | sin( y/ | − yy dy, where the first equality is up to lower order terms. Now, using (18),2 λ a log r Z rr ∗ | P j ≤ x ǫ j e ijt | x dx ≤ λ a log r Z rr ∗ t − | − e itr ∗ + e it ⌊ x ⌋ | x dx = 2 λ a t − log r Z rr ∗ p − tr ∗ ) + 2 cos( t ⌊ x ⌋ ) − tr ∗ − t ⌊ x ⌋ ) x dx = 2 λ a t − log r Z rr/ p − tr/ 2) + 2 cos( tx ) − tr/ − tx ) x dx = 2 λ a t log r Z trtr/ p − tr/ 2) + 2 cos( y ) − tr/ − y ) y dt, here the second to last equality is up to lower order terms. Finally,2 λ a log r Z rr ∗ P j ≤ x ǫ j x = 2 λ a log r Z rr ∗ r ∗ − ⌊ x ⌋ x dx = 2 λ a log r Z rr/ r − xx dx = λ a r log r , where the second equality is up to lower order terms. Combining everything, ourupper bound on | P rj =1 λ a ǫ j e ijt j log ( j +3) | is4 λ a t − r log r + e λ − a − λ a t log r Z rt/ ηrt y − | sin( y/ | y dy − λ a r log r + 2 λ a t log r Z trtr/ p − tr/ 2) + 2 cos( y ) − tr/ − y ) y dy up to lower order terms. Letting d = rt , the above is e λ − a − λ a r log r " d Z d/ ηd y − | sin( y/ | y dy + 1 − d − − d Z dd/ p − d/ 2) + 2 cos( y ) − d/ − y ) y dy . As can be verified with a computer, the term in brackets is bounded below bysome absolute positive constant for d ∈ [6 , . Therefore, the lower order termsare indeed negligible, and we’ve shown (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r X j =1 λ a ǫ j e ijt j log ( j + 3) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < e λ − a for | t | ∈ [6 r − , r − ]. 
This gives our desired result. □

The term in brackets is perhaps bounded below by some absolute positive constant for all $d \ge 6$, but we didn't prove this, so we just stuck with our three ranges of $|t|$.

Acknowledgments

I would like to thank Omer Tamuz for introducing me to the wonderful trace reconstruction problem. I also thank Shyam Narayanan for providing an extension to all $q \in (0,1)$.

References

[1] F. Ban, X. Chen, A. Freilich, R. Servedio, and S. Sinha. Beyond trace reconstruction: population recovery from the deletion channel. ArXiv e-prints, April 2019, 1904.05532.
[2] Frank Ban, Xi Chen, Rocco A. Servedio, and Sandip Sinha. Efficient average-case population recovery in the presence of insertions and deletions. In APPROX/RANDOM 2019, volume 145 of LIPIcs, pages 44:1–44:18. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2019.
[3] T. Batu, S. Kannan, S. Khanna, and A. McGregor. Reconstructing strings from random traces. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 910–918. ACM, New York, 2004.
[4] Joshua Brakensiek, Ray Li, and Bruce Spang. Coded trace reconstruction in a constant number of traces. CoRR, abs/1908.03996, 2019.
[5] P. Borwein and T. Erdélyi. Littlewood-type problems on subarcs of the unit circle. Indiana Univ. Math. J., 46(4):1323–1346, 1997.
[6] P. Borwein, T. Erdélyi, and G. Kós. Littlewood-type problems on [0,1]. Proc. London Math. Soc. (3), 79(1):22–46, 1999.
[7] Z. Chase. New Lower Bounds for Trace Reconstruction. To appear in Annales Institute Henri Poincare: Probability and Statistics, May 2019, 1905.03031.
[8] Z. Chase. A New Upper Bound for Separating Words. ArXiv e-prints, July 2020, 2007.12097.
[9] X. Chen, A. De, C. Lee, R. Servedio, S. Sinha. Polynomial-time trace reconstruction in the smoothed complexity model. ArXiv e-prints, August 2020, 2008.12386.
[10] M. Cheraghchi, R. Gabrys, O. Milenkovic, J. Ribeiro. Coded Trace Reconstruction. In IEEE Transactions on Information Theory, doi: 10.1109/TIT.2020.2996377.
[11] S. Davies, M. Racz, and C. Rashtchian. Reconstructing trees from traces. ArXiv e-prints, February 2019, 1902.05101.
[12] A. De, R. O'Donnell, and R. A. Servedio. Optimal mean-based algorithms for trace reconstruction. In STOC'17, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 1047–1056. ACM, New York, 2017.
[13] N. Holden and R. Lyons. Lower bounds for trace reconstruction. To appear in Annals of Applied Probability, 2019.
[14] N. Holden, R. Pemantle, Y. Peres, A. Zhai. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. In Proceedings of the 31st Conference On Learning Theory, PMLR 75:1799-1840, 2018.
[15] T. Holenstein, M. Mitzenmacher, R. Panigrahy, and U. Wieder. Trace reconstruction with constant deletion probability and related results. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 389–398. ACM, New York, 2008.
[16] A. Krishnamurthy, A. Mazumdar, A. McGregor, S. Pal. Trace reconstruction: generalized and parameterized. ArXiv e-prints, April 2019, 1904.09618.
[17] A. McGregor, E. Price, and S. Vorotnikova. Trace reconstruction revisited. In Proceedings of the 22nd Annual European Symposium on Algorithms, pages 689–700, 2014.
[18] S. Narayanan. Population recovery from the deletion channel: Nearly matching trace reconstruction bounds. CoRR, abs/2004.06828, 2020.
[19] S. Narayanan, M. Ren. Circular Trace Reconstruction. ArXiv e-prints, September 2020, 2009.01346.
[20] F. Nazarov and Y. Peres. Trace reconstruction with $\exp(O(n^{1/3}))$ samples. In STOC'17, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 1042–1046. ACM, New York, 2017.
[21] Y. Peres and A. Zhai. Average-case reconstruction for the deletion channel: subpolynomially many traces suffice. In 58th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2017), pages 228–239. IEEE Computer Soc., Los Alamitos, CA, 2017. MR3734232
[22] J. M. Robson. Separating strings with small automata. Information Processing Letters, 30(4):209–214, 1989.

Mathematical Institute, Andrew Wiles Building, Radcliffe Observatory Quarter, Woodstock Road, Oxford OX2 6GG, UK
E-mail address: [email protected]@maths.ox.ac.uk