Polynomial-time trace reconstruction in the smoothed complexity model
Xi Chen∗ (Columbia University), Anindya De† (University of Pennsylvania), Chin Ho Lee‡ (Columbia University), Rocco A. Servedio§ (Columbia University), Sandip Sinha¶ (Columbia University)

arXiv preprint [cs.DS], August 31, 2020
Abstract
In the trace reconstruction problem, an unknown source string x ∈ {0,1}^n is sent through a probabilistic deletion channel which independently deletes each bit with probability δ and concatenates the surviving bits, yielding a trace of x. The problem is to reconstruct x given independent traces. This problem has received much attention in recent years both in the worst-case setting, where x may be an arbitrary string in {0,1}^n [DOS17, NP17, HHP18, HL18, Cha19], and in the average-case setting, where x is drawn uniformly at random from {0,1}^n [PZ17, HPP18, HL18, Cha19].

This paper studies trace reconstruction in the smoothed analysis setting, in which a "worst-case" string x_worst is chosen arbitrarily from {0,1}^n, and then a perturbed version x of x_worst is formed by independently replacing each coordinate by a uniform random bit with probability σ. The problem is to reconstruct x given independent traces from it.

Our main result is an algorithm which, for any constant perturbation rate 0 < σ < 1 and any constant deletion rate 0 < δ < 1, uses poly(n) running time and traces and succeeds with high probability in reconstructing the string x. This stands in contrast with the worst-case version of the problem, for which exp(O(n^{1/3})) is the best known time and sample complexity [DOS17, NP17].

Our approach is based on reconstructing x from the multiset of its short subwords and is quite different from previous algorithms for either the worst-case or average-case versions of the problem. The heart of our work is a new poly(n)-time procedure for reconstructing the multiset of all O(log n)-length subwords of any source string x ∈ {0,1}^n given access to traces of x.

∗ Supported by NSF grants CCF-1703925 and IIS-1838154.
† Supported by NSF grants CCF-1926872 and CCF-1910534.
‡ Supported by a grant from the Croucher Foundation and by the Simons Collaboration on Algorithms and Geometry.
§ Supported by NSF grants CCF-1814873, IIS-1838154, CCF-1563155, and by the Simons Collaboration on Algorithms and Geometry.
¶ Supported by NSF grants CCF-1714818, CCF-1822809, IIS-1838154, CCF-1617955, CCF-1740833, and by the Simons Collaboration on Algorithms and Geometry.
Introduction
Trace reconstruction is a simple-to-state algorithmic problem which has been intensively studied yet remains mysterious in many respects. The problem captures some of the core algorithmic challenges that arise in dealing with the deletion channel; this is a noise process which, when given an input string, independently deletes each coordinate with some fixed probability δ and outputs the concatenation of the surviving coordinates. In the trace reconstruction problem an algorithm is given access to independent traces of a fixed unknown string x ∈ {0,1}^n, where a "trace" of x, denoted z ∼ Del_δ(x), is the string z that results from passing x through a deletion channel. The task is to use these traces to reconstruct the unknown string x.

Variants of the trace reconstruction problem have a long history, going back at least to [Kal73]. The problem was studied on and off throughout the 2000s [Lev01b, Lev01a, BKKM04, KM05, VS08, HMPW08, MPV14], and has seen a renewed surge of interest over the past few years [DOS17, NP17, PZ17, HPP18, HHP18, Cha19, BCF+19, BCSS19, KMMP19, Nar20, HPPZ20], with the development of new algorithms and lower bounds for both the worst-case and average-case versions of the problem as well as various generalizations. Below we describe these two versions of the problem and recall the current state of the art for each of them.
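As a concrete aside (ours, not from the paper): the deletion channel just described is straightforward to simulate, and a trace at a higher deletion rate δ′ ≥ δ can be obtained from a Del_δ trace by further independent deletions, a composition property of the channel that is useful to keep in mind for what follows. A minimal Python sketch (function names are our own):

```python
import random

def draw_trace(x: str, delta: float, rng: random.Random) -> str:
    """Pass bit-string x through the deletion channel Del_delta: each bit
    is deleted independently with probability delta and the survivors
    are concatenated."""
    return "".join(bit for bit in x if rng.random() >= delta)

def thin_trace(trace: str, delta: float, delta_prime: float,
               rng: random.Random) -> str:
    """Turn a Del_delta trace into a Del_delta_prime one (delta_prime >= delta):
    delete each surviving bit with probability p = (d' - d)/(1 - d), so that
    each bit of x survives with probability (1 - d)(1 - p) = 1 - d'."""
    p = (delta_prime - delta) / (1.0 - delta)
    return "".join(bit for bit in trace if rng.random() >= p)

rng = random.Random(0)
assert draw_trace("1101011", 0.0, rng) == "1101011"       # delta = 0: trace is x
assert thin_trace("1101011", 0.2, 0.2, rng) == "1101011"  # p = 0: unchanged
assert thin_trace("1101011", 0.2, 1.0, rng) == ""         # p = 1: all deleted
```

The thinning step is a standard way to simulate any deletion rate δ′ ∈ [δ, 1] given traces at a single rate δ.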
The original version of the trace reconstruction problem is the worst-case version, in which the unknown string x is an arbitrary (i.e. adversarially chosen) string from {0,1}^n. This version of the problem has proved to be quite challenging; the first non-trivial result is due to Batu et al. [BKKM04], who gave a poly(n)-time algorithm that uses poly(n) traces and succeeds when the deletion rate δ is very small, at most n^{−1/2−ε} for any ε > 0. In [HMPW08] Holenstein et al. gave an algorithm that runs in exp(Õ(n^{1/2})) time using exp(Õ(n^{1/2})) traces and succeeds for any δ bounded away from 1 by a constant. Simultaneous and independent works of De et al. [DOS17] and Nazarov and Peres [NP17] gave an algorithm that improves the running time and sample complexity of [HMPW08] to exp(O(n^{1/3})). In this same constant-δ regime, successively stronger lower bounds on the required sample complexity were given by [MPV14, HL18], culminating in an Ω̃(n^{3/2}) lower bound due to Chase [Cha19].

Another natural variant of the trace reconstruction problem is the average-case version; in this variant the unknown string x is assumed to be drawn uniformly at random from {0,1}^n, and the goal is for the algorithm to succeed with high probability over the random choice of x. This problem variant is motivated both by the apparent difficulty of the worst-case problem and by the fact that in various application domains it may be overly pessimistic to assume that the input string x is adversarially generated. Much more efficient algorithms are known for the average-case problem: several early works [BKKM04, KM05, VS08] gave efficient algorithms that succeed for trace reconstruction of almost all x ∈ {0,1}^n for various o_n(1) deletion rates δ, and [HMPW08] gave an algorithm that runs in poly(n) time using poly(n) traces when δ is at most some sufficiently small constant. More recent results of Peres and Zhai [PZ17] and Holden et al. [HPP18, HPPZ20], which build on the worst-case trace reconstruction results of [DOS17, NP17], substantially improve on this, with [HPP18, HPPZ20] giving an algorithm which uses exp(O(log^{1/3} n)) traces to reconstruct a random x ∈ {0,1}^n in n^{1+o_n(1)} time when the deletion rate is any constant bounded away from 1.

Summarizing the results described above, the current exp(O(n^{1/3})) state of the art for worst-case trace reconstruction is exponentially higher than the current exp(O(log^{1/3} n)) state of the art for average-case trace reconstruction. Given this substantial gap, it is natural to investigate intermediate formulations of the problem between the worst-case and average-case models.

The well-studied smoothed analysis model, introduced by Spielman and Teng [ST01], provides a natural framework for interpolating between worst-case and average-case complexity. In smoothed analysis the input to an algorithm is obtained by applying a random σ-perturbation to a worst-case input instance; here σ is a "perturbation rate," which it is natural to scale so that σ = 1 corresponds to a truly random instance and σ = 0 corresponds to a worst-case instance. By choosing intermediate settings of σ it is possible to interpolate between worst-case and average-case problem variants.

We now give a detailed statement of the smoothed trace reconstruction problem that we consider. First, a "worst-case" string x_worst is chosen arbitrarily from {0,1}^n, and then a randomly perturbed version x of the string x_worst is formed by independently replacing each coordinate of x_worst by a uniform random bit with probability σ. The goal is to reconstruct x given access to independent traces drawn from Del_δ(x). Note that when σ = 0 this reduces to the worst-case trace reconstruction problem, and when σ = 1 this reduces to the average-case problem.

As our main result, we give an algorithm for the smoothed trace reconstruction problem.
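To fix ideas before stating the result, here is a minimal sketch (ours; the function name is hypothetical) of the instance-generation process just described:

```python
import random

def sigma_perturb(x_worst: str, sigma: float, rng: random.Random) -> str:
    """Smoothed instance: independently replace each coordinate of x_worst
    by a uniform random bit with probability sigma, else keep it."""
    return "".join(rng.choice("01") if rng.random() < sigma else bit
                   for bit in x_worst)

rng = random.Random(0)
# sigma = 0 is the worst-case model: the instance is x_worst itself.
assert sigma_perturb("0000011111", 0.0, rng) == "0000011111"
# sigma = 1 is the average-case model: every bit is freshly uniform.
x = sigma_perturb("0000000000", 1.0, rng)
assert len(x) == 10 and set(x) <= {"0", "1"}
```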
For any initial string x_worst, our algorithm can recover a 1 − 1/poly(n) fraction of perturbed strings x obtained from x_worst (for any poly(n)) in polynomial time, for any constant perturbation rate 0 < σ ≤ 1 and deletion rate 0 < δ < 1. More precisely, the main theorem we prove is the following:
Theorem 1 (Polynomial-time smoothed trace reconstruction). Let 0 < δ, η, τ < 1 and 0 < σ ≤ 1. Let x_worst be an arbitrary and unknown string in {0,1}^n and let x be formed from x_worst by independently replacing each bit of x_worst with a uniform random bit from {0,1} with probability σ. There is an algorithm with the following guarantee: with probability at least 1 − η (over the random generation of x from x_worst), it is the case that the algorithm, given access to independent traces drawn from Del_δ(x), outputs the string x with probability at least 1 − τ (over the random traces drawn from Del_δ(x)). Its running time, as well as the number of traces it uses, is

    (n/η)^{O((1/(σ(1−δ))) · log(1/(1−δ)))} · log(1/τ).

It is interesting that while the best currently known algorithms for the worst-case problem, corresponding to σ = 0, require exp(O(n^{1/3})) time, for any constant perturbation rate we can solve the problem in a dramatically more efficient way. Intuitively, this shows that worst-case instances for trace reconstruction are "few and far between," in the sense that even a small perturbation of such an instance typically makes it much easier to solve.

Before describing our approach we briefly recall some of the methods used in prior work for the worst-case and average-case problems and discuss why these approaches do not seem applicable to the smoothed problem that we consider.

Worst-case algorithms.
All of the known worst-case algorithms [HMPW08, DOS17, NP17] for deletion rates bounded away from 1 are "mean-based," meaning that they only use (estimates of) the n expected values E_{y∼Del_δ(x)}[y_i], i = 1, ..., n. The two papers [DOS17, NP17] both show that mean-based algorithms can only succeed if they are given estimates of these expectations that are additively ±exp(−Ω(n^{1/3}))-accurate, and hence mean-based algorithms must inherently use exp(Ω(n^{1/3})) traces for the worst-case problem. Inspection of [DOS17, NP17] shows that these worst-case lower bounds for mean-based algorithms in fact hold for a 1 − o_n(1) fraction of strings in {0,1}^n. Thus, the mean-based algorithmic approach of [HMPW08, DOS17, NP17] will not work for our smoothed variant of the problem (and indeed our algorithm is not a mean-based algorithm).

Average-case algorithms.
The average-case algorithms of [PZ17, HPP18, HPPZ20] work by aligning individual traces (and are not mean-based). The analysis builds off of some of the structural results established in [DOS17, NP17], but also employs sophisticated probabilistic arguments which heavily depend on the randomness of the source string x being reconstructed.

As noted in [HPP18, HPPZ20], their average-case algorithm extends to the setting in which the target string x is drawn from the p-biased distribution over {0,1}^n (under which each bit x_i is independently taken to be 1 with probability p). Taking p = σ/2, this corresponds to our smoothed analysis model in the special case in which the original string x_worst is promised to be the string 0^n. Equivalently, we can view our smoothed analysis problem as a more challenging variant of p-biased average-case trace reconstruction (more challenging because the initial string x_worst is no longer promised to be 0^n, but rather is both arbitrary and moreover unknown to the reconstruction algorithm). It is not clear how to extend the p-biased average-case results of [HPP18, HPPZ20] even to the setting in which the starting string x_worst is a known arbitrary string, let alone to our setting in which x_worst is both arbitrary and unknown.

In contrast with prior algorithms for the worst-case and average-case problems, our approach is based on first reconstructing subwords of the target string and then reconstructing the target string from those subwords. Recall that a subword of a string x = (x_1, ..., x_n) is a sequence of contiguous characters of x, i.e. a (b − a + 1)-character string (x_a, x_{a+1}, ..., x_b) for some 1 ≤ a ≤ b ≤ n.

Reconstruction from subwords.
Given a length 1 ≤ k ≤ n, let us write subword(x, k) to denote the multiset of all n − k + 1 length-k subwords of x; we refer to this multiset as the k-subword deck of x. For example, if n = 7 and k = 3, then the k-subword deck of x = 1101011 would be the 5-element multiset {110, 101, 010, 101, 011}.

In general the k-subword deck of x may not uniquely identify the string x within {0,1}^n unless k is very large; for example, the two multisets

    subword(1^{n/4} 0^{n/4−1} 1^{n/4} 0^{n/4}, k)  and  subword(1^{n/4} 0^{n/4} 1^{n/4} 0^{n/4−1}, k)

are identical for every k ≤ n/4 − 1. This simple example shows that for worst-case strings x, the k-subword deck of x may not suffice to information-theoretically specify x unless k is linear in n.

The starting point of our approach is the observation that the situation is markedly better for random perturbations of worst-case strings: for any worst-case string x_worst ∈ {0,1}^n, with high probability a random σ-perturbation x of x_worst is such that subword(x, k) does uniquely identify x within {0,1}^n even if k is relatively small. Moreover, there is an efficient algorithm to reconstruct x from subword(x, k). This is captured by the following result, which we prove in Section 3:

Lemma 1.1 (Reconstructing perturbed strings from their subword decks). Let 0 < σ, η < 1. There is a deterministic algorithm Reconstruct-from-subword-deck which takes as input the k-subword deck subword(x, k) of a string x ∈ {0,1}^n, where k = O(log(n/η)/σ), and outputs either a string in {0,1}^n or "fail." Reconstruct-from-subword-deck runs in poly(n) time and has the following property: for any x_worst ∈ {0,1}^n, if x is a random σ-perturbation of x_worst (i.e. x is obtained by independently replacing each bit of x_worst with a uniform random bit with probability σ), then with probability at least 1 − η the output of Reconstruct-from-subword-deck on input subword(x, k) is the string x.

The subword deck reconstruction problem.
Lemma 1.1 naturally motivates the algorithmic problem of subword deck reconstruction: given access to independent traces drawn from Del_δ(x) and a length k, can we reconstruct the k-subword deck of x? Our main algorithmic contribution is an efficient algorithm for this problem:

Theorem 2 (Reconstructing the k-subword deck of x). Let 0 < δ, τ < 1. There is an algorithm Reconstruct-subword-deck which takes as input a parameter 1 ≤ k ≤ n and access to independent traces of an unknown source string x ∈ {0,1}^n. The running time of Reconstruct-subword-deck, as well as the number of traces it uses, is

    (n · (2/(1−δ))^k)^{O(1/(1−δ))} · log(1/τ).

Reconstruct-subword-deck has the following property: for any string x ∈ {0,1}^n, with probability at least 1 − τ the output of Reconstruct-subword-deck is the k-subword deck subword(x, k).

Theorem 1 follows immediately from Lemma 1.1 and Theorem 2. We note that Theorem 2 dominates the overall running time of Theorem 1, and that Theorem 2 works for arbitrary strings.

The algorithm in Lemma 1.1 and its analysis are relatively straightforward. To explain the main idea, we define the notion of the right (and left) extension of a string. (Starting from this point, it will be convenient for us to index a string x ∈ {0,1}^n using {0, ..., n−1} as x = (x_0, ..., x_{n−1}).)

Definition 1. Given a k-bit string w = (w_0, ..., w_{k−1}) ∈ {0,1}^k, a k-bit string (w_1, ..., w_{k−1}, b) for some b ∈ {0,1} is said to be a right-extension of w. We define left-extensions of a string similarly.

At a high level, the algorithm relies on the fact that if x is obtained by a random σ-perturbation of x_worst, then x has useful local uniqueness properties. More precisely, for k = O(log(n/τ)/σ), a simple probabilistic argument shows that with high probability x[n−k : n−1] is the unique element of subword(x, k) with no right-extension in subword(x, k). Consequently, we can identify x[n−k : n−1] from the k-subword deck subword(x, k) of x. This argument can be extended inductively without much difficulty to in fact identify the whole of x.

In contrast, Theorem 2 is substantially more challenging. The structural results that underlie Theorem 2 are based on two different sets of analytic arguments. The first argument only works when δ ≤ 1/2, while the second works for any deletion rate δ < 1. Observe that subword(x, k) can be obtained by computing the multiplicity of occurrences of each w ∈ {0,1}^k in the multiset subword(x, k); we denote this multiplicity by #(w, x). The first key step is to define a univariate polynomial (in the variable ζ) SW_{x,w}(ζ) which has the following two key properties: (i) SW_{x,w}(0) = #(w, x), and (ii) using traces from Del_δ(x), we have an unbiased estimator for SW_{x,w}(ζ) for ζ = δ. Next, observe that given traces from Del_δ(x), we can trivially simulate traces from Del_{δ′}(x) for any δ′ ≥ δ, and hence we can get an unbiased estimator for SW_{x,w}(ζ) for any ζ ∈ [δ, 1]. Of course, our goal is to evaluate SW_{x,w}(ζ) at ζ = 0, and thus items (i) and (ii) above do not give us an unbiased estimator for SW_{x,w}(0). The most obvious idea at this point would be to do polynomial interpolation and use estimates for SW_{x,w}(ζ) for ζ ∈ [δ,
1] to infer SW_{x,w}(0). Unfortunately, directly applying Lagrange interpolation is too naive an approach: to accurately estimate SW_{x,w}(0), it turns out that we need SW_{x,w}(ζ) for ζ ∈ [δ, 1] up to error ±2^{−Θ(n)}. However, to estimate SW_{x,w}(ζ) up to error ±κ, at least poly(1/κ) traces from Del_δ(x) are needed (Lemma 6.1). Thus, directly applying Lagrange interpolation would require a sample complexity that grows like 2^{Θ(n)}, which is too expensive.

Our approach is to forego Lagrange interpolation and instead (in essence) interpolate using tools from complex analysis. In particular, we prove a new structural result (Theorem 9) about polynomials whose constant coefficient is not too small and whose coefficients have magnitude bounded from above by a parameter m (which is set to be n^k in our application, given that every coefficient of SW_{x,w}(ζ) is bounded by n^k). This result implies that SW_{x,w}(0) (which must be an integer given that SW_{x,w}(0) = #(w, x)) is uniquely determined by the values of SW_{x,w}(ζ) in the interval ζ ∈ [δ, 1] if these values are given up to error n^{−O(k/(1−δ))}; see Theorem 8. Thus, in principle we can determine SW_{x,w}(0) by estimating SW_{x,w}(ζ) for values of ζ ∈ [δ, 1] to error ±n^{−O(k/(1−δ))}. This essentially implies that SW_{x,w}(0) can be determined using ≈ n^{O(k/(1−δ))} traces from Del_δ(x). (Note though that this sample complexity is not quite as good as is claimed in Theorem 2. We refine the above argument, using stronger coefficient bounds on SW_{x,w} and other ideas described at the beginning of Section 7, to get Theorem 2 in its full strength as stated earlier.)

In closing this subsection, we emphasize that while Theorem 8 is about the behavior of polynomials on the real line, its proof crucially uses tools from complex analysis such as Jensen's formula and the Hadamard three-circle theorem. We further note that while we have sketched above how SW_{x,w}(0) can be determined in principle, this does not necessarily give an efficient algorithm. To get an efficient algorithm, we use an approach based on linear programming.

We view this paper as a first exploration, establishing that the algorithmic framework of smoothed analysis can be fruitfully brought to bear on the trace reconstruction problem. There are many interesting questions and directions for future work, some of which we highlight below.

One natural question is to establish strong sample complexity lower bounds for smoothed trace reconstruction. Currently the best lower bound we are aware of for this framework is the Ω̃(log^{5/2} n) lower bound for average-case trace reconstruction due to [Cha19]. Can an n^{Ω(1)} lower bound be established for the smoothed model?

Another natural goal is to quantitatively strengthen our algorithmic result. In the regime of σ = c/n with c a small constant, the smoothed problem reduces to the worst-case problem, and in the regime σ = 1 it reduces to the average-case problem; however, the running times of our algorithm in these regimes do not match the state-of-the-art running times for the corresponding problems that are provided in [DOS17, NP17] and in [HPP18, HPPZ20] respectively. As a concrete first question along these lines, is it possible to improve the sample complexity of our algorithm from its current n^{1/σ} dependence on the perturbation rate to a dependence more like n^{1/σ^{1/3}}?

Notation.
Given a nonnegative integer n, we write [n] to denote {1, ..., n}. Given integers a ≤ b we write [a : b] to denote {a, ..., b}. It will be convenient for us to index a binary string x ∈ {0,1}^n using [0 : n−1] as x = (x_0, ..., x_{n−1}). Given such a string x and integers 0 ≤ a < b ≤ n−1, we write x[a : b] to denote the subword (x_a, x_{a+1}, ..., x_b). We write ln to denote natural logarithm and log to denote logarithm to the base 2.

We denote the set of non-negative integers by Z_{≥0}. For a vector α = (α_1, ..., α_ℓ) ∈ Z^ℓ_{≥0}, we write |α| to denote α_1 + α_2 + ··· + α_ℓ, and write α! to denote α_1! α_2! ··· α_ℓ!.

Subword deck.
Fix a string x ∈ {0,1}^n and an integer k ∈ [n]. A k-subword of x is a (contiguous) subword of x of length k, given by (x_a, x_{a+1}, ..., x_{a+k−1}) for some a ∈ [0 : n−k]. For a string w ∈ {0,1}^k, let #(w, x) denote the number of occurrences of w as a subword of x. We define the k-subword deck of x, denoted subword(x, k), to be the (n − k + 1)-size (unordered) multiset of all k-subwords of x. We also extend the notation #(w, x) to strings w ∈ {0, 1, ∗}^k, where ∗ is the wildcard symbol: #(w, x) is the sum of #(w′, x) over all w′ ∈ {0,1}^k with w′_i = w_i for every i with w_i ≠ ∗.

Distributions.
We use boldfaced letters to denote probability distributions and random variables, which should be clear from the context. We write "x ∼ X" to indicate that the random variable x is distributed according to the distribution X.

Deletion channel and traces.
Throughout this paper the parameter δ (with 0 < δ < 1) denotes the deletion probability.(1) Given a string x ∈ {0,1}^n, we write Del_δ(x) to denote the distribution of the string that results from passing x through the δ-deletion channel (so the distribution Del_δ(x) is supported on {0,1}^{≤n}), and we refer to a string in the support of Del_δ(x) as a trace of x. Recall that a random trace y ∼ Del_δ(x) is obtained by independently deleting each bit of x with probability δ and concatenating the surviving bits.

Perturbation and smoothed analysis. The perturbation model we consider corresponds to the standard notion of perturbation of an n-bit string which arises in the analysis of Boolean functions. Given an n-bit string x_worst ∈ {0,1}^n, a σ-perturbation of x_worst is a random string x ∈ {0,1}^n obtained by independently setting each coordinate x_i to be (x_worst)_i with probability 1 − σ and to be uniformly random with the remaining probability σ. Equivalently, x is a random string that is (1 − σ)-correlated with x_worst; in the notation of Chapter 2 of [O'D14], we may write this as "x ∼ N_{1−σ}(x_worst)."

We recall that in the smoothed analysis framework, an initial string x_worst ∈ {0,1}^n is selected (in what may be thought of as an adversarial manner), then a σ-perturbation x of x_worst is drawn at random from N_{1−σ}(x_worst), and the algorithm runs on the instance x. The goal is to develop algorithms which, for any x_worst ∈ {0,1}^n, succeed with high probability on the perturbed instance x ∼ N_{1−σ}(x_worst).

(1) For simplicity in this work we assume that the deletion probability δ is known to the reconstruction algorithm. We note that it is possible to obtain a high-accuracy estimate of δ simply by measuring the average length of traces received from the deletion channel.

Reconstructing perturbed strings from their subword decks: Proof of Lemma 1.1

In this section we prove Lemma 1.1:
Restatement of Lemma 1.1 (Reconstructing perturbed strings from their subword decks). Let 0 < σ, η < 1. There is a deterministic algorithm Reconstruct-from-subword-deck which takes as input the k-subword deck subword(x, k) of a string x ∈ {0,1}^n, where k = O(log(n/η)/σ), and outputs either a string in {0,1}^n or "fail." Reconstruct-from-subword-deck runs in poly(n) time and has the following property: For any string x_worst ∈ {0,1}^n, if x is a random σ-perturbation of x_worst (i.e. x is obtained by independently replacing each bit of x_worst with a uniform random bit with probability σ), then with probability at least 1 − η the output of Reconstruct-from-subword-deck on input subword(x, k) is the string x.

The idea of Lemma 1.1 is very simple: a probabilistic argument shows that for any worst-case string x_worst, a random σ-perturbation introduces enough variability into x ∼ N_{1−σ}(x_worst) so that the k-subwords comprising the k-subword deck of x can be easily pieced together in a unique way to yield x by a simple greedy algorithm. We now provide details.

Given subword(x, k) of a string x ∈ {0,1}^n, we use the following greedy algorithm to recover x:

1. We will store the output in y, a string of length n.

2. Let w ∈ subword(x, k) be a string that fails to have a right-extension in subword(x, k). (Note that the only k-subword of x that can fail to have a right-extension in subword(x, k) is x[n−k : n−1].) If no such w exists, return fail; otherwise set y[n−k : n−1] = w and ℓ = n − k.

3. While ℓ > 0, do the following: Find w ∈ subword(x, k) that is a left-extension of y[ℓ : ℓ+k−2]. (Note that if y agrees with x so far, then such a left-extension must exist.) If w is not unique (counted with multiplicity), return fail; otherwise set y_{ℓ−1} = w_0 and decrement ℓ by 1.

4. When ℓ = 0, return y.

It is clear from the description of the greedy algorithm above and the comments therein that either it returns fail, or there is no ambiguity (in filling in the last k bits and extending from there bit by bit) and x is recovered correctly as y at the end. We use the following definition to capture strings x on which the greedy algorithm succeeds:

Definition 2. An n-bit string x is said to be k-good if

(i) for every j ∈ [n−k], there is exactly one string in subword(x, k) (counted with multiplicity) that is a left-extension of the subword x[j : j+k−2]; and

(ii) the subword x[n−k : n−1] does not have a right-extension in subword(x, k).

To prove Lemma 1.1, it remains only to establish the following claim:

Claim 3.1.
Fix any string x_worst ∈ {0,1}^n. Then for k = O(log(n/η)/σ) we have

    Pr_{x ∼ N_{1−σ}(x_worst)} [ x is k-good ] ≥ 1 − η.

Proof. Let E(x) be the event that x is not k-good. We observe that for E(x) to occur, there must exist indices 0 ≤ i < j ≤ n − k + 1 such that the (k−1)-subwords of x starting at positions i and j are equal, i.e., x[i : i+k−2] = x[j : j+k−2]. Hence (with all probabilities below taken over x ∼ N_{1−σ}(x_worst)):

    Pr[E(x)] ≤ Pr[ ∃ i, j such that x[i : i+k−2] = x[j : j+k−2] ]
            ≤ Σ_{0 ≤ i < j ≤ n−k+1} Pr[ x[i : i+k−2] = x[j : j+k−2] ].    (by a union bound)

Let E_{i,j}(x) denote the event that x[i : i+k−2] = x[j : j+k−2]. Since there are fewer than n² pairs (i, j), it suffices to show that Pr[E_{i,j}(x)] ≤ η/n² for each fixed pair 0 ≤ i < j ≤ n − k + 1.

To this end, we write the probability of E_{i,j}(x) as

    Pr[x_i = x_j] · Π_{ℓ=1}^{k−2} Pr[ x_{i+ℓ} = x_{j+ℓ} | x_{i+h} = x_{j+h} for all h = 0, ..., ℓ−1 ].

The first factor Pr[x_i = x_j] is at most 1 − σ/2: for each fixed bit b, each of x_i, x_j agrees with b after the perturbation with probability at most 1 − σ/2. The upper bound of 1 − σ/2 holds similarly for each ℓ-th factor: we note that for any fixed values of x_i, ..., x_{j+ℓ−1} that satisfy the conditioning x_{i+h} = x_{j+h} for all h = 0, ..., ℓ−1, the bit x_{j+ℓ} agrees with the fixed value of x_{i+ℓ} with probability at most 1 − σ/2. Taking k = C · log(n/η)/σ for some large enough constant C, we have

    Pr[E_{i,j}(x)] ≤ (1 − σ/2)^{k−1} ≤ exp(−Ω(log(n²/η))) ≤ η/n².

This finishes the proof of the claim. ∎

Reconstructing the k-subword deck: Proof of Theorem 2

The remaining task to establish the main result, Theorem 1, is to prove Theorem 2 (restated below), which gives an efficient algorithm to reconstruct the k-subword deck of an arbitrary source string x ∈ {0,1}^n given access to independent traces of x. Throughout this section, let ρ = (1 − δ)/2.

Restatement of Theorem 2 (Reconstructing the k-subword deck). Let 0 < δ, τ′ < 1. There is an algorithm Reconstruct-subword-deck which takes as input a parameter 1 ≤ k ≤ n and access to independent traces of an unknown source string x ∈ {0,1}^n. The running time of Reconstruct-subword-deck, as well as the number of traces it uses, is

    (n/ρ^k)^{O(1/ρ)} · log(1/τ′).

Reconstruct-subword-deck has the following property: For any unknown string x ∈ {0,1}^n, with probability at least 1 − τ′, Reconstruct-subword-deck outputs subword(x, k).
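Before turning to the components of Reconstruct-subword-deck, the subword deck and the greedy procedure of Section 3 can be illustrated concretely. The sketch below is ours (not the paper's code); the reconstruction routine is only guaranteed to succeed on k-good strings, for example strings all of whose (k−1)-subwords are distinct:

```python
from collections import Counter

def subword_deck(x: str, k: int) -> Counter:
    """The multiset subword(x, k) of all n-k+1 contiguous k-subwords of x."""
    return Counter(x[a:a + k] for a in range(len(x) - k + 1))

def reconstruct_from_deck(deck: Counter, n: int, k: int):
    """Greedy reconstruction: find the unique k-subword with no
    right-extension in the deck (it must be x[n-k : n-1]), then repeatedly
    prepend the unique left-extension.  Returns None on any ambiguity."""
    sinks = [w for w in deck
             if deck[w[1:] + "0"] + deck[w[1:] + "1"] == 0]
    if len(sinks) != 1:
        return None                      # "fail"
    y = sinks[0]
    while len(y) < n:
        s = y[:k - 1]                    # leading (k-1) bits found so far
        if deck["0" + s] + deck["1" + s] != 1:
            return None                  # left-extension not unique: "fail"
        y = ("0" if deck["0" + s] == 1 else "1") + y
    return y

# The 3-subword deck of x = 1101011 from the introduction:
assert subword_deck("1101011", 3) == Counter(
    {"101": 2, "110": 1, "010": 1, "011": 1})

# A string whose length-3 subwords are all distinct is 4-good,
# so the greedy procedure recovers it exactly from its 4-subword deck:
x = "0001011100"
assert reconstruct_from_deck(subword_deck(x, 4), len(x), 4) == x
```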
The main component of Reconstruct-subword-deck is an algorithm, Multiplicity, which takes as input a string w ∈ {0,1}^k and access to independent traces from an unknown source string x, and outputs #(w, x), the multiplicity of w in the (n − k + 1)-element multiset subword(x, k) (note that this multiplicity can be zero if w is not present as a subword of x):

Theorem 3. Let 0 < δ, τ < 1 and let ρ = (1 − δ)/2. There is an algorithm Multiplicity which takes as input a string w ∈ {0,1}^k and access to independent traces of an unknown source string x ∈ {0,1}^n. Multiplicity runs in (n/ρ^k)^{O(1/ρ)} · log(1/τ) time and uses (n/ρ^k)^{O(1/ρ)} · log(1/τ) many traces from Del_δ(x), and has the following property: For any unknown source string x ∈ {0,1}^n, with probability at least 1 − τ the output of Multiplicity is #(w, x) (i.e. the number of occurrences of w as a subword of x).

A standard "branch-and-prune" argument gives Theorem 2 from Theorem 3:

Proof of Theorem 2 using Theorem 3. Let ℓ = ⌊log n⌋. We first consider the case that k ≤ ℓ. In this case Reconstruct-subword-deck simply runs Multiplicity(w) once for each of the 2^k strings w ∈ {0,1}^k, with the confidence parameter "τ" for each run of Multiplicity set to τ′/2^k. Since we can reuse the same traces for each of the 2^k runs, in this case the running time is 2^k · (n/ρ^k)^{O(1/ρ)} · log(2^k/τ′) = (n/ρ^k)^{O(1/ρ)} · log(1/τ′) and the sample complexity is (n/ρ^k)^{O(1/ρ)} · log(1/τ′).

Next we consider the case that k > ℓ. To avoid an exponential running time dependence on k, the algorithm uses a simple "branch-and-prune" approach. In the first stage, similar to the previous paragraph, Reconstruct-subword-deck runs Multiplicity on each of the 2^ℓ strings w ∈ {0,1}^ℓ with confidence parameter τ′/(2nk), thereby obtaining the ℓ-subword deck subword(x, ℓ). It then executes k − ℓ many successive stages j = 1, 2, ..., k − ℓ, where in stage j the algorithm determines the (ℓ+j)-subword deck of x using the (ℓ+j−1)-subword deck of x. It does this in each stage as follows: for each of the (at most n) distinct strings w ∈ subword(x, ℓ+j−1), it runs Multiplicity(w0) and Multiplicity(w1), each with confidence parameter τ′/(2nk).

The correctness of this approach follows from the trivial fact that an (ℓ+j)-bit string can only be present in subword(x, ℓ+j) if its (ℓ+j−1)-bit prefix is present in subword(x, ℓ+j−1). Since there are at most n + 2n(k − ℓ) < 2kn many runs of Multiplicity overall, the running time of Reconstruct-subword-deck is at most O(kn) · (n/ρ^k)^{O(1/ρ)} · log(2kn/τ′) = (n/ρ^k)^{O(1/ρ)} · log(1/τ′) and the sample complexity is at most (n/ρ^k)^{O(1/ρ)} · log(1/τ′), and Theorem 2 is proved. ∎

Thus, in the rest of the paper, we focus on proving Theorem 3. The following "subword polynomial" plays an important role in our approach:

Definition 3. Given x ∈ {0,1}^n and w = (w_0, ..., w_{k−1}) ∈ {0,1}^k, let SW_{x,w}(ζ) be the following univariate polynomial of degree n − k:

    SW_{x,w}(ζ) := Σ_{α ∈ Z^{k−1}_{≥0}, |α| ≤ n−k}  #( w_0 ∗^{α_0} w_1 ∗^{α_1} w_2 ··· w_{k−2} ∗^{α_{k−2}} w_{k−1}, x ) · ζ^{|α|}.

In words, the degree-ℓ coefficient of the subword polynomial SW_{x,w} is the number of ways that w arises as a subword of x with a total of exactly ℓ extraneous additional characters interspersed among the characters of w. In particular, we have that the constant term of SW_{x,w} (i.e. SW_{x,w}(0), since 0^0 = 1) is equal to #(w, x), the frequency of w as a subword of x, which is what Theorem 3 aims to estimate efficiently from traces of x.

We prove Theorem 3 by giving two different algorithms depending on the value of the deletion rate δ. The first of these algorithms, Multiplicity_small-δ, gives a simple and direct approach to compute the value SW_{x,w}(0) = #(w, x); however, this approach requires the deletion rate δ to be less than 1/2.
This approach is based on analyzing a new object, the "generalized deletion polynomial," that we believe may be useful for subsequent work. The second of these algorithms, Multiplicity$_{\text{large-}\delta}$, gives a different and somewhat more involved algorithm (involving linear programming and a new extremal result on polynomials, proved using complex analysis) that can be used for any deletion rate $\delta < 1$. Readers who are interested in the simplest possible approach (which works only for $\delta < 1/2$) may wish to focus on Multiplicity$_{\text{small-}\delta}$ (Section 5). Readers who are interested in a more involved approach that succeeds for all $\delta < 1$ may wish to focus on Multiplicity$_{\text{large-}\delta}$ (Section 6). The two algorithms and analyses are each self-contained; each may be read independently of the other.

For each of the two algorithms, we first give a simpler version of the analysis which establishes a quantitatively weaker version of the result, with an $n^{O(k)}$ running time and sample complexity (ignoring the dependence on other parameters); see the statements of Theorem 4 and Theorem 7, at the beginnings of Section 5 and Section 6 respectively, for detailed statements of these weaker versions. In Section 7 we quantitatively strengthen both Theorem 4 and Theorem 7 to achieve a $\mathrm{poly}(n) \cdot \exp(O(k))$ running time and sample complexity, and thereby complete the proof of Theorem 3.

5 Multiplicity$'_{\text{small-}\delta}$: An algorithm for deletion rate $\delta < 1/2$

In this section we prove Theorem 4, a weaker version of Theorem 3. It gives an algorithm that has $n^{O(k)}$ running time and sample complexity (ignoring the dependence on other parameters) and works when $\delta < 1/2$. Actually Theorem 4 works when $\delta \le 1/2$; we only require $\delta < 1/2$ in the strengthened version given in Section 7, whose complexity bounds grow the closer $\delta$ is to $1/2$.

Theorem 4. Let $0 < \delta \le 1/2$. There is an algorithm Multiplicity$'_{\text{small-}\delta}$ which takes as input a string $w \in \{0,1\}^k$, access to independent traces of an unknown source string $x \in \{0,1\}^n$, and a parameter $\tau > 0$. Multiplicity$'_{\text{small-}\delta}$ draws $n^{O(k)} \cdot \log(1/\tau)$ traces from $\mathrm{Del}_\delta(x)$, runs in time $n^{O(k)} \cdot \log(1/\tau)$, and has the following property: For any unknown source string $x \in \{0,1\}^n$, with probability at least $1-\tau$ the output of Multiplicity$'_{\text{small-}\delta}$ is the multiplicity of $w$ in $\mathrm{subword}(x,k)$ (i.e. the number of occurrences of $w$ as a subword of $x$).

In Section 7 we will build on Theorem 4 to give a stronger version that has $\mathrm{poly}(n) \cdot \exp(O(k))$ running time and sample complexity (ignoring the dependence on other parameters) for $\delta < 1/2$.

The key ingredient is a new expression for $\mathrm{SW}_{x,w}(\zeta)$, given in Theorem 5, which relates the subword polynomial to traces drawn from the deletion channel. The proof uses the generalized deletion polynomial and is presented in Section 5.2. This new expression for $\mathrm{SW}_{x,w}(\zeta)$ allows one to evaluate $\mathrm{SW}_{x,w}(\zeta)$ at $\zeta = 0$ up to a small error (say, $\pm 0.1$) using traces of $x$ (see Corollary 5.1) when $\delta \le 1/2$. Given that $\mathrm{SW}_{x,w}(0)$ is an integer, the result can be rounded to obtain the exact value of $\mathrm{SW}_{x,w}(0)$; this finishes the proof of Theorem 4.

We remark that the expression for $\mathrm{SW}_{x,w}(\zeta)$ given in Theorem 5 works for any $\zeta \in \mathbb{C}$, when viewing $\mathrm{SW}_{x,w}(\zeta)$ as a polynomial over $\mathbb{C}$, and may be useful for subsequent work. Indeed Corollary 5.1 shows that $\mathrm{SW}_{x,w}(\zeta)$ can be evaluated at any $\zeta \in B_{1-\delta}(\delta)$ up to a small error using traces of $x$, where $B_{1-\delta}(\delta)$ denotes the complex disc with center $\delta$ and radius $1-\delta$. We need $\delta \le 1/2$ so that $0 \in B_{1-\delta}(\delta)$.

5.1 Estimating $\mathrm{SW}_{x,w}(\zeta)$ for $\zeta \in B_{1-\delta}(\delta)$ using traces of $x$

In the rest of this section we consider $\mathrm{SW}_{x,w}(\zeta)$ as a polynomial over complex numbers. The main technical ingredient in the algorithm Multiplicity$'_{\text{small-}\delta}$ is the following theorem, which relates the subword polynomial to traces drawn from the deletion channel:

Theorem 5. Let $x$, $k$ and $w$ be as above. Then for all $\zeta \in \mathbb{C}$ we have

$$\mathrm{SW}_{x,w}(\zeta) = \frac{1}{(1-\delta)^k} \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[\#(w_0 \ast^{\alpha_1} w_1 \ast^{\alpha_2} w_2 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)\Big] \cdot \left(\frac{\zeta - \delta}{1-\delta}\right)^{|\alpha|}.$$
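Theorem 5 can be sanity-checked on a toy instance by computing the trace expectations exactly, summing over all $2^n$ deletion outcomes rather than sampling. The sketch below is ours (names and the test instance are illustrative), and the exhaustive enumeration is feasible only for tiny $n$:

```python
from itertools import product

def count_pattern(pattern, y):
    """#(u, y): occurrences of pattern u (None = wildcard *) as a contiguous substring of y."""
    m = len(pattern)
    return sum(
        all(p is None or p == y[i + j] for j, p in enumerate(pattern))
        for i in range(len(y) - m + 1)
    )

def gap_pattern(w, alpha):
    """The pattern w_0 *^{a_1} w_1 ... *^{a_{k-1}} w_{k-1} (None encodes *)."""
    pattern = [w[0]]
    for a, bit in zip(alpha, w[1:]):
        pattern += [None] * a + [bit]
    return pattern

def exact_expectation(pattern, x, delta):
    """E_{y ~ Del_delta(x)}[#(pattern, y)], by exhausting all 2^n deletion outcomes."""
    n = len(x)
    total = 0.0
    for keep in product([0, 1], repeat=n):
        y = [b for b, kp in zip(x, keep) if kp]
        prob = (1 - delta) ** sum(keep) * delta ** (n - sum(keep))
        total += prob * count_pattern(pattern, y)
    return total

x, w, delta, zeta = [1, 0, 1, 1, 0], [1, 1], 0.4, 0.0
n, k = len(x), len(w)
lhs = rhs = 0.0
for alpha in product(range(n - k + 1), repeat=k - 1):
    if sum(alpha) > n - k:
        continue
    pat = gap_pattern(w, alpha)
    lhs += count_pattern(pat, x) * zeta ** sum(alpha)       # Definition 3
    rhs += exact_expectation(pat, x, delta) * ((zeta - delta) / (1 - delta)) ** sum(alpha)
rhs /= (1 - delta) ** k                                     # Theorem 5's right-hand side
assert abs(lhs - rhs) < 1e-9
```

With $\zeta = 0$ the left-hand side is exactly $\#(w,x)$, which is what lets the algorithm read off the multiplicity from trace statistics.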
Before proving Theorem 5 in Section 5.2 we use it to obtain the following corollary.

Corollary 5.1 (Corollary of Theorem 5). Let $x, k, w$ be as above, and let $\varepsilon > 0$. Then, given access to traces $y \sim \mathrm{Del}_\delta(x)$, there exists an algorithm which, given as input any $\zeta \in B_{1-\delta}(\delta)$, evaluates $\mathrm{SW}_{x,w}(\zeta)$ up to error $\pm\varepsilon$ with success probability at least $1-\tau$. The algorithm takes

$$\left(\frac{n}{1-\delta}\right)^{O(k)} \cdot \frac{1}{\varepsilon^2} \cdot \log\left(\frac{1}{\tau}\right)$$

many traces and running time.

Recall that $\mathrm{SW}_{x,w}(0) = \#(w,x)$. When $\delta \le 1/2$, the disc $B_{1-\delta}(\delta)$ contains the origin. Therefore, setting $\varepsilon = 1/3$, Corollary 5.1 gives an algorithm Multiplicity$'_{\text{small-}\delta}$ that uses $(n/(1-\delta))^{O(k)} \cdot \log(1/\tau) = n^{O(k)} \cdot \log(1/\tau)$ traces and running time to evaluate $\mathrm{SW}_{x,w}(0)$ up to an error of $\varepsilon = 1/3$, which succeeds with probability at least $1-\tau$. It then rounds the result to the nearest integer to obtain $\mathrm{SW}_{x,w}(0) = \#(w,x)$, given that the latter is an integer. This finishes the proof of Theorem 4.

Proof of Corollary 5.1. The algorithm simply draws

$$s = \left(\frac{n}{1-\delta}\right)^{O(k)} \cdot \frac{1}{\varepsilon^2} \cdot \log\left(\frac{1}{\tau}\right)$$

many traces $y^1, \ldots, y^s$ of $x$ and uses them to compute an empirical estimate $\tilde{E}_\alpha$ of

$$E_\alpha := \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[\#(w_0 \ast^{\alpha_1} w_1 \ast^{\alpha_2} w_2 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)\Big] \qquad (1)$$

for each $\alpha \in \mathbb{Z}_{\ge 0}^{k-1}$ with $|\alpha| \le n-k$. This is done by computing $\#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y^i)$ (in time polynomial in $n$) for each pair of $\alpha$ and $y^i$, and then taking the average over $y^1, \ldots, y^s$ for each $\alpha$. Given that the number of $\alpha$'s is at most $n^k$, the overall running time is $s \cdot n^k \cdot \mathrm{poly}(n)$, as stated in Corollary 5.1.

Given that $\#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)$ in (1) is between 0 and $n$, it follows from our choice of $s$, a Chernoff bound and a union bound, that with probability at least $1-\tau$, every empirical estimate $\tilde{E}_\alpha$ satisfies

$$\big|\tilde{E}_\alpha - E_\alpha\big| \le \varepsilon \cdot \left(\frac{1-\delta}{n}\right)^k. \qquad (2)$$

Using $\left|\frac{\zeta-\delta}{1-\delta}\right| \le 1$ for $\zeta \in B_{1-\delta}(\delta)$, we can use the $\tilde{E}_\alpha$ to obtain an estimate of $\mathrm{SW}_{x,w}(\zeta)$:

$$\frac{1}{(1-\delta)^k} \sum_\alpha \tilde{E}_\alpha \cdot \left(\frac{\zeta-\delta}{1-\delta}\right)^{|\alpha|},$$

and the estimate is correct up to error

$$\frac{1}{(1-\delta)^k} \sum_\alpha \big|\tilde{E}_\alpha - E_\alpha\big| \le \varepsilon,$$

where the inequality holds by Equation (2) given that the number of $\alpha$'s is no more than $n^k$.

5.2 Proof of Theorem 5

In this subsection we prove Theorem 5. We first introduce a more general class of polynomials, the $(x,f)$-deletion-channel polynomials (see Definition 4), of which $\mathrm{SW}_{x,w}$ is a special case. We then prove an extension of Theorem 5 (see Theorem 6) which applies to every $(x,f)$-deletion-channel polynomial; Theorem 5 follows as a direct corollary. While we don't need the full generality of Theorem 6 to prove Theorem 5, working with this new class of polynomials makes our proofs cleaner. We also believe that Theorem 6 in this general form may be useful for subsequent analysis.

The following notation will be convenient for us. Given vectors $\gamma \in \mathbb{Z}_{\ge 0}^k$ and $\xi \in \mathbb{C}^k$, and a function $P(z_1, \ldots, z_k)$ from $\mathbb{C}^k$ to $\mathbb{C}$, we define

$$\xi^\gamma = \xi_1^{\gamma_1} \cdots \xi_k^{\gamma_k} \qquad \text{and} \qquad D^\gamma P = \frac{\partial^{|\gamma|} P}{\partial z_1^{\gamma_1} \cdots \partial z_k^{\gamma_k}}.$$

Recall that $\gamma! = \gamma_1! \cdots \gamma_k!$ and $|\gamma| = \gamma_1 + \cdots + \gamma_k$. For $v \in \mathbb{C}$, we will denote the vector $(v, v, \ldots, v) \in \mathbb{C}^k$ by $\vec{v}$, where the dimension $k$ will be clear from context.

We define the class of $(x,f)$-deletion-channel polynomials:

Definition 4. Given $f : \{0,1\}^k \to \mathbb{C}$ and a string $x \in \{0,1\}^n$, the $(x,f)$-deletion-channel polynomial $P_{x,f} : \mathbb{C}^k \to \mathbb{C}$ is defined by

$$P_{x,f}(\xi) := \sum_{\substack{\gamma \in \mathbb{Z}_{\ge 0}^k \\ |\gamma| \le n-k}} f\big(x_{\gamma_1},\, x_{\gamma_1+\gamma_2+1},\, \ldots,\, x_{\gamma_1+\cdots+\gamma_k+(k-1)}\big) \cdot \xi^\gamma.$$
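A quick illustrative check of Definition 4 (all helper names below are ours): for $k=1$ and $f = \mathrm{id}$ the polynomial $P_{x,\mathrm{id}}$ reduces to $\sum_i x_i \xi^i$, and for $k=2$ with $f$ the indicator of $w$, evaluating at $\xi = (1, \zeta)$ recovers the subword polynomial $\mathrm{SW}_{x,w}(\zeta)$, the connection exploited in the proof of Theorem 5 below.

```python
from itertools import product

def P(x, f, k, xi):
    """(x,f)-deletion-channel polynomial of Definition 4, by direct enumeration
    over gamma in Z_{>=0}^k with |gamma| <= n - k."""
    n = len(x)
    total = 0.0
    for gamma in product(range(n - k + 1), repeat=k):
        if sum(gamma) > n - k:
            continue
        # positions gamma_1, gamma_1+gamma_2+1, ..., gamma_1+...+gamma_k+(k-1)
        pos = [sum(gamma[:i + 1]) + i for i in range(k)]
        term = f(*(x[p] for p in pos))
        for g, z in zip(gamma, xi):
            term *= z ** g
        total += term
    return total

def SW(x, w, zeta):
    """Subword polynomial of Definition 3, by enumerating gap vectors alpha."""
    total, k = 0.0, len(w)
    for alpha in product(range(len(x) - k + 1), repeat=k - 1):
        if sum(alpha) > len(x) - k:
            continue
        pattern = [w[0]]
        for a, bit in zip(alpha, w[1:]):
            pattern += [None] * a + [bit]
        hits = sum(all(p is None or p == x[i + j] for j, p in enumerate(pattern))
                   for i in range(len(x) - len(pattern) + 1))
        total += hits * zeta ** sum(alpha)
    return total

x = [1, 0, 1, 1, 0]

# k = 1 with the identity function recovers the [DOS17] polynomial sum_i x_i xi^i.
xi0 = 0.3
assert abs(P(x, lambda b: b, 1, [xi0])
           - sum(x[i] * xi0 ** i for i in range(len(x)))) < 1e-12

# k = 2 with f the indicator of w: P_{x,f}(1, zeta) equals SW_{x,w}(zeta).
w, zeta = [1, 1], 0.25
f = lambda b0, b1: 1.0 if [b0, b1] == w else 0.0
assert abs(P(x, f, 2, [1.0, zeta]) - SW(x, w, zeta)) < 1e-12
```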
We call $P_{x,f}$ the $(x,f)$-deletion-channel polynomial because by choosing $k = 1$ and $f : \{0,1\} \to \{0,1\}$ to be the 1-bit identity function $\mathrm{id}(x) = x$, we have that

$$P_{x,\mathrm{id}}(\xi) = \sum_{i=0}^{n-1} x_i \xi^i$$

is the deletion-channel polynomial defined in [DOS17].

The next theorem shows that under a change of variables, the coefficients of $P_{x,f}$ with respect to the new variables can be expressed in terms of the expectation of $f$ over traces of $x$ drawn from the deletion channel. We state it and then show that Theorem 5 follows as a direct corollary.

Theorem 6. For any $\xi \in \mathbb{C}^k$, we have

$$P_{x,f}(\xi) = \frac{1}{(1-\delta)^k} \sum_{\substack{\beta \in \mathbb{Z}_{\ge 0}^k \\ |\beta| \le n-k}} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[f\big(y_{\beta_1},\, \ldots,\, y_{\beta_1+\cdots+\beta_k+k-1}\big)\Big] \cdot \left(\frac{\xi - \vec{\delta}}{1-\delta}\right)^\beta.$$

Proof of Theorem 5 assuming Theorem 6. Given $x \in \{0,1\}^n$ and $w \in \{0,1\}^k$ for some $k \in [n]$ as in the statement of Theorem 5, we take $f : \{0,1\}^k \to \{0,1\}$ to be the indicator function of $w$:

$$f(b_1, b_2, \ldots, b_k) = \mathbb{1}\big[(b_1, b_2, \ldots, b_k) = w\big].$$

Using this $f$ we get the following connection between $\mathrm{SW}_{x,w}(\zeta)$ and $P_{x,f}(1, \zeta, \zeta, \ldots, \zeta)$:

$$\mathrm{SW}_{x,w}(\zeta) = \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, x) \cdot \zeta^{|\alpha|} = \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \; \sum_{i=0}^{n-k-|\alpha|} f\big(x_i,\, x_{i+\alpha_1+1},\, x_{i+\alpha_1+\alpha_2+2},\, \ldots,\, x_{i+|\alpha|+k-1}\big)\, 1^i\, \zeta^{|\alpha|} = P_{x,f}(1, \zeta, \zeta, \ldots, \zeta).$$

Applying Theorem 6 to $P_{x,f}(1, \zeta, \zeta, \ldots, \zeta)$, we have

$$\mathrm{SW}_{x,w}(\zeta) = \frac{1}{(1-\delta)^k} \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \; \sum_{i=0}^{n-k-|\alpha|} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[f\big(y_i,\, y_{i+\alpha_1+1},\, \ldots,\, y_{i+|\alpha|+k-1}\big)\Big] \cdot \left(\frac{\zeta-\delta}{1-\delta}\right)^{|\alpha|} = \frac{1}{(1-\delta)^k} \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[\#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)\Big] \cdot \left(\frac{\zeta-\delta}{1-\delta}\right)^{|\alpha|},$$

where the last step follows by linearity of expectation. This concludes the proof of Theorem 5.

We now prove Theorem 6. The high-level idea is to relate the expectation of $f$ over traces of $x$ drawn from the deletion channel to partial derivatives of the polynomial $P_{x,f}$ at $\vec{\delta}$, and then apply Taylor's expansion to $P_{x,f}$ at the point $\vec{\delta}$.

Claim 5.2. Let $\beta \in \mathbb{Z}_{\ge 0}^k$ with $|\beta| \le n-k$. We have

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[f\big(y_{\beta_1},\, \ldots,\, y_{\beta_1+\cdots+\beta_k+(k-1)}\big)\Big] = (1-\delta)^k \cdot \frac{(1-\delta)^{|\beta|}}{\beta!} \cdot D^\beta P_{x,f}(\vec{\delta}).$$

To get some intuition, consider the special case of $k = 1$ (so $P_{x,f}$ is univariate) and $f = \mathrm{id}$. Then it is straightforward to verify that

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\big[y_0\big] = (1-\delta) \sum_{i=0}^{n-1} x_i \delta^i = (1-\delta) \cdot P_{x,\mathrm{id}}(\delta),$$

and

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\big[y_1\big] = (1-\delta) \sum_{i=1}^{n-1} x_i \binom{i}{1} (1-\delta)\delta^{i-1} = (1-\delta)^2 \sum_{i=1}^{n-1} x_i\, i\, \delta^{i-1} = (1-\delta)^2 \cdot D^1 P_{x,\mathrm{id}}(\delta).$$

Proof of Claim 5.2. For a fixed $\gamma \in \mathbb{Z}_{\ge 0}^k$ with $|\gamma| \le n-k$, we write $\gamma \to \beta$, or equivalently $(\gamma_1, \gamma_2, \ldots, \gamma_k) \to (\beta_1, \beta_2, \ldots, \beta_k)$, to denote the event that the bits of $x$ in positions $\gamma_1,\, \gamma_1+\gamma_2+1,\, \ldots,\, \gamma_1+\cdots+\gamma_k+(k-1)$ become the bits of $y \sim \mathrm{Del}_\delta(x)$ in positions $\beta_1,\, \beta_1+\beta_2+1,\, \ldots,\, \beta_1+\cdots+\beta_k+(k-1)$ respectively. For this to occur, each of these $k$ bits of $x$ must be present in $y$, which happens with probability $(1-\delta)^k$. Further, for each of these bits to land in the right position of $y$, exactly $\beta_i$ out of the $\gamma_i$ bits of $x$ lying strictly between the $(i-1)$-st and $i$-th of these marked positions must be retained. So, the probability of this event is

$$\Pr[\gamma \to \beta] = (1-\delta)^k \prod_{i=1}^k \binom{\gamma_i}{\beta_i}(1-\delta)^{\beta_i}\delta^{\gamma_i-\beta_i} = (1-\delta)^k \prod_{i=1}^k \frac{\gamma_i(\gamma_i-1)\cdots(\gamma_i-\beta_i+1)}{\beta_i!}(1-\delta)^{\beta_i}\delta^{\gamma_i-\beta_i} = (1-\delta)^k \cdot \left(\prod_{i=1}^k \frac{(1-\delta)^{\beta_i}}{\beta_i!}\right) \cdot \prod_{i=1}^k \Big(\gamma_i(\gamma_i-1)\cdots(\gamma_i-\beta_i+1)\cdot\delta^{\gamma_i-\beta_i}\Big) = (1-\delta)^k \cdot \frac{(1-\delta)^{|\beta|}}{\beta!} \cdot \prod_{i=1}^k \frac{d^{\beta_i}}{d\delta^{\beta_i}}\,\delta^{\gamma_i}. \qquad (3)$$

As a result, we have that

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[f\big(y_{\beta_1},\, \ldots,\, y_{\beta_1+\cdots+\beta_k+(k-1)}\big)\Big] = \sum_{|\gamma| \le n-k} f\big(x_{\gamma_1},\, \ldots,\, x_{\gamma_1+\cdots+\gamma_k+(k-1)}\big) \cdot \Pr[\gamma \to \beta] = (1-\delta)^k \cdot \frac{(1-\delta)^{|\beta|}}{\beta!} \sum_{|\gamma| \le n-k} f\big(x_{\gamma_1},\, \ldots,\, x_{\gamma_1+\cdots+\gamma_k+(k-1)}\big) \cdot \prod_{i=1}^k \frac{d^{\beta_i}}{d\delta^{\beta_i}}\,\delta^{\gamma_i} \quad \text{(Equation (3))} = (1-\delta)^k \cdot \frac{(1-\delta)^{|\beta|}}{\beta!} \cdot D^\beta P_{x,f}(\vec{\delta}).$$

This finishes the proof of Claim 5.2.

Proof of Theorem 6. Since $P_{x,f}$ is a polynomial of degree at most $n-k$, applying Taylor's expansion to $P_{x,f}$ at the point $\vec{\delta}$ and using Claim 5.2, we get that

$$(1-\delta)^k \cdot P_{x,f}(\xi) = (1-\delta)^k \sum_{|\beta| \le n-k} \frac{D^\beta P_{x,f}(\vec{\delta})}{\beta!} \cdot (\xi - \vec{\delta})^\beta = \sum_{|\beta| \le n-k} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[f\big(y_{\beta_1},\, \ldots,\, y_{\beta_1+\cdots+\beta_k+k-1}\big)\Big] \cdot \left(\frac{\xi - \vec{\delta}}{1-\delta}\right)^\beta.$$

6 Multiplicity$'_{\text{large-}\delta}$: An algorithm for any deletion rate $\delta < 1$

In this section we prove a weaker version of Theorem 3, giving an algorithm that works for any deletion rate $\delta < 1$ but whose $n^{O(k)}$ complexity is superpolynomial when $k \approx \log n$ (as will be the case in our ultimate application):

Theorem 7. Let $0 < \tau, \delta < 1$. There is an algorithm Multiplicity$'_{\text{large-}\delta}$ which takes as input a string $w \in \{0,1\}^k$ and access to independent traces of an unknown source string $x \in \{0,1\}^n$. Multiplicity$'_{\text{large-}\delta}$ runs in $\big(n^{1/(1-\delta)}/(1-\delta)\big)^{O(k)} \log(1/\tau)$ time and uses $\big(n^{1/(1-\delta)}/(1-\delta)\big)^{O(k)} \log(1/\tau)$ many traces from $\mathrm{Del}_\delta(x)$, and has the following property: For any unknown source string $x \in \{0,1\}^n$, with probability at least $1-\tau$ the output of Multiplicity$'_{\text{large-}\delta}$ is $\#(w,x)$, the multiplicity of $w$ in $\mathrm{subword}(x,k)$ (equivalently, the value $\mathrm{SW}_{x,w}(0)$).

Looking ahead, in Section 7 we will build on the proof of Theorem 7 to give a stronger version that has polynomial running time and sample complexity when $k \approx \log n$.

The following result is central to our analysis. Informally, it says that if $q$ is a polynomial with "not-too-large" coefficients and a constant term which is bounded away from $\mathrm{SW}_{x,w}(0)$ by at least $1/2$, then $q$ must "differ noticeably" from $\mathrm{SW}_{x,w}$ over a particular interval.
(Looking ahead, for our purposes it is crucially important that this interval corresponds to a range of deletion probabilities for which it is easy to estimate the polynomial's value given access to traces drawn from $\mathrm{Del}_\delta(x)$.)

Theorem 8. Fix strings $x \in \{0,1\}^n$, $w \in \{0,1\}^k$ for some $k \in [n]$. Let $q(z) = \sum_{\ell=0}^{n-k} q_\ell z^\ell$ be any polynomial such that $|\mathrm{SW}_{x,w}(0) - q(0)| \ge 1/2$, and $0 \le q_\ell \le n^k$ for all $\ell \in \{0, 1, \ldots, n-k\}$. Then

$$\sup_{\zeta \in [\delta,\, (\delta+1)/2]} \big|\mathrm{SW}_{x,w}(\zeta) - q(\zeta)\big| \ge n^{-O(k/(1-\delta))}, \quad \text{for any } \delta \in (0,1). \qquad (4)$$

Theorem 8 is an easy consequence of the following more general theorem:

Theorem 9. Let $1 \le n \le m$. Let $p(z) = \sum_{\ell=0}^n p_\ell z^\ell$ be a polynomial of degree at most $n$ with real coefficients such that $|p_0| \ge 1/2$, and $|p_\ell| \le m$ for all $\ell$. Then we have

$$\sup_{\zeta \in [\delta,\, (\delta+1)/2]} |p(\zeta)| \ge m^{-O(1/(1-\delta))}, \quad \text{for any } \delta \in (0,1). \qquad (5)$$

To obtain Theorem 8 from Theorem 9, set $p = \mathrm{SW}_{x,w} - q$. By the condition of Theorem 8 we have that $|p_0| = |\mathrm{SW}_{x,w}(0) - q(0)| \ge 1/2$. Writing $(\mathrm{SW}_{x,w})_\ell$ for the degree-$\ell$ coefficient of $\mathrm{SW}_{x,w}$, from the discussion following Definition 3 it is immediate that $0 \le (\mathrm{SW}_{x,w})_\ell \le \binom{n}{k} \le n^k$, and hence $|p_\ell| = |(\mathrm{SW}_{x,w})_\ell - q_\ell| \le n^k$. Thus we can invoke Theorem 9 with $m = n^k$ to obtain Theorem 8.

In Section 6.1 we present and analyze the algorithm Multiplicity$'_{\text{large-}\delta}$ (which is based on linear programming) and prove Theorem 7 assuming Theorem 8. The proof of Theorem 9, which is based on complex analysis, is given in Section 6.2.

6.1 The algorithm Multiplicity$'_{\text{large-}\delta}$ and its analysis

Estimating $\mathrm{SW}_{x,w}(\delta')$ for $\delta' \ge \delta$. The following easy lemma gives an unbiased estimator for $\mathrm{SW}_{x,w}(\delta')$, for all $\delta' \ge \delta$, given traces from $\mathrm{Del}_\delta(x)$.

Lemma 6.1. Let $x \in \{0,1\}^n$, $w \in \{0,1\}^k$ and let $\varepsilon > 0$. Then, given traces $y \sim \mathrm{Del}_\delta(x)$, there exists an algorithm which, for any $\delta' \in [\delta, 1)$, evaluates $\mathrm{SW}_{x,w}(\delta')$ up to error $\pm\varepsilon$ with success probability at least $1-\tau$. The algorithm takes

$$n^{O(1)} \cdot \left(\frac{1}{1-\delta'}\right)^{O(k)} \cdot \frac{1}{\varepsilon^2} \cdot \log\left(\frac{1}{\tau}\right)$$

many traces and running time.

Proof. First of all, observe that given $y \sim \mathrm{Del}_\delta(x)$, we can sample $y \sim \mathrm{Del}_{\delta'}(x)$ for any $\delta' \ge \delta$ with no overhead (by further deleting each surviving bit independently with probability $(\delta'-\delta)/(1-\delta)$). Next, observe that the expected number of occurrences of $w$ in a random trace $y \sim \mathrm{Del}_{\delta'}(x)$ is given by

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_{\delta'}(x)}\big[\#(w,y)\big] = \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, x) \cdot (\delta')^{|\alpha|} \cdot (1-\delta')^k.$$

This follows from the fact that every occurrence of $w$ as a subword of a trace $y$ can be uniquely identified with a tuple of positions $i_1 < \cdots < i_k$ in $x$ such that (i) $x_{i_j} = w_j$ for each $j \in [k]$, (ii) positions $i_1, \ldots, i_k$ are not deleted in $y$, and (iii) every position in $[i_1, i_k] \setminus \{i_1, \ldots, i_k\}$ is deleted in the trace $y$.

However, by Definition 3, it follows that

$$\mathop{\mathbb{E}}_{y \sim \mathrm{Del}_{\delta'}(x)}\big[\#(w,y)\big] = \mathrm{SW}_{x,w}(\delta') \cdot (1-\delta')^k. \qquad (6)$$

Now for any $y \sim \mathrm{Del}_{\delta'}(x)$, $\#(w,y)$ is an integer between 0 and $n$. Thus, the standard empirical estimator will use

$$n^{O(1)} \cdot \left(\frac{1}{1-\delta'}\right)^{O(k)} \cdot \frac{1}{\varepsilon^2} \cdot \log\left(\frac{1}{\tau}\right)$$

many traces and running time and return an estimate of $\mathbb{E}_{y \sim \mathrm{Del}_{\delta'}(x)}[\#(w,y)]$ up to $\pm\varepsilon \cdot (1-\delta')^k$. Using (6), we get the claim.

The Multiplicity$'_{\text{large-}\delta}$ algorithm and its analysis. We present the algorithm Multiplicity$'_{\text{large-}\delta}$ in Figure 1. For its correctness we first observe that with probability at least $1-\tau$, we have that

for every $\zeta \in S$, $\quad \big|\widehat{\mathrm{SW}}_{x,w}(\zeta) - \mathrm{SW}_{x,w}(\zeta)\big| \le \kappa/8$.

Inputs: $w \in \{0,1\}^k$; access to independent traces drawn from $\mathrm{Del}_\delta(x)$ for an unknown string $x \in \{0,1\}^n$; error parameter $\tau \in (0,1)$.
Output: $\#(w,x)$ or "fail".
Algorithm description:
1. Let $\kappa := n^{-O(k/(1-\delta))}$ be the RHS of Equation (4) in Theorem 8, let $\Delta := \kappa/(2n^{k+2})$, and let $S := \{\delta,\, \delta+\Delta,\, \delta+2\Delta,\, \ldots,\, \delta+L\Delta\}$ such that $L$ is the largest integer with $\delta + L\Delta \le (\delta+1)/2$. (Note that $|S| = O(1/\Delta)$.)
2. For each $\zeta \in S$, compute the empirical estimate $\widehat{\mathrm{SW}}_{x,w}(\zeta)$ of $\mathrm{SW}_{x,w}(\zeta)$ up to accuracy $\kappa/8$ with confidence $1 - \tau/|S|$ using Lemma 6.1. (We reuse traces from $\mathrm{Del}_\delta(x)$ for each $\zeta \in S$.)
3. Set up a linear program as follows:
(a) Variables are $q_0, \ldots, q_{n-k} \in [0, n^k]$.
(b) Constraints are: For each $\zeta \in S$,
$$\left|\sum_{\ell=0}^{n-k} q_\ell \zeta^\ell - \widehat{\mathrm{SW}}_{x,w}(\zeta)\right| \le \kappa/4.$$
4. Return "fail" if the above linear program has no solution.
5. Otherwise solve the linear program and return the nearest integer to $q_0$.

Figure 1: Description of the algorithm Multiplicity$'_{\text{large-}\delta}$.

We finish the proof by showing that when this happens, the linear program in lines 3(a) and 3(b) is feasible, and furthermore, $|q_0 - \mathrm{SW}_{x,w}(0)| < 1/2$ for any feasible solution $(q_0, \ldots, q_{n-k})$ (when this happens, the closest integer to $q_0$ is exactly $\mathrm{SW}_{x,w}(0)$).

To see that the linear program is feasible, we let $p_0, \ldots, p_{n-k}$ denote the coefficients of $\mathrm{SW}_{x,w}$, so $\mathrm{SW}_{x,w}(\zeta) = \sum_{\ell=0}^{n-k} p_\ell \zeta^\ell$. From the discussion after Theorem 9, every $p_\ell$ lies between 0 and $n^k$. As a result, $p_0, \ldots, p_{n-k}$ is a feasible solution to the linear program because for every $\zeta \in S$,

$$\left|\sum_{\ell=0}^{n-k} p_\ell \zeta^\ell - \widehat{\mathrm{SW}}_{x,w}(\zeta)\right| = \big|\mathrm{SW}_{x,w}(\zeta) - \widehat{\mathrm{SW}}_{x,w}(\zeta)\big| \le \kappa/8 \le \kappa/4.$$

Next we let $q_0, \ldots, q_{n-k}$ be any feasible solution to the linear program and assume for a contradiction that $|q_0 - \mathrm{SW}_{x,w}(0)| \ge 1/2$. Let $q$ be the polynomial $q(\zeta) = \sum_{\ell=0}^{n-k} q_\ell \zeta^\ell$. Given that $0 \le q_\ell \le n^k$ for every $\ell$ (as required by the linear program), Theorem 8 implies (using the choice of $\kappa$ in line 1 of the algorithm) that

$$\sup_{\zeta \in [\delta,\, (\delta+1)/2]} \big|\mathrm{SW}_{x,w}(\zeta) - q(\zeta)\big| \ge \kappa. \qquad (7)$$

The following claim (applied with $s = \mathrm{SW}_{x,w} - q$ and $m = n^k$) then shows that there exists a $\zeta \in S$ such that

$$\big|\mathrm{SW}_{x,w}(\zeta) - q(\zeta)\big| \ge \kappa/2,$$

a contradiction to the assumption that $q_0, \ldots, q_{n-k}$ is a feasible solution, because

$$\left|\sum_{\ell=0}^{n-k} q_\ell \zeta^\ell - \widehat{\mathrm{SW}}_{x,w}(\zeta)\right| = \big|q(\zeta) - \widehat{\mathrm{SW}}_{x,w}(\zeta)\big| \ge \big|q(\zeta) - \mathrm{SW}_{x,w}(\zeta)\big| - \big|\mathrm{SW}_{x,w}(\zeta) - \widehat{\mathrm{SW}}_{x,w}(\zeta)\big| \ge \kappa/2 - \kappa/8 > \kappa/4.$$

Claim 6.2 (Searching over $S$ suffices). Let $s(t) = s_0 + s_1 t + \cdots + s_n t^n$ be a polynomial such that every coefficient $s_\ell$ has $|s_\ell| \le m$. Suppose $|s(t)| \ge \kappa$ for some $t \in [\delta, (\delta+1)/2]$. Then there exists an integer $j$ such that $t' = \delta + j\Delta \in [\delta, (\delta+1)/2]$ and $|s(t')| \ge \kappa/2$, where $\Delta = \kappa/(2mn^2)$.

Proof. Let $j$ be an integer such that $t' := \delta + j\Delta \in [\delta, (\delta+1)/2]$ and $|t' - t| \le \Delta$. Since $|t| \le 1$ and $|t'| \le 1$, for each $\ell \in \{1, \ldots, n\}$ we have that

$$\big|t'^\ell - t^\ell\big| \le |t' - t| \cdot \sum_{i=0}^{\ell-1} \big|t'^i\, t^{\ell-1-i}\big| \le \Delta\ell \le \Delta n.$$

Since $|s_\ell| \le m$ and $\Delta = \kappa/(2mn^2)$, we have

$$\big|s_\ell t'^\ell - s_\ell t^\ell\big| = |s_\ell| \cdot \big|t'^\ell - t^\ell\big| \le mn\Delta = \kappa/(2n).$$

Therefore

$$\big|s(t') - s(t)\big| \le \sum_{\ell=1}^n \big|s_\ell t'^\ell - s_\ell t^\ell\big| \le \kappa/2.$$

It follows from the triangle inequality that $|s(t')| \ge |s(t)| - |s(t') - s(t)| \ge \kappa/2$.

Finally we bound the complexity of the algorithm. For every $\zeta \in S$, we have $1 - \zeta \ge (1-\delta)/2$. By Lemma 6.1, the sample complexity is

$$n^{O(1)} \cdot \left(\frac{2}{1-\delta}\right)^{O(k)} \cdot \frac{1}{\kappa^2} \cdot \log\left(\frac{|S|}{\tau}\right) = \left(\frac{n^{1/(1-\delta)}}{1-\delta}\right)^{O(k)} \log\left(\frac{1}{\tau}\right). \qquad (8)$$

The running time of the algorithm is (8) multiplied by $|S|$, plus the time needed to solve the linear program.
The former can still be bounded by the same expression as on the RHS of (8) above. The latter can be bounded by $\mathrm{poly}(n)$ multiplied by the number of bits needed to describe the linear program, which can also be bounded by the RHS of (8). This proves the claimed upper bounds on the running time and sample complexity, and concludes the proof of Theorem 7 assuming Theorem 8.

6.2 Proof of Theorem 9

In this subsection we prove Theorem 9. For convenience we define $\rho := 1-\delta \in (0,1)$ and restate Theorem 9 in terms of $\rho$:

Restatement of Theorem 9: Let $1 \le n \le m$. Let $p(z) = \sum_{i=0}^n p_i z^i$ be a polynomial of degree at most $n$ with real coefficients such that $|p_0| \ge 1/2$, and $|p_i| \le m$ for all $i$. Then for any $\rho \in (0,1)$,

$$\sup_{\zeta \in [1-\rho,\, 1-\rho/2]} |p(\zeta)| \ge m^{-O(1/\rho)}.$$

The proof uses the Hadamard three-circle theorem, along with other standard results in complex analysis. Consider the mapping $w : \mathbb{C} \to \mathbb{C}$ given by

$$w(z) = 1 - \frac{3\rho}{4} + \frac{\rho}{8}\left(z + \frac{1}{z}\right).$$

We observe that the map $w(z)$ is meromorphic with only one pole, at $z = 0$. Define radii

$$r_1 = 1; \qquad r_2 = 2; \qquad r_3 = 4.$$

For $i = 1, 2, 3$, let $C_i \subset \mathbb{C}$ be the circle centered at the origin with radius $r_i$. Consider the map $f : \mathbb{C} \to \mathbb{C}$ given by $f(z) = p(w(z))$. Like $w(\cdot)$, $f$ is meromorphic with only one pole, at $z = 0$. The idea of the proof is to use the Hadamard three-circle theorem [Wik20a] on $f$, which tells us that

$$2\log\left(\sup_{z \in C_2} |f(z)|\right) \le \log\left(\sup_{z \in C_1} |f(z)|\right) + \log\left(\sup_{z \in C_3} |f(z)|\right). \qquad (9)$$

Now, we will analyze each term in the above inequality. We first record some facts about the behaviour of $w$ over each circle $C_i$ that are immediate from the definition:

Fact 6.3. Let $w$, $C_1$, $C_2$ and $C_3$ be as defined above.
(1) When $z$ ranges over $C_1$, $w(z)$ ranges over the real line segment $[1-\rho,\, 1-\rho/2]$.
(2) When $z$ ranges over $C_2$, $w(z)$ ranges over the ellipse $E_2$ in the complex plane which is centered at the real value $1 - 3\rho/4$ and is the locus of all points $z = x + iy$ satisfying
$$\left(\frac{x - (1-3\rho/4)}{5\rho/16}\right)^2 + \left(\frac{y}{3\rho/16}\right)^2 = 1.$$
(3) Similarly, when $z \in C_3$, $w(z)$ ranges over the ellipse $E_3$ in the complex plane which is centered at the real value $1 - 3\rho/4$ and is the locus of all points $z = x + iy$ satisfying
$$\left(\frac{x - (1-3\rho/4)}{17\rho/64}\right)^2 + \left(\frac{y}{15\rho/64}\right)^2 = 1.$$
Moreover, the ellipse $E_3$ is completely contained in the unit disk $B_1(0)$.

Equation (9) will be useful to us because of the following simple claim, which is immediate from Fact 6.3, Item (1):

Claim 6.4. $\sup_{z \in C_1} |f(z)| = \sup_{\zeta \in [1-\rho,\, 1-\rho/2]} |p(\zeta)|$.

Given Equation (9) and Claim 6.4, in order to lower bound $\sup_{\zeta \in [1-\rho,\, 1-\rho/2]} |p(\zeta)|$, it suffices to upper bound $\sup_{z \in C_3} |f(z)|$ and to lower bound $\sup_{z \in C_2} |f(z)|$. We do this in the following claims:

Claim 6.5. $\sup_{z \in C_3} |f(z)| \le m \cdot (n+1)$.

Proof. By Fact 6.3, Item (3) above, we have $E_3 \subseteq B_1(0)$ and so

$$\sup_{z \in C_3} |f(z)| = \sup_{z \in E_3} |p(z)| \le \sup_{z \in B_1(0)} |p(z)|.$$

The bounds on the coefficients of $p$ immediately imply that $\sup_{z \in B_1(0)} |p(z)| \le m \cdot (n+1)$.

Claim 6.6. $\sup_{z \in C_2} |f(z)| \ge m^{-O(1/\rho)}$.

Proof. Applying Jensen's formula [Wik20b] to $p$ on the closed origin-centered disk of radius $1-3\rho/4$ gives

$$\mathop{\mathbb{E}}_z\big[\ln |p(z)|\big] \ge \ln |p(0)| \ge \ln(1/2) = -\ln 2. \qquad (10)$$

Here $z$ is taken to be a uniform random point on the circle $C$ of radius $1-3\rho/4$. Define the arc

$$\mathcal{A} = \big\{z \in \mathbb{C} : |z| = 1-3\rho/4 \text{ and } |\arg(z)| \le 3\rho/16\big\}.$$

Let $c_{\max,\mathcal{A}} = \max_{z \in \mathcal{A}} |p(z)|$ and $\theta^* = 3\rho/16$ (note that $\theta^*/\pi$ is the fraction of $C$ that lies in $\mathcal{A}$). Now since $|p(z)| \le m(n+1)$ for all $z \in B_{1-3\rho/4}(0) \setminus \mathcal{A}$ (because of the coefficient bound on $p$), we have by Equation (10) that

$$-\ln 2 \le \left(1 - \frac{\theta^*}{\pi}\right)\ln\big(m(n+1)\big) + \frac{\theta^*}{\pi} \cdot \ln c_{\max,\mathcal{A}} \le \ln\big(m(n+1)\big) + \frac{\theta^*}{\pi} \cdot \ln c_{\max,\mathcal{A}}.$$

Thus,

$$\ln c_{\max,\mathcal{A}} \ge \frac{-\pi \cdot \ln\big(2m(n+1)\big)}{\theta^*}, \quad \text{and hence} \quad c_{\max,\mathcal{A}} \ge \big(2m(n+1)\big)^{-\pi/\theta^*}.$$

Next, we observe that the arc $\mathcal{A}$ lies entirely in the interior of the ellipse $E_2$. (To see this, observe that the center of the arc is the real value $1-3\rho/4$, which coincides with the center of the ellipse, and that every point on the arc is within distance less than $3\rho/16$ from this common center. Since $3\rho/16$ is the length of the semi-minor axis of the ellipse, it follows that every point of the arc lies within the ellipse.) We further recall that $m \ge n$ and that $\theta^* = \Theta(\rho)$. Using these facts along with the maximum modulus principle and Fact 6.3, Item (2), we conclude that

$$\sup_{z \in C_2} |f(z)| = \sup_{z \in E_2} |p(z)| \ge \sup_{z \in \mathcal{A}} |p(z)| = c_{\max,\mathcal{A}} \ge m^{-O(1/\rho)},$$

and Claim 6.6 is proved.

Proof of Theorem 9. We combine Claims 6.4, 6.5 and 6.6 in Equation (9) to get that

$$\log \sup_{\zeta \in [1-\rho,\, 1-\rho/2]} |p(\zeta)| = \log \sup_{z \in C_1} |f(z)| \ge 2\log \sup_{z \in C_2} |f(z)| - \log \sup_{z \in C_3} |f(z)| \ge -O(1/\rho)\log m - \log\big(m(n+1)\big) \ge -O(1/\rho)\log m.$$

Exponentiating both sides finishes the proof of Theorem 9.

7 Improved algorithms

In this section we give improved algorithms strengthening the quantitative bounds given in Theorem 4 and Theorem 7, and thereby complete the proof of Theorem 3.

First we describe the main ideas underlying the improved algorithms. Both algorithms benefit from the same insights, so we will just describe the improvement of Theorem 7 in this overview. Recall the definition of the subword polynomial $\mathrm{SW}_{x,w}$ from Definition 3:

$$\mathrm{SW}_{x,w}(\zeta) := \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \#(w_0 \ast^{\alpha_1} w_1 \ast^{\alpha_2} w_2 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, x) \cdot \zeta^{|\alpha|}.$$
Grouping terms of the same degree together, we can write it as $\mathrm{SW}_{x,w}(\zeta) = \sum_{\ell \ge 0} \gamma_\ell \zeta^\ell$, where

$$\gamma_\ell = \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| = \ell}} \#(w_0 \ast^{\alpha_1} w_1 \ast^{\alpha_2} w_2 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, x)$$

is the degree-$\ell$ coefficient, for each $0 \le \ell \le n-k$. In the proofs of Corollary 5.1 in Section 5 and Theorem 8 in Section 6, we bounded these coefficients uniformly by $m = n^k$. The first insight is that in fact a sharper bound holds for these coefficients: specifically, we have

$$0 \le \gamma_\ell \le m_\ell := n\binom{\ell+k-2}{k-2}. \qquad (11)$$

This is simply because there are at most $n$ choices for the position of the first character $w_0$ in $x$, and there are $\binom{\ell+k-2}{k-2}$ ways to choose a tuple of non-negative integers $\alpha_1, \ldots, \alpha_{k-1}$ that sum to $\ell$. The second insight is that since our approaches only involve evaluating $\mathrm{SW}_{x,w}(\zeta)$ on non-negative real inputs $\zeta$ that are bounded below 1, we can exploit this improved coefficient bound to truncate the high-degree portion of the polynomial; working with the resulting (much) lower-degree polynomial leads to an overall gain in efficiency.

To explain this in more detail, we need the following definition:

Definition 5. Let $p(\zeta) = \sum_{\ell=0}^n p_\ell \zeta^\ell$ be a univariate polynomial of degree at most $n$. For $d \in \{0, 1, \ldots, n\}$, we define the $d$-low-degree part of $p$ (denoted $p^{\le d}$) to be

$$p^{\le d}(\zeta) = \sum_{\ell=0}^d p_\ell \zeta^\ell.$$

Analogously, we define the $d$-high-degree part of $p$ to be $p^{>d}(\zeta) := \sum_{\ell > d} p_\ell \zeta^\ell = p(\zeta) - p^{\le d}(\zeta)$.

Recall that in the analysis of Section 6, the crux was to rule out any polynomial $q$ with a constant term which is an integer different from $\mathrm{SW}_{x,w}(0)$. In order for $q$ to be a polynomial that could possibly arise from the $k$-subword deck of some string $z \in \{0,1\}^n$, it must also have coefficients bounded by the right-hand side of Equation (11). Using these sharper bounds on the coefficients, we show that there exists a threshold degree $d$ that is roughly $O(k + \log n)$ (we ignore the dependence on $\delta$ in this overview; see (12) and (16) for the exact choices of $d$) such that

• The $d$-low-degree parts of the polynomials $\mathrm{SW}_{x,w}$ and $q$ must differ by at least
$$\left(n\left(\frac{2}{1-\delta}\right)^k\right)^{-O(1/(1-\delta))}$$
(see Equation (17)) at some point in the interval $[\delta, (\delta+1)/2]$. This is larger than the $\approx n^{-O(k/(1-\delta))}$ lower bound established in Theorem 8, which leads to savings on both time and sample complexity.

• The maximum value that the high-degree part of such polynomials attains on the relevant interval is negligible compared to the difference specified above.

Combining these two facts enables us to carry out our analysis just on the $d$-low-degree part, which has much smaller coefficients and thereby admits a more efficient algorithm.

In Section 7.1, we implement these ideas to strengthen Theorem 4 when $\delta < 1/2$. In Section 7.2, we do the same to derive a stronger analogue of Theorem 8, which reduces the sample complexity of computing $\#(w,x)$ for general $\delta < 1$ and gives an algorithm for computing $\#(w,x)$ which is faster than the corresponding algorithm in Section 6.1.

7.1 The case $\delta < 1/2$

In this subsection we strengthen Theorem 4 for deletion rate $\delta < 1/2$:

Theorem 10. Let $0 < \delta < 1/2$. There is an algorithm Multiplicity$_{\text{small-}\delta}$ which takes as input a string $w \in \{0,1\}^k$, access to independent traces of an unknown source string $x \in \{0,1\}^n$, and a parameter $\tau > 0$. Multiplicity$_{\text{small-}\delta}$ draws $\mathrm{poly}(n) \cdot (1/2-\delta)^{-O(k)} \cdot \log(1/\tau)$ traces from $\mathrm{Del}_\delta(x)$, runs in time $\mathrm{poly}(n) \cdot (1/2-\delta)^{-O(k)} \cdot \log(1/\tau)$, and has the following property: For any unknown source string $x \in \{0,1\}^n$, with probability at least $1-\tau$ the output of Multiplicity$_{\text{small-}\delta}$ is the multiplicity of $w$ in $\mathrm{subword}(x,k)$ (i.e. the number of occurrences of $w$ as a subword of $x$).

Recall Theorem 5, which relates the value of the subword polynomial at any point $\zeta \in \mathbb{C}$ to traces drawn from the deletion channel:

$$\mathrm{SW}_{x,w}(\zeta) = \frac{1}{(1-\delta)^k} \sum_{\substack{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} \\ |\alpha| \le n-k}} \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[\#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)\Big] \cdot \left(\frac{\zeta-\delta}{1-\delta}\right)^{|\alpha|}.$$

As in Section 5, our goal is to evaluate $\mathrm{SW}_{x,w}(0) = \#(w,x)$ up to error $1/3$. It will be convenient to make the change of variables $\xi = (\zeta-\delta)/(1-\delta)$, so that $\zeta = \delta + \xi(1-\delta)$. Consider the polynomial $p$ defined as follows:

$$p(\xi) := (1-\delta)^k \cdot \mathrm{SW}_{x,w}\big(\delta + \xi(1-\delta)\big).$$

We have that $\mathrm{SW}_{x,w}(0) = (1-\delta)^{-k}\, p(-\delta/(1-\delta))$, so estimating $\mathrm{SW}_{x,w}(0)$ up to error $\pm 1/3$ amounts to estimating $p(-\delta/(1-\delta))$ up to error $\pm(1-\delta)^k/3$. As $0 < \delta < 1/2$, we have $1-\delta > 1/2$, and so it suffices to estimate $p(-\delta/(1-\delta))$ up to error $2^{-k}/3$. Moreover, we have $|-\delta/(1-\delta)| = \delta/(1-\delta) < 1$. We will use these observations to bound the contribution of the high-degree part of $p$. Let $\theta = 1-2\delta$, so that $\delta/(1-\delta) \le 2\delta = 1-\theta$.

Lemma 7.1. Let $\delta < 1/2$, and let $p$ and $\theta$ be as above. Then by setting

$$d := \frac{C}{\theta}\left(k\ln\frac{C}{\theta} + \ln n\right) \qquad (12)$$

with $C = e^3$, we have

$$\sup_{|\xi| \le 1-\theta} \big|p^{>d}(\xi)\big| \le \frac{2^{-k}}{100}.$$

Before proving Lemma 7.1, we show that it implies Theorem 10.

Proof of Theorem 10 assuming Lemma 7.1. Consider $p^{\le d}$, the $d$-low-degree part of $p$, where $d$ is as given by Lemma 7.1. For all $\xi$ with $|\xi| \le 1-\theta$,

$$\big|p(\xi) - p^{\le d}(\xi)\big| = \big|p^{>d}(\xi)\big| \le \frac{2^{-k}}{100}.$$

So, by the triangle inequality, in order to estimate $p(-\delta/(1-\delta))$ up to error $\pm 2^{-k}/3$, it suffices to estimate $p^{\le d}(-\delta/(1-\delta))$ up to error $\pm 2^{-k}/5$.

Let $S_d$ be the set $\{\alpha \in \mathbb{Z}_{\ge 0}^{k-1} : |\alpha| \le d\}$. As in Section 5.1, let

$$E_\alpha := \mathop{\mathbb{E}}_{y \sim \mathrm{Del}_\delta(x)}\Big[\#(w_0 \ast^{\alpha_1} w_1 \cdots \ast^{\alpha_{k-1}} w_{k-1},\, y)\Big]$$

for each $\alpha \in S_d$. (Note that by definition, $p^{\le d}$ only includes terms $E_\alpha$ for $|\alpha| \le d$.) Then

$$p^{\le d}(\xi) = \sum_{\alpha \in S_d} E_\alpha \cdot \xi^{|\alpha|}.$$

Each $E_\alpha$ is between 0 and $n$, and using the same argument as that following Equation (11), we have

$$|S_d| = M := \sum_{\ell=0}^d \binom{\ell+k-2}{k-2} = \binom{d+k-1}{k-1} \le \binom{d+k}{k},$$

and we use the following claim to bound the right hand side:

Claim 7.2. Let $d = \frac{C}{\theta}\big(k\ln\frac{C}{\theta} + \ln n\big)$ for some $\theta \in (0,1)$ and $C \ge e^3$. Then we have

$$\binom{d+k}{k} \le n \cdot \left(\frac{C}{\theta}\right)^{3k}.$$

Proof. Using $d \ge k$ and the approximation $k!$
≥ √ πk ( k/e ) k ≥ ( k/e ) k , we have (cid:18) d + kk (cid:19) ≤ (2 d ) k ( k/e ) k = exp (cid:18) k ln 2 edk (cid:19) ≤ exp (cid:18) k (cid:18) Cθ + ln (cid:18) ln Cθ + ln nk (cid:19)(cid:19)(cid:19) ≤ exp (cid:18) k (cid:18) Cθ + ln Cθ + ln nk (cid:19)(cid:19) (13) ≤ n · (cid:18) Cθ (cid:19) k , (14)where (13) used ln a ≤ a , (14) used 2 < ln( C/θ ) since C ≥ e .Plugging in Claim 7.2, we have M ≤ n/θ O ( k ) using θ < / 2. The algorithm just draws s (to bespecified) traces y ∼ Del δ ( x ), computes an empirical estimate ˜ E α of E α for each α ∈ S d so that (cid:12)(cid:12)(cid:12) ˜ E α − E α (cid:12)(cid:12)(cid:12) ≤ . k M . with probability at least 1 − τ . This can be achieved by setting the number of traces to be s := O (cid:16)(cid:0) M k (cid:1) (cid:17) · log (cid:18) Mτ (cid:19) = (cid:16) nθ k (cid:17) O (1) · log 1 τ and a simple application of a Chernoff bound and a union bound. When this happens, it followsfrom the fact that | − δ/ (1 − δ ) | < X α ∈ S d ˜ E α · (cid:18) − δ − δ (cid:19) | α | is an estimate that deviates by at most 2 − k / 5. Combined with the observations at the beginningof the proof, this implies that we can estimate SW x,w (0) = w, x ) up to error ± / 3, and henceour output (the nearest integer to our estimate of SW x,w (0)) is w, x ) with probability at least1 − τ .The runtime is governed by the time required to compute estimates ˜ E α . We can bound it by s · n O (1) · | S d | ≤ (cid:16) nθ k (cid:17) O (1) · log 1 τ = n O (1) · (cid:18) / − δ (cid:19) O ( k ) · log 1 τ . This finishes the proof of the theorem. Proof of Lemma 7.1. We are interested in | p >d ( ξ ) | over | ξ | ≤ − θ , which is trivially bounded by | p >d ( ξ ) | ≤ n − k X ℓ = d +1 n (cid:18) ℓ + k − k − (cid:19) · (1 − θ ) ℓ ≤ n − k X ℓ = d n (cid:18) ℓ + kk (cid:19) · (1 − θ ) ℓ . First, we show that terms in the sum on the right hand side above decreases with ℓ so it suffices tobound the term with ℓ = d multiplied by n . 
To see this, observe that
\[
\frac{\binom{\ell+1+k}{k}}{\binom{\ell+k}{k}} \cdot (1-\theta) = \frac{\ell+1+k}{\ell+1} \cdot (1-\theta) \le \Big(1 + \frac{k}{\ell}\Big)(1-\theta) \le 1 + \frac{k}{\ell} - \theta < 1
\]
for $\ell > k/\theta$, which holds for all $\ell > d$ given our choice of $d$. So
\[
\sup_{|\xi| \le 1-\theta} \big|p^{>d}(\xi)\big| \le n^2 \binom{d+k}{k} (1-\theta)^{d} \le n^2 \binom{d+k}{k} e^{-\theta d}.
\]
We have $e^{-\theta d} = n^{-C} \cdot (C/\theta)^{-Ck}$, and so plugging in Claim 7.2 we have
\[
n^2 \cdot \Big(n \cdot (C/\theta)^{3k}\Big) \cdot e^{-\theta d} \le n^{3-C} \cdot (C/\theta)^{(3-C)k} \le 0.1 \cdot 2^{-k},
\]
because $3 - C \le -4$ (as $C = e^2 > 7$) and $C/\theta \ge 2e^2 > 2$. This concludes the proof of the lemma.

7.2 The case of general $\delta < 1$

Our main technical result is the following, which is a strengthening of Theorem 8:

Theorem 11. Fix $x \in \{0,1\}^n$ and $w \in \{0,1\}^k$ with $k \le n$. Let $q(z) = \sum_{\ell=0}^{n-k} q_\ell z^\ell$ be any polynomial such that $|\mathrm{SW}_{x,w}(0) - q(0)| \ge 1/2$ and $0 \le q_\ell \le m_\ell$ for all $\ell \in \{0, 1, \ldots, n-k\}$. Then for any $\delta \in (0,1)$,
\[
\sup_{\zeta \in [\delta, (\delta+1)/2]} \big|\mathrm{SW}_{x,w}(\zeta) - q(\zeta)\big| \;\ge\; \Big(\frac{1}{n}\Big(\frac{1-\delta}{2}\Big)^{k}\Big)^{O(1/(1-\delta))}. \qquad (15)
\]

Proof. Let $p(z) = \mathrm{SW}_{x,w}(z) - q(z) = \sum_{\ell=0}^{n-k} p_\ell z^\ell$. Let $c > 0$ be the constant from Theorem 9 and let $\theta = (1-\delta)/2$. We will choose the threshold on the degree to be
\[
d := \frac{C}{\theta}\Big(k \ln\frac{C}{\theta} + \ln n\Big) \qquad (16)
\]
where $C = e^2 \max(1, c)/(1-\delta)$. For this $d$, consider the $d$-low-degree part $p^{\le d}$. This is a polynomial of degree at most $d$ with $|p^{\le d}(0)| \ge 1/2$ whose $\ell$-th coefficient is bounded by
\[
\big|p^{\le d}_\ell\big| \le n \binom{\ell+k-1}{k-1} \le n \binom{d+k-1}{k-1} \le n \binom{d+k}{k}
\]
for all $\ell \le d$. We invoke Theorem 9 on $p^{\le d}$ to conclude that
\[
\sup_{\zeta \in [\delta, (\delta+1)/2]} \big|p^{\le d}(\zeta)\big| \ge \Big(n \binom{d+k}{k}\Big)^{-c/(1-\delta)}. \qquad (17)
\]
The following lemma upper bounds the contribution of the high-degree part $p^{>d}$ of $p$:

Lemma 7.3. Let $p$ and $d$ be as above.
Then
\[
\sup_{\zeta \in [\delta, (\delta+1)/2]} \big|p^{>d}(\zeta)\big| \le 0.1 \cdot \Big(n \binom{d+k}{k}\Big)^{-c/(1-\delta)}. \qquad (18)
\]
Before proving this lemma, we show that it implies Theorem 11.

Proof of Theorem 11 using Lemma 7.3. Since $p = p^{\le d} + p^{>d}$, we use Lemma 7.3 and (17) to get
\[
\sup_{\zeta \in [\delta, (\delta+1)/2]} |p(\zeta)| \ge 0.9 \cdot \Big(n \binom{d+k}{k}\Big)^{-c/(1-\delta)}.
\]
Plugging in Claim 7.2 with our choice of $d$, we have
\[
\sup_{\zeta \in [\delta, (\delta+1)/2]} |p(\zeta)| \ge 0.9 \cdot \Big(n^2 \,(C/\theta)^{3k}\Big)^{-c/(1-\delta)} \ge \Big(\frac{1}{n}\Big(\frac{1-\delta}{2}\Big)^{k}\Big)^{O(1/(1-\delta))},
\]
which concludes the proof of Theorem 11 using Lemma 7.3.

Proof of Lemma 7.3. This proof is similar to that of Lemma 7.1. First we show that the terms that can contribute to $p^{>d}(\zeta)$, when $\zeta \in [\delta, (\delta+1)/2]$, decrease beyond the degree-$d$ term of $p$:
\[
\frac{\binom{\ell+1+k}{k}}{\binom{\ell+k}{k}} \cdot |\zeta| \le \Big(1 + \frac{k}{\ell}\Big)\Big(1 - \frac{1-\delta}{2}\Big) \le 1 + \frac{k}{\ell} - \frac{1-\delta}{2} < 1
\]
for $\ell > 2k/(1-\delta)$, which holds for all $\ell > d$. So
\[
\sup_{\zeta \in [\delta,(\delta+1)/2]} \big|p^{>d}(\zeta)\big| \le n^2 \binom{d+k}{k} \Big(1 - \frac{1-\delta}{2}\Big)^{d} \le n^2 \binom{d+k}{k} \cdot \exp\Big(-\frac{(1-\delta)d}{2}\Big).
\]
It suffices to show that
\[
n^2 \binom{d+k}{k} \cdot \exp\Big(-\frac{(1-\delta)d}{2}\Big) \le 0.1 \cdot \Big(n \binom{d+k}{k}\Big)^{-\frac{c}{1-\delta}},
\]
or equivalently,
\[
10 \cdot n^{2+\frac{c}{1-\delta}} \cdot \binom{d+k}{k}^{1+\frac{c}{1-\delta}} \cdot \exp\Big(-\frac{(1-\delta)d}{2}\Big) \le 1. \qquad (19)
\]
By our choice of $d$ we have $\frac{(1-\delta)d}{2} = \theta d$, and hence
\[
\exp\Big(-\frac{(1-\delta)d}{2}\Big) = n^{-C} \cdot (C/\theta)^{-Ck}.
\]
Using Claim 7.2 again, the left-hand side of Equation (19) is at most
\[
10 \cdot n^{3+\frac{2c}{1-\delta}-C} \cdot (C/\theta)^{k\big(3+\frac{3c}{1-\delta}-C\big)} \le 1,
\]
because $3 + \frac{3c}{1-\delta} - C \le -1$ by our choice of $C = e^2\max(1,c)/(1-\delta)$, and $C/\theta \ge 2e^2$. This concludes the proof of the lemma.
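The mechanism behind Lemmas 7.1 and 7.3 (beyond degree $d$ the terms decay geometrically, so the entire high-degree tail is dominated by $n$ times the $\ell = d$ term) can be checked numerically. The sketch below is our own illustration, not the paper's code, and it hard-codes the parameter choices as reconstructed here ($\theta = (1-\delta)/2$, $C = e^2\max(1,c)/(1-\delta)$, $d$ as in (16)); those choices are assumptions of this sketch.

```python
import math

def high_degree_tail_bound(n, k, delta, c=1.0):
    """Compare the high-degree tail sum against the closed-form bound
    n^2 * C(d+k, k) * exp(-theta * d) used in the proof of Lemma 7.3.

    Assumed parameter choices (reconstruction, not verbatim from the paper):
    theta = (1 - delta)/2, C = e^2 * max(1, c)/(1 - delta),
    d = (C/theta) * (k * ln(C/theta) + ln(n)).
    """
    theta = (1 - delta) / 2
    C = math.e ** 2 * max(1.0, c) / (1 - delta)
    d = math.ceil((C / theta) * (k * math.log(C / theta) + math.log(n)))
    r = 1 - theta  # the largest |zeta| on the interval [delta, (delta+1)/2]
    # Tail sum over ell > d of n * C(ell+k, k) * r^ell; the terms decay
    # geometrically, so truncating after a few thousand terms loses essentially nothing.
    tail = sum(n * math.comb(l + k, k) * r ** l for l in range(d + 1, d + 2001))
    bound = n ** 2 * math.comb(d + k, k) * math.exp(-theta * d)
    return tail, bound

tail, bound = high_degree_tail_bound(n=50, k=2, delta=0.5)
print(f"tail sum = {tail:.3e}  <=  closed-form bound = {bound:.3e}")
```

In practice the closed-form bound exceeds the truncated tail by many orders of magnitude, which is why the analysis can afford to discard the high-degree part entirely.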
7.3 The algorithm of Theorem 3

Armed with Theorem 11 in place of Theorem 8, the algorithm Multiplicity-large-$\delta$ giving Theorem 3 and its analysis are very similar to the algorithm Multiplicity$'$-large-$\delta$ and its analysis given earlier in Section 6.1; we only indicate the differences here.

The algorithm changes in the following ways:

• In Line 1 of the algorithm, we now set $\kappa$ to be the RHS of Equation (15):
\[
\kappa := \Big(\frac{1}{n}\Big(\frac{1-\delta}{2}\Big)^{k}\Big)^{O(1/(1-\delta))}.
\]
With this choice of $\kappa$, it follows from the proof of Theorem 11 that the RHS of Equation (18) in Lemma 7.3 can be bounded from above by $0.1\kappa$.

• Later in Line 1, we now set
\[
\Delta := \frac{\kappa}{d^2\, m_d} = \frac{\kappa}{d^2 \cdot n \binom{d+k-1}{k-1}},
\]
where $d$ is as given in Equation (16) (the idea is that now we are using the sharper coefficient bound $m_\ell \le m_d$ given by Equation (11) rather than the cruder $n^k$ bound used earlier).

• The coefficient bounds on $q_0, \ldots, q_{n-k}$ in Line 3(a) of the linear program are now $q_\ell \in [0, m_\ell]$ for all $\ell \in \{0, 1, \ldots, n-k\}$, rather than $q_0, \ldots, q_{n-k} \in [0, n^k]$ as earlier.

With these changes to the algorithm, most of the analysis goes through unchanged. As before, we observe that with probability at least $1-\tau$, we have
\[
\big|\widehat{\mathrm{SW}}_{x,w}(\zeta) - \mathrm{SW}_{x,w}(\zeta)\big| \le \kappa/3 \quad \text{for every } \zeta \in S.
\]
We assume this happens henceforth. The solution which sets $q_\ell = (\mathrm{SW}_{x,w})_\ell$, the degree-$\ell$ coefficient of $\mathrm{SW}_{x,w}$, for all $\ell$, is clearly feasible.

Now we show that every feasible solution $q_0, \ldots, q_{n-k}$ to the linear program must satisfy $|q_0 - \mathrm{SW}_{x,w}(0)| < 1/2$; this is the only part of the analysis that is somewhat different. Suppose for a contradiction that $q_0, \ldots, q_{n-k}$ is a feasible solution with $|q_0 - \mathrm{SW}_{x,w}(0)| \ge 1/2$. Let $q(\zeta) = \sum_\ell q_\ell \zeta^\ell$ and define the polynomial $p = \mathrm{SW}_{x,w} - q$, with coefficients $p_\ell$.
We invoke Theorem 11 to get that $|p(\zeta^*)| \ge \kappa$ for some $\zeta^* \in [\delta, (\delta+1)/2]$. Since (by our choice of $\kappa$)
\[
\big|p(\zeta) - p^{\le d}(\zeta)\big| = \big|p^{>d}(\zeta)\big| \le 0.1\kappa \qquad (20)
\]
for all $\zeta \in [\delta, (\delta+1)/2]$, we have $|p^{\le d}(\zeta^*)| \ge 0.9\kappa$. Applying Claim 6.2 with $s = p^{\le d}$, $n = d$, $t = \zeta^*$, $m = m_d$ and our choice of $\Delta$, there exists a $\zeta' \in S$ such that $|p^{\le d}(\zeta')| \ge 0.8\kappa$, and thus
\[
|p(\zeta')| \ge \big|p^{\le d}(\zeta')\big| - \big|p^{>d}(\zeta')\big| \ge 0.7\kappa.
\]
Hence, recalling that $p = \mathrm{SW}_{x,w} - q$, we have
\[
\big|\widehat{\mathrm{SW}}_{x,w}(\zeta') - q(\zeta')\big| \ge \big|p(\zeta')\big| - \big|\widehat{\mathrm{SW}}_{x,w}(\zeta') - \mathrm{SW}_{x,w}(\zeta')\big| \ge 0.7\kappa - \kappa/3 > \kappa/3.
\]
As $\zeta' \in S$, the solution $q$ violates a constraint of the LP. This concludes the proof of correctness.

Now we analyze the sample complexity of the algorithm. We have
\[
|S| = O(1/\Delta) = \Big(n\Big(\frac{2}{1-\delta}\Big)^{k}\Big)^{O(1/(1-\delta))},
\]
using the bounds established in Section 7.2. Moreover, all points $\zeta \in S$ satisfy $1-\zeta \ge (1-\delta)/2$, so
\[
s = n^{O(1)} \cdot \kappa^{-2} \cdot \Big(\frac{2}{1-\delta}\Big)^{O(k)} \cdot \log\Big(\frac{|S|}{\tau}\Big) = \Big(n\Big(\frac{2}{1-\delta}\Big)^{k}\Big)^{O(1/(1-\delta))} \cdot \log\frac{1}{\tau}. \qquad (21)
\]
The running time is dominated by the time required to compute $\widehat{\mathrm{SW}}_{x,w}(\zeta)$ for each $\zeta \in S$. The running time for each $\zeta$ can be bounded by (21), and the same expression can be used to bound the overall running time given the bound on $|S|$ above.

References

[BCF+19] Frank Ban, Xi Chen, Adam Freilich, Rocco A. Servedio, and Sandip Sinha. Beyond trace reconstruction: Population recovery from the deletion channel. In Proceedings of the 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, pages 745–768. IEEE Computer Society, 2019.

[BCSS19] Frank Ban, Xi Chen, Rocco A. Servedio, and Sandip Sinha. Efficient average-case population recovery in the presence of insertions and deletions. In APPROX/RANDOM 2019, volume 145 of LIPIcs, pages 44:1–44:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.

[BKKM04] T. Batu, S. Kannan, S. Khanna, and A. McGregor. Reconstructing strings from random traces. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, pages 910–918, 2004.

[Cha19] Z. Chase. New lower bounds for trace reconstruction. CoRR, abs/1905.03031, 2019.

[DOS17] Anindya De, Ryan O'Donnell, and Rocco A. Servedio. Optimal mean-based algorithms for trace reconstruction. In Proceedings of the 49th ACM Symposium on Theory of Computing (STOC), pages 1047–1056, 2017.

[HHP18] Lisa Hartung, Nina Holden, and Yuval Peres. Trace reconstruction with varying deletion probabilities. In Proceedings of the Fifteenth Workshop on Analytic Algorithmics and Combinatorics, ANALCO 2018, New Orleans, LA, USA, January 8-9, 2018, pages 54–61, 2018.

[HL18] N. Holden and R. Lyons. Lower bounds for trace reconstruction. CoRR, abs/1808.02336, 2018.

[HMPW08] T. Holenstein, M. Mitzenmacher, R. Panigrahy, and U. Wieder. Trace reconstruction with constant deletion probability and related results. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2008, pages 389–398, 2008.

[HPP18] Nina Holden, Robin Pemantle, and Yuval Peres. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. CoRR, abs/1801.04783, 2018.

[HPPZ20] Nina Holden, Robin Pemantle, Yuval Peres, and Alex Zhai. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. CoRR, abs/1801.04783, 2020.

[Kal73] V. V. Kalashnik. Reconstruction of a word from its fragments. Computational Mathematics and Computer Science (Vychislitel'naya matematika i vychislitel'naya tekhnika), Kharkov, 4:56–57, 1973.

[KM05] Sampath Kannan and Andrew McGregor. More on reconstructing strings from random traces: Insertions and deletions. In IEEE International Symposium on Information Theory, pages 297–301, 2005.

[KMMP19] Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, and Soumyabrata Pal. Trace reconstruction: Generalized and parameterized. In 27th Annual European Symposium on Algorithms, ESA 2019, volume 144 of LIPIcs, pages 68:1–68:25. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.

[Lev01a] Vladimir Levenshtein. Efficient reconstruction of sequences. IEEE Transactions on Information Theory, 47(1):2–22, 2001.

[Lev01b] Vladimir Levenshtein. Efficient reconstruction of sequences from their subsequences or supersequences. Journal of Combinatorial Theory, Series A, 93(2):310–332, 2001.

[MPV14] Andrew McGregor, Eric Price, and Sofya Vorotnikova. Trace reconstruction revisited. In Proceedings of the 22nd Annual European Symposium on Algorithms, pages 689–700, 2014.

[Nar20] S. Narayanan. Population recovery from the deletion channel: Nearly matching trace reconstruction bounds. CoRR, abs/2004.06828, 2020.

[NP17] Fedor Nazarov and Yuval Peres. Trace reconstruction with exp(O(n^{1/3})) samples. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, pages 1042–1046, 2017.

[O'D14] R. O'Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.

[PZ17] Yuval Peres and Alex Zhai. Average-case reconstruction for the deletion channel: subpolynomially many traces suffice, 2017. Available at https://arxiv.org/abs/1708.00854.

[ST01] Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time. In Proceedings on 33rd Annual ACM Symposium on Theory of Computing (STOC) 2001, pages 296–305. ACM, 2001.

[VS08] Krishnamurthy Viswanathan and Ram Swaminathan. Improved string reconstruction over insertion-deletion channels. In