Tree trace reconstruction using subtraces
Tatiana Brailovskaya ∗   Miklós Z. Rácz †   February 3, 2021
Abstract
Tree trace reconstruction aims to learn the binary node labels of a tree, given independent samples of the tree passed through an appropriately defined deletion channel. In recent work, Davies, Rácz, and Rashtchian [9] used combinatorial methods to show that exp(O(k log_k n)) samples suffice to reconstruct a complete k-ary tree with n nodes with high probability. We provide an alternative proof of this result, which allows us to generalize it to a broader class of tree topologies and deletion models. In our proofs, we introduce the notion of a subtrace, which enables us to connect with and generalize recent mean-based complex analytic algorithms for string trace reconstruction.

1 Introduction

Trace reconstruction is a fundamental statistical reconstruction problem which has received much attention lately. Here the goal is to infer an unknown binary string of length n, given independent copies of the string passed through a deletion channel. The deletion channel deletes each bit in the string independently with probability q and then concatenates the surviving bits into a trace. The goal is to learn the original string with high probability using as few traces as possible.

The trace reconstruction problem was introduced two decades ago [19, 3], and despite lots of work since then [17, 11, 12, 23, 14, 16, 15, 6, 5, 7, 13], understanding the sample complexity of trace reconstruction remains wide open. Specifically, the best known upper bound is due to Chase [5], who showed that exp(Õ(n^{1/5})) samples suffice; this work builds upon previous breakthroughs by De, O'Donnell, and Servedio [11, 12], and Nazarov and Peres [23], who simultaneously obtained an upper bound of exp(O(n^{1/3})). In contrast, the best known lower bound is Ω̃(n^{3/2}) (see [15, 6]).
Considering average-case strings as opposed to worst-case ones reduces the sample complexity considerably, but the large gap remains: the current best known upper and lower bounds are exp(O(log^{1/3} n)) (see [16]) and Ω̃(log^{5/2} n) (see [6]), respectively. As we can see, the bounds are exponentially far apart for both the worst-case and average-case problems.

Given the difficulty of the trace reconstruction problem, several variants have been introduced, in part to study the strengths and weaknesses of various techniques. These include generalizing trace reconstruction from strings to trees [9] and matrices [18], coded trace reconstruction [8, 4], population recovery [1, 2, 21], and more [22, 7, 10].

In this work we consider tree trace reconstruction, introduced recently by Davies, Rácz, and Rashtchian [9]. In this problem we aim to learn the binary node labels of a tree, given independent samples of the tree passed through an appropriately defined deletion channel. The additional tree structure makes reconstruction easier; indeed, in several settings Davies, Rácz, and Rashtchian [9] show that the sample complexity is polynomial in the number of bits in the worst case. Furthermore, Maranzatto [20] showed that strings are the hardest trees to reconstruct; that is, the sample complexity of reconstructing an arbitrary labeled tree with n nodes is no more than the sample complexity of reconstructing an arbitrary labeled n-bit string.

As demonstrated in [9], tree trace reconstruction provides a natural testbed for studying the interplay between combinatorial and complex analytic techniques that have been used to tackle the string variant. Our work continues in this spirit.

∗ Princeton University; [email protected].
† Princeton University; [email protected]. Research supported in part by NSF grant DMS 1811724 and by a Princeton SEAS Innovation Award.
In particular, Davies, Rácz, and Rashtchian [9] used combinatorial methods to show that exp(O(k log_k n)) samples suffice to reconstruct complete k-ary trees with n nodes, and here we provide an alternative proof using complex analytic techniques. This alternative proof also allows us to generalize the result to a broader class of tree topologies and deletion models. Before stating our results we first introduce the tree trace reconstruction problem more precisely.

Let X be a rooted tree with unknown binary labels on its n non-root nodes. We assume that X has an ordering of its nodes, and the children of a given node have a left-to-right ordering. The goal of tree trace reconstruction is to learn the labels of X with high probability, using as few traces as possible, knowing only the deletion model, the deletion probability q < 1, and the tree structure of X. Throughout this paper, we write ‘with high probability’ to mean with probability tending to 1 as n → ∞.

While for strings there is a canonical model of the deletion channel, there is no such canonical model for trees. Previous work in [9] considered two natural extensions of the string deletion channel to trees: the Tree Edit Distance (TED) deletion model and the Left-Propagation (LP) deletion model; see [9] for details. Here we focus on the TED model, while also introducing a new deletion model, termed All-Or-Nothing (AON), which is more ‘destructive’ than the other models. In both models the root never gets deleted.

• Tree Edit Distance (TED) deletion model:
Each non-root node is deleted independently with probability q and deletions are associative. When a node v gets deleted, all of the children of v become children of the parent of v. Equivalently, contract the edge between v and its parent, retaining the label of the parent. The children of v take the place of v in the left-to-right order; in other words, the original siblings of v that are to the left of v and survive are now to the left of the children of v, and the same holds to the right of v.

• All-Or-Nothing (AON) deletion model:
Each non-root node is marked independently with probability q. If a node v is marked, then the whole subtree rooted at v is deleted. In other words, a node is deleted if and only if it is marked or it has an ancestor which is marked.

Figure 1 illustrates these two deletion models. We refer to [9] for motivation and further remarks on the TED deletion model. While the AON deletion model is significantly more destructive than the TED deletion model, an advantage of the tools we develop in this work is that we are able to obtain similar results for arbitrary tree topologies under the AON deletion model.

Before we state our results, we recall a result of Davies, Rácz, and Rashtchian [9, Theorem 4].

Theorem 1 ([9]). In the TED model, there exists a finite constant C depending only on q such that exp(Ck log_k n) traces suffice to reconstruct a complete k-ary tree on n nodes with high probability (here k ≥ 2).

In particular, note that the sample complexity is polynomial in n whenever k is a constant. Our first result is an alternative proof of the same result, under some mild additional assumptions, as stated below in Theorem 2.

Figure 1: Actions of deletion models on a sample tree. Original tree in (a), with orange nodes to be deleted. Resulting trace in the TED model (b) and the AON model (c).
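Both deletion models are straightforward to simulate. The following Python sketch is our own illustration (the tuple-based tree encoding and function names are not from the paper): a tree is a pair (label, children), the root is never deleted, and the two channels act as described above.

```python
import random

def ted_channel(label, children, q, is_root=True):
    """TED channel sketch: each non-root node is deleted independently
    with probability q; a deleted node's children splice into its place
    among its parent's children, preserving the left-to-right order."""
    surviving = []
    for child_label, child_children in children:
        surviving.extend(ted_channel(child_label, child_children, q, False))
    if not is_root and random.random() < q:
        return surviving                    # v deleted: children take v's place
    result = (label, surviving)
    return result if is_root else [result]

def aon_channel(label, children, q, is_root=True):
    """AON channel sketch: each non-root node is marked independently with
    probability q; a marked node is deleted together with its whole subtree."""
    if not is_root and random.random() < q:
        return None                         # v marked: drop the subtree at v
    kept = []
    for child_label, child_children in children:
        sub = aon_channel(child_label, child_children, q, False)
        if sub is not None:
            kept.append(sub)
    return (label, kept)
```

With q = 0 both channels return the tree unchanged, and with q = 1 both reduce it to a bare root, matching the two extreme cases of the models.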
Theorem 2.
Fix c ∈ Z+ and let q < c/(c+1). There exists a finite constant C, depending only on c and q, such that for any k > c the following holds: in the TED model, exp(Ck log_k n) traces suffice to reconstruct a complete k-ary tree on n nodes with high probability.

The additional assumptions compared to Theorem 1 are indeed mild. For instance, with c = 1 in the theorem above, Theorem 1 is recovered for q < 1/2. Theorem 2 also allows q to be arbitrarily close to 1, provided that k is at least a large enough constant.

In [9], the authors use combinatorial techniques to prove Theorem 1. Our proof of Theorem 2 uses a mean-based complex analytic approach, similar to [11, 12, 23, 18]. The advantage of our approach is that it allows us to reconstruct labels of more general tree topologies in the TED deletion model, as stated below in Theorem 3; the combinatorial proof in [9] does not naturally lend itself to such a generalization.

Theorem 3.
Let X be a rooted tree on n nodes with binary labels, with nodes on level ℓ all having the same number of children k_ℓ. Let k_max := max_ℓ k_ℓ and k_min := min_ℓ k_ℓ, where the minimum goes over all levels except the last one (containing leaf nodes). If q < c/(c+1) and k_min > c for some c ∈ Z+, then there exists a finite constant C, depending only on c and q, such that exp(C k_max log_{k_min} n) traces suffice to reconstruct X with high probability.

Furthermore, with some slight modifications, our proof of Theorem 2 also provides a sample complexity bound for reconstructing arbitrary tree topologies in the AON deletion model.
Theorem 4.
Let X be a rooted tree on n nodes with binary labels, let k_max denote the maximum number of children a node has in X, and let d be the depth of X. In the AON model, there exists a finite constant C depending only on q such that exp(C k_max d) traces suffice to reconstruct X with high probability.

The key idea in the above proofs is the notion of a subtrace, which is the subgraph of a trace that consists only of root-to-leaf paths of length d, where d is the depth of the underlying tree. In the proofs of Theorems 2 and 3 we essentially only use the information contained in these subtraces and ignore the rest of the trace. This trick is key to making the setup amenable to the mean-based complex analytic techniques.

The rest of the paper is organized as follows. We start with some preliminaries in Section 2, where we state basic tree definitions and define the notion of a subtrace more precisely. In Section 3 we present our proof of Theorem 2. In Section 4 we generalize the methods of Section 3 to a broader class of tree topologies and deletion models, proving Theorems 3 and 4. We conclude in Section 5.

2 Preliminaries
In what follows, X denotes an underlying rooted tree of known topology along with binary labels associated with the n non-root nodes of the tree.

Basic tree terminology. A tree is a connected acyclic graph. A rooted tree has a special node that is designated as the root. A leaf is a node of degree 1. We say that a node v is at level ℓ if the graph distance between v and the root is ℓ. We say that node v is at height h if the largest graph distance from v to a leaf is h. The depth of the tree is the largest graph distance from the root to a leaf. We say that node u is a child of node v if there is an edge between u and v and v is closer to the root than u in graph distance. Similarly, we also call v the parent of u. More generally, v is an ancestor of u if there exists a path v = x_0, . . . , x_n = u such that x_i is closer to the root than x_{i+1} for every i ∈ {0, 1, . . . , n − 1}. A complete k-ary tree is a tree in which every non-leaf node has k children.

Subtrace augmentation.
Above we defined the subtrace Z as the subgraph of the trace Y containing all root-to-leaf paths of length d, where d is the depth of X. In what follows, it will be helpful to slightly modify the definition of the subtrace by augmenting Z to Z′ such that Z′ is a complete k-ary tree that contains Z as a subgraph. Given Z, we construct Z′ recursively as follows. We begin by setting Z′ := Z. If the root of Z′ currently has fewer than k children, then we add child nodes to the root, to the right of the existing children, and label them 0. Now, consider the leftmost node in level 1 of Z′. If it has fewer than k children, add new children to the right of the existing children of this node and label them 0. Then repeat the same procedure for the second leftmost node in level 1. Continue this procedure left to right for each level, moving from top to bottom of the tree. See Figure 2 for an illustration of this process. In Section 3, when we mention the subtrace of X we mean the augmented subtrace, constructed as described here. In Section 4, we will slightly modify the notion of an augmented subtrace for the different tree topologies we will be considering.

Figure 2: Construction of an augmented subtrace. Original tree in (a), with orange nodes to be deleted. The resulting trace under the TED deletion model in (b). Subtrace in (c). Augmented subtrace in (d), with blue nodes corresponding to the padding 0s.
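The two steps above, extracting the subtrace from a trace and padding it into a complete k-ary tree, can be sketched in code. This is our own illustrative encoding (trees as (label, children) pairs), not part of the paper: `subtrace` keeps exactly the nodes lying on some root-to-leaf path of length d, and `augment` appends 0-labelled children to the right, level by level, from top to bottom.

```python
def subtrace(label, children, d, depth=0):
    """Keep only the nodes of the trace that lie on some root-to-leaf
    path of length d; returns None if no such path passes through here."""
    kept = []
    for child_label, child_children in children:
        sub = subtrace(child_label, child_children, d, depth + 1)
        if sub is not None:
            kept.append(sub)
    if not kept and depth < d:
        return None
    return (label, kept)

def augment(z, k, d):
    """Pad a (possibly empty) subtrace into a complete k-ary tree of
    depth d by appending 0-labelled children to the right of the
    existing children of each node."""
    label, children = z if z is not None else (0, [])
    if d == 0:
        return (label, [])
    padded = [augment(c, k, d - 1) for c in children]
    padded += [augment(None, k, d - 1) for _ in range(k - len(children))]
    return (label, padded)
```

For instance, for the trace (None, [(1, [(1, [])]), (0, [])]) with d = 2, the child with label 0 is a leaf at depth 1, so it is discarded from the subtrace, and the augmentation then pads the result back into a complete binary tree of depth 2.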
3 Reconstructing complete k-ary trees in the TED model

In this section we prove Theorem 2. The proof takes inspiration from [11, 12, 23, 18]. We begin by computing, for every node in the original tree, its probability of survival in a subtrace. We then derive a multivariate complex generating function for every level ℓ, with random coefficients corresponding to the labels of nodes in the subtrace. Finally, we show how we can “average” the subtraces to determine the correct labeling for each level of the original tree with high probability.
Let d denote the depth of the original tree X. Let Y denote a trace of X and let Z denote the corresponding subtrace obtained from Y. Observe that a node v at level ℓ of X survives in the subtrace Z if and only if a root-to-leaf path that includes v survives in the trace Y. Furthermore, there exists exactly one path from the root to v, and the ℓ − 1 non-root nodes on this path other than v all survive in Y with probability (1 − q)^{ℓ−1} (since each of them has to survive independently). Let p_{d−ℓ} denote the probability that no v-to-leaf path survives in Y. Thus,

P(v survives in Z) = (1 − q)^{ℓ−1} (1 − p_{d−ℓ}).

Thus, we can see that it suffices to compute p_h for h ∈ {0, 1, . . . , d − 1} in order to compute the probability of survival of v in a subtrace. The rest of this subsection is thus dedicated to understanding {p_h}_{h=0}^{d−1}. We will not find an explicit expression for p_h, but rather derive a recurrence relation for p_h, which will prove to be good enough for us.

Let us denote by v a vertex at height h + 1, which is the root of the subtree under consideration. There are two events that contribute to p_{h+1}: either v gets deleted (this happens with probability q), or v survives but all of the k subtrees rooted at the children of v fail to have a surviving root-to-leaf path in the subtrace (this happens with probability p_h^k). Thus, we have the following recurrence relation: for every h ≥ 0,

p_{h+1} = q + (1 − q) p_h^k;   (1)

furthermore, the initial condition satisfies p_0 = q. This recursion allows us to compute {p_h}_{h=0}^{d−1}. We now prove the following statement about this recursion, which will be useful later on.

Lemma 1.
Suppose that 0 < q < c/(c+1) and k > c for some c ∈ Z+. There exists p′ < 1, depending only on c and q, such that p_i ≤ p′ < 1 for every i ≥ 0.

Proof. The function f(p) := 1 + p + . . . + p^c is continuous and strictly increasing on [0, 1] with f(0) = 1 and f(1) = c + 1. The assumption q ∈ (0, c/(c + 1)) implies that 1/(1 − q) ∈ (1, c + 1), so there exists a unique p′ ∈ (0, 1) such that f(p′) = 1/(1 − q). By construction p′ is a function of c and q. We will show by induction that p_i ≤ p′ for every i ≥ 0. First, note that for p ∈ (0, 1) we have f(p) ≤ ∑_{m≥0} p^m = 1/(1 − p). Thus 1/(1 − p′) ≥ f(p′) = 1/(1 − q) and so q ≤ p′. Since p_0 = q, this proves the base case of the induction.

For the induction step, first note that if p ∈ (0, 1), then f(p) = (1 − p^{c+1})/(1 − p). Therefore the equation f(p′) = 1/(1 − q) implies that (1 − q)(p′)^{c+1} = p′ − q. So if p_i ≤ p′, then (1) implies that

p_{i+1} = q + (1 − q) p_i^k ≤ q + (1 − q)(p′)^{c+1} = q + (p′ − q) = p′,

where we also used the assumption k ≥ c + 1 in the inequality.

We begin by introducing some additional notation. Let b_{ℓ,i} denote the label of the node located at level ℓ in position i from the left in the original tree X; we start the indexing of i from 0, so b_{ℓ,0} is the label of the leftmost vertex on level ℓ. Similarly, let a_{ℓ,i} denote the label of the node located at level ℓ in position i from the left in the subtrace Z. Observe that for i ∈ {0, 1, . . . , k^ℓ − 1} we can write i = t_{ℓ−1} k^{ℓ−1} + t_{ℓ−2} k^{ℓ−2} + . . . + t_0 with t_i ∈ {0, . . . , k − 1}; that is, t_{ℓ−1} t_{ℓ−2} . . . t_0 is the base-k representation of i. To abbreviate notation, we will write a_{ℓ,i} = a_{ℓ, t_{ℓ−1} k^{ℓ−1} + t_{ℓ−2} k^{ℓ−2} + . . . + t_0} simply as a_{ℓ, t_{ℓ−1} . . . t_0}.

We introduce, for every level ℓ, a multivariate complex generating function whose coefficients are the labels of the nodes at level ℓ of a subtrace Z. Specifically, we introduce complex variables w_0, . . . , w_{ℓ−1}, one for each position in the base-k representation, and define

A_ℓ(w) := ∑_{t_0=0}^{k−1} · · · ∑_{t_{ℓ−1}=0}^{k−1} a_{ℓ, t_{ℓ−1} . . . t_0} w_{ℓ−1}^{t_{ℓ−1}} · · · w_0^{t_0}.   (2)
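As a numerical aside (our own sketch, not part of the paper's argument), the recurrence (1) and the fixed point p′ of Lemma 1 are easy to compute: the bisection below finds the unique p′ ∈ (0, 1) with 1 + p′ + . . . + (p′)^c = 1/(1 − q).

```python
def survival_probs(q, k, d):
    """Iterate the recurrence p_0 = q, p_{h+1} = q + (1 - q) * p_h**k,
    returning [p_0, ..., p_{d-1}]."""
    ps = [q]
    for _ in range(d - 1):
        ps.append(q + (1 - q) * ps[-1] ** k)
    return ps

def lemma1_fixed_point(q, c, tol=1e-12):
    """Bisection for the unique p' in (0, 1) with f(p') = 1/(1 - q),
    where f(p) = 1 + p + ... + p**c (valid for 0 < q < c/(c+1))."""
    f = lambda p: sum(p ** m for m in range(c + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 1 / (1 - q):
            lo = mid
        else:
            hi = mid
    return hi
```

For example, with c = 1, q = 0.4 and k = 2, the fixed point is p′ = 2/3, and the iterates p_h increase monotonically towards it without ever exceeding it, as Lemma 1 guarantees.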
We are now ready to state the main result of this subsection, which computes the expectation of this generating function.

Lemma 2.
For every ℓ ∈ {1, . . . , d} we have that

E[A_ℓ(w)] = (1 − q)^{ℓ−1} (1 − p_{d−ℓ}) ∑_{t_0=0}^{k−1} · · · ∑_{t_{ℓ−1}=0}^{k−1} b_{ℓ, t_{ℓ−1} . . . t_0} ∏_{m=0}^{ℓ−1} ((1 − p_{d−ℓ+m}) w_m + p_{d−ℓ+m})^{t_m}.   (3)

This lemma is useful because the right hand side of (3) contains the labels of the nodes on level ℓ of X, while the left hand side can be estimated by averaging over subtraces.

Proof of Lemma 2.
By linearity of expectation we have that

E[A_ℓ(w)] = ∑_{t_0=0}^{k−1} · · · ∑_{t_{ℓ−1}=0}^{k−1} E[a_{ℓ, t_{ℓ−1} . . . t_0}] w_{ℓ−1}^{t_{ℓ−1}} · · · w_0^{t_0},   (4)

so our goal is to compute E[a_{ℓ, t_{ℓ−1} . . . t_0}]. For node i = i_{ℓ−1} . . . i_0 on level ℓ, we may interpret each digit i_m in the base-k representation as follows: consider node i's ancestor on level ℓ − m; the horizontal position of this node amongst its siblings is i_m. Thus, if the original bit b_{ℓ,i} survives in the subtrace, it can only end up in position j = j_{ℓ−1} j_{ℓ−2} . . . j_0 on level ℓ satisfying j_m ≤ i_m for every m. If the m-th digit of the location of b_{ℓ,i} in the subtrace is j_m, then exactly i_m − j_m of the siblings to the left of the ancestor of i on level ℓ − m must have been deleted from the subtrace, and the ancestor of i on level ℓ − m must have survived in the subtrace. Thus, the probability that bit a_{ℓ,j} of the subtrace is the original bit b_{ℓ,i} is given by

P(a_{ℓ, j_{ℓ−1} . . . j_0} = b_{ℓ, i_{ℓ−1} . . . i_0})
= \binom{i_0}{j_0} p_{d−ℓ}^{i_0−j_0} (1 − p_{d−ℓ})^{j_0+1} × \binom{i_1}{j_1} p_{d−ℓ+1}^{i_1−j_1} (1 − p_{d−ℓ+1})^{j_1} (1 − q) × · · · × \binom{i_{ℓ−1}}{j_{ℓ−1}} p_{d−1}^{i_{ℓ−1}−j_{ℓ−1}} (1 − p_{d−1})^{j_{ℓ−1}} (1 − q)
= (1 − q)^{ℓ−1} (1 − p_{d−ℓ}) ∏_{m=0}^{ℓ−1} \binom{i_m}{j_m} p_{d−ℓ+m}^{i_m−j_m} (1 − p_{d−ℓ+m})^{j_m}.

Summing over all i satisfying i_m ≥ t_m for every m, and plugging into (4), we obtain that

E[A_ℓ(w)] = (1 − q)^{ℓ−1} (1 − p_{d−ℓ}) × ∑_{t_0=0}^{k−1} · · · ∑_{t_{ℓ−1}=0}^{k−1} ∑_{i_0=t_0}^{k−1} · · · ∑_{i_{ℓ−1}=t_{ℓ−1}}^{k−1} b_{ℓ, i_{ℓ−1} . . . i_0} ∏_{m=0}^{ℓ−1} \binom{i_m}{t_m} p_{d−ℓ+m}^{i_m−t_m} (1 − p_{d−ℓ+m})^{t_m} w_m^{t_m}.

Interchanging the order of summations and using the binomial theorem (ℓ times) we obtain (3).
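Lemma 2 can be sanity-checked by brute force in the smallest nontrivial case ℓ = d = 1, where X is a root with k leaves and p_0 = q. The Python sketch below is our own (not from the paper): it enumerates all 2^k deletion patterns exactly and compares the resulting expectation with the right hand side of (3).

```python
from itertools import product

def exact_EA1(b, q, w):
    """Exact E[A_1(w)] for a root with k labelled leaves: enumerate every
    deletion pattern, form the augmented subtrace, and average A_1(w)."""
    k, total = len(b), 0.0
    for pattern in product([False, True], repeat=k):   # True = leaf deleted
        prob = 1.0
        for deleted in pattern:
            prob *= q if deleted else (1 - q)
        survivors = [b[i] for i in range(k) if not pattern[i]]
        survivors += [0] * (k - len(survivors))        # pad with 0 labels
        total += prob * sum(a * w ** t for t, a in enumerate(survivors))
    return total

def lemma2_EA1(b, q, w):
    """Right hand side of (3) specialized to l = d = 1 (so p_0 = q)."""
    return (1 - q) * sum(bt * ((1 - q) * w + q) ** t for t, bt in enumerate(b))
```

The two agree for any labels, any deletion probability, and any (even complex) choice of w.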
Here we prove a simple lower bound on the modulus of a multivariate Littlewood polynomial. Thisbound will extend to the generating function computed above for appropriate choices of w m . Theargument presented here is inspired by the method of proof of [18, Lemma 4]. Throughout thepaper we let D denote the unit disc in the complex plane and let ∂ D denote its boundary. Lemma 3.
Let F(z_0, . . . , z_{ℓ−1}) be a nonzero multivariate polynomial with monomial coefficients in {−1, 0, 1}. Then

sup_{z_0, . . . , z_{ℓ−1} ∈ ∂D} |F(z_0, . . . , z_{ℓ−1})| ≥ 1.

Proof.
We define a sequence of polynomials {F_i}_{i=0}^{ℓ−1} inductively as follows, where F_i is a function of the variables z_i, . . . , z_{ℓ−1}. First, let t_0 be the smallest power of z_0 in a monomial of F and let F_0(z_0, . . . , z_{ℓ−1}) := z_0^{−t_0} F(z_0, . . . , z_{ℓ−1}). By construction, F_0 has at least one monomial where z_0 does not appear. For i ∈ {1, . . . , ℓ − 1}, given F_{i−1} we define F_i as follows. Let t_i be the smallest power of z_i in a monomial of F_{i−1}(0, z_i, . . . , z_{ℓ−1}) and let F_i(z_i, . . . , z_{ℓ−1}) := z_i^{−t_i} F_{i−1}(0, z_i, . . . , z_{ℓ−1}). Observe that this construction guarantees, for every i, that the polynomial F_i(z_i, . . . , z_{ℓ−1}) has at least one monomial where z_i does not appear. In particular, the univariate polynomial F_{ℓ−1}(z_{ℓ−1}) has a nonzero constant term. Since the coefficients of the polynomial are in {−1, 0, 1}, this means that the constant term has absolute value 1; that is, |F_{ℓ−1}(0)| = 1.

Let (z*_0, . . . , z*_{ℓ−1}) denote the maximizer of |F(z_0, . . . , z_{ℓ−1})| with z_m ∈ ∂D for all m. Now, by the maximum modulus principle, observe that

|F(z*_0, . . . , z*_{ℓ−1})| = |(z*_0)^{t_0}| |F_0(z*_0, . . . , z*_{ℓ−1})| = |F_0(z*_0, . . . , z*_{ℓ−1})| ≥ |F_0(0, z*_1, . . . , z*_{ℓ−1})|.

Using the definition of F_i and iterating the above inequality yields, for all i ∈ {0, . . . , ℓ − 2}, that

|F_i(z*_i, . . . , z*_{ℓ−1})| ≥ |F_i(0, z*_{i+1}, . . . , z*_{ℓ−1})| = |F_{i+1}(z*_{i+1}, . . . , z*_{ℓ−1})|.

By taking the two ends of this chain of inequalities, and applying the maximum modulus principle once more to F_{ℓ−1}, we can thus see that

|F(z*_0, . . . , z*_{ℓ−1})| ≥ |F_{ℓ−1}(0)| = 1.
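The reduction in the proof of Lemma 3 is effective, and it can be traced programmatically. In the sketch below (our own illustration, not from the paper), a polynomial is a dict from exponent tuples to nonzero coefficients in {−1, 1}; dividing out the lowest power of z_i and then setting z_i = 0 amounts to keeping exactly the monomials whose i-th exponent is minimal.

```python
def lemma3_constant(F, nvars):
    """Follow the proof of Lemma 3: for i = 0, ..., nvars-1, keep only the
    monomials with the lowest power of z_i (i.e. divide by z_i**t_i and
    set z_i = 0).  Exactly one monomial survives; its coefficient is
    F_{l-1}(0), whose modulus is 1, certifying sup |F| >= 1 on the torus."""
    monos = dict(F)
    for i in range(nvars):
        t_i = min(e[i] for e in monos)              # lowest power of z_i
        monos = {e: c for e, c in monos.items() if e[i] == t_i}
    ((_, coeff),) = monos.items()
    return coeff
```

For example, F = z_0 z_1 (−1 + z_1) + z_0^3 is encoded as {(1, 1): −1, (1, 2): 1, (3, 0): 1}, and the reduction ends at the monomial z_0 z_1 with coefficient −1, which indeed has modulus 1.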
Let X (cid:48) and X (cid:48)(cid:48) be two complete k -ary trees on n non-root nodes with differentbinary node labels. Our first goal is to distinguish between X (cid:48) and X (cid:48)(cid:48) using subtraces. At the endof the proof we will then explain how to estimate the original tree X using subtraces.Since X (cid:48) and X (cid:48)(cid:48) have different labels, there exists at least one level of the tree where the nodelabels differ. Call the minimal such level (cid:96) ∗ = (cid:96) ∗ ( X (cid:48) , X (cid:48)(cid:48) ); we will use this level of the subtraces todistinguish between X (cid:48) and X (cid:48)(cid:48) . Let (cid:110) b (cid:48) (cid:96) ∗ ,i : i ∈ (cid:110) , , . . . , k (cid:96) ∗ − (cid:111)(cid:111) and (cid:110) b (cid:48)(cid:48) (cid:96) ∗ ,i : i ∈ (cid:110) , , . . . , k (cid:96) ∗ − (cid:111)(cid:111) denote the labels on level (cid:96) ∗ of X (cid:48) and X (cid:48)(cid:48) , respectively. Furthermore, for every i define b (cid:96) ∗ ,i := b (cid:48) (cid:96) ∗ ,i − b (cid:48)(cid:48) (cid:96) ∗ ,i . By construction, b (cid:96) ∗ ,i ∈ {− , , } for every i , and there exists i such that b (cid:96) ∗ ,i (cid:54) = 0. Let Z (cid:48) and Z (cid:48)(cid:48) be subtraces obtained from X (cid:48) and X (cid:48)(cid:48) , respectively, and let (cid:110) Z (cid:48) (cid:96) ∗ ,i : i ∈ (cid:110) , , . . . , k (cid:96) ∗ − (cid:111)(cid:111) and (cid:110) Z (cid:48)(cid:48) (cid:96) ∗ ,i : i ∈ (cid:110) , , . . . , k (cid:96) ∗ − (cid:111)(cid:111) (cid:96) ∗ of Z (cid:48) and Z (cid:48)(cid:48) , respectively. By Lemma 2 we have that E k − (cid:88) t =0 . . . k − (cid:88) t (cid:96) ∗− =0 Z (cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:96) ∗ − (cid:89) m =0 w t m m − E k − (cid:88) t =0 . . . k − (cid:88) t (cid:96) ∗− =0 Z (cid:48)(cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t t (cid:96) ∗− (cid:89) m =0 w t m m = (1 − q ) (cid:96) ∗ − (1 − p d − (cid:96) ∗ ) k − (cid:88) t =0 . . . 
k − (cid:88) t (cid:96) ∗− =0 b (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:96) ∗ − (cid:89) m =0 ((1 − p d − (cid:96) ∗ + m ) w m + p d − (cid:96) ∗ + m ) t m . Now define the multivariate polynomial B ( z ) in the variables z = ( z , . . . , z (cid:96) ∗ − ) as follows: B ( z ) := k − (cid:88) t =0 . . . k − (cid:88) t (cid:96) ∗− =0 b (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:96) ∗ − (cid:89) m =0 z t m m . Lemma 3 implies that there exists z ∗ = (cid:0) z ∗ , . . . , z ∗ (cid:96) ∗ − (cid:1) such that z ∗ m ∈ ∂ D for every m ∈{ , . . . , (cid:96) ∗ − } and | B ( z ∗ ) | ≥ . For m ∈ { , . . . , (cid:96) ∗ − } let w ∗ m := z ∗ m − p d − (cid:96) ∗ + m − p d − (cid:96) ∗ + m . Note that the polynomial B is a function of X (cid:48) and X (cid:48)(cid:48) , and thus so is z ∗ and also w ∗ = (cid:0) w ∗ , . . . , w ∗ (cid:96) ∗ − (cid:1) . Putting together the four previous displays and using the triangle inequalitywe obtain that k − (cid:88) t =0 . . . k − (cid:88) t (cid:96) ∗− =0 (cid:12)(cid:12)(cid:12) E (cid:104) Z (cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:105) − E (cid:104) Z (cid:48)(cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:105)(cid:12)(cid:12)(cid:12) (cid:96) ∗ − (cid:89) m =0 | w ∗ m | t m ≥ (1 − q ) (cid:96) ∗ − (1 − p d − (cid:96) ∗ ) . (5)Next we estimate | w ∗ m | . By the definition of w ∗ m and the triangle inequality we have that | w ∗ m | = | z ∗ m − p d − (cid:96) ∗ + m | − p d − (cid:96) ∗ + m ≤ | z ∗ m | + p d − (cid:96) ∗ + m − p d − (cid:96) ∗ + m ≤ − p (cid:48) , (6)where in the last inequality we used that | z ∗ m | = 1 and that p d − (cid:96) ∗ + m ≤ p (cid:48) < p (cid:48) is a constant that depends only on c and q (recall that c is an input to the theorem).The bound in (6) implies that (cid:96) ∗ − (cid:89) m =0 | w ∗ m | t m ≤ (cid:18) − p (cid:48) (cid:19) k(cid:96) ∗ . (7)Plugging this back into (5) (and using that p d − (cid:96) ∗ ≤ p (cid:48) ) we get that k − (cid:88) t =0 . . . 
k − (cid:88) t (cid:96) ∗− =0 (cid:12)(cid:12)(cid:12) E (cid:104) Z (cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:105) − E (cid:104) Z (cid:48)(cid:48) (cid:96) ∗ ,t (cid:96) ∗− ...t (cid:105)(cid:12)(cid:12)(cid:12) ≥ (1 − q ) (cid:96) ∗ − (1 − p (cid:48) ) k(cid:96) ∗ +1 (1 / k(cid:96) ∗ . Thus by the pigeonhole principle there exists i ∗ ∈ (cid:8) , , . . . , k (cid:96) ∗ − (cid:9) such that (cid:12)(cid:12) E (cid:2) Z (cid:48) (cid:96) ∗ ,i ∗ (cid:3) − E (cid:2) Z (cid:48)(cid:48) (cid:96) ∗ ,i ∗ (cid:3)(cid:12)(cid:12) ≥ (1 − q ) (cid:96) ∗ − (1 − p (cid:48) ) k(cid:96) ∗ +1 (1 / k(cid:96) ∗ k (cid:96) ∗ ≥ exp ( − Ck(cid:96) ∗ ) ≥ exp ( − Ck log k n ) , (8)8here the second inequality holds for a large enough constant C that depends only on c and q ,while the third inequality is because the depth of the tree is log k n . Note that i ∗ is a function of X (cid:48) and X (cid:48)(cid:48) .Now suppose that we sample T traces of X from the TED deletion channel and let Z , . . . , Z T denote the corresponding subtraces. Let X (cid:48) and X (cid:48)(cid:48) be two complete k -ary labeled trees withdifferent labels, and recall the definitions of (cid:96) ∗ = (cid:96) ∗ ( X (cid:48) , X (cid:48)(cid:48) ) and i ∗ = i ∗ ( X (cid:48) , X (cid:48)(cid:48) ) from above. Wesay that X (cid:48) beats X (cid:48)(cid:48) (with respect to these samples) if (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T T (cid:88) t =1 Z t(cid:96) ∗ ,i ∗ − E (cid:2) Z (cid:48) (cid:96) ∗ ,i ∗ (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T T (cid:88) t =1 Z t(cid:96) ∗ ,i ∗ − E (cid:2) Z (cid:48)(cid:48) (cid:96) ∗ ,i ∗ (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . We are now ready to define our estimate (cid:98) X of the labels of the original tree. If there exists acomplete k -ary tree X (cid:48) that beats every other complete k -ary tree (with respect to these samples),then we let (cid:98) X := X (cid:48) . 
Otherwise, define X̂ arbitrarily.

Finally, we show that this estimate is correct with high probability. Let η := exp(−Ck log_k n). By a union bound and a Chernoff bound (using (8)), the probability that the estimate is incorrect is bounded by

P(X̂ ≠ X) ≤ ∑_{X′ : X′ ≠ X} P(X′ beats X) ≤ 2^n exp(−Tη²/2) = 2^n exp(−(T/2) exp(−2Ck log_k n)).

Choosing T = exp(3Ck log_k n), the right hand side of the display above tends to 0.

4 Generalizations to a broader class of tree topologies and deletion models

The method of proof shown in the previous section naturally lends itself to the more general results of Theorem 3 for the TED deletion model and Theorem 4 for the AON deletion model. The proofs are almost entirely identical to the one presented above, so we will only highlight the new ideas below and leave the details to the reader.
Before we proceed with the proof, we must clarify the notion of a subtrace. In Section 2 we described the notion of an augmented subtrace for a complete k-ary tree. More generally, for trees in the setting of Theorem 3, we define an augmented subtrace in a similar way; the key point is that the underlying tree structure of the augmented subtrace is the same as the underlying tree structure of X. That is, we start with the root of the subtrace Z, and if it has fewer than k_0 children, we add nodes with 0 labels to the right of its existing children, until the root has k_0 children in total. We then move on to the leftmost node on level 1 and add new children with label 0 to the right of its existing children, until it has k_1 children. We continue in this fashion from left to right on each level, ensuring that each node on level ℓ has k_ℓ children, moving from top to bottom of the tree.

Proof of Theorem 3.
As before, we begin by computing, for every node in the tree, its probability of survival in a subtrace. The quantities {p_h}_{h=0}^{d−1} can be defined exactly as before, where again d denotes the depth of the tree. The recurrence relation changes slightly: for every h ≥ 0,

p_{h+1} = q + (1 − q) p_h^{k_{d−h−1}};

furthermore, the initial condition satisfies p_0 = q. The following lemma is the analog of Lemma 1; we omit its proof, since it is identical to that of Lemma 1.

Lemma 4. Suppose that 0 < q < c/(c+1) and k_min > c for some c ∈ Z+. There exists p′ < 1, depending only on c and q, such that p_i ≤ p′ < 1 for every i ≥ 0.

Next, we turn to defining and analyzing an appropriate generating function. Note that there are ∏_{m=0}^{ℓ−1} k_m nodes on level ℓ of the tree X. Observe that every i ∈ {0, 1, . . . , ∏_{m=0}^{ℓ−1} k_m − 1} can be uniquely written as

i = i_{ℓ−1} ∏_{m=1}^{ℓ−1} k_m + i_{ℓ−2} ∏_{m=2}^{ℓ−1} k_m + . . . + i_1 k_{ℓ−1} + i_0,   (9)

where i_m ∈ {0, 1, . . . , k_{ℓ−1−m} − 1} for every m ∈ {0, 1, . . . , ℓ − 1}. The interpretation of each digit i_m in this representation is the same as before: consider node i's ancestor on level ℓ − m; the horizontal position of this node amongst its siblings is i_m. To abbreviate notation, we write i = i_{ℓ−1} . . . i_0 for the expression in (9). With this representation of the nodes at level ℓ, we may define the generating function for level ℓ as follows:

A_ℓ(w) := ∑_{t_0=0}^{k_{ℓ−1}−1} · · · ∑_{t_{ℓ−1}=0}^{k_0−1} a_{ℓ, t_{ℓ−1} . . . t_0} w_{ℓ−1}^{t_{ℓ−1}} · · · w_0^{t_0}.

The following lemma is the analog of Lemma 2; we omit its proof, since it is analogous to that of Lemma 2.
Lemma 5.
For every ℓ ∈ {1, . . . , d} we have that

E[A_ℓ(w)] = (1 − q)^{ℓ−1} (1 − p_{d−ℓ}) ∑_{t_0=0}^{k_{ℓ−1}−1} · · · ∑_{t_{ℓ−1}=0}^{k_0−1} b_{ℓ, t_{ℓ−1} . . . t_0} ∏_{m=0}^{ℓ−1} ((1 − p_{d−ℓ+m}) w_m + p_{d−ℓ+m})^{t_m}.

With these tools in place, the remainder of the proof is almost identical to Section 3.4. The inequality (7) is now replaced with

∏_{m=0}^{ℓ*−1} |w*_m|^{t_m} ≤ (2 / (1 − p′))^{k_max ℓ*}.

Subsequently, by the pigeonhole principle there exists i* ∈ {0, 1, . . . , ∏_{m=0}^{ℓ*−1} k_m − 1} such that

|E[Z′_{ℓ*,i*}] − E[Z′′_{ℓ*,i*}]| ≥ exp(−C k_max ℓ*) ≥ exp(−C k_max d),

where the first inequality holds for a large enough constant C that depends only on c and q. The rest of the proof is identical to Section 3.4, showing that T = exp(3C k_max d) traces suffice. The claim follows because the depth of the tree is at most log_{k_min} n.

We begin by first proving Theorem 4 for complete k-ary trees. We will then generalize to arbitrary tree topologies. Importantly, in the AON model we will work directly with the tree traces, as opposed to the subtraces as we did previously. As described in Section 2, we augment each trace Y with additional nodes with 0 labels to form a complete k-ary tree. In what follows, when we say “trace” we mean this augmented trace.

Theorem 5. In the AON model, there exists a finite constant C depending only on q such that exp(Ck log_k n) traces suffice to reconstruct a complete k-ary tree on n nodes w.h.p. (here k ≥ 2).

Proof. We may define A_ℓ(w), the generating function for level ℓ, exactly as in (2). The following lemma is the analog of Lemma 2; we omit its proof, since it is analogous to that of Lemma 2.

Lemma 6.
For every $\ell \in \{1, \ldots, d\}$ we have that
$$\mathbb{E}[A_{\ell}(w)] = (1-q)^{\ell} \sum_{t_0=0}^{k-1} \cdots \sum_{t_{\ell-1}=0}^{k-1} b_{\ell,\, t_{\ell-1} \ldots t_0} \prod_{m=0}^{\ell-1} \left( (1-q)\, w_m + q \right)^{t_m}.$$

With this lemma in place, the remainder of the proof is almost identical to Section 3.4. The polynomial $B(z)$, and hence also $z^{*}$, are as before. Now, we define $w_m^{*} := (z_m^{*} - q)/(1-q)$. The right hand side of (5) becomes $(1-q)^{\ell^{*}}$. The analog of (6) becomes the inequality $|w_m^{*}| \leq 2/(1-q)$; moreover, wherever $p'$ appears in Section 3.4, it is replaced by $q$ here. Altogether, we obtain that there exists $i^{*} \in \left\{0, 1, \ldots, k^{\ell^{*}} - 1\right\}$ such that
$$\left| \mathbb{E}\left[Z'_{\ell^{*}, i^{*}}\right] - \mathbb{E}\left[Z''_{\ell^{*}, i^{*}}\right] \right| \geq \exp\left(-C k \ell^{*}\right) \geq \exp\left(-C k d\right) \geq \exp\left(-C k \log_k n\right),$$
where the first inequality holds for a large enough constant $C$ that depends only on $q$. The rest of the proof is identical to Section 3.4, showing that $T = \exp(3 C k d) = \exp(3 C k \log_k n)$ traces suffice.

Proof of Theorem 4.
Suppose that $X$ is a rooted tree with arbitrary topology and let $k_{\max}$ denote the largest number of children a node in $X$ has. Once we sample a trace $Y$ from $X$, we form an augmented trace similarly to how we do it when $X$ is a $k$-ary tree, except now we add nodes with 0 labels to ensure that each node has $k_{\max}$ children. Thus, each augmented trace is a complete $k_{\max}$-ary tree. Now, let $X'$ denote a $k_{\max}$-ary tree obtained by augmenting $X$ to a $k_{\max}$-ary tree in the same fashion that we augment traces of $X$ to a $k_{\max}$-ary tree.

As before, for each node $i$ on level $\ell$ of $X$, there is a unique representation $i = i_{\ell-1} \ldots i_0$, where $i_m$ is the position of node $i$'s ancestor on level $\ell - m$ among its siblings. Importantly, for every node in $X$, its representation in $X'$ is the same. This fact, together with the augmentation construction, implies that $\mathbb{E}[a_{\ell,\, t_{\ell-1} \ldots t_0}]$ for the node $a_{\ell,\, t_{\ell-1} \ldots t_0}$ in $Y$ is identical to $\mathbb{E}[a_{\ell,\, t_{\ell-1} \ldots t_0}]$ for the node $a_{\ell,\, t_{\ell-1} \ldots t_0}$ in $Y'$, which is a trace sampled from $X'$. Therefore, we can use the procedure presented in Theorem 5 to reconstruct $X'$ w.h.p. using $T = \exp(C k_{\max} d)$ traces sampled from $X$. By taking the appropriate subgraph of $X'$, we can thus reconstruct $X$ as well.

In this work we introduce the notion of a subtrace and demonstrate its utility in analyzing traces produced by the deletion channel in the tree trace reconstruction problem.
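The augmentation used in the proof of Theorem 4 above, appending 0-labeled children until every node has exactly $k_{\max}$ of them, can be sketched as follows. The `Node` class is a hypothetical stand-in, and this sketch simply appends padding children, whereas the construction described in Section 2 also accounts for the positions of the surviving children:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: int                                     # binary node label
    children: List["Node"] = field(default_factory=list)

def augment(node: Node, k_max: int, depth: int) -> None:
    """Pad the tree rooted at `node` so that every node above the leaf
    level has exactly k_max children, appending fresh 0-labeled nodes.
    The result is a complete k_max-ary tree of the given depth."""
    if depth == 0:
        return
    while len(node.children) < k_max:
        node.children.append(Node(0))
    for child in node.children:
        augment(child, k_max, depth - 1)
```

For example, applied with $k_{\max} = 2$ to a depth-2 tree whose root has two children, this yields the complete binary tree on 7 nodes, with the original labels intact and 0 labels on the padding nodes.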
We provide a novel algorithm for the reconstruction of complete $k$-ary trees, which matches the sample complexity of the combinatorial approach of [9], by applying mean-based complex analytic tools to the subtrace. This technique also allows us to reconstruct trees with more general topologies in the TED deletion model, specifically trees where the nodes at every level have the same number of children (with this number varying across levels).

However, many questions remain unanswered; we hope that the ideas introduced here will help address them. In particular, how can we reconstruct, under the TED deletion model, arbitrary trees where all leaves are on the same level? Since the notion of a subtrace is well-defined for such trees, we hope that the proof technique presented here can somehow be generalized to answer this question.

References

[1] Frank Ban, Xi Chen, Adam Freilich, Rocco A. Servedio, and Sandip Sinha. Beyond trace reconstruction: Population recovery from the deletion channel. In Proceedings of the IEEE Annual Symposium on Foundations of Computer Science (FOCS), pages 745–768, 2019.

[2] Frank Ban, Xi Chen, Rocco A. Servedio, and Sandip Sinha. Efficient average-case population recovery in the presence of insertions and deletions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), volume 145 of
LIPIcs, pages 44:1–44:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019.

[3] Tugkan Batu, Sampath Kannan, Sanjeev Khanna, and Andrew McGregor. Reconstructing strings from random traces. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 910–918, 2004.

[4] Joshua Brakensiek, Ray Li, and Bruce Spang. Coded trace reconstruction in a constant number of traces. In Proceedings of the IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2020.

[5] Zachary Chase. New upper bounds for trace reconstruction. Preprint available at https://arxiv.org/abs/2009.03296, 2020.

[6] Zachary Chase. New lower bounds for trace reconstruction. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, to appear, 2021.

[7] Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, and Sandip Sinha. Polynomial-time trace reconstruction in the smoothed complexity model. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2021.

[8] Mahdi Cheraghchi, Ryan Gabrys, Olgica Milenkovic, and Joao Ribeiro. Coded trace reconstruction. IEEE Transactions on Information Theory, 66(10):6084–6103, 2020.

[9] Sami Davies, Miklós Z. Rácz, and Cyrus Rashtchian. Reconstructing trees from traces. The Annals of Applied Probability, to appear, 2021.

[10] Sami Davies, Miklós Z. Rácz, Cyrus Rashtchian, and Benjamin G. Schiffer. Approximate trace reconstruction. Preprint available at https://arxiv.org/abs/2012.06713, 2020.

[11] Anindya De, Ryan O'Donnell, and Rocco A. Servedio. Optimal mean-based algorithms for trace reconstruction. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 1047–1056, 2017.

[12] Anindya De, Ryan O'Donnell, and Rocco A. Servedio. Optimal mean-based algorithms for trace reconstruction. The Annals of Applied Probability, 29(2):851–874, 2019.

[13] Elena Grigorescu, Madhu Sudan, and Minshen Zhu. Limitations of mean-based algorithms for trace reconstruction at small distance. Preprint available at https://arxiv.org/abs/2011.13737, 2020.

[14] Lisa Hartung, Nina Holden, and Yuval Peres. Trace reconstruction with varying deletion probabilities. In Proceedings of the Fifteenth Workshop on Analytic Algorithmics and Combinatorics (ANALCO), pages 54–61, 2018.

[15] Nina Holden and Russell Lyons. Lower bounds for trace reconstruction. Annals of Applied Probability, 30(2):503–525, 2020.

[16] Nina Holden, Robin Pemantle, Yuval Peres, and Alex Zhai. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. Mathematical Statistics and Learning, 2(3):275–309, 2020.

[17] Thomas Holenstein, Michael Mitzenmacher, Rina Panigrahy, and Udi Wieder. Trace reconstruction with constant deletion probability and related results. In
Proceedings of the 19th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 389–398, 2008.

[18] Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, and Soumyabrata Pal. Trace reconstruction: Generalized and parameterized. In Proceedings of the 27th Annual European Symposium on Algorithms (ESA), pages 68:1–68:25, 2019.

[19] Vladimir I. Levenshtein. Efficient reconstruction of sequences.
IEEE Transactions on Information Theory, 47(1):2–22, 2001.

[20] Thomas J. Maranzatto. Tree trace reconstruction: Some results. Thesis, New College of Florida, 2020.

[21] Shyam Narayanan. Population recovery from the deletion channel: Nearly matching trace reconstruction bounds. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2021.

[22] Shyam Narayanan and Michael Ren. Circular trace reconstruction. In