[PDF] Efficient Document Exchange and Error Correcting Codes with Asymmetric Information

Abstract

We study two fundamental problems in communication, Document Exchange (DE) and Error Correcting Code (ECC). In the first problem, two parties hold two strings, and one party tries to learn the other party's string through communication. In the second problem, one party tries to send a message to another party through a noisy channel, by adding some redundant information to protect the message. Two important goals in both problems are to minimize the communication complexity or redundancy, and to design efficient protocols or codes. Both problems have been studied extensively. In this paper we study whether asymmetric partial information can help in these two problems. We focus on the case of Hamming distance/errors, and the asymmetric partial information is modeled by one party having a vector of disjoint subsets S → =( S 1 ,⋯, S t ) of indices and a vector of integers k → =( k 1 ,⋯, k t ) , such that in each S i the Hamming distance/errors is at most k i . We establish both lower bounds and upper bounds in this model, and provide efficient randomized constructions that achieve a min{O( t 2 ),O((loglogn ) 2 )} factor within the optimum, with almost linear running time. We further show a connection between the above document exchange problem and the problem of document exchange under edit distance, and use our techniques to give an efficient randomized protocol with optimal communication complexity and \emph{exponentially} small error for the latter. This improves the previous result by Haeupler \cite{haeupler2018optimal} (FOCS'19) and that by Belazzougui and Zhang \cite{BelazzouguiZ16} (FOCS'16). Our techniques are based on a generalization of the celebrated expander codes by Sipser and Spielman \cite{sipser1996expander}, which may be of independent interests.

Full PDF

aa r X i v : . [ c s . CC ] J u l Eﬃcient Document Exchange and Error Correcting Codes withAsymmetric Information

Kuan Cheng ∗ Xin Li † July 20, 2020

Abstract

We study two fundamental problems in communication, Document Exchange (DE) and ErrorCorrecting Code (ECC). In the ﬁrst problem, two parties hold two strings, and one party triesto learn the other party’s string through communication. In the second problem, one partytries to send a message to another party through a noisy channel, by adding some redundantinformation to protect the message. Two important goals in both problems are to minimize thecommunication complexity or redundancy, and to design eﬃcient protocols or codes.Both problems have been studied extensively. In this paper we study whether asymmetricpartial information can help in these two problems. We focus on the case of Hamming dis-tance/errors, and the asymmetric partial information is modeled by one party having a vector ofdisjoint subsets S = ( S , · · · , S t ) of indices and a vector of integers k = ( k , · · · , k t ), such that ineach S i the Hamming distance/errors is at most k i . To our knowledge, no previous work has stud-ied this problem systematically. We establish both lower bounds and upper bounds in this model,and provide eﬃcient randomized constructions that achieve a min { O ( t ) , O (cid:0) (log log n ) (cid:1) } factorwithin the optimum, with almost linear running time.We further show a connection between the above document exchange problem and the prob-lem of document exchange under edit distance , and use our techniques to give an eﬃcient ran-domized protocol with optimal communication complexity and exponentially small error for thelatter. This improves the previous result by Haeupler [18] (FOCS’19), which has polynomiallylarge error; and that by Belazzougui and Zhang [8] (FOCS’16), which is only optimal for alimited range of parameters. Our techniques are based on a generalization of the celebratedexpander codes by Sipser and Spielman [33], which may be of independent interests. ∗ [email protected]. Department of Computer Science, University of Texas at Austin. Supported by a SimonsInvestigator Award ( † [email protected]. Department of Computer Science, Johns Hopkins University. Supported by NSF Award CCF-1617713 and NSF CAREER Award CCF-1845349. Introduction

Document exchange, ﬁrst introduced and studied by Orlitsky [29] and subsequently named byCormode et. al. [13], is a fundamental problem in communication. Here, two parties Alice and Bobeach holds a string (document) x and y , and the goal is for one party to learn the other party’sstring with the least amount of communication possible. For simplicity, let us assume that both x and y have n bits. If x and y can be arbitrary strings, then it is clear that in the worst case thecommunication needs at least n bits, i.e., sending one party’s string to the other party. However,in practice this is often not the case, and x and y can actually be close in some sense. For example,Alice and Bob may be two uses holding diﬀerent versions of some original document, where x and y are obtained after some edits of a string z . If the number of edits is limited, then it is possiblefor one party to learn the other party’s string with signiﬁcantly less amount of communication. Inthis paper, we focus on the case where the strings have binary alphabet.More generally and formally, the document exchange problem can be described as follows. Aliceand Bob each has an n -bit string x and y , and the distance between x and y , D ( x, y ) is upperbounded by some number k . Here the distance D can be any measure of interests. Now, the ﬁrstgoal here is to minimize the communication complexity as a function of n and k . In addition, it isalso an important goal to keep the protocol eﬃcient , i.e., we would like the communication protocolto run in polynomial time of n .There has been a lot of work on the document exchange problem [29, 6, 7, 1, 13, 27, 35, 22, 23,8, 11, 18, 12]. While Orlitsky [29] established some upper and lower bounds on the communicationcomplexity of general “balanced” measures D ( x, y ), as well as exponential time protocols that canachieve the optimal communication, eﬃcient protocols in subsequent works have been mostly fo-cusing on the two natural cases where D ( x, y ) is either the Hamming distance or the edit distance.In the former, the distance is measured by how many bits in x and y are diﬀerent at the corre-sponding locations, while in the latter the distance ED ( x, y ) is measured by the minimum numberof insertions, deletions, and substitutions to transform one string into another. Both distances aremetrics, and edit distance strictly generalizes Hamming distance.For both Hamming distance and edit distance, it is known that if D ( x, y ) ≤ k , then the optimalcommunication complexity in the document exchange problem is Θ( k log( n/k )), and this can beachieved by a deterministic one-round protocol running in exponential time. The situation of eﬃ-cient protocols however is diﬀerent for these two measures. For Hamming distance, we have eﬃcient,deterministic one-round protocol with optimal communication complexity Θ( k log( n/k )), based onAlgebraic Geometry codes [21]. For edit distance, except for the exponential time deterministicone-round protocol in [29] which achieves optimal communication complexity, for a long time onlyeﬃcient randomized one round protocols with sub-optimal communication complexity are known.These include the work of Irmak et al. [22] with communication complexity O ( k log( nk ) log n ), thework of Jowhari [23] with communication complexity O ( k log n log ∗ n ), the work of Chakraborty etal. [10] with communication complexity O ( k log n ), and the work of Belazzougui and Zhang [8] withcommunication complexity O ( k (log k +log n )). In particular, the protocol in [8] has asymptoticallyoptimal communication complexity for k = 2 O ( √ log n ) , with success probability 1 − / poly ( k log n ).In 2018, Cheng et. al. [11], and Haeupler [18] independently gave an eﬃcient, deterministicone-round protocol with communication complexity O ( k log ( n/k )). Finally, Haeupler [18] gave theﬁrst eﬃcient randomized one-round protocol with optimal communication complexity O ( k log( n/k )).However, his protocol only succeeds with probability 1 − / poly ( n ).Document exchange is closely related to the (even more) fundamental problem of error correctingcodes. The goal of an error correcting code is to ensure that one party can successfully sendinformation to another party, despite errors caused by the communication channel. In this setting,1he ﬁrst party (Alice) runs an encoding algorithm that turns a message of m bits into a codewordof n bits, and sends the codeword to the second party (Bob) through a channel. Bob then triesto recover the message by running a decoding algorithm. Similar to document exchange, there arealso two important goals here. First, one wants to keep n − m (the redundancy of the codeword) tobe as small as possible, or alternatively, to keep m (the message length) to be as large as possible.Second, one needs both the encoding and decoding to be eﬃcient, i.e., run in polynomial time of m . There has been extensive study on error correcting codes, which we will not be able to completelysurvey here. Again, the channel error can have several diﬀerent models, and the most studied areHamming errors and edit errors. For both cases, assuming k is an upper bound on the number oferrors, then it is known that the optimal message length one can achieve (with possibly exponentialtime encoding/decoding) is m = n − Θ( k log( n/k )). For Hamming errors, again we have eﬃcientconstructions matching this bound, based on Algebraic Geometry codes [21]. For edit errors theconstructions are far behind, and for a long time we only have asymptotically optimal constructionsfor the two extreme cases of k = 1 [26] and k = αn for some small constant α > m = n − O ( k log ( n/k )).Cheng et. al. [11] further gave an eﬃcient code with m = n − O ( k log n ), which is optimal for k ≤ n − α where α > syndrome . Given such a code, the syndrome canbe used as the information sent in a document exchange protocol. Conversely, given a one rounddocument exchange protocol, one can use a standard error correcting code on the information sentand use it as the syndrome in a systematic error correcting code.In all previous works, Alice and Bob have symmetric information—they both know that theirstring is within distance D ( x, y ) ≤ k to the other party’s string, or the total number of errors in thereceived codeword is at most k . However, in many practical situations, each party may have someadditional partial information that is not known to the other party. For example, in documentexchange, if Bob has made edits in some speciﬁc parts of the original document, then even withoutcarefully tracking the edits, Bob has some partial information of where the diﬀerences can happen.This information is not necessarily known to Alice. In another situation, suppose Alice sends a longstring to Bob by Internet routing, then this string may be broken into several parts and transmit-ted to Bob through diﬀerent channels. These channels may have diﬀerent behavior and introducediﬀerent numbers of errors. While it is reasonable that both parties know the parameters of allchannels, due to the routing process Alice may not know which channels her parts are sent through.On the other hand, Bob can learn these information by observing the received parts. Thus Bobwill have some partial information about the numbers of errors in speciﬁc parts of the receivedstring, which is not known to Alice. The ﬁst example applies to document exchange and the secondexample applies to error correcting codes. One can now ask the following natural question, whichis the focus of this paper. Question:

Can we use these asymmetric information to reduce the communication complexity indocument exchange or the redundancy in error correcting codes, while still designing eﬃcient pro-tocols or codes?

Towards answering this question, we ﬁrst formally deﬁne our model.2 .1 The Model of Asymmetric Information

In this paper we focus on Hamming distance/Hamming errors in the model of asymmetric infor-mation. To model the asymmetric information, we assume that one party has some additionalinformation of where the diﬀerences/errors can happen. More formally, we use a vector of disjointsubsets S = ( S , · · · , S t ) to indicate the positions where the diﬀerences/errors can happen, and avector of integers k = ( k , · · · , k t ) to indicate the upper bounds on the numbers of diﬀerences/errorsin each set S i . For each S i , let s i denote the size of S i , i.e., s i = | S i | . We also use s to indicatethe vector s = ( s , · · · , s t ). We assume the parameters ( s , k , t ) are known to both parties, and that(without loss of generality) k ≥ k ≥ · · · ≥ k t . Deﬁnition 1.1. ( ( s , k , t ) Asymmetric Document Exchange) There are two parties Alice and Bob.Alice has a string x ∈ { , } n and Bob has a string y ∈ { , } n . Both parties know ( s , k , t ) . Inaddition, Bob knows a vector of disjoint subsets S = ( S , · · · , S t ) where ∀ i, S i ⊆ [ n ] and | S i | = s i .That is, within each set S i , the Hamming distance between x and y is at most k i . One party triesto learn the string of the other party. Deﬁnition 1.2. ( ( s , k , t ) Asymmetric Error Correcting Code) There are two parties Alice andBob. Both parties know ( s , k , t ) . Alice encodes a message of m bits into a codeword of n bits, usinga function Enc : { , } m → { , } n and sends it to Bob. Bob knows a vector of disjoint subsets S = ( S , · · · , S t ) where ∀ i, S i ⊆ [ n ] and | S i | = s i . That is, within each set S i , there are at most k i Hamming errors in the received codeword. Bob uses a function

Dec : { , } n → { , } m to recoverthe message. We require the protocol or code to succeed for every possible vector of disjoint subsets S =( S , · · · , S t ) with | S i | = s i , ∀ i , and for every possible distance/error pattern that is consistent with S = ( S , · · · , S t ) and k = ( k , · · · , k t ).We consider both deterministic and randomized protocols/codes. In the case of randomizedsolutions, we assume that the two parties have shared randomness, as is standard in all previousworks. In the case of error correcting codes, we further assume that the channel errors do notdepend on the shared randomness.Our model is quite general in capturing asymmetric information. A naive solution is to simplyignore the extra information, and apply a document exchange protocol or error correcting codefor k = P ti =1 k i Hamming distance or Hamming errors. However, our goal here is to see if theextra information can be used to design better protocols or codes. Another natural strategy forthe document exchange problem, is for Bob to ﬁrst send the descriptions of S to Alice, and theycan then run a protocol on each set S i . However, this strategy can result in a signiﬁcant amount ofcommunication, e.g., P ti =1 s i log n , which can be even larger than n . In some special situations, aset S i may be a continuous block in the string, and it suﬃces to just send the starting and endingindex, using 2 log n bits. If all sets S i are of this form, then the total number of bits required is2 t log n . Even this number can be large when the number of sets t is large. We also stress thatin our model and all results, each set S i does not need to be a continuous block. A ﬁnal simplestrategy is to try to form a large continuous block which includes several S i ’s, but this can increasethe size of the sets signiﬁcantly and thus also results in a penalty on the communication complexity. Remark 1.3.

In the asymmetric document exchange, it may seem unreasonable to assume thatAlice knows the vectors s , k . However, this is without loss of generality up to a small loss in com-munication complexity and communication rounds. Basically, Bob can ﬁrst send these two vectorsto Alice. This only takes one round and the number of bits sent by Bob is O ( P ti =1 (log k i + log s i )) ,while the number of bits needed to distinguish all possible error patters is at least P ti =1 log (cid:0) s i k i (cid:1) . Theformer is always within a constant factor to (and in most cases smaller than) the latter. elated previous works. While document exchange and error correcting codes with asymmetricinformation are natural questions, to our knowledge they have not been studied systematically. Theonly previous work we found is the work of Belazzougui and Zhang [8], which studies a special caseof our model with t = 1, i.e., Bob’s extra information only has one subset S with | S | = s . They useentirely diﬀerent techniques to give a document exchange protocol with sub-optimal communicationcomplexity O ( k (log s + log(1 /ε ))), where Bob can learn Alice’s string with success probability 1 − ε .However, there are a large body of works on a related topic [3, 25, 24, 36, 2, 5], which studythe problem of source coding/data compression with asymmetric information. In this setting, thedecoder has some prior distribution µ not known to the encoder, and the encoder tries to send aset of items drawn independently from the distribution to the decoder, using the smallest numberof bits as possible. The problem we study here, on the other hand, focuses on error correction.While there are similarities between these two problems, they are also fundamentally diﬀerent. Forexample, all the eﬃcient algorithms in these prior works run in time polynomial in the size of thesupport µ . This is prohibitive for our purpose since this number is already exponentially large.We note that source coding and error correction are the two most important applications ofinformation theory. Thus given the abundant works on source coding/data compression with asym-metric information, we believe a systematic study of document exchange and error correcting codeswith asymmetric information is also an important direction. We provide both lower bounds and upper bounds for document exchange and error correcting codeswith asymmetric information. To simplify the presentation, we ﬁrst deﬁne some quantities. Giventwo vectors s = ( s , · · · , s t ) and k = ( k , · · · , k t ), we deﬁne H ( s , k ) = log (cid:16)Q ti =1 (cid:16)P k i j =0 (cid:0) s i j (cid:1)(cid:17)(cid:17) = P ti =1 log (cid:16)P k i j =0 (cid:0) s i j (cid:1)(cid:17) . Similarly, for two integers s and k with s ≥ k , we deﬁne H ( s, k ) = log (cid:16)P kj =0 (cid:0) sj (cid:1)(cid:17) .Note that if ∀ i, s i ≥ k i and s ≥ k , then H ( s , k ) = Θ( P ti =1 k i log( s i /k i )) and H ( s, k ) =Θ( k log( s/k )). Recall that k = P ti =1 k i and s = P ti =1 s i ≤ n , hence H ( s , k ) ≤ H ( n, k ). We havethe following theorem. Theorem 1.4.

In an ( s , k , t ) asymmetric DE problem, we have • Suppose Alice learns Bob’s string, then any deterministic protocol has communication com-plexity at least H ( n, k ) , and any randomized protocol with success probability ≥ / has com-munication complexity at least H ( n, k ) − . • Suppose Bob learns Alice’s string, then any randomized protocol with success probability ≥ / has communication complexity at least H ( s , k ) − . Furthermore if ∀ i, s i ≥ k i , then any oneround deterministic protocol has communication complexity at least H ( n, k ) . This theorem tells us the following important things: First, Bob’s extra information is onlyuseful for him to learn Alice’s string, but not useful in the other direction. Second, in the caseof a one round protocol for Bob to learn Alice’s string, for a wide range of parameters (i.e., when ∀ i, s i ≥ k i ), Bob’s extra information is only useful in randomized protocols.For upper bounds, we note that there are eﬃcient deterministic protocols to meet the bound H ( n, k ), based on algebraic geometry codes. To meet the bound H ( s , k ), there is also a simpleone round randomized protocol: Alice hashes her string x using a random hash function, and Bobenumerates all possible strings to ﬁnd the one with the correct hash value. It’s easy to see thatthis protocol succeeds if there is no hash collision, which happens with high probability if the hashfunction outputs some O ( H ( s , k )) bits. However, this protocol runs in exponential time, and ourmain result is an eﬃcient protocol that gets close to this bound.4o state our main theorem, we deﬁne another quantity χ ( s, k, t ) ∈ N : ﬁrst partition the interval[2 , n ] into disjoint subintervals { I j = [2 j − , j ) } , starting from j = 1. Then, for every i ∈ [ t ],put s i /k i into the corresponding subinterval. χ ( s, k, t ) is deﬁned to be the number of subintervals I j which contain at least one s i /k i . We now have the following theorem. Theorem 1.5.

In an ( s , k , t ) asymmetric DE problem, suppose that ∀ i, s i ≥ k i . There is an eﬃ-cient randomized one round protocol for Bob to learn Alice’s string, with communication complexity O ((log log n ) H ( s , k )) and error probability − Ω( k t ) + poly ( s ) . The protocol runs in time e O ( n ) . In particular, Corollary 1.6 implies that if t is a constant, then we have a one round protocol withasymptotically optimal communication complexity, while Corollary 1.7 gives a one round protocolwith communication complexity optimal up to an additional (log log n ) factor. Both protocols runin near linear time. We also note that the simple strategy of ignoring the extra information canresult in communication complexity Ω( H ( s , k ) log n ) in the worst case.Similarly, we have both lower bounds and upper bounds for error correcting codes with asym-metric information. The ﬁrst theorem shows that such information is only useful for a randomizedcode. Theorem 1.8.

In an ( s , k , t ) asymmetric ECC problem, if ∀ i, s i = | S i | ≥ k i , then any determinis-tic code must have distance at least k + 1 . In particular, this means m ≤ n − H ( n, k ) . Furthermore,any randomized code with success probability ≥ / must have message length m ≤ n − H ( s , k ) + 1 . Again, a code with randomized encoding and exponential time deterministic decoding canachieve message length m = n − O ( H ( s , k ). We design an eﬃcient code that comes close to this. Theorem 1.9.

In an ( s , k , t ) asymmetric ECC problem, suppose ∀ i, s i ≥ k i . There is an eﬃcientcode with randomized encoding and deterministic decoding, which has message length m = n − O ( χ ( s, k, t ) H ( s , k )) and error probability − Ω( k t ) + poly ( s ) . In particular, the message length can be max { n − O ( t H ( s , k )) , n − O ((log log n ) H ( s , k )) } , and the running time is e O ( n ) . Next we show that we can design eﬃcient document exchange protocols with asymptoticallyoptimal communication complexity in a special case, roughly when s , k are geometric progressions. Theorem 1.10.

There is an eﬃcient randomized one-round protocol for every ( s , k , t ) asymmet-ric DE problem, where s i = k Θ( i ) , k i = max { k/ Θ( i ) , Θ( kt log nk ) } ≤ s i / . The communicationcomplexity is O ( k ) and the error probability is − Ω( k/ log nk ) . We show that the problem of document exchange under edit distance can be reduced to thespecial case above, and thus we obtain the following theorem.5 heorem 1.11.

There is an eﬃcient randomized one-round protocol for the DE problem withedit distance at most k . The communication complexity is O ( k log nk ) and the error probability is min { − Θ( k/ log nk ) , / poly ( n ) } . We also have both lower bounds and upper bounds for document exchange where both partieshave some asymmetric partial information, represented as a vector of disjoint subsets. For theclarity of presentation we omit the results here, and refer the reader to Section 8 for details.

Our lower bounds follow from relatively simple information theoretic arguments, so here we onlyprovide an informal outline of our protocols. We start with the asymmetric document exchange forHamming distance. Recall that the asymmetric information is in the form of S = ( S , · · · , S t ) and k = ( k , · · · , k t ), where ∀ i, | S i | = s i and the Hamming distance within S i is at most k i . We assume ∀ i, s i ≥ k i , and without loss of generality that k ≥ k · · · ≥ k t . The protocol for one set.

Our starting point is the simplest case where t = 1, i.e. there is onlyone set S of size s and the Hamming distance in S is at most k . In this case our goal is to give aneﬃcient one round protocol with communication complexity O ( k log sk ). If s = n then this can beachieved by using a systematic algebraic geometry code or an expander code [32]. We will use thelatter and we brieﬂy review the application of expander codes in document exchange.To run the protocol, the two parties choose a bipartite expander graph G : [ n ] × [ d ] → [ m ].Alice associates her string x with the n vertices on the left, and computes a string z of length m asfollows: For every i ∈ [ m ], let z i = L j ∈ Γ − ( i ) x j , where Γ − ( i ) is the set of neighbors of the rightvertex i in the expander. The string z consists of a sequence of parity checks of x , and is then sentto Bob.To recover x , Bob starts out with ˜ x = y as his current version of x , and maintains anotherstring z ′ ∈ { , } m using the same approach as above, except replacing the string x by ˜ x , i.e., z ′ consists of a sequence of parity checks of ˜ x . z and z ′ will diﬀer in several coordinates, and Bob willgradually modify ˜ x into x by ﬂipping some bits in ˜ x according to the parity checks. This processis known as belief propagation, and works as follows. Bob keeps ﬁnding a bit in ˜ x such that byﬂipping this bit, the Hamming distance between z ′ and z decreases by at least one. Bob ﬂips thisbit and updates ˜ x and z ′ correspondingly. Bob stops when z ′ = z , at which point x = ˜ x and hehas successfully recovered x .For the analysis, we use the set R ⊆ [ n ] to denote the coordinates where x and ˜ x are diﬀerent.We say the i ’th parity check bit is satisﬁed if z i = z ′ i , and unsatisﬁed otherwise. Let the numberof satisﬁed and unsatisﬁed checks in Γ( R ) (the neighbors of R ) be s and u . Assume the graph hasgood expansion, i.e. | Γ( R ) | = s + u ≥ . d | R | , and note that in Γ( R ), each satisﬁed check has atleast two neighbors in R . Thus 2 s + u ≤ d | R | . By the two inequalities, we deduce u ≥ . d | R | and thus at least one left vertex has more unsatisﬁed parity checks as neighbors than satisﬁedparity checks, and Bob can ﬂip this bit. The analysis holds as long as the expansion of the set R is guaranteed. Note that the number of unsatisﬁed checks is strictly decreasing in the process,thus | R | can never be more than 1 . k , since otherwise this will induce more than dk unsatisﬁedchecks, but at the beginning there are at most dk unsatisﬁed checks. Therefore, we only need toguarantee the expansion of all R ⊆ [ n ] with | R | ≤ . k , and a random graph with m = O ( k log nk )and d = O (log nk ) satisﬁes this property with high probability.Going back to the case where s < n , the ﬁrst issue is that we can’t aﬀord to use an expanderwhich has good expansion for all subsets R as before, since this will make m = Ω( k log nk ). To ﬁx6his, we instead just require the expansion to hold for all subsets R ⊆ S with | R | ≤ . k . Now, arandom graph with m = O ( k log sk ) and d = O (log sk ) satisﬁes this property with high probability,and both parties can generate the same expander by using the shared randomness. Similarly, whenrecovering x Bob will always look for a bit in S to ﬂip. The analysis is now similar to the standardcase and this gives the protocol for the case of t = 1. The protocol for two sets.

We now consider the case with t >

1. Our goal is to design aneﬃcient one round protocol with communication complexity close to H ( s , k ).The ﬁrst idea may be to take the union of all S i , i ∈ [ t ] as one set S , and the Hamming distancein S is at most k = P i ∈ [ t ] k i . Now we can use the protocol for t = 1 described before. However,in this case the communication complexity will be O ( H ( s, k )), which may not be close to H ( s , k ).For example, consider the case where t = 2, k = n . , s = 10 n . , k = 10 , s = 0 . n . A directcomputation indicates H ( s, k ) = Ω( H ( s , k ) log n ) = ω ( H ( s , k )). It also appears hard to improve thisif we just use a single expander graph, since the decoding requires good expansion for all possiblesubsets of errors during the belief propagation, which can potentially be all possible subsets of sizeΩ( k ). This forces the right hand size of the graph to be H ( s, k ).To overcome this diﬃculty, our idea is to use more than one expander codes. Towards this, ourmain observation is that, the issue with the above example is due to the following fact: for some i ∈ [ t ], k i is large while s i is small, but for some other i , k i is small while s i is large. Indeed, in thecase of t = 2, there are two good situations where H ( s, k ) = O ( H ( s , k )):1. k and k are roughly the same, i.e., k = Θ( k ). In this case we have H ( s, k ) = Θ(( k + k ) log s + s k + k ) = Θ( k log s k + k log s k ) = Θ( H ( s , k )).2. log s k and log s k are roughly the same, i.e., log s k = Θ(log s k ). In this case we also have H ( s, k ) = Θ( H ( s , k )).Our protocol will exploit both of these good cases. We ﬁrst illustrate this with a protocol forthe case of t = 2. Our idea is to reduce the number k (recall that k ≥ k ) to be roughly thesame as k (which is unnecessary if k and k are already roughly the same at the beginning). Inother words, we will ﬁrst reduce the Hamming distance in S from k to at most ck , if k > ck for some constant c >

1. It is not immediately clear why this is feasible, since Alice does not knowthe subset S . Additionally, we need to make sure the communication complexity of this step isnot too large.We achieve this by using an expander code based on a bipartite expander G : [ n ] × [ d ] → [ m ]such that for all sets R ⊆ S with | R | ∈ [ ck , . k ], the set R has good expansion, i.e., | Γ( R ) | ≥ . d | R | . The expander is again generated by shared randomness, and we show that we can choose d = O (log s k ) , m = O ( k log s k ) and the graph satisﬁes the property with high probability. Alicewill again compute the parity checks z and send it to Bob.Now Bob will apply the same method as before: start with ˜ x = y and keep ﬁnding a bit in S with more unsatisﬁed parity checks as neighbors than satisﬁed parity checks. Bob ﬂips this bit andcontinues doing this until no such bit can be found. Since the number of unsatisﬁed parity checkskeeps decreasing, the process will end in a ﬁnite number of steps. We claim that when it ends, theHamming distance in S is at most ck . This eﬀectively reduces the Hamming distance in S .The main issue in the analysis here is that the diﬀerent bits between x and y are not entirely in S , and this may cause problems in belief propagation. However, our observation is that when k is much larger than k , the eﬀect of k can mostly be ignored. More speciﬁcally, let R be the set ofleft vertices which correspond to the diﬀerent bits between x and y in S , and R be the set of leftvertices which correspond to the diﬀerent bits in S . Thus | R | ≤ k . Let the number of satisﬁed and7nsatisﬁed checks in Γ( R ) be s and u . As long as | R | ∈ [ ck , . k ], we have | Γ( R ) | = s + u ≥ . d | R | ,and 2 s + u ≤ d | R | + d | R | ≤ (1+ c ) | R | . Combining these inequalities, we can still deduce u ≥ . d | R | ,by setting c = 10. Hence there must exist a bit in S to ﬂip. Since the number of unsatisﬁed checksdecreases strictly, the size | R | in the process can never be larger than 1 . k . This is becauseotherwise there will be at least 1 . dk unsatisﬁed checks, while at the beginning there are onlyat most (1 + 1 /c ) dk = 1 . dk unsatisﬁed checks. Thus when this process stops, we must have | R | ≤ ck . At this point, we can use the protocol for one set together with another expander graphto ﬁnish the job, by considering the set S = S ∪ S which has Hamming distance at most ( c + 1) k .The total communication complexity is O ( k log s k ) + O ( k log s + s k ) = O ( H ( s , k )). The protocol for arbitrary t . We now generalize the above protocol to arbitrary t . Recall that k ≥ k ≥ · · · ≥ k t . Our idea is to use the above protocol of reducing Hamming distance repeatedly,while going through the index from 1 to t . More formally, we use i ′ to denote the current index and k ′ to denote an upper bound of the Hamming distance in ∪ j ∈ [ i ′ ] S j after possible steps of reducingdistance. We start with i ′ = 0 , k ′ = 0 and repeat the following: ﬁnd the ﬁrst index i > i ′ s.t. thecurrent Hamming distance in ∪ j ∈ [ i ] S j is much larger than the Hamming distance in ∪ tj = i +1 S j , i.e., k ′ + i X j = i ′ +1 k j > c t X j = i +1 k j = k ′′ . (1)Then we reduce the Hamming distance in ∪ j ∈ [ i ] S i to at most k ′′ by using the two set protocoldescribed before, regarding ∪ j ∈ [ i ] S j as one set and ∪ tj = i +1 S j as the other set. We now update k ′ = k ′′ , i ′ = i and continue the process. Finally, the Hamming distance in S = ∪ j ∈ [ t ] S i will bereduced to at most ( c + 1) k t , and we apply the one set protocol for S to ﬁnish the job.The correctness follows from the correctness of the one set protocol and the two set protocol.The main thing left is to bound the communication complexity. Note that except the ﬁrst iteration,in each subsequent iteration i ′ will be updated to i ′ + 1. Thus the number of bits Alice sends in thisstep is m = O (cid:16) ( k ′ + k i ) log P j ∈ [ i ] s j k ′ + k i (cid:17) . We show that this is always O ( t H ( s , k )) by using the boundon k ′ , the fact that k ≥ k ≥ · · · ≥ k t , and k i ≤ s i / , ∀ i ∈ [ t ]. Thus the total communicationcomplexity is O ( t H ( s , k )). Note that this is a one round protocol since only Alice sends outinformation.Finally, we can get further improvement by grouping some sets together. Speciﬁcally, we dividethe interval [2 , n ] into disjoint subintervals I j = [2 j − , j ) , j = 1 , . . . , O (log log n ) and put eachsubset S i into one interval according to the number s i /k i . Whenever two subsets S i and S j arein the same interval, we have log( s i /k i ) = Θ(log( s j /k j )) and thus we can consider S i ∪ S j as oneset with Hamming distance k i + k j , without changing the communication complexity much. Now,taking the union of all subsets in the same interval to be one subset reduces the number of subsetsto χ ( s, k, t ), and applying our protocol results in communication complexity O ( χ ( s, k, t ) H ( s , k )). ECC with asymmetric information.

The protocol for document exchange can be used toconstruct an error correcting code. We do this by ﬁrst estimating the length of the redundantinformation. Let m be the communication complexity of the ( s , k , t ) DE protocol for messagelength n . We choose an asymptotically good code C with message length m and codeword length n , which corrects k errors. The actual message length of our code will be n − n . On input message x , we run Alice’s DE protocol on x ◦ where = 0 n to get z ∈ { , } m . Then we encode z by C and the ﬁnal codeword is x ◦ C ( z ). To decode, one ﬁrst recovers z by running the decoding8lgorithm of C on the part C ( z ). Then we run Bob’s DE protocol using z , and by replacing the C ( z ) part with 0 n . The correctness follows from the code C and the DE protocol. We now describe our protocol for document exchange under edit distance, and show a connectionto the problem of document exchange under Hamming distance with asymmetric information.On a high level, our protocol follows the leveled structure used in several previous works [22,11, 18]. The protocol proceeds in L = O (log( nk )) levels where in each level, Alice sends a sketch ofher string x with O ( k ) bits to Bob. Bob then uses all the sketches and his string y to recover x .On Alice’s side, in the ﬁrst level she divides her string into Θ( k ) blocks where each block hassize O ( nk ). In each subsequent level, every block from the previous level is divided evenly into twoblocks, and this ends when the block size becomes O (log nk ), which takes O (log( nk )) levels. In eachlevel, Alice applies a diﬀerent random hash function to every block using the shared randomness,and computes a sketch based on the hash values. On Bob’s side, his recovering process also proceedsin L levels, where in each level Bob maintains a string ˜ x which is Bob’s current version of Alice’sstring x . Speciﬁcally, in each level Bob also applies the same hash functions to the blocks of ˜ x toget the hash values, then he uses this level’s sketch to recover the correct hash values of Alice’sbocks. Bob will then ﬁnd the blocks in ˜ x which have inconsistent hash values with Alice’s blocks,and update these blocks using his string y by computing a non overlapping matching between y ’s blocks and the corresponding hash values. An important property of the protocol is that ineach level, the number of diﬀerent blocks between x and ˜ x is always bounded by O ( k ) with highprobability. This ensures that Alice can send a short sketch to Bob for him to recover the correcthash values of all blocks.To ensure that Alice’s sketch in each level has length O ( k ), there are several non trivial issues.First, every hash function needs to have only O (1) bits of output, as in [18]. Second, even so,the general task of recovering s hash values with O ( k ) errors needs to use a sketch of size at leastlog (cid:0) sk (cid:1) = Ω( k log sk ), where s is the number of blocks in the current level. This can be as large asΩ( k log nk ) when s becomes n Ω(1) , and thus will be problematic. To ﬁx this issue, [18] uses a morecareful analysis called “t-witness” to show that in each level, the total number of possible errorpatterns is 2 O ( k ) with high probability, instead of (cid:0) sk (cid:1) . Thus, in theory one can simply use anotherrandom hash function with O ( k ) bits of output to distinguish all error patterns, and this bringsthe sketch size back to O ( k ). However, simply doing this will result in an exponential running timesince it involves exhaustive search. Thus, [18] needs to ﬁrst randomly partition the blocks into bins,such that with high probability each bin has O (log n ) hash errors. The exhaustive search in eachbin now takes poly ( n ) time. Unfortunately, this also increases the error probability from 2 − Ω( k ) to1 / poly ( n ).In our protocol, we instead replace the approach of random partitioning and exhaustive searchin [18] by a direct eﬃcient approach, thus improving the error probability to be exponentially small.We achieve this by establishing a connection to the problem of document exchange under Hammingdistance with asymmetric information, as follows.Intuitively, in Bob’s process of recovering the string x , in each level Bob keeps track of thepositions of the possible blocks where his version ˜ x and x may be diﬀerent (we call these blocksbad). More speciﬁcally, recall that we can show in each level, with high probability there are atmost O ( k ) bad blocks. In the next level the number of these blocks will at most double due tosplitting, however since we use random hash functions with O (1) output bits, we can show that inthe next level with high probability Bob will detect O ( k ) bad blocks and update them. Some ofthe updated blocks may still be bad, but Bob knows the positions of all updated blocks, and he9lso knows that there are at most O ( k ) bad blocks in them after the update. Now, suppose theseupdates happen in level j , and Bob is now in level i > j . Then the O ( k ) updated blocks will splitinto O ( k i − j ) smaller blocks. If any of these smaller blocks is bad and it remains undetected sofar, then it must have gone through j − i diﬀerent hash functions. If we choose all hash functionsindependently, then the probability that this happens is 2 − c ( i − j ) for some constant c . By choosingthe number of output bits of the hash functions to be a large enough constant, we know thatthe expected number of smaller bad blocks that remain undetected so far is O ( k/ i − j ). With alittle extra eﬀort, we can show that with high probability the number of these blocks is at most k i − j = max { k/ log nk , k/ i − j } , and Bob knows that these blocks are inside the subset S i − j withsize O ( k i − j ), which stems from the O ( k ) updated blocks in level j . In other words, this gives aforest with the O ( k ) updated blocks in level j being the roots, and the at most k i − j bad blocks areamong the | S i − j | = O ( k i − j ) leaves.Note that the bad blocks in level i can come from the updated blocks in all previous levels, thuswe get a vector S = ( S , · · · , S i − ) and a vector k = ( k , · · · , k i − ). Furthermore in this process,whenever a bad block stemming from some level j gets detected and updated in a later level j ′ , thisnew block in level j ′ will become a new root and all its descendents are removed from the set S i − j and put into the set S i − j ′ . This ensures that the ﬁnal subsets ( S , · · · , S i − ) are disjoint. Finally,only Bob knows the sets ( S , · · · , S i − ), but both parties know ( s = | S | , · · · , s i − = | S i − | ) and( k , · · · , k i − ). Thus, we have reduced the problem of sending the sketch in level i to the problemof document exchange under Hamming distance with asymmetric information. We now give our protocol for document exchange with asymmetric information, in the specialsetting described above. Recall that we have s i = O ( k i ) , k i = max { k/ i − , k/ log nk } , i ∈ [ t ] , t = O (log nk ). One can compute H ( s , k ) = Θ( k ) here, so our protocol for the general setting willresult in sub-optimal communication complexity. We give a diﬀerent protocol here, which uses justone expander graph instead of a sequence of expander graphs.The expander graph G : [ n ] × [ d ] → [ m ] is generated by the shared randomness, with m = O ( k )and the following expansion property: for every R ⊆ ∪ ti =1 S i where | R | ∈ [ k/ log nk , O ( k )] and ∀ i ∈ [ t ] , | R ∩ S i | ≤ k i , we have | Γ( R ) | ≥ . d | R | . Limiting the expansion to restricted setsrather than all sets R with | R | ∈ [ k/ log nk , O ( k )] is the key to reduce the number of right verticesfrom Ω( k log nk ) to O ( k ). Indeed, using a careful analysis of probabilities, we show that a randombipartite graph with constant d and m = O ( k ) satisﬁes this property with high probability. Themain intuition is that the sequence { s i , i ∈ [ t ] } roughly increases exponentially, while the sequence { k i , i ∈ [ t ] } roughly decreases exponentially.Using this expander Alice sends her parity checks to Bob, and Bob again runs a belief propa-gation algorithm. The purpose of this phase is to reduce the total Hamming distance between x and ˜ x (Bob’s current version of x , starting with ˜ x = y ) to at most k/ log nk . However, the beliefpropagation has tricky issues here, as the standard approach may ﬂip much more than 20 k i bits in S i . This can result in a subset R ⊆ [ n ] which does not have good expansion, thus ruining the wholeprocess. To ﬁx this, we prohibit the algorithm from ﬂipping more than 20 k i bits in S i for each i .This is done by keeping track of the number of already ﬂipped bits in each S i , and for any i if thisnumber reaches 19 k i , then subsequently in S i the algorithm will only ﬂip bits that are previouslyﬂipped.To show that this indeed works, at each step of the belief propagation, let R ⊆ ∪ ti =1 S i standfor the set of indices where x and ˜ x have diﬀerent bits, and let R ′ stand for R restricted to theindices which we can ﬂip (due to our modiﬁcation). Thus R ′ always has good expansion. Our ﬁrst10bservation is that at any time, | R ′ | ≥ . | R | . This is because R ′ is diﬀerent from R only if forsome S i , the number of bits already ﬂipped is at least 19 k i . However originally there are at most k i errors in S i , so we have introduced at least 18 k i new errors. This means ∀ i, | R ′ ∩ S i | ≥ . | R ∩ S i | ,and thus | R ′ | ≥ . | R | . Now let ( s ′ , u ′ ) and ( s , u ) be the number of satisﬁed and unsatisﬁed checksin Γ( R ′ ) and Γ( R ) respectively. We know s ′ + u ′ ≥ . d | R ′ | . Also, again by the fact that eachsatisﬁed check in Γ( R ) has at least two neighbors in R , we have 2 s ′ + u ′ ≤ s + u ≤ d | R | ≤ d | R ′ | .From these two inequalities we can still deduce that u ′ ≥ . d | R ′ | , thus Bob can ﬁnd a bit in R ′ toﬂip.When this process stops, the Hamming distance between x and ˜ x is at most k/ log nk . We cannow use a deterministic document exchange protocol for Bob to recover x . The communicationcomplexity is O (( k/ log nk ) log nk ) = O ( k ). The only error probability here comes from the generationof the expander graph, which is 2 − Ω( k/ log nk ) . We also show that the other errors in the protocol foredit distance is 2 − Θ( k/ log nk ) . Thus the total error of the protocol for edit distance is 2 − Θ( k/ log nk ) .When k < log n , we can switch to the protocol in [18] which has error 1 / poly ( n ). In this paper we initiated a systematic study of document exchange and error correcting codeswith asymmetric information. While we provided both lower bounds and upper bounds, as wellas eﬃcient randomized constructions that are close to optimal, there are still many interestingproblems left. We list some below.

Question 1:

The most obvious open problem is to achieve optimal communication complexity (i.e., H ( s , k )) for a one round randomized protocol. Two related questions are to reduce the errorprobability of the randomized protocol, and to study the case where the condition ∀ i, s i ≥ k i does not hold. For example, is there a better deterministic protocol for the latter case? Question 2:

A better understanding of the problem in the case of two sided asymmetric infor-mation. The results in this paper only study the case of two sided asymmetric informationwhere s A + s B ≤ n , i.e., the subsets from both parties can be disjoint in the worst case. Whathappens when s A + s B > n ? In this case the subsets from both parties are guaranteed tooverlap, and the situation becomes more complicated. Question 3:

Two round deterministic protocol. We showed that for any one round deterministicprotocol, the asymmetric information is not useful. However, by a result of Orlitsky [28], thereexists a two round exponential time deterministic protocol with communication complexity O ( H ( s , k ) + log n ). The idea is that Bob sends a description of an appropriate hash function toAlice in the ﬁrst round, and Alice sends the hash value of her string x in the second round. Theexponential running time comes from both the selection of hash functions and the recoveringof x using the hash value. It is an interesting open problem to see if we can design eﬃcientprotocols matching this bound. Our result suggests a way to approximate this: Bob sendsa description of a sequence of appropriate expanders in the ﬁrst round, and Alice sends theparity checks of her string x in the second round. Using our algorithm, the recovering of x in the second round is already eﬃcient (in fact nearly linear time), however the ﬁrst step ofselecting the expanders still requires exponential time. Question 4:

Optimal deterministic document exchange under edit distance. Our results also bringsome hope to obtain an optimal deterministic document exchange protocol under edit distance.Especially, we have replaced the decoding by exhaustive search approach in [18] by an eﬃcientdecoding algorithm. However, how to appropriate pick a hash function remains a problem. We11lso note that reducing the error probability is the ﬁrst step towards a deterministic protocol,since if the error probability is small enough, then by a simple union bound there exists anon-uniform deterministic protocol that runs in polynomial time.

Paper Organization

The rest of the paper is organized as follows. In Section 3 we introduce some basic technical tools.In Section 4 we show lower bounds for asymmetric DE in the general setting. In Section 5 wegive our protocol for asymmetric DE in the general setting. In Section 6 we give our protocol forasymmetric DE in a special setting. In Section 7 we give our protocol for DE under edit distanceby using the protocol in the previous section. In Section 8 we generalize our results and give lowerbounds and protocols for asymmetric DE with two sided information.

We will use the following well known parity check computation based on bipartite expander graphs.

Construction 3.1 (Expander Code Encoding [32]) . Let

Γ : [ n ] × [ d ] → [ m ] be a bipartite graphwith n left vertices, m right vertices, left degree d . The encoding of the Γ -expander code, on inputmessage x ∈ { , } n , is computed as x ◦ z, where z ∈ { , } m , z [ i ] = L j ∈ Γ − ( i ) x [ j ] , i ∈ [ m ] . Deﬁnition 3.2 ([17] ) . A bipartite graph with n left vertices, m right vertices and left degree d isa ( k, a ) expander if for every set of left vertices S ⊆ [ n ] of size k , we have | Γ( S ) | > ak . It is a ( ≤ k max , a ) expander if it is a ( k, a ) expander for all k ≤ k max . Here ∀ x ∈ [ n ], Γ( x ) outputs the set of all neighbours of x . It is also a set function which isdeﬁned accordingly. Also ∀ x ∈ [ n ] , y ∈ [ d ], the function Γ : [ n ] × [ d ] → [ m ] is such that Γ( x, y ) isthe y -th neighbour of x . Theorem 3.3 ([17] ) . For all constants α > , for every n ∈ N , k max ≤ n , and ǫ > , thereexists an explicit ( ≤ k max , (1 − ǫ ) d ) expander with n left vertices, m right vertices, left degree d = O ((log n )(log k max ) /ǫ ) /α and m ≤ d k α max . Here d is a power of . The explicitness here means, given a left node, and an edge, the induced right node computedfound in time O (log n + log d ). Theorem 3.4 (Classic belief propagation for decoding [32]) . Let

Γ : [ n ] × [ d ] → [ m ] be a ( ≤ k, / d ) bipartite graph with left degree d l , right degree d r . Let y be an n -bit string whose distance from acodeword x is at most k/ . Then a repeated application of the following decoding algorithm to y will return x in time O ( d l d r m ) .Decoding algorithm: Upon receiving the input n -bit string y , as long as there exists a variablesuch that most of its neighbouring constraints are not satisﬁed, ﬂip it. Theorem 3.5 ([21] [14] [31] Systematic Algebraic Geometry Code) . There exists an explicit con-struction of algebraic geometry linear ( n, m, d ) q -code with d + m ≥ n − n √ q − . , q = ⌈ nd ⌉ , polynomial-time decoding when the number of errors is less than half of the distance. Here n, q should be atleast some ﬁxed constants. oreover for every message x ∈ F mq , the codeword is x ◦ z for some redundancy z ∈ F n − mq . Inother words, the code is systematic. A distribution X over Σ n is k -wise independent if for any k variables in X , their marginal distribu-tion is uniform. Theorem 3.6.

There exists an explicit construction of κ -wise independence generator g : { , } s →{ , } n , where s = O ( κ log nκ ) .Proof. Let C ⊥ be an algebraic geometry linear ( n, m, d ) q -code constructed by Theorem 3.5, with d = κ + 1, m ≥ n − O ( κ ), q = poly ( n/d ) = poly ( n/κ ).Consider the dual code C = ( C ⊥ ) ⊥ . By duality of codes, its message length is n − m = O ( κ ).Let the generator be g ( · ) = C ( · ), i.e. the encoding function of C . Note that the seed length in bitsis s = ( n − m ) log q = O ( κ log nκ ).We claim that any κ columns of the generating matrix M ∈ F m × nq of C , are linearly independent.Since otherwise there will be a codeword in C ⊥ , which has hamming distance ≤ κ = d − g ( u ) = uM is κ -wise independent, when u is uniform. For any κ symbols inthe output, the corresponding κ columns of M are linearly independent. So the matrix M K , K = { indices of these κ columns } , formed by these columns has rank κ . Thus there are κ rows which arelinearly independent. Hence each linear combination of these κ rows in M K can uniquely representone vector in the space of κ symbols. So ( uM ) K is uniform.To see this is an explicit construction, note that the encoding of C ⊥ is explicit. So the encodingof each e i ∈ F mq , i ∈ [ m ], where e i is i -th unit vector, is explicit. Thus the encoding matrix M ⊥ ,whose i -th row is C ⊥ ( e i ), can be computed explicitly. The corresponding parity check matrix,which is actually M the encoding matrix of its dual code C , can be computed explicitly using M ⊥ by standard procedures. So the construction is explicit.Random variables X , X , . . . , X n ∈ { , } n are ε -almost κ -wise independent in max norm if ∀ i , i , . . . , i κ ∈ [ n ] , ∀ x ∈ { , } κ , | Pr[ X i ◦ X i ◦ · · · ◦ X i κ = x ] − − κ | ≤ ε. A function g : { , } d → { , } n is an ε -almost κ -wise independence generator in max norm if g ( U ) = X = X ◦ · · · X n are ε -almost κ -wise independent in max norm. Unless stated otherwise,we only consider max norm in the following context. Theorem 3.7 ( ε -almost κ -wise independence generator [4]) . There exists an explicit constructions.t. for every n, κ ∈ N , ε > , it computes an ε -almost κ -wise independence generator g : { , } d →{ , } n , where d = O (log κ log nε ) .The construction is highly explicit in the sense that, ∀ i ∈ [ n ] , the i -th output bit can be computedin time ˜ O (log n + log ε ) given the seed and i . (The ˜ O here hides some log log n , log log(1 /ε ) factors) Theorem 3.8 (General moment inequality for k -wise independence) . Let X i ∈ { , } , i = 1 , . . . , n ,be a sequence of k -wise independent random variables. Let X = P ni =1 X i .For every ε > , Pr[ X ≥ (1 + ε ) E X ] ≤ (cid:18)

11 + ε (cid:19) k . .3 LCS and Matching Consider two strings x ∈ { , } pn , y ∈ { , } n ′ , hash functions h j : { , } p → { , } q , j ∈ [ n ]. Amonotone matching w = (( ρ , ρ ′ ) , . . . , ( ρ | w | , ρ ′| w | )) between x, y under h j , j ∈ [ n ] is s.t. for every i ∈ [ | w | ], h ρ i ( x [ ρ i , ρ i + p )) = h ρ i ( y [ ρ ′ i , ρ ′ i + p )), where ρ i ∈ [ pn ] , ρ ′ i ∈ [ n ′ ]. Also we consider x asbeing cut into length p blocks and each ρ i has to be a starting position of a block in x . Lemma 3.9.

For any x ∈ { , } pn , y ∈ { , } n ′ , k ∈ N , S ⊆ [ n ] , | S | = s , h j : { , } p → { , } q , j ∈ [ n ] , the number of matchings w = (( ρ , ρ ′ ) , . . . , ( ρ | w | , ρ ′| w | )) between x S and y under h j , j ∈ [ n ] s.t. | ρ ′ − ρ | + | ( ρ ′ − ρ ′ ) − ( ρ − ρ ) | + · · · + | ( ρ ′| w | − ρ ′| w |− ) − ( ρ | w | − ρ | w |− ) | ≤ k, is at most s + k (log k + s − k +log e ) . Here x S refers to the sequence of blocks of x . The j -th block of it is x S [ j ] ∈ { , } p , j ∈ [ s ]. Weuse pos ( j ) to refer to the starting position of block x S [ j ] in x . Proof.

Let’s ﬁrst consider the number of matchings with length ˜ s ∈ { , , , . . . , s } . The number ofpossible ρ , . . . , ρ ˜ s is (cid:0) s ˜ s (cid:1) .Assume | ( ρ ′ j − ρ ′ j − ) − ( ρ j − ρ j − ) | = k j , j = 1 , . . . , ˜ s , ρ ′ − ρ = 0.For a sequence of ﬁxed ρ , . . . , ρ ˜ s , the total number of possible matchings w s.t. | ρ ′ − ρ | + | ( ρ ′ − ρ ′ ) − ( ρ − ρ ) | + · · · + | ( ρ ′ ˜ s − ρ ′ ˜ s − ) − ( ρ ˜ s − ρ ˜ s − ) | = ˜ s X j =1 k j ≤ k, is at most 2 ˜ s (cid:18) k + ˜ s − s − (cid:19) = 2 ˜ s (cid:18) k + ˜ s − k (cid:19) ≤ ˜ s + k (log k +˜ s − k +log e ) ≤ s + k (log k + s − k +log e ) , Since each sequence of ρ ′ j , j ∈ [˜ s ] one-on-one corresponds to a sequence of k j ∈ N , j ∈ [˜ s ] and thesigns of ( ρ ′ j − ρ ′ j ) − ( ρ j − ρ j ) , j = 1 , . . . , ˜ s .So the overall number of possibilities is at most s X ˜ s =0 (cid:18) s ˜ s (cid:19) s + k (log k + s − k +log e ) ≤ s + k (log k + s − k +log e ) . Lemma 3.10 (DP for LCS within k edit operations) . There is an algorithm, on input x ∈{ , } pn , y ∈ { , } n ′ = O ( np ) , S ⊆ [ n ] , k = ED ( x, y ) , hash functions h i : { , } p → { , } q , i ∈ [ n ] ,outputs a monotone matching w = (( u , u ′ ) , . . . , ( u | w | , u ′| w | )) between x S and y under h i , i ∈ [ n ] s.t. | w | ≥ | S | − k , and | u ′ − u | + | ( u ′ − u ′ ) − ( u − u ) | + · · · + | ( u ′| w | − u ′| w |− ) − ( u | w | − u | w |− ) | ≤ k. Proof.

We present a dynamic programming to compute the maximum matching.For every j ∈ [ | S | ] , j ′ ∈ [ n ′ ], let f ( j, j ′ , l ) be the maximum matching w = (( u , u ′ ) , . . . , ( u | w | , u ′| w | ))between x S [1 , j ] and y [1 , j ′ ] under h i , i ∈ [ n ], s.t. • g ( w ) = | u ′ − u | + | ( u ′ − u ′ ) − ( u − u ) | + · · · + | ( u ′| w | − u ′| w |− ) − ( u | w | − u | w |− ) | ≤ l ;14 The last match matches x S [ j ] to y [ j ′ , j ′ + p ).If there is no such matching, then f ( j, j ′ , l ) is null and g ( ∅ ) = −∞ .We compute f ( j, j ′ , l ) as follows.To initialize, we let f (0 , ,

0) = ∅ .For every j ∈ [ | S | ] , j ′ ∈ [ n ′ ] , l ≤ k ,1. If h j ( x [ j ]) = h j ( y [ j ′ , j ′ + p ) ), then f ( j, j ′ , l ) is null and g ( null ) = ∞ ;2. Pick the maximum matching w in M = { f ( j , j ′ , l ) | j < j, j ′ < j ′ , l ≤ l, g ( f ( j , j ′ , l )) + | pos ( j ) − pos ( j ) − ( j ′ − j ′ ) | ≤ l } ;3. Let f ( j, j ′ , l ) = w ∪ { ( pos ( j ) , j ′ ) } . Finally we use an exhaustive search to ﬁnd the maximum matching among f ( j, j ′ , k ) , j ∈ [ n ] , j ′ ∈ [ n ′ ] and output.Next we prove the correctness.We ﬁrst claim that, there exists a matching w ∗ of length | S | − k between x S and y which has g ( w ∗ ) ≤ k . This is because we can match each i ∈ S to exactly the same entry after the k editoperations to get w ∗ . Here g ( w ∗ ) ≤ k is because otherwise the edit distance between x and y islarger than k .Assuming the i -th pair in w ∗ matches x S [ j i ] to y [ j ′ i , j ′ i + p ). Let w ∗ i be the ﬁrst i pairs of w ∗ .We use induction to show that | f ( j | w ∗ | , j ′| w ∗ | , g ( w ∗ )) | ≥ | w ∗ | .For the base case, note that | f ( j , j ′ , g ( w ∗ )) | ≥ f (0 , , ◦ ( pos ( j ) , j ′ ).Suppose for i ≥ | f ( j i , j ′ i , g ( w ∗ i )) | ≥ i . For i +1, by our construction to compute f ( j i +1 , j ′ i +1 , g ( w ∗ i +1 )),we know g ( f ( j i , j ′ i , g ( w ∗ i )))+ | pos ( j i +1 ) − pos ( j i ) − ( j ′ i +1 − j ′ i ) | ≤ g ( w ∗ i )+ | pos ( j i +1 ) − pos ( j i ) − ( j ′ i +1 − j ′ i ) | = g ( w ∗ i +1 ) . So f ( j i , j ′ i , g ( w ∗ i )) is in M . Since in the second stage of the computing of f ( j i +1 , j ′ i +1 , g ( w ∗ i +1 )) wepick the maximum matching in M and add one more match to it, we know | f ( j i +1 , j ′ i +1 , g ( w ∗ i +1 )) | ≥ | f ( j i , j ′ i , g ( w ∗ i ) | + 1 ≥ i + 1 . This shows the induction step.As a result, the output matching has length at least | w ∗ | . In this section, we show some lower bounds for the asymmetric document exchange and errorcorrecting codes. Given the vectors s = ( s , · · · , s t ) and k = ( k , · · · , k t ), we deﬁne H ( s , k ) = log  t Y i =1  k i X j =0 (cid:18) s i j (cid:19) = t X i =1 log  k i X j =0 (cid:18) s i j (cid:19) . Similarly, for two integers s and k with s ≥ k , we deﬁne H ( s, k ) = log  k X j =0 (cid:18) sj (cid:19) . Note that in particular we have H ( s , k ) ≥ P ti =1 k i log( s i /k i ) and H ( s, k ) ≥ k log( s/k ).We now have the following theorems. 15 heorem 4.1. In an ( s , k , t ) asymmetric DE problem where Bob has the vector of subsets S =( S , · · · , S t ) , let k = P ti =1 k i and suppose Alice learns Bob’s string. Then any deterministic protocolhas communication complexity at least H ( n, k ) , and any randomized protocol with success probability ≥ / has communication complexity at least H ( n, k ) − . This holds even if Alice knows s and k .Proof. Assume for the sake of contradiction that there is a deterministic protocol with commu-nication complexity less than H ( n, k ). Fix Alice’s string x , and the number of strings y withinHamming distance k of x is exactly 2 H ( n,k ) . For each of these strings, one can deﬁne a vectorof subsets S = ( S , · · · , S t ) consistent with s = ( s , · · · , s t ) such that with each subset S i theHamming distance is exactly k i . Since the transcript of the protocol is a deterministic function of( x, y, S , s , k , ⊔ ), at least two diﬀerent y ’s from Bob’s side will produce the same transcript. Nowsince Alice’s ﬁnal output is a deterministic function of x and the transcript, this means Alice willnot be able to distinguish the two diﬀerent y ’s, contradicting that the protocol always succeeds.Similarly, assume for the sake of contradiction that there is a randomized protocol with commu-nication complexity less than H ( n, k ) −

1, that succeeds with probability ≥ /

2. Fix Alice’s string x and consider the 2 H ( n,k ) diﬀerent strings y as above. By an averaging argument there is a ﬁxing ofthe random bits used, such that the protocol succeeds for at least 2 H ( n,k ) − y ’s. Since the protocolis now ﬁxed the same argument gives a contradiction.We now consider the case where Bob tries to learn Alice’s string, and we have the followingtheorem. Theorem 4.2.

In an ( s , k , t ) asymmetric DE problem where Bob has the vector of subsets S =( S , · · · , S t ) , let k = P ti =1 k i and suppose Bob learns Alice’s string. Then any randomized protocolwith success probability ≥ / has communication complexity at least H ( s , k ) − . Furthermore if ∀ i, s i = | S i | ≥ k i , then any one round deterministic protocol has communication complexity atleast H ( n, k ) . This holds even if Alice knows s and k .Proof. Assume for the sake of contradiction that there is a randomized protocol with communicationcomplexity less than H ( s , k ) −

1, that succeeds with probability ≥ /

2. Fix Bob’s string y , andthe number of strings x within Hamming distance k i in each subset S i is exactly 2 H ( s , k ) . By anaveraging argument there is a ﬁxing of the random bits used, such that the protocol succeeds forat least 2 H ( s , k ) − x ’s. Thus, again at least two diﬀerent x ’s will produce the same transcript, andBob will not be able to distinguish. This gives a contradiction.Similarly, assume for the sake of contradiction that there is a deterministic protocol with commu-nication complexity less than H ( n, k ). This means two diﬀerent x ’s will produce the same transcriptin a one-round protocol, where the transcript is a deterministic function of ( x, s , k , t ). For thesetwo diﬀerent x ’s, as long as ∀ i, s i = | S i | ≥ k i , one can deﬁne a vector of subsets S = ( S , · · · , S t )such that for each x , the Hamming distance between the corresponding substrings of x and y in S i is exactly k i . Thus the inputs to Bob are the same for the two x ’s. Since Bob’s ﬁnal output is adeterministic function of his inputs and the transcript, Bob will not be able to distinguish the twodiﬀerent x ’s, a contradiction.We also have the following theorem for asymmetric error correcting codes. Theorem 4.3.

In an ( s , k , t ) asymmetric ECC problem where Bob has the vector of subsets S =( S , · · · , S t ) , let k = P ti =1 k i . If ∀ i, s i = | S i | ≥ k i , then any deterministic code must have distanceat least k + 1 . In particular, m ≤ n − H ( n, k ) . Furthermore, any randomized code with successprobability ≥ / must have message length m ≤ n − H ( s , k ) + 1 . roof. Assume for the sake of contradiction that there is a deterministic code with distance at most2 k . This means there are two diﬀerent codewords Enc ( x ) and Enc ( x ) with Hamming distance atmost 2 k . Thus, an adversary can come up with two error strings z , z where each z j has exactly k Enc ( x ) ⊕ z = Enc ( x ) ⊕ z = y . As long as ∀ i, s i = | S i | ≥ k i , one can deﬁnea vector of subsets S = ( S , · · · , S t ) such that for each z j , the number of 1’s in the subset S i isexactly k i . Thus for x and x , Bob receives the same string y and his other inputs are also thesame. This means that Bob will not be able to distinguish x i and x j , a contradiction.Now assume for the sake of contradiction that there is a randomized code with success proba-bility ≥ / m > n − H ( s , k ) + 1. By an averaging argument there existsa ﬁxing of the random bits used in encoding and decoding, that succeeds for 2 m − > n − H ( s , k ) messages. Note that for any codeword, the number of all strings which have Hamming distanceat most k i in the subset S i to the codeword is 2 H ( s , k ) . This implies that there exists two diﬀerentcodewords Enc ( x ) and Enc ( x ) and a string y such that for each Enc ( x j ), y has Hamming distanceat most k i in the subset S i to the codeword Enc ( x j ). An adversary can thus change Enc ( x ) and Enc ( x ) into the same string y , and both error patterns are consistent with ( S , s , k ). Thus Bob willnot be able to distinguish x i and x j , a contradiction. We give a random protocol for the general setting s.t. the communication complexity is close tooptimal.

Lemma 5.1.

For every S ⊆ [ n ] , integer k ≤ k ≤ s = | S | , the probability that a random bipartitegraph with n left vertices, m ≥ dk /δ right vertices, left degree d = O (log sk ) , havingfor every R ⊆ S, with | R | ∈ [ k , k ] | Γ( R ) | > (1 − δ ) d | R | , (2) is at least − ε , where ε = 2 − Θ( δ (log sk ) k log kk ) .Note that when k = 1 , we get an ( n, m, d, S, ≤ k, − δ ) expander with probability at least − − Θ( δ log sk log(2 k )) ≤ − / poly ( s ) . We also denote a bipartite graph with the expansion property stated as an ( n, m, d, S, [ k , k ] , − δ ) expander. Proof.

The total number of sets R with size r is at most ( esr ) r .For a ﬁxed set R , a ﬁxed set T ⊆ [ m ] , | T | = (1 − δ ) d | R | Pr[Γ( R ) ⊆ T ] = (cid:18) | T | m (cid:19) dr = (cid:18) (1 − δ ) drm (cid:19) dr . (3)There are at most (cid:18) m | T | (cid:19) ≤ (cid:18) em | T | (cid:19) | T | = (cid:18) em (1 − δ ) dr (cid:19) (1 − δ ) dr (4)such set T . 17o by a union bound, the probability that for every R, | R | = r , Γ( R ) ≤ (1 − δ ) dr is at most (cid:18) em (1 − δ ) dr (cid:19) (1 − δ ) dr × (cid:18) (1 − δ ) drm (cid:19) dr × ( esr ) r = e (1 − δ ) dr (cid:18) (1 − δ ) drm (cid:19) δdr × ( esr ) r ≤ e dr e − δdr log mdr ( esr ) r ≤ − Θ( δdr log kr ) . (5)by letting m = 2 dk /δ , d = O (log sk ).By another union bound the probability that for every R, | R | ∈ [ k , k ], it does not have a goodexpansion is at most P kj = k − Θ( δdj log kj ) ≤ ( k − k + 1)2 − Θ( δdk log kk ) ≤ − Θ( δdk log kk ) .When k = 1, this is at most 2 − Θ( δ log sk log(2 k )) ≤ − Θ( δ log sk log(2 k )) ≤ / poly ( s ). Lemma 5.2.

Assume Γ is an ( n, m, d, S, [ k ′ , k ] , . expander. Let y be the expander-code encod-ing of x using Γ . Then there is an explicit decoding which, on input x ′ which has k i , i ∈ [ t ] errorsin S i from x , with k ≥ k ′ ≥ c P ti =2 k i , c = 10 , outputs ˜ x that has at most k ′ errors in S .Proof. We propose the following algorithm. For every iteration, ﬁnd the ﬁrst bit in S s.t. it hasmore unsatisﬁed checks than satisﬁed ones. Loop until we cannot ﬁnd such bit anymore.Now we show this works. Assume there are at least k ′ errors in S . Denote A as the set ofindices of these errors. Let s be the number of satisﬁed neighbors of A = A ∩ S . Let u be thenumber of unsatisﬁed neighbors of A . By the expander property, | Γ( A ) | ≥ . d | A | . So s + u ≥ . d | A | . (6)On the other hand, each satisﬁed check is connected to at least one vertex in A since it is inΓ( A ). Thus it has to be connected to at least 2 vertices in A to make it to be satisﬁed. Also eachunsatisﬁed check is connected to at least 1 vertex in A . Hence2 s + u ≤ d | A | ≤ d | A | + d t X i =2 k i ≤ (1 + 1 c ) d | A | . (7)By Equation (6) and Equation (7), u ≥ . d | A | . So there has to be ≥ . S having more unsatisﬁed checks than satisﬁedones. As a result, the algorithm can ﬁnd a bit to ﬂip and u is decreasing. On the other hand, ifat some iteration, | A | = 2 k , then u ≥ . dk but initially u ≤ dk which contradicts that u isdecreasing. As a result, the iterations will continue until there are less than k ′ errors in S . Theorem 5.3.

There is an eﬃcient -round protocol s.t. for every ( s, k ) DE problem, it hascommunication complexity O ( k log sk ) , success probability − − Θ(log sk log k ) . roof. We ﬁrst generate a random bipartite graph with n left vertices, left degree d = O (log sk ), m = O ( dk ) right vertices. By Lemma 5.1, with probability 1 − − Θ(log sk log k ) , it is an ( n, m, d, S, ≤ k, .

9) expander Γ. We use Γ to compute the sketch z of x .To decode, we use y , z and Γ. By Lemma 5.2, we can get x correctly.The running time of both parties are ˜ O ( n ). Without loss of generality, we assume k ≥ k ≥ · · · ≥ k t . Theorem 5.4.

There is a -way eﬃcient protocol s.t. for every ( s , k , t ) DE with k i ≤ s i / , ∀ i ∈ [ t ] ,it has success probability − − Ω( k t ) − / poly ( s ) , communication complexity O ( t P i ∈ [ t ] k i log s i k i ) . Construction 5.5.

Eﬃcient protocol for ( s , k , t ) DE .Alice: on input x ,1. Let i ′ = 0 , k ′ = 0 , string z be empty string;1.1. While i ′ ≤ t − , ﬁnd i > i ′ s.t. k ′ + P ij = i ′ +1 k j > k ′′ , where k ′′ = c P tj = i +1 k j ; If cannotﬁnd i then break the iterations;1.2. Generate an ( n, m, d, ∪ ij =1 S j , [ k ′′ , k ′ + P ij = i ′ +1 k j )] , . -expander Γ by Lemma , where d = O (log P ij =1 s j k ′ + P ij = i ′ +1 k j ) ;1.3. Compute z i which is the expander code of x using Γ , z = z ◦ z i ;1.4. Let i ′ = i, k ′ = k ′′ .2. Encode x to be z ﬁnal by using a ( n, m, d = O (log sk ﬁnal ) , S, ≤ k ﬁnal , . expander Γ ﬁnal gener-ated by Lemma , where k ﬁnal = k ′ + P tj = i ′ +1 k j ;3. Send z ◦ z ﬁnal to Bob.Bob: on input y , S , k , together with the message z ◦ z ﬁnal from Alice;1. Let i ′ = 0 , k ′ = 0 , y ′ = y ;1.1. While i ′ = t , ﬁnd i > i ′ s.t. k ′ + P ij = i ′ +1 k j > k ′′ , where k ′′ = c P tj = i +1 k j , c = 10 ;1.2. Generate an ( n, m, d, ∪ ij =1 S j , [ k ′′ , k ′ + P ij = i ′ +1 k j )] , . -expander Γ by Lemma usingthe same randomness as of Alice;1.3. Use Γ , z i to reduce the number of errors of y in ∪ ij =1 S j to be at most k ′′ by Lemma ;1.4. Let i ′ = i, k ′ = k ′′ .2. Decode x by Lemma for the ( S, k ′ + k t ) setting, using y ′ , z ﬁnal , and the expander generatedthe same as the Γ ﬁnal of Alice; Lemma 5.6.

The communication complexity is O (cid:16) t P j ∈ [ t ] k j log s j k j (cid:17) .Proof. By Lemma 5.1, m of Γ is O (cid:18) ( k ′ + P ij = i ′ +1 k j ) log P ij =1 s j k ′ + P ij = i ′ +1 k j (cid:19) .Note that in the ﬁrst iteration, the algorithm may pick a i ∈ [ t ]. But in the succeeding iterations,it will always take i = i ′ + 1, since k ′ + k i ′ +1 > k ′′ and we always assume k i ′ +1 > m = O  ( i X j =1 k j ) log P ij =1 s j P ij =1 k j  ≤ O  ( c t X j = i k j + k i ) log P ij =1 s j P ij =1 k j  Since i − X j =1 k j ≤ c t X j = i k j ≤ O  ( c t X j = i k j + k i ) log P ij =1 s j P ij =1 k j + k i  Decreasing the denominator ≤ O  ( c t X j = i k j + k i ) log P ij =1 s j ( c P tj = i +1 k j + k i )  Since i X j =1 k j > c t X j = i +1 k j ≤ O  ( c t X j = i k j + k i ) log P ij =1 s j c P tj = i +1 k j + k i  Because of big-O notation ≤ O k log P ij =1 s j k ! Let k = ( c t X j = i k j + k i ) ≤ O  k log i Y j =1 ( s j k + 1)  log( · ) is an increasing function= O  i X j =1 k log( s j k + 1)  . For each j ∈ [ i ], if k > k j , then since k ≤ ( c + 1) tk j , k log( s j k + 1) ≤ ( c + 1) tk j log( s j k j + 1) ≤ c + 1) tk j log s j k j ;Otherwise if k ≤ k j , then since k j ≤ s j , k log( s j k + 1) ≤ k log 2 s j k ≤ O ( k j log s j k j ) . Hence m = O (cid:16) t P ij =1 k j log s j k j (cid:17) .Next we consider the cases where we are in iterations from the 2nd to the last. We have m = O ( k ′ + k i ) log P ij =1 s j k ′ + k i ! = O k log P ij =1 s j k ! Let k = k ′ + k i = c t X j = i k j + k i ≤ O  k log i Y j =1 ( s j k + 1)  log( · ) is an increasing function20 O  i X j =1 k log( s j k + 1)  . For each j ∈ [ i ], if k > k j , then again since k ≤ ( c + 1) tk j , k log( s j k + 1) ≤ ( c + 1) tk j log( s j k j + 1) ≤ c + 1) tk j log s j k j ;Otherwise if k ≤ k j , then since k j ≤ s j , k log( s j k + 1) ≤ k log 2 s j k ≤ O ( k j log s j k j ) . Hence m = O (cid:16) t P ij =1 k j log s j k j (cid:17) .As there are at most t iterations, the total communication complexity is tm = O (cid:16) t P tj =1 k j log s j k j (cid:17) .Next we show the correctness. Lemma 5.7.

Bob can compute x correctly with probability at least − − Ω( k t ) − / poly ( s ) .Proof. In the ﬁrst iteration, since Γ is an ( n, m, d, ∪ ij =1 S j , [ c P tj = i +1 k j , P ij =1 k j )]) expander, byLemma 5.2, we can successfully reduce the number of errors in ∪ ij =1 S j to be ≤ k ′′ .Note that as long as k i ′ +1 >

0, the number i , found in the iteration, will be i ′ + 1. So theiteration will continue until i ′ = t −

1. After the iterations, the number of errors in S is at most k ′ + k t = ( c + 1) k t .Finally, using z ﬁnal and Γ ﬁnal , by Lemma 5.2, Bob can compute x correctly.The protocol succeeds once all random expander graphs are as desired. For random expandergraph in iteration i , the success probability is 1 − − Ω( dk ′′ log k ′ k ′′ )) ≤ − − Ω( dk ′′ ) , by Lemma 5.1. Soby a union bound, the probability, that all iterations success, is at least 1 − − Ω( k t ) . In the ﬁnalstep, the success probability is 1 − poly ( s ) by Theorem 5.3. Hence the ﬁnal success probability is asdesired. Proof of Theorem . The correctness and communication complexity immediately follows fromLemma 5.6, Lemma 5.7.For the eﬃciency, note that in Alice’s algorithm, she just randomly generate a bipartite graphwith logarithmic degree. And apply the expander encoding to get the sketch. So this is in nearlinear time. For Bob’s algorithm, as S i , i ∈ [ t ] are disjoint, and the belief propagation can be donein near linear time. Other operations are also in near linear time. So Bob’s algorithm is also innear linear time.When t is large, we can group some sets together to reduce t and hence get the followingtheorem. Theorem 5.8.

There is a -way eﬃcient protocol s.t. for every ( s , k , t ) DE with k i ≤ s i / , ∀ i ∈ [ t ] ,it has success probability − − Ω( k t ) − / poly ( s ) , communication complexity O (cid:16) χ ( s , k , t ) P i ∈ [ t ] k i log s i k i (cid:17) .The running time of both parties are ˜ O ( n ) . roof. We cut the interval [2 , n + 1) into t ′ = O (log log n ) intervals s.t. the j -th interval I j is[2 j − , j ). Then for all i s.t. s i /k i ∈ I j , we union them to be a set S ′ j . Also we take k ′ j to bethe summation of the corresponding k i ’s. We neglect these intervals which do not cover any s j /k j ,getting a new problem i.e. a ( s ′′ , k ′′ , χ ) error correction problem.By Theorem 5.4, the communication complexity is O (cid:16) χ P j ∈ [ χ ] k ′′ j log s ′′ j k ′′ j (cid:17) . Since ∀ j ∈ [ χ ] , i ∈ I j , log s i k i = O (log s ′′ j k ′′ j ), the communication complexity is actually O (cid:16) χ P i ∈ [ t ] k i log s i k i (cid:17) .The time complexity and success probability is implied by Theorem 5.4.Notice that χ can only be as large as O (log log n ). So we have the following corollary. Corollary 5.9.

There is a -way eﬃcient protocol s.t. for every ( s , k , t ) DE with k i ≤ s i / , ∀ i ∈ [ t ] ,it has success probability − − Ω( k t ) − / poly ( s ) , communication complexity O (cid:16) log log n P i ∈ [ t ] k i log s i k i (cid:17) .The running time of both parties are ˜ O ( n ) . We show that our construction for DE can be modiﬁed to work for stochastic coding setting.

Theorem 5.10.

There is an eﬃcient stochastic ECC s.t. for every ( s , k , t ) type errors with k i ≤ s i / , ∀ i ∈ [ t ] , it has success probability − − Ω( k t ) − / poly ( s ) , message length n − O (cid:0) χ ( s , k , t ) H ( s , k ) (cid:1) .The running time of both encoding and decoding are ˜ O ( n ) .Proof. For encoding, we ﬁrst compute the length of the redundancy. By the Alice’s algorithm ofTheorem 5.8, the sketch length for ( s , k , t ) document exchange, on input strings of length n , is s = O (cid:16) χ ( s , k , t ) P i ∈ [ t ] k i log s i k i (cid:17) . If we apply an an asymptotically good ECC C , e.g. expandercodes [32] [34], to encode the sketch, then the output has length r = O ( s ). Let the message lengthbe n − r .The encoding of message x has two parts. The ﬁrst part is the message. The second part isthe sketch for ( s , k , t ) document exchange on input x ◦ , where is an all 0 string of length r . Weknow the sketch length is s . Next we apply C on s to get z which has length r . The ﬁnal codewordis x ◦ z .We claim this code can indeed resist ( s , k , t ) type errors by describing the decoding along withits analysis.For decoding, assume the input is x ′ ◦ z ′ . Note that even if all errors happen on z , we candecode to recover z from z ′ , since z is a codeword of an ECC correcting k errors. After we get z .We apply Bob’s algorithm of Theorem 5.8 on x ′ ◦ , using the sketch z . The decoding will successbecause the error type is still ( s , k , t ), as we only remove some errors happened on z .The success probability only comes from the success probability of Theorem 5.8, since that’s theonly part we use randomness. So the success probability is as desired. The encoding and decodingare in near linear time since the protocol and the asymptotically good code [34] are both in nearlinear time. 22 Document Exchange with Asymmetric Information in a SpecialSetting

We ﬁrst develop a randomized two-party (Alice and Bob) one-way hamming error document ex-change protocol in which Bob knows the errors can only happen in some subsets of all positions,where in each subset the number of errors is also bounded.The reason we consider this kind of encoding/decoding for special error patterns is that it canhave shorter redundancy than the general coding for bounded number of hamming errors.The encoding utilize a randomized bipartite expander graph with a large expansion.

Lemma 6.1.

For every n, k, k ′ , k ′′ , r, d, t ∈ N , k ′ ≤ r ≤ k ≤ n , k ′′ t log ek t k ′′ ≤ k ′ log kk ′ , δ ∈ (0 , , d ≥ δ − , constant c > , disjoint sets S i ⊆ [ n ] , i ∈ [ t ] , | S i | = k O ( i ) , k i = max( k/ O ( i ) , k ′′ ) ≤ | S i | / ,the probability that a random bipartite graph with n left vertices, m ≥ dk /δ right vertices, leftdegree d , having thatfor every R ⊆ ∪ i ∈ [ t ] S i , | R | = r ≥ k ′ , with | R ∩ S i | ≤ k i , ∀ i ∈ [ t ] , it holds | Γ( R ) | > (1 − δ ) dr, is at least − ε , where ε = 2 − Θ( δdk ′ log kk ′ ) . We denote the generated expander graph as a ( n, m, d, S , k , [ k ′ , k ] , − δ )-expander, where k isthe sequence of all k i , i ∈ [ t ]. Proof.

We show that a uniformly sampled bipartite graph works. The bipartite graph with n leftvertices, m right vertices, left degree d , is generated as follows. Each edge, from one vertex of theleft, has its ending vertex being uniformly chosen from the right vertices.For a ﬁxed R , if | Γ( R ) | ≤ (1 − δ ) dr , then there exists a set T ⊆ [ m ] s.t. | T | = (1 − δ ) dr, | Γ( R ) | ⊆ T .There are at most (cid:18) m | T | (cid:19) ≤ (cid:18) em | T | (cid:19) | T | = (cid:18) em (1 − δ ) dr (cid:19) (1 − δ ) dr (8)such set T . For each T , Pr[Γ( R ) ⊆ T ] = (cid:18) | T | m (cid:19) dr = (cid:18) (1 − δ ) drm (cid:19) dr . (9)Consider a ﬁxed r . Assuming r ∈ [ k j +1 , k j ], for some j ∈ [ t ]. Notice that j = Θ(log kr ).Let r i = R ∩ S i . The total number of diﬀerent sequences r , . . . , r t is at most (cid:18) r + tr (cid:19) ≤ (cid:18) e ( r + t ) r (cid:19) r ≤ (cid:18) O ( 2 kr ) (cid:19) r ≤ O ( r log kr ) . (10)Consider a ﬁxed sequence r i , i ∈ [ t ] with r i ≤ k i . The total number of possibilities of R ∩ S j , . . . , R ∩ S t is at most t Y i = j (cid:18) | S i | r i (cid:19) ≤ t Y i = j (cid:18) | S i | k i (cid:19) ≤ t Y i = j (cid:18) e | S i | k i (cid:19) k i ≤ t ′ Y i = j O ( i k O ( i ) ) · t Y i = t ′ (cid:18) e | S t | k ′′ (cid:19) k ′′ = 2 O (cid:16)P t ′ i = j i k O ( i ) (cid:17) · O (( t − t ′ ) k ′′ log ek tk ′′ ) ≤ O ( k O ( j ) j ) · O ( r log kr ) = 2 O ( r log kr ) . t ′ is the ﬁrst index s.t. k i = k ′′ .On the other hand, the total number of possibilities of R ∩ S , . . . , R ∩ S j is at most j Y i =1 (cid:18) | S i | r i (cid:19) ≤ (cid:18)P ji =1 | S i | P ji =1 r i (cid:19) ≤ (cid:18) O ( k O ( j ) ) P ji =1 r i (cid:19) ≤ (cid:18) O ( k O ( j ) ) r (cid:19) ≤ O ( k O ( j ) r ) ! r ≤ O ( r log kr ) . (11)So by a union bound, the probability that for every R, | R | = r, | R ∩ S i | ≤ k i , Γ( R ) ≤ (1 − δ ) dr is at most (cid:18) em (1 − δ ) dr (cid:19) (1 − δ ) dr × (cid:18) (1 − δ ) drm (cid:19) dr × O ( r log kr ) = e (1 − δ ) dr (cid:18) (1 − δ ) drm (cid:19) δdr × O ( r log kr ) ≤ e dr e − δdr log mdr O ( r log kr ) ≤ − Θ( δdr log kr ) (12)by letting m = 2 dk /δ .Since k ≥ r ≥ k ′ , it holds that 2 − Θ( δdr log kr ) ≤ − Θ( δdk ′ log kk ′ ) . Remark 6.2.

Note that we can use a κ = O ( kd ) -wise independence generator to generate theedges of the graph. Each edge is chosen according to a random variable in a sequence that is κ -wiseindependent. Each random variable has support size m . Hence inequality (9) still holds. So we canapply the same argument. The decoding algorithm has two parts. Both parts use belief propagation techniques. In theﬁrst part, we reduce the number of errors slightly by using z . In the second part, we further reducethe number of errors to 0 by using z . Construction 6.3 (Protocol for a speciﬁc setting of parameters) . Let n, m, d, t ∈ N , k i ∈ N , k i ≤ n, i ∈ [ t ] , k ′ = O ( k/ log nk ) , disjoint sets S i ⊆ [ n ] , i ∈ [ t ] . Let S = ∪ i ∈ [ t ] S i .Let expander graph Γ : [ n ] × [ d ] → [ m ] , s.t. ∀ R ⊆ ∪ i ∈ [ t ] S i with | R | ∈ [ k ′ , O ( k )] and ∀ i ∈ [ t ] , | R ∩ S i | ≤ k i , it holds Γ ( R ) > . d | R | . Let C be a systematic Algebraic Geometry code from Theorem , with alphabet F q , messagelength n/q , redundancy length O ( k ′ ) correcting k ′ errors.Let x ∈ { , } n be the original message.The decoding takes an input string y ∈ { , } n , parity checks z generated by expander encodingof x using Γ , and z which is the redundancy part of C ( x ) .Stage 1:1. (Generating the restriction set) Let V = ∅ . For every i ∈ [ t ] , if the number of ﬂipped bits in S i is less than k i , then V = V ∪ S i otherwise V = V ∪{ j | the j -th bit is ﬂipped previously by this algorithm } ;(If a bit is ﬂipped twice, then it is regarded as not ﬂipped)2. Find j ∈ V s.t. the number of unsatisﬁed parity checks in Γ ( j ) is larger than | Γ ( j ) | / d / ;Flip the j -th bit, and restart this stage; If no such j , go to the next step;3. Go to the next stage. tage 2 (classic belief propagation using z ):1. Apply the decoding of C on the current y concatenated with z .2. Output the decoded message. Lemma 6.4. If HD ( y S , x S ) = 0 , ∀ i ∈ [ l ] , HD ( y S i , x S i ) ≤ k i , then the decoder outputs x correctly.Proof. Claim 6.5.

The ﬁrst stage ends in at most O ( m ) rounds, and the number of errors in y is reducedto be less than k ′ .Proof. Let A τ be the set of indices of tampered bits (comparing to x ) in y at (immediately before)the τ -th round. At the beginning | A | = HD ( y, x ).We ﬁrst show that if | A τ | ≥ k ′ , then we can indeed ﬁnd an index j ∈ V s.t. the number ofunsatisﬁed parity checks in Γ ( j ) is larger than | Γ ( j ) | / A ′ τ = A τ ∩ V . Let s, s ′ be the numbers of satisﬁed checks in Γ ( A τ ) , Γ ( A ′ τ ). Let u, u ′ be the numbers of unsatisﬁed checks in Γ ( A τ ) , Γ ( A ′ τ ).Consider i ∈ [ l ] s.t. the number of ﬂipped bits is exactly 19 k i . As HD ( y S i , x S i ) ≤ k i , the numberof tampered bits in S i is at most 20 k i . So | A τ ∩ S i | ≤ k i , since HD ( y S i , x S i ) ≤ k i . Also note thatthese tempered bits (at the beginning of the stage) can be ﬂipped by the algorithm, we know thecurrent number of tampered bits in A ′ τ ∩ S i is at least 18 k i . So | A ′ τ ∩ S i | ≥ . | A τ ∩ S i | . (13)For i ∈ [ l ] s.t. the number of ﬂipped bits is less than 19 k i , since V ∩ S i = S i , | A ′ τ ∩ S i | = | A τ ∩ S i | (14)As a result, noting that S i , i ∈ [ l ] are disjoint, | A ′ τ || A τ | = P i | A ′ τ ∩ S i | P i | A τ ∩ S i | ≥ . , (15)As | A τ | ≥ k ′ , it holds | A ′ τ | ≥ . k ′ ≥ k ′ . By the expansion property of Γ , s ′ + u ′ = | Γ ( A ′ τ ) | ≥ . d | A ′ τ | . (16)On the other hand, note that 2 s + u ≤ d | A τ | , since each satisﬁed check in Γ ( A τ ) must have atleast two bits in x A τ to be as addends. As A ′ τ = A τ ∩ V , we have s ′ ≤ s, u ′ ≤ u .Thus 2 s ′ + u ′ ≤ s + u ≤ d | A τ | . (17)Combining (16) and (17), we get u ′ ≥ . d | A ′ τ | − . d | A τ | ) . (18)Further by (15),(18), u ′ ≥ . d | A τ | ≥ . d | A ′ τ | . (19)Hence by an averaging argument, there is an index j ∈ V s.t. the number of unsatisﬁed paritychecks in Γ ( j ) is at least 0 . d . 25s a result, after doing the ﬂipping for this round, the number of unsatisﬁed parity checks isstrictly decreased. Also note that because of the restriction sets in our algorithm our operationcannot create an A τ in some steps s.t. it does not have a good expansion. Hence, the ﬁrst stageends when | A τ | < k ′ .Next we consider | A τ | < k ′ at the beginning of a round τ . There are two possible cases.The ﬁrst case is that in step 2 the algorithm does not ﬁnd a j ∈ V to conduct the operation,so it will go to the next stage as desired.The second case is that there is still an index j ∈ V s.t. the number of unsatisﬁed parity checksin Γ ( j ) is more than half. Hence after ﬂipping, and the number of unsatisﬁed parity checks isagain strictly decreased. Note that there are at most O ( m ) unsatisﬁed checks. So this procedurewill end in at most O ( m ) rounds.For either case, stage 1 will end with | A τ | < k ′ . This shows the claim.As a result, after stage 1, the number of errors is less than 2 k ′ .As C can correct 2 k ′ errors, by Theorem 3.5, the decoding algorithm outputs x correctly. Theorem 6.6.

There is an eﬃcient one-way protocol for every ( s , k , t ) DE, arbitrary s i = k Θ( i ) , l =Ω(log nk ) , k i = max { k/ Θ( i ) , Θ( kl log nk ) } ≤ s i / , t ≤ O ( √ l ) , having communication complexity O ( k ) , success probability − − Θ( k log log nk log nk ) .Proof. The protocol is constructed by Construction 6.3 and we will use a random ( n, m, d ) bipartitegraph to be Γ . By Lemma 6.1, a random bipartite ( n, m, d ) graph Γ is an ( n, m, d, S , k , [ k ′ , k ] , . ε = 2 − Θ( k ′ log kk ′ ) , where we let m = O (2 dk ) , d = O (1), k i = 20 k i , i ∈ [ t ], k ′ = O ( k/ log nk ). Also since t ≤ O ( √ l ), we have k ′′ t log k t k ′′ ≤ k ′ log kk ′ , where k ′′ = O ( kl log nk ).By Lemma 6.4, Bob can compute x , by using y, z, S , k , k ′ and the common randomness.The communication complexity is | z | = m = O ( k ). The protocol is eﬃcient since both encodingand decoding are eﬃcient. The failure probability is ε = 2 − Θ( k log log nk log nk ) since the construction of Γ is the only part we use randomness.Note that Theorem 1.10 directly follows from Theorem 6.6 by letting l = O ( t ). In this section we give the one-way document exchange protocol for edit distance. We begin witha randomized protocol where the two parties have shared randomness.

Construction 7.1.

The input string for Alice has length n ∈ N and there are totally k ∈ [Θ(log nk ) , Θ( n )] edit errors between Alice’s string and Bob’s string.Both Alice’s and Bob’s algorithms have L = O (log nk ) levels. For every i ∈ [ L ] , in the i -th level, • Let block size b i = n · i k , i.e., in each level we divide a block in the previous level evenly intotwo blocks; (We choose L properly s.t. b L = O (log nk ) ) • The number of blocks l i = n/b i ; lice: On input x ∈ { , } n ;1. For the i -th level,1.1. Partition x into consecutive blocks x [1 , b i ] , x [1 + b i , b i ] , . . . , x [1 + ( l − b i , l i b i ] ;1.2. Let h j : { , } b i → { , } c , j ∈ [ l i ] be a sequence of random hash functions with c being alarge enough constant positive integer;1.3. Compute v [ i ][ j ] = h j ( x [1 + ( j − b i , jb i ]) , j ∈ [ l i ] ;1.4. v [ i ] = ( v [ i ][1] , . . . , v [ i ][ l i ]) ;1.5. By the sketch construction of Theorem , compute z [ i ] ∈ { , } m = O ( k ) , a sketch of v [ i ] ,the expander constructed in this step being Γ : { , } l i × { , } d =10 → { , } m ;2. Compute the redundancy z ﬁnal ∈ ( { , } b L ) Θ( k ) for the blocks of the L -th level by Theorem 3.5,where the code has distance k ;3. Send z = ( z [1] , z [2] , . . . , z [ L ]) , v [1] , z ﬁnal .Bob: On input y ∈ { , } O ( n ) and received z , v [1] , z ﬁnal ;1. Create ˜ x ∈ { , , ∗} n (i.e. his current version of Alice’s x ), initiating it to be ( ∗ , ∗ , . . . , ∗ ) ;2. Let A = [ l ] , A i = ∅ , i = 2 , , . . . , L ;3. For the i -th level, where ≤ i ≤ L − ,3.1. Divide ˜ x into length b i consecutive blocks, ˜ x [1 , b i ] , . . . , ˜ x [1 + ( l i − b i , l i b i ] ;3.2. Utilize the common randomness to get functions h j : { , } b i → { , } c , j ∈ [ l i ] that Alicegets in her stage 1.2.3.3. Compute ˜ v [ i ] = ( h (˜ x [1 , b i ]) , . . . , h l i (˜ x [1 + ( l i − b i , l i b i ])) ;3.4. For every i ′ ∈ [ ℓ ] , let S i ′ ⊆ [ l i ] be the indices of the (descendent) blocks in the currentlevel, whose ancestors are those blocks indicated by A i ′ , i.e. j is in S i ′ iﬀ there is j ′ ∈ A i ′ s.t. [1 + ( j − b i , jb i ] ⊆ [1 + ( j ′ − b i ′ , j ′ b i ′ ] ;3.5. Compute v [ i ] by using the decoding algorithm from Construction 6.3 on input ˜ v [ i ] , S i ′ , k i ′ = max( k/ . c ( i − i ′ ) , k/ log nk ) , i ′ = i − , i − , . . . , , and the received z [ i ] ;3.6. Let T i = ∅ . For every j ∈ [ l i ] , if v [ i ][ j ] = ˜ v [ i ][ j ] , then put j ∈ T i and then check every i ′ = i − , i − , . . . , , if the j -th block in the current level is a descendent of the j ′ -th blockin the i ′ -th level, then remove j ′ from A i ′ ;3.7. Let A i = T i ;3.8. Compute w ∈ ( A i × [ | y | ]) | w | which is the maximum monotone matching between x ’s blocksindicated by A i , and y , under h , . . . , h l i , using v [ i ] , by Lemma 3.10; (We interpret w asa sequence of matches, the j th match being denoted as ( w [ j ][1] , w [ j ][2]) .)3.9. Evaluate ˜ x according to w , i.e. let ˜ x [ w [ j ][1]] = y [ w [ j ][2] , w [ j ][2] + b i − , ∀ j ∈ [ | w | ] ;4. In the L ’th level, apply the decoding of Theorem 3.5 on the blocks of ˜ x and z ﬁnal to get x ;5. Return x . Next we show the correctness of our construction.Consider every level i ∈ [ L ], every i ′ = i − , i − , . . . ,

1. We denote the set descendants in the i -th level, stemming from A i ′ , as ˜ A i ′ . The indices set of undetected wrongly recovered blocks in˜ A i ′ , is denoted as B i ′ , i ′ = i − , . . . , i ∗ be s.t. k ′′ , k ′ / Θ(log nk ) ∈ [ k/ c ( i − i ∗ ) , k/ c ( i − i ∗ +1) ], k ′ , k/ log nk . Lemma 7.2.

For every i ∈ [ L ] , if ∀ i ′ < i, | T i ′ | = O ( k ) , and v [ i ′ ] are computed correctly by Bob,then for every i ′ ≤ i ∗ , the probability that | B i ′ | ≥ k ′′ is at most − Ω( k ′′ ) ; • for every i ′ ∈ ( i ∗ , i ) , the probability that | B i ′ | ≥ k i ′ = k/ . c ( i − i ′ ) is at most − Ω( ck/ c ( i − i ′ ) ) .Proof. Consider the possibilities of B i ′ . Each possibility can be described by a w -witness with w = | B i ′ | . The witness is a sequence of (number) w indices where each index is in the i -th levelindicating a wrongly recovered block. This sequence is further partitioned into i − i ′ + 1 groupscorresponding to levels i ′ , i ′ + 1 , . . . , i . We numerate these groups as group i ′ , i ′ + 1 , . . . , i .Consider the trees rooted at blocks in A i ′ . Each of them has height i − i ′ . Each node is a blockin a certain level between i ′ and i .The w -witness describes level i bad blocks which are descendants of blocks in A i ′ , uniquely inthe following way.Group j ∈ [ i ′ , i ] consists of indices of bad blocks, one for each depth i − j tree whose root is awrong block in level j . Note that for one tree, there may be many bad leaf blocks. For this case,we only pick the leftmost wrong one. These forms the group one. After each picking, we cut all theedges from that block to the root. This gives i − i ′ + 1 sub-trees. One of them is the last block. Weonly focus on sub-trees other than that picked block. They have depth from 1 to i − j . We updatethe set of trees by adding these trees from cutting and delete the trees being cut.In this way, every error patterns can be described. This is because, every leaf node is eitherbeing picked or still in one of the trees in the forest. Once the leaf is in one of the trees in theforest, it can be picked in a certain level of the picking procedure.Let the number of wrong blocks being picked for each level j be w j .The total number of error patterns is P = (cid:18) kw i ′ (cid:19) · ( i − i ′ ) w i ′ · (cid:18) w i ′ w i ′ +1 (cid:19) · ( i − i ′ − w i ′ +1 · (cid:18) w i ′ + w i ′ +1 w i ′ +2 (cid:19) · ( i − i ′ − w i ′ +2 · · · (cid:18)P i − j = i ′ w j w i (cid:19) ≤ (cid:18) kw i ′ (cid:19)(cid:18)P i − j = i ′ ( i − j ) w j P ij = i ′ w j (cid:19) · P i − j = i ′ ( i − j ) w j (20)For i ′ ≤ i ∗ , suppose P ij = i ′ ( i − j ) w j = k ′′ . Then P ≤ (cid:18) kk ′′ / ( i − i ′ ) (cid:19) · k ≤ O ( k ′′ ) · O (log k ) i − i ′ · k ′′ ≤ O ( k ′′ ) . (21)Note that the probability that a speciﬁc error pattern happens is at most 2 − c P ij = i ′ ( i − j ) w j = 2 − ck ′′ because each block in group j is checked for i − j times independently. Since c is a large enoughconstant, P ij = i ′ ( i − j ) w j is a integer in [0 , poly ( k log n )], we know by a union bound, P ij = i ′ ( i − j ) w j ≥ k ′′ happens with probability at most 2 − ck ′′ × O ( k ′′ ) × poly ( k log n ) ≤ − Ω( k ′′ ) .For i > i ∗ , suppose P ij = i ′ ( i − j ) w j = k/ . c ( i − i ′ ) = k i ′ . Then P ≤ (cid:18) kk i ′ / ( i − i ′ ) (cid:19) · k i ′ ≤ (0 . c ( i − i ′ )+ O (1)+log( i − i ′ )) · k i ′ / ( i − i ′ ) · k i ′ ≤ . ck i ′ , (22)28hen c is a large enough constant.Similarly, note that the probability that a speciﬁc error pattern happens is at most 2 − c P ij = i ′ ( i − j ) w j =2 − ck i ′ because each block in group j is checked for i − j times independently. Since c is a large enoughconstant, P ij = i ′ ( i − j ) w j is a integer in [0 , poly ( n )], we know by a union bound, P ij = i ′ ( i − j ) w j ≥ k i ′ happens with probability at most 2 − ck i ′ × . ck i ′ × poly ( k log n ) ≤ − Ω( k i ′ ) .As a result, w = P ij = i ′ w j > k i ′ happens with probability at most 2 − Ω( k i ′ ) ≤ − Ω( k ′′ ) . Lemma 7.3.

For every i ∈ [ L ] , if ∀ i ′ < i, | T i ′ | = O ( k ) , and v [ i ′ ] are computed correctly by Bob,then with probability at least − − Ω( k ′′ ) , i − X i ′ =1 | B i ′ | < k. Proof.

By Lemma 7.2, for i ′ ≤ i ∗ , with probability at least 1 − − Ω( k ′′ ) , | B i ′ | < k ′′ ; for i ′ > i ∗ , withprobability at least 1 − − Ω( k ′′ ) , | B i ′ | ≤ k i ′ = k/ . c ( i − i ′ ) .By a union bound, with probability at least 1 − i − Ω( k ′′ ) = 1 − − Ω( k ′′ ) , i − X i ′ =1 | B i ′ | = i ∗ X i ′ =1 | B i ′ | + i − X i ′ = i ∗ +1 | B i ′ | ≤ ( i ∗ − k ′′ + 0 . k < k. Lemma 7.4.

For every i ∈ L , at level i , if v [ i ] are computed correctly by Bob, and | T i | ≤ k , thenwith probability − − Θ( k ) , the number of wrongly recovered blocks introduced by w is at most k .Proof. Assume the number of wrongly recovered blocks introduced by w is larger than k . Thenthere more than k pairs in the matching are bad pairs. This happens with probability 1 / ck .Note that by Lemma 3.10, for w , | ρ ′ − ρ | + | ( ρ ′ − ρ ′ ) − ( ρ − ρ ) | + · · · + | ( ρ ′| w | − ρ ′| w |− ) − ( ρ | w | − ρ | w |− ) | ≤ k. By Lemma 3.9, since | T i | ≤ k , there are totally 2 O ( k ) possible matchings that can be outputby our algorithm.So by a union bound, the conclusion holds with probability 1 − − Θ( k ) . Lemma 7.5.

For every i ∈ L , in level i , if v [ i ] are computed correctly, and | T i | = O ( k ) , then withprobability − − Θ( k ) , the number of wrongly recovered blocks and uncovered blocks in T i after 3.9.is at most k .Proof. By Lemma 3.10, | w | ≥ | T i | − k . Thus the number of uncovered blocks is at most k . ByLemma 7.4, with probability 1 − / Θ( k ) , the number of wrongly recovered blocks introduced by w is at most k . So the total number of wrongly recovered blocks is at most 2 k . Lemma 7.6.

For every i ∈ L , with probability − − Θ( k ′′ ) , • after the ﬁrst step of level i , the number of wrongly recovered blocks is at most k ; Bob can compute v [ i ] correctly; • the number of wrongly recovered blocks in T i is at most k after step 3.9..Proof. We use induction.In the ﬁrst level, ˜ x = ( ∗ , ∗ , . . . , ∗ ). So the number of wrongly recovered blocks at the beginningis l = n/b = 6 k . So The number of wrongly recovered blocks is at most 6 k . Also Bob can get v [1] correctly, since it is directly sent by Alice. By Lemma 7.5, with probability 1 − / Θ( k ) , thetotal number of wrongly recovered blocks is at most 2 k if we regard uncovered blocks as wronglyrecovered.Suppose the conclusion holds for the ﬁrst i − i .By Lemma 7.3, with probability 1 − − Ω( k ′′ ) , the total number of wrongly recovered blocks is P i − i ′ =1 | B i ′ | < k .By Lemma 6.1, with probability 1 − ε = 1 − − Ω( k ′ ) , Γ is a bipartite graph, having n = l i left vertices, m = O ( k ) right vertices, left degree d = O (1), s.t. ∀ R ⊆ [ n ] , | R | ∈ [ k ′ , k ] , | R ∩ S i ′ | ≤ k ′ i ′ = max(20 k/ c ( i − i ′ ) , k . ), Γ( R ) > . d | R | . Note that k ′ i ′ ≥ k i ′ . Also note that i ′ iterates in [1 , i − S i ′ is at most L ≤ k β/ √ log k . So by Theorem 6.6, Bob can get the correct v [ i ].As a result, by a union bound with probability 1 − L − Θ( k ′′ ) , Bob can compute v [ i ] correctly.Note that L = O (log nk ), k = Ω(log nk ). So the probability is at least 1 − − Θ( k ′′ ) .By Lemma 7.5, with probability 1 − / Θ( k ) , the total number of wrongly recovered blocks in T i is at most 2 k after stage 3.9..So the overall probability is as desired.This shows the inductive step. Lemma 7.7.

With probability − − Θ( k ′′ ) , Bob outputs x correctly.Proof. By Lemma 7.6, with probability 1 − − Θ( k ′′ ) , at the last level, there are at most 6 k wrongblocks. Since z ﬁnal is the redundancy for a code with distance 16 k , all wrong blocks can be corrected.So Bob computes x correctly. Lemma 7.8.

The communication complexity is O ( k log nk ) .Proof. Note that since m = O ( k ), z [ i ] = O ( k ). Also note that | v [1] | = O ( k ), as the output lengthfor of the hash function is O (1) and l = O ( k ). | z ﬁnal | = O ( k log nk ) by Theorem 3.5.So the overall communication complexity is P Li =1 | z [ i ] | + | v [1] | + | z ﬁnal | = O ( k log nk ). Theorem 7.9.

There exists an eﬃcient one-way edit distance document exchange protocol usingcommon randomness, for every n ∈ N , k = Ω(log nk ) , having sketch length O ( k log nk ) , successprobability − − Ω( k/ log nk ) .Proof. It immediately follows from Lemma 7.7, 7.8. The protocol is eﬃcient since all componentsand steps are eﬃcient. 30y combining Theorem 7.9 and the result of Haeupler [18], we immediately get the following.

Theorem 7.10.

There exists an eﬃcient one-way edit distance document exchange protocol usingcommon randomness, for every n, k ∈ N , having sketch length O ( k log nk ) , success probability − min { − Θ( k/ log nk ) , / poly ( n ) } .Proof. When k = Ω(log nk ), we use Theorem 7.9. Otherwise we use the random protocol from [18]which has success probability 1 − / poly ( n ). Both of them have the sketch length as desired. In Construction 7.1, we use common randomness to generate hash functions h j , j ∈ [ l i ] for each i ∈ [ L ]. Also we use common randomness to generate the random bipartite graph Γ for the encodingof the hash values. Now we show that we can use almost κ -wise independence generator to reducerandomness. Lemma 7.11.

Replace the common randomness used in Construction 7.1, • for generating hash functions, by an ǫ -almost ck -wise independent distribution, with ǫ =2 − ck ; • for generating Γ , by O ( k ) -wise independent distributions over alphabet [ m ] . (Recall that m = O ( k ) ) Then with probability − − Θ( k ′ ) , Bob outputs x correctly.Proof. We need to recompute the following probabilities.In Lemma 7.2, a speciﬁc error pattern happens with probability at most 2 − ck ′′ ± ǫ ≤ − . ck ′′ .In Lemma 7.4, if there are k wrongly matched blocks introduced by w , then there are k hashcollisions each for a diﬀerent h j in level i . So the probability is at most 2 − ck ± − ck = 2 − Θ( k ) .The rest of the analysis of the above two lemmas can still go through. These two lemmas arethe only two in the proof of Lemma 7.7 which will use the independence of hash functions.As a result, the proof of Lemma 7.7 can still go through. Theorem 7.12.

There exists an eﬃcient one-way edit distance document exchange protocol, for ev-ery k = Ω(log nk ) , having sketch length O ( k max { log nk , log k } ) , success probability − − Ω( k/ log nk ) .Proof. Consider replacing the common randomness used in Construction 7.1 in the way of Lemma7.11. By Theorem 3.6, we can use a generator g of seed length O ( k max { log k, log nk } ) to generatethe O ( k )-wise independent distribution. By Theorem 3.7 we can use a generator g of seed length O (log k log nǫ ) to generate the ǫ -almost 10 ck -wise independent distribution.So we only need to let Alice send the seeds for these two, which have total length O ( k log nk ).Adding the communication complexity calculated by Lemma 7.8, the overall communication com-plexity is as desired.The correctness and success probability follows from 7.11. The protocol is eﬃcient since allcomponents and steps are eﬃcient. 31 Asymmetric Document Exchange with Two Sided Information

In this section we study document exchange with two sided asymmetric information. We have thefollowing deﬁnition.

Deﬁnition 8.1.

There are two parties Alice and Bob. Alice has a string x ∈ { , } n and Bob hasa string y ∈ { , } n . Alice knows a vector of disjoint subsets S A = ( S A , · · · , S At A ) and a vector ofintegers k A = ( k A , · · · , k At A ) . Bob knows a vector of disjoint subsets S B = ( S B , · · · , S Bt B ) and avector of integers k B = ( k B , · · · , k Bt B ) . That is, within each set S Ai or S Bi , the Hamming distancebetween x and y is at most k Ai or k Bi . Now one party tries to learn the string of the other party. Again, let s A = ( s A , · · · , s At ) where ∀ i, s Ai = | S Ai | . Similarly, let s B = ( s B , · · · , s Bt ) where ∀ i, s Bi = | S Bi | . We call this problem an ( s A , s B , k A , k B , t A , t B ) asymmetric document exchange(DE) problem, and we require the protocol to succeed for all possible conﬁgurations of the subsets S A = ( S A , · · · , S At A ), S B = ( S B , · · · , S Bt B ), and all possible strings x, y that are consistent with theparameters.We also have both lower bounds and upper bounds. Theorem 8.2.

In an ( s A , s B , k A , k B , t A , t B ) asymmetric DE problem, suppose Bob learns Alice’sstring. Let s A = P ti =1 s Ai and s B = P ti =1 s Bi , and assume s A + s B ≤ n . Let k A = P ti =1 k Ai and k B = P ti =1 k Bi . Then any deterministic protocol has communication complexity at least H ( n − s B , k A )+ H ( s B , k B ) , and any randomized protocol with success probability ≥ / has communicationcomplexity at least H ( n − s B , k A ) + H ( s B , k B ) − . In addition, if ∀ i, s Bi ≥ k Bi , then any one rounddeterministic protocol has communication complexity at least H ( n, k A + k B ) . This holds even if bothparties know ( s A , s B ) and ( k A , k B ) .Proof. The proof is similar to the one sided case. For a deterministic protocol, assume for thesake of contradiction that there is a protocol with communication complexity less than H ( n − s B , k A ) + H ( s B , k B ). Then ﬁx Bob’s string y and there exist two diﬀerent x ’s that produce thesame transcript, and in addition the inputs to Bob are the same. Thus Bob will not be able todistinguish the two x ’s, a contradiction. The case of a randomized protocol is essentially the sameup to an averaging argument.For the case of one round deterministic protocol, again the argument is similar as before. Assumefor the sake of contradiction that there is a protocol with communication complexity less than H ( n, k A + k B ). Fix Bob’s string y and the number of diﬀerent x ’s within Hamming distance k A + k B is exactly 2 H ( n,k A + k B ) . For each such x , one can arrange the ﬁrst at most k A diﬀerencesto happen in S A , and the rest of at most k B diﬀerences to happen in S B , such that the subsets in S A and S B are all disjoint (since s A + s B ≤ n ). Note that each x gives a vector S A , and the oneround transcript is a deterministic function of ( x, S A , s A , s B , k A , k B ), two diﬀerent x ’s will producethe same transcript. At this point, one can deﬁne a vector S B consistent with s B , k B and both ofthe x ’s (since ∀ i, s Bi ≥ k Bi ). This means the inputs to Bob are the same for the two x ’s. SinceBob’s ﬁnal output is a deterministic function of the transcript and ( y, S B , s A , s B , k A , k B ), Bob willnot be able to distinguish the two x ’s, a contradiction.The positive result directly follows from the one-side result i.e. Theorem 5.8. Theorem 8.3.

There exists an explicit protocol for all ( s A , s B , k A , k B , t A , t B ) DE , having com-munication complexity O (cid:16)(cid:0) χ ( s B , k B , t B ) + 1 (cid:1) (cid:0) H ( n − s B , k A ) + H ( s B , k B ) (cid:1)(cid:17) , success probability − − Ω(min( k t ,k A )) − / poly ( s A + s B ) , to let Bob learn Alice’s string. roof. The two party can just think that there are at most k A errors in set [ n ] − S B . This contributeone more set (and its error bound) to the error pattern. And the problem becomes a one-sideasymmetric information problem. So we can apply Theorem 5.8 and the conclusion follows. References [1] Khaled A. S. Abdel-Ghaﬀar and Amr El Abbadi. An optimal strategy for comparing ﬁle copies.

IEEE Transactions on Parallel and Distributed Systems , 5(1):87–93, 1994.[2] Micah Adler, Erik D. Demaine, Nicholas J.A. Harvey, and Mihai P?atra? Lower bounds forasymmetric communication channels and distributed source cod. In

SODA , pages 251–260,2006.[3] Micah Adler and Bruce M Maggs. Protocols for asymmetric communication channels.

Journalof Computer and System Sciences , 63(4):573–596, 2001.[4] Noga Alon, Oded Goldreich, Johan H˚astad, and Ren´e Peralta. Simple constructions of almostk-wise independent random variables.

Random Structures & Algorithms , 3(3):289–304, 1992.[5] Alexandr Andoni, Javad Ghaderi, Daniel Hsu, Dan Rubenstein, and Omri Weinstein. Codingsets with asymmetric information.

ArXiv e-prints , 2018.[6] Daniel Barbara and Hector Garcia-Molina. Exploiting symmetries for low-cost comparisonof ﬁle copies. In [1988] Proceedings. The 8th International Conference on Distributed , pages471–479. IEEE, 1988.[7] Daniel Barbara and Richard J. Lipton. A class of randomized strategies for low-cost comparisonof ﬁle copies.

IEEE Transactions on Parallel and Distributed Systems , 2(2):160–170, 1991.[8] Djamal Belazzougui and Qin Zhang. Edit distance: Sketching, streaming, and documentexchange. In

Proceedings of the 57th IEEE Annual Symposium on Foundations of ComputerScience , pages 51–60. IEEE, 2016.[9] Boris Bukh and Venkatesan Guruswami. An improved bound on the fraction of correctabledeletions. In

Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discretealgorithms , pages 1893–1901. ACM, 2016.[10] Diptarka Chakraborty, Elazar Goldenberg, and Michal Kouck´y. Low distortion embeddingfrom edit to hamming distance using coupling. In

Proceedings of the 48th IEEE AnnualAnnual ACM SIGACT Symposium on Theory of Computing . ACM, 2016.[11] Kuan Cheng, Zhengzhong Jin, Xin Li, and Ke Wu. Deterministic document exchange protocols,and almost optimal binary codes for edit errors. In , pages 200–211. IEEE, 2018.[12] Kuan Cheng, Zhengzhong Jin, Xin Li, and Ke Wu. Block edit errors with transpositions:Deterministic document exchange protocols and almost optimal binary codes. In . SchlossDagstuhl-Leibniz-Zentrum fuer Informatik, 2019.[13] Graham Cormode, Mike Paterson, Suleyman Cenk Sahinalp, and Uzi Vishkin. Communica-tion complexity of document exchange. In

Proceedings of the Eleventh Annual ACM-SIAMSymposium on Discrete Algorithms , pages 197–206. ACM, 2000.[14] Arnaldo Garcia and Henning Stichtenoth. On the asymptotic behaviour of some towers offunction ﬁelds over ﬁnite ﬁelds.

Journal of number theory , 61(2):248–273, 1996.3315] V. Guruswami and R. Li. Eﬃciently decodable insertion/deletion codes for high-noise andhigh-rate regimes. In ,pages 620–624, July 2016.[16] V. Guruswami and C. Wang. Deletion codes in the high-noise and high-rate regimes.

IEEETransactions on Information Theory , 63(4):1961–1970, April 2017.[17] Venkatesan Guruswami, Christopher Umans, and Salil Vadhan. Unbalanced expanders andrandomness extractors from Parvaresh-Vardy codes.

Journal of the ACM , 56(4), 2009.[18] Bernhard Haeupler. An optimal document exchange protocol. In , 2019.[19] Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: codes for insertionsand deletions approaching the singleton bound. In

Proceedings of the 49th Annual ACMSIGACT Symposium on Theory of Computing , pages 33–46. ACM, 2017.[20] Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: Explicit construc-tions, local decoding, and applications. In

Proceedings of the 50th Annual ACM Symposiumon Theory of Computing , 2018.[21] Tom Høholdt, Jacobus H Van Lint, and Ruud Pellikaan. Algebraic geometry codes.

Handbookof coding theory , 1(Part 1):871–961, 1998.[22] Utku Irmak, Svilen Mihaylov, and Torsten Suel. Improved single-round protocols for remoteﬁle synchronization. In

INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computerand Communications Societies. Proceedings IEEE , volume 3, pages 1665–1676. IEEE, 2005.[23] Hossein Jowhari. Eﬃcient communication protocols for deciding edit distance. In

ESA , 2012.[24] Eduardo Sany Laber and Leonardo Gomes Holanda. A new protocol for asymmetric commu-nication channels: Reaching the lower bounds.

Scientia Iranica , 8(4):297–302, 2001.[25] Eduardo Sany Laber and Leonardo Gomes Holanda. Improved bounds for asymmetric com-munication protocols.

Information Processing Letters , 83(4):205–209, 2002.[26] V. I. Levenshtein. Binary Codes Capable of Correcting Deletions, Insertions and Reversals.

Soviet Physics Doklady , 10:707, February 1966.[27] A Orlitsky and K Viswanathan. Practical algorithms for interactive communication. In

IEEEInt. Symp. on Information Theory , 2001.[28] Alon Orlitsky. Worst-case interactive communication 1: Two messages are almost optimal.

IEEE transactions on Information Theory , 36:1111–1126, 1990.[29] Alon Orlitsky. Interactive communication: Balanced distributions, correlated ﬁles, and average-case complexity. In [1991] Proceedings 32nd Annual Symposium of Foundations of ComputerScience , pages 228–238. IEEE, 1991.[30] L. J. Schulman and D. Zuckerman. Asymptotically good codes correcting insertions, deletions,and transpositions.

IEEE Transactions on Information Theory , 45(7):2552–2557, Nov 1999.[31] Kenneth W Shum, Ilia Aleshnikov, P Vijay Kumar, Henning Stichtenoth, and Vinay Deolalikar.A low-complexity algorithm for the construction of algebraic-geometric codes better than thegilbert-varshamov bound.

IEEE Transactions on Information Theory , 47(6):2225–2241, 2001.[32] Michael Sipser and Daniel A Spielman. Expander codes. In

Proceedings 35th Annual Sympo-sium on Foundations of Computer Science , pages 566–576. IEEE, 1994.[33] Michael Sipser and Daniel A Spielman. Expander codes.

IEEE transactions on InformationTheory , 42(6):1710–1722, 1996. 3434] Daniel A Spielman. Linear-time encodable and decodable error-correcting codes.

IEEE Trans-actions on Information Theory , 42(6):1723–1731, 1996.[35] Torsten Suel, Patrick Noel, and Dimitre Trendaﬁlov. Improved ﬁle synchronization techniquesfor maintaining large replicated collections over slow networks. In

Proceedings. 20th Interna-tional Conference on Data Engineering , pages 153–164. IEEE, 2004.[36] John Watkinson, Micah Adler, and Faith E Fich. New protocols for asymmetric communicationchannels. In