Lattice Codes for the Binary Deletion Channel

Abstract

The construction of deletion codes for the Levenshtein metric is reduced to the construction of codes over the integers for the Manhattan metric by run length coding. The latter codes are constructed by expurgation of translates of lattices. These lattices, in turn, are obtained from Construction~A applied to binary codes and $\mathbb{Z}_4$-codes. A lower bound on the size of our codes for the Manhattan distance is obtained through generalized theta series of the corresponding lattices.


arXiv preprint [cs.IT], June

Lin Sok, Patrick Solé, Aslan Tchamkerten
Telecom ParisTech; King Abdulaziz University
{lin.sok;patrick.sole;aslan.tchamkerten}@telecom-paristech.fr

Keywords: Deletion codes, lattice, Lee metric, Construction A, weight enumerator, $\nu$-series

I. INTRODUCTION

Coding for the binary deletion channel remains a major challenge for coding theorists. Part of the reason is that standard block algebraic coding techniques (parity checks, cosets, syndromes) are precluded by the specificity of the channel, which produces output vectors of variable length. A variation of this channel is the so-called segmented deletion channel, where at most a fixed number of errors can occur within segments of given size [17], [16]. Because of this restriction, the segmented deletion channel does not alter the number of runs if the runs are long enough. Hence, if we view the channel in terms of input/output runlengths, the input and output vectors have the same dimension (assuming long enough runlengths). In this case, algebraic coding techniques can be used.

In this paper, we construct lattice-based codes which, in principle, can be decoded when obtained via Construction A from Lee metric codes with known decoding algorithms [6]. The proposed code constructions are analogous to the so-called $(d,k)$-codes in magnetic recording, where each codeword contains runs of zeros of length at least $d$ and at most $k$ while each run of ones has unit length [14]. Given $d$, $k$ and assuming a constant number of runs of zeros, label the runs by integers modulo $m$ and consider block codes over the ring of integers modulo $m$; the smallest possible $m$ depends on $d$ and $k$.

Our approach differs from the one in [14] in two ways. First, we relax the unit-length constraint on the runs of ones in [14] (which was motivated by magnetic recording applications). Second, we consider lattices rather than codes over the integers modulo $m$, to allow a wider choice of parameters. Indeed, our deletion codes are obtained as sets of vectors in a lattice with a given Manhattan norm. By varying this norm, a single lattice, possibly obtained from a single Lee code by Construction A, can produce an infinity of deletion codes. We extend some results of [1], [21] on generalized theta series, called there $\nu$-series, to effectively enumerate these special sets of vectors in the lattice. In particular, if the lattice is obtained via Construction A from a code, the generalized $\nu$-series allows one to enumerate these sets from the weight enumerators of the code.

The paper is organized as follows. In Section II, we formalize the problem. In Section III, we determine the sizes of codes derived from Construction A lattices. In Section IV, we provide a codebook generation algorithm and a corresponding decoding algorithm for a specific class of lattices which includes the $E_8$ lattice. In Section V, using tools developed in Section III, we derive the analogues of the Gilbert and Hamming bounds for the Manhattan metric space. In Section VI, we derive the asymptotic versions of these bounds. In Section VII, we provide a few concluding remarks and point to some open problems.

II. BACKGROUND AND STATEMENT OF THE PROBLEM

Consider a binary sequence of length $N$ that starts with a zero and that contains an even number $n$ of runs, hence $n/2$ runs of zeros and $n/2$ runs of ones. For instance, a sequence of ten bits made of four runs corresponds to $N = 10$ and $n = 4$. Throughout the paper we make the following hypothesis:

Working hypothesis. In any given code, $n$ is the same across codewords and all codewords start with a zero. Moreover, the runlengths in each codeword are lower bounded by some constant $r \ge 1$, where $r - 1$ corresponds to the maximum number of deletions that can occur over a length-$N$ codeword. This condition is imposed so that the number of runs before and after transmission remains the same.

With a given length-$N$ binary sequence we associate its corresponding runlength sequence
$$(x_1, y_1, \ldots, x_i, y_i, \ldots, x_{n/2}, y_{n/2}),$$
where $x_i$ and $y_i$ denote the $i$th runlength of zeros and ones, respectively. For instance, a sequence with $N = 10$ and $n = 4$ whose first run of zeros has length $2$ corresponds to a vector of the form $(2, y_1, x_2, y_2)$ with $2 + y_1 + x_2 + y_2 = 10$. The integer sequence so constructed satisfies the constraint
$$N = \sum_{i=1}^{n/2} (x_i + y_i).$$
Denote by $\phi$ the above correspondence from $\mathbb{F}_2^N$ to $\mathbb{Z}^n$.

This work was supported in part by an Excellence Chair Grant from the French National Research Agency (ACE project).

The Levenshtein distance between two binary vectors is the least number of deletions to go from one to the other [15]. The Manhattan distance between two vectors $w, z \in \mathbb{Z}^n$ is defined as
$$|w - z| \stackrel{\mathrm{def}}{=} \sum_{i=1}^{n} |w_i - z_i|.$$
The following observation is trivial but crucial.
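Before stating it, the definitions above can be made concrete with a minimal Python sketch of the runlength map and of the Manhattan distance. The function names `runlengths` and `manhattan` and the example sequence are illustrative assumptions, not taken from the paper.

```python
from itertools import groupby


def runlengths(bits: str) -> list[int]:
    """Map a binary string starting with '0' to its runlength sequence (the map phi)."""
    if not bits or bits[0] != "0":
        raise ValueError("sequence must be non-empty and start with a zero")
    # groupby yields one group per maximal run of identical symbols.
    return [len(list(group)) for _, group in groupby(bits)]


def manhattan(w: list[int], z: list[int]) -> int:
    """Manhattan (L1) distance between two integer vectors of equal length."""
    if len(w) != len(z):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(w, z))


# Hypothetical example (not from the paper): N = 10 bits, n = 4 runs, all runs >= 2.
sent = "0011100011"
# Delete one bit inside the second run (a run of ones); no run disappears.
received = sent[:2] + sent[3:]

x = runlengths(sent)
y = runlengths(received)
print(x, y, manhattan(x, y))
```

In this sketch a single deletion inside a run changes exactly one runlength by one, so the two runlength vectors are at Manhattan distance 1, which is the behaviour the working hypothesis is designed to guarantee.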

Proposition 1. Under the above working hypothesis, the map $\phi$ is an isometry between $\mathbb{F}_2^N$ with the Levenshtein distance and $\mathbb{Z}^n$ with the Manhattan distance.

Proof: Let $z = (x_1, y_1, \ldots, x_{n/2}, y_{n/2})$ denote a sequence of runs. Let $j$ be an integer $\le r - 1$. Any deletion of $j$ zeros (resp. ones) in run number $i$ will result in a change of $x_i$ (resp. $y_i$) into $x_i \pm j$ (resp. $y_i \pm j$), yielding a sequence $z'$ at Manhattan distance $j$ away from $z$.

Any set of length-$n$ vectors with integral entries $\ge r$, at Manhattan distance at least $d$ apart, and coordinates summing up to $N$, we refer to as an $(n,d,N,r)$-set. The problem we consider is to characterize $A(n,d,N,r)$, the largest size of an $(n,d,N,r)$-set.

III. ENUMERATION FOR CONSTRUCTION A LATTICES

A code $C \subseteq \mathbb{Z}_m^n$ is defined as a $\mathbb{Z}_m$-submodule of $\mathbb{Z}_m^n$. The complete weight enumerator (cwe) of $C$ is defined as the polynomial (see [22, Chap. 5.6])
$$\mathrm{cwe}_C(x_0, x_1, \ldots, x_{m-1}) = \sum_{c \in C} \prod_{i=0}^{m-1} x_i^{n_i(c)},$$
where $n_i(c)$ is the number of entries equal to $i$ in the vector $c$. For $m = 2$, we let $W_C(x,y) \stackrel{\mathrm{def}}{=} \mathrm{cwe}_C(x,y)$ be the classical weight enumerator of a binary code.

A lattice of $\mathbb{R}^n$ is defined as a discrete additive subgroup of $\mathbb{R}^n$. A lattice $L$ is said to be obtained by Construction A from a code $C$ of $\mathbb{Z}_m^n$ if $C$ is the image of $L$ by reduction modulo $m$ componentwise [8, Chap. 7.2]. Such a lattice is denoted by $L = A(C)$. An important parameter of a lattice is its minimum distance (norm), which is given by Proposition 2 below.

Recall that the Lee weight of a symbol $x \in \mathbb{Z}_m = \{0, 1, \ldots, m-1\}$ is defined as $\min(x, m - x)$. The weight of a vector is the sum of the weights of its components, and the Lee distance of two vectors is the Lee weight of their difference vector. The Lee distance of a linear code $C \subseteq \mathbb{Z}_m^n$ is the minimum weight of its nonzero elements.

Proposition 2 ([19]). Let $L = A(C)$ for some $C \subseteq \mathbb{Z}_m^n$. Then the minimum distance of $L$ is given by $d = \min(d', m)$, where $d'$ is the minimum Lee distance of $C$.

For an integer $r \ge 0$ define
$$\nu_L(r; q) \stackrel{\mathrm{def}}{=} \sum_{x \in L \,:\, \min_i x_i \ge r} q^{|x|}$$
as the shifted $\nu$-series in the indeterminate $q$ of the lattice $L$. This definition extends trivially to any discrete subset $L$ of $\mathbb{R}^n$. The motivation for this generating function, whose case $r = 0$ is the $\nu$-series of [1], [20], stems from Proposition 3 below, which gives a lower bound on $A(n,d,N,r)$.

Notation.

We use the Waterloo notation for coefficients of generating series (see [13]). Given a $q$-series $f = \sum_i f_i q^i$, we denote by $[q^i] f(q)$ the coefficient $f_i$.

Proposition 3. If $L$ is a lattice of $\mathbb{R}^n$ with minimum Manhattan distance $d$, then the set of vectors of $L$ with coordinate entries bounded below by $r$ and Manhattan norm $N$ forms an $(n,d,N,r)$-set of size
$$[q^N]\nu_L(r;q) \le A(n,d,N,r).$$

The proof of Proposition 3 immediately follows from the definitions of $[q^N]\nu_L(r;q)$ and $A(n,d,N,r)$.

We now show how to compute (shifted) $\nu$-series of lattices from (complete) weight enumerators of codes.

Theorem 1. If $L = A(C)$ and $m = 2$, then
$$\nu_L(r;q) = W_C\left(\frac{q^a}{1-q^2}, \frac{q^b}{1-q^2}\right),$$
where $a$ (resp. $b$) is the first even (resp. odd) integer $\ge r$. If $L = A(C)$ and $m = 4$, then
$$\nu_L(r;q) = \mathrm{cwe}_C\left(\frac{q^a}{1-q^4}, \frac{q^b}{1-q^4}, \frac{q^c}{1-q^4}, \frac{q^d}{1-q^4}\right),$$
where $a, b, c, d$ are the first integers $\ge r$ congruent to $0, 1, 2, 3$ modulo $4$, respectively.

Proof: Use the same argument as in [1], [21] and write $A(C)$ as a disjoint union of cosets of $m\mathbb{Z}^n$ to obtain
$$\nu_L(r;q) = W_C\left(\nu_{2\mathbb{Z}}(r;q), \nu_{2\mathbb{Z}+1}(r;q)\right)$$
for $m = 2$, and
$$\nu_L(r;q) = \mathrm{cwe}_C\left(\nu_{4\mathbb{Z}}(r;q), \nu_{4\mathbb{Z}+1}(r;q), \nu_{4\mathbb{Z}+2}(r;q), \nu_{4\mathbb{Z}+3}(r;q)\right)$$
for $m = 4$. The result follows by observing that, for instance, $\nu_{2\mathbb{Z}}(r;q) = q^a/(1-q^2)$, and by summing the appropriate geometric series of ratio $q^2$ or $q^4$.

In Column 2 of Tables I, II, and III, we list for some values of $N$ and $r$ the lower bound $[q^N]\nu_L(r;q)$ to $A(n,d,N,r)$ for the well-known lattices $E_8$, $BW_{16}$, and $\Lambda_{24}$. These lattices are constructed from the extended Hamming code $H_8$ modulo $2$ or the Klemm code $K_8$ modulo $4$ for $E_8$, the code $RM(1,4) + 2RM(2,4)$ for $BW_{16}$, and the lifted Golay code $QR_{24}$ for $\Lambda_{24}$. Here $K_s = R_s + 2P_s$, where $R_s$ denotes the length-$s$ repetition code, $P_s = R_s^{\perp}$ denotes its dual code, and $RM(k,m)$ denotes the order-$k$ Reed-Muller code of length $2^m$. Some cwe's for these codes can be found in [2], [3], while others were computed using Magma [4]. The cwe of $K_n$ is easily seen to be
$$\frac{1}{2}\left[(x_0 + x_2)^n + (x_0 - x_2)^n + (x_1 + x_3)^n + (x_1 - x_3)^n\right].$$

These numerical results show, for instance, that for $r = 2$ and $N = 64$, among the three lattices $E_8$, $BW_{16}$ and $\Lambda_{24}$, $BW_{16}$ achieves the best lower bound, while $\Lambda_{24}$ achieves the best bound for $r = 1$ and $N = 64$.

We now add an extra ingredient to the above construction which improves the lower bound on $A(n,d,N,r)$ for $N$ large enough. Let $L$ be a Construction A lattice in $\mathbb{Z}^{n-1}$ with $L_1$-distance $d$. From this lattice in $\mathbb{Z}^{n-1}$ we construct a new set of points in $\mathbb{Z}^n$ as
$$\hat{L} \stackrel{\mathrm{def}}{=} \left\{ \left(x_1, x_2, \ldots, x_{n-1}, N - \sum_{i=1}^{n-1} x_i\right) \,\middle|\, (x_1, \ldots, x_{n-1}) \in L \right\}.$$
Note that the map
$$(x_1, x_2, \ldots, x_{n-1}) \mapsto \left(x_1, x_2, \ldots, x_{n-1}, N - \sum_{i=1}^{n-1} x_i\right)$$
is the Manhattan analogue of the Yaglom map (see, e.g., [8, Chap. 9, Theorem 6])
$$(x_1, x_2, \ldots, x_{n-1}) \mapsto \left(x_1, x_2, \ldots, x_{n-1}, \left(N - \sum_{i=1}^{n-1} x_i^2\right)^{1/2}\right)$$
from $\mathbb{R}^{n-1}$ to $\mathbb{R}^n$.

Column 3 of Tables I and II gives the lower bound $[q^N]\nu_{\hat{L}}(r;q)$ for the second proposed code construction. As we can observe in the last rows of these tables, for $N$ large enough this second construction improves the first.

In this section we derived lower bounds on $A(n,d,N,r)$ in a non-constructive fashion from the properties of $L$ and $\hat{L}$ using generating functions (Proposition 3). In the next section we provide an explicit code construction for a specific family of lattices along with an effective decoding algorithm.

TABLE I
SIZE $[q^N]\nu_L(r;q)$ OF $(n,d,N,r)$-SET WITH $L = A(H_8)$, $d \ge 2$ AND $r = 1, 2$

N     $[q^N]\nu_{E_8}(2;q)$    $[q^N]\nu_{\hat{E}_8}(2;q)$
16    1                        0
18    8                        1
20    50                       9
22    232                      59
24    835                      291
26    2480                     1126
28    6372                     3606
30    14640                    9978
32    30789                    24618
34    60280                    55407
36    111254                   115687
38    195416                   226941
40    329095                   422357
42    534496                   751452
44    841160                   1285948
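As an illustration of how Theorem 1 turns a weight enumerator into the counts of Table I, here is a minimal Python sketch for the binary case $m = 2$. It assumes the standard weight distribution of the extended Hamming code $H_8$ (one word of weight 0, fourteen of weight 4, one of weight 8), which is not stated in the text; the function names are illustrative.

```python
from math import comb

# Weight distribution of the extended Hamming code H_8 (standard fact, assumed here):
# weight -> number of codewords.
H8_WEIGHTS = {0: 1, 4: 14, 8: 1}


def first_at_least(r: int, residue: int, modulus: int) -> int:
    """Smallest integer >= r that is congruent to `residue` modulo `modulus`."""
    return r + (residue - r) % modulus


def nu_coefficient(weights: dict[int, int], n: int, r: int, N: int) -> int:
    """[q^N] nu_L(r; q) for L = A(C), C binary of length n, via Theorem 1 (m = 2)."""
    a = first_at_least(r, 0, 2)  # first even integer >= r
    b = first_at_least(r, 1, 2)  # first odd integer >= r
    total = 0
    for w, count in weights.items():
        # A weight-w codeword contributes q^(a(n-w) + b w) / (1 - q^2)^n.
        rest = N - a * (n - w) - b * w
        if rest >= 0 and rest % 2 == 0:
            # [q^(2k)] 1/(1 - q^2)^n = C(k + n - 1, n - 1)
            total += count * comb(rest // 2 + n - 1, n - 1)
    return total


def nu_hat_coefficient(weights: dict[int, int], n: int, r: int, N: int) -> int:
    """Count for the lifted set: the appended coordinate N - sum(x) must also be >= r."""
    return sum(nu_coefficient(weights, n, r, M) for M in range(N - r + 1))


for N in range(16, 26, 2):
    print(N, nu_coefficient(H8_WEIGHTS, 8, 2, N), nu_hat_coefficient(H8_WEIGHTS, 8, 2, N))
```

Under these assumptions the loop reproduces the first rows of Table I for $r = 2$. The second function relies on the observation that a point of $\hat{L}$ has all coordinates $\ge r$ exactly when the underlying lattice point has entries $\ge r$ and Manhattan norm at most $N - r$.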

IV. CODE CONSTRUCTION AND DECODING ALGORITHM

In this section, we describe two algorithms with respect to the lattice $A(K_n)$:
• a search algorithm that generates explicitly an $(n,d,N,r)$-set carved from the lattice;
• a corresponding decoding algorithm.

Define the code
$$C(n,d,N,r) \stackrel{\mathrm{def}}{=} \left\{ c \in A(K_n) : \min_i c_i \ge r, \ \sum_{i=1}^{n} c_i = N \right\}$$
and note that the minimum distance of $C(n,d,N,r)$ is at least $4$, the minimum distance inherited from $A(K_n)$. The generator matrix $G$ for the lattice $A(K_n)$ is
$$G = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 & 1 \\ 0 & 2 & 0 & \cdots & 0 & 2 \\ 0 & 0 & 2 & \cdots & 0 & 2 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 2 & 2 \\ 0 & 0 & 0 & \cdots & 0 & 4 \end{pmatrix},$$
hence any codeword $c$ in $C(n,d,N,r)$ can be expressed as
$$c = \left(x_1, \ x_1 + 2x_2, \ \ldots, \ x_1 + 2x_{n-1}, \ x_1 + 2\sum_{i=2}^{n-1} x_i + 4x_n\right)$$
with $l_i \le x_i \le u_i$, where $l_i$ and $u_i$ are determined as follows. Define
$$S_i \stackrel{\mathrm{def}}{=} x_1 + \sum_{j=2}^{i} (x_1 + 2x_j) \quad \text{and} \quad T \stackrel{\mathrm{def}}{=} x_1 + 2\sum_{j=2}^{n-1} x_j.$$
Then
• for $i = 1$: $l_1 = r$ and $u_1 = N - (n-1)r$;
• for $2 \le i \le n-1$: $l_i = \lceil (r - x_1)/2 \rceil$ and $u_i = \lfloor (N - (n-i)r - S_{i-1} - x_1)/2 \rfloor$;
• for $i = n$: $l_n = \lceil (r - T)/4 \rceil$ and $u_n = \lfloor (N - (n-1)r - T)/4 \rfloor$.

Searching the codewords can be done by a tree search through all nodes from level $1$ (corresponding to $x_1$) to level $n$ (corresponding to $x_n$). With the above constraints, we are able to efficiently generate all codewords in $C(n,d,N,r)$. Numerical results are given in Table IV.

Table V gives, for $n = 8$, $N = 12$, $r = 1$ and the quaternary lattice $E_8 = A(K_8)$, the number of visited nodes at level $i$ and its naive upper bound, which is roughly $(N-7)\left(\frac{N-6}{2}\right)^{i-1}$, for different $i$'s. Table VI gives the number of visited nodes at level $i = 6$ for different values of $N$ (we keep $n = 8$ and $r = 1$).

TABLE II
SIZE $[q^N]\nu_L(r;q)$ OF $(n,d,N,r)$-SET WITH $L = A(K_8)$, $d \ge 4$ AND $r = 1, 2$

N     $[q^N]\nu_{E_8}(2;q)$    $[q^N]\nu_{\hat{E}_8}(2;q)$
16    1                        0
20    36                       1
24    331                      37
28    1752                     368
32    6765                     2120
36    21164                    8885
40    56823                    30049
44    135728                   86872
48    295545                   222600
52    596980                   518145
56    1133187                  1115125
60    2041480                  2248312
64    3517605                  4289792
68    5832828                  7807397
72    9354095                  13640225

TABLE III
SIZE $[q^N]\nu_L(r;q)$ OF $(n,d,N,r)$-SET WITH $L = BW_{16}, \Lambda_{24}$, $d \ge 4$ AND $r = 1, 2$

N (r = 1)    N (r = 2)    $[q^N]\nu_{BW_{16}}(r;q)$
16           32           1
20           36           16
24           40           306
28           44           3984
32           48           39235
36           52           310176
40           56           2016996
44           60           11005344
48           64           51463749
52           68           210557360
56           72           767796630
60           76           2535136560
64           80           7680579975
68           84           21588192576
72           88           56814408136

N (r = 1)    N (r = 2)    $[q^N]\nu_{\Lambda_{24}}(r;q)$
24           48           1
28           52           24
32           56           300
36           60           2600
40           64           23415
44           68           299760
48           72           4144211
52           76           48058824
56           80           448956690
60           84           3450990152
64           88           22448210613
68           92           126639274800
72           96           632120648146
76           100          2837407970784
80           104          11605964888130

TABLE IV
CODEWORDS IN $E_8$ WITH RLL REPRESENTATION FOR $r = 1$, $N = 12$

The 36 codewords are the 8 arrangements of $(5,1,1,1,1,1,1,1)$ and the 28 arrangements of $(3,3,1,1,1,1,1,1)$.

TABLE V
NUMBER OF VISITED NODES AND ITS UPPER BOUND OF SEARCHING CODEWORDS FROM $E_8 = A(K_8)$ WITH $r = 1$, $N = 12$

Level    nodes    Upper bound
2        9        15
3        11       45
4        16       135
5        21       405
6        28       1215
7        36       3645

TABLE VI
NUMBER OF VISITED NODES AND ITS UPPER BOUND OF SEARCHING CODEWORDS FROM $E_8 = A(K_8)$ FOR $r = 1$

N     nodes (level 7)    nodes (level 6)    Upper bound (level 6)
8     1                  1                  1
12    36                 28                 1215
16    331                217                28125
20    1752               1008               218491
24    6765               3465               1003833
28    21164              9724               3382071
32    56823              23569              9282325
36    135728             51136              22021875
40    295545             101745             46855281
44    596980             188860             91615663
48    1133187            331177             167448141
52    2041480            553840             289635435
56    3517605            889785             478515625
60    5832828            1381212            760492071
64    9354095            2081185            1169135493

We now turn to decoding. Recall that in [6] the decoding of a Construction A $q$-ary lattice for the $L_1$-norm is reduced to that of a $q$-ary linear code for the Lee metric.

We now describe our decoding algorithm for the $C(n,d,N,r)$ code (carved from $A(K_n)$) using the runlength limited (RLL) sequence of its codewords. Recall that, because of our working hypothesis, the channel preserves the number of runs.

From the definition of $A(K_n)$ we have
$$A(K_n) = 2D_n \cup (\mathbf{1} + 2D_n),$$
where
$$D_n \stackrel{\mathrm{def}}{=} \left\{ x \in \mathbb{Z}^n \,\middle|\, \sum_{i=1}^{n} x_i \equiv 0 \pmod{2} \right\}.$$
It is clear that $D_n$ contains
$$A_{n-1} = \left\{ x \in \mathbb{Z}^n \,\middle|\, \sum_{i=1}^{n} x_i = 0 \right\}$$
as a sublattice. Following [7], we reduce the decoding in $D_n$ to the decoding in $A_{n-1}$ by noting that $D_n = k + 2A_{n-1}$ with $k = (N, 0, \ldots, 0)$.

The following lemma allows us to find a closest codeword in $A_{n-1}$ to a received vector in $\mathbb{Z}^n$.

Lemma 1.

Any vector in $\mathbb{Z}_+^n$ with coordinates summing up to $s$ is at $L_1$-distance at least $|s|$ from any vector in $A_{n-1}$.

Proof: Let $x = (x_1, x_2, \ldots, x_n) \in \mathbb{Z}_+^n$ with $\sum_{i=1}^{n} x_i = s$ and $y \in A_{n-1}$. Then
$$|x - y| = \sum_{i=1}^{n} |x_i - y_i| \ge \left|\sum_{i=1}^{n} (x_i - y_i)\right| = s$$
since $\sum_{i=1}^{n} y_i = 0$.

Proposition 4.

Let
$$\phi^{(i)} : \mathbb{Z}^n \to A_{n-1} \qquad (1)$$
$$x \mapsto \left(\phi^{(i)}_1, \phi^{(i)}_2, \ldots, \phi^{(i)}_n\right), \qquad (2)$$
where
$$\phi^{(i)}_j = \begin{cases} x_j - (x_1 + \cdots + x_n) & \text{if } j = i, \\ x_j & \text{if } j \ne i. \end{cases}$$
Then for any $x \in \mathbb{Z}_+^n$, $\phi^{(i)}(x)$ is a closest point of $A_{n-1}$ to $x$.

Proof: The proof follows from Lemma 1 with $s = |x|$.

In case of a single deletion error (recall that the minimum distance of $A(K_n)$ is $4$), there exists a unique $i \in \{1, 2, \ldots, n\}$ such that $A_{n-1}$ contains $\phi^{(i)}(x)$. That $i$ is where the error occurs.

Algorithm
Input: A received vector $x$ of length $n$
Output: A nearest codeword $\hat{x}$ to $x$
1) $N \leftarrow$ length of the corresponding binary code
2) $a \leftarrow$ a coset representative of $A_{n-1}$ in $D_n$
3) if $\sum_{i=1}^{n} x[i] == N - 1$ then
4)     $\hat{x} \leftarrow x$
5)     Find (the unique) coordinate $\hat{x}[j]$ whose parity is different from the others
6)     $\hat{x}[j] \leftarrow \hat{x}[j] + 1$
7) else
8)     $\hat{X} \leftarrow x - a$
9)     $s \leftarrow \sum_{i=1}^{n} \hat{X}[i]$
10)    for $i \leftarrow 1$ to $n$ do
11)        $\hat{x} \leftarrow \hat{X}$
12)        $\hat{x}[i] \leftarrow \hat{x}[i] - s$
13)        if all coordinates of $\hat{x}$ are even then
14)            break
15)        end if
16)    end for
17)    $\hat{x} \leftarrow \hat{x} + a$
18) end if
19) return $\hat{x}$

The complexity of our algorithm can be calculated as follows:
• line 3 requires $n - 1$ additions;
• line 8 requires $n$ additions;
• line 9 requires $n - 1$ additions;
• lines 10 to 16 require one addition (plus one parity test) $n$ times;
• line 17 requires $n$ additions.
Thus the decoding algorithm requires $5n - 2$ additions over $\mathbb{Z}$ plus $n$ parity tests.

For instance, take $n = 8$, $N = 12$, $r = 1$ and consider $x = (3, \ldots)$ as a received word. The code $C(8, 4, 12, 1)$ has $36$ codewords and has minimum distance $4$. By taking $a = (1, \ldots)$ as coset representative of $A_{n-1}$ in $D_n$, the nearest codewords in $A_{n-1}$ to $x - a$ are $\phi^{(1)}(x - a), \phi^{(2)}(x - a), \ldots, \phi^{(8)}(x - a)$. Since $\phi^{(2)}(x - a)$ is the only codeword in $A_{n-1}$, we decode $x$ as $\phi^{(2)}(x - a) + a = (3, \ldots)$.

V. BOUNDS ON $A(n,d,N,r)$

First we recall a well-known identity of formal power series.

Lemma 2.

For any integer $n \ge 1$, we have
$$\frac{1}{(1-q)^n} = \sum_{i=0}^{\infty} \binom{i+n-1}{n-1} q^i.$$

Proof: Differentiate the geometric series
$$\frac{1}{1-q} = \sum_{i=0}^{\infty} q^i$$
with respect to $q$ and use induction on $n$.

Using generating functions, we compute the volume $V(n,e)$ of the Manhattan ball of radius $e$ in $\mathbb{Z}^n$.

Lemma 3.

For any integers $n \ge e \ge 1$, we have
$$V(n,e) = [q^e] \frac{(1+q)^n}{(1-q)^{n+1}} = \sum_{i=0}^{\min(n,e)} 2^i \binom{n}{i} \binom{e}{i}.$$

Proof:
$$V(n,e) = \sum_{i=0}^{e} [q^i] \nu_{\mathbb{Z}^n}(-\infty; q) = \sum_{i=0}^{e} [q^i] \left(\frac{1+q}{1-q}\right)^n = [q^e] \frac{(1+q)^n}{(1-q)^{n+1}}.$$
The second expression in the Lemma is from [10]. It can be rederived from the above generating series by expanding
$$\frac{(1+q)^n}{(1-q)^{n+1}} = \frac{1}{1-q}\left(1 + \frac{2q}{1-q}\right)^n = \sum_{i=0}^{n} \binom{n}{i} \frac{2^i q^i}{(1-q)^{i+1}}$$
through Lemma 2.

By the same techniques, we can compute the volume of the ambient space $A(n,1,N,r)$.

Lemma 4.

For any integers $N > nr$ and $r > e \ge 1$, we have
$$A(n,1,N,r) = \binom{N - nr + n - 1}{n - 1}.$$

Proof:
$$A(n,1,N,r) = [q^N] \nu_{\mathbb{Z}^n}(r; q) = [q^N] \left(\frac{q^r}{1-q}\right)^n = [q^{N-nr}] \frac{1}{(1-q)^n}.$$
The result follows from Lemma 2.

We are now in a position to formulate the analogues of the Gilbert and Hamming bounds in the present context.
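As a quick sanity check of Lemmas 3 and 4 before they are combined, the closed forms can be compared with exhaustive counts for small parameters. This is a minimal Python sketch; the function names are illustrative and the brute-force enumeration is only practical for tiny $n$.

```python
from itertools import product
from math import comb


def ball_volume(n: int, e: int) -> int:
    """V(n, e): closed form of Lemma 3 for the Manhattan ball of radius e in Z^n."""
    return sum(2**i * comb(n, i) * comb(e, i) for i in range(min(n, e) + 1))


def ball_volume_bruteforce(n: int, e: int) -> int:
    """Count integer vectors of length n with Manhattan norm at most e."""
    return sum(1 for v in product(range(-e, e + 1), repeat=n) if sum(map(abs, v)) <= e)


def ambient_size(n: int, N: int, r: int) -> int:
    """A(n, 1, N, r): closed form of Lemma 4."""
    return comb(N - n * r + n - 1, n - 1)


def ambient_size_bruteforce(n: int, N: int, r: int) -> int:
    """Count length-n integer vectors with entries >= r that sum to N."""
    return sum(1 for v in product(range(r, N + 1), repeat=n) if sum(v) == N)


print(ball_volume(3, 2), ball_volume_bruteforce(3, 2))
print(ambient_size(3, 7, 2), ambient_size_bruteforce(3, 7, 2))
```

Both pairs of counts agree on these small cases, which is consistent with the two lemmas but of course does not replace their proofs.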

Theorem 2.

For any integers $N > nr$, $n \ge d$, and $r > e = \lfloor (d-1)/2 \rfloor \ge 1$, we have
$$\frac{\binom{N-nr+n-1}{n-1}}{V(n,d-1)} \le A(n,d,N,r) \le \frac{\binom{N-nr+n-1}{n-1}}{V(n,e)}.$$

Proof:

Combine Lemma 3 and Lemma 4 with the standard arguments.

The lower and upper bounds on $A(n,d,N,r)$ in Theorem 2 are given in Table VII and Table VIII for lattices $E_8$ and $BW_{16}$. In these tables we define
$$I(n,d,N,r) \stackrel{\mathrm{def}}{=} \left\lceil \frac{\binom{N-nr+n-1}{n-1}}{V(n,d-1)} \right\rceil \quad \text{and} \quad S(n,e,N,r) \stackrel{\mathrm{def}}{=} \left\lfloor \frac{\binom{N-nr+n-1}{n-1}}{V(n,e)} \right\rfloor.$$
The numerical results show that $[q^N]\nu_L(r;q)$ (a lower bound to $A(n,d,N,r)$ by Proposition 3) lies between $I(n,d,N,r)$ and $S(n,e,N,r)$ for many parameter values. Exceptions are, for instance, for $BW_{16}$ with $r = 2$ and $N = 48, \ldots, 96$. Whether these code constructions yield sizes between $I(n,d,N,r)$ and $S(n,e,N,r)$ for large $N$ is an open issue.

Since all codewords have constant Manhattan norm, it is natural to consider the Johnson bound in the Lee metric:

Theorem 3. If d > N (1 − / n ), then we have A(n, d, N, r) ≤ d / (d − N (1 − / n )).

Proof: Reduce all vectors modulo $Q = 2N$. Use Lemma 13.62 of [5] with D = Q/ N/ , and $x = 1/n$.

TABLE VII
BOUNDS ON $A(n,d,N,r)$ WITH $L = E_8$ AND $r = 2, 3, 4$

N (r = 2)    N (r = 3)    N (r = 4)    $I(8,4,N,r)$    $[q^N]\nu_{E_8}(r;q)$    $S(8,1,N,r)$
24           32           40           8               331                      378
28           36           44           61              1752                     2964
32           40           48           295             6765                     14421
36           44           52           1067            21164                    52237
40           48           56           3157            56823                    154680
44           52           60           8073            135728                   395560
48           56           64           18465           295545                   904761
52           60           68           38685           596980                   1895536
56           64           72           75500           1133187                  3699499
60           68           76           138986          2041480                  6810300
64           72           80           243611          3517605                  11936925
68           76           84           409544          5832828                  20067614
72           80           88           664191          9354095                  32545333
76           84           92           1043996         14567520                 51155776
80           88           96           1596508         22105457                 78228865
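The two bounds of Theorem 2 are straightforward to evaluate. The following Python sketch computes them with the rounding that matches the tabulated values (ceiling for the lower bound $I$, floor for the upper bound $S$); the function names are illustrative, and the parameters $d = 4$, $e = 1$ used in the example are assumptions that reproduce the first rows of Table VII.

```python
from math import comb


def ball_volume(n: int, e: int) -> int:
    """V(n, e): volume of the Manhattan ball of radius e in Z^n (Lemma 3)."""
    return sum(2**i * comb(n, i) * comb(e, i) for i in range(min(n, e) + 1))


def ambient_size(n: int, N: int, r: int) -> int:
    """A(n, 1, N, r): number of vectors with entries >= r summing to N (Lemma 4)."""
    return comb(N - n * r + n - 1, n - 1)


def gilbert_lower(n: int, d: int, N: int, r: int) -> int:
    """I(n, d, N, r): Gilbert-type lower bound, rounded up."""
    total, volume = ambient_size(n, N, r), ball_volume(n, d - 1)
    return -(-total // volume)  # integer ceiling


def hamming_upper(n: int, d: int, N: int, r: int) -> int:
    """S(n, e, N, r) with e = floor((d - 1) / 2): Hamming-type upper bound, rounded down."""
    e = (d - 1) // 2
    return ambient_size(n, N, r) // ball_volume(n, e)


# E_8 with r = 2 (assumed d = 4), then BW_16 with r = 2 (assumed d = 4).
for n, N in [(8, 24), (8, 28), (16, 36), (16, 40)]:
    print(n, N, gilbert_lower(n, 4, N, 2), hamming_upper(n, 4, N, 2))
```

Integer arithmetic is used throughout so that the rounding is exact even for the large binomial coefficients that appear for $BW_{16}$.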

TABLE VIII
BOUNDS ON $A(n,d,N,r)$ WITH $L = BW_{16}$ AND $r = 2, 3, 4$

N (r = 2)    N (r = 3)    N (r = 4)    $I(16,4,N,r)$    $[q^N]\nu_{BW_{16}}(r;q)$    $S(16,1,N,r)$
36           52           68           1                16                           117
40           56           72           82               306                          14858
44           60           76           2890             3984                         526783
48           64           80           49949            39235                        9107278
52           68           84           539795           310176                       98422520
56           72           88           4178302          2016996                      761843656
60           76           92           25184088         11005344                     4591898687
64           80           96           124915457        51463749                     22776251653
68           84           100          529944363        210557360                    96626522164
72           88           104          1977679995       767796630                    360596985630
76           92           108          6630474804       2535136560                   1208956572561
80           96           112          20297778673      7680579975                   3700961644542
84           100          116          57467324395      21588192576                  10478208814512
88           104          120          152025004051     56814408136                  27719225738485
92           108          124          378928483749     141077361984                 69091293536850
96           112          128          896068510238     332674600329                 163383158366718

VI. ASYMPTOTIC BOUNDS ON $A(n,d,N,r)$

We assume that $r$ is fixed, that $N \to \infty$, and that $n \sim \eta N / r$, $d \sim \delta N$ for some constants $\eta, \delta$ with $\eta \in (0,1)$ and $\delta \ge 0$. Because each codeword has weight $N$, the triangle inequality in the Manhattan metric shows that $\delta \in (0,2)$. Denote by $R$ the asymptotic exponent of $A(n,d,N,r)$, that is
$$R \stackrel{\mathrm{def}}{=} \limsup \frac{1}{N} \log_2 A(n,d,N,r).$$
The asymptotic form of Theorem 3 further restricts the range of $\delta$ for which $R \ne 0$. Let
$$L(x) = x \log_2 x + \log_2\left(x + \sqrt{x^2 + 1}\right) - x \log_2\left(\sqrt{x^2 + 1} - 1\right).$$
It was proved in [9] that when $n \to \infty$ and $e \sim \epsilon n$,
$$\lim \frac{1}{n} \log_2 V(n,e) = L(\epsilon).$$
For convenience, let
$$H(q) \stackrel{\mathrm{def}}{=} -q \log_2 q - (1-q) \log_2 (1-q)$$
denote the binary entropy function and let
$$f(x,y,z) \stackrel{\mathrm{def}}{=} \left[1 - y + \frac{y}{x}\right] H\left(\frac{y}{y + x(1-y)}\right) - \frac{y}{x} L\left(\frac{xz}{y}\right).$$
We establish the asymptotic version of Theorem 2.
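The functions $L$, $H$ and $f$ of this section are easy to evaluate numerically. The Python sketch below assumes base-2 logarithms throughout and compares $L(\epsilon)$ with the normalized logarithm of the exact ball volume for one moderately large $n$; the function names and the chosen parameters are illustrative.

```python
from math import comb, log2, sqrt


def lee_ball_exponent(x: float) -> float:
    """L(x) = x log2 x + log2(x + sqrt(x^2 + 1)) - x log2(sqrt(x^2 + 1) - 1), x > 0."""
    s = sqrt(x * x + 1.0)
    return x * log2(x) + log2(x + s) - x * log2(s - 1.0)


def binary_entropy(q: float) -> float:
    """H(q), with the usual convention H(0) = H(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * log2(q) - (1.0 - q) * log2(1.0 - q)


def f(x: float, y: float, z: float) -> float:
    """f(x, y, z) as defined in this section (x = r, y = eta, z = delta or delta / 2)."""
    return (1.0 - y + y / x) * binary_entropy(y / (y + x * (1.0 - y))) \
        - (y / x) * lee_ball_exponent(x * z / y)


def ball_volume(n: int, e: int) -> int:
    """Exact V(n, e) from Lemma 3."""
    return sum(2**i * comb(n, i) * comb(e, i) for i in range(min(n, e) + 1))


n, e = 2000, 1000  # epsilon = e / n = 0.5
empirical = log2(ball_volume(n, e)) / n
print(empirical, lee_ball_exponent(e / n))
print(f(2.0, 0.5, 0.1), f(2.0, 0.5, 0.05))
```

For these parameters the finite-$n$ value is close to $L(0.5)$, and $f$ decreases as its third argument grows, in line with the ordering of the two bounds in Theorem 4 below.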

Theorem 4.

With the above notation we have
$$f(r, \eta, \delta) \le R \le f(r, \eta, \delta/2).$$

Proof: The result follows from Theorem 2 by standard entropic estimates for binomial coefficients for the numerator and the result on large alphabet Lee balls from [9] for the denominators.

In Fig. 1 and 2, the graphs of the asymptotic lower bound curve $f(r, \eta, \delta)$ for $r = 2$ and different values of $\eta$ show how the rate $R$ depends on $\eta$.

Fig. 1. Graphs of $f(r, \eta, \delta)$ for $r = 2$ and several values of $\eta$.

Fig. 2. Graphs of $f(r, \eta, \delta)$ for $r = 2$ and several values of $\eta$.

VII. CONCLUSION AND OPEN PROBLEMS

We approached a problem of binary coding for the Levenshtein distance by using lattices for the Manhattan metric. These lattices are obtained by Construction A applied to binary and quaternary codes. Since decoding these lattices for the Manhattan metric can be reduced to decoding the constructing code for the Lee distance [6], it is worth investigating the decoding of $\mathbb{Z}_4$-codes beyond the Klemm code considered here. Another approach would be to consider $\mathbb{Z}_4$-codes with a known decoding algorithm (e.g., Preparata [11], Goethals [12], Calderbank-McGuire [18]) and look at the performance of the corresponding lattices.

More generally, it is worth considering larger alphabets $\mathbb{Z}_m$ with $m > 4$ when building lattices in higher dimensions. The Lee decoding problem for such codes is completely open. Moving away from Construction A, finding the densest lattice for the Manhattan metric in a given dimension is still a deep and fundamental open problem.

Finally, turning to the deletion channel, what allowed us to use algebraic coding techniques was our working hypothesis: the runlengths of each codeword are larger than the maximum number of deletions that can occur over the transmission period. Extending these techniques to the case where the working hypothesis does not necessarily hold is an important and challenging open problem.

VIII. ACKNOWLEDGMENTS

The authors would like to thank Jean-Claude Belfiore for helpful discussions.

REFERENCES

[1] M. Barlaud, M. Antonini, P. Solé, P. Mathieu and T. Gaidon, “A pyramidal scheme for lattice vector quantization of wavelet transform coefficients applied to image coding,” IEEE Trans. on Image Processing, …
[2] … $\mathbb{Z}_4$,” IEEE Trans. on Information Theory, IT-43 (1997), pp. 969–976.
[3] A. Bonnecaze, P. Solé and R. Calderbank, “Quaternary Quadratic Residue Codes and Unimodular Lattices,” IEEE Trans. on Information Theory, IT-41 (1995), pp. 366–377.
[4] W. Bosma and J. Cannon, Handbook of Magma Functions, Sydney, 1995.
[5] E. Berlekamp, Algebraic Coding Theory, Aegean Park Press (1984).
[6] A. Campello, G. C. Jorge and S. I. R. Costa, “Decoding q-ary lattices in the Lee metric,” http://arxiv.org/abs/1105.5557.
[7] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, Springer-Verlag, 1991.
[8] J. H. Conway and N. J. A. Sloane, “Fast quantizing and decoding algorithms for lattice quantizers and codes,” IEEE Trans. on Information Theory, IT-28(2), pp. 227–231 (1982).
[9] D. Gardy and P. Solé, “Saddle Point Techniques in Asymptotic Coding Theory,” Congrès Franco-Soviétique de codage algébrique, Paris (1991), Springer Lecture Notes in Computer Science, 573 (1991), pp. 75–81. ftp://ftp.cs.brown.edu/pub/techreports/91/cs91-29.pdf
[10] S. W. Golomb and L. R. Welch, “Perfect codes in the Lee metric and the packing of polyominoes,” SIAM J. on Applied Math, Vol. 18, No. 2 (1970), pp. 302–317.
[11] A. R. Hammons Jr., P. Vijay Kumar, A. R. Calderbank, N. J. A. Sloane and P. Solé, “The $\mathbb{Z}_4$-Linearity of Kerdock, Preparata, Goethals and Related Codes,” IEEE Trans. Information Theory, 40 (1994), pp. 301–319.
[12] T. Helleseth and P. V. Kumar, “The algebraic decoding of the Z4-linear Goethals code,” IEEE Trans. Inf. Theory, vol. 41, no. 6, Part II, pp. 2040–2048, Nov. 1995.
[13] I. P. Goulden and D. M. Jackson, Combinatorial Enumeration, Dover Books on Mathematics, 2004.
[14] V. I. Levenshtein and A. J. Han Vinck, “Perfect $(d,k)$-codes capable of correcting single peak-shifts,” IEEE Transactions on Information Theory, …
[15] … Soviet Physics Doklady, 10(8), pp. 707–710 (1966).
[16] H. Mirghasemi and A. Tchamkerten, “On the capacity of the one-bit deletion and duplication channel,” Allerton (2012).
[17] Z. Liu and M. Mitzenmacher, “Codes for deletion and insertion channels with segmented errors,” ISIT (2007), pp. 846–850.
[18] K. Ranto, “On algebraic decoding of the $\mathbb{Z}_4$-linear Goethals-like codes,” IEEE Transactions on Information Theory, …
[19] …
[20] … http://neilsloane.com/doc/dijen.pdf
[21] P. Solé, “Counting lattice points in pyramids,” Discrete Mathematics, Volume 139, Number 1, 24 May 1995, pp. 381–392.
[22] F. J. MacWilliams and N. J. A. Sloane, “The theory of error-correcting codes,”