Decoding Reed-Solomon codes by solving a bilinear system with a Gröbner basis approach
Magali Bardet∗, Rocco Mora†, and Jean-Pierre Tillich‡

LITIS, University of Rouen Normandie
Sorbonne Universités, UPMC Univ Paris 06
Inria, Team COSMIQ, 2 rue Simone Iff, CS 42112, 75589 Paris Cedex 12, France
Abstract
Decoding a Reed-Solomon code can be modeled by a bilinear system which can be solved by Gröbner basis techniques. We will show that in this particular case, these techniques are much more efficient than for generic bilinear systems with the same number of unknowns and equations (where these techniques have exponential complexity). Here we show that they are able to solve the problem in polynomial time up to the Sudan radius. Moreover, beyond this radius these techniques recover automatically polynomial identities that are at the heart of improvements of the power decoding approach for reaching the Johnson decoding radius. They also allow one to derive new polynomial identities that can be used to build new algebraic decoding algorithms for Reed-Solomon codes. We provide numerical evidence that this sometimes allows one to correct efficiently slightly more errors than the Johnson radius.
Decoding a large number of errors in Reed-Solomon codes.
A long-standing open problem in algebraic coding theory was that of decoding Reed-Solomon codes beyond the error-correction radius (1 − R)/2 (where R stands for the code rate). This problem was solved in a breakthrough paper by Sudan [Sud97], where it was shown that there exists an algebraic decoder that works up to a fraction of errors 1 − √(2R) (the so-called Sudan radius here). This was even improved later on by Guruswami and Sudan in [GS98] with a decoder that works up to the Johnson radius 1 − √R. This represents in a sense the limit for such decoders, since these decoders are list decoders that output all codewords up to this radius, and beyond this radius the list size is not guaranteed to be polynomial anymore. However, if we do not insist on having a decoder that outputs all codewords within a certain radius, or if we just want a decoder that is successful most of the time on the q-ary symmetric channel of crossover probability p, then we can still hope to have an efficient decoder beyond this bound. Moreover, it is even interesting to investigate if there are decoding algorithms of, say, subexponential complexity above the radius 1 − √R.

∗[email protected] †[email protected] ‡[email protected]

Gröbner basis approach.
Our approach to this problem is to express the decoding problem as a bilinear system and explore what Gröbner bases techniques have to tell us in this case. Indeed, consider a k-dimensional Reed-Solomon code of length n over F_q with support a = (a_i)_{1 ≤ i ≤ n} ∈ F_q^n:

RS_k(a) = { (P(a_i))_{1 ≤ i ≤ n} : P ∈ F_q[X], deg(P) < k }.

Let b = (b_i)_{1 ≤ i ≤ n} be the received word, let E be the set of positions in error, and define the error locator as usual:

Λ(X) := Π_{i ∈ E} (X − a_i).   (1)

From this, we can write the bilinear system whose unknowns are the coefficients p_i of the polynomial P(X) = Σ_{i=0}^{k−1} p_i X^i corresponding to the codeword that was sent and the coefficients λ_j of the error locator polynomial Λ(X) = X^t + Σ_{j=0}^{t−1} λ_j X^j, if we assume that there were t errors. We have n bilinear equations in the p_i's and the λ_j's coming from the n relations P(a_ℓ)Λ(a_ℓ) = b_ℓ Λ(a_ℓ), ℓ ∈ ⟦1, n⟧, namely

Σ_{i=0}^{k−1} Σ_{j=0}^{t} a_ℓ^{i+j} p_i λ_j = Σ_{j=0}^{t} b_ℓ a_ℓ^j λ_j,  ℓ ∈ ⟦1, n⟧, and λ_t = 1.   (2)

Gröbner basis techniques: a simple and automatic way for obtaining a polynomial time algorithm in our case.
Standard Gröbner bases techniques can be used to solve this system; however, solving (2) is much easier than solving a generic bilinear system. In particular, these techniques typically solve the decoding problem in polynomial time when the fraction of errors is below the Sudan radius. This is explained in Section 3.1. The reason why the Gröbner basis approach works in polynomial time is related to power decoding [SSB10, Nie14] and can be explained by similar arguments. However, the nice thing about this Gröbner basis approach is that the algorithm itself is very simple and can be given without any reference to power decoding (or the Sudan algorithm). The computation of the Gröbner basis reveals degree falls which are instrumental for its very low complexity. These degree falls can be explained by the polynomial equations used by power decoding. However, this simple algorithm also appears to be very powerful beyond the Sudan bound: experimentally it seems that it is efficient up to the Johnson radius and that it is even able to correct more errors in some cases than the refinement of the original power decoding algorithm [Nie18] (which reaches asymptotically the Johnson radius). This is demonstrated in Section 4.
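As a concrete sanity check, the bilinear system (2) can be instantiated and verified on a toy example. The following self-contained Python sketch uses an illustrative prime field F_101 and parameters n = 8, k = 3, t = 2 (all hypothetical choices made for this example, not taken from the paper):

```python
# Toy sanity check of the bilinear system (2); the field F_101 and the
# parameters n = 8, k = 3, t = 2 are illustrative choices.
p = 101                                # F_p = Z/101Z
n, k, t = 8, 3, 2
a = list(range(1, n + 1))              # support a_1, ..., a_n

def poly_mul(f, g):                    # product of coefficient lists mod p
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return h

def ev(f, x):                          # evaluate f(x) mod p
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p

msg = [5, 7, 11]                       # coefficients p_0, ..., p_{k-1}
E = [2, 5]                             # error positions (0-based)
b = [ev(msg, ai) for ai in a]          # codeword (P(a_1), ..., P(a_n))
for i in E:                            # add nonzero errors
    b[i] = (b[i] + 17) % p

lam = [1]                              # Lambda(X) = prod_{i in E} (X - a_i)
for i in E:
    lam = poly_mul(lam, [(-a[i]) % p, 1])

# every equation of (2) holds for the planted (p_i, lambda_j)
for l, al in enumerate(a):
    lhs = sum(pow(al, i + j, p) * msg[i] * lam[j]
              for i in range(k) for j in range(t + 1)) % p
    rhs = b[l] * ev(lam, al) % p       # = sum_j b_l a_l^j lambda_j
    assert lhs == rhs
print("all", n, "equations of (2) hold; lambda_t =", lam[t])
```

The asserts merely restate P(a_ℓ)Λ(a_ℓ) = b_ℓΛ(a_ℓ): on error-free positions P(a_ℓ) = b_ℓ, and on error positions Λ(a_ℓ) = 0.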
Understanding the nice behavior of the Gröbner basis approach.
Moreover, trying to understand theoretically why this algorithm behaves so well is not only explained by the polynomial relations which are at the heart of the power decoding approach; it also reveals new polynomial relations that are not exploited by the power decoding approach, as shown in Section 3. In other words, this approach not only gives an efficient algorithm, it also exploits other polynomial relations. It seems fruitful to understand and describe them; namely, this paves the road towards new algebraic decoders of Reed-Solomon codes.
Notation.
Throughout the paper we will use the following notation. The integer interval {a, a + 1, ..., b} will be denoted by ⟦a, b⟧. For a polynomial Q(X) = Σ_{i=0}^{m} q_i X^i, coeff(Q(X), X^s) stands for the coefficient q_s of X^s in Q(X). For two polynomials Q(X) and G(X), [Q(X)]_{G(X)} stands for the remainder of Q(X) divided by G(X).

2 The Algorithm
Consider an algebraic system of equations

f_1(x_1, ..., x_ℓ) = 0, ..., f_m(x_1, ..., x_ℓ) = 0   (3)

where the f_i's are polynomials in x_1, ..., x_ℓ. Such systems can be solved by Gröbner basis techniques (see [CLO15] for instance). To simplify the discussion, assume that we have a unique solution to the algebraic system (3) and that the polynomial ideal I generated by the f_i's is radical, meaning that whenever there is a polynomial f and a positive integer s such that f^s is in I, then f is in I. This seems to be the typical case for (2) when the number of errors is below the Gilbert-Varshamov bound. In such a case, the reduced Gröbner basis of the ideal I is given by the set {x_1 − r_1, ..., x_ℓ − r_ℓ} for any admissible monomial ordering, where (r_1, ..., r_ℓ) stands for the unique solution of (3) (this is standard, see for instance [Bar04, Lemma 2.4.3, p.40]). Recall here that a Gröbner basis of a polynomial ideal is defined for a given admissible monomial ordering as a generating set {g_1, ..., g_s} of the ideal such that the ideal generated by the leading monomials LM(g_i) (where LM(g) is the largest monomial in g) of the g_i's coincides with the ideal generated by all the leading monomials of the elements of I:

⟨LM(g_1), ..., LM(g_s)⟩ = ⟨LM(f) : f ∈ I⟩.

We will adopt Lazard's point of view [Laz83] to compute a Gröbner basis and use Gaussian elimination on the Macaulay matrices associated to the system. The main known and efficient algorithm for this is Faugère's F4 algorithm [Fau99]; see [CLO15] for background on the subject. We recall that the Macaulay matrix Macaulay_D(S) in degree D of a set S = {f_1, ..., f_m} of polynomials is the matrix whose columns correspond to the monomials of degree ≤ D sorted in descending order w.r.t.
a chosen monomial ordering, whose rows correspond to the polynomials t·f_i for all i, where t is a monomial of degree ≤ D − deg(f_i), and whose entry in row t·f_i and column u is the coefficient of the monomial u in the polynomial t·f_i. A Gröbner basis for the system can be computed by computing a row echelon form of Macaulay_D for large enough D [Laz83], [CLO15, chap. 10]. However, this way of solving (2) is very inefficient (unless t ≤ (n − k)/2, where directly row echelonizing (2) is enough), because during the Gaussian elimination process we have a sequence of degree falls which are instrumental for computing a Gröbner basis by staying at a very small degree (this appears clearly if we use for instance Faugère's F4 algorithm [Fau99] on (2)). A degree fall is a polynomial combination Σ_{i=1}^{m} g_i f_i of the f_i's which satisfies

0 < s := deg(Σ_{i=1}^{m} g_i f_i) < max_{i=1,...,m} deg(g_i f_i).

We then say that Σ_{i=1}^{m} g_i f_i is a degree fall of degree s. (An admissible monomial ordering is a total ordering < of the monomials such that (i) m < m′ implies mt < m′t for any monomial t, and (ii) every subset of monomials has a smallest element.) The simplest example of such a degree fall occurs in (2) when t < n − k. Here there are linear combinations of the bilinear equations of (2) giving linear equations. This can be seen by setting z_s := Σ_{i,j : i+j=s} p_i λ_j in (2) to get the system

Σ_{s=0}^{t+k−1} a_ℓ^s z_s = Σ_{j=0}^{t} b_ℓ a_ℓ^j λ_j,  ℓ ∈ ⟦1, n⟧.   (4)

In other words, by eliminating the z_s's in these equations we obtain linear equations involving only the λ_i's. When t ≤ (n − k)/2 there are enough such equations to recover from them the λ_i's, and then, by substituting for them in (2), the p_i's by solving again a linear system.
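The elimination of the z_s's described above already gives a complete (if naive) decoder when t ≤ (n − k)/2: solve the linear system (4) for (z, λ), read off Λ, and locate the errors at its roots. A minimal Python sketch, with illustrative parameters (F_101, n = 12, k = 3, t = 4 are hypothetical choices for this example):

```python
# Sketch of the "degree fall" decoder: the substitution z_s = sum p_i l_j
# turns (2) into the linear system (4).  F_101 and the parameters
# n = 12, k = 3, t = 4 (so t <= (n-k)/2) are illustrative choices.
p = 101
n, k, t = 12, 3, 4
a = list(range(1, n + 1))              # support a_1, ..., a_n

def ev(f, x):
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p

def solve_mod_p(M, rhs):
    """Gauss-Jordan elimination over F_p; assumes a unique solution."""
    rows, cols = len(M), len(M[0])
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    piv, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if A[i][c]), None)
        if pr is None:
            continue
        A[r], A[pr] = A[pr], A[r]
        inv = pow(A[r][c], p - 2, p)   # inverse via Fermat's little theorem
        A[r] = [x * inv % p for x in A[r]]
        for i in range(rows):
            if i != r and A[i][c]:
                f = A[i][c]
                A[i] = [(x - f * y) % p for x, y in zip(A[i], A[r])]
        piv.append(c)
        r += 1
    x = [0] * cols
    for i, c in enumerate(piv):
        x[c] = A[i][cols]
    return x

msg = [3, 1, 4]                        # p_0, ..., p_{k-1}
E = [1, 4, 7, 10]                      # planted error positions
b = [ev(msg, ai) for ai in a]
for i in E:
    b[i] = (b[i] + 9) % p

# unknowns: (z_0, ..., z_{t+k-1}, l_0, ..., l_{t-1}), with l_t = 1
M, rhs = [], []
for l, al in enumerate(a):
    row = [pow(al, s, p) for s in range(t + k)]              # z-part
    row += [(-b[l] * pow(al, j, p)) % p for j in range(t)]   # l-part
    M.append(row)
    rhs.append(b[l] * pow(al, t, p) % p)                     # l_t = 1 term
sol = solve_mod_p(M, rhs)

lam = sol[t + k:] + [1]                # recovered monic error locator
found = [i for i, ai in enumerate(a) if ev(lam, ai) == 0]
print("planted errors:", E, "-> recovered:", found)
```

The z_s's returned in sol[:t+k] are exactly the coefficients of P(X)Λ(X), so the p_i's could then be read off by exact polynomial division by Λ, or by solving (2) as a linear system once the λ_j's are known, as in the text.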
Of course, this is well known, and there are much more efficient algorithms for solving this system, but still it is interesting to notice that the Gröbner basis approach already yields a polynomial time algorithm for the particular bilinear system (2), despite being exponential (for a large range of parameters) for generic bilinear systems with the same number of unknowns and equations as (2) [FSS11, Spa12]. A slightly less trivial degree fall behavior is obtained when the fraction of errors is below the Sudan radius. Here, after substituting for the λ_i's which can be expressed as linear functions of the other λ_i's by using the aforementioned linear equations involving the λ_i's, we obtain new bilinear equations f′_1, ..., f′_m. It turns out that we can perform linear combinations on these f′_i's to eliminate the monomials of degree 2 in them and derive new linear equations involving only the λ_i's. This is proved in Subsection 3.1. This process can be iterated, and there are typically enough such linear equations to recover the λ_i's in this way as long as t is below or equal to the Sudan decoding radius. As explained above, this allows us to recover the right codeword by plugging the values for the λ_i's into (2) and solving the corresponding linear system in the p_i's. Note that here, and in all the paper, we are considering graded monomial orderings (a monomial of degree d is always smaller than a monomial of degree d′ > d). Throughout this paper, we use the notion of affine D-Gröbner basis, which is the truncated Gröbner basis obtained by ignoring computations in degree greater than D. It is well known that there exists a D such that a D-Gröbner basis is indeed a Gröbner basis. We describe here Algorithm 1, which computes a D-Gröbner basis of a given system through linear algebra. It is less efficient than standard algorithms but has the merit of being simple and showing what is computed during such algorithms. It is also of polynomial time complexity when D is fixed.
It uses the function Pol(M) that returns the polynomials represented by the rows of a Macaulay matrix M.

Algorithm 1 D-Gröbner Basis
Input: D the maximal degree, S = {f_1, ..., f_m} a set of polynomials.
repeat
    S ← Pol(EchelonForm(Macaulay_D(S)))
until dim_{F_q} ⟨S⟩ has not increased.
Output: S.

It is clear that Algorithm 1 terminates and has a polynomial complexity if D is fixed. The previous remarks show that it is possible to decode up to the Sudan decoding radius with D = 2. However, when the number of errors becomes bigger, D = 2 is not enough to exhibit more degree falls. We have to go to a higher degree. It turns out that already taking D = 3 yields very interesting degree falls that are instrumental to the generalization of the power decoding approach of [Nie18] decoding up to the Johnson radius. The efficiency of Algorithm 1 is already demonstrated by the fact that choosing D = 2 in it corrects in polynomial time as many errors as Sudan's algorithm. Roughly speaking, the choice of D = 2 in Algorithm 1 means that we just keep the equations of degree 2 and try to produce new linear equations by linear combinations of the equations of degree 2 aiming at eliminating the degree 2 monomials. The reason why this algorithm is so efficient is related to power decoding [SSB10]: the algorithm finds automatically the linear equations exploited by the power decoding approach. In order to explain the effectiveness of the Gröbner basis approach, it is convenient to bring in an equivalent algebraic system, which is basically the key equation implicit in Gao's decoder [Gao03] (and the one used in the power decoding approach), namely the following polynomial equation:

P(X)Λ(X) ≡ R(X)Λ(X) mod G(X)   (5)

where R(X) is the polynomial of degree ≤ n − 1 such that R(a_ℓ) = b_ℓ, ℓ ∈ ⟦1, n⟧, and G(X) := Π_{ℓ=1}^{n} (X − a_ℓ). Note that these two polynomials are immediately computable by the receiver (and G can be precomputed). By using the same unknowns as in (2), namely the coefficients of P(X) and Λ(X), we obtain a bilinear system with n equations. It is readily seen that

Proposition 1.
The bilinear systems (2) and (5) are equivalent: (5) can be obtained from linear combinations of (2) and vice versa.
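This equivalence is easy to check numerically on a toy instance: build R by Lagrange interpolation of the received word, G as the product of the (X − a_ℓ)'s, and verify that P(X)Λ(X) − R(X)Λ(X) reduces to 0 modulo G(X). A Python sketch with illustrative parameters (F_101, n = 8, k = 3, t = 2 are hypothetical choices):

```python
# Numeric check of the key equation (5): [P*Lam - R*Lam]_G = 0, where R
# interpolates the received word and G = prod (X - a_l).  F_101 and
# n = 8, k = 3, t = 2 are illustrative choices.
p = 101
n, k, t = 8, 3, 2
a = list(range(1, n + 1))

def pmul(f, g):                        # polynomial product mod p
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return h

def padd(f, g):                        # polynomial sum mod p
    m = max(len(f), len(g))
    f, g = f + [0] * (m - len(f)), g + [0] * (m - len(g))
    return [(x + y) % p for x, y in zip(f, g)]

def psub(f, g):
    return padd(f, [(-x) % p for x in g])

def pmod(f, g):                        # remainder [f]_g for monic g
    f = f[:]
    while len(f) >= len(g) and any(f):
        c, d = f[-1], len(f) - len(g)
        for i, gi in enumerate(g):
            f[i + d] = (f[i + d] - c * gi) % p
        while f and f[-1] == 0:
            f.pop()
    return f or [0]

def ev(f, x):
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p

def interpolate(xs, ys):               # Lagrange interpolation over F_p
    poly = [0]
    for l, (xl, yl) in enumerate(zip(xs, ys)):
        num, den = [1], 1
        for j, xj in enumerate(xs):
            if j != l:
                num = pmul(num, [(-xj) % p, 1])
                den = den * (xl - xj) % p
        c = yl * pow(den, p - 2, p) % p
        poly = padd(poly, [c * v % p for v in num])
    return poly

msg = [5, 7, 11]                       # p_0, ..., p_{k-1}
E = [2, 5]                             # error positions (0-based)
b = [ev(msg, ai) for ai in a]
for i in E:
    b[i] = (b[i] + 17) % p

R = interpolate(a, b)                  # R(a_l) = b_l, deg R <= n-1
G = [1]
for ai in a:
    G = pmul(G, [(-ai) % p, 1])        # G = prod (X - a_l), monic

lam = [1]
for i in E:
    lam = pmul(lam, [(-a[i]) % p, 1])  # error locator Lambda

ok = pmod(pmul(psub(msg, R), lam), G) == [0]
print("key equation (5) holds:", ok)
```

The check works because (P − R)(a_ℓ) vanishes off the error positions while Λ(a_ℓ) vanishes on them, so (P − R)Λ vanishes on the whole support and is divisible by G.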
This follows on the spot from
Fact 2.
For any polynomial Q(X) ∈ F_q[X] of degree < n, the coefficients of Q can be expressed as linear combinations of Q(a_1), ..., Q(a_n).

This fact is just a consequence of the observation that Q coincides with its interpolation polynomial on the points (a_ℓ, Q(a_ℓ)), and that this interpolation polynomial is given by

Q(X) = Σ_{ℓ=1}^{n} Q(a_ℓ) · Π_{j≠ℓ}(X − a_j) / Π_{j≠ℓ}(a_ℓ − a_j).

To understand now why (5) can be derived from (2), we just bring in

Q(X) := P(X)Λ(X) − R(X)Λ(X),  S(X) := Q(X) mod G(X).

Then:
• (2) amounts to writing Q(a_ℓ) = 0 for ℓ in ⟦1, n⟧ and to expressing the Q(a_ℓ)'s as quadratic forms in the λ_i's and the p_j's.
• Since Q(a_ℓ) = S(a_ℓ) for all ℓ in ⟦1, n⟧ and since S is of degree < n, we can use the previous fact and express its coefficients linearly in terms of the S(a_ℓ) = Q(a_ℓ)'s.
• Since (5) is nothing but expressing that the coefficients of S(X) are all equal to 0, we obtain that the equations of (5) can be obtained from linear combinations of the equations of (2).

Conversely, since S(a_ℓ) can be written as a linear combination of the coefficients of S(X), the quadratic equations in the λ_i's and the p_i's obtained by writing S(a_ℓ) = 0 are linear combinations of the quadratic equations given by (5). These equations S(a_ℓ) = 0 coincide with the equations in (2), since Q(a_ℓ) = S(a_ℓ) for all ℓ in ⟦1, n⟧.

The point of (5) is that
• These equations are more convenient to work with to understand what is going on algebraically during the Gröbner basis calculations of Algorithm 1.
• They give directly n − k − t + 1 linear equations, since (i) the coefficient of S(X) of degree d ∈ ⟦t + k, n − 1⟧ coincides with the coefficient of the same degree in −R(X)Λ(X) mod G(X), since Λ(X)P(X) is of degree ≤ t + k − 1; (ii) the coefficient of S(X) of degree t + k − 1 is p_{k−1} − coeff([Λ(X)R(X)]_{G(X)}, X^{t+k−1}), because Λ(X) is monic and of degree t.

With this at hand we can now prove:

Proposition 3.
Let q_1 := max{u : t + u(k − 1) ≤ n − 1} = ⌊(n − 1 − t)/(k − 1)⌋. All affine functions in the λ_i's of the form coeff([Λ(X)R^j(X)]_{G(X)}, X^u) for j ∈ ⟦1, q_1⟧ and u ∈ ⟦t + (k − 1)j + 1, n − 1⟧ are in the linear span of the set S output by Algorithm 1 when D = 2.

Remark 4.
The fact that these are indeed affine functions follows on the spot from generalizing the degree considerations above: Λ(X)P(X)^j is of degree ≤ t + (k − 1)j.

Proof. Notice that from the equivalence we have just proved, the space generated by S contains initially (and therefore all the time) the space of affine functions in the λ_i's generated by

coeff([−Λ(X)R(X)]_{G(X)}, X^u) = coeff([Λ(X)P(X) − Λ(X)R(X)]_{G(X)}, X^u),

for all u ∈ ⟦t + k, n − 1⟧. Now proceed by induction on j, and assume that at some point the space generated by S contains the linear span of the affine functions

coeff([−Λ(X)R^j(X)]_{G(X)}, X^u) = coeff([Λ(X)P(X)^j − Λ(X)R(X)^j]_{G(X)}, X^u),

for all u ∈ ⟦t + (k − 1)j + 1, n − 1⟧, where j is some integer in the interval ⟦1, q_1 − 1⟧. Note that

(ΛP^{j+1} − ΛR^{j+1}) mod G   (6)
 = (P(ΛP^j − ΛR^j) + R^j(ΛP − ΛR)) mod G
 = (P(ΛP^j − ΛR^j mod G) + R^j(ΛP − ΛR mod G)) mod G.   (7)

We use the equality between the polynomials (6) and (7) to claim that their coefficients coincide for all the degrees in ⟦t + (k − 1)(j + 1) + 1, n − 1⟧. Note now that after the elimination of variables performed so far, all coefficients of degree in ⟦t + (k − 1)j + 1, n − 1⟧ in ΛP^j − ΛR^j mod G vanish, since they were affine functions by the induction hypothesis and become 0 after the variable elimination step. This implies that ΛP^j − ΛR^j mod G becomes a polynomial of degree ≤ t + (k − 1)j after elimination of variables. Therefore P(ΛP^j − ΛR^j mod G) is a polynomial of degree ≤ t + (k − 1)(j + 1). From the equality of the polynomials (6) and (7), this implies that the coefficient of degree u in (ΛP^{j+1} − ΛR^{j+1}) mod G coincides with the coefficient of the same degree in (R^j(ΛP − ΛR mod G)) mod G for u in ⟦t + (k − 1)(j + 1) + 1, n − 1⟧.
We observe now that the last coefficient is nothing but a linear combination of the coefficients of ΛP − ΛR mod G, which are precisely the initial polynomial equations. Since the polynomial (ΛP^{j+1} − ΛR^{j+1}) mod G has all its coefficients that are affine functions in the λ_i's by Remark 4, for all the degrees u ∈ ⟦t + (k − 1)(j + 1) + 1, n − 1⟧ we obtain that after the Gaussian elimination step, S contains the space generated by these aforementioned affine functions. This proves the proposition by induction on j.

These linear equations that we produce coincide exactly with the linear equations produced by the power decoding approach [SSB10], and this allows us to correct as many errors as the power decoding approach based on the same assumption, namely that they are all independent, which is actually the typical scenario. However, contrarily to power decoding, which is bound to make such an assumption to work, the Gröbner basis approach is more versatile, as it allows us to decode even without this assumption, as explained in Section 4.

The power decoding approach of [SSB10] was generalized in [Nie18] to decode up to the Johnson radius by bringing in the "error evaluator" polynomial Ω(X) of degree ≤ t − 1 satisfying

Ω(a_i) = −e_i Λ′(a_i)/G′(a_i), for all i ∈ ⟦1, n⟧ for which e_i ≠ 0,   (8)

where e_i is the error value at position i. In other words, it is the interpolation polynomial defined by (8) for all i that are in error. This approach then crucially relies on the following identity ([Nie18, Lemma 2.1]):

Λ(P − R) = Ω G.   (9)

The generalization of power decoding then uses this identity to derive further identities that are summarized by the following formulas (this is Theorem 3.1 in [Nie18]): for any positive integers s and v such that s ≤ v we have

Λ^s P^u = Σ_{i=0}^{u} (Λ^{s−i} Ω^i) (u choose i) R^{u−i} G^i,  u ∈ ⟦1, s − 1⟧,   (10)
Λ^s P^u ≡ Σ_{i=0}^{s−1} (Λ^{s−i} Ω^i) (u choose i) R^{u−i} G^i mod G^s,  u ∈ ⟦s, v⟧.
(11)

The approach of [Nie18] relies on the fact that when the number of errors is below the Johnson radius, there is a choice of s and v such that the total number of coefficients of the polynomials Λ^s, Λ^{s−1}Ω, ..., Ω^s as well as Λ^s P, ..., Λ^s P^v is less than or equal to the number of equations linking these coefficients coming from (10) and (11). In this case (and if these equations are independent) we recover them by solving the corresponding linear system. Notice that with this strategy, there is for a given value s a maximal value for u given by

q_s := max{u : st + u(k − 1) ≤ sn − 1}.

It is readily seen that taking larger values of u only increases the number of variables in the linear system without being able to make it determinate if it was not determinate before. Interestingly enough, our Gröbner basis approach also exhibits degree falls of degree s that are related to the equations (10) and (11). This can be understood by using an equivalent definition of Ω which is better suited to understand our Gröbner basis approach. We namely define Ω as

Ω := −ΛR ÷ G,   (12)

where ÷ stands for the quotient of the Euclidean division. Notice that from this definition we directly derive two results:
1. The coefficients of Ω are affine functions of the λ_i's.
2. As long as t ≤ n − k, (9) follows from (12) and (5). Indeed, [ΛR]_G = ΛP: this follows from (5) and t + k − 1 ≤ n − 1, which give ΛP = ΛR mod G. This and (12) then imply that ΛR = −ΩG + ΛP, which is obviously equivalent to (9).

Note from these considerations that if we equate the coefficients of the polynomials in (10) for all the degrees in ⟦st + u(k − 1) + 1, st + u(n − 1)⟧ and in (11) for all the degrees in ⟦st + u(k − 1) + 1, s(n − 1)⟧, the coefficient of the left-hand term vanishes and the coefficient in the right-hand term is a polynomial of degree s in the λ_i's (this follows from the fact that the coefficients of Ω are affine functions in those λ_i's). This gives polynomial equations in the λ_i's of degree s. In a sense, they can be viewed as generalizations at degree s of the linear equations that were mentioned when Algorithm 1 is applied with D = 2. These equations are actually produced as degree falls that are in the linear span of the intermediate sets S produced in Algorithm 1 when D = s + 1. There are also other equations of degree s produced by Algorithm 1 in such a case. To explain this point, it makes sense to bring in notation for the right-hand term in (10) and (11). Let us define

χ(s, u) := Σ_{i=0}^{u} (u choose i) Λ^{s−i} R^{u−i} Ω^i G^i = Λ^{s−u} (ΛR + ΩG)^u   if u < s,
χ(s, u) := [Σ_{i=0}^{s−1} (u choose i) Λ^{s−i} R^{u−i} Ω^i G^i]_{G^s}   if u ≥ s.

We also let χ(s, u)^H be the polynomial where we dropped all the terms of degree ≤ ts + u(k − 1) of χ(s, u), i.e. if χ(s, u) = Σ_i a_i X^i, then χ(s, u)^H = Σ_{i > ts + u(k−1)} a_i X^i.

Theorem 5.
Let I_D = ⟨S⟩_{F_q}, where S is the set output by Algorithm 1 in degree D. We have, for all nonnegative integers s, s′, u ≤ q_s, u′ ≤ q_{s′}:

χ(s, u)^H ∈_coef I_{s+1},   (13)
χ(s, u) χ(s′, u′) − χ(s + s′, u + u′) ∈_coef I_{s+s′+1},   (14)

where P ∈_coef I_v (where P is a polynomial with coefficients that are polynomials in the λ_i's and the p_i's) means that all the coefficients of P belong to I_v.

It is not surprising that χ(s, u) χ(s′, u′) − χ(s + s′, u + u′) belongs to the ideal generated by the polynomial equations, since these relations basically come from the identity Λ^s P^u Λ^{s′} P^{u′} = Λ^{s+s′} P^{u+u′}. What is somehow surprising is that these equations are actually discovered at a rather small degree of the Gröbner basis computation (namely by staying at degree s + s′ + 1). Moreover, these equations only involve the λ_i's. By inspection of the behavior of the Gröbner basis computation, it seems that the linear equations that we produce later on are first produced by degree falls only involving these equations of degree s. It is therefore tempting to change the Gröbner basis decoding procedure strategy: instead of feeding Algorithm 1 with the initial system (2) or (5), we run it with the equations of degree s given by Theorem 5. Once we have recovered the λ_i's in this way, we recover the p_i's by solving a linear system as explained earlier. How this strategy behaves on non-trivial examples is explained in the next section. This result is proved in the following subsection.

It will be convenient here to notice that χ(s, s) has a slightly simpler expression which avoids the reduction modulo G^s.

Lemma 6. χ(s, s) = (ΛR + ΩG)^s.

Proof. χ(s, s) is defined as

χ(s, s) := [Σ_{i=0}^{s−1} (s choose i) Λ^{s−i} R^{s−i} Ω^i G^i]_{G^s}
 = [Σ_{i=0}^{s} (s choose i) Λ^{s−i} R^{s−i} Ω^i G^i]_{G^s}
 = [(ΛR + ΩG)^s]_{G^s}
 = (ΛR + ΩG)^s,

where the second equality holds because the term i = s is divisible by G^s, and the last one because ΛR + ΩG = [ΛR]_G has degree < n, so that (ΛR + ΩG)^s has degree < sn.

It will also be helpful to observe that χ(s, u) and χ(s, u + 1) are related by the following identity.

Lemma 7.
χ(s, u)P − χ(s, u + 1) = Λ^{s−u−1} (ΛR + ΩG)^u (ΛP − ΛR − ΩG)   for u ∈ ⟦0, s − 1⟧,

[χ(s, u)P − χ(s, u + 1)]_{G^s} = [(ΛP − ΛR − ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i]_{G^s}   for u ∈ ⟦s, q_s − 1⟧.

Proof. For u ∈ ⟦0, s − 1⟧ we have (using Lemma 6 for the case u = s − 1 to express χ(s, u + 1)):

χ(s, u)P − χ(s, u + 1) = Λ^{s−u} P (ΛR + ΩG)^u − Λ^{s−u−1} (ΛR + ΩG)^{u+1}
 = Λ^{s−u−1} (ΛR + ΩG)^u (ΛP − ΛR − ΩG).

For u ∈ ⟦s, q_s − 1⟧ we observe that

[χ(s, u)P]_{G^s} = [P Σ_{i=0}^{s−1} (u choose i) Λ^{s−i} R^{u−i} Ω^i G^i]_{G^s} = [ΛP Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i]_{G^s}   (15)

and

(ΛR + ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i
 = Σ_{i=0}^{s−1} (u choose i) Λ^{s−i} R^{u+1−i} Ω^i G^i + Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^{i+1} G^{i+1}
 = Λ^s R^{u+1} + (u choose s−1) R^{u−s+1} Ω^s G^s + Σ_{i=1}^{s−1} ((u choose i) + (u choose i−1)) Λ^{s−i} R^{u+1−i} Ω^i G^i
 = (u choose s−1) R^{u−s+1} Ω^s G^s + Σ_{i=0}^{s−1} (u+1 choose i) Λ^{s−i} R^{u+1−i} Ω^i G^i.

This implies

χ(s, u + 1) = [(ΛR + ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i]_{G^s}.   (16)

The second equation of the lemma follows directly from (15) and (16).

A last lemma will be helpful now.

Lemma 8.
For all nonnegative integers s and u < q_s:

χ(s, u)P − χ(s, u + 1) ∈_coef I_{s+1},   (17)
χ(s, u + 1)^H ∈_coef I_{s+1}.   (18)

Proof.
We will prove this lemma by induction on u. For u ≤ s − 1, Lemma 7 gives

χ(s, u)P − χ(s, u + 1) = Λ^{s−u−1} (ΛR + ΩG)^u (ΛP − ΛR − ΩG)   (19)
 ∈_coef I_{s+1}.

The last point follows from the fact that (19) implies that the coefficients of χ(s, u)P − χ(s, u + 1) are clearly in the space spanned by S once we multiply the original f_i's (i.e. the coefficients of ΛP − ΛR − ΩG) by all monomials of degree ≤ s − 1, because the coefficients of Λ^{s−u−1}(ΛR + ΩG)^u are polynomials of degree ≤ s − 1 in the λ_i's. This also implies that χ(s, u + 1)^H ∈_coef I_{s+1}, since deg(χ(s, u)P) ≤ ts + (u + 1)(k − 1) once the equations are taken into account.

Assume now that χ(s, u − 1)P − χ(s, u) ∈_coef I_{s+1} and χ(s, u)^H ∈_coef I_{s+1}, for some s ≤ u < q_s. From Lemma 7 we know that

[χ(s, u)P − χ(s, u + 1)]_{G^s} = [(ΛP − ΛR − ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i]_{G^s}.

Then [χ(s, u)P − χ(s, u + 1)]_{G^s} ∈_coef I_{s+1}, since clearly (i)

(ΛP − ΛR − ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i ∈_coef I_{s+1},

and (ii) the coefficients of the reduction modulo G^s are linear combinations of the coefficients of (ΛP − ΛR − ΩG) Σ_{i=0}^{s−1} (u choose i) Λ^{s−1−i} R^{u−i} Ω^i G^i. By the induction hypothesis χ(s, u)^H ∈_coef I_{s+1}, and such coefficients have degree s; then the coefficients corresponding to degrees > ts + (u + 1)(k − 1) of χ(s, u)P belong to I_{s+1} too. Since [χ(s, u)P − χ(s, u + 1)]_{G^s} = [χ(s, u)P]_{G^s} − χ(s, u + 1), it follows that χ(s, u)P − χ(s, u + 1) ∈_coef I_{s+1}. Thus, we also have χ(s, u + 1)^H ∈_coef I_{s+1}.

We are ready now to prove Theorem 5.

Proof of Theorem 5.
We proceed by induction on u_1 and u_2. We first observe that we trivially have

χ(s_1, 0) χ(s_2, 0) − χ(s_1 + s_2, 0) ∈_coef I_{s_1+s_2+1},

since

χ(s_1, 0) χ(s_2, 0) − χ(s_1 + s_2, 0) = Λ^{s_1} Λ^{s_2} − Λ^{s_1+s_2} = 0.

Now assume that we have

χ(s_1, u_1) χ(s_2, u_2) − χ(s_1 + s_2, u_1 + u_2) ∈_coef I_{s_1+s_2+1},

for some positive integers s_1 and s_2 and non-negative integers u_1 < q_{s_1} and u_2 ≤ q_{s_2}. Since χ(s_1, u_1) χ(s_2, u_2) and χ(s_1 + s_2, u_1 + u_2) are polynomials where all coefficients are polynomials in the λ_i's of degree ≤ s_1 + s_2, we also have

P (χ(s_1, u_1) χ(s_2, u_2) − χ(s_1 + s_2, u_1 + u_2)) ∈_coef I_{s_1+s_2+1}.   (20)

By Lemma 8 we know that P χ(s_1, u_1) − χ(s_1, u_1 + 1) ∈_coef I_{s_1+1}. This implies

P χ(s_1, u_1) χ(s_2, u_2) − χ(s_1, u_1 + 1) χ(s_2, u_2) ∈_coef I_{s_1+s_2+1}.   (21)

On the other hand, still by Lemma 8, we have

P χ(s_1 + s_2, u_1 + u_2) − χ(s_1 + s_2, u_1 + u_2 + 1) ∈_coef I_{s_1+s_2+1}.   (22)

From (21) and (22) we derive that

−P χ(s_1, u_1) χ(s_2, u_2) + χ(s_1, u_1 + 1) χ(s_2, u_2) + P χ(s_1 + s_2, u_1 + u_2) − χ(s_1 + s_2, u_1 + u_2 + 1) ∈_coef I_{s_1+s_2+1}.   (23)

(23) and (20) imply that

χ(s_1, u_1 + 1) χ(s_2, u_2) − χ(s_1 + s_2, u_1 + u_2 + 1) ∈_coef I_{s_1+s_2+1}.

This proves the theorem by induction (the induction on u_2 follows directly from the fact that we can exchange the roles of u_1 and u_2).

4 Experimental Results
In this section, we compare the behavior of a D-Gröbner basis computation on the bilinear system (5) with that on a system involving equations in the λ_j's only. We give examples where Johnson's bound is attained and passed. The systems in the λ_j's we use contain the equations χ(s, u)^H and some relations χ(s, u) χ(s′, u′) − χ(s + s′, u + u′). Experimentally, the latter are linearly dependent on χ(s + s′ − 1, u + u′) χ(1, 0) − χ(s + s′, u + u′) and χ(s, q_s)^H. Moreover, χ(s − 1, u) χ(1, 0) mod G^{s−1} = χ(s, u) mod G^{s−1}, so we will consider the equations M_{s,u} defined by

(χ(s − 1, u) χ(1, 0) − χ(s, u)) ÷ G^{s−1}.   (M_{s,u})

We do not add equations that are polynomially dependent on χ(s, q_s)^H or M_{s+1,q_s} at degree at most D, and thus unnecessary for the computation. Tables 1, 2 and 3 show results for [n, k]_q taking the values [64, 27], [256, 63] and [37, 5]. The column λ_j indicates the number of remaining λ_j's after elimination of the linear ones coming from the χ(1, ∗)^H relations. The column "Eq." indicates the equations used. The column "D" indicates the maximal degree used in the computation. We do our experiments using the GroebnerBasis(S,D) function of the computer algebra system magma v2.25-6. The practical complexity C is given by the magma function ClockCycles; for reference, on our machine with an Intel® Xeon® processor, about 2^31 clock cycles are done in 1 second, 2^37 in 1 minute and 2^43 in 1 hour. "Max Matrix" indicates the size of the largest matrix during the process. The complexities include the computation of the equations χ(i, j)^H and M_{i,j}, which could be improved. For systems where the number of remaining λ_j's is small compared to the number of p_i's, e.g. in Table 1 or Table 2, it is clearly interesting to compute a Gröbner basis for a system containing only polynomials in the λ_j's: even if the maximal degree D is larger than for the bilinear system, the number of variables is much smaller and the computation is faster. For instance, for [n, k]_q = [64, 27] in Table 1, at the Johnson bound t = 23 the Gröbner basis for the bilinear system requires more than 6 hours of computation and 47 GB of memory, whereas the computation in the λ_j's only takes less than a second. For t = 24 we couldn't solve the bilinear system directly, whereas the system in the λ_j's only is solved in less than a minute. Table 2 gives an example where the number of λ_j variables is quite large, but still smaller than the number of p_i's.
The benefit of using equations in the λ_j's only is clear. On the contrary, Table 3 shows that for a small value of k compared to the number of λ_j's, the maximal degree for the bilinear system is smaller than the one for a system involving only the λ_j's, but the total number of variables is almost the same; hence it is more interesting to solve the bilinear system directly. Moreover, here computing the M_{i,j} equations (that are equations in the λ_j's of degree i) takes time. Note that, for t ≥ 26, we may have several solutions: the Gröbner basis computation performs a list decoding and returns all the solutions.
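For readers without access to magma, the whole pipeline of Algorithm 1 (Macaulay matrix construction, echelon form, iteration until the dimension stabilizes) can be prototyped in pure Python. The sketch below runs it with D = 2 on a toy instance of system (2); all choices (F_101, n = 10, k = 2, t = 4, so that t ≤ (n − k)/2 and D = 2 suffices) are illustrative and in no way representative of the magma setup or timings reported above:

```python
# Pure-Python prototype of Algorithm 1 with D = 2 on a toy instance of
# system (2).  All choices (F_101, n = 10, k = 2, t = 4) are
# illustrative; this is a didactic sketch, not the magma setup above.
from itertools import combinations_with_replacement

p = 101
n, k, t = 10, 2, 4                     # t <= (n-k)/2, so D = 2 suffices
D = 2
a = list(range(1, n + 1))
nv = k + t                             # vars: p_0..p_{k-1}, l_0..l_{t-1}

def ev(f, x):
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p

msg = [3, 7]
E = [0, 3, 5, 8]
b = [ev(msg, ai) for ai in a]
for i in E:
    b[i] = (b[i] + 5) % p

def system():                          # equations (2), with l_t = 1 folded in
    eqs = []
    for l, al in enumerate(a):
        f = {}
        for i in range(k):
            for j in range(t):         # bilinear part a^(i+j) p_i l_j
                m = (i, k + j)
                f[m] = (f.get(m, 0) + pow(al, i + j, p)) % p
            f[(i,)] = pow(al, i + t, p)               # l_t = 1 part
        for j in range(t):
            f[(k + j,)] = (f.get((k + j,), 0) - b[l] * pow(al, j, p)) % p
        f[()] = (-b[l] * pow(al, t, p)) % p
        eqs.append(f)
    return eqs

def macaulay_rows(S):                  # rows u*f with deg(u) <= D - deg(f)
    rows = []
    for f in S:
        d = max(len(m) for m in f)
        for r in range(D - d + 1):
            for u in combinations_with_replacement(range(nv), r):
                rows.append({tuple(sorted(m + u)): c for m, c in f.items()})
    return rows

def echelon(rows):                     # reduced row echelon form over F_p
    mons = sorted({m for f in rows for m in f},
                  key=lambda m: (len(m), m), reverse=True)
    A = [[f.get(m, 0) for m in mons] for f in rows]
    r = 0
    for c in range(len(mons)):
        pr = next((i for i in range(r, len(A)) if A[i][c]), None)
        if pr is None:
            continue
        A[r], A[pr] = A[pr], A[r]
        inv = pow(A[r][c], p - 2, p)
        A[r] = [x * inv % p for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c]:
                fac = A[i][c]
                A[i] = [(x - fac * y) % p for x, y in zip(A[i], A[r])]
        r += 1
    return [{m: c for m, c in zip(mons, row) if c} for row in A[:r]]

S, rank = system(), 0
while True:                            # main loop of Algorithm 1
    S = echelon(macaulay_rows(S))
    if len(S) == rank:                 # dim <S> has not increased
        break
    rank = len(S)

sol = {}                               # read rows of the form x_v - value
for f in S:
    lead = max(f, key=lambda m: (len(m), m))
    if len(lead) == 1 and all(len(m) == 0 for m in f if m != lead):
        sol[lead[0]] = (-f.get((), 0)) % p

lam = [sol[k + j] for j in range(t)] + [1]
found = [i for i, ai in enumerate(a) if ev(lam, ai) == 0]
print("message:", [sol[i] for i in range(k)], "errors at:", found)
```

The degree falls are visible in the very first echelon pass: linear combinations of the bilinear rows cancel all degree-2 monomials and leave linear equations in the λ_j's, exactly as described in Section 2.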
This paper demonstrates that using a standard Gröbner basis computation on the bilinear system (5) for decoding a Reed-Solomon code is of polynomial complexity below Sudan's radius. The Gröbner basis computation reveals polynomial equations of small degree involving only the λ_j's.

Table 1: Experimental results for an [n, k]_q = [64, 27] RS-code. System (5) contains 26 variables p_i. Johnson's bound is t = 23.

t | λ_j | Eq. | D | Max Matrix | C
[Rows for t = 19, ..., 24: for each t, one row for the bilinear system (5) and one for the system in the λ_j's only, built from χ(2,·)^H, χ(3,·)^H and M_{i,j} equations; the numerical entries did not survive extraction.]
Table 2: Experimental results for an [n, k]_q = [256, 63] RS-code. System (5) contains 62 variables p_i. Johnson's bound is t = 130.

t | λ_j | Eq. | D | Max Matrix | C
[Rows for t = 120, ..., 124: for each t, one row for the bilinear system (5) and one for the system in the λ_j's only, built from χ(2,·)^H and M_{i,j} equations; the numerical entries did not survive extraction.]
Table 3: Experimental results for an [n, k]_q = [37, 5] RS-code. System (5) contains 4 variables p_i. Johnson's bound is t = 24, Gilbert-Varshamov's bound is t = 28.

t | λ_j | Eq. | D | Max Matrix | C
[Rows for t = 24, 25, 26: for each t, one row for the bilinear system (5) and one for the system in the λ_j's only, built from χ(2,·)^H, χ(3,·)^H and M_{i,j} equations; the numerical entries did not survive extraction.]