Efficiently decoding Reed-Muller codes from random errors
Ramprasad Saptharishi∗    Amir Shpilka†    Ben Lee Volk∗

Abstract
Reed-Muller codes encode an m-variate polynomial of degree at most r by evaluating it on all points in {0, 1}^m. We denote this code by RM(m, r). The minimum distance of RM(m, r) is 2^{m−r} and so it cannot correct more than half that number of errors in the worst case. For random errors one may hope for a better result.

In this work we give an efficient algorithm (in the block length n = 2^m) for decoding random errors in Reed-Muller codes far beyond the minimum distance. Specifically, for low rate codes (of degree r = o(√m)) we can correct a random set of (1/2 − o(1))n errors with high probability. For high rate codes (of degree m − r for r = o(√(m / log m))), we can correct roughly m^{r/2} errors.

More generally, for any integer r, our algorithm can correct any error pattern in RM(m, m − (2r + 2)) for which the same erasure pattern can be corrected in RM(m, m − (r + 1)). The results above are obtained by applying recent results of Abbe, Shpilka and Wigderson (STOC, 2015), Kumar and Pfister (2015) and Kudekar et al. (2015) regarding the ability of Reed-Muller codes to correct random erasures.

The algorithm is based on solving a carefully defined set of linear equations, and thus it is significantly different from other algorithms for decoding Reed-Muller codes, which are based on the recursive structure of the code. It can be seen as a more explicit proof of a result of Abbe et al. that shows a reduction from correcting erasures to correcting errors, and it also bears some similarities with the famous Berlekamp-Welch algorithm for decoding Reed-Solomon codes.

∗Department of Computer Science, Tel Aviv University, Tel Aviv, Israel. E-mails: [email protected], [email protected]. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 257575.
†Department of Computer Science, Tel Aviv University, Tel Aviv, Israel. E-mail: [email protected]. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 257575, and from the Israel Science Foundation (grant number 339/10).
Introduction
Consider the following challenge: Given the truth table of a polynomial f(x) ∈ F_2[x_1, ..., x_m] of degree at most r, in which a 1/2 − o(1) fraction of the locations were flipped (that is, given the evaluations of f over F_2^m with nearly half the entries corrupted), recover f efficiently.

If the errors are adversarial, then clearly this task is impossible for any degree bound r ≥ 2, since there are two different quadratic polynomials that disagree on only a 1/4 fraction of the domain. Hence, we turn to considering random sets of errors of size (1/2 − o(1))2^m, and we hope to recover f with high probability. (In this case, one may also consider the setting where each bit is independently flipped with probability 1/2 − o(1); by standard Chernoff bounds, both settings are almost equivalent.)

Even in the random model, if every bit is flipped with probability exactly 1/2, the situation is again hopeless: in this case the input is completely random and carries no information whatsoever about the original polynomial.

It turns out, however, that even a very small relaxation leads to a dramatic improvement in our ability to recover the hidden polynomial: in this paper we prove, among other results, that even at corruption rate 1/2 − o(1) and degree bound as large as o(√m), we can efficiently recover the unique polynomial f whose evaluations were corrupted. Note that in the worst case, given a polynomial of such a high degree, an adversary can flip a tiny fraction of the bits, just slightly more than 1/2^{√m}, and prevent unique recovery of f, even if we do not require an efficient solution; and yet, in the average case, we can deal with flipping almost half the bits.

Recasting the playful scenario above in more traditional terminology, this paper deals with similar questions related to the recovery of low-degree multivariate polynomials from their randomly corrupted evaluations on F_2^m; or, in the language of coding theory, we study the problem of decoding Reed-Muller codes under random errors in the binary symmetric channel (BSC). We turn to some background and motivation.
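The corruption model in this challenge is easy to simulate. The following sketch (our own illustration; all function names are ours, not from the paper) builds the truth table of a random polynomial of degree at most r over F_2 and flips a random set of (1/2 − ε) · 2^m of its entries:

```python
import itertools
import random

def truth_table(m, coeffs):
    """Evaluate a multilinear polynomial over F_2 on all of {0,1}^m.
    coeffs is a set of monomials, each a tuple of variable indices."""
    table = []
    for point in itertools.product([0, 1], repeat=m):
        val = 0
        for mono in coeffs:
            val ^= all(point[i] for i in mono)  # product of the chosen variables
        table.append(int(val))
    return table

def random_low_degree_poly(m, r):
    """A random polynomial of degree at most r: each monomial on at most
    r variables is included independently with probability 1/2."""
    monomials = [mono for d in range(r + 1)
                 for mono in itertools.combinations(range(m), d)]
    return {mono for mono in monomials if random.random() < 0.5}

def corrupt(table, eps):
    """Flip a random set of (1/2 - eps) * 2^m positions of the truth table."""
    n = len(table)
    flips = random.sample(range(n), int((0.5 - eps) * n))
    corrupted = table[:]
    for i in flips:
        corrupted[i] ^= 1
    return corrupted

m, r, eps = 8, 2, 0.1
f = random_low_degree_poly(m, r)
received = corrupt(truth_table(m, f), eps)
```

Recovering f from `received`, with high probability, for degree as large as o(√m), is exactly the task this paper solves efficiently.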
Reed-Muller (RM) codes were introduced in 1954, first by Muller [Mul54] and shortly after by Reed [Ree54], who also provided a decoding algorithm. They are among the oldest and simplest codes to construct: the codewords are multivariate polynomials of a given degree, and the encoding function is just their evaluation vectors. In this work we mainly focus on the most basic case where the underlying field is F = F_2, the field of two elements, although our techniques do generalize to larger finite fields. Over F_2, the Reed-Muller code of degree r in m variables, denoted by RM(m, r), has block length n = 2^m, rate (m choose ≤ r)/2^m and minimum distance 2^{m−r}. RM codes have been extensively studied with respect to decoding errors in both the worst-case and random settings. We begin by giving a review of Reed-Muller codes and their use in theoretical computer science and then discuss our results.

Background
Error-correcting codes (over both large and small finite fields) have been extremely influential in the theory of computation, playing a central role in some important developments in several areas such as cryptography (e.g. [Sha79] and [BF90]), the theory of pseudorandomness (e.g. [BV10]), and probabilistic proof systems (e.g. [BFL91, Sha92] and [ALM+]).

Worst case errors: This is also referred to as errors in the Hamming model [Ham50]. Here, the algorithm should recover the original message regardless of the error pattern, as long as there are not too many errors. The number of errors such a decoding algorithm can tolerate is upper bounded in terms of the distance of the code. The distance of a code C is the minimum Hamming distance between two distinct codewords of C. If the distance is d, then one can uniquely recover from at most ⌊(d − 1)/2⌋ errors. For this model of worst-case errors it is easy to prove that Reed-Muller codes perform badly: they have relatively small distance compared to what random codes of the same rate can achieve (and also compared to explicit families of codes).

Another line of work in Hamming's worst-case setting concerns designing algorithms that can correct beyond the unique-decoding bound. Here there is no unique answer, and so the algorithm returns a list of candidate codewords. In this case the number of errors that the algorithm can tolerate is a parameter of the distance of the code: the list-decoding radius is the maximum distance η for which the number of codewords within distance η is only polynomially large (in n). This question received a lot of attention, and among the works in this area we mention the seminal works of Goldreich and Levin on Hadamard codes [GL89] and of Sudan [Sud97] and Guruswami and Sudan [GS99] on list decoding Reed-Solomon codes. Recently, the list-decoding question for Reed-Muller codes was studied by Gopalan, Klivans and Zuckerman [GKZ08] and by Bhowmick and Lovett [BL15], who proved that, when the degree of the code is constant, the list-decoding radius of Reed-Muller codes over F_2 is at least twice the minimum distance (recall that the unique-decoding radius is half that quantity) and is smaller than four times the minimum distance.

Random errors:
A different setting in which decoding algorithms are studied is Shannon's model of random errors [Sha48]. In Shannon's average-case setting (which we study here), a codeword is subjected to a random corruption, from which recovery should be possible with high probability. This random corruption model is called a channel. The two most basic ones, the Binary Erasure Channel (BEC) and the Binary Symmetric Channel (BSC), have a parameter p (which may depend on n), and corrupt a message by independently replacing, with probability p, the symbol in each coordinate with a "lost" symbol in the BEC(p) case, and with the complementary symbol in the BSC(p) case. In his paper, Shannon studied the optimal trade-off between the rate and the corruption probability achievable for these channels (and many other channels). For every p, the capacity of BEC(p) is 1 − p, and the capacity of BSC(p) is 1 − h(p), where h(p) = −p log(p) − (1 − p) log(1 − p) is the binary entropy function (for p ∈ (0, 1), with h(0) = h(1) = 0). Shannon also proved that random codes achieve this optimal behavior. That is, for every ε > 0 there exist codes of rate 1 − h(p) − ε for the BSC (and rate 1 − p − ε for the BEC) that can decode from a fraction p of errors (respectively, erasures) with high probability.

For our purposes, it is more convenient to assume that the codeword is subjected to a fixed number s of random errors. Note that by the Chernoff-Hoeffding bound (see e.g. [AS92]), the probability that more than pn + ω(√(pn)) errors occur in BSC(p) (or BEC(p)) is o(1), and so we can restrict ourselves to the case of a fixed number s of random errors by setting the corruption probability to be p = s/n. We refer to [ASW15] for further discussion of this subject.

Decoding erasures to decoding errors

Recently, there has been considerable progress in our understanding of the behavior of Reed-Muller codes under random erasures. In [ASW15], Abbe, Shpilka and Wigderson showed that Reed-Muller codes achieve capacity for the BEC for both sufficiently low and sufficiently high rates. Specifically, they showed that RM(m, r) achieves capacity for the BEC for r = o(m) or r > m − o(√(m / log m)). More recently, Kumar and Pfister [KP15] and Kudekar, Mondelli, Şaşoğlu and Urbanke [KMŞU15] independently showed that Reed-Muller codes achieve capacity for the BEC in the entire constant rate regime, that is, for r ∈ [m/2 − O(√m), m/2 + O(√m)]. These regimes are pictorially represented in Figure 1.

Figure 1: Regimes of r for which RM(m, r) is known to achieve capacity for the BEC.

Another result proved by Abbe et al. [ASW15] is that the Reed-Muller code RM(m, m − (2r + 2)) can correct any error pattern for which the same erasure pattern can be decoded in RM(m, m − (r + 1)). This reduction is appealing on its own, since it connects decoding from erasures, which is easier both intuitively and algorithmically, with decoding from errors; but its importance is further emphasized by the progress made later by Kumar and Pfister and by Kudekar et al., who showed that Reed-Muller codes can correct many erasures in the constant rate regime, right up to the channel capacity. This result shows that RM(m, m − (2r + 2)) can cope with most error patterns of weight (1 − o(1))(m choose ≤ r), which is the capacity of RM(m, m − (r + 1)) for the BEC. While this is polynomially smaller than what can be achieved in the Shannon model of errors by random codes of the same rate, this number is still much larger (super-polynomially larger) than the distance (and the list-decoding radius) of the code, which is 2^{2r+2}.
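To get a feel for the scale of this gap, here is a quick computation (our own illustration, not from the paper) comparing the minimum distance 2^{2r+2} of RM(m, m − (2r + 2)) with the number (m choose ≤ r) of random errors handled via the reduction:

```python
from math import comb

def binom_leq(m, r):
    """The quantity (m choose <= r) = sum_{i=0}^{r} (m choose i)."""
    return sum(comb(m, i) for i in range(r + 1))

m, r = 100, 10
errors = binom_leq(m, r)        # ~ number of correctable random errors
distance = 2 ** (2 * r + 2)     # minimum distance of RM(m, m - (2r + 2))
print(errors > distance)
```

For m = 100 and r = 10, the random-error budget exceeds the minimum distance by several orders of magnitude.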
Also, since RM(m, m/2 + o(√m)) can cope with a (1/2 − o(1))-fraction of random erasures, this translation implies that RM(m, o(√m)) can handle that many random errors.

However, a shortcoming of the proof of Abbe et al. for the BSC is that it is existential. In particular, it does not provide an efficient decoding algorithm. Thus, Abbe et al. left open the question of coming up with an efficient algorithm for decoding Reed-Muller codes from random errors.

Our results

In this work we give an efficient decoding algorithm for Reed-Muller codes that matches the parameters given by Abbe et al. Following the aforementioned results about the erasure-correcting ability of Reed-Muller codes, the results can be partitioned into the low-rate and the high-rate regimes. We begin with the result for the low rate case.
Theorem 1 (Low rate, informal). Let r < δ√m for a small enough constant δ > 0. Then, there is an efficient algorithm that can decode RM(m, r) from a random set of (1 − o(1)) · (m choose ≤ m/2 − r) errors. In particular, if r = o(√m), the algorithm can decode from (1/2 − o(1)) · 2^m errors. The running time of the algorithm is O(n^4) and it can be simulated in NC.

For high rate Reed-Muller codes we cannot hope to achieve such a high error-correction capability as in the low rate case, even information theoretically. We do give, however, an algorithm that corrects many more errors (a super-polynomially larger number) than what the minimum distance of the code suggests, and its running time is also nearly linear in the block length of the code.
Theorem 2 (High rate, informal). Let r = o(√(m / log m)). Then, there is an efficient algorithm that can decode RM(m, m − (2r + 2)) from a random set of (1 − o(1))(m choose ≤ r) errors. Moreover, the running time of the algorithm is 2^m · poly((m choose ≤ r)) and it can be simulated in NC.

Recall that the block length of the code is n = 2^m, and thus the running time is near linear in n when r = o(m).

A general property of our algorithm is that it corrects any error pattern in RM(m, m − (2r + 2)) for which the same erasure pattern in RM(m, m − (r + 1)) can be corrected. Stated differently, if an erasure pattern can be corrected in RM(m, m − (r + 1)), then the same pattern, with the "lost" symbols replaced by arbitrary 0/1 values, can be corrected in RM(m, m − (2r + 2)). This property is useful whenever we know that RM(m, m − (r + 1)) can correct a large set of erasures with high probability, that is, whenever m − r − 1 falls in the red region in Figure 1. Thus, our result has implications beyond the above two instances. In particular, it may be the case that our algorithm performs well for other rates as well. For example, consider the following question and the theorem it implies.

Question 3.
Does RM(m, m − r − 1) achieve capacity for the BEC?

Theorem 4 (informal). For any value of r for which the answer to Question 3 is positive, there exists an efficient algorithm that decodes RM(m, m − (2r + 2)) from a random set of (1 − o(1))(m choose ≤ r) errors with probability 1 − o(1) (over the random errors). Moreover, the running time of the algorithm is 2^m · poly((m choose ≤ r)).

Recall that Abbe et al. [ASW15] also proved that the answer to Question 3 is positive for r = m − o(m) (that is, for RM(m, o(m))), but this case does not help us, as we would need to consider RM(m, m − (2r + 2)), and m − (2r + 2) < 0 for such r. The belief in the coding theory community is that Reed-Muller codes achieve capacity for the BEC in all rate regimes, and conjectures to that effect were made in [CF07, Arı08, MHU14]. (The belief that RM codes achieve capacity is much older, but we did not trace back where it appears first.) Recent simulations have also suggested that the answer to the question is positive [Arı08, MHU14]. Thus, it seems natural to believe that the answer is positive for most values of r, even for r = Θ(m). As a conclusion, the belief in the coding theory community suggests that our algorithm can decode a random set of roughly (m choose ≤ r) errors in RM(m, m − (2r + 2)). For example, for r = ρ · m, where ρ < 1/2 is a constant, the distance of RM(m, m − (2r + 2)) is roughly 2^{2ρm}, whereas our algorithm can decode from roughly 2^{h(ρ)m} random errors (assuming the answer to Question 3 is positive), which is a much larger quantity for every ρ < 1/2.

Abbe et al. also showed a reduction from decodable error patterns in a code C to decodable erasure patterns of an appropriate "tensor" C′ of C (by essentially embedding these codes in a large enough RM code). Although Abbe et al. did not provide an efficient decoding algorithm, the algorithm we present directly applies here (Section 3.2). The abstraction of the "error-locating pairs" method presented in Section 3 should hopefully be applicable in other contexts too, especially considering the generality of the results of [KP15, KMŞU15].

Related work

In Section 1.1 we surveyed the known results regarding the ability of Reed-Muller codes to correct random erasures.
In this section we summarize the known results about recovering RM codes from random errors. Once again, it is useful to distinguish between the low-rate and the high-rate regimes of Reed-Muller codes. We shall use d to denote the distance of the code in context. For RM(m, r) codes, d = 2^{m−r}.

In [Kri70], the majority logic algorithm of [Ree54] is shown to succeed in recovering all but a vanishing fraction of error patterns of weight up to d log d / 4, for all RM codes of positive rate. In [Dum06], Dumer showed, for all r such that min(r, m − r) = ω(log m), that most error patterns of weight at most (d log d / 2) · (1 − (log m)/(log d)) can be recovered in RM(m, r). To make sense of the parameters, we note that when r = m − ω(log m) the weight is roughly d log d / 2. To compare this result to ours, we first consider the case r = m − o(√(m / log m)). Here the algorithm of [Dum06] can correct roughly 2^{o(√(m / log m))} random errors in RM(m, r), whereas Theorem 2 gives an algorithm for correcting roughly m^{o(√(m / log m))} ≈ (d log d)^{O(log m)} random errors. Further, even for the case r = (1 − 2ρ)m for a constant ρ, the weight of the error patterns handled by [Dum06] is O(d log d). On the other hand, assuming a positive answer to Question 3, Theorem 4 implies an efficient decoding algorithm for RM(m, (1 − 2ρ)m) that can decode from roughly (m choose ρm) = d^{O(log(1/ρ))} random errors in this case.

We now turn to other regimes of parameters, specifically RM codes of low rate. For the special case of r = 1, 2, [HKL05] shows that RM(m, r) codes are capacity-achieving. In [SP92], it is shown that RM codes of fixed order (i.e., r = O(1)) can decode most error patterns of weight up to (n/2)(1 − √(c^{r−1} m^r / (n · r!))), where c > ln(4). In [ASW15], Abbe et al. settled the question for low-order Reed-Muller codes, proving that RM(m, r) codes achieve capacity for the BSC when r = o(m). We note, however, that all the results mentioned here are existential in nature and do not provide an efficient decoding algorithm.

A line of work by Dumer [Dum04, DS06] based on recursive algorithms (that exploit the recursive structure of Reed-Muller codes) obtains algorithmic results, mainly in low-rate regimes. In [Dum04], it is shown that for a fixed degree, i.e., r = O(1), an algorithm of complexity O(n log n) can correct most error patterns of weight up to (n/2)(1 − ε), given that ε exceeds n^{−1/2^r}. In [Dum06], this is improved to errors of weight up to (n/2)(1 − (m/d)^{1/2^r}) for all r = o(log m). The case r = ω(log m) is also covered in [Dum06], as described above.

We note that all the efficient algorithms mentioned above (for both high and low rate) rely on the so-called Plotkin construction of the code, that is, on its recursive structure (expanding an m-variate polynomial according to the m-th variable as f(x_1, ..., x_m) = x_m · g(x_1, ..., x_{m−1}) + h(x_1, ..., x_{m−1})), whereas our approach is very different. We summarize and compare our results with [Dum04, DS06, Dum06] for various ranges of parameters in Figure 2 (the degree is r and the distance is d = 2^{m−r}).

Figure 2: Comparison with [Dum04, DS06, Dum06].
The dotted region in Figure 2 corresponds to the uncovered region in Figure 1 beyond m/2, via the connection given in Theorem 4.

Notation

Before explaining the idea behind the proofs of our results we need to introduce some notation and parameters. We shall use the same notation as [ASW15].

• We denote by M(m, r) the set of m-variate multilinear monomials over F_2 of degree at most r.

• For non-negative integers r ≤ m, RM(m, r) denotes the Reed-Muller code whose codewords are the evaluation vectors of all multivariate polynomials of degree at most r on m boolean variables. The maximal degree r is sometimes called the order of the code. The block length of the code is n = 2^m, the dimension is k = k(m, r) = ∑_{i=0}^{r} (m choose i), which we denote by (m choose ≤ r), and the distance is d = d(m, r) = 2^{m−r}. The code rate is given by R = k(m, r)/n.

• We use E(m, r) to denote the "evaluation matrix" with parameters m, r, whose rows are indexed by all monomials in M(m, r) and whose columns are indexed by all vectors in F_2^m. The value at entry (M, u) is equal to M(u). For u ∈ F_2^m, we denote by u^r the column of E(m, r) indexed by u, which is a k-dimensional vector consisting of the evaluations of all monomials of degree ≤ r at u. For a subset of columns U ⊆ F_2^m we denote by U^r the corresponding submatrix of E(m, r).

• E(m, r) is a generator matrix for RM(m, r). The duality property of Reed-Muller codes (see, for example, [MS77]) states that E(m, m − r − 1) is a parity-check matrix for RM(m, r), or equivalently, that E(m, r) is a parity-check matrix for RM(m, m − r − 1).

• We associate with a subset U ⊆ F_2^m its characteristic vector 1_U ∈ F_2^n. We often think of the vector 1_U as denoting either an erasure pattern or an error pattern.

• For a positive integer n, we use the standard notation [n] for the set {1, 2, ..., n}.

We next define what we call the degree-r syndrome of a set.

Definition 5 (Syndrome). Let r ≤ m be two positive integers. The degree-r syndrome, or simply r-syndrome, of a set U = {u_1, ..., u_t} ⊆ F_2^m is the (m choose ≤ r)-dimensional vector α whose entries are indexed by all monomials M ∈ M(m, r), such that α_M := ∑_{i=1}^{t} M(u_i), the sum taken over F_2.

Note that this is nothing but the syndrome of the error pattern 1_U ∈ F_2^n in the code RM(m, m − r − 1) (whose parity-check matrix is the generator matrix of RM(m, r)).

Proof overview

In this section we describe our approach for constructing a decoding algorithm. Recall that the algorithm has the property that it decodes in RM(m, m − (2r + 2)) any error pattern U which is correctable from erasures in RM(m, m − (r + 1)). Such patterns are characterized by the property that the columns of E(m, r) corresponding to the elements of U are linearly independent vectors. Thus, it suffices to give an algorithm that succeeds whenever the error pattern U gives rise to such linearly independent columns, which happens with probability 1 − o(1) in the regimes of parameters mentioned in Theorem 1 and Theorem 2.

So let us assume from now on that the error pattern U corresponds to a set of linearly independent columns in E(m, r). Notice that, by the choice of our parameters, our task is to recover U from the degree-(2r + 1) syndrome of U. Furthermore, we want to do so efficiently. For convenience, let t = |U| = (1 − o(1))(m choose ≤ r).

Recall that the degree-(2r + 1) syndrome of U is the (m choose ≤ 2r + 1)-long vector α such that for every monomial M ∈ M(m, 2r + 1), α_M = ∑_{i=1}^{t} M(u_i). Imagine now that we could somehow find degree-r polynomials f_i(x_1, ..., x_m) satisfying f_i(u_j) = δ_{i,j}.
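Before describing how such dual polynomials are found, here is a minimal sketch of the syndrome computation of Definition 5 (the helper names are ours, not from the paper):

```python
import itertools

def monomials(m, r):
    """All m-variate multilinear monomials of degree at most r,
    represented as tuples of variable indices."""
    return [mono for d in range(r + 1)
            for mono in itertools.combinations(range(m), d)]

def evaluate(mono, u):
    """Evaluate a monomial at a point u in {0,1}^m (the empty product is 1)."""
    return int(all(u[i] for i in mono))

def syndrome(U, m, r):
    """Degree-r syndrome of the set U: for each monomial M of degree <= r,
    the F_2 sum of M(u) over all u in U."""
    return [sum(evaluate(mono, u) for u in U) % 2 for mono in monomials(m, r)]

U = [(1, 0, 1), (1, 1, 0)]
alpha = syndrome(U, 3, 1)   # entries indexed by 1, x0, x1, x2
```

In the decoding problem, this vector α (for degree 2r + 1) is the only information the algorithm receives about the unknown error set U.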
Then, from knowledge of α and, say, f_1, we could compute the following sums:

σ_ℓ = ∑_{i=1}^{t} (f_1 · x_ℓ)(u_i),  for ℓ ∈ [m].

Indeed, if we know α and f_1 then we can compute each σ_ℓ, as it just involves summing several coordinates of α (since deg(f_1 · x_ℓ) ≤ r + 1). Observe that

σ_ℓ = ∑_{i=1}^{t} (f_1 · x_ℓ)(u_i) = (f_1 · x_ℓ)(u_1) = (u_1)_ℓ.

In other words, knowledge of such an f_1 would allow us to discover all the coordinates of u_1; in particular, we would be able to deduce u_1, and similarly all the other u_i using the f_i.

Our approach is thus to find such polynomials f_i. What we will do is set up a system of linear equations in the coefficients of an unknown degree-r polynomial f and show that f_1 is the unique solution to the system. Indeed, showing that f_1 is a solution is easy; the hard part is proving that it is the unique solution.

To explain how we set up the system of equations, let us assume for the time being that we actually know u_1. Let f = ∑_{M ∈ M(m, r)} c_M · M, where we think of {c_M} as unknowns. Consider the following linear system:

1. ∑_{i=1}^{t} f(u_i) = f(u_1) = 1.
2. ∑_{i=1}^{t} (f · M)(u_i) = M(u_1), for all M ∈ M(m, r).
3. ∑_{i=1}^{t} (f · M · (x_ℓ + (u_1)_ℓ + 1))(u_i) = M(u_1), for every ℓ ∈ [m] and for all M ∈ M(m, r).

In words, we have a system of 2 + (m choose ≤ r) + m · (m choose ≤ r) equations in (m choose ≤ r) variables (the coefficients of f). Observe that f = f_1 is indeed a solution to the system. To prove that it is the unique solution we rely on the fact that the columns of U^r are linearly independent, and hence expressing u_1^r as a linear combination of those columns can be done in a unique way.

Now we explain what to do when we do not know u_1. Let v = (v_1, ..., v_m) ∈ F_2^m. We modify the linear system above to:

1. ∑_{i=1}^{t} f(u_i) = f(v) = 1.
2. ∑_{i=1}^{t} (f · M)(u_i) = M(v), for all M ∈ M(m, r).
3.
∑_{i=1}^{t} (f · M · (x_ℓ + v_ℓ + 1))(u_i) = M(v), for all ℓ ∈ [m] and M ∈ M(m, r).

Now the point is that one can prove that if a solution exists, then it must be the case that v is an element of U. Indeed, the set of equations in item 2 implies that v^r is in the linear span of the columns of U^r. The linear equations in item 3 then imply that v must actually be in the set U.

Notice that what we actually do amounts to setting up, for every v ∈ F_2^m, a system of linear equations of size roughly (m choose ≤ r). Such a system can be solved in time poly((m choose ≤ r)). Thus, when we go over all v ∈ F_2^m we get a running time of 2^m · poly((m choose ≤ r)), as claimed.

Our proof can be viewed as an algorithmic version of the proof of Theorem 1.8 of Abbe et al. [ASW15]. That theorem asserts that when the columns of U^r are linearly independent, the (2r + 1)-syndrome of U is unique. In their proof of the theorem, they first use the (2r)-syndrome to claim that if V is another set with the same (2r)-syndrome, then the column span of U^r is the same as that of V^r. Then, using the degree-(2r + 1) monomials, they deduce that U = V. This is similar to what our linear system does; but, in contrast, [ASW15] did not have an efficient algorithmic version of this statement.

The decoding algorithm

We begin with the following basic linear-algebraic fact.

Lemma 6. Let u_1, ..., u_t ∈ F_2^m be such that {u_1^r, ..., u_t^r} are linearly independent. Then, for every i ∈ [t], there exists a polynomial f_i of degree at most r such that for every j ∈ [t],

f_i(u_j) = δ_{i,j} = 1 if i = j, and 0 otherwise.

For completeness, we give the short proof.

Proof. Consider the matrix U^r ∈ F_2^{t × (m choose ≤ r)} whose i-th row is u_i^r. A polynomial f_i which satisfies the properties of the lemma is a solution to the linear system U^r x = e_i, where e_i ∈ F_2^t is the i-th elementary basis vector (that is, (e_i)_j = δ_{i,j}), and the (m choose ≤ r) unknowns are the coefficients of f_i. By the assumption that U^r has full row rank, a solution indeed exists.

The algorithm proceeds by making a guess v = (v_1, ..., v_m) ∈ F_2^m for one of the error locations. If we could come up with an efficient way to verify that the guess is correct, this would immediately yield a decoding algorithm. We shall verify our guess by using the dual polynomials f_1, ..., f_t described above. We shall find them by solving a system of linear equations that can be constructed from the (2r + 1)-syndrome of {u_1, ..., u_t}. We will need the following crucial, yet simple, observation.

Observation 7.
Let f be any m-variate polynomial of degree at most 2r + 1, and let u_1, ..., u_t ∈ F_2^m. Then the sum ∑_{i=1}^{t} f(u_i) can be computed given the (2r + 1)-syndrome of {u_1, ..., u_t}, in time O((m choose ≤ 2r + 1)).

Proof. For any M ∈ M(m, 2r + 1), denote α_M = ∑_{i=1}^{t} M(u_i) (so that α = (α_M)_{M ∈ M(m, 2r+1)} is precisely the syndrome of {u_1, ..., u_t}). Write f = ∑_{M ∈ M(m, 2r+1)} c_M · M, where c_M ∈ F_2. Then

∑_{i=1}^{t} f(u_i) = ∑_{i=1}^{t} ∑_{M ∈ M(m, 2r+1)} c_M · M(u_i) = ∑_{M ∈ M(m, 2r+1)} c_M (∑_{i=1}^{t} M(u_i)) = ∑_{M ∈ M(m, 2r+1)} c_M · α_M.

The following lemma shows how to verify a guess for an error location. It is the main ingredient in the analysis of our algorithm and the reason why it works. Basically, the lemma gives a system of linear equations whose solution enables us to decide whether a given v ∈ F_2^m is a corrupted coordinate or not, without knowledge of the set of errors U, but only of its syndrome. In a sense, this lemma is analogous to the Berlekamp-Welch algorithm, which also gives a system of linear equations whose solution reveals the set of erroneous locations ([WB86]; see also the exposition in Chapter 13 of [GRS14]).

Lemma 8 (Main Lemma). Let u_1, ..., u_t ∈ F_2^m be such that {u_1^r, ..., u_t^r} are linearly independent, and let v = (v_1, ..., v_m) ∈ F_2^m. Suppose there exists a multilinear polynomial f ∈ F_2[x_1, ..., x_m] with deg(f) ≤ r such that for every monomial M ∈ M(m, r):

1. ∑_{i=1}^{t} f(u_i) = f(v) = 1,
2. ∑_{i=1}^{t} (f · M)(u_i) = M(v), and
3. ∑_{i=1}^{t} (f · M · (x_ℓ + v_ℓ + 1))(u_i) = M(v) for every ℓ ∈ [m].

Then there exists i ∈ [t] such that v = u_i.

Observe that if indeed v = u_i for some i ∈ [t], then the polynomial f_i guaranteed by Lemma 6 satisfies these equations. Hence, the lemma should be interpreted as saying the converse: if there exists such a solution, then v = u_i for some i. Further, given the (2r + 1)-syndrome of {u_1, ..., u_t} as input, Observation 7 shows that each of the above constraints is a linear constraint in the coefficients of f. Thus, finding such an f merely amounts to solving a system of O(m · (m choose ≤ r)) linear equations in (m choose ≤ r) unknowns, and this can be done in poly((m choose ≤ r)) time.

Proof of Lemma 8.
Let J = {j | f(u_j) = 1}. Note that by item 1 it holds that J ≠ ∅.

Subclaim 9. ∑_{i ∈ J} u_i^r = v^r.

Proof. Let M ∈ M(m, r). We show that ∑_{i ∈ J} M(u_i) = M(v), i.e., that the M-th coordinate of ∑_{i ∈ J} u_i^r is equal to that of v^r. Indeed, as f satisfies the constraints in item 2,

M(v) = ∑_{i=1}^{t} (f · M)(u_i) = ∑_{i ∈ J} (f · M)(u_i) + ∑_{i ∉ J} (f · M)(u_i) = ∑_{i ∈ J} M(u_i).  (Subclaim)

For any ℓ ∈ [m], let J_ℓ = {j | f(u_j) = 1 and (u_j)_ℓ = v_ℓ} ⊆ J. Observe that this definition implies that for every j ∈ [t], the index j is in J_ℓ if and only if (f · (x_ℓ + v_ℓ + 1))(u_j) = 1. Using a similar argument, we can show the following.

Subclaim 10. For every ℓ ∈ [m],

∑_{i ∈ J_ℓ} u_i^r = v^r.  (11)

Proof. Again, for any M ∈ M(m, r), the constraints in item 3 imply that

M(v) = ∑_{i=1}^{t} (f · M · (x_ℓ + v_ℓ + 1))(u_i) = ∑_{i ∈ J_ℓ} M(u_i).  (Subclaim)

From the above claims,

v^r = ∑_{i ∈ J} u_i^r = ∑_{i ∈ J_1} u_i^r = ··· = ∑_{i ∈ J_m} u_i^r.

By the linear independence of {u_1^r, ..., u_t^r}, it follows that J = J_1 = J_2 = ··· = J_m; indeed, there is a unique linear combination of {u_1^r, ..., u_t^r} that gives v^r. The only vector u_j whose index can lie in the (non-empty) intersection ∩_{k=1}^{m} J_k is v itself, and so there exists i ∈ [t] such that u_i = v.

Lemma 8 implies a natural algorithm for decoding from t errors indexed by vectors {u_1, ..., u_t}, assuming {u_1^r, ..., u_t^r} are linearly independent, which we write down explicitly as Algorithm 1.

Algorithm 1: Reed-Muller Decoding
Input: The (2r + 1)-syndrome of {u_1, ..., u_t}
  E ← ∅
  for all v = (v_1, ..., v_m) ∈ F_2^m do
    solve for a polynomial f ∈ F_2[x_1, ..., x_m] of degree at most r such that:
      • ∑_{i=1}^{t} f(u_i) = f(v) = 1
      • ∑_{i=1}^{t} (f · M)(u_i) = M(v) for all M ∈ M(m, r)
      • ∑_{i=1}^{t} (f · M · (x_ℓ + v_ℓ + 1))(u_i) = M(v) for all ℓ ∈ [m] and M ∈ M(m, r)
    if there is a polynomial f that satisfies the above system of equations then
      add v to the set E
  return the set E as the error locations

Theorem 12.
Given the (2r + 1)-syndrome of t unknown vectors {u_1, ..., u_t} ⊆ F_2^m such that {u_1^r, ..., u_t^r} are linearly independent, Algorithm 1 outputs {u_1, ..., u_t}, runs in time 2^m · poly((m choose ≤ r)), and can be realized by a circuit of depth poly(m) = poly(log n).

Proof. The algorithm enumerates all vectors in F_2^m, and for each candidate v checks whether there exists a solution to the linear system of poly((m choose ≤ r)) equations in poly((m choose ≤ r)) unknowns given in Lemma 8. Observation 7 shows that this system of linear equations can be constructed from the (2r + 1)-syndrome in poly((m choose ≤ r)) time. By Lemma 6 and Lemma 8, a solution to this system exists if and only if there is i ∈ [t] so that v = u_i. The bound on the running time follows from the description of the algorithm. Furthermore, all 2^m = n linear systems can be solved in parallel, and each linear system can be solved by an NC circuit (see, e.g., [MV97]).

Observe that the proof of correctness for Algorithm 1 is valid, for any value of r, whenever the set of error locations {u_1, ..., u_t} satisfies the property that {u_1^r, ..., u_t^r} are linearly independent. Therefore, we would like to apply Theorem 12 in settings where {u_1^r, ..., u_t^r} are linearly independent with high probability.

For the constant rate regime, Kumar and Pfister [KP15] and Kudekar, Mondelli, Şaşoğlu and Urbanke [KMŞU15] proved that RM(m, m − r − 1) achieves capacity for r = m/2 ± O(√m).

Theorem 13 ([KP15], Theorem 23). Let r ≤ m be integers such that r = m/2 ± O(√m). Then, for t = (1 − o(1))(m choose ≤ r), with probability 1 − o(1), a set of vectors {u_1, ..., u_t} ⊆ F_2^m chosen uniformly at random is such that {u_1^r, ..., u_t^r} are linearly independent (as vectors in F_2^{(m choose ≤ r)}).

Letting r = m/2 − o(√m) and looking at the code RM(m, m − (2r + 2)) = RM(m, o(√m)), so that (m choose ≤ r) = (1/2 − o(1)) · 2^m, we get the following statement, stated earlier as Theorem 1.
Corollary 14.
There exists a (deterministic) algorithm that is able to correct $t = (1/2 - o(1)) \cdot 2^m$ random errors in $\mathrm{RM}(m, o(\sqrt{m}))$ with probability $1 - o(1)$. The algorithm runs in time $2^m \cdot \mathrm{poly}\binom{m}{m/2 - o(\sqrt{m})} \le n^{O(1)}$.

Similarly, we can let $r = m/2 - O(\sqrt{m})$ and correct $c \cdot 2^m$ random errors in the code $\mathrm{RM}(m, O(\sqrt{m}))$, where $c$ is some positive constant that goes to zero as the constant hidden under the big $O$ increases.

For the high-rate regime, recall the following capacity achieving result proved in [ASW15]:

Theorem 15 ([ASW15], Theorem 4.5). Let $\varepsilon > 0$, let $r \le m$ be two positive integers, and let $t < \binom{m - \log\binom{m}{\le r} - \log(1/\varepsilon)}{\le r}$. Then, with probability at least $1 - \varepsilon$, for a set of vectors $\{u_1, \ldots, u_t\} \subseteq \mathbb{F}_2^m$ chosen uniformly at random, it holds that $\{u_1^{\le r}, \ldots, u_t^{\le r}\}$ are linearly independent over $\mathbb{F}_2^{\binom{m}{\le r}}$.

Using Theorem 15, we apply Theorem 12 to obtain the following corollary, which was stated informally as Theorem 2.
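Numerically, the bound of Theorem 15 can be explored with a short Python sketch (an illustration only; we assume base-2 logarithms and interpret the binomial coefficient with a non-integer upper index $x$ as the generalized $\binom{x}{k} = x(x-1)\cdots(x-k+1)/k!$; the helper names are ours):

```python
import math

def binom_sum(m, r):
    # binom(m, <= r): the number of monomials of degree at most r
    return sum(math.comb(m, k) for k in range(r + 1))

def gen_binom_sum(x, r):
    # sum_{k <= r} C(x, k) for a real upper index x
    total = 0.0
    for k in range(r + 1):
        p = 1.0
        for i in range(k):
            p *= x - i
        total += p / math.factorial(k)
    return total

def t_bound(m, r, eps):
    # the quantity binom(m - log binom(m, <=r) - log(1/eps), <= r) of Theorem 15
    top = m - math.log2(binom_sum(m, r)) - math.log2(1 / eps)
    return gen_binom_sum(top, r)
```

For fixed $r = 3$ and $\varepsilon = 0.01$, the ratio between this bound and $\binom{m}{\le r}$ grows toward $1$ as $m$ increases (roughly $0.79$ at $m = 400$ and $0.90$ at $m = 1000$), illustrating the $(1 - o(1))\binom{m}{\le r}$ behavior for small $r$.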
Corollary 16.
Let $\varepsilon > 0$, and let $r \le m$ be two positive integers. Then there exists a (deterministic) algorithm that is able to correct $t = \left\lfloor \binom{m - \log\binom{m}{\le r} - \log(1/\varepsilon)}{\le r} \right\rfloor - 1$ random errors in $\mathrm{RM}(m, m - (2r+2))$ with probability at least $1 - \varepsilon$. The algorithm runs in time $2^m \cdot \mathrm{poly}\binom{m}{\le r}$.

If $r = o(\sqrt{m/\log m})$, the bound on $t$ is $(1 - o(1))\binom{m}{\le r}$, as promised. More generally, a positive answer to Question 3 is equivalent to $\{u_1^{\le r}, \ldots, u_t^{\le r}\}$, for $t = (1 - o(1))\binom{m}{\le r}$, being linearly independent with probability $1 - o(1)$ (see Corollary 2.9 in [ASW15]), and thus we also obtain the following corollary, which was stated informally as Theorem 4.

Corollary 17.
Let $r \le m$ be two positive integers. Suppose that $\mathrm{RM}(m, m - r - 1)$ achieves capacity for the BEC. Then there exists a (deterministic) algorithm that is able to correct $(1 - o(1))\binom{m}{\le r}$ random errors in $\mathrm{RM}(m, m - (2r+2))$ with probability $1 - o(1)$. The algorithm runs in time $2^m \cdot \mathrm{poly}\binom{m}{\le r}$.

We note that for all values of $r$, $2^m \cdot \mathrm{poly}\binom{m}{\le r}$ is polynomial in the block length $n = 2^m$, and when $r = o(m)$ this is equal to $n^{1 + o(1)}$.

In this section we present a more abstract view of Algorithm 1, in the spirit of the works by Pellikaan, Duursma and Kötter ([Pel92, DK94]), which abstract the Berlekamp-Welch algorithm (see also the exposition in [Sud01]). Stated in this way, it is also clear that the algorithm works over larger alphabets, so we no longer limit ourselves to dealing with binary alphabets. As shown in [KP15], Reed-Muller codes over $\mathbb{F}_q$ (sometimes referred to as Generalized Reed-Muller codes) also achieve capacity in the constant rate regime.

We begin by giving the definition of a (pointwise) product of two vectors, and of two codes.
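Anticipating Definition 18 and Algorithm 2 below, the error-locating step can be sketched in Python for arbitrary binary codes given by generator matrices (a toy illustration of ours, not the paper's code; here $a * y$ denotes the coordinate-wise product defined next):

```python
def gf2_nullspace(A, ncols):
    # basis of the kernel of the 0/1 matrix A over F_2
    rows = [row[:] for row in A]
    pivots = []
    r = 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [0] * ncols
        v[f] = 1
        for i, c in enumerate(pivots):
            v[c] = rows[i][f]
        basis.append(v)
    return basis

def locate_errors(y, E_gen, N_gen):
    # Solve a * y = b with a in span(E_gen), b in span(N_gen): one F_2 equation
    # per coordinate j, namely y_j * a_j + b_j = 0, in the unknown coefficients
    # of a and b.  The located errors are the common zeros of the a-parts of
    # all solutions.
    n, kE = len(y), len(E_gen)
    A = [[y[j] & g[j] for g in E_gen] + [h[j] for h in N_gen] for j in range(n)]
    a_vecs = []
    for s in gf2_nullspace(A, kE + len(N_gen)):
        a = [0] * n
        for i in range(kE):
            if s[i]:
                a = [x ^ g for x, g in zip(a, E_gen[i])]
        a_vecs.append(a)
    return [j for j in range(n) if all(a[j] == 0 for a in a_vecs)]
```

As a sanity check, take length-4 toy codes satisfying the conditions proved below: $C$ the repetition code and $E = N$ the even-weight code. A single flipped coordinate of the codeword $1111$ is then located correctly, and the located positions can be treated as erasures and corrected in $C$ by solving one more linear system.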
Definition 18.
Let $u, v \in \mathbb{F}_q^n$. Denote by $u * v \in \mathbb{F}_q^n$ the vector $(u_1 v_1, \ldots, u_n v_n)$. For $A, B \subseteq \mathbb{F}_q^n$ we similarly define $A * B = \{u * v \mid u \in A, v \in B\}$.

Following the footsteps of Algorithm 1, we wish to decode, in a code $C$, error patterns which are correctable from erasures in a related code $N$, through the use of an error-locating code $E$. Under some assumptions on $C$, $N$ and $E$, we can use a similar proof in order to do this.

Theorem 19. Let $E, C, N \subseteq \mathbb{F}_q^n$ be codes with the following properties.
1. $E * C \subseteq N$.
2. For any pattern $U$ that is correctable from erasures in $N$, and for any coordinate $i \notin U$, there exists a codeword $e \in E$ such that $e_j = 0$ for all $j \in U$ and $e_i = 1$.
Then there exists an efficient algorithm that corrects in $C$ any pattern $U$ which is correctable from erasures in $N$.

To put things in perspective, earlier we set $C = \mathrm{RM}(m, m - (2r+2))$, $N = \mathrm{RM}(m, m - (r+1))$ and $E = \mathrm{RM}(m, r+1)$. It is immediate to observe that item 1 holds in this case, and item 2 is guaranteed by Lemma 6: indeed, consider the error pattern $U = \{u_1, \ldots, u_t\}$ and the dual polynomials $\{f_i\}_{i=1}^t$, and let $v \notin U$ be any other coordinate of the code. If there exists $j \in [t]$ such that $f_j(v) = 1$, we can pick the codeword $g = f_j \cdot (1 + x_\ell + v_\ell)$, where $\ell$ is some coordinate such that $v_\ell \ne (u_j)_\ell$. The polynomial $g$ has degree at most $r + 1$, so it is a codeword of $E$, and it can be directly verified that it satisfies the conditions of item 2. If $f_j(v) = 0$ for all $j \in [t]$, we can pick $g = 1 - \sum_{i=1}^{t} f_i$.

It is also worth pointing out the differences between our approach and the abstract Berlekamp-Welch decoder of Duursma and Kötter: they similarly set up codes $E$, $C$ and $N$ such that $E * C \subseteq N$. However, instead of item 2, they require that for any $e \in E$ and $c \in C$, if $e * c = 0$ then $e = 0$ or $c = 0$ (along with certain distance conditions on $E$ and $C$ that guarantee this property). This property, as well as the distance properties, do not hold in the case of Reed-Muller codes.

Turning back to the proof of Theorem 19, the algorithm and the proof of correctness turn out to be very short to describe at this level of generality. Given a word $y \in \mathbb{F}_q^n$, the algorithm solves the linear system $a * y = b$, in unknowns $a \in E$ and $b \in N$. Under the hypothesis of the theorem, we show that the common zeros of the possible solutions for $a$ determine exactly the error locations. Once the locations of the errors are identified, correcting them is easy: we can replace the error locations by the symbol '?' and use an algorithm which corrects erasures (this can always be done efficiently, when unique decoding is possible, as it merely amounts to solving a system of linear equations). The algorithm is given in Algorithm 2.

Algorithm 2: Abstract Decoding Algorithm
Input: received word $y \in \mathbb{F}_q^n$ such that $y = c + e$, with $c \in C$ and $e$ supported on a set $U$
Solve for $a \in E$, $b \in N$ the linear system $a * y = b$.
Let $\{a_1, \ldots, a_k\}$ be a basis for the solution space of $a$, and let $\mathcal{E}$ denote the set of common zeros of $\{a_i \mid i \in [k]\}$.
For every $j \in \mathcal{E}$, replace $y_j$ with '?', to get a new word $y'$.
Correct $y'$ from erasures in $C$.

Note that in Theorem 19 we assume that the error pattern $U$ is correctable from erasures in $N$, whereas Algorithm 2 first computes a set of error locations $\mathcal{E}$ and then corrects $y'$ from erasures in $C$. Thus, the proof of Theorem 19 can be divided into two steps. The first, and the main one, will be to show that $\mathcal{E} = U$. The second, which is merely an immediate observation, will be to show that $U$ is also correctable from erasures in $C$. We begin with the second part:

Lemma 20.
Assume the setup of Theorem 19, and let $U$ be any pattern which is correctable from erasures in $N$. Then $U$ is also correctable from erasures in $C$.

Proof. We may assume that $U \ne \emptyset$, as otherwise the statement is trivial. Suppose towards a contradiction that $U$ is not correctable from erasures in $C$, that is, there exists a non-zero codeword $c \in C$ supported on $U$. For any $a \in E$, we have that $a * c$ is a codeword of $N$ which is supported on a subset of $U$. In order to reach a contradiction, we want to pick $a \in E$ so that $a * c$ is a non-zero codeword of $N$, which contradicts the assumption that $U$ is correctable from erasures in $N$.

Pick $i \in U$ so that $c_i \ne 0$. Observe that if $U$ is correctable from erasures in $N$ then so is $U \setminus \{i\}$. By item 2 in Theorem 19, applied with respect to the set $U \setminus \{i\}$, there exists $a \in E$ with $a_i = 1$. Thus, in particular, $a * c$ is non-zero.

We now prove the main part of Theorem 19, that is, that under the assumptions stated in the theorem, Algorithm 2 correctly decodes (in $C$) any error pattern that is correctable from erasures in $N$.

Proof of Theorem 19.
Write $y = c + e$, so that $c \in C$ is the transmitted codeword and $e$ is supported on the set of error locations $U$. As noted above, by Lemma 20 it is enough to show that under the assumptions of the theorem (in particular, that $U$ is correctable from erasures in $N$), the set of error locations $\mathcal{E}$ computed by Algorithm 2 equals $U$.

In the following two subclaims, we argue that any solution $a$ for the system vanishes on the error points, and then that for every other index $i$, there exists a solution whose $i$-th entry is non-zero (and so there must be a basis element for the solution space whose $i$-th entry is non-zero).

The first subclaim states that every solution $a \in E$ to the equation $a * y = b$ vanishes on $U$, the support of $e$. In the pointwise product notation, this is equivalent to showing that $a * e = 0$.

Subclaim 21.
For every $a \in E$, $b \in N$ such that $a * y = b$, it holds that $a * e = 0$.

Proof. Since $a * y = b \in N$ (by the assumption) and $a * c \in N$ (by item 1), we get that $a * e = a * y - a * c$ is also a codeword in $N$. Furthermore, $a * e$ is supported on $U$, and since $U$ is an erasure-correctable pattern in $N$, the only codeword of $N$ that is supported on $U$ is the zero codeword. (Subclaim)

To finish the proof, we show that for any $i \notin U$, there is a solution $a$ to the system of linear equations with $a_i \ne 0$.

Subclaim 22.
For every $i \notin U$ there exists $a \in E$, $b \in N$ such that $a$ is $0$ on $U$, $a_i = 1$, and $a * y = b$.

Proof. By item 2, since $U$ is correctable from erasures in $N$, for every $i \notin U$ we can pick $a \in E$ such that $a$ is $0$ on $U$ and $a_i = 1$. Set $b = a * y$. It remains to be shown that $b$ is a codeword of $N$. This follows from the fact that $b = a * c + a * e = a * c$, where the second equality follows from the fact that $a$ is zero on $U$ (the support of $e$). Finally, $a * c$ is a codeword of $N$ by item 1. (Subclaim)

These two subclaims complete the proof of the theorem.

3.2 Decoding of Linear Codes over $\mathbb{F}_2$

In [ASW15], it is observed that their results for Reed-Muller codes imply that for every linear code $N$, every pattern which is correctable from erasures in $N$ is correctable from errors in what they call the "degree-three tensoring" of $N$. One can in fact use our Algorithm 1 almost verbatim to obtain an efficient version of this statement. However, here we remark that this is nothing but a special case of Theorem 19 with an appropriate setting of the codes $E$, $C$, $N$. We begin by briefly describing their definitions and their argument.

The basic tool used by [ASW15] is embedding any parity check matrix in the matrix $E(m, 1)$ for an appropriate choice of $m$. Let $N$ be any linear code of dimension $k$ over $\mathbb{F}_2$ and let $H$ be its parity check matrix. For convenience, we first extend $N$ by adding a parity bit. This increases the block length by 1, does not decrease the distance, and preserves the dimension. A parity check matrix for the extended code can be obtained from $H$ by constructing the matrix
$$H_1 = \begin{pmatrix} 1 & \cdots & 1 & 1 \\ & & & 0 \\ & H & & \vdots \\ & & & 0 \end{pmatrix},$$
that is, by appending a zero column to $H$ and an all-ones row on top. The main observation now is that $E(m, 1)$ is an $(m+1) \times 2^m$ matrix that contains all vectors of the form $\binom{1}{v}$ for $v \in \mathbb{F}_2^m$ as columns, so if we set $m = n - k$ to be the number of rows of $H$, we can pick a subset $S$ of the columns of $E(m, 1)$ that correspond to the columns that appear in $H_1$.

[ASW15] then define the degree-three tensoring of $N$, which is a code $C$ whose parity check matrix is $H_1^{\otimes 3}$: this is a $\binom{m}{\le 3} \times n$ matrix with rows indexed by tuples $i_1 < i_2 < i_3$, with the corresponding row being the pointwise product (as in Definition 18) of rows $i_1$, $i_2$, $i_3$ of $H_1$.
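As a concrete illustration of this construction (a sketch of ours; the helper names are not from the paper), the following Python code builds $H_1$ from $H$ and lists the rows of the degree-three tensoring as pointwise products of triples of rows of $H_1$. Since $H_1$ contains the all-ones row, products of one or two rows of $H$ arise as special cases of these triples.

```python
from itertools import combinations

def extend_with_parity(H):
    # H_1: append a zero column to H (the new parity coordinate)
    # and an all-ones row on top
    n1 = len(H[0]) + 1
    return [[1] * n1] + [row + [0] for row in H]

def degree_three_tensoring(H1):
    # one row per triple i1 < i2 < i3 of rows of H1:
    # their pointwise (coordinate-wise) product
    return [[a & b & c for a, b, c in zip(H1[i1], H1[i2], H1[i3])]
            for i1, i2, i3 in combinations(range(len(H1)), 3)]
```

For example, for $H$ with two rows, the only triple consists of the all-ones row and the two (extended) rows of $H$, so the tensoring has a single row: the pointwise product of the two rows of $H$, padded by a zero.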
One can then verify that Algorithm 1 can be used in order to correct (in $C$) any error pattern which is correctable from erasures in $N$, by using the algorithm with $r = 1$, restricted to the coordinates in $S$.

A closer look reveals that this construction is in fact a special case of Theorem 19. Given any linear binary code $N$ with parity check matrix $H$, the main observation of [ASW15] can be interpreted as saying that when we add a parity bit to $N$, we can embed $N$ in a puncturing of $\mathrm{RM}(m, m-2)$ (whose parity check matrix is $E(m, 1)$). We state it in the following claim:

Claim 23.
Let $N'$ denote the subcode of $\mathrm{RM}(m, m-2)$ of all words that are $0$ outside $S$. Then $N$ is precisely the restriction of $N'$ to the coordinates in $S$.

Proof. Let $b \in N$. Then $H_1 b = 0$, i.e., the columns of $H_1$ indexed by the non-zero coordinates of $b$ add up to $0$. Let $b' \in \mathbb{F}_2^{2^m}$ denote the extension of $b$ into a vector of length $2^m$, obtained by filling $0$'s in every coordinate not in $S$. Then $E(m, 1) b' = 0$, since the same columns that appeared in $H_1$ appear in $E(m, 1)$. This implies that $b' \in N'$. Similarly, for every $b' \in N'$, we can define $b$ to be its restriction to $S$, and then $H_1 b = 0$, i.e., $b \in N$.

The degree-three tensoring of $N$, which we denote by $C$, can then be similarly embedded in a puncturing of $\mathrm{RM}(m, m-4)$, where again only the coordinates in $S$ remain, and similarly $C$ can be seen to be the restriction to $S$ of the subcode $C'$ of $\mathrm{RM}(m, m-4)$ that contains the words that are $0$ outside $S$. Finally, we define the error-locating code $E$ to be the restriction of $\mathrm{RM}(m, 2)$ to the coordinates of $S$.

We now show that the conditions of Theorem 19 are satisfied in this case. We begin with item 2. If $U$ is a correctable pattern in $N$, it means that the columns indexed by $U$ in $H_1$ are linearly independent. It follows that they are also linearly independent as columns of $E(m, 1)$. Hence, using the same arguments as before, we can find, for any coordinate $v \notin U$, a degree $2$ polynomial $g$ such that $g(v) = 1$ and $g$ restricted to $U$ is $0$. Restricting the evaluations of $g$ to the subset of coordinates $S$, we get a codeword $e \in E$ with the required property.

As for item 1: we first argue that $\mathrm{RM}(m, 2) * C' \subseteq N'$, since the degrees match and the property of vanishing outside $S$ is preserved under multiplication. Projecting back to the coordinates in $S$, we get that $E * C \subseteq N$.

Acknowledgement
We would like to thank Avi Wigderson, Emmanuel Abbe and Ilya Dumer for helpful discussions and for commenting on an earlier version of the paper. We thank Venkatesan Guruswami and the anonymous reviewers for pointing out the abstraction of Algorithm 1 given in Section 3.
References

[ALM+98] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof Verification and the Hardness of Approximation Problems. J. ACM, 45(3):501–555, 1998.

[Arı08] Erdal Arıkan. A Performance Comparison of Polar Codes and Reed-Muller Codes. IEEE Communications Letters, 12(6):447–449, 2008.

[AS92] Noga Alon and Joel H. Spencer. The Probabilistic Method. John Wiley, 1992.

[ASW15] Emmanuel Abbe, Amir Shpilka, and Avi Wigderson. Reed-Muller Codes for Random Erasures and Errors. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC 2015), pages 297–306, 2015. Pre-print available at arXiv:1411.4590.

[BF90] Donald Beaver and Joan Feigenbaum. Hiding instances in multioracle queries. In Proceedings of the 7th Symposium on Theoretical Aspects of Computer Science (STACS 1990), pages 37–48, 1990.

[BFL91] László Babai, Lance Fortnow, and Carsten Lund. Non-Deterministic Exponential Time has Two-Prover Interactive Protocols. Computational Complexity, 1:3–40, 1991. Preliminary version in the .

[BL15] Abhishek Bhowmick and Shachar Lovett. The List Decoding Radius of Reed-Muller Codes over Small Fields. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC 2015), pages 277–285, 2015. Pre-print available at eccc:TR14-087.

[BV10] Andrej Bogdanov and Emanuele Viola. Pseudorandom Bits for Polynomials. SIAM J. Comput., 39(6):2464–2486, 2010. Preliminary version in the . Pre-print available at eccc:TR14-081.

[CF07] Daniel J. Costello, Jr. and G. David Forney, Jr. Channel coding: The road to channel capacity. Proceedings of the IEEE, 95(6):1150–1177, 2007.

[DK94] Iwan M. Duursma and Ralf Kötter. Error-locating pairs for cyclic codes. IEEE Transactions on Information Theory, 40(4):1108–1121, 1994.

[DS06] Ilya Dumer and Kirill Shabunov. Recursive error correction for general Reed-Muller codes. Discrete Applied Mathematics, 154(2):253–269, 2006.

[Dum04] Ilya Dumer. Recursive decoding and its performance for low-rate Reed-Muller codes. IEEE Transactions on Information Theory, 50(5):811–823, 2004.

[Dum06] Ilya Dumer. Soft-decision decoding of Reed-Muller codes: a simplified algorithm. IEEE Transactions on Information Theory, 52(3):954–963, 2006.

[GKZ08] Parikshit Gopalan, Adam R. Klivans, and David Zuckerman. List-decoding Reed-Muller codes over small fields. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC 2008), pages 265–274, 2008.

[GL89] Oded Goldreich and Leonid A. Levin. A Hard-Core Predicate for all One-Way Functions. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC 1989), pages 25–32, 1989.

[GRS14] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan. Essential Coding Theory. 2014. Available at .

[GS99] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed-Solomon and algebraic-geometry codes. IEEE Transactions on Information Theory, 45(6):1757–1767, 1999.

[Ham50] R. W. Hamming. Error Detecting and Error Correcting Codes. Bell System Technical Journal, 29(2):147–160, 1950.

[HKL05] Tor Helleseth, Torleiv Kløve, and Vladimir I. Levenshtein. Error-correction capability of binary linear codes. IEEE Transactions on Information Theory, 51(4):1408–1423, 2005.

[KMŞU15] Shrinivas Kudekar, Marco Mondelli, Eren Şaşoğlu, and Rüdiger L. Urbanke. Reed-Muller Codes Achieve Capacity on the Binary Erasure Channel under MAP Decoding. CoRR, abs/1505.05831, 2015.

[KP15] Santhosh Kumar and Henry D. Pfister. Reed-Muller Codes Achieve Capacity on Erasure Channels. CoRR, abs/1505.05123, 2015.

[Kri70] R. E. Krichevskiy. On the number of Reed-Muller code correctable errors. Dokl. Sov. Acad. Sci., 191:541–547, 1970.

[MHU14] Marco Mondelli, Seyed Hamed Hassani, and Rüdiger L. Urbanke. From Polar to Reed-Muller Codes: A Technique to Improve the Finite-Length Performance. IEEE Transactions on Communications, 62(9):3084–3091, 2014.

[MS77] F. J. MacWilliams and N. J. A. Sloane. The theory of error correcting codes. Number v. 2 in North-Holland mathematical library. North-Holland Publishing Company, 1977.

[Mul54] D. E. Muller. Application of Boolean algebra to switching circuit design and to error detection. Electronic Computers, Transactions of the I.R.E. Professional Group on, EC-3(3):6–12, Sept 1954.

[MV97] Meena Mahajan and V. Vinay. Determinant: Combinatorics, Algorithms, and Complexity. Chicago J. Theor. Comput. Sci., 1997.

[Pel92] Ruud Pellikaan. On decoding by error location and dependent sets of error positions. Discrete Mathematics, 106-107:369–381, 1992.

[Ree54] Irving S. Reed. A class of multiple-error-correcting codes and the decoding scheme. Trans. of the IRE Professional Group on Information Theory (TIT), 4:38–49, 1954.

[Sha48] C. E. Shannon. A Mathematical Theory of Communication. The Bell System Technical Journal, 27:379–423, 623–656, 1948.

[Sha79] Adi Shamir. How to share a secret. Communications of the ACM, 22(11):612–613, 1979.

[Sha92] Adi Shamir. IP = PSPACE. J. ACM, 39(4):869–877, 1992.

[SP92] V. M. Sidel'nikov and A. S. Pershakov. Decoding of Reed-Muller Codes with a Large Number of Errors. Problems Inform. Transmission, 28(3):80–94, 1992.

[Sud97] Madhu Sudan. Decoding of Reed Solomon Codes beyond the Error-Correction Bound. J. Complexity, 13(1):180–193, 1997.

[Sud01] Madhu Sudan. Algorithmic Introduction to Coding Theory, 2001. Lecture Notes, available at http://people.csail.mit.edu/madhu/FT02/scribe/lect11.pdf.