Relaxed Locally Correctable Codes with Improved Parameters
Vahid R. Asadi [email protected]
Simon Fraser University
Igor Shinkar [email protected]
Simon Fraser University
Abstract
Locally decodable codes (LDCs) are error-correcting codes C : Σ^k → Σ^n that admit a local decoding algorithm that recovers each individual bit of the message by querying only a few bits from a noisy codeword. An important question in this line of research is to understand the optimal trade-off between the query complexity of LDCs and their block length. Despite the importance of these objects, the best known constructions of constant query LDCs have super-polynomial length, and there is a significant gap between the best constructions and the known lower bounds in terms of the block length.

For many applications it suffices to consider the weaker notion of relaxed LDCs (RLDCs), which allows the local decoding algorithm to abort if by querying a few bits it detects that the input is not a codeword. This relaxation turned out to allow decoding algorithms with constant query complexity for codes with almost linear length. Specifically, [BGH+06] constructed an O(q)-query RLDC that encodes a message of length k using a codeword of block length n = O(k^{1+1/√q}).

In this work we improve the parameters of [BGH+06] by constructing an O(q)-query RLDC that encodes a message of length k using a codeword of block length O(k^{1+1/q}). This construction matches (up to a multiplicative constant factor) the lower bounds of [KT00, Woo07] for constant query LDCs, thus making progress toward understanding the gap between LDCs and RLDCs in the constant query regime.

In fact, our construction extends to the stronger notion of relaxed locally correctable codes (RLCCs), introduced in [GRR18], where given a noisy codeword the correcting algorithm either recovers each individual bit of the codeword by only reading a small part of the input, or aborts if the input is detected to be corrupt.
Keywords: algorithmic coding theory; consistency test using random walk; Reed-Muller code; relaxed locally decodable codes; relaxed locally correctable codes

1 Introduction
Locally decodable codes (LDCs) are error-correcting codes that admit a decoding algorithm that recovers each specific symbol of the message by reading a small number of locations in a possibly corrupted codeword. More precisely, a locally decodable code C : F^k → F^n with local decoding radius τ ∈ [0, 1) is an error-correcting code that admits a local decoding algorithm D_C that, given an index i ∈ [k] and a corrupted word w ∈ F^n which is τ-close to an encoding C(M) of some message M, reads a small number of symbols from w and outputs M_i with high probability. Similarly, we have the notion of locally correctable codes (LCCs), which are error-correcting codes that not only admit a local algorithm that decodes each symbol of the message, but are also required to correct an arbitrary symbol of the entire codeword. Locally decodable and locally correctable codes have many applications in different areas of theoretical computer science, such as complexity theory, coding theory, property testing, cryptography, and the construction of probabilistically checkable proof systems. For details, see the surveys [Yek12, KS17] and the references within.

Despite the importance of LDCs and LCCs, and the extensive amount of research studying these objects, the best known construction of constant query LDCs has super-polynomial length n = exp(exp(log^{Ω(1)}(k))), which is achieved by the highly non-trivial constructions of [Yek08] and [Efr12]. For constant query LCCs, the best known constructions are of exponential length, which can be achieved by some parameterization of Reed-Muller codes. It is important to note that there is a huge gap between the best known lower bounds on the length of constant query LDCs and the length of the best known constructions. Currently, the best known lower bound on the length of LDCs says that for q ≥ 3 it must be at least k^{1+Ω(1/q)}, where q stands for the query complexity of the local decoder. See [KT00, Woo07] for the best general lower bounds for constant query LDCs.

Motivated by applications to probabilistically checkable proofs (PCPs), Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan introduced in [BGH+
06] the notion of relaxed locally decodable codes (RLDCs). Informally speaking, a relaxed locally decodable code is an error-correcting code which allows the local decoding algorithm to abort if the input codeword is corrupt, but does not allow it to err with high probability. In particular, the decoding algorithm should always output the correct symbol if the given word is not corrupted. Formally, a code C : F^k → F^n is an RLDC with decoding radius τ ∈ [0, 1) if it admits a relaxed local decoding algorithm D_C which, given an index i ∈ [k] and a possibly corrupted codeword w ∈ F^n, makes a small number of queries to w and satisfies the following properties.

Completeness: If w = C(M) for some M ∈ F^k, then D_C^w(i) should output M_i.

Relaxed decoding: If w is τ-close to some codeword C(M), then D_C^w(i) should output either M_i or a special abort symbol ⊥ with probability at least 2/3.

This relaxation turns out to be very helpful in terms of constructing RLDCs with better block length. Indeed, [BGH+06] constructed a q-query RLDC with block length n = k^{1+O(1/√q)}.

The notion of relaxed LCCs (RLCCs), recently introduced in [GRR18], naturally extends the notion of RLDCs. These are error-correcting codes that admit a correcting algorithm that is required to correct every symbol of the codeword, but is allowed to abort upon noticing that the given word is corrupt. More formally, the local correcting algorithm gets an index i ∈ [n] and a (possibly corrupted) word w ∈ F^n, makes a small number of queries to w, and satisfies the following properties.

Completeness: If w ∈ C, then D_C^w(i) should output w_i.

Relaxed correcting: If w is τ-close to some codeword c* ∈ C, then D_C^w(i) should output either c*_i or a special abort symbol ⊥ with probability at least 2/3.

Note that if the code C is systematic, i.e., the encoding of any message M ∈ F^k contains M in its first k symbols, then the notion of RLCC is stronger than that of RLDC.

Recently, building on the ideas from [GRR18], [CGS20] constructed RLCCs whose block length matches the RLDC construction of [BGH+06], i.e., n = k^{1+O(1/√q)}.

Given the gap between the best constructions and the known lower bounds, it is natural to ask the following question:

What is the best possible trade-off between the query complexity and the block length of an RLDC?

In particular, [BGH+
06] asked whether it is possible to obtain a q-query RLDC whose block length is strictly smaller than the best known lower bound on the length of LDCs. A positive answer to their question would show a separation between the two notions, thus proving that the relaxation is strict. See the paragraph Open Problems at the end of Section 4.2 of [BGH+06].

In this work we make progress on this question by constructing a relaxed locally decodable code C : F^K → F^N with query complexity O(q) and block length N = K^{1+O(1/q)}. In fact, our construction gives the stronger notion of a relaxed locally correctable code.

Theorem 1 (Main Theorem). For every q ∈ N there exists an O(q)-query relaxed locally correctable code C : {0,1}^K → {0,1}^N with constant relative distance and constant decoding radius, such that the block length of C is N = q^{O(q)} · K^{1+O(1/q)}.

Therefore, our construction improves the parameters of the O(q)-query RLDC construction of [BGH+06], whose block length is N = K^{1+O(1/√q)}, and matches (up to a multiplicative factor in q) the lower bound of Ω(K^{1+1/(⌈q/2⌉−1)}) on the block length of q-query LDCs [KT00, Woo07].

Remark 1.1. In this paper we prove Theorem 1 for a code C : F^K → F^N over a large alphabet. Specifically, we show a code C : F^K → F^N satisfying Theorem 1, for a finite field F satisfying |F| ≥ c_q · K^{1/q} for some c_q ∈ N that depends only on q.

Using the techniques from [CGS20] it is not difficult to obtain an RLCC over the binary alphabet with almost the same block length. Indeed, this can be done by concatenating our code over the large alphabet with an arbitrary binary code with constant rate and constant relative distance. See Section 7 for details.

RLDC and RLCC constructions:
Relaxed locally decodable codes were first introduced by [BGH+06], who constructed a q-query RLDC with block length N = K^{1+O(1/√q)}. Since that work, there have been no constructions with better block length in the constant query complexity regime. Recently, [GRR18] introduced the related notion of relaxed locally correctable codes (RLCCs), and constructed q-query RLCCs with block length N = poly(K). Then, [CGS20] constructed relaxed locally correctable codes with block length matching that of [BGH+06] (up to a multiplicative constant factor q). The construction of [CGS20] had two main components, which we also use in the current work.

Consistency test using random walk (CTRW): Informally, given a word w and a coordinate i we wish to correct, the CTRW samples a sequence of constraints C_1, C_2, ..., C_t on w, such that the domains of C_i and C_{i+1} intersect, with the guarantee that if w is close to some codeword c* ∈ C but w_i ≠ c*_i, then with high probability w will be far from satisfying at least one of the constraints. In other words, the CTRW performs a random walk on the constraints graph and checks whether w is consistent with c* in the i'th coordinate. We introduce this notion in detail in Section 2.1, and prove that the Reed-Muller code admits a CTRW in Section 4.
Correctable canonical PCPPs (ccPCPP): These are PCPP systems for some specified language L satisfying the following properties: (i) for each w ∈ L there is a unique proof π(w) that satisfies the verifier with probability 1, (ii) the verifier accepts with high probability only pairs (x, π) that are close to some (w, π(w)) for some w ∈ L, i.e., only the pairs where x is close to some w ∈ L and π is close to π(w), and (iii) the set {w ◦ π_w : w ∈ L} is an RLCC. Canonical proofs of proximity have been studied in [DGG18, Par20]. We elaborate on these constructions in Section 5.

Lower bounds:
For lower bounds, the only bound we are aware of is that of [GL20], who proved that any q-query relaxed locally decodable code must have block length N ≥ K^{1+Ω(1/q²)}.

For the strict notion of locally decodable codes, it is known by [KT00, Woo07] that for q ≥ 3 any q-query LDC must have block length N ≥ Ω(K^{1+1/(⌈q/2⌉−1)}). For q = 3 a bound of N ≥ Ω(K²/log(K)) is known, and furthermore, for 3-query linear LDCs the block length must be N ≥ Ω(K²/log log(K)) [Woo07]. For q = 2, [KdW03] proved an exponential lower bound of N ≥ exp(Ω(K)). See also [DJK+02, GKST02, Oba02, WdW05, Woo10] for more related work on lower bounds for LDCs.
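To make the completeness and relaxed-decoding requirements described above concrete, the following is a toy runnable sketch — a hypothetical repetition-based code, not the construction of this paper or of any cited work — whose local decoder outputs the queried symbol when all of its sampled copies agree, and aborts otherwise.

```python
import random

def encode(msg: str, r: int = 9) -> str:
    """Toy code: concatenate r copies of the message.
    (Terrible rate; used only to illustrate the RLDC interface.)"""
    return msg * r

def relaxed_decode(w: str, i: int, k: int, r: int = 9, queries: int = 3):
    """Query a few random copies of symbol i; return the common value,
    or None (playing the role of the abort symbol) on any inconsistency."""
    copies = random.sample(range(r), queries)
    seen = {w[c * k + i] for c in copies}
    return seen.pop() if len(seen) == 1 else None

# Completeness: on an uncorrupted codeword the decoder never aborts.
w = encode("abc")
assert all(relaxed_decode(w, i, k=3) == "abc"[i] for i in range(3))

# A word whose copies of symbol 0 all disagree forces an abort.
assert relaxed_decode("abcdefghi", 0, k=1) is None
```

On a τ-corrupted word the actual definition only requires outputting M_i or ⊥ with probability at least 2/3; this toy decoder can err when all sampled copies are corrupted consistently, which is exactly the event the formal definitions bound.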
The rest of the paper is organized as follows. In Section 2 we informally discuss the construction and the correcting algorithm. In this discussion we focus on decoding the symbols corresponding to the message, i.e., on showing that the code is an RLDC. Section 3 introduces the formal definitions and notation we will use in the proof of Theorem 1. We present the notion of consistency test using random walk in Section 4, and prove that the Reed-Muller code admits such a test. In Section 5 we present the PCPPs we will use in our construction, and state the properties needed for the correcting algorithm. In Section 6 we prove Theorem 1 by proving a composition theorem, which combines the instantiation of the Reed-Muller code with the PCPPs from the previous sections.
In this section we informally describe our code construction. Roughly speaking, our construction consists of two parts:
The Reed-Muller encoding: Given a message M ∈ F^K, its Reed-Muller encoding is the evaluation of an m-variate polynomial of degree at most d over F, whose coefficients are determined by the message we wish to encode.

Proofs of proximity: The second part of the encoding consists of the concatenation of PCPPs, each claiming that a certain restriction of the first part agrees with some Reed-Muller codeword.

Specifically, given a message M ∈ F^K, we first encode it using the Reed-Muller encoding RM_F(m, d), where m roughly corresponds to the query complexity of our RLDC, and the field is large enough so that the distance of the Reed-Muller code, which is equal to 1 − d/|F|, is some constant, say 1/2. That is, the first part of the encoding corresponds to an evaluation of some polynomial f : F^m → F of degree at most d. The second part of the encoding consists of a sequence of PCPPs claiming that the restrictions of the Reed-Muller part to some carefully chosen planes in F^m are evaluations of some low-degree polynomial. The planes we choose are of the form P_{~a,~h,~h'} = {~a + t·~h + s·~h' : t, s ∈ F}, where ~a ∈ F^m and ~h, ~h' ∈ H^m for some subfield H of F. We will call such planes H-planes. In order to obtain an RLDC with the desired parameters, we choose the field H so that F is the extension of H of degree [F : H] = m. It will be convenient to think of H as a field and think of F as a vector space over H of dimension m (augmented with the multiplicative structure on F). Indeed, the saving in the block length of the RLDC we obtain crucially relies on the fact that we ask for PCPPs only for a small collection of planes, and not for all planes in F^m. The actual constraints required to be certified by the PCPPs are slightly more complicated, and we describe them next.

The constraints of the first type correspond to H-planes P and points ~x ∈ P. For each such pair (P, ~x) the code will contain a PCPP certifying that (i) the restriction of the Reed-Muller part to P is close to an evaluation of some polynomial of total degree at most d, and (ii) furthermore, this polynomial agrees with the value of the Reed-Muller part on ~x.
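As a toy illustration of these point constraints (using the small prime field F_5 with m = 2 in place of the large extension field, and hypothetical parameters), the following sketch restricts f to a plane, appends repetitions of f(~x), and checks that corrupting the single value f(~x) already makes the augmented word 1/2-far from the corresponding augmented codeword:

```python
P = 5  # toy prime field F_5; the paper works over a large extension field

def eval_poly(coeffs, x, y):
    """Evaluate a bivariate polynomial given as {(i, j): c} over F_P."""
    return sum(c * pow(x, i, P) * pow(y, j, P) for (i, j), c in coeffs.items()) % P

def plane(a, h1, h2):
    """All points of the plane {a + t*h1 + s*h2 : t, s in F_P} (here m = 2)."""
    return [tuple((a[d] + t * h1[d] + s * h2[d]) % P for d in range(2))
            for t in range(P) for s in range(P)]

def augmented(f, pts, x):
    """f restricted to the plane, followed by |plane| repetitions of f(x)."""
    return [f(p) for p in pts] + [f(x)] * len(pts)

Q = lambda p: eval_poly({(0, 0): 1, (1, 0): 2, (0, 1): 3}, *p)  # degree-1 polynomial
x = (0, 0)
f = lambda p: (Q(p) + 1) % P if p == x else Q(p)                # corrupt only f(x)

pts = plane((0, 0), (1, 0), (0, 1))
w, c = augmented(f, pts, x), augmented(Q, pts, x)
dist = sum(u != v for u, v in zip(w, c)) / len(w)
assert dist > 0.5   # one corrupted point, but the augmented word is 1/2-far
```

The repetitions give the single constraint f(~x) = Q(~x) half of the total weight, which is exactly the effect formalized in the notation below.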
In order to define this formally, we introduce the following notation.

Notation 2.1. Let F be a finite field of size n. Fix f : F^m → F, a plane P in F^m, and a point ~x ∈ P. Denote f^{(~x)}_{|P} = f_{|P} ◦ (f(~x))^{n²}. That is, the length of f^{(~x)}_{|P} is 2·n², and it consists of f_{|P} concatenated with n² repetitions of f(~x).

Given the notation above, if f is the first part of the codeword, corresponding to the Reed-Muller encoding of the message, then the PCPP for the pair (P, ~x) is expected to be the proof of proximity claiming that f^{(~x)}_{|P} is close to the language

RM^{(~x)}_{|P} = { Q ◦ (Q(~x))^{n²} : Q is the evaluation of a degree-d polynomial on P } ⊆ F^{2n²}.    (1)

Note that by repeating the symbol Q(~x) for n² times, the definition indeed puts weight 1/2 on the constraint that the input f_{|P} is close to some low-degree polynomial Q, and weight 1/2 on the constraint f(~x) = Q(~x). In particular, if f_{|P} is δ-close to some bivariate low-degree polynomial Q for some small δ > 0, but f(~x) ≠ Q(~x), then f^{(~x)}_{|P} is at least (1 − d/|F| − δ)/2-far from RM^{(~x)}_{|P}.

The constraints of the second type correspond to H-planes P and lines ℓ ⊆ P. For each such pair (P, ℓ) the code will contain a PCPP certifying that (i) the restriction of the Reed-Muller part to P is close to an evaluation of some polynomial of total degree at most d, and (ii) furthermore, this polynomial is close to f_{|ℓ}. (In particular, this implies that f_{|ℓ} is close to some low-degree polynomial.)

Next, we introduce the notation analogous to Notation 2.1, replacing points with lines. Notation 2.2.
Let F be a finite field of size n. Fix f : F^m → F, a plane P in F^m, and a line ℓ ⊆ P. Denote f^{(ℓ)}_{|P} = f_{|P} ◦ (f_{|ℓ})^n. That is, the length of f^{(ℓ)}_{|P} is 2·n², and it consists of f_{|P} concatenated with n repetitions of f_{|ℓ}.

If f is the Reed-Muller part of the codeword, corresponding to the Reed-Muller encoding of the message, then the PCPP for the pair (P, ℓ) is expected to be the proof of proximity claiming that f^{(ℓ)}_{|P} is close to the language

RM^{(ℓ)}_{|P} = { Q ◦ (Q_{|ℓ})^n : Q is the evaluation of some degree-d polynomial on P } ⊆ F^{2n²}.    (2)

Again, similarly to the first part, repeating the evaluation of Q_{|ℓ} for n times puts weight 1/2 on the constraint that the input f_{|P} is close to some low-degree polynomial Q, and weight 1/2 on the constraint that f_{|ℓ} is close to Q_{|ℓ}.

With the proofs specified above, we now sketch the local correcting algorithm for the code. Below we only focus on correcting symbols from the Reed-Muller part. Correcting the symbols from the PCPP part follows a rather straightforward adaptation of the techniques from [CGS20], and we omit it from the overview.

Given a word w ∈ F^N and an index i ∈ [N] of w corresponding to the Reed-Muller part of the codeword, let f : F^m → F be the Reed-Muller part of w, and let ~x ∈ F^m be the input to f corresponding to the index i. The local decoder works in two steps.

Consistency test using random walk:
In the first step the correcting algorithm invokes a procedure we call consistency test using a random walk (CTRW) for the Reed-Muller code. This step creates a sequence of H-planes of length (m + 1), where each plane defines a constraint checking that the restriction of w to the plane is low-degree. Hence, we get m + 1 constraints, each depending on n² symbols.

Composition using proofs of proximity: Then, instead of reading the entire plane for each constraint, we use the PCPPs from the second part of the codeword to reduce the arity of each constraint to O(1), thus reducing the total query complexity of the correcting algorithm to q = O(m). That is, for each constraint we invoke the corresponding PCPP verifier to check that the restriction of f to each of these planes is (close to) a low-degree polynomial. If at least one of the verifiers rejects, then the word f must be corrupt, and hence the correcting algorithm returns ⊥. Otherwise, if all the PCPP verifiers accept, the correcting algorithm returns f(~x).

In particular, if f is a correct Reed-Muller encoding, then the algorithm will always return f(~x), and the main part of the analysis is to show that if f is close to some Q* ∈ RM_F(m, d), but f(~x) ≠ Q*(~x), then the correcting algorithm catches an inconsistency and returns ⊥ with some constant probability. See Section 6.3 for details.

The key step in the analysis says that if f is close to some codeword Q* ∈ RM but f(~x) ≠ Q*(~x), then with high probability f will be far from a low-degree polynomial on at least one of these planes, where "far" corresponds to the notion of distances defined by the languages RM^{(~x)}_{|P} and RM^{(ℓ)}_{|P}. In particular, if on one of the planes f is far from the corresponding language, then the PCPP verifier will catch this with constant probability, thus causing the correcting algorithm to return ⊥. We discuss this part in detail below.

It is important to emphasize that the main focus of this work is constructing a correcting algorithm for the Reed-Muller part. Using the techniques developed in [CGS20], it is rather straightforward to design the algorithm for correcting symbols from the PCPP part of the code. See Section 6.4 for details.

2.1 CTRW on Reed-Muller codes
Below we define the notion of consistency test using random walk (CTRW) for the Reed-Muller code. This notion is a slight modification of the notion originally defined in [CGS20] for general codes; in this paper we define it only for the Reed-Muller code. Given a word f : F^m → F and some ~x ∈ F^m, the goal of the test is to make sure that f(~x) is consistent with the codeword of the Reed-Muller code closest to f. [CGS20] describe a CTRW for the tensor power C^{⊗m} of an arbitrary code C with good distance (e.g., Reed-Solomon). The CTRW they describe works by starting from the point we wish to correct and choosing an axis-parallel line ℓ_1 containing the starting point. The test continues by choosing a sequence of random axis-parallel lines ℓ_2, ..., ℓ_t, such that each ℓ_i intersects the previous one, ℓ_{i−1}, until reaching a uniformly random coordinate of the tensor code. That is, the length of the sequence t denotes the mixing time of the corresponding random walk. The predicates are defined in the natural way; namely, the test expects to see a codeword of C on each line it reads.

In this work, we present a CTRW for the Reed-Muller code, which is a variant of the CTRW described above. The main differences compared to the description above are that (i) the test chooses a sequence of planes P_0, P_1, ..., P_t (and not lines), and (ii) every two consecutive planes intersect on a line (and not at a point). Roughly speaking, the algorithm works as follows.

1. Given a point ~x ∈ F^m, the test picks a uniformly random H-plane P_0 containing ~x.
2. Given P_0, the test chooses a random line ℓ_1 ⊆ P_0, and then chooses another random H-plane P_1 ⊆ F^m containing ℓ_1.
3. Given P_1, the test chooses a random line ℓ_2 ⊆ P_1, and then chooses another random H-plane P_2 ⊆ F^m containing ℓ_2.
4. The algorithm continues for some predefined number of iterations, choosing P_0, P_1, P_2, ..., P_t. Roughly speaking, the number of iterations is equal to the mixing time of the corresponding Markov chain. More specifically, the process continues until a uniformly random point in P_t is close to a uniform point in F^m.
5. The constraints defined for each P_i are the natural constraints, namely checking that the restriction of f to P_i is a polynomial of degree at most d.

One of the important parameters directly affecting the query complexity of our construction is the mixing time of the random walk. Indeed, as explained above, the query complexity of our RLDC is proportional to the mixing time of the random walk. We prove that if [F : H] = m, then the mixing time is upper bounded by m. In order to prove this we use the following claim, saying that if F is the field extension of H of degree m, and ~h_1, ..., ~h_m ∈ H^m and t_1, ..., t_m ∈ F are sampled uniformly and independently of each other, then Σ_{i=1}^m t_i · ~h_i is close to a uniformly random point in F^m.
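The plane walk in steps 1-5 above can be sketched in code. The following toy version (a simplification under hypothetical parameters: it works over F_5^3 and samples arbitrary planes over the whole field, rather than H-planes in a field extension) generates the sequence of planes and checks the structural invariant that each chosen line lies in both adjacent planes:

```python
import random

P, M = 5, 3  # toy parameters: planes in F_5^3 (the paper uses H-planes in an extension field)

def rand_nonzero():
    v = (0,) * M
    while not any(v):
        v = tuple(random.randrange(P) for _ in range(M))
    return v

def plane_pts(a, h1, h2):
    return {tuple((a[d] + t * h1[d] + s * h2[d]) % P for d in range(M))
            for t in range(P) for s in range(P)}

def line_pts(b, u):
    return {tuple((b[d] + t * u[d]) % P for d in range(M)) for t in range(P)}

def ctrw(x, steps):
    """Plane walk: start with a random plane through x; repeatedly pick a random
    line inside the current plane, then a fresh random plane containing that line."""
    a, h1, h2 = x, rand_nonzero(), rand_nonzero()
    planes, lines = [(a, h1, h2)], []
    for _ in range(steps):
        t, s = random.randrange(P), random.randrange(P)
        b = tuple((a[d] + t * h1[d] + s * h2[d]) % P for d in range(M))  # point on current plane
        u = (0,) * M
        while not any(u):  # random direction inside the current plane's span
            al, be = random.randrange(P), random.randrange(P)
            u = tuple((al * h1[d] + be * h2[d]) % P for d in range(M))
        lines.append((b, u))
        a, h1, h2 = b, u, rand_nonzero()  # the next plane contains the line (b, u)
        planes.append((a, h1, h2))
    return planes, lines

random.seed(0)
planes, lines = ctrw((0, 0, 0), steps=4)
for i, (b, u) in enumerate(lines):  # each line lies in both adjacent planes
    ell = line_pts(b, u)
    assert ell <= plane_pts(*planes[i]) and ell <= plane_pts(*planes[i + 1])
```

The mixing-time question is exactly how many such steps are needed before a random point of the last plane is close to uniform over the whole space.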
See Claim 3.5 for the exact statement.

As explained above, the key step of the analysis is to prove that if f is close to some codeword Q* ∈ RM but f(~x) ≠ Q*(~x), then with high probability at least one of the defined predicates will be violated. Specifically, we prove that with high probability the violation will be in the following strong sense.

Theorem 2.3 (informal, see Theorem 4.3). If f is close to some codeword Q* ∈ RM but f(~x) ≠ Q*(~x), then with high probability

1. either f^{(~x)}_{|P_0} is Ω(1)-far from RM^{(~x)}_{|P_0},
2. or f^{(ℓ_i)}_{|P_i} is Ω(1)-far from RM^{(ℓ_i)}_{|P_i} for some i ∈ [m].

Indeed, this strong notion of violation allows us to use the proofs of proximity in order to reduce the query complexity to O(1) queries for each i ∈ [m]. We discuss proofs of proximity next.

2.2 PCPs of proximity and composition

The second building block we use in this work is the notion of probabilistically checkable proofs of proximity (PCPPs). PCPPs were first introduced in [BGH+06] and [DR04]. Informally speaking, a PCPP verifier for a language L gets oracle access to an input x and a proof π claiming that x is close to some element of L. The verifier queries x and π in some small number of (random) locations, and decides whether to accept or reject. The completeness and soundness properties of a PCPP are as follows.

Completeness: If x ∈ L, then there exists a proof causing the verifier to accept with probability 1.

Soundness: If x is far from L, then no proof can make the verifier accept with probability more than 1/2.

In fact, we will use the slightly stronger notion of canonical PCPP (cPCPP) systems. These are PCPP systems satisfying the following completeness and soundness properties. For completeness, we demand that for each w in the language there is a unique canonical proof π(w) that causes the verifier to accept with probability 1. For soundness, the demand is that the only pairs (x, π) accepted by the verifier with high probability are those where x is close to some w ∈ L and π is close to π(w). Such proof systems have been studied in [DGG18, Par20], who proved that such proof systems exist for every language in P.

Furthermore, for our purposes we will demand the stronger notion of correctable canonical PCPP systems (ccPCPP). These are canonical PCPP systems where the set {w ◦ π*(w) : w ∈ L} is a q-query RLCC for some parameter q, with π*(w) denoting the canonical proof for w ∈ L. It was shown in [CGS20] how to construct a ccPCPP by combining a cPCPP system with any systematic RLCC. Informally speaking, for every w ∈ L and its canonical proof π(w), we define π*(w) by encoding w ◦ π(w) using a systematic RLCC. The verifier for the new proof system is defined in a straightforward manner. See [CGS20] for details.

The PCPPs we use throughout this work are proofs of two types, certifying that

1. f^{(~x)}_{|P} is close to RM^{(~x)}_{|P} for some plane P and some ~x ∈ P, and
2.
f^{(ℓ)}_{|P} is close to RM^{(ℓ)}_{|P} for some plane P and some line ℓ ⊆ P.

Indeed, it is easy to see that the first type of proof checks that (i) the restriction of f to P is close to an evaluation of some polynomial Q* of total degree at most d, and (ii) f(~x) = Q*(~x). Similarly, the second type of proof certifies that (i) the restriction of f to P is close to an evaluation of some polynomial Q* of total degree at most d, and (ii) f_{|ℓ} is close to Q*_{|ℓ}.

These notions of distance go together well with the guarantees we have for the CTRW in Theorem 2.3. This allows us to compose the CTRW with the PCPPs to obtain a correcting algorithm with query complexity q = O(m). Informally speaking, the composition theorem works as follows. We first run the CTRW to obtain a collection of m + 1 constraints on the planes P_0, P_1, ..., P_m. By Theorem 2.3, we have the guarantee that with high probability either f^{(~x)}_{|P_0} is Ω(1)-far from RM^{(~x)}_{|P_0}, or f^{(ℓ_i)}_{|P_i} is Ω(1)-far from RM^{(ℓ_i)}_{|P_i} for some i ∈ [m]. Then, instead of actually reading the values of f on all these planes, we run the PCPP verifier on f^{(~x)}_{|P_0} to check that it is close to RM^{(~x)}_{|P_0}, and run the PCPP verifier on each of the f^{(ℓ_i)}_{|P_i} to check that they are close to RM^{(ℓ_i)}_{|P_i}. Each execution of the PCPP verifier makes O(1) queries to f and to the proof, and thus the total query complexity will indeed be O(m). As for soundness, if f^{(~x)}_{|P_0} is Ω(1)-far from RM^{(~x)}_{|P_0}, or f^{(ℓ_i)}_{|P_i} is Ω(1)-far from RM^{(ℓ_i)}_{|P_i} for some i ∈ [m], then the corresponding verifier will notice an inconsistency with constant probability, causing the decoder to output ⊥.

We discuss proofs of proximity in Section 5. The composition is discussed in Section 6.

3 Preliminaries
We begin with standard notation. The relative distance between two strings x, y ∈ Σ^n is defined as dist(x, y) := |{i ∈ [n] : x_i ≠ y_i}| / n. If dist(x, y) ≤ ε, we say that x is ε-close to y; otherwise we say that x is ε-far from y. For a non-empty set S ⊆ Σ^n define the distance of x from S as dist(x, S) := min_{y ∈ S} dist(x, y). If dist(x, S) ≤ ε, we say that x is ε-close to S; otherwise we say that x is ε-far from S.

We will also need a more general notion of distance, allowing different coordinates to have different weights. In particular, we will need the distance that gives constant weight to a particular subset of the coordinates, and spreads the rest of the weight uniformly among all coordinates.

Definition 3.1. Fix n ∈ N and an alphabet Σ. For a set A ⊆ [n] define the distance dist_A between two strings x, y ∈ Σ^n as

dist_A(x, y) = (1/2) · |{i ∈ A : x_i ≠ y_i}| / |A| + (1/2) · |{i ∈ [n] : x_i ≠ y_i}| / n.

In particular, if x differs from y on δ|A| coordinates in A, then dist_A(x, y) is at least δ/2 + δ|A|/(2n). We define dist_A between a string x ∈ Σ^n and a set S ⊆ Σ^n as dist_A(x, S) = min_{y ∈ S} dist_A(x, y).

Remark 3.2. This definition generalizes the definition of dist_k for a coordinate k ∈ [n] from [CGS20]. Indeed, the notion of dist_k for a coordinate k ∈ [n] corresponds to the singleton set A = {k}. When the set A is a singleton A = {k} we will write dist_k(x, y) to denote dist_{{k}}(x, y), and we will write dist_k(x, S) to denote dist_{{k}}(x, S).

Let k < n be positive integers, and let Σ be an alphabet. An error correcting code C : Σ^k → Σ^n is an injective mapping from messages of length k over the alphabet Σ to codewords of length n. The parameter k is called the message length of the code, and n is its block length (which we view as a function of k). The rate of the code is defined as k/n, and the relative distance of the code is defined as min_{M ≠ M' ∈ Σ^k} dist(C(M), C(M')). We sometimes abuse notation and use C to denote the set of all of its codewords, i.e., identify the code with {C(M) : M ∈ Σ^k} ⊆ Σ^n.

Linear codes.
Let F be a finite field. A code C : F^k → F^n is linear if it is an F-linear map from F^k to F^n. In this case the set of codewords C is a subspace of F^n, and the message length of C is also the dimension of this subspace. It is a standard fact that for any linear code C, the relative distance of C is equal to min_{x ∈ C \ {0^n}} dist(x, 0^n).

3.2 Reed-Muller codes

Reed-Muller codes [Mul54] are among the most well-studied error correcting codes, with many theoretical and practical applications in different areas of computer science and information theory. Let F be a finite field of order |F| = n, and let d and m be integers. The code RM_F(m, d) is the linear code whose codewords are the evaluations of polynomials f : F^m → F of total degree at most d over F. We will allow ourselves to write RM(m, d), since the field is fixed throughout the paper. We will also sometimes omit the parameters m and d and simply write RM, when the parameters are clear from the context.

In this paper we consider the setting of parameters where d < |F| = n. It is well known that for d < n the relative distance of RM_F(m, d) is 1 − d/n. The dimension of RM can be computed by counting the number of monomials of total degree at most d. For d < n the number of such monomials is (d+m choose m) ≥ ((d+m)/m)^m > (d/m)^m. Since the length of each codeword is n^m, it follows that the rate of the code is (d+m choose d)/n^m > (d/(mn))^m.

Definition 3.3.
For ~x, ~y ∈ F^m denote by ℓ_{~x,~y} the line ℓ_{~x,~y} = {~x + t·~y : t ∈ F}. Also, for ~x, ~y, ~z ∈ F^m denote by P_{~x,~y,~z} the plane P_{~x,~y,~z} = {~x + t·~y + s·~z : t, s ∈ F}.

An important property of RM(m, d) (and of multivariate low-degree polynomials in general) that we use throughout this work is that their restrictions to lines and planes in F^m are also polynomials of degree at most d. In other words, if f ∈ RM(m, d) and ℓ is a line (P is a plane) in F^m, then the restriction of f to ℓ (or to P) is a codeword of the Reed-Muller code of the same degree and lower dimension.

The following lemma is standard in the PCP literature, saying that random lines sample the space F^m well.

Lemma 3.4. Let F be a finite field. For any subset A ⊆ F^m of density µ = |A|/|F^m|, and for any ε > 0 it holds that

Pr_{~x ∈ F^m, ~y ∈ F^m} [ | |ℓ_{~x,~y} ∩ A| / |ℓ_{~x,~y}| − µ | > ε ] ≤ µ / (|F| · ε²).
For each t ∈ F, let X_t be an indicator random variable for the event ~x + t·~y ∈ A. Since each point ~x + t·~y is a uniform point of F^m, we have E[X_t] = Pr[X_t = 1] = µ. Therefore, denoting X = ∑_{t ∈ F} X_t, it follows that E[|ℓ_{~x,~y} ∩ A|] = E[X] = µ·|F|.

We are interested in bounding the deviation of X = ∑_t X_t from its expectation. We do it by bounding the variance of X. Note first that Var[X_t] = µ − µ² ≤ µ. By the pairwise independence of the points on a line, it follows that Var[X] = ∑_{t ∈ F} Var[X_t] ≤ µ·|F|. Therefore, by applying Chebyshev's inequality we get

Pr [ | |ℓ_{~x,~y} ∩ A| / |ℓ_{~x,~y}| − µ | > ǫ ] = Pr [ |X − µ|F|| > ǫ|F| ] ≤ Var[X] / (ǫ|F|)² ≤ µ / (|F| · ǫ²),

as required.

The following claim will be an important step in our analysis.

Claim 3.5. Let m ∈ N be a parameter, let H be a finite field, and let F be its extension of degree m. Let ~h_1, ..., ~h_m ∈ H^m and t_1, ..., t_m ∈ F be chosen independently and uniformly at random from their domains. Then for any set A ⊆ F^m of size |A| = α·|F^m| it holds that

Pr [ ∑_{i=1}^m t_i · ~h_i ∈ A ] ≤ α + 2/|H|.

Proof.
In order to prove the claim let us introduce some notation. We write each element of F as an m-dimensional row vector over H. Also, we represent an element ~x ∈ F^m as an m × m matrix over H, where the i'th row represents ~x_i ∈ F, the i'th coordinate of ~x. Using this notation, we need to prove that the random matrix corresponding to the sum ∑_{i=1}^m t_i·~h_i is close to a random matrix with entries chosen uniformly from H, independently of each other.

Using the notation above, write each t_i ∈ F as a row vector (t_{i,1}, ..., t_{i,m}) ∈ H^m. Observe that for any vector ~h_i = (~h_{i,1}, ..., ~h_{i,m})^T ∈ H^m we can represent t_i·~h_i ∈ F^m as the outer product

t_i·~h_i = (~h_{i,1}, ~h_{i,2}, ..., ~h_{i,m})^T · (t_{i,1}, t_{i,2}, ..., t_{i,m}),

i.e., the m × m matrix over H whose (j, k) entry is t_{i,k}·~h_{i,j}. Therefore, the sum ∑_{i=1}^m t_i·~h_i is represented as

∑_{i=1}^m (~h_{i,1}, ..., ~h_{i,m})^T · (t_{i,1}, ..., t_{i,m}) = H·T,

where H is the m × m matrix with H_{i,j} = ~h_{j,i}, and T is the m × m matrix with T_{i,j} = t_{i,j}. That is, the sum ∑_{i=1}^m t_i·~h_i is represented as a product of two uniformly random matrices over H.

Next we show that if H, T ∈ H^{m×m} are chosen uniformly at random and independently, then for any collection A of matrices of size |A| = α·|H^{m×m}| it holds that Pr[H·T ∈ A] ≤ α + 2/|H|. Indeed,

Pr[H·T ∈ A] ≤ Pr[H·T ∈ A | H is invertible] + Pr[H is not invertible].

If H is invertible, then for a uniformly random T ∈ H^{m×m} the product H·T is uniform in H^{m×m}, and hence the probability that H·T ∈ A is exactly α. It is also easy to check that Pr[H is not invertible] ≤ ∑_{i=1}^m 1/|H|^i ≤ 2/|H|.

Following the discussion in the introduction, we provide a formal definition of relaxed LCCs, and state some related basic facts and known results.
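Before moving on, the invertibility estimate at the end of the proof of Claim 3.5 is easy to verify exhaustively for a toy case. The sketch below (ours, with the illustrative choice |H| = 3 and m = 2) enumerates all 2 × 2 matrices over F_3:

```python
# Exhaustive check of the estimate from the proof of Claim 3.5: a uniformly
# random m x m matrix over H is singular with probability at most
# sum_{i=1}^{m} |H|^(-i) <= 2/|H|.  Toy parameters |H| = 3, m = 2 are ours.
from itertools import product

q, m = 3, 2

# Count singular 2x2 matrices over F_3 via the determinant ad - bc mod 3.
singular = sum(1 for a, b, c, d in product(range(q), repeat=4)
               if (a * d - b * c) % q == 0)
total = q ** (m * m)

# The bound used in the proof.
bound = sum(q ** (-i) for i in range(1, m + 1))
assert singular / total <= bound <= 2 / q
```

For these parameters the exact singular fraction is 33/81 ≈ 0.407, comfortably below the bound 1/3 + 1/9 ≈ 0.444.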
Definition 3.6 (Relaxed LCC). Let C : Σ^K → Σ^N be an error correcting code with relative distance δ, and let q ∈ N, τ_cor ∈ (0, δ/2), and ǫ ∈ (0, 1) be parameters. Let D be a randomized algorithm that gets oracle access to an input w ∈ Σ^N and explicit access to an index i ∈ [N]. We say that D is a q-query relaxed local correction algorithm for C with correction radius τ_cor and soundness ǫ if for all inputs the algorithm D reads the coordinate i ∈ [N] explicitly, reads at most q (random) coordinates of w, and satisfies the following conditions.

1. For every w ∈ C and every coordinate i ∈ [N] it holds that Pr[D^w(i) = w_i] = 1.
2. For every w ∈ Σ^N that is τ_cor-close to some codeword c* ∈ C and every coordinate i ∈ [N] it holds that Pr[D^w(i) ∈ {c*_i, ⊥}] ≥ ǫ, where ⊥ ∉ Σ is a special abort symbol.

The code C is said to be a (τ_cor, ǫ)-relaxed locally correctable code (RLCC) with query complexity q if it admits a q-query relaxed local correction algorithm with correction radius τ_cor and soundness ǫ.

Observation 3.7.
Note that for systematic codes it is clear from Definition 3.6 that RLCC is a stronger notion than RLDC, as it requires the local correction algorithm to recover not only each symbol of the message, but each symbol of the codeword itself. In particular, any systematic RLCC is also an RLDC with the same parameters.
Finally, we recall the following theorem of Chiesa, Gur, and Shinkar [CGS20].
Theorem 3.8 ([CGS20]). For any finite field F and parameters K, q ∈ N, there exists an explicit construction of a systematic linear code C_CGS : F^K → F^N with block length N = q^{O(√q)} · K^{1+O(1/√q)} and constant relative distance, that is a q-query RLCC with constant correction radius τ_cor = Ω(1) and constant soundness ǫ = Ω(1).

Next we define the notion of probabilistically checkable proofs of proximity, and the variants that we will need in this paper.
Definition 3.9 (PCP of proximity). A q-query PCP of proximity (PCPP) verifier for a language L ⊆ Σ* with soundness ǫ_PCPP with respect to the proximity parameter ρ is a polynomial-time randomized algorithm V that receives oracle access to an input x ∈ Σ^n and a proof π. The verifier makes at most q queries to x ∘ π and has the following properties:

Completeness: For every x ∈ L there exists a proof π such that Pr[V^{x,π} = ACCEPT] = 1.

Soundness: If x is ρ-far from L, then for every proof π it holds that Pr[V^{x,π} = ACCEPT] ≤ ǫ_PCPP.

A canonical PCPP (cPCPP) is a PCPP in which every instance in the language has a canonical accepting proof. Formally, a canonical PCPP is defined as follows.

Definition 3.10 (Canonical PCPP). A q-query canonical PCPP verifier for a language L ⊆ Σ* with soundness ǫ_PCPP with respect to the proximity parameter ρ is a polynomial-time randomized algorithm V that gets oracle access to an input x ∈ Σ^n and a proof π. The verifier makes at most q queries to x ∘ π, and satisfies the following conditions:

Canonical completeness:
For every w ∈ L there exists a unique (canonical) proof π(w) for which Pr[V^{w,π(w)} = ACCEPT] = 1.

Canonical soundness:
For every x ∈ Σ^n and proof π such that

δ(x, π) := min_{w ∈ L} { max( dist(x, w)/n , dist(π, π(w))/len(n) ) } > ρ,   (3)

it holds that Pr[V^{x,π} = ACCEPT] ≤ ǫ_PCPP.

Theorem 3.11 ([DGG18, Par20]). Let ρ > 0 be a proximity parameter. For every language L ∈ P there exist a polynomial len : N → N and a canonical PCPP verifier for L satisfying the following properties.
1. For all x ∈ L of length |x| = n the length of the canonical proof π(x) is |π(x)| = len(n).
2. The query complexity of the PCPP verifier is q = O(1/ρ).
3. The PCPP verifier for L has perfect completeness and soundness ǫ = 1/2 for proximity parameter ρ (with respect to the uniform distance measure).

Next, we define the stronger notion of correctable canonical PCPPs (ccPCPP), originally introduced in [CGS20]. A ccPCPP system is a canonical PCPP system in which, in addition to the verifier being able to locally verify the validity of the given proof, there is also a local correction algorithm that locally corrects potentially corrupted symbols of the canonical proof. Formally, a ccPCPP is defined as follows.
Definition 3.12 (Correctable canonical PCPP). A language L ⊆ Σ* is said to admit a ccPCPP with query complexity q, soundness ǫ_PCPP with respect to the proximity parameter ρ, and correcting soundness ǫ for correcting radius τ_cor, if it satisfies the following conditions.
1. L admits a q-query canonical PCPP verifier satisfying the conditions in Definition 3.10 with soundness ǫ_PCPP with respect to the proximity parameter ρ.
2. The code Π_L = { w ∘ π(w) : w ∈ L } is a (τ_cor, ǫ)-RLCC with query complexity q, where π(w) is the canonical proof for w ∈ L from Definition 3.10.

Below we define the notion of consistency test using random walk (
CTRW). This notion was originally defined in [CGS20] for tensor powers of general codes. In this paper we focus on the CTRW for the Reed-Muller code.

Informally speaking, a consistency test using random walk for the Reed-Muller code RM = RM_F(m, d) is a randomized algorithm that gets a word f : F^m → F, which is close to some codeword Q* ∈ RM, and an index ~x ∈ F^m as input, and its goal is to check whether f(~x) = Q*(~x). In other words, it checks whether the value of f at ~x is consistent with the close codeword Q*. Below we formally describe the random process.

Definition 4.1 (Consistency test using H-plane-line random walk on RM_F(m, d)). Let H be a field, and let F be a field extension of H. Let RM = RM_F(m, d) be the Reed-Muller code. An r-steps consistency test using H-plane-line random walk on RM is a randomized algorithm that gets as input the evaluation table of some f : F^m → F and a coordinate ~x ∈ F^m, and works as in Algorithm 1. We say that the CTRW has perfect completeness and (τ, ρ, ǫ)-robust soundness if it satisfies the following guarantees.

Perfect completeness: If f ∈ RM, then Pr[
CTRW^f(~x) = ACCEPT] = 1 for all ~x ∈ F^m.

(τ, ρ, ǫ)-robust soundness: If f is τ-close to some Q* ∈ RM, but f(~x) ≠ Q*(~x), then

Pr[ dist_{~x}(f|_{P_0}, RM|_{P_0}) ≥ ρ ∨ ∃i ∈ [r] such that dist_{ℓ_i}(f|_{P_i}, RM|_{P_i}) ≥ ρ ] ≥ ǫ.

Algorithm 1: H-plane-line CTRW for the m-dimensional Reed-Muller code
Input: f : F^m → F, ~x ∈ F^m
  Pick ~h_0, ~h'_0 ∈ H^m uniformly at random, and let ~x_0 = ~x
  Let P_0 = P_{~x_0, ~h_0, ~h'_0} be a random H-plane passing through ~x_0
  for i = 1 to r do
    Sample s_{i−1}, s'_{i−1} ∈ F uniformly and independently
    Let ~x_i = ~x_{i−1} + s_{i−1}·~h_{i−1} + s'_{i−1}·~h'_{i−1} be a uniformly random point in P_{i−1}
    Sample t_{i−1}, t'_{i−1} ∈ F uniformly and independently, and let ~h_i = t_{i−1}·~h_{i−1} + t'_{i−1}·~h'_{i−1}
    Let ℓ_i = ℓ_{~x_i, ~h_i} = { ~x_i + t·~h_i : t ∈ F } be a random line in P_{i−1}
    Pick ~h'_i ∈ H^m uniformly at random
    Let P_i = P_{~x_i, ~h_i, ~h'_i}
  if f|_{P_i} is an evaluation of a polynomial of total degree at most d for all 0 ≤ i ≤ r then return ACCEPT, else return REJECT

Here dist_{~x} and dist_{ℓ_i} are as in Definition 3.1.

Remark 4.2.
Note that the soundness condition above is equivalent to checking that
Pr[ dist(f^{(~x)}|_{P_0}, RM^{(~x)}|_{P_0}) ≥ ρ ∨ ∃i ∈ [r] such that dist(f^{(ℓ_i)}|_{P_i}, RM^{(ℓ_i)}|_{P_i}) ≥ ρ ] ≥ ǫ.

Next, we show that the Reed-Muller code admits an m-steps consistency test using H-plane-line random walk with constant robust soundness.

Theorem 4.3.
For integer parameters d, m ≥ 1, let H be a prime field, and let F be a field extension of H of degree [F : H] = m such that |F| ≥ 4md. Denote the size of F by n = |F|. Let RM = RM_F(m, d) be the Reed-Muller code over the field F, so that the relative distance of the code is δ_RM = 1 − d/n ≥ 1 − 1/(4m) ≥ 3/4. Then, for any τ ≤ δ_RM/3 and ρ ≤ δ_RM/4, the m-steps consistency test using H-plane-line random walk on RM has perfect completeness and (τ, ρ, ǫ)-robust soundness, with

ǫ = (1 − 2/|F|)^m − (τ + 2/|H|)/(δ_RM − ρ).

Proof. Consider an r-steps consistency test using random walk on RM as in Algorithm 1. By construction, it is clear that whenever f ∈ RM, the algorithm accepts. It remains to prove the robust soundness of the algorithm. Assume that f is τ-close to some Q* ∈ RM. Note that since RM is a linear code, without loss of generality we may assume that Q* is the all-zeros codeword. Indeed, if f is τ-close to some non-zero codeword Q*, then we can consider the word f' = f − Q*, which is τ-close to the all-zeros codeword, and the behavior of the algorithm is the same in both cases. Hence, from now on we assume that f(~x) ≠ 0, and that f is τ-close to the all-zeros codeword. Below, we show that when running Algorithm 1 on such f, for any choice of P_0 we have either

dist_{~x}(f|_{P_0}, RM|_{P_0}) ≥ ρ   (4)

or

Pr[ ∃i ∈ [r] s.t. dist_{ℓ_i}(f|_{P_i}, RM|_{P_i}) ≥ ρ ] ≥ (1 − 2/|F|)^r − (τ + 2/|H|)/(δ_RM − ρ).   (5)

It is clear that each of Eq. (4) and Eq. (5) implies Theorem 4.3.

Clearly, if dist_{~x}(f|_{P_0}, RM|_{P_0}) ≥ ρ, then we are done. Hence, let us assume that dist_{~x}(f|_{P_0}, RM|_{P_0}) < ρ. In particular, since f(~x) ≠ 0, ρ ≤ δ_RM/4, and dist_{~x}(f|_{P_0}, RM|_{P_0}) < ρ, it follows that f|_{P_0} is ρ-close to some non-zero codeword of RM|_{P_0}, and hence f|_{P_0} contains at least (δ_RM − ρ)n² non-zero entries. For the rest of the proof we focus on proving Eq. (5) assuming that f|_{P_0} contains at least (δ_RM − ρ)n² non-zero entries. In order to prove it, we introduce the events E_i and F_i.

Definition 4.4.
For i ∈ [r] denote by E_i the event that f|_{ℓ_i} has at least ρn non-zeros, and f|_{P_i} has less than (δ_RM − ρ)n² non-zeros. For i ∈ [r] denote by F_i the event that f|_{ℓ_i} has at least ρn non-zeros, and f|_{P_i} has at least (δ_RM − ρ)n² non-zeros.

The following are the key observations about the event E_i.

Observation 4.5. If E_i holds and ρ ≤ δ_RM/4, then
1. dist_{ℓ_i}(f|_{P_i}, 0) ≥ ρ, since f|_{ℓ_i} has at least ρn non-zeros.
2. dist_{ℓ_i}(f|_{P_i}, Q) ≥ ρ for all Q ∈ RM|_{P_i} \ {0}, since f|_{P_i} has less than (δ_RM − ρ)n² non-zeros.
In particular, if E_i holds, then dist_{ℓ_i}(f|_{P_i}, RM|_{P_i}) ≥ ρ.

For each i ∈ [r] denote ǫ_i = Pr[(∧_{j=1}^{i−1} F_j) ∧ E_i]. Observe that the events corresponding to the ǫ_i's are disjoint, and hence

Pr[ ∃i ∈ [r] s.t. dist_{ℓ_i}(f|_{P_i}, RM|_{P_i}) ≥ ρ ] ≥ ∑_{i=1}^r ǫ_i.

The following two lemmas are the key steps in the proof of Theorem 4.3.
Lemma 4.6.
For a uniformly random point ~z ∈ P_m we have Pr[f(~z) ≠ 0] ≤ τ + 2/|H|.

Lemma 4.7. If dist_{~x}(f|_{P_0}, RM|_{P_0}) < ρ, then Pr[∧_{i=1}^r F_i] > (1 − 2/|F|)^r − ∑_{i=1}^r ǫ_i for all r ≥ 1.

We postpone the proofs of the lemmas for now, and proceed with the proof of Theorem 4.3 assuming the lemmas. Note that if we choose a uniformly random ~z ∈ P_m, then

Pr[f(~z) ≠ 0] ≥ Pr[(∧_{i=1}^m F_i) ∧ f(~z) ≠ 0] = Pr[∧_{i=1}^m F_i] · Pr[f(~z) ≠ 0 | ∧_{i=1}^m F_i] ≥ Pr[∧_{i=1}^m F_i] · (δ_RM − ρ),

where the last inequality is by noting that if we condition on ∧_{i=1}^m F_i, then f|_{P_m} has at least (δ_RM − ρ)n² non-zeros, and hence Pr[f(~z) ≠ 0 | ∧_{i=1}^m F_i] ≥ δ_RM − ρ. Therefore, by Lemma 4.6 and Lemma 4.7 it follows that

τ + 2/|H| ≥ Pr[f(~z) ≠ 0] ≥ Pr[∧_{i=1}^m F_i] · (δ_RM − ρ) ≥ ( (1 − 2/|F|)^m − ∑_{i=1}^m ǫ_i ) · (δ_RM − ρ),   (6)

and hence

Pr[ ∃i ∈ [m] s.t. dist_{ℓ_i}(f|_{P_i}, RM|_{P_i}) ≥ ρ ] ≥ ∑_{i=1}^m ǫ_i ≥ (1 − 2/|F|)^m − (τ + 2/|H|)/(δ_RM − ρ).

This completes the proof of Theorem 4.3. We now return to the proof of Lemma 4.6.
Proof of Lemma 4.6.
Fix ~x_0 in Algorithm 1, and consider the independent choices of {s_i, s'_i ∈ F}_{i=0}^{m}, {t_i, t'_i ∈ F}_{i=0}^{m−1}, and {~h'_i ∈ H^m}_{i=0}^{m}.

Note first that for all i ∈ [m] we have

~h_i = (∏_{j=0}^{i−1} t_j)·~h_0 + ∑_{u=0}^{i−1} (t'_u · ∏_{j=u+1}^{i−1} t_j)·~h'_u.

We prove this by induction. Indeed, by Algorithm 1, we have ~h_1 = t_0·~h_0 + t'_0·~h'_0. For the induction step, assume that the equation holds for some i ∈ [m−1]. Then for i+1 we have

~h_{i+1} = t_i·~h_i + t'_i·~h'_i
= t_i·( (∏_{j=0}^{i−1} t_j)·~h_0 + ∑_{u=0}^{i−1} (t'_u·∏_{j=u+1}^{i−1} t_j)·~h'_u ) + t'_i·~h'_i
= (∏_{j=0}^{i} t_j)·~h_0 + ∑_{u=0}^{i−1} (t'_u·∏_{j=u+1}^{i} t_j)·~h'_u + t'_i·~h'_i
= (∏_{j=0}^{i} t_j)·~h_0 + ∑_{u=0}^{i} (t'_u·∏_{j=u+1}^{i} t_j)·~h'_u,

which concludes the induction step. The first equality is the definition in Algorithm 1, and the second follows from the induction hypothesis.

Also, for all i ∈ [m] we have

~x_i = ~x_0 + ∑_{j=0}^{i−1} s_j·~h_j + ∑_{j=0}^{i−1} s'_j·~h'_j.

Again, we prove this by induction. By Algorithm 1, we have ~x_1 = ~x_0 + s_0·~h_0 + s'_0·~h'_0. For the induction step, if we assume that the equation holds for some i ∈ [m−1], then for i+1 we have

~x_{i+1} = ~x_i + s_i·~h_i + s'_i·~h'_i = ~x_0 + ∑_{j=0}^{i−1} s_j·~h_j + ∑_{j=0}^{i−1} s'_j·~h'_j + s_i·~h_i + s'_i·~h'_i = ~x_0 + ∑_{j=0}^{i} s_j·~h_j + ∑_{j=0}^{i} s'_j·~h'_j.

This completes the induction step; again, the first equality is the definition in Algorithm 1, and the second follows from the induction hypothesis.

Let P_m = P_{~x_m, ~h_m, ~h'_m}. Note that we can sample ~z ∈ P_m uniformly by choosing s_m, s'_m ∈ F uniformly, and letting ~z = ~x_m + s_m·~h_m + s'_m·~h'_m. Therefore,

~z = ~x_0 + ( ∑_{i=0}^{m} s_i·~h_i ) + ( ∑_{r=0}^{m} s'_r·~h'_r )
= ~x_0 + s_0·~h_0 + ∑_{i=1}^{m} s_i·( (∏_{j=0}^{i−1} t_j)·~h_0 + ∑_{r=0}^{i−1} (t'_r·∏_{j=r+1}^{i−1} t_j)·~h'_r ) + ∑_{r=0}^{m} s'_r·~h'_r
= ~x_0 + ( s_0 + ∑_{i=1}^{m} s_i·∏_{j=0}^{i−1} t_j )·~h_0 + ∑_{r=0}^{m} ( ∑_{i=r+1}^{m} s_i·t'_r·∏_{j=r+1}^{i−1} t_j + s'_r )·~h'_r.

Next, we fix all random choices except for {s'_r} and {~h'_r}, and apply Claim 3.5. Let A be the set of points ~x ∈ F^m such that f(~x) ≠ 0. Since f is τ-close to the all-zeros codeword, it follows that |A| ≤ τ·|F^m|. Since each s'_r is chosen uniformly at random from F, and each ~h'_r is chosen uniformly at random from H^m, by applying Claim 3.5 with respect to A we have

Pr[f(~z) ≠ 0] = Pr[ ~x_0 + ( s_0 + ∑_{i=1}^{m} s_i·∏_{j=0}^{i−1} t_j )·~h_0 + ∑_{r=0}^{m} ( ∑_{i=r+1}^{m} s_i·t'_r·∏_{j=r+1}^{i−1} t_j + s'_r )·~h'_r ∈ A ] ≤ τ + 2/|H|,

which completes the proof of Lemma 4.6. Next we prove Lemma 4.7.

Proof of Lemma 4.7.
We lower-bound the value of
Pr[∧_{i=1}^r F_i] by peeling off one F_i at a time. Observe that for every i ∈ [r] we have

Pr[(∧_{j=1}^{i−1} F_j) ∧ ℓ_i has at least ρn non-zeros] = Pr[(∧_{j=1}^{i−1} F_j) ∧ F_i] + Pr[(∧_{j=1}^{i−1} F_j) ∧ E_i] = Pr[∧_{j=1}^{i} F_j] + ǫ_i.   (7)

We will use the following claim.

Claim 4.8. For all i ∈ [r], if ρ ≤ δ_RM/4, then Pr[ℓ_i has at least ρn non-zeros | ∧_{j=1}^{i−1} F_j] ≥ 1 − 2/|F|.

Proof. The proof is rather immediate from Lemma 3.4. Let P_{i−1} = P_{~x_{i−1}, ~h_{i−1}, ~h'_{i−1}} be the plane chosen by Algorithm 1 in iteration i−1. Note that conditioning on ∧_{j=1}^{i−1} F_j implies that f|_{P_{i−1}} has at least (δ_RM − ρ)n² non-zeros (in fact, this follows from conditioning on F_{i−1} alone; the other F_j's are irrelevant). Since ℓ_i is a uniformly random line in P_{i−1}, by Lemma 3.4 it follows that

Pr[f|_{ℓ_i} has less than ρn non-zeros] ≤ (1/|F|) · (δ_RM − ρ)/(δ_RM − 2ρ)² ≤ 2/|F|,

where the last inequality is by the assumption that ρ ≤ δ_RM/4 and δ_RM ≥ 3/4. This completes the proof of Claim 4.8.

By applying Claim 4.8 we get

Pr[(∧_{j=1}^{i−1} F_j) ∧ ℓ_i has at least ρn non-zeros] = Pr[∧_{j=1}^{i−1} F_j] · Pr[ℓ_i has at least ρn non-zeros | ∧_{j=1}^{i−1} F_j] ≥ Pr[∧_{j=1}^{i−1} F_j] · (1 − 2/|F|).   (8)

Combining Eq. (7) with Eq. (8) we get

Pr[∧_{j=1}^{i} F_j] ≥ Pr[∧_{j=1}^{i−1} F_j] · (1 − 2/|F|) − ǫ_i.   (9)

By exactly the same argument, using the assumption that f|_{P_0} contains at least (δ_RM − ρ)n² non-zeros, it follows that

Pr[F_1] ≥ 1 − 2/|F| − ǫ_1.   (10)

By unrolling the recurrence, peeling off one F_i at a time, and applying Eq. (9), we get

Pr[∧_{i=1}^r F_i] ≥ Pr[∧_{i=1}^{r−1} F_i] · (1 − 2/|F|) − ǫ_r
≥ ( Pr[∧_{i=1}^{r−2} F_i] · (1 − 2/|F|) − ǫ_{r−1} ) · (1 − 2/|F|) − ǫ_r
= Pr[∧_{i=1}^{r−2} F_i] · (1 − 2/|F|)² − (1 − 2/|F|)·ǫ_{r−1} − ǫ_r
≥ ...
≥ Pr[F_1] · (1 − 2/|F|)^{r−1} − ∑_{i=2}^{r} (1 − 2/|F|)^{r−i} · ǫ_i
≥ ( 1 − 2/|F| − ǫ_1 ) · (1 − 2/|F|)^{r−1} − ∑_{i=2}^{r} (1 − 2/|F|)^{r−i} · ǫ_i
= (1 − 2/|F|)^r − ∑_{i=1}^{r} (1 − 2/|F|)^{r−i} · ǫ_i
> (1 − 2/|F|)^r − ∑_{i=1}^{r} ǫ_i.

We get that
Pr[∧_{i=1}^r F_i] > (1 − 2/|F|)^r − ∑_{i=1}^r ǫ_i, which concludes the proof of Lemma 4.7.

In this section we explain how to construct PCPP systems for the languages RM^{(~x)}|_P and RM^{(ℓ)}|_P defined in Eqs. (1) and (2). Note that since all planes in F^m are isomorphic, we may think of each RM^{(~x)}|_P and RM^{(ℓ)}|_P as the language RM_F(2, d) of bivariate polynomials of total degree at most d, concatenated with repetitions of their values at ~x and ℓ respectively. Following Notation 2.1 and Notation 2.2, we make the following definition.

Definition 5.1.
Let F be a field of size |F| = n, and let f : F² → F be an F-valued function. Let ~x ∈ F² be a point, and let ℓ ⊆ F² be the line ℓ = ℓ_{~x_0,~x_1} = { ~x_0 + t·~x_1 : t ∈ F } for some ~x_0, ~x_1 ∈ F².

• Define f^{(~x)} to be the concatenation of f with n² repetitions of the value of f at the point ~x, i.e., f^{(~x)} = f ∘ (f(~x))^{n²}.
• Define f^{(ℓ)} to be the concatenation of f with n repetitions of the restriction of f to the line ℓ, i.e., f^{(ℓ)} = f ∘ (f|_ℓ)^n.

Define RM^{(~x)} = RM^{(~x)}_F(2, d) = { Q^{(~x)} : Q ∈ RM_F(2, d) } and RM^{(ℓ)} = RM^{(ℓ)}_F(2, d) = { Q^{(ℓ)} : Q ∈ RM_F(2, d) }.

Note that given oracle access to f we can query every coordinate of f^{(~x)} and f^{(ℓ)} by querying a single coordinate of f. In particular, any PCPP system for f^{(~x)} can be emulated when given access to f without increasing the query complexity of the proof system. The following observation is immediate from Definition 5.1.

Observation 5.2. Let F be a field of size |F| = n, let ~x ∈ F² be a point, and let ℓ ⊆ F² be the line ℓ = ℓ_{~x_0,~x_1} for some ~x_0, ~x_1 ∈ F². Then for any f : F² → F we have

dist_{~x}(f, RM) = dist(f^{(~x)}, RM^{(~x)})  and  dist_ℓ(f, RM) = dist(f^{(ℓ)}, RM^{(ℓ)}).

It is clear that the languages RM^{(ℓ)} and RM^{(~x)} can be decided in polynomial time. Therefore, by Theorem 6.1 in [CGS20], these languages admit a ccPCPP with the appropriate parameters.

Theorem 5.3 (Canonical PCPP for RM). Let F be a finite field and let d ∈ N be a parameter. Let L be either RM^{(ℓ)}_F(2, d) or RM^{(~x)}_F(2, d). Then L admits a ccPCPP with the following parameters.
1. The ccPCPP verifier has perfect completeness and soundness ǫ_PCPP = 0.5 for any proximity parameter ρ > 0.
2. The query complexity of the verifier is q = O(1/ρ).
3. The length of the canonical proof π(f) for f ∈ L of length n is len(n) = poly(n).
4.
The language Π_L = { f ∘ π(f) : f ∈ L } is a (τ_cor, ǫ_inRLCC)-RLCC with query complexity q = O(1), constant correction radius τ_cor = Ω(1), and constant soundness ǫ_inRLCC = Ω(1).

Informally speaking, in order to prove Theorem 5.3, [CGS20] start with a cPCPP system from Theorem 3.11, and for every w ∈ L with canonical proof π(w), define a correctable proof π*(w) by encoding w ∘ π(w) using a systematic RLCC with constant distance and polynomial block length (e.g., the one from [GRR18] or [CGS20]). Since the RLCC is systematic, the encoding is of the form w ∘ π(w) ∘ π'(w) for some string π'(w) of length poly(|w|). Then, the canonical proof is defined to be π*(w) = π(w) ∘ π'(w). It is rather straightforward to define a verifier that satisfies the requirements of Theorem 5.3. We omit the details here, and refer the interested reader to Theorem 6.1 in [CGS20].
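The weighting effect of Definition 5.1 can be seen in a few lines of code: appending one copy of f(~x) per entry of the evaluation table makes the single point ~x carry roughly half of the total weight, so corrupting it alone already moves f^{(~x)} far from the code. The sketch below is our own toy rendering (the prime p, the polynomial, and the point are illustrative choices):

```python
# A toy rendering of Definition 5.1: f^(x) = f o (f(x))^(n^2), i.e. the
# evaluation table of f followed by one copy of f(x) per table entry.
# Parameters and names are our own illustrative choices.
p = 7
f = {(a, b): (3 * a + 2 * b) % p for a in range(p) for b in range(p)}
x = (4, 5)
N = p * p                               # size of the evaluation table of f

def concat_point(g):
    # The table of g followed by N repetitions of g(x).
    return [g[pt] for pt in sorted(g)] + [g[x]] * N

w = concat_point(f)
g = dict(f)
g[x] = (g[x] + 1) % p                   # corrupt f at the single point x
ham = sum(a != b for a, b in zip(w, concat_point(g)))

# One corrupted point changes its table entry plus all N repetitions,
# i.e. just over half of the 2N coordinates of f^(x).
assert ham == N + 1
```

This is exactly why dist_{~x}(f, RM) of Definition 3.1, which weighs the point ~x heavily, turns into the plain Hamming distance of Observation 5.2.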
CTRW for Reed-Muller codes withappropriate PCPPs. This composition theorem immediately implies the statement of Theorem 1, albeit fora large alphabet.
6.1 Composing the CTRW with PCPPs

Below we prove that if RM_F(m, d) admits an m-steps CTRW, then it can be composed with a PCPP system with appropriate parameters to obtain an RLCC with query complexity O(m). The composition theorem that we present is a slightly modified version of the composition theorem presented in [CGS20]. The main difference compared to [CGS20] is that we consider two types of PCPP proofs for RM.

Theorem 6.1 (Composition theorem for Reed-Muller codes). Let F be a finite field of size |F| = n, let m, d ∈ N be parameters, and let δ_RM = 1 − d/n. Suppose that H is a subfield of F such that [F : H] = m. Consider the following components.

• A Reed-Muller code RM_F(m, d) : F^K → F^{n^m} that admits an m-steps H-plane-line-CTRW with the following parameters.
1. The CTRW has perfect completeness and (τ, ρ, ǫ_RW)-robust soundness.
2. The total number of predicates (of both types) defined for the CTRW is at most B.

• Canonical PCPP systems for languages of the form RM^{(~x)}|_P and RM^{(ℓ)}|_P with the following properties.
1. For each f^{(~x)}|_P ∈ RM^{(~x)}|_P the length of the canonical proof is at most len(2n²).
2. For each f^{(ℓ)}|_P ∈ RM^{(ℓ)}|_P the length of the canonical proof is at most len(2n²).
3. The verifier has query complexity q_PCPP, perfect completeness, and soundness ǫ_PCPP < 1 for proximity parameter ρ.
4. The codes Π_{RM^{(~x)}|_P} = { f^{(~x)}|_P ∘ π(f^{(~x)}|_P) : f^{(~x)}|_P ∈ RM^{(~x)}|_P } and Π_{RM^{(ℓ)}|_P} = { f^{(ℓ)}|_P ∘ π(f^{(ℓ)}|_P) : f^{(ℓ)}|_P ∈ RM^{(ℓ)}|_P } are (2ρ, ǫ_inRLCC)-RLCCs with query complexity q_PCPP.

Then, there exists a code C_comp : F^K → F^N with block length N ≤ n^m + 2B·len(2n²) and relative distance at least (1 − d/n)/2. The code C_comp is a (τ_cor, ǫ_RLCC)-RLCC with query complexity q_RLCC = (m + 3)·q_PCPP, where the decoding radius of C_comp is τ_cor = τ/4 and the soundness is

ǫ_RLCC = min( ǫ_RW·(1 − ǫ_PCPP)·δ_RM/2 , ǫ_inRLCC ).
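The first component assumed by the theorem is the plane-line walk of Algorithm 1. Its geometric invariants (each new point ~x_i and line ℓ_i lie in the previous plane P_{i−1}, and ℓ_i also lies in the new plane P_i) can be simulated directly. The sketch below is ours; for simplicity it takes H = F = F_p, ignoring the subfield structure of the real walk:

```python
# A geometric sketch of the plane-line random walk of Algorithm 1, with
# H = F = F_p for simplicity (the real walk takes directions in a subfield H).
import random

random.seed(1)
p, m, r = 11, 3, 4

def vadd(u, v): return tuple((a + b) % p for a, b in zip(u, v))
def smul(s, u): return tuple((s * a) % p for a in u)
def rand_vec():  return tuple(random.randrange(p) for _ in range(m))

def plane(x, h, hp):
    # P_{x,h,h'} = { x + t*h + s*h' : t, s in F }
    return {vadd(x, vadd(smul(t, h), smul(s, hp))) for t in range(p) for s in range(p)}

x, h, hp = rand_vec(), rand_vec(), rand_vec()
planes = [plane(x, h, hp)]
for _ in range(r):
    s, sp = random.randrange(p), random.randrange(p)
    x = vadd(x, vadd(smul(s, h), smul(sp, hp)))    # x_i: a uniform point of P_{i-1}
    t, tp = random.randrange(p), random.randrange(p)
    h = vadd(smul(t, h), smul(tp, hp))             # h_i: direction of the line l_i
    line = {vadd(x, smul(u, h)) for u in range(p)}
    hp = rand_vec()                                # fresh second direction h'_i
    planes.append(plane(x, h, hp))
    # x_i and l_i lie in P_{i-1}; l_i also lies in the new plane P_i.
    assert x in planes[-2] and line <= planes[-2] and line <= planes[-1]
```

The in-loop assertions hold for every random choice, since ~x_i and ~h_i are F-linear combinations of the spanning directions of P_{i−1}.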
Before proceeding with the proof of Theorem 6.1, we show how it implies Theorem 1.
Proof of Theorem 1.
Given a sufficiently large parameter q_RLCC specifying the desired query complexity of an RLCC, let m = ⌊q_RLCC/q_PCPP⌋ − 3 ≥ 1, and let d ≥ m^m. Choose a prime field H such that (4d)^{1/m} ≤ |H| ≤ 2·(4d)^{1/m}. Finally, we let F be the degree-m extension of H, and let n = |F|. In particular, by the choice of d we have |H| ≥ m, and the relative distance of RM_F(m, d) is δ_RM = 1 − d/n = 1 − d/|H|^m ≥ 1 − d/(4d) ≥ 3/4.

By Theorem 4.3, the code RM_F(m, d) admits an m-steps H-plane-line-CTRW with perfect completeness and (τ, ρ, ǫ_RW)-robust soundness, with τ = δ_RM/3, ρ = δ_RM/4, and soundness parameter

ǫ_RW = (1 − 2/|F|)^m − (τ + 2/|H|)/(δ_RM − ρ) ≥ 1 − 2m/|F| − (δ_RM/3 + 2/|H|)/(3δ_RM/4) ≥ 1 − 2m/|F| − 4/9 − 32/(9m) = Ω(1).

Furthermore, by a simple counting argument, the total number of predicates (of both types) defined for the CTRW is B ≤ 2·n^m·|H|^{2m}·n² = 2n^{m+4}.

For the canonical PCPP component, by Theorem 5.3 the ccPCPP system has perfect completeness, constant soundness with respect to ρ = δ_RM/4, query complexity q_PCPP = O(1/ρ) = O(1/δ_RM) = O(1), and the length of the canonical proof is len(2n²) = poly(n).

Therefore, by Theorem 6.1 we obtain a q_RLCC-query (τ_cor, ǫ_RLCC)-RLCC C_comp : F^K → F^N with constant relative distance, τ_cor = Ω(1), and ǫ_RLCC = Ω(1), where the message length of the code is K = \binom{d+m}{m} ≥ (d/m)^m, and its block length is N ≤ n^m + 2B·len(2n²) ≤ n^m + 4n^{m+4}·poly(n). By plugging in the parameters we get

N = n^{m+O(1)} ≤ (2^m·4d)^{m+O(1)} = (2^m·4m)^{m+O(1)} · (d/m)^{m+O(1)} = 2^{O(m²)} · K^{1+O(1/m)},

and the relative distance is at least (1 − d/n)/2 ≥ 3/8. This completes the proof of Theorem 1.

The rest of this section is devoted to the proof of Theorem 6.1.

6.2 Constructing the composed code

Given the components in the statement of Theorem 6.1, the composed code C_comp : F^K → F^N is obtained by concatenating several repetitions of the Reed-Muller encoding of the message with the canonical proofs of proximity. Specifically, given a message M ∈ F^K, we first let Q_RM = RM(M) be the encoding of M using RM(m, d). The final encoding C_comp(M) consists of the following three parts:

C_comp(M) = RM_rep ∘ Π_Point ∘ Π_Line,

described below.
1. RM_rep consists of t = ⌈B·len(2n²)/n^m⌉ repetitions of Q_RM, where t ≥ 1 is the minimal integer so that t·n^m ≥ B·len(2n²). Although these repetitions look rather artificial, they make sure that the Reed-Muller part of the encoding constitutes a constant fraction of the codeword C_comp(M).
2. Π_Point is the concatenation of proofs of proximity π(P, ~x) (as per the ccPCPPs in the hypothesis of the theorem) for each H-plane P and for each point ~x ∈ P.
That is, each such π(P, ~x) is the canonical proof for the assertion that (Q_RM)^{(~x)}|_P ∈ RM^{(~x)}|_P. Note that since C_comp(M) contains many copies of Q_RM, each π(P, ~x) is expected to be the canonical proof for all the copies.
3. Π_Line is the concatenation of proofs of proximity π(P, ℓ) (as per the ccPCPPs in the hypothesis of the theorem) for each H-plane P and for each line ℓ ⊆ P. Each such π(P, ℓ) is the canonical proof for the assertion that (Q_RM)^{(ℓ)}|_P ∈ RM^{(ℓ)}|_P. Again, since C_comp(M) contains many copies of Q_RM, each π(P, ℓ) is expected to be the canonical proof for all the copies.

Parameters of C_comp: Note that the total block length of the encoding is N = t·n^m + B·len(2n²) ≤ n^m + 2B·len(2n²). As for the relative distance of C_comp, if the relative distance of RM_F(m, d) is δ_RM, then the relative distance of the RM_rep part is also δ_RM. Furthermore, since the length of the RM_rep part is at least half of the total block length, it follows that the relative distance of C_comp is at least δ_RM/2.

Below we present the local correcting algorithm for the RM_rep part of the code. Given a word w ∈ F^N write w = f_rep ∘ Π_Point ∘ Π_Line, where f_rep is (expected to be) the t copies of some Reed-Muller codeword, and Π_Point, Π_Line are the proofs as described above. Let i ∈ [N] be a coordinate in the RM_rep part of the code, which corresponds to some ~x ∈ F^m of (one of the copies of) the Reed-Muller encoding. The local correcting algorithm is described in Algorithm 2, and works as follows.
Algorithm 2: Local correcting algorithm for the RM_rep part
Input: w = f_rep ∘ Π_Point ∘ Π_Line, i ∈ [N]
1: Let ~x ∈ F^m be the index corresponding to the i'th coordinate of the RM encoding
2: Sample r ∈ [t] uniformly at random, and let f : F^m → F be the substring of f_rep corresponding to the r-th copy of the base codeword
3: Run the m-steps H-plane-line-CTRW from Algorithm 1 on the input (f, ~x)
4: Let P_0, P_1, ..., P_m be the planes sampled by the CTRW, and let ℓ_1, ..., ℓ_m be the sampled lines
5: Run the cPCPP verifier on π(P_0, ~x) to check that f^{(~x)}|_{P_0} is ρ-close to RM^{(~x)}|_{P_0}
6: for j = 1 to m do
7:   Run the cPCPP verifier on π(P_j, ℓ_j) to check that f^{(ℓ_j)}|_{P_j} is ρ-close to RM^{(ℓ_j)}|_{P_j}
8: if Step 5 accepts and all iterations of Step 7 accept then return f(~x)
9: else return ⊥

Query complexity:
The total number of queries made in Algorithm 2 is clearly upper bounded by (m + 1)·q_PCPP, coming from Lines 5 and 7 of the algorithm.
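To make the control flow of Algorithm 2 concrete, here is a minimal executable sketch with the CTRW and the cPCPP verifier passed in as stubs. All names are ours, and None plays the role of the abort symbol ⊥:

```python
# A control-flow sketch of Algorithm 2; ctrw and pcpp_verify are stubs for
# the walk of Algorithm 1 and the cPCPP verifier (names are ours).
import random

def correct_rm_part(f_rep, t, x, ctrw, pcpp_verify):
    f = f_rep[random.randrange(t)]          # Line 2: a uniformly random copy
    planes, lines = ctrw(f, x)              # Lines 3-4: planes P_0..P_m, lines l_1..l_m
    if not pcpp_verify("point", planes[0], x, f):     # Line 5: point-type proof
        return None
    for P, ell in zip(planes[1:], lines):   # Lines 6-7: line-type proofs
        if not pcpp_verify("line", P, ell, f):
            return None
    return f[x]                             # Lines 8-9: all checks passed
```

With an always-accepting verifier stub the sketch returns f(~x) unchanged, and with any rejecting stub it returns None, mirroring the accept/abort structure of the real algorithm.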
Proof of correctness:
By the description of the algorithm, it is clear that if the input is a non-corrupted codeword, i.e., w ∈ C_comp, then for any ~x ∈ F^m the algorithm always returns the correct answer.

Now, assume that the input w = f_rep ∘ Π_Point ∘ Π_Line is τ_cor-close to some codeword W* ∈ C_comp, and suppose that the RM_rep part of W* consists of t copies of some degree-d polynomial Q* ∈ RM. We will show that Pr[D^w_{Algorithm 2}(i) ∈ {Q*(~x), ⊥}] ≥ ǫ_RLCC.

Note that since w is τ_cor-close to W*, and the length of f_rep is at least 1/2 of the total block length, it follows that f_rep is 2τ_cor-close to the t repetitions of Q*. Denote by W_close the event that dist(f, Q*) ≤ 4τ_cor = τ for the random copy f in the RM_rep part sampled in Line 2 of the algorithm. Then, by Markov's inequality, Pr[W_close] ≥ 1/2. Therefore,

Pr[D^w_{Algorithm 2}(i) ∈ {Q*(~x), ⊥}] ≥ 1/2 · Pr[D^w_{Algorithm 2}(i) ∈ {Q*(~x), ⊥} | W_close].

From now on, let us condition on the event W_close, and focus on the term Pr[D^w_{Algorithm 2}(i) ∈ {Q*(~x), ⊥} | W_close]. Furthermore, let us fix the choice of f and consider the following two cases.

Case 1: f(~x) = Q*(~x). Noting that D^w_{Algorithm 2}(i) always outputs either f(~x) or ⊥, it follows that by conditioning on such f we have Pr[D^w_{Algorithm 2}(i) ∈ {Q*(~x), ⊥} | W_close, f(~x) = Q*(~x)] = 1.

Case 2: f(~x) ≠ Q*(~x). In this case since
CTRW admits ( τ, ρ, ǫ RW ) -robust soundness, it follows that Pr[dist ~x ( f |P , RM |P ) ≥ ρ ∨ ∃ j ∈ [ m ] such that dist ℓ j ( f |P j , RM |P j ) ≥ ρ ] ≥ ǫ RW . Therefore, when running the cPCPP verifier for which the local view is ρ -far from the corresponding predi-cate, the verifier will reject with probability at least − ǫ P CP P , and hence the decoder will output ⊥ withthe same probability. Therefore, we can lower bound the second term by Pr[ D w Algorithm i ) ∈ { Q ∗ ( ~x ) , ⊥}| W close , f ( ~x ) = Q ∗ ( ~x )] ≥ ǫ RW (1 − ǫ P CP P ) . Pr[ D w Algorithm i ) ∈ { Q ∗ ( ~x ) , ⊥} ≥ ǫ RW · (1 − ǫ P CP P )2 ≥ ǫ RLCC . This completes the proof of correctness of the algorithm for the RM rep part of the code. Next, we present the correction algorithm for the cPCPP proofs part of the code. Let w = f rep ◦ Π P oint ◦ Π Line ∈ F N be a given word, and let i ∈ [ N ] be a coordinate in the Π P oint ◦ Π Line part of the proof. Thecorrection algorithm is described in Algorithm 3, and works as follows.
Algorithm 3: Local correcting algorithm for the PCPP part Π

Input: w = f_rep ◦ Π_Point ◦ Π_Line, i ∈ [N]

1: Sample r ∈ [t] uniformly at random, and let f : F^m → F be the substring of f_rep corresponding to the r-th copy of the base codeword
2: if i is a coordinate in Π_Point then
3:   Let P* and ~x* ∈ P* be the plane and the point such that i is a coordinate of π = π^(P*, ~x*)
4:   Run the cPCPP verifier to check that dist(f|_{P*}, RM|_{P*}) ≤ ρ
5:   if Step 4 rejects then
6:     return ⊥
7: else
8:   Let P* and ℓ* ⊆ P* be the plane and the line such that i is a coordinate of π = π^(P*, ℓ*)
9:   Run the cPCPP verifier to check that dist(f|_{P*}, RM|_{P*}) ≤ ρ
10:   if Step 9 rejects then
11:     return ⊥
12: Choose a uniformly random ~x ∈ P*
13: Run the m-step H-Line-Plane-CTRW from Algorithm 1 on the input (f, ~x)
14: Let P_0, P_1, ..., P_m be the planes sampled by the CTRW, and let ℓ_1, ..., ℓ_m be the sampled lines
15: Run the cPCPP verifier on π^(P_0, ~x) to check that f|_{P_0} is ρ-close to RM|_{P_0}
16: for j = 1 to m do
17:   Run the cPCPP verifier on π^(P_j, ℓ_j) to check that f|_{P_j} is ρ-close to RM|_{P_j}
18: if Step 15 or any iteration of Step 17 rejects then
19:   return ⊥
20: if i is a coordinate in Π_Point then
21:   Run the local corrector of the inner ccPCPP on f|_{P*} ◦ π^(P*, ~x*) to correct w_i
22:   return the value obtained in Step 21
23: else
24:   Run the local corrector of the inner ccPCPP on f|_{P*} ◦ π^(P*, ℓ*) to correct w_i
25:   return the value obtained in Step 24

Query complexity: The total number of queries is upper bounded by (i) q_PCPP queries in Step 4 or in Step 9, (ii) at most (m + 1) · q_PCPP queries in Steps 15 and 17, and (iii) at most q_PCPP queries in Step 21 or Step 24. Therefore, the total query complexity is upper bounded by (m + 3) · q_PCPP, as required.
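Both correctness arguments in this section open with the same averaging step: the received word is τ_cor-close to a codeword, and since f_rep occupies at least half of the block length, the t copies carry at most a 2τ_cor fraction of corruption on average; Markov's inequality then guarantees that a uniformly random copy is 4τ_cor-corrupted with probability at most 1/2. The following toy numerical check illustrates this step (all parameters are illustrative, not the paper's actual constants):

```python
from fractions import Fraction

def frac_close_copies(copy_dists, threshold):
    """Fraction of copies whose distance to the base codeword is <= threshold."""
    good = sum(1 for d in copy_dists if d <= threshold)
    return Fraction(good, len(copy_dists))

# Toy parameters: t = 8 copies, corruption rate tau_cor = 1/20 of the block.
tau_cor = Fraction(1, 20)
copy_dists = [Fraction(x, 100) for x in [0, 0, 5, 5, 10, 15, 20, 25]]

# The copies carry at most a 2*tau_cor fraction of corruption on average.
avg = sum(copy_dists) / len(copy_dists)
assert avg <= 2 * tau_cor

# Markov: Pr[dist(f, Q*) > 4*tau_cor] <= avg / (4*tau_cor) <= 1/2, so at
# least half of the copies are (4*tau_cor)-close to the base codeword.
assert frac_close_copies(copy_dists, 4 * tau_cor) >= Fraction(1, 2)
```

Using exact rationals avoids any floating-point slack in the threshold comparisons, which mirrors how the distances are treated symbolically in the analysis.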
Proof of correctness:
By the description of the algorithm, it is clear that if w ∈ C_comp, then for any index i ∈ [N] in the proof part, the algorithm always returns the correct answer w_i.

We assume from now on that the input w ∈ F^N is τ_cor-close to some codeword W* ∈ C_comp, and suppose that the RM_rep part of W* consists of t copies of some degree-d polynomial Q* ∈ RM. As in the previous part, since w is τ_cor-close to W* and the length of f_rep is at least 1/2 of the total block length, it follows that f_rep is 2τ_cor-close to the t repetitions of Q*. Therefore, for the random copy f in the RM_rep part sampled in Line 1 of the algorithm, we have Pr[dist(f, Q*) ≤ 4τ_cor = τ] ≥ 1/2. From now on let us condition on the event dist(f, Q*) ≤ τ.

Let us assume that the coordinate i ∈ [N] we wish to decode belongs to some π^(P*, ~x*). The following claim completes the analysis of the correcting algorithm.

Claim 6.2. If Pr[D^w_{Algorithm 3}(i) = ⊥] < ǫ_RLCC, then Pr[D^w_{Algorithm 3}(i) ∈ {W*_i, ⊥}] ≥ ǫ_inRLCC/2.

Proof. Since Algorithm 3 returns ⊥ with probability less than ǫ_RLCC in Step 6, the cPCPP verifier for π^(P*, ~x*) in Step 4 accepts with probability at least 1 − ǫ_RLCC > ǫ_PCPP. Thus, there is some bivariate degree-d polynomial Q' : P* → F (not necessarily equal to Q*|_{P*}) such that

1. dist(f|_{P*}, Q'|_{P*}) ≤ ρ, and
2. π^(P*, ~x*) is ρ-close to the canonical proof π(Q'|_{P*}).

Next, we use the assumption that Algorithm 3 returns ⊥ with probability less than ǫ_RLCC in Steps 15 or 17. That is, when running the H-Line-Plane-CTRW from a uniformly random ~x ∈ P*, and then running the corresponding cPCPP verifiers, with probability at least 1 − ǫ_RLCC all cPCPP verifiers accept. For each ~z ∈ P* let p_~z be the probability that both Step 15 and Step 17 accept when starting from ~z. Then E[p_~z] > 1 − ǫ_RLCC, and hence, by Markov's inequality, for at least a (1 − δ_RM/2) fraction of the starting points ~z ∈ P* we have p_~z ≥ 1 − 2ǫ_RLCC/δ_RM ≥ 1 − ǫ_RW(1 − ǫ_PCPP). Therefore, by the analysis of the correcting algorithm for the RM_rep part, for more than a (1 − δ_RM/2) fraction of the starting points ~z ∈ P* it holds that f(~z) = Q*(~z). Indeed, by Case 2 of that analysis, if f(~z) ≠ Q*(~z), then p_~z < 1 − ǫ_RW(1 − ǫ_PCPP). Therefore, dist(f|_{P*}, Q*|_{P*}) < δ_RM/2.

Combining this with the conclusion of the previous step, namely dist(f|_{P*}, Q'|_{P*}) ≤ ρ, it follows that dist(Q*|_{P*}, Q'|_{P*}) < ρ + δ_RM/2 ≤ δ_RM. Thus, since RM_F(m, d) has distance δ_RM, we conclude that Q*|_{P*} = Q'|_{P*}.

So far we have shown that if Algorithm 3 returns ⊥ with probability less than ǫ_RLCC and f is τ-close to Q* (which happens with probability at least 1/2), then f|_{P*} is ρ-close to Q*|_{P*}, and π^(P*, ~x*) is ρ-close to π(Q*|_{P*}), the canonical proof of Q*|_{P*}. Therefore, the local correction algorithm for the inner ccPCPP applied to f|_{P*} ◦ π^(P*, ~x*) in Step 21 returns either W*_i or ⊥ with probability at least ǫ_inRLCC. Therefore,

Pr[D^w_{Algorithm 3}(i) ∈ {W*_i, ⊥}] ≥ Pr[Step 21 returns W*_i or ⊥ | dist(f, Q*) ≤ τ] · Pr[dist(f, Q*) ≤ τ] ≥ ǫ_inRLCC/2,

as required.

We proved correctness of the local correction algorithm assuming that the coordinate i ∈ [N] we wish to decode belongs to some π^(P*, ~x*). For the case when i belongs to some π^(P*, ℓ*), the analysis is exactly the same. This concludes the proof of Theorem 6.1.

In this paper we constructed an O(q)-query RLDC C : F^K → F^N with block length N = q^O(q) · K^(1 + O(1/q)), assuming that the field is large enough, namely, assuming that |F| ≥ c_q · K^(1/q). Using standard techniques it is possible to obtain a binary RLDC with similar parameters. This can be done by concatenating our code with an arbitrary binary code with constant rate and constant relative distance. Indeed, this transformation appears in [CGS20, Appendix A], which shows that concatenating a CTRW-based RLDC over a large alphabet with a good binary code gives a binary RLDC that essentially inherits the block length and the query complexity of the RLDC over the large alphabet. Below we provide the proof sketch, explaining how the concatenation works.
Proof sketch.
Suppose that we want to construct a short binary RLCC. Let C_RLCC : F^K → F^N be the RLCC over some field F with the desired block length, and let C_bin : {0,1}^K' → {0,1}^N' be an error-correcting code with constant rate and constant relative distance. We also assume that the field F is chosen so that |F| = 2^K'. (To satisfy this condition, one can simply take F to be a field of characteristic 2.) This assumption gives a bijection between the symbols of F and the binary strings of length K'.

We construct the binary concatenated code C_concat : {0,1}^(K·K') → {0,1}^(N·N') as follows. Given a message M ∈ {0,1}^(K·K'), we first convert it to a string M' ∈ F^K in the natural way. Then we encode M' using C_RLCC to obtain a codeword c* ∈ C_RLCC. Finally, we encode each symbol of c* using C_bin to get the final codeword c ∈ {0,1}^(N·N').

To prove that the concatenated code is an RLCC, Chiesa, Gur, and Shinkar showed in [CGS20, Theorem A.4] that if C_RLCC admits an r-step CTRW with some soundness guarantees, then C_concat admits an r-step CTRW with related soundness guarantees. The CTRW on the concatenated code C_concat emulates the CTRW on C_RLCC: it samples planes for the CTRW on the Reed-Muller code, and instead of reading the symbols from F, it reads the binary encodings of all symbols belonging to these planes. Indeed, it is not difficult to see that if C_RLCC admits an r-step CTRW with some soundness guarantees, then so does the concatenated code. We omit the details, and refer the interested reader to Appendix A in [CGS20].

We conclude the paper with several open problems that we leave for future research.

1. The most fundamental open problem regarding RLDCs/RLCCs is to understand the optimal trade-off between the query complexity of LDCs and their block length in the constant query regime. It is plausible that the lower bound of [GL20] can be improved to K^(1 + Ω(1/q)), although we do not have any evidence for this.

2. As discussed in the introduction, [BGH+
06] asked whether it is possible to prove a separation between LDCs and RLDCs. Understanding the trade-off between the query complexity and the block length is one possible way to show such a separation.

3. Another interesting open problem is to construct an RLDC/RLCC with constant rate and small query complexity. In particular, it is plausible that there exist polylog(N)-query RLDCs with N = O(K).

4. Also, it would be interesting to construct RLDCs/RLCCs using high-dimensional expanders [KM17, DK17, DDFH18, KO18]. Since there are several definitions of high-dimensional expanders, it would be interesting to state the sufficient properties of high-dimensional expanders required for RLDCs. We believe this approach can be useful in constructing constant rate RLDCs with small query complexity.

References

[BGH+06] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil P. Vadhan. Robust PCPs of proximity, shorter PCPs, and applications to coding. SIAM Journal on Computing, 36(4):889–974, 2006.

[CGS20] Alessandro Chiesa, Tom Gur, and Igor Shinkar. Relaxed locally correctable codes with nearly-linear block length and constant query complexity. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, pages 1395–1411, 2020.

[DDFH18] Yotam Dikstein, Irit Dinur, Yuval Filmus, and Prahladh Harsha. Boolean function analysis on high-dimensional expanders. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018), Leibniz International Proceedings in Informatics (LIPIcs), pages 38:1–38:20, 2018.

[DGG18] Irit Dinur, Oded Goldreich, and Tom Gur. Every set in P is strongly testable under a suitable encoding. Technical report, 2018. Available at https://eccc.weizmann.ac.il/report/2018/050.

[DJK+02] A. Deshpande, R. Jain, T. Kavitha, S. V. Lokam, and J. Radhakrishnan. Better lower bounds for locally decodable codes. In Proceedings of the 17th IEEE Annual Conference on Computational Complexity, pages 184–193, 2002.

[DK17] I. Dinur and T. Kaufman. High dimensional expanders imply agreement expanders. In Proceedings of the 58th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2017, pages 974–985, 2017.

[DR04] Irit Dinur and Omer Reingold. Assignment testers: Towards a combinatorial proof of the PCP theorem. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2004, pages 155–164, 2004.

[Efr12] Klim Efremenko. 3-query locally decodable codes of subexponential length. SIAM Journal on Computing, 41(6):1694–1703, 2012.

[GKST02] O. Goldreich, H. Karloff, L. J. Schulman, and L. Trevisan. Lower bounds for linear locally decodable codes and private information retrieval. In Proceedings of the 17th IEEE Annual Conference on Computational Complexity, pages 175–183, 2002.

[GL20] Tom Gur and Oded Lachish. A lower bound for relaxed locally decodable codes. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, 2020.

[GRR18] Tom Gur, Govind Ramnarayan, and Ron D. Rothblum. Relaxed locally correctable codes. In Proceedings of the 9th Innovations in Theoretical Computer Science Conference, ITCS '18, pages 27:1–27:11, 2018.

[KdW03] Iordanis Kerenidis and Ronald de Wolf. Exponential lower bound for 2-query locally decodable codes via a quantum argument. Journal of Computer and System Sciences, pages 106–115, 2003.

[KM17] Tali Kaufman and David Mass. High dimensional random walks and colorful expansion. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), Leibniz International Proceedings in Informatics (LIPIcs), pages 4:1–4:27, 2017.

[KO18] Tali Kaufman and Izhar Oppenheim. High order random walks: Beyond spectral gap. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018), Leibniz International Proceedings in Informatics (LIPIcs), pages 47:1–47:17, 2018.

[KS17] Swastik Kopparty and Shubhangi Saraf. Local testing and decoding of high-rate error-correcting codes. Electronic Colloquium on Computational Complexity (ECCC), 2017.

[KT00] Jonathan Katz and Luca Trevisan. On the efficiency of local decoding procedures for error-correcting codes. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, pages 80–86, 2000.

[Mul54] David E. Muller. Application of boolean algebra to switching circuit design and to error detection. Transactions of the IRE Professional Group on Electronic Computers, 1954.

[Oba02] Kenji Obata. Optimal lower bounds for 2-query locally decodable linear codes. In Randomization and Approximation Techniques in Computer Science, pages 39–50, 2002.

[Par20] Orr Paradise. Smooth and strong PCPs. In Proceedings of the 11th Innovations in Theoretical Computer Science Conference, ITCS 2020, 2020.

[WdW05] Stephanie Wehner and Ronald de Wolf. Improved lower bounds for locally decodable codes and private information retrieval. In Proceedings of the 32nd International Conference on Automata, Languages and Programming, ICALP '05, pages 1424–1436, 2005.

[Woo07] David Woodruff. New lower bounds for general locally decodable codes. Technical report, 2007. Available at https://eccc.weizmann.ac.il/report/2007/006/.

[Woo10] David P. Woodruff. A quadratic lower bound for three-query linear locally decodable codes over any field. In Proceedings of the 14th International Workshop on Randomized Techniques in Computation, RANDOM 2010, pages 766–779, 2010.

[Yek08] Sergey Yekhanin. Towards 3-query locally decodable codes of subexponential length. Journal of the ACM, 55(1):1:1–1:16, 2008.

[Yek12] Sergey Yekhanin. Locally decodable codes. Foundations and Trends in Theoretical Computer Science, 2012.