Locally testable codes via high-dimensional expanders ∗
Yotam Dikstein †, Irit Dinur ‡, Prahladh Harsha §, Noga Ron-Zewi ¶
May 5, 2020
Abstract
Locally testable codes (LTC) are error-correcting codes that have a local tester which can distinguish valid codewords from words that are far from all codewords, by probing a given word only at a very small (sublinear, typically constant) number of locations. Such codes form the combinatorial backbone of PCPs. A major open problem is whether there exist LTCs with positive rate, constant relative distance and testable with a constant number of queries.

In this paper, we present a new approach towards constructing such LTCs using the machinery of high-dimensional expanders. To this end, we consider the Tanner representation of a code, which is specified by a graph and a base code. Informally, our result states that if this graph is part of an agreement expander then the local testability of the code follows from the local testability of the base code. Agreement expanders allow one to stitch together many mostly-consistent local functions into a single global function. High-dimensional expanders are known to yield agreement expanders with constant degree.

This work unifies and generalizes the known results on testability of the Hadamard, Reed-Muller and lifted codes, all of which are proved via a single round of local self-correction: the corrected value at a vertex $v$ depends on the values of all vertices that share a constraint with $v$. In the above codes this set includes all of the vertices. In contrast, in our setting the degree of a vertex might be a constant, so we cannot hope for one-round self-correction. We overcome this technical hurdle by performing iterative self-correction with logarithmically many rounds and tightly controlling the error in each iteration using properties of the agreement expander.

Given this result, the missing ingredient towards constructing a constant-query LTC with positive rate and constant relative distance is an instantiation of a base code and a constant-degree agreement expander that interact well with each other.
∗ Part of this work was done when the authors were visiting the Simons Institute for the Theory of Computing, Berkeley for the summer cluster on "Error-Correcting Codes and High-Dimensional Expansion".
† Weizmann Institute.
‡ Weizmann and IAS. Research supported by ERC CoG grant 772839 and the National Science Foundation under agreement No. CCF-1900460.
§ Tata Institute of Fundamental Research, Mumbai, India. Research supported by the Department of Atomic Energy, Government of India, under project no. 12-R&D-TFR-5.01-0500 and in part by the Swarnajayanti fellowship.
¶ Haifa University.

In this work, we study an approach to constructing locally testable codes (LTCs) based on high-dimensional expansion. LTCs are error-correcting codes that have a local tester which can test if a given word is a valid codeword or far (in Hamming distance) from all codewords, by probing the given word only at a very small (sublinear, typically constant) number of locations. Reed-Muller codes were the first codes shown to be locally testable [FS95, RS96]. These codes are based on low-degree polynomial functions, and have inverse polynomial rate. Later on, LTCs with inverse poly-logarithmic rate were constructed by [BS08, Din07]. Obtaining an LTC family with rate that is not vanishing is a major open question in this area. Such codes are known as “good” LTCs or $c^3$-LTCs since they have constant rate, constant relative distance, and are testable with a constant number of queries [Gol10]. This question is interesting in its own right, and could also potentially lead towards constructing linear-length PCPs (as LTCs are the combinatorial backbone of all PCP constructions). The problem of constructing $c^3$-LTCs is particularly difficult as we do not know if such good codes exist, even non-explicitly (say using a probabilistic argument).
The difficulty stems from the fact that local testability requires redundancy in the constraints. In known LTCs, the constraints are highly overlapping, a property that in the past went hand in hand with relatively dense families of constraints. Alas, this density seems to significantly limit the rate. In contrast, high-dimensional expanders give sparse families of subsets that are heavily overlapping. Perhaps if we manage to find appropriate constraints on these subsets we may find higher rate LTCs.

In this work, the vague notion of “overlapping constraints” is captured through so-called agreement-expansion (which will be formally defined below).

Informally speaking, we show that if an error-correcting code is defined through a collection of local constraints that sit on an agreement expander, then to prove local testability of the entire code it suffices to prove local testability of the local components (which are of merely constant size in the case of constant-degree agreement expanders). This is similar in spirit to recent applications of high-dimensional expanders towards proving other local-to-global results. This passing from local to global is particularly important because known constructions of high-dimensional expanders are very difficult to analyze on a global level. So far, successful analyses focused on the local structure (in neighborhoods, or so-called links) of these objects. Through this work, the task of constructing global LTCs is reduced to the task of constructing LTCs on the local structure, which appears to be a much more reasonable task.

This work can be viewed as providing a generic scheme for constructing an LTC on a high-dimensional expander (or an agreement expander), and the (big) missing ingredient is an appropriate instantiation. We comment that the flagship example of an LTC, namely Reed-Muller codes, can be viewed as an instantiation of this scheme, with the underlying agreement expander being the Grassmannian complex and the base code being the Reed-Solomon code (see Section 5). The hope is that replacing the “dense” Grassmannian complex by a bounded-degree complex, together with finding an appropriate base code, could potentially lead to a $c^3$-LTC.
Tanner Codes.
To elucidate the main result, we begin by recalling a well-studied family of codes, the Tanner codes [Gal60, Tan81]. A Tanner code $C \subseteq \{0,1\}^n$ is given by a family of (small, often constant-sized) subsets $t_1, \ldots, t_m \subset [n]$ and, for each subset, a base code $C_{t_i} \subset \{0,1\}^{t_i}$. A string $w \in \{0,1\}^n$ is in the code $C$ if $w|_{t_i} \in C_{t_i}$ for each $i$. Equivalently, a Tanner code is described on a bipartite graph (called the Tanner graph) with $n$ right vertices corresponding to the coordinates of the code and $m$ left vertices corresponding to the sets $t_i$, with an edge between $v$ and $t_i$ if $v \in t_i$. Many known codes, including Reed-Muller codes, lifted codes, tensor codes, and expander codes, are in fact Tanner codes. In all of these cases, there is a single base code $C_0$ such that $C_{t_i} = C_0$ for all $i$, but this need not be the case.

The Tanner representation of a code also gives a natural candidate for a local test for checking whether a given word $w \in \{0,1\}^n$ is in the code.

Natural Tanner Test: Choose a random $i \in [m]$ and accept iff $w|_{t_i} \in C_{t_i}$.
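The natural Tanner test can be sketched in a few lines of Python. This is a minimal illustration on a toy instance, not a construction from the paper: the helper names (`restrict`, `natural_tanner_test`, `rejection_probability`), the three constraints, and the even-parity base code are all assumptions chosen for the example.

```python
import random
from fractions import Fraction


def restrict(word, subset):
    """Return the restriction w|_t as a tuple of symbols."""
    return tuple(word[i] for i in subset)


def natural_tanner_test(word, constraints, rng=random):
    """One run of the test: pick a random constraint t, accept iff w|_t is in C_t."""
    subset, base_code = rng.choice(constraints)
    return restrict(word, subset) in base_code


def rejection_probability(word, constraints):
    """Exact probability that the test rejects, for uniformly chosen constraints."""
    violated = sum(restrict(word, t) not in base for t, base in constraints)
    return Fraction(violated, len(constraints))


# Hypothetical toy instance: n = 4, three constraints, even-parity base code.
EVEN_PARITY = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}
constraints = [
    ((0, 1, 2), EVEN_PARITY),
    ((1, 2, 3), EVEN_PARITY),
    ((0, 2, 3), EVEN_PARITY),
]

print(rejection_probability((0, 0, 0, 0), constraints))  # codeword: never rejected
print(rejection_probability((1, 0, 0, 0), constraints))  # one flipped coordinate
```

In this toy instance, flipping coordinate 0 of the all-zero codeword violates exactly the two constraints that contain coordinate 0, so the rejection probability is 2/3.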
We say that $C$ is $\rho$-locally-testable with the natural tester if $\rho \cdot \mathrm{dist}(w, C) \leqslant \mathbb{P}[\text{Test fails}]$ for every word $w$.

A family of codes is a locally testable code (LTC) if it satisfies the above inequality for some test (not necessarily the natural Tanner test) with a constant $\rho$ (that does not decrease with the block length of the code).

Many Tanner codes, including expander codes and random LDPC codes, that are very good in terms of rate and distance, and can be characterized by “low density” constraints (that look at only a constant number of bits in the codeword), fail quite miserably at being LTCs [BHR05].

Imagine that in addition to $T = \{t_1, \ldots, t_m\}$ we also have a family $S$ of subsets of $[n]$, such that each $s \in S$ has constant size, but slightly larger than the size of the $t_i$'s. For each such $s \in S$ we consider the ‘local’ Tanner code
$C_s = \{ w \in \{0,1\}^s \mid w|_t \in C_t, \ \forall t \in T, \ t \subset s \}$.
(Of course, $C_s$ is non-trivial only if there are some $t \in T$ contained in $s$.)

In this work, we show that if for each $s \in S$, the code $C_s$ itself is locally testable with the natural Tanner test, then the code $C$ too must be locally testable with respect to the natural Tanner test. This holds as long as we assume some nice structure on the families $S$ and $T$, namely that they are part of a “multi-layered agreement sampler”, MAS for short, which is described below.

Let us change point of view and look at the codes $\{C_s\}$ as a collection of base codes, giving rise to the Tanner code $C$. Our main result is that local testability of the base codes $C_s$ lifts to local testability of the entire code $C$, assuming an expander-like MAS condition on the underlying Tanner graph. This is analogous to the celebrated expander codes [SS96] in which distance of the base codes gets lifted to distance of the entire code, assuming expansion of the underlying Tanner graph. Whereas expansion alone does not suffice for local testability, the MAS structure does.

High-dimensional expanders and Agreement Expanders.
There are several interesting and non-equivalent definitions for high-dimensional expanders, the two main ones being topological definitions of coboundary or cosystolic expansion [LM06, Gro10, DKW18], and, more relevant to this work, random walk definitions either locally at the link level [KM17, DK17] or globally [DDFH18, KO18]. Without going into details, high-dimensional expansion has already been shown to imply some surprising local-to-global theorems. For example, the trickling down theorem of [Opp18] proves global spectral expansion using local spectral expansion in the links (which are the neighborhoods of individual vertices). Another example is the list decoding of [DHKNT19] which deduces global list decoding from list decoding on the local pieces.

Yet another example, which is crucial for this work, is that high-dimensional expanders give rise to agreement expanders [DK17, DD19]. An agreement expander allows one to stitch together many mostly-consistent local functions into a single global function. We elaborate a little more on this notion. Let $V$ be a ground set of $n$ elements, and let $S$ be a collection of subsets of $V$ of some fixed size. Let $A$ be a graph whose vertices are the subsets in $S$, and each edge $\{s_1, s_2\}$ is labeled by a subset $k \subset s_1 \cap s_2$. Let $K$ be the collection of subsets labelling the edges. An agreement expander is given by $V$, $K$, $S$ and the edge-labelled graph $A$.

Suppose that for each $s \in S$ we are given a local function $f_s \in \{0,1\}^s$. The agreement value of the ensemble $\{f_s\}$ is the probability that $f_{s_1}|_k = f_{s_2}|_k$ for a randomly chosen edge $\{s_1, s_2\}_k$ in the graph $A$ (this is notation for an edge between $s_1$ and $s_2$ labelled by $k$). Whenever there is a global function $F : V \to \{0,1\}$ such that $f_s = F|_s$ for all $s \in S$, the agreement value of $\{f_s\}$ is clearly 1. We say that $(V, K, S, A)$ is an $\alpha$-agreement expander if whenever an ensemble has agreement value $1 - \varepsilon$ there exists a global function $F : V \to \{0,1\}$ such that $f_s = F|_s$ for all but at most an $\varepsilon/\alpha$ fraction of $s \in S$. (See Section 2.5 for the full definition.)

Agreement expanders have been studied and used in the LTC and PCP literature for years (under different names such as direct product tests or sometimes low degree tests). However, prior to the recent connection with high-dimensional expanders, the only known agreement expanders were relatively dense. The existence of sparse such objects seems promising and could potentially lead to LTCs with positive rate. This work shows how agreement expansion can be useful for constructing LTCs.

Multilayered Agreement Samplers (MAS).
We now describe the MAS combinatorial structure needed for our LTC scheme. Let $V$ be a ground set of $n$ elements, and let $T, K, S$ be three families of subsets of $V$ of sizes $q_1 < q_2 < q_3$. The system $(V, T, K, S)$ is said to be a $(\lambda, \alpha)$-MAS if the following two conditions are met.
– $V, K, S$ are part of an $\alpha$-agreement-expander.
– The bipartite containment graph of $T$ vs. $K$ is a $\lambda$-sampler.
The above definition is stricter than what we actually need; see Definition 3.1 for the formal, more refined definition. We are now ready to state our main result.

Main Result.
Let $V, T, K, S$ be a $(\lambda, \alpha)$-MAS. Suppose that for each $t \in T$ we have a local code $C_t \subset \{0,1\}^t$. Let $C \subset \{0,1\}^n$ be the Tanner code defined by $\{C_t\}$ for all $t \in T$. Namely,
$C := \{ w \in \{0,1\}^V \mid w|_t \in C_t \text{ for every } t \in T \}$.
Similarly, for each $s \in S$, let $C_s$ be the Tanner code defined by $\{C_t \mid t \subset s\}$, namely,
$C_s = \{ w \in \{0,1\}^s \mid w|_t \in C_t \text{ for every } t \subset s \}$,
and similarly define for each $k \in K$,
$C_k = \{ w \in \{0,1\}^k \mid w|_t \in C_t \text{ for every } t \subset k \}$.

Theorem 1.1.
Let $V, T, K, S$ be layers in a $(\lambda, \alpha)$-MAS satisfying $\lambda \leqslant \rho\delta\alpha/[\ldots]$. Suppose $C_k \subset \{0,1\}^k$ has relative distance $\delta$ for all $k \in K$ and suppose that $C_s$ is $\rho$-locally testable with the natural Tanner tester. Then $C$ is $\rho\delta\alpha/[\ldots]$-locally testable (with the natural Tanner tester).

We state our full main theorem in Theorem 4.1.
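To make the objects in the main result concrete, the following Python sketch brute-forces the codes $C$, $C_s$ and $C_k$ on a tiny set system. Everything here is an illustrative assumption: the function names (`local_lift`, `relative_distance`), the choice of $T$ as all pairs of a 4-element ground set, and the repetition base code are hypothetical, and exhaustive enumeration is only feasible for toy sizes.

```python
from itertools import combinations, product


def local_lift(domain, base_codes, alphabet=(0, 1)):
    """All words on `domain` whose restriction to every t contained in `domain` lies in C_t."""
    domain = tuple(sorted(domain))
    inside = [t for t in base_codes if set(t) <= set(domain)]
    words = []
    for values in product(alphabet, repeat=len(domain)):
        w = dict(zip(domain, values))
        if all(tuple(w[v] for v in t) in base_codes[t] for t in inside):
            words.append(values)
    return words


def relative_distance(code):
    """Minimum relative Hamming distance between distinct codewords (needs >= 2 codewords)."""
    n = len(code[0])
    return min(sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2)) / n


# Hypothetical toy instance: V = {0,1,2,3}, T = all pairs, C_t = repetition code.
base_codes = {t: {(0, 0), (1, 1)} for t in combinations(range(4), 2)}

C = local_lift(range(4), base_codes)      # the global Tanner code
C_k = local_lift((0, 1, 2), base_codes)   # a local code on a 3-element set k
print(C, C_k, relative_distance(C_k))
```

In this toy instance every local code is a repetition code, so $C_k$ has relative distance 1 and the global code consists of the two constant words.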
Overview of proof.
Our proof of local testability, like previous proofs of testability, goes via self-correction. The main difficulty in our setting is that a single round of self-correction is insufficient to correct the word. Let $w$ be a word that satisfies a $(1 - \varepsilon)$-fraction of the constraints in the Tanner graph. We would like to show that there exists a $w^* \in C$ such that $\mathrm{dist}(w, w^*) = O(\varepsilon)$. For specific codes, one could use the properties of the code to perform this self-correction (for example, in Reed-Muller testing one could use the properties of polynomials).

However, we cannot resort to such properties since we are working in an abstract setting. Instead, we rely on simple majority decoding: each vertex takes a value that satisfies the majority of the constraints it participates in. The main engine driving our proof is agreement expansion. Our proof strategy is as follows: construct a word $w'$ from the received word $w$ via self-correction (or otherwise) and show
(a) $w'$ is close to $w$, and
(b) $w'$ is a valid codeword.
Property (a) is easy to show if $w'$ is constructed via self-correction using majority decoding. Property (b) is not very hard in the context of Hadamard testing and Reed-Muller testing: every vertex participates in a constraint with every other vertex (indeed the diameter of the Tanner graph is a constant), hence one round of self-correction results in a valid codeword $w'$. However, since our proof is general enough to work even for constant-degree Tanner graphs wherein the diameter can be as large as logarithmic, one does not expect a single step of self-correction via majority decoding to yield a codeword.

Our proof instead relies on a novel iterative self-correction procedure that slowly corrects a given word in logarithmically many iterations. A standard problem that arises when using iterative procedures is that the error grows linearly in the number of iterations, which is prohibitively expensive in our setting. We use the properties of MAS to show that the number of constraints left unsatisfied by the self-corrected word $w'$ reduces by a constant factor in each iteration. This allows us to perform an arbitrary number of rounds in the iterative self-correction procedure till we reach a perfect codeword $w^* \in C$ (actually a logarithmic number of rounds will suffice). This type of argument is new in the context of locally testable codes.

Given this we can proceed with the proof overview as follows. Since $w$ satisfies a $(1 - \varepsilon)$-fraction of the constraints, an averaging argument shows that a $(1 - O(\varepsilon))$-fraction of the $s$'s satisfy most of the constraints within them. Hence, by the local testability of the code $C_s$ we get that for most $s$'s, $w|_s$ is close to a local codeword, say $w_s \in C_s$. Furthermore, it is not hard to show that these local codewords satisfy that for a typical $k \in K$ and $s_1, s_2 \in S$ such that $k \subset s_1 \cap s_2$, we have $w_{s_1}|_k \equiv w_{s_2}|_k$. In other words, the $w_s$'s satisfy the hypothesis of the agreement test. From the agreement expansion of the MAS, there exists a “global” word $w'$ that explains most of the $w_s$'s. Furthermore, it is not hard to show that $w'$ is close to the original word $w$. We then use the sampler property of the MAS to show that $w'$ violates significantly fewer constraints than $w$ (in particular, $w'$ violates at most an $\varepsilon/[\ldots]$ fraction of the constraints). Iterating this step yields a sequence of words $w^{(0)} := w, w^{(1)}, w^{(2)}, \ldots$ such that $w^{(i)}$ violates at most an $\varepsilon/[\ldots]^{i}$-fraction of constraints and $\mathrm{dist}(w^{(i)}, w^{(i+1)}) = O(\varepsilon)/[\ldots]^{i}$. Since the fraction of violated constraints cannot decrease indefinitely, we have that eventually for a large enough $i$, $w^* := w^{(i)} \in C$ and $\mathrm{dist}(w, w^*) \leqslant \sum_{j=0}^{i-1} \mathrm{dist}(w^{(j)}, w^{(j+1)}) = O(\varepsilon)$.

Relation to previous work.
We begin by recalling the history of LTCs and the close connection between PCP and LTC constructions. LTCs were first studied in the context of program checking by Blum, Luby and Rubinfeld [BLR93] and Gemmell et al. [GLRSW91]. The notion of LTCs is implicit in the work on locally checkable proofs by Babai et al. [BFLS91] and subsequent works on PCPs. The explicit definition appeared independently in the works of Rubinfeld and Sudan [RS96], Friedl and Sudan [FS95], Arora's PhD thesis [Aro94] and Spielman's PhD thesis [Spi95]. A formal study of LTCs was initiated by Goldreich and Sudan [GS06]. Most known constructions of PCPs yield LTCs with similar parameters. In fact, there is a generic transformation to convert a PCP of proximity (which is a PCP with more requirements) into an LTC with comparable parameters [BGHSV06, Tre04]. See a survey by Goldreich [Gol10] for the interplay between PCP and LTC constructions. In fact, the current best construction of LTCs (constant-query, constant fractional distance and inverse polylogarithmic rate) is obtained from the PCP constructions of Ben-Sasson and Sudan [BS08] and Dinur [Din07]. PCP-based constructions are unlikely to yield LTCs with constant rate since PCP constructions typically involve at least a logarithmic overhead. Nevertheless, LTC constructions that are not derived from PCPs perhaps have a better chance at achieving the coding-theory gold standard of positive rate and distance.

Agreement expansion and the multilayered set system structure play a central role in our proof of local testability. Another application of agreement expansion towards local testability was studied in [DHKRZ19], where it was used to enhance the local testability of a code in the context of the subspaces (Grassmannian) complex. We remark that the use of such multilayered agreement samplers in the context of locally testable codes is actually implicit in many previous constructions of locally testable codes. The Raz-Safra [RS97] proof of the local testability of the Reed-Muller codes works with the points-lines-planes structure, a subgraph of the Grassmannian complex which is an excellent agreement expander as explained in detail in Section 5. The original proof due to Blum, Luby and Rubinfeld [BLR93] (as well as subsequent improvements due to Coppersmith) of the local testability of the Hadamard codes, as well as Kaufman and Sudan's proof of testability of affine-invariant codes [KS08], relies on the three-layered structure comprising the points, the three-point tests and certain nine-point sets, sometimes referred to as "magic squares" [KS08].

Our proof makes explicit this use of MAS to construct LTCs and shows that four-layered MAS are sufficient to transform “local” local testability to “global” local testability. In this sense, our proof can be viewed as bringing together these seemingly different proofs of local testability under a common umbrella.

We already remarked that our construction has a similar paradigm as the Sipser-Spielman construction of expander codes [SS96], which demonstrates that if the base code has good distance then the Tanner code also has good distance provided the graph is an expander. Another construction of the same flavor is the result of Dinur et al. [DHKNT19] that demonstrates that if the local code is efficiently list-decodable then so is the global code defined by ABNNR distance amplification via an expander [ABNNR92], provided the expander is part of a large high-dimensional expander.
Further Discussion and Future Work.
This work gives a general scheme for constructing an LTC. It needs to be instantiated with an appropriate MAS and base codes. As mentioned earlier, and explained in detail in Section 5, one such instantiation is to choose the Grassmannian complex as the MAS, and the Reed-Solomon code as the base code. This gives the well-studied locally testable codes called Reed-Muller codes, as well as the more recent so-called lifted codes.

The most interesting direction is to instantiate this scheme with an MAS that comes from some bounded-degree high-dimensional expander, and to combine it with an appropriate choice of locally testable base code. The main hurdle in choosing the base codes is to be able to certify that the resulting Tanner code maintains positive rate. In some similar situations this is done by a simple counting of the number of constraints. However, such an argument cannot work in the setting of LTCs, and we leave it as an open question.
Let $\Sigma$ be some finite set. A code is some $C \subseteq \Sigma^n$. Let $p$ be a prime power and let $\Sigma = \mathbb{F}_p$, so that $\Sigma^n = \mathbb{F}_p^n$ is an $n$-dimensional vector space over a field with $p$ elements. We say that $C$ is a linear code when $C$ is a subspace of $\mathbb{F}_p^n$. The rate of the code is $\mathrm{rate}(C) = \frac{\log_p |C|}{n}$.

It is convenient to think about $\mathbb{F}_p^n$ as functions $f : [n] \to \mathbb{F}_p$. The distance between two functions $f, g : [n] \to \Sigma$, denoted by $\mathrm{dist}(f, g)$, is the fraction of $x \in [n]$ such that $f(x) \neq g(x)$. The distance of a code is defined to be $\mathrm{dist}(C) = \min_{f, g \in C,\, f \neq g} \mathrm{dist}(f, g)$. When $C$ is linear, this is the same as $\min_{0 \neq f \in C} \mathrm{dist}(f, 0)$.

A Tanner code [Gal60, Tan81] over an alphabet $\Sigma$ (also called a lifted code) is defined through two objects: a family $T$ of $q$-element subsets of $[n]$, and, for each subset $t \in T$, a base code $C_t \subset \Sigma^t$. The code $C \subseteq \Sigma^n$ is given by
$C = \{ w \in \Sigma^n \mid w|_t \in C_t, \ \forall t \in T \}$.
The family $T$ is often described through a bipartite graph on vertex sets $[n]$ and $T$ connecting $t \in T$ to $i \in [n]$ whenever $i \in t$. Several well-known families of codes can be constructed as Tanner codes, including tensor codes, Reed-Muller codes, and the codes considered by Sipser and Spielman [SS96]. A family of Tanner codes that is especially related to our context is the family of so-called lifted codes. Lifted codes were first introduced by Ben-Sasson, Maatouk, Shpilka and Sudan [BMSS11] and their local testability was studied by Guo, Kopparty and Sudan [GKS13]. These codes can be described as Tanner codes where $[n]$ is identified with points of a vector space and the family $T$ contains all possible affine subspaces of a prescribed dimension $m$. The base code $C_t$ is taken to be affine invariant. A prime example for such codes is the Reed-Muller code.

A $(Q, \rho)$-local tester for the code $C$ is a probabilistic oracle algorithm that determines whether a word is in the code. It does the following: given oracle access to a function $f : [n] \to \Sigma$, it queries $f$ at $Q$ input locations. Then if $f \in C$ it accepts with probability 1. If $f \notin C$ it rejects with probability at least $\rho \cdot \mathrm{dist}(f, C)$. Here $\rho \in (0, 1)$ is some constant parameter, and $\mathrm{dist}(f, C)$ is the distance between $f$ and the closest codeword to it in $C$.

For linear codes $C$, [BHR05] showed that without loss of generality, we can assume that the local tester picks a random subset $t \subset [n]$ according to some distribution, and accepts if and only if $w|_t \in C|_t$ (that is, there exists some codeword $w' \in C$ such that $w'|_t = w|_t$). Thus we formally define locally testable codes as follows:

Definition 2.1 (Locally Testable Codes). Let $V$ be some finite set, and $C$ be some linear code on $V$. Let $D$ be some distribution on subsets of $V$, and suppose every set $t \sim D$ is of size at most $Q$. Let $\rho > 0$. We say $C$ is $(Q, \rho)$-testable with respect to $D$ if
$\rho \cdot \mathrm{dist}(f, C) \leqslant \mathbb{P}_{t \sim D}[f|_t \notin C|_t]$.

An alternate way of describing a locally testable code is using the Tanner graph $G = ([n], T, E)$ representation of a code. In this representation, $[n]$ corresponds to the $n$ input locations in the codeword, and $T$ corresponds to the subsets of indices that are queried by the local tester. We connect $i \in [n]$ and $t \in T$ if $i \in t$. A local tester that corresponds to this representation picks a random constraint $t \in T$ and checks if the corresponding constraint is satisfied.

Let $G = (U, V, E)$ be a bipartite graph, and assume that each edge carries a non-negative weight $p_{uv}$ such that $\sum_{uv \in E} p_{uv} = 1$. This probability distribution induces a marginal probability distribution on $U$, and similarly on $V$, given by $p_u = \sum_{v :\, uv \in E} p_{uv}$. For every set $B \subseteq U$ (and $V$ respectively) we denote $\mathbb{P}[B] = \mathbb{P}_{u \in U}[u \in B]$. As a slight abuse of notation, for a set $B \subseteq U$ and a vertex $v \in V$ we denote $\mathbb{P}[B \mid v] = \mathbb{P}_{u'v' \in E}[u' \in B \mid v' = v]$.

A sampler graph is a graph where for all $B \subseteq U$, most of the vertices $v \in V$ have that $\mathbb{P}[B] \approx \mathbb{P}[B \mid v]$.

Definition 2.2 ($\lambda$-sampler). Let $G = (V, U, E)$ be a bipartite graph. For any $B \subset U$ and $\delta > 0$, we define
$N = N(B, \delta) = \{ v \in V \mid \mathbb{P}_{u \in U}[u \in B \mid u \sim v] > \mathbb{P}[B] + \delta \}$.
For $\lambda \in (0, 1)$ we say $G$ is a $\lambda$-sampler if for every $B \subseteq U$ and every $\delta > 0$,
$\mathbb{P}[N] \leqslant \frac{\lambda}{\delta^{[\ldots]}} \, \mathbb{P}[B]$.
There is a tight connection between expander bipartite graphs and sampler graphs. For more on this, see [Gol11].

Let $V$ be a finite universe, $S$ a collection of subsets of $V$, and for each subset $s \in S$, a local function $f_s \in \Sigma^s$. An ensemble $\{f_s\}$ is perfectly global if it comes from a single global function $w : V \to \Sigma$, namely, $f_s = w|_s$ for all $s$. We denote by $\mathcal{G}$ the collection of all perfectly global ensembles. An agreement tester is given by a non-negatively weighted graph $A$ with vertex set $S$, such that each edge $\{s_1, s_2\}$ is labelled by some $k \subseteq s_1 \cap s_2$. Without loss of generality we require that the weights sum to 1, so that the edges form a distribution over pairs $s_1, s_2$. Given a collection $\{f_s\}$ of local functions, the tester selects an edge $\{s_1, s_2\}_k$ at random, and accepts if $f_{s_1}(v) = f_{s_2}(v)$ for each $v \in k$. We call the acceptance probability the value of $\{f_s\}$ under $A$ and denote it by $A(\{f_s\})$:
$A(\{f_s\}) := \mathbb{P}_{\{s_1, s_2\}_k \sim A}[f_{s_1}(v) = f_{s_2}(v), \ \forall v \in k]$.
It is clear that a perfectly global ensemble has value 1. Indeed, for any pair $s_1, s_2$ and any $v \in s_1 \cap s_2$, $f_{s_1}(v) = g(v) = f_{s_2}(v)$, assuming that $g : V \to \Sigma$ is the global function that agrees with $\{f_s\}$. The graph $A$ is an agreement expander if a robust converse holds, namely any ensemble with $A(\{f_s\}) \approx 1$ is close to a perfectly global ensemble.

Definition 2.3.
Let $V, K, S, A$ be as above. We call $A$ an $\alpha$-agreement expander if for every ensemble of local functions $\{f_s\}$,
$\alpha \cdot \mathrm{dist}(\{f_s\}, \mathcal{G}) \leqslant 1 - A(\{f_s\})$, (2.1)
where the distance $\mathrm{dist}(\{f_s\}, \mathcal{G})$ is the distance between $\{f_s\}$ and the closest perfectly global ensemble, and the distance between two ensembles $\{f_s\}, \{g_s\}$ is defined as the probability that $f_s \neq g_s$ when $s$ is chosen from the marginal distribution of $A$.

More refined notions of agreement-expansion.
We also say that $A$ is an agreement expander with respect to $\delta$-ensembles if (2.1) holds for every ensemble $\{f_s\}$ that is a $\delta$-ensemble, namely, such that for every edge $\{s_1, s_2\}_k$ in the graph $A$, we have either $f_{s_1}|_k = f_{s_2}|_k$ or else the Hamming distance between $f_{s_1}|_k$ and $f_{s_2}|_k$ is at least $\delta |k|$.

Furthermore, we allow a slightly weaker notion of distance from being perfectly global. Suppose that for every $s \in S$ there is a distribution $k \sim D_s$ that samples $k \in K$ that are subsets of $s$. We say that $A$ is $(K, \alpha)$-sound with respect to $\{f_s\}$ when
$\alpha \cdot \min_{G \in \mathcal{G}} \mathbb{P}_{s \in S,\, k \sim D_s}[f_s|_k \neq G|_k] \leqslant 1 - A(\{f_s\})$. (2.2)
We say that $A$ is $(K, \alpha)$-sound with respect to $\delta$-ensembles if (2.2) holds for all $\delta$-ensembles.

The structure we use to construct locally testable codes has a sampler component and an agreement expander component, that sit together in four layers. We call these structures Multilayer Agreement Samplers.

Figure 1: Multilayer Agreement Sampler
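The agreement value $A(\{f_s\})$ and the $\delta$-ensemble condition defined above can be computed directly on small examples. The Python sketch below is an illustration under stated assumptions: the representation of an ensemble as a dictionary of dictionaries, the function names (`agreement_value`, `is_delta_ensemble`), and the three-set toy test graph are all hypothetical.

```python
from fractions import Fraction


def agreement_value(ensemble, edges):
    """Probability that a random labelled edge {s1, s2}_k has f_s1|_k == f_s2|_k.

    `edges` is a list of (s1, s2, k, weight) tuples whose weights sum to 1.
    """
    total = Fraction(0)
    for s1, s2, k, weight in edges:
        if all(ensemble[s1][v] == ensemble[s2][v] for v in k):
            total += weight
    return total


def is_delta_ensemble(ensemble, edges, delta):
    """On every edge, the two restrictions to k either agree or differ on >= delta*|k| points."""
    for s1, s2, k, _ in edges:
        differing = sum(ensemble[s1][v] != ensemble[s2][v] for v in k)
        if 0 < differing < delta * len(k):
            return False
    return True


# Hypothetical toy test graph on V = {0,1,2,3} with three sets and uniform edge weights.
S = [(0, 1, 2), (1, 2, 3), (0, 2, 3)]
third = Fraction(1, 3)
edges = [
    ((0, 1, 2), (1, 2, 3), (1, 2), third),
    ((1, 2, 3), (0, 2, 3), (2, 3), third),
    ((0, 1, 2), (0, 2, 3), (0, 2), third),
]

F = {0: 0, 1: 1, 2: 1, 3: 0}                       # a global function
perfect = {s: {v: F[v] for v in s} for s in S}     # perfectly global ensemble
corrupted = {s: dict(f) for s, f in perfect.items()}
corrupted[(1, 2, 3)][1] ^= 1                       # flip one local value

print(agreement_value(perfect, edges), agreement_value(corrupted, edges))
```

The perfectly global ensemble has value 1, while the single flipped value breaks agreement only on the edge whose label contains vertex 1, giving value 2/3 in this toy graph.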
Definition 3.1 (Multilayer Agreement Samplers (MAS)). Let $\delta, \lambda, \alpha \geqslant 0$. Let $V$ be a set of elements, and $T, K, S$ be families of subsets of $V$ so that there is a non-degenerate Markov chain that samples $(v, t, k, s)$ from $V, T, K, S$ respectively, so that $v \in t \subset k \subset s$. (Spelling out the Markov chain requirement, we have a distribution over $(v, t, k, s)$ such that the choice of $t$ is conditioned only on $v$, the choice of $k$ is conditioned only on $t$, and finally the choice of $s$ is conditioned only on $k$.)
We say that $(V, T, K, S)$ is a $(\lambda, \delta, \alpha)$-MAS if
1. There is an agreement expander $A$ with vertex set $S$ and edge labels $K$ so that:
– The marginal distribution of sampling a labeled edge $\{s_1, s_2\}_k$ in $A$, and returning $s_1, k$, is the same as the marginal distribution of $s, k$ in the Markov chain.
– $A$ is $(K, \alpha)$-sound for $\delta$-ensembles.
2. The bipartite containment graph of $K$ vs. $T$ is a $\lambda$-sampler. Here the probability of sampling an edge $(k, t)$ is the probability of sampling $(k, t)$ together in the Markov chain.

A natural example for an MAS is the Grassmannian complex, that is, a four-layer structure where $V = \mathbb{F}_p^n$ and $T, K, S$ are affine subspaces of $\mathbb{F}_p^n$ of fixed dimensions. We elaborate on this example in the subsection below. The Grassmannian complex is dense, that is, the number of subspaces grows exponentially with the dimension. No known codes on the Grassmannian complex have good rate.

Currently known constant-degree MASs arise from high-dimensional expanders which are simplicial complexes. However, we cannot use MASs that are directly simplicial complexes to construct any code with non-trivial rate and distance. It is conceivable that high-dimensional expanders that are not simplicial complexes may yield good LTCs. (Such MASs can still arise from high-dimensional expanders, for example an MAS whose subsets are links of a high-dimensional expander.)

The set system for the Grassmannian MAS corresponds to points and affine subspaces of a vector space. Formally, let $p$ be some prime power, and $q_1 < q_2 < q_3 < n$ be some integers greater than 0. Our ground set is $V = \mathbb{F}_p^n$, and over it we define the following set system $X = (V, T, K, S)$ where $T, K, S$ consist of all affine subspaces of dimensions $q_1$, $q_2$ and $q_3$ respectively. The Markov process of this set system is sampling $(v \in t \subseteq k \subseteq s)$ uniformly.

The edge distribution of the test graph, $\{s_1, s_2\}_k \sim A$, is to sample a subspace $k \in K$, and then two subspaces $s_1, s_2 \in S$ independently, given that $s_1, s_2 \supset k$. We call this the $q_2, q_3$-agreement test.

We claim that this set system is an MAS:

Lemma 3.2.
There is a universal constant $\alpha > 0$ so that the following holds. Let $q_1 < q_2 < q_3 < n$ be as above, and assume $q_3 \geqslant q_2 + [\ldots]$. Let $p$ be any prime power. Let $X = (V, T, K, S)$ be as above. Then $X$ is a $(p^{q_1 - q_2}, \delta, \delta\alpha)$-MAS for every $\delta > 0$. The constant $\alpha$ above does not depend on $p$, nor on $q_1, q_2, q_3, n$.

Proof.
The sampling properties of the layers of a Grassmannian complex are folklore:

Fact 3.3.
Let $G = (K, T, E)$ be the graph where $K$ are the subspaces of $\mathbb{F}_p^n$ of dimension $q_2$ and $T$ are the subspaces of dimension $q_1$, and $(t, k) \in E$ if $t \subset k$, with uniform weights. This graph is a $p^{-|q_1 - q_2|}$-sampler.

Agreement of the $q_2, q_3$-agreement test graph was proven by [DD19] (Theorem 6.2).

Theorem 3.4 (Agreement for Grassmannian). There exists a constant $\alpha > 0$ such that for every prime power $p$, $\delta > 0$, and integers $q_2, q_3, n$ such that $q_2 + [\ldots] < q_3 \leqslant n$ the following holds. The $q_2, q_3$-Grassmannian agreement test is $(K, \delta\alpha)$-sound for $\delta$-ensembles.

Combining these two statements together we get that there exists some $\alpha > 0$ such that for every $\delta > 0$, the system $(V, T, K, S)$ defined above is a $(p^{q_1 - q_2}, \delta, \delta\alpha)$-MAS. □

In Section 5 we use our main theorem, Theorem 4.1, to show that local testability of lifted codes on the Grassmannian complex is implied by the local testability of the base code.
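As a small sanity check of the sampler notion on a Grassmannian-style containment graph, the Python sketch below enumerates the affine lines and affine planes of $\mathbb{F}_2^3$ and computes the set $N(B, \delta)$ of planes that see too many "bad" lines. This is only an illustration under assumptions: the encoding of points as integers 0..7 with XOR as addition, the function names, and the choice of $B$ are hypothetical, and the code does not verify any particular sampler bound.

```python
from fractions import Fraction
from itertools import combinations


def affine_lines_and_planes():
    """Affine lines (2 points) and affine planes (4 points) of F_2^3, points encoded as 0..7."""
    points = range(8)
    lines = {frozenset(pair) for pair in combinations(points, 2)}
    planes = set()
    for a in points:
        # Over F_2, any two distinct nonzero vectors x, y are linearly independent.
        for x, y in combinations(range(1, 8), 2):
            planes.add(frozenset({a, a ^ x, a ^ y, a ^ x ^ y}))
    return sorted(lines, key=sorted), sorted(planes, key=sorted)


def oversampled(lines, planes, bad, delta):
    """Return N(B, delta) and P[B] for the line/plane containment graph.

    The graph is biregular, so uniform edge weights give uniform marginals on both sides.
    """
    p_bad = Fraction(len(bad), len(lines))
    heavy = []
    for k in planes:
        inside = [t for t in lines if t <= k]
        frac = Fraction(sum(t in bad for t in inside), len(inside))
        if frac > p_bad + delta:
            heavy.append(k)
    return heavy, p_bad


lines, planes = affine_lines_and_planes()
bad = {t for t in lines if t <= frozenset({0, 1, 2, 3})}  # all lines inside one plane
heavy, p_bad = oversampled(lines, planes, bad, Fraction(1, 2))
print(len(lines), len(planes), p_bad, heavy)
```

Here only the plane $\{0,1,2,3\}$ itself sees a bad fraction above $\mathbb{P}[B] + 1/2$; every other plane meets it in at most one line, so $N(B, 1/2)$ has measure $1/14$.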
Given an MAS $(V, T, K, S)$ and a set of base codes $\{C_t \mid t \in T\}$, the lifted code to $V$ is
$$C = \{\, w : V \to \Sigma \mid w|_t \in C_t,\ \forall t \in T \,\}.$$
Similarly, for every $s \in S$ or $k \in K$, the local lifts to $s$ or $k$ are
$$C_s = \{\, w : s \to \Sigma \mid w|_t \in C_t,\ \forall t \subseteq s \,\}, \qquad C_k = \{\, w : k \to \Sigma \mid w|_t \in C_t,\ \forall t \subseteq k \,\}.$$
The next theorem is a reformulation of Theorem 1.1.

Theorem 4.1 (Main). Let $V$ be a finite set and $\rho, \delta, \lambda, \alpha \geq 0$ so that $\lambda \leq \rho\delta\alpha$. Let $X = (V, T, K, S)$ be a $(\lambda, \delta, \alpha)$-MAS. Let $\{C_t \mid t \in T\}$ be a set of base codes, and let $C$ be the lifted code. Suppose that:
1. Local distance: $C_k$ has distance $\delta$ for every $k \in K$.
2. Local local testability: For every $s \in S$, the code $C_s$ is $\rho$-locally testable with respect to sampling $t \in T$ given that $t \subset s$.
Then $C$ is $\rho\delta\alpha$-locally testable with respect to the distribution of choosing $t \in T$.

We encourage the readers to think of $\lambda, \alpha$ as some fixed constants of the set system. Then the theorem states that if $\{C_k\}$ have large relative distance $\delta = \Omega(1)$, and $\{C_s\}$ are $\rho$-locally testable for a large enough $\rho$, then the lifted code is $\Omega(\rho)$-locally testable.

4.1 Proof of the Main Theorem

Proof of Theorem 4.1.
Let $w : V \to \Sigma$ be some word so that $\mathrm{Fail}(w) \stackrel{\mathrm{def}}{=} \Pr_{t \in T}[w|_t \notin C_t] = \varepsilon$. We need to find a codeword $w^* \in C$ so that $\mathrm{dist}(w, w^*) \leq \frac{\varepsilon}{\rho\delta\alpha}$. We will find a word $w' : V \to \Sigma$ so that $\mathrm{dist}(w, w') \leq \frac{\varepsilon}{\rho\delta\alpha}$ and $\mathrm{Fail}(w') = \Pr_{t \in T}[w'|_t \notin C_t] \leq \frac{\varepsilon}{2}$.

As a first step we define a function ensemble $\{f_s \mid s \in S\}$ so that $f_s \in C_s$ is the closest codeword to $w|_s$ (ties broken arbitrarily). For each $k \subset s_1, s_2$, both $f_{s_1}|_k \in C_k$ and $f_{s_2}|_k \in C_k$, and since $C_k$ is a code with relative distance $\delta$, we get that $\{f_s\}$ is a $\delta$-ensemble.

We claim that the ensemble passes the agreement test with high probability.

Claim 4.2. $\Pr_{\{s_1, s_2\}_k \sim \mathcal{A}}\left[f_{s_1}|_k = f_{s_2}|_k\right] \geq 1 - \frac{\varepsilon}{\rho\delta}$.

As there is an agreement expander $\mathcal{A}$ that is $(K, \alpha)$-sound with respect to $\delta$-ensembles, there exists some function $w' : V \to \Sigma$ so that
$$\Pr_{k \subset s}\left[w'|_k = f_s|_k\right] \geq 1 - \frac{\varepsilon}{\rho\delta\alpha}. \tag{4.1}$$
We claim that $w'$ is close to $w$, and that $w'$ fails the test with probability at most $\frac{\varepsilon}{2}$.

Claim 4.3. $\mathrm{dist}(w, w') \leq \frac{\varepsilon}{\rho\delta\alpha}$.

Claim 4.4. $\mathrm{Fail}(w') \leq \frac{\varepsilon}{2}$.

Modulo Claim 4.3 and Claim 4.4, we repeat the correction process $\mathrm{poly}(\log(\min_{t \in T} \Pr[t]))$ times. At the beginning of the $i$-th iteration we start with $w_i$ that fails the test with probability $\leq \varepsilon / 2^i$. At the end of the iteration, we find $w_{i+1}$ that fails the test with probability $\leq \varepsilon / 2^{i+1}$, and so that $\mathrm{dist}(w_i, w_{i+1}) \leq \frac{\varepsilon}{\rho\delta\alpha\, 2^i}$. Thus we obtain a sequence of functions $w_0 = w, w_1, w_2, \ldots, w_r$ that ends with $w_r = w^*$, which always passes the test. The distance we accumulate from $w$ is
$$\mathrm{dist}(w, w_r) \leq \sum_{i=0}^{r-1} \mathrm{dist}(w_i, w_{i+1}) \leq \frac{\varepsilon}{\rho\delta\alpha} \sum_{i=0}^{\infty} \frac{1}{2^i} = \frac{2\varepsilon}{\rho\delta\alpha}.$$
□

Proof of Claim 4.2.
By the local testability of the code $C_s$,
$$\mathbb{E}_s\left[\mathrm{dist}(w|_s, C_s)\right] \leq \rho^{-1}\, \mathbb{E}_s\left[\Pr_{t \subset s}[w|_t \notin C_t]\right] = \rho^{-1} \Pr_t[w|_t \notin C_t] \leq \frac{\varepsilon}{\rho}. \tag{4.2}$$
As $f_s$ is the closest codeword to $w|_s$,
$$\frac{\varepsilon}{\rho} \geq \mathbb{E}_s\left[\mathrm{dist}(w|_s, f_s)\right] = \mathbb{E}_s\left[\mathbb{E}_{k \subset s}\left[\mathrm{dist}(w|_k, f_s|_k)\right]\right].$$
By Markov's inequality, with probability $1 - \frac{\varepsilon}{\rho\delta}$ of sampling $\{s_1, s_2\}_k \sim \mathcal{A}$, it holds that $\mathrm{dist}(w|_k, f_{s_i}|_k) < \delta$ for $i = 1, 2$, where $f_{s_i}$ is the closest codeword in $C_{s_i}$ to $w|_{s_i}$.

By the local distance assumption, $C_k$ has distance $\delta$, so if $\mathrm{dist}(f_{s_1}|_k, f_{s_2}|_k) < \delta$, then $f_{s_1}|_k = f_{s_2}|_k$. □

Proof of Claim 4.3. We note that $\mathrm{dist}(w, w') = \mathbb{E}_s\left[\mathrm{dist}(w|_s, w'|_s)\right]$.

We show closeness by the triangle inequality. Fix $s \in S$; then
$$\mathrm{dist}(w|_s, w'|_s) \leq \mathrm{dist}(w|_s, f_s) + \mathrm{dist}(f_s, w'|_s).$$
By (4.2), $\mathbb{E}_s\left[\mathrm{dist}(w|_s, f_s)\right] \leq \frac{\varepsilon}{\rho}$.

By the $(K, \alpha)$-soundness of the agreement expander $\mathcal{A}$, that is, by (4.1),
$$\mathbb{E}_s\left[\mathrm{dist}(w'|_s, f_s)\right] = \mathbb{E}_s\left[\mathbb{E}_{k \subset s}\left[\mathrm{dist}(w'|_k, f_s|_k)\right]\right] \leq \Pr_{k \subset s}\left[w'|_k \neq f_s|_k\right] \leq \frac{\varepsilon}{\rho\delta\alpha}.$$
By the triangle inequality, and using the fact that both $\delta, \alpha < 1$, we get $\mathrm{dist}(w, w') \leq \frac{\varepsilon}{\rho\delta\alpha}$. □

The proof of Claim 4.4 relies on the $\lambda$-sampling property of the MAS.

Proof of Claim 4.4.
By assumption the containment graph between $T$ and $K$ has the $\lambda$-sampling property. Let $B = \{k \in K \mid \forall s \supset k,\ f_s|_k \neq w'|_k\}$. We observe the following:
1. By the agreement property, $\Pr[B] \leq \frac{\varepsilon}{\rho\delta\alpha}$, and without loss of generality $\Pr[B] \leq \frac{1}{2}$ (if we want to show that the code is $\rho\delta\alpha$-locally testable, it is enough to consider $\varepsilon$ so that $\frac{\varepsilon}{\rho\delta\alpha} \leq \frac{1}{2}$).
2. If $t \in T$ contributes to the failure (i.e. $w'|_t \notin C_t$), then $w'|_k \neq f_s|_k$ for all $k \supset t$ and $s \supset k$, since $f_s|_t \in C_t$. Thus all of its neighbours are in $B$.

Denote by $N$ the set of $t \in T$ so that all of $t$'s neighbours are in $B$. By item 2 above we have that $\Pr_{t \in T}[w'|_t \notin C_t] \leq \Pr_{t \in T}[N]$. We note that if we sample a neighbour of $t \in N$, we get some $k \in B$ with probability $1 \geq \Pr[B] + \frac{1}{2}$. Thus by the $\lambda$-sampling property, we have that $\Pr_{t \in T}[N] \leq \lambda \frac{\varepsilon}{\rho\delta\alpha}$.

We chose $\lambda \leq \rho\delta\alpha$, hence $\mathrm{Fail}(w') \leq \frac{\varepsilon}{2}$. □

Remark 4.5. The MAS has four layers. The vertex layer $V$ and the layer $T$ are required to define the lifted code itself. It is also natural to introduce a higher layer $S$, since without any other requirements we cannot expect any lifted code to be locally testable.

However, the intermediate layer $K$ is possibly unneeded. While it is a crucial part of the proof, it is not needed for lifting the code, nor for the local tests. We believe it is interesting to understand whether it is enough to study a three-layered set system, namely $(V, T, S)$. Are there similar properties, in terms of agreement, sampling and expansion, that also give us a similar result?

5 Local Testability in Vector Spaces
In this section we demonstrate how the main theorem fits in with, and generalizes, the known results on testability of Reed-Muller codes. In this case the MAS is the Grassmannian complex MAS described in Lemma 3.2, for $V = \mathbb{F}_p^n$ and $T, K, S$ being the collections of all affine subspaces of dimension $q_1, q_2, q_3$ respectively.

We define the code on $V$ by lifting base codes $\{C_t \mid t \in T\}$. Namely,
$$C = \{\, w : \mathbb{F}_p^n \to \mathbb{F}_p \mid w|_t \in C_t,\ \forall t \in T \,\}.$$
One example of such a code is the $(n, r)$-Reed-Muller code on $\mathbb{F}_p^n$. This code consists of all polynomials of degree $\leq r$. When $n = 1$ this is the Reed-Solomon code. Take $T$ to be the set of all affine lines (i.e. $q_1 = 1$), and let $C_t$ be the $r$-Reed-Solomon code on every line. Lifting this code to $V$ results in all functions $w : \mathbb{F}_p^n \to \mathbb{F}_p$ so that for every line $t \in T$, $w|_t$ is a function of degree at most $r$. For some parameters $n, r, p$ this results in the $(n, r)$-Reed-Muller code. Surprisingly, [GKS13] showed that for some other parameters $r, n, p$ the code lifted from the $r$-Reed-Solomon code contains more than the $(n, r)$-Reed-Muller code. Nevertheless, these codes are locally testable as well [GHS15, HRS15].

Our main theorem states that to prove local testability of $C$ it is enough to prove that
$$C_s = \{\, w : s \to \mathbb{F}_p \mid w|_t \in C_t,\ \forall t \in T,\ t \subset s \,\}$$
is locally testable for each subspace $s \in S$.

This gives rise to testability results for Reed-Muller codes (which are well studied, see [RS96, RS97, AS03]) as well as for lifted codes as studied in [GKS13] (given, of course, that we check their local testability in some small fixed space). Moreover, this statement continues to hold for more general sets of base codes $\{C_t \mid t \in T\}$: if the lifts of $\{C_t \mid t \in T\}$ to subspaces of dimension $q_3$ are locally testable (with good enough parameters), then the lifted code to dimension $n$ is also locally testable. This is particularly useful in the regime where the subspace dimensions are fixed and $n$ tends to infinity.
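The lifting operation described above can be checked by brute force on a tiny instance. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: it fixes $p = 3$ and $n = 2$, takes $T$ to be all affine lines of $\mathbb{F}_3^2$, and uses as base code the functions of degree at most 1 on each line (the case $r = 1$). It enumerates all words $w : \mathbb{F}_3^2 \to \mathbb{F}_3$, keeps those whose restriction to every line lies in the base code, and also computes the rejection probability of the line test for a corrupted word. The names `affine_lines`, `in_base_code`, `fail`, `lifted_code` and `dist_to_code` are hypothetical.

```python
import itertools

P = 3  # field size; the degree check below is specific to P = 3 (assumption)
N = 2  # ambient dimension
POINTS = list(itertools.product(range(P), repeat=N))


def affine_lines():
    """All affine lines of F_P^N, each as a tuple of points a + c*d for c = 0..P-1."""
    seen = set()
    lines = []
    for a in POINTS:
        for d in POINTS:
            if not any(d):
                continue  # the zero vector is not a direction
            line = tuple(
                tuple((ai + c * di) % P for ai, di in zip(a, d)) for c in range(P)
            )
            key = frozenset(line)
            if key not in seen:
                seen.add(key)
                lines.append(line)
    return lines


LINES = affine_lines()


def in_base_code(values):
    """Degree <= 1 on a line of F_3: the second difference g(0) - 2 g(1) + g(2) vanishes."""
    g0, g1, g2 = values
    return (g0 - 2 * g1 + g2) % P == 0


def fail(w):
    """Fraction of lines t (chosen uniformly) with w restricted to t outside the base code."""
    bad = sum(not in_base_code([w[pt] for pt in line]) for line in LINES)
    return bad / len(LINES)


def lifted_code():
    """Brute-force lift: all words that pass the line test on every line."""
    code = []
    for vals in itertools.product(range(P), repeat=len(POINTS)):
        w = dict(zip(POINTS, vals))
        if fail(w) == 0:
            code.append(w)
    return code


def dist_to_code(w, code):
    """Relative Hamming distance from w to the nearest codeword."""
    return min(sum(w[pt] != c[pt] for pt in POINTS) for c in code) / len(POINTS)


code = lifted_code()
corrupted = dict(code[0])
corrupted[POINTS[0]] = (corrupted[POINTS[0]] + 1) % P  # flip one coordinate
print(len(LINES), len(code))  # 12 lines, 27 codewords (the affine functions)
print(fail(corrupted), dist_to_code(corrupted, code))
```

In this instance the lifted code consists exactly of the 27 affine functions on $\mathbb{F}_3^2$, so it coincides with the degree-1 Reed-Muller code. Corrupting a single coordinate of a codeword moves it to relative distance $1/9$ and makes the 4 lines through that point reject, out of 12 lines in total.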
This includes the examples above, but is a more generalstatement. Theorem 5.1.
There is a universal constant $\alpha > 0$ so that the following holds. Let $q_1 < q_2 < q_3 < n$ be as above, and assume q ≥ q + . Let $p$ be any prime power. Let $X = (V, T, K, S)$ be as above. Let $\{C_t \mid t \in T\}$ be a set of base codes, and suppose that there exist some $\delta > 0$ and $\rho \geq \frac{p^{q_1 - q_2}}{\alpha\delta^2}$ so that:
1. For any $q_2$-dimensional space $k \in K$, $C_k$ has distance $\geq \delta$.
2. For every $q_3$-dimensional space $s \in S$, $C_s$ is $\rho$-locally testable.
Then for any $n > q_3$, the lift of $\{C_t \mid t \in T\}$ to $\mathbb{F}_p^n$ is $\rho\delta^2\alpha$-locally testable. The constant $\alpha$ does not depend on any of the other parameters, nor on the field size.

We encourage the readers to think of $\delta = \Omega(1)$. Then for every fixed choice of subspace dimensions and field size $p$ there is some $\rho$, so that for every lifted code that is $\rho$-locally testable on spaces of dimension $q_3$, the code is also $\Omega(\rho)$-locally testable on all spaces of dimension $n > q_3$ (for a large enough $\rho$). Note that this theorem applies both to the regime where the field size is small (e.g. $p = 2, 3$), and to the regime where the field size goes to infinity. When $p$ grows, the conditions of the theorem become easier to satisfy, that is, the lower bound on $\rho$ becomes smaller as well.

Proof of Theorem 5.1.
Let $\alpha$ be the constant stated in Lemma 3.2. The system $X = (V, T, K, S)$ defined above is a $(p^{q_1 - q_2}, \delta, \delta\alpha)$-MAS, by Lemma 3.2, for that $\alpha$.

Denote by $C$ the lift of $\{C_t \mid t \in T\}$ to $\mathbb{F}_p^n$. This code satisfies the distance and local local testability properties:
1. The lift of $\{C_t \mid t \in T\}$ to a $q_2$-dimensional space $k \in K$ has distance $\geq \delta$. ([GKS13] showed this holds, for example, whenever the base codes $C_t$ themselves have distance $\geq \delta + \frac{1}{p^{q_1}}$.)
2. The lift of $\{C_t \mid t \in T\}$ to a $q_3$-dimensional space $s \in S$ is $\rho$-locally testable.
Hence by Theorem 4.1, this code is $\rho\delta^2\alpha$-locally testable. □

References

[ABNNR92]
Noga Alon, Jehoshua Bruck, Joseph Naor, Moni Naor, and Ron M. Roth. Construction of asymptotically good low-rate error-correcting codes through pseudo-random graphs. IEEE Trans. Inform. Theory, 38(2):509–516, 1992.
[Aro94] Sanjeev Arora. Probabilistic checking of proofs and the hardness of approximation problems. Ph.D. thesis, University of California, Berkeley, 1994.
[AS03] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Combinatorica, 23(3):365–426, 2003. (Preliminary version, 1997). eccc:1997/TR97-003.
[BFLS91] László Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy. Checking computations in polylogarithmic time. In Proc. 23rd ACM Symp. on Theory of Computing (STOC), pages 21–31. 1991.
[BGHSV06] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil Vadhan. Robust PCPs of proximity, shorter PCPs and applications to coding. SIAM J. Comput., 36(4):889–974, 2006. (Preliminary version, 2004). eccc:2004/TR04-021.
[BHR05] Eli Ben-Sasson, Prahladh Harsha, and Sofya Raskhodnikova. Some 3CNF properties are hard to test. SIAM J. Comput., 35(1):1–21, 2005. (Preliminary version, 2003).
[BLR93] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. J. Comput. Syst. Sci., 47(3):549–595, December 1993. (Preliminary version, 1990).
[BMSS11] Eli Ben-Sasson, Ghid Maatouk, Amir Shpilka, and Madhu Sudan. Symmetric LDPC codes are not necessarily locally testable. In Proc. 26th IEEE Conf. on Comput. Complexity, pages 55–65. 2011. eccc:2010/TR10-199.
[BS08] Eli Ben-Sasson and Madhu Sudan. Short PCPs with polylog query complexity. SIAM J. Comput., 38(2):551–607, 2008. (Preliminary version, 2005). eccc:2004/TR04-060.
[DD19] Yotam Dikstein and Irit Dinur. Agreement testing theorems on layered set systems. In Proc. 60th IEEE Symp. on Foundations of Comp. Science (FOCS), pages 1495–1524. 2019. arXiv:1909.00638, eccc:2019/TR19-112.
[DDFH18] Yotam Dikstein, Irit Dinur, Yuval Filmus, and Prahladh Harsha. Boolean function analysis on high-dimensional expanders. In Eric Blais, Klaus Jansen, José D. P. Rolim, and David Steurer, eds., Proc. 22nd International Workshop on Randomization and Computation (RANDOM), volume 116 of LIPIcs, pages 38:1–38:20. Schloss Dagstuhl, 2018. arXiv:1804.08155, eccc:2018/TR18-075.
[DHKNT19] Irit Dinur, Prahladh Harsha, Tali Kaufman, Inbal Livni Navon, and Amnon Ta-Shma. List decoding with double samplers. In Proc. 30th Annual ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 2134–2153. 2019. eccc:2018/TR18-198.
[DHKRZ19] Irit Dinur, Prahladh Harsha, Tali Kaufman, and Noga Ron-Zewi. From local testing to robust testing via agreement testing. In Avrim Blum, ed., Proc. 10th Innovations in Theor. Comput. Sci. (ITCS), volume 124 of LIPIcs, pages 29:1–29:18. Schloss Dagstuhl, 2019. eccc:2016/TR16-160.
[Din07] Irit Dinur. The PCP theorem by gap amplification. J. ACM, 54(3):12, 2007. (Preliminary version, 2006). eccc:2005/TR05-046.
[DK17] Irit Dinur and Tali Kaufman. High dimensional expanders imply agreement expanders. In Proc. 58th IEEE Symp. on Foundations of Comp. Science (FOCS), pages 974–985. 2017. eccc:2017/TR17-089.
[DKW18] Dominic Dotterrer, Tali Kaufman, and Uli Wagner. On expansion and topological overlap. Geometriae Dedicata, 195:307–317, 2018. (Preliminary version, 2016). arXiv:1506.04558.
[FS95] Katalin Friedl and Madhu Sudan. Some improvements to total degree tests. In Proc. 3rd Israel Symp. on Theory of Computing and Systems, pages 190–198. 1995. (See arXiv for corrected version). arXiv:1307.3975.
[Gal60] Robert G. Gallager. Low Density Parity Check Codes. Ph.D. thesis, Massachusetts Institute of Technology, 1960.
[GHS15] Alan Guo, Elad Haramaty, and Madhu Sudan. Robust testing of lifted codes with applications to low-degree testing. In Proc. 56th IEEE Symp. on Foundations of Comp. Science (FOCS), pages 825–844. 2015. eccc:2015/TR15-043.
[GKS13] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Robert D. Kleinberg, ed., Proc. 4th Innovations in Theor. Comput. Sci. (ITCS), pages 529–540. ACM, 2013. eccc:2012/TR12-149.
[GLRSW91] Peter Gemmell, Richard J. Lipton, Ronitt Rubinfeld, Madhu Sudan, and Avi Wigderson. Self-testing/correcting for polynomials and for approximate functions. In Proc. 23rd ACM Symp. on Theory of Computing (STOC), pages 32–42. 1991.
[Gol10] Oded Goldreich. Short locally testable codes and proofs: A survey in two parts. In Oded Goldreich, ed., Property Testing, volume 6390 of LNCS, pages 65–104. Springer, 2010. eccc:2005/TR05-014.
[Gol11] Oded Goldreich. A sample of samplers: A computational perspective on sampling. In Oded Goldreich, ed., Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation, volume 6650 of LNCS, pages 302–332. Springer, 2011. eccc:1997/TR97-020.
[Gro10] Mikhail Gromov. Singularities, expanders and topology of maps. Part 2: from combinatorics to topology via algebraic isoperimetry. Geom. Funct. Anal., 20:416–526, 2010.
[GS06] Oded Goldreich and Madhu Sudan. Locally testable codes and PCPs of almost linear length. J. ACM, 53(4):558–655, 2006. (Preliminary version, 2002). eccc:2002/TR02-050.
[HRS15] Elad Haramaty, Noga Ron-Zewi, and Madhu Sudan. Absolutely sound testing of lifted codes. Theory Comput., 11:299–338, 2015. (Preliminary version, 2013). eccc:2013/TR13-030.
[KM17] Tali Kaufman and David Mass. High dimensional random walks and colorful expansion. In Christos Papadimitriou, ed., Proc. 8th Innovations in Theor. Comput. Sci. (ITCS), volume 67 of LIPIcs, pages 4:1–4:27. Schloss Dagstuhl, 2017. arXiv:1604.02947.
[KO18] Tali Kaufman and Izhar Oppenheim. High order random walks: Beyond spectral gap. In Eric Blais, Klaus Jansen, José D. P. Rolim, and David Steurer, eds., Proc. 22nd International Workshop on Randomization and Computation (RANDOM), volume 116 of LIPIcs. Schloss Dagstuhl, 2018. arXiv:1707.02799.
[KS08] Tali Kaufman and Madhu Sudan. Algebraic property testing: the role of invariance. In Proc. 40th ACM Symp. on Theory of Computing (STOC), pages 403–412. 2008. eccc:2007/TR07-111.
[LM06] Nathan Linial and Roy Meshulam. Homological connectivity of random 2-complexes. Combinatorica, 26(4):475–487, 2006.
[Opp18] Izhar Oppenheim. Local spectral expansion approach to high dimensional expanders part I: Descent of spectral gaps. Discrete Comput. Geom., 59(2):293–330, 2018. arXiv:1709.04431.
[RS96] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comput., 25(2):252–271, April 1996. (Preliminary versions, 1991 and 1992).
[RS97] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proc. 29th ACM Symp. on Theory of Computing (STOC), pages 475–484. 1997.
[Spi95] Daniel A. Spielman. Computationally Efficient Error-Correcting Codes and Holographic Proofs. Ph.D. thesis, Massachusetts Institute of Technology, June 1995.
[SS96] Michael Sipser and Daniel A. Spielman. Expander codes. IEEE Trans. Inform. Theory, 42(6):1710–1722, November 1996. (Preliminary version, 1994).
[Tan81] R. Michael Tanner. A recursive approach to low complexity codes. IEEE Trans. Inform. Theory, 27(5):533–547, 1981.
[Tre04] Luca Trevisan. Some applications of coding theory in computational complexity. Quaderni di Matematica, 13:347–424, 2004. arXiv:cs/0409044, eccc:2004/TR04-043.