A Proof of Green's Conjecture Regarding the Removal Properties of Sets of Linear Equations
arXiv [math.CO], Aug
Asaf Shapira ∗

Abstract
A system of $\ell$ linear equations in $p$ unknowns $Mx = b$ is said to have the removal property if every set $S \subseteq \{1, \ldots, n\}$ which contains $o(n^{p-\ell})$ solutions of $Mx = b$ can be turned into a set $S'$ containing no solution of $Mx = b$, by the removal of $o(n)$ elements. Green [GAFA 2005] proved that a single homogenous linear equation always has the removal property, and conjectured that every set of homogenous linear equations has the removal property. We confirm Green's conjecture by showing that every set of linear equations (even non-homogenous) has the removal property.

The (triangle) removal lemma of Ruzsa and Szemerédi [17], which is by now a cornerstone result in combinatorics, states that a graph on $n$ vertices that contains only $o(n^3)$ triangles can be made triangle-free by the removal of only $o(n^2)$ edges. In other words, if a graph has asymptotically few triangles then it is asymptotically close to being triangle-free. While the lemma was proved in [17] for triangles, an analogous result for any fixed graph can be obtained using the same proof idea. Actually, the main tool for obtaining the removal lemma is Szemerédi's regularity lemma for graphs [19], another landmark result in combinatorics. The removal lemma has many applications in different areas like extremal graph theory, additive number theory and theoretical computer science. Perhaps its most well known application appears already in [17], where it is shown that an ingenious application of it gives a very short and elegant proof of Roth's Theorem, which states that every $S \subseteq [n] = \{1, \ldots, n\}$ of positive density contains a 3-term arithmetic progression.

Recall that an $r$-uniform hypergraph $H = (V, E)$ has a set of vertices $V$ and a set of edges $E$, where each edge $e \in E$ contains $r$ distinct vertices from $V$.
So a graph is a 2-uniform hypergraph. Szemerédi's famous theorem [18] extends Roth's theorem by showing that every $S \subseteq [n]$ of positive density actually contains arbitrarily long arithmetic progressions (when $n$ is large enough). Motivated by the fact that a removal lemma for graphs can be used to prove Roth's theorem, Frankl and Rödl [6] showed that a removal lemma for $r$-uniform hypergraphs could be used to prove Szemerédi's theorem on $(r+1)$-term arithmetic progressions. They further developed a regularity lemma, as well as a corresponding removal lemma, for 3-uniform hypergraphs, thus obtaining a new proof of Szemerédi's theorem for 4-term arithmetic progressions. In recent years there have been many exciting results in this area, in particular the results of Gowers [8] and of Nagle, Rödl, Schacht and Skokan [14, 15], who independently obtained regularity lemmas and removal lemmas for $r$-uniform hypergraphs, thus providing alternative combinatorial proofs of Szemerédi's Theorem [18] and some of its generalizations, notably those of Furstenberg and Katznelson [7]. Tao [20] later obtained another proof of the hypergraph removal lemma and of its many corollaries mentioned above. For more details see [9, 11].

∗ Microsoft Research. Email: asafi[email protected].

In this paper we will use the above mentioned hypergraph removal lemma in order to resolve a conjecture of Green [10] regarding the removal properties of sets of linear equations. Let $Mx = b$ be a set of linear equations, and let us say that a set of integers $S$ is $(M, b)$-free if it contains no solution to $Mx = b$, that is, if there is no vector $x$, whose entries all belong to $S$, which satisfies $Mx = b$. Just like the removal lemma for graphs states that a graph that has few copies of $H$ should be close to being $H$-free, a removal lemma for sets of linear equations $Mx = b$ should say that a subset of the integers $[n]$ that contains few solutions to $Mx = b$ should be close to being $(M, b)$-free. Let us start by defining this notion precisely.
Definition 1.1 (Removal Property)
Let $M$ be an $\ell \times p$ matrix of integers and let $b \in \mathbb{N}^{\ell}$. The set of linear equations $Mx = b$ has the removal property if for every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, M, b) > 0$ with the following property: if $S \subseteq [n]$ is such that there are at most $\epsilon n^{p-\ell}$ vectors $x \in S^p$ satisfying $Mx = b$, then one can remove from $S$ at most $\delta n$ elements to obtain an $(M, b)$-free set.

We note that in the above definition, as well as throughout the paper, we assume that the $\ell \times p$ matrix $M$ of a set of linear equations has rank $\ell$.

Green [10] has initiated the study of the removal properties of sets of linear equations. His main result was the following:

Theorem 1 (Green [10])
Any single homogenous linear equation has the removal property.
The main result of Green actually holds over any abelian group. To prove this result, Green developed a regularity lemma for abelian groups, which is somewhat analogous to Szemerédi's regularity lemma for graphs [19]. Although the application of the group regularity lemma for proving Theorem 1 was similar to the derivation of the graph removal lemma from the graph regularity lemma, the proof of the group regularity lemma was far from trivial. One of the main conjectures raised in [10] is that a natural generalization of Theorem 1 should also hold (Conjecture 9.4 in [10]).
Conjecture 1 (Green [10])
Any system of homogenous linear equations $Mx = 0$ has the removal property.
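To make the objects in Definition 1.1 concrete, the following is a minimal brute-force sketch that counts the vectors $x \in S^p$ with $Mx = b$ and tests whether a set is $(M, b)$-free. The function names `count_solutions` and `is_free` are illustrative choices, not notation from the paper, and the enumeration is exponential in $p$, so it is only meant for tiny examples.

```python
from itertools import product


def count_solutions(M, b, S):
    """Count vectors x in S^p with M x = b over the integers (brute force)."""
    p = len(M[0])
    elems = sorted(S)
    count = 0
    for x in product(elems, repeat=p):
        # x is a solution if every row of M evaluates to the matching entry of b
        if all(sum(row[j] * x[j] for j in range(p)) == bi
               for row, bi in zip(M, b)):
            count += 1
    return count


def is_free(M, b, S):
    """S is (M, b)-free if no vector with all entries in S solves M x = b."""
    return count_solutions(M, b, S) == 0


# x1 + x2 - 2*x3 = 0: three-term progressions, trivial ones included
print(count_solutions([[1, 1, -2]], [0], {1, 2, 3}))  # 5
```

Note that constant vectors such as $(2, 2, 2)$ are counted as solutions here, which matches the definition above where no distinctness of the entries is required.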
We note that besides being a natural generalization of Theorem 1, Conjecture 1 was also raised in [10] in relation to a conjecture of Bergelson, Host, Kra and Ruzsa [4] regarding the number of $k$-term arithmetic progressions with a common difference in subsets of $[n]$. See Section 4 for more details.

Very recently, Král', Serra and Vena [13] gave a surprisingly simple proof of Theorem 1, which completely avoided the use of Green's regularity lemma for groups. In fact, their proof is an elegant and simple application of the graph removal lemma mentioned earlier and it actually extends Theorem 1 to any single non-homogenous linear equation over non-abelian groups. Král', Serra and Vena [13] also show that Conjecture 1 holds when $M$ is a 0/1 matrix which satisfies certain conditions. But these conditions are not satisfied even by all 0/1 matrices. In another recent result, which was obtained independently of ours, Candela [5] showed that Conjecture 1 holds for every pair of homogenous linear equations, as well as for every system of homogenous equations in which every $\ell$ columns of $M$ are linearly independent. See more details in Subsection 2.1.

In this paper we confirm Green's conjecture for every homogenous set of linear equations. In fact, we prove the following more general result.

Theorem 2 (Main Result)
Any set of linear equations (even non-homogenous) $Mx = b$ has the removal property.

The rest of the paper is organized as follows. In the next section we give an overview of the proof of Theorem 2. As we show in that section, Theorem 2 also holds over any finite field, that is, when $S \subseteq \mathbb{F}_n$, where $\mathbb{F}_n$ is the field of size $n$. In fact it is easy to modify the proof so that it works over any field, but we will not do so here. The proof of Theorem 2 has two main steps: the first one, described in Lemma 2.3, applies the main idea from [13] in order to show that if a set of linear equations can be "represented" by a hypergraph then Theorem 2 would follow from the hypergraph removal lemma. The second, and most challenging, step of the proof is showing that every set of linear equations can be represented as a hypergraph. The proof of this result, stated in Lemma 2.4, appears in Section 3. In Section 4 we give some concluding remarks and discuss some open problems.

It will be more convenient to deduce Theorem 2 from an analogous result over the finite field $\mathbb{F}_n$ of size $n$ (for $n$ a prime power). In fact, somewhat surprisingly, we will actually need to prove a stronger claim than the one asserted in Theorem 2. This more general variant, stated in Theorem 3, allows each of the variables $x_i$ to have its own subset $S_i \subseteq [n]$. We note that this variant of Theorem 2 for the case of a single equation was already proved in [10] and [13], but in those papers it was not necessary to go through this more general result. As we will explain later (see Claims 3.1 and 3.3), the fact that we are considering a more general problem will allow us to overcome some degeneracies in the system of equations by allowing us to remove certain equations. This manipulation can be performed when one considers the generalized removal property (defined below), but there is no natural way of performing these manipulations when considering the standard removal property.
Therefore, proving this extended result is essential for our proof strategy.

In what follows and throughout the paper, whenever $x$ is a vector, $x_i$ will denote its $i$th entry. Similarly, if $x_1, \ldots, x_p$ are elements in a field, then $x$ will be the vector whose entries are $x_1, \ldots, x_p$. We say that a collection of $p$ subsets $S_1, \ldots, S_p \subseteq \mathbb{F}_n$ is $(M, b)$-free if there are no $x_1 \in S_1, \ldots, x_p \in S_p$ which satisfy $Mx = b$.

Definition 2.1 (Generalized Removal Property over Finite Fields)
Let $\mathbb{F}_n$ be the field of size $n$, let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$ and let $b \in \mathbb{F}_n^{\ell}$. The system $Mx = b$ is said to have the generalized removal property if for every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, p) > 0$ such that if $S_1, \ldots, S_p \subseteq \mathbb{F}_n$ contain less than $\epsilon n^{p-\ell}$ solutions to $Mx = b$ with each $x_i \in S_i$, then one can remove from each $S_i$ at most $\delta n$ elements to obtain sets $S'_1, \ldots, S'_p$ which are $(M, b)$-free.

By taking all sets $S_i$ to be the same set $S$ we, of course, get the standard notion of the removal property from Definition 1.1, so we may indeed work with this generalized definition. We will deduce Theorem 2 from the following theorem.

Theorem 3
Every set of linear equations $Mx = b$ over a finite field has the generalized removal property.

In this paper we apply the hypergraph removal lemma in order to resolve Green's conjecture. In fact, for the proof of Theorem 3 we will need a variant of the hypergraph removal lemma which works for colored hypergraphs. But let us first recall some basic definitions. An $r$-uniform hypergraph is simple if it has no parallel edges, that is, if different edges contain different subsets of vertices of size $r$. We say that a set of vertices $U$ in an $r$-uniform hypergraph $H = (V_H, E_H)$ spans a copy of an $r$-uniform hypergraph $K = (V_K, E_K)$ if there is an injective mapping $\phi$ from $V_K$ to $U$ such that if $v_1, \ldots, v_r$ form an edge in $K$ then $\phi(v_1), \ldots, \phi(v_r)$ form an edge in $U \subseteq V_H$. We say that a hypergraph is $c$-colored if its edges are colored by $\{1, \ldots, c\}$. If $K$ and $H$ are $c$-colored, then $U$ is said to span a colored copy of $K$ if the above mapping $\phi$ sends edges of $K$ of color $i$ to edges of $H$ (in $U$) of the same color $i$. We stress that the coloring of the edges does not have to satisfy any constraints that are usually associated with edge colorings. Finally, the number of colored copies of $K$ in $H$ is the number of subsets $U \subseteq V_H$ of size $|V_K|$ which span a colored copy of $K$.

The following variant of the hypergraph removal lemma is a special case of Theorem 1.2 in [2].

Footnote: As noted to us by Terry Tao, this variant of the hypergraph removal lemma can probably be extracted from the previous proofs of the hypergraph removal lemma [8, 14, 15, 20], just like the colored removal lemma for graphs can be extracted from the proof of the graph removal lemma, see [12].

Theorem 4 (Austin and Tao [2]) Let $K$ be a fixed $r$-uniform $c$-colored hypergraph on $k$ vertices. For every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, k) > 0$ such that if $H$ is an $r$-uniform $c$-colored simple hypergraph with less than $\epsilon n^k$ colored copies of $K$, then one can remove from $H$ at most $\delta n^r$ edges and obtain a hypergraph that contains no colored copy of $K$.
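The notion of a colored copy defined above can be checked mechanically on small instances. The sketch below is only an illustration under stated assumptions: a simple colored hypergraph is stored as a dictionary from `frozenset` vertex sets to colors, and the helper names `spans_colored_copy` and `count_colored_copies` are hypothetical. It tries every bijection from $V_K$ onto a candidate set $U$, so it is usable only for very small $K$ and $H$.

```python
from itertools import combinations, permutations


def spans_colored_copy(U, K_vertices, K_edges, H_edges):
    """True if some injective map phi: V_K -> U sends every edge of K
    to an edge of H with the same color."""
    for image in permutations(U, len(K_vertices)):
        phi = dict(zip(K_vertices, image))
        if all(H_edges.get(frozenset(phi[v] for v in edge)) == color
               for edge, color in K_edges.items()):
            return True
    return False


def count_colored_copies(K_vertices, K_edges, H_vertices, H_edges):
    """Number of subsets U of V_H of size |V_K| that span a colored copy of K."""
    return sum(spans_colored_copy(U, K_vertices, K_edges, H_edges)
               for U in combinations(H_vertices, len(K_vertices)))


# K: a triangle whose three edges carry the colors 1, 2, 3 (so r = 2)
K_vertices = [0, 1, 2]
K_edges = {frozenset({0, 1}): 1, frozenset({1, 2}): 2, frozenset({0, 2}): 3}

# H: a small colored graph on four vertices
H_vertices = ["a", "b", "c", "d"]
H_edges = {
    frozenset({"a", "b"}): 1, frozenset({"b", "c"}): 2, frozenset({"a", "c"}): 3,
    frozenset({"c", "d"}): 1, frozenset({"a", "d"}): 2,
}
print(count_colored_copies(K_vertices, K_edges, H_vertices, H_edges))  # 2
```

Storing $H$ as a dictionary keyed by vertex sets enforces simplicity automatically, since a vertex set can carry at most one edge.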
In order to use Theorem 4 for the proof of Theorem 3, we will need to represent the solutions of $Mx = b$ as colored copies of a certain "small" hypergraph $K$ in a certain "large" hypergraph $H$. The following notion of hypergraph representability specifies the requirements from such a representation that suffice for allowing us to deduce Theorem 3 from Theorem 4.

Definition 2.2 (Hypergraph Representation)
Let $\mathbb{F}_n$ be the field of size $n$, let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$. The system of linear equations $Mx = b$ is said to be hypergraph representable if there is an integer $r = r(M, b) \le p^2$ and an $r$-uniform $p$-colored hypergraph $K$ with $k = r - 1 + p - \ell$ vertices and $p$ edges, such that for any $S_1, \ldots, S_p \subseteq [n]$ there is an $r$-uniform hypergraph $H$ on $kn$ vertices which satisfies the following:

1. $H$ is simple and each edge with color $i$ is labeled by one of the elements of $S_i$.
2. If $x_1 \in S_1, \ldots, x_p \in S_p$ satisfy $Mx = b$ then $H$ contains $n^{r-1}$ colored copies of $K$, such that their edge with color $i$ has label $x_i$. These colored copies of $K$ should also be edge disjoint.
3. If $S_1, \ldots, S_p$ contain $T$ solutions to $Mx = b$ with $x_i \in S_i$ then $H$ contains $T n^{r-1}$ colored copies of $K$.

The following lemma shows that a hypergraph representation can allow us to prove Theorem 3 using the hypergraph removal lemma.
Lemma 2.3 If $Mx = b$ has a hypergraph representation then it has the generalized removal property.

Proof: Suppose $Mx = b$ is a system of $\ell$ linear equations in $p$ unknowns. Let $S_1, \ldots, S_p$ be $p$ subsets of $\mathbb{F}_n$ and let $H$ be the hypergraph guaranteed by Definition 2.2. We claim that we can take $\epsilon(\delta, p)$ in Definition 2.1 to be the value $\epsilon = \epsilon(\delta/(p k^r), k)$ from Theorem 4. Note that $r, k \le p^2$, so this still implies that $\epsilon$ is only a function of $\delta$ and $p$. Indeed, if $S_1, \ldots, S_p$ contain only $\epsilon n^{p-\ell}$ solutions to $Mx = b$ then by item 3 of Definition 2.2 we get that $H$ contains at most $\epsilon n^{p-\ell} \cdot n^{r-1} = \epsilon n^k$ colored copies of $K$. As $H$ is simple, we can apply the removal lemma for colored hypergraphs (Theorem 4) to conclude that one can remove a set $E$ of at most $\frac{\delta}{p k^r} (kn)^r = \frac{\delta}{p} n^r$ edges from $H$ and thus destroy all the colored copies of $K$ in $H$ (recall that $H$ has $kn$ vertices).

To show that we can turn $S_1, \ldots, S_p$ into a collection of $(M, b)$-free sets by removing only $\delta n$ elements from each $S_i$, let us remove an element $s$ from $S_i$ if $E$ contains at least $n^{r-1}/p$ edges that are colored with $i$ and labeled with $s$. As each edge has one label (because $H$ has no parallel edges), and $|E| \le \frac{\delta}{p} n^r$, this means that we remove only $\delta n$ elements from each $S_i$. To see that we thus turn $S_1, \ldots, S_p$ into $(M, b)$-free sets, suppose that the new sets $S'_1, \ldots, S'_p$ still contain a solution $s_1 \in S'_1, \ldots, s_p \in S'_p$ to $Mx = b$. By item 2 of Definition 2.2, this solution defines $n^{r-1}$ edge disjoint colored copies of $K$ in $H$, with the property that in every colored copy, the edge with color $i$ is labeled with the same element $s_i \in S_i$. As $E$ must contain at least one edge from each of these colored copies (as it should destroy all such copies), there must be some $1 \le i \le p$ for which $E$ contains at least $n^{r-1}/p$ edges that are colored $i$ and labeled with $s_i$. But this contradicts the fact that $s_i$ should have been removed from $S_i$.

We note that the above lemma generalizes a similar lemma for the case of representing a single equation using a graph, which was implicit in [13].
In fact, as we have mentioned earlier, [13] also show that a set of homogenous linear equations $Mx = 0$, with $M$ being a 0/1 matrix that satisfies certain conditions, also has the removal property. One of these conditions essentially says that the system of equations is graph representable. However, there are even some 0/1 matrices for which $Mx = 0$ is not graph representable (in the sense of [13]). Lemma 2.4 below shows that any set of linear equations has a hypergraph representation. This lemma is proved in the next section and it is the most challenging part of this paper.
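The rule used in the proof of Lemma 2.3 to translate a set $E$ of removed edges into elements removed from the sets $S_i$ is simple enough to sketch in code. The following is an illustration only: it assumes each removed edge is summarized by its pair `(color, label)`, and the function name `elements_to_remove` is hypothetical. An element $s$ is dropped from $S_i$ exactly when at least $n^{r-1}/p$ removed edges have color $i$ and label $s$.

```python
from collections import Counter


def elements_to_remove(E, n, r, p):
    """E: iterable of (color, label) pairs, one per removed edge of H.

    Returns a dict mapping each color i to the set of labels s that the
    rule in the proof of Lemma 2.3 deletes from S_i."""
    hits = Counter(E)
    removed = {}
    for (color, label), cnt in hits.items():
        # cnt >= n^(r-1) / p, written without division to stay in integers
        if cnt * p >= n ** (r - 1):
            removed.setdefault(color, set()).add(label)
    return removed


# Toy numbers: n = 4, r = 2, p = 2, so the threshold is 4 / 2 = 2 edges
print(elements_to_remove([(1, 5), (1, 5), (2, 7), (1, 6)], n=4, r=2, p=2))
# {1: {5}}
```

Since every removed element of $S_i$ accounts for at least $n^{r-1}/p$ distinct edges of $E$, a bound of $\frac{\delta}{p} n^r$ on $|E|$ caps the number of removed elements per set at $\delta n$, which is the counting step in the proof.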
Lemma 2.4
Every set of linear equations $Mx = b$ over a finite field is hypergraph representable.

From the above two lemmas we get the following.
Proof of Theorem 3: Immediate from Lemmas 2.3 and 2.4.

As we have mentioned before, Theorem 2 is now an easy application of Theorem 3.
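Passing from the integers to a finite field relies on a standard observation: for a large enough prime $q$, a vector $x \in [n]^p$ solves $Mx = b$ over the integers exactly when it solves it modulo $q$. The sketch below checks this on a toy instance. The bound $c(pn + 1)$ used in `safe_prime_modulus` is an assumption chosen here because it visibly dominates every value $|(Mx - b)_i|$, and it is not necessarily the constant used in the paper; all function names are illustrative.

```python
from itertools import product


def is_prime(m):
    """Trial-division primality test, adequate for small illustrative inputs."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def safe_prime_modulus(M, b, n):
    """Smallest prime q > c * (p * n + 1), where c bounds the entries of M and b.

    For x in [n]^p each |(Mx - b)_i| is at most c*p*n + c < q, so
    (Mx - b)_i is 0 mod q only if it is 0 over the integers."""
    p = len(M[0])
    c = max(max(abs(a) for row in M for a in row), max(abs(v) for v in b))
    q = c * (p * n + 1) + 1
    while not is_prime(q):
        q += 1
    return q


def solutions(M, b, sets, modulus=None):
    """All x with x_i in sets[i] and Mx = b, over the integers or mod `modulus`."""
    found = []
    for x in product(*sets):
        residuals = [sum(a * xi for a, xi in zip(row, x)) - bi
                     for row, bi in zip(M, b)]
        if modulus is not None:
            residuals = [v % modulus for v in residuals]
        if all(v == 0 for v in residuals):
            found.append(x)
    return found


M, b, n = [[1, 1, -2]], [0], 5
sets = [range(1, n + 1)] * 3
q = safe_prime_modulus(M, b, n)
print(q, solutions(M, b, sets) == solutions(M, b, sets, modulus=q))
```

With a modulus that is too small the two solution sets differ: modulo 3, for instance, the vector $(1, 1, 4)$ satisfies $x_1 + x_2 - 2x_3 \equiv 0$ although it is not an integer solution.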
Proof of Theorem 2:
Given a set of linear equations $Mx = b$ in $p$ unknowns, let $c$ be the maximum absolute value of the entries of $M$ and $b$. Given an integer $n$ let $q = q(n)$ be the smallest prime larger than $cp\,n$. It is well known that $q$ exceeds this bound by at most a constant factor (in fact, much better bounds are known). It is clear that for a vector $x \in [n]^p$ we have $Mx = b$ over $\mathbb{R}$ if and only if $Mx = b$ over $\mathbb{F}_q$. So if $Mx = b$ has $o(n^{p-\ell})$ solutions with $x_i \in S_i$ over $\mathbb{R}$, it also has $o(q^{p-\ell})$ solutions with $x_i \in S_i \subseteq \mathbb{F}_q$ over $\mathbb{F}_q$. By Theorem 3 we can remove $o(q)$ elements from each $S_i$ and obtain sets $S'_i$ that are $(M, b)$-free. But as $q = O(n)$ we infer that the removal of the same $o(q) = o(n)$ elements also guarantees that the sets are $(M, b)$-free over $\mathbb{R}$.

Let us start by noting that Lemma 2.4 for the case of a single equation was (implicitly) proven in [13], where they show that one can take $r = 2$; in other words, they represent a single equation as a graph $K$ in a graph $H$. Actually, the graph $K$ in the proof of [13] is a cycle of length $p$. The proof in [13] is very short and elegant, and we recommend reading it to better understand the intuition behind our proof (although this paper is, of course, self contained). Another related result is the proof of Szemerédi's theorem [18] using the hypergraph removal lemma [6], which can be interpreted as (essentially) showing that the set of $p - 2$ linear equations which define a $p$-term arithmetic progression is hypergraph representable with $K$ being the complete $(p-1)$-uniform hypergraph of size $p$. "Interpolating" these two special cases of Lemma 2.4 suggests that a hypergraph representation of a set of $\ell$ linear equations in $p$ unknowns should involve an $(\ell + 1)$-uniform hypergraph $K$ of size $p$. And indeed, we initially found a (relatively) simple way to achieve this for $p - 2$ equations in $p$ unknowns, thus extending the representability of the arithmetic progression set of linear equations.

However, somewhat surprisingly, when $1 < \ell < p - 2$ complications arise when $M$ has a set of $\ell$ columns that are not linearly independent.
Let us mention again that Candela [5] has recently considered linear equations $Mx = 0$ in which every $\ell$ columns are linearly independent, and showed that Conjecture 1 holds in these cases.

The way we overcome the above complications is by using a representation involving hypergraphs of a much larger degree of uniformity (that is, larger edges), which is roughly the number of non-zero entries of $M$ after we perform certain manipulations on it. We note that specializing our proof to either the case $\ell = 1$ or to the case $\ell = p - 2$ does not give proofs that are identical to the ones (implicit) in [6] or [13]. For example, our proof for the case of a single equation in $p$ unknowns uses a $(p-1)$-uniform hypergraph.

The proof uses a hypergraph $K$ with $p$ edges, whose copies, within another hypergraph $H$, will represent the solutions to $Mx = b$. Each edge of $H$, and therefore also of $K$, will have a color $1 \le i \le p$ and a label $s \in S_i$. The system $Mx = b$ has $p$ unknowns and $K$ has $p$ edges, and it may certainly be the case that all the entries of $M$ are non-zero. It is apparent that using all the edges of $K$ to "deduce" a linear equation of $Mx = b$ is not a good idea, because in that way we will only be able to extract one equation from a copy of $K$ and we need to extract $\ell$ such equations. Therefore, we will first "diagonalize" an $\ell \times \ell$ sub-matrix of $M$ to get an equivalent set of equations (which we still denote by $Mx = b$) which has the property that $p - \ell$ of its unknowns $x_1, \ldots, x_{p-\ell}$ (can) appear in all equations and the rest of the $\ell$ unknowns $x_{p-\ell+1}, \ldots, x_p$ each appear in precisely one equation. This suggests the idea of extracting equation $i$ from (some of) the edges corresponding to $x_1, \ldots, x_{p-\ell}$ and one of the edges corresponding to $x_{p-\ell+1}, \ldots, x_p$. The hypergraph $K$ first contains $p - \ell$ edges that do not depend on the structure of $M$. The other edges do depend on the structure of $M$ and use the previous $p - \ell$ edges in order to "construct" the equations of $Mx = b$. The way to think about this is that for any copy of $K$ in $H$ the first $p - \ell$ edges will have a special vertex that will hold a value from $S_i$ (this will be the vertex in one of the sets $U_1, \ldots, U_{p-\ell}$ defined in Section 3). The other $\ell$ edges will include some of these special vertices, depending on the equation we are trying to build. The way we will deduce an equation from a copy of $K$ in $H$ is that we will argue that the fact that two edges have a common vertex means that a certain equation holds. See Claim 3.4.

Footnote: These linear equations are $x_1 + x_3 = 2x_2$, $x_2 + x_4 = 2x_3$, ..., $x_{p-2} + x_p = 2x_{p-1}$.

But there is another complication here, because the linear equation we obtain in the above process will contain many other variables not from the sets $S_i$, which will need to vanish from such an equation in order to allow us to extract the linear equations we are really interested in. The reason for these "extra" variables is that $H$ needs to contain $n^{r-1}$ edge disjoint copies of $K$ for every solution of $Mx = b$. Hence, an edge of $H$ will actually be parameterized by several other elements from $\mathbb{F}_n$ (these are the elements $x_1, \ldots, x_{r-1}$ that are used after Claim 3.2). So we will need to make sure that these extra variables vanish in the linear equation which we extract from a copy of $K$. To make sure this happens we will need to carefully choose the vertices of each edge within $H$.

A final complication arises from the fact that while we need $H$ to contain relatively few copies of $K$, we also need it to contain many edge disjoint copies of $K$ for every solution of $Mx = b$. To this end we will think of each vertex of $H$ as a linear equation and we will want the linear equations corresponding to the vertices of an edge to be linearly independent.
The reason why it is hard to prove Lemma 2.4 using an $(\ell + 1)$-uniform hypergraph (as the results of [13] and [6] may suggest) is that it seems very hard to obtain all the above requirements simultaneously. The fact that we are considering hypergraphs with a larger degree of uniformity will allow us (in some sense) to break the dependencies between these requirements.

Let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$ and $b \in \mathbb{F}_n^{\ell}$. We will first perform a series of operations on $M$ and $b$ which will help us in proving Lemma 2.4. For convenience, we will continue to refer to the transformed matrix and vector as $M$ and $b$. Suppose, without loss of generality, that the last $\ell$ columns of $M$ are linearly independent. We can thus transform $M$ (and accordingly also $b$) into an equivalent set of equations in which the last $\ell$ columns form an identity matrix. For a row $M_i$ of $M$ let $m_i$ be the largest index $1 \le j \le p - \ell$ for which $M_{i,j}$ is non-zero. Let $W_i$ denote the set of indices $1 \le j \le m_i - 1$ for which $M_{i,j}$ is non-zero. Therefore, $M_i$ has $|W_i| + 2$ non-zero entries. We will need the following claim, in which we make use of the fact that we are actually proving that every set of equations has the generalized removal property and not just the removal property.

Claim 3.1
Suppose that every set of $\ell - 1$ equations in $p - 1$ unknowns over $\mathbb{F}_n$ has the generalized removal property. Suppose that the matrix $M$ defined above has a row with less than 3 non-zero entries. Then $Mx = b$ has the generalized removal property as well.

Proof:
Suppose that (say) the first row of $M$ has at most 2 non-zero entries. If this row has two non-zero elements then we can assume without loss of generality that it is of the form $x_1 = b_1 - a \cdot x_j$ where $p - \ell + 1 \le j \le p$. But then we can get an equivalent set of linear equations $M'x = b'$ by removing the first row from $M$, removing the column in which $x_j$ appears (because $x_j$ does not appear in other rows), removing the first entry of $b$ and updating $S_1$ to be $S'_1 = S_1 \cap \{b_1 - a \cdot s : s \in S_j\}$. We thus get an instance $M'x = b'$ with $\ell - 1$ equations in $p - 1$ unknowns, with the following two properties: (i) the number of solutions of $Mx = b$ with $x_i \in S_i$ is precisely the number of solutions of $M'x = b'$ with $x_1 \in S'_1, x_2 \in S_2, \ldots, x_{j-1} \in S_{j-1}, x_{j+1} \in S_{j+1}, \ldots, x_p \in S_p$; (ii) if we can remove $\delta n$ elements from each of the sets of the new instance and thus obtain sets with no solution of $M'x = b'$, then the removal of the same elements from the original sets $S_i$ would also give sets with no solution of $Mx = b$.

If the first row of $M$ has just one non-zero entry, then this equation is of the form $x_j = b$ for some $p - \ell + 1 \le j \le p$ and $b \in \mathbb{F}_n$. If $b \notin S_j$ then the sets contain no solution to $Mx = b$ and there is nothing to prove. If $b \in S_j$ then the number of solutions to $Mx = b$ is the number of solutions of the set of equations $M'x = b'$, where $M'$ is obtained by removing the row and column to which $x_j$ belongs and $b'$ is obtained by removing the first entry of $b$. As in the previous case we can now use the assumption of the claim.

Claim 3.1 implies that we can assume without loss of generality that none of the sets $W_1, \ldots, W_\ell$ is empty, because if one of them is empty then the corresponding row of $M$ contains less than 3 non-zero entries.
In that case we can iteratively remove equations from $M$ until we either: (i) get a set of linear equations in which none of the rows has less than 3 non-zero entries, in which case we can use the fact that the result holds for such sets of equations as we will next show, or (ii) get a single equation with only 2 unknowns with a non-zero coefficient. It is now easy to see that such an equation has the removal property. Indeed, suppose the equation has $p$ unknowns and only $x_1$ and $x_2$ have a non-zero coefficient. So the equation is $a_1 \cdot x_1 + a_2 \cdot x_2 + \sum_{i=3}^{p} 0 \cdot x_i = b$. In this case the number of solutions to the equation from sets $S_1, \ldots, S_p$ is the number of solutions to the equation $a_1 x_1 + a_2 x_2 = b$ with $x_1 \in S_1$, $x_2 \in S_2$, multiplied by $\prod_{i=3}^{p} |S_i|$. Therefore, if $S_1, \ldots, S_p$ contain $o(n^{p-1})$ solutions, then either (i) one of the sets $S_3, \ldots, S_p$ is of size $o(n)$, so we can remove all the elements from this set, or (ii) $S_1, S_2$ contain $o(n)$ solutions to $a_1 \cdot x_1 + a_2 \cdot x_2 = b$, but in this case, for every solution $(s_1, s_2)$ we can remove $s_1$ from $S_1$. In either case the new sets $S'_1, \ldots, S'_p$ contain no solution of the equation, as needed.

Footnote: Note that this process can result in having unknowns with a zero coefficient in all the remaining equations.

We now return to the proof of Lemma 2.4, with the assumption that none of the sets $W_i$ is empty. Let us multiply each of the rows of $M$ by $M_{i,m_i}^{-1}$ so that for every $1 \le i \le \ell$ we have $M_{i,m_i} = 1$. For $1 \le i \le \ell$ let $d_i \in \{p - \ell + 1, \ldots, p\}$ denote the index of the unique non-zero entry of $M_i$ within the last $\ell$ columns of $M$. Using the notation which we have introduced thus far, the system of linear equations $Mx = b$ can be written as the set of $\ell$ equations $L_1, \ldots, L_\ell$, where $L_i$ is the equation

$$x_{m_i} + M_{i,d_i} \cdot x_{d_i} + \sum_{j \in W_i} M_{i,j} \cdot x_j = b_i. \qquad (1)$$

Let us set

$$r = 1 + \sum_{1 \le i \le \ell} |W_i|.$$
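The bookkeeping introduced so far (the indices $m_i$ and $d_i$, the sets $W_i$, the row scaling and the value of $r$) can be sketched as follows. This is an illustration under stated assumptions: the field is a prime field with modulus `q`, the last $\ell$ columns of the input already form an identity matrix, indices are 0-based rather than 1-based as in the text, and the function name `normalize_system` is hypothetical. It relies on Python 3.8+, where `pow(a, -1, q)` returns a modular inverse.

```python
def normalize_system(M, b, ell, q):
    """Scale each row so that M[i][m_i] == 1 and collect m_i, W_i, d_i and r.

    M: list of rows over the prime field F_q whose last `ell` columns form
    an identity matrix; column indices are 0-based here."""
    p = len(M[0])
    free = p - ell  # number of unknowns outside the identity block
    rows, rhs, m, W, d = [], [], [], [], []
    for row, bi in zip(M, b):
        nonzero = [j for j in range(free) if row[j] % q != 0]
        if len(nonzero) < 2:
            # fewer than 3 non-zero entries in this row: handled by Claim 3.1
            raise ValueError("row has fewer than 3 non-zero entries")
        mi = nonzero[-1]                 # largest index among the first p - ell
        inv = pow(row[mi] % q, -1, q)    # inverse of M[i][m_i] modulo q
        scaled = [(inv * a) % q for a in row]
        rows.append(scaled)
        rhs.append((inv * bi) % q)
        m.append(mi)
        W.append(nonzero[:-1])           # non-zero positions strictly below m_i
        d.append(next(j for j in range(free, p) if scaled[j] != 0))
    r = 1 + sum(len(Wi) for Wi in W)
    return {"M": rows, "b": rhs, "m": m, "W": W, "d": d, "r": r}


# Toy system over F_7 with ell = 2 equations in p = 5 unknowns
info = normalize_system([[1, 2, 3, 1, 0], [0, 1, 4, 0, 1]], [1, 2], ell=2, q=7)
print(info["m"], info["W"], info["d"], info["r"])  # [2, 2] [[0, 1], [1]] [3, 4] 4
```

In this toy instance each row has $|W_i| + 2$ non-zero entries, and $r = 1 + 2 + 1 = 4$.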
Observe that, as mentioned in the statement of the lemma, we indeed have $r \le p^2$.

We now define an $r$-uniform $p$-colored hypergraph $K$, which will help us in proving that $Mx = b$ is hypergraph representable as in Definition 2.2. The hypergraph $K$ has $k = r - 1 + p - \ell$ vertices, which we denote by $v_1, \ldots, v_{r-1}, u_1, \ldots, u_{p-\ell}$. As for $K$'s edges, it first contains $p - \ell$ edges denoted $e_1, \ldots, e_{p-\ell}$, where $e_i$ contains the vertices $v_1, \ldots, v_{r-1}, u_i$. Note that these edges do not depend on the system $Mx = b$. As we will see later, these edges will help us to "build" the actual representation of the linear equations of $Mx = b$. So in addition to the above $p - \ell$ edges, $K$ also contains $\ell$ edges $f_{p-\ell+1}, \ldots, f_p$, where edge $f_{d_i}$ will represent (in some sense) equation $L_i$, defined in (1). To define these $\ell$ edges it will be convenient to partition the set $[r-1]$ into $\ell$ subsets $I_1, \ldots, I_\ell$ such that $I_1$ contains the numbers $1, \ldots, |W_1|$, and $I_2$ contains the numbers $|W_1| + 1, \ldots, |W_1| + |W_2|$, and so on. With this partition we define for every $1 \le i \le \ell$ edge $f_{d_i}$ to contain the vertices $\{v_t : t \in [r-1] \setminus I_i\}$, the vertices $\{u_j : j \in W_i\}$ as well as vertex $u_{m_i}$. Note that as $|I_i| = |W_i|$ the hypergraph $K$ is indeed $r$-uniform. As for the coloring of the edges of $K$, for every $1 \le i \le p - \ell$ edge $e_i$ is colored $i$ and for every $p - \ell + 1 \le d_i \le p$ edge $f_{d_i}$ is colored $d_i$.

Footnote: Note that we are using the fact that $d_1, \ldots, d_\ell$ are distinct numbers in $\{p - \ell + 1, \ldots, p\}$.

Before defining the hypergraph $H$ we need to define $p - \ell$ vectors $a_1, \ldots, a_{p-\ell} \in \mathbb{F}_n^{r-1}$ which we will use when defining $H$; we write $a_{j,t}$ for the $t$th entry of $a_j$. We think of $a_1, \ldots, a_{p-\ell}$ as the $p - \ell$ rows of a $(p - \ell) \times (r - 1)$ matrix $A$. Furthermore, for every $1 \le i \le \ell$ let $A_i$ be the sub-matrix of $A$ which contains the columns whose indices belong to $I_i$ (which was defined above). We now take the (square) sub-matrix of $A_i$ which contains the rows whose indices belong to $W_i$ to be the identity matrix (over $\mathbb{F}_n$). More precisely, if the elements of $W_i$ are $j_1 < j_2 < \ldots < j_{|W_i|}$ then $(A_i)_{j_g, g} = 1$ for every $1 \le g \le |W_i|$, and 0 otherwise. For future reference, let us denote by $A'_i$ this square sub-matrix of $A_i$. We finally set row $m_i$ of $A_i$ to be the vector whose $g$th entry is $-M_{i,j_g}$, where as above $j_g$ is the $g$th element of $W_i$. If $A_i$ has any other rows besides the ones defined above, we set them to 0. As each column of $A$ belongs to one of the matrices $A_i$ we have thus defined $A$ and therefore also the vectors $a_1, \ldots, a_{p-\ell}$.

Footnote: Note that the second index of $(A_i)_{j_g, g}$ refers to the column number within $A_i$, not $A$.

Let us make two simple observations regarding the above defined vectors which we will use later. First, fix $1 \le i \le \ell$ and $t \in I_i$, and suppose $t$ is the $g$th element of $I_i$. Then

$$\sum_{j \in W_i} a_{j,t} \cdot M_{i,j} = (A_i)_{j_g, g} \cdot M_{i,j_g} = M_{i,j_g} = -(A_i)_{m_i, g} = -a_{m_i, t}, \qquad (2)$$

where the first equality is due to the fact that the only non-zero entry within column $g$ of $A_i$ and the rows from $W_i$ appears in row $j_g$. The second equality uses the fact that this entry is in fact 1. The third equality uses the definition of row $m_i$ of $A_i$.

The second observation we will need is the following.

Claim 3.2
For ≤ i ≤ ℓ , let B i be the following r − × r − matrix: for every j ∈ [ r − \ I i wehave ( B i ) j,j = 1 and ( B i ) j,t = 1 for t = j . The other | I i | rows of B i are the | W i | (= | I i | ) vectors { a t : t ∈ W i } . Then, for every ≤ i ≤ ℓ the matrix B i is non-singular. Proof:
To show that B i is non-singular it is clearly enough to show that its | I i | × | I i | minor B ′ i ,which is determined by I i , is non-singular. But observe that this fact follows from the way we havedefined the vectors a , . . . , a p − ℓ above because B ′ i is just A ′ i , which is in fact the identity matrix.We are now ready to define, for every set of subsets S , . . . , S p ⊆ F n , the hypergraph H which willestablish that M x = b is hypergraph representable. The vertex set of H consists of k (= r − p − ℓ )disjoint sets V , . . . , V r − , U , . . . , U p − ℓ , where each of these sets contains n vertices and we think ofthe elements of each of these sets as the elements of F n . As for the edges of H , we first put for1 ≤ i ≤ p − ℓ and every choice of r − x ∈ V , . . . , x r − ∈ V r − and element s ∈ S i , an edgewith color i and label s , which contains the vertices x , . . . , x r − as well as vertex y ∈ U i , where y = s + r − X j =1 a i,j x j , (3)and the values a i,j were defined above. These edges will later play the role of the edges e , . . . , e p − ℓ of K defined above. Note that these edges are defined irrespectively of the set of equations M x = b .We now define the edges of H which will “simulate” the linear equations of M x = b . For every1 ≤ i ≤ ℓ , and for every choice of an element s ∈ S d i , for every choice of r − − | I i | vertices { x t ∈ V t : t ∈ [ r − \ I i } and for every choice of | W i | (= | I i | ) vertices { y j ∈ U j : j ∈ W i } we havean edge with color d i and label s , which contains the vertices { x t : t ∈ [ r − \ I i } and { y j : j ∈ W i } as well as vertex y ∈ U m i , where y = b i − M i,d i · s − X j ∈ W i M i,j · y j + X t ∈ [ r − \ I i x t · ( a m i t + X j ∈ W i a jt · M i,j ) . (4)Let us first note that as required by Lemma 2.4, each edge of H has a color i and is labeled byan element s ∈ S i . 
In fact, for each $1 \le i \le p$ and for each $s \in S_i$, the hypergraph $H$ has $n^{r-1}$ edges that are colored $i$ and labeled with $s$. We start with the following claim.

(Footnote: Note that $t$ is an index of a column of $A$, while $g$ is an index of a column of $A_i$.)

Claim 3.3 $H$ is a simple hypergraph, that is, it contains no parallel edges. Proof:
Observe that edges of $H$ with different colors have a single vertex from each set in a different subset of $r$ of the sets $V_1, \ldots, V_{r-1}, U_1, \ldots, U_{p-\ell}$. Indeed, edges with color $1 \le i \le p - \ell$ contain a vertex from each of the sets $V_1, \ldots, V_{r-1}$ and another vertex from $U_i$, while an edge with color $p - \ell + 1 \le d_i \le p$ contains vertices from the sets $\{V_t : t \in [r-1] \setminus I_i\}$ as well as vertices from some of the sets $U_1, \ldots, U_{p-\ell}$. Note that the sets $I_1, \ldots, I_\ell$ are disjoint and non-empty, as none of the sets $W_i$ is empty, a fact which (as noted previously) follows from Claim 3.1. Observe that if $W_i$ were empty, then edges with color $d_i$ would have had parallel edges with color $m_i$.

As for edges with the same color $1 \le i \le p - \ell$, recall that they are defined in terms of a different combination of $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$ and $s \in S_i$. So if one edge is defined in terms of $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$ and $s \in S_i$ and another using $x'_1, \ldots, x'_{r-1} \in \mathbb{F}_n$ and $s' \in S_i$, then either (i) $x_j \ne x'_j$ for some $1 \le j \le r-1$, in which case the edges have a different vertex in $V_j$, or (ii) $x_j = x'_j$ for all $1 \le j \le r-1$, implying that $s \ne s'$, and therefore the edges have a different vertex in $U_i$ by the way we chose the vertex in this set in (3).

The case of edges with the same color $p - \ell + 1 \le d_i \le p$ is similar. Recall that such edges are defined in terms of a different combination of $\{x_t : t \in [r-1] \setminus I_i\}$, $\{y_j : j \in W_i\}$ and $s \in S_{d_i}$.
So if one edge is defined in terms of $\{x_t : t \in [r-1] \setminus I_i\}$, $\{y_j : j \in W_i\}$ and $s \in S_{d_i}$ and another using $\{x'_t : t \in [r-1] \setminus I_i\}$, $\{y'_j : j \in W_i\}$ and $s' \in S_{d_i}$, then either (i) $x_t \ne x'_t$ for some $t \in [r-1] \setminus I_i$, in which case the edges have a different vertex in $V_t$, (ii) $y_j \ne y'_j$ for some $j \in W_i$, in which case the edges have a different vertex in $U_j$, or (iii) $x_t = x'_t$ for all $t \in [r-1] \setminus I_i$ and $y_j = y'_j$ for all $j \in W_i$, implying that $s \ne s'$ and therefore the edges have a different vertex in $U_{m_i}$ by the way we chose the vertex in this set in (4) and from the fact that $M_{i,d_i} \ne 0$.

The above claim establishes the first property required by Definition 2.2, and we now turn to establish the second and third. Fix arbitrary elements $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$. For every choice of $r-1$ elements $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$, let $K_x$ be the set of vertices $x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$, $y_1 \in U_1, \ldots, y_{p-\ell} \in U_{p-\ell}$, where for every $1 \le j \le p - \ell$

$$ y_j = s_j + \sum_{t=1}^{r-1} a_{j,t} \cdot x_t . \tag{5} $$

We will need the following important claim regarding the vertices of $K_x$. Getting back to the overview of the proof given in Subsection 2.1, this is where we extract one of the linear equations $L_i$ (defined above) from a certain combination of edges of a copy of $K$. We also note that the linear equation we "initially" obtain (see (6)) also includes the elements $x_i$, but the way we have constructed $H$ guarantees that the $x_i$'s vanish and we eventually get a linear equation involving only elements from the sets $S_i$. We will then use this claim to show that $H$ contains many edge disjoint copies of $K$ when $s_1, \ldots, s_{p-\ell}$ determine a solution to $Mx = b$, and, in the other direction, that every colored copy of $K$ in $H$ arises from such a solution. For what follows we remind the reader that for $1 \le i \le \ell$ we have $p - \ell + 1 \le d_i \le p$ and that for $i < i'$ we have $d_i \ne d_{i'}$.
Returning to the overview of the proof given in Subsection 2.1, we are now going to use the fact that edges with colors $d_i$ and $m_i$ have a common vertex in $U_{m_i}$ in order to deduce the linear equation $L_i$.

Claim 3.4
Let $1 \le i \le \ell$. Then the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup \{y_{m_i}\}$ span an edge (of color $d_i$) if and only if there is an element $s_{d_i} \in S_{d_i}$ such that $\{s_j : j \in W_i\} \cup \{s_{m_i}\} \cup \{s_{d_i}\}$ satisfy equation $L_i$ (defined in (1)).

Proof: $H$ contains an edge containing the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup \{y_{m_i}\}$ if and only if (recall (4)) there is an $s_{d_i} \in S_{d_i}$ such that

$$ y_{m_i} = b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot y_j + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a_{m_i,t} + \sum_{j \in W_i} a_{j,t} \cdot M_{i,j} \Big) . \tag{6} $$

Using (5) this is equivalent to requiring that

$$
\begin{aligned}
s_{m_i} + \sum_{t=1}^{r-1} a_{m_i,t} \cdot x_t
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot \Big( s_j + \sum_{t=1}^{r-1} a_{j,t} \cdot x_t \Big) + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a_{m_i,t} + \sum_{j \in W_i} a_{j,t} \cdot M_{i,j} \Big) \\
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j - \sum_{t=1}^{r-1} x_t \cdot \sum_{j \in W_i} a_{j,t} \cdot M_{i,j} + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a_{m_i,t} + \sum_{j \in W_i} a_{j,t} \cdot M_{i,j} \Big) \\
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j - \sum_{t \in I_i} x_t \cdot \sum_{j \in W_i} a_{j,t} \cdot M_{i,j} + \sum_{t \in [r-1] \setminus I_i} x_t \cdot a_{m_i,t} .
\end{aligned}
$$

Using (2) in the last row above, we can write the above requirement as

$$ s_{m_i} + \sum_{t=1}^{r-1} a_{m_i,t} \cdot x_t = b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j + \sum_{t=1}^{r-1} a_{m_i,t} \cdot x_t , $$

or equivalently that

$$ s_{m_i} + M_{i,d_i} \cdot s_{d_i} + \sum_{j \in W_i} M_{i,j} \cdot s_j = b_i , $$

which is precisely equation $L_i$.

For the next two claims, let us recall that we assume that the last $\ell$ columns of $M$ form a diagonal matrix. Therefore, a solution to $Mx = b$ is determined by the first $p - \ell$ elements of $x$.

Claim 3.5
Suppose $s_1, \ldots, s_{p-\ell}$ determine a solution $s_1, \ldots, s_p$ to $Mx = b$. Then, any set $K_x$ (defined above) spans a colored copy of $K$. In particular, for every solution $s_1, \ldots, s_p$ to $Mx = b$, $H$ has $n^{r-1}$ colored copies of $K$, in which the edge of color $i$ is labeled with $s_i$. Proof:
We claim that $K_x$ spans a colored copy of $K$, where for every $1 \le i \le r-1$ vertex $v_i$ of $K$ is mapped to vertex $x_i$ of $H$, and for every $1 \le j \le p - \ell$ vertex $u_j$ of $K$ is mapped to vertex $y_j$ of $H$. To see that the above is a valid mapping of the colored edges of $K$ to colored edges of $H$, we first note that the way we have defined $H$ in (3) and the vertices $y_1, \ldots, y_{p-\ell}$ in (5) guarantees that for every $1 \le j \le p - \ell$ we have an edge with color $j$ which contains the vertices $x_1, \ldots, x_{r-1}, y_j$. This is actually true even if $s_1, \ldots, s_{p-\ell}$ do not determine a solution.

As for edges with color $p - \ell + 1 \le d_i \le p$, the fact that the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup \{y_{m_i}\}$ span such an edge follows from Claim 3.4, because we assume that $s_1, \ldots, s_{p-\ell}$ determine a solution to $Mx = b$, so for every $1 \le i \le \ell$ there exists an element $s_{d_i} \in S_{d_i}$ as required by Claim 3.4. We thus conclude that $x_1, \ldots, x_{r-1}, y_1, \ldots, y_{p-\ell}$ span a colored copy of $K$. Finally, note that by the way we have defined $H$, the edge of $K_x$ which is colored $i$ is indeed labeled with the element $s_i \in S_i$.

Claim 3.6 If $s_1, \ldots, s_{p-\ell}$ determine a solution to $Mx = b$, then the $n^{r-1}$ colored copies of $K$ spanned by the sets $K_x$ (defined above) are edge disjoint. Proof:
Let us consider two colored copies $K_x$ and $K_y$ for some $x \ne y$ (Claim 3.5 guarantees that $K_x$ and $K_y$ indeed span a colored copy of $K$). Clearly $K_x$ and $K_y$ cannot share edges with color $1 \le i \le p - \ell$, because the vertices of such edges within $V_1, \ldots, V_{r-1}$ are uniquely determined by the coordinates of $x$ and $y$.

We now consider an edge of $K_x$ with color $d_i \in \{p - \ell + 1, \ldots, p\}$. Let $j_1 < j_2 < \ldots < j_{|W_i|}$ be the elements of $W_i$, and let $B_i$ be the matrix defined in Claim 3.2. Recall that $B_i$ satisfies the following: (i) for $j \in [r-1] \setminus I_i$ we have $(B_i)_{j,j} = 1$ and $(B_i)_{j,t} = 0$ when $t \ne j$, and (ii) if $j \in I_i$ is the $g$-th element of $I_i$, then the $j$-th row of $B_i$ is the vector $a_{j_g}$ (where $j_g$ is the $g$-th element of $W_i$). Let us also define an $(r-1)$-dimensional vector $c$ as follows: for every $j \in [r-1] \setminus I_i$ we have $c_j = 0$, and for every $j \in I_i$, if $j$ is the $g$-th element of $I_i$ then $c_j = s_{j_g}$. The key observation now is that the vertices of the edge whose color is $d_i \in \{p - \ell + 1, \ldots, p\}$ within the $r-1$ sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$ are given by $B_i x + c$. More precisely, for every $j \in [r-1] \setminus I_i$ the vertex of the edge of color $d_i$ within $V_j$ is given by $(B_i x + c)_j$. Also, for every $j_g \in W_i$, if $j \in I_i$ is the $g$-th element of $I_i$, then the vertex of this edge within $U_{j_g}$ is given by $(B_i x + c)_j$. Claim 3.2 asserts that $B_i$ is non-singular, so $B_i x + c \ne B_i y + c$ whenever $x \ne y$, and we can conclude that the edges with color $d_i$ of $K_x$ and $K_y$ can share at most $r-2$ vertices in the $r-1$ sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$. So any pair of edges of color $d_i$ can share at most $r-1$ vertices, and therefore $K_x$ and $K_y$ are edge disjoint.

(Footnote: We remark that when we defined the matrices $B_i$ in Claim 3.2 we did not "impose" the ordering of the rows that correspond to $W_i$ as we do here, but this ordering, of course, does not affect the rank of $B_i$.)

Claim 3.7 If $S_1, \ldots, S_p$ contain $T$ solutions to $Mx = b$ with $x_i \in S_i$ then $H$ contains $T n^{r-1}$ colored copies of $K$. Proof:
Recall that we assume that the last $\ell$ columns of $M$ form a diagonal matrix. Therefore, the number of solutions $T$ to $Mx = b$ is just the number of choices of $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$ that can be extended to a solution of $Mx = b$ by choosing appropriate values $s_{p-\ell+1} \in S_{p-\ell+1}, \ldots, s_p \in S_p$. Therefore, it is enough to show that every colored copy of $K$ in $H$ is given by a choice of $r-1$ vertices $x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$ and a choice of $p - \ell$ elements $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$ that determine a solution to $Mx = b$. So let us consider a colored copy of $K$ in $H$. This copy must contain edges with the colors $1, \ldots, p - \ell$. By the way we have defined $H$ this means that this copy must contain $r-1$ vertices $x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$ as well as $p - \ell$ vertices $y_1 \in U_1, \ldots, y_{p-\ell} \in U_{p-\ell}$. Furthermore, for $1 \le j \le p - \ell$ we have

$$ y_j = s_j + \sum_{t=1}^{r-1} a_{j,t} \cdot x_t \tag{7} $$

for some choice of $s_j \in S_j$. So the vertex set of such a copy is determined by the choice of $x_1, \ldots, x_{r-1}$ and $s_1, \ldots, s_{p-\ell}$. Note that the set of vertices is just the set $K_x$ defined before Claim 3.4, for $x_1, \ldots, x_{r-1}$ and $s_1, \ldots, s_{p-\ell}$. Therefore, we can apply Claim 3.4 on this set of vertices.

So our goal now is to show that there are elements $s_{p-\ell+1}, \ldots, s_p$ which together with $s_1, \ldots, s_{p-\ell}$ form a solution of $Mx = b$. Consider any $1 \le i \le \ell$. As the vertices at hand span a colored copy of $K$ they must span an edge with color $d_i$. This edge must contain the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup \{y_{m_i}\}$. But by Claim 3.4 if these vertices span an edge (of color $d_i$) then there is an element $s_{d_i} \in S_{d_i}$ such that $\{s_j : j \in W_i\} \cup \{s_{m_i}\} \cup \{s_{d_i}\}$ satisfy equation $L_i$. As this holds for every $1 \le i \le \ell$ we deduce that $s_1, \ldots, s_p$ satisfy $Mx = b$.

The proof of Lemma 2.4 now follows from Claims 3.3, 3.5, 3.6 and 3.7.

• Our removal lemma for sets of linear equations works over any field.
For the special case of a single linear equation, Kráľ, Serra and Vena [13] (following Green [10]) proved a removal lemma […]

(Footnote: We note that the way we have defined $H$ does not (necessarily) guarantee that edges of the same color cannot share $r-1$ vertices […] $i$ may share the vertex in the set $U_{m_i}$ and $r-2$ vertices in the $r-1$ sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$ […] because only vertices from this combination of $r$ of the sets $V_1, \ldots, V_{r-1}, U_1, \ldots, U_{p-\ell}$ span an edge with color $d_i$.)

• Green [10] used the regularity lemma for groups in order to resolve a conjecture of Bergelson, Host, Kra and Ruzsa [4], which stated that every $S \subseteq [n]$ of size $\delta n$ contains at least $(\delta - o(1))n$ […] $S \subseteq [n]$ of size $\delta n$ contains at least $(\delta - o(1))n$ […]

• Our proof of the removal lemma for sets of linear equations applies the hypergraph removal lemma. As a consequence, we get extremely poor bounds relating $\epsilon$ and $\delta$. Roughly speaking, the best current bounds for the graph removal lemma give that $\delta(\epsilon)$ grows like $\mathrm{Tower}(1/\epsilon)$, that is, a tower of exponents of height $1/\epsilon$. For 3-uniform hypergraphs, the bounds are given by iterating the Tower function $1/\epsilon$ times, and so on. So on the one hand, the fact that we are using hypergraphs with a large degree of uniformity implies that the bounds we get are extremely weak. On the other hand, as even the graph removal lemma gives bounds which are too weak for any reasonable application, this is not a real cause for concern. It may still be interesting, however, to see if one can prove Theorem 2 with a proof similar to the one given in [13] for the special case of a single equation.

• Given the above discussion it is reasonable to ask for which sets of equations
$Mx = b$ one can get a polynomial dependence between $\epsilon$ and $\delta$. This seems to be a challenging open problem even for a single equation, so let us focus on this case. For a linear equation $L$, let $r_L(n)$ denote the size of the largest subset of $[n]$ which contains no (non-trivial) solution to $L$. Problems of this type were studied by Ruzsa [16]. A simple counting argument shows that if $r_L(n) = n^{1-c}$ for some positive $c$, then $\delta(\epsilon) = O(1/\epsilon)^{1/c}$. However, characterizing the equations with this property seems like a very hard problem, see [16]. Furthermore, we do not even know if all the linear equations for which $r_L(n) = n^{1-o(1)}$ do not have a polynomial dependence between $\epsilon$ and $\delta$. For example, we do not know if such a dependence exists for the linear equation $x_1 + x_2 = x_3$ (for which $r_L(n) = \Theta(n)$).

But for at least some of these linear equations, we can rule out such a polynomial dependence, as the following example shows. Consider the linear equation $x_1 + x_2 = 2x_3$, that is, the linear equation which defines a 3-term arithmetic progression. (Footnote: The argument can be extended to any linear equation in which one variable is a convex combination of the others.) We claim that for this equation there is no polynomial relation between $\epsilon$ and $\delta$. Fix an $\epsilon$ and let $n_0 = n_0(\epsilon)$ be large enough so that every $S \subseteq [n_0]$ of size $\epsilon n_0$ contains a 3-term arithmetic progression. Roth's Theorem [16] states that such an $n_0$ exists. Therefore, for every $n \ge n_0$ and for every $S \subseteq [n]$ of size $2\epsilon n$ we have to remove at least $\epsilon n$ elements from $S$ in order to destroy all 3-term arithmetic progressions. Let $m$ be the largest integer for which $[m]$ contains a subset of size $4\epsilon m$ containing no 3-term arithmetic progressions. The well known construction of Behrend [3] implies that $m \ge (1/\epsilon)^{c \log(1/\epsilon)}$ for some absolute constant $c$. Let $X$ be one such subset of $[m]$. For every $n \ge n_0$, let $S \subseteq [n]$ be the set of integers with the property that in their base $2m$ representation, the least significant digit belongs to $X$. Then clearly $|S| = n \cdot \frac{|X|}{2m} = 2\epsilon n$ and so one should remove at least $\epsilon n$ elements from $S$ to destroy all 3-term arithmetic progressions. On the other hand, if $x_1, x_2, x_3 \in S$ form a 3-term arithmetic progression then, as $X \subseteq [m]$, so do the least significant digits of $x_1, x_2, x_3$, because there is no carry in the base $2m$ addition. But as these digits belong to $X$ we get that they must be identical. Therefore, the number of 3-term arithmetic progressions in $S$ is at most $|S|^2/m \le \epsilon^{c \log 1/\epsilon}\, n^2$, implying that $\delta(\epsilon) \le \epsilon^{c \log 1/\epsilon}$.

• The contrapositive version of our main result says that if one should remove $\epsilon n$ elements from $S \subseteq [n]$ in order to destroy all solutions of $Mx = b$ then $S$ contains $f(\epsilon) n^{p-\ell}$ solutions to $Mx = b$. The "analogous" result for graphs (or hypergraphs) is that if one should remove $\epsilon n^2$ edges from a graph $G$ in order to destroy all the copies of $H$ then $G$ contains $\delta(\epsilon) n^h$ copies of $H$ (where $h$ is the number of vertices of $H$).
The main result of [1] is an "infinite" version of the removal lemma for graphs, which states that if $\mathcal{H}$ is a (possibly infinite) set of graphs, and if one should remove $\epsilon n^2$ edges from $G$ in order to destroy all the copies of all the graphs $H \in \mathcal{H}$, then for some $H \in \mathcal{H}$, whose size $h$ satisfies $h \le h(\epsilon)$, $G$ contains $\delta(\epsilon) n^h$ copies of $H$. It seems natural to ask if there is a corresponding "infinite" removal lemma for sets of linear equations. More precisely, is it the case that for every (possibly infinite) set $\mathcal{M} = \{M_1 x = b_1, M_2 x = b_2, \ldots\}$ of sets of linear equations the following holds: if one should remove $\epsilon n$ elements from $S \subseteq [n]$ in order to destroy all the solutions to all the sets of linear equations in $\mathcal{M}$, then for some set of linear equations $Mx = b \in \mathcal{M}$, with $p \le p(\epsilon)$ unknowns, $S$ contains $\delta(\epsilon) n^{p-\ell}$ solutions to $Mx = b$?

Acknowledgements:
I would like to thank Vojta Rödl, Benny Sudakov and Terry Tao for helpful discussions related to this paper. I would also like to thank Pablo Candela for his helpful comments on the paper.
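As a sanity check on the base $2m$ construction used in the remarks above to rule out a polynomial dependence for 3-term arithmetic progressions, the following is a minimal Python sketch on a toy scale. It is only an illustration under stated assumptions, not part of the proof: the values `m = 9`, `n = 180` and the progression-free set `X = {0, 1, 3, 4}` are hypothetical choices (not a Behrend set), digits are taken from `{0, ..., m-1}` and integers from `{0, ..., n-1}` rather than from $[m]$ and $[n]$, and the helper names are invented for the example.

```python
from itertools import product


def is_3ap_free(X):
    """True if X has no non-trivial solution (a != b) to a + b = 2*c."""
    Xs = set(X)
    return not any(
        a != b and (a + b) % 2 == 0 and (a + b) // 2 in Xs
        for a, b in product(Xs, repeat=2)
    )


def build_S(n, m, X):
    """Integers in {0, ..., n-1} whose least significant base-2m digit lies in X."""
    return [v for v in range(n) if v % (2 * m) in X]


def three_aps(S):
    """All ordered triples (x1, x2, x3) in S with x1 + x2 = 2*x3, trivial ones included."""
    Ss = set(S)
    return [
        (a, b, (a + b) // 2)
        for a, b in product(S, repeat=2)
        if (a + b) % 2 == 0 and (a + b) // 2 in Ss
    ]


# Toy parameters (assumptions for illustration only).
m = 9
X = {0, 1, 3, 4}   # digits below m with no non-trivial 3-term progression
n = 180            # a multiple of 2*m, so every residue class has n/(2m) elements

S = build_S(n, m, X)
aps = three_aps(S)

# Since all digits of X are below m, there is no carry, so the last digits
# of any progression in S form a progression in X and hence coincide.
same_digit = all(a % (2 * m) == b % (2 * m) == c % (2 * m) for a, b, c in aps)

# Each x1 in S can only be paired with the n/(2m) integers sharing its last digit.
bound = len(S) * (n // (2 * m))

print(len(S), len(aps), bound, same_digit)
```

The counting bound in the code is the elementary one (each element of `S` has at most `n/(2m)` partners with the same last digit); with $|X| = 4\epsilon m$ it gives a count of order $\epsilon n^2/m$, which is of the same flavour as the $|S|^2/m$ estimate in the text.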
References

[1] N. Alon and A. Shapira, Every monotone graph property is testable, SIAM J. on Computing, 38 (2008), 505-522.
[2] T. Austin and T. Tao, On the testability and repair of hereditary hypergraph properties, manuscript, 2008.
[3] F. A. Behrend, On sets of integers which contain no three terms in arithmetic progression, Proc. National Academy of Sciences USA 32 (1946), 331-332.
[4] V. Bergelson, B. Host, B. Kra and I.Z. Ruzsa, Multiple recurrence and nilsequences, Inventiones Mathematicae 160 (2005), 261-303.
[5] P. Candela, On systems of linear equations and uniform hypergraphs, manuscript, 2008.
[6] P. Frankl and V. Rödl, Extremal problems on set systems, Random Structures and Algorithms 20 (2002), 131-164.
[7] H. Furstenberg and Y. Katznelson, An ergodic Szemerédi theorem for commuting transformations, J. Analyse Math. 34 (1978), 275-291.
[8] T. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Ann. of Math. Volume 166, Number 3 (2007), 897-946.
[9] T. Gowers, Quasirandomness, counting and regularity for 3-uniform hypergraphs, Combinatorics, Probability and Computing, 15 (2006), 143-184.
[10] B. Green, A Szemerédi-type regularity lemma in abelian groups, GAFA 15 (2005), 340-376.
[11] Y. Kohayakawa, B. Nagle, V. Rödl, M. Schacht and J. Skokan, The hypergraph regularity method and its applications, Proceedings of the National Academy of Sciences USA, 102(23): 8109-8113.
[12] J. Komlós and M. Simonovits, Szemerédi's Regularity Lemma and its applications in graph theory. In: