A Multidimensional Szemerédi Theorem in the primes
BRIAN COOK, ÁKOS MAGYAR, TATCHAI TITICHETRAKUN
ABSTRACT. Let $A$ be a subset of positive relative upper density of $\mathbb{P}^d$, the $d$-tuples of primes. We prove that $A$ contains an affine copy of any finite set $F \subseteq \mathbb{Z}^d$, which provides a natural multi-dimensional extension of the theorem of Green and Tao on the existence of long arithmetic progressions in the primes. The proof uses the hypergraph approach by assigning a pseudo-random weight system to the pattern $F$ on a $(d+1)$-partite hypergraph; a novel feature being that the hypergraph is no longer uniform, with weights attached to lower dimensional edges. Then, instead of using a transference principle, we proceed by extending the proof of the so-called hypergraph removal lemma to our setting, relying only on the linear forms condition of Green and Tao.
1. INTRODUCTION
1.1. Background.
A celebrated theorem in additive combinatorics due to Green and Tao [7] establishes the existence of arbitrarily long arithmetic progressions in the primes. It is proved that if $A$ is a subset of the primes of positive relative upper density then $A$ necessarily contains infinitely many affine copies of any finite set of integers. As such, it may be viewed as a relative version of Szemerédi's theorem [17] on the existence of long arithmetic progressions in dense subsets of the integers.

Another fundamental result in this area is the multi-dimensional extension of Szemerédi's theorem originally proved by Furstenberg and Katznelson [3]. It states that if $A \subseteq \mathbb{Z}^d$ is of positive upper density then $A$ contains an affine copy of any finite set $F \subseteq \mathbb{Z}^d$. The proof in [3] uses ergodic methods; however, a more recent combinatorial approach was developed by Gowers [5] and, independently, by Nagle, Rödl and Schacht [14].

It is natural to ask whether a multi-dimensional extension of the result of Green and Tao, or alternatively a relative version of the Furstenberg-Katznelson theorem, can be established. In fact, this question was already raised in [18], where the existence of arbitrary constellations among the Gaussian primes was shown. A partial result was obtained earlier by the first two authors [2], where it was proved that relatively dense subsets of $\mathbb{P}^d$ contain an affine copy of any finite set $F \subseteq \mathbb{Z}^d$ which is in general position, in the sense that each coordinate hyperplane contains at most one point of $F$.

A common feature of the above mentioned results is that they use an embedding of the underlying sets (the primes or the Gaussian primes) into a set which is sparse but sufficiently random with respect to the pattern $F$.
In our case, when the set $F$ is not in general position (the simplest example being a 2-dimensional corner), this does not seem possible, due to the extra correlations arising from the direct product structure. For example, if 3 vertices of a rectangle with sides parallel to the coordinate axes are in $\mathbb{P}^2$, then the fourth vertex is necessarily in $\mathbb{P}^2$, a type of self-correlation not present in the one dimensional case or for the Gaussian primes.

Another approach, already partly used in [18], is to establish a hypergraph removal lemma [5], [14] for sparse uniform hypergraphs, or alternatively with weights attached to the faces. This approach has been utilized by the second and third authors [13] to show the existence of $d$-dimensional corners (simplices with edges parallel to the coordinate axes) in dense subsets of $\mathbb{P}^d$. Recently a proof based on hypergraph theory, using only the linear forms condition, has been obtained in [1], covering both the original Green-Tao theorem and the case of the Gaussian primes.

In all of the above approaches the crucial point is to prove a removal lemma for weighted (or sparse) uniform hypergraphs, using transference arguments to remove the weights from the hyperedges. In contrast, for a general constellation in $\mathbb{P}^d$ the hypergraph approach leads to a weighted closed hypergraph with weights attached possibly to any lower dimensional edge, and the usual transference principles do not apply. Our approach is different: we do not try to remove the weights and thereby reduce the problem to previously known results, but instead extend the proof of the hypergraph regularity and removal lemmas directly to the weighted setting, which might be of independent interest.
In this respect our argument is essentially self-contained, relying only on results from sieve theory, namely on the so-called linear forms condition [7].

Simultaneously with our original work on this problem, the existence of arbitrary constellations in relatively dense subsets of $\mathbb{P}^d$ has also been shown by Tao and Ziegler [20], using an entirely different method based on an infinite number of linear forms conditions to obtain a weighted version of the Furstenberg correspondence principle, and a short, elegant proof was obtained afterwards by Fox and Zhao [4] using sampling arguments. Both of the above proofs however rely on the full force of the results of Green, Tao and Ziegler developed in [8], [9], [10] for the study of the asymptotic number of prime solutions of systems of linear equations. As such, the methods of [20] and [4] do not provide bounds, while from our approach one can extract quantitative statements. The bounds, though recursive, are rather poor (of iterated tower-exponential type) and we do not attempt to calculate them explicitly here. Also, as we rely only on sieve techniques, our approach is somewhat flexible; for instance, it might not be hard to modify it to count the number of small copies of a finite set $F$, of size $N^{\varepsilon}$, in a set $A \subseteq [1,N]^d \cap \mathbb{P}^d$ of positive relative density.

1.2. Main results.
Let us recall that a set $A \subseteq \mathbb{P}^d$ is of positive relative upper density if
\[ \limsup_{N\to\infty} \frac{|A \cap \mathbb{P}_N^d|}{|\mathbb{P}_N|^d} > 0, \]
where $\mathbb{P}_N$ denotes the set of primes up to $N$, and $|A|$ stands for the cardinality of a set $A$. If $F \subseteq \mathbb{Z}^d$ is a finite set, we say that a set $F'$ is an affine copy of $F$, or alternatively that $F'$ is a constellation defined by $F$, if
\[ F' = x + t\cdot F = \{x + ty;\ y \in F\}. \]
We call $F'$ non-trivial if $t \neq 0$. Our main result is the following.

Theorem 1.1. If $A$ is a subset of $\mathbb{P}^d$ of positive upper relative density, then $A$ contains infinitely many non-trivial affine copies of any finite set $F \subseteq \mathbb{Z}^d$.

Note that it is enough to show that the set $A$ contains at least one non-trivial affine copy of $F$, as deleting such a copy from $A$ does not affect its relative density. Also, replacing the set $F$ by $F' = F \cup (-F)$ one can require that the dilation parameter $t$ is positive. By lifting the problem to a higher number of dimensions, it is easy to see that one can assume that $F$ forms the vertices of a $d$-dimensional simplex. Indeed, let $F = \{0, x_1, \dots, x_k\}$, choose a set of $k$ linearly independent vectors $\{y_1,\dots,y_k\} \subseteq \mathbb{Z}^k$, and define the set
\[ \Delta := \{0, (x_1,y_1), \dots, (x_k,y_k), z_{k+1}, \dots, z_{k+d}\} \subseteq \mathbb{Z}^{k+d} \]
such that the vectors of $\Delta\setminus\{0\}$ form a basis of $\mathbb{R}^{k+d}$. If the set $A' = A \times \mathbb{P}^k$ contains an affine copy of $\Delta$ then clearly $A$ contains an affine copy of the set $\pi(\Delta) \supseteq F$, where $\pi : \mathbb{R}^d \times \mathbb{R}^k \to \mathbb{R}^d$ is the natural orthogonal projection.

In the case when $\Delta \subseteq \mathbb{Z}^d$ is a $d$-dimensional simplex, we prove a quantitative version of Theorem 1.1. To formulate it we define the quantity
\[ l(\Delta) := \sum_{i=1}^{d} |\pi_i(\Delta)|, \tag{1.2.1} \]
$\pi_i : \mathbb{R}^d \to \mathbb{R}$ being the orthogonal projection to the $i$-th coordinate axis.

Theorem 1.2. Let $\alpha > 0$ and let $\Delta \subseteq \mathbb{Z}^d$ be a $d$-dimensional simplex. There exists a constant $c(\alpha,\Delta) > 0$ such that for any $N > 1$ and any set $A \subseteq \mathbb{P}_N^d$ such that $|A| \ge \alpha|\mathbb{P}_N|^d$, the set $A$ contains at least $c(\alpha,\Delta)\, N^{d+1} (\log N)^{-l(\Delta)}$ affine copies of the simplex $\Delta$.

Note that in Theorem 1.2 we do not require the copies of $\Delta$ to be non-trivial; thus, without loss of generality, $N$ can be assumed to be sufficiently large with respect to $\alpha$ and $\Delta$. It is clear that Theorem 1.2 implies Theorem 1.1, as the number of trivial copies of $\Delta$ in $A$ is at most $N^d (\log N)^{-d}$.

To see why the above lower bound is meaningful, note that there are $\approx N^{d+1}$ affine copies of $\Delta$ in $[1,N]^d$, and for a fixed $i$ the probability that all the $i$-th coordinates of an affine copy $\Delta'$ are primes is roughly $(\log N)^{-|\pi_i(\Delta)|}$. Thus if the prime tuples behave randomly, the probability that $\Delta' \subseteq \mathbb{P}^d$ is about $(\log N)^{-l(\Delta)}$.

In the contrapositive, Theorem 1.2 states that if a set $A \subseteq \mathbb{P}_N^d$ contains at most $\delta N^{d+1} (\log N)^{-l(\Delta)}$ affine copies of $\Delta$, then its relative density is at most $\varepsilon$, where $\varepsilon = \varepsilon(\delta)$ is a quantity such that $\varepsilon(\delta) \to 0$ as $\delta \to 0$. As for a number of similar results [7], [18], [2], [20], to prove this one formulates a statement involving a pseudo-random measure $\nu = \nu^{(N)} : [1,N] \to \mathbb{R}_+$.

1.3. The Green-Tao measure and the linear forms condition.
Let us recall the pseudo-random measure $\nu$ introduced by Green and Tao, and (a slight variant of) the so-called linear forms condition, see [7], Sec. 9.

Let $\omega$ be a sufficiently large number and let $W = \prod_{p \le \omega} p$ be the product of primes up to $\omega$. For given $b$ relatively prime to $W$ define the modified von Mangoldt function $\bar\Lambda_b : \mathbb{Z} \to \mathbb{R}_{\ge 0}$ by
\[ \bar\Lambda_b(n) = \begin{cases} \frac{\phi(W)}{W}\log(Wn+b) & \text{if } Wn+b \text{ is a prime,} \\ 0 & \text{otherwise.} \end{cases} \]
Here $\phi$ is the Euler function. Note that by Dirichlet's theorem on the distribution of primes in residue classes one has that $\sum_{n \le N} \bar\Lambda_b(n) = N(1+o(1))$. A crucial fact is that the function $\bar\Lambda_b$ is majorized by divisor sums closely related to the so-called Goldston-Yıldırım divisor sum [7], [11]
\[ \Lambda_R(n) = \sum_{d | n,\ d \le R} \mu(d) \log(R/d), \]
$\mu$ being the Möbius function and $R = N^{d^{-1} 2^{-d-5}}$. Indeed, for given small parameters $0 < \varepsilon_1 < \varepsilon_2 < 1$ (whose values will be specified later), recall the Green-Tao measure
\[ \nu_b(n) = \begin{cases} \frac{\phi(W)}{W} \frac{\Lambda_R(Wn+b)^2}{\log R} & \text{if } \varepsilon_1 N \le n \le \varepsilon_2 N, \\ 1 & \text{otherwise.} \end{cases} \]
Clearly $\nu_b(n) \ge 0$ for all $n$, and it is easy to see that
\[ \nu_b(n) \ge d^{-1} 2^{-d-6}\, \bar\Lambda_b(n) \tag{1.3.1} \]
for all $\varepsilon_1 N \le n \le \varepsilon_2 N$, for $N$ sufficiently large. Indeed, this is trivial unless $Wn+b$ is a prime, and in that case, since $\varepsilon_1 N > R$, $\Lambda_R(Wn+b) = \log R \ge d^{-1} 2^{-d-6} \log N$. Note that the measure $\nu$ is in fact dependent on $N$; however, following [7], we do not explicitly indicate that.

Let us briefly recall the pseudo-randomness properties of the measures $\nu_b$, the so-called linear forms condition, which we will need in the proof. This is a slight modification of the formulation given in [7]; however, the proof works without any changes.
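The key step in (1.3.1), namely that the truncated divisor sum equals $\log R$ at any prime larger than $R$, can be checked numerically. The following is a minimal sketch under illustrative assumptions: the function names `mobius` and `lambda_R` are ours, the value of $R$ is a small toy value rather than the power of $N$ used in the text, and the trial-division Möbius function is only meant for tiny inputs.

```python
import math

def mobius(n: int) -> int:
    """Moebius function computed by trial division (toy sizes only)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a square divides the input
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

def lambda_R(n: int, R: float) -> float:
    """Truncated divisor sum: sum of mu(d) * log(R/d) over d | n with d <= R."""
    total = 0.0
    for d in range(1, int(min(n, R)) + 1):
        if n % d == 0:
            total += mobius(d) * math.log(R / d)
    return total
```

For a prime $n > R$ the only divisor $d \le R$ is $d = 1$, so `lambda_R(n, R)` returns `math.log(R)`, which is the identity used above. When $R \ge n$ the sum is untruncated and reduces to the classical von Mangoldt function, which gives a second sanity check.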
Theorem A (Linear forms condition, [7]). Let $N$, $W$ and the measures $\nu_b$ be as above, and let $m_0, t_0, k_0 \in \mathbb{N}$ be small parameters. Then the following holds. For given $m \le m_0$ and $t \le t_0$, suppose that $\{l_{i,j}\}_{1 \le i \le m,\ 1 \le j \le t}$ are arbitrary integers at most $k_0$ in absolute value, and that $\{b_i\}$ are arbitrary numbers relatively prime to $W$. If the linear forms
\[ L_i(x) = \sum_{j=1}^{t} l_{i,j} x_j \]
are non-zero and pairwise linearly independent over the rationals then
\[ \mathbb{E}\Big( \prod_{i=1}^{m} \nu_{b_i}(L_i(x));\ x \in \mathbb{Z}_N^t \Big) = 1 + o_{N,W\to\infty;\, m_0,t_0,k_0}(1), \tag{1.3.2} \]
where the $o(1)$ term is independent of the choice of the $b_i$'s.

In the above formula the linear forms $L_i(x)$ are considered as acting on $(\mathbb{Z}/N\mathbb{Z})^t$, and the error term $o_{N,W\to\infty;\, m_0,t_0,k_0}(1)$ denotes a quantity that tends to 0 as both $N \to \infty$ and $W \to \infty$, for any fixed choice of $m_0, t_0, k_0$. In our context it is important to let $W = \prod_{p \le \omega} p$ be independent of $N$ to obtain the quantitative lower bound in Theorem 1.2; see also the remarks in [7] (Sec. 11). As all error terms in (1.3.2) are independent of the choice of the $b_i$'s, we will write $\nu$ for $\nu_{b_i}$ for simplicity of notation.

With the aid of this measure, we define the weight of a finite set $S \subseteq \mathbb{Z}^d$ as
\[ w(S) := \prod_{i=1}^{d} \prod_{y \in \pi_i(S)} \nu(y), \tag{1.3.3} \]
where $\pi_i(S)$ is the canonical projection of $S$ to the $i$-th coordinate axis. If $S = \{x\}$ we will write $w(x) := w(\{x\}) = \prod_{i=1}^{d} \nu(x_i)$. The point is that if $Wx + b \in \mathbb{P}_N^d$ (and $x \in [\varepsilon_1 N, \varepsilon_2 N]^d$), then
\[ w(x) \approx (\log N)^d. \tag{1.3.4} \]
The implicit constant depends only on $d$ and $W$, which we will choose sufficiently large but independent of $N$. Moreover, for $\Delta \subseteq [\varepsilon_1 N, \varepsilon_2 N]^d$ such that $W\Delta + b \subseteq A \subseteq \mathbb{P}_N^d$ one has
\[ w(\Delta) \approx (\log N)^{l(\Delta)}. \tag{1.3.5} \]
Thus, identifying $[1,N]$ with $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$, it is easy to show (see Sec. 5) that Theorem 1.2 follows from the following.

Theorem 1.3. Let $\Delta = \{v_0, \dots, v_d\} \subseteq \mathbb{Z}^d$ be a $d$-dimensional simplex and let $\delta > 0$. Let $N$ be a large prime and let $A \subseteq \mathbb{Z}_N^d$ satisfy
\[ \mathbb{E}_{x \in \mathbb{Z}_N^d,\ t \in \mathbb{Z}_N} \Big( \prod_{i=0}^{d} 1_A(x + t v_i) \Big)\, w(x + t\Delta) \le \delta. \tag{1.3.6} \]
Then there exists $\varepsilon = \varepsilon(\delta)$ such that
\[ \mathbb{E}_{x \in \mathbb{Z}_N^d}\, 1_A(x)\, w(x) \le \varepsilon(\delta) + o_{N,W\to\infty;\,\Delta}(1). \]
Moreover $\varepsilon(\delta) \to 0$ as $\delta \to 0$.

We describe below some of the key elements of the proof. The details are given in the remaining sections.
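The way the weight (1.3.3) depends on the coordinate projections of a set, and hence on the quantity $l(\Delta)$ of (1.2.1), can be illustrated with a small sketch. The helper names `projections`, `ell` and `weight` are hypothetical, and the measure `nu` passed in is a toy stand-in (a constant function in the usage below), not the Green-Tao measure.

```python
from math import prod

def projections(S):
    """Coordinate projections pi_i(S) of a finite set S of d-tuples."""
    d = len(next(iter(S)))
    return [{p[i] for p in S} for i in range(d)]

def ell(S):
    """l(S): sum over the coordinates i of |pi_i(S)|, as in (1.2.1)."""
    return sum(len(P) for P in projections(S))

def weight(S, nu):
    """w(S): product of nu(y) over y in pi_i(S), over all coordinates i."""
    return prod(nu(y) for P in projections(S) for y in P)
```

With a constant measure `nu = lambda y: c` the weight is exactly `c ** ell(S)`, mirroring the heuristic $w(\Delta) \approx (\log N)^{l(\Delta)}$ in (1.3.5). For the corner `[(0, 0), (1, 0), (0, 1)]` one gets `ell == 4`, whereas a triangle in general position such as `[(0, 0), (1, 2), (2, 1)]` has `ell == 6`: repeated coordinates are counted only once.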
1.4. A Removal Lemma for weighted hypergraph systems.
We will use the construction of a weighted hypergraph associated to a set $A \subseteq \mathbb{Z}_N^d$ and a simplex $\Delta = \{v_0, \dots, v_d\}$ given in [18].

Definition 1.1 (Hypergraph system). Let $J = \{0, 1, \dots, d\}$, $\mathcal{H} := \{e : e \subseteq J\}$, and for a set $e \in \mathcal{H}$, let $V_e = \mathbb{Z}_N^e = \prod_{j \in e} \mathbb{Z}_N$. Identify $V_e$ with the subspace of elements $x = (x_0, \dots, x_d) \in V_J$ such that $x_j = 0$ for all $j \notin e$, and let $\pi_e : V_J \to V_e$ denote the natural projection. For $e = \{j\}$ we write $V_j := V_{\{j\}}$, and for a given $H \subseteq \mathcal{H}$ we will call the quadruplet $(J, V_J, H, d)$ a hypergraph system.

From a graph theoretical point of view we can think of a point $x_e$ ($e \in \mathcal{H}$, $|e| = d$) as a $d$-simplex with vertices $\{x_j : j \in e\}$. A set $G_e \subseteq V_e$ may then be viewed as a $d$-regular $d$-partite hypergraph with vertex sets $V_j$ ($j \in e$). Similarly a point $x \in V_J$ represents a $d+1$-simplex with faces $x_e := (x_j)_{j \in e}$ for $e \in \mathcal{H}_d := \{e \subseteq J,\ |e| = d\}$.

For a given $e \subseteq J$ define the $\sigma$-algebra $\mathcal{A}_e = \{\pi_e^{-1}(F) : F \subseteq V_e\}$, which will play an important role in the proof of the removal lemma. For a given set $A \subseteq \mathbb{Z}_N^d$ and for $e = J \setminus \{j\}$, let
\[ E_e = \Big\{ x \in V_J : \sum_{i=0}^{d} x_i (v_i - v_j) \in A \Big\}. \tag{1.4.1} \]
Note that $E_e \in \mathcal{A}_e$, as the expression in (1.4.1) is independent of the coordinate $x_j$.

Definition 1.2 (Weighted system). We now define a family of functions $\nu_e : V_J \to \mathbb{R}_+$, $\mu_e : V_J \to \mathbb{R}_+$. For $e \in \mathcal{H}_d$, $e = J \setminus \{j\}$ and $1 \le k \le d$, define
\[ L_e^k(x) = \sum_{i=0}^{d} x_i (v_i^k - v_j^k), \tag{1.4.2} \]
where $v_i^k$ denotes the $k$-th coordinate of the vector $v_i$. We partition the family of forms $\mathcal{L} := \{L_e^k;\ |e| = d,\ 1 \le k \le d\}$ according to which coordinates they depend on. For this we define the support of a linear form $L(x) = \sum_{k=0}^{d} a_k x_k$ as $\mathrm{supp}(L) = \{k : a_k \neq 0\}$. For a given $e \subseteq J$, define
\[ \nu_e(x) = \prod_{L \in \mathcal{L},\ \mathrm{supp}(L) = e} \nu(L(x)), \qquad \mu_e(x) = \prod_{L \in \mathcal{L},\ \mathrm{supp}(L) \subseteq e} \nu(L(x)), \tag{1.4.3} \]
with the convention that $\nu_e \equiv 1$ if $\{L;\ \mathrm{supp}(L) = e\} = \emptyset$.
Note that if $\Delta = \{v_0, \dots, v_d\}$ is in general position, that is if $v_i^k \neq v_j^k$ for all $i \neq j$ and $k$, then $\mathrm{supp}(L_e^k) = e$ for all $e \in \mathcal{H}_d$, hence
\[ \mu_e(x) = \nu_e(x) = \prod_{k=1}^{d} \nu(L_e^k(x)). \]
In general, we have $\mu_e(x) = \prod_{f \subseteq e} \nu_f(x)$ and also $\mu_e(x) = \mu_e(\pi_e(x))$, that is, $\mu_e$ is constant along the fibers of the projection $\pi_e$. We will refer to the functions $\nu_e$ and $\mu_e$ as weights and measures respectively. To emphasize this point of view we will often use the integral notation and write
\[ \int_{V_J} F(x)\, d\mu_e(x) := \mathbb{E}_{x \in V_J} F(x)\mu_e(x), \quad\text{and}\quad \int_{V_e} F_e(x)\, d\mu_e(x) := \mathbb{E}_{x \in V_e} F_e(x)\mu_e(x), \]
for functions $F : V_J \to \mathbb{R}$ and $F_e : V_e \to \mathbb{R}$. Thus we can think of $\mu_e$ as a measure on $V_J$ or on the subspace $V_e$; the exact interpretation will be clear from the context. Note that it follows easily from the linear forms condition that $\mu_e(V_e) = \int_{V_e} d\mu_e = 1 + o_{N,W\to\infty}(1)$ (similarly $\mu_e(V_J) = 1 + o_{N,W\to\infty}(1)$), see Lemma 2.1.
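To see concretely how lower dimensional weights arise from (1.4.2) and (1.4.3), one can list the distinct forms $L_e^k$ and their supports for a small simplex. The sketch below is illustrative only: the names `linear_forms` and `support` are ours, forms are represented by their integer coefficient tuples $(a_0, \dots, a_d)$, and coordinates $k$ are indexed from 0 in the code rather than from 1 as in the text.

```python
def linear_forms(vertices):
    """Map each distinct form L_e^k (as a coefficient tuple) to the set of
    pairs (j, k) producing it, where e = J minus {j} and k is 0-based."""
    d = len(vertices) - 1
    forms = {}
    for j in range(d + 1):
        for k in range(d):
            coeffs = tuple(vertices[i][k] - vertices[j][k] for i in range(d + 1))
            forms.setdefault(coeffs, set()).add((j, k))
    return forms

def support(coeffs):
    """supp(L): indices of the variables on which the form depends."""
    return frozenset(i for i, a in enumerate(coeffs) if a != 0)
```

For the corner `[(0, 0), (1, 0), (0, 1)]` this yields four distinct forms, with supports $\{1\}$, $\{2\}$, $\{0,2\}$ and $\{0,1\}$; the two singleton supports are exactly the weights sitting on lower dimensional edges. For a triangle in general position such as `[(0, 0), (1, 2), (2, 1)]` all six forms are distinct and each has support of size 2. In both examples the number of distinct forms equals $l(\Delta)$.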
Let us observe now some properties of the family of linear forms $\mathcal{L}$ which will play a crucial role in the proof. If $e = J \setminus \{j\}$, $e' = J \setminus \{j'\}$, then $\mathrm{supp}(L_{e'}^k) \subseteq e$ if and only if $v_j^k = v_{j'}^k$, and that is equivalent to $L_{e'}^k = L_e^k$. We call such a family $\mathcal{L}$ well-defined. Since for a given $e \in \mathcal{H}_d$ the forms $\{L_e^k,\ 1 \le k \le d\}$ are linearly independent, any two distinct forms of the family $\mathcal{L}$ are linearly independent. We will refer to such families of forms as being pairwise linearly independent. Also let $M = \{x \in V_J : x_0 + \dots + x_d = 0\}$. Then for any $x \in M$, $L_e^k(x) = L_{e'}^k(x)$ for all $e, e' \in \mathcal{H}_d$ and $k$. We call a family of linear forms $\mathcal{L} = \{L_e^k;\ e \in \mathcal{H}_d,\ 1 \le k \le s\}$ satisfying this property symmetric.

To see how the weighted hypergraph $\{\nu_e\}_{e \in \mathcal{H}}$ is related to our problem we follow [18] to parameterize affine copies of $\Delta$. Define the map $\Phi : \mathbb{Z}_N^{d+1} \to \mathbb{Z}_N^{d+1}$ by
\[ \Phi(x) = \Big( \sum_{i=0}^{d} x_i v_i,\ -\sum_{i=0}^{d} x_i \Big) =: (y, t). \tag{1.4.4} \]
By (1.4.1) and (1.4.4) we have that $x \in E_e$ for $e = J \setminus \{j\}$ if and only if $y + t v_j \in A$; thus $x \in \bigcap_{e \in \mathcal{H}_d} E_e$ exactly when $y + t\Delta \subseteq A$. Since $\Phi$ is one to one, as we assume that $\{v_1 - v_0, \dots, v_d - v_0\}$ is a linearly independent family of vectors, this gives a parametrization of all affine copies of $\Delta$ contained in $A$ (mod $N$). Also, for $e = J \setminus \{j\}$,
\[ L_e^k(x) = \sum_{i=0}^{d} x_i (v_i^k - v_j^k) = \pi_k(y + t v_j), \tag{1.4.5} \]
where $\pi_k$ is the orthogonal projection to the $k$-th coordinate axis. This implies that
\[ \mu_e(x) = \prod_{\mathrm{supp}(L) \subseteq e} \nu(L(x)) = \prod_{k=1}^{d} \nu(L_e^k(x)) = w(y + t v_j), \tag{1.4.6} \]
and also
\[ \mu_J(x) = \prod_{L \in \mathcal{L}} \nu(L(x)) = w(y + t\Delta). \tag{1.4.7} \]
Thus the assumption (1.3.6) in Theorem 1.3 translates to
\[ \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_J(x) = \mathbb{E}_{(y,t) \in \mathbb{Z}_N^{d+1}} \Big( \prod_{j=0}^{d} 1_A(y + t v_j) \Big)\, w(y + t\Delta) \le \delta. \tag{1.4.8} \]
On the other hand, recall that $M = \{x \in V_J : x_0 + \dots + x_d = 0\}$; then $x \in M \cap \bigcap_{e \in \mathcal{H}_d} E_e$ if and only if $\Phi(x) = (y, 0)$ with $y \in A$, thus by (1.4.4), (1.4.6)
\[ \mathbb{E}_{y \in \mathbb{Z}_N^d}\, 1_A(y)\, w(y) = \mathbb{E}_{x \in M} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_{e'}(x) \tag{1.4.9} \]
for any fixed $e' \in \mathcal{H}_d$. Thus it is easy to see that Theorem 1.3 follows from a removal lemma for weighted hypergraphs, which we first recall in the unweighted case (where $\nu_f \equiv 1$ for all $f$). See also [18], [5], [14].

Theorem B (Simplex Removal Lemma) [19]. Let $E_e \in \mathcal{A}_e$ be given for $e \in \mathcal{H}_d$, and let $\delta > 0$. Also let $\mu_J$ and $\mu_e$ denote the normalized counting measures on $V_J$ and $V_e$. There exists $\varepsilon = \varepsilon(\delta) > 0$ and for every index set $e \in \mathcal{H}_d$ there exists a set $E'_e \in \mathcal{A}_e$ such that the following holds. If
\[ \int_{V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x_e)\, d\mu_J(x) \le \delta, \]
then
\[ \prod_{e \in \mathcal{H}_d} 1_{E'_e}(x_e) = 0 \ \text{ for all } x \in V_J, \qquad \mathbb{E}_{x \in V_e} 1_{E_e \setminus E'_e}(x)\, \mu_e(x) \le \varepsilon(\delta), \]
and $\varepsilon(\delta) \to 0$ as $\delta \to 0$.
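The parametrization (1.4.4) and the identity (1.4.5) are easy to verify by brute force for a small prime $N$. The following is a minimal sketch under toy assumptions: the names `phi`, `form_value` and `check_parametrization` are ours, coordinates $k$ are 0-based, and the exhaustive loop is only feasible for very small $N$ and $d$.

```python
from itertools import product

def phi(x, vertices, N):
    """Phi(x) = (y, t) with y = sum_i x_i v_i and t = -sum_i x_i, mod N."""
    d = len(vertices) - 1
    y = tuple(sum(x[i] * vertices[i][k] for i in range(d + 1)) % N
              for k in range(d))
    t = (-sum(x)) % N
    return y, t

def form_value(x, vertices, j, k, N):
    """L_e^k(x) mod N for e = J minus {j}, with k a 0-based coordinate."""
    return sum(x[i] * (vertices[i][k] - vertices[j][k])
               for i in range(len(vertices))) % N

def check_parametrization(vertices, N):
    """True if L_e^k(x) = pi_k(y + t v_j) for all x, j, k and Phi is injective."""
    d = len(vertices) - 1
    images = set()
    for x in product(range(N), repeat=d + 1):
        y, t = phi(x, vertices, N)
        images.add((y, t))
        for j in range(d + 1):
            for k in range(d):
                expected = (y[k] + t * vertices[j][k]) % N
                if form_value(x, vertices, j, k, N) != expected:
                    return False
    return len(images) == N ** (d + 1)
```

The identity itself holds for every modulus; injectivity is where the linear independence of $v_1 - v_0, \dots, v_d - v_0$ enters, and modulo $N$ it requires that $N$ does not divide the corresponding determinant. For instance the triangle `[(0, 0), (1, 2), (2, 1)]` has determinant $-3$, so the check passes for $N = 7$ and fails for $N = 3$, which is consistent with taking $N$ to be a large prime.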
Naturally one would like to extend Theorem B to families of measures $\{\mu_e\}_{e \in \mathcal{H}_d}$ in the weighted case, as that would easily imply Theorem 1.3 and hence our main result, Theorem 1.2. The reason why this seems difficult is the existence of weights $\nu_e$ on lower dimensional edges $|e| < d$ when the configuration $\Delta$ is not in general position. Removing these weights does not seem amenable to the known "transference arguments" developed in [18], [1], [6], [15]. What we prove instead is that the removal lemma extends to a family of measures $\tilde\mu_e$ which are sufficiently small perturbations of the measures $\mu_e$ with respect to a given family of functions $g_e : V_e \to \mathbb{R}$.

Theorem 1.4 (Weighted Simplex Removal Lemma). Let $\{\nu_e\}_{e \subseteq J}$, $\{\mu_e\}_{e \subseteq J}$ be a system of weights and measures associated to a well-defined, pairwise linearly independent, and symmetric family of linear forms $\mathcal{L}$ as defined in (1.4.3). Let $E_e \in \mathcal{A}_e$ and $g_e : V_e \to [0,1]$ be given for $e \in \mathcal{H}_d$. Then for a given $\delta > 0$ there exists an $\varepsilon = \varepsilon(\delta) > 0$ such that the following holds. If
\[ \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_J(x) \le \delta, \tag{1.4.10} \]
then there exists a well-defined and symmetric family of linear forms $\tilde{\mathcal{L}} = \{\tilde L_e^k;\ e \in \mathcal{H}_d,\ 1 \le k \le d\}$ such that the associated system of weights and measures $\{\tilde\nu_e\}_{e \subseteq J}$, $\{\tilde\mu_e\}_{e \subseteq J}$ satisfies
\[ \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \tilde\mu_J(x) = \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_J(x) + o_{N,W\to\infty}(1), \tag{1.4.11} \]
and for all $e \in \mathcal{H}_d$
\[ \mathbb{E}_{x \in V_e}\, g_e(x)\, \tilde\mu_e(x) = \mathbb{E}_{x \in V_e}\, g_e(x)\, \mu_e(x) + o_{N,W\to\infty}(1). \tag{1.4.12} \]
In addition there exist sets $E'_e \in \mathcal{A}_e$ such that
\[ \bigcap_{e \in \mathcal{H}_d} (E_e \cap E'_e) = \emptyset, \tag{1.4.13} \]
and for all $e \in \mathcal{H}_d$ we have
\[ \mathbb{E}_{x \in V_e}\, 1_{E_e \setminus E'_e}(x)\, \tilde\mu_e(x) \le \varepsilon(\delta) + o_{N,W\to\infty}(1). \tag{1.4.14} \]
Moreover, we also have that
\[ \varepsilon(\delta) \to 0 \ \text{ as } \delta \to 0. \tag{1.4.15} \]

(Footnote: It seems possible to formulate the properties of the weight system $\{\nu_e\}_{e \subseteq J}$ so that Theorem 1.4 holds without referring to an underlying system of linear forms $\mathcal{L}$. For that one would need to formulate a 'linear forms' condition for weighted hypergraphs similar to [18] at an order depending on $\delta$. We will not pursue this approach here.)
Proof [Theorem 1.4 implies Theorem 1.3]. By assumption (1.3.6) in Theorem 1.3 and by (1.4.7),
\[ \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_J(x) \le \delta. \]
For a given $e' \in \mathcal{H}_d$ define the function $g_{e'} : V_{e'} \to [0,1]$ as follows. Let $\phi_{e'} : V_{e'} \to M$ be the inverse of the projection map $\pi_{e'} : V_J \to V_{e'}$ restricted to $M$, and for $y \in V_{e'}$ let
\[ g_{e'}(y) := \prod_{e \in \mathcal{H}_d} 1_{E_e}(\phi_{e'}(y)). \]
Applying Theorem 1.4 to the system of weights $\{\nu_e\}$ and functions $\{g_e\}$ gives a system of measures $\tilde\mu_e$ and sets $E'_e \in \mathcal{A}_e$ satisfying (1.4.11)-(1.4.15). By (1.4.4) we have that $x \in M \cap \bigcap_{e \in \mathcal{H}_d} E_e$ if and only if $\Phi(x) = (y, 0)$ with $y \in A$. Moreover in that case $w(y) = \mu_e(x)$ for all $e \in \mathcal{H}_d$ by (1.4.6); thus for any given $e' \in \mathcal{H}_d$
\[ \mathbb{E}_{y \in \mathbb{Z}_N^d}\, 1_A(y)\, w(y) = \mathbb{E}_{x \in M} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \mu_{e'}(x) = \mathbb{E}_{z \in V_{e'}}\, g_{e'}(z)\, \mu_{e'}(z) \]
\[ = \mathbb{E}_{z \in V_{e'}}\, g_{e'}(z)\, \tilde\mu_{e'}(z) + o_{N,W\to\infty}(1) = \mathbb{E}_{x \in M} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \tilde\mu_{e'}(x) + o_{N,W\to\infty}(1). \]
By (1.4.13), $\prod_{e \in \mathcal{H}_d} 1_{E_e} \le \sum_{e \in \mathcal{H}_d} 1_{E_e \setminus E'_e}$. Then the symmetry of the measures $\tilde\mu_e$ (i.e. the fact that $\tilde\mu_e(x) = \tilde\mu_{e'}(x)$ for $x \in M$), (1.4.14), and the fact that $1_{E_e \setminus E'_e}$ is constant on the fibers $\pi_e^{-1}(x)$ imply
\[ \mathbb{E}_{x \in M} \prod_{e \in \mathcal{H}_d} 1_{E_e}(x)\, \tilde\mu_{e'}(x) \le \sum_{e \in \mathcal{H}_d} \mathbb{E}_{x \in M}\, 1_{E_e \setminus E'_e}(x)\, \tilde\mu_{e'}(x) = \sum_{e \in \mathcal{H}_d} \mathbb{E}_{x \in V_e}\, 1_{E_e \setminus E'_e}(x)\, \tilde\mu_e(x) \le (d+1)\,\varepsilon(\delta) + o_{N,W\to\infty}(1). \]
Choosing $N, W$ sufficiently large with respect to $\delta$ gives
\[ \mathbb{E}_{y \in \mathbb{Z}_N^d}\, 1_A(y)\, w(y) \le \varepsilon'(\delta), \]
with, say, $\varepsilon'(\delta) := (d+2)\,\varepsilon(\delta)$. $\Box$

1.5. Weighted box norms and hypergraph regularity.
The known proofs of the Simplex Removal Lemma rely on the so-called Hypergraph Regularity Lemma and the associated Counting Lemma [19], [5], [14], and in particular on the notion of a regular or pseudo-random hypergraph. This can be defined in different ways; we use a variant of Gowers's box norms [5] adapted to our setting.

Let $e \in \mathcal{H}_d$ be fixed. For a given $\omega \in \{0,1\}^e$ (i.e. $\omega : e \to \{0,1\}$), define the projection $\omega_e : V_e \times V_e \to V_e$ by
\[ \omega_e(x_e, q_e)_i = \begin{cases} x_i & \text{if } \omega_i = 0, \\ q_i & \text{if } \omega_i = 1, \end{cases} \tag{1.5.1} \]
for $i \in e$, and the weighted box norm of a function $F : V_e \to \mathbb{R}$, using the notation $x_f := \pi_f(x)$ for $f \subseteq J$, as
\[ \|F\|_{\Box_{\nu_e}}^{2^d} = \mathbb{E}_{x,q \in V_e} \prod_{\omega \in \{0,1\}^e} F(\omega_e(x,q)) \prod_{f \subseteq e}\ \prod_{\omega \in \{0,1\}^f} \nu_f(\omega_f(x_f, q_f)). \tag{1.5.2} \]
Note that if $\nu_f \equiv 1$ for all $f \subseteq e$, then $\|F\|_{\Box_{\nu_e}} = \|F\|_{\Box^d}$ is the usual box norm.

Example 1. Let $e = (0,1)$ and $F : V_0 \times V_1 \to \mathbb{R}$. Then
\[ \|F\|_{\Box_{\nu_e}}^{4} = \mathbb{E}_{x_0,q_0 \in V_0,\ x_1,q_1 \in V_1}\, F(x_0,x_1) F(x_0,q_1) F(q_0,x_1) F(q_0,q_1) \times \nu_e(x_0,x_1) \nu_e(x_0,q_1) \nu_e(q_0,x_1) \nu_e(q_0,q_1)\, \nu_0(x_0) \nu_0(q_0) \nu_1(x_1) \nu_1(q_1). \]

The points $\omega_e(x,q)$ and $\omega_f(x_f,q_f)$ may be viewed as the faces and edges of a $d$-dimensional octahedron $K_d$ with vertices $\{x_j, q_j;\ j \in e\}$. The inner product in (1.5.2) represents the total weight of the octahedron, obtained by multiplying the weights of all edges and vertices. The box norm itself is the weighted average of $F$ over all embeddings of the hypergraph $K_d$.

It is not hard to see that the $\Box_\nu$-norm is indeed a norm (for $d \ge 2$) and that an appropriate version of the Gowers-Cauchy-Schwarz inequality holds (see the Appendix). The importance of this norm is that it controls weighted averages over $d+1$-dimensional simplices, something which plays an important role in proving the Counting Lemma. More precisely one has the following.

Proposition 1.1 (Weighted von Neumann inequality). Let $F_e : V_e \to \mathbb{R}$ be given functions such that $|F_e| \le 1$ for each $e \in \mathcal{H}_d$. Then there is an absolute constant $C$ such that
\[ \Big| \mathbb{E}_{x \in V_J} \prod_{e \in \mathcal{H}_d} F_e(\pi_e(x))\, \mu_J(x) \Big| \le C \min_{e \in \mathcal{H}_d} \|F_e\|_{\Box_{\nu_e}} + o_{N,W\to\infty}(1). \tag{1.5.3} \]

The $\Box_\nu$-norm has also been defined and studied in [8] (see Appendices B-C there), where various forms of von Neumann type inequalities have been shown. In fact it is not hard to adapt the arguments given there to prove Proposition 1.1; however, as our setting is somewhat different, we will include a proof in an appendix. The above inequality motivates the following.

Definition 1.3. Let $e \in \mathcal{H}_d$ and $\varepsilon > 0$ be fixed and let $G_e \subseteq V_e$ be a $d$-regular hypergraph. We say that $G_e$ is $\varepsilon$-regular with respect to the weight system $\{\nu_f\}_{f \subseteq e}$ if
\[ \| 1_{G_e} - \mu_e(G_e)\, 1_{V_e} \|_{\Box_{\nu_e}} \le \varepsilon. \tag{1.5.4} \]

It is easy to see from Proposition 1.1 that if the sets $E_e \in \mathcal{A}_e$ are $c\varepsilon$-regular for all $e \in \mathcal{H}_d$ (with a sufficiently small constant $c > 0$), then Theorem 1.4 holds with $\{\tilde\mu_e\} = \{\mu_e\}$. Indeed, writing $G_e = \pi_e(E_e)$, $1_{G_e} = \mu_e(G_e)\, 1_{V_e} + F_e$, and substituting this decomposition into the left side of (1.4.10), we get $2^{d+1} - 1$ error terms, each of which is bounded by $c'\varepsilon$ (for some small absolute constant $c' > 0$, as long as $N$ and $W$ are sufficiently large with respect to $\varepsilon$), and a main term of the form $\prod_{e \in \mathcal{H}_d} \mu_e(G_e)$ which by the assumption of Theorem 1.4 should be less than, say, $\varepsilon$. This implies that $\mathbb{E}_{x \in V_e} 1_{G_e}(x)\mu_e(x) = \mu_e(G_e) \le \delta$ for $\delta = (2\varepsilon)^{d+1}$, for at least one $e \in \mathcal{H}_d$. Thus the sets $E'_e := \emptyset$, $E'_{e'} := E_{e'}$ ($e' \neq e$) satisfy the conclusion of Theorem 1.4.

Of course in general the hypergraphs $G_e = \pi_e(E_e)$ are not sufficiently regular; the bulk of our argument is to obtain a "Regularity Lemma" in our weighted setting. This roughly says that one can partition the sets $G_e$ into sufficiently regular hypergraphs with respect to a system of measures $\tilde\mu_e$ which are small perturbations of the initial measures $\mu_e$. Our proof is based on the iterative process described in [19]; however, we need to modify the entire argument because of the presence of weights on the lower dimensional edges. During the process we construct increasing families of weight systems $\{\nu_{q,e}\}_{e \in \bar{\mathcal{H}},\, q \in \Omega}$ which for most values of the parameter $q$ will give rise to small perturbations of the initial weight system $\{\nu_e\}_{e \in \bar{\mathcal{H}}}$.

Let us sketch below how the weights $\nu_{q,e}$ and the associated measures $\mu_{q,e}$ arise in the special case $d = 2$, $\nu_1 \equiv \nu_2 \equiv 1$. Assume that there is an edge $e$, say $e = (1,2)$, so that the graph $G_e = \pi_e(E_e)$ is not $\varepsilon$-regular.
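For readers who want to experiment with the quantity in Definition 1.3, the box norm (1.5.2) can be computed directly in the simplest situation $d = 2$ with all weights identically 1, where it reduces to the usual box norm of Example 1. This is only a sketch under those assumptions: the function name `box_norm_4th_power` is ours, the function $F$ is passed as a Python callable on $\mathbb{Z}_{n_0} \times \mathbb{Z}_{n_1}$, and the quadruple loop is practical only for tiny ground sets.

```python
from itertools import product

def box_norm_4th_power(F, n0, n1):
    """Fourth power of the unweighted box norm of F on Z_{n0} x Z_{n1}:
    the average of F(x0,x1) F(x0,q1) F(q0,x1) F(q0,q1) over x0, q0, x1, q1."""
    total = 0.0
    for x0, q0 in product(range(n0), repeat=2):
        for x1, q1 in product(range(n1), repeat=2):
            total += F(x0, x1) * F(x0, q1) * F(q0, x1) * F(q0, q1)
    return total / (n0 * n0 * n1 * n1)
```

Two quick checks: for a product function $F(x_0,x_1) = u(x_0)v(x_1)$ the average factors as $(\mathbb{E}u^2)^2(\mathbb{E}v^2)^2$, and for the sign pattern $F(x_0,x_1) = (-1)^{x_0 x_1}$ on $\mathbb{Z}_2 \times \mathbb{Z}_2$ the value is $1/2$, strictly below the value 1 attained by the constant function 1. Regularity in the sense of (1.5.4) asks that this quantity be small for the balanced function of the graph.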
That $G_e$ is not $\varepsilon$-regular means that
\[ \|F\|_{\Box_{\nu_e}} \ge \varepsilon, \tag{1.5.5} \]
where $F = 1_{G_e} - \mu_e(G_e)\, 1_{V_e}$. In view of definition (1.5.2), we may write
\[ \|F\|_{\Box_{\nu_e}}^{4} = \int_{V_e} \int_{V_e} F(x)\, u_q^1(x_1)\, u_q^2(x_2)\, \nu_e(x_1,q_2)\, \nu_e(q_1,x_2)\, d\mu_e(x)\, d\mu_e(q) \ge \varepsilon^4, \tag{1.5.6} \]
where $x = (x_1,x_2)$, $q = (q_1,q_2)$, $u_q^1(x_1) = F(x_1,q_2)$, and $u_q^2(x_2) = F(q_1,x_2) F(q_1,q_2)$. If one defines the measures $\mu_{q,e}$, depending on the parameter $q$, by
\[ \mu_{q,e}(x) := \nu_e(x_1,q_2)\, \nu_e(q_1,x_2)\, \mu_e(x), \]
then the inner expression in (1.5.6) can be viewed as the inner product
\[ \Gamma(q) := \langle F,\ u_q^1 \cdot u_q^2 \rangle_{\mu_{q,e}} = \int_{V_e} F(x)\, u_q^1(x_1)\, u_q^2(x_2)\, d\mu_{q,e}(x) \tag{1.5.7} \]
on the Hilbert space $L^2(V_e, \mu_{q,e})$. Thus (1.5.6) translates to $\mathbb{E}_{q \in V_e} \Gamma(q)\mu_e(q) \ge \varepsilon^4$, while using the linear forms condition it is easy to see that $\mathbb{E}_{q \in V_e} \Gamma(q)^2 \mu_e(q) \lesssim 1$; thus, by the Cauchy-Schwarz inequality,
\[ \Gamma(q) \gtrsim \varepsilon^4 \ \text{ for } q \in \Omega, \tag{1.5.8} \]
for a set $\Omega \subseteq V_e$ of measure $\mu_e(\Omega) \gtrsim \varepsilon^8$. As the functions $u_q^i$ are bounded, without loss of generality we may assume that they are indicator functions of sets $U_q^i \subseteq V_i$. Let $\mathcal{B}_q = \mathcal{B}_q^1 \vee \mathcal{B}_q^2$ denote the $\sigma$-algebra generated by the sets $\pi_i^{-1}(U_q^i)$ ($i = 1,2$) on $V_e$, and let $\mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q)$ be the conditional expectation of $1_{G_e}$ with respect to this $\sigma$-algebra and the measure $\mu_{q,e}$. Then, as $u_q^1 u_q^2$ is measurable with respect to $\mathcal{B}_q$, we have
\[ \langle 1_{G_e} - \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q),\ u_q^1 u_q^2 \rangle_{\mu_{q,e}} = 0. \]
This together with (1.5.7) and (1.5.8) implies that for $q \in \Omega$ we have
\[ \langle \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q) - \mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0),\ u_q^1 u_q^2 \rangle_{\mu_{q,e}} \gtrsim \varepsilon^4, \]
where $\mathcal{B}_0 = \{V_e, \emptyset\}$ is the trivial $\sigma$-algebra, and $\mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0) = \mu_e(G_e)\, 1_{V_e}$. Then by the Cauchy-Schwarz inequality, we arrive at
\[ \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q) - \mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_{q,e})}^2 \gtrsim \varepsilon^8. \tag{1.5.9} \]
Note that, by the Pythagorean theorem, if the second term on the left side were a conditional expectation with respect to the measure $\mu_{q,e}$, then one would obtain an "energy increment"
\[ \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q) - \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_{q,e})}^2 = \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q) \|_{L^2(\mu_{q,e})}^2 - \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_{q,e})}^2 \gtrsim \varepsilon^8. \]
To overcome this "discrepancy" we show, using the linear forms condition, that for a given $B \subseteq V_e$ the measures $\mu_{q,e}(B)$ and $\mu_e(B)$ are close for almost every $q \in V_e$, in the sense that
\[ \mathbb{E}_{q \in V_e} | \mu_{q,e}(B) - \mu_e(B) |\, \mu_e(q) = o_{N,W\to\infty}(1). \]
This in turn implies that
\[ \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_0) - \mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_{q,e})} = o_{N,W\to\infty}(1) \]
and
\[ \| \mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_e)}^2 = \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_{q,e})}^2 + o_{N,W\to\infty}(1). \]

(Footnote: Though our exposition later is self-contained, some familiarity with standard notions and arguments, such as conditional expectation and energy increment, discussed for example in [19], may be helpful here.)
Then from (1.5.9) we have, for almost every $q \in \Omega$, that
\[ \| \mathbb{E}_{\mu_{q,e}}(1_{G_e} | \mathcal{B}_q) \|_{L^2(\mu_{q,e})}^2 \ge \| \mathbb{E}_{\mu_e}(1_{G_e} | \mathcal{B}_0) \|_{L^2(\mu_e)}^2 + c\,\varepsilon^8. \tag{1.5.10} \]
If $F : V \to \mathbb{R}$ is a function and $(V, \mathcal{B}, \mu)$ is a measure space, the quantity $\| \mathbb{E}_\mu(F | \mathcal{B}) \|_{L^2(\mu)}^2$ is sometimes referred to as the "energy" of the function $F$ with respect to the measure space $(V, \mathcal{B}, \mu)$. So (1.5.10) says that if $G_e$ is not $\varepsilon$-uniform with respect to the initial measure space $(V_e, \mathcal{B}_0, \mu_e)$, then its energy increases by a fixed amount when passing to the measure spaces $(V_e, \mathcal{B}_q, \mu_{q,e})$ for (almost) every $q \in \Omega$. One can iterate this argument to arrive at a family of measure spaces $(V_e, \mathcal{B}_{q,e}, \mu_{q,e})_{e \in \mathcal{H}_d,\, q \in \Omega}$ such that the atoms $G_{q,e} \in \mathcal{B}_{q,e}$ become sufficiently uniform, thus obtaining a parametric version of the so-called Koopman-von Neumann decomposition, see [19]. This can be further iterated to eventually obtain a regularity lemma.

Note that the number of linear forms defining the measures $\mu_{q,e}$ increases at each step of the iteration, causing the linear forms condition to be used at a level depending eventually on the relative density of the set $A$ and not just on the dimension $d$.

1.6. Outline of the paper.
In Section 2 we describe the type of parametric weight systems $\{\nu_{q,f}\}_{f \in \mathcal{H},\, q \in Z}$ that we encounter later on. Here we also discuss their basic properties, such as stability and symmetry. In Section 3 we introduce the energy increment argument for parametric systems and prove a regularity lemma. Section 4 is devoted to proving the counting and removal lemmas. Many of our arguments in Section 3 and Section 4 may be viewed as extensions of those in [19]. In the last section we obtain the main results stated in the introduction. The basic properties of weighted box norms are discussed in an Appendix.

As for our notation, most of our variables are vectors, although we do not emphasize this. We think of the initial data $\Delta = \{v_0, \dots, v_d\}$ as being fixed throughout, and do not indicate the dependence on the various elements of $\Delta$. For example we write $Y = O(X)$ or $Y \lesssim X$ if $Y \le CX$ for some constant $C > 0$ depending only on the vectors $v_i$ or the dimension $d$. If $y_1, \dots, y_s$ are additional parameters, we write $O_{y_1,\dots,y_s}(X)$ for a quantity $Y$ bounded by $C(y_1, \dots, y_s)\,X$, or equivalently $Y \lesssim_{y_1,\dots,y_s} X$.

We will use the linear forms condition throughout the paper, giving rise to error terms which tend to 0 as both $N \to \infty$ and $W \to \infty$ for any fixed choice of the parameters $y_1, \dots, y_s$ on which they may depend. The standard notation for such terms would be $o_{N,W\to\infty;\, y_1,\dots,y_s}(1)$, which for simplicity we will write as $o_{y_1,\dots,y_s}(1)$. Finally, as all estimates in the linear forms condition involving the weights $\nu_b$ are independent of the choice of $b$, we write in certain places $\nu = \nu_b$ to simplify the notation.

2. BASIC PROPERTIES OF PARAMETRIC WEIGHT SYSTEMS AND THEIR EXTENSIONS
In this section we define the type of parametric systems and associated families of measures that we encounter later, and discuss their basic properties such as stability and symmetry. We also discuss the type of extensions of such systems which arise in our induction process.
2.1. Parametric weight systems and stability properties.
Recall the family of measures $\{\mu_e\}_{e\in\mathcal{H}}$ constructed in (1.4.1),
$$\mu_e(x) = \prod_{L\in\mathcal{L},\ \mathrm{supp}(L)\subseteq e} \nu(L(x)),$$
where the family $\mathcal{L}$ defined in (1.4.1) consists of pairwise linearly independent forms. The following statement is based on the linear forms condition and is a prototype of many of the arguments in this section.
Lemma 2.1. For all $e\in\mathcal{H}$ we have that
$$\mu_e(V_e) = 1 + o(1). \tag{2.1.1}$$
Moreover, if $g : V_e \to [-1,1]$ then
$$\mathbb{E}_{x_e\in V_e}\, g(x_e)\,\mu_e(x_e) = \mathbb{E}_{x\in V_J}\, g(\pi_e(x))\,\mu_J(x) + o(1),$$
or equivalently
$$\int_{V_e} g\, d\mu_e = \int_{V_J} (g\circ\pi_e)\, d\mu_J + o(1). \tag{2.1.2}$$
Proof.
Note that the linear forms appearing on the right side of
$$\mu_e(V_e) = \mathbb{E}_{x\in V_e} \prod_{\mathrm{supp}(L)\subseteq e} \nu(L(x))$$
are pairwise linearly independent, and as they are supported on $e$ they remain pairwise independent when restricted to $V_e$. Thus (2.1.1) follows from the linear forms condition.
To show (2.1.2), let $e' = J\setminus e$ and write $x = (x_e, x_{e'})$ with $x_e = \pi_e(x)$, $x_{e'} = \pi_{e'}(x)$. Then
$$E := \mathbb{E}_{x\in V_J}\,(g\circ\pi_e)(x)\,\mu_J(x) - \mathbb{E}_{x_e\in V_e}\, g(x_e)\,\mu_e(x_e) = \mathbb{E}_{x_e\in V_e}\, g(x_e)\,\mu_e(x_e)\; \mathbb{E}_{x_{e'}\in V_{e'}} \big(\mathrm{w}(x_e,x_{e'}) - 1\big),$$
where $\mathrm{w}(x_e,x_{e'}) = \prod_{f\nsubseteq e} \nu_f(x_{e\cap f}, x_{e'\cap f})$.
By (2.1.1) we have that $\mu_e(V_e) \lesssim 1$, and then by the Cauchy-Schwarz inequality
$$|E|^2 \lesssim \mathbb{E}_{x_e\in V_e}\, \mathbb{E}_{x_{e'},y_{e'}\in V_{e'}} \big(\mathrm{w}(x_e,x_{e'}) - 1\big)\big(\mathrm{w}(x_e,y_{e'}) - 1\big)\,\mu_e(x_e).$$
The right hand side of this expression is a combination of four terms, and (2.1.2) follows from the fact that each term is $1 + o(1)$, so that their signed sum is $o(1)$. Indeed, the linear forms appearing in the definition of the function $\mu_e(x_e)$ depend only on the variables $x_j$ for $j\in e$ and are pairwise linearly independent. All linear forms involved in $\mathrm{w}(x_e,x_{e'})$ depend also on some of the variables $x_j$, $j\in e'$, while the ones in $\mathrm{w}(x_e,y_{e'})$ depend on some of the variables $y_j$, $j\in e'$; hence these forms depend on different sets of variables. Thus the forms appearing in the expression $\mu_e(x_e)\,\mathrm{w}(x_e,x_{e'})\,\mathrm{w}(x_e,y_{e'})$ are pairwise linearly independent, and (2.1.2) follows from the linear forms condition. Note that the estimate is independent of the function $g$. □
This will allow us to consider sets $G_e\subseteq V_e$ as sets $\pi_e^{-1}(G_e)\subseteq V_J$, changing their measure only by a negligible amount:
$$\mu_J(\pi_e^{-1}(G_e)) = \mu_e(G_e) + o(1). \tag{2.1.3}$$
Next we define weight systems and associated families of measures depending on parameters. Let $\mathcal{L}_q := (L_1(q,x),\ldots,L_s(q,x))$ be a family of linear forms with integer coefficients depending on the parameters $q\in\mathbb{Z}^R$ and the variables $x\in\mathbb{Z}^D$.
We call the family pairwise linearly independent if no two forms in the family are rational multiplesof each other. If N is a sufficiently large prime with respect to the coefficients of the linear forms L i ( q, x ) ,then the forms remain pairwise linearly independent when considered as forms over Z × V , Z = Z RN , V = Z DN . We refer to the set Z = Z RN as the parameter space of the family L q . As our arguments willinvolve averaging over the parameter space Z , we call the family L q well-defined if there is measure on Z given by Z Z g ( q ) dψ ( q ) = E q ∈ Z g ( q ) ψ ( q ) , ψ ( q ) = t Y i =1 ν ( Y i ( q )) , (2.1.4)for a family of pairwise linearly independent linear forms Y i defined over Z , and if all forms L i ( q, x ) dependon some of the x -variables.If V = V J then we define an associated system of weights { ν q,e } q ∈ Z,e ∈H and measures { µ q,e } q ∈ Z,e ∈H asfollows. For a form L k ( q, x ) = P i b i q i + P j a j x j define its x -support as supp x ( L ) = { j ∈ J ; a j = 0 } .For e ⊆ J and q ∈ Z , let MULTIDIMENSIONAL SZEMER ´EDI THEOREM IN THE PRIMES 13 ν q,e ( x ) := Y L ∈L q supp x ( L )= e ν ( L ( q, x )) , µ q,e ( x ) := Y L ∈L q supp x ( L ) ⊆ e ν ( L ( q, x )) (2.1.5)We use the convention that ν q,e ≡ if there is no form L ⊆ L q such that supp x ( L ) = e . Note that the x -support partitions the family of forms L q independent of the parameters q , thus for given e ∈ H µ q,e ( x ) = Y f ⊆ e ν q,e ( x ) , for all q ∈ Z. A crucial observation is that many of the properties of the measure system { µ e } still hold for well-definedmeasure systems { µ q,f } for almost every value of the parameter q ∈ Z . In order to formulate such state-ments we say that the family L has complexity at most K if the dimension of the space Z , the number oflinear forms L j ( q, x ) , Y l ( q ) , and the magnitude of their coefficients are all bounded by K . 
This quantity will control the dependence of the error terms in applications of the linear forms condition. We have the following analogue of Lemma 2.1.
Lemma 2.2. Let $\{\mu_{q,e}\}_{e\in\mathcal{H},\,q\in Z}$ be a well-defined parametric measure system of complexity at most $K$. For every $e\in\mathcal{H}$ there is a set $\mathcal{E}_e\subseteq Z$ such that $\psi(\mathcal{E}_e) = o_K(1)$, and for every $q\notin\mathcal{E}_e$
$$\mu_{q,e}(V_e) = 1 + o_K(1). \tag{2.1.6}$$
Moreover, for every $e\in\mathcal{H}$ there is a set $\mathcal{E}_e\subseteq Z$ of measure $\psi(\mathcal{E}_e) = o_K(1)$ such that the following holds. For any function $g : Z\times V_e\to[-1,1]$ and for every $q\notin\mathcal{E}_e$ one has the estimate
$$\int_{V_e} g(q,x_e)\, d\mu_{q,e}(x_e) = \int_{V_J} g(q,\pi_e(x))\, d\mu_{q,J}(x) + o_K(1). \tag{2.1.7}$$
Proof.
To prove (2.1.6) consider the quantity Λ e := Z Z | µ q,e ( V e ) − | dψ ( q )= Z Z E x e ,y e ( Y supp x ( L ) ⊆ e ν ( L ( q, x e )) − Y supp x ( L ) ⊆ e ν ( L ( q, y e )) − dψ ( q ) . The above expression is a combination of four terms and note that the family of linear forms { Y k ( q ) , L i ( q, x e ) , L j ( q, y e ) } is pairwise linearly independent in the ( q, x e , y e ) variables by our assumptions. Applying the linear formscondition gives that each term is o K (1) and so Λ e = o K (1) and (2.1.6) follows.Now let e ′ = J \ e , write x = ( x e , x e ′ ) and arguing as in Lemma 2.1 we have Λ( q, e, g ) := | E x ∈ V J g ( q, π e ( x )) µ q,J ( x ) − E x e ∈ V e g ( q, x e ) µ q,e ( x e ) | = | E x e ∈ V e g ( q, x e ) µ q,e ( x e ) E x e ′ ∈ V e ′ (w q ( x e , x e ′ ) − |≤ E x e ∈ V e µ q,e ( x e ) | E x e ′ ∈ V e ′ (w q ( x e , x e ′ ) − | , where w q ( x e , x e ′ ) = Q f * e ν q,f ( x e ∩ f , x e ′ ∩ f ) .Notice that the right hand side of the above inequality is independent of the function g ; if we denote itby Λ( q, e ) then (2.1.7) would follow from the estimate E q ∈ Z Λ( q, e ) dψ ( q ) = o K (1) . By the linear formscondition E q,x e dψ ( q ) dµ q,e ( x e ) = 1 + o K (1) ≤ , for N sufficiently large with respect to K . Then by the Cauchy-Schwartz inequality one has ( E q ∈ Z Λ( q, e ) dψ ( q )) . E q ∈ Z, x e ∈ V e E x e ′ ,y e ′ ∈ V e (w q ( x e , x e ′ ) − q ( x e , y e ′ ) − dµ q,e ( x e ) dψ ( q ) . This is a combination of four terms, however each term again is o K (1) as the linear forms defining ψ depend on the variables q while the ones defining µ q,e depend also on the x e variables. On the other handall linear forms appearing in the weight functions w q ( x e , x e ′ ) (respectively, w q ( x e , y e ′ ) ) depend on the x e ′ (respectively, y e ′ ) variables as well. Thus the family of all linear forms in the above expressions is pairwiselinearly independent in the ( q, x e , x e ′ , y e ′ ) variables. 
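The phrase "a combination of four terms" recurs in the proofs of this section, and the display below spells it out in generic form. It is only a sketch: $A$ and $B$ stand for any two of the weight products involved (for instance $\mathrm{w}_q(x_e,x_{e'})$ and $\mathrm{w}_q(x_e,y_{e'})$), and $\mathbb{E}_\mu$ is shorthand introduced here, not notation from the text, for the average against the remaining weights.

```latex
% Generic expansion behind "a combination of four terms" (illustrative sketch).
% A, B: products of weights of the form nu(L(...)).
% E_mu: weighted average against the remaining weights (shorthand assumed here).
\begin{align*}
\mathbb{E}_\mu\,(A-1)(B-1)
  &= \mathbb{E}_\mu AB \;-\; \mathbb{E}_\mu A \;-\; \mathbb{E}_\mu B \;+\; \mathbb{E}_\mu 1 \\
  &= \bigl(1+o_K(1)\bigr) - \bigl(1+o_K(1)\bigr) - \bigl(1+o_K(1)\bigr) + \bigl(1+o_K(1)\bigr) \\
  &= o_K(1).
\end{align*}
```

Each of the four averages equals $1 + o_K(1)$ by the linear forms condition as soon as all forms occurring in it are pairwise linearly independent, which is exactly what the proofs verify.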
□
2.2. Extension of parametric systems.
During our iteration process we will encounter extensions of para-metric families of forms depending on more and more parameters. Roughly speaking one extends a familyby adding new parameters together with new forms depending also on the new parameters. More pre-cisely let L q = { L ( q , x ) , ..., L s ( q , x ) } and L q = { L ( q , x ) , ..., L s ( q , x ) } be two pairwise lin-early indpendent families of linear forms defined on the parameter spaces Z = Z k N and Z = Z k N . Let ψ and ψ be measures on Z and Z defined by the families of linear forms { Y ( q ) , . . . Y s ( q ) } and { Y ( q ) , . . . Y s ( q ) } . Definition 2.1.
We say that the family L q is an extension of the family L q if Z ≤ Z and the followingholds. The family of forms L i ( q , x ) , Y j ( q ) which depend only on the variables q = π ( q ) is exactly thefamily of forms L i ( q , x ) , Y j ( q ) , where π : Z → Z is the natural orthogonal projection. If V = V J let µ := { µ q ,e } q ∈ Z ,e ∈H and µ := { µ q ,f } q ∈ Z ,f ∈H be the associated measure systems asdefined in (2.1.5). We say that the measure system µ is an extension of the system µ .Let us make a few immediate observations. Writing Z = Z × Z , Z = Z rN and q = ( q , q ) , we have ψ ( q , q ) = ψ ( q ) · ϕ ( q , q ) (2.2.1)where ϕ ( q, q ) = Q ti =1 ν ( Y i ( q , q )) . The linear forms Y i ( q , q ) defining ϕ ( q , q ) depend on some of thevariables of q = ( q i ) ≤ i ≤ k and are pairwise linearly independent. Similarly one may write for any e ∈ H µ q ,q ) ,e ( x e ) = µ q ,e ( x e )w e ( q , q, x e ) (2.2.2)where the linear forms L j ( q , q, x e ) defining the function w e ( q, q , x e ) depend on (some of) the variables q as well as on (all of) the variables x e .In the special case when L = ( L ( x ) , .., L s ( x )) is a family of linear forms, a parametric family L q is calledan extension of L if the set of forms in L q which are independent of q is exactly the family L . Similarly, theassociated system of weights { ν q,e } and measures { µ q,e } is referred to as an extension of { ν e } and { µ e } . Lemma 2.3.
Let { µ f } f ∈H be a well defined measure system, and let { µ q,f } q ∈ Z,f ∈H be a well-definedparametric extension of { µ f } f ∈H of complexity at most K . Then for any f ∈ H and for any function g : V f → [ − , there is a set E g,f ⊆ Z of measure ψ ( E g,f ) = o K (1) , so that for all q / ∈ E g,f Z V f g dµ q,f − Z V f g dµ f = o K (1) . (2.2.3) Similarly if { µ q ,f } f ∈H ,q ∈ Z is a well-defined parametric system and if { µ q ,f } f ∈H ,q ∈ Z is an extension ofcomplexity at most K , then to any function g : Z × V f → [ − , there exists a set E g,f ⊆ Z of measure ψ ( E g,f ) = o K (1) , such that for all q = ( q , q ) / ∈ E g,f Z V f g ( q , x ) dµ q ,f ( x ) − Z V f g ( q , x ) dµ q ,f ( x ) = o K (1) . (2.2.4) MULTIDIMENSIONAL SZEMER ´EDI THEOREM IN THE PRIMES 15
Proof. As µ q,f = µ f ( x f )w f ( q, x f ) , the left side of (2.2.3) may be written as Λ f,g ( q ) := Z V f g ( x )(w f ( q, x ) − dµ f ( x ) . Consider Λ f,g := Z Z | Λ f,g ( q ) | dψ ( q ) . Using the Cauchy-Schwartz inequality we estimate Λ f,g = Z Z Z V f Z V f (w f ( q, x ) − f ( q, y ) − g ( x ) g ( y ) dµ f ( x ) dµ f ( y ) dψ ( q ) ≤ Z V f Z V f (cid:12)(cid:12)(cid:12)(cid:12)Z Z (w f ( q, x ) − f ( q, y ) − dψ ( q ) (cid:12)(cid:12)(cid:12)(cid:12) dµ f ( x ) dµ f ( y ) . Now the Cauchy-Schwartz inequality and (2.1.1) gives | Λ f,g | . Z V f Z V f Z Z Z Z (w f ( q, x ) − f ( q, y ) − ×× (w f ( p, x ) − f ( p, y ) − dµ f ( x ) dµ f ( y ) dψ ( q ) dψ ( p ) . This last expression is a combination of 16 terms where each term is o K (1) by the linear form conditions.Indeed the linear forms which can appear in any of these terms are Y i ( q ) , Y i ( p ) , L i ( x ) , L i ( y ) , L i ( q, x ) , L i ( q, y ) , L i ( p, x ) , L i ( p, y ) . Note that the last 4 terms depend on both sets of variables (for example L i ( q, x ) depends both on q ∈ Z and on x ∈ V f ), and hence the family of these forms are pairwise linearlyindependent in the ( q, p, x, y ) variables. This Proves (2.2.3).The proof of (2.2.4) is essentially the same. Set Λ f,g ( q ) := Z V f g ( q , x ) dµ q ,f ( x ) − Z V f g ( q , x ) dµ q ,f ( x ) and Λ f,g := Z Z | Λ f,g ( q ) | dψ ( q ) . Write Z = Z × Z , where Z = Z kN , and q = ( q , q ) for q ∈ Z . By (2.2.1) we estimate as above Λ f,g . Z V f Z V f Z Z dψ ( q ) dµ q ,f ( x ) dµ q ,f ( y ) | E q ∈ Z (w f ( q , q, x ) − f ( q , q, y ) − ϕ ( q , q ) | . The linear forms condition gives Z V f Z V f Z Z dψ ( q ) dµ q ,f ( x ) dµ q ,f ( y ) = 1 + o K (1) , so then we have | Λ f,g | . Z V f Z V f Z Z E p,q ∈ Z (w f ( q , q, x ) − f ( q , q, y ) − ×× (w f ( q , p, x ) − f ( q , p, y ) − ϕ ( q , q ) ϕ ( q , p ) dψ ( q ) dµ q ,f ( x ) dµ q ,f ( y ) . The point is that any linear form L if ( q , q, x ) depends both on the variables q and x . 
Thus again the left sideis a combination of 16 terms, each being o K (1) by the linear forms condition as all the linear formsinvolved in any of these expressions are pairwise linearly independent in the ( x, y, q , q, p ) variables. (cid:3) Lemma 2.3 is an example of what we refer to as a stability property. it means that the extension measures µ ( q ,q ) ,f are small perturbations of the measures µ q ,f with respect to quantities which are independent of q .As a first application of this principle we show that the weighted box norms, defined in (1.4.2), remainessentially unchanged under parametric extensions of the weight systems defining the norms. Let L q be apairwise linearly independent family of forms defined on the parameter space ( Z , ψ ) and let { ν q ,e } bethe associated system of weights.Let g : Z × V e → R be a function and let e ∈ H , | e | = d ′ . For a given q ∈ Z recall the box norm of g q ( x ) = g ( q , x ) (cid:13)(cid:13) g q (cid:13)(cid:13) d ′ (cid:3) νq ,e = E p,x ∈ V e Y ω e ∈{ , } e g ( q , ω e ( p, x )) Y f ⊆ e Y ω f ∈{ , } f ν q,f ( ω f ( p f , x f )) , (2.2.5)where x f = π f ( x ) , p f = π f ( p ) , π f : V e → V f being the natural projection. The inner product on the rightside of (2.2.5) is defined by the parametric family of forms ˜ L q = [ f ⊆ e { L ( q , ω f ( p f , x f )); L ∈ L q , supp x ( L ) = f, ω f ∈ { , } f } . (2.2.6)It is easy to see that this is a pairwise linearly independent family of forms defined over Z × V ( V = V e × V e ) . Indeed, if we’d have that L ′ ( q , ω ′ f ′ ( p f ′ , x f ′ )) = λL ( q , ω f ( p f , x f )) , (2.2.7)then restriction both forms to the subspace { p = x } would imply that L ′ ( q , x f ′ ) = λL ( q , x f ) and hence f ′ = supp x ( L ′ ) = supp x ( L ) = f . 
Then, as L and L ′ depend exactly variables x j for j ∈ f , for (2.2.7) tohold, we should have ω ′ f = ω f and L = L ′ .If { ˜ µ q ,f } q ∈ Z ,f ⊆ e denotes the associated system of measures and G ( q , p, x ) := Y ω ∈{ , } e g ( q , ω e ( p, x )) , (2.2.8)then for given q ∈ Z (cid:13)(cid:13) g q (cid:13)(cid:13) d ′ (cid:3) νq ,e = E p,x ∈ V e G q ( p, x ) ˜ µ q ,e ( p, x ) . (2.2.9)Now, if L q is a well-defined parametric extension of L q then (2.2.6) yields to a well-defined parametricextension ˜ L q of the family ˜ L q . Then by Lemma 2.3, and the simple observation that | a d ′ − b d ′ | ≤ ε implies | a − b | ≤ ε − d ′ for a, b ≥ , we obtain Lemma 2.4.
Let $\{\nu_{q_1,f}\}_{f\in\mathcal{H},\,q_1\in Z_1}$ be a parametric weight system with a well-defined extension $\{\nu_{q_2,f}\}_{f\in\mathcal{H},\,q_2\in Z_2}$ of complexity at most $K$. Then to any $e\in\mathcal{H}$ and to any function $g : Z_1\times V_e\to[-1,1]$ there exists a set $\mathcal{E} = \mathcal{E}(g,e)\subseteq Z_2$ of measure $\psi_2(\mathcal{E}) = o_K(1)$ such that for all $q_2 = (q_1,p)\notin\mathcal{E}$
$$\|g_{q_1}\|_{\Box_{\nu_{q_2},e}} = \|g_{q_1}\|_{\Box_{\nu_{q_1},e}} + o_K(1). \tag{2.2.10}$$
Let $(V,\mathcal{B},\mu)$ be a measure space and let $g : V\to\mathbb{R}$ be a function. An important construction, the so-called conditional expectation function, is defined as
$$\mathbb{E}_\mu(g\,|\,\mathcal{B})(x) = \frac{1}{\mu(B(x))}\,\mathbb{E}_{y\in V}\, 1_{B(x)}(y)\, g(y)\,\mu(y) = \frac{1}{\mu(B(x))}\int_{B(x)} g(y)\, d\mu(y),$$
where $B(x)\in\mathcal{B}$ is the atom containing $x$. If $\mu(B(x)) = 0$ then we set $\mathbb{E}_\mu(g\,|\,\mathcal{B})(x) = 1$.
The complexity of the $\sigma$-algebra $\mathcal{B}$, denoted by $\mathrm{compl}(\mathcal{B})$, is defined as the minimum number of elements of $\mathcal{B}$ which generate $\mathcal{B}$. Note that the number of atoms of $\mathcal{B}$ is at most $2^{\mathrm{compl}(\mathcal{B})}$. Next we compare the conditional expectation functions of parametric systems.
Lemma 2.5.
Let ( µ q ,f ) q ∈ Z ,f ∈H be a well-defined parametric measure system with a well-defined exten-sion ( µ q ,f ) q ∈ Z ,f ∈H of complexity at most K . For q ∈ Z and e ∈ H , let B q ,e be a σ − algebra on V e such that compl( B q ,e ) ≤ M for some fixed number M . For any function g : Z × V e → [ − , there existsa set E = E ( B , g ) ⊆ Z of measure ψ ( E ) = o M,K (1) such that for any q = ( q , q ) / ∈ E (1) we have (cid:13)(cid:13) E µ q ,e ( g q |B q ,e ) − E µ q ,e ( g q |B q ,e ) (cid:13)(cid:13) L ( µ q ,e ) = o M,K (1) (2.2.11)(2) and (cid:13)(cid:13) E µ q ,e ( g q |B q ,e ) (cid:13)(cid:13) L ( µ q ,e ) = (cid:13)(cid:13) E µ q ,e ( g q |B q ,e ) (cid:13)(cid:13) L ( µ q ,e ) + o M,K (1) . (2.2.12) Proof.
Let m = 2 M and enumerate the atoms of B q ,e as B q , ..., B mq , allowing some of them to possiblybe empty. For a fixed ≤ i ≤ m define the functions b i ( q , x ) = B iq ( x ) = ( if x ∈ B iq otherwiseand for q = ( q , q ) ∈ Z define the quantities µ i ( q , g ) := Z V e g ( q , x ) b i ( q , x ) dµ q ,e ( x ) , µ i ( q ) := µ q ,e ( B iq ) ,µ i ( q , g ) := Z V e g ( q , x ) b i ( q , x ) dµ q ,e ( x ) , µ i ( q ) := µ q ,e ( B iq ) By Lemma 2.3 we have that µ i ( q , g ) = µ i ( q , g ) + o K (1) , µ i ( q ) = µ i ( q ) + o K (1) (2.2.13)for all q / ∈ E i where E i ⊆ Z is a set of ψ - measure o K (1) . Let E = S mi =1 E i then ψ ( E ) = o K ,M (1) . The left hand side of (2.2.11) takes the form m X i =1 (cid:18) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:19) µ i ( q ) , (2.2.14)with the convention that if µ i ( q ) = 0 or µ i ( q ) = 0 then µ i ( q , g ) /µ i ( q ) := 1 or µ i ( q , g ) /µ i ( q ) := 1 . If q = ( q , q ) / ∈ E then by (2.2.13) ε := m X i =1 (cid:18) | µ i ( q , g ) − µ i ( q , g ) | + | µ i ( q ) − µ i ( q ) | (cid:19) = o K ,M (1) (2.2.15)Now if µ i ( q ) ≤ ε / then µ i ( q ) ≤ ε / by (2.2.13), hence the total contribution of such terms isbounded by m ε / = o K ,M (1) . If µ i ( q ) ≥ ε / then µ i ( q ) ≥ ε / , we have the estimate (cid:12)(cid:12)(cid:12)(cid:12) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ ε ( N )2 ε ( N ) / ≤ ε / = o K ,M (1) , This proves (2.2.11). 
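The quotient estimate used in the last step can be checked by a short computation. The sketch below uses shorthand introduced only for this illustration: $a, b$ denote the quantities $\mu^i(\cdot,g)$, $\mu^i(\cdot)$ for the extended system, $a', b'$ denote those for the original system, $\varepsilon$ bounds their differences as in (2.2.15), and $\tau > 0$ is a hypothetical lower threshold on $b$ (a fractional power of $\varepsilon$ in the proof).

```latex
% Sketch; a, b, a', b', tau are shorthand assumed for this illustration.
% Assumptions: |a| <= b and |a'| <= b' (because |g| <= 1),
%              |a - a'| <= eps, |b - b'| <= eps, b >= tau > 0, b' > 0.
\begin{align*}
\left|\frac{a}{b}-\frac{a'}{b'}\right|
  &= \frac{|a\,b' - a'\,b|}{b\,b'}
   = \frac{\bigl|(a-a')\,b' + a'\,(b'-b)\bigr|}{b\,b'} \\
  &\le \frac{\varepsilon\,b' + b'\,\varepsilon}{b\,b'}
   = \frac{2\varepsilon}{b}
   \le \frac{2\varepsilon}{\tau}.
\end{align*}
```

For example, the choice $\tau = \varepsilon^{1/2}$ would give the bound $2\varepsilon^{1/2}$, which tends to 0 with $\varepsilon$; atoms with $b < \tau$ are handled separately, since their total contribution is small.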
The proof of inequality (2.2.12) proceeds the same way, here one needs to estimatethe quantity m X i =1 (cid:12)(cid:12)(cid:12)(cid:12) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:12)(cid:12)(cid:12)(cid:12) = m X i =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:18) µ i ( q , g ) µ i ( q ) (cid:19) µ i ( q ) − (cid:18) µ i ( q , g ) µ i ( q ) (cid:19) µ i ( q ) (cid:12)(cid:12)(cid:12)(cid:12) (2.2.16)If µ i ( q ) ≤ ε / then µ i ( q ) ≤ ε / for q = ( q , q ) / ∈ E , thus the contribution of such terms to the rightside of (2.2.16) is trivially estimated by m ε / = o M,K (1) The rest of the terms are bounded by ε / and (2.2.12) follows. (cid:3) We also need an analogue of the above result when the k · k L ( µ q,e ) norm is replaced by the more compli-cated k · k (cid:3) νq,e norms. Lemma 2.6.
Let { ν q ,f } f ∈H ,q ∈ Z be a well-defined extension of the parametric weight system { ν q ,f } f ∈H ,q ∈ Z ,of complexity at most K . For q ∈ Z and e ∈ H , let B q ,e be a σ -algebra of complexity at most M , forsome fixed constant M > . Then k E ν q ,e ( g q |B q ,e ) − E ν q ,e ( g q |B q ,e ) k (cid:3) νq ,e = o M,K (1) , (2.2.17) for all q = ( q , q ) / ∈ E , where E = E ( g, B ) ⊆ Z is a set of measure ψ ( E ) = o M,K (1) . Proof.
First we show that for any family of sets A = ( A q ) q ∈ Z , A q ⊆ V e there is a set E = E ( g, A ) ofmeasure ψ ( E ) = o K (1) such that for all q = ( q , q ) / ∈ E we have k A q k d (cid:3) νq ,e ≤ µ q ,e ( A q ) + o K (1) . (2.2.18)To see this, first note that for q = ( q , q ) ∈ Z one has k A q k d (cid:3) νq ,e ≤ E x,p ∈ V e A q ( x ) µ q ,e ( x ) Y f ⊆ e Y ω f =0 ν q ,f ( ω f ( p f , x f ))= µ q ,e ( A q ) + E ( q ) , with E ( q ) ≤ E x ∈ V e µ q ,e ( x ) | E p ∈ V e ( W ( q , p, x ) − | , where W ( q , p, x ) = Y f ⊆ e Y ω f =0 ν q ,f ( ω f ( p f , x f )) . Arguing as in Lemma 2.3, we see that E q ∈ Z E x,p,p ′ ∈ V e ψ ( q ) dµ q ,e ( x ) ( W ( q , p, x ) − W ( q , p ′ , x ) −
1) = o M,K (1) and (2.2.18) follows.Now let { B iq } mi =1 ( m = 2 M ) be the atoms of B q ,e and define the quantities µ i ( q , g ) , µ i ( q ) , µ i ( q , g ) , µ i ( q ) as in Lemma 2.4. The expression in (2.2.11) is then estimated (cid:13)(cid:13)(cid:13)(cid:13) m X i =1 (cid:18) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:19) B iq (cid:13)(cid:13)(cid:13)(cid:13) (cid:3) νq ,e ≤ m X i =1 (cid:12)(cid:12)(cid:12)(cid:12) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:13)(cid:13) B iq (cid:13)(cid:13) (cid:3) νq ,e . m X i =1 (cid:12)(cid:12)(cid:12)(cid:12) µ i ( q , g ) µ i ( q ) − µ i ( q , g ) µ i ( q ) (cid:12)(cid:12)(cid:12)(cid:12) µ q ,e ( B iq ) − d + o M,K (1) , for q = ( q , q ) / ∈ E , where E = E ( B q ,e , g ) is a set of measure o M,K (1) . MULTIDIMENSIONAL SZEMER ´EDI THEOREM IN THE PRIMES 19
Using the facts that µ i ( q , g ) = µ i ( q , g ) + o K (1) and µ i ( q ) = µ i ( q ) + o K (1) outside a set of measure o M,K (1) , and m X i =1 µ q ,e ( B iq ) = µ q ,e ( V e ) = 1 + o K (1) , it follows that the above expression is o M,K (1) by arguing as in Lemma 2.4. This completes the proof. (cid:3) Symmetric extensions.
We will also need our parametric families of forms to be symmetric, to applyTheorem 1.3, which we define as follows. Let for each e ∈ H d , L q,e = { L e ( q, x ) , ..., L se ( q, x ) } be a pairwiselinearly independent family of linear forms defined on V = V J , depending on parameters q ∈ Z , such that supp x ( L je ) ⊆ e . We say that the family of forms L q = S e ∈H d L q,e is symmetric if L je ( q, x ) = L je ′ ( q, x ) forall q ∈ Z , x ∈ M = { x : x + · · · + x d = 0 } , e, e ′ ∈ H d and ≤ j ≤ s . Note that that our initial familyof forms defined in (1.4.2) has this property.It is not hard to see that to a given family of forms L q,e , for a fixed e ∈ H d , there is a unique symmetricfamily of forms L q such that L q,e = { L ∈ L q ; supp x ( L ) ⊆ e } . Indeed, if L q is such a family, then for e ′ ∈ H d , q ∈ Z and x ∈ V J L je ′ ( q, x ) = L je ′ ( q, π e ′ ( x )) = L je ′ ( q, φ e ′ ◦ π e ′ ( x )) = L je ( q, φ e ′ ◦ π e ′ ( x )) , (2.3.1)where φ e : V e → M is the inverse of the projection π e ′ restricted to M . This shows the uniqueness ofthe family L q . Conversely, define L je ′ ( q, x ) by the above equality, then it is clear that supp x ( L je ′ ) ⊆ e ′ ,moreover if x ∈ M then x = φ e ′ ◦ π e ′ ( x ) hence L je ′ ( q, x ) = L je ( q, x ) . Also, if supp x L je ′ ⊆ e then for all q ∈ Z and x ∈ V J L je ′ ( q, x ) = L je ′ ( q, φ e ◦ π e ( x )) = L je ( q, φ e ◦ π e ( x )) = L je ( q, x ) , This shows that all forms in L q which depend only on the variables x e are the forms of L q,e . Finally, if L q,e is a pairwise linearly independent family then so is L q , as linearly dependent forms must depend on the sameset of variables. We will refer to the family of forms L q as the symmetrization of the family L q,e . 
If f ∈ H for some edge | f | = d ′ ≤ d and L q,f is a family of forms defined on V f then the above construction can beapplied to obtain a symmetric family L q simply by choosing an e ∈ H d such that f ⊆ e and considering L q,f as a family of forms on V e . Note that the construction is independent of the choice of e ⊇ f , as if f ⊆ e ′ as well then L je = L je ′ for all ≤ j ≤ s .In the next section, following [19] we will start an energy increment process to obtain a regularity lemmafor weighted hypergraphs. At each stage we will pass to an extension of a symmetric, well-defined andpairwise independent parametric family L q defined for q ∈ Z as follows. We choose an edge e ∈ H andconsider the extension of the family L q,e as given in (2.2.6), that is replacing the forms L j ( q, x f ) with theforms L j ( q, ω f ( p f , x f )) , ω f ∈ { , } f . This gives an extension ˜ L q,p,f defined on the parameter space ( q, p ) ∈ Z × V f , which we symmetrize to obtain a new symmetric, well-defined and pairwise independentfamily ˜ L q,p . The first step of this process was described in the introduction in the special case e = (1 , .3. R EGULARIZATION OF P ARAMETRIC S YSTEMS
3.1. A Koopman-von Neumann type decomposition.
Let e ⊆ J and let B f be a σ -algebra on V f for f ∈ ∂e , where ∂e = { f ⊆ e ; | f | = | e | − } denotes the boundary of the edge e . Let B := W f ⊆ ∂e B f be the σ -algebra generated by the sets π − ef ( B f ) where π ef : V e → V f is the canonical projection. The atoms of B are the sets G = T f ⊆ ∂e π − ef ( G f ) with G f being an atom of B f , which may be interpreted as the collectionof simplices x ∈ V e whose faces x f are in G f for all f ∈ ∂e . The starting point of the proof of the Regularity Lemma, given in [19], is to show that if a set G e ⊆ V e isnot sufficiently regular with respect to B , that is if (cid:13)(cid:13) G e − E ( G e | _ f ∈ ∂e B f ) (cid:13)(cid:13) (cid:3) ≥ η, (3.1.1)then there exist σ -algebras B ′ f ⊇ B f for f ∈ ∂e , such that k E ( G e | _ f ∈ ∂e B ′ f ) (cid:13)(cid:13) L ≥ (cid:13)(cid:13) E ( G e | _ f ∈ ∂e B f ) (cid:13)(cid:13) L + cη . The quantity k E ( G |B ) k L is referred to as the energy (or index ) of the set G with respect to the σ -algebra B , thus the above inequality means that the energy of the set G e is increased by cη by refining the σ -algebras B f . In addition the complexity of the σ -algebras B ′ f , denoted by compl ( B ′ f ) and defined as theminimal number of sets generating the σ -algebra, is at most 1 larger than that of B f .In our settings for a given e ⊆ J we will have a parametric system of weights { ν q,f } q ∈ Z,f ⊆ e and measures { µ q,f } q ∈ Z,f ⊆ e associated to a well-defined, pairwise linearly independent family of of forms L q defined on Z × V e , as given in (2.1.5). For simplicity we will refer to such systems of weights and measures as being well-defined . Lemma 3.1.
For given e ⊆ J , | e | = d ′ , let { µ q,f } q ∈ Z,f ⊆ e be a well-defined family of measures of complex-ity at most K . For q ∈ Z let G q,e ⊆ V e and {B q,f } f ∈ ∂e be a σ -algebra on V f .Assume (cid:13)(cid:13) G q,e − E µ q,e ( G q,e | _ f ∈ ∂e B q,f ) (cid:13)(cid:13) d ′ (cid:3) νq,e ≥ η, (3.1.2) for some η > and for each q ∈ Ω , where Ω ⊆ Z is a set of measure ψ (Ω) ≥ c > .Then for N, W sufficiently large with respect to the parameters c , η , there exists a well-defined extension { µ q ′ ,f } q ′ ∈ Z ′ ,f ⊆ e of the system { µ q,f } of complexity K ′ = O ( K ) , and a set Ω ′ ⊆ Ω × V e ⊆ Z ′ = Z × V e such that the following hold. (1) We have ψ ′ (Ω ′ ) ≥ − c η , (3.1.3) where ψ ′ is the measure on the parameter space Z ′ . (2) For all q ′ = ( q, p ) ∈ Z ′ and f ∈ ∂e there is a σ − algebra B q ′ ,f ⊇ B q,f of complexitycompl ( B q ′ ,f ) ≤ compl ( B q,f ) + 1 . (3.1.4)(3) For all q ′ = ( q, p ) ∈ Ω ′ , one has (cid:13)(cid:13) E µ q ′ ,e ( G q,e | _ f ∈ ∂e B q ′ ,f ) (cid:13)(cid:13) L ( µ q ′ ,e ) ≥ (cid:13)(cid:13) E µ q,e ( G q,e | _ f ∈ ∂e B q,f ) (cid:13)(cid:13) L ( µ q,e ) + 2 − η , (3.1.5)(4) and µ q ′ ,e ( V e ) ≤ . (3.1.6)The meaning of the above lemma is that if there is a large “bad ” set Ω of parameters q for which the set G q,e is not sufficiently uniform with respect to the σ -algebra W f ∈ ∂e B q,f , then its energy will increase by afixed amount when passing to a well defined extension {B q ′ ,f } , { µ q ′ ,e } , for all q ′ = ( q, p ) ∈ Ω ′ . MULTIDIMENSIONAL SZEMER ´EDI THEOREM IN THE PRIMES 21
Proof.
Let g q := G q,e − E µ q,e ( G q,e | _ f ∈ ∂e B q,f ) . (3.1.7)Then by (2.2.5) we have for each q ∈ Ω (cid:13)(cid:13) g q (cid:13)(cid:13) d ′ (cid:3) νq,e = Z V e h g q , Y f ∈ ∂e u q,p,f i µ ( q,p ) ,e dµ q,e ( p ) ≥ η, (3.1.8)where u q,p,f : V e → [ − , are functions, and { µ ( q,p ) ,e } ( q,p ) ∈ Z ′ is the family of measures µ ( q,p ) ,e ( x ) = Y f ⊆ e Y ω f ∈{ , } f ω f =0 ν q,f ( ω f ( p f , x f )) . As explained after (2.1.5) the measures µ ( q,p ) ,e are defined by a pairwise independent family of forms L ( q,p ) ,e depending on the parameters ( q, p ) ∈ Z × V e , which is a well-defined extension of the family L q,e defin-ing the measures µ q,e . It is clear from (3.1.8) that the measure ψ ′ on Z ′ has the form ψ ′ ( q, p ) = µ q,e ( p ) ψ ( q ) .For q ′ = ( q, p ) , let Γ( q, p ) := h g q , Y f ∈ ∂e u q,p,f i µ q,p,f (3.1.9)We show that there is a set Ω ′ ⊆ Ω × V e of measure ψ ′ (Ω ′ ) ≥ − c η , (3.1.10)such that for every ( q, p ) ∈ Ω ′ one has Γ( q, p ) ≥ η . (3.1.11)By Lemma 2.2 we have that µ q,e ( V e ) = 1 + o K (1) ≤ for q / ∈ E where E ⊆ Ω is a set of measure ψ ( E ) = o K (1) . Thus for q ∈ Ω \E = Ω we have by (3.1.8) that Z V e { Γ( q,p ) ≥ η/ } Γ( q, p ) dµ q,e ( p ) ≥ η , (3.1.12)where by (3.1.8) and (3.1.9) we have Γ( q, p ) = Z V e g q ( x ) Y f ∈ ∂e u q,f w q,p ( x ) dµ q,e ( x ) The function w q,p ( x ) is the product of weight functions of the form ν ( L ( q, p, x )) depending on both p and x . Thus, using the bounds | g q | ≤ , | u q,p,f | ≤ , one has Z Z Z V e | Γ( q, p ) | dµ q,e ( p ) dψ ( q ) ≤ Z Z Z V e Z V e Z V e w q,p ( x )w q,p ( x ′ ) dµ q,e ( x ) dµ q,e ( x ′ ) dµ q,e ( p ) dψ ( q ) (3.1.13) = 1 + o K (1) ≤ by the linear forms condition, as the factors in the product depend on different sets of variables. Let Ω ′ := { ( q, p ) ∈ Ω × V e ; Γ( q, p ) ≥ η/ } . Thus by (3.1.12), (3.1.13) and the Cauchy-Schwartz inequality c η ≤ Z Ω ′ Γ( q, p ) dµ q,e ( p ) dψ ( q ) ! 
≤ Z Ω ′ Γ( q, p ) dµ q,e ( p ) dψ ( q ) ψ ′ (Ω ′ ) ≤ ψ ′ (Ω ′ ) . This shows ψ ′ (Ω ′ ) ≥ − c η as claimed.Since | u q ′ ,f | ≤ , decomposing of each function u q ′ ,f into its positive and negative parts yields that h g q , Y f ∈ ∂e v q ′ ,f i µ q ′ ,e ≥ − η (3.1.14)for some functions v q ′ ,f : V f → [0 , . For a given f ∈ ∂e and some ≤ t f ≤ , let U q ′ ,t f := { x f ∈ V f : v q ′ ,f ( x f ) ≥ t f } be the level set of the functions v q ′ ,f . Then v q ′ ,f ( x f ) = R U q ′ ,tf ( x f ) dt f , and for each term in (3.1.14) wehave Z · · · Z h g q , Y f ∈ ∂e U q ′ ,tf i µ q ′ ,e dt ≥ − η, where t = ( t f ) f ∈ ∂e . Accordingly the integrand must be at least − d − η for some value of the parameter t .Fix such a t = ( t f ) and write U q ′ ,f for U q ′ ,t f for simplicity of notation. For q ′ = ( q, p ) ∈ Ω ′ , define B q ′ ,f tobe the σ − algebra generated by B q,f , and the U q ′ ,t f . For q ′ / ∈ Ω ′ , set B q ′ ,f = B q,f . The function Q f ∈ ∂e U q ′ ,f is constant on the atoms of the σ − algebra W f ∈ ∂e B q ′ ,f , and therefore we havefor q ′ ∈ Ω ′ h G q,e − E µ q ′ ,e ( G q,e | _ f ∈ ∂e B q ′ ,f ) , Y f ∈ ∂e U q ′ ,f i µ q ′ ,e = 0 for q ′ ∈ Ω ′ . Hence, by (3.1.7) and (3.1.14) it follows that h E µ q ′ ,e ( G q,e | _ f ∈ ∂e B q ′ ,f ) − E µ q,e ( G q,e | _ f ∈ ∂e B q,f ) , Y f ∈ ∂e U q ′ ,f i µ q ′ ,e ≥ − η (3.1.15)By Lemma 2.2 there is a set E ⊆ Z ′ such that ψ ′ ( E ) = o K (1) and (cid:13)(cid:13) Y f ∈ ∂e U q ′ ,f (cid:13)(cid:13) L ( µ q ′ ,e ) ≤ µ q ′ ,e ( V e ) / = 1 + o K (1) ≤ for q ′ ∈ Ω ′ \E =: Ω ′ . Then by the Cauchy-Schwartz inequality, (cid:13)(cid:13) E µ q ′ ,e ( G q,e | _ f ∈ ∂e B q ′ ,f ) − E µ q,e ( G q,e | _ f ∈ ∂e B q,f ) (cid:13)(cid:13) L ( µ q ′ ,e ) ≥ − η, for q ′ ∈ Ω ′ . 
By Lemma 2.6 there is an exceptional set $\mathcal{E}_1 \subseteq Z'$ of measure $\psi'(\mathcal{E}_1) = o_{K,M}(1)$ such that for $q' = (q,p) \in \Omega'_2 := \Omega'_1 \setminus \mathcal{E}_1$ we have
\[
\Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big\|^2_{L^2(\mu_{q',e})} \ge [\ldots]\,\eta - o_{K,M}(1) \ge [\ldots]\,\eta. \tag{3.1.16}
\]
Since $\mathcal{B}_{q,f} \subseteq \mathcal{B}_{q',f}$ for $q' = (q,p)$, (3.1.16) is equivalent to
\[
\Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|^2_{L^2(\mu_{q',e})} - \Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big\|^2_{L^2(\mu_{q',e})} \ge [\ldots]\,\eta. \tag{3.1.17}
\]
Finally, by a further invocation of Lemma 2.6 there is a set $\mathcal{E}_2 \subseteq Z'$ of measure $\psi'(\mathcal{E}_2) = o_{K,M}(1)$ such that for $q' \in \Omega'_3 := \Omega'_2 \setminus \mathcal{E}_2$ we have (for $N, W$ sufficiently large)
\[
\Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|^2_{L^2(\mu_{q',e})} - \Big\| \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big\|^2_{L^2(\mu_{q,e})} \ge [\ldots]\,\eta. \tag{3.1.18}
\]
This proves the lemma, choosing $\Omega' = \Omega'_3$. $\Box$

Iterating the above lemma leads to a parametric family of $\sigma$-algebras and measures such that the sets $G_{q,e}$ become sufficiently uniform with respect to them. The associated decomposition of their indicator functions is sometimes referred to as a Koopman-von Neumann type decomposition [19]. We will replace the sets $G_e \subseteq V_e$ by $\sigma$-algebras $\mathcal{B}_e$ on $V_e$ for $e \in \mathcal{H}_{d'}$, and for that it is useful to define the total energy of the family $\{\mathcal{B}_e\}_{e\in\mathcal{H}_{d'}}$ with respect to a family of lower order $\sigma$-algebras $\{\mathcal{B}_f\}_{f\in\mathcal{H}_{d'-1}}$ and a family of measures $\{\mu_e\}_{e\in\mathcal{H}_{d'}}$ as
\[
\sum_{e\in\mathcal{H}_{d'},\ G_e\in\mathcal{B}_e} \Big\| \mathbb{E}_{\mu_e}\big(\mathbf{1}_{G_e} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_f\big) \Big\|^2_{L^2(\mu_e)}. \tag{3.1.19}
\]
Assuming the measures $\mu_e$ are normalized, i.e. $\mu_e(V_e) = 1 + o(1) \le 2$, a crude upper bound for the total energy is $2^{d+1} 2^{2^{M}} = O_M(1)$, where $M$ is the complexity of the $\sigma$-algebras $\mathcal{B}_e$.

Lemma 3.2 (Koopman-von Neumann decomposition). Let $\{\mu_{q,f}\}_{q\in Z, f\in\mathcal{H}}$ be a well-defined, symmetric family of measures of complexity at most $K$. Let $1 \le d' \le d$, and let $\{\mathcal{B}_{q,e}\}_{q\in Z, e\in\mathcal{H}_{d'}}$ and $\{\mathcal{B}_{q,f}\}_{q\in Z, f\in\mathcal{H}_{d'-1}}$ be families of $\sigma$-algebras of complexity at most $M_{d'}$ and $M_{d'-1}$, respectively. Finally let $\Omega \subseteq Z$ with $\psi(\Omega) \ge c_0 > 0$, and let $\delta > 0$ be a constant.
Then for $N, W$ sufficiently large with respect to the constants $\delta, c_0, M_{d'}, M_{d'-1}$ and $K$, there exists a well-defined, symmetric extension $\{\mu_{q',f}\}_{q'\in Z', f\in\mathcal{H}}$ of the system $\{\mu_{q,f}\}$ of complexity at most $K' = O_{M_{d'},K,\delta}(1)$ and a family of $\sigma$-algebras $\{\mathcal{B}_{q',f}\}_{q'\in Z', f\in\mathcal{H}_{d'-1}}$ such that the following hold.
(1) For all $q' = (q,p) \in Z'$ and $f \in \mathcal{H}_{d'-1}$ we have
\[
\mathcal{B}_{q,f} \subseteq \mathcal{B}_{q',f}, \qquad \mathrm{compl}(\mathcal{B}_{q',f}) \le \mathrm{compl}(\mathcal{B}_{q,f}) + O_{M_{d'},\delta}(1). \tag{3.1.20}
\]
(2) There exists a set $\Omega' \subseteq \Omega \times V \subseteq Z'$ of measure $\psi'(\Omega') \ge c(c_0, \delta, M_{d'}) > 0$ such that for all $q' = (q,p) \in \Omega'$ and for all $G_{q,e} \in \mathcal{B}_{q,e}$ one has
\[
\Big\| \mathbf{1}_{G_{q,e}} - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|_{\Box_{\nu_{q',e}}} \le \delta \tag{3.1.21}
\]
and
\[
\Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big\|^2_{L^2(\mu_{q',e})} = \Big\| \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big\|^2_{L^2(\mu_{q,e})} + o_{M_{d'},K,\delta}(1). \tag{3.1.22}
\]
Proof.
Initially set $Z' = Z$; then (3.1.20) and (3.1.22) trivially hold for $q' = q$. If there is a set $\Omega_1 \subseteq \Omega$ of measure $\psi(\Omega_1) \ge c_0/2$ such that inequality (3.1.21) holds for all $q \in \Omega_1$ and $G_{q,e} \in \mathcal{B}_{q,e}$, then the conclusions of the lemma hold for the initial system of measures and $\sigma$-algebras $\{\mu_{q,f}\}$, $\{\mathcal{B}_{q,f}\}$ and the set $\Omega_1$. Otherwise there is a set $\Omega_2 \subseteq \Omega$ of measure $\psi(\Omega_2) \ge c_0/2$ such that for each $q \in \Omega_2$ there is an $e \in \mathcal{H}_{d'}$ and a set $G_{q,e} \in \mathcal{B}_{q,e}$ for which the inequality (3.1.21) fails. By pigeonholing we may assume that $e \in \mathcal{H}_{d'}$ is independent of $q$. Then by Lemma 3.1, with $\eta$ chosen in terms of $\delta$ and $d'$, there is a well-defined extension $\{\mu_{q',f}\}_{q'\in Z', f\subseteq e}$, a family of $\sigma$-algebras $\{\mathcal{B}_{q',f}\}_{q'\in Z', f\subseteq e}$ and a set $\Omega' \subseteq \Omega_2 \times V$ for which (3.1.3)-(3.1.5) hold. Let $\{\mu_{q',f}\}_{q'\in Z', f\in\mathcal{H}}$ be the symmetrization of the system $\{\mu_{q',f}\}_{q'\in Z', f\subseteq e}$ as described in Section 2.3, and set $\mathcal{B}_{q',f} := \mathcal{B}_{q,f}$ for $q' \notin \Omega'$ or $f \nsubseteq e$. By Lemma 2.5 one may remove a set $\mathcal{E}$ of measure $\psi'(\mathcal{E}) = o_{M_{d'},K}(1)$ such that for all $q' \in \Omega' \setminus \mathcal{E}$, (3.1.20) and (3.1.22) hold for the extended system, whose total energy exceeds that of the initial system $\{\mu_{q,f}\}_{q\in Z, f\in\mathcal{H}}$ by at least a positive amount depending only on $\delta$ and $d'$.

Based on the above argument we perform the following iteration. Let $\{\mu_{q',f}\}_{q'\in Z', f\in\mathcal{H}}$ be a well-defined, symmetric extension of the initial system $\{\mu_{q,f}\}_{q\in Z, f\in\mathcal{H}}$, let $\{\mathcal{B}_{q',f}\}_{q'\in Z', f\in\mathcal{H}_{d'-1}}$ be a family of $\sigma$-algebras, and let $\Omega' \subseteq \Omega \times V' \subseteq Z'$ be a set for which (3.1.20) and (3.1.22) hold. If there is a set $\Omega'_1 \subseteq \Omega'$ of measure $\psi'(\Omega'_1) \ge \psi'(\Omega')/2$ such that for all $q' \in \Omega'_1$, $e \in \mathcal{H}_{d'}$ and $G_{q,e} \in \mathcal{B}_{q,e}$ inequality (3.1.21) holds, then the system $\{\mu_{q',f}\}$, $\{\mathcal{B}_{q',f}\}$ together with the set $\Omega'_1$ satisfies the conclusions of the lemma.

Otherwise there is a well-defined, symmetric extension $\{\mu_{q'',f}\}_{q''\in Z'', f\in\mathcal{H}}$ together with a family of $\sigma$-algebras $\{\mathcal{B}_{q'',f}\}_{q''\in Z'', f\in\mathcal{H}_{d'-1}}$ and a set $\Omega'' \subseteq \Omega' \times \mathbb{Z}_N^{d'}$ such that for all $q'' \in \Omega''$ the relations (3.1.20) and (3.1.22) hold, and the total energy of the system $(\mu_{q'',e}, \mathcal{B}_{q,e}, \mathcal{B}_{q'',f})$ exceeds that of the system $(\mu_{q',e}, \mathcal{B}_{q,e}, \mathcal{B}_{q',f})$ by at least the same positive amount depending only on $\delta$ and $d'$. Set $Z' := Z''$, $\mu_{q',e} := \mu_{q'',e}$ and $\mathcal{B}_{q',f} := \mathcal{B}_{q'',f}$. By the bound on the total energy (3.1.19), the iteration process must stop in $O_{M_{d'},\delta}(1)$ steps, and the system obtained satisfies (3.1.20)-(3.1.22). $\Box$

3.2. Hypergraph regularity lemmas.
The shortcoming of Lemma 3.2 is that the complexity of the $\sigma$-algebras $\mathcal{B}_{q,f}$ might be very large with respect to the parameter $\delta$, which measures the uniformity of the graphs $G_{q,e}$. This issue can be taken care of with an iteration process using Lemma 3.2 repeatedly, along the lines of [19]. In the weighted setting we have to pass to a new system of weights and measures at each iteration, and have to exploit the stability properties of well-defined extensions to show that the iteration process terminates.

Lemma 3.3 (Preliminary regularity lemma). Let $1 \le d' \le d$ and let $M_{d'} > 0$ be a constant. Let $\{\mu_{q,f}\}_{q\in Z, f\in\mathcal{H}}$ be a well-defined, symmetric family of measures of complexity at most $K$, and let $\{\mathcal{B}_{q,e}\}_{q\in Z, e\in\mathcal{H}_{d'}}$ be a family of $\sigma$-algebras on $V_e$ so that for all $q \in Z$, $e \in \mathcal{H}_{d'}$
\[
\mathrm{compl}(\mathcal{B}_{q,e}) \le M_{d'}. \tag{3.2.1}
\]
Let $\varepsilon > 0$, let $F : \mathbb{R}_+ \to \mathbb{R}_+$ be a non-negative, increasing function, possibly depending on $\varepsilon$, and let $\Omega \subseteq Z$ be a set of measure $\psi(\Omega) \ge c_0 > 0$.
If $N, W$ are sufficiently large with respect to the parameters $\varepsilon, c_0, M_{d'}, K$, and $F$, then there exists a well-defined, symmetric extension $\{\mu_{\bar q,f}\}_{\bar q\in\bar Z}$ of complexity at most $O_{K,M_{d'},F,\varepsilon}(1)$, families of $\sigma$-algebras $\mathcal{B}_{\bar q,f} \subseteq \mathcal{B}'_{\bar q,f}$ defined for $\bar q \in \bar Z$, $f \in \mathcal{H}_{d'-1}$, and a set $\bar\Omega \subseteq \bar Z$ such that the following hold.
(1) We have that $\bar\Omega \subseteq \Omega \times V \subseteq \bar Z = Z \times V$, where $V = \mathbb{Z}_N^k$ is of dimension $k = O_{M_{d'},F,\varepsilon}(1)$. Moreover
\[
\bar\psi(\bar\Omega) \ge c(c_0, F, M_{d'}, \varepsilon) > 0. \tag{3.2.2}
\]
(2) There is a constant $M_{d'-1} = O_{M_{d'},F,\varepsilon}(1)$ such that for all $\bar q \in \bar Z$ and $f \in \mathcal{H}_{d'-1}$ we have
\[
\mathrm{compl}(\mathcal{B}_{\bar q,f}) \le M_{d'-1}. \tag{3.2.3}
\]
(3) For all $\bar q = (q,p) \in \bar\Omega$, $e \in \mathcal{H}_{d'}$ and $G_{q,e} \in \mathcal{B}_{q,e}$, we have
\[
\Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{\bar q,f}\big) \Big\|_{L^2(\mu_{\bar q,e})} \le \varepsilon \tag{3.2.4}
\]
and
\[
\Big\| \mathbf{1}_{G_{q,e}} - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) \Big\|_{\Box_{\nu_{\bar q,e}}} \le \frac{1}{F(M_{d'-1})}. \tag{3.2.5}
\]
Proof.
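Let $\{\mu_{q',f}\}_{q'\in Z', f\in\mathcal{H}}$ be a well-defined, symmetric extension of the initial system $\{\mu_{q,f}\}$ defined on a parameter space $Z' = Z \times V'$, of complexity at most $K'$. Also, for $q' \in Z'$ and $f \in \mathcal{H}_{d'-1}$ let $\{\mathcal{B}_{q',f}\}_{q'\in Z', f\in\mathcal{H}_{d'-1}}$ be a family of $\sigma$-algebras of complexity at most $M_{d'-1}$. Set $\mathcal{B}_{q',e} := \mathcal{B}_{q,e}$ for $q' = (q,p) \in Z'$, $e \in \mathcal{H}_{d'}$, and apply Lemma 3.2 to the system $(\mu_{q',e}, \mathcal{B}_{q',e}, \mathcal{B}_{q',f})$ with $\delta = F(M_{d'-1})^{-1}$. This generates a well-defined, symmetric extension $\{\mu_{\bar q,f}\}_{\bar q\in\bar Z, f\in\mathcal{H}}$, a family of $\sigma$-algebras $\{\mathcal{B}'_{\bar q,f}\}_{\bar q\in\bar Z, f\in\mathcal{H}_{d'-1}}$ and a set $\bar\Omega \subseteq \bar Z$. Set $\mathcal{B}_{\bar q,f} := \mathcal{B}_{q',f}$ for $\bar q = (q',p) \in \bar Z$, $f \in \mathcal{H}_{d'-1}$. The new system $(\mu_{\bar q,f}, \mathcal{B}_{\bar q,e}, \mathcal{B}'_{\bar q,f})$ satisfies (3.2.2)-(3.2.3) and (3.2.5) as long as the parameters $K'$, $M_{d'-1}$ are of magnitude $O_{K,M_{d'},F,\varepsilon}(1)$. There are two possibilities.

• Case 1: There exists a set $\bar\Omega_1 \subseteq \bar\Omega$ of measure $\bar\psi(\bar\Omega_1) \ge \bar\psi(\bar\Omega)/2$ such that (3.2.4) holds for all $\bar q \in \bar\Omega_1$. In this case the conclusions of the lemma hold for the system $(\mu_{\bar q,e}, \mathcal{B}_{\bar q,e}, \mathcal{B}'_{\bar q,f})$ and the set $\bar\Omega_1$.

• Case 2: There is a set $\bar\Omega_2 \subseteq \bar\Omega$ of measure $\bar\psi(\bar\Omega_2) \ge \bar\psi(\bar\Omega)/2$ so that inequality (3.2.4) fails for all $\bar q \in \bar\Omega_2$. Then, thanks to the stability condition (3.1.22) and the fact that $\mathcal{B}_{q',f} = \mathcal{B}_{\bar q,f} \subseteq \mathcal{B}'_{\bar q,f}$, we have for $\bar q \in \bar\Omega_2$, $q' = \pi'(\bar q)$ and $q = \pi(\bar q)$ that
\begin{align*}
&\sum_{e, G_{q,e}} \Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) \Big\|^2_{L^2(\mu_{\bar q,e})} - \sum_{e, G_{q,e}} \Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|^2_{L^2(\mu_{q',e})} \\
&\quad\ge \sum_{e, G_{q,e}} \Big( \Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) \Big\|^2_{L^2(\mu_{\bar q,e})} - \Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|^2_{L^2(\mu_{\bar q,e})} \Big) - o_{M_{d'},K',F}(1) \\
&\quad= \sum_{e, G_{q,e}} \Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|^2_{L^2(\mu_{\bar q,e})} - o_{M_{d'},K',F}(1) \\
&\quad\ge \varepsilon^2 - o_{M_{d'},K',F}(1), \tag{3.2.6}
\end{align*}
where the summation is taken over all $e \in \mathcal{H}_{d'}$ and $G_{q,e} \in \mathcal{B}_{q,e}$.
Thus, for sufficiently large $N, W$, we have for all $\bar q = (q',p) \in \bar\Omega_2$ that the total energy of the system $(\mu_{\bar q,f}, \mathcal{B}_{\bar q,e}, \mathcal{B}'_{\bar q,f})$ is at least $\varepsilon^2/2$ larger than that of the system $(\mu_{q',f}, \mathcal{B}_{q',e}, \mathcal{B}_{q',f})$. In this case, set $Z' := \bar Z$, $\Omega' := \bar\Omega_2$, $\mu_{q',f} := \mu_{\bar q,f}$ and $\mathcal{B}_{q',f} := \mathcal{B}'_{\bar q,f}$, and repeat the above argument. Starting with the original system $\mu_{q,f}$, $\mathcal{B}_{q,e}$ and the trivial $\sigma$-algebras $\mathcal{B}_{q,f} = \{\emptyset, V_f\}$, the iteration process must stop in at most $\varepsilon^{-2}\, 2^{2^{M_{d'}}+1}\, 2^{d+1} = O_{M_{d'},\varepsilon}(1)$ steps, generating a system $(\mu_{\bar q,f}, \mathcal{B}_{\bar q,e}, \mathcal{B}'_{\bar q,f})$ which satisfies the conclusions of the lemma. $\Box$

This lemma is more widely applicable than Lemma 3.2, as the uniformity of the hypergraphs $G_{q,e}$ with respect to the (fine) $\sigma$-algebras $\mathcal{B}'_{\bar q,f}$ can be chosen to be arbitrarily small with respect to the complexity of the (coarse) $\sigma$-algebras $\mathcal{B}_{\bar q,f}$, while the approximations $\mathbb{E}_{\mu_{\bar q,e}}(\mathbf{1}_{G_{q,e}} \,|\, \bigvee_{f\in\partial e} \mathcal{B}'_{\bar q,f})$ and $\mathbb{E}_{\mu_{\bar q,e}}(\mathbf{1}_{G_{q,e}} \,|\, \bigvee_{f\in\partial e} \mathcal{B}_{\bar q,f})$ stay very close in $L^2(\mu_{\bar q,e})$.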
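The termination step in the proofs of Lemmas 3.2 and 3.3 is an instance of the usual energy-increment bookkeeping. The following sketch records it in abstract form; the symbols $\mathcal{E}_k$, $\mathcal{E}_{\max}$ and $\kappa$ are notation introduced here for illustration only and do not appear in the original argument.

```latex
% Energy-increment bookkeeping (illustrative notation, not the paper's).
% \mathcal{E}_k      : total energy (3.1.19) of the system after k iterations
% \mathcal{E}_{\max} : a uniform upper bound for the total energy
% \kappa > 0         : a lower bound for the energy gained at each iteration
\begin{align*}
\mathcal{E}_0 \ge 0, \qquad
\mathcal{E}_{j+1} \ge \mathcal{E}_j + \kappa \quad (0 \le j < k), \qquad
\mathcal{E}_k \le \mathcal{E}_{\max}
\quad\Longrightarrow\quad
k\,\kappa \;\le\; \mathcal{E}_k - \mathcal{E}_0 \;\le\; \mathcal{E}_{\max},
\quad\text{hence}\quad
k \;\le\; \frac{\mathcal{E}_{\max}}{\kappa}.
\end{align*}
```

In Lemma 3.3 the upper bound depends only on $d$ and $M_{d'}$ and the gain only on $\varepsilon$, which is why the number of iterations is $O_{M_{d'},\varepsilon}(1)$.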
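In order to obtain a counting and a removal lemma starting from a given measure system $\{\mu_{q,e}\}$ and $\sigma$-algebras $\{\mathcal{B}_{q,e}\}$, we need to regularize the elements of the $\sigma$-algebras $\mathcal{B}_{q,e}$ for all $e \in \mathcal{H}$ with respect to the lower order $\sigma$-algebras $\bigvee_{f\in\partial e} \mathcal{B}_{q,f}$. This is done by applying Lemma 3.3 inductively, and provides the final form of the regularity lemma we need. Let us call a function $F : \mathbb{R}_+ \to \mathbb{R}_+$ a growth function if it is continuous, increasing, and satisfies $F(x) \ge x$ for $x \ge 0$.

Theorem 3.1 [Regularity lemma]. Let $1 \le d' \le d$ and let $M_{d'} > 0$ be a constant. Let $\{\mu_{q,f}\}_{q\in Z, f\in\mathcal{H}}$ be a well-defined, symmetric family of measures of complexity at most $K$, and let $\{\mathcal{B}_{q,e}\}_{q\in Z, e\in\mathcal{H}_{d'}}$ be a family of $\sigma$-algebras on $V_e$ so that for all $q \in Z$, $e \in \mathcal{H}_{d'}$
\[
\mathrm{compl}(\mathcal{B}_{q,e}) \le M_{d'}. \tag{3.2.7}
\]
Let $F : \mathbb{R}_+ \to \mathbb{R}_+$ be a growth function, and let $\Omega \subseteq Z$ be a set of measure $\psi(\Omega) \ge c_0 > 0$.
If $N, W$ are sufficiently large with respect to the parameters $c_0, M_{d'}, K$, and $F$, then there exists a well-defined, symmetric extension $\{\mu_{\bar q,f}\}_{\bar q\in\bar Z, f\in\mathcal{H}}$ of complexity at most $O_{K,M_{d'},F}(1)$, families of $\sigma$-algebras $\mathcal{B}_{\bar q,f} \subseteq \mathcal{B}'_{\bar q,f}$ defined for $\bar q \in \bar Z$ and $f \in \mathcal{H}_j$, $1 \le j < d'$, and a set $\bar\Omega \subseteq \bar Z$ such that the following hold.
(1) We have that $\bar\Omega \subseteq \Omega \times V \subseteq \bar Z = Z \times V$, where $V = \mathbb{Z}_N^k$ is of dimension $k = O_{M_{d'},F}(1)$. Moreover
\[
\bar\psi(\bar\Omega) \ge c(c_0, F, M_{d'}) > 0. \tag{3.2.8}
\]
(2) There exist numbers
\[
M_{d'} < F(M_{d'}) \le M_{d'-1} < F(M_{d'-1}) \le \cdots \le M_1 < F(M_1) \le M_0 = O_{M_{d'},F}(1) \tag{3.2.9}
\]
such that for all $1 \le j < d'$, $f \in \mathcal{H}_j$ and $\bar q \in \bar Z$,
\[
\mathrm{compl}(\mathcal{B}'_{\bar q,f}) \le M_j. \tag{3.2.10}
\]
(3) For all $1 \le j \le d'$, $e \in \mathcal{H}_j$, $\bar q = (q,p) \in \bar\Omega$ and $G_{\bar q,e} \in \mathcal{B}_{\bar q,e}$ (with $\mathcal{B}_{\bar q,e} := \mathcal{B}_{q,e}$ if $j = d'$), one has
\[
\Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{\bar q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{\bar q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{\bar q,f}\big) \Big\|_{L^2(\mu_{\bar q,e})} \le \frac{1}{F(M_j)} \tag{3.2.11}
\]
and
\[
\Big\| \mathbf{1}_{G_{\bar q,e}} - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{\bar q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{\bar q,f}\big) \Big\|_{\Box_{\nu_{\bar q,e}}} \le \frac{1}{F(M_0)}. \tag{3.2.12}
\]
Proof.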
We proceed by induction on $d'$. If $d' = 1$ the statement follows from Lemma 3.3 with $\varepsilon = 1/F(M_1)$, so assume that $d' \ge 2$ and that the theorem holds for $d' - 1$. Apply Lemma 3.3 with a growth function $F^* \ge F$ (to be specified later) and with $\varepsilon = 1/F^*(M_{d'})$. This gives a well-defined, symmetric extension $\{\mu_{q',f}\}$ and a family of $\sigma$-algebras $\mathcal{B}_{q',f} \subseteq \mathcal{B}'_{q',f}$ defined on a parameter space $Z' = Z \times V$, such that
\[
\Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|_{L^2(\mu_{q',e})} \le \frac{1}{F^*(M_{d'})} \tag{3.2.13}
\]
and
\[
\Big\| \mathbf{1}_{G_{q',e}} - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) \Big\|_{\Box_{\nu_{q',e}}} \le \frac{1}{F^*(M_{d'-1})} \tag{3.2.14}
\]
hold for all $q' = (q,p) \in \Omega'$, $e \in \mathcal{H}_{d'}$ and $G_{q',e} \in \mathcal{B}_{q',e} = \mathcal{B}_{q,e}$, where $\Omega' \subseteq \Omega \times V \subseteq Z'$ is a set of measure $\psi'(\Omega') \ge c(c_0, F, M_{d'}) > 0$.

Applying the induction hypothesis to the system $\{\mu_{q',f}\}_{q'\in Z', f\in\mathcal{H}}$, $\{\mathcal{B}_{q',f}\}_{q'\in Z', f\in\mathcal{H}_{d'-1}}$, the growth function $F$ and the set $\Omega'$, one obtains an extension $\{\mu_{\bar q,f}\}_{\bar q\in\bar Z, f\in\mathcal{H}}$ and families of $\sigma$-algebras $\{\mathcal{B}_{\bar q,f} \subseteq \mathcal{B}'_{\bar q,f}\}_{\bar q\in\bar Z, f\in\mathcal{H}_j}$ such that (3.2.10)-(3.2.12) hold for $j < d'-1$, with constants
\[
M_{d'-1} < F(M_{d'-1}) \le \cdots \le M_0 < F(M_0) = O_{M_{d'-1},F}(1). \tag{3.2.15}
\]
For $\bar q = (q',p) \in \bar Z$ and $f \in \mathcal{H}_{d'-1}$ set $\mathcal{B}_{\bar q,f} := \mathcal{B}_{q',f}$ and $\mathcal{B}'_{\bar q,f} := \mathcal{B}'_{q',f}$. We show that inequalities (3.2.11) and (3.2.12) hold for $j = d'$. Indeed, by the stability property (2.2.12), one has
\begin{align*}
&\Big\| \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|_{L^2(\mu_{\bar q,e})} \\
&\quad= \Big\| \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q',f}\big) \Big\|_{L^2(\mu_{q',e})} + o_{K,M_{d'},F,F^*}(1) \\
&\quad\le \frac{1}{F^*(M_{d'})} + o_{K,M_{d'},F,F^*}(1), \tag{3.2.16}
\end{align*}
for all $\bar q = (q',p) \in \bar\Omega \setminus \mathcal{E}_1$, $e \in \mathcal{H}_{d'}$ and $G_{q',e} \in \mathcal{B}_{q',e}$. Here $\mathcal{E}_1 \subseteq \bar\Omega$ is a set of measure $\bar\psi(\mathcal{E}_1) = o_{K,M_{d'},F,F^*}(1)$. Similarly, using the stability properties (2.2.10) and (2.2.17) of the box norms, we have
\begin{align*}
&\Big\| \mathbf{1}_{G_{q',e}} - \mathbb{E}_{\mu_{\bar q,e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) \Big\|_{\Box_{\nu_{\bar q,e}}} \\
&\quad= \Big\| \mathbf{1}_{G_{q',e}} - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) \Big\|_{\Box_{\nu_{\bar q,e}}} + o_{K,M_{d'},F,F^*}(1) \\
&\quad= \Big\| \mathbf{1}_{G_{q',e}} - \mathbb{E}_{\mu_{q',e}}\big(\mathbf{1}_{G_{q',e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q',f}\big) \Big\|_{\Box_{\nu_{q',e}}} + o_{K,M_{d'},F,F^*}(1) \\
&\quad\le \frac{1}{F^*(M_{d'-1})} + o_{K,M_{d'},F,F^*}(1), \tag{3.2.17}
\end{align*}
for all $\bar q = (q',p) \in \bar\Omega \setminus \mathcal{E}_2$, $e \in \mathcal{H}_{d'}$ and $G_{q',e} \in \mathcal{B}_{q',e} = \mathcal{B}_{q,e}$, where $\mathcal{E}_2 \subseteq \bar\Omega$ is a set of measure $\bar\psi(\mathcal{E}_2) = o_{K,M_{d'},F,F^*}(1)$.

With $F(M_0) = O_{M_{d'-1},F}(1)$, we have that $F(M_0) \le C(M_{d'-1}, F) =: F^*(M_{d'-1})$ for a sufficiently rapidly growing function $F^*$ depending only on $F$. Assuming $N, W$ are sufficiently large with respect to $M_{d'}$ and $K$, inequalities (3.2.11) and (3.2.12) for $j = d'$ and $\bar q \in \bar\Omega \setminus (\mathcal{E}_1 \cup \mathcal{E}_2)$ follow from (3.2.13) and (3.2.14), via (3.2.16) and (3.2.17).
The rest of the conclusions of the theorem are clear from the construction. $\Box$
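The last step of the proof implicitly absorbs the $o(1)$ errors of (3.2.16) and (3.2.17). One way to make this explicit is sketched below, under the assumption (not stated in the text, which only asks for $F^* \ge F$) that $F^*$ is chosen with a factor of two to spare; here $M_0$ is read as the last constant of the chain (3.2.9).

```latex
% Assumption made for illustration only:
%   F^*(M_{d'}) \ge 2 F(M_{d'})  and  F^*(M_{d'-1}) \ge 2 F(M_0).
% N, W are taken so large that each o(1) term is at most
%   1/(2F(M_{d'})), respectively 1/(2F(M_0)).
\begin{align*}
\frac{1}{F^*(M_{d'})} + o(1)
  &\le \frac{1}{2F(M_{d'})} + \frac{1}{2F(M_{d'})} = \frac{1}{F(M_{d'})},\\[2pt]
\frac{1}{F^*(M_{d'-1})} + o(1)
  &\le \frac{1}{2F(M_0)} + \frac{1}{2F(M_0)} = \frac{1}{F(M_0)}.
\end{align*}
```

The first line gives a bound of the shape required in (3.2.11) for $j = d'$, the second one of the shape required in (3.2.12).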
4. COUNTING AND THE REMOVAL LEMMAS
4.1. The Removal Lemma.
In this section we formulate a so-called counting lemma and show how it implies Theorem 1.4. Our arguments closely follow those in [19] and are straightforward adaptations of them to the weighted setting; for the sake of completeness we include the details.

For $e \in \mathcal{H}_d$ let $G_e \subseteq V_e$ be a hypergraph, and let $\mathcal{B}_e = \{G_e, G_e^{C}, \emptyset, V_e\}$ be the $\sigma$-algebra generated by it. Let $\{\nu_e\}_{e\in\mathcal{H}}$ and $\{\mu_e\}_{e\in\mathcal{H}}$ be the weights and measures associated to a well-defined, symmetric family of forms $\mathcal{L} = \{L_e^k;\ e \in \mathcal{H}_d,\ 1 \le k \le d\}$. Take $M_d > 0$ and a growth function $F : \mathbb{R}_+ \to \mathbb{R}_+$, both to be determined later, and apply Theorem 3.1 with $d' = d$ to obtain a well-defined, symmetric parametric extension $\{\mu_{q,e}\}_{q\in Z, e\in\mathcal{H}}$ together with $\sigma$-algebras $\mathcal{B}_{q,e} \subseteq \mathcal{B}'_{q,e}$ and a set $\Omega \subseteq Z$ such that (3.2.8)-(3.2.12) hold. The family $\{\nu_e\}$ can be considered as a parametric family of weights in a trivial way, setting $Z = \Omega = \{0\}$ and $\psi(0) = 1$. Note that the complexity of the system, as well as that of the $\sigma$-algebras, is $O_{M_d,F}(1)$. We consider the system of measures $\mu_{q,e}$ and the $\sigma$-algebras $\mathcal{B}_{q,e}$, $\mathcal{B}'_{q,e}$ fixed for the rest of this section.

It will be convenient to define all our $\sigma$-algebras on the same space $V_J$ and eventually replace the ensemble of measures $\{\mu_{q,e}\}_{e\in\mathcal{H}}$ with the measure $\mu_q := \mu_{q,J} = \prod_{f\in\mathcal{H}} \nu_{q,f}$. Thanks to the stability conditions (2.1.6)-(2.1.7) this can be done at essentially no cost. Indeed, for any $e \in \mathcal{H}$ there is an exceptional set $\mathcal{E}_e \subseteq \Omega$ of measure $\psi(\mathcal{E}_e) = o_{M_d,F}(1)$ such that for any family of sets $G_{q,e} \subseteq V_e$ we have
\[
\mu_q(\pi_e^{-1}(G_{q,e})) = \mu_{q,e}(G_{q,e}) + o_{M_d,F}(1), \tag{4.1.1}
\]
uniformly for $q \in \Omega \setminus \mathcal{E}_e$. Let $\mathcal{E} = \bigcup_{e\in\mathcal{H}} \mathcal{E}_e$ and $\Omega' := \Omega \setminus \mathcal{E}$; then (4.1.1) means that for any set $A_{q,e} \in \mathcal{A}_e$ one has $\mu_q(A_{q,e}) = \mu_{q,e}(\pi_e(A_{q,e})) + o_{M_d,F}(1)$ uniformly for $q \in \Omega'$. We will write $\mu_{q,e}(A_{q,e}) = \int_{V_e} \mathbf{1}_{A_{q,e}}(x_e)\, d\mu_{q,e}(x_e)$ for simplicity of notation.

Define the $\sigma$-algebras $\mathcal{B}_{q,e} := \pi_e^{-1}(\mathcal{B}_{q,e})$, $\mathcal{B}'_{q,e} := \pi_e^{-1}(\mathcal{B}'_{q,e})$ on $V_J$, and note that $\mathcal{B}_{q,e} = \mathcal{B}_e$ for $e \in \mathcal{H}_d$, as the initial $\sigma$-algebras $\mathcal{B}_e$ are not altered in Theorem 3.1. Let $\mathcal{B}_q := \bigvee_{e\in\mathcal{H}} \mathcal{B}_{q,e}$ be the $\sigma$-algebra generated by the algebras $\mathcal{B}_{q,e}$, and define similarly the $\sigma$-algebra $\mathcal{B}'_q$. The atoms of $\mathcal{B}_q$ are of the form $A_q = \bigcap_{e\in\mathcal{H}} A_{q,e}$, where $A_{q,e}$ is an atom of $\mathcal{B}_{q,e}$. In particular, if $E_e \in \mathcal{B}_e$ then $\bigcap_{e\in\mathcal{H}_d} E_e$ is a union of atoms of $\mathcal{B}_q$.

The so-called counting lemma [19], [5], [14] gives an approximate formula for the measure of "most" atoms $A_q$ and, as a consequence, shows that their measure is bounded below by a positive constant depending only on the initial data $F$ and $M_d$. If, as in Theorem 1.4, one assumes that the measure of $\bigcap_{e\in\mathcal{H}_d} E_e$ is sufficiently small, then it cannot contain most of the atoms; thus, after removing the exceptional atoms from the sets $E_e$, the intersection of the remaining sets becomes empty, leading to a proof of Theorem 1.4.

To make this heuristic precise, let us start by defining the relative density $\delta_{q,e}(A|B) := \mu_{q,e}(A \cap B)/\mu_q(B)$ for $A, B \in \mathcal{B}_{q,e}$, with the convention that $\delta_{q,e}(A|B) := 1$ if $\mu_q(B) = 0$.

Definition 4.1.
Let $A_q = \bigcap_{e\in\mathcal{H}} A_{q,e}$ be an atom of $\mathcal{B}_q$. We say that the atom $A_q$ is regular if the following hold, where $j = |e|$.
(1) For all atoms $A_{q,e}$
\[
\delta_{q,e}\Big(A_{q,e} \,\Big|\, \bigcap_{f\in\partial e} A_{q,f}\Big) \ge \frac{1}{F(M_j)}. \tag{4.1.2}
\]
(2) Moreover
\[
\int_{V_e} \Big| \mathbb{E}_{\mu_q}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q,f}\big) - \mathbb{E}_{\mu_q}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big|^2 \prod_{f\subsetneq e} \mathbf{1}_{A_{q,f}}\, d\mu_{q,e} \le \frac{1}{F(M_j)} \int_{V_J} \prod_{f\subsetneq e} \mathbf{1}_{A_{q,f}}\, d\mu_{q,e}. \tag{4.1.3}
\]
This roughly means that all atoms $A_{q,e}$ are both somewhat large and regular on the intersection of the lower order atoms $A_{q,f}$ ($f \in \partial e$). Note that if $|e| = 1$ then $\partial e = \emptyset$; by convention we define $\bigcap_{f\in\partial e} A_{q,f} = V_J$, and the left side of (4.1.2) becomes $\mu_{q,e}(A_{q,e})$.

Proposition 4.1 [Counting lemma]. There is a set $\mathcal{E} \subseteq \Omega$ of measure $\psi(\mathcal{E}) = o_{N,W\to\infty;\, M_d,F}(1)$ such that if $q \in \Omega \setminus \mathcal{E}$ and if $A_q = \bigcap_{e\in\mathcal{H}} A_{q,e} \in \bigvee_{e\in\mathcal{H}} \mathcal{B}_{q,e}$ is a regular atom, then
\[
\mu_q(A_q) = (1 + o_{M_d\to\infty}(1)) \prod_{e\in\mathcal{H}} \delta_{q,e}\Big(A_{q,e} \,\Big|\, \bigcap_{f\in\partial e} A_{q,f}\Big) + O_{M_0}\Big(\frac{1}{F(M_0)}\Big) + o_{N,W\to\infty;\, M_d,F}(1). \tag{4.1.4}
\]
Next, following [19], we show that the total measure of irregular atoms is small. For any atom $A_{q,e} \in \mathcal{B}_{q,e}$, let $B_{q,e,A_{q,e}}$ be the union of all sets of the form $\bigcap_{f\subsetneq e} A_{q,f}$ for which (4.1.2) or (4.1.3) fails. Note that if an atom $A_q = \bigcap_{e\in\mathcal{H}} A_{q,e}$ is irregular then $A_q \subseteq A_{q,e} \cap B_{q,e,A_{q,e}}$ for some $e \in \mathcal{H}$. We claim that
\[
\mu_q(A_{q,e} \cap B_{q,e,A_{q,e}}) \lesssim \frac{1}{F(M_j)} \tag{4.1.5}
\]
for $q \notin \mathcal{E}$, where $\mathcal{E} \subseteq \Omega$ is a set of measure $\psi(\mathcal{E}) = o_{M_d,F}(1)$. To see this, note that the measure $\mu_q$ can be replaced by the measure $\mu_{q,e}$, as they differ by a negligible quantity on sets which belong to $\mathcal{A}_e$. We first estimate the contribution to the left side of (4.1.5) of those sets $\bigcap_{f\subsetneq e} A_{q,f}$ for which (4.1.2) fails. This quantity is bounded by
\[
\sum_{\{A_{q,f}\}_{f\in\partial e},\ \text{(4.1.2) fails}} \mu_{q,e}\Big(A_{q,e} \cap \bigcap_{f\in\partial e} A_{q,f}\Big) \le \frac{1}{F(M_j)} \sum_{\{A_{q,f}\}_{f\in\partial e}} \mu_{q,e}\Big(\bigcap_{f\in\partial e} A_{q,f}\Big) \le \frac{1}{F(M_j)}\, \mu_{q,e}(V_e) \lesssim \frac{1}{F(M_j)},
\]
as the summation is taken over the disjoint atoms of the $\sigma$-algebra $\bigvee_{f\in\partial e} \mathcal{B}_{q,f}$.
Similarly, one estimates the total contribution of the disjoint atoms $\bigcap_{f\subsetneq e} A_{q,f}$ for which (4.1.3) fails as follows:
\[
\sum_{\{A_{q,f}\}_{f\subsetneq e},\ \text{(4.1.3) fails}} \mu_{q,e}\Big(\bigcap_{f\subsetneq e} A_{q,f}\Big) \le F(M_j) \int_{V_e} \Big| \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q,f}\big) - \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big) \Big|^2 d\mu_{q,e} \le F(M_j)\, \frac{1}{F(M_j)^2} = \frac{1}{F(M_j)}.
\]
Since the sets $A_{q,e} \cap B_{q,e,A_{q,e}}$ contain all irregular atoms, and for given $e \in \mathcal{H}_j$ the number of all atoms of the $\sigma$-algebra $\mathcal{B}_{q,e}$ is at most $2^{M_j}$, one estimates the total measure of all irregular atoms as
\[
\sum_{j=1}^{d} \sum_{e\in\mathcal{H}_j} \sum_{A_{q,e}\in\mathcal{B}_{q,e}} \mu_q(A_{q,e} \cap B_{q,e,A_{q,e}}) \le \sum_{j=1}^{d} \binom{d}{j} \frac{2^{M_j}}{F(M_j)} \le \frac{1}{\sqrt{\log F(M_d)}} \le 2^{-M_d} \tag{4.1.6}
\]
provided $F$ grows sufficiently rapidly. This shows, choosing $M_d$ sufficiently large, that most atoms are regular.

Another fact we need is that the measure of regular atoms is not too small. Indeed, by (4.1.2) and (4.1.4), we have that for $q \in \Omega$ and a regular atom $A_q = \bigcap_{e\in\mathcal{H}} A_{q,e}$,
\[
\mu_q(A_q) \ge \frac{1}{2} \prod_{j\le d} \prod_{e\in\mathcal{H}_j} \frac{1}{F(M_j)} - O_{d,M_0}\Big(\frac{1}{F(M_0)}\Big) + o_{M_d,F}(1) \ge \frac{1}{F(M_0)} > 0, \tag{4.1.7}
\]
as long as $F$ is sufficiently rapidly growing and $M_d$ is sufficiently large with respect to $d$. It is clear from (3.2.9) that $F(M_0) \le F^*(M_d)$ for a function $F^*$ depending only on $F$ and $M_d$.

After these preparations, assuming the validity of Proposition 4.1, it is easy to obtain Theorem 1.4.

Proof of Theorem 1.4.
Let $\delta > 0$, $E_e \in \mathcal{A}_e$ and $g_e : V_e \to [0,1]$ for $e \in \mathcal{H}_d$ be given. Let $\mathcal{E}_1 \subseteq \Omega$ be a set of measure $\psi(\mathcal{E}_1) = o_{M_d,F}(1)$ so that (4.1.1), (4.1.6) and (4.1.7) hold for $q \in \Omega \setminus \mathcal{E}_1$. Also, by (2.2.4), conditions (1.4.11)-(1.4.12) hold for
\[
\tilde\mu_J := \mu_{q,J} \quad\text{and}\quad \tilde\mu_e := \mu_{q,e} \quad (e \in \mathcal{H}_d), \tag{4.1.8}
\]
for $q \notin \mathcal{E}_2$, where $\mathcal{E}_2 \subseteq \Omega$ is a set of measure $\psi(\mathcal{E}_2) = o_{M_d,F}(1)$.
Now fix $q \notin \mathcal{E}_1 \cup \mathcal{E}_2$ and define $\tilde\mu_J$ and $\tilde\mu_e$ for $e \in \mathcal{H}_d$ as in (4.1.8). We claim that this system of measures satisfies the conclusions of the theorem. By construction the system is symmetric, so it remains to construct the sets $E'_e$ and show that (1.4.13)-(1.4.15) hold. For given $e \in \mathcal{H}_d$ define the sets
\[
E'_{q,e} = V_J \setminus \Big( B_{q,e,A_e} \cup \bigcup_{f\subsetneq e,\ A_{q,f}} \big(A_{q,f} \cap B_{q,f,A_{q,f}}\big) \Big), \tag{4.1.9}
\]
where $A_{q,f}$ ranges over the atoms of $\mathcal{B}_{q,f}$. As we have $\mathcal{B}_{q,e} = \mathcal{B}_e$, which is generated by a single set $E_e$, if $\bigcap_{e\in\mathcal{H}_d} E_e$ contains an atom $A_q = \bigcap_{f\in\mathcal{H}} A_{q,f}$ then $A_{q,e} = E_e$ for $e \in \mathcal{H}_d$. If such an atom were regular, then by (4.1.7) and (1.4.10) its measure would satisfy
\[
\frac{1}{F^*(M_d)} \le \tilde\mu_J\Big(\bigcap_{e\in\mathcal{H}_d} E_e\Big) = \mu_J\Big(\bigcap_{e\in\mathcal{H}_d} E_e\Big) + o_{M_d,F}(1) < 2\delta.
\]
Choosing $M_d$ to be the largest positive integer so that $F^*(M_d) \le (2\delta)^{-1}$, we see that $\bigcap_{e\in\mathcal{H}_d} E_e$ contains only irregular atoms. From (4.1.9) and (4.1.6) we have
\[
\tilde\mu_J(E_e \setminus E'_{q,e}) = \tilde\mu_J\Big( \bigcup_{f\subseteq e,\ A_{q,f}} \big(A_{q,f} \cap B_{q,f,A_{q,f}}\big) \Big) \le 2^{-M_d}. \tag{4.1.10}
\]
Also, all irregular atoms $A_q = \bigcap_{f\in\mathcal{H}} A_{q,f} \subseteq \bigcap_{e\in\mathcal{H}_d} E_e$ are contained in one of the sets $E_e \setminus E'_{q,e}$, thus
\[
\bigcap_{e\in\mathcal{H}_d} (E_e \cap E'_{q,e}) = \emptyset.
\]
Finally, choosing $\varepsilon := 2^{-M_d}$, (1.4.14) holds by (4.1.10). Moreover $\delta \to 0$ implies $M_d \to \infty$ and hence $\varepsilon \to 0$, showing the validity of (1.4.15). This proves Theorem 1.4. $\Box$

4.2. Proof of Proposition 4.1.
The proof proceeds by induction and uses the Cauchy-Schwarz inequality, which forces us to double certain sets of variables. As a consequence, we need a generalization of Proposition 4.1, formulated in terms of the weighted hypergraph bundles of Definition 4.2.
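Before turning to the definition, the variable-doubling effect of the Cauchy-Schwarz inequality can be seen in the simplest unweighted model case. The functions $f$, $g$ and the finite sets $X$, $Y$ below are placeholders chosen for illustration only; they are not objects of the paper.

```latex
% Model case (illustrative): f : X -> R, g : X x Y -> R, with X, Y finite sets;
% \mathbb{E} denotes the average over the indicated variables.
\begin{align*}
\Bigl|\mathbb{E}_{x\in X}\,\mathbb{E}_{y\in Y}\, f(x)\,g(x,y)\Bigr|^2
 &= \Bigl|\mathbb{E}_{x\in X}\, f(x)\Bigl(\mathbb{E}_{y\in Y}\, g(x,y)\Bigr)\Bigr|^2 \\
 &\le \Bigl(\mathbb{E}_{x\in X}\, f(x)^2\Bigr)\,
      \mathbb{E}_{x\in X}\Bigl(\mathbb{E}_{y\in Y}\, g(x,y)\Bigr)^2
      && \text{(Cauchy-Schwarz in } x\text{)} \\
 &= \Bigl(\mathbb{E}_{x\in X}\, f(x)^2\Bigr)\,
      \mathbb{E}_{x\in X}\,\mathbb{E}_{y,y'\in Y}\, g(x,y)\,g(x,y')
      && \text{(expanding the square).}
\end{align*}
```

The variable $y$ has been replaced by the pair $(y, y')$, so every edge containing $y$ now appears in two copies; bundles over $\mathcal{H}$ are a convenient way to keep track of such copies.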
Definition 4.2 (Weighted hypergraph bundles over $\mathcal{H}$). Let $K$ be a finite set together with a map $\pi : K \to J$, called the projection map of the bundle. Let $\mathcal{G}_K$ be the set of edges $g \subseteq K$ such that $\pi$ is injective on $g$ and $\pi(g) \in \mathcal{H}$.
For any $g \in \mathcal{G}_K$, write $V_g := V_{\pi(g)} = \prod_{k\in g} V_{\pi(k)}$, and define the weights and measures $\nu_{q,g}, \mu_{q,g} : V_g \to \mathbb{R}_+$ as
\[
\nu_{q,g}(x_g) := \nu_{q,\pi(g)}(x_g), \qquad \mu_{q,g}(x_g) = \prod_{g'\subseteq g} \nu_{q,g'}(x_{g'}).
\]
The total measure $\mu_{q,K}$ on $V_K$ is given by
\[
\mu_{q,K}(x) = \prod_{g\in\mathcal{G}} \nu_{q,g}(x_g).
\]
A hypergraph $\mathcal{G} \subseteq \mathcal{G}_K$ which is closed, in the sense that $\partial g \subseteq \mathcal{G}$ for every $g \in \mathcal{G}$, together with the spaces $V_g$ and the weight functions $\nu_{q,g}$ for $g \in \mathcal{G}$, is called a weighted hypergraph bundle over $\mathcal{H}$. The quantity $d' = \sup_{g\in\mathcal{G}} |g|$ is called the order of $\mathcal{G}$.

Note that the underlying linear forms defining the weight system $\{\nu_{q,g}\}_{q\in Z, g\in\mathcal{G}_K}$,
\[
\bar L_g(q, x_g) = L_{\pi(g)}(q, x_g), \qquad \mathrm{supp}_x(L_{\pi(g)}) = \pi(g),
\]
are pairwise linearly independent. Indeed, if $g \ne g'$ they depend on different sets of variables, and for a fixed set of variables they are the same as the forms $L(q, x_g)$. What happens is that we sample a number of variables from each space $V_j$ and evaluate the forms $L(q, x)$ in the new variables. For example, if we have $x_1, x'_1 \in V_1$ and $x_2, x'_2 \in V_2$, then to the edge $(1,2) \in \mathcal{H}$ there correspond the edges $(1,2)$, $(1,2')$, $(1',2)$ and $(1',2')$ in $\mathcal{G}$, and to every linear form $L(q, x_1, x_2)$ there also correspond the forms $L(q, x_1, x'_2)$, $L(q, x'_1, x_2)$ and $L(q, x'_1, x'_2)$, defining the weights on the appropriate edges.

Proposition 4.2 [Generalized Counting Lemma]. Let $\mathcal{G} \subseteq \mathcal{G}_K$ be a closed hypergraph bundle over $\mathcal{H}$ with the projection map $\pi : K \to J$, and let $d' := \sup_{g\in\mathcal{G}} |g|$ be the order of $\mathcal{G}$. Then, for $F$ growing sufficiently rapidly with respect to $d$ and $K$, there exists a set $\mathcal{E} \subseteq \Omega$ of measure $\psi(\mathcal{E}) = o_{N\to\infty;\, M_d,K,F}(1)$ such that for $q \in \Omega \setminus \mathcal{E}$ we have
\begin{align*}
&\int_{V_K} \prod_{g\in\mathcal{G}} \mathbf{1}_{A_{q,\pi(g)}}(x_g)\, d\mu_{q,K}(x) \tag{4.2.1} \\
&\quad= (1 + o_{M_d\to\infty,K}(1)) \prod_{g\in\mathcal{G}} \delta_{q,\pi(g)}\Big(A_{q,\pi(g)} \,\Big|\, \bigcap_{f\in\partial\pi(g)} A_{q,f}\Big) + O_{K,M_0}\Big(\frac{1}{F(M_0)}\Big) + o_{N\to\infty,K,M_d}(1).
\end{align*}
Note that Proposition 4.1 is the special case when $\mathcal{G} = \mathcal{H}$ and $\pi$ is the identity map.

Proof.
We use a double induction. First we induct on $d'$, the order of $\mathcal{G}$ (note that $d' \le d$), and then, fixing $K$ and $\pi$, we induct on the number of edges $r := |\{g \in \mathcal{G} : |g| = d'\}|$.
To start, assume that $d' = r = 1$, so that $\mathcal{G} = \{k\}$ and $j = \pi(k) \in J$. The left hand side of (4.2.1) becomes
\[
\int_{V_k} \mathbf{1}_{A_{q,j}}(x_k)\, d\mu_{q,k}(x_k) = \int_{V_j} \mathbf{1}_{A_{q,j}}(x_j)\, d\mu_{q,j}(x_j) = \delta_{q,j}(A_{q,j}).
\]
Let $\{A_{q,e}\}_{e\in\mathcal{H}}$ be a regular collection of atoms for $q \in \Omega$, and define the functions $b_{q,e}, c_{q,e} : V_e \to \mathbb{R}$ for $e \in \mathcal{H}$ by
\[
b_{q,e} := \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q,f}\big) - \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big), \tag{4.2.2}
\]
\[
c_{q,e} := \mathbf{1}_{A_{q,e}} - \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}'_{q,f}\big), \tag{4.2.3}
\]
and introduce the shorthand notation
\[
\delta_{q,e} = \delta_{q,e}\Big(A_{q,e} \,\Big|\, \bigcap_{f\in\partial e} A_{q,f}\Big).
\]
Note that if $x \in \bigcap_{f\in\partial e} A_{q,f}$ then
\[
\delta_{q,e} = \mathbb{E}_{\mu_{q,e}}\big(\mathbf{1}_{A_{q,e}} \,\big|\, \bigvee_{f\in\partial e}\mathcal{B}_{q,f}\big)(x_e), \tag{4.2.4}
\]
and thus one has the decomposition
\[
\mathbf{1}_{A_{q,e}}(x_e) = \delta_{q,e} + b_{q,e}(x_e) + c_{q,e}(x_e) \tag{4.2.5}
\]
on the set $\bigcap_{f\in\partial e} A_{q,f}$. Let $g_0 \in \mathcal{G}$ be such that $|g_0| = d'$ and use (4.2.5) to write
\[
\prod_{g\in\mathcal{G}} \mathbf{1}_{A_{q,\pi(g)}}(x_g) = \big(\delta_{q,\pi(g_0)} + b_{q,\pi(g_0)}(x_{g_0}) + c_{q,\pi(g_0)}(x_{g_0})\big) \prod_{g\in\mathcal{G}\setminus\{g_0\}} \mathbf{1}_{A_{q,\pi(g)}}(x_g).
\]
Consider the contribution of the terms separately:
\begin{align*}
\int_{V_K} \prod_{g\in\mathcal{G}} \mathbf{1}_{A_{q,\pi(g)}}(x_g)\, d\mu_{q,K}(x)
&= \int_{V_K} \big(\delta_{q,\pi(g_0)} + b_{q,\pi(g_0)}(x_{g_0}) + c_{q,\pi(g_0)}(x_{g_0})\big) \prod_{g\in\mathcal{G}\setminus\{g_0\}} \mathbf{1}_{A_{q,\pi(g)}}(x_g)\, d\mu_{q,K}(x) \\
&= M_q + E^1_q + E^2_q. \tag{4.2.6}
\end{align*}
For the main term $M_q$, by the second induction hypothesis we have
\begin{align*}
M_q &= \delta_{q,\pi(g_0)} \int_{V_K} \prod_{g\in\mathcal{G}\setminus\{g_0\}} \mathbf{1}_{A_{q,\pi(g)}}(x_g)\, d\mu_{q,K}(x) \\
&= \delta_{q,\pi(g_0)}\, (1 + o_{M_d\to\infty}(1)) \prod_{g\in\mathcal{G}\setminus\{g_0\}} \delta_{q,\pi(g)} + O_{K,M_0}\Big(\frac{1}{F(M_0)}\Big) + o_{N,W\to\infty;\, K,M_d}(1),
\end{align*}
and hence $M_q$ agrees with the right side of (4.2.1). We continue by estimating the second error term:
\begin{align*}
E^2_q &= \int_{V_K} c_{q,\pi(g_0)}(x_{g_0}) \prod_{g\in\mathcal{G}\setminus\{g_0\}} \mathbf{1}_{A_{q,\pi(g)}}(x_g)\, d\mu_q(x)
= \mathbb{E}_{x\in V_K}\, (c_{q,\pi(g_0)}\nu_{q,g_0})(x_{g_0}) \prod_{g\in\mathcal{G}\setminus\{g_0\}} (\mathbf{1}_{A_{q,\pi(g)}}\nu_{q,g})(x_g) \\
&= \mathbb{E}_{x\in V_K} \prod_{|g|=d',\ g\in\mathcal{G}} f_{q,g}(x_g) \prod_{g'\in\mathcal{G},\ |g'|\,[\ldots]} [\ldots]
\end{align*}

5. PROOF OF THE MAIN RESULTS

In this section we finish the proof of our main result, Theorem 1.2. Since we have already shown the validity of Theorem 1.4, and hence that of Theorem 1.3 by the argument in the introduction, it remains to show that counting affine copies of $\Delta$ in a set $A \subseteq \mathbb{Z}_N^d$ with weights $w$ translates to counting copies in a set $A \subseteq \mathbb{P}^d$ of relative density $\alpha > 0$. This is standard; we include the details for the sake of completeness, using the arguments given in [2].

First, let us identify $[1,N]^d$ with $\mathbb{Z}_N^d$ and recall that constellations in $\mathbb{Z}_N^d$ defined by the simplex $\Delta$ which are contained in a box $B \subseteq [1,N]^d$ of size $\varepsilon N$ are in fact genuine constellations contained in $B$. Note that we can assume that the simplex $\Delta$ is primitive, in the sense that $t\Delta \nsubseteq \mathbb{Z}^d$ for any $0 < t < 1$, as any simplex is a dilate of a primitive one. To any simplex $\Delta \subseteq \mathbb{Z}^d$ there exists a constant $\tau(\Delta) > 0$ depending only on $\Delta$ such that the following holds.

Lemma 5.1. [2] Let $\Delta \subseteq \mathbb{Z}^d$ be a primitive simplex. Then there is a constant $0 < \varepsilon < \tau(\Delta)$ so that the following holds.
Let $N$ be sufficiently large, and let $B = I^d$ be a box of size $\varepsilon N$ contained in $[1,N]^d \simeq \mathbb{Z}_N^d$. If there exist $x \in \mathbb{Z}_N^d$ and $1 \le t < N$ such that $x \in B$ and $x + t\Delta \subseteq B$ as a subset of $\mathbb{Z}_N^d$, then either $x + t\Delta \subseteq B$ or $x + (t - N)\Delta \subseteq B$, also as a subset of $\mathbb{Z}^d$.

Proof [Theorem 1.3 implies Theorem 1.2]. Let $N, W$ be sufficiently large positive integers and assume that $|A| \ge \alpha |\mathbb{P}_N|^d$ for a set $A \subseteq \mathbb{P}_N^d$. By the pigeonhole principle choose $b = (b_j)_{1\le j\le d}$ so that $b_j$ is relatively prime to $W$ for each $j$, and
\[
|A \cap ((W\mathbb{Z})^d + b)| \ge \alpha\, \frac{N^d}{(\log N)^d\, \phi(W)^d}, \tag{5.1}
\]
where $\phi$ is the Euler totient function. Set $N_1 := N/W$ and $A_1 := \{n \in [1,N_1]^d;\ Wn + b \in A\}$.
Choose $\varepsilon_1 > 0$ so that $\varepsilon_1 < \tau(\Delta)$. By the Prime Number Theorem there is a prime $N'$ so that $\varepsilon_1 N' = N_1 (1 + o_{N\to\infty}(1))$, thus we have
\[
|A_1 \cap [1, \varepsilon_1 N']^d| \ge \alpha\, \frac{(\varepsilon_1 N')^d\, W^d}{(\log N')^d\, \phi(W)^d}. \tag{5.2}
\]
By Dirichlet's theorem on primes in arithmetic progressions the number of $n \in [1,N']^d \setminus [\varepsilon_2 N', N']^d$ for which $Wn + b \in \mathbb{P}^d$ is $O\big(\varepsilon_2\, \frac{N'^d\, W^d}{(\log N')^d\, \phi(W)^d}\big)$, thus (5.2) holds, up to a constant factor, for the set $A' := A_1 \cap [\varepsilon_2 N', \varepsilon_1 N']^d$ as well, if $\varepsilon_2 \le c_d\, \varepsilon_1^d\, \alpha$ for a small enough constant $c_d > 0$.
If $x \in A'$ then $\varepsilon_2 N' \le x_i \le \varepsilon_1 N'$ and $W x_i + b_i \in \mathbb{P}$ for $1 \le i \le d$, thus by the definition of the Green-Tao measure $\nu_b : [1,N'] \to \mathbb{R}_+$ given in Section 1.3, we have
\[
w(x) = \prod_{i=1}^{d} \nu_{b_i}(x_i) \ge c_d \Big(\frac{\phi(W) \log N}{W}\Big)^d, \tag{5.3}
\]
as $\log N'$ is comparable to $\log N$, assuming $N$ is sufficiently large with respect to $W$. Thus
\[
\mathbb{E}_{x\in\mathbb{Z}_{N'}^d}\, \mathbf{1}_{A'}(x)\, w(x) \ge c'_d\, \varepsilon_1^d\, \alpha \tag{5.4}
\]
for some constant $c'_d > 0$. Applying the contrapositive of Theorem 1.3 to the set $A'$ with $\varepsilon := c'_d\, \varepsilon_1^d\, \alpha$ gives
\[
\mathbb{E}_{x\in\mathbb{Z}_{N'}^d,\ t\in\mathbb{Z}_{N'}} \Big( \prod_{j=0}^{d} \mathbf{1}_{A'}(x + t v_j) \Big)\, w(x + t\Delta) \ge \delta \tag{5.5}
\]
with a constant $\delta = \delta(\alpha, \Delta) > 0$ depending only on $\alpha$ and the simplex $\Delta = \{v_0, \ldots, v_d\}$. Similarly as in (5.3),
\[
w(x + t\Delta) \le C_d \Big(\frac{\phi(W) \log N}{W}\Big)^{l(\Delta)}, \tag{5.6}
\]
since all the primes arising from the coordinates of $x + t\Delta$ are bigger than $R$. Thus the number of copies $\Delta' = x + t\Delta$ which are contained in $A'$ as a subset of $\mathbb{Z}_{N'}^d$ is at least $c\, N^{d+1} (\log N)^{-l(\Delta)}$, for some constant $c = c(\alpha, \Delta, W) > 0$ depending only on the initial data $\alpha$, $\Delta$ and the number $W$. Since $A' \subseteq [\varepsilon_2 N', \varepsilon_1 N']^d$, by Lemma 5.1 at least half of the simplices $\Delta'$ are contained in $A'$ as a subset of $\mathbb{Z}^d$, and then the simplices $\Delta'' := W\Delta' + b$ are contained in $A$.
Now choose $W = W(\alpha, \Delta)$ large enough so that Theorem 1.3 holds for all sufficiently large $N'$; then $A$ contains at least $c'(\alpha, \Delta)\, N^{d+1} (\log N)^{-l(\Delta)}$ similar copies of $\Delta$ for some constant $c'(\alpha, \Delta) > 0$ depending only on $\alpha$ and the simplex $\Delta$. This proves Theorem 1.2. $\Box$

APPENDIX A.
BASIC PROPERTIES OF WEIGHTED BOX NORMS

In this appendix we describe some basic facts about the weighted version of Gowers's box norms defined in (1.5.2) for functions $F : V_e \to \mathbb{R}$. These norms have also been defined in [8], Appendix B, and in fact all the properties we prove here, including Proposition 1.1, can be deduced from the arguments given there. However, as our setting is slightly different, we include the proofs below.

We will assume $e = \{1, \ldots, d\} =: [d]$ and $V = V_{[d]} = \mathbb{Z}_N^d$ without loss of generality. To show that these are indeed norms (for $d \ge 2$), let us define a multilinear form referred to as the weighted Gowers inner product. Let $F_\omega : V_e \to \mathbb{R}$ for $\omega \in \{0,1\}^e$ be a given family of functions and define
\[
\big\langle F_\omega,\ \omega \in \{0,1\}^d \big\rangle_{\Box_\nu} := \mathbb{E}_{x_{[d]}, y_{[d]}\in V} \prod_{\omega\in\{0,1\}^d} F_\omega(\omega(x_{[d]}, y_{[d]})) \prod_{|I|\,[\ldots]} [\ldots]
\]
[...] We will use the Cauchy-Schwarz inequality several times and the linear forms condition.
\[
\big\langle F_\omega;\ \omega \in \{0,1\}^d \big\rangle_{\Box^d_\nu} = \mathbb{E}_{x_{[2,d]}, y_{[2,d]}} \Big[ \Big( \prod_{|I|\,[\ldots]} [\ldots]
\]
[...]
\[
[\ldots] \le \sum_{\omega} \|h_{\omega_1}\|_{\Box^d_\nu} \cdots \|h_{\omega_{2^d}}\|_{\Box^d_\nu} = \big( \|F\|_{\Box^d_\nu} + \|G\|_{\Box^d_\nu} \big)^{2^d}.
\]
Also, it follows directly from the definition that $\|\lambda F\|^{2^d}_{\Box^d_\nu} = \lambda^{2^d} \|F\|^{2^d}_{\Box^d_\nu}$, hence $\|\lambda F\|_{\Box^d_\nu} = |\lambda|\, \|F\|_{\Box^d_\nu}$. $\Box$

Proof of Proposition 1.1. Let $\mathcal{H}' = \{f \in \mathcal{H};\ |f| < d\}$, and write the left side of (1.5.3) as
\[
E = \mathbb{E}_{x\in V_J} \prod_{e\in\mathcal{H}_d} F_e(x_e) \prod_{f\in\mathcal{H}'} \nu_f(x_f).
\]
Fix $e_0 = [d]$ and write $e_j := [d+1] \setminus \{j\}$ for the rest of the faces. The idea is to apply the Cauchy-Schwarz inequality successively in the $x_1, x_2, \ldots, x_d$ variables to eliminate the functions $F_{e_1} \le \nu_{e_1}, \ldots, F_{e_d} \le \nu_{e_d}$, using the linear forms condition at each step. Using $F_{e_1} \le \nu_{e_1}$ we have
\[
|E| \le \mathbb{E}_{x_2,\ldots,x_{d+1}}\, \nu_{e_1}(x_{e_1}) \prod_{1\notin f\in\mathcal{H}'} \nu_f(x_f)\ \Big| \mathbb{E}_{x_1} \prod_{j=2}^{d+1} F_{e_j}(x_{e_j}) \prod_{1\in f\in\mathcal{H}'} \nu_f(x_f) \Big|.
\]
By the linear forms condition $\mathbb{E}_{x_2,\ldots,x_{d+1}}\, \nu_{e_1}(x_{e_1}) \prod_{1\notin f\in\mathcal{H}'} \nu_f(x_f) = 1 + o_{N\to\infty}(1)$, thus by the Cauchy-Schwarz inequality
\begin{align*}
E^2 \lesssim\ & \mathbb{E}_{x_2,\ldots,x_{d+1}}\, \nu_{e_1}(x_{e_1}) \prod_{1\notin f\in\mathcal{H}'} \nu_f(x_f)\ \mathbb{E}_{x_1,y_1} \prod_{j=2}^{d+1} F_{e_j}(x_1, x_{e_j\setminus\{1\}})\, F_{e_j}(y_1, x_{e_j\setminus\{1\}}) \tag{A.1} \\
& \times \prod_{1\in f\in\mathcal{H}'} \nu_f(y_1, x_{f\setminus\{1\}})\, \nu_f(x_1, x_{f\setminus\{1\}}).
\end{align*}
Note that what happened is that we have replaced the function $F_{e_1}$ by the measure $\nu_{e_1}$, doubled the variable $x_1$ to the pair of variables $(x_1, y_1)$, and also doubled each factor of the form $G_e(x_e)$ (which is either $F_e(x_e)$ or $\nu_e(x_e)$, for $e \in \mathcal{H}$) depending on the $x_1$ variable. To keep track of these changes as we continue with the rest of the variables, let us introduce some notation. Let $g \subseteq [d]$ and for a function $G_e(x_e)$ define
\[
G^*_e(x_{e\cap g}, y_{e\cap g}, x_{e\setminus g}) := \prod_{\omega_e\in\{0,1\}^{e\cap g}} G_e(\omega_e(x_{e\cap g}, y_{e\cap g}), x_{e\setminus g}). \tag{A.2}
\]
We claim that after applying the Cauchy-Schwarz inequality in the $x_1, \ldots, x_i$ variables we have, with $g = [i]$,
\begin{align*}
E^{2^i} \lesssim\ & \mathbb{E}_{x_{[i]}, y_{[i]}, x_{J\setminus[i]}} \prod_{j\le i} \nu^*_{e_j}(x_{[i]\cap e_j}, y_{[i]\cap e_j}, x_{e_j\setminus[i]}) \prod_{j>i} F^*_{e_j}(x_{[i]\cap e_j}, y_{[i]\cap e_j}, x_{e_j\setminus[i]}) \tag{A.3} \\
& \times \prod_{f\in\mathcal{H}'} \nu^*_f(x_{f\cap[i]}, y_{f\cap[i]}, x_{f\setminus[i]}). \tag{A.4}
\end{align*}
For $i = 1$ this can be seen from (A.1). Note that the linear forms appearing in any of these factors are pairwise linearly independent, as our system is well-defined. Assuming the claim holds for $i$, we separate the factors independent of the $x_{i+1}$ variable, replace the function $F_{e_{i+1}}$ with $\nu_{e_{i+1}}$, and apply the Cauchy-Schwarz inequality. This doubles the variable $x_{i+1}$ to the pair $(x_{i+1}, y_{i+1})$ and each factor $G^*_e(x_{e\cap[i]}, y_{e\cap[i]}, x_{e\setminus[i]})$ depending on it, producing the factor $G^*_e(x_{e\cap[i+1]}, y_{e\cap[i+1]}, x_{e\setminus[i+1]})$; thus the formula holds for $i + 1$. After finishing this process we have by (A.2) and (A.3)
\[
E^{2^d} \lesssim \mathbb{E}_{x_{[d]}, y_{[d]}} \prod_{\omega\in\{0,1\}^d} F_{e_0}(\omega(x_{[d]}, y_{[d]})) \prod_{f\subseteq[d],\ f\ne e_0}\ \prod_{\omega_f\in\{0,1\}^f} \nu_f(\omega_f(x_f, y_f))\ \mathcal{W}(x_{[d]}, y_{[d]}),
\]
where
\[
\mathcal{W}(x_{[d]}, y_{[d]}) = \mathbb{E}_{x_{d+1}} \prod_{d+1\in e\in\mathcal{H}}\ \prod_{\omega_e\in\{0,1\}^{e\cap[d]}} \nu_e(\omega_e(x_{e\cap[d]}, y_{e\cap[d]}), x_{e\setminus[d]}).
\]
Thus, as $F_{e_0} \le \nu_{e_0}$, to prove (1.5.3) it is enough to show that
\[
\mathbb{E}_{x_{[d]}, y_{[d]}} \prod_{f\subseteq[d]}\ \prod_{\omega_f\in\{0,1\}^f} \nu_f(\omega_f(x_f, y_f))\ |\mathcal{W}(x_{[d]}, y_{[d]}) - 1| = o_{N\to\infty}(1).
\]
This, similarly as in [7], can be done with one more application of the Cauchy-Schwarz inequality, leading to 4 terms involving the "big" weight functions $\mathcal{W}$ and $\mathcal{W}^2$. Each term is, however, $1 + o_{N\to\infty}(1)$ by the linear forms condition, as the underlying linear forms are pairwise linearly independent, so that their combination is $o_{N\to\infty}(1)$. Indeed the forms $L_f(\omega_f(x_f, y_f))$ are pairwise independent for $f \subseteq [d]$, and depend on a different set of variables than the forms $L_e(\omega_e(x_{e\cap[d]}, y_{e\cap[d]}), x_{e\setminus[d]})$ for $e \nsubseteq [d]$ defining the weight function $\mathcal{W}$. The new forms appearing in $\mathcal{W}^2$ are copies of the forms in $\mathcal{W}$ with the $x_{d+1}$ variable replaced by a new variable $y_{d+1}$, hence are independent of each other and of the rest of the forms. This proves the proposition. $\Box$

Acknowledgements. We would like to thank Terence Tao for some helpful correspondence during this research.

REFERENCES

[1] D. Conlon, J. Fox, Y. Zhao, A relative Szemerédi theorem, Geometric and Functional Analysis, to appear.
[2] B. Cook, A. Magyar, Constellations in P^d, International Mathematics Research Notices 2012.12 (2012), 2794-2816.
[3] H. Furstenberg, Y. Katznelson, An ergodic Szemerédi theorem for commuting transformations, J. Analyse Math. 31 (1978), 275-291.
[4] J. Fox, Y. Zhao, A short proof of the multidimensional Szemerédi theorem in the primes, American Journal of Mathematics, to appear.
[5] W.T. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Annals of Math. 166/3 (2007), 897-946.
[6] W.T. Gowers, Decompositions, approximate structure, transference, and the Hahn-Banach theorem, Bull. London Math. Soc. 42 (4) (2010), 573-606.
[7] B. Green, T. Tao, The primes contain arbitrarily long arithmetic progressions, Annals of Math. 167 (2008), 481-547.
[8] B. Green, T. Tao, Linear equations in primes, Annals of Math. (2) 171.3 (2010), 1753-1850.
[9] B. Green, T. Tao, The Möbius function is asymptotically orthogonal to nilsequences, Annals of Math. 175 (2012), 541-566.
[10] B. Green, T. Tao, T. Ziegler, An inverse theorem for the Gowers U^{s+1}[N] norm, Annals of Math. 176 (2012), no. 2, 1231-1372.
[11] D. Goldston, C. Yildirim, Higher correlations of divisor sums related to primes I: triple correlations, Integers: Electronic Journal of Combinatorial Number Theory 3 (2003), 1-66.
[12] D. Goldston, C. Yildirim, Higher correlations of divisor sums related to primes III: small gaps between primes, Proc. London Math. Soc. 95 (2007), 653-686.
[13] A. Magyar, T. Titichetrakun, Corners in dense subsets of P^d, preprint.
[14] B. Nagle, V. Rödl, M. Schacht, The counting lemma for regular k-uniform hypergraphs, Random Structures and Algorithms 28(2) (2006), 113-179.
[15] O. Reingold, L. Trevisan, M. Tulsiani, S. Vadhan, Dense subsets of pseudorandom sets, Electronic Colloquium on Computational Complexity, Report TR08-045 (2008).
[16] J. Solymosi, Note on a generalization of Roth's theorem, Discrete and Computational Geometry, Algorithms Combin. 25 (2003), 825-827.
[17] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27 (1975), 299-345.
[18] T. Tao, The Gaussian primes contain arbitrarily shaped constellations, J. Analyse Math. 99/1 (2006), 109-176.
[19] T. Tao, A variant of the hypergraph removal lemma, Journal of Combinatorial Theory, Series A 113.7 (2006), 1257-1280.
[20] T. Tao and T. Z