arXiv [math.NT], Jun

CORNERS IN DENSE SUBSETS OF $\mathbb{P}^d$

ÁKOS MAGYAR AND TATCHAI TITICHETRAKUN

ABSTRACT. Let $\mathbb{P}^d$ be the $d$-fold direct product of the set of primes. We prove that if $A$ is a subset of $\mathbb{P}^d$ of positive relative upper density then $A$ contains infinitely many "corners", that is, sets of the form $\{x, x+te_1, \dots, x+te_d\}$ where $x \in \mathbb{Z}^d$, $t \in \mathbb{Z}$ and $\{e_1, \dots, e_d\}$ are the standard basis vectors of $\mathbb{Z}^d$. The main tools are the hypergraph removal lemma, the linear forms conditions of Green-Tao and the transference principles of Gowers and Reingold et al.
1. INTRODUCTION
A remarkable result in additive number theory due to Green and Tao [7] proves the existence of arbitrarily long arithmetic progressions in the primes. It roughly states that if $A$ is a subset of the primes of positive relative upper density then $A$ contains arbitrary constellations, that is, non-trivial affine copies of any finite set of integers. It is closely related to Szemerédi's theorem [16] on the existence of long arithmetic progressions in dense subsets of the integers; in fact it might be viewed as a relative version of that. Another basic result in this area is the multi-dimensional extension of Szemerédi's theorem first proved by Furstenberg and Katznelson [4], which states that if $A \subseteq \mathbb{Z}^d$ is of positive upper density then $A$ contains non-trivial affine copies of any finite set $F \subseteq \mathbb{Z}^d$. The proof in [4] uses ergodic methods; however, a more recent combinatorial approach was developed by Gowers [5] and also independently by Nagle, Rödl and Schacht [13].

It is natural to ask if both results have a common extension, that is, if the Furstenberg-Katznelson theorem can be extended to subsets of $\mathbb{P}^d$ of positive relative upper density, that is, when the base set of integers is replaced by that of the primes. In fact, this question was raised by Tao [18], where the existence of arbitrary constellations among the Gaussian primes was shown. A partial result, extending the original approach of [7], was obtained earlier by B. Cook and the first author [3], where it was proved that relatively dense subsets of $\mathbb{P}^d$ contain an affine copy of any finite set $F \subseteq \mathbb{Z}^d$ which is in general position, meaning that each coordinate hyperplane contains at most one point of $F$.

However, when the set $F$ is not in general position, it does not seem feasible to find a suitably pseudo-random measure supported essentially on the $d$-tuples of the primes, due to the self-correlations inherent in the direct product structure.
For example, if we want to count corners $\{(a,b),(a+d,b),(a,b+d)\}$ in $A \subseteq \mathbb{P}^2$, then if $(a+d,b),(a,b+d) \in \mathbb{P}^2$ the remaining vertex $(a,b)$ must also be in $\mathbb{P}^2$. Thus the probability that all three vertices are in $\mathbb{P}^2$ (or in the direct product of the almost primes) is not $(\log N)^{-6}$ as one would expect, but roughly $(\log N)^{-4}$, preventing the use of any measure of the form $\nu \otimes \nu$.

In light of this our method is different, based on the hypergraph approach partly used already in [18], where one reduces the problem to that of proving a hypergraph removal lemma for weighted uniform hypergraphs. The natural approach is to use an appropriate form of the so-called transference principle [6], [14] to remove the weights and apply the removal lemmas for "un-weighted" hypergraphs, obtained in [5], [13], [19]. This way our argument also covers the main result of [3] and in particular that of [7]. Very recently another proof of the (one dimensional) Green-Tao theorem and the main result of [18], based on a removal lemma for uniform hypergraphs, has been given in [1]. An interesting feature of the argument there is that it only uses the so-called linear forms condition of [7].

Recall that a set $A \subseteq \mathbb{P}^d$ has upper relative density $\alpha$ if
$$\limsup_{N\to\infty} \frac{|A \cap \mathbb{P}_N^d|}{|\mathbb{P}_N^d|} = \alpha.$$
Let us state our main result.
Theorem 1.1. Let $A \subset (\mathbb{P}_N)^d$ with positive relative upper density $\alpha > 0$. Then $A$ contains at least $C(\alpha)\frac{N^{d+1}}{(\log N)^{2d}}$ corners for some (computable) constant $C(\alpha) > 0$.

As mentioned above, we will use the hypergraph approach which has been used to establish the existence of corners (and then that of general constellations) in dense subsets of $\mathbb{Z}^d$ [5], [13]. This was first observed, in the case of 2-dimensional corners, by Solymosi [15], where the key tool was to apply the so-called triangle removal lemma of Ruzsa and Szemerédi. In our weighted setting, this method allows us to distribute the weights so that we can avoid dealing with higher moments of the Green-Tao measure $\nu$. We will define the notion of independent weight systems and prove some facts about them; the weight system related to corners is just a special case. The reason that we cannot handle arbitrary constellations is that we do not quite have a suitable removal lemma (e.g. Thm. 5.1) for general weight systems on non-uniform hypergraphs. Indeed, for general constellations our approach leads to a weighted hypergraph with weights possibly attached to any lower dimensional hyperedge, making it difficult to apply transference principles to remove the weights. Thus one needs different ideas, which are addressed by Cook and the authors in a separate paper [2]. Simultaneously a completely different approach, based on an "infinite form" of the linear forms condition and on the recent work on inverse Gowers conjectures [8], [9], [10], has also been developed by Tao and Ziegler [20].

1.1. Notation. $[N] := \{1, 2, \dots, N\}$, $[M,N] := \{M, M+1, \dots, N\}$, $\mathbb{P}_N := \mathbb{P} \cap [N]$. Write $x = (x_1,\dots,x_d)$, $y = (y_1,\dots,y_d)$, $\omega \in \{0,1\}^d$, and let $P_\omega : \mathbb{Z}_N^{2d} \to \mathbb{Z}_N^d$ be the projection defined by
$$P_\omega(x,y) = u = (u_1,\dots,u_d), \qquad u_j = \begin{cases} x_j & \text{if } \omega_j = 0,\\ y_j & \text{if } \omega_j = 1.\end{cases}$$
For each $I \subseteq [d]$, $x_I = (x_i)_{i\in I}$. We may write $x$ for $x_{[d]}$ when we work in $\mathbb{Z}_N^d$. $\omega_I$ denotes an element of $\{0,1\}^{|I|}$.
Similarly we may write $\omega$ for $\omega_{[d]}$. We also define $P_{\omega_I}(x_I,y_I)$ in the same way. $\omega|_I$ is $\omega$ restricted to the index set $I$. For finite sets $X_j$, $j\in[d]$, and $I \subseteq [d]$, we write $X_I := \prod_{j\in I} X_j$ and
$$P_{\omega_I}(X_I,Y_I) = \prod_{i\in I} Z_i, \qquad Z_i = \begin{cases} X_i, & \omega_I(i)=0,\\ Y_i, & \omega_I(i)=1.\end{cases}$$
If we want to fix some position, we can write for example $\omega_{(0,[2,d])}$ for an element of $\{0,1\}^d$ whose first position is $0$.

Also for each $\omega$, define $y_{1(\omega)} \in \mathbb{Z}^d$ by
$$(y_{1(\omega)})_i = \begin{cases} 0 & \text{if } \omega_i = 0,\\ y_i & \text{if } \omega_i = 1,\end{cases} \qquad 1\le i\le d.$$
$y_{0(\omega)} \in \mathbb{Z}^d$ is defined similarly.

For any finite set $X$ and $f : X\to\mathbb{R}$, and for any measure $\mu$ on $X$,
$$\mathbb{E}_{x\in X} f(x) := \frac{1}{|X|}\sum_{x\in X} f(x), \qquad \int_X f\,d\mu := \frac{1}{|X|}\sum_{x\in X} f(x)\mu(x).$$
Unless otherwise specified, the error term $o(1)$ means a quantity that goes to $0$ as $N, W \to \infty$.
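As a concrete illustration of this notation, the following Python sketch implements the projection $P_\omega$ and the normalized average $\mathbb{E}_{x\in X}$. The function names `project` and `expectation` are illustrative choices made here, not names from the paper, and the example values are arbitrary.

```python
from itertools import product

def project(omega, x, y):
    """P_omega(x, y): coordinate j is x_j if omega_j == 0 and y_j if omega_j == 1."""
    return tuple(yj if w else xj for w, xj, yj in zip(omega, x, y))

def expectation(f, X):
    """Normalized average E_{x in X} f(x) = (1/|X|) * sum_{x in X} f(x)."""
    X = list(X)
    return sum(f(x) for x in X) / len(X)

# The 2^d projections of a pair (x, y) are the vertices of a combinatorial box.
x, y = (1, 2), (7, 8)
box = [project(omega, x, y) for omega in product((0, 1), repeat=2)]
average = expectation(lambda t: t, range(1, 5))
```

The list `box` enumerates the points $P_\omega(x,y)$ over all $\omega\in\{0,1\}^2$, which are exactly the points over which the box norms of Section 2 take a product.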
2. WEIGHTED HYPERGRAPHS AND BOX NORMS

2.1. Hypergraph setting.
First let us parameterize the affine copies of a corner as follows.
Definition 2.1. A non-degenerate corner is given by the following set of $d$-tuples of size $d+1$ in $\mathbb{Z}^d$ (or $\mathbb{Z}_N^d$):
$$\{(x_1,\dots,x_d),\ (x_1+s,x_2,\dots,x_d),\ \dots,\ (x_1,\dots,x_{d-1},x_d+s)\},\qquad s\neq 0,$$
or equivalently,
$$\Big\{(x_1,\dots,x_d),\ \Big(z-\sum_{\substack{1\le j\le d\\ j\ne 1}}x_j,\ x_2,\dots,x_d\Big),\ \Big(x_1,\ z-\sum_{\substack{1\le j\le d\\ j\ne 2}}x_j,\ x_3,\dots,x_d\Big),\ \dots,\ \Big(x_1,\dots,x_{d-1},\ z-\sum_{\substack{1\le j\le d\\ j\ne d}}x_j\Big)\Big\}$$
with $z \neq \sum_{1\le i\le d}x_i$.

Now to a given set $A\subseteq\mathbb{Z}_N^d$ we assign a $(d+1)$-partite hypergraph $G_A$ as follows. Let $X_1 = \dots = X_{d+1} := \mathbb{Z}_N$ be the vertex sets. For $j\in[1,d]$ let an element $a\in X_j$ represent the hyperplane $x_j = a$, and let an element $a \in X_{d+1}$ represent the hyperplane $a = x_1+\dots+x_d$. We join $d$ vertices (which represent $d$ hyperplanes) by a hyperedge if these $d$ hyperplanes intersect in a point of $A$. Then a simplex in $G_A$ corresponds to a corner in $A$. Note that this includes trivial corners which consist of a single point.

For each $I\subseteq[d+1]$ let $E(I)$ denote the set of hyperedges whose elements are exactly from the vertex sets $X_i$, $i\in I$. In order to count corners in $A$, we will place some weights on some of these hyperedges that will represent the coordinates of the corner. To be more precise, we define the weights on $1$-edges:
$$\nu_j(a) = \nu(a),\ a\in X_j,\ j\le d,\qquad \nu_{d+1}(a) = 1,\ a\in X_{d+1},$$
and on $d$-hyperedges:
$$\nu_I(a) = \nu\Big(a_{d+1}-\sum_{j\in I\setminus\{d+1\}}a_j\Big),\ a\in E(I),\ |I|=d,\ d+1\in I;\qquad \nu_{[1,d]}(a) = 1,\ a\in E([1,d]).$$
In particular the weights are $1$ or of the form $\nu_I(L_I(x_I))$ where all linear forms $\{L_I(x_I)\}$ are pairwise linearly independent. This is an example of what we call an independent weight system (see Definition 2.2). Note that we can also parameterize any configuration of the form $\{x, x+tv_1, \dots, x+tv_d\}$ in $\mathbb{P}^d$ using an appropriate independent weight system.
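The correspondence between simplices of $G_A$ and corners in $A$ can be checked by brute force in the smallest case $d = 2$, where $G_A$ is a 3-partite graph and simplices are triangles. The Python sketch below is only an illustration under that assumption: the function names `corners` and `simplices`, the modulus `N = 7` and the test set `A` are arbitrary choices made here and do not come from the paper.

```python
from itertools import product

def corners(A, N):
    """All (x1, x2, s) in Z_N^3 with (x1,x2), (x1+s,x2), (x1,x2+s) in A (s = 0 allowed)."""
    return {(x1, x2, s)
            for (x1, x2), s in product(A, range(N))
            if ((x1 + s) % N, x2) in A and (x1, (x2 + s) % N) in A}

def simplices(A, N):
    """Triangles (a1, a2, a3) of the 3-partite graph G_A for d = 2.

    a1 encodes the line x1 = a1, a2 the line x2 = a2, a3 the line x1 + x2 = a3;
    two vertices are joined when the two lines meet in a point of A.
    """
    out = set()
    for a1, a2, a3 in product(range(N), repeat=3):
        e12 = (a1, a2) in A                 # x1 = a1 meets x2 = a2
        e13 = (a1, (a3 - a1) % N) in A      # x1 = a1 meets x1 + x2 = a3
        e23 = ((a3 - a2) % N, a2) in A      # x2 = a2 meets x1 + x2 = a3
        if e12 and e13 and e23:
            out.add((a1, a2, a3))
    return out

N = 7
A = {(0, 0), (1, 0), (0, 1), (3, 3), (5, 3), (3, 5), (2, 6)}
# The map (x1, x2, s) -> (x1, x2, x1 + x2 + s) sends corners to simplices.
image = {(x1, x2, (x1 + x2 + s) % N) for x1, x2, s in corners(A, N)}
```

Here the trivial corners ($s = 0$) account for one simplex per point of `A`, matching the remark that the simplices of $G_A$ include the degenerate single-point corners.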
Now for each $I = [d+1]\setminus\{j\}$, $1\le j\le d$, let
$$f_I = \mathbf{1}_A\Big(x_1,\dots,x_{j-1},\ x_{d+1}-\sum_{\substack{1\le i\le d\\ i\ne j}}x_i,\ x_{j+1},\dots,x_d\Big)\cdot\nu_I$$
and for $I=[d]$ let $f_I = \mathbf{1}_A(x_1,\dots,x_d)$. As the coordinates of a corner contained in $\mathbb{P}^d$ are given by $2d$ prime numbers, we define
$$\Lambda := \Lambda_{d+1}(f_I, |I|=d) := \mathbb{E}_{x_{[d+1]}}\prod_{|I|=d}f_I\prod_{i=1}^{d}\nu(x_i) = N^{-(d+1)}\sum_{\substack{(p_i)_{1\le i\le 2d}\ \text{constitutes}\\ \text{a corner in } A}}\ \prod_{i=1}^{2d}\nu(p_i) \approx \frac{\log^{2d}N}{N^{d+1}}\,|\{\text{corners in } A\}|.$$
Hence $\Lambda$ can be used to estimate the number of corners (ignoring the $W$-trick here and assuming that $\nu(N)\approx\log N$ for now). Indeed, if $\Lambda \ge C$ then $|\{\text{corners in } A\}| \ge C\,\frac{N^{d+1}}{\log^{2d}N}$.
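To get a feeling for the count that $\Lambda$ is designed to capture, one can enumerate corners in $\mathbb{P}_N\times\mathbb{P}_N$ directly. The Python sketch below is an illustration for $d = 2$ only; the helper names `primes_up_to`, `count_prime_corners` and `heuristic` are assumptions made here, and `heuristic` returns only the order of magnitude $N^{d+1}/(\log N)^{2d}$ with the constant ignored.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: the set of primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(sieve[p * p::p])
    return {p for p in range(n + 1) if sieve[p]}

def count_prime_corners(N):
    """Number of (a, b, t), t != 0, with (a,b), (a+t,b), (a,b+t) all in P_N x P_N.

    Only the four numbers a, b, a+t, b+t need to be prime, so the count equals
    the sum over t of r(t)^2, where r(t) = #{p : p and p+t are primes <= N}.
    """
    P = primes_up_to(N)
    total = 0
    for t in range(-N, N + 1):
        if t == 0:
            continue
        r = sum(1 for p in P if p + t in P)
        total += r * r
    return total

def heuristic(N):
    """Order of magnitude N^(d+1) / (log N)^(2d) for d = 2, constant ignored."""
    return N ** 3 / log(N) ** 4
```

The factorization into $\sum_t r(t)^2$ reflects the self-correlation discussed in the introduction: a corner in $\mathbb{P}^2$ is determined by $2d = 4$ primes rather than by three independent points of $\mathbb{P}^2$.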
We define measure spaces associated to our system of measures as follows. For $1\le i\le d$, let $(X_i, d\mu_{X_i}) = (\mathbb{Z}_N,\nu)$ where $\nu$ is the Green-Tao measure, and let $\mu_{X_{d+1}}$ be the normalized counting measure on $X_{d+1} = \mathbb{Z}_N$. With this notation one may write
$$\Lambda := \Lambda_{d+1}(f_I, |I|=d) = \int_{X_1}\cdots\int_{X_{d+1}}\prod_{|I|=d}f_I\,d\mu_{X_1}\cdots d\mu_{X_{d+1}}.$$
This is indeed a special case of the following notion.
Definition 2.2 (Independent weight system) . An independent weight system is a family of weights on theedges of a d + 1 − partite hypergraph such that for any I ⊆ [ d + 1] , | I | ≤ d , ν I ( x I ) is either or of the form Q K ( I ) j =1 ν ( L jI ( x I )) where all distinct linear forms { L jI } I ⊆ [ d +1] , ≤ j ≤ K ( I ) are pairwise linearly independent,moreover the form L jI depends exactly on the variables x I = ( x j ) j ∈ I . In fact for a weight system that arised from parametrizing affine copies of configurations in Z d , it is easyto see from the construction that for any I ⊆ [ d + 1] , | I | = d all distinct linear forms { L kJ } J ⊆ I, ≤ k ≤ K ( J ) are linearly independent however we don’t need this fact in our paper. We define a measure on X I , I ⊆ [ d + 1] , | I | = d associated to an independent weight system by Z X [ d ] f dµ X [ d ] := E x [ d ] f I · Y I ⊆ [ d ] , | I | In this section we describe the weighted version of Gow-ers’s uniformity (box) norms and the so-called Gowers’s inner product associated to the hypergraph G A endowed with a weight system { ν I } I ⊆ [ d +1] , | I |≤ d . Definition 2.3. For each ≤ j ≤ d, let X j , Y j be finite set (in this paper we will define X j = Y j := Z N )with a weight system ν on X [ d ] × Y [ d ] . For f : X [ d ] → R , define k f k d (cid:3) ν := Z X [ d ] × Y [ d ] Y ω [ d ] f ( P ω [ d ] ( x [ d ] , y [ d ] )) dµ X [ d ] × Y [ d ] := E x [ d ] E y [ d ] Y ω [ d ] f ( P ω [ d ] ( x [ d ] , y [ d ] )) Y | I | We will use Cauchy-Schwartz’s inequality and linear form condition. Write D f ω ; ω ∈ { , } d E (cid:3) dν = E x [2 ,d ] ,y [2 ,d ] (cid:20)(cid:18) Y | I | The generalized von-Neumann inequality says thatthe average Λ := Λ d +1 ,ν ( f I , I ⊆ [ d + 1] , | I | = d ) (see (2.1)) is controlled by the weighted box norm. Weshow this inequality in the general settings of an independent weight system. Theorem 2.3 (Weighted generalized von-Neumann inequality) . 
Let I ⊆ [ d + 1] , | I | = d, f I : X I → R . Let ν be an independent system of measure on X [ d +1] that satisfies linear form conditions. Suppose f I aredominated by ν i.e. | f I | ≤ ν I then | Λ d +1 ,ν ( f (1) , ..., f ( d +1) ) | . min {k f (1) k (cid:3) dν , ..., k f ( d +1) k (cid:3) dν } ORNERS IN DENSE SUBSETS OF P d Proof. We will only use Cauchy-Schwartz inequality and the linear forms condition. The idea is to considerone of the variables say x j , as a dummy variable and write Λ := E x j ( ... ) E x [ d +1] \{ j } ( ... ) then apply Cauchy Schwartz’s inequality to eliminate the lower complexity factors and use linear formscondition to control the extra factor gained. We do this repeatedly d times.First apply Cauchy-Schwartz’s inequality in x d +1 variable to eliminate f ( d +1) | Λ | ≤ E x [ d ] f ( d +1) ( x [ d ] ) Y | I | 3. T HE DUAL FUNCTION ESTIMATE .In this section we prove Theorem 3.1. For any independent measure system and any fixed J ⊆ [ d + 1] , | J | = d , let F , ..., F K : X J → R , F j ( x J ) ≤ ν J ( x J ) be given functions. Then for each ≤ j ≤ K we have that (cid:13)(cid:13) K Y j =1 D F j (cid:13)(cid:13) ∗ (cid:3) dν = O K (1) Proof. We will denote by I the subsets of a fixed set J ⊆ [ d + 1] , | J | = d . First, write D F j ( x ) = E y j ∈ Z dN Y ω =0 F j ( P ω ( x, y j )) Y | I | For each I ⊆ [ d ] and fixed ω I , the number of ω [ d ] such that ω [ d ] | I = ω I is d −| I | and P ω ( x, y ) | I = P ω I ( x I , y I ) ⇐⇒ ω | I = ω I So from the remark D G ω,Y ; ω ∈ { , } d E (cid:3) dν = E x,y ∈ Z dN Y ω [ d ] G ω,Y ( P ω ( x, y )) Y | I | 4. T RANSFERENCE P RINCIPLE In this section, we will slightly modify the transference principle in [6](see Theorem 4.6) , which willallow us to deduce results for functions dominated by a pseudo-random measure from the correspondingresult on bounded functions. 
We will do this on the set on which our functions have bounded dual, and treatthe contributions of the remaining set as error terms.We will work on functions f : X I → R , dominated by ν I . WLOG assume I = [ d ] . Let h·i be anyinner product on F := { f : X [ d ] → R } written as h f, g i = R f · g dµ for some measure µ on X [ d ] . In this ORNERS IN DENSE SUBSETS OF P d section we will need the explicit discription of the set Ω( T ) that the dual function is bounded by T using thecorrelation condition (see appendix).4.1. Dual Boundedness on X I . One property of the dual functions that is used in [7] is their boundedness.However in the weighted settings, this is generally not true. To get around this, we will be working on setson which the dual functions are bounded and treat the contributions of the remaining parts as error terms.Consider any independent weight system. Let I ⊆ [ d + 1] , | I | = d , f : X I → R , | f | ≤ ν I (WLOG I = [ d ] ). Recall D ( f ) = E y Y ω =0 ν ( L ( P ω ( x, y ))) Y | I | For each T > we have the set Ω( T ) and define the following sets F := { f : X [ d ] → R }F T := { f ∈ F : supp ( f ) ⊆ Ω( T ) }S T := { f ∈ F T : | f | ≤ ν [ d ] ( x [ d ] ) + 2 } We define the following (basic anti-correlation) norm on F T k f k BAC := max g ∈S T | h f, D g i | We have the following basic properties of this norm. Proposition 1. (1) g ∈ F T ⇒ D g ∈ F T (2) k·k BAC is a norm on F T and can be extended to be a seminorm on F . Furthermore, we have k f k BAC = (cid:13)(cid:13) f · Ω( T ) (cid:13)(cid:13) BAC , f ∈ F . (3) Span {D g : g ∈ S T } = F T (4) k f k ∗ BAC = inf { P ki =1 | λ i | , f = P ki =1 λ i D g i ; g i ∈ S T } for f ∈ F T Remark 2. If f / ∈ F T then supp ( f ) * Ω( T ) so f is not of the form P ki =1 λ i D g i ; g i ∈ F T as RHS is zero.Proof. (1) Suppose (˜ x , ..., ˜ x d ) ∈ Ω( T ) C then there is an J ( [ d ] such that K (˜ x [ d ] \ J ) > T where K is the function in the definition of Ω J ( T ) for some j . 
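Let $g \in \mathcal{F}_T$; then $g(\tilde{x}_{[d]\setminus J}, x_J) = 0$ for all $x_J \in X_J$, so $\mathcal{D}g \in \mathcal{F}_T$.

(2) It follows directly from the definition that $\|f+g\|_{BAC} \le \|f\|_{BAC} + \|g\|_{BAC}$ and $\|\lambda f\|_{BAC} = |\lambda|\,\|f\|_{BAC}$ for any $\lambda\in\mathbb{R}$. Now suppose $f\in\mathcal{F}_T$ is not identically zero; we need to show that $\|f\|_{BAC}\neq 0$. Since $X$ and $Z$ are finite sets, we have $\|f\|_\infty = \max_{x,z}|f(x,z)| < \infty$. Let $g = \gamma f$ where $\gamma > 0$ is a constant such that $\|g\|_\infty < 2$; then $g\in\mathcal{S}_T$ and
$$\langle f,\mathcal{D}g\rangle = \langle f,\mathcal{D}(\gamma f)\rangle = \gamma^{2^d-1}\langle f,\mathcal{D}f\rangle > 0,$$
so $\|f\|_{BAC} > 0$. Now since $\mathrm{supp}(\mathcal{D}g)\subseteq\Omega(T)$, we have for any $f\in\mathcal{F}$
$$\|f\|_{BAC} = \sup_{g\in\mathcal{S}_T}|\langle f,\mathcal{D}g\rangle| = \sup_{g\in\mathcal{S}_T}|\langle f\cdot\mathbf{1}_{\Omega(T)},\mathcal{D}g\rangle| = \|f\cdot\mathbf{1}_{\Omega(T)}\|_{BAC}.$$

(3) Suppose there is an $f\in\mathcal{F}_T$, not identically zero, with $f\notin\mathrm{span}\{\mathcal{D}g : g\in\mathcal{S}_T\}$. Then $f\in\mathrm{span}\{\mathcal{D}g : g\in\mathcal{S}_T\}^{\perp}$, so $\langle f,\mathcal{D}g\rangle = 0$ for all $g\in\mathcal{S}_T$. Hence $\|f\|_{BAC} = 0$, which is a contradiction.

(4) Define $\|f\|_{D} = \inf\{\sum_{i=1}^k|\lambda_i| : f = \sum_{i=1}^k\lambda_i\mathcal{D}g_i,\ g_i\in\mathcal{S}_T\}$, which can be easily verified to be a norm on $\mathcal{F}_T$. Now let $\varphi, f\in\mathcal{F}_T$, $f = \sum_{i=1}^k\lambda_i\mathcal{D}g_i$, $g_i\in\mathcal{S}_T$. Then
$$|\langle\varphi,f\rangle| \le \sum_{i=1}^k|\lambda_i|\,|\langle\varphi,\mathcal{D}g_i\rangle| \le \|\varphi\|_{BAC}\sum_{i=1}^k|\lambda_i|,$$
and taking the infimum over all such representations gives $|\langle\varphi,f\rangle| \le \|\varphi\|_{BAC}\|f\|_{D}$, so $\|f\|^{*}_{BAC}\le\|f\|_{D}$. Next, for all $g\in\mathcal{S}_T$ we have $\|\mathcal{D}g\|_{D}\le 1$, so
$$\|f\|_{BAC} = \sup_{g\in\mathcal{S}_T}|\langle f,\mathcal{D}g\rangle| \le \sup_{\|h\|_{D}\le 1}|\langle f,h\rangle| = \|f\|^{*}_{D},$$
so $\|f\|_{BAC}\le\|f\|^{*}_{D}$, i.e. $\|f\|^{*}_{BAC}\ge\|f\|_{D}$. So $\|f\|^{*}_{BAC} = \|f\|_{D}$. $\Box$

Now let us prove the following lemma, whose proof relies on the dual function estimate. From here on we consider our inner product $\langle\cdot\rangle_\nu$ and the norm $\|\cdot\|_{\Box^d_\nu}$. This argument also works for any norm for which one has the dual function estimate.

Lemma 4.1. Let $\varphi\in\mathcal{F}_T$ be such that $\|\varphi\|^{*}_{BAC}\le C$ and $\eta > 0$. Let $\varphi_+ := \max\{0,\varphi\}$. Then there is a polynomial $P(u) = a_m u^m + \dots$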
$+\ a_1 u + a_0$ such that
(1) $\|P(\varphi)-\varphi_+\|_\infty\le\eta$,
(2) $\|P(\varphi)\|^{*}_{\Box^d_\nu}\le\rho(C,T,\eta)$,
where $\rho(C,T,\eta) := 2\inf R_P(C)$, the infimum is taken over polynomials $P$ such that $|P(u)-u_+|\le\eta$ on $[-CT,CT]$, and
$$R_P(x) = \sum_{j=0}^{m}C(j)\,|a_j|\,x^j,$$
where $C(j)$ is the constant in the dual function estimate (Theorem 3.1) for a product of $j$ dual functions.

Proof. First, recall that if $(x_1,\dots,x_d)\in\mathrm{supp}(\mathcal{D}g_i)\subseteq\Omega(T)$ then $|\mathcal{D}g_i(x_1,\dots,x_d)|\le T$. Now suppose $\|\varphi\|^{*}_{BAC}\le C$; then there exist $g_1,\dots,g_k\in\mathcal{S}_T$ and $\lambda_1,\dots,\lambda_k$ such that $\varphi = \sum_{i=1}^k\lambda_i\mathcal{D}g_i$ and $\sum_{1\le i\le k}|\lambda_i|\le C$. Hence
$$|\varphi(x_1,\dots,x_d)| \le \Big(\sum_{i=1}^k|\lambda_i|\Big)\Big(\max_{1\le i\le k}|\mathcal{D}g_i(x_1,\dots,x_d)|\Big)\le CT.$$
Hence the range of $\varphi$ satisfies $\varphi(\Omega(T))\subseteq[-CT,CT]$. Then by the Weierstrass approximation theorem, there is a polynomial $P$ (which may depend on $C, T, \eta$) such that $R_P(C)\le\rho$ and $|P(u)-u_+|\le\eta$ for all $|u|\le CT$, and so $\|P(\varphi)-\varphi_+\|_\infty\le\eta$, which is (1).

Now using the dual function estimate, we have for each $j$
$$\|\varphi^j\|^{*}_{\Box^d_\nu} = \Big\|\Big(\sum_{1\le i\le k}\lambda_i\mathcal{D}g_i\Big)^{j}\Big\|^{*}_{\Box^d_\nu} \le \sum_{1\le i_1,\dots,i_j\le k}|\lambda_{i_1}\cdots\lambda_{i_j}|\,\|\mathcal{D}g_{i_1}\cdots\mathcal{D}g_{i_j}\|^{*}_{\Box^d_\nu} \le C(j)\sum_{1\le i_1,\dots,i_j\le k}|\lambda_{i_1}\cdots\lambda_{i_j}| = C(j)\Big(\sum_{1\le i\le k}|\lambda_i|\Big)^{j}\le C(j)\,C^j.$$
Hence $\|P(\varphi)\|^{*}_{\Box^d_\nu}\le\sum_{j=0}^{m}|a_j|\,C(j)\,C^j = R_P(C)\le\rho(C,T,\eta)$. $\Box$

Now we are ready to prove the transference principle.

Theorem 4.2. Suppose $\nu$ is an independent weight system. Let $f\in\mathcal{F}$ with $0\le f(x_{[d]})\le\nu_{[d]}(x_{[d]})$, and let $\eta > 0$. Suppose $N\ge N(\eta,T)$ is large enough. Then there are functions $g, h$ on $X_1\times\dots\times X_d$ such that
(1) $f = g+h$ on $\Omega(T)$,
(2) $0\le g\le 2$ on $\Omega(T)$,
(3) $\|h\cdot\mathbf{1}_{\Omega(T)}\|_{\Box^d_\nu}\le\eta$.

To prove this theorem, it suffices to show

Theorem 4.3.
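With the same assumptions as in Theorem 4.2, there are functions $g, h$ such that
(1) $f = g+h$ on $\Omega(T)$,
(2) $0\le g\le 2$ on $\Omega(T)$,
(3) $\|h\cdot\mathbf{1}_{\Omega(T)}\|_{BAC}\le\eta$.
Here the BAC-norm is taken with respect to $\langle\cdot\rangle_\nu$.

Theorem 4.3 $\Rightarrow$ Theorem 4.2: Since $h\cdot\mathbf{1}_{\Omega(T)} = f\cdot\mathbf{1}_{\Omega(T)} - g\cdot\mathbf{1}_{\Omega(T)}$, we have $-2\le h\cdot\mathbf{1}_{\Omega(T)}\le\nu$, so $|h\cdot\mathbf{1}_{\Omega(T)}|\le\nu+2$ and hence $h\cdot\mathbf{1}_{\Omega(T)}\in\mathcal{S}_T$. Hence by the definition of the BAC-norm,
$$\eta \ge \|h\cdot\mathbf{1}_{\Omega(T)}\|_{BAC} \ge \langle h\cdot\mathbf{1}_{\Omega(T)},\mathcal{D}(h\cdot\mathbf{1}_{\Omega(T)})\rangle_\nu = \|h\cdot\mathbf{1}_{\Omega(T)}\|^{2^d}_{\Box^d_\nu}.\qquad\Box$$

The following lemma will be used in the next proof.

Lemma 4.4 (Corollary 3.2 in [6]). Let $K_1,\dots,K_r$ be closed convex subsets of $\mathbb{R}^d$, each containing $0$, and suppose $f\in\mathbb{R}^d$ cannot be written as a sum $f_1+\dots+f_r$, $f_i\in c_iK_i$, $c_i > 0$. Then there is a linear functional $\varphi$ such that $\langle f,\varphi\rangle > 1$ and $\langle g,\varphi\rangle\le c_i^{-1}$ for all $i\le r$ and all $g\in K_i$.

Proof of Theorem 4.3: Define
$$K := \{g\in\mathcal{F} : 0\le g\le 2 \text{ on } \Omega(T)\},\qquad L := \{h\in\mathcal{F} : \|h\|_{BAC}\le\eta\}.$$
Then it is clear that $K, L$ are convex. (Also $0\in K$, $0\in\mathrm{Int}(L)$ and then $0\in\mathrm{Int}(K+L)$.) Assume that $f\notin K+L$ on $\Omega(T)$. Then by Lemma 4.4, there exists $\varphi\in\mathcal{F}$ such that
(1) $\langle\varphi, f\cdot\mathbf{1}_{\Omega(T)}\rangle_\nu > 1$,
(2) $\langle\varphi,g\rangle_\nu\le 1$ for all $g\in K$,
(3) $\langle\varphi,h\rangle_\nu\le 1$ for all $h\in L$.

First, we claim that $\varphi\in\mathcal{F}_T$. To see this, suppose $g$ is a function with $\mathrm{supp}(g)\subseteq\Omega(T)^C$ (i.e. $g\equiv 0$ on $\Omega(T)$, so $g\in K$). Since $g\in K$, $\langle\varphi,g\rangle_\nu\le 1$; but $g$ could be chosen arbitrarily on $\Omega(T)^C$, so we must have $\varphi|_{\Omega(T)^C}\equiv 0$ and hence $\varphi\in\mathcal{F}_T$. Now let
$$g(x_{[d]}) = \begin{cases} 2 & \text{if } \varphi(x_{[d]})\ge 0,\\ 0 & \text{otherwise;}\end{cases}$$
then $g\in K$ and
$$\langle\varphi,g\rangle_\nu = \langle\varphi_+,2\rangle_\nu = 2\langle\varphi_+,1\rangle_\nu\le 1 \ \Rightarrow\ \langle\varphi_+,1\rangle_\nu\le\tfrac12.$$
Now take $h$ with $\|h\cdot\mathbf{1}_{\Omega(T)}\|_{BAC}\le 1$, so that $\eta h\in L$. Since $\varphi\in\mathcal{F}_T$, we then have $\langle\varphi,h\cdot\mathbf{1}_{\Omega(T)}\rangle_\nu = \langle\varphi,h\rangle_\nu\le\eta^{-1}$.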
Hence if h ′ ∈ F T and k h ′ k BAC ≤ then (cid:13)(cid:13) h ′ · Ω( T ) (cid:13)(cid:13) BAC = k h ′ k BAC ≤ so h φ, h ′ i ν ≤ η − ∀ h ′ ∈ F T , (cid:13)(cid:13) h ′ (cid:13)(cid:13) BAC ≤ so k φ k ∗ BAC ≤ η − as k·k BAC is a norm on F T .Now by the Lemma 4.1, there is a polynomial P such that k P ( φ ) − φ + k ∞ ≤ and k P ( φ ) k ∗ (cid:3) dν ≤ ρ ( C, T, η ) Then h P ( φ ) , i ν ≤ h P ( φ ) − φ + , i ν + h φ + , i ν ≤ + Also, from the definition of the weighted boxnorm and the linear form condition, we have k ν [ d ] ( x [ d ] ) − k d (cid:3) dν = o N →∞ (1) so suppose N ≥ N ( T, η ) then (cid:10) P ( φ ) , ν [ d ] (cid:11) ν = h P ( φ ) , i ν + (cid:10) P ( φ ) , ν [ d ] − (cid:11) ν ≤ 12 + 18 + k P ( φ ) k ∗ (cid:3) dν (cid:13)(cid:13) ν [ d ] − (cid:13)(cid:13) (cid:3) dν ≤ 12 + 14 = 34 | (cid:10) ν [ d ] , φ + (cid:11) ν | = | (cid:10) ν [ d ] , φ + − P ( φ ) (cid:11) ν | + | (cid:10) ν [ d ] , P ( φ ) (cid:11) ν | ≤ k φ + − P ( φ ) k ∞ (cid:10) ν [ d ] , (cid:11) ν + (cid:10) ν [ d ] , P ( φ ) (cid:11) ν ≤ · 12 + 34 Hence (cid:10) f · Ω( T ) , φ (cid:11) ν = h f, φ i ν ≤ h f, φ + i ν ≤ (cid:10) ν [ d ] , φ + (cid:11) ν ≤ 34 + 110 < which is a contradiction. Hence f ∈ K + L on Ω( T ) . (cid:3) Now we can rephrase Theorem 4.3 as follow: Theorem 4.5 (Transference Principle) . Suppose ν is an independent weight system. Let f ∈ F , ≤ f ≤ ν and < η < ≪ T then there exists f , f , f ∈ F such that (1) f = f + f + f (2) ≤ f ≤ , supp ( f ) ⊆ Ω( T ) (3) k f k (cid:3) dν ≤ η, supp ( f ) ⊆ Ω( T ) (4) ≤ f ≤ ν, supp ( f ) ⊆ Ω( T ) C , k f k L ν . T . ORNERS IN DENSE SUBSETS OF P d Proof. Let g, h be as in Theorem 4.3. Take f = g · Ω T , f = h · Ω T then f · Ω T = f + f . Let f = f · Ω CT . Now by linear form condition k f k L ν ≤ T E x [ d ] f · D f · Y I ⊆ [ d ] , | I | 5. R ELATIVE H YPERGRAPH R EMOVAL L EMMA First let us recall the statement of ordinary functional hypergraph removal lemma [19]. Recall the defi-nition of Λ in equation (2.1). Theorem 5.1. 
Given measure spaces $(X_1,\mu_{X_1}),\dots,(X_{d+1},\mu_{X_{d+1}})$ and $f^{(i)} : X_I\to[0,1]$, $I = [d+1]\setminus\{i\}$. Let $\epsilon > 0$, and suppose $|\Lambda_{d+1}(f^{(1)},\dots,f^{(d)},f^{(d+1)})|\le\epsilon$. Then for $1\le i\le d+1$ there exists $E_i\subseteq X_{[d+1]\setminus\{i\}}$ such that $\prod_{1\le j\le d+1}\mathbf{1}_{E_j}\equiv 0$ and, for $1\le i\le d+1$,
$$\int_{X_1}\cdots\int_{X_{d+1}} f^{(i)}\cdot\mathbf{1}_{E_i^C}\,d\mu_{X_1}\cdots d\mu_{X_d}\,d\mu_{X_{d+1}}\le\delta(\epsilon),$$
where $\delta(\epsilon)\to 0$ as $\epsilon\to 0$.

Also let us state a functional version of Szemerédi's Regularity Lemma [19] that we will use later in the proof. If $\mathcal{B}$ is a finite factor of $X$, i.e. a finite $\sigma$-algebra of measurable sets in $X$, then $\mathcal{B}$ is a partition of $X$ into atoms $A_1,\dots,A_M$. Let $f : X\to\mathbb{R}$ be measurable. Then the conditional expectation $\mathbb{E}(f|\mathcal{B}) : X\to\mathbb{R}$ is defined by $\mathbb{E}(f|\mathcal{B})(x) = (1/|A_i|)\int_{A_i}f(x)\,d\mu_X$ if $x\in A_i$ (defined up to sets of measure zero). We say that $\mathcal{B}$ has complexity at most $m$ if it is generated by at most $m$ sets. If $\mathcal{B}_X$ is a finite factor of $X$ with atoms $A_1,\dots,A_M$ and $\mathcal{B}_Y$ is a finite factor of $Y$ with atoms $B_1,\dots,B_N$, then $\mathcal{B}_X\vee\mathcal{B}_Y$ is a finite factor of $X\times Y$ with atoms $A_i\times B_j$, $1\le i\le M$, $1\le j\le N$.

Theorem 5.2 (Szemerédi's Regularity Lemma [19]). Let $f : X_{[d]}\to[0,1]$ be measurable, let $\tau > 0$ and let $F : \mathbb{N}\to\mathbb{N}$ be an arbitrary increasing function (possibly depending on $\tau$). Then there is an integer $M = O_{F,\tau}(1)$ and factors $\mathcal{B}_I$ ($I\subseteq[d]$, $|I| = d-1$) on $X_I$ of complexity at most $M$ such that $f = f_1+f_2+f_3$ where
• $f_1 = \mathbb{E}\big(f\,\big|\,\bigvee_{I\subseteq[d],|I|=d-1}\mathcal{B}_I\big)$.
• $\|f_2\|_{L^2_\nu}\le\tau$.
• $\|f_3\|_{\Box^d_\nu}\le F(M)^{-1}$.
• $f_1, f_1+f_2\in[0,1]$.

Remark 3. A consequence of this lemma that we will use later is the following: since $f_1$ is constant on each atom of $\bigvee_{I,|I|=d-1}\mathcal{B}_I$, we can decompose $f_1$ as a finite sum of lower complexity functions, i.e. a finite sum of products $\prod_{i=1}^d J_i$ where $J_i$ is a function of the variables $x_{[d]\setminus\{i\}}$ and takes values in $[0,1]$.

(Footnote.) In fact the paper [19] proves this theorem only with the counting measure (with the notion of $e$-discrepancy in place of the box norm).
But the proof also works for any finite measure that has direct product structure (with the notion of weighted Box Norm).(see[17] for the case of probability measures in d = 2 , ). However we don’t know how to genralize this argument to arbitrary measureon the product space. If we can prove this theorem for any measure µ X × ... × X d then we would be able to prove multidimensionalGreen-Tao’s Theorem. This theorem is proved for counting measure in [19] but the proof would work for any product measure on the product spaces. Theorem 5.3 (Weighted Simplex-Removal Lemma) . Suppose f ( i ) ( x [ d +1] \{ i } ) ≤ ν [ d +1] \{ i } ( x [ d +1] \{ i } ) . Let ǫ > , Suppose | Λ | ≤ ǫ then there exist E i ⊆ Q j ∈ [ d +1] \{ i } X j such that for ≤ i ≤ d + 1 , • Y i ∈ [ d +1] E i ≡ • R X · · · R X d +1 f ( i ) E Ci dµ X · · · dµ X d +1 = E x [ d +1] \{ i } E Ci f ( i ) ( x [ d +1] \{ i } ) Q J ( [ d +1] \{ i } ν J ( x J ) ≤ δ ( ǫ ) where δ ( ǫ ) → as ǫ → . Proof. Using the transference principle (Theorem 4.6) for ≤ i ≤ d + 1 , write f ( i ) = g ( i ) + h ( i ) + k ( i ) where(1) f ( i ) = g ( i ) + h ( i ) + k ( i ) (2) ≤ g ( i ) ≤ , supp ( g ( i ) ) ⊆ Ω ( i ) ( T ) (3) (cid:13)(cid:13) h ( i ) (cid:13)(cid:13) (cid:3) dν ≤ η, supp ( h ( i ) ) ⊆ Ω ( i ) ( T ) (4) k ( i ) = f ( i ) · (Ω ( i ) ) C ( T ) where Ω ( i ) ( T ) = { x [ d +1] \{ i } : |D f ( i ) | ≤ T } , ≤ i ≤ d Step 1: We’ll show that if T ≥ T ( ǫ ) is sufficiently large then Λ d +1 ( g (1) + h (1) , ..., g ( d +1) + h ( d +1) ) = Λ d +1 ( f (1) − k (1) , ..., f ( d +1) − k ( d +1) ) . ǫ. Proof of Step1: For I ⊆ [ d + 1] , the term on LHS can be written as a sum of the following terms: Λ d +1 ,I ( e (1) , ..., e ( d ) , e ( d +1) ) , e ( i ) = ( − k ( i ) if i ∈ If ( i ) if i / ∈ I If I = ∅ then Λ d +1 ( f (1) , ..., f ( d ) , f ( d +1) ) ≤ ǫ by the assumption. 
Suppose I = { i , ..., i r } 6 = ∅ then | Λ d +1 ,I ( e (1) , ..., e ( d ) , f ( d +1) ) | = (cid:12)(cid:12)(cid:12)(cid:12) Z X · · · Z X d +1 f (1) · · · f ( d +1) · Y i ∈ I (Ω ( i ) ) C dµ X · · · dµ X d +1 (cid:12)(cid:12)(cid:12)(cid:12) ≤ E x [ d +1] Y I ⊆ [ d +1] , | I |≤ d ν I ( x I ) (Ω ( i ) C ≤ T E x d +1 E y [ d +1] \{ i } Y I ⊆ [ d +1] , | I |≤ d ν I ( x I ) Y ω I =0 I ⊆ [ d +1] \{ i } ν I ( P ω I ( x I , y I )) . T by linear form condition. Step 2 We’ll show Λ d +1 ( g (1) , ..., g ( d +1) ) . ǫ if η ≤ η ( ǫ ) , N ≥ N ( ǫ, η ) . Proof ofstep 2: Write g ( i ) = g ( i ) + h ( i ) − h ( i ) = f ( i ) · Ω ( i ) ( T ) − h ( i ) then we have ≤ f ( i ) · Ω ( i ) ( T ) ≤ ν i , k h ( i ) k (cid:3) dν ≤ η so by the weighted von-Neumann inequality and step 1 , we have | Λ d +1 ( g (1) , ..., g ( d +1) ) | = | Λ d +1 ( g (1) + h (1) , ..., g ( d +1) + h ( d +1) ) − Λ d +1 ( h (1) , .., h ( d ) , h ( d +1) ) | . ǫ + η + o N →∞ (1) . ǫ if τ ≤ ǫ, η ≤ ǫ, N ≥ N ( ǫ ) and the proof of step 2 is completed.Now since ≤ g ( i ) ≤ then (after normalizing) using the ordinary hypergraph removal lemma (Theorem ORNERS IN DENSE SUBSETS OF P d F ( i ) ⊆ X [ d +1] \{ i } such that Y ≤ k ≤ d +1 F k ≡ and Z X · · · Z X d +1 g ( i ) · F Ci dµ X · · · dµ X d +1 . δ ( ǫ ) so Z X · · · Z X d +1 f ( i ) · F Ci dµ X · · · dµ X d +1 . δ ( ǫ ) + Z X · · · Z X d +1 h ( i ) · F Ci dµ X · · · dµ X d dµ X d +1 | {z } ( A ) ++ Z X · · · Z X d +1 f ( i ) · Ω Ci ( T ) F Ci dµ X · · · dµ X d +1 | {z } ( B ) Now for our purpose, it suffices to show ( A ) , ( B ) . ǫ. Estimate for (A): By the regularity lemma , the function F Ci could be written as a sum of O (1) offunctions of the form Q j ∈ [ d +1] \{ i } v ( i ) j plus some functions which give a small error term O ( ǫ ) in (A)(using von Neumann’s inequality) where v ( i ) j is a [0 , - valued function in x [ d +1] \{ i,j } . 
We could write u ( i ) j = max v ( i ) j for each fixed i, j then the sum of Q j ∈ [ d +1] \{ i } v ( i ) j is less than C Q j ∈ [ d +1] \{ i } u ( i ) j for someabsolute constant C . Applying Cauchy-Schwartz’s inequality d times to estimate the expression (A) (herelet assume i < d + 1 , the case i = d + 1 is the same.) : (cid:18) Z X · · · Z X d Z X d +1 h ( i ) · F Ci dµ X · · · dµ X d dµ X d +1 (cid:19) d . (cid:20)(cid:18) Z X · · · Z X d (cid:18) Z X d +1 h ( i ) Y ≤ j ≤ dj = i u ( i ) j dµ X d +1 (cid:19) u id +1 dµ X · · · dµ X d dµ X d +1 (cid:19) (cid:21) d − ≤ (cid:20) Z X · · · Z X d (cid:18) Z X d +1 h ( i ) Y ≤ j ≤ d − j = i u ( i ) j dµ X d +1 (cid:19) dµ X · · · dµ X d × Z X · · · Z X d (cid:0) u id +1 (cid:1) dµ X · · · dµ X d (cid:21) d − . (cid:20) Z X · · · Z X d Z X d +1 Z Y d +1 h ( i ) ( x [ d +1] \{ i } , x d +1 ) h ( i ) ( x [ d ] \{ i } , y d +1 ) Y ≤ j ≤ dj = i u ( i ) j ( x [ d ] \{ i } , x d +1 ) u ( i ) j ( x [ d ] \{ i } , y d +1 ) dµ X · · · dµ X d +1 dµ Y d +1 (cid:21) d − Continue applying cauchy schwartz’s inequality this way. After d application of cauchy Schwartz’s inequal-ity, the u ( i ) j eventually disappears and we have this bounded by k h ( i ) k d (cid:3) ν ≤ ǫ. Estimate for (B): Next we estimate the expression in (B), (cid:12)(cid:12)(cid:12)(cid:12) Z X · · · Z X d +1 f ( i ) · (Ω ( i ) ( T )) C · F Ci dµ X · · · dµ X d +1 (cid:12)(cid:12)(cid:12)(cid:12) We need this since we don’t have something like k fg k (cid:3) ν ≤ k f k (cid:3) ν k g k (cid:3) ν ≤ Z X · · · Z X d +1 ( ν [ d +1] \{ i } ) · (Ω ( i ) ( T )) C dµ X · · · dµ X d +1 ≤ T E x [ d +1] \{ i } E y [ d +1] \{ i } ν [ d +1] \{ i } ( x [ d +1] \{ i } ) Y | I |≤ d,i/ ∈ I ν I ( x I ) Y ω I =0 I ⊆ [ d +1] \{ i } ν I ( P ω I ( x I , y I )) . T , by the linear forms condition. Hence if we choose sufficiently large T then Z X · · · Z X d +1 f ( i ) · F Ci dµ X · · · dµ X d +1 . δ ( ǫ ) . (cid:3) PROOF OF THE MAIN RESULT From Z N to Z . 
Now recall that $\nu_{\delta_1,\delta_2}(n)\approx\frac{\phi(W)}{W}\log N$ for $\delta_1 N\le n\le\delta_2 N$, $\delta_1,\delta_2\in(0,1]$, for a sufficiently large prime $N$ in the residue class $b \pmod W$. By the pigeonhole principle, choose $b\in(\mathbb{Z}_W^{\times})^d$ such that

\[
\big|A\cap\big((W\mathbb{Z})^d+b\big)\big|\ \ge\ \alpha\,\frac{N^d}{(\log N)^d\,\phi(W)^d}.
\]

Now consider $A_b=\{n\in[1,N/W]^d : Wn+b\in A\}$ and let $\delta_2\in(0,1)$. By the Prime Number Theorem there is a prime $N'$ such that $\delta_2N'=(1+\delta_3)\frac{N}{W}$ for an arbitrarily small real number $\delta_3$. Then if $N$ is sufficiently large and $\delta_3$ is sufficiently small with respect to $\alpha$, we have

\[
\big|A_b\cap[1,\delta_2N']^d\big|\ \ge\ \alpha\,\delta_2^d\,\frac{(N'W)^d}{(\log N')^d\,\phi(W)^d}. \tag{6.1}
\]

On the other hand, by Dirichlet's theorem on primes in arithmetic progressions, the number of $n\in[1,N']^d\setminus[\delta_1N',N']^d$ for which $Wn+b\in\mathbb{P}^d$ is at most

\[
c_d\,\epsilon\,\frac{(N'W)^d}{(\log N')^d\,\phi(W)^d}.
\]

Hence the estimate (6.1) holds for $A':=A_b\cap[\delta_1N',\delta_2N']^d$ as well, provided that $\delta_1$ is small enough. We may therefore consider $A'$ in place of $A$, working in the group $\mathbb{Z}_{N'}^d$ instead. If we identify the group $\mathbb{Z}_{N'}^d$ with $[-N'/2,N'/2]^d$, then for sufficiently small $\delta_1,\delta_2$ the points of $A'$ are the same when we pass from $\mathbb{Z}_{N'}^d$ to $[-N'/2,N'/2]^d$, so there is no wrap-around issue.

6.2. Proof of the Main Theorem. To prove the theorem, suppose on the contrary that $A'$ contains fewer than $\epsilon\,\frac{N'^{d+1}}{(\log N')^{2d}}$ corners, where $\epsilon=c(\alpha)$. Then

\[
\Lambda_{d+1}(f^{(1)},\dots,f^{(d+1)}) = (N')^{-(d+1)}\sum_{x_{[d+1]}}\ \prod_{1\le i\le d}\mathbf{1}_{A'}\Big(x_1,\dots,x_{i-1},\,x_{d+1}-\sum_{1\le j\le d,\ j\ne i}x_j,\,x_{i+1},\dots,x_d\Big)\,\nu_I\ \mathbf{1}_{A'}(x_1,\dots,x_d)\cdot\nu(x_1)\cdots\nu(x_d)
\]
\[
\le \frac{1}{N'^{d+1}}\sum_{\substack{p_i\in A',\ 1\le i\le 2d\\ \text{that constitute a corner}}}\ \prod_{1\le k\le d}\mathbf{1}_{A'}(p_1,\dots,p_{k-1},p_{d+k},p_{k+1},\dots,p_d)\,\mathbf{1}_{A}(p_1,\dots,p_d)\,\nu(p_1)\cdots\nu(p_{2d})
\]
\[
\lesssim \frac{1}{N'^{d+1}}\Big(\frac{\phi(W)\log N'}{W}\Big)^{2d}\times(\text{the number of corners in }A')\ \le\ \epsilon.
\]

Now assume that $\Lambda_{d+1}(f^{(1)},\dots,f^{(d)},f^{(d+1)})\lesssim\epsilon$. Then by the relative hypergraph removal lemma there exist $E_i$, $1\le i\le d+1$, with $E_i\subseteq X_{[d+1]\setminus\{i\}}:=\tilde X_i$, such that

\[
\prod_{1\le i\le d+1}\mathbf{1}_{E_i}\equiv 0,\qquad \int_{\tilde X_i} f^{(i)}\,\mathbf{1}_{E_i^C}\,d\mu_{\tilde X_i}\ \lesssim\ \delta(\epsilon),
\]

where $\delta(\epsilon)\to0$ as $\epsilon\to0$. Let $A'=A\cap[\delta_1N,\delta_2N]^d$, $z=\sum_{1\le j\le d}x_j$, and $g_{A'}:=g\cdot\mathbf{1}_{A'}$ for any function $g$. Then

\[
\tilde\Lambda := N'^{-d}\sum_{(x_1,\dots,x_d)\in A'} f^{(1)}_{A'}(x_2,\dots,x_d,z)\,f^{(2)}_{A'}(x_1,x_3,\dots,x_d,z)\cdots f^{(d)}_{A'}(x_1,x_2,\dots,x_{d-1},z)\,f^{(d+1)}_{A'}(x_1,\dots,x_d)
\]
\[
\ge N'^{-d}\sum_{(x_1,\dots,x_d)\in A'}\nu(x_1)\cdots\nu(x_d)\ \gtrsim\ (N')^{-d}\Big(\frac{\phi(W)}{W}\log N'\Big)^d\cdot\alpha\cdot\frac{(N'W)^d}{(\phi(W)\log N')^d}=\alpha
\]

for arbitrarily large $N'$. Now

\[
\tilde\Lambda = E_{x_{[d]}}\big(f^{(1)}_{A'}\mathbf{1}_{E_1}+f^{(1)}_{A'}\mathbf{1}_{E_1^C}\big)\cdots\big(f^{(d+1)}_{A'}\mathbf{1}_{E_{d+1}}+f^{(d+1)}_{A'}\mathbf{1}_{E_{d+1}^C}\big).
\]

By the assumption we have $E_{x_{[d]}}\,f^{(1)}_{A'}\cdot\mathbf{1}_{E_1}\cdots f^{(d+1)}_{A'}\cdot\mathbf{1}_{E_{d+1}}\equiv0$, so we just need to estimate each of the other terms individually.

Consider $E_{x_{[d]}}\,f^{(1)}_{A'}\cdot\mathbf{1}_{E_1^C}\,f^{(2)}_{A'}\cdot\mathbf{1}_{E_2^{\pm}}\cdots f^{(d+1)}_{A'}\cdot\mathbf{1}_{E_{d+1}^{\pm}}$, where $F^{\pm}$ can be either $F$ or $F^C$ for any set $F$. Since $0\le f^{(j)}_{A'}\mathbf{1}_{E_j^{\pm}}\le\nu(x_j)$ for $2\le j\le d$ and $0\le f^{(d+1)}_{A'}\le1$, we have

\[
E_{x_{[d]}}\,f^{(1)}_{A'}\cdot\mathbf{1}_{E_1^C}\,f^{(2)}_{A'}\cdot\mathbf{1}_{E_2^{\pm}}\cdots f^{(d+1)}_{A'}\cdot\mathbf{1}_{E_{d+1}^{\pm}}
\ \le\ E_{x_{[d]}}\,f^{(1)}_{A'}\cdot\mathbf{1}_{E_1^C}\,\nu(x_2)\cdots\nu(x_d)
= \int_{\tilde X_1} f^{(1)}\cdot\mathbf{1}_{E_1^C}\,d\mu_{X_2}\cdots d\mu_{X_{d+1}}\ \lesssim\ \delta(\epsilon).
\]

In the same way, for any $1\le i\le d+1$,

\[
E_{x_{[d]}}\,f^{(i)}_{A'}\cdot\mathbf{1}_{E_i^C}\prod_{1\le j\le d+1,\ j\ne i}\big(f^{(j)}\cdot\mathbf{1}_{E_j^{\pm}}\big)\ \lesssim\ \delta(\epsilon).
\]

So if $N'>N(\alpha)$ then

\[
E_{x_{[d]}}\,f^{(1)}_{A'}(x_2,\dots,x_d,z)\,f^{(2)}_{A'}(x_1,x_3,\dots,x_d,z)\cdots f^{(d)}_{A'}(x_1,\dots,x_{d-1},z)\,f^{(d+1)}_{A'}(x_1,\dots,x_d)\ \lesssim\ \delta(\epsilon)=o(\alpha).
\]

This is a contradiction. Hence there are $\gtrsim\epsilon\,\frac{N'^{d+1}}{(\log N')^{2d}}$ corners in $A$. Note that the number of degenerate corners is at most $O\big(\frac{N'^d}{(\log N')^d}\big)$, since a corner is degenerate (it collapses to a single point) if and only if $x_{d+1}=\sum_{1\le j\le d}x_j$.

APPENDIX A. THE GREEN-TAO MEASURE AND PSEUDORANDOMNESS

A.1. Pseudorandom Measure Majorizing Primes.
Let us recall the von Mangoldt function, which in many problems is used to replace the indicator function of the set of primes:

\[
\Lambda(n)=\begin{cases}\log p & \text{if } n=p^k,\ k\ge1,\\ 0 & \text{otherwise.}\end{cases}
\]

The primes have local obstructions that prevent them from being truly random; $\Lambda(n)$ is concentrated on just $\phi(q)$ residue classes (mod $q$). To get rid of this kind of obstruction on all small residue classes, Green and Tao introduced a device, the so-called W-trick [7], which we recall here. Let $\omega(N)$ be a sufficiently slowly growing function of $N$ and let $W=\prod_{p\le\omega(N)}p$. If $b$ is any positive integer with $(b,W)=1$, then by the Prime Number Theorem we have $W=\exp((1+o(1))\,\omega(N))$, and $\mathbb{P}_{W,b}$ is uniformly distributed (mod $q$) for $q\le\omega(N)$.

Let $\mathbb{P}_{N,W,b}:=\{n : Wn+b\in\mathbb{P}_N\}$ and define the modified von Mangoldt function as follows.

Definition A.1. For any fixed $(b,W)=1$, let

\[
\Lambda_b(n)=\begin{cases}\frac{\phi(W)}{W}\log(Wn+b) & \text{if } Wn+b \text{ is prime,}\\ 0 & \text{otherwise;}\end{cases}
\]

moreover define

\[
\Lambda^d_b(x_1,\dots,x_d)=\Lambda_{b_1}(x_1)\cdots\Lambda_{b_d}(x_d),\qquad b=(b_1,\dots,b_d)\in\mathbb{Z}_N^{\times d}.
\]

Note that by the Prime Number Theorem in arithmetic progressions we have $E_{n\le N}\Lambda_b(n)\sim1$. Let us now recall the definition of the Green-Tao measure introduced in [7].

Definition A.2 (Goldston-Yildirim sum). [11], [7]

\[
\Lambda_R(n)=\sum_{d\mid n,\ d\le R}\mu(d)\log\frac{R}{d}.
\]

We may take $R=N^{d^{-1}2^{-d-c}}$ for a fixed positive integer $c$.

Definition A.3 (Green-Tao measure). For given small parameters $1\ge\delta_1,\delta_2>0$, define a function $\nu_{\delta_1,\delta_2}:\mathbb{Z}_N\to\mathbb{R}$ by

\[
\nu_{\delta_1,\delta_2}(n)=\nu(n)=\begin{cases}\frac{\phi(W)}{W}\,\frac{\Lambda_R(Wn+b)^2}{\log R} & \text{if } \delta_1N\le n\le\delta_2N,\\ 1 & \text{otherwise.}\end{cases}
\]

It is immediate from the definition that $\nu(n)\ge d^{-1}2^{-d-c-1}\Lambda_b(n)$ for sufficiently large $N$. Finally, let us recall the pseudorandomness properties used here, summarized in the following definitions.

Definition A.4 (Linear forms condition). Let $m_0,t_0\in\mathbb{N}$ be parameters. We say that $\nu$ satisfies the $(m_0,t_0)$-linear forms condition if the following holds for any $m\le m_0$, $t\le t_0$. Let $\{a_{ij}\}_{1\le i\le m,\ 1\le j\le t}$ be integers and $b_i\in\mathbb{Z}_N$. Let $L_i:\mathbb{Z}_N^t\to\mathbb{Z}_N$, $1\le i\le m$, be $m$ (affine) linear forms with $L_i(x)=\sum_{1\le j\le t}a_{ij}x_j+b_i$, such that each $L_i$ is nonzero and the forms are pairwise linearly independent over the rationals. Then

\[
E\Big(\prod_{1\le i\le m}\nu(L_i(x)) : x\in\mathbb{Z}_N^t\Big)=1+o_{N\to\infty,\,m_0,\,t_0}(1).
\]

Definition A.5 (Correlation condition). We say that a measure $\nu$ satisfies the $(m_0,m_1,\dots,m_{l_2})$-correlation condition if there is a function $\tau:\mathbb{Z}_N\to\mathbb{R}_+$ such that

(1) $E(\tau(x)^m : x\in\mathbb{Z}_N)=O_m(1)$ for any $m\in\mathbb{Z}_+$.

(2) Suppose
• $\phi_i,\psi^{(k)}:\mathbb{Z}_N^t\to\mathbb{Z}_N$ ($1\le i\le l_1$, $1\le k\le l_2$, $l_1+l_2\le m_0$) are all pairwise linearly independent (over $\mathbb{Z}$) linear forms;
• for each $1\le g\le l_2$ and $1\le j<j'\le m_g$ we have $a^{(g)}_j\ne0$, and $a^{(g)}_j\psi^{(g)}(x)+h^{(g)}_j$, $a^{(g)}_{j'}\psi^{(g)}(x)+h^{(g)}_{j'}$ are different (affine) linear forms.
Then we have

\[
E_{x\in\mathbb{Z}_N^t}\ \prod_{k=1}^{l_1}\nu(\phi_k(x))\ \prod_{k=1}^{l_2}\prod_{j=1}^{m_k}\nu\big(a^{(k)}_j\psi^{(k)}(x)+h^{(k)}_j\big)\ \le\ \prod_{k=1}^{l_2}\ \sum_{1\le j\,\cdots}\cdots
\]

The Green-Tao measure $\nu$ satisfies the linear forms and correlation conditions for any parameters that may depend on $d$ or $\alpha$ (but not on $N$).

The proof of the linear forms condition is given in [7], as well as the proof of a slightly simpler form of the correlation condition. The above correlation condition is essentially the same as the one given in [3], Proposition 4. In fact the argument there works without any modification, save for a minor change in calculating the so-called local factors; see Lemma 3 there. We omit the details.

REFERENCES

[1] D. Conlon, J. Fox, Y. Zhao, A relative Szemerédi theorem, preprint.
[2] B. Cook, A. Magyar, T. Titichetrakun, A multidimensional Szemerédi theorem in the primes, preprint.
[3] B. Cook, A. Magyar, Constellations in $\mathbb{P}^d$, International Mathematics Research Notices 2012.12 (2012), 2794-2816.
[4] H. Furstenberg, Y. Katznelson, An ergodic Szemerédi theorem for commuting transformations, J. Analyse Math. 31 (1978), 275-291.
[5] W.T. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Annals of Math. 166/3 (2007), 897-946.
[6] W.T. Gowers, Decompositions, approximate structure, transference, and the Hahn-Banach theorem, Bull. London Math. Soc. 42 (4) (2010), 573-606.
[7] B. Green and T. Tao, The primes contain arbitrarily long arithmetic progressions, Annals of Math. 167 (2008), 481-547.
[8] B. Green and T. Tao, Linear equations in primes, Annals of Math. (2) 171.3 (2010), 1753-1850.
[9] B. Green and T. Tao, The Möbius function is strongly orthogonal to nilsequences, Annals of Math. (2) 175 (2012), 541-566.
[10] B. Green, T. Tao, T. Ziegler, An inverse theorem for the Gowers $U^{s+1}[N]$ norm, Annals of Math. 176 (2012), no. 2, 1231-1372.
[11] D. Goldston, C. Yildirim, Higher correlations of divisor sums related to primes I: triple correlations, Integers: Electronic Journal of Combinatorial Number Theory 3 (2003), 1-66.
[12] D. Goldston, C. Yildirim, Higher correlations of divisor sums related to primes III: small gaps between primes, Proc. London Math. Soc. 95 (2007), 653-686.
[13] B. Nagle, V. Rödl, M. Schacht, The counting lemma for regular k-uniform hypergraphs, Random Structures and Algorithms 28(2) (2006), 113-179.
[14] O. Reingold, L. Trevisan, M. Tulsiani, S. Vadhan, Dense subsets of pseudorandom sets, Electronic Colloquium on Computational Complexity, Report TR08-045 (2008).
[15] J. Solymosi, Note on a generalization of Roth's theorem, Discrete and Computational Geometry, Algorithms Combin. 25 (2003), 825-827.
[16] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27 (1975), 299-345.
[17] T. Tao, The ergodic and combinatorial approaches to Szemerédi's theorem, Centre de Recherches Mathématiques CRM Proceedings and Lecture Notes 43 (2007), 145-193.
[18] T. Tao, The Gaussian primes contain arbitrarily shaped constellations, J. Analyse Math. 99/1 (2006), 109-176.
[19] T. Tao, A variant of the hypergraph removal lemma, Journal of Combinatorial Theory, Series A 113.7 (2006), 1257-1280.
[20] T. Tao and T. Ziegler, A multidimensional Szemerédi theorem for the primes via a correspondence principle, preprint.

E-mail address: [email protected]
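Numerical illustration of the W-trick. The normalisation $E_{n\le N}\Lambda_b(n)\sim1$ noted after Definition A.1 can be checked on a computer. The Python sketch below is only an illustration and plays no role in the proofs. It makes several assumptions that are not in the text: the function names `primes_up_to` and `w_trick_average` are hypothetical, and a fixed small threshold `w` stands in for the slowly growing function $\omega(N)$, so that $W$ is the product of the primes up to `w`.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes; flags[k] == 1 exactly when k is prime."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            # Clear every multiple of p starting from p*p.
            count = len(range(p * p, limit + 1, p))
            flags[p * p :: p] = bytearray(count)
    return flags


def w_trick_average(w, b, N):
    """Average of Lambda_b(n) over 1 <= n <= N, as in Definition A.1.

    Assumption for this sketch: W is the product of all primes p <= w,
    with w a fixed small number replacing omega(N).
    """
    small = primes_up_to(w)
    small_primes = [p for p in range(2, w + 1) if small[p]]
    W = math.prod(small_primes)
    if math.gcd(b, W) != 1:
        raise ValueError("b must be coprime to W")
    # W is squarefree, so phi(W) is the product of (p - 1) over p | W.
    phi_W = math.prod(p - 1 for p in small_primes)
    is_prime = primes_up_to(W * N + b)
    total = sum(
        math.log(W * n + b) for n in range(1, N + 1) if is_prime[W * n + b]
    )
    return (phi_W / W) * total / N


if __name__ == "__main__":
    # W = 2 * 3 * 5 = 30; the averages should be close to 1 for each class b.
    for b in (1, 7, 11, 13):
        print(b, round(w_trick_average(5, b, 20000), 4))
```

For each residue $b$ coprime to $W$ the printed average should be close to 1, in line with the Prime Number Theorem in arithmetic progressions; the factor $\phi(W)/W$ is exactly what compensates for restricting to one admissible class modulo $W$.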