An arithmetic transference proof of a relative Szemerédi theorem
arXiv [math.NT]
YUFEI ZHAO
Abstract.
Recently, Conlon, Fox, and the author gave a new proof of a relative Szemerédi theorem, which was the main novel ingredient in the proof of the celebrated Green-Tao theorem that the primes contain arbitrarily long arithmetic progressions. Roughly speaking, a relative Szemerédi theorem says that if $S$ is a set of integers satisfying certain conditions, and $A$ is a subset of $S$ with positive relative density, then $A$ contains long arithmetic progressions, and our recent results show that $S$ only needs to satisfy a so-called linear forms condition.

This note contains an alternative proof of the new relative Szemerédi theorem, where we directly transfer Szemerédi's theorem, instead of going through the hypergraph removal lemma. This approach provides a somewhat more direct route to establishing the result, and it gives better quantitative bounds.

The proof has three main ingredients: (1) a transference principle/dense model theorem of Green-Tao and Tao-Ziegler (with simplified proofs given later by Gowers, and independently, Reingold-Trevisan-Tulsiani-Vadhan) applied with a discrepancy/cut-type norm (instead of a Gowers uniformity norm as it was applied in earlier works), (2) a counting lemma established by Conlon, Fox, and the author, and (3) Szemerédi's theorem as a black box.

1. Introduction
The celebrated Green-Tao theorem [6] states that the primes contain arbitrarily long arithmetic progressions (APs). A key ingredient in their work is a relative Szemerédi theorem. Szemerédi's theorem [14] states that any subset of the integers with positive upper density contains arbitrarily long arithmetic progressions. A relative Szemerédi theorem is a result where the ground set is no longer $\mathbb{Z}$ but some sparse pseudorandom subset (or more generally some measure).

Green and Tao proved a relative Szemerédi theorem provided that the ground set satisfies certain pseudorandomness conditions known as the linear forms condition and the correlation condition. They then constructed a majorizing measure to the primes, using ideas from the work of Goldston and Yıldırım [2] (subsequently simplified in [15]), so that this majorizing measure satisfies the desired pseudorandomness conditions.

Recently, Conlon, Fox, and the author [1] gave a new proof of Green and Tao's relative Szemerédi theorem, requiring simpler pseudorandomness hypotheses on the ground set. We showed that a weak version of Green and Tao's linear forms condition is sufficient. A precise definition of our linear forms condition will be given in §2.

In this note we give an alternative proof that transfers Szemerédi's theorem directly. The first step uses a dense model theorem to approximate a subset of a sparse pseudorandom set of integers by a dense subset. The dense model is a good approximation of the original set with respect to a discrepancy-type norm (similar to the cut metric for graphs). This contrasts with previous proofs of the Green-Tao theorem [6, 5, 10], where the dense model theorem is applied with respect to the Gowers uniformity norm, which gives a stronger notion of approximation.

Another important ingredient in the proof is the relative counting lemma of [1], which implies that the dense model behaves similarly to the original set in the number of arithmetic progressions.

The arithmetic transference approach presented here establishes the new relative Szemerédi theorem in a more direct fashion. It also gives better quantitative bounds than [1]. Indeed, instead of going through the hypergraph removal lemma, which currently has an Ackermann-type dependence on the bounds (due to the application of the hypergraph regularity lemma), we can now use Szemerédi's theorem as a black box and automatically transfer the best quantitative bounds available (currently the state of the art is [13] for 3-term APs, [7] for 4-term APs, and [3] for longer APs). This answers a question that was left open in [1]. The approach presented here, however, is less general than that of [1], since it does not provide a relative hypergraph removal lemma, nor does it give a more general sparse regularity approach to hypergraphs.

The main theorem is stated in §2. In §3, we apply a dense model theorem to find a dense approximation of the original set. In §4, we apply a counting lemma to show that the dense model has approximately the same number of $k$-term APs as the original set. Finally in §5, we put everything together and apply Szemerédi's theorem as a black box to conclude the proof.

This work was done while the author was an intern at Microsoft Research New England.

2. Definitions and results
Notation.

Dependence on $N$. We consider functions $\nu = \nu^{(N)}$, where $N$ (usually suppressed) is assumed to be some large integer. We write $o(1)$ for a quantity that tends to zero as $N \to \infty$ along some subset of $\mathbb{Z}$. If the rate at which the quantity tends to zero depends on some other parameters (e.g., $k, \delta$), then we put these parameters in the subscript (e.g., $o_{k,\delta}(1)$).

Expectation. We write $\mathbb{E}[f(x_1, x_2, \dots) \mid P]$ for the expectation of $f(x_1, x_2, \dots)$ when the variables are chosen uniformly out of all possibilities satisfying $P$.

We shall use, as a black box, the following weighted version of Szemerédi's theorem as formulated, for example, in [6, Prop. 2.3]. It may be helpful to think of $f$ as the indicator function $1_A$ of some set $A \subseteq \mathbb{Z}_N$. It will be easier for us to work in $\mathbb{Z}_N := \mathbb{Z}/N\mathbb{Z}$ as opposed to $[N] := \{1, \dots, N\}$, although these two settings are easily seen to be equivalent.

Theorem 2.1 (Szemerédi's theorem, weighted version). Let $k \ge 3$ and $0 < \delta \le 1$ be fixed. Let $f \colon \mathbb{Z}_N \to [0,1]$ be a function satisfying $\mathbb{E}[f] \ge \delta$. Then
\[
\mathbb{E}[f(x) f(x+d) f(x+2d) \cdots f(x+(k-1)d) \mid x, d \in \mathbb{Z}_N] \ge c(k,\delta) - o_{k,\delta}(1) \tag{1}
\]
for some constant $c(k,\delta) > 0$ which does not depend on $f$ or $N$.

Gowers' results [3] (along with a Varnavides-type [19] averaging argument) imply that Theorem 2.1 holds with $c(k,\delta) = \exp(-\exp(\delta^{-c_k}))$ with $c_k = 2^{2^{k+9}}$ (see [13] and [7] for the current best bounds for $k = 3$ and $4$ respectively).

A relative Szemerédi theorem is an extension of Theorem 2.1 of the following form. Instead of $0 \le f \le 1$, we assume $0 \le f(x) \le \nu(x)$ for all $x \in \mathbb{Z}_N$, where $\nu \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ is some function (also called a majorizing measure) that satisfies certain pseudorandomness conditions. Here the function $\nu$ is normalized so that $\mathbb{E}[\nu] = 1 + o(1)$. For instance, one can think of $\nu$ as $\frac{N}{|S|} 1_S$ for some pseudorandom subset $S \subseteq \mathbb{Z}_N$, and $f$ as $1_A \nu$ with some $A \subseteq S$. So in this case (1) says that $A$ contains many $k$-term APs when $N$ is sufficiently large.

As in [1], the pseudorandomness condition that we assume on $\nu$ is the linear forms condition of Definition 2.2.
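The averages in (1) and the normalization $\mathbb{E}[\nu] = 1$ can be made concrete on a toy example. The following is a minimal Python sketch, not part of the original argument: the helper names `ap_density` and `normalized_indicator`, the modulus $N = 7$, and the sets `S` and `A` are all hypothetical choices made for illustration, and a set this small is of course not pseudorandom in any meaningful sense.

```python
from fractions import Fraction
from itertools import product


def ap_density(f, k):
    """Return E[f(x) f(x+d) ... f(x+(k-1)d) | x, d in Z_N], with f a list indexed by Z_N."""
    n = len(f)
    total = Fraction(0)
    for x, d in product(range(n), repeat=2):
        term = Fraction(1)
        for i in range(k):
            term *= f[(x + i * d) % n]
        total += term
    return total / (n * n)


def normalized_indicator(s, n):
    """Return nu = (N/|S|) 1_S as a list, so that E[nu] = 1 exactly."""
    return [Fraction(n, len(s)) if x in s else Fraction(0) for x in range(n)]


# Hypothetical toy data (illustration only).
N, k = 7, 3
S = {0, 1, 2, 4}
A = {0, 1, 2}

nu = normalized_indicator(S, N)
f = [nu[x] if x in A else Fraction(0) for x in range(N)]  # f = 1_A * nu, so 0 <= f <= nu

mean_nu = sum(nu) / N          # equals 1 by construction
density = ap_density(f, k)     # weighted count of 3-term APs in A, including d = 0
```

Exact rational arithmetic via `Fraction` is used so that the normalization holds exactly rather than up to floating-point error. Note that the average includes the degenerate progressions with $d = 0$, exactly as the expectation in (1) does.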
Definition 2.2 (Linear forms condition). A nonnegative function $\nu = \nu^{(N)} \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ is said to obey the $k$-linear forms condition if one has
\[
\mathbb{E}\Bigl[ \prod_{j=1}^{k} \prod_{\omega \in \{0,1\}^{[k] \setminus \{j\}}} \nu\Bigl( \sum_{i=1}^{k} (i-j)\, x_i^{(\omega_i)} \Bigr)^{n_{j,\omega}} \Bigm| x_1^{(0)}, x_1^{(1)}, \dots, x_k^{(0)}, x_k^{(1)} \in \mathbb{Z}_N \Bigr] = 1 + o(1) \tag{2}
\]
for any choice of exponents $n_{j,\omega} \in \{0,1\}$.

Example 2.3. For $k = 3$, condition (2) says that
\[
\begin{aligned}
\mathbb{E}[\, &\nu(y+2z)\,\nu(y'+2z)\,\nu(y+2z')\,\nu(y'+2z') \\
&\cdot \nu(-x+z)\,\nu(-x'+z)\,\nu(-x+z')\,\nu(-x'+z') \\
&\cdot \nu(-2x-y)\,\nu(-2x'-y)\,\nu(-2x-y')\,\nu(-2x'-y') \mid x, x', y, y', z, z' \in \mathbb{Z}_N \,] = 1 + o(1)
\end{aligned}
\]
and similar conditions hold if one or more of the twelve $\nu$ factors in the expectation are erased.

The main result of this note is the following theorem.

Theorem 2.4 (Relative Szemerédi theorem). Let $k \ge 3$ and $0 < \delta \le 1$ be fixed. Let $\nu \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ satisfy the $k$-linear forms condition. Assume that $N$ is sufficiently large and relatively prime to $(k-1)!$. Let $f \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ satisfy $0 \le f(x) \le \nu(x)$ for all $x \in \mathbb{Z}_N$ and $\mathbb{E}[f] \ge \delta$. Then
\[
\mathbb{E}[f(x) f(x+d) f(x+2d) \cdots f(x+(k-1)d) \mid x, d \in \mathbb{Z}_N] \ge c(k,\delta) - o_{k,\delta}(1), \tag{3}
\]
where $c(k,\delta)$ is the same constant which appears in Theorem 2.1. The rate at which the $o_{k,\delta}(1)$ term goes to zero depends not only on $k$ and $\delta$ but also on the rate of convergence in the $k$-linear forms condition for $\nu$.

This theorem was proved in [1] without the additional conclusion that $c(k,\delta)$ can be taken to be the same as in Theorem 2.1. Indeed, the proof in [1] uses the hypergraph removal lemma as a black box, so that the constants $c(k,\delta)$ there are much worse, with an Ackermann-type dependence due to the use of hypergraph regularity. In [6], Green and Tao also transferred Szemerédi's theorem directly to obtain the same constants $c(k,\delta)$ as in Theorem 2.1, but under stronger pseudorandomness hypotheses for $\nu$. So Theorem 2.4 combines the conclusions of the two relative Szemerédi theorems in [1] and [6].

3. Dense model theorem
In this section, we show that the $f$ in Theorem 2.4 can be modeled by a function $\tilde f \colon \mathbb{Z}_N \to [0,1]$. We work in a finite abelian group $G$ (written additively), but there is no loss in thinking $G = \mathbb{Z}_N$. For $x = (x_1, \dots, x_r) \in G^r$ and $I \subseteq [r]$, we write $x_I = (x_i)_{i \in I}$.

Definition 3.1. Let $G$ be a finite abelian group, $r$ be a positive integer, $\psi \colon G^r \to G$ be a surjective homomorphism, and $f, \tilde f \colon G \to \mathbb{R}_{\ge 0}$ be two functions. We say that $(f, \tilde f)$ is an $(r, \epsilon)$-discrepancy pair with respect to $\psi$ if
\[
\Bigl| \mathbb{E}\Bigl[ \bigl(f(\psi(x)) - \tilde f(\psi(x))\bigr) \prod_{i=1}^{r} u_i(x_{[r] \setminus \{i\}}) \Bigm| x \in G^r \Bigr] \Bigr| \le \epsilon \tag{4}
\]
for all collections of functions $u_1, \dots, u_r \colon G^{r-1} \to [0,1]$.

Example 3.2. When $r = 2$ and $\psi(x, y) = x + y$, (4) says
\[
\bigl| \mathbb{E}[(f(x+y) - \tilde f(x+y))\, u_1(y)\, u_2(x) \mid x, y \in G] \bigr| \le \epsilon.
\]
In other words, this says that the two weighted graphs $g, \tilde g \colon G \times G \to \mathbb{R}_{\ge 0}$ given by $g(x, y) = f(x+y)$ and $\tilde g(x, y) = \tilde f(x+y)$ satisfy $\|g - \tilde g\|_{\square} \le \epsilon$, where $\|\cdot\|_{\square}$ is the cut norm for bipartite graphs.

When $r = 3$ and $\psi(x, y, z) = x + y + z$, (4) says
\[
\bigl| \mathbb{E}[(f(x+y+z) - \tilde f(x+y+z))\, u_1(y, z)\, u_2(x, z)\, u_3(x, y) \mid x, y, z \in G] \bigr| \le \epsilon.
\]
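For very small groups, the $r = 2$ case of (4) can be evaluated by brute force, which may help in building intuition for the cut norm. The sketch below is an illustration under stated assumptions, not part of the paper: the function name `discrepancy_r2` and the toy data are hypothetical, and $G = \mathbb{Z}_N$ with $\psi(x, y) = x + y$ is assumed. It uses the fact that the expression inside the absolute value is linear in each of $u_1$ and $u_2$ separately, so the supremum over $[0,1]$-valued functions is attained at $\{0,1\}$-valued ones; for a fixed $u_1$, the best $u_2$ selects either all $x$ with positive row sum or all $x$ with negative row sum.

```python
from fractions import Fraction
from itertools import product


def discrepancy_r2(f, g):
    """Smallest eps for which (f, g) is a (2, eps)-discrepancy pair w.r.t. psi(x, y) = x + y on Z_N.

    Brute force over u1 : Z_N -> {0, 1}; the optimal u2 is then read off from the row sums.
    Runs in time about 2^N * N^2, so it is only meant for tiny N.
    """
    n = len(f)
    diff = [Fraction(f[t]) - Fraction(g[t]) for t in range(n)]
    best = Fraction(0)
    for u1 in product((0, 1), repeat=n):
        # row[x] = sum over y with u1(y) = 1 of (f - g)(x + y)
        row = [sum(diff[(x + y) % n] for y in range(n) if u1[y]) for x in range(n)]
        pos = sum(v for v in row if v > 0)    # u2 = indicator of {x : row[x] > 0}
        neg = -sum(v for v in row if v < 0)   # u2 = indicator of {x : row[x] < 0}
        best = max(best, pos, neg)
    return Fraction(best) / (n * n)


# Hypothetical toy data: nu = (N/|S|) 1_S for the very structured set S = {0, 2} in Z_4.
nu = [2, 0, 2, 0]
one = [1, 1, 1, 1]
eps = discrepancy_r2(nu, one)
```

In this toy case the even residues form a subgroup, and taking $u_1$ and $u_2$ to be its indicator already witnesses a discrepancy of $1/4$, so $(\nu, 1)$ is far from being a discrepancy pair with small $\epsilon$. This is the kind of structure that the pseudorandomness hypothesis on $\nu$ is designed to exclude.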
The following key lemma says that any $0 \le f \le \nu$ can be approximated by a $0 \le \tilde f \le 1$.

Lemma 3.3. For every $\epsilon > 0$ there is an $\epsilon' = \exp(-\epsilon^{-O(1)})$ such that the following holds:

Let $G$ be a finite abelian group, $r$ be a positive integer, and $\psi \colon G^r \to G$ be a surjective homomorphism. Let $f, \nu \colon G \to \mathbb{R}_{\ge 0}$ be such that $0 \le f \le \nu$, $\mathbb{E}[f] \le 1$, and $(\nu, 1)$ is an $(r, \epsilon')$-discrepancy pair with respect to $\psi$. Then there exists a function $\tilde f \colon G \to [0,1]$ so that $\mathbb{E}[\tilde f] = \mathbb{E}[f]$ and $(f, \tilde f)$ is an $(r, \epsilon)$-discrepancy pair with respect to $\psi$.

The proof of Lemma 3.3 uses the dense model theorem of Green-Tao [6] and Tao-Ziegler [18], which was later simplified in [5] and [10]. The expository note [9] has a nice and short write-up of the proof of the dense model theorem, and we quote the statement from there.

Let $X$ be a finite set. For any two functions $f, g \colon X \to \mathbb{R}$, we write $\langle f, g \rangle = \mathbb{E}[f(x) g(x) \mid x \in X]$. For $\mathcal{F}$ a collection of functions $\varphi \colon X \to [-1,1]$, we write $\mathcal{F}^k$ to mean the collection of all functions of the form $\prod_{i=1}^{k'} \varphi_i$, where $\varphi_i \in \mathcal{F}$ and $k' \le k$. In particular, if $\mathcal{F}$ is closed under multiplication, then $\mathcal{F}^k = \mathcal{F}$.

Lemma 3.4 (Green-Tao-Ziegler dense model theorem). For every $\epsilon > 0$, there is a $k = (1/\epsilon)^{O(1)}$ and an $\epsilon' = \exp(-(1/\epsilon)^{O(1)})$ such that the following holds:

Suppose that $\mathcal{F}$ is a collection of functions $\varphi \colon X \to [-1,1]$ on a finite set $X$, $\nu \colon X \to \mathbb{R}_{\ge 0}$ satisfies $|\langle \nu - 1, \varphi \rangle| \le \epsilon'$ for all $\varphi \in \mathcal{F}^k$, and $f \colon X \to \mathbb{R}_{\ge 0}$ satisfies $f \le \nu$ and $\mathbb{E}[f] \le 1$. Then there is a function $\tilde f \colon X \to [0,1]$ such that $\mathbb{E}[\tilde f] = \mathbb{E}[f]$, and $|\langle f - \tilde f, \varphi \rangle| \le \epsilon$ for all $\varphi \in \mathcal{F}$.

We shall use Lemma 3.4 with $\mathcal{F}$ closed under multiplication, so that $k$ plays no role. This is an important point in our simplification over previous approaches using the dense model theorem.

Proof of Lemma 3.3. For any collection of functions $u_1, \dots, u_r \colon G^{r-1} \to \mathbb{R}$, define a generalized convolution $(u_1, \dots, u_r)^*_\psi \colon G \to \mathbb{R}$ by
\[
(u_1, \dots, u_r)^*_\psi(x) = \mathbb{E}\Bigl[ \prod_{i=1}^{r} u_i(y_{[r] \setminus \{i\}}) \Bigm| y \in G^r,\ \psi(y) = x \Bigr].
\]
Then the left-hand side of (4) can be written as $|\langle f - \tilde f, (u_1, \dots, u_r)^*_\psi \rangle|$. Let $\mathcal{F}$ be the set of functions which can be obtained by convex combinations of functions of the form $(u_1, \dots, u_r)^*_\psi$, varying over all combinations of functions $u_1, \dots, u_r \colon G^{r-1} \to [0,1]$ (but $\psi$ is fixed). Then $(f, \tilde f)$ being an $(r, \epsilon)$-discrepancy pair with respect to $\psi$ is equivalent to $|\langle f - \tilde f, \varphi \rangle| \le \epsilon$ for all $\varphi \in \mathcal{F}$. The desired claim would then follow from Lemma 3.4 and the triangle inequality provided we can show that $\mathcal{F}$ is closed under multiplication. It suffices to show that for $u_1, \dots, u_r, u'_1, \dots, u'_r \colon G^{r-1} \to [0,1]$, the product of $(u_1, \dots, u_r)^*_\psi$ and $(u'_1, \dots, u'_r)^*_\psi$ still lies in $\mathcal{F}$. Indeed, we have
\[
\begin{aligned}
(u_1, \dots, u_r)^*_\psi(x)\, (u'_1, \dots, u'_r)^*_\psi(x)
&= \mathbb{E}\Bigl[ \prod_{i=1}^{r} u_i(y_{[r] \setminus \{i\}})\, u'_i(y'_{[r] \setminus \{i\}}) \Bigm| y, y' \in G^r,\ \psi(y) = \psi(y') = x \Bigr] \\
&= \mathbb{E}\Bigl[ \prod_{i=1}^{r} u_i(y_{[r] \setminus \{i\}})\, u'_i(y_{[r] \setminus \{i\}} + z_{[r] \setminus \{i\}}) \Bigm| y, z \in G^r,\ \psi(y) = x,\ \psi(z) = 0 \Bigr] \\
&= \mathbb{E}\bigl[ (v_{1, z_{[r] \setminus \{1\}}}, v_{2, z_{[r] \setminus \{2\}}}, \dots, v_{r, z_{[r] \setminus \{r\}}})^*_\psi(x) \bigm| z \in G^r,\ \psi(z) = 0 \bigr],
\end{aligned}
\]
where $v_{i, z_{[r] \setminus \{i\}}} \colon G^{r-1} \to [0,1]$ is defined by $v_{i, z_{[r] \setminus \{i\}}}(y_{[r] \setminus \{i\}}) = u_i(y_{[r] \setminus \{i\}})\, u'_i(y_{[r] \setminus \{i\}} + z_{[r] \setminus \{i\}})$. This shows that the product of two such generalized convolutions is a convex combination of generalized convolutions, so that $\mathcal{F}$ is closed under multiplication. ∎

4. Counting lemma
Next we show that if $(f, \tilde f)$ is a $(k-1, \epsilon)$-discrepancy pair, with $f \le \nu$ and $\tilde f \le 1$, then $f$ and $\tilde f$ have a similar number of (weighted) $k$-term APs. This is a special case of the counting lemma for sparse hypergraphs from [1], whose self-contained proof takes up about 4 pages [1, Sec. 6].

Lemma 4.1 ($k$-AP counting lemma). For every $k \ge 3$ and $\gamma > 0$, there exists an $\epsilon > 0$ so that the following holds.

Let $\nu, f, \tilde f \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ be functions. Suppose that $\nu$ satisfies the $k$-linear forms condition and $N$ is sufficiently large. Suppose also that $0 \le f \le \nu$, $0 \le \tilde f \le 1$, and $(f, \tilde f)$ is a $(k-1, \epsilon)$-discrepancy pair with respect to each of $\psi_1, \dots, \psi_k$, where $\psi_j \colon \mathbb{Z}_N^{k-1} \to \mathbb{Z}_N$ is defined by
\[
\psi_j(x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_k) := \sum_{i \in [k] \setminus \{j\}} (i-j)\, x_i.
\]
Then
\[
\Bigl| \mathbb{E}\Bigl[ \prod_{i=0}^{k-1} f(a + id) \Bigm| a, d \in \mathbb{Z}_N \Bigr] - \mathbb{E}\Bigl[ \prod_{i=0}^{k-1} \tilde f(a + id) \Bigm| a, d \in \mathbb{Z}_N \Bigr] \Bigr| \le \gamma. \tag{5}
\]

Let us explain why Lemma 4.1 is a special case of [1, Thm. 2.17]. We use the hypergraph notation from [1, Sec. 2]. Let $V = (J, (V_j)_{j \in J}, k-1, H)$ be a hypergraph system, where $J = [k]$, $V_j = \mathbb{Z}_N$ for every $j \in J$, and $H = \binom{J}{k-1}$ (corresponding to a simplex). Let $(\nu_e)_{e \in H}$, $(g_e)_{e \in H}$, and $(\tilde g_e)_{e \in H}$ be weighted hypergraphs on $V$ defined by
\[
\begin{aligned}
\nu_{[k] \setminus \{j\}}(x_{[k] \setminus \{j\}}) &= \nu(\psi_j(x_{[k] \setminus \{j\}})), \\
g_{[k] \setminus \{j\}}(x_{[k] \setminus \{j\}}) &= f(\psi_j(x_{[k] \setminus \{j\}})), \\
\tilde g_{[k] \setminus \{j\}}(x_{[k] \setminus \{j\}}) &= \tilde f(\psi_j(x_{[k] \setminus \{j\}}))
\end{aligned}
\]
for $j \in [k]$ and $x_{[k] \setminus \{j\}} \in V_{[k] \setminus \{j\}} = \mathbb{Z}_N^{k-1}$. Then the weighted hypergraph $(\nu_e)_{e \in H}$ satisfies the $H$-linear forms condition [1, Def. 2.8] (which is equivalent to $\nu \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$ satisfying the $k$-linear forms condition). That $(f, \tilde f)$ is a $(k-1, \epsilon)$-discrepancy pair with respect to $\psi_j$ is equivalent to $(g_{[k] \setminus \{j\}}, \tilde g_{[k] \setminus \{j\}})$ being an $\epsilon$-discrepancy pair as weighted hypergraphs [1, Def. 2.13]. Note that
\[
\mathbb{E}\Bigl[ \prod_{i=0}^{k-1} f(a + id) \Bigm| a, d \in \mathbb{Z}_N \Bigr] = \mathbb{E}\Bigl[ \prod_{e \in H} g_e(x_e) \Bigm| x \in V_J \Bigr]
\]
(to see this, let $a = \psi_1(x_2, \dots, x_k)$ and $d = -(x_1 + \cdots + x_k)$, so that $\psi_j(x_{[k] \setminus \{j\}}) = a + (j-1)d$ for each $j \in [k]$), and similarly with $\tilde f$ and $\tilde g_e$. Then the relative hypergraph counting lemma [1, Thm. 2.17] reduces to Lemma 4.1.

5. Proof of the relative Szemerédi theorem
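The maps $\psi_j$ of Lemma 4.1 drive both the reduction to the hypergraph counting lemma and the proof that follows, so it can be reassuring to check the underlying change of variables numerically. The following Python sketch is an illustration only: the names `psi`, `ap_count`, and `simplex_count`, the modulus $N = 5$, and the weights `f` are hypothetical. It verifies on a small example that averaging $\prod_j f(\psi_j(x_{[k] \setminus \{j\}}))$ over $x \in \mathbb{Z}_N^k$ gives the same value as averaging $\prod_{i=0}^{k-1} f(a + id)$ over $a, d \in \mathbb{Z}_N$.

```python
from fractions import Fraction
from itertools import product
from math import factorial, gcd


def psi(j, x):
    """psi_j applied to x = (x_1, ..., x_k): the sum over i != j of (i - j) * x_i (x_j is ignored)."""
    k = len(x)
    return sum((i - j) * x[i - 1] for i in range(1, k + 1) if i != j)


def ap_count(f, k):
    """E[ prod_{i=0}^{k-1} f(a + i d) | a, d in Z_N ]."""
    n = len(f)
    total = Fraction(0)
    for a, d in product(range(n), repeat=2):
        term = Fraction(1)
        for i in range(k):
            term *= f[(a + i * d) % n]
        total += term
    return total / n**2


def simplex_count(f, k):
    """E[ prod_{j=1}^{k} f(psi_j(x)) | x in Z_N^k ], the simplex count of the weighted hypergraph."""
    n = len(f)
    total = Fraction(0)
    for x in product(range(n), repeat=k):
        term = Fraction(1)
        for j in range(1, k + 1):
            term *= f[psi(j, x) % n]
        total += term
    return total / n**k


# Hypothetical toy weights on Z_5; N = 5 is coprime to (k-1)! for k = 3 and k = 4.
N, k = 5, 3
f = [Fraction(1), Fraction(0), Fraction(5, 2), Fraction(1), Fraction(0)]

lhs = ap_count(f, k)
rhs = simplex_count(f, k)
```

The two averages agree because $x \mapsto (a, d)$ is a surjective homomorphism $\mathbb{Z}_N^k \to \mathbb{Z}_N^2$, so $(a, d)$ is uniformly distributed when $x$ is. The coprimality of $N$ and $(k-1)!$ is not needed for this particular identity; it enters the proof of Theorem 2.4 through the invertibility of $1, \dots, k-1$ in $\mathbb{Z}_N$.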
Proof of Theorem 2.4.
We begin with the following simple observation: for any $g, g' \colon \mathbb{Z}_N \to \mathbb{R}_{\ge 0}$, if $(g, g')$ is a $(k-1, \epsilon)$-discrepancy pair with respect to one $\psi_j$ from Lemma 4.1, then it is so with respect to all $\psi_j$. This is simply because $1, 2, \dots, k-1$ are invertible in $\mathbb{Z}_N$, as $N$ is coprime to $(k-1)!$, so that a change of variables converts $\psi_j$ to another $\psi_{j'}$.

The linear forms condition on $\nu$ implies that $(\nu, 1)$ is a $(k-1, o(1))$-discrepancy pair with respect to $\psi_1$ from Lemma 4.1. Indeed, we have the following inequality
\[
\Bigl| \mathbb{E}\Bigl[ (\nu(\psi(x)) - 1) \prod_{i=1}^{r} u_i(x_{[r] \setminus \{i\}}) \Bigm| x \in G^r \Bigr] \Bigr| \le \mathbb{E}\Bigl[ \prod_{\omega \in \{0,1\}^r} (\nu(\psi(x^{(\omega)})) - 1) \Bigm| x^{(0)}, x^{(1)} \in G^r \Bigr]^{1/2^r}, \tag{6}
\]
where $x^{(\omega)} := (x_1^{(\omega_1)}, \dots, x_r^{(\omega_r)})$, which is proved by a sequence of Cauchy-Schwarz inequalities, similar to [1, Lem. 6.2]. The right-hand side of (6) is $o(1)$ by the linear forms condition (expand the product so that each term is $\pm(1 + o(1))$ by (2), and everything cancels accordingly).

Since $(\nu, 1)$ is a $(k-1, o(1))$-discrepancy pair with respect to $\psi_1$, Lemma 3.3 implies that there exists $\tilde f \colon \mathbb{Z}_N \to [0,1]$ so that $\mathbb{E}[\tilde f] = \mathbb{E}[f] \ge \delta$ (if $\mathbb{E}[f] > 1$, then replace $f$ by $\delta f / \mathbb{E}[f]$) and $(f, \tilde f)$ is a $(k-1, o(1))$-discrepancy pair with respect to $\psi_1$, and hence with respect to all $\psi_j$, $1 \le j \le k$. So
\[
\mathbb{E}\Bigl[ \prod_{i=0}^{k-1} f(x + id) \Bigm| x, d \in \mathbb{Z}_N \Bigr] \ge \mathbb{E}\Bigl[ \prod_{i=0}^{k-1} \tilde f(x + id) \Bigm| x, d \in \mathbb{Z}_N \Bigr] - o(1) \ge c(k,\delta) - o_{k,\delta}(1),
\]
where the first inequality is by Lemma 4.1 and the second inequality is by Theorem 2.1. ∎

Acknowledgments.
The author would like to thank Jacob Fox and David Conlon for carefulreadings of the manuscript.
References

[1] D. Conlon, J. Fox, and Y. Zhao, A relative Szemerédi theorem, arXiv:1305.5440.
[2] D. A. Goldston and C. Y. Yıldırım, Higher correlations of divisor sums related to primes. I. Triple correlations, Integers (2003), A5, 66.
[3] W. T. Gowers, A new proof of Szemerédi's theorem, Geom. Funct. Anal. (2001), no. 3, 465–588.
[4] W. T. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Ann. of Math. (2007), no. 3, 897–946.
[5] W. T. Gowers, Decompositions, approximate structure, transference, and the Hahn-Banach theorem, Bull. Lond. Math. Soc. (2010), no. 4, 573–606.
[6] B. Green and T. Tao, The primes contain arbitrarily long arithmetic progressions, Ann. of Math. (2008), no. 2, 481–547.
[7] B. Green and T. Tao, New bounds for Szemerédi's theorem. II. A new bound for $r_4(N)$, Analytic number theory, Cambridge Univ. Press, Cambridge, 2009, pp. 180–204.
[8] B. Nagle, V. Rödl, and M. Schacht, The counting lemma for regular $k$-uniform hypergraphs, Random Structures Algorithms (2006), no. 2, 113–179.
[9] O. Reingold, L. Trevisan, M. Tulsiani, and S. Vadhan, New proofs of the Green-Tao-Ziegler dense model theorem: An exposition, arXiv:0806.0381.
[10] O. Reingold, L. Trevisan, M. Tulsiani, and S. Vadhan, Dense subsets of pseudorandom sets, 49th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society, 2008, pp. 76–85.
[11] V. Rödl and J. Skokan, Regularity lemma for $k$-uniform hypergraphs, Random Structures Algorithms (2004), no. 1, 1–42.
[12] V. Rödl and J. Skokan, Applications of the regularity lemma for uniform hypergraphs, Random Structures Algorithms (2006), no. 2, 180–194.
[13] T. Sanders, On Roth's theorem on progressions, Ann. of Math. (2) (2011), no. 1, 619–636.
[14] E. Szemerédi, On sets of integers containing no $k$ elements in arithmetic progression, Acta Arith. (1975), 199–245.
[15] T. Tao, A remark on Goldston-Yıldırım correlation estimates, available at .
[16] T. Tao, The Gaussian primes contain arbitrarily shaped constellations, J. Anal. Math. (2006), 109–176.
[17] T. Tao, A variant of the hypergraph removal lemma, J. Combin. Theory Ser. A (2006), no. 7, 1257–1280.
[18] T. Tao and T. Ziegler, The primes contain arbitrarily long polynomial progressions, Acta Math. (2008), no. 2, 213–305.
[19] P. Varnavides, On certain sets of positive density, J. London Math. Soc. (1959), 358–360.

Department of Mathematics, MIT, Cambridge, MA 02139-4307
E-mail address: