A quantitative bound on Furstenberg-Sárközy patterns with shifted prime power common differences in primes

Abstract

Let k\geq1 be a fixed integer, and \mathcal P_N be the set of primes no more than N. We prove that if set A\subset\mathcal P_N contains no patterns p_1,p_1+(p_2-1)^k, where p_1,p_2 are prime numbers, then \frac{|A|}{|\mathcal P_N|}\ll\biggl(\frac{\log\log N}{\log N}\biggr)^{\frac{1}{12k^3}}.


arXiv [math.NT], Feb

A QUANTITATIVE BOUND ON FURSTENBERG-SÁRKÖZY PATTERNS WITH SHIFTED PRIME POWER COMMON DIFFERENCES IN PRIMES

MENGDI WANG

Abstract. Let $k\geq 1$ be a fixed integer, and $\mathcal{P}_N$ be the set of primes no more than $N$. We prove that if a set $A\subset\mathcal{P}_N$ contains no patterns $p_1, p_1+(p_2-1)^k$, where $p_1, p_2$ are prime numbers, then
\[ \frac{|A|}{|\mathcal{P}_N|}\ll\Bigl(\frac{\log\log N}{\log N}\Bigr)^{\frac{1}{12k^3}}. \]

Contents

1. Introduction
2. Notations and Preliminaries
3. Exponential sum estimates
4. Restriction for shifted prime powers
5. A Local Inverse Theorem
6. Density Increment
Appendix A. Restriction for prime numbers
References

1. Introduction

Lovász conjectured that any set of integers with no non-zero perfect square differences must have asymptotic density zero. This conjecture was confirmed by Furstenberg [Fur] and Sárközy [S781] independently. Furstenberg used ergodic theory and his result is purely qualitative. However, using a Fourier-analytic density increment method, Sárközy proved the following quantitative strengthening.

Theorem A (Sárközy [S781]). If $A$ is a subset of $[N]=\{1,\dots,N\}$ and lacks non-trivial patterns $x, x+y^2$ ($y\neq 0$), then
\[ \frac{|A|}{N}\ll\Bigl(\frac{(\log\log N)^2}{\log N}\Bigr)^{1/3}. \]

Pintz, Steiger and Szemerédi [PSS] created a double iteration strategy which refined Sárközy's proof and improved the above upper bound to
\[ \frac{|A|}{N}\ll(\log N)^{-\frac{1}{12}\log\log\log\log N}. \]
Subsequently, this method was generalized to deal with sets with no differences of the form $n^k$ ($k\geq 3$) by Balog, Pelikán, Pintz and Szemerédi [BPPS]. They proved that for any integer $k\geq 3$, if $A\subset[N]$ lacks non-trivial patterns $x, x+y^k$ ($y\neq 0$), then
\[ \frac{|A|}{N}\ll(\log N)^{-\frac{1}{4}\log\log\log\log N}. \]

Very recently, Bloom and Maynard obtained a much more efficient density increment argument by showing that no set $A$ can have many large Fourier coefficients at rationals with small and distinct denominators. Taking advantage of this density increment argument, they proved the following.

Theorem B (Bloom-Maynard [BM]). If $A\subset[N]$ lacks non-trivial patterns $x, x+y^2$ with $y\neq 0$, then
\[ \frac{|A|}{N}\ll(\log N)^{-c\log\log\log N} \]
for some absolute constant $c>0$.

In the same series of papers (studying difference sets of subsets of integers), Sárközy also answered a question of Erdős, proving that if a subset of integers does not contain two elements which differ by one less than a prime number, then this subset has asymptotic density zero. Assume that $A$ is a subset of $[N]$; in the following theorems we'll use $A-A$ to denote the difference set $\{a_1-a_2: a_1,a_2\in A\}$. More formally, Sárközy proved the following.

Theorem C (Sárközy [S783]). If $A\subset[N]$ and $p-1\notin A-A$ for all primes $p$, then
\[ \frac{|A|}{N}\ll\frac{(\log\log\log N)^3\log\log\log\log N}{(\log\log N)^2}. \]

By exploiting a dichotomy depending on whether the exceptional zero of Dirichlet $L$-functions occurs or not, Ruzsa and Sanders [RS] improved the above upper density bound to $\frac{|A|}{N}\ll\exp\bigl(-c(\log N)^{1/4}\bigr)$ for some absolute constant $c>0$. The best current record on this problem is due to Wang [Wang], who improved Ruzsa and Sanders' upper bound by improving $1/4$ to $1/3$.

For polynomial differences evaluated at shifted primes, Li and Pan proved the following.

Theorem D (Li-Pan [LP]). Let $h\in\mathbb{Z}[x]$ be a polynomial with positive leading term and zero constant term. If $A\subset[N]$ and $h(p-1)\notin A-A$ for all primes $p$, then
\[ \frac{|A|}{N}\ll\frac{1}{\log\log\log N}. \]

Rice improved the above triple logarithmic decay to a single logarithmic bound. Besides, Rice also generalized this kind of consideration to a larger class of polynomials called $\mathcal{P}$-intersective polynomials (see [Rice, Definition 1]).

Theorem E (Rice [Rice]). Let $h\in\mathbb{Z}[x]$ be a $\mathcal{P}$-intersective polynomial of degree $k\geq 2$ with positive leading term. If $A\subset[N]$ and $h(p)\notin A-A$ for all primes $p$ with $h(p)>0$, then
\[ \frac{|A|}{N}\ll(\log N)^{-c} \]
for any $0<c<\frac{1}{2k-2}$.
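To make the statements of Theorems A and B concrete, the following is a minimal brute-force sketch that searches for the largest subset of $[N]$ with no non-zero perfect square differences. It is only an illustration for very small $N$; the function names are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import isqrt


def is_square(n: int) -> bool:
    """Return True if n is a positive perfect square."""
    if n <= 0:
        return False
    r = isqrt(n)
    return r * r == n


def has_square_difference(subset) -> bool:
    """Return True if two distinct elements differ by a non-zero perfect square."""
    return any(is_square(b - a) for a, b in combinations(sorted(subset), 2))


def max_square_difference_free(n: int) -> int:
    """Largest size of a subset of {1, ..., n} lacking patterns x, x + y^2 (y != 0).

    Exhaustive search, exponential in n: intended only for tiny n.
    """
    universe = range(1, n + 1)
    for size in range(n, 0, -1):
        for candidate in combinations(universe, size):
            if not has_square_difference(candidate):
                return size
    return 0
```

Dividing the returned size by $n$ gives the extremal density that Theorems A and B bound from above as $N\to\infty$.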

Similar to the Green-Tao theorem [GT08], which shows that the prime numbers contain arbitrarily long arithmetic progressions, one interesting question is to study whether the sequence of prime numbers contains the above-described patterns. As the readers may see from Theorem B, the primes trivially contain $x, x+y^2$ with $y\neq 0$. By adopting the transference principle [GT06], Li-Pan and Rice also proved the corresponding $\mathcal{P}$-intersective polynomial difference results for subsets of the prime numbers.

Theorem F (Li-Pan [LP], Rice [Rice]). Let $h\in\mathbb{Z}[x]$ be a polynomial with zero constant term. If $A\subset\mathcal{P}_N$ is a subset of primes and $A$ does not contain patterns $p_1, p_1+h(p_2-1)$ for all primes $p_1, p_2$, then $|A|=o(|\mathcal{P}_N|)$.

In this paper, we provide a quantitative relative upper density bound for Theorem F when $h\in\mathbb{Z}[x]$ is a perfect $k$-th power polynomial, i.e. $h(x)=x^k$.

Theorem 1.1. Let $k\geq 1$ be a fixed integer, and $\mathcal{P}_N$ be the set of primes which are no more than $N$. If $A\subset\mathcal{P}_N$ and $(p-1)^k\notin A-A$ for all primes $p$, then
\[ \frac{|A|}{|\mathcal{P}_N|}\ll\Bigl(\frac{\log\log N}{\log N}\Bigr)^{\frac{1}{12k^3}}. \]

The exponent $\frac{1}{12k^3}$ above can be improved slightly by a more careful calculation, but we shall not do so. Additionally, we are optimistic that our method can be generalized to deal with polynomials with zero constant term, and even with general $\mathcal{P}$-intersective polynomials. Nevertheless, it should be noted that this would require more complicated exponential sum estimates.

Outline of the argument.

In [LP] and [Rice], the idea of the transference principle is used to convert the problem about a dense subset $A$ of the primes to one about a dense subset $A'$ of the integers. In contrast, we run a Roth-type density increment argument [Roth], using the dichotomy that a dense subset $A$ of the primes either already contains the desired patterns $x, x+(p-1)^k$, or it must have increased density after passing to a sub-progression $P=a+q\cdot[X]$. To perform these iterations, it is necessary to consider the more general pattern of the form $x, x+q^{k-1}y^k$ with the conditions that $qx+a$ and $qy+1$ are primes. In order to establish the dichotomy for these patterns, we need restriction estimates for primes in arithmetic progressions modulo $q$, and for $k$-th powers of shifted primes in arithmetic progressions. In the case $q=1$ (or small values of $q$), these restriction estimates have been proved in the literature, e.g. [GT06], [LP], [Chow], but we will need to extend these results to common differences $q$ up to $O\bigl(\exp(c'\sqrt{\log N})\bigr)$ for some small constant $c'>0$.

Let $\mathcal{P}_N$ be the set of primes which are less than or equal to $N$, and $A$ be a subset of $\mathcal{P}_N$. Suppose that $\nu:\mathbb{Z}\to\mathbb{R}_{\geq 0}$, supported on the interval $[N]$, is the majorant function which captures the information of primes. The starting point is to lift our set $A$ to $[N]$ weighted by $\nu$. This allows us to transfer counting the number of patterns $x, x+(p-1)^k$ in $\mathcal{P}_N$ to counting the weighted number of patterns (weighted by $\nu\cdot 1_A$) in $[N]$.

Logically, Sárközy's proof idea [S781, S783], which was inspired by Roth, can be used, noting only that in our case we are counting patterns weighted by the unbounded function $\nu\cdot 1_A$. Now assume that $A$ does not contain patterns of the form $x, x+(p-1)^k$. For experienced readers, this directly means that $\nu\cdot 1_A$ is highly non-uniform, in the sense that the correlation
\[ \mathbb{E}_{x\in[N]}\mathbb{E}_{p\in[N^{1/k}]}f(x)f(x+(p-1)^k) \]
is large in terms of the density $\mathbb{E}_{x\in[N]}\nu\cdot 1_A(x)$, for some suitable (unbounded) balanced function $f$ of $\nu\cdot 1_A$, where we have written $\mathbb{E}_{x\in[N]}f(x)=N^{-1}\sum_{x\in[N]}f(x)$ for the average of $f$ on $[N]$. Similar to the classical bounded case, we want to use this non-uniformity condition to extract a large arithmetic progression on which $\nu\cdot 1_A$ has increased density. However, since $f$ is unbounded, the applications of the Cauchy-Schwarz inequality and Parseval's identity need to be replaced by Hölder's inequality and restriction lemmas.

Now suppose that there is an arithmetic progression $P=a+q\cdot[X]\subset[N]$ such that $\mathbb{E}_{n\in P}\nu\cdot 1_A(n)>\mathbb{E}_{n\in[N]}\nu\cdot 1_A(n)$. There are two iteration strategies we can choose. One is to iterate inside progressions. More precisely, the next step is to find a sub-progression $P'$ inside $P$ such that $\mathbb{E}_{n\in P'}\nu\cdot 1_A(n)>\mathbb{E}_{n\in P}\nu\cdot 1_A(n)$, and so on. But at each stage of the iteration, the majorant function needs to carry information not only about the prime numbers but also about sub-progressions. The other strategy, by using translation and dilation, transfers to considering patterns $x, x+q^{k-1}y^k$ with $qy+1\in\mathcal{P}$ in the interval $[X]$. In this context we are iterating in intervals; however, unlike the integer case, we also need to keep the condition that $a+qx$ and $qy+1$ are prime numbers as the sieving condition. The readers may have noticed that the above two iteration strategies are more or less the same. Therefore, we do not distinguish them in the proof. Besides, in contrast with the bounded cases [BM, S781], we call our iteration the unbounded iteration in the following.

Assume that at each stage of the iteration the length of the arithmetic progression only shrinks by about $O(\delta^{-O_k(1)})$. If the iteration process is not too long, there must be many prime numbers throughout our iteration and our analysis makes sense. Indeed we can show that our unbounded iteration takes $O(\delta^{-O_k(1)})$ steps. Note also that in the bounded cases, by establishing a relationship between Fourier coefficients of the indicator function and additive energy, Bloom and Maynard [BM] show that there must be many frequencies with large Fourier coefficients having the same (small) denominator. Hence, their iteration procedure takes only around $\exp\Bigl(-c\frac{\log\delta^{-1}}{\log\log\delta^{-1}}\Bigr)$ steps. Unfortunately, it seems somewhat difficult to generalize their method to the unbounded case, and we believe it is an interesting question to reduce the number of iteration steps and hence improve the density bound.

In addition, one can verify that the Fourier-analytic transference procedure can also provide a quantitative bound for the patterns $x, x+(p-1)^k$ in primes, but it seems hard to get such a logarithmic bound in that way. Compared with the transference principle, in our proof the combinatorial (generalized) Behrend theorem is of no use; instead, we pay more attention to the special sparse set (the set of prime numbers) itself and employ a much more precise result on prime numbers, Linnik's theorem [Xyl]. Furthermore, as previously explained, we need to pay attention to the behavior of primes in arithmetic progressions all the time. In the transference procedure, as the common difference of the progression does not have a significant influence on the final result, one can adopt the classical Siegel-Walfisz theorem. Nonetheless, in our case, the common difference has a close bearing on how long the iteration process is. We therefore draw on the approach of Ruzsa and Sanders [RS].

We organize this paper as follows. In Section 2, we set notations and state Ruzsa-Sanders' result [RS] for primes in arithmetic progressions. In Section 3, we use the Hardy-Littlewood method to study the Fourier transform of the majorant of shifted prime powers. We then use Bourgain's method [Bou89] and the above Fourier transform information to prove the restriction lemma for shifted prime powers in Section 4. In Section 5, we prove our main result, a local inverse theorem, and then in Section 6 we use this inverse result to carry out the density increment argument and so complete the proof of Theorem 1.1.
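The pattern of Theorem 1.1 and the rescaling used in the outline can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the last function simply checks the identity $(a+q(x+q^{k-1}y^k))-(a+qx)=(qy)^k$, which is $(p-1)^k$ when $p=qy+1$.

```python
from math import isqrt


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def shifted_prime_powers(n: int, k: int) -> set:
    """All values (p - 1)^k <= n with p prime (k >= 1)."""
    values = set()
    for p in primes_up_to(n + 1):  # (p - 1)^k <= n forces p <= n + 1
        v = (p - 1) ** k
        if v > n:
            break
        values.add(v)
    return values


def contains_pattern(a_set, k: int) -> bool:
    """True if some a1 < a2 in a_set satisfy a2 - a1 = (p - 1)^k for a prime p."""
    elems = sorted(a_set)
    if len(elems) < 2:
        return False
    diffs = shifted_prime_powers(elems[-1] - elems[0], k)
    present = set(elems)
    return any(a + d in present for a in elems for d in diffs)


def dilated_difference(a: int, q: int, x: int, y: int, k: int) -> int:
    """Difference of a + q*x' and a + q*x, where x' = x + q^(k-1) * y^k."""
    x_shifted = x + q ** (k - 1) * y ** k
    return (a + q * x_shifted) - (a + q * x)
```

For instance, `contains_pattern({3, 7}, 2)` is true because $7-3=(3-1)^2$, while the set $\{2,7\}$ avoids the pattern for $k=2$.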

Acknowledgements.

The author would like to thank her advisors Xuancheng Shao and Lilu Zhao for their discussions.

2. Notations and Preliminaries

2.1. Notations. For a real number $X\geq 1$, we use $[X]$ to denote the discrete interval $\{1,\dots,\lfloor X\rfloor\}$. $\mathcal{P}$ denotes the set of prime numbers and $\mathcal{P}_N=\mathcal{P}\cap[N]$ denotes the set of primes no more than $N$. For $\alpha\in\mathbb{T}=\mathbb{R}/\mathbb{Z}$, we write $\|\alpha\|$ for the distance from $\alpha$ to the nearest integer. For any set of integers $A$, we use $1_A$ to denote its indicator function.

Following the standard conventions for arithmetic functions, we'll use $\phi$ to denote the Euler totient function, $\Lambda$ to denote the von Mangoldt function, and $\mu$ to denote the Möbius function.

We'll use counting measure on $\mathbb{Z}$, so for a function $f:\mathbb{Z}\to\mathbb{C}$ its $L^p$-norm is defined by
\[ \|f\|_p^p=\sum_x|f(x)|^p, \]
and its $L^\infty$-norm is defined to be $\|f\|_\infty=\sup_x|f(x)|$. Besides, we'll use the Haar probability measure on $\mathbb{T}$, so for a function $F:\mathbb{T}\to\mathbb{C}$ we define its $L^p$-norm by
\[ \|F\|_p^p=\int_{\mathbb{T}}|F(\alpha)|^p\,d\alpha. \]
If $f:A\to\mathbb{C}$ is a function and $B$ is a non-empty finite subset of $A$, we write
\[ \mathbb{E}_{x\in B}f(x)=\frac{1}{|B|}\sum_{x\in B}f(x) \]
for the average of $f$ on $B$. We also abbreviate $\mathbb{E}_{x\in A}f(x)$ to $\mathbb{E}(f)$ if the supporting set $A$ is finite and no confusion is caused.

We'll use Fourier analysis on $\mathbb{Z}$ with its dual $\mathbb{T}$. Let $f:\mathbb{Z}\to\mathbb{C}$ be a function, and define its Fourier transform by setting
\[ \widehat{f}(\alpha)=\sum_xf(x)e(x\alpha), \]
where $\alpha\in\mathbb{T}$. For functions $f,g:\mathbb{Z}\to\mathbb{C}$ define the convolution of $f$ and $g$ as
\[ f*g(x)=\sum_yf(y)g(x-y). \]
We then have the following basic properties of Fourier analysis for $f,g:\mathbb{Z}\to\mathbb{C}$:
(1) (Fourier inversion formula) $f(x)=\int_{\mathbb{T}}\widehat{f}(\alpha)e(-x\alpha)\,d\alpha$;
(2) (Parseval's identity) $\|f\|_2=\|\widehat{f}\|_2$;
(3) $\widehat{f*g}=\widehat{f}\cdot\widehat{g}$.

Throughout, $\varepsilon>0$ and $c>0$ denote small constants, and $c$ and $\varepsilon$ are allowed to change at different occurrences. For a function $f$ and a positive-valued function $g$, we write $f\ll g$ or $f=O(g)$ if there exists a constant $C>0$ such that $|f(x)|\leq Cg(x)$ for all $x$; and we write $f\gg g$ if $f$ is also positive-valued and there is a constant $C>0$ such that $f(x)\geq Cg(x)$ for all $x$.

2.2. Preliminaries.

Suppose that $x$ is a real number, and $a$ and $q$ are positive coprime integers. We write
\[ \psi(x;q,a)=\sum_{\substack{n\leq x\\ n\equiv a\ (\mathrm{mod}\ q)}}\Lambda(n). \]
Estimating $\psi(x;q,a)$ is one of the central problems in analytic number theory, and the following lemma is [Dav, P.123 Eq.(11)].

Lemma 2.1 (Siegel-Walfisz Theorem). Provided that $q\leq(\log x)^D$ for some fixed $D>0$, we have
\[ \psi(x;q,a)=\frac{x}{\phi(q)}+O\bigl(x\exp(-c(\log x)^{1/2})\bigr). \]

As the readers may notice, the Siegel-Walfisz Theorem is only valid for common differences $q\leq(\log x)^D$, and in our case this would lead to a poor density bound. To make moduli $q$ beyond the limitation of Lemma 2.1 possible, we adopt the argument of Ruzsa and Sanders, who established an asymptotic formula for $\psi(x;q,a)$ based on whether the exceptional zero of Dirichlet $L$-functions exists or not. The next lemma is a modification of [RS, Proposition 4.7].

Lemma 2.2 (Exceptional pair result). Let $D_0,D_1\geq 2$ be positive integers, let $\chi$ be a character modulo $q_0$, and let $\rho$ be the exceptional zero for $\chi$, if it exists. Suppose that $q_0\leq D_0$ and $(1-\rho)^{-1}\ll q_0^{1/2}(\log q_0)^2$. Then for any real $x\geq 1$ and integers $1\leq a\leq q$ satisfying $q_0\mid q$ and $q\leq D_1$, we have
\[ \psi(x;q,a)=\frac{\chi_0(a)x}{\phi(q)}-\frac{\chi\chi_0(a)x^{\rho}}{\phi(q)\rho}+O\Bigl(x\exp\Bigl(-\frac{c\log x}{\sqrt{\log x}+\log D_1}\Bigr)(\log D_1)^2\Bigr), \]
where $\chi_0$ is the principal character modulo $q$. Besides, when the exceptional zero $\rho$ does not exist, we take $q_0=1$ and let the second term vanish.
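As a numerical illustration of the main term $x/\phi(q)$ in Lemma 2.1, the following minimal sketch evaluates $\psi(x;q,a)$ directly from its definition. It is a naive computation meant for small $x$ only, and the function names are hypothetical rather than taken from the paper.

```python
from math import gcd, log


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^m for a prime p and m >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a pure power of p exactly when nothing is left over
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n): n itself is prime


def psi(x: float, q: int, a: int) -> float:
    """psi(x; q, a): sum of Lambda(n) over n <= x with n = a (mod q)."""
    return sum(von_mangoldt(n) for n in range(1, int(x) + 1) if (n - a) % q == 0)


def euler_phi(q: int) -> int:
    """Euler totient function, by direct counting."""
    return sum(1 for r in range(1, q + 1) if gcd(r, q) == 1)
```

Comparing `psi(x, q, a)` with `x / euler_phi(q)` for coprime `a` and `q` and moderately large `x` shows the equidistribution that the lemma quantifies.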

Assume that $c>0$, and let $0<c'<c/(100k)$ be a fixed number throughout this paper. If the supporting interval, in our practical applications, is $[X]$, we would take D = exp(2c′√(log X)) and D = exp(2c′√(log X)/ . (2.1)
Therefore, when $x\geq X^{1/(2k)}$ (say) the above formula can be rewritten as
\[ \psi(x;q,a)=\frac{\chi_0(a)x}{\phi(q)}-\frac{\chi\chi_0(a)x^{\rho}}{\phi(q)\rho}+O\bigl(x\exp(-c\sqrt{\log x})\bigr). \tag{2.2} \]

Let $P=b+d\cdot[X]$ be an arithmetic progression inside $[N]$ which satisfies $1\leq b\leq d$ and $(b,d)=1$. Define a majorant function $\Lambda_{b,d}:\mathbb{Z}\to\mathbb{R}_{\geq 0}$ supported on $[X]$ by taking
\[ \Lambda_{b,d}(x)=\begin{cases}\frac{\phi(d)}{d}\Lambda(b+dx)&\text{if }b+dx\in P;\\ 0&\text{otherwise.}\end{cases} \tag{2.3} \]

Lemma 2.3 (Restriction lemma). Let $\delta>0$ be a number tending to zero slowly. Let $1\leq b\leq d\leq\exp(c'\sqrt{\log X})$ be a relatively prime pair of integers. For any function $\nu:\mathbb{Z}\to\mathbb{C}$ supported on $[X]$ and satisfying $|\nu|\leq\Lambda_{b,d}$ pointwise, and any real number $p>2$, we always have
\[ \|\widehat{f}\|_p^p=\int_{\mathbb{T}}|\widehat{f}(\alpha)|^p\,d\alpha\ll_pX^{p-1}, \]
where $f=\nu-\delta 1_{[X]}$.

The proof of this lemma is, in one sense, self-contained, and omitting it here does not disturb our main argument. For this reason, we prove it in the appendix.

3. Exponential sum estimates

We begin our analysis with estimating the following auxiliary exponential sum-mation. Suppose that d, k > N is a large number and M = l(cid:0) N (cid:1) /k m . For any α ∈ T deﬁne S d ( α ) = X y ∈ [ M ] φ ( d ) d ky k − Λ( dy + 1) e ( y k α ) . (3.1)In the light of the Hardy-Littlewood method we’ll partition frequencies α ∈ T intomajor arcs and minor arcs, and then calculate the behavior of S d ( α ) in major andminor arcs respectively. We now handle these tasks in turn.Assume that d ≪ exp (cid:0) c ′ √ log N (cid:1) throughout this section, and let us divide theinterval of integration T into major arcs M and minor arcs m which are deﬁnedin the following forms M = [ q Q (cid:8) α ∈ T : k qα k ≪ Q/N (cid:9) , (3.2) m = T \ M , MENGDI WANG where Q = (log N ) k D , (3.3)and D > S d ( α ) in majorarcs ﬁrstly. But before it, an auxiliary lemma. Lemma 3.1.

Let $k\geq 1$ be an integer. Let $a$ and $q$ be relatively prime positive integers. Then for any coprime pair $b,t\geq 1$ we have
\[ \Bigl|\sum_{\substack{r(q)\\(tr+b,q)=1}}e\bigl(ar^k/q\bigr)\Bigr|\ll q^{1-\frac{1}{k}+\varepsilon}, \]
where $\varepsilon>0$ is arbitrarily small.

Proof. We first claim that the left-hand side expression is multiplicative. To see this, let $q=q_1q_2$ with $(q_1,q_2)=1$, and $r=r_1q_2+r_2q_1$. On noting that
\[ (tr+b,q)=(tr_1q_2+tr_2q_1+b,q_1q_2)=(tr_1q_2+b,q_1)(tr_2q_1+b,q_2), \]
we then have
\[ \sum_{\substack{r(q_1q_2)\\(tr+b,q_1q_2)=1}}e\Bigl(\frac{ar^k}{q_1q_2}\Bigr)=\sum_{\substack{r_1(q_1)\\(tr_1q_2+b,q_1)=1}}e\Bigl(\frac{ar_1^kq_2^{k-1}}{q_1}\Bigr)\sum_{\substack{r_2(q_2)\\(tr_2q_1+b,q_2)=1}}e\Bigl(\frac{ar_2^kq_1^{k-1}}{q_2}\Bigr). \]
Thus, without loss of generality, we may assume that $q=p^m$ with $p\in\mathcal{P}$. If $p\mid t$, then for any $r$ modulo $p^m$, $(tr+b,p)=1$ always holds, since $(b,t)=1$. Thus, in this case it follows from [Va, Theorem 4.2] that
\[ \Bigl|\sum_{\substack{r(p^m)\\(tr+b,p)=1}}e\Bigl(\frac{ar^k}{p^m}\Bigr)\Bigr|=\Bigl|\sum_{r(p^m)}e\Bigl(\frac{ar^k}{p^m}\Bigr)\Bigr|\ll(p^m)^{1-\frac{1}{k}+\varepsilon}. \]
Now we assume that $(p,t)=1$. By taking advantage of the Möbius function, one has
\[ \sum_{\substack{r(p^m)\\(tr+b,p)=1}}e\Bigl(\frac{ar^k}{p^m}\Bigr)=\sum_{r(p^m)}e\Bigl(\frac{ar^k}{p^m}\Bigr)\sum_{d\mid(tr+b,p)}\mu(d)=\sum_{d\mid p}\mu(d)\sum_{\substack{r(p^m)\\d\mid(tr+b)}}e\Bigl(\frac{ar^k}{p^m}\Bigr). \]
Here only the divisors $d=1$ and $d=p$ of $p$ occur, and thus the above expression is indeed
\[ \sum_{r(p^m)}e\Bigl(\frac{ar^k}{p^m}\Bigr)-\sum_{\substack{r(p^m)\\tr+b\equiv 0\ (\mathrm{mod}\ p)}}e\Bigl(\frac{ar^k}{p^m}\Bigr). \]
Making use of [Va, Theorem 4.2] once again, we get
\[ \Bigl|\sum_{r(p^m)}e\Bigl(\frac{ar^k}{p^m}\Bigr)\Bigr|\ll(p^m)^{1-\frac{1}{k}+\varepsilon}. \]

As for the second term, taking note that ( t, p ) = ( t, b ) = 1, let t p be the uniquesolution to the linear congruence equation tr + b ≡ p ) in the range [1 , p ],then for r ∈ [ p m ], the solutions are in the form of r = t p + sp with s ∈ [ p m − ].And therefore, X r ( p m ) tr + b ≡ p ) e (cid:18) ar k p m (cid:19) = X s ( p m − ) e (cid:18) a ( t p + sp ) k p m (cid:19) = e (cid:16) at kp p m (cid:17) X s ( p m − ) e (cid:18) akt k − p s + · · · + ap k − s k p m − (cid:19) . Now let p τ k k , if τ > (cid:6) m (cid:7) , then P s ( p m − ) e (cid:18) akt k − p s + ··· + ap k − s k p m − (cid:19) is triviallybounded by O (1); if 0 < τ < (cid:6) m (cid:7) , let 0 < u τ be the integer such thatgcd (cid:16) at k − p k, at k − p p (cid:0) k (cid:1) , · · · , ap k − , p m − (cid:17) = p u , it can be deduced from [Va, Theo-rem 7.1] that (cid:12)(cid:12)(cid:12)(cid:12) X s ( p m − ) e (cid:18) akt k − p s + · · · + ap k − s k p m − (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) k (cid:12)(cid:12)(cid:12)(cid:12) X s ( p m − u − ) e (cid:18) a ′ s + · · · + a ′ k s k p m − u − (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) ≪ ( p m − u − ) − k + ε , where gcd( a ′ , · · · , a ′ k , p ) = 1; if τ = 0, it is directly from [Va, Theorem 7.1] that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X r ( p m ) tr + b ≡ p ) e (cid:18) ar k p m (cid:19)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12) X s ( p m − ) e (cid:18) akt k − p s + · · · + ap k − s k p m − (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) ≪ ( p m − ) − k + ε . Putting all above cases together we obtain (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X r ( p m )( tr + b,p )=1 e (cid:18) ar k p m (cid:19)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≪ ( p m ) − k + ε , and the lemma follows. (cid:3) Lemma 3.2 (Major-arc behavior of S d ( α )) . Suppose that α ∈ T can be decomposedas α = aq + β with coprime pair a q . 
If $d,q\ll\exp(c'\sqrt{\log N})$, then
\[ S_d(\alpha)\ll q^{-\frac{1}{k}+\varepsilon}N(1+N|\beta|)^{-1}+N(1+N|\beta|)\exp\bigl(-c\sqrt{\log N}\bigr) \]
holds for every sufficiently small $\varepsilon>0$.

Proof. In the first place, we consider the much easier case $\alpha=a/q$. Expanding the definition of $S_d(\alpha)$ with $\alpha=a/q$, and then making the change of variables $dy+1\mapsto y$, we get
\[ S_d(a/q)=\sum_{y\in[M]}k\frac{\phi(d)}{d}y^{k-1}\Lambda(dy+1)e\Bigl(\frac{ay^k}{q}\Bigr)=\sum_{\substack{y\in[dM+1]\\y\equiv 1\ (\mathrm{mod}\ d)}}k\frac{\phi(d)}{d}\Bigl(\frac{y-1}{d}\Bigr)^{k-1}\Lambda(y)e\Bigl(\frac{a}{q}\Bigl(\frac{y-1}{d}\Bigr)^k\Bigr). \]
We now split the above summation into sub-progressions, according to the residue classes of $\frac{y-1}{d}$ modulo $q$, to obtain
\[ S_d(a/q)=\sum_{r(q)}\sum_{\substack{y\in[dM+1]\\y\equiv 1\ (\mathrm{mod}\ d)\\ \frac{y-1}{d}\equiv r\ (\mathrm{mod}\ q)}}k\frac{\phi(d)}{d}\Bigl(\frac{y-1}{d}\Bigr)^{k-1}\Lambda(y)e\Bigl(\frac{a}{q}\Bigl(\frac{y-1}{d}\Bigr)^k\Bigr). \]
Then it is easy to see that
\[ S_d(a/q)=\sum_{r(q)}e\Bigl(\frac{ar^k}{q}\Bigr)\Psi(dM+1;dq,dr+1), \tag{3.4} \]
where
\[ \Psi(dM+1;dq,dr+1)=\sum_{\substack{y\in[dM+1]\\y\equiv dr+1\ (\mathrm{mod}\ dq)}}\Lambda(y)k\frac{\phi(d)}{d}\Bigl(\frac{y-1}{d}\Bigr)^{k-1}. \tag{3.5} \]
Our task now is to compute $\Psi(dM+1;dq,dr+1)$, and the main tools are Lemma 2.2 and Abel's summation formula. Indeed, an application of summation by parts yields
\[ \Psi(dM+1;dq,dr+1)=\psi(dM+1;dq,dr+1)\frac{\phi(d)}{d}kM^{k-1}-\frac{\phi(d)}{d}\int_1^{dM+1}\psi(t;dq,dr+1)\Bigl(k\Bigl(\frac{t-1}{d}\Bigr)^{k-1}\Bigr)'dt, \]
recalling that $\psi(x;q,a)=\sum_{n\in[x],\,n\equiv a\ (\mathrm{mod}\ q)}\Lambda(n)$. It follows from Lemma 2.2, as well as (3.8), that when $dq\leq\exp(2c'\sqrt{\log N})$ the second term is equal to $\frac{\phi(d)}{d}$ times
\[ \frac{\chi_0(dr+1)}{\phi(dq)}\Bigl\{-\int_1^{dM+1}t\Bigl(k\Bigl(\frac{t-1}{d}\Bigr)^{k-1}\Bigr)'dt+\chi(dr+1)\int_1^{dM+1}\frac{t^{\rho}}{\rho}\Bigl(k\Bigl(\frac{t-1}{d}\Bigr)^{k-1}\Bigr)'dt\Bigr\}+O\Bigl(\int_1^{dM+1}d^{1-k}t^{k-1}\exp\Bigl(-\frac{c\log t}{\sqrt{\log t}+\sqrt{\log N}}\Bigr)(\log N)\,dt\Bigr) \]
\[ =-\frac{\chi_0(dr+1)}{\phi(dq)}(k-1)dM^k+\frac{\chi\chi_0(dr+1)}{\phi(dq)}\cdot\frac{k(k-1)}{\rho}\cdot\frac{d^{\rho}M^{k-1+\rho}}{k-1+\rho}+O\bigl(M^k\exp(-c\sqrt{\log N})\bigr), \tag{3.6} \]
where the last line is a direct integral calculation along with the facts that $d\leq\exp(c'\sqrt{\log N})$, $c'<c/(100k)$ and $N\asymp M^k$. By substituting (3.6) and (2.2) with $x=dM+1$ into (3.5), one has
\[ \Psi(dM+1;dq,dr+1)=\frac{\phi(d)}{\phi(dq)}\chi_0(dr+1)M^k-\frac{\phi(d)}{\phi(dq)}\chi\chi_0(dr+1)\frac{kd^{\rho-1}M^{k-1+\rho}}{k-1+\rho}+O\bigl(M^k\exp(-c\sqrt{\log N})\bigr). \]
Combining this with (3.4), and recalling that $\chi_0$ is the principal character modulo $dq$, one then has
\[ S_d(a/q)=\frac{\phi(d)}{\phi(dq)}C(q;a,d)\Bigl\{M^k-\chi(dr+1)\frac{kd^{\rho-1}}{k-1+\rho}M^{k-1+\rho}\Bigr\}+O\bigl(M^k\exp(-c\sqrt{\log N})\bigr), \]
where $C(q;a,d)=\sum_{\substack{r(q)\\(dr+1,q)=1}}e\bigl(\frac{ar^k}{q}\bigr)$, recalling that $q\leq\exp(c'\sqrt{\log N})$.

Now let $\alpha=a/q+\beta$. It follows from Abel's summation formula, as well as $N\asymp M^k$, that
\[ S_d(\alpha)=\sum_{y\in[M]}k\frac{\phi(d)}{d}y^{k-1}\Lambda(dy+1)e\Bigl(\frac{ay^k}{q}\Bigr)e(\beta y^k)=S_d(a/q)e(\beta N)-\int_1^M\sum_{y\in[t]}k\frac{\phi(d)}{d}y^{k-1}\Lambda(dy+1)e\Bigl(\frac{ay^k}{q}\Bigr)\bigl(e(\beta t^k)\bigr)'dt. \]
By a computation similar to the previous paragraph, it turns out that
\[ S_d(\alpha)=\frac{\phi(d)}{\phi(dq)}C(q;a,d)\Bigl\{\int_1^{M^k}e(\beta t)\,dt-\chi(dr+1)d^{\rho-1}\int_1^{M^k}t^{\frac{\rho-1}{k}}e(\beta t)\,dt\Bigr\}+O\bigl(N(1+N|\beta|)\exp(-c\sqrt{\log N})\bigr), \]
once again recalling that $N\asymp M^k$. If the exceptional zero doesn't exist, the second term does not appear; otherwise, as $1/2<\rho<1$ and $1-\rho\gg\exp(-c'\sqrt{\log N})$, the second term is smaller than the first one. Therefore, we get
\[ |S_d(\alpha)|\ll\frac{\phi(d)}{\phi(dq)}|C(q;a,d)|N(1+N|\beta|)^{-1}+N(1+N|\beta|)\exp\bigl(-c\sqrt{\log N}\bigr), \]
on noting that $\int_1^{M^k}e(\beta t)\,dt\ll\min\{M^k,|\beta|^{-1}\}\ll N(1+N|\beta|)^{-1}$. We estimate the factors in the first term in turn. From elementary properties of the Euler totient function $\phi$ (see, for example, [Apo, Theorem 2.5]), one has
\[ \frac{\phi(d)}{\phi(dq)}=\frac{\phi(d)}{\phi(d)\phi(q)}\cdot\frac{\phi(\gcd(d,q))}{\gcd(d,q)}\leq\phi(q)^{-1}. \]
Meanwhile, it can be verified by Lemma 3.1 with $(a,q)=(d,1)=1$ that $C(q;a,d)\ll q^{1-\frac{1}{k}+\varepsilon}$ for arbitrarily small $\varepsilon>0$. Combining the above three estimates gives the upper bound on $S_d(\alpha)$ stated in the lemma. □

We now concentrate on the minor-arc estimates in the rest of this section. In fact, the minor-arc analysis for exponential sums subject to a congruence condition has been developed in [Gr05a] (the linear case) and [Chow] (the higher degree cases). However, the moduli of the congruences in the above mentioned results are only of logarithmic power size, so there are slight differences in our approach. As the minor-arc estimates are long and miscellaneous, the type I and type II sum estimates in the following lemma are brief and rough, and we refer the readers to [LZ, Chapter 3] for more details.
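For orientation, the sum $S_d(\alpha)$ of (3.1) and the complete sum $C(q;a,d)$ from the proof of Lemma 3.2 can be evaluated directly for small parameters. The following is a minimal sketch under stated assumptions: $M$ is passed in as a parameter rather than computed from $N$, $e(\theta)$ is taken to be $\exp(2\pi i\theta)$ (the usual convention, assumed here), and all function names are hypothetical.

```python
import cmath
from math import gcd, log, pi


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^m for a prime p and m >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)


def euler_phi(n: int) -> int:
    """Euler totient function, by direct counting."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def e(theta: float) -> complex:
    """e(theta) = exp(2*pi*i*theta); assumed convention."""
    return cmath.exp(2j * pi * theta)


def S(d: int, k: int, M: int, alpha: float) -> complex:
    """S_d(alpha) = sum over y in [M] of (phi(d)/d) k y^(k-1) Lambda(dy+1) e(y^k alpha)."""
    weight = euler_phi(d) / d
    return sum(
        weight * k * y ** (k - 1) * von_mangoldt(d * y + 1) * e(y ** k * alpha)
        for y in range(1, M + 1)
    )


def C(q: int, a: int, d: int, k: int) -> complex:
    """C(q; a, d): sum of e(a r^k / q) over r mod q with gcd(dr + 1, q) = 1."""
    return sum(
        e(a * pow(r, k, q) / q)  # reduce r^k mod q first to keep the phase exact
        for r in range(1, q + 1)
        if gcd(d * r + 1, q) == 1
    )
```

Evaluating `abs(S(d, k, M, a / q))` for small `q` against values at generic `alpha` gives a rough picture of the major-arc peaks described by Lemma 3.2.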

Lemma 3.3 (Minor arcs estimates).
\[ \sup_{\alpha\in\mathfrak{m}}|S_d(\alpha)|\ll N(\log N)^{-D}. \]

Proof.

For one thing, in view of Dirichlet approximate theorem, for any frequency α ∈ T , there always exists an integer q N/Q such that k qα k ≪ Q/N . Thus,when α belongs to minor arcs m , we must have Q q N/Q. (3.7)For another, from the proof of Lemma 3.2, it is clear that S d ( α ) = X y ∈ [ dM +1] y ≡ d ) k φ ( d ) d (cid:16) y − d (cid:17) k − Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17) . After the making use of Abel’s summation formula, we’ll get S d ( α ) = φ ( d ) d kM k − X y ∈ [ dM +1] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17) − Z dM +11 (cid:18) X y ∈ [ t ] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:19) · (cid:18) φ ( d ) d k (cid:16) t − d (cid:17) k − (cid:19) ′ d t. From the perspecitive of Brun -Titchmarsh theorem ([MV, Theorem 2]) it is a easymatter to check that (cid:12)(cid:12)(cid:12)(cid:12) X y ∈ [ t ] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) π ( t ; d,

1) log t ≪ t log tφ ( d ) , whilst derivation gives the upper bound (cid:18) φ ( d ) d k (cid:16) t − d (cid:17) k − (cid:19) ′ ≪ φ ( d ) d k t k − . PARSE FURSTENBERG-S ´ARK ¨OZY 13

Combining above two inequalities it is not hard to ﬁnd that the integral over theinterval [1 , dM (log N ) − D ] is negligible, in practice, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z dM (log N ) − D (cid:18) X y ∈ [ t ] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:19) · (cid:18) φ ( d ) d k (cid:16) t − d (cid:17) k − (cid:19) ′ d t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≪ d − k Z dM (log N ) − D t k − log t d t ≪ N (log N ) − D , on recalling that N ≍ M k . Therefore, it can be veriﬁed that the absolute value of S d ( α ) is bounded by φ ( d ) d M k − sup dM (log N ) − D t dM +1 (cid:12)(cid:12)(cid:12)(cid:12) X y ∈ [ t ] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) + O (cid:0) N (log N ) − D (cid:1) . For convenience, we’ll partition above exponential sum into dyadic intervals toget that (cid:12)(cid:12)(cid:12)(cid:12) X y ∈ [ t ] y ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) ≪ log N (cid:12)(cid:12)(cid:12)(cid:12) X y ∼ Ty ≡ d ) Λ( y ) e (cid:16) α (cid:16) y − d (cid:17) k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) with 1 T t and dM (log N ) − D t dM + 1. Besides, it is a fairly immedi-ate consequence of Brun -Titchmarsh theorem that when 1 T dM (log N ) − D above right-hand side expression is bounded by O (cid:16) dMφ ( d ) (log N ) − D (cid:17) , thus the contri-bution to S d ( α ) is O ( N (log N ) − D ). 
So, now, we’ll assume that dM (log N ) − D T dM + 1, and it can be deduced from Vaughan’s identity with parameter U = (cid:16) Td (cid:17) k that the inner sum of above right-hand side is equal to X uv ∼ Tuv ≡ d ) u U µ ( u )(log v ) e (cid:16) α (cid:16) uv − d (cid:17) k (cid:17) − X s U sw ∼ Tsw ≡ d ) a ( s ) e (cid:16) α (cid:16) sw − d (cid:17) k (cid:17) + X sv ∼ Ts,v>Usv ≡ d ) b ( s )Λ( v ) e (cid:16) α (cid:16) sv − d (cid:17) k (cid:17) := I − I + I , where a ( s ) = P uv = su,v U µ ( u )Λ( v ) and b ( s ) = P uw = su>U µ ( u ) are two arithmetic func-tions. We now estimate I , I and I in turns. On writing u − as the inverse of u modulus d , that is uu − ≡ d ) (so does s − , v − and so on), and then deﬁne c d to be the integer such that uu − − dc d , one then has | I | X u U ( u,d )=1 (cid:12)(cid:12)(cid:12)(cid:12) X v ∼ Tu v ≡ u − (mod d ) (log v ) e (cid:16) α (cid:16) uv − d (cid:17) k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) X u U ( u,d )=1 log T · sup z Tu (cid:12)(cid:12)(cid:12)(cid:12) X max { z, Tu }

0. Secondly, in a similar routine , italso can be deduced from [LZ, formula (3.67)] that | I | k a k ∞ X s U ( s,u )=1 (cid:12)(cid:12)(cid:12) X w ∼ Tsd e ( α ( sw + c d ) k ) (cid:12)(cid:12)(cid:12) ≪ Td (log N ) C (cid:18) q + dT + q (cid:0) Td (cid:1) k (cid:19) k , here we have abused the number c d .As for I , a dyadic argument gives for each U L TU | I | ≪ log T (cid:12)(cid:12)(cid:12)(cid:12) X s ∼ L ( s,d )=1 b ( s ) X max { Ts ,U }

0. Now we let D = max { C , C } + 2 k ( k + 1) + 8 . (3.8)Putting everything together and absorbing terms which are obviously smaller thanother terms, it allows us to conclude that S d ( α ) ≪ N (log N ) − D + N (log N ) D (cid:18) q + (log N ) D M / + qN (cid:19) k − , (3.9)taking note that dM (log N ) − D T dM + 1 and N ≍ M k . The lemma isproven by above inequality (3.9), and noting the fact that when α lies in minorarcs m the lower and upper bound of q in (3.7) and (3.3). (cid:3) Restriction for shifted prime powers

Another preparation is to establish the following restriction lemma.

Lemma 4.1 (Restriction). Let $p>k(k+1)+2$ be a real number, and let $S_d(\alpha)$ be the exponential sum defined in (3.1). Then we have
\[ \int_{\mathbb{T}}\bigl|S_d(\alpha)\bigr|^p\,d\alpha\ll_pN^{p-1}. \]

Generally speaking, the procedure of proving a restriction lemma is to perform a type of $\varepsilon$-removal argument. So we first consider a variant of Hua's lemma.

Lemma 4.2 (A variant of Hua's lemma). Let $p\geq k(k+1)+2$ be a positive number. Then we have
\[ \int_{\mathbb{T}}|S_d(\alpha)|^p\,d\alpha\ll N^{p-1}(\log N)^{k(k+1)+2}. \]

Proof.

Firstly, by expanding the p -th powers, we notice that when p = k ( k + 1) + 2 Z T | S d ( α ) | k ( k +1)+2 d α = Z T (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X y ∈ [ M ] φ ( d ) d ky k − Λ( dy + 1) e (cid:16) αy k (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) k ( k +1)+2 d α ≪ X y ,...,y k ( k +1)+2 ∈ [ M ] Y i k ( k +1)+2 (cid:0) y k − i log( dy i +1) (cid:1) Z T e (cid:0) α (cid:0) y k + · · ·− y kk ( k +1)+2 (cid:1)(cid:1) d α. From the trivial upper bounds y k − i log( dy i +1) ≪ M k − log N for all i = 1 , . . . , k ( k +1) + 2, one may ﬁnd that Z T | S d ( α ) | k ( k +1)+2 d α ≪ (cid:0) M k − log N (cid:1) k ( k +1)+2 Z T (cid:12)(cid:12)(cid:12) X y ∈ [ M ] e ( y k α ) (cid:12)(cid:12)(cid:12) k ( k +1)+2 d α. On the other hand, let α = ( α , · · · , α k ) ∈ T k be a k -tuple of reals, and deﬁnean exponential sum F ( α ; M ) = X y ∈ [ M ] e ( yα + · · · + y k α k ) . From the observation that R T k | F ( α ; M ) | k ( k +1)+2 d α counts the number of integralsolutions to the system of k equations y j + · · · y j k ( k +1)2 = y j k ( k +1)2 + · · · + y jk ( k +1)+2 j k with variables y i ∈ [ M ] (cid:0) i k ( k + 1) + 2 (cid:1) , hence Z T (cid:12)(cid:12)(cid:12)(cid:12) X y ∈ [ M ] e ( y k α ) (cid:12)(cid:12)(cid:12)(cid:12) k ( k +1)+2 d α X | h j | k ( k +1) M j j k − Z T k | F ( α ; M ) | k ( k +1)+2 e (cid:0) − h α −· · ·− h k − α k − (cid:1) d α. The making use of [BDG, formula (7)] with p = k ( k + 1) + 2 together with thetriangle inequality leads to Z T (cid:12)(cid:12)(cid:12) X y ∈ [ M ] e ( y k α ) (cid:12)(cid:12)(cid:12) k ( k +1)+2 d α ≪ M k ( k +1)+2 − k . Next, substituting above inequality into the upper bound of R T | S d ( α ) | k ( k +1)+2 d α to get that Z T | S d ( α ) | k ( k +1)+2 d α ≪ N k ( k +1)+1 (log N ) k ( k +1)+2 , just on taking note that N = M k .Whilst, it is fairly straightforward that k S d k ∞ X y ∈ [ M ] k φ ( d ) d y k − Λ( dy + 1) ≪ N, once again recalling that N = M k . 
We can then conclude that
$$\int_{\mathbb{T}} |S_d(\alpha)|^p\,d\alpha \leq \|S_d\|_\infty^{\,p-k(k+1)-2} \int_{\mathbb{T}} |S_d(\alpha)|^{k(k+1)+2}\,d\alpha \ll N^{p-1}(\log N)^{k(k+1)+2}$$
whenever $p \geq k(k+1)+2$. $\square$

Now, for any real number $0 < \eta \leq 1$, define the $\eta$-large spectrum of $S_d$ to be
$$\mathcal{R}_\eta = \bigl\{\alpha\in\mathbb{T} : |S_d(\alpha)| > \eta N\bigr\},$$
the set of frequencies at which $S_d$ takes large values. The key observation is that the large spectrum cannot be large.

Lemma 4.3 (Large spectrum set is small). For every real number $0 < \eta \leq 1$,
$$\operatorname{meas}\mathcal{R}_\eta \ll_\varepsilon \eta^{-k(k+1)-2-\varepsilon} N^{-1}$$
holds for arbitrarily small $\varepsilon = \varepsilon(\eta) > 0$.

We can now deduce Lemma 4.1 from Lemma 4.3.

Proof of Lemma 4.1.

Suppose that $p > k(k+1)+2$ is a real number. By a dyadic decomposition, we have
$$\int_{\mathbb{T}} |S_d(\alpha)|^p\,d\alpha \leq \sum_{j\geq 0} \int_{\{\alpha\in\mathbb{T} \,:\, N/2^{j+1} \leq |S_d(\alpha)| \leq N/2^{j}\}} |S_d(\alpha)|^p\,d\alpha \leq \sum_{j\geq 0} 2^{-jp} N^p \operatorname{meas}\bigl\{\alpha\in\mathbb{T} : |S_d(\alpha)| \geq 2^{-j-1}N\bigr\}.$$
It then follows from Lemma 4.3, applied with $\eta = 2^{-j-1}$, that
$$\int_{\mathbb{T}} |S_d(\alpha)|^p\,d\alpha \ll_\varepsilon N^{p-1}\, 2^{k(k+1)+2+\varepsilon} \sum_{j\geq 0} 2^{j(k(k+1)+2-p+\varepsilon)}.$$
Since $p > k(k+1)+2$, we may take $\varepsilon = \varepsilon(p)$ sufficiently small that the geometric series converges. Therefore the left-hand integral is bounded by $O_p(N^{p-1})$. $\square$

It therefore remains to prove Lemma 4.3. We first note that an elementary argument based on Lemma 4.2 gives the lemma when $\eta \ll (\log N)^{-\frac{k(k+1)+2}{\varepsilon}}$. To see this, recalling the definition of the large spectrum $\mathcal{R}_\eta$ and Lemma 4.2, one has
$$(\eta N)^{k(k+1)+2} \operatorname{meas}\mathcal{R}_\eta \leq \int_{\mathbb{T}} |S_d(\alpha)|^{k(k+1)+2}\,d\alpha \ll N^{k(k+1)+1}(\log N)^{k(k+1)+2},$$
which gives
$$\operatorname{meas}\mathcal{R}_\eta \ll \eta^{-k(k+1)-2} (\log N)^{k(k+1)+2} N^{-1}.$$
Hence, whenever $\eta \ll (\log N)^{-\frac{k(k+1)+2}{\varepsilon}}$, we trivially have $\operatorname{meas}\mathcal{R}_\eta \ll \eta^{-k(k+1)-2-\varepsilon} N^{-1}$. Thus, in the remainder of this section we may restrict attention to those $\eta$ in the range
$$(\log N)^{-\frac{k(k+1)+2}{\varepsilon}} \ll \eta \leq 1. \tag{4.1}$$

We may pick a discrete sequence $\{\alpha_1,\dots,\alpha_R\}\subset\mathbb{T}$ satisfying:
(1) for any $\alpha_r\in\{\alpha_1,\dots,\alpha_R\}$, $|S_d(\alpha_r)| > \eta N$;
(2) for any pair $\alpha_i \neq \alpha_j \in \{\alpha_1,\dots,\alpha_R\}$, $|\alpha_i-\alpha_j| \geq N^{-1}$;
(3) for any $\alpha\in\mathcal{R}_\eta$, there is some $\alpha_r\in\{\alpha_1,\dots,\alpha_R\}$ such that $|\alpha-\alpha_r| \leq N^{-1}$;
(4) $R$ is the largest integer for which the above three properties hold.

Clearly $\operatorname{meas}\mathcal{R}_\eta \ll R/N$. Consequently, it suffices to show that
$$R \ll_\varepsilon \eta^{-k(k+1)-2-\varepsilon}. \tag{4.2}$$
For $1\leq r\leq R$, let $c_r\in\{z\in\mathbb{C} : |z| = 1\}$ be a number such that $c_r S_d(\alpha_r) = |S_d(\alpha_r)|$.
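The two elementary steps used so far, the Chebyshev-type bound $(\eta N)^p \operatorname{meas}\mathcal{R}_\eta \leq \int |S_d|^p$ and the dyadic decomposition from the proof of Lemma 4.1, can be illustrated as follows. This is a minimal Python sketch under simplifying assumptions: the sum is the unweighted toy sum $\sum_{y\in[M]} e(\alpha y^k)$, whose trivial bound is $M$ rather than $N$, and the measure on $\mathbb{T}$ is replaced by the uniform measure on a finite grid. All function names are hypothetical.

```python
import cmath


def sample_values(M, k, grid):
    """|sum_{y<=M} e(alpha*y^k)| at the grid points alpha = a/grid."""
    return [abs(sum(cmath.exp(2j * cmath.pi * (a / grid) * y ** k)
                    for y in range(1, M + 1)))
            for a in range(grid)]


def chebyshev_sides(values, M, p, eta):
    """Return ((eta*M)^p * meas{|S| >= eta*M}, p-th moment) for the grid measure."""
    measure = sum(1 for v in values if v >= eta * M) / len(values)
    moment = sum(v ** p for v in values) / len(values)
    return (eta * M) ** p * measure, moment


def dyadic_bound(values, M, p, levels=60):
    """Sum over j of (M/2^j)^p * meas{M/2^(j+1) < |S| <= M/2^j}."""
    total = 0.0
    for j in range(levels):
        hi, lo = M / 2 ** j, M / 2 ** (j + 1)
        measure = sum(1 for v in values if lo < v <= hi) / len(values)
        total += hi ** p * measure
    return total
```

With `values = sample_values(10, 2, 400)`, the first component of `chebyshev_sides(values, 10, 6, 0.5)` never exceeds the second, and `dyadic_bound(values, 10, 6)` dominates the sixth moment, mirroring the two inequalities in the text.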
From property (1) of the sequence $\{\alpha_1,\dots,\alpha_R\}$ we find that
$$R^2\eta^2N^2 \leq \Bigl(\sum_{r\leq R} |S_d(\alpha_r)|\Bigr)^2.$$
Expanding the definition of the exponential sum and applying the Cauchy-Schwarz inequality, we obtain
$$R^2\eta^2N^2 \leq \Bigl|\sum_{r\leq R} c_r \sum_{y\in[M]} k\,\frac{\phi(d)}{d}\, y^{k-1}\Lambda(dy+1)\, e(\alpha_r y^k)\Bigr|^2 \ll N \sum_{y\in[M]} k\,\frac{\phi(d)}{d}\, y^{k-1}\Lambda(dy+1) \Bigl|\sum_{r\leq R} c_r\, e(\alpha_r y^k)\Bigr|^2,$$
on noting that $\sum_{y\in[M]} k\,\frac{\phi(d)}{d}\, y^{k-1}\Lambda(dy+1) \ll N$. Expanding the square, which doubles the variable $r$, we then have
$$R^2\eta^2N \ll \sum_{y\in[M]} \sum_{r,r'\leq R} c_r\,\overline{c_{r'}}\; k\,\frac{\phi(d)}{d}\, y^{k-1}\Lambda(dy+1)\, e\bigl((\alpha_r-\alpha_{r'})y^k\bigr).$$
Since $c_r$ and $c_{r'}$ are 1-bounded, we can swap the order of the two summations and take absolute values inside the sum over $r$ and $r'$. It then follows from the definition of $S_d(\alpha)$ that
$$R^2\eta^2N \ll \sum_{r,r'\leq R} |S_d(\alpha_r-\alpha_{r'})|.$$
Let $\gamma > k$ be a positive number. By Hölder's inequality one has
$$R^2\eta^{2\gamma}N^{\gamma} \ll \sum_{r,r'\leq R} |S_d(\alpha_r-\alpha_{r'})|^{\gamma}.$$

It is clear from Lemma 3.3 and (4.1) that if $\alpha_r-\alpha_{r'}\in\mathfrak{m}$, the contribution of $S_d(\alpha_r-\alpha_{r'})$ is negligible, i.e. much smaller than $\eta^2N$. We may therefore assume further that $\alpha_r-\alpha_{r'}\in\mathfrak{M}$. In this case the major-arc approximation recorded in Lemma 3.2 gives
$$S_d(\alpha_r-\alpha_{r'}) \ll q^{-\frac{1}{k}+\varepsilon} N \bigl(1+N|\alpha_r-\alpha_{r'}-a/q|\bigr)^{-1}$$
for some relatively prime pair $1\leq a\leq q\leq Q$ (recalling the definition of the major arcs $\mathfrak{M}$). We also notice that when $q > \widetilde{Q}$,
$$S_d(\alpha_r-\alpha_{r'}) \ll \widetilde{Q}^{-\frac{1}{k}+\varepsilon} N.$$
Hence
$$\sum_{r,r'\leq R} |S_d(\alpha_r-\alpha_{r'})|^{\gamma} \ll R^2N^{\gamma}\eta^{2\gamma}$$
if we take $\widetilde{Q} = \eta^{-2k+\varepsilon}$ and restrict to $q > \widetilde{Q}$. Combining what we have so far, we may restrict our consideration to those $q$ which are no more than $\widetilde{Q}$.
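The duality step above, which passes from the values $|S_d(\alpha_r)|$ to the differences $\alpha_r-\alpha_{r'}$, holds for any exponential sum with non-negative weights: if $S(\alpha) = \sum_y w_y\, e(\alpha y^k)$ and $W = \sum_y w_y$, then $\bigl(\sum_r |S(\alpha_r)|\bigr)^2 \leq W \sum_{r,r'} |S(\alpha_r-\alpha_{r'})|$. The Python sketch below checks this inequality numerically. The weights $w_y = k y^{k-1}$ and the sample frequencies are assumptions chosen for illustration (the von Mangoldt factor is omitted), and the function names are ours.

```python
import cmath


def weighted_sum(alpha, weights, k):
    """S(alpha) = sum_y w_y e(alpha*y^k), where weights[y-1] = w_y >= 0."""
    return sum(w * cmath.exp(2j * cmath.pi * alpha * (y ** k))
               for y, w in enumerate(weights, start=1))


def duality_sides(points, weights, k):
    """Return (lhs, rhs) with lhs = (sum_r |S(a_r)|)^2 and
    rhs = W * sum_{r,r'} |S(a_r - a_r')|, W being the total weight."""
    W = sum(weights)
    lhs = sum(abs(weighted_sum(a, weights, k)) for a in points) ** 2
    rhs = W * sum(abs(weighted_sum(a - b, weights, k))
                  for a in points for b in points)
    return lhs, rhs
```

If every point satisfies $|S(\alpha_r)| \geq \eta W$, the left-hand side is at least $R^2\eta^2W^2$, which recovers the shape $R^2\eta^2 W \ll \sum_{r,r'} |S(\alpha_r-\alpha_{r'})|$ used in the text.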
Andwe’ll have η γ R ≪ X q e Q X a ( q )( a,q )=1 X r,r ′ R q − γk + ε (1 + N | α r − α r ′ − a/q | ) − γ X r,r ′ R G ( α r − α r ′ ) , PARSE FURSTENBERG-S ´ARK ¨OZY 19 where G ( α ) = X q e Q X a ( q ) q − γk + ε F ( α − a/q )and F ( β ) = (1 + N | sin β | ) − γ . We are now in the position of [Bou89, Eq.(4.16)], just need to replace N by N ,replace δ by η . Following from Bourgain’s proof, one may ﬁnd that when γ > k + εη γ R N − ≪ N − n e Q τ RN − + R N − e Q · n | u | N : d ( e Q ; u ) > e Q oo , where τ > d ( e Q ; u ) counts the number of thedivisors of u which are smaller than e Q . It follows from [Bour, Lemma 2.8] that R ≪ η − γ e Q τ + η − γ e Q − B R with 0 B < + ∞ . By careful choice of τ and B , together with letting γ = k ( k +1)+22 + ε will lead us to the desired bound (4.2) of R .5. A Local Inverse Theorem

The main goal of this section is to prove the following local inverse result, whichshows that if the sum, weighted by an average zero function f , is non-zero, theneither the summation interval is small, or there is a reasonable bias of f towardsa structured sub-region of the summation interval. Theorem 5.1 (Local inverse theorem) . Let δ > be a positive number satisfying (log N ) − D k ( k +1)+8 ≪ δ , where D > k ( k + 1) + 8 is the large number deﬁned in(3.8). Let d, k > be integers and M = N /k . Let a ′ q ′ exp (cid:0) c ′ √ log N (cid:1) be positive integers with ( a ′ , q ′ ) = 1 , and Λ a ′ ,q ′ is the majorant function deﬁnedin (2.3) which is supported on the interval [ N ] . Assume that f : [ N ] → C is afunction satisfying − δ f Λ a ′ ,q ′ − δ pointwise and E ( f ) = 0 . If δ N ≪ (cid:12)(cid:12)(cid:12) X x ∈ [ N ] X y ∈ [ M ] φ ( d ) d ky k − Λ( dy + 1) f ( x ) f ( x + y k ) (cid:12)(cid:12)(cid:12) , then one of the following three alternatives holds:(1) ( N is small) N ≪ q ′ ;(2) ( N is small) exp( c ′ √ log N ) ≪ dδ − ;(3) ( f has local structure) there exists an arithmetic progression P with com-mon diﬀerence q ≪ δ − k ( k +1) and length X ≫ (log N ) k D N , where D isthe number deﬁned in (3.8), such that X x ∈ P f ( x ) ≫ δ k X. As the readers may expect, we are planning to prove this result using Fourieranalysis. From the perspective of the exponential sum S d ( α ) which is deﬁned in(3.1), it follows from orthogonality principle that X x ∈ [ N ] X y ∈ [ M ] φ ( d ) d ky k − Λ( dy + 1) f ( x ) f ( x + y k ) = Z T b f ( α ) b f ( − α ) S d ( α ) d α. Just after an application of the triangle inequality, it can be seen from the as-sumption of Theorem 5.1 that δ N ≪ (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)Z m b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) , (5.1)where M and m are major and minor arcs deﬁned in (3.2). 
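The orthogonality identity behind (5.1) can be tested on a small example. The Python sketch below is only a sanity check under stated assumptions: the integral over $\mathbb{T}$ is replaced by an average over $\{a/Q\}$ with $Q > 2(N+M^k)$, so that the discrete orthogonality relation is exact, the weights $w_y$ stand in for $\frac{\phi(d)}{d} k y^{k-1}\Lambda(dy+1)$, and $\widehat{f}(\alpha) = \sum_x f(x)\,e(\alpha x)$. The function names and the toy data are ours.

```python
import cmath


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def direct_count(f, w, k):
    """sum_x sum_y w[y] * f(x) * f(x + y^k), with f and w given as dicts."""
    return sum(wy * fx * f.get(x + y ** k, 0.0)
               for x, fx in f.items() for y, wy in w.items())


def fourier_count(f, w, k):
    """Average over a/Q of f_hat(alpha) * f_hat(-alpha) * S(alpha)."""
    N, M = max(f), max(w)
    Q = 2 * (N + M ** k) + 1  # exceeds |x - x' + y^k|, so no wrap-around
    total = 0
    for a in range(Q):
        alpha = a / Q
        f_hat = sum(fx * e(alpha * x) for x, fx in f.items())
        f_hat_neg = sum(fx * e(-alpha * x) for x, fx in f.items())
        S = sum(wy * e(alpha * y ** k) for y, wy in w.items())
        total += f_hat * f_hat_neg * S
    return (total / Q).real
```

For a balanced function such as $f = 1_A - \delta$ on $[12]$ with $A = \{1,2,5,10\}$ and $\delta = 1/3$, the two functions agree up to rounding error.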
We’d like to show thatthe integral over the minor arcs is negligible, i.e. bounded by cδ N for some tinyenough c >

0. To do this, we need the restriction lemmas established in Section2 and Section 4. By making use of H¨older’s inequality, the second term in theright-hand side of (5.1), i.e. the integral over minor arcs, is bounded by (cid:16)Z m | S d ( α ) | k ( k +1)+4 d α (cid:17) k ( k +1)+4 (cid:16)Z T (cid:12)(cid:12)(cid:12) b f ( α ) (cid:12)(cid:12)(cid:12) k ( k +1)+8 k ( k +1)+3 d α (cid:17) k ( k +1)+3 k ( k +1)+4 sup α ∈ m (cid:12)(cid:12) S d ( α ) (cid:12)(cid:12) k ( k +1)+4 (cid:16)Z T | S d ( α ) | k ( k +1)+3 d α (cid:17) k ( k +1)+4 (cid:16)Z T (cid:12)(cid:12)(cid:12) b f ( α ) (cid:12)(cid:12)(cid:12) k ( k +1)+8 k ( k +1)+3 d α (cid:17) k ( k +1)+3 k ( k +1)+4 . The application of Lemma 2.3 with Λ b,d = Λ a ′ ,q ′ and p = k ( k +1)+8 k ( k +1)+3 and Lemma 4.1with p = k ( k + 1) + 3 yields (cid:12)(cid:12)(cid:12)Z m b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≪ sup α ∈ m (cid:12)(cid:12) S d ( α ) (cid:12)(cid:12) k ( k +1)+4 N k ( k +1)+7 k ( k +1)+4 . It then follows from Lemma 3.3 and the assumption (log N ) − D k ( k +1)+8 ≪ δ that (cid:12)(cid:12)(cid:12)Z m b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≪ (log N ) − Dk ( k +1)+4 N ≪ δ N . Substitute above inequality into (5.1), we then obtain that (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≫ δ N . (5.2)The next step is to show that we can gain from (5.2) a long arithmetic pro-gression which possesses much information of the function f . To carry out thisprocedure we are now proving that the inequality (5.2) can be reduced to a subsetof major arcs M , and the set only contains frequencies which are close to rationalswith small denominators (depending only on δ ).Let q be a positive integer, deﬁne a set with respect to q as M ( q ) = n α ∈ T : ∃ a q with ( a, q ) = 1 such that (cid:12)(cid:12) α − aq (cid:12)(cid:12) QN o , where Q is the number deﬁned in (3.3). 
We now partition the major arcs M deﬁned in (3.2) into two disjoint ranges M = [ q ≪ δ − k k +1) M ( q ) and M = [ δ − k k +1) ≪ q ≪ Q M ( q ) . It follows from triangle inequalty and (5.2) that δ N ≪ (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) . (5.3)In an analogous way of proving that the integral over minor arcs m is negligible,we can also show that the integral over M is negligible. Indeed, one can deduce PARSE FURSTENBERG-S ´ARK ¨OZY 21 from H¨older’s inequality and restriction lemmas, i.e. Lemma 2.3 and Lemma 4.1,that (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≪ sup α ∈ M (cid:12)(cid:12) S d ( α ) (cid:12)(cid:12) k ( k +1)+4 N k ( k +1)+7 k ( k +1)+4 . Dues to Lemma 3.2 works when q ≪ Q ≪ (log N ) k D , the application of Lemma3.2 gives that the integral over M is bounded by (cid:16) q − k + ε min (cid:8) N, | β | − (cid:9) + N (1 + N | β | ) exp (cid:0) − c p log N (cid:1)(cid:17) k ( k +1)+4 N k ( k +1)+7 k ( k +1)+4 for all aq + β ∈ M with coprime pair 1 a q . The assumption δ − k ( k +1) ≪ q ≪ (log N ) k D whenever α = aq + β ∈ M then gives (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≪ δ N . Substituting above inequality into (5.3) yields that (cid:12)(cid:12)(cid:12)Z M b f ( α ) b f ( − α ) S d ( α ) d α (cid:12)(cid:12)(cid:12) ≫ δ N . Remark.

It is evident from above analysis that a minor-arc type restriction resultwhich states that R M ∪ m | S d ( α ) | p d α ≪ N p − holds for positive numbers p > C ,where C >

Proof of Theorem 5.1.

It can be deduced from the deﬁnition of M together with above inequality that δ N ≪ X q ≪ δ − k k +1) Z M ( q ) | b f ( α ) | | S d ( α ) | d α. Taking note that the asymptotic formula in Lemma 3.2 is valid when q ≪ δ − k ( k +1) and δ − ≪ (log N ) D k ( k +1)+8 , it can be deduced from above inequality and Lemma3.2 that δ N ≪ X q ≪ δ − k k +1) q − k + ε N Z M ( q ) | b f ( α ) | d α. Clearly, on noting that P q ≤ δ − k k +1) q − k + ε ≪ δ − k ( k − − ε , it then follows frompigeonhole principle that there is some q ≤ δ − k ( k +1) such that δ k ( k − ε N ≪ Z M ( q ) | b f ( α ) | d α. (5.4)Now, suppose that P = a + q · [ X ] is an arithmetic progression of commondiﬀerence q , the denominator such that (5.4) holds, length X = Q − N/ (2 π ).We are going to show that, if N is not small, for any frequency α in M ( q ), the Fourier coeﬃcient | b − P ( α ) | is large. For this purpose, we now consider the Fouriertransform of the indicator function 1 − P , after changing variables x to − x one has | b − P ( α ) | = (cid:12)(cid:12)(cid:12)(cid:12)X x − P ( x ) e ( xα ) (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)X x P ( x ) e ( − xα ) (cid:12)(cid:12)(cid:12)(cid:12) . On omitting the indicator function 1 P , | b − P ( α ) | = (cid:12)(cid:12)(cid:12)(cid:12) X x ∈ [ X ] e ( − αa − qxα ) (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12) X x ∈ [ X ] e ( − qxα ) (cid:12)(cid:12)(cid:12)(cid:12) . (5.5)With the help of the estimate | − e ( α ) | π k α k , as well as the triangle inequality,we can ensure that | b − P ( α ) | > X − X x ∈ [ X ] π k qxα k > X − πX k qα k . It thus follows immediately from the facts k qα k ≪ QN − ( α ∈ M ( q )) and X = Q − N/ (2 π ) that whenever α ∈ M ( q ), | b − P ( α ) | > X/ . 
(5.6)We’d like to remark that (5.6) holds for any choice of the starting element a ofthe progression P , and this can be easily seen from (5.5).In order to prove the local inverse theorem, our strategy is to show that theconvolution f ∗ − P ( x ) is large at some point x , which means that f correlateswith a translation of this progression P . Combining (5.4) with (5.6), we note that δ k ( k − ε X N ≪ Z M ( q ) | b − P ( α ) | | b f ( α ) | d α. Then by removing the restriction of domain of the integral to get that δ k +1 X N ≪ Z T | b − P ( α ) | | b f ( α ) | d α. Applying Parseval’s identity and pigeonhole principle in successive, we have δ k +1 X N ≪ k f ∗ − P k sup x | f ∗ − P ( x ) | X x | f ∗ − P ( x ) | . If there exists some x such that | f ∗ − P ( x ) | > δX , on noting that − δ f pointwise so that − δX f ∗ − P pointwise, it must be f ∗ − P ( x ) > δX, (5.7)which gains more than we expected. Otherwise, one has X x | f ∗ − P ( x ) | ≫ δ k XN. (5.8)However, on the other hand, we also notice from the deﬁnition of the convolutionthat X x f ∗ − P ( x ) = X y f ( y ) X x − P ( x − y ) . PARSE FURSTENBERG-S ´ARK ¨OZY 23

For each ﬁxed number y , the inner sum is always X , the length of the progression P . Consequently, X x f ∗ − P ( x ) = 0 , (5.9)recalling that the average of f is zero. As a consequence, on combining (5.8) and(5.9), it then follows from the pigeonhole principle, together with supp( f ) ⊂ [ N ],that there is some x such that f ∗ − P ( x ) ≫ δ k X. Expanding the convolution once again, it allows us to conclude that δ k X ≪ X y ∈ x + P f ( y ) . And Theorem 5.1 follows. (cid:3) Density Increment

We are going to prove our main result in this section. The strategy is to run adensity increment argument, the method for Roth to bound the size of sets lacking3-term arithmetic progressions, and for S´ark¨ozy to bound the size of sets lackingFurstenberg-S´ark¨ozy conﬁgurations x, x + y .In the ﬁrst place, assume that A , as a subset of P N of size at least δ |P N | , lacksthe non-trivial patterns x, x + ( p − k . It then follows from prime number theorythat X n ∈ [ N ] Λ( n )1 A ( n ) > δN. By increasing δ if necessary, we may assume that the average of the weightedbalanced function f = Λ · A − δ is zero, i.e. X n ∈ [ N ] f ( n ) = X n ∈ [ N ] (Λ · A − δ [ N ] )( n ) = 0 . We now consider the following summation, which counts the weighted numberof patterns x, x + ( p − k lie in A , X x ∈ [ N ] X y ∈ [ M ] ky k − Λ( y − · A )( x )(Λ · A )( x + y k ) , where M = N /k . Clearly, from our assumption that A does not contain non-trivial such patterns, it is zero.Now we assume that (cid:12)(cid:12) A ∩ [ N , N ] (cid:12)(cid:12) > δ |P N | , otherwise either (cid:12)(cid:12) A ∩ [ N ] (cid:12)(cid:12) > δ |P N | or (cid:12)(cid:12) A ∩ [ N , N ] (cid:12)(cid:12) > δ |P N | , which means that in at least one of the in-tervals [1 , N ] and [ N , N ] the set A has a relative density at least δ . Bydecomposing the weighted balanced function f one may ﬁnd that the summation P x ∈ [ N ] P y ∈ [ M ] ky k − Λ( y − f ( x ) f ( x + y k ) equals to X x ∈ [ N ] X y ∈ [ M ] ky k − Λ( y − n (Λ · A )( x )(Λ · A )( x + y k ) − δ (Λ · A )( x )1 [ N ] ( x + y k ) − δ [ N ] ( x )(Λ · A )( x + y k ) + δ [ N ] ( x )1 [ N ] ( x + y ) o . On restricting the summation interval of x into [ N , N ], it is not hand to seethat the second term is no more than − δ X x ∈ [ N , N ] (Λ · A )( x ) X y ∈ [ M ] ky k − Λ( y − [ N ] ( x + y k ) . Taking noting that when x N and y k M k = N we trivially have x + y k N ,above expression is indeed − δ X x ∈ [ N , N ] (Λ · A )( x ) X y ∈ [ M ] ky k − Λ( y − . 
It then follows from the assumption (cid:12)(cid:12) A ∩ [ N , N ] (cid:12)(cid:12) > δ |P N | that the second termis indeed less than or equal to − δ N M k . In the same routine one can also getthe third term is no more than − δ N M k . As a consequence of the assumptionthat A lacks p , p + ( p − k with p , p ∈ P , X x ∈ [ N ] X y ∈ [ M ] ky k − Λ( y − f ( x ) f ( x + y k ) − δ N M k . That is, δ N ≪ (cid:12)(cid:12)(cid:12) X x ∈ [ N ] X y ∈ [ M ] ky k − Λ( y − f ( x ) f ( x + y k ) (cid:12)(cid:12)(cid:12) , which means that if A is not pseudorandom, the weighted number of patterns x, x + ( p − k in A is far away from the expected number δ N .When (log N ) − D k ( k +1)+8 ≪ δ , it follows from Theorem 5.1 with d = 1, a ′ = q ′ = 1and f = Λ · A − δ [ N ] , together with the fact that the interval [ N ] always containsprime numbers, that there exists an arithmetic progression P = a + q · [ X ] with( a, q ) = 1, common diﬀerence q ≪ δ − k ( k +1) and length X ≫ (log N ) k D N suchthat δ k X ≪ X x ∈ P f ( x ) . For a technical reason, we’d like to reduce our consideration to a reasonablesub-progression. The trick is to partition above progression P into disjoint sub-progressions of common diﬀerence q k , and, thus, each sub-progression has length ≫ j Xq k − k . The pigeonhole principle tells us that there is such a sub-progression P ′ such that δ k | P ′ | ≪ X x ∈ P ′ f ( x ) . Now we rename P = P ′ = a + q k · [ X ] with 1 a q k (this can be guaranteedfrom the proof of Theorem 5.1), q ≪ δ − k ( k +1) and X ≫ (log N ) k D q − k N . Inaddition, on recalling the deﬁnition of the balanced function f , above inequalityis indeed δ k X ≪ X x ∈ P Λ · A ( x ) − δX . That is, from the deﬁnition of the progression P , X x ∈ [ X ] (Λ · A )( a + q k x ) = δ X PARSE FURSTENBERG-S ´ARK ¨OZY 25 with δ > δ + cδ k . As we described in Section 1, in the second iteration stepwe need to ﬁnd conﬁgurations a + q k x, a + q k x + ( p − k in the progression P = a + q k · [ X ]. 
More or less, it equals to ﬁnd x, x + y k in [ X ]with q y + 1 ∈ P and a + q k x ∈ P . Since A = A − a q k contains no non-trivialconﬁgurations x, x + y k with q y + 1 ∈ P and a + q k x ∈ P , whenever (cid:12)(cid:12)(cid:12) A ∩ (cid:16) a + q k · [ 110 X , X ] (cid:17)(cid:12)(cid:12)(cid:12) > δ (cid:12)(cid:12)(cid:12) P N ∩ (cid:0) a + q k · [ 110 X , X ] (cid:1)(cid:12)(cid:12)(cid:12) Theorem 5.1 can be applied with a ′ = a , q ′ = q k and N = X whenever q k ≪ exp (cid:0) c ′ √ log X (cid:1) . Then by taking advantage of Theorem 5.1 with d = q , M = (cid:6) ( X / /k (cid:7) , and f = Λ a ,q k · A − δ [ X ] , one of the following three casesmust hold. If X ≪ q k , from Xylouris’ result [Xyl, Theorem 1.2] for Linnik’stheorem, we can not guarantee the progression P = a + q k · [ X ] contains aprime number; if X ≫ q k but q δ − ≫ exp (cid:0) c ′ √ log X (cid:1) , it will violate Lemma3.2 (and so Lemma 2.3); if X ≫ q k and q k ≪ exp (cid:0) c ′ √ log X (cid:1) (and immed-italy q δ − ≪ exp (cid:0) c ′ √ log X (cid:1) ), after a necessary partition of progressions, wecan ﬁnd an arithmetic progression P = a + q k · [ X ] with q ≪ δ − k ( k +1)1 and X ≫ (log N ) k D q − k X and ( a , q ) = 1 such that X x ∈ P Λ a ,q k · A ( x ) − δ X ≫ δ k X . On noticing 1 A ( x ) = 1 A ( a + q k x ) and Λ a ,q k ( x ) = Λ( a + q k x ), by writing Q = a + a q k + ( q q ) k · [ X ], it is indeed X x ∈ Q (Λ · A )( a + a q k + ( q q ) k x ) > (cid:0) δ + cδ k (cid:1) | Q | . And Theorem 1.1 is proven by performing above argument frequently.

Proof of Theorem 1.1.

Assume that P i = a i + q ki · [ X i ](1 i m ) with ( a i , q i ) = 1 is a sequence ofprogressions obtained by running above-described argument repeatedly. And, forconvenience, let us deﬁne another sequence of progressions Q i (1 i m ) withrespect to P i by taking Q i = a + a q k + · · · + ( q · · · q i ) k · [ X i ] . It is clear that Q i ⊂ [ N ] and | Q i | = | P i | for each i = 1 , · · · , m . Besides, we deﬁnethe relative density of A in Q i (1 i m ) to be the following magnitude (cid:12)(cid:12) A ∩ Q i (cid:12)(cid:12)(cid:12)(cid:12) P N ∩ Q i (cid:12)(cid:12) . Very specially, when Q i = [ N ], the relative density of A in Q i is the standardrelative density in primes.To sum up what we have so far, there is a series of arithmetic progressions Q i (1 i m ) and associated sets A i (1 i m ) such that either at some step i the relative density of A in at least one of the progressions a + a q k + · · · + ( q · · · q i ) k · [1 , X i ] and a + a q k + · · · + ( q · · · q i ) k · [ X i , X i ] is no less than δ i ,or the following events happen simultaneously(1) the relative density of A in Q i is δ i satisfying δ i > δ i − + cδ k i − with δ = δ + cδ k ;(2) the progressions Q i take the form of Q i = a + a q k + · · · + ( q · · · q i ) k · [ X i ]for which ( a i , q i ) = 1, q i ≪ δ − k ( k +1) i − and X i ≫ (log N ) k D q − ki X i − ;(3) each set A i ⊂ [ X i ](1 i m ) lacks conﬁgurations x, x + y k with q · · · q i y +1 ∈ P and a + a q k + · · · + q k · · · q ki x ∈ P ;(4) X i ≫ (cid:0) q · · · q i (cid:1) k ;(5) ( q · · · q i ) k ≪ exp (cid:0) c ′ √ log X i (cid:1) .However, this iteration cannot last too long—it must stop at some step m ≪ δ − k ,since otherwise the density will surpass 1. 
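The reason the iteration must terminate can be seen in a toy model. The Python sketch below assumes a generic increment $\delta \mapsto \delta + c\,\delta^{K}$, where the constant $c$ and the exponent $K > 1$ are hypothetical parameters and not the exact quantities of the argument above. It counts the steps taken while the density is at most 1, and compares the count with the bound obtained by grouping steps according to the dyadic range $[2^j\delta, 2^{j+1}\delta)$ in which the density lies: each range costs at most $(2^j\delta)^{1-K}/c + 1$ steps.

```python
import math


def increment_steps(delta, c, K):
    """Number of steps delta -> delta + c*delta^K performed while delta <= 1."""
    steps = 0
    while delta <= 1.0:
        delta += c * delta ** K
        steps += 1
    return steps


def step_bound(delta, c, K):
    """Upper bound from the dyadic (doubling) argument, valid for K > 1."""
    ranges = math.ceil(math.log2(1.0 / delta)) + 1  # dyadic ranges met before density > 1
    geometric = 1.0 / (1.0 - 2.0 ** (1 - K))        # bound for sum_j 2^(j*(1-K))
    return delta ** (1 - K) * geometric / c + ranges
```

In this model the number of steps is $O_{c,K}(\delta^{1-K})$, which is the mechanism behind the bound on $m$ in terms of a negative power of $\delta$.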
And we’ll end up (the iteration) witheither the condition (4) or (5) failed, otherwise we could apply the third case ofTheorem 5.1 one more time.Firstly, we assume that (4) fails, combining this with (2) gives (cid:0) q · · · q m (cid:1) k ≫ (cid:0) (log N ) k D (cid:1) m (cid:0) q · · · q m (cid:1) − k N. Recalling that q i ≪ δ − k − for all 1 i m and m ≪ δ − k +1 , it then follows fromre-arranging that (cid:16)(cid:0) δ − k ( k +1) (cid:1) k (cid:17) cδ − k ≫ N (cid:0) (log N ) k D (cid:1) cδ − k . Taking logarithms of both sides yields δ ≪ (cid:18) log log N log N (cid:19) k . Secondly, we suppose that (5) fails, in a similar routine, one has( q · · · q m ) k ≫ exp (cid:0) c ′ p log X m (cid:1) , after taking logarithms of both sides (cid:18) δ − k log (cid:16)(cid:0) δ − k ( k +1) (cid:1) k (cid:17)(cid:19) ≫ log N − cδ − k log (cid:16)(cid:0) δ − k ( k +1) (cid:1) k − (cid:17) − cδ − k log log N. After re-arranging we’ll have δ ≪ (cid:18) log log N log N (cid:19) k . Put above two cases together, we’ll get the desired density bound. (cid:3)

Appendix A. Restriction for prime numbers

The main task of this appendix is to prove Lemma 2.3. We begin with the observation that the tail $\delta$ in Lemma 2.3 is harmless, so that it suffices to prove the following alternative lemma.

Lemma A.1 (Alternative restriction). Let $1\leq b\leq d\leq \exp\bigl(c'\sqrt{\log X}\bigr)$ be two integers satisfying $(b,d) = 1$. Let $\Lambda_{b,d}$ be the majorant function with respect to the pair $(b,d)$, which is defined in (2.3) and supported on the interval $[X]$. Then for any complex-valued function $\nu$ satisfying $|\nu| \leq \Lambda_{b,d}$ pointwise, and any real number $p > 2$,
$$\|\widehat{\nu}\|_p^p = \int_{\mathbb{T}} |\widehat{\nu}(\alpha)|^p\,d\alpha \ll_p X^{p-1}.$$

Indeed, suppose that $p > 2$ and $f = \nu - \delta 1_{[X]}$. The elementary inequality $|a+b|^p \ll_p |a|^p+|b|^p$ yields
$$\int_{\mathbb{T}} |\widehat{f}(\alpha)|^p\,d\alpha = \int_{\mathbb{T}} \bigl|\widehat{\nu}(\alpha) - \delta\,\widehat{1_{[X]}}(\alpha)\bigr|^p\,d\alpha \ll_p \int_{\mathbb{T}} |\widehat{\nu}(\alpha)|^p\,d\alpha + \delta^p \int_{\mathbb{T}} |\widehat{1_{[X]}}(\alpha)|^p\,d\alpha.$$
On the one hand, it follows from Lemma A.1 that $\|\widehat{\nu}\|_p^p \ll_p X^{p-1}$. On the other hand, when $p > 2$,
$$\int_{\mathbb{T}} |\widehat{1_{[X]}}(\alpha)|^p\,d\alpha \leq \bigl\|\widehat{1_{[X]}}\bigr\|_\infty^{p-2} \cdot \bigl\|\widehat{1_{[X]}}\bigr\|_2^2 \leq X^{p-1},$$
where the last inequality follows from Parseval's identity. Recalling the assumption that $\delta$ is a positive number tending to zero, we thus have
$$\int_{\mathbb{T}} |\widehat{f}(\alpha)|^p\,d\alpha \ll_p \int_{\mathbb{T}} |\widehat{\nu}(\alpha)|^p\,d\alpha + \delta^pX^{p-1} \ll_p X^{p-1}.$$
This gives Lemma 2.3, and in the remainder of the appendix we focus on Lemma A.1.

As for Lemma 4.3, we work with the large spectrum. For any real number $0 < \eta < 1$, the $\eta$-large spectrum for the Fourier transform of $\nu$ is defined to be the set
$$\mathcal{R}_\eta = \bigl\{\alpha\in\mathbb{T} : |\widehat{\nu}(\alpha)| > \eta X\bigr\}.$$
As expected, if we can show that
$$\operatorname{meas}\mathcal{R}_\eta \ll_\varepsilon \eta^{-2-\varepsilon}X^{-1}, \tag{A.1}$$
then a dyadic argument proves Lemma A.1. Indeed, assuming that $p > 2$ and taking $\varepsilon = \varepsilon(p)$ sufficiently small, one has
$$\int_{\mathbb{T}} |\widehat{\nu}(\alpha)|^p\,d\alpha \leq \sum_{j\geq 0} \int_{\{\alpha\in\mathbb{T} \,:\, X/2^{j+1} \leq |\widehat{\nu}(\alpha)| \leq X/2^{j}\}} |\widehat{\nu}(\alpha)|^p\,d\alpha \ll_\varepsilon X^{p-1} \sum_{j\geq 0} 2^{j(2-p+\varepsilon)} \ll X^{p-1}.$$
This is Lemma A.1.

Parseval's identity provides a crude bound which shows that (A.1) holds if $\eta \ll (\log X)^{-1/\varepsilon}$. Indeed, since $|\nu| \leq \Lambda_{b,d}$ pointwise, one has
$$\|\widehat{\nu}\|_2^2 = \|\nu\|_2^2 = \sum_{x\in[X]} |\nu(x)|^2 \leq \|\Lambda_{b,d}\|_\infty \sum_{x\in[X]} \Lambda_{b,d}(x) \ll X\log X.$$
Moreover, it can be seen from the definition of the $\eta$-large spectrum that
$$(\eta X)^2 \operatorname{meas}\mathcal{R}_\eta \leq \int_{\mathcal{R}_\eta} |\widehat{\nu}(\alpha)|^2\,d\alpha \leq \int_{\mathbb{T}} |\widehat{\nu}(\alpha)|^2\,d\alpha.$$
Combining these two inequalities gives
$$\operatorname{meas}\mathcal{R}_\eta \ll \eta^{-2}(\log X)X^{-1}.$$
Thus, we may assume that
$$(\log X)^{-1/\varepsilon} \ll \eta \leq 1. \tag{A.2}$$
Let $\alpha_1,\dots,\alpha_R$ be $X^{-1}$-spaced points in $\mathcal{R}_\eta$. In order to prove (A.1) it suffices to show that
$$R \ll_\varepsilon \eta^{-2-\varepsilon}. \tag{A.3}$$

In the next step, we transfer the Fourier analysis from the unknown function $\nu$ to $\Lambda_{b,d}$. For this reason, we partition $\mathbb{T}$ into major and minor arcs, and show that the Fourier transform $\widehat{\Lambda}_{b,d}$ of the majorant function has a logarithmic saving on the minor arcs. Let $\alpha\in\mathbb{T}$ be a frequency. From the definition of $\Lambda_{b,d}$ we see that
$$\widehat{\Lambda}_{b,d}(\alpha) = \sum_{x\in[X]} \Lambda_{b,d}(x)\,e(\alpha x) = \sum_{x\in[X]} \frac{\phi(d)}{d}\,\Lambda(b+dx)\,e(\alpha x).$$
Let
$$\mathfrak{M}^* = \bigcup_{q\leq Q} \bigl\{\alpha\in\mathbb{T} : \|q\alpha\| \ll Q/X\bigr\},$$
and let the minor arcs $\mathfrak{m}^*$ be the complement of the major arcs, i.e. $\mathfrak{m}^* = \mathbb{T}\setminus\mathfrak{M}^*$, where $Q = (\log X)^C$ for some large number $C > 0$.

Lemma A.2 (Major-arc asymptotic for $\widehat{\Lambda}_{b,d}$).
Suppose that $\alpha = \frac{a}{q}+\beta\in\mathfrak{M}^*$ and $1\leq b\leq d\leq \exp\bigl(c'\sqrt{\log X}\bigr)$. Then
$$|\widehat{\Lambda}_{b,d}(\alpha)| \ll q^{-1+\varepsilon}X(1+X|\beta|)^{-1} + X\exp\bigl(-c\sqrt{\log X}\bigr)(1+X|\beta|).$$

Proof.

Dues to the proof of this lemma is conceptually the same as Lemma 3.2,we’d like to omit some details. If β = 0 and α = aq , b Λ b,d ( a/q ) = φ ( d ) d X r ( q )( b + rd,q )=1 e (cid:0) ar/q (cid:1)n ψ ( b + dX ; dq, b + rd ) − ψ ( b + d ; dq, b + rd ) o . On noticing that the discrete interval [ b + d ] contains at most one element whichis congruent to b + rd modulus dq , thereby, b Λ b,d ( a/q ) = φ ( d ) d X r ( q )( b + rd,q )=1 e (cid:0) ar/q (cid:1) ψ ( b + dX ; dq, b + rd ) + O ( q ) . It then follows from summation by parts and Lemma 2.2 that when α = aq + β b Λ b,d ( α ) = φ ( d ) φ ( dq ) X r ( q )( b + rd,q )=1 e (cid:0) ar/q (cid:1)(cid:26)Z X e ( βt ) d t − χ ( b + dr ) Z X ( b + dt ) ρ − e ( βt ) d t (cid:27) + O (cid:16) X exp (cid:0) − c p log X (cid:1) (1 + X | β | ) (cid:17) . PARSE FURSTENBERG-S ´ARK ¨OZY 29

We finish the proof by applying Lemma 2.2 and Lemma 3.1, along the same lines as the proof of Lemma 3.2. $\square$
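Returning to the reduction of Lemma 2.3 to Lemma A.1 at the start of this appendix, the interpolation bound $\int_{\mathbb{T}} |\widehat{1_{[X]}}|^p \leq \|\widehat{1_{[X]}}\|_\infty^{p-2}\,\|\widehat{1_{[X]}}\|_2^2 \leq X^{p-1}$ for $p > 2$ is easy to confirm numerically. The Python sketch below is an illustration under one assumption: the integral is replaced by an average over the grid $\{a/G\}$ with $G \geq X$, for which the discrete Parseval identity is exact. The function names are ours.

```python
import cmath


def interval_transform(alpha, X):
    """Fourier transform of the indicator of [X] = {1, ..., X} at alpha."""
    return sum(cmath.exp(2j * cmath.pi * alpha * x) for x in range(1, X + 1))


def lp_moment(X, p, grid):
    """Average of |transform|^p over the grid points a/grid (grid >= X assumed)."""
    return sum(abs(interval_transform(a / grid, X)) ** p
               for a in range(grid)) / grid
```

With $X = 16$ and a grid of size 64, the second moment equals $X$ up to rounding, and the moment with $p = 3.5$ stays below $X^{2.5}$, in line with the bound $X^{p-1}$.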

The minor arcs analysis is conceptually the same as Lemma 3.3. Again, we outline the main steps.

Lemma A.3 (Minor arcs estimates) . sup α ∈ m ∗ (cid:12)(cid:12)b Λ b,d ( α ) (cid:12)(cid:12) ≪ X (log X ) − C . Proof.

It is immediate from Fourier transform as well as the deﬁnition of Λ b,d that b Λ b,d ( α ) = φ ( d ) d X x ∈ [ X ] Λ( b + dx ) e ( xα ) . After changing variables b + dx x , we have b Λ b,d ( α ) = φ ( d ) d e (cid:16) − bαd (cid:17) X x ∈ [ b,b + dX ] x ≡ b (mod d ) Λ( x ) e (cid:16) αxd (cid:17) . On noting that b is quite small compared with X , to prove this lemma it suﬃcesto show that (cid:12)(cid:12)(cid:12)(cid:12) X x ∈ [ b + dX ] x ≡ b (mod d ) Λ( x ) e (cid:16) αxd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) ≪ X (log X ) − A (A.4)for all frequencies α ∈ m ∗ . And the proof is a modiﬁcation of Lemma 3.3 and[Gr05a, Lemma 4.9]. To begin with, it follows from [Gr05b, formula (3)] that X x ∈ [ b + dX ] x ≡ b (mod d ) Λ( x ) e (cid:16) αxd (cid:17) = X x Ux ≡ b (mod d ) Λ( x ) e (cid:16) αxd (cid:17) + X xy ∈ [ b + dX ] x Uxy ≡ b (mod d ) µ ( x )(log y ) e (cid:16) αxyd (cid:17) + X xy ∈ [ b + dX ] x U xy ≡ b (mod d ) f ( x ) e (cid:16) αxyd (cid:17) + X xy ∈ [ b + dX ] x,y>Uxy ≡ b (mod d ) µ ( x ) g ( y ) e (cid:16) αxyd (cid:17) , where f, g are arithmetic functions deﬁned in [Gr05b, formula (4-5)] satisfying k f k ∞ , k g k ∞ ≪ log X . Practically, above four terms are re-grouped and departedinto the following type I and type II sums, X xy ∈ [ b + dX ] x ∼ L,y > Txy ≡ b (mod d ) f ( x ) e (cid:16) αxyd (cid:17) and X xy ∈ [ b + dX ] x ∼ Lxy ≡ b (mod d ) f ( x ) g ( y ) e (cid:16) αxyd (cid:17) . Next, we turn our attention to the behavior of frequencies α in minor arcstemporarily. In view of Dirichlet approximate theorem, for any frequency α ∈ T , there always exists an integer q X/Q such that k qα k ≪ Q/X . Thus, when α belongs to minor arcs m ∗ , we must have Q q X/Q. (A.5)The proof of [Gr05b, Lemma 4] tells us that (A.4) follows from the following twoclaims and (A.5).

Claim 1.

Suppose that (cid:12)(cid:12) α − aq (cid:12)(cid:12) q with ( a, q ) = 1 and 1 a q . Suppose alsothat X > L , then we have X xy ∈ [ b + dX ] x ∼ L,y > Txy ≡ b (mod d ) f ( x ) e (cid:16) αxyd (cid:17) ≪ (log X )(log q ) k f k ∞ (cid:0) Xq + L + q (cid:1) . Claim 2.

Suppose that (cid:12)(cid:12) α − aq (cid:12)(cid:12) q with ( a, q ) = 1 and 1 a q . Suppose alsothat X > L , then we have X xy ∈ [ b + dX ] x ∼ Lxy ≡ b (mod d ) f ( x ) g ( y ) e (cid:16) αxyd (cid:17) ≪ k f ∞ k k g k ∞ X (log q ) (cid:16) q + LX + qX (cid:17) / . To prove Claim 1, we notice that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X xy ∈ [ b + dX ] x ∼ L,y > Txy ≡ b (mod d ) f ( x ) e (cid:16) αxyd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X x ∼ L ( x,d )=1 f ( x ) (cid:12)(cid:12)(cid:12)(cid:12) X T y b + dXx y ≡ bx − (mod d ) e (cid:16) αxyd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12) ≪ k f k ∞ X x ∼ L (cid:12)(cid:12)(cid:12) X Td y Xx e ( xyα ) (cid:12)(cid:12)(cid:12) ≪ k f k ∞ X x ∼ L min (cid:8) Xx , k xα k − (cid:9) . Thus, Claim 1 follows from [Gr05b, Lemma 2]. It remains to prove Claim 2. Onemay obtain from an application of Cauchy-Schwarz inequality that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X xy ∈ [ b + dX ] x ∼ Lxy ≡ b (mod d ) f ( x ) g ( y ) e (cid:16) αxyd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X x ∼ L f ( x ) X s ∼ L ( x,d )=1 X y ,y b + dXx y ,y ≡ bx − (mod d ) g ( y ) g ( y ) e (cid:16) αx ( y − y ) d (cid:17) . Changing variables and exchanging the order of summation yields that aboveexpression is bounded by L k f k ∞ k g k ∞ X y ,y XL (cid:12)(cid:12)(cid:12) X x ∼ L ( x,d )=1 e ( x ( y − y ) α ) (cid:12)(cid:12)(cid:12) . PARSE FURSTENBERG-S ´ARK ¨OZY 31

After square (both sides) and another application of Cauchy-Schwarz inequalitygives (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X xy ∈ [ b + dX ] x ∼ Lxy ≡ b (mod d ) f ( x ) g ( y ) e (cid:16) αxyd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≪ k f k ∞ k g k ∞ X X y ,y XL X x ∼ L ( x,d )=1 X t L e ( t ( y − y ) α ) ≪ k f k ∞ k g k ∞ X X j XL min (cid:8) L, k jα k − (cid:9) . One then deduces from [Gr05b, Lemma 1] that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X xy ∈ [ b + dX ] x ∼ Lxy ≡ b (mod d ) f ( x ) g ( y ) e (cid:16) αxyd (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≪ k f k ∞ k g k ∞ X (log q ) (cid:16) q + LX + qX (cid:17) . (cid:3) We are now left to verify (A.3). For 1 r R , let c r ∈ { z ∈ C : | z | = 1 } be anumber such that c r b ν ( α r ) = | b ν ( α r ) | . And let l x (1 x X ) ∈ C be a sequencesuch that | l x | l x Λ b,d ( x ) = ν ( x ). Routinely, as in Section 5, one has R η X (cid:18) X r ∈ [ R ] | b ν ( α r ) | (cid:19) (cid:12)(cid:12)(cid:12)(cid:12) X r ∈ [ R ] c r X x ∈ [ X ] l x Λ b,d ( x ) e ( xα r ) (cid:12)(cid:12)(cid:12)(cid:12) X X x ∈ [ X ] (cid:12)(cid:12)(cid:12)(cid:12) X r ∈ [ R ] c r Λ b,d ( x ) e ( xα r ) (cid:12)(cid:12)(cid:12)(cid:12) ≪ X X r,r ′ ∈ [ R ] (cid:12)(cid:12)(cid:12) X x ∈ [ X ] Λ b,d ( x ) e (cid:18) ( α r − α r ′ ) x (cid:19)(cid:12)(cid:12)(cid:12) , on noticing that P x ∈ [ X ] l x Λ b,d ( x ) X . It then follows from Fourier transformthat R η X ≪ X r,r ′ ∈ [ R ] (cid:12)(cid:12)b Λ b,d ( α r − α r ′ ) (cid:12)(cid:12) . Put γ = 1 + ε , utilizing H¨older’s inequality one then has R η γ X γ ≪ X r,r ′ ∈ [ R ] (cid:12)(cid:12)b Λ b,d ( α r − α r ′ ) (cid:12)(cid:12) γ . Joint Lemma A.2, Lemma A.3, (A.2) and putting e Q = η − − ε gives η γ R ≪ X q e Q X a ( q )( a,q )=1 X r,r ′ R q − γ + ε (1 + X k α r − α r ′ − a/q k ) − γ X r,r ′ R G ( α r − α r ′ ) , where G ( α ) = X q e Q X a ( q ) q − γ + ε F ( α − a/q )and F ( β ) = (1 + X | sin β | ) − γ . 
We are now in the position of [Bou89, Eq. (4.16)]; we only need to replace $N$ by $X$ and $\delta$ by $\eta$. Following Bourgain's proof, and noting that $\gamma=1+\varepsilon$, one then has $R\ll_{\varepsilon}\eta^{-2-\varepsilon}$.

References

[Apo] T. Apostol. Introduction to analytic number theory, Springer Science & Business Media (2013).
[BPPS] A. Balog, J. Pelikán, J. Pintz, E. Szemerédi. “Difference sets without k-th powers.” Acta Mathematica Hungarica, 65(2), 165–187 (1994).
[BM] T. Bloom, J. Maynard. “A new upper bound for sets with no square differences.” arXiv preprint arXiv:2011.13266 (2020).
[Bou89] J. Bourgain. “On Λ(p)-subsets of squares.” Israel Journal of Mathematics, 67(3), 291–311 (1989).
[BDG] J. Bourgain, C. Demeter, L. Guth. “Proof of the main conjecture in Vinogradov’s mean value theorem for degrees higher than three.” Annals of Mathematics, 633–682 (2016).
[Chow] S. Chow. “Roth-Waring-Goldbach.” International Mathematics Research Notices, (8), 2341–2374 (2018).
[Dav] H. Davenport. Multiplicative number theory (Vol. 74), Springer Science & Business Media (2013).
[Fur] H. Furstenberg. “Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions.” Journal d’Analyse Mathématique, 31(1), 204–256 (1977).
[Gr05a] B. Green. “Roth’s theorem in the primes.” Annals of Mathematics, 1609–1636 (2005).
[Gr05b] B. Green. “Some minor arcs estimates relevant to the paper ‘Roth’s theorem in the primes’.”
[GT06] B. Green, T. Tao. “Restriction theory of the Selberg sieve, with applications.” Journal de théorie des nombres de Bordeaux, 1(18), 147–182 (2006).
[GT08] B. Green, T. Tao. “The primes contain arbitrarily long arithmetic progressions.” Annals of Mathematics, 481–547 (2008).
[LP] H. Li, H. Pan. “Difference sets and polynomials of prime variables.” Acta Arithmetica, 138, 25–52 (2009).
[LZ] J. Liu, T. Zhan. New Developments in the Additive Theory of Prime Numbers, World Scientific (2011).
[MV] H. Montgomery, R. Vaughan. “The large sieve.” Mathematika, 20(2), 119–134 (1973).
[PSS] J. Pintz, W. L. Steiger, E. Szemerédi. “On sets of natural numbers whose difference set contains no squares.” Journal of the London Mathematical Society, 2(2), 219–231 (1988).
[Rice] A. Rice. “Sárközy’s theorem for P-intersective polynomials.” arXiv:1111.6559v5 [math.CA].
[Roth] K. Roth. “On certain sets of integers.” Journal of the London Mathematical Society, 1(1), 104–109 (1953).

[RS] I. Ruzsa, T. Sanders. “Difference sets and the primes.” Acta Arithmetica, 131, 281–301 (2008).
[S781] A. Sárközy. “On difference sets of sequences of integers. I.” Acta Mathematica Academiae Scientiarum Hungarica, 31(1-2), 125–149 (1978).
[S783] A. Sárközy. “On difference sets of sequences of integers. III.” Acta Mathematica Academiae Scientiarum Hungarica, 31(3-4), 355–386 (1978).
[Va] R. Vaughan. The Hardy-Littlewood method, Cambridge (2003).
[Wang] R. Wang. “On a theorem of Sárközy for difference sets and shifted primes.” Journal of Number Theory, 211, 220–234 (2020).
[Xyl] T. Xylouris. “On the least prime in an arithmetic progression and estimates for the zeros of Dirichlet L-functions.” Acta Arithmetica, 150(1), 65–91 (2011).

School of Mathematics, Shandong University, Jinan, 250100, China
