Sums of Powers in Large Finite Fields: A Mix of Methods

Abstract

Can any element in a sufficiently large finite field be represented as a sum of two $d$th powers in the field? In this article, we recount some of the history of this problem, touching on cyclotomy, Fermat's last theorem, and diagonal equations. Then, we offer two proofs, one new and elementary, and the other more classical, based on Fourier analysis and an application of a nontrivial estimate from the theory of finite fields. In context and juxtaposition, each will have its merits.



Vitaly Bergelson, Andrew Best, and Alex Iosevich

September 22, 2020

We denote by $\mathbb{F}_q$ the finite field with $q$ elements, where $q$ is a power of a prime. Consider the following problem, which has been attacked several times in different centuries, and just two of its extensions.

Problem 1. Fix an integer $d > 1$. Show that for any sufficiently large finite field $\mathbb{F}_q$, we have $\mathbb{F}_q = \{x^d + y^d : x, y \in \mathbb{F}_q\}$, or, in other words, every element of $\mathbb{F}_q$ is a sum of two $d$th powers.

Problem 2. Fix an integer $d > 1$. Show that for any sufficiently large finite field $\mathbb{F}_q$, we have for every $a, b \in \mathbb{F}_q^\times$ that $\mathbb{F}_q = \{ax^d + by^d : x, y \in \mathbb{F}_q\}$.

Problem 3. Fix integers $n > 0$ and $k_1, \dots, k_n > 0$. Given $b \in \mathbb{F}_q$ and coefficients $a_1, \dots, a_n \in \mathbb{F}_q^\times$, determine the number of solutions $(x_1, \dots, x_n) \in \mathbb{F}_q^n$ to the diagonal equation
$$a_1 x_1^{k_1} + \cdots + a_n x_n^{k_n} = b. \tag{1}$$

Remarks.

1. Looking at Problems 1 and 2 with fresh eyes, one might first observe that since $x^{q-1} = 1$ for all $x \in \mathbb{F}_q^\times$, it follows that $\{x^{q-1} + y^{q-1} : x, y \in \mathbb{F}_q\} = \{0, 1, 2\}$, which is usually not all of $\mathbb{F}_q$. Thus, one should expect to require $q > d + 1$, hence the "sufficiently large" in the formulation. As we will see, in the case $d = 2$, this requirement is unnecessary.

2. The progression in difficulty of the problems should be clear, with Problem 3 a significant step up. Indeed, when $n = 2$ and $k_1 = k_2 = d$, knowing merely that the number of solutions to equation (1) is positive for each choice of $a_1, a_2, b$ suffices to solve Problem 2. However, even the more modest jump from Problem 1 to Problem 2 can pose a challenge in that some approaches that work for the former seem unable to be easily upgraded to work for the latter.

3. According to Small, Kaplansky privately conjectured that the "outrageous" statement of Problem 1 holds [28]. We will see what role the article [28] occupies in the panoply of results we will survey shortly.

Mathematicians who have been caught on paper being interested in diagonal equations include Lagrange, H. M. Weber, Cauchy, Skolem, Gauss, V. A. Lebesgue, Dickson, Hurwitz, and Weil. In the process of proving his celebrated four-square theorem, Lagrange showed in [19] that, given any $b \in \mathbb{F}_p^\times$, we have $\mathbb{F}_p = \{x^2 + by^2 : x, y \in \mathbb{F}_p\}$. Though not the first to do so, Weber solved Problem 1 in the case $d = 2$:

Proposition 1 ([31, p. 309]). Every element in a finite field $\mathbb{F}_q$ is the sum of two squares.

Proof. Suppose $\mathbb{F}_q$ has characteristic 2, so that $q = 2^r$ for some positive integer $r$. Let $c \in \mathbb{F}_q$. Since $c^q = c$, it follows that $c = (c^{2^{r-1}})^2$, that is, $c$ is a square. Thus, the element $c = (c^{2^{r-1}})^2 + 0^2$ is a sum of two squares.

Now suppose $\mathbb{F}_q$ has characteristic $p > 2$, with $p$ prime. Let $c \in \mathbb{F}_q$. If $c$ is already a square, there is nothing to show, so suppose $c$ is a nonsquare. We now analyze two cases, depending on whether $-1$ is a square in $\mathbb{F}_q$.

On the one hand, suppose $-1$ is a square in $\mathbb{F}_q$, so there is some element $g \in \mathbb{F}_q$ such that $g^2 = -1$. Then, the clever identity
$$c = \left(c + \frac{1}{4}\right)^2 + \left(g\left(c - \frac{1}{4}\right)\right)^2$$
expresses $c$ as a sum of two squares. Note that when $p = 3$, the symbol 4 means 1.

On the other hand, suppose $-1 = p - 1$ is a nonsquare in $\mathbb{F}_q$ instead. We always know that 1 is a square in $\mathbb{F}_q$. Now let's consider the prime subfield $\mathbb{F}_p \subseteq \mathbb{F}_q$, which is generated by 1. Since 1 is a square and $p - 1$ is a nonsquare, there must exist consecutive elements $a$ and $a + 1$, both in $\mathbb{F}_p$, and both nonzero, such that $a$ is a square and $a + 1$ is a nonsquare. Since $a$ is a square, there is a $g \in \mathbb{F}_q$ such that $g^2 = a$. Since $c$ is a nonsquare and so is $a + 1$, it follows that $\frac{c}{a+1}$ is a square (the product of two nonsquares is a square), so write $\frac{c}{a+1} = h^2$ for some $h \in \mathbb{F}_q$. We conclude that
$$c = \frac{c}{a+1}(a + 1) = h^2(a + 1) = h^2 a + h^2 = (hg)^2 + h^2,$$
which shows that $c$ is a sum of two squares.

Weber's proof amounts to cleverly producing, for each $c \in \mathbb{F}_q$, a solution to the diagonal equation $x^2 + y^2 = c$. It can be upgraded as follows to a solution to Problem 2 in the case $d = 2$:

Proposition 2.

For any finite field $\mathbb{F}_q$, we have for every $a, b \in \mathbb{F}_q^\times$ that $\mathbb{F}_q = \{ax^2 + by^2 : x, y \in \mathbb{F}_q\}$.

Proof. There are two cases. First, suppose that exactly one of $a$ and $b$ is a square, say $a$. If $c$ is a square, then $\frac{c}{a}$ is a square, and hence $ax^2 = c$ has a solution in $\mathbb{F}_q$, which implies that $c \in \{ax^2 + by^2 : x, y \in \mathbb{F}_q\}$, and if $c$ is a nonsquare, then similarly $bx^2 = c$ has a solution in $\mathbb{F}_q$, which implies the same. Second, suppose $a$ and $b$ are both squares or both nonsquares, so that $\frac{a}{b}$ is a square. Let $g \in \mathbb{F}_q^\times$ satisfy $g^2 = \frac{a}{b}$. By Proposition 1, there exist $x_0, y_0 \in \mathbb{F}_q$ such that $\frac{c}{a} = x_0^2 + y_0^2$. Set $y_1 = y_0 g$. Then
$$c = ax_0^2 + ay_0^2 = ax_0^2 + a\left(\frac{y_1}{g}\right)^2 = ax_0^2 + by_1^2.$$

There is a more straightforward argument than Weber's. Long before Weber's textbook was published, Cauchy [1] had solved Problem 2 when $d = 2$:

Second Proof of Proposition 2. If $\mathbb{F}_q$ has characteristic 2, then, as observed before, every element is already a square, which quickly finishes the argument. Otherwise, fix $a, b \in \mathbb{F}_q^\times$ and let $c \in \mathbb{F}_q$. Since there are $\frac{q+1}{2}$ squares in $\mathbb{F}_q$ and the cardinality of a subset of a finite field is invariant under multiplication by a fixed nonzero element and under addition of a fixed element, the sets $\{ax^2 : x \in \mathbb{F}_q\}$ and $\{c - by^2 : y \in \mathbb{F}_q\}$ both have cardinality $\frac{q+1}{2}$, i.e., each occupies slightly more than half of the field $\mathbb{F}_q$. Thus these two sets have at least one common element $z$, which satisfies $z = ax^2 = c - by^2$ for some $x, y \in \mathbb{F}_q$. Hence $c = ax^2 + by^2$.

Footnote (on H. M. Weber): Be careful with the initials. A contemporary of both the sociologist M. Weber and the physicist H. F. Weber, the latter of whose lectures' lack of Maxwell's equations spurred Einstein to spurn his mentorship, the mathematician H. M. Weber worked on algebra, analysis, and number theory, incorporating many of his results into his well-regarded textbook Lehrbuch der Algebra. According to Schapper in [22], Weber and Dedekind "[took] a decisive step towards the creation of modern algebraic geometry" with their publication of [3].

Footnote (on V. A. Lebesgue): Not the famous analyst!

Footnote (on Dickson): A prolific author, L. E. Dickson wrote (at least) two important books, namely the first comprehensive book on finite fields and the three-volume History of the Theory of Numbers. In a book review, he wrote, "Fricke's Algebra is a worthy successor to Weber's Algebra, which it henceforth displaces" [7].

Footnote (on Hurwitz): Here we write "Hurwitz" to mean the well-known A. Hurwitz, who collaborated with his generally overshadowed older brother J. Hurwitz on complex continued fractions.

Footnote (on the proof of Proposition 1): A short argument establishes that the product of two nonsquares is a square. The map $\varphi : \mathbb{F}_q^\times \to \mathbb{F}_q^\times$ given by $\varphi(x) = x^2$ is a group homomorphism with kernel $\ker(\varphi) = \{1, -1\}$, so $H := \operatorname{im}(\varphi)$ is an index 2 subgroup of $\mathbb{F}_q^\times$. In other words, $\mathbb{F}_q^\times / H$ is the group of two elements. As $c$ and $\frac{1}{a+1}$ are nonsquares, we have $cH = \frac{1}{a+1}H$, hence $\frac{c}{a+1}H = (cH)\left(\frac{1}{a+1}H\right) = (cH)^2 = H$, hence $\frac{c}{a+1} \in H$.

Remark.

Cauchy's argument is better known than Weber's, in part because it has been rediscovered more frequently. See, e.g., [6, Lemma 1], where it appears without citation, perhaps because it was well known by then.

Cauchy's argument cannot be generalized to solve Problem 2 in general, not even when $d = 3$. For example, in $\mathbb{F}_7 = \{0, 1, 2, 3, 4, 5, 6\}$, the set of cubes is $\{x^3 : x \in \mathbb{F}_7\} = \{0, 1, 6\}$, which has less than half the cardinality of the field. Thus the pigeonhole principle used in the proof does not apply, as we cannot hope to intersect two modified sets of cubes. While we are here, we also observe that $\{x^3 + y^3 : x, y \in \mathbb{F}_7\} = \{0, 1, 2, 5, 6\} \neq \mathbb{F}_7$, so the failure of the argument is of course caused by $\mathbb{F}_7$ itself and not by a lack of ingenuity in modifying the technique in some way.

It turns out that, except when $q = 4$ and $q = 7$, every element of $\mathbb{F}_q$ can be written as a sum of two cubes, which solves Problem 1 in the case $d = 3$. This precise result, proved in an elementary way in the twentieth century, is due to Skolem [27] when $q$ is prime and to Singh [26] in general. Skolem and many others addressed Problem 2 in the case $d = 3$; see [20, pp. 325–326] for sources.

Diverting our attention from successes of elementary methods for the moment, we return to the nineteenth century. Initiating the study of cyclotomy, a web of problems that all involve roots of unity, Gauss, according to Weil, "obtain[ed] the numbers of solutions for all congruences $ax^3 - by^3 \equiv 1 \pmod{p}$" for primes $p$ with $p \equiv 1 \pmod{3}$ […] $q$ prime, and so on. It was Kummer who more securely connected cyclotomy to Gauss sums [16, 17, 18], paving the way for further advancements in the theory of diagonal equations.

We enter the twentieth century again, this time wearing our cyclotomy goggles. From this viewpoint, Dickson was quite important. The following theorem of his, from two of dozens of his cyclotomy papers, illustrates another historical reason for interest in diagonal equations:

Theorem 1 ([4, 5]). Fix an odd prime $e$. Then for all sufficiently large primes $p$, the equation
$$x^e + y^e + z^e = 0 \pmod{p} \tag{2}$$
has a solution $(x, y, z) \in (\mathbb{F}_p^\times)^3$.

Remark.

This theorem's raison d'être was that it nullified an approach to proving the case of Fermat's last theorem with prime exponents. If the conclusion of Theorem 1 were false for, say, $e = 3$, then there would exist a sequence of primes $(p_n)$ tending to infinity for which the only solutions $(x, y, z) \in \mathbb{F}_{p_n}^3$ to equation (2) with $e = 3$ would have at least one of $x, y, z$ divisible by $p_n$. It follows that if we had a solution in nonzero integers to the equation $x^3 + y^3 + z^3 = 0$, taking this equation modulo a large enough $p_n$ would yield a contradiction. Perhaps surprisingly, this observation appears in [11], but not [4] or [5].

As stated, Theorem 1 is a partial solution to Problem 3 but not to Problem 2 or even Problem 1. Hurwitz obtained an improvement which pertains to Problem 2:

Theorem 2 ([11]). Fix an odd prime $e$ and nonzero integers $a, b, c$. Then for all sufficiently large primes $p$, the equation
$$ax^e + by^e + cz^e = 0 \pmod{p} \tag{3}$$
has a solution $(x, y, z) \in (\mathbb{F}_p^\times)^3$.

Footnote (on Skolem): Unlike these other people, who did not help found finitism, Skolem was more keen on mathematical logic. Several of his results, including this one, were independently rediscovered due to the general disconnection in research networks that prevailed in the years before the internet.

Footnote (on cyclotomy): See [30] for a treatment of cyclotomy.

Remark. If $p$ is a prime such that $(x, y, z) \in (\mathbb{F}_p^\times)^3$ is a solution to equation (3), then after subtracting $cz^e$ from both sides and dividing by $z^e$, we have
$$a\left(\frac{x}{z}\right)^e + b\left(\frac{y}{z}\right)^e = -c,$$
which expresses an arbitrary nonzero element $-c$ of $\mathbb{F}_p$ as a member of the set $\{ax^e + by^e : x, y \in \mathbb{F}_p\}$. Thus Theorem 2 solves Problem 2 when $q$ is prime and $d$ is an odd prime. We will make use of this division trick again.

Having explained some partial solutions to Problem 3, and their connection to Problems 1 and 2 in the case $q$ prime, it is reasonable to say that everything has been set in motion. Within a few decades, some solutions to Problem 3 emerged. In this article we are not interested in the exact details of these solutions, but for completeness's sake, we mention that, broadly speaking, the approaches go either by Gauss sums, or by Jacobi sums, or by elementary methods.

Weil summarized historical progress on Problem 3 up to 1949 in [33], in the process simplifying and advancing some earlier work by Hasse and Davenport in [2]. It should be mentioned that the new results from [33] were discovered during the proofing process to be, in Weil's words, "substantially identical" to those of Hua and Vandiver in [9]. What appears in [33] and [9] has been made more elementary in several presentations, such as [12], and can now be readily studied therefrom.

Small explicitly observed in the short article [28] the fact that a certain estimate on the number of solutions to diagonal equations yields a solution to Problem 1. For the sake of review, we now reproduce the one theorem from [28], with minor changes in notation for consistency:

Theorem 3 ([28]). Let $d$ be a positive integer, let $\mathbb{F}_q$ be a finite field, and put $\delta = \gcd(q - 1, d)$. Assume $q > (\delta - 1)^4$. Then every element of $\mathbb{F}_q$ is a sum of two $d$th powers. (In particular, the conclusion holds if $q > (d - 1)^4$, since $d \geq \delta$.)

Proof. For $b \in \mathbb{F}_q$ let $N(b)$ denote the number of solutions $(x, y) \in \mathbb{F}_q \times \mathbb{F}_q$ of $x^d + y^d = b$. Then, by definition, $N(b) \geq 0$; we have to show that $q > (\delta - 1)^4$ implies $N(b) > 0$. We may assume $b \neq 0$, since 0 is certainly a sum of two $d$th powers. Then, by [13, Corollary 1, p. 57], we have
$$|N(b) - q| \leq (\delta - 1)^2 \sqrt{q}.$$
In particular, $N(b) - q \geq -(\delta - 1)^2 \sqrt{q}$, so that $N(b) \geq \sqrt{q}\,(\sqrt{q} - (\delta - 1)^2)$. Hence $N(b) > 0$, for all $b$, provided $\sqrt{q} > (\delta - 1)^2$, or in other words $q > (\delta - 1)^4$.

Remark. If $d >$ […] [the largest] $q$ for which not every element of $\mathbb{F}_q$ is the sum of two $d$th powers. By Theorem 3, this value must be less than $(d - 1)^4 + 1$. This problem might be interesting to pursue, in part because it is related to Waring's problem over finite fields.

Small cites Jean-René Joly's self-contained survey [13] about equations and algebraic varieties over finite fields. In the chapter endnotes, Joly states that the particular result Small would later quote is due independently to Davenport and Hasse in a 1934 paper [2] and to Hua and Vandiver in 1949 papers [9] and [10]. But one of the Hua–Vandiver papers [10] cites [2] regarding the result in question! Anyway, as for the result Small uses, it follows from the development of Gauss and Jacobi sums over a few chapters in Joly.

Remark.

We remark that although Small recorded a solution to Problem 1, the inequality he quoted is general enough that he had essentially recorded a solution to Problem 2 as well.

In the remaining two sections, we share two other ways to attack Problem 2. The first is both new and elementary in the sense that it does not depend on Gauss sums, Jacobi sums, or hard counting of solutions to diagonal equations. As we will see, it is a soft averaging argument (what we are averaging will become apparent), and its main tool is the Cauchy–Schwarz inequality. The second way, which is not new, could accidentally be considered elementary if the reader blinks at the right moment. To be only a little more precise, we will reframe the situation using Fourier analysis, reducing the problem to a state where a single bolt of lightning from the Riemann hypothesis over finite fields creates a piece of fulgurite to add to one's […]

Footnote (on the approaches to Problem 3): Each of these methods, including a combined treatment of the elementary approaches by Stepanov [29] and Schmidt [24], is treated in, e.g., [20].

Footnote (on Hua and Vandiver): It is possible that Hua and Vandiver learned about the Davenport–Hasse result after the publication of [9] in 1949 and before the publication of [10]... in 1949. What a year 1949 was!

Footnote (on Joly's survey): The treatments in Joly [13] and Ireland–Rosen [12] are roughly equally accessible. After reading this paragraph and the one about Weil, the reader hopefully has some feeling of a frenzy, whether synchronous or asynchronous, around this topic.

Footnote (on fulgurite): Fulgurite is a general term for a mineral-like clump of dirt that can form where lightning strikes the ground.
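The concrete claims surveyed above (the cubes in $\mathbb{F}_7$, the exceptional role of $q = 7$ among prime fields for $d = 3$, and the threshold $q > (\delta - 1)^4$ of Theorem 3) can be spot-checked by brute force. The following Python sketch is not part of the original article; it handles prime fields only, where arithmetic is ordinary arithmetic modulo $p$, and the function names `dth_powers`, `power_sums`, `exceptional_primes`, and `small_threshold` are illustrative choices.

```python
from math import gcd

def is_prime(n):
    """Trial-division primality test; adequate for the small n used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def dth_powers(p, d):
    """The set {x^d : x in F_p} for a prime p."""
    return {pow(x, d, p) for x in range(p)}

def power_sums(p, d):
    """The set {x^d + y^d : x, y in F_p} for a prime p."""
    powers = dth_powers(p, d)
    return {(u + v) % p for u in powers for v in powers}

def exceptional_primes(d, limit):
    """Primes p < limit for which some element of F_p is not a sum of two d-th powers."""
    return [p for p in range(2, limit)
            if is_prime(p) and len(power_sums(p, d)) < p]

def small_threshold(p, d):
    """Sufficient condition of Theorem 3: q > (delta - 1)^4 with delta = gcd(q - 1, d)."""
    delta = gcd(p - 1, d)
    return p > (delta - 1) ** 4

print(sorted(dth_powers(7, 3)))     # cubes in F_7
print(sorted(power_sums(7, 3)))     # sums of two cubes in F_7
print(exceptional_primes(3, 200))   # prime fields below 200 where two cubes fall short
```

A search of this kind only probes prime fields up to a chosen limit; it illustrates the statements rather than proving anything, and it says nothing about nonprime fields such as $\mathbb{F}_4$.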

From this section onward, we let $\mu_q$ denote the counting measure on $\mathbb{F}_q$, normalized to be a probability measure; thus, for any subset $A$ of $\mathbb{F}_q$, $\mu_q(A)$ equals $|A|/q$, the cardinality of $A$ divided by $q$. In context, $q$ will be fixed, so we will suppress the subscript and just write $\mu$. Besides measuring sets, we will also shift them: If $y$ lives in $\mathbb{F}_q$, denote by $A + y$ the set $A + y := \{x + y \in \mathbb{F}_q : x \in A\}$. We need a basic lemma.

Lemma 1. Fix positive integers $d$ and $q$, and let $\delta = \gcd(d, q - 1)$. Then $\{x^\delta : x \in \mathbb{F}_q\} = \{x^d : x \in \mathbb{F}_q\}$.

Proof. By Bézout's lemma, there exist integers $r$ and $s$ such that $rd + s(q - 1) = \delta$. Hence, for all $x \in \mathbb{F}_q^\times$, we have
$$x^\delta = (x^r)^d (x^s)^{q-1} = (x^r)^d, \tag{4}$$
since $y^{q-1} = 1$ for all $y \in \mathbb{F}_q^\times$. But equation (4), together with $0^\delta = 0 = 0^d$, implies the set containment $\{x^\delta : x \in \mathbb{F}_q\} \subseteq \{x^d : x \in \mathbb{F}_q\}$. The reverse set containment holds since $x^d = (x^{d/\delta})^\delta$ for all $x \in \mathbb{F}_q$.

Now, as promised in the introduction, we provide here an elementary proof of the following:

Theorem 4.

Fix an integer $d > 1$. Then, for every sufficiently large finite field $\mathbb{F}_q$ with characteristic larger than $d$ and every $a, b \in \mathbb{F}_q^\times$, we have $\mathbb{F}_q = \{ax^d + by^d : x, y \in \mathbb{F}_q\}$.

Remark. If we restrict to the special case where $q$ is prime, we can remove the assumption that $\operatorname{char}(\mathbb{F}_q) > d$. Indeed, let $q = p$ be prime, let $d > 1$, and set $\delta = \gcd(d, p - 1)$. By Lemma 1, $\{ax^d + by^d : x, y \in \mathbb{F}_p\} = \{ax^\delta + by^\delta : x, y \in \mathbb{F}_p\}$ for all $a, b \in \mathbb{F}_p^\times$. The result follows from the theorem since $\delta < p = \operatorname{char}(\mathbb{F}_p)$.

To prove Theorem 4, we will first need to state two lemmas. We prove the easier one immediately and defer the proof of the other one to the end of the section.

Lemma 2.

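Fix a finite field $\mathbb{F}_q$ and an integer $d > 1$, and let $A = \{x^d : x \in \mathbb{F}_q\}$. Then $\mu(A) > \frac{1}{d}$.

Proof. The map $\varphi : \mathbb{F}_q^\times \to \mathbb{F}_q^\times$ given by $\varphi(x) = x^d$ is a group homomorphism. Since the equation $x^d = 1$ has at most $d$ distinct solutions in $\mathbb{F}_q$, it follows that $|\ker(\varphi)| \leq d$. Thus
$$|A| - 1 = |\operatorname{im}(\varphi)| = \frac{|\mathbb{F}_q^\times|}{|\ker(\varphi)|} = \frac{q - 1}{|\ker(\varphi)|} \geq \frac{q - 1}{d},$$
which implies that $|A| \geq 1 + \frac{q-1}{d} > \frac{1}{d} + \frac{q-1}{d} = \frac{q}{d}$. Thus $\mu(A) = \frac{|A|}{q} > \frac{1}{d}$.

Lemma 3.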

There is a positive function $E(q, d) : \mathbb{N}^2 \to \mathbb{R}$ with these properties:

1. For any $q$, any subsets $A_q, B_q \subseteq \mathbb{F}_q$, and any polynomial $P \in \mathbb{F}_q[x]$ with degree $d$ satisfying $1 < d < \operatorname{char}(\mathbb{F}_q)$, we have
$$\left| \frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A_q \cap (B_q + P(g))) - \mu(A_q)\mu(B_q) \right| \leq E(q, d). \tag{5}$$

2. For any fixed $d$, we have $\lim_{q \to \infty} E(q, d) = 0$.

Remarks.

1. In the first statement, we suppose $d > 1$ […] $d = 1$. In fact, for polynomials $P$ of degree 1, the left-hand side of (5) is always 0. On the other hand, it is not possible to remove the assumption that $d < \operatorname{char}(\mathbb{F}_q)$. This is because for certain choices of $P$, for example $P(x) = x^{\operatorname{char}(\mathbb{F}_q)} + x$, the supremum over $A_q$ and $B_q$ of the left-hand side of (5) is bounded away from 0, independently of $q$.

2. This lemma asserts that, eventually, the quantities $\mu(A_q \cap (B_q + P(g)))$ are of size $\mu(A_q)\mu(B_q)$ on average. We will see in the proof of Theorem 4 that, when concrete choices of $A_q$ and $B_q$ are made, this currently vague statement will suddenly reveal useful information, namely the nontrivial intersection of $A_q$ and $B_q + P(g)$ for some $g \neq 0$.

3. Our choice of $E(q, d)$ will not be optimal in any reasonable sense.

Proof of Theorem 4.

We need only to show that $\mathbb{F}_q \subseteq \{ax^d + by^d : x, y \in \mathbb{F}_q\}$, as the reverse inclusion is trivial. Let $A_q = \{ax^d : x \in \mathbb{F}_q\}$ and $B_q = \{-by^d : y \in \mathbb{F}_q\}$. Fix $c \in \mathbb{F}_q^\times$, and let $P(x) = cx^d$. Let $E(q, d)$ be the function from Lemma 3.

Claim. For sufficiently large $q$, we have
$$\mu(A_q)\mu(B_q) - \frac{1}{q}\mu(A_q \cap B_q) > E(q, d).$$

Indeed, by Lemma 2, $\mu(A_q) > \frac{1}{d}$ and $\mu(B_q) > \frac{1}{d}$. Since $\lim_{q \to \infty} E(q, d) = 0$ and $\frac{1}{q}\mu(A_q \cap B_q) \leq \frac{1}{q}$, it follows that for sufficiently large $q$ we have
$$E(q, d) + \frac{1}{q}\mu(A_q \cap B_q) < \frac{1}{d^2} < \mu(A_q)\mu(B_q),$$
whence the claim.

Now, define the closely related quantities
$$S = \frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A_q \cap (B_q + P(g))) \quad \text{and} \quad T = \frac{1}{q} \sum_{g \in \mathbb{F}_q^\times} \mu(A_q \cap (B_q + P(g))).$$
We note that $S = T + \frac{1}{q}\mu(A_q \cap B_q)$, which follows by adding the missing $g = 0$ term to $T$.

For sufficiently large $q$, Lemma 3 implies that
$$S - \mu(A_q)\mu(B_q) \geq -E(q, d). \tag{6}$$
Thus, for sufficiently large $q$, it follows by the claim and (6) that
$$T = S - \frac{1}{q}\mu(A_q \cap B_q) = \Big(S - \mu(A_q)\mu(B_q)\Big) + \Big(\mu(A_q)\mu(B_q) - \frac{1}{q}\mu(A_q \cap B_q)\Big) > -E(q, d) + E(q, d) = 0.$$

Since $T > 0$ and $T$ is a sum of nonnegative numbers, at least one summand is positive. Thus, there is an element $g \in \mathbb{F}_q^\times$ such that $\mu(A_q \cap (B_q + P(g))) > 0$. Having positive measure, the set $A_q \cap (B_q + P(g))$ must therefore be nonempty; hence there exist $x_1, x_2 \in \mathbb{F}_q$ such that $ax_1^d = -bx_2^d + cg^d$. Since $g \neq 0$, we can rearrange this equation to yield
$$c = a\left(\frac{x_1}{g}\right)^d + b\left(\frac{x_2}{g}\right)^d,$$
which shows that $c \in \{ax^d + by^d : x, y \in \mathbb{F}_q\}$. But $c \in \mathbb{F}_q^\times$ was arbitrary, and of course $0 = a \cdot 0^d + b \cdot 0^d$; hence we are done.

To prove the lemma that has done all the heavy lifting for us, we need some elementary facts about inner products. Fix a finite field $\mathbb{F}_q$. Define the inner product
$$\langle f_1, f_2 \rangle := \int_{\mathbb{F}_q} f_1 \overline{f_2} \, d\mu = \frac{1}{q} \sum_{g \in \mathbb{F}_q} f_1(g) \overline{f_2(g)}$$
for functions $f_1, f_2 : \mathbb{F}_q \to \mathbb{C}$ and write the corresponding norm $\|f\| := \sqrt{\langle f, f \rangle}$. In this notation, the Cauchy–Schwarz inequality has a simple form:
$$|\langle f_1, f_2 \rangle| \leq \|f_1\| \cdot \|f_2\|. \tag{7}$$
Indeed, for $a_1, \dots, a_q, b_1, \dots, b_q \in \mathbb{C}$, the Cauchy–Schwarz inequality states that
$$\left| \sum_{i=1}^{q} a_i b_i \right|^2 \leq \sum_{j=1}^{q} |a_j|^2 \sum_{k=1}^{q} |b_k|^2. \tag{8}$$
Setting $a_j = f_1(j)$ and $b_k = \overline{f_2(k)}$ and dividing both sides by $q^2$, we obtain
$$\left| \frac{1}{q} \sum_{i=1}^{q} f_1(i) \overline{f_2(i)} \right|^2 \leq \left( \frac{1}{q} \sum_{j=1}^{q} |f_1(j)|^2 \right) \left( \frac{1}{q} \sum_{k=1}^{q} |f_2(k)|^2 \right),$$
which is written compactly as $|\langle f_1, f_2 \rangle|^2 \leq \langle f_1, f_1 \rangle \langle f_2, f_2 \rangle$, whence (7).

All the functions we will take inner products with will be real-valued, so we will treat this inner product as a pleasantly linear tool. In particular, we will pull finite sums in and out, without warning.

Footnote (on the division by $g$ in the proof of Theorem 4): We have encountered this trick before. In the introduction, it appeared in the remark that explains how Theorem 2 solves Problem 2 when $q$ is prime.

Footnote (on the indexing in (8)): We are effectively enumerating the elements of $\mathbb{F}_q$ as $g_1, \dots, g_q$ and writing $f(j)$ instead of $f(g_j)$. This abuse of notation is committed only to clarify how one form of Cauchy–Schwarz follows from another.

Proof of Lemma 3. Fix $q$ and subsets $A, B \subseteq \mathbb{F}_q$. In the proof of Theorem 4 we only used polynomials of the form $P(x) = cx^d$ for $c \in \mathbb{F}_q$ nonzero, so we will stick with these for illustration.

If $1_A$ is the characteristic function of $A$, let $a_g$ be the function $\mathbb{F}_q \to \mathbb{C}$ defined by $x \mapsto 1_A(x + g) - \mu(A)$. We make two observations about these $a_g$'s, both involving shifts, and both justifying the utility of this weird function. First, after a change of variable $x \mapsto x - g'$, we see
$$\langle a_g, a_{g'} \rangle = \int_{\mathbb{F}_q} \big(1_A(x + g) - \mu(A)\big)\big(1_A(x + g') - \mu(A)\big) \, d\mu(x) = \int_{\mathbb{F}_q} \big(1_A(x + g - g') - \mu(A)\big)\big(1_A(x) - \mu(A)\big) \, d\mu(x) = \langle a_{g - g'}, a_0 \rangle. \tag{9}$$
Second, we see that
$$\langle a_{P(g)}, 1_B \rangle = \int_{\mathbb{F}_q} \big(1_A(x + P(g)) - \mu(A)\big) 1_B(x) \, d\mu(x)$$
$$= \int_{\mathbb{F}_q} \big(1_{A - P(g)}(x) - \mu(A)\big) 1_B(x) \, d\mu(x) \tag{10}$$
$$= \int_{\mathbb{F}_q} 1_{A - P(g)}(x) 1_B(x) - \mu(A) 1_B(x) \, d\mu(x)$$
$$= \int_{\mathbb{F}_q} 1_{(A - P(g)) \cap B}(x) - \mu(A) 1_B(x) \, d\mu(x) \tag{11}$$
$$= \mu((A - P(g)) \cap B) - \mu(A)\mu(B),$$
so that
$$\langle a_{P(g)}, 1_B \rangle = \mu(A \cap (B + P(g))) - \mu(A)\mu(B). \tag{12}$$
With the help of (12), we begin.
By Cauchy–Schwarz, we have
$$\left| \frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A \cap (B + P(g))) - \mu(A)\mu(B) \right| = \left| \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)}, 1_B \right\rangle \right| \leq \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\| \cdot \|1_B\| \leq \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\|, \tag{13}$$
where we used the fact that $\|1_B\| = \sqrt{\langle 1_B, 1_B \rangle} = \sqrt{\mu(B)} \leq 1$.

If $h \in \mathbb{F}_q$ is fixed, the polynomial $P(x + h) - P(x) = c(x + h)^d - cx^d$ has degree at most $d - 1$ in $x$. Usually the degree will be $d - 1$, but it can be smaller if $h = 0$ or if we did not suppose the characteristic of $\mathbb{F}_q$ is larger than $d$. This differencing procedure allows us to reduce the degree of a polynomial whenever we can introduce this difference, which we will do aggressively. If we are given parameters $h_1, h_2, \dots, h_{d-1} \in \mathbb{F}_q$, let
$$P(x; h_1) := P(x + h_1) - P(x),$$
$$P(x; h_1, h_2) := P(x + h_2; h_1) - P(x; h_1),$$
$$\vdots$$
$$P(x; h_1, h_2, \dots, h_{d-1}) := P(x + h_{d-1}; h_1, h_2, \dots, h_{d-2}) - P(x; h_1, h_2, \dots, h_{d-2})$$
be, respectively, the degree at most $d - 1$, degree at most $d - 2$, ..., and degree at most 1 polynomials in $x$ obtained by consecutively differencing $P$ by $h_1$, then the result by $h_2$, and so on down to a (usually) linear polynomial.

Now we return to bound the norm in (13). This is a key step. Observe, after using the first fact (9) about the $a_g$'s, reindexing a sum to introduce differencing, and applying Cauchy–Schwarz, that
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\|^2 = \left\langle \frac{1}{q} \sum_{g' \in \mathbb{F}_q} a_{P(g')}, \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\rangle \quad \text{(by definition)}$$
$$= \frac{1}{q} \sum_{g' \in \mathbb{F}_q} \frac{1}{q} \sum_{g \in \mathbb{F}_q} \langle a_{P(g')}, a_{P(g)} \rangle$$
$$= \frac{1}{q} \sum_{g' \in \mathbb{F}_q} \frac{1}{q} \sum_{g \in \mathbb{F}_q} \langle a_{P(g') - P(g)}, a_0 \rangle \quad \text{(by (9))}$$
$$= \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \frac{1}{q} \sum_{g \in \mathbb{F}_q} \langle a_{P(g + h_1) - P(g)}, a_0 \rangle$$
$$= \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1)}, a_0 \right\rangle$$
$$\leq \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1)} \right\|. \tag{14}$$
Note that the last step follows since $\|a_0\| = \|1_A - \mu(A)\| = \sqrt{\mu(A) - \mu(A)^2} \leq \frac{1}{2}$.

Footnote (on (10) and (11)): Equation (10) holds since $x + y \in A$ if and only if $x \in A - y$, and (11) since $1_A(x) 1_B(x) = 1_{A \cap B}(x)$.

Footnote (on differencing): Arguably the most notable use of this idea is in [34], wherein Weyl shows that if $Q(x)$ is a real polynomial with at least one irrational coefficient, then $\{Q(n) : n \in \mathbb{N}\}$ is uniformly distributed modulo 1.

This argument in (14) has reduced the degree of $P(g)$ by (at least) one. Encouraged, we prepare to iterate. For each $h_1$, making the displayed argument with $a_{P(g; h_1)}$ replacing $a_{P(g)}$ on the left-hand side, we get a bound on $\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1)} \right\|^2$ in terms of a new parameter $h_2$, which plays the same role as $h_1$ in the displayed argument.
This yields
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1)} \right\| \leq \sqrt{ \frac{1}{q} \sum_{h_2 \in \mathbb{F}_q} \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2)} \right\| }. \tag{15}$$
Applying (15) to the norm in (14), we find
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\|^2 \leq \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \sqrt{ \frac{1}{q} \sum_{h_2 \in \mathbb{F}_q} \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2)} \right\| },$$
where $P(g; h_1, h_2) = P(g + h_2; h_1) - P(g; h_1)$ is a polynomial in $g$ of degree at most $d - 2$ for any $h_1, h_2 \in \mathbb{F}_q$. Proceeding recursively in this way, we see that
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\|^2 \leq \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \sqrt{ \frac{1}{q} \sum_{h_2 \in \mathbb{F}_q} \sqrt{ \cdots \sqrt{ \frac{1}{q} \sum_{h_{d-2} \in \mathbb{F}_q} \left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2, \dots, h_{d-2})} \right\| } } }, \tag{16}$$
where $P(g; h_1, h_2, \dots, h_{d-2})$ is the polynomial in $g$ of degree at most 2 obtained by differencing. If we reduce the degree one more time, but without applying Cauchy–Schwarz, we will obtain an expression that we can finally bound. Indeed, by repeating the argument of (14), stopping just before the inequality, observe that for any $h_1, h_2, \dots, h_{d-2} \in \mathbb{F}_q$, we have
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, \dots, h_{d-2})} \right\|^2 = \frac{1}{q} \sum_{h_{d-1} \in \mathbb{F}_q} \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2, \dots, h_{d-1})}, a_0 \right\rangle. \tag{17}$$

Footnote (on the last step of (14)): Note that $\|a_0\| = \sqrt{\mu(A) - \mu(A)^2} \leq \frac{1}{2}$ since the function $x \mapsto x - x^2$ has maximum $\frac{1}{4}$ on $[0, 1]$.

By applying (17) to the innermost part of (16), we have
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\|^2 \leq \frac{1}{q} \sum_{h_1 \in \mathbb{F}_q} \sqrt{ \cdots \sqrt{ \frac{1}{q} \sum_{h_{d-1} \in \mathbb{F}_q} \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2, \dots, h_{d-1})}, a_0 \right\rangle } }. \tag{18}$$
After taking square roots on both sides of (18), we have an expression with $d - 1$ square roots. Using a consequence of the Cauchy–Schwarz inequality (the average of square roots is at most the square root of the average), we can move each of the square roots to the outside of the expression. This massaging of radical symbols is not necessary for the argument, but we hope the reader will appreciate the service. The result is that
$$\left\| \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g)} \right\| \leq \left( \frac{1}{q^{d-1}} \sum_{h_1, \dots, h_{d-1} \in \mathbb{F}_q} \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2, \dots, h_{d-1})}, a_0 \right\rangle \right)^{2^{-(d-1)}}. \tag{19}$$
Compare (19) and (13). To finish the proof of the lemma, the idea is that the inner product in (19) is zero most of the time, i.e., for enough choices of the parameters $h_i$, and if it's not zero, then it can be bounded trivially.

First, the popular case. Tracing what happens when one differences $P(g) = cg^d$ the whole $d - 1$ times, one finds that the leading coefficient of $P(g; h_1, \dots, h_{d-1})$, viewed as a linear polynomial in $g$, is $d! \, c \prod_{i=1}^{d-1} h_i$. Since $d < \operatorname{char}(\mathbb{F}_q)$ and $c \neq 0$, this coefficient is zero if and only if at least one $h_i = 0$. Thus, for fixed nonzero $h_1, h_2, \dots, h_{d-1} \in \mathbb{F}_q^\times$, this linear coefficient is invertible, which implies that the map $g \mapsto P(g; h_1, \dots, h_{d-1})$ is a permutation of $\mathbb{F}_q$. Using this permutation to reindex a sum, we see that
$$\left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_{P(g; h_1, h_2, \dots, h_{d-1})}, a_0 \right\rangle = \left\langle \frac{1}{q} \sum_{g \in \mathbb{F}_q} a_g, a_0 \right\rangle,$$
and a straightforward calculation shows that the function
$$\frac{1}{q} \sum_{g \in \mathbb{F}_q} a_g(x) = \frac{1}{q} \sum_{g \in \mathbb{F}_q} 1_A(x + g) - \mu(A)$$
is the zero function, so that the inner product of interest is 0.
Now, for the unpopular case, if any h i is 0,then P ( g ; h , . . . , h d − ) is a constant polynomial in g , and after analyzing the diﬀerencing process, one cansee that, in particular, the constant is 0. It follows that * q X g ∈ F q a P ( g ; h ,h ,...,h d − ) , a + = h a , a i = || a || ≤ . The ﬁnal step is to reckon the exact popularity of the two cases. Of the q d − choices of parameters( h , . . . , h d − ), exactly ( q − d − of them have no h i equal to zero. Thus, the unpopular case, compris-ing everything else, happens q d − − ( q − d − times in the sum in (19). It follows that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) q X g ∈ F q a P ( g ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:18) q d − − ( q − d − q d − (cid:19) − ( d − . Thus, take E ( q, d ) = (cid:16) q d − − ( q − d − q d − (cid:17) − ( d − . For ﬁxed d , lim q →∞ E ( q, d ) = 0.Recalling the remark below Theorem 4, we observe that Theorem 4 solves Problem 2 except for thosenonprime ﬁnite ﬁelds with characteristic drawn from the set of primes less than or equal to d . For context, Namely, if c , . . . , c q are positive numbers, then q P qi =1 √ c i ≤ q q P qi =1 c i . To see this, set a j = √ c j and b k = 1 in(8), divide both sides by q , then take the square root of both sides. For the interested reader, this inequality is ﬁrst appliedwith “ i = h d − ” and c i = q P h d − ∈ F q D q P g ∈ F q a P ( g ; h ,h ,...,h d − ) , a E , where we again only abuse notation when clarifyingsomething related to Cauchy–Schwarz. The ﬁrst approach was animated by an all-encompassing desire to reduce the degree of a polynomial until itis linear, because a linear polynomial behaves predictably with respect to the averages we have considered—compare the popular and unpopular cases above. 
The basic structure of the second approach is similar overall: an auxiliary lemma will do most of the work with the help of the same averaging argument. However, in contrast to the erstwhile insistence on differencing, the second approach tolerates polynomiality until it becomes a lightning rod. We will find under this electric influence that the second approach solves Problem 2 in full generality.
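As a numerical aside on the first approach, the following minimal sketch evaluates the error term obtained there, $E(q, d) = \left( (q^{d-1} - (q-1)^{d-1}) / q^{d-1} \right)^{2^{-(d-1)}}$, for a few values of $q$ and $d$. The function name `lemma3_error` is a hypothetical label chosen for this illustration; it does not come from the article.

```python
from fractions import Fraction


def lemma3_error(q: int, d: int) -> float:
    """Evaluate E(q, d) = ((q^(d-1) - (q-1)^(d-1)) / q^(d-1)) ** (2 ** -(d-1)).

    The ratio inside is the exact proportion of parameter tuples
    (h_1, ..., h_{d-1}) having at least one zero coordinate.
    """
    ratio = Fraction(q ** (d - 1) - (q - 1) ** (d - 1), q ** (d - 1))
    return float(ratio) ** (2.0 ** -(d - 1))


# For each fixed d the values shrink as q grows, but the decay
# becomes slower as d increases because of the exponent 2^{-(d-1)}.
for q in (11, 101, 1009, 10007):
    print(q, [round(lemma3_error(q, d), 4) for d in (2, 3, 4)])
```

For $d = 2$ the expression reduces to $1/\sqrt{q}$, which gives a quick sanity check on the formula.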

We need just a little Fourier analysis.

An additive character of $\mathbb{F}_q$ is a homomorphism $\chi : (\mathbb{F}_q, +) \to \mathbb{C}^\times$. If $\mathbb{F}_q$ has characteristic $p$, then evidently $\chi(g)^p = \chi(pg) = \chi(0) = 1$ for any $g \in \mathbb{F}_q$, so $\chi$ actually takes values in the set of $p$th roots of unity. As a result, we conclude that $\chi(-g) = \chi(g)^{-1} = \overline{\chi(g)}$. Of course, $\chi$ is an additive character if and only if $\overline{\chi}$ is. The principal character $\chi_0$ is identically 1. The set $\widehat{\mathbb{F}_q}$ of additive characters of $\mathbb{F}_q$ is an orthonormal basis for the $q$-dimensional $\mathbb{C}$-linear space of functions $f : \mathbb{F}_q \to \mathbb{C}$ with the inner product $\langle \cdot, \cdot \rangle$ defined in the previous section. Hence, with a small amount of work on orthogonality of characters, one can show that any $f$ can be written as a linear combination of additive characters in the following way:
$$f = \sum_{\chi \in \widehat{\mathbb{F}_q}} \langle f, \chi \rangle\, \chi. \tag{20}$$
The Fourier transform of $f$, written $\hat{f}$, is the function $\hat{f} : \widehat{\mathbb{F}_q} \to \mathbb{C}$ given by $\hat{f}(\chi) = q \langle f, \chi \rangle = \sum_{g \in \mathbb{F}_q} f(g) \overline{\chi(g)}$. Thus $\langle f, \chi \rangle = \frac{\hat{f}(\chi)}{q}$; plugging this into (20) and reindexing the sum yields the Fourier inversion formula, valid for all $a \in \mathbb{F}_q$:
$$f(a) = \sum_{\chi \in \widehat{\mathbb{F}_q}} \frac{\hat{f}(\chi)}{q}\, \chi(a) = \frac{1}{q} \sum_{\chi \in \widehat{\mathbb{F}_q}} \hat{f}(\overline{\chi})\, \chi(-a). \tag{21}$$
With some more work, one can show the Plancherel formula: for any $f_1, f_2 : \mathbb{F}_q \to \mathbb{C}$, we have
$$\left\langle \hat{f}_1, \hat{f}_2 \right\rangle = q\, \langle f_1, f_2 \rangle,$$
where on the left-hand side we mean the analogous inner product
$$\left\langle \hat{f}_1, \hat{f}_2 \right\rangle := \frac{1}{q} \sum_{\chi \in \widehat{\mathbb{F}_q}} \hat{f}_1(\chi)\, \overline{\hat{f}_2(\chi)}.$$
Fourier analysis is a huge topic; for a sample, the interested reader may consult [8, 14, 25].
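The inversion and Plancherel formulas can be checked numerically in the simplest setting. The sketch below assumes $q = p$ is prime, so that the additive characters are $\chi_t(g) = e^{2\pi i t g / p}$ for $t = 0, \ldots, p-1$, and it assumes the convention $\hat{f}(\chi) = \sum_g f(g) \overline{\chi(g)}$ together with the normalized inner product $\langle u, v \rangle = \frac{1}{p} \sum u \overline{v}$. All function names are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def chi(p, t, g):
    """Additive character chi_t(g) = exp(2*pi*i*t*g/p) of the prime field F_p."""
    return cmath.exp(2j * math.pi * t * g / p)


def fourier_transform(p, f):
    """Return [hat f(chi_t)] with hat f(chi_t) = sum_g f(g) * conj(chi_t(g))."""
    return [sum(f[g] * chi(p, t, g).conjugate() for g in range(p))
            for t in range(p)]


def invert(p, f_hat):
    """Recover f via f(a) = (1/p) * sum_t hat f(chi_t) * chi_t(a)."""
    return [sum(f_hat[t] * chi(p, t, a) for t in range(p)) / p
            for a in range(p)]


def inner(p, u, v):
    """Normalized inner product <u, v> = (1/p) * sum u * conj(v)."""
    return sum(x * complex(y).conjugate() for x, y in zip(u, v)) / p


p = 11
indicator_A = [1 if g in {1, 3, 4, 5, 9} else 0 for g in range(p)]
f_hat = fourier_transform(p, indicator_A)
recovered = invert(p, f_hat)

# Inversion: the reconstruction error should be at the level of rounding noise.
print(max(abs(r - x) for r, x in zip(recovered, indicator_A)))
# Plancherel: <hat f, hat f> should agree with p * <f, f> = |A|.
print(inner(p, f_hat, f_hat).real, p * inner(p, indicator_A, indicator_A).real)
```

For a nonprime field $\mathbb{F}_q$ the characters are built from the field trace instead, which this sketch does not attempt.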

We will state a variant of Lemma 3, use it to solve Problem 2, and then prove it.
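Before the lemma, it may help to see the statement of Problem 2 by brute force in small prime fields, which also shows why "sufficiently large" cannot be dropped. The sketch below assumes $q = p$ is prime and uses hypothetical helper names; it enumerates $\{a x^d + b y^d\}$ directly and is only an experiment, not part of either proof.

```python
def values_of_form(p, d, a=1, b=1):
    """Return the set {a*x^d + b*y^d mod p : x, y in F_p} for a prime p."""
    powers = {pow(x, d, p) for x in range(p)}
    return {(a * u + b * v) % p for u in powers for v in powers}


def form_is_surjective(p, d, a=1, b=1):
    """True when every element of F_p has the form a*x^d + b*y^d."""
    return len(values_of_form(p, d, a, b)) == p


# In F_7 the cubes are 0, 1, 6, so 3 and 4 are not sums of two cubes.
print(sorted(values_of_form(7, 3)))

# Scan a few primes p = 1 (mod 3), where cubing is not a bijection,
# over all nonzero coefficient pairs (a, b).
for p in (7, 13, 19, 31, 37, 43):
    ok = all(form_is_surjective(p, 3, a, b)
             for a in range(1, p) for b in range(1, p))
    print(p, ok)
```

Primes $p \not\equiv 1 \pmod 3$ are omitted from the scan because cubing is then a permutation of $\mathbb{F}_p$ and the question is trivial.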

Lemma 4. There is a positive function $E(q, d) : \mathbb{N}^2 \to \mathbb{R}$ with these properties:

1. For any $q$, any subsets $A_q, B_q \subseteq \mathbb{F}_q$, and any polynomial $P \in \mathbb{F}_q[x]$ with degree $d$ coprime to $q$, we have
$$\left| \frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A_q \cap (B_q + P(g))) - \mu(A_q)\, \mu(B_q) \right| \le E(q, d).$$
2. For any fixed $d$, we have $\lim_{q \to \infty} E(q, d) = 0$.

Remarks. These comments are parallel to their corresponding comments under Lemma 3.

1. This lemma strengthens Lemma 3, since the set of polynomials appearing in the first statement contains the set of polynomials with degree between 1 and $\operatorname{char}(\mathbb{F}_q)$.
2. The intuitive assertion of this lemma remains the same as in Lemma 3; namely, the quantities $\mu(A_q \cap (B_q + P(g)))$ are eventually of size $\mu(A_q)\, \mu(B_q)$ on average. Moreover, the application of this lemma will be almost exactly the same.
3. Our choice of $E(q, d)$ here will decay to zero more quickly than it did in Lemma 3, except when $d = 2$, in which case it agrees with the previous choice.

Solution to Problem 2.

We may suppose $q > d$. For $q$ such that $d$ is coprime with $q$, the argument proceeds as in the proof of Theorem 4, with Lemma 4 replacing Lemma 3, and shows that for all sufficiently large $q$ coprime to $d$, for all $a, b \in \mathbb{F}_q^\times$, we have
$$\mathbb{F}_q = \{ a x^d + b y^d : x, y \in \mathbb{F}_q \}. \tag{22}$$
Otherwise, namely, if $\gcd(d, q) > 1$, then by Lemma 1, we have
$$\{ a x^d + b y^d : x, y \in \mathbb{F}_q \} = \{ a x^\delta + b y^\delta : x, y \in \mathbb{F}_q \},$$
where $\delta = \gcd(d, q - 1)$. Since $\delta$ divides $q - 1$, the integers $\delta$ and $q$ are coprime, so the first case applies with $\delta$ in place of $d$. This completes the argument since $d$ has finitely many factors.

Proof of Lemma 4.

Fix $q$, subsets $A, B \subseteq \mathbb{F}_q$, and a polynomial $P \in \mathbb{F}_q[x]$ with degree $d$ coprime to $q$. We rewrite the expression
$$\frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A \cap (B + P(g)))$$
step by step. Recall that $\mu(S) = \frac{|S|}{q} = \frac{1}{q} \sum_{h \in \mathbb{F}_q} 1_S(h)$, that $1_{S \cap S'}(x) = 1_S(x)\, 1_{S'}(x)$, and that $1_{S + y}(x) = 1_S(x - y)$ for any $x, y \in \mathbb{F}_q$ and $S, S' \subseteq \mathbb{F}_q$. Thus,
$$\frac{1}{q} \sum_{g \in \mathbb{F}_q} \mu(A \cap (B + P(g))) = \frac{1}{q} \sum_{g \in \mathbb{F}_q} \frac{1}{q} \sum_{h \in \mathbb{F}_q} 1_A(h)\, 1_B(h - P(g)).$$
Inverting using (21) with $f = 1_B$ and $a = h - P(g)$, we find
$$\frac{1}{q} \sum_{g \in \mathbb{F}_q} \frac{1}{q} \sum_{h \in \mathbb{F}_q} 1_A(h)\, 1_B(h - P(g)) = \frac{1}{q^3} \sum_{g, h \in \mathbb{F}_q} \sum_{\chi \in \widehat{\mathbb{F}_q}} 1_A(h)\, \widehat{1_B}(\overline{\chi})\, \chi(P(g) - h).$$
After separating $\chi(P(g) - h) = \chi(P(g))\, \chi(-h) = \chi(P(g))\, \overline{\chi(h)}$ and changing the order of summation, we notice the expression for the Fourier transform of $1_A$ evaluated at the character $\chi$:
$$\frac{1}{q^3} \sum_{g, h \in \mathbb{F}_q} \sum_{\chi \in \widehat{\mathbb{F}_q}} 1_A(h)\, \widehat{1_B}(\overline{\chi})\, \chi(P(g) - h) = \frac{1}{q^3} \sum_{g, \chi} \chi(P(g))\, \widehat{1_B}(\overline{\chi}) \underbrace{\sum_{h \in \mathbb{F}_q} 1_A(h)\, \overline{\chi(h)}}_{= \widehat{1_A}(\chi)}.$$
After changing the order of summation again, we have
$$\frac{1}{q^3} \sum_{g, \chi} \chi(P(g))\, \widehat{1_B}(\overline{\chi})\, \widehat{1_A}(\chi) = \frac{1}{q^3} \sum_{\chi} \widehat{1_A}(\chi)\, \widehat{1_B}(\overline{\chi}) \sum_{g} \chi(P(g)).$$
Let's separate the $\chi = \chi_0$ term from the whole sum. Remember, $\chi_0$ is identically 1. Thus, by definition, $\widehat{1_A}(\chi_0) = \sum_{g \in \mathbb{F}_q} 1_A(g)\, \overline{\chi_0(g)} = \sum_{g \in \mathbb{F}_q} 1_A(g) = |A|$, and since $\chi_0 = \overline{\chi_0}$, we have $\widehat{1_B}(\overline{\chi_0}) = |B|$. Moreover, $\sum_{g \in \mathbb{F}_q} \chi_0(P(g)) = q$. Now $q \cdot \frac{|A||B|}{q^3} = \mu(A)\, \mu(B)$. Thus
$$\frac{1}{q^3} \sum_{\chi} \widehat{1_A}(\chi)\, \widehat{1_B}(\overline{\chi}) \sum_{g} \chi(P(g)) = \mu(A)\, \mu(B) + \frac{1}{q^3} \sum_{\chi \neq \chi_0} \widehat{1_A}(\chi)\, \widehat{1_B}(\overline{\chi}) \sum_{g} \chi(P(g)).$$
It remains to find $E(q, d)$ to bound
$$\left| \frac{1}{q^3} \sum_{\chi \neq \chi_0} \widehat{1_A}(\chi)\, \widehat{1_B}(\overline{\chi}) \sum_{g \in \mathbb{F}_q} \chi(P(g)) \right|. \tag{23}$$
By the triangle inequality, (23) is bounded by
$$\frac{1}{q^3} \sum_{\chi \neq \chi_0} \left| \widehat{1_A}(\chi) \right| \left| \widehat{1_B}(\overline{\chi}) \right| \left| \sum_{g \in \mathbb{F}_q} \chi(P(g)) \right|. \tag{24}$$
At this point, we await the lightning. Transcribing some folklore spawned from his own work, Weil wrote down in a short note in 1948 the relationship between some exponential sums and the Riemann hypothesis over finite fields [32]. The specific formulation we use here appears (with minor notation adjustments) in Kowalski's notes on exponential sums [15, Theorem 3.2]:

Theorem 5.

Fix a polynomial $f \in \mathbb{F}_q[x]$ of degree $d$ and a nontrivial additive character $\chi$ of $\mathbb{F}_q$. If $d < q$ and $d$ is coprime to $q$, then
$$\left| \sum_{g \in \mathbb{F}_q} \chi(f(g)) \right| \le (d - 1) \sqrt{q}.$$
This theorem applies readily to part of (24). Thus we find that (24) is bounded by
$$\frac{(d - 1) \sqrt{q}}{q^3} \sum_{\chi \neq \chi_0} \left| \widehat{1_A}(\chi) \right| \left| \widehat{1_B}(\overline{\chi}) \right|. \tag{25}$$
Our final step uses the Plancherel formula. In preparation, we massage part of (25) by padding it with the $\chi = \chi_0$ term and applying Cauchy–Schwarz to see that
$$\sum_{\chi \neq \chi_0} \left| \widehat{1_A}(\chi) \right| \left| \widehat{1_B}(\overline{\chi}) \right| \le \sum_{\chi} \left| \widehat{1_A}(\chi) \right| \left| \widehat{1_B}(\overline{\chi}) \right| \le \left( \sum_{\chi} \left| \widehat{1_A}(\chi) \right|^2 \sum_{\chi'} \left| \widehat{1_B}(\chi') \right|^2 \right)^{1/2} = \left( q^4 \mu(A)\, \mu(B) \right)^{1/2} \le q^2, \tag{26}$$
where the equality in (26) holds by Plancherel since
$$\sum_{\chi} \left| \widehat{1_A}(\chi) \right|^2 = q \left\langle \widehat{1_A}, \widehat{1_A} \right\rangle = q^2 \langle 1_A, 1_A \rangle = q^2 \mu(A).$$
Thus (25) is bounded by $\frac{(d - 1) \sqrt{q}}{q^3} \cdot q^2 = \frac{d - 1}{\sqrt{q}}$, so take $E(q, d) = \frac{d - 1}{\sqrt{q}}$. Obviously $\lim_{q \to \infty} E(q, d) = 0$ for fixed $d$.

In this approach we depend on serious, now classical, knowledge about certain exponential sums, a definite step up in difficulty over Gauss sums. The interest in these kinds of exponential sums arose in part as an outgrowth of interest in the problems of cyclotomy and diagonal equations but has since taken on its own life in number theory. We have certainly not done justice to the Riemann hypothesis over finite fields here; it is a deep topic.
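The bound of Theorem 5 and the resulting error term $E(q, d) = (d-1)/\sqrt{q}$ can be tested numerically in small cases. The following sketch assumes $q = p$ is prime, so the nontrivial characters are $\chi_t(g) = e^{2\pi i t g / p}$ with $t \neq 0$; polynomials are passed as coefficient lists from the constant term upward, and all function names are hypothetical. It is an experiment consistent with the statements above, not a substitute for the proof.

```python
import cmath
import math
from fractions import Fraction


def poly_eval(p, coeffs, g):
    """Evaluate a polynomial (coefficients listed low degree first) at g mod p."""
    value = 0
    for c in reversed(coeffs):  # Horner's rule
        value = (value * g + c) % p
    return value


def char_sum(p, coeffs, t):
    """Sum over g in F_p of chi_t(f(g)), where chi_t(x) = exp(2*pi*i*t*x/p)."""
    return sum(cmath.exp(2j * math.pi * t * poly_eval(p, coeffs, g) / p)
               for g in range(p))


def max_nontrivial_sum(p, coeffs):
    """Largest modulus of the character sum over all nontrivial chi_t."""
    return max(abs(char_sum(p, coeffs, t)) for t in range(1, p))


def lemma4_gap(p, A, B, coeffs):
    """Exact value of |(1/p) sum_g mu(A & (B + P(g))) - mu(A) mu(B)|."""
    total = 0
    for g in range(p):
        shift = poly_eval(p, coeffs, g)
        total += sum(1 for b in B if (b + shift) % p in A)
    return abs(Fraction(total, p * p) - Fraction(len(A) * len(B), p * p))


cube = [0, 0, 0, 1]  # f(g) = g^3, so d = 3
for p in (7, 13, 31, 101):
    print(p, round(max_nontrivial_sum(p, cube), 3), round(2 * math.sqrt(p), 3))

p = 31
A = set(range(0, p, 2))             # an arbitrary test set
B = {pow(x, 2, p) for x in range(p)}  # squares, another arbitrary test set
print(float(lemma4_gap(p, A, B, cube)), 2 / math.sqrt(p))
```

In each printed row the first number should not exceed the second; for $d = 2$ the character sum is a quadratic Gauss sum, whose modulus equals $\sqrt{p}$ for odd $p$, so the bound is attained there.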
For a historical perspective, see [21], and for a mathematical discussion see, for example, [12]. What we have done is examined several facets of a pretty problem, Problem 2, found in the same deposit as Fermat's last theorem.

For an offline source, see [23, Theorem 2E] or [24, Theorem 2.5].

Acknowledgment. We thank the referees for many useful comments and especially the referee who suggested several improvements to the structure of Section 2 and pointed out what are now Proposition 2, the remark under Theorem 4, and part of the first remark under Lemma 3.

References

[1] Cauchy, A.-L. (1813). Recherches sur les nombres. J. Éc. polytech. Math. 9: 99–116; Oeuvres (II), vol. 1, pp. 39–63, Gauthier-Villars, Paris, 1905.
[2] Davenport, H., Hasse, H. (1934). Die Nullstellen der Kongruenzzetafunktionen in gewissen zyklischen Fällen. J. Reine Angew. Math.
[3] … J. Reine Angew. Math. 92: 181–290.
[4] Dickson, L. E. (1909). On the congruence $x^n + y^n + z^n \equiv 0 \pmod{p}$. J. Reine Angew. Math.
[5] … $x^e + y^e + z^e \equiv 0 \pmod{p}$. J. Reine Angew. Math.
[6] … Ann. of Math. (2). 28(1-4): 333–341.
[7] Dickson, L. E. (1928). Review of Robert Fricke's Lehrbuch der Algebra. Bull. Amer. Math. Soc. 34: 531.
[8] Hardy, G. H., Rogosinski, W. W. (2013). Fourier Series, revised ed. Mineola, NY: Dover.
[9] Hua, L. K., Vandiver, H. S. (1949). Characters over certain types of rings, with applications to the theory of equations in a finite field. Proc. Nat. Acad. Sci. U.S.A. 35: 94–99.
[10] Hua, L. K., Vandiver, H. S. (1949). On the nature of the solutions of certain equations in a finite field. Proc. Nat. Acad. Sci. U.S.A. 35: 481–487.
[11] Hurwitz, A. (1909). Über die Kongruenz $ax^e + by^e + cz^e \equiv 0 \pmod{p}$. J. Reine Angew. Math.
[12] … A Classical Introduction to Modern Number Theory, 2nd ed. New York, NY: Springer-Verlag.
[13] Joly, J.-R. (1973). Équations et variétés algébriques sur un corps fini. Enseign. Math.
[14] … Fourier Analysis. New York: Cambridge Univ. Press.
[15] Kowalski, E. (2018). Exponential sums over finite fields: elementary methods. Available at: people.math.ethz.ch/%7Ekowalski/exp-sums.pdf
[16] Kummer, E. E. (1850). Allgemeine Reciprocitätsgesetze für beliebig hohe Potenzreste. Monatsber. Königl. Akad. Wiss. Berlin; Collected Papers, vol. 1, pp. 345–357, Springer-Verlag, Berlin-Heidelberg-New York, 1975.
[17] Kummer, E. E. (1851). Mémoire sur la théorie des nombres complexes composés de racines de l'unité et de nombres entiers. J. Math. Pures Appl. 16: 377–498; Collected Papers, vol. 1, pp. 363–484, Springer-Verlag, Berlin-Heidelberg-New York, 1975.
[18] Kummer, E. E. (1852). Über die Ergänzungssätze zu den allgemeinen Reciprocitätsgesetzen. J. Reine Angew. Math. 44: 93–146; Collected Papers, vol. 1, pp. 485–538, Springer-Verlag, Berlin-Heidelberg-New York, 1975.
[19] Lagrange, J.-L. (1770). Démonstration d'un théorème d'arithmétique. Nouv. Mémoires Acad. Roy. Berlin; Oeuvres, vol. 3, pp. 189–201, Gauthier-Villars, Paris, 1869.
[20] Lidl, R., Niederreiter, H. (1997). Finite Fields, 2nd ed. New York, NY: Cambridge Univ. Press.
[21] Roquette, P. (2018). The Riemann Hypothesis in Characteristic p in Historical Perspective. Lecture Notes in Mathematics, 2222. Cham, Switzerland: Springer.
[22] Schappacher, N. On the History of Hilbert's Twelfth Problem: A Comedy of Errors, Matériaux pour l'histoire des mathématiques au XXe siècle, Nice, 1996, Sémin. Congr. 3 (Société Mathématique de France, Paris, 1998), 243–273.
[23] Schmidt, W. (1976). Equations over Finite Fields: an Elementary Approach, 1st ed. Lecture Notes in Mathematics, 536. Berlin-New York: Springer-Verlag.
[24] Schmidt, W. (2004). Equations over Finite Fields: an Elementary Approach, 2nd ed. Heber City, UT: Kendrick Press.
[25] Shakarchi, R., Stein, E. M. (2003). Fourier Analysis: An Introduction. Princeton, NJ: Princeton Univ. Press.
[26] Singh, S. (1973). Analysis of each integer as sum of two cubes in a finite integral domain. Indian J. Pure Appl. Math.
[27] … Norske Vid. Selsk. Forh. 10: 89–92.
[28] Small, C. (1977). Sums of powers in large finite fields. Proc. Amer. Math. Soc.
[29] … Arithmetic of Algebraic Curves. New York: Springer.
[30] Storer, T. (1967). Cyclotomy and difference sets. Lectures in Advanced Mathematics, 2. Chicago, IL: Markham Publishing Co.
[31] Weber, H. M. (1899). Lehrbuch der Algebra, Vol. 2, 3rd ed. AMS Chelsea.
[32] Weil, A. (1948). On some exponential sums. Proc. Nat. Acad. Sci. U.S.A. 34: 204–207.
[33] Weil, A. (1949). Numbers of solutions of equations in finite fields. Bull. Amer. Math. Soc. 55: 497–508.
[34] Weyl, H. (1916). Über die Gleichverteilung von Zahlen mod. Eins.