On the length of binary forms

Abstract

The $K$-length of a form $f$ in $K[x_1, \dots, x_n]$, $K \subset \mathbb{C}$, is the smallest number of $d$-th powers of linear forms of which $f$ is a $K$-linear combination. We present many results, old and new, about $K$-length, mainly in $n = 2$, and often about the length of the same form over different fields. For example, the $K$-length of $3x^5 - 20x^3y^2 + 10xy^4$ is three for $K = \mathbb{Q}(\sqrt{-1})$, four for $K = \mathbb{Q}(\sqrt{-2})$ and five for $K = \mathbb{R}$.



BRUCE REZNICK

1. Introduction and Overview

Suppose $f(x_1, \dots, x_n)$ is a form of degree $d$ with coefficients in a field $K \subseteq \mathbb{C}$. The $K$-length of $f$, $L_K(f)$, is the smallest $r$ for which there is a representation

$$f(x_1, \dots, x_n) = \sum_{j=1}^{r} \lambda_j \left(\alpha_{j1} x_1 + \cdots + \alpha_{jn} x_n\right)^d \tag{1.1}$$

with $\lambda_j, \alpha_{jk} \in K$.

In this paper, we consider the $K$-length of a fixed form $f$ as $K$ varies; this is apparently an open question in the literature, even for binary forms ($n = 2$). Sylvester [48, 49] explained how to compute $L_{\mathbb{C}}(f)$ for binary forms in 1851. Except for a few remarks, we shall restrict our attention to binary forms.

It is trivially true that $L_K(f) = 1$ for linear $f$, and for $d = 2$, $L_K(f)$ equals the rank of $f$: a representation over $K$ can be found by completing the square, and this length cannot be shortened by enlarging the field. Accordingly, we shall also assume that $d \ge 3$. Many of our results are extremely low-hanging fruit which were either known in the 19th century, or would have been, had its mathematicians been able to take 21st century undergraduate mathematics courses.

When $K = \mathbb{C}$, the $\lambda_j$'s in (1.1) are superfluous. The computation of $L_{\mathbb{C}}(f)$ is a huge, venerable and active subject, and very hard when $n \ge 3$. The interested reader is directed to [4, 5, 10, 12, 13, 17, 20, 23, 29, 39, 40, 41] as representative recent works. Even for small $n, d \ge 3$, there are still many open questions. Landsberg and Teitler [29] complete a classification of $L_{\mathbb{C}}(f)$ for ternary cubics $f$ and also discuss $L_{\mathbb{C}}(x_1 x_2 \cdots x_n)$, among other topics. Historically, much attention has centered on the $\mathbb{C}$-length of a general form of degree $d$. In 1995, Alexander and Hirschowitz [1] (see also [2, 31]) established that for $n, d \ge 3$, this length is $\left\lceil \frac{1}{n}\binom{n+d-1}{n-1} \right\rceil$, the constant-counting value, with the four exceptions known since the 19th century, $(d, n) = (3,5), (4,3), (4,4), (4,5)$, in which the length is $\left\lceil \frac{1}{n}\binom{n+d-1}{n-1} \right\rceil + 1$.

Date: May 30, 2018.
2000 Mathematics Subject Classification. Primary: 11E76, 11P05, 14N10.

An alternative definition would remove the coefficients from (1.1). (The computation of the alternative definition is likely to be much harder than the one we consider here, if for no other reason than that cones are harder to work with than subspaces.) This alternative definition was considered by Ellison [14] in the special cases $K = \mathbb{C}, \mathbb{R}, \mathbb{Q}$. When $d$ is odd and $K = \mathbb{R}$, the $\lambda_j$'s are also unnecessary. When $d$ is even and $K \subseteq \mathbb{R}$, it is not easy to determine whether (1.1) is possible for a given $f$. In [42], the principal object of study is $Q_{n,2k}$, the (closed convex) cone of forms which are a sum ($\lambda_j = 1$) of $2k$-th powers of real linear forms. As two illustrations of the difficulties which can arise in this case: for $K = \mathbb{Q}(\sqrt{2})$, $\sqrt{2}\,x^2$ is not a sum of squares in $K[x]$, and $x^4 + 6\lambda x^2y^2 + y^4 \in Q_{2,4}$ if and only if $\lambda \in [0,1]$. For more on the signs that can occur in an $\mathbb{R}$-representation (1.1), see [45].

Helmke [20] uses both definitions for length for forms, and is mainly concerned with the coefficient-free version in the case when $K$ is an algebraically closed (or a real closed) field of characteristic zero, not necessarily a subset of $\mathbb{C}$. Newman and Slater [34] do not restrict to homogeneous polynomials. They write $x$ as a sum of $d$ $d$-th powers of linear polynomials; by substitution, any polynomial is a sum of at most $d$ $d$-th powers of polynomials. They also show that the minimum number of $d$-th powers in this formulation is $\ge \sqrt{d}$. Because of the degrees of the summands, these methods do not homogenize to forms. Mordell [32] showed that a polynomial that is a sum of cubes of linear forms over $\mathbb{Z}$ is also a sum of at most eight such cubes. More generally, if $R$ is a commutative ring, then its $d$-Pythagoras number, $P_d(R)$, is the smallest integer $k$ so that any sum of $d$-th powers in $R$ is a sum of $k$ $d$-th powers. This subject is closely related to Hilbert's 17th Problem; see [6, 8, 7].

Two examples illustrate the phenomenon of multiple lengths over different fields.

Example 1.1.
Suppose $f(x,y) = (x + \sqrt{2}\,y)^d + (x - \sqrt{2}\,y)^d \in \mathbb{Q}[x,y]$. Then $L_K(f)$ is 2 (if $\sqrt{2} \in K$) and $d$ (otherwise). This example first appeared in [42, p.137]. (See Theorem 4.6 for a generalization.)

Example 1.2. If $\phi(x,y) = 3x^5 - 20x^3y^2 + 10xy^4$, then $L_K(\phi) = 3$ if and only if $\sqrt{-1} \in K$, $L_K(\phi) = 4$ for $K = \mathbb{Q}(\sqrt{-2}), \mathbb{Q}(\sqrt{-3}), \mathbb{Q}(\sqrt{-5}), \mathbb{Q}(\sqrt{-6})$ (at least) and $L_{\mathbb{R}}(\phi) = 5$. (We give proofs of these assertions in Examples 2.1 and 3.1.)

The following simple definitions and remarks apply in the obvious way to forms in $n \ge 3$ variables. A representation such as (1.1) is $K$-minimal if $r = L_K(f)$. Two linear forms are called distinct if they (or their $d$-th powers) are not proportional. A representation is honest if the summands are pairwise distinct. Any minimal representation is honest. Two honest representations are different if the ordered sets of summands are not rearrangements of each other; we do not distinguish between $\ell^d$ and $(\zeta_d^k \ell)^d$ where $\zeta_d = e^{2\pi i/d}$.

If $g$ is obtained from $f$ by an invertible linear change of variables over $K$, then $L_K(f) = L_K(g)$. Given a form $f \in \mathbb{C}[x,y]$, the field generated by the coefficients of $f$ over $\mathbb{Q}$ is denoted $E_f$. The $K$-length can only be defined for fields $K$ satisfying $E_f \subseteq K \subseteq \mathbb{C}$. The following implication is immediate:

$$K_1 \subset K_2 \implies L_{K_1}(f) \ge L_{K_2}(f). \tag{1.2}$$

Strict inequality in (1.2) is possible, as shown by the two examples. The cabinet of $f$, $\mathcal{C}(f)$, is the set of all possible lengths for $f$.

We now outline the remainder of the paper.

In section two, we give a self-contained proof of Sylvester's 1851 Theorem (Theorem 2.1). Although originally given over $\mathbb{C}$, it adapts easily to any $K \subset \mathbb{C}$ (Corollary 2.2). If $f$ is a binary form, then $L_K(f) \le r$ iff a certain subspace of the binary forms of degree $r$ (a subspace determined by $f$) contains a form that splits into distinct factors over $K$. We illustrate the algorithm by proving the assertions of lengths 3 and 4 for $\phi$ in Example 1.2.

In section three, we prove (Theorem 3.2) a homogenized version of Sylvester's 1864 Theorem (Theorem 3.1), which implies that if real $f$ has $r$ linear factors over $\mathbb{R}$ (counting multiplicity), then $L_{\mathbb{R}}(f) \ge r$. In particular, $L_{\mathbb{R}}(\phi) = 5$. As far as we have been able to tell, Sylvester did not connect his two theorems: perhaps because he presented the second one for non-homogeneous polynomials in a single variable, perhaps because "fields" had not yet been invented.

We apply these theorems and some other simple observations in sections four and five. We first show that if $L_{\mathbb{C}}(f) = 1$, then $L_{E_f}(f) = 1$ as well (Theorem 4.1). Any set of $d+1$ $d$-th powers of pairwise distinct linear forms is linearly independent (Theorem 4.2). It follows quickly that if $f(x,y)$ has two different honest representations of length $r$ and $s$, then $r + s \ge d + 2$ (Corollary 4.3), and so if $L_{E_f}(f) = r \le \frac{d+1}{2}$, then the representation over $E_f$ is the unique minimal $\mathbb{C}$-representation (Corollary 4.4). We show that Example 1.1 gives the template for forms $f$ satisfying $L_{\mathbb{C}}(f) = 2$

4; and give examples for each one not already forbidden. We then completely classify binary cubics; the key point of Theorem 5.2 is that if the cubic $f$ has no repeated factors, then $L_K(f) = 2$ if and only if $E_f\left(\sqrt{-3\Delta(f)}\right) \subseteq K$; this significance of the discriminant $\Delta(f)$ can already be found e.g. in Salmon [47, § $d = 3$. In Theorem 5.3, we show that Conjecture 4.12 also holds for $d = 4$. Another probably old theorem (Theorem 5.4) is that $L_{\mathbb{C}}(f) = d$ if and only if there are distinct linear forms $\ell, \ell'$ so that $f = \ell^{d-1}\ell'$. The minimal representations of $x^k y^k$ are parameterized (Theorem 5.5), and in Corollary 5.6, we show that $L_K((x^2+y^2)^k) \ge k+1$, with equality if and only if $\tan\frac{\pi}{k+1} \in K$. In particular, $L_{\mathbb{Q}}((x^2+y^2)^2) = 4$. Theorem 5.7 shows that $L_{\mathbb{Q}}(x^4 + 6\lambda x^2y^2 + y^4) = 3$ if and only if a certain quartic diophantine equation over $\mathbb{Z}$ has a non-zero solution.

Section six lists some open questions.

We would like to express our appreciation to the organizers of the Higher Degree Forms conference in Gainesville in May 2009 for offering the opportunities to speak on these topics, and to write this article for its Proceedings. We also thank Mike Bennett, Joe Rotman and Zach Teitler for helpful conversations.

2. Sylvester's 1851 Theorem

Modern proofs of Theorem 2.1 can be found in the work of Kung and Rota: [28, §

Theorem 2.1 (Sylvester). Suppose

$$f(x,y) = \sum_{j=0}^{d} \binom{d}{j} a_j x^{d-j} y^j \tag{2.1}$$

and suppose

$$h(x,y) = \sum_{t=0}^{r} c_t x^{r-t} y^t = \prod_{j=1}^{r} (-\beta_j x + \alpha_j y) \tag{2.2}$$

is a product of pairwise distinct linear factors. Then there exist $\lambda_k \in \mathbb{C}$ so that

$$f(x,y) = \sum_{k=1}^{r} \lambda_k (\alpha_k x + \beta_k y)^d \tag{2.3}$$

if and only if

$$\begin{pmatrix} a_0 & a_1 & \cdots & a_r \\ a_1 & a_2 & \cdots & a_{r+1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{d-r} & a_{d-r+1} & \cdots & a_d \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_r \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}; \tag{2.4}$$

that is, if and only if

$$\sum_{t=0}^{r} a_{\ell+t} c_t = 0, \qquad \ell = 0, 1, \dots, d-r. \tag{2.5}$$

Proof. First suppose that (2.3) holds. Then for $0 \le j \le d$,

$$a_j = \sum_{k=1}^{r} \lambda_k \alpha_k^{d-j} \beta_k^{j} \implies \sum_{t=0}^{r} a_{\ell+t} c_t = \sum_{k=1}^{r} \sum_{t=0}^{r} \lambda_k \alpha_k^{d-\ell-t} \beta_k^{\ell+t} c_t = \sum_{k=1}^{r} \lambda_k \alpha_k^{d-\ell-r} \beta_k^{\ell} \sum_{t=0}^{r} \alpha_k^{r-t} \beta_k^{t} c_t = \sum_{k=1}^{r} \lambda_k \alpha_k^{d-\ell-r} \beta_k^{\ell}\, h(\alpha_k, \beta_k) = 0.$$

Now suppose that (2.4) holds and suppose first that $c_r \ne 0$. We may assume without loss of generality that $c_r = 1$ and that $\alpha_j = 1$ in (2.2), so that the $\beta_j$'s are distinct. Define the infinite sequence $(\tilde a_j)$, $j \ge 0$, by:

$$\tilde a_j = a_j \ \text{ if } 0 \le j \le r-1; \qquad \tilde a_{r+\ell} = -\sum_{t=0}^{r-1} \tilde a_{t+\ell}\, c_t \ \text{ for } \ell \ge 0. \tag{2.6}$$

This sequence satisfies the recurrence of (2.5), so that

$$\tilde a_j = a_j \quad \text{for } j \le d. \tag{2.7}$$

Since $|\tilde a_j| \le c \cdot M^j$ for suitable $c, M$, the generating function

$$\Phi(T) = \sum_{j=0}^{\infty} \tilde a_j T^j$$

converges in a neighborhood of 0. We have

$$\left( \sum_{t=0}^{r} c_{r-t} T^t \right) \Phi(T) = \sum_{n=0}^{r-1} \left( \sum_{j=0}^{n} c_{r-(n-j)}\, \tilde a_j \right) T^n + \sum_{n=r}^{\infty} \left( \sum_{t=0}^{r} c_{r-t}\, \tilde a_{n-t} \right) T^n.$$

It follows from (2.6) that the second sum vanishes, and hence $\Phi(T)$ is a rational function with denominator

$$\sum_{t=0}^{r} c_{r-t} T^t = h(T, 1) = \prod_{j=1}^{r} (1 - \beta_j T).$$

By partial fractions, there exist $\lambda_k \in \mathbb{C}$ so that

$$\sum_{j=0}^{\infty} \tilde a_j T^j = \Phi(T) = \sum_{k=1}^{r} \frac{\lambda_k}{1 - \beta_k T} \implies \tilde a_j = \sum_{k=1}^{r} \lambda_k \beta_k^{j}. \tag{2.8}$$

A comparison of (2.8) and (2.7) with (2.1) shows that

$$f(x,y) = \sum_{j=0}^{d} \binom{d}{j} a_j x^{d-j} y^j = \sum_{k=1}^{r} \lambda_k \sum_{j=0}^{d} \binom{d}{j} \beta_k^{j} x^{d-j} y^j = \sum_{k=1}^{r} \lambda_k (x + \beta_k y)^d, \tag{2.9}$$

as claimed in (2.3).

If $c_r = 0$, then $c_{r-1} \ne 0$, because $h$ has distinct factors. We may proceed as before, replacing $r$ by $r-1$ and taking $c_{r-1} = 1$, so that (2.2) becomes

$$h(x,y) = \sum_{t=0}^{r-1} c_t x^{r-t} y^t = x \prod_{j=1}^{r-1} (y - \beta_j x). \tag{2.10}$$

Since $c_r = 0$, (2.4) loses a column and becomes

$$\begin{pmatrix} a_0 & a_1 & \cdots & a_{r-1} \\ a_1 & a_2 & \cdots & a_{r} \\ \vdots & \vdots & \ddots & \vdots \\ a_{d-r} & a_{d-r+1} & \cdots & a_{d-1} \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{r-1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

We argue as before, except that (2.7) becomes

$$\tilde a_j = a_j \ \text{ for } j \le d-1, \qquad a_d = \tilde a_d + \lambda_r, \tag{2.11}$$

and (2.9) becomes

$$f(x,y) = \sum_{j=0}^{d} \binom{d}{j} a_j x^{d-j} y^j = \lambda_r y^d + \sum_{k=1}^{r-1} \lambda_k \sum_{j=0}^{d} \binom{d}{j} \beta_k^{j} x^{d-j} y^j = \lambda_r y^d + \sum_{k=1}^{r-1} \lambda_k (x + \beta_k y)^d. \tag{2.12}$$

By (2.10), (2.12) meets the description of (2.3), completing the proof. □
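The easy direction of Theorem 2.1 can be checked mechanically. The following is a minimal sketch in exact rational arithmetic; the function names and the sample representation are illustrative assumptions, not notation from the paper. It builds the normalized coefficients $a_j$ of (2.1) from a representation (2.3), forms the coefficients $c_t$ of $h$ as in (2.2), and evaluates the sums in (2.5).

```python
from fractions import Fraction

def normalized_coeffs(terms, d):
    """a_0..a_d of (2.1) for f = sum lam*(alpha*x + beta*y)**d.

    Each term is a triple (lam, alpha, beta); a_j = sum lam*alpha**(d-j)*beta**j.
    """
    return [sum(Fraction(lam) * alpha ** (d - j) * beta ** j
                for lam, alpha, beta in terms)
            for j in range(d + 1)]

def sylvester_form(terms):
    """c_0..c_r of h = prod(-beta*x + alpha*y); c_t multiplies x**(r-t) * y**t."""
    c = [Fraction(1)]
    for _, alpha, beta in terms:
        nxt = [Fraction(0)] * (len(c) + 1)
        for t, ct in enumerate(c):
            nxt[t] += -beta * ct      # factor -beta*x keeps the y-degree
            nxt[t + 1] += alpha * ct  # factor alpha*y raises the y-degree
        c = nxt
    return c

def hankel(a, r):
    """The (d-r+1) x (r+1) Hankel matrix of (2.4)."""
    d = len(a) - 1
    return [[a[l + t] for t in range(r + 1)] for l in range(d - r + 1)]

def hankel_times(a, c):
    """Left-hand sides of (2.5) for l = 0..d-r."""
    r = len(c) - 1
    return [sum(row[t] * c[t] for t in range(r + 1)) for row in hankel(a, r)]

# Hypothetical sample: f = x^5 + 2(x+y)^5 - (x-2y)^5, so d = 5 and r = 3.
sample = [(1, 1, 0), (2, 1, 1), (-1, 1, -2)]
a = normalized_coeffs(sample, 5)
c = sylvester_form(sample)
residuals = hankel_times(a, c)   # every entry should vanish by Theorem 2.1
```

The same helper applies to the quintic $\phi$ of Example 1.2, whose normalized coefficients are $(3, 0, -2, 0, 2, 0)$: the vector $(0, 1, 0, 1)$, that is $h = x^2y + y^3$, is annihilated by the $3 \times 4$ Hankel matrix.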

The $(d-r+1) \times (r+1)$ Hankel matrix in (2.4) will be denoted $H_r(f)$. If $(f, h)$ satisfy the criterion of this theorem, we shall say that $h$ is a Sylvester form for $f$. If the only Sylvester forms of degree $r$ are $\lambda h$ for $\lambda \in \mathbb{C}$, we say that $h$ is the unique Sylvester form for $f$. Any multiple of a Sylvester form that has no repeated factors is also a Sylvester form, since there is no requirement that $\lambda_k \ne 0$ in (2.3). If $f$ has a unique Sylvester form of degree $r$, then $L_{\mathbb{C}}(f) = r$ and $L_K(f) \ge r$.

The proof of Theorem 2.1 in [44] is based on apolarity. If $f$ and $h$ are given by (2.1) and (2.2), and $h(D) = \prod_{j=1}^{r} \left(\beta_j \frac{\partial}{\partial x} - \alpha_j \frac{\partial}{\partial y}\right)$, then

$$h(D) f = \sum_{m=0}^{d-r} \frac{d!}{(d-r-m)!\, m!} \left( \sum_{i=0}^{r} a_{i+m} c_i \right) x^{d-r-m} y^m.$$

Thus, (2.4) is equivalent to $h(D)f = 0$. One can then argue that each linear factor in $h(D)$ kills a different summand, and dimension counting takes care of the rest. In particular, if $\deg h > d$, then $h(D)f = 0$ automatically, and this implies that $L_{\mathbb{C}}(f) \le d+1$. Theorem 4.2 is a less mysterious explanation of this fact.

If $h$ has repeated factors, a condition of interest in [25, 26, 27, 28, 44], then Gundelfinger's Theorem [18], first proved in 1886, shows that a factor $(\beta x - \alpha y)^{\ell}$ of $h$ corresponds to a summand $(\alpha x + \beta y)^{d+1-\ell} q(x,y)$ in $f$, where $q$ is an arbitrary form of degree $\ell - 1$. (Such a summand is unhelpful in the current context when $\ell \ge 2$.)

If $d = 2s-1$ and $r = s$, then $H_s(f)$ is $s \times (s+1)$ and has a non-trivial null-vector; for a general $f$, the resulting form $h$ has distinct factors, and so is a unique Sylvester form. (The coefficients of $h$, and its discriminant, are polynomials in the coefficients of $f$.) This is how Sylvester proved that a general binary form of degree $2s-1$ is a sum of $s$ powers of linear forms and the minimal representation is unique. (If so, $L_K(f) = s$ iff $h$ splits in $K$, but this does not happen in general if $s \ge 2$.)

If $d = 2s$ and $r = s$, then $H_s(f)$ is square; $\det(H_s(f))$ is the catalecticant of $f$. (For etymological exegeses on "catalecticant", see [42, pp.49-50] and [17, pp.104-105].) In general, there exists $\lambda$ so that the catalecticant of $f(x,y) - \lambda x^{2s}$ vanishes, and the resulting non-trivial null vector is generally a Sylvester form (no repeated factors). Thus, a general binary form of degree $2s$ is a sum of $\lambda x^{2s}$ plus $s$ powers of linear forms. (It is less clear whether one should expect $L_K(f) = s+1$ for general $f$.)

Sylvester's Theorem can be adapted to compute $K$-length when $K \subsetneq \mathbb{C}$.

Corollary 2.2. Given $f \in K[x,y]$, $L_K(f)$ is the minimal degree of a Sylvester form for $f$ which splits completely over $K$.

Proof. If (2.3) is a minimal representation for $f$ over $K$, where $\lambda_k, \alpha_k, \beta_k \in K$, then $h(x,y) \in K[x,y]$ splits over $K$ by (2.2). Conversely, if $h$ is a Sylvester form for $f$ satisfying (2.2) with $\alpha_k, \beta_k \in K$, then (2.3) holds for some $\lambda_k \in \mathbb{C}$. This is equivalent to saying that the linear system

$$a_j = \sum_{k=1}^{r} \alpha_k^{d-j} \beta_k^{j} X_k, \qquad (0 \le j \le d) \tag{2.13}$$

has a solution $\{X_k = \lambda_k\}$ over $\mathbb{C}$. Since $a_j, \alpha_k^{d-j}\beta_k^{j} \in K$, it follows that (2.13) also has a solution over $K$, so that $f$ has a $K$-representation of length $r$. □

We apply these results to the quintic from Example 1.2.

Example 2.1. Note that

$$\phi(x,y) = 3x^5 - 20x^3y^2 + 10xy^4 = \binom{5}{0} \cdot 3\,x^5 + \binom{5}{1} \cdot 0\,x^4y + \binom{5}{2} \cdot (-2)\,x^3y^2 + \binom{5}{3} \cdot 0\,x^2y^3 + \binom{5}{4} \cdot 2\,xy^4 + \binom{5}{5} \cdot 0\,y^5.$$

Since

$$\begin{pmatrix} 3 & 0 & -2 & 0 \\ 0 & -2 & 0 & 2 \\ -2 & 0 & 2 & 0 \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \iff (c_0, c_1, c_2, c_3) = r(0, 1, 0, 1),$$

$\phi$ has a unique Sylvester form of degree 3: $h(x,y) = y(x^2 + y^2) = y(y - ix)(y + ix)$. Accordingly, there exist $\lambda_k \in \mathbb{C}$ so that

$$\phi(x,y) = \lambda_1 x^5 + \lambda_2 (x + iy)^5 + \lambda_3 (x - iy)^5.$$

Indeed, $\lambda_1 = \lambda_2 = \lambda_3 = 1$, as may be checked. It follows that $L_K(\phi) = 3$ if and only if $i \in K$. (A representation of length two would be detected here if some $\lambda_k = 0$.)

To find representations for $\phi$ of length 4, we revisit (2.4):

$$H_4(\phi) \cdot (c_0, c_1, c_2, c_3, c_4)^t = (0, 0)^t \iff 3c_0 - 2c_2 + 2c_4 = -2c_1 + 2c_3 = 0 \iff (c_0, c_1, c_2, c_3, c_4) = r_1(2, 0, 3, 0, 0) + r_2(0, 1, 0, 1, 0) + r_3(0, 0, 1, 0, 1),$$

hence $h(x,y) = r_1 x^2(2x^2 + 3y^2) + y(x^2 + y^2)(r_2 x + r_3 y)$. Given a field $K$, it is far from obvious whether there exist $\{r_\ell\}$ so that $h$ splits into distinct factors over $K$. Here are some imaginary quadratic fields for which this happens.

The choice $(r_1, r_2, r_3) = (1, 0, 2)$ gives $h(x,y) = (2x^2 + y^2)(x^2 + 2y^2)$ and

$$24\,\phi(x,y) = 4(x + \sqrt{-2}\,y)^5 + 4(x - \sqrt{-2}\,y)^5 + (2x + \sqrt{-2}\,y)^5 + (2x - \sqrt{-2}\,y)^5.$$

Similarly, $(r_1, r_2, r_3) = (2, 0, 9)$ and $(2, 0, -5)$ give $h(x,y) = (x^2 + 3y^2)(4x^2 + 3y^2)$ and $(x^2 - y^2)(4x^2 + 5y^2)$, leading to representations for $\phi$ of length 4 over $\mathbb{Q}(\sqrt{-3})$ and $\mathbb{Q}(\sqrt{-5})$. A representation over $\mathbb{Q}(\sqrt{-6})$ uses $(r_1, r_2, r_3) = (8450, 0, -104544)$ and, up to a constant factor, $h(x,y) = (5x + 12y)(5x - 12y)(6 \cdot 13^2 x^2 + 33^2 y^2)$. It is easy to believe that $L_{\mathbb{Q}(\sqrt{-m})}(\phi) = 4$ for all squarefree $m \ge 2$, though we have no proof. In Example 3.1, we shall show that there is no choice of $(r_1, r_2, r_3)$ for which $h$ splits into distinct factors over any subfield of $\mathbb{R}$.

3. Sylvester's 1864 Theorem

Theorem 3.1 was discovered by Sylvester [50] in 1864 while proving Isaac Newton's conjectural variation on Descartes' Rule of Signs, see [22, 51]. This theorem appeared in Pólya-Szegő [37, Ch.5, Prob.79], and has been used by Pólya and Schoenberg [36] and Karlin [24, p.466]. The (dehomogenized) version proved in [37] is:

Theorem 3.1 (Sylvester). Suppose $0 \ne \lambda_k$ for all $k$ and $\gamma_1 < \cdots < \gamma_r$, $r \ge 2$, are real numbers such that

$$Q(t) = \sum_{k=1}^{r} \lambda_k (t - \gamma_k)^d$$

does not vanish identically. Suppose the sequence $(\lambda_1, \dots, \lambda_r, (-1)^d \lambda_1)$ has $C$ changes of sign and $Q$ has $Z$ zeros, counting multiplicity. Then $Z \le C$.

We shall prove an equivalent version which exploits the homogeneity of $f$ to avoid discussion of zeros at infinity in the proof. (The equivalence is discussed in [45].)
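As a sanity check of the inequality $Z \le C$, the sketch below counts sign changes for one concrete instance. It uses the rational representation $6\phi(x,y) = 36x^5 - 10(x+y)^5 - 10(x-y)^5 + (x+2y)^5 + (x-2y)^5$ recorded in Example 3.1, dehomogenized at $y = 1$, so that $\gamma = (-2,-1,0,1,2)$ and $\lambda = (1,-10,36,-10,1)$. The helper names and the choice of sample points are assumptions made for illustration; the real zeros are only bounded from below, by sign changes of $Q$ at a few rational points.

```python
from fractions import Fraction
from math import comb

def sign_changes(seq):
    """Number of sign changes in a sequence, ignoring zero entries."""
    signs = [s for s in seq if s != 0]
    return sum(1 for u, v in zip(signs, signs[1:]) if (u > 0) != (v > 0))

def power_sum_coeffs(lams, gammas, d):
    """Coefficients q_0..q_d (q_i multiplies t**i) of Q(t) = sum lam*(t - gamma)**d."""
    return [sum(lam * comb(d, i) * (-g) ** (d - i) for lam, g in zip(lams, gammas))
            for i in range(d + 1)]

def evaluate(coeffs, t):
    """Value of the polynomial with the given ascending coefficients at t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))

d = 5
lams = [1, -10, 36, -10, 1]
gammas = [-2, -1, 0, 1, 2]

Q = power_sum_coeffs(lams, gammas, d)                   # equals 6 * phi(t, 1)
C = sign_changes(lams + [(-1) ** d * lams[0]])          # sign changes in Theorem 3.1
samples = [Fraction(-3), Fraction(-1), Fraction(-1, 2),
           Fraction(1, 2), Fraction(1), Fraction(3)]
Z_lower = sign_changes([evaluate(Q, t) for t in samples])  # lower bound on real zeros
```

Here both counts come out equal to the degree, so the bound of Theorem 3.1 is attained for this representation.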

Theorem 3.2. Suppose $f(x,y)$ is a non-zero real form of degree $d$ with $\tau$ real linear factors (counting multiplicity) and

$$f(x,y) = \sum_{j=1}^{r} \lambda_j (\cos\theta_j\, x + \sin\theta_j\, y)^d \tag{3.1}$$

where $-\frac{\pi}{2} < \theta_1 < \cdots < \theta_r \le \frac{\pi}{2}$, $r \ge 2$ and $\lambda_j \ne 0$. Suppose there are $\sigma$ sign changes in the tuple $(\lambda_1, \lambda_2, \dots, \lambda_r, (-1)^d \lambda_1)$. Then $\tau \le \sigma$. In particular, $\tau \le r$.

Example 3.1. Since

$$\phi(x,y) = 3x \left( x^2 - \tfrac{10 - \sqrt{70}}{3}\, y^2 \right) \left( x^2 - \tfrac{10 + \sqrt{70}}{3}\, y^2 \right)$$

is a product of five linear factors over $\mathbb{R}$, $L_{\mathbb{R}}(\phi) \ge 5$. The representation

$$6\,\phi(x,y) = 36x^5 - 10(x+y)^5 - 10(x-y)^5 + (x+2y)^5 + (x-2y)^5$$

over $\mathbb{Q}$ implies that $\mathcal{C}(\phi) = \{3, 4, 5\}$.

Proof of Theorem 3.2.

We first "projectivize" (3.1):

$$2f(x,y) = \sum_{j=1}^{r} \lambda_j (\cos\theta_j\, x + \sin\theta_j\, y)^d + \sum_{j=1}^{r} (-1)^d \lambda_j (\cos(\theta_j + \pi)\, x + \sin(\theta_j + \pi)\, y)^d. \tag{3.2}$$

View the sequence $(\lambda_1, \lambda_2, \dots, \lambda_r, (-1)^d\lambda_1, (-1)^d\lambda_2, \dots, (-1)^d\lambda_r, \lambda_1)$ cyclically, identifying the first and last term. There are $2\sigma$ pairs of consecutive terms with a negative product. It doesn't matter where one starts, so if we make any invertible change of variables $(x,y) \mapsto (\cos\theta\, x + \sin\theta\, y, -\sin\theta\, x + \cos\theta\, y)$ in (3.1) (which doesn't affect $\tau$, and which "dials" the angles by $\theta$), and reorder the "main" angles to $(-\frac{\pi}{2}, \frac{\pi}{2}]$, the value of $\sigma$ is unchanged. We may therefore assume that neither $x$ nor $y$ divide $f$, that $x^d$ and $y^d$ are not summands in (3.2) (i.e., $\theta_j$ is not a multiple of $\frac{\pi}{2}$), and that if there is a sign change in $(\lambda_1, \lambda_2, \dots, \lambda_r)$, then $\theta_u < 0 < \theta_{u+1}$ implies $\lambda_u \lambda_{u+1} < 0$. Under these hypotheses, we may safely dehomogenize $f$ by setting either $x = 1$ or $y = 1$ and avoid zeros at infinity and know that $\tau$ is the number of zeros of the resulting polynomial. The rest of the proof generally follows [37].

Let $\bar\sigma$ denote the number of sign changes in $(\lambda_1, \lambda_2, \dots, \lambda_r)$. We induct on $\bar\sigma$. The base case is $\bar\sigma = 0$ (and $\lambda_j > 0$ for all $j$, without loss of generality). If $d$ is even, then $\sigma = 0$ and

$$f(x,y) = \sum_{j=1}^{r} \lambda_j (\cos\theta_j\, x + \sin\theta_j\, y)^d$$

is definite, so $\tau = 0$. If $d$ is odd, then $\sigma = 1$. Let $g(t) = f(t, 1)$, so that

$$g'(t) = \sum_{j=1}^{r} d(\lambda_j \cos\theta_j)(\cos\theta_j\, t + \sin\theta_j)^{d-1}.$$

Since $d - 1$ is even, $\cos\theta_j > 0$ and $\lambda_j > 0$, $g'$ is definite and $g' \ne 0$. Rolle's Theorem implies that $g$ has at most one zero; that is, $\tau \le \sigma$.

Suppose the theorem is valid for $\bar\sigma = m \ge 0$ and that $\bar\sigma = m + 1$ in (3.1). Now let $h(t) = f(1, t)$. We have

$$h'(t) = \sum_{j=1}^{r} d(\lambda_j \sin\theta_j)(\cos\theta_j + \sin\theta_j\, t)^{d-1}.$$

Note that $h'(t) = q(1, t)$, where

$$q(x,y) = \sum_{j=1}^{r} d(\lambda_j \sin\theta_j)(\cos\theta_j\, x + \sin\theta_j\, y)^{d-1}.$$

Since $\bar\sigma \ge 1$, $\theta_u < 0 < \theta_{u+1}$ implies that $\lambda_u \lambda_{u+1} < 0$, so that the number of sign changes in $(d\lambda_1 \sin\theta_1, d\lambda_2 \sin\theta_2, \dots, d\lambda_r \sin\theta_r)$ is $m$, as the sign change at the $u$-th consecutive pair has been removed, and no other possible sign changes are introduced. The induction hypothesis implies that $q(x,y)$ has at most $m$ linear factors, hence $q(1,t) = h'(t)$ has $\le m$ zeros (counting multiplicity) and Rolle's Theorem implies that $h$ has $\le m + 1$ zeros, completing the induction. □

4. Applications to forms of general degree

We begin with a familiar folklore result: the vector space of complex forms $f$ in $n$ variables of degree $d$ is spanned by the set of linear forms taken to the $d$-th power. It follows from a 1903 theorem of Biermann (see [42, Prop.2.11] or [46] for a proof) that a canonical set of the "right" number of $d$-th powers over $\mathbb{Z}$ forms a basis:

$$\left\{ (i_1 x_1 + \cdots + i_n x_n)^d : 0 \le i_k \in \mathbb{Z},\ i_1 + \cdots + i_n = d \right\}. \tag{4.1}$$

If $f \in K[x_1, \dots, x_n]$, then $f$ is a $K$-linear combination of these forms and so $L_K(f) \le \binom{n+d-1}{n-1}$. We show below (Theorems 4.10, 5.4) that when $n = 2$, the bound for $L_K(f)$ can be improved from $d+1$ to $d$, but this is best possible.

The first two simple results are presented explicitly for completeness.

Theorem 4.1. If $f \in K[x,y]$, then $L_K(f) = 1$ if and only if $L_{\mathbb{C}}(f) = 1$.

Proof. One direction is immediate from (1.2). For the other, suppose $f(x,y) = (\alpha x + \beta y)^d$ with $\alpha, \beta \in \mathbb{C}$. If $\alpha = 0$, then $f(x,y) = \beta^d y^d$, with $\beta^d \in K$. If $\alpha \ne 0$, then $f(x,y) = \alpha^d (x + (\beta/\alpha) y)^d$. Since the coefficients of $x^d$ and $d x^{d-1} y$ in $f$ are $\alpha^d$ and $\alpha^{d-1}\beta$, it follows that $\alpha^d$ and $\beta/\alpha = (\alpha^{d-1}\beta)/\alpha^d$ are both in $K$. □

Theorem 4.2. Any set $\{(\alpha_j x + \beta_j y)^d : 0 \le j \le d\}$ of pairwise distinct $d$-th powers is linearly independent and spans the binary forms of degree $d$.

Proof. The matrix of this set with respect to the basis $\binom{d}{i} x^{d-i} y^i$ is $[\alpha_j^{d-i} \beta_j^{i}]$, whose determinant is Vandermonde:

$$\prod_{0 \le j < k \le d} (\alpha_j \beta_k - \alpha_k \beta_j).$$

This determinant is a product of non-zero terms by hypothesis. □
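The determinant in the proof of Theorem 4.2 is easy to test numerically. The sketch below, with illustrative helper names that are not from the paper, compares the determinant of $[\alpha_j^{d-i}\beta_j^{i}]$ (computed by the permutation expansion, which is adequate for small $d$) with the product $\prod_{j<k}(\alpha_j\beta_k - \alpha_k\beta_j)$ on integer data.

```python
from itertools import combinations, permutations
from math import prod

def det(m):
    """Determinant by the Leibniz (permutation) expansion; fine for small matrices."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i, j in combinations(range(n), 2) if perm[i] > perm[j])
        total += (-1) ** inversions * prod(m[i][perm[i]] for i in range(n))
    return total

def power_matrix(forms, d):
    """Row j holds alpha_j**(d-i) * beta_j**i for i = 0..d."""
    return [[a ** (d - i) * b ** i for i in range(d + 1)] for a, b in forms]

def vandermonde_product(forms):
    """Product over j < k of (alpha_j*beta_k - alpha_k*beta_j)."""
    return prod(a1 * b2 - a2 * b1 for (a1, b1), (a2, b2) in combinations(forms, 2))

# Four pairwise distinct linear forms x, x+y, x+2y, y, with d = 3.
distinct = [(1, 0), (1, 1), (1, 2), (0, 1)]
# Here x+y and 2x+2y are proportional, so the cubes are dependent.
proportional = [(1, 0), (1, 1), (2, 2), (0, 1)]

det_distinct = det(power_matrix(distinct, 3))
det_proportional = det(power_matrix(proportional, 3))
```

A non-zero value for `det_distinct` reflects the linear independence of the four cubes, while the proportional pair forces both the determinant and the product to vanish.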

By considering the difference of two representations of a given form, we obtain an immediate corollary about different representations of the same form. Trivial counterexamples, formed by splitting summands, occur in non-honest representations.

Corollary 4.3. If $f$ has two different honest representations:

$$f(x,y) = \sum_{i=1}^{s} \lambda_i (\alpha_i x + \beta_i y)^d = \sum_{j=1}^{t} \mu_j (\gamma_j x + \delta_j y)^d, \tag{4.2}$$

then $s + t \ge d + 2$. If $s + t = d + 2$ in (4.2), then the combined set of linear forms, $\{\alpha_i x + \beta_i y, \gamma_j x + \delta_j y\}$, is pairwise distinct.

The next result collects some consequences of Corollary 4.3.

Corollary 4.4. Let $E = E_f$.
(1) If $L_E(f) = r \le \frac{d}{2} + 1$, then $L_{\mathbb{C}}(f) = r$, so $\mathcal{C}(f) = \{r\}$.
(2) If, further, $L_E(f) = r \le \frac{d}{2} + \frac{1}{2}$, then $f$ has a unique $\mathbb{C}$-minimal representation.
(3) If $d = 2s - 1$ and $H_s(f)$ has full rank and $f$ has a unique Sylvester form $h$ of degree $s$, then $L_K(f) \ge s$, with equality if and only if $h$ splits in $K$.

Proof. We take the parts in turn.
(1) A different representation of $f$ over $\mathbb{C}$ must have length $\ge d + 2 - r \ge \frac{d}{2} + 1 \ge r$ by Corollary 4.3, and so $L_{\mathbb{C}}(f) = r$.
(2) If $r \le \frac{d}{2} + \frac{1}{2}$, then this representation has length $\ge \frac{d}{2} + \frac{3}{2} > r$, and so cannot be minimal.
(3) If $d = 2s - 1$ and $r = s$, then the last case applies, so $f$ has a unique $\mathbb{C}$-minimal representation, and by Corollary 2.2, this representation can be expressed in $K$ if and only if the Sylvester form splits over $K$. □

We now give some more explicit constructions of forms with multiple lengths. We first need a lemma about cubics.

Lemma 4.5. If $f$ is a cubic given by (2.1) and $H_2(f) = \begin{pmatrix} a_0 & a_1 & a_2 \\ a_1 & a_2 & a_3 \end{pmatrix}$ has rank $\le 1$, then $f$ is a cube.

Proof. If $a_0 = 0$, then $a_1 = 0$, so $a_2 = 0$ and $f$ is a cube. If $a_0 \ne 0$, then $a_2 = a_1^2/a_0$ and $a_3 = a_1 a_2/a_0 = a_1^3/a_0^2$ and $f(x,y) = a_0\left(x + \frac{a_1}{a_0} y\right)^3$ is again a cube. □

Theorem 4.6. Suppose $d \ge 3$ and there exist $\alpha_i, \beta_i \in \mathbb{C}$ so that

$$f(x,y) = \sum_{i=0}^{d} \binom{d}{i} a_i x^{d-i} y^i = (\alpha_1 x + \beta_1 y)^d + (\alpha_2 x + \beta_2 y)^d \in K[x,y]. \tag{4.3}$$

If (4.3) is honest and $L_K(f) > 2$, then there exists $u \in K$ with $\sqrt{u} \notin K$ so that $L_{K(\sqrt{u})}(f) = 2$. The summands in (4.3) are conjugates of each other in $K(\sqrt{u})$.

Proof. First observe that if $\alpha_2 = 0$, then $\alpha_1 \beta_2 \ne \alpha_2 \beta_1$ implies that $\alpha_1 \ne 0$. But then $a_0 = \alpha_1^d \ne 0$ and $a_1 = \alpha_1^{d-1}\beta_1$ imply that $\alpha_1^d, \beta_1/\alpha_1 \in K$ as in Theorem 4.1, and so

$$f(x,y) - \alpha_1^d (x + (\beta_1/\alpha_1) y)^d = (\beta_2 y)^d = \beta_2^d y^d \in K[x,y].$$

This contradicts $L_K(f) > 2$, so $\alpha_2 \ne 0$; similarly, $\alpha_1 \ne 0$. Let $\lambda_i = \alpha_i^d$ and $\gamma_i = \beta_i/\alpha_i$ for $i = 1, 2$, so $\lambda_1 \lambda_2 \ne 0$ and $\gamma_1 \ne \gamma_2$. We have

$$f(x,y) = \lambda_1 (x + \gamma_1 y)^d + \lambda_2 (x + \gamma_2 y)^d \implies a_i = \lambda_1 \gamma_1^i + \lambda_2 \gamma_2^i.$$

Now let

$$g(x,y) = \lambda_1 (x + \gamma_1 y)^3 + \lambda_2 (x + \gamma_2 y)^3 = a_0 x^3 + 3a_1 x^2 y + 3a_2 x y^2 + a_3 y^3.$$

Since $\lambda_i \ne 0$ and (4.3) is honest, Corollary 4.3 implies that $L_{\mathbb{C}}(g) = 2$, so $H_2(g)$ has full rank by Lemma 4.5. It can be checked directly that

$$\begin{pmatrix} a_0 & a_1 & a_2 \\ a_1 & a_2 & a_3 \end{pmatrix} \cdot \begin{pmatrix} \gamma_1 \gamma_2 \\ -(\gamma_1 + \gamma_2) \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

and this gives $h(x,y) = (y - \gamma_1 x)(y - \gamma_2 x)$ as the unique Sylvester form for $g$. Since $H_2(g)$ has entries in $K$ and hence has a null vector in $K$, we must have $h \in K[x,y]$. By hypothesis, $h$ does not split over $K$; it must do so over $K(\sqrt{u})$, where $u = (\gamma_1 - \gamma_2)^2 = (\gamma_1 + \gamma_2)^2 - 4\gamma_1\gamma_2 \in K$. Moreover, if $\sigma$ denotes conjugation with respect to $\sqrt{u}$, then $\gamma_2 = \sigma(\gamma_1)$ and since $\lambda_1 + \lambda_2 \in K$, $\lambda_2 = \sigma(\lambda_1)$ as well. Note that $\lambda_i = \alpha_i^d$ and $\gamma_i = \beta_i/\alpha_i \in K(\sqrt{u})$, but this is not necessarily true for $\alpha_i$ and $\beta_i$ themselves. □

Corollary 4.7.

Suppose $g \in E[x,y]$ does not split over $E$, but factors into distinct linear factors $g(x,y) = \prod_{j=1}^{r} (x + \alpha_j y)$ over an extension field $K$ of $E$. If $d > 2r - 2$, then for each $\ell \ge 0$,

$$f_\ell(x,y) = \sum_{j=1}^{r} \alpha_j^{\ell} (x + \alpha_j y)^d \in E[x,y],$$

and $L_K(f_\ell) = r < d + 2 - r \le L_E(f_\ell)$.

Proof. The coefficient of $\binom{d}{k} x^{d-k} y^k$ in $f_\ell$ is $\sum_{j=1}^{r} \alpha_j^{\ell+k}$. Each such power-sum belongs to $E$ by Newton's Theorem on Symmetric Forms. If $\alpha_s \notin E$ (which must hold for at least one $\alpha_s \ne 0$), then $\alpha_s^{\ell} (x + \alpha_s y)^d \notin E[x,y]$. Apply Corollary 4.3. □

Corollary 4.8. Suppose $K$ is an extension field of $E_f$, $r \le \frac{d+1}{2}$, and

$$f(x,y) = \sum_{i=1}^{r} \lambda_i (\alpha_i x + \beta_i y)^d$$

with $\lambda_i, \alpha_i, \beta_i \in K$. Then every automorphism of $K$ which fixes $E_f$ permutes the summands of the representation of $f$.

Proof. We interpret $\sigma(\lambda(\alpha x + \beta y)^d) = \sigma(\lambda)(\sigma(\alpha) x + \sigma(\beta) y)^d$. Since $\sigma(f) = f$, the action of $\sigma$ is to give another representation of $f$. Corollary 4.4(2) implies that this is the same representation, perhaps reordered. □

This next theorem is undoubtedly ancient, but we cannot find a suitable reference.

Theorem 4.9. If $f \in K[x,y]$, then $L_{\mathbb{C}}(f) \le \deg f$.

Proof. By a change of variables, which does not affect the length, we may assume that neither $x$ nor $y$ divide $f$, hence $a_0 a_d \ne 0$ and $h = a_d x^d - a_0 y^d$ is a Sylvester form which splits over $\mathbb{C}$. □

Theorem 4.9 appears as an exercise in Harris [19, Ex.11.35], with the (dehomogenized) maximal length occurring at $x^{d-1}(x+1)$ (see Theorem 5.4). Landsberg and Teitler [29, Cor. 5.2] prove that $L_{\mathbb{C}}(f) \le \binom{n+d-1}{n-1} - (n-1)$, which equals $d$ when $n = 2$. The proof given for Theorem 4.9 will not apply to all fields $K$, because $a_d x^d - a_0 y^d$ usually does not split over $K$. A more careful argument is required.

Theorem 4.10. If $f \in K[x,y]$, then $L_K(f) \le \deg f$.

Proof. Write $f$ as in (2.1). If $f$ is identically zero, there is nothing to prove. Otherwise, we may assume that $f(1,0) = a_0 \ne 0$ after a change of variables if necessary. By Corollary 2.2, it suffices to find $h(x,y) = \sum_{k=0}^{d} c_k x^{d-k} y^k$ which splits into distinct linear factors over $K$ and satisfies $\sum_{k=0}^{d} a_k c_k = 0$.

Let $e_0 = 1$ and $e_k(t_1, \dots, t_{d-1})$ denote the usual $k$-th elementary symmetric functions. We make a number of definitions:

$$h_0(t_1, \dots, t_{d-1}; x, y) := \sum_{k=0}^{d-1} e_k(t_1, \dots, t_{d-1})\, x^{d-1-k} y^k = \prod_{j=1}^{d-1} (x + t_j y),$$

$$\beta(t_1, \dots, t_{d-1}) := -\sum_{k=0}^{d-1} a_k\, e_k(t_1, \dots, t_{d-1}),$$

$$\alpha(t_1, \dots, t_{d-1}) := \sum_{k=0}^{d-1} a_{k+1}\, e_k(t_1, \dots, t_{d-1}),$$

$$\Phi(t_1, \dots, t_{d-1}) := \prod_{j=1}^{d-1} \left( \alpha(t_1, \dots, t_{d-1})\, t_j - \beta(t_1, \dots, t_{d-1}) \right),$$

$$\Psi(t_1, \dots, t_{d-1}) := \Phi(t_1, \dots, t_{d-1}) \prod_{1 \le i < j \le d-1} (t_i - t_j).$$

Since $\beta(0, \dots, 0) = -a_0 e_0 = -a_0 \ne 0$, we have $\Phi(0, \dots, 0) = a_0^{d-1} \ne 0$ and $\Phi$ is not the zero polynomial, and thus neither is $\Psi$. Choose $\gamma_j \in K$, $1 \le j \le d-1$, so that $\Psi(\gamma_1, \dots, \gamma_{d-1}) \ne 0$. It follows that the $\gamma_j$'s are distinct, and $\alpha\gamma_j \ne \beta$, where $\alpha = \alpha(\gamma_1, \dots, \gamma_{d-1})$ and $\beta = \beta(\gamma_1, \dots, \gamma_{d-1})$. Let $e_k = e_k(\gamma_1, \dots, \gamma_{d-1})$. We claim that

$$h(x,y) = \sum_{i=0}^{d} c_i x^{d-i} y^i := (\alpha x + \beta y)\, h_0(\gamma_1, \dots, \gamma_{d-1}; x, y) = (\alpha x + \beta y) \prod_{j=1}^{d-1} (x + \gamma_j y) = (\alpha x + \beta y) \sum_{k=0}^{d-1} e_k x^{d-1-k} y^k = \alpha e_0 x^d + \sum_{k=1}^{d-1} (\alpha e_k + \beta e_{k-1}) x^{d-k} y^k + \beta e_{d-1} y^d$$

is a Sylvester form for $f$. Note that the $\gamma_j$'s are distinct and $\alpha\gamma_j \ne \beta$, $1 \le j \le d-1$, so $h$ is a product of distinct linear factors. Finally,

$$\sum_{k=0}^{d} a_k c_k = \alpha e_0 a_0 + \sum_{k=1}^{d-1} (\alpha e_k + \beta e_{k-1}) a_k + \beta e_{d-1} a_d = \alpha \sum_{k=0}^{d-1} e_k a_k + \beta \sum_{k=0}^{d-1} e_k a_{k+1} = \alpha(-\beta) + \beta\alpha = 0.$$

This completes the proof. □
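The construction in the proof of Theorem 4.10 is effective, and can be sketched over $K = \mathbb{Q}$ as follows. This is an illustration under stated assumptions, not the paper's algorithm verbatim: the $\gamma_j$ are searched among small integers (a hypothetical search range), the input must already satisfy $a_0 \ne 0$, and the coefficients $\lambda_k$ are then found by solving the linear system (2.13) exactly. By Theorem 2.1, a factor $x + \gamma_j y$ of $h$ corresponds to the summand $(\gamma_j x - y)^d$ and the factor $\alpha x + \beta y$ to $(\beta x - \alpha y)^d$.

```python
from fractions import Fraction
from itertools import combinations

def elementary_symmetric(ts):
    """e_0..e_n of the numbers ts, read off from prod(1 + t*z)."""
    e = [Fraction(1)]
    for t in ts:
        e = [(e[k] if k < len(e) else 0) + (t * e[k - 1] if k > 0 else 0)
             for k in range(len(e) + 1)]
    return e

def solve_exact(rows, rhs):
    """Solve a consistent system whose columns are independent (Gauss-Jordan)."""
    n = len(rows[0])
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, len(aug)) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for i in range(len(aug)):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [u - factor * v for u, v in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]

def rational_representation(a):
    """Return [(lam, (p, q)), ...] with f = sum lam*(p*x + q*y)**d over Q, length d.

    `a` holds the normalized coefficients a_0..a_d of (2.1), with a_0 != 0.
    """
    d = len(a) - 1
    assert a[0] != 0, "change variables first so that f(1, 0) != 0"
    for gammas in combinations(range(-d, d + 1), d - 1):   # assumed search range
        e = elementary_symmetric(gammas)
        alpha = sum(a[k + 1] * e[k] for k in range(d))
        beta = -sum(a[k] * e[k] for k in range(d))
        if all(alpha * g != beta for g in gammas):         # Psi(gammas) != 0
            forms = [(Fraction(g), Fraction(-1)) for g in gammas]
            forms.append((Fraction(beta), Fraction(-alpha)))
            rows = [[p ** (d - j) * q ** j for p, q in forms] for j in range(d + 1)]
            return list(zip(solve_exact(rows, a), forms))
    return None

def rebuild(rep, d):
    """Normalized coefficients a_0..a_d of the form given by a representation."""
    return [sum(lam * p ** (d - j) * q ** j for lam, (p, q) in rep)
            for j in range(d + 1)]
```

For instance, `rational_representation([3, 0, -2, 0, 2, 0])` yields a representation of the quintic $\phi$ of Example 1.2 as a rational combination of five fifth powers of rational linear forms; some $\lambda_k$ may vanish for other inputs, in which case the honest length is shorter.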

Corollary 4.11. If $f$ is a product of $d$ real linear forms, then $L_{\mathbb{R}}(f) = d$.

Proof. Write $f$ as a sum of $L_{\mathbb{R}}(f) = r \le d$ $d$-th powers and rescale into the shape (3.1). Taking $\tau = d$ in Theorem 3.2, we see that $d \le \sigma \le r$. □

Conjecture 4.12. If $f \in \mathbb{R}[x,y]$ is a form of degree $d \ge 3$, then $L_{\mathbb{R}}(f) = d$ if and only if $f$ is a product of $d$ linear forms.

We shall see in Theorems 5.2 and 5.3 that this conjecture is true for $d = 3, 4$.

5. Applications to forms of particular degree

Corollary 4.3 and Theorem 4.10 impose some immediate restrictions on the possible cabinets of a form of degree $d$.

Corollary 5.1. Suppose $\deg f = d$.
(1) If $L_{\mathbb{C}}(f) = r$, then $\mathcal{C}(f) \subseteq \{r, d - (r-2), d - (r-3), \dots, d\}$.
(2) If $L_{\mathbb{C}}(f) = 2$, then $\mathcal{C}(f)$ is either $\{2\}$ or $\{2, d\}$.
(3) If $f$ has $k$ different lengths, then $d \ge 2k - 1$.
(4) If $f$ is cubic, then $\mathcal{C}(f) = \{1\}, \{2\}, \{3\}$ or $\{2, 3\}$.
(5) If $f$ is quartic, then $\mathcal{C}(f) = \{1\}, \{2\}, \{3\}, \{4\}, \{2, 4\}$ or $\{3, 4\}$.

We now completely classify $L_K(f)$ when $f$ is a binary cubic.

Theorem 5.2. Suppose $f(x,y) \in E_f[x,y]$ is a cubic form with discriminant $\Delta$ and suppose $E_f \subseteq K \subseteq \mathbb{C}$.
(1) If $f$ is a cube, then $L_{E_f}(f) = 1$ and $\mathcal{C}(f) = \{1\}$.
(2) If $f$ has a repeated linear factor, but is not a cube, then $L_K(f) = 3$ and $\mathcal{C}(f) = \{3\}$.
(3) If $f$ does not have a repeated factor, then $L_K(f) = 2$ if $\sqrt{-3\Delta} \in K$ and $L_K(f) = 3$ otherwise, so either $\mathcal{C}(f) = \{2\}$ or $\mathcal{C}(f) = \{2, 3\}$.

Proof. The first case follows from Theorem 4.1. In the second case, after an invertible linear change of variables, we may assume that $f(x,y) = 3x^2y$, and apply Theorem 2.1 to test for representations of length 2. But

$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies c_0 = c_1 = 0, \tag{5.1}$$

so $h$ has repeated factors. Hence $L_K(x^2y) \ge 3$ and by Theorem 4.10, $L_K(x^2y) = 3$.

Finally, suppose

$$f(x,y) = a_0 x^3 + 3a_1 x^2 y + 3a_2 x y^2 + a_3 y^3 = \prod_{j=1}^{3} (r_j x + s_j y)$$

does not have repeated factors, so that $0 \ne \Delta(f) = \prod_{j}$

0, so p − f ) / ∈ R and L R ( f ) = 3. If f is real and has one real and two conjugatecomplex linear factors, then ∆( f ) <

0, so L R ( f ) = 2. Counting repeated roots, wesee that if f is a real cubic, and not a cube, then L R ( f ) = 3 if and only if it has threereal factors, thus proving Conjecture 4.12 when d = 3. Example . We ﬁnd all representations of 3 x y of length 3. Note that H ( f ) · ( c , c , c , c ) t = (0) ⇐⇒ c = 0 ⇐⇒ h ( x, y ) = c x + c xy + c y . If c = 0, then y | h , which is to be avoided, so we scale and assume c = 1. Wecan parameterize the Sylvester forms h ( x, y ) = ( x − ay )( x − by )( x + ( a + b ) y ) with a, b, − ( a + b ) distinct. This leads to an easily checked general formula(5.2) 3( a − b )( a + 2 b )(2 a + b ) x y =( a + 2 b )( ax + y ) − (2 a + b )( bx + y ) + ( a − b )( − ( a + b ) x + y ) . It is not hard to ﬁnd analogues of (5.2) for d >

3; we leave this to the reader.
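The "easily checked" identity (5.2) can be confirmed mechanically. The following is a minimal sketch, not part of the original paper: it expands both sides of (5.2) in exact rational arithmetic and compares coefficients on the basis $x^3, x^2y, xy^2, y^3$. The helper names `cube_coeffs`, `rhs_5_2` and `lhs_5_2` are illustrative assumptions.

```python
from fractions import Fraction


def cube_coeffs(p, q):
    """Coefficients of (p*x + q*y)**3 on the basis x^3, x^2 y, x y^2, y^3."""
    return [p**3, 3 * p**2 * q, 3 * p * q**2, q**3]


def rhs_5_2(a, b):
    """Right-hand side of (5.2), returned as a coefficient list."""
    terms = [
        (a + 2 * b, cube_coeffs(a, 1)),
        (-(2 * a + b), cube_coeffs(b, 1)),
        (a - b, cube_coeffs(-(a + b), 1)),
    ]
    return [sum(weight * coeffs[i] for weight, coeffs in terms) for i in range(4)]


def lhs_5_2(a, b):
    """Left-hand side of (5.2): a scalar multiple of x^2 y."""
    return [0, 3 * (a - b) * (a + 2 * b) * (2 * a + b), 0, 0]


# Spot-check a few rational parameter pairs (a, b).
for a, b in [(Fraction(1), Fraction(2)),
             (Fraction(-3, 7), Fraction(5, 2)),
             (Fraction(4), Fraction(-1, 3))]:
    assert rhs_5_2(a, b) == lhs_5_2(a, b)
```

The scalar on the left of (5.2) vanishes exactly when two of $a$, $b$, $-(a+b)$ coincide, which matches the requirement that the three linear factors of the Sylvester form be distinct.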

Theorem 5.3. If $f$ is a real quartic form, then $L_{\mathbb{R}}(f) = 4$ if and only if $f$ is a product of four linear factors.

Proof. Factor $\pm f$ as a product of $k$ positive definite quadratic forms and $4 - 2k$ linear forms. If $k = 0$, then Corollary 4.11 implies that $L_{\mathbb{R}}(f) = 4$. We must show that if $k = 1$ or $k = 2$, then $f$ has a representation over $\mathbb{R}$ as a sum of $\le 3$ fourth powers.

If $k = 2$, then $f$ is positive definite and by [38, Thm.6], after an invertible linear change of variables, $f(x,y) = x^4 + 6\lambda x^2y^2 + y^4$, with $6\lambda \in (-2, 2]$. If $r \neq 1$, then

$$(rx+y)^4 + (x+ry)^4 - (r^3+r)(x+y)^4 = (r-1)^2(r^2+r+1)\left(x^4 - \frac{6r}{r^2+r+1}\,x^2y^2 + y^4\right). \tag{5.3}$$

Let $\phi(r) = -\frac{6r}{r^2+r+1}$. Then $\phi(-2+\sqrt{3}) = 2$ and $\phi(1) = -2$, and since $\phi$ is continuous, it maps $[-2+\sqrt{3}, 1)$ onto $(-2, 2]$, so $L_{\mathbb{R}}(f) \le 3$.

If $k = 1$, there are two cases, depending on whether the linear factors are distinct. Suppose that after a linear change, $f(x,y) = x^2h(x,y)$, where $h$ is positive definite, and so for some $\lambda > 0$ and linear form $\ell$, $h(x,y) = \lambda x^2 + \ell^2$. After another linear change,

$$f(x,y) = x^2(2x^2 + 12y^2) = (x+y)^4 + (x-y)^4 - 2y^4, \tag{5.4}$$

and (5.4) shows that $L_{\mathbb{R}}(f) \le 3$. If the linear factors are distinct, then after a linear change, $f(x,y) = xy(ax^2 + 2bxy + cy^2)$ where $a > 0$, $c > 0$, $b^2 < ac$. After a scaling, $f(x,y) = xy(x^2 + dxy + y^2)$, $|d| < 2$, and by taking $\pm f(x, \pm y)$, we may assume $d \in [0, 2)$. If $r \neq 1$, then

$$(r^4+1)(x+y)^4 - (rx+y)^4 - (x+ry)^4 = 4(r-1)^2(r^2+r+1)\left(x^3y + \frac{3(r+1)^2}{2(r^2+r+1)}\,x^2y^2 + xy^3\right). \tag{5.5}$$

Let $\psi(r) = \frac{3(r+1)^2}{2(r^2+r+1)}$. Since $\psi(-1) = 0$, $\psi(1) = 2$ and $\psi$ is continuous, it maps $[-1, 1)$ onto $[0, 2)$, and $L_{\mathbb{R}}(f) \le 3$. $\square$
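The three quartic identities used in this proof can be checked coefficient by coefficient. The sketch below is an illustration rather than anything from the paper: it works in exact rational arithmetic on the basis $x^4, x^3y, x^2y^2, xy^3, y^4$, and the function names (`fourth_power`, `combine`, `check_5_3`, `check_5_4`, `check_5_5`) are hypothetical.

```python
from fractions import Fraction
from math import comb


def fourth_power(p, q):
    """Coefficients of (p*x + q*y)**4 on x^4, x^3 y, x^2 y^2, x y^3, y^4."""
    return [comb(4, i) * p ** (4 - i) * q**i for i in range(5)]


def combine(terms):
    """Linear combination of coefficient lists, given as (weight, coeffs) pairs."""
    return [sum(weight * coeffs[i] for weight, coeffs in terms) for i in range(5)]


def check_5_3(r):
    """Identity (5.3): the combination equals scale * (x^4 + phi(r) x^2 y^2 + y^4)."""
    lhs = combine([(1, fourth_power(r, 1)),
                   (1, fourth_power(1, r)),
                   (-(r**3 + r), fourth_power(1, 1))])
    scale = (r - 1) ** 2 * (r**2 + r + 1)
    phi = -6 * r / (r**2 + r + 1)
    return lhs == [scale, 0, scale * phi, 0, scale]


def check_5_4():
    """Identity (5.4): (x+y)^4 + (x-y)^4 - 2y^4 = 2x^4 + 12x^2y^2."""
    lhs = combine([(1, fourth_power(1, 1)),
                   (1, fourth_power(1, -1)),
                   (-2, fourth_power(0, 1))])
    return lhs == [2, 0, 12, 0, 0]


def check_5_5(r):
    """Identity (5.5): the combination equals scale * (x^3y + psi(r) x^2y^2 + xy^3)."""
    lhs = combine([(r**4 + 1, fourth_power(1, 1)),
                   (-1, fourth_power(r, 1)),
                   (-1, fourth_power(1, r))])
    scale = 4 * (r - 1) ** 2 * (r**2 + r + 1)
    psi = 3 * (r + 1) ** 2 / (2 * (r**2 + r + 1))
    return lhs == [0, scale, scale * psi, scale, 0]


for r in [Fraction(-1, 2), Fraction(2, 3), Fraction(5)]:
    assert check_5_3(r) and check_5_5(r)
assert check_5_4()
```

Passing `Fraction` values for `r` keeps every comparison exact; a floating-point `r` would need a tolerance instead of `==`.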

The next result must be ancient; $L_{\mathbb{C}}(x^{d-1}y) = d$ seems well known, but we have not found a suitable reference for the converse. Landsberg and Teitler [29, Cor.4.5] show that $L_{\mathbb{C}}(x^ay^b) = \max(a+1, b+1)$ if $a, b \ge 1$.

Theorem 5.4. If $d \ge 3$, then $L_{\mathbb{C}}(f) = d$ if and only if there are two distinct linear forms $\ell$ and $\ell'$ so that $f = \ell^{d-1}\ell'$.

Proof. If $f = \ell^{d-1}\ell'$, then after an invertible linear change, we may assume that $f(x,y) = dx^{d-1}y$. If $L_{\mathbb{C}}(dx^{d-1}y) \le d-1$, then $f$ would have a Sylvester form of degree $d-1$. But then, as in (5.1), (2.4) becomes

$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{d-1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies c_0 = c_1 = 0,$$

so $h$ does not have distinct factors. Thus, $L_{\mathbb{C}}(dx^{d-1}y) = d$.

Conversely, suppose $L_{\mathbb{C}}(f) = d$. Factor $f = \prod \ell_j^{m_j}$ as a product of pairwise distinct linear forms, with $\sum m_j = d$, $m_1 \ge m_2 \ge \cdots \ge m_s \ge 1$, and $s > 1$. (Otherwise, $L_{\mathbb{C}}(f) = 1$.) Make an invertible linear change taking $(\ell_1, \ell_2) \mapsto (x, y)$, and call the new form $g$; $L_{\mathbb{C}}(g) = d$ as well. If $g(x,y) = \sum_{\ell=0}^{d} \binom{d}{\ell} b_\ell x^{d-\ell}y^\ell$, then $b_0 = b_d = 0$. By hypothesis, there does not exist a Sylvester form of degree $d-1$ for $g$. Consider

$$\begin{pmatrix} 0 & b_1 & \cdots & b_{d-2} & b_{d-1} \\ b_1 & b_2 & \cdots & b_{d-1} & 0 \end{pmatrix} \cdot \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{d-1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

If $m_1 \ge m_2 \ge 2$, then $x^2, y^2 \mid g(x,y)$ and $b_1 = b_{d-1} = 0$, so $x^{d-1} - y^{d-1}$ is a Sylvester form of degree $d-1$ for $g$. Thus $m_2 = 1$, and so $y^2$ does not divide $g$ and $b_1 \neq 0$. Let $q(t) = \sum_{i=0}^{d-2} b_{i+1}t^i$ (note the absence of binomial coefficients!) and suppose $q(t_0) = 0$. Since $q(0) = b_1 \neq 0$, $t_0 \neq 0$. We have

$$\begin{pmatrix} 0 & b_1 & \cdots & b_{d-2} & b_{d-1} \\ b_1 & b_2 & \cdots & b_{d-1} & 0 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ t_0 \\ \vdots \\ t_0^{d-1} \end{pmatrix} = \begin{pmatrix} t_0\,q(t_0) \\ q(t_0) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Since

$$h(x,y) = \sum_{i=0}^{d-1} t_0^i x^{d-1-i}y^i = \frac{x^d - t_0^dy^d}{x - t_0y} = \prod_{k=1}^{d-1} (x - \zeta_d^k t_0 y)$$

has distinct linear factors, it is a Sylvester form for $g$, and $L_{\mathbb{C}}(g) \le d-1$. This contradiction implies that $q$ has no zeros, and by the Fundamental Theorem of Algebra, $q(t) = b_1$ must be a constant. It follows that $g(x,y) = db_1x^{d-1}y$, as promised. $\square$

By Corollaries 4.4 and 5.1, instances of the first five cabinets in Corollary 5.1(5) are: $x^4$, $x^4 + y^4$, $x^4 + y^4 + (x+y)^4$, $x^3y$ and $(x+iy)^4 + (x-iy)^4$. It will follow from the next results that $\mathcal{C}((x^2+y^2)^2) = \{3, 4\}$.

Theorem 5.5. If $d = 2k$ and $f(x,y) = \binom{2k}{k}x^ky^k$, then $L_{\mathbb{C}}(f) = k+1$. The minimal $\mathbb{C}$-representations of $f$ are given by

$$(k+1)\binom{2k}{k}x^ky^k = \sum_{j=0}^{k} \left(\zeta_{2k+2}^{j}\,wx + \zeta_{2k+2}^{-j}\,w^{-1}y\right)^{2k}, \qquad 0 \neq w \in \mathbb{C}. \tag{5.6}$$

Proof. We first evaluate the right-hand side of (5.6) by expanding the $2k$-th power:

$$\sum_{j=0}^{k} \left(\zeta_{2k+2}^{j}\,wx + \zeta_{2k+2}^{-j}\,w^{-1}y\right)^{2k} = \sum_{j=0}^{k}\sum_{t=0}^{2k} \binom{2k}{t} \zeta_{2k+2}^{j(2k-t)-jt}\, w^{(2k-t)-t}\, x^{2k-t}y^t = \sum_{t=0}^{2k} \binom{2k}{t} w^{2k-2t} x^{2k-t}y^t \left(\sum_{j=0}^{k} \zeta_{k+1}^{j(k-t)}\right). \tag{5.7}$$

But $\sum_{j=0}^{m-1} \zeta_m^{rj} = 0$ unless $m \mid r$, in which case it equals $m$. Since the only multiple of $k+1$ in the set $\{k-t : 0 \le t \le 2k\}$ occurs for $t = k$, (5.7) reduces to the left-hand side of (5.6).

We now show that these are all the minimal $\mathbb{C}$-representations of $f$. Since $H_k(x^ky^k)$ has 1's on the NE-SW diagonal, it is non-singular, so $L_{\mathbb{C}}(x^ky^k) > k$, and $L_{\mathbb{C}}(x^ky^k) = k+1$ by (5.6). By Corollary 4.3, any minimal $\mathbb{C}$-representation not given by (5.6) can only use powers of forms which are distinct from any $wx + w^{-1}y$. If $ab = c^2 \neq 0$, then $ax + by$ is a multiple of $\frac{a}{c}x + \frac{c}{a}y$. This leaves only $x^{2k}$ and $y^{2k}$, and there is no linear combination of these giving $x^ky^k$. $\square$

The representations in (5.6) arise because the null-vectors of $H_{k+1}(x^ky^k)$ can only be $(c_0, 0, \dots, 0, c_{k+1})^t$, and $c_0x^{k+1} + c_{k+1}y^{k+1}$ is a Sylvester form when $c_0c_{k+1} \neq 0$.

Corollary 5.6. For $k \ge 1$, $L_{\mathbb{C}}((x^2+y^2)^k) = k+1$, and $L_K((x^2+y^2)^k) = k+1$ iff $\tan\frac{\pi}{k+1} \in K$. The $\mathbb{C}$-minimal representations of $(x^2+y^2)^k$ are given by

$$\binom{2k}{k}(x^2+y^2)^k = \frac{2^{2k}}{k+1} \sum_{j=0}^{k} \left(\cos\!\left(\tfrac{j\pi}{k+1} + \theta\right)x + \sin\!\left(\tfrac{j\pi}{k+1} + \theta\right)y\right)^{2k}, \qquad \theta \in \mathbb{C}. \tag{5.8}$$

Proof. The invertible map $(x, y) \mapsto (x - iy, x + iy)$ takes $x^ky^k$ into $(x^2+y^2)^k$. Setting $0 \neq w = e^{i\theta}$ in (5.6) gives (5.8) after the usual reduction, since $e^{i\alpha}(x - iy) + e^{-i\alpha}(x + iy) = 2(\cos\alpha\, x + \sin\alpha\, y)$. If $\cos\alpha \neq 0$, then

$$(\cos\alpha\, x + \sin\alpha\, y)^{2r} = \cos^{2r}\alpha \cdot (x + \tan\alpha\, y)^{2r} = (1 + \tan^2\alpha)^{-r}(x + \tan\alpha\, y)^{2r}.$$

Thus, $(\cos\alpha\, x + \sin\alpha\, y)^{2r} \in K[x,y]$ iff $\cos\alpha = 0$ or $\tan\alpha \in K$. It follows that $L_K((x^2+y^2)^k) = k+1$ if and only if there exists $\theta \in \mathbb{C}$ so that for each $0 \le j \le k$, either $\cos(\frac{j\pi}{k+1} + \theta) = 0$ or $\tan(\frac{j\pi}{k+1} + \theta) \in K$. Since $\tan\alpha, \tan\beta \in K$ imply $\tan(\alpha - \beta) \in K$ and $k \ge 1$, we see that (5.8) is a representation over $K$ if and only if $\tan\frac{\pi}{k+1} \in K$. $\square$
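The root-of-unity cancellation behind (5.6), and the constant $2^{2k}/(k+1)$ in (5.8) as written here, can be spot-checked numerically. The sketch below is not from the paper and assumes floating-point tolerances are acceptable for a sanity check: `check_5_6` compares the coefficients of the right-hand side of (5.6) with $(k+1)\binom{2k}{k}x^ky^k$, and `check_5_8` evaluates both sides of (5.8) at a real point for real $\theta$. Both function names are hypothetical.

```python
import cmath
import math


def check_5_6(k, w, tol=1e-9):
    """Compare coefficients of the right side of (5.6) with (k+1)*C(2k,k)*x^k*y^k."""
    zeta = cmath.exp(2j * cmath.pi / (2 * k + 2))  # primitive (2k+2)-th root of unity
    for t in range(2 * k + 1):
        # Coefficient of x^(2k-t) y^t in sum_j (zeta^j w x + zeta^-j w^-1 y)^(2k).
        inner = sum((zeta**j * w) ** (2 * k - t) * (zeta ** (-j) / w) ** t
                    for j in range(k + 1))
        coeff = math.comb(2 * k, t) * inner
        want = (k + 1) * math.comb(2 * k, k) if t == k else 0
        if abs(coeff - want) > tol * max(1.0, abs(want)):
            return False
    return True


def check_5_8(k, theta, x, y, rel_tol=1e-9):
    """Evaluate both sides of (5.8) at a real point (x, y) for real theta."""
    lhs = math.comb(2 * k, k) * (x * x + y * y) ** k
    total = sum(
        (math.cos(j * math.pi / (k + 1) + theta) * x
         + math.sin(j * math.pi / (k + 1) + theta) * y) ** (2 * k)
        for j in range(k + 1)
    )
    rhs = 2 ** (2 * k) / (k + 1) * total
    return math.isclose(lhs, rhs, rel_tol=rel_tol)


assert check_5_6(3, 1.3 - 0.4j)
assert check_5_8(7, 0.0, 0.7, -1.2)
assert check_5_8(2, 0.5, 1.1, 0.3)
```

A numerical check of this kind only tests the identities at sample parameters; it says nothing about minimality of the representations, which is what the proof of Theorem 5.5 establishes.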

In particular, since $\tan\frac{\pi}{3} = \sqrt{3} \notin \mathbb{Q}$, $L_{\mathbb{Q}}((x^2+y^2)^2) > 3$ and $\mathcal{C}((x^2+y^2)^2) = \{3, 4\}$, as promised. Since $\tan\frac{\pi}{m}$ is irrational for $m \ge 5$, $L_{\mathbb{Q}}((x^2+y^2)^k) = k+1$ only for $k = 1, 3$.

The form $x^ky^k$ is a highly singular complex form, as is $(x^2+y^2)^k$. However, as a real form, $(x^2+y^2)^k$ is in some sense at the center of the cone $Q_{2,2k}$. For real $\theta$, the formula in (5.8) goes back at least to Friedman [16] in 1957. It was shown in [42] that all minimal real representations of $(x^2+y^2)^k$ have this shape. There is an equivalence between representations of $(x^2+y^2)^k$ as a real sum of $2k$-th powers and quadrature formulas on the circle; see [42]. In this sense, (5.8) can be traced back to Mehler [30] in 1864. Taking $k = 7$, $\theta = 0$ and $\rho := \tan\frac{\pi}{8} = \sqrt{2} - 1$ in (5.8) gives

$$\binom{14}{7}(x^2+y^2)^7 = 2^{11}\left[x^{14} + y^{14} + \frac{1}{2^7}\left((x+y)^{14} + (x-y)^{14}\right) + \left(\frac{2+\sqrt{2}}{4}\right)^{7}\left((x+\rho y)^{14} + (x-\rho y)^{14} + (\rho x+y)^{14} + (\rho x-y)^{14}\right)\right].$$

A real representation (1.1) of $(\sum x_i^2)^k$ (with positive real coefficients $\lambda_j$) is called a Hilbert Identity; Hilbert [21, 15] used such representations with rational coefficients to solve Waring's problem. Hilbert Identities are deeply involved with quadrature problems on $S^{n-1}$, the Delsarte-Goethals-Seidel theory of spherical designs in combinatorics and embedding questions in Banach spaces [42, Ch.8,9], as well as explicit computations in Hilbert's 17th problem [43]. It can be shown that any such representation requires at least $\binom{n+k-1}{n-1}$ summands, and this bound also applies if negative coefficients $\lambda_j$ are allowed. It is not known whether allowing negative coefficients can reduce the total number of summands. When $(\sum x_i^2)^k$ is a sum of exactly $\binom{n+k-1}{n-1}$ $2k$-th powers, the coordinates of minimal representations can be used to produce tight spherical designs. Such representations exist when $n = 2$, $2k = 2$, $(n, 2k) = (3, 4)$, $(n, 2k) = (u^2-2, 4)$ ($u = 3, 5$), $(n, 2k) = (3v^2-4, 6)$ ($v = 2, 3$), $(n, 2k) = (24, 10)$ […] $(n, 2k) = (u^2-2, 4)$ for some odd integer $u \ge 7$, $(n, 2k) = (3v^2-4, 6)$ for some integer $v \ge 4$. These questions have been largely open for thirty years. It is also not known whether there exist $(k, n)$ so that $L_{\mathbb{R}}((\sum x_i^2)^k) > L_{\mathbb{C}}((\sum x_i^2)^k)$, although this cannot happen for $n = 2$. For that matter, it is not known whether there exists any $f \in Q_{n,d}$ so that $L_{\mathbb{R}}(f) > L_{\mathbb{C}}(f)$.

We conclude this section with a related question: if $f_\lambda(x,y) = x^4 + 6\lambda x^2y^2 + y^4$ for $\lambda \in \mathbb{Q}$, what is $L_{\mathbb{Q}}(f_\lambda)$? If $\lambda \le -\frac{1}{3}$, then $f_\lambda$ has four real factors, so $L_{\mathbb{Q}}(f_\lambda) = 4$. Since $\det H_2(f_\lambda) = \lambda - \lambda^3$, $L_{\mathbb{C}}(f_\lambda) = 2$ for $\lambda = 0, 1, -1$. The formula

$$x^4 + 6\lambda x^2y^2 + y^4 = \frac{\lambda}{2}\left((x+y)^4 + (x-y)^4\right) + (1-\lambda)(x^4 + y^4)$$

shows that $L_{\mathbb{Q}}(f_0) = L_{\mathbb{Q}}(f_1) = 2$; $2f_{-1}(x,y) = (x+iy)^4 + (x-iy)^4$ has $\mathbb{Q}$-length 4.

Theorem 5.7.

Suppose $\lambda = \frac{a}{b} \in \mathbb{Q}$, $\lambda^3 \neq \lambda$. Then $L_{\mathbb{Q}}(x^4 + 6\lambda x^2y^2 + y^4) = 3$ if and only if there exist integers $(m, n) \neq (0, 0)$ so that

$$\Gamma(a, b, m, n) = 4a^3b\,m^4 + (b^4 - 6a^2b^2 - 3a^4)\,m^2n^2 + 4a^3b\,n^4 \tag{5.9}$$

is a non-zero square.

Proof. By Corollary 2.2, such a representation occurs if and only if there is a cubic $h(x,y) = \sum_{i=0}^{3} c_ix^{3-i}y^i$ which splits over $\mathbb{Q}$ and satisfies

$$c_0 + \lambda c_2 = \lambda c_1 + c_3 = 0. \tag{5.10}$$

Assume that $h(x,y) = (mx + ny)g(x,y)$, $(m, n) \neq (0, 0)$ with $m, n \in \mathbb{Z}$. If $g(x,y) = rx^2 + sxy + ty^2$, then $c_0 = mr$, $c_1 = ms + nr$, $c_2 = mt + ns$, $c_3 = nt$ and (5.10) becomes

$$\begin{pmatrix} m & \lambda n & \lambda m \\ \lambda n & \lambda m & n \end{pmatrix} \cdot \begin{pmatrix} r \\ s \\ t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{5.11}$$

If $m = 0$, then the general solution to (5.11) is $(r, s, t) = (r, 0, -\lambda r)$ and $rx^2 - \lambda ry^2$ splits over $\mathbb{Q}$ into distinct factors iff $\lambda$ is a non-zero square; that is, iff $ab$ is a square, and similarly if $n = 0$. Otherwise, the system has full rank since $\lambda^2 \neq 1$, and any solution is a multiple of

$$rx^2 + sxy + ty^2 = (\lambda n^2 - \lambda^2m^2)x^2 + (\lambda^2 - 1)mn\,xy + (\lambda m^2 - \lambda^2n^2)y^2. \tag{5.12}$$

The quadratic in (5.12) splits over $\mathbb{Q}$ into distinct factors iff its discriminant

$$4\lambda^3m^4 + (1 - 6\lambda^2 - 3\lambda^4)m^2n^2 + 4\lambda^3n^4 = b^{-4}\,\Gamma(a, b, m, n) \tag{5.13}$$

is a non-zero square in $\mathbb{Q}$. $\square$

In particular, we have the following identities: $\Gamma(u^2, v^2, v, u) = (u^5v - uv^5)^2$ and $\Gamma(uv, u^2 - uv + v^2, 1, 1) = (u-v)^6(u+v)^2$, hence $L_{\mathbb{Q}}(f_\lambda) = 3$ for $\lambda = \tau^2$ and $\lambda = \frac{\tau}{\tau^2 - \tau + 1}$, where $\tau = \frac{u}{v} \in \mathbb{Q}$, $\tau \neq \pm 1$. These show that $L_{\mathbb{Q}}(f_\lambda) = 3$ for a dense set of rationals in $[-\frac{1}{3}, \infty)$. These families do not exhaust the possibilities. If $\lambda = \frac{38}{3}$, so $f_\lambda(x,y) = x^4 + 76x^2y^2 + y^4$, then $\lambda$ is expressible neither as $\tau^2$ nor as $\frac{\tau}{\tau^2 - \tau + 1}$ for $\tau \in \mathbb{Q}$, but $\Gamma(38, 3, 2, 19) = 276906^2$.

We mention two negative cases: if $\lambda = \frac{1}{3}$, then $\Gamma(1, 3, m, n) = 12(m^2 + n^2)^2$, which is never a square, giving another proof that $L_{\mathbb{Q}}((x^2+y^2)^2) = 4$. If $\lambda = \frac{1}{2}$, then

$$\Gamma(1, 2, m, n) = 8m^4 - 11m^2n^2 + 8n^4 = \tfrac{27}{4}(m^2 - n^2)^2 + \tfrac{5}{4}(m^2 + n^2)^2,$$

hence if $L_{\mathbb{Q}}(x^4 + 3x^2y^2 + y^4) = 3$, then there is a solution to the Diophantine equation $27X^2 + 5Y^2 = Z^2$. A simple descent shows that this has no non-zero solutions: working mod 5, we see that $2X^2 \equiv Z^2$; since 2 is not a quadratic residue mod 5, it follows that $5 \mid X, Z$, and these imply that $5 \mid Y$ as well.

Solutions of the Diophantine equation $Am^4 + Bm^2n^2 + Cn^4 = r^2$ were first studied by Euler; see [11][pp.634-639] and [33][pp.16-29] for more on this topic. This equation has not yet been completely solved; see [3, 9]. We hope to return to the analysis of (5.9) in a future publication.

Open Questions

Conjecture 4.12 seems plausible, but as the degree increases, the canonical forms become increasingly involved. Are there other fields besides $\mathbb{C}$ (and possibly $\mathbb{R}$) for which there is a simple description of $\{f : L_K(f) = \deg f\}$?

Which cabinets are possible? Are there other restrictions beyond Corollary 5.1(1)? How many different lengths are possible? If $|\mathcal{C}(f)| \ge 4$, then $d \ge 7$.

Can $f$ have more than one, but a finite number, of $K$-minimal representations, where $K$ is not necessarily equal to $E_f$? Theorem 5.7 might be a way to find such examples.

Length is generic over $\mathbb{C}$, but not over $\mathbb{R}$. For $d = 2r$, the $\mathbb{R}$-length of a real form is always $2r$ in a small neighborhood of $\prod_{j=1}^{d}(x - jy)$, but the $\mathbb{R}$-length is always $r+1$ in a small neighborhood of $(x^2+y^2)^r$, by [42]. Which combinations of degrees and lengths have interior? Does the parity of $d$ matter?

References

[1] J. Alexander and A. Hirschowitz,

Polynomial interpolation in several variables, J. Algebraic Geom. (1995), 201-222, MR1311347 (96f:14065).
[2] M. Brambilla and G. Ottaviani, On the Alexander-Hirschowitz theorem, J. Pure Appl. Algebra (2008), 1229-1251, arxiv–math/0701409, MR2387598 (2008m:14104).
[3] E. Brown, $x^4 + dx^2y^2 + y^4 = z^2$: some cases with only trivial solutions - and a solution Euler missed, Glasgow Math. J. 31 (1989), 297–307, MR1021805 (91d:11026).
[4] E. Carlini, Varieties of simultaneous sums of power for binary forms, Matematiche (Catania) (2002), 83–97, arxiv–math.AG/0202050, MR2075735 (2005d:11058).
[5] E. Carlini and J. Chipalkatti, On Waring's problem for several algebraic forms, Comment. Math. Helv. (2003), 494–517, arxiv–math.AG/0112110, MR1998391 (2005b:14097).
[6] M. D. Choi, Z. D. Dai, T. Y. Lam and B. Reznick, The Pythagoras number of some affine algebras and local algebras, J. Reine Angew. Math. (1982), 45–82, MR0671321 (84f:12012).
[7] M. D. Choi, T. Y. Lam, A. Prestel and B. Reznick, Sums of $m$th powers of rational functions in one variable over real closed fields, Math. Z. (1996), 93–112, MR1369464 (96k:12003).
[8] M. D. Choi, T. Y. Lam and B. Reznick, Sums of squares of real polynomials, K-theory and algebraic geometry: connections with quadratic forms and division algebras (Santa Barbara, CA, 1992), 103–126, Proc. Sympos. Pure Math., Part 2, Amer. Math. Soc., Providence, RI, 1995, MR1327293 (96f:11058).
[9] J. H. E. Cohn, On the Diophantine equation $z^2 = x^4 + Dx^2y^2 + y^4$, Glasgow Math. J. (1994), 283–285, MR1295501 (95k:11035).
[10] P. Comon and B. Mourrain, Decomposition of quantics in sums of powers of linear forms, Signal Processing (1996), 93-107.
[11] L. E. Dickson, History of the Theory of Numbers, vol II: Diophantine Analysis, Carnegie Institute, Washington 1920, reprinted by Chelsea, New York, 1971, MR0245500 (39 […]
[12] […] Polar covariants of plane cubics and quartics, Adv. Math. (1993), 216–301, MR1213725 (94g:14029).
[13] R. Ehrenborg and G.-C. Rota, Apolarity and canonical forms for homogeneous polynomials, European J. Combin. (1993), 157–181, MR1215329 (94e:15062).
[14] W. J. Ellison, A 'Waring's problem' for homogeneous forms, Proc. Cambridge Philos. Soc. (1969), 663-672, MR0237450 (38 […]
[15] W. J. Ellison, Waring's problem, Amer. Math. Monthly (1971), 10–36, MR0414510 (54 […]
[16] […] Mean-values and polyharmonic polynomials, Michigan Math. J. (1957), 67–74, MR0084045 (18,799b).
[17] A. Geramita, Inverse systems of fat points: Waring's problem, secant varieties of Veronese varieties and parameter spaces for Gorenstein ideals, The Curves Seminar at Queen's, Vol. X (Kingston, ON, 1995), 2–114, Queen's Papers in Pure and Appl. Math., Queen's Univ., Kingston, ON, 1996, MR1381732 (97h:13012).
[18] S. Gundelfinger, Zur Theorie der binären Formen, J. Reine Angew. Math. (1886), 413–424.
[19] J. Harris, Algebraic geometry. A first course, Graduate Texts in Mathematics, Springer-Verlag, New York, 1992, MR1182558 (93j:14001).
[20] U. Helmke, Waring's problem for binary forms, J. Pure Appl. Algebra (1992), 29–45, MR1167385 (93e:11057).
[21] D. Hilbert, Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl $n$-ter Potenzen (Waringsches Problem), Math. Ann. (1909), 281–300, Ges. Abh. 1, 510–527, Springer, Berlin, 1932, reprinted by Chelsea, New York, 1981.
[22] P. Holgate, Studies in the history of probability and statistics. XLI. Waring and Sylvester on random algebraic equations, Biometrika (1986), 228–231, MR0836453 (87m:01026).
[23] A. Iarrobino and V. Kanev, Power Sums, Gorenstein algebras, and determinantal loci, Lecture Notes in Mathematics (1999), MR1735271 (2001d:14056).
[24] S. Karlin, Total Positivity, vol. 1, Stanford University Press, Stanford, 1968, MR0230102 (37 […]
[25] […] Gundelfinger's theorem on binary forms, Stud. Appl. Math. (1986), 163–169, MR0859177 (87m:11020).
[26] J. P. S. Kung, Canonical forms for binary forms of even degree, in Invariant theory, Lecture Notes in Mathematics, 52–61, Springer, Berlin, 1987, MR0924165 (89h:15037).
[27] J. P. S. Kung, Canonical forms of binary forms: variations on a theme of Sylvester, in Invariant theory and tableaux (Minnesota, MN, 1988), 46–58, IMA Vol. Math. Appl., Springer, New York, 1990, MR1035488 (91b:11046).
[28] J. P. S. Kung and G.-C. Rota, The invariant theory of binary forms, Bull. Amer. Math. Soc. (N. S.) (1984), 27–85, MR0722856 (85g:05002).
[29] J. M. Landsberg and Z. Teitler, On the ranks and border ranks of symmetric tensors, Found. Comp. Math. (2010), 339-366, arXiv:0901.0487.
[30] G. Mehler, Bemerkungen zur Theorie der mechanischen Quadraturen, J. Reine Angew. Math. (1864), 152-157.
[31] R. Miranda, Linear systems of plane curves, Notices Amer. Math. Soc. (1999), 192–202, MR1673756 (99m:14012).
[32] L. J. Mordell, Binary cubic forms expressed as a sum of seven cubes of linear forms, J. London. Math. Soc. (1967), 646-651, MR0249355 (40 […]
[33] […] Diophantine equations, Academic Press, London-New York 1969, MR0249355 (40 […]
[34] […] Waring's problem for the ring of polynomials, J. Number Theory (1979), 477-487, MR0544895 (80m:10016).
[35] I. Niven, Irrational numbers, Carus Mathematical Monographs, No. 11. Math. Assoc. Amer., New York, 1956, MR0080123 (18,195c).
[36] G. Pólya and I. J. Schoenberg, Remarks on de la Vallée Poussin means and convex conformal maps of the circle, Pacific J. Math. (1958), 295–234, MR0100753 (20 […]
[37] […] Problems and theorems in analysis, II, Springer-Verlag, New York-Heidelberg 1976, MR0465631 (57 […]
[38] V. Powers and B. Reznick, Notes towards a constructive proof of Hilbert's theorem on ternary quartics, Quadratic forms and their applications (Dublin, 1999), 209–227, Contemp. Math., Amer. Math. Soc., Providence, RI, 2000, MR1803369 (2001h:11049).
[39] K. Ranestad and F.-O. Schreyer, Varieties of sums of powers, J. Reine Angew. Math. (2000), 147–181, MR1780430 (2001m:14009).
[40] B. Reichstein, On expressing a cubic form as a sum of cubes of linear forms, Linear Algebra Appl. (1987), 91–122, MR0870934 (88e:11022).
[41] B. Reichstein, On Waring's problem for cubic forms, Linear Algebra Appl. (1992), 1–61, MR1137842 (93b:11048).
[42] B. Reznick, Sums of even powers of real linear forms, Mem. Amer. Math. Soc. (1992), no. 463, MR1096187 (93h:11043).
[43] B. Reznick, Uniform denominators in Hilbert's Seventeenth Problem, Math. Z. (1995), 75-97, MR1347159 (96e:11056).
[44] B. Reznick, Homogeneous polynomial solutions to constant coefficient PDE's, Adv. Math. (1996), 179-192, MR1371648 (97a:12006).
[45] B. Reznick, Laws of inertia in higher degree binary forms, Proc. Amer. Math. Soc. (2010), 815-826, arXiv – 0906.5559, MR26566547.
[46] B. Reznick, Blenders, in preparation.
[47] G. Salmon, Lesson introductory to the modern higher algebra, fifth edition, Chelsea, New York, 1964.
[48] J. J. Sylvester, An Essay on Canonical Forms, Supplement to a Sketch of a Memoir on Elimination, Transformation and Canonical Forms, originally published by George Bell, Fleet Street, London, 1851; Paper 34 in Mathematical Papers, Vol. 1, Chelsea, New York, 1973. Originally published by Cambridge University Press in 1904.
[49] J. J. Sylvester, On a remarkable discovery in the theory of canonical forms and of hyperdeterminants, originally in Philosophical Magazine, vol. 2, 1851; Paper 42 in Mathematical Papers, Vol. 1, Chelsea, New York, 1973. Originally published by Cambridge University Press in 1904.
[50] J. J. Sylvester, On an elementary proof and demonstration of Sir Isaac Newton's hitherto undemonstrated rule for the discovery of imaginary roots, Proc. Lond. Math. Soc. (1865/1866), 1–16; Paper 84 in Mathematical Papers, Vol. 2, Chelsea, New York, 1973. Originally published by Cambridge University Press in 1908.
[51] J.-C. Yakoubsohn, On Newton's rule and Sylvester's theorems, J. Pure Appl. Algebra (1990), 293–309, MR1072286 (91j:12002).

Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801
