A nearly optimal algorithm to decompose binary forms
Matías Bender, Jean-Charles Faugère, Ludovic Perret, Elias Tsigaridas
AA nearly optimal algorithm to decompose binary forms
Mat´ıas R. Bender
Sorbonne Universit´e,
CNRS , INRIA , Laboratoire d’Informatique de Paris 6,
LIP6 , ´Equipe P OL S YS , 4 place Jussieu,F-75005, Paris, France Jean-Charles Faug`ere
Sorbonne Universit´e,
CNRS , INRIA , Laboratoire d’Informatique de Paris 6,
LIP6 , ´Equipe P OL S YS , 4 place Jussieu,F-75005, Paris, France Ludovic Perret
Sorbonne Universit´e,
CNRS , INRIA , Laboratoire d’Informatique de Paris 6,
LIP6 , ´Equipe P OL S YS , 4 place Jussieu,F-75005, Paris, France Elias Tsigaridas
Sorbonne Universit´e,
CNRS , INRIA , Laboratoire d’Informatique de Paris 6,
LIP6 , ´Equipe P OL S YS , 4 place Jussieu,F-75005, Paris, France Abstract
Symmetric tensor decomposition is an important problem with applications in several areas,for example signal processing, statistics, data analysis and computational neuroscience. It isequivalent to Waring’s problem for homogeneous polynomials, that is to write a homogeneouspolynomial in n variables of degree D as a sum of D -th powers of linear forms, using the min-imal number of summands. This minimal number is called the rank of the polynomial/tensor.We focus on decomposing binary forms, a problem that corresponds to the decomposition ofsymmetric tensors of dimension 2 and order D , that is, symmetric tensors of order D over thevector space K . Under this formulation, the problem finds its roots in invariant theory wherethe decompositions are related to canonical forms.We introduce a superfast algorithm that exploits results from structured linear algebra . Itachieves a softly linear arithmetic complexity bound. To the best of our knowledge, the previ-ously known algorithms have at least quadratic complexity bounds. Our algorithm computes asymbolic decomposition in O ( M ( D ) log ( D )) arithmetic operations, where M ( D ) is the complex-ity of multiplying two polynomials of degree D . It is deterministic when the decomposition isunique. When the decomposition is not unique, it is randomized. We also present a Monte Carlo1 a r X i v : . [ c s . S C ] S e p ariant as well as a modification to make it a Las Vegas one.From the symbolic decomposition, we approximate the terms of the decomposition with anerror of 2 − ε , in O (cid:0) D log ( D ) (cid:0) log ( D ) + log ( ε ) (cid:1)(cid:1) arithmetic operations. We use results fromKaltofen and Yagati (1989) to bound the size of the representation of the coefficients involved inthe decomposition and we bound the algebraic degree of the problem by min ( rank , D − rank + ) .We show that this bound can be tight. When the input polynomial has integer coefficients, ouralgorithm performs, up to poly-logarithmic factors, (cid:101) O B ( D (cid:96) + D + D τ ) bit operations, where τ is the maximum bitsize of the coefficients and 2 − (cid:96) is the relative error of the terms in thedecomposition. Keywords:
Decomposition of binary forms; Tensor decomposition; Symmetric tensor;Symmetric tensor rank; Polynomial Waring’s problem; Waring rank; Hankel matrix; Algebraicdegree; Canonical form;
1. Introduction
The problem of decomposing a symmetric tensor consists in writing it as the sum of rank-1symmetric tensors, using the minimal number of summands. This minimal number is knownas the rank of the symmetric tensor . The symmetric tensors of rank-1 correspond to, roughlyspeaking, the D -th outer-product of a vector. The decomposition of symmetric tensor is a com-mon problem which appears in divers areas such as signal processing, statistics, data mining,computational neuroscience, computer vision, psychometrics, chemometrics, among others. Fora modern introduction to the theory of tensor, their decompositions and applications we refer toe.g., Comon (2014); Landsberg (2012).There is an equivalence between decomposing symmetric tensors and solving Waring’s prob-lem for homogeneous polynomials, e.g., Comon et al. (2008); Helmke (1992). Given a symmet-ric tensor of dimension n and order D , that is a symmetric tensor of order D over the vector space Email addresses: [email protected] (Mat´ıas R. Bender), [email protected] (Jean-Charles Faug`ere), [email protected] (Ludovic Perret), [email protected] (EliasTsigaridas)
URL: ∼ bender/ (Mat´ıas R. Bender), ∼ jcf/ (Jean-Charles Faug`ere), ∼ perret/ (Ludovic Perret), ∼ elias/ (Elias Tsigaridas) Some authors, e.g., Comon et al. (2008), refer to this number as the symmetric rank of the tensor. n , we can construct a homogeneous polynomial in n variables of degree D . We can identifythe symmetric tensors of rank-1 with the D -th power of linear forms. Hence, to decompose asymmetric tensor of order D is equivalent to write the corresponding polynomial as a sum of D -th powers of linear forms using the minimal numbers of summands. This minimal number isthe rank of the polynomial/tensor.Under this formulation, symmetric tensor decomposition dates back to the origin of mod-ern (linear) algebra as a part of Invariant Theory. In this setting, the decomposition of genericsymmetric tensors corresponds to canonical forms (Sylvester, 1904, 1851; Gundelfinger, 1887).Together with the theory of apolarity, this problem was of great importance because the de-compositions provide information about the behavior of the polynomials under linear change ofvariables (Kung and Rota, 1984). Binary Form Decomposition.
We study the decomposition of symmetric tensors of order D anddimension 2. In terms of homogeneous polynomials, we consider a binary form f ( x , y ) : = ∑ Di = (cid:0) Di (cid:1) a i x i y D − i , (1)where a i ∈ K ⊂ C and K is some field of characteristic zero. We want to compute a decomposi-tion f ( x , y ) = ∑ rj = ( α j x + β j y ) D , (2)where α , . . . , α r , β , . . . , β r ∈ K , with K being the algebraic closure of K , and r is minimal. Wesay that a decomposition unique if, for all the decompositions, the set of points { ( α j , β j ) : 1 ≤ j ≤ r } ⊂ P ( K ) is unique, where P ( K ) is the projective space of K (Reznick, 2013a). Previous work.
The decomposition of binary forms, Equation (2), has been studied extensivelyfor K = C . More than one century ago Sylvester (1851, 1904) described the necessary andsufficient conditions for a decomposition to exist, see Section 2.1. He related the decompositionsto the kernel of Hankel matrices. For a modern approach of this topic, we refer to Kung andRota (1984); Kung (1990); Reznick (2013a); Iarrobino and Kanev (1999). Sylvester’s work wasextended to a more general kind of polynomial decompositions that we do not consider in thiswork, e.g., Gundelfinger (1887); Reznick (1996); Iarrobino and Kanev (1999).Sylvester’s results lead to an algorithm (Algorithm 1) to decompose binary forms (see Comonand Mourrain, 1996, Sec. 3.4.3). In the case where the binary form is of odd degree, then we3an compute the decompositions using Berlekamp-Massey algorithm (see D¨ur, 1989). Whenthe decomposition is unique, the Catalecticant algorithm, which also works for symmetric ten-sors of bigger dimension (Iarrobino and Kanev, 1999; Oeding and Ottaviani, 2013), improvesSylvester’s work. For an arbitrary binary form, Helmke (1992) presented a randomized algo-rithm based on Pad´e approximants and continued fractions, in which he also characterized thedifferent possible decompositions. Unfortunately, all these algorithms have complexity at leastquadratic in the degree of the binary form.Besides the problem of computing the decomposition(s) many authors considered the sub-problems of computing the rank and deciding whether there exists a unique decomposition, e.g.,Sylvester (1851, 1904); Helmke (1992); Comas and Seiguer (2011); Bernardi et al. (2011). Forexample, Sylvester (1851, 1904) considered generic binary forms, that is binary forms with co-efficients belonging to a dense algebraic open subset of K D + (Comon and Mourrain, 1996,Section 3), and proved that when the degree is 2 k or 2 k +
1, for k ∈ N , the rank is k + k or 2 k − r , is unique if and only if r ≤ k . With respect to the prob-lem of computing the rank there are different variants of algorithms, e.g., Comas and Seiguer(2011); Comon et al. (2008); Bernardi et al. (2011). Even though there are not explicit complex-ity estimates, by exploiting recent superfast algorithms for Hankel matrices (Pan, 2001), we candeduce a nearly-optimal arithmetic complexity bound for computing the rank using the approachof Comas and Seiguer (2011).For the general problem of symmetric tensor decomposition, Sylvester’s work was success-fully extended to cases in which the decomposition is unique, e.g., Brachat et al. (2010); Oedingand Ottaviani (2013). There are also homotopy techniques to solve the general problem, e.g., todecompose generic symmetric tensors (Bernardi et al., 2017) or, when there is a finite number ofpossible decompositions and we know at least one of them, to compute all the other decomposi-tions (Hauenstein et al., 2016). There are no complexity estimations for these methods. Besidestensor decomposition, there are other related decompositions for binary forms and univariatepolynomials that we do not consider in this work, e.g., Reznick (1996, 2013b); Giesbrecht et al.(2003); Giesbrecht and Roche (2010); Garc´ıa-Marco et al. (2017).4 ormulation of the problem. Instead of decomposing the binary form as in Equation (2), wecompute λ . . . λ r , α . . . α r , β . . . β r ∈ K , where r is minimal, such that, f ( x , y ) = ∑ rj = λ j ( α j x + β j y ) D . (3)Since every λ j belongs to the algebraic closure of the field K , the problems are equivalent.This approach allows us to control the algebraic degree (Bajaj, 1988; Nie et al., 2010) of theparameters λ j , α j , and β j in the decomposition (Section 4.1).Note that if the field is not algebraically closed and we force the parameters to belong to thebase field, that is λ j , α j , β j ∈ K , the decompositions induced by Equation (2) and Equation (3)are not equivalent. We do not consider the latter case and we refer to Helmke (1992); Reznick(1992); Comon et al. (2008); Boij et al. (2011); Blekherman (2015) for K = R , and to Reznick(1996, 2013a); Reznick and Tokcan (2017) for K ⊂ C . Main results.
We extend Sylvester’s algorithm to achieve a nearly-optimal complexity boundin the degree of the binary form. By considering structural properties of the Hankel matrices,we restrict the possible values for the rank of the decompositions and we identify when thedecomposition is unique. We build upon Helmke (1992) and we use the Extended EuclideanAlgorithm to deduce a better complexity estimate than what was previously known. Similarly toSylvester’s algorithm, our algorithm decomposes successfully any binary form, without makingany assumptions on the input.First, we focus on symbolic decompositions , that is a representation of the decomposition asa sum of a rational function evaluated at the roots of a univariate polynomial (Definition 36).We introduce an algorithm to compute a symbolic decomposition of a binary form of degree D in O ( M ( D ) log ( D )) , where M ( D ) is the arithmetic complexity of polynomial multiplication(Theorem 43). When the decomposition is unique, the algorithm is deterministic and this is aworst case bound. When the decomposition is not unique, our algorithm makes some randomchoices to fulfill certain genericity assumptions; thus the algorithm is a Monte Carlo one. How-ever, we can verify if the genericity assumptions hold within the same complexity bound, that is O ( M ( D ) log ( D )) , and hence we can also deduce a Las Vegas variant of the algorithm.Following the standard terminology used in structured matrices (Pan, 2001), our algorithmis superfast as its arithmetic complexity matches the size of the input up to poly-logarithmicfactors. The symbolic decomposition allow us to approximate the terms in a decomposition,5ith a relative error of 2 − ε , in O (cid:0) D log ( D ) (cid:0) log ( D ) + log ( ε ) (cid:1)(cid:1) arithmetic operations (Pan,2002; McNamee and Pan, 2013). Moreover, we can deduce for free the rank and the border rankof the tensor, see (Comas and Seiguer, 2011, Section 1).Using results from Kaltofen and Yagati (1989), we bound the algebraic degree of the decom-positions by min ( rank , D − rank + ) (Theorem 28). Moreover, we prove lower bounds for thealgebraic degree of the decomposition and we show that in certain cases the bound is tight (Sec-tion 4.1.3). For polynomials with integer coefficients, we bound the bit complexity, up to poly-logarithmic factors, by (cid:101) O B ( D (cid:96) + D + D τ ) , where τ is the maximum bitsize of the coefficientsof the input binary form and 2 − (cid:96) is the error of the terms in the decomposition (Theorem 45).This Boolean worst case bound holds independently of whether the decomposition is unique ornot.This work is an extension of the conference paper (Bender et al., 2016). With respect to theconference version, our main algorithm (Algorithm 3) omits an initial linear change of coordi-nates as we now rely on fewer genericity assumptions. In contrast with our previous algorithm,we present an algorithm which is deterministic when the decomposition is unique (Theorem 43).When the decomposition is not unique, our algorithm is still randomized but we present boundsfor the number of bad choices that it could make (Proposition 29). With respect to the algebraicdegree of the problem, we study the tightness of the bounds that we proposed in the conferencepaper (Theorem 27). We introduce explicit lower bounds showing that our bounds can be tight(Section 4.1.3). Organization of the paper.
First, we introduce the notation. In Section 2, we present the prelimi-naries that we need for introducing our algorithm. We present Sylvester’s algorithm (Section 2.1),we recall some properties of the structure of the kernel of the Hankel matrices (Section 2.2), weanalyze its relation to rational reconstructions of series/polynomials (Section 2.3), and we presentthe Extended Euclidean Algorithm (Section 2.4). Later, in Section 3, we present our main al-gorithm to decompose binary forms (Algorithm 3) and its proof of correctness (Section 3.3).This algorithm uses Algorithm 4 to compute the kernel of a family of Hankel matrices, whichwe consider in Section 3.1. Finally, in Section 4, we study the algebraic degree of the prob-lem (Section 4.1), we present tight bounds for it (Section 4.1.3), and we analyze the arithmetic(Section 4.2) and bit complexity of Algorithm 3 (Section 4.3).6 otation.
We denote by O , respectively O B , the arithmetic, respectively bit, complexity and weuse (cid:101) O , respectively (cid:101) O B , to ignore (poly-)logarithmic factors. M ( n ) is the arithmetic complexity ofmultiplying two polynomial of degree n . Let K be a zero characteristic subfield of C , and K itsalgebraic closure. If v = ( v , . . . , v n ) T then P v = P ( v ,..., v n ) : = ∑ ni = v i x i y n − i . Given a binary form f ( x , y ) , we denote by f ( x ) the univariate polynomial f ( x ) : = f ( x , ) . By f (cid:48) ( x ) we denote thederivative of f ( x ) with respect to x . For a matrix M , rk ( M ) is its rank and Ker ( M ) its kernel.
2. Preliminaries
Sylvester’s theorem (Theorem 2) relates the minimal decomposition of a binary form to thekernel of a Hankel matrix. Moreover, it implies an (incremental) algorithm for computing theminimal decomposition. The version that we present in Algorithm 1 comes from Comon andMourrain (1996, Section 3.2).
Definition 1.
Given a vector a = ( a , . . . , a D ) T , we denote by { H ka } ≤ k ≤ D the family of Hankelmatrices indexed by k, where H ka ∈ K ( D − k + ) × ( k + ) andH ka : = a a · · · a k − a k a a · · · a k a k + ... ... . . . ... ... a D − k − a D − k · · · a D − a D − a D − k a D − k + · · · a D − a D . (4)We may omit the index a in H ka when it is clear from the context. Theorem 2 (Sylvester, 1851) . Let f ( x , y ) = ∑ Di = (cid:0) Di (cid:1) a i x i y D − i with a i ∈ K ⊆ C . Also, consider anon-zero vector c = ( c , . . . , c r ) T ∈ K r + , such that the polynomialP c = ∑ ri = c i x i y r − i = ∏ rj = ( β j x − α j y ) is square-free and α j , β j ∈ K , for all ≤ j ≤ r. Then, there are λ , . . . λ r ∈ K such that we canwrite f ( x , y ) as f ( x , y ) = r ∑ j = λ j ( α j x + β j y ) D , if and only if ( c , . . . , c r ) T ∈ Ker ( H ra ) . lgorithm 1 I NCR D ECOMP (Comon and Mourrain, 1996, Figure 1)1. r : =
12. Get a random c ∈ Ker ( H r )
3. If P c is not square-free, r : = r + P c as ∏ rj = ( β j x − α j y )
5. Solve the transposed Vandermonde system: β D · · · β Dr β D − α · · · β D − r α r ... . . . ... α D · · · α Dr λ ... λ r = a ... a D . (5)6. Return ∑ rj = λ j ( α j x + β j y ) D For a proof of Theorem 2 we refer to Reznick (2013a, Theorem 2.1 & Corollary 2.2). Theo-rem 2 implies Algorithm 1. This algorithm will execute steps 2 and 3 as many times as the rank.At the i -th iteration it computes the kernel of H i . The dimension of this kernel is ≤ i and eachvector in the kernel has i + D by noticing that the rank ofthe binary form is either rk ( H (cid:100) D (cid:101) ) or D − rk ( H (cid:100) D (cid:101) ) + r , the dimension of thekernel of H r can still be as big as O ( D ) ; the same bound holds for the length of the vectors in thekernel. Hence, the complexity is lower bounded by O ( D ) .Our approach avoids the incremental construction. We exploit the structure of the kernel of8he Hankel matrices and we prove that the rank has only two possible values (Lemma 17), seealso (Comas and Seiguer, 2011, Section 3), (Helmke, 1992, Theorem B), or (Bernardi et al.,2011). Moreover, we use a compact representation of the vectors in the kernel. We describethem as a combination of two polynomials of degree O ( D ) . The Hankel matrices are among the most studied structured matrices (Pan, 2001). They arerelated to polynomial multiplication. We present results about the structure of their kernel. Fordetails, we refer to Heinig and Rost (1984, Chapter 5).
Proposition 3.
Matrix-vector multiplication of Hankel matrices is equivalent to polynomialmultiplication. Given two binary forms A : = ∑ Di = a i x i y D − i and U : = ∑ ki = u i x i y k − i , considerR : = ∑ D + ki = r i x i y D + k − i = A · U. If we choose the monomial basis { y D + k , . . . , x D + k } , then the equal-ity A · U = R is equivalent to Equation (6) , where the central submatrix of the left matrix isH k ( a ,..., a D ) (Definition 1). a a a . . . . . . ... a · · · a k − a k − a a · · · a k − a k a a · · · a k a k + ... ... . . . ... ... a D − k a D − k + · · · a D − a D a D − k + a D − k + · · · a D ... . . . . . . a D − a D a D u k ... u u = r r ... r k − r k r k + ... r D r D + ... r D + k − r D + k . (6)Consider a family of Hankel matrices { H ka } ≤ k ≤ D as in Definition 1. There is a formula forthe dimension of the kernel of each matrix in the family { H ka } ≤ k ≤ D that involves two numbers, N a and N a . To be more specific, the following holds:9 roposition 4. For any family of Hankel matrices { H ka } ≤ k ≤ D there are two constants, N a andN a , such that the following hold:1. ≤ N a ≤ N a ≤ D.2. For all k, ≤ k ≤ D, it holds dim ( Ker ( H ka )) = max ( k − N a ) + max ( k − N a ) .3. N a + N a = D. We may omit the index a in N a and N a when it is clear from the context. rk ( H k ) N + N N D D k (a) Rank of the Hankel matrices dim ( Ker ( H k )) N N DDN − N (b) Dimension of the kernel Figure 1:
Relation between N , N , and D An illustration of Proposition 4 appears in Figure 1. The dimension of the kernel and therank of the matrices are piece-wise-linear functions in k , the number of columns in the matrix.The graphs of the functions consist of three line segments, as we can see in the Figures 1a and1b. The dimension of the kernel is an increasing function of k . For k from 1 to N , the kernelof the matrix is trivial, so the rank increases with the number of columns. That is, the slope ofthe graph of the rank (Figure 1a) is 1, while the slope of the graph of the dimension of the kernel(Figure 1b) is 0. For k from N + N , the rank remains constant as for each column that weadd, the dimension of the kernel increases by one. Hence, the slope of the graph of the rank is 0and the slope of the graph of the dimension of the kernel is 1. For k from N + D , the rankdecreases because the dimension of the kernel increases by 2, and so the slope of the graph ofthe rank is −
1, while the slope of the graph of the dimension of the kernel is 2.10f N = N , then the graph of both functions degenerates to two line segments. Regarding thegraph of the rank, the first segment has slope 1 for k from 1 to N + − k from N + D . For the graph of the dimension of the kernel, the first segmenthas slope 0 from 1 to N +
1, and the second one has slope 2 from N to D .The elements of the kernel of the matrices in { H k } are related. To express this relation froma linear algebra point of view, we need to introduce the U-chains . Definition 5 (Heinig and Rost (1984, Definition 5.1)) . A U-chain of length k of a vector v =( v , . . . , v n ) T ∈ K n + is a set of vectors { U k v , U k v , . . . , U k − k v } ⊂ K n + k . The i-th element, for ≤ i ≤ k − , is U ik v = ( , . . . , (cid:124) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:123)(cid:122) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:125) i , n + (cid:122) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:125)(cid:124) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:123) v , . . . , v n , , . . . , (cid:124) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:123)(cid:122) (cid:32)(cid:32)(cid:32)(cid:32)(cid:32) (cid:125) k − − i ) , where U ik is an i-shifting matrix of dimension ( n + k ) × ( n + ) (Heinig and Rost, 1984, page 11). If v is not zero, then all the elements in a U-chain of v are linearly independent. The followingtheorem uses U-chains to relate the vectors in the kernels of a family of Hankel matrices. Proposition 6 (Vectors v and w) . Given a family of Hankel matrices { H k } ≤ k ≤ D , let N and N be the constants of Proposition 4. There are two vectors, v ∈ K N + and w ∈ K N + , such that • If ≤ k ≤ N , then Ker ( H k ) = { } . • If N < k ≤ N , then the U-chain of v of length ( k − N ) is a basis of Ker ( H k ) , that is Ker ( H k ) = (cid:104) U k − N v , . . . , U k − N − k − N v (cid:105) . • If N < k ≤ D, then the U-chain of v of length k − N together with the U-chain of w of lengthk − N is a basis of Ker ( H k ) , that is Ker ( H k ) = (cid:104) U k − N v , . . . , U k − N − k − N v , U k − N w , . . . , U k − N − k − N w (cid:105) . The vectors v and w of Proposition 6 are not unique. The vector v could be any vector in Ker ( H N + ) . The vector w could be any vector in Ker ( H N + ) that does not belong to the vectorspace generated by the U-chain of v of length N − N +
1. From now on, given a family ofHankel matrices, we refer to v and w as the vectors of Proposition 6.11et u be a vector in the kernel of H k and P u its corresponding polynomial (see Notation).We say that P u is a kernel polynomial . As P U jk v = x j y k − − j P v , we can write any kernel poly-nomial of a family of Hankel matrices as a combination of P v and P w (Heinig and Rost, 1984,Propositions 5.1 & 5.5). Moreover, P v and P w are relatively prime. Proposition 7.
Consider any family of Hankel matrices { H k } ≤ k ≤ D . The kernel polynomials P v and P w are relative prime. Moreover, for each k, the set of kernel polynomials of the matrix H k is as follows: • If < k ≤ N , then it is { } . • If N < k ≤ N , then it is { P µ · P v : µ ∈ K k − N } . • If N < k ≤ D, then it is { P µ · P v + P ρ · P w : µ ∈ K k − N , ρ ∈ K k − N } . Corollary 8.
Let ω ∈ Ker ( H N + ) such that P ω (cid:60) { P µ · P v : µ ∈ K N − N + } , then we can consider ω as the vector w from Proposition 6.2.3. Rational Reconstructions A rational reconstruction for a series, respectively a polynomial, consists in approximating theseries, respectively the polynomial, as the quotient of two polynomials. Rational reconstructionsare the backbone of many problems e.g., Pad´e approximants, Cauchy Approximations, LinearRecurrent Sequences, Hermite Interpolation. They are related to Hankel matrices. For an intro-duction to rational reconstructions, we refer to Bostan et al. (2017, Chapter 7) and referencestherein.
Definition 9.
Consider a : = ( a , . . . , a D ) T ∈ K D + and a polynomial A : = ∑ Di = a i x i ∈ K [ x ] . Givena pair of univariate polynomials ( U , R ) , we say that they are a rational reconstruction of Amodulo x D + if A · U ≡ R mod x D + . Such a reconstruction is not necessarily unique. Our interest emanates from the relationbetween the rational reconstructions of A modulo x D + and the kernels of the family of Hankelmatrices { H ka } k . 12 emma 10. Consider ω ∈ Ker ( H ka ) and ( r , . . . , r k − ) ∈ K k such that · · · a ... . . . . . . ... a · · · a k − ω ω ... ω k = r ... r k − . Then, ( P ω ( , x ) , ∑ k − i = r i x i ) is a rational reconstruction of A modulo x D + .Proof. Following Equation (6), if ω ∈ Ker ( H ka ) , then a . . . ... a · · · a k − a a · · · a k ... ... . . . ... a D − k a D − k + · · · a D ω ω ... ω k = r ... r k − ... . (7)Hence, P ω ( , x ) = ∑ ki = ω k − i x i and A · P ω ( , x ) ≡ ∑ k − i = r i x i mod x D + . Therefore, ( P ω ( , x ) , ∑ k − i = r i x i ) is a rational reconstruction of A modulo x D + . Lemma 11. If ( U , R ) is a rational reconstruction of A module x D + , then there is a vector ω ∈ Ker ( H max ( deg ( U ) , deg ( R )+ ) a ) such thatP ω = U (cid:16) yx (cid:17) x max ( deg ( U ) , deg ( R )+ ) . Proof.
Let k = deg ( U ) , q = deg ( R ) , U = ∑ i u i x i and R = ∑ i r i x i . Following Equation (6), AU ≡ R mod x D + is equivalent to, a . . . ... a · · · a k − a a · · · a k ... ... . . . ... a D − k a D − k + · · · a D u k ... u u = r r ... r q ... . (8)13f k > q , Equation (8) reduces to Equation (7), and so ω = ( u k , . . . , u ) ∈ Ker ( H ka ) . Moreover, U (cid:16) yx (cid:17) x k = k ∑ i = u i y i x k − i = ( j ↔ k − i ) k ∑ j = u k − j x j y j − k = P ω . If q ≥ k , we extend the vector ( u k , . . . , u ) by adding ( q + − k ) leading zeros. We rewriteEquation (8) as Equation (9). The concatenation of the two bottom submatrices form the matrix H q + a , and so ω = ( , . . . , , u k , . . . , u ) ∈ Ker ( H q + a ) . Also, P ω = k ∑ j = u j x q + − j y j + q + ∑ j = k + x q + − j y j = U (cid:16) yx (cid:17) x q + . a ... ... a ··· a k a a ··· a k + ... ... ... ... ... a ... a q − k a q + − k ... a q + ... ... ... ... ... ... a D − q − ... a D − k − a D − k ··· a D ... u k ... u = r ... r k ... r q ... . (9) Remark 12. If ( U , R ) is a rational reconstruction, then the degree of the kernel polynomialP ω ( x , y ) = U (cid:0) yx (cid:1) x max ( deg ( U ) , deg ( R )+ ) is max ( deg ( U ) , deg ( R ) + ) . In particular, the maximumpower of x that divides the kernel polynomial P ω is x max ( , deg ( R )+ − deg ( U )) .2.4. Greatest Common Divisor and B´ezout identity The Extended Euclidean algorithm ( E GCD) is a variant of the classical Euclidean algorithmthat computes the Greatest Common Divisor of two univariate polynomials A and B , gcd ( A , B ) ,together with two polynomials U and V , called cofactors , such that U A + V B = gcd ( A , B ) . In theprocess of computing these cofactors, the algorithm computes a sequence of relations between A and B that are useful to solve various problems, in particular to compute the rational reconstruc-tion of A modulo B . For a detailed exposition of this algorithm, we refer to Bostan et al. (2017,Chapter 6) and Gathen and Gerhard (2013, Chapter 3 and 11).14 lgorithm 2 Calculate the E GCD of A and B ( U , V , R ) ← ( , , B )( U , V , R ) ← ( , , A ) k ← while R k (cid:44) do k ← k + Q k − ← R k − quo R k − ( U k , V k , R k ) ← ( U k − , V k − , R k − ) − Q k − ( U k − , V k − , R k − ) end while Return { ( U i , V i , R i ) } i The Extended Euclidean Algorithm (Algorithm 2) computes a sequence of triples { ( U i , V i , R i ) } i which form the identities U i A + V i B = R i , for all i . (10)Following Gathen and Gerhard (2013), we refer to these triplets as the rows of the ExtendedEuclidean algorithms of A and B . Besides Equation (10), the rows are related to each other asfollows. Remark 13.
The degrees of the polynomials { R i } i form a strictly decreasing sequence, that is deg ( R i ) > deg ( R i + ) for every i. Lemma 14 (Bostan et al., 2017, Sec 7.1) . For each i, U i V i + − U i + V i = ( − ) i , and so thepolynomials U i and V i are coprime. Lemma 15 (Bostan et al., 2017, Lem 7.1) . For each i > , the degree of U i is the degree of Bminus the degree of R i − , that is deg ( U i ) = deg ( B ) − deg ( R i − ) , ∀ i > . Every row of the Extended Euclidean Algorithm leads to rational reconstruction of A modulo B . Remark 16.
For each i ≥ , ( U i , R i ) is a rational reconstruction of A modulo B. . The Algorithm One of the drawbacks of Algorithm 1 and its variants is that they rely on the computation ofthe kernels of many Hankel matrices and they ignore the particular structure that it is present inall of them. Using Lemma 17, we can skip many calculations by computing only two vectors, v and w (Proposition 6). This is the main idea behind Algorithm 3 that leads to a softly-lineararithmetic complexity bound (Section 4.2).Algorithm 3 performs as follows: First, step 1 computes the kernel polynomials P v and P w which, by Proposition 7, allow us to obtain the kernel polynomials of all the Hankel matrices(see Section 3.1). Then, step 2 computes a square-free kernel polynomial of the minimum degree r (see Section 3.2). Next, step 3 computes the coefficients λ , . . . , λ r (see Section 4.1.2). Finally,step 4 recovers a decomposition for the original binary form.Let f be a binary form as in Equation (1) and let { H k } ≤ k ≤ D be its corresponding family ofHankel matrices (see Definition 1). The next well-known lemma establishes the rank of f . Lemma 17.
Assume f , { H k } k , N and N of Proposition 4, and v and w of Proposition 6. If P v (Proposition 7) is square-free then the rank of f is N + , else, it is N + .Proof. By Proposition 4, for k < N +
1, the kernel of H k is trivial. Hence, by Sylvester’stheorem (Theorem 2), there is no decomposition with a rank smaller than N +
1. Recall that v ∈ Ker ( H N + ) . So, if P v is square-free, by Sylvester’s theorem, there is a decomposition ofrank N + P v is not square-free. For N + ≤ k ≤ N , P v divides all the kernel polynomials ofthe matrices H k (Proposition 7). Therefore, none of them is square-free, and so the rank is atleast N + P v and P w are coprime. So, there is a polynomial P µ of degree N − N such that Q µ : = P v · P µ + P w is square-free. A formal proof of this appears in Theorem 22.By Proposition 7, Q µ is a square-free kernel polynomial of degree N +
1. Consequently, bySylvester’s theorem, there is a decomposition with rank N + (cid:104) f (cid:105) (Kung16nd Rota, 1984);(Iarrobino and Kanev, 1999, Chapter 1). All the kernel polynomials of { H k } k belong to the annihilator of (cid:104) f (cid:105) and they form an ideal. If f is a binary form of degree D = k or2 k +
1, then this ideal is generated by two binary forms of degrees rk ( H k ) and D + − rk ( H k ) ,with no common zeros (Iarrobino and Kanev, 1999, Theorem 1.44). These are the polynomials P v and P w . Using this interpretation Algorithm 1 and its variants compute a (redundant) generatingset of the annihilator of (cid:104) f (cid:105) , while Algorithm 3 computes a (minimal) basis. Algorithm 3 F AST D ECOMP
Input:
A binary form f ( x , y ) of degree D Output:
A decomposition for f ( x , y ) of rank r .1. Compute P v and P w of { H k a } k We use Algorithm 4 with ( a , . . . , a D ) .2. IF P v ( x , y ) is square-free , Q ←− P v r ←− N + { The rank of the decomposition is the degree of Q } ELSECompute a square-free binary form Q
We compute a vector µ of length ( N − N + ) ,such that ( P µ · P v + P w ) is square-free (Section 4.1.1). Q ←− P µ · P v + P w r ←− N + { The rank of the decomposition is the degree of Q } Compute the coefficients λ , . . . , λ r Solve the system of Equation (5) where Q ( x , y ) = ∏ rj = ( β j x − α j y ) .For details and the representation of λ j , see Section 4.1.2.4. Return f ( x , y ) = ∑ rj = λ j ( α j x + β j y ) D .1. Computing the polynomials P v and P w We use Lemma 10 and Lemma 11 to compute the polynomials P v and P w from Proposition 7as a rational reconstruction of A : = ∑ Di = a i x i modulo x D + . Our algorithm exploits the ExtendedEuclidean Algorithm in a similar way as Cabay and Choi (1986) do to compute scaled Pad´efractions.In the following, let v be the vector of Proposition 6, consider U v : = P v ( , x ) and R v ∈ K [ x ] as the remainder of the division of ( A · P v ( , x )) by x D + . Note that the polynomial R v is theunique polynomial of degree strictly smaller to N + A · P v ( , x ) ≡ R v mod x D + , seeEquation (7). Lemma 18. If ( U , R ) is a rational reconstruction of A modulo x D + such that max ( deg ( U ) , deg ( R ) + ) ≤ N , then there is a polynomial Q ∈ K [ x ] such that Q · P v ( x , ) = U and Q · R v = R.Proof.
Let k : = deg ( U ) and q : = deg ( R ) . By Lemma 11, there is a non-zero vector ω ∈ Ker ( H max ( k , q + ) a ) such that the kernel polynomial P ω is equal to U (cid:0) yx (cid:1) x max ( k , q + ) . Hence, Ker ( H max ( k , q + ) a ) (cid:44) N < max ( k , q + ) . We assume that max ( k , q + ) ≤ N , hence the degree of P ω is smaller or equal to N and, by Proposition 7, P ω is divisibleby P v . Therefore, there is a polynomial ¯ Q ∈ K [ x , y ] such that ¯ QP v = P ω . Let Q : = ¯ Q ( , x ) . Bydefinition, U v = P v ( , x ) and U = P ω ( , x ) , so U = QU v . Hence, Q R v ≡ R mod x D + , because R v ≡ U v A mod x D + and Q R v ≡ QU v A ≡ UA ≡ R mod x D + . If the degrees of ( Q R v ) and R are smaller than D +
1, then
Q R v = R , as we want to prove. By assumption, deg ( R ) < N ≤ D and deg ( U v Q ) = deg ( U ) ≤ N . By definition, the degree of R v is smaller of equal to N , and sodeg ( Q R v ) ≤ deg ( U v Q R v ) ≤ N + N = D (Proposition 4).We can use this lemma to recover the polynomial P v from certain rational reconstructions. Corollary 19. If ( U , R ) is a rational reconstructions of A modulo x D + such that max ( deg ( U ) , deg ( R ) + ) ≤ N and for every polynomial Q of degree strictly bigger than zero that dividesU and R, ( UQ , RQ ) is not a rational reconstruction of A modulo x D + , then there is a non-zeroconstant c such that P v = c · U (cid:0) yx (cid:1) x max ( deg ( U ) , deg ( R )+ ) (Proposition 7). In particular,N = max ( deg ( U ) − , deg ( R )) . Proof.
By Lemma 18, there is a Q ∈ K [ x ] such that Q · ( U v , R v ) = ( U , R ) . By Lemma 10, ( U v , R v ) is a rational reconstruction, and so deg ( Q ) =
0. Hence, N + = deg ( P v ) = max ( deg ( U ) , deg ( R ) + ) and Q · P v ( , yx ) x N + = U ( yx ) x N + .18f ( U , R ) is a rational reconstruction of A modulo x D + such that deg ( U ) + deg ( R ) ≤ D and U ( ) =
1, then RU is the Pad´e approximant of A of type ( deg ( R ) , deg ( U )) (Bostan et al., 2017,Section 7.1). When this Pad´e approximant exists, it is unique, meaning that for any rationalreconstruction with this property the quotient RU is unique (we can invert U mod x D + because U ( ) = N < N , we have that D + ≤ N (Proposition 4) and so, if the the Pad´e approx-imant of A of type ( D + − , D + ) exists, by Lemma 18, we can recover P v from it. The existenceof this Pad´e approximant is equivalent to the condition U v ( ) =
1, which means v N + =
1. In thealgorithm proposed in the conference version of this paper (Bender et al., 2016, Algorithm 3),the correctness of our algorithms relied on this condition. In that version, we ensured this prop-erty with a generic linear change of coordinates in the original polynomial f . In this paper, weskip this assumption. Following Bostan et al. (2017, Theorem 7.2), when N < N , we can com-pute v no matter the value of v N + . This approach has a softly-linear arithmetic complexity andinvolves the computation of a row of the E GCD of A and x D + . We can compute P w from aconsecutive row.Before going into the proof, we study the case N = N . In this case, there are not rationalreconstructions with the prerequisites of Lemma 18, and so we treat this case in a different way. Lemma 20.
If N = N , then there is a unique rational decomposition ( U , R ) such that deg ( U ) ≤ D , deg ( R ) ≤ D and R is monic. In particular, deg ( R ) = D and we can consider the kernelpolynomial related to v ∈ Ker ( H N + ) (Proposition 6) as P v = U (cid:0) yx (cid:1) x D + .Proof. First note that, as D = N + N (Proposition 4), then N = D . Following Equation (6), ifwe write U = ∑ N i = u i x i and R = ∑ N i = u i x i , then we get the linear system, H N a a a . . . . . . ... a · · · a N − a N a · · · a N a N + ... . . . ... ... a D − N · · · a D − a D u N ... u u = r ... r N − r N ... . The matrix H N ∈ K ( D − D + ) × ( D + ) is square and, as Ker ( H N ) = r N =
0, that is deg ( R v ) < N , then the polynomial U is zero and so ( U , R ) is not a19ational reconstruction. Hence, we can consider deg ( U ) = N . If R is monic, then r N = U and R as u N ... u u = ( H N + ) − ... , r ... r N − = a a a . . . . . . ... a · · · a N − a N u N u N − ... u . Lemma 21 (Correctness of Algorithm 4) . Let { ( U j , V j , R j ) } j be the set of triplets obtained fromthe Extended Euclidean Algorithm for the polynomials A and x D + , see Section 2.4. Let i be theindex of the first row of the extended Euclidean algorithm such that deg ( R i ) < D + . Then, wecan compute the polynomials P v and P w of Proposition 7 as (A) P v = U i ( xy ) · x max ( deg ( U i ) , deg ( R i )+ ) . (B) If deg ( R i ) > deg ( U i ) , P w = U i + ( xy ) · x deg ( U i + ) . (C) If deg ( R i ) ≤ deg ( U i ) , P w = U i − ( xy ) · x deg ( R i − + ) .Proof. (A). First observe that if i is the first index such that the degree of R i is strictly smallerthan D + , then, by Remark 13, the degree of R i − has to be bigger or equal to D + . Hence, thedegree of U i is smaller or equal to D + , because by Lemma 15, deg ( U i ) = D + − deg ( R i − ) ≤ D + − D + = D + . We can consider R i − , that is i is strictly bigger than 0, because the degreeof R = x D + is strictly bigger than D + .If N = N , then D is even and N = D (Proposition 4). As (cid:98) D + (cid:99) = D , deg ( R i ) ≤ D anddeg ( U i ) ≤ D . By Lemma 20, max ( deg ( U i ) , deg ( R i ) + ) = N + P v as U i ( yx ) x N + .If N < N , assume that there is a non-zero Q ∈ K [ x ] such that Q divides U i and R i and ( U i Q , R i Q ) is a rational reconstruction of A modulo x D + . Then, U i Q A ≡ R i Q mod x D + and so there is a poly-nomial ¯ V such that ¯ V x D + + U i Q A = R i Q . Multiplying by Q , we get the equality Q ¯ V x D + + U i A = R i .Consider the identity V i x D + + U i A = R i from Equation (10). Coupling the two equalities to-gether, we conclude that V i = Q ¯ V . As Q divides U i and V i , which are coprime (Lemma 14), Q is aconstant, that is deg ( Q ) =
0. If N < N , then D < N (Proposition 4). Hence, deg ( U i ) ≤ D + ≤ N and deg ( R i ) + < D + + ≤ N +
1, that is, max ( deg ( U i ) , deg ( R i ) + ) ≤ N . Hence, by20emma 18, we can consider U i ( yx ) x max ( deg ( U i ) , deg ( R i )+ ) as the kernel polynomial P v from Propo-sition 7, spanning Ker ( H N + ) . (B). Assume that the degree of U i is strictly bigger than the one of R i , that is deg ( U i ) > deg ( R i ) . Then N = deg ( U i ) −
1, as deg ( U i ) = deg ( P v ) = N + i > U = R = A (cid:44)
0, and so deg ( U ) ≤ deg ( R ) . The degree of R i − is N because, by Lemma 15, deg ( R i − ) = D + − deg ( U i ) = D + − N − = N (Proposition 4).Consider the degree of U i − . By Lemma 15, deg ( U i − ) = D + − deg ( R i − ) . As deg ( R i − ) > deg ( R i − ) (Remark 13), then deg ( R i − ) > N . Therefore, the degree of U i − is smaller or equalto the one of R i − becausedeg ( U i − ) = D + − deg ( R i − ) < D + − N = N +
1, and sodeg ( U i − ) ≤ N ≤ N = deg ( R i − ) . Hence, by Remark 16, ( U i − , R i − ) is a rational reconstruction of A modulo x D + such thatdeg ( U i − ) ≤ N and deg ( R i − ) = N . So, max ( deg ( U i − ) , deg ( R i − ) + ) = N + P ω = U i − ( yx ) x N + of degree N + x N + − deg ( U i − ) divides P ω . As deg ( U i − ) ≤ N , x N + − N divides x N + − deg ( U i − ) and so, it divides P ω . We as-sumed that the degree of U i is strictly bigger than the one of R i , and so x does not divide P v (Remark 12). Hence, there is no binary form Q of degree N − N such that x N − N + divides Q P v . Therefore, by Corollary 8, we can consider P w = P ω . (C). Assume that the degree of R i is bigger or equal to the one of U i , that is deg ( R i ) ≥ deg ( U i ) .Hence, deg ( R i ) + = deg ( P v ) = N + ( R i ) = N . In particular, R i (cid:44) ( i + ) -th row of the Extended Euclidean Algorithm, ( U i + , V i + , R i + ) , is defined.The degree of U i + is N + ( U i + ) = D + − deg ( R i ) = N + R i + is strictly smaller than the one of R i (Remark 13), which is N . Hence, the degree of R i + is smaller than the degree of U i + because deg ( R i + ) < N ≤ N < deg ( U i + ) . Therefore, P ω = U i + ( yx ) x N + is a kernel polynomial in Ker ( H N + ) (Lemma 11).By Remark 12, as deg ( R i + ) < deg ( U i + ) , x does not divide P ω . Also, the maximal power of x that divides P v is x deg ( R i )+ − deg ( U i ) , and, as we assumed deg ( R i ) ≥ deg ( U i ) , x divides P v . Hence,every polynomial in { Q P v : deg ( Q ) = N − N } is divisible by x , and so, by Corollary 8, we canconsider P w = P ω . 21 lgorithm 4 C OMPUTE P V AND P W Input:
A sequence ( a , . . . , a D ) . Output:
Polynomials P v and P w as 7.1. i ← first row of E GCD ( x D + , ∑ i = a i x i ) such that R i < D + .2. P v ← U i ( xy ) · x max ( deg ( U i ) , deg ( R i )+ ) . N ← max ( deg ( U i ) − , deg ( R i )) IF deg ( R i ) > deg ( U i ) , P w ← U i + ( xy ) · x deg ( U i + ) . N ← deg ( U i + ) − ELSE P w ← U i − ( xy ) · x deg ( R i − + ) . N ← deg ( R i − ) .4. Return P v and P w We can compute Q at step 2 of Algorithm 3 in different ways. If P v is square-free, then weset Q equal to P v . If P v is not square-free, by Lemma 17, we need to find a vector µ ∈ K ( N − N + ) such that Q µ : = P µ · P v + P w is square-free. By Proposition 7, P v and P w are relative prime. Thus,if we take a random vector µ , generically, Q µ would be square-free. For this to hold, we have toprove that the discriminant of Q µ is not identically zero. To simplify notation, in the followingtheorem we dehomogenize the polynomials. Theorem 22.
Given two relative prime univariate polynomials P v ( x ) and P w ( x ) of degrees N + and N + respectively, let Q µ ( x ) : = P µ ( x , ) · P v + P w ∈ K [ µ , . . . , µ N − N ][ x ] . The discriminantof Q µ ( x ) with respect to x is a non-zero polynomial.Proof. The zeros of the discriminant of Q µ ( x ) with respect to x over K correspond to the set { µ ∈ K N − N + : Q µ has double roots } . We want to prove that the discriminant is not zero.A univariate polynomial is square-free if and only if it does share any root with its deriva-22ive. Hence, ( µ , . . . , µ N − N ) T ∈ { µ ∈ K N − N + : Q µ has double roots } if and only if, there is ( µ , . . . , µ N − N , α ) ∈ K N − N + × K such that it is a solutions to the following system of equa-tions ( P µ · P v + P w )( x ) = ( P µ · P (cid:48) v + P (cid:48) µ · P v + P (cid:48) w )( x ) = . (11)In Equation (11), µ only appears in P µ ( x , ) with degree 1. We eliminate it to obtain thepolynomial ( P v · P (cid:48) µ + P (cid:48) w ) P v − P (cid:48) v · P w . This polynomial is not identically 0 as P (cid:48) v does not divide P v and P v and P w are relativeprime. Hence, for each ( µ , . . . , µ N − N ) , there is a finite number of solutions for this equation,bounded by the degree of the polynomial. Moreover, as the polynomials of Equation (11) arelinear in µ , each solution of the deduced equation is extensible to a finite number of solutionsof Equation (11). Hence, there is a µ ∈ K N − N + , such that Q µ is square-free. Therefore, thediscriminant of Q µ ( x ) is not identically zero. Corollary 23.
For every vector ( µ , . . . , µ N − N ) ∈ K N − N such that there is a µ ∈ K such thaty does not divides Q µ , where µ = ( µ , . . . , µ N − N ) , there are at most D + different values for µ ∈ K such that the polynomial Q µ ( x , y ) is not square-free.Proof. If Q µ ( x , y ) is not square-free, then it has a double root in P ( K ) . This root could be ofthe form ( α , ) ∈ P ( K ) or ( , ) ∈ P ( K ) . We analyze separately these cases.First, we consider the polynomial Q µ ( x , ) ∈ K [ µ , x ] . By Theorem 22, the discriminant of Q µ ( x , ) with respect to x is not zero. As Q µ ( x , ) is a polynomial of degree N + x , and of degree 1 with respect to µ , the degree with respect to µ of the discriminant of Q µ ( x , ) with respect to x is at most ( N + ) + N ≤ D +
1. Hence, there are at most 2 D + µ such that Q µ ( x , y ) has a root of the form ( α , ) ∈ P ( K ) with multiplicity biggerthan one.The polynomial Q µ ( x , y ) has a root of the form ( , ) ∈ P ( K ) with multiplicity bigger thanone, if and only if y divides Q µ ( x , y ) . If this happens, then the coefficients of the monomials y · x N − N − and x N − N in the polynomial Q µ ( x , y ) are zero. By assumption, these coefficientsare not identically zero as polynomials in K [ µ ] . As Q µ ( x , y ) is a linear polynomial with respectto µ , there is at most one value for µ such that y divides Q µ ( x , y ) .23herefore, there are at most ( D + ) + Q µ ( x , y ) is not square-free. Remark 24.
The previous assumption is not restrictive. If y divides Q µ , where µ =( µ , . . . , µ N − N ) , then y does not divide Q ( µ ,..., µ N − N + ) = Q µ + x N + norQ ( µ ,..., µ N − N − + , µ N − N ) = Q µ + yx N . Moreover, if N − N ≥ , y divides (or not) Q µ ( x , y ) regardless the value of µ . Conversely, if N − N < , there is always a µ such that y does notdivide Q µ .3.3. Correctness of Algorithm 3 For computing a decomposition for a binary form f , we need to compute the kernel of aHankel matrix (Theorem 2). Algorithm 4 computes correctly the polynomials P v and P w thatcharacterize the kernels of the family of Hankel matrices associated to f . Once we obtain thesepolynomials, step 2 (see Corollary 23) and step 3 computes the coefficients α j , β j , λ j of thedecomposition. Hence, we have a decomposition for f , as f ( x , y ) = ∑ rj = λ j · ( α x + β y ) D . Example.
Consider f ( x , y ) = y + xy + x y + x y + x . The family of Hankel matricesassociated to f are related to the vector a : = ( , , , , ) T , it is denoted by { H ka } k , and it containsthe following matrices: H a = H a = H a = H a = (cid:16) (cid:17) The kernel H a is trivial, so we compute the one of H a . This kernel is generated by the vector ( , − , ) T , so by Proposition 6 we consider v = ( , − , ) T . Also, by Proposition 4, N + = N = D − N =
3. The kernel polynomial P v = y − x y + x = ( x − y ) is not square-free thus,by Lemma 17, the rank of f ( x , y ) is N + = P w in the kernel of H a . Following Proposition 6, the kernel of H a is generated by U-chain of v given vectors U v = ( , − , , , ) T , U v = ( , , − , , ) T , and U v = ( , , , − , ) T , plus avector w linear independent with this U-chain. We consider the vector w = ( , , , , − ) , whichfulfills that assumption. Hence, P v = y − x y + x and P w = yx − x .24e proceed by computing a square-free polynomial combination of P v and P w . For that, wechoose Q : = ( y + yx + x ) P v + P w = ( x − y )( x − y )( x + y )( x + y ) . Finally, we solve the system given by the transposed of a Vandermonde matrix, · − − · ( − ) ( − ) · ( − ) ( − ) λ λ λ λ = . (12)The unique solution of the system is ( − , , , ) T , and so we recover the decomposition f ( x , y ) = − ( x + y ) + ( x + y ) + ( − x + y ) − ( − x + y ) . Instead of considering incrementally the matrices in the Hankel family we can compute thepolynomials P v and P w faster by applying Algorithm 4. For this, we consider the polynomial A : = x + x + x + x +
1, and the rows of the Extended Euclidean Algorithm for A and x . j V j U j R j x x + x + x + x +
12 1 ( x − ) ( x + x + x + ) − ( x − ) ( x − ) ( x + x + x + x + ) − x j such that deg ( R j ) < , which is j =
3. Hence, N = max ( deg ( U ) − , deg ( R )) = P v : = U (cid:16) yx (cid:17) x max ( deg ( U ) , deg ( R )+ ) = (cid:16) yx − (cid:17) x = ( y − x ) . As deg ( R ) ≤ deg ( U ) , we consider N = deg ( R ) = P w : = U (cid:16) yx (cid:17) x deg ( R )+ = ( yx − x ) . ♦ The real case.
When we consider the decomposition of binary forms over R , Algorithm 3 mightfail. This happens either when the decomposition over C is not unique or when the decompositionover C is unique but it is not a decomposition over R . The algorithm fails because25 The real rank of the binary form might be bigger than N +
1, see Reznick (2013a).Lemma 17 does not hold over R and so we cannot find a square-free kernel polynomial Q µ that factors over R . • Even when the real rank of the binary form is N +
1, we need to perform some extracomputations to compute a square-free kernel polynomial Q µ that factors over R . Thiscomputations are not taken into account in our algorithm, so we could never find such adecomposition.Recently, Sylvester’s algorithm was extended to the real case (Ansola et al., 2017, Algorithm 2).This algorithm performs an incremental search over r as in Algorithm 1, but it decides if there isa real decomposition of length r by checking the emptiness of a semi-algebraic set. In the step r -th of the algorithm, the semi-algebraic set is embedded in R dim ( Ker ( H r )) . Hence, the bottleneckof their algorithm is not the computation of P v and P w as in our case, but deciding the emptinessof the semi-algebraic set. We emphasize that when the decompositions over C and R are uniqueand the same, our algorithm computes such a decomposition. Moreover, given P v , we can checkif the previous condition holds by checking if P v splits over R .
4. Complexity
In this section we study the algebraic degree of the parameters that appear in the decompositionof a binary form as well as the arithmetic and bit complexity of Algorithm 3.
If we assume that the coefficients of the input binary form Equation (1) are rational numbersthen the parameters of the decompositions, α j , β j , and λ j (see Equation (3)), are algebraic num-bers, that is, roots of univariate polynomials with integer coefficients. The maximum degree ofthese polynomials is the algebraic degree of the problem. We refer the interested reader to Ba-jaj (1988); Nie et al. (2010); Draisma et al. (2016) for a detailed exposition about the algebraicdegree and how it address the complexity of the problem at hand at a fundamental level. Recall that, from Lemma 17, the rank of f could be either N + N +
1. When thepolynomial P v is square-free, then the rank is N + Q = P v . Following the discussion of26ection 3.2, we prove that, when the rank of the binary form is N +
1, we can compute a square-free kernel polynomial Q of this degree such that the largest degree of its irreducible factors is N . Moreover, we prove that for almost all the choices of ( N − N + ) different points in P ( K ) (the projective space of K ) there is a square-free kernel polynomial of H N + which vanishes onthese points. This will be our choice for Q . Lemma 25.
Let f be a binary form of rank N + . Given ( N − N + ) different points ( α , β ) , . . . , ( α N − N , β N − N ) ∈ P ( K ) such that none of them is a root of P v , then there is aunique binary form P µ of degree N − N , such that the kernel polynomial Q µ : = P µ · P v + P w vanishes on these points.Proof. Without loss of generality, we assume β i =
1, for i ∈ { , . . . , N − N } . By Proposition 7,for any polynomial P µ of degree N − N , Q µ is a kernel polynomial. Since Q µ ( α i , ) =
0, wecan interpolate P µ by noticing that P µ ( α j , ) = − P w ( α j , ) P v ( α j , ) .The degree of P µ is ( N − N ) and we interpolate it using ( N − N + ) different points.Hence there is a unique interpolation polynomial P µ . So, Q µ is the unique kernel polynomial of H N + vanishing at all these points. Example (cont.).
For the example of Section 3.3, we obtained the square-free kernel polynomial Q by choosing the points ( , ) , ( − , ) and ( − , ) ∈ P ( K ) . If we choose other points suchthat Q is square-free, we will obtain a different decomposition. Hence, f does not have a uniquedecomposition. This holds in general. ♦ From Lemma 25, we deduce the following well-known result about the uniqueness of the de-composition, see also Helmke (1992); Comas and Seiguer (2011); Bernardi et al. (2011); Carliniet al. (2017).
Corollary 26.
A decomposition is unique if and only if the rank is N + and N < N . Adecomposition is not unique if and only if the rank is N + . Theorem 27.
Let the rank of f be N + . Then there is a square-free kernel polynomial Q suchthat the largest degree of its irreducible factors is at most N .Proof. If the rank of f is N +
1, then for each set of N − N + P ( K ) ,following the assumptions of Lemma 25, there is a unique kernel polynomial. There is a ratio-nal map that realizes this relation (see the proof of Lemma 25). Let this map be Q [ α ] , where27 = (( α , β ) , . . . , ( α N − N , β N − N )) ∈ P ( K ) N − N + . The image of the map is contained in { P µ · P v + P w : µ ∈ K N − N + } . This set and P ( K ) N − N + have the same dimension, N − N + Q ( x , y ) , there is a finite number of distinct points ( α , β ) ∈ P ( K ) such that ˆ Q ( α , β ) =
0. Hence, the pre-image of an element in the image of Q [ α ] is a finite set.Therefore, the dimension of the image and the dimension of the domain are the same.By Theorem 22, the non-square-free kernel polynomials form a hypersurface in the space ofkernel polynomials of the shape P µ · P v + P w . If we consider the pre-image of the intersectionbetween this hypersurface and the image of the rational map, then its dimension is smaller than N − N + N − N + P ( K ) , the map Q [ α ] ( x , y ) resultsa square-free kernel polynomial. As K is the algebraic closure of K ⊂ C , the same holds over K . Theorem 28.
Given a binary form f of rank r and degree D, there is a square-free kernelpolynomial of degree r such that the biggest degree of its irreducible factors is min ( r , D − r + ) .Proof. If the rank is r = N +
1, then min ( r , D − r + ) = N . By Theorem 27, such a square-free kernel polynomial exists. If the rank is r = N + N < N , by Lemma 17, there is asquare-free kernel polynomial of degree min ( r , D − r + ) = N + Proposition 29.
Let f be a binary form of rank N + . For every set S ⊂ P ( K ) of cardinal ( N − N ) such that ( ∀ ( α , β ) ∈ S ) P v ( α , β ) (cid:44) there are at most D + D + values ( α , β ) ∈ P ( K ) such that ( α , β ) (cid:60) S, P v ( α , β ) (cid:44) and the unique kernel polynomial Q µ : = P µ · P v + P w that vanish over S and ( α , β ) (Lemma 25) is not square-free. To prove this proposition we use Lagrange polynomials to construct the maps and varietiesof the proof of Theorem 27.Let S = { ( α , β ) , . . . , ( α N − N , β N − N ) } ⊂ P ( K ) be the set of Proposition 29. For each ( α , β ) ∈ P ( K ) such that ( α , β ) (cid:60) S and P v ( α , β ) (cid:44) Q α , β which vanishes at S and ( α , β ) , see Lemma 25. Using Lagrange polynomial, we28an write this polynomial as Q α , β ( x , y ) = (cid:32) − P w ( α , β ) P v ( α , β ) M ( x , y ) M ( α , β ) + N − N ∑ i = β x − α y α β i − α i β E i ( x , y ) (cid:33) P v ( x , y ) + P w ( x , y ) Where E i ( x , y ) : = − P w ( α i , β i ) P v ( α i , β i ) ∏ j (cid:60) { , i } β j x − α j y α i β j − α j β i and M ( x , y ) : = ∏ N − N j = ( β j x − α j y ) . For each ( α j , β j ) ∈ S , we characterize the possible ( α , β ) ∈ P ( K ) such that ( α j , β j ) is aroot of Q α , β of multiplicity bigger than one. Then, we study the ( α , β ) ∈ P ( K ) such that ( α , β ) is a root of Q α , β with multiplicity bigger than one. Finally, we reduce every case to theprevious ones.To study the multiplicities of the roots, we use the fact that ( α , β ) is a double root of a binaryform P if and only if P ( α , β ) = ∂ P ∂ x ( α , β ) = ∂ P ∂ y ( α , β ) =
0. Hence, for each ( α , β ) ∈ P ( K ) ,we consider ∂ Q α , β ∂ x and ∂ Q α , β ∂ y , where ∂ Q α , β ∂ x = − P w ( α , β ) P v ( α , β ) M ( α , β ) (cid:18) ∂ M ∂ x P v + M ∂ P v ∂ x (cid:19) ( x , y )+ (13) N − N ∑ i = β α i − α β i ∂ (( β x − α y ) E i P v ) ∂ x ( x , y ) + ∂ P w ∂ x ( x , y ) Let O α , β x ( x , y ) be the product between the last line of Equation (13) and M ( α , β ) , that is O α , β x ( x , y ) : = N − N ∑ i = M ( α , β ) β α i − α β i ∂ (( β x − α y ) E i P v ) ∂ x ( x , y ) + M ( α , β ) ∂ P w ∂ x ( x , y ) In what follows, instead of considering the pair ( α , β ) as a point in P ( K ) , we consider it asa pair of variables. Hence, for every ( α i , β i ) ∈ S , ( β α i − α β i ) divides M ( α , β ) , as polynomialsin K [ α , β ] , so O α , β x ( x , y ) is a polynomial in K [ α , β ][ x , y ] . The derivative of Q α , β withrespect to x is a rational function in K ( α , β )[ x , y ] , that we can write as ∂ Q α , β ∂ x = T α , β ( x , y ) P v ( α , β ) M ( α , β ) where T α , β ( x , y ) : = − P w ( α , β ) (cid:18) ∂ M ∂ x P v + M ∂ P v ∂ x (cid:19) ( x , y ) + O α , β x ( x , y ) P v ( α , β ) ∈ K [ α , β ][ x , y ] Lemma 30.
For each ( α i , β i ) ∈ S, there are at most N + possible ( ¯ α , ¯ β ) ∈ P ( K ) such that ( ¯ α , ¯ β ) (cid:60) S, P v ( ¯ α , ¯ β ) (cid:44) and that ( α i , β i ) is a root of multiplicity bigger than in Q ¯ α , ¯ β . For each 0 ≤ i ≤ N − N , Q α , β ( x , y ) is a rational function of degree 0 with respect to ( α i , β i ) . Hence, it is welldefined the evaluation of the variables ( α i , β i ) in Q α , β ( x , y ) at points of P ( K ) . roof. If ( α i , β i ) is a root of multiplicity bigger than 1 in Q ¯ α , ¯ β , then ∂ Q ¯ α , ¯ β ∂ x ( α i , β i ) =
0. Hence,we are looking for the ( ¯ α , ¯ β ) such that T ¯ α , ¯ β ( α i , β i ) =
0. The polynomial T α , β ( α i , β i ) be-longs to K [ α , β ] , so if it is not identically zero, then there are a finite number of points ( ¯ α , ¯ β ) ∈ P ( K ) such that T ¯ α , ¯ β ( α i , β i ) =
0. Moreover, the degree of the polynomial T α , β ( α i , β i ) is atmost max ( deg ( P w ) , deg ( O α , β x ( α i , β i ))+ deg ( P v )) = N +
1. Hence, if the polynomial is not zero,this finite number is at most N + T α , β ( α i , β i ) ∈ K [ α , β ] is not zero. Observe that as M is square-free, M ( α i , β i ) = P v ( α i , β i ) (cid:44)
0, then (cid:16) ∂ M ∂ x P v + M ∂ P v ∂ x (cid:17) ( α i , β i ) (cid:44)
0. Hence, as P w and P v arecoprime, if ( ¯ α , ¯ β ) ∈ P ( K ) and P v ( ¯ α , ¯ β ) =
0, then T ¯ α , ¯ β ( α i , β i ) (cid:44)
0. That is, T α , β ( α i , β i ) does not vanish in the roots of P v . Lemma 31.
There are at most N + possible ( ¯ α , ¯ β ) ∈ P ( K ) such that ( ¯ α , ¯ β ) (cid:60) S,P v ( ¯ α , ¯ β ) (cid:44) and ( ¯ α , ¯ β ) is a root of multiplicity bigger than in Q ¯ α , ¯ β .Proof. Following the proof of Lemma 30, we study T α , β ( α , β ) ∈ K [ α , β ] . T α , β ( α , β ) = − P w ( α , β ) (cid:18) ∂ M ∂ x P v + M ∂ P v ∂ x (cid:19) ( α , β ) + O α , β x ( α , β ) P v ( α , β )= (cid:18) − P w M ∂ P v ∂ x (cid:19) ( α , β ) + (cid:18) O α , β x − P w ∂ M ∂ x (cid:19) ( α , β ) P v ( α , β ) Note that P w and P v are coprime. Also, M and P v are coprime. Hence, the polynomial T α , β ( α , β ) is not zero because P v does not divide P w M ∂ P v ∂ x . We conclude the proof by noting that the degreeof T α , β ( α , β ) is bounded by 2 N + Lemma 32.
Let ( ¯ α , ¯ β ) , ( ˆ α , ˆ β ) ∈ P ( K ) such that ( ¯ α , ¯ β ) , ( ˆ α , ˆ β ) (cid:60) S, P v ( ¯ α , ¯ β ) (cid:44) . Hence,Q ¯ α , ¯ β ( ˆ α , ˆ β ) = if and only if Q ¯ α , ¯ β ( x , y ) = Q ˆ α , ˆ β ( x , y ) .Proof. Assume that Q ¯ α , ¯ β ( ˆ α , ˆ β ) =
0. Following Lemma 25, we write Q ¯ α , ¯ β = P ¯ µ P v + P w and Q ˆ α , ˆ β = P ˆ µ P v + P w . As P v and P w are coprime and Q ¯ α , ¯ β ( ˆ α , ˆ β ) =
0, then P v ( ˆ α , ˆ β ) (cid:44) Q ¯ α , ¯ β − Q ˆ α , ˆ β = P v ( P ¯ µ − P ˆ µ ) . This polynomial belongs to K [ x , y ] and it vanishes over P ( K ) at the N + P v , at the N − N points on S , and at ( ˆ α , ˆ β ) ∈ P ( K ) . Hence, Q ¯ α , ¯ β − Q ˆ α , ˆ β = K [ x , y ] of degree at most N + N + P ( K ) . Therefore, Q ¯ α , ¯ β ( x , y ) = Q ˆ α , ˆ β ( x , y ) .To prove the second case, note that, by definition, Q ˆ α , ˆ β ( ˆ α , ˆ β ) =
0. Hence, if we assumethat Q ¯ α , ¯ β ( x , y ) = Q ˆ α , ˆ β ( x , y ) , then we have Q ¯ α , ¯ β ( ˆ α , ˆ β ) = roof of Proposition 29. We want to bound the number of different points ( ¯ α , ¯ β ) ∈ P ( K ) suchthat Q ¯ α , ¯ β ( x , y ) is not a square-free binary form over K [ x , y ] . If the binary form Q ¯ α , ¯ β ( x , y ) is not square-free, then it has a root over P ( K ) with multiplicity bigger than one. If such aroot is ( α i , β i ) ∈ S , we can bound the possible number of different values for ( ¯ α , ¯ β ) ∈ P ( K ) by ( N + ) (Lemma 31). Hence, if there is a i such that ( α i , β i ) ∈ S has multiplicity biggerthan one as a root of Q ¯ α , ¯ β ( x , y ) , we can bound the possible number of different values for ( ¯ α , ¯ β ) ∈ P ( K ) by S · ( N + ) = ( N − N )( N + ) .If Q ¯ α , ¯ β is not square-free and the multiplicity of every root ( α i , β i ) ∈ S is one, then theremust be a root ( ˆ α , ˆ β ) ∈ P ( K ) such that ( ˆ α , ˆ β ) (cid:60) S and its multiplicity as a root of Q ¯ α , ¯ β isbigger strictly than one. By Lemma 32, Q ˆ α , ˆ β ( x , y ) = Q ¯ α , ¯ β ( x , y ) , and so ( ˆ α , ˆ β ) ∈ P ( K ) hasmultiplicity bigger than one as a root of Q ˆ α , ˆ β ( x , y ) . Hence, P v ( ˆ α , ˆ β ) (cid:44) ( ˆ α , ˆ β ) ∈ P ( K ) by 2 N +
1. As Q ˆ α , ˆ β ( x , y ) has N + P ( K ) \ S then, by Lemma 32, there are N + ( ¯ α , ¯ β ) ∈ P ( K ) such that Q ˆ α , ˆ β ( x , y ) = Q ¯ α , ¯ β ( x , y ) . Hence, for each ( ˆ α , ˆ β ) ∈ P ( K ) suchthat ( ˆ α , ˆ β ) has multiplicity bigger than one as a root of Q ˆ α , ˆ β ( x , y ) , there are N + ( ¯ α , ¯ β ) ∈ P ( K ) such that ( ˆ α , ˆ β ) has multiplicity bigger than one as a root of Q ¯ α , ¯ β ( x , y ) .Therefore, the number of different values for ( ¯ α , ¯ β ) ∈ P ( K ) such that Q ¯ α , ¯ β ( x , y ) has a rootin P ( K ) \ S with multiplicity bigger than one is bounded by ( N + )( N + ) .Joining these bounds, we deduce that there are at most ( N − N )( N + ) + ( N + )( N + ) different ( ¯ α , ¯ β ) ∈ P ( K ) such that Q ¯ α , ¯ β is not square-free. Recalling that N = D − N and N ≤ D (Proposition 4), we can bound ( N − N )( N + ) + ( N + )( N + ) , by D + D + Waring locus of the binary form f ( x , y ) , see Carliniet al. (2017). For example, this proposition improves (Carlini et al., 2017, Proposition 3.8) bya factor of two. Moreover, it shows that the uniqueness condition from (Carlini et al., 2017,Proposition 3.8) misses some assumptions to hold . The authors are not taking into account that the lambdas that they use in their proof are not unique, and so they giveus more degrees of freedom that we can use to fix more terms in the decomposition. .1.2. Complexity of computing λ We compute the coefficients λ j of the decomposition by solving a linear system involving atransposed Vandermonde matrix (Step 3 of Algorithm 3). We follow Kaltofen and Yagati (1989)to write the solution of Equation (5) as the evaluation of a rational function over the roots of aunivariate polynomial. Definition 33.
Given a polynomial P ( x ) : = ∑ ni = a i x i and < k ≤ n, letQuo ( P ( x ) , x k ) : = n ∑ i = k a i x i − k . Proposition 34 (Kaltofen and Yagati, 1989, Sec. 5) . If α j (cid:44) α i , for all i (cid:44) j, then there is a uniquesolution to the system of Equation (14) . ··· α α ··· α r ... ... ... α r − α r − ··· α r − r λ = a a ... a r − (14) Moreover, if the solution is λ = ( λ , . . . , λ r ) T then, λ j = TQ (cid:48) ( α j ) where Q (cid:48) ( x ) is the derivative ofQ ( x ) : = ∏ ri = ( x − α i ) , R ( x ) : = ∑ ri = a r − i x i − and T ( x ) : = Quo (cid:0) Q ( x ) · R ( x ) , x r (cid:1) . Lemma 35.
Given a binary form f ( x , y ) : = ∑ Di = (cid:0) Di (cid:1) a i x i y D − i , let Q be a square-free kernelpolynomial of degree r, obtained after step 3 of Algorithm 3. Assume that y does not di-vide Q. Let α j be the j-th roots of Q ( x ) , Q (cid:48) ( x ) be the derivative of Q ( x ) and the polynomialT ( x ) : = Quo (cid:0) Q ( x ) · R ( x ) , x r (cid:1) , with R ( x ) : = ∑ ri = a r − i x i − . Then, each λ j from step 3 in Algo-rithm 4 can be written as λ j = TQ (cid:48) ( α j ) .Proof. As y does not divide Q , we can write it as Q ( x , y ) = ∏ i ( x − α i y ) , where all the α i aredifferent. Hence, as the r × r leading principal submatrix of Equation (5) is invertible, we canrestrict the problem to solve that r × r leading principal subsystem. This system is Equation (14).Therefore, the proof follows from Proposition 34. Proposition-Definition 36 (Symbolic decomposition) . Let Q be a square-free kernel polynomialrelated to a minimal decomposition of a binary form f of degree D, such that y does not divideQ. In this case, we can write f asf ( x , y ) = ∑ { α ∈ K | Q ( α )= } TQ (cid:48) ( α ) · ( α x + y ) D . emark 37. If the square-free kernel polynomial related to a decomposition of rank r is divisibleby y, we can compute { λ j } j < r of Equation (5) as in Lemma 35, by taking Qy as the kernel poly-nomial. It is without loss of generality to consider Q = P ( u ,..., − , ) T , because Q is square-free,and so y can not divide it. Hence, λ r = a D − ∑ r − i = u i a D − r + i + (Reznick, 2013a, Equation 2.12). To summarize the section, given a binary form f of rank r , there is a square-free kernelpolynomial Q of the degree r , such that the largest degree of its irreducible factors is bounded bymin ( r , D − r + ) (Proposition-Definition 36). If Q ( x , y ) is not divisible by y , the decompositionis f ( x , y ) = ∑ { α ∈ K | Q ( α )= } TQ (cid:48) ( α ) · ( α x + y ) D , where T and Q (cid:48) are polynomials whose coefficients belong to K and whose degrees are boundedby r , defined in Lemma 35. When y divides Q , the form is similar. In this section we analyze the tightness of the bound of Theorem 28. To do so, we constructfamilies of examples where the bound is tight. We present two families of examples. In the firstone, the decomposition is unique. In the second one, it is not.
Proposition 38 (Heinig and Rost, 1984, Theorem 5.2) . For every pair of relatively prime binaryforms, ¯ P v and ¯ P w , of degrees ¯ N + and ¯ N + , ¯ N ≤ ¯ N , there is a sequence a = ( a , . . . , a ¯ N + ¯ N ) such that N a = ¯ N , N a = ¯ N , and we can consider the polynomials ¯ P v and ¯ P w as the kernelpolynomials P v and P w from Proposition 7 with respect to the family of Hankel matrices { H ka } k . Corollary 39.
If there is an irreducible binary form of degree ¯ N + in K [ x , y ] , then for everyD > N , there is a binary form f of degree D such that its decomposition is unique, its rank ¯ N + , and the degree of the biggest irreducible factor of the polynomial Q from Algorithm 3 inthe decomposition is min ( ¯ N + , D − ¯ N ) = ¯ N + . That is, the algebraic degree of the minimaldecomposition over K is ¯ N + and the bound of Theorem 28 is tight.Proof. Let ¯ P v be a irreducible binary form of degree ¯ N +
1. Let ¯ P w be any binary form of degree¯ N + = D − ¯ N + P v . Consider the sequence a = ( a , . . . , a ¯ N + ¯ N ) ofProposition 38 with respect to ¯ P v and ¯ P w , and the binary form f ( x , y ) : = ∑ Di = (cid:0) Di (cid:1) a i x i y D − i . As K is of characteristic zero, K is a perfect field, and so, as ¯ P v is irreducible, it is square-free.Then, by Lemma 17, the rank of the decomposition is N a + = ¯ N +
1, and by Corollary 26 the33ecomposition is unique. Following Algorithm 3, the polynomial Q is equal to ¯ P v , which is anirreducible polynomial of degree ¯ N +
1. As D > N , then min ( ¯ N + , D − ¯ N ) = ¯ N + Lemma 40.
Let K = Q and p ∈ N a prime number. Then, there is a binary form f of degree ( p − ) whose decomposition is not unique and the bound of Theorem 28 is tight.Proof. Consider the binary form f ( x , y ) : = (cid:0) ( p − ) p − (cid:1) x p − y p − . Using Algorithm 4, we obtain P v = − y p and P w = x p , N = N = p −
1. The polynomial P v is not square-free, so we have to con-sider a square-free kernel polynomial in Ker ( H N + ) . Moreover, the rank of the decompositionis N + = p . Every kernel polynomial in Ker ( H N + ) in Q [ x , y ] can be written as µ w x p − µ v y p for some µ w , µ v ∈ Q . We are interested in the zeros of these polynomials (step 3 of Algorithm 3),thus we consider coprime µ w , µ v ∈ Z , as the zeros do not change. As we want to consider square-free kernel polynomials, neither µ w nor µ v can be zero, and so ( , ) ∈ P ( Q ) is not a root of anyof these polynomials. Hence, we rewrite our polynomial as µ v y p ( µ w x p µ v y p − ) , and so we look forthe factorization over Q [ z ] of µ w µ v z p −
1, where z = xy . We can use the Newton’s polygon criterion,e.g., Cassels (1986, Chapter 6.3), to show that, if p (cid:113) µ w µ v (cid:60) Q , then µ w µ v z p − Q [ x , y ] and so the degree of its biggest irreducible factor is p > min ( p , ( p − ) − p + ) = p − p (cid:113) | µ w µ v | ∈ Q , and so we can factor it as (cid:18) p (cid:114) | µ w µ v | · z (cid:19) p − = (cid:18) p (cid:114) | µ w µ v | · z − (cid:19) (cid:32) p − ∑ i = (cid:18) p (cid:114) | µ w µ v | · z (cid:19) i (cid:33) . The second factor is irreducible because there is an automorphism in Q [ x ] (given by z (cid:55)→ p (cid:113) | µ v µ w | z )that transforms it into the p -th cyclotomic polynomial, which is irreducible as p is prime. Hence,the biggest irreducible factor of this polynomial has degree p − = min ( p , ( p − ) − p + ) andthe bound of Theorem 28 is tight. Lemma 41 (Complexity of Algorithm 4) . Given a binary form f = ∑ Di = (cid:0) Di (cid:1) a i x i y D − i of degreeD, Algorithm 4 computes P v and P w in O ( M ( D ) · log ( D )) arithmetic operations.Proof. The complexity of the algorithm is the complexity of computing the rows ( i + ) , i and ( i − ) of the Extended Euclidean algorithm between ∑ i = a i x i and x D + , where the i -th row is thefirst row i such that deg ( R i ) < D + (Lemma 21). This can be done using the Half-GCD algorithm ,34hich computes these rows in O ( M ( D ) · log ( D )) . For a detailed reference of how this algorithmworks see Bostan et al. (2017, Chapter 6.3) or Gathen and Gerhard (2013, Chapter 11). Lemma 42 (Complexity of computing Q ) . Given the kernel polynomials P v and P w from Propo-sition 7, we compute a square-free polynomial Q µ : = P µ · P v + P w such that the maximal degreeof its irreducible factors is bounded by Theorem 28 in O ( M ( D ) · log ( D )) .Proof. To compute the vector µ , we choose randomly N − N + K [ x , y ] and weproceed as in Lemma 25. The complexity bound is due to multi-point evaluation and interpola-tion of a univariate polynomial (Gathen and Gerhard, 2013, Chapter 10). Theorem 43.
When the decomposition is unique, that is when the rank is N + , then Algo-rithm 3 computes deterministically a symbolic decomposition (Proposition-Definition 36) of abinary form in O ( M ( D ) log ( D )) .When the decomposition is not unique, that is when the rank is N + and N < N , then Algo-rithm 3 is a Monte Carlo algorithm that computes a symbolic decomposition of a binary form inO ( M ( D ) log ( D )) .Proof. The first step of the algorithm, in both cases, is to compute the kernel polynomials P v and P w of Proposition 7 using Algorithm 4. By Lemma 41, we compute them deterministically in O ( M ( D ) · log ( D )) .If P v is square-free, which means that the decomposition is unique, then Q = P v . Otherwise, instep 2, we need to choose some random values to construct the square-free polynomial Q (fromthe kernel polynomials P v and P w ) in O ( M ( D ) · log ( D )) as in Lemma 42. This is the step thatmakes the algorithm a Monte Carlo one, as we might fail to produce a square-free polynomial Q .In both cases, at step 3 we compute the rational function that describes the solution of thesystem in Equation (5) in O ( M ( D ) · log ( D )) (Lemma 35). At step 4 of the algorithm, we returnthe decomposition.We can bound the probability of error of Algorithm 3 using Proposition 29, which bounds thenumber of bad values that lead us to a non square-free polynomial Q . Moreover, we can introducea Las Vegas version of Algorithm 3 by checking if the values that we choose to construct apolynomial Q result indeed a square-free polynomial. We can do this check in O ( M ( D ) · log ( D )) ,by computing the GCD between the Q and its derivatives.35 emark 44. If we want to output an approximation of the terms of the minimal decomposi-tion, with a relative error of − ε , we can use Pan’s algorithm (Pan, 2002) (McNamee and Pan,2013, Theorem 15.1.1) to approximate the roots of Q. In this case the complexity becomesO (cid:0) D log ( D ) (cid:0) log ( D ) + log ( ε ) (cid:1)(cid:1) .4.3. Bit complexity Let f ∈ Z [ x , y ] be a binary form as in Equation (1), of degree D and let τ be the maxi-mum bitsize of the coefficients a i . We study the bit complexity of computing suitable approx-imations of the α j ’s, β j ’s, and λ j ’s of Equation (3), say (cid:101) α j , (cid:101) β j and (cid:101) λ j respectively, that in-duce an approximate decomposition correct up to (cid:96) bits. That is a decomposition such that (cid:107) f − ∑ j (cid:101) λ j ( (cid:101) α j x + (cid:101) β j y ) D (cid:107) ∞ ≤ − (cid:96) . We need to estimate an upper bound on the number of bits thatare necessary to perform all the operations of the algorithm.The first step of the algorithm is to compute P v and P w , via the computation of three rowsof the Extended GCD of two polynomials of degree D and D + τ . This can be achieved in (cid:101) O B ( D τ ) bit operations (Gathen and Gerhard, 2013, Corol-lary 11.14.B), and the maximal bit size of P v and P w is (cid:101) O ( D τ ) . We check if P v is a square-freepolynomial in (cid:101) O ( D τ ) , via the computation of the GCD of P v ( x , ) and its derivative (Gathen andGerhard, 2013, Corollary 11.14.A), and by checking if y divides it.If P v is square-free polynomial, then Q = P v . If P v is not square-free, then we can compute Q by assigning values to the coefficients of P µ . We assume that y does not divide P w , if thisdoes not hold, we replace P w by the kernel polynomial x N − N P v + P w , which is coprime to P v ,and so not divisible by y , as P v and the original P w are coprime (Proposition 7). We set all thecoefficients of P µ to zero, except the constant term. Then Q = µ y N − N P v + P w . Now we haveto choose µ so that Q is square-free. As y N − N P v and P w are coprime, there are at most 2 D + µ such that Q is not square-free (Corollary 23), thus at least one of thefirst 2 D + (cid:101) O B ( D τ ) and so the overall cost is (cid:101) O B ( D τ ) .Let σ = (cid:101) O ( D τ ) be the maximal bit size of Q . By Remark 37, we can assume that y does notdivide Q , consider y = Q as an univariate polynomial.Let { α j } j be the roots of Q . We isolate them in (cid:101) O B ( D σ ) (Pan, 2002). For the (aggregate)separation bound of the roots it holds that − lg ∏ j ∆ ( α j ) = O ( D σ + D lg ( D )) . We approximate36ll the roots up to accuracy 2 − (cid:96) in (cid:101) O B ( D σ + D (cid:96) ) (Pan and Tsigaridas, 2017a). That is, wecompute absolute approximations of α j , say (cid:101) α j , such that | α j − (cid:101) α j | ≤ − (cid:96) .The next step consists in solving the (transposed) Vandermonde system, V ( (cid:101) α ) T λ = a , where V ( (cid:101) α ) is the Vandermonde matrix we construct with the roots of Q , λ is a vector contains thecoefficients of decomposition, and a is a vector containing the coefficients of F , see also Equa-tion (5). We know the entries of V ( (cid:101) α ) up to (cid:96) bits. Therefore, we can compute the elementsof the solution vector λ with an absolute approximation correct up to (cid:96) = (cid:96) − O ( D lg ( D ) σ − lg ∏ j ∆ ( α j )) = (cid:96) − O ( D lg ( D ) σ ) bits (Pan and Tsigaridas, 2017b, Theorem 29). That is, wecompute (cid:101) λ j ’s such that | λ j − (cid:101) λ j | ≤ − (cid:96) . At this point we have obtained the approximate decom-position (cid:101) f ( x , y ) : = r ∑ j = (cid:101) λ j ( (cid:101) α j x + y ) D . To estimate the accuracy of (cid:101) f we need to expand the approximate decomposition and con-sider it as a polynomial in x . We do not actually perform this operation; we only estimate theaccuracy as if we were. First, we expand each ( (cid:101) α j x + y ) D . This results polynomials with coeffi-cients correct up (cid:96) = (cid:96) − O ( D σ ) = (cid:96) − O ( D lg ( D ) σ ) − O ( D σ ) = (cid:96) − O ( D lg ( D ) σ ) bits (Panand Tsigaridas, 2017b, Lemma 19). Next, we multiply each such polynomial with (cid:101) λ i , and wecollect the coefficients for the various powers of x . Each coefficient is the sum of r ≤ D terms.The last two operations do not affect, asymptotically, the precision. Therefore, the polynomial (cid:101) f = ∑ rj = (cid:101) λ j ( (cid:101) α j x + ( − (cid:101) α j t ) y ) D that corresponds to the approximate decomposition has an ab-solute approximation such that (cid:107) f − (cid:101) f (cid:107) ≤ − (cid:96) + O ( D lg ( D ) σ ) . To achieve an accuracy of 2 − (cid:96) in thedecomposition, such that (cid:107) f − (cid:101) f (cid:107) ≤ − (cid:96) , we should choose (cid:96) = (cid:96) + O ( D lg ( D ) σ ) . Thus, all thecomputations should be performed with precision of (cid:96) + O ( D lg ( D ) σ ) bits. The bit complexity ofcomputing the decomposition of f up to (cid:96) bits is dominated by the solving and refining processand it is (cid:101) O B ( D (cid:96) + D σ ) . If we substitute the value for σ , then we arrive at the complexity boundof (cid:101) O B ( D (cid:96) + D + D τ ) . Theorem 45.
Let f ∈ Z [ x , y ] be a homogeneous polynomial of degree D and maximum coefficientbitsize τ . We compute an approximate decomposition of accuracy − (cid:96) in (cid:101) O B ( D (cid:96) + D + D τ ) bitoperations. cknowledgments The authors thank George Labahn and Vincent Neiger for pointing out important referencesto derandomize the computation of P v and P w . The authors thank the anonymous reviewers fortheir helpful comments. Mat´ıas Bender thanks Joos Heintz for supervising his Master’s thesis.The authors are partially supported by ANR JCJC GALOP (ANR-17-CE40-0009), the PGMOgrant ALMA and the PHC GRAPE. References
Ansola, M., Daz-Cano, A., Zurro, M. A., Jun. 2017. Semialgebraic decomposition of real binary forms of a given degree’sspace. arXiv:1706.04207 [math]ArXiv: 1706.04207.URL http://arxiv.org/abs/1706.04207
Bajaj, C., 1988. The algebraic degree of geometric optimization problems. Discrete & Computational Geometry 3 (1),177–191.Bender, M. R., Faug`ere, J.-C., Perret, L., Tsigaridas, E., 2016. A superfast randomized algorithm to decompose binaryforms. In: Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation. ACM, pp.79–86.Bernardi, A., Daleo, N. S., Hauenstein, J. D., Mourrain, B., Dec. 2017. Tensor decomposition and homotopy continua-tion. Differential Geometry and its Applications 55, 78–105.URL
Bernardi, A., Gimigliano, A., Ida, M., 2011. Computing symmetric rank for symmetric tensors. Journal of SymbolicComputation 46 (1), 34–53.Blekherman, G., 2015. Typical real ranks of binary forms. Foundations of Computational Mathematics 15 (3), 793–798.Boij, M., Carlini, E., Geramita, A., 2011. Monomials as sums of powers: the real binary case. Proceedings of theAmerican Mathematical Society 139 (9), 3039–3043.Bostan, A., Chyzak, F., Giusti, M., Lebreton, R., Lecerf, G., Salvy, B., Schost, ´E., 2017. Algorithmes efficaces en calculformel. published by the Authors.Brachat, J., Comon, P., Mourrain, B., Tsigaridas, E., 2010. Symmetric tensor decomposition. Linear Algebra and itsApplications 433 (11), 1851–1872.Cabay, S., Choi, D.-K., 1986. Algebraic computations of scaled Pad´e fractions. SIAM Journal on Computing 15 (1),243–270.Carlini, E., Catalisano, M. V., Chiantini, L., Geramita, A. V., Woo, Y., 2018. Symmetric tensors: rank, Strassen’s conjec-ture and e-computability. Annali della Scuola Normale Superiore di Pisa. Classe di scienze 18 (1), 363–390.URL https://dialnet.unirioja.es/servlet/articulo?codigo=6389435
Carlini, E., Catalisano, M. V., Oneto, A., Jul. 2017. Waring loci and the Strassen conjecture. Advances in Mathematics314, 630–662.URL assels, J. W. S., 1986. Local fields. Vol. 3. Cambridge University Press Cambridge.Comas, G., Seiguer, M., 2011. On the rank of a binary form. Foundations of Computational Mathematics 11 (1), 65–78.Comon, P., 2014. Tensors: a brief introduction. IEEE Signal Processing Magazine 31 (3), 44–53.Comon, P., Golub, G., Lim, L.-H., Mourrain, B., 2008. Symmetric tensors and symmetric tensor rank. SIAM Journal onMatrix Analysis and Applications 30 (3), 1254–1279.Comon, P., Mourrain, B., 1996. Decomposition of quantics in sums of powers of linear forms. Signal Processing 53 (2),93–107.Draisma, J., Horobet¸, E., Ottaviani, G., Sturmfels, B., Thomas, R. R., 2016. The euclidean distance degree of an algebraicvariety. Foundations of computational mathematics 16 (1), 99–149.D¨ur, A., Oct 1989. On computing the canonical form for a binary form of odd degree. Journal of Symbolic Computation8 (4), 327–333.Garc´ıa-Marco, I., Koiran, P., Pecatte, T., 2017. Reconstruction algorithms for sums of affine powers. In: Proceedings ofthe 2017 ACM on International Symposium on Symbolic and Algebraic Computation. ISSAC ’17. ACM, New York,NY, USA, pp. 317–324.Gathen, J. v. z., Gerhard, J., 2013. Modern computer algebra. Cambridge University Press, Cambridge.Giesbrecht, M., Kaltofen, E., Lee, W.-s., 2003. Algorithms for computing sparsest shifts of polynomials in power, cheby-shev, and pochhammer bases. Journal of Symbolic Computation 36 (3-4), 401–424.Giesbrecht, M., Roche, D. S., 2010. Interpolation of shifted-lacunary polynomials. Computational Complexity 19 (3),333–354.Gundelfinger, S., 1887. Zur theorie der bin¨aren formen. Journal f¨ur die reine und angewandte Mathematik 100, 413–424.Hauenstein, J. D., Oeding, L., Ottaviani, G., Sommese, A. J., 2016. Homotopy techniques for tensor decomposition andperfect identifiability. Journal fr die reine und angewandte Mathematik (Crelles Journal) 0 (0).URL Heinig, G., Rost, K., 1984. Algebraic methods for Toeplitz-like matrices and operators. Springer.Helmke, U., 1992. Waring’s problem for binary forms. Journal of pure and applied algebra 80 (1), 29–45.Iarrobino, A., Kanev, V., 1999. Power sums, Gorenstein algebras, and determinantal loci. Springer.Kaltofen, E., Yagati, L., 1989. Improved sparse multivariate polynomial interpolation algorithms. In: Symbolic andAlgebraic Computation. Springer, pp. 467–474.Kung, J. P., 1990. Canonical forms of binary forms: variations on a theme of Sylvester. Institute for Mathematics and ItsApplications 19, 46.Kung, J. P., Rota, G.-C., 1984. The invariant theory of binary forms. Bulletin of the American Mathematical Society10 (1), 27–85.Landsberg, J. M., 2012. Tensors: geometry and applications. American Mathematical Society.McNamee, J. M., Pan, V. Y., 2013. Numerical methods for roots of polynomials (II). Elsevier.Nie, J., Ranestad, K., Sturmfels, B., 2010. The algebraic degree of semidefinite programming. Mathematical Program-ming 122 (2), 379–405.Oeding, L., Ottaviani, G., 2013. Eigenvectors of tensors and algorithms for waring decomposition. Journal of SymbolicComputation 54, 9–35. an, V., 2001. Structured matrices and polynomials: unified superfast algorithms. Springer.Pan, V. Y., 2002. Univariate polynomials: Nearly optimal algorithms for numerical factorization and root-finding. Journalof Symbolic Computation 33 (5), 701 – 733.Pan, V. Y., Tsigaridas, E., 2017a. Accelerated approximation of the complex roots and factors of a univariate polynomial.Theoretical Computer Science 681, 138–145.Pan, V. Y., Tsigaridas, E. P., 2017b. Nearly optimal computations with structured matrices. Theoretical Computer Science681, 117–137.Reznick, B., 1996. Homogeneous polynomial solutions to constant coefficient pde’s. Advances in Mathematics 117 (2),179–192.Reznick, B., 2013a. On the length of binary forms. In: Quadratic and Higher Degree Forms. Springer, pp. 207–232.Reznick, B., 2013b. Some new canonical forms for polynomials. Pacific Journal of Mathematics 266 (1), 185–220.Reznick, B., Tokcan, N., 2017. Binary forms with three different relative ranks. Proceedings of the American Mathemat-ical Society 145 (12), 5169–5177.Reznick, B. A., 1992. Sums of even powers of real linear forms. Vol. 463. American Mathematical Society.Sylvester, J. J., 1851. On a remarkable discovery in the theory of canonical forms and of hyperdeterminants. The London,Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2 (12), 391–410.Sylvester, J. J., 1904. An essay on canonical forms, supplement to a sketch of a memoir on elimination, transformationand canonical forms. In: The collected papers of James Joseph Sylvester. Vol. 1. Cambridge University Press, pp.203–216, the original paper dates back to 1851.an, V., 2001. Structured matrices and polynomials: unified superfast algorithms. Springer.Pan, V. Y., 2002. Univariate polynomials: Nearly optimal algorithms for numerical factorization and root-finding. Journalof Symbolic Computation 33 (5), 701 – 733.Pan, V. Y., Tsigaridas, E., 2017a. Accelerated approximation of the complex roots and factors of a univariate polynomial.Theoretical Computer Science 681, 138–145.Pan, V. Y., Tsigaridas, E. P., 2017b. Nearly optimal computations with structured matrices. Theoretical Computer Science681, 117–137.Reznick, B., 1996. Homogeneous polynomial solutions to constant coefficient pde’s. Advances in Mathematics 117 (2),179–192.Reznick, B., 2013a. On the length of binary forms. In: Quadratic and Higher Degree Forms. Springer, pp. 207–232.Reznick, B., 2013b. Some new canonical forms for polynomials. Pacific Journal of Mathematics 266 (1), 185–220.Reznick, B., Tokcan, N., 2017. Binary forms with three different relative ranks. Proceedings of the American Mathemat-ical Society 145 (12), 5169–5177.Reznick, B. A., 1992. Sums of even powers of real linear forms. Vol. 463. American Mathematical Society.Sylvester, J. J., 1851. On a remarkable discovery in the theory of canonical forms and of hyperdeterminants. The London,Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2 (12), 391–410.Sylvester, J. J., 1904. An essay on canonical forms, supplement to a sketch of a memoir on elimination, transformationand canonical forms. In: The collected papers of James Joseph Sylvester. Vol. 1. Cambridge University Press, pp.203–216, the original paper dates back to 1851.