Improved Polynomial Remainder Sequences for Ore Polynomials
Maximilian Jaroschek
Research Institute for Symbolic Computation, Johannes Kepler University, A-4040 Linz, Austria
Abstract
Polynomial remainder sequences contain the intermediate results of the Euclidean algorithm when applied to (non-)commutative polynomials. The running time of the algorithm is dependent on the size of the coefficients of the remainders. Different ways have been studied to make these as small as possible. The subresultant sequence of two polynomials is a polynomial remainder sequence in which the size of the coefficients is optimal in the generic case, but when taking the input from applications, the coefficients are often larger than necessary. We generalize two improvements of the subresultant sequence to Ore polynomials and derive a new bound for the minimal coefficient size. Our approach also yields a new proof for the results in the commutative case, providing a new point of view on the origin of the extraneous factors of the coefficients.
Keywords:
Ore polynomials, greatest common right divisor, polynomial remainder sequences, subresultants
MSC: 11A05, 68W30
1. Introduction
When given a system of differential equations, one might be interested in finding the common solutions of these equations. In order to do so, one can compute another differential equation whose solution space is the intersection of the solution spaces of the equations in the original system. One way to do this is to translate the equations into operators and use the Euclidean algorithm to compute their greatest common right divisor. The solution space of the greatest common right divisor then consists of the desired elements.

Similarly, given a sequence of numbers (t_n)_{n∈{0,1,...}} that satisfies two different recurrence equations, the Euclidean algorithm is used in applications to find a reasonable candidate for the least order equation of which (t_n)_{n∈{0,1,...}} is a solution.

Carrying out Euclid's algorithm applied to two polynomials over a domain D usually requires a prediction of the denominators that might appear in the coefficients of the remainders in order to bypass costly computations in the quotient field of D. While such a prediction can be done easily, the growth of the coefficients of the remainders can be tremendous, which might result in an unnecessarily high running time. This can be avoided by dividing out possible content of the remainders to make their coefficients as small as possible. For commutative polynomials as well as for non-commutative operators, different ways have been extensively studied to find factors of the content in the sequence of remainders without computing the GCD of the coefficients of each element of the sequence. Most notable in this respect are subresultant sequences, where the growth of the coefficients can be reduced from exponential to linear in the number of reduction steps in the Euclidean algorithm. When taking generic, randomly generated input, the coefficient size in the subresultant sequence is usually optimal, but when taking the input from applications in e.g. combinatorics or physics, the remainders still have non-trivial content in many cases.

For commutative polynomials, some ways are known to improve on subresultants. In this article we generalize two of these results to Ore polynomials and we also give a new proof for the commutative case that is based on the structure of subresultants as matrix determinants. Furthermore, we use these results to derive a new bound for the coefficient size of the content-free remainders.

In Section 2 the basic notions of Ore polynomial rings are stated. A precise definition and examples of polynomial remainder sequences are given in Section 3 and further details on the subresultant sequence are then presented in Section 4. The main results of this article can be found in Sections 5 and 6, where we first describe how additional content in the subresultant sequence can emerge and then use these results to improve on the Euclidean algorithm and to get a new bound for the size of the coefficients.

Email address: [email protected] (Maximilian Jaroschek). Supported by the Austrian Science Fund (FWF) grant Y464-N18.

Preprint submitted to Journal of Symbolic Computation, January 15, 2018
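Before the formal development, the GCRD computation described above can be sketched for the shift case. The following is a toy implementation, not the algorithm of this article: it represents recurrence operators naively as coefficient lists over Q(n) and performs right division with no control of coefficient growth (providing that control is the point of the PRS machinery developed below). The operators A and B are hypothetical examples built from the common right factor S − (n+1), the minimal annihilating operator of n!.

```python
import sympy as sp

n = sp.symbols('n')

def shift(p, k):
    # sigma^k for the shift case: n -> n + k
    return p.subs(n, n + k)

def right_rem(A, B):
    # Remainder of right division.  A, B are coefficient lists over Q(n),
    # A[i] being the coefficient of S^i; uses S^k b(n) = b(n + k) S^k.
    A = list(A)
    dB = len(B) - 1
    while len(A) - 1 >= dB and any(a != 0 for a in A):
        k = len(A) - 1 - dB
        c = sp.cancel(A[-1] / shift(B[-1], k))
        for j, b in enumerate(B):
            A[j + k] = sp.cancel(A[j + k] - c * shift(b, k))
        while len(A) > 1 and A[-1] == 0:
            A.pop()
    return A

def gcrd(A, B):
    while not (len(B) == 1 and B[0] == 0):
        A, B = B, right_rem(A, B)
    return [sp.cancel(a / A[-1]) for a in A]   # normalized to be monic

# A = (S - (n+2))(S - (n+1)) and B = (S + 1)(S - (n+1)) share the right
# factor G = S - (n+1).
A = [sp.expand((n + 1)*(n + 2)), -(2*n + 4), sp.Integer(1)]
B = [-(n + 1), -(n + 1), sp.Integer(1)]
G = gcrd(A, B)
assert G[1] == 1 and sp.expand(G[0] + n + 1) == 0    # G = S - (n+1)
```

The division step works over the quotient field Q(n); avoiding exactly these fraction computations while keeping coefficients small is what the α_i and β_i of a PRS are for.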
2. Preliminaries
The algebraic framework for the different kinds of operators that we consider here are Ore polynomial rings, which were introduced by Øystein Ore in the 1930s. We provide an overview of some basic facts that suffice for our needs and that can be found in Ore (1933) and Bronstein and Petkovšek (1996).
Definition 1.
Let D be a commutative domain, D[x] the set of univariate polynomials over D and let σ : D → D be an injective endomorphism.

1. A map δ : D → D is called a pseudo-derivation w.r.t. σ if for any a, b ∈ D

   δ(a + b) = δ(a) + δ(b)  and  δ(ab) = σ(a)δ(b) + δ(a)b.
2. Suppose that δ is a pseudo-derivation w.r.t. σ. We define the Ore polynomial ring (D[x], +, ·) with componentwise addition and the unique distributive and associative extension of the multiplication rule

   xa = σ(a)x + δ(a)  for any a ∈ D,

to arbitrary polynomials in D[x]. To clearly distinguish this ring from the standard polynomial ring over D, we denote it by D[x; σ, δ].

Elements of an Ore polynomial ring are called operators and are denoted by capital letters. We refer to the leading coefficient of an operator A as lc(A), to the trailing coefficient (the coefficient of x^0) of A as tc(A) and to the polynomial degree of A in x as the order d_A of A.

Example 1.
Commonly used Ore polynomial rings are:

1. D[x] = D[x; 1, 0], the ring of ordinary commutative polynomials over D.
2. C(y)[D; 1, d/dy], the ring of linear ordinary differential operators.
3. If s_n : C(n) → C(n) is the forward shift in n, i.e. s_n(a(n)) = a(n + 1), then C(n)[S; s_n, 0] is the ring of linear ordinary recurrence operators.
4. If σ : C(q)(y) → C(q)(y) is the q-shift in y, i.e. σ(a(y)) = a(qy), then C(q)(y)[J; σ, δ_q] with the q-difference quotient δ_q(a(y)) = (a(qy) − a(y))/((q − 1)y) is the ring of Jackson's q-derivative operators.

In this article, we consider the following situation: Let D be a Euclidean domain with degree function deg and let D[x; σ, δ] be an Ore polynomial ring where σ is an automorphism. For any operator A ∈ D[x; σ, δ], we define ‖A‖ to be the maximal coefficient degree of A. The content cont(A) of A is the greatest common divisor of all the coefficients of A and it is defined to be lc(A) if D is a field. It is possible to extend D[x; σ, δ] to an Ore polynomial ring over the quotient field K of D by setting σ(a^{−1}) = σ(a)^{−1} and δ(a/b) = (bδ(a) − aδ(b))/(bσ(b)) for a, b ∈ D, b ≠ 0 (see Li (1996), Proposition 2.2.1). We will denote this ring by K[x; σ, δ] without making it explicit that the automorphism and the pseudo-derivation are extensions of the functions used in D[x; σ, δ]. It is well known that for any two operators A, B ∈ K[x; σ, δ], there exists a greatest common right divisor (GCRD) and it can be made unique (up to units in D) by setting gcrd(A, B) to a nonzero K-left multiple of any GCRD of A and B that has coefficients in D but does not have any content in D.

Throughout this article, we let A, B, G ∈ D[x; σ, δ], B ≠ 0 be such that d_A ≥ d_B and G is the GCRD of A and B.

Definition 2.
For a ∈ D and n ∈ N, σ^n(a) is obtained by applying σ n times to a, and σ^{−n}(a) := (σ^{−1})^n(a), where σ^{−1} is the inverse map of σ. The nth σ-factorial of a ∈ D is defined as the product

   a^[n] := ∏_{i=0}^{n−1} σ^i(a).
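For the shift case σ(a(n)) = a(n + 1), the σ-factorial can be illustrated with a short sympy sketch (the function names below are ours, not from the article):

```python
import sympy as sp

n = sp.symbols('n')

def sigma(p, k):
    # sigma^k for the shift case: n -> n + k (k may be negative,
    # since sigma is an automorphism)
    return p.subs(n, n + k)

def sigma_factorial(p, m):
    # the m-th sigma-factorial p^[m] = p * sigma(p) * ... * sigma^(m-1)(p)
    return sp.prod([sigma(p, j) for j in range(m)])

# for a = n this recovers a rising factorial: n^[3] = n(n+1)(n+2)
assert sp.expand(sigma_factorial(n, 3) - n*(n + 1)*(n + 2)) == 0
# sigma^(-k) undoes sigma^k
assert sigma(sigma(n**2 + 1, 5), -5) == n**2 + 1
```

For σ = 1 (the commutative case) the σ-factorial degenerates to an ordinary power a^n.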
3. Polynomial Remainder Sequences for Ore Polynomials
The greatest common right divisor of A and B can be computed by using the Euclidean algorithm. If we multiply any intermediate result that appears during the execution of the algorithm by an element of K \ {0}, the final output will be a K-left multiple of G. This amount of freedom allows us to optimize the running time by choosing these factors appropriately. In order to be able to formulate improvements of this kind, the notion of polynomial remainder sequences has been introduced. Each element of such a sequence corresponds to a remainder computed in one iteration of the Euclidean algorithm.

Definition 3.
Let (R_i)_{i∈{0,...,ℓ+1}} and (Q_i)_{i∈{1,...,ℓ}} be sequences in K[x; σ, δ], (d_i)_{i∈{0,...,ℓ}} a sequence in N and let (α_i)_{i∈{1,...,ℓ}} and (β_i)_{i∈{1,...,ℓ}} be sequences in K such that

   R_0 = A,  R_1 = B,  d_i = d_{R_i},
   α_i R_{i−1} = Q_i R_i + β_i R_{i+1},  d_{i+1} < d_i,

and all R_i are nonzero except for R_{ℓ+1}. We call the sequence (R_i)_{i∈{0,...,ℓ+1}} a polynomial remainder sequence (PRS) of A and B.

A PRS of A and B is uniquely determined by specifying the α_i and β_i. Whenever we talk about a PRS (R_i)_{i∈{0,...,ℓ+1}}, we allow ourselves to refer to the related sequences (Q_i)_{i∈{1,...,ℓ}}, (d_i)_{i∈{0,...,ℓ}} etc. as in the above definition without explicitly introducing them.

In order to efficiently compute G, one wants to make sure that all the remainders are elements of D[x; σ, δ] rather than K[x; σ, δ]. This can be achieved by choosing the α_i in a way such that the quotient of any two consecutive remainders has coefficients in D. To this extent, for 1 ≤ i ≤ ℓ set

   α_i := lc(R_i)^[d_{i−1} − d_i + 1]

and division with remainder yields Q_i and R_{i+1} in D[x; σ, δ] with:

   α_i R_{i−1} = Q_i R_i + R_{i+1},  d_{i+1} < d_i.  (1)

We call pquo(R_{i−1}, R_i) := Q_i the pseudo-quotient of R_{i−1} and R_i and prem(R_{i−1}, R_i) := R_{i+1} the pseudo-remainder of R_{i−1} and R_i.

The α_i are used to make sure that computations can be done in D[x; σ, δ] and the β_i control the coefficient growth in a PRS. We want β_i to contain as many factors of the content of R_{i+1} as possible without much computational overhead needed to obtain these factors.

Example 2.
Set α_i = lc(R_i)^[d_{i−1} − d_i + 1] and

1. β_i = 1. This is called the pseudo PRS of A and B. Here, no content will be divided out.
2. β_i = cont(R_{i+1}). This is called the primitive PRS of A and B. The coefficients of the remainders will be as small as possible, but it is necessary to compute the GCD of the coefficients of each remainder in order to get the β_i.
3. The subresultant PRS of A and B (see Section 4) is given by

   β_i = −σ(ψ_1)^[d_0 − d_1]                              if i = 1,
   β_i = −lc(R_{i−1}) σ(ψ_i)^[d_{i−1} − d_i]              if 2 ≤ i ≤ ℓ,

   where

   ψ_i = −1                                               if i = 1,
   ψ_i = (−lc(R_{i−1}))^[d_{i−2} − d_{i−1}] / σ(ψ_{i−1})^[d_{i−2} − d_{i−1} − 1]   if 2 ≤ i ≤ ℓ.

   In this PRS, the content that is generated systematically by pseudo-remaindering will be cleared from the remainders.

While in all of the above PRSs the remainders are elements of D[x; σ, δ], the degrees of the coefficients differ drastically, as illustrated in the following example. It can be shown that the degrees of the coefficients in the pseudo PRS grow exponentially with i, which renders this PRS practically useless. The growth in the subresultant and primitive PRS is linear in i.

Example 3.
Assume we are given a finite sequence of rational numbers that comes from a sequence (t_n)_{n∈{0,1,...}} which admits a linear recurrence equation with polynomial coefficients. If the amount of data is sufficiently large, we are able to guess recurrence operators of some fixed order and maximal coefficient degree that annihilate (t_n)_{n∈{0,1,...}}, i.e. the operators applied to the sequence give zero. (For details on guessing and a Mathematica implementation of the method, see Kauers (2009).) For example, consider

   t_n = ∑_{k=0}^{n} ( binom(n + 4, k) + (2n − k)! + k ).

Guessing yields operators A and B in Q[n][S; s_n, 0] of orders d_A = 14, d_B = 13 and maximal coefficient degrees ‖A‖ = 5, ‖B‖ = 6 resp. Both operators annihilate the given sequence, but none of them is of minimal order. To get an annihilating minimal order operator, we compute the GCRD of A and B in Q(n)[S; s_n, 0]. Table 1 shows the maximal coefficient degrees of the remainders in different PRSs of A and B.

   PRS          | R_2 | R_3 | R_4 | R_5 | R_6 | R_7 | R_8
   pseudo       | 11  | 22  | 49  | 114 | 271 | 650 | 1565
   subresultant | 11  | 16  | 21  | 26  | 31  | 36  | 41
   primitive    |  9  | 12  | 15  | 18  | 21  | 24  | 21

Table 1: Maximal coefficient degrees for different PRSs.
The example confirms that the degrees in the pseudo PRS grow exponentially, whereas the subresultant PRS and the primitive PRS show linear growth. At the same time, the degrees in the subresultant PRS are not as small as possible. This behavior is typical not only for this pair A and B, but in general for operators coming from applications. For randomly generated operators, the subresultant PRS and the primitive PRS usually coincide. Our goal is to understand the difference between randomly generated input and the operators A and B as above and to identify the source of some (and most often all) of the additional content in the subresultant PRS. To make use of this knowledge, we will then adjust the formulas for α_i and β_i from Example 2.3 so that we get a PRS with smaller degrees without having to compute the content of every remainder.
4. Subresultant Theory for Ore Polynomials
For commutative polynomials, the theory of subresultants was intensively studied by Brown (1978), Brown and Traub (1971), Collins (1967) and Loos (1982). The main idea is to translate relations between the elements of a PRS like the Bézout relation or the (pseudo-)remainder formula into linear algebra. A central tool in this context is the Sylvester matrix, which, roughly speaking, contains the coefficients of all the monomial multiples of the input polynomials that are necessary to compute remainders of any possible degree. The remainders in the subresultant sequence turn out to be polynomials whose coefficients are determinants of certain submatrices of this matrix. Li (1998) generalized these results to Ore polynomials.

Figure 1: The form of the Sylvester matrix of A and B. The first d_B rows hold the coefficients lc(·), ..., tc(·) of x^{d_B−1}A, ..., A, the remaining d_A rows those of x^{d_A−1}B, ..., B. Entries outside of the gray area are zero.

The Sylvester matrix Syl(A, B) is defined to be the matrix of size (d_A + d_B) × (d_A + d_B) with the following entries: If 1 ≤ i ≤ d_B and 1 ≤ j ≤ d_A + d_B, the entry in the ith row and jth column is the (d_A + d_B − j)th coefficient of x^{d_B − i} A. If d_B + 1 ≤ i ≤ d_A + d_B and 1 ≤ j ≤ d_A + d_B, the entry in the ith row and jth column is the (d_A + d_B − j)th coefficient of x^{d_A − (i − d_B)} B.

For i, j ∈ N with 0 ≤ j ≤ i ≤ d_B, the matrix Syl_{i,j}(A, B) is obtained from Syl(A, B) by removing the rows 1 to i, the rows d_B + 1 to d_B + i, the columns 1 to i and the last i + 1 columns except for the column d_A + d_B − j.

Figure 2: Sketch of Syl_{i,j}(A, B). The lines indicate the removed rows and columns. The column under the dotted line is added again.
Definition 4.
For 0 ≤ i ≤ d_B, the polynomial

   sres_i(A, B) := ∑_{j=0}^{i} det(Syl_{i,j}(A, B)) x^j

is called the ith (polynomial) subresultant of A and B. If the order of sres_i(A, B) is strictly less than i, the ith subresultant of A and B is called defective, otherwise it is called regular. The subresultant sequence of A and B of the first kind is the subsequence of

   (A, B, sres_{d_B−1}(A, B), sres_{d_B−2}(A, B), ..., sres_0(A, B), 0)

consisting of A, B, the trailing zero and all nonzero sres_i(A, B) for which sres_{i+1}(A, B) is regular.
Theorem 1 (Li (1998)).
The polynomial remainder sequence given by α_i and β_i as in Example 2.3, the subresultant PRS, is equal to the subresultant sequence of A and B of the first kind.
5. Identifying Content of Polynomial Subresultants
The representation of subresultants in terms of determinants of the matrices Syl_{i,j}(A, B) makes it possible to identify content by exploiting the special form of these matrices as well as the correspondence between rows of the Sylvester matrix and monomial multiples of A and B. For the case of commutative polynomials, some results are known for detecting such additional content. We generalize two results to the Ore setting. The first (Theorem 2) is a generalization of an observation mentioned in Brown (1978), which carries over quite easily to the Ore case. The second (Theorem 4) usually performs better in terms of coefficient size of the remainders, but a heuristic argument is necessary to use it algorithmically (see Section 6).

Theorem 2. With t := gcd(σ^{d_B−1}(lc(A)), σ^{d_A−1}(lc(B))) and γ_i := σ^{−i}(t) for 0 ≤ i ≤ d_B − 1, we get:

   γ_i | cont(sres_i(A, B)).

Proof.
Let i be fixed. The coefficients of sres_i(A, B) are the determinants of the matrices Syl_{i,j}(A, B) for 0 ≤ j ≤ i. The first column of all of these matrices is

   (σ^{d_B−1−i}(lc(A)), 0, ..., 0, σ^{d_A−1−i}(lc(B)), 0, ..., 0)^T.

Laplace expansion along this column proves the claim. □
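In the commutative case σ = 1, Theorem 2 can be checked directly with sympy. The polynomials below are an arbitrary example whose leading coefficients share the factor t = gcd(lc(A), lc(B)) = 3:

```python
import sympy as sp
from functools import reduce

x = sp.symbols('x')
# hypothetical operators with a common factor in their leading coefficients
A = 3*x**3 + x + 1
B = 3*x**2 + x

# commutative case: sigma = id, so t = gcd(lc(A), lc(B))
t = sp.gcd(sp.LC(A, x), sp.LC(B, x))
assert t == 3

# Theorem 2 predicts that t divides the content of every subresultant
for S in sp.subresultants(A, B, x)[2:]:
    content = reduce(sp.gcd, sp.Poly(S, x).all_coeffs())
    assert content % t == 0
```

For this input the subresultant PRS continues with a remainder of content 3 and the resultant 15, both divisible by t, in line with the Laplace expansion argument of the proof.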
Not all of the subresultants of A and B are in the subresultant PRS of A and B. To make use of Theorem 2 for a new PRS, we need a minor specialisation of the statement:

Corollary 1.
Let (R_i)_{i∈{0,...,ℓ+1}} be the subresultant PRS of A and B (not necessarily normal). If we choose

   t = gcd(σ^{d_B−1}(lc(A)), σ^{d_A−1}(lc(B))),  γ_2 = σ^{−d_B+1}(t)  and  γ_i = σ^{d_{i−2}−d_{i−1}}(γ_{i−1}) for 2 < i ≤ ℓ,

then γ_i | cont(R_i) for 2 ≤ i ≤ ℓ.

Proof.
Suppose R_i is the jth subresultant of A and B. Then, by the definition of the subresultant sequence of the first kind and Theorem 1, the (j + 1)st subresultant of A and B is regular. Because of this and the subresultant block structure (see Li (1998)), R_{i−1} is of order j + 1 and so j is equal to d_{i−1} − 1. By Theorem 2, the content of R_i is divisible by σ^{−d_{i−1}+1}(t). It is easy to see that σ^{−d_{i−1}+1}(t) is equal to γ_i. □

In the commutative case, a second source of additional content was determined, although this result is not widely known. The following theorem can be found in Knuth (1981):
Theorem 3.
Let A, B ∈ D[x] be such that the subresultant PRS of A and B is normal, i.e. d_{i−1} = d_i + 1 for 2 ≤ i ≤ ℓ, and let G be the GCD of A and B. Then

   lc(G)^{2(i−1)} | cont(R_i)  for 2 ≤ i ≤ ℓ.

A generalization of Theorem 3 to Ore polynomials is not straightforward, as Example 4 shows.
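A commutative instance of this phenomenon can be reproduced with sympy. The factor G and the cofactors below are arbitrary choices (not from the article); the first remainder of the subresultant PRS then shows the extraneous content lc(G)^2 = 25:

```python
import sympy as sp
from functools import reduce

x = sp.symbols('x')
G = 5*x + 1                                  # intended common factor, lc(G) = 5
A = sp.expand(G*(x**3 + 2*x + 1))            # hypothetical cofactors
B = sp.expand(G*(x**2 + x + 3))

R2 = sp.subresultants(A, B, x)[2]            # first remainder of the subresultant PRS
content = reduce(sp.gcd, sp.Poly(R2, x).all_coeffs())
assert content % sp.LC(G, x)**2 == 0         # extraneous factor lc(G)^2 = 25
prim = sp.primitive(R2)[1]
assert prim in (G, sp.expand(-G))            # the primitive part is G itself
```

Here R_2 = ±100(5x + 1): the primitive PRS would return 5x + 1 directly, while the subresultant PRS carries the factor lc(G)^2 (and a further integer factor) along.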
Example 4 (Example 3 cont.).
If we take A and B as in Example 3, then the leading coefficient of the GCRD of A and B is (n + 9)p(n), where p(n) is a polynomial of degree 17. The subresultant PRS of A and B turns out to be normal and R_2 is of order d_2 = 12. By Theorem 3, if the polynomials were elements of D[x], cont(R_2) would be divisible by lc(G)^2, and a naive translation of the theorem to the non-commutative case suggests divisibility by a polynomial of degree at least 36. The (monic) content of R_2, however, is only (n + 16)(n + 17), which is contained in, but not equal to, σ^7(lc(G))^[2].

Again in the commutative case, let Q_A, Q_B ∈ D[x] be such that A = Q_A G and B = Q_B G. Knuth (1981) proves Theorem 3 by showing that each quotient Q_i in the subresultant PRS of A and B is, up to a factor which is a power of lc(G), equal to a subresultant R̃_i of Q_A and Q_B. This approach is problematic for Ore polynomials, because there the Q_i's and the R̃_i's have coefficients in K and not necessarily in D. This means that even after showing that a quotient Q_i is a K-left multiple of some subresultant R̃_i of Q_A and Q_B, the left factor and the denominators in the coefficients of R̃_i might not be coprime and thus lead to cancellation. Therefore we will not only describe why in the non-commutative case only some factors of lc(G) appear as content, but we also present a new proof of Theorem 3 that makes it more explicit where the additional content comes from. Moreover, we won't require the remainder sequence to be normal.

In D[x], if A is a multiple of the primitive polynomial G, then their quotient will always have coefficients in D, and therefore, the leading coefficient of A contains all the factors of the leading coefficient of G. For Ore polynomials, this is not necessarily true, since the quotient of A and G might be an element of K[x; σ, δ] \ D[x; σ, δ].
Still, different left multiples of G in D[x; σ, δ] may share some common factors in their leading coefficients, as described in Lemma 1.

Lemma 1.
Let d_T ∈ N be fixed, let I ⊴ D[x; σ, δ] be a left ideal and let T be any element of I of order d_T such that, among all the operators of order d_T in I, its leading coefficient t is minimal with respect to the degree. Then t is independent of the choice of T (up to multiplication by units in D) and for any L ∈ I with d_L ≤ d_T we have

   σ^{d_L − d_T}(t) | lc(L).

Proof.
Assume there are T, L ∈ I for which the claim σ^{d_L−d_T}(t) | lc(L) does not hold. We let L′ = x^{d_T−d_L} L and get lc(L′) = σ^{d_T−d_L}(lc(L)), thus t ∤ lc(L′) by assumption. Division with remainder yields nonzero q, r ∈ D such that

   lc(L′) = qt + r,  deg(r) < deg(t).

Hence the operator L′ − qT is an element of I whose leading coefficient has degree less than deg(t). This contradicts the choice of T.

For the uniqueness, let T′ ∈ I be any other operator of order d_T with minimal leading coefficient degree. By what was just shown above, we get lc(T′) | t and t | lc(T′), so t and lc(T′) are associates. □

Definition 5.
Consider I, T and t from Lemma 1. The shift σ^{−d_T}(t) of the leading coefficient of T is called the essential part of I at order d_T. If there is no operator in I of some order n, the essential part of I at order n is defined to be 1.

Let L ∈ C[y][D; 1, d/dy] and I = I′ ∩ C[y][D; 1, d/dy], where I′ ⊴ C(y)[D; 1, d/dy] is the left ideal generated by L. We give an informal explanation of essential parts of I in terms of solutions of L, i.e. functions that are annihilated by L. Any non-removable singularity of a solution of L corresponds to a root of the leading coefficient of L, but not for every root of lc(L) does there have to be a solution with a non-removable singularity at that point. Any solution of L is also a solution of every operator in I, and it can happen that there are nonzero K-left multiples of L in I that have strictly smaller leading coefficient degree than L. If such a desingularized operator exists, it means that some of the roots of lc(L) can be removed by multiplying L with another operator from the left. These removable roots are called the apparent singularities of L. It is shown in Jaroschek (2013) that there exists a unique minimal (w.r.t. degree) essential part of I that appears in the essential parts of I at every order greater than d_L. This minimal essential part of I is a polynomial whose roots are exactly the non-apparent singularities of L, and it turns out that for each root of the essential part of I, there is at least one solution of L that does not admit an analytic continuation at that point. A more detailed description of desingularization and apparent singularities of differential equations can be found in Ince (1926). Further references and recent results on desingularization of Ore operators can be found in Chen et al. (2013).

Note that for commutative polynomials, by Gauß' Lemma, the essential part of a nonzero ideal at any order is equal to the leading coefficient of the primitive greatest common divisor of the ideal elements.

For the remaining part of this article, let I ⊴ D[x; σ, δ] be the left ideal generated by A and B. We formulate our Ore generalization of Theorem 3, where now some of the essential parts of I play the role of the leading coefficient of the GCRD of A and B.

Theorem 4.
Let i ∈ {0, ..., d_B − 1} and Δ := d_A + d_B − 2i. If t_k is the essential part of I at order k for i < k ≤ Δ + i − 1, then

   ( ∏_{k=i+1}^{Δ+i−1} σ^k(t_k) ) | cont(sres_i(A, B)).

Proof.
For any j ∈ {0, ..., i}, Syl_{i,j}(A, B) is of size Δ × Δ and if the last column is removed, the resulting matrix does not depend on j anymore. For n ∈ {1, ..., Δ − 1}, let M_{i,n} be the set of all n × n matrices obtained by removing the last Δ − n columns and any Δ − n rows from Syl_{i,j}(A, B). The jth coefficient of sres_i(A, B) is the determinant of Syl_{i,j}(A, B) and Laplace expansion along the last column shows that it is a D-linear combination of the elements of M_{i,Δ−1}. By induction on n we show that the determinant of any element of M_{i,n} is divisible by σ^{Δ+i−n}(t_{Δ+i−n}) σ^{Δ+i−n+1}(t_{Δ+i−n+1}) ··· σ^{Δ+i−1}(t_{Δ+i−1}). The theorem is then proven by setting n = Δ − 1.

For n = 1, the only entry in a matrix in M_{i,1} is either zero or the leading coefficient of a monomial left multiple of A or B of order Δ + i − 1, so the claim follows from Lemma 1.

Now suppose the claim is true for 1 ≤ n < Δ − 1 and let M be any element of M_{i,n+1}. If the determinant of M is zero, then there is nothing to show. Consider the case where det(M) ≠ 0. Then there is a v ∈ K^{n+1} such that M^T v = (0, ..., 0, 1)^T. By Cramer's rule, the jth component v_j of v is of the form p_j / det(M) where p_j ∈ D is the determinant of some element of M_{i,n}. By induction hypothesis it is divisible by σ^{Δ+i−n}(t_{Δ+i−n}) ··· σ^{Δ+i−1}(t_{Δ+i−1}). Every row in M corresponds to an operator of the form x^k A or x^k B for k ∈ N, minus some of the lower order terms. For the jth row, 1 ≤ j ≤ n + 1, we denote the corresponding operator by L_j. By the definition of v, the operator ∑_{j=1}^{n+1} v_j L_j ∈ K[x; σ, δ] will have order Δ + i − (n + 1) and leading coefficient 1. So if we set

   v′ := ( det(M) / ( σ^{Δ+i−n}(t_{Δ+i−n}) ··· σ^{Δ+i−1}(t_{Δ+i−1}) ) ) v ∈ D^{n+1}

and L = ∑_{j=1}^{n+1} v′_j L_j, then L is an element in I of order Δ + i − (n + 1) and its leading coefficient is det(M) / (σ^{Δ+i−n}(t_{Δ+i−n}) ··· σ^{Δ+i−1}(t_{Δ+i−1})) ∈ D. Lemma 1 yields that lc(L) is divisible by σ^{Δ+i−(n+1)}(t_{Δ+i−(n+1)}), so we get in total σ^{Δ+i−(n+1)}(t_{Δ+i−(n+1)}) σ^{Δ+i−n}(t_{Δ+i−n}) ··· σ^{Δ+i−1}(t_{Δ+i−1}) | det(M). □

In practice, the essential parts of I will most likely be the same at every order n with d_G ≤ n ≤ d_A + d_B. In that case, Theorem 4 is equivalent to the following simplification, where only the essential part of I at order d_A + d_B needs to be known.

Corollary 2.
Let i ∈ {0, ..., d_B − 1} and Δ := d_A + d_B − 2i. If t is the essential part of I at order d_A + d_B, then

   σ^{i+1}(t)^[Δ−1] | cont(sres_i(A, B)).

Proof. According to Lemma 1, σ^j(t) divides σ^j(t_j), where t_j is the essential part of I at order j, for any d_G ≤ j ≤ d_A + d_B. If i < d_G, then the ith subresultant of A and B is zero. Otherwise, Theorem 4 yields that cont(sres_i(A, B)) is divisible by

   σ^{i+1}(t) σ^{i+2}(t) ··· σ^{Δ+i−1}(t) = σ^{i+1}(t)^[Δ−1]. □

Like for Theorem 2, an adjustment of Corollary 2 to the block structure of the subresultant sequence of the first kind is needed in order to construct a new PRS.
Corollary 3.
Let (R_i)_{i∈{0,...,ℓ+1}} be the subresultant PRS of A and B (not necessarily normal) and let t be the essential part of I at order d_A + d_B. If we set

   γ_2 = σ^{d_B}(t)^[d_A − d_B + 1]

and

   γ_i = σ^{d_{i−1}}(t)^[d_{i−2} − d_{i−1}] · γ_{i−1} · σ^{d_A + d_B − d_{i−2} + 1}(t)^[d_{i−2} − d_{i−1}]  for 2 < i ≤ ℓ,

then γ_i | cont(R_i) for 2 ≤ i ≤ ℓ.

Proof.
Suppose R_i is the jth subresultant of A and B. As in the proof of Corollary 1, we have that j is equal to d_{i−1} − 1. So by Corollary 2, the content of R_i is divisible by σ^{d_{i−1}}(t)^[d_A + d_B − 2d_{i−1} + 1]. Simple hand calculation shows that this is equal to γ_i. □
6. Improved Polynomial Remainder Sequence
We now derive formulas for the α_i and β_i that take into account the potential additional content characterized by Theorems 2 and 4. For this we need the following lemma:

Lemma 2.
For γ_1, γ_2 ∈ K \ {0}:

   pquo(γ_1 A, γ_2 B) γ_2 = γ_1 γ_2^[d_A − d_B + 1] pquo(A, B).

Proof.
By Lemma 2.3 in Li (1998), the pseudo-remainder of γ_1 A and γ_2 B is the (d_B − 1)st subresultant of γ_1 A and γ_2 B (up to sign). Consequently, its coefficients are determinants of submatrices of Syl(γ_1 A, γ_2 B) that contain one row corresponding to the operator γ_1 A and d_A − d_B + 1 rows corresponding to operators of the form x^i γ_2 B, 0 ≤ i ≤ d_A − d_B. Thus, by Lemma 2.2 in Li (1998), it follows that (up to sign)

   prem(γ_1 A, γ_2 B) = γ_1 γ_2^[d_A − d_B + 1] prem(A, B).  (2)

The pseudo-remainder formula (1) applied to γ_1 A and γ_2 B is

   lc(γ_2 B)^[d_A − d_B + 1] γ_1 A = pquo(γ_1 A, γ_2 B) γ_2 B + prem(γ_1 A, γ_2 B).

Combining this with (2) and dividing the resulting equation by γ_1 γ_2^[d_A − d_B + 1] from the left gives the desired result. □

This now allows us to state α_i and β_i for improved polynomial remainder sequences:

Theorem 5. Suppose (R_i)_{i∈{0,...,ℓ+1}} is the subresultant PRS of A and B and (γ_i)_{i∈{0,...,ℓ+1}} is any sequence in K \ {0} with γ_0 = γ_1 = 1. Set R̃_i = γ_i^{−1} R_i. Then (R̃_i)_{i∈{0,...,ℓ+1}} is a PRS of A and B with:

   α̃_i = lc(R̃_i)^[d_{i−1} − d_i + 1],

   β̃_i = −σ(ψ̃_1)^[d_0 − d_1] γ_2                                              if i = 1,
   β̃_i = −lc(R̃_{i−1}) σ(ψ̃_i)^[d_{i−1} − d_i] γ_{i+1} / γ_i^[d_{i−1} − d_i + 1]   if 2 ≤ i ≤ ℓ,

   where

   ψ̃_i = −1                                                                    if i = 1,
   ψ̃_i = (−γ_{i−1} lc(R̃_{i−1}))^[d_{i−2} − d_{i−1}] / σ(ψ̃_{i−1})^[d_{i−2} − d_{i−1} − 1]   if 2 ≤ i ≤ ℓ.

Proof.
From the definition of R̃_i and the equations α_i R_{i−1} = Q_i R_i + β_i R_{i+1} and α_i = γ_i^[d_{i−1} − d_i + 1] α̃_i, it follows that

   γ_i^[d_{i−1} − d_i + 1] γ_{i−1} α̃_i R̃_{i−1} = Q_i γ_i R̃_i + β_i γ_{i+1} R̃_{i+1}.  (3)

For the first summand on the right hand side, Lemma 2 yields

   Q_i γ_i = γ_i^[d_{i−1} − d_i + 1] γ_{i−1} Q̃_i.  (4)

For the second summand, observe that since γ_i lc(R̃_i) equals lc(R_i), we have that ψ_i equals ψ̃_i for all 1 ≤ i ≤ ℓ. Thus

   β_i γ_{i+1} = γ_i^[d_{i−1} − d_i + 1] γ_{i−1} β̃_i.  (5)

The proof is concluded by combining (3), (4) and (5) and dividing the resulting equation by γ_i^[d_{i−1} − d_i + 1] γ_{i−1} from the left. □

Two possible choices for (γ_i)_{i∈{0,...,ℓ+1}} were presented in Corollaries 1 and 3. The computation of the γ_i in Corollary 1 is straightforward, but in Corollary 3, the essential part of I (the ideal generated by A and B) at order d_A + d_B is usually not known. A simple heuristic can solve this problem in most cases: As was shown in Lemma 1, the essential part of I at order d_A + d_B appears in a shifted version in the leading coefficient of every nonzero ideal element with order less than or equal to d_A + d_B. In particular it is contained in lc(A) and lc(B). Thus, if t is the essential part of I at order d_A + d_B, we have

   σ^{d_A}(t) | gcd(lc(A), σ^{d_A − d_B}(lc(B)))  (6)

and in most cases, we not only have divisibility but equality. In fact, in all the examples we looked at that came from combinatorics or physics, this guess for the essential part turned out to be correct.

Example 5 (Example 4 cont.).
We now use Theorem 5 and Corollaries 1 and 3 to compute new PRSs of A and B as in Example 3. The essential part of I at order d_A + d_B is (n + 3), so σ^{d_A}(n + 3) = (n + 17), which is also the guess given by the right hand side of (6). Applying Corollary 1 yields the factors

   γ_2 = n + 17,  γ_3 = n + 18,  ...,  γ_i = n + 16 + i − 1,  ...

whereas Corollary 3 gives

   γ_2 = (n + 16)^[2],  γ_3 = (n + 15)^[4],  ...,  γ_i = (n + 16 − (i − 2))^[2(i−1)],  ...

The improvements from Corollary 1 are marginal, while the degrees in the improved PRS with the results from Corollary 3 are equal to the degrees in the primitive PRS, except for the very last step:

   PRS               | R_2 | R_3 | R_4 | R_5 | R_6 | R_7 | R_8
   subresultant      | 11  | 16  | 21  | 26  | 31  | 36  | 41
   improved (Cor. 1) | 10  | 15  | 20  | 25  | 30  | 35  | 40
   improved (Cor. 3) |  9  | 12  | 15  | 18  | 21  | 24  | 27
   primitive         |  9  | 12  | 15  | 18  | 21  | 24  | 21

Table 2: Maximal coefficient degrees for the subresultant, improved and primitive PRS.
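The γ_i of Corollary 3 for this example can be reproduced with a short sympy computation. It uses the normal degree sequence d_i = 14 − i and the essential part t = n + 3 from the example above; the leading coefficients lcA and lcB used for the heuristic (6) are hypothetical, chosen only so that they contain the predicted shifts of t:

```python
import sympy as sp

n = sp.symbols('n')
t = n + 3                      # essential part of I at order d_A + d_B
dA, dB = 14, 13
d = {i: 14 - i for i in range(1, 9)}   # normal degrees d_1 = 13, d_2 = 12, ...

def sigma(p, k):               # sigma^k : n -> n + k
    return p.subs(n, n + k)

def sfac(p, m):                # sigma-factorial p^[m]
    return sp.prod([sigma(p, j) for j in range(m)])

# heuristic (6), with made-up cofactors in the leading coefficients
lcA = sp.expand((n + 17)*(n**2 + n + 41))
lcB = sp.expand((n + 16)*(n + 5))
assert sp.expand(sp.gcd(lcA, sigma(lcB, dA - dB)) - (n + 17)) == 0

# the gamma_i of Corollary 3
gamma = {2: sfac(sigma(t, dB), dA - dB + 1)}
for i in range(3, 9):
    gamma[i] = (sfac(sigma(t, d[i-1]), d[i-2] - d[i-1]) * gamma[i-1]
                * sfac(sigma(t, dA + dB - d[i-2] + 1), d[i-2] - d[i-1]))

assert sp.expand(gamma[2] - (n + 16)*(n + 17)) == 0
subresultant = [11, 16, 21, 26, 31, 36, 41]
improved = [subresultant[i-2] - sp.degree(gamma[i], n) for i in range(2, 9)]
assert improved == [9, 12, 15, 18, 21, 24, 27]
```

Subtracting deg(γ_i) = 2(i − 1) from the subresultant degrees of Table 2 indeed reproduces the row "improved (Cor. 3)".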
Example 6.
Although the remainders in the PRS based on Corollary 3 are usually primitive when starting from randomly generated operators or operators that come from some applications, it is not guaranteed that this is always the case. As an example, consider
A, B ∈ Q[y][x],

   A = x^4 + yx^2 + yx + y,
   B = x^3 + yx^2.

The second subresultant of A and B is sres_2(A, B) = (y^2 + y)x^2 + yx + y, so cont(sres_2(A, B)) = y, but in the improved PRS, no content will be found.

As mentioned, it may also happen that the guess for the essential part of I at order d_A + d_B is too large, for example:

   A, B ∈ Q(y)[D; 1, d/dy],
   A = (y + 1)D^4 + D^3 + D^2 + yD + 1,
   B = (y + 1)D^3 + D^2 + 1.

Here, cont(R_2) in the subresultant PRS is (y + 1), but a factor (y + 1)^2 is predicted. The mistake in predicting the essential part can be noticed on the fly during the execution of the algorithm as soon as a remainder with coefficients in Q(y) appears. It is then possible to either switch to another PRS or to refine the guess of the essential part. One strategy to do so is to remove all the factors from the guess that could be responsible for the appearance of denominators. Let t be the guess for the essential part of I at order d_A + d_B and let c be the non-trivial common denominator of the coefficients of a remainder R̃_i in the improved PRS. Furthermore let M be the set of all integers m with gcd(σ^m(c), t) ≠ 1. Update R̃_i, γ_i and t with

   R̃_i ← c R̃_i,
   γ_i ← γ_i / c,
   t ← t / gcd(t, ∏_{m∈M} σ^m(c)),
   γ_{i+1} ← σ^{d_i}(t)^[d_A + d_B − 2d_i + 1],

and continue the computation with these new values. For differential operators in C(y)[D; 1, d/dy], we have M = {0} and for recurrence operators in C(n)[S; s_n, 0], M contains all the integer roots of the polynomial res_n(c(n + m), t).
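The first (commutative) example can be verified with sympy. The 3×3 matrix below is one arrangement of the Sylvester submatrix rows corresponding to A, xB and B (the row order and the sign are a convention of this sketch, not of the article's Syl_{i,j} definition):

```python
import sympy as sp
from functools import reduce

x, y = sp.symbols('x y')
A = x**4 + y*x**2 + y*x + y
B = x**3 + y*x**2

# sres_2(A, B) via pseudo-division (lc(B) = 1, so prem is an ordinary remainder)
S2 = sp.expand(sp.prem(A, B, x))
assert S2 == sp.expand((y**2 + y)*x**2 + y*x + y)

# the same polynomial as a determinant: rows for A, xB, B, columns for
# x^4, x^3 and the remaining polynomial part
M = sp.Matrix([[1, 0, y*x**2 + y*x + y],
               [1, y, 0],
               [0, 1, y*x**2]])
assert sp.expand(M.det() - S2) == 0

# its content in Q[y] is y, which the improved PRS does not predict here
content = reduce(sp.gcd, sp.Poly(S2, x).all_coeffs())
assert content == y
```

Since lc(A) = lc(B) = 1, the heuristic (6) guesses the essential part to be 1, so the improved PRS indeed leaves the content y in place, as stated above.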
We can guess two operators A and B in Q[n][S; s_n, 0] of orders d_A = 16, d_B = 14, resp. that annihilate the sequence

   t_n = (7n^3 + 5n^2 + n + 1) / ( (n + 1/2) (2n)! (3n)! ).

The GCRD of A and B is of order 1 and the essential part of I at order d_A + d_B is of degree 4. The essential part of I at order 11, however, is of degree 11, so here we are in the rare case where the essential part of I at order d_A + d_B is only contained in, but not equal to, the essential part at lower orders. Formula (6) only predicts the essential part of I at order d_A + d_B, and during the GCRD computation, content that comes from lower order essential parts emerges:

   PRS               | R_2 | R_3 | R_4 | R_5 | R_6 | R_7 | R_8
   improved (Cor. 3) | 31  | 44  | 57  | 70  | 83  | 96  | 109
   primitive         | 31  | 44  | 50  | 56  | 62  | 68  | 74

Table 3: Maximal coefficient degrees for the first few remainders in the improved and primitive PRS.
It is possible to guess the essential part of I at lower orders and then use Theorem 4 to get the primitive remainders, but like in the direct computation of the primitive PRS, GCD computations in the base ring would be necessary after each division step.

As another consequence of Theorem 4, we can give a new bound for the coefficient degrees of the primitive PRS in terms of the essential parts of the left ideal generated by A and B.

Theorem 6.
Let (R_i)_{i∈{0,...,ℓ+1}} be the primitive PRS of A and B. Fix i ∈ {2, ..., ℓ} and let b ∈ N be such that max_{k∈{0,...,d_B−d_{i−1}}} ‖x^k A‖ ≤ b and max_{k∈{0,...,d_A−d_{i−1}}} ‖x^k B‖ ≤ b. If t_j denotes the essential part of I at order j ∈ N, then

   ‖R_i‖ ≤ (d_A + d_B − 2d_{i−1} + 2) b − ∑_{j=d_{i−1}}^{d_A+d_B−d_{i−1}} deg(t_j).

Proof.
The bound follows directly from Hadamard's inequality, the subresultant block structure and Corollary 3. □

Acknowledgements

I would like to thank Ziming Li and Manuel Kauers for their helpful comments and support during our personal communication. Also, I thank the reviewers for their careful reading of this article and for pointing out mistakes and shortcomings in the draft.
References