Trading Order for Degree in Creative Telescoping
Shaoshi Chen
Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205, USA
Manuel Kauers
Research Institute for Symbolic Computation, Johannes Kepler University, A4040 Linz, Austria
Abstract
We analyze the differential equations produced by the method of creative telescoping applied to a hyperexponential term in two variables. We show that equations of low order have high degree, and that higher order equations have lower degree. More precisely, we derive degree bounding formulas which allow us to estimate the degree of the output equations from creative telescoping as a function of the order. As an application, we show how the knowledge of these formulas can be used to improve, at least in principle, the performance of creative telescoping implementations, and we deduce bounds on the asymptotic complexity of creative telescoping for hyperexponential terms.
Key words:
Definite integration, Hyperexponential terms, Zeilberger’s algorithm.
1. Introduction
Creative telescoping is a technique for computing differential or difference equations satisfied by a given definite sum or integral. The technique became widely known through the work of Zeilberger (1991), who first observed that creative telescoping in combination with Gosper's algorithm (Gosper, 1978) for indefinite hypergeometric summation leads to a complete algorithm for computing recurrence equations of definite hypergeometric sums. This algorithm is now known as Zeilberger's algorithm (Zeilberger, 1990). In its original version, it accepts as input a bivariate proper hypergeometric term $f(n,k)$ and returns as output a linear recurrence equation with polynomial coefficients satisfied by the sum $F(n) = \sum_{k=a}^{b} f(n,k)$. An analogous algorithm for definite integration was given by Almkvist and Zeilberger (1990). This algorithm accepts as input a bivariate hyperexponential term $f(x,y)$ and returns as output a linear differential equation with polynomial coefficients satisfied by the integral $F(x) = \int_\alpha^\beta f(x,y)\,dy$. A summary of the method of creative telescoping for this case is given in Section 2 below. For further details, variations, and generalizations, consult for instance Petkovšek et al. (1997), Chyzak (2000), Schneider (2005), Chyzak et al. (2009), Kauers and Paule (2011). For implementations, see Paule and Schorn (1995), Chyzak (1998), Koepf (1998), Schneider (2004), Abramov et al. (2004), Koutschan (2009, 2010), etc.

Email addresses: [email protected] (Shaoshi Chen), [email protected] (Manuel Kauers).
Current address. The work described here was done while S.C. was employed as a postdoc at RISC in the FWF projects Y464-N18 and P20162-N18. At NCSU, S.C. is supported by the NSF grant CCF-1017217. M.K. was supported by the FWF grant Y464-N18.
Preprint submitted to Elsevier, 12 November 2018

Fig. 1. Sizes $(r,d)$ of creative telescoping relations for the integral of a certain rational function (axes: $r$ and $d$).

The equations which can be found via creative telescoping have a certain order $r$ and polynomial coefficients of a certain degree $d$. But for a fixed integration problem, $r$ and $d$ are not uniquely determined. Instead, there are infinitely many points $(r,d) \in \mathbb{N}^2$ such that creative telescoping can find an equation of order $r$ and degree $d$. These points form a region which is specific to the integration problem at hand. Figure 1 shows an example of such a region.
Every point $(r,d)$ in the gray region corresponds to a differential equation of order $r$ and degree $d$ which creative telescoping can find for integrating the rational function
f ( x, y ) = ( x y + 9 x y + 9 x + 10 xy + 3 xy + 4 x + 1 ) / ( x y + 9 x y + x y + 3 x + 7 x y + 8 x y + 5 x + 8 xy + 10 xy + 10 xy + x + 5 y + 10 y + 5 y + 5 ).
The picture indicates that low order equations have high degree, and that the degree decreases with increasing order. But what exactly is the shape of the gray region? And where does it come from? And how can it be exploited? These are the questions we address in this article.

How can it be exploited?
There are two main reasons why the shape of the gray region is of interest. First, it can be used to estimate the size of the output equations, and hence to derive bounds on the computational cost of computing them. Second, it can be used to design more efficient algorithms by recognizing that some of the equations are cheaper than others.

An analysis of this kind was first undertaken by Bostan et al. (2007). They studied the problem of computing differential equations satisfied by a given algebraic function and found a similar phenomenon: low order equations have high degree and vice versa. Among other things, they found that an algebraic function with a minimal polynomial of degree $n$ satisfies a differential equation of order at most $n$ with polynomial coefficients of degree $O(n^3)$, but also a differential equation of order $6n$ whose coefficients have degree only $O(n^2)$. Their message is that trading order for degree can pay off.

The same phenomenon applies to creative telescoping, as was shown by Bostan et al. (2010) for the case of integrating rational functions. The results in the present article extend this work in two directions: first, we consider the larger input class of hyperexponential terms, and second, we give not only isolated degree estimates for some specific choices of $r$, but a curve which passes along the boundary of the gray region and thus establishes a degree estimate as a function of the order $r$.

Where does it come from?
The standard argument for proving the existence of creative telescoping relations rests on the fact that linear systems of equations with more variables than equations must have a nontrivial solution. Every creative telescoping relation can be viewed as a solution of a certain linear system of equations which can be constructed from the data given in the input. There is some freedom in how to construct these systems, and it turns out that this freedom can be used for making the number of variables exceed the number of equations, and thus to enforce the existence of a nontrivial solution.

This reasoning not only implies the existence of equations and the termination of the algorithm which searches for them, but it also implies bounds on the output size and on the computational cost of the algorithm. But in order to obtain good bounds, the freedom in setting up the linear systems must be used carefully. For a good bound, we not only want the number of variables to exceed the number of equations, but we also want this to happen already for a reasonably small system. The shape of the gray region originates from the smallest systems which have solutions.

Verbaeten (1974, 1976) introduced a technique which helps in keeping the size of the systems small. The idea is to saturate the linear systems by introducing additional variables in a way that avoids increasing the number of equations. We will make use of this idea in Section 3, where we propose a design for a parameterized family of linear systems whose solutions give rise to creative telescoping relations. Unfortunately, it requires some quite lengthy and technical calculations to translate this particular design into an inequality condition which rephrases the condition "number of variables > number of equations" in precise terms. However, as a reward we obtain a good approximation to the gray region as the solution of this inequality.

What is the exact shape?
We don't know. All we can offer are some rational functions which describe the boundary of the region of all $(r,d)$ where the ansatz described in Section 3 has a solution (Theorem 14). The graphs of these rational functions are curves which pass approximately along the boundary of the gray region.

By construction, for all integer points $(r,d)$ above these graphs we can guarantee the existence of a creative telescoping relation of order $r$ with polynomial coefficients of degree $d$. But we have no proof that our curves are best possible. Experiments have shown that at least in some cases, our curve describes the boundary of the gray region exactly, or within a negligible error. In other cases, there remains a significant portion of the gray region below our curve when $r$ is large.

In cases where the curve from Theorem 14 is tight, we can compute the points $(r,d)$ for which certain interesting measures (such as computing time, output size, ...) are minimized, as shown in Section 5. Even when the curve is not tight, these calculations still give rise to new asymptotic bounds (including the multiplicative constants) of the corresponding complexities. We expect that this data will be valuable for constructing the next generation of symbolic integration software.
2. Creative Telescoping for Hyperexponential Terms
We consider in this article only hyperexponential terms as integrands. Throughout the article, $K$ is a field of characteristic 0, and $K(x,y)$ is the field of bivariate rational functions in $x$ and $y$ over $K$. Let $D_x$ and $D_y$ denote the derivations on $K(x,y)$ such that $D_x c = D_y c = 0$ for all $c \in K$, and $D_x x = 1$, $D_x y = 0$, $D_y x = 0$, $D_y y = 1$. One can see that $D_x$ and $D_y$ commute with each other on $K(x,y)$. We say that a field $E$ containing $K(x,y)$ is a differential field extension of $K(x,y)$ if the derivations $D_x$ and $D_y$ are extended to derivations on $E$ and those extended derivations, still denoted by $D_x$ and $D_y$, commute with each other on $E$.

Definition 1.
An element $h$ of a differential field extension $E$ of $K(x,y)$ is called hyperexponential (over $K(x,y)$) if
\[ \frac{D_x h}{h} \in K(x,y) \quad\text{and}\quad \frac{D_y h}{h} \in K(x,y). \]

When $h \in E$ is a hyperexponential term and $r_1, r_2 \in K(x,y)$ are such that $(D_x h)/h = r_1$ and $(D_y h)/h = r_2$, then $D_x D_y h = D_y D_x h$ implies $D_y r_1 = D_x r_2$. Conversely, Christopher (1999) has shown for algebraically closed ground fields $K$ that for any two rational functions $r_1, r_2 \in K(x,y)$ with $D_y r_1 = D_x r_2$ there exist $a/b \in K(x,y)$, $c_0, \dots, c_L \in K[x,y]$ and $e_1, \dots, e_L \in K$ with
\[ r_1 = \frac{D_x c_0}{c_0} + D_x\Bigl(\frac{a}{b}\Bigr) + \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \quad\text{and}\quad r_2 = \frac{D_y c_0}{c_0} + D_y\Bigl(\frac{a}{b}\Bigr) + \sum_{\ell=1}^{L} e_\ell \frac{D_y c_\ell}{c_\ell}. \]
Together with Theorem 2 of Bronstein et al. (2005), it follows that there exists a differential field extension $E$ of $K(x,y)$ and an element $h \in E$ with $(D_x h)/h = r_1$ and $(D_y h)/h = r_2$ which we can write in the form
\[ h = c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell}, \]
where $a \in K[x,y]$, $b, c_0, \dots, c_L \in K[x,y] \setminus \{0\}$, $e_1, \dots, e_L \in K$, and the expressions $\exp(a/b)$ and $c_\ell^{e_\ell}$ refer to elements of $E$ on which $D_x$ and $D_y$ act as suggested by the notation. We assume from now on that hyperexponential terms are always given in this form, and we use the letters $a, b, c_0, \dots, c_L, e_1, \dots, e_L$ consistently throughout with the meaning they have here.

Example 2. $h = \exp(x^2 y)\sqrt{x - 2y}$ is a hyperexponential term. We have
\[ \frac{D_x h}{h} = \frac{1 + 4x^2 y - 8xy^2}{2x - 4y} = 2xy + \frac{1}{2x - 4y} \in K(x,y), \qquad \frac{D_y h}{h} = \frac{x^3 - 2x^2 y - 1}{x - 2y} = x^2 - \frac{1}{x - 2y} \in K(x,y). \]
For this term, we can take $c_0 = 1$, $a = x^2 y$, $b = 1$, $c_1 = x - 2y$, $e_1 = \tfrac12$.

We may adopt the additional condition (without loss of generality) that the $c_\ell$ ($\ell > 0$) … $e_\ell \notin \mathbb{N}$ for all $\ell > 0$. The estimates derived below do not depend on these additional conditions, but will typically not be sharp when they are not fulfilled. For simplicity, we will exclude throughout some trivial special cases by assuming that all $e_\ell$ are nonzero and that $\max\{\deg_x a, \deg_x b\} + \sum_{\ell=1}^{L} \deg_x c_\ell$ and $\max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell$ are nonzero. These latter two conditions encode the requirement that $h$ is neither independent of $x$ nor independent of $y$, nor simply a polynomial.

Applied to the hyperexponential term $h$, the method of creative telescoping consists of finding, by whatever means, polynomials $p_0, \dots, p_r \in K[x]$, not all zero, and a hyperexponential term $Q$ such that
\[ p_0 h + p_1 D_x h + \cdots + p_r D_x^r h = D_y Q. \]
An equation of this form is called a creative telescoping relation for $h$, the differential operator $P := p_0 + p_1 D_x + \cdots + p_r D_x^r$ appearing on the left is called the telescoper, and $Q$ is called the certificate of the relation. The telescoper is required to be nonzero and free of $y$, but the certificate may be zero or it may involve both $x$ and $y$. When $p_r \neq 0$, the number $r$ is called the order of $P$, and $d := \max_{i=0}^{r} \deg_x p_i$ is called its degree.

To motivate the form of a creative telescoping relation, assume that $h = h(x,y)$ can be interpreted as an actual function in $x$ and $y$ and consider the integral $f(x) = \int_\alpha^\beta h(x,y)\,dy$. Then integrating both sides of a creative telescoping relation implies that $f$ satisfies the inhomogeneous differential equation
\[ p_0(x) f(x) + p_1(x) D_x f(x) + \cdots + p_r(x) D_x^r f(x) = \bigl[ Q(x,y) \bigr]_{y=\alpha}^{\beta}. \]
In the frequent situation that the inhomogeneous part happens to evaluate to zero, this means that the telescoper of $h$ annihilates the integral $f$.

Example 3.
A creative telescoping relation for $h = \exp(x^2 y)\sqrt{x - 2y}$ is
\[ (3x^3 - 6)\,h - 2x\,D_x h = D_y\bigl( (3x - 4y)\,h \bigr). \]
It consists of the telescoper $P = (3x^3 - 6) - 2x D_x$ and the certificate $Q = (3x - 4y)h$. For the definite integral $f(x) := \int_{-\infty}^{x/2} \exp(x^2 y)\sqrt{x - 2y}\,dy$, we obtain the differential equation
\[ (3x^3 - 6) f(x) - 2x D_x f(x) = 0. \]

… $Q$, they fix an order $r$ and a degree $s$ for the numerator of $Q$, make an ansatz with undetermined coefficients, and obtain a linear system by comparing coefficients. Appropriate choices of $r$ and $s$ ensure that this linear system has a nontrivial solution, and also lead to a sharp bound on the order $r$ of the telescoper.

Let us illustrate this reasoning for the case where the integrand is a rational function $h = u/v \in K(x,y)$ with $\deg_y u < \deg_y v$ and $v$ irreducible. Fix some $r$. Then we have to find $p_0, \dots, p_r \in K(x)$ and a rational function $Q \in K(x,y)$ with
\[ p_0 h + p_1 D_x h + \cdots + p_r D_x^r h = D_y Q. \]
A reasonable choice for $Q$ is $Q = \bigl(\sum_{i=0}^{s} q_i y^i\bigr)/v^r$, where $s = \deg_y u + (r-1)\deg_y v$ and $q_0, \dots, q_s$ are unknowns, because with this choice, both sides of the equation are equal to a rational function with the same denominator $v^{r+1}$ and numerators of degree at most $\deg_y u + r \deg_y v$ in $y$ in which the unknowns $p_i$ and $q_j$ appear linearly. Comparing coefficients with respect to $y$ on both sides leads to a homogeneous linear system of at most $1 + \deg_y u + r \deg_y v$ equations with $(r+1) + (s+1)$ unknowns and coefficients in $K(x)$. This system will have a nontrivial solution if $r$ is chosen such that
\[ (r+1) + (s+1) > \deg_y u + r \deg_y v + 1 \iff r \ge \deg_y v. \]
All these solutions must lead to a nonzero telescoper $P$, because any nontrivial solution with $P = 0$ would have a nonzero certificate $Q$ with $D_y Q = 0$, and this is impossible because $s$ was chosen such that the numerator of $Q$ has a strictly lower degree than its denominator.

We have thus shown the existence of telescopers of any order $r \ge \deg_y v$. This is a good bound, but it does not provide any estimate on their degrees $d$. We will next derive inequalities involving both $r$ and $d$ by constructing linear systems with coefficients in $K$ rather than in $K(x)$.
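The two computations of this section can be checked mechanically. The following is only a minimal sketch, not part of the original method: it assumes the reading $h = \exp(x^2 y)\sqrt{x-2y}$ with telescoper $(3x^3-6) - 2xD_x$ and certificate $(3x-4y)h$ for Example 3, divides the relation by $h$ so that only rational functions remain, and evaluates both sides exactly at rational points. The function names `relation_residual` and `has_more_unknowns` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def relation_residual(x, y):
    """Return (P h - D_y Q)/h at a rational point (x, y) with x != 2y.

    Assumes h = exp(x^2 y) * sqrt(x - 2y), so that
    (D_x h)/h = 2xy + 1/(2(x - 2y)) and (D_y h)/h = x^2 - 1/(x - 2y).
    """
    x, y = Fraction(x), Fraction(y)
    dxh = 2 * x * y + 1 / (2 * (x - 2 * y))   # (D_x h)/h
    dyh = x**2 - 1 / (x - 2 * y)              # (D_y h)/h
    lhs = (3 * x**3 - 6) - 2 * x * dxh        # (P h)/h with P = (3x^3 - 6) - 2x D_x
    q = 3 * x - 4 * y                         # certificate Q = q * h
    rhs = -4 + q * dyh                        # (D_y Q)/h = D_y q + q (D_y h)/h
    return lhs - rhs


def has_more_unknowns(r, deg_u, deg_v):
    """Counting argument for h = u/v: unknowns p_0..p_r and q_0..q_s versus equations."""
    s = deg_u + (r - 1) * deg_v
    unknowns = (r + 1) + (s + 1)
    equations = deg_u + r * deg_v + 1
    return unknowns > equations


# The residual vanishes identically, so it is zero at every admissible sample point.
samples = [(1, 0), (3, 1), (Fraction(1, 2), -2), (-5, 7)]
residuals = [relation_residual(x, y) for x, y in samples]
```

A pointwise check of this kind does not prove the identity, but a nonzero residual at any sample point would immediately expose a wrong telescoper or certificate. The second helper simply restates the equivalence "more unknowns than equations if and only if $r \ge \deg_y v$".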
3. Shaping the Ansatz
Let $h$ be a hyperexponential term and consider an ansatz of the form
\[ P = \sum_{i=0}^{r} \sum_{j=0}^{d_i} p_{i,j}\, x^j D_x^i, \qquad Q = \Bigl( \sum_{i=0}^{s_1} \sum_{j=0}^{s_2} q_{i,j}\, x^i y^j \Bigr) \frac{h}{v} \]
for a telescoper $P$ and a certificate $Q$. The plan is to find a good choice for the parameters $r, s_1, s_2, v, d_0, \dots, d_r$. The only restriction we have is that the linear system obtained from equating all the coefficients in the numerator of the rational function $(Ph - D_y Q)/h$ to zero should have a solution in which not all the $p_{i,j}$ are zero. The remaining freedom can be used to shape the ansatz so as to keep $d := \max_{i=0}^{r} d_i$ small.

As a sufficient condition for the existence of a solution, we will require that the number of terms $x^i y^j$ in the numerator of the rational function $(Ph - D_y Q)/h$ (i.e., the number of equations) should be less than $\sum_{i=0}^{r} (d_i + 1) + (s_1 + 1)(s_2 + 1)$ (i.e., the number of variables $p_{i,j}$ and $q_{i,j}$). As shown in the following example, this condition is only sufficient, not necessary.

Example 4. Let $h = u/v$ be the rational function from the introduction. With $r = 3$, $d = d_0 = d_1 = d_2 = d_3 = 54$, and $Q = \bigl(\sum_{i=0}\sum_{j=0} q_{i,j} x^i y^j\bigr)\big/ v$, comparing the coefficients of the numerator of $(Ph - D_y Q)/h$ to zero gives a linear system with 787 variables and 792 equations. This system has a nonzero solution although $792 > 787$.

… $P$ and $Q$ in such a way that the linear system originating from it will have a nullspace whose dimension is exactly the difference between the number of equations and the number of variables (or 0 if there are more equations than variables).

The goal of this section is to describe our choice for the ansatz of telescoper and certificate. The form of the ansatz for the telescoper is given in Section 3.1, the certificate is discussed in Section 3.2. In the beginning, we collect some facts about the rational functions $(D_x^i h)/h$ which are used later for calculating how many equations a particular ansatz induces.
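The bookkeeping behind the sufficient condition is easy to mechanize. The sketch below is illustrative only: the helper names are hypothetical, and the certificate shape of Example 4 is not assumed, only the totals reported there (787 variables, 792 equations) together with the telescoper degrees $d_0 = \dots = d_3 = 54$.

```python
def telescoper_unknowns(degrees):
    """Number of coefficients p_{i,j} for the degree vector (d_0, ..., d_r)."""
    return sum(d + 1 for d in degrees)


def certificate_unknowns(s1, s2):
    """Number of coefficients q_{i,j} with 0 <= i <= s1 and 0 <= j <= s2."""
    return (s1 + 1) * (s2 + 1)


def solution_guaranteed(num_equations, num_unknowns):
    """Sufficient (not necessary) criterion: strictly more unknowns than equations."""
    return num_equations < num_unknowns


# Example 4: r = 3 and d_0 = ... = d_3 = 54 give 4 * 55 telescoper unknowns.
tel = telescoper_unknowns([54] * 4)
# The remaining unknowns of the reported total of 787 belong to the certificate.
cert = 787 - tel
# The criterion does not apply (792 equations), yet a nonzero solution exists.
criterion_applies = solution_guaranteed(792, tel + cert)
```

The criterion returns `False` for Example 4 even though a solution exists, which is precisely the sense in which the counting condition is sufficient but not necessary.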
The following notational conventions will be used throughout.

Notation 5.
• $\operatorname{lc}_z p$ and $\deg_z p$ refer to the leading coefficient and the degree of the polynomial $p$ with respect to the variable $z$, respectively. For the zero polynomial, we define $\deg_z 0 := -\infty$ and $\operatorname{lc}_z 0 := 0$.
• $p^*$ refers to the square free part of the polynomial $p$ with respect to all its variables, e.g., the square free part of a product of powers of $x+1$ and $y+3$ is $(x+1)(y+3)$. Note that $p^*$ is only unique up to multiplication by elements from $K \setminus \{0\}$, but that for any choice of $p^*$, the degrees $\deg_x p^*$ and $\deg_y p^*$ are uniquely determined and we have that $p^* (D_x p)/p$ is a polynomial in $x$ and $y$. These are the only properties we will use.
• $z^{\underline{n}} := z(z-1)(z-2)\cdots(z-n+1)$ and $z^{\overline{n}} := z(z+1)(z+2)\cdots(z+n-1)$ denote the falling and rising factorials, respectively. For $n \le 0$, we define $z^{\underline{n}} := z^{\overline{n}} := 1$.
• If $z$ is a real number, then $z^+ := \max\{0, z\}$.
• If $z$ is a real number, then $\lfloor z \rfloor := \max\{x \in \mathbb{Z} : x \le z\}$, $\lceil z \rceil := \min\{x \in \mathbb{Z} : x \ge z\}$, and $\lfloor z \rceil := \lfloor z + \tfrac12 \rfloor$ denotes the nearest integer to $z$.
• If $\Phi$ is a formula, then $[[\Phi]]$ denotes the Iverson bracket, which evaluates to 1 if $\Phi$ is true and to 0 if $\Phi$ is false, e.g., $z^+ = [[z \ge 0]]\, z$; $\delta_{i,j} = [[i = j]]$, etc.

Lemma 6.
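Let $h$ be a hyperexponential term and $i \ge 0$.

(1) If $\deg_x a > \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_i}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^i} \]
for some polynomial $N_i \in K[x,y]$ with
\[ \deg_x N_i = \deg_x c_0 + i \Bigl( \deg_x a + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 \Bigr), \]
\[ \deg_y N_i \le \deg_y c_0 + i \Bigl( \max\{\deg_y a, \deg_y b\} + \deg_y b^* + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr), \]
\[ \operatorname{lc}_x N_i = (\operatorname{lc}_x c_0) \Bigl( \operatorname{lc}_x a b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^i (\deg_x a - \deg_x b)^i. \]

(2) If $\deg_x a \le \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_i}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^i} \]
for some polynomial $N_i \in K[x,y]$ with
\[ \deg_x N_i = \deg_x c_0 + i \Bigl( \deg_x b + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 \Bigr) - [[\omega \in \mathbb{N} \wedge i > \omega]]\, \delta, \]
\[ \deg_y N_i \le \deg_y c_0 + i \Bigl( \max\{\deg_y a, \deg_y b\} + \deg_y b^* + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr), \]
\[ \operatorname{lc}_x N_i = \begin{cases} (\operatorname{lc}_x c_0) \bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^i\, \omega^{\underline{i}} & \text{if } \omega \notin \mathbb{N} \text{ or } i \le \omega; \\ (\operatorname{lc}_x N_{\omega+1}) \bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^{i-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}} & \text{if } \omega \in \mathbb{N} \text{ and } i > \omega, \end{cases} \]
where $\omega := \deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell$ and, if $\omega \in \mathbb{N}$,
\[ \delta := \deg_x c_0 + (\omega+1) \Bigl( \deg_x b + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 \Bigr) - \deg_x N_{\omega+1} \ge 1. \]

Proof.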
All claims are proved by induction on $i$. For $i = 0$, there is nothing to show in any of the cases. The calculations for the induction step $i \to i+1$ are as follows.

(1) Let $v := b b^* \prod_{\ell=1}^{L} c_\ell$ and write $m_i$ for the claimed value of $\deg_x N_i$. Then
\[ \frac{D_x^{i+1} h}{h} = D_x \Bigl( \frac{N_i}{c_0 v^i}\, c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell} \Bigr) \Big/ \Bigl( c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell} \Bigr) \]
\[ = \frac{(D_x N_i) v - i N_i D_x v}{c_0 v^{i+1}} + \frac{N_i}{c_0 v^i} \cdot \frac{(D_x a) b^* - a b^* (D_x b)/b}{b b^*} + \frac{N_i}{c_0 v^i} \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \]
\[ = \frac{(D_x N_i) v - i N_i D_x v + N_i \bigl( \prod_{\ell=1}^{L} c_\ell \bigr) \bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \bigr) + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell}}{c_0 v^{i+1}}. \]
Since $\deg_x a > \deg_x b$ by assumption, we have
\[ \deg_x \Bigl( (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \Bigr) \le \deg_x N_i + \deg_x v - 1 = m_i + \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 \]
\[ < m_i + \deg_x a + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 = m_{i+1}. \]
Furthermore, because of
\[ (D_x a) b^* - a b^* \frac{D_x b}{b} = (\operatorname{lc}_x a)(\operatorname{lc}_x b^*)(\deg_x a - \deg_x b)\, x^{\deg_x a + \deg_x b^* - 1} + \cdots \]
we have
\[ N_i \Bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \Bigr) \prod_{\ell=1}^{L} c_\ell = (\operatorname{lc}_x N_i) \Bigl( \operatorname{lc}_x a b^* \prod_{\ell=1}^{L} c_\ell \Bigr) (\deg_x a - \deg_x b)\, x^{m_{i+1}} + \cdots. \]
This completes the proof that $(D_x^{i+1} h)/h$ has the denominator as claimed and that its numerator has degree and leading coefficient with respect to $x$ as claimed. The remaining degree bound with respect to $y$ follows from
\[ \deg_y \Bigl( (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \Bigr) \le \underbrace{\deg_y c_0 + i \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr)}_{\text{bounds } \deg_y N_i} + \underbrace{\deg_y b^* + \deg_y b + \sum_{\ell=1}^{L} \deg_y c_\ell}_{\text{bounds } \deg_y v} \]
\[ \le \deg_y c_0 + (i+1) \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr) \]
and
\[ \deg_y \Bigl( N_i \Bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \Bigr) \prod_{\ell=1}^{L} c_\ell \Bigr) \le \underbrace{\deg_y c_0 + i \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr)}_{\text{bounds } \deg_y N_i} + \underbrace{\deg_y b^* + \deg_y a + \sum_{\ell=1}^{L} \deg_y c_\ell}_{\text{bounds } \deg_y \text{ of the other factors}} \]
\[ \le \deg_y c_0 + (i+1) \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr). \]

(2) Again, let $v := b b^* \prod_{\ell=1}^{L} c_\ell$ and write $m_i$ for the claimed value of $\deg_x N_i$. Then, as in part 1,
\[ \frac{D_x^{i+1} h}{h} = \frac{(D_x N_i) v - i N_i D_x v + N_i \bigl( \prod_{\ell=1}^{L} c_\ell \bigr) \bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \bigr) + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell}}{c_0 v^{i+1}}. \]
First consider the case $\omega \notin \mathbb{N}$ or $i \le \omega$. Since $\deg_x a \le \deg_x b$ by assumption, and because of
\[ (D_x a) b^* - a b^* \frac{D_x b}{b} = (\operatorname{lc}_x a)(\operatorname{lc}_x b^*)(\deg_x a - \deg_x b)\, x^{\deg_x a + \deg_x b^* - 1} + \cdots, \]
we now have
\[ \deg_x \Bigl( N_i \Bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \Bigr) \prod_{\ell=1}^{L} c_\ell \Bigr) < m_i + \deg_x b + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 = m_{i+1}. \]
This strict inequality also holds when $\deg_x a = \deg_x b$, because the coefficient of $x^{\deg_x a + \deg_x b^* - 1}$ in $(D_x a) b^* - a b^* (D_x b)/b$ contains the factor $\deg_x a - \deg_x b$, which vanishes in this case.

Next, using the induction hypothesis, we have
\[ (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v) \Bigl( \deg_x N_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell \Bigr) x^{\deg_x N_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x c_0) \Bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^i \omega^{\underline{i}}\, (\operatorname{lc}_x v) \Bigl( m_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell \Bigr) x^{m_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x c_0) \Bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^{i+1} \omega^{\underline{i}} \Bigl( \deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell - i \Bigr) x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x c_0) \Bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^{i+1} \omega^{\underline{i+1}}\, x^{m_{i+1}} + \cdots. \]
Since $\omega^{\underline{i+1}} \neq 0$ when $\omega \notin \mathbb{N}$ or $i + 1 \le \omega$, this completes the proof that $(D_x^{i+1} h)/h$ has the denominator as claimed and that its numerator has degree and leading coefficient with respect to $x$ as claimed. The degree bounds with respect to $y$ are shown exactly as in part 1.

Now consider the case where $\omega \in \mathbb{N}$ and $i > \omega$. In this case, we start the induction at $i = \omega + 1$. The induction base follows from the calculations carried out above for $i \le \omega$, the fact $\omega^{\underline{\omega+1}} = 0$, and the definition of $\delta$. (Note that $\omega^{\underline{\omega+1}} = 0$ also implies $\delta \ge 1$.) For the induction step $i \to i+1$, we have, similarly as before,
\[ \deg_x \Bigl( N_i \Bigl( (D_x a) b^* - a b^* \frac{D_x b}{b} \Bigr) \prod_{\ell=1}^{L} c_\ell \Bigr) < m_{i+1} \]
and
\[ (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v) \Bigl( m_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell \Bigr) x^{m_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v) \Bigl( \deg_x c_0 + i(\deg_x v - 1) - \delta - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell \Bigr) x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x N_{\omega+1}) \Bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^{i-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}}\, (\operatorname{lc}_x v)(\omega - \delta - i)\, x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x N_{\omega+1}) \Bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^{i+1-(\omega+1)} (-\delta-1)^{\underline{i+1-(\omega+1)}}\, x^{m_{i+1}} + \cdots. \]
Because of $\delta > 0$, the factor $(-\delta-1)^{\underline{i-(\omega+1)}}$ is nonzero for all $i > \omega$. □

Example 7. The case when $h = u/v$ is a rational function is covered by part 2 of the lemma. For example, for $h = (2x^{\cdots} - \cdots + 5)/(3x^{\cdots} - \cdots + 8)$ we can take $c_0 = 2x^{\cdots} - \cdots + 5$, $a = 0$, $b = 1$, $L = 1$, $c_1 = 3x^{\cdots} - \cdots + 8$, $e_1 = -1$. Direct calculation of the derivatives gives

[Table: for each $i$, the values $\deg_x c_0 c_1^i (D_x^i h)/h$ and $\operatorname{lc}_x c_0 c_1^i (D_x^i h)/h$; entries not recoverable.]

The lemma does not predict the degree and the leading coefficient for $i = \omega + 1 = 3$, but knowing these, it correctly predicts all the other data in the table. In this example, we have $\delta = 3 = \omega + 1$. This is not a coincidence, as we shall show next.

Lemma 8.
Let $h$ be a hyperexponential term with $\deg_x a \le \deg_x b$, and let $\omega$ and $\delta$ be as in Lemma 6.(2), $\omega \in \mathbb{N}$. Then $\delta \ge \omega + 1$.

Proof.
Rewrite
\[ h = c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell} = \bar c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L+2} \bar c_\ell^{\,\bar e_\ell} \]
with $\bar c_0 = x^\omega$, $\bar c_\ell = c_\ell$ ($\ell = 1, \dots, L$), $\bar e_\ell = e_\ell$ ($\ell = 1, \dots, L$), $\bar c_{L+1} = c_0$, $\bar e_{L+1} = 1$, $\bar c_{L+2} = x$, $\bar e_{L+2} = -\omega$. The rational functions $(D_x^i h)/h$ are of course independent of the representation of $h$, but the representations of these rational functions which are given in Lemma 6 are not. The representation belonging to the new representation of $h$ is obtained from the original one by multiplying numerator and denominator by $x^{\omega+i} c_0^{\,i-1}$. Observe that this modification does not influence the values of $\omega$ or $\delta$. It is therefore sufficient to prove the claim for terms of the form $h = x^\omega \bar h$, where $\bar h$ is some hyperexponential term for which the value of $\omega$ is zero.

We do so by induction on $\omega$. For $\omega = 0$, we have $\delta \ge \omega + 1$ already by Lemma 6.(2). Now assume that $\omega \ge 0$ is such that for $x^\omega \bar h$ the degree drop $\bar\delta$ is $\omega + 1$ or more. Then for $h = x^{\omega+1} \bar h = x (x^\omega \bar h)$ we have $D_x h = x^\omega \bar h + x D_x(x^\omega \bar h)$, $D_x^2 h = 2 D_x(x^\omega \bar h) + x D_x^2(x^\omega \bar h)$, and so on, all the way down to
\[ D_x^{\omega+2} h = (\omega+2) D_x^{\omega+1}(x^\omega \bar h) + x D_x^{\omega+2}(x^\omega \bar h) = (\omega+2) \frac{N_{\omega+1}}{x^\omega v^{\omega+1}} x^\omega \bar h + x \frac{N_{\omega+2}}{x^\omega v^{\omega+2}} x^\omega \bar h = \frac{(\omega+2) N_{\omega+1} v + x N_{\omega+2}}{v^{\omega+2}}\, \bar h, \qquad (1) \]
where $N_{\omega+1}$ and $N_{\omega+2}$ are as in Lemma 6 and $v$ refers to the denominator stated there. If $\delta$ denotes the degree drop for $h$, then this calculation implies $\delta \ge \bar\delta$. By induction hypothesis, we have $\bar\delta \ge \omega + 1$. If in fact $\bar\delta \ge \omega + 2$, then we are done. Otherwise, if $\bar\delta = \omega + 1$, then
\[ \operatorname{lc}_x N_{\omega+2} = (-\bar\delta - 1) \operatorname{lc}_x N_{\omega+1} \operatorname{lc}_x v = -(\omega+2) \operatorname{lc}_x N_{\omega+1} \operatorname{lc}_x v \]
by Lemma 6, so the leading terms of the two polynomials in the numerator of (1) cancel, and therefore $\delta > \omega + 1$ also in this case. □

Experiments suggest that the bound in Lemma 8 is tight in the sense that we have $\delta = \omega + 1$ for almost all hyperexponential terms $h$. But there do exist situations with $\delta > \omega + 1$. For example, it can be shown that for $h = c_0 \exp(a/b)$ with $\deg_x b - \deg_x a > \deg_x c_0 = \omega$ we have $\delta \ge \deg_x b - \deg_x a$.

Also, Lemma 6 is not necessarily sharp for degenerate choices of $h$. In particular, we do not claim that the numerators and denominators stated in Lemma 6 are coprime. It may be possible to carry out a finer analysis by considering the square free decomposition of $c_0$, or by taking into account possible common factors between $b$ and the $c_\ell$, or by handling the $c_\ell$ which do not involve $x$ separately. For our purpose, we believe that the statements given above form a reasonable compromise between sharpness of the statements and readability of the derivation.

Several aspects of the formulas in Lemma 6 are important. One of them is that the denominators corresponding to lower derivatives divide those corresponding to higher derivatives. This has the consequence that when the linear combination $Ph$ is brought to a common denominator, the degree of the numerator will not grow drastically. In a sense, this fact is the main reason why creative telescoping works at all. Our next step is to bring the formulas from Lemma 6 to a common denominator.
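The degree drop $\delta$ of Lemma 6.(2) and the bound $\delta \ge \omega + 1$ of Lemma 8 can be observed on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the hypothetical univariate term $h = c_0/c_1$ with $c_0 = x^2 + 1$ and $c_1 = x + 2$ (so $a = 0$, $b = 1$, $L = 1$, $e_1 = -1$, $\omega = 1$), and uses the recurrence $N_{i+1} = (D_x N_i)\,c_1 - (i+1)\,N_i\,D_x c_1$, which is what the induction step in the proof of Lemma 6 reduces to in this special case. Polynomials are stored as integer coefficient lists, lowest degree first.

```python
def trim(p):
    """Drop trailing zero coefficients so that len(p) - 1 is the degree."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
                 for k in range(n)])


def scale(p, s):
    return trim([s * coeff for coeff in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return trim(out)


def diff(p):
    return trim([k * p[k] for k in range(1, len(p))])


def numerators(c0, c1, count):
    """N_0, ..., N_{count-1} with D_x^i h / h = N_i / (c0 * c1^i) for h = c0/c1."""
    result = [trim(c0)]
    for i in range(count - 1):
        n_i = result[i]
        result.append(add(mul(diff(n_i), c1), scale(mul(n_i, diff(c1)), -(i + 1))))
    return result


def falling(z, n):
    """Falling factorial z (z-1) ... (z-n+1), equal to 1 for n <= 0."""
    out = 1
    for k in range(n):
        out *= z - k
    return out


c0 = [1, 0, 1]            # x^2 + 1  (hypothetical example)
c1 = [2, 1]               # x + 2
omega = 1                 # deg c0 + e_1 * deg c1 = 2 - 1
N = numerators(c0, c1, 4)
degrees = [len(p) - 1 for p in N]
# delta as defined in Lemma 6.(2); here deg b = deg b* = 0 and deg c1 = 1.
delta = (len(c0) - 1) + (omega + 1) * ((len(c1) - 1) - 1) - degrees[omega + 1]
# Leading coefficient of N_3 predicted from lc N_{omega+1}, lc c1 = 1 and (-delta-1).
predicted_lc_N3 = N[omega + 1][-1] * falling(-delta - 1, 3 - (omega + 1))
```

For this term the degrees stay at 2 up to $i = \omega$ and then drop by $\delta = 2 = \omega + 1$, in line with Lemma 8, and the leading coefficient of $N_3$ agrees with the second case of the formula in Lemma 6.(2).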
Lemma 9.
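Let $h$ be a hyperexponential term and $r, i \in \mathbb{Z}$ with $r \ge i \ge 0$.

(1) If $\deg_x a > \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_{r,i}}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^r} \]
for some $N_{r,i} \in K[x,y]$ with
\[ \deg_x N_{r,i} = \deg_x c_0 + r \Bigl( \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell \Bigr) + i \bigl( \deg_x a - \deg_x b - 1 \bigr), \]
\[ \deg_y N_{r,i} \le \deg_y c_0 + r \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr), \]
\[ \operatorname{lc}_x N_{r,i} = (\operatorname{lc}_x c_0)(\operatorname{lc}_x a)^i (\operatorname{lc}_x b)^{r-i} (\operatorname{lc}_x b^*)^r \Bigl( \operatorname{lc}_x \prod_{\ell=1}^{L} c_\ell \Bigr)^r \bigl( \deg_x a - \deg_x b \bigr)^i. \]

(2) If $\deg_x a \le \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_{r,i}}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^r} \]
for some $N_{r,i} \in K[x,y]$ with
\[ \deg_x N_{r,i} = \deg_x c_0 + r \Bigl( \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell \Bigr) - i - [[\omega \in \mathbb{N} \wedge i > \omega]]\, \delta, \]
\[ \deg_y N_{r,i} \le \deg_y c_0 + r \Bigl( \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell \Bigr), \]
\[ \operatorname{lc}_x N_{r,i} = \begin{cases} (\operatorname{lc}_x c_0)(\operatorname{lc}_x b b^*)^r \bigl( \operatorname{lc}_x \prod_{\ell=1}^{L} c_\ell \bigr)^r\, \omega^{\underline{i}} & \text{if } \omega \notin \mathbb{N} \text{ or } i \le \omega; \\ (\operatorname{lc}_x N_{\omega+1}) \bigl( \operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^{r-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}} & \text{if } \omega \in \mathbb{N} \text{ and } i > \omega, \end{cases} \]
where $\omega$, $\delta$, and $\operatorname{lc}_x N_{\omega+1}$ are as in Lemma 6.(2).

Proof.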
Both parts follow directly from the respective parts of Lemma 6 by multiplying numerator and denominator of the representations stated there by $\bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^{r-i}$. □

Definition 10.
For a hyperexponential term $h$, let
\[ \alpha = \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell, \qquad \beta = \deg_x a - \deg_x b - 1, \]
\[ \gamma = \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell, \qquad \omega = \deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell. \]
If $\deg_x a \le \deg_x b$ and $\omega \in \mathbb{N}$, we further let $\delta$ be any integer with
\[ \omega + 1 \le \delta \le \deg_x c_0 + (\omega+1)(\alpha - 1) - \deg_x \Bigl( c_0 \Bigl( b b^* \prod_{\ell=1}^{L} c_\ell \Bigr)^{\omega+1} \frac{D_x^{\omega+1} h}{h} \Bigr). \]
Otherwise, if $\deg_x a > \deg_x b$ or $\omega \notin \mathbb{N}$, let $\delta = 0$. Finally, we define the following flags:
\[ \varphi_1 = \Bigl[\Bigl[ \frac{\operatorname{lc}_x a}{\operatorname{lc}_x b} \in K \Bigr]\Bigr], \qquad \varphi_2 = \Bigl[\Bigl[ \frac{\operatorname{lc}_x a}{\operatorname{lc}_x b} \in K \wedge \beta = 0 \Bigr]\Bigr], \]
\[ \varphi_3 = \Bigl[\Bigl[ \frac{a}{b} \in K(x) \wedge \forall \ell : (\deg_y c_\ell = 0 \vee e_\ell \in \mathbb{Z}) \wedge \deg_y c_0 \ge \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell \Bigr]\Bigr]. \]

Note that none of these parameters depends on $r$ or $i$. The flags $\varphi_k$ ($k = 1, 2, 3$) are in $\{0, 1\}$, $\omega$ belongs to $K$, $\beta$ belongs to $\mathbb{N} \cup \{-1\}$, and all other parameters are positive integers. The best value for $\delta$ is the right bound of the specified range, but since this value cannot be directly read off from the input, we do not insist that $\delta$ be equal to this value. Instead, we allow $\delta$ to be any number between the bound from Lemma 8 and the true degree drop. The flags $\varphi_1$ and $\varphi_2$ will be used below in the ansatz for the telescoper; $\varphi_3$ will play a role afterwards in the ansatz for the certificate.

In terms of the parameters defined in Definition 10, the degree bounds of Lemma 9 simplify to
\[ \deg_x N_{r,i} \le \deg_x c_0 + \alpha r + \max\{\beta, -1\}\, i - [[\omega \in \mathbb{N} \wedge i > \omega]]\, \delta, \qquad \deg_y N_{r,i} \le \deg_y c_0 + \gamma r. \]

Lemma 9 suggests reasonable choices for the degrees $d_i$ in the ansatz for $P$. In particular, our choice is based on the following features of the formulas in Lemma 9.
• The degree of the numerator in $(D_x^i h)/h$ varies with $i$. A good choice for the degrees $d_i$ will compensate for this variation, taking higher values for $d_i$ when the numerator of $(D_x^i h)/h$ has low degree, and vice versa. This is the key idea of the Verbaeten completion (Verbaeten, 1974, 1976; Wegschaider, 1997).
• The leading coefficients of $N_{r,i}$ ($i > 0$) are polynomials in $y$, but in case 2, most of them are $K$-multiples of each other. When $a$ and $b$ are such that $(\operatorname{lc}_x a)/(\operatorname{lc}_x b) \in K$, then this is also true in case 1. We will use this fact for eliminating several equations at the cost of a single variable.

Before describing the ansatz for $P$ in full generality, we motivate the construction by an example.

Fig. 2. The ansatz for $P$ discussed in Example 11 (labels: $d = 7$, $r = 5$, $w = 3$).

Example 11.
Suppose that $h$ is hyperexponential with $\mathrm{lc}_x a = \mathrm{lc}_x b$, $\beta = 1$ (case 1 of Lemma 9), and $\deg_x c_0 = 0$.

Let $r = 5$ and $d = 7$. We want to choose $d_i$ such that $\max_{i=0}^{5} d_i = 7$ and the ansatz
\[ P = \sum_{i=0}^{5} \sum_{j=0}^{d_i} p_{i,j}\, x^j D_x^i \]
leads to “many” variables but only “few” equations. The choice with most variables is clearly to set $d_i = d = 7$ for all $i$. But this ansatz leads to quite many equations. Each term $x^j D_x^i$ contributes to the common numerator a polynomial $x^j N_{5,i}$ whose degree in $x$ is $5\alpha + i + j$ and whose degree in $y$ is at most $5\gamma$. Because of the term $x^7 D_x^5$, we must expect up to $(5\alpha + 13)(5\gamma + 1)$ terms in the numerator. This is the expected number of equations in the linear system resulting from coefficient comparison.

If we remove the term $x^7 D_x^5$ from the ansatz, i.e., if we choose $d_0 = \cdots = d_4 = 7$, $d_5 = 6$, then the number of equations drops to $(5\alpha + 12)(5\gamma + 1)$ because all terms $x^j D_x^i$ other than $x^7 D_x^5$ contribute only polynomials $x^j N_{5,i}$ of lower degree. We save $5\gamma + 1$ equations at the cost of removing a single variable. Removing also the terms $x^6 D_x^5$ and $x^7 D_x^4$ lowers the number of equations further to $(5\alpha + 11)(5\gamma + 1)$, and in general, for any $0 \le w \le 5$, choosing $d_i = 7 - (w + i - 5)^+$ ($i = 0, \dots, 5$) leads to $(5\alpha + 13 - w)(5\gamma + 1)$ equations. The number of variables is $(5 + 1)(7 + 1) - \sum_{k=1}^{w} k = 48 - \tfrac12 w(w + 1)$.

If $w > 1$, we can introduce $w - 1$ additional variables without increasing the number of equations. For instance, let $w = 3$, i.e., the terms $x^j D_x^i$ with i + j ≥
10 have been removed from the ansatz. We reintroduce the terms x D x , x D x , x D x by adding p , (cid:0) (deg x a − deg x b ) x D x − x D x (cid:1) + p , (cid:0) (deg x a − deg x b ) x D x − x D x (cid:1) to the ansatz, getting back the two variables p , and p , but no new equations, because,according to Lemma 9.(1), the assumption lc x a = lc x b implies(deg x a − deg x b ) lc x N , = lc x N , and (deg x a − deg x b ) lc x N , = lc x N , . The final ansatz is depicted in Figure 2. A bullet at ( i, j ) represents a variable p i,j inthe ansatz. White bullets correspond to the reintroduced variables p , and p , whichdo not affect the number of equations. 14he general form of our ansatz for the telescoper is given in the following lemma. Thefirst case is like in the example above when β >
0. For $\beta = 0$, no degree compensation is possible because all $N_{r,i}$ have the same degree. But if $(\mathrm{lc}_x a)/(\mathrm{lc}_x b) \in K$, it is still possible to save some equations by exploiting the linear dependence among the leading terms. In the second case, there is always a degree compensation possible, but unlike in the example above, terms are removed for indices $i$ close to zero rather than close to $r$. When $\omega \in \mathbb{N}$, we provide an alternative ansatz which takes the degree drop $\delta$ into account. Common to all cases are the two basic principles of choosing $d_i$ such as to compensate for the different degrees of the $N_{r,i}$ in Lemma 9, and of installing some additional variables by exploiting the knowledge about the leading terms of the $N_{r,i}$. For the size of the cutoff, we use a new integer parameter $w$, whose optimal value will be determined later.

Lemma 12.
Let h be a hyperexponential term, r ≥ d ≥ x a > deg x b . Let 0 ≤ w ≤ min { r, d/β } ( w := 0 if β = 0), d i := d − β ( w + i − r ) + − φ ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + [[ β = 0]] φ r − X i = r − w +1 p i,d i +1 (cid:16)(cid:0) lc x a lc x b ( β + 1) (cid:1) i x d i +1 D ix − x d r +1 D rx (cid:17) + φ r − X i =0 p i,d i +1 (cid:16)(cid:0) lc x a lc x b (cid:1) i x d i +1 D ix − x d r +1 D rx (cid:17) . Let N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + ( α + β ) r − βw − φ and deg y N ≤ deg y c + γr. (2) Suppose that deg x a ≤ deg x b . Let 0 ≤ w ≤ min { d + 1 , r + 1 } . Let d i := d − ( w − i ) + ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + w − X i =1 p i,d i +1 (cid:16) x d i +1 D ix − ω i x d +1 (cid:17) . Let N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + αr − w and deg y N ≤ deg y c + γr. (2 ′ ) Suppose that deg x a ≤ deg x b and ω ∈ N . Let ω ≤ w ≤ min { d − δ + 1 , r + 1 } . Let d i := d − ( w − i ) + − [[ i ≤ ω ]] δ ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + ω X i =1 p i,d i +1 (cid:16) x d i +1 D ix − ω i x d +1 (cid:17) + w − X i = ω +2 p i,d i +1 (cid:16) x d i +1 D ix − ( − δ − i − ( ω +1) x d ω +1 +1 D ω +1 x (cid:17) . (See Figure 3 for an illustration of the shape of P in this case.)15et N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + αr − w − δ and deg y N ≤ deg y c + γr. Proof. (1) We apply Lemma 9.(1) to each term in the ansatz for P . The claim aboutdeg y N follows directly from the bound on deg y N r,i there. For the bound on deg x N ,first observe thatdeg x x j N r,i ≤ d i + deg x c + αr + βi = deg x c + d + αr + βi − β ( w + i − r ) + − φ = deg x c + d + αr + β ( i − max { w + i − r, } ) − φ ≤ deg x c + d + αr + β ( r − w ) − φ for all i, j with 0 ≤ i ≤ r and 0 ≤ j ≤ d i . This settles the terms coming from thedouble sum. 
For the terms in the first single sum, which only appears when β = 0,we have deg x x d i +1 N r,i = deg x x d r +1 N r,r = deg x c + d + αr + β ( r − w ) + 1 − φ and (cid:0) lc x a lc x b ( β + 1) (cid:1) i lc x N r,i = lc x N r,r for i = r − w + 1 , . . . , r −
1. This impliesdeg x (cid:16)(cid:0) lc x a lc x b ( β + 1) (cid:1) i x d i +1 N r,i − x d r +1 N r,r (cid:17) ≤ deg x c + d + αr + β ( r − w ) − φ , as desired. The argument for the second single sum, which only appears when β = 0,is analogous.(2) Now we use Lemma 9.(2). Again, the claim about deg y N follows immediately. Forthe bound on deg x N , first observe thatdeg x x j N r,i ≤ d i + deg x c + αr − i = deg x c + d + αr − i − ( w − i ) + ≤ deg x c + d + αr − w. This settles the terms in the double sum. For the terms in the single sum, we havedeg x x d i +1 N r,i = deg x x d +1 N r, = deg x c + d + αr − w + 1and lc x N r,i = lc x ω i N r, for i = 1 , . . . , w −
1, and thereforedeg x (cid:16) x d i +1 N r,i − ω i x d +1 N r, (cid:17) ≤ deg x c + d + αr − w. (2 ′ ) In this case, the terms in the double sum contribute polynomials of degreedeg x x j N r,i ≤ d i + deg x c + αr − i − [[ i > ω ]] δ = deg x c + d + αr − i − ( w − i ) + − [[ i ≤ ω ]] δ − [[ i > ω ]] δ ≤ deg x c + d + αr − w − δ. For the terms in the first single sum, we havedeg x x d i +1 N r,i = deg x x d +1 N r, = deg x c + d + αr − w + 1 − δ t t t t t t t t t t tt t t t t t t t t t t tt t t t t t t t t t t tt t t t t t t t t t t❞ t t t t t t t t t t❞ t t t t t t t t tt t t t t t t t tt t t t t t t t tt t t t t t t t tt t t t t t t t❞ t t t t t t t d = 100 ω = 2 r = 11 δ =3 w =5 z }| { Fig. 3. The ansatz for P in case 2 ′ of Lemma 12 and lc x N r,i = lc x ω i N r, for i = 1 , . . . , ω , and thereforedeg x (cid:16) x d i +1 N r,i − ω i x d +1 N r, (cid:17) ≤ deg x c + d + αr − w − δ. Similarly, for the terms in the second single sum, we havedeg x x d i +1 N r,i = deg x x d ω +1 +1 N r,ω +1 ≤ deg x c + d + αr − w + 1 − δ. If the inequality is strict, we are done. Otherwise, δ is maximal and we havelc x N r,i = lc x ( − δ − i − ( ω +1) N r,ω +1 for i = ω + 2 , . . . , w −
1, and thereforedeg x (cid:16) x d i +1 N r,i − ( − δ − i − ( ω +1) x d ω +1 +1 N r,ω +1 (cid:17) ≤ deg x c + d + αr − w − δ, and we are also done. ✷ Lemma 12 makes a statement on the number of equations to be expected when theansatz for P is made in the form as indicated. This number of equations is equal to thenumber of terms x i y j in N , and this number is bounded by (deg x N + 1)(deg y N + 1),for which upper bounds are stated in the lemma. We also need to count the number ofvariables p i,j . This number is easily obtained from the sum expressions given for P in thevarious cases by replacing all the summand expressions by 1. After some straightforwardand elementary simplifications which we do not want to reproduce here, the statisticsare as follows. • In case 1, the number of variables is( r + 1)( d + 1) − βw ( w + 1) + φ ( w − + − φ . • In case 2, the number of variables is( r + 1)( d + 1) − w ( w + 1) + ( w − + . • In case 2 ′ , the number of variables is( r + 1)( d + 1) − w ( w + 1) − δ ( ω + 1) + ω + ( w − ω − + . P .We will next discuss the ansatz for the certificate Q , which will bring many additionalvariables, but, by a careful construction, no additional equations. The design of the ansatz for the certificate is much simpler. Here, the goal is to setup Q in such a way that ( D y Q ) /h has the same denominator and the same numeratordegrees in x and y as ( P h ) /h does (in order to not create more equations than necessary),and that ( D y Q ) /h cannot become zero (in order to enforce that P = 0 in every solutionwe find).A direct calculation like in the proof of Lemma 6 confirms that the first requirementis satisfied by choosing Q = s P i =0 s P j =0 q i,j x i y j c (cid:0) bb ∗ Q Lℓ =1 c ℓ (cid:1) r − h with s = deg x c + d + ( α + β )( r − − βw − φ − x c + d + α ( r − − w in case 2 of Lemma 12;deg x c + d + α ( r − − w − δ in case 2 ′ of Lemma 12and s = deg y c + γ ( r −
1) + 1 in all cases.

This ansatz provides $(s_1 + 1)(s_2 + 1)$ variables. To ensure that $D_y Q \neq 0$ for every choice of $q_{i,j}$, observe that $D_y Q = 0$ can only happen if $h$ is a rational function with respect to $y$, meaning $a, b \in K[x]$ and $c_\ell \in K[x]$ for all $\ell$ with $e_\ell \notin \mathbb{Z}$. In this case, we have $D_y Q = 0$ if and only if the $q_{i,j}$ are instantiated in such a way that the resulting $Q$ is free of $y$, and this can only happen if the choice of $q_{i,j}$ is made in such a way that the numerator degree in $y$ is equal to the denominator degree in $y$. The denominator degree is
\[ \sum_{\ell=1}^{L} (r - 1 - e_\ell) \deg_y c_\ell = \gamma(r - 1) - \eta, \qquad \text{where } \eta = \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell, \]
which is less than $s_2 = \deg_y c_0 + \gamma(r - 1) + 1$ if and only if $\deg_y c_0 + \eta + 1 > 0$. If we remove all the terms $q_{i,j} x^i y^j$ with $j = \gamma(r - 1) - \eta$ from the ansatz, no instantiation of the remaining $q_{i,j}$ can turn $Q$ into a term independent of $y$, so we can be sure that $D_y Q \neq 0$ in this modified setup. The number of variables in this modified ansatz is $(s_1 + 1) s_2$. The flag $\varphi_3$ defined in Definition 10 is set up in such a way that we can in all cases assume an ansatz for $Q$ with $(s_1 + 1)(s_2 + 1 - \varphi_3)$ variables. The following lemma summarizes the two versions of the ansatz for $Q$.

Lemma 13.
Let $h$ be a hyperexponential term.

(1) If $\max\{\deg_y a, \deg_y b\} > 0$ or $\deg_y c_\ell > 0$ for some $\ell$ with $e_\ell \notin \mathbb{Z}$, then for every $s_1, s_2 \in \mathbb{N}$ and every choice of $q_{i,j} \in K$ where not all $q_{i,j}$ are equal to zero we have
\[ D_y\, \frac{\sum_{i=0}^{s_1} \sum_{j=0}^{s_2} q_{i,j}\, x^i y^j}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^{r-1}}\, h \neq 0. \]

(2) If $\deg_y a = \deg_y b = 0$ and $\deg_y c_\ell = 0$ for all $\ell$ with $e_\ell \notin \mathbb{Z}$, then for every $s_1, s_2 \in \mathbb{N}$ and every choice of $q_{i,j} \in K$ where not all $q_{i,j}$ are equal to zero we have
\[ D_y\, \frac{\sum_{i=0}^{s_1} \Bigl( \sum_{j=0}^{(r-1)\gamma - \eta - 1} q_{i,j}\, x^i y^j + \sum_{j=(r-1)\gamma - \eta + 1}^{s_2} q_{i,j}\, x^i y^j \Bigr)}{c_0 \bigl( b b^* \prod_{\ell=1}^{L} c_\ell \bigr)^{r-1}}\, h \neq 0, \]
where $\eta = \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell$.
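The variable counts stated for the telescoper ansatz of Lemma 12 can be cross-checked by brute-force enumeration. The following Python sketch does this for case 1; the function names are illustrative only and do not come from any existing implementation. It assumes that the first extra sum is present only for β ≠ 0 and contributes (w − 1)^+ unknowns when the flag φ_1 is set, and that the second extra sum contributes r unknowns when the flag φ_2 is set. For r = 5, d = 7, w = 3, β = 1 as in Example 11, it yields 42 ordinary and 2 reintroduced unknowns.

```python
def case1_degrees(r, d, w, beta, phi2=0):
    """Degrees d_i = d - beta*(w + i - r)^+ - phi2 for i = 0..r."""
    return [d - beta * max(w + i - r, 0) - phi2 for i in range(r + 1)]


def case1_variables(r, d, w, beta, phi1, phi2):
    """Count the unknowns p_{i,j} of the case-1 ansatz by enumeration.

    Returns (unknowns in the double sum, reintroduced unknowns).
    """
    degrees = case1_degrees(r, d, w, beta, phi2)
    # Double sum: one unknown per monomial x^j D_x^i with 0 <= j <= d_i.
    regular = sum(di + 1 for di in degrees)
    # First single sum (assumed active only for beta != 0): i = r-w+1 .. r-1.
    extra_first = phi1 * len(range(r - w + 1, r)) if beta != 0 else 0
    # Second single sum (flag phi2, which requires beta == 0): i = 0 .. r-1.
    extra_second = phi2 * r
    return regular, extra_first + extra_second


def case1_closed_form(r, d, w, beta, phi1, phi2):
    """Closed form (r+1)(d+1) - beta*w*(w+1)/2 + phi1*(w-1)^+ - phi2."""
    return ((r + 1) * (d + 1) - beta * w * (w + 1) // 2
            + phi1 * max(w - 1, 0) - phi2)


if __name__ == "__main__":
    # Data of Example 11: r = 5, d = 7, w = 3, beta = 1, lc_x a = lc_x b.
    print(case1_degrees(5, 7, 3, 1))            # [7, 7, 7, 6, 5, 4]
    print(case1_variables(5, 7, 3, 1, 1, 0))    # (42, 2)
    print(case1_closed_form(5, 7, 3, 1, 1, 0))  # 44
```

Enumerating the monomials rather than trusting the closed form is a cheap safeguard when the ansatz shape is modified, since each change of the $d_i$ alters the count in a way that is easy to get wrong by hand.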
4. Solving the Inequalities
As the result of the previous section, we obtain counts for the number of variables and the number of equations for a particular family of ansatzes which are parameterized by the desired order $r$ and degree $d$ of the telescoper, various Greek parameters introduced in Definition 10, which measure the input, and one additional parameter $w$ by which the shape of the ansatz can be modulated. A sufficient condition for the existence of a solution of order (at most) $r$ and degree (at most) $d$ is
\[ \#\mathrm{vars}(r, d, w) - \#\mathrm{eqns}(r, d, w) > 0. \]
For any particular choice of $w$ from the ranges specified for the various cases in Lemma 12, we obtain a valid sufficient condition connecting $r$ and $d$ via the Greek parameters. Any of these conditions defines a region in $\mathbb{N}^2$ which is inside the gray region from the introduction. To make this region as large as possible (and hence, as close as possible to the gray region), we will choose $w$ in such a way that the left hand side, considered as a function in $w$, is maximal.

It comes in handy that $\#\mathrm{vars}(r, d, w) - \#\mathrm{eqns}(r, d, w)$ is a (piecewise) quadratic polynomial with respect to $w$, so the optimal choice of $w$ is easily found by equating its derivative with respect to $w$ to zero and rounding the solution to the nearest integer. If this point is outside the range to which $w$ is constrained, then the maximum is assumed at one of the two boundary points of the range.

The following theorem, which is the main result of this article, contains the bounds which we obtained by applying this reasoning to the explicit expressions derived for $\#\mathrm{vars}(r, d, w)$ and $\#\mathrm{eqns}(r, d, w)$ in the previous section for the various cases to be considered.

Theorem 14.
Let $h = c_0 \exp\bigl(\tfrac{a}{b}\bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell}$ be a hyperexponential term and let $\alpha, \beta, \gamma, \delta, \omega, \varphi_1, \varphi_2, \varphi_3$ be as in Definition 10 and set ψ = γ + φ₃ −
2. Then a creative telescoping relation for h of order r and degree d existswhenever r ≥ ψ + 1 and d > ϑ r + ϕr − ψ , where ϑ and ϕ are defined as follows.(1) If deg x a > deg x b , let ϑ = ( α + β )(2 γ − φ ) + γ − ,ϕ = deg x c + ( α + β + 1) deg y c + ( γ − φ )(deg x c − α − β − φ ) − (1 − φ )( γ − φ ) + (cid:0) φ + β ( γ − φ ) (cid:1) . (2) If deg x a ≤ deg x b , let ϑ = α (2 γ − φ ) − ,ϕ = deg x c + α deg y c + ( γ − φ )(deg x c + 1 − α ) − ( γ − φ ) + ( γ + 1 + φ ) . If furthermore ω ∈ N and γ − φ > ω and δ = ω + 1, then ϕ can be replaced by ϕ ′ = ϕ − δ ( γ − φ − ω ) + 1 . Proof. (1) Suppose deg x a > deg x b . According to the calculations done in the previoussection, in this case there exists an ansatz with( r + 1)( d + 1) − βw ( w + 1) + φ ( w − + − φ variables coming from the telescoper P , (cid:0) deg x c + d + ( α + β )( r − − βw − φ (cid:1)(cid:0) deg y c + γ ( r −
1) + 2 − φ (cid:1) variables coming from the certificate Q , and (cid:0) deg x c + d + ( α + β ) r − βw − φ + 1 (cid:1)(cid:0) deg y c + γr + 1 (cid:1) equations. Therefore, a creative telescoping relation exists provided that( r + 1)( d + 1) − βw ( w + 1) + φ ( w − + − φ + (deg x c + d + ( α + β )( r − − βw − φ )(deg y c + γ ( r −
1) + 2 − φ ) − (deg x c + d + ( α + β ) r − βw − φ + 1)(deg y c + γr + 1) > . For r ≥ γ − φ , this inequality is equivalent to d > (cid:16)(cid:0) ( α + β )(2 γ − φ (cid:1) + γ − (cid:1) r + deg x c + ( α + β + 1) deg y c + ( γ − φ )(deg x c − α − β − φ ) (2)+ βw ( w − γ + 3 − φ ) − φ ( w − + (cid:17).(cid:16) r − γ + 2 − φ (cid:17) . The choice w = 0 proves the claim when φ = 1 or γ ≤ − φ . Now supposethat φ = 0 and γ > − φ . The claimed estimate is obtained for the choice w = γ − φ >
0. We have to show that this choice is admissible, i.e., that1 ≤ γ − φ ≤ min { r, d/β } . Because of γ > − φ , the lower bound is clear, and r ≥ γ − φ holds by assumption. To see that γ − φ ≤ d/β , observe that20he right hand side of (2) converges to ( α + β )(2 γ − φ ) + γ − r → ∞ .Since its numerator is nonnegative (as checked by a straightforward calculation),it follows that this inequality implies d > ( α + β )(2 γ − φ ) + γ − ≥ β ( γ − φ ) , as desired.(2) Now assume deg x a ≤ deg x b . From the counts of variables and equations in theansatz described in Lemma 12.(2), we find that a creative telescoping equationexists provided that( r + 1)( d + 1) − w ( w + 1) + ( w − + + (deg x c + d + α ( r − − w + 1)(deg y c + γ ( r −
1) + 2 − φ ) − (deg x c + d + αr − w + 1)(deg y c + γr + 1) > . For r ≥ γ − φ , this inequality is equivalent to d > (cid:16) ( α (2 γ − φ ) − r + deg x c + α deg y c + ( γ − φ )(deg x c + 1 − α )+ ( − γ − φ ) w + w − ( w − + (cid:17).(cid:16) r − γ + 2 − φ (cid:17) . Regardless of the choice of w , the right hand side is at least α (2 γ − φ ) − w = 0and on the other hand, if γ > − φ , from the choice w = γ − φ , which also inthis case is in the required range because 1 ≤ γ − φ ≤ α (2 γ − φ ) − < d and γ − φ ≤ r .The second estimate is obtained from the alternative ansatz from Lemma 12.(2 ′ ).The inequality in this case is( r + 1)( d + 1) − w ( w + 1) − δ ( ω + 1) + ω + ( w − ω − + + (deg x c + d + α ( r − − w − δ + 1)(deg y c + γ ( r −
1) + 2 − φ ) − (deg x c + d + αr − w − δ + 1)(deg y c + γr + 1) > , which for r ≥ γ − φ and w = γ − φ is equivalent to d > ( α (2 γ − φ ) − r + ϕ ′ r − γ + 2 − φ . It remains to show that the choice w = γ − φ is compatible with the rangerestrictions for w applicable in the present case. While the requirements ω ≤ γ − φ ≤ r + 1 are satisfied by assumption, the requirement γ − φ ≤ d − δ + 1is less obvious. A sufficient condition is( α (2 γ − φ ) − r + ϕ ′ r − γ + 2 − φ ≥ γ − φ + δ. It can be shown easily with Collins’s cylindrical algebraic decomposition algo-rithm (Collins, 1975; Caviness and Johnson, 1998) (e.g., with its implementationin Mathematica (Strzebo´nski, 2000, 2006)) that this latter inequality follows fromdeg x c ≥
0, deg y c ≥ α ≥ r ≥ γ − φ ≥ ω +1 ≥ δ = ω +1, φ ( φ −
1) = 0,and ϕ ′ = deg x c + α deg y c + δω + 1 + ( γ − φ )(deg x c − α − ( γ − φ ) − δ ) . ✷ As we do not claim that our bounds are sharp, no justification for the various choices of w are required in the proof. But of course, the choices were made following the reasoningoutlined before the theorem. For example, in case 1 the main inequality is( r + 1)( d + 1 + φ ) − βw ( w + 1) + φ ( w − + − φ + (deg x c + d + ( α + β )( r − − βw − φ + 1)(deg y c + γ ( r −
1) + 2 − φ ) − (deg x c + d + ( α + β ) r − βw − φ + 1)(deg y c + γr + 1) > . Differentiating the left hand side with respect to w gives − βw − β + βγ + φ + βφ , which vanishes for w = γ − + φ + φ /β . The unique nearest integer point is ⌊ γ − + φ + φ /β ⌉ = γ − φ when φ /β = 1. When φ /β = 1, there are two nearestinteger points γ − φ and γ + φ , and since the maximum is exactly between them andquadratic parabolas are symmetric about their extremal points, the values at γ − φ and γ + φ agree. In conclusion, the choice w = γ − φ is optimal in both cases.The calculations for the other cases are similar. But note that having chosen w op-timally does not imply that the bounds given in the Theorem 14 are tight, because thewhole argument relies on counting variables and equations for the particular ansatz fam-ily introduced in Section 3, and we cannot claim that this shape is best possible. Recallthat we aim at an ansatz for which the number of solutions of the resulting linear systemis equal to (or at least not much larger than) the difference between number of variablesand number of equations. One way of measuring the quality of our ansatz, and hence thetightness of our bounds, is to compare the region of all points ( r, d ) where an ansatz fororder r and degree d actually has a solution (the “gray region” from the introduction)with the region of all points ( r, d ) for which Theorem 14 guarantees the existence of asolution. The following collection of examples shows that there are cases where Theo-rem 14 is extremely accurate as well as cases where there is a clear gap between thepredicted shape and the actual shape of the gray region. As a reference ansatz for exper-imentally determining in the examples whether a specific point ( r, d ) belongs to the grayregion, we checked whether the naive ansatz where d = d = · · · = d r (i.e., w = 0) as asolution, because every solution of some refined ansatz with w > w = 0. 
It is not guaranteed however that this ansatz covers all creative telescoping relations. Additional relations at points $(r, d)$ outside of what we indicate as the gray region may exist. For example, when our ansatz leads to a solution $(P, Q)$ in which all the polynomial coefficients of $P$ share a nontrivial common factor $f \in K[x]$, then $(P/f, Q/f)$ is another relation with a telescoper of lower degree. This phenomenon can often be observed for the minimal order telescoper, but as we do not know of any efficient way of detecting it also for the nonminimal ones, we can unfortunately not take it into account in the figures.
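The counting argument behind Theorem 14 can be replayed numerically. The Python sketch below evaluates the number of variables minus the number of equations, using the counts as read from the proof of Theorem 14 (cases 1 and 2), and searches for the smallest degree $d$ for which some admissible $w$ gives a positive surplus. The function names are illustrative, and the assignment of the flags φ_1, φ_2, φ_3 to the individual terms is our reading of Definition 10 and should be treated as an assumption. With the parameter values quoted in Example 15.(1) and 15.(4), the search reproduces the bounds $d > (12r + 11)/(r - 1)$ and $d > (27r + 3)/(r - 2)$ stated there.

```python
def surplus_case1(r, d, w, alpha, beta, gamma, cx, cy, phi1=0, phi2=0, phi3=0):
    """#variables - #equations in case 1 (deg_x a > deg_x b).

    cx, cy stand for deg_x c_0 and deg_y c_0.
    """
    vars_p = ((r + 1) * (d + 1) - beta * w * (w + 1) // 2
              + phi1 * max(w - 1, 0) - phi2)
    vars_q = ((cx + d + (alpha + beta) * (r - 1) - beta * w - phi2)
              * (cy + gamma * (r - 1) + 2 - phi3))
    eqns = ((cx + d + (alpha + beta) * r - beta * w - phi2 + 1)
            * (cy + gamma * r + 1))
    return vars_p + vars_q - eqns


def surplus_case2(r, d, w, alpha, gamma, cx, cy, phi3=0):
    """#variables - #equations in case 2 (deg_x a <= deg_x b)."""
    vars_p = (r + 1) * (d + 1) - w * (w + 1) // 2 + max(w - 1, 0)
    vars_q = ((cx + d + alpha * (r - 1) - w + 1)
              * (cy + gamma * (r - 1) + 2 - phi3))
    eqns = (cx + d + alpha * r - w + 1) * (cy + gamma * r + 1)
    return vars_p + vars_q - eqns


def min_degree(r, surplus, w_max, d_limit=10_000):
    """Smallest d such that some admissible w yields a positive surplus."""
    for d in range(d_limit):
        if any(surplus(r, d, w) > 0 for w in range(w_max(r, d) + 1)):
            return d
    return None


# Example 15.(1): alpha=0, beta=2, gamma=3, deg_x c_0 = deg_y c_0 = 3, flags 0.
def example_15_1(r, d, w):
    return surplus_case1(r, d, w, alpha=0, beta=2, gamma=3, cx=3, cy=3)


def example_15_1_wmax(r, d):
    return min(r, d // 2)            # 0 <= w <= min{r, d/beta}


# Example 15.(4): alpha=4, gamma=4, deg_x c_0 = deg_y c_0 = 2, phi3 = 0.
def example_15_4(r, d, w):
    return surplus_case2(r, d, w, alpha=4, gamma=4, cx=2, cy=2)


def example_15_4_wmax(r, d):
    return min(d + 1, r + 1)         # 0 <= w <= min{d+1, r+1}


if __name__ == "__main__":
    for r in (2, 3, 5, 10):
        print(r, min_degree(r, example_15_1, example_15_1_wmax))
    for r in (3, 5, 10):
        print(r, min_degree(r, example_15_4, example_15_4_wmax))
```

Searching over all admissible $w$ instead of fixing one value keeps the check independent of any particular closed form for the optimal cutoff.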
Example 15. (1) Consider the term $h = u \exp(v)$ where
\[ u = 7x^3y^3 + 8x^3y^2 + 9x^3y + 3x^3 + 10x^2y^3 + 2x^2y^2 + 3x^2y + 9x^2 + 7xy^3 + 4xy^2 + 5xy + 3x + 9y^3 + 6y^2 + 6y + 1, \]
\[ v = 6x^3y^3 + 4x^3y^2 + x^3y + 9x^3 + 8x^2y^3 + 8x^2y^2 + 2x^2y + 8x^2 + 3xy^3 + 7xy^2 + 4xy + 8x + 5y^3 + 2y^2 + 7y + 6. \]
We are in case 1 of Theorem 14 and have $\alpha = 0$, $\beta = 2$, $\gamma = 3$, $\varphi_1 = \varphi_2 = \varphi_3 = 0$, $\deg_x c_0 = \deg_y c_0 = 3$. According to the theorem, we expect creative telescoping relations for all $(r, d)$ with $r \ge 2$ and $d > (12r + 11)/(r - 1)$. Figure 4.(a) shows the curve d = (12r + 11)/(r −
1) together with the gray region. In this example, thegray region consists exactly of the integer points above the curve: the bound is astight as can be.(2) Now consider the term h = exp( u ) /v where u = 4 x y + 7 x y + 9 x + 5 xy + 2 xy + 3 x + 5 y + y + 6 ,v = 6 x y + 10 x y + 6 x + 9 xy + 5 xy + 8 x + 8 y + 10 y + 8 . We are again in case 1 of the theorem and we have α = 2, β = 1, γ = 4, φ = φ = φ = 0, deg x c = deg y c = 2. The estimate from Theorem 14 is now d > (24 r − / ( r − h be the rational function from the introduction. Then we are in case 2of the theorem and we have α = 3, β = − γ = 3, ω = − δ = 0, φ = 1, φ = 0, φ = 1, deg x c = deg y c = 2. The bound from the theorem is now d > (17 r + 3) / ( r − h = u/v with u = 4 x y + 7 x y + 9 x + 5 xy + 2 xy + 3 x + 5 y + y + 6 ,v = (cid:0) x y + 10 x y + 6 x + 9 xy + 5 xy + 8 x + 8 y + 10 y + 8 (cid:1) × (cid:0) x y + 7 x y + 4 x + 5 xy + 3 xy + 7 x + 9 y + 7 y + 7 (cid:1) . This term is also covered by case 2 of the theorem, and we have α = 4, β = − γ = 4, ω = − δ = − φ = 1, φ = 0, φ = 0, deg x c = deg y c = 2. Theestimate d > (27 r + 3) / ( r −
2) from the theorem is correct but not tight, as shownin Figure 4.(d).(5) Finally, let h = √ u with u = 4 x y + 8 x y + 2 x y + 7 x y + 7 x y + 2 x y + 7 x + 10 xy + 7 xy + 9 xy + 4 xy + 5 xy + 5 xy + 7 x + 4 y + 3 y + 2 y + 8 y + 3 y + 7 y + 2 . Now the alternative bound of case 2 with ϕ ′ in place of ϕ is applicable because wehave ω = 1 ∈ N . The bound using ϕ is d > (21 r − / ( r − r = 14. In contrast, the bound d > (21 r − / ( r − ϕ ′ is tight for all r > r = 5. Thesituation is shown in Figure 5. On the right, we show a comparison of the sharpbound based on ϕ ′ (solid), the bound based on ϕ (dashed) and the bound whichwould be obtained by choosing w = 0 instead of w = γ − φ in the proof ofTheorem 14 (dotted).There are several ways of refining the ansatz for P and Q even further in order toachieve better estimates where ours are not sharp. Here are some ideas.23a) r d (b) r d (c) r d (d) r d Fig. 4. Sizes ( r, d ) of creative telescoping relations together with the curve predicted by Theo-rem 14, for the hyperexponential terms discussed in Example 15. • The possibility of introducing extra variables without increasing the number of equa-tions (depicted by the white bullets in Figures 1 and 2) rests on the observation madein Lemma 9 that the leading coefficients lc x N r,i are K -multiples of each other, i.e.,that these leading coefficients generate a linear subspace of K [ y ] of dimension one. Ex-periments suggest that this observation can be generalized to the coefficients of lowerdegree as follows: If V j ⊆ K [ y ] denotes the vector space generated by the coefficientsof x deg x N r,i − j in N r,i ( i = 0 , . . . , r ), then V ⊆ V ⊆ · · · ⊆ V j and dim V j ≤ j + 1 atleast for small j . If this is true, it would allow adding more extra variables withoutincreasing the number of equations. • In general, comparing coefficients of the monomials x i y j of a polynomial S to zeroresults in a linear system with (deg x S + 1)(deg y S + 1) equations. 
But if $S$ contains some factor which is free of the variables $p_{i,j}$ and $q_{i,j}$, then canceling this factor before comparing coefficients results in a system with fewer equations and the same number of variables. While in our case, it is too much to hope for a factor which would divide $S$ as a whole, it seems that at least in some cases, factors can be removed from $\mathrm{lc}_x S \in K[y]$ or $\mathrm{lc}_y S \in K[x]$. For example, when $\deg_x a > \deg_x b$ and $\deg_y a > \deg_y b$, it can be shown that $\prod_{\ell=1}^{L} \mathrm{lc}_x c_\ell \mid \mathrm{lc}_x S$ and $\prod_{\ell=1}^{L} \mathrm{lc}_y c_\ell \mid \mathrm{lc}_y S$, so $\sum_{\ell=1}^{L} \bigl( \deg_y \mathrm{lc}_x c_\ell + \deg_x \mathrm{lc}_y c_\ell \bigr)$ equations can be discarded in this case.

Fig. 5. Left: Sizes $(r, d)$ of creative telescoping relations together with the curve predicted by Theorem 14, for the term discussed in Example 15.(5). Right: a detail of the figure on the left in a larger scale, together with the curve based on $\phi$ instead of $\phi'$ (dashed) and the curve based on $w = 0$ (dotted). The correct degrees are precisely the smallest integers strictly above the solid curve. The two variations both overshoot for all the points in this range.

We have not worked out the influence of these variations in full generality, but only on some examples. It turned out that they indeed lead to tighter estimates, but the difference is rather small, and decays to zero for large $r$. At the same time, they would lead to much more complicated formulas. We do not know the reason for the gap in Examples 15.(2) and 15.(4) between the curve from Theorem 14 and the boundary of the gray region for $r \to \infty$. Even though it appears more important for a bound to be tight for small orders than for large ones, we would be very interested in seeing a refined bound which closes this gap.

It is also interesting to compare the gray regions for hyperexponential terms composed from dense random polynomials with the gray regions for hyperexponential terms of the same shape that originate from some specific application.
According to our experiments, the shape of the gray region for a randomly chosen term $h = c_0 \exp(a/b) \prod_{\ell=1}^{L} c_\ell^{e_\ell}$ only depends on the number $L$ of factors in the product, the degrees of the polynomials $a, b, c_0, \dots, c_L$, and the exponents $e_1, \dots, e_L$. However, input containing sparse polynomials or polynomials which in some other sense have a “structure” may well lead to considerably smaller degrees.

Example 16. If $a_{n,k}$ denotes the number of HC-polyominoes with $n$ cells and $k$ rows (Wilf, 1989, Section 4.9), then
\[ \sum_{n,k=0}^{\infty} a_{n,k}\, x^n y^k = \frac{xy(1 - x)^3}{(1 - x)^4 - xy(1 - x - x^2 + x^3 + x^2 y)}. \]
A differential equation for the generating function $\sum_{n=0}^{\infty} a_{n,n} x^n$ of the number of HC-polyominoes with $n$ cells and $n$ rows can be obtained by applying creative telescoping to the rational function obtained from the rational function above by substituting $x$ by $y$, $y$ by $x/y$, and dividing the result by $y$. Let thus
\[ h = \frac{1}{y} \cdot \frac{y \frac{x}{y} (1 - y)^3}{(1 - y)^4 - y \frac{x}{y} \bigl( 1 - y - y^2 + y^3 + y^2 \frac{x}{y} \bigr)} = \frac{x(1 - y)^3}{y \bigl( (1 - y)^4 - x(1 - y + xy - y^2 + y^3) \bigr)}. \]
Here we have $c_0 = x(1 - y)^3$, $a = 0$, $b = 1$, $c_1 = y$, $c_2 = (1 - y)^4 - x(1 - y + xy - y^2 + y^3)$, $e_1 = e_2 = -1$. The gray region for $h$ is shown in light gray in Figure 6. For comparison, the same figure contains the gray region (in dark gray) for a term $g$ which was obtained from $h$ by replacing $c_0$ and $c_2$ by dense random polynomials with $\deg_x c_0 = 1$, $\deg_y c_0 = 3$, $\deg_x c_2 = 2$, $\deg_y c_2 = 4$, so that all the Greek parameters have precisely the same values for $g$ and $h$.

Fig. 6. Gray regions for the two terms $h$ (light gray) and $g$ (dark gray) from Example 16. Although all Greek parameters have the same values for $h$ and $g$ (and hence, Theorem 14 gives the same degree estimation curve), the actual gray regions differ significantly.

Theorem 14 predicts relations for all points $(r, d)$ above the black curve in Figure 6, which is a good estimate for the generic term $g$ but a significant overestimation for the special term $h$.
5. Consequences and Applications
Our theorem contains as a special case Theorem cAZ of Apagodu and Zeilberger(2006), which says that a (non-rational) hyperexponential term always admits a tele-scoper of order r = γ + 1, but makes no statement about its degree d . Similarly, we canalso give an estimate for the possible degrees d without paying attention to their orders r . Corollary 17. (1) For every hyperexponential term h , there exists a creative telescop-ing relation of order r = ψ + 1 = γ + 1 − φ .(2) For every hyperexponential term h , there exists a creative telescoping relation ofdegree d = ϑ + 1 = ( α + β )(2 γ − φ ) + γ if deg x a > deg x b ; α (2 γ − φ ) if deg x a ≤ deg x b . Proof.
Both claims are immediate by the formulas given in Theorem 14. ✷
In connecting order $r$ and degree $d$ into a single formula, Theorem 14 makes a much stronger statement than this corollary. Assuming for simplicity that the bounds of Theorem 14 are tight, we can use them to compute optimal choices for order and degree of the telescoper. There are various quantities which one may want to minimize. Besides asking for a bound on the minimal order or the minimal degree, as carried out above, we may ask for a choice $(r, d)$ where the computational cost is minimal, or the total size $S(r, d) := (r + 1)(d + 1) + (s_1 + 1)(\deg_y c_0 + \gamma(r - 1) + 2)$ of the output (consisting of telescoper and certificate), or the size $T(r, d) := (r + 1)(d + 1)$ of the output telescoper alone. Or, if the telescoper $P$ is to be transformed into a recurrence for the series coefficients of its solutions, one may want to minimize the order of this recurrence, which is bounded by $R(r, d) := r + d$ (see, e.g., Thm. 7.1 in Kauers and Paule, 2011).

For minimizing the computational cost, we first have to fix a particular algorithm for computing $P$ and $Q$ for given $h$. We are not forced to follow the algorithm which is implicit in the analysis of Sections 3 and 4 (making an ansatz, comparing coefficients with respect to $x$ and $y$ to zero, and solving a linear system of equations over $K$). In fact, this algorithm has a rather poor performance. It is much better to do a coefficient comparison with respect to $y$ only and to solve a linear system of equations over $K(x)$. This is also what is proposed in the original articles (Almkvist and Zeilberger, 1990; Mohammed and Zeilberger, 2005; Apagodu and Zeilberger, 2006) and what is used in practice (Koutschan, 2009, 2010). Output sensitive linear system solvers based on Hermite-Padé approximation (Beckermann and Labahn, 1994; Storjohann and Villard, 2005; Bostan et al., 2007) are able to determine the degree $n$ solutions of a linear system over $K(x)$ with $m$ variables and at most $m$ equations using $O^\sim(nm^3)$ operations in $K$. Since an ansatz over $K(x)$ will have only $r + 1$ variables coming from the telescoper, $\deg_y c_0 + \gamma(r - 1) - \varphi_3 + 2$ variables coming from the certificate, and a solution of degree $s_1$ with respect to $x$, it seems reasonable to assume that the computational cost is minimal for a choice $(r, d)$ which minimizes the function
\[ C(r, d) := s_1 \bigl( \deg_y c_0 + (\gamma + 1) r - \gamma - \varphi_3 + 3 \bigr)^3. \]

Example 18.
Consider a hyperexponential term $h = c_0 \exp(a/b) \sqrt{c_1}$ where $a, b, c_0, c_1 \in K[x, y]$ have the degrees $\deg_x a = \deg_y a = \deg_x b = \deg_y b = 1$, $\deg_x c_0 = \deg_y c_0 = 2$, $\deg_x c_1 = 4$, $\deg_y c_1 = 6$. We are in case 2 of Theorem 14 and have $\alpha = 6$, $\beta = -1$, $\gamma = 8$, $\omega = 4$, $\delta = 5$, $\varphi_1 = 0$, $\varphi_2 = 0$, $\varphi_3 = 0$. According to the theorem, a creative telescoping relation exists for $(r, d)$ with $r \ge 7$ and
\[ d \ge \frac{89r - 40}{r - 6} + 1 = \frac{90r - 46}{r - 6}. \]
For $d = (90r - 46)/(r - 6)$, the cost function $C(r, d) = (6r + d - 16)(9r - 3)^3$ assumes its minimal value for $r = 8$ rather than for the minimal order $r = 7$. Finding this optimal value is easy: regard $r$ temporarily as real variable and use calculus to determine the minimum of $C\bigl(r, \tfrac{90r - 46}{r - 6}\bigr)$. This gives a minimum point near $r \approx 7.7$, so the minimum for $r \in \mathbb{N}$ is either at $r = 7$ or at $r = 8$. Comparing the actual values of $C$ at these two points indicates that the 8th order telescoper is about 8% cheaper than the 7th order operator, and hence the cheapest operator of all.

By similar calculations, we find that the output size (telescoper and certificate combined) is minimized for $r = 10$, the size of the telescoper alone is minimized for $r = 12$, and the order of the recurrence associated to the telescoper is minimized for $r = 28$. See Figure 7 for an illustration.

Fig. 7. Points $(r, d)$ on the curve for which (a) the order, (b) the computational cost, (c) the size of telescoper and certificate combined, (d) the size of the telescoper only, (e) the order of the recurrence corresponding to the telescoper, and (f) the degree is minimal.
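The optimal orders quoted in Example 18 can be recomputed with exact rational arithmetic. The Python sketch below is an illustration under explicit assumptions: the degree curve $d = (90r - 46)/(r - 6)$, the cutoff $w = 7$ (which gives $s_1 = 6r + d - 16$), and the cubic exponent in the cost function are reconstructed from the counts in Sections 3 and 4 and are not guaranteed to be the authors' exact constants. As in the example, $d$ is treated as a rational number on the curve rather than rounded up to an integer. All function names are illustrative.

```python
from fractions import Fraction


def degree_on_curve(r):
    """Assumed degree curve of Example 18: d = (90 r - 46) / (r - 6)."""
    return Fraction(90 * r - 46, r - 6)


def cost(r):
    """C(r, d) = s_1 * m^3 with s_1 = 6r + d - 16 and m = 9r - 3 (assumed)."""
    d = degree_on_curve(r)
    return (6 * r + d - 16) * (9 * r - 3) ** 3


def total_size(r):
    """S(r, d) = (r+1)(d+1) + (s_1+1)(deg_y c_0 + gamma (r-1) + 2)."""
    d = degree_on_curve(r)
    return (r + 1) * (d + 1) + (6 * r + d - 15) * (8 * r - 4)


def telescoper_size(r):
    """T(r, d) = (r+1)(d+1)."""
    return (r + 1) * (degree_on_curve(r) + 1)


def recurrence_order(r):
    """R(r, d) = r + d."""
    return r + degree_on_curve(r)


def argmin(f, orders=range(7, 200)):
    """Order r in the given range at which f is smallest."""
    return min(orders, key=f)


if __name__ == "__main__":
    print("cost minimal at r =", argmin(cost))
    print("total size minimal at r =", argmin(total_size))
    print("telescoper size minimal at r =", argmin(telescoper_size))
    print("recurrence order minimal at r =", argmin(recurrence_order))
    print("relative saving of r = 8 over r = 7:", float(1 - cost(8) / cost(7)))
```

Using `Fraction` avoids any doubt about floating-point ties between neighboring orders, which matters here because the objective functions are rather flat near their minima.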
For the moment, the term $h$ considered in the above example is a bit too big to actually compute the creative telescoping relations of orders 7 and 8 and compare the difference of the timings to the predicted speedup of 8%. On smaller examples, the minimal (predicted) complexity is achieved for the minimal order operator. It may seem that an improvement by just a few percent is not really worth the effort. But in fact, the improvement gained in the example is just the tip of an iceberg. Asymptotically, as the input size increases, the speedup becomes more and more significant. In the next result, which is a generalization and a refinement of a result of Bostan et al. (2010), we give precise estimates.

Corollary 19.
Let h be a hyperexponential term and τ = max{ α, γ, deg x c , deg y c }. Let κ be an increasing sublinear function with the property that degree n solutions of a linear system with m variables and at most m equations over K ( x ) can be computed with $n m^3 \kappa(\max\{n, m\})$ operations in K. Then a creative telescoping relation of order r = τ − φ can be computed using
\[ 2\,\kappa(2\tau^3)\,\tau^9 + \mathrm{O}^{\sim}(\tau^8) \]
operations in K. If r is chosen such that
\[ r = \tfrac14\bigl(1+\sqrt{17}\bigr)\,\tau + \mathrm{O}(1) \le 1.29\,\tau + \mathrm{O}(1), \]
then a creative telescoping relation of order r can be computed using
\[ \tfrac1{32}\bigl(349+85\sqrt{17}\bigr)\,\kappa(11\tau^2)\,\tau^8 + \mathrm{O}(\tau^7) \le 21.86\,\kappa(11\tau^2)\,\tau^8 + \mathrm{O}^{\sim}(\tau^7) \]
operations in K. In particular, creative telescoping relations for hyperexponential terms can be computed in polynomial time.

Proof. First assume deg x a > deg x b . According to Theorem 14, there exists a creative telescoping relation of order r and degree d whenever r ≥ τ − φ and d ≥ f ( r ) := (2 τ + (2 β + φ ) τ + ( φ − β ) r + O( τ ) r − τ + 2 − φ , where the term O( τ ) is independent of r . A creative telescoping relation of order r and degree d can be computed using at most C ( r, d ) = ( ( r + 1) τ + 3 − φ )³ ( ( β + τ ) r + d − β ( τ + φ ) − φ − ) κ ( ( β + τ )( r + 1) + d ) operations in K . The claim follows from evaluating C ( r, f ( r )) at r = τ − φ and $r = \tfrac14(1+\sqrt{17})\,\tau + \mathrm{O}(1)$, respectively, and replacing the arguments of κ by generous upper bounds.

For the case deg x a ≤ deg x b , the estimates are proved analogously. Although the formulas for f ( r ) and C ( r, d ) are slightly different in this case, the final result turns out to be the same. We leave the details to the reader. ✷

The strange constant $\tfrac14(1+\sqrt{17})$ in Corollary 19 is chosen so as to minimize the multiplicative constant in the complexity bound under the simplifying assumption that κ is constant. It was determined by first equating d/dr C ( r, f ( r )) to zero, which yielded the optimal choice of r as an algebraic function in τ , β , and φ . The term $\tfrac14(1+\sqrt{17})\,\tau$ is the dominant term in the asymptotic expansion of this function for τ → ∞. It is perhaps noteworthy that the choice of the constant is irrelevant for achieving a cost of $\mathrm{O}^{\sim}(\tau^8)$, as long as the constant is greater than 1. Taking r = uτ for arbitrary but fixed u > 1 leads to a cost of $\frac{u^4(u+1)}{u-1}\,\kappa\,\tau^8 + \mathrm{O}^{\sim}(\tau^7)$. The choice $u = \tfrac14(1+\sqrt{17})$ only minimizes the leading coefficient: for u > 1, the coefficient $u^4(u+1)/(u-1)$ is stationary exactly when $2u^2 - u - 2 = 0$. Since $\tfrac14(1+\sqrt{17}) \approx 1.28$, the result indicates that when α and γ are large and approximately equal, it appears to be most efficient to compute a telescoper whose order is about 30% larger than the minimum order.

In the same way as exemplified in Corollary 19, we have also determined the choices for r for which some other quantities become minimal. The results are given in Table 1.

As a final application, we improve some of the results given by Bostan et al. (2007) on differential and recurrence equations related to algebraic functions. Let m ∈ K [ x, y ] be irreducible with deg y m ≥ 1, and let a ∈ K [[ x ]] be such that m ( x, a ( x )) = 0. According to Proposition 2 in their paper, if P + D y Q is a creative telescoping relation for y ( D y m ) /m , then P a = 0. Thus we can use our results about creative telescoping to derive estimates for differential equations for a .

Corollary 20.
Let m ∈ K [ x, y ] and $a = \sum_{n=0}^{\infty} a_n x^n$ ∈ K [[ x ]] be as above and write τ x := deg x m , τ y := deg y m . Assume τ x > τ y > 0. Then

(1) The series a satisfies a linear differential equation of order r = τ y with coefficients of degree d = 2 τ x τ y − τ y + τ x τ y − τ y + τ x + 3 .
(2) The series a also satisfies a linear differential equation of order r = 2 τ y with coefficients of degree d = 4 τ x τ y − τ y − τ x − ⌈ ( τ x + 1) / ( τ y + 1) ⌉ .
(3) The coefficient sequence $(a_n)_{n=0}^{\infty}$ satisfies a linear recurrence equation of order ⌈ τ x τ y + τ y − √ (8 τ y − τ y + 4) τ x − τ y − τ y + 12 ⌉ with polynomial coefficients of degree ⌈ τ y − √ (8 τ y − τ y + 4) τ x − τ y − τ y + 12 ⌉ .

     C ( r, d )   S ( r, d )   T ( r, d )   R ( r, d )   d
(a) τ κτ − φ τ τ τ τ
(b) √ τ √ κτ √ τ √ τ (5 + √ τ (5 + √ τ
(c) √ τ √ κτ √ τ (4 + 2 √ τ (3 + √ τ (3 + √ τ
(d) 2 τ κτ τ τ τ τ
(e) √ τ / κτ τ √ τ / τ τ
(f) 2 τ κτ τ τ τ τ

Table 1. Minimizing various functions on the curve of Theorem 14. The table shows the order r , the complexity C ( r, d ), the output size S ( r, d ) of telescoper and certificate, the output size T ( r, d ) of the telescoper only, the recurrence order R ( r, d ), and the degree d of the telescoper when r is chosen such that (a) r is minimal, (b) C ( r, d ) is minimal, (c) S ( r, d ) is minimal, (d) T ( r, d ) is minimal, (e) R ( r, d ) is minimal, (f) d is minimal. The parameters τ and κ have the same meaning as in Corollary 19. The arguments of κ are suppressed. Only the dominant terms of the asymptotic expansion for τ → ∞ are given. In rows (e) and (f), the values for d differ only in the lower order terms.

Proof.
For h = y ( D y m ) /m we have deg x c ≤ α = τ x , deg y c = γ = τ y , ω ≤ δ ≤ φ = 1. According to Theorem 14.(2), a creative telescoping relation of order r and degree d exists provided that r ≥ τ y and d ≥ ( τ x τ y r + 2 τ x τ y − τ y − τ y + 2 τ x + 6 ) / ( 2( r − τ y + 1) ). Parts 1 and 2 follow from here by setting r = τ y or r = 2 τ y , respectively. For part 3, observe first that there exists a creative telescoping relation of order r and degree d where r ≥ τ y − √ (8 τ y − τ y + 4) τ x − τ y − τ y + 12 , d ≥ τ x τ y + √ (8 τ y − τ y + 4) τ x − τ y − τ y + 12 . From here the claim follows from the fact that when a power series a satisfies a linear differential equation of order r and degree d , its coefficient sequence satisfies a linear recurrence equation of order r + d and degree r . ✷

These results are to be compared with the corresponding results of Bostan et al. (degree 4 τ x τ y + smaller terms for part 1, order 6 τ y and degree 3 τ x τ y for part 2, and order and degree 2 τ x τ y + τ y + 1 for part 3), as well as with the conjectures about the minimal sizes they found experimentally (2 τ − τ + 3 τ for part 1 when τ x = τ y =: τ and order and degree 2 τ x τ y − − ( τ x − τ y ) for part 3 if τ y > .

Conclusion

What is the shape of the gray region? Where does it come from? And how can it be exploited? These were the guiding questions for the work described in this article. As a main result, we have given in Theorem 14 a simple rational function whose graph passes approximately along the boundary of the gray region, in some examples more accurately than in others. This curve was derived from a somewhat technical analysis of the linear systems resulting from a specific ansatz over K . Where the curve does not describe the gray region accurately, these linear systems have solutions despite having more equations than variables.
Some possible reasons for this phenomenon were taken into account in the design of the ansatz, thereby improving the accuracy of the estimate compared to a naive approach. However, as shown in Examples 15.(2) and 15.(4), there seem to be further effects which sometimes cause a gap between the true degrees and our prediction. It would be interesting to know what these effects are, and to derive sharper estimates from them. Ultimately, it would be desirable to have a version of Theorem 14 which is generically tight.

Tight curves allow for optimizing computational cost, output sizes, and other measures by trading order against degree. As the degree decreases when the order grows, it is not always optimal to compute the minimal order operator. In Example 18, we have illustrated how the curve of Theorem 14 can be used to calculate a priori the optimal orders for several interesting measures. Of course, if the curve is not tight, these predictions may not be correct, but even then, at least they provide some useful orientation. Tightness of the curve is also not required for deriving asymptotic bounds on the complexity. As we have shown in Corollary 19, the difference between the optimal choice and other choices is significant for asymptotically large input size. We believe that this result is not only of theoretical interest. Even if the minimal cost may be achieved for the minimal order in any example which is feasible with currently available hardware, it can be seen from Example 18 that it already starts to make a difference for inputs which are only slightly beyond the capability of today's computers. We therefore expect that the technique of trading order for degree will help to optimize the performance of efficient implementations of creative telescoping in the near future.

Acknowledgements.
We wish to thank Christoph Koutschan and Carsten Schneider for valuable remarks on an earlier draft of this article.
References