Trading Order for Degree in Creative Telescoping

Abstract

We analyze the differential equations produced by the method of creative telescoping applied to a hyperexponential term in two variables. We show that equations of low order have high degree, and that higher order equations have lower degree. More precisely, we derive degree bounding formulas which allow us to estimate the degree of the output equations from creative telescoping as a function of the order. As an application, we show how the knowledge of these formulas can be used to improve, at least in principle, the performance of creative telescoping implementations, and we deduce bounds on the asymptotic complexity of creative telescoping for hyperexponential terms.


Shaoshi Chen
Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205, USA

Manuel Kauers
Research Institute for Symbolic Computation, Johannes Kepler University, A4040 Linz, Austria


Key words:

Definite integration, Hyperexponential terms, Zeilberger's algorithm.

1. Introduction

Creative telescoping is a technique for computing differential or difference equations satisfied by a given definite sum or integral. The technique became widely known through the work of Zeilberger (1991), who first observed that creative telescoping in combination with Gosper's algorithm (Gosper, 1978) for indefinite hypergeometric summation leads to a complete algorithm for computing recurrence equations of definite hypergeometric sums. This algorithm is now known as Zeilberger's algorithm (Zeilberger, 1990). In its original version, it accepts as input a bivariate proper hypergeometric term $f(n,k)$ and returns as output a linear recurrence equation with polynomial coefficients satisfied by the sum $F(n) = \sum_{k=a}^{b} f(n,k)$. An analogous algorithm for definite integration was given by Almkvist and Zeilberger (1990). This algorithm accepts as input a bivariate hyperexponential term $f(x,y)$ and returns as output a linear differential equation with polynomial coefficients satisfied by the integral $F(x) = \int_\alpha^\beta f(x,y)\,dy$. A summary of the method of creative telescoping for this case is given in Section 2 below. For further details, variations, and generalizations, consult for instance Petkovšek et al. (1997), Chyzak (2000), Schneider (2005), Chyzak et al. (2009), Kauers and Paule (2011). For implementations, see Paule and Schorn (1995), Chyzak (1998), Koepf (1998), Schneider (2004), Abramov et al. (2004), Koutschan (2009, 2010), etc.

The equations which can be found via creative telescoping have a certain order $r$ and polynomial coefficients of a certain degree $d$. But for a fixed integration problem, $r$ and $d$ are not uniquely determined. Instead, there are infinitely many points $(r,d) \in \mathbb{N}^2$ such that creative telescoping can find an equation of order $r$ and degree $d$. These points form a region which is specific to the integration problem at hand. Figure 1 shows an example of such a region. Every point $(r,d)$ in the gray region corresponds to a differential equation of order $r$ and degree $d$ which creative telescoping can find for integrating the rational function

f(x, y) = ( x y + 9 x y + 9 x + 10 xy + 3 xy + 4 x + 1 ) / ( x y + 9 x y + x y + 3 x + 7 x y + 8 x y + 5 x + 8 xy + 10 xy + 10 xy + x + 5 y + 10 y + 5 y + 5 ).

The picture indicates that low order equations have high degree, and that the degree decreases with increasing order. But what exactly is the shape of the gray region? And where does it come from? And how can it be exploited? These are the questions we address in this article.

Fig. 1. Sizes $(r, d)$ of creative telescoping relations for the integral of a certain rational function.

Email addresses: [email protected] (Shaoshi Chen), [email protected] (Manuel Kauers).
Current address. The work described here was done while S.C. was employed as postdoc at RISC in the FWF projects Y464-N18 and P20162-N18. At NCSU, S.C. is supported by the NSF grant CCF-1017217. M.K. was supported by the FWF grant Y464-N18.
Preprint submitted to Elsevier, 12 November 2018

How can it be exploited?

There are two main reasons why the shape of the gray region is of interest. First, because it can be used to estimate the size of the output equations, and hence to derive bounds on the computational cost of computing them. Secondly, because it can be used to design more efficient algorithms by recognizing that some of the equations are cheaper than others.

An analysis of this kind was first undertaken by Bostan et al. (2007). They studied the problem of computing differential equations satisfied by a given algebraic function and found a similar phenomenon: low order equations have high degree and vice versa. Among other things, they found that an algebraic function with a minimal polynomial of degree $n$ satisfies a differential equation of order at most $n$ with polynomial coefficients of degree $O(n^3)$, but also a differential equation of order $6n$ whose coefficients have degree only $O(n^2)$. Their message is that trading order for degree can pay off.

The same phenomenon applies to creative telescoping, as was shown by Bostan et al. (2010) for the case of integrating rational functions. The results in the present article extend this work in two directions: first, we consider the larger input class of hyperexponential terms, and second, we give not only isolated degree estimates for some specific choices of $r$, but a curve which passes along the boundary of the gray region and thus establishes a degree estimate as a function of the order $r$.

Where does it come from?

The standard argument for proving the existence of creative telescoping relations rests on the fact that linear systems of equations with more variables than equations must have a nontrivial solution. Every creative telescoping relation can be viewed as a solution of a certain linear system of equations which can be constructed from the data given in the input. There is some freedom in how to construct these systems, and it turns out that this freedom can be used for making the number of variables exceed the number of equations, and thus to enforce the existence of a nontrivial solution.

This reasoning not only implies the existence of equations and the termination of the algorithm which searches for them, but it also implies bounds on the output size and on the computational cost of the algorithm. But in order to obtain good bounds, the freedom in setting up the linear systems must be used carefully. For a good bound, we not only want the number of variables to exceed the number of equations, but we also want this to happen already for a reasonably small system. The shape of the gray region originates from the smallest systems which have solutions.

Verbaeten (1974, 1976) introduced a technique which helps in keeping the size of the systems small. The idea is to saturate the linear systems by introducing additional variables in a way that avoids increasing the number of equations. We will make use of this idea in Section 3, where we propose a design for a parameterized family of linear systems whose solutions give rise to creative telescoping relations. Unfortunately, it requires some quite lengthy and technical calculations to translate this particular design into an inequality condition which rephrases the condition "number of variables > number of equations" in precise terms. However, as a reward we obtain a good approximation to the gray region as the solution of this inequality.

What is the exact shape?

We don't know. All we can offer are some rational functions which describe the boundary of the region of all $(r,d)$ where the ansatz described in Section 3 has a solution (Theorem 14). The graphs of these rational functions are curves which pass approximately along the boundary of the gray region.

By construction, for all integer points $(r,d)$ above these graphs we can guarantee the existence of a creative telescoping relation of order $r$ with polynomial coefficients of degree $d$. But we have no proof that our curves are best possible. Experiments have shown that at least in some cases, our curve describes the boundary of the gray region exactly, or within a negligible error. In other cases, there remains a significant portion of the gray region below our curve when $r$ is large.

In cases where the curve from Theorem 14 is tight, we can compute the points $(r,d)$ for which certain interesting measures (such as computing time, output size, ...) are minimized, as shown in Section 5. Even when the curve is not tight, these calculations still give rise to new asymptotic bounds (including the multiplicative constants) of the corresponding complexities. We expect that this data will be valuable for constructing the next generation of symbolic integration software.

2. Creative Telescoping for Hyperexponential Terms

We consider in this article only hyperexponential terms as integrands. Throughout the article, $K$ is a field of characteristic 0, and $K(x,y)$ is the field of bivariate rational functions in $x$ and $y$ over $K$. Let $D_x$ and $D_y$ denote the derivations on $K(x,y)$ such that $D_x c = D_y c = 0$ for all $c \in K$, and $D_x x = 1$, $D_x y = 0$, $D_y x = 0$, $D_y y = 1$. One can see that $D_x$ and $D_y$ commute with each other on $K(x,y)$. We say that a field $E$ containing $K(x,y)$ is a differential field extension of $K(x,y)$ if the derivations $D_x$ and $D_y$ are extended to derivations on $E$ and those extended derivations, still denoted by $D_x$ and $D_y$, commute with each other on $E$.

Definition 1.

An element $h$ of a differential field extension $E$ of $K(x,y)$ is called hyperexponential (over $K(x,y)$) if
\[ \frac{D_x h}{h} \in K(x,y) \quad\text{and}\quad \frac{D_y h}{h} \in K(x,y). \]

When $h \in E$ is a hyperexponential term and $r_1, r_2 \in K(x,y)$ are such that $(D_x h)/h = r_1$ and $(D_y h)/h = r_2$, then $D_x D_y h = D_y D_x h$ implies $D_y r_1 = D_x r_2$. Conversely, Christopher (1999) has shown for algebraically closed ground fields $K$ that for any two rational functions $r_1, r_2 \in K(x,y)$ with $D_y r_1 = D_x r_2$ there exist $a/b \in K(x,y)$, $c_0, \dots, c_L \in K[x,y]$ and $e_1, \dots, e_L \in K$ with
\[ r_1 = \frac{D_x c_0}{c_0} + D_x\Bigl(\frac{a}{b}\Bigr) + \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \quad\text{and}\quad r_2 = \frac{D_y c_0}{c_0} + D_y\Bigl(\frac{a}{b}\Bigr) + \sum_{\ell=1}^{L} e_\ell \frac{D_y c_\ell}{c_\ell}. \]
Together with Theorem 2 of Bronstein et al. (2005), it follows that there exists a differential field extension $E$ of $K(x,y)$ and an element $h \in E$ with $(D_x h)/h = r_1$ and $(D_y h)/h = r_2$ which we can write in the form
\[ h = c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell}, \]
where $a \in K[x,y]$, $b, c_0, \dots, c_L \in K[x,y] \setminus \{0\}$, $e_1, \dots, e_L \in K$, and the expressions $\exp(a/b)$ and $c_\ell^{e_\ell}$ refer to elements of $E$ on which $D_x$ and $D_y$ act as suggested by the notation. We assume from now on that hyperexponential terms are always given in this form, and we use the letters $a, b, c_0, \dots, c_L, e_1, \dots, e_L$ consistently throughout with the meaning they have here.

Example 2. $h = \exp(x^2 y)\sqrt{x - 2y}$ is a hyperexponential term. We have
\[ \frac{D_x h}{h} = \frac{1 + 4x^2 y - 8xy^2}{2(x - 2y)} = 2xy + \frac{1}{2(x - 2y)} \in K(x,y), \qquad \frac{D_y h}{h} = \frac{x^3 - 2x^2 y - 1}{x - 2y} = x^2 - \frac{1}{x - 2y} \in K(x,y). \]
For this term, we can take $c_0 = 1$, $a = x^2 y$, $b = 1$, $c_1 = x - 2y$, $e_1 = \tfrac12$.

We may adopt the additional condition (without loss of generality) that the $c_\ell$ ($\ell > 0$) […] $e_\ell \notin \mathbb{N}$ for all $\ell > 0$. The estimates derived below do not depend on these additional conditions, but will typically not be sharp when they are not fulfilled. For simplicity, we will exclude throughout some trivial special cases by assuming that all $e_\ell$ are nonzero and that $\max\{\deg_x a, \deg_x b\} + \sum_{\ell=1}^{L} \deg_x c_\ell$ and $\max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell$ are nonzero. These latter two conditions encode the requirement that $h$ is neither independent of $x$ nor independent of $y$, nor simply a polynomial.

Applied to the hyperexponential term $h$, the method of creative telescoping consists of finding, by whatever means, polynomials $p_0, \dots, p_r \in K[x]$, not all zero, and a hyperexponential term $Q$ such that
\[ p_0 h + p_1 D_x h + \cdots + p_r D_x^r h = D_y Q. \]
An equation of this form is called a creative telescoping relation for $h$, the differential operator $P := p_0 + p_1 D_x + \cdots + p_r D_x^r$ appearing on the left is called the telescoper, and $Q$ is called the certificate of the relation. The telescoper is required to be nonzero and free of $y$, but the certificate may be zero or it may involve both $x$ and $y$. When $p_r \neq 0$, the number $r$ is called the order of $P$, and $d := \max_{i=0}^{r} \deg_x p_i$ is called its degree.

To motivate the form of a creative telescoping relation, assume that $h = h(x,y)$ can be interpreted as an actual function in $x$ and $y$ and consider the integral $f(x) = \int_\alpha^\beta h(x,y)\,dy$. Then integrating both sides of a creative telescoping relation implies that $f$ satisfies the inhomogeneous differential equation
\[ p_0(x) f(x) + p_1(x) D_x f(x) + \cdots + p_r(x) D_x^r f(x) = \bigl[Q(x,y)\bigr]_{y=\alpha}^{\beta}. \]
In the frequent situation that the inhomogeneous part happens to evaluate to zero, this means that the telescoper of $h$ annihilates the integral $f$.

Example 3.

A creative telescoping relation for $h = \exp(x^2 y)\sqrt{x - 2y}$ is
\[ (3x^3 - 6)\,h - 2x\,D_x h = D_y\bigl((3x - 4y)\,h\bigr). \]
It consists of the telescoper $P = (3x^3 - 6) - 2x D_x$ and the certificate $Q = (3x - 4y)h$. For the definite integral $f(x) := \int_{-\infty}^{x/2} \exp(x^2 y)\sqrt{x - 2y}\,dy$, we obtain the differential equation
\[ (3x^3 - 6) f(x) - 2x\,D_x f(x) = 0. \]

[…] $Q$, they fix an order $r$ and a degree $s$ for the numerator of $Q$, make an ansatz with undetermined coefficients, and obtain a linear system by comparing coefficients. Appropriate choices of $r$ and $s$ ensure that this linear system has a nontrivial solution, and also lead to a sharp bound on the order $r$ of the telescoper.

Let us illustrate this reasoning for the case where the integrand is a rational function $h = u/v \in K(x,y)$ with $\deg_y u < \deg_y v$ and $v$ irreducible. Fix some $r$. Then we have to find $p_0, \dots, p_r \in K(x)$ and a rational function $Q \in K(x,y)$ with
\[ p_0 h + p_1 D_x h + \cdots + p_r D_x^r h = D_y Q. \]
A reasonable choice for $Q$ is $Q = \bigl(\sum_{i=0}^{s} q_i y^i\bigr)/v^r$, where $s = \deg_y u + (r - 1)\deg_y v$ and $q_0, \dots, q_s$ are unknowns, because with this choice, both sides of the equation are equal to a rational function with the same denominator $v^{r+1}$ and numerators of degree at most $\deg_y u + r \deg_y v$ in $y$ in which the unknowns $p_i$ and $q_j$ appear linearly. Comparing coefficients with respect to $y$ on both sides leads to a homogeneous linear system of at most $1 + \deg_y u + r \deg_y v$ equations with $(r + 1) + (s + 1)$ unknowns and coefficients in $K(x)$. This system will have a nontrivial solution if $r$ is chosen such that
\[ (r + 1) + (s + 1) > \deg_y u + r \deg_y v + 1 \iff r \ge \deg_y v. \]
All these solutions must lead to a nonzero telescoper $P$ because any nontrivial solution with $P = 0$ would have a nonzero certificate $Q$ with $D_y Q = 0$, and this is impossible because $s$ was chosen such that the numerator of $Q$ has a strictly lower degree than its denominator.

We have thus shown the existence of telescopers of any order $r \ge \deg_y v$. This is a good bound, but it does not provide any estimate on their degrees $d$. We will next derive inequalities involving both $r$ and $d$ by constructing linear systems with coefficients in $K$ rather than in $K(x)$.
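Two statements of this section lend themselves to a quick mechanical check. The sketch below is only an illustration and not part of the method itself: the function names are hypothetical, and the coefficients of the relation in Example 3 are taken to be $(3x^3-6)h - 2xD_xh = D_y((3x-4y)h)$ with $h = \exp(x^2y)\sqrt{x-2y}$. The first function evaluates $(Ph - D_yQ)/h$ with exact rational arithmetic using the logarithmic derivatives from Example 2; the second repeats the counting argument for a rational integrand $u/v$ and searches for the smallest order at which the unknowns outnumber the equations.

```python
from fractions import Fraction


def example3_residual(x, y):
    """Value of (P h - D_y Q)/h at the point (x, y), assuming
    P = (3x^3 - 6) - 2x D_x and Q = (3x - 4y) h with h = exp(x^2 y) sqrt(x - 2y)."""
    x, y = Fraction(x), Fraction(y)
    dx_log_h = 2 * x * y + Fraction(1, 2) / (x - 2 * y)  # (D_x h)/h from Example 2
    dy_log_h = x**2 - 1 / (x - 2 * y)                    # (D_y h)/h from Example 2
    lhs = (3 * x**3 - 6) - 2 * x * dx_log_h              # (P h)/h
    rhs = -4 + (3 * x - 4 * y) * dy_log_h                # (D_y Q)/h by the product rule
    return lhs - rhs


def min_telescoper_order(deg_y_u, deg_y_v):
    """Smallest r >= 1 for which the ansatz Q = (sum_i q_i y^i)/v^r has more
    unknowns than equations, for h = u/v with deg_y u < deg_y v."""
    r = 1
    while True:
        s = deg_y_u + (r - 1) * deg_y_v
        unknowns = (r + 1) + (s + 1)           # p_0..p_r and q_0..q_s
        equations = 1 + deg_y_u + r * deg_y_v  # coefficients of the numerator in y
        if unknowns > equations:
            return r
        r += 1


if __name__ == "__main__":
    points = [(3, 1), (5, -2), (Fraction(1, 3), Fraction(7, 2))]
    print([example3_residual(x, y) for x, y in points])
    print(min_telescoper_order(2, 5))
```

The residual is a rational function of $x$ and $y$, so vanishing at a handful of points away from the line $x = 2y$ is only evidence, not a proof; a symbolic check would be needed for the latter. The order search simply reproduces the equivalence with $r \ge \deg_y v$ stated above.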

3. Shaping the Ansatz

Let $h$ be a hyperexponential term and consider an ansatz of the form
\[ P = \sum_{i=0}^{r} \sum_{j=0}^{d_i} p_{i,j}\, x^j D_x^i, \qquad Q = \Bigl(\sum_{i=0}^{s_1} \sum_{j=0}^{s_2} q_{i,j}\, x^i y^j\Bigr) \frac{h}{v} \]
for a telescoper $P$ and a certificate $Q$. The plan is to find a good choice for the parameters $r, s_1, s_2, v, d_0, \dots, d_r$. The only restriction we have is that the linear system obtained from equating all the coefficients in the numerator of the rational function $(Ph - D_y Q)/h$ to zero should have a solution in which not all the $p_{i,j}$ are zero. The remaining freedom can be used to shape the ansatz so as to keep $d := \max_{i=0}^{r} d_i$ small.

As a sufficient condition for the existence of a solution, we will require that the number of terms $x^i y^j$ in the numerator of the rational function $(Ph - D_y Q)/h$ (i.e., the number of equations) should be less than $\sum_{i=0}^{r} (d_i + 1) + (s_1 + 1)(s_2 + 1)$ (i.e., the number of variables $p_{i,j}$ and $q_{i,j}$). As shown in the following example, this condition is really just sufficient, but not necessary.

Example 4. Let $h = u/v$ be the rational function from the introduction. With $r = 3$, $d = d_0 = d_1 = d_2 = d_3 = 54$, and $Q = \bigl(\sum_{i=0} \sum_{j=0} q_{i,j} x^i y^j\bigr)/v$, comparing the coefficients of the numerator of $(Ph - D_y Q)/h$ to zero gives a linear system with 787 variables and 792 equations. This system has a nonzero solution although $792 > 787$.

[…] $P$ and $Q$ in such a way that the linear system originating from it will have a nullspace whose dimension is exactly the difference between the number of equations and the number of variables (or 0 if there are more equations than variables).

The goal of this section is to describe our choice for the ansatz of telescoper and certificate. The form of the ansatz for the telescoper is given in Section 3.1, the certificate is discussed in Section 3.2. In the beginning, we collect some facts about the rational functions $(D_x^i h)/h$ which are used later for calculating how many equations a particular ansatz induces.
The following notational conventions will be used throughout.

Notation 5.
• $\operatorname{lc}_z p$ and $\deg_z p$ refer to the leading coefficient and the degree of the polynomial $p$ with respect to the variable $z$, respectively. For the zero polynomial, we define $\deg_z 0 := -\infty$ and $\operatorname{lc}_z 0 := 0$.
• $p^*$ refers to the square free part of the polynomial $p$ with respect to all its variables, e.g., $\bigl((x+1)(y+3)\bigr)^* = (x+1)(y+3)$. Note that $p^*$ is only unique up to multiplication by elements from $K \setminus \{0\}$, but that for any choice of $p^*$, the degrees $\deg_x p^*$ and $\deg_y p^*$ are uniquely determined and we have that $p^* (D_x p)/p$ is a polynomial in $x$ and $y$. These are the only properties we will use.
• $z^{\underline{n}} := z(z-1)(z-2)\cdots(z-n+1)$ and $z^{\overline{n}} := z(z+1)(z+2)\cdots(z+n-1)$ denote the falling and rising factorials, respectively. For $n \le 0$, we define $z^{\underline{n}} := z^{\overline{n}} := 1$.
• If $z$ is a real number, then $z^+ := \max\{0, z\}$.
• If $z$ is a real number, then $\lfloor z \rfloor := \max\{x \in \mathbb{Z} : x \le z\}$, $\lceil z \rceil := \min\{x \in \mathbb{Z} : x \ge z\}$, and $\lfloor z \rceil := \lfloor z + \tfrac12 \rfloor$ denotes the nearest integer to $z$.
• If $\Phi$ is a formula then $[[\Phi]]$ denotes the Iverson bracket, which evaluates to 1 if $\Phi$ is true and to 0 if $\Phi$ is false, e.g., $z^+ = [[z \ge 0]]\,z$; $\delta_{i,j} = [[i = j]]$, etc.

Lemma 6.

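Let $h$ be a hyperexponential term and $i \ge 0$.

(1) If $\deg_x a > \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_i}{c_0 \bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^i} \]
for some polynomial $N_i \in K[x,y]$ with
\[ \deg_x N_i = \deg_x c_0 + i\Bigl(\deg_x a + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1\Bigr), \]
\[ \deg_y N_i \le \deg_y c_0 + i\Bigl(\max\{\deg_y a, \deg_y b\} + \deg_y b^* + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr), \]
\[ \operatorname{lc}_x N_i = (\operatorname{lc}_x c_0)\Bigl(\operatorname{lc}_x a b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^i (\deg_x a - \deg_x b)^i. \]

(2) If $\deg_x a \le \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_i}{c_0 \bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^i} \]
for some polynomial $N_i \in K[x,y]$ with
\[ \deg_x N_i = \deg_x c_0 + i\Bigl(\deg_x b + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1\Bigr) - [[\omega \in \mathbb{N} \wedge i > \omega]]\,\delta, \]
\[ \deg_y N_i \le \deg_y c_0 + i\Bigl(\max\{\deg_y a, \deg_y b\} + \deg_y b^* + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr), \]
\[ \operatorname{lc}_x N_i = \begin{cases} (\operatorname{lc}_x c_0)\bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^i\, \omega^{\underline{i}} & \text{if } \omega \notin \mathbb{N} \text{ or } i \le \omega; \\ (\operatorname{lc}_x N_{\omega+1})\bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{i-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}} & \text{if } \omega \in \mathbb{N} \text{ and } i > \omega, \end{cases} \]
where $\omega := \deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell$ and, if $\omega \in \mathbb{N}$,
\[ \delta := \deg_x c_0 + (\omega + 1)\Bigl(\deg_x b + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1\Bigr) - \deg_x N_{\omega+1} \ge 1. \]

Proof.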

All claims are proved by induction on $i$. For $i = 0$, there is nothing to show in any of the cases. The calculations for the induction step $i \to i + 1$ are as follows.

(1) Let $v := b b^* \prod_{\ell=1}^{L} c_\ell$ and write $m_i$ for the claimed value of $\deg_x N_i$. Then
\[ \frac{D_x^{i+1} h}{h} = D_x\Bigl(\frac{N_i}{c_0 v^i}\, c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell}\Bigr) \Big/ \Bigl(c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell}\Bigr) \]
\[ = \frac{(D_x N_i) v - i N_i D_x v}{c_0 v^{i+1}} + \frac{N_i}{c_0 v^i} \cdot \frac{(D_x a) b^* - a b^* (D_x b)/b}{b b^*} + \frac{N_i}{c_0 v^i} \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} \]
\[ = \frac{(D_x N_i) v - i N_i D_x v + N_i \bigl(\prod_{\ell=1}^{L} c_\ell\bigr)\bigl((D_x a) b^* - a b^* \tfrac{D_x b}{b}\bigr) + N_i v \sum_{\ell=1}^{L} e_\ell \tfrac{D_x c_\ell}{c_\ell}}{c_0 v^{i+1}}. \]
Since $\deg_x a > \deg_x b$ by assumption, we have
\[ \deg_x\Bigl((D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell}\Bigr) \le \deg_x N_i + \deg_x v - 1 = m_i + \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 < m_i + \deg_x a + \deg_x b^* + \sum_{\ell=1}^{L} \deg_x c_\ell - 1 = m_{i+1}. \]
Furthermore, because of
\[ (D_x a) b^* - a b^* \frac{D_x b}{b} = (\operatorname{lc}_x a)(\operatorname{lc}_x b^*)(\deg_x a - \deg_x b)\, x^{\deg_x a + \deg_x b^* - 1} + \cdots \]
we have
\[ N_i \Bigl((D_x a) b^* - a b^* \frac{D_x b}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell = (\operatorname{lc}_x N_i)\Bigl(\operatorname{lc}_x a b^* \prod_{\ell=1}^{L} c_\ell\Bigr)(\deg_x a - \deg_x b)\, x^{m_{i+1}} + \cdots. \]
This completes the proof that $(D_x^{i+1} h)/h$ has the denominator as claimed and that its numerator has degree and leading coefficient with respect to $x$ as claimed. The remaining degree bound with respect to $y$ follows from
\[ \deg_y\Bigl((D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell}\Bigr) \le \underbrace{\deg_y c_0 + i\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr)}_{\text{bounds } \deg_y N_i} + \underbrace{\deg_y b^* + \deg_y b + \sum_{\ell=1}^{L} \deg_y c_\ell}_{\text{bounds } \deg_y v} \]
\[ \le \deg_y c_0 + (i + 1)\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr) \]
and
\[ \deg_y\Bigl(N_i \Bigl((D_x a) b^* - a b^* \frac{D_x b}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell\Bigr) \le \underbrace{\deg_y c_0 + i\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr)}_{\text{bounds } \deg_y N_i} + \underbrace{\deg_y b^* + \deg_y a + \sum_{\ell=1}^{L} \deg_y c_\ell}_{\text{bounds } \deg_y \text{ of the other factors}} \]
\[ \le \deg_y c_0 + (i + 1)\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr). \]

(2) Again, let $v := b b^* \prod_{\ell=1}^{L} c_\ell$ and write $m_i$ for the claimed value of $\deg_x N_i$. Then, like in part 1,
\[ \frac{D_x^{i+1} h}{h} = \frac{(D_x N_i) v - i N_i D_x v + N_i \bigl(\prod_{\ell=1}^{L} c_\ell\bigr)\bigl((D_x a) b^* - a b^* \tfrac{D_x b}{b}\bigr) + N_i v \sum_{\ell=1}^{L} e_\ell \tfrac{D_x c_\ell}{c_\ell}}{c_0 v^{i+1}}. \]
First consider the case $\omega \notin \mathbb{N}$ or $i \le \omega$. Since $\deg_x a \le \deg_x b$ by assumption, and because of
\[ (D_x a) b^* - a b^* \frac{D_x b}{b} = (\operatorname{lc}_x a)(\operatorname{lc}_x b^*)(\deg_x a - \deg_x b)\, x^{\deg_x a + \deg_x b^* - 1} + \cdots, \]
we now have
\[ \deg_x\Bigl(N_i \Bigl((D_x a) b^* - a b^* \frac{D_x b}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell\Bigr) < m_i + \deg_x b + \deg_x b^* - 1 + \sum_{\ell=1}^{L} \deg_x c_\ell = m_{i+1}. \]
This strict inequality holds also when $\deg_x a = \deg_x b$ because the coefficient of $x^{\deg_x a + \deg_x b^* - 1}$ in $(D_x a) b^* - a b^* (D_x b)/b$ contains the factor $\deg_x a - \deg_x b$, which vanishes in this case.

Next, using the induction hypothesis, we have
\[ (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v)\Bigl(\deg_x N_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell\Bigr) x^{\deg_x N_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x c_0)\Bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^i \omega^{\underline{i}}\, (\operatorname{lc}_x v)\Bigl(m_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell\Bigr) x^{m_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x c_0)\Bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^{i+1} \omega^{\underline{i}} \Bigl(\deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell - i\Bigr) x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x c_0)\Bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^{i+1} \omega^{\underline{i+1}}\, x^{m_{i+1}} + \cdots. \]
Since $\omega^{\underline{i+1}} \neq 0$ when $\omega \notin \mathbb{N}$ or $i + 1 \le \omega$, this completes the proof that $(D_x^{i+1} h)/h$ has the denominator as claimed and that its numerator has degree and leading coefficient with respect to $x$ as claimed. The degree bounds with respect to $y$ are shown exactly as in part 1.

Now consider the case where $\omega \in \mathbb{N}$ and $i > \omega$. In this case, we start the induction at $i = \omega + 1$. The induction base follows from the calculations carried out above for $i \ge \omega$, the fact $\omega^{\underline{\omega+1}} = 0$, and the definition of $\delta$. (Note that $\omega^{\underline{\omega+1}} = 0$ also implies $\delta \ge 1$.) For the induction step $i \to i + 1$, we have, similarly as before,
\[ \deg_x\Bigl(N_i \Bigl((D_x a) b^* - a b^* \frac{D_x b}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell\Bigr) < m_{i+1} \]
and
\[ (D_x N_i) v - i N_i D_x v + N_i v \sum_{\ell=1}^{L} e_\ell \frac{D_x c_\ell}{c_\ell} = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v)\Bigl(m_i - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell\Bigr) x^{m_i + \deg_x v - 1} + \cdots \]
\[ = (\operatorname{lc}_x N_i)(\operatorname{lc}_x v)\Bigl(\deg_x c_0 + i(\deg_x v - 1) - \delta - i \deg_x v + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell\Bigr) x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x N_{\omega+1})\Bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^{i-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}}\, (\operatorname{lc}_x v)(\omega - \delta - i)\, x^{m_{i+1}} + \cdots \]
\[ = (\operatorname{lc}_x N_{\omega+1})\Bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^{i+1-(\omega+1)} (-\delta-1)^{\underline{i+1-(\omega+1)}}\, x^{m_{i+1}} + \cdots. \]
Because of $\delta > 0$, the factor $(-\delta-1)^{\underline{i-(\omega+1)}}$ is nonzero for all $i > \omega$. ✷

Example 7. The case when $h = u/v$ is a rational function is covered by part 2 of the lemma. For example, for h = (2 x − x + 5) / (3 x − x + 8) we can take c_0 = 2 x − x + 5, $a = 0$, $b = 1$, $L = 1$, c_1 = 3 x − x + 8, $e_1 = -1$. Direct calculation of the derivatives gives

(Table: for each $i$, the degree and the leading coefficient with respect to $x$ of $c_0 c_1^i (D_x^i h)/h$.)

[…] $i = \omega + 1 = 3$, but knowing these, it correctly predicts all the other data in the table. In this example, we have $\delta = 3 = \omega + 1$. This is not a coincidence, as we shall show next.

Lemma 8.

Let $h$ be a hyperexponential term with $\deg_x a \le \deg_x b$, and let $\omega$ and $\delta$ be as in Lemma 6.(2), $\omega \in \mathbb{N}$. Then $\delta \ge \omega + 1$.

Proof.

Rewrite
\[ h = c_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L} c_\ell^{e_\ell} = \bar{c}_0 \exp\Bigl(\frac{a}{b}\Bigr) \prod_{\ell=1}^{L+2} \bar{c}_\ell^{\,\bar{e}_\ell} \]
with $\bar{c}_0 = x^\omega$, $\bar{c}_\ell = c_\ell$ ($\ell = 1, \dots, L$), $\bar{e}_\ell = e_\ell$ ($\ell = 1, \dots, L$), $\bar{c}_{L+1} = c_0$, $\bar{e}_{L+1} = 1$, $\bar{c}_{L+2} = x$, $\bar{e}_{L+2} = -\omega$. The rational functions $(D_x^i h)/h$ are of course independent of the representation of $h$, but the representations of these rational functions which are given in Lemma 6 are not. The representation obtained for the new representation of $h$ is obtained from the original representation by multiplying numerator and denominator by $x^{\omega+i} c_0^{i-1}$. Observe that this modification does not influence the values for $\omega$ or $\delta$. It is therefore sufficient to prove the claim for terms of the form $h = x^\omega \bar{h}$, where $\bar{h}$ is some hyperexponential term for which the value of $\omega$ is zero. We do so by induction on $\omega$. For $\omega = 0$, we have $\delta \ge \omega + 1$ already by Lemma 6.(2). Now assume that $\omega \ge 0$ is such that for $x^\omega \bar{h}$ the degree drop $\bar{\delta}$ is $\omega + 1$ or more. Then for $h = x^{\omega+1} \bar{h} = x (x^\omega \bar{h})$ we have $D_x h = x^\omega \bar{h} + x D_x(x^\omega \bar{h})$, $D_x^2 h = 2 D_x(x^\omega \bar{h}) + x D_x^2(x^\omega \bar{h})$, and so on, all the way down to
\[ D_x^{\omega+2} h = (\omega + 2) D_x^{\omega+1}(x^\omega \bar{h}) + x D_x^{\omega+2}(x^\omega \bar{h}) = (\omega + 2) \frac{N_{\omega+1}}{x^\omega v^{\omega+1}} x^\omega \bar{h} + x \frac{N_{\omega+2}}{x^\omega v^{\omega+2}} x^\omega \bar{h} = \frac{(\omega + 2) N_{\omega+1} v + x N_{\omega+2}}{v^{\omega+2}} \bar{h}, \qquad (1) \]
where $N_{\omega+1}$ and $N_{\omega+2}$ are as in Lemma 6 and $v$ refers to the denominator stated there. If $\delta$ denotes the degree drop for $h$, then this calculation implies $\delta \ge \bar{\delta}$. By induction hypothesis, we have $\bar{\delta} \ge \omega + 1$. If in fact $\bar{\delta} \ge \omega + 2$, then we are done. Otherwise, if $\bar{\delta} = \omega + 1$, then
\[ \operatorname{lc}_x N_{\omega+2} = (-\bar{\delta} - 1) \operatorname{lc}_x N_{\omega+1} \operatorname{lc}_x v = -(\omega + 2) \operatorname{lc}_x N_{\omega+1} \operatorname{lc}_x v \]
by Lemma 6, so the leading terms of the two polynomials in the numerator of (1) cancel, and therefore $\delta > \omega + 1$ also in this case. ✷

Experiments suggest that the bound in Lemma 8 is tight in the sense that we have $\delta = \omega + 1$ for almost all hyperexponential terms $h$. But there do exist situations with $\delta > \omega + 1$. For example, it can be shown that for $h = c_0 \exp(a/b)$ with $\deg_x b - \deg_x a > \deg_x c_0 = \omega$ we have $\delta \ge \deg_x b - \deg_x a$.

Also Lemma 6 is not necessarily sharp for degenerate choices of $h$. In particular, we do not claim that the numerators and denominators stated in Lemma 6 are coprime. It may be possible to carry out a finer analysis by considering the square free decomposition of $c_0$, or by taking into account possible common factors between $b$ and the $c_\ell$, or by handling the $c_\ell$ which do not involve $x$ separately. For our purpose, we believe that the statements given above form a reasonable compromise between sharpness of the statements and readability of the derivation.

Several aspects of the formulas in Lemma 6 are important. One of them is that the denominators corresponding to lower derivatives divide those corresponding to higher derivatives. This has the consequence that when the linear combination $Ph$ is brought to a common denominator, the degree of the numerator will not grow drastically. In a sense, this fact is the main reason why creative telescoping works at all. Our next step is to bring the formulas from Lemma 6 to a common denominator.
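The degree drop described in Lemma 6.(2) and Lemma 8 can be observed on a small example. The following sketch is an illustration only: it uses a hypothetical rational function $h = c_0/c_1$ with $c_0 = x^3 + 1$ and $c_1 = x + 2$ (not the term of Example 7), so that $a = 0$, $b = 1$, $L = 1$, $e_1 = -1$ and $\omega = \deg_x c_0 - \deg_x c_1 = 2$. For such a term the induction step in the proof of Lemma 6 specializes to $N_{i+1} = (D_x N_i)\,c_1 - (i+1)\,N_i\,D_x c_1$, which the code iterates on integer coefficient lists (lowest degree first). The helper names are assumptions made for this sketch.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def deriv(p):
    """Derivative of the polynomial with coefficient list [p0, p1, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sub(p, q):
    """Difference p - q of two coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])


def degree(p):
    """Degree of a nonzero polynomial given as a coefficient list."""
    return len(trim(p)) - 1


def numerators(c0, c1, count):
    """N_0, ..., N_{count-1} with (D_x^i h)/h = N_i / (c0 * c1^i) for h = c0/c1."""
    ns = [trim(c0)]
    for i in range(count - 1):
        n = ns[-1]
        # N_{i+1} = (D N_i) c1 - (i + 1) N_i (D c1)
        ns.append(sub(mul(deriv(n), c1), [(i + 1) * c for c in mul(n, deriv(c1))]))
    return ns


if __name__ == "__main__":
    ns = numerators([1, 0, 0, 1], [2, 1], 6)  # c0 = x^3 + 1, c1 = x + 2
    for i, n in enumerate(ns):
        print(i, degree(n), n[-1])  # i, deg_x N_i, lc_x N_i
```

For this choice, $\deg_x c_1 - 1 = 0$, so Lemma 6.(2) predicts $\deg_x N_i = 3$ for $i \le \omega = 2$ and $\deg_x N_i = 3 - \delta$ afterwards. The computed degrees drop from 3 to 0 at $i = 3$, which corresponds to $\delta = 3 = \omega + 1$, the generic case of Lemma 8.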

Lemma 9.

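Let $h$ be a hyperexponential term and $r, i \in \mathbb{Z}$ with $r \ge i \ge 0$.

(1) If $\deg_x a > \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_{r,i}}{c_0 \bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^r} \]
for some $N_{r,i} \in K[x,y]$ with
\[ \deg_x N_{r,i} = \deg_x c_0 + r\Bigl(\deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell\Bigr) + i\bigl(\deg_x a - \deg_x b - 1\bigr), \]
\[ \deg_y N_{r,i} \le \deg_y c_0 + r\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr), \]
\[ \operatorname{lc}_x N_{r,i} = (\operatorname{lc}_x c_0)(\operatorname{lc}_x a)^i (\operatorname{lc}_x b)^{r-i} (\operatorname{lc}_x b^*)^r \Bigl(\operatorname{lc}_x \prod_{\ell=1}^{L} c_\ell\Bigr)^r \bigl(\deg_x a - \deg_x b\bigr)^i. \]

(2) If $\deg_x a \le \deg_x b$, then
\[ \frac{D_x^i h}{h} = \frac{N_{r,i}}{c_0 \bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^r} \]
for some $N_{r,i} \in K[x,y]$ with
\[ \deg_x N_{r,i} = \deg_x c_0 + r\Bigl(\deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell\Bigr) - i - [[\omega \in \mathbb{N} \wedge i > \omega]]\,\delta, \]
\[ \deg_y N_{r,i} \le \deg_y c_0 + r\Bigl(\deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell\Bigr), \]
\[ \operatorname{lc}_x N_{r,i} = \begin{cases} (\operatorname{lc}_x c_0)(\operatorname{lc}_x b b^*)^r \bigl(\operatorname{lc}_x \prod_{\ell=1}^{L} c_\ell\bigr)^r \omega^{\underline{i}} & \text{if } \omega \notin \mathbb{N} \text{ or } i \le \omega; \\ (\operatorname{lc}_x N_{\omega+1})\bigl(\operatorname{lc}_x b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{r-(\omega+1)} (-\delta-1)^{\underline{i-(\omega+1)}} & \text{if } \omega \in \mathbb{N} \text{ and } i > \omega, \end{cases} \]
where $\omega$, $\delta$, and $\operatorname{lc}_x N_{\omega+1}$ are as in Lemma 6.(2).

Proof.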

Both parts follow directly from the respective parts of Lemma 6 by multiplying numerator and denominator of the representations stated there by $\bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{r-i}$. ✷

Definition 10.

For a hyperexponential term $h$, let
\[ \alpha = \deg_x b^* + \deg_x b + \sum_{\ell=1}^{L} \deg_x c_\ell, \qquad \beta = \deg_x a - \deg_x b - 1, \]
\[ \gamma = \deg_y b^* + \max\{\deg_y a, \deg_y b\} + \sum_{\ell=1}^{L} \deg_y c_\ell, \qquad \omega = \deg_x c_0 + \sum_{\ell=1}^{L} e_\ell \deg_x c_\ell. \]
If $\deg_x a \le \deg_x b$ and $\omega \in \mathbb{N}$, we further let $\delta$ be any integer with
\[ \omega + 1 \le \delta \le \deg_x c_0 + (\omega + 1)(\alpha - 1) - \deg_x\Bigl(c_0 \Bigl(b b^* \prod_{\ell=1}^{L} c_\ell\Bigr)^{\omega+1} \frac{D_x^{\omega+1} h}{h}\Bigr). \]
Otherwise, if $\deg_x a > \deg_x b$ or $\omega \notin \mathbb{N}$, let $\delta = 0$. Finally, we define the following flags:
\[ \varphi_1 = \Bigl[\Bigl[\frac{\operatorname{lc}_x a}{\operatorname{lc}_x b} \in K\Bigr]\Bigr], \qquad \varphi_2 = \Bigl[\Bigl[\frac{\operatorname{lc}_x a}{\operatorname{lc}_x b} \in K \wedge \beta = 0\Bigr]\Bigr], \]
\[ \varphi_3 = \Bigl[\Bigl[\frac{a}{b} \in K(x) \wedge \forall \ell : (\deg_y c_\ell = 0 \vee e_\ell \in \mathbb{Z}) \wedge \deg_y c_0 \ge \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell\Bigr]\Bigr]. \]

Note that none of these parameters depends on $r$ or $i$. The flags $\varphi_k$ ($k = 1, 2, 3$) are in $\{0, 1\}$, $\omega$ belongs to $K$, $\beta$ belongs to $\mathbb{N} \cup \{-1\}$, and all other parameters are positive integers. The best value for $\delta$ is the right bound of the specified range, but since this value cannot be directly read off from the input, we do not insist that $\delta$ be equal to this value, but we allow $\delta$ to be any number between the bound from Lemma 8 and the true degree drop. The flags $\varphi_1$ and $\varphi_2$ will be used below in the ansatz for the telescoper, $\varphi_3$ will play a role afterwards in the ansatz for the certificate.

In terms of the parameters defined in Definition 10, the degree bounds of Lemma 9 simplify to
\[ \deg_x N_{r,i} \le \deg_x c_0 + \alpha r + \max\{\beta, -1\}\, i - [[\omega \in \mathbb{N} \wedge i > \omega]]\,\delta, \qquad \deg_y N_{r,i} \le \deg_y c_0 + \gamma r. \]
Lemma 9 suggests reasonable choices for the degrees $d_i$ in the ansatz for $P$. In particular, our choice is based on the following features of the formulas in Lemma 9.
• The degree of the numerator in $(D_x^i h)/h$ varies with $i$. A good choice for the degrees $d_i$ will compensate for this variation, taking higher values for $d_i$ when the numerator of $(D_x^i h)/h$ has low degree, and vice versa. This is the key idea of the Verbaeten completion (Verbaeten, 1974, 1976; Wegschaider, 1997).
• The leading coefficients of $N_{r,i}$ ($i > 0$) are polynomials in $y$, but in case 2, most of them are $K$-multiples of each other. When $a$ and $b$ are such that $(\operatorname{lc}_x a)/(\operatorname{lc}_x b) \in K$, then this is also true in case 1. We will use this fact for eliminating several equations at the cost of a single variable.

Before describing the ansatz for $P$ in full generality, we motivate the construction by an example.

Fig. 2. The ansatz for $P$ discussed in Example 11 ($d = 7$, $r = 5$, $w = 3$).

Example 11.

Suppose that $h$ is hyperexponential with $\mathrm{lc}_x a = \mathrm{lc}_x b$, $\beta = 1$ (case 1 of Lemma 9), and $\deg_x c = 0$. Let $r = 5$ and $d = 7$. We want to choose $d_i$ such that $\max_{i=0}^{5} d_i = 7$ and the ansatz
$$P = \sum_{i=0}^{5} \sum_{j=0}^{d_i} p_{i,j}\, x^j D_x^i$$
leads to "many" variables but only "few" equations. The choice with most variables is clearly to set $d_i = d = 7$ for all $i$. But this ansatz leads to quite many equations. Each term $x^j D_x^i$ contributes to the common numerator a polynomial $x^j N_{5,i}$ whose degree in $x$ is $5\alpha + i + j$ and whose degree in $y$ is at most $5\gamma$. Because of the term $x^7 D_x^5$, we must expect up to $(5\alpha + 13)(5\gamma + 1)$ terms in the numerator. This is the expected number of equations in the linear system resulting from coefficient comparison.

If we remove the term $x^7 D_x^5$ from the ansatz, i.e., if we choose $d_0 = \cdots = d_4 = 7$, $d_5 = 6$, then the number of equations drops to $(5\alpha + 12)(5\gamma + 1)$ because all terms $x^j D_x^i$ other than $x^7 D_x^5$ contribute only polynomials $x^j N_{5,i}$ of lower degree. We save $5\gamma + 1$ equations at the cost of removing a single variable. Removing also the terms $x^6 D_x^5$ and $x^7 D_x^4$ lowers the number of equations further to $(5\alpha + 11)(5\gamma + 1)$, and in general, for any $0 \le w \le 5$, choosing $d_i = 7 - (w + i - 5)^+$ ($i = 0, \dots, 5$) leads to $(5\alpha + 13 - w)(5\gamma + 1)$ equations. The number of variables is $(5 + 1)(7 + 1) - \sum_{k=1}^{w} k = 48 - \tfrac12 w(w + 1)$.

If $w > 1$, we can introduce $w - 1$ additional variables without creating new equations. Suppose, for instance, that $w = 3$, i.e., the terms $x^j D_x^i$ with $i + j \ge$

10 have been removed from the ansatz. We reintroduce the terms x D x , x D x , x D x by adding p , (cid:0) (deg x a − deg x b ) x D x − x D x (cid:1) + p , (cid:0) (deg x a − deg x b ) x D x − x D x (cid:1) to the ansatz, getting back the two variables p , and p , but no new equations, because,according to Lemma 9.(1), the assumption lc x a = lc x b implies(deg x a − deg x b ) lc x N , = lc x N , and (deg x a − deg x b ) lc x N , = lc x N , . The ﬁnal ansatz is depicted in Figure 2. A bullet at ( i, j ) represents a variable p i,j inthe ansatz. White bullets correspond to the reintroduced variables p , and p , whichdo not aﬀect the number of equations. 14he general form of our ansatz for the telescoper is given in the following lemma. Theﬁrst case is like in the example above when β >

0. For $\beta = 0$, no degree compensation is possible because all $N_{r,i}$ have the same degree. But if $(\mathrm{lc}_x a)/(\mathrm{lc}_x b) \in K$, it is still possible to save some equations by exploiting the linear dependence among the leading terms. In the second case, there is always a degree compensation possible, but unlike in the example above, terms are removed for indices $i$ close to zero rather than close to $r$. When $\omega \in \mathbb{N}$, we provide an alternative ansatz which takes the degree drop $\delta$ into account. Common to all cases are the two basic principles of choosing $d_i$ so as to compensate for the different degrees of the $N_{r,i}$ in Lemma 9, and of installing some additional variables by exploiting the knowledge about the leading terms of the $N_{r,i}$. For the size of the cutoff, we use a new integer parameter $w$, whose optimal value will be determined later.

Lemma 12.

Let h be a hyperexponential term, r ≥ d ≥ x a > deg x b . Let 0 ≤ w ≤ min { r, d/β } ( w := 0 if β = 0), d i := d − β ( w + i − r ) + − φ ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + [[ β = 0]] φ r − X i = r − w +1 p i,d i +1 (cid:16)(cid:0) lc x a lc x b ( β + 1) (cid:1) i x d i +1 D ix − x d r +1 D rx (cid:17) + φ r − X i =0 p i,d i +1 (cid:16)(cid:0) lc x a lc x b (cid:1) i x d i +1 D ix − x d r +1 D rx (cid:17) . Let N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + ( α + β ) r − βw − φ and deg y N ≤ deg y c + γr. (2) Suppose that deg x a ≤ deg x b . Let 0 ≤ w ≤ min { d + 1 , r + 1 } . Let d i := d − ( w − i ) + ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + w − X i =1 p i,d i +1 (cid:16) x d i +1 D ix − ω i x d +1 (cid:17) . Let N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + αr − w and deg y N ≤ deg y c + γr. (2 ′ ) Suppose that deg x a ≤ deg x b and ω ∈ N . Let ω ≤ w ≤ min { d − δ + 1 , r + 1 } . Let d i := d − ( w − i ) + − [[ i ≤ ω ]] δ ( i = 0 , . . . , r ), and P = r X i =0 d i X j =0 p i,j x j D ix + ω X i =1 p i,d i +1 (cid:16) x d i +1 D ix − ω i x d +1 (cid:17) + w − X i = ω +2 p i,d i +1 (cid:16) x d i +1 D ix − ( − δ − i − ( ω +1) x d ω +1 +1 D ω +1 x (cid:17) . (See Figure 3 for an illustration of the shape of P in this case.)15et N = c (cid:0) b ∗ b Q Lℓ =1 c ℓ (cid:1) r ( P h ) /h . Thendeg x N ≤ deg x c + d + αr − w − δ and deg y N ≤ deg y c + γr. Proof. (1) We apply Lemma 9.(1) to each term in the ansatz for P . The claim aboutdeg y N follows directly from the bound on deg y N r,i there. For the bound on deg x N ,ﬁrst observe thatdeg x x j N r,i ≤ d i + deg x c + αr + βi = deg x c + d + αr + βi − β ( w + i − r ) + − φ = deg x c + d + αr + β ( i − max { w + i − r, } ) − φ ≤ deg x c + d + αr + β ( r − w ) − φ for all i, j with 0 ≤ i ≤ r and 0 ≤ j ≤ d i . This settles the terms coming from thedouble sum. 
For the terms in the ﬁrst single sum, which only appears when β = 0,we have deg x x d i +1 N r,i = deg x x d r +1 N r,r = deg x c + d + αr + β ( r − w ) + 1 − φ and (cid:0) lc x a lc x b ( β + 1) (cid:1) i lc x N r,i = lc x N r,r for i = r − w + 1 , . . . , r −

1. This impliesdeg x (cid:16)(cid:0) lc x a lc x b ( β + 1) (cid:1) i x d i +1 N r,i − x d r +1 N r,r (cid:17) ≤ deg x c + d + αr + β ( r − w ) − φ , as desired. The argument for the second single sum, which only appears when β = 0,is analogous.(2) Now we use Lemma 9.(2). Again, the claim about deg y N follows immediately. Forthe bound on deg x N , ﬁrst observe thatdeg x x j N r,i ≤ d i + deg x c + αr − i = deg x c + d + αr − i − ( w − i ) + ≤ deg x c + d + αr − w. This settles the terms in the double sum. For the terms in the single sum, we havedeg x x d i +1 N r,i = deg x x d +1 N r, = deg x c + d + αr − w + 1and lc x N r,i = lc x ω i N r, for i = 1 , . . . , w −

1, and therefore
$$\deg_x\bigl(x^{d_i+1} N_{r,i} - \omega^{\underline{i}}\, x^{d_0+1} N_{r,0}\bigr) \le \deg_x c + d + \alpha r - w.$$

(2′) In this case, the terms in the double sum contribute polynomials of degree
$$\deg_x x^j N_{r,i} \le d_i + \deg_x c + \alpha r - i - [\![i > \omega]\!]\,\delta = \deg_x c + d + \alpha r - i - (w - i)^+ - [\![i \le \omega]\!]\,\delta - [\![i > \omega]\!]\,\delta \le \deg_x c + d + \alpha r - w - \delta.$$
For the terms in the first single sum, we have
$$\deg_x x^{d_i+1} N_{r,i} = \deg_x x^{d_0+1} N_{r,0} = \deg_x c + d + \alpha r - w + 1 - \delta$$
and $\mathrm{lc}_x N_{r,i} = \omega^{\underline{i}}\, \mathrm{lc}_x N_{r,0}$ for $i = 1, \dots, \omega$, and therefore
$$\deg_x\bigl(x^{d_i+1} N_{r,i} - \omega^{\underline{i}}\, x^{d_0+1} N_{r,0}\bigr) \le \deg_x c + d + \alpha r - w - \delta.$$
Similarly, for the terms in the second single sum, we have
$$\deg_x x^{d_i+1} N_{r,i} = \deg_x x^{d_{\omega+1}+1} N_{r,\omega+1} \le \deg_x c + d + \alpha r - w + 1 - \delta.$$
If the inequality is strict, we are done. Otherwise, $\delta$ is maximal and we have $\mathrm{lc}_x N_{r,i} = (-\delta - 1)^{\underline{i-(\omega+1)}}\, \mathrm{lc}_x N_{r,\omega+1}$ for $i = \omega + 2, \dots, w - 1$, and therefore
$$\deg_x\bigl(x^{d_i+1} N_{r,i} - (-\delta - 1)^{\underline{i-(\omega+1)}}\, x^{d_{\omega+1}+1} N_{r,\omega+1}\bigr) \le \deg_x c + d + \alpha r - w - \delta,$$
and we are also done. □

[Fig. 3. The ansatz for $P$ in case 2′ of Lemma 12, drawn for $d = 10$, $r = 11$, $\omega = 2$, $\delta = 3$, $w = 5$.]

Lemma 12 makes a statement on the number of equations to be expected when the ansatz for $P$ is made in the form indicated. This number of equations is equal to the number of terms $x^i y^j$ in $N$, and this number is bounded by $(\deg_x N + 1)(\deg_y N + 1)$, for which upper bounds are stated in the lemma. We also need to count the number of variables $p_{i,j}$. This number is easily obtained from the sum expressions given for $P$ in the various cases by replacing all the summand expressions by 1. After some straightforward and elementary simplifications which we do not want to reproduce here, the statistics are as follows.

• In case 1, the number of variables is $(r + 1)(d + 1) - \tfrac12 \beta w(w + 1) + \varphi_1 (w - 1)^+ - \varphi_2$.

• In case 2, the number of variables is $(r + 1)(d + 1) - \tfrac12 w(w + 1) + (w - 1)^+$.

• In case 2′, the number of variables is $(r + 1)(d + 1) - \tfrac12 w(w + 1) - \delta(\omega + 1) + \omega + (w - \omega - 2)^+$.

We will next discuss the ansatz for the certificate $Q$, which will bring many additional variables, but, by a careful construction, no additional equations.

The design of the ansatz for the certificate is much simpler. Here, the goal is to set up $Q$ in such a way that $(D_y Q)/h$ has the same denominator and the same numerator degrees in $x$ and $y$ as $(P h)/h$ does (in order to not create more equations than necessary), and that $(D_y Q)/h$ cannot become zero (in order to enforce that $P \ne 0$ in every solution we find).

A direct calculation like in the proof of Lemma 6 confirms that the first requirement is satisfied by choosing
$$Q = \frac{\sum_{i=0}^{s_1} \sum_{j=0}^{s_2} q_{i,j}\, x^i y^j}{c\,\bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{r-1}}\, h$$
with
$$s_1 = \begin{cases} \deg_x c + d + (\alpha + \beta)(r - 1) - \beta w - \varphi_2 - 1 & \text{in case 1 of Lemma 12;}\\ \deg_x c + d + \alpha(r - 1) - w & \text{in case 2 of Lemma 12;}\\ \deg_x c + d + \alpha(r - 1) - w - \delta & \text{in case 2′ of Lemma 12}\end{cases}$$
and $s_2 = \deg_y c + \gamma(r - 1) + 1$ in all cases.

This ansatz provides $(s_1 + 1)(s_2 + 1)$ variables. To ensure that $D_y Q \ne 0$ for every choice of $q_{i,j}$, observe that $D_y Q = 0$ can only happen if $h$ is a rational function with respect to $y$, meaning $a, b \in K[x]$ and $c_\ell \in K[x]$ for all $\ell$ with $e_\ell \notin \mathbb{Z}$. In this case, we have $D_y Q = 0$ if and only if the $q_{i,j}$ are instantiated in such a way that the resulting $Q$ is free of $y$, and this can only happen if the choice of $q_{i,j}$ is made in such a way that the numerator degree in $y$ is equal to the denominator degree in $y$. The denominator degree is
$$\sum_{\ell=1}^{L} (r - 1 - e_\ell) \deg_y c_\ell = \gamma(r - 1) - \eta, \qquad\text{where } \eta = \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell,$$
which is less than $s_2 = \deg_y c + \gamma(r - 1) + 1$ if and only if $\deg_y c + \eta + 1 > 0$. If we remove all the terms $q_{i,j} x^i y^j$ with $j = \gamma(r - 1) - \eta$ from the ansatz, no instantiation of the remaining $q_{i,j}$ can turn $Q$ into a term independent of $y$, so we can be sure that $D_y Q \ne 0$ in this modified setup. The number of variables in this modified ansatz is $(s_1 + 1) s_2$. The flag $\varphi_3$ defined in Definition 10 is set up in such a way that we can in all cases assume an ansatz for $Q$ with $(s_1 + 1)(s_2 + 1 - \varphi_3)$ variables. The following lemma summarizes the two versions of the ansatz for $Q$.

Lemma 13.

Let $h$ be a hyperexponential term.

(1) If $\max\{\deg_y a, \deg_y b\} > 0$ or $\deg_y c_\ell > 0$ for some $\ell$ with $e_\ell \notin \mathbb{Z}$, then for every $s_1, s_2 \in \mathbb{N}$ and every choice of $q_{i,j} \in K$ where not all $q_{i,j}$ are equal to zero we have
$$D_y\Biggl(\frac{\sum_{i=0}^{s_1} \sum_{j=0}^{s_2} q_{i,j}\, x^i y^j}{c\,\bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{r-1}}\, h\Biggr) \ne 0.$$

(2) If $\deg_y a = \deg_y b = 0$ and $\deg_y c_\ell = 0$ for all $\ell$ with $e_\ell \notin \mathbb{Z}$, then for every $s_1, s_2 \in \mathbb{N}$ and every choice of $q_{i,j} \in K$ where not all $q_{i,j}$ are equal to zero we have
$$D_y\Biggl(\frac{\sum_{i=0}^{s_1} \Bigl(\sum_{j=0}^{(r-1)\gamma - \eta - 1} q_{i,j}\, x^i y^j + \sum_{j=(r-1)\gamma - \eta + 1}^{s_2} q_{i,j}\, x^i y^j\Bigr)}{c\,\bigl(b b^* \prod_{\ell=1}^{L} c_\ell\bigr)^{r-1}}\, h\Biggr) \ne 0,$$
where $\eta = \sum_{\ell=1}^{L} e_\ell \deg_y c_\ell$.
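The counts collected in this section can be turned into a small calculator. The following Python sketch covers only case 2 of Lemma 12 ($\deg_x a \le \deg_x b$) and assumes the counts in the form $(r+1)(d+1) - \tfrac12 w(w+1) + (w-1)^+$ unknowns for $P$, $(s_1+1)(s_2+1-\varphi_3)$ unknowns for $Q$, and at most $(\deg_x N+1)(\deg_y N+1)$ equations. The function and parameter names are hypothetical, and the sample parameter values are the ones stated for the rational term of Example 15.(4).

```python
"""Count unknowns and equations for the case-2 ansatz (deg_x a <= deg_x b)."""


def surplus(r, d, w, *, alpha, gamma, deg_x_c, deg_y_c, phi3):
    """Return (#variables - #equations) for order r, degree d and cutoff w."""
    # Telescoper P: (r+1)(d+1) - w(w+1)/2 + (w-1)^+ unknowns p_{i,j}.
    vars_p = (r + 1) * (d + 1) - w * (w + 1) // 2 + max(w - 1, 0)
    # Certificate Q: (s1+1)(s2+1-phi3) unknowns q_{i,j}.
    s1 = deg_x_c + d + alpha * (r - 1) - w
    s2 = deg_y_c + gamma * (r - 1) + 1
    vars_q = (s1 + 1) * (s2 + 1 - phi3)
    # Coefficient comparison: at most (deg_x N + 1)(deg_y N + 1) equations.
    eqns = (deg_x_c + d + alpha * r - w + 1) * (deg_y_c + gamma * r + 1)
    return vars_p + vars_q - eqns


def min_degree(r, *, d_max=10_000, **params):
    """Smallest d <= d_max such that some admissible w gives a positive surplus."""
    for d in range(d_max + 1):
        # Admissible cutoffs in case 2: 0 <= w <= min{d+1, r+1}.
        for w in range(min(d + 1, r + 1) + 1):
            if surplus(r, d, w, **params) > 0:
                return d
    return None  # no solution guaranteed by counting within the search window


# Parameter values as stated for Example 15.(4).
example = dict(alpha=4, gamma=4, deg_x_c=2, deg_y_c=2, phi3=0)
for r in (3, 4, 5, 10):
    print(r, min_degree(r, **example))
```

For these parameters the search returns the smallest integers strictly above $(27r+3)/(r-2)$, which is the bound quoted in Example 15.(4); for $r \le 2$ the counting argument guarantees nothing and the function returns `None`.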

4. Solving the Inequalities

As the result of the previous section, we obtain counts for the number of variables and the number of equations for a particular family of ansatzes which are parameterized by the desired order $r$ and degree $d$ of the telescoper, various Greek parameters introduced in Definition 10, which measure the input, and one additional parameter $w$ by which the shape of the ansatz can be modulated. A sufficient condition for the existence of a solution of order (at most) $r$ and degree (at most) $d$ is
$$\#\mathrm{vars}(r, d, w) - \#\mathrm{eqns}(r, d, w) > 0.$$
For any particular choice of $w$ from the ranges specified for the various cases in Lemma 12, we obtain a valid sufficient condition connecting $r$ and $d$ via the Greek parameters. Any of these conditions defines a region in $\mathbb{N}^2$ which is inside the gray region from the introduction. To make this region as large as possible (and hence, as close as possible to the gray region), we will choose $w$ in such a way that the left hand side, considered as a function in $w$, is maximal.

It comes in handy that $\#\mathrm{vars}(r, d, w) - \#\mathrm{eqns}(r, d, w)$ is a (piecewise) quadratic polynomial with respect to $w$, so the optimal choice of $w$ is easily found by equating its derivative with respect to $w$ to zero and rounding the solution to the nearest integer. If this point is outside the range to which $w$ is constrained, then the maximum is attained at one of the two boundary points of the range.

The following theorem, which is the main result of this article, contains the bounds which we obtained by applying this reasoning to the explicit expressions derived for $\#\mathrm{vars}(r, d, w)$ and $\#\mathrm{eqns}(r, d, w)$ in the previous section for the various cases to be considered.

Theorem 14.

Let h = c exp (cid:16) ab (cid:17) L Y ℓ =1 c e ℓ ℓ

19e a hyperexponential term and let α, β, γ, δ, ω, φ , φ , φ be as in Deﬁnition 10 and set ψ = γ + φ −

2. Then a creative telescoping relation for h of order r and degree d existswhenever r ≥ ψ + 1 and d > ϑ r + ϕr − ψ , where ϑ and ϕ are deﬁned as follows.(1) If deg x a > deg x b , let ϑ = ( α + β )(2 γ − φ ) + γ − ,ϕ = deg x c + ( α + β + 1) deg y c + ( γ − φ )(deg x c − α − β − φ ) − (1 − φ )( γ − φ ) + (cid:0) φ + β ( γ − φ ) (cid:1) . (2) If deg x a ≤ deg x b , let ϑ = α (2 γ − φ ) − ,ϕ = deg x c + α deg y c + ( γ − φ )(deg x c + 1 − α ) − ( γ − φ ) + ( γ + 1 + φ ) . If furthermore ω ∈ N and γ − φ > ω and δ = ω + 1, then ϕ can be replaced by ϕ ′ = ϕ − δ ( γ − φ − ω ) + 1 . Proof. (1) Suppose deg x a > deg x b . According to the calculations done in the previoussection, in this case there exists an ansatz with( r + 1)( d + 1) − βw ( w + 1) + φ ( w − + − φ variables coming from the telescoper P , (cid:0) deg x c + d + ( α + β )( r − − βw − φ (cid:1)(cid:0) deg y c + γ ( r −

1) + 2 − φ (cid:1) variables coming from the certiﬁcate Q , and (cid:0) deg x c + d + ( α + β ) r − βw − φ + 1 (cid:1)(cid:0) deg y c + γr + 1 (cid:1) equations. Therefore, a creative telescoping relation exists provided that( r + 1)( d + 1) − βw ( w + 1) + φ ( w − + − φ + (deg x c + d + ( α + β )( r − − βw − φ )(deg y c + γ ( r −

1) + 2 − φ ) − (deg x c + d + ( α + β ) r − βw − φ + 1)(deg y c + γr + 1) > . For r ≥ γ − φ , this inequality is equivalent to d > (cid:16)(cid:0) ( α + β )(2 γ − φ (cid:1) + γ − (cid:1) r + deg x c + ( α + β + 1) deg y c + ( γ − φ )(deg x c − α − β − φ ) (2)+ βw ( w − γ + 3 − φ ) − φ ( w − + (cid:17).(cid:16) r − γ + 2 − φ (cid:17) . The choice w = 0 proves the claim when φ = 1 or γ ≤ − φ . Now supposethat φ = 0 and γ > − φ . The claimed estimate is obtained for the choice w = γ − φ >

0. We have to show that this choice is admissible, i.e., that1 ≤ γ − φ ≤ min { r, d/β } . Because of γ > − φ , the lower bound is clear, and r ≥ γ − φ holds by assumption. To see that γ − φ ≤ d/β , observe that20he right hand side of (2) converges to ( α + β )(2 γ − φ ) + γ − r → ∞ .Since its numerator is nonnegative (as checked by a straightforward calculation),it follows that this inequality implies d > ( α + β )(2 γ − φ ) + γ − ≥ β ( γ − φ ) , as desired.(2) Now assume deg x a ≤ deg x b . From the counts of variables and equations in theansatz described in Lemma 12.(2), we ﬁnd that a creative telescoping equationexists provided that( r + 1)( d + 1) − w ( w + 1) + ( w − + + (deg x c + d + α ( r − − w + 1)(deg y c + γ ( r −

1) + 2 − φ ) − (deg x c + d + αr − w + 1)(deg y c + γr + 1) > . For r ≥ γ − φ , this inequality is equivalent to d > (cid:16) ( α (2 γ − φ ) − r + deg x c + α deg y c + ( γ − φ )(deg x c + 1 − α )+ ( − γ − φ ) w + w − ( w − + (cid:17).(cid:16) r − γ + 2 − φ (cid:17) . Regardless of the choice of w , the right hand side is at least α (2 γ − φ ) − w = 0and on the other hand, if γ > − φ , from the choice w = γ − φ , which also inthis case is in the required range because 1 ≤ γ − φ ≤ α (2 γ − φ ) − < d and γ − φ ≤ r .The second estimate is obtained from the alternative ansatz from Lemma 12.(2 ′ ).The inequality in this case is( r + 1)( d + 1) − w ( w + 1) − δ ( ω + 1) + ω + ( w − ω − + + (deg x c + d + α ( r − − w − δ + 1)(deg y c + γ ( r −

1) + 2 − φ ) − (deg x c + d + αr − w − δ + 1)(deg y c + γr + 1) > , which for r ≥ γ − φ and w = γ − φ is equivalent to d > ( α (2 γ − φ ) − r + ϕ ′ r − γ + 2 − φ . It remains to show that the choice w = γ − φ is compatible with the rangerestrictions for w applicable in the present case. While the requirements ω ≤ γ − φ ≤ r + 1 are satisﬁed by assumption, the requirement γ − φ ≤ d − δ + 1is less obvious. A suﬃcient condition is( α (2 γ − φ ) − r + ϕ ′ r − γ + 2 − φ ≥ γ − φ + δ. It can be shown easily with Collins’s cylindrical algebraic decomposition algo-rithm (Collins, 1975; Caviness and Johnson, 1998) (e.g., with its implementationin Mathematica (Strzebo´nski, 2000, 2006)) that this latter inequality follows fromdeg x c ≥

0, deg y c ≥ α ≥ r ≥ γ − φ ≥ ω +1 ≥ δ = ω +1, φ ( φ −

1) = 0,and ϕ ′ = deg x c + α deg y c + δω + 1 + ( γ − φ )(deg x c − α − ( γ − φ ) − δ ) . ✷ As we do not claim that our bounds are sharp, no justiﬁcation for the various choices of w are required in the proof. But of course, the choices were made following the reasoningoutlined before the theorem. For example, in case 1 the main inequality is( r + 1)( d + 1 + φ ) − βw ( w + 1) + φ ( w − + − φ + (deg x c + d + ( α + β )( r − − βw − φ + 1)(deg y c + γ ( r −

1) + 2 − φ ) − (deg x c + d + ( α + β ) r − βw − φ + 1)(deg y c + γr + 1) > . Diﬀerentiating the left hand side with respect to w gives − βw − β + βγ + φ + βφ , which vanishes for w = γ − + φ + φ /β . The unique nearest integer point is ⌊ γ − + φ + φ /β ⌉ = γ − φ when φ /β = 1. When φ /β = 1, there are two nearestinteger points γ − φ and γ + φ , and since the maximum is exactly between them andquadratic parabolas are symmetric about their extremal points, the values at γ − φ and γ + φ agree. In conclusion, the choice w = γ − φ is optimal in both cases.The calculations for the other cases are similar. But note that having chosen w op-timally does not imply that the bounds given in the Theorem 14 are tight, because thewhole argument relies on counting variables and equations for the particular ansatz fam-ily introduced in Section 3, and we cannot claim that this shape is best possible. Recallthat we aim at an ansatz for which the number of solutions of the resulting linear systemis equal to (or at least not much larger than) the diﬀerence between number of variablesand number of equations. One way of measuring the quality of our ansatz, and hence thetightness of our bounds, is to compare the region of all points ( r, d ) where an ansatz fororder r and degree d actually has a solution (the “gray region” from the introduction)with the region of all points ( r, d ) for which Theorem 14 guarantees the existence of asolution. The following collection of examples shows that there are cases where Theo-rem 14 is extremely accurate as well as cases where there is a clear gap between thepredicted shape and the actual shape of the gray region. As a reference ansatz for exper-imentally determining in the examples whether a speciﬁc point ( r, d ) belongs to the grayregion, we checked whether the naive ansatz where d = d = · · · = d r (i.e., w = 0) as asolution, because every solution of some reﬁned ansatz with w > w = 0. 
It is not guaranteed, however, that this ansatz covers all creative telescoping relations. Additional relations at points $(r, d)$ outside of what we indicate as the gray region may exist. For example, when our ansatz leads to a solution $(P, Q)$ in which all the polynomial coefficients of $P$ share a nontrivial common factor $f \in K[x]$, then $(P/f, Q/f)$ is another relation with a telescoper of lower degree. This phenomenon can often be observed for the minimal order telescoper, but as we do not know of any efficient way of detecting it also for the nonminimal ones, we unfortunately cannot take it into account in the figures.
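For case 2, the optimization of $w$ can be written out in a few lines. The sketch below assumes the case-2 counts of Section 3, that is, $(r+1)(d+1) - \tfrac12 w(w+1) + (w-1)^+$ variables from $P$, $(\deg_x c + d + \alpha(r-1) - w + 1)(\deg_y c + \gamma(r-1) + 2 - \varphi_3)$ variables from $Q$, and $(\deg_x c + d + \alpha r - w + 1)(\deg_y c + \gamma r + 1)$ equations. Only the terms depending on $w$ are kept, $w \ge 1$ is assumed, and $F$ is a name introduced here for the resulting function.

```latex
\begin{align*}
F(w) &= -\tfrac12 w(w+1) + (w-1)
        - w\bigl(\deg_y c + \gamma(r-1) + 2 - \varphi_3\bigr)
        + w\bigl(\deg_y c + \gamma r + 1\bigr) + \text{const} \\
     &= -\tfrac12 w^2 + \bigl(\gamma - \tfrac12 + \varphi_3\bigr)\, w + \text{const}, \\
F'(w) &= -w + \gamma - \tfrac12 + \varphi_3 = 0
        \quad\Longleftrightarrow\quad w = \gamma - \tfrac12 + \varphi_3 .
\end{align*}
```

Under these assumptions the stationary point is a half-integer, so its two neighbouring integers $\gamma - 1 + \varphi_3$ and $\gamma + \varphi_3$ give the same value of $F$, by the symmetry of the parabola.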

Example 15. (1) Consider the term h = u exp( v ) where u = 7 x y + 8 x y + 9 x y + 3 x + 10 x y + 2 x y + 3 x y + 9 x + 7 xy + 4 xy + 5 xy + 3 x + 9 y + 6 y + 6 y + 1 , = 6 x y + 4 x y + x y + 9 x + 8 x y + 8 x y + 2 x y + 8 x + 3 xy + 7 xy + 4 xy + 8 x + 5 y + 2 y + 7 y + 6 . We are in case 1 of Theorem 14 and have α = 0, β = 2, γ = 3, φ = φ = φ = 0,deg x c = deg y c = 3. According to the theorem, we expect creative telescopingrelations for all ( r, d ) with r ≥ d > (12 r + 11) / ( r − r + 11) / ( r −

1) together with the gray region. In this example, thegray region consists exactly of the integer points above the curve: the bound is astight as can be.(2) Now consider the term h = exp( u ) /v where u = 4 x y + 7 x y + 9 x + 5 xy + 2 xy + 3 x + 5 y + y + 6 ,v = 6 x y + 10 x y + 6 x + 9 xy + 5 xy + 8 x + 8 y + 10 y + 8 . We are again in case 1 of the theorem and we have α = 2, β = 1, γ = 4, φ = φ = φ = 0, deg x c = deg y c = 2. The estimate from Theorem 14 is now d > (24 r − / ( r − h be the rational function from the introduction. Then we are in case 2of the theorem and we have α = 3, β = − γ = 3, ω = − δ = 0, φ = 1, φ = 0, φ = 1, deg x c = deg y c = 2. The bound from the theorem is now d > (17 r + 3) / ( r − h = u/v with u = 4 x y + 7 x y + 9 x + 5 xy + 2 xy + 3 x + 5 y + y + 6 ,v = (cid:0) x y + 10 x y + 6 x + 9 xy + 5 xy + 8 x + 8 y + 10 y + 8 (cid:1) × (cid:0) x y + 7 x y + 4 x + 5 xy + 3 xy + 7 x + 9 y + 7 y + 7 (cid:1) . This term is also covered by case 2 of the theorem, and we have α = 4, β = − γ = 4, ω = − δ = − φ = 1, φ = 0, φ = 0, deg x c = deg y c = 2. Theestimate d > (27 r + 3) / ( r −

2) from the theorem is correct but not tight, as shownin Figure 4.(d).(5) Finally, let h = √ u with u = 4 x y + 8 x y + 2 x y + 7 x y + 7 x y + 2 x y + 7 x + 10 xy + 7 xy + 9 xy + 4 xy + 5 xy + 5 xy + 7 x + 4 y + 3 y + 2 y + 8 y + 3 y + 7 y + 2 . Now the alternative bound of case 2 with ϕ ′ in place of ϕ is applicable because wehave ω = 1 ∈ N . The bound using ϕ is d > (21 r − / ( r − r = 14. In contrast, the bound d > (21 r − / ( r − ϕ ′ is tight for all r > r = 5. Thesituation is shown in Figure 5. On the right, we show a comparison of the sharpbound based on ϕ ′ (solid), the bound based on ϕ (dashed) and the bound whichwould be obtained by choosing w = 0 instead of w = γ − φ in the proof ofTheorem 14 (dotted).There are several ways of reﬁning the ansatz for P and Q even further in order toachieve better estimates where ours are not sharp. Here are some ideas.23a) r d (b) r d (c) r d (d) r d Fig. 4. Sizes ( r, d ) of creative telescoping relations together with the curve predicted by Theo-rem 14, for the hyperexponential terms discussed in Example 15. • The possibility of introducing extra variables without increasing the number of equa-tions (depicted by the white bullets in Figures 1 and 2) rests on the observation madein Lemma 9 that the leading coeﬃcients lc x N r,i are K -multiples of each other, i.e.,that these leading coeﬃcients generate a linear subspace of K [ y ] of dimension one. Ex-periments suggest that this observation can be generalized to the coeﬃcients of lowerdegree as follows: If V j ⊆ K [ y ] denotes the vector space generated by the coeﬃcientsof x deg x N r,i − j in N r,i ( i = 0 , . . . , r ), then V ⊆ V ⊆ · · · ⊆ V j and dim V j ≤ j + 1 atleast for small j . If this is true, it would allow adding more extra variables withoutincreasing the number of equations. • In general, comparing coeﬃcients of the monomials x i y j of a polynomial S to zeroresults in a linear system with (deg x S + 1)(deg y S + 1) equations. 
But if $S$ contains some factor which is free of the variables $p_{i,j}$ and $q_{i,j}$, then canceling this factor before comparing coefficients results in a system with fewer equations and the same number of variables. While in our case, it is too much to hope for a factor which would divide $S$ as a whole, it seems that at least in some cases, factors can be removed from $\mathrm{lc}_x S \in K[y]$ or $\mathrm{lc}_y S \in K[x]$. For example, when $\deg_x a > \deg_x b$ and $\deg_y a > \deg_y b$, it can be shown that $\prod_{\ell=1}^{L} \mathrm{lc}_x c_\ell \mid \mathrm{lc}_x S$ and $\prod_{\ell=1}^{L} \mathrm{lc}_y c_\ell \mid \mathrm{lc}_y S$, so $\sum_{\ell=1}^{L} \bigl(\deg_y \mathrm{lc}_x c_\ell + \deg_x \mathrm{lc}_y c_\ell\bigr)$ equations can be discarded in this case.

[Fig. 5. Left: Sizes $(r, d)$ of creative telescoping relations together with the curve predicted by Theorem 14, for the term discussed in Example 15.(5). Right: a detail of the figure on the left in a larger scale, together with the curve based on $\phi$ instead of $\phi'$ (dashed) and the curve based on $w = 0$ (dotted). The correct degrees are precisely the smallest integers strictly above the solid curve. The two variations both overshoot for all the points in this range.]

We have not worked out the influence of these variations in full generality, but only on some examples. It turned out that they indeed lead to tighter estimates, but the difference is rather small, and decays to zero for large $r$. At the same time, they would lead to much more complicated formulas. We do not know the reason for the gap in Examples 15.(2) and 15.(4) between the curve from Theorem 14 and the boundary of the gray region for $r \to \infty$. Even though it appears more important for a bound to be tight for small orders than for large ones, we would be very interested in seeing a refined bound which closes this gap.

It is also interesting to compare the gray regions for hyperexponential terms composed from dense random polynomials with the gray regions for hyperexponential terms of the same shape that originate from some specific application.
According to our experiments, the shape of the gray region for a randomly chosen term $h = c \exp(a/b) \prod_{\ell=1}^{L} c_\ell^{e_\ell}$ only depends on the number $L$ of factors in the product, the degrees of the polynomials $a, b, c, c_1, \dots, c_L$, and the exponents $e_1, \dots, e_L$. However, input containing sparse polynomials or polynomials which in some other sense have a "structure" may well admit telescopers of considerably smaller degrees.

Example 16. If $a_{n,k}$ denotes the number of HC-polyominoes with $n$ cells and $k$ rows (Wilf, 1989, Section 4.9), then
$$\sum_{n,k=0}^{\infty} a_{n,k}\, x^n y^k = \frac{x y (1 - x)^3}{(1 - x)^4 - x y (1 - x - x^2 + x^3 + x^2 y)}.$$
A differential equation for the generating function $\sum_{n=0}^{\infty} a_{n,n} x^n$ of the number of HC-polyominoes with $n$ cells and $n$ rows can be obtained by applying creative telescoping to the rational function obtained from the rational function above by substituting $x$ by $y$, $y$ by $x/y$, and dividing the result by $y$. Let thus
$$h = \frac{1}{y} \cdot \frac{y \frac{x}{y} (1 - y)^3}{(1 - y)^4 - y \frac{x}{y} \bigl(1 - y - y^2 + y^3 + y^2 \frac{x}{y}\bigr)} = \frac{x (1 - y)^3}{y \bigl((1 - y)^4 - x (1 - y + x y - y^2 + y^3)\bigr)}.$$
Here we have $c = x(1 - y)^3$, $a = 0$, $b = 1$, $c_1 = y$, $c_2 = (1 - y)^4 - x(1 - y + x y - y^2 + y^3)$, $e_1 = e_2 = -1$. The gray region for $h$ is shown in light gray in Figure 6. For comparison, the same figure contains the gray region (in dark gray) for a term $g$ which was obtained from $h$ by replacing $c$ and $c_2$ by dense random polynomials with $\deg_x c = 1$, $\deg_y c = 3$, $\deg_x c_2 = 2$, $\deg_y c_2 = 4$, so that all the Greek parameters have precisely the same values for $g$ and $h$. Theorem 14 predicts relations for all points $(r, d)$ on or above the black curve in Figure 6, which is a good estimate for the generic term $g$ but a significant overestimation for the special term $h$.

[Fig. 6. Gray regions for the two terms $h$ (light gray) and $g$ (dark gray) from Example 16. Although all Greek parameters have the same values for $h$ and $g$ (and hence, Theorem 14 gives the same degree estimation curve), the actual gray regions differ significantly.]
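The substitution step in Example 16 is easy to get wrong by hand, so a quick exact check may help. The Python sketch below assumes the generating function in the form $xy(1-x)^3 / ((1-x)^4 - xy(1 - x - x^2 + x^3 + x^2 y))$ and compares the substituted expression with the simplified form of $h$ at a few arbitrarily chosen rational points. The function names are hypothetical, and agreement at sample points is a sanity check, not a proof.

```python
from fractions import Fraction


def hc_gf(x, y):
    """Bivariate generating function of HC-polyominoes (form assumed above)."""
    num = x * y * (1 - x) ** 3
    den = (1 - x) ** 4 - x * y * (1 - x - x**2 + x**3 + x**2 * y)
    return num / den


def h_substituted(x, y):
    """Substitute x -> y and y -> x/y in the generating function, then divide by y."""
    return hc_gf(y, x / y) / y


def h_closed(x, y):
    """Simplified form of h: c / (c1 * c2) with c = x(1-y)^3 and c1 = y."""
    c2 = (1 - y) ** 4 - x * (1 - y + x * y - y**2 + y**3)
    return x * (1 - y) ** 3 / (y * c2)


# Arbitrary rational sample points at which no denominator vanishes.
points = [
    (Fraction(1, 3), Fraction(2, 5)),
    (Fraction(-2, 7), Fraction(3, 4)),
    (Fraction(5), Fraction(-1, 2)),
]
agree = all(h_substituted(x, y) == h_closed(x, y) for x, y in points)
print(agree)
```

Exact rational arithmetic is used so that equality can be tested without floating-point tolerance.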

5. Consequences and Applications

Our theorem contains as a special case Theorem cAZ of Apagodu and Zeilberger(2006), which says that a (non-rational) hyperexponential term always admits a tele-scoper of order r = γ + 1, but makes no statement about its degree d . Similarly, we canalso give an estimate for the possible degrees d without paying attention to their orders r . Corollary 17. (1) For every hyperexponential term h , there exists a creative telescop-ing relation of order r = ψ + 1 = γ + 1 − φ .(2) For every hyperexponential term h , there exists a creative telescoping relation ofdegree d = ϑ + 1 =  ( α + β )(2 γ − φ ) + γ if deg x a > deg x b ; α (2 γ − φ ) if deg x a ≤ deg x b . Proof.

Both claims are immediate by the formulas given in Theorem 14. ✷
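Both claims can be read off the bound $d > (\vartheta r + \phi)/(r - \psi)$ of Theorem 14 after a one-line rewriting; the following is only a sketch of that step.

```latex
\[
  \frac{\vartheta r + \phi}{r - \psi}
  \;=\; \vartheta + \frac{\vartheta\psi + \phi}{r - \psi}
  \qquad (r > \psi).
\]
```

At $r = \psi + 1$ the right hand side is the finite number $\vartheta(\psi + 1) + \phi$, so a relation of order $\psi + 1$ exists for every degree above it. As soon as $r - \psi > \vartheta\psi + \phi$, the correction term is smaller than 1, so $d = \vartheta + 1$ satisfies the bound. For the curve $d > (12r + 11)/(r - 1)$ of Example 15.(1), for instance, this happens for $r > 24$.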

26n connecting order r and degree d into a single formula, Theorem 14 makes a muchstronger statement than this corollary. Assuming for simplicity that the bounds of The-orem 14 are tight, we can use them to compute optimal choices for order and degree ofthe telescoper. There are various quantities which one may want to minimize. Besidesasking for a bound on the minimal order or the minimal degree, as carried out above,we may ask for a choice ( r, d ) where the computational cost is minimal, or the total size S ( r, d ) := ( r + 1)( d + 1) + ( s + 1)(deg x c + γ ( r −

1) + 2) of the output (consisting oftelescoper and certiﬁcate), or the size T ( r, d ) := ( r + 1)( d + 1) of the output telescoperalone. Or, if the telescoper P is to be transformed into a recurrence for the series coef-ﬁcients of its solutions, one may want to minimize the order of this recurrence, which isbounded by R ( r, d ) := r + d (see, e.g., Thm. 7.1 in Kauers and Paule, 2011).For minimizing the computational cost, we ﬁrst have to ﬁx a particular algorithm forcomputing P and Q for given h . We are not forced to follow the algorithm which isimplicit in the analysis of Sections 3 and 4 (making an ansatz, comparing coeﬃcientswith respect to x and y to zero, and solving a linear system of equations over K ).In fact, this algorithm has a rather poor performance. It is much better to do a co-eﬃcient comparison with respect to y only and to solve a linear system of equationsover K ( x ). This is also what is proposed in the original articles (Almkvist and Zeilberger,1990; Mohammed and Zeilberger, 2005; Apagodu and Zeilberger, 2006) and what is usedin practice (Koutschan, 2009, 2010). Output sensitive linear system solvers based onHermite-Pad´e approximation (Beckermann and Labahn, 1994; Storjohann and Villard,2005; Bostan et al., 2007) are able to determine the degree n solutions of a linear sys-tem over K ( x ) with m variables and at most m equations using O ∼ ( nm ) operationsin K . Since an ansatz over K ( x ) will have only r + 1 variables coming from the tele-scoper, deg y c + γ ( r − − φ + 2 variables coming from the certiﬁcate, and a so-lution of degree s with respect to x , it seems reasonable to assume that the com-putational cost is minimal for a choice ( r, d ) which minimizes the function C ( r, d ) := s (deg y c + ( γ + 1) r − γ − φ + 3) . Example 18.

Consider a hyperexponential term h = c exp( a/b ) √ c where a, b, c , c ∈ K [ x, y ] have the degrees deg x a = deg y a = deg x b = deg y b = 1, deg x c = deg y c = 2,deg x c = 4, deg y c = 6. We are in case 2 of Theorem 14 and have α = 6, β = − γ = 8, ω = 4, δ = 5, φ = 0, φ = 0, φ = 0. According to the theorem, a creative telescopingrelation exists for ( r, d ) with r ≥ d ≥ (89 r − / ( r −

6) + 1 = (90 r − / ( r − d = (90 r − / ( r − C ( r, d ) = (6 r + d − r − assumes its minimal value for r = 8 rather than for the minimal order r = 7. Finding thisoptimal value is easy: regard r temporarily as real variable and use calculus to determinethe minimum of C ( r, r − r − ). This gives a minimum point near r = 7 . r ∈ N is either at r = 7 or at r = 8. Comparing the actual values of C at these two points indicates that the 8th order telescoper is about 8% cheaper than the7th order operator, and hence the cheapest operator of all.By similar calculations, we ﬁnd that the output size (telescoper and certiﬁcate com-bined) is minimized for r = 10, the size of the telescoper alone is minimized for r = 12,and the order of the recurrence associated to the telescoper is minimized for r = 28. SeeFigure 7 for an illustration. 27 bcd e f r d Fig. 7. Points ( r, d ) on the curve for which (a) the order, (b) the computational cost, (c) thesize of telescoper and certiﬁcate combined, (d) the size of the telescoper only, (e) the order ofthe recurrence corresponding to the telescoper, and (f) the degree is minimal.
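The same kind of optimization can be carried out by brute force over integer orders instead of by calculus. The Python sketch below does this for the telescoper size $T(r, d) = (r+1)(d+1)$ and the recurrence order bound $R(r, d) = r + d$, using the curve $d > (12r + 11)/(r - 1)$ from Example 15.(1) as sample input rather than the constants of Example 18. The function names and the search window are assumptions made for illustration.

```python
def min_degree(r, theta=12, phi=11, psi=1):
    """Smallest integer d with d > (theta*r + phi)/(r - psi); requires r > psi."""
    return (theta * r + phi) // (r - psi) + 1


def telescoper_size(r, d):
    """T(r, d) = (r + 1)(d + 1), the number of coefficients of the telescoper."""
    return (r + 1) * (d + 1)


def recurrence_order(r, d):
    """R(r, d) = r + d, the bound on the order of the associated recurrence."""
    return r + d


orders = range(2, 60)  # search window, an arbitrary choice
best_T = min(orders, key=lambda r: telescoper_size(r, min_degree(r)))
best_R = min(orders, key=lambda r: recurrence_order(r, min_degree(r)))
print(best_T, min_degree(best_T))
print(best_R, min_degree(best_R))
```

Since the predicted degree decreases only slowly once $r$ is moderately large, both quantities eventually grow with $r$, so a finite window suffices for this curve. When several orders give the same value, `min` reports the smallest of them.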

For the moment, the term $h$ considered in the above example is a bit too big to actually compute the creative telescoping relations of orders 7 and 8 and compare the difference of the timings to the predicted speedup of 8%. On smaller examples, the minimal (predicted) complexity is achieved for the minimal order operator. It may seem that an improvement by just a few percent is not really worth the effort. But in fact, the improvement gained in the example is just the tip of an iceberg. Asymptotically, as the input size increases, the speedup becomes more and more significant. In the next result, which is a generalization and a refinement of a result of Bostan et al. (2010), we give precise estimates.

Corollary 19.

Let $h$ be a hyperexponential term and $\tau = \max\{\alpha, \gamma, \deg_x c_0, \deg_y c_0\}$. Let $\kappa$ be an increasing sublinear function with the property that degree $n$ solutions of a linear system with $m$ variables and at most $m$ equations over $K(x)$ can be computed with $nm^3\kappa(\max\{n,m\})$ operations in $K$. Then a creative telescoping relation of order $r = \tau - \varphi$ can be computed using
$2\kappa(2\tau^3)\tau^9 + \mathrm{O}^\sim(\tau^8)$
operations in $K$. If $r$ is chosen such that
$r = \tfrac14(1+\sqrt{17})\tau + \mathrm{O}(1) \le 1.29\,\tau + \mathrm{O}(1)$,
then a creative telescoping relation of order $r$ can be computed using
$\tfrac1{32}(349 + 85\sqrt{17})\,\kappa(11\tau^2)\tau^8 + \mathrm{O}(\tau^7) \le 21.86\,\kappa(11\tau^2)\tau^8 + \mathrm{O}^\sim(\tau^7)$
operations in $K$. In particular, creative telescoping relations for hyperexponential terms can be computed in polynomial time.

Proof. First assume $\deg_x a > \deg_x b$. According to Theorem 14, there exists a creative telescoping relation of order $r$ and degree $d$ whenever $r \ge \tau - \varphi$ and
$d \ge f(r) := \frac{(2\tau^2 + (2\beta + \varphi)\tau + (\varphi - \beta))\,r + \mathrm{O}(\tau^2)}{r - \tau + 2 - \varphi}$,
where the term $\mathrm{O}(\tau^2)$ is independent of $r$. A creative telescoping relation of order $r$ and degree $d$ can be computed using at most
$C(r,d) = \bigl((r+1)\tau + 3 - \varphi\bigr)^3 \bigl((\beta+\tau)r + d - \beta(\tau+\varphi) - \varphi - [\ldots]\bigr)\,\kappa\bigl((\beta+\tau)(r+1) + d\bigr)$
operations in $K$. The claim follows from evaluating $C(r, f(r))$ at $r = \tau - \varphi$ and $r = \tfrac14(1+\sqrt{17})\tau + \mathrm{O}(1)$, respectively, and replacing the arguments of $\kappa$ by generous upper bounds.

For the case $\deg_x a \le \deg_x b$, the estimates are proved analogously. Although the formulas for $f(r)$ and $C(r,d)$ are slightly different in this case, the final result turns out to be the same. We leave the details to the reader. ✷

The strange constant $\tfrac14(1+\sqrt{17})$ in Corollary 19 is chosen such as to minimize the multiplicative constant in the complexity bound under the simplifying assumption that $\kappa$ is constant. It was determined by first equating $\frac{d}{dr} C(r, f(r))$ to zero, which yielded the optimal choice of $r$ as an algebraic function in $\tau$, $\beta$, and $\varphi$. The term $\tfrac14(1+\sqrt{17})\tau$ is the dominant term in the asymptotic expansion of this function for $\tau \to \infty$. It is perhaps noteworthy that the choice of the constant is irrelevant for achieving a cost of $\mathrm{O}^\sim(\tau^8)$, as long as the constant is greater than 1. Taking $r = u\tau$ for arbitrary but fixed $u > 1$ gives the bound $\frac{u^4(u+1)}{u-1}\,\kappa\tau^8 + \mathrm{O}^\sim(\tau^7)$. The choice $u = \tfrac14(1+\sqrt{17})$ only minimizes the leading coefficient. Since $\tfrac14(1+\sqrt{17}) \approx 1.28$, the result indicates that when $\alpha$ and $\gamma$ are large and approximately equal, it appears to be most efficient to compute a telescoper whose order is about 30% larger than the minimum order.

In the same way as exemplified in Corollary 19, we have also determined the choices for $r$ for which some other quantities become minimal. The results are given in Table 1.

As a final application, we improve some of the results given by Bostan et al. (2007) on differential and recurrence equations related to algebraic functions. Let $m \in K[x,y]$ be irreducible with $\deg_y m \ge 1$, and let $a \in K[[x]]$ be such that $m(x, a(x)) = 0$. According to Proposition 2 in their paper, if $P + D_y Q$ is a creative telescoping relation for $y(D_y m)/m$, then $P a = 0$. Thus we can use our results about creative telescoping to derive estimates for differential equations for $a$.

Corollary 20.

Let m ∈ K [ x, y ] and a = P ∞ n =0 a n x n ∈ K [[ x ]] be as above and write τ x := deg x m , τ y := deg y m . Assume τ x > τ y >

0. Then(1) The series a satisﬁes a linear diﬀerential equation of order r = τ y with coeﬃcientsof degree d = 2 τ x τ y − τ y + τ x τ y − τ y + τ x + 3 . (2) The series a also satisﬁes a linear diﬀerential equation of order r = 2 τ y with coeﬃ-cients of degree d = 4 τ x τ y − τ y − τ x − l τ x + 1 τ y + 1 m . C ( r, d ) S ( r, d ) T ( r, d ) R ( r, d ) d (a) τ κτ − φ τ τ τ τ (b) √ τ √ κτ √ τ √ τ (5 + √ τ (5 + √ τ (c) √ τ √ κτ √ τ (4 + 2 √ τ (3 + √ τ (3 + √ τ (d) 2 τ κτ τ τ τ τ (e) √ τ / κτ τ √ τ / τ τ (f) 2 τ κτ τ τ τ τ Table 1.

Minimizing various functions on the curve of Theorem 14. The table shows the order r ,the complexity C ( r, d ), the output size S ( r, d ) of telescoper and certiﬁcate, the output size T ( r, d )of the telescoper only, the recurrence order R ( r, d ), and the degree d of the telescoper when r is chosen such that (a) r is minimal, (b) C ( r, d ) is minimal, (c) S ( r, d ) is minimal, (d) T ( r, d )is minimal, (e) R ( r, d ) is minimal, (f) d is minimal. The parameters τ and κ have the samemeaning as in Corollary 19. The arguments of κ are suppressed. Only the dominant terms ofthe asymptotic expansion for τ → ∞ are given. In rows (e) and (f), the values for d diﬀer onlyin the lower order terms. (3) The coeﬃcient sequence ( a n ) ∞ n =0 satisﬁes a linear recurrence equation of order l τ x τ y + τ y − q (8 τ y − τ y + 4) τ x − τ y − τ y + 12 m with polynomial coeﬃcients of degree l τ y − q (8 τ y − τ y + 4) τ x − τ y − τ y + 12 m . Proof.

For $h = y(D_y m)/m$ we have $\deg_x c_0 \le \alpha = \tau_x$, $\deg_y c_0 = \gamma = \tau_y$, $\omega \le [\ldots]$, $\delta \le [\ldots]$, $\varphi = 1$. According to Theorem 14.(2), a creative telescoping relation of order $r$ and degree $d$ exists provided that $r \ge \tau_y$ and
d ≥ τ x τ y r + 2 τ x τ y − τ y − τ y + 2 τ x + 62( r − τ y + 1) .
Parts 1 and 2 follow from here by setting $r = \tau_y$ or $r = 2\tau_y$, respectively. For part 3, observe first that there exists a creative telescoping relation of order $r$ and degree $d$ where
r ≥ τ y − q (8 τ y − τ y + 4) τ x − τ y − τ y + 12 ,
d ≥ τ x τ y + q (8 τ y − τ y + 4) τ x − τ y − τ y + 12 .
From here the claim follows by the fact that when a power series $a$ satisfies a linear differential equation of order $r$ and degree $d$, then its coefficient sequence satisfies a linear recurrence equation of order $r + d$ and degree $r$. ✷

These results are to be compared with the corresponding results of Bostan et al. (degree 4 τ x τ y + smaller terms for part 1, order 6 τ y and degree 3 τ x τ y for part 2, and order and degree 2 τ x τ y + τ y + 1 for part 3), as well as with the conjectures about the minimal sizes they found experimentally (2 τ − τ + 3 τ for part 1 when τ x = τ y =: τ and order and degree 2 τ x τ y − − ( τ x − τ y ) for part 3 if τ y > .

Conclusion

What is the shape of the gray region? Where does it come from? And how can it be exploited? These were the guiding questions for the work described in this article. As a main result, we have given in Theorem 14 a simple rational function whose graph passes approximately along the boundary of the gray region, in some examples more accurately than in others. This curve was derived from a somewhat technical analysis of the linear systems resulting from a specific ansatz over $K$. Where the curve does not describe the gray region accurately, these linear systems have solutions despite having more equations than variables.
Some possible reasons for this phenomenon were taken into account in the design of the ansatz, thereby improving the accuracy of the estimate compared to a naive approach. However, as shown in Examples 15.(2) and 15.(4), there seem to be further effects which sometimes cause a gap between the true degrees and our prediction. It would be interesting to know what these effects are, and to derive sharper estimates from them. Ultimately, it would be desirable to have a version of Theorem 14 which is generically tight.

Tight curves allow for optimizing computational cost, output sizes, and other measures by trading order against degree. As the degree decreases when the order grows, it is not always optimal to compute the minimal order operator. In Example 18, we have illustrated how the curve of Theorem 14 can be used to calculate a priori the optimal orders for several interesting measures. Of course, if the curve is not tight, these predictions may not be correct, but even then, at least they provide some useful orientation. Tightness of the curve is also not required for deriving asymptotic bounds on the complexity. As we have shown in Corollary 19, the difference between the optimal choice and other choices is significant for asymptotically large input size. We believe that this result is not only of theoretical interest. Even if the minimal cost may be achieved for the minimal order in any example which is feasible with currently available hardware, it can be seen from Example 18 that it already starts to make a difference for inputs which are only slightly beyond the capability of today's computers. We therefore expect that the technique of trading order for degree will help to optimize the performance of efficient implementations of creative telescoping in the near future.

Acknowledgements.

We wish to thank Christoph Koutschan and Carsten Schneider for valuable remarks on an earlier draft of this article.

References