Blends in Maple*

Robert M. Corless, School of Mathematical and Statistical Sciences, Western University; David R. Cheriton School of Computer Science, University of Waterloo. [email protected]
Erik J. Postma, Maplesoft, Waterloo, Ontario. [email protected]

* This work partially funded by NSERC.
Abstract. A blend of two Taylor series for the same smooth real- or complex-valued function of a single variable can be useful for approximation. We use an explicit formula for a two-point Hermite interpolational polynomial to construct such blends. We show a robust Maple implementation that can stably and efficiently evaluate blends using linear-cost Horner form, evaluate their derivatives to arbitrary order at the same time, or integrate a blend exactly. The implementation is suited for use with evalhf. We provide a top-level user interface and efficient module exports for programmatic use.
This work is intended for presentation at the Maple Conference 2020.
Keywords: Two-point Hermite interpolants · Blends · Maple · stable and efficient implementation.
Taylor series are one of the basic tools of analysis and of computation for functions of a single variable. Even so, it is not widely appreciated that Taylor series can be combined to give what is usually an even better approximation. In this paper we only consider blending Taylor series at two points, say z = a and z = b. We convert to the unit interval by introducing a new variable s with z = a + s(b − a). Most examples in this paper will just use s, but it is a straightforward matter to adjust back to the original variables, and we will give examples of how to do so.

Consider the following formula, likely known already to Hermite (because it is a special case of Hermite interpolation that he could hardly have neglected), namely that the grade m + n + 1 polynomial

H_{m,n}(s) = \sum_{j=0}^{m}\left[\sum_{k=0}^{m-j}\binom{n+k}{k} s^{k+j}(1-s)^{n+1}\right] p_j + \sum_{j=0}^{n}\left[\sum_{k=0}^{n-j}\binom{m+k}{k} s^{m+1}(1-s)^{k+j}\right] (-1)^j q_j   (1)

has a Taylor series matching the given m + 1 coefficients p_j = f^{(j)}(0)/j! at s = 0 and the given n + 1 coefficients q_j = f^{(j)}(1)/j! at s =
1. That is, using a superscript (j) to mean the jth derivative with respect to s,

H_{m,n}^{(j)}(0)/j! = p_j for 0 ≤ j ≤ m, and H_{m,n}^{(j)}(1)/j! = q_j for 0 ≤ j ≤ n.

As with Lagrange interpolation, where for instance two points give a grade one polynomial, that is, a line, here the m + n + 2 pieces of data give a grade m + n + 1 polynomial, which can be evaluated in O(m) + O(n) arithmetic operations.

We use the word grade to mean "degree at most". That is, a polynomial of grade (say) 5 is of degree at most 5, but because here the leading coefficient is not visible, we don't know the exact degree, which could be lower. Typically, with a blend we will not know the degree unless we compute it. This use of the word "grade" is common in the literature of matrix polynomial eigenvalue problems.

We show in figure 1 an example. We take the function f(s) = 1/Γ(s − 1). For a reference on the Gamma function, see [3]. This function has known series at s =
0 and s = 1:

\frac{1}{\Gamma(s-1)} = -s + (1-\gamma)s^2 + \left(\frac{\pi^2}{12}-\frac{\gamma^2}{2}+\gamma\right)s^3 + O(s^4)

and

\frac{1}{\Gamma(s-1)} = (s-1) + \gamma(s-1)^2 + \left(\frac{\gamma^2}{2}-\frac{\pi^2}{12}\right)(s-1)^3 + O\!\left((s-1)^4\right),

as computed by Maple's series command. The series coefficients get complicated as the degree increases, so we suppress printing them. We compute them up to degrees m = n = 9, giving a blend of grade 9 + 9 + 1 = 19. Figure 1 plots the error f(s) − H_{9,9}(s) and the derivative error f'(s) − H'_{9,9}(s), first in the top row computing the blend in 15 Digits (which takes a third of a second to compute the blend and three of its derivatives at 2021 points, so 8084 values) and comparing against Maple's built-in evaluator (computed at higher precision, in fact 60 Digits because of the apparent end-point singularity). In the second row we compare the blend computed at 30 Digits, which takes 3.14 seconds, about ten times longer than the 15 Digit computation. We see in the second row of the figure that the truncation error is smaller than 6·10^{-17}; Maple's hardware floats use IEEE double precision with a unit roundoff of 2^{-53} ≈ 1.1·10^{-16}. We therefore expect rounding error to dominate if we do computation in only 15 Digits, and that is indeed what we see in the top row; moreover, we see that the rounding error is not apparently amplified very much, if at all: the errors plotted are all modest multiples of the unit roundoff. The unit roundoff itself can be seen in the apparent horizontal lines, in fact. This will be indicative of the general behaviour of a blend: when carefully implemented, rounding errors do not affect it much. Since, as we will see, balanced blends are quite well-conditioned, this will result in usually accurate answers.

To compare with Taylor series and other methods of approximating this particular f(s), an equivalent cost Taylor series would be degree 19. The Taylor series of degree 19 at s = 0 has an error at s = 1 many orders of magnitude greater than the error in the blend; the series at s = 1 behaves similarly at s = 0. This is well-known: Taylor series are really good at their point of expansion, but will be bad at the other end of the interval. On the other hand, the best polynomial approximation to this function, found by the Remez algorithm, is of course better than the blend we produce here. Similarly a Chebyshev approximation to this function, as would be produced by Chebfun [1], is also better: either cheaper for the same accuracy, or more accurate for the same effort. And then there is the new AAA algorithm, which is better still [8], which we do not pursue further here. But where does a blend fit in on this scale of best-to-Taylor? The Chebyshev approximation accurate to 6·10^{-17} (as computed by numapprox[chebyshev]) is of degree 16, not 19, so it is about 20% cheaper to evaluate.
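Formula (1) can be transcribed directly. As a check, a grade m + n + 1 blend must reproduce any polynomial of degree at most m + n + 1 exactly. Here is a naive Python sketch for illustration only (the paper's implementation is in Maple); the function name `blend` is ours:

```python
from math import comb

def blend(s, p, q):
    """Naive evaluation of the two-point Hermite blend, formula (1).

    p[j] = f^(j)(0)/j! and q[j] = f^(j)(1)/j! are the Taylor
    coefficients at the two ends of the unit interval."""
    m, n = len(p) - 1, len(q) - 1
    total = 0.0
    for j in range(m + 1):
        inner = sum(comb(n + k, k) * s**(k + j) * (1 - s)**(n + 1)
                    for k in range(m - j + 1))
        total += inner * p[j]
    for j in range(n + 1):
        inner = sum(comb(m + k, k) * s**(m + 1) * (1 - s)**(k + j)
                    for k in range(n - j + 1))
        total += (-1)**j * inner * q[j]
    return total

# f(s) = s^3: p = [0, 0] and q = [f(1), f'(1)] = [1, 3]; the (1,1)
# blend has grade 3, so it must reproduce s^3 exactly.
print(blend(0.5, [0.0, 0.0], [1.0, 3.0]))  # → 0.125
```

This quadratic-cost double loop is only for checking; the linear-cost Horner-form evaluation the paper describes comes later.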
For the rational Remez best approximation, by numapprox[minimax], a modest choice of numerator and denominator degrees gets an even smaller error and is cheaper yet to evaluate. Conversely, when using a single Taylor series, experiments at high precision show that we need to use degree 29 to get an error strictly less than the error of the (9,9) blend everywhere in the interval, and a degree 28 Taylor series is strictly worse. Therefore both the best approximation and the Chebyshev series are better than a blend, but not by that much, while a blend beats a single Taylor series by a considerable margin. Because blends are relatively simple to compute and to understand, being "in the ballpark" of best approximation is likely good enough to make these objects interesting.

The error in Hermite interpolation is known, see for instance [6]. Here, the general results simplify to

f(s) - H_{m,n}(s) = \frac{f^{(m+n+2)}(\theta)}{(m+n+2)!}\, s^{m+1}(1-s)^{n+1}   (2)

for some θ = θ(s) between 0 and 1. This is quite reminiscent of the Lagrange form of the remainder of Taylor series, and indeed it reduces exactly to that if we have an (m, −1) or (−1, n) blend; that is, without using any information from the other point. We saw in the high-precision graphs in figure 1 that the actual error curve really does flatten out at either end. The errors in derivatives have a similar form, and as shown in [6] essentially lose one order of accuracy for each derivative taken.

Rounding error depends strongly on how the blend is actually evaluated. The ordinary Horner's rule has a standard "backward error" result: each evaluation is the exact evaluation of a polynomial with only slightly different coefficients. We believe that a similar result is true for blends (we have not yet written down the proof, but it doesn't

This is harder to judge than we are saying, here. Optimal evaluation of Chebyshev polynomials via preprocessing is not usually done; the Clenshaw algorithm is very stable and usually used because it is O(n) in cost. Similarly, evaluation of a blend is O(n). So this figure of 20% is likely not very true, but rather merely indicative.
(a) f(s) − H_{9,9}(s), 15 Digits   (b) f'(s) − H'_{9,9}(s), 15 Digits
(c) f(s) − H_{9,9}(s), 30 Digits   (d) f'(s) − H'_{9,9}(s), 30 Digits

Fig. 1.
The error in a (9,9) blend for f(s) = 1/Γ(s − 1). This grade of blend produces an approximation that is nearly accurate to full double precision; the truncation error is obscured by rounding errors. As is shown in the second row of graphs, recomputing these approximations at higher precision gives smoother error curves of about the same size and showing the theoretical s^{10}(1−s)^{10} behaviour. As usual with approximation methods, the accuracy degrades as the derivative order increases.

seem hard). This means that the effects of rounding error can be modelled by the usual combination of backward error (guaranteed to be small) times conditioning. We will see in the next section that blends are usually well-conditioned, and balanced blends are the best.

The numbers \binom{n+k}{k}, for 0 ≤ k ≤ m, and identically \binom{m+k}{k}, for 0 ≤ k ≤ n, which appear in formula (1), grow large rather quickly. Here is a table of the numbers: to get \binom{m+k}{k}, choose the mth row (indexed starting at 0) and read across from k = 0 to k = n. To get \binom{n+k}{k}, choose the nth column and read down from k = 0 to k = m. Of course the table is symmetric.

1   1   1    1    1    1   ···
1   2   3    4    5    6   ···
1   3   6   10   15   21   ···
1   4  10   20   35   56   ···
1   5  15   35   70  126   ···
1   6  21   56  126  252   ···
⋮   ⋮   ⋮    ⋮    ⋮    ⋮   ⋱

The largest entries are on the diagonal, and indeed

\binom{2m}{m} \sim \frac{4^m}{\sqrt{\pi m}}\left(1 + O\!\left(\frac{1}{m}\right)\right)   (3)

as we find out from the Maple command asympt( binomial(2*m,m), m ) and some simplification. One worries about the numerical effect of these large numbers for high-degree blends. We will see in section 1.5 that these do have some bad numerical effects sometimes, but we will also see by the example of high-degree blends that their influence is not as bad as it might be. For instance, when m = n is large, \binom{2m}{m} is astronomically large. Yet the blends that we have computed of such grades (including those for the Lebesgue function, in the next section) show no numerical difficulties.
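The asymptotic growth in (3) is easy to spot-check; this quick Python sanity check (ours, not from the paper) compares the central binomial coefficient with 4^m/√(πm) and watches the relative error shrink, consistent with the O(1/m) term:

```python
from math import comb, pi, sqrt

for m in (5, 10, 20, 40):
    exact = comb(2 * m, m)
    approx = 4**m / sqrt(pi * m)
    # the ratio approx/exact tends to 1 from above as m grows
    print(m, exact, approx / exact)
```

For m = 40 the approximation is already within about a third of a percent.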
One common measure of the numerical behaviour of a polynomial expression is its so-called Lebesgue function: this is defined to be what you would get if absolute values are taken of each term multiplying a Taylor coefficient, and moreover all Taylor coefficients are also replaced by 1. In our case we can see that on the interval 0 ≤ s ≤ 1 the terms in the first series, with p_j, are positive anyway; in the second term, we may make everything positive by choosing q_j = (−1)^j. Outside that interval, the absolute values are needed:

L_{m,n}(s) = \sum_{j=0}^{m}\left|\sum_{k=0}^{m-j}\binom{n+k}{k}s^{k+j}(1-s)^{n+1}\right| + \sum_{j=0}^{n}\left|\sum_{k=0}^{n-j}\binom{m+k}{k}s^{m+1}(1-s)^{k+j}\right|.   (4)

Thus the Lebesgue function for our blend is (inside 0 ≤ s ≤ 1) a blend itself, for a function with the same Taylor series at s = 0 as 1/(1−s), and the same Taylor series at s = 1 as 1/s = 1/(1+(s−1)). There is no function analytic everywhere with those properties, of course, but nonetheless these polynomials are useful. Having a small size of L is a guarantee of good numerical behaviour, if one implements things carefully. Here, for the balanced case m = n, one can show that inside the interval 1 ≤ L_{m,m}(s) ≤ 2, no matter how large m is. If m and n are large but not balanced, then we can have L_{m,n}(s) as large as the maximum of m and n. See figure 2. Outside the interval 0 ≤ s ≤ 1, L_{m,n}(s) grows like |s|^{m+n+1}. Blends are typically useful numerically only between the two endpoints and in a small region in the complex plane surrounding that line segment, where L(s) remains of moderate size. By refining this argument somewhat, we may do better for certain polynomials by taking better account of the polynomial coefficients through the theory of conditioning, see [4].

We will see that the definite integral of a blend over the entire interval will allow us to construct a new blend whose value at any point is the indefinite integral of the original blend up to that point, F(x) = \int_{s=0}^{x} H_{m,n}(s)\,ds. Direct integration over the entire interval 0 ≤ s ≤ 1, using the formula

\int_{s=0}^{1} s^{a}(1-s)^{b}\,ds = \frac{a!\,b!}{(a+b+1)!},

gets us a formula for F(1) involving the symbolic sum

\sum_{k=0}^{m-j}\binom{n+k}{k}\frac{(j+k)!\,(n+1)!}{(n+2+j+k)!}   (5)

and a similar one interchanging m and n. Maple can evaluate both those sums:

sm := sum( binomial(n+k,k)*(n+1)!*(k+j)!/(j+k+n+2)!, k=0..m-j ):
simplify( sm );

yields the right-hand side of the equation below:

\sum_{k=0}^{m-j}\binom{n+k}{k}\frac{(j+k)!\,(n+1)!}{(n+2+j+k)!} = \frac{(n+m-j+1)!\,(m+1)!}{(j+1)\,(n+2+m)!\,(m-j)!}.   (6)
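The closed form (6) is easy to spot-check against the raw sum (5); a small Python sketch (ours), using exact rational arithmetic so the comparison is exact:

```python
from math import comb, factorial as fac
from fractions import Fraction

def raw_sum(m, n, j):
    # the sum in (5), term by term
    return sum(Fraction(comb(n + k, k) * fac(j + k) * fac(n + 1),
                        fac(n + 2 + j + k))
               for k in range(m - j + 1))

def closed_form(m, n, j):
    # the right-hand side of (6)
    return Fraction(fac(n + m - j + 1) * fac(m + 1),
                    (j + 1) * fac(n + 2 + m) * fac(m - j))

for m in range(5):
    for n in range(5):
        for j in range(m + 1):
            assert raw_sum(m, n, j) == closed_form(m, n, j)
print("equation (6) verified for all m, n < 5")
```

The same check, with m and n interchanged, covers the companion sum.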
Fig. 2. Lebesgue functions for balanced blends (m,n) = (k,k) and for unbalanced blends (m,n) = (2k, k), for a range of k, showing the contrast between balanced and unbalanced m and n.

Similarly we find the other sum, and finally we get

\int_{s=0}^{1} H_{m,n}(s)\,ds = \frac{(m+1)!}{(m+n+2)!}\sum_{j=0}^{m}\frac{(n+m-j+1)!}{(j+1)\,(m-j)!}\,p_j + \frac{(n+1)!}{(m+n+2)!}\sum_{j=0}^{n}\frac{(n+m-j+1)!}{(j+1)\,(n-j)!}\,(-1)^j q_j.   (7)

The numbers showing up in this formula turn out to be smaller for higher-order Taylor coefficients, as one would expect. We emphasize that the above formula gives (in exact arithmetic) the exact integral of the blend over the whole interval. If the blend is approximating a function f(s), then integrating equation (2) gives us

\int_{s=0}^{1} f(s)\,ds - F(1) = \frac{(m+1)!\,(n+1)!}{(m+n+3)!}\,\frac{f^{(m+n+2)}(c)}{(m+n+2)!}   (8)

where, using the Mean Value Theorem for integrals and the fact that s^{m+1}(1−s)^{n+1} is of one sign on the interval, we replace the evaluation of the derivative at one unknown point θ with another unknown point c on the interval.

Once we have the value F(1), we can construct a new blend from the old one as follows. First, we put a value of 0 for the new F(0) at the left end (in a string of blends, we would accumulate integrals; for now, we are just integrating from the left end). Then we adjust all the Taylor coefficients at the left: the old f(0)/0! becomes the new F'(0)/1!; the old f'(0)/1! becomes the new F''(0)/2! and so we have to divide the old p_1 by 2; the old f''(0)/2! becomes the new F'''(0)/3! so we have to divide by 3, and so on until the old f^{(m)}(0)/m! becomes the new F^{(m+1)}(0)/(m+1)!. At the right end, the new blend has F(1) equal to the integral given above. We then shift all the old q_j = f^{(j)}(1)/j! into the new F^{(j+1)}(1)/(j+1)! for j = 0, …, n, dividing by j + 1 similarly. We now have a type (m+1, n+1) blend H_{m+1,n+1}(s). Its Taylor coefficients on the left are the same as the Taylor coefficients of F(x) = \int_{s=0}^{x} H_{m,n}(s)\,ds as a function of x. Its Taylor coefficients on the right are also the same as those of F(x) at x = 1. The grade of this new blend is (m+1) + (n+1) + 1 = m + n + 3. However, in exact arithmetic, the result is actually of degree at most m + n + 2, because the value is the exact integral of a polynomial, and thus we see that the blend we have is actually using more information than it needs. We could throw one of the highest derivatives away (it's natural to do so at the right end) but there is no real need unless we expect to do this process repeatedly to a single blend.

To use this formula on integration from z = a to z = b one must incorporate the change of variable from z to s = (z − a)/(b − a). Putting h = (b − a), then we must (as always) scale the Taylor coefficients p_j and q_j by multiplying each by h^j, and then finally the integral is just

\int_{z=a}^{b} H_{m,n}\!\left(\frac{z-a}{b-a}\right)dz = h\int_{s=0}^{1} H_{m,n}(s)\,ds.   (9)

If we have more than one blend lined up in a row, which we call a "string of blends" (this is quite natural, as can be seen from the fact that blends are joined at what are termed "knots" in the spline and piecewise polynomial literature), then this formula can be used to generate composite quadrature rules. The case m = n = 1 gives the corrected trapezoidal rule

\int_{z=a}^{b} H_{1,1}\!\left(\frac{z-a}{b-a}\right)dz = \frac{h}{2}\left(f(a)+f(b)\right) - \frac{h^2}{12}\left(f'(b)-f'(a)\right).   (10)

A (4,4) blend gives the rule

\int_{s=0}^{1} H_{4,4}(s)\,ds = \frac{p_0}{2} + \frac{p_1}{9} + \frac{p_2}{36} + \frac{p_3}{168} + \frac{p_4}{1260} + \frac{q_0}{2} - \frac{q_1}{9} + \frac{q_2}{36} - \frac{q_3}{168} + \frac{q_4}{1260},   (11)

where, to use this on an interval of width h, one needs powers of h in the Taylor coefficients. We see that this balanced blend gives coefficients that will telescope at odd orders for composite rules on equally-spaced intervals. [This is well-known.] See also [9] for optimal formulas of this balanced type.

One often wants to find roots of a polynomial; or roots of a function which we approximate by a polynomial, and then finding the roots of the polynomial approximates the
roots of the function. We will show a way to do this for blends by computing generalized eigenvalues of a pair of matrices, which we will typically call A and B. The command Eigenvalues(A,B) in the LinearAlgebra package computes generalized eigenvalues rapidly and efficiently. This is a specialization of the general method described in [7], where we find a companion matrix pair for polynomials expressed in general Hermite interpolational bases. This is a pair of matrices (A, B) with det(sB − A) = p(s), the original polynomial. The method is numerically stable, and very convenient. This matrix pair does, however, introduce two extra eigenvalues at infinity, but this is usually no bother.

The construction of the matrix pair is simplified in the two-point Hermite interpolational case: that is, for a blend. We will demonstrate by example. For a simple cubic blend with m = n = 1, we have

A = \begin{bmatrix} 0 & -h f'(b) & -f(b) & -h f'(a) & -f(a) \\ 1 & 1 & 0 & 0 & 0 \\ -2 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 1 & 0 \end{bmatrix}   (12)

and

B = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.   (13)

The reader may verify that Determinant(z*B-A) will produce a polynomial represented in the monomial basis that has p(a) = f(a), p'(a) = f'(a), p(b) = f(b), and p'(b) = f'(b), and is of degree at most 3. The barycentric weights, the numbers \binom{m+k}{k}, appear in the first column of A, increasing in magnitude as the row index increases in each block and alternating in sign in the block associated with the node s = 1. The values of the function and scaled derivatives appear (in order of high derivatives to low) in the first row. The nodes appear in transposed Jordan blocks as block diagonals. The dimension of the matrices is m + n + 3, which is two higher than it could be: there are two infinite generalized eigenvalues of the pair (A, B) no matter what the function values are. But the (at most) three finite roots are the roots of the cubic equation that fits the data. One can verify this directly by taking the determinant.

It is straightforward to generalize this construction to arbitrary (m,n) blends. However, for large enough grades, the eigenvalue conditioning of the matrix pair becomes large; this technique is only good up until moderate grades (depending on the Taylor coefficients too, of course). We will show detailed examples and tests in section 3. Here, we will just show two examples.

First, we use this idea to get a good starting guess for an iterative scheme for rootfinding. Consider the equation f(x) = x exp(x) − 1 =
0. Of course the unique positive real value of x that solves this is W(1), where W is the Lambert W function. But if instead the equation is not that, but f(x) = (x + 1) exp(x) − 2 = 0, then we no longer have an analytical solution. We notice by inspection that f(0) = −1 < 0 and f(1) = 2e − 2 > 0, and so could decide to use bisection, or an iterative method such as Newton's method (or any of infinitely many methods, really: Maple's fsolve works very well, for instance) but for fun let us proceed as above. We compute series at x = 0 and x = 1 to blend on the interval 0 ≤ x ≤ 1; let us use m = 3, n = 2. This gives us

A = \begin{bmatrix} 0 & -2e & -3e & -(2e-2) & -2/3 & -3/2 & -2 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -4 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 10 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 6 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 10 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}.   (14)

The B matrix is the identity except the (1,1) entry is zero. Using Eigenvalues(evalf(A), evalf(B)) we find all generalized eigenvalues: two are infinite (as promised), four are complex, and two are real, but one is outside the interval and the other is x ≈ 0.3748. This is a root of the (3,2) blend, not of f(x), so we perform one iteration of Halley's method

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n) - \dfrac{f(x_n)\,f''(x_n)}{2 f'(x_n)}}   (15)

on it (the derivatives of f are easy) to get x correct to nearly all fifteen Digits; starting from the root of a smaller blend and taking two Halley steps instead gets us a result correct to all fifteen Digits.

As another example, consider a blend of a step function: f(s) = −1 at s = 0, f(s) = 1 at s = 1, with all higher derivatives zero at both ends. The single sign change of the blend occurs near s ≈ 0.
618 reflecting the balance of the blend. The companion pair correctly gets this zero. The curve along which the zeros lie seems to be a kind of Szegő curve (perhaps a blend of two such curves); convergence to the function is not expected outside that curve. See figure 3.
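Returning to the first example, the construction and the Halley polish can be sketched in a few lines. This is an illustrative Python transcription (ours, not the paper's Maple code); it builds the cubic pair (12)–(13) for f(x) = (x+1)eˣ − 2 on [0,1], finds the blend's real root by bisecting det(zB − A) directly (where Maple would use Eigenvalues), and then applies two Halley steps (15):

```python
import math

f   = lambda x: (x + 1) * math.exp(x) - 2
fp  = lambda x: (x + 2) * math.exp(x)   # f'
fpp = lambda x: (x + 3) * math.exp(x)   # f''

# Companion pair for the (1,1) blend on [0,1] (h = 1): data in the first
# row, weights in the first column (alternating sign for the node s = 1),
# transposed Jordan blocks on the diagonal.
A = [[0.0, -fp(1.0), -f(1.0), -fp(0.0), -f(0.0)],
     [1.0,  1.0, 0.0, 0.0, 0.0],
     [-2.0, 1.0, 1.0, 0.0, 0.0],
     [1.0,  0.0, 0.0, 0.0, 0.0],
     [2.0,  0.0, 0.0, 1.0, 0.0]]
Bdiag = [0.0, 1.0, 1.0, 1.0, 1.0]

def det(M):
    # Laplace expansion along the first row; fine for a 5x5 example
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def blend_det(z):
    # det(zB - A) is exactly the cubic blend of the data
    M = [[(z * Bdiag[i] if i == j else 0.0) - A[i][j]
          for j in range(5)] for i in range(5)]
    return det(M)

lo, hi = 0.0, 1.0                       # blend_det(0) < 0 < blend_det(1)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if blend_det(lo) * blend_det(mid) <= 0:
        hi = mid
    else:
        lo = mid
x = 0.5 * (lo + hi)                     # root of the blend: starting guess

for _ in range(2):                      # Halley's method, equation (15)
    x -= f(x) / (fp(x) - f(x) * fpp(x) / (2 * fp(x)))

print(x)   # root of (x+1)e^x = 2, near 0.3748
```

The blend's root lands within about 0.01 of the true root, and the Halley steps then converge essentially to machine precision.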
If we look at equation (1) with a programmer's eye, we see a lot of room for economization. First, the sums are polynomials in s and in 1 − s. Because 0 ≤ s ≤ 1, both of

Fig. 3. The zeros of a blend of a step function f(s) = −1 at s = 0, f(s) = 1 at s = 1.

these terms are positive, so we do not want to expand powers of (1 − s), for instance; introducing subtraction means potentially revealing rounding errors made earlier. But as a first step we may put the sums in Horner form. We remind you that the Horner form of a polynomial f(x) = f_0 + f_1 x + f_2 x^2 + f_3 x^3 is a rewriting so that no powers occur, only multiplication: f(x) = f_0 + x(f_1 + x(f_2 + x f_3)). The form can be programmed in a simple loop:

p := f[n];
for j from n-1 by -1 to 0 do
  p := f[j] + x*p;
end do;

Here we have a double sum, and in each sum we may write in Horner form; that is, where the loop above has a simple f[j] we would have an inner Horner loop to compute it. But the inner sum is simply \sum_{k=0}^{m-j}\binom{n+k}{k}s^k once the s^j(1-s)^{n+1} is factored out of it. These inner sums should be precomputed by the simple recurrence (adding the next term to the previous sum), outside of the innermost loop, so that the cost is proportional to either n or m, and not their product.

The numbers \binom{m+k}{k} and \binom{n+k}{k} occur frequently, and perhaps they should be precomputed. Except that they, too, can be split in a Horner-like fashion, because for k ≥ 1

s^k\binom{m+k}{k} = s\,\frac{m+k}{k}\cdot s^{k-1}\binom{m+k-1}{k-1}.

While this is actually more expensive than precomputing the numbers, it keeps the size of the numbers occurring in the formula small (remember 0 ≤ s ≤ 1), contributing to numerical stability. This is best seen by example. In the m = n = 3 case, one of the sums is 1 + 4s + 10s^2 + 20s^3, which rewritten in Horner form is just 1 + s(4 + s(10 + s·20)). But if we factor out the binomial coefficient factors using the rule above, it becomes

1 + 4s\left(1 + \frac{5}{2}s\left(1 + 2s\right)\right).
It might be better to keep only integers in the rewritten form; we do not know how to do that in general, although it is simple enough for this example.

A final and important efficiency is to realize that the sum for the left-hand terms and the sum for the right-hand terms are related by a symmetry: exchange m and n, exchange s and 1 − s, and account for sign changes in the second sum by absorbing them into the q_j, and the sums can be executed by the same program. This leads to later programmer efficiency as well, if one thinks of a further improvement to the code: then it only has to happen in one place. [This actually happened here.]

The goal is to make the innermost loop as efficient as is reasonably possible. We expect that these blends will be evaluated with hundreds of points (routinely) and on occasion with tens of thousands of points (for a tensor product grid of a bivariate function, for instance). In Maple, we would like to be able to use evalhf or even the compiler.
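The economizations above (factoring out (1−s)^{n+1} and s^{m+1}, precomputing the inner partial sums by recurrence, and Horner evaluation of the outer sums) can be sketched as follows. This is a Python transcription for clarity only; the paper's actual implementation is in Maple, and the function name `blend_fast` is ours:

```python
from math import comb

def blend_fast(s, p, q):
    """Evaluate formula (1) in O(m + n) work after O(m + n) setup."""
    m, n = len(p) - 1, len(q) - 1
    # Partial sums T[r] = sum_{k=0}^{r} C(n+k,k) s^k, by a one-term recurrence.
    T = [1.0]
    for k in range(1, m + 1):
        T.append(T[-1] + comb(n + k, k) * s**k)
    # First double sum: (1-s)^(n+1) * sum_j T[m-j] p_j s^j, in Horner form.
    acc = 0.0
    for j in range(m, -1, -1):
        acc = acc * s + T[m - j] * p[j]
    left = (1 - s)**(n + 1) * acc
    # Second sum by the m <-> n, s <-> 1-s symmetry, with (-1)^j absorbed
    # into the q coefficients.
    U = [1.0]
    for k in range(1, n + 1):
        U.append(U[-1] + comb(m + k, k) * (1 - s)**k)
    acc = 0.0
    for j in range(n, -1, -1):
        acc = acc * (1 - s) + U[n - j] * ((-1)**j * q[j])
    right = s**(m + 1) * acc
    return left + right

# Cross-check: the (1,1) blend of data for s^3 must return s^3 exactly.
print(blend_fast(0.5, [0.0, 0.0], [1.0, 3.0]))  # → 0.125
```

Note the two passes really are the same loop under the stated symmetry, which is how a single shared routine can serve both sums.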
The Horner loop above can be rewritten to provide not only the value of p(x) but also of p'(x), the derivative with respect to x. This is also called program differentiation. Maple's D operator can differentiate simple programs such as that. Supposing we define

Horner := proc(x, f, n)
  local i, p;
  p := f[n];
  for i from n-1 by -1 to 0 do
    p := f[i] + x*p;
  end do;
  return p;
end proc:
Then the command D[1](Horner) produces the following:

proc(x, f, n)
  local i, p, px;
  px := 0;
  p := f[n];
  for i from n - 1 by -1 to 0 do
    px := px*x + p;
    p := f[i] + x*p;
  end do;
  px;
end proc

This procedure returns just the derivative, not the derivative and the polynomial value. If one wishes that, one may use instead codegen[GRADIENT], with the syntax

codegen[GRADIENT](Horner, [x], function_value = true)
This command generates the following code.

proc(x, f, n)
  local dp, i, p;
  dp := 0;
  p := f[n];
  for i from n - 1 by -1 to 0 do
    dp := dp*x + p;
    p := p*x + f[i];
  end do;
  return p, dp;
end proc
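The generated pair-return procedure translates directly to other languages; here is a Python sketch of the same loop (ours, for illustration):

```python
def horner_grad(x, f):
    """Return (p(x), p'(x)) for p(x) = f[0] + f[1] x + ... + f[n] x^n,
    by the same single pass as the codegen[GRADIENT] output."""
    n = len(f) - 1
    dp = 0.0
    p = float(f[n])
    for i in range(n - 1, -1, -1):
        dp = dp * x + p      # uses the old p, exactly as in the Maple code
        p = p * x + f[i]
    return p, dp

# p(x) = 1 + 2x + 3x^2 + 4x^3 at x = 2: p = 49, p' = 2 + 6x + 12x^2 = 62
print(horner_grad(2.0, [1, 2, 3, 4]))  # → (49.0, 62.0)
```

The order of the two updates matters: dp must be updated before p, since it uses the previous value of p.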
Procedures for evaluating higher-order derivatives may be computed in a similar way. For our purposes, though, it is better to allow an arbitrary number nder of derivatives. This means not adding one or more statements to the Horner loop, but rather writing a loop to evaluate all the derivatives. Here is this idea applied to the Horner program above.
Horner := proc(x, f, n, nder)
  local i, ell, p;
  p := Array(0..nder, 0);
  p[0] := f[n];
  for i from n-1 by -1 to 0 do
    for ell from nder by -1 to 1 do
      p[ell] := p[ell]*x + ell*p[ell-1];
    end do;
    p[0] := f[i] + x*p[0];
  end do;
  return p;
end proc:
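In Python the same derivative-carrying loop looks like this (a sketch of ours; note that, as in the Maple version, the returned entries are the derivative values themselves, not divided by factorials):

```python
def horner_ders(x, f, nder):
    """Evaluate p(x) = f[0] + f[1] x + ... + f[n] x^n together with its
    first nder derivatives, by the same loop structure as the Maple code."""
    n = len(f) - 1
    p = [0.0] * (nder + 1)
    p[0] = f[n]
    for i in range(n - 1, -1, -1):
        for ell in range(nder, 0, -1):
            # inner loop runs high-to-low so each p[ell-1] is still
            # the value from the previous outer iteration
            p[ell] = p[ell] * x + ell * p[ell - 1]
        p[0] = f[i] + x * p[0]
    return p

# p(x) = 1 + 2x + 3x^2 + 4x^3 at x = 2: value, then 1st..3rd derivatives
print(horner_ders(2.0, [1, 2, 3, 4], 3))  # → [49.0, 62.0, 54.0, 24.0]
```

For numeric x this is a tight loop of multiply-adds, which is why it is a good candidate for evalhf or compilation.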
Calling this with symbolic x and f and numeric n and number of derivatives desired gets something like (for n = 3)

p(x) = f_0 + x(f_1 + x(x f_3 + f_2)),

which looks familiar, but has the expression

p'(x) = (2x f_3 + f_2)x + f_1 + x(x f_3 + f_2),

which looks strange. But it's correct: just a rewriting of the normal derivative of a cubic polynomial. The strength of this technique is not for symbolic use, but rather for numeric use. When calling the modified program with a numeric x then the loop just performs (reasonably efficient) numerical computation; this program can be translated into other languages, as well. For the code for our blends, we simply wrote all the loops ourselves as above. We have not yet tried to translate the resulting code (which is more complicated than the simple Horner loop above) into any other languages.

There is similar code in the
Interpolation package and in the
CurveFitting package, namely
Spline and
ArrayInterpolation. The interface to this code should not be too much different from those. Consideration of the various possible kinds of inputs demonstrates that a front-end that dispatches to the most appropriate routine would be helpful; if the input s is a symbol, then there is no point in calling evalhf, for instance. If the input is an Array of complex floating-point numbers, then depending on Digits it might indeed be appropriate to try the hardware float routine. For that reason we chose a module with
ModuleApply as being most convenient; this would allow the user to be relatively carefree. We also allowed the module to export the basic 'fast' routines so that if the user wanted to look after the headaches of working storage of hardware floating point datatypes then the user could use blends in their own code without a significant performance penalty. Another issue is a good name. We chose
HornerTwoPointHermiteInterpolation before we thought of the name blend. At the moment we have kept the old name and use macro(Blend=HornerTwoPointHermiteInterpolation) for short.

The minimum information that the routine needs is z, a, b, and the Arrays p and q of Taylor coefficients. If the user does not request a number of derivatives, it can be safely assumed that only H(z) is wanted. The grade (m,n) of the blend can be deduced from the input Arrays p and q. It might be a convenience to the user to allow the ability to specify m or n even if the input Arrays are larger, of course.

The types of data input can vary considerably. We allow rationals, exact numerics, software floats, hardware floats, and complex versions of all of those. We do not provide for finite fields (the binomial coefficients will sometimes be zero; we don't even know if formula (1) is true in that case) or for matrix values, although for that latter case the concept is well-defined.

The data type of the output can vary, as well: when there is an Array of inputs, and only function values are wanted and no derivatives, the user would surely expect an Array of outputs of the same dimension. If derivatives are wanted, though, then there will be a higher-dimensional Array output; sometimes the special case of an index 0 for such a higher-dimensional output would fit the user's expectations, so we allow an option to specify such. The default is just to be sensible: scalar in, scalar out; Vector in, Vector out.

Currently several operations take place outside the code, in "main Maple". This includes series manipulations and the construction of the companion matrix pair. Construction of the integrated blend is also currently left in the user's hands.

At this moment we do not know just how this code would be used, or who would be interested (aside from people interested in high-order methods for solving differential equations numerically).
The idea is surprisingly flexible: by reversing Taylor series at
each end, it is easy to make blends of inverse functions, for instance. The usage will affect how convenient or inconvenient the user interface is. So we tried to make the interface as simple as possible. Doubtless improvements will occur to us after it's out in the wild.

Right now, there is too much duplicate code; there is one more refactoring that is needed and has not yet been done. The "automatic" differentiation code is a bit fragile, and a bit opaque; we have tried to document it with future modification in mind. Having that actually generated automatically would improve code maintainability substantially. The code can be made still faster: one would want to be able to parallelize evaluation at a vector of points, for instance, and one would want to be able to compile the code. These achievements would perhaps come at a cost of maintainability.
In figure 4 we see the results of a simple test with random Taylor coefficients drawn from the interval -1 <= x <= 1. We first used the Maple rand function to generate coefficients for the maximum m and n. Subsequent calls to Blend used subsets of those data. The blends were evaluated in 15 Digit precision at 2021 points equally spaced on 0 <= s <= 1, computing H_{m,m}(s), H'_{m,m}(s), H''_{m,m}(s), and H'''_{m,m}(s). The computing time was modest and showed linear growth, with a linear fit in m to its data (in seconds). Thus the computing time seems, as expected, linear in the degree of the balanced blend. We ran a further test with the same coefficients but this time without asking for derivatives; the cost (not shown) was smaller by a factor of about 4.

In figure 5 we see the errors in blends for f(s) = cos(pi*s) and its derivatives. This shows the effects, scaled with the appropriate power of pi, of taking the derivative. This function has known Taylor series at each end (indeed the coefficients are just the negatives of each other): cos(pi*s) = 1 - (pi*s)^2/2! + (pi*s)^4/4! - ... and at the other end cos(pi*s) = -1 + pi^2(s-1)^2/2! - pi^4(s-1)^4/4! + pi^6(s-1)^6/6! - pi^8(s-1)^8/8! + O((s-1)^10). The (9,9) blends are better, and use essentially the same information because the Taylor coefficients of degree 9 at either end are zero, but these curves are informative about the numerical stability and efficiency of these blends.

We then chose a harder example. In figure 6 we find the results of a "stress test", namely a blend for the function f(s) = exp(-1/s). This has all its right-hand derivatives at s = 0 equal to zero.

Fig. 4. CPU time in seconds on an i5-7300U 2.6 GHz Microsoft Surface Pro running Maple, computing balanced blends H_{m,m}(s) of varying degrees. The blends and their first three derivatives were evaluated on 2021 equally spaced points on the interval 0 <= s <= 1. The linear fit in m shows linear growth of computing time, as should have been expected.

We used diff(exp(-1/x),x$k) to find that, for k >= 1,

d^k f/dx^k at x = 1 is (-1)^(k+1) k! e^(-1/2) WhittakerM(k, 1/2, 1) = (-1)^(k+1) k! e^(-1) 1F1(1-k; 2; 1). (16)

Here 1F1 represents hypergeom([1-k], [2], 1). For k = 0 the value is just exp(-1). It is amusing to note that apart from sign, the first 5 Taylor coefficients at s = 1 are just exp(-1)/k!, but the degree 5 coefficient is -19 exp(-1)/120. Since all the Taylor coefficients at s = 0 are zero, only the q part of formula (1) is present, so the blend is actually degree 1001, not just grade 1001; we took m = 900 and n = 100. The largest binomial coefficient appearing is C(1000, 100), which is about 6.3*10^139, which suggests that numerical difficulties are to be expected. None, however, appear. The blend is entirely smooth, and the difference between the blend and f(s) is no more than 10^- at its greatest. One expects that the blends will converge as (m, n) go to infinity, for any fixed ratio of m to n; here the ratio was 9 to 1, and blends both of degree 900 and of 1001 gave small errors.

(a) f(s) - H_{9,9}(s) (b) (f'(s) - H'_{9,9}(s))/pi (c) (f''(s) - H''_{9,9}(s))/pi^2 (d) (f'''(s) - H'''_{9,9}(s))/pi^3 Fig. 5.
The error in a (9,9) blend for f(s) = cos(pi*s). This grade of blend produces an approximation that is nearly accurate to full double precision; the truncation error, proportional to s^10 (1-s)^10, is beginning to be obscured by rounding errors. Recomputing these errors at higher precision gives smoother curves of about the same size. As usual with approximation methods, the accuracy degrades as the derivative order increases. Fig. 6.
The difference between exp(-1/s) and its (900, 100) blend near its maximum point. The maximum difference is about 10^-. Computing this grade 1001 blend and its derivative at 2021 points (once the series coefficients at s = 0 and s = 1 were known) involves binomial coefficients as large as C(1000, 100), about 6.3*10^139, but no numerical artifacts are seen: the blend is smooth all across 0 <= s <= 1.

We now give another stress test, this one (finally) showing some numerical failure: overflow and underflow, resulting in NaNs (floating-point Not-A-Numbers). We blend the step function with f(s) = -1 at s = 0 and f(s) = 1 at s = 1. Whatever m and n, the step will be located somewhere between; near s = (m+1)/(m+n+2), in fact. The Lebesgue function is maximal at that point, with value (max(m, n) + 1)/(min(m, n) + 1). For (m, n) = (987, 610) (these are Fibonacci numbers, by the way) we have the largest binomial coefficient about 3.5*10^459, which must overflow in IEEE double precision, which is used by evalhf. The corresponding powers of s and 1 - s must underflow. In spite of that, the blend correctly computes (taking 7.7 s CPU time) the step portion of the figure: overflow and underflow causing NaNs only happen in the flat portions of the blend. See figure 7. Computing instead in 30 Digits (which takes about 70 seconds on the same machine) does not suffer from overflow or underflow, because software floats in Maple have a greater range. At this precision, Maple computes the complete figure (not shown). Moreover, comparing the numerical values computed at 15 Digits to the values computed at 30 Digits, we find that the largest difference is smaller than 7*10^-. Working at 15 Digits, the blend was able to correctly compute the interesting part of the curve, even though overflow/underflow prevented it from computing the flat parts. The derivative was computed in 15 Digits correct to 10^- in the same region where the function was computed correctly.

Note that phi*2020 is approximately 1248.4, where phi = (sqrt(5) - 1)/2 is approximately 0.618. The output of Blend was stored in the Array(0..2020, 0..1) y, and the computed values change sign between entries 1248 and 1249: the location of the step is indeed determined by the ratio of m + 1 to m + n + 2, with m and n being Fibonacci numbers.

We conclude that blends of degrees high enough to produce binomial coefficients C(m+n, n) that overflow will cause numerical difficulty. What seems surprising is that this is the only case where we have seen numerical difficulty. We have also looked at cases where the function has nearby complex poles, so that the series cannot converge, and while the blends do have unexpected features in the regions between the two points of expansion, in all cases they behaved smoothly. Even when we tried series for functions whose Taylor series are known to be ill-conditioned (such as exp(-s) blended with exp((s-1)/pi)), the resulting behaviour was explainable. Fig. 7.
The (987, 610) blend of the step function taking the value -1 at the left and 1 at the right. The largest binomial coefficient is C(1597, 610), about 3.5*10^459, and this causes both overflow and underflow, resulting in NaNs, which are not plotted. Every numerical value that is plotted is correct to 13 digits, however: the only numerical failure is overflow. This blend has grade 1598, and computing it and its derivative at 2021 values (many of which resulted in NaNs) took 7.7 seconds.

The idea of blending two Taylor series is quite old, and people have tried to do it in several different ways. The Hermite interpolation idea is one of the oldest, but we think that not enough attention has been paid to it. There are other ways in the literature. For example, there is the very similar work in [5], which makes a kind of blend with a variable upper limit and uses that to construct rational approximations.

The next step of course is to combine different blends into what we call a string of blends, joined at "knots" where the same local Taylor series are reused. This is a kind of piecewise polynomial, similar to splines, which are another kind of piecewise polynomial. Of course, having a single blend use more than two truncated Taylor series is just Hermite interpolation. There are a great many other similar ideas in the literature.

It is not clear if what we are calling a "blend" will be sufficiently useful to catch on widely; the existing body of numerical software involving piecewise polynomials is quite substantial, and it is not clear that a blend is any better than what is being used already. However, there are some niche situations, such as the numerical solution of ODE by high-order methods, where it is natural (and already being used in specialized software). There may also be useful pedagogical reasons to talk about blends (whimsically, we called some of the unusual quadrature formulas "Anti-Cheating Quadrature rules"; this may be of interest for student assessments!). The idea of a blend does reinforce the ideas of convergence and approximation.
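As a concrete sketch of the string-of-blends idea (a hedged Python illustration, not the paper's Maple code; the names `blend` and `string_of_blends` are ours, and the blend evaluator from formula (1) is repeated here so the block is self-contained): store Taylor coefficients at a list of knots, reuse each interior series for the pieces on both of its sides, and rescale coefficients to the unit interval before evaluating.

```python
from bisect import bisect_right
from math import comb

def blend(p, q, s):
    """Two-point Hermite blend H_{m,n}(s) of formula (1) on 0 <= s <= 1."""
    m, n = len(p) - 1, len(q) - 1
    total = 0.0
    for j in range(m + 1):
        total += (p[j] * (1 - s)**(n + 1)
                  * sum(comb(n + k, k) * s**(j + k) for k in range(m - j + 1)))
    for j in range(n + 1):
        total += ((-1)**j * q[j] * s**(m + 1)
                  * sum(comb(m + k, k) * (1 - s)**(j + k) for k in range(n - j + 1)))
    return total

def string_of_blends(knots, coeffs, z):
    """Evaluate a piecewise 'string of blends' at the point z.

    knots: increasing z-values; coeffs[i][j] = f^(j)(knots[i])/j!.
    Each interior Taylor series is reused by the pieces on both of its
    sides, as in the text.  Substituting z = a + s*h with h = b - a
    multiplies the j-th Taylor coefficient by h^j, which is the
    rescaling done below.
    """
    i = min(max(bisect_right(knots, z) - 1, 0), len(knots) - 2)
    a, b = knots[i], knots[i + 1]
    h = b - a
    p = [c * h**j for j, c in enumerate(coeffs[i])]
    q = [c * h**j for j, c in enumerate(coeffs[i + 1])]
    return blend(p, q, (z - a) / h)
```

For example, for exp(z) with knots 0, 1, 2 and six Taylor coefficients stored at each knot, each piece is a (5,5) blend and the string reproduces the function to roughly ten digits between the knots.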
Trying to blend two series that have complex poles near to each expansion point produces some very informative results! The answer to "Will it blend?" is, in that case, "no". The antics of the blends as they try to converge (when they can't) are quite entertaining.

We would like to extend this code to vector and matrix blends. We would also like to blend Laurent and Puiseux series (Laurent series seem, in fact, very simple: just use a blend of the Taylor series for (z - a)^alpha (z - b)^beta f(z)).

References
1. Battles, Z., Trefethen, L.N.: An extension of MATLAB to continuous functions and operators. SIAM Journal on Scientific Computing 25(5), 1743-1770 (2004)
2. Benghorbal, M., Corless, R.M.: The nth derivative. ACM SIGSAM Bulletin 36(1), 10-14 (2002)
3. Borwein, J.M., Corless, R.M.: Gamma and factorial in the Monthly. The American Mathematical Monthly 125(5), 400-424 (2018)
4. Corless, R.M., Fillion, N.: A Graduate Introduction to Numerical Methods, vol. 10. Springer (2013)
5. Hummel, P., Seebeck Jr., C.: A generalization of Taylor's expansion. The American Mathematical Monthly 56(4), 243-247 (1949)
6. Kansy, K.: Elementare Fehlerdarstellung für Ableitungen bei der Hermite-Interpolation. Numerische Mathematik 21(4), 350-354 (1973)
7. Lawrence, P.W., Corless, R.M.: Numerical stability of barycentric Hermite root-finding. In: Proceedings of the 2011 International Workshop on Symbolic-Numeric Computation, pp. 147-148. SNC '11, ACM, New York, NY, USA (2011). https://doi.org/10.1145/2331684.2331706
8. Nakatsukasa, Y., Sète, O., Trefethen, L.N.: The AAA algorithm for rational approximation. SIAM Journal on Scientific Computing 40(3), A1494-A1522 (2018)