IMA Journal of Numerical Analysis (2019) Page 1 of 29
doi:10.1093/imanum/drnxxx
Compact Finite Differences and Cubic Splines
Robert M. Corless†
The Ontario Research Center for Computer Algebra and The School of Mathematical and Statistical Sciences, The University of Western Ontario, London, Ontario, Canada
[Received on 15 October 2019; revised on ]
In this paper I uncover and explain—using contour integrals and residues—a connection between cubic splines and a popular compact finite difference formula. The connection is that on a uniform mesh the simplest Padé scheme for generating fourth-order accurate compact finite differences gives exactly the derivatives at the interior nodes needed to guarantee twice-continuous differentiability for cubic splines. I also introduce an apparently new spline-like interpolant that I call a compact cubic interpolant; this is similar to one introduced in 1972 by Swartz and Varga, but has higher order accuracy at the edges. I argue that for mildly nonuniform meshes the compact cubic approach offers some potential advantages, and even for uniform meshes offers a simple way to treat the edge conditions, relieving the user of the burden of deciding to use one of the three standard options: free (natural), complete (clamped), or "not-a-knot" conditions. Finally, I establish that the matrices defining the compact cubic splines (equivalently, the fourth-order compact finite difference formulæ) are positive definite, and in fact totally nonnegative, if all mesh widths are the same sign, for instance if the mesh is real and the nodes are increasing.
Keywords: compact finite differences; cubic splines; barycentric form; compact cubic splines; contour integral methods; totally nonnegative matrices
1. Introduction

"The most popular choice continues to be a piecewise cubic approximating function."
—Carl de Boor (de Boor, 1978, p. 49).

Cubic splines and the similar piecewise interpolant known as pchip are both widely used in piecewise polynomial interpolation. Cubic splines give a twice continuously differentiable interpolant through given data, while pchip tries instead to preserve monotonicity and convexity. Both are useful not just because they fit the data, but also because their derivatives are generally also good approximations to the derivative of the underlying function that one wants to approximate. Loosely speaking, on a uniform mesh of width h the fit of a cubic spline to a smooth function is accurate to O(h⁴) and the derivative is accurate to O(h³). For more careful error bounds see Swartz & Varga (1972) and de Boor (1978).

In contrast to both cubic splines and pchip, a true cubic Hermite interpolant fits not only function values at the nodes but also true derivative values at the nodes: both cubic splines and pchip use substitutes computed from the function values. The similarity of names (pchip stands for Piecewise Cubic Hermite Interpolant) does cause confusion. The relative rarity of the case when true derivatives ρ_{k,1} = f'(τ_k) are specified makes this confusion bearable. This paper will concentrate on cubic splines, and not on pchip or on true cubic Hermite interpolants. The classical reference for splines is de Boor (1978), but see also chapter 3 of Moler (2004).

†Email: [email protected]
© The author 2019. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

Compact finite differences are an efficient and accurate way of approximating the derivatives of known data. There are several formulæ in use.
See for instance Lele (1992) or Collatz (1966).

In this paper I detail a connection between cubic splines on a uniform mesh and a popular compact finite difference formula, a connection that the traditional (monomial basis) method of computing the cubic splines had obscured. Namely, the simplest Padé scheme for generating fourth-order compact finite differences gives exactly the derivatives at the interior nodes needed to guarantee twice-continuous differentiability; that is, a spline. The literature both of splines and of finite differences is vast. Nonetheless I believe that this connection is new. In this paper I will give the two derivations, and show that, in the case of an equally-spaced mesh, equation (4.2) from the compact scheme and equation (3.9) from the cubic spline are identical after normalization (each could be multiplied by an arbitrary constant without altering the solution). I then offer an explanation in section 5 of why this is so (or why it might have been expected to be so).

I also comment on the various choices for treating the degrees of freedom at the endpoints, and introduce a new choice, which I call a "compact cubic spline", namely the use of a modified compact finite difference formula to give the derivatives at the endpoints. I prove that the matrices involved are positive definite (indeed totally nonnegative) if the mesh widths all have the same sign. This is similar in spirit to the approach of Swartz & Varga (1972), who use a four-node explicit finite difference at each end to provide O(h³) accurate approximate derivatives at the edges of a uniform grid. Here, we use a compact four-node formula that gives O(h⁴) accurate approximate derivatives at the edges, uniform grid or not, if the function being interpolated has at least five continuous derivatives.
2. Notation
We study two basic problems and their connection in this paper. The first basic problem under consideration is construction of a piecewise polynomial interpolant p(t) from function value data ρ_{i,0} on a possibly nonuniform mesh made up of distinct nodes τ_i, numbered 0 ≤ i ≤ n. This will lead to a piecewise polynomial with n cubic pieces defined on the n+1 nodes. We suppose that the values of a function f : [a,b] → R have been sampled on some partition of a finite interval [a,b], with a = τ_0 < τ_1 < τ_2 < ··· < τ_{n−1} < τ_n = b. We write ρ_{i,0} = f(τ_i) for the known function values, for 0 ≤ i ≤ n. On each subinterval [τ_k, τ_{k+1}] the piecewise interpolant will be given by p(t) = p_k(t), a polynomial of degree at most 3 with p_k(τ_k) = ρ_{k,0} and p_k(τ_{k+1}) = ρ_{k+1,0}, and satisfying some other conditions that we will specify later. We will refer to this as a piecewise cubic polynomial even if some or all of its degrees are less than 3.

As is very well known, both cubic splines and the shape-preserving interpolant known as pchip construct values ρ_{i,1} at the interior nodes τ_i for 1 ≤ i ≤ n−1 to use as the derivative of each p_k(t) in order to ensure continuous differentiability there: we wish p'_{i−1}(τ_i) = p'_i(τ_i) for all interior nodes, 1 ≤ i ≤ n−1. A spline goes further than pchip and chooses these values in order to ensure twice continuous differentiability at the interior nodes: p''_{i−1}(τ_i) = p''_i(τ_i). By not imposing this last condition, a pchip can use some degrees of freedom to preserve monotonicity instead, usually by limiting the derivative values ρ_{i,1}.

¹Normally, the mesh will be real with the nodes increasing, so the mesh widths will all be positive. Sometimes it is convenient to number real nodes from the rightmost endpoint, so the mesh widths will all be negative. If the nodes are complex and not in a straight line, then indeed the mesh widths will not have the same (complex) sign.
3. Cubic Splines
What follows in this section is a new derivation, or at least a derivation likely new to the reader, of the piecewise cubic Hermite spline. This derivation was apparently first published in Chapter 8 of Corless & Fillion (2013). The novelty is that it uses the barycentric forms for a cubic Hermite interpolant on each piece: that is, instead of trying to fit unknown cubic polynomials in a local monomial basis p_k(t) = a_k + b_k(t−τ_k) + c_k(t−τ_k)² + d_k(t−τ_k)³ to the data and trying to find reasonable ways to determine the 4n unknowns (there are n subintervals), one instead works directly with

p_k(t) = [ Σ_{i=k}^{k+1} Σ_{j=0}^{1} Σ_{ℓ=0}^{1−j} β_{i,j} ρ_{i,ℓ} (t−τ_i)^{ℓ−j−1} ] / [ Σ_{i=k}^{k+1} Σ_{j=0}^{1} β_{i,j} (t−τ_i)^{−j−1} ],   (3.1)

which is the second barycentric form of the cubic Hermite interpolant (note that only two nodes are used in this form and thus this determines only one piece of the interpolant). There are four unknowns in this formula: ρ_{k,0}, ρ_{k,1}, ρ_{k+1,0}, and ρ_{k+1,1}. The β_{i,j} can be found from the partial fraction decomposition of the reciprocal of the node polynomial, and thus can be regarded as known once the nodes τ_k are specified:

1/((t−τ_{k+1})²(t−τ_k)²) = Σ_{i=k}^{k+1} Σ_{j=0}^{1} β_{i,j} (t−τ_i)^{−j−1}
  = 2/((τ_{k+1}−τ_k)³(t−τ_k)) + 1/((τ_{k+1}−τ_k)²(t−τ_k)²) − 2/((τ_{k+1}−τ_k)³(t−τ_{k+1})) + 1/((τ_{k+1}−τ_k)²(t−τ_{k+1})²).

There are only four β_{i,j} for each interval, and we see them written above explicitly in terms of the given nodes τ_k. For convenience, one can simplify the second barycentric form to the usual cubic Hermite polynomial basis (here h_{k+1} = τ_{k+1} − τ_k is the width of the interval):

p(t) = (t−τ_{k+1})²(2(t−τ_k)+h_{k+1}) ρ_{k,0}/h_{k+1}³ + (t−τ_{k+1})²(t−τ_k) ρ_{k,1}/h_{k+1}²
     + (t−τ_k)²(2(τ_{k+1}−t)+h_{k+1}) ρ_{k+1,0}/h_{k+1}³ + (t−τ_k)²(t−τ_{k+1}) ρ_{k+1,1}/h_{k+1}².   (3.2)

REMARK The floating-point evaluation of p(x) is the exact value of a polynomial going through the data [ρ_{k,0}(1+θ₁), ρ_{k,1}(1+θ₂), ρ_{k+1,0}(1+θ₃), ρ_{k+1,1}(1+θ₄)]; that is, the floating-point evaluation of p(x) is the exact value of a polynomial going through data that is at most six rounding errors different to the original data.

Notice that the ρ_{i,0} (not the ρ_{i,1}, which represent derivative values) are the known data values. We want to choose the n+1 values ρ_{i,1} to make the resulting interpolant as smooth as possible. We will see that we can make it C²[τ_0, τ_n]; that is, the second derivative will be continuous at each interior node. This makes the piecewise interpolant a cubic spline, albeit represented in a different basis. Notice also that we may choose the ρ_{i,1} in such a way that we automatically have p(t) ∈ C¹[τ_0, τ_n]: just take the slope at the right end of one interval to be the same slope at the left end of the next. This is very natural because ρ_{i,1} is then interpreted as 'the' slope at the node τ_i (indeed it would be somewhat unnatural to have different slopes on the left and right, though we could do that if we wanted). Having made our interpolant continuously differentiable by this device, then p'_{k−1}(τ_k⁻) = ρ_{k,1} = p'_k(τ_k⁺). To further ensure p(t) ∈ C²[τ_0, τ_n], we want to make the second derivatives equal, i.e.,

p''_{k−1}(τ_k⁻) = p''_k(τ_k⁺),   k = 1, 2, ..., n−1.   (3.3)

I personally found this easier than the necessary algebra in the local monomial case. We would have

p_k(t) = ρ_{k,0} + ρ_{k,1}(t−τ_k) + c_k(t−τ_k)² + d_k(t−τ_k)³   (3.4)

and even to make the function just C¹ we would have to impose the condition p'_k(τ_{k+1}⁻) = p'_{k+1}(τ_{k+1}⁺), which isn't automatic; we would have to enforce

ρ_{k,1} + 2c_k(τ_{k+1}−τ_k) + 3d_k(τ_{k+1}−τ_k)² = ρ_{k+1,1}.

For C², we would have to enforce yet another condition, namely

2c_k + 6d_k(τ_{k+1}−τ_k) = 2c_{k+1}.
Of course, this can be done, and the solution is even elegant. These equations can be reduced to a tridiagonal system of equations for the slopes ρ_{k,1}, and explicit formulæ for the c_k and d_k are known once the slopes are known. The solution is shown in splinetx.m in Moler (2004).

But here, because we start with the Hermite interpolational basis, we have a simpler (but equivalent) task: just enforce the second derivative conditions. To do this, we need a formula for the second derivative of the cubic Hermite interpolant at the nodes. A short computation by hand or with a computer algebra language such as MAPLE shows that

p''_{k−1}(τ_k⁻) = (2/(τ_k−τ_{k−1}))(2ρ_{k,1} + ρ_{k−1,1}) − (6/(τ_k−τ_{k−1})²)(ρ_{k,0} − ρ_{k−1,0})   (3.5)

and

p''_k(τ_k⁺) = −(2/(τ_{k+1}−τ_k))(2ρ_{k,1} + ρ_{k+1,1}) + (6/(τ_{k+1}−τ_k)²)(ρ_{k+1,0} − ρ_{k,0}).   (3.6)

Equating these at the interior nodes 1 ≤ k ≤ n−1 gives the n−1 equations

(1/(τ_k−τ_{k−1})) ρ_{k−1,1} + 2(1/(τ_k−τ_{k−1}) + 1/(τ_{k+1}−τ_k)) ρ_{k,1} + (1/(τ_{k+1}−τ_k)) ρ_{k+1,1}
  = 3(ρ_{k+1,0} − ρ_{k,0})/(τ_{k+1}−τ_k)² + 3(ρ_{k,0} − ρ_{k−1,0})/(τ_k−τ_{k−1})²   (3.7)

for k = 1, 2, 3, ..., n−1. There are n−1 of these equations. For brevity, write h_k = τ_k − τ_{k−1} and h_{k+1} = τ_{k+1} − τ_k and multiply equation (3.7) by 2h_k h_{k+1}/(h_k + h_{k+1}) to get

(2h_{k+1}/(h_k+h_{k+1})) ρ_{k−1,1} + 4ρ_{k,1} + (2h_k/(h_k+h_{k+1})) ρ_{k+1,1}
  = (6h_k/(h_{k+1}(h_k+h_{k+1})))(ρ_{k+1,0} − ρ_{k,0}) + (6h_{k+1}/(h_k(h_k+h_{k+1})))(ρ_{k,0} − ρ_{k−1,0}).   (3.8)

If the mesh is equally-spaced, that is τ_k = a + (b−a)k/n = a + kh for 0 ≤ k ≤ n, with h = (b−a)/n, then these equations become, for 1 ≤ i ≤ n−1,

ρ_{i−1,1} + 4ρ_{i,1} + ρ_{i+1,1} = (3/h)(ρ_{i+1,0} − ρ_{i−1,0}).   (3.9)

If we specify two edge conditions and then solve this tridiagonal system of n−1 equations in the n+1 unknown slopes, the result is a cubic spline: each piece p_k(t) is given by equation (3.2).

This leaves two degrees of freedom, which (counterintuitively) may be unwanted: the user will have to choose what to do with these two degrees of freedom. Some common choices are, first, to ask for three times continuous differentiability at τ_1 and at τ_{n−1}: this is called the "not-a-knot" condition. Second, one can specify arbitrary values of the derivatives at the ends, which is sometimes called a "clamped" spline; if the derivative values are correct for the function being interpolated, it is called a complete spline in de Boor (1978). Finally, one can ask for a second derivative of zero at the ends, called the "natural" spline.

We will see a new choice, actually a new set of choices, similar to a fifth choice due to Swartz & Varga (1972) that is mentioned in de Boor (1978), after the next section.
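On a uniform mesh, the construction above takes only a few lines of code. The Python sketch below (my own illustration, not one of the programs mentioned later in this paper) assembles the tridiagonal system (3.9) with clamped (complete) end conditions, solves for the slopes ρ_{i,1}, and checks, using (3.5) and (3.6), that the resulting second derivative is continuous at an interior node:

```python
import numpy as np

def spline_slopes(tau, rho, d0, dn):
    """Solve equation (3.9) on a uniform mesh for the slopes rho_{i,1},
    with clamped (complete) end conditions d0 = f'(tau_0), dn = f'(tau_n)."""
    n = len(tau) - 1
    h = tau[1] - tau[0]
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0], b[0] = 1.0, d0          # clamped left end
    A[n, n], b[n] = 1.0, dn          # clamped right end
    for i in range(1, n):
        A[i, i - 1 : i + 2] = (1.0, 4.0, 1.0)
        b[i] = 3.0 * (rho[i + 1] - rho[i - 1]) / h
    return np.linalg.solve(A, b)

def hermite_piece(t, tk, tk1, r0, r1, s0, s1):
    """Evaluate the cubic Hermite piece, equation (3.2), on [tk, tk1]."""
    h = tk1 - tk
    return ((t - tk1)**2 * (2*(t - tk) + h) * r0 / h**3
            + (t - tk1)**2 * (t - tk) * r1 / h**2
            + (t - tk)**2 * (2*(tk1 - t) + h) * s0 / h**3
            + (t - tk)**2 * (t - tk1) * s1 / h**2)

n = 16
tau = np.linspace(0.0, 1.0, n + 1)
rho = np.sin(tau)
d = spline_slopes(tau, rho, np.cos(0.0), np.cos(1.0))

# Second-derivative continuity at an interior node, via (3.5) and (3.6):
h, k = tau[1] - tau[0], n // 2
left  = (2/h)*(2*d[k] + d[k-1]) - (6/h**2)*(rho[k] - rho[k-1])
right = -(2/h)*(2*d[k] + d[k+1]) + (6/h**2)*(rho[k+1] - rho[k])
assert abs(left - right) < 1e-8                    # C2, by construction
assert np.max(np.abs(d - np.cos(tau))) < 1e-5      # slopes approximate f'
```

Only the first and last rows change if one prefers the natural or not-a-knot conditions instead.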
4. Compact finite differences
This section is adapted from chapter 11 of Corless & Fillion (2013). The problem being addressed is the problem of computing derivatives of a function known only at discrete values; we are not necessarily interested in computing an interpolant. One could use, for instance, simple finite differences and approximate f'(τ_k) by (f(τ_{k+1}) − f(τ_k))/(τ_{k+1} − τ_k). The main idea of a compact finite difference is that instead of using a single explicit finite difference formula to evaluate a derivative at a point, we have a whole mesh of function values and we wish to compute the derivatives at all the nodes. This is quite like the case of a global interpolating polynomial where one constructs a differentiation matrix (see Weideman & Reddy (2000)), and indeed we will have the equivalent of a differentiation matrix here; but it will not be explicitly formed. Instead we will solve a banded linear system.

The purpose of this paper is to show that the canonical example of a compact finite difference happens to have a significant relationship with cubic splines. This fact does not seem to have been noticed before, although it is difficult to be sure, given the variation in nomenclature and the large number of works on piecewise polynomials of one kind or another. Specifically, the relationship is as follows.

Let ∆ be the difference operator ∆(f) = f(x+h) − f(x). Then the so-called operator approach to finite differences (which appears for instance in Boole (1960)) gives a relationship between the differentiation operator D and ∆, namely D = ln(1+∆)/h. Then the (2,2) Padé approximant for ln(1+∆), namely

D ≐ (1/h)(∆ + ∆²/2)/(1 + ∆ + ∆²/6),   (4.1)

gives rise to a compact finite difference scheme (also called a Padé scheme) that happens to be fourth-order accurate: apply the denominator 1 + ∆ + ∆²/6 to f' and the numerator (∆ + ∆²/2)/h to f and we find after shifting to center at t that, up to terms of O(h⁴),

(1/6) f'(t−h) + (2/3) f'(t) + (1/6) f'(t+h) = (1/(2h))(f(t+h) − f(t−h)).

We will discuss the operator approach further, briefly, in section 6.2. This formula gives us a tridiagonal (whence "compact") system of equations for the unknown derivatives, at all interior nodes; again something special has to be done at the edges. This tridiagonal system of equations turns out to be identical up to scaling—when the nodes are equally-spaced—to equations (3.7).

The idea of solving a system of equations to find our finite difference approximation to the derivatives may be unfamiliar. The simplest finite difference formulæ make straightforward linear combinations such as (f(t+h) − f(t−h))/(2h) to approximate f'(t). But compact finite difference formulæ are different. That is, instead of simply applying a formula to a vector of function values to get a vector of derivative values, we instead have to set up and solve a linear system of equations for the unknown derivatives. Having to solve equations is more complicated than just using a formula, but it has several advantages.

To better understand where the system of equations for this formula comes from, we give another derivation. Make the following simplifying assumptions, for the moment. Suppose f'(τ_0) and f'(τ_n) are known (just to make it simple) and that τ_{k+1} − τ_k = h is constant. Then fix attention on one particular node, say τ_k. The formula above becomes, when multiplied by 6 and putting t−h = τ_{k−1}, t = τ_k, and t+h = τ_{k+1},

f'(τ_{k−1}) + 4 f'(τ_k) + f'(τ_{k+1}) = (3/h)(f(τ_{k+1}) − f(τ_{k−1})).   (4.2)

This equation is exactly equation (3.9). Recall that ρ_{k,1} represents the derivative f'(τ_k) and that f(τ_k) = ρ_{k,0}. Now we let k vary over all the indices of the interior nodes, 1 ≤ k ≤ n−1. Each interior node gives us one equation. Each equation only contains at most three of the unknown derivatives (and the equation for k = 1 contains the known f'(τ_0), while the equation for k = n−1 contains the known f'(τ_n)). This gives us a tridiagonal linear system of equations to solve for the unknown derivatives f'(τ_k), 1 ≤ k ≤ n −
1. Call the tridiagonal matrix A. Notice also that the right-hand side of the system involves linear combinations of the values of f(τ_k) at different nodes—these are supposed to be known. Call the (also tridiagonal) matrix that forms that combination, B. Note that B has a zero diagonal. The system Av = Bρ needs to be solved computationally to get the vector v of desired derivatives.

In effect this computes the differentiation matrix D as A⁻¹B, but in practice one never explicitly computes A⁻¹ because it is a full matrix. Instead, of course, to compute Dy, one solves Av = By for v, which is formally A⁻¹By.

I now give a derivation of a fourth-order compact finite difference method for nonuniform meshes, using contour integrals, because this also works for the barycentric form. This provides a point of theoretical continuity with section 3, and will be useful for the explanation of the connection to cubic splines. Write the local mesh widths as h_{k−1} = rh and h_k = sh, and assume both r and s are different from zero (we could without loss of generality assume that one of r or s was 1, but this doesn't help much, and I find the symmetry in the formulæ below useful for understanding). The partial fraction decomposition of the reciprocal of the node polynomial is

1/((z+rh)² z² (z−sh)²) = 2(2r+s)/(r³h⁵(s+r)³) · 1/(z+rh) + 1/(r²h⁴(s+r)²) · 1/(z+rh)²
  + 2(r−s)/(r³s³h⁵) · 1/z + 1/(r²s²h⁴) · 1/z²
  − 2(2s+r)/(s³h⁵(s+r)³) · 1/(z−sh) + 1/(s²h⁴(s+r)²) · 1/(z−sh)².

By a standard argument in complex variables, the contour integral

(1/(2πi)) ∮_C f(z)/((z+rh)² z² (z−sh)²) dz   (4.3)

over a contour that encloses all the zeros of the denominator (the numerator is a polynomial and has no poles) is zero for polynomials f of degree 6 − 2 = 4 or less. If we use the Cauchy integral formula

f^(j)(p)/j! = (1/(2πi)) ∮_C f(z)/(z−p)^{j+1} dz   (4.4)

and multiply by (s+r)²h⁴ for convenience, this gives us the following formula, which is exact for polynomials of degree at most four:

(1/r²) f'(−rh) + ((s+r)²/(r²s²)) f'(0) + (1/s²) f'(sh)
  = −(2(2r+s)/(r³h(r+s))) f(−rh) − (2(r−s)(s+r)²/(r³s³h)) f(0) + (2(2s+r)/(s³h(r+s))) f(sh).   (4.5)

These equations, if r ≠ s, are not the same as equations (3.8), whatever constant we use to normalize them. Setting r = s = 1, however, gives equation (4.2). If we now have a mesh of distinct points τ_0 < τ_1 < ··· < τ_{n−1} < τ_n, we can lay this compact formula first on τ_0, τ_1, and τ_2, with rh = τ_1 − τ_0 and sh = τ_2 − τ_1 which, if we have a reference step width h, say h = (Σ(τ_{k+1} − τ_k))/n, gives us an equation relating the derivative values on these three mesh points to the function values on the mesh points. We then lay the formula over τ_1, τ_2, and τ_3, giving us another equation—one for each interior point, τ_1, τ_2, ..., τ_{n−1}. The linear system for the unknown derivative values is, like the spline equations, tridiagonal; but we only have n−1 equations for the n+1 unknowns f'(τ_0), f'(τ_1), ..., f'(τ_n). To make a square system, we need two more equations.
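Equation (4.5) can be checked directly in exact rational arithmetic. The sketch below (my own check, not code from the paper) verifies that the formula annihilates every monomial of degree at most four, for an arbitrary rational choice of r, s, and h, and that the degree-five residual agrees with the leading error term quoted in the next subsection:

```python
from fractions import Fraction as F

def residual(f, df, r, s, h):
    """L - R of equation (4.5) for a function f with derivative df."""
    L = df(-r*h)/r**2 + (s + r)**2/(r**2 * s**2) * df(0*h) + df(s*h)/s**2
    R = (-2*(2*r + s)/(r**3 * h * (r + s)) * f(-r*h)
         - 2*(r - s)*(s + r)**2/(r**3 * s**3 * h) * f(0*h)
         + 2*(2*s + r)/(h * (r + s) * s**3) * f(s*h))
    return L - R

r, s, h = F(3, 2), F(2, 7), F(1, 5)       # an arbitrary nonuniform configuration
for m in range(5):                        # exact for t^m, m = 0, ..., 4
    assert residual(lambda t, m=m: t**m,
                    lambda t, m=m: m*t**(m - 1) if m else 0*t, r, s, h) == 0
# degree five is *not* annihilated: with f^(5)(0) = 120 the residual is
# exactly (s+r)^2 h^4, matching the leading term of (4.6)
assert residual(lambda t: t**5, lambda t: 5*t**4, r, s, h) == (s + r)**2 * h**4
```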
4.1 Truncation error in the compact formula

Once the formula has been found, one can do a more standard Taylor series analysis on it, for instance in MAPLE. Taking the Taylor series in h of the difference L − R, where L is the left-hand side of equation (4.5) and R is the right-hand side, the error term turns out to be

L − R = ((s+r)²/120) h⁴ f^(5)(0) − ((r−s)(s+r)²/360) h⁵ f^(6)(0) + O(h⁶),   (4.6)

which is O(h⁴) regardless of the local mesh ratios r and s. If r = s = 1, this is h⁴/30 times the fifth derivative of f evaluated at zero.²

4.2 Truncation error in the spline formula, considered as a compact finite difference

We can do the same kind of error analysis as we have just done for the compact formula, but for equations (3.8). Put ρ_{k,1} = f'(τ_k) and ρ_{k,0} = f(τ_k) in equations (3.8). Take the left and right mesh widths to be rh and sh. For a sufficiently differentiable function f we may take the Taylor series in h of the difference L − R between the left- and right-hand sides. Then the result is

L − R = −(rs(r−s)/12) f^(4)(0) h³ + (rs(r²−rs+s²)/30) f^(5)(0) h⁴ + O(h⁵).   (4.7)

This is the truncation error that occurs if the spline continuity equations (3.7) are used as a compact finite difference formula. We see that if the mesh is not locally approximately uniform, that is r ≠ s + O(h), then this error is O(h³), not O(h⁴). We may compare this with equation (4.6); we see that if r = s they are identical, as we will see that they must be.

²The fact that this is valid for polynomials of degree four is an important point in understanding why the equations are the same as for cubic splines, as we will see in section 5.
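The O(h³) degradation is concrete enough to check exactly: for f(t) = t⁴ the residual of (3.8), centered at 0 with left width rh and right width sh, is exactly −2rs(r−s)h³ (the first term of (4.7) with f^(4)(0) = 24, all higher terms vanishing for a quartic). A small exact-arithmetic check, my own, using the normalization of (3.8) above:

```python
from fractions import Fraction as F

def spline_residual(f, df, r, s, h):
    """L - R of equation (3.8), centered at 0 with nodes -r*h, 0, s*h."""
    L = (2*s/(r + s)) * df(-r*h) + 4*df(0*h) + (2*r/(r + s)) * df(s*h)
    R = ((6*r/(s*(r + s)*h)) * (f(s*h) - f(0*h))
         + (6*s/(r*(r + s)*h)) * (f(0*h) - f(-r*h)))
    return L - R

r, s, h = F(2), F(1), F(1, 10)
res = spline_residual(lambda t: t**4, lambda t: 4*t**3, r, s, h)
assert res == -2*r*s*(r - s)*h**3     # O(h^3), nonzero because r != s
assert spline_residual(lambda t: t**3, lambda t: 3*t**2, r, s, h) == 0  # cubics exact
```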
4.3 What to do at the edges

We now return to computing fourth-order accurate compact finite differences. We can use the same contour integral method to look for fourth-order formulæ at either end, giving equations involving τ_0 and its nearest mesh neighbours, and τ_n and its nearest neighbours. We will want the formulæ likewise exact for polynomials of degree four or less. Since the matrix is so far tridiagonal, we try to keep it that way and we thus look for relations of the form

a₀ f'(τ_0) + b₀ f'(τ_1) = c₀ f(τ_0) + c₁ f(τ_1) + c₂ f(τ_2) + c₃ f(τ_3).   (4.8)

The reader may verify that fourth order accuracy is not possible in general without the function value at τ_3. This still qualifies as 'compact' though because we use only two extra mesh points at the left end, and similarly only two extra on the right, and these appear in the right-hand side and do not change the tridiagonality of the matrix. We will need n ≥ 4, giving five by five matrices at the smallest. This ansatz suggests looking at the partial fraction decomposition of

1/((z−τ_0)² (z−τ_1)² (z−τ_2)(z−τ_3)),   (4.9)

from which we straightforwardly find (of course by using a computer algebra system) that (with h₁ = τ_1−τ_0, h₂ = τ_2−τ_1, etc., and normalizing so that b₀ = 1)

a₀ = h₂(h₂+h₃)/((h₁+h₂)(h₁+h₂+h₃))

and the c_k's are

c₀ = −h₂(h₂+h₃)(4h₁² + 6h₁h₂ + 3h₁h₃ + 2h₂² + 2h₂h₃)/(h₁(h₁+h₂)²(h₁+h₂+h₃)²)
c₁ = (2h₂(h₂−h₁) + h₃(2h₂−h₁))/(h₁h₂(h₂+h₃))
c₂ = h₁²(h₂+h₃)/(h₂(h₁+h₂)²h₃)
c₃ = −h₁²h₂/(h₃(h₂+h₃)(h₁+h₂+h₃)²).

It turns out that the residual error in (4.8) is, as desired, O(h⁴). In detail, if τ_{k+1} = τ_k + r_{k+1}h for k = 0, 1, 2, so that for instance h₁ = τ_1−τ_0 = r_1 h, then the residual error is seen by a Taylor series computation to be

L − R = (r₁² r₂ (r₂+r₃)/120) h⁴ f^(5)(0) + O(h⁵).   (4.10)

In the uniform mesh case this reduces to h⁴/60 times the fifth derivative. A similar formula holds for the other end (indeed, simply reverse the labels, τ_k ↔ τ_{n−k}). This gives us closure in our search for a compact, variable mesh fourth-order finite difference formula. We are left with a tridiagonal matrix A with entries depending on the h_k = τ_k − τ_{k−1}. Apart from the first and last rows, this matrix is diagonally dominant: scaling each interior row to have diagonal 4, the off-diagonal entries are 4h_k²/(h_k+h_{k−1})² and 4h_{k−1}²/(h_k+h_{k−1})², and

4 − 4(h_k² + h_{k−1}²)/(h_k+h_{k−1})² = 8 h_k h_{k−1}/(h_k+h_{k−1})² > 0.

We will see that it is indeed positive definite if the mesh widths have the same sign. For meshes with widely varying widths, however, the matrix can be ill-conditioned. See figure 2.

By direct computation with Maple for small examples, we see that the determinant is a rational polynomial in all the h_k with positive coefficients in the numerator and a squared denominator, and indeed is positive definite. Monitoring the sign of the determinant of A as computed numerically by Matlab's det function, we find that for n ≥ 4 the determinant is positive.

THEOREM 4.1 If the h_k all have the same sign, then for n ≥ 4 the (n+1) × (n+1) matrix A of the compact finite difference formula with these fourth order formulæ at the edges is totally nonnegative: that is, the determinants of all minors are nonnegative. More, the determinants of all principal minors are strictly positive, so the matrix is positive definite.

Proof.
Without loss of generality we will assume all h_k > 0. By a theorem of Gantmacher and Krein (see (Hogben, 2006, Chapter 29)), an irreducible tridiagonal matrix is totally nonnegative if and only if its entries are nonnegative and its leading principal minors are nonnegative. Here all entries are nonnegative. To prove the theorem we must then verify only that all leading principal minors are positive. We do so by induction, for all but the last, and then handle that specially.

For simplicity, write A = Z T_{n+1,n+1} where

Z = diag(1, 4/(h₁+h₂)², 4/(h₂+h₃)², ..., 4/(h_{n−1}+h_n)², 1)

to remove the common denominator of each of the interior rows. Multiplying by positive diagonal matrices leaves the properties of total positivity (nonnegativity) undisturbed. This leaves each interior row as h_k², (h_k+h_{k−1})², h_{k−1}², for easier manipulation. Explicitly writing the six-by-six case,

T_{6,6} =
[ h₂(h₂+h₃)/((h₁+h₂)(h₁+h₂+h₃))   1          0          0          0          0
  h₂²        (h₁+h₂)²   h₁²        0          0          0
  0          h₃²        (h₂+h₃)²   h₂²        0          0
  0          0          h₄²        (h₃+h₄)²   h₃²        0
  0          0          0          h₅²        (h₄+h₅)²   h₄²
  0          0          0          0          1          h₄(h₃+h₄)/((h₄+h₅)(h₃+h₄+h₅)) ].

Let T_{k,n+1} = T[1..k, 1..k] denote the leading principal submatrices. Let D_{k,n+1} = det(T_{k,n+1}) for 1 ≤ k ≤ n+1. By direct computation,

D_{1,n+1} = h₂(h₂+h₃)/((h₁+h₂)(h₁+h₂+h₃))
D_{2,n+1} = h₁h₂h₃/(h₁+h₂+h₃)
D_{3,n+1} = h₁h₂²h₃(h₂+h₃)/(h₁+h₂).   (4.11)

REMARK If n = 3, then D_{4,4} = 0. This is excluded from the theorem; as discussed previously, in that case, just fit a polynomial to all the data.

Returning to the proof, we see that D_{1,n+1} > 0, D_{2,n+1} > 0, and D_{3,n+1} > 0, for any choice of positive h_k.

Laplace expansion about the last row of T_{k,n+1} for 3 ≤ k ≤ n gives a recursive formula for the determinant:

D_{k,n+1} = (h_{k−1}+h_k)² D_{k−1,n+1} − h_{k−2}² h_k² D_{k−2,n+1}.   (4.12)

For the last minor, namely the determinant of the full matrix, we have

D_{n+1,n+1} = (h_{n−1}(h_{n−1}+h_{n−2})/((h_{n−1}+h_n)(h_n+h_{n−1}+h_{n−2}))) D_{n,n+1} − h_{n−1}² D_{n−1,n+1}.   (4.13)

By direct computation,

D_{3,n+1} = h₂² D_{2,n+1} + q₂   (4.14)

where q₂ is the positive function

q₂ = h₁h₂²h₃²(h₁+2h₂+h₃)/((h₁+h₂)(h₁+h₂+h₃)).   (4.15)

Now assume inductively that D_{k−1,n+1} = h_{k−2}² D_{k−2,n+1} + q_{k−2} for a positive function q_{k−2}. We see above that this is true for k−1 = 3, or k = 4. We will simultaneously establish by induction that D_{k,n+1} > 0 and that D_{k,n+1} = h_{k−1}² D_{k−1,n+1} + q_{k−1} for another positive function q_{k−1}, for k ≤ n.
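The structure above is easy to probe numerically. The sketch below (my own check, using the matrix entries as reconstructed above) builds T for a randomly chosen positive mesh and verifies that every leading principal minor is positive, as the theorem asserts, and that the second minor agrees with the closed form in (4.11):

```python
import numpy as np

def build_T(h):
    """T_{n+1,n+1} for mesh widths h = [h1, ..., hn], n >= 4."""
    n = len(h)
    h = np.asarray(h, dtype=float)
    T = np.zeros((n + 1, n + 1))
    # left edge row, normalized so that b0 = 1:
    T[0, 0] = h[1]*(h[1] + h[2]) / ((h[0] + h[1])*(h[0] + h[1] + h[2]))
    T[0, 1] = 1.0
    # interior rows: node tau_k has left width h_k, right width h_{k+1}
    for k in range(1, n):
        T[k, k - 1] = h[k]**2
        T[k, k] = (h[k - 1] + h[k])**2
        T[k, k + 1] = h[k - 1]**2
    # right edge row (the left edge formula with labels reversed):
    T[n, n - 1] = 1.0
    T[n, n] = (h[n - 2]*(h[n - 2] + h[n - 3])
               / ((h[n - 1] + h[n - 2])*(h[n - 1] + h[n - 2] + h[n - 3])))
    return T

rng = np.random.default_rng(1)
h = rng.uniform(0.1, 10.0, size=8)            # a wildly nonuniform positive mesh
T = build_T(h)
minors = [np.linalg.det(T[:k, :k]) for k in range(1, len(T) + 1)]
assert all(m > 0 for m in minors)             # all leading principal minors positive
# the second minor has the closed form h1*h2*h3/(h1+h2+h3):
assert abs(minors[1] - h[0]*h[1]*h[2]/(h[0] + h[1] + h[2])) < 1e-9
```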
REMARK The induction runs only for k ≤ n: the last row of T_{n+1,n+1} is not given by the interior formula, but by the end formula.

Returning to the proof, by the interior recurrence relation above,

D_{k,n+1} = (h_k+h_{k−1})² D_{k−1,n+1} − h_k² h_{k−2}² D_{k−2,n+1} = h_{k−1}² D_{k−1,n+1} + q_{k−1}   (4.16)

where, by the inductive assumption and expanding (h_k+h_{k−1})²,

q_{k−1} = 2h_k h_{k−1} D_{k−1,n+1} + h_k² q_{k−2}   (4.17)

is clearly positive, being the sum of positive terms.

This establishes by induction that all leading minors D_{k,n+1} up to, but not including, the final determinant D_{n+1,n+1}, are positive. We verify that the final determinant is positive by a different method. If the (n+1, n+1) entry in T_{n+1,n+1} were α, then the determinant D_{n+1,n+1} would be positive if

α D_{n,n+1} − h_{n−1}² D_{n−1,n+1} > 0,   (4.18)

that is, if

α > h_{n−1}² D_{n−1,n+1}/D_{n,n+1} = h_{n−1}²/((h_{n−1}+h_n)² − h_n² h_{n−2}² D_{n−2,n+1}/D_{n−1,n+1})   (4.19)

by equation (4.12). Examining the ratio D_{n−1,n+1}/D_{n−2,n+1} will lead to the sufficient condition

α > h_{n−1}²/((h_{n−1}+h_n)² − h_n² h_{n−2}/(h_{n−2}+2h_{n−1}))   (4.20)

for D_{n+1,n+1} to be positive. In detail, because D_{n−1,n+1} and D_{n−2,n+1} are interior determinants, they satisfy

D_{n−1,n+1}/D_{n−2,n+1} = h_{n−2}² + 2h_{n−2}h_{n−1} + h_{n−1}² q_{n−3}/D_{n−2,n+1}   (4.21)

where that last term is positive. Dropping the last term and replacing = by >, then following the manipulations (inversion, negation, inversion) through the algebra of inequalities, gives the upper bound

h_{n−1}²/((h_n+h_{n−1})² − h_n² h_{n−2}² D_{n−2,n+1}/D_{n−1,n+1}) < h_{n−1}²/((h_n+h_{n−1})² − h_n² h_{n−2}/(h_{n−2}+2h_{n−1})).   (4.22)

Verifying that α = h_{n−1}(h_{n−1}+h_{n−2})/((h_{n−1}+h_n)(h_n+h_{n−1}+h_{n−2})) is larger than the right hand side, and thus larger than the left, is straightforward: after some algebra we have

α − rhs = h_{n−2}h_{n−1}h_n(h_{n−2}+2h_{n−1}+h_n)/((h_{n−1}+h_n)(h_n+h_{n−1}+h_{n−2}) A)   (4.23)

where

A = 2h_n² + 4h_n h_{n−1} + 2h_n h_{n−2} + h_{n−1}h_{n−2} + 2h_{n−1}²   (4.24)

which, being positive, completes the theorem.
□

Some details of the equally-spaced case. In the equally-spaced case the (1,1) and (n+1,n+1) entries each become 1/3, the diagonal entries are 4 otherwise, and all sub- and superdiagonal entries are 1. LU factoring the matrix gives bidiagonal L and U. The diagonal entries of U are 1/3, 1, 3, 11/3, 41/11, and so on: up until the penultimate entry, for 2 ≤ k ≤ n they are the ratios a_{k−1}/a_{k−2}, where a_{k+1} = 4a_k − a_{k−1} with initial conditions a₀ = a₁ = 1. The final entry of U, call it u_{n+1,n+1}, must satisfy a_{n−2}/a_{n−1} + u_{n+1,n+1} = 1/3 (the subdiagonal entries of L are the reciprocals of the diagonal entries of U) and so u_{n+1,n+1} = 1/3 − a_{n−2}/a_{n−1}. Solving the recurrence relation for the a_k we have

a_k = ((3−√3)/6)(2+√3)^k + ((3+√3)/6)(2−√3)^k   (4.25)

so 1/3 − u_{n+1,n+1} very rapidly approaches 2 − √3 = 1/(2+√3) = 0.2679...; indeed a_k/a_{k+1} = 1/(2+√3) + O(c^{−k}) where c ≈ 13.9. This is important because the final entry u_{n+1,n+1} is then very close to 1/3 − 1/(2+√3) ≈ 0.0654.

THEOREM 4.2 The 1-norm condition number of the (n+1) × (n+1) matrix A for a uniform mesh is asymptotically 63 + 36√3 ≈ 125.35, approached exponentially quickly, independent of n.

Proof. Omitted for brevity. □
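Both the pivot behaviour and the condition number are easy to confirm numerically. The sketch below (my own check; A is the uniform-mesh matrix with corner entries 1/3 described above) computes the LU pivots and the 1-norm condition number:

```python
import numpy as np

def uniform_A(n):
    """(n+1) x (n+1) uniform-mesh matrix: interior rows [1, 4, 1],
    corner entries 1/3 from the four-node edge formula."""
    A = (np.diag(np.full(n + 1, 4.0))
         + np.diag(np.ones(n), 1) + np.diag(np.ones(n), -1))
    A[0, 0] = A[n, n] = 1.0 / 3.0
    return A

n = 200
A = uniform_A(n)

# pivots of the LU factorization (no pivoting needed):
piv = [A[0, 0]]
for k in range(1, n + 1):
    piv.append(A[k, k] - A[k, k - 1] * A[k - 1, k] / piv[-1])
assert abs(piv[n // 2] - (2 + np.sqrt(3))) < 1e-12      # interior pivots -> 2 + sqrt(3)
assert abs(piv[-1] - (1/3 - (2 - np.sqrt(3)))) < 1e-12  # final pivot ~ 0.0654

kappa = np.linalg.cond(A, 1)
assert abs(kappa - (63 + 36 * np.sqrt(3))) < 1e-6       # ~ 125.3538, independent of n
```

Running the same check with n = 50 or n = 1000 gives the same condition number to many digits, illustrating the exponentially fast approach to the limit.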
Yet another alternative treatment of the endpoints.
In Zhao et al. (2007) we didn't use equation (4.10). Instead we added an extra node, which requires n > 3, in order to specify the ratio a/b = c := 2 + √3. This allows analytical factoring of the matrix, speeding the process up slightly (it lowers the constant in the O(n) cost). In detail, we found the edge formula

  c f′(0) + f′(h) = (1/h) [ (−25c/12 − 1/4) f(0) + (4c − 5/6) f(h) + (−3c + 3/2) f(2h) + (4c/3 − 1/2) f(3h) + (−c/4 + 1/12) f(4h) ]   (4.26)

(and a similar formula at the right edge, with the sign of h reversed), which has truncation error

  ((4c − 1)/20) h⁴ f⁽ᵛ⁾(0) + ((40c − 11)/120) h⁵ f⁽ᵛⁱ⁾(0) + ⋯ .   (4.27)

This gives compact finite differences, again fourth order accurate. Note that the truncation error above is larger than the truncation error for the four-node formula; this seems to be compensated for by the fact that the condition number is smaller, by a similar amount.
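Because the coefficients in (4.26) were reconstructed here (by imposing exactness for polynomials of degree at most four, with c = 2 + √3, rather than copied verbatim from Zhao et al. (2007)), a numerical spot-check is prudent. This Python sketch verifies exactness on a random quartic and the leading truncation coefficient on t⁵:

```python
import numpy as np

c = 2 + np.sqrt(3)
# Coefficients of the left-edge formula, as reconstructed above
# (an assumption, rederived from the exactness conditions):
a = [-25*c/12 - 1/4, 4*c - 5/6, -3*c + 3/2, 4*c/3 - 1/2, -c/4 + 1/12]

h = 0.37
rng = np.random.default_rng(1)
p = np.polynomial.Polynomial(rng.standard_normal(5))   # random quartic
lhs = c * p.deriv()(0.0) + p.deriv()(h)
rhs = sum(aj * p(j * h) for j, aj in enumerate(a)) / h
print(abs(lhs - rhs))        # rounding error only: exact for degree <= 4

# On t**5 the error should be exactly (24c - 6) h**4,
# i.e. ((4c - 1)/20) h**4 f^(v)(0) with f^(v)(0) = 120.
q = np.polynomial.Polynomial([0, 0, 0, 0, 0, 1])
err = (c * q.deriv()(0.0) + q.deriv()(h)
       - sum(aj * q(j * h) for j, aj in enumerate(a)) / h)
print(err / h**4)            # -> 24c - 6 ~ 83.57
```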
A similar formula is available for a nonuniform mesh. However, it does not have the same speed advantage that it does for a uniform mesh, because the matrix cannot (so far as I know) be analytically factored. The proof of Theorem 4.1 is easily modified to show that the resulting matrix (for nonuniform meshes) is also totally nonnegative for positive mesh widths; in fact, this follows from a general fact about totally nonnegative matrices, namely that the (1,1) and (n+1,n+1) entries can always be increased without altering the total nonnegativity (see (Hogben, 2006, Chapter 29) for a statement and reference).

As mentioned, the advantage of this formula for a uniform mesh is that the tridiagonal matrix A can be analytically factored into L and U, where, independent of dimension, the subdiagonal elements of L are all 1/c = 1/(2 + √3) = 2 − √3, and the diagonal elements of U are all c. This saves the O(n) cost of factoring the matrix, both in terms of operation count and in terms of storage. The relatively trivial disadvantage of this formula is that it needs one more function value (f(4h)) on the right-hand side. This makes no practical difference because in the only case where this matters, n =
3, one can instead use a single cubic. For the uniform mesh formula, the 1-norm condition number is again asymptotically constant, but now less than 3 irrespective of the dimension; however, the truncation error at the edges is larger, and thus this advantage does not seem to matter.

4.4 Nonuniform mesh compact derivatives/splines in practice
So, how well do these methods work in practice? Actually, pretty well. As expected, if the mesh ratios are not "too large," i.e., adjacent subintervals are not too different in width (so that the r_k factors in the edge formulæ and the r and s factors in the interior formula are not too large), then the formula is very similar to the cubic spline and its derivative, for smooth functions. As the mesh ratios depart more seriously from uniformity, the compact method begins to be more accurate than the cubic spline.

Note, however, that the formulæ as initially written in terms of the τ_k are quite likely to produce rounding errors, especially in the formulæ at the edges, and should be rewritten using h_{i+1} = τ_{i+1} − τ_i and factored, as done in the six-by-six example above, wherever possible. When this is done the influence of rounding errors, while still felt, is significantly reduced. These formulæ have been implemented in MATLAB in a program called vcompact4, available on the code repository for Corless & Fillion (2013). For this paper, I have written a program called compactcubic, which I discuss later.

REMARK Differentiation is infinitely ill-conditioned (a fact which is well known, but rarely discussed as such: see the exposition in chapter 11 of Corless & Fillion (2013) for one instance).
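As a minimal concrete illustration of the interior Padé scheme in use, the sketch below (Python rather than the MATLAB of vcompact4; for simplicity the ends are closed with known true derivatives, a clamped closure, instead of this paper's compact end formulæ) solves the tridiagonal system f′_{k−1} + 4f′_k + f′_{k+1} = 3(f_{k+1} − f_{k−1})/h and exhibits the fourth-order convergence:

```python
import numpy as np

def compact_derivs(f, a, b, n, dfa, dfb):
    # Interior Padé scheme: f'_{k-1} + 4 f'_k + f'_{k+1} = 3(f_{k+1}-f_{k-1})/h,
    # closed with the known end derivatives dfa, dfb (clamped closure).
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    fx = f(x)
    A = (np.diag(np.full(n - 1, 4.0))
         + np.diag(np.ones(n - 2), 1) + np.diag(np.ones(n - 2), -1))
    rhs = 3.0 * (fx[2:] - fx[:-2]) / h
    rhs[0] -= dfa          # move the known end derivatives to the right side
    rhs[-1] -= dfb
    d = np.empty(n + 1)
    d[0], d[-1] = dfa, dfb
    d[1:-1] = np.linalg.solve(A, rhs)
    return x, d

errs = []
for n in (16, 32, 64):
    x, d = compact_derivs(np.sin, 0.0, 1.0, n, np.cos(0.0), np.cos(1.0))
    errs.append(np.max(np.abs(d - np.cos(x))))
orders = [np.log2(errs[i] / errs[i + 1]) for i in range(2)]
print(errs, orders)   # observed orders close to 4
```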
5. Explanation
Why are the equations for a cubic spline, which guarantee twice-continuous differentiability for the piecewise cubic interpolation of the data, the same equations as those for (equally-spaced) fourth-order compact finite differences? The problems being solved by the two algorithms are different!

Saying it again: in the cubic spline case, we are constructing separate cubic polynomials on each subinterval τ_k ≤ t ≤ τ_{k+1}. To guarantee twice continuous differentiability, we specify that the interior derivatives ρ_i, for 1 ≤ i ≤ n−1, satisfy the matching conditions already described. The key observation is that the partial fraction decomposition of

  1 / ( (z − τ_{k−1})² (z − τ_k)³ (z − τ_{k+1})² )   (5.1)

has zero coefficient of 1/(z − τ_k)² if (and only if) τ_{k+1} − τ_k = τ_k − τ_{k−1}. Explicitly, here, if τ_{k+1} − τ_k = τ_k − τ_{k−1} = h (as we will see, without loss of generality we may take h = 1) we have the following

THEOREM Let t and h be fixed complex numbers, h ≠ 0, and p(z) be a polynomial in the variable z. If the polynomial p(z) has degree at most 5, then the values of p(t − h), p′(t − h), p(t), p(t + h), and p′(t + h) determine p″(t) uniquely. Simultaneously, if p(z) has degree less than or equal to 4, those same values determine p′(t) uniquely.

Proof. For intelligibility we may without loss of generality replace z with θ, where z = t + θh, so −1 ≤ θ ≤ 1. Then we have the partial fraction decomposition

  1 / ( (θ+1)² θ³ (θ−1)² ) = −1/(θ+1) − (1/4)/(θ+1)² + 2/θ + 1/θ³ − 1/(θ−1) + (1/4)/(θ−1)²,   (5.2)

which (as noted in the preamble to the theorem) has no 1/θ² term. By a standard argument, the contour integral

  (1/(2πi)) ∮_C p(θ) / ( (θ+1)² θ³ (θ−1)² ) dθ = 0   (5.3)

provided the degree of p is at most 5 and the contour C contains −1, 0, and 1. Therefore the partial fraction decomposition and Cauchy's integral formula give

  0 = −p(−1) + 2p(0) − p(1) + (1/4)( p′(1) − p′(−1) ) + (1/2) p″(0).   (5.4)

This means that if we know the value of the function at three consecutive points (which for a spline would take two cubic polynomials to fit), with equal width intervals on either side of the midpoint 0, and we know the first derivatives at the two endpoints, then the value of the second derivative at the midpoint 0 is determined irrespective of the value of the first derivative p′(0). Since these same quantities determine p′(0) by equation (4.2) for polynomials of degree at most four, the proof is complete. □

(This is similar to the contour integral explanation of the well-known extra order of accuracy of Simpson's rule: we fit three points, but in the equally-spaced case we get an exact fit for a cubic, not just a quadratic.)

Undoing the nondimensionalization and translating this into a finite-difference formula on the three points τ_{k−1}, τ_k, τ_{k+1}, where h = τ_{k+1} − τ_k = τ_k − τ_{k−1}, we have

  f″(τ_k) = 2( f(τ_{k+1}) − 2f(τ_k) + f(τ_{k−1}) )/h² − ( f′(τ_{k+1}) − f′(τ_{k−1}) )/(2h) + f⁽⁶⁾(τ_k) h⁴/360 + O(h⁶).   (5.5)

Notice the familiar second difference formula and the centered difference formula (using f′) appearing. That this combination is fourth-order accurate was new to me but hardly seems surprising; doubtless the formula is in one book of finite differences or another, perhaps Milne-Thomson (1951), Collatz (1966),
Jordán (1965), or even Boole (1960). The error term in equation (5.5) was computed in Maple, and indicates that the formula is exact if the sixth derivative is identically zero, i.e. if f(z) is a polynomial of degree at most five. Since the value of the derivative at τ_k is not used, we can see that imposing a fourth-order accurate value of the derivative there must respect this constraint on the second derivative. An alternative view is that in the equally-spaced case we may force an approximate value of f′(τ_k) that is in O(h⁴) error without altering this finite-difference value of f″(τ_k).

That this symmetry forces the value of the second derivative to be a particular value for all fourth-order accurate derivatives still seems somewhat surprising. The key seems to be the extra accuracy, allowing the second derivative to be determined, that is permitted by the zero residue in the (locally) equally-spaced case.
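Both the residue identity (5.4) and the error coefficient in (5.5), as reconstructed above, can be checked in a few lines (Python, for illustration):

```python
import numpy as np

# (5.4): for any polynomial p of degree at most 5,
#   0 = -p(-1) + 2 p(0) - p(1) + (1/4)(p'(1) - p'(-1)) + (1/2) p''(0).
rng = np.random.default_rng(2)
p = np.polynomial.Polynomial(rng.standard_normal(6))   # random quintic
d1, d2 = p.deriv(), p.deriv(2)
resid = -p(-1) + 2*p(0) - p(1) + 0.25*(d1(1) - d1(-1)) + 0.5*d2(0)
print(abs(resid))     # zero up to rounding

# (5.5) on f = exp at tau_k = 0: the error behaves like h**4 f^(6)(0)/360.
f = np.exp
df = np.exp           # f' = f for the exponential
for h in (0.2, 0.1, 0.05):
    approx = 2*(f(h) - 2*f(0) + f(-h))/h**2 - (df(h) - df(-h))/(2*h)
    print(h, (1.0 - approx)/h**4)   # f''(0) = 1; ratio -> 1/360 ~ 0.00278
```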
6. Numerical tests
I wrote a small didactic (and therefore relatively inefficient) MATLAB program that I called compactcubic, based on the codes pchiptx and splinetx from Moler (2004), and which I will make available, either in the code repository for Corless & Fillion (2013) or some other convenient place.

Rather than leave the interpolant in the Hermite interpolational basis, which I prefer, for ease of comparison I computed the local monomial basis coefficients from the derivatives, much as is done in splinetx. Indeed the formulæ are very similar. I added the ability to output the first derivative. I also examine the second derivative, which turns out to be informative, though normally clearly departing from the second derivative of the underlying function. That this is not continuous for nonuniform meshes is demonstrated in figure 8.

I ran this on several functions, notably the Runge example f(x) = 1/(1 + 25x²) and the signum function (sign in MATLAB), using uniform meshes and nonuniform meshes of random widths. The nonuniform meshes of random widths were constructed by first generating the random widths with rand(1,n) and then constructing the nodes by use of cumsum.

Some of the results are given in figures 4 and 5. In these tests, the programs appear to have behaved satisfactorily. The condition numbers of the tridiagonal matrices depend on the node family used. Equally-spaced nodes generate a condition number that is essentially constant; random meshes generate condition numbers that depend on the mesh spacing, but not the dimension. The O(1/n⁴) behaviour of the errors on uniform grids and the O(1/n³) behaviour of the errors of the derivative are shown.

REMARK The MATLAB built-in spline function and piecewise polynomial evaluation routines do not have automatic access to evaluating the derivatives of the piecewise functions. Of course one could write one's own spline differentiator using unmkpp and mkpp (it only takes two lines of code), but bookkeeping other people's choices of piecewise polynomial representation is not always trivial, and the user might not notice the solution given in the help page for mkpp. I believe that the built-in ability to take derivatives is always helpful. (As a colleague points out, piecewise polynomial differentiation functionality is available in the Curve Fitting package, which can be downloaded and installed: at my institution, at no extra cost; on the Mathworks page it offers the possibility of a free trial.) But for this paper, to make comparisons, instead of using mkpp and unmkpp I modified the splinetx.m function to return derivatives.

As predicted, for equally-spaced nodes the derivatives were the same. For Chebyshev nodes the difference was small (because r − s is typically quite small); for Fibonacci nodes there was a difference, but not much.

6.1 Contour Integrals and Higher Derivatives
The contour integral technique used above is of course classical. I learned it from John C. Butcher, who used the following elegant variation in Butcher (1967) and later in Butcher et al. (2010) to compute higher derivatives. The idea is to multiply by another factor, which I call the Butcher factor and denote by B(z), that zeros out the unneeded and interfering residues. As an example, suppose that we wish to find a formula for the second derivative of a cubic spline. We then wish a rational function with denominator (z − τ_n)² (z − t)³ (z − τ_{n+1})², so that we will pick up a term of the form 1/(z − t)³ which will supply us with p″(t)/2! by the Cauchy Integral Formula. But this approach will also generate a term for p(t) and another for p′(t), all mixed in together; so we introduce the Butcher factor

  B(z) = b₀ + b₁ z + b₂ z²   (6.1)

and choose the coefficients b₀, b₁, and b₂ (not all zero) so that the residues of 1/(z − t) and 1/(z − t)² are zero. This gives us two linear equations in three variables to solve, which we can do straightforwardly. One reasonably presentable solution is to put t = τ_n + hφ, z = τ_n + hθ, and τ_{n+1} = τ_n + h for neatness, and then

  B(θ) = (3φ − 2)φ³ − 2φ²(4φ − 3)θ + (6φ² − 6φ + 1)θ²   (6.2)

which gives in turn

  B(θ) / ( θ² (θ − φ)³ (θ − 1)² ) = (1 − 3φ)/(θ − 1)² + (3 − 6φ)/θ + (6φ − 3)/(θ − 1) + (2 − 3φ)/θ² + 1/(θ − φ)³.   (6.3)

Notice that as intended the coefficients of 1/(z − t) and 1/(z − t)² (equivalently 1/(θ − φ) and 1/(θ − φ)²) are zero. This in turn gives (after the same contour argument as before) that

  ((4 − 6φ)/h) p′(τ_n) + ((2 − 6φ)/h) p′(τ_{n+1}) + p″(t) + ((12φ − 6)/h²) p(τ_{n+1}) + ((6 − 12φ)/h²) p(τ_n) = 0,   (6.4)

if the degree of p(z) is at most three. Rearranging, we get

  p″(τ_n + φh) = ((6φ − 4)/h) p′(τ_n) + ((6φ − 2)/h) p′(τ_{n+1}) + ((6 − 12φ)/h) ( p(τ_{n+1}) − p(τ_n) )/h.   (6.5)

To use the same idea to find a compact formula for the second derivative given only function values on equally-spaced nodes, the idea is to search for a Butcher factor B(z) that makes the coefficients of (θ + 1)⁻², θ⁻², and (θ − 1)⁻² zero in the partial fraction expansion of

  (b₀ + b₁θ + b₂θ²) / ( (θ + 1)³ θ³ (θ − 1)³ ).   (6.6)

Here we have taken advantage of the equal spacing of the mesh to put z = τ_k + θh again. A short computation shows that B(θ) = 9θ² − 5 works:

  (9θ² − 5) / ( (θ + 1)³ θ³ (θ − 1)³ ) = −3/(θ + 1) + 6/θ − 3/(θ − 1) + (1/2)/(θ + 1)³ + 5/θ³ + (1/2)/(θ − 1)³.   (6.7)
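Formula (6.5), with the coefficients as reconstructed above, is easily spot-checked on a random cubic (Python, for illustration):

```python
import numpy as np

# (6.5): for any cubic p and t = tau + phi*h,
#   p''(t) = (6 phi - 4)/h * p'(tau) + (6 phi - 2)/h * p'(tau + h)
#          + (6 - 12 phi) * (p(tau + h) - p(tau)) / h**2
rng = np.random.default_rng(3)
p = np.polynomial.Polynomial(rng.standard_normal(4))   # random cubic
d1, d2 = p.deriv(), p.deriv(2)
tau, h, phi = 0.3, 0.7, 0.25
rhs = ((6*phi - 4)/h * d1(tau) + (6*phi - 2)/h * d1(tau + h)
       + (6 - 12*phi) * (p(tau + h) - p(tau)) / h**2)
print(abs(d2(tau + phi*h) - rhs))   # zero up to rounding
```

Setting phi = 0 or phi = 1 recovers the familiar end-point second-derivative formulas for a cubic Hermite piece.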
After using the Cauchy Integral Formula again, we find the following fourth-order formula

  p″(τ_n − h) + 10 p″(τ_n) + p″(τ_n + h) = (12/h²) ( p(τ_n − h) − 2p(τ_n) + p(τ_n + h) ).   (6.8)

The Cauchy formula already tells us that this is exact for polynomials of degree at most 5. A separate Taylor series expansion (trivial in a computer algebra system) gives that the error term for functions that are not polynomials of degree at most five is asymptotically, as h → 0,

  (1/20) h⁴ p⁽⁶⁾(τ_n) + (11/5040) h⁶ p⁽⁸⁾(τ_n) + O(h⁸).   (6.9)

6.2 Notes and further reading
The books de Boor (1978) and Jordán (1965) are each rich sources of the history of their respective subjects. Much of this section comes from them. The word "spline" comes from the same root as the English word "splinter", meaning a thin fragment of wood; such wooden pieces gave rise to a flexible mechanical device that was used, apparently in shipbuilding, to draw smooth curves between fixed points. I. J. Schoenberg in Schoenberg (1946) chose the name "spline" for the piecewise cubic polynomials computed in this fashion because they minimize an approximation of the strain energy that the original flexible instrument minimized. The modern terminology, and in particular the difference between a cubic spline, a pchip, and a true cubic Hermite interpolant (where the derivative values f′(τ_k) are also known exactly and not chosen by the algorithm), is somewhat confusing, sometimes even to experts.

I was unable to trace the first use of the word "compact" for finite differences that are defined as a relation amongst function and derivative values on a small set of mesh points. Some authors use instead "optimal", conveying that the maximum accuracy possible is attained given the constraints on the number of mesh points. Collatz used the term "Mehrstellenverfahren" (roughly, a "several-point method"), which is still used.

This paper is concerned with the connection between these two concepts. To explain this connection I used the contour integral approach. As is very well known, this is not the only approach to finite differences, compact or otherwise.

One standard approach is simply to use brute Taylor expansion of an ansatz with undetermined coefficients, then set all the desired coefficients of powers of the step size to zero, and brutally solve for the unknowns. Doing this once and for all for a fixed (possibly symbolic) grid in a computer algebra system is perfectly straightforward and perfectly useful, and I have used this brutal method in a few papers such as Zhao et al. (2007), and before that in Corless & Rokicki (1996). As pointed out in Fornberg (1998), this approach cannot easily be used in a purely numerical environment because the linear systems are often quite ill-conditioned, and more so for larger formulæ; but in a computer algebra system the arithmetic can be done exactly (if rational), or at whatever precision desired, and the expense of doing so is amortized over the life of the use of the formulæ.

Also as pointed out in Fornberg (1998), and in Corless & Fillion (2013), the use of the formulæ (however obtained) is subject to numerical difficulty; this is because differentiation is infinitely ill-conditioned. But the brute-force approach of generating formulæ by computer algebra systems works, and gets formulæ as accurate as any method does. Well, maybe there is still room for blunders: the paper Keller & Pereyra (1978) used Macsyma to generate weights for finite difference formulæ, and Fornberg (1988) points out that the tables in Keller & Pereyra (1978) contained "both isolated and systematic errors". Certainly, while I have been as careful as I can for this paper, and I have taken out a lot of typos, I have probably introduced some in transferring formulæ from Maple to LaTeX. Brute force supplies no explanation, however. Here, of course, we are trying to understand the connection of cubic splines to compact finite differences, and the contour integral approach supplied a satisfactory explanation. Contour methods are often used in this way. According to Weideman (2005), contour integrals were used by the great number theorist K. Mahler to explore Padé approximations to the logarithms of algebraic numbers in Mahler (1953). The connection to Padé approximation (and indeed compact finite difference formulæ are often called "Padé methods") arises because of another, perhaps even more elegant, method for constructing finite difference formulæ: the operator method.
One version of the operator method, derived by applying difference operations to the exponential function, is used to great effect in Fornberg (1998) to compute weights in many finite-difference formulæ and to give general recurrence relations for doing so on arbitrary grids. In older works, such as Fox (1957), we see a more "formal" treatment of operator methods; here 'formal' means that rigour is not maintained throughout the computation (a curious usage of this word, that often causes puzzlement, meaning only that computations are carried out by 'form', without worrying until later whether they were correct). The operator method was recently blogged about in Scientific American by Lamb (2019). I gave a talk on this method at a CAIMS meeting in Winnipeg in 2005. It seems to be common in early works on finite differences, such as Milne-Thomson (1951), and is at least a hundred and sixty years old, being present in Boole (1960); in fact, according to Jordán (1965) the method is due to Lagrange himself.

The operator method rests on an analogy: the shift operator E_h(f)(t) = f(t + h) can be expressed in terms of the derivative operator hD by Taylor's theorem for analytic functions as

  E_h(f)(t) = f(t + h) = Σ_{k≥0} (h^k/k!) D^{(k)}(f)(t),   (6.10)

where the symbol D^{(k)} means repeated application of the differentiation operator. By analogy with multiplication, this is written as E_h = exp(hD), which, though likely familiar to the reader from various contexts, is actually a remarkable leap to a definition of an exponential of an operator. Formally solving for D gives D = ln(E_h)/h = ln(1 + ∆)/h, motivating Padé approximations to the logarithm near 1 (or the inverse sinh function near 0, because analogously exp(hD/2) − exp(−hD/2) = 2 sinh(hD/2) applied to f(t) gives f(t + h/2) − f(t − h/2), and so the inverse sinh function expresses D in terms of compositions of central differences).
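For the fourth-order interior scheme used throughout this paper, the relevant rational approximation of the symbol can be written down and checked directly (a sketch; the expansion x − x⁵/180 + O(x⁷) was derived by hand here, not copied from the paper): in operator terms the scheme f′_{k−1} + 4f′_k + f′_{k+1} = 3(f_{k+1} − f_{k−1})/h reads hD ≈ 3(E − E⁻¹)/(4 + E + E⁻¹) with E = exp(hD).

```python
import numpy as np

# Symbol of the compact scheme as a function of x = h*D:
#   3 (e^x - e^-x) / (4 + e^x + e^-x) = x - x**5/180 + O(x**7)
def pade_symbol(x):
    return 3*(np.exp(x) - np.exp(-x)) / (4 + np.exp(x) + np.exp(-x))

for x in (0.2, 0.1):
    print(x, (pade_symbol(x) - x) / x**5)   # -> -1/180 ~ -0.005556
```

The leading defect x⁵/180 is exactly the fourth-order truncation error of the interior compact first derivative.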
The resulting Padé approximations then give us compact finite difference formulæ.

The paper Weideman (2005) extends this idea to Hermite–Padé approximations, and solves several general problems associated with these. Further, that paper details an open problem about explicit descriptions of arbitrary-width compact formulæ for the second derivative; I do not know if the Butcher factor idea for contour integration can be used to attack that open problem, but John C. Butcher did use his approach to solve another general-degree interpolation problem in quadrature (we detailed his solution in Butcher et al. (2010)), so it might be possible. I note that one can establish a necessary condition for the unwanted residues to vanish, if the degree of B(z) is to be no more than n: at each node τ_k, 0 ≤ k ≤ n, the Butcher factor must have

  B′(τ_k) − ( Σ_{j=0, j≠k}^{n} 2/(τ_k − τ_j) ) B(τ_k) = 0,   (6.11)

which implies that B(z) can be identified in the Lagrange basis by its values on the nodes by finding
a vector in the nullspace of D − 2 diag(s₀, s₁, …, s_n), where s_k = Σ_{j=0, j≠k}^{n} 1/(τ_k − τ_j) and D is the differentiation matrix on the nodes (for differentiation matrices, see e.g. Amiraslani et al. (2018)). I leave this to future work.

6.3 Higher dimension
Compact finite differences are widely used in problems having more than one dimension. An example on a regular grid is Rokicki & Floryan (1995), and one on irregular grids is Rokicki & Floryan (1999). Automatic computation of such formulæ for scattered nodes in more than one dimension is considered in more detail in the University of Western Ontario Ph.D. thesis of Jichao Zhao (2006), in a subsequent paper Zhao & Corless (2006), and in a Maple Share Library package released in that same year updating the original 1994 package. Other researchers have also studied the problem, for example Wright & Fornberg (2006), who use radial basis functions and were likely the first to do this in full generality.
7. Concluding remarks, and a marriage of convenience
The usual methods of dealing with the two extra degrees of freedom of a cubic spline have always seemed somewhat unnatural to me. As has been demonstrated in many psychological studies, the extra choice seems unwelcome. What I recommend here is to use compact finite differences to compute the derivatives not only at the interior nodes, but also at the endpoints. This is very similar in spirit to the recommendation of Swartz & Varga (1972), except here we are using a compact finite difference formula to give fourth-order accuracy, where they recommended a third-order accurate direct formula. The method recommended here requires essentially no more effort, being simply a modification of the tridiagonal system needed to find the spline derivatives, and has higher order accuracy for the derivatives at the edges. This new compact method, with its positive definite (totally nonnegative) matrices, therefore seems as though it will be as good a method as any, and might be psychologically more satisfying in that the derivatives at the ends are determined by the data. Of course, they are also determined by the data, in a sense, if instead the "not-a-knot" condition is used, but that makes a qualitative difference with the two subintervals at each end. Whether it makes a practical difference is another matter.

I have not made extensive comparisons with MATLAB's built-in spline, for example (but the few experiments I have done show that the behaviour is at least similar). The fact that the not-a-knot condition generates an O(h³) error in the derivative at the end may or may not be significant. Nor have I made a choice between the c = 2 + √3 method and the method using the compact end formula, which also needs at least 5 nodes because its four-by-four matrix is singular; in the first case the condition number of the matrix is smaller, but the truncation error is larger (by a similar factor). My only experience with this in practice is with the uniform mesh case, where the efficiency of the analytic factoring is noticeably useful. More experimentation is necessary to learn if the variable mesh formula is truly useful.

The method has been coded in MATLAB as the program compactcubic.m and will be made available at the code repository for Corless & Fillion (2013), namely nfillion.com/coderepository. It seems that this approach gives a potentially reasonable alternative to splines, in that its derivative can be expected to be more accurate at the nodes for nonuniform meshes, and one does not have to make a choice about what to do at the endpoints. That first caveat, "at the nodes," is important. The error in piecewise cubic interpolation means that the derivatives between the nodes can only be O(h³) accurate. This weakens the case for the compact cubic interpolant proposed here, it's true; away from the nodes, one doesn't see more accurate derivatives.

This approach needs a name to distinguish it from "spline", "cubic Hermite", and "pchip" (itself a misleading name for the interpolant whose derivatives are chosen to preserve qualitative features, at the cost of even less accuracy in the derivative). I propose the name "compact cubic interpolant" for this construction.
It also needs an efficient and stable implementation; probably conversion to the local monomial basis is the most efficient, as is done here, but there are likely other tricks needed to ensure that the computation is as stable as possible.

I realize that I have not made a strong case for using this method; cubic splines are pretty good, after all, even if they occasionally wiggle too much (which is what pchip is for). But the exploration of this alternative has yielded some advantages for the compact cubic method: file-it-and-forget-it treatment of the edges, and in theory more accurate derivatives at the nodes. If one is using compact finite differences for some other purpose anyway (perhaps as part of solving a PDE by the method of lines) then the analysis in this paper may help you to choose to use the compact cubic interpolant.

I remark that totally positive (nonnegative) matrices occur in various interpolation problems; see Marco et al. (2017) for an example.

The other result of this paper, namely an explanation, using contour integrals and residues, of the apparent coincidence that cubic splines give the same derivatives that the fourth-order Padé compact formula does, is only one such explanation. This coincidence can also be explained by noting first (as can be found in Appendix A of Fornberg & Flyer (2015), for instance) that a cubic spline over x₀ − h, x₀, x₀ + h is the sum of a pure cubic and a multiple of |x − x₀|³. Then, since the compact formula is exact for pure cubics, one only needs to think about its behaviour on |x − x₀|³; but this is an even function, and a moment's reflection shows that (3.9) is exact for even functions, giving zero derivative at the centre; so since the equation is exact for each, it is exact for their sum, and hence for splines. Thus this coincidence might not be so surprising for one "skilled in the art".
This argument works only in the case of real nodes, of course, whereas the residue argument works over C, but the overwhelming majority of spline interpolation takes place over R, so that doesn't matter much. Still, I think that this coincidence has hardly been noticed in the literature; that it can be explained several ways is perhaps not a surprise. I do like the residue explanation in preference to the |x − x₀|³ explanation, because zero residues are connected to the extra degree of accuracy obtained in Simpson's Rule, and connected to the Butcher method of finding higher derivative formulæ and solving the Birkhoff interpolation problem. This, however, is clearly a matter of taste: I am certain that there are many who would prefer the explanation using |x − x₀|³.

A referee points out that I have not reviewed the spline literature thoroughly in this paper, omitting reference even to the classic book Schumaker (2007). In my defence I point out that this paper is already too long.

References

Amiraslani, A., Corless, R. M. & Gunasingam, M. (2018) Differentiation matrices for univariate polynomials. Numerical Algorithms, 1–31.

Boole, G. (1960) A treatise on the calculus of finite differences. Dover. (Orig. pub. 1860; 2nd edition 1872, edited by J. F. Moulton.)

Butcher, J. C. (1967) A multistep generalization of Runge–Kutta methods with four or five stages. J. ACM, 84–99.

Butcher, J. C., Corless, R. M., Gonzalez-Vega, L. & Shakoori, A. (2010) Polynomial algebra for Birkhoff interpolants. Numerical Algorithms, 1–29.

Collatz, L. (1966) The numerical treatment of differential equations, 3rd edn. Springer-Verlag.

Corless, R. M. & Fillion, N. (2013) A graduate introduction to numerical methods. Springer.

Corless, R. M. & Rokicki, J. (1996) The symbolic generation of finite-difference formulae. Z.A.M.M., 381–382.

de Boor, C. (1978) A practical guide to splines. Springer-Verlag.

Fornberg, B. (1988) Generation of finite difference formulas on arbitrarily spaced grids. Mathematics of Computation, 699–706.

Fornberg, B. (1998) Classroom note: Calculation of weights in finite difference formulas. SIAM Review, 685–691.

Fornberg, B. & Flyer, N. (2015) A primer on radial basis functions with applications to the geosciences. SIAM.

Fox, L. (1957) The numerical solution of two-point boundary problems in ordinary differential equations. Oxford: Clarendon.

Higham, N. J. (2002) Accuracy and Stability of Numerical Algorithms, 2nd edn. Philadelphia: SIAM.

Hogben, L. (ed.) (2006) Handbook of Linear Algebra. Chapman and Hall/CRC.

Jordán, K. (1965) Calculus of finite differences, vol. 33. American Mathematical Soc.

Keller, H. & Pereyra, V. (1978) Symbolic generation of finite difference formulas. Mathematics of Computation, 955–971.

Lamb, E. (2019) The theorem that unites different kinds of calculus. Scientific American (blogs).

Lele, S. K. (1992) Compact finite difference schemes with spectral-like resolution. Journal of Computational Physics, 16–42.

Mahler, K. (1953) On the approximation of logarithms of algebraic numbers. Philosophical Transactions of the Royal Society of London, Series A, 371–398.

Marco, A., Martínez, J.-J. & Peña, J. M. (2017) Accurate bidiagonal decomposition of totally positive Cauchy–Vandermonde matrices and applications. Linear Algebra and its Applications, 63–84.

Milne-Thomson, L. M. (1951) The Calculus of Finite Differences, 2nd edn. MacMillan and Company. (1st edition 1933.)

Moler, C. (2004) Numerical Computing with MATLAB, electronic edn. Philadelphia: SIAM.

Rokicki, J. & Floryan, J. (1995) Domain decomposition and the compact fourth-order algorithm for the Navier–Stokes equations. Journal of Computational Physics, 79–96.

Rokicki, J. & Floryan, J. (1999) Higher-order unstructured domain decomposition method for Navier–Stokes equations. Computers & Fluids, 87–120.

Schoenberg, I. J. (1946) Contributions to the problem of approximation of equidistant data by analytic functions. Quart. Appl. Math., 45–99 and 112–141.

Schumaker, L. (2007) Spline functions: basic theory. Cambridge University Press.

Swartz, B. K. & Varga, R. S. (1972) Error bounds for spline and L-spline interpolation. Journal of Approximation Theory, 6–49.

Weideman, J. A. C. (2005) Padé approximations to the logarithm I: Derivation via differential equations. Quaestiones Mathematicae, 375–390.

Weideman, J. A. C. & Reddy, S. C. (2000) A MATLAB differentiation matrix suite. ACM Trans. Math. Softw., 465–519.

Wright, G. B. & Fornberg, B. (2006) Scattered node compact finite difference-type formulas generated from radial basis functions. Journal of Computational Physics, 99–123.

Zhao, J., Davison, M. & Corless, R. M. (2007) Compact finite difference method for American option pricing. J. Comput. Appl. Math., 306–321.

Zhao, J. & Corless, R. M. (2006) Compact finite difference method for integro-differential equations. Applied Mathematics and Computation, 271–288.
Fig. 1. Fourth-order uniform mesh compact finite difference derivative graphed with the exact derivative (dashed line). Even with just five subintervals, visual accuracy is achieved in this example.
Fig. 2. Condition numbers of the tridiagonal compact finite difference matrix for random meshes. We sampled 10,000 meshes of dimension n = 10,946 (a Fibonacci number). The meshes were generated by h = rand(1,n). The vertical line is at the dimension n, suggesting that the mean condition number grows like n. Similar graphs were generated for other n. Plotted with histogram(log(condsR(:,1))/log(10),'DisplayStyle','stairs').
FIG. 3. Fourth-order nonuniform mesh compact finite difference maximum error computing (1/Γ)′(x) on 1 ≤ x ≤ 3. The spatial mesh was the Chebyshev points x_j = 2 + cos(π j/n) for 0 ≤ j ≤ n, for various n. Theoretical fourth-order behaviour is shown by the dashed line.
FIG. 4. Error in the fourth-order compact cubic spline and its derivative for the Runge function on random meshes on [−1, 1]. The random meshes were generated by h = rand(1,n) and then using a cumulative sum to generate the grid points. The mesh widths varied in each individual mesh by factors up to about 10.
FIG. 5. Error in the fourth-order compact cubic spline and its derivative for the Runge function on uniform meshes on [−1, 1].
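The Chebyshev mesh in the Figure 3 caption can be generated as below. This is a sketch; the shift 2 + cos(π j/n), recovered from the stated interval 1 ≤ x ≤ 3, is an assumption, and the nodes are reversed so that they increase, giving mesh widths of one sign as required by the positive-definiteness result.

```python
import numpy as np

n = 32
j = np.arange(n + 1)
x = 2.0 + np.cos(np.pi * j / n)   # Chebyshev points on [1, 3], decreasing in j
x = x[::-1]                        # reorder so the nodes are increasing
h = np.diff(x)                     # nonuniform mesh widths, all positive
```

The widths h cluster near the endpoints, so this mesh exercises the nonuniform compact formulas well away from the uniform case.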
FIG. 6. Plot of the error in the fourth-order uniform mesh (n = 64) compact finite difference derivative, on the Runge example f(x) = 1/(1 + 25x²), on the interval [−1, 1]. Solid line is the error between mesh points. The error is largest in the middle, and believably O(h³). The error in the interpolant itself (not the derivative, as we have it in this figure) is O(h⁴).
FIG. 7. Plot of the error in the second derivative of the cubic Hermite interpolant, on [−1, 1] for the Runge example f(x) = 1/(1 + 25x²). The graph shows clearly that the second derivatives of the spline are continuous at the nodes, although as expected they do not match the second derivatives at the nodes computed by repeated application of vcompact4.m.
FIG. 8. Plot of the error in the second derivative of the compact cubic interpolant, on [−1, 1] for the Runge example f(x) = 1/(1 + 25x²). The graph shows clearly that the (linear) second derivatives of the cubic spline are not