A Riemann--Hilbert approach to Jacobi operators and Gaussian quadrature
Thomas Trogdon [email protected]
Courant Institute of Mathematical Sciences, New York University, 251 Mercer St., New York, NY 10012, USA

Sheehan Olver
School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia

September 15, 2018
Abstract
The computation of the entries of Jacobi operators associated with orthogonal polynomials has important applications in numerical analysis. By truncating the operator to form a Jacobi matrix, one can apply the Golub–Welsch algorithm to compute the Gaussian quadrature weights and nodes. Furthermore, the entries of the Jacobi operator are the coefficients in the three-term recurrence relationship for the polynomials. This provides an efficient method for evaluating the orthogonal polynomials. Here, we present an $O(N)$ method to compute the first $N$ rows of Jacobi operators from the associated weight. The method exploits the Riemann–Hilbert representation of the polynomials by solving a deformed Riemann–Hilbert problem numerically. We further adapt this computational approach to certain entire weights that are beyond the reach of current asymptotic Riemann–Hilbert techniques.

1 Introduction

Infinite symmetric tridiagonal matrices with positive off-diagonal entries are referred to as Jacobi operators. Such operators have an intimate connection to orthogonal polynomials. Given a positive weight $d\rho(x) = \omega(x)\,dx$ on $\mathbb{R}$, with $\omega$ entire and having finite moments, the Gram–Schmidt procedure produces a sequence of orthogonal polynomials. These polynomials satisfy a three-term recurrence relation, see (1.2). From the coefficients in the recurrence relation, a Jacobi operator
\[
J_\infty(\omega) = \begin{pmatrix} a_0 & \sqrt{b_1} & & \\ \sqrt{b_1} & a_1 & \sqrt{b_2} & \\ & \sqrt{b_2} & a_2 & \sqrt{b_3} \\ & & \ddots & \ddots & \ddots \end{pmatrix} \tag{1.1}
\]
can be constructed. Our aim is to rapidly calculate the entries of this Jacobi operator from the given weight $\omega$.

Computing the Jacobi operator (1.1) is equivalent to computing the coefficients in the three-term recurrence relation (1.2). Gautschi states in [8]:

The three-term recurrence relation satisfied by orthogonal polynomials is arguably the single most important piece of information for the constructive and computational use of orthogonal polynomials.
Apart from its obvious use in generating values of orthogonal polynomials and their derivatives, both within and without the spectrum of the measure, knowledge of the recursion coefficients (i) allows the zeros of orthogonal polynomials to be readily computed as eigenvalues of a symmetric tridiagonal matrix, and with them the all-important Gaussian quadrature rule, (ii) yields immediately the normalization coefficients needed to pass from monic orthogonal to orthonormal polynomials, (iii) opens access to polynomials of the second kind and related continued fractions, and (iv) allows an efficient evaluation of expansions in orthogonal polynomials.

All applications discussed here are merely a consequence of this broad and important fact emphasized by Gautschi.

While the entries of the Jacobi operator can be calculated using the Stieltjes procedure (a discretization of the Gram–Schmidt procedure), the complexity to accurately calculate the first $N$ entries is observed to be $O(N^3)$, see Section 1.1. We develop an alternative to the Stieltjes procedure that achieves $O(N)$ operations: calculate the entries of the Jacobi operator using the formulation of orthogonal polynomials as solutions to Riemann–Hilbert problems (RHPs). Riemann–Hilbert problems are boundary value problems for analytic functions, and over the last 20 years they have played a critical role in the advancement of asymptotic knowledge of orthogonal polynomials [2, 3, 4, 11, 5]. More recently, a numerical approach for solving Riemann–Hilbert problems has been developed [15, 16], which the authors used to numerically calculate the $N$th and $(N-1)$st orthogonal polynomials for varying weights $\omega(x; N) = e^{-NV(x)}$. In this paper, we extend this numerical Riemann–Hilbert approach to stably calculate orthogonal polynomials for non-varying weights, as well as the entries of the associated Jacobi operator.

Remark 1.1.
The Riemann–Hilbert approach also allows the calculation of $p_N(x)$ pointwise in $O(1)$ operations, which can be used to achieve an $O(N)$ algorithm for calculating Gaussian quadrature rules, à la the Glaser–Liu–Rokhlin algorithm [9, 10], whereas the Golub–Welsch algorithm requires $O(N^2)$ operations. While technically an $O(N)$ algorithm, its applicability is currently limited due to the large computational cost of evaluating solutions to RHPs.

Remark 1.2.
In many cases one also considers the map from a Jacobi operator to the associated weight. In the case that the elements of a Jacobi operator are unbounded, a self-adjoint extension $\tilde J_\infty$ is considered. The spectral measure $\tilde\rho$ for $\tilde J_\infty$ satisfies
\[
e_1^\top (\tilde J_\infty - z)^{-1} e_1 = \int_{\mathbb{R}} \frac{d\tilde\rho(x)}{x - z}.
\]
Here $e_1$ is the standard basis vector with a one in its first component and zeros elsewhere. It is known that if $\omega(x) = e^{-p(x)}$ for a polynomial $p$, then $J_\infty$ is essentially self-adjoint (i.e., $\tilde J_\infty$ is unique) and $\omega(x)\,dx = d\tilde\rho(x)$ [2]. Therefore, from a theoretical standpoint, the map $\phi$ from a subset of essentially self-adjoint Jacobi operators to this restricted class of measures is invertible. Viewed in this way, we are concerned with the computation of $\phi^{-1}$, the inverse spectral map.

The monic polynomials $\pi_n$ with respect to the weight $\omega$ satisfy the three-term recurrence relation
\[
\pi_{-1}(x) = 0, \quad \pi_0(x) = 1, \quad \pi_{n+1}(x) = (x - a_n)\pi_n(x) - b_n \pi_{n-1}(x), \tag{1.2}
\]
where
\[
a_n = \langle x\pi_n, \pi_n\rangle\, \gamma_n^2 \quad \text{and} \quad b_n = \langle x\pi_n, \pi_{n-1}\rangle\, \gamma_{n-1}^2, \qquad \langle f, g\rangle = \int_{\mathbb{R}} f(x) g(x) \omega(x)\,dx,
\]
and the $\gamma_n$ are the normalization constants
\[
\gamma_n^2 = \langle \pi_n, \pi_n\rangle^{-1} = \left( \int_{\mathbb{R}} \pi_n(x)^2 \omega(x)\,dx \right)^{-1}.
\]
The constants $\gamma_n$ can be calculated directly from the $b_n$ [8]:
\[
\gamma_n^{-2} = b_n b_{n-1} \cdots b_1 \int_{\mathbb{R}} \omega(x)\,dx,
\]
though it is convenient to compute them separately from the RHP.

Remark 1.3.
In our notation, $a_n$ and $b_n$ are the recurrence coefficients for the monic orthogonal polynomials. The Jacobi operator (1.1) encodes the recurrence coefficients for the orthonormal polynomials. We also remark that the orthogonal polynomials themselves can be calculated from the recurrence coefficients.

1.1 The Stieltjes procedure

The conventional method for computing recurrence coefficients is the Stieltjes procedure, which is a discretized modified Gram–Schmidt method [8]. In this method, one replaces the inner product associated with the weight by an $M$-point discretization:
\[
\langle f, g\rangle \approx \langle f, g\rangle_M = \sum_{j=1}^{M} w_j f(x_j) g(x_j).
\]
For example, $\mathbb{R}$ can be mapped to the unit circle $\{|z| = 1\}$ using a Möbius transformation and then the trapezoidal rule with $M$ sample points applied. The three-term recurrence coefficients and the values of the monic orthogonal polynomials with respect to the discretized inner product can be built up directly:
\[
p_0^M = \mathbb{1}, \qquad p_1^M = x - \frac{A_0^M}{\langle \mathbb{1}, \mathbb{1}\rangle_M}\, p_0^M, \qquad p_{n+1}^M = x p_n^M - \frac{A_n^M}{B_n^M}\, p_n^M - \frac{B_n^M}{B_{n-1}^M}\, p_{n-1}^M,
\]
\[
A_n^M = \langle x p_n^M, p_n^M\rangle_M, \qquad B_0^M = \langle \mathbb{1}, \mathbb{1}\rangle_M, \qquad B_n^M = \langle x p_n^M, p_{n-1}^M\rangle_M,
\]
where $\mathbb{1}$ is a vector of $M$ ones, $x = (x_1, \ldots, x_M)^\top$, the product of two vectors is the pointwise product, i.e., $x p := \operatorname{diag}(x)\, p$, and we use the notation $\langle f, g\rangle_M = f^\top \operatorname{diag}(w_1, \ldots, w_M)\, g$. Noting that $B_n^M$ also equals $\langle p_n^M, p_n^M\rangle_M$, the entries of the Jacobi operator are approximated by
\[
a_n \approx a_n^M = \frac{A_n^M}{B_n^M} \quad \text{and} \quad b_n \approx b_n^M = \frac{B_n^M}{B_{n-1}^M}.
\]
It is shown in [8] that the recurrence coefficients $a_n^M$, $b_n^M$ obtained from this discrete inner product (via the Stieltjes procedure) converge to $a_n$, $b_n$ for fixed $n$ as $M \to \infty$. It is stated in [8] that $M$ must be much larger than $n$, the order of the polynomial, for $a_n^M$ and $b_n^M$ to accurately approximate $a_n$ and $b_n$. A significant question is: how much larger must $M$ be?
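The procedure above can be sketched in a few lines (a minimal NumPy implementation; for the discrete inner product we simply reuse a Gauss–Hermite rule for $e^{-x^2}$, so the recovered coefficients can be checked against the known Hermite values $a_n = 0$, $b_n = n/2$):

```python
import numpy as np

def stieltjes(x, w, n):
    """Discretized Stieltjes procedure: approximate the monic recurrence
    coefficients a_0..a_{n-1} and b_1..b_{n-1} from the discrete inner
    product <f, g>_M = sum_j w_j f(x_j) g(x_j)."""
    ip = lambda f, g: np.sum(w * f * g)
    p_prev, p = np.zeros_like(x), np.ones_like(x)   # p_{-1}, p_0
    B_prev, B = 1.0, ip(p, p)                       # B_k = <p_k, p_k>_M
    a, b = [], []
    for k in range(n):
        ak = ip(x * p, p) / B                       # a_k = A_k / B_k
        bk = B / B_prev                             # b_k = B_k / B_{k-1}
        a.append(ak)
        b.append(bk)
        # three-term recurrence for the discretized monic polynomials
        p_prev, p = p, (x - ak) * p - (bk if k > 0 else 0.0) * p_prev
        B_prev, B = B, ip(p, p)
    return np.array(a), np.array(b[1:])             # b_0 is not defined by (1.2)

# Recover the Hermite coefficients from a 50-point Gauss-Hermite rule.
xg, wg = np.polynomial.hermite.hermgauss(50)
a, b = stieltjes(xg, wg, 8)
```

Since the 50-point rule integrates all the polynomials involved exactly, the recovered coefficients here agree with $a_n = 0$, $b_n = n/2$ to roundoff; the accuracy loss discussed below appears when $n$ grows relative to $M$.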
In Figure 1, we show the relative error in calculating the recurrence coefficients of the Hermite weight $e^{-x^2}dx$ for varying $M$ and $n$, where $n$ is the index of the recurrence coefficient computed. To accurately calculate the first $N$ coefficients, $M$ must grow rapidly: empirically, $M \sim N^2$ is required. Each stage of the Stieltjes procedure requires $O(M)$ operations, hence the total observed complexity is $O(N^3)$ operations. This compares unfavorably with the $O(N)$ operations of the RHP approach.

For stability purposes, it is necessary that we consider weights that depend on a parameter $N$ of the form
\[
d\rho_N(x) = \omega(x; N)\,dx = e^{-NV_N(x)}\,dx \tag{1.3}
\]
on $\mathbb{R}$, where $V_N(x)$ tends to a nice limit $V(x)$ as $N \to \infty$. Here $V_N(x)$ must be entire and have polynomial growth to $+\infty$ as $|x| \to \infty$. For a fixed $N$, we can stably calculate the $n$th row for $n \lesssim N$, hence we only consider $n = 0, 1, \ldots, N$. Varying weights of this form are useful in the study of random matrices [2, 18].
Figure 1: Relative error in approximating the Hermite polynomial recurrence coefficients. (a) Varying the number of trapezoidal rule points $M = 50$ (plain), 100 (dotted), 1000 (dashed) and 2000 (dot-dashed). (b) The error in coefficient $n = 10$ (plain), 20 (dotted), 40 (dashed) and 60 (dot-dashed).

For Gaussian quadrature, it is more natural to consider weights of the form $e^{-Q(x)}\,dx$. When $Q(x) = qx^m + O(x^{m-1})$ is a degree $m$ polynomial, a simple change of variables $x \mapsto N^{1/m} x$ is sufficient to put the weight in the required form, where $V_N(x) = N^{-1} Q(N^{1/m} x) \to q x^m$ as $N \to \infty$. When $Q$ is not a polynomial (e.g., an Erdős weight [6]), we use an affine change of variables $x \mapsto \alpha_N x + \beta_N$, but now $\alpha_N$ and $\beta_N$ are determined numerically in order to control the behaviour of $V_N(x) = N^{-1} Q(\alpha_N x + \beta_N)$, see Section 3.6. Applying this procedure to two entire choices for $Q$, we observe a slight degradation in computational cost to an estimated count of $O(N \log N)$ operations to calculate the first $N$ rows of the Jacobi operator, see Appendix A.

Once we calculate the first $N$ rows of the Jacobi operator associated with $d\rho_N(x) = e^{-Q(\alpha_N x + \beta_N)}\,dx$, we can subsequently determine the first $N$ rows of the Jacobi operator of the unscaled weight $d\rho(x) = e^{-Q(x)}\,dx$. First note that if $\pi_{j,N}$ are the monic orthogonal polynomials with respect to $d\rho_N$, then the monic orthogonal polynomials with respect to $d\rho$ are $\pi_j(x) = \alpha_N^j\, \pi_{j,N}(\alpha_N^{-1}(x - \beta_N))$. Thus if the entries of the Jacobi operator of $\rho_N$ are given by $a_{n,N}$ and $b_{n,N}$, then a simple calculation verifies that the entries of the original Jacobi operator of $\rho$ are
\[
b_n = b_{n,N}\, \alpha_N^2 \quad \text{and} \quad a_n = a_{n,N}\, \alpha_N + \beta_N. \tag{1.4}
\]
The factors in the right-hand sides of (1.4) depend on $N$ while the products do not. It is advantageous to choose $N$ so that $1/c < b_{n,N} < c$ for some $c > 1$; the choice $N = n$ is sufficient.
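A quick numerical sanity check of (1.4), assuming the known Hermite data: for $Q(x) = x^2$ one has $\alpha_N = \sqrt{N}$, $\beta_N = 0$, and the scaled weight $e^{-Nx^2}$ has monic recurrence coefficients $b_{n,N} = n/(2N)$:

```python
import numpy as np

# Scaling check for (1.4) with Q(x) = x^2 (Hermite): alpha_N = sqrt(N),
# beta_N = 0, and b_{n,N} = n/(2N) for the scaled weight e^{-N x^2}.
N = 64
alpha_N, beta_N = np.sqrt(N), 0.0
n = np.arange(1, N + 1)
b_nN = n / (2.0 * N)                 # coefficients of the scaled weight
a_nN = np.zeros_like(b_nN)
b_n = b_nN * alpha_N**2              # eq. (1.4): undo the scaling
a_n = a_nN * alpha_N + beta_N
```

This recovers $b_n = n/2$ and $a_n = 0$ for the unscaled weight $e^{-x^2}$; note that $b_{N,N} = 1/2$, consistent with the claim that the choice $N = n$ keeps $b_{n,N}$ bounded.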
The argument for this can be seen by examining the case of the Hermite polynomials, where $b_n = n/2$ for $n > 0$. Here $\alpha_N = \sqrt{N}$, so that $b_{n,N} = n/(2N)$; taking $N = n$ gives $b_{n,n} = 1/2$, and the growth of $b_n$ is captured exactly by $\alpha_n^2 = n$. If $N$ is large with respect to $n$, then $b_{n,N} \ll 1$ and (1.4) multiplies a small quantity by the large factor $\alpha_N^2$. Our choice for $N$ is one that keeps the "solution" $b_{n,N}$ bounded while keeping the amplification factor $\alpha_N$ minimal.

Remark 1.4.
We are implicitly assuming that the equilibrium measure of the scaled potential is supported on a single interval. This is the case when $Q$ is convex, as $V_N$ will also be convex.

1.2 Riemann–Hilbert problems

The method presented here for the computation of recurrence coefficients relies on the numerical solution of a matrix Riemann–Hilbert problem (RHP). We refer the reader to [16] for a discussion of the numerical solution of RHPs. In this section we give a brief description of some components of the RHP theory. Given an oriented contour $\Gamma$, an RHP consists of finding a sectionally analytic function $\Phi(z) : \mathbb{C} \setminus \Gamma \to \mathbb{C}^{2\times 2}$, depending on $n$ and $\omega(x; N)$, such that
\[
\lim_{\substack{z \to x \\ z \text{ left of } \Gamma}} \Phi(z) = \left( \lim_{\substack{z \to x \\ z \text{ right of } \Gamma}} \Phi(z) \right) G(x; n), \qquad G(\cdot\,; n) : \Gamma \to \mathbb{C}^{2\times 2}.
\]
The contour $\Gamma$ is referred to as the jump contour and $G$ as the jump matrix. Canonically, RHPs also impose asymptotic behaviour at $\infty$: $\lim_{|z|\to\infty} \Phi(z) = I$. For orthogonal polynomials, the initial RHP, Problem 3.1, is reduced to canonical form via a series of transformations described in Section 3.

Of course, the sense in which the limits exist needs to be made precise, but this is beyond the scope of this paper, see [2]. We use the notation
\[
\Phi^+(x) = \Phi_+(x) = \lim_{\substack{z \to x \\ z \text{ left of } \Gamma}} \Phi(z), \qquad \Phi^-(x) = \Phi_-(x) = \lim_{\substack{z \to x \\ z \text{ right of } \Gamma}} \Phi(z),
\]
where the $\pm$ may be located in the superscript or subscript. Define
\[
\mathcal{C}_\Gamma f(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(s)}{s - z}\,ds, \qquad \mathcal{C}^\pm_\Gamma f(x) = (\mathcal{C}_\Gamma f(x))^\pm.
\]
The class of RHPs that we consider, which satisfy $\lim_{|z|\to\infty} \Phi(z) = I$, have the representation
\[
\Phi(z) = I + \mathcal{C}_\Gamma U(z)
\]
for a matrix-valued function $U \in L^2(\Gamma)$ [19]. The so-called Plemelj Lemma also holds:
\[
\mathcal{C}^+_\Gamma f - \mathcal{C}^-_\Gamma f = f, \qquad f \in L^2(\Gamma).
\]
From this it follows that
\[
\Phi^+(x) = \Phi^-(x)\, G(x; n), \qquad \lim_{|z|\to\infty} \Phi(z) = I,
\]
is equivalent to
\[
U - \mathcal{C}^-_\Gamma U \cdot (G - I) = G - I. \tag{1.5}
\]
This is a singular integral equation for $U$, and it is solved numerically with the method in [16].

2 Gaussian Quadrature
In this section we discuss Gaussian quadrature on $\mathbb{R}$. In Gaussian quadrature, one chooses nodes $\{x_j\}_{j=1}^n$ and weights $\{\omega_j\}_{j=1}^n$ such that
\[
\int_{\mathbb{R}} p(x)\, \omega(x)\,dx = \sum_{j=1}^n p(x_j)\, \omega_j
\]
for any polynomial $p(x)$ of degree $\leq 2n - 1$. We use the classical Golub–Welsch algorithm to calculate the nodes $x_j$ and weights $\omega_j$ from the Jacobi matrix, i.e., the $n \times n$ finite section of the Jacobi operator:
\[
J_n(\omega) = \begin{pmatrix} a_0 & \sqrt{b_1} & & & \\ \sqrt{b_1} & a_1 & \sqrt{b_2} & & \\ & \sqrt{b_2} & a_2 & \sqrt{b_3} & \\ & & \ddots & \ddots & \sqrt{b_{n-1}} \\ & & & \sqrt{b_{n-1}} & a_{n-1} \end{pmatrix}.
\]
The eigenvalues of $J_n(\omega)$ are the zeros of $\pi_n(x)$ [8]. For each eigenvalue $\lambda_j$ of $J_n(\omega)$, an eigenvector is given by
\[
v(\lambda_j) = (p_0(\lambda_j), \ldots, p_{n-1}(\lambda_j))^\top,
\]
where $p_k = \gamma_k \pi_k$ are the orthonormal polynomials. We use $\tilde v(\lambda_j)$ to denote the normalized eigenvector, having Euclidean norm one. Let $\mu = \int_{\mathbb{R}} \omega(x)\,dx$; then
\[
\omega_j = \mu\, (\tilde v(\lambda_j))_1^2,
\]
i.e., the first component of $\tilde v(\lambda_j)$ squared, times $\mu$ [8]. In practice, $\mu$ can be computed with standard quadrature routines, e.g., Clenshaw–Curtis quadrature.
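The Golub–Welsch step can be sketched as follows (a minimal version using a dense symmetric eigensolver; a production implementation would use a dedicated tridiagonal eigensolver):

```python
import numpy as np

def golub_welsch(a, b, mu0):
    """Gaussian quadrature from the n x n Jacobi matrix.

    a = (a_0..a_{n-1}) and b = (b_1..b_{n-1}) are the monic recurrence
    coefficients and mu0 is the total mass of the weight.  Nodes are the
    eigenvalues; weights are mu0 times the squared first components of the
    normalized eigenvectors."""
    J = np.diag(a) + np.diag(np.sqrt(b), 1) + np.diag(np.sqrt(b), -1)
    lam, V = np.linalg.eigh(J)          # eigh returns unit-norm eigenvectors
    return lam, mu0 * V[0, :] ** 2

# Hermite check: a_n = 0, b_n = n/2, mu0 = sqrt(pi).  A 10-point rule is
# exact for polynomials of degree <= 19, e.g. int x^4 e^{-x^2} dx = 3 sqrt(pi)/4.
n = 10
x, w = golub_welsch(np.zeros(n), np.arange(1, n) / 2.0, np.sqrt(np.pi))
moment4 = np.sum(w * x**4)
```

The nodes and weights produced this way agree with a library Gauss–Hermite rule to roundoff.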
For the Hermite polynomials, $\mu = \beta_0 = \sqrt{\pi}$, $b_j = j/2$ for $j > 0$ and $a_j = 0$ always [8]. We use this as a test case to demonstrate the accuracy of our computation. In Figure 2 we show the relative error of the first 700 coefficients computed using the method described below.

$\omega(x) = e^{-x^2}$

We demonstrate the integration of a meromorphic integrand. In all our examples, we choose to integrate positive functions so that cancellation will not artificially inflate the observed accuracy. See Figure 3 for a demonstration of this.
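A sketch of such a convergence test (not the authors' exact example): we integrate the positive integrand $1/(x^2+1)$, whose poles at $\pm i$ limit the convergence rate, against $e^{-x^2}$, using the closed form $\int_{\mathbb{R}} e^{-x^2}/(x^2+1)\,dx = \pi e\,\operatorname{erfc}(1)$ as the reference value:

```python
import math
import numpy as np

# Reference value: int e^{-x^2}/(x^2+1) dx = pi * e * erfc(1).
exact = math.pi * math.e * math.erfc(1.0)

errors = []
for n in (10, 20, 40, 80):
    # Gauss-Hermite rule for the weight e^{-x^2}
    x, w = np.polynomial.hermite.hermgauss(n)
    errors.append(abs(np.sum(w / (x**2 + 1)) - exact))
# The error decays rapidly with n, but only root-exponentially,
# because the integrand is meromorphic rather than entire.
```

The decay of `errors` mirrors the qualitative behavior shown in Figure 3(b).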
Figure 2: The relative error in $b_n$ for $n = 0, 1, 2, \ldots, 700$.

Figure 3: (a) The function $f(x)\omega(x)$ for $f(x) = \cos^2(150x)/(25x^2+1)$ and $\omega(x) = e^{-x^2}$. (b) The error in the approximation of $\int f(x)\omega(x)\,dx$ with a varying number of nodes and weights. The approximation is compared against an approximation using Clenshaw–Curtis quadrature with 10,000 points.

Figure 4: (a) The function $f(x)\omega(x)$ for $f(x) = \sin^2(10x)$ and $\omega(x) = e^{-x^4-\sin(x)}$. (b) The error in the approximation of $\int f(x)\omega(x)\,dx$ with a varying number of nodes and weights. The approximation is compared against an approximation using Clenshaw–Curtis quadrature with 10,000 points.

$\omega(x) = e^{-x^4-\sin(x)}$

A more exotic choice is $Q(x) = x^4 + \sin(x)$. Even though $Q$ is not a polynomial and is not convex, numerical experiments demonstrate that the method described below to compute the recurrence coefficients is still effective. We demonstrate using this weight with Gaussian quadrature in Figure 4.

Remark 2.1.
The only reason the method would fail on this weight would be if the equilibrium measure (see Section 3) were not supported on a single interval.

$\omega(x) = e^{-\cosh(x)}$

With the choice $Q(x) = \cosh(x)$, we have super-exponential decay. We include this example to show that the method extends to $Q$ that have super-polynomial growth. Indeed, a modification of the method is needed, see Section 3.6. We demonstrate using this weight with Gaussian quadrature in Figure 5.

Figure 5: (a) The function $f(x) = 0.1\,(x^2+x)^2\sin^2(10x)\,e^{-\cosh(x)}$. (b) The error in the approximation of $\int f(x)\,dx$ with a varying number of nodes and weights. The approximation is compared against an approximation using Clenshaw–Curtis quadrature with 10,000 points.

3 The Riemann–Hilbert approach

Here we present a concise adaptation of the numerical RH approach to computing orthogonal polynomials on $\mathbb{R}$ with analytic exponential weights (originally described in [18]) to calculating all orthogonal polynomials of order $n = 0, 1, \ldots, N$. For given $n$ and $N$, the RHP simultaneously captures the polynomials $\pi_{n,N}(x)$ and $\gamma_{n-1,N}^2\, \pi_{n-1,N}(x)$. More precisely, these polynomials can be expressed in terms of the solution of an RHP:
Problem 3.1 ([7]). The function
\[
Y(z) = \begin{pmatrix} \pi_{n,N}(z) & \mathcal{C}_{\mathbb{R}}[\pi_{n,N}\, e^{-NV}](z) \\ -2\pi i\, \gamma_{n-1,N}^2\, \pi_{n-1,N}(z) & -2\pi i\, \gamma_{n-1,N}^2\, \mathcal{C}_{\mathbb{R}}[\pi_{n-1,N}\, e^{-NV}](z) \end{pmatrix},
\]
where
\[
\gamma_{n-1,N}^2 = \left[ \int_{\mathbb{R}} \pi_{n-1,N}^2(x)\, e^{-NV(x)}\,dx \right]^{-1},
\]
is the unique solution of the RHP on the real line
\[
Y^+(z) = Y^-(z) \begin{pmatrix} 1 & e^{-NV(z)} \\ 0 & 1 \end{pmatrix}, \quad z \in \mathbb{R}, \qquad Y(z) \sim \begin{pmatrix} z^n & 0 \\ 0 & z^{-n} \end{pmatrix}, \quad z \to \infty.
\]

We apply the numerical method described in [16], but conditioning steps are required. Our first task is to remove the growth/decay of $Y$ at $\infty$. To accomplish this, we must compute the $g$-function, which has logarithmic growth at infinity so that $e^{\pm n g(z)} \sim z^{\pm n}$, but has special jump properties so that its incorporation into the RHP allows the RHP to behave well in the large-$n$ limit. The $g$-function, which depends on $n$ and $N$, is associated with the equilibrium measure of $V(x)$:

Definition 3.1.
The equilibrium measure $\mu \equiv \mu_{n,N}$ is the minimizer of
\[
\iint \log\frac{1}{|x-y|}\,d\mu(x)\,d\mu(y) + \int \frac{N}{n} V(x)\,d\mu(x)
\]
over all probability measures on $\mathbb{R}$.

For simplicity, we assume that the equilibrium measure of $V(x)$ is supported on a single interval $(a, b)$. This is certainly true provided $V$ is convex [2]. A numerical method for computing equilibrium measures is presented in [14] and we leave the details to that reference. A discussion of how the computational cost of computing the equilibrium measure varies with $N$ and $n$ is left to Appendix A.

From [2] it follows that $g \equiv g_{n,N}$ is an anti-derivative of the Stieltjes transform of $\mu$:
\[
g(z) = \int^z \int \frac{d\mu(x)}{z' - x}\,dz'.
\]
Therefore, $g$ is analytic off $(-\infty, b)$ and satisfies
\[
g^+(z) + g^-(z) = \frac{N}{n} V(z) - \ell, \quad z \in (a, b), \quad \ell \in \mathbb{C},
\]
\[
g(z) = \log z + O(z^{-1}) \quad \text{as } z \to \infty,
\]
\[
g^+(z) - g^-(z) = 2\pi i, \quad z \in (-\infty, a).
\]
The method of [14] gives a simple representation of $g$ in terms of the Chebyshev coefficients of $V'$ over $(a, b)$ and elementary functions. Numerically, the Chebyshev coefficients of $V'$ are replaced with a numerical approximation via the discrete cosine transform applied to $V'$ evaluated at Chebyshev points, and the $g$ associated with the resulting polynomial is calculated in closed form. The error introduced by this approximation is uniformly small throughout the complex plane, including for $z$ close to or on $\mathbb{R}$.
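For instance, for $V(x) = x^2$ and $N/n = 1$ the minimizer is the semicircle $d\mu = \pi^{-1}\sqrt{2-x^2}\,dx$ on $(-\sqrt{2}, \sqrt{2})$ (a standard fact, see [2]); its normalization and second moment can be checked numerically:

```python
import numpy as np

# Equilibrium measure for V(x) = x^2 with N/n = 1: the semicircle
# d mu = (1/pi) sqrt(2 - x^2) dx on (-sqrt(2), sqrt(2)).
# Substituting x = sqrt(2) sin(theta) gives d mu = (2/pi) cos(theta)^2 d theta,
# which we integrate with the midpoint rule (exact here, up to roundoff,
# since the integrands are trigonometric polynomials over a full period).
M = 2000
theta = (np.arange(M) + 0.5) / M * np.pi - np.pi / 2
h = np.pi / M
mass = np.sum((2 / np.pi) * np.cos(theta) ** 2) * h                 # should be 1
second = np.sum(2 * np.sin(theta) ** 2 * (2 / np.pi) * np.cos(theta) ** 2) * h
# second moment: int x^2 dmu = 1/2
```

This is the same measure that drives the Hermite asymptotics used as a test case in Section 2.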
We rewrite $Y(z)$ to normalize the behavior at infinity:
\[
Y(z) = \begin{pmatrix} e^{-n\ell/2} & 0 \\ 0 & e^{n\ell/2} \end{pmatrix} T(z) \begin{pmatrix} e^{ng(z)} & 0 \\ 0 & e^{-ng(z)} \end{pmatrix} \begin{pmatrix} e^{n\ell/2} & 0 \\ 0 & e^{-n\ell/2} \end{pmatrix}, \tag{3.1}
\]
so that $T(z) \sim I$ for large $z$ and $T(z)$ has a jump along the real line, on which it satisfies
\[
T^+(z) = T^-(z) \begin{pmatrix} e^{n(g^-(z)-g^+(z))} & e^{n(g^+(z)+g^-(z)+\ell-\frac{N}{n}V(z))} \\ 0 & e^{n(g^+(z)-g^-(z))} \end{pmatrix}
= T^-(z) \begin{cases} \begin{pmatrix} 1 & e^{n(g^+(z)+g^-(z)+\ell-\frac{N}{n}V(z))} \\ 0 & 1 \end{pmatrix} & \text{if } z < a, \\[4pt] \begin{pmatrix} e^{n(g^-(z)-g^+(z))} & 1 \\ 0 & e^{n(g^+(z)-g^-(z))} \end{pmatrix} & \text{if } a < z < b, \\[4pt] \begin{pmatrix} 1 & e^{n(2g(z)+\ell-\frac{N}{n}V(z))} \\ 0 & 1 \end{pmatrix} & \text{if } b < z. \end{cases}
\]
We appeal to properties of equilibrium measures [2, (7.57)] to assert that
\[
g^+(x) + g^-(x) + \ell - \frac{N}{n}V(x) < 0
\]
for $x < a$ and $x > b$; thus those contributions of the jump matrix are isolated around $a$ and $b$. On the other hand, $g^+ - g^-$ is imaginary between $a$ and $b$ [2, pp. 195], hence $e^{\pm n(g^+ - g^-)}$ is oscillatory on $(a, b)$. The RHP is deformed into the complex plane to convert oscillations into exponential decay. A so-called lensing of the RHP is introduced in Figure 6, where we rewrite $T$ as
\[
T(z) = S(z) \begin{cases} \begin{pmatrix} 1 & 0 \\ e^{n(\frac{N}{n}V(z)-\ell-2g(z))} & 1 \end{pmatrix} & \text{if } z \in \Sigma_+, \\[4pt] \begin{pmatrix} 1 & 0 \\ -e^{n(\frac{N}{n}V(z)-\ell-2g(z))} & 1 \end{pmatrix} & \text{if } z \in \Sigma_-, \\[4pt] I & \text{otherwise}. \end{cases} \tag{3.2}
\]

Figure 6: The jump contours of $S$ with the angles $\theta_1$ and $\theta_2$ labelled.
By substituting
\[
g^+(z) = \frac{N}{n} V(z) - g^-(z) - \ell
\]
within the support of $\mu$, we see that the oscillations have been removed completely from the support:
\begin{align*}
S^+(z) &= T^+(z)\begin{pmatrix} 1 & 0 \\ -e^{n(\frac{N}{n}V(z)-\ell-2g^+(z))} & 1 \end{pmatrix} = T^+(z)\begin{pmatrix} 1 & 0 \\ -e^{n(g^-(z)-g^+(z))} & 1 \end{pmatrix} \\
&= T^-(z)\begin{pmatrix} e^{n(g^-(z)-g^+(z))} & 1 \\ 0 & e^{n(g^+(z)-g^-(z))} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -e^{n(g^-(z)-g^+(z))} & 1 \end{pmatrix} = T^-(z)\begin{pmatrix} 0 & 1 \\ -1 & e^{n(g^+(z)-g^-(z))} \end{pmatrix} \\
&= S^-(z)\begin{pmatrix} 1 & 0 \\ -e^{n(\frac{N}{n}V(z)-\ell-2g^-(z))} & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & e^{n(g^+(z)-g^-(z))} \end{pmatrix} = S^-(z)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\end{align*}
This introduces jumps on the lens boundaries $\Gamma_\uparrow$ and $\Gamma_\downarrow$, on which
\[
S^+(z) = T^+(z) = T^-(z) = S^-(z)\begin{pmatrix} 1 & 0 \\ e^{n(\frac{N}{n}V(z)-\ell-2g(z))} & 1 \end{pmatrix}.
\]
The choices of $\theta_1$ and $\theta_2$ in Figure 6 are discussed below.

The above process has shifted oscillations to exponential decay. However, the goal of these deformations of the RHP is to maintain accuracy of the numerical algorithm for large $n$. Following the heuristics set out in [17], we must isolate the jumps to neighborhoods of the endpoints $a$ and $b$ by removing the jump along $(a, b)$. We introduce a global parametrix that solves the RHP exactly on this single contour. In other words, we require a function which satisfies the following RHP:
\[
N^+(x) = N^-(x)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad \text{for } a < x < b, \qquad N(\infty) = I.
\]
The solution is [2]
\[
N(z) = \frac{1}{2\nu(z)}\begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} + \frac{\nu(z)}{2}\begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix} \qquad \text{for } \nu(z) = \left(\frac{z-b}{z-a}\right)^{1/4};
\]
i.e., $\nu(z)$ is a solution to $\nu^+(x) = i\nu^-(x)$ for $a < x < b$ and $\nu(\infty) = 1$.

An issue with using $N(z)$ as a parametrix is that it introduces singularities at $a$ and $b$, hence we also introduce local parametrices to avoid these singularities. Generically, the equilibrium measure $\psi(x)$ has square-root decay at the endpoints of its support, and so parametrices based on Airy functions could be employed [2].
However, it is possible that the equilibrium measure has higher-order decay at the endpoints of its support, in which case the local parametrices are given only in terms of an RHP [1]. To avoid this issue, we introduce trivially constructed local parametrices which satisfy the jumps of $S$ in neighborhoods of $a$ and $b$:
\[
P_a(z) = H_a(z)\begin{pmatrix} e^{\frac{n}{2}\varphi(z)} & 0 \\ 0 & e^{-\frac{n}{2}\varphi(z)} \end{pmatrix}, \qquad P_b(z) = H_b(z)\begin{pmatrix} e^{\frac{n}{2}\varphi(z)} & 0 \\ 0 & e^{-\frac{n}{2}\varphi(z)} \end{pmatrix}, \qquad \varphi(z) = \frac{N}{n}V(z) - \ell - 2g(z),
\]
where $H_a(z)$ is piecewise constant in the sectors $\pi - \theta_2 < \arg(z-a) < \pi$, $-\pi < \arg(z-a) < -\pi + \theta_2$ and $-\pi + \theta_2 < \arg(z-a) < 0$, with $H_a = I$ otherwise, and $H_b(z)$ is piecewise constant in the sectors $\theta_1 < \arg(z-b) < \pi$, $-\pi < \arg(z-b) < -\theta_1$ and $-\theta_1 < \arg(z-b) < 0$, with $H_b = I$ otherwise. The constant, unimodular values of $H_a$ and $H_b$ are determined by requiring that $P_a$ and $P_b$ have exactly the jumps of $S$ in neighborhoods of $a$ and $b$: the exponential factor reduces each jump of $S$ to a constant matrix, which the sector constants must reproduce. The angles $\theta_1$ and $\theta_2$ are determined in Section 3.3.

Figure 7: The jumps of $\Phi$. We use a counter-clockwise orientation on the circles about $a$ and $b$.

We write
\[
S(z) = \Phi(z)\begin{cases} N(z) & \text{if } |z-a| > r \text{ and } |z-b| > r, \\ P_b(z) & \text{if } |z-b| < r, \\ P_a(z) & \text{if } |z-a| < r. \end{cases} \tag{3.3}
\]
The final RHP for $\Phi$ satisfies the jumps depicted in Figure 7. Note that in general $r$, the radius of the circles in Figure 7, depends on $N$. We discuss this in more detail in the next section.

In practice, we do not use infinite contours. We truncate contours when the jump matrix is, to machine precision, the identity matrix; this can be justified following arguments in [17]. In all cases we consider here, after proper deformations the jump matrices are infinitely smooth and decay exponentially to the identity matrix for large $z$.

3.3 Scaling and truncating the contours

Evaluating the functions that appear in the jump matrix of the RHP for $\Phi$ is not expensive, due to the efficiency of the method for computing $g(z)$. This allows us to perform an adaptive truncation of the jump matrices. Before we discuss this, we must discuss the radius of the circles in Figure 7. In the case of a non-degenerate (generic) equilibrium measure,
\[
\frac{N}{n}V(z) - \ell - 2g(z) \sim c_a(z-a)^{3/2} \text{ as } z \to a \qquad \text{and} \qquad \frac{N}{n}V(z) - \ell - 2g(z) \sim c_b(z-b)^{3/2} \text{ as } z \to b.
\]
In this case, we scale the contours like $n^{-2/3}$:
\[
\Omega_n^a = -n^{-2/3}\,\Omega + a \quad \text{and} \quad \Omega_n^b = n^{-2/3}\,\Omega + b,
\]
for a fixed domain $\Omega$. The $\Omega$ used in practice is depicted in the left graph of Figure 8, and the angles of the contours are chosen to match the direction of steepest descent. This implies that $r \sim n^{-2/3}$. In the first-order degenerate case (e.g., a quartic $V$ whose equilibrium measure vanishes to higher order at $b$),
\[
\frac{N}{n}V(z) - \ell - 2g(z) \sim c_b(z-b)^{7/2} \text{ as } z \to b,
\]
and we scale like $n^{-2/7}$ (implying $r \sim n^{-2/7}$) at the degenerate endpoint:
\[
\Omega_n^a = -n^{-2/3}\,\Omega + a \quad \text{and} \quad \Omega_n^b = n^{-2/7}\,\Omega + b,
\]
where $\Omega$ is depicted in the right graph of Figure 8 (the angle is sharper to attach to the new direction of steepest descent). In general, $\theta_1$ and $\theta_2$ in Figure 6 are chosen based on the angles for $\Omega$. Higher-order degenerate points can only take the form
\[
\frac{N}{n}V(z) - \ell - 2g(z) \sim c_b(z-b)^{(3+4\lambda)/2}
\]
for integer $\lambda$ [1], and can be treated similarly by scaling a suitably constructed contour like $n^{-2/(3+4\lambda)}$.

Figure 8: The pre-scaled $\Omega$ used for non-degenerate endpoints and the pre-scaled $\Omega$ used for first-order degenerate endpoints. These are the contours that are used in practice.

Each non-self-intersecting component $\Omega^0$ of the contour in Figure 7 is given by a transformation $\Omega^0 = M([-1, 1])$, where $M(x) = \alpha x + \beta$ is an affine transformation. Given a jump matrix $G$ defined on $\Omega^0$, we wish to truncate the contour when $\|G(k) - I\| < \epsilon$.

Algorithm 3.1.
Contour truncation
1. Fix $N_{\mathrm{grid}} > 0$ and consider the values
\[
\|G(k_0) - I\|,\ \|G(k_1) - I\|,\ \ldots,\ \|G(k_{M_{\mathrm{grid}}}) - I\|, \qquad k_j = M\!\left(\frac{2j}{M_{\mathrm{grid}}} - 1\right), \quad M_{\mathrm{grid}} = \lceil(|\alpha|+1)N_{\mathrm{grid}}\rceil.
\]
Here $\{k_j\}$ represents a uniform grid on $\Omega^0$ that increases in number of points as $\Omega^0$ increases in extent. There is a simple ordering on $\Omega^0$: we say $s < t$, $s, t \in \Omega^0$, if and only if $M^{-1}(s) < M^{-1}(t)$.

2. If $\|G(k_0) - I\| \geq \epsilon$, set $S = \{k_0\}$; otherwise set $S = \{\}$.

3. For $j = 0, \ldots, M_{\mathrm{grid}} - 1$: if $(\|G(k_j) - I\| - \epsilon)(\|G(k_{j+1}) - I\| - \epsilon) < 0$, indicating a sign change, add $(k_j + k_{j+1})/2$ to the set $S$.

4. If $S$ has an odd number of elements, add $k_{M_{\mathrm{grid}}}$ to $S$.

5. Order $S = \{t_1, \ldots, t_{2m}\}$ according to $t_j < t_{j+1}$. Replace $\Omega^0$ with the contours $\{\Omega^0_1, \ldots, \Omega^0_m\}$, where $\Omega^0_j$ is a straight line that connects $t_{2j-1}$ and $t_{2j}$.

Provided that $N_{\mathrm{grid}}$ is sufficiently large so that at most one zero of $\|G(k) - I\| - \epsilon$ lies between successive points $k_j$ and $k_{j+1}$, the algorithm produces a consistent truncation of the contour, in the sense that
\[
\|G(k) - I\| \leq \frac{2|\alpha|\, \sup_k \partial_k\|G(k) - I\|}{(|\alpha|+1)N_{\mathrm{grid}}} + \epsilon \quad \text{for } k \in \Omega^0 \setminus \{\Omega^0_1 \cup \cdots \cup \Omega^0_m\}.
\]
If $\|G(k) - I\|$ happens to oscillate wildly as $n$ increases, then $N_{\mathrm{grid}}$ may have to increase with respect to $n$. In practice, $\|G(k) - I\|$ is seen to either be monotonic or vary slowly; indeed, this is the purpose of the deformation of an RHP. In both cases, the algorithm produces a good truncation. As a product of this, contours are generally divided into at most three components after truncation, see Figure 9. The number of mapped Chebyshev points on $\Omega^0$ used to discretize the singular integral equation is used on each of the subdivided contours to ensure enough points.
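A sketch of Algorithm 3.1 for a single mapped contour (the function name and the scalar interface for $\|G(k)-I\|$ are our own; the real jump matrices are matrix-valued):

```python
import numpy as np

def truncate_contour(Gnorm, alpha, beta, eps=1e-14, Ngrid=200):
    """Sketch of Algorithm 3.1 on a contour M([-1,1]), M(x) = alpha*x + beta.

    Gnorm(k) returns ||G(k) - I|| at a contour point k; returns the kept
    segments (t_{2j-1}, t_{2j})."""
    Mgrid = int(np.ceil((abs(alpha) + 1) * Ngrid))
    kj = alpha * (2.0 * np.arange(Mgrid + 1) / Mgrid - 1.0) + beta
    vals = np.array([Gnorm(k) for k in kj]) - eps
    S = [kj[0]] if vals[0] >= 0 else []            # step 2
    for j in range(Mgrid):                         # step 3: sign changes
        if vals[j] * vals[j + 1] < 0:
            S.append(0.5 * (kj[j] + kj[j + 1]))
    if len(S) % 2 == 1:                            # step 4
        S.append(kj[-1])
    # step 5: pair off the crossings into kept segments
    return [(S[i], S[i + 1]) for i in range(0, len(S), 2)]

# Example: a jump decaying like exp(-k^2) on [-5, 5]; the kept region
# should be approximately |k| <= sqrt(log(1/eps)).
segs = truncate_contour(lambda k: np.exp(-k * k), alpha=5.0, beta=0.0, eps=1e-3)
```

For monotonically decaying jumps like this one, a single segment survives, as described above.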
In practice, this procedure ensures accuracy for all $n$ and $N$.

The RHP for $\Phi$, after scaling and truncation, is solved numerically. To avoid too much complication, we only describe the essence of the method [16] that is applied. This method discretizes (1.5) via collocation using an appropriately defined Chebyshev basis. As is described in [18], the scaling and truncation is sufficient to ensure that the number of collocation points needed to compute the solution accurately is bounded (with respect to $n$). Thus to solve the RHP for $\Phi$ the computational cost is $O(1)$. Once $\Phi$ is calculated, we can evaluate orthogonal polynomials pointwise by undoing the sequence of transformations to determine $Y$. However, we will see that the recurrence coefficients are also calculable directly.

For a given $V(x)$, we discuss how to accurately obtain $a_{n,N}$ and $b_{n,N}$. For large $z$, the solution $\Phi$ of the RHP in Figure 7 is given by
\[
\Phi(z) = T(z) N^{-1}(z) = \begin{pmatrix} e^{n\ell/2} & 0 \\ 0 & e^{-n\ell/2} \end{pmatrix} Y(z) \begin{pmatrix} e^{-n\ell/2} & 0 \\ 0 & e^{n\ell/2} \end{pmatrix} \begin{pmatrix} e^{-ng(z)} & 0 \\ 0 & e^{ng(z)} \end{pmatrix} N^{-1}(z).
\]

Figure 9: A demonstration of the resulting contours after the application of Algorithm 3.1 with $V(x) = x^2$ and $N = 1$: (a) $n = 20$, (b) $n = 24$, (c) $n = 30$.

For fixed $N$, we wish to compute the polynomials, orthogonal with respect to $e^{-NV(x)}\,dx$, of order $n = 0, 1, \ldots, N$. From [11] we know that
\[
b_{n,N} = (Y_1(n))_{12}\,(Y_1(n))_{21}, \qquad a_{n,N} = \frac{(Y_2(n))_{12}}{(Y_1(n))_{12}} - (Y_1(n))_{22},
\]
\[
Y(z) = \left( I + \frac{Y_1(n)}{z} + \frac{Y_2(n)}{z^2} + O\!\left(\frac{1}{z^3}\right) \right)\begin{pmatrix} z^n & 0 \\ 0 & z^{-n} \end{pmatrix}.
\]
It is inconvenient to compute $Y_2(n)$, so we look for alternate formulae for $a_{n,N}$ and $b_{n,N}$. From [2, p. 47],
\[
a_{n,N} = (Y_1(n))_{11} - (Y_1(n+1))_{11}, \qquad b_{n,N} = (Y_1(n))_{12}\,(Y_1(n))_{21}.
\]
Computing $Y_1(n+1)$ is more accurate than computing $Y_2(n)$.
It is also more efficient when the sequence $a_{j,N}$, $j = 0, 1, 2, \ldots$ is computed, as the computation can be reused at each step. To compute $Y_1(n)$ we use the formula
\[
I + \frac{Y_1(n)}{z} + O\!\left(\frac{1}{z^2}\right) = \begin{pmatrix} e^{-n\ell/2} & 0 \\ 0 & e^{n\ell/2} \end{pmatrix}\left(I + \frac{\Phi_1(n)}{z} + O\!\left(\frac{1}{z^2}\right)\right)\left(I + \frac{N_1}{z} + O\!\left(\frac{1}{z^2}\right)\right)\left(I + \begin{pmatrix} n g_1 & 0 \\ 0 & -n g_1 \end{pmatrix}\frac{1}{z} + O\!\left(\frac{1}{z^2}\right)\right)\begin{pmatrix} e^{n\ell/2} & 0 \\ 0 & e^{-n\ell/2} \end{pmatrix},
\]
where
\[
\Phi(z) = I + \frac{\Phi_1(n)}{z} + O\!\left(\frac{1}{z^2}\right), \qquad N(z) = I + \frac{N_1}{z} + O\!\left(\frac{1}{z^2}\right), \qquad g(z) = \log z + \frac{g_1}{z} + O\!\left(\frac{1}{z^2}\right).
\]
Therefore
\[
Y_1(n) = \begin{pmatrix} e^{-n\ell/2} & 0 \\ 0 & e^{n\ell/2} \end{pmatrix}\left(\Phi_1(n) + N_1 + \begin{pmatrix} n g_1 & 0 \\ 0 & -n g_1 \end{pmatrix}\right)\begin{pmatrix} e^{n\ell/2} & 0 \\ 0 & e^{-n\ell/2} \end{pmatrix}.
\]
Here $N_1$ can be computed explicitly in terms of the endpoints of the support of the equilibrium measure, and $\Phi_1$ is determinable via a contour integral once the RHP for $\Phi$ is solved. Recall, the method in [16] returns $U$ such that
\[
I + \frac{1}{2\pi i}\int_\Gamma \frac{U(s)}{s - z}\,ds \approx \Phi(z).
\]
Then
\[
\Phi_1(n) \approx -\frac{1}{2\pi i}\int_\Gamma U(s)\,ds, \qquad g_1 = -\int_a^b x\,d\mu(x).
\]
When $V(x)$ is polynomial, the computational cost to compute any given recurrence coefficient is $O(1)$. When $V(x)$ is entire, but non-polynomial, we obtain conjectures on the computational cost. A full discussion of this is found in Appendix A.

3.6 Scaling for non-polynomial weights

The method in the previous sections fails for non-polynomial $Q$, for example $Q(x) = \cosh(x)$. For polynomial $Q$, the effect of $Q(x) \mapsto V_N(x)$ is to force the support of the equilibrium measure to be $O(1)$. Thus, when $Q(x)$ grows faster than any polynomial, it is no longer clear a priori what the correct scaling is. We look for constants $\alpha_N > 0$ and $\beta_N$ such that the equilibrium measure for
\[
V_N(x) = N^{-1} Q(\alpha_N x + \beta_N), \qquad N > 0,
\]
has support $[-1, 1]$; $\alpha_N$ and $\beta_N$ are determined numerically, and if $Q$ is symmetric then $\beta_N = 0$.
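One plausible way to compute $\alpha_N$ in the symmetric case (a sketch under our own assumptions, not necessarily the authors' exact procedure): impose the standard normalization condition $\frac{1}{2\pi}\int_{-1}^{1} t\, V_N'(t)\,(1-t^2)^{-1/2}\,dt = 1$ for an equilibrium measure supported on $[-1, 1]$, discretize it with Gauss–Chebyshev quadrature, and solve for $\alpha_N$ by bisection:

```python
import numpy as np

def mrs_alpha(Qp, N, lo=1e-3, hi=100.0, M=400):
    """Find alpha_N (beta_N = 0 for symmetric Q) so that the equilibrium
    measure of V_N(x) = Q(alpha_N * x)/N is supported on [-1, 1], via the
    normalization condition
        (1/(2 pi)) int_{-1}^{1} t V_N'(t) / sqrt(1 - t^2) dt = 1.
    Qp is Q'.  Gauss-Chebyshev quadrature: (pi/M) sum f(t_j) approximates
    int f(t)/sqrt(1-t^2) dt at the nodes t_j below."""
    t = np.cos((2 * np.arange(1, M + 1) - 1) * np.pi / (2 * M))
    def F(alpha):
        return np.sum(t * alpha * Qp(alpha * t)) / (2.0 * M * N) - 1.0
    for _ in range(100):          # bisection: F is increasing in alpha
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if F(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Check against the polynomial case: Q(x) = x^2 gives alpha_N = sqrt(2N).
alpha = mrs_alpha(lambda x: 2.0 * x, N=8.0)
```

For $Q(x) = \cosh(x)$ the same routine can be called with `Qp = np.sinh`; the resulting $\alpha_N$ grows only logarithmically in $N$, reflecting the super-polynomial growth of $Q$.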
This indicates that the choice m = log(N)/log(α_N) (m = 1 when N = 1) in the previous section is the correct choice for exponentially growing potentials.

Remark 3.1.
The resulting method is very stable to small perturbations of α_N and β_N. Indeed, any value of α_N or β_N such that the support of the equilibrium measure is O(1) is an acceptable scaling.

Remark 3.2.
The constants α_N and β_N provide, without requiring the solution of any RHP, the large-n asymptotic behavior of the coefficients of the unscaled Jacobi operator; see [13, Theorem 15.2].

We have demonstrated a new, stable and accurate algorithm for computing the entries of Jacobi operators associated with polynomials orthogonal on R with respect to analytic, exponential weights. Importantly, the computational cost to compute the first N rows of the Jacobi operator is O(N) when V(x) is a polynomial. We estimate the computational cost to be O(N log N) when V(x) is entire, for our examples. As a trivial consequence, computing the coefficients in the classical three-term recurrence relation for these polynomials allows us to compute the polynomials themselves. Applying the Golub–Welsch algorithm, we compute Gaussian quadrature weights and nodes accurately.

This approach can be extended to measures supported on a single interval, using the RHP derived in [12]. This setting is simplified for fixed weights, as the equilibrium measure does not depend on the weight: it is always an arcsine distribution. Another extension is to weights whose equilibrium measure is supported on multiple intervals, see [5]. The numerical approach for equilibrium measures is applicable to the multiple-interval setting [14]; however, the Newton iteration underlying the determination of the support can have difficulties at transition points, where the number of intervals of support grows or shrinks.

Computational cost
The arguments presented in [18] demonstrate that once g(z) is computed, the computational cost required to construct and solve the RHP is bounded. Therefore, it is necessary to understand the computational cost associated with computing the equilibrium measure for varying N and n, and when V depends on N. The method in [14] relies on two approximations. The first is the representation of (N/n)V'(x) in a Chebyshev series. The second is the convergence of Newton's method to determine the support of the equilibrium measure. Thus the computational cost behaves like O(Kk + k log k), where K is the number of Newton iterations and k is the number of coefficients in the Chebyshev series that are needed to approximate V'(x) to machine precision.

In practice, we see that K = O(1). By the chosen scalings, the endpoints of the equilibrium measure are O(1) and an initial guess of [−1,
1] is sufficient. Additionally, one could use the output from a different N or n value as an initial guess to provide a minor improvement in the computation time.

When V is itself an mth-order polynomial, it follows that k = m − 1, and the full computation to compute g(z), construct the RHP and solve the RHP is truly O(1). This issue is much more delicate when considering, say, V_N(x) = x^2 + N^{−1} sin(√N x) (which corresponds to a normalized V(x) = x^2 + sin(x)) or V_N(x) = N^{−1} cosh(d_N x) (which corresponds to a normalized V(x) = cosh(x)). In the first case, V_N'''(x) is unbounded, which forces k to increase with respect to N. In both of these examples, we observe experimentally that k = O(log N), implying that the computational cost for V(x) = x^2 + sin(x) is O(N log N). It is also observed that it takes a bounded amount of time to compute d_N and a bounded number of Chebyshev coefficients to compute the equilibrium measure for V_N(x) = N^{−1} cosh(d_N x), and therefore it is conjectured that the full computation (computing g(z), constructing the RHP and solving the RHP) is O(1) in this case. Further details are beyond the scope of this paper.

References

[1] T. Claeys, I. Krasovsky, and A. Its. Higher-order analogues of the Tracy–Widom distribution and the Painlevé II hierarchy.
Comm. Pure Appl. Math., 63:362–412, 2010.

[2] P. Deift. Orthogonal Polynomials and Random Matrices: a Riemann–Hilbert Approach. AMS, 2000.

[3] P. Deift, T. Kriecherbauer, K. T.-R. McLaughlin, S. Venakides, and X. Zhou. Asymptotics for polynomials orthogonal with respect to varying exponential weights. Internat. Math. Res. Notices, 16:759–782, 1997.

[4] P. Deift, T. Kriecherbauer, K. T.-R. McLaughlin, S. Venakides, and X. Zhou. Strong asymptotics of orthogonal polynomials with respect to exponential weights. Comm. Pure Appl. Math., 52(12):1491–1552, 1999.

[5] P. Deift, T. Kriecherbauer, K. T.-R. McLaughlin, S. Venakides, and X. Zhou. Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure Appl. Math., 52(11):1335–1425, 1999.

[6] P. Erdős. On the distribution of roots of orthogonal polynomials. In G. Alexits et al., editor, Proceedings of the Conference on the Constructive Theory of Functions, pages 145–150. Akademiai Kiado, Budapest, 1972.

[7] A. S. Fokas, A. R. Its, and A. V. Kitaev. The isomonodromy approach to matrix models in 2D quantum gravity. Comm. Math. Phys., 147(2):395–430, 1992.

[8] W. Gautschi. Orthogonal Polynomials: Applications and Computation. Oxford University Press, 2004.

[9] A. Glaser, X. Liu, and V. Rokhlin. A fast algorithm for the calculation of the roots of special functions. SIAM J. Sci. Comput., 29(4):1420–1438, 2007.

[10] N. Hale and A. Townsend. Fast and accurate computation of Gauss–Legendre and Gauss–Jacobi quadrature nodes and weights. SIAM J. Sci. Comput., 35(2):A652–A674, 2013.

[11] A. B. J. Kuijlaars and P. M. J. Tibboel. The asymptotic behaviour of recurrence coefficients for orthogonal polynomials with varying exponential weights. J. Comput. Appl. Math., 233(3):775–785, 2009.

[12] A. B. J. Kuijlaars, K. T.-R. McLaughlin, W. Van Assche, and M. Vanlessen. The Riemann–Hilbert approach to strong asymptotics for orthogonal polynomials on [−1, 1]. Advances in Mathematics, 188(2):337–398, 2004.

[13] A. L. Levin and D. S. Lubinsky. Orthogonal Polynomials for Exponential Weights, volume 4. Springer, 2001.

[14] S. Olver. Computation of equilibrium measures. J. Approx. Theory, 163:1185–1207, 2011.

[15] S. Olver. Numerical solution of Riemann–Hilbert problems: Painlevé II. Found. Comput. Math., 11:153–179, 2011.

[16] S. Olver. A general framework for solving Riemann–Hilbert problems numerically. Numer. Math., 122:305–340, 2012.

[17] S. Olver and T. Trogdon. Nonlinear steepest descent and the numerical solution of Riemann–Hilbert problems. Comm. Pure Appl. Math., 2012. To appear.

[18] S. Olver and T. Trogdon. Numerical solution of Riemann–Hilbert problems: random matrix theory and orthogonal polynomials. Constr. Approx., 2013. To appear.

[19] T. Trogdon.