Algorithms for the computation of the matrix logarithm based on the double exponential formula
Fuminori Tatsuoka, Tomohiro Sogabe, Yuto Miyatake, Shao-Liang Zhang
August 5, 2019
Abstract
We consider the computation of the matrix logarithm by using numerical quadrature. The efficiency of numerical quadrature depends on the integrand and the choice of quadrature formula. The Gauss–Legendre quadrature has been conventionally employed; however, the convergence could be slow for ill-conditioned matrices. This effect may stem from the rapid change of the integrand values. To avoid such situations, we focus on the double exponential formula, which has been developed to address integrands with endpoint singularity. In order to utilize the double exponential formula, we must determine a suitable finite integration interval that provides the required accuracy and efficiency. In this paper, we present a method for selecting a suitable finite interval based on an error analysis, as well as two algorithms, one of which is an adaptive quadrature algorithm.
1 Introduction

A logarithm of A ∈ R^{n×n} is defined as any matrix X such that exp(X) = A, where exp(X) := I + X + X^2/2! + X^3/3! + ··· [10, p. 269]. If all eigenvalues of A lie in the set C \ (−∞, 0], there is a unique logarithm of A whose eigenvalues all lie in the strip {z ∈ C : −π < Im(z) < π} [10, Thm. 1.31]. This logarithm is called the principal logarithm of A and is denoted by log(A). Throughout this paper, we assume that all eigenvalues of A lie in the set C \ (−∞, 0], and we consider the principal logarithm of A.

The matrix logarithm is utilized in many fields of research, such as quantum mechanics [16], quantum chemistry [9], biomolecular dynamics [11], buckling simulation [14], and machine learning [8, 12, 6, 7]. The computational methods include the inverse scaling and squaring (ISS) algorithm [1], an algorithm based on the arithmetic-geometric mean (AGM) iteration [3], and numerical quadrature. In this paper, we focus on numerical quadrature, which employs the following integral representation (see e.g. [10, Thm. 11.1]):

    log(A) = (A − I) ∫_0^1 [t(A − I) + I]^{-1} dt.    (1)

∗ Department of Applied Physics, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan, [email protected]
† Department of Applied Physics, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan, [email protected]
‡ Cybermedia Center, Osaka University, 1-32 Machikaneyama, Toyonaka, Osaka 560-0043, Japan, [email protected]
§ Department of Applied Physics, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan, [email protected]

We focus on numerical quadrature for two reasons. First, numerical quadrature can make use of the sparsity of A if A is sparse. This is useful when computing the product of the matrix logarithm and a vector, log(A)b (b ∈ R^n), which appears in applications such as [8, 6, 7], without computing and storing dense matrices.
Conversely, the ISS algorithm and the algorithm based on the AGM iteration include the computation of matrix square roots, which means that the calculation involves dense matrices even if A is sparse. The second reason is that numerical quadrature is potentially more favorable for parallel computers, because the integrand can be evaluated independently at each abscissa.

Because the integrand in (1) includes a matrix inversion, the computational cost of numerical quadrature depends on the number of evaluations of the integrand. Although numerical quadrature is suitable for parallelization, the quadrature formula should be selected carefully to reduce the computational cost and save computational resources.

The method conventionally used to compute (1) is the Gauss–Legendre (GL) quadrature. If the spectral radius of A − I is smaller than 1, the GL quadrature, which can be regarded as a rational approximation of log(A), coincides with the Padé approximants of log(A) at I [5, Thm. 4.3]. Therefore, it is natural to use the GL quadrature to reduce the number of abscissas when A is close to I. However, the convergence of the GL quadrature becomes slow when A is not close to I. For example, the convergence in our experiments became slow when A was ill-conditioned, which may be explained by rapid changes in the value of the integrand as the abscissa approaches the endpoint of the interval.

In this paper, we consider the double exponential (DE) formula [15], which can be used to compute integrals with singularities at one (or both) of the endpoints. For this reason, the DE formula may be useful in scenarios in which the GL quadrature does not perform well.
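To make the representation (1) concrete, the following sketch (a Python/NumPy illustration, not the implementation used in this paper) applies the m-point GL quadrature, with the nodes mapped from [−1, 1] to [0, 1], and compares the result against SciPy's logm; the diagonal test matrix is a hypothetical example:

```python
import numpy as np
from scipy.linalg import logm, solve

def log_gl(A, m):
    """m-point Gauss-Legendre approximation of
    log(A) = (A - I) int_0^1 [t(A - I) + I]^{-1} dt."""
    n = A.shape[0]
    I = np.eye(n)
    x, w = np.polynomial.legendre.leggauss(m)  # nodes/weights on [-1, 1]
    t, w = (x + 1) / 2, w / 2                  # map to [0, 1]
    S = sum(wi * solve(ti * (A - I) + I, I) for ti, wi in zip(t, w))
    return (A - I) @ S

A = np.diag([0.5, 1.0, 2.0])  # eigenvalues off (-inf, 0]
err = np.linalg.norm(log_gl(A, 20) - logm(A)) / np.linalg.norm(logm(A))
print(err)  # tiny for this well-conditioned A
```

For a well-conditioned A like this one, a modest number of nodes already suffices; the experiments in Section 3 illustrate how the convergence degrades as the condition number grows.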
However, when using the DE formula, a finite interval needs to be selected, because the integrand in (1) is transformed into a corresponding function on the infinite interval. This selection is important: if the finite interval is too narrow, the accuracy of the computational result becomes low, but if it is too wide, the convergence of the DE formula becomes slow.

By performing an error analysis, we provide a method of selecting an appropriate finite interval, as well as two algorithms for the computation of log(A) based on the m-point DE formula.

The remainder of this paper is organized as follows: in Section 2, we present an error analysis and propose two algorithms; in Section 3, we show the results of numerical experiments; in Section 4, we conclude the study.

Notation:
Unless otherwise stated, ‖·‖ denotes a consistent matrix norm, e.g., the p-norm or the Frobenius norm, and ‖·‖₂ and ‖·‖_F denote the 2-norm and the Frobenius norm, respectively. The inverse functions of sinh and tanh are referred to as asinh and atanh, respectively.

2 Computation of log(A) based on the DE formula

In this section, we propose a method of selecting a finite interval for the DE formula by estimating the interval truncation error, and we present two algorithms. Before considering the truncation error, let us apply the following transformations to (1). By substituting u = 2t − 1, we obtain

    log(A) = (A − I)/2 ∫_{−1}^1 [((1 + u)/2)(A − I) + I]^{-1} du.    (2)

Then, by applying the DE transformation u = tanh(sinh(x)), it follows that

    log(A) = (A − I) ∫_{−∞}^{∞} F_DE(x) dx,    (3)

where

    F_DE(x) := (1/2) cosh(x) sech^2(sinh(x)) [((1 + tanh(sinh(x)))/2)(A − I) + I]^{-1}.

(Footnote: In the three applications, the computation of log(det A) is performed based on some connections between log(A)b and log(det A). For more details, see, e.g., [13].)

Note that ((1 + tanh(sinh(x)))/2)(A − I) + I in the integrand F_DE is nonsingular for any x ∈ R. In Subsection 2.1, we derive an upper bound on the error between the integral in (3) and the same integral defined on the finite interval [l, r],

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖.    (4)

In Subsection 2.2, we propose a method of selecting the interval [l, r] so that the relative truncation error is smaller than or approximately equal to a given tolerance ε. Our algorithms are described in Subsection 2.3.

2.1 An upper bound on the interval truncation error

The error that stems from the interval truncation (4) can be rewritten as

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖ = ‖ (A − I) [ ∫_{−∞}^l F_DE(x) dx + ∫_r^∞ F_DE(x) dx ] ‖.    (5)

Using the triangle inequality on the right-hand side (RHS) of (5), it holds that

    ‖ (A − I) [ ∫_{−∞}^l F_DE(x) dx + ∫_r^∞ F_DE(x) dx ] ‖
        ≤ ‖ (A − I) ∫_{−∞}^l F_DE(x) dx ‖ + ‖ (A − I) ∫_r^∞ F_DE(x) dx ‖.    (6)

By estimating the RHS of (6), we obtain an upper bound on (4).

First, we focus on the first term on the RHS of (6). To avoid cumbersome notation, instead of F_DE, which includes hyperbolic functions, we consider the integrand of (1). By applying the transformation t = [tanh(sinh(x)) + 1]/2, we have

    ‖ (A − I) ∫_{−∞}^l F_DE(x) dx ‖ = ‖ (A − I) ∫_0^a F(t) dt ‖,    (7)

where

    F(t) := [t(A − I) + I]^{-1},    a := [tanh(sinh(l)) + 1]/2.

The following lemma gives an upper bound on the RHS of (7) when a is small enough to warrant the use of the Neumann series expansion of F(t).

Lemma 2.1.
Suppose that A ≠ I. Then, for a ∈ (0, 1/(2‖A − I‖)], we have

    ‖ (A − I) ∫_0^a F(t) dt ‖ ≤ (3a/2) ‖A − I‖.    (8)

Proof. For all t ∈ [0, a] with a ∈ (0, 1/(2‖A − I‖)], it holds that ‖t(I − A)‖ ≤ 1/2 (< 1). By applying the Neumann series expansion to F(t), we get

    F(t) = [t(A − I) + I]^{-1} = Σ_{k=0}^∞ t^k (I − A)^k.

Hence, the integral of F can be rewritten as

    (A − I) ∫_0^a F(t) dt = (A − I) ∫_0^a [ Σ_{k=0}^∞ t^k (I − A)^k ] dt
        = (A − I) Σ_{k=0}^∞ (a^{k+1}/(k + 1)) (I − A)^k
        = a(A − I) + a(A − I) Σ_{k=1}^∞ [ (a^k/(k + 1)) (I − A)^k ].

By using the triangle inequality and the consistency of the norm, we get

    ‖ (A − I) ∫_0^a F(t) dt ‖ ≤ a‖A − I‖ + a‖A − I‖ Σ_{k=1}^∞ [ (1/(k + 1)) ‖a(A − I)‖^k ]
        ≤ a‖A − I‖ + a‖A − I‖ Σ_{k=1}^∞ [ (1/2)(1/2)^k ]
        = (3a/2) ‖A − I‖.  □

The calculation to estimate the second term on the RHS of (6) is similar to that for the first term. By applying the transformation t = [tanh(sinh(x)) + 1]/2, we get

    ‖ (A − I) ∫_r^∞ F_DE(x) dx ‖ = ‖ (A − I) ∫_b^1 F(t) dt ‖,    (9)

where b = [tanh(sinh(r)) + 1]/2. The following lemma gives an upper bound on the RHS of (9) when b is close enough to 1 to warrant the use of the Neumann series expansion of F(t).

Lemma 2.2.
For b ∈ [2‖A^{-1}‖/(2‖A^{-1}‖ + 1), 1), we have

    ‖ (A − I) ∫_b^1 F(t) dt ‖ ≤ ( −log(b) + (1 − b)/(2b) ) ‖A − I‖ ‖A^{-1}‖.    (10)

Proof. The outline of this proof is similar to that of Lemma 2.1. For all t ∈ [b, 1] with b ∈ [2‖A^{-1}‖/(2‖A^{-1}‖ + 1), 1), it holds that

    ‖ ((t − 1)/t) A^{-1} ‖ ≤ 1/2 (< 1).

By applying the Neumann series expansion to F(t), we get

    F(t) = [t(A − I) + I]^{-1} = (1/t) A^{-1} Σ_{k=0}^∞ ((t − 1)/t)^k A^{-k}.

Therefore, the integral of F can be rewritten as

    (A − I) ∫_b^1 F(t) dt = (A − I) ∫_b^1 [ (1/t) A^{-1} Σ_{k=0}^∞ ((t − 1)/t)^k A^{-k} ] dt
        = (A − I) A^{-1} [ −log(b) I + Σ_{k=1}^∞ A^{-k} ∫_b^1 (1/t) ((t − 1)/t)^k dt ].

By using the triangle inequality and the consistency of the norm, it follows that

    ‖ (A − I) ∫_b^1 F(t) dt ‖ ≤ ‖A − I‖ ‖A^{-1}‖ { |−log(b)| + Σ_{k=1}^∞ [ ‖A^{-1}‖^k ∫_b^1 (1/t) |(t − 1)/t|^k dt ] }.    (11)

In (11), it holds that |−log(b)| = −log(b) because b ∈ [2‖A^{-1}‖/(2‖A^{-1}‖ + 1), 1), and |(t − 1)/t| = (1 − t)/t because t ∈ [b, 1]. For the second term in the braces on the RHS of (11), we have

    Σ_{k=1}^∞ [ ‖A^{-1}‖^k ∫_b^1 (1/t) ((1 − t)/t)^k dt ]
        ≤ Σ_{k=1}^∞ [ ‖A^{-1}‖^k ∫_b^1 (1/b) ((1 − t)/b)^k dt ]
        = ((1 − b)/b) Σ_{k=1}^∞ [ (1/(k + 1)) ‖ ((1 − b)/b) A^{-1} ‖^k ]
        ≤ ((1 − b)/b) Σ_{k=1}^∞ [ (1/2)(1/2)^k ]
        = (1 − b)/(2b).    (12)

By substituting (12) into (11), we obtain the inequality (10). □

In the final part of this subsection, we estimate the upper bound on (4).
Proposition 2.1.
Suppose that A ≠ I. For a given interval [l, r], let a = [tanh(sinh(l)) + 1]/2 and b = [tanh(sinh(r)) + 1]/2. Then, if a ≤ 1/(2‖A − I‖) and b ≥ 2‖A^{-1}‖/(2‖A^{-1}‖ + 1), it holds that

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖ ≤ (3a/2)‖A − I‖ + ( −log(b) + (1 − b)/(2b) )‖A − I‖‖A^{-1}‖.    (13)

Proof. By combining (5) and (6), and substituting (8) and (10) into (6), we get (13). □
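The bound (13) can be sanity-checked numerically. The sketch below (an illustration under the constants reconstructed above, with a hypothetical diagonal test matrix) uses a deliberately fine trapezoidal mesh on a fixed interval [l, r] = [−2, 2], so that the discretization error is negligible and the remaining discrepancy from log(A) is essentially the interval truncation error:

```python
import numpy as np
from scipy.linalg import logm, solve

def f_de(A, x):
    """DE-transformed integrand F_DE(x) from (3), 1/2 factor folded in."""
    I = np.eye(A.shape[0])
    u = np.tanh(np.sinh(x))
    w = 0.5 * np.cosh(x) / np.cosh(np.sinh(x)) ** 2
    return w * solve((1 + u) / 2 * (A - I) + I, I)

A = np.diag([0.5, 1.0, 2.0])
I = np.eye(3)
l, r, m = -2.0, 2.0, 801           # fine mesh: discretization error negligible
h = (r - l) / (m - 1)
T = (f_de(A, l) + f_de(A, r)) / 2 + sum(f_de(A, l + i * h) for i in range(1, m - 1))
T *= h
lhs = np.linalg.norm(logm(A) - (A - I) @ T, 2)   # interval truncation error

a = (np.tanh(np.sinh(l)) + 1) / 2                # assumptions of Proposition 2.1
b = (np.tanh(np.sinh(r)) + 1) / 2
nAI = np.linalg.norm(A - I, 2)
nAinv = np.linalg.norm(np.linalg.inv(A), 2)
assert a <= 1 / (2 * nAI) and b >= 2 * nAinv / (2 * nAinv + 1)
rhs = 1.5 * a * nAI + (-np.log(b) + (1 - b) / (2 * b)) * nAI * nAinv
print(lhs <= rhs)  # the truncation error respects the bound (13)
```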
2.2 Selection of the finite interval

To develop algorithms for computing log(A) based on the DE formula, we need to determine an appropriate finite integration interval [l, r] in advance. Ideally, the finite interval should be set so that the relative error is guaranteed to be smaller than or equal to a given tolerance ε > 0, i.e.,

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖ / ‖log(A)‖ ≤ ε.

To accomplish this, a lower bound on ‖log(A)‖ must be estimated. The following lemma shows a lower bound in terms of the spectral radius of A.

Lemma 2.3.
Let ρ(·) be the spectral radius, i.e., the largest absolute value of the eigenvalues. Then, the following two inequalities hold:

    ‖log(A)‖ ≥ |log(ρ(A))|,    (14)
    ‖log(A)‖ ≥ |log(ρ(A^{-1}))|.    (15)

Proof. By using the consistency of the norm and the fact that any eigenvalue of log(A) is equal to log(λ) for some eigenvalue λ of A, we obtain the lower bound (14) as ‖log(A)‖ ≥ ρ(log(A)) ≥ |log(ρ(A))|. The lower bound (15) can be obtained in a similar way. □
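When all eigenvalues are available, the lower bound of Lemma 2.3 is a one-liner. The sketch below (illustrative only, with a hypothetical non-normal example) checks it against ‖log(A)‖₂:

```python
import numpy as np
from scipy.linalg import logm

def log_norm_lower_bound(A):
    """max of the lower bounds (14) and (15): theta <= ||log(A)||."""
    lam = np.linalg.eigvals(A)
    rho = max(abs(lam))               # spectral radius of A
    rho_inv = max(abs(1 / lam))       # spectral radius of A^{-1}
    return max(abs(np.log(rho)), abs(np.log(rho_inv)))

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])           # non-normal: ||log(A)|| > rho(log(A))
theta = log_norm_lower_bound(A)
print(theta <= np.linalg.norm(logm(A), 2))  # True: theta is a valid lower bound
```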
In the following proposition, we show how to set a finite interval such that the relative truncation error in the 2-norm is smaller than or equal to a given tolerance ε > 0.

Proposition 2.2. Suppose that A ≠ I. Let θ be a lower bound on ‖log(A)‖₂, let the tolerance ε > 0 satisfy

    ε < 3‖A − I‖₂‖A^{-1}‖₂ / (θ(1 + 4‖A^{-1}‖₂)),

and let s be a real positive solution of the equation

    (1/θ)( −log(s) + (1 − s)/(2s) )‖A − I‖₂‖A^{-1}‖₂ = ε/2.    (16)

If

    l := asinh(atanh(2a − 1)),    r := asinh(atanh(2b − 1)),

where

    a := min{ θε/(3‖A − I‖₂), 1/(2‖A − I‖₂) },    b := max{ s, 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1) },    (17)

then it holds that

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖₂ / ‖log(A)‖₂ ≤ ε.    (18)

Proof.
From the definition of a and b, it holds that a ≤ 1/(2‖A − I‖₂) and b ≥ 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1). In addition, from

    b − a ≥ 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1) − θε/(3‖A − I‖₂)
        > 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1) − ‖A^{-1}‖₂/(1 + 4‖A^{-1}‖₂)
        = ‖A^{-1}‖₂(1 + 6‖A^{-1}‖₂) / [(2‖A^{-1}‖₂ + 1)(1 + 4‖A^{-1}‖₂)]
        > 0,

it follows that a < b, and hence l < r. Therefore, we can choose a finite interval [l, r] that satisfies the assumptions of Proposition 2.1. Dividing the inequality (13) by ‖log(A)‖₂ ≥ θ, it follows that

    ‖ log(A) − (A − I) ∫_l^r F_DE(x) dx ‖₂ / ‖log(A)‖₂
        ≤ (3a/(2θ))‖A − I‖₂ + (1/θ)( −log(b) + (1 − b)/(2b) )‖A − I‖₂‖A^{-1}‖₂.    (19)

From the definition of a and b, it holds that a ≤ θε/(3‖A − I‖₂) and b ≥ s. Therefore,

    (3a/(2θ))‖A − I‖₂ ≤ ε/2,    (1/θ)( −log(b) + (1 − b)/(2b) )‖A − I‖₂‖A^{-1}‖₂ ≤ ε/2.    (20)

We obtain (18) by substituting (20) into (19). □

2.3 Algorithms

In this subsection, we establish two algorithms based on the results of Subsections 2.1 and 2.2. One algorithm is designed to compute log(A) using the m-point DE formula on a finite interval whose interval truncation error is smaller than or approximately equal to a given tolerance ε > 0. The other is an adaptive quadrature algorithm designed to compute log(A) by automatically adding abscissas until the error is smaller than or approximately equal to a given tolerance ζ > 0.

When the tolerance ε given in Proposition 2.2 is sufficiently small, a linear approximation to the nonlinear equation (16) can be used to determine an appropriate interval. We describe this calculation in detail below. Suppose that ε is sufficiently small and that the solution s of (16) is approximately equal to 1. Then, because (3/2)(1 − s) is the linearization of −log(s) + (1 − s)/(2s) at s = 1, the solution s can be approximated by the solution s̃ of the following equation:

    (3/(2θ))(1 − s̃)‖A − I‖₂‖A^{-1}‖₂ = ε/2.    (21)

The solution of (21) is given by

    s̃ = 1 − θε/(3‖A − I‖₂‖A^{-1}‖₂).

Under the assumptions of Proposition 2.2, by choosing b̃ as

    b̃ = max{ s̃, 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1) }

instead of (17), and setting r̃ = asinh(atanh(2b̃ − 1)), the interval truncation error will be smaller than or approximately equal to ε. The first algorithm, which computes log(A) using the m-point DE formula with an interval truncation error smaller than or approximately equal to ε, is summarized as Algorithm 1.

Algorithm 1
Computation of log(A) based on the DE formula.
Input: A ∈ R^{n×n}, m ∈ N, ε > 0
Output: X ≈ log(A)
1: Set F_DE(x) = (1/2) cosh(x) sech^2(sinh(x)) [((1 + tanh(sinh(x)))/2)(A − I) + I]^{-1}
2: Compute ‖A − I‖₂, ‖A^{-1}‖₂, ρ(A)
3: θ = |log(ρ(A))|
4: ε_max = 3‖A − I‖₂‖A^{-1}‖₂ / (θ(1 + 4‖A^{-1}‖₂))
5: if ε ≥ ε_max then
6:   ε ← ε_max/2
7: end if
8: a = min{ θε/(3‖A − I‖₂), 1/(2‖A − I‖₂) }
9: b = max{ 1 − θε/(3‖A − I‖₂‖A^{-1}‖₂), 2‖A^{-1}‖₂/(2‖A^{-1}‖₂ + 1) }
10: l = asinh(atanh(2a − 1))
11: r = asinh(atanh(2b − 1))
12: h = (r − l)/(m − 1)
13: T = (h/2)(F_DE(l) + F_DE(r)) + h Σ_{i=1}^{m−2} F_DE(l + ih)
14: X = (A − I) T

When ε is sufficiently small, an accurate computation of ‖I − A‖₂, ‖A^{-1}‖₂, and ρ(A) at Step 2 of Algorithm 1 may not be required, because the errors that stem from these values have little effect on the accuracy of log(A). We give more detail in the following paragraph.

Assume that ε in Algorithm 1 is sufficiently small, and that a and b in Steps 8 and 9 are chosen as

    a = θε/(3‖A − I‖₂),    b = 1 − θε/(3‖A − I‖₂‖A^{-1}‖₂),

where θ is a lower bound on ‖log(A)‖₂. Let ∆₁ and ∆₂ be the relative errors of θ/‖A − I‖₂ and 1/‖A^{-1}‖₂, respectively. Then, the computed value of a is equal to

    (ε/3)(θ/‖A − I‖₂)(1 + ∆₁) = (θε/(3‖A − I‖₂))(1 + ∆₁),

and the computed value of b is equal to

    1 − (ε/3)(θ/‖A − I‖₂)(1 + ∆₁)(1/‖A^{-1}‖₂)(1 + ∆₂) = 1 − (θε/(3‖A − I‖₂‖A^{-1}‖₂))(1 + ∆₁ + ∆₂ + ∆₁∆₂).

Therefore, the upper bound on the truncation error computed in the presence of the relative errors ∆₁, ∆₂ is almost equal to the upper bound on the truncation error when the tolerance is set to ε(1 + ∆₁ + ∆₂ + ∆₁∆₂). For example, when ∆₁ = ∆₂ = 10^{-2}, we have ε(1 + ∆₁ + ∆₂ + ∆₁∆₂) ≈ 1.02ε, which means that the upper bound on the truncation error changes by approximately 2%. If ε is sufficiently small, the effect of these errors will be negligible.

At Step 3, a lower bound on ‖log(A)‖₂ is computed based on (14). By setting θ = max{ |log(ρ(A))|, |log(ρ(A^{-1}))| }, a tighter lower bound can be obtained. In particular, when A is positive definite, |log(ρ(A^{-1}))| can be obtained without additional computation, because ρ(A^{-1}) = ‖A^{-1}‖₂ is already computed at Step 2.

The computational cost of Algorithm 1 for dense A is O(mn^3), since each abscissa requires the inversion of an n × n matrix. When A is sparse, evaluating log(A)b using Algorithm 1 has computational cost m·c_abscissa + c_mul + c_param, where c_abscissa is the cost of computing F_DE(x)b, c_mul is the cost of a matrix–vector multiplication, and c_param is the cost of computing the parameters ρ(A), ‖A − I‖₂, and ‖A^{-1}‖₂. If the parameters are computed approximately and F_DE(x)b is computed accurately, then c_param will be smaller than c_abscissa. Therefore, the computational cost largely depends on m.

Once an appropriate finite interval is obtained, the accuracy of the DE formula can be improved by the following procedure. Let m be the number of abscissas, h = (r − l)/(m − 1) the mesh size, and T(h) the trapezoidal rule with mesh size h:

    T(h) := (h/2)(F_DE(l) + F_DE(r)) + h Σ_{i=1}^{m−2} F_DE(l + ih).

Then, T(h/2) can be computed by reusing T(h):

    T(h/2) = (1/2)T(h) + (h/2) Σ_{i=1}^{m−1} F_DE(l + (2i − 1)h/2).
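Putting the interval selection and the trapezoidal rule together, Algorithm 1 can be sketched as follows (a Python illustration of the reconstructed formulas above, with θ taken from (14); the paper's own implementation is in Julia, and the diagonal test matrix is a hypothetical example):

```python
import numpy as np
from scipy.linalg import logm, solve

def log_de(A, m, eps):
    """Algorithm 1 sketch: m-point DE (trapezoidal) approximation of
    log(A) on the interval [l, r] selected from the tolerance eps."""
    n = A.shape[0]
    I = np.eye(n)
    nAI = np.linalg.norm(A - I, 2)
    nAinv = np.linalg.norm(np.linalg.inv(A), 2)
    rho = max(abs(np.linalg.eigvals(A)))
    theta = abs(np.log(rho))                      # lower bound (14)
    eps_max = 3 * nAI * nAinv / (theta * (1 + 4 * nAinv))
    if eps >= eps_max:
        eps = eps_max / 2
    a = min(theta * eps / (3 * nAI), 1 / (2 * nAI))
    b = max(1 - theta * eps / (3 * nAI * nAinv),
            2 * nAinv / (2 * nAinv + 1))
    l = np.arcsinh(np.arctanh(2 * a - 1))
    r = np.arcsinh(np.arctanh(2 * b - 1))

    def f_de(x):                                  # integrand of (3)
        u = np.tanh(np.sinh(x))
        w = 0.5 * np.cosh(x) / np.cosh(np.sinh(x)) ** 2
        return w * solve((1 + u) / 2 * (A - I) + I, I)

    h = (r - l) / (m - 1)
    T = (f_de(l) + f_de(r)) / 2 + sum(f_de(l + i * h) for i in range(1, m - 1))
    return (A - I) @ (h * T)

A = np.diag([0.5, 1.0, 2.0])
X = log_de(A, 101, 1e-10)
print(np.linalg.norm(X - logm(A), 2) / np.linalg.norm(logm(A), 2))
```

Note that this sketch assumes ρ(A) ≠ 1, so that θ in (14) is nonzero; otherwise θ should be replaced by the tighter bound discussed above.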
In addition, for a sufficiently small mesh size h, the trapezoidal error can be estimated using [2, Eq. (4.3)]:

    ‖ ∫_l^r F_DE(x) dx − T(h/2) ‖ ≈ ‖ T(h) − T(h/2) ‖.    (22)

Our adaptive quadrature algorithm, which is based on (22), is presented as Algorithm 2. The computational cost of Algorithm 2 for dense A is dominated by the integrand evaluations, whose final number is m_{k+1} = 2^k(m − 1) + 1; the computational cost of log(A)b with sparse A is [2^k(m − 1) + 1] c_abscissa + c_mul + c_param.

Algorithm 2 Computation of log(A) by adaptive quadrature based on the DE formula.
Input: A ∈ R^{n×n}, m ∈ N, ε > 0, ζ > 0
Output: X ≈ log(A)
1: Set F_DE(x) = (1/2) cosh(x) sech^2(sinh(x)) [((1 + tanh(sinh(x)))/2)(A − I) + I]^{-1}
2: Compute l, r, θ using Steps 2 to 11 of Algorithm 1
3: h_1 = (r − l)/(m − 1)
4: T_1 = (h_1/2) F_DE(l) + (h_1/2) F_DE(r) + h_1 Σ_{i=1}^{m−2} F_DE(l + i h_1)
5: m_1 = m
6: for k = 1, 2, 3, . . . until convergence do
7:   h_{k+1} = h_k/2
8:   T_{k+1} = (1/2) T_k + h_{k+1} Σ_{i=1}^{m_k − 1} F_DE(l + (2i − 1) h_{k+1})
9:   m_{k+1} = 2 m_k − 1
10:  if ‖T_{k+1} − T_k‖/θ ≤ ζ then
11:    T = T_{k+1}
12:    break
13:  end if
14: end for
15: X = (A − I) T

3 Numerical experiments

The numerical experiments were carried out using Julia 1.0 on a Core-i7 (3.4 GHz) CPU with 16 GB RAM. We used IEEE double precision arithmetic. We computed the abscissas and weights of the GL quadrature with QuadGK.jl (https://github.com/JuliaMath/QuadGK.jl).
3.1 Convergence of the GL quadrature and the DE formula

In this experiment, we checked the convergence of the GL quadrature and the DE formula. The test matrices are presented in Table 1. We generated the first three matrices in Table 1 by the following procedure:

1. We generated an orthogonal matrix Q by the QR decomposition of a random 50 × 50 matrix.
2. We generated a diagonal matrix D whose diagonal elements were taken from the geometric sequence {d_i}_{i=1,...,50} with d_1 = κ^{-1/2} and d_50 = κ^{1/2}, for κ = 10^….
3. We set A = QDQ^⊤.
4. We repeated Step 2 and Step 3, setting κ to 10^… and 10^….

Table 1: Test matrices. The condition number of A is denoted by κ(A) = ‖A‖₂‖A^{-1}‖₂.

    Matrix         Size   κ(A)       Structure
    SPD 1           50    1.×10^…    SPD
    SPD 2           50    1.×10^…    SPD
    SPD 3           50    1.×10^…    SPD
    parter [17]     10    2.×10^…    Nonsymmetric
    frank [17]      10    2.×10^…    Nonsymmetric
    vand [17]       10    3.×10^…    Nonsymmetric
    bcsstk02 [4]    66    4.×10^…    SPD
    bcsstk03 [4]   112    6.×10^…    SPD
    ck104 [4]      104    5.×10^…    Nonsymmetric

The experimental procedure is as follows:

1. We scaled the test matrices as Ã = (1/ρ(A))A because some matrices had entries that were too large to use in computation.
2. We computed the reference log(Ã) with arbitrary precision arithmetic and rounded the result to double precision. We implemented the ISS algorithm [10, Alg. 11.10] with the BigFloat type of Julia.
3. We computed log(Ã) using Algorithm 1, where ε = 2^{-53} ≈ 1.1 × 10^{-16}. If the test matrix was symmetric positive definite, we set θ = max{ |log(ρ(Ã))|, |log(ρ(Ã^{-1}))| } as stated in Subsection 2.3. We computed ρ(Ã) using the eigvals function of Julia, which computes all eigenvalues of Ã. Similarly, we computed ‖I − Ã‖₂ and ‖Ã^{-1}‖₂ using the svdvals function, which computes all singular values of Ã.
4. We computed log(Ã) by applying the GL quadrature to (2).

Figure 1 shows the convergence histories of the DE formula and the GL quadrature for each matrix. Several observations can be made:

• The accuracy of the DE formula is almost the same as that of the GL quadrature, and the accuracies of both depend on the condition number of the test matrices.

• For well-conditioned matrices, such as the SPD 1 and parter matrices, the GL quadrature converged faster than the DE formula. Conversely, for ill-conditioned matrices, such as the SPD 3 and vand matrices, the DE formula converged faster than the GL quadrature.

These observations suggest that Algorithm 1 selects an appropriate interval and that the DE formula is suitable for ill-conditioned matrices.
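The construction of the SPD test matrices can be sketched as follows (the exact κ values of Table 1 are not legible in this copy, so κ = 10² below is only a stand-in):

```python
import numpy as np

def spd_test_matrix(n=50, kappa=1e2, seed=0):
    """A = Q D Q^T with Q orthogonal (QR of a random matrix) and D a
    geometric sequence from kappa**-0.5 to kappa**0.5, so cond(A) = kappa."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    D = np.diag(np.geomspace(kappa ** -0.5, kappa ** 0.5, n))
    return Q @ D @ Q.T

A = spd_test_matrix()
print(np.linalg.cond(A))  # ~100, up to rounding
```

Because Q is orthogonal, the singular values of A are exactly the diagonal entries of D, so the prescribed condition number is attained by construction.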
3.2 Performance of the adaptive quadrature algorithm

In this experiment, we checked the performance of Algorithm 2 by using the same matrices that were used in Experiment 1 (see Subsection 3.1). We compared Algorithm 2 with Algorithm 3, which is based on the GL quadrature (see Appendix B).

We conducted the experiment using the following procedure:

1. We computed log(Ã) by using Algorithm 2. We set m = 16 and ζ = ε ∈ {10^{−…}, 10^{−…}}. In Step 2 of Algorithm 2, which calls Algorithm 1, we computed the spectral radius and the 2-norms of matrices using the eigvals and svdvals functions, as in Experiment 1. We stopped the computation once the number of integrand evaluations reached 1921.

2. We computed log(Ã) by using Algorithm 3. We set m = 16 and ζ ∈ {10^{−…}, 10^{−…}}. In Step 2 of Algorithm 3, the lower bound θ was computed using (14) and (15) in the same way as for the DE formula. If the number of integrand evaluations exceeded 2032, we stopped the computation.

The number of integrand evaluations and the corresponding relative error when the two algorithms stopped are shown in Table 2.

(Footnote: If A is large, using eigvals and svdvals may be inefficient. Instead, for Julia, the package Arpack.jl (https://github.com/JuliaLinearAlgebra/Arpack.jl) is available, which can compute a part of the eigenvalues and singular values based on the Lanczos (or Arnoldi) process with the desired accuracy. We present some numerical results in which ρ(Ã), ‖Ã − I‖₂, and ‖Ã^{-1}‖₂ are computed with low accuracy in Appendix A.)
[Figure 1: nine convergence-history panels (SPD 1, SPD 2, SPD 3, parter, frank, vand, bcsstk02, bcsstk03, ck104), each comparing GL and DE.]
Figure 1: Convergence histories of the DE formula (Algorithm 1) and the GL quadrature. The vertical axes show the relative error, ‖log(Ã) − X‖_F/‖log(Ã)‖_F, and the horizontal axes show the number of abscissas, m.

Table 2: Comparison between Algorithm 2 (based on the DE formula) and Algorithm 3 (based on the GL quadrature), in terms of the relative error when the algorithm stopped (in parentheses). The notation "– (–)" means that the algorithm did not stop before the number of integrand evaluations reached the limit.

                 Algorithm 2 (DE)               Algorithm 3 (GL)
    ζ            10^{−…}        10^{−…}         10^{−…}        10^{−…}
    SPD 1        (2.×10^{−…})   (2.×10^{−…})    (4.×10^{−…})   (5.×10^{−…})
    SPD 2        (6.×10^{−…})   (6.×10^{−…})    (1.×10^{−…})   (1.×10^{−…})
    SPD 3        (3.×10^{−…})   (4.×10^{−…})    – (–)          – (–)
    parter       (2.×10^{−…})   (2.×10^{−…})    (3.×10^{−…})   (3.×10^{−…})
    frank        (1.×10^{−…})   (2.×10^{−…})    (1.×10^{−…})   – (–)
    vand         – (–)          – (–)           – (–)          – (–)
    bcsstk02     (2.×10^{−…})   (3.×10^{−…})    (1.×10^{−…})   (1.×10^{−…})
    bcsstk03     (1.×10^{−…})   (1.×10^{−…})    – (–)          – (–)
    ck104        (6.×10^{−…})   (7.×10^{−…})    (2.×10^{−…})   (2.×10^{−…})

Several observations can be made:

• Algorithm 2 successfully computed the logarithm with the desired accuracy within 2000 integrand evaluations for all test matrices except the vand matrix, whereas Algorithm 3 did not succeed for three matrices. For the vand matrix, the stopping criterion ζ may be too strict, because the convergence history of the vand matrix (shown in Figure 1) hardly reaches the value of 10^{−…}. Our future studies will focus on methods for selecting a suitable stopping criterion.

• Even if the condition number of A is small, as is the case for the SPD 1 and parter matrices, the number of integrand evaluations of Algorithm 2 could be smaller than that of Algorithm 3, because Algorithm 2 can reuse all previous results when improving the accuracy.

These observations show that Algorithm 2 can be a practical choice for the computation of the matrix logarithm by numerical quadrature.
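The refinement loop at the heart of Algorithm 2 can be sketched as follows (an illustration only: here the interval [l, r] and θ are supplied by hand rather than computed by Steps 2 to 11 of Algorithm 1, and the test matrix is hypothetical):

```python
import numpy as np
from scipy.linalg import logm, solve

def f_de(A, x):
    """DE-transformed integrand F_DE(x) from (3)."""
    I = np.eye(A.shape[0])
    u = np.tanh(np.sinh(x))
    w = 0.5 * np.cosh(x) / np.cosh(np.sinh(x)) ** 2
    return w * solve((1 + u) / 2 * (A - I) + I, I)

def log_de_adaptive(A, l, r, theta, zeta, m=16, kmax=10):
    """Halve the mesh, reusing previous abscissas, until the error
    estimate ||T_{k+1} - T_k|| / theta drops below zeta (cf. (22))."""
    h = (r - l) / (m - 1)
    T = h * ((f_de(A, l) + f_de(A, r)) / 2
             + sum(f_de(A, l + i * h) for i in range(1, m - 1)))
    for _ in range(kmax):
        h /= 2
        # new abscissas are the midpoints of the previous mesh
        T_new = T / 2 + h * sum(f_de(A, l + (2 * i - 1) * h)
                                for i in range(1, m))
        done = np.linalg.norm(T_new - T, 2) / theta <= zeta
        T, m = T_new, 2 * m - 1
        if done:
            break
    return (A - np.eye(A.shape[0])) @ T

A = np.diag([0.5, 1.0, 2.0])
theta = abs(np.log(2.0))                  # |log(rho(A))|, cf. (14)
X = log_de_adaptive(A, -3.5, 3.5, theta, 1e-12)
print(np.linalg.norm(X - logm(A), 2) / np.linalg.norm(logm(A), 2))
```

The reuse of all previous abscissas at each halving is exactly what makes the adaptive variant competitive even for well-conditioned matrices, as observed above.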
In this paper, we focused on the DE formula as a new choice of numerical quadrature formula for log(A). In order to utilize the DE formula, we proposed a method for selecting an appropriate finite interval based on an error analysis, and we proposed two algorithms for practical computation.

We carried out two numerical experiments. The first experiment showed that the DE formula converges faster than the GL quadrature for ill-conditioned matrices. The second experiment demonstrated that the proposed adaptive quadrature algorithm works well with appropriate stopping criteria.

Our future work will focus on three problems: the analysis of the convergence rates of the DE formula and the GL quadrature, a method for selecting appropriate stopping criteria, and the verification of the practical performance of the presented algorithms when applied to large sparse matrices from current research problems.

References

[1] A. H. Al-Mohy and N. J. Higham. Improved inverse scaling and squaring algorithms for the matrix logarithm. SIAM J. Sci. Comput., 34(4):C153–C169, 2012.
[2] J. R. Cardoso. Computation of the matrix pth root and its Fréchet derivative by integrals. Electron. Trans. Numer. Anal., 39:414–436, 2012.
[3] J. R. Cardoso and R. Ralha. Matrix arithmetic-geometric mean and the computation of the logarithm. SIAM J. Matrix Anal. Appl., 37(2):719–743, 2016.
[4] T. A. Davis and Y. Hu. The University of Florida sparse matrix collection. ACM Trans. Math. Software, 38(1):1:1–1:25, 2011.
[5] L. Dieci, B. Morini, and A. Papini. Computational techniques for real logarithms of matrices. SIAM J. Matrix Anal. Appl., 17(3):570–593, 1996.
[6] K. Dong, D. Eriksson, H. Nickisch, D. Bindel, and A. G. Wilson. Scalable log determinants for Gaussian process kernel learning. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 6327–6337. Curran Associates, Inc., 2017.
[7] J. Fitzsimons, D. Granziol, K. Cutajar, M. Osborne, M. Filippone, and S. Roberts. Entropic trace estimates for log determinants. In M. Ceci, J. Hollmén, L. Todorovski, C. Vens, and S. Džeroski, editors, Machine Learning and Knowledge Discovery in Databases, pages 323–338, Cham, 2017. Springer International Publishing.
[8] I. Han, D. Malioutov, and J. Shin. Large-scale log-determinant computation through stochastic Chebyshev expansions. In F. Bach and D. Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 908–917, Lille, France, 07–09 Jul 2015. PMLR.
[9] A. Heßelmann and A. Görling. Random phase approximation correlation energies with exact Kohn–Sham exchange. Mol. Phys., 108(3-4):359–372, 2010.
[10] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.
[11] I. Horenko and Ch. Schütte. Likelihood-based estimation of multidimensional Langevin models and its application to biomolecular dynamics. Multiscale Model. Sim., 7(2):731–773, 2008.
[12] Z. Huang and L. Van Gool. A Riemannian network for SPD matrix learning. In S. P. Singh and S. Markovitch, editors, Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4–9, 2017, San Francisco, California, USA, pages 2036–2042. AAAI Press, 2017.
[13] M. F. Hutchinson. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Commun. Stat.-Simul. C., 19(2):433–450, 1990.
[14] T. Schenk, I. M. Richardson, M. Kraska, and S. Ohnimus. Modeling buckling distortion of DP600 overlap joints due to gas metal arc welding and the influence of the mesh density. Comp. Mater. Sci., 46(4):977–986, 2009.
[15] H. Takahasi and M. Mori. Double exponential formulas for numerical integration. Publ. Res. Inst. Math. Sci., 9:721–741, 1974.
[16] C. K. Zachos. A classical bound on quantum entropy. J. Phys. A: Math. Theor., 40(21):F407, 2007.
[17] W. Zhang and N. J. Higham. Matrix Depot: an extensible test matrix collection for Julia. PeerJ Comput. Sci., 2:e58, 2016.
A Effect of the parameter errors in Algorithm 1
In this section, in order to check the effects of the parameter errors in Algorithm 1, we present some numerical examples.

The convergence histories of the DE formula are shown in Figure 2. Each graph shows two histories: one is obtained by using the eigvals and svdvals functions for computing ρ(Ã), ‖Ã − I‖, and ‖Ã^{−1}‖ as in Section 3; the other is obtained by using Arpack.jl with tol=0.01. The figure shows that the behaviors of the two histories are almost identical, and the effects of the errors did not appear.

Figure 2: Convergence histories of the DE formula (Algorithm 1) obtained using the exact estimations of ‖Ã − I‖, ‖Ã^{−1}‖, and ρ(Ã) (using eigvals and svdvals) and obtained using eigs of Arpack.jl, for the SPD 1, kahan, and ck104 matrices. The vertical axes show the relative error ‖log(Ã) − X‖_F / ‖log(Ã)‖_F, and the horizontal axes show the number of abscissas m.

In conclusion, the parameters used at Step 2 of Algorithm 1 can be computed roughly.

B Adaptive quadrature algorithm based on the GL quadrature
In this section, we show an adaptive quadrature algorithm based on the GL quadrature for (2), using a technique from [2].

Let G(m) := Σ_{i=1}^{m} w_i F_GL(u_i) be the result of the m-point GL quadrature, where F_GL(u) = [(1 + u)(A − I) + 2I]^{−1}. Using [2, Eq. (4.6)], the following error estimate can be applied:

‖G(2m) − ∫_{−1}^{1} F_GL(t) dt‖ ≤ ‖G(2m) − G(m)‖.  (23)

Based on (23), we present, as Algorithm 3, an algorithm designed to compute log(A) by automatically adding abscissas until the error estimate is smaller than a given tolerance.

Algorithm 3
Computation of log(A) based on the Gauss–Legendre quadrature
Input: A ∈ R^{n×n}, the number of initial abscissas m_0, a tolerance ζ > 0
Output: X ≈ log(A)
  Set F_GL(u) := [(1 + u)(A − I) + 2I]^{−1}
  Compute θ, a lower bound on ‖log(A)‖
  Compute the abscissas u_i and weights w_i of the m_0-point Gauss–Legendre quadrature (i = 1, ..., m_0)
  G_0 = Σ_{i=1}^{m_0} w_i F_GL(u_i)
  for k = 0, 1, 2, ... until convergence do
    m_{k+1} = 2 m_k
    Compute the abscissas u_i and weights w_i of the m_{k+1}-point Gauss–Legendre quadrature (i = 1, ..., m_{k+1})
    G_{k+1} = Σ_{i=1}^{m_{k+1}} w_i F_GL(u_i)
    if ‖G_{k+1} − G_k‖ / θ ≤ ζ then
      G = G_{k+1}
      break
    end if
  end for
  X = (A − I)G

The computational cost of Algorithm 3 for dense A is [(Σ_{i=0}^{k+1} m_i) + 1] n^3 + O(n^2) = [(2^{k+2} − 1) m_0 + 1] n^3 + O(n^2), and the computational cost for log(A)b with sparse A is (2^{k+2} − 1) m_0 c_abscissa + c_mul + c_param. Under the assumption that the total computational cost largely depends on the coefficient of c_abscissa, the computational cost of Algorithm 2 will be smaller than that of Algorithm 3 when the convergence rates of the DE formula and the GL quadrature are the same.

Footnote: The parameter tol defines the relative tolerance for the convergence of Ritz values. In these examples, ρ(Ã) is obtained using eigs(A, nev=1, which=:LM, tol=0.01), ‖Ã − I‖ is obtained using eigs((A-I)'*(A-I), nev=1, which=:LM, tol=0.01), and ‖Ã^{−1}‖ is obtained using eigs(A'*A, nev=1, which=:SM, tol=0.01), where A is the preconditioned matrix Ã.
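The doubling loop of Algorithm 3 and the error estimate (23) can be sketched as follows in Python (the paper's experiments use Julia). The surrogate used for the lower bound θ on ‖log(A)‖ below (the norm of the first quadrature result) is an assumption of this sketch, not the bound used in the paper, and the function name `logm_gl_adaptive` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.linalg import logm, norm

def logm_gl_adaptive(A, m0=8, zeta=1e-12, max_doublings=10):
    # Sketch of Algorithm 3: form the m-point Gauss-Legendre
    # approximation G(m) of integral_{-1}^{1} [(1 + u)(A - I) + 2I]^{-1} du
    # and double m until the estimate (23),
    #   ||G(2m) - G(m)|| / theta <= zeta,
    # is satisfied; then log(A) is approximated by (A - I) G(m).
    # The surrogate for theta below is an assumption of this sketch,
    # not the lower bound computed in the paper.
    n = A.shape[0]
    I = np.eye(n)

    def G(m):
        u, w = leggauss(m)                 # abscissas and weights on [-1, 1]
        S = np.zeros((n, n))
        for ui, wi in zip(u, w):
            S += wi * np.linalg.solve((1.0 + ui) * (A - I) + 2.0 * I, I)
        return S

    Gk = G(m0)
    theta = norm((A - I) @ Gk)             # crude surrogate for a lower bound on ||log(A)||
    m = m0
    for _ in range(max_doublings):
        m *= 2
        Gk1 = G(m)                         # no abscissas are shared with G(m/2)
        done = norm(Gk1 - Gk) / theta <= zeta
        Gk = Gk1
        if done:
            break
    return (A - I) @ Gk

A = np.array([[2.0, 1.0], [1.0, 3.0]])     # small SPD test matrix
rel_err = norm(logm_gl_adaptive(A) - logm(A)) / norm(logm(A))
```

Note that, unlike the trapezoidal sums of the DE formula, the Gauss–Legendre abscissas for 2m points do not contain those for m points, so each doubling recomputes the integrand at every abscissa; this is the cost difference discussed above.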