The Newton-Shamanskii method for solving a quadratic matrix equation arising in quasi-birth-death problems
Pei-Chang Guo*
School of Mathematical Sciences, Peking University, Beijing 100871, China
Abstract
In order to determine the stationary distribution of a discrete-time quasi-birth-death Markov chain, it is necessary to find the minimal nonnegative solution of a quadratic matrix equation. We apply the Newton-Shamanskii method for solving this equation. We show that the sequence of matrices generated by the Newton-Shamanskii method is monotonically increasing and converges to the minimal nonnegative solution of the equation. Numerical experiments show the effectiveness of our method.
Keywords: quadratic matrix equation, quasi-birth-death problems, Newton-Shamanskii method, minimal nonnegative solution.
1 Introduction

We first introduce some necessary notation. For a matrix B = [b_{ij}] ∈ R^{m×n}, we write B ≥ 0 (B > 0) if b_{ij} ≥ 0 (b_{ij} > 0) holds for all i, j. For matrices A, B ∈ R^{n×n}, we write A ≥ B (A > B) if a_{ij} ≥ b_{ij} (a_{ij} > b_{ij}) for all i, j. For vectors x, y ∈ R^n, we write x ≥ y (x > y) if x_i ≥ y_i (x_i > y_i) holds for all i = 1, ..., n. The vector of all ones is denoted by e, i.e., e = (1, 1, ..., 1)^T. The identity matrix is denoted by I.

In this paper, we consider the quadratic matrix equation (QME)

    Q(X) = AX^2 + BX + C = 0,    (1.1)

where A, B, C, X ∈ R^{n×n}, A, B + I, C ≥ 0, A + B + I + C is irreducible, and (A + B + I + C)e = e. The quadratic matrix equation (1.1) has applications in quasi-birth-death (QBD) processes [4]. The elementwise minimal nonnegative solution S of equation (1.1) is of interest in the applications. The rate ρ of a QBD Markov chain is defined by

    ρ = p^T (B + I + 2A) e,    (1.2)

where p is the stationary probability vector of the stochastic matrix A + B + I + C, i.e., p^T (A + B + I + C) = p^T and p^T e = 1. We refer the reader to the monograph [4] for details. We recall that a QBD is positive recurrent if ρ < 1, null recurrent if ρ = 1, and transient if ρ > 1. Throughout this paper, we assume that the QBD is positive recurrent.

Several numerical methods exist for solving the equation. Some linearly convergent fixed-point iterations are analyzed in [1] and the references therein. In [2] Latouche showed that Newton's method for this equation is well defined and that the matrix sequence it generates is monotonically increasing and quadratically convergent. The invariant subspace method approximates the minimal nonnegative solution S quadratically by approximating the left invariant subspace of a suitable matrix. Bini, Meini et al. devised a quadratically convergent and numerically stable algorithm [6, 7, 8, 9, 10] for the computation of S based on a functional representation of cyclic reduction, which applies to general M/G/1-type Markov chains [16] and which extends the method of Latouche and Ramaswami [5].

In this paper, we apply the Newton-Shamanskii method to equation (1.1). We show that, starting with a suitable initial guess, the sequence of iterates generated by the Newton-Shamanskii method is monotonically increasing and converges to the minimal nonnegative solution of QME (1.1). Numerical experiments show that the Newton-Shamanskii method is effective and outperforms Newton's method.

The rest of the paper is organized as follows. In section 2 we recall Newton's method and present the Newton-Shamanskii iterative procedure. In section 3 we prove the monotone convergence result for the Newton-Shamanskii method. In section 4 we present some numerical examples, which show that our new algorithm is faster than Newton's method. In section 5 we give our conclusions.

* e-mail: [email protected]

2 The Newton-Shamanskii method

The function Q in (1.1) is a mapping from R^{n×n} into itself, and the Fréchet derivative of Q at X is the linear map Q'_X : R^{n×n} → R^{n×n} given by

    Q'_X(Z) = AZX + AXZ + BZ.    (2.1)

The second derivative at X, Q''_X : R^{n×n} × R^{n×n} → R^{n×n}, is given by

    Q''_X(Z_1, Z_2) = AZ_1Z_2 + AZ_2Z_1.    (2.2)
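Since Q is quadratic, its Taylor expansion terminates after the second-order term, so Q(X + Z) = Q(X) + Q'_X(Z) + (1/2) Q''_X(Z, Z) holds exactly. The following sketch (in Python/NumPy as an illustration; the paper itself works in MATLAB) verifies the derivative formulas (2.1) and (2.2) on random matrices.

```python
import numpy as np

# Verify that Q(X + Z) = Q(X) + Q'_X(Z) + (1/2) Q''_X(Z, Z) holds exactly
# for the quadratic map Q(X) = A X^2 + B X + C, with derivatives (2.1)-(2.2).
rng = np.random.default_rng(0)
n = 5
A, B, C, X, Z = (rng.standard_normal((n, n)) for _ in range(5))

Q = lambda M: A @ M @ M + B @ M + C
dQ = A @ Z @ X + A @ X @ Z + B @ Z      # Q'_X(Z), equation (2.1)
d2Q = A @ Z @ Z + A @ Z @ Z             # Q''_X(Z, Z), equation (2.2)

assert np.allclose(Q(X + Z), Q(X) + dQ + 0.5 * d2Q)
```

Because no third-order term exists, the agreement is to machine precision rather than merely asymptotic.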
For a given X_0, the Newton sequence for the solution of Q(X) = 0 is

    X_{k+1} = X_k − (Q'_{X_k})^{-1} Q(X_k),  k = 0, 1, 2, ...,    (2.3)

provided that Q'_{X_k} is invertible for all k. By equation (2.1), the Newton iteration (2.3) is equivalent to

    AZX_k + (AX_k + B)Z = −Q(X_k),  X_{k+1} = X_k + Z,  k = 0, 1, 2, ...,    (2.4)

or

    AX_{k+1}X_k + (AX_k + B)X_{k+1} = AX_k^2 − C,  k = 0, 1, 2, ....    (2.5)

As shown in [2], for the nonlinear equation Q(X) = 0 with minimal nonnegative solution S, the sequence generated by Newton's method converges globally and quadratically to the solution S. However, Newton's method has a disadvantage: at every iteration step, a Sylvester-type equation

    A_1 X B_1^T + A_2 X B_2^T = E

must be solved. The Bartels-Stewart method and the Hessenberg-Schur method can be employed to solve this Sylvester-type equation [3]; both use the QZ algorithm to transform the equation into a structured form that can be solved columnwise by back substitution. The floating-point work of the QZ algorithm is large compared with that of the back substitution [3]. To save overall cost, we would like to reuse the special coefficient matrix structure produced by the QZ algorithm. We present the Newton-Shamanskii algorithm for QME (1.1) as follows.

Newton-Shamanskii algorithm for QME (1.1). Given an initial value X_0, for k = 0, 1, ...:

    X_{k,0} = X_k − (Q'_{X_k})^{-1} Q(X_k),    (2.6)
    X_{k,s} = X_{k,s−1} − (Q'_{X_k})^{-1} Q(X_{k,s−1}),  s = 1, 2, ..., n_k,    (2.7)
    X_{k+1} = X_{k,n_k}.    (2.8)

3 Monotone convergence

In this section, we prove a monotone convergence result for the Newton-Shamanskii method for QME (1.1).
We first recall that a real square matrix A is called a Z-matrix if all its off-diagonal elements are nonpositive. Any Z-matrix A can be written as sI − B with B ≥ 0. A Z-matrix A is called an M-matrix if s ≥ ρ(B), where ρ(·) denotes the spectral radius; it is a singular M-matrix if s = ρ(B) and a nonsingular M-matrix if s > ρ(B). We will make use of the following result (see [17]).

Lemma 3.1.
For a Z-matrix A, the following statements are equivalent:
(a) A is a nonsingular M-matrix.
(b) A^{-1} ≥ 0.
(c) Av > 0 for some vector v > 0.
(d) All eigenvalues of A have positive real parts.

The next result is also well known and can be found in [17].
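Conditions (b) and (d) of Lemma 3.1 are easy to check numerically. The helper below is our own illustration (not part of the paper): it tests condition (d) for a given Z-matrix, and the usage cross-checks it against condition (b).

```python
import numpy as np

def is_nonsingular_m_matrix(A, tol=1e-10):
    """Test whether the Z-matrix A is a nonsingular M-matrix using
    Lemma 3.1(d): all eigenvalues of A have positive real parts."""
    # sanity check: off-diagonal entries must be nonpositive (Z-matrix)
    assert (A - np.diag(np.diag(A)) <= tol).all(), "A is not a Z-matrix"
    return bool(np.linalg.eigvals(A).real.min() > tol)
```

For example, A = 4I − ee^T with n = 3 can be written sI − B with s = 4 > ρ(ee^T) = 3, so the check returns True and, consistent with condition (b), np.linalg.inv(A) is entrywise nonnegative; for 3I − ee^T (a singular M-matrix, s = ρ(B)) the check returns False.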
Lemma 3.2.
Let A be a nonsingular M-matrix. If B ≥ A is a Z-matrix, then B is also a nonsingular M-matrix; moreover, B^{-1} ≤ A^{-1}.

We recall the following property of the minimal nonnegative solution S of QME (1.1); see [2, 4] for more details.

Theorem 3.1.
If the quasi-birth-death process is positive recurrent, i.e., the rate ρ defined by (1.2) satisfies ρ < 1, then the matrix −[(S^T ⊗ A + I ⊗ AS) + I ⊗ B] is a nonsingular M-matrix.

The next lemma displays the monotone convergence properties of the Newton iteration for QME (1.1).
Lemma 3.3.
Suppose that a matrix X is such that
(i) Q(X) ≥ 0,
(ii) 0 ≤ X ≤ S,
(iii) −[(X^T ⊗ A + I ⊗ AX) + I ⊗ B] is a nonsingular M-matrix.
Then the matrix

    Y = X − (Q'_X)^{-1} Q(X)    (3.1)

is well defined and satisfies
(a) Q(Y) ≥ 0,
(b) 0 ≤ X ≤ Y ≤ S,
(c) −[(Y^T ⊗ A + I ⊗ AY) + I ⊗ B] is a nonsingular M-matrix.

Proof. By (iii) and Lemma 3.1, Q'_X is invertible and the matrix Y is well defined. Because Q(X) ≥ 0 and −[(X^T ⊗ A + I ⊗ AX) + I ⊗ B]^{-1} ≥ 0, we have vec(Y) ≥ vec(X) and thus Y ≥ X. From equation (3.1) and the Taylor formula, we have

    Q(Y) = Q(X) + Q'_X(Y − X) + (1/2) Q''_X(Y − X, Y − X)
         = (1/2) Q''_X(Y − X, Y − X)
         = A(Y − X)^2 ≥ 0,

which proves (a). From AYX + (AX + B)Y = AX^2 − C and the equation AS^2 + BS + C = 0, we get

    A(Y − S)X + (AX + B)(Y − S) = AX^2 − C − ASX − AXS − BS = A(X − S)(X − S) ≥ 0.

Since −[(X^T ⊗ A + I ⊗ AX) + I ⊗ B] is a nonsingular M-matrix, Lemma 3.1 gives vec(S − Y) ≥ 0, that is, S − Y ≥ 0. Together with Y ≥ X, this proves (b). From 0 ≤ Y ≤ S, we have

    −[(Y^T ⊗ A + I ⊗ AY) + I ⊗ B] ≥ −[(S^T ⊗ A + I ⊗ AS) + I ⊗ B],

and since −[(S^T ⊗ A + I ⊗ AS) + I ⊗ B] is a nonsingular M-matrix by Theorem 3.1, −[(Y^T ⊗ A + I ⊗ AY) + I ⊗ B] is a nonsingular M-matrix by Lemma 3.2.

The next lemma extends Lemma 3.3 and will be the theoretical basis of the monotone convergence result for the Newton-Shamanskii method for QME (1.1).

Lemma 3.4.
Suppose that a matrix X is such that
(i) Q(X) ≥ 0,
(ii) 0 ≤ X ≤ S,
(iii) −[(X^T ⊗ A + I ⊗ AX) + I ⊗ B] is a nonsingular M-matrix.
Then for any matrix N with 0 ≤ N ≤ X, the matrix

    Y = X − (Q'_N)^{-1} Q(X)    (3.2)

is well defined and satisfies
(a) Q(Y) ≥ 0,
(b) 0 ≤ X ≤ Y ≤ S,
(c) −[(Y^T ⊗ A + I ⊗ AY) + I ⊗ B] is a nonsingular M-matrix.

Proof. First, because 0 ≤ N ≤ X, we have −[(N^T ⊗ A + I ⊗ AN) + I ⊗ B] ≥ −[(X^T ⊗ A + I ⊗ AX) + I ⊗ B]. From (iii) and Lemma 3.2 we know that Q'_N is invertible, that the matrix Y is well defined, and that 0 ≤ X ≤ Y. Let

    Ŷ = X − (Q'_X)^{-1} Q(X).

Then Ŷ ≥ Y by Lemma 3.2. Since Ŷ ≤ S by Lemma 3.3, we have proved (b): 0 ≤ X ≤ Y ≤ S. Because −[(Ŷ^T ⊗ A + I ⊗ AŶ) + I ⊗ B] is a nonsingular M-matrix by Lemma 3.3 and Ŷ ≥ Y, the matrix −[(Y^T ⊗ A + I ⊗ AY) + I ⊗ B] is a nonsingular M-matrix by Lemma 3.2. Last, from the Taylor formula and Q'_N(Y − X) + Q(X) = 0, we have

    Q(Y) = Q(X) + Q'_X(Y − X) + (1/2) Q''_X(Y − X, Y − X)
         = Q(X) + Q'_N(Y − X) + (Q'_X − Q'_N)(Y − X) + (1/2) Q''_X(Y − X, Y − X)
         = (Q'_X − Q'_N)(Y − X) + (1/2) Q''_X(Y − X, Y − X)
         = Q''_X(X − N, Y − X) + (1/2) Q''_X(Y − X, Y − X)
         = A(X − N)(Y − X) + A(Y − X)(X − N) + A(Y − X)^2 ≥ 0.

Using Lemma 3.4, we arrive at the following monotone convergence result for the Newton-Shamanskii method for QME (1.1).
Theorem 3.2.
Suppose that a matrix X_0 is such that
(i) Q(X_0) ≥ 0,
(ii) 0 ≤ X_0 ≤ S,
(iii) −[(X_0^T ⊗ A + I ⊗ AX_0) + I ⊗ B] is a nonsingular M-matrix.
Then the Newton-Shamanskii algorithm (2.6)-(2.8) generates a sequence {X_k} such that X_k ≤ X_{k+1} ≤ S for all k ≥ 0, and lim_{k→∞} X_k = S.

Proof. We prove the theorem by mathematical induction. From Lemma 3.4, we have

    X_0 ≤ X_{0,0} ≤ · · · ≤ X_{0,n_0} = X_1 ≤ S,  Q(X_1) ≥ 0,

and −[(X_1^T ⊗ A + I ⊗ AX_1) + I ⊗ B] is a nonsingular M-matrix. Assume that Q(X_i) ≥ 0, that

    X_0 ≤ X_{0,0} ≤ · · · ≤ X_{0,n_0} = X_1 ≤ · · · ≤ X_{i−1,n_{i−1}} = X_i ≤ S,

and that −[(X_i^T ⊗ A + I ⊗ AX_i) + I ⊗ B] is a nonsingular M-matrix. Again by Lemma 3.4, we have Q(X_{i+1}) ≥ 0,

    X_i ≤ X_{i,0} ≤ · · · ≤ X_{i,n_i} = X_{i+1} ≤ S,

and −[(X_{i+1}^T ⊗ A + I ⊗ AX_{i+1}) + I ⊗ B] is a nonsingular M-matrix. We have thus proved inductively that the sequence {X_k} is monotonically increasing and bounded above by S, so it has a limit X* with X* ≤ S. Letting i → ∞ in X_{i+1} ≥ X_{i,0} = X_i − (Q'_{X_i})^{-1} Q(X_i) ≥ 0, we see that Q(X*) = 0. Since X* ≤ S and S is the minimal nonnegative solution of QME (1.1), we get X* = S.

We remark that the Newton-Shamanskii method differs from Newton's method in that the Fréchet derivative is not updated at every iteration step. That is to say, the coefficient matrix pairs of the Sylvester-type equation are evaluated and reduced with the QZ algorithm only after several inner iteration steps. So, while more iterations are needed than for Newton's method, the overall cost of the Newton-Shamanskii method is lower. Our numerical experiments confirm the efficiency of the Newton-Shamanskii method for QME (1.1).

4 Numerical experiments

The numerical tests were performed on a laptop (2.4 GHz CPU and 2 GB memory) with MATLAB R2013a. We use X_0 = 0 as the initial value of the Newton-like methods. As reported in [3], the Hessenberg-Schur method is faster than the Bartels-Stewart method when solving the general Sylvester-type equation

    A_1 X B_1^T + A_2 X B_2^T = E,

so in the Newton iteration we adopt the Hessenberg-Schur method for solving the Sylvester-type equation. In the Newton-Shamanskii iteration, we can reuse the reduced coefficient matrices in the back-substitution step when solving the Sylvester-type equation, so we adopt the Bartels-Stewart method; that is to say, in the first call to the QZ algorithm, we reduce the coefficient matrix pairs to quasi-upper-triangular form.

About how to choose the optimal numbers n_k in the Newton-Shamanskii algorithm (2.7), we have no theoretical results.
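The cost-saving idea of reusing the expensive orthogonal reduction across several cheap triangular solves can be illustrated on the standard Sylvester equation AX + XB = Q, a simplification of the generalized Sylvester-type equation discussed above. The sketch below mirrors what scipy.linalg.solve_sylvester does internally (real Schur factorizations plus the LAPACK routine trsyl); the function name sylvester_solver and the test setup are our own.

```python
import numpy as np
from scipy.linalg import schur, get_lapack_funcs

def sylvester_solver(a, b):
    """Factor a and b once (real Schur forms, the expensive step) and
    return a cheap per-right-hand-side solver for A X + X B = Q."""
    r, u = schur(a, output='real')
    s, v = schur(b.conj().T, output='real')
    trsyl, = get_lapack_funcs(('trsyl',), (r, s))

    def solve(q):
        # per right-hand side: two transformations + quasi-triangular solve
        f = u.conj().T @ q @ v
        y, scale, info = trsyl(r, s, f, tranb='C')
        if info < 0:
            raise ValueError("illegal argument passed to trsyl")
        return u @ (scale * y) @ v.conj().T

    return solve
```

Solving for several right-hand sides amortizes the two Schur factorizations, which is the same kind of saving the Newton-Shamanskii method obtains by reusing the QZ reduction over several inner steps.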
In our extensive numerical experiments, we update the Fréchet derivative every m = 2 steps. We define the number of evaluations of the Fréchet derivative in the algorithm as the number of outer iteration steps, which is k + 1 for an approximate solution X_{k,l} in the Newton-Shamanskii algorithm. The number of outer iteration steps (denoted "it"), the elapsed CPU time in seconds (denoted "time"), and the normalized residual (denoted "NRes") are used to measure the feasibility and effectiveness of our new method, where

    NRes = ||A X̃^2 + B X̃ + C|| / ( ||X̃|| ( ||A|| ||X̃|| + ||B|| ) + ||C|| ),

||·|| denotes the infinity norm of a matrix, and X̃ is an approximate solution to the minimal nonnegative solution of (1.1). The numerical experiments show that the Newton-Shamanskii method is more efficient than Newton's method.

Example 4.1.
We use the example in [5, 10] to test our algorithm. In this example we construct a quasi-birth-death problem defined by the n × n matrices A = W, B = W − I, C = W + δI, where I is the identity matrix, W is a matrix having null diagonal entries and constant off-diagonal entries, and 0 < δ < 1. As was observed in [5], the rate ρ = p^T(B + I + 2A)e, where p^T(A + B + I + C) = p^T and p^T e = 1, is exactly 1 − δ. We have tested three different values of δ and three problem sizes. Table 1, Table 2, and Table 3 report the results obtained with sizes n = 20, n = 100, and n = 200, respectively.

Table 1: Comparison of the numerical results when n = 20

  δ       Method              time    it   NRes
  5.0e-1  Newton              0.013   5    4.77e-16
  5.0e-1  Newton-Shamanskii   0.009   3    2.38e-14
  1.0e-1  Newton              0.036   7    1.61e-16
  1.0e-1  Newton-Shamanskii   0.012   5    9.25e-16
  1.0e-3  Newton              0.043   13   8.70e-16
  1.0e-3  Newton-Shamanskii   0.036   9    3.00e-16

Table 2: Comparison of the numerical results when n = 100

  δ       Method              time    it   NRes
  5.0e-1  Newton              0.142   5    1.24e-15
  5.0e-1  Newton-Shamanskii   0.110   3    2.50e-14
  1.0e-1  Newton              0.190   7    1.21e-15
  1.0e-1  Newton-Shamanskii   0.168   5    1.60e-15
  1.0e-3  Newton              0.444   13   1.60e-15
  1.0e-3  Newton-Shamanskii   0.359   9    6.14e-16

Table 3: Comparison of the numerical results when n = 200

  δ       Method              time    it   NRes
  5.0e-1  Newton              1.026   5    9.40e-15
  5.0e-1  Newton-Shamanskii   0.746   3    2.34e-14
  1.0e-1  Newton              1.433   7    2.18e-15
  1.0e-1  Newton-Shamanskii   1.200   5    1.25e-15
  1.0e-3  Newton              4.798   13   5.64e-15
  1.0e-3  Newton-Shamanskii   4.271   9    2.50e-15
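The experiment above can be reproduced in outline. The sketch below is written in Python/NumPy rather than the paper's MATLAB; the helper names qbd_example, nres, and newton_shamanskii are our own, and for simplicity the correction equation is solved through its dense Kronecker (vec) form instead of a Bartels-Stewart solver, so timings are not comparable with the tables. It builds Example 4.1, checks that ρ = 1 − δ (since A + B + I + C = 3W + δI is doubly stochastic, its stationary vector is p = e/n), and runs iteration (2.6)-(2.8) with n_k = 2 inner steps.

```python
import numpy as np

def qbd_example(n, delta):
    """Example 4.1: A = W, B = W - I, C = W + delta*I, where W has zero
    diagonal and constant off-diagonal entries chosen so that
    A + B + I + C = 3W + delta*I is stochastic."""
    w = (1.0 - delta) / (3.0 * (n - 1))
    W = w * (np.ones((n, n)) - np.eye(n))
    return W, W - np.eye(n), W + delta * np.eye(n)

def nres(A, B, C, X):
    """Normalized residual NRes in the infinity norm."""
    inf = lambda M: np.linalg.norm(M, np.inf)
    return inf(A @ X @ X + B @ X + C) / (inf(X) * (inf(A) * inf(X) + inf(B)) + inf(C))

def newton_shamanskii(A, B, C, n_inner=2, tol=1e-13, max_outer=50):
    """Iteration (2.6)-(2.8): the Frechet derivative Q'_{X_k} is frozen for
    n_inner inner steps and applied in its Kronecker (vec) form."""
    n = A.shape[0]
    X = np.zeros((n, n))                 # initial value X_0 = 0
    for k in range(max_outer):
        M = np.kron(X.T, A) + np.kron(np.eye(n), A @ X + B)   # vec form of Q'_{X_k}
        for _ in range(n_inner):
            if nres(A, B, C, X) < tol:
                return X, k              # k derivative evaluations were needed
            R = A @ X @ X + B @ X + C    # residual Q(X_{k,s})
            Z = np.linalg.solve(M, -R.flatten(order='F')).reshape((n, n), order='F')
            X = X + Z
    return X, max_outer

n, delta = 20, 0.5
A, B, C = qbd_example(n, delta)
e = np.ones(n)
p = e / n                                # stationary vector of 3W + delta*I
rho = p @ (B + np.eye(n) + 2 * A) @ e    # equals 1 - delta = 0.5
S, it = newton_shamanskii(A, B, C)
```

On this instance the computed rate equals 1 − δ = 0.5 as claimed, and the returned approximation S is entrywise nonnegative with NRes below the tolerance, consistent with the magnitudes reported in Table 1.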
5 Conclusions
In this paper, we apply the Newton-Shamanskii method to the quadratic matrix equation arising from the analysis of quasi-birth-death processes. The convergence analysis shows that the method is feasible and that the minimal nonnegative solution of the quadratic matrix equation can be obtained. Numerical experiments show that the Newton-Shamanskii method outperforms Newton's method.
References

[1] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. SIAM, Philadelphia, PA, 1999.
[2] G. Latouche. Newton's iteration for non-linear equations in Markov chains. IMA J. Numer. Anal., 14(4):583-598, 1994.
[3] J. D. Gardiner, A. J. Laub, J. J. Amato, and C. B. Moler. Solution of the Sylvester matrix equation AXB^T + CXD^T = E. ACM Trans. Math. Software, 18:223-231, 1992.
[4] D. A. Bini, G. Latouche, and B. Meini. Numerical Methods for Structured Markov Chains. Oxford University Press, New York, 2005.
[5] G. Latouche and V. Ramaswami. A logarithmic reduction algorithm for quasi-birth-death processes. J. Appl. Probab., 30(3):650-674, 1993.
[6] B. Meini. Solving QBD problems: the cyclic reduction algorithm versus the invariant subspace method. Adv. Perf. Anal., 1:215-225, 1998.
[7] D. A. Bini and B. Meini. On cyclic reduction applied to a class of Toeplitz-like matrices arising in queueing problems. In Computations with Markov Chains, W. J. Stewart, ed., Kluwer Academic, Dordrecht, The Netherlands, 1995, pp. 21-38.
[8] D. A. Bini and B. Meini. On the solution of a nonlinear matrix equation arising in queueing problems. SIAM J. Matrix Anal. Appl., 17:906-926, 1996.
[9] D. A. Bini and B. Meini. Improved cyclic reduction for solving queueing problems. Numer. Algorithms, 15:57-74, 1997.
[10] C. He, B. Meini, and N. H. Rhee. A shifted cyclic reduction algorithm for quasi-birth-death problems. SIAM J. Matrix Anal. Appl., 23(3):673-691, 2001.
[11] D. A. Bini, G. Latouche, and B. Meini. Solving matrix polynomial equations arising in queueing problems. Linear Algebra Appl., 340:225-244, 2002.
[12] F. Poloni. Quadratic vector equations. Linear Algebra Appl., 1627-1644, 2013.
[13] Y. Lin and L. Bao. Convergence analysis of the Newton-Shamanskii method for a nonsymmetric algebraic Riccati equation. Numer. Linear Algebra Appl., 535-546, 2008.
[14] C.-H. Chun. Monotone convergence of Newton-like methods for M-matrix algebraic Riccati equations. Numer. Algorithms, 64