arXiv [math.CA]

On the final limit of a transition matrix
Helmut Kahl
Abstract.
For a finite intensity matrix $B$ the final limit of its transition matrix $\exp(tB)$ exists. This is a well-known fact in the realm of continuous-time Markov processes, where it is proven by probability-theoretic means. A simple proof is presented with the help of a Tauberian theorem for complex analytic functions, which is also used in [2] to prove the prime number theorem. Furthermore the final limit is computed.

Mathematics Subject Classification (2010).
Keywords. intensity/stochastic matrix, Laplace transform, Tauberian theory.
1. Introduction
Square matrices with finite index set $Z$ are considered. A right intensity matrix $B = (b_{ij})$ is defined by $b_{ij} \ge 0$ for $i, j \in Z$ with $i \ne j$ and by vanishing row sums $\sum_{j \in Z} b_{ij} = 0$ for all $i \in Z$. Its transition matrix is
$$\exp(tB) = \sum_{n=0}^{\infty} \frac{t^n}{n!} B^n, \quad t \in \mathbb{R}.$$

Theorem 1.
For a right intensity matrix $B$ the limit $\lim_{t\to\infty} \exp(tB)$ exists.

Proof. A standard proof is given in [5], thm. 5.4.6. The assertion follows also by the Tauberian theorem [1], ch. III, thm. 7.1 from the complex analytic Proposition 5 deferred to the next section. □

The probability distribution $P$ of the corresponding Markov process $\{X_t\}_{t \ge 0}$ on the state space $Z$ with initial probability row vector $\pi$ fulfills $(P\{X_t = j\})_{j \in Z} = \pi \exp(tB)$, $t \ge 0$.
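Theorem 1 can be illustrated numerically. The following sketch (plain Python; the routine `expm`, its scaling-and-squaring scheme, and the example matrix are mine, not from the paper) approximates $\exp(tB)$ for the intensity matrix $B = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$ and checks that the rows approach the common limit $(1/2, 1/2)$ while staying stochastic:

```python
# Numerical illustration of Theorem 1 (a sketch, not part of the paper):
# approximate exp(tB) by a truncated series plus repeated squaring.

def matmul(A, C):
    n = len(A)
    return [[sum(A[i][k] * C[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(B, t):
    """exp(tB) via Taylor series after scaling, then repeated squaring."""
    n = len(B)
    m, k = max(abs(x) for row in B for x in row) * t, 0
    while m > 0.5:                      # scale until the entries are small
        m, k = m / 2, k + 1
    A = [[t / 2**k * B[i][j] for j in range(n)] for i in range(n)]
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for p in range(1, 20):              # term becomes A^p / p!
        term = [[x / p for x in row] for row in matmul(term, A)]
        E = [[e + s for e, s in zip(r1, r2)] for r1, r2 in zip(E, term)]
    for _ in range(k):                  # undo the scaling: square k times
        E = matmul(E, E)
    return E

B = [[-1.0, 1.0], [1.0, -1.0]]          # a right intensity matrix
P = expm(B, 20.0)                       # already very close to the final limit
```

For this $B$ one has $\exp(tB) = \frac{1}{2}\begin{pmatrix} 1+e^{-2t} & 1-e^{-2t} \\ 1-e^{-2t} & 1+e^{-2t} \end{pmatrix}$, so at $t = 20$ every entry is $1/2$ up to roughly $e^{-40}$.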
2. Complex analytic proof of the theorem
First we recall some theory of homogeneous linear differential equation systems with constant coefficients. The zero matrix is denoted by $O$, the identity matrix by $I$.

Lemma 2.
For a complex square matrix $A$ the matrix $Y(z) := \exp(zA)$ fulfills $Y(0) = I$, $Y' = YA = AY$ and $Y(w+z) = Y(w)Y(z)$ for all $w, z \in \mathbb{C}$. Let $y$ be a complex row vector s.t. $y_\infty := \lim_{t\to\infty} yY(t)$ exists. Then it holds $y_\infty A = 0$. If $Y_\infty := \lim_{t\to\infty} Y(t)$ exists it holds $Y_\infty Y_\infty = Y_\infty$ and
$$\lim_{t\to\infty} Y'(t) = Y_\infty A = O.$$
If $Y$ is bounded on the positive real axis, every entry of the Laplace transform $z \mapsto (zI - A)^{-1}$ of $Y$ is holomorphic on the right half plane $\Re(z) > 0$.

Proof. The formulas for $Y$ follow easily from the definition. For proof of the formulas for $y_\infty$ observe that the function $y(t) := y_\infty Y(t)$ fulfills
$$y(t) = \lim_{s\to\infty} yY(s)Y(t) = \lim_{s\to\infty} yY(s+t) = y_\infty$$
for all $t \in \mathbb{R}$. So we have indeed $y_\infty A = y(t)A = y'(t) = 0$. In case $Y_\infty$ exists it holds $Y(t)Y_\infty = Y_\infty$ for all $t \in \mathbb{R}$. Therefrom $Y_\infty Y_\infty = Y_\infty$ follows. By choosing canonical unit vectors for $y$ we see $Y_\infty A = O$. Laplace transformation of $Y' = YA$ in each entry yields $z\hat{Y}(z) - Y(0) = \hat{Y}(z)A$ for the Laplace transform $\hat{Y}$ of $Y$. Because of $Y(0) = I$ we have indeed $\hat{Y}(z) = (zI - A)^{-1}$. Boundedness of $Y$ implies boundedness of each entry of $Y$. So $\hat{Y}$ is well defined on $\Re(z) > 0$. Holomorphy follows from $M^{-1} = \operatorname{adj} M / |M|$ (s. [4], prop. 3.3) with the adjugate $\operatorname{adj} M$ of a matrix $M$ with determinant $|M| \ne 0$. □

A square matrix $P = (p_{ij})$ is called (right) stochastic when all its entries $p_{ij} \ge 0$ and every row sum of $P$ equals one.

Lemma 3.
For a right intensity matrix $B$ its transition matrix $\exp(tB)$ is stochastic, hence a bounded function in $t \ge 0$.

Proof. The product $AB$ of any matrix $A$ with $B$ has vanishing row sums. Therefore $B^n$ has vanishing row sums for all $n \in \mathbb{N}$. Hence the partial sums
$$\sum_{n=0}^{N} \frac{t^n}{n!} B^n = I + tB + \dots + \frac{t^N}{N!} B^N, \quad N \in \mathbb{N},\ t \in \mathbb{R},$$
have row sums one. By the limiting process with $N \to \infty$ the same assertion holds for $\exp(tB)$. For proving non-negativity we choose a real number $\mu$ such that $T := B + \mu I$ has non-negative entries (like in the proof of [3], thm. 2.7). Since $\exp(tT)$ has non-negative entries and $e^{-\mu t} > 0$, the same holds for $\exp(tB) = \exp(-\mu tI)\exp(tT) = e^{-\mu t}\exp(tT)$ (s. Lemma 2!). □

A square matrix $(a_{ij})$ is called reducible when there is some non-empty, proper subset $I$ of the index set $Z$ such that for all $i \in I$ and all $j \in Z \setminus I$ it holds $a_{ij} = 0$. Otherwise the matrix is called irreducible. An index $r \in Z$ is called recurrent with respect to $A = (a_{ij})$ when there is some subset $J$ of $Z$ containing $r$ for which the submatrix $A_J := (a_{ij})_{i,j \in J}$ is irreducible. Then $J$ is called the recurrence class of $r$. A non-recurrent index is called transient. We use also the notation $A_{I,J} := (a_{ij})_{i \in I, j \in J}$ for subsets $I, J$ of $Z$. So we have $A_{J,J} = A_J$.

Lemma 4.
For a right intensity matrix $B$ with index set $Z$ let $D$ be the diagonal matrix with entries $d_i := -b_{ii}$, $i \in Z$. In case $d_i = 0$ substitute $d_i$ by one, so that $D$ becomes invertible. For $i \in Z$ let $q_{ii} := 0$ in case $b_{ii} \ne 0$ and $q_{ii} := 1$ otherwise. For $i, j \in Z$ with $i \ne j$ let $q_{ij} := b_{ij}/d_i$. This defines a stochastic matrix $Q = (q_{ij})_{i,j \in Z}$ with $B = DQ - D$. The function $x \mapsto xD$ is an isomorphism from the left null space of $B$ onto the left eigenspace of $Q$ with eigenvalue one. For every non-empty subset $J$ of $Z$ the same assertions hold for $B_J, D_J, Q_J$ instead of $B, D, Q$. The submatrix $B_J$ is irreducible if and only if $J$ is a recurrence class with respect to $B$. If the set $T$ of transient indices with respect to $B$ is not empty then $Z \setminus T$ is not empty and $B_T$ is invertible. In case $B$ is irreducible its left null space is spanned by a vector with positive coordinates.

Proof. In case $b_{ii} \ne 0$ it holds
$$\sum_{j \in Z} q_{ij} = \frac{1}{d_i} \sum_{j \ne i} b_{ij} = \frac{-b_{ii}}{d_i} = 1.$$
In case $b_{ii} = 0$ this row sum is also one. So $Q$ is stochastic. The equation $B = DQ - D$ and therefrom the equation $B_J = D_J Q_J - D_J$, $J \subset Z$, is verified readily. Hence for a row vector $x$ with index set $J$ the equation $xB_J = 0$ is equivalent with $(xD_J)Q_J = xD_J$. So the function $x \mapsto xD_J$ is an isomorphism from the left null space of $B_J$ onto the left eigenspace of $Q_J$ with eigenvalue one. The assertion about recurrence classes follows from the definition. From finiteness of $Z$ follows the existence of a recurrent index. Now assume $T \ne \emptyset$. The definition of transience implies the existence of some $i \in T$ and some $j \in Z \setminus T$ s.t. $b_{ij} \ne 0$, therefore also $q_{ij} \ne 0$. Hence $Q_T$ is not stochastic. According to [4], ch. 8.3.2, lem. 12 the submatrix $Q_T$ does not have eigenvalue one. So by the isomorphism above with $J = T$ the left null space of $B_T$ is zero. Thus $B_T$ is invertible. For proving the last assertion we may assume without loss of generality that $Z$ has more than one element. Assume $b_{ii} = 0$ for some $i \in Z$. Then we have $b_{ij} = 0$ for all $j \ne i$, so that $B$ is reducible. So for irreducible $B$ we have $q_{ij} = b_{ij}/d_i$ for all $i, j \in Z$ with $i \ne j$. Hence $Q$ is irreducible, too. By the Perron-Frobenius Theorem [4], thm. 8.2 the left eigenspace of $Q$ with eigenvalue one is spanned by a vector $y$ with positive coordinates. Since $D^{-1}$ has the positive diagonal entries $1/d_i$, the vector $x := yD^{-1}$ has also positive coordinates. By the above described isomorphism $x \mapsto xD$ the left null space of $B$ is one-dimensional. □

For the Markov process described in the footnote above $Q$ is the transition matrix of the embedded Markov chain.

Proposition 5.
For a right intensity matrix $B$ there exists a positive $\delta$ s.t. every holomorphic entry of the matrix valued function $z \mapsto z(zI - B)^{-1}$, $\Re(z) > 0$, is analytically continuable to the half plane $\Re(z) > -\delta$.

Proof. By Lemma 2 & 3 the given function $F(z) := z(zI - B)^{-1}$ is indeed well-defined and entry-wise holomorphic on the half plane $\Re(z) > 0$. The polynomial $z \mapsto |zI - B|$ has finitely many roots. So $F$ is meromorphic on the whole complex plane. According to [4], prop. 5.12 all eigenvalues of $B$ are elements of the compact disc of radius $\rho := \max\{-b_{ii} \mid i \in Z\}$ around $-\rho$. Hence the non-zero eigenvalues of $B = (b_{ij})_{i,j \in Z}$ have negative real part. So it suffices to show that $F$ is analytically continuable in $z = 0$.

1. case: $B$ has no transient indices. So the set $R$ of recurrent indices comprises the whole index set. By suitable permutation of $R$ we have block diagonal form $B = B_R = \operatorname{diag}(B_1, \dots, B_k)$ with irreducible intensity matrices $B_i$ as main diagonal submatrices of $B$ and with all other submatrices besides the $B_i$ blockwise equal to zero. Due to Lemma 4 the functions $z \mapsto |zI_i - B_i|$, with identity matrix $I_i$ of appropriate dimension, have the simple root $z = 0$. So each entry of the matrix
$$\operatorname{diag}(zI_1 - B_1, \dots, zI_k - B_k)^{-1} = \operatorname{diag}((zI_1 - B_1)^{-1}, \dots, (zI_k - B_k)^{-1})$$
has pole order at most one at $z = 0$. So by multiplication with $z$ we obtain $F(z)$ with pole order at most zero at $z = 0$.

2. case: The set $T$ of transient indices is not empty. Then we have the additional submatrix $B_T$ as the lower right block of $B$ and the lower left block $B_{T,R}$. Since $B_T$ is invertible due to Lemma 4, the entries of $C(z)^{-1}$ with $C(z) := zI_T - B_T$ have no pole at $z = 0$. With $A(z) := zI_R - B_R$ it holds
$$(zI - B)^{-1} = \begin{pmatrix} A(z) & O \\ -B_{T,R} & C(z) \end{pmatrix}^{-1} = \begin{pmatrix} A(z)^{-1} & O \\ C(z)^{-1} B_{T,R}\, A(z)^{-1} & C(z)^{-1} \end{pmatrix}.$$
So after multiplication with $z$ we have pole order at most zero at $z = 0$. □
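The decomposition $B = DQ - D$ from Lemma 4, on which the proofs above rely, is easy to check numerically. A minimal sketch (exact rationals; the example matrix and variable names are mine):

```python
# Sketch of the construction in Lemma 4: build D and Q from a right
# intensity matrix B and verify B = DQ - D with Q stochastic.
from fractions import Fraction

B = [[Fraction(-1), Fraction(1)], [Fraction(2), Fraction(-2)]]
n = len(B)
d = [-B[i][i] if B[i][i] != 0 else Fraction(1) for i in range(n)]
Q = [[(Fraction(1) if B[i][i] == 0 else Fraction(0)) if i == j
      else B[i][j] / d[i]
      for j in range(n)] for i in range(n)]

# Q is stochastic ...
assert all(sum(row) == 1 and min(row) >= 0 for row in Q)
# ... and B = DQ - D holds entry-wise
assert all(B[i][j] == d[i] * Q[i][j] - (d[i] if i == j else 0)
           for i in range(n) for j in range(n))

# the isomorphism x -> xD: from x B = 0 it follows (xD) Q = xD
x = [Fraction(2, 3), Fraction(1, 3)]              # left null vector of B
assert all(sum(x[i] * B[i][j] for i in range(n)) == 0 for j in range(n))
xD = [x[i] * d[i] for i in range(n)]
assert all(sum(xD[i] * Q[i][j] for i in range(n)) == xD[j] for j in range(n))
```

For this $B$ one gets $D = \operatorname{diag}(1, 2)$ and $Q = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and the left null vector $x = (2/3, 1/3)$ of $B$ maps to the left fixed vector $xD = (2/3, 2/3)$ of $Q$.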
3. Computation of the final limit
We show how to compute the final limit of Theorem 1 via matrix operations.
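As a concrete sketch of this computation (pure Python with exact rationals; all names are mine, the recurrence classes are found by a reachability closure rather than the matrix-power test of the remark below, and the one-dimensional null space is pinned down by replacing one redundant equation with the normalisation):

```python
# A sketch of computing the final limit P of exp(tB) for a right
# intensity matrix B: find the recurrence classes, solve p B_J = 0 with
# sum(p) = 1 for each class J, and fill the transient rows via f_{i,J}.
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(b[i])]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def final_limit(B):
    n = len(B)
    # reachability via Warshall's transitive closure
    R = [[i == j or B[i][j] != 0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    # i is recurrent iff every index reachable from i leads back to i
    rec = [all(not R[i][j] or R[j][i] for j in range(n)) for i in range(n)]
    T = [i for i in range(n) if not rec[i]]
    classes, seen = [], set()
    for i in range(n):
        if rec[i] and i not in seen:
            J = [j for j in range(n) if rec[j] and R[i][j] and R[j][i]]
            seen.update(J)
            classes.append(J)
    P = [[Fraction(0)] * n for _ in range(n)]
    for J in classes:
        m = len(J)
        # p B_J = 0 and sum(p) = 1: one column equation of B_J is
        # redundant (row sums vanish), so replace it by the normalisation
        A = [[Fraction(1)] * m] + \
            [[Fraction(B[J[i]][J[c]]) for i in range(m)]
             for c in range(1, m)]
        p = solve(A, [1] + [0] * (m - 1))
        for i in J:
            for idx, j in enumerate(J):
                P[i][j] = p[idx]
    if T:
        BT = [[B[i][j] for j in T] for i in T]
        for J in classes:
            # f = -B_T^{-1} B_{T,J} (1 ... 1)^t
            f = solve(BT, [-sum(Fraction(B[i][j]) for j in J) for i in T])
            for k, i in enumerate(T):
                for j in J:
                    P[i][j] = f[k] * P[j][j]
    return P

# transient state 2 feeding the two absorbing states 0 and 1
P = final_limit([[0, 0, 0], [0, 0, 0], [1, 1, -2]])
```

In this example the transient row of $P$ comes out as $(1/2, 1/2, 0)$: the factor $f_{i,J} = 1/2$ for both absorbing classes.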
Theorem 6.
For a right intensity matrix $B$ with finite index set $Z$ the final limit $P$ is stochastic. For every recurrence class $J \subset Z$ with respect to $B$ a row of $P$ with index in $J$ equals the unique row vector $p_J := (p_j)_{j \in Z}$ defined by $p_J B = 0$, $p_j > 0$ for $j \in J$, $p_j = 0$ for $j \in Z \setminus J$ and $\sum_{j \in Z} p_j = 1$. So the stochastic submatrix $P_J = \lim_{t\to\infty} \exp(tB_J)$ has equal rows with positive entries. In case the set $T \subset Z$ of transient indices with respect to $B$ is not empty the set $R := Z \setminus T$ of recurrent indices is not empty, too. The matrices $P_T$ and $B_{T,R}P_R + B_T P_{T,R}$ are zero. Hereby $B_T$ is invertible, whence $P_{T,R} = -B_T^{-1} B_{T,R} P_R$. For the column vector $(f_{i,J})_{i \in T} := -B_T^{-1} \sum_{j \in J} (b_{ij})_{i \in T}$ and the set $C$ of recurrence classes the row of index $i \in T$ of $P = (p_{ij})$ equals $\sum_{J \in C} f_{i,J}\, p_J$. Hence we have $p_{ij} = f_{i,J}\, p_{jj}$ for $i \in T$ and $j \in J \in C$.

Proof. Due to Lemma 3 the transition matrix $Y(t) := \exp(tB)$ is stochastic. By the limiting process with $t \to \infty$, which is allowed by Theorem 1, this implies the first assertion. From the definition of a recurrence class $J$ it follows $b_{ij} = 0$ for $i \in J$ and $j \in Z \setminus J$. So it holds $(B^n)_J = (B_J)^n$ and $(B^n)_{J,Z\setminus J} = O$ for all $n \in \mathbb{N}$. This implies $Y(t)_J = \exp(tB_J)$ and $Y(t)_{J,Z\setminus J} = O$ for all $t \in \mathbb{R}$, hence $P_J = \lim_{t\to\infty} \exp(tB_J)$ and $P_{J,Z\setminus J} = O$. Therefore $P_J$ is also stochastic. Since $B_J$ is irreducible the left null space of $B_J$ is spanned by a vector $(p_j)_{j \in J}$ with positive coordinates $p_j$ according to Lemma 4. By Lemma 2 this vector coincides with each row of $P_J$. Without loss of generality we may assume $\sum_{j \in J} p_j = 1$. By filling with zeros at indices $j \in Z \setminus J$ the row vector $(p_j)_{j \in J}$ becomes $p_J$ like in the second assertion. For $T \ne \emptyset$ we obtain the block form
$$B = \begin{pmatrix} B_R & O \\ B_{T,R} & B_T \end{pmatrix}$$
by permutation of the index set. By Lemma 2 it holds $PB = O$; since $B_{R,T} = O$ this yields $P_T B_T = O$. By Lemma 4 $B_T$ is invertible. So we have $P_T = O$.
Induction on $n \in \mathbb{N}$ yields $(B^{n+1})_{T,R} = B_{T,R}(B^n)_R + B_T(B^n)_{T,R}$. So for every $t \in \mathbb{R}$ it holds
$$Y'(t)_{T,R} = (Y(t)B)_{T,R} = B_{T,R} + B_{T,R} \sum_{n=1}^{\infty} \frac{t^n}{n!} (B^n)_R + B_T \sum_{n=1}^{\infty} \frac{t^n}{n!} (B^n)_{T,R}$$
by Lemma 2. Because of $I_{T,R} = O$ it follows $Y'(t)_{T,R} = B_{T,R} Y(t)_R + B_T Y(t)_{T,R}$. By the limiting process with $t \to \infty$ this implies the formula for $P_{T,R}$ since $\lim_{t\to\infty} Y'(t) = O$ by Lemma 2. Let $j \in J \in C$. As shown above it holds $p_{ij} = p_{jj}$ for $i \in J$ and $p_{ij} = 0$ for $i \in R \setminus J$. This implies
$$B_T P_{T,\{j\}} = -B_{T,R} P_{R,\{j\}} = -p_{jj} \sum_{j' \in J} B_{T,\{j'\}} = p_{jj}\, B_T\, (f_{i,J})_{i \in T}.$$
By multiplying with $B_T^{-1}$ we obtain $p_{ij} = f_{i,J}\, p_{jj}$ for $i \in T$ and hence the identity for the $i$-th row of $P$. □

Let us call a row of a right stochastic matrix a probability vector.

Remark. The theorem suggests the following pseudo-algorithm for computing the final limit $P$ from a right intensity matrix $B$ with index set $Z$:

0. Initialise $P := O$ (zero matrix of dimension like $B$) and $T := Z$.
1. Compute the set $C$ of recurrence classes with respect to $B$.
2. For $J \in C$:
2.1. Find a non-zero element $x = (x_j)_{j \in J}$ of the left null space of $B_J$.
2.2. Define $p := \lambda^{-1} x$ with $\lambda := \sum_{j \in J} x_j$.
2.3. Substitute every row of $P_J$ by $p$.
2.4. Substitute $T$ by $T \setminus J$.
3. If $T$ is not empty:
3.1. For $J \in C$:
3.1.1. Compute $(f_{i,J})_{i \in T} := -B_T^{-1} B_{T,J} (1 \dots 1)^t$.
3.1.2. For $i \in T$: For $j \in J$: Substitute $p_{ij}$ by $f_{i,J}\, p_{jj}$.
4. Output $P$.

For a square matrix $A$ with index set of $n > 1$ elements define $C = (c_{ij}) := A + A^2 + \dots + A^{n-1}$. Then index $i$ is in the same recurrence class as index $j$ if and only if $c_{ij}$ and $c_{ji}$ do not vanish. This explains how to do step 1.

For the embedded Markov chain mentioned in the latter footnote $f_{i,J}$ is the enter probability from the transient state $i$ into the recurrence class $J$.

References

[1] J. Korevaar,
Tauberian theory, Springer, Berlin (2004)
[2] D.J. Newman, Simple analytic proof of the prime number theorem, Amer. Math. Monthly (1980) 693-696
[3] E. Seneta, Non-negative Matrices and Markov chains, 2nd edition, Springer, Berlin (1981)
[4] D. Serre, Matrices, 2nd edition, Springer, Berlin (2010)
[5] D.W. Stroock, An introduction to Markov processes, 2nd edition, Springer, Berlin (2014)

Helmut Kahl
Munich University of Applied Sciences
Lothstr. 34
D-80335 München
Germany
e-mail: