A solution to the reversible embedding problem for finite Markov chains
Chen Jia
Beijing Computational Science Research Center, Beijing 100094, P.R. China
Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas 75080, U.S.A.
Email: [email protected]
Abstract
The embedding problem for Markov chains is a famous problem in probability theory and only partial results are available up till now. In this paper, we propose a variant of the embedding problem called the reversible embedding problem, which has a deep physical and biochemical background, and provide a complete solution to this new problem. We prove that the reversible embedding of a stochastic matrix, if it exists, must be unique. Moreover, we obtain the sufficient and necessary conditions for the existence of the reversible embedding and provide an effective method to compute the reversible embedding. Some examples are also given to illustrate the main results of this paper.
Keywords: imbedding problem, stochastic matrix, generator estimation, detailed balance
In 1937, Elfving [1] proposed the following problem: given an n × n stochastic matrix P, can we find an n × n generator matrix Q such that P = e^Q? This problem, which is referred to as the embedding problem for stochastic matrices or the embedding problem for finite Markov chains, is still an open problem in probability theory. Let X = {X_n : n ≥ 0} be a discrete-time homogeneous Markov chain with transition probability matrix P. The embedding problem is equivalent to asking whether we can find a continuous-time homogeneous Markov chain Y = {Y_t : t ≥ 0} with transition semigroup {P(t) : t ≥ 0} such that P = P(1). If this occurs, the discrete-time Markov chain X can be embedded as the discrete skeleton of the continuous-time Markov chain Y.

The embedding problem has been studied for a long time [2–13]. So far, the embedding problem for 2 × 2 stochastic matrices has been solved by Kendall and this result is published in Kingman [2]. The embedding problem for 3 × 3 stochastic matrices has also been solved owing to the work of Johansen [7], Carette [10], and Chen [12]. However, when the order n of the stochastic matrix P is larger than three, only partial results are available and our knowledge of the set of embeddable n × n stochastic matrices is quite limited. There is also an embedding problem for inhomogeneous Markov chains, which has been dealt with by some authors [14–22]. However, we only focus on the homogeneous case in this paper.

The embedding problem has wide applications in many scientific fields, such as social science [8, 9], mathematical finance [23], statistics [24, 25], biology [26, 27], and manpower planning [28]. One of the most important applications of the embedding problem is the generator estimation problem in statistics. Let Y = {Y_t : t ≥ 0} be a continuous-time Markov chain with generator matrix Q. In practice, it often occurs that we can only observe a sufficiently long trajectory of Y at several discrete times 0, T, 2T, ..., mT with time interval T. Let P = (p_ij) be the transition probability matrix of Y at time T. Then p_ij can be estimated by the maximum likelihood estimator
\[
\hat{p}_{ij} = \frac{\sum_{k=0}^{m-1} I_{\{Y_{kT}=i,\,Y_{(k+1)T}=j\}}}{\sum_{k=0}^{m-1} I_{\{Y_{kT}=i\}}}, \tag{1}
\]
where I_A denotes the indicator function of A. A natural question is whether we can obtain an estimator Q̂ of the generator matrix Q from the estimator P̂ = (p̂_ij) of the transition probability matrix P. It is reasonable to require the estimators P̂ and Q̂ to be related by P̂ = e^{Q̂T}. Therefore, the generator estimation problem in statistics is naturally related to the embedding problem for finite Markov chains.
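To make the estimation step concrete, the following minimal Python sketch implements the counting estimator (1) from a trajectory observed at equally spaced times. The trajectory, the state labels, and the function name are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import numpy as np

def transition_mle(observations, n_states):
    """Maximum likelihood estimate of the one-step transition matrix,
    as in (1), from a trajectory observed at times 0, T, 2T, ..."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(observations[:-1], observations[1:]):
        counts[i, j] += 1                      # transitions observed from i to j
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.where(row_sums == 0, 1.0, row_sums)

# Hypothetical discretely observed trajectory of a three-state chain.
obs = [0, 1, 1, 2, 0, 2, 2, 1, 0, 0, 1, 2, 2, 0]
P_hat = transition_mle(obs, n_states=3)
print(P_hat)                                   # every visited row sums to one
```

Obtaining a generator estimate Q̂ with P̂ = e^{Q̂T} from such a P̂ is exactly the embedding problem discussed above.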
Recently, the generator estimation problem has been widely studied in biology [26, 27], since a number of biochemical systems can be modeled by continuous-time Markov chains. In general, there are two types of Markov chains that must be distinguished, reversible chains and irreversible chains. In the reversible case, the detailed balance condition µ_i q_ij = µ_j q_ji holds for any pair of states i and j, where µ_i is the stationary probability of state i and q_ij is the transition rate from state i to j [29]. From the physical perspective, detailed balance is a thermodynamic constraint for closed systems. In other words, if there is no sustained energy supply, then a biochemical system must satisfy detailed balance [30]. In the modelling of many biochemical systems such as enzymes [31] and ion channels [32], detailed balance has become a basic assumption [33]. Therefore, in many realistic biochemical systems, what we are concerned about is not simply to find a generator matrix Q̂ such that P̂ = e^{Q̂T}, but to find a reversible Markov chain with generator matrix Q̂ such that P̂ = e^{Q̂T}.

Here we consider the following problem: given an n × n stochastic matrix P, can we find a reversible generator matrix Q such that P = e^Q? This problem will be referred to as the reversible embedding problem for stochastic matrices or the reversible embedding problem for finite Markov chains in this paper. Compared with the classical embedding problem, the reversible embedding problem has a deeper physical and biochemical background.

In this paper, we provide a complete solution to the reversible embedding problem. We prove that the reversible embedding of stochastic matrices, if it exists, must be unique. Moreover, we give the sufficient and necessary conditions for the existence of the reversible embedding and provide an effective method to compute the reversible embedding. Finally, we use some examples of 3 × 3 stochastic matrices to illustrate the main results of this paper.

For clarity, we recall several basic definitions.

Definition 1. An n × n real matrix P = (p_ij) is called a stochastic matrix if p_ij ≥ 0 for any i, j = 1, 2, ..., n and Σ_{j=1}^n p_ij = 1 for any i = 1, 2, ..., n.

Definition 2. An n × n real matrix Q = (q_ij) is called a generator matrix if q_ij ≥ 0 for any i ≠ j and Σ_{j=1}^n q_ij = 0 for any i = 1, 2, ..., n.

In this paper, we consider a fixed n × n stochastic matrix P. For simplicity, we assume that P is irreducible. Otherwise, we may restrict our discussion to an irreducible recurrent class of P. Since P is irreducible, it has a unique invariant distribution µ = (µ_1, µ_2, ..., µ_n) whose components are all positive [29].

Definition 3. P is called reversible if the detailed balance condition µ_i p_ij = µ_j p_ji holds for any i, j = 1, 2, ..., n. In this case, µ is called a reversible distribution of P.
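As a quick numerical illustration of Definition 3, the sketch below computes the invariant distribution of an irreducible stochastic matrix and checks the detailed balance condition µ_i p_ij = µ_j p_ji entrywise. The matrix and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def invariant_distribution(P):
    """Left Perron eigenvector of P, normalized to a probability vector."""
    w, V = np.linalg.eig(P.T)
    mu = np.real(V[:, np.argmin(np.abs(w - 1))])
    return mu / mu.sum()

def is_reversible(P, tol=1e-10):
    """Check the detailed balance condition mu_i p_ij = mu_j p_ji."""
    mu = invariant_distribution(P)
    F = mu[:, None] * P          # probability flux matrix with entries mu_i p_ij
    return np.allclose(F, F.T, atol=tol)

# Hypothetical 3 x 3 irreducible stochastic matrix.
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
print(is_reversible(P))   # True: this P is symmetric, hence reversible
```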
Definition 4. Let Q be an n × n generator matrix. Then Q is called reversible if there exists a distribution π = (π_1, π_2, ..., π_n) such that the detailed balance condition π_i q_ij = π_j q_ji holds for any i, j = 1, 2, ..., n. In this case, π is called a reversible distribution of Q.

In fact, if π is a reversible distribution of Q, then π is also an invariant distribution of Q.
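The remark that a reversible distribution is automatically invariant follows by summing the detailed balance relation over i: Σ_i π_i q_ij = Σ_i π_j q_ji = π_j Σ_i q_ji = 0, since the rows of Q sum to zero, so πQ = 0. A minimal numerical sketch of this fact is given below; the generator matrix and the distribution are hypothetical.

```python
import numpy as np

# Hypothetical reversible generator matrix Q with reversible distribution pi:
# the rates are chosen so that pi_i q_ij = pi_j q_ji for every pair i, j.
pi = np.array([0.2, 0.3, 0.5])
Q = np.array([[-5.0,  3.0,  2.0],
              [ 2.0, -3.0,  1.0],
              [ 0.8,  0.6, -1.4]])

flux = pi[:, None] * Q                    # entries pi_i q_ij
print(np.allclose(flux, flux.T))          # detailed balance holds
print(np.allclose(pi @ Q, np.zeros(3)))   # hence pi Q = 0, i.e. pi is invariant
```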
Definition 5. If there exists an n × n real matrix A such that P = e^A, then A is called a real logarithm of P.

Definition 6. P is called embeddable if there exists an n × n generator matrix Q such that P = e^Q. In this case, Q is called an embedding of P.

It is easy to see that if Q is an embedding of P, then Q is also a real logarithm of P.
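For a numerical feel for Definitions 5 and 6, the sketch below computes the principal matrix logarithm of a stochastic matrix with scipy.linalg.logm and checks whether it is a generator matrix. The example matrix is hypothetical, and the test only examines one particular real logarithm; a matrix may still be embeddable through a different real logarithm, so a negative answer from this sketch alone is not conclusive.

```python
import numpy as np
from scipy.linalg import logm, expm

def is_generator(Q, tol=1e-10):
    """Off-diagonal entries nonnegative and every row sums to zero."""
    off_diag_ok = np.all(Q - np.diag(np.diag(Q)) >= -tol)
    zero_rows_ok = np.allclose(Q.sum(axis=1), 0.0, atol=tol)
    return off_diag_ok and zero_rows_ok

P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])

A = np.real(logm(P))                 # the principal real logarithm of P
print(is_generator(A))               # True here, so this A is an embedding of P
print(np.allclose(expm(A), P))       # sanity check: exp(A) recovers P
```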
In this paper, we shall address the reversible embedding problem for stochastic matrices. We first give the definition of reversible embeddability for stochastic matrices.

Definition 7. P is called reversibly embeddable if there exists an n × n reversible generator matrix Q such that P = e^Q. In this case, Q is called a reversible embedding of P.

Lemma 1. If Q is a reversible embedding of P, then Q is irreducible and µ is the reversible distribution of Q.

Proof. If Q has two or more communicating classes, it is easy to see that P = e^Q also has two or more communicating classes, which contradicts the irreducibility of P. This shows that Q is irreducible. Since Q is reversible, it has a reversible distribution π = (π_1, π_2, ..., π_n) which is also the invariant distribution of Q. Therefore, π is the unique invariant distribution of P = e^Q. This shows that π = µ.

Remark 1. Let X = {X_n : n ≥ 0} be a stationary discrete-time Markov chain on the finite state space S = {1, 2, ..., n} with transition probability matrix P = (p_ij). According to the above lemma, if P is reversibly embeddable with reversible embedding Q, then there exists a reversible continuous-time Markov chain Y = {Y_t : t ≥ 0} on the state space S with generator matrix Q such that X and {Y_n : n ≥ 0} have the same distribution. Therefore, the stationary discrete-time Markov chain X can be embedded as the discrete skeleton of the reversible continuous-time Markov chain Y. That is why the problem dealt with in this paper is called the reversible embedding problem for finite Markov chains.
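To illustrate the discrete-skeleton picture in Remark 1, the following sketch simulates a continuous-time chain with a hypothetical reversible generator matrix Q, records its values at the integer times 0, 1, 2, ..., and compares the empirical one-step transition frequencies of this skeleton with e^Q. The simulation routine is only an illustration and is not part of the paper.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Hypothetical generator matrix of a birth-death chain, which is reversible.
Q = np.array([[-1.0,  1.0,  0.0],
              [ 0.5, -1.5,  1.0],
              [ 0.0,  2.0, -2.0]])

def skeleton(Q, t_max, x0=0):
    """Simulate the chain with generator Q on [0, t_max] and return its
    values at the integer times 0, 1, ..., t_max (its discrete skeleton)."""
    t, x, samples, next_obs = 0.0, x0, [], 0
    while next_obs <= t_max:
        dwell = rng.exponential(1.0 / -Q[x, x])      # holding time in state x
        while next_obs <= t_max and t + dwell > next_obs:
            samples.append(x)                        # state occupied at an integer time
            next_obs += 1
        t += dwell
        jump = Q[x].copy(); jump[x] = 0.0; jump /= -Q[x, x]
        x = rng.choice(len(jump), p=jump)            # jump to the next state
    return samples

obs = skeleton(Q, t_max=100_000)
counts = np.zeros((3, 3))
for i, j in zip(obs[:-1], obs[1:]):
    counts[i, j] += 1
print(np.round(counts / counts.sum(axis=1, keepdims=True), 3))
print(np.round(expm(Q), 3))   # should be close to the empirical matrix above
```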
In this paper, a diagonal matrix with diagonal entries a_1, a_2, ..., a_n will be denoted by diag(a_1, a_2, ..., a_n).

Lemma 2. If P is reversible, then P is diagonalizable and the eigenvalues of P are all real numbers.

Proof. Let M = diag(√µ_1, √µ_2, ..., √µ_n). Since µ is a reversible distribution of P, it is easy to see that S = M P M^{-1} is a symmetric matrix. Thus S must be diagonalizable and the eigenvalues of S are all real numbers. This shows that P = M^{-1} S M is also diagonalizable and the eigenvalues of P are all real numbers.
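Numerically, the symmetrization used in this proof is also the stable way to diagonalize a reversible stochastic matrix: conjugating by M = diag(√µ_1, ..., √µ_n) produces a symmetric matrix whose spectrum can be computed with a symmetric eigensolver. A minimal sketch with a hypothetical reversible (but not symmetric) P follows.

```python
import numpy as np

# Hypothetical reversible stochastic matrix with reversible distribution mu.
P = np.array([[0.50, 0.30, 0.20],
              [0.20, 0.50, 0.30],
              [0.08, 0.18, 0.74]])
mu = np.array([0.2, 0.3, 0.5])

M = np.diag(np.sqrt(mu))
Minv = np.diag(1.0 / np.sqrt(mu))
S = M @ P @ Minv                      # symmetric exactly when mu_i p_ij = mu_j p_ji
print(np.allclose(S, S.T))            # True

eigvals, R = np.linalg.eigh(S)        # real eigenvalues, orthonormal eigenvectors
print(eigvals)                        # these are also the eigenvalues of P
T = Minv @ R                          # columns diagonalize P, i.e. P = T D T^{-1}
print(np.allclose(P, T @ np.diag(eigvals) @ np.linalg.inv(T)))
```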
Lemma 3. If P is reversibly embeddable, then P is reversible and the eigenvalues of P are all positive.

Proof. Since P is reversibly embeddable, there exists a reversible generator matrix Q such that P = e^Q. Let M = diag(√µ_1, √µ_2, ..., √µ_n). Since µ is a reversible distribution of Q, it is easy to see that S = M Q M^{-1} is a symmetric matrix. Thus e^S = M P M^{-1} is also a symmetric matrix. This shows that P is reversible. Since Q is reversible, the eigenvalues of Q are all real numbers. Thus the eigenvalues of P = e^Q are all positive.

Due to the above lemma, we always assume that P is reversible and the eigenvalues of P are all positive in the sequel. Let λ_1, λ_2, ..., λ_n be all the eigenvalues of P, where λ_i > 0 for any i = 1, 2, ..., n. Let γ_1, γ_2, ..., γ_m be the mutually different eigenvalues of P, where m ≤ n. Let n_i be the multiplicity of the eigenvalue γ_i for any i = 1, 2, ..., m. It is easy to see that n_1 + n_2 + ... + n_m = n. Let
\[
D = \mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n) = \mathrm{diag}(\gamma_1, \cdots, \gamma_1, \gamma_2, \cdots, \gamma_2, \cdots, \gamma_m, \cdots, \gamma_m), \tag{2}
\]
and let
\[
\log D = \mathrm{diag}(\log\lambda_1, \log\lambda_2, \cdots, \log\lambda_n) = \mathrm{diag}(\log\gamma_1, \cdots, \log\gamma_1, \log\gamma_2, \cdots, \log\gamma_2, \cdots, \log\gamma_m, \cdots, \log\gamma_m). \tag{3}
\]
Since P is reversible, according to Lemma 2, there exists an n × n real invertible matrix T such that P = T D T^{-1}. Let H be the matrix defined as
\[
H = T \log D\, T^{-1}. \tag{4}
\]
It is easy to see that e^H = T D T^{-1} = P, which shows that H is a real logarithm of P.
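The matrix H defined in (4) can be computed directly from a spectral decomposition of P. The sketch below does this for a hypothetical reversible stochastic matrix, using the symmetrization of Lemma 2 to obtain T, and verifies that e^H = P. Theorem 1 below shows that H is the only possible reversible embedding of P, and Theorem 2 characterizes when it actually is one.

```python
import numpy as np
from scipy.linalg import expm

def candidate_H(P, mu):
    """H = T log(D) T^{-1} built from a spectral decomposition of a
    reversible stochastic matrix P with reversible distribution mu."""
    M = np.diag(np.sqrt(mu))
    Minv = np.diag(1.0 / np.sqrt(mu))
    eigvals, R = np.linalg.eigh(M @ P @ Minv)   # real spectrum by reversibility
    if np.any(eigvals <= 0):
        raise ValueError("P has a nonpositive eigenvalue, so H is not defined")
    T = Minv @ R                                # P = T D T^{-1}
    return T @ np.diag(np.log(eigvals)) @ R.T @ M   # T log(D) T^{-1}

# Hypothetical reversible example (the same matrix as in the previous sketch).
P = np.array([[0.50, 0.30, 0.20],
              [0.20, 0.50, 0.30],
              [0.08, 0.18, 0.74]])
mu = np.array([0.2, 0.3, 0.5])

H = candidate_H(P, mu)
print(np.allclose(expm(H), P))    # True: H is a real logarithm of P
```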
It has been known for a long time that the embedding of a stochastic matrix may not be unique [5]. However, this is not the case when it comes to the reversible embedding. The following theorem, which is the first main result of this paper, reveals the uniqueness of the reversible embedding. It shows that the reversible embedding of a stochastic matrix, if it exists, must be unique.

Theorem 1. P has at most one reversible embedding. If P is reversibly embeddable, then the unique reversible embedding of P must be H.

Proof. This theorem will be proved in Section 5.

We next study the existence of the reversible embedding.

Lemma 4. Assume that P is reversible and the eigenvalues of P are all positive. Then P is reversibly embeddable if and only if H is a reversible generator matrix.

Proof.
This lemma follows directly from Theorem 1.

In general, it is difficult to determine whether H is a reversible generator matrix. Thus we hope to obtain simpler sufficient and necessary conditions for a stochastic matrix being reversibly embeddable. Let k_0, k_1, ..., k_{m−1} ∈ R be the solution to the following system of linear equations:
\[
\begin{cases}
k_0 + k_1\gamma_1 + \cdots + k_{m-1}\gamma_1^{m-1} = \log\gamma_1,\\
k_0 + k_1\gamma_2 + \cdots + k_{m-1}\gamma_2^{m-1} = \log\gamma_2,\\
\qquad\cdots\cdots\\
k_0 + k_1\gamma_m + \cdots + k_{m-1}\gamma_m^{m-1} = \log\gamma_m.
\end{cases} \tag{5}
\]
Since
\[
\det\begin{pmatrix}
1 & \gamma_1 & \cdots & \gamma_1^{m-1}\\
1 & \gamma_2 & \cdots & \gamma_2^{m-1}\\
\cdots & \cdots & \cdots & \cdots\\
1 & \gamma_m & \cdots & \gamma_m^{m-1}
\end{pmatrix}
= \prod_{1\le i<j\le m}(\gamma_j-\gamma_i) \neq 0, \tag{6}
\]
the above system of linear equations has a unique solution. Since each diagonal entry of D is one of the mutually different eigenvalues γ_1, γ_2, ..., γ_m, it follows from (5) that
\[
\log D = k_0 I + k_1 D + \cdots + k_{m-1} D^{m-1}, \tag{7}
\]
and therefore
\[
H = T\log D\,T^{-1} = k_0 I + k_1 P + \cdots + k_{m-1} P^{m-1}. \tag{8}
\]
We can now state the sufficient and necessary conditions for P to be reversibly embeddable.

Theorem 2. P is reversibly embeddable if and only if the following two conditions hold:
(i) P is reversible and the eigenvalues of P are all positive;
(ii) k_1 p_ij + k_2 p_ij^{(2)} + ... + k_{m−1} p_ij^{(m−1)} ≥ 0 for any i ≠ j, where p_ij^{(l)} denotes the (i, j) entry of P^l.

Proof. This theorem will be proved in Section 6.

From (8), it is easy to see that the condition (ii) in the above theorem is equivalent to saying that the off-diagonal entries of H are all nonnegative.

The following classical result about the embeddability for 2 × 2 stochastic matrices is due to Kendall and is published in Kingman [2]. Interestingly, the above theorem gives a simple derivation of this classical result.

Theorem 3 (Kendall). Let P be a 2 × 2 irreducible stochastic matrix. Then the following four statements are equivalent:
(i) P is embeddable;
(ii) P is reversibly embeddable;
(iii) tr(P) > 1;
(iv) det(P) > 0.

Proof. It is easy to see that (iii) and (iv) are equivalent. Thus we only need to prove that (i), (ii), and (iii) are equivalent.

Assume that Q is an embedding of P. If Q has two or more communicating classes, it is easy to see that P = e^Q also has two or more communicating classes. This contradicts the irreducibility of P. This shows that Q is irreducible. It is easy to see that a 2 × 2 irreducible generator matrix Q must be reversible. Thus Q is a reversible embedding of P. This shows that (i) and (ii) are equivalent.

We next prove that (ii) and (iii) are equivalent. Let
\[
P = \begin{pmatrix} p & 1-p\\ 1-q & q \end{pmatrix}. \tag{9}
\]
The irreducibility of P implies that p, q < 1. It is easy to see that P is reversible and the two eigenvalues of P are γ_1 = 1 and γ_2 = p + q − 1 < 1. Thus the condition (i) in Theorem 2 is equivalent to saying that p + q > 1. Assume that p + q > 1. Let k_0 and k_1 be the solution to the following system of linear equations:
\[
\begin{cases}
k_0 + k_1\gamma_1 = \log\gamma_1,\\
k_0 + k_1\gamma_2 = \log\gamma_2.
\end{cases} \tag{10}
\]
It is easy to check that
\[
k_1 = -k_0 = \frac{\log(p+q-1)}{p+q-2}. \tag{11}
\]
Since 0 < p + q − 1 < 1, it is easy to see that k_1 > 0. Thus the off-diagonal entries of k_1 P are all nonnegative. This shows that the condition (ii) in Theorem 2 holds. Thus p + q > 1 implies the condition (ii) in Theorem 2. By Theorem 2, P is reversibly embeddable if and only if p + q > 1, that is, tr(P) > 1. This shows that (ii) and (iii) are equivalent.
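The conditions of Theorem 2 translate directly into a numerical test: compute the eigenvalues of P, solve the Vandermonde system (5) for k_0, ..., k_{m−1}, and inspect the off-diagonal entries of k_0 I + k_1 P + ... + k_{m−1} P^{m−1}, which by (8) equals H. The sketch below is one possible implementation; the function name, the tolerances, and the grouping of numerically coincident eigenvalues are choices made here and are not part of the paper. It assumes P is irreducible, as in the rest of the paper.

```python
import numpy as np

def reversible_embedding(P, tol=1e-9):
    """Return the reversible embedding H of P if it exists, else None (Theorem 2)."""
    # Condition (i): P must be reversible and its eigenvalues must be positive.
    w, V = np.linalg.eig(P.T)
    mu = np.real(V[:, np.argmin(np.abs(w - 1))]); mu = mu / mu.sum()
    flux = mu[:, None] * P
    if not np.allclose(flux, flux.T, atol=tol):
        return None
    lam = np.linalg.eigvalsh(np.diag(np.sqrt(mu)) @ P @ np.diag(1 / np.sqrt(mu)))
    if np.any(lam <= tol):
        return None
    # Mutually different eigenvalues gamma_1, ..., gamma_m (grouped numerically).
    gammas = []
    for x in lam:
        if not any(abs(x - g) < 1e-7 for g in gammas):
            gammas.append(x)
    gammas = np.array(gammas)
    # Solve the Vandermonde system (5) for k_0, ..., k_{m-1}.
    k = np.linalg.solve(np.vander(gammas, increasing=True), np.log(gammas))
    # Form H = k_0 I + k_1 P + ... + k_{m-1} P^{m-1} as in (8).
    H = sum(k[l] * np.linalg.matrix_power(P, l) for l in range(len(k)))
    # Condition (ii): the off-diagonal entries of H must be nonnegative.
    return H if np.all(H - np.diag(np.diag(H)) >= -tol) else None

# A 2 x 2 illustration of Kendall's criterion: embeddable if and only if tr(P) > 1.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
print(reversible_embedding(P))   # a generator matrix, since tr(P) = 1.6 > 1
```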
As another application of Theorem 2, we give some simple and direct criteria for a 3 × 3 stochastic matrix being reversibly embeddable. Let P be a 3 × 3 irreducible stochastic matrix. The Perron-Frobenius theorem [34] claims that one eigenvalue of P must be 1 and the absolute values of the other two eigenvalues must be less than 1.

We first consider the case where P has a pair of coincident eigenvalues.

Theorem 4. Let P be a 3 × 3 irreducible stochastic matrix with eigenvalues 1, λ, and λ. Then P is reversibly embeddable if and only if P is reversible and λ > 0.

Proof. The condition (i) in Theorem 2 is equivalent to saying that P is reversible and λ > 0. Assume that λ > 0. The Perron-Frobenius theorem implies that 0 < λ < 1. Since P has a pair of coincident eigenvalues, the mutually different eigenvalues of P are γ_1 = 1 and γ_2 = λ. Let k_0 and k_1 be the solution to the following system of linear equations:
\[
\begin{cases}
k_0 + k_1\gamma_1 = \log\gamma_1,\\
k_0 + k_1\gamma_2 = \log\gamma_2.
\end{cases} \tag{12}
\]
Straightforward calculations show that
\[
k_1 = -k_0 = \frac{\log\lambda}{\lambda-1}. \tag{13}
\]
Since 0 < λ < 1, it is easy to see that k_1 > 0. Thus the off-diagonal entries of k_1 P are all nonnegative. This shows that the condition (ii) in Theorem 2 holds. Thus λ > 0 implies the condition (ii) in Theorem 2. By Theorem 2, P is reversibly embeddable if and only if P is reversible and λ > 0.

We next consider the case where P has three different eigenvalues.

Theorem 5. Let P be a 3 × 3 irreducible stochastic matrix with distinct eigenvalues 1, λ, and η. Then P is reversibly embeddable if and only if the following two conditions hold:
(i) P is reversible and λ, η > 0;
(ii) k_1 p_ij + k_2 p_ij^{(2)} ≥ 0 for any i ≠ j, where
\[
k_1 = \frac{(\eta^2-1)\log\lambda - (\lambda^2-1)\log\eta}{(\lambda-1)(\eta-1)(\eta-\lambda)},\qquad
k_2 = \frac{(\lambda-1)\log\eta - (\eta-1)\log\lambda}{(\lambda-1)(\eta-1)(\eta-\lambda)}. \tag{14}
\]

Proof. It is easy to see that the mutually different eigenvalues of P are γ_1 = 1, γ_2 = λ, and γ_3 = η. Let k_0, k_1, k_2 be the solution to the following system of linear equations:
\[
\begin{cases}
k_0 + k_1\gamma_1 + k_2\gamma_1^2 = \log\gamma_1,\\
k_0 + k_1\gamma_2 + k_2\gamma_2^2 = \log\gamma_2,\\
k_0 + k_1\gamma_3 + k_2\gamma_3^2 = \log\gamma_3.
\end{cases} \tag{15}
\]
By solving the above system of linear equations, it is easy to check that (14) holds. The rest of the proof follows directly from Theorem 2.

In this section, we shall use some examples of 3 × 3 stochastic matrices to illustrate the main results of this paper. Let P = (p_ij) be a 3 × 3 irreducible stochastic matrix. It is well known that P is reversible if and only if p_12 p_23 p_31 = p_13 p_32 p_21. This result is a direct corollary of Kolmogorov's criterion for reversibility [35], which claims that a discrete-time Markov chain is reversible if and only if the product of the transition probabilities along each cycle and that along its reversed cycle are exactly the same.

Example 1. Let
\[
P = \begin{pmatrix} 1/6 & 1/3 & 1/2\\ 1/2 & 1/6 & 1/3\\ 1/3 & 1/2 & 1/6 \end{pmatrix}. \tag{16}
\]
It is easy to check that p_12 p_23 p_31 ≠ p_13 p_32 p_21, which shows that P is not reversible. Thus it follows from Theorem 2 that P is not reversibly embeddable.

Example 2. Let
\[
P = \begin{pmatrix} 1/3 & 1/2 & 1/6\\ 1/2 & 1/6 & 1/3\\ 1/6 & 1/3 & 1/2 \end{pmatrix}. \tag{17}
\]
It is easy to check that P is reversible and the three eigenvalues of P are
\[
\lambda_1 = 1,\qquad \lambda_2 = \frac{\sqrt{3}}{6},\qquad \lambda_3 = -\frac{\sqrt{3}}{6}. \tag{18}
\]
Since λ_3 < 0, it follows from Theorem 2 that P is not reversibly embeddable.

Example 3. Let
\[
P = \begin{pmatrix} 1/2 & 2/5 & 1/10\\ 2/5 & 2/5 & 1/5\\ 1/10 & 1/5 & 7/10 \end{pmatrix}. \tag{19}
\]
It is easy to check that P is reversible and the three eigenvalues of P are
\[
\lambda_1 = 1,\qquad \lambda_2 = \frac{3+\sqrt{7}}{10},\qquad \lambda_3 = \frac{3-\sqrt{7}}{10}, \tag{20}
\]
which are all positive and mutually different. Let k_1 and k_2 be the two real numbers defined in (14). Straightforward calculations show that k_1 p_13 + k_2 p_13^{(2)} < 0. Thus it follows from Theorem 5 that P is not reversibly embeddable.

Example 4. Let
\[
P = \begin{pmatrix} 1/2 & 1/4 & 1/4\\ 1/4 & 1/2 & 1/4\\ 1/4 & 1/4 & 1/2 \end{pmatrix}. \tag{21}
\]
It is easy to check that P is reversible and the three eigenvalues of P are
\[
\lambda_1 = 1,\qquad \lambda_2 = \lambda_3 = \frac{1}{4}, \tag{22}
\]
which are all positive. Since P has a pair of coincident eigenvalues, it follows from Theorem 4 that P is reversibly embeddable.
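The failing entry in Example 3 can be checked numerically. The sketch below computes λ_2 and λ_3 for the matrix in (19), evaluates k_1 and k_2 from (14), and confirms that k_1 p_13 + k_2 p_13^{(2)} is negative, so condition (ii) of Theorem 5 fails. The variable names are chosen here only for readability.

```python
import numpy as np

P = np.array([[1/2, 2/5, 1/10],
              [2/5, 2/5, 1/5],
              [1/10, 1/5, 7/10]])

eigs = np.sort(np.linalg.eigvalsh(P))      # P is symmetric, hence reversible
eta, lam = eigs[0], eigs[1]                # (3 - sqrt(7))/10 and (3 + sqrt(7))/10

denom = (lam - 1) * (eta - 1) * (eta - lam)
k1 = ((eta**2 - 1) * np.log(lam) - (lam**2 - 1) * np.log(eta)) / denom
k2 = ((lam - 1) * np.log(eta) - (eta - 1) * np.log(lam)) / denom

P2 = P @ P
print(k1 * P[0, 2] + k2 * P2[0, 2])        # negative, so P is not reversibly embeddable
```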
For convenience, we give the following definition.

Definition 8. Assume that P is reversible and the eigenvalues of P are all positive. Let T be an n × n real invertible matrix such that P = T D T^{-1}. Then H = T log D T^{-1} is called a candidate of P.

It is easy to see that if H is a candidate of P, then H is a real logarithm of P.

Lemma 5. Let λ be a complex number and let
\[
A = \begin{pmatrix}
\lambda & 1 & & \\
 & \lambda & \ddots & \\
 & & \ddots & 1\\
 & & & \lambda
\end{pmatrix}_{n\times n}. \tag{23}
\]
Then the Jordan canonical form of e^A is
\[
\begin{pmatrix}
e^{\lambda} & 1 & & \\
 & e^{\lambda} & \ddots & \\
 & & \ddots & 1\\
 & & & e^{\lambda}
\end{pmatrix}_{n\times n}. \tag{24}
\]

Proof. Let
\[
B = \begin{pmatrix}
0 & 1 & & \\
 & 0 & \ddots & \\
 & & \ddots & 1\\
 & & & 0
\end{pmatrix}_{n\times n}. \tag{25}
\]
Then A = λI + B. Note that B^l = 0 for any l ≥ n. Thus we have
\[
e^A = \sum_{k=0}^{\infty}\frac{1}{k!}(\lambda I+B)^k
= \sum_{k=0}^{\infty}\frac{1}{k!}\sum_{m=0}^{k} C_k^m \lambda^m B^{k-m}
= \sum_{m=0}^{\infty}\lambda^m \sum_{k=m}^{m+n-1}\frac{1}{k!} C_k^m B^{k-m}
= \sum_{m=0}^{\infty}\lambda^m \sum_{k=0}^{n-1}\frac{1}{(k+m)!} C_{k+m}^m B^{k}
= \sum_{m=0}^{\infty}\frac{1}{m!}\lambda^m \sum_{k=0}^{n-1}\frac{1}{k!} B^{k}
= e^{\lambda}\sum_{k=0}^{n-1}\frac{1}{k!}B^{k}. \tag{26}
\]
In view of the above equation, it is easy to check that (e^A − e^λ I)^{n−1} = e^{(n−1)λ} B^{n−1} ≠ 0 but (e^A − e^λ I)^n = 0. This shows that e^λ is an n-fold eigenvalue of e^A and the Jordan canonical form of e^A is e^λ I + B.

Lemma 6. Assume that P is reversible and the eigenvalues of P are all positive. Then each reversible embedding of P must be a candidate of P.

Proof. Let Q be a reversible embedding of P. Since Q is reversible, the eigenvalues of Q must be all real numbers. Since P is diagonalizable, it follows from Lemma 5 that Q must also be diagonalizable and thus the Jordan canonical form of Q must be log D. Thus there exists an n × n real invertible matrix T such that Q = T log D T^{-1}. Thus P = e^Q = T D T^{-1}. This shows that Q is a candidate of P.

Lemma 7. Assume that P is reversible and the eigenvalues of P are all positive. Then the candidate of P is unique.

Proof. Let H_1 and H_2 be two candidates of P. Thus there exist two n × n real invertible matrices T_1 and T_2 such that H_1 = T_1 log D T_1^{-1} and H_2 = T_2 log D T_2^{-1}. This shows that
\[
T_1 D T_1^{-1} = e^{H_1} = P = e^{H_2} = T_2 D T_2^{-1}. \tag{27}
\]
Let S = (s_ij) = T_1^{-1} T_2. It is easy to see that S D = D S, which implies that s_ij λ_j = λ_i s_ij for any i, j = 1, 2, ..., n. This shows that s_ij = 0 whenever λ_i ≠ λ_j. Thus whether λ_i = λ_j or λ_i ≠ λ_j, we always have s_ij log λ_j = log λ_i s_ij for any i, j = 1, 2, ..., n, which is equivalent to saying that S log D = log D S. Thus we obtain that
\[
H_2 = T_2 \log D\, T_2^{-1} = T_1 \log D\, T_1^{-1} = H_1. \tag{28}
\]
This shows that the candidate of P is unique.

We are now in a position to prove Theorem 1.

Proof of Theorem 1. Assume that P is reversibly embeddable. By Lemma 3, P is reversible and the eigenvalues of P are all positive. Since H is a candidate of P, it follows from Lemmas 6 and 7 that the reversible embedding of P must be the unique candidate H of P. Thus P has at most one reversible embedding.

Lemma 8. Assume that P is reversible and the eigenvalues of P are all positive. Let H = (h_ij) be the unique candidate of P. Then µ_i h_ij = µ_j h_ji for any i, j = 1, 2, ..., n.

Proof. Let M = diag(√µ_1, √µ_2, ..., √µ_n). Since µ is a reversible distribution of P, it is easy to see that S = M P M^{-1} is a symmetric matrix. Thus there exists an orthogonal matrix R such that R S R^T = D. This shows that R M P M^{-1} R^T = D, or equivalently, P = M^{-1} R^T D R M. Thus H = M^{-1} R^T log D R M, or equivalently, M H M^{-1} = R^T log D R. This shows that M H M^{-1} is a symmetric matrix, which implies that µ_i h_ij = µ_j h_ji for any i, j = 1, 2, ..., n.

Lemma 9. Assume that P is reversible and the eigenvalues of P are all positive. Let H be the unique candidate of P. Let 1 be the n-dimensional column vector whose components are all 1. Then H1 = 0.

Proof. Since H is a candidate of P, we have e^H = P and hence e^H 1 = P1 = 1. Let
\[
B = \sum_{k=0}^{\infty}\frac{1}{(k+1)!}H^k. \tag{29}
\]
Note that e^H − I = BH. It follows from e^H 1 = 1 that BH1 = 0. Since the Jordan canonical form of H is log D, the Jordan canonical form of B is
\[
\sum_{k=0}^{\infty}\frac{1}{(k+1)!}(\log D)^k. \tag{30}
\]
Simple calculations show that
\[
\sum_{k=0}^{\infty}\frac{1}{(k+1)!}\lambda^k =
\begin{cases}
1, & \text{if } \lambda = 0,\\[2pt]
\dfrac{e^{\lambda}-1}{\lambda}, & \text{if } \lambda \neq 0.
\end{cases} \tag{31}
\]
Thus the eigenvalues of B are all nonzero, which shows that B is invertible. Since BH1 = 0, we have H1 = 0.

Remark 2. In Definition 8, Lemma 7, and Lemma 9, the assumption "P is reversible" can be weakened to "P is diagonalizable".
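For a quick numerical sanity check of Lemmas 8 and 9, the sketch below builds the candidate H for the matrix of Example 4 (for which µ is uniform) and verifies that its rows sum to zero and that µ_i h_ij = µ_j h_ji. Since its off-diagonal entries also turn out to be nonnegative, this H is the reversible embedding of P promised by Theorem 4.

```python
import numpy as np
from scipy.linalg import expm

P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
mu = np.array([1/3, 1/3, 1/3])              # P is symmetric, so mu is uniform

eigvals, R = np.linalg.eigh(P)              # orthogonal diagonalization P = R D R^T
H = R @ np.diag(np.log(eigvals)) @ R.T      # the candidate H = T log(D) T^{-1}

print(np.allclose(H.sum(axis=1), 0))                     # Lemma 9: H 1 = 0
print(np.allclose(mu[:, None] * H, (mu[:, None] * H).T)) # Lemma 8: mu-detailed balance
print(np.all(H - np.diag(np.diag(H)) >= 0))              # off-diagonals nonnegative
print(np.allclose(expm(H), P))                           # hence H is the reversible embedding
```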
We are now in a position to prove Theorem 2.

Proof of Theorem 2. By Lemma 3, we only need to prove that if the condition (i) holds, then P is reversibly embeddable if and only if the condition (ii) holds. Assume that the condition (i) holds. Let H be the unique candidate of P. It follows from Lemma 4 that P is reversibly embeddable if and only if H is a reversible generator matrix. By Lemmas 8 and 9, H is reversible and the sum of entries in each row of H is zero. This shows that H is a reversible generator matrix if and only if the off-diagonal entries of H are all nonnegative. From (8), it is easy to see that H is a reversible generator matrix if and only if the off-diagonal entries of k_0 I + k_1 P + ... + k_{m−1} P^{m−1} are all nonnegative. Thus P is reversibly embeddable if and only if the condition (ii) holds. This completes the proof of this theorem.

Acknowledgements

The author is grateful to Prof. Y. Chen and the anonymous reviewers for their valuable comments and suggestions on the present work.

References

[1] Elfving G (1937) Zur Theorie der Markoffschen Ketten. Acta Societas Scientiarium Fennicae Nova Series A.
[2] Z Wahrsch Verw Gebiete.
[3] Proceedings of the KNAW-Series A, Mathematical Sciences.
[4] Markov Chains with Stationary Transition Probabilities (Springer-Verlag, Berlin), 2nd edition.
[5] Speakman JM (1967) Two Markov chains with a common skeleton. Z Wahrsch Verw Gebiete.
[6] J Lond Math Soc.
[7] J Lond Math Soc.
[8] Sociological Methodology 1973–1974, ed. Costner HL (Jossey-Bass, San Francisco), pp. 356–401.
[9] Singer B, Spilerman S (1976) The representation of social processes by Markov models. Am J Sociol.
[10] New York J Math.
[11] Electron J Probab.
[12] J Theor Probab.
[13] J Appl Probab.
[14] Z Wahrsch Verw Gebiete.
[15] Z Wahrsch Verw Gebiete.
[16] Math Proc Cambridge.
[17] Z Wahrsch Verw Gebiete.
[18] Math Proc Cambridge.
[19] Z Wahrsch Verw Gebiete.
[20] J Multivariate Anal.
[21] Probab Theory Relat Fields.
[22] Phys Rev E.
[23] Math Financ.
[24] J R Stat Soc B.
[25] Metzner P, Dittmer E, Jahnke T, Schütte C (2007) Generator estimation of Markov jump processes. J Comput Phys.
[26] PLoS ONE.
[27] IET Syst Biol.
[28] Commun Stat-Theor M.
[29] Markov Chains (Cambridge University Press, Cambridge).
[30] Qian H (2007) Phosphorylation energy hypothesis: open chemical systems and their biological functions. Annu Rev Phys Chem.
[31] Fundamentals of Enzyme Kinetics (Wiley-Blackwell, Weinheim), 4th edition.
[32] Sakmann B, Neher E (2009) Single-channel Recording (Springer-Verlag, New York), 2nd edition.
[33] Alberty RA (2004) Principle of detailed balance in kinetics. J Chem Educ.
[34] Nonnegative Matrices in the Mathematical Sciences (Academic Press, New York).
[35] Kelly FP (2011) Reversibility and Stochastic Networks (Cambridge University Press, Cambridge).