First Passage Moments of Finite-State Semi-Markov Processes
Richard L. Warr∗ and James D. Cordeiro∗

September 20, 2018
Abstract
In this paper, we discuss the computation of first-passage moments of a regular time-homogeneous semi-Markov process (SMP) with a finite state space to certain of its states that possess the property of universal accessibility (UA). A UA state is one which is accessible from any other state of the SMP, but which may or may not connect back to one or more other states. An important characteristic of UA is that it is the state-level version of the oft-invoked process-level property of irreducibility. We adapt existing results for irreducible SMPs to the derivation of an analytical matrix expression for the first passage moments to a single UA state of the SMP. In addition, consistent point estimators for these first passage moments, together with relevant R code, are provided.
KEY WORDS: First Passage Distributions; Markov Renewal Process; Spectral Radius; Statistical Flowgraph Model; Universally Accessible

AMS MSC 2010: 60K15; 60G10; 62M05
1 Introduction

Since the seminal works of Lévy [17, 18] and Smith [26], semi-Markov processes (SMPs) have been utilized as a framework for a wide variety of applications within the scientific literature. Much of the interest is due to the

∗ Department of Mathematics and Statistics, Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio, USA. The views expressed in this article are those of the authors and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government.
class of PH-distributions [11, 28], often used in reliability, but which also appear in the context of SMP first passage moments, as in [29]. Other areas that have seen the application of SMP models are DNA analysis [2], queueing theory [13, 21], finance [9], artificial intelligence [28], and transportation [4, 16], to name but a few.

In this article, we will show the existence of, and then derive, the moments of first passage to states of an SMP with a finite state space that have the property of being accessible from every other state. We call this property universal accessibility (UA) and note that it can be likened to a state-level version of the property of irreducibility. This comes as a consequence of the fact that, as we will later show, UA of every state is a necessary and sufficient condition for irreducibility to hold. In this sense, then, UA of a subset of states of an SMP may be considered a natural relaxation of the property of irreducibility, which has been the standard assumption in the work of all researchers dating from Pyke [22, 23] onwards. Rather than being a simple generalization, we will show here that UA is, in fact, a minimal condition required for the existence of finite moments of first passage. This demonstration requires an application of the Perron-Frobenius theorem generalized to reducible matrices (and hence reducible processes) in order to establish the existence of the required matrix inverse. For further details on the Perron-Frobenius theorem, and spectral theory in general, see [7]. Although the proof of invertibility is somewhat involved, one gains the advantage of being able to consider only the first passage moments to a given universally accessible state, thus reducing the dimensionality of the problem. In addition, the expression that we derive does not involve inverses of singular matrix terms.
Contrast this to the situation of Hunter [8] and later researchers, whose expressions for first passage moments involved a noninvertible matrix, thus requiring a generalized-inverse approach. Another significant advantage is that we are able to discard the somewhat strong assumptions of positive recurrence, and thus irreducibility, thereby enlarging the class of SMPs for which a unified analytical approach to computing the first passage moments is available.

Explicit time-domain formulas for the first two moments of the first passage distribution of an irreducible ergodic SMP with a finite state space have long been known. Pyke [23] inverted Laplace-Stieltjes transform matrices under restrictive non-singularity conditions in order to derive the first and second moments. Hunter [8] repeated this analysis by means of Markov renewal theory, and then solved for the matrix of first passage moments M of the SMP through multiplication of the matrix I − P by its generalized inverse, where P is the transition probability matrix of the embedded discrete-time Markov chain. Although the role of the fundamental matrix of the embedded DTMC in solving the problem of finding the first passage moments had been recognized since at least Kemeny and Snell [10], it was Hunter [8] who demonstrated its importance by proving that the fundamental matrix is a generalized inverse of I − P. Some years later, Yao [27] was able to use the generalized inverse to find all moments of first passage. Zhang and Hou [29] likewise employed the generalized-inverse method in order to derive exact first passage moments for SMPs with phase-type (PH-)distributed sojourn times between states, thus capitalizing on the robust interest in the reliability community in these somewhat exponential-like statistical distributions.
All of these previous investigations assumed irreducibility, and are thus useful background for, though not directly applicable to, the type of reducible process that we consider here.

The remainder of the paper proceeds as follows. In Section 2, we define the notation, terminology, and assumptions that guide the remainder of the discourse. In Section 3, we introduce the notion of universal accessibility, as well as a result that explains its relationship to irreducibility. We then present the main result in Section 4, which is the derivation of the formula for the first passage moments under the condition of universal accessibility. Finally, in Section 5, we present a method for estimating the first passage moments of SMPs and a brief example.
2 Notation, Terminology, and Assumptions

In this section we introduce the notation used in this paper. A boldface symbol without indices refers to a matrix (e.g., $\mathbf{F}(t)$ is a matrix with elements $F_{ij}(t)$ in the $i$th row and $j$th column). We will sometimes drop the function argument for simplicity's sake; e.g., $\mathbf{F} = \mathbf{F}(t)$. In the usual way, we define the Kronecker delta as
$$\delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \neq j. \end{cases}$$
Additionally, we specify that the $m$-dimensional square matrices $\mathbf{I}$ and $\mathbf{J}$ denote the identity matrix and the matrix whose entries consist of ones, respectively. Finally, the matrix binary operator '$\circ$' denotes Hadamard (element-wise) multiplication; i.e., $[\mathbf{A} \circ \mathbf{B}]_{ij} = A_{ij}B_{ij}$.

We now define a regular time-homogeneous SMP $\{Z(t) : t \geq 0\}$ with a finite state space $\mathcal{S} = \{1, 2, \ldots, m\}$. Note that the assumption of regularity implies that the process may transition only a finite number of times in a finite time interval with probability 1. Let $S_k$, $k = 0, 1, 2, \ldots$ be the transition epochs of the SMP and let $Z_k = Z(S_k)$. We define the kernel matrix $\mathbf{Q}(x) = [Q_{ij}(x)]$ of the SMP as
$$Q_{ij}(x) = P\{Z_{k+1} = j,\, S_{k+1} - S_k \leq x \mid Z_k = i\} = P\{Z_1 = j,\, S_1 \leq x \mid Z_0 = i\},$$
which gives the joint probabilities of waiting times and transitions from state $i \in \mathcal{S}$ to state $j \in \mathcal{S}$. The transition matrix of the embedded discrete-time Markov chain (DTMC) is thus given by $\mathbf{p} = \mathbf{Q}(\infty)$. In addition, we define the matrix of distribution functions $\mathbf{F}(x) = [F_{ij}(x)]$ of the sojourn times in state $i$, given that the process transitions to state $j$, as
$$F_{ij}(x) = P\{S_1 \leq x \mid Z_0 = i, Z_1 = j\}, \qquad (1)$$
with associated $r$th moments $\mathbf{e}^{(r)} = [e^{(r)}_{ij}]$, $\mathbf{e} = [e_{ij}] = \mathbf{e}^{(1)}$, $r \geq 1$. The kernel may then be written elementwise as
$$Q_{ij}(x) = p_{ij}\,F_{ij}(x), \qquad (2)$$
or, alternatively, as the Hadamard matrix product $\mathbf{Q} = \mathbf{p} \circ \mathbf{F}$.

The similarity in behavior of an SMP to a Markov chain at the transition epochs $\{S_k : k = 0, 1, 2, \ldots\}$ is due to the classification of these transitions as Markov renewal epochs.
These are times at which the process in question possesses the Markov, or memoryless, property:
$$P\{Z_{k+1} = j \mid Z_k = i, Z_{k-1}, \ldots, Z_1, Z_0\} = P\{Z_{k+1} = j \mid Z_k = i\}.$$
Define the random variable $N_k(t)$ to be the number of transitions (Markov renewals) of the SMP into state $k$ up to and including time $t \geq 0$, and let
$$\mathbf{N}(t) \equiv [N_k(t)]_{k \in \mathcal{S}}$$
be the vector consisting of the random counting variables $N_k(t)$. Also define the scalar random variable
$$N(t) \equiv \sum_{k=1}^{m} N_k(t)$$
to be the total number of transitions, or Markov renewals, of the SMP up to time $t$. We thus obtain the relationship $Z(t) = Z_{N(t)}$ between the SMP $\{Z(t) : t \geq 0\}$ and its embedded DTMC $\{Z_k : k \geq 0\}$. The vector counting process $\{\mathbf{N}(t) : t \geq 0\}$ is known as the Markov renewal process associated to the SMP $\{Z(t) : t \geq 0\}$.

The state properties of the SMP, such as irreducibility and recurrence, may be elicited from the properties of its embedded DTMC $\{Z_n : n \geq 0\}$. We say that state $j$ is accessible from state $i$ (written $i \to j$) if there is a nonzero probability that $\{Z_n\}$ may transition to state $j$ in a finite number of steps, given that it begins in state $i$. Mathematically, this means that there is some $n \in \mathbb{Z}^+$ such that $p^{(n)}_{ij} >$
0, where $p^{(n)}_{ij} = P\{Z_n = j \mid Z_0 = i\}$. The matrix $\mathbf{p}^{(n)} = [p^{(n)}_{ij}]$ is called the $n$th-step transition probability matrix. The $ij$th element of this matrix denotes the probability of the DTMC transitioning from state $i$ to state $j$ in $n$ stages and can be computed using the identity $\mathbf{p}^{(n)} = \mathbf{p}^n$. On the other hand, we say that state $j$ is not accessible from state $i$ (denoted $i \not\to j$) if $p^{(n)}_{ij} = 0$ for all $n$. There may also exist a state $0 \in \mathcal{S}$ known as an absorbing state, which is to say that, for any other state $j \in \mathcal{S}$, $0 \not\to j$. In this case, the SMP, having transitioned to state 0, sojourns for an infinite amount of time in this state. Many applications in survival and reliability analysis may be modeled using stochastic processes with one or more absorbing states. Transitioning to an absorbing state is tantamount to death or complete failure in the original process.

If $i$ and $j$ are mutually accessible (that is, $i \to j$ and $j \to i$, denoted $i \leftrightarrow j$), then they are said to communicate. Since communication fulfills the axioms of reflexivity, transitivity, and symmetry, it is an equivalence relation, and thus defines a partitioning of the state space $\mathcal{S}$ into disjoint communicating classes. If $\mathcal{S}$ is itself comprised of a single communicating class, then the SMP is called irreducible; otherwise, it is known as reducible. On the other hand, a nonnegative $m \times m$ matrix $\mathbf{A} = [a_{ij}]$ is an irreducible matrix if, for each $i$ and $j$, there exists some $0 < \eta < \infty$ such that the $ij$th element of $\mathbf{A}^{\eta}$ is greater than 0. The algebraic and probabilistic definitions of irreducibility coincide if the irreducible matrix is the transition probability matrix $\mathbf{p}$, for then the $ij$th element of $\mathbf{p}^{(\eta)} = \mathbf{p}^{\eta}$ is strictly positive if and only if $j$ is accessible from state $i$ in a finite number $\eta$ of steps with nonzero probability.
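The identity $\mathbf{p}^{(n)} = \mathbf{p}^n$ gives a direct computational test for accessibility: $i \to j$ exactly when some power of $\mathbf{p}$ up to $\mathbf{p}^m$ has a positive $(i, j)$ entry (a path that repeats no state has fewer than $m$ steps). A minimal sketch, with an illustrative two-state matrix that is not from the paper:

```python
import numpy as np

def is_accessible(p, i, j):
    """True iff p^(n)_ij > 0 for some n in 1..m, i.e. state j is
    accessible from state i in the embedded DTMC with matrix p."""
    m = p.shape[0]
    pn = np.eye(m)
    for _ in range(m):
        pn = pn @ p              # n-th step transition matrix p^(n) = p^n
        if pn[i, j] > 0:
            return True
    return False

# State 0 is absorbing, so 0 -/-> 1 while 1 -> 0.
p = np.array([[1.0, 0.0],
              [0.5, 0.5]])
print(is_accessible(p, 1, 0), is_accessible(p, 0, 1))   # True False
```

Checking only powers up to $m$ suffices, which keeps the test finite even though the definition quantifies over all $n$.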
This last statement can be made precise by reference to the digraph associated to $\mathbf{A}$, denoted $G(\mathbf{A})$. This is the digraph with vertices in the set $V(G(\mathbf{A})) = \{1, 2, \ldots, m\}$ such that the directed arc, or edge, $(i, j)$ exists if and only if $a_{ij} > 0$. $G(\mathbf{A})$ is said to be strongly connected if, for each ordered pair $i, j \in V(G(\mathbf{A}))$, there exists a (directed) path in $G(\mathbf{A})$ from $i$ to $j$. In either case of there being an edge or directed path from $i$ to $j$, the implication is clearly $i \to j$. The final connection between irreducibility and connectedness is made in the following proposition.

Proposition 2.1
Let $\mathbf{A}$ be a nonnegative square matrix. Then $\mathbf{A}$ is irreducible if and only if $G(\mathbf{A})$ is strongly connected.

Proof
See Shao [25].

We next address the first passage times of an SMP. To this end, define the random variable
$$T_j = \inf\{t \geq S_1 : Z(t) = j\}, \qquad j \in \mathcal{S},$$
which represents the time of first passage from an initial state $i$ to state $j$ if $i \neq j$, and the time of first return to $j$ otherwise. The distribution function $G_{ij}(t)$ of first passage, conditioned on being in the initial state $i \in \mathcal{S}$, is defined as
$$G_{ij}(t) = P\{T_j \leq t \mid Z(0) = i\},$$
and the corresponding $r$th moments $\mu^{(r)}_{ij}$, $r \geq$
1, if they exist, are given by
$$\mu^{(r)}_{ij} = E\left[T_j^r \mid Z(0) = i\right].$$
We thus define $\mathbf{G}(t)$ and $\boldsymbol{\mu}^{(r)} = [\mu^{(r)}_{ij}]$ to be the matrices of first passage distribution functions and moments.

As stated in Proposition 5.15 of [24, pg. 104] and Lemma 4.1 of [29], the moments of first passage for an irreducible SMP may be computed as the finite solutions to the systems of equations given by
$$\mu^{(1)}_{ij} = \sum_{k=1}^{m} p_{ik}\left[(1 - \delta_{kj})\,\mu^{(1)}_{kj} + e_{ik}\right], \qquad (3)$$
$$\mu^{(r)}_{ij} = \sum_{k=1}^{m} p_{ik}\, e^{(r)}_{ik} + \sum_{s=1}^{r} \binom{r}{s} \sum_{k \neq j} p_{ik}\, e^{(r-s)}_{ik}\, \mu^{(s)}_{kj}, \qquad r \geq 2. \qquad (4)$$
Clearly, a necessary condition for $\mu^{(r)}_{ij} < \infty$ is that $i \to j$, which is certainly true if the SMP is irreducible. In contrast, we observe that $G_{ij}(\infty) < 1$ (and hence $\mu_{ij} = \infty$) might occur for a pair of states $i, j \in \mathcal{S}$ if $i \not\to j$. As we will later show, (3) and (4) still hold under the somewhat weakened assumption of universal accessibility for the terminal state $j$.

The recurrence properties of an SMP may be explained in terms of the distribution of the first passage of the SMP from a given state $i \in \mathcal{S}$ back to itself, otherwise known as the time of (first) return to $i$. The crucial step is to define
$$f_{ii} = P\{N(T_i) < \infty \mid Z_0 = i\},$$
which is the probability that the number of steps required for the embedded DTMC $\{Z_n : n \geq 0\}$ to return to state $i$ is finite. If $f_{ii} <$
1, then the state $i \in \mathcal{S}$ is called transient; otherwise, it is known as recurrent. If, in addition to recurrence, we have $\mu_{ii} < \infty$, then the state is called positive recurrent. The SMP itself is deemed recurrent, transient, or positive recurrent as a process if the corresponding condition holds for every state $i \in \mathcal{S}$. For an irreducible SMP with a finite state space, it is well known that the process is automatically positive recurrent. This is not true, in general, for a reducible process, but may be evaluated on a state-by-state basis.

The Perron-Frobenius theorem adapted to finite-dimensional irreducible nonnegative matrices is very useful for characterizing the set of eigenvalues of such matrices. As we will see later, the theory may be (indirectly) extended even to reducible nonnegative matrices by leveraging their distinctive canonical form. Let $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ for some positive integer $m$. We define the spectrum of $\mathbf{A}$, denoted $\sigma_{\mathbf{A}}$, to be the set of its eigenvalues. Its spectral radius, denoted $\rho(\mathbf{A})$, is given by
$$\rho(\mathbf{A}) = \max\{|\lambda| : \lambda \in \sigma_{\mathbf{A}}\} \in \mathbb{R}_+,$$
which indicates the maximum radius of the disc that contains $\sigma_{\mathbf{A}}$ in the complex plane. Of particular interest is the case of a finite-dimensional stochastic matrix $\mathbf{A}$, which is a nonnegative square matrix such that $\mathbf{A}\mathbf{1} = \mathbf{1}$, where $\mathbf{1}$ is a column vector of ones. Perron-Frobenius theory, via Proposition 2.4 for the reducible case, implies that the spectral radius is likewise an eigenvalue of $\mathbf{A}$, denoted the Perron root of $\mathbf{A}$. Stochastic matrices comprise the boundary of the unit ball $\mathcal{A} = \{\mathbf{A} \in \mathbb{R}^{m \times m}_+ : \|\mathbf{A}\|_\infty \leq 1\}$ of finite-dimensional nonnegative matrices in the normed linear space induced by the infinity norm $\|\cdot\|_\infty$, which is given by the maximum absolute row sum of $\mathbf{A} = [a_{ij}]$, or
$$\|\mathbf{A}\|_\infty = \max_i \sum_{j=1}^{m} |a_{ij}| = \max(\mathbf{A}\mathbf{1}).$$
As the next proposition will show, we may classify certain elements of $\mathcal{A}$ with spectral radius $\rho(\mathbf{A}) < 1$ as substochastic, which is to say that $\min(\mathbf{A}\mathbf{1}) < 1$.

Proposition 2.2
Suppose that $\mathbf{A} \in \mathcal{A}$. If $\rho(\mathbf{A}) < 1$, then $\mathbf{A}$ is substochastic.

Proof
Clearly, since $\mathbf{A} \in \mathcal{A}$, it must be either stochastic or substochastic. Therefore the only thing that must be proved is that $\mathbf{A}$ is not stochastic. Assume $\mathbf{A}$ is stochastic; i.e., $\mathbf{A}\mathbf{1} = \mathbf{1}$. This implies that 1 is an eigenvalue, which contradicts $\rho(\mathbf{A}) <$
1. Therefore, $\mathbf{A}$ must be substochastic.

Conversely, for an irreducible nonnegative matrix $\mathbf{A}$, it is, in fact, sufficient for $\mathbf{A}$ to be substochastic in order for its spectral radius to be strictly less than unity, as the next proposition shows.

Proposition 2.3 If $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ is an irreducible substochastic matrix, then $\rho(\mathbf{A}) < 1$.

Proof
See Theorem 7 in [14].

For such reasons, among others, it is very convenient to work with irreducible processes. Results for irreducible matrices (processes) may still be applied to the reducible case via an important consequence of the mathematical notion of reducibility. From the definition given above for a reducible matrix $\mathbf{A}$, it can be shown that there exists a permutation matrix $\mathbf{P}$ such that $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$ is in upper block triangular form as follows:
$$\mathbf{A} \sim \mathbf{P}\mathbf{A}\mathbf{P}^{-1} = \begin{bmatrix}
\mathbf{A}_{11} & \mathbf{A}_{12} & \cdots & \mathbf{A}_{1r} & \mathbf{A}_{1,r+1} & \mathbf{A}_{1,r+2} & \cdots & \mathbf{A}_{1M} \\
\mathbf{0} & \mathbf{A}_{22} & \cdots & \mathbf{A}_{2r} & \mathbf{A}_{2,r+1} & \mathbf{A}_{2,r+2} & \cdots & \mathbf{A}_{2M} \\
\vdots & & \ddots & \vdots & \vdots & & & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{rr} & \mathbf{A}_{r,r+1} & \mathbf{A}_{r,r+2} & \cdots & \mathbf{A}_{rM} \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{A}_{r+1,r+1} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} & \mathbf{A}_{r+2,r+2} & \cdots & \mathbf{0} \\
\vdots & & & \vdots & \vdots & & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{MM}
\end{bmatrix} \qquad (5)$$
This is known as the canonical form for a reducible matrix. The canonical form is not unique, meaning that there may be two or more permutation matrices $\mathbf{P}$ such that $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$ is in canonical form. Its utility derives from several highly useful properties, which we now discuss. First, every block matrix on the diagonal of the canonical matrix, $\mathbf{A}_{\nu\nu}$, $\nu \in \{1, \ldots, M\}$, is either the $1 \times 1$ zero matrix $[0]$ or is irreducible. Moreover, the eigenvalues of $\mathbf{A}$ are invariant under the permutation transformation $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$. From a stochastic process perspective, we observe that transforming a reducible stochastic matrix $\mathbf{p}$ into its canonical form is equivalent to relabeling the state space of an associated (reducible) DTMC. The states are organized in the canonical transition matrix in such a way that, for some $r \in \mathbb{Z}^+$, the transient-state transitions are represented within the diagonal blocks $\mathbf{A}_{\nu\nu}$ for $1 \leq \nu \leq r$, while the blocks $\mathbf{A}_{\nu\nu}$ for $r + 1 \leq \nu \leq M$ represent recurrent-state transitions. Additionally, as shown in Equation 8.4.7 of [20], $\rho(\mathbf{A}_{\nu\nu}) < 1$ for $1 \leq \nu \leq r$, which, by Proposition 2.2, further shows that the first $r$ diagonal sub-blocks of $\mathbf{A}$ are substochastic. The following proposition rounds out this list of useful properties by relating the spectral radius of the sub-blocks of the canonical matrix to that of the entire matrix.

Proposition 2.4
Suppose $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ is a reducible matrix in canonical form. Then $\rho(\mathbf{A}) = \max_\nu \rho(\mathbf{A}_{\nu\nu})$ for $1 \leq \nu \leq M$.

Proof
See Lemma 1 in [12, pg. 303], with an additional induction argument to obtain the result, or the argument in [7, pg. 115].

Table 1 summarizes the important notation that we will use throughout this paper.
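The spectral-radius facts above are easy to verify numerically. The sketch below (all matrix entries are illustrative, not taken from the paper) assembles a reducible nonnegative matrix in the canonical form (5) from an irreducible substochastic block and an irreducible stochastic block, then checks Propositions 2.3 and 2.4:

```python
import numpy as np

def rho(M):
    """Spectral radius: the largest modulus of an eigenvalue."""
    return max(abs(np.linalg.eigvals(M)))

# Irreducible substochastic block (row sums <= 1, at least one < 1)
# and an irreducible stochastic block; entries are illustrative.
A11 = np.array([[0.0, 0.9],
                [0.6, 0.3]])
A22 = np.array([[0.5, 0.5],
                [0.5, 0.5]])
A12 = np.array([[0.1, 0.0],
                [0.0, 0.1]])

# Reducible matrix in canonical (block upper-triangular) form.
A = np.block([[A11, A12],
              [np.zeros((2, 2)), A22]])

print(rho(A11) < 1)                                   # Proposition 2.3: True
print(np.isclose(rho(A), max(rho(A11), rho(A22))))    # Proposition 2.4: True
```

Because the canonical form is block upper triangular, the eigenvalues of the whole matrix are exactly the eigenvalues of its diagonal blocks, which is the computational content of Proposition 2.4.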
3 Universal Accessibility

In this section, we introduce the property of universal accessibility of a state $j \in \mathcal{S}$. As we will later demonstrate, universal accessibility is a sufficient condition for the existence of a well-defined first-passage moment to a given state of the SMP.

Definition
Let $P$ be a stochastic process with state space $\mathcal{S}$. State $j \in \mathcal{S}$ is said to be universally accessible (UA) if, for every state $i \in \mathcal{S}$, we have $i \to j$.

If the set of UA states is a proper subset of $\mathcal{S}$, then it is clear that the SMP is reducible. On the other hand, if every state is UA, then all states must communicate, as the next proposition asserts.

Table 1: List of important symbols and notation.

$m$: the number of states in the SMP
$x$: the sojourn time in a state
$t$: calendar time, or the time since the process began
$p_{ij}$: the probability that the next state in the process is $j$, given that the process entered state $i$
$F_{ij}(x)$: the CDF of the waiting-time distribution in state $i$, given that the next transition is to state $j$
$G_{ij}(t)$: the CDF of the first passage distribution from state $i$ to state $j$
$\mathbf{I}$: the $m \times m$ identity matrix
$\mathbf{I}^{(-j)}$: the $m \times m$ identity matrix with the $j$th row and column set to 0
$\mathbf{J}$: the $m \times m$ matrix of all 1s
$\mathbf{A}_j$: the $j$th column of matrix $\mathbf{A}$
$\mathbf{A} \circ \mathbf{B}$: the element-wise product of two matrices
$e^{(r)}_{ij}$: $\int x^r \, dF_{ij}$, with $e_{ij} \equiv e^{(1)}_{ij}$
$\mu^{(r)}_{ij}$: $\int x^r \, dG_{ij}$, with $\mu_{ij} \equiv \mu^{(1)}_{ij}$
$\rho(\mathbf{A})$: the spectral radius of the matrix $\mathbf{A}$
$\|\mathbf{A}\|_\infty$: the infinity norm of a matrix

Proposition 3.1
An SMP $\{Z(t) : t \geq 0\}$ with state space $\mathcal{S}$ is irreducible if and only if every $j \in \mathcal{S}$ is UA.

The property of a state being UA is, in a sense, the minimal requirement for the existence of all first-passage moments. In the next section, we demonstrate the sufficiency of this condition by means of the Perron-Frobenius theorem applied to the canonical form of the reducible transition probability matrix of the embedded DTMC.
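Proposition 3.1 can be checked computationally on small examples: in an irreducible chain every state is UA, while making one state absorbing destroys universal accessibility for every other state. A minimal sketch, with illustrative transition matrices of our own choosing:

```python
import numpy as np

def reachability(p):
    """Boolean matrix whose (i, j) entry is True iff i -> j within m steps."""
    m = p.shape[0]
    reach = np.zeros((m, m), dtype=bool)
    pn = np.eye(m)
    for _ in range(m):
        pn = pn @ p
        reach |= pn > 0
    return reach

def is_UA(p, j):
    """State j is universally accessible iff i -> j for every i != j."""
    r = reachability(p)
    return all(r[i, j] for i in range(p.shape[0]) if i != j)

def is_irreducible(p):
    """Irreducible iff every ordered pair of states communicates."""
    return bool(reachability(p).all())

# Irreducible cycle 1 -> 2 -> 3 -> 1 (0-indexed below): every state is UA.
p_irr = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])
print(is_irreducible(p_irr), all(is_UA(p_irr, j) for j in range(3)))   # True True

# Make the third state absorbing: reducible, and only that state is UA.
p_red = p_irr.copy()
p_red[2] = [0.0, 0.0, 1.0]
print(is_irreducible(p_red), [is_UA(p_red, j) for j in range(3)])
```

The second chain illustrates the point of the paper: it is reducible, yet its single UA state still admits finite first passage moments from every other state.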
4 First Passage Moments

In this section, we derive a formula for determining the first and higher moments of first passage times in reducible
SMPs to special states $j$ that are UA. We begin with a technical result that will be needed in the proof of Theorem 4.2 to demonstrate that the matrix formula for the moments of first passage to a UA state $j \in \mathcal{S}$ is well-defined. For notational convenience, define $\mathbf{I}^{(-j)}$ to be the identity matrix with the $j$th diagonal element set to zero.

Lemma 4.1
Let $\{Z(t) : t \geq 0\}$ be an SMP with finite state space $\mathcal{S}$ and embedded DTMC at transition epochs with transition probabilities contained within the (stochastic) matrix $\mathbf{p}$. Then the matrix $[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]$ is nonsingular if and only if state $j \in \mathcal{S}$ is universally accessible (UA).

Proof
We begin with the observation that, since $\mathbf{A} = [\mathbf{A}_{\nu\kappa}] = \mathbf{p}\mathbf{I}^{(-j)}$ is formed by setting each element of the $j$th column of $\mathbf{p}$ to 0, we essentially remove all directed arcs $(i, j)$ in the digraph $G(\mathbf{p})$ for each $i \in V(G(\mathbf{p}))$ in order to produce $G(\mathbf{A})$. This means that $G(\mathbf{A})$ cannot be strongly connected, and thus $\mathbf{A}$ must be reducible. We may therefore assume that $\mathbf{A}$ is in canonical form (5). Furthermore, because the $j$th column is zero, we will assume without loss of generality that the canonical form of $\mathbf{A}$ corresponds to the particular ordering of the states in $\mathcal{S}$ in which state $j$ is re-designated as state 1. We impose the same permutation and partitioning on $\mathbf{p} = [\mathbf{p}_{\nu\kappa}]$, so that
$$\mathbf{A}_{\nu\kappa} = \begin{cases} \mathbf{p}_{\nu\kappa} & \text{if } (\nu, \kappa) \in \{1, \ldots, M\} \times \{2, 3, \ldots, M\}, \\ \mathbf{0} & \text{if } (\nu, \kappa) \in \{1, \ldots, M\} \times \{1\}, \end{cases} \qquad (6)$$
where, as in (5), $M$ is the block dimension of $\mathbf{A}$. Notice that since $\mathbf{p}$ may be irreducible, the above does not necessarily imply that $\mathbf{p}$ can be put in canonical form, but rather that $\mathbf{p}$ is element-wise equivalent to $\mathbf{A}$, save for the first column, which, unlike that of $\mathbf{A}$, may contain positive entries. Stated succinctly, we have that
$$\mathbf{0} = \mathbf{A}_{\nu 1} \leq \mathbf{p}_{\nu 1}, \qquad \nu = 1, \ldots, M.$$
Assume that $[\mathbf{I} - \mathbf{A}]$ is nonsingular, which directly implies that $1 \notin \sigma_{\mathbf{A}}$; that is, 1 is not an eigenvalue of $\mathbf{A}$. Since $\mathbf{p}$ is a row-stochastic matrix, and because of the equivalence given in (6), the Gerschgorin Circle Theorem (see [20, Eqn. 7.1.13]) indicates that the spectral radius $\delta = \rho(\mathbf{A}) \leq 1$. The nonnegativity of $\mathbf{A}$ permits the use of Equation 8.3.1 of [20] to then assert that the Perron root satisfies $0 \leq \delta \leq 1$. Since $1 \notin \sigma_{\mathbf{A}}$, it must then be the case that $\delta <$
1. This implies, by Proposition 2.4, that $\rho(\mathbf{A}_{\nu\nu}) < 1$ for every $\nu \in \{1, \ldots, M\}$, and hence, by Proposition 2.2, each diagonal block $\mathbf{A}_{\nu\nu}$, $\nu \in \{1, \ldots, M\}$, must be substochastic.

We now consider the $\nu$th diagonal block in the canonical form of $\mathbf{A}$, where $\nu \in \{2, \ldots, M\}$, and proceed to show that each state $i$ associated to the vertex set $V(G(\mathbf{A}_{\nu\nu}))$ can access state 1. Because $\mathbf{p}$ is a row-stochastic matrix and $\mathbf{A}_{\nu\nu}$ is substochastic, either or both of the following may hold:

1. $\mathbf{p}_{\nu 1} \neq \mathbf{0}$, or
2. $\mathbf{A}_{\nu\kappa} \neq \mathbf{0}$ for some $\kappa > \nu$.

For 1), $\mathbf{p}_{\nu 1} \neq \mathbf{0}$ indicates the existence of states $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$ (with $i_\nu = i$ possible, but not necessary) and $1 \in V(G(\mathbf{A}))$ for which there is a directed arc $(i_\nu, 1)$ in $G(\mathbf{p})$, while the irreducibility of $\mathbf{A}_{\nu\nu}$ gives a directed path from $i$ to $i_\nu$. We thus obtain $i \to i_\nu \to 1$. In other words, there is a directed path from $i$ to 1.

If 2) holds, there exists a directed arc from some state $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$ (again, with the possibility that $i_\nu = i$) to a state $i_\kappa \in V(G(\mathbf{A}_{\kappa\kappa}))$. From here, we are again confronted with choices 1) and 2). If 1) holds, then the previous argument gives us a directed path from $i_\kappa$ to 1. Since the irreducibility of $\mathbf{A}_{\nu\nu}$ implies the existence of a path from $i$ to $i_\nu$, we have the accessibility chain $i \to i_\nu \to i_\kappa \to 1$, and we are done. Otherwise, we proceed to the next diagonal block following $\mathbf{A}_{\kappa\kappa}$ and continue until $\nu > r$. If $\nu > r$, then the process is in a state $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$. The only choice here, due to this block being substochastic, is 1); that is, $\mathbf{p}_{\nu 1} \neq \mathbf{0}$, for which we have already demonstrated the existence of the connection $i_\nu \to$
1. Each of the preceding paths may then be combined to form a single directed path from an arbitrarily selected $i \in V(G(\mathbf{A}_{\nu\nu}))$ to 1, so that $i \to i_\nu \to i_\kappa \to \cdots \to i_M \to 1$. Thus, state 1 is UA.

For the reverse implication, we will assume that state 1 is UA, and then proceed to show that $[\mathbf{I} - \mathbf{A}]$ is nonsingular. The reducibility of $\mathbf{A}$ allows us to assume that it possesses canonical form and, furthermore, that each submatrix on the diagonal of the canonical matrix corresponding to $\mathbf{A}$ is irreducible or zero. Consider an arbitrary nonzero, and hence irreducible, diagonal submatrix $\mathbf{A}_{\nu\nu}$ for some $\nu \in \{2, \ldots, M\}$ (recall that $\mathbf{A}_{11} = [0]$ by definition of $\mathbf{A}$). By the assumption that state 1 is UA, there must be a directed path from each state in the vertex set $V(G(\mathbf{A}_{\nu\nu}))$ to 1, which in turn implies that $\mathbf{A}_{\nu\nu}$ is substochastic. By Proposition 2.3, $\rho(\mathbf{A}_{\nu\nu}) < 1$, and therefore, by Proposition 2.4, $\rho(\mathbf{A}) <$
1. Hence, $[\mathbf{I} - \mathbf{A}]$ is nonsingular, which completes the proof.

For the following main result, we will show, using Lemma 4.1, that state $j$ being UA is sufficient to derive a closed-form analytical expression for the $r$th first passage moments $\boldsymbol{\mu}^{(r)} = [\mu^{(r)}_{ij}]$, for $r \geq 1$ and $i \in \mathcal{S}$.

Theorem 4.2
Let $\{Z(t) : t \geq 0\}$ be a regular, time-homogeneous SMP with a finite state space $\mathcal{S}$. Further suppose that $j \in \mathcal{S}$ is UA. Then the $r$th moments of the first passage times from all states $i \in \mathcal{S}$ to state $j$, contained in the $m$-vector ($m = |\mathcal{S}|$) $\boldsymbol{\mu}^{(r)}_j = [\mu^{(r)}_{ij}]_{i=1}^m$, $r \geq 1$, are solutions to the system of equations given by
$$\boldsymbol{\mu}_j \equiv \boldsymbol{\mu}^{(1)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}, \qquad (7)$$
$$\boldsymbol{\mu}^{(r)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} \left[ \left(\mathbf{p} \circ \mathbf{e}^{(r)}\right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left(\mathbf{p} \circ \mathbf{e}^{(r-s)}\right) \left( (\mathbf{J} - \mathbf{I})_j \circ \boldsymbol{\mu}^{(s)}_j \right) \right], \quad r \geq 2, \qquad (8)$$
where $\mathbf{1}$ is a column vector of ones and the scalar entries $\mu^{(1)}_{ij}$ and $\mu^{(r)}_{ij}$ for $r \geq 2$ are defined as in (3) and (4), respectively.

Proof We first show, using induction on the moment order $r \geq$
1, that the system of equations (3) and (4) gives a valid relationship between the first-passage moments to a given state $j$ that is UA. For the mean time of first passage given by the system (3), we observe the following at the first transition epoch $S_1$ of the SMP:

1. $i \not\to k$ at $S_1$ $\Rightarrow$ the corresponding $k$th term drops out of the expression, and
2. $i \to k$ at $S_1$ $\Rightarrow$ $e_{ik}$ and $\mu_{kj}$ are well-defined, the latter because $j$ is UA.

We thus conclude that a first-step analysis founded upon the state of the SMP at the first transition epoch $S_1$ (cf. Proposition 5.15 of [24, pg. 104]) still holds for a terminal UA state $j$. Next, for the induction step, we consider expression (4) for the $(r+1)$th moment, where $r \geq$
1. We likewise claim that the original renewal argument given in Lemma 4.1 of [29] for the derivation of (4) for the $r$th moments of first passage remains valid. In order to see this, we rewrite, for $i \in \mathcal{S}$, expression (4) as
$$\mu^{(r+1)}_{ij} = \sum_{k=1}^{m} p_{ik} \left[ (1 - \delta_{kj})\,\mu^{(r+1)}_{kj} + e^{(r+1)}_{ik} \right] + M_r, \qquad (9)$$
where
$$M_r = \sum_{s=1}^{r} \binom{r+1}{s} \sum_{k \neq j} p_{ik}\, e^{(r+1-s)}_{ik}\, \mu^{(s)}_{kj}.$$
The inductive hypothesis and items 1) and 2) above guarantee that $M_r$ is well-defined, while the remainder of (9) is in exactly the same form as (3), which has just been shown to have a finite solution via the base step.

Thus, for arbitrary $i \neq j$, where $i \in \mathcal{S}$, we may transform (3) into the equivalent matrix expression
$$\boldsymbol{\mu} = [\mu_{ij}] = \mathbf{p}\left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right) + (\mathbf{p} \circ \mathbf{e})\,\mathbf{J}.$$
In this form we are not able to solve directly for $\boldsymbol{\mu}$, but, under the assumption that $j$ is a specific UA state in $\mathcal{S}$, it is possible to solve for the $j$th column of $\boldsymbol{\mu}$, which we denote by $\boldsymbol{\mu}_j$. We then obtain
$$\boldsymbol{\mu}_j = \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j + (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
Next, we isolate $(\mathbf{p} \circ \mathbf{e})\,\mathbf{1}$, so that
$$\boldsymbol{\mu}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j = (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
Observing that $\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j = \mathbf{I}^{(-j)}\boldsymbol{\mu}_j$ and factoring out $\boldsymbol{\mu}_j$ gives
$$[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]\,\boldsymbol{\mu}_j = (\mathbf{p} \circ \mathbf{e})\,\mathbf{1},$$
which allows us to finally solve for $\boldsymbol{\mu}_j$ as
$$\boldsymbol{\mu}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
By Lemma 4.1, the matrix $\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}$ is nonsingular. This proves that (7) is, indeed, well-defined.

A general formula for the $r$th moment, where $r \geq$
2, is given in Lemma 4.1 of [29] as
$$\mu^{(r)}_{ij} = \sum_{k=1}^{m} p_{ik}\, e^{(r)}_{ik} + \sum_{s=1}^{r} \binom{r}{s} \sum_{k \neq j} p_{ik}\, e^{(r-s)}_{ik}\, \mu^{(s)}_{kj},$$
which is expressed in matrix notation as
$$\boldsymbol{\mu}^{(r)} = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{J} + \sum_{s=1}^{r} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right) \right].$$
Solving for the $j$th column gives
$$\boldsymbol{\mu}^{(r)}_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
Using $\mathbf{e}^{(0)} = \mathbf{J}$ (the identity under the Hadamard product), we extract the $s = r$ term of the summation to obtain
$$\boldsymbol{\mu}^{(r)}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(r)} \right]_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
We further observe that $\boldsymbol{\mu}^{(r)}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(r)} \right]_j = (\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)})\,\boldsymbol{\mu}^{(r)}_j$, which gives
$$(\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)})\,\boldsymbol{\mu}^{(r)}_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
We then solve for $\boldsymbol{\mu}^{(r)}_j$ to obtain
$$\boldsymbol{\mu}^{(r)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I})_j \circ \boldsymbol{\mu}^{(s)}_j \right) \right].$$
As argued in the proof of formula (7), the inverse $[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1}$ exists. Hence, (8) is likewise well-defined.

We next investigate some statistical aspects of using Theorem 4.2 to estimate the first passage moments to universally accessible states of an SMP.

5 Estimation of First Passage Moments

In this section we derive consistent estimators for first passage moments in SMPs. Since the SMP $\{Z(t) : t \geq 0\}$ is time-homogeneous, we assume without loss of generality that $Z(0) = i \in \mathcal{S}$. If we observe the SMP for a period of time $T >$
0, then, for any $j \in \mathcal{S}$, we may define the point estimators $\hat{p}_{ij}$ for the transition probability and $\hat{e}^{(r)}_{ij}$ for the $r$th moment of the sojourn time of the SMP as it transitions from $i$ to $j$ as
$$\hat{p}_{ij} \equiv \frac{n_{ij}}{\sum_{k \in \mathcal{S}} n_{ik}}, \qquad \hat{e}^{(r)}_{ij} \equiv \frac{1}{n_{ij}} \sum_{K=1}^{n_{ij}} x^r_{ijK}, \qquad i, j \in \mathcal{S},$$
where
$$n_{ij} \equiv \text{the number of observed transitions from state } i \text{ to state } j \text{ by time } T,$$
$$x_{ijK} \equiv \text{the } K\text{th observed sojourn time from state } i \text{ to state } j \text{ by time } T.$$
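As a concrete illustration, $\hat{p}_{ij}$ and $\hat{e}_{ij}$ can be computed directly from logged transition data. The observation set below is hypothetical and serves only to show the bookkeeping:

```python
import numpy as np

# Hypothetical observed sojourn times, keyed by (i, j) transition; these
# play the role of the x_ijK above.
obs = {(0, 1): [1.2, 0.8, 1.1],
       (0, 2): [2.5],
       (1, 2): [0.5, 0.7]}
m = 3

n = np.zeros((m, m))                  # n_ij: observed transition counts
for (i, j), times in obs.items():
    n[i, j] = len(times)

# p_hat_ij = n_ij / sum_k n_ik (rows never observed are left as zeros).
row_tot = n.sum(axis=1, keepdims=True)
p_hat = np.divide(n, row_tot, out=np.zeros_like(n), where=row_tot > 0)

# e_hat_ij: sample mean of the observed sojourn times (the r = 1 case).
e_hat = np.zeros((m, m))
for (i, j), times in obs.items():
    e_hat[i, j] = np.mean(times)

print(p_hat[0])    # estimates 0, 0.75, 0.25 out of state 0
```

Higher-order moment estimates $\hat{e}^{(r)}_{ij}$ follow the same pattern with `np.mean(np.array(times) ** r)`.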
We further assume that $T$ is large enough so that at least one transition from $i$ to $j$ has been observed; in other words, $n_{ij} \geq$
1. In order to make inferential hypotheses using these estimators, it is useful to first show that they are consistent. A point estimator $\hat{\theta}_n$ is said to be consistent if it converges in probability to the true population parameter $\theta$ as the sample size $n$ increases; that is, for each $\epsilon >$
0,
$$\lim_{n \to \infty} P\left( |\hat{\theta}_n - \theta| < \epsilon \right) = 1.$$
This condition is written in shorthand as $\hat{\theta}_n \overset{P}{\to} \theta$. We now show that this condition holds for the matrix estimators $\hat{\mathbf{p}} \equiv [\hat{p}_{ij}]$ and $\hat{\mathbf{e}} \equiv [\hat{e}_{ij}]$.

Lemma 5.1 The matrix estimators $\hat{\mathbf{p}}$ and $\hat{\mathbf{e}}$ are consistent.

Proof
Let $\{X_n\}$ be a sequence of Bernoulli random variables such that $X_n = 1$ when a transition from $i$ to $j$ occurs at the $n$th transition, and $X_n = 0$ otherwise. Accordingly, if $N > 0$ transitions are observed during $(0, T]$, then the estimated probability of transition from $i$ to $j$ becomes
$$\hat{p}_{ij} = \frac{1}{N} \sum_{n=1}^{N} X_n,$$
with the equivalences
$$N = \sum_{k \in \mathcal{S}} n_{ik}, \qquad n_{ij} = \sum_{n=1}^{N} X_n.$$
The Markov property at transitions of the embedded DTMC of the SMP implies that the $X_n$ are independent and identically distributed (i.i.d.) random variables. Hence, by the Weak Law of Large Numbers (see Theorem 5.5.2 of [6, p. 232]), we have $\hat{p}_{ij} \overset{P}{\to} p_{ij}$, which demonstrates consistency. Likewise, we see that $x_{ijK}$ is independent of $x_{ijK'}$ so long as $K \neq K'$. Thus, the collection $\{x_{ijK}\}_{K=1}^{n_{ij}}$ is i.i.d. By the same reasoning as above, we obtain the convergence in probability $\hat{e}_{ij} \overset{P}{\to} e_{ij}$. Hence, the $\hat{e}_{ij}$ are consistent.

We are now in a position to define the estimators of the $r$th moments of first passage from state $i$ to state $j \in \mathcal{S}$. By replacing $\mathbf{p}$ and $\mathbf{e}$ with the matrix estimators $\hat{\mathbf{p}}$ and $\hat{\mathbf{e}}$, respectively, in formulas (7) and (8), we obtain the natural estimators
$$\hat{\boldsymbol{\mu}}_j \equiv [\mathbf{I} - \hat{\mathbf{p}}\mathbf{I}^{(-j)}]^{-1} (\hat{\mathbf{p}} \circ \hat{\mathbf{e}})\,\mathbf{1}, \qquad (10)$$
$$\hat{\boldsymbol{\mu}}^{(r)}_j \equiv [\mathbf{I} - \hat{\mathbf{p}}\mathbf{I}^{(-j)}]^{-1} \left[ \left( \hat{\mathbf{p}} \circ \hat{\mathbf{e}}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left( \hat{\mathbf{p}} \circ \hat{\mathbf{e}}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I})_j \circ \hat{\boldsymbol{\mu}}^{(s)}_j \right) \right], \quad r \geq 2. \qquad (11)$$
As expected, estimators (10) and (11) are also consistent.

Lemma 5.2 For a state $j \in \mathcal{S}$ that is UA with respect to the digraph $G(\hat{\mathbf{p}})$, the estimators $\hat{\boldsymbol{\mu}}^{(r)}_j$, $r \geq 1$, are consistent.

Proof
By Theorem 2.1.4 in [15, pg. 51], a continuous function of consistent estimators is itself a consistent estimator. By Lemma 4.1, and the assumption that $j$ is UA with respect to $G(\hat{p})$ (i.e., every state in $S$ is linked to the state $j$ in the digraph of $\hat{p}$), we may assert the existence of $[I - \hat{p}\,I(-j)]^{-1}$. The remainder of the terms in (10) and (11) are linear, and hence continuous. We thus conclude that the $\hat{\mu}_j^{(r)}$, for moments $r \geq 1$ and states $j \in S$, are consistent estimators.

In this section we have proposed consistent estimators for the first passage moments of an SMP. These estimates can be obtained if sufficient data are collected from observing the process.

We now give an example of an SMP and show how the first passage moments can be estimated. The process depicted in Figure 1 has three transition distributions and a probability $p$. We will calculate the first passage moments using the direct transition moments $e$.

[Figure 1: An example SMP of a medical patient, with states Good Health, Diseased, and Death. Good Health transitions to Diseased with probability 1; Diseased returns to Good Health with probability $p$ or moves to Death with probability $1 - p$; each arc carries its own holding-time distribution $F(x)$.]

To begin, we have
$$p = \begin{pmatrix} 0 & 1 & 0 \\ p & 0 & 1-p \\ 0 & 0 & 1 \end{pmatrix} \qquad \text{and} \qquad e = \begin{pmatrix} 0 & e_{12} & 0 \\ e_{21} & 0 & e_{23} \\ 0 & 0 & e_{33} \end{pmatrix}.$$
Taking $j = 3$ (Death) as the target state,
$$\mu_3 = \begin{pmatrix} 1 & -1 & 0 \\ -p & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} e_{12} \\ p\,e_{21} + (1-p)\,e_{23} \\ e_{33} \end{pmatrix} = \frac{1}{1-p} \begin{pmatrix} 1 & 1 & 0 \\ p & 1 & 0 \\ 0 & 0 & 1-p \end{pmatrix} \begin{pmatrix} e_{12} \\ p\,e_{21} + (1-p)\,e_{23} \\ e_{33} \end{pmatrix} = \frac{1}{1-p} \begin{pmatrix} e_{12} + p\,e_{21} + (1-p)\,e_{23} \\ p\,e_{12} + p\,e_{21} + (1-p)\,e_{23} \\ (1-p)\,e_{33} \end{pmatrix}. \tag{12}$$
Looking closely at the values in Equation (12), we see that they are logical: as $p$ gets small, $\mu_{13} \to e_{12} + e_{23}$ and $\mu_{23} \to e_{23}$. This simple example demonstrates the theory discussed earlier; even for large systems, finding the first passage moments is constrained only by the computational burden of computing the inverse of $I - p\,I(-j)$. If numerical values are substituted, then numerical computer programs can handle these types of problems with relative ease.

Now suppose we have
$$p = \begin{pmatrix} 0 & 1 & 0 \\ 0.8 & 0 & 0.2 \\ 0 & 0 & 1 \end{pmatrix} \qquad \text{and} \qquad e = \begin{pmatrix} 0 & 6 & 0 \\ 0.7 & 0 & 1.1 \\ 0 & 0 & 0 \end{pmatrix}.$$
We get the following result:
$$\mu_3 = \begin{pmatrix} 33.9 \\ 27.9 \\ 0 \end{pmatrix}.$$
The R code for this example is included in the appendix. The methods presented in this paper provide a fairly comprehensive approach to determining the first passage moments of an SMP.
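As a cross-check of this example (and of the R code in the appendix), the same computation can be sketched in Python with NumPy; this is an illustration in a different language from the paper's R implementation, with state 3 (Death) given zero-based index 2 and `np.linalg.solve` used in place of an explicit inverse:

```python
import numpy as np

# Embedded-chain transition matrix p and first-moment matrix e from the
# example; the target UA state is state 3 (zero-based index j = 2).
p = np.array([[0.0, 1.0, 0.0],
              [0.8, 0.0, 0.2],
              [0.0, 0.0, 1.0]])
e = np.array([[0.0, 6.0, 0.0],
              [0.7, 0.0, 1.1],
              [0.0, 0.0, 0.0]])
j = 2

n = p.shape[0]
I = np.eye(n)
Id = I.copy()
Id[:, j] = 0.0                       # I(-j): identity with column j zeroed

# Equation (10): mu_j = [I - p I(-j)]^{-1} (p o e) J_j, where "o" is the
# elementwise (Hadamard) product and J_j is a column vector of ones.
mu = np.linalg.solve(I - p @ Id, (p * e) @ np.ones(n))
print(mu)                            # approximately (33.9, 27.9, 0)
```

The zero in the last component reflects the convention $e_{33} = 0$ for the absorbing Death state in the numerical example.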
In this paper we devised an exact time-domain approach to derive the moments $\mu_{ij}$ of first passage time distributions in irreducible or reducible SMPs, given that the terminal state $j$ fulfills the condition of universal accessibility. Beyond the expanded generality of this method, it also has the advantage of obtaining the first passage moments to a single UA state $j$, rather than to all states, thereby reducing the computational load, particularly for large SMPs. We have also demonstrated the existence of consistent point estimators for the first passage moments of processes that may be modeled as SMPs.

Acknowledgments

The authors thank Brian McBee and Dustin Mixon, who have provided suggestions and advice in this work.
A R-code
First_Passage_Moments <- function(j, p, E) {
  # j: target UA state; p: transition matrix of the embedded chain;
  # E: list of holding-time moment matrices, E[[m]] = e^(m).
  r <- length(E)                      # highest moment requested
  n <- nrow(p)
  I <- diag(n)                        # identity matrix
  J <- matrix(1, n, n)                # matrix of ones
  Id <- I
  Id[, j] <- 0                        # I(-j): identity with column j zeroed
  inv_matrix <- solve(I - p %*% Id)   # [I - p I(-j)]^{-1}
  result <- vector("list", r + 1)
  result[[1]] <- matrix(1, n, 1)      # order-0 term is a column of ones
  for (i in 1:r) {
    result[[i + 1]] <- 0
    for (k in 0:(i - 1)) {
      result[[i + 1]] <- result[[i + 1]] +
        (choose(i, k) * p * E[[i - k]]) %*% (((J - I)[, j])^k * result[[k + 1]])
    }
    result[[i + 1]] <- inv_matrix %*% result[[i + 1]]
  }
  return(result[2:(r + 1)])
}

p <- matrix(c(0, 1, 0, .8, 0, .2, 0, 0, 1), nrow = 3, byrow = TRUE)
E <- matrix(c(0, 6, 0, 0.7, 0, 1.1, 0, 0, 0), nrow = 3, byrow = TRUE)
L <- list(E)
First_Passage_Moments(3, p, L)        # first moments: 33.9, 27.9, 0

References

[1] V. Barbu, M. Boussemart, and N. Limnios,
Discrete-time semi-Markov model for reliability and survival analysis, Commun. Stat. Theory Methods (2004), no. 11, 2833–2868.

[2] Vlad Stefan Barbu and Nikolaos Limnios, Semi-Markov chains and hidden semi-Markov models toward applications: Their use in reliability and DNA analysis, Springer, New York, 2008.

[3] R.E. Barlow and F. Proschan, Mathematical theory of reliability, John Wiley & Sons, London and New York, 1965.

[4] Shelby Brumelle and Darius Walczak, Dynamic airline revenue management with multiple semi-Markov demand, Operations Research (2003), no. 1, 137–148.

[5] Ronald W. Butler and Aparna V. Huzurbazar, Stochastic network models for survival analysis, J. Amer. Statist. Assoc. (1997), 246–257. MR 1436113 (98e:62140)

[6] George Casella and Roger L. Berger, Statistical inference, second ed., Duxbury, Pacific Grove, CA, 2002.

[7] Miroslav Fiedler, Special matrices and their applications in numerical mathematics, 2nd ed., Dover Publications, New York, 2008.

[8] J.J. Hunter, On the moments of Markov renewal processes, Advances in Applied Probability (1969), 188–210.

[9] Jacques Janssen and Raimondo Manca, Semi-Markov risk models for finance, insurance, and reliability, Springer, New York, 2007.

[10] J.G. Kemény and J.L. Snell, Finite Markov chains, University Series in Undergraduate Mathematics, Van Nostrand, 1960.

[11] Jeffrey P. Kharoufeh, Christopher J. Solo, and M.Y. Ulukus, Semi-Markov models for degradation-based reliability, IIE Transactions (2010), no. 8, 599–612.

[12] David Kincaid and Ward Cheney, Numerical analysis: mathematics of scientific computing, 3rd ed., vol. 2, American Mathematical Society, Providence, 2002.

[13] Leonard Kleinrock, Queueing systems, volume I: Theory, John Wiley & Sons, New York, 1975.

[14] V.V. Kolpakov, Matrix seminorms and related inequalities, Journal of Mathematical Sciences (1983), no. 1, 2094–2106.

[15] E.L. Lehmann, Elements of large-sample theory, Springer, New York, 1999.

[16] Steven R. Lerman, The use of disaggregate choice models in semi-Markov process models of trip chaining behavior, Transportation Science (1979), no. 4, 273–291.

[17] P. Levy, Processus semi-Markoviens, Proc. Intern. Congr. Math. (1954), 416–426, Amsterdam, The Netherlands.

[18] ——, Systèmes semi-Markoviens ayant au plus une infinité dénombrable d'états possibles, Proc. Intern. Congr. Math. (1954), 294–295, Amsterdam, The Netherlands.

[19] N. Limnios and G. Oprişan, Semi-Markov processes and reliability, Birkhäuser, Boston, 2001.

[20] Carl D. Meyer, Matrix analysis and applied linear algebra, Society for Industrial and Applied Mathematics, Philadelphia, 2000.

[21] M.F. Neuts, Structured stochastic matrices of M/G/1 type and their applications, Probability: Pure and Applied, Marcel Dekker, Inc., New York and Basel, 1989.

[22] Ronald Pyke, Markov renewal processes: Definitions and preliminary properties, The Annals of Mathematical Statistics (1961), no. 4, 1231–1242.

[23] ——, Markov renewal processes with finitely many states, The Annals of Mathematical Statistics (1961), no. 4, 1243–1259.

[24] Sheldon M. Ross, Applied probability models with optimization applications, Dover, New York, 1992.

[25] Jia-Yu Shao, Products of irreducible matrices, Linear Algebra and Its Applications (1985), 131–143.

[26] W.L. Smith, Regenerative stochastic processes, Proc. Roy. Soc. (GB), Series A (1955), 6–31.

[27] David D. Yao, First-passage-time moments of Markov processes, Journal of Applied Probability (1985), no. 4, 939–945.

[28] H.L.S. Younes and R.G. Simmons, Solving generalized semi-Markov decision processes using continuous phase-type distributions, Proceedings of the National Conference on Artificial Intelligence, 2004, pp. 742–747.

[29] Xuan Zhang and Zhenting Hou, The first-passage times of phase semi-Markov processes, Statistics & Probability Letters 82.