First Passage Moments of Finite-State Semi-Markov Processes
Richard L. Warr∗ and James D. Cordeiro∗

September 20, 2018
Abstract
In this paper, we discuss the computation of first-passage moments of a regular time-homogeneous semi-Markov process (SMP) with a finite state space to certain of its states that possess the property of universal accessibility (UA). A UA state is one which is accessible from any other state of the SMP, but which may or may not connect back to one or more other states. An important characteristic of UA is that it is the state-level version of the oft-invoked process-level property of irreducibility. We adapt existing results for irreducible SMPs to the derivation of an analytical matrix expression for the first passage moments to a single UA state of the SMP. In addition, consistent point estimators for these first passage moments, together with relevant R code, are provided.
KEY WORDS: First Passage Distributions; Markov Renewal Process; Spectral Radius; Statistical Flowgraph Model; Universally Accessible

AMS MSC 2010: 60K15; 60G10; 62M05
1 Introduction

Since the seminal works of Lévy [17, 18] and Smith [26], semi-Markov processes (SMPs) have been utilized as a framework for a wide variety of applications within the scientific literature. Much of the interest is due to the

∗ Department of Mathematics and Statistics, Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio, USA. The views expressed in this article are those of the authors and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government.
class of PH-distributions [11, 28], often used in reliability, but which also appear in the context of SMP first passage moments, as in [29]. Other areas that have seen the application of SMP models are DNA analysis [2], queueing theory [13, 21], finance [9], artificial intelligence [28], and transportation [4, 16], to name but a few.

In this article, we will show the existence of, and then derive, the moments of first passage to states of an SMP with a finite state space that have the property of being accessible from every other state. We call this property universal accessibility (UA) and note that it can be likened to a state-level version of the property of irreducibility. This comes as a consequence of the fact that, as we will later show, UA of every state is a necessary and sufficient condition for irreducibility to hold. In this sense, then, UA of a subset of states of an SMP may be considered a natural relaxation of the property of irreducibility, which has been the standard assumption in the work of all researchers dating from Pyke [22, 23] onwards. Rather than being a simple generalization, we will show here that UA is, in fact, a minimal condition required for the existence of finite moments of first passage. This demonstration requires an application of the Perron-Frobenius theorem generalized to reducible matrices (and hence reducible processes) in order to establish the existence of the required matrix inverse. For further details on the Perron-Frobenius theorem, and spectral theory in general, see [7]. Although the proof of invertibility is somewhat involved, one gains the advantage of being able to consider only the first passage moments to a given universally accessible state, thus reducing the dimensionality of the problem. In addition, the expression that we derive does not involve inverses of singular matrix terms.
Contrast this to the situation of Hunter [8] and later researchers, whose expressions for first passage moments involved a noninvertible matrix, thus requiring a generalized-inverse approach. Another significant advantage is that we are able to discard the somewhat strong assumptions of positive recurrence, and thus irreducibility, thereby enlarging the class of SMPs for which a unified analytical approach to computing the first passage moments is available.

Explicit time-domain formulas for the first two moments of the first passage distribution of an irreducible ergodic SMP with a finite state space have long been known. Pyke [23] inverted Laplace-Stieltjes transform matrices under restrictive non-singularity conditions in order to derive the first and second moments. Hunter [8] repeated this analysis by means of Markov renewal theory, and then solved for the matrix of first passage moments M of the SMP through multiplication of the matrix I − P by its generalized inverse, where P is the transition probability matrix of the embedded discrete-time Markov chain. Although the role of the fundamental matrix of the embedded DTMC in solving the problem of finding the first passage moments had been recognized since at least Kemeny and Snell [10], it was Hunter [8] who demonstrated its importance by proving that the fundamental matrix is a generalized inverse of I − P. Some years later, Yao [27] was able to use the generalized inverse to find all moments of first passage. Zhang and Hou [29] likewise employed the generalized-inverse method in order to derive exact first passage moments for SMPs with phase-type (PH-)distributed sojourn times between states, thus capitalizing on the robust interest in the reliability community in these somewhat exponential-like statistical distributions.
All of these previous investigations assumed irreducibility, and are thus useful background for, though not directly applicable to, the type of reducible process that we consider here.

The remainder of the paper proceeds as follows. In Section 2, we define the notation, terminology, and assumptions that guide the remainder of the discourse. In Section 3, we introduce the notion of universal accessibility, as well as a result that explains its relationship to irreducibility. We then present the main result in Section 4, which is the derivation of the formula for the first passage moments under the condition of universal accessibility. Finally, in Section 5, we present a method for estimating the first passage moments of SMPs and a brief example.
2 Notation, Terminology, and Assumptions

In this section we introduce the notation used in this paper. A boldface symbol without indices refers to a matrix (e.g., $\mathbf{F}(t)$ is a matrix with elements $F_{ij}(t)$ in the $i$th row and $j$th column). We will sometimes drop the function argument for simplicity's sake; e.g., $\mathbf{F} = \mathbf{F}(t)$. In the usual way, we define the Kronecker delta as
$$\delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \neq j. \end{cases}$$
Additionally, we specify that the $m$-dimensional square matrices $\mathbf{I}$ and $\mathbf{J}$ denote the identity matrix and the matrix whose entries consist of ones, respectively. Finally, the matrix binary operator '$\circ$' denotes Hadamard (element-wise) multiplication; i.e., $[\mathbf{A} \circ \mathbf{B}]_{ij} = A_{ij}B_{ij}$.

We now define a regular time-homogeneous SMP $\{Z(t) : t \geq 0\}$ with a finite state space $\mathcal{S} = \{1, 2, \ldots, m\}$. Note that the assumption of regularity implies that the process may transition only a finite number of times in a finite time interval with probability 1. Let $S_k$, $k = 0, 1, 2, \ldots$ be the transition epochs of the SMP and let $Z_k = Z(S_k)$. We define the kernel matrix $\mathbf{Q}(x) = [Q_{ij}(x)]$ of the SMP as
$$Q_{ij}(x) = P\{Z_{k+1} = j,\, S_{k+1} - S_k \leq x \mid Z_k = i\} = P\{Z_1 = j,\, S_1 \leq x \mid Z_0 = i\},$$
which gives the joint probabilities of waiting times and transitions from state $i \in \mathcal{S}$ to state $j \in \mathcal{S}$. The transition matrix of the embedded discrete-time Markov chain (DTMC) is thus given by $\mathbf{p} = \mathbf{Q}(\infty)$. In addition, we define the matrix of distribution functions $\mathbf{F}(x) = [F_{ij}(x)]$ of the sojourn times in state $i$, given that the process transitions to state $j$, as
$$F_{ij}(x) = P\{S_1 \leq x \mid Z_0 = i, Z_1 = j\}, \qquad (1)$$
with associated $r$th moments $\mathbf{e}^{(r)} = [e^{(r)}_{ij}]$, $\mathbf{e} = [e_{ij}] = \mathbf{e}^{(1)}$, $r \geq 1$. The kernel may then be written elementwise as
$$Q_{ij}(x) = p_{ij}\,F_{ij}(x), \qquad (2)$$
or, alternatively, as the Hadamard matrix product $\mathbf{Q} = \mathbf{p} \circ \mathbf{F}$.

The similarity in behavior of an SMP to a Markov chain at the transition epochs $\{S_k : k = 0, 1, 2, \ldots\}$ is due to the classification of these transitions as Markov renewal epochs.
These are times at which the process in question possesses the Markov, or memoryless, property:
$$P\{Z_{k+1} = j \mid Z_k = i, Z_{k-1}, \ldots, Z_1, Z_0\} = P\{Z_{k+1} = j \mid Z_k = i\}.$$
Define the random variable $N_k(t)$ to be the number of transitions (Markov renewals) of the SMP into state $k$ up to and including time $t \geq 0$, and let
$$\mathbf{N}(t) \equiv [N_k(t)]_{k \in \mathcal{S}}$$
be the vector consisting of the random counting variables $N_k(t)$. Also define the scalar random variable
$$N(t) \equiv \sum_{k=1}^{m} N_k(t)$$
to be the total number of transitions, or Markov renewals, of the SMP up to time $t$. We thus obtain the relationship $Z(t) = Z_{N(t)}$ between the SMP $\{Z(t) : t \geq 0\}$ and its embedded DTMC $\{Z_k : k \geq 0\}$. The vector counting process $\{\mathbf{N}(t) : t \geq 0\}$ is known as the Markov renewal process associated to the SMP $\{Z(t) : t \geq 0\}$.

The state properties of the SMP, such as irreducibility and recurrence, may be elicited from the properties of its embedded DTMC $\{Z_n : n \geq 0\}$. We say that state $j$ is accessible from state $i$ (written $i \to j$) if there is a nonzero probability that $\{Z_n\}$ may transition to state $j$ in a finite number of steps, given that it begins in state $i$. Mathematically, this means that there is some $n \in \mathbb{Z}^+$ such that $p^{(n)}_{ij} >$
0, where $p^{(n)}_{ij} = P\{Z_n = j \mid Z_0 = i\}$. The matrix $\mathbf{p}^{(n)} = [p^{(n)}_{ij}]$ is called the $n$th-step transition probability matrix. The $ij$th element of this matrix denotes the probability of the DTMC transitioning from state $i$ to state $j$ in $n$ stages and can be computed using the identity $\mathbf{p}^{(n)} = \mathbf{p}^n$. On the other hand, we say that state $j$ is not accessible from state $i$ (denoted $i \not\to j$) if $p^{(n)}_{ij} = 0$ for all $n$. There may also exist a state $0 \in \mathcal{S}$ known as an absorbing state, which is to say that, for any other state $j \in \mathcal{S}$, $0 \not\to j$. In this case, the SMP, having transitioned to state 0, sojourns for an infinite amount of time in this state. Many applications in survival and reliability analysis may be modeled using stochastic processes with one or more absorbing states. Transitioning to an absorbing state is tantamount to death or complete failure in the original process.

If $i$ and $j$ are mutually accessible (that is, $i \to j$ and $j \to i$, denoted $i \leftrightarrow j$), then they are said to communicate. Since communication fulfills the axioms of reflexivity, transitivity, and symmetry, it is an equivalence relation, and thus defines a partitioning of the state space $\mathcal{S}$ into disjoint communicating classes. If $\mathcal{S}$ is itself comprised of a single communicating class, then the SMP is called irreducible; otherwise, it is known as reducible. On the other hand, a nonnegative $m \times m$ matrix $\mathbf{A} = [a_{ij}]$ is an irreducible matrix if, for each $i$ and $j$, there exists some $0 < \eta < \infty$ such that the $ij$th element of $\mathbf{A}^{\eta}$ is greater than 0. The algebraic and probabilistic definitions of irreducibility coincide if the irreducible matrix is the transition probability matrix $\mathbf{p}$, for then the $ij$th element of $\mathbf{p}^{(\eta)} = \mathbf{p}^{\eta}$ is strictly positive if and only if $j$ is accessible from state $i$ in a finite number $\eta$ of steps with nonzero probability.
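The identity $\mathbf{p}^{(n)} = \mathbf{p}^n$ gives a direct computational test for accessibility: $i \to j$ exactly when some power of $\mathbf{p}$ up to $\mathbf{p}^m$ has a positive $(i, j)$ entry (a path that repeats no state has fewer than $m$ steps). A minimal sketch, with an illustrative two-state matrix that is not from the paper:

```python
import numpy as np

def is_accessible(p, i, j):
    """True iff p^(n)_ij > 0 for some n in 1..m, i.e. state j is
    accessible from state i in the embedded DTMC with matrix p."""
    m = p.shape[0]
    pn = np.eye(m)
    for _ in range(m):
        pn = pn @ p              # n-th step transition matrix p^(n) = p^n
        if pn[i, j] > 0:
            return True
    return False

# State 0 is absorbing, so 0 -/-> 1 while 1 -> 0.
p = np.array([[1.0, 0.0],
              [0.5, 0.5]])
print(is_accessible(p, 1, 0), is_accessible(p, 0, 1))   # True False
```

Checking only powers up to $m$ suffices, which keeps the test finite even though the definition quantifies over all $n$.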
This last statement can be made precise by reference to the digraph associated to $\mathbf{A}$, denoted $G(\mathbf{A})$. This is the digraph with vertices in the set $V(G(\mathbf{A})) = \{1, 2, \ldots, m\}$ such that the directed arc, or edge, $(i, j)$ exists if and only if $a_{ij} > 0$. $G(\mathbf{A})$ is said to be strongly connected if, for each ordered pair $i, j \in V(G(\mathbf{A}))$, there exists a (directed) path in $G(\mathbf{A})$ from $i$ to $j$. In either case of there being an edge or directed path from $i$ to $j$, the implication is clearly $i \to j$. The final connection between irreducibility and connectedness is made in the following proposition.

Proposition 2.1
Let $\mathbf{A}$ be a nonnegative square matrix. Then $\mathbf{A}$ is irreducible if and only if $G(\mathbf{A})$ is strongly connected.

Proof
See Shao [25].

We next address the first passage times of an SMP. To this end, define the random variable
$$T_j = \inf\{t \geq S_1 : Z(t) = j\}, \qquad j \in \mathcal{S},$$
which represents the time of first passage from an initial state $i$ to state $j$ if $i \neq j$, and the time of first return to $j$ otherwise. The distribution function $G_{ij}(t)$ of first passage, conditioned on being in the initial state $i \in \mathcal{S}$, is defined as
$$G_{ij}(t) = P\{T_j \leq t \mid Z(0) = i\},$$
and the corresponding $r$th moments $\mu^{(r)}_{ij}$, $r \geq$
1, if they exist, are given by
$$\mu^{(r)}_{ij} = E\left[T_j^r \mid Z(0) = i\right].$$
We thus define $\mathbf{G}(t)$ and $\boldsymbol{\mu}^{(r)} = [\mu^{(r)}_{ij}]$ to be the matrices of first passage distribution functions and moments.

As stated in Proposition 5.15 of [24, pg. 104] and Lemma 4.1 of [29], the moments of first passage for an irreducible SMP may be computed as the finite solutions to the systems of equations given by
$$\mu^{(1)}_{ij} = \sum_{k=1}^{m} p_{ik}\left[(1 - \delta_{kj})\,\mu^{(1)}_{kj} + e_{ik}\right], \qquad (3)$$
$$\mu^{(r)}_{ij} = \sum_{k=1}^{m} p_{ik}\, e^{(r)}_{ik} + \sum_{s=1}^{r} \binom{r}{s} \sum_{k \neq j} p_{ik}\, e^{(r-s)}_{ik}\, \mu^{(s)}_{kj}, \qquad r \geq 2. \qquad (4)$$
Clearly, a necessary condition for $\mu^{(r)}_{ij} < \infty$ is that $i \to j$, which is certainly true if the SMP is irreducible. In contrast, we observe that $G_{ij}(\infty) < 1$ (and hence $\mu_{ij} = \infty$) might occur for a pair of states $i, j \in \mathcal{S}$ if $i \not\to j$. As we will later show, (3) and (4) still hold under the somewhat weakened assumption of universal accessibility for the terminal state $j$.

The recurrence properties of an SMP may be explained in terms of the distribution of the first passage of the SMP from a given state $i \in \mathcal{S}$ back to itself, otherwise known as the time of (first) return to $i$. The crucial step is to define
$$f_{ii} = P\{N(T_i) < \infty \mid Z_0 = i\},$$
which is the probability that the number of steps required for the embedded DTMC $\{Z_n : n \geq 0\}$ to return to state $i$ is finite. If $f_{ii} <$
1, then the state $i \in \mathcal{S}$ is called transient; otherwise, it is known as recurrent. If, in addition to recurrence, we have $\mu_{ii} < \infty$, then the state is called positive recurrent. The SMP itself is deemed recurrent, transient, or positive recurrent as a process if the corresponding condition holds for every state $i \in \mathcal{S}$. For an irreducible SMP with a finite state space, it is well known that the process is automatically positive recurrent. This is not true, in general, for a reducible process, but may be evaluated on a state-by-state basis.

The Perron-Frobenius theorem adapted to finite-dimensional irreducible nonnegative matrices is very useful for characterizing the set of eigenvalues of such matrices. As we will see later, the theory may be (indirectly) extended even to reducible nonnegative matrices by leveraging their distinctive canonical form. Let $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ for some positive integer $m$. We define the spectrum of $\mathbf{A}$, denoted $\sigma_{\mathbf{A}}$, to be the set of its eigenvalues. Its spectral radius, denoted $\rho(\mathbf{A})$, is given by
$$\rho(\mathbf{A}) = \max\{|\lambda| : \lambda \in \sigma_{\mathbf{A}}\} \in \mathbb{R}_+,$$
which indicates the maximum radius of the disc that contains $\sigma_{\mathbf{A}}$ in the complex plane. Of particular interest is the case of a finite-dimensional stochastic matrix $\mathbf{A}$, which is a nonnegative square matrix such that $\mathbf{A}\mathbf{1} = \mathbf{1}$, where $\mathbf{1}$ is a column vector of ones. Perron-Frobenius theory, via Proposition 2.4 for the reducible case, implies that the spectral radius is likewise an eigenvalue of $\mathbf{A}$, denoted the Perron root of $\mathbf{A}$. Stochastic matrices comprise the boundary of the unit ball $\mathcal{A} = \{\mathbf{A} \in \mathbb{R}^{m \times m}_+ : \|\mathbf{A}\|_\infty \leq 1\}$ of finite-dimensional nonnegative matrices in the normed linear space induced by the infinity norm $\|\cdot\|_\infty$, which is given by the maximum absolute row sum of $\mathbf{A} = [a_{ij}]$, or
$$\|\mathbf{A}\|_\infty = \max_i \sum_{j=1}^{m} |a_{ij}| = \max(\mathbf{A}\mathbf{1}).$$
As the next proposition will show, we may classify certain elements of $\mathcal{A}$ with spectral radius $\rho(\mathbf{A}) < 1$ as substochastic, which is to say that $\min(\mathbf{A}\mathbf{1}) < 1$.

Proposition 2.2
Suppose that $\mathbf{A} \in \mathcal{A}$. If $\rho(\mathbf{A}) < 1$, then $\mathbf{A}$ is substochastic.

Proof
Clearly, since $\mathbf{A} \in \mathcal{A}$, it must be either stochastic or substochastic. Therefore the only thing that must be proved is that $\mathbf{A}$ is not stochastic. Assume $\mathbf{A}$ is stochastic; i.e., $\mathbf{A}\mathbf{1} = \mathbf{1}$. This implies that 1 is an eigenvalue, which contradicts $\rho(\mathbf{A}) <$
1. Therefore, $\mathbf{A}$ must be substochastic.

Conversely, for an irreducible nonnegative matrix $\mathbf{A}$, it is, in fact, sufficient for $\mathbf{A}$ to be substochastic in order for its spectral radius to be strictly less than unity, as the next proposition shows.

Proposition 2.3 If $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ is an irreducible substochastic matrix, then $\rho(\mathbf{A}) < 1$.

Proof
See Theorem 7 in [14].

For such reasons, among others, it is very convenient to work with irreducible processes. Results for irreducible matrices (processes) may still be applied to the reducible case via an important consequence of the mathematical notion of reducibility. From the definition given above for a reducible matrix $\mathbf{A}$, it can be shown that there exists a permutation matrix $\mathbf{P}$ such that $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$ is in upper block triangular form as follows:
$$\mathbf{A} \sim \mathbf{P}\mathbf{A}\mathbf{P}^{-1} = \begin{bmatrix}
\mathbf{A}_{11} & \mathbf{A}_{12} & \cdots & \mathbf{A}_{1r} & \mathbf{A}_{1,r+1} & \mathbf{A}_{1,r+2} & \cdots & \mathbf{A}_{1M} \\
\mathbf{0} & \mathbf{A}_{22} & \cdots & \mathbf{A}_{2r} & \mathbf{A}_{2,r+1} & \mathbf{A}_{2,r+2} & \cdots & \mathbf{A}_{2M} \\
\vdots & & \ddots & \vdots & \vdots & & & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{rr} & \mathbf{A}_{r,r+1} & \mathbf{A}_{r,r+2} & \cdots & \mathbf{A}_{rM} \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{A}_{r+1,r+1} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} & \mathbf{A}_{r+2,r+2} & \cdots & \mathbf{0} \\
\vdots & & & \vdots & \vdots & & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{MM}
\end{bmatrix} \qquad (5)$$
This is known as the canonical form for a reducible matrix. The canonical form is not unique, meaning that there may be two or more permutation matrices $\mathbf{P}$ such that $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$ is in canonical form. Its utility derives from several highly useful properties, which we now discuss. First, every block matrix on the diagonal of the canonical matrix, $\mathbf{A}_{\nu\nu}$, $\nu \in \{1, \ldots, M\}$, is either the $1 \times 1$ zero matrix $[0]$ or is irreducible. Moreover, the eigenvalues of $\mathbf{A}$ are invariant under the permutation transformation $\mathbf{P}\mathbf{A}\mathbf{P}^{-1}$. From a stochastic process perspective, we observe that transforming a reducible stochastic matrix $\mathbf{p}$ into its canonical form is equivalent to relabeling the state space of an associated (reducible) DTMC. The states are organized in the canonical transition matrix in such a way that, for some $r \in \mathbb{Z}^+$, the transient-state transitions are represented within the diagonal blocks $\mathbf{A}_{\nu\nu}$ for $1 \leq \nu \leq r$, while the blocks $\mathbf{A}_{\nu\nu}$ for $r + 1 \leq \nu \leq M$ represent recurrent-state transitions. Additionally, as shown in Equation 8.4.7 of [20], $\rho(\mathbf{A}_{\nu\nu}) < 1$ for $1 \leq \nu \leq r$, which, by Proposition 2.2, further shows that the first $r$ diagonal sub-blocks of $\mathbf{A}$ are substochastic. The following proposition rounds out this list of useful properties by relating the spectral radius of the sub-blocks of the canonical matrix to that of the entire matrix.

Proposition 2.4
Suppose $\mathbf{A} \in \mathbb{R}^{m \times m}_+$ is a reducible matrix in canonical form. Then $\rho(\mathbf{A}) = \max_\nu \rho(\mathbf{A}_{\nu\nu})$ for $1 \leq \nu \leq M$.

Proof
See Lemma 1 in [12, pg. 303], with an additional induction argument to obtain the result, or the argument in [7, pg. 115].

Table 1 summarizes the important notation that we will use throughout this paper.
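The spectral-radius facts above are easy to verify numerically. The sketch below (all matrix entries are illustrative, not taken from the paper) assembles a reducible nonnegative matrix in the canonical form (5) from an irreducible substochastic block and an irreducible stochastic block, then checks Propositions 2.3 and 2.4:

```python
import numpy as np

def rho(M):
    """Spectral radius: the largest modulus of an eigenvalue."""
    return max(abs(np.linalg.eigvals(M)))

# Irreducible substochastic block (row sums <= 1, at least one < 1)
# and an irreducible stochastic block; entries are illustrative.
A11 = np.array([[0.0, 0.9],
                [0.6, 0.3]])
A22 = np.array([[0.5, 0.5],
                [0.5, 0.5]])
A12 = np.array([[0.1, 0.0],
                [0.0, 0.1]])

# Reducible matrix in canonical (block upper-triangular) form.
A = np.block([[A11, A12],
              [np.zeros((2, 2)), A22]])

print(rho(A11) < 1)                                   # Proposition 2.3: True
print(np.isclose(rho(A), max(rho(A11), rho(A22))))    # Proposition 2.4: True
```

Because the canonical form is block upper triangular, the eigenvalues of the whole matrix are exactly the eigenvalues of its diagonal blocks, which is the computational content of Proposition 2.4.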
3 Universal Accessibility

In this section, we introduce the property of universal accessibility of a state $j \in \mathcal{S}$. As we will later demonstrate, universal accessibility is a sufficient condition for the existence of a well-defined first-passage moment to a given state of the SMP.

Definition
Let $P$ be a stochastic process with state space $\mathcal{S}$. State $j \in \mathcal{S}$ is said to be universally accessible (UA) if, for every state $i \in \mathcal{S}$, we have $i \to j$.

If the set of UA states is a proper subset of $\mathcal{S}$, then it is clear that the SMP is reducible. On the other hand, if every state is UA, then all states must communicate, as the next proposition asserts.

Table 1: List of important symbols and notation.

$m$: the number of states in the SMP
$x$: the sojourn time in a state
$t$: calendar time, or the time since the process began
$p_{ij}$: the probability that the next state in the process is $j$, given that the process entered state $i$
$F_{ij}(x)$: the CDF of the waiting-time distribution in state $i$, given that the next transition is to state $j$
$G_{ij}(t)$: the CDF of the first passage distribution from state $i$ to state $j$
$\mathbf{I}$: the $m \times m$ identity matrix
$\mathbf{I}^{(-j)}$: the $m \times m$ identity matrix with the $j$th row and column set to 0
$\mathbf{J}$: the $m \times m$ matrix of all 1s
$\mathbf{A}_j$: the $j$th column of matrix $\mathbf{A}$
$\mathbf{A} \circ \mathbf{B}$: the element-wise product of two matrices
$e^{(r)}_{ij}$: $\int x^r \, dF_{ij}$, with $e_{ij} \equiv e^{(1)}_{ij}$
$\mu^{(r)}_{ij}$: $\int x^r \, dG_{ij}$, with $\mu_{ij} \equiv \mu^{(1)}_{ij}$
$\rho(\mathbf{A})$: the spectral radius of the matrix $\mathbf{A}$
$\|\mathbf{A}\|_\infty$: the infinity norm of a matrix

Proposition 3.1
An SMP $\{Z(t) : t \geq 0\}$ with state space $\mathcal{S}$ is irreducible if and only if every $j \in \mathcal{S}$ is UA.

The property of a state being UA is, in a sense, the minimal requirement for the existence of all first-passage moments. In the next section, we demonstrate the sufficiency of this condition by means of the Perron-Frobenius theorem applied to the canonical form of the reducible transition probability matrix of the embedded DTMC.
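Proposition 3.1 can be checked computationally on small examples: in an irreducible chain every state is UA, while making one state absorbing destroys universal accessibility for every other state. A minimal sketch, with illustrative transition matrices of our own choosing:

```python
import numpy as np

def reachability(p):
    """Boolean matrix whose (i, j) entry is True iff i -> j within m steps."""
    m = p.shape[0]
    reach = np.zeros((m, m), dtype=bool)
    pn = np.eye(m)
    for _ in range(m):
        pn = pn @ p
        reach |= pn > 0
    return reach

def is_UA(p, j):
    """State j is universally accessible iff i -> j for every i != j."""
    r = reachability(p)
    return all(r[i, j] for i in range(p.shape[0]) if i != j)

def is_irreducible(p):
    """Irreducible iff every ordered pair of states communicates."""
    return bool(reachability(p).all())

# Irreducible cycle 1 -> 2 -> 3 -> 1 (0-indexed below): every state is UA.
p_irr = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])
print(is_irreducible(p_irr), all(is_UA(p_irr, j) for j in range(3)))   # True True

# Make the third state absorbing: reducible, and only that state is UA.
p_red = p_irr.copy()
p_red[2] = [0.0, 0.0, 1.0]
print(is_irreducible(p_red), [is_UA(p_red, j) for j in range(3)])
```

The second chain illustrates the point of the paper: it is reducible, yet its single UA state still admits finite first passage moments from every other state.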
4 First Passage Moments

In this section, we derive a formula for determining the first and higher moments of first passage times in reducible
SMPs to special states $j$ that are UA. We begin with a technical result that will be needed in the proof of Theorem 4.2 to demonstrate that the matrix formula for the moments of first passage to a UA state $j \in \mathcal{S}$ is well-defined. For notational convenience, define $\mathbf{I}^{(-j)}$ to be the identity matrix with the $j$th diagonal element set to zero.

Lemma 4.1
Let $\{Z(t) : t \geq 0\}$ be an SMP with finite state space $\mathcal{S}$ and embedded DTMC at transition epochs with transition probabilities contained within the (stochastic) matrix $\mathbf{p}$. Then the matrix $[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]$ is nonsingular if and only if state $j \in \mathcal{S}$ is universally accessible (UA).

Proof
We begin with the observation that, since $\mathbf{A} = [\mathbf{A}_{\nu\kappa}] = \mathbf{p}\mathbf{I}^{(-j)}$ is formed by setting each element of the $j$th column of $\mathbf{p}$ to 0, we essentially remove all directed arcs $(i, j)$ in the digraph $G(\mathbf{p})$ for each $i \in V(G(\mathbf{p}))$ in order to produce $G(\mathbf{A})$. This means that $G(\mathbf{A})$ cannot be strongly connected, and thus $\mathbf{A}$ must be reducible. We may therefore assume that $\mathbf{A}$ is in canonical form (5). Furthermore, because the $j$th column is zero, we will assume without loss of generality that the canonical form of $\mathbf{A}$ corresponds to the particular ordering of the states in $\mathcal{S}$ in which state $j$ is re-designated as state 1. We impose the same permutation and partitioning on $\mathbf{p} = [\mathbf{p}_{\nu\kappa}]$, so that
$$\mathbf{A}_{\nu\kappa} = \begin{cases} \mathbf{p}_{\nu\kappa} & \text{if } (\nu, \kappa) \in \{1, \ldots, M\} \times \{2, 3, \ldots, M\}, \\ \mathbf{0} & \text{if } (\nu, \kappa) \in \{1, \ldots, M\} \times \{1\}, \end{cases} \qquad (6)$$
where, as in (5), $M$ is the block dimension of $\mathbf{A}$. Notice that since $\mathbf{p}$ may be irreducible, the above does not necessarily imply that $\mathbf{p}$ can be put in canonical form, but rather that $\mathbf{p}$ is element-wise equivalent to $\mathbf{A}$, save for the first column, which, unlike that of $\mathbf{A}$, may contain positive entries. Stated succinctly, we have that
$$\mathbf{0} = \mathbf{A}_{\nu 1} \leq \mathbf{p}_{\nu 1}, \qquad \nu = 1, \ldots, M.$$
Assume that $[\mathbf{I} - \mathbf{A}]$ is nonsingular, which directly implies that $1 \notin \sigma_{\mathbf{A}}$; that is, 1 is not an eigenvalue of $\mathbf{A}$. Since $\mathbf{p}$ is a row-stochastic matrix, and because of the equivalence given in (6), the Gerschgorin Circle Theorem (see [20, Eqn. 7.1.13]) indicates that the spectral radius $\delta = \rho(\mathbf{A}) \leq 1$. The nonnegativity of $\mathbf{A}$ permits the use of Equation 8.3.1 of [20] to then assert that the Perron root satisfies $0 \leq \delta \leq 1$. Since $1 \notin \sigma_{\mathbf{A}}$, it must then be the case that $\delta <$
1. This implies, by Proposition 2.4, that $\rho(\mathbf{A}_{\nu\nu}) < 1$ for every $\nu \in \{1, \ldots, M\}$, and hence, by Proposition 2.2, each diagonal block $\mathbf{A}_{\nu\nu}$, $\nu \in \{1, \ldots, M\}$, must be substochastic.

We now consider the $\nu$th diagonal block in the canonical form of $\mathbf{A}$, where $\nu \in \{2, \ldots, M\}$, and proceed to show that each state $i$ associated to the vertex set $V(G(\mathbf{A}_{\nu\nu}))$ can access state 1. Because $\mathbf{p}$ is a row-stochastic matrix and $\mathbf{A}_{\nu\nu}$ is substochastic, either or both of the following may hold:

1. $\mathbf{p}_{\nu 1} \neq \mathbf{0}$, or
2. $\mathbf{A}_{\nu\kappa} \neq \mathbf{0}$ for some $\kappa > \nu$.

For 1), $\mathbf{p}_{\nu 1} \neq \mathbf{0}$ indicates the existence of states $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$ (with $i_\nu = i$ possible, but not necessary) and $1 \in V(G(\mathbf{A}))$ for which there is a directed arc $(i_\nu, 1)$ in $G(\mathbf{p})$, while the irreducibility of $\mathbf{A}_{\nu\nu}$ gives a directed path from $i$ to $i_\nu$. We thus obtain $i \to i_\nu \to 1$. In other words, there is a directed path from $i$ to 1.

If 2) holds, there exists a directed arc from some state $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$ (again, with the possibility that $i_\nu = i$) to a state $i_\kappa \in V(G(\mathbf{A}_{\kappa\kappa}))$. From here, we are again confronted with choices 1) and 2). If 1) holds, then the previous argument gives us a directed path from $i_\kappa$ to 1. Since the irreducibility of $\mathbf{A}_{\nu\nu}$ implies the existence of a path from $i$ to $i_\nu$, we have the accessibility chain $i \to i_\nu \to i_\kappa \to 1$, and we are done. Otherwise, we proceed to the next diagonal block following $\mathbf{A}_{\kappa\kappa}$ and continue until $\nu > r$. If $\nu > r$, then the process is in a state $i_\nu \in V(G(\mathbf{A}_{\nu\nu}))$. The only choice here, due to this block being substochastic, is 1); that is, $\mathbf{p}_{\nu 1} \neq \mathbf{0}$, for which we have already demonstrated the existence of the connection $i_\nu \to$
1. Each of the preceding paths may then be combined to form a single directed path from an arbitrarily selected $i \in V(G(\mathbf{A}_{\nu\nu}))$ to 1, so that $i \to i_\nu \to i_\kappa \to \cdots \to i_M \to 1$. Thus, state 1 is UA.

For the reverse implication, we will assume that state 1 is UA, and then proceed to show that $[\mathbf{I} - \mathbf{A}]$ is nonsingular. The reducibility of $\mathbf{A}$ allows us to assume that it possesses canonical form and, furthermore, that each submatrix on the diagonal of the canonical matrix corresponding to $\mathbf{A}$ is irreducible or zero. Consider an arbitrary nonzero, and hence irreducible, diagonal submatrix $\mathbf{A}_{\nu\nu}$ for some $\nu \in \{2, \ldots, M\}$ (recall that $\mathbf{A}_{11} = [0]$ by definition of $\mathbf{A}$). By the assumption that state 1 is UA, there must be a directed path from each state in the vertex set $V(G(\mathbf{A}_{\nu\nu}))$ to 1, which in turn implies that $\mathbf{A}_{\nu\nu}$ is substochastic. By Proposition 2.3, $\rho(\mathbf{A}_{\nu\nu}) < 1$, and therefore, by Proposition 2.4, $\rho(\mathbf{A}) <$
1. Hence, $[\mathbf{I} - \mathbf{A}]$ is nonsingular, which completes the proof.

For the following main result, we will show, using Lemma 4.1, that state $j$ being UA is sufficient to derive a closed-form analytical expression for the $r$th first passage moments $\boldsymbol{\mu}^{(r)} = [\mu^{(r)}_{ij}]$, for $r \geq 1$ and $i \in \mathcal{S}$.

Theorem 4.2
Let $\{Z(t) : t \geq 0\}$ be a regular, time-homogeneous SMP with a finite state space $\mathcal{S}$. Further suppose that $j \in \mathcal{S}$ is UA. Then the $r$th moments of the first passage times from all states $i \in \mathcal{S}$ to state $j$, contained in the $m$-vector ($m = |\mathcal{S}|$) $\boldsymbol{\mu}^{(r)}_j = [\mu^{(r)}_{ij}]_{i=1}^m$, $r \geq 1$, are solutions to the system of equations given by
$$\boldsymbol{\mu}_j \equiv \boldsymbol{\mu}^{(1)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}, \qquad (7)$$
$$\boldsymbol{\mu}^{(r)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} \left[ \left(\mathbf{p} \circ \mathbf{e}^{(r)}\right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left(\mathbf{p} \circ \mathbf{e}^{(r-s)}\right) \left( (\mathbf{J} - \mathbf{I})_j \circ \boldsymbol{\mu}^{(s)}_j \right) \right], \quad r \geq 2, \qquad (8)$$
where $\mathbf{1}$ is a column vector of ones and the scalar entries $\mu^{(1)}_{ij}$ and $\mu^{(r)}_{ij}$ for $r \geq 2$ are defined as in (3) and (4), respectively.

Proof We first show, using induction on the moment order $r \geq$
1, that the system of equations (3) and (4) gives a valid relationship between the first-passage moments to a given state $j$ that is UA. For the mean time of first passage given by the system (3), we observe the following at the first transition epoch $S_1$ of the SMP:

1. $i \not\to k$ at $S_1$ $\Rightarrow$ the corresponding $k$th term drops out of the expression, and
2. $i \to k$ at $S_1$ $\Rightarrow$ $e_{ik}$ and $\mu_{kj}$ are well-defined, the latter because $j$ is UA.

We thus conclude that a first-step analysis founded upon the state of the SMP at the first transition epoch $S_1$ (cf. Proposition 5.15 of [24, pg. 104]) still holds for a terminal UA state $j$. Next, for the induction step, we consider expression (4) for the $(r+1)$th moment, where $r \geq$
1. We likewise claim that the original renewal argument given in Lemma 4.1 of [29] for the derivation of (4) for the $r$th moments of first passage remains valid. In order to see this, we rewrite, for $i \in \mathcal{S}$, expression (4) as
$$\mu^{(r+1)}_{ij} = \sum_{k=1}^{m} p_{ik} \left[ (1 - \delta_{kj})\,\mu^{(r+1)}_{kj} + e^{(r+1)}_{ik} \right] + M_r, \qquad (9)$$
where
$$M_r = \sum_{s=1}^{r} \binom{r+1}{s} \sum_{k \neq j} p_{ik}\, e^{(r+1-s)}_{ik}\, \mu^{(s)}_{kj}.$$
The inductive hypothesis and items 1) and 2) above guarantee that $M_r$ is well-defined, while the remainder of (9) is in exactly the same form as (3), which has just been shown to have a finite solution via the base step.

Thus, for arbitrary $i \neq j$, where $i \in \mathcal{S}$, we may transform (3) into the equivalent matrix expression
$$\boldsymbol{\mu} = [\mu_{ij}] = \mathbf{p}\left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right) + (\mathbf{p} \circ \mathbf{e})\,\mathbf{J}.$$
In this form we are not able to solve directly for $\boldsymbol{\mu}$, but, under the assumption that $j$ is a specific UA state in $\mathcal{S}$, it is possible to solve for the $j$th column of $\boldsymbol{\mu}$, which we denote by $\boldsymbol{\mu}_j$. We then obtain
$$\boldsymbol{\mu}_j = \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j + (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
Next, we isolate $(\mathbf{p} \circ \mathbf{e})\,\mathbf{1}$, so that
$$\boldsymbol{\mu}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j = (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
Observing that $\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu} \right]_j = \mathbf{I}^{(-j)}\boldsymbol{\mu}_j$ and factoring out $\boldsymbol{\mu}_j$ gives
$$[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]\,\boldsymbol{\mu}_j = (\mathbf{p} \circ \mathbf{e})\,\mathbf{1},$$
which allows us to finally solve for $\boldsymbol{\mu}_j$ as
$$\boldsymbol{\mu}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} (\mathbf{p} \circ \mathbf{e})\,\mathbf{1}.$$
By Lemma 4.1, the matrix $\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}$ is nonsingular. This proves that (7) is, indeed, well-defined.

A general formula for the $r$th moment, where $r \geq$
2, is given in Lemma 4.1 of [29] as
$$\mu^{(r)}_{ij} = \sum_{k=1}^{m} p_{ik}\, e^{(r)}_{ik} + \sum_{s=1}^{r} \binom{r}{s} \sum_{k \neq j} p_{ik}\, e^{(r-s)}_{ik}\, \mu^{(s)}_{kj},$$
which is expressed in matrix notation as
$$\boldsymbol{\mu}^{(r)} = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{J} + \sum_{s=1}^{r} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right) \right].$$
Solving for the $j$th column gives
$$\boldsymbol{\mu}^{(r)}_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
Using $\mathbf{e}^{(0)} = \mathbf{J}$ (the identity under the Hadamard product), we extract the $s = r$ term of the summation to obtain
$$\boldsymbol{\mu}^{(r)}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(r)} \right]_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
We further observe that $\boldsymbol{\mu}^{(r)}_j - \mathbf{p}\left[ (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(r)} \right]_j = (\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)})\,\boldsymbol{\mu}^{(r)}_j$, which gives
$$(\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)})\,\boldsymbol{\mu}^{(r)}_j = \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I}) \circ \boldsymbol{\mu}^{(s)} \right)_j \right].$$
We then solve for $\boldsymbol{\mu}^{(r)}_j$ to obtain
$$\boldsymbol{\mu}^{(r)}_j = [\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1} \left[ \left( \mathbf{p} \circ \mathbf{e}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left( \mathbf{p} \circ \mathbf{e}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I})_j \circ \boldsymbol{\mu}^{(s)}_j \right) \right].$$
As argued in the proof of formula (7), the inverse $[\mathbf{I} - \mathbf{p}\mathbf{I}^{(-j)}]^{-1}$ exists. Hence, (8) is likewise well-defined.

We next investigate some statistical aspects of using Theorem 4.2 to estimate the first passage moments to universally accessible states of an SMP.

5 Estimation of First Passage Moments

In this section we derive consistent estimators for first passage moments in SMPs. Since the SMP $\{Z(t) : t \geq 0\}$ is time-homogeneous, we assume without loss of generality that $Z(0) = i \in \mathcal{S}$. If we observe the SMP for a period of time $T >$
0, then, for any $j \in \mathcal{S}$, we may define the point estimators $\hat{p}_{ij}$ for the transition probability and $\hat{e}^{(r)}_{ij}$ for the $r$th moment of the sojourn time of the SMP as it transitions from $i$ to $j$ as
$$\hat{p}_{ij} \equiv \frac{n_{ij}}{\sum_{k \in \mathcal{S}} n_{ik}}, \qquad \hat{e}^{(r)}_{ij} \equiv \frac{1}{n_{ij}} \sum_{K=1}^{n_{ij}} x^r_{ijK}, \qquad i, j \in \mathcal{S},$$
where
$$n_{ij} \equiv \text{the number of observed transitions from state } i \text{ to state } j \text{ by time } T,$$
$$x_{ijK} \equiv \text{the } K\text{th observed sojourn time from state } i \text{ to state } j \text{ by time } T.$$
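As a concrete illustration, $\hat{p}_{ij}$ and $\hat{e}_{ij}$ can be computed directly from logged transition data. The observation set below is hypothetical and serves only to show the bookkeeping:

```python
import numpy as np

# Hypothetical observed sojourn times, keyed by (i, j) transition; these
# play the role of the x_ijK above.
obs = {(0, 1): [1.2, 0.8, 1.1],
       (0, 2): [2.5],
       (1, 2): [0.5, 0.7]}
m = 3

n = np.zeros((m, m))                  # n_ij: observed transition counts
for (i, j), times in obs.items():
    n[i, j] = len(times)

# p_hat_ij = n_ij / sum_k n_ik (rows never observed are left as zeros).
row_tot = n.sum(axis=1, keepdims=True)
p_hat = np.divide(n, row_tot, out=np.zeros_like(n), where=row_tot > 0)

# e_hat_ij: sample mean of the observed sojourn times (the r = 1 case).
e_hat = np.zeros((m, m))
for (i, j), times in obs.items():
    e_hat[i, j] = np.mean(times)

print(p_hat[0])    # estimates 0, 0.75, 0.25 out of state 0
```

Higher-order moment estimates $\hat{e}^{(r)}_{ij}$ follow the same pattern with `np.mean(np.array(times) ** r)`.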
We further assume that $T$ is large enough so that at least one transition from $i$ to $j$ has been observed; in other words, $n_{ij} \geq$
1. In order to make inferential hypotheses using these estimators, it is useful to first show that they are consistent. A point estimator $\hat{\theta}_n$ is said to be consistent if it converges in probability to the true population parameter $\theta$ as the sample size $n$ increases; that is, for each $\epsilon >$
0,
$$\lim_{n \to \infty} P\left( |\hat{\theta}_n - \theta| < \epsilon \right) = 1.$$
This condition is written in shorthand as $\hat{\theta}_n \overset{P}{\to} \theta$. We now show that this condition holds for the matrix estimators $\hat{\mathbf{p}} \equiv [\hat{p}_{ij}]$ and $\hat{\mathbf{e}} \equiv [\hat{e}_{ij}]$.

Lemma 5.1 The matrix estimators $\hat{\mathbf{p}}$ and $\hat{\mathbf{e}}$ are consistent.

Proof
Let $\{X_n\}$ be a sequence of Bernoulli random variables such that $X_n = 1$ when a transition from $i$ to $j$ occurs at the $n$th transition, and $X_n = 0$ otherwise. Accordingly, if $N > 0$ transitions are observed during $(0, T]$, then the estimated probability of transition from $i$ to $j$ becomes
$$\hat{p}_{ij} = \frac{1}{N} \sum_{n=1}^{N} X_n,$$
with the equivalences
$$N = \sum_{k \in \mathcal{S}} n_{ik}, \qquad n_{ij} = \sum_{n=1}^{N} X_n.$$
The Markov property at transitions of the embedded DTMC of the SMP implies that the $X_n$ are independent and identically distributed (i.i.d.) random variables. Hence, by the Weak Law of Large Numbers (see Theorem 5.5.2 of [6, p. 232]), we have $\hat{p}_{ij} \overset{P}{\to} p_{ij}$, which demonstrates consistency. Likewise, we see that $x_{ijK}$ is independent of $x_{ijK'}$ so long as $K \neq K'$. Thus, the collection $\{x_{ijK}\}_{K=1}^{n_{ij}}$ is i.i.d. By the same reasoning as above, we obtain the convergence in probability $\hat{e}_{ij} \overset{P}{\to} e_{ij}$. Hence, the $\hat{e}_{ij}$ are consistent.

We are now in a position to define the estimators of the $r$th moments of first passage from state $i$ to state $j \in \mathcal{S}$. By replacing $\mathbf{p}$ and $\mathbf{e}$ with the matrix estimators $\hat{\mathbf{p}}$ and $\hat{\mathbf{e}}$, respectively, in formulas (7) and (8), we obtain the natural estimators
$$\hat{\boldsymbol{\mu}}_j \equiv [\mathbf{I} - \hat{\mathbf{p}}\mathbf{I}^{(-j)}]^{-1} (\hat{\mathbf{p}} \circ \hat{\mathbf{e}})\,\mathbf{1}, \qquad (10)$$
$$\hat{\boldsymbol{\mu}}^{(r)}_j \equiv [\mathbf{I} - \hat{\mathbf{p}}\mathbf{I}^{(-j)}]^{-1} \left[ \left( \hat{\mathbf{p}} \circ \hat{\mathbf{e}}^{(r)} \right)\mathbf{1} + \sum_{s=1}^{r-1} \binom{r}{s} \left( \hat{\mathbf{p}} \circ \hat{\mathbf{e}}^{(r-s)} \right) \left( (\mathbf{J} - \mathbf{I})_j \circ \hat{\boldsymbol{\mu}}^{(s)}_j \right) \right], \quad r \geq 2. \qquad (11)$$
As expected, estimators (10) and (11) are also consistent.

Lemma 5.2 For a state $j \in \mathcal{S}$ that is UA with respect to the digraph $G(\hat{\mathbf{p}})$, the estimators $\hat{\boldsymbol{\mu}}^{(r)}_j$, $r \geq 1$, are consistent.

Proof
By Theorem 2.1.4 in [15, pg. 51], a continuous function of consistent estimators is itself a consistent estimator. By Lemma 4.1, and the assumption that $j$ is UA with respect to $G(\hat{p})$ (i.e., every state in $S$ is linked to the state $j$ in the digraph of $\hat{p}$), we may assert the existence of $[I - \hat{p}\,I(-j)]^{-1}$. The remainder of the terms in (10) and (11) are linear, and hence continuous. We thus conclude that the $\hat{\mu}_j^{(r)}$, for moments $r \geq 1$ and states $j \in S$, are consistent estimators.

In this section we have proposed consistent estimators for the first passage moments of an SMP. These estimates can be obtained if sufficient data are collected from observing the process.

We now give an example of an SMP and show how the first passage moments can be estimated. The process depicted in Figure 1 has three transition distributions and a probability $p$. We will calculate the first passage moments using the direct transition moments $e$.

[Figure 1: An example SMP of a medical patient, with states Good Health, Diseased, and Death. Good Health transitions to Diseased with probability 1; Diseased returns to Good Health with probability $p$ or moves to Death with probability $1 - p$; each arc carries its own holding-time distribution $F(x)$.]

To begin, we have
$$p = \begin{pmatrix} 0 & 1 & 0 \\ p & 0 & 1-p \\ 0 & 0 & 1 \end{pmatrix} \qquad \text{and} \qquad e = \begin{pmatrix} 0 & e_{12} & 0 \\ e_{21} & 0 & e_{23} \\ 0 & 0 & e_{33} \end{pmatrix}.$$
Taking $j = 3$ (Death) as the target state,
$$\mu_3 = \begin{pmatrix} 1 & -1 & 0 \\ -p & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} e_{12} \\ p\,e_{21} + (1-p)\,e_{23} \\ e_{33} \end{pmatrix} = \frac{1}{1-p} \begin{pmatrix} 1 & 1 & 0 \\ p & 1 & 0 \\ 0 & 0 & 1-p \end{pmatrix} \begin{pmatrix} e_{12} \\ p\,e_{21} + (1-p)\,e_{23} \\ e_{33} \end{pmatrix} = \frac{1}{1-p} \begin{pmatrix} e_{12} + p\,e_{21} + (1-p)\,e_{23} \\ p\,e_{12} + p\,e_{21} + (1-p)\,e_{23} \\ (1-p)\,e_{33} \end{pmatrix}. \tag{12}$$
Looking closely at the values in Equation (12), we see that they are logical: as $p$ gets small, $\mu_{13} \to e_{12} + e_{23}$ and $\mu_{23} \to e_{23}$. This simple example demonstrates the theory discussed earlier; even for large systems, finding the first passage moments is constrained only by the computational burden of computing the inverse of $I - p\,I(-j)$. If numerical values are substituted, then numerical computer programs can handle these types of problems with relative ease.

Now suppose we have
$$p = \begin{pmatrix} 0 & 1 & 0 \\ 0.8 & 0 & 0.2 \\ 0 & 0 & 1 \end{pmatrix} \qquad \text{and} \qquad e = \begin{pmatrix} 0 & 6 & 0 \\ 0.7 & 0 & 1.1 \\ 0 & 0 & 0 \end{pmatrix}.$$
We get the following result:
$$\mu_3 = \begin{pmatrix} 33.9 \\ 27.9 \\ 0 \end{pmatrix}.$$
The R code for this example is included in the appendix. The methods presented in this paper provide a fairly comprehensive approach to determining the first passage moments of an SMP.
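As a cross-check of this example (and of the R code in the appendix), the same computation can be sketched in Python with NumPy; this is an illustration in a different language from the paper's R implementation, with state 3 (Death) given zero-based index 2 and `np.linalg.solve` used in place of an explicit inverse:

```python
import numpy as np

# Embedded-chain transition matrix p and first-moment matrix e from the
# example; the target UA state is state 3 (zero-based index j = 2).
p = np.array([[0.0, 1.0, 0.0],
              [0.8, 0.0, 0.2],
              [0.0, 0.0, 1.0]])
e = np.array([[0.0, 6.0, 0.0],
              [0.7, 0.0, 1.1],
              [0.0, 0.0, 0.0]])
j = 2

n = p.shape[0]
I = np.eye(n)
Id = I.copy()
Id[:, j] = 0.0                       # I(-j): identity with column j zeroed

# Equation (10): mu_j = [I - p I(-j)]^{-1} (p o e) J_j, where "o" is the
# elementwise (Hadamard) product and J_j is a column vector of ones.
mu = np.linalg.solve(I - p @ Id, (p * e) @ np.ones(n))
print(mu)                            # approximately (33.9, 27.9, 0)
```

The zero in the last component reflects the convention $e_{33} = 0$ for the absorbing Death state in the numerical example.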
In this paper we devised an exact time-domain approach to derive the moments $\mu_{ij}$ of first passage time distributions in irreducible or reducible SMPs, given that the terminal state $j$ fulfills the condition of universal accessibility. Beyond the expanded generality of this method, it also has the advantage of obtaining the first passage moments to a single UA state $j$, rather than to all states, thereby reducing the computational load, particularly for large SMPs. We have also demonstrated the existence of consistent point estimators for the first passage moments of processes that may be modeled as SMPs.

Acknowledgments

The authors thank Brian McBee and Dustin Mixon, who have provided suggestions and advice in this work.
A R-code
First_Passage_Moments <- function(j, p, E) {
  # j: target UA state; p: transition matrix of the embedded chain;
  # E: list of holding-time moment matrices, E[[m]] = e^(m).
  r <- length(E)                      # highest moment requested
  n <- nrow(p)
  I <- diag(n)                        # identity matrix
  J <- matrix(1, n, n)                # matrix of ones
  Id <- I
  Id[, j] <- 0                        # I(-j): identity with column j zeroed
  inv_matrix <- solve(I - p %*% Id)   # [I - p I(-j)]^{-1}
  result <- vector("list", r + 1)
  result[[1]] <- matrix(1, n, 1)      # order-0 term is a column of ones
  for (i in 1:r) {
    result[[i + 1]] <- 0
    for (k in 0:(i - 1)) {
      result[[i + 1]] <- result[[i + 1]] +
        (choose(i, k) * p * E[[i - k]]) %*% (((J - I)[, j])^k * result[[k + 1]])
    }
    result[[i + 1]] <- inv_matrix %*% result[[i + 1]]
  }
  return(result[2:(r + 1)])
}

p <- matrix(c(0, 1, 0, .8, 0, .2, 0, 0, 1), nrow = 3, byrow = TRUE)
E <- matrix(c(0, 6, 0, 0.7, 0, 1.1, 0, 0, 0), nrow = 3, byrow = TRUE)
L <- list(E)
First_Passage_Moments(3, p, L)        # first moments: 33.9, 27.9, 0

References

[1] V. Barbu, M. Boussemart, and N. Limnios,
Discrete-time semi-Markov model for reliability and survival analysis, Commun. Stat. Theory Methods (2004), no. 11, 2833–2868.

[2] Vlad Stefan Barbu and Nikolaos Limnios, Semi-Markov chains and hidden semi-Markov models toward applications: Their use in reliability and DNA analysis, Springer, New York, 2008.

[3] R.E. Barlow and F. Proschan, Mathematical theory of reliability, John Wiley & Sons, London and New York, 1965.

[4] Shelby Brumelle and Darius Walczak, Dynamic airline revenue management with multiple semi-Markov demand, Operations Research (2003), no. 1, 137–148.

[5] Ronald W. Butler and Aparna V. Huzurbazar, Stochastic network models for survival analysis, J. Amer. Statist. Assoc. (1997), 246–257. MR 1436113 (98e:62140)

[6] George Casella and Roger L. Berger, Statistical inference, second ed., Duxbury, Pacific Grove, CA, 2002.

[7] Miroslav Fiedler, Special matrices and their applications in numerical mathematics, 2nd ed., Dover Publications, New York, 2008.

[8] J.J. Hunter, On the moments of Markov renewal processes, Advances in Applied Probability (1969), 188–210.

[9] Jacques Janssen and Raimondo Manca, Semi-Markov risk models for finance, insurance, and reliability, Springer, New York, 2007.

[10] J.G. Kemény and J.L. Snell, Finite Markov chains, University Series in Undergraduate Mathematics, Van Nostrand, 1960.

[11] Jeffrey P. Kharoufeh, Christopher J. Solo, and M.Y. Ulukus, Semi-Markov models for degradation-based reliability, IIE Transactions (2010), no. 8, 599–612.

[12] David Kincaid and Ward Cheney, Numerical analysis: mathematics of scientific computing, 3rd ed., vol. 2, American Mathematical Society, Providence, 2002.

[13] Leonard Kleinrock, Queueing systems, volume I: Theory, John Wiley & Sons, New York, 1975.

[14] V.V. Kolpakov, Matrix seminorms and related inequalities, Journal of Mathematical Sciences (1983), no. 1, 2094–2106.

[15] E.L. Lehmann, Elements of large-sample theory, Springer, New York, 1999.

[16] Steven R. Lerman, The use of disaggregate choice models in semi-Markov process models of trip chaining behavior, Transportation Science (1979), no. 4, 273–291.

[17] P. Levy, Processus semi-Markoviens, Proc. Intern. Congr. Math. (1954), 416–426, Amsterdam, The Netherlands.

[18] ——, Systèmes semi-Markoviens ayant au plus une infinité dénombrable d'états possibles, Proc. Intern. Congr. Math. (1954), 294–295, Amsterdam, The Netherlands.

[19] N. Limnios and G. Oprişan, Semi-Markov processes and reliability, Birkhäuser, Boston, 2001.

[20] Carl D. Meyer, Matrix analysis and applied linear algebra, Society for Industrial and Applied Mathematics, Philadelphia, 2000.

[21] M.F. Neuts, Structured stochastic matrices of M/G/1 type and their applications, Probability: Pure and Applied, Marcel Dekker, Inc., New York and Basel, 1989.

[22] Ronald Pyke, Markov renewal processes: Definitions and preliminary properties, The Annals of Mathematical Statistics (1961), no. 4, 1231–1242.

[23] ——, Markov renewal processes with finitely many states, The Annals of Mathematical Statistics (1961), no. 4, 1243–1259.

[24] Sheldon M. Ross, Applied probability models with optimization applications, Dover, New York, 1992.

[25] Jia-Yu Shao, Products of irreducible matrices, Linear Algebra and Its Applications (1985), 131–143.

[26] W.L. Smith, Regenerative stochastic processes, Proc. Roy. Soc. (GB), Series A (1955), 6–31.

[27] David D. Yao, First-passage-time moments of Markov processes, Journal of Applied Probability (1985), no. 4, 939–945.

[28] H.L.S. Younes and R.G. Simmons, Solving generalized semi-Markov decision processes using continuous phase-type distributions, Proceedings of the National Conference on Artificial Intelligence, 2004, pp. 742–747.

[29] Xuan Zhang and Zhenting Hou, The first-passage times of phase semi-Markov processes, Statistics & Probability Letters 82.