Lyapunov exponents for branching processes in a random environment: The effect of information
Sophie Hautphenne ∗ Guy Latouche † October 8, 2018
Abstract
We consider multitype Markovian branching processes evolving in a Markovian random environment. To determine whether or not the branching process becomes extinct almost surely is akin to computing the maximal Lyapunov exponent of a sequence of random matrices, which is a notoriously difficult problem. We define dual processes and we construct bounds for the Lyapunov exponent. The bounds are obtained by adding or by removing information: adding information results in a lower bound, removing information results in an upper bound, and we show that adding more information gives smaller lower bounds. We give a few illustrative examples and we observe that the upper bound is generally more accurate than the lower bound.
We consider an irreducible multitype Markovian branching process with $r$ types of individuals and we assume that its parameters vary over time according to a Markovian random environment $\{X(t) : t \in \mathbb{R}^+\}$. This is an irreducible continuous-time Markov chain on the finite state space $E = \{1, 2, \ldots, m\}$.

∗ The University of Melbourne, Department of Mathematics and Statistics, Victoria 3010, Australia; [email protected]
† Université Libre de Bruxelles, Département d'Informatique, CP 212, Boulevard du Triomphe, 1050 Bruxelles, Belgium; [email protected]

In the absence of a random environment, the parameters of the branching process stay constant over time, and an individual of type $i$ ($1 \le i \le r$) lives for an exponentially distributed amount of time, after which it generates a random number of children of each type $j$, $1 \le j \le r$. It is well known (Athreya and Ney [2]) that the mean population size matrix $M(t)$, for which $M_{ij}(t)$ is the conditional expected number of individuals of type $j$ alive at time $t$, given that the population starts at time 0 with a single individual of type $i$, is given by $M(t) = e^{\Omega t}$ for some matrix $\Omega$ with nonnegative off-diagonal elements. Furthermore, extinction of the branching process occurs with probability one if and only if $\lambda \le 0$, where $\lambda$ is the eigenvalue of maximal real part of $\Omega$.

Here, we associate a matrix $\Omega_\ell$ to each environmental state $\ell$, $\ell = 1, \ldots, m$, and the process $\{X(t)\}$ controls the parameters of the multitype branching process in such a way that the mean population size matrix is $M_\ell(t) := e^{\Omega_\ell t}$ during intervals of time over which $X(\cdot) = \ell$.

The following theorem gives a necessary and sufficient condition for the almost sure extinction of the branching process in a Markovian random environment.
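The mean-matrix characterisation above can be made concrete numerically. The sketch below (all numbers invented) evaluates $M(t) = e^{\Omega t}$ for a $2 \times 2$ matrix with nonnegative off-diagonal entries, using Putzer's formula, which is valid for $2 \times 2$ matrices with distinct real eigenvalues; the sign of the dominant eigenvalue $\lambda$ decides extinction.

```python
import math

def expm2(A, t):
    """e^{A t} via Putzer's formula, for a 2x2 matrix with distinct real eigenvalues."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    r1, r2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    e1 = math.exp(r1 * t)
    c = (e1 - math.exp(r2 * t)) / (r1 - r2)
    return [[e1 + c * (A[0][0] - r1), c * A[0][1]],
            [c * A[1][0], e1 + c * (A[1][1] - r1)]]

# Invented example: Omega has nonnegative off-diagonal entries, as required.
Omega = [[-2.0, 1.0],
         [1.0, -2.0]]
tr = Omega[0][0] + Omega[1][1]
det = Omega[0][0] * Omega[1][1] - Omega[0][1] * Omega[1][0]
lam = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0   # dominant eigenvalue
M1 = expm2(Omega, 1.0)                              # mean population sizes at t = 1
print(lam, M1)   # lam = -1 < 0: extinction is certain for this Omega
```

For this symmetric example the row sums of $M(t)$ equal $e^{\lambda t}$ exactly, so the expected population size decays at rate $\lambda$.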
The notation is as follows: $\hat\Omega_k = \Omega_{\hat X_k}$, where $\{\hat X_k : k \in \mathbb{N}\}$ is the jump chain associated with the random environment $\{X(t)\}$, and $\{\xi_k : k \in \mathbb{N}\}$ is the sequence of sojourn times in the successive environmental states.

Theorem 1.1 (Tanny [12, Theorem 9.10]).
There exists a constant $\omega$ such that
$$\omega = \lim_{n\to\infty} \frac{1}{n} \log \big\{ e^{\hat\Omega_0 \xi_0} e^{\hat\Omega_1 \xi_1} \cdots e^{\hat\Omega_{n-1} \xi_{n-1}} \big\}_{ij} \qquad (1)$$
almost surely, independently of $i$ and $j$. Extinction is almost sure if and only if $\omega \le 0$.

The $(i,j)$th entry of the random matrix product in (1) is the conditional expected number of individuals of type $j$ alive just before the $n$th environmental state change, given that the population starts at time 0 with a single individual of type $i$ and given the history of the environmental process. The limit $\omega$ may be interpreted as the asymptotic growth rate of the population, and it takes different forms, since one also has
$$\omega = \lim_{n\to\infty} \frac{1}{n} \,\mathbb{E} \log \big\{ e^{\hat\Omega_0 \xi_0} e^{\hat\Omega_1 \xi_1} \cdots e^{\hat\Omega_{n-1} \xi_{n-1}} \big\}_{ij} \quad \text{for all } i, j, \qquad (2)$$
$$\phantom{\omega} = \lim_{n\to\infty} \frac{1}{n} \log \big\| e^{\hat\Omega_0 \xi_0} \cdots e^{\hat\Omega_{n-1} \xi_{n-1}} \big\| = \lim_{n\to\infty} \frac{1}{n} \,\mathbb{E} \log \big\| e^{\hat\Omega_0 \xi_0} \cdots e^{\hat\Omega_{n-1} \xi_{n-1}} \big\|, \qquad (3)$$
independently of the matrix norm, as shown in Athreya and Karlin [1] and in Kingman [9].

The parameter $\omega$ may be likened to a Lyapunov exponent: given a set of real matrices $\mathcal{A} = \{A_k : k \ge 1\}$ and a probability distribution $P$ on $\mathcal{A}$, the maximal Lyapunov exponent $\rho$ for $\mathcal{A}$ and $P$ is defined to be
$$\rho(\mathcal{A}, P) = \lim_{n\to\infty} (1/n)\, \mathbb{E} \log \|A_1 A_2 \cdots A_n\|,$$
where the $A_k$, $k \ge 1$, are independent and identically distributed random matrices on $\mathcal{A}$ with the distribution $P$; the limit exists and does not depend on the choice of matrix norm (Furstenberg and Kesten [4], Oseledec [11]). In our case, $\mathcal{A} = \{e^{\Omega x} : \Omega \in \{\Omega_1, \ldots, \Omega_m\}, x \ge 0\}$, the probability distribution $P$ is induced by the random environment, and the matrices $A_k$ are neither independent nor identically distributed.

Lyapunov exponents are hard to compute (Kingman [9], Tsitsiklis and Blondel [13]), except under very special circumstances, such as in Key [7], where the matrices in the family are assumed to be simultaneously diagonalizable, or in Lima and Rahibe [10], where $\mathcal{A}$ contains only two matrices of order 2, one of which is singular.
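In the single-type case ($r = 1$) the random matrix product in (1) collapses to the scalar $\exp(\sum_k \lambda_{\hat X_k} \xi_k)$, so $\omega$ can be estimated by direct Monte Carlo and compared with the closed form $\hat\pi C^{-1}\lambda$ that the paper establishes in Section 6. A minimal sketch, with invented rates and a two-state environment:

```python
import math
import random

def estimate_omega(lam, c, P, n, rng):
    """Estimate omega = lim (1/n) log prod_k e^{lam[X_k] xi_k}
    for a single-type (r = 1) process in a Markovian environment."""
    x = 0
    acc = 0.0
    for _ in range(n):
        xi = rng.expovariate(c[x])      # sojourn time in environmental state x
        acc += lam[x] * xi              # log of the scalar factor e^{lam * xi}
        u, cum = rng.random(), 0.0      # draw the next state of the jump chain
        for y, p in enumerate(P[x]):
            cum += p
            if u < cum:
                x = y
                break
    return acc / n

rng = random.Random(1)
lam = [-1.0, 0.5]                  # growth rate in each environmental state (invented)
c = [2.0, 3.0]                     # sojourn-rate parameters c_1, c_2 (invented)
P = [[0.0, 1.0], [1.0, 0.0]]       # jump chain alternates, so pi_hat = (1/2, 1/2)

est = estimate_omega(lam, c, P, 200_000, rng)
exact = 0.5 * lam[0] / c[0] + 0.5 * lam[1] / c[1]   # pi_hat C^{-1} lambda
print(est, exact)
```

By the Strong Law of Large Numbers the running average converges to the exact value, here $-1/6$, so this environment is subcritical despite the supercritical second state.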
For a thorough survey on the basics of Lyapunov exponents, we refer to Watkins [14]. If $\mathcal{A}$ is a finite set of nonnegative matrices, Gharavi and Anantharam [5] give an upper bound for $\rho(\mathcal{A}, P)$ in the form of the maximum of a nonlinear concave function. Key [8] gives both upper and lower bounds, determined as follows on the basis of (2, 3). Define $\sigma_n = (1/n)\,\mathbb{E}\log\|A_1 \cdots A_n\|$ and $\sigma^*_n = (1/n)\,\mathbb{E}\log\{A_1 \cdots A_n\}_{jj}$ for some arbitrarily fixed $j$. One verifies that $\{\sigma_k\}$ is non-increasing and that $\{\sigma^*_k\}$ is non-decreasing, so that these form sequences of upper and lower bounds for $\omega$. They are, however, not much easier to compute than the Lyapunov exponent itself.

In the absence of an easily computed exact expression for $\omega$, we look for upper and lower bounds, in an adaptation of the approach developed in [6] for branching processes subject to binomial catastrophes at random times. Our bounds are obtained in three steps. First, we replace the branching process by a dual, marked Markovian process. This has the advantage that we deal with the trajectories of a simple continuous-time Markov chain, instead of the conditional expected number of individuals of each type in the original branching process. Next, we take the expectation with respect to the $\hat X_k$s and the $\xi_k$s, and obtain an upper bound for $\omega$. In the third step, we add some information about the history of the dual process and so obtain a lower bound. In some cases, we have some leeway in the amount of information that we need to add to obtain a lower bound, and we show that the bound is smaller when we add more information. Thus, roughly speaking, we may say that the conditional expectation in (1) is taken with respect to a finely balanced amount of information; less information leads to an upper bound, while more information yields a lower bound.

Our approach may be used for the analysis of other systems, beyond branching processes.
For instance, Bolthausen and Goldsheid analyse in [3] a random walk on a strip subject to a random environment, and their parameter $\lambda$ in [3, Equation 6] is a Lyapunov exponent. Such expressions typically arise in the analysis of systems in random environments and should be amenable to an approach similar to ours.

The paper is organised as follows. The dual process and the marks are defined in Section 2, where we determine an upper and a lower bound for $\omega$. We discuss in Section 3 an alternative definition of the dual Markov chain: this second dual yields the same upper bound but a different lower bound. We give two illustrative examples in Section 4, showing that the ordering of the lower bounds depends on the case. We show in Section 5 how the lower bound may be improved in the special circumstance where the environmental process $\{X(t)\}$ is cyclic. Finally, we prove in Section 6 that our bounds are tight.

We denote by $Q$ the generator of the environmental process $\{X(t)\}$ and by $c_\ell = |Q_{\ell\ell}|$, for $1 \le \ell \le m$, the parameters of the exponential distributions of sojourn times in the different environmental states. We assume that the matrices $\Omega_\ell$, $1 \le \ell \le m$, are all irreducible. Then, each matrix has an eigenvalue $\lambda_\ell$ of maximal real part, which is the asymptotic growth rate of the population under the conditions prevailing while the environmental process is in state $\ell$. The matrices $\Omega^*_\ell$ defined by $\Omega^*_\ell = \Omega_\ell - \lambda_\ell I$ each have one eigenvalue equal to zero, the other eigenvalues having a strictly negative real part. Furthermore, the corresponding left- and right-eigenvectors $u_\ell$ and $v_\ell$ are strictly positive and may be normalised by $u_\ell \mathbf{1} = 1$ and $u_\ell v_\ell = 1$.

We proceed in two steps. Firstly, we write (1) as
$$\omega = \lim_{n\to\infty} \frac{1}{n} \Big[ \lambda_1 \sum_{k=1}^{n_1} \xi^{(1)}_k + \cdots + \lambda_m \sum_{k=1}^{n_m} \xi^{(m)}_k \Big] + \lim_{n\to\infty} \frac{1}{n} \log \big\{ e^{\hat\Omega^*_0 \xi_0} e^{\hat\Omega^*_1 \xi_1} \cdots e^{\hat\Omega^*_{n-1} \xi_{n-1}} \big\}_{ij} \quad \text{a.s.,} \qquad (4)$$
where $n_\ell = \sum_{k=0}^{n-1} \mathbb{1}\{\hat X_k = \ell\}$ is the number of times the environment visits state $\ell$ during the first $n$ environmental transitions, $\hat\Omega^*_k = \Omega^*_{\hat X_k}$, and $\xi^{(\ell)}_k$ is the length of the $k$th sojourn interval in state $\ell$.

The random variables $\{\xi^{(\ell)}_k, k \ge 1\}$ are independent and exponentially distributed with parameter $c_\ell$, so that the sum $\sum_{k=1}^{n_\ell} \xi^{(\ell)}_k / n_\ell$ converges almost surely to $\mathbb{E}[\xi^{(\ell)}] = 1/c_\ell$ by the Strong Law of Large Numbers. For the same reason, $n_\ell/n \to \hat\pi_\ell$ with probability one as $n \to \infty$, for all $\ell$, where $\hat\pi$ is the stationary distribution of the jump chain $\{\hat X_k\}$. Therefore, we may write (4) as
$$\omega = (\lambda_1 \hat\pi_1/c_1 + \cdots + \lambda_m \hat\pi_m/c_m) + \Psi, \qquad (5)$$
where $\Psi$ is a constant such that
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \big\{ e^{\hat\Omega^*_0 \xi_0} e^{\hat\Omega^*_1 \xi_1} \cdots e^{\hat\Omega^*_{n-1} \xi_{n-1}} \big\}_{ij} \quad \text{a.s.} \qquad (6)$$

In the second step, we define
$$\Delta_\ell = \mathrm{diag}(v_\ell) \quad \text{and} \quad \Theta_\ell = \Delta_\ell^{-1} \Omega^*_\ell \Delta_\ell, \quad \text{for } 1 \le \ell \le m. \qquad (7)$$
It is easy to verify for each $\ell$ that $\Theta_\ell$ is a generator: it has nonnegative off-diagonal elements and the row sums are equal to zero, so that the diagonal elements are strictly negative. We have $e^{\Omega^*_\ell t} = \Delta_\ell e^{\Theta_\ell t} \Delta_\ell^{-1}$, so that
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \big\{ \hat\Delta_0 e^{\hat\Theta_0 \xi_0} \hat\Delta_0^{-1} \, \hat\Delta_1 e^{\hat\Theta_1 \xi_1} \hat\Delta_1^{-1} \cdots \hat\Delta_{n-1} e^{\hat\Theta_{n-1} \xi_{n-1}} \hat\Delta_{n-1}^{-1} \big\}_{ij} \qquad (8)$$
almost surely, where $\hat\Delta_k = \Delta_{\hat X_k}$ and $\hat\Theta_k = \Theta_{\hat X_k}$.

While the probabilistic interpretation of the matrix product in (6) is not obvious, we can easily give one to the equivalent random matrix product in (8): we replace the whole branching process by a two-dimensional Markov chain $\{(X(t), \varphi(t)) : t \ge 0\}$ on the state space $\{1, \ldots, m\} \times \{1, \ldots, r\}$, with generator
$$Q_\Theta = Q \otimes I_r + \mathrm{diag}(\Theta_1, \Theta_2, \ldots, \Theta_m),$$
where $I_r$ is the identity matrix of size $r$ (we indicate the size of $I$ when it is not clear from the context). Like before, $\{X(t)\}$ is a Markov chain with generator $Q$, and the dual process $\{\varphi(t)\}$ evolves according to the generator $\Theta_\ell$ as long as $X(\cdot)$ remains equal to $\ell$. We define as follows the epochs $\{\tau_k : k = 0, 1, \ldots\}$ of transition for the component $X(\cdot)$ of the process: $\tau_0 = 0$, $\tau_{k+1} = \tau_k + \xi_k$, $k \ge 0$, and we define the process $\{\varphi_k, k \in \mathbb{N}\}$ with $\varphi_k = \varphi(\tau_k)$, embedded at the jump epochs of $\{X(t)\}$. Finally, we associate a sequence $\{Z_k : k = 0, 1, 2, \ldots\}$ of marks to the intervals $[\tau_k, \tau_{k+1})$, with
$$Z_k = (\hat\Delta_k)_{\varphi_k} (\hat\Delta_k^{-1})_{\varphi_{k+1}} = \frac{(v_{\hat X_k})_{\varphi_k}}{(v_{\hat X_k})_{\varphi_{k+1}}}, \qquad (9)$$
for $k \ge 0$.

Lemma 2.1.
The parameter $\omega$ may be written as
$$\omega = \hat\pi C^{-1} \lambda + \Psi, \qquad (10)$$
where $\hat\pi$ is the stationary probability vector of the environmental jump chain $\{\hat X_k\}$, $C = \mathrm{diag}(c_1, \ldots, c_m)$, $\lambda = (\lambda_1, \ldots, \lambda_m)^\top$, and
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \mathbb{E}[Z_0 Z_1 \cdots Z_{n-1} \mathbb{1}\{\varphi_n = j\} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}] \quad \text{a.s.,} \qquad (11)$$
with $\xi^{(n)} = (\xi_0, \ldots, \xi_{n-1})$ and $\hat X^{(n)} = (\hat X_0, \ldots, \hat X_{n-1})$.

Proof. Equation (10) is merely (5) written in a more compact manner. Furthermore, one has
$$\mathbb{E}[Z_k \mathbb{1}\{\varphi_{k+1} = j\} \,|\, \varphi_k = i, \xi_k, \hat X_k] = (\hat\Delta_k e^{\hat\Theta_k \xi_k} \hat\Delta_k^{-1})_{ij}, \quad k \ge 0,$$
and a simple calculation shows that (8) and (11) are equivalent.

We recognise in the first term of (10) the expected long-term growth rate of the population, while the second term reflects the fact that changes in the environment influence all individuals simultaneously. The advantage of (11) over (8) is that we now deal with a product of scalar random variables instead of a product of random matrices. In the sequel, we shall occasionally use the notation
$$Z_{n,j} = Z_0 Z_1 \cdots Z_{n-1} \mathbb{1}\{\varphi_n = j\} \qquad (12)$$
in order to simplify the presentation of a few arguments.

Now, we proceed as in [6]: we condition on less information to find an upper bound, and we condition on more information to find a lower bound. To simplify the notation, we define the matrices
$$M_\ell = c_\ell \Delta_\ell (c_\ell I - \Theta_\ell)^{-1} \Delta_\ell^{-1} = c_\ell [(c_\ell + \lambda_\ell) I - \Omega_\ell]^{-1}, \qquad (13)$$
for $\ell = 1, 2, \ldots, m$, the block-diagonal matrix
$$M = \mathrm{diag}(M_1, M_2, \ldots, M_m) \qquad (14)$$
of order $rm$, and the transition matrix $\hat P = I + C^{-1} Q$ of the jump chain $\{\hat X_k\}$.

Theorem 2.2.
An upper bound for $\omega$ is given by $\omega_U$, with
$$\omega_U = \hat\pi C^{-1} \lambda + \log \mathrm{sp}[M(\hat P \otimes I_r)].$$

Proof.
From (11, 12), we have $e^\Psi = \lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}]^{1/n}$ a.s. Taking on both sides expectations with respect to $\xi^{(n)}$ and $\hat X^{(n)}$, we obtain
$$e^\Psi = \mathbb{E}\big[ \lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}]^{1/n} \big] \le \lim_{n\to\infty} \mathbb{E}\big[ \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}]^{1/n} \big] \le \lim_{n\to\infty} \mathbb{E}\big[ \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}] \big]^{1/n} = \lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i]^{1/n},$$
where the first inequality follows from Fatou's lemma and the second from Jensen's inequality applied to the concave function $x \mapsto x^{1/n}$. Conditioning on $\hat X_0$ and $\xi_0$, we have
$$\mathbb{E}[Z_0 \mathbb{1}\{\varphi_1 = j\} \,|\, \varphi_0 = i] = \sum_{1 \le \ell \le m} \alpha_\ell \int_0^\infty \big[ \Delta_\ell e^{\Theta_\ell t} \Delta_\ell^{-1} \big]_{ij} c_\ell e^{-c_\ell t} \, dt = \sum_{1 \le \ell \le m} \alpha_\ell \big[ c_\ell \Delta_\ell (c_\ell I - \Theta_\ell)^{-1} \Delta_\ell^{-1} \big]_{ij} = [(\alpha \otimes I_r) M (\mathbf{1} \otimes I_r)]_{ij},$$
where $\alpha$ is the initial probability vector of $\{X(t)\}$ and the vector $\mathbf{1}$ is of size $m$. By induction, one shows that
$$\mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i] = \big[ (\alpha \otimes I_r) [M(\hat P \otimes I_r)]^n (\mathbf{1} \otimes I_r) \big]_{ij},$$
so that
$$e^\Psi \le \lim_{n\to\infty} \big[ (\alpha \otimes I_r) [M(\hat P \otimes I_r)]^n (\mathbf{1} \otimes I_r) \big]_{ij}^{1/n} = \mathrm{sp}[M(\hat P \otimes I_r)],$$
independently of $\alpha$, $i$, and $j$.

To obtain a lower bound, we reverse the argument of Theorem 2.2: we start from a conditional expectation given more information. To make use of the definition (9) of the marks, we need to consider two discrete-time Markov chains $\{Y^{(1)}_k\}$ and $\{Y^{(2)}_k\}$ embedded at the epochs $\{\tau_k\}$, with $Y^{(1)}_k = (\hat X_k, \varphi_k)$ and $Y^{(2)}_k = (\hat X_k, \varphi_{k+1})$. For $\{Y^{(1)}_k\}$, a transition from $(\ell, i)$ to $(\ell', j)$ has probability
$$P^{(1)}_{(\ell,i),(\ell',j)} = \int_0^\infty \big( e^{\Theta_\ell t} \big)_{ij} c_\ell e^{-c_\ell t} \, dt \; \hat P_{\ell\ell'} = (N_\ell)_{ij} \hat P_{\ell\ell'}, \quad \text{where} \quad N_\ell = c_\ell (c_\ell I - \Theta_\ell)^{-1} = \Delta_\ell^{-1} M_\ell \Delta_\ell. \qquad (15)$$
A similar calculation shows that $P^{(2)}_{(\ell,i),(\ell',j)} = \hat P_{\ell\ell'} (N_{\ell'})_{ij}$, and we write
$$P^{(1)} = N (\hat P \otimes I_r), \qquad P^{(2)} = (\hat P \otimes I_r) N, \qquad N = \mathrm{diag}(N_1, N_2, \ldots, N_m).$$

Theorem 2.3.
A lower bound for $\omega$ is given by $\omega_L$, with
$$\omega_L = \hat\pi C^{-1} \lambda + \pi^{(1)} (I - N) \log \bar v, \qquad (16)$$
where $\pi^{(1)}$ is the stationary probability vector of $P^{(1)}$, and $\bar v$ is the column vector obtained by stacking $v_1, v_2, \ldots, v_m$.

Proof.
We start from the conditional expectation of the product of marks, given $\varphi^{(n)} = (\varphi_1, \ldots, \varphi_{n-1})$ in addition to $\xi^{(n)}$ and $\hat X^{(n)}$. We readily find from (9) that
$$\mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \varphi^{(n)}, \xi^{(n)}, \hat X^{(n)}] = \frac{(v_{\hat X_0})_i}{(v_{\hat X_0})_{\varphi_1}} \frac{(v_{\hat X_1})_{\varphi_1}}{(v_{\hat X_1})_{\varphi_2}} \cdots \frac{(v_{\hat X_{n-2}})_{\varphi_{n-2}}}{(v_{\hat X_{n-2}})_{\varphi_{n-1}}} \frac{(v_{\hat X_{n-1}})_{\varphi_{n-1}}}{(v_{\hat X_{n-1}})_j} \big( e^{\hat\Theta_{n-1}\xi_{n-1}} \big)_{\varphi_{n-1},j} = \prod_{\ell=1}^m \prod_{a=1}^r (v_\ell)_a^{\,n^{(1)}_{(\ell,a)} - n^{(2)}_{(\ell,a)}} \; \frac{(v_{\hat X_0})_i}{(v_{\hat X_{n-1}})_j} \big( e^{\hat\Theta_{n-1}\xi_{n-1}} \big)_{\varphi_{n-1},j},$$
where
$$n^{(1)}_{(\ell,a)} = \sum_{k=1}^{n-1} \mathbb{1}\{Y^{(1)}_k = (\ell,a)\} \quad \text{and} \quad n^{(2)}_{(\ell,a)} = \sum_{k=0}^{n-2} \mathbb{1}\{Y^{(2)}_k = (\ell,a)\}.$$
Therefore,
$$\lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \varphi^{(n)}, \xi^{(n)}, \hat X^{(n)}]^{1/n} = \lim_{n\to\infty} \prod_{\ell=1}^m \prod_{a=1}^r (v_\ell)_a^{\,(n^{(1)}_{(\ell,a)} - n^{(2)}_{(\ell,a)})/n} \bigg( \frac{(v_{\hat X_0})_i}{(v_{\hat X_{n-1}})_j} \big( e^{\hat\Theta_{n-1}\xi_{n-1}} \big)_{\varphi_{n-1},j} \bigg)^{1/n} = \prod_{\ell=1}^m \prod_{a=1}^r (v_\ell)_a^{\,\pi^{(1)}_{(\ell,a)} - \pi^{(2)}_{(\ell,a)}} \quad \text{a.s.,} \qquad (17)$$
where $\pi^{(1)}$ and $\pi^{(2)}$ are the stationary distribution vectors of the stochastic matrices $P^{(1)}$ and $P^{(2)}$. Now,
$$e^\Psi = \lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}, \hat X^{(n)}]^{1/n} = \lim_{n\to\infty} \mathbb{E}_\varphi\big[ \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \varphi^{(n)}, \xi^{(n)}, \hat X^{(n)}] \big]^{1/n},$$
where the outermost expectation is with respect to $\varphi^{(n)}$,
$$\ge \lim_{n\to\infty} \mathbb{E}_\varphi\big[ \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \varphi^{(n)}, \xi^{(n)}, \hat X^{(n)}]^{1/n} \big] \ge \mathbb{E}_\varphi\big[ \lim_{n\to\infty} \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \varphi^{(n)}, \xi^{(n)}, \hat X^{(n)}]^{1/n} \big], \qquad (18)$$
where we use Jensen's inequality followed by Fatou's lemma. From (10), (17) and (18), we find that $\omega_L = \hat\pi C^{-1}\lambda + (\pi^{(1)} - \pi^{(2)}) \log \bar v$. Finally, it is easy to verify that $\pi^{(2)} = \pi^{(1)} N$, which concludes the proof.

Instead of using the right-eigenvectors of the matrices $\Omega_\ell$ as in (7), one may set up another dual process, starting from the left-eigenvectors.
Define $\Delta'_\ell = \mathrm{diag}(u_\ell)$ and $\Theta'_\ell = (\Delta'_\ell)^{-1} (\Omega^*_\ell)^\top \Delta'_\ell$, for $1 \le \ell \le m$. With these, one shows that
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \big\{ \bar\Delta_{n-1} e^{\bar\Theta_{n-1}\xi_{n-1}} \bar\Delta_{n-1}^{-1} \; \bar\Delta_{n-2} e^{\bar\Theta_{n-2}\xi_{n-2}} \bar\Delta_{n-2}^{-1} \cdots \bar\Delta_0 e^{\bar\Theta_0\xi_0} \bar\Delta_0^{-1} \big\}_{ji} \quad \text{a.s.,} \qquad (19)$$
where $\bar\Delta_k = \Delta'_{\hat X_k}$ and $\bar\Theta_k = \Theta'_{\hat X_k}$. We define $\{\widetilde X_k\}$ as the time-reversed version of the environmental jump chain, with transition matrix $\widetilde P = \mathrm{diag}(\hat\pi)^{-1} \hat P^\top \mathrm{diag}(\hat\pi)$, and obtain
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \big\{ \widetilde\Delta_0 e^{\widetilde\Theta_0\eta_0} \widetilde\Delta_0^{-1} \; \widetilde\Delta_1 e^{\widetilde\Theta_1\eta_1} \widetilde\Delta_1^{-1} \cdots \widetilde\Delta_{n-1} e^{\widetilde\Theta_{n-1}\eta_{n-1}} \widetilde\Delta_{n-1}^{-1} \big\}_{ji} \qquad (20)$$
with probability one, where $\widetilde\Delta_k = \Delta'_{\widetilde X_k}$, $\widetilde\Theta_k = \Theta'_{\widetilde X_k}$, and $\eta_k$ is exponentially distributed with parameter $c_{\widetilde X_k}$. We follow the same steps as in Section 2 and obtain a new lower bound; we omit the proof here.

Theorem 3.1. An alternative lower bound for $\omega$ is given by
$$\widetilde\omega_L = \hat\pi C^{-1}\lambda + \widetilde\pi^{(1)} (I - \widetilde N) \log \bar u^\top,$$
where $\widetilde N = \mathrm{diag}(\widetilde N_1, \ldots, \widetilde N_m)$ with $\widetilde N_\ell = c_\ell (c_\ell I - \Theta'_\ell)^{-1}$, the vector $\widetilde\pi^{(1)}$ is the stationary probability vector of $\widetilde P^{(1)} = \widetilde N (\widetilde P \otimes I_r)$, and $\bar u = [u_1 \; u_2 \; \cdots \; u_m]$.

If we repeat from (20) the argument of Theorem 2.2, we find that $\omega \le \widetilde\omega_U$, with $\widetilde\omega_U = \hat\pi C^{-1}\lambda + \log \mathrm{sp}[M^\top(\widetilde P \otimes I_r)]$. This is not a new upper bound, however, as $\omega_U = \widetilde\omega_U$. Indeed, with $\Delta_\pi = \mathrm{diag}(\hat\pi)$,
$$\mathrm{sp}[M^\top(\widetilde P \otimes I_r)] = \mathrm{sp}[M^\top (\Delta_\pi^{-1} \otimes I_r)(\hat P^\top \otimes I_r)(\Delta_\pi \otimes I_r)] = \mathrm{sp}[(\Delta_\pi \otimes I_r) M^\top (\Delta_\pi^{-1} \otimes I_r)(\hat P^\top \otimes I_r)],$$
as the spectral radius of a product of matrices is invariant under a cyclic permutation of the factors. The first three factors are block-diagonal matrices, and a simple calculation shows that they commute, so that
$$\mathrm{sp}[M^\top(\widetilde P \otimes I_r)] = \mathrm{sp}[M^\top(\hat P^\top \otimes I_r)] = \mathrm{sp}[(\hat P \otimes I_r) M] = \mathrm{sp}[M(\hat P \otimes I_r)].$$
The two lower bounds $\omega_L$ and $\widetilde\omega_L$, on the other hand, differ in general. This is illustrated in the next section.

In this section, we illustrate our bounds with the help of two examples. The value of $\omega$ as defined in (1) is approximated by simulating paths of the branching process over $n$ environmental transitions, for large values of $n$; we denote this approximation by $\omega_{\mathrm{sim}}$.

Example 4.1. In our first example, $m = 2$; the generator $Q$ of $\{X(t)\}$ has stationary distribution $\pi = [\pi_1, \pi_2]$. We take $r = 2$ and two $2 \times 2$ matrices $\Omega_1$ and $\Omega_2$ (21) whose dominant eigenvalues satisfy $\lambda_1 < 0 < \lambda_2$, so the branching process is subcritical in state 1 and supercritical in state 2. As $\pi_2 > \pi_1$, we expect the whole process to be supercritical, and this is confirmed by our results, summarised in the table below and in Figure 1.

[Table: values of $\omega_L$, $\widetilde\omega_L$, $\omega_{\mathrm{sim}}$, $\omega_U$ for Example 4.1.]

Figure 1 shows the simulated values of $\omega$ as $n$ increases; the upper and lower dashed lines are for $\omega_U$ and $\omega_L$. Clearly, $\omega$ is positive. For this example, $\omega_L$ is not very good, and $\widetilde\omega_L$ gives a better lower bound.
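The approximation $\omega_{\mathrm{sim}}$ can be produced directly from (1): simulate the environment, multiply the factors $e^{\hat\Omega_k \xi_k}$, and renormalise a row vector to avoid overflow while accumulating the log-norm. A minimal sketch with invented $2 \times 2$ matrices (chosen to commute, so the exact value $\omega = \hat\pi C^{-1}\lambda$ is available for comparison; Putzer's formula supplies the $2 \times 2$ matrix exponential):

```python
import math
import random

def expm2(A, t):
    """e^{A t} via Putzer's formula, for a 2x2 matrix with distinct real eigenvalues."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    r1, r2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    e1 = math.exp(r1 * t)
    c = (e1 - math.exp(r2 * t)) / (r1 - r2)
    return [[e1 + c * (A[0][0] - r1), c * A[0][1]],
            [c * A[1][0], e1 + c * (A[1][1] - r1)]]

def omega_sim(Omegas, c, P, n, rng):
    """(1/n) log || e^{Omega_{X_0} xi_0} ... e^{Omega_{X_{n-1}} xi_{n-1}} ||,
    computed by pushing a row vector through the product with renormalisation."""
    x = 0
    w = [1.0, 1.0]
    acc = 0.0
    for _ in range(n):
        xi = rng.expovariate(c[x])
        F = expm2(Omegas[x], xi)
        w = [w[0] * F[0][0] + w[1] * F[1][0],
             w[0] * F[0][1] + w[1] * F[1][1]]
        s = w[0] + w[1]
        acc += math.log(s)
        w = [w[0] / s, w[1] / s]
        u, cum = rng.random(), 0.0      # next environmental state from the jump chain
        for y, p in enumerate(P[x]):
            cum += p
            if u < cum:
                x = y
                break
    return acc / n

rng = random.Random(7)
Omega1 = [[-2.0, 1.0], [1.0, -2.0]]   # lambda_1 = -1 (invented)
Omega2 = [[-1.0, 2.0], [2.0, -1.0]]   # lambda_2 = +1 (invented, commutes with Omega1)
c = [2.0, 3.0]
P = [[0.0, 1.0], [1.0, 0.0]]          # alternating environment, pi_hat = (1/2, 1/2)

est = omega_sim([Omega1, Omega2], c, P, 200_000, rng)
exact = 0.5 * (-1.0) / 2.0 + 0.5 * 1.0 / 3.0   # pi_hat C^{-1} lambda = -1/12
print(est, exact)
```

The renormalisation step is what makes a Cesàro-type estimate over hundreds of thousands of transitions numerically stable.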
[Figure 1. Plain line: simulations of $\omega$ (Cesàro sequence) for successive values of $n$; lower dashed line: $\omega_L$; upper dashed line: $\omega_U$.]

Example 4.2. We add a third state to the random environment, with a $3 \times 3$ generator $Q$ and stationary distribution $\pi = [\pi_1, \pi_2, \pi_3]$. The matrices $\Omega_1$ and $\Omega_2$ are the same as in Example 4.1, and the third is a $2 \times 2$ matrix $\Omega_3$ (22) with $\lambda_3 < \lambda_1 < 0$, so the branching process is very subcritical in the new environmental state. In this case, we obtain the values of $\widetilde\omega_L$, $\omega_L$, $\omega_{\mathrm{sim}}$ and $\omega_U$ reported in the table below; this time, $\omega_L$ is the better of the two lower bounds.

[Table: values of $\widetilde\omega_L$, $\omega_L$, $\omega_{\mathrm{sim}}$, $\omega_U$ for Example 4.2.]

The examples have been chosen so that the environment spends asymptotically the same amount of time in the supercritical state 2 (the stationary probability $\pi_2$ is the same in both cases). In addition, state 3 is more subcritical than state 1. Nevertheless, $\omega$ is greater, and the branching process overall more supercritical, in the second example. It is worth mentioning that we have generally observed in our experimentation that the upper bound is more accurate than the lower bounds.
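Both bounds of Theorems 2.2 and 2.3 reduce to small linear-algebra computations. The sketch below assembles $M$, the $N_\ell$, and $P^{(1)}$ for an invented two-state, two-type example in pure Python; the matrices are chosen to share the right eigenvector $v = (1,1)^\top$, so, as in the commutative case treated in Section 6, both bounds collapse to $\hat\pi C^{-1}\lambda$ and the output can be checked exactly.

```python
import math

def inv2(A):
    """Inverse of a 2x2 matrix."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def shift_inv2(A, s):
    """(s I - A)^{-1} for a 2x2 matrix A."""
    return inv2([[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]])

def pf2(Om):
    """Dominant eigenvalue and right eigenvector (normalised as u 1 = 1, u v = 1)
    of a 2x2 irreducible matrix with nonnegative off-diagonal entries."""
    tr = Om[0][0] + Om[1][1]
    det = Om[0][0] * Om[1][1] - Om[0][1] * Om[1][0]
    lam = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
    v = [Om[0][1], lam - Om[0][0]]
    u = [Om[1][0], lam - Om[0][0]]
    su = u[0] + u[1]
    u = [u[0] / su, u[1] / su]
    s = u[0] * v[0] + u[1] * v[1]
    return lam, [v[0] / s, v[1] / s]

def spectral_radius(A, iters=2000):
    """Log-averaged power iteration (robust to periodic block structure)."""
    n = len(A)
    x = [1.0 / n] * n
    acc = 0.0
    for _ in range(iters):
        x = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(x)
        acc += math.log(s)
        x = [e / s for e in x]
    return math.exp(acc / iters)

def stationary(P, iters=2000):
    """Stationary row vector of a stochastic matrix, by repeated multiplication."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return p

# ---- invented data: m = 2 environments, r = 2 types --------------------
Omegas = [[[-2.0, 1.0], [1.0, -2.0]],   # lambda_1 = -1
          [[-1.0, 2.0], [2.0, -1.0]]]   # lambda_2 = +1
c = [2.0, 3.0]
Phat = [[0.0, 1.0], [1.0, 0.0]]         # environmental jump chain, pi_hat = (1/2, 1/2)
pihat = [0.5, 0.5]
m, r = 2, 2

lams, vs = [], []
for Om in Omegas:
    lam, v = pf2(Om)
    lams.append(lam)
    vs.append(v)
base = sum(pihat[l] * lams[l] / c[l] for l in range(m))   # pi_hat C^{-1} lambda

# Theorem 2.2: omega_U = base + log sp[M (Phat x I_r)], M_l = c_l[(c_l+lam_l)I - Omega_l]^{-1}
Ms = [[[c[l] * e for e in row] for row in shift_inv2(Omegas[l], c[l] + lams[l])]
      for l in range(m)]
MP = [[Ms[l][a][b] * Phat[l][lp] for lp in range(m) for b in range(r)]
      for l in range(m) for a in range(r)]
omega_U = base + math.log(spectral_radius(MP))

# Theorem 2.3: omega_L = base + pi1 (I - N) log v_bar,
# with Theta_l = Delta_l^{-1}(Omega_l - lam_l I)Delta_l and N_l = c_l(c_l I - Theta_l)^{-1}
Ns = []
for l in range(m):
    v = vs[l]
    Th = [[(Omegas[l][a][b] - (lams[l] if a == b else 0.0)) * v[b] / v[a]
           for b in range(r)] for a in range(r)]
    Ns.append([[c[l] * e for e in row] for row in shift_inv2(Th, c[l])])
P1 = [[Ns[l][a][b] * Phat[l][lp] for lp in range(m) for b in range(r)]
      for l in range(m) for a in range(r)]
pi1 = stationary(P1)
logv = [math.log(vs[l][a]) for l in range(m) for a in range(r)]
Nlogv = [sum(Ns[l][a][b] * logv[l * r + b] for b in range(r))
         for l in range(m) for a in range(r)]
omega_L = base + sum(pi1[k] * (logv[k] - Nlogv[k]) for k in range(m * r))
print(omega_U, omega_L, base)
```

For matrices without a common eigenvector the same script produces a strict gap $\omega_L < \omega_U$, which is the generic situation illustrated in the examples above.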
[Figure 2. Plain line: simulations of $\omega$ (Cesàro sequence) for successive values of $n$; lower dashed line: $\omega_L$; upper dashed line: $\omega_U$.]

The lower bound $\omega_L$ is obtained by adding information in the form of all successive states of the embedded dual process $\{\varphi_k\}$. In general, it seems that $\omega_L$ is not a very close bound for $\omega$, not as close as $\omega_U$ at any rate, and neither is $\widetilde\omega_L$. We attempt, therefore, to obtain a better lower bound by adding less information. This is doable in the special case of a cyclic environmental process.

Assume that the jump chain $\{\hat X_k\}$ follows a deterministic cyclic path, and assume, without loss of generality, that $\hat X_k = (k \bmod m) + 1$. In this case, (11) is equivalent to
$$\Psi = \lim_{n\to\infty} \frac{1}{n} \log \mathbb{E}[Z_{n,j} \,|\, \varphi_0 = i, \xi^{(n)}] \quad \text{a.s.} = \lim_{n\to\infty} \frac{1}{nm} \log \mathbb{E}_\rho\big[ \mathbb{E}[Z_{nm,j} \,|\, \varphi_0 = i, \xi^{(nm)}, \rho^{(n)}] \big] \quad \text{a.s.,} \qquad (23)$$
where the outermost expectation is with respect to $\rho^{(n)}$, defined as $\rho^{(n)} = (\varphi_m, \varphi_{2m}, \ldots, \varphi_{(n-1)m})$. That is, in contrast with Theorem 2.3, we do not condition on the whole sequence of states of the dual at the epochs $\{\tau_k\}$, but only at the beginning of cycles for $\{\hat X_k\}$.

We rewrite the products $Z_{n,j}$ of (9, 12) as
$$Z_{nm,j} = v_{m,\varphi_0} \, R_0 R_1 \cdots R_{n-1} \, v^{-1}_{m,\varphi_{nm}} \mathbb{1}\{\varphi_{nm} = j\}, \qquad R_k = \frac{v_{1,\varphi_{km}} \, v_{2,\varphi_{km+1}} \cdots v_{m,\varphi_{km+m-1}}}{v_{m,\varphi_{km}} \, v_{1,\varphi_{km+1}} \cdots v_{m-1,\varphi_{km+m-1}}}, \qquad (24)$$
and we use the fact that $(R_0, R_1, \ldots, R_{k-1})$ is conditionally independent of $(R_k, R_{k+1}, \ldots)$, given $\varphi_{km}$. Thus,
$$\mathbb{E}[Z_{nm,j} \,|\, \varphi_0 = i, \xi^{(nm)}, \rho^{(n)}] = v_{m,i} \, \mathbb{E}[R_0 \,|\, \varphi_0 = i, \zeta_0, \varphi_m] \prod_{k=1}^{n-2} \mathbb{E}[R_k \,|\, \varphi_{km}, \varphi_{(k+1)m}, \zeta_k] \; \mathbb{E}[R_{n-1} v^{-1}_{m,\varphi_{nm}} \mathbb{1}\{\varphi_{nm} = j\} \,|\, \varphi_{(n-1)m}, \zeta_{n-1}],$$
with $\zeta_k = (\xi_{km}, \xi_{km+1}, \ldots, \xi_{km+m-1})$, which we rewrite as
$$= f \prod_{k=1}^{n-2} \mathbb{E}[R_k \,|\, \varphi_{km}, \varphi_{(k+1)m}, \zeta_k], \qquad (25)$$
where we collect in $f$ all the factors which play no role in the limit as $n \to \infty$. Observe that $\zeta_0, \zeta_1, \ldots$ are independent and identically distributed random $m$-tuples, with the same distribution as $\zeta^* = (\xi^*_1, \xi^*_2, \ldots, \xi^*_m)$, where $\xi^*_1, \xi^*_2, \ldots, \xi^*_m$ are independent, exponentially distributed random variables, respectively with parameters $c_1, c_2, \ldots, c_m$. Thus, from (23) and (25),
$$e^\Psi = \lim_{n\to\infty} \Big\{ \mathbb{E}_\rho\Big[ f \prod_{k=1}^{n-2} \mathbb{E}[R_k \,|\, \varphi_{km}, \varphi_{(k+1)m}, \zeta_k] \Big] \Big\}^{1/nm} \ge \mathbb{E}_\rho\Big[ \lim_{n\to\infty} \Big\{ f \prod_{k=1}^{n-2} \mathbb{E}[R_k \,|\, \varphi_{km}, \varphi_{(k+1)m}, \zeta_k] \Big\}^{1/nm} \Big] = \mathbb{E}_\rho\Big[ \lim_{n\to\infty} \Big\{ \prod_{k=1}^{n-2} \mathbb{E}[R_k \,|\, \varphi_{km}, \varphi_{(k+1)m}, \zeta_k] \Big\}^{1/nm} \Big]; \qquad (26)$$
the inequality is justified by Fatou's lemma and Jensen's inequality. We may now prove the following property.

Theorem 5.1.
A lower bound for $\omega$ is given by
$$\omega^*_L = \hat\pi C^{-1}\lambda + \frac{1}{m} \sum_{1 \le i,j \le r} \beta_{ij} \, \mathbb{E}\big[ \log \mathbb{E}[R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*] \big], \qquad (27)$$
where
$$\beta_{ij} = \alpha_i (N_1 N_2 \cdots N_m)_{ij}, \qquad (28)$$
the matrices $N_\ell$ are defined in (15), and $\alpha$ is the stationary vector of their product:
$$\alpha N_1 N_2 \cdots N_m = \alpha, \qquad \alpha \mathbf{1} = 1. \qquad (29)$$

Proof.
We reorganise the product in (26) and group together the factors with equal values for $\varphi_{km}$ and $\varphi_{(k+1)m}$, obtaining that
$$e^\Psi \ge \mathbb{E}_\rho\Big[ \lim_{n\to\infty} \Big( \prod_{1 \le i,j \le r} \prod_{k=1}^{n_{ij}} R_{k;i,j} \Big)^{1/nm} \Big], \qquad (30)$$
where $n_{ij} = \sum_{k=1}^{n-2} \mathbb{1}\{\varphi_{mk} = i, \varphi_{m(k+1)} = j\}$ and, for fixed $i$ and $j$, $\{R_{1;i,j}, R_{2;i,j}, \ldots\}$ are i.i.d. random variables with the same distribution as $\mathbb{E}[R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*]$. By the Strong Law of Large Numbers, the limits $\beta_{ij} = \lim_{n\to\infty} n_{ij}/n$ exist and
$$\beta_{ij} = \lim_{k\to\infty} \mathbb{P}[\varphi_{km} = i, \varphi_{(k+1)m} = j] = \lim_{k\to\infty} \mathbb{P}[\varphi_{km} = i] \, \mathbb{P}[\varphi_m = j \,|\, \varphi_0 = i] = \alpha_i (N_1 N_2 \cdots N_m)_{ij}.$$
Further, by the Strong Law of Large Numbers again, for fixed $i$ and $j$,
$$\lim_{n\to\infty} \Big( \prod_{k=1}^{n_{ij}} R_{k;i,j} \Big)^{1/n} = \exp\big( \beta_{ij} \, \mathbb{E}[\log \mathbb{E}[R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*]] \big),$$
so that the limit in (30) is independent of $\rho^{(n)}$ and the inequality becomes $\Psi \ge \Psi^*$, where
$$\Psi^* = \frac{1}{m} \sum_{1 \le i,j \le r} \beta_{ij} \, \mathbb{E}\big[ \log \mathbb{E}[R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*] \big],$$
and the conclusion follows from (10).

Now,
$$\Psi^* \ge \frac{1}{m} \sum_{1 \le i,j \le r} \beta_{ij} \, \mathbb{E}\big[ \mathbb{E}[\log R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*] \big] \quad \text{by Jensen's inequality} = \frac{1}{m} \sum_{1 \le i,j \le r} \beta_{ij} \, \mathbb{E}[\log R_0 \,|\, \varphi_0 = i, \varphi_m = j] = \frac{1}{m} \sum_{1 \le i \le r} \alpha_i \, \mathbb{E}[\log R_0 \,|\, \varphi_0 = i]. \qquad (31)$$
From (24) we find that
$$\mathbb{E}[\log R_0 \,|\, \varphi_0 = i] = \mathbb{E}\Big[ \log v_{1,i} + \sum_{\ell=2}^m \log v_{\ell,\varphi_{\ell-1}} - \log v_{m,i} - \sum_{\ell=1}^{m-1} \log v_{\ell,\varphi_\ell} \Big] = \Big( \sum_{\ell=1}^m \prod_{t=1}^{\ell-1} N_t \log v_\ell \Big)_i - \log v_{m,i} - \Big( \sum_{\ell=1}^{m-1} \prod_{t=1}^{\ell} N_t \log v_\ell \Big)_i = \Big( \sum_{\ell=1}^m \prod_{t=1}^{\ell-1} N_t (I - N_\ell) \log v_\ell \Big)_i,$$
and so, by (31),
$$\Psi^* \ge \frac{1}{m} \, \alpha \Big( \sum_{\ell=1}^m \prod_{t=1}^{\ell-1} N_t (I - N_\ell) \log v_\ell \Big). \qquad (32)$$
Using the cyclic structure of $\hat P$, one shows that the expression $\pi^{(1)} (I - N) \log \bar v$ defined in Theorem 2.3 is identical to the right-hand side of (32). This proves that $\omega^*_L \ge \omega_L$.

Denote by $A_{ij}$, $1 \le i,j \le r$, the expectations in (27): $A_{ij} = \mathbb{E}[\log \mathbb{E}[R_0 \,|\, \varphi_0 = i, \varphi_m = j, \zeta^*]]$. These need to be determined for $\omega^*_L$ to be of practical use. One verifies from first principles that
$$\mathbb{E}[R_0 \,|\, \varphi_0, \varphi_m, \zeta^*] = \Big( \prod_{1 \le k \le m} \Delta^*_k e^{\Theta_k \xi^*_k} \Big)_{\varphi_0,\varphi_m} \Big/ \Big( \prod_{1 \le k \le m} e^{\Theta_k \xi^*_k} \Big)_{\varphi_0,\varphi_m},$$
where $\Delta^*_1 = \Delta_1 \Delta_m^{-1}$ and $\Delta^*_k = \Delta_k \Delta_{k-1}^{-1}$, $k = 2, \ldots, m$, so that
$$A_{ij} = \mathbb{E}\Big[ \log \Big( \prod_{1 \le k \le m} \Delta^*_k e^{\Theta_k \xi^*_k} \Big)_{ij} \Big] - \mathbb{E}\Big[ \log \Big( \prod_{1 \le k \le m} e^{\Theta_k \xi^*_k} \Big)_{ij} \Big].$$
Although we do not have an explicit form for the expectations above, estimates by simulation are easily obtained, and this is what we did to compare $\omega^*_L$ and $\omega_L$ in the two examples below.

Example 5.1. The random environment of Example 4.1 has only two states and is automatically cyclic. The new lower bound $\omega^*_L$ is indeed larger than $\omega_L$.

Example 5.2. This is similar to Example 4.2: it is a three-state random environment with the matrices $\Omega_\ell$ defined in (21, 22) and a cyclic generator $Q$ with stationary distribution $\pi$. The bounds and the approximation are given in the table below.

[Table: values of $\widetilde\omega_L$, $\omega_L$, $\omega^*_L$, $\omega_{\mathrm{sim}}$, $\omega_U$ for Example 5.2.]

The difference $\omega_{\mathrm{sim}} - \omega^*_L$ is less than half the difference $\omega_{\mathrm{sim}} - \omega_L$. Needless to say, we might apply the same procedure to the dual of Section 3, thereby obtaining another lower bound, closer to $\omega$ than $\widetilde\omega_L$. In the same manner as there is no systematic difference between $\widetilde\omega_L$ and $\omega_L$, we do not expect that there would be a systematic preference between this additional bound and $\omega^*_L$.

The bounds are tight in that there exist branching processes for which the bounds are all equal, and equal to $\omega$.
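The single-type case below can be checked mechanically: each $M_\ell = 1$, so $M(\hat P \otimes I_1) = \hat P$ has spectral radius 1, and $\Theta_\ell = 0$ makes the lower-bound correction vanish. A small sketch with invented rates:

```python
import math

# Single-type (r = 1) data; all numbers invented.
lam = [-0.5, 1.0, -2.0]            # scalar growth rates: Omega_l = lambda_l
c = [1.0, 2.0, 4.0]                # sojourn parameters c_l
Phat = [[0.0, 0.5, 0.5],
        [0.5, 0.0, 0.5],
        [0.5, 0.5, 0.0]]           # doubly stochastic jump chain: pi_hat uniform
pihat = [1.0 / 3.0] * 3

omega = sum(pihat[l] * lam[l] / c[l] for l in range(3))   # pi_hat C^{-1} lambda

# M_l = c_l [(c_l + lambda_l) - Omega_l]^{-1} = 1 for every l, so M(Phat x I_1) = Phat.
M = [c[l] / ((c[l] + lam[l]) - lam[l]) for l in range(3)]

# Spectral radius of Phat by log-averaged power iteration: it equals 1.
x = [1.0 / 3.0] * 3
acc = 0.0
for _ in range(500):
    x = [sum(Phat[i][j] * x[j] for j in range(3)) for i in range(3)]
    s = sum(x)
    acc += math.log(s)
    x = [e / s for e in x]
sp = math.exp(acc / 500)

omega_U = omega + math.log(sp)     # Theorem 2.2
omega_L = omega                    # Theta_l = 0 => N_l = 1 => correction term is 0
print(omega, omega_U, omega_L)
```

All quantities coincide, as the tightness results assert.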
We show below that such is the case for single-type processes ($r = 1$), and for processes such that the matrices $\Omega_\ell$ commute for all $\ell$; it is in particular the case for processes with constant growth rate ($\Omega_\ell = \Omega$ for all $\ell$, for some $\Omega$).

Lemma 6.1. If $r = 1$, then $\omega_L = \widetilde\omega_L = \omega^*_L = \omega_U = \omega = \hat\pi C^{-1}\lambda$.

Proof. If $r = 1$, all matrices reduce to scalars, so that $\lambda_\ell = \Omega_\ell$ and $\Omega^*_\ell = 0$ for all $\ell$. Therefore, $\Psi = 0$ by (6) and $\omega = \hat\pi C^{-1}\lambda$ by (10). Furthermore, $M_\ell = 1$ for all $\ell$, so that $M(\hat P \otimes I_1) = \hat P$ is a stochastic matrix and $\log \mathrm{sp}[M(\hat P \otimes I_1)] = 0$. We have thus proved that $\omega_U = \omega$. Finally, $\Theta_\ell = 0$, so that $N_\ell = 1$, for all $\ell$. Therefore, $I - N = 0$ and we conclude that $\omega_L = \omega$. The same argument gives us $\widetilde\omega_L = \omega$. Since $\omega_L \le \omega^*_L \le \omega$, this shows that $\omega^*_L = \omega$ as well.

Lemma 6.2. If the matrices $\Omega_\ell$ are mutually commutative for $1 \le \ell \le m$, then $\omega_L = \widetilde\omega_L = \omega^*_L = \omega_U = \omega = \hat\pi C^{-1}\lambda$.

Proof. Once more, it is enough to prove that $\mathrm{sp}[M(\hat P \otimes I_r)] = 1$ and $\pi^{(1)}(I - N) = 0$. Note that, since the matrices $\Omega_\ell$ are mutually commutative, they all share the same strictly positive Perron-Frobenius right-eigenvector, which we denote by $v$. Let $\lambda_\ell$ denote the eigenvalue of maximal real part of $\Omega_\ell$. Then the matrices $M_\ell$ defined in (13) are such that
$$M_\ell v = c_\ell ((c_\ell + \lambda_\ell) I - \Omega_\ell)^{-1} v = c_\ell (c_\ell + \lambda_\ell - \lambda_\ell)^{-1} v = v.$$
Therefore, $M(\hat P \otimes I_r)(\mathbf{1} \otimes v) = M(\mathbf{1} \otimes v) = \mathbf{1} \otimes v$, and since $\mathbf{1} \otimes v > 0$, this proves that $\mathrm{sp}[M(\hat P \otimes I_r)] = 1$ by the Perron-Frobenius Theorem.

Second, the mutual commutativity of the matrices $\Omega_\ell$ implies the mutual commutativity of the matrices $\Theta_\ell$, and we denote by $w$ the common strictly positive left-eigenvector of the matrices $\Theta_\ell$ associated to their dominant eigenvalue 0, normalised such that $w\mathbf{1} = 1$. The matrices $N_\ell$ defined in (15) are then such that $w N_\ell = c_\ell w (c_\ell I - \Theta_\ell)^{-1} = w$ for all $\ell$, and so
$$(\hat\pi \otimes w) N (\hat P \otimes I_r) = (\hat\pi \otimes w)(\hat P \otimes I_r) = \hat\pi \otimes w.$$
We conclude, on the one hand, that $\pi^{(1)} = \hat\pi \otimes w$ and, on the other hand, that $\pi^{(1)}(I - N) = 0$.

Acknowledgements. Sophie Hautphenne is supported by the Australian Research Council (ARC) grant DP110101663. Guy Latouche acknowledges the support of the Ministère de la Communauté française de Belgique through the ARC grant AUWB-08/13–ULB 5.
References

[1] K. B. Athreya and S. Karlin. On branching processes with random environments: I: Extinction probabilities. Ann. Math. Statist., 42:1499–1520, 1971.
[2] K. B. Athreya and P. E. Ney. Branching Processes. Springer-Verlag, New York, 1972.
[3] E. Bolthausen and I. Goldsheid. Recurrence and transience of random walks in random environments on a strip. Communications in Mathematical Physics, 214(2):429–447, 2000.
[4] H. Furstenberg and H. Kesten. Products of random matrices. Ann. Math. Statist., 31:457–469, 1960.
[5] R. Gharavi and V. Anantharam. An upper bound for the largest Lyapunov exponent of a Markovian random matrix product of nonnegative matrices. Theoretical Computer Science, 332:543–557, 2005.
[6] S. Hautphenne, G. Latouche, and G. T. Nguyen. Markovian trees subject to catastrophes: would they survive forever? In Matrix-Analytic Methods in Stochastic Models, pages 87–106. Springer, 2013.
[7] E. Key. Computable examples of the maximal Lyapunov exponent. Probab. Th. Rel. Fields, 75:97–107, 1987.
[8] E. Key. Lower bounds for the maximal Lyapunov exponent. J. Theoret. Probab., 3(3):477–488, 1990.
[9] J. F. C. Kingman. Subadditive ergodic theory. The Annals of Probability, 1:883–909, 1973.
[10] R. Lima and M. Rahibe. Exact Lyapunov exponent for infinite products of random matrices. J. Phys. A: Math. Gen., 27:3427–3437, 1994.
[11] V. I. Oseledec. A multiplicative ergodic theorem: Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc., 19:197–231, 1968.
[12] D. Tanny. On multitype branching processes in a random environment. Adv. Appl. Prob., 13:464–497, 1981.
[13] J. Tsitsiklis and V. Blondel. The Lyapunov exponent and joint spectral radius of pairs of matrices are hard — when not impossible — to compute and to approximate. Math. Control Signals Systems, 10:31–40, 1997.
[14] J. C. Watkins. Limit theorems for products of random matrices. In J. E. Cohen, H. Kesten, and C. M. Newman, editors, Random Matrices and Their Applications, volume 50 of Contemporary Mathematics. American Mathematical Society, Providence, RI, 1986.