A three-series theorem on Lie groups
Ming Liao
Department of Mathematics, Auburn University, Auburn, AL 36849, USA. Email: [email protected]

Summary
We obtain a necessary and sufficient condition for the convergence of products of independent random variables on Lie groups, as a natural extension of Kolmogorov's three-series theorem. An application to independent random matrices is discussed.
Key words and phrases: Lie groups, three-series theorem.
Let $x_n$ be a sequence of independent real-valued random variables. Fix any constant $r > 0$. By Kolmogorov's three-series theorem, $\sum_{n=1}^\infty x_n$ converges almost surely if and only if the following three conditions hold.

(K1) $\sum_{n=1}^\infty P(|x_n| > r) < \infty$;

(K2) $\sum_{n=1}^\infty E(x_n 1_{[|x_n| \le r]})$ converges, where $1_A$ is the indicator of a set $A$; and

(K3) $\sum_{n=1}^\infty E[(x_n 1_{[|x_n| \le r]} - b_n)^2] < \infty$, where $b_n = E(x_n 1_{[|x_n| \le r]})$.

Extensions of the three-series theorem to more general spaces have been explored in the literature. In particular, Maksimov [6] obtained a one-sided extension of the three-series theorem to Lie groups, providing a set of sufficient conditions for the convergence of products of independent random variables in a Lie group, with some partial results toward the more difficult necessity part.

The purpose of this paper is to present a complete extension of the three-series theorem to a general Lie group. Our result is a simpler form of a conjecture proposed in [6], and is in closer analogy with the classical result. We not only establish the more difficult necessity part; our proof of sufficiency is also much shorter than that in [6]. The result will be applied to study the convergence of products of independent random matrices.

Let $G$ be a Lie group of dimension $d$ with identity element $e$. There are a relatively compact neighborhood $U$ of $e$ and a smooth function $\phi = (\phi_1, \phi_2, \ldots, \phi_d)\colon U \to \mathbb{R}^d$ which maps $U$ diffeomorphically onto a convex neighborhood $\phi(U)$ of the origin $0$ in $\mathbb{R}^d$, with $\phi(e) = 0$. The set $U$ is not assumed to be open, and $\phi$ is assumed to extend to a smooth function on an open set containing the closure $\bar U$ of $U$. In the rest of the paper, $U$ and $\phi$ are fixed, but they may be chosen arbitrarily as long as the above properties are satisfied.

Let $x$ be a random variable in $G$. Its $U$-truncated mean $b$ is defined by

$\phi(b) = E[\phi(x) 1_{[x \in U]}]$.   (1)

Note that because $\phi(U)$ is convex, $E[\phi(x) 1_{[x \in U]}] \in \phi(U)$ and $b = \phi^{-1}\{E[\phi(x) 1_{[x \in U]}]\}$.
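As a computational aside (an editor's illustration, not part of the original paper): once a chart is fixed, the $U$-truncated mean of (1) can be estimated by Monte Carlo. The sketch below uses the matrix chart $\phi(x) = x - I$ with $U = \{x:\ \|x - I\| \le r\}$ that appears later in Theorem 2; the sample distribution is an arbitrary illustrative choice.

```python
import numpy as np

def truncated_mean(samples, r):
    """Monte Carlo estimate of the U-truncated mean b of (1), for the
    matrix chart phi(x) = x - I with U = {x : ||x - I|| <= r}."""
    k = samples[0].shape[0]
    I = np.eye(k)
    acc = np.zeros((k, k))
    for x in samples:
        y = x - I                      # phi(x)
        if np.linalg.norm(y) <= r:     # indicator 1_[x in U]
            acc += y
    m = acc / len(samples)             # estimates E[phi(x) 1_[x in U]]
    return I + m                       # b = phi^{-1}(m)

rng = np.random.default_rng(0)
xs = [np.eye(2) + 0.05 * rng.standard_normal((2, 2)) for _ in range(10000)]
print(truncated_mean(xs, r=0.5))       # close to I: the truncated mean is 0 by symmetry
```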
Theorem 1. Let $x_n$ be a sequence of independent $G$-valued random variables with $U$-truncated means $b_n$. Then $\hat x_n = x_1 x_2 \cdots x_n$ converges almost surely in $G$ as $n \to \infty$ if and only if the following three conditions hold.

(G1) $\sum_{n=1}^\infty P(x_n \in U^c) < \infty$, where $U^c$ is the complement of $U$ in $G$;

(G2) $\hat b_n = b_1 b_2 \cdots b_n$ converges in $G$ as $n \to \infty$; and

(G3) $\sum_{n=1}^\infty E[\|\phi(x_n) 1_{[x_n \in U]} - \phi(b_n)\|^2] < \infty$, where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$.

Note that under (G1), (G3) is equivalent to $\sum_{n=1}^\infty E[\|\phi(x_n) - \phi(b_n)\|^2 1_{[x_n \in U]}] < \infty$. The proof of Theorem 1 will begin in the next section. Note that by Kolmogorov's 0-1 law, the independent product $\hat x_n$ either converges almost surely or diverges almost surely. When $G = \mathbb{R}^d$ as an additive group, one may take $\phi$ to be the identity map on $\mathbb{R}^d$ and $U$ to be the ball of radius $r > 0$ centered at the origin; Theorem 1 then reduces to the classical three-series theorem on $\mathbb{R}^d$.

We briefly comment on the relation between almost sure convergence and convergence in distribution. On Euclidean spaces, it is well known that the two convergences are equivalent for a series of independent random variables. This is not true for an independent product on a Lie group $G$. Indeed, if $G$ has a compact subgroup $H \ne \{e\}$, then for any sequence of independent random variables $x_n$, each distributed according to the normalized Haar measure on $H$, the product $x_1 x_2 \cdots x_n$ converges in distribution to $x_1$ (its distribution is the Haar measure on $H$ for every $n$), but it is clearly not convergent almost surely. By Theorem 2.2.16 (ii) in Heyer [4], if the only compact subgroup of $G$ is $\{e\}$, then convergence in distribution and almost sure convergence are equivalent for an infinite product of independent random variables in $G$.
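The compact-subgroup phenomenon above is easy to see numerically. The following sketch (an editor's illustration, with SO(2) as the compact subgroup) simulates products of i.i.d. Haar-distributed rotations: each partial product is again Haar distributed, yet the path keeps moving and has no almost sure limit.

```python
import numpy as np

rng = np.random.default_rng(1)

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# x_n i.i.d. with the normalized Haar (uniform-angle) distribution on SO(2):
# every partial product x_1 ... x_n is again Haar distributed, so the
# products converge in distribution, but the path never settles down.
prod = np.eye(2)
for n in range(1, 21):
    prod = prod @ rot(rng.uniform(0.0, 2.0 * np.pi))
    if n % 5 == 0:
        print(n, np.round(prod[:, 0], 3))  # first column keeps jumping around
```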
For $k \ge 1$, let $M_k$ be the space of $k \times k$ real matrices, which may be identified with $\mathbb{R}^d$, where $d = k^2$. The Euclidean norm of $x = \{x_{ij}\} \in M_k$ is $\|x\| = \sqrt{\sum_{i,j} x_{ij}^2}$, and it satisfies $\|xy\| \le \|x\|\,\|y\|$ for $x, y \in M_k$. Let $G$ be the group of $k \times k$ real matrices with nonzero determinants under the matrix product. Its identity element $e$ is the identity matrix $I$. Its Lie algebra is $M_k$, with the Lie group exponential map $\exp(x)$ being the usual matrix exponential $e^x = I + \sum_{n=1}^\infty x^n/n!$.

Theorem 2
Let $G$ be the matrix group as above, and let $x_n$ be a sequence of independent random variables in $G$. Fix $r \in (0, 1)$. Then $\hat x_n = x_1 x_2 \cdots x_n$ converges almost surely to a random matrix in $G$ if and only if the following three conditions hold.

(M1) $\sum_{n=1}^\infty P(\|x_n - I\| > r) < \infty$;

(M2) $b_1 b_2 \cdots b_n$ converges in $G$ as $n \to \infty$, where $b_n = I + E[(x_n - I) 1_{[\|x_n - I\| \le r]}]$; and

(M3) $\sum_{n=1}^\infty E(\|x_n - b_n\|^2 1_{[\|x_n - I\| \le r]}) < \infty$.

Proof:
Let $U = \{x \in G:\ \|x - I\| \le r\}$ and $\phi(x) = x - I \in M_k$. If $\|y\| < 1$, then $I + y$ is invertible with $(I + y)^{-1} = I + \sum_{p=1}^\infty (-1)^p y^p$. It follows that $\phi$ maps $U$ diffeomorphically onto the ball of radius $r$ centered at $0$ in $M_k \equiv \mathbb{R}^d$, and hence $\phi$ and $U$ satisfy the required properties. Theorem 1 may be applied with $b_n$ in (M2) being the $U$-truncated mean of $x_n$. (G1) and (G2) are just (M1) and (M2), and (G3) is $\sum_n E[\|(x_n - I) 1_{H_n} - (b_n - I)\|^2] < \infty$, where $H_n = [\|x_n - I\| \le r]$. Because $E[\|(x_n - I) 1_{H_n} - (b_n - I)\|^2] = E[\|x_n - b_n\|^2 1_{H_n}] + \|b_n - I\|^2 P(H_n^c)$, by (M1), (G3) is equivalent to (M3). ✷
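The inverse formula used in the proof is the Neumann series; here is a quick numerical sanity check (illustrative only).

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.standard_normal((3, 3))
y *= 0.5 / np.linalg.norm(y)           # scale so that ||y|| = 0.5 < 1

# (I + y)^{-1} = I + sum_{p >= 1} (-1)^p y^p, valid for ||y|| < 1
inv = np.eye(3)
term = np.eye(3)
for p in range(1, 60):
    term = term @ (-y)                 # (-1)^p y^p
    inv += term
print(np.linalg.norm(inv - np.linalg.inv(np.eye(3) + y)))  # ~ 1e-15
```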
Example 1: Let $y_n$ be a sequence of independent random variables in $M_k \equiv \mathbb{R}^d$, $d = k^2$. Assume $x_n = I + y_n$ is almost surely invertible; note that this holds if $y_n$ has a continuous distribution. Also assume that for some $r \in (0, 1)$, $E(y_n 1_{[\|y_n\| \le r]}) = 0$ for all $n$. Then $\hat x_n = x_1 x_2 \cdots x_n$ converges to an invertible random matrix $x_\infty$ almost surely if

$\sum_{n=1}^\infty E(\|y_n\|^2) < \infty$.   (2)

To prove this claim, note that $b_n$ in (M2) is $I$, so (M2) holds trivially. Now (M1) is $\sum_{n=1}^\infty P(\|y_n\| > r) < \infty$ and (M3) is $\sum_{n=1}^\infty E[\|y_n\|^2 1_{[\|y_n\| \le r]}] < \infty$. Because $P(\|y_n\| > r) \le E(\|y_n\|^2)/r^2$, both (M1) and (M3) are implied by (2). By Theorem 2, $\hat x_n$ converges almost surely in the matrix group $G$.
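A simulation consistent with Example 1 (an editor's illustration; the uniform distribution below is an ad hoc choice satisfying the hypotheses): with $y_n$ symmetric and of scale $1/n$, condition (2) holds and the partial products visibly become Cauchy.

```python
import numpy as np

rng = np.random.default_rng(3)
k = 2
prod = np.eye(k)
prev = np.eye(k)
for n in range(1, 100001):
    y = rng.uniform(-1.0, 1.0, (k, k)) / n  # symmetric, so E[y_n 1_{||y_n|| <= r}] = 0;
    prod = prod @ (np.eye(k) + y)           # E||y_n||^2 = O(n^{-2}), so (2) holds
    if n in (10, 1000, 100000):
        print(n, np.linalg.norm(prod - prev))  # successive gaps shrink
        prev = prod.copy()
```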
Example 2: Let $y_n$ be independent random variables in $M_k \equiv \mathbb{R}^d$, $d = k^2$. Assume $y_n$ is normal with mean $0$. Then $\hat x_n = (I + y_1) \cdots (I + y_n)$ converges almost surely in the matrix group $G$ if and only if (2) holds. To prove this, note that by the symmetry of a normal distribution, $E(y_n 1_{[\|y_n\| \le r]}) = 0$ for all $r > 0$. By Example 1, (2) is a sufficient condition for the almost sure convergence of $\hat x_n$ in $G$. To see that it is also necessary, it suffices to show that (2) is implied by $\sum_n E[\|y_n\|^2 1_{[\|y_n\| \le r]}] < \infty$ and $\sum_n P(\|y_n\| > r) < \infty$. This can be done by an elementary computation with the normal distribution.
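For the scalar case $k = 1$, the elementary computation can be sketched as follows (an editor's sketch; the general case is similar but notationally heavier). For $y \sim N(0, \sigma^2)$, a direct integration gives

$E[y^2 1_{[|y| > r]}] = 2\sigma^2 \big[\tfrac{r}{\sigma}\varphi_0(\tfrac{r}{\sigma}) + \Phi(-\tfrac{r}{\sigma})\big]$,

where $\varphi_0$ and $\Phi$ denote the standard normal density and distribution function (not to be confused with the chart $\phi$). The bracket tends to $0$ as $\sigma \to 0$. The two convergent series force $E[y_n^2 1_{[|y_n| \le r]}] \to 0$ and $P(|y_n| > r) \to 0$, which together force $\sigma_n \to 0$; then for all large $n$, $E[y_n^2 1_{[|y_n| > r]}] \le \frac{1}{2} E[y_n^2]$, so $E[y_n^2] \le 2 E[y_n^2 1_{[|y_n| \le r]}]$, and (2) follows by comparison.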
Example 3: Let $y_n$ be a sequence of independent random variables in $M_k \equiv \mathbb{R}^d$, $d = k^2$. Assume there is $r > 0$, which may be chosen arbitrarily small, such that $E(y_n 1_{[\|y_n\| \le r]}) = 0$ for all $n$. Then $\exp(y_1) \exp(y_2) \cdots \exp(y_n)$ converges in the matrix group $G$ almost surely if (2) holds. To prove this, apply Theorem 1 to $x_n = \exp(y_n)$ with $\phi = \exp^{-1}$ on $U$, where $U$ is the diffeomorphic image of a small ball in $M_k \equiv \mathbb{R}^d$ under $\exp$. The conditions may be verified as in Example 1.
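A numerical companion to Example 3 (an editor's illustration; scipy's expm computes the matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
k = 2
prod = np.eye(k)
prev = np.eye(k)
for n in range(1, 20001):
    y = rng.uniform(-1.0, 1.0, (k, k)) / n  # symmetric => truncated means vanish; (2) holds
    prod = prod @ expm(y)
    if n in (100, 2000, 20000):
        print(n, np.linalg.norm(prod - prev))  # partial products stabilize
        prev = prod.copy()
```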
3 Sufficiency

For any sequence of independent random variables $x_n$ in $G$, by the Borel-Cantelli Lemma, if (G1) holds, then almost surely $x_n \in U$ except for finitely many $n$. On the other hand, if $\hat x_n = x_1 x_2 \cdots x_n$ converges almost surely, then because $x_n = \hat x_{n-1}^{-1} \hat x_n \to e$, (G1) follows from the Borel-Cantelli Lemma. Set $x_n' = x_n$ on $[x_n \in U]$ and $x_n' = e$ on $[x_n \in U^c]$. Then the almost sure convergence of $x_1 x_2 \cdots x_n$ is equivalent to that of $x_1' x_2' \cdots x_n'$ together with (G1). Note that $\phi(x_n) 1_{[x_n \in U]} = \phi(x_n') = \phi(x_n') 1_{[x_n' \in U]}$, and all quantities in (G2) and (G3) (including $b_n$) depend only on the restriction of $x_n$ to $U$. Therefore, (G2) and (G3) hold for $x_n$ if and only if they hold for $x_n'$. Thus, as noted in [6], to prove Theorem 1, we may, and will, assume all $x_n \in U$, and prove that $\hat x_n$ converges almost surely in $G$ if and only if (G2) and (G3) hold.

We will prove the sufficiency part of Theorem 1 in this section, and so assume (G2) and (G3). Let $\mu_n$ be the distribution of $x_n$. Because $x_n \in U$, the $U$-truncated mean $b_n$ of $x_n$ is defined by $\phi(b_n) = \mu_n(\phi)$, where $\mu_n(\phi) = \int \phi\, d\mu_n = E[\phi(x_n)]$. Set $\hat x_0 = \hat b_0 = e$. For $n \ge 1$, let $z_n = \hat b_{n-1} x_n b_n^{-1} \hat b_{n-1}^{-1}$ and $\hat z_n = z_1 z_2 \cdots z_n$, and set $\hat z_0 = e$. It is easy to show by a simple induction on $n$ that for all $n \ge 0$,

$\hat x_n = \hat z_n \hat b_n$.   (3)

By (G2), it suffices to show that $\hat z_n$ converges in $G$ almost surely. Note that for $G = \mathbb{R}^d$, $z_n$ is just the centered term $x_n - b_n$, and $\hat z_n = \hat x_n - \hat b_n$ is the sum of the centered terms. To have $\hat x_n = \hat z_n \hat b_n$ on a non-commutative multiplicative Lie group $G$, $z_n$ has to be defined in the above rather complicated form. By the lemma below, the almost sure convergence of $\hat z_n$ is equivalent to $z_m z_{m+1} \cdots z_n \to e$ almost surely as $m \to \infty$ with $m < n$.
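The identity (3) telescopes: $\hat z_n \hat b_n = \hat z_{n-1} \hat b_{n-1} x_n b_n^{-1} \hat b_{n-1}^{-1} \hat b_{n-1} b_n = \hat x_{n-1} x_n$. A small numerical check (an editor's illustration, with generic invertible matrices standing in for $G$):

```python
import numpy as np

rng = np.random.default_rng(5)
k, N = 3, 6
inv = np.linalg.inv
xs = [np.eye(k) + 0.1 * rng.standard_normal((k, k)) for _ in range(N)]
bs = [np.eye(k) + 0.1 * rng.standard_normal((k, k)) for _ in range(N)]  # stand-ins for b_n

xhat, bhat, zhat = np.eye(k), np.eye(k), np.eye(k)
for x, b in zip(xs, bs):
    z = bhat @ x @ inv(b) @ inv(bhat)  # z_n = bhat_{n-1} x_n b_n^{-1} bhat_{n-1}^{-1}
    zhat = zhat @ z
    bhat = bhat @ b
    xhat = xhat @ x
print(np.linalg.norm(xhat - zhat @ bhat))  # ~ 1e-14, verifying (3)
```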
Lemma 3. Let $u_n$ be independent random variables in $G$. Then $u_1 u_2 \cdots u_n$ converges almost surely as $n \to \infty$ if and only if $u_m u_{m+1} \cdots u_n \to e$ almost surely as $m \to \infty$ with $m < n$.
Proof: This is an easy consequence of the existence of a complete metric on $G$ that is invariant under left translations and is compatible with the topology of $G$. The metric can be any left invariant Riemannian metric on $G$. ✷

For any $f \in C_c^\infty(G)$, let $M_0 f = f(e)$ and, for $n \ge 1$, let

$M_n f = f(\hat z_n) - \sum_{p=1}^n \int [f(\hat z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1}) - f(\hat z_{p-1})]\, \mu_p(dx)$.   (4)
Lemma 4. Let $\mathcal F_n$ be the $\sigma$-algebra generated by $x_1, x_2, \ldots, x_n$. Then $E[M_n f \mid \mathcal F_m] = M_m f$ for $m < n$; that is, $M_n f$ is a martingale under the filtration $\{\mathcal F_n\}$.

Proof: Because the $x_n$ are independent, for $m < p$,

$E\big[\int f(\hat z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1})\, \mu_p(dx) \mid \mathcal F_m\big] = E\big[\int f(\hat z_m z_{m+1} \cdots z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1})\, \mu_p(dx) \mid \mathcal F_m\big] = E[f(\hat z z_{m+1} \cdots z_{p-1} z_p)]\big|_{\hat z = \hat z_m} = E[f(\hat z_p) \mid \mathcal F_m]$.

Then $E\big[\int [f(\hat z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1}) - f(\hat z_{p-1})]\, \mu_p(dx) \mid \mathcal F_m\big] = E[f(\hat z_p) - f(\hat z_{p-1}) \mid \mathcal F_m]$, and summing over $p = m+1, \ldots, n$ yields $E[M_n f \mid \mathcal F_m] = M_m f$. ✷

Fix an integer $m > 0$ and a neighborhood $V$ of $e$. Let $f \in C_c^\infty(G)$ be such that $0 \le f \le f(e) = 1$ and $f(x) = 0$ for $x \in V^c$. For $g \in G$, let $l_g$ be the left translation $x \mapsto gx$ on $G$, and let $f_m = f \circ l_{\hat z_m^{-1}}$. Let $\Lambda(m, V)$ be the event that there is $n > m$ such that $z_{m+1} z_{m+2} \cdots z_n \in V^c$. To estimate $P[\Lambda(m, V)]$, let $\tau$ be the first time $n > m$ such that $z_{m+1} z_{m+2} \cdots z_n \in V^c$, and set $\tau = \infty$ if $z_{m+1} z_{m+2} \cdots z_n \in V$ for all $n > m$. Then

$P[\Lambda(m, V)] = E\{[f_m(\hat z_m) - f_m(\hat z_\tau)] 1_{\Lambda(m,V)}\} = \lim_{n \to \infty} E\{[f_m(\hat z_m) - f_m(\hat z_{\tau \wedge n})] 1_{\Lambda(m,V)}\}$,   (5)

where $\tau \wedge n = \min(\tau, n)$. Because $E\{[f_m(\hat z_m) - f_m(\hat z_{\tau \wedge n})] 1_{\Lambda(m,V)}\} \le E[1 - f_m(\hat z_{\tau \wedge n})] = E[f_m(\hat z_m) - f_m(\hat z_{\tau \wedge n})]$ and $E[M_{\tau \wedge n} f_m] = E\{E[M_{\tau \wedge n} f_m \mid \mathcal F_m]\} = E[M_m f_m]$,

$E\{[f_m(\hat z_m) - f_m(\hat z_{\tau \wedge n})] 1_{\Lambda(m,V)}\} \le -E\Big\{\sum_{p=m+1}^{\tau \wedge n} \int [f_m(\hat z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1}) - f_m(\hat z_{p-1})]\, \mu_p(dx)\Big\} \le \sum_{p=m+1}^\infty E\Big\{\Big|\int [f_m(\hat z_{p-1} \hat b_{p-1} x b_p^{-1} \hat b_{p-1}^{-1}) - f_m(\hat z_{p-1})]\, \mu_p(dx)\Big|\Big\}$.   (6)

We will write $\hat z, \hat b, b, \mu$ for $\hat z_{p-1}, \hat b_{p-1}, b_p, \mu_p$ for simplicity. For $x \in U$, by the Taylor expansion of $f_m(\hat z \hat b x b^{-1} \hat b^{-1}) = f_m(\hat z \hat b \phi^{-1}(\phi(x)) b^{-1} \hat b^{-1})$ in $\phi(x)$ at $x = b$, noting $\mu(U^c) = 0$,

$\int [f_m(\hat z \hat b x b^{-1} \hat b^{-1}) - f_m(\hat z)]\, \mu(dx) = \int \Big\{\sum_i f_i(\hat z, \hat b, b) [\phi_i(x) - \phi_i(b)]\Big\}\, \mu(dx) + r$,   (7)

where

$f_i(\hat z, \hat b, b) = \frac{\partial}{\partial \phi_i} f_m(\hat z \hat b \phi^{-1}(\phi(x)) b^{-1} \hat b^{-1})\Big|_{x=b}$   (8)

and the remainder $r$ satisfies $|r| \le c\, \mu(\|\phi - \phi(b)\|^2)$ for some constant $c > 0$.
Because $\mu(\phi_i) = \phi_i(b)$, $\int [\phi_i(x) - \phi_i(b)]\, \mu(dx) = 0$, and then by (7),

$\Big|\int [f_m(\hat z \hat b x b^{-1} \hat b^{-1}) - f_m(\hat z)]\, \mu(dx)\Big| = |r| \le c\, \mu(\|\phi - \phi(b)\|^2)$.   (9)

It now follows from (5) and (6) that $P[\Lambda(m, V)] \le c \sum_{n=m+1}^\infty \mu_n(\|\phi - \phi(b_n)\|^2)$. Let $\varepsilon \in (0, 1)$ and let $V_k$ be a sequence of neighborhoods of $e$ with $V_k \downarrow \{e\}$ as $k \uparrow \infty$. By (G3), for each $k \ge 1$, there is an integer $m_k$ such that $P[\Lambda(m_k, V_k)] < \varepsilon^k$. Then $\sum_{k=1}^\infty P[\Lambda(m_k, V_k)] \le \sum_{k=1}^\infty \varepsilon^k = \varepsilon/(1 - \varepsilon)$. By Lemma 3,

$P(\hat z_n \text{ converges}) \ge P\big[\cap_{k=1}^\infty \Lambda(m_k, V_k)^c\big] \ge 1 - \sum_{k=1}^\infty P[\Lambda(m_k, V_k)] \ge 1 - \varepsilon/(1 - \varepsilon) \to 1$ as $\varepsilon \to 0$.

This proves that $\hat z_n$ converges almost surely.
4 Necessity, part 1

We will now prove (G2) and (G3) under the assumption that $\hat x_n$ converges almost surely and all $x_n \in U$. This proof is more complicated and will require another section. Because $x_n = \hat x_{n-1}^{-1} \hat x_n \to e$ almost surely, by the Borel-Cantelli Lemma,

for any neighborhood $V$ of $e$, $\sum_{n=1}^\infty P(x_n \in V^c) < \infty$.   (10)

We also have

$b_n \to e$ and $\mu_n(\|\phi - \phi(b_n)\|^2) \to 0$ as $n \to \infty$.   (11)

For $m < n$, let $\hat x_{m,n} = x_{m+1} x_{m+2} \cdots x_n$ and $\hat b_{m,n} = b_{m+1} b_{m+2} \cdots b_n$. If either (G2) or (G3) does not hold, then there are a neighborhood $V$ of $e$ with $VV \subset U$, a constant $\varepsilon > 0$, and integers $m_k < n_k$ with $m_k \uparrow \infty$ as $k \uparrow \infty$, such that for each $k \ge 1$, either $\sum_{p=m_k+1}^{n_k} \mu_p(\|\phi - \phi(b_p)\|^2) \ge \varepsilon$ or $\hat b_{m_k, n_k} \in V^c$. Because of (11), by choosing $m_1$ large enough, we have $b_n \in V$ and $\mu_n(\|\phi - \phi(b_n)\|^2) \le \varepsilon$ for $n > m_1$. Thus, by suitably reducing $n_k$, we obtain that for each $k \ge 1$, either

(i) $\varepsilon \le \sum_{p=m_k+1}^{n_k} \mu_p(\|\phi - \phi(b_p)\|^2) \le 2\varepsilon$, and $\hat b_{m_k, p} \in U$ for $m_k < p \le n_k$; or

(ii) $\sum_{p=m_k+1}^{n_k} \mu_p(\|\phi - \phi(b_p)\|^2) \le 2\varepsilon$, $\hat b_{m_k, n_k} \in V^c$, and $\hat b_{m_k, p} \in U$ for $m_k < p \le n_k$.

We will derive a contradiction from either (i) or (ii) above. The plan is to embed the partial products $\hat x_{m_k, p}$ and $\hat b_{m_k, p}$, for $m_k < p \le n_k$, into a step process $\tilde x^k_t$ and a step function $\tilde b^k_t$ on $[0, \infty)$, to introduce a process $\tilde z^k_t$ defined by $\tilde x^k_t = \tilde z^k_t \tilde b^k_t$, to use a martingale property similar to that for $\hat z_n$ in the last section to show that the limit $\tilde z_t$ of $\tilde z^k_t$ satisfies an integral equation, and then to derive a contradiction. This is similar to the approaches in [3, 5] for processes in Lie groups with independent increments.
Let $\gamma_k$ be a strictly increasing function from $\{m_k, m_k+1, \ldots, n_k\}$ into $[0, 1]$ with $\gamma_k(m_k) = 0$ and $\gamma_k(n_k) = 1$. Let $t_{k,p} = \gamma_k(p)$ for $m_k \le p \le n_k$; then $t_{k, m_k} = 0$ and $t_{k, n_k} = 1$. Let $\tilde x^k_t = \tilde b^k_t = e$ for $0 \le t < t_{k, m_k+1}$. For $m_k < p < n_k$ and $t_{k,p} \le t < t_{k, p+1}$, let

$\tilde x^k_t = \hat x_{m_k, p}$ and $\tilde b^k_t = \hat b_{m_k, p}$.   (12)

Set $\tilde x^k_t = \hat x_{m_k, n_k}$ and $\tilde b^k_t = \hat b_{m_k, n_k}$ for $t \ge 1$.
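One concrete choice of $\gamma_k$, consistent with how it is used below (an editor's sketch; the source does not pin the choice down at this point), makes $t_{k,p}$ essentially proportional to the accumulated truncated variances, which yields the increment bound (16) below:

```python
import numpy as np

def gamma_k(variances, eps=1e-12):
    """Given v_p = mu_p(||phi - phi(b_p)||^2) for p = m_k+1, ..., n_k,
    return times t_{k,m_k}, ..., t_{k,n_k} in [0, 1] proportional to the
    accumulated variance, so that Q_k(t) - Q_k(s) is bounded by
    (total variance) * (t - s) plus a single largest term."""
    v = np.asarray(variances, dtype=float) + eps  # eps keeps gamma_k strictly increasing
    return np.concatenate(([0.0], np.cumsum(v) / np.sum(v)))

print(gamma_k([0.1, 0.3, 0.0, 0.2]))  # array of times ending at 1.0
```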
Then $\tilde x^k_t$ and $\tilde b^k_t$ are respectively a step process and a step function, which are right continuous with jumps $x_p$ and $b_p$ at $t = t_{k,p}$. Note that by Lemma 3, almost surely, $\tilde x^k_t \to e$ as $k \to \infty$ uniformly in $t$.

A continuous function $A(t) = \{A_{ij}(t)\}$ from $\mathbb{R}_+ = [0, \infty)$ to the space of $d \times d$ symmetric real matrices is called a covariance matrix function if $A(0) = 0$ and, for $s < t$, $A(t) - A(s) \ge 0$ (nonnegative definite). For $t \ge 0$, let

$A^k_{ij}(t) = \sum_{p:\, m_k < p \le n_k,\ t_{k,p} \le t} \mu_p\big([\phi_i - \phi_i(b_p)][\phi_j - \phi_j(b_p)]\big)$,   (13)

and let $Q_k(t)$ be the trace of $A^k(t)$. Then

$Q_k(t) = \sum_{p:\, m_k < p \le n_k,\ t_{k,p} \le t} \mu_p(\|\phi - \phi(b_p)\|^2)$.   (14)

By (i) or (ii),

$Q_k(t) \le 2\varepsilon$ for all $t \ge 0$,   (15)

and $\gamma_k$ may be chosen (for instance, with $t_{k,p}$ essentially proportional to the accumulated sums in (14)) so that, for $0 \le s < t$,

$Q_k(t) - Q_k(s) \le 2\varepsilon (t - s) + \delta_k$, where $\delta_k = \max_{m_k < p \le n_k} \mu_p(\|\phi - \phi(b_p)\|^2) \to 0$ as $k \to \infty$ by (11).   (16)

Lemma 5. There is a covariance matrix function $A(t)$ with $A(t) = A(1)$ for $t \ge 1$ such that, along a subsequence of $k \to \infty$, $A^k(t) \to A(t)$ for any $t \ge 0$.

Proof: Let $\Lambda$ be a countable dense subset of $[0, 1]$. By (15) and the Cauchy-Schwarz inequality, each $A^k_{ij}(t)$ is bounded. Hence, along a subsequence of $k \to \infty$, $A^k(t)$ converges for any $t \in \Lambda$. By (16), the convergence holds for all $t \ge 0$, and $A(t)$ is continuous in $t$. ✷

Let $Y$ be a smooth manifold equipped with a compatible metric $\rho$ and let $y\colon [0, \infty) \to Y$ be a continuous function. For each $k$, let $y^k\colon [0, \infty) \to Y$ be a step function that is constant on $[t_{k, p-1}, t_{k,p})$ for each $p = m_k+1, \ldots, n_k$. Assume that for any $t > 0$,

$\rho(y^k(t_{k,p}), y(t_{k,p})) \to 0$ as $k \to \infty$, uniformly for $t_{k,p} \le t$.   (17)

Let $F(y, g) = \{F_{ij}(y, g)\}$ be a bounded continuous matrix-valued function on $Y \times G$.

Lemma 6. Assume the above and let $A(t)$ be the covariance matrix function in Lemma 5. Then for any $t > 0$, along the subsequence of $k \to \infty$ in Lemma 5,

$\sum_{p:\, m_k < p \le n_k,\ t_{k,p} \le t} F_{ij}\big(y^k(t_{k, p-1}),\, b_p\big)\, \mu_p\big([\phi_i - \phi_i(b_p)][\phi_j - \phi_j(b_p)]\big) \to \int_0^t F_{ij}(y(s), e)\, dA_{ij}(s)$.   (18)

We now extend the definition of $\hat z_n$ in the last section to the present setting by setting $\tilde z^k_t = e$ for $0 \le t < t_{k, m_k+1}$, and inductively

$\tilde z^k_t = \tilde z^k_{t_{k, p-1}}\, \tilde b^k_{t_{k, p-1}}\, x_p\, b_p^{-1}\, (\tilde b^k_{t_{k, p-1}})^{-1}$   (19)

for $t_{k,p} \le t < t_{k, p+1}$, $p = m_k+1, \ldots, n_k$, setting $t_{k, n_k+1} = \infty$ here. Then $\tilde z^k_t = \tilde z^k_1$ for $t \ge 1$, and an induction on $p$ shows that $\tilde x^k_t = \tilde z^k_t \tilde b^k_t$ for all $t \ge 0$. For $f \in C_c^\infty(G)$, let $\tilde M^k_t f = f(\tilde z^k_t) = f(e)$ for $0 \le t < t_{k, m_k+1}$, and let

$\tilde M^k_t f = f(\tilde z^k_t) - \sum_{p:\, m_k < p \le n_k,\ t_{k,p} \le t} \int \big[f\big(\tilde z^k_{t_{k, p-1}} \tilde b^k_{t_{k, p-1}} x b_p^{-1} (\tilde b^k_{t_{k, p-1}})^{-1}\big) - f\big(\tilde z^k_{t_{k, p-1}}\big)\big]\, \mu_p(dx)$   (20)

for $t \ge t_{k, m_k+1}$.

Lemma 7. For any $f \in C_c^\infty(G)$, $\tilde M^k_t f$ is a martingale under the filtration $\mathcal F^k_t = \sigma\{x_p:\ t_{k,p} \le t\}$.

Proof: This is proved in the same way as in Lemma 4, where $M_n f$ is shown to be a martingale. ✷

Because $\tilde x^k_t = \tilde z^k_t \tilde b^k_t$ and $\tilde x^k_t \to e$ uniformly in $t$ as $k \to \infty$ almost surely, if $\tilde b^k_t$ converges to some continuous path $\tilde b_t$ in $G$ uniformly in $t$ as $k \to \infty$, then $\tilde z^k_t \to \tilde z_t = \tilde b_t^{-1}$ uniformly in $t$ almost surely. This will be assumed in the rest of this section.

By a computation using the Taylor expansion similar to the one in the last section, but up to the second order, noting that the integrals of the first order terms vanish as before, the sum in (20) may be written as a sum of second order terms of the form appearing on the left hand side of (18), plus a remainder controlled by (15), (16), and (11). Letting $k \to \infty$ along the subsequence in Lemma 5, applying Lemma 6 with $y^k_t = (\tilde z^k_t, \tilde b^k_t)$ and $y_t = (\tilde z_t, \tilde b_t)$, and noting that the limit path $\tilde z_t = \tilde b_t^{-1}$ is non-random, we obtain

$f(\tilde z_t) = f(e) + \sum_{i,j} \int_0^t f_{ij}(\tilde z_s, \tilde b_s, e)\, dA_{ij}(s)$,   (21)

where $f_{ij}$ denotes the coefficient of the second order term in the Taylor expansion, defined analogously to (8). Let $t_0 = \sup\{t \ge 0:\ \tilde z_s = e$ and $A(s) = 0$ for $0 \le s \le t\}$ and suppose $t_0 < 1$. Then (21) holds for $t \ge t_0$ with $\int_0^t$ replaced by $\int_{t_0}^t$. Without loss of generality, we will assume $t_0 = 0$. Substitute $f = \phi_\beta^2$ in (21); then the integrand is $2\delta_{i\beta}\delta_{j\beta} + \varepsilon_s$, where $\varepsilon_s$ denotes any function satisfying $\varepsilon_s \to 0$ as $s \to 0$. It follows that $\phi_\beta(\tilde z_t)^2 = 2A_{\beta\beta}(t) + \varepsilon_t T_t$, where $T_t = \mathrm{Tr}[A(t)]$. Then $\|\phi(\tilde z_t)\|^2 = 2T_t + \varepsilon_t T_t$. Now let $f = \phi_\beta$; then (21) yields $\phi_\beta(\tilde z_t) = \varepsilon_t T_t$. Combining the two relations, for small $t > 0$ with $T_t > 0$ we get $2T_t + \varepsilon_t T_t = \sum_\beta (\varepsilon_t T_t)^2 \le d\, \varepsilon_t^2 T_t^2$, hence $2 + \varepsilon_t \le d\, \varepsilon_t^2 T_t$, which is clearly impossible. Hence $T_t = 0$, and then $\phi(\tilde z_t) = 0$, for all small $t > 0$, contradicting $t_0 = 0$. This shows that $t_0 = 1$, and hence $\tilde z_t = e$ and $A(t) = 0$ for all $t \ge 0$.

If (i) holds, then $\mathrm{Tr}[A(1)] = \lim_k Q_k(1) = \lim_k \sum_{p=m_k+1}^{n_k} \mu_p(\|\phi - \phi(b_p)\|^2) \ge \varepsilon$, which contradicts $A(t) = 0$. Thus (i) cannot hold. If (ii) holds, then $\tilde b_1 = \lim_k \tilde b^k_1 = \lim_k \hat b_{m_k, n_k}$ belongs to the closure of $V^c$, which contradicts $\tilde b_t = \tilde z_t^{-1} = e$. We have proved that neither (i) nor (ii) can hold, and hence (G2) and (G3) must hold, under the assumption that $\tilde b^k_t \to \tilde b_t$ as $k \to \infty$ uniformly in $t$ for some continuous path $\tilde b_t$ in $G$.
5 Necessity, part 2

It remains to show that $\tilde b^k_t \to \tilde b_t$ as $k \to \infty$ uniformly in $t$ for some continuous path $\tilde b_t$ in $G$. An rcll path is a right continuous path with left limits, and a process with rcll paths will be called an rcll process. Let $D(G)$ be the space of rcll paths in $G$. Equipped with the Skorohod metric, $D(G)$ is a complete separable metric space (see [2, chapter 3]). A sequence of rcll processes $y^k_t$ in $G$ is said to converge weakly to an rcll process $y_t$ if $y^k_\cdot \to y_\cdot$ in distribution as $D(G)$-valued random variables. The sequence $y^k_t$ is called relatively weakly compact in $D(G)$ if any subsequence has a further subsequence that converges weakly.

We will show that the $\tilde z^k_t$ are relatively weakly compact. Let $V$ be a neighborhood of $e$. The amount of time it takes an rcll process $y_t$ to make a $V^c$-displacement from a stopping time $\sigma$ (under the natural filtration of the process $y_t$) is denoted by $\tau^\sigma_V$, that is,

$\tau^\sigma_V = \inf\{t > 0:\ y_\sigma^{-1} y_{\sigma+t} \in V^c\}$ (the inf of an empty set is $\infty$).   (22)

For a sequence of processes $y^k_t$ in $G$, let $\tau^{\sigma,k}_V$ be the $V^c$-displacement time for $y^k_t$ from $\sigma$. The following lemma is Lemma 16 in [5] and provides a criterion for relative weak compactness; it is a slightly improved version of a lemma in [3].

Lemma 8. A sequence of rcll processes $y^k_t$ in $G$ is relatively weakly compact in $D(G)$ if, for any constant $T > 0$ and any neighborhood $V$ of $e$,

$\lim_{k \to \infty} \sup_{\sigma \le T} P(\tau^{\sigma,k}_V < \delta) \to 0$ as $\delta \to 0$,   (23)

and

$\lim_{k \to \infty} \sup_{\sigma \le T} P[(y^k_{\sigma-})^{-1} y^k_\sigma \in K^c] \to 0$ as compact $K \uparrow G$,   (24)

where $\sup_{\sigma \le T}$ is taken over all stopping times $\sigma \le T$.

We will apply Lemma 8 to $y^k_t = \tilde z^k_t$. Because $\tilde z^k_t = \tilde z^k_1$ for $t > 1$, we may take $T = 1$ in Lemma 8. Let $f \in C_c^\infty(G)$ be such that $0 \le f \le 1$ on $G$, $f(e) = 1$ and $f = 0$ on $V^c$. For any stopping time $\sigma \le 1$, write $\tau$ for $\tau^{\sigma,k}_V$ and let $f_\sigma = f \circ l_z$ with $z = (\tilde z^k_\sigma)^{-1}$. Then

$P(\tau < \delta) = E[f_\sigma(\tilde z^k_\sigma) - f_\sigma(\tilde z^k_{\sigma+\tau});\ \tau < \delta] \le E[f_\sigma(\tilde z^k_\sigma) - f_\sigma(\tilde z^k_{\sigma+\tau \wedge \delta})]$,   (25)

noting $f_\sigma(\tilde z^k_\sigma) = 1$, $f_\sigma(\tilde z^k_{\sigma+\tau}) = 0$ and $\tau = \tau \wedge \delta$ on $[\tau < \delta]$. Because $\tilde M^k_t f$ given by (20) is a martingale for any $f \in C_c^\infty(G)$, and $\sigma$ and $\sigma + \tau \wedge \delta$ are stopping times,

$E[\tilde M^k_\sigma f_\sigma - \tilde M^k_{\sigma+\tau \wedge \delta} f_\sigma] = E\{E[\tilde M^k_\sigma f_\sigma - \tilde M^k_{\sigma+\tau \wedge \delta} f_\sigma \mid \mathcal F^k_\sigma]\} = 0$.

Writing $\tilde z, \tilde b, b, \mu$ for $\tilde z^k_{t_{k, p-1}}, \tilde b^k_{t_{k, p-1}}, b_p, \mu_p$, by (20) and (25) we obtain

$P(\tau < \delta) \le -E\Big\{\sum_{p:\, \sigma < t_{k,p} \le \sigma + \tau \wedge \delta} \int [f_\sigma(\tilde z \tilde b x b^{-1} \tilde b^{-1}) - f_\sigma(\tilde z)]\, \mu(dx)\Big\} \le c\, E[Q_k(\sigma + \delta) - Q_k(\sigma)]$,

where the last inequality uses the second order Taylor estimate as in (9). By (16), $E[Q_k(\sigma + \delta) - Q_k(\sigma)] \le 2\varepsilon\delta + \delta_k$ with $\delta_k \to 0$. It follows that $\lim_{k \to \infty} \sup_{\sigma \le 1} P(\tau < \delta) \le 2c\varepsilon\delta$. This shows that condition (23) is satisfied for $y^k_t = \tilde z^k_t$.

To verify (24), note that because $\tilde x^k_t = \tilde z^k_t \tilde b^k_t$,

$P[(\tilde z^k_{\sigma-})^{-1} \tilde z^k_\sigma \in K^c] = P[(\tilde x^k_{\sigma-})^{-1} \tilde x^k_\sigma \in (\tilde b^k_{\sigma-})^{-1} K^c \tilde b^k_\sigma]$.

By either (i) or (ii), the $\tilde b^k_t$ are bounded in $k$ (they lie in the relatively compact set $\bar U$), so when $K$ is large, $(\tilde b^k_{\sigma-})^{-1} K \tilde b^k_\sigma$ contains a fixed neighborhood $H$ of $e$. Because $(\tilde b^k_{\sigma-})^{-1} K^c \tilde b^k_\sigma = ((\tilde b^k_{\sigma-})^{-1} K \tilde b^k_\sigma)^c$, it follows that

$P[(\tilde z^k_{\sigma-})^{-1} \tilde z^k_\sigma \in K^c] \le P[(\tilde x^k_{\sigma-})^{-1} \tilde x^k_\sigma \in H^c] \le \sum_{p > m_k} \mu_p(H^c) \to 0$ as $k \to \infty$,

by (10). This verifies (24), even before taking $K \uparrow G$. By Lemma 8, the $\tilde z^k_t$ are relatively weakly compact, and hence along a subsequence of $k \to \infty$, $\tilde z^k_t$ converges weakly to an rcll process $\tilde z_t$ in $G$. As $D(G)$-valued random variables, $\tilde z^k_\cdot$ converges in distribution to $\tilde z_\cdot$.
It is well known (see for example Theorem 1.8 in [2, chapter 3]) that there are $D(G)$-valued random variables $\tilde z'^k_\cdot$ and $\tilde z'_\cdot$, possibly on a different probability space, such that $\tilde z'_\cdot$ is equal to $\tilde z_\cdot$ in distribution, $\tilde z'^k_\cdot$ is equal to $\tilde z^k_\cdot$ in distribution for each $k$, and $\tilde z'^k_\cdot \to \tilde z'_\cdot$ almost surely. Because $\tilde x^k_\cdot = \tilde z^k_\cdot \tilde b^k_\cdot \to e$ almost surely, where $e$ is regarded as a constant path in $G$, $\tilde x'^k_\cdot = \tilde z'^k_\cdot \tilde b^k_\cdot \to e$ in distribution. As the limit $e$ is non-random, $\tilde x'^k_\cdot \to e$ in probability. Then along a further subsequence of $k \to \infty$, $\tilde x'^k_\cdot \to e$ almost surely, and hence $\tilde b^k_\cdot = (\tilde z'^k_\cdot)^{-1} \tilde x'^k_\cdot \to (\tilde z'_\cdot)^{-1}$.

The convergence $\tilde b^k_t \to \tilde b_t = (\tilde z'_t)^{-1}$ under the Skorohod metric means (see Proposition 5.3(c) in [2, chapter 3]) that there are continuous strictly increasing functions $\lambda_k\colon \mathbb{R}_+ \to \mathbb{R}_+$ such that, as $k \to \infty$, $\lambda_k(t) - t \to 0$ and $r(\tilde b^k_t, \tilde b_{\lambda_k(t)}) \to 0$ uniformly for $0 \le t \le 1$, where $r$ is a compatible metric on $G$. If $\tilde b_t$ had a jump of size $r(\tilde b_{s-}, \tilde b_s) > 0$ at some time $s$, then $\tilde b^k_t$ would have a jump of size close to $r(\tilde b_{s-}, \tilde b_s)$ at time $t = \lambda_k^{-1}(s)$, which is impossible because the jumps of $\tilde b^k_t$ are uniformly small when $k$ is large. It follows that $\tilde b_t$ is continuous in $t$, and hence $\tilde b^k_t \to \tilde b_t$ uniformly in $t$ as $k \to \infty$. ✷

Acknowledgement: The author wishes to thank David Applebaum and an anonymous referee for some very helpful comments.