[PDF] Approximations for the boundary crossing probabilities of moving sums of normal random variables

Abstract

In this paper we study approximations for boundary crossing probabilities for the moving sums of i.i.d. normal random variables. We propose approximating a discrete time problem with a continuous time problem allowing us to apply developed theory for stationary Gaussian processes and to consider a number of approximations (some well known and some not). We bring particular attention to the strong performance of a newly developed approximation that corrects the use of continuous time results in a discrete time setting. Results of extensive numerical comparisons are reported. These results show that the developed approximation is very accurate even for small window length.

Full PDF

NNoname manuscript No. (will be inserted by the editor)

Approximations for the boundary crossing probabilities of movingsums of normal random variables

Jack Noonan · Anatoly Zhigljavsky

Received: date / Accepted: date

Abstract

In this paper we study approximations for boundary crossing probabilities for the movingsums of i.i.d. normal random variables. We propose approximating a discrete time problem with acontinuous time problem allowing us to apply developed theory for stationary Gaussian processesand to consider a number of approximations (some well known and some not). We bring particularattention to the strong performance of a newly developed approximation that corrects the use ofcontinuous time results in a discrete time setting. Results of extensive numerical comparisons arereported. These results show that the developed approximation is very accurate even for smallwindow length.

Keywords moving sum · boundary crossing probability · moving sum of normal · change-pointdetection Mathematics Subject Classiﬁcation (2000)

Primary: 60G50, 60G35; Secondary:60G70,94C12, 93E20

Let ε , ε , . . . be a sequence of i.i.d. normal random variables (r.v.) with mean µ and variance σ >

0. For a ﬁxed positive integer L , the moving sums are deﬁned by S n,L := n + L (cid:88) j = n +1 ε j ( n = 0 , , . . . ) . (1.1)The sequence of the moving sums (1.1) will be denoted by S so that S = { S ,L , S ,L , . . . } .The main aim of this paper is development of accurate approximations for the following relatedcharacteristics of S (note that for the sake of simplicity of notation we are not indicating thedependence of these characteristics on L ).(a) The boundary crossing probability (BCP) for the maximum of the moving sums: P S ( M, H ) := Pr (cid:18) max n =0 , ,...,M S n,L ≥ H (cid:19) , (1.2)where M is a given positive integer and H is a ﬁxed threshold. Note that the total number ofr.v. ε i used in (1.2) is M + L and P S ( M, H ) → M → ∞ , for all H and L . Jack NoonanSchool of Mathematics, Cardiﬀ University, Cardiﬀ, CF24 4AG, UKE-mail: [email protected] ZhigljavskySchool of Mathematics, Cardiﬀ University, Cardiﬀ, CF24 4AG, UKE-mail: ZhigljavskyAA@cardiﬀ.ac.uk a r X i v : . [ m a t h . S T ] A p r Jack Noonan, Anatoly Zhigljavsky (b) The probability distribution of the moment of time τ H ( S ) := min { n ≥ S n,L ≥ H } when thesequence S n,L reaches the threshold H for the ﬁrst time. The BCP P S ( M, H ), considered as afunction of M , is the c.d.f. of this probability distribution: P S ( M, H ) = Pr ( τ H ( S ) ≤ M ).(c) The average run length (ARL) until S reaches H for the ﬁrst time:ARL H ( S ) := ∞ (cid:88) n =0 n Pr { τ H = n } = (cid:90) ∞ M d P S ( M, H ) . (1.3)Developing accurate approximations for the BCP P S ( M, H ) and the associated ARL (1.3) forgeneric parameters H , M , L is very important in various areas of statistics, predominantly inapplications related to change-point detection; see, for example, papers [2, 7, 12] and [8]. The BCP P S ( M, H ) is an ( M + 1)-dimensional integral and therefore direct evaluation of this BCP is hardlypossible even with modern software.To derive approximations for the BCP (1.2) one can use some generic approximations such asDurbin and Poisson Clumping Heuristic considered below. These approximations, however, are notaccurate especially for small window length L ; this is demonstrated below in this paper. There is,therefore, a need for derivation of speciﬁc approximations for the BCP (1.2) and the ARL (1.3). Sucha need was well understood in the statistical community and indeed very accurate approximationsfor the BCP and the ARL have been developed in a series of papers by J. Glaz and coauthors, seefor example [6] and [7]. We will call these approximations ’Glaz approximations’ by the name ofthe main author of these papers; they will be formally written down in Sections 4.5 and 5.The accuracy of the approximations developed in the present paper is very similar to the Glazapproximations; this is discussed in Section 5. The methodologies of derivation of the approxima-tions are very diﬀerent, however. The advantage of the approximation developed in this paper overthe Glaz approximation is the fact that our approximation is explicit and hence can be computedinstantly; on the other hand, to compute the Glaz approximation one needs to numerically approx-imate L + 1 and 2 L + 1 dimensional integrals, which is not an easy task even taking into accountthe fact of existence of a sophisticated software.To derive the approximations, in Sections 3.2 and 4.2 we have used the methodology developedin [19, Ch.2, §

2] for continuous-time case, which has to be modiﬁed for discrete time. To do this, inSections 3.3, 4.3 and 4.4 we have revised and specialized the approach developed by D.Siegmundin [18] and other papers.The paper is structured as follows. In Sections 2, 3 and 4 we reformulate the problem anddiscuss how to approximate our discrete-time problem with a continuous-time problem. Here westate a number of known approximations and derive a new approximation that corrects the use ofcontinuous time results in a discrete time setting; this will be referred to as the ‘Corrected DiﬀusionApproximation’ or simply CDA. In Sections 3.4 and 4.6 we present results of large simulation studiesevaluating the performance of the considered approximations. In Section 5, we develop the CDAfor ARL H ( S ) and assess its accuracy. S n,L deﬁned in (1.1).The ﬁrst two moments of S n,L are E S n,L = µL, var( S n,L ) = σ L. (2.1)Deﬁne the standardized r.v.’s: ξ n := S n,L − E S n,L (cid:112) var( S n,L ) = S n,L − µLσ √ L , n = 0 , , . . . , (2.2) pproximations of boundary crossing probabilities 3 and denote X = { ξ , ξ , . . . , } . All r.v. ξ n are N (0 , ϕ ( x ) := 1 √ π e − x / , Φ ( t ) := (cid:90) t −∞ ϕ ( x ) dx . (2.3)Unlike the original r.v. ε i , the r.v. ξ , ξ , . . . are correlated with correlations depending on L , seeSection 2.2 below.The BCP P S ( M, H ) deﬁned by (1.2) is equal to the BCP P X ( M, h ) := Pr (cid:18) max n =0 , ,...,M ξ n ≥ h (cid:19) , (2.4)where H = µL + σh √ L so that h = H − µLσ √ L . (2.5)Similarly, τ H ( S ) = τ h ( X ) and ARL H ( S ) = ARL h ( X ).In what follows, we derive approximations for (2.4) and hence the distribution of τ h ( X ) andARL h ( X ). These approximations will be based on approximating the sequence { ξ i } i by a continuoustime random process and subsequently correcting the obtained approximations for discreteness.2.2 Correlation between ξ n and ξ n + k In order to derive our approximations, we will need explicit expressions for the correlations Corr( ξ n , ξ n + k ). Lemma 1

Let ξ n be as deﬁned in (2.2) . Then Corr( ξ , ξ k ) = Corr( ξ n , ξ n + k ) and Corr( ξ , ξ k ) = E ( ξ ξ k ) − ( E ξ ) var( ξ ) = 1 − k/L . (2.6) for ≤ k ≤ L . If k > L then Corr( ξ , ξ k ) = 0 . For a proof, see Appendix A.2.3 Continuous-time (diﬀusion) approximationFor the purpose of approximating the BCP P X ( M, h ) and the associated characteristics introducedin Introduction, we replace the discrete-time process ξ , . . . , ξ M with a continuous process ζ ( t ), t ∈ [0 , T ], where T = M/L . We do this as follows.Set ∆ = 1 /L and deﬁne t n = n∆ ∈ [0 , T ] n = 0 , , . . . , M. Deﬁne a piece-wise linear continuous-time process ζ ( L ) t , t ∈ [0 , T ] : ζ ( L ) t = 1 ∆ [( t n − t ) ξ n − +( t − t n − ) ξ n ] for t ∈ [ t n − , t n ] , n = 1 , . . . , M. By construction, the process ζ ( L ) t is such that ζ ( L ) t n = ξ n for n = 0 , . . . , M . Also we have that ζ ( L ) t is asecond-order stationary process in the sense that E ζ ( L ) t , var( ζ ( L ) t ) and the autocorrelation function R ( L ) ζ ( t, t + k∆ ) = Corr( ζ ( L ) t , ζ ( L ) t + k∆ ) do not depend on t . Lemma 2

Assume L → ∞ . The limiting process ζ t = lim L →∞ ζ ( L ) t , where t ∈ [0 , T ] , is a Gaus-sian second-order stationary process with marginal distribution ζ t ∼ N (0 , for all t ∈ [0 , T ] andautocorrelation function R ζ ( t, t + s ) = R ( s ) = max { , −| s |} . This lemma is a simple consequence of Lemma 1.

Jack Noonan, Anatoly Zhigljavsky S with a continuous process ζ t , t ∈ [0 , T ], allowsus to approximate the characteristics introduced in Introduction by the continuous-time analoguesas follows.(a) BCP P X ( M, h ) is approximated by P ζ ( T, h ), which is the probability of reaching the threshold h by the process ζ t on the interval [0 , T ]: P ζ ( T, h ) := Pr (cid:26) max ≤ t ≤ T ζ t ≥ h (cid:27) = Pr (cid:110) ζ t ≥ h for at least one t ∈ [0 ,T ] (cid:111) . (2.7)Note that P ζ (0 , h ) = 1 − Φ ( h ) > τ H ( S ) = τ h ( X ) is approximated by τ h ( ζ t ) := min { t ≥ ζ t ≥ h } , which isthe time moment when the process ζ t reaches h . The distribution of τ h ( ζ t ) has the form:(1 − Φ ( h )) δ ( ds ) + q ( s, h, ζ t ) ds , s ≥ , where δ ( ds ) is the delta-measure concentrated at 0 and q ( s, h, ζ t ) = dds P ζ ( s, h ) , < s < ∞ . (2.8)The function q ( s, h, ζ t ) /Φ ( h ), considered as a function of s , is a probability density function on(0 , ∞ ) since (cid:90) ∞ q ( s, h, ζ t ) ds = 1 − P ζ (0 , h ) = Φ ( h ) . (c) ARL H ( X ) /L is approximated byARL h ( ζ t ) = E ( τ h ( ζ t )) = (cid:90) ∞ s q ( s, h, ζ t ) ds . (2.9)We will call approximations (2.7) and (2.9) diﬀusion approximations, see Section 3.1. Numericalresults discussed in Section 4.6 show that if L and M are very large then the diﬀusion approximationsare rather accurate. For not very large values of L and M these approximations will be muchimproved with the help of the methodology developed by D.Siegmund and adapted to our setup inSections 3.3 and 4.3.2.5 Durbin and Poisson Clumping approximations for the BCP P ζ ( T, h )Derivation of the exact formulas for the BCP P ζ ( T, h ) has been discussed in several papers including[10, 14, 15, 16, 19]; exact formulas will be provided in Sections 3.1 and 4.1.In this section, we provide explicit formulas for two simple approximations for the BCP P ζ ( T, h )based on general principles; see also Section 4.5 for an approximation specialized for the setup ofmoving sums. We will assess the accuracy of these approximations in Section 3.4 and will ﬁnd thatthe accuracy of both of them is quite poor. The purpose of including these two approximationsinto our collection is only to demonstrate that the original problems stated in Introduction are noteasy and cannot be handled by general-purpose techniques. More sophisticated techniques usingthe speciﬁcity of the problem should be used, which is exactly what is done in this paper. The ﬁrstgeneric approximation considered is the Durbin approximation which is constructed on the base of[3] and is explained in Appendix B.

Approximation 1.

Durbin approximation for the BCP (2.7) : P ζ ( T, h ) ∼ = hT ϕ ( h ) . Let us now state the second approximation for the BCP deﬁned in (2.7), which is the PoissonClumping Heuristic (PCH) formulated as Lemma 3 according to [1] p. 81. pproximations of boundary crossing probabilities 5

Lemma 3

Let X ( t ) be a stationary Gaussian process with mean zero and covariance functionsatisfying R ( t ) = 1 − | t | as t → . Then for large h , T h = min { t : X ( t ) ≥ h } is approximatelyexponential with parameter hϕ ( h ) . From Lemma 3 we obtain:

Approximation 2.

PCH approximation for BCP (2.7) : P ζ ( T, h ) ∼ = 1 − exp( − hϕ ( h ) T ) . As can be seen from Fig. 1 and Fig. 2 in Section 3.4, Approximations 1 and 2 are poor approx-imations for P ζ ( T, h ) and P X ( M, h ) when

M/L ≤

1; the case

M/L > M ≤ L In this section, we assume M ≤ L and hence T = M/L ≤

1. The more complicated case

M > L will be considered in Section 4.3.1 Diﬀusion approximation, formulationHere we collect explicit formulas for the BCP P ζ ( T, h ); the proofs are given in Section 3.2.We have: P ζ ( T, h ) = 1 − Φ ( h ) + ϕ ( h ) (cid:2) hΦ ( h ) + ϕ ( h ) (cid:3) , T = 1 ; (3.1) P ζ ( T, h ) = 1 − (cid:82) h −∞ Φ (cid:16) h ( Z +1) − x ( − Z +1)2 √ Z (cid:17) ϕ ( x ) dx ++ √ ZZ +1 ϕ ( h ) (cid:104) h √ Z Φ ( h √ Z ) + √ π ( √ πϕ ( h )) Z (cid:105) , < T ≤ , (3.2)where Z = T / (2 − T ). If T = 1 then (3.2) simpliﬁes to (3.1). We refer to the above stated formulasfor P ζ ( T, h ) as Approximation 3 or ‘Diﬀusion approximation’.

Approximation 3.

The Diﬀusion approximation for the BCP P X ( M, h ) deﬁned in (2.4) in case M ≤ L : formula (3.2) with T = M/L ; if M = L then (3.2) reduces to (3.1) . In Section 3.3, we will derive a discrete-time correction for the Diﬀusion approximation. Inorder to do this, we need to correct the steps used for deriving (3.2). This explains that, despitethe formula (3.2) is known, we need to derive it (in order to correct certain steps of its derivation).This is done in the next section which follows [19], p.69.3.2 Derivation of (3.2)

From Lemma 2, { ζ t , t ∈ [0 , ∞ ) } , is a stationary Gaussian process with mean E ζ t = 0 and covariancefunction E ζ t ζ t + u = max { , − | u |} . By conditioning on the initial state of the process ζ t , we deﬁne Q h ( T, x ) := Pr (cid:26) max t ∈ [0 ,T ] ζ t > h | ζ = x (cid:27) . Since x ∼ N (0 ,

1) the BCP P ζ ( T, h ) is P ζ ( T, h ) = (cid:90) h −∞ Q h ( T, x ) ϕ ( x ) dx + 1 − Φ ( h ) , (3.3)where ϕ ( · ) and Φ ( · ) are deﬁned in (2.3). In order to proceed we seek an explicit expression for Q h ( T, x ). We shall ﬁrstly discuss a known BCP formula for the Brownian motion before returningto explicit evaluation of Q h ( T, x ). Jack Noonan, Anatoly Zhigljavsky

Let W ( t ) be the standard Brownian Motion process on [0 , ∞ ) with W (0) = 0 and E W ( t ) W ( s ) =min( t, s ) . For given a, R > b ∈ R , deﬁne P W ( R ; a, b ) := Pr { W ( t ) > a + bt for at least one t ∈ [0 , R ] } , (3.4)which is the probability that the Brownian motion W ( t ) reaches a sloped boundary a + bt withinthe time interval [0 , R ]. Using results of [18], for any a, R > b we have P W ( R ; a, b ) = 1 − Φ (cid:18) bR + a √ R (cid:19) + e − ab Φ (cid:18) bR − a √ R (cid:19) . (3.5)In particular, for R = 1 we have P W (1; a, b ) = 1 − Φ ( b + a ) + e − ab Φ ( b − a ) . (3.6) ζ t . Let { ζ ( t ) , t ∈ [0 , ∞ ) } be a process obtained by considering only the sample functions of { ζ t , t ∈ [0 , ∞ ) } which are equal to x at t = 0. For 0 ≤ t ≤

1, we obtain from [10], p.520, that ζ ( t ) canbe expressed in terms of the Brownian motion: ζ ( t ) = (2 − t ) W ( g ( t )) + x (1 − t ) (3.7)with g ( t ) = t/ (2 − t ). It then follows from (3.7) that for T ≤ x < h we have Q h ( T, x ) =Pr { ζ ( t ) ≥ h for at least one t ∈ [0 , T ] } =Pr (cid:26) W ( g ( t )) ≥ h − x (1 − t )2 − t for at least one t ∈ [0 , T ] (cid:27) . Noting that t = 2 g ( t ) / (1 + g ( t )) we obtain Q h ( T, x ) =Pr (cid:26) W ( g ( t )) ≥ (cid:18) ( h − x )(1 + g ( t ))2 (cid:19) + x g ( t ) for at least one t ∈ [0 , T ] (cid:27) =Pr (cid:26) W ( t (cid:48) ) ≥ (cid:18) h − x (cid:19) + t (cid:48) (cid:18) h + x (cid:19) for at least one t (cid:48) ∈ (cid:20) , T − T (cid:21)(cid:27) = P W ( Z ; a, b ) , (3.8)where Z = T / (2 − T ), b = ( h + x ) / a = ( h − x ) /

2. Using (3.5), we conclude Q h ( T, x ) = 1 − Φ (cid:18) bZ + a √ Z (cid:19) + e − ab Φ (cid:18) bZ − a √ Z (cid:19) . One can then show that by using this explicit form for Q h ( T, x ) in the integral (3.3), we obtain(3.1) and (3.2).It has now become clear how BCP formula (3.5) for the Brownian motion can be used to obtain(3.1) and (3.2). To improve the diﬀusion approximations for discrete time, we aim at correctingthe conditional probability Q h ( T, x ) for discrete time. Because of the relation shown in (3.8), theapproach taken in this paper is to correct (3.5) for discrete time. pproximations of boundary crossing probabilities 7 Let X , X , . . . be i.i.d. N (0 ,

1) r.v’s and set Y n = X + X + . . . + X n . Consider the sequence ofcumulative sums { Y n } and deﬁne the stopping time τ Y,a,b = inf { n ≥ Y n ≥ a + bn } for a > b ∈ R . Consider the problem of evaluatingPr( τ Y,a,b ≤ N ) = Pr( Y n ≥ a + bn for at least one n ∈ { , , . . . N } ) . (3.9)Exact evaluation of (3.9) is diﬃcult even if N is not very large but it was accurately approx-imated by D.Siegmund see e.g. [18] p. 19. Let W ( t ) be the standard Brownian Motion process on[0 , ∞ ). For a > b ∈ R , deﬁne τ W,a,b = inf { t : W ( t ) ≥ a + bt } so thatPr( τ W,a,b ≤ N ) = P W ( N, a + bt ) . (3.10)In [18], (3.10) was used to approximate (3.9) after translating the barrier a + bt by a suitablescalar ρ ≥

0. Speciﬁcally, the following approximation has been constructed: P ( τ Y,a,b ≤ N ) ∼ = P W ( N, ( a + ρ ) + bt ) , where the constant ρ approximates the expected excess of the process { Y n } over the barrier a + bt .From [17] (p. 225) ρ = − π − (cid:90) ∞ λ − log { − exp( − λ / /λ } dλ ∼ = 0 . . (3.11)Whence, by denoting ˆ a = a + ρ and recalling (3.5), D.Siegmund’s formulas of [18] imply theapproximation:Pr( τ Y,a,b ≤ N ) ∼ = Pr( τ W, ˆ a,b ≤ N ) = 1 − Φ (cid:18) bN + ˆ a √ N (cid:19) + e − ab Φ (cid:18) bN − ˆ a √ N (cid:19) . In this section, we modify D.Siegmund arguments discussed in previous section to the case whenthe r.v. are indexed by points on the uniform grid in an interval and therefore the sequence ofcumulative sums compares with a limiting Brownian motion process which lies within this interval.Assume that

Z > M is a positive integer. Deﬁne (cid:15) = Z/M and let t (cid:48) n = n(cid:15) ∈ [0 , Z ] ,n = 0 , , . . . , M. Let X , X , . . . be i.i.d. N (0 ,

1) r.v’s and set W ( t (cid:48) n ) = √ (cid:15) (cid:80) ni =1 X i . For a > b ∈ R , deﬁne the stopping time τ W,a,b = inf { t (cid:48) n : W ( t (cid:48) n ) ≥ a + bt (cid:48) n } (3.12)and consider the problem of approximatingPr( τ W,a,b ≤ Z ) = Pr (cid:18) W ( t (cid:48) n ) ≥ a + bt (cid:48) n for at least one t (cid:48) n ∈ { , (cid:15), . . . , M (cid:15) = Z } (cid:19) . (3.13)As M → ∞ , the piecewise linear continuous-time process W (cid:15) ( t ), t ∈ [0 , Z ], deﬁned by: W (cid:15) ( t ) := 1 (cid:15) (cid:2) ( t (cid:48) n − t ) W ( t (cid:48) n − )+( t − t (cid:48) n − ) W ( t (cid:48) n ) (cid:3) for t ∈ [ t (cid:48) n − , t (cid:48) n ] , n = 1 , . . . , M, converges to the Brownian motion on [0 , Z ]. For this reason, we refer to the sequence { W ( t (cid:48) ) , . . . W ( t (cid:48) M ) , } as discretized Brownian motion. We make the following connection between W ( t (cid:48) n ) and the randomwalk Y n : W ( t (cid:48) n ) = √ (cid:15) Y n = Y n / (cid:112) M/Z , n = 1 , , . . . M. Jack Noonan, Anatoly Zhigljavsky

Then by using (3.11), we approximate the expected excess over the boundary for the process W ( t (cid:48) n )by ρ M/Z = 0 . / (cid:112) M/Z .

Thus, using the same methodology as D.Siegmund, in order to obtain an accurate approximationfor (3.13), we translate the barrier a + bt by the discrete time correction factor ρ M/Z and apply(3.5). By denoting ˆ a = a + ρ M/Z , we obtain the approximation to (3.13):Pr( τ W,a,b ≤ Z ) ∼ = 1 − Φ (cid:18) bZ + ˆ a √ Z (cid:19) + e − ab Φ (cid:18) bZ − ˆ a √ Z (cid:19) . (3.14) Let Q h,ρ ( M, x ) denote the discrete time corrected equivalent of Q h ( T, x ), where T = M/L ≤ Q h,ρ ( M, x ) = 1 − Φ (cid:18) bZ + ˆ a √ Z (cid:19) + e − ab Φ (cid:18) bZ − ˆ a √ Z (cid:19) (3.15)with T = ML , Z = T − T , ˆ a = h − x ρ M/Z , b = h + x , ρ M/Z = 0 . (cid:112) M/Z .

Using Q h,ρ ( M, x ) in (3.3), the equivalent probability P ζ ( T, h ) after correction for discrete timewill be denoted by P ζ,ρ ( M, h ). Approximation 4.

For M ≤ L (that is, T ≤ ), the CDA for the BCP (2.4) is given by P X ( M, h ) ∼ = P ζ,ρ ( M, h ) := (cid:90) h −∞ Q h,ρ ( M, x ) ϕ ( x ) dx + 1 − Φ ( h ) , (3.16)where Q h,ρ ( M, x ) is given in (3.15).For M = L we have T = Z = 1 and the CDA P ζ,ρ ( M, h ) can be explicitly evaluated: P ζ,ρ ( L, h ) = 1 − Φ ( h + ρ L ) Φ ( h ) + ϕ ( h + ρ L ) ρ L Φ ( h ) − ϕ ( h ) e − hρ L ρ L Φ ( h − ρ L ) , (3.17)where ρ L = 0 . / √ L . For a proof of (3.17), see Appendix C.3.4 Simulation study, T ≤ P X ( M, h ),deﬁned in (2.4), when M ≤ L (that is, T ≤ ε j in (1.1) are normalr.v.’s with mean 0 and variance 1. In Figures 1–2, the black dashed line corresponds to the empiri-cal values of the BCP P X ( M, h ) deﬁned by (2.4) computed from 100 000 simulations with diﬀerentvalues of L and M (for given L and M , we simulate L + M normal random variables 100 000 times).For j = 1 , . . . ,

4, the number j next to a line corresponds to Approximation j . The axis are: the x -axis shows the value of the normalized barrier h , see (2.5); the y -axis denotes the probabilities ofreaching the barrier. The graphs, therefore, show the empirical probabilities of reaching the barrier h (for the dashed line) and values of considered approximations for these probabilities.In Table 1, we display the relative error of the CDA with respect to the empirical BCP P X ( M, h )for all considered parameter choices. Numerical study of this section shows that in the case T ≤ L and M . At thesame time, the Durbin, PCH and Diﬀusion approximations are generally poor (note however thatthe accuracy of the Diﬀusion approximation improves as L increases). The discrete time correctionfactor brings a huge improvement to the Diﬀusion approximation resulting in a very small relativeerrors shown in Table 1. pproximations of boundary crossing probabilities 9 Fig. 1: Empirical probabilities of reaching the barrier h and four approximations. Left: L = 5, M = 5, T = 1. Right: L = 10, M = 5, T = 1 / h and four approximations. Left: L = 100, M = 100, T = 1. Right: L = 200, M = 100, T = 1 / L = 5 , M = 5 L = 10 , M = 5 L = 100 , M = 100 L = 200 , M = 1000.05 0.225 % 0.238 % 0.041 % 0.132 %0.10 0.316 % 0.284 % 0.093 % 0.103 %0.15 0.474 % 0.326 % 0.155 % 0.059 %0.20 0.390 % 0.296 % 0.228 % 0.101 % M > L

In this section, we assume

M > L and thus

T > P ζ ( T, h )For

T >

1, the exact formulas for the BCP P ζ ( T, h ), the continuous-time case, are complicated. If

T > P ζ ( T, h ) = 1 − (cid:90) h −∞ (cid:90) D x det | ϕ ( y i − y j +1 + h ) | T | i,j =0 dy . . . dy T +1 dx, (4.1)where y = 0 , y = h − x, D x = { y , . . . , y T +1 | h − x < y < y < . . . < y T +1 } . If T > P ζ ( T, h ) is even more complex, see [15].

We are not using the exact formulas for P ζ ( T, h ) in the case

T > P ζ ( T, h ),which we will also call ‘Diﬀusion approximation’, and then correct it for discrete time.4.2 A Diﬀusion approximation when

T > W ( t ). Lemma 4 ([9], Corollary on p.12 )

Let W ( t ) be the standard Brownian motion on [0 , R ] with W (0) = 0 and E W ( t ) W ( s ) = min { t, s } . Let W µ ( t ) = µt + W ( t ) be the Brownian motion with drift µt . Then, for any y > and R > , F R,µ ( z, y ) := Pr { W µ ( R ) ≤ z, sup t ∈ [0 ,R ] W µ ( t ) ≤ y } = Φ (cid:18) z − µR √ R (cid:19) − e yµ Φ (cid:18) z − µR − y √ R (cid:19) . Similarly to formula (2) on p. 11 in [9] we can write the above formula in the formPr { W µ ( R ) ∈ dz, sup t ∈ [0 ,R ] W µ ( t ) ≤ y } = f R,µ,y ( z ) dz, where f R,µ,y ( z ) = ∂F R,µ ( z, y ) ∂z = 1 √ πR (cid:26) exp (cid:20) − ( z − µR ) R (cid:21) − exp (cid:20) yµ − ( z − y − µR ) R (cid:21)(cid:27) , z < y . From the deﬁnition of F R,µ ( z, y ), (cid:90) y −∞ f R,µ,y ( z ) dz = Pr (cid:40) sup t ∈ [0 ,R ] W µ ( t ) ≤ y (cid:41) = 1 − P W ( R ; a, b ) , (4.2)where a = y , b = − µ and P W ( R ; a, b ) is deﬁned in (3.4).We can reformulate the above results stated for W µ ( t ) as results for the standard Brownianmotion process W ( t ) with no drift. Set p W ( x ; R, a, b ) := f R, − b,a ( x − bR ) = (cid:40) √ πR (cid:104) e − x / R − e − ab − ( x − a ) / R (cid:105) , x ≤ a + bR , x > a + bR. (4.3)We will call p W ( x ; R, a, b ) in (4.3) ‘the non-normalised density of W ( R ) under the condition W ( t ) < a + bt for all t ∈ [0 , R ]’. In view of (4.2), (cid:82) p W ( x ; R, a, b ) dx = 1 − P W ( R ; a, b ) . Let us now show how to apply these results for construction of approximations for the BCP P ζ ( T, h ), where ζ t is the process deﬁned in Lemma 2. The direct relation between the process ζ t , t ∈ [0 , V ∈ (0 , ζ t < h for all t ∈ (0 , V ] given ζ = x < h is Pr( ζ t < h for all t ∈ [0 , V ] | ζ = x ) = 1 − P W ( U ; a, b ) , where U = V / (2 − V ), a = ( h − x ) / b = ( h + x ) /

2. By substituting these particular a and b into (4.3), we obtain that the non-normalised density of the r.v. ζ V conditioned on ζ = x < h and ζ t < h for all t ∈ (0 , V ] is p h,V ( x | x ) = (cid:114) − V πV (cid:26) exp (cid:20) − (2 − V ) x V (cid:21) − exp (cid:20) (2 − V )( x − h + x ) V (cid:21)(cid:27) . (4.4)In the most important special case V = 1, the non-normalised density of the r.v. ζ conditioned on ζ = x and ζ t < h for all t ∈ (0 ,

1] is p h ( x | x ) = (cid:26) ϕ ( x ) [1 − exp {− ( h − x )( h − x ) } ] , for x < h x ≥ h, (4.5) pproximations of boundary crossing probabilities 11 where ϕ ( · ) is deﬁned in (2.3) and (cid:82) h −∞ p h ( x | x ) dx = 1 − P W (1; h − x , h + x ), where we have used(3.6) to get the ﬁnal expression.Since ζ is N (0 , ζ conditioned on ζ < h is p ( x ) = φ ( x ) /Φ ( h ) , x < h .Averaging over ζ < h , the non-normalized density of the r.v. ζ under the conditions ζ t < h for all t ∈ [0 ,

1] is: ˜ p ( x ) = (cid:90) h −∞ p h ( x | x ) p ( x ) dx , for x < h (4.6)with c = (cid:90) h −∞ ˜ p ( x ) dx = [1 − P ζ (1 , h )] /Φ ( h ) . (4.7)Denote by p ( x ) = ˜ p ( x ) /c , x < h, the normalized density of ζ under the condition ζ t < h forall t ∈ [0 , i ≥

1, the densities of ζ i and ζ i − under the condition that ζ t does not reach h in ( i − , i ] and [0 , i −

1] respectively can be connected in the same way as for the interval [0 , ζ t is not Markovian). Assumethat p i − ( x ) is the normalized density of ζ i − under the condition ζ t < h for all t ∈ [0 , i − p i ( x ) = (cid:90) h −∞ p h ( x | y ) p i − ( y ) dy, for x < h . (4.8)We call it the non-normalized density of a r.v. ζ i under the conditions ζ i − ∼ ˜ p i − ( x ) and ζ t < h for all t ∈ [ i − , i ]. We then deﬁne p i ( x ) = ˜ p i ( x ) /c i , x < h, where c i = (cid:82) h −∞ ˜ p i ( x ) dx .If T is large, calculation of the densities p i ( x ) ( i ≤ T ) in such an iterative manner is cumbersome.We then replace formula (4.8) with˜ p i ( x ) = (cid:90) h −∞ p h ( x | y ) p ( y ) dy, for x < h, (4.9)where p ( x ) is an eigenfunction of the integral operator with kernel (4.5) corresponding to themaximum eigenvalue λ : λp ( x ) = (cid:90) h −∞ p ( y ) p h ( x | y ) dy, x < h . (4.10)This eigenfunction p ( x ) is a probability density on ( −∞ , h ] with p ( x ) > x ∈ ( −∞ , h )and (cid:82) h −∞ p ( x ) dx = 1 . Moreover, the maximum eigenvalue λ of the operator with kernel K ( x, y ) = p h ( x | y ) is simple and positive. The fact that such maximum eigenvalue λ is simple and real (andhence positive) and the eigenfunction p ( x ) can be chosen as a probability density follows from theRuelle-Krasnoselskii-Perron-Frobenius theory of bounded linear positive operators, see e.g. TheoremXIII.43 in [13].Using (4.9) and (4.10), we derive recursively: P ζ ( i, h ) (cid:39) P ζ ( i − , h ) + (1 − P ζ ( i − , h ))(1 − λ ) ; i = 2 , , . . . By induction, for an integer T ≥ P ζ ( T, h ) (cid:39) P ζ (1 , h ) + (1 − P ζ (1 , h )) T − (cid:88) j =0 λ j (1 − λ ) = 1 − (1 − P ζ (1 , h )) λ T − . (4.11)The approximation (4.11) can be used for non-integer T . We can also use a minor adjustment tothis approximation using the maximal eigenvalues of the kernel (4.4) in addition to λ ; this is muchharder but the beneﬁts of this are minuscule. Approximation 5. (Diﬀusion approximation for BCP P ζ ( T, h ) when

T >

Use (4.11) , where λ is the maximal eigenvalue of the integral operator with kernel p h ( x | y ) deﬁned in (4.5) . The BCP P ζ (1 , h ) can be computed by (3.1). In the next section we make a correction toApproximation 5 adjusted for discrete time. Approximations for continuous-time λ , required inApproximation 5, are obtained from formulas of that section when ρ → T > P ζ (1 , h ), and (b) the kernel of the integraloperator deﬁned in (4.5) for the continuous-time case (and hence λ , the maximum eigenvalue).Discrete-time corrections for the BCP P ζ (1 , h ) have been discussed above in Section 3.3; we willreturn to this at the end of Section 4.4.Moving to (b), recall (4.10) which states that λ is the maximal eigenvalue satisfying (4.10), thecorresponding eigenfunction p ( x ) is a probability density function on ( −∞ , h ), and p h ( x | y ) is givenin (4.5). Recall that p h ( x | x ) is the density of the random variable x = ζ under the conditions ζ = x and ζ t < h for all t ∈ [0 , p h ( x | x ) fordiscrete time.As shown in Section 3.3.2, BCP for the discretized Brownian motion process W ( t n ) can beapproximated using the BCP formula for the Brownian motion incorporating a discrete time cor-rection factor. Recalling the stopping time τ W,a,b deﬁned in (3.12), for Z = 1 we obtain from (3.14)the approximation: Pr( τ W,a,b ≤ ∼ = 1 − Φ ( b + ˆ a ) + e − ab Φ ( b − ˆ a ) , where ˆ a = a + δ and, since L = M , we will use δ := ρ L = 0 . / √ L . (4.12)In view of (3.14), for the Brownian motion W ( t ) (with W (0) = 0) considered on [0 , W (1) under the condition W ( t n ) < a + bt n for all t n = n/L ( n = 1 , . . . , L )can be approximated by ψ δ ( x ) =  √ π (cid:26) e − x − exp (cid:20) − ab − ( x − a ) (cid:21)(cid:27) for x ≤ a + b x > a + b , where ˆ a = a + δ and δ is given by (4.12). Thus, the discrete-time equivalent of p h ( x | x ), denotedby p h,δ ( x | x ), is: p h,δ ( x | x ) = ϕ ( x )(1 − exp[ − ( h − x )( h − x ) − δ (3 h − x − x + 2 δ )]) , for x < h. (4.13)If ρ = δ = 0, we clearly obtain p h, ( x | x ) = p h ( x | x ), for all h, x, x .Denote by λ δ the maximum eigenvalue associated with the kernel p h,δ ( x | x ) given by (4.13).This means that λ δ satisﬁes λ δ p ( x ) = (cid:90) h −∞ p ( y ) p h,δ ( x | y ) dy, x < h (4.14)for some eigenfunction p ( x ) which can be assumed to be a probability density function on ( −∞ , h ).Similar to λ in (4.10), λ δ is positive and uniquely deﬁned. pproximations of boundary crossing probabilities 13 λ δ and p ( x ) in (4.14)Similarly to (4.6), (4.7) and (4.8) we deﬁne p ( x ) = φ ( x ) /Φ ( h ), ˜ p i ( x ) = (cid:82) h −∞ p h,δ ( x | y ) p i − ( y ) dy ( x < h ), c i = (cid:82) h −∞ ˜ p i ( x ) dx and p i ( x ) = ˜ p i ( x ) /c i for i = 1 ,

2. From (4.14), λ δ = lim i →∞ c i . Byperforming integration we obtain˜ p ( x ) = ϕ ( x ) − ϕ ( h ) Φ ( h ) e − δh − δ / δx Φ ( x − δ ) , x < h ,c = Φ ( h ) − κ h,δ Φ ( h ) with κ h,δ = 1 δ ϕ ( h ) (cid:104) e − δh − δ / Φ ( h − δ ) − e − δh Φ ( h − δ ) (cid:105) , ˜ p ( x ) = ϕ ( x ) + ϕ ( h ) c (cid:20) Φ ( h − δ ) e − δh − δ / δx ϕ ( x ) − ϕ ( h ) e δ / − δh − δx Φ ( x − δ ) Φ ( h )( h + 2 δ − x ) − e − δh − δ / δx Φ ( x − δ ) (cid:21) , x < h . We were unable to compute c and the densities p i ( x ) with i ≥ p ( x ) is visually indistinguishable from p i ( x ) for i > p ( x ), the solution of (4.14). Thus we approximate λ δ in (4.14) byˆ λ δ = ˜ p (0) p (0) = Φ ( h ) − ( h +2 δ ) κ h,δ + ϕ ( h )[ Φ ( − δ ) e δ / − h / − δh − Φ ( h − δ ) e − δh − δ / ]( h + 2 δ ) (cid:2) Φ ( h ) − Φ ( − δ ) e − ( h + δ )( h +3 δ ) / (cid:3) . (4.15)Moreover, ˜ p ( x ) is a rather accurate approximation to the (non-normalized) eigenfunction p ( x ) in(4.14). In Fig 3(a) we have plotted p ( x ), p ( x ) and the uncorrected p ( x ) obtained by letting ρ → L and h .An alternative way of approximating λ δ and p ( x ) from (4.14) would be to use a methodologydescribed in [11] p.154 which is based on the Gauss-Legendre discretization of the interval [ − C, h ],with some large

C >

0, into an N -point set x , . . . , x N (the x i ’s are the roots of the N -th Legendrepolynomial on [ − C, h ]), and the use of the Gauss-Legendre weights w i associated with points x i ; λ δ and p ( x ) are then approximated by the largest eigenvalue and associated eigenvector of thematrix D / K D / , where D = diag( w i ), and ( K ) i,j = p h,δ ( x i | x j ). If N is large enough then theresulting approximation to λ δ is arbitrarily accurate and we use it as the true λ δ in our numericalcomparisons. Numerical simulations show that the value of c is not close enough to λ δ but c is.However, more interestingly, we see that ˆ λ δ deﬁned by (4.15) is very accurate and we suggest to useit because of its explicit form; this is demonstrated in Fig 3(b), where ˆ λ δ (solid red line) is visuallyindistinguishable from λ δ obtained using the Gauss-Legendre quadrature (dotted black line). Inthis ﬁgure, the dashed green line corresponds to the uncorrected λ δ ( ρ → λ for discrete time. We shall now discuss item (a) of Sec-tion 4.3, which concerns correction of the BCP P ζ (1 , h ) for discrete time. For correcting P ζ (1 , h )we can routinely use δ as in (4.12) but numerical results indicate that we get a better resultingapproximation, especially for small L and M , if we use γ = ρ L /T / , so that the BCP P X (1 , h )is approximated by P ζ,γ (1 , h ). We believe that the fact that P X (1 , h ) ∼ = P ζ,δ (1 , h ) is not accurateenough is due to the fact that the densities p i ( x ) are not exactly the densities of ξ i . To summarise,the CDA for T >

Approximation 6.

For

M > L (that is,

T > ), the CDA for the BCP (2.4) is P X ( M, h ) ∼ = 1 − [1 − P ζ,γ (1 , h )] ˆ λ T − δ , (4.16) where ˆ λ δ is given by (4.15) and P ζ,γ (1 , h ) = 1 − Φ ( h + γ ) Φ ( h ) + ϕ ( h + γ ) γ Φ ( h ) − ϕ ( h ) e − hγ γ Φ ( h − γ ) with γ = ρ L /T / = 0 . / ( L / T / ) . p ( x ) (dotted blue), p ( x ) (solid red) andthe uncorrected p ( x ) with δ = 0 (dashed green); L = 10and h = 2 (b) Values of ˆ λ δ (solid red), λ δ (dotted black) and λ (dashed green) obtained using the Gauss-Legendrequadrature; L = 10 and diﬀerent h . Fig. 34.5 Approximation by J. Glaz and coauthorsThe Glaz approximation for the BCP P X ( M, h ) (developed in [6, 7] and discussed in the Introduc-tion) is as follows.

Approximation 7. (Glaz approximation) For M ≥ L (so that T = M/L ≥ P X ( M, h ) ∼ = 1 − (1 − P X (2 L, h )) (cid:20) − P X (2 L, h )1 − P X ( L, h ) (cid:21) T − , (4.17)where P X (2 L, h ) and P X ( L, h ) are evaluated using R algorithms for the multivariate normal distri-bution.The approximation (4.17) is deﬁned for M ≥ L and requires numerical evaluation of L + 1 and2 L + 1 dimensional integrals (which are the BCP P X ( L, h ) and P X (2 L, h ) respectively) using the so-called ‘GenzBretz’ algorithm for numerical evaluation of multivariate normal probabilities, see [4, 5].Whilst the accuracy of Approximation 7 is very high and in fact very similar the accuracy of theCDA (Approximation 6), the nature of Approximation 7 results in high computational cost and run-time when compared to other approximations discussed in this paper (especially for large L); notealso that diﬀerent integrals should be computed for diﬀerent values of h . Moreover, the ‘GenzBretz’algorithm uses Monte-Carlo simulations so that for reliable estimation of high-dimensional integrals(especially when L is large) one needs to make a lot of averaging.We have not provided results of comparison of Approximation 7 with other approximations forthe BCP as the accuracy of Approximations 6 and 7 was very close. Note also that there is strongsimilarity between the forms of these two approximations. Indeed, from (4.16) we can write theCDA in the form 1 − (1 − P ζ,γ (1 , h ))ˆ λ T − δ = 1 − ˆ λ δ (1 − P ζ,γ (1 , h )) (cid:124) (cid:123)(cid:122) (cid:125) ( a ) ˆ λ δ (cid:124)(cid:123)(cid:122)(cid:125) ( b ) T − , where the terms (a) and (b) are as (1 − P X (2 L, h )) and (1 − P X (2 L, h )) / (1 − P X ( L, h )) in Approx-imation 7, respectively. pproximations of boundary crossing probabilities 15 P X ( M, h ),deﬁned in (2.4), when

M > L (so that

T > T ≤

1, we conclude that the CDA provides very accurate approximations andsigniﬁcantly outperforms the Diﬀusion, Durbin and PCH approximations. Note also that for large T the PCH approximation is very close to the Diﬀusion approximation; this can be seen in Fig 5,where (for T = 50) these two approximations basically coincide for all h .Fig. 4: Empirical probabilities of reaching the barrier h and four approximations. Left: L = 10, M = 50, T = 5. Right: L = 50, M = 250, T = 5.Fig. 5: Empirical probabilities of reaching the barrier h and four approximations. Left: L = 10, M = 500, T = 50. Right: L = 50, M = 2500, T = 50.Table 2: Relative error of CDA for given BCPBCP L = 10 , M = 50 L = 10 , M = 500 L = 50 , M = 250 L = 50 , M = 25000.05 0.596 % 0.028 % 0.133 % 0.054 %0.10 0.657 % 0.030 % 0.146 % 0.057 %0.15 0.455 % 0.031 % 0.390 % 0.208 %0.20 0.570 % 0.192 % 0.165 % 0.184 % h ( X ) As shown in Sections 3.4 and 4.6, the CDA accurately approximates P X ( M, h ). The CDA hasdiﬀerent forms depending on whether M ≤ L or M > L , see (3.16) and (4.16) respectively. Thus,from (2.8), the CDA leads to the following approximation for the probability density function of τ h ( X ) /L : ˆ q h ( t ) =  ddt (cid:110)(cid:82) h −∞ Q h,ρ ( tL, x ) ϕ ( x ) dx (cid:111) , < t ≤ ddt (cid:110) − [1 − P ζ,γ (1 , h )] ˆ λ t − δ (cid:111) , t > . For t >

1, one can easily get an explicit form of ˆ q h ( t ). However, we were unable to obtain an explicitform of ˆ q h ( t ) for t < h ), the probability of exceeding h in the interval (0 ,

1] is very small and the impact ofˆ q h ( t ) for t < F h ( t ) the true cumulative distribution function (c.d.f.) of τ h ( X ) /L . The c.d.f. of theCDA of τ h ( X ) /L is ˆ F h ( t ) deﬁned byˆ F h (0) = 1 − Φ ( h ) , ˆ F h ( t ) = 1 − Φ ( h ) + (cid:90) t ˆ q h ( u ) du for t > . (5.1)The accuracy of this approximation for a selection of parameter choices is demonstrated in Fig. 6and Fig. 7. The CDA for ARL h ( X ) isARL h ( X ) = E τ h ( X ) ∼ = L (cid:90) ∞ s ˆ q ( s, h ) ds . (5.2)Fig. 6 and 7 demonstrate that (5.1) very accurately approximates the distribution of τ h ( X ). (a) Left: L = 10 and h = 2 (b) Right: L = 10 and h = 3 Fig. 6: F h ( t ) and its approximation ˆ F h ( t ) for L = 10, h = 2 and 3.In this paper, we deﬁne ARL in terms of the number of random variables ξ n rather than numberof random variables ε j . This means we have to slightly modify the following Glaz approximationfor ARL given in [7], since such an approximation considers the number of random variables ε j .This can be simply done by subtracting L from the ARL approximation in [7]. From which, theGlaz approximation for ARL h ( X ) is as follows: E G ( τ h ( X )) = L (cid:88) j = L (1 −P X ( j − L, h ))+ 1 − P X ( L, h ) P X (2 L, h ) − P X ( L, h ) L (cid:88) j =1 (1 − P X ( L + j, h )) , (5.3)where x = (1 − P X (2 L, h )) / (1 − P X ( L, h )). pproximations of boundary crossing probabilities 17(a) Left: L = 50 and h = 2 (b) Right: L = 50 and h = 3 Fig. 7: F h ( t ) and its approximation ˆ F h ( t ) for L = 50, h = 2 and 3.In Table 3 we assess the accuracy of the CDA approximation (5.2) and also Glaz approximation(5.3). In these tables, the values of ARL h ( X ) have been calculated using 100 ,

000 simulations. Due tothe Monte Carlo methods used to compute the Glaz approximation, we have presented the averageof 20 iterations of (5.3) as well as providing conﬁdence intervals.Table 3: Approximations for ARL h ( X ): L = 10 (top) and L = 50 (bottom) h ± ± ± ± h ( X ) 22 33 49 78 127 218 397 758 1551 h ± ± ± ±

23 5121 ± h ( X ) 83 124 189 294 471 791 1392 2590 5110Table 3 shows that for small h the approximations developed in this paper are very accurate andare similar to the Glaz approximation. For a large L = 50, (5.3) can be considered more accuratethan (5.2). However using (5.3) for a large L is computationally expensive and results in a longrun-time, especially if results are averaged. Increasing L has no impact on the computational costand run time of (5.2). Acknowledgements

The authors are grateful to our colleague Nikolai Leonenko for intelligent discussions andﬁnding the reference [9], which is essential for the material of Section 4.2.

Appendices

Appendix A: Proof of Lemma 1As correlation is invariant under linear transformations, Corr( S ,L , S k,L ) = Corr( ξ , ξ k ). From thedeﬁnition (1.1) we have Corr( S ,L , S k,L ) = Corr( S n,L , S n + k,L ). The sum S k,L can be representedas S k,L = S ,L − k (cid:88) j =1 ε j + L + k (cid:88) j = L +1 ε j . Using this representation, we obtainCov( S ,L , S k,L ) = ( σ L + µ L ) (cid:124) (cid:123)(cid:122) (cid:125) E S ,L − kσ − µ L (cid:124) (cid:123)(cid:122) (cid:125) ( E S ,L ) = σ L − kσ . Dividing this by var( S ,L ), from (2.1), we obtain Corr( S ,L , S k,L ) = 1 − k/L in the case k ≤ L .The case k > L is obvious.Appendix B: Derivation of Durbin approximationWe shall initially show R (cid:48) (0+) = − (cid:54) = 0. We have ∂R ζ ( t, s ) ∂s (cid:12)(cid:12)(cid:12)(cid:12) s = t + = R (0+) . Using (2.6) and the fact that ∆ = 1 /L , we have R (cid:48) (0+) = lim L →∞ R ( ∆ ) − R (0) ∆ = − lim L →∞ LL = − . The Durbin approximation for q ( t, h, ζ t ) can be written as q ( t, h, ζ t ) ∼ = b ( t, h ) f ( t, h ) , where f ( t, h ) = 1 (cid:112) πR ζ ( t, t ) e − h t )2 Rζ ( t,t ) , b ( t, h ) = − h ( t ) R ζ ( t, t ) ∂R ζ ( s, t ) ∂s (cid:12)(cid:12)(cid:12)(cid:12) s = t + − dh ( t ) dt . In view of (2.8) the related approximation for the ﬁrst passage probability P ζ ( T, h ) is P ζ ( T, h ) ∼ = (cid:90) T b ( t, h ) f ( t, h ) dt . In the case when the threshold h ( t ) = h is constant, using Lemma 2 we obtain b ( t, h ) = − hR (cid:48) (0+) = h, q ( t, h, ζ t ) ∼ = h √ π e − h / and therefore we obtain the following approximation. P X ( M, h ) ∼ = P ζ ( T, h ) ∼ = hT √ π e − h / . Appendix C: Derivation of (3.17)As M = L , P ζ,ρ ( L, h ) = (cid:82) h −∞ (cid:0) − Φ ( b + ˆ a ) + e − ab Φ ( b − ˆ a ) (cid:1) ϕ ( x ) dx + 1 − Φ ( h ) . Using the factˆ a = ( h − x ) / ρ L and b = ( h + x ) /

2, we obtain: P ζ,ρ ( M, h ) = 1 − (cid:90) h −∞ Φ ( h + ρ L ) ϕ ( x ) dx + (cid:90) h −∞ e − h / x / − ρ L h − ρ L x Φ ( x − ρ L ) ϕ ( x ) dx = 1 − Φ ( h + ρ L ) Φ ( h ) + ϕ ( h ) e − ρ L h (cid:90) h −∞ e − ρ L x (cid:90) x − ρ L −∞ ϕ ( z ) dz dx . Making the substitution k = z + ρ L in the rightmost integral, we obtain ϕ ( h ) e − ρ L h (cid:90) h −∞ (cid:90) x −∞ e − ρ L x ϕ ( k − ρ L ) dk dx . pproximations of boundary crossing probabilities 19 By then changing the order of integration: ϕ ( h ) e − ρ L h (cid:90) h −∞ (cid:90) hk e − ρ L x ϕ ( k − ρ L ) dx dk = ϕ ( h ) e − ρ L h ρ L (cid:90) h −∞ ( e − ρ L k − e − ρ L h ) ϕ ( k − ρ L ) dk. By expanding the brackets, we obtain: ϕ ( h ) e − ρ L h ρ L (cid:90) h −∞ ( e − ρ L k − e − ρ L h ) ϕ ( k − ρ L ) dk = ϕ ( h ) e − ρ L h − ρ L / ρ L (cid:90) h ∞ ϕ ( k ) dk − ϕ ( h ) e − ρ L h ρ L (cid:90) h ∞ ϕ ( k − ρ L ) dk = ϕ ( h + ρ L ) ρ L Φ ( h ) − ϕ ( h ) e − ρ L h ρ L Φ ( h − ρ L ) . Thus we obtain the required: P ζ,ρ ( L, h ) = 1 − Φ ( h + ρ L ) Φ ( h ) + ϕ ( h + ρ L ) ρ L Φ ( h ) − ϕ ( h ) e − hρ L ρ L Φ ( h − ρ L ) . References

1. Aldous, D.: Probability Approximations via the Poisson Clumping Heuristic. Springer Science & Business Media(1989)2. Chu, C.S.J., Hornik, K., Kaun, C.M.: MOSUM tests for parameter constancy. Biometrika (3), 603–617 (1995)3. Durbin, J.: The ﬁrst-passage density of a continuous Gaussian process to a general boundary. Journal of AppliedProbability pp. 99–122 (1985)4. Genz, A., Bretz, F.: Computation of Multivariate Normal and t Probabilities. Lecture Notes in Statistics.Springer-Verlag, Heidelberg (2009)5. Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T.: mvtnorm: Multivariate Normaland t Distributions (2018). URL https://CRAN.R-project.org/package=mvtnorm . R package version 1.0-8: ‘https://CRAN.R-project.org/package=mvtnorm’

6. Glaz, J., Johnson, B.: Boundary crossing for moving sums. Journal of Applied Probability (1), 81–88 (1988)7. Glaz, J., Naus, J., Wang, X.: Approximations and inequalities for moving sums. Methodology and Computingin Applied Probability (3), 597–616 (2012)8. Glaz, J., Pozdnyakov, V., Wallenstein, S.: Scan Statistics: Methods and Applications. Birkhuser, Boston (2009)9. Harrison, J.: Brownian motion and stochastic ﬂow systems. John Wiley and Sons (1985)10. Mehr, C., McFadden, J.: Certain properties of Gaussian processes and their ﬁrst-passage times. Journal of theRoyal Statistical Society. Series B (Methodological) (3), 505–522 (1965)11. Mohamed, J., Delves, L.: Computational Methods for Integral Equations. Cambridge University Press (1985)12. Moskvina, V., Zhigljavsky, A.: An algorithm based on Singular Spectrum Analysis for change-point detection.Communications in Statistics—Simulation and Computation (2), 319–352 (2003)13. Reed, M., Simon, B.: Methods of Modern Mathematical Physics: Scattering theory Vol. 3. Academic Press (1979)14. Shepp, L.: Radon-Nikodym derivatives of Gaussian measures. The Annals of Mathematical Statistics pp. 321–354(1966)15. Shepp, L.: First passage time for a particular Gaussian process. The Annals of Mathematical Statistics (3),946–951 (1971)16. Shepp, L., Slepian, D.: First-passage time for a particular stationary periodic Gaussian process. Journal ofApplied Probability pp. 27–38 (1976)17. Siegmund, D.: Sequential Analysis: Tests and Conﬁdence Intervals. Springer Science & Business Media (1985)18. Siegmund, D.: Boundary crossing probabilities and statistical applications. The Annals of Statistics14