Bernstein type inequalities for self-normalized martingales with applications
Xiequan Fan∗, Shen Wang
Center for Applied Mathematics, Tianjin University, 300072 Tianjin, China
Abstract
For self-normalized martingales with conditionally symmetric differences, de la Peña [6] established Gaussian type exponential inequalities. Bercu and Touati [2] extended de la Peña's inequalities to martingales with differences heavy on left. In this paper, we establish Bernstein type exponential inequalities for self-normalized martingales with differences bounded from below. Moreover, applications to self-normalized sums, t-statistics and autoregressive processes are discussed.
Keywords: martingales; self-normalized processes; exponential inequalities; autoregressive processes
Primary 60G42; 60E15; 60F10
1. Introduction
∗ Corresponding author.
E-mail: [email protected] (X. Fan).
Preprint submitted to Elsevier September 11, 2018

Let $(\xi_i)_{i\ge 1}$ be a sequence of zero-mean independent random variables satisfying $\xi_i \le 1$ for all $i$. Denote by $S_n=\sum_{i=1}^n \xi_i$ the partial sums of $(\xi_i)_{i\ge 1}$. Bennett [1] proved the following Bernstein type inequality: for all $x>0$,
\[ \mathbf{P}\big(S_n \ge xv\big) \le \exp\Big\{-\frac{x^2}{2(1+x/(3v))}\Big\}, \tag{1} \]
where $v^2=\mathrm{Var}(S_n)$ is the variance of $S_n$. The importance of Bernstein type inequalities comes from the fact that they combine both a Gaussian trend and an exponentially decaying rate. To see this, we rewrite the last inequality in the following form: for all $x>0$,
\[ \mathbf{P}\big(S_n \ge x\big) \le \exp\Big\{-\frac{x^2}{2(v^2+x/3)}\Big\}. \tag{2} \]
It is easy to see that the last bound behaves as $\exp\{-x^2/(2v^2)\}$ for moderate $x=o(v^2)$, while it decays exponentially to $0$ as $x\to\infty$.

The generalization of (1) to martingales has attracted considerable interest. Assume that $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ is a sequence of martingale differences. If $\xi_i\le 1$, Freedman [15] showed that (1) also holds when $\mathbf{P}(S_n\ge xv)$ is replaced by $\mathbf{P}(S_n\ge xv,\ \langle S\rangle_n\le v^2)$, where $\langle S\rangle_n$ is the conditional variance of $S_n$. De la Peña [6], Dzhaparidze and van Zanten [10] and Fan et al. [12, 14] extended Freedman's inequality to martingales with non-bounded differences. Recently, Rio [20] gave a refinement of Freedman's inequality.

Despite the fact that the martingale case is well studied, there are only a few results on Bernstein type inequalities for self-normalized martingales $S_n/[S]_n$, where $[S]_n$ is the squared variance of $S_n$. Among them, let us recall the following exponential inequalities of de la Peña [6]. Assume that $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ is a sequence of conditionally symmetric martingale differences. Recall that $\xi_i$ is called conditionally symmetric if $\mathcal{L}(\xi_i|\mathcal{F}_{i-1})=\mathcal{L}(-\xi_i|\mathcal{F}_{i-1})$ for all $i$, where $\mathcal{L}(\xi_i|\mathcal{F}_{i-1})$ stands for the regular version of the conditional distribution of $\xi_i$ given the $\sigma$-field $\mathcal{F}_{i-1}$. De la Peña [6] established the following exponential inequalities for self-normalized martingales: for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \sqrt{\mathbf{E}\Big[\exp\Big\{-\frac{x^2}{2}[S]_n\Big\}\Big]}, \tag{3} \]
and, for all $x,y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge y\Big) \le \exp\Big\{-\frac{x^2 y}{2}\Big\}, \tag{4} \]
where $[S]_n=\sum_{i=1}^n \xi_i^2$ is the squared variance of $S_n$. In the i.i.d. case, $[S]_n/n$ usually converges almost surely to the variance of the random variables. Thus (3) and (4) can be regarded as Gaussian type inequalities.

The inequalities of de la Peña have been extended to martingales with differences heavy on left. Recall that an integrable random variable $X$ is called heavy on left if $\mathbf{E}X=0$ and, for all $a>0$, $\mathbf{E}[T_a(X)]\le 0$, where
\[ T_a(X)=\min(|X|,a)\,\mathrm{sign}(X) \]
is the truncated version of $X$. Clearly, conditionally symmetric martingale differences are heavy on left. Bercu and Touati [2] obtained the following extension of de la Peña's inequality (3): for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-\frac{1}{2}(p-1)\,x^2[S]_n\Big\}\Big]\Big)^{1/p}. \tag{5} \]
They also showed that (4) holds for martingales with differences heavy on left. In the particular case $p=2$, inequality (5) reduces to inequality (3) under the conditional symmetry assumption. Similar results for self-normalized martingales $S_n/\sqrt{[S]_n}$ can also be found in Bercu and Touati [2].

Exponential inequalities for self-normalized martingales have a lot of applications. We refer to de la Peña, Klass and Lai [7] for autoregressive processes. Bercu and Touati [2] applied such type inequalities to parameter estimation for linear regressions, autoregressive processes and branching processes. For more applications of such type inequalities, we refer to the monographs of de la Peña, Lai and Shao [8] and Bercu, Delyon and Rio [3].

In this paper, we aim to establish Bernstein type inequalities for self-normalized martingales with differences bounded from below. It is obvious that a random variable bounded from below need not be heavy on left. Our results for self-normalized martingales are analogues of the inequalities (3)-(5). Applications to self-normalized sums, t-statistics and autoregressive processes are also discussed.

The paper is organized as follows. We present our main results in Section 2. In Section 3, we discuss the applications, and we prove our main results in Section 4.

2. Main results

Let $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ be a finite sequence of real-valued square integrable martingale differences defined on a probability space $(\Omega,\mathcal{F},\mathbf{P})$, where $\xi_0=0$ and $\{\emptyset,\Omega\}=\mathcal{F}_0\subseteq\dots\subseteq\mathcal{F}_n\subseteq\mathcal{F}$ are increasing $\sigma$-fields. So by definition, we have $\mathbf{E}[\xi_i|\mathcal{F}_{i-1}]=0$, $i=1,\dots,n$. Set $S_0=0$ and
\[ S_k=\sum_{i=1}^k \xi_i \tag{6} \]
for $k=1,\dots,n$. Then $S=(S_k,\mathcal{F}_k)_{k=1,\dots,n}$ is a martingale.
Let $[S]$ and $\langle S\rangle$ be, respectively, the squared variance and the conditional variance of the martingale $S$, that is
\[ [S]_0=0,\quad [S]_k=\sum_{i=1}^k \xi_i^2,\qquad\text{and}\qquad \langle S\rangle_0=0,\quad \langle S\rangle_k=\sum_{i=1}^k \mathbf{E}[\xi_i^2|\mathcal{F}_{i-1}],\qquad k=1,\dots,n. \tag{7} \]
Our main result is the following set of Bernstein type inequalities for self-normalized martingales with differences bounded from below. It is worth mentioning that the inequalities are new even for independent random variables.

Theorem 2.1.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{\{S_n\ge x[S]_n\}}\Big]\Big)^{1/p} \tag{8} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\mathbf{1}_{\{S_n\ge x[S]_n\}}\Big]\Big)^{1/p}, \tag{9} \]
and, for all $y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge y\Big) \le \exp\Big\{-\big(x-\log(1+x)\big)y\Big\} \tag{10} \]
\[ \le \exp\Big\{-\frac{x^2 y}{2(1+x)}\Big\}. \tag{11} \]

Clearly, inequality (9) implies that for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\Big]\Big)^{1/p}, \]
which is an analogue of de la Peña's inequality (3) and of the inequality (5) of Bercu and Touati.

Denote $B_n=\sum_{i=1}^n \mathbf{E}\xi_i^2$. It is easy to see that for all $x>0$ and $0<\varepsilon<1$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge B_n(1-\varepsilon)\Big)+\mathbf{P}\big([S]_n<B_n(1-\varepsilon)\big) \]
\[ = \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge B_n(1-\varepsilon)\Big)+\mathbf{P}\Big(\sum_{i=1}^n\big(\xi_i^2-\mathbf{E}\xi_i^2\big)<-B_n\varepsilon\Big). \]
The first term of the last bound can be estimated by (11). For the second term of the last bound, notice that $(\xi_i^2-\mathbf{E}\xi_i^2)_{i=1,\dots,n}$ are centered random variables bounded from below, and they are independent once $(\xi_i)_{i=1,\dots,n}$ are independent. Thus we need the following Bernstein type exponential inequalities for centered random variables bounded from below.

Theorem 2.2.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{\{S_n\le-x\langle S\rangle_n\}}\Big]\Big)^{1/p} \tag{12} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{\{S_n\le-x\langle S\rangle_n\}}\Big]\Big)^{1/p}. \tag{13} \]

Inequality (13) implies that for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\Big]\Big)^{1/p}. \tag{14} \]
It seems that the bound (14) is usually decreasing in $p$. For instance, consider the independent case. When $(\xi_i)_{i=1,\dots,n}$ are independent random variables, we have $\langle S\rangle_n=\mathrm{Var}(S_n)$, where $\mathrm{Var}(S_n)$ stands for the variance of $S_n$. Then the bound (14) is decreasing in $p$.

For more exponential inequalities similar to those of Theorem 2.2, we refer to Theorem 1.3 of de la Peña [6]. In particular, de la Peña proved (13) with $p=2$. Moreover, de la Peña also proved the following Bernstein type exponential inequalities: for all $x,y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x,\ \langle S\rangle_n\ge y\Big) \le \exp\Big\{-\big((1+x)\log(1+x)-x\big)y\Big\} \tag{15} \]
\[ \le \exp\Big\{-\frac{x^2 y}{2(1+x/3)}\Big\}. \tag{16} \]
It is easy to see that the inequalities (15) and (16) are respectively the counterparts of (10) and (11) for $S_n/\langle S\rangle_n$.

Notice that in the independent case, the bounds (14) and (16) are exactly Bernstein's bound (1). Thus (14) and (16) can be regarded as Bernstein type inequalities for martingales.

The following deviation inequality for self-normalized martingales has its own independent interest.

Theorem 2.3.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge 1$ and $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big) \le \sqrt{e}\,\big(1+(2+x)\ln M\big)\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}. \tag{17} \]

Similarly, when $[S]_n$ on the left-hand side of (17) is replaced by $\langle S\rangle_n$, we have the following inequality for normalized martingales. Such type inequalities are due to Liptser and Spokoiny [19].

Theorem 2.4.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge 1$ and $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{\langle S\rangle_n}}\le -x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big) \le \sqrt{e}\,\big(1+(2+x)\ln M\big)\exp\Big\{-\frac{x^2}{2(1+x/(3b))}\Big\}. \tag{18} \]
It is interesting to see that in the independent case, inequality (18) with $b=\sqrt{\mathrm{Var}(S_n)}$ and $M=1$ reduces exactly to Bennett's bound (1), up to the absolute constant $\sqrt{e}$. Thus the bound (18) is rather tight.
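Bounds of this type are straightforward to illustrate numerically. The following sketch (not part of the original paper; all parameters are chosen purely for illustration) uses i.i.d. Rademacher differences $\xi_i=\pm1$, for which $\xi_i\ge-1$ and $[S]_n=n$ almost surely, and compares the empirical frequency of the event in (11) (taken with $y=n$) against the right-hand side of (11).

```python
# Monte Carlo sanity check of the Bernstein type bound (11) for
# self-normalized sums of Rademacher variables (xi_i = +/-1, so
# xi_i >= -1 and [S]_n = n).  Parameters are illustrative only.
import math
import random

def empirical_tail(n, x, trials, seed=0):
    """Estimate P(S_n / [S]_n >= x) when [S]_n = n almost surely."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        if s >= x * n:          # event {S_n / [S]_n >= x, [S]_n >= y}, y = n
            hits += 1
    return hits / trials

n, x = 100, 0.2
bound = math.exp(-x * x * n / (2 * (1 + x)))   # right-hand side of (11), y = n
freq = empirical_tail(n, x, trials=20000)
print(freq, bound)   # the empirical frequency should stay below the bound
```

The bound is far from sharp in this regime (the empirical tail is an order of magnitude smaller), which is consistent with (11) being a worst-case inequality.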
3. Applications
As an application of Theorem 2.1, we consider the self-normalized sums of i.i.d. random variables.
Theorem 3.1.
Assume that $(\xi_i)_{i\ge1}$ are i.i.d. random variables such that $\xi_1\ge-1$ and $\mathbf{E}\xi_1^{2p}<\infty$, where $1<p\le 2$. Denote $\sigma^2=\mathbf{E}\xi_1^2$. Then for all $x>0$ and $y\in(0,\sigma^2)$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \exp\Big\{-\frac{x^2(\sigma^2-y)}{2(1+x)}\,n\Big\}+\exp\Big\{-\frac{1}{4}(p-1)\frac{y^{p/(p-1)}}{(\mathbf{E}\xi_1^{2p})^{1/(p-1)}}\,n\Big\}. \]
In particular, it implies that for all $x\in(0,1)$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \exp\Big\{-\frac{\sigma^2 x^2\big(1-x^{(p-1)/p}\big)}{2(1+x)}\,n\Big\}+\exp\Big\{-\frac{1}{4}(p-1)\frac{x}{(\mathbf{E}\xi_1^{2p}/\sigma^{2p})^{1/(p-1)}}\,n\Big\}. \]

By the last theorem, we have the following moderate deviation result: for any $x>0$ and $\alpha\in(0,1)$,
\[ \limsup_{n\to\infty} n^{-\alpha}\log \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\,n^{-(1-\alpha)/2}\Big) \le -\frac{\sigma^2 x^2}{2}. \]
For more such type moderate deviation results, we refer to Shao [21] and Jing et al. [16], where the authors established moderate deviation principles for self-normalized sums $S_n/\sqrt{[S]_n}$.

Consider Student's t-statistic $T_n$ defined by
\[ T_n=\frac{\sqrt{n}\,\bar{\xi}}{\Big(\frac{1}{n-1}\sum_{j=1}^n(\xi_j-\bar{\xi})^2\Big)^{1/2}},\qquad \bar{\xi}=\frac{1}{n}\sum_{i=1}^n\xi_i. \]
Clearly, $T_n$ and $S_n/\sqrt{[S]_n}$ are closely related via the following identity:
\[ T_n=\frac{S_n}{\sqrt{[S]_n}}\Big(\frac{n-1}{n-(S_n/\sqrt{[S]_n})^2}\Big)^{1/2}. \tag{19} \]
Since $x/(n-x^2)^{1/2}$ is increasing on $(-\sqrt{n},\sqrt{n})$, it follows from (19) that
\[ \{T_n\ge x\}=\Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x\Big(\frac{n}{n+x^2-1}\Big)^{1/2}\Big\}. \tag{20} \]
The above fact was pointed out by Efron [11]. With the help of (20), the following large deviation type result for the t-statistic is an immediate consequence of Theorem 2.3.

Theorem 3.2.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\big(T_n\ge x,\ b\le\sqrt{[S]_n}\le bM\big) \le \sqrt{e}\,\Big(1+\Big(2+x\Big(\frac{n}{n+x^2-1}\Big)^{1/2}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2\,\frac{n}{n+x^2-1}}{2\Big(1+\frac{x}{b}\big(\frac{n}{n+x^2-1}\big)^{1/2}\Big)}\Bigg\}. \tag{21} \]

The autoregressive process model can be expressed as follows: for all $n\ge0$,
\[ X_{n+1}=\theta X_n+\varepsilon_{n+1}, \tag{22} \]
where $X_n$ and $\varepsilon_n$ are the observations and driving noises, respectively. We assume that $(\varepsilon_n)$ is a sequence of i.i.d. centered random variables with variance $\sigma^2>0$ and that $X_0=\varepsilon_0$. We can estimate the unknown parameter $\theta$ by the least-squares estimator given by, for all $n\ge1$,
\[ \hat{\theta}_n=\frac{\sum_{k=1}^n X_{k-1}X_k}{\sum_{k=1}^n X_{k-1}^2}. \tag{23} \]
Bercu and Touati [2] established the convergence rate of $\hat{\theta}_n-\theta$ when $X_0$ and $(\varepsilon_n)$ are normal random variables. Here, we would like to give a convergence rate of $\hat{\theta}_n-\theta$ for the case where the driving noises $(\varepsilon_n)$ are bounded. Applying Theorem 2.2 and de la Peña's inequality (16), we have the following exponential inequalities.

Theorem 3.3.
Assume that $|\varepsilon_i|\le C$ for some positive constant $C$ and all $i$. If $|\theta|<1$, then for all $x>0$,
\[ \mathbf{P}\big(|\hat{\theta}_n-\theta|\ge x\big) \le 2\inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}, \tag{24} \]
and, for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le 2\exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{25} \]

Inequality (25) is similar to an exponential inequality of de la Peña, Klass and Lai [7], which states that when $(\varepsilon_n)$ are standard normal random variables, it holds for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le \exp\Big\{-\frac{x^2 y}{2}\Big\}. \tag{26} \]
By Theorem 2.4, we obtain the following result.

Theorem 3.4.
Assume that $|\varepsilon_i|\le C$ for some positive constant $C$ and all $i$. If $|\theta|<1$, then for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big|\hat{\theta}_n-\theta\big|\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \le 2\sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}. \tag{27} \]
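The autoregressive setting of (22)-(23) can be sketched in a few lines of code. The following simulation (illustrative only; the value of $\theta$, the uniform noise distribution and the sample size are hypothetical choices, not taken from the paper) generates a bounded-noise AR(1) path and computes the least-squares estimator (23).

```python
# Illustrative simulation of the AR(1) model (22) with bounded noise,
# and the least-squares estimator (23).  Parameters are hypothetical.
import random

def simulate_ar1(theta, n, c=1.0, seed=1):
    """Generate X_0, ..., X_n with X_{k+1} = theta*X_k + eps_{k+1},
    where eps_k are i.i.d. uniform on [-c, c] (centred, |eps_k| <= c),
    and X_0 = eps_0 as in the text."""
    rng = random.Random(seed)
    eps = [rng.uniform(-c, c) for _ in range(n + 1)]
    xs = [eps[0]]                              # X_0 = eps_0
    for k in range(n):
        xs.append(theta * xs[-1] + eps[k + 1])
    return xs

def lse(xs):
    """Least-squares estimator (23): sum X_{k-1}*X_k / sum X_{k-1}^2."""
    num = sum(xs[k - 1] * xs[k] for k in range(1, len(xs)))
    den = sum(xs[k - 1] ** 2 for k in range(1, len(xs)))
    return num / den

xs = simulate_ar1(theta=0.5, n=5000)
theta_hat = lse(xs)
print(theta_hat)   # should be close to the true value 0.5 for large n
```

Theorems 3.3 and 3.4 then quantify how fast $\hat{\theta}_n$ concentrates around $\theta$ in this bounded-noise regime.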
4. Proofs of Theorems
The following technical lemma is from Fan et al. [13]. For the reader's convenience, we give a proof following [13].
Lemma 4.1.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. For any $\lambda\in[0,1)$, denote
\[ U_n(\lambda)=\exp\Big\{\lambda S_n+\big(\lambda+\log(1-\lambda)\big)[S]_n\Big\}. \]
Then $(U_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a supermartingale, and it satisfies that for all $\lambda\in[0,1)$,
\[ \mathbf{E}\big[U_n(\lambda)\big]\le 1. \tag{28} \]

Proof. Assume $\xi_i\ge-1$. For $\lambda\in[0,1)$, it holds $\lambda\xi_i\ge-\lambda>-1$. Consider the function
\[ f(x)=\frac{\log(1+x)-x}{x^2/2},\qquad x>-1. \]
Since $f$ is increasing in $x$, we obtain that
\[ \log(1+\lambda\xi_i)\ge \lambda\xi_i+\frac{1}{2}(\lambda\xi_i)^2 f(-\lambda)=\lambda\xi_i+\xi_i^2\big(\lambda+\log(1-\lambda)\big). \]
Therefore, we have
\[ \exp\Big\{\lambda\xi_i+\xi_i^2\big(\lambda+\log(1-\lambda)\big)\Big\}\le 1+\lambda\xi_i. \]
Since $\mathbf{E}[\xi_i|\mathcal{F}_{i-1}]=0$, it follows that
\[ \mathbf{E}\Big[\exp\Big\{\lambda\xi_i+\big(\lambda+\log(1-\lambda)\big)\xi_i^2\Big\}\Big|\mathcal{F}_{i-1}\Big]\le 1. \]
For $\lambda\in[0,1)$ and $n\ge1$, we have
\[ U_n(\lambda)=U_{n-1}(\lambda)\exp\Big\{\lambda\xi_n+\big(\lambda+\log(1-\lambda)\big)\xi_n^2\Big\}. \]
Hence, we deduce that for all $\lambda\in[0,1)$,
\[ \mathbf{E}[U_n(\lambda)|\mathcal{F}_{n-1}]=U_{n-1}(\lambda)\,\mathbf{E}\Big[\exp\Big\{\lambda\xi_n+\big(\lambda+\log(1-\lambda)\big)\xi_n^2\Big\}\Big|\mathcal{F}_{n-1}\Big]\le U_{n-1}(\lambda), \]
which means that $(U_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a positive supermartingale. Moreover, it holds
\[ \mathbf{E}[U_n(\lambda)]\le\mathbf{E}[U_{n-1}(\lambda)]\le\dots\le\mathbf{E}[U_0(\lambda)]\le 1. \]
This completes the proof of Lemma 4.1. □
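The conclusion $\mathbf{E}[U_n(\lambda)]\le1$ of Lemma 4.1 can be checked numerically. The sketch below (illustrative only; the uniform distribution, $\lambda$, $n$ and the number of trials are hypothetical choices) estimates $\mathbf{E}[U_n(\lambda)]$ by Monte Carlo for i.i.d. differences uniform on $[-1,1]$, which satisfy $\xi_i\ge-1$ and $\mathbf{E}\xi_i=0$.

```python
# Monte Carlo check of Lemma 4.1: for i.i.d. differences xi_i uniform on
# [-1, 1] (so xi_i >= -1, E xi_i = 0), estimate
#   E[U_n(lambda)] with U_n(lambda) = exp{lambda*S_n + (lambda + log(1-lambda))*[S]_n},
# which should not exceed 1 for lambda in [0, 1).  Illustrative parameters.
import math
import random

def mean_U(lam, n, trials, seed=2):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xis = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        s = sum(xis)                      # S_n
        sq = sum(x * x for x in xis)      # [S]_n
        total += math.exp(lam * s + (lam + math.log(1.0 - lam)) * sq)
    return total / trials

m = mean_U(lam=0.5, n=50, trials=20000)
print(m)   # Monte Carlo estimate of E[U_n(lambda)]; expected to stay below 1
```

Since $\lambda+\log(1-\lambda)<0$, the factor $\exp\{(\lambda+\log(1-\lambda))[S]_n\}$ penalizes large squared variance, which is what keeps the expectation at or below one.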
In order to prove Theorem 2.2, we need the following lemma of Freedman [15].
Lemma 4.2.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. Denote
\[ W_n(\lambda)=\exp\Big\{-\lambda S_n-\big(e^{\lambda}-1-\lambda\big)\langle S\rangle_n\Big\},\qquad \lambda\ge0. \]
Then $(W_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a supermartingale, and it satisfies
\[ \mathbf{E}\big[W_n(\lambda)\big]\le 1. \tag{29} \]

Proof of Theorem 2.1. We follow the method of Bercu and Touati [2]. Let $A_n=\{S_n\ge x[S]_n\}$, $x>0$. By Markov's inequality, Hölder's inequality and Lemma 4.1, we have for all $\lambda\in[0,1)$ and $q>1$,
\[ \mathbf{P}(A_n) \le \mathbf{E}\Big[\exp\Big\{\frac{\lambda}{q}\big(S_n-x[S]_n\big)\Big\}\mathbf{1}_{A_n}\Big] \]
\[ = \mathbf{E}\Big[\exp\Big\{\frac{1}{q}\big(\lambda S_n+(\lambda+\log(1-\lambda))[S]_n\big)\Big\}\exp\Big\{\frac{1}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big] \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[U_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{-\frac{p}{q}\big(\lambda+\log(1-\lambda)+\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}, \tag{30} \]
where $1/p+1/q=1$. Consequently, as $p/q=p-1$, we can deduce from (30) that
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(\lambda+\log(1-\lambda)+\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}. \]
The right-hand side of the last inequality attains its minimum at $\lambda=\bar{\lambda}(x):=x/(1+x)$, for which $\lambda+\log(1-\lambda)+\lambda x=x-\log(1+x)$. Hence
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}. \]
Using the inequality
\[ x-\log(1+x)\ge\frac{x^2}{2(1+x)},\qquad x\ge0, \tag{31} \]
we deduce that
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p} \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}, \]
which gives the first two desired inequalities.

Next we prove the last two desired inequalities. Denote $B_n=\{S_n\ge x[S]_n,\ [S]_n\ge y\}$. By an argument similar to the proof of (30), we deduce that for all $q>1$, with $\lambda=\bar{\lambda}(x)$,
\[ \mathbf{P}(B_n) \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{B_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[U_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{-\frac{p}{q}\big(x-\log(1+x)\big)y\Big\}\Big]\Big)^{1/p} = \exp\Big\{-\frac{p-1}{p}\big(x-\log(1+x)\big)y\Big\}. \]
Therefore, by (31), it holds
\[ \mathbf{P}(B_n) \le \inf_{p>1}\exp\Big\{-\frac{p-1}{p}\big(x-\log(1+x)\big)y\Big\} = \exp\Big\{-\big(x-\log(1+x)\big)y\Big\} \le \exp\Big\{-\frac{x^2 y}{2(1+x)}\Big\}, \]
which gives the last two desired inequalities. □

Proof of Theorem 2.2. For all $x>0$, denote by $D_n=\{-S_n\ge x\langle S\rangle_n\}$.
By exponential Markov's inequality, we deduce that for all $\lambda\ge0$ and $q>1$,
\[ \mathbf{P}(D_n) \le \mathbf{E}\Big[\exp\Big\{\frac{\lambda}{q}\big(-S_n-x\langle S\rangle_n\big)\Big\}\mathbf{1}_{D_n}\Big] \]
\[ = \mathbf{E}\Big[\exp\Big\{\frac{1}{q}\big(-\lambda S_n-(e^{\lambda}-1-\lambda)\langle S\rangle_n\big)\Big\}\exp\Big\{\frac{1}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]. \]
Using Hölder's inequality and Lemma 4.2, we have for all $\lambda\ge0$ and $q>1$,
\[ \mathbf{P}(D_n) \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[W_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}, \tag{32} \]
where $1/p+1/q=1$. Consequently, as $p/q=p-1$, we can deduce from (32) that
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{(p-1)\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \tag{33} \]
The right-hand side of the last inequality attains its minimum at $\lambda=\bar{\lambda}(x):=\log(1+x)$. Substituting $\lambda=\bar{\lambda}(x)$ in (33), we obtain
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \tag{34} \]
Using the inequality
\[ e^{\lambda}-1-\lambda \le \frac{\lambda^2}{2(1-\lambda/3)},\qquad \lambda\in[0,3), \tag{35} \]
we get for all $x\ge0$,
\[ (1+x)\log(1+x)-x = \sup_{\lambda\ge0}\Big(\lambda x-\big(e^{\lambda}-1-\lambda\big)\Big) \ge \sup_{\lambda\in[0,3)}\Big(\lambda x-\frac{\lambda^2}{2(1-\lambda/3)}\Big) = \frac{x^2}{1+x/3+\sqrt{1+2x/3}} \ge \frac{x^2}{2(1+x/3)}, \]
where the last step follows from $\sqrt{1+2x/3}\le 1+x/3$. Thus, from (34), we obtain for all $x\ge0$,
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{1+x/3+\sqrt{1+2x/3}}\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p} \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \]
This proves (12) and (13); dropping the indicator $\mathbf{1}_{D_n}$ gives (14). □
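The lower-tail bound just proved can also be illustrated numerically. The sketch below (not from the paper; parameters are illustrative) uses i.i.d. Rademacher differences, for which $\langle S\rangle_n=n$ is deterministic, so the right-hand side of (14) simplifies to $\inf_{p>1}\exp\{-\frac{p-1}{p}\,\frac{x^2 n}{2(1+x/3)}\}=\exp\{-\frac{x^2 n}{2(1+x/3)}\}$.

```python
# Monte Carlo sanity check of inequality (14) with Rademacher differences,
# for which <S>_n = n and the bound reduces to exp{-x^2*n/(2(1+x/3))}.
# Parameters are illustrative only.
import math
import random

def lower_tail(n, x, trials, seed=3):
    """Estimate P(S_n / <S>_n <= -x) when <S>_n = n almost surely."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        if s <= -x * n:             # event {S_n / <S>_n <= -x}
            hits += 1
    return hits / trials

n, x = 100, 0.2
bound = math.exp(-x * x * n / (2.0 * (1.0 + x / 3.0)))
freq = lower_tail(n, x, trials=20000)
print(freq, bound)   # the empirical frequency should stay below the bound
```

By symmetry of the Rademacher distribution this tail matches the upper-tail experiment, and it indeed stays well below the Bernstein-type bound.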
Proof of Theorem 2.3. The proof is based on a modified method of Liptser and Spokoiny [19]. Given $a>1$, introduce the geometric series $b_k=ba^k$ and define the random events
\[ C_k=\Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b_k\le\sqrt{[S]_n}<b_{k+1}\Big\},\qquad k=0,1,\dots,K, \]
where $K$ stands for the integer part of $\log_a M$. Clearly, it holds
\[ \Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big\}\subseteq\bigcup_{k=0}^K C_k, \tag{36} \]
which leads to
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big)\le\sum_{k=0}^K\mathbf{P}(C_k). \tag{37} \]
Notice that
\[ \lambda+\log(1-\lambda)\ge-\frac{\lambda^2}{2(1-\lambda)},\qquad\lambda\in[0,1). \]
Hence, by Lemma 4.1, for any $\lambda\in[0,1)$,
\[ \mathbf{E}\Big[\exp\Big\{\lambda S_n-\frac{\lambda^2}{2(1-\lambda)}[S]_n\Big\}\mathbf{1}_{C_k}\Big]\le1. \]
Next, taking $\lambda_k=x/(x+b_k)$, for which $\lambda_k^2/(2(1-\lambda_k))=x^2/(2b_k(x+b_k))$, we obtain for any $x>0$,
\[ 1\ge\mathbf{E}\Big[\exp\Big\{\frac{x}{x+b_k}S_n-\frac{x^2}{2b_k(x+b_k)}[S]_n\Big\}\mathbf{1}_{C_k}\Big] \ge\mathbf{E}\Big[\exp\Big\{\frac{x^2}{x+b_k}\sqrt{[S]_n}-\frac{x^2}{2b_k(x+b_k)}[S]_n\Big\}\mathbf{1}_{C_k}\Big] \]
\[ \ge\exp\Big\{\inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{x+b_k}\,c-\frac{x^2}{2b_k(x+b_k)}\,c^2\Big)\Big\}\,\mathbf{P}(C_k). \]
The function of $c$ above is concave, so its infimum over $[b_k,b_{k+1})$ is attained at an endpoint; evaluating at $c=b_k$ and $c=b_{k+1}=ab_k$ gives
\[ \inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{x+b_k}\,c-\frac{x^2}{2b_k(x+b_k)}\,c^2\Big)\ge\frac{x^2}{2(1+x/b_k)}\min\big(1,\,a(2-a)\big)\ge\frac{x^2}{2(1+x/b)}\big(1-(a-1)^2\big). \]
Taking $a=1+1/(1+x)$, so that $x^2(a-1)^2/2=x^2/(2(1+x)^2)\le1/2$, we obtain
\[ \mathbf{P}(C_k)\le\exp\Big\{\frac{x^2(a-1)^2}{2}\Big\}\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}\le\sqrt{e}\,\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}. \]
Since $\log(1+x)\ge x/(1+x)$ for $x\ge0$, we obtain $K+1\le1+\log_a M\le1+(2+x)\ln M$, and (17) follows by (37). □

Proof of Theorem 2.4. The proof is similar to that of Theorem 2.3. Given $a>1$, introduce the geometric series $b_k=ba^k$ and define the random events
\[ H_k=\Big\{-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b_k\le\sqrt{\langle S\rangle_n}<b_{k+1}\Big\},\qquad k=0,1,\dots,K, \]
where $K$ stands for the integer part of $\log_a M$. Clearly, it holds
\[ \Big\{-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big\}\subseteq\bigcup_{k=0}^K H_k, \tag{38} \]
which leads to
\[ \mathbf{P}\Big(-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big)\le\sum_{k=0}^K\mathbf{P}(H_k). \tag{39} \]
Notice that $e^{\lambda}-1-\lambda\le\lambda^2/(2(1-\lambda/3))$ for $\lambda\in[0,3)$. Hence, by Lemma 4.2, for any $\lambda\in[0,3)$,
\[ \mathbf{E}\Big[\exp\Big\{\lambda(-S_n)-\frac{\lambda^2}{2(1-\lambda/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big]\le1. \]
Next, taking $\lambda_k=x/(b_k+x/3)\in[0,3)$, for which $\lambda_k^2/(2(1-\lambda_k/3))=x^2/(2b_k(b_k+x/3))$, we obtain for any $x>0$,
\[ 1\ge\mathbf{E}\Big[\exp\Big\{\frac{x}{b_k+x/3}(-S_n)-\frac{x^2}{2b_k(b_k+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big] \ge\mathbf{E}\Big[\exp\Big\{\frac{x^2}{b_k+x/3}\sqrt{\langle S\rangle_n}-\frac{x^2}{2b_k(b_k+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big] \]
\[ \ge\exp\Big\{\inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{b_k+x/3}\,c-\frac{x^2}{2b_k(b_k+x/3)}\,c^2\Big)\Big\}\,\mathbf{P}(H_k). \]
Arguing as in the proof of Theorem 2.3, with $a=1+1/(1+x)$,
\[ \mathbf{P}(H_k)\le\sqrt{e}\,\exp\Big\{-\frac{x^2}{2(1+x/(3b))}\Big\}, \]
and $K+1\le1+(2+x)\ln M$, so (18) follows from (39). □
Lemma 4.3.
Let $(\zeta_i)_{i\ge1}$ be independent nonnegative random variables with $\mathbf{E}\zeta_i^p<\infty$, where $1<p\le2$. Then for all $0<y<\sum_{i=1}^n\mathbf{E}\zeta_i$,
\[ \mathbf{P}\Big(\sum_{i=1}^n\zeta_i\le\sum_{i=1}^n\mathbf{E}\zeta_i-y\Big) \le \exp\Bigg\{-\frac{(p-1)\,y^{p/(p-1)}}{4\big(\sum_{i=1}^n\mathbf{E}\zeta_i^p\big)^{1/(p-1)}}\Bigg\}. \tag{40} \]
Now we are in a position to prove Theorem 3.1.

Proof of Theorem 3.1. For all $x>0$ and $y\in(0,\sigma^2)$, we have
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) = \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge n(\sigma^2-y)\Big)+\mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n<n(\sigma^2-y)\Big) \]
\[ \le \exp\Big\{-\frac{x^2(\sigma^2-y)}{2(1+x)}\,n\Big\}+\mathbf{P}\big([S]_n-n\sigma^2<-ny\big). \tag{41} \]
By Lemma 4.3 applied to $\zeta_i=\xi_i^2$, we have for all $y\in(0,\sigma^2)$,
\[ \mathbf{P}\big([S]_n-n\sigma^2<-ny\big) \le \exp\Bigg\{-\frac{(p-1)(ny)^{p/(p-1)}}{4\big(n\,\mathbf{E}\xi_1^{2p}\big)^{1/(p-1)}}\Bigg\} = \exp\Bigg\{-\frac{1}{4}(p-1)\frac{y^{p/(p-1)}}{(\mathbf{E}\xi_1^{2p})^{1/(p-1)}}\,n\Bigg\}. \tag{42} \]
Combining (41) and (42) together, we obtain the first desired inequality. Taking $y=x^{(p-1)/p}\sigma^2$, we obtain the second desired inequality. □

Proof of Theorem 3.3. By (22), we have $X_k=\sum_{i=0}^k\theta^{k-i}\varepsilon_i$. Since $|\theta|<1$ and $|\varepsilon_i|\le C$, we deduce that for all $k$,
\[ |X_k|\le C\sum_{i=0}^k|\theta|^{k-i}\le\frac{C}{1-|\theta|}. \]
For $n\ge1$,
\[ \hat{\theta}_n-\theta=\frac{\sum_{k=1}^n X_{k-1}\varepsilon_k}{\sum_{k=1}^n X_{k-1}^2}. \tag{43} \]
For any $i=1,\dots,n$, set $\xi_i=X_{i-1}\varepsilon_i(1-|\theta|)/C^2$ and $\mathcal{F}_i=\sigma(\varepsilon_k,\,0\le k\le i)$. Then $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$ is a sequence of martingale differences and satisfies $|\xi_i|\le1$ and
\[ \langle S\rangle_n=\sum_{k=1}^n\mathbf{E}\big[\xi_k^2\big|\mathcal{F}_{k-1}\big]=\frac{\sigma^2(1-|\theta|)^2}{C^4}\sum_{k=1}^n X_{k-1}^2. \]
Thus we have
\[ \hat{\theta}_n-\theta=\frac{(1-|\theta|)\sigma^2}{C^2}\,\frac{S_n}{\langle S\rangle_n}. \]
Applying inequality (14) to $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we deduce that for all $x>0$,
\[ \mathbf{P}\big(\hat{\theta}_n-\theta\le-x\big)=\mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le-\frac{xC^2}{(1-|\theta|)\sigma^2}\Big) \le \inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}. \tag{44} \]
Similarly, applying inequality (14) to $(-\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we have for all $x>0$,
\[ \mathbf{P}\big(\hat{\theta}_n-\theta\ge x\big) \le \inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}. \tag{45} \]
Combining (44) and (45) together, we obtain for all $x>0$,
\[ \mathbf{P}\big(|\hat{\theta}_n-\theta|\ge x\big) \le 2\inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}, \]
which gives the first desired inequality. Applying de la Peña's inequality (16) to $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we get for all $x,y>0$,
\[ \mathbf{P}\Big(\hat{\theta}_n-\theta\le-x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big)=\mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le-\frac{xC^2}{(1-|\theta|)\sigma^2},\ \langle S\rangle_n\ge\frac{y(1-|\theta|)^2\sigma^2}{C^4}\Big) \le \exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{46} \]
Similarly, applying de la Peña's inequality (16) to $(-\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we have for all $x,y>0$,
\[ \mathbf{P}\Big(\hat{\theta}_n-\theta\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le \exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{47} \]
Combining (46) and (47) together, we obtain for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le 2\exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}, \]
which gives the second desired inequality. □

Proof of Theorem 3.4. Recall the notations in the proof of Theorem 3.3. It is easy to see that
\[ \big(\hat{\theta}_n-\theta\big)\sqrt{\sum_{k=1}^n X_{k-1}^2}=\frac{\sum_{k=1}^n X_{k-1}\varepsilon_k}{\sqrt{\sum_{k=1}^n X_{k-1}^2}}=\sigma\,\frac{S_n}{\sqrt{\langle S\rangle_n}}. \]
Therefore, by Theorem 2.4, for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big(\hat{\theta}_n-\theta\big)\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le-x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \]
\[ \le \mathbf{P}\Big(\frac{S_n}{\sqrt{\langle S\rangle_n}}\le-\frac{x}{\sigma},\ \frac{b(1-|\theta|)\sigma}{C^2}\le\sqrt{\langle S\rangle_n}\le\frac{bM(1-|\theta|)\sigma}{C^2}\Big) \le \sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}. \]
Similarly, the same bound holds for the tail probability
\[ \mathbf{P}\Big(\big(\hat{\theta}_n-\theta\big)\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big). \]
Hence, we have for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big|\hat{\theta}_n-\theta\big|\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \le 2\sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}, \]
which gives the desired inequality. □

Acknowledgements
The work has been partially supported by the National Natural Science Foundation of China (Grant nos. 11601375 and 11626250).

References

[1] Bennett, G. (1962). Probability inequalities for the sum of independent random variables. J. Amer. Statist. Assoc. 57, No. 297, 33-45.
[2] Bercu, B. and Touati, A. (2008). Exponential inequalities for self-normalized martingales with applications. Ann. Appl. Probab. 18: 1848-1869.
[3] Bercu, B., Delyon, B. and Rio, E. (2015). Concentration inequalities for sums and martingales. New York: Springer.
[4] Bernstein, S. (1946). The Theory of Probabilities (Russian). Moscow, Leningrad.
[5] Chen, X., Shao, Q.M., Wu, W.B. and Xu, L. (2016). Self-normalized Cramér-type moderate deviations under dependence. Ann. Statist. 44, No. 4, 1593-1617.
[6] de la Peña, V.H. (1999). A general class of exponential inequalities for martingales and ratios. Ann. Probab. 27(1): 537-564.
[7] de la Peña, V.H., Klass, M.J. and Lai, T.L. (2007). Pseudo-maximization and self-normalized processes. Probab. Surveys 4: 172-192.
[8] de la Peña, V.H., Lai, T.L. and Shao, Q.M. (2008). Self-normalized processes: Limit theory and Statistical Applications. Springer.
[9] Delyon, B. (2009). Exponential inequalities for sums of weakly dependent variables. Electron. J. Probab. 14: 752-779.
[10] Dzhaparidze, K. and van Zanten, J.H. (2001). On Bernstein-type inequalities for martingales. Stochastic Process. Appl. 93, 109-117.
[11] Efron, B. (1969). Student's t-test under symmetry conditions. J. Amer. Statist. Assoc. 64, 1278-1302.
[12] Fan, X., Grama, I. and Liu, Q. (2012). Hoeffding's inequality for supermartingales. Stochastic Process. Appl. 122: 3545-3559.
[13] Fan, X., Grama, I. and Liu, Q. (2015). Exponential inequalities for martingales with applications. Electron. J. Probab. 20, no. 1, 1-22.
[14] Fan, X., Grama, I. and Liu, Q. (2017). Martingale inequalities of type Dzhaparidze and van Zanten. Statistics 51(6): 1200-1213.
[15] Freedman, D.A. (1975). On tail probabilities for martingales. Ann. Probab. 3: 100-118.
[16] Jing, B.-Y., Liang, H.-Y. and Zhou, W. (2012). Self-normalized moderate deviations for independent random variables. Sci. China Math. 55(11): 2297-2315.
[17] Lesigne, E. and Volný, D. (2001). Large deviations for martingales. Stochastic Process. Appl. 96, 143-159.
[18] Liu, Q. and Watbled, F. (2009). Exponential inequalities for martingales and asymptotic properties of the free energy of directed polymers in a random environment. Stochastic Process. Appl. 119: 3101-3132.
[19] Liptser, R. and Spokoiny, V. (2000). Deviation probability bound for martingales with applications to statistical estimation. Statist. Probab. Lett. 46: 347-357.
[20] Rio, E. (2017). New deviation inequalities for martingales with bounded increments. Stochastic Process. Appl.
[21] Shao, Q.M. (1997). Self-normalized large deviations. Ann. Probab. 25(1): 285-328.