Bernstein type inequalities for self-normalized martingales with applications
Xiequan Fan∗, Shen Wang
Center for Applied Mathematics, Tianjin University, 300072 Tianjin, China
Abstract
For self-normalized martingales with conditionally symmetric differences, de la Peña [6] established Gaussian type exponential inequalities. Bercu and Touati [2] extended de la Peña's inequalities to martingales with differences heavy on left. In this paper, we establish Bernstein type exponential inequalities for self-normalized martingales with differences bounded from below. Moreover, applications to self-normalized sums, t-statistics and autoregressive processes are discussed.
Keywords: martingales; self-normalized processes; exponential inequalities; autoregressive processes
Primary 60G42; 60E15; 60F10
1. Introduction
∗ Corresponding author.
E-mail: [email protected] (X. Fan).
Preprint submitted to Elsevier September 11, 2018

Let $(\xi_i)_{i\ge 1}$ be a sequence of zero-mean independent random variables satisfying $\xi_i \le 1$ for all $i$. Denote by $S_n=\sum_{i=1}^n \xi_i$ the partial sums of $(\xi_i)_{i\ge 1}$. Bennett [1] proved the following Bernstein type inequality: for all $x>0$,
\[ \mathbf{P}\big(S_n \ge xv\big) \le \exp\Big\{-\frac{x^2}{2(1+x/(3v))}\Big\}, \tag{1} \]
where $v^2=\mathrm{Var}(S_n)$ is the variance of $S_n$. The importance of Bernstein type inequalities comes from the fact that they combine both a Gaussian trend and an exponentially decaying rate. To see this, we rewrite the last inequality in the following form: for all $x>0$,
\[ \mathbf{P}\big(S_n \ge x\big) \le \exp\Big\{-\frac{x^2}{2(v^2+x/3)}\Big\}. \tag{2} \]
It is easy to see that the last bound behaves as $\exp\{-x^2/(2v^2)\}$ for moderate $x=o(v^2)$, while it decays exponentially to $0$ as $x\to\infty$.

The generalization of (1) to martingales has attracted considerable interest. Assume that $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ is a sequence of martingale differences. If $\xi_i\le 1$, Freedman [15] showed that (1) also holds when $\mathbf{P}(S_n\ge xv)$ is replaced by $\mathbf{P}(S_n\ge xv,\ \langle S\rangle_n\le v^2)$, where $\langle S\rangle_n$ is the conditional variance of $S_n$. De la Peña [6], Dzhaparidze and van Zanten [10] and Fan et al. [12, 14] extended Freedman's inequality to martingales with non-bounded differences. Recently, Rio [20] gave a refinement of Freedman's inequality.

Despite the fact that the martingale case is well studied, there are only a few results on Bernstein type inequalities for self-normalized martingales $S_n/[S]_n$, where $[S]_n$ is the squared variance of $S_n$. Among them, let us recall the following exponential inequalities of de la Peña [6]. Assume that $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ is a sequence of conditionally symmetric martingale differences. Recall that $\xi_i$ is called conditionally symmetric if $\mathcal{L}(\xi_i|\mathcal{F}_{i-1})=\mathcal{L}(-\xi_i|\mathcal{F}_{i-1})$ for all $i$, where $\mathcal{L}(\xi_i|\mathcal{F}_{i-1})$ stands for the regular version of the conditional distribution of $\xi_i$ given the $\sigma$-field $\mathcal{F}_{i-1}$. De la Peña [6] established the following exponential inequalities for self-normalized martingales: for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \sqrt{\mathbf{E}\Big[\exp\Big\{-\frac{x^2}{2}[S]_n\Big\}\Big]}, \tag{3} \]
and, for all $x,y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge y\Big) \le \exp\Big\{-\frac{x^2 y}{2}\Big\}, \tag{4} \]
where $[S]_n=\sum_{i=1}^n \xi_i^2$ is the squared variance of $S_n$. In the i.i.d. case, $[S]_n/n$ usually converges almost surely to the variance of the random variables. Thus (3) and (4) can be regarded as Gaussian type inequalities.

The inequalities of de la Peña have been extended to martingales with differences heavy on left. Recall that an integrable random variable $X$ is called heavy on left if $\mathbf{E}X=0$ and, for all $a>0$, $\mathbf{E}[T_a(X)]\le 0$, where
\[ T_a(X)=\min(|X|,a)\,\mathrm{sign}(X) \]
is the truncated version of $X$. Clearly, conditionally symmetric martingale differences are heavy on left. Bercu and Touati [2] obtained the following extension of de la Peña's inequality (3): for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-\frac{1}{2}(p-1)\,x^2[S]_n\Big\}\Big]\Big)^{1/p}. \tag{5} \]
They also showed that (4) holds for martingales with differences heavy on left. In the particular case $p=2$, inequality (5) reduces to inequality (3) under the conditional symmetry assumption. Similar results for self-normalized martingales $S_n/\sqrt{[S]_n}$ can also be found in Bercu and Touati [2].

Exponential inequalities for self-normalized martingales have a lot of applications. We refer to de la Peña, Klass and Lai [7] for autoregressive processes. Bercu and Touati [2] applied such type inequalities to parameter estimation for linear regressions, autoregressive processes and branching processes. For more applications of such type inequalities, we refer to the monographs of de la Peña, Lai and Shao [8] and Bercu, Delyon and Rio [3].

In this paper, we aim to establish Bernstein type inequalities for self-normalized martingales with differences bounded from below. It is obvious that a random variable bounded from below need not be heavy on left. Our results for self-normalized martingales are analogues of the inequalities (3)-(5). Applications to self-normalized sums, t-statistics and autoregressive processes are also discussed.

The paper is organized as follows. We present our main results in Section 2. In Section 3, we discuss the applications, and we prove our main results in Section 4.

2. Main results

Let $(\xi_i,\mathcal{F}_i)_{i=0,\dots,n}$ be a finite sequence of real-valued square integrable martingale differences defined on a probability space $(\Omega,\mathcal{F},\mathbf{P})$, where $\xi_0=0$ and $\{\emptyset,\Omega\}=\mathcal{F}_0\subseteq\dots\subseteq\mathcal{F}_n\subseteq\mathcal{F}$ are increasing $\sigma$-fields. So by definition, we have $\mathbf{E}[\xi_i|\mathcal{F}_{i-1}]=0$, $i=1,\dots,n$. Set $S_0=0$ and
\[ S_k=\sum_{i=1}^k \xi_i \tag{6} \]
for $k=1,\dots,n$. Then $S=(S_k,\mathcal{F}_k)_{k=1,\dots,n}$ is a martingale.
Let $[S]$ and $\langle S\rangle$ be, respectively, the squared variance and the conditional variance of the martingale $S$, that is
\[ [S]_0=0,\quad [S]_k=\sum_{i=1}^k \xi_i^2,\qquad\text{and}\qquad \langle S\rangle_0=0,\quad \langle S\rangle_k=\sum_{i=1}^k \mathbf{E}[\xi_i^2|\mathcal{F}_{i-1}],\qquad k=1,\dots,n. \tag{7} \]
Our main result is the following set of Bernstein type inequalities for self-normalized martingales with differences bounded from below. It is worth mentioning that the inequalities are new even for independent random variables.

Theorem 2.1.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{\{S_n\ge x[S]_n\}}\Big]\Big)^{1/p} \tag{8} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\mathbf{1}_{\{S_n\ge x[S]_n\}}\Big]\Big)^{1/p}, \tag{9} \]
and, for all $y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge y\Big) \le \exp\Big\{-\big(x-\log(1+x)\big)y\Big\} \tag{10} \]
\[ \le \exp\Big\{-\frac{x^2 y}{2(1+x)}\Big\}. \tag{11} \]

Clearly, inequality (9) implies that for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\Big]\Big)^{1/p}, \]
which is an analogue of de la Peña's inequality (3) and of the inequality (5) of Bercu and Touati.

Denote $B_n=\sum_{i=1}^n \mathbf{E}\xi_i^2$. It is easy to see that for all $x>0$ and $0<\varepsilon<1$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge B_n(1-\varepsilon)\Big)+\mathbf{P}\big([S]_n<B_n(1-\varepsilon)\big) \]
\[ = \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge B_n(1-\varepsilon)\Big)+\mathbf{P}\Big(\sum_{i=1}^n\big(\xi_i^2-\mathbf{E}\xi_i^2\big)<-B_n\varepsilon\Big). \]
The first term of the last bound can be estimated by (11). For the second term of the last bound, notice that $(\xi_i^2-\mathbf{E}\xi_i^2)_{i=1,\dots,n}$ are centered random variables bounded from below, and they are independent once $(\xi_i)_{i=1,\dots,n}$ are independent. Thus we need the following Bernstein type exponential inequalities for centered random variables bounded from below.

Theorem 2.2.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{\{S_n\le-x\langle S\rangle_n\}}\Big]\Big)^{1/p} \tag{12} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{\{S_n\le-x\langle S\rangle_n\}}\Big]\Big)^{1/p}. \tag{13} \]

Inequality (13) implies that for all $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x\Big) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\Big]\Big)^{1/p}. \tag{14} \]
It seems that the bound (14) is usually decreasing in $p$. For instance, consider the independent case. When $(\xi_i)_{i=1,\dots,n}$ are independent random variables, we have $\langle S\rangle_n=\mathrm{Var}(S_n)$, where $\mathrm{Var}(S_n)$ stands for the variance of $S_n$. Then the bound (14) is decreasing in $p$.

For more exponential inequalities similar to those of Theorem 2.2, we refer to Theorem 1.3 of de la Peña [6]. In particular, de la Peña proved (13) with $p=2$. Moreover, de la Peña also proved the following Bernstein type exponential inequalities: for all $x,y>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le -x,\ \langle S\rangle_n\ge y\Big) \le \exp\Big\{-\big((1+x)\log(1+x)-x\big)y\Big\} \tag{15} \]
\[ \le \exp\Big\{-\frac{x^2 y}{2(1+x/3)}\Big\}. \tag{16} \]
It is easy to see that the inequalities (15) and (16) are respectively the counterparts of (10) and (11) for $S_n/\langle S\rangle_n$.

Notice that in the independent case, the bounds (14) and (16) are exactly Bernstein's bound (1). Thus (14) and (16) can be regarded as Bernstein type inequalities for martingales.

The following deviation inequality for self-normalized martingales has its own independent interest.

Theorem 2.3.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge 1$ and $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big) \le \sqrt{e}\,\big(1+(2+x)\ln M\big)\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}. \tag{17} \]

Similarly, when $[S]_n$ on the left-hand side of (17) is replaced by $\langle S\rangle_n$, we have the following inequality for normalized martingales. Such type inequalities are due to Liptser and Spokoiny [19].

Theorem 2.4.
Assume that $\xi_i\ge -1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge 1$ and $x>0$,
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{\langle S\rangle_n}}\le -x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big) \le \sqrt{e}\,\big(1+(2+x)\ln M\big)\exp\Big\{-\frac{x^2}{2(1+x/(3b))}\Big\}. \tag{18} \]
It is interesting to see that in the independent case, inequality (18) with $b=\sqrt{\mathrm{Var}(S_n)}$ and $M=1$ reduces exactly to Bennett's bound (1), up to the absolute constant $\sqrt{e}$. Thus the bound (18) is rather tight.
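Bounds of this type are straightforward to illustrate numerically. The following sketch (not part of the original paper; all parameters are chosen purely for illustration) uses i.i.d. Rademacher differences $\xi_i=\pm1$, for which $\xi_i\ge-1$ and $[S]_n=n$ almost surely, and compares the empirical frequency of the event in (11) (taken with $y=n$) against the right-hand side of (11).

```python
# Monte Carlo sanity check of the Bernstein type bound (11) for
# self-normalized sums of Rademacher variables (xi_i = +/-1, so
# xi_i >= -1 and [S]_n = n).  Parameters are illustrative only.
import math
import random

def empirical_tail(n, x, trials, seed=0):
    """Estimate P(S_n / [S]_n >= x) when [S]_n = n almost surely."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        if s >= x * n:          # event {S_n / [S]_n >= x, [S]_n >= y}, y = n
            hits += 1
    return hits / trials

n, x = 100, 0.2
bound = math.exp(-x * x * n / (2 * (1 + x)))   # right-hand side of (11), y = n
freq = empirical_tail(n, x, trials=20000)
print(freq, bound)   # the empirical frequency should stay below the bound
```

The bound is far from sharp in this regime (the empirical tail is an order of magnitude smaller), which is consistent with (11) being a worst-case inequality.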
3. Applications
As an application of Theorem 2.1, we consider the self-normalized sums of i.i.d. random variables.
Theorem 3.1.
Assume that $(\xi_i)_{i\ge1}$ are i.i.d. random variables such that $\xi_1\ge-1$ and $\mathbf{E}\xi_1^{2p}<\infty$, where $1<p\le 2$. Denote $\sigma^2=\mathbf{E}\xi_1^2$. Then for all $x>0$ and $y\in(0,\sigma^2)$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \exp\Big\{-\frac{x^2(\sigma^2-y)}{2(1+x)}\,n\Big\}+\exp\Big\{-\frac{1}{4}(p-1)\frac{y^{p/(p-1)}}{(\mathbf{E}\xi_1^{2p})^{1/(p-1)}}\,n\Big\}. \]
In particular, it implies that for all $x\in(0,1)$,
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) \le \exp\Big\{-\frac{\sigma^2 x^2\big(1-x^{(p-1)/p}\big)}{2(1+x)}\,n\Big\}+\exp\Big\{-\frac{1}{4}(p-1)\frac{x}{(\mathbf{E}\xi_1^{2p}/\sigma^{2p})^{1/(p-1)}}\,n\Big\}. \]

By the last theorem, we have the following moderate deviation result: for any $x>0$ and $\alpha\in(0,1)$,
\[ \limsup_{n\to\infty} n^{-\alpha}\log \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\,n^{-(1-\alpha)/2}\Big) \le -\frac{\sigma^2 x^2}{2}. \]
For more such type moderate deviation results, we refer to Shao [21] and Jing et al. [16], where the authors established moderate deviation principles for self-normalized sums $S_n/\sqrt{[S]_n}$.

Consider Student's t-statistic $T_n$ defined by
\[ T_n=\frac{\sqrt{n}\,\bar{\xi}}{\Big(\frac{1}{n-1}\sum_{j=1}^n(\xi_j-\bar{\xi})^2\Big)^{1/2}},\qquad \bar{\xi}=\frac{1}{n}\sum_{i=1}^n\xi_i. \]
Clearly, $T_n$ and $S_n/\sqrt{[S]_n}$ are closely related via the following identity:
\[ T_n=\frac{S_n}{\sqrt{[S]_n}}\Big(\frac{n-1}{n-(S_n/\sqrt{[S]_n})^2}\Big)^{1/2}. \tag{19} \]
Since $x/(n-x^2)^{1/2}$ is increasing on $(-\sqrt{n},\sqrt{n})$, it follows from (19) that
\[ \{T_n\ge x\}=\Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x\Big(\frac{n}{n+x^2-1}\Big)^{1/2}\Big\}. \tag{20} \]
The above fact was pointed out by Efron [11]. With the help of (20), the following large deviation type result for the t-statistic is an immediate consequence of Theorem 2.3.

Theorem 3.2.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. Then for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\big(T_n\ge x,\ b\le\sqrt{[S]_n}\le bM\big) \le \sqrt{e}\,\Big(1+\Big(2+x\Big(\frac{n}{n+x^2-1}\Big)^{1/2}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2\,\frac{n}{n+x^2-1}}{2\Big(1+\frac{x}{b}\big(\frac{n}{n+x^2-1}\big)^{1/2}\Big)}\Bigg\}. \tag{21} \]

The autoregressive process model can be expressed as follows: for all $n\ge0$,
\[ X_{n+1}=\theta X_n+\varepsilon_{n+1}, \tag{22} \]
where $X_n$ and $\varepsilon_n$ are the observations and driving noises, respectively. We assume that $(\varepsilon_n)$ is a sequence of i.i.d. centered random variables with variance $\sigma^2>0$ and that $X_0=\varepsilon_0$. We can estimate the unknown parameter $\theta$ by the least-squares estimator given by, for all $n\ge1$,
\[ \hat{\theta}_n=\frac{\sum_{k=1}^n X_{k-1}X_k}{\sum_{k=1}^n X_{k-1}^2}. \tag{23} \]
Bercu and Touati [2] established the convergence rate of $\hat{\theta}_n-\theta$ when $X_0$ and $(\varepsilon_n)$ are normal random variables. Here, we would like to give a convergence rate of $\hat{\theta}_n-\theta$ for the case where the driving noises $(\varepsilon_n)$ are bounded. Applying Theorem 2.2 and de la Peña's inequality (16), we have the following exponential inequalities.

Theorem 3.3.
Assume that $|\varepsilon_i|\le C$ for some positive constant $C$ and all $i$. If $|\theta|<1$, then for all $x>0$,
\[ \mathbf{P}\big(|\hat{\theta}_n-\theta|\ge x\big) \le 2\inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}, \tag{24} \]
and, for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le 2\exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{25} \]

Inequality (25) is similar to an exponential inequality of de la Peña, Klass and Lai [7], which states that when $(\varepsilon_n)$ are standard normal random variables, it holds for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le \exp\Big\{-\frac{x^2 y}{2}\Big\}. \tag{26} \]
By Theorem 2.4, we obtain the following result.

Theorem 3.4.
Assume that $|\varepsilon_i|\le C$ for some positive constant $C$ and all $i$. If $|\theta|<1$, then for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big|\hat{\theta}_n-\theta\big|\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \le 2\sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}. \tag{27} \]
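The autoregressive setting of (22)-(23) can be sketched in a few lines of code. The following simulation (illustrative only; the value of $\theta$, the uniform noise distribution and the sample size are hypothetical choices, not taken from the paper) generates a bounded-noise AR(1) path and computes the least-squares estimator (23).

```python
# Illustrative simulation of the AR(1) model (22) with bounded noise,
# and the least-squares estimator (23).  Parameters are hypothetical.
import random

def simulate_ar1(theta, n, c=1.0, seed=1):
    """Generate X_0, ..., X_n with X_{k+1} = theta*X_k + eps_{k+1},
    where eps_k are i.i.d. uniform on [-c, c] (centred, |eps_k| <= c),
    and X_0 = eps_0 as in the text."""
    rng = random.Random(seed)
    eps = [rng.uniform(-c, c) for _ in range(n + 1)]
    xs = [eps[0]]                              # X_0 = eps_0
    for k in range(n):
        xs.append(theta * xs[-1] + eps[k + 1])
    return xs

def lse(xs):
    """Least-squares estimator (23): sum X_{k-1}*X_k / sum X_{k-1}^2."""
    num = sum(xs[k - 1] * xs[k] for k in range(1, len(xs)))
    den = sum(xs[k - 1] ** 2 for k in range(1, len(xs)))
    return num / den

xs = simulate_ar1(theta=0.5, n=5000)
theta_hat = lse(xs)
print(theta_hat)   # should be close to the true value 0.5 for large n
```

Theorems 3.3 and 3.4 then quantify how fast $\hat{\theta}_n$ concentrates around $\theta$ in this bounded-noise regime.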
4. Proofs of Theorems
The following technical lemma is from Fan et al. [13]. For the reader's convenience, we give a proof following [13].
Lemma 4.1.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. For any $\lambda\in[0,1)$, denote
\[ U_n(\lambda)=\exp\Big\{\lambda S_n+\big(\lambda+\log(1-\lambda)\big)[S]_n\Big\}. \]
Then $(U_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a supermartingale, and it satisfies that for all $\lambda\in[0,1)$,
\[ \mathbf{E}\big[U_n(\lambda)\big]\le 1. \tag{28} \]

Proof. Assume $\xi_i\ge-1$. For $\lambda\in[0,1)$, it holds $\lambda\xi_i\ge-\lambda>-1$. Consider the function
\[ f(x)=\frac{\log(1+x)-x}{x^2/2},\qquad x>-1. \]
Since $f$ is increasing in $x$, we obtain that
\[ \log(1+\lambda\xi_i)\ge \lambda\xi_i+\frac{1}{2}(\lambda\xi_i)^2 f(-\lambda)=\lambda\xi_i+\xi_i^2\big(\lambda+\log(1-\lambda)\big). \]
Therefore, we have
\[ \exp\Big\{\lambda\xi_i+\xi_i^2\big(\lambda+\log(1-\lambda)\big)\Big\}\le 1+\lambda\xi_i. \]
Since $\mathbf{E}[\xi_i|\mathcal{F}_{i-1}]=0$, it follows that
\[ \mathbf{E}\Big[\exp\Big\{\lambda\xi_i+\big(\lambda+\log(1-\lambda)\big)\xi_i^2\Big\}\Big|\mathcal{F}_{i-1}\Big]\le 1. \]
For $\lambda\in[0,1)$ and $n\ge1$, we have
\[ U_n(\lambda)=U_{n-1}(\lambda)\exp\Big\{\lambda\xi_n+\big(\lambda+\log(1-\lambda)\big)\xi_n^2\Big\}. \]
Hence, we deduce that for all $\lambda\in[0,1)$,
\[ \mathbf{E}[U_n(\lambda)|\mathcal{F}_{n-1}]=U_{n-1}(\lambda)\,\mathbf{E}\Big[\exp\Big\{\lambda\xi_n+\big(\lambda+\log(1-\lambda)\big)\xi_n^2\Big\}\Big|\mathcal{F}_{n-1}\Big]\le U_{n-1}(\lambda), \]
which means that $(U_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a positive supermartingale. Moreover, it holds
\[ \mathbf{E}[U_n(\lambda)]\le\mathbf{E}[U_{n-1}(\lambda)]\le\dots\le\mathbf{E}[U_0(\lambda)]\le 1. \]
This completes the proof of Lemma 4.1. □
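The conclusion $\mathbf{E}[U_n(\lambda)]\le1$ of Lemma 4.1 can be checked numerically. The sketch below (illustrative only; the uniform distribution, $\lambda$, $n$ and the number of trials are hypothetical choices) estimates $\mathbf{E}[U_n(\lambda)]$ by Monte Carlo for i.i.d. differences uniform on $[-1,1]$, which satisfy $\xi_i\ge-1$ and $\mathbf{E}\xi_i=0$.

```python
# Monte Carlo check of Lemma 4.1: for i.i.d. differences xi_i uniform on
# [-1, 1] (so xi_i >= -1, E xi_i = 0), estimate
#   E[U_n(lambda)] with U_n(lambda) = exp{lambda*S_n + (lambda + log(1-lambda))*[S]_n},
# which should not exceed 1 for lambda in [0, 1).  Illustrative parameters.
import math
import random

def mean_U(lam, n, trials, seed=2):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xis = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        s = sum(xis)                      # S_n
        sq = sum(x * x for x in xis)      # [S]_n
        total += math.exp(lam * s + (lam + math.log(1.0 - lam)) * sq)
    return total / trials

m = mean_U(lam=0.5, n=50, trials=20000)
print(m)   # Monte Carlo estimate of E[U_n(lambda)]; expected to stay below 1
```

Since $\lambda+\log(1-\lambda)<0$, the factor $\exp\{(\lambda+\log(1-\lambda))[S]_n\}$ penalizes large squared variance, which is what keeps the expectation at or below one.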
In order to prove Theorem 2.2, we need the following lemma of Freedman [15].
Lemma 4.2.
Assume that $\xi_i\ge-1$ for all $i\in[1,n]$. Denote
\[ W_n(\lambda)=\exp\Big\{-\lambda S_n-\big(e^{\lambda}-1-\lambda\big)\langle S\rangle_n\Big\},\qquad \lambda\ge0. \]
Then $(W_i(\lambda),\mathcal{F}_i)_{i=0,\dots,n}$ is a supermartingale, and it satisfies
\[ \mathbf{E}\big[W_n(\lambda)\big]\le 1. \tag{29} \]

Proof of Theorem 2.1. We follow the method of Bercu and Touati [2]. Let $A_n=\{S_n\ge x[S]_n\}$, $x>0$. By Markov's inequality, Hölder's inequality and Lemma 4.1, we have for all $\lambda\in[0,1)$ and $q>1$,
\[ \mathbf{P}(A_n) \le \mathbf{E}\Big[\exp\Big\{\frac{\lambda}{q}\big(S_n-x[S]_n\big)\Big\}\mathbf{1}_{A_n}\Big] \]
\[ = \mathbf{E}\Big[\exp\Big\{\frac{1}{q}\big(\lambda S_n+(\lambda+\log(1-\lambda))[S]_n\big)\Big\}\exp\Big\{\frac{1}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big] \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[U_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{-\frac{p}{q}\big(\lambda+\log(1-\lambda)+\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}, \tag{30} \]
where $1/p+1/q=1$. Consequently, as $p/q=p-1$, we can deduce from (30) that
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(\lambda+\log(1-\lambda)+\lambda x\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}. \]
The right-hand side of the last inequality attains its minimum at $\lambda=\bar{\lambda}(x):=x/(1+x)$, for which $\lambda+\log(1-\lambda)+\lambda x=x-\log(1+x)$. Hence
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}. \]
Using the inequality
\[ x-\log(1+x)\ge\frac{x^2}{2(1+x)},\qquad x\ge0, \tag{31} \]
we deduce that
\[ \mathbf{P}(A_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big(x-\log(1+x)\big)[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p} \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x)}[S]_n\Big\}\mathbf{1}_{A_n}\Big]\Big)^{1/p}, \]
which gives the first two desired inequalities.

Next we prove the last two desired inequalities. Denote $B_n=\{S_n\ge x[S]_n,\ [S]_n\ge y\}$. By an argument similar to the proof of (30), we deduce that for all $q>1$, with $\lambda=\bar{\lambda}(x)$,
\[ \mathbf{P}(B_n) \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(-\lambda-\log(1-\lambda)-\lambda x\big)[S]_n\Big\}\mathbf{1}_{B_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[U_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{-\frac{p}{q}\big(x-\log(1+x)\big)y\Big\}\Big]\Big)^{1/p} = \exp\Big\{-\frac{p-1}{p}\big(x-\log(1+x)\big)y\Big\}. \]
Therefore, by (31), it holds
\[ \mathbf{P}(B_n) \le \inf_{p>1}\exp\Big\{-\frac{p-1}{p}\big(x-\log(1+x)\big)y\Big\} = \exp\Big\{-\big(x-\log(1+x)\big)y\Big\} \le \exp\Big\{-\frac{x^2 y}{2(1+x)}\Big\}, \]
which gives the last two desired inequalities. □

Proof of Theorem 2.2. For all $x>0$, denote by $D_n=\{-S_n\ge x\langle S\rangle_n\}$.
By exponential Markov's inequality, we deduce that for all $\lambda\ge0$ and $q>1$,
\[ \mathbf{P}(D_n) \le \mathbf{E}\Big[\exp\Big\{\frac{\lambda}{q}\big(-S_n-x\langle S\rangle_n\big)\Big\}\mathbf{1}_{D_n}\Big] \]
\[ = \mathbf{E}\Big[\exp\Big\{\frac{1}{q}\big(-\lambda S_n-(e^{\lambda}-1-\lambda)\langle S\rangle_n\big)\Big\}\exp\Big\{\frac{1}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]. \]
Using Hölder's inequality and Lemma 4.2, we have for all $\lambda\ge0$ and $q>1$,
\[ \mathbf{P}(D_n) \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}\Big(\mathbf{E}\big[W_n(\lambda)\big]\Big)^{1/q} \]
\[ \le \Big(\mathbf{E}\Big[\exp\Big\{\frac{p}{q}\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}, \tag{32} \]
where $1/p+1/q=1$. Consequently, as $p/q=p-1$, we can deduce from (32) that
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{(p-1)\big(e^{\lambda}-1-\lambda-\lambda x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \tag{33} \]
The right-hand side of the last inequality attains its minimum at $\lambda=\bar{\lambda}(x):=\log(1+x)$. Substituting $\lambda=\bar{\lambda}(x)$ in (33), we obtain
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \tag{34} \]
Using the inequality
\[ e^{\lambda}-1-\lambda \le \frac{\lambda^2}{2(1-\lambda/3)},\qquad \lambda\in[0,3), \tag{35} \]
we get for all $x\ge0$,
\[ (1+x)\log(1+x)-x = \sup_{\lambda\ge0}\Big(\lambda x-\big(e^{\lambda}-1-\lambda\big)\Big) \ge \sup_{\lambda\in[0,3)}\Big(\lambda x-\frac{\lambda^2}{2(1-\lambda/3)}\Big) = \frac{x^2}{1+x/3+\sqrt{1+2x/3}} \ge \frac{x^2}{2(1+x/3)}, \]
where the last step follows from $\sqrt{1+2x/3}\le 1+x/3$. Thus, from (34), we obtain for all $x\ge0$,
\[ \mathbf{P}(D_n) \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\big((1+x)\log(1+x)-x\big)\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p} \]
\[ \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{1+x/3+\sqrt{1+2x/3}}\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p} \le \inf_{p>1}\Big(\mathbf{E}\Big[\exp\Big\{-(p-1)\frac{x^2}{2(1+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{D_n}\Big]\Big)^{1/p}. \]
This proves (12) and (13); dropping the indicator $\mathbf{1}_{D_n}$ gives (14). □
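The lower-tail bound just proved can also be illustrated numerically. The sketch below (not from the paper; parameters are illustrative) uses i.i.d. Rademacher differences, for which $\langle S\rangle_n=n$ is deterministic, so the right-hand side of (14) simplifies to $\inf_{p>1}\exp\{-\frac{p-1}{p}\,\frac{x^2 n}{2(1+x/3)}\}=\exp\{-\frac{x^2 n}{2(1+x/3)}\}$.

```python
# Monte Carlo sanity check of inequality (14) with Rademacher differences,
# for which <S>_n = n and the bound reduces to exp{-x^2*n/(2(1+x/3))}.
# Parameters are illustrative only.
import math
import random

def lower_tail(n, x, trials, seed=3):
    """Estimate P(S_n / <S>_n <= -x) when <S>_n = n almost surely."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        if s <= -x * n:             # event {S_n / <S>_n <= -x}
            hits += 1
    return hits / trials

n, x = 100, 0.2
bound = math.exp(-x * x * n / (2.0 * (1.0 + x / 3.0)))
freq = lower_tail(n, x, trials=20000)
print(freq, bound)   # the empirical frequency should stay below the bound
```

By symmetry of the Rademacher distribution this tail matches the upper-tail experiment, and it indeed stays well below the Bernstein-type bound.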
Proof of Theorem 2.3. The proof is based on a modified method of Liptser and Spokoiny [19]. Given $a>1$, introduce the geometric series $b_k=ba^k$ and define the random events
\[ C_k=\Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b_k\le\sqrt{[S]_n}<b_{k+1}\Big\},\qquad k=0,1,\dots,K, \]
where $K$ stands for the integer part of $\log_a M$. Clearly, it holds
\[ \Big\{\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big\}\subseteq\bigcup_{k=0}^K C_k, \tag{36} \]
which leads to
\[ \mathbf{P}\Big(\frac{S_n}{\sqrt{[S]_n}}\ge x,\ b\le\sqrt{[S]_n}\le bM\Big)\le\sum_{k=0}^K\mathbf{P}(C_k). \tag{37} \]
Notice that
\[ \lambda+\log(1-\lambda)\ge-\frac{\lambda^2}{2(1-\lambda)},\qquad\lambda\in[0,1). \]
Hence, by Lemma 4.1, for any $\lambda\in[0,1)$,
\[ \mathbf{E}\Big[\exp\Big\{\lambda S_n-\frac{\lambda^2}{2(1-\lambda)}[S]_n\Big\}\mathbf{1}_{C_k}\Big]\le1. \]
Next, taking $\lambda_k=x/(x+b_k)$, for which $\lambda_k^2/(2(1-\lambda_k))=x^2/(2b_k(x+b_k))$, we obtain for any $x>0$,
\[ 1\ge\mathbf{E}\Big[\exp\Big\{\frac{x}{x+b_k}S_n-\frac{x^2}{2b_k(x+b_k)}[S]_n\Big\}\mathbf{1}_{C_k}\Big] \ge\mathbf{E}\Big[\exp\Big\{\frac{x^2}{x+b_k}\sqrt{[S]_n}-\frac{x^2}{2b_k(x+b_k)}[S]_n\Big\}\mathbf{1}_{C_k}\Big] \]
\[ \ge\exp\Big\{\inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{x+b_k}\,c-\frac{x^2}{2b_k(x+b_k)}\,c^2\Big)\Big\}\,\mathbf{P}(C_k). \]
The function of $c$ above is concave, so its infimum over $[b_k,b_{k+1})$ is attained at an endpoint; evaluating at $c=b_k$ and $c=b_{k+1}=ab_k$ gives
\[ \inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{x+b_k}\,c-\frac{x^2}{2b_k(x+b_k)}\,c^2\Big)\ge\frac{x^2}{2(1+x/b_k)}\min\big(1,\,a(2-a)\big)\ge\frac{x^2}{2(1+x/b)}\big(1-(a-1)^2\big). \]
Taking $a=1+1/(1+x)$, so that $x^2(a-1)^2/2=x^2/(2(1+x)^2)\le1/2$, we obtain
\[ \mathbf{P}(C_k)\le\exp\Big\{\frac{x^2(a-1)^2}{2}\Big\}\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}\le\sqrt{e}\,\exp\Big\{-\frac{x^2}{2(1+x/b)}\Big\}. \]
Since $\log(1+x)\ge x/(1+x)$ for $x\ge0$, we obtain $K+1\le1+\log_a M\le1+(2+x)\ln M$, and (17) follows by (37). □

Proof of Theorem 2.4. The proof is similar to that of Theorem 2.3. Given $a>1$, introduce the geometric series $b_k=ba^k$ and define the random events
\[ H_k=\Big\{-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b_k\le\sqrt{\langle S\rangle_n}<b_{k+1}\Big\},\qquad k=0,1,\dots,K, \]
where $K$ stands for the integer part of $\log_a M$. Clearly, it holds
\[ \Big\{-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big\}\subseteq\bigcup_{k=0}^K H_k, \tag{38} \]
which leads to
\[ \mathbf{P}\Big(-\frac{S_n}{\sqrt{\langle S\rangle_n}}\ge x,\ b\le\sqrt{\langle S\rangle_n}\le bM\Big)\le\sum_{k=0}^K\mathbf{P}(H_k). \tag{39} \]
Notice that $e^{\lambda}-1-\lambda\le\lambda^2/(2(1-\lambda/3))$ for $\lambda\in[0,3)$. Hence, by Lemma 4.2, for any $\lambda\in[0,3)$,
\[ \mathbf{E}\Big[\exp\Big\{\lambda(-S_n)-\frac{\lambda^2}{2(1-\lambda/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big]\le1. \]
Next, taking $\lambda_k=x/(b_k+x/3)\in[0,3)$, for which $\lambda_k^2/(2(1-\lambda_k/3))=x^2/(2b_k(b_k+x/3))$, we obtain for any $x>0$,
\[ 1\ge\mathbf{E}\Big[\exp\Big\{\frac{x}{b_k+x/3}(-S_n)-\frac{x^2}{2b_k(b_k+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big] \ge\mathbf{E}\Big[\exp\Big\{\frac{x^2}{b_k+x/3}\sqrt{\langle S\rangle_n}-\frac{x^2}{2b_k(b_k+x/3)}\langle S\rangle_n\Big\}\mathbf{1}_{H_k}\Big] \]
\[ \ge\exp\Big\{\inf_{b_k\le c<b_{k+1}}\Big(\frac{x^2}{b_k+x/3}\,c-\frac{x^2}{2b_k(b_k+x/3)}\,c^2\Big)\Big\}\,\mathbf{P}(H_k). \]
Arguing as in the proof of Theorem 2.3, with $a=1+1/(1+x)$,
\[ \mathbf{P}(H_k)\le\sqrt{e}\,\exp\Big\{-\frac{x^2}{2(1+x/(3b))}\Big\}, \]
and $K+1\le1+(2+x)\ln M$, so (18) follows from (39). □
Lemma 4.3.
Let $(\zeta_i)_{i\ge1}$ be independent nonnegative random variables with $\mathbf{E}\zeta_i^p<\infty$, where $1<p\le2$. Then for all $0<y<\sum_{i=1}^n\mathbf{E}\zeta_i$,
\[ \mathbf{P}\Big(\sum_{i=1}^n\zeta_i\le\sum_{i=1}^n\mathbf{E}\zeta_i-y\Big) \le \exp\Bigg\{-\frac{(p-1)\,y^{p/(p-1)}}{4\big(\sum_{i=1}^n\mathbf{E}\zeta_i^p\big)^{1/(p-1)}}\Bigg\}. \tag{40} \]
Now we are in a position to prove Theorem 3.1.

Proof of Theorem 3.1. For all $x>0$ and $y\in(0,\sigma^2)$, we have
\[ \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x\Big) = \mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n\ge n(\sigma^2-y)\Big)+\mathbf{P}\Big(\frac{S_n}{[S]_n}\ge x,\ [S]_n<n(\sigma^2-y)\Big) \]
\[ \le \exp\Big\{-\frac{x^2(\sigma^2-y)}{2(1+x)}\,n\Big\}+\mathbf{P}\big([S]_n-n\sigma^2<-ny\big). \tag{41} \]
By Lemma 4.3 applied to $\zeta_i=\xi_i^2$, we have for all $y\in(0,\sigma^2)$,
\[ \mathbf{P}\big([S]_n-n\sigma^2<-ny\big) \le \exp\Bigg\{-\frac{(p-1)(ny)^{p/(p-1)}}{4\big(n\,\mathbf{E}\xi_1^{2p}\big)^{1/(p-1)}}\Bigg\} = \exp\Bigg\{-\frac{1}{4}(p-1)\frac{y^{p/(p-1)}}{(\mathbf{E}\xi_1^{2p})^{1/(p-1)}}\,n\Bigg\}. \tag{42} \]
Combining (41) and (42) together, we obtain the first desired inequality. Taking $y=x^{(p-1)/p}\sigma^2$, we obtain the second desired inequality. □

Proof of Theorem 3.3. By (22), we have $X_k=\sum_{i=0}^k\theta^{k-i}\varepsilon_i$. Since $|\theta|<1$ and $|\varepsilon_i|\le C$, we deduce that for all $k$,
\[ |X_k|\le C\sum_{i=0}^k|\theta|^{k-i}\le\frac{C}{1-|\theta|}. \]
For $n\ge1$,
\[ \hat{\theta}_n-\theta=\frac{\sum_{k=1}^n X_{k-1}\varepsilon_k}{\sum_{k=1}^n X_{k-1}^2}. \tag{43} \]
For any $i=1,\dots,n$, set $\xi_i=X_{i-1}\varepsilon_i(1-|\theta|)/C^2$ and $\mathcal{F}_i=\sigma(\varepsilon_k,\,0\le k\le i)$. Then $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$ is a sequence of martingale differences and satisfies $|\xi_i|\le1$ and
\[ \langle S\rangle_n=\sum_{k=1}^n\mathbf{E}\big[\xi_k^2\big|\mathcal{F}_{k-1}\big]=\frac{\sigma^2(1-|\theta|)^2}{C^4}\sum_{k=1}^n X_{k-1}^2. \]
Thus we have
\[ \hat{\theta}_n-\theta=\frac{(1-|\theta|)\sigma^2}{C^2}\,\frac{S_n}{\langle S\rangle_n}. \]
Applying inequality (14) to $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we deduce that for all $x>0$,
\[ \mathbf{P}\big(\hat{\theta}_n-\theta\le-x\big)=\mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le-\frac{xC^2}{(1-|\theta|)\sigma^2}\Big) \le \inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}. \tag{44} \]
Similarly, applying inequality (14) to $(-\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we have for all $x>0$,
\[ \mathbf{P}\big(\hat{\theta}_n-\theta\ge x\big) \le \inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}. \tag{45} \]
Combining (44) and (45) together, we obtain for all $x>0$,
\[ \mathbf{P}\big(|\hat{\theta}_n-\theta|\ge x\big) \le 2\inf_{p>1}\Bigg(\mathbf{E}\Bigg[\exp\Bigg\{-(p-1)\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\sum_{k=1}^n X_{k-1}^2\Bigg\}\Bigg]\Bigg)^{1/p}, \]
which gives the first desired inequality. Applying de la Peña's inequality (16) to $(\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we get for all $x,y>0$,
\[ \mathbf{P}\Big(\hat{\theta}_n-\theta\le-x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big)=\mathbf{P}\Big(\frac{S_n}{\langle S\rangle_n}\le-\frac{xC^2}{(1-|\theta|)\sigma^2},\ \langle S\rangle_n\ge\frac{y(1-|\theta|)^2\sigma^2}{C^4}\Big) \le \exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{46} \]
Similarly, applying de la Peña's inequality (16) to $(-\xi_i,\mathcal{F}_i)_{i=1,\dots,n}$, we have for all $x,y>0$,
\[ \mathbf{P}\Big(\hat{\theta}_n-\theta\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le \exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}. \tag{47} \]
Combining (46) and (47) together, we obtain for all $x,y>0$,
\[ \mathbf{P}\Big(|\hat{\theta}_n-\theta|\ge x,\ \sum_{k=1}^n X_{k-1}^2\ge y\Big) \le 2\exp\Bigg\{-\frac{x^2 y}{2\big(\sigma^2+\frac{xC^2}{3(1-|\theta|)}\big)}\Bigg\}, \]
which gives the second desired inequality. □

Proof of Theorem 3.4. Recall the notations in the proof of Theorem 3.3. It is easy to see that
\[ \big(\hat{\theta}_n-\theta\big)\sqrt{\sum_{k=1}^n X_{k-1}^2}=\frac{\sum_{k=1}^n X_{k-1}\varepsilon_k}{\sqrt{\sum_{k=1}^n X_{k-1}^2}}=\sigma\,\frac{S_n}{\sqrt{\langle S\rangle_n}}. \]
Therefore, by Theorem 2.4, for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big(\hat{\theta}_n-\theta\big)\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le-x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \]
\[ \le \mathbf{P}\Big(\frac{S_n}{\sqrt{\langle S\rangle_n}}\le-\frac{x}{\sigma},\ \frac{b(1-|\theta|)\sigma}{C^2}\le\sqrt{\langle S\rangle_n}\le\frac{bM(1-|\theta|)\sigma}{C^2}\Big) \le \sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}. \]
Similarly, the same bound holds for the tail probability
\[ \mathbf{P}\Big(\big(\hat{\theta}_n-\theta\big)\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big). \]
Hence, we have for all $b>0$, $M\ge1$ and $x>0$,
\[ \mathbf{P}\Big(\big|\hat{\theta}_n-\theta\big|\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\ge x,\ b\le\sqrt{\textstyle\sum_{k=1}^n X_{k-1}^2}\le bM\Big) \le 2\sqrt{e}\,\Big(1+\Big(2+\frac{x}{\sigma}\Big)\ln M\Big)\exp\Bigg\{-\frac{x^2}{2\big(\sigma^2+\frac{xC^2}{3b(1-|\theta|)}\big)}\Bigg\}, \]
which gives the desired inequality. □

Acknowledgements
The work has been partially supported by the National Natural Science Foundation of China (Grant nos. 11601375 and 11626250).

References

[1] Bennett, G. (1962). Probability inequalities for the sum of independent random variables. J. Amer. Statist. Assoc. 57, No. 297, 33-45.
[2] Bercu, B. and Touati, A. (2008). Exponential inequalities for self-normalized martingales with applications. Ann. Appl. Probab. 18: 1848-1869.
[3] Bercu, B., Delyon, B. and Rio, E. (2015). Concentration inequalities for sums and martingales. New York: Springer.
[4] Bernstein, S. (1946). The Theory of Probabilities (Russian). Moscow, Leningrad.
[5] Chen, X., Shao, Q.M., Wu, W.B. and Xu, L. (2016). Self-normalized Cramér-type moderate deviations under dependence. Ann. Statist. 44, No. 4, 1593-1617.
[6] de la Peña, V.H. (1999). A general class of exponential inequalities for martingales and ratios. Ann. Probab. 27(1): 537-564.
[7] de la Peña, V.H., Klass, M.J. and Lai, T.L. (2007). Pseudo-maximization and self-normalized processes. Probab. Surveys 4: 172-192.
[8] de la Peña, V.H., Lai, T.L. and Shao, Q.M. (2008). Self-normalized processes: Limit theory and Statistical Applications. Springer.
[9] Delyon, B. (2009). Exponential inequalities for sums of weakly dependent variables. Electron. J. Probab. 14: 752-779.
[10] Dzhaparidze, K. and van Zanten, J.H. (2001). On Bernstein-type inequalities for martingales. Stochastic Process. Appl. 93, 109-117.
[11] Efron, B. (1969). Student's t-test under symmetry conditions. J. Amer. Statist. Assoc. 64, 1278-1302.
[12] Fan, X., Grama, I. and Liu, Q. (2012). Hoeffding's inequality for supermartingales. Stochastic Process. Appl. 122: 3545-3559.
[13] Fan, X., Grama, I. and Liu, Q. (2015). Exponential inequalities for martingales with applications. Electron. J. Probab. 20, no. 1, 1-22.
[14] Fan, X., Grama, I. and Liu, Q. (2017). Martingale inequalities of type Dzhaparidze and van Zanten. Statistics 51(6): 1200-1213.
[15] Freedman, D.A. (1975). On tail probabilities for martingales. Ann. Probab. 3: 100-118.
[16] Jing, B.-Y., Liang, H.-Y. and Zhou, W. (2012). Self-normalized moderate deviations for independent random variables. Sci. China Math. 55(11): 2297-2315.
[17] Lesigne, E. and Volný, D. (2001). Large deviations for martingales. Stochastic Process. Appl. 96, 143-159.
[18] Liu, Q. and Watbled, F. (2009). Exponential inequalities for martingales and asymptotic properties of the free energy of directed polymers in a random environment. Stochastic Process. Appl. 119: 3101-3132.
[19] Liptser, R. and Spokoiny, V. (2000). Deviation probability bound for martingales with applications to statistical estimation. Statist. Probab. Lett. 46: 347-357.
[20] Rio, E. (2017). New deviation inequalities for martingales with bounded increments. Stochastic Process. Appl.
[21] Shao, Q.M. (1997). Self-normalized large deviations. Ann. Probab. 25(1): 285-328.