Large deviation for lasso diffusion process

Abstract

The aim of the present paper is to extend the large deviation with discontinuous statistics studied in \cite{BDE} to the diffusion $dx^\varepsilon = -\{A^\top(Ax^\varepsilon - y) + \mu\,\mathrm{sgn}(x^\varepsilon)\}\,dt + \varepsilon\,dw$. The discontinuity of the drift of the diffusion discussed in \cite{BDE} is the hyperplane $\{x \in \mathbb{R}^d : x_1 = 0\}$; in our case the discontinuity is more complex and is the set $\{x \in \mathbb{R}^d : \prod_{i=1}^d x_i = 0\}$.


arXiv [math.PR], Oct

Azzouz Dermoune, Khalifa Es-Sebaiy, Youssef Ouknine

Laboratoire Paul Painlevé, USTL-UMR-CNRS 8524, Lille, France. Email: [email protected]

Cadi Ayyad University, Av. Abdelkrim Khattabi, 40000, Guéliz-Marrakech, Morocco. Emails: [email protected], [email protected]

November 2018

Let $y \in \mathbb{R}^n$ be a given vector, $A$ be a known matrix which maps the domain $\mathbb{R}^d$ into the domain $\mathbb{R}^n$, and $\mu > 0$ be a given constant. The sign of a real number $u$ equals $\mathrm{sgn}(u) = 1$ if $u > 0$, $\mathrm{sgn}(u) = -1$ if $u < 0$, and $\mathrm{sgn}(0)$ is any element of $[-1, 1]$. For a vector $x$ we set $\mathrm{sgn}(x) := (\mathrm{sgn}(x_1), \ldots, \mathrm{sgn}(x_d))^\top$. The following diffusion

$$dx^\varepsilon = -\{A^\top(Ax^\varepsilon - y) + \mu\,\mathrm{sgn}(x^\varepsilon)\}\,dt + \varepsilon\,dw, \qquad x^\varepsilon(0) = x(0) \text{ is given} \tag{1}$$

has a discontinuous drift. Using the fact that $A^\top(Ax - y) + \mu\,\mathrm{sgn}(x)$ is the subdifferential of the convex map

$$\frac{\|Ax - y\|_2^2}{2} + \mu\|x\|_1,$$

we can show that the latter stochastic differential equation (sde) has a unique strong solution for any $\varepsilon > 0$. See [15, 7, 8, 17]. Here $\|\cdot\|_2$, $\|\cdot\|_1$ denote respectively the $l^2$ and $l^1$ norms. Studying the limit $t \to +\infty$ is also possible. The probability density function

$$\frac{1}{Z}\exp\Big\{-\frac{2}{\varepsilon^2}\Big(\frac{\|Ax - y\|_2^2}{2} + \mu\|x\|_1\Big)\Big\} := p^\varepsilon(dx)$$

is the unique invariant probability measure of $(x^\varepsilon_t)$, see e.g. [1]. The mode of $p^\varepsilon$ was introduced in linear regression by [18] and is called lasso. Lasso is the compact and convex set of solutions of the system

$$A_i^\top(Ax - y) + \mu\,\mathrm{sgn}(x_i) = 0, \qquad i = 1, \ldots, d. \tag{2}$$

Here $A_i^\top$ denotes the $i$-th row of the matrix $A^\top$. A large number of theoretical results have been provided for lasso, see e.g. [10, 12, 18, 19, 16] and the references therein.

If $(P^\varepsilon_t)$ is the semigroup defined by (1), then we have the following exponential convergence

$$E_{p^\varepsilon}\big[|P^\varepsilon_t f - E_{p^\varepsilon}(f)|^2\big] \le \exp(-t/C)\,\mathrm{var}_{p^\varepsilon}(f), \tag{3}$$

where $C = 4E_{p^\varepsilon}[\|x - E_{p^\varepsilon}(x)\|^2]$. The proof is a consequence of the Poincaré inequality ([14, 3]):

$$\mathrm{var}_{p^\varepsilon}(f) \le E_{p^\varepsilon}\big[\|x - E_{p^\varepsilon}(x)\|^2\big]\,E_{p^\varepsilon}(\|\nabla f\|^2),$$

valid for every Lipschitz map $f$ because $p^\varepsilon$ is log-concave, and of the fact that the Poincaré inequality is equivalent to the exponential convergence (3). As a consequence we can suppose that a.s. $\sup_{t \ge 0}\|x^\varepsilon(t)\| < +\infty$.

$\varepsilon \to 0$

Let $U : \mathbb{R}^d \to \mathbb{R}$ be a convex map such that

$$|\nabla U(x)| \le L(1 + |x|), \qquad \forall x,$$

where $L$ is a positive constant and $\nabla U$ denotes the subdifferential of $U$. It is known (see e.g. [15, 7]) that the sde

$$dx^\varepsilon(t) \in -\nabla U(x^\varepsilon(t))\,dt + \varepsilon\,dw_t, \qquad t \in [0, T],$$

with fixed initial value $x(0)$, has a unique strong solution. More precisely there exists a unique solution

$$x^\varepsilon(t) = x(0) - \int_0^t v^\varepsilon(s)\,ds + \varepsilon w_t, \qquad \forall t \in [0, T],$$

where the measurable map $v^\varepsilon$ is such that $v^\varepsilon(t) \in \nabla U(x^\varepsilon_t)$ $dt$ a.e. From the linear growth of $\nabla U$ we have

$$\|x^\varepsilon(t)\| \le K + L\int_0^t |x^\varepsilon(s)|\,ds,$$

where $K = \|x(0)\| + LT + \varepsilon\sup_{t \in [0,T]}\|w_t\|$. Gronwall's lemma tells us that

$$\|x^\varepsilon(t)\| \le K\exp(LT), \qquad \forall t \in [0, T].$$

Using the Ascoli theorem, we can extract a subsequence such that $x^\varepsilon \to x$ uniformly in $[0, T]$.
Now using the inequality

$$\|v^\varepsilon(t)\| \le L(1 + \|x^\varepsilon(t)\|) \le C, \qquad \forall t,$$

we derive that the sequence $(v^\varepsilon)$ is weakly precompact in $L^p([0, T])$ for all $1 < p < +\infty$. Using Mazur's lemma we can construct a measurable map $v$ and a subsequence such that $v^\varepsilon(t) \to v(t)$, $dt$ a.e. From the condition $v^\varepsilon(t) \in \partial U(x^\varepsilon(t))$, the convergence $(x^\varepsilon(t), v^\varepsilon(t)) \to (x(t), v(t))$ and the fact that $\partial U$ is maximal monotone, we have $v(t) \in \partial U(x(t))$ $dt$ a.e. Finally the limit $x$ is the unique solution of the differential inclusion

$$dx_t \in -\partial U(x_t)\,dt.$$

See also [6].

As an application, the solution $x^\varepsilon$ of (1) converges as $\varepsilon \to 0$ to the solution of

$$dx(t) \in -\{A^\top(Ax(t) - y) + \mu\,\mathrm{sgn}(x(t))\}\,dt, \qquad x(0) \text{ is given}. \tag{4}$$

The solution $x(t)$ converges to lasso as $t \to +\infty$, i.e. $x(t)$ converges to the minimizers of

$$\frac{\|Ax - y\|_2^2}{2} + \mu\|x\|_1.$$

The set of the latter minimizers is compact. Hence

$$\sup_{t \ge 0}\|x(t)\| < +\infty. \tag{5}$$

Then we have for some $C > 0$ and small $\varepsilon$ that

$$\sup_{t \ge 0}\|x^\varepsilon(t)\| < C \tag{6}$$

with high probability.

The aim of our work is to extend the Boué, Dupuis, Ellis large deviation with discontinuous statistics [5] to the diffusion (1). The discontinuity in [5] is the hyperplane $\{x \in \mathbb{R}^d : x_1 = 0\}$. The discontinuity of the drift of the diffusion (1) is more complex and is the set $\{x \in \mathbb{R}^d : \prod_{i=1}^d x_i = 0\}$. Before arriving at the large deviation result we need some preliminary results.

Preliminary results
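As a numerical aside on the limit equation (4): a minimal sketch, assuming a forward-backward (proximal gradient, ISTA) time discretization of the differential inclusion. This discretization is our choice for illustration and is not a method taken from the paper; the function names and the $2 \times 2$ data below are hypothetical. The sketch iterates the discretized flow and then checks the lasso system (2), including the sticking condition $|A_i^\top(Ax - y)| \le \mu$ on a coordinate that ends at zero.

```python
def soft_threshold(z, t):
    """Proximal map of t*|.|: shrink z toward 0 by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def grad_smooth(A, y, x):
    """Return A^T (A x - y) for a matrix A given as a list of rows."""
    n, d = len(A), len(x)
    residual = [sum(A[k][j] * x[j] for j in range(d)) - y[k] for k in range(n)]
    return [sum(A[k][i] * residual[k] for k in range(n)) for i in range(d)]


def lasso_flow(A, y, mu, x0, steps=2000):
    """Forward-backward discretization of dx in -{A^T(Ax - y) + mu sgn(x)} dt.

    The step h = 1 / ||A||_F^2 is a conservative choice, since the squared
    Frobenius norm bounds the largest eigenvalue of A^T A from above.
    """
    h = 1.0 / sum(a * a for row in A for a in row)
    x = list(x0)
    for _ in range(steps):
        g = grad_smooth(A, y, x)
        x = [soft_threshold(xi - h * gi, h * mu) for xi, gi in zip(x, g)]
    return x


# Hypothetical data: d = n = 2, mu = 1.
A = [[1.0, 0.5], [0.0, 1.0]]
y = [3.0, 0.2]
mu = 1.0
x_inf = lasso_flow(A, y, mu, [0.0, 0.0])
g_inf = grad_smooth(A, y, x_inf)
```

For this data the iteration settles at a point whose second coordinate is zero: the first coordinate satisfies $A_1^\top(Ax - y) + \mu = 0$, while the second satisfies only the inequality $|A_2^\top(Ax - y)| \le \mu$, which is the situation in which a coordinate of the limit flow stays at zero.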

We work in the canonical probability space $(\Omega, \mathcal{F}, P)$ where $\Omega = C([0, 1], \mathbb{R}^d)$ is endowed with its Borel $\sigma$-field $\mathcal{F}$ and its Wiener measure $P$. The canonical process $W_t : w \in \Omega \to w(t)$, $t \in [0, 1]$, is the Wiener process under $P$. The filtration is $\mathcal{F}_t := \sigma(\{W_s : s \le t\}, \mathcal{N})$, $t \in [0, 1]$, where $\mathcal{N}$ is the collection of the $P$-null sets. The diffusion $x^\varepsilon$ of (1) is considered in the canonical probability space $(\Omega, \mathcal{F}, P)$. Its discontinuous drift is

$$b(x) := -\{A^\top(Ax - y) + \mu\,\mathrm{sgn}(x)\}. \tag{7}$$

We denote by $E_{x(0)}$ the mathematical expectation under the probability distribution of the solution $x^\varepsilon$ given that $x^\varepsilon(0) = x(0)$.

I) We define for each $i = 1, \ldots, d$ the Borel measures

$$\gamma^{\varepsilon,1}_i(dt) = \mathbf{1}_{[x^\varepsilon_i(t) \le 0]}\,dt, \qquad \hat\gamma^{\varepsilon,1}_i(t) = \mathbf{1}_{[x^\varepsilon_i(t) \le 0]},$$
$$\gamma^{\varepsilon,2}_i(dt) = \mathbf{1}_{[x^\varepsilon_i(t) > 0]}\,dt, \qquad \hat\gamma^{\varepsilon,2}_i(t) = \mathbf{1}_{[x^\varepsilon_i(t) > 0]}.$$

By extracting a subsequence we have $x^\varepsilon \to x$, where $x$ is the solution of the differential inclusion (4), and

$$(\gamma^{\varepsilon,\eta}_i(dt),\ i = 1, \ldots, d,\ \eta = 1, 2) \to (\gamma^\eta_i(dt),\ i = 1, \ldots, d,\ \eta = 1, 2).$$

The limits $(\gamma^\eta_i(dt),\ i = 1, \ldots, d,\ \eta = 1, 2)$ satisfy

$$\gamma^\eta_i(dt) = \hat\gamma^\eta_i(t)\,dt, \qquad \forall i = 1, \ldots, d,\ \eta = 1, 2,$$
$$\hat\gamma^1_i(t) + \hat\gamma^2_i(t) = 1, \qquad \forall i = 1, \ldots, d,$$
$$\hat\gamma^1_i(t) = 1 \text{ if } x_i(t) < 0, \qquad \hat\gamma^2_i(t) = 1 \text{ if } x_i(t) > 0,$$
$$\hat\gamma^2_i(t) - \hat\gamma^1_i(t) := \mathrm{sgn}(x_i(t)) \text{ if } x_i(t) = 0,$$
$$-\{A_i^\top(Ax(t) - y) - \mu\} \ge 0 \ \text{ and } \ -\{A_i^\top(Ax(t) - y) + \mu\} \le 0 \quad \text{if } x_i(t) = 0. \tag{8}$$

The property (8) tells us that $x_i(t)$ stays at zero when the strength satisfies

$$|A_i^\top(Ax(t) - y)| \le \mu.$$

This phenomenon is known to physicists [2], and we can show it mathematically using a proof similar to the one in [5].

II) Now we fix a deterministic $f$ such that $\int_0^1 \|f(t)\|^2\,dt < +\infty$. We consider the sde

$$dx^\varepsilon(t) = \{f(t) - \mu\,\mathrm{sgn}(x^\varepsilon(t))\}\,dt + \varepsilon\,dw(t), \qquad x(0) \text{ is given}, \tag{9}$$

and its limit $x$ as $\varepsilon \to 0$, which solves

$$dx(t) \in \{f(t) - \mu\,\mathrm{sgn}(x(t))\}\,dt, \qquad x(0) \text{ is given}. \tag{10}$$

We have $dt$ a.e.

$$\frac{dx(t)}{dt} = f(t) - \mu\{\hat\gamma^2(t) - \hat\gamma^1(t)\}.$$

1) If $x_i(t) < 0$, then $\hat\gamma^2_i(t) = 0$, $\hat\gamma^1_i(t) = 1$ and $\frac{dx_i(t)}{dt} = f_i(t) + \mu$.

2) If $x_i(t) > 0$, then $\hat\gamma^1_i(t) = 0$, $\hat\gamma^2_i(t) = 1$ and $\frac{dx_i(t)}{dt} = f_i(t) - \mu$.

3) We have $dt$ a.e. on the set $\{t : x_i(t) = 0\}$ that $-\mu \le f_i(t) \le \mu$ and

$$\frac{dx_i(t)}{dt} = f_i(t) - \mu\{\hat\gamma^2_i(t) - \hat\gamma^1_i(t)\} = \hat\gamma^2_i(t)\{f_i(t) - \mu\} + \hat\gamma^1_i(t)\{f_i(t) + \mu\} = 0.$$

It follows that

$$f_i(t) = \mu\{\hat\gamma^2_i(t) - \hat\gamma^1_i(t)\}, \qquad 1 = \hat\gamma^1_i(t) + \hat\gamma^2_i(t).$$

Hence, if $x_i(t) = 0$, then $dt$ a.e. $\frac{dx_i(t)}{dt} = 0$ and

$$\hat\gamma^2_i(t) = \frac{f_i(t) + \mu}{2\mu}, \qquad \hat\gamma^1_i(t) = \frac{\mu - f_i(t)}{2\mu}.$$

Moreover $\beta^1(t) := f_i(t) + \mu \ge 0$, $\beta^2(t) := f_i(t) - \mu \le 0$, and

$$\frac{dx_i(t)}{dt} = \hat\gamma^1_i(t)\beta^1(t) + \hat\gamma^2_i(t)\beta^2(t).$$

Observe also that $x_i(t) \ne 0$ if and only if $|f_i(t)| > \mu$.

By choosing $f_i$ piecewise constant we obtain trajectories $x$ having the following properties: there exist $0 = \tau_1 < \ldots < \tau_r = 1$ and constants $(\beta_i(k),\ i = 1, \ldots, d,\ k = 1, \ldots, r)$ such that

1) $\frac{dx_i(t)}{dt} = \beta_i(k)$, $\forall t \in [\tau_k, \tau_{k+1})$,

2) $x_i(t) \ne 0$, $\forall t \in [\tau_k, \tau_{k+1})$, or $x_i(t) = 0$, $\forall t \in [\tau_k, \tau_{k+1})$.

We denote by $\mathcal{N}$ the set of the maps $x : [0, 1] \to \mathbb{R}^d$ which satisfy the latter properties. It is a dense subset of $C([0, 1])$.

Let

$$\mathcal{A} = \Big\{v : \Omega \times [0, 1] \to \mathbb{R}^d : v \text{ is progressively measurable and } E_{x(0)}\Big[\int_0^1 \|v(t)\|^2\,dt\Big] < +\infty\Big\},$$

and for $v \in \mathcal{A}$ we denote by $x^{\varepsilon,v}$ the solution of

$$dx^{\varepsilon,v}(t) = \{b(x^{\varepsilon,v}(t)) + v(t)\}\,dt + \varepsilon\,dw(t), \qquad x^{\varepsilon,v}(0) = x(0),$$

where $b(x) = -\{A^\top(Ax - y) + \mu\,\mathrm{sgn}(x)\}$. Let $(v^\varepsilon, \varepsilon \in (0, 1)) \subset \mathcal{A}$ be a family of progressively measurable processes such that

$$\sup_{\varepsilon \in (0,1)} E_{x(0)}\Big[\int_0^1 \|v^\varepsilon(t)\|^2\,dt\Big] < +\infty. \tag{11}$$

We define for each $i = 1, \ldots, d$ the Borel measures

$$\nu^\varepsilon(dv, t)\,dt = \delta_{v^\varepsilon(t)}(dv)\,dt,$$
$$\nu^{\varepsilon,1}_i(dv, t) = \mathbf{1}_{[x^\varepsilon_i(t) \le 0]}\,\delta_{v^\varepsilon(t)}(dv), \qquad \nu^{\varepsilon,2}_i(dv, t) = \mathbf{1}_{[x^\varepsilon_i(t) > 0]}\,\delta_{v^\varepsilon(t)}(dv),$$
$$\gamma^{\varepsilon,1}_i(dt) = \mathbf{1}_{[x^\varepsilon_i(t) \le 0]}\,dt, \qquad \hat\gamma^{\varepsilon,1}_i(t) = \mathbf{1}_{[x^\varepsilon_i(t) \le 0]},$$
$$\gamma^{\varepsilon,2}_i(dt) = \mathbf{1}_{[x^\varepsilon_i(t) > 0]}\,dt, \qquad \hat\gamma^{\varepsilon,2}_i(t) = \mathbf{1}_{[x^\varepsilon_i(t) > 0]}.$$

By extracting a subsequence we have $x^{\varepsilon,v^\varepsilon} \to x$ and $\nu^\varepsilon \to \nu$. Thanks to (11), we have (see [5])

$$E_{x(0)}\Big[\int_{\mathbb{R}^d} \|v\|^2\,\nu(dv, t)\Big] < +\infty.$$

Moreover $x$ is the solution of the differential inclusion

$$dx(t) \in \{f(t) - \mu\,\mathrm{sgn}(x(t))\}\,dt, \qquad \text{where } f(t) = -A^\top(Ax(t) - y) + \int_{\mathbb{R}^d} v\,\nu(dv, t).$$

Hence $x$ is exactly the solution studied in (10).
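The weights obtained in case 3) above can be checked with exact arithmetic. The following is a minimal sketch under the assumption of a scalar coordinate with constant forcing $f_i$ and $|f_i| \le \mu$; the helper names `sliding_weights` and `velocity_at_zero` are hypothetical and only restate the formulas $\hat\gamma^2_i = (f_i + \mu)/(2\mu)$, $\hat\gamma^1_i = (\mu - f_i)/(2\mu)$.

```python
from fractions import Fraction


def sliding_weights(f, mu):
    """Return (gamma1, gamma2) for a coordinate that sits at zero.

    gamma1 weighs the drift f + mu (side x_i <= 0), gamma2 weighs the
    drift f - mu (side x_i > 0). Requires |f| <= mu.
    """
    f, mu = Fraction(f), Fraction(mu)
    if abs(f) > mu:
        raise ValueError("sticking at zero requires |f| <= mu")
    gamma2 = (f + mu) / (2 * mu)
    gamma1 = (mu - f) / (2 * mu)
    return gamma1, gamma2


def velocity_at_zero(f, mu):
    """Averaged velocity gamma2*(f - mu) + gamma1*(f + mu)."""
    f, mu = Fraction(f), Fraction(mu)
    gamma1, gamma2 = sliding_weights(f, mu)
    return gamma2 * (f - mu) + gamma1 * (f + mu)


g1, g2 = sliding_weights(Fraction(1, 2), 1)
v0 = velocity_at_zero(Fraction(1, 2), 1)
```

With $f_i = 1/2$ and $\mu = 1$ the weights are $1/4$ and $3/4$: the two one-sided drifts $f_i + \mu = 3/2$ and $f_i - \mu = -1/2$ are averaged so that the velocity vanishes.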
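The variational representation of [4] tells us that for any bounded measurable map $h : (\Omega, \mathcal{F}, P) \to \mathbb{R}$,

$$H^\varepsilon(x(0)) := -\varepsilon^2 \ln\Big(E_{x(0)}\Big[\exp\Big\{-\frac{h(x^\varepsilon)}{\varepsilon^2}\Big\}\Big]\Big) \tag{12}$$
$$= \inf\Big\{E_{x(0)}\Big[\frac{1}{2}\int_0^1 \|v(t)\|^2\,dt + h(x^{\varepsilon,v})\Big] : v \in \mathcal{A}\Big\}, \tag{13}$$

where $x^{\varepsilon,v}$ denotes the solution of

$$dx^{\varepsilon,v}(t) = \{b(x^{\varepsilon,v}(t)) + v(t)\}\,dt + \varepsilon\,dw(t), \qquad x^{\varepsilon,v}(0) = x(0).$$

The control $v^\varepsilon \in \mathcal{A}$ such that

$$H^\varepsilon(x(0)) \ge E_{x(0)}\Big[\frac{1}{2}\int_0^1 \|v^\varepsilon(t)\|^2\,dt + h(x^{\varepsilon,v^\varepsilon})\Big] - \varepsilon$$

and the diffusion

$$dx^{\varepsilon,v^\varepsilon}(t) = \{b(x^{\varepsilon,v^\varepsilon}(t)) + v^\varepsilon(t)\}\,dt + \varepsilon\,dw(t), \qquad x^{\varepsilon,v^\varepsilon}(0) = x(0)$$

play the central role in the large deviation result of [5], and hence also in our case. We set

$$\bar x^\varepsilon = x^{\varepsilon,v^\varepsilon}, \qquad \bar x = \lim_{\varepsilon \to 0} \bar x^\varepsilon.$$

Observe that Condition 3.2 in [5],

$$\sup_{\varepsilon \in (0,1)} E_{x(0)}\Big[\int_0^1 \|v^\varepsilon(t)\|^2\,dt\Big] < +\infty,$$

holds also in our case.

Large deviation : upper bound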

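We start from the variational representation (13):

$$H^\varepsilon(x(0)) := -\varepsilon^2 \ln\Big\{E_{x(0)}\Big[\exp\Big(-\frac{h(x^\varepsilon)}{\varepsilon^2}\Big)\Big]\Big\} = \inf_{v \in \mathcal{A}} E_{x(0)}\Big[\frac{1}{2}\int_0^1 \|v(t)\|^2\,dt + h(x^{\varepsilon,v})\Big],$$

valid for every bounded measurable map $h$. From the definition of $v^\varepsilon$ we have

$$H^\varepsilon(x(0)) \ge E_{x(0)}\Big[\frac{1}{2}\int_0^1 \|v^\varepsilon(t)\|^2\,dt + h(\bar x^\varepsilon)\Big] - \varepsilon.$$

It follows that

$$\liminf_{\varepsilon \to 0} H^\varepsilon(x(0)) \ge \liminf_{\varepsilon \to 0} E_{x(0)}\Big[\frac{1}{2}\int_{[0,1] \times \mathbb{R}^d} \|v\|^2\,\nu^\varepsilon(dv, t)\,dt + h(\bar x^\varepsilon)\Big] = \liminf_{\varepsilon \to 0} E_{x(0)}\Big[\frac{1}{2}\int_{[0,1] \times \mathbb{R}^d} \|v\|^2\,\nu^\varepsilon(dv, t)\,dt + h(\bar x)\Big].$$

From Fatou's lemma we have

$$\liminf_{\varepsilon \to 0} E_{x(0)}\Big[\int_{[0,1] \times \mathbb{R}^d} \|v\|^2\,\nu^\varepsilon(dv, t)\,dt\Big] \ge E_{x(0)}\Big[\int_0^1 \liminf_{\varepsilon \to 0} \int_{\mathbb{R}^d} \|v\|^2\,\nu^\varepsilon(dv, t)\,dt\Big].$$

Using the inequality

$$\liminf_{n \to \infty} \int f(v)\,\mu_n(dv) \ge \int f(v)\,\mu(dv),$$

valid for all continuous $f \ge 0$ whenever $\mu_n \to \mu$ weakly, we obtain

$$\liminf_{\varepsilon \to 0} \int_{\mathbb{R}^d} \|v\|^2\,\nu^\varepsilon(dv, t) \ge \int_{\mathbb{R}^d} \|v\|^2\,\nu(dv, t).$$

Finally we have

$$\liminf_{\varepsilon \to 0} H^\varepsilon(x(0)) \ge E_{x(0)}\Big[\frac{1}{2}\sum_{i=1}^d \int_{[0,1] \times \mathbb{R}^d} |v_i|^2\,\nu(dv, t)\,dt + h(\bar x)\Big].$$

If $\bar x_i(t) < 0$, then

$$\frac{d\bar x_i}{dt}(t) = b^1_i(\bar x(t)) + \int_{\mathbb{R}^d} v_i\,\nu^1_i(dv, t).$$

If $\bar x_i(t) > 0$, then

$$\frac{d\bar x_i}{dt}(t) = b^2_i(\bar x(t)) + \int_{\mathbb{R}^d} v_i\,\nu^2_i(dv, t).$$

Here $b^1_i(x) := -\{A_i^\top(Ax - y) - \mu\}$ and $b^2_i(x) := -\{A_i^\top(Ax - y) + \mu\}$ are the values of the drift on $\{x_i < 0\}$ and $\{x_i > 0\}$, as in (8). We also recall that in these cases $\nu^\eta_i(dv, t)$ is a probability measure for $\eta = 1, 2$. It follows from the Jensen inequality that

$$\int_{\mathbb{R}^d} |v_i|^2\,\nu^\eta_i(dv, t) \ge \Big(\int_{\mathbb{R}^d} v_i\,\nu^\eta_i(dv, t)\Big)^2 \ge \Big|\frac{d\bar x_i(t)}{dt} - b^\eta_i(\bar x(t))\Big|^2 := L^\eta_i\Big(\bar x(t), \frac{d\bar x_i(t)}{dt}\Big).$$

If $\bar x_i(t) = 0$, then $0 < \hat\gamma^1_i(t) < 1$, and

$$\nu(dv, t) = \hat\gamma^1_i(t)\,\frac{\nu^1_i(dv, t)}{\hat\gamma^1_i(t)} + \hat\gamma^2_i(t)\,\frac{\nu^2_i(dv, t)}{\hat\gamma^2_i(t)}.$$

The measure $\nu^\eta_i(dv, t)/\hat\gamma^\eta_i(t)$ is a probability for each $\eta = 1, 2$. Again from the Jensen inequality we have

$$\int_{\mathbb{R}^d} |v_i|^2\,\frac{\nu^\eta_i(dv, t)}{\hat\gamma^\eta_i(t)} \ge \Big(\int_{\mathbb{R}^d} v_i\,\frac{\nu^\eta_i(dv, t)}{\hat\gamma^\eta_i(t)}\Big)^2.$$

We recall that if $\bar x_i(t) = 0$, then

$$\beta^1_i(t) = b^1_i(\bar x(t)) + \int_{\mathbb{R}^d} v_i\,\frac{\nu^1_i(dv, t)}{\hat\gamma^1_i(t)} \ge 0, \qquad \beta^2_i(t) = b^2_i(\bar x(t)) + \int_{\mathbb{R}^d} v_i\,\frac{\nu^2_i(dv, t)}{\hat\gamma^2_i(t)} \le 0,$$

and if we denote $\beta_i = \frac{d\bar x_i}{dt}(t)$, then

$$\hat\gamma^1_i(t)\beta^1_i(t) + \hat\gamma^2_i(t)\beta^2_i(t) = \beta_i.$$

It follows that

$$\int_{\mathbb{R}^d} |v_i|^2\,\nu(dv, t) \ge \hat\gamma^1_i(t)\,|\beta^1_i(t) - b^1_i(\bar x(t))|^2 + \hat\gamma^2_i(t)\,|\beta^2_i(t) - b^2_i(\bar x(t))|^2$$
$$\ge \inf\{p^1|\beta^1_i - b^1_i(\bar x(t))|^2 + p^2|\beta^2_i - b^2_i(\bar x(t))|^2\} := L^0_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big).$$

The infimum is taken under the constraint

$$p^1, p^2 > 0, \qquad p^1 + p^2 = 1, \qquad p^1\beta^1_i + p^2\beta^2_i = \frac{d\bar x_i}{dt}(t) := \beta_i.$$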
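The infimum defining $L^0_i$ can be explored numerically at velocity $\beta_i = 0$, the case of a coordinate resting at zero. The sketch below is only an illustration under stated assumptions: `g` is a hypothetical scalar standing for $A_i^\top(Ax - y)$, so that $b^1_i = -(g - \mu)$ and $b^2_i = -(g + \mu)$; `L0_closed` encodes the squared distance from $\beta_i$ to the interval $[b^2_i, b^1_i]$, which is the closed form stated in the Proposition later in the paper; `L0_grid` is a brute-force search over $(p^1, \beta^1_i)$ with $\beta^1_i \ge 0$ and $\beta^2_i \le 0$ forced by the constraint. The grid sizes are arbitrary.

```python
def drifts(g, mu):
    """Return (b1, b2): the drift on {x_i < 0} and on {x_i > 0}."""
    return -(g - mu), -(g + mu)


def L0_closed(g, mu, beta):
    """Squared distance from beta to the interval [b2, b1]."""
    b1, b2 = drifts(g, mu)
    if beta <= b2:
        return (beta - b2) ** 2
    if beta >= b1:
        return (beta - b1) ** 2
    return 0.0


def L0_grid(g, mu, n=200, beta1_max=5.0):
    """Brute-force the constrained infimum at beta = 0.

    p ranges over (0, 1), beta1 over [0, beta1_max], and beta2 is fixed by
    p*beta1 + (1 - p)*beta2 = 0, which makes beta2 <= 0 automatically.
    """
    b1, b2 = drifts(g, mu)
    best = float("inf")
    for a in range(1, n):
        p = a / n
        for k in range(n + 1):
            beta1 = beta1_max * k / n
            beta2 = -p * beta1 / (1.0 - p)
            cost = p * (beta1 - b1) ** 2 + (1.0 - p) * (beta2 - b2) ** 2
            best = min(best, cost)
    return best


inside = (L0_closed(0.5, 1.0, 0.0), L0_grid(0.5, 1.0))    # b2 < 0 < b1
outside = (L0_closed(-1.5, 1.0, 0.0), L0_grid(-1.5, 1.0))  # 0 < b2 < b1
```

When $b^2_i < 0 < b^1_i$ both values vanish: staying at zero costs nothing. When $0 \le b^2_i$ the grid value stays slightly above the closed form, because the infimum is only approached as $p^1 \to 0$ and is not attained for $p^1 > 0$.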

We define

$$\tilde L_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big) = \Big|\frac{d\bar x_i}{dt}(t) - b^1_i(\bar x(t))\Big|^2 := L^1_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big), \quad \text{if } \bar x_i(t) < 0,$$
$$\tilde L_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big) = \Big|\frac{d\bar x_i}{dt}(t) - b^2_i(\bar x(t))\Big|^2 := L^2_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big), \quad \text{if } \bar x_i(t) > 0,$$
$$\tilde L_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big) := L^0_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big), \quad \text{if } \bar x_i(t) = 0.$$

It follows for each $i$ that

$$\int_{\mathbb{R}^d} |v_i|^2\,\nu(dv, t) \ge \tilde L_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big),$$

and

$$\liminf_{\varepsilon \to 0} H^\varepsilon(x(0)) \ge E_{x(0)}\Big[\frac{1}{2}\sum_{i=1}^d \int_{[0,1]} \tilde L_i\Big(\bar x(t), \frac{d\bar x_i}{dt}(t)\Big)\,dt + h(\bar x)\Big] \ge \inf\{I_{x(0)}(\varphi) + h(\varphi) : \varphi \in C_{x(0)}([0, 1])\},$$

where the rate function is

$$I_{x(0)}(\varphi) := \frac{1}{2}\sum_{i=1}^d \int_0^1 \tilde L_i\Big(\varphi(t), \frac{d\varphi_i}{dt}(t)\Big)\,dt. \tag{14}$$

The infimum is equal to $+\infty$ if the latter set is empty. Finally we have for any sequence $\varepsilon$ such that $\bar x^\varepsilon \to \bar x$ and $(\nu^\varepsilon, \nu^{\varepsilon,\eta}_i, \gamma^{\varepsilon,\eta}_i,\ i = 1, \ldots, d,\ \eta) \to (\nu, \nu^\eta_i, \gamma^\eta_i,\ i = 1, \ldots, d,\ \eta)$ that

$$\liminf_{\varepsilon \to 0} H^\varepsilon(x(0)) \ge \inf_{\varphi \in C([0,1])}\{I_{x(0)}(\varphi) + h(\varphi)\}.$$

Using the same argument as in Boué-Dupuis-Ellis [5] we can show that

$$\liminf_{\varepsilon \to 0} H^\varepsilon(x(0)) \ge \inf_{\varphi \in C([0,1])}\{I_{x(0)}(\varphi) + h(\varphi)\}$$

along the whole family $\varepsilon \in (0, 1)$.

$L^\eta_i$, $\eta = 0, 1, 2$

We define $L^0_i : \mathbb{R}^d \times \mathbb{R} \to [0, +\infty)$ by

$$L^0_i(x, \beta_i) = \inf\{p^1|\beta^1_i - b^1_i(x)|^2 + p^2|\beta^2_i - b^2_i(x)|^2\}. \tag{15}$$

The infimum is taken under the constraint

$$p^1, p^2 > 0, \qquad p^1 + p^2 = 1, \qquad \beta^1_i \ge 0, \quad \beta^2_i \le 0, \qquad p^1\beta^1_i + p^2\beta^2_i = \beta_i.$$

Proposition.

1) If $(x, \beta_i) \in \mathbb{R}^d \times \mathbb{R}$ are such that $b^2_i(x) < \beta_i < b^1_i(x)$, then $L^0_i(x, \beta_i) = 0$.

2) If $\beta_i \le b^2_i(x)$, then $L^0_i(x, \beta_i) = |\beta_i - b^2_i(x)|^2$.

3) If $\beta_i \ge b^1_i(x)$, then $L^0_i(x, \beta_i) = |\beta_i - b^1_i(x)|^2$.

Proof.

Consider $p \in (0, 1)$ and any couple $\beta^1_i, \beta^2_i$ such that $p\beta^1_i + (1 - p)\beta^2_i = \beta$. If the infimum (15) is such that

$$L^0_i(x, \beta) = p|\beta^1_i - b^1_i(x)|^2 + (1 - p)|\beta^2_i - b^2_i(x)|^2$$

for some $p \in (0, 1)$, $\beta^1_i \ge 0$, $\beta^2_i \le 0$, then

$$|\beta^1_i - b^1_i(x)| = |\beta^2_i - b^2_i(x)| = |p\,b^1_i(x) + (1 - p)\,b^2_i(x) - \beta|.$$

Hence

$$L^0_i(x, \beta) = \inf\{|p\,b^1_i(x) + (1 - p)\,b^2_i(x) - \beta|^2\},$$

where the infimum is also under the same constraint as in (15). This finishes the proof.

Corollary.

1) For each $i = 1, \ldots, d$, $\eta = 0, 1, 2$, the maps $(x, \beta) \in \mathbb{R}^d \times \mathbb{R} \to L^\eta_i(x, \beta)$ are continuous.

2) If $\beta_i \le 0$, then $L^0_i(x, \beta_i) \le L^2_i(x, \beta_i)$.

3) If $\beta_i \ge 0$, then $L^0_i(x, \beta_i) \le L^1_i(x, \beta_i)$.

4) For each $i = 1, \ldots, d$ and for $x$ fixed, the map $\beta_i \to L^\eta_i(x, \beta_i)$ is convex.

Large deviation : lower bound

We first show that

$$\limsup_{\varepsilon \to 0} H^\varepsilon(x(0)) \le I_{x(0)}(\varphi) + h(\varphi)$$

for all $\varphi \in \mathcal{N}$. The map $\varphi$ is defined by $(t_k, \beta(k)) : k = 1, \ldots, r$ such that

$$\frac{d\varphi_i}{dt}(t) = \beta_i(k), \quad t \in [t_k, t_{k+1}),$$
$$\varphi_i(t) \ne 0, \quad t \in [t_k, t_{k+1}), \qquad \text{or} \qquad \varphi_i(t) = 0, \quad t \in [t_k, t_{k+1}).$$

If $\varphi_i(t) = 0$ on $[t_k, t_{k+1})$, then $\beta_i(k) := \frac{d\varphi_i}{dt}(t) = 0$ on $[t_k, t_{k+1})$. We consider the controls

$$v^1(x, t) = \beta^1(k) - b^1(x), \qquad v^2(x, t) = \beta^2(k) - b^2(x)$$

for $t \in [t_k, t_{k+1})$. Here $\beta^2_i(k) = -\mu$, $\beta^1_i(k) = \mu$ if $\beta_i(k) = 0$. Observe that $0 = \frac{\beta^1_i(k) + \beta^2_i(k)}{2}$. If $\beta_i(k) \ne 0$, then $\beta^1_i(k) = \beta^2_i(k) = \beta_i(k)$. Now define for $i = 1, \ldots, d$

$$v_i(x, t) = v^1_i(x, t)\,\mathbf{1}_{[x_i \le 0]} + v^2_i(x, t)\,\mathbf{1}_{[x_i > 0]},$$

and let $v(x, t)$ denote the column vector with components $v_i(x, t)$. The controlled process satisfies

$$dx^{\varepsilon,\varphi}_i(t) = b_i(x^{\varepsilon,\varphi}(t))\,dt + v_i(x^{\varepsilon,\varphi}(t), t)\,dt + \varepsilon\,dw_i(t) = \{\beta^2_i(k)\,\mathbf{1}_{[x^{\varepsilon,\varphi}_i(t) > 0]} + \beta^1_i(k)\,\mathbf{1}_{[x^{\varepsilon,\varphi}_i(t) \le 0]}\}\,dt + \varepsilon\,dw_i(t). \tag{16}$$

Let us define

$$f_i(k) = \beta_i(k) + \mu, \quad t \in [t_k, t_{k+1}),\ \varphi_i(t_k) > 0,$$
$$f_i(k) = \beta_i(k) - \mu, \quad t \in [t_k, t_{k+1}),\ \varphi_i(t_k) < 0,$$
$$f_i(k) = 0, \quad t \in [t_k, t_{k+1}),\ \varphi_i(t_k) = 0.$$

Then we can rewrite (16) as

$$dx^{\varepsilon,\varphi}_i(t) = \{f_i(t) - \mu\,\mathrm{sgn}(x^{\varepsilon,\varphi}_i(t))\}\,dt + \varepsilon\,dw_i(t),$$

where $f_i(t) = f_i(k)$, $t \in [t_k, t_{k+1})$.
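It follows from (9) that $x^{\varepsilon,\varphi}$ converges to $\varphi$ as $\varepsilon \to 0$. From the variational representation

$$H^\varepsilon(x(0)) = \inf_{v \in \mathcal{A}} E_{x(0)}\Big[\frac{1}{2}\int_0^1 \|v(t)\|^2\,dt + h(x^{\varepsilon,v})\Big]$$

we have

$$\limsup_{\varepsilon \to 0} H^\varepsilon(x(0)) \le \limsup_{\varepsilon \to 0} E_{x(0)}\Big[\frac{1}{2}\int_0^1 \Big\|\frac{d\varphi}{dt}(t) - b(x^{\varepsilon,\varphi}(t))\Big\|^2\,dt + h(x^{\varepsilon,\varphi})\Big] = \frac{1}{2}\int_0^1 \Big\|\frac{d\varphi}{dt}(t) - b(\varphi(t))\Big\|^2\,dt + h(\varphi),$$

where

$$\int_0^1 \Big\|\frac{d\varphi}{dt}(t) - b(\varphi(t))\Big\|^2\,dt = \sum_{i=1}^d \int_0^1 \Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2\,dt$$

and

$$b_i(\varphi(t)) = b^1_i(\varphi(t)) \text{ if } \varphi_i(t) < 0, \qquad b_i(\varphi(t)) = b^2_i(\varphi(t)) \text{ if } \varphi_i(t) > 0.$$

In these cases

$$\Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2 = \tilde L_i\Big(\varphi(t), \frac{d\varphi_i}{dt}(t)\Big).$$

Moreover, on each interval $[t_k, t_{k+1})$ such that $\beta_i(k) = 0$ we can show that

$$\Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2 := |b^1_i(\varphi(t))|^2 \quad \text{if } b^1_i(\varphi(t)) \le 0,$$
$$\Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2 := |b^2_i(\varphi(t))|^2 \quad \text{if } b^2_i(\varphi(t)) \ge 0,$$
$$\Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2 := 0 \quad \text{if } b^2_i(\varphi(t)) < 0 < b^1_i(\varphi(t)).$$

More precisely, if $\beta_i(k) = 0$ then

$$\Big|\frac{d\varphi_i}{dt}(t) - b_i(\varphi(t))\Big|^2 = L^0_i\Big(\varphi(t), \frac{d\varphi_i}{dt}(t)\Big).$$

Finally we have

$$\int_0^1 \Big\|\frac{d\varphi}{dt}(t) - b(\varphi(t))\Big\|^2\,dt = \int_0^1 \tilde L\Big(\varphi(t), \frac{d\varphi}{dt}(t)\Big)\,dt,$$

and then for all $\varphi \in \mathcal{N}$

$$\limsup_{\varepsilon \to 0} H^\varepsilon(x(0)) \le I_{x(0)}(\varphi) + h(\varphi).$$

To finish the proof of the large deviation lower bound we need the following lemmas. The proof is the same as in the book of Dupuis and Ellis [13]. For the sake of completeness we recall the proof.

Lemma 1.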

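Let $\psi \in C_{x(0)}([0, 1])$ be such that $\int_0^1 \tilde L(\psi(t), \psi'(t))\,dt < +\infty$. For each $\delta > 0$, there exist $\vartheta > 0$ and $\xi \in C_{x(0)}([0, 1])$ such that

$$\sup_{t \in [0,1]} |\xi(t) - \psi(t)| \le \delta, \qquad \int_0^1 \tilde L\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt \le \int_0^1 \tilde L\Big(\psi(t), \frac{d\psi}{dt}(t)\Big)\,dt + \delta$$

and

$$\sup_{t \in [0,1]} \Big\|\frac{d\xi}{dt}(t)\Big\| \le \vartheta.$$

Proof.

Let $c, \lambda \in (0, 1)$, and

$$D_\lambda = \Big\{t \in [0, 1] : \Big\|\frac{d\psi}{dt}(t)\Big\| \ge \frac{1}{\lambda}\Big\}, \qquad E_\lambda = \Big\{t \in [0, 1] : \Big\|\frac{d\psi}{dt}(t)\Big\| < \frac{1}{\lambda}\Big\}.$$

We construct the time-rescaling map $S_\lambda : [0, 1] \to [0, +\infty)$ as follows:

$$S_\lambda(0) = 0, \qquad \frac{dS_\lambda}{dt}(t) = \frac{\|\frac{d\psi}{dt}(t)\|}{c(1 - \lambda)} \text{ if } t \in D_\lambda, \qquad \frac{dS_\lambda}{dt}(t) = \frac{1}{1 - \lambda} \text{ if } t \in E_\lambda.$$

Clearly the map $S_\lambda : [0, 1] \to [0, S_\lambda(1)]$ is one to one with $S_\lambda(1) > 1$. We denote its inverse by $T_\lambda : [0, S_\lambda(1)] \to [0, 1]$. For $s \in [0, S_\lambda(1)]$ we define

$$\xi_\lambda(s) = \psi(T_\lambda(s)).$$

On the one hand

$$\Big\|\frac{d\xi_\lambda}{dt}(t)\Big\| \le \max\Big(c(1 - \lambda), \frac{1}{\lambda}\Big).$$

On the other hand the hypothesis

$$\int_0^1 \tilde L\Big(\psi(t), \frac{d\psi}{dt}(t)\Big)\,dt := \sum_{i=1}^d \int_0^1 \tilde L_i\Big(\psi(t), \frac{d\psi_i}{dt}(t)\Big)\,dt < +\infty$$

implies that for each $i$

$$\int_0^1 \Big|\frac{d\psi_i}{dt}(t) - b^\eta_i(\psi(t))\Big|^2\,dt < +\infty, \qquad \text{for } \eta = 1, 2.$$

From the triangle inequality

$$\int_0^1 \Big|\frac{d\psi_i}{dt}(t)\Big|\,dt \le \int_0^1 \Big|\frac{d\psi_i}{dt}(t) - b^\eta_i(\psi(t))\Big|\,dt + \int_0^1 |b^\eta_i(\psi(t))|\,dt,$$

applied for each $i$, we derive that

$$\int_0^1 \Big\|\frac{d\psi}{dt}(t)\Big\|\,dt < +\infty.$$

Now the rest of the proof is the same as in Dupuis et al. We prove

$$\sup_{t \in [0,1]} \|\xi_\lambda(t) - \psi(t)\| \to 0 \qquad \text{and} \qquad \int \tilde L\Big(\xi_\lambda(t), \frac{d\xi_\lambda}{dt}(t)\Big)\,dt \to \int_0^1 \tilde L\Big(\psi(t), \frac{d\psi}{dt}(t)\Big)\,dt$$

as $\lambda \to 0$, which completes the proof of Lemma 1.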
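The time change used in the proof of Lemma 1 can be made concrete on a piecewise-linear path. The sketch below is a hedged illustration, assuming the threshold $1/\lambda$ for the set $D_\lambda$ and a path described by hypothetical `(duration, speed)` segments; `rescale` is not a function from the paper. Each segment is stretched by the rate $dS_\lambda/dt$, so its speed is divided by the same rate.

```python
def rescale(segments, c, lam):
    """Apply the time change S_lambda to a piecewise-linear path.

    segments: list of (duration, speed) pairs whose durations sum to 1.
    Returns the list of (new_duration, new_speed) pairs of the path
    s -> psi(T_lambda(s)), where T_lambda is the inverse of S_lambda.
    """
    rescaled = []
    for duration, speed in segments:
        if speed >= 1.0 / lam:
            # Fast segment (t in D_lambda): slow it down to speed c*(1 - lam).
            rate = speed / (c * (1.0 - lam))
        else:
            # Slow segment (t in E_lambda): uniform stretch by 1/(1 - lam).
            rate = 1.0 / (1.0 - lam)
        rescaled.append((duration * rate, speed / rate))
    return rescaled


c, lam = 0.5, 0.1
path = [(0.5, 1.0), (0.25, 40.0), (0.25, 0.0)]  # hypothetical speeds
new_path = rescale(path, c, lam)

total_time = sum(d for d, _ in new_path)           # plays the role of S_lambda(1)
max_speed = max(s for _, s in new_path)
length_before = sum(d * s for d, s in path)
length_after = sum(d * s for d, s in new_path)
```

The rescaled path covers the same distance on each segment, its speed never exceeds $\max(c(1 - \lambda), 1/\lambda)$, and the total time exceeds 1, matching the properties of $S_\lambda$ and $\xi_\lambda$ used in the proof.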

Lemma 2.

Let $\xi \in C_{x(0)}([0, 1])$ be such that $\int_0^1 \tilde L(\xi(t), \frac{d\xi}{dt}(t))\,dt < +\infty$ and

$$\sup_{t \in [0,1]} \Big\|\frac{d\xi}{dt}(t)\Big\| \le \vartheta.$$

For any $\delta > 0$ there exist $\sigma > 0$ and $\varphi^\sigma \in \mathcal{N}$ such that

$$\sup_{t \in [0,1]} |\xi(t) - \varphi^\sigma(t)| \le \delta \qquad \text{and} \qquad \int_0^1 \tilde L\Big(\varphi^\sigma(t), \frac{d\varphi^\sigma}{dt}(t)\Big)\,dt \le \int_0^1 \tilde L\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt + 2\delta.$$

Proof.

We define for each $i$

$$G^0_i = \{t \in [0, 1] : \xi_i(t) = 0\}, \qquad G^1_i = \{t \in [0, 1] : \xi_i(t) < 0\}, \qquad G^2_i = \{t \in [0, 1] : \xi_i(t) > 0\}.$$

Following Dupuis et al., for all $\sigma > 0$ there exists $B_i = \bigcup_{j=1}^{J_i} [c_j(i), d_j(i)]$ such that $\xi_i(c_j(i)) = \xi_i(d_j(i)) = 0$, $d_j(i) - c_j(i) \le \sigma$ and $d_j(i) \le c_{j+1}(i)$. We suppose that $c_1(i) = 0$ and $d_{J_i}(i) = 1$. We choose finitely many numbers $(e^k_j(i),\ k = 1, \ldots, K_j(i))$ such that

$$d_j(i) = e^1_j(i) < \ldots < e^{K_j(i)}_j(i) = c_{j+1}(i) \qquad \text{and} \qquad e^{k+1}_j(i) - e^k_j(i) < \sigma.$$

For each $i$ we define the function $\varphi^\sigma_i$ as the piecewise linear interpolation of $\xi_i$ with interpolation points

$$\{c_j(i), d_j(i), e^k_j(i) : j = 1, \ldots, J_i,\ k = 1, \ldots, K_j(i)\}.$$

For each $\delta > 0$ there exists $\sigma_1 > 0$ such that $0 \le \sigma \le \sigma_1$ implies that

$$\sup_{t \in [0,1]} |\xi_i(t) - \varphi^\sigma_i(t)| \le \delta.$$

Clearly $\varphi^\sigma_i(t) = 0$ for all $t \in [c_j(i), d_j(i)]$, and $\varphi^\sigma_i(t) \ne 0$ for all $t \in (e^k_j(i), e^{k+1}_j(i))$. We define

$$a^\eta_j(i) = \int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^\eta_i}(t)\,dt, \qquad \beta^\eta_j(i) = \frac{1}{a^\eta_j(i)} \int_{c_j(i)}^{d_j(i)} \frac{d\xi_i}{ds}(s)\,\mathbf{1}_{G^\eta_i}(s)\,ds.$$

Clearly

$$\sum_{\eta = 0, 1, 2} a^\eta_j(i) = d_j(i) - c_j(i).$$

Since $\xi_i(t) = 0$ implies $\frac{d\xi_i}{dt}(t) = 0$ a.e., we have $\beta^0_j(i) = 0$. Since $G^1_i$, $G^2_i$ are open, each can be written as a countable union of open intervals at each endpoint of which $\xi_i = 0$. Hence $\beta^\eta_j(i) = 0$ for $\eta = 1, 2$. It follows that

$$L^1_i(\xi(c_j(i)), 0) \ge L^0_i(\xi(c_j(i)), 0), \qquad L^2_i(\xi(c_j(i)), 0) \ge L^0_i(\xi(c_j(i)), 0).$$

We have for each $i, j$ that

$$\int_{c_j(i)}^{d_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt = \sum_{\eta = 0, 1, 2} \int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^\eta_i}(t)\,L^\eta_i\Big(\xi(t), \frac{d\xi_i}{dt}(t)\Big)\,dt.$$

From the continuity of $L^\eta_i$ and $\xi$ and the fact that $\sup_{t \in [0,1]} \|\frac{d\xi}{dt}(t)\| \le \vartheta$, there exists $\sigma_2 \le \sigma_1$ such that $\sigma \le \sigma_2$ implies

$$\sup_{t \in [c_j(i), d_j(i)]} \Big|L^\eta_i\Big(\xi(t), \frac{d\xi_i}{dt}(t)\Big) - L^\eta_i\Big(\xi(c_j(i)), \frac{d\xi_i}{dt}(t)\Big)\Big| \le \delta.$$

It follows that

$$\int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^\eta_i}(t)\,L^\eta_i\Big(\xi(t), \frac{d\xi_i}{dt}(t)\Big)\,dt \ge \int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^\eta_i}(t)\,L^\eta_i\Big(\xi(c_j(i)), \frac{d\xi_i}{dt}(t)\Big)\,dt - (d_j(i) - c_j(i))\,\delta.$$

From the convexity of the function $\beta_i \to L^\eta_i(x, \beta_i)$ for each fixed $x$ and each $\eta = 1, 2$, we have

$$\frac{1}{a^\eta_j(i)} \int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^\eta_i}(t)\,L^\eta_i\Big(\xi(c_j(i)), \frac{d\xi_i}{dt}(t)\Big)\,dt \ge L^\eta_i(\xi(c_j(i)), \beta^\eta_j(i)) = L^\eta_i(\xi(c_j(i)), 0) \ge L^0_i(\xi(c_j(i)), 0).$$

For $\eta = 0$, we have

$$\int_{c_j(i)}^{d_j(i)} \mathbf{1}_{G^0_i}(t)\,L^0_i(\xi(c_j(i)), 0)\,dt = a^0_j(i)\,L^0_i(\xi(c_j(i)), 0).$$

Finally, we have

$$\int_{c_j(i)}^{d_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt \ge (d_j(i) - c_j(i))\,L^0_i(\xi(c_j(i)), 0) - \delta\,(d_j(i) - c_j(i)). \tag{17}$$

Observe that $\xi_i(c_j(i)) = \varphi^\sigma_i(c_j(i))$, but there is no guarantee that $\xi_l(c_j(i)) = \varphi^\sigma_l(c_j(i))$ for $l \ne i$. However there exist $\alpha_l \le c_j(i) \le \beta_l$ such that

$$\xi_l(\alpha_l) = \varphi^\sigma_l(\alpha_l), \qquad \xi_l(\beta_l) = \varphi^\sigma_l(\beta_l), \qquad \beta_l - \alpha_l \le \sigma.$$

More precisely $\alpha_l = c_{j_l}(l)$, $\beta_l = d_{j_l}(l)$ for some $j_l$, or $\alpha_l = e^{k_l}_{j_l}(l)$, $\beta_l = e^{k_l + 1}_{j_l}(l)$. It follows that for small $\sigma$ we have

$$\xi_l(c_j(i)) \approx \varphi^\sigma_l(c_j(i)) \approx \varphi^\sigma_l(t), \qquad \forall t \in [c_j(i), d_j(i)].$$

Then for small $\sigma$ we have

$$L^0_i(\xi(c_j(i)), 0) \ge L^0_i(\varphi^\sigma(t), 0) - \delta, \qquad \forall t \in [c_j(i), d_j(i)].$$

Now the inequality (17) becomes

$$\int_{c_j(i)}^{d_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt \ge \int_{c_j(i)}^{d_j(i)} L^0_i\Big(\varphi^\sigma(t), \frac{d\varphi^\sigma_i}{dt}(t)\Big)\,dt - 2\delta\,(d_j(i) - c_j(i)) = \int_{c_j(i)}^{d_j(i)} \tilde L_i\Big(\varphi^\sigma(t), \frac{d\varphi^\sigma_i}{dt}(t)\Big)\,dt - 2\delta\,(d_j(i) - c_j(i)).$$

Similarly,

$$\int_{e^k_j(i)}^{e^{k+1}_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt \ge \int_{e^k_j(i)}^{e^{k+1}_j(i)} \tilde L_i\Big(\varphi^\sigma(t), \frac{d\varphi^\sigma_i}{dt}(t)\Big)\,dt - 2\delta\,(e^{k+1}_j(i) - e^k_j(i)).$$

Finally

$$\int_0^1 \tilde L\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt = \sum_{i=1}^d \int_0^1 \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt$$
$$= \sum_{i=1}^d \Big\{\sum_{j=1}^{J_i} \Big(\int_{c_j(i)}^{d_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt + \sum_{k=1}^{K_j(i)} \int_{e^k_j(i)}^{e^{k+1}_j(i)} \tilde L_i\Big(\xi(t), \frac{d\xi}{dt}(t)\Big)\,dt\Big)\Big\} \ge \int_0^1 \tilde L\Big(\varphi^\sigma(t), \frac{d\varphi^\sigma}{dt}(t)\Big)\,dt - 2\delta.$$

Now back to the upper bound.
Let $\tau > 0$ and $\psi \in C_{x(0)}([0, 1])$ be such that

$$I_{x(0)}(\psi) + h(\psi) \le \inf\{I_{x(0)}(\varphi) + h(\varphi) : \varphi \in C([0, 1])\} + \tau.$$

Combining Lemmas 1 and 2 with the bound already proved for the elements of $\mathcal{N}$, we then obtain

$$\limsup_{\varepsilon \to 0} H^\varepsilon(x(0)) \le \inf\{I_{x(0)}(\varphi) + h(\varphi) : \varphi \in C([0, 1])\} + 2\tau$$

for any $\tau > 0$. It follows for all such $h$ that

$$\lim_{\varepsilon \to 0} H^\varepsilon(x(0)) = \inf\{I_{x(0)}(\varphi) + h(\varphi) : \varphi \in C([0, 1])\}. \tag{18}$$

In general, a family of probability measures $(P^\varepsilon, \varepsilon > 0)$ on a metric space $(X, \Delta)$ satisfies the large deviation principle (LDP) with the rate function $I$ if the following conditions are satisfied:

a) $I : X \to [0, +\infty)$ is lower semi-continuous,

b) for each $r > 0$, the set $\{x \in X : I(x) \le r\}$ is precompact,

c) for any $R > 0$, there exists a compact set $K$ such that for any $\delta > 0$, we have for small $\varepsilon$

$$P^\varepsilon(B^c_\delta(K)) \le \exp\Big(-\frac{R}{\varepsilon^2}\Big),$$

d) $\lim_{\delta \to 0} \liminf_{\varepsilon \to 0} \varepsilon^2 \ln\{P^\varepsilon(B_\delta(x))\} = \lim_{\delta \to 0} \limsup_{\varepsilon \to 0} \varepsilon^2 \ln\{P^\varepsilon(B_\delta(x))\} = -I(x)$.

Here $B_\delta(K)$ denotes the $\delta$-neighborhood of a compact set $K$, and $B^c_\delta(K)$ its complement.

Let $P^\varepsilon_{x(0)}$ be the probability distribution of the solution $x^\varepsilon$ of (1) starting from $x(0)$. It is known that the limit (18) is equivalent to saying that the family $(P^\varepsilon_{x(0)} : \varepsilon > 0)$ satisfies the LDP with the rate function $I_{x(0)}$ defined in (14). See [11] for a general theory of the large deviation principle.