Wavelet thresholding estimation in a Poissonian interactions model with application to genomic data
aa r X i v : . [ m a t h . S T ] J u l Wavelet thresholding estimation in a Poissonianinteractions model with application to genomic data
Laure
Sansonnet
Laboratoire de Mathématiques, CNRS UMR 8628Université Paris-Sud 1115 rue Georges Clémenceau, 91405 Orsay Cedex, France
Abstract:
This paper deals with the study of dependencies between twogiven events modeled by point processes. In particular, we focus on thecontext of DNA to detect favored or avoided distances between two givenmotifs along a genome suggesting possible interactions at a molecular level.For this, we naturally introduce a so-called reproduction function h thatallows to quantify the favored positions of the motifs and which is consideredas the intensity of a Poisson process. Our first interest is the estimation of thisfunction h assumed to be well localized. The estimator ˜ h based on randomthresholds achieves an oracle inequality. Then, minimax properties of ˜ h onBesov balls B s , ∞ ( R ) are established. Some simulations are provided, allowingthe calibration of tuning parameters from a numerical point of view andproving the good practical behavior of our procedure. Finally, our method isapplied to the analysis of the influence between gene occurrences along the E. coli genome and occurrences of a motif known to be part of the majorpromoter sites for this bacterium.
Keywords:
Adaptive estimation, interactions model, oracle inequalities,Poisson process, thresholding rule, U -statistics, wavelets. MSC2010:
Primary 60G55, 62G05; secondary 62G20, 62G30.
The goal of the present paper is to study the dependence between two given events modeled by pointprocesses. We propose a general statistical approach to analyze any type of interaction, for instance,interactions between neurons in neurosciences or the comprehension of bankruptcies by contagion ineconomics. In particular, we focus on a model to study favored or avoided distances between patternson a strand of DNA, which is an important task in genomics.We are first interested in the modeling of the influence between two given motifs, a motif being de-fined as a sequence of letters in the alphabet {a,c,g,t} . This alphabet represents the four nucleotidesbases of DNA: adenine, cytosine, guanine and thymine. Our aim is to model the dependence betweenmotifs in order to identify favored or avoided distances between them, suggesting possible interactionsat a molecular level. Because genomes are long (some 1 million bases) and motifs of interest are short(3 up to 20 bases), motif occurrences can be viewed as points along genomes. For convenience, wework in a continuous framework and then, the occurrences of a motif along a genome are modeled by apoint process lying in the interval [0; T ] , where T is the normalized length of the studied genome andwill drive the asymptotic. We add that our model focuses on only one direction of interactions, thatis to say we investigate the way a first given motif influences a second one. To study the influence ofthe second motif on the first one, we just invert their roles in the model.1 Laure
Sansonnet
We observe the occurrences of both given motifs (we presuppose interactions between them) andwe assume that their distributions are as follows. The locations of the first motif are modeled bya n -sample of uniform random variables on [0; T ] , denoted U , . . . , U n and named parents. As theparameter T , the number n of parents will also drive the asymptotic. Then, each U i gives birthindependently to a Poisson process N i with intensity the function t h ( t − U i ) with respect to theLebesgue measure on R (for instance, see [17]), which models the locations of the second motif. Weconsequently observe the aggregated process N = n X i =1 N i with intensity the function t n X i =1 h ( t − U i ) (1.1)and the points of the process N are named children. But in this model, for any child we do not observewhich parent gives birth to him. The unknown function h is so-called reproduction function. Our goalis then to estimate h with the observations of the U i ’s and realizations of N .Such a modeling of locations of the first motif is linked to the work on the distribution of wordsin DNA sequences of Schbath and coauthors (for instance, see [31], [24] and [29]). Indeed, the firstmotif of interest is a rare word and is modeled by a homogeneous Poisson process N on [0; T ] . Thus,conditionally to the event "the number of points falling into [0; T ] is n ", the points of the process N (i.e. the parents) obey the same law as a n -sample of uniform random variables on [0; T ] . Moreover,with very high probability, n is proportional to T and this constitutes the asymptotic considered ingenomics, to which we will refer as the "DNA case". With our model (considering a uniform law onthe parents), we can also take into consideration the cases n ≪ T (parents are far away with respectto each other and one can almost identify which points are the children of a given parent) and n ≫ T (parents are too close to each other, which leads to hard statistical problems).If n = 1 , the purpose is to estimate the intensity of only one Poisson process. Many adaptivemethods have been proposed to deal with Poisson intensity estimation. For instance, Rudemo [30]studied data-driven histogram and kernel estimates based on the cross-validation method. Donoho[8] fitted the universal thresholding procedure proposed by Donoho and Johnstone [9] by using theAnscombe’s transform. Kolaczyk [18] refined this idea by investigating the tails of the distribution ofthe noisy wavelet coefficients of the intensity. By using model selection, other optimal estimators havebeen proposed by Reynaud-Bouret [25] or Willett and Nowak [32]. Reynaud-Bouret and Rivoirard [26]proposed a data-driven thresholding procedure that is near optimal under oracle and minimax pointsof view, with as few support assumptions as possible (the support of the intensity h may be unknownor not finite), unlike previous methods that need to assume that the intensity has a known boundedsupport.We notice that the reproduction function h can be also viewed as the intensity of a Cox process(for instance, see [5]) where the covariates are the parents U , . . . , U n . Comte et al. [4] proposed anoriginal estimator of the conditional intensity of a Cox process (more generally, a marker-dependentcounting process). Using model selection methods, they prove that their estimator satisfies an oracleinequality and has minimax properties. Note that we consider here point processes on the real line.Some aspects of similar spatial processes are studied in a parametric way [23], for instance.Some work has been done to study the statistical dependence between motif occurrences. Forinstance, in Gusto and Schbath’s article [12], the framework consists in modeling the occurrences oftwo motifs by a Hawkes process (see [13]): our framework can be viewed when the support of h is in R + as a very particular case of theirs. Their method, called FADO, uses maximum likelihood estimatesof the coefficients of h on a Spline basis coupled with an AIC criterion. However, even if the FADOprocedure is quite effective and can manage interactions between two types of events, spontaneousapparition (a child can be an orphan) and self-excitation (a child can give birth to another child),there are several drawbacks. In fact, this procedure is a parametric estimation method coupled with aclassical AIC criterion which behaves poorly for complex families of models. Moreover, FADO involvessparsity issues. Indeed, our feeling is that if interaction exists, say around the distance d bases, thefunction h to estimate should take large values around d and if there is no biological reason for anyother interaction, then h should be null anywhere else. However, if the FADO estimate takes small onparametric estimation in a Poissonian interactions model et al. [2] recently deal with multivariate Hawkes process models in order to modelthe joint occurrences of multiple transcriptional regulatory elements (TREs) along the genome thatare capable of providing new insights into dependencies among elements involved in transcriptionalregulation.In this paper, the proposed model is simple. Each child comes from one parent (no orphan andno child who is a parent), that is to say we do not take into account the phenomenons of sponta-neous apparition and self-excitation, contrary to Hawkes process models. But it brings novelties. Toestimate the reproduction function h , we propose a nonparametric method, using a wavelet thresh-olding rule that will compensate sparsity issues of the FADO method. Furthermore, our model treatsinteraction between two types of events, with a possible influence of the past occurrences but alsofuture occurrences. Then, there is the presence of a double asymptotic: the normalized length of thestudied genome T and the number n of parents, which is not usual. In the biological context, it is notacceptable assuming to know each child’s parent. Our model, via the reproduction function h , allowsto quantify the favored locations of children in relation to their parent, even if one cannot attribute achild to a parent before the statistical inference. First we provide in this paper theoretical results andwe derive oracle inequalities and minimax rates showing that our method achieves good theoreticalperformances. The proofs of these results are essentially based on concentration inequalities and onexponential and moment inequalities for U -statistics (see [6], [11] and [14]). Secondly some simulationsare carried out to validate our procedure and an application on real data ( Escherichia coli genome)is proposed. The procedure provides satisfying reconstructions, overcomes the problems raised bythe FADO method and agrees with the knowledge of the considered biological mechanism. For thesenumerical aspects, we have used a low computational complexity cascade algorithm.In Section 2, we define the notations and we describe the method. Then Section 2 discussesthe properties of our procedure for the oracle and minimax approaches. Section 3 is devoted to theimplementation of our method and provides simulations. The cascade algorithm is presented in Section3.1. Section 4 presents the application on the complete
Escherichia coli genome. A more technicalresult that is at the origin of the one stated in Section 2.3 and proofs can be found in Section 6(Appendix).
To estimate the reproduction function, we assume that h belongs to ( R ) and ∞ ( R ) . Consequently,we can consider the decomposition of h on a particular biorthogonal wavelet basis, built by Cohen etal. [3], that we can describe as follows. We set φ = [0 , the analysis father wavelet. For any r > ,there exist three functions ψ , ˜ φ and ˜ ψ with the following properties: • ˜ φ and ˜ ψ are compactly supported, • ˜ φ and ˜ ψ belong to C r +1 , where C r +1 denotes the Hölder space of order r + 1 , • ψ is compactly supported and is a piecewise constant function, • ψ is orthogonal to polynomials of degree no larger than r , Laure
Sansonnet • n ( φ k , ψ j,k ) j > ,k ∈ Z , ( ˜ φ k , ˜ ψ j,k ) j > ,k ∈ Z o is a biorthogonal family: for any j, j ′ > , for any k, k ′ ∈ Z , Z R φ k ( x ) ˜ ψ j ′ ,k ′ ( x ) dx = Z R ψ j,k ( x ) ˜ φ k ′ ( x ) dx = 0 , Z R φ k ( x ) ˜ φ k ′ ( x ) dx = { k = k ′ } , Z R ψ j,k ( x ) ˜ ψ j ′ ,k ′ ( x ) dx = { j = j ′ ,k = k ′ } , where for any x ∈ R , φ k ( x ) = φ ( x − k ) , ψ j,k ( x ) = 2 j/ ψ (2 j x − k ) and ˜ φ k ( x ) = ˜ φ ( x − k ) , ˜ ψ j,k ( x ) = 2 j/ ˜ ψ (2 j x − k ) . On the one hand, decomposition wavelets φ k and ψ j,k are piecewise constant functions and, on theother hand, reconstruction wavelets ˜ φ k and ˜ ψ j,k are smooth functions. This implies the followingwavelet decomposition of h ∈ ( R ) : h = X k ∈ Z α k ˜ φ k + X j > X k ∈ Z β j,k ˜ ψ j,k , (2.1)where for any j > and any k ∈ Z , α k = Z R h ( x ) φ k ( x ) dx, β j,k = Z R h ( x ) ψ j,k ( x ) dx. The Haar basis, used in practice, can be viewed as a particular biorthogonal wavelet basis, by setting ˜ φ = φ and ˜ ψ = ψ = ] ;1] − [0; ] , with r = 0 (even if the second property is not satisfied with sucha choice). The Haar basis is an orthonormal basis, which is not true for general biorthogonal waveletbases. This kind of decomposition has already been used in thresholding methods by Juditsky andLambert-Lacroix [16], Reynaud-Bouret and Rivoirard [26], and Reynaud-Bouret et al. [28].To shorten mathematical expressions, we set Λ = { λ = ( j, k ) : j > − , k ∈ Z } , and for any λ ∈ Λ , ϕ λ = (cid:26) φ k if λ = ( − , k ) ψ j,k if λ = ( j, k ) with j > , ˜ ϕ λ = (cid:26) ˜ φ k if λ = ( − , k )˜ ψ j,k if λ = ( j, k ) with j > and similarly β λ = (cid:26) α k if λ = ( − , k ) β j,k if λ = ( j, k ) with j > . Then (2.1) can be rewritten as h = X λ ∈ Λ β λ ˜ ϕ λ with β λ = Z R h ( x ) ϕ λ ( x ) dx (2.2)and now, we have to estimate these wavelet coefficients.For all λ in Λ , we define ˆ β λ an estimator of β λ as ˆ β λ = G ( ϕ λ ) n , with G ( ϕ λ ) = Z R n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21) dN t , (2.3)where π is the uniform distribution on [0; T ] and E π ( ϕ λ ( t − U )) denotes the expectation of ϕ λ ( t − U ) where U ∼ π (an independent copy of U , . . . , U n ). If n = 1 , we obtain the natural estimators of the β λ ’s in the case of only one Poisson process on the real line (see [26]). onparametric estimation in a Poissonian interactions model Lemma 2.1.
For all λ = ( j, k ) in Λ , E ( G ( ϕ λ )) = n Z R ϕ λ ( x ) h ( x ) dx, i.e. ˆ β λ is an unbiased estimator for β λ . Furthermore, its variance is upper bounded as follows: Var( ˆ β λ ) C (cid:26) n + 1 T + 2 − j nT (cid:27) and sup λ ∈ Λ Var( ˆ β λ ) C ′ (cid:26) n + nT (cid:27) , where C and C ′ depend on k h k , k h k ∞ , k ψ k and k ψ k . The behavior of the variance of the ˆ β λ ’s is not usual, because two parameters n and T are involved.Nevertheless, when n is proportional to T ("DNA case" as explained in Introduction), the variance isbounded by /T up to a constant, as for the Hawkes process (see [27]). When n ≪ T , the varianceis bounded by /n up to a constant, which means that the apparition’s distance between two parentsis large enough to make their interactions insignificant for the statistical analysis. So in this case,our framework can be viewed as the observation of a n -sample of a Poisson process with commonintensity h (see [26]). Finally, when n ≫ T , the variance deteriorates and is only bounded by n/T up to a constant, and in this case, the small apparition’s distance between two parents leads to roughstatistical issues hard to overcome. We start assuming that h is compactly supported in [ − A ; A ] , with A a positive real number. Thisquantity A can denote the maximal memory along DNA sequences (this is chosen by the biologists (see[12]), depending on the underlying biological process they have in mind). Furthermore, the propertiesof the biorthogonal wavelet bases introduced previously allow us to assume that we know a positivereal number M such that the support of ψ is contained in [ − M ; M ] .First, we introduce the following deterministic subset Γ of ΛΓ = (cid:8) λ = ( j, k ) ∈ Λ : − j j , k ∈ K j (cid:9) , where j a positive integer that will be fixed later and at each resolution level j , we denote K j theset of integers such that the intersection of the supports of ϕ λ and h is not empty, with λ = ( j, k ) .Straightforward computations lead to a cardinal of Γ of order j .Then, given some parameter γ > , we define for any λ ∈ Γ , the threshold η λ ( γ, ∆) = r γj e V (cid:16) ϕ λ n (cid:17) + γj B (cid:16) ϕ λ n (cid:17) + ∆ N R n (2.4)where ∆ is a positive quantity and N R is the number of points of the aggregated process N lying in R . For theoretical results, ∆ will be taken of order j j / n + j √ T + √ j nT times a constant depending on γ , k ψ k , k ψ k and k ψ k ∞ . In (2.4), we set B (cid:16) ϕ λ n (cid:17) = 1 n B ( ϕ λ ) = 1 n (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) n X i =1 (cid:20) ϕ λ ( · − U i ) − n − n E π ( ϕ λ ( · − U )) (cid:21)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∞ (2.5)and e V (cid:16) ϕ λ n (cid:17) = 1 n e V ( ϕ λ ) = 1 n (cid:18) ˆ V ( ϕ λ ) + q γj ˆ V ( ϕ λ ) B ( ϕ λ ) + 3 γj B ( ϕ λ ) (cid:19) (2.6) Laure
Sansonnet where ˆ V ( ϕ λ ) = Z R n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)! dN t . (2.7)Since they only depend on the observations, the numerical values of B ( ϕ λ ) , ˆ V ( ϕ λ ) and so e V ( ϕ λ ) defined respectively by (2.5), (2.7) and (2.6) can be exactly computed.We denote ˜ β the estimator of β = ( β λ ) λ ∈ Λ associated with the previous thresholding rule: ˜ β = (cid:16) ˆ β λ | ˆ β λ | > η λ ( γ, ∆) λ ∈ Γ (cid:17) λ ∈ Λ (2.8)and finally, we set ˜ h = X λ ∈ Λ ˜ β λ ˜ ϕ λ (2.9)an estimator of h that only depends on the choice of ( γ, ∆) and j fixed later.Thresholding procedures have been introduced by Donoho and Johnstone [9]. They derive from thesufficiency to keep a small amount of the coefficients to have a good estimation of the function h . Thethreshold η λ ( γ, ∆) seems to be defined in a rather complicated manner but the first term: q γj e V (cid:0) ϕ λ n (cid:1) looks like the universal threshold proposed by [9] in the Gaussian regression framework, by choosing γ close to 1 and j of order log n . The universal threshold of [9] is defined by η λ = p σ log n ,where σ (assumed to be known) is the variance of each noisy wavelet coefficient. In our setting, Var( ˆ β λ ) depends on h , so it is (over)estimated by e V (cid:0) ϕ λ n (cid:1) . The other terms of the threshold (2.4) areunavoidable remaining terms which allow to obtain sharp concentration inequalities. Our main result is an oracle one. Given a collection of procedures (for example, penalization, projectionor thresholding), the oracle represents the ideal "estimator" among the collection. In our setting theoracle gives, for our thresholding rule, the coefficients that have to be kept. In our framework (see [9]and [26]), the "oracle estimator" is ¯ h = X λ ∈ Γ ¯ β λ ˜ ϕ λ , with ¯ β λ = ˆ β λ Var( ˆ β λ ) <β λ . This "estimator" is not a true estimator, of course, since it depends on h . The approach of optimaladaptation is to derive true estimators which achieve the same performance as the "oracle estimator".Our goal is now to compare the risk of ˜ h defined in Section 2.2 to the oracle risk: E (cid:0) k ¯ h − h k (cid:1) = X λ ∈ Γ E (cid:2) ( ˆ β λ Var( ˆ β λ ) <β λ − β λ ) (cid:3) + X λ Γ β λ = X λ ∈ Γ min(Var( ˆ β λ ) , β λ ) + X λ Γ β λ . Theorem 1.
We assume that n > , j ∈ N ∗ such that j n < j +1 , γ > and ∆ is definedin the Appendix by (6.15) and (6.16). Then the estimator ˜ h defined in Section 2.2 satisfies E (cid:16) k ˜ h − h k (cid:17) C inf m ⊂ Γ ( X λ m β λ + " (log n ) × n + (log n ) × nT | m | ) + C " n + nT , where | m | is the cardinal of the set m , C is a positive constant depending on γ , k h k , k h k ∞ , k ψ k , k ψ k and k ψ k ∞ and C is a positive constant depending on the compact support of h , k h k , k h k ∞ , thecompact support of ψ , k ψ k , k ψ k and k ψ k ∞ . As the expression between brackets is of the same order as the upper bound of
Var( ˆ β λ ) establishedin Lemma 2.1 (up to a logarithmic term), the oracle type inequality of Theorem 1 proves that theestimator ˜ h achieves satisfying theoretical properties. onparametric estimation in a Poissonian interactions model n proportional to T ("DNA case"), then the estimator ˜ h defined in Section 2.2 satisfies E (cid:16) k ˜ h − h k (cid:17) C inf m ⊂ Γ ( X λ m β λ + (log T ) T | m | ) + C T .
This oracle type inequality is similar to the one obtained by Theorem 1 of [27] where the Hawkes processis considered. Since n is proportional to T , this inequality is typical of classical oracle inequalitiesobtained in model selection (for example, see Theorem 2.1 of [26] where only one Poisson process onthe real line is considered or more generally, see [20] for density estimation).Then, we establish a minimax result on Besov balls still with n is proportional to T . For any R > and s ∈ R such that < s < r + 1 (where r > denotes the wavelet smoothness parameterintroduced in the description of the biorthogonal wavelet bases at the beginning of the current section),we consider the following Besov ball of radius R : B s , ∞ ( R ) = f ∈ ( R ) : f = X λ ∈ Λ β λ ˜ ϕ λ , ∀ j > − , X k ∈K j β j,k ) R − js . Now, let us state the upper bound of the risk of ˜ h when h belongs to B s , ∞ ( R ) . Corollary 2.1.
Let
R > and s ∈ R such that < s < r + 1 . Assume that h ∈ B s , ∞ ( R ) and n isproportional to T . Then the estimator ˜ h defined in Section 2.2 satisfies E (cid:16) k ˜ h − h k (cid:17) C (cid:18) (log T ) T (cid:19) s s +1 , where C is a positive constant depending on γ , the compact support of h , k h k , k h k ∞ , the compactsupport of ψ , k ψ k , k ψ k , k ψ k ∞ and R . The rate of the risk of ˜ h corresponds to the minimax rate, up to the logarithmic term, for estimationof a compactly supported intensity of a Poisson process (see [25]) or for a compactly supported densitywhen we have n i.i.d. observations (see [10]). One more time this illustrates the optimality of theprocedure ˜ h but in the minimax setting. From now on we consider the context of DNA, i.e. n is proportional to T . As mentioned in Introduction,we can assume that the parents are the points of a homogeneous Poisson process N on [0; T ] withconstant intensity µ which allows to write n ≃ µT .In this section, we specify a procedure for the computation of the family of random thresholds ( η λ ( γ, ∆)) λ ∈ Γ to reconstruct the reproduction function h . We also provide some simulations in orderto calibrate parameters from a numerical point of view and to show the robustness of our procedure. We only focus on the Haar basis where φ = ˜ φ = [0;1] and ψ = ˜ ψ = ] ;1] − [0; ] , because the expression of the functions associated to this basis, that are piecewise constant functions,allows to implement simple and fast algorithms. Furthermore, considering this kind of functions issuitable for our genomic setting. In fact, according to biological studies, the reproduction function h is expected to be very irregular, with large null ranges and sudden changes at specific distances. Werecall that h is assumed to be compactly supported in [ − A ; A ] , with A a positive integer in practice. Laure
Sansonnet
We consider the thresholding rule ˜ h defined in Section 2.2 with Γ = (cid:8) λ = ( j, k ) ∈ Λ : − j j , k ∈ K j (cid:9) , and η λ ( γ, δ ) = r γj ˆ V (cid:16) ϕ λ n (cid:17) + γj B (cid:16) ϕ λ n (cid:17) + δ √ T N R n . Observe that η λ ( γ, δ ) slightly differs from the threshold defined in (2.4) since the parameter ∆ isreplaced with δ √ T (thanks to the definition (6.15) of ∆ ) and e V ( ϕ λ ) is now replaced with ˆ V ( ϕ λ ) (thereis no major difference in our simulations). The ideal choice (from a theoretical point of view) of themaximal resolution level j is given by Theorem 1, that is to say j is the positive integer such that j n < j +1 . But we will fix j = 5 in the sequel (in particular, to limit the computation time).The choice of the parameters γ and δ is discussed in the next subsection.A key point of the algorithm is the computation of the quantity S ( ϕ λ )( t ) = n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21) , for all t ∈ R that appears in ˆ β λ , B ( ϕ λ ) and ˆ V ( ϕ λ ) . We decompose it into two parts: a random "piecewise constant"part S r ( ϕ λ ) = n X i =1 ϕ λ ( · − U i ) and a deterministic (piecewise affine) part ( n − E π ( ϕ λ ( t − U )) . Notethat the deterministic part can easily be implemented with a low computational cost. This is notthe case of the random "piecewise constant" part for which we have constructed a cascade algorithm,inspired by the pioneering work of Mallat [19]. To explain this algorithm in few words, we use thefollowing notations: for any j > , for any k ∈ Z , for any x ∈ R , φ j,k ( x ) = 2 j/ φ (2 j x − k ) , ψ j,k ( x ) = 2 j/ ψ (2 j x − k ) , where φ j,k are father wavelets and ψ j,k mother wavelets. We have the following relationships betweenwavelets at level j and wavelets at level ( j + 1) : ψ j,k = √ (cid:0) φ j +1 , k +1 − φ j +1 , k (cid:1) and φ j,k = √ (cid:0) φ j +1 , k + φ j +1 , k +1 (cid:1) . (3.1)We notice that only mother wavelets and the father wavelet of level j = 0 (that corresponds to ϕ λ with λ = ( − , k ) ) are used to reconstruct the signal. The cascade algorithm is implemented as follows.1. Compute S r ( φ j , ) . Since S r ( φ j , ) is a piecewise constant function, this computation gives apartition and the values of S r ( φ j , ) on the intervals of the partition.2. Shift by +2 − j k the intervals of the previous partition by keeping the same values on the partitionto obtain S r ( φ j ,k ) for any integer k in [ − j A ; 2 j A − .3. For any resolution level j going from ( j − to 0, in a decreasing way, compute S r ( ψ j,k ) and S r ( φ j,k ) with expressions (3.1). The quantities S r ( ψ j,k ) allow the reconstruction of the signaland the quantities S r ( φ j,k ) are transitional and will be used for the computations of the lowerresolution level ( j − .4. Also keep S r ( φ ,k ) because it is used for the reconstruction of the signal.Now, let us define our thresholding estimate of h for a practical purpose. Step 0
Let j = 5 and choose also positive constants γ and δ . Step 1
Set
Γ = (cid:8) λ = ( j, k ) ∈ Λ : − j j , k ∈ K j (cid:9) and compute for any λ in Γ , S ( ϕ λ )( X ) for allpoints X of the process N . In the same way, also compute the coefficients ˆ β λ , B ( ϕ λ ) and ˆ V ( ϕ λ ) . onparametric estimation in a Poissonian interactions model Step 2
Threshold the coefficients by setting ˜ β λ = ˆ β λ | ˆ β λ | > η λ ( γ,δ ) according to the following thresholdchoice: η λ ( γ, δ ) = r γj ˆ V (cid:16) ϕ λ n (cid:17) + γj B (cid:16) ϕ λ n (cid:17) + δ √ T N R n . Step 3
Reconstruct the function h by using the ˜ β λ ’s and denote ˜ h = X λ ∈ Λ ˜ β λ ˜ ϕ λ . The programs have been coded in
Scilab 5.2 and are available upon request.
Now, we deal with the choice of the parameters γ and δ in our procedure from a practical point ofview. The question is: how to choose the optimal parameters? We work with two testing functionsdenoted ’Signal1’ and ’Signal2’ whose definitions are given in the following table:’Signal1’ ’Signal2’ ν × [0;1] ν × (cid:0) [0 . . + [1;1 . (cid:1) with ν , the children’s intensity, set to 4. We fix willfully A = 10 . Such a choice of A (remember that [ − A ; A ] is the support of h ) assumes that we do not know the support of functions. We recall that j = 5 .Given T , µ the parents’ intensity and a testing function, we denote R ( γ, δ ) the quadratic risk ofour procedure ˜ h (depending on ( γ, δ ) ) defined in Section 3.1. Of course, we aim at finding values of ( γ, δ ) such that this quadratic risk is minimal. The average over 100 simulations of R ( γ, δ ) is computedproviding an estimation of E ( R ( γ, δ )) . This average risk, denoted ¯ R ( γ, δ ) and viewed as a functionof the parameters ( γ, δ ) , is plotted for ( T, µ ) ∈ { (10000 , . , (2000 , . , (2000 , . } and for the twosignals considered previously: ’Signal1’ and ’Signal2’. Figure 1:
The function ( γ, δ ) ¯ R ( γ, δ ) for ’Signal1’ and ’Signal2’ for different values of T and µ : ’Signal1’in (cid:4) and ’Signal2’ in (cid:4) with ( T, µ ) = (10000 , . ; ’Signal1’ in (cid:4) and ’Signal2’ in (cid:4) with ( T, µ ) = (2000 , . ;’Signal1’ in (cid:4) and ’Signal2’ in (cid:4) with ( T, µ ) = (2000 , . .0 Laure
Sansonnet
Figure 1 displays ¯ R for ’Signal1’ and ’Signal2’ decomposed on the Haar basis. This figure allows todraw the following conclusion: for any ( T, µ ) ∈ { (10000 , . , (2000 , . , (2000 , . } and for ’Signal1’or ’Signal2’, ¯ R ( γ, δ ) ≈ for many values of ( γ, δ ) . So, we observe a kind of "plateau phenomenon".Reconstructions of the intensities of ’Signal1’ and ’Signal2’ are respectively given in Figure 2 andFigure 3 with the choice ( γ, δ ) = (0 . , . , a common value of several plateaus. Note the goodperformance of our thresholding rule, in particular for T = 10000 and µ = 0 . (we have µT = 1000 parents and µνT = 4000 children in average), which corresponds to the real case treated in Section 4.Thus, we propose to take systematically ( γ, δ ) = (0 . , . in our procedure ˜ h defined in Section 3.1. -0.53.5 0.0 0.5 1.5-1.0 1.04.0 2.03.02.52.01.51.00.50.0-0.54.5 0.5-1.0 1.03.5 1.50.04.0 -0.5 2.03.02.52.01.51.00.50.0-0.54.5 -1.03.5 -0.5 0.0 1.04.5 0.54.0 1.53.02.52.01.51.00.50.0-0.5 2.0 Figure 2:
Reconstructions of ’Signal1’ (true: dotted line, estimate: solid line): left: ( T, µ ) = (10000 , . ;middle: ( T, µ ) = (2000 , . ; right: ( T, µ ) = (2000 , . . -0.58 0.0 0.5 1.5-1.0 1.010 2.0642012 0.5-1.0 1.08 1.50.010 -0.5 2.0642012 1.08 -0.5 0.01012 0.5 1.5-1.06420 2.0 Figure 3:
Reconstructions of ’Signal2’ (true: dotted line, estimate: solid line): left: ( T, µ ) = (10000 , . ;middle: ( T, µ ) = (2000 , . ; right: ( T, µ ) = (2000 , . . onparametric estimation in a Poissonian interactions model h We are interested in the robustness of our procedure with respect to the support issue from a numericalpoint of view. What happens if we are wrong about the support of the function that we want toestimate? For instance, we consider the testing function denoted ’Signal3’ whose definition is given inthe following table: ’Signal3’ ν × (cid:0) [ − . − . + [4 . (cid:1) with ν , the children’s intensity, set to .Figure 4 displays reconstructions of ’Signal3’ with different supports of h : [ − A ; A ] , with A ∈{ , , } . This figure shows that when we take a not large enough support ( A = 1 or ), we do notmake large errors of approximation on [ − A ; A ] . So, the procedure seems to take into account whathappens beyond the chosen support. And for A = 10 , we have a good complete reconstruction of’Signal3’. Figure 4:
Reconstructions of ’Signal3’ (true: dotted line, estimate: solid line) with different supports: top: A = 1 ; middle: A = 5 ; bottom: A = 10 .Finally, even if the support of the reproduction function is unknown, our method estimates correctlythe signal on the chosen support, which explains the robustness of our procedure with respect to thesupport issue. Here, we investigate the case of spontaneous apparition. Even if our model does not take into accountthe spontaneous apparition (i.e. children can not be orphans), we are interested by the performance ofour procedure if there is a presence of orphans. On the one hand, let us give two processes: a processof intensity ’Signal1’ with ν = 3 , T = 10000 and µ = 0 . , to which is added a homogeneous Poissonprocess on [0; T + 1] with intensity µ (4 − ν ) = 0 . (the orphans are viewed as a Poissonian noise).Thus, we have in average 1000 parents, 3000 children having a parent and 1000 children being orphans.On the other hand, let us give two other processes: a process of intensity ’Signal1’ with ν = 1 this2 Laure
Sansonnet time, T = 10000 and µ = 0 . , to which is added a homogeneous Poisson process on [0; T + 1] withintensity µ (4 − ν ) = 0 . . Thus, we have in average 1000 parents, 1000 children having a parent and3000 children being orphans.Reconstructions of ’Signal1’ with ν = 3 and ν = 1 are given in Figure 5. When there is a smallproportion of children being orphans, the reconstruction is still acceptable; the procedure can managefew orphans. But, when there are too many orphans, our procedure makes approximation errors, whichare due to the fact that our model consists in associating any child with a parent. -1.03.0 -0.5 0.0 1.04.0 0.53.5 2.02.52.01.51.00.50.0-0.5-1.0 1.5 0.5-1.0 1.08 1.50.010 -0.5 2.0642012 Figure 5:
Reconstructions of ’Signal1’ (true: dotted line, estimate: solid line) with different values of ν : left: ν = 3 ; right: ν = 1 .We mention that the case of spontaneous apparition is only numerical. For a more precise study ofthis phenomenon, we should extend our model by adding a positive constant to the intensity function t P ni =1 h ( t − U i ) , that would represent the orphans. This is outside the scope of this paper. As application, we are interested in the
Escherichia coli genome.
E. coli is an intestinal bacterium inmammals and very common in humans which is widely studied and used in genetics. More precisely,we are interested in the study of the dependence between promoter sites and genes along the completegenome of the bacterium. In particular, promoters are usually structured motifs located before thegenes and not too far from them. Here, we have considered the major promoter of the bacterium
E. coli and more precisely the word tataat . Most of the genes of
E. coli should be preceded by this wordat a very short distance apart. In order to validate our thresholding estimation procedure (proposedat Section 3), we hope to detect short favored distances between genes and previous occurrences of tataat .For this, as in [12] we have analyzed the sequence composed of both strands of
E. coli genome(4639221 bases); each strand being separated by 10000 artificial bases to avoid artificial dependenciesbetween occurrences on one strand and occurrences on the other strand; we took 10000 bases for themaximal memory. It then represents a sequence of length 9288442; there are 4290 genes (we took thepositions of the first base of coding sequences) and 1036 occurrences of tataat . For convenience, weset T = 9289 and so A = 10 (we work on a scale of ). We recall that we have fixed j = 5 andtaken ( γ, δ ) = (0 . , . . onparametric estimation in a Poissonian interactions model tataat influences genes and so, in our model, theparents are the occurrences of tataat and children are the occurrences of genes. To give general insighton h , Figure 6 gives the estimator ˜ h defined in Section 3.1 without Step 2 (no thresholding), i.e. wehave kept all the estimated coefficients. We observe a peak around which corresponds to what wethought about the fact that most of the genes of E. coli should be preceded by the word tataat at avery short distance apart. We also observe other peaks, for instance around 1200 bases. The biologicalsignificance of these peaks remains an open question. t-4 -2 0 4 102 8 estimator -6-8-101.21.00.80.60.40.20.0-0.2-0.4 6
Figure 6:
Estimator, no thresholding, for E. coli data at the scale (i.e. 1 corresponds to 1000 bases),with parents= tataat and children=genes .We apply the complete procedure proposed in Section 3.1 (with thresholding) and we obtainFigure 7. The shape of this estimator explains how occurrences of genes are influenced by occurrencesof tataat . We can draw following conclusions, that coincide with the ones we could expect: • The estimator ˜ h ( t ) = 0 if t and t > . It means that for such t ’s, gene occurrences seemto be uncorrelated of tataat occurrences. • Conversely, if t ∈ [0; 500] , ˜ h ( t ) > , meaning that short distances are favored; smaller the distance,higher is the influence. estimator -0.5-1.0-1.5-2.00.0-0.20.6 Figure 7:
Estimator ˜ h defined in Section 3.1 for E. coli data at the scale , with parents= tataat andchildren=genes .Then, we investigate the way genes influences the DNA motif tataat and so, in our model, theparents are the occurrences of genes and children are the occurrences of tataat . Figure 8 gives the4 Laure
Sansonnet estimator ˜ h defined in Section 3.1 (with ( γ, δ ) = (0 . , . ). The shape of this estimator explains howoccurrences of tataat are influenced by occurrences of genes. We can draw following conclusions, thatis completely coherent with biological observations: • When t − and t > , ˜ h ( t ) = 0 . It means that for such t ’s, tataat occurrences seem tobe uncorrelated of gene occurrences. • When t ∈ [ − , ˜ h ( t ) > , meaning that there is a preference having a word tataat justbefore the occurrence of a gene. It corresponds to the same conclusions drawn from Figure 7(second point). The motif tataat is part of the most common promoter sites of E. coli meaningthat it should occur in front of the majority of the genes. • When t ∈ [0; 1000] , ˜ h ( t ) < ; occurrences of tataat are avoided for such distances t . Genes onthe same strand do not usually overlap and they are about 1000 bases long in average: this factcan explain this conclusion. estimator -0.5-1.0-1.5-2.00.00-0.050.15 Figure 8:
Estimator ˜ h defined in Section 3.1 for E. coli data at the scale , with parents=genes andchildren= tataat .Finally, Figure 9 presents the results of the FADO procedure [12] and Figure 10 presents the resultsof the Islands procedure of [27]. For the FADO procedure, we have forced the estimators to be piecewiseconstant to make the comparison easier. Our results agree with the ones obtained by FADO andIslands. But our method has advantage to point out that nothing significant happens after a certaindistance (contrary to the FADO procedure), has advantage to treat interaction with another type ofevents (contrary to the Islands procedure) and has advantage to deal with the dependence on the pastoccurrences but also on the future occurrences (the function h is supported in R + for the two otherprocedures). For algorithmic reason, a practical limitation of our method is that we only considerpiecewise constant estimators (as for the Islands procedure), but it is enough to get a general trend onfavored or avoided distances within a point process. onparametric estimation in a Poissonian interactions model . . . . FADO: m = 12 t e s t i m a t o r − − + − − FADO: m = 15 t e s t i m a t o r Figure 9:
FADO estimators for both E. coli datasets: left: tataat ; right: genes . . . . . . m = 5 t e s t i m a t o r − − − − + − − m = 4 t e s t i m a t o r Figure 10:
Islands estimators for both E. coli datasets: left: tataat ; right: genes . In our paper, we have investigated the dependencies between two given motifs. A random thresh-olding procedure has been proposed in Section 2.2. The general results of Section 2.3 have revealedthe optimality of the procedure in the oracle and minimax setting. Our theoretical results have beenstrengthened by simulations illustrating the robustness of our procedure, despite a calibration of pa-rameters from a practical point of view that differs from the theoretical choice. Section 4 has validatedthe procedure with a good detection of favored or avoided distances between occurrences of tataat and genes along the
E. coli genome.Further extensions of our model could be investigated. First, we could consider a more sophisticatedmodel that takes into account the phenomenons of spontaneous apparition and self-excitation (as forthe complete Hawkes model). But this model raises serious difficulties from the theoretical point ofview. This is an exciting challenge to overcome them. Secondly, we could extend our cascade algorithmto general wavelet bases and not only to Haar bases. Finally, it is also relevant to study similar processesin the spatial framework and to connect them, for instance, to the Neymann-Scott process (see Section6.3 of [5]), which is a stimulating topic we wish to consider.
Acknowledgments:
The author wishes to thank Sophie Schbath for the two genomic data sets usedin Section 4 and both her PhD advisors, Patricia Reynaud-Bouret and Vincent Rivoirard, for a wealthof smart advice and encouragement along this work.6
Laure
Sansonnet
References [1]
Boucheron, S., Bousquet, O., Lugosi, G. and Massart, P.
Moment inequalities forfunctions of independent random variables.
Ann. Probab. , 33(2):514–560, Mar. 2005.[2]
Carstensen, L., Sandelin, A., Winther, O. and Hansen, N. R.
Multivariate Hawkesprocess models of the occurrence of regulatory elements.
BMC Bioinformatics , 11(456), Sep.2010.[3]
Cohen, A., Daubechies, I. and Feauveau, J. C.
Biorthogonal bases of compactly supportedwavelets.
Comm. Pure Appl. Math. , 45:485–560, 1992.[4]
Comte, F., Gaïffas, S. and Guilloux, A.
Adaptive estimation of the conditional intensityof marker-dependent counting processes. arXiv:0810.4263v1 , Oct. 2008.[5]
Daley, D. J. and Vere-Jones, D.
An Introduction to the Theory of Point Processes - VolumeI: Elementary Theory and Methods . Probab. Appl. (N. Y.), 2003.[6] de la Peña, V. H. and Giné, E.
Decoupling: From Dependence to Independence . Probab. Appl.(N. Y.), 1999.[7] de la Peña, V. H. and Montgomery-Smith, S. J.
Decoupling inequalities for the tail prob-abilities of multivariate U -statistics. Ann. Probab. , 23(2):806–816, Apr. 1995.[8]
Donoho, D. L.
Nonlinear wavelet methods for recovery of signals, densities, and spectra fromindirect and noisy data. In
Different Perspectives on Wavelets - AMS Short Course, San Antonio(Texas), 1993 , volume 47, pages 173–205. Proc. Sympos. Appl. Math., 1993.[9]
Donoho, D. L. and Johnstone, I. M.
Ideal spatial adaptation by wavelet shrinkage.
Biometrika , 81(3):425–455, 1994.[10]
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D.
Density estimationby wavelet thresholding.
Ann. Statist. , 24(2):508–539, 1996.[11]
Giné, E., Latala, R. and Zinn, J.
Exponential and moment inequalities for U -statistics. In High Dimensional Probability II , volume 47, pages 13–38. Progr. Probab. Birkhäuser Boston, 2000.[12]
Gusto, G. and Schbath, S.
FADO: a statistical method to detect favored or avoided distancesbetween occurrences of motifs using the Hawkes’ model.
Stat. Appl. Genet. Mol. Biol. , 4(1), Sep.2005.[13]
Hawkes, A. G.
Spectra of some self-exciting and mutually exciting point processes.
Biometrika ,58(1):83–90, 1971.[14]
Houdré, C. and Reynaud-Bouret, P.
Exponential inequalities, with constants, for U -statistics of order two. In Stochastic Inequalities and Applications , volume 56, pages 55–69.Progr. Probab. Birkhäuser Verlag Basel, 2003.[15]
Härdle, W., Kerkyacharian, G., Picard, D. and Tsybakov, A.
Wavelets, Approximationand Statistical Applications , volume 129. Lecture Notes in Statist., 1998.[16]
Juditsky, A. and Lambert-Lacroix, S.
On minimax density estimation on R . Bernoulli ,10(2):187–220, Apr. 2004.[17]
Kingman, J. F. C.
Poisson Processes . Oxford Sci. Publ., 1993.[18]
Kolaczyk, E. D.
Wavelet shrinkage estimation of certain Poisson intensity signals using correctedthresholds.
Statist. Sinica , 9:119–135, 1999. onparametric estimation in a Poissonian interactions model
Mallat, S. G.
Multiresolution approximations and wavelet orthonormal bases of L ( R ) . Trans. Amer. Math. Soc. , 315(1):69–87, Sep. 1989.[20]
Massart, P.
Concentration Inequalities and Model Selection, St. Flour 2003 , volume 1896. Lec-ture Notes in Math., 2007.[21]
Ogata, Y. and Akaik H.
On linear intensity models for mixed doubly stochastic Poisson andself-exciting point processes.
J. R. Stat. Soc. Ser. B Stat. Methodol. , 44(1):102–107, 1982.[22]
Ozaki, T.
Maximum likelihood estimation of Hawkes’ self-exciting point processes.
Ann. Inst.Statist. Math. , 31(1):145–155, 1979.[23]
Rasmussen, J. G.
Aspects of temporal and spatio-temporal processes . PhD thesis, Aalborg Uni-versity - Denmark, 2006.[24]
Reinert, G. and Schbath, S.
Large compound Poisson approximations for occurrences ofmultiple words. In
Statistics in Molecular Biology and Genetics , volume 33, pages 257–275. IMSLecture Notes Monogr. Ser., 1999.[25]
Reynaud-Bouret, P.
Adaptive estimation of the intensity of inhomogeneous Poisson processesvia concentration inequalities.
Probab. Theory Related Fields , 126:103–153, 2003.[26]
Reynaud-Bouret, P. and Rivoirard, V.
Near optimal thresholding estimation of a Poissonintensity on the real line.
Electron. J. Stat. , 4:172–238, 2010.[27]
Reynaud-Bouret, P. and Schbath, S.
Adaptive estimation for Hawkes processes; applicationto genome analysis.
Ann. Statist. , 38(5):2781–2822, 2010.[28]
Reynaud-Bouret, P., Rivoirard, V. and Tuleau-Malot, C.
Adaptive density estimation:a curse of support?
J. Statist. Plann. Inference , 141:115–139, 2011.[29]
Robin, S., Rodolphe, F. and Schbath, S.
DNA, Words and Models . Cambridge Univ. Press,2005.[30]
Rudemo, M.
Empirical choice of histograms and kernel density estimators.
Scand. J. Stat. , 9:65–78, 1982.[31]
Schbath, S.
Compound Poisson approximation of word counts in DNA sequences.
ESAIMProbab. Stat. , 1:1–16, 1995.[32]
Willett, R. M. and Nowak, R. D.
Multiscale Poisson intensity and density estimation.
IEEETrans. Inform. Theory , 53(9):3171–3187, Sep. 2007.
In the sequel, the values of the constants
K, K ′ , K , K , K , K , . . . may change from line to line. Forthe sake of clarity, the proofs are fully detailed in this appendix. We first give a general result stated and proved in [26].
Theorem 2 (Theorem 2.2 of [26]) . To estimate a countable family β = ( β λ ) λ ∈ Λ , such that k β k ℓ < ∞ ,we assume that a family of coefficient estimators ( ˆ β λ ) λ ∈ Γ , where Γ is a known deterministic subset of Λ , and a family of possibly random thresholds ( η λ ) λ ∈ Γ are available and we consider the thresholdingrule ˜ β = (cid:16) ˆ β λ | ˆ β λ | > η λ λ ∈ Γ (cid:17) λ ∈ Λ . Laure
Sansonnet
Let ε > be fixed. Assume that there exist a deterministic family ( H λ ) λ ∈ Γ and three constants κ ∈ [0; 1[ , ω ∈ [0; 1] and ζ > (that may depend on ε but not on λ ) with the following properties:(A1) For all λ in Γ , P (cid:16) | ˆ β λ − β λ | > κη λ (cid:17) ω. (A2) There exist < p, q < ∞ with p + q = 1 and a constant R > such that for all λ in Γ , h E (cid:16) | ˆ β λ − β λ | p (cid:17)i p R max (cid:26) H λ , H p λ ε q (cid:27) . (A3) There exists a constant θ such that for all λ in Γ such that H λ < θε , P (cid:16) | ˆ β λ − β λ | > κη λ , | ˆ β λ | > η λ (cid:17) H λ ζ. Then the estimator ˜ β satisfies − κ κ E (cid:16) k ˜ β − β k ℓ (cid:17) E inf m ⊂ Γ κ − κ X λ m β λ + 1 − κ κ X λ ∈ m ( ˆ β λ − β λ ) + X λ ∈ m η λ + LD X λ ∈ Γ H λ , with LD = Rκ (cid:0) (1 + θ − /q ) ω /q + (1 + θ /q ) ε /q ζ /q (cid:1) . Using the previous theorem, we establish the following result that we will prove in Section 6.4.
Theorem 3.
Let n > , j ∈ N ∗ , γ > and ∆ defined by (6.15) and (6.16). Then the estimator ˜ β defined by (2.8) in Section 2.2 satisfies E (cid:16) k ˜ β − β k ℓ (cid:17) C inf m ⊂ Γ ( X λ m β λ + F ( j , n, T ) | m | ) + C R (cid:0) e − κ j γ/ + e − κ n k h k / (cid:1) j , where C is a positive constant depending on γ , k h k , k h k ∞ , k ψ k , k ψ k and k ψ k ∞ , C is a positiveconstant depending on the compact support of h and the compact support of ψ , F ( j , n, T ) = j n + j / j / n / + j j n + j T + j / j / nT / + j nT ,R = C R ( n + 2 j / n / + 2 j / nT / + nT ) , with C R a positive constant depending on k h k , k h k ∞ , the compact support of ψ , k ψ k , k ψ k and k ψ k ∞ and κ and κ are absolute constants in ]0; 1[ . To obtain Theorem 1, we consider n > , we take j the positive integer such that j n < j +1 and γ > in the previous theorem. Therefore, we note that F ( j , n, T ) = j n + j / j / n / + j j n + j T + j / j / nT / + j nT K ( log nn + (log n ) / n / n / + (log n ) nn + (log n ) T + (log n ) / n / nT / + (log n ) nT ) K ( log nn + (log n ) / n + (log n ) n + (log n ) T + (log n ) / n / T / + (log n ) nT ) K ( (log n ) n + (log n ) nT ) , onparametric estimation in a Poissonian interactions model n T and n > T ), with K an absolute positive constant (that changes from line to line) and R K ′ C R ( n + nT ) , with K ′ an absolute positive constant. Moreover, e − κ j γ/ j is bounded thanks to the choice of γ .Finally, since k ˜ h − h k K k ˜ β − β k ℓ , with K a positive constant depending only on the functions that generate the biorthogonal waveletbasis, we establish Theorem 1. Before proving Theorem 3, we establish two lemmas which we will use throughout the proof.
Lemma 6.1. (a) For any function f in ( R ) and for all t ∈ R , Var π ( f ( t − U )) T Z R f ( x ) dx , where Var π ( f ( t − U )) denotes the variance of f ( t − U ) where U ∼ π .(b) For any function f in ( R ) and for all t ∈ R , Z R E π ( f ( t − U )) dt = Z R f ( x ) dx .(c) For any nonnegative function f in ( R ) and for all t ∈ R , E π ( f ( t − U )) T Z R f ( x ) dx .Proof. (a) Let f ∈ ( R ) and t ∈ R . Var π ( f ( t − U )) E π ( f ( t − U )) = 1 T Z T f ( t − u ) du T Z R f ( x ) dx. (b) Let f ∈ ( R ) and t ∈ R . Z R E π ( f ( t − U )) dt = E π (cid:18)Z R f ( t − U ) dt (cid:19) = E (cid:18)Z R f ( x ) dx (cid:19) = Z R f ( x ) dx. (c) Let f ∈ ( R ) such that f > and t ∈ R . E π ( f ( t − U )) = 1 T Z T f ( t − u ) du T Z R f ( x ) dx. The next result is a Rosenthal type inequality for any Poisson process, that extends Lemma 6.2 of[26].
Lemma 6.2.
Let p > . Consider a Poisson process N on ( X , X ) a measurable space, with a finite meanmeasure ν : X 7→ R + and a function ϕ : X R which belongs to p ( ν ) . We denote ˆ β = Z X ϕ ( x ) dN x anatural estimator of β = Z X ϕ ( x ) dν ( x ) that satisfies E ( ˆ β ) = β . Then, there exists a positive constant C ( p ) only depending on p such that E ( | ˆ β − β | p ) C ( p ) (cid:18)Z X | ϕ ( x ) | p dν ( x ) + (cid:0) Var( ˆ β ) (cid:1) p (cid:19) , where Var( ˆ β ) = Z X ϕ ( x ) dν ( x ) . Laure
Sansonnet
Proof.
Let p > . Suppose k ϕ k ∞ < + ∞ first. As a Poisson process is infinitely divisible, we can write:for any positive integer k , dN = k X i =1 dN i , where the N i ’s are mutually independent Poisson processes on X with mean measure ν/k . Hence, ˆ β − β = k X i =1 Z X ϕ ( x ) (cid:0) dN ix − k − dν ( x ) (cid:1) = k X i =1 Y i , where for any i , Y i = Z X ϕ ( x ) (cid:0) dN ix − k − dν ( x ) (cid:1) . So the Y i ’s are i.i.d. centered variables, each of them has moments of order p and . We apply theclassical Rosenthal’s inequality (for instance, see Proposition 10.2 of [15]): there exists a positiveconstant C ( p ) only depending on p such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) k X i =1 Y i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p C ( p ) k X i =1 E ( | Y i | p ) + k X i =1 E ( Y i ) ! p ! . Now, we give an upper bound of the limit of E k X i =1 | Y i | ℓ ! for ℓ ∈ { p, } when k → ∞ . Let usintroduce Ω k = (cid:8) ∀ i ∈ { , . . . , k } , N i X (cid:9) , where N i X is the number of points of N i lying in X . Then, P (Ω ck ) = P ( ∃ i ∈ { , . . . , k } , N i X > k X i =1 P ( N i X >
2) = k X j > ( ν ( X ) /k ) j j ! e − ν ( X ) /k k ( ν ( X ) /k ) = k − ν ( X ) . On Ω k , if N i X = 0 (so Z X ϕ ( x ) dN ix = 0 ), | Y i | ℓ = O k ( k − ℓ ) and if N i X = 1 (so Z X ϕ ( x ) dN ix = ϕ ( T ) , where T is the point of the process N i ), | Y i | ℓ = | ϕ ( T ) | ℓ + O k ( k − | ϕ ( T ) | ℓ − ) . Consequently, E k X i =1 | Y i | ℓ ! E " Ω k kO k ( k − ℓ ) + X T ∈ N (cid:2) | ϕ ( T ) | ℓ + O k ( k − | ϕ ( T ) | ℓ − ) (cid:3)! + q P (Ω ck ) vuuut E k X i =1 | Y i | ℓ ! . (6.1) onparametric estimation in a Poissonian interactions model k X i =1 | Y i | ℓ ℓ − k X i =1 "(cid:12)(cid:12)(cid:12)(cid:12)Z X ϕ ( x ) dN ix (cid:12)(cid:12)(cid:12)(cid:12) ℓ + (cid:18) k − Z X | ϕ ( x ) | dν ( x ) (cid:19) ℓ ℓ − k X i =1 k ϕ k ℓ ∞ ( N i X ) ℓ + k (cid:18) k − Z X | ϕ ( x ) | dν ( x ) (cid:19) ℓ ! ℓ − k ϕ k ℓ ∞ N ℓ X + k (cid:18) k − Z X | ϕ ( x ) | dν ( x ) (cid:19) ℓ ! . Thus, when k → ∞ , the last term in (6.1) converges to since a Poisson variable has moments ofevery order and lim sup k →∞ E k X i =1 | Y i | ℓ ! E (cid:18)Z X | ϕ ( x ) | ℓ dN x (cid:19) = Z X | ϕ ( x ) | ℓ dν ( x ) , which concludes the proof in the bounded case.But for any function ϕ such that R X | ϕ ( x ) | p dν ( x ) < + ∞ , the desired upper bound is finite and weget it by approximating ϕ by, for instance, piecewise constant functions. Let λ ∈ Λ be fixed. G ( ϕ λ ) , defined by (2.3), is a measurable function of the observations and byconsidering the aggregated process (1.1), we can write G ( ϕ λ ) = Z R n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21) dN t = Z R n X i =1 ϕ λ ( t − U i ) dN t − ( n − Z R E π ( ϕ λ ( t − U )) dN t = X i,j n Z R ϕ λ ( t − U i ) dN jt − X i = j n Z R E π ( ϕ λ ( t − U )) dN jt = n X i =1 Z R ϕ λ ( t − U i ) dN it + X j = i Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) dN jt . Now, we prove the first part of Lemma 2.1. We have E ( G ( ϕ λ ) | U , . . . , U n )= n X i =1 Z R ϕ λ ( t − U i ) h ( t − U i ) dt + X j = i Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt . Write x = t − U i in the first integral. Therefore, E ( G ( ϕ λ ) | U , . . . , U n ) = n Z R ϕ λ ( x ) h ( x ) dx + W ( ϕ λ ) , where W ( ϕ λ ) = X i = j n Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt. Laure
Sansonnet
Moreover, E ( W ( ϕ λ )) = X i = j n Z R E (cid:18)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) (cid:19) dt = X i = j n Z R E (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E ( h ( t − U j )) dt = 0 . Finally, E ( G ( ϕ λ )) = n Z R ϕ λ ( x ) h ( x ) dx, i.e. ˆ β λ is an unbiased estimator for β λ : E ( ˆ β λ ) = E (cid:18) G ( ϕ λ ) n (cid:19) = Z R ϕ λ ( x ) h ( x ) dx = β λ . It remains to control the variance of the estimator ˆ β λ . Var( G ( ϕ λ )) = E "(cid:18) G ( ϕ λ ) − n Z R ϕ λ ( x ) h ( x ) dx (cid:19) = E (cid:2) ( G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) + W ( ϕ λ )) (cid:3) = E ( V ( ϕ λ )) + E ( W ( ϕ λ ) ) , where V ( ϕ λ ) = Var( G ( ϕ λ ) | U , . . . , U n ) . We start by dealing with the first term by using technics for Poisson processes. We have V ( ϕ λ ) = Z R n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)! n X j =1 h ( t − U j ) dt = Z R n X j =1 ϕ λ ( t − U j ) + X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt = n X j =1 Z R ϕ λ ( t − U j ) h ( t − U j ) dt + 2 n X j =1 Z R X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) ϕ λ ( t − U j ) h ( t − U j ) dt + n X j =1 Z R X i = j X k = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt. In the first integral, write x = t − U j . So, V ( ϕ λ ) = n Z R ϕ λ ( x ) h ( x ) dx + 2 n X j =1 Z R X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) ϕ λ ( t − U j ) h ( t − U j ) dt + n X j =1 Z R X i = j X k = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt. (6.2) onparametric estimation in a Poissonian interactions model U j (in each sum) and we obtain E ( V ( ϕ λ )) = n Z R ϕ λ ( x ) h ( x ) dx + n X j =1 Z R X i = j E (cid:18)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:19) E ( h ( t − U j )) dt = n Z R ϕ λ ( x ) h ( x ) dx + n ( n − Z R Var π ( ϕ λ ( t − U )) E π ( h ( t − U )) dt. (6.3)Then using equations (a) and (b) of Lemma 6.1, we have E ( V ( ϕ λ )) n Z R ϕ λ ( x ) h ( x ) dx + n ( n − T Z R ϕ λ ( x ) dx Z R h ( x ) dx. (6.4)Now, we deal with the second term by using the U -statistics technics. However, W ( ϕ λ ) is a U -statistics of order 2 but it is not degenerate. So we write W ( ϕ λ ) = W ( ϕ λ ) + W ( ϕ λ ) , (6.5)with W ( ϕ λ ) = X i = j n Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt = ( n − n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt and W ( ϕ λ ) = X i = j n g ( U i , U j ) , where g ( U i , U j ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt.W ( ϕ λ ) is a degenerate U -statistics. It is easy to verify that E ( W ( ϕ λ ) ) = E ( W ( ϕ λ ) ) + E ( W ( ϕ λ ) ) . First we compute E ( W ( ϕ λ ) ) . E ( W ( ϕ λ ) ) = Var( W ( ϕ λ ))= n ( n − Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:19) = n ( n − Var (cid:18)Z R ϕ λ ( t − U ) E π ( h ( t − U )) dt (cid:19) n ( n − E "(cid:18)Z R | ϕ λ ( t − U ) | E π ( h ( t − U )) dt (cid:19) n ( n − T E "(cid:18)Z R | ϕ λ ( t − U ) | dt (cid:19) R h ( x ) dx (cid:19) n ( n − T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) , (6.6)by applying inequality (c) of Lemma 6.1 with f = h .4 Laure
Sansonnet
It remains to compute E ( W ( ϕ λ ) ) . It is easy to see that E ( W ( ϕ λ ) ) = X i = j n E (cid:2) g ( U i , U j )( g ( U i , U j ) + g ( U j , U i )) (cid:3) X i = j n E (cid:2) g ( U i , U j ) (cid:3) n ( n − E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:19) . We denote E ( U,V ) ∼ π ⊗ π ( g ( U, V )) the expectation of g ( U, V ) where U ∼ π and V ∼ π are independentand f X ( t ) = f ( t − X ) . Hence, E ( W ( ϕ λ ) ) n ( n − E ( U,V ) ∼ π ⊗ π "(cid:18)Z R (cid:2) ϕ Uλ ( t ) − E π ( ϕ Uλ ( t )) (cid:3)(cid:2) h V ( t ) − E π ( h V ( t )) (cid:3) dt (cid:19) n ( n − E ( U,V ) ∼ π ⊗ π " Z R ϕ Uλ ( t ) h V ( t ) dt − E V ∼ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) − E U ∼ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) + E ( U,V ) ∼ π ⊗ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) ! n ( n − ( E ( U,V ) ∼ π ⊗ π "(cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) − E U ∼ π "(cid:18) E V ∼ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19)(cid:19) − E V ∼ π "(cid:18) E U ∼ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19)(cid:19) + (cid:18) E ( U,V ) ∼ π ⊗ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19)(cid:19) ) n ( n − ( E ( U,V ) ∼ π ⊗ π "(cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) + (cid:18) E ( U,V ) ∼ π ⊗ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19)(cid:19) ) . But, E ( U,V ) ∼ π ⊗ π "(cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19) E ( U,V ) ∼ π ⊗ π (cid:18)Z R ( ϕ Uλ ) ( t ) h V ( t ) dt Z R h V ( t ) dt (cid:19) = E ( U,V ) ∼ π ⊗ π (cid:18)Z R ( ϕ Uλ ) ( t ) h V ( t ) dt (cid:19) Z R h ( x ) dx = Z R E π (cid:0) ( ϕ Uλ ) ( t ) (cid:1) E π ( h V ( t )) dt Z R h ( x ) dx T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) and (cid:12)(cid:12)(cid:12)(cid:12) E ( U,V ) ∼ π ⊗ π (cid:18)Z R ϕ Uλ ( t ) h V ( t ) dt (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z R E π ( ϕ Uλ ( t )) E π ( h V ( t )) dt (cid:12)(cid:12)(cid:12)(cid:12) T Z R | ϕ λ ( x ) | dx Z R h ( x ) dx, by using Lemma 6.1. So, E ( W ( ϕ λ ) ) n ( n − ( T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) + 1 T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) ) . (6.7)Finally, by combining inequalities (6.4), (6.6) and (6.7), we obtain the following control of the onparametric estimation in a Poissonian interactions model ˆ β λ : Var( ˆ β λ ) = Var (cid:18) G ( ϕ λ ) n (cid:19) n Z R ϕ λ ( x ) h ( x ) dx + 1 T Z R ϕ λ ( x ) dx Z R h ( x ) dx + nT (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) + 2 T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) n Z R ϕ λ ( x ) h ( x ) dx + 1 T k ϕ λ k k h k + nT k ϕ λ k k h k + 2 T k ϕ λ k k h k . By using the properties of the biorthogonal wavelet bases considered in this paper, for any λ = ( j, k ) in Λ , we have: k ϕ λ k − j/ max( √ / , k ψ k ) and k ϕ λ k max(1 , k ψ k ) , which allows us to get thepurposed upper bound in Lemma 2.1. In the sequel, we will consider: n > , T > and j = O ( n ) and we will use following notations: M h, = max( k h k , , M h, ∞ = max( k h k ∞ , , M ψ, = max( k ψ k , √ / , M ψ, = max( k ψ k , and M ψ, ∞ = max( k ψ k ∞ , √ (so that, for any λ = ( j, k ) ∈ Λ , we have: k ϕ λ k − j/ M ψ, , k ϕ λ k M ψ, and k ϕ λ k ∞ j/ M ψ, ∞ ).We recall that A and M are positive real numbers such that h and ψ are compactly supported in [ − A ; A ] and in [ − M ; M ] respectively.Now, to prove Theorem 3, we apply Theorem 2 and for this purpose we have to verify Assumptions:(A1), (A2) and (A3). Let λ ∈ Γ be fixed. Remember that conditionally to the U i ’s, the expression given in (1.1) is aPoisson process. We apply Lemma 6.1 of [26]: for any α > , with probability larger than − e − α ,conditionally to the U i ’s, we have (cid:12)(cid:12)(cid:12)(cid:12) G ( ϕ λ ) − n Z R ϕ λ ( x ) h ( x ) dx − W ( ϕ λ ) (cid:12)(cid:12)(cid:12)(cid:12) p αV ( ϕ λ ) + α B ( ϕ λ ) , where W ( ϕ λ ) is defined by (6.5), V ( ϕ λ ) = Var( G ( ϕ λ ) | U , . . . , U n ) and B ( ϕ λ ) = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) n X i =1 (cid:20) ϕ λ ( · − U i ) − n − n E π ( ϕ λ ( · − U )) (cid:21)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∞ . Unlike B ( ϕ λ ) , V ( ϕ λ ) is non-observable (it depends on the unknown function h ). This is the reasonwhy, by fixing α > , we estimate V ( ϕ λ ) by e V ( ϕ λ ) = ˆ V ( ϕ λ ) + q α ˆ V ( ϕ λ ) B ( ϕ λ ) + 3 αB ( ϕ λ ) where ˆ V ( ϕ λ ) = Z R n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)! dN t . Moreover, by Lemma 6.1 of [26], we have also: P ( V ( ϕ λ ) > e V ( ϕ λ )) e − α . So, with probability largerthan − e − α , (cid:12)(cid:12)(cid:12)(cid:12) G ( ϕ λ ) − n Z R ϕ λ ( x ) h ( x ) dx (cid:12)(cid:12)(cid:12)(cid:12) q α e V ( ϕ λ ) + α B ( ϕ λ ) + | W ( ϕ λ ) | . (6.8)6 Laure
Sansonnet
We provide a control in probability of W ( ϕ λ ) . W ( ϕ λ ) = ( n − n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt. This is a sum of i.i.d. random variables. We apply Bernstein’s inequality (for instance, see Proposition2.9 of [20]) to get that with probability larger than − e − α , | W ( ϕ λ ) | p αv ( ϕ λ ) + α b ( ϕ λ ) , with v ( ϕ λ ) = Var( W ( ϕ λ )) n ( n − T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) (see inequality (6.6)) and b ( ϕ λ ) = ( n −
1) sup u ∈ [0; T ] (cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − u ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) n − T Z R | ϕ λ ( x ) | dx Z R h ( x ) dx, using equations (b) and (c) of Lemma 6.1. Then, with probability larger than − e − α , | W ( ϕ λ ) | √ αn ( n − T Z R | ϕ λ ( x ) | dx Z R h ( x ) dx + 2 α ( n − T Z R | ϕ λ ( x ) | dx Z R h ( x ) dx. (6.9)Now it remains to control W ( ϕ λ ) , with W ( ϕ λ ) = X i = j n g ( U i , U j ) , where g ( U i , U j ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt. This is a degenerate U -statistics of order 2, we can rewrite it as W ( ϕ λ ) = X j ( ε = 1 for instance), with probability larger than − × . e − α , | W ( ϕ λ ) | ε ) / C √ α + η ( ε ) Dα + β ( ε ) Bα / + γ ( ε ) Aα , (6.10)where • A = kGk ∞ and by applying equality (b) of Lemma 6.1 with f = h , we easily have A k ϕ λ k ∞ Z R h ( x ) dx, (6.11) • C = E ( W ( ϕ λ ) ) and with (6.7), we have C n ( n − ( T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) + 1 T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) ) , (6.12) onparametric estimation in a Poissonian interactions model • D = sup E X j
Sansonnet
But, E ( g ( u, U j ) )= E "(cid:18)Z R (cid:2) ϕ λ ( t − u ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt (cid:19) E (cid:20)Z R (cid:2) ϕ λ ( t − u ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12) h ( t − U j ) − E π ( h ( t − U )) (cid:12)(cid:12) dt Z R (cid:12)(cid:12) h ( t − U j ) − E π ( h ( t − U )) (cid:12)(cid:12) dt (cid:21) E (cid:20)Z R (cid:2) ϕ λ ( t − u ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12) h ( t − U j ) − E π ( h ( t − U )) (cid:12)(cid:12) dt (cid:21) Z R h ( x ) dx T Z R (cid:2) ϕ λ ( t − u ) − E π ( ϕ λ ( t − U )) (cid:3) dt (cid:18)Z R h ( x ) dx (cid:19) T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) and in the same way E ( g ( U j , u ) )= E "(cid:18)Z R (cid:2) ϕ λ ( t − U j ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − u ) − E π ( h ( t − U )) (cid:3) dt (cid:19) E (cid:20)Z R (cid:2) ϕ λ ( t − U j ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12) h ( t − u ) − E π ( h ( t − U )) (cid:12)(cid:12) dt Z R (cid:12)(cid:12) h ( t − u ) − E π ( h ( t − U )) (cid:12)(cid:12) dt (cid:21) E (cid:20)Z R (cid:2) ϕ λ ( t − U j ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12) h ( t − u ) − E π ( h ( t − U )) (cid:12)(cid:12) dt (cid:21) Z R h ( x ) dx T Z R (cid:12)(cid:12) h ( t − u ) − E π ( h ( t − U )) (cid:12)(cid:12) dt Z R ϕ λ ( x ) dx Z R h ( x ) dx T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) , by using Lemma 6.1. Hence, B n − T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) . (6.14)Finally, by inequalities (6.8), (6.9) and (6.10) combined with (6.11), (6.14), (6.12) and (6.13), weobtain: for any ε > , with probability larger than − (5 + 2 × . e − α , | ˆ β λ − β λ | r α e V (cid:16) ϕ λ n (cid:17) + α B (cid:16) ϕ λ n (cid:17) + ( √ αnT k ϕ λ k + 2 α T k ϕ λ k + 2(1 + ε ) / √ α r T k ϕ λ k + 1 T k ϕ λ k + 4 η ( ε ) α r T k ϕ λ k + β ( ε ) α / r nT k ϕ λ k + 8 n γ ( ε ) α k ϕ λ k ∞ ) k h k r α e V (cid:16) ϕ λ n (cid:17) + α B (cid:16) ϕ λ n (cid:17) + ( M ψ, √ αnT + 2 √ M ψ, α T + 2 √ ε ) / M ψ, r αT + 4(1 + ε ) / M ψ, √ αT + 4 η ( ε ) M ψ, α √ T + √ β ( ε ) M ψ, α / √ nT + 8 γ ( ε ) M ψ, ∞ j / α n ) k h k , onparametric estimation in a Poissonian interactions model − j/ √ if − j j . We denote b the quantity between braces above.This upper bound depends on h (via k h k ) and this potential threshold could not be used forapplications because h is unknown. So we overestimate k h k by (1+ ε ) N R n and we have a threshold thatdoes not depend on h . So, for any value of κ ∈ ]0; 1[ , by fixing α = κ j γ with γ > , we define for all λ in Γ , η λ ( γ, ∆) = r j γ e V (cid:16) ϕ λ n (cid:17) + j γ B (cid:16) ϕ λ n (cid:17) + ∆ N R n , where ∆ = (1 + ε ) ( M ψ, √ j γnT + 2 √ M ψ, j γ T + 2 √ ε ) / M ψ, r j γT + 4(1 + ε ) / M ψ, √ j γT + 4 η ( ε ) M ψ, j γ √ T + √ β ( ε ) M ψ, j / γ / √ nT + 8 γ ( ε ) M ψ, ∞ j / j γ n ) . Thus, for all λ in Γ , P (cid:0) | ˆ β λ − β λ | > κη λ ( γ, ∆) (cid:1) P (cid:18) | ˆ β λ − β λ | > r α e V (cid:16) ϕ λ n (cid:17) + α B (cid:16) ϕ λ n (cid:17) + b (1 + ε ) N R n , (1 + ε ) N R n > k h k (cid:19) + P (cid:18) | ˆ β λ − β λ | > r α e V (cid:16) ϕ λ n (cid:17) + α B (cid:16) ϕ λ n (cid:17) + b (1 + ε ) N R n , (1 + ε ) N R n k h k (cid:19) P (cid:18) | ˆ β λ − β λ | > r α e V (cid:16) ϕ λ n (cid:17) + α B (cid:16) ϕ λ n (cid:17) + b k h k (cid:19) + P (cid:18) (1 + ε ) N R n k h k (cid:19) (5 + 2 × . e − α + P (cid:18) (1 + ε ) N R n k h k (cid:19) , with P (cid:18) (1 + ε ) N R n k h k (cid:19) = P (cid:18) N R − n k h k − ε n k h k ε (cid:19) exp ( − g ( ε ) n k h k ) , using Proposition 7 of [25] with g ( ε ) = ε (cid:16) log ε − (cid:17) + 1 .Therefore, Assumption (A1) is true if we take ω = (5 + 2 × . e − κ j γ + exp ( − g ( ε ) n k h k ) , with γ > and ε > . Furthermore, the threshold (2.4) that lies at the heart of the paper is achieved byrewriting ∆ by grouping the constants into one: ∆ = d ( γ, k ψ k , k ψ k , k ψ k ∞ ) ( j j / n + j √ T + √ j nT ) (6.15)with d ( γ, k ψ k , k ψ k , k ψ k ∞ )= (1 + ε ) ( √ γM ψ, + 2 √ γM ψ, + 2 √ ε ) / √ γM ψ, + 4(1 + ε ) / √ γM ψ, + 4 η ( ε ) γM ψ, + √ β ( ε ) γ / M ψ, + 8 γ ( ε ) γ M ψ, ∞ ) , (6.16)where β ( ε ) , γ ( ε ) and η ( ε ) are defined in [14] with ε = 1 .0 Laure
Sansonnet
Let λ ∈ Γ be fixed. For any p > , E ( | ˆ β λ − β λ | p ) = E (cid:12)(cid:12)(cid:12)(cid:12) G (cid:16) ϕ λ n (cid:17) − Z R ϕ λ ( x ) h ( x ) dx (cid:12)(cid:12)(cid:12)(cid:12) p ! = 1 n p E (cid:16)(cid:12)(cid:12) G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) + W ( ϕ λ ) (cid:12)(cid:12) p (cid:17) p − n p (cid:2) E ( | G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) | p ) + E ( | W ( ϕ λ ) | p ) (cid:3) . (6.17)Now, let us give an upper bound of each term of the right-hand side of the previous inequality.We first study the first term of (6.17). We have: E ( | G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) | p ) = E (cid:2) E ( | G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) | p ) | U , . . . , U n ) (cid:3) and conditionally to the U i ’s, N is a Poisson process. We apply Lemma 6.2: for any p > , there existsa positive constant C ( p ) only depending on p such that E ( | G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) | p ) | U , . . . , U n ) C ( p ) Z R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p n X j =1 h ( t − U j ) dt + V ( ϕ λ ) p . (6.18)On the one hand, we provide a control in expectation of the first term of (6.18). We have: E Z R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p n X j =1 h ( t − U j ) dt = E Z R n X j =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ϕ λ ( t − U j ) + X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p h ( t − U j ) dt p − " E Z R n X j =1 | ϕ λ ( t − U j ) | p h ( t − U j ) dt + E Z R n X j =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p h ( t − U j ) dt , with E Z R n X j =1 | ϕ λ ( t − U j ) | p h ( t − U j ) dt = n Z R | ϕ λ ( x ) | p h ( x ) dx n Z R ϕ λ ( x ) dx k ϕ λ k p − ∞ k h k ∞ and E Z R n X j =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p h ( t − U j ) dt = n X j =1 Z R E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p E ( h ( t − U j )) dt = n Z R E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − X i =1 (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p E π ( h ( t − U )) dt. onparametric estimation in a Poissonian interactions model C ( p ) only depending on p suchthat E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n − X i =1 (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p C ( p ) (cid:16) ( n − E (cid:16)(cid:12)(cid:12) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:12)(cid:12) p (cid:17) + ( n − p (cid:2) Var π ( ϕ λ ( t − U )) (cid:3) p (cid:17) . But, E (cid:16)(cid:12)(cid:12) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:12)(cid:12) p (cid:17) K (cid:0) E π ( | ϕ λ ( t − U ) | p ) + | E π ( ϕ λ ( t − U )) | p (cid:1) K T Z T | ϕ λ ( t − u ) | p du + (cid:12)(cid:12)(cid:12)(cid:12) T Z T ϕ λ ( t − u ) du (cid:12)(cid:12)(cid:12)(cid:12) p ! K T Z R ϕ λ ( x ) dx k ϕ λ k p − ∞ + 1 T p (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) p ! , with K a positive constant only depending on p and using inequality (a) of Lemma 6.1, Var π ( ϕ λ ( t − U )) T Z R ϕ λ ( x ) dx. Thus, E Z R n X j =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p h ( t − U j ) dt n K (cid:18) n − T k ϕ λ k k ϕ λ k p − ∞ + n − T p k ϕ λ k p + ( n − p T p k ϕ λ k p (cid:19) Z R E π ( h ( t − U )) dt K (cid:18) n T k ϕ λ k k ϕ λ k p − ∞ + n T p k ϕ λ k p + n p +1 T p k ϕ λ k p (cid:19) k h k , using equation (b) of Lemma 6.1.Therefore, we have the following control of the first term of (6.18) E Z R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p n X j =1 h ( t − U j ) dt K " n k ϕ λ k k ϕ λ k p − ∞ k h k ∞ + (cid:18) n T k ϕ λ k k ϕ λ k p − ∞ + n T p k ϕ λ k p + n p +1 T p k ϕ λ k p (cid:19) k h k , (6.19)Now let us provide a control in expectation of the second term of (6.18), i.e V ( ϕ λ ) p . First, we recallthat V ( ϕ λ ) = Var( G ( ϕ λ ) | U , . . . , U n ) and we remark that E ( V ( ϕ λ ) p ) (cid:2) E ( V ( ϕ λ ) p ) (cid:3) / (using theCauchy-Schwarz inequality). So, we focus on the moments of V ( ϕ λ ) of any order m > .2 Laure
Sansonnet
Let m > . According to the expression (6.2) of V ( ϕ λ ) , we have: V ( ϕ λ )= n Z R ϕ λ ( x ) h ( x ) dx + 2 n X j =1 Z R X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) ϕ λ ( t − U j ) h ( t − U j ) dt + n X j =1 Z R X i = j X k = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) h ( t − U j ) dt = n Z R ϕ λ ( x ) h ( x ) dx + 2 n X j =1 Z R X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U ))) dt + 2 n X j =1 Z R X i = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U j ) h ( t − U j ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt + n X j =1 Z R X i = j X k = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt + n X j =1 Z R X i = j X k = j (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) × (cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt. This formula provides a decomposition of V ( ϕ λ ) in a sum of degenerate U -statistics of order 0, 1, 2and 3. Indeed V ( ϕ λ ) = W ( ϕ λ ) + W ( ϕ λ ) + W ( ϕ λ ) + W ( ϕ λ ) , with W i ( ϕ λ ) is a degenerate U -statistic of order i defined as follows: W ( ϕ λ ) = X i = j = k n Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) × (cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt, W ( ϕ λ )= 2 X i = j n Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U j ) h ( t − U j ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt + ( n − X i = k n Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt + X i = j n Z R (cid:20)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt, W ( ϕ λ ) = 2( n − n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt + ( n − n X i =1 Z R (cid:20)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt + ( n − n X j =1 Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt onparametric estimation in a Poissonian interactions model W ( ϕ λ ) = n Z R ϕ λ ( x ) h ( x ) dx + n ( n − Z R Var π ( ϕ λ ( t − U )) E π ( h ( t − U )) dt = E ( V ( ϕ λ )) n Z R ϕ λ ( x ) h ( x ) dx + n ( n − T Z R ϕ λ ( x ) dx Z R h ( x ) dx, (6.20)by using (6.3) and (6.4).First, we are interested in the moments of W ( ϕ λ ) that we write: W ( ϕ λ ) = W , ( ϕ λ ) + W , ( ϕ λ ) + W , ( ϕ λ ) , with: • W , ( ϕ λ ) = 2( n − n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt .We have: E ( |W , ( ϕ λ ) | m )= 2 m ( n − m E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m ! m ( n − m × C ( m ) n E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) + n m/ (cid:20) Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:19)(cid:21) m/ ! , using Rosenthal’s inequality, where C ( m ) is a positive constant only depending on m . But, applyingLemma 6.1, E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) m T m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m (cid:18)Z R | ϕ λ ( x ) | h ( x ) dx (cid:19) m and Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:19) = Var (cid:18)Z R ϕ λ ( t − U ) E π ( ϕ λ ( t − U ) h ( t − U )) dt (cid:19) E "(cid:18)Z R | ϕ λ ( t − U ) | E π ( | ϕ λ ( t − U ) | h ( t − U )) dt (cid:19) T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R | ϕ λ ( x ) | h ( x ) dx (cid:19) . Laure
Sansonnet
So, E ( |W , ( ϕ λ ) | m ) m ( n − m × C ( m ) m nT m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m (cid:18)Z R | ϕ λ ( x ) | h ( x ) dx (cid:19) m + n m/ T m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m (cid:18)Z R | ϕ λ ( x ) | h ( x ) dx (cid:19) m ! K , n m/ T m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m (cid:18)Z R | ϕ λ ( x ) | h ( x ) dx (cid:19) m K , n m/ T m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m k h k m ∞ , (6.21)with K , a positive constant only depending on m . • W , ( ϕ λ ) = ( n − n X i =1 Z R (cid:20)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt .We have: E ( |W , ( ϕ λ ) | m )= ( n − m E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 Z R (cid:20)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m ! n m × C ( m ) n E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:20)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) + n m/ (cid:20) Var (cid:18)Z R (cid:20)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt (cid:19)(cid:21) m/ ! , using Rosenthal’s inequality, where C ( m ) is a positive constant only depending on m . But, applyingLemma 6.1, E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:20)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) K , (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m , with K , a positive constant only depending on m and Var (cid:18)Z R (cid:20)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21) E π ( h ( t − U )) dt (cid:19) = Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:19) E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:19) T E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) dt (cid:19) R h ( x ) dx (cid:19) K , T (cid:18)Z R ϕ λ ( x ) dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) onparametric estimation in a Poissonian interactions model E ( |W , ( ϕ λ ) | m ) K , n m nT m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m ! K , n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m . (6.22) • W , ( ϕ λ ) = ( n − n X j =1 Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt .We have: E ( |W , ( ϕ λ ) | m )= ( n − m E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X j =1 Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m ( n − m × C ( m ) n E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) + n m/ (cid:20) Var (cid:18)Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:19)(cid:21) m/ ! , using Rosenthal’s inequality, where C ( m ) is a positive constant only depending on m . But, applyingLemma 6.1, E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m , and Var (cid:18)Z R Var π ( ϕ λ ( t − U )) (cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:19) = Var (cid:18)Z R Var π ( ϕ λ ( t − U )) h ( t − U ) dt (cid:19) E "(cid:18)Z R Var π ( ϕ λ ( t − U )) h ( t − U ) dt (cid:19) T (cid:18)Z R ϕ λ ( x ) dx (cid:19) E "(cid:18)Z R h ( t − U ) dt (cid:19) T (cid:18)Z R ϕ λ ( x ) dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) . So, E ( |W , ( ϕ λ ) | m ) ( n − m × C ( m ) m nT m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m ! K , n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m , (6.23)6 Laure
Sansonnet with K , a positive constant only depending on m .Next we deal with the moments of W ( ϕ λ ) that we write: W ( ϕ λ ) = W , ( ϕ λ ) + W , ( ϕ λ ) + W , ( ϕ λ ) , with: • W , ( ϕ λ ) = 2 X i = j n Z R [ ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U ))] (cid:2) ϕ λ ( t − U j ) h ( t − U j ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt .We want to use Theorem 8.1.6 of [6] (a moment inequality for U -statistics using decoupling) so wewrite: W , ( ϕ λ ) = 2 X i = j n f ( U i , U j ) , where f ( U i , U j ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U j ) h ( t − U j ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt. There exists a positive constant C ,m depending on m only such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i = j n f ( U i , U j ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m C ,m n m E ( | f ( U , U ) | m ) C ,m n m E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R [ ϕ λ ( t − U ) − E π ( ϕ λ ( t − U ))] (cid:2) ϕ λ ( t − U ) h ( t − U ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) C ,m n m E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) dt (cid:19) m/ × E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) h ( t − U ) − E π ( ϕ λ ( t − U ) h ( t − U )) (cid:3) dt (cid:19) m/ K , n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m/ (cid:18)Z R ϕ λ ( x ) h ( x ) dx (cid:19) m/ K , n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ , by applying Lemma 6.1 and setting K , a positive constant only depending on m . So, E ( |W , ( f ) | m ) K , n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ . (6.24) • We may write: W , ( ϕ λ ) = ( n − X i = k n f ( U i , U k ) , where f ( U i , U k ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt. onparametric estimation in a Poissonian interactions model C ,m depending on m only such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i = k n f ( U i , U k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m C ,m n m E ( | f ( U , U ) | m ) C ,m n m E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) K , n m T m ( E "(cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) dt (cid:19) m/ (cid:18)Z R h ( x ) dx (cid:19) m K , n m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m , by applying Lemma 6.1 and setting K , a positive constant only depending on m . So, E ( |W , ( f ) | m ) K , n m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m . (6.25) • We may write: W , ( ϕ λ ) = X i = j n f ( U i , U j ) , where f ( U i , U j ) = Z R (cid:20)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:21)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt. We use Theorem 8.1.6 of [6]: there exists a positive constant C ,m depending on m only such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i = j n f ( U i , U j ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m C ,m n m E ( | f ( U , U ) | m ) C ,m n m E (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)Z R [[ ϕ λ ( t − U ) − E π ( ϕ λ ( t − U ))] − Var π ( ϕ λ ( t − U ))][ h ( t − U ) − E π ( h ( t − U ))] dt (cid:12)(cid:12)(cid:12)(cid:12) m (cid:19) K , n m E (cid:20)(cid:18)Z R (cid:12)(cid:12)(cid:12)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) − Var π ( ϕ λ ( t − U )) (cid:12)(cid:12)(cid:12) dt (cid:19) m (cid:21) k h k m ∞ K , n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ , by applying Lemma 6.1 and setting K , a positive constant only depending on m . So, E ( |W , ( ϕ λ ) | m ) K , n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ . (6.26)And finally, we focus on the moments of W ( ϕ λ ) that we write: W ( ϕ λ ) = X i = j = k n f ( U i , U j , U k ) ,where f ( U i , U j , U k ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U k ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt. Laure
Sansonnet
We use Theorem 8.1.6 of [6]: there exists a positive constant C ,m depending on m only such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i = j = k n f ( U i , U j , U k ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m C ,m n m/ E ( | f ( U , U , U ) | m ) C ,m n m/ E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) × (cid:2) h ( t − U ) − E π ( h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m ! K n m/ E (cid:20)(cid:18)Z R (cid:12)(cid:12)(cid:12)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12) dt (cid:19) m (cid:21) k h k m ∞ , by applying Lemma 6.1 and setting K a positive constant only depending on m . Furthermore, usingthe support properties of the biorthogonal wavelet bases considered in this paper, we have E (cid:20)(cid:18)Z R (cid:12)(cid:12)(cid:12)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12) dt (cid:19) m (cid:21) = E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) ϕ λ ( t − U ) − ϕ λ ( t − U ) E π ( ϕ λ ( t − U )) − ϕ λ ( t − U ) E π ( ϕ λ ( t − U ))+ (cid:2) E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12)(cid:12) dt ! m K ( E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) ϕ λ ( t − U ) (cid:12)(cid:12)(cid:12) dt ! m + E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) E π ( ϕ λ ( t − U )) (cid:12)(cid:12)(cid:12) dt ! m + E " Z R (cid:2) E π ( ϕ λ ( t − U )) (cid:3) dt ! m , with: E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) ϕ λ ( t − U ) (cid:12)(cid:12)(cid:12) dt ! m = 1 T Z T du Z T du Z R (cid:12)(cid:12) ϕ λ ( t − u ) ϕ λ ( t − u ) (cid:12)(cid:12) dt ! m T Z T du Z u +2 Mu − M du (cid:18)Z R ϕ λ ( x ) dx (cid:19) m MT (cid:18)Z R ϕ λ ( x ) dx (cid:19) m , E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) E π ( ϕ λ ( t − U )) (cid:12)(cid:12)(cid:12) dt ! m = E " Z R (cid:12)(cid:12)(cid:12) ϕ λ ( t − U ) (cid:18) T Z T ϕ λ ( t − u ) du (cid:19) (cid:12)(cid:12)(cid:12) dt ! m T m E " Z T du Z R (cid:12)(cid:12) ϕ λ ( t − U ) ϕ λ ( t − u ) (cid:12)(cid:12) dt ! m T m +1 Z T du Z T du Z R (cid:12)(cid:12) ϕ λ ( t − u ) ϕ λ ( t − u ) (cid:12)(cid:12) dt ! m T m +1 Z T du Z u +2 Mu − M du Z R ϕ λ ( x ) dx ! m (4 M ) m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m onparametric estimation in a Poissonian interactions model Z R (cid:2) E π ( ϕ λ ( t − U )) (cid:3) dt = 1 T Z R dt Z T du ϕ λ ( t − u ) Z T dv ϕ λ ( t − v )= 1 T Z T du Z T dv Z R ϕ λ ( t − u ) ϕ λ ( t − v ) dt T Z T du Z u +2 Mu − M dv Z R ϕ λ ( x ) dx MT Z R ϕ λ ( x ) dx. So, E ( |W ( ϕ λ ) | m ) K ′ n m/ " T (cid:18)Z R ϕ λ ( x ) dx (cid:19) m + 1 T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ , (6.27)with K ′ a positive constant only depending on m and the compact support of ψ . Note that if we hadused the same method as for the control of the moments of W , ( ϕ λ ) , we would not get the correct rateof convergence. We obtain a better rate of convergence thanks to the properties of the biorthogonalwavelet bases used here.Thus, combining inequalities (6.20), (6.21), (6.22), (6.23), (6.24), (6.25), (6.26) and (6.27) yields E ( V ( ϕ λ ) m ) K ( n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ + n m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m/ T m (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) m k h k m ∞ + n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m/ T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ + n m T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m (cid:18)Z R h ( x ) dx (cid:19) m + n m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ + n m/ " T (cid:18)Z R ϕ λ ( x ) dx (cid:19) m + 1 T m (cid:18)Z R ϕ λ ( x ) dx (cid:19) m k h k m ∞ ) K ( n m/ T m k ϕ λ k m k h k m ∞ + n m T m k ϕ λ k m k h k m + " n m + n m/ T k ϕ λ k m k h k m ∞ ) , with K a positive constant only depending on m and the compact support of ψ . So, we obtain E ( V ( ϕ λ ) p ) K ( n p/ T p k ϕ λ k p k h k p ∞ + n p T p k ϕ λ k p k h k p + " n p + n p/ T k ϕ λ k p k h k p ∞ ) , (6.28)with K a positive constant only depending on p and the compact support of ψ .To conclude for the first term of (6.17), using inequalities (6.19) and (6.28) in (6.18), we have E ( | G ( ϕ λ ) − E ( G ( ϕ λ ) | U , . . . , U n ) | p ) K ( n T p k ϕ λ k p k h k + n p +1 T p k ϕ λ k p k h k + n T k ϕ λ k k ϕ λ k p − ∞ k h k + n k ϕ λ k k ϕ λ k p − ∞ k h k ∞ + n p/ T p k ϕ λ k p k h k p ∞ + n p T p k ϕ λ k p k h k p + " n p + n p/ T k ϕ λ k p k h k p ∞ ) . (6.29)0 Laure
Sansonnet
Now, we have to focus on the second term of (6.17). Recall the definition (6.5) of W ( ϕ λ ) W ( ϕ λ ) = W ( ϕ λ ) + W ( ϕ λ ) , with W ( ϕ λ ) = ( n − n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt, and W ( ϕ λ ) = X i = j n g ( U i , U j ) , where g ( U i , U j ) = Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt. So, E ( | W ( ϕ λ ) | p ) p − (cid:2) E ( | W ( ϕ λ ) | p ) + E ( | W ( ϕ λ ) | p ) (cid:3) . On the one hand, we have to control E ( | W ( ϕ λ ) | p ) . We use Rosenthal’s inequality: there exists apositive constant C ( p ) only depending on p such that E ( | W ( ϕ λ ) | p )= ( n − p E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p ( n − p × C ( p ) n E (cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) p ! + n p (cid:20) Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:19)(cid:21) p ! . But, applying Lemma 6.1, E (cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:12)(cid:12)(cid:12)(cid:12) p ! T p E (cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12) p ! (cid:18)Z R h ( x ) dx (cid:19) p p T p (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) p (cid:18)Z R h ( x ) dx (cid:19) p and Var (cid:18)Z R (cid:2) ϕ λ ( t − U ) − E π ( ϕ λ ( t − U )) (cid:3) E π ( h ( t − U )) dt (cid:19) = Var (cid:18)Z R ϕ λ ( t − U ) E π ( h ( t − U )) dt (cid:19) E "(cid:18)Z R | ϕ λ ( t − U ) | E π ( h ( t − U )) dt (cid:19) T E "(cid:18)Z R | ϕ λ ( t − U ) | dt (cid:19) R h ( x ) dx (cid:19) T (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) (cid:18)Z R h ( x ) dx (cid:19) . onparametric estimation in a Poissonian interactions model E ( | W ( ϕ λ ) | p ) K n p " nT p (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) p + n p T p (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) p R h ( x ) dx (cid:19) p , (6.30)with K a positive constant only depending on p .And on the other hand, we have to control E ( | W ( ϕ λ ) | p ) . We have: W ( ϕ λ ) = X i = j n g ( U i , U j ) . We use Theorem 3.3 of [11] associated with Theorem 1 of [7] (we keep the same notations of [11]). Weset h i,j = (cid:26) if i = jg otherwise and we consider ( U (1) i , i = 1 . . . n ) and ( U (2) i , i = 1 . . . n ) two independentcopies of ( U i , i = 1 . . . n ) . With Theorem 3.3 of [11], there exists an universal constant K such that E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i,j n h i,j ( U (1) i , U (2) j ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) p K p (cid:2) (2 p ) p C p + (2 p ) p D p + (2 p ) p B p + (2 p ) p A p (cid:3) , where • A = max i,j k h i,j k ∞ = k g k ∞ . But, for all ( x, y ) ∈ R , | g ( x, y ) | = (cid:12)(cid:12)(cid:12)(cid:12)Z R (cid:2) ϕ λ ( t − x ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − y ) − E π ( h ( t − U )) (cid:3) dt (cid:12)(cid:12)(cid:12)(cid:12) Z R | ϕ λ ( x ) | dx k h k ∞ , using equality (b) of Lemma 6.1 with f = ϕ λ . So, A Z R | ϕ λ ( x ) | dx k h k ∞ , (6.31) • C = X i,j E ( h i,j ( U (1) i , U (2) j )) = X i = j E ( g ( U i , U j )) . But, for all i = j , E ( g ( U i , U j )) E "(cid:18)Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:2) h ( t − U j ) − E π ( h ( t − U )) (cid:3) dt (cid:19) E (cid:20)Z R [ ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U ))] | h ( t − U j ) − E π ( h ( t − U )) | dt Z R | h ( t − U j ) − E π ( h ( t − U )) | dt (cid:21) E (cid:20)Z R (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:12)(cid:12) h ( t − U j ) − E π ( h ( t − U )) (cid:12)(cid:12) dt (cid:21) Z R h ( x ) dx T Z R E (cid:16)(cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3) (cid:17) dt (cid:18)Z R h ( x ) dx (cid:19) T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) , using Lemma 6.1. So, C n ( n − T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) , (6.32)2 Laure
Sansonnet • B = max (cid:13)(cid:13)(cid:13)(cid:13) X i E ( h i,j ( U i , y )) (cid:13)(cid:13)(cid:13)(cid:13) ∞ , (cid:13)(cid:13)(cid:13)(cid:13) X j E ( h i,j ( x, U j )) (cid:13)(cid:13)(cid:13)(cid:13) ∞ , with E ( h i,j ( U i , y )) = (cid:26) E π ( g ( U, y )) if i = j otherwise T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) and E ( h i,j ( x, U j )) = (cid:26) E π ( g ( x, U )) if i = j otherwise T Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) , using established inequalities to get equation (6.14) in the proof of the assumption (A1). So, B nT Z R ϕ λ ( x ) dx (cid:18)Z R h ( x ) dx (cid:19) , (6.33) • D = sup E (cid:18) X i,j h i,j ( U (1) i , U (2) j ) a i ( U (1) i ) b j ( U (2) j ) (cid:19) : E (cid:18) X i a i ( U i ) (cid:19) , E (cid:18) X j b j ( U j ) (cid:19) .By using established inequalities to get equation (6.13) in the proof of the assumption (A1), weobtain D n s T Z R ϕ λ ( x ) dx Z R h ( x ) dx. (6.34)Moreover, we use the equivalence of Theorem 3.3 of [11] and the decoupling inequality provided inTheorem 1 of [7] to obtain the following upper bound of E ( | W ( ϕ λ ) | p ) : E ( | W ( ϕ λ ) | p ) K " n p T p (cid:18)Z R ϕ λ ( x ) dx (cid:19) p (cid:18)Z R h ( x ) dx (cid:19) p + n p T p (cid:18)Z R ϕ λ ( x ) dx (cid:19) p (cid:18)Z R h ( x ) dx (cid:19) p + (cid:18)Z R | ϕ λ ( x ) | dx (cid:19) p k h k p ∞ , (6.35)with K a positive constant only depending on p .Finally, by using inequalities (6.29), (6.30) and (6.35) in (6.17), we obtain: E ( | ˆ β λ − β λ | p ) K ( n p − T p k ϕ λ k p k h k + 1 n p − T p k ϕ λ k p k h k + 1 n p − T k ϕ λ k k ϕ λ k p − ∞ k h k + 1 n p − k ϕ λ k k ϕ λ k p − ∞ k h k ∞ + 1 n p/ T p k ϕ λ k p k h k p ∞ + 1 T p k ϕ λ k p k h k p + 1 n p k ϕ λ k p k h k p ∞ + 1 n p/ T k ϕ λ k p k h k p ∞ + nT p k ϕ λ k p k h k p + n p T p k ϕ λ k p k h k p + 1 T p k ϕ λ k p k h k p + 1 n p T p k ϕ λ k p k h k p + 1 n p k ϕ λ k p k h k p ∞ ) onparametric estimation in a Poissonian interactions model h E (cid:16) | ˆ β λ − β λ | p (cid:17)i p K ( n − /p T k ϕ λ k k h k /p + nT k ϕ λ k k h k + 1 n / T k ϕ λ k k h k ∞ + 1 n k ϕ λ k k h k ∞ + 1 n − /p T k ϕ λ k k h k /p + 1 T k ϕ λ k k h k + 1 T k ϕ λ k k h k + (cid:20) n + 1 n / T /p (cid:21) k ϕ λ k k h k ∞ + (cid:20) n − /p T /p + 1 n − /p (cid:21) k ϕ λ k /p k ϕ λ k − /p ∞ k h k /p ∞ ) , with K a positive constant only depending on p and the compact support of ψ .Recall that for any λ = ( j, k ) ∈ Λ , we have: k ϕ λ k − j/ M ψ, , k ϕ λ k M ψ, and k ϕ λ k ∞ j/ M ψ, ∞ We consider < p < ∞ and we fix < q < ∞ such that p + q = 1 , so that h E (cid:16) | ˆ β λ − β λ | p (cid:17)i p K ( nT k h k + 1 n k h k ∞ + 1 n /q T k h k /p + 1 T k h k + 1 T k h k + (cid:20) n + 1 n / T /p (cid:21) k h k ∞ + (cid:20) n /q T /p + 1 n /q (cid:21) j /q k h k /p ∞ ) , with K a positive constant depending on p , k ψ k , k ψ k , k ψ k ∞ and the compact support of ψ .Finally, choosing p = 2 , Assumption (A2) is fulfilled with R = C R ( n + 2 j / n / + 2 j / nT / + nT ) , where C R is a positive constant depending on k h k , k h k ∞ , k ψ k , k ψ k , k ψ k ∞ and the compact supportof ψ , H λ = λ ∈ Γ and ε = 1 . To shorten mathematical expressions, we denote η λ = η λ ( γ, ∆) in the sequel. The following inequality: P (cid:16) | ˆ β λ − β λ | > κη λ , | ˆ β λ | > η λ (cid:17) H λ ζ is obvious with ζ = ω , which proves Assumption (A3) choosing θ = εε . Therefore we can apply Theorem 2: the estimator ˜ β = (cid:16) ˆ β λ | ˆ β λ | > η λ λ ∈ Γ (cid:17) λ ∈ Λ satisfies − κ κ E (cid:16) k ˜ β − β k ℓ (cid:17) E inf m ⊂ Γ κ − κ X λ m β λ + 1 − κ κ X λ ∈ m ( ˆ β λ − β λ ) + X λ ∈ m η λ + LD X λ ∈ Γ H λ inf m ⊂ Γ κ − κ X λ m β λ + 1 − κ κ X λ ∈ m E (( ˆ β λ − β λ ) ) + X λ ∈ m E ( η λ ) + LD X λ ∈ Γ H λ , with4 Laure
Sansonnet • for all λ = ( j, k ) in Γ , E (( ˆ β λ − β λ ) ) = Var( ˆ β λ ) K (cid:26) n + 1 T + 2 − j nT (cid:27) , where K is a positive constant depending on k h k , k h k ∞ , k ψ k and k ψ k (see Lemma 2.1); • for all λ = ( j, k ) in Γ , η λ K (cid:18)r j e V (cid:16) ϕ λ n (cid:17) + j B (cid:16) ϕ λ n (cid:17) + ˜∆ N R n (cid:19) where K depends on ε , κ , γ , k ψ k , k ψ k and k ψ k ∞ and ˜∆ = j j / n + j √ T + √ j nT ; • LD = Rκ (cid:0) (1 + θ − / ) ω / + (1 + θ / ) ε / ζ / (cid:1) K R (cid:0) e − κ j γ/ + exp ( − g ( ε ) n k h k / (cid:1) , where K is a positive constant depending only on ε and κ , • X λ ∈ Γ H λ = | Γ | , where | Γ | is the cardinal of the set Γ . So, we can upper bound this quantity by K j , where K is a positive constant depending only on the compact support of h and thecompact support of ψ .Recall that ε = 1 , κ ∈ ]0; 1[ will be fixed in the sequel and γ > , according Assumption (A1).It remains to compute E ( η λ ) . Let λ ∈ Γ . We have: E ( η λ ) K j n E ( e V ( ϕ λ )) + j n E ( B ( ϕ λ )) + ( j nT + j T + 2 j j n ) E ( N R ) n ! , with K depending on ε , κ , γ , k ψ k , k ψ k and k ψ k ∞ and E ( N R ) = n k h k + n k h k n M h, .We control e V ( ϕ λ ) in expectation and we recall that α = κ j γ . E ( e V ( ϕ λ )) E ( ˆ V ( ϕ λ )) + q α E ( ˆ V ( ϕ λ )) E ( B ( ϕ λ )) + 3 α E ( B ( ϕ λ )) , (6.36)with using inequality (6.4), E ( ˆ V ( ϕ λ )) = E ( V ( ϕ λ )) K (cid:26) n k h k ∞ + n T k h k (cid:27) , where K is a positive constant depending only on k ψ k .Now, we focus on E ( B ( ϕ λ )) where B ( ϕ λ ) = B ( ϕ λ ) = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) n X i =1 (cid:20) ϕ λ ( · − U i ) − n − n E π ( ϕ λ ( · − U )) (cid:21)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∞ = sup t ∈ R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:20) ϕ λ ( t − U i ) − n − n E π ( ϕ λ ( t − U )) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˜ B ( ϕ λ ) + 1 T k ϕ λ k , with ˜ B ( ϕ λ ) = sup t ∈ R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:2) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) .Then, we have to control E ( ˜ B ( ϕ λ )) . Since it is a decomposition biorthogonal wavelet, ϕ λ is a piecewiseconstant function and we can write: ϕ λ = N X l =1 c l [ a l ; b l ] , onparametric estimation in a Poissonian interactions model N ∈ N ∗ and for any l ∈ { , . . . , N } , a l , b l , c l ∈ R and a l < b l . It is easy to see that ˜ B ( ϕ λ ) N X l =1 ˜ B ( c l [ a l ; b l ] ) = N X l =1 | c l | ˜ B ( [ a l ; b l ] ) . It remains to compute E ( ˜ B ( [ a ; b ] )) for some interval [ a ; b ] . ˜ B ( [ a ; b ] ) = sup t ∈ R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:2) [ a ; b ] ( t − U i ) − E π ( [ a ; b ] ( t − U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) sup B t ,t ∈ R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n X i =1 (cid:2) B t ( U i ) − E π ( B t ( U )) (cid:3)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , where for any t ∈ R , B t = [ t − b ; t − a ] .We set B = { B t , t ∈ R } and for every integer n , m n ( B ) = sup A ⊂ R , | A | = n |{ A ∩ B t , t ∈ R }| . It is easy to seethat m n ( B ) n ( n + 1)2 and so, the VC-dimension of B defined by sup { n > , m n ( B ) = 2 n } is bounded by 2 (see Definition 6.2of [20]).By applying Lemma 6.4 of [20], we obtain: √ n E ( ˜ B ( [ a ; b ] )) K √ , where K is an absolute constant. So, for any λ in Γ , E ( ˜ B ( ϕ λ )) K √ n . But, we want an upper bound of E ( ˜ B ( ϕ λ )) . For this, we use Theorem 11 of [1]: (cid:2) E ( ˜ B ( ϕ λ )) (cid:3) / K (cid:26) E ( ˜ B ( ϕ λ )) + kMk (cid:27) , where M = max i n sup t ∈ R (cid:12)(cid:12) ϕ λ ( t − U i ) − E π ( ϕ λ ( t − U )) (cid:12)(cid:12) . Hence, kMk k ϕ λ k ∞ K j , with K a constant only depending on k ψ k ∞ .Finally, E ( B ( ϕ λ )) K (cid:26) E ( ˜ B ( ϕ λ )) + 2 − j T (cid:27) K (cid:26)(cid:2) E ( ˜ B ( ϕ λ )) (cid:3) + 2 j + 2 − j T (cid:27) K (cid:26) n + 2 j + 2 − j T (cid:27) , (6.37)with K a constant only depending on k ψ k and k ψ k ∞ .Then combining (6.36) and (6.37) yields E ( η λ ) K ( j n + j / j / n / + j j n + j j n + j T + j T + j / j / nT / + j nT ) , Laure
Sansonnet where K is a constant depending on γ , k h k , k h k ∞ , k ψ k , k ψ k and k ψ k ∞ , which concludes the proofof Theorem 3 by setting F ( j , n, T ) = j n + j / j / n / + j j n + j T + j / j / nT / + j nT2