Testing for equality between two transformations of random variables
Mohamed BOUTAHAR∗ and Denys POMMERET†

October 31, 2018

∗ Corresponding author. IML, Luminy Faculty of Sciences, 163 Av. de Luminy, 13288 Marseille Cedex 9; e-mail: [email protected].
† IML, Luminy Faculty of Sciences, 163 Av. de Luminy, 13288 Marseille Cedex 9; e-mail: [email protected].
Abstract
Consider two random variables contaminated by two unknown transformations. The aim of this paper is to test the equality of those transformations. Two cases are distinguished: first, the two random variables have known distributions; second, they are unknown but observed before contamination. We propose a nonparametric test statistic based on empirical cumulative distribution functions. Monte Carlo studies are performed to analyze the level and the power of the test. An illustration is presented through a real data set.
Keywords: empirical cumulative distribution; nonlinear contamination; nonparametric estimation
1 Introduction

There exists an important literature concerning the deconvolution problem, where an unknown signal $Y$ is contaminated by a noise $Z$, leading to the observed signal
\[ X = Y + Z. \qquad (1) \]
A major problem is to reconstruct the density of $Y$. Many authors studied the univariate problem when the noise $Z$ has a known distribution (see for instance Fan [10], Carroll and Hall [3], Devroye [7], or more recently Holzmann et al. [12] for a review). Bissantz et al. [1] proposed the construction of confidence bands for the density of $Y$ based on i.i.d. observations from (1). The case where both $Y$ and $Z$ have unknown distributions is considered in Neumann [15], Diggle and Hall [8] or Johannes et al. [13], among others. When the error density and the distribution of $Y$ have different characteristics the model can be identified, as shown in Butucea and Matias [2] and Meister [14]. But without information on $Z$, the model suffers from identifiability problems. One solution is to assume that another independent sample is observed from the measurement error $Z$ (as done in Efromovich and Koltchinskii [9] and Cavalier and Hengartner [5]).

A more general model than (1) occurs when the contaminated random variables are observed through a transformation; that is, there exists $g$ such that
\[ X = g(Y + Z). \qquad (2) \]
When $g$ is known the problem is to estimate the distribution of $Y$, observing a sample from (2). An application of this model to fluorescence lifetime measurements is given in Comte and Rebafka [6]. The authors developed an adaptive estimator that takes into account the perturbation from the unknown additive noise and the distortion due to the nonlinear transformation.

In this paper we consider a two-sample problem of contamination that can be related to models (1) and (2) as follows. We assume that two contaminated random variables are observed, say $X$ and $\tilde X$, which are transformations of two known, or observed, signals; that is,
\[ X = g(Y), \qquad \tilde X = \tilde g(\tilde Y), \qquad (3) \]
where $g$ and $\tilde g$ are continuous monotone unknown functions. Our purpose is to test
\[ H_0 : g = \tilde g \quad \text{against} \quad H_1 : g \neq \tilde g, \qquad (4) \]
based on two i.i.d. samples satisfying (3). The problem of testing (4) is of interest in many applications where a signal is noised in another way than in the additive noise model (1). We will distinguish two important cases.
Case 1. The distributions of $Y$ and $\tilde Y$ are known and we observe two samples reflecting $X$ and $\tilde X$. This situation may be encountered when two signals are controlled at the entry of a system but observed with perturbations at its exit.
Case 2. The distributions of $Y$ and $\tilde Y$ are unknown and we first observe two independent samples based on $Y$ and $\tilde Y$, and then we observe contaminated samples $X$ and $\tilde X$ satisfying (3). This situation may be encountered when two unknown signals are observed both at the entry and at the exit of a system.

For both cases we construct test statistics based on nonparametric empirical estimators of $g$ and $\tilde g$, and we adapt a limit result on empirical processes due to Sen [16]. Our test statistics are very easily implemented and we observe through simulations that they have good power against various alternatives. Clearly, when $H_0$ is not rejected, that is, when the two noise functions are identical, it is of interest to interpret the common estimate of $g$. We illustrate this point with a study of the Framingham data set (see Carroll et al. [4], and more recently Wang and Wang [17]).

The paper is organized as follows: in Section 2.1 we consider the problem when the two original signals have known distributions. In Section 2.2 we relax the latter assumption by assuming unknown distributions, while observing the two original signals both before and after perturbation. In Section 3 a simulation study is presented and a real data set is analyzed.

2 The test statistic

2.1 Case 1: the two signal distributions $Y$ and $\tilde Y$ are known

We consider $n$ (resp. $\tilde n$) i.i.d. observations $X_1, \dots, X_n$ (resp. $\tilde X_1, \dots, \tilde X_{\tilde n}$) from (3). We assume that $Y$ and $\tilde Y$ are independent. Write $F_Y$ and $F_{\tilde Y}$ for the cumulative distribution functions of $Y$ and $\tilde Y$ respectively. We assume that these functions are known and invertible. We also write $F_X$ and $F_{\tilde X}$ for the cumulative distribution functions of $X$ and $\tilde X$. We further assume that the transformations $g$ and $\tilde g$ are monotone and, without loss of generality, that they are increasing. Note that $g(y) = F_X^{-1}(F_Y(y))$ and $\tilde g(y) = F_{\tilde X}^{-1}(F_{\tilde Y}(y))$. Hence natural nonparametric estimators of the contaminating functions are given by
\[ \hat g(\cdot) = X_{([nF_Y(\cdot)]+1)} \quad \text{and} \quad \hat{\tilde g}(\cdot) = \tilde X_{([\tilde n F_{\tilde Y}(\cdot)]+1)}, \qquad (5) \]
where $X_{(i)}$ and $\tilde X_{(i)}$ denote the $i$th order statistics, and $[x]$ denotes the integer part of the real $x$. A fundamental theorem of Sen [16] states the following convergence in distribution:
\[ \sqrt{n}\,\big(X_{([np]+1)} - F_X^{-1}(p)\big) \xrightarrow{D} N\!\left(0, \frac{p(1-p)}{f(F_X^{-1}(p))^2}\right), \quad \forall p \in (0,1), \qquad (6) \]
where $\xrightarrow{D}$ denotes convergence in distribution, $f$ denotes the density of $X$, and $N(m, \sigma^2)$ the normal distribution with mean $m$ and variance $\sigma^2$. We will need the following two standard assumptions:

• (A1) there exists $a < \infty$ such that $n/(n + \tilde n) \to a$;

• (A2) $f > 0$ and $f$ is $C^k$, for some positive integer $k$.
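Estimator (5) is immediate to compute from the sorted sample. The following is a minimal sketch (Python with NumPy/SciPy; the function names and the toy example are our own illustrative choices, not the authors' code):

```python
import numpy as np
from scipy import stats

def g_hat(y, x_sample, F_Y):
    """Quantile-type estimator (5): the order statistic X_([n F_Y(y)] + 1),
    where F_Y is the (known) CDF of the uncontaminated signal Y."""
    x_sorted = np.sort(x_sample)
    n = len(x_sorted)
    # rank [n F_Y(y)] + 1 in 1-based order-statistic notation = index [n F_Y(y)] 0-based
    idx = np.floor(n * F_Y(np.asarray(y))).astype(int)
    return x_sorted[np.clip(idx, 0, n - 1)]   # clip guards the boundaries p = 0 or 1

# toy check with g(y) = exp((y+3)/(y+5)), the transformation used in Section 3
rng = np.random.default_rng(0)
ys = rng.standard_normal(2000)               # Y ~ N(0,1)
x = np.exp((ys + 3) / (ys + 5))              # observed X = g(Y)
print(g_hat(0.0, x, stats.norm.cdf))         # close to g(0) = exp(3/5), about 1.822
```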
We deduce a first result which is the main tool for the construction of the test statistic.

Proposition 2.1. Let Assumptions (A1)-(A2) hold. Under $H_0$ we have
\[ \sqrt{\frac{n\tilde n}{n+\tilde n}}\,\big(\hat g(y) - \hat{\tilde g}(y)\big) \xrightarrow{D} N(0, \sigma^2(y)), \quad \text{as } n \to \infty,\ \tilde n \to \infty, \qquad (7) \]
where
\[ \sigma^2(y) = (1-a)\,\frac{F_Y(y)(1-F_Y(y))}{f_X(g(y))^2} + a\,\frac{F_{\tilde Y}(y)(1-F_{\tilde Y}(y))}{f_{\tilde X}(\tilde g(y))^2}. \]
Proof. It follows directly from (6), replacing $p$ by $F_Y(y)$ and $F_{\tilde Y}(y)$ respectively. □

We will estimate the variance $\sigma^2(y)$ using a nonparametric method. Consider a kernel $K(\cdot)$, for instance the quartic kernel defined by $K(y) = \frac{15}{16}(1-y^2)^2\,\mathbf{1}_{(-1,1)}(y)$, and an associated bandwidth $h_n$. In the sequel we set $K_{h_n}(y) = K(y/h_n)$. To avoid small values in the denominators of the variance estimate we use
\[ \hat f_X(y) = \max\!\left(\frac{1}{nh_n}\sum_{i=1}^{n} K_{h_n}(X_i - y),\; e_n\right) \quad \text{and} \quad \hat f_{\tilde X}(y) = \max\!\left(\frac{1}{\tilde n h_{\tilde n}}\sum_{i=1}^{\tilde n} K_{h_{\tilde n}}(\tilde X_i - y),\; e_{\tilde n}\right), \]
where $e_n > 0$ and $e_n \to 0$ as $n$ tends to infinity. The estimator of $\sigma^2(y)$ is then
\[ \hat\sigma^2(y) = (1-a)\,\frac{F_Y(y)(1-F_Y(y))}{\hat f_X(\hat g(y))^2} + a\,\frac{F_{\tilde Y}(y)(1-F_{\tilde Y}(y))}{\hat f_{\tilde X}(\hat{\tilde g}(y))^2}, \]
and we consider the statistic
\[ T_1(y) = \frac{n\tilde n}{n+\tilde n}\,\hat\sigma^{-2}(y)\,\big(\hat g(y) - \hat{\tilde g}(y)\big)^2. \qquad (8) \]
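To fix ideas, here is a hedged sketch of the trimmed kernel density estimate and of $T_1(y)$ in (8) (Python; the rate constants `c1`, `c2` are placeholders to be chosen as in Proposition 2.2 below, and all names are ours):

```python
import numpy as np
from scipy import stats

def quartic_kernel(u):
    """Quartic (biweight) kernel K(u) = (15/16)(1 - u^2)^2 on (-1, 1)."""
    return np.where(np.abs(u) < 1, 15.0 / 16.0 * (1 - u**2) ** 2, 0.0)

def f_hat(x0, sample, h, e):
    """Trimmed kernel density estimate: max((1/nh) sum_i K((X_i - x0)/h), e)."""
    dens = quartic_kernel((sample - x0) / h).sum() / (len(sample) * h)
    return max(dens, e)

def T1(y, x, xt, F_Y, F_Yt, g_hat_fn, gt_hat_fn, c1=0.2, c2=0.1):
    """Statistic (8). H0: g = g_tilde is rejected at level alpha when
    T1(y) exceeds the chi-square(1) quantile; c1, c2 are placeholder rates."""
    n, nt = len(x), len(xt)
    a = n / (n + nt)                     # finite-sample proxy for the limit in (A1)
    gy, gty = g_hat_fn(y), gt_hat_fn(y)
    var = ((1 - a) * F_Y(y) * (1 - F_Y(y)) / f_hat(gy, x, n**-c1, n**-c2) ** 2
           + a * F_Yt(y) * (1 - F_Yt(y)) / f_hat(gty, xt, nt**-c1, nt**-c2) ** 2)
    return (n * nt / (n + nt)) * (gy - gty) ** 2 / var

# decision at level 5%: reject H0 when T1(y) > stats.chi2.ppf(0.95, df=1), about 3.84
```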
Proposition 2.2. Let Assumptions (A1)-(A2) hold. If $h_n \simeq n^{-c_1}$ and $e_n \simeq n^{-c_2}$ for some positive constants $c_1$ and $c_2$ such that $c_2 k < c_1 < k$, then under $H_0$, when $n \to \infty$, $\tilde n \to \infty$, we have for all $y$:
\[ T_1(y) \xrightarrow{D} Z, \]
where $Z$ is chi-squared distributed with one degree of freedom.
Proof. We need the following fundamental lemma (see Härdle [11]):
Lemma 2.1. $\sup_{y \in \mathbb{R}} |\hat f(y) - f(y)| = O_p\!\left(h_n^k + \sqrt{\dfrac{\log n}{n h_n}}\right)$.

We can write
\[ \hat\sigma^2(y) = \frac{u(y)}{\hat f_X(\hat g(y))^2} + \frac{v(y)}{\hat f_{\tilde X}(\hat{\tilde g}(y))^2}, \]
where $u(y) = (1-a)F_Y(y)(1-F_Y(y))$ and $v(y) = a F_{\tilde Y}(y)(1-F_{\tilde Y}(y))$. Using a Taylor expansion, there exist $A$ and $B$ such that
\[ \hat\sigma^2(y) = \sigma^2(y) + \Big(\hat f_X(\hat g(y)) - f(g(y))\Big)\Big(-\frac{2u(y)}{A^{3}}\Big) + \Big(\hat f_{\tilde X}(\hat{\tilde g}(y)) - \tilde f(\tilde g(y))\Big)\Big(-\frac{2v(y)}{B^{3}}\Big), \]
with $A \ge e_n$ and $B \ge e_{\tilde n}$. Then, from Lemma 2.1 and the assumptions on $h_n$ and $e_n$, we get $\hat\sigma^2(y) = \sigma^2(y) + o_P(1)$, and the result follows from Proposition 2.1. □

2.2 Case 2: the two signal distributions $Y$ and $\tilde Y$ are unknown

We consider $n_x$ (resp. $\tilde n_x$) i.i.d. observations $X_1, \dots, X_{n_x}$ (resp. $\tilde X_1, \dots, \tilde X_{\tilde n_x}$) and $n_y$ (resp. $\tilde n_y$) i.i.d. observations $Y_1, \dots, Y_{n_y}$ (resp. $\tilde Y_1, \dots, \tilde Y_{\tilde n_y}$) from (3). Put $N = n_x n_y/(n_x+n_y)$ and $\tilde N = \tilde n_x \tilde n_y/(\tilde n_x+\tilde n_y)$.

The two samples $Y_1, \dots, Y_{n_y}$ and $\tilde Y_1, \dots, \tilde Y_{\tilde n_y}$ can be viewed as two independent training sets which permit estimation of the initial densities of the signals before perturbation. Again we want to test $H_0 : g = \tilde g$. We now estimate $g$ and $\tilde g$ by
\[ \hat g(\cdot) = X_{([n_x \hat F_Y(\cdot)])} \quad \text{and} \quad \hat{\tilde g}(\cdot) = \tilde X_{([\tilde n_x \hat F_{\tilde Y}(\cdot)])}, \qquad (9) \]
where
\[ \hat F_Y(y) = \frac{1}{n_y}\sum_{i=1}^{n_y} \mathbf{1}\{Y_i \le y\} \quad \text{and} \quad \hat F_{\tilde Y}(y) = \frac{1}{\tilde n_y}\sum_{i=1}^{\tilde n_y} \mathbf{1}\{\tilde Y_i \le y\} \qquad (10) \]
are the empirical distribution functions of $Y$ and $\tilde Y$ respectively. We assume that
\[ \lim n_x/(n_x+n_y) = a < \infty, \qquad \lim \tilde n_x/(\tilde n_x+\tilde n_y) = \tilde a < \infty, \]
and we make the following assumption, extending Assumption (A1):

• (A3) there exists $b < \infty$ such that $N/(N+\tilde N) \to b$.
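In Case 2 the only change relative to the sketch of Section 2.1 is that the known CDFs are replaced by the empirical CDFs (10) of the training samples; a minimal sketch (again with illustrative names):

```python
import numpy as np

def ecdf(sample):
    """Empirical distribution function (10), returned as a callable."""
    s = np.sort(sample)
    return lambda y: np.searchsorted(s, y, side="right") / len(s)

def g_hat_case2(y, x_sample, y_train):
    """Plug-in estimator (9): order statistic X_([n_x F_hat_Y(y)])."""
    F_hat = ecdf(y_train)
    x_sorted = np.sort(x_sample)
    idx = int(len(x_sorted) * F_hat(y)) - 1    # 1-based rank [n_x F_hat_Y(y)]
    return x_sorted[np.clip(idx, 0, len(x_sorted) - 1)]
```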
We can extend Proposition 2.1 as follows.

Proposition 2.3. Let Assumptions (A2)-(A3) hold. Under $H_0$ we have
\[ \sqrt{\frac{N\tilde N}{N+\tilde N}}\,\big(\hat g(y) - \hat{\tilde g}(y)\big) \xrightarrow{D} N(0, \sigma^2(y)), \quad \text{as } N \to \infty,\ \tilde N \to \infty, \qquad (11) \]
where
\[ \sigma^2(y) = (1-b)\,\frac{F_Y(y)(1-F_Y(y))}{f_X(g(y))^2} + b\,\frac{F_{\tilde Y}(y)(1-F_{\tilde Y}(y))}{f_{\tilde X}(\tilde g(y))^2}. \qquad (12) \]
Proof. We first show that
\[ U = \sqrt{\frac{n_x n_y}{n_x+n_y}}\,\big(\hat g(y) - g(y)\big) \xrightarrow{D} N(0, \sigma_1^2(y)), \quad \text{as } n_x \to \infty,\ n_y \to \infty, \]
where $\sigma_1^2(y) = F_Y(y)(1-F_Y(y))/f_X(g(y))^2$. Write
\[ \hat g(y) - g(y) = \hat G(y) + G(y), \]
where $\hat G(y) = \hat g(y) - F_X^{-1}(\hat F_Y(y)) = X_{([n_x \hat F_Y(y)])} - F_X^{-1}(\hat F_Y(y))$ and $G(y) = F_X^{-1}(\hat F_Y(y)) - g(y)$. By the delta method we get $n_y^{1/2}\,G(y) \xrightarrow{D} N(0, \sigma_1^2(y))$. Then we decompose the characteristic function
\[ E\big(e^{iuU}\big) = E\Big(e^{iun_{x,y}G}\, E\big(e^{iun_{x,y}\hat G} \mid Y\big)\Big), \]
where $n_{x,y} = \sqrt{n_x n_y/(n_x+n_y)}$ and $Y$ stands for the vector of observations $Y_1, \dots, Y_{n_y}$. Since these functions are bounded we get
\[ \lim_{n_x \to \infty,\, n_y \to \infty} E(\exp(iuU)) = E\Big(\lim e^{iun_{x,y}G} \lim E\big(e^{iun_{x,y}\hat G} \mid Y\big)\Big) = E\Big(e^{iu\sqrt{a}Z}\lim_{n_y \to \infty} e^{-(1-a)u^2\hat\sigma_1^2(y)/2}\Big), \]
where $Z \sim N(0, \sigma_1^2(y))$ and $\hat\sigma_1^2(y) = \hat F_Y(y)(1-\hat F_Y(y))/f_X(g(y))^2$. We finally obtain
\[ \lim_{n_x \to \infty,\, n_y \to \infty} E(\exp(iuU)) = \exp(-u^2\sigma_1^2(y)/2). \]
Similarly, writing
\[ \tilde U = \sqrt{\frac{\tilde n_x \tilde n_y}{\tilde n_x+\tilde n_y}}\,\big(\hat{\tilde g}(y) - \tilde g(y)\big), \]
we obtain $\tilde U \xrightarrow{D} N(0, \tilde\sigma_1^2(y))$, as $\tilde n_x \to \infty$, $\tilde n_y \to \infty$, with $\tilde\sigma_1^2(y) = F_{\tilde Y}(y)(1-F_{\tilde Y}(y))/f_{\tilde X}(\tilde g(y))^2$. Finally, combining these two convergences with the equality $\tilde g = g$ under $H_0$ completes the proof. □

As previously, we can estimate $\sigma^2(y)$ in (12) by the nonparametric estimator
\[ \hat\sigma^2(y) = (1-b)\,\frac{\hat F_Y(y)(1-\hat F_Y(y))}{\hat f_X(\hat g(y))^2} + b\,\frac{\hat F_{\tilde Y}(y)(1-\hat F_{\tilde Y}(y))}{\hat f_{\tilde X}(\hat{\tilde g}(y))^2}, \]
where $\hat F_Y$ and $\hat F_{\tilde Y}$ are the empirical distribution functions of $Y$ and $\tilde Y$ given by (10). Our test statistic is given by
\[ T_2(y) = \frac{N\tilde N}{N+\tilde N}\,\hat\sigma^{-2}(y)\,\big(\hat g(y) - \hat{\tilde g}(y)\big)^2. \qquad (13) \]
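A sketch of $T_2(y)$ in (13), reusing `ecdf`, `g_hat_case2` and `f_hat` from the previous sketches (the bandwidths and trimming levels are again placeholder arguments):

```python
import numpy as np
from scipy import stats

def T2(y, x, y_train, xt, yt_train, h, e, ht, et):
    """Statistic (13) for Case 2; y_train, yt_train are the training samples."""
    N = len(x) * len(y_train) / (len(x) + len(y_train))
    Nt = len(xt) * len(yt_train) / (len(xt) + len(yt_train))
    b = N / (N + Nt)                     # finite-sample proxy for the limit in (A3)
    Fy, Fyt = ecdf(y_train), ecdf(yt_train)
    gy = g_hat_case2(y, x, y_train)
    gty = g_hat_case2(y, xt, yt_train)
    var = ((1 - b) * Fy(y) * (1 - Fy(y)) / f_hat(gy, x, h, e) ** 2
           + b * Fyt(y) * (1 - Fyt(y)) / f_hat(gty, xt, ht, et) ** 2)
    return N * Nt / (N + Nt) * (gy - gty) ** 2 / var

# asymptotic p-value under Proposition 2.4 below: 1 - stats.chi2.cdf(T2(...), df=1)
```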
We can now generalize Proposition 2.2 as follows.

Proposition 2.4. Let Assumptions (A2)-(A3) hold. If $h_n \simeq n^{-c_1}$ and $e_n \simeq n^{-c_2}$ for some positive constants $c_1$ and $c_2$ such that $c_2 k < c_1 < k$, then under $H_0$, when $N \to \infty$, $\tilde N \to \infty$, we have
\[ T_2(y) \xrightarrow{D} Z, \]
where $Z$ is chi-squared distributed with one degree of freedom.
Proof. We combine the proof of Proposition 2.2 with the fact that $\hat F(1-\hat F)$ is bounded to get $\hat\sigma^2(y) = \sigma^2(y) + o_P(1)$, and we conclude by Proposition 2.3. □

2.3 Behavior under $H_1$

We study the convergence properties of the tests $T_1$ and $T_2$ under some alternatives.

Proposition 2.5.
a. General alternatives. Consider the test statistics $T_1$ and $T_2$; then for all $y$ such that $g(y) \neq \tilde g(y)$ we have
\[ T_1(y) \xrightarrow{P} +\infty \quad \text{and} \quad T_2(y) \xrightarrow{P} +\infty, \]
where $\xrightarrow{P}$ denotes convergence in probability.
b. Local alternatives. Denote $m = \frac{n\tilde n}{n+\tilde n}$ or $m = \frac{N\tilde N}{N+\tilde N}$ according to whether the test statistic $T_1$ or $T_2$ is used, and consider the local alternatives
\[ H_l : \tilde g(y) = g(y) + \frac{k(y)}{m^{\beta}}. \]
Then under $H_l$, when $n \to \infty$, $\tilde n \to \infty$, $N \to \infty$, $\tilde N \to \infty$, we have for all $y$:
i. if $\beta > 1/2$ then $T_1(y) \xrightarrow{D} Z$ and $T_2(y) \xrightarrow{D} Z$;
ii. if $\beta = 1/2$ then $T_1(y) \xrightarrow{D} Z_k$ and $T_2(y) \xrightarrow{D} Z_k$;
iii. if $\beta < 1/2$ then $T_1(y) \xrightarrow{P} +\infty$ and $T_2(y) \xrightarrow{P} +\infty$;
where $Z$ is chi-squared distributed with one degree of freedom and $Z_k$ is a noncentral chi-squared variable with one degree of freedom and noncentrality parameter $k^2(y)/\sigma^2(y)$.

The proof of this proposition is straightforward and hence omitted. □
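To see parts a and b at work numerically, one can simulate Case 2 data under a fixed alternative and watch $T_2$ diverge as the sample sizes grow; a hedged sketch reusing the helpers above (the shift alternative and all constants are our own illustrative choices):

```python
import numpy as np
from scipy import stats

g = lambda y: np.exp((y + 3) / (y + 5))      # transformation used in Section 3
rng = np.random.default_rng(2)
for n in (50, 100, 500):
    y_train, yt_train = rng.standard_normal(n), rng.standard_normal(n)
    x = g(rng.standard_normal(n))            # X = g(Y)
    xt = g(rng.standard_normal(n)) + 1.0     # X_tilde = g(Y_tilde) + 1: a fixed alternative
    t2 = T2(0.0, x, y_train, xt, yt_train,
            h=n**-0.2, e=n**-0.1, ht=n**-0.2, et=n**-0.1)
    print(n, t2, 1 - stats.chi2.cdf(t2, df=1))   # T2 grows, p-value vanishes (part a)
```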
Remark 2.1
Estimators $\hat g$ (resp. $\hat{\tilde g}$) are computed from $(X_1, \dots, X_{n_x})$ and $(Y_1, \dots, Y_{n_y})$ (resp. $(\tilde X_1, \dots, \tilde X_{\tilde n_x})$ and $(\tilde Y_1, \dots, \tilde Y_{\tilde n_y})$). Under the null $H_0$ there are two different ways to construct a common estimator of $g$. First, we can consider the aggregate estimator
\[ \hat g_0 = \frac{(n_x+n_y)\,\hat g + (\tilde n_x+\tilde n_y)\,\hat{\tilde g}}{n_x+n_y+\tilde n_x+\tilde n_y}, \qquad (14) \]
and, second, another estimator can be constructed by aggregating the samples.

3 Simulation study

For all empirical powers and empirical levels we carry out replicated experiments and we use three different sample sizes: $n = 50$, $n = 100$, and $n = 500$. For each replication we compute the statistics $T_1(y)$ and $T_2(y)$ given by (8) and (13), where $y$ is chosen randomly following a standard normal distribution. We denote by $N(0,1)$ the standard normal distribution with mean zero and variance 1. We first consider the case where $Y_t$ and $\tilde Y_t$ are independent and $N(0,1)$ distributed. The bandwidth is chosen as $h_n = n^{-c_1}$ and the trimming as $e_n = n^{-c_2}$, with $c_1$ and $c_2$ as in Proposition 2.2.
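A condensed Monte Carlo sketch of this design for the level of $T_1$ (the replication count and seed are arbitrary illustrative choices; `T1` and `g_hat` are the helpers sketched in Section 2):

```python
import numpy as np
from scipy import stats

def empirical_level(n, reps=1000, alpha=0.05, seed=1):
    """Monte Carlo level of T1 under H0 with g = g_tilde = exp((y+3)/(y+5))."""
    rng = np.random.default_rng(seed)
    g = lambda y: np.exp((y + 3) / (y + 5))
    rejections = 0
    for _ in range(reps):
        x, xt = g(rng.standard_normal(n)), g(rng.standard_normal(n))
        y0 = rng.standard_normal()           # evaluation point y ~ N(0,1)
        t1 = T1(y0, x, xt, stats.norm.cdf, stats.norm.cdf,
                lambda y: g_hat(y, x, stats.norm.cdf),
                lambda y: g_hat(y, xt, stats.norm.cdf))
        rejections += t1 > stats.chi2.ppf(1 - alpha, df=1)
    return rejections / reps
```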
3.1 Study of the empirical level

To study the empirical levels of $T_1$ and $T_2$ we choose
\[ g(y) = \tilde g(y) = \exp\{(y+3)/(y+5)\}, \]
and we fix a theoretical level $\alpha = 5\%$. Table 1 shows the empirical levels of the test under $H_0$. It can be seen that both statistics $T_1$ and $T_2$ provide levels close to the asymptotic value.

3.2 Study of the empirical powers

We consider the model where $Y_t$ and $\tilde Y_t$ are independent and $N(0,1)$ distributed. To study the empirical powers of $T_1$ and $T_2$ we consider $g(y) = \exp((y+3)/(y+5))$ and the four following transformations:
\[ \tilde g_1(y) = \exp((y+3)/(y+5)) + 1, \quad \tilde g_2(y) = 2\exp((y+3)/(y+5)), \quad \tilde g_3(y) = -(y+11)/(y+5), \quad \tilde g_4(y) = 4y+5, \]
and we also study local alternatives by considering
\[ \tilde g_5(y) = g(y) + \frac{2(y+5)}{n^{\beta}}. \]
Tables 2 and 3 present the empirical powers of $T_1$ and $T_2$ under fixed and local alternatives, respectively, for a theoretical level $\alpha$ equal to $5\%$. From Table 2 it appears that knowledge of the probability densities of $Y$ and $\tilde Y$ yields more stable statistics that detect more easily a departure from the null hypothesis. Hence the test statistic $T_1$ provides better power, particularly for the smallest sample size. The test statistic $T_2$ has a low empirical power for $n = 50$; but when the sample size $n$ increases, the empirical power of $T_2$ becomes similar to that of $T_1$. Table 3 indicates that $T_1$ and $T_2$ provide good power for $\beta \le 1/2$. For $\beta > 1/2$ the power converges to the theoretical level $\alpha$; this is in accordance with the theoretical result stated in Proposition 2.5.

3.3 A real data set

We consider the Framingham Study on coronary heart disease described by Carroll et al. [4]. The data consist of measurements of systolic blood pressure (SBP) obtained at two different examinations of 1,615 males on an 8-year follow-up. At each examination, the SBP was measured twice for each individual. The four variables of interest are: $Y$ = the first SBP at examination 1, $\tilde Y$ = the second SBP at examination 1, $X$ = the first SBP at examination 2, $\tilde X$ = the second SBP at examination 2. Our purpose is to examine whether the distribution of the SBP changed over time, and which type of transformation it underwent. Following our notation, we study the transformation between the distributions of $Y$ and $X$, and also the one between the distributions of $\tilde Y$ and $\tilde X$. We thus assume that $X = g(Y)$ and $\tilde X = \tilde g(\tilde Y)$.

Table 4 indicates that the distributions of $X$, $Y$, $\tilde X$ and $\tilde Y$ are all skewed to the right and leptokurtic; KS denotes the Kolmogorov-Smirnov statistic, and the associated p-values are negligible, hence the normality assumption is strongly rejected. Figure 1 represents nonparametric estimates of the probability densities of $X$, $Y$, $\tilde X$, and $\tilde Y$. From Figure 1 it seems that the distributions of the variables $Y$ and $X$ have a similar shape. However, from Table 4 we observe a noticeable decrease in the mean and an increase in the variance.
[Figure 1: Kernel estimates of the probability densities of $X$, $Y$, $\tilde X$, $\tilde Y$. In the top panel, f11 (resp. f21) is the kernel estimate of the density of $Y$ (resp. of $X$). In the bottom panel, f12 (resp. f22) is the kernel estimate of the density of $\tilde Y$ (resp. of $\tilde X$).]
[Figure 2: Nonparametric estimators of $g$ and $\tilde g$ and the aggregated estimator on the interval $[c,d]$: gh (resp. gth and g0h) denotes $\hat g$ (resp. $\hat{\tilde g}$ and $\hat g_0$).]

Based on the nonparametric estimators given in Figure 2, we can postulate that only the location and the scale are affected by time; therefore the transformation $g$ is linear, that is, $g(y) = ay + b$. Similarly, the distributions of the variables $\tilde Y$ and $\tilde X$ can be linked by $\tilde g(y) = \tilde a y + \tilde b$. The functions $g$, $\tilde g$ are estimated on the interval $[c,d]$, where $c = \max(\min(Y_i), \min(\tilde Y_j))$ and $d = \min(\max(Y_i), \max(\tilde Y_j))$. These functions are estimated on the grid $y_i = c + (d-c)i/M$, for a given $M$.

By applying our test we obtain a p-value very close to 1, and hence we can consider that $g = \tilde g$. In Figure 2 we observe that all the estimators $\hat g$, $\hat{\tilde g}$ and $\hat g_0$ are approximately linear on the interval $[c,d]$; however, near the borders (near $c$ and $d$) the approximation is not good. One can observe that the estimators are constant on regions where there are not enough observations. Therefore, to compute the linear approximation of these estimators we consider only the $y_i$ belonging to the interval $[100, \cdot\,]$.

The ordinary least squares fits based on $(y_i, \hat g(y_i))$, $(y_i, \hat{\tilde g}(y_i))$ and $(y_i, \hat g_0(y_i))$, $y_i \in [100, \cdot\,]$, $1 \le i \le M$, yield
\[ \hat g(y) = 0.\cdots\, y + 0.\cdots, \qquad \hat{\tilde g}(y) = 0.\cdots\, y + 0.\cdots, \qquad \hat g_0(y) = 0.\cdots\, y + 0.\cdots. \]
By using a parametric approach, i.e. $\hat g_p(y) = \hat a y + \hat b$, where $\hat a = \mathrm{cov}(X,Y)/\mathrm{var}(Y)$ and $\hat b = \bar X - \hat a \bar Y$, we obtain the estimators
\[ \hat g_p(y) = 0.\cdots\, y + 33.\cdots, \qquad \hat{\tilde g}_p(y) = 0.\cdots\, y + 36.\cdots, \]
and the common aggregate parametric estimator is given by
\[ \hat h_{p,0}(y) = 0.\cdots\, y + 34.787. \]
To compare the parametric and the nonparametric approaches, we consider the aggregate estimators and we compare the predicted values of the first two moments of $X$ and $\tilde X$ with those observed. The predictions for $X$ (resp. for $\tilde X$) are computed by using the observed moments of $Y$ (resp. of $\tilde Y$) and the common transformation. Using the parametric approach we get
\[ \hat m_X = 0.\cdots\, m_Y + 34.787 = 133.\cdots, \qquad \widehat{\mathrm{Var}}_X = (0.\cdots)^2\,\mathrm{Var}(Y) = 232.\cdots. \]
The nonparametric approach yields
\[ \hat m_X = 0.\cdots\, m_Y + 0.\cdots, \qquad \widehat{\mathrm{Var}}_X = (0.\cdots)^2\,\mathrm{Var}(Y) = 408.\cdots. \]
Note that the first two observed moments of $X$ are 131.2 and 439.11. Similarly, for the pair $(\tilde X, \tilde Y)$, the parametric predictions are given by
\[ \hat m_{\tilde X} = 0.\cdots\, m_{\tilde Y} + 34.787 = 131.\cdots, \qquad \widehat{\mathrm{Var}}_{\tilde X} = (0.\cdots)^2\,\mathrm{Var}(\tilde Y) = 226.\cdots, \]
and the nonparametric ones by
\[ \hat m_{\tilde X} = 0.\cdots\, m_{\tilde Y} + 0.\cdots, \qquad \widehat{\mathrm{Var}}_{\tilde X} = (0.\cdots)^2\,\mathrm{Var}(\tilde Y) = 399.\cdots. \]
Recall that the first two observed moments of $\tilde X$ are 128.8 and 410.21. The predictions of the nonparametric model are closer to the observed values; consequently, the nonparametric approach seems more suitable.
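A brief sketch of this last step (the OLS line fitted to a nonparametric estimator on a grid; the interval bound `hi` and all names are illustrative assumptions, since the paper's exact numbers are not reproduced above):

```python
import numpy as np

def linear_approx(g_hat_fn, c, d, M=100, lo=100.0, hi=200.0):
    """OLS line fitted to y -> g_hat(y) on the grid y_i = c + (d - c) i / M,
    keeping only grid points in [lo, hi] where the estimator is reliable."""
    ys = c + (d - c) * np.arange(1, M + 1) / M
    keep = (ys >= lo) & (ys <= hi)
    slope, intercept = np.polyfit(ys[keep], [g_hat_fn(v) for v in ys[keep]], deg=1)
    return slope, intercept

# predicted moments from a fitted line (a, b): m_X ~ a * m_Y + b, Var_X ~ a^2 Var(Y)
```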
References
[1] N. Bissantz, L. Dümbgen, H. Holzmann and A. Munk, Non-parametric confidence bands in deconvolution density estimation, J. Roy. Statist. Soc. Ser. B 69 (2007), pp. 483–506.

[2] C. Butucea and C. Matias, Minimax estimation of the noise level and of the deconvolution density in a semiparametric deconvolution model, Bernoulli 11 (2005), pp. 309–340.

[3] R.J. Carroll and P. Hall, Optimal rates of convergence for deconvolving a density, J. Amer. Statist. Assoc. 83 (1988), pp. 1184–1186.

[4] R.J. Carroll, D. Ruppert, L.A. Stefanski and C. Crainiceanu, Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition, Chapman & Hall, New York, 2006.

[5] L. Cavalier and N. Hengartner, Adaptive estimation for inverse problems with noisy operators, Inverse Problems 21 (2005), pp. 1345–1361.

[6] F. Comte and T. Rebafka, Adaptive density estimation in the pile-up model involving measurement errors, available at http://arxiv.org/abs/1011.0592, 2010.

[7] L. Devroye, Consistent deconvolution in density estimation, Canad. J. Statist. 17 (1989), pp. 235–239.

[8] P. Diggle and P. Hall, A Fourier approach to nonparametric deconvolution of a density estimate, J. Roy. Statist. Soc. Ser. B 55 (1993), pp. 523–531.

[9] S. Efromovich and V. Koltchinskii, On inverse problems with unknown operators, IEEE Trans. Inform. Theory 47 (2001), pp. 2876–2893.

[10] J. Fan, On the optimal rate of convergence for nonparametric deconvolution problems, Ann. Statist. 19 (1991), pp. 1257–1272.

[11] W. Härdle, Applied Nonparametric Regression, Cambridge University Press, Cambridge, 1992.

[12] H. Holzmann, N. Bissantz and A. Munk, Density testing in a contaminated sample, J. Multivariate Anal. 98 (2007), pp. 57–75.

[13] J. Johannes, S. Van Bellegem and A. Vanhems, Convergence rates for ill-posed inverse problems with an unknown operator, Tech. Rep., IDEI Working Paper, 2009.

[14] A. Meister, Density estimation with normal measurement error with unknown variance, Statist. Sinica 16 (2006), pp. 195–211.

[15] M.H. Neumann, Deconvolution from panel data with unknown error distribution, J. Multivariate Anal. 98 (2007), pp. 1955–1968.

[16] P.K. Sen, Limiting behavior of regular functionals of empirical distributions for stationary mixing processes, Probability Theory and Related Fields 25 (1972), pp. 71–82.

[17] X.F. Wang and B. Wang, Deconvolution Estimation in Measurement Error Models.
[Table 1: Empirical levels of $T_1$ and $T_2$ (in %) for a theoretical level $\alpha = 5\%$; sample sizes $n = 50$, $100$ and $500$.]

Table 2: Empirical powers of $T_1$ and $T_2$ (in %) for a theoretical level $\alpha = 5\%$.

              $T_1(\tilde g_1)$   $T_1(\tilde g_2)$   $T_2(\tilde g_1)$   $T_2(\tilde g_2)$
  $n = 50$        100                 99.69               99.96               98.47
  $n = 100$       –                   –                   –                   –
  $n = 500$       –                   –                   –                   –

              $T_1(\tilde g_3)$   $T_1(\tilde g_4)$   $T_2(\tilde g_3)$   $T_2(\tilde g_4)$
  $n = 50$        100                 100                 78.59               71.47
  $n = 100$       100                 100                 84.31               78.41
  $n = 500$       100                 100                 92.42               92.07

[Table 3: Empirical powers of $T_1$ and $T_2$ (in %) for a theoretical level $\alpha = 5\%$ under the local alternative $\tilde g_5$, for several values of $\beta$; sample sizes $n = 50$, $100$ and $500$.]

[Table 4: Descriptive statistics of $Y$, $X$, $\tilde Y$, $\tilde X$ and Kolmogorov-Smirnov (KS) normality statistics.]