Behavior of R-estimators under measurement errors
Bernoulli 22(2), 2016, 1093–1112. DOI: 10.3150/14-BEJ687
JANA JUREČKOVÁ, HIRA L. KOUL, RADIM NAVRÁTIL and JAN PICEK

Faculty of Mathematics and Physics, Charles University, Sokolovská 83, CZ-186 75 Prague 8, Czech Republic. E-mail: [email protected]
Statistics and Probability, Michigan State University, A435 Wells Hall, East Lansing, Michigan, USA. E-mail: [email protected]
Department of Mathematics and Statistics, Masaryk University Brno, Kotlářská 2, 611 37 Brno, Czech Republic
Applied Mathematics, Technical University, Voroněžská 13, Liberec, Czech Republic. E-mail: [email protected]
As was shown recently, measurement errors in regressors affect only the power of a rank test, not its critical region. Motivated by this, we study the effect of measurement errors on R-estimators in the linear model. It is demonstrated that while an R-estimator admits a local asymptotic bias, this bias surprisingly depends only on the precision of the measurements: it depends neither on the chosen rank-test score-generating function nor on the distribution of the regression model errors. The R-estimators are numerically illustrated and compared with the LSE and L1-estimators in this situation.

Keywords: contiguity; linear rank statistic; linear regression model; local asymptotic bias; measurement error; R-estimate
1. Introduction
Measurement technologies are often affected by random errors; if the goal of the experiment is to estimate a parameter, then the estimate is biased, and thus inconsistent. This problem appears in analytic chemistry, in environmental monitoring, in the modeling of astronomical data, in biometrics, and in practically all areas of applied work. Moreover, some observations can go undetected, for example, when the measured flux (light, magnetic) in the experiment falls below some flux limit. In econometrics, the errors can be a result of misreporting by subjects, miscoding by the collectors of the data, or by incorrect
[This is an electronic reprint of the original article published by the ISI/BS in Bernoulli, 2016, Vol. 22, No. 2, 1093–1112. This reprint differs from the original in pagination and typographic detail.]
transformation from initial reports. An essential part of measuring techniques, used, for example, in analytic chemistry, is the construction of a calibration curve; the result for an unknown sample is then determined by interpolation. Robust calibration methods were developed in [24]. However, even the calibration can be affected by measurement errors. The mismeasurements make the statistical inference biased, and they distort the trends in the data.

A variety of functional models have been proposed for handling measurement errors in regression models. Either the regressor or the response or both can be affected by random errors. Technicians, geologists and other specialists are aware of this problem, and try to reduce the bias with various ad hoc procedures. The bias cannot be completely eliminated or substantially reduced unless we have some additional knowledge of the behavior of the measurement errors. The papers dealing with practical aspects of measurement error models include [2, 15, 21, 23, 30], among others.

Adcock [1] was probably the first to realize the importance of the situation. There exists a rich literature on statistical inference in error-in-variables (EV) models, as is evidenced by the monographs of Fuller [9], Carroll et al. [6], and Cheng and Van Ness [7], and the references therein. The monographs [9] and [7] deal mostly with the classical Gaussian setup, while [6] discusses numerous inference procedures under a semi-parametric setup. Nonparametric methods in EV models are considered in [4, 5] and in references therein, and in [8], among others. The regression quantile theory in the area of EV models was started by He and Liang [12]. Arias, Hallock and Sosa-Escudero [3] used an instrumental variable estimator for quantile regression, considering biases arising from unmeasured ability and measurement errors.
The problem of mismeasurement is also of interest in the econometric literature: [11] and [16] described recent developments in treating the effect of mismeasurement on econometric models.

The advantage of rank and signed-rank procedures in measurement error models was discovered recently in [20, 25, 26, 31] and in [32]; the latter made a detailed analysis of rank procedures in the linear model with a nonlinear nuisance regressor and under various kinds of measurement errors. The rank tests in particular can be recommended in this situation: it is shown in [20] that the critical region of the rank test for regression is insensitive to measurement errors in regressors under very general conditions; the errors affect only the power of the test. However, against expectations following from the invariance of the ranks, due to which an estimate of a nuisance parameter in [20] was consistent for every fixed value of the same, we show that the R-estimator of the slope parameter β in the linear model is biased. More precisely, we show that, unless β = 0, the R-estimator is biased even in a local neighborhood of 0. Hence, we cannot have an unbiased estimator of any kind in this situation, unless we have some additional information on the measurement errors.

As we further show in the present paper, surprisingly the local asymptotic bias of R-estimators depends neither on the chosen rank-test score-generating function nor on the unknown distribution of the model errors. It depends only on the value of the slope parameter vector and on the covariance matrix of the measurement errors of the regressors.
2. Model and preliminary considerations
Consider the linear regression model

Y_{ni} = β₀ + x_{ni}^⊤ β + e_{ni},  i = 1, …, n,  (2.1)

with unknown parameters β₀ ∈ R, β ∈ R^p. The regressors x_{ni} are either deterministic or random and affected by additive random measurement errors, so that instead of x_{ni} we observe w_{ni} = x_{ni} + v_{ni}, i = 1, …, n, where v_{n1}, …, v_{nn} are p-dimensional random errors, identically distributed with an unknown distribution, and independent of the errors e_{ni}, 1 ≤ i ≤ n. Moreover, there are additive measurement errors in the responses, thus instead of Y_{ni} we observe Y*_{ni} = Y_{ni} + u_{ni}, where u_{n1}, …, u_{nn} are i.i.d. random variables. Thus, in terms of the observable responses and predicting variables, our regression model becomes

Y*_{ni} = β₀ + w_{ni}^⊤ β + e*_{ni},  i = 1, …, n,  (2.2)

where e*_{ni} = e*_{ni}(β) = e_{ni} + u_{ni} − v_{ni}^⊤ β, i = 1, …, n, are i.i.d. random variables.

We are interested in the R-estimator of the slope vector β, considering β₀ as a nuisance parameter. To define these estimators, let R_{ni}(b) be the rank of the residual

Y*_{ni} − w_{ni}^⊤ b = e_{ni} + u_{ni} + x_{ni}^⊤ β − w_{ni}^⊤ b = e_{ni} + u_{ni} − w_{ni}^⊤ b* − v_{ni}^⊤ β,  i = 1, …, n,

where b* = b − β. We shall work with the vector of linear rank statistics

S_n(b) = (S_{nj}(b); j = 1, …, p)^⊤ = n^{−1/2} Σ_{i=1}^n (w_{ni} − w̄_n) a_n(R_{ni}(b)),  (2.3)

where the scores a_n(i), 1 ≤ i ≤ n, are nondecreasing in i and Σ_{i=1}^n a_n(i) = 0.

Hodges and Lehmann [14] introduced a class of estimators of the location parameter θ in one- and two-sample location models, by inverting a class of rank tests for θ. This methodology was extended to linear regression models without measurement error by Jurečková [19], where an estimator of β is defined as

β̂_n = arg min_{b ∈ R^p} Σ_{j=1}^p |S_{nj}(b)|.

This estimator can be seen to be asymptotically equivalent to an estimator obtained by inverting the equations S_{nj}(b) = 0, j = 1, …, p.
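To make the definitions concrete, here is a small Python sketch (an illustration, not part of the paper) of the linear rank statistic S_n(b) of (2.3) with Wilcoxon scores a_n(i) = i/(n+1) − 1/2, which are nondecreasing in i and sum to zero; all concrete numbers below are assumptions for the example.

```python
import numpy as np

def wilcoxon_scores(n):
    # a_n(i) = phi(i/(n+1)) with phi(u) = u - 1/2 (Wilcoxon score function);
    # these scores are nondecreasing and sum to zero.
    i = np.arange(1, n + 1)
    return i / (n + 1) - 0.5

def S_n(b, Y, W):
    """Vector of linear rank statistics S_n(b) of (2.3).
    Y: responses (n,), W: observed regressors (n, p), b: candidate slope (p,)."""
    n = len(Y)
    resid = Y - W @ b
    ranks = resid.argsort().argsort() + 1      # ranks R_ni(b), 1-based
    a = wilcoxon_scores(n)
    Wc = W - W.mean(axis=0)                    # w_ni - w-bar_n
    return Wc.T @ a[ranks - 1] / np.sqrt(n)

rng = np.random.default_rng(0)
n, p, beta = 200, 2, np.array([1.0, 2.0])
X = rng.uniform(-3, 3, size=(n, p))
Y = 1.0 + X @ beta + rng.logistic(size=n)
print(S_n(beta, Y, X))   # each coordinate fluctuates around zero at the true slope
```

The R-estimator of (2.3) drives each coordinate of this vector toward zero; note that S_n(b) is invariant to the intercept β₀, since ranks do not change under a common shift of the residuals.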
Note that this latter estimator is precisely an extension of the Hodges–Lehmann estimator from one- and two-sample location models to linear regression models without measurement error. Under more general conditions, the R-estimators were studied by Koul [22].
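For orientation, the classical one-sample Hodges–Lehmann location estimator, obtained by inverting the signed-rank Wilcoxon test, is the median of the pairwise Walsh averages; this illustrative Python snippet (not from the paper) computes it directly.

```python
import numpy as np

def hodges_lehmann(x):
    """One-sample Hodges-Lehmann location estimate:
    median of all Walsh averages (x_i + x_j)/2 with i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))        # index pairs with i <= j
    walsh = (x[i] + x[j]) / 2.0
    return float(np.median(walsh))

print(hodges_lehmann([1.0, 2.0, 3.0]))    # -> 2.0
```

The regression R-estimators discussed above generalize exactly this inversion idea: instead of centering a signed-rank statistic, they drive the linear rank statistics S_{nj}(b) to zero.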
On the other hand, Jaeckel [17] introduced, as a measure of the rank dispersion of the residuals, an analog of the function

D_n(b) = Σ_{i=1}^n [Y*_{ni} − w_{ni}^⊤ b](a_n(R_{ni}(b)) − ā_n)  (2.4)

in the case of no measurement error, where the w_{ni} are replaced by the x_{ni}. He showed that D_n(b) is convex and piecewise linear in b ∈ R^p. He also showed that −n^{1/2} S_n(b) is the subgradient of D_n(b); hence the estimator defined as a minimizer of D_n exists and is equivalent to the above estimators based on S_n. Both of these estimators are asymptotically equivalent, and Jaeckel's definition of the R-estimator is now generally used in the literature. We use this definition of the R-estimator throughout this paper.

In the absence of measurement errors, that is, if w_{ni} = x_{ni}, u_{ni} = 0, i = 1, …, n, the estimator β̂_n is consistent and asymptotically normal. However, β̂_n is biased in the presence of measurement errors, even asymptotically, unless the true β = 0. Furthermore, we show that it is even asymptotically locally biased, in the sense that the asymptotic distribution of n^{1/2}(β̂_n − n^{−1/2} β₀), with a fixed β₀ ∈ R^p, converges to a normal distribution with a nonzero mean vector and some positive definite covariance matrix.

In the sequel, all limits are taken as n → ∞, unless mentioned otherwise; →_p denotes convergence in probability. We shall now describe the needed assumptions on the underlying entities.

(A.1) The score-generating function ϕ: (0, 1) → R is nondecreasing, square-integrable and skew-symmetric on (0, 1): ϕ(1 − t) = −ϕ(t), 0 < t < 1. The scores a_n(i), i = 1, …, n, are generated by ϕ in either of the following two ways:

a_n(i) = ϕ(i/(n + 1))  or  a_n(i) = E ϕ(U_{n:i}),  i = 1, …, n,

where U_{n:1} ≤ ··· ≤ U_{n:n} are the order statistics of a sample of size n from the uniform (0, 1) distribution.

(F.1) The distribution function F of the model errors e_{ni} has an absolutely continuous density f with a.e. derivative f′.

(F.2) For every u ∈ R, ∫ (|f′(x − tu)|^j / f^{j−1}(x)) dx → ∫ (|f′(x)|^j / f^{j−1}(x)) dx < ∞ as t → 0, j = 2, 3.

(V.1) The errors {u_{ni}, 1 ≤ i ≤ n} are independent of {e_{ni}, v_{ni}, 1 ≤ i ≤ n} and i.i.d. with a generally unknown absolutely continuous density h, having finite Fisher information for location.

(V.2) The measurement error v_{ni} is independent of e_{ni} and its p-dimensional distribution function G has a continuous density g, generally unknown, i = 1, …, n.

(V.3) E V_n → V, where V_n = n^{−1} Σ_{i=1}^n (v_{ni} − v̄_n)(v_{ni} − v̄_n)^⊤ and V is a positive definite p × p matrix. Moreover, sup_{n≥1} E(‖v_{n1}‖³ + ‖x_{n1}‖³) < ∞.

(V.4) E[n^{−1} Σ_{i=1}^n (v_{ni} − v̄_n)(x_{ni} − x̄_n)^⊤] → 0.

(X.1) If the x_{ni} are nonrandom, then assume that Q_n → Q, where

Q_n = n^{−1} Σ_{i=1}^n (x_{ni} − x̄_n)(x_{ni} − x̄_n)^⊤,

and Q is a positive definite p × p matrix. Moreover,

n^{−1} max_{1≤i≤n} (x_{ni} − x̄_n)^⊤ Q_n^{−1} (x_{ni} − x̄_n) → 0.

(X.2) If the regressors x_{ni} are random, then assume that they are independent of e_{ni}, u_{ni}, v_{ni}, i = 1, …, n, and

E[n^{−1} Σ_{i=1}^n (x_{ni} − x̄_n)(x_{ni} − x̄_n)^⊤] → Q,

where Q is a positive definite p × p matrix.

Let m(·), M(·) denote the density and distribution function of e_{ni} + u_{ni}, i = 1, …, n, that is, m(z) = ∫ f(z − t) h(t) dt. This density is absolutely continuous and has finite Fisher information I(m). We need to define

γ_m = −∫_R ϕ(M(z)) dm(z),  A_m(ϕ) = γ_m^{−2} ∫₀¹ ϕ²(u) du,  (2.5)

B = −(Q + V)^{−1} V β₀.

The following theorem gives the asymptotic distribution of the estimator β̂_n when the true parameter value is

β_n = n^{−1/2} β₀,  β₀ ∈ R^p fixed.  (2.6)

Theorem 2.1.
Assume that conditions (A.1), (F.1)–(F.2), (V.1)–(V.4), (X.1)–(X.2) hold. When the true parameter value is β_n, the R-estimator β̂_n is asymptotically normally distributed with bias B = −(Q + V)^{−1} V β₀, that is,

n^{1/2}(β̂_n − β_n) →_D N_p(B, (Q + V)^{−1} A_m(ϕ)).  (2.7)

Theorem 2.1 will be proved in several steps; the proof is given in Section 3. Numerical illustrations of the results are given in Section 4.

Corollary 2.1.
Under the conditions of Theorem 2.1 and under β = β_n = n^{−1/2} β₀, the R-estimator β̂_n has the asymptotic normal distribution

n^{1/2}(β̂_n − (Q + V)^{−1} Q β_n) →_D N_p(0, (Q + V)^{−1} A_m(ϕ)).  (2.8)
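The centering in (2.8) follows from (2.7) by a one-line rearrangement of the bias:

```latex
\beta_n + n^{-1/2}\mathbf{B}
  = n^{-1/2}\bigl[\beta_0 - (\mathbf{Q}+\mathbf{V})^{-1}\mathbf{V}\beta_0\bigr]
  = n^{-1/2}(\mathbf{Q}+\mathbf{V})^{-1}\bigl[(\mathbf{Q}+\mathbf{V})-\mathbf{V}\bigr]\beta_0
  = (\mathbf{Q}+\mathbf{V})^{-1}\mathbf{Q}\,\beta_n .
```

Thus the R-estimator concentrates around the attenuated value (Q + V)^{−1} Q β_n, in analogy with the well-known attenuation of the least squares estimator in errors-in-variables models.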
Notice that the local asymptotic bias cannot be controlled by the choice of the score-generating function ϕ; this choice can only influence the asymptotic variance factor of the estimator. The magnitude of the bias depends entirely on the precision of the measurements, namely on the matrix V. The measurement errors in the responses Y_{ni} affect only the asymptotic variance, not the bias. The result is entirely nonparametric, valid for classes of distributions of model and measurement errors, demanding only a finite first moment and finite (and positive) Fisher information for location of the model error distributions, and finite third moments for the measurement error distributions.

Consider two measurement methods with the same regressors (random or nonrandom), with respective limiting covariance matrices V₁, V₂. Comparing the biases in (2.7) for V₁ and V₂, the first method is considered more precise than the second one if (V₁ + Q)^{−1} V₁ ≺ (V₂ + Q)^{−1} V₂; in other words, if Q^{−1} V₁ ≺ Q^{−1} V₂, where the ordering A ≺ B means that B − A is a positive definite matrix.
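As a numerical illustration of the bias formula (with hypothetical matrices Q, V and parameter β₀, chosen for this example and not taken from the paper), one can evaluate B = −(Q + V)^{−1} V β₀ directly:

```python
import numpy as np

# Hypothetical limit matrices and local parameter (illustration only).
Q = np.array([[3.0, 0.5],
              [0.5, 2.0]])          # limit of the regressor covariances Q_n
V = np.array([[0.8, 0.0],
              [0.0, 0.4]])          # limit of the measurement-error covariances V_n
beta0 = np.array([1.0, 2.0])

# Local asymptotic bias B = -(Q + V)^{-1} V beta0 of Theorem 2.1
B = -np.linalg.solve(Q + V, V @ beta0)
print(B)

# Halving V (more precise measurements) shrinks the bias in this example.
B_half = -np.linalg.solve(Q + 0.5 * V, 0.5 * V @ beta0)
print(np.linalg.norm(B_half) < np.linalg.norm(B))
```

Note that ϕ and the model error distribution appear nowhere in this computation, exactly as the theorem asserts: only V, Q and β₀ enter the bias.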
3. Proof of Theorem 2.1
We shall prove Theorem 2.1 in several steps. Notice that if we observe Y*_{ni} = Y_{ni} + u_{ni} instead of Y_{ni}, then e*_{ni} = e_{ni} + u_{ni}, i = 1, …, n, are still i.i.d. random variables with density m(z) = ∫ f(z − t) h(t) dt. The steps of the proof are parallel for both densities f and m of the model errors; the measurement errors in the Y_{ni} affect only the asymptotic variance of the estimate, not the bias. Noting this, we shall prove the theorem assuming u_{ni} ≡ 0, i = 1, …, n. In the sequel, we shall suppress the subscript n whenever it does not cause confusion.

The steps of the proof are as follows:

(1) Asymptotic representation of the linear rank statistic

S_n(0, 0) = n^{−1/2} Σ_{i=1}^n (w_{ni} − w̄_n) a_n(R_{ni}(0))  (3.1)

by a sum of independent summands. Here w_{ni} = x_{ni} + v_{ni}, i = 1, …, n, while x_{n1}, …, x_{nn} are either i.i.d. random vectors or nonrandom vectors, and v_{n1}, …, v_{nn} are i.i.d. random vectors.

(2) Contiguity of the sequence {Q_n} of distributions of (e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n), with b*_n = n^{−1/2} b*, β_n = n^{−1/2} β₀ for b*, β₀ ∈ R^p fixed, with respect to the sequence {P_n} of distributions of e_{ni}, i = 1, …, n.

(3) Asymptotic representation of the linear rank statistic (2.3) under the contiguous sequence of distributions {Q_n}, and the resulting asymptotic linearity of (2.3) in the parameters b*, β₀.

(4) Uniform asymptotic quadraticity of D_n in the parameters b*, β₀ under {Q_n}, as a result of (3) and of the convexity of D_n.

(5) The resulting asymptotic distribution and bias of β̂_n in the case u_{ni} ≡ 0, i = 1, …, n.

(6) Asymptotic distribution and bias of β̂_n in the case of nonzero u_{ni}, i = 1, …, n.

3.1. Asymptotic representation of S_n(0, 0)

Assume that u_{ni} = 0, i = 1, …, n. That is, for now we assume that there is no measurement error in the response variables Y_{ni}. Let

Z_n = n^{−1/2} Σ_{i=1}^n (w_{ni} − w̄_n) ϕ(F(e_{ni})).
We are ready to state and prove the following lemma.
Lemma 3.1.
Under the conditions of Theorem 2.1, the statistic S_n(0, 0) admits the asymptotic representation

S_n(0, 0) = Z_n + o_p(1).  (3.2)

Proof.
The proof is adapted from [28]. If b = β = 0, then (Y_{n1}, …, Y_{nn}) = (e_{n1}, …, e_{nn}). Let R_{n1}, …, R_{nn} denote their ranks. Further denote r_{ni} = a_n(R_{ni}) − ϕ(F(e_{ni})), i = 1, …, n. Let σ_j² be the variance of w_{ij}, i = 1, …, n, for j = 1, …, p, and let s² = Σ_{j=1}^p σ_j². Notice that (r_{n1}, …, r_{nn}) and (w_1, …, w_n) are independent. Consider the conditional squared distance

E_G{(S_n − Z_n)^⊤(S_n − Z_n) | e_1, …, e_n}
= n^{−1} E_G{ Σ_{i=1}^n Σ_{k=1}^n (w_i − w̄_n)^⊤ (w_k − w̄_n) r_i r_k | e_1, …, e_n }
= n^{−1} Σ_{i=1}^n Σ_{k=1}^n r_i r_k E_G{ Σ_{j=1}^p (w_{ij} − w̄_j)(w_{kj} − w̄_j) | e_1, …, e_n }
= n^{−1} { Σ_{i=1}^n Σ_{k=1}^n r_i r_k Σ_{j=1}^p (x_{ij} − x̄_j)(x_{kj} − x̄_j) + s² Σ_{i=1}^n (r_i − r̄)² }
= Σ_{j=1}^p [ n^{−1/2} Σ_{i=1}^n (x_{ij} − x̄_j) r_i ]² + s² n^{−1} Σ_{i=1}^n (r_i − r̄)².

Then (3.2) follows from Theorems V.1.4.a, b and V.1.6.a of [10]. □
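The representation (3.2) is easy to check numerically. The following Python sketch (illustrative only; logistic F, Wilcoxon ϕ(u) = u − 1/2 and all distributional choices are assumptions for the example) compares S_n(0, 0) with Z_n on one simulated sample:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 5000, 2

x = rng.uniform(-3, 3, size=(n, p))      # regressors x_ni
v = rng.normal(0, 0.5, size=(n, p))      # measurement errors v_ni
w = x + v                                # observed regressors w_ni
e = rng.logistic(size=n)                 # model errors with F logistic

F = lambda z: 1.0 / (1.0 + np.exp(-z))   # logistic distribution function

ranks = e.argsort().argsort() + 1        # ranks R_ni under b = beta = 0
a = ranks / (n + 1) - 0.5                # Wilcoxon scores a_n(R_ni)
wc = w - w.mean(axis=0)

S = wc.T @ a / np.sqrt(n)                # S_n(0, 0) as in (3.1)
Z = wc.T @ (F(e) - 0.5) / np.sqrt(n)     # Z_n with phi(F(e)) = F(e) - 1/2
print(np.abs(S - Z).max())               # small remainder, the o_p(1) in (3.2)
```

The remainder is driven by a_n(R_{ni}) − ϕ(F(e_{ni})), essentially the gap between the empirical and true distribution functions, and shrinks as n grows.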
For any two probability measures P and Q, absolutely continuous with respect to a σ-finite measure ν with p = dP/dν, q = dQ/dν, let

H(P, Q) = [ (1/2) ∫ (√p − √q)² dν ]^{1/2} = [ 1 − ∫ √(pq) dν ]^{1/2}

denote the Hellinger distance between P and Q.

Let {P_{n1}, …, P_{nn}} and {Q_{n1}, …, Q_{nn}} be two triangular arrays of probability measures defined on a measurable space (X, A) with densities p_{ni}, q_{ni} with respect to σ-finite measures µ_i [which can also be µ_i = P_{ni} + Q_{ni}, i = 1, …, n]. Denote by P_n^{(n)} = Π_{i=1}^n P_{ni} and Q_n^{(n)} = Π_{i=1}^n Q_{ni} the product measures, n = 1, 2, …. Oosterhoff and van Zwet [27] proved that {Q_n^{(n)}} is contiguous with respect to {P_n^{(n)}} if and only if

lim sup_{n→∞} Σ_{i=1}^n H²(P_{ni}, Q_{ni}) < ∞,  (3.3)

lim_{n→∞} Σ_{i=1}^n Q_{ni}{ q_{ni}(X_{ni}) / p_{ni}(X_{ni}) ≥ c_n } = 0  for all c_n → ∞.  (3.4)

Note that in the case P_{ni} ≡ P_n, p_{ni} ≡ p_n, and Q_{ni} ≡ Q_n, q_{ni} ≡ q_n, not depending on i,

Σ_{i=1}^n H²(P_{ni}, Q_{ni}) = (n/2) ∫ [√q_n(z) − √p_n(z)]² dz
= (n/2) ∫ (q_n(z) − p_n(z))² / [√q_n(z) + √p_n(z)]² dz  (3.5)
≤ (n/2) ∫ (q_n(z) − p_n(z))² / p_n(z) dz.

Moreover, for c_n > 2 and d_n = c_n − 1,

Σ_{i=1}^n Q_{ni}{ q_{ni}(X_{ni}) / p_{ni}(X_{ni}) ≥ c_n } = n Q_n{ (q_n(X_n) − p_n(X_n)) / p_n(X_n) ≥ d_n }
≤ d_n^{−3} n ∫ (|q_n(x) − p_n(x)|³ / p_n³(x)) q_n(x) dx  (3.6)
≤ d_n^{−3} n ∫ (|q_n(x) − p_n(x)|⁴ / p_n³(x)) dx + d_n^{−3} n ∫ (|q_n(x) − p_n(x)|³ / p_n²(x)) dx.

Now, let Y_{ni} = x_{ni}^⊤ β_n + e_{ni}, i = 1, …, n, where the e_{ni} are i.i.d. with distribution function F and density f satisfying (F.1) and (F.2). Consider the residuals

Y_{ni} − (w_{ni} − w̄_n)^⊤ b_n = e_{ni} + (x_{ni} − x̄_n)^⊤ β_n − (w_{ni} − w̄_n)^⊤ b_n
= e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n,

i = 1, …, n, where b_n = n^{−1/2} b, β_n = n^{−1/2} β₀, b*_n = n^{−1/2} b*, b* = b − β₀, with fixed b, β₀ ∈ R^p. Using (3.3) and (3.4), we shall prove the following lemma.

Lemma 3.2.
Under the conditions of Theorem 2.1, the sequence {Q_n^{(n)}} is contiguous with respect to {P_n^{(n)}}, where Q_n^{(n)} = Π_{i=1}^n Q_{ni}, P_n^{(n)} = Π_{i=1}^n P_{ni}, P_{ni} is the distribution of e_{ni} and Q_{ni} is the distribution of (e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n), i = 1, …, n.

Proof.
We shall distinguish two cases: the x_{ni} are either i.i.d. random vectors or nonrandom vectors.

We start with the first case, where w_{n1}, …, w_{nn} are i.i.d. random vectors. Note that U_i := (w_{ni} − w̄_n)^⊤ b* + (v_{ni} − v̄_n)^⊤ β₀, i = 1, …, n, are i.i.d. random variables. Let k denote the common density function of the U_i. Then Q_{ni}, P_{ni} do not depend on i, and q_n(x) ≡ ∫ f(x − n^{−1/2} u) k(u) du, p_n(x) ≡ f(x). Hence, by the Cauchy–Schwarz inequality and the Fubini theorem,

n ∫ (q_n(x) − p_n(x))² / p_n(x) dx
= n ∫ { ∫ [f(x − n^{−1/2} u) − f(x)] k(u) du }² dx / f(x)
≤ n ∫∫ [f(x − n^{−1/2} u) − f(x)]² k(u) / f(x) du dx
≤ n ∫∫ [ ∫_{−n^{−1/2}}^{n^{−1/2}} |u f′(x − tu)| dt ]² k(u) / f(x) du dx
≤ 2 n^{1/2} ∫∫ [ ∫_{−n^{−1/2}}^{n^{−1/2}} |f′(x − tu)|² dt ] u² k(u) / f(x) du dx
= 2 n^{1/2} ∫ ∫_{−n^{−1/2}}^{n^{−1/2}} [ ∫ |f′(x − tu)|² / f(x) dx ] u² k(u) dt du  for all n ≥ 1.

Hence, by (3.5), (F.2) applied with j = 2, and by (V.3), which guarantees ∫ u² k(u) du < ∞,

lim sup_n Σ_{i=1}^n H²(P_{ni}, Q_{ni}) ≤ 2 I(f) ∫ u² k(u) du < ∞.  (3.7)

Similarly, the bound

n ∫ |q_n(x) − p_n(x)|³ / p_n²(x) dx ≤ 2 n^{1/2} ∫ ∫_{−n^{−1/2}}^{n^{−1/2}} [ ∫ |f′(x − tu)|³ / f²(x) dx ] |u|³ k(u) dt du  for all n ≥ 1,

together with (3.6), (F.2) applied with j = 3, and (V.3), which guarantees ∫ |u|³ k(u) du < ∞, yields

lim_n Σ_{i=1}^n Q_{ni}{ q_{ni}(Y_{ni}) / p_{ni}(Y_{ni}) ≥ c_n }
≤ lim_n d_n^{−3} C { ∫ (|f′(x)| / f(x))³ f(x) dx ∫ |u|³ k(u) du + I(f) ∫ u² k(u) du } = 0,

with some constant C < ∞, since d_n → ∞. This ensures the validity of (3.4), and completes the proof of contiguity in the present case.

Next, consider the case where x_{n1}, …, x_{nn} are nonrandom, and we observe w_{ni} = x_{ni} + v_{ni}, i = 1, …, n. Let k denote the density of (v_{ni} − v̄_n)^⊤ b, i = 1, …, n.
Again, by (3.5),

Σ_{i=1}^n H²(P_{ni}, Q_{ni})
≤ (1/2) Σ_{i=1}^n ∫ { ∫ [f(e − n^{−1/2} u) − f(e)] k(u − (x_{ni} − x̄_n)^⊤ b*) du }² de / f(e)
≤ 2 n^{1/2} ∫ ∫_{−n^{−1/2}}^{n^{−1/2}} [ ∫ |f′(e − tu)|² / f(e) de ] dt [ n^{−1} Σ_{i=1}^n u² k(u − (x_{ni} − x̄_n)^⊤ b*) ] du.

Hence, by (F.2) and the change-of-variable formula,

lim sup_n Σ_{i=1}^n H²(P_{ni}, Q_{ni}) ≤ C [ ∫ u² k(u) du + b*^⊤ Q b* ] < ∞.  (3.8)

Similarly one verifies (3.4) here. □

Lemmas 3.1 and 3.2 enable us to extend the approximation of the rank statistic S_n(b*_n, β_n) by a sum of independent random variables to the contiguous sequence of distributions. Let

T_n(b*_n, β_n) = n^{−1/2} Σ_{i=1}^n (w_{ni} − w̄_n) ϕ(F(e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n)).

We have the following corollary.

Corollary 3.1.
Under the conditions of Theorem 2.1, and under {Q_n^{(n)}},

S_n(b*_n, β_n) = n^{−1/2} Σ_{i=1}^n (w_{ni} − w̄_n) a_n(R(e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n))  (3.9)
= T_n(b*_n, β_n) + o_p(1).

Hence, S_n(b*_n, β_n) − S_n(0, 0) = T_n(b*_n, β_n) − T_n(0, 0) + o_p(1).

3.2. Asymptotic linearity of S_n(b*_n, β_n)

Lemma 3.3.
Under the conditions of Theorem 2.1,

‖ S_n(b*_n, β_n) − S_n(0, 0) + γ[(Q + V) b* + V β₀] ‖ →_p 0,  (3.10)

where

γ = ∫₀¹ ( −f′(F^{−1}(u)) / f(F^{−1}(u)) ) ϕ(u) du = −∫_R ϕ(F(z)) df(z).  (3.11)

Proof.
Consider the sequence of functions {ϕ^{(k)}(·)}_{k=1}^∞,

ϕ^{(k)}(u) = ϕ(1/(k + 1)) I[0 < u ≤ 1/(k + 1)] + ϕ(u) I[1/(k + 1) < u ≤ k/(k + 1)] + ϕ(k/(k + 1)) I[k/(k + 1) < u < 1],  (3.12)

that is, ϕ truncated at the tails. Then, by Lemma V.1.6.a of [10], ϕ^{(k)} is nondecreasing and bounded on (0, 1) and

lim_{k→∞} ∫₀¹ [ϕ^{(k)}(u) − ϕ(u)]² du = 0.  (3.13)

The function ϕ^{(k)} has an at most countable set B_k of discontinuity points. Observe that assumption (V.3) implies that

n^{−1/2} max_{1≤i≤n} {‖w_{ni} − w̄_n‖ + ‖v_{ni} − v̄_n‖} →_p 0.

This fact, together with the uniform continuity of F, implies that

sup_{e∈R, 1≤i≤n} |F(e − n^{−1/2}(w_{ni} − w̄_n)^⊤ b* − n^{−1/2}(v_{ni} − v̄_n)^⊤ β₀) − F(e)| →_p 0.

Hence, ϕ^{(k)}(F(e − n^{−1/2}(w_{ni} − w̄_n)^⊤ b* − n^{−1/2}(v_{ni} − v̄_n)^⊤ β₀)) converges to ϕ^{(k)}(F(e)) in probability, uniformly in i = 1, …, n. This, in turn, implies that the conditional expectation

E[(ϕ^{(k)}(F(e_{ni} − n^{−1/2}(w_{ni} − w̄_n)^⊤ b* − n^{−1/2}(v_{ni} − v̄_n)^⊤ β₀)) − ϕ^{(k)}(F(e_{ni}))) | v_{ni}, x_{ni}]

converges to 0, in probability, uniformly in i = 1, …, n and in k.

Let S_n^{(k)}(b*, β₀) and T_n^{(k)}(b*, β₀) be the analogues of S_n(b*, β₀), T_n(b*, β₀), respectively, with ϕ replaced by ϕ^{(k)}. Then we can bound the norm of the covariance matrix of T_n^{(k)}(b*, β₀) − T_n^{(k)}(0, 0) for any fixed k in the following way. Denote

A_n^{(k)} = E{[T_n^{(k)}(b*_n, β_n) − T_n^{(k)}(0, 0)][T_n^{(k)}(b*_n, β_n) − T_n^{(k)}(0, 0)]^⊤}.

Then

A_n^{(k)} = E{ n^{−1} Σ_{i=1}^n (w_{ni} − w̄_n)(w_{ni} − w̄_n)^⊤ [ϕ^{(k)}(F(e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n)) − ϕ^{(k)}(F(e_{ni}))]² }
= n^{−1} Σ_{i=1}^n E{ (w_{ni} − w̄_n)(w_{ni} − w̄_n)^⊤  (3.14)
× E[(ϕ^{(k)}(F(e_{ni} − (w_{ni} − w̄_n)^⊤ b*_n − (v_{ni} − v̄_n)^⊤ β_n)) − ϕ^{(k)}(F(e_{ni})))² | v_{ni}, x_{ni}] }.

Hence,

‖A_n^{(k)}‖ ≤ { ‖n^{−1} Σ_{i=1}^n (w_{ni} − w̄_n)(w_{ni} − w̄_n)^⊤ − (Q + V)‖ + ‖Q + V‖ } · o(1) = {‖Q + V‖ + o(1)} · o(1).

This, together with the fact that E T_n^{(k)}(0, 0) = 0, implies

‖T_n^{(k)}(b*_n, β_n) − T_n^{(k)}(0, 0) − E T_n^{(k)}(b*_n, β_n)‖ →_p 0.  (3.15)

Furthermore, for any fixed k and for fixed b*, β₀,

T_n^{(k)}(b*_n, β_n) − T_n^{(k)}(0, 0) + γ_k [(Q + V) b* + V β₀] →_p 0,  (3.16)

where

γ_k = −∫_R ϕ^{(k)}(F(e)) f′(e) de = −∫₀¹ ϕ^{(k)}(u) (f′(F^{−1}(u)) / f(F^{−1}(u))) du.

Indeed (we put x̄_n = v̄_n = 0 for the sake of brevity),

n^{−1/2} Σ_{i=1}^n E{ w_{ni} E[ ϕ^{(k)}(F(e_{ni} − n^{−1/2} w_{ni}^⊤ b* − n^{−1/2} v_{ni}^⊤ β₀)) − ϕ^{(k)}(F(e_{ni})) + γ_k n^{−1/2}(w_{ni}^⊤ b* + v_{ni}^⊤ β₀) | v_{ni}, x_{ni} ] }
= n^{−1/2} Σ_{i=1}^n E{ w_{ni} ( ∫_R ϕ^{(k)}(F(z)) d[F(z + n^{−1/2} w_{ni}^⊤ b* + n^{−1/2} v_{ni}^⊤ β₀) − F(z)] − n^{−1/2}[w_{ni}^⊤ b* + v_{ni}^⊤ β₀] ∫_R ϕ^{(k)}(F(z)) f′(z) dz ) }
= n^{−1/2} Σ_{i=1}^n E{ w_{ni} ∫_R ϕ^{(k)}(F(z)) d[F(z + n^{−1/2} w_{ni}^⊤ b* + n^{−1/2} v_{ni}^⊤ β₀) − F(z) − n^{−1/2}(w_{ni}^⊤ b* + v_{ni}^⊤ β₀) f(z)] } → 0.

Moreover, we have

(γ_k − γ)² = ⟨ϕ^{(k)} − ϕ, −f′(F^{−1}(·))/f(F^{−1}(·))⟩² ≤ ‖ϕ^{(k)} − ϕ‖² ‖f′(F^{−1}(·))/f(F^{−1}(·))‖²  (3.17)
= I(f) ‖ϕ^{(k)} − ϕ‖² → 0 as k → ∞.

Using (3.16), (3.17), Lemmas 3.1 and 3.2, Corollary 3.1 and Lemma 3.5 in [18], we obtain that

P(‖S_n(b*_n, β_n) − S_n^{(k)}(b*_n, β_n)‖ > ε) < ε  for all k > k(ε) and all n > n(k),

and finally we arrive at (3.10). □

Recall that ā_n = 0 under (A.1). Rewrite the Jaeckel dispersion in the presence of measurement errors in the form

D_n(b) = Σ_{i=1}^n [Y_{ni} − w_{ni}^⊤ b] a_n(R_{ni}(b)),

or, equivalently, in the form

D_n(b*, β₀) = Σ_{i=1}^n [e_{ni} − (w_{ni} − w̄_n)^⊤ b* − (v_{ni} − v̄_n)^⊤ β₀] a_n(R(e_{ni} − w_{ni}^⊤ b* − v_{ni}^⊤ β₀)),

where b* = b − β₀. It is a piecewise linear, convex function of b and b*, respectively. Hence, the minimizer β̂_n = arg min_{b∈R^p} D_n(b) exists, and is considered as an estimate of β in model (2.1).
By [17], the partial derivatives of D_n(b) exist for almost all b, and where they exist, they are equal to

∂D_n(b)/∂b_j = −n^{1/2} S_{nj}(b) = −Σ_{i=1}^n (w_{nij} − w̄_j) a_n(R_{ni}(b)),  j = 1, …, p.

In other words,

∇D_n(b*, β₀) = −n^{1/2} S_n(b*, β₀) = −Σ_{i=1}^n (w_{ni} − w̄_n) a_n(R(e_{ni} − w_{ni}^⊤ b* − v_{ni}^⊤ β₀)),

where ∇ denotes the subgradient.

Consider the quadratic function

C_n(b*, β₀) = (γ/2) b*^⊤ (Q + V) b* − b*^⊤ S_n(0, 0) + γ b*^⊤ V β₀ + D_n(0, 0).

Then D_n and C_n are both convex functions and D_n(0, 0) = C_n(0, 0). Moreover,

∇[D_n(n^{−1/2} b*, n^{−1/2} β₀) − C_n(b*, β₀)] = −[S_n(b*_n, β_n) − S_n(0, 0) + γ(Q + V) b* + γ V β₀].

Hence, it follows from (3.10) that, for b*_n = n^{−1/2} b*, β_n = n^{−1/2} β₀ with b*, β₀ ∈ R^p fixed,

‖∇[D_n(n^{−1/2} b*, n^{−1/2} β₀) − C_n(b*, β₀)]‖ →_p 0.

Using the convexity arguments in the Appendix of [13] and the Convexity Lemma of [29], we conclude that

sup |D_n(n^{−1/2} b*, n^{−1/2} β₀) − (γ/2) b*^⊤ (Q + V) b* + b*^⊤ S_n(0, 0) − γ b*^⊤ V β₀ − D_n(0, 0)| = o_p(1),

where the supremum is taken over the set {‖b*‖ ≤ C, ‖β₀‖ ≤ C}. Hence, following the arguments in the proof of Theorem 1 in [29], we conclude that, under the local alternative β_n = n^{−1/2} β₀,

arg min_{b*} D_n(n^{−1/2} b*, n^{−1/2} β₀)

is asymptotically equivalent to

arg min_{b*} [ (γ/2) b*^⊤ (Q + V) b* − b*^⊤ S_n(0, 0) + γ b*^⊤ V β₀ ].  (3.18)

The solution of (3.18) equals

b* = b − β₀ = n^{1/2}(β̂_n − β_n) = γ^{−1}(Q + V)^{−1} S_n(0, 0) − (Q + V)^{−1} V β₀.

Hence, in the linear model with the local value of the regression parameter β, when

Y_{ni} = x_{ni}^⊤ β_n + e_{ni},  β_n = n^{−1/2} β₀,

and we observe w_{ni} = x_{ni} + v_{ni} instead of x_{ni}, i = 1, …, n, the R-estimator is asymptotically normally distributed with bias B = −(Q + V)^{−1} V β₀, that is,

n^{1/2}(β̂_n − n^{−1/2} β₀) →_D N_p(B, (Q + V)^{−1} A(ϕ)),  B = −(Q + V)^{−1} V β₀.  (3.19)

Finally, as we have already mentioned, all of the above arguments remain valid when we replace e_{ni} with e_{ni} + u_{ni}, i = 1, …, n. This completes the proof of Theorem 2.1.
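To connect the proof with practice, here is a small Python sketch (illustrative only; the paper's own simulations were done in R, and the sample sizes and distributions below are assumptions) that minimizes Jaeckel's dispersion D_n(b) of (2.4) with Wilcoxon scores over a grid for a single slope, once without and once with measurement errors in the regressor:

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 400, 2.0
x = rng.uniform(-3, 3, size=n)           # true regressors; here Q = 36/12 = 3
y = 1.0 + beta * x + rng.logistic(size=n)

def jaeckel_D(b, y, w):
    """Jaeckel's dispersion (2.4) with centered Wilcoxon scores (a-bar = 0)."""
    m = len(y)
    resid = y - w * b
    ranks = resid.argsort().argsort() + 1
    a = ranks / (m + 1) - 0.5
    return float(np.sum(resid * a))

def r_estimate(y, w, grid):
    # D_n is convex and piecewise linear in b, so a fine grid search suffices here
    vals = [jaeckel_D(b, y, w) for b in grid]
    return float(grid[int(np.argmin(vals))])

grid = np.linspace(0.0, 4.0, 2001)
b_clean = r_estimate(y, x, grid)                 # no measurement error
w = x + rng.normal(0.0, 1.0, size=n)             # contaminated regressors; V = 1
b_noisy = r_estimate(y, w, grid)
print(b_clean, b_noisy)
```

With these assumed parameters the attenuation suggested by the bias formula is roughly Q/(Q + V) = 3/4 of the true slope, so the noisy estimate should fall visibly below the clean one.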
4. Numerical illustration
The following simulation study illustrates the effect of measurement errors in regressors on the finite-sample performance of R-estimates. The empirical bias (and variance) of R-estimates are computed and compared for various measurement error models. For the sake of comparison, the biases and variances are also computed for the least squares estimate (LSE) and the least absolute deviation (L1) estimate, under the same setup. Moreover, we compare deterministic and random regressors.

All the simulations were performed in the statistical software R using standard tools and libraries. For the minimization of (2.4), the functions optimize and optim were used, with the 0.5-regression quantile as the initial estimate. The random number generator was initialized with set.seed(15).

The results illustrate that the bias of the R-estimate is surprisingly stable with respect to the sample size; the bias corresponding to small n is comparable to the asymptotic one derived in Theorem 2.1. Notice that the bias of the R-estimator differs only slightly from the biases of the LSE and L1-estimators.

Consider first the regression line model

Y_i = β₀ + x_i β + e_i,  i = 1, …, n,

where the Y_i are measured accurately, while instead of x_i we observe only w_i = x_i + v_i, i = 1, …, n. The R-estimator of the parameter β is based on the Wilcoxon scores generated by the score function ϕ(u) = u − 1/2. The parameters are β₀ = 1, β = 2, and the model errors e_i follow the logistic distribution. In Tables 1 and 2, the empirical bias of the R-estimator based on Wilcoxon scores is compared for various sample sizes (n = 10 up to n = 1000, together with the asymptotic value, n = ∞). The regressors x_i are deterministic in Table 1 and random in Table 2; in both cases they were generated from a uniform distribution. The measurement errors v_i are either uniformly or normally distributed, i = 1, …, n.

Table 1. Empirical bias of the R-estimator for various n and measurement errors v_i; nonrandom regressors x_i. Columns: n = 10, 20, 50, 100, 200, 500, 1000, ∞; rows: several uniform (U) and normal (N) distributions of v_i. [The numerical entries are not recoverable from this source; see the published article.]

Table 2. Empirical bias of the R-estimator for various n and measurement errors v_i; random regressors x_i. Columns and rows as in Table 1. [The numerical entries are not recoverable from this source; see the published article.]

Comparing Tables 1 and 2 enables one to see the difference between deterministic and random regressors: the bias differs more from its asymptotic value in the case of deterministic regressors than in the case of random regressors; this can be caused by the slower rate of convergence.

Table 3 compares the empirical bias and variance (in parentheses) of the R-estimator based on Wilcoxon scores, of the LSE and of the L1-estimate for the sample size n = 50, with random regressors x_i generated from a uniform distribution; the model errors e_i are generated from the normal, logistic, Laplace, Pareto and Cauchy distributions, and the measurement errors v_i follow various distributions, similarly as in Tables 1 and 2.

Table 3.
Empirical bias (variance) of the R-estimator, the LSE and the L_1-estimator for various measurement errors v_i and model errors e_i; n = 50. [Columns: model-error distributions (normal, logistic, Laplace, Pareto, Cauchy); rows: distributions of v_i; entries omitted.]

Consider the model

Y_i = β_0 + x_{i,1} β_1 + x_{i,2} β_2 + e_i,  i = 1, ..., n,

where again the Y_i are measured accurately, but instead of x_i we observe only w_i = x_i + v_i, i = 1, ..., n. The R-estimator of the parameter β = (β_1, β_2)^⊤ is based on Wilcoxon scores generated by the score function ϕ(u) = u − 1/2. Here n = 50, the parameters are β_0 = 1, β_1 = 2, β_2 = 1, and the random regressors x_i = (x_{i,1}, x_{i,2})^⊤ are generated from two-dimensional normal distributions N(μ, S_ν), ν = 1, 2, 3, with mean μ and covariance matrices S_1, S_2, S_3.

Table 4 compares the empirical bias and variance (in parentheses) of the R-estimator based on Wilcoxon scores with those of the LSE and the L_1-estimator for various distributions of the measurement errors v_i and model errors e_i.

We have also computed R-estimates generated by other score functions, for example the van der Waerden and median scores; other simulation designs were considered as well: various sample sizes n, values of the parameters, and distributions of the regressors, of the measurement errors v_i and u_i, and of the model errors. It is of interest that the results for the corresponding R-estimates are quite similar to those presented in the tables.

The simulation study confirms that R-estimates in measurement error models are biased, as are the other usual estimates. The bias is relatively stable with respect to the sample size and to the distribution of the model errors.
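The size of the LSE bias itself can be checked against the classical attenuation formula. The following sketch (our illustration, not taken from the paper; the particular variances are assumptions) verifies numerically that the LSE slope converges to β · var(x)/(var(x) + var(v)), so its asymptotic bias depends only on the precision of the measurements:

```python
# Illustrative check of the classical attenuation formula for the LSE
# under w = x + v; the variances chosen here are our assumptions.
import numpy as np

rng = np.random.default_rng(15)
n, beta = 100_000, 2.0
x = rng.normal(0.0, 1.0, n)              # var(x) = 1
v = rng.normal(0.0, 0.5, n)              # var(v) = 0.25
y = x * beta + rng.logistic(0.0, 1.0, n) # logistic model errors
w = x + v                                # observed regressors

C = np.cov(w, y)
slope_lse = C[0, 1] / C[0, 0]            # simple-regression LSE slope
attenuated = beta * 1.0 / (1.0 + 0.25)   # beta * var(x)/(var(x)+var(v)) = 1.6
print(slope_lse, attenuated)
```

With var(x) = 1 and var(v) = 0.25, the fitted slope concentrates near 2 · 0.8 = 1.6 whatever the model-error distribution, in line with the observation that the bias is a matter of measurement precision.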
Table 4.
Empirical bias (variance) of the R-estimator, the LSE and the L_1-estimator for various measurement errors v_i and model errors e_i; n = 50. [Columns: model-error distributions (normal, logistic, Laplace, Pareto, Cauchy); rows: estimates of β_1 and β_2 under the regressor distributions N(μ, S_1), N(μ, S_2), N(μ, S_3); entries omitted.]

The R-estimates provide meaningful results as long as the e_i have finite Fisher information; even under normal errors, their empirical variances are only slightly greater than that of the LSE. The bias and other properties of the R-estimates are comparable with those of the least squares and L_1 estimates unless the distribution of the model errors e_i is heavy-tailed, in which case the LSE fails. Generally, the reduction of the bias is rather a matter of measurement precision, of calibration and of repeated measurements.

Acknowledgements
The authors thank the Referee for his/her comments, which helped to improve the presentation of the text. The work of R. Navrátil and J. Jurečková was partially done during their visits to Michigan State University, hosted by the Department of Statistics and Probability. Research of J. Jurečková was partially supported by the Grant GA ČR 201/12/0083. Research of H.L. Koul was supported in part by the NSF DMS Grant 1250271. Research of R. Navrátil was supported by the Grant SVV-2013-267 315, and J. Picek and R. Navrátil were supported by the Project Klimatext CZ.1.07/2.3.00/20.0086.