Quantile regression methods for first-price auctions
Nathalie Gimenes, Department of Economics, PUC-Rio, Brazil. [email protected]
Emmanuel Guerre, School of Economics, University of Kent, United Kingdom. [email protected]
September 2019

Abstract

The paper proposes a sieve quantile regression approach for first-price auctions with symmetric risk-neutral bidders under the independent private value paradigm. It is first shown that a private value quantile regression model generates a quantile regression for the bids. The private value quantile regression can be easily estimated from the bid quantile regression and its derivative with respect to the quantile level. A new local polynomial technique is proposed to estimate the latter over the whole quantile level interval. Plug-in estimation of functionals is also considered, as needed for the expected revenue or the case of CRRA risk-averse bidders, which is amenable to our framework. A quantile regression analysis of USFS timber auctions is found more appropriate than the homogenized bid methodology and illustrates the contribution of each explanatory variable to the private value distribution.
JEL: C14, L70
Keywords: First-price auction; independent private value; dimension reduction; quantile regression; local polynomial estimation; sieve estimation; boundary correction.
A previous version of this paper has been circulated under the title "Quantile regression methods for first-price auction: a signal approach". The authors acknowledge useful discussions and comments from Xiaohong Chen, Valentina Corradi, Yanqin Fan, Phil Haile, Xavier d'Haultfoeuille, Vadim Marmer, Isabelle Perrigne, Martin Pesendorfer and Quang Vuong, and the audience of many conferences and seminars. Nathalie Gimenes also thanks Ying Fan and Ginger Jin for encouragements. All remaining errors are our responsibility. Both authors would like to thank the School of Economics and Finance, Queen Mary University of London, for generous funding.
Introduction
Various quantile approaches have been recently proposed for the Econometrics of Auctions. Haile, Hong and Shum (2003, HHS hereafter) have used monotonicity of the bidding strategy to build a quantile test of the independent private value null hypothesis. Milgrom (2001, Theorem 4.7) reformulates the identification relation of Guerre, Perrigne and Vuong (2000, GPV afterwards) using quantile functions. The risk aversion identification result of Guerre, Perrigne and Vuong (2009, GPV09 hereafter) heavily relies on the bid quantile function in first-price auctions. Zincenko (2018) develops a corresponding nonparametric estimation method. Liu and Luo (2017) and Liu and Vuong (2018) have respectively developed quantile-based tests for the null of exogenous participation and for monotonicity of the bidding strategy. Other authors have considered quantile-based estimation of the private value distribution. Gimenes (2017) has implemented a quantile regression approach for ascending auctions. See also Menzel and Morganti (2013), who proposed an order statistics approach. For first-price auctions, Marmer and Shneyerov (2012) have proposed a quantile-based estimator of the private value probability density function (pdf), which is an alternative to the two-step GPV method. Guerre and Sabbah (2012) have noted that the private value quantile function can be estimated using a one-step procedure from the estimation of the bid quantile function and its first derivative. Enache and Florens (2015) have developed an inverse problem approach. The two-step method of GPV focuses on the private value pdf, which is quite hard to estimate. Estimating the pdf is useful for descriptive purposes and for the computation of important moments, such as the expected revenue. But the latter can also be achieved using quantile functions, as moments are easily computed by integrating them.
As noted in Milgrom (2001), in the independent private value setting, the value function of a bidder observing a uniform signal is nothing else than the private value quantile function, so that a quantile approach is especially relevant in auction settings. Nonparametric density estimation is notoriously affected by the curse of dimensionality, and parsimonious models addressing this issue for densities are less rich than for quantile functions, where both single-index modelling, as already used in an auction framework by Marmer, Shneyerov and Xu (2013b), and additive specifications are available. A simpler specification is the homogenized bid model of HHS, which postulates a regression model with iid residuals for the private values. As shown in our empirical application and in Gimenes (2017) for ascending auctions, it may fail to capture nonlinear dependence of the private values on the auction covariates. In addition, it still involves a GPV step that may not perform well in small samples.

The present paper develops a quantile regression methodology for first-price auctions, which includes parsimonious but flexible models suitable for moderate samples. The parameter of interest is the private value conditional quantile function given some auction-specific covariates, which can be estimated faster than the conditional pdf. A key aspect of our approach is that the bid conditional quantile function is a linear functional of the private value one. It follows that the popular quantile regression model of Koenker and Bassett (1978) can play a central role in our methodology, as it enjoys an important stability property: a private value quantile regression model generates a bid quantile regression model. The private value quantile function is a linear combination of the bid quantile function and its first derivative with respect to the quantile level, a simple identification result which is the basis of our estimation procedure.
This also applies to the linear sieve quantile regression of Belloni, Chernozhukov, Chetverikov and Fernández-Val (2017). Following Horowitz and Lee (2005), the latter can be tailored to additive quantile models, which can be better estimated than saturated sieve models. Higher-order covariate interactions can also be considered, giving a class of flexible models which can be tailored to each specific dataset.

An important challenge is raised by the estimation of the bid quantile derivative with respect to the quantile level α. This was considered by Guerre and Sabbah (2012) and the references therein. We propose instead a new local polynomial approach which applies to quantile levels and aims to jointly estimate the bid quantile function and its derivatives. An unexpected feature is that it performs well for extreme quantile levels, producing consistent estimators for α = 0 and 1. The latter upper quantile levels are particularly important for auctions, as private values of winners are expected to be in the top of the distribution. Recent work focusing on boundary issues includes Aryal, Gabrielli and Vuong (2016) in a semiparametric framework and Hickman and Hubbard (2015). Our theoretical results include a Central Limit Theorem for the private value quantile estimator which holds for extreme quantiles, and a bias-variance decomposition for its Integrated Mean Squared Error (IMSE). The latter allows in particular for bandwidth choice based on a pilot quantile model.

A second family of parameters of interest consists of integral functionals of the bid quantile function and its quantile level first derivative. A first example is the parameter of Constant Relative Risk Aversion (CRRA) utility functions. CRRA risk aversion indeed preserves the quantile linearity features which are important for our quantile regression methodology.
The risk aversion parameter can be estimated using bidder variations as in GPV09, but also by combining first-price and ascending auctions as in Lu and Perrigne (2008). A second example is the expected revenue, which falls into this family as it is a functional of the private value quantile function (Gimenes, 2017); see also Li, Perrigne and Vuong (2003). A third example covers the conditional private value cumulative distribution function and pdf. Indeed, the rearrangement formula of Chernozhukov, Fernández-Val and Galichon (2010) expresses the cdf as an integral functional of the private value quantile function. Differentiating a smooth version of this functional proposed in Dette and Volgushev (2008) gives a pdf estimator which fits in our framework and differs from Marmer and Shneyerov (2012). These distribution estimators are useful for dimension reduction purposes.

Our theoretical results are illustrated with a simulation experiment and an application to USFS first-price auctions. A preliminary quantile regression analysis of the bid quantile function suggests that the homogenized bid technique should not be applied here because the quantile regression slopes are not constant. The private value quantile regression slope functions reveal the impact of the covariate, and how strongly bidders in the top of the distribution can differ from the bottom. CRRA risk-aversion estimation using the approaches of GPV09 and Lu and Perrigne (2008) is also considered.

The rest of the paper is organized as follows. The next section introduces our quantile identification approach and the functionals of interest. Section 3 introduces our local polynomial estimation framework. Section 4 groups our main theoretical results for the private value quantile function and its functionals. Our simulation results are in Section 5 and the application can be found in Section 6. Section 7 summarizes the estimation strategy and the empirical application findings, and describes some possible extensions.
All the proofs are gathered in six Online Appendices.

A single and indivisible object with some characteristic $x \in \mathbb{R}^D$ is auctioned to $I \geq 2$ potential buyers; $I$ and $x$ are known to the bidders and the econometrician. Bids are sealed, so that a bidder does not know the others' bids when forming his own bid. The object is sold to the highest bidder, who pays his bid $B_i$ to the seller. Under the symmetric IPV paradigm, each potential bidder is assumed to have a private value $V_i$, $i = 1, \ldots, I$, for the auctioned object. A buyer knows his private value but not the private values of the other bidders, but the joint distribution of the $V_i$ is common knowledge. The private values are independently and identically drawn from a distribution given $(x, I)$ with a compactly supported cdf $F(\cdot|x, I)$, or equivalently with conditional quantile function
$$V(\alpha|x, I) = F^{-1}(\alpha|x, I), \quad \alpha \in [0, 1].$$
The private value quantile function is the first parameter of interest of the present paper, to be estimated from bids $B_i$ from the symmetric Bayesian Nash equilibrium. Section 2.4 below considers a second set of parameters of interest derived from $V(\cdot|\cdot,\cdot)$, such as the cdf $F(\cdot|\cdot,\cdot)$ or the associated pdf $f(\cdot|\cdot,\cdot)$.

It is well known that the bidder $i$ private value rank $A_i = F(V_i|x, I)$ has a uniform distribution over $[0, 1]$ and is independent of $x$ and $I$. It also follows from the IPV paradigm that the private value ranks $A_i$, $i = 1, \ldots, I$, are independent. The dependence between the private value $V_i$ and the auction covariates $x$ and $I$ is therefore fully captured by the non-separable quantile representation
$$V_i = V(A_i|x, I), \quad A_i \overset{iid}{\sim} U[0, 1] \perp (x, I).$$
Following Milgrom and Weber (1982) or Milgrom (2001), $V(\cdot|x, I)$ can also be viewed as a valuation function, the private value rank $A_i$ being the associated signal. In what follows, $G(\cdot|x, I)$ and $g(\cdot|x, I)$ stand for the bid conditional cdf and pdf, respectively.

Maskin and Riley (1984) have shown that Bayesian Nash equilibrium bids $B_i = \sigma(V_i; x, I)$ of symmetric risk-averse or risk-neutral bidders must strictly and continuously increase with the private values under the IPV paradigm. It follows that $B_i = B(A_i|x, I)$, where $B(\cdot|x, I) = \sigma(F^{-1}(\cdot|x, I); x, I)$ can be viewed as a bidding strategy depending upon the rank $A_i$. If $F(\cdot|x, I)$ is also strictly increasing, so is $B(\cdot|x, I)$, and since $A_i$ is uniform it holds that
$$G(b|x, I) = P\left[B(A_i|x, I) \leq b \,|\, x, I\right] = P\left[A_i \leq B^{-1}(b|x, I) \,|\, x, I\right] = B^{-1}(b|x, I),$$
showing that the bidding strategy $B(\cdot|x, I)$ is also the bid quantile function.

A standard best-response argument will show how to identify the private value quantile function $V(\cdot|x, I)$ from $B(\cdot|x, I)$. Suppose bidder $i$'s signal $A_i$ is equal to $\alpha$, but that her bid is a suboptimal $B(a|x, I)$, all other bidders bidding $B(A_j|x, I)$. Then the probability that bidder $i$ wins the auction is
$$P\left[B(a|x, I) > \max_{1 \leq j \neq i \leq I} B(A_j|x, I) \,\Big|\, A_i = \alpha, x, I\right] = P\left[a > \max_{1 \leq j \neq i \leq I} A_j \,\Big|\, A_i = \alpha, x, I\right] = a^{I-1} \quad (2.1)$$
because the $A_j$'s are independent $U[0, 1]$, independent of $x$ and $I$.
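The last equality in (2.1) can be checked by simulation. A minimal sketch, assuming uniform ranks; the helper `win_probability` is illustrative and not part of the paper's procedure.

```python
import numpy as np

# With I - 1 independent U[0, 1] opponent ranks, the probability that a fixed
# level a exceeds all of them is a**(I - 1), independently of x and I.
def win_probability(a, I, n_sims=200_000, seed=0):
    rng = np.random.default_rng(seed)
    opponents = rng.uniform(size=(n_sims, I - 1))   # ranks A_j of the I - 1 rivals
    return float(np.mean(a > opponents.max(axis=1)))

I, a = 4, 0.7
print(win_probability(a, I), a ** (I - 1))  # both close to 0.343
```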
It follows that the expected revenue of such a bid is, for a risk-neutral bidder, $(V(\alpha|x, I) - B(a|x, I))\, a^{I-1}$. If $B(\cdot|x, I)$ is a best-response bidding strategy, the optimal bid of a bidder with signal $\alpha$ is $B(\alpha|x, I)$, that is,
$$\alpha = \arg\max_a \left\{ (V(\alpha|x, I) - B(a|x, I))\, a^{I-1} \right\}.$$
As $B(\cdot|x, I)$ is continuously differentiable, it follows that
$$\frac{\partial}{\partial a} \left\{ (V(\alpha|x, I) - B(a|x, I))\, a^{I-1} \right\} \Big|_{a=\alpha} = 0 \quad (2.2)$$
or equivalently $\frac{d}{d\alpha}\left[\alpha^{I-1} B(\alpha|x, I)\right] = (I-1)\, \alpha^{I-2}\, V(\alpha|x, I)$. Solving with the initial condition $B(0|x, I) = V(0|x, I)$ and rearranging the equation above gives Proposition 1, which is the cornerstone of our estimation method. From now on, $B^{(1)}(\alpha|x, I) = \frac{d}{d\alpha} B(\alpha|x, I)$.

Proposition 1
Consider a given $(x, I)$, $I \geq 2$, for which $\alpha \in [0, 1] \mapsto V(\alpha|x, I)$ is continuously differentiable with a derivative $V^{(1)}(\cdot|x, I) > 0$. Suppose the bids are drawn from the symmetric differentiable Bayesian Nash equilibrium. Then,

i. The conditional equilibrium quantile function $B(\cdot|x, I)$ of the $I$ iid optimal bids $B_i$ satisfies
$$B(\alpha|x, I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, V(a|x, I)\, da. \quad (2.3)$$

ii. The bid quantile function $B(\alpha|x, I)$ is continuously differentiable over $[0, 1]$ and it holds that
$$V(\alpha|x, I) = B(\alpha|x, I) + \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}. \quad (2.4)$$

A key feature is the linearity of the private value to bid quantile function mapping (2.3), which implies that a private value quantile linear model is mapped into a similar bid linear model, as detailed below for the well-known quantile regression. Proposition 1-(ii) shows that the private value quantile function is identified from the bid quantile function and its derivative, as noted in Guerre and Sabbah (2012). It is a quantile version of the identification strategy of GPV, based on the computation of the private value from the bid:
$$V_i = B_i + \frac{1}{I-1} \frac{G(B_i|x, I)}{g(B_i|x, I)}.$$
Versions of (2.4) with $B^{(1)}(\alpha|x, I)$ changed into $1/g(B(\alpha|x, I)|x, I)$ can be found in Milgrom (2001, Theorem 4.7), Liu and Luo (2014), Enache and Florens (2015), Liu and Vuong (2016) and Luo and Wan (2016) and, under risk aversion, in GPV09 and Campo, Guerre, Perrigne and Vuong (2011). As developed in Section 2.4 below, Proposition 1 can be extended to the case of symmetric risk-averse bidders with a CRRA utility function.

Private value quantile regression.
The linearity of (2.3) with respect to the private value quantile function has remained unnoticed with very few exceptions, although it has important model stability implications useful for practical implementation. Consider for instance a private value quantile function given by the quantile regression specification
$$V(\alpha|x, I) = \gamma_0(\alpha|I) + x'\gamma_1(\alpha|I) = [1, x']\, \gamma(\alpha|I). \quad (2.5)$$
Proposition 1-(i) implies that the conditional bid quantile function satisfies
$$B(\alpha|x, I) = [1, x']\, \beta(\alpha|I) \quad \text{with} \quad \beta(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} t^{I-2}\, \gamma(t|I)\, dt, \quad (2.6)$$
showing that $B(\alpha|x, I)$ belongs to the quantile regression specification. It follows from (2.4) that
$$\gamma(\alpha|I) = \beta(\alpha|I) + \frac{\alpha \beta^{(1)}(\alpha|I)}{I-1}, \quad (2.7)$$

[Footnote: This can be recovered from (2.4) by taking $\alpha = A_i$, since $V_i = V(A_i|x, I)$ and $B_i = B(A_i|x, I)$, implying that $A_i = G(B_i|x, I)$ and $B^{(1)}(A_i|x, I) = 1/g(B(A_i|x, I)|x, I) = 1/g(B_i|x, I)$.]
so that $\gamma(\alpha|I)$ can easily be estimated from estimators of $\beta(\alpha|I)$ and $\beta^{(1)}(\alpha|I)$. It then follows that the quantile regression specification is stable, i.e. a quantile regression specification for the private values is equivalent to a quantile regression specification for the bids. Hence testing the correct specification of a bid quantile regression model is equivalent to testing the correct specification of a private value quantile specification. The expressions (2.6) and (2.7) show that significance testing can be done through the bid quantile regression, as $\gamma_j(\cdot|I) = 0$ is equivalent to $\beta_j(\cdot|I) = 0$, or more generally $e'\gamma(\cdot|I) = c$ is equivalent to $e'\beta(\cdot|I) = c$ for any conformable $e$ and $c$.

Bid homogenization and quantile regression.
HHS have noted that a translation of the private values results in a similar translation of the bids, an invariance property that they use in their bid homogenization technique. The latter can be interpreted as the use of a regression model for the private values, $V_i = \gamma_0 + x'\gamma_1 + v_i$, with an error term $v_i$ independent of $x$, as also proposed by Rezende (2008). This amounts to assuming that the slope function $\gamma_1(\cdot|I)$ in (2.5) does not depend upon the quantile level. The regression model of HHS and Rezende (2008) is indeed equivalent to the quantile regression specification
$$V(\alpha|x) = \gamma_0 + x'\gamma_1 + v(\alpha),$$
where $v(\alpha)$ is the quantile function of $v_i$. Since $\frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, da = 1$, it follows that the associated bid quantile function is, by (2.3),
$$B(\alpha|x, I) = \gamma_0 + x'\gamma_1 + b(\alpha|I), \quad \text{where} \quad b(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, v(a)\, da.$$
This gives the bid regression model
$$B_i = \beta_0(I) + x'\gamma_1 + b_i, \quad \beta_0(I) = \gamma_0 + E\left[b(A_i|I)\right],$$
where the regression error term $b_i = b(A_i|I) - E[b(A_i|I)]$ is centered and independent of $x$. Following these authors, the coefficient $\gamma_1$ can be estimated by regressing the bids on $[1, x']$, and the distribution of $v_i$ can be estimated by applying the GPV two-step method to the homogenized bids, which are the residuals $B_i - x'\widehat{\gamma}_1$. However, this approach requires independence between the regression error term $v_i$ and the covariate $x$, an assumption which may be too restrictive in practice, as found by Gimenes (2017) and in the application below.
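The coefficient mapping (2.6)-(2.7), of which the homogenized-bid model is the constant-slope special case, can be checked numerically. A minimal sketch, assuming an illustrative scalar slope $\gamma(a) = 1 + a^2$ and $I = 3$ bidders; with these choices $\beta(\alpha) = 1 + \alpha^2/2$ in closed form, and $\gamma$ is recovered as $\beta + \alpha\beta^{(1)}/(I-1)$.

```python
import numpy as np

# Numerical check of the mapping (2.6)-(2.7) with a scalar illustrative slope.
I = 3
gamma = lambda a: 1.0 + a ** 2

def trapezoid(y, x):
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2)

def beta(alpha, n=20_000):
    a = np.linspace(1e-9, alpha, n)                   # quadrature grid on (0, alpha]
    return (I - 1) * trapezoid(a ** (I - 2) * gamma(a), a) / alpha ** (I - 1)

alpha, eps = 0.5, 1e-4
beta1 = (beta(alpha + eps) - beta(alpha - eps)) / (2 * eps)   # d beta / d alpha
recovered = beta(alpha) + alpha * beta1 / (I - 1)             # relation (2.7)
print(beta(alpha), recovered, gamma(alpha))  # approximately 1.125, 1.25, 1.25
```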
When $\gamma_1(\cdot)$ is not constant, regressing $B(\alpha|x, I)$ on $[1, x']$ gives $B_i = \beta_0(I) + x'\beta_1(I) + b(A_i|x, I)$ with a slope coefficient satisfying
$$\beta_1(I) = \int_0^1 \left( \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, \gamma_1(a)\, da \right) d\alpha = \int_0^1 \gamma_1(\alpha)\, d\alpha - \int_0^1 \left( \int_0^{\alpha} \left(\frac{a}{\alpha}\right)^{I-1} \gamma_1^{(1)}(a)\, da \right) d\alpha$$
and a residual term $b(A_i|x, I) = v(A_i) + x'\left( \frac{I-1}{A_i^{I-1}} \int_0^{A_i} a^{I-2}\, \gamma_1(a)\, da - \beta_1(I) \right)$, which now depends upon $x$, so that the homogenized bid approach does not apply. Using variation of $I$ can be useful to detect such a situation, because observing variation of $\beta_1(I)$ implies that $\gamma_1(\cdot)$ is not constant. In particular, if the entries of $\gamma_1^{(1)}(\cdot)$ are nonnegative, the entries of $\beta_1(I)$ must increase with $I$. Similar features hold for centered bids $B_i - E[B_i|I]$ when the homogenized bid regression is replaced by a nonparametric regression: the regression function $E[B_i - E[B_i|I] \,|\, x, I]$ should not depend upon $I$ if $V_i = m(x_i) + v_i$, as for the single-index regression specification considered in Paarsch and Hong (2006).

Flexible interactive specifications.
The private value quantile regression model (2.5) assumes linearity of the private value quantile function with respect to the covariate $x$. This may be too strong, and it can be relaxed using a nonparametric additive quantile specification, as considered in Horowitz and Lee (2005). Recall that $x = (x_1, \ldots, x_D)$ and consider the additive quantile function
$$V(\alpha|x, I) = \sum_{j=1}^{D} V_j(\alpha; x_j, I), \quad (2.8)$$
where each function $V_j(\alpha; x_j, I)$ is specific to the entry $x_j$. Since such quantile specifications are obtained by summing univariate functions, the effective dimension involved in the nonparametric part of this model is 1, because it can be estimated at the same rate as a nonparametric model with a unique covariate, as shown in Horowitz and Lee (2005). This parsimonious model can be generalized following Andrews and Whang (1990) to allow for more covariate interactions. This leads to the additive interactive quantile specification with $D_M$ interactions
$$V(\alpha|x, I) = \sum_{\delta=1}^{D_M} \sum_{1 \leq j_1 < \cdots < j_\delta \leq D} V_{j_1, \ldots, j_\delta}(\alpha; x_{j_1}, \ldots, x_{j_\delta}, I). \quad (2.9)$$

The interactive quantile specification (2.9) can be estimated using a sieve expansion, as in Horowitz and Lee (2005) or Andrews and Whang (1990). Consider a sieve $\{P_k(x), 1 \leq k \leq K\}$, a family of functions $P_k(\cdot) = P_{kK}(\cdot)$ allowing for at most $D_M$ interactions, and suppose that there are some sieve coefficients $\gamma_k(\cdot|I) = \gamma_{kK}(\cdot|I)$ such that for all $\alpha$
$$V(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \gamma_k(\alpha|I)\, P_k(x). \quad (2.10)$$
The expression (2.10) can be viewed as a sieve extension of the quantile regression, a sieve quantile regression. It follows from Proposition 1-(i,ii) that, provided the limit in (2.10) holds uniformly with respect to $\alpha$,
$$B(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \beta_k(\alpha|I)\, P_k(x), \quad \beta_k(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} t^{I-2}\, \gamma_k(t|I)\, dt, \quad (2.11)$$
$$V(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \left( \beta_k(\alpha|I) + \frac{\alpha \beta_k^{(1)}(\alpha|I)}{I-1} \right) P_k(x). \quad (2.12)$$
Hence estimating the private value sieve quantile regression can proceed from estimating the coefficients of the bid sieve quantile regression in (2.11) and their first derivatives.

Many auction parameters of interest can be written using the private value quantile function, or equivalently the bid quantile function and its quantile derivative by (2.4). We focus here on the conditional and unconditional integral functionals
$$\theta(x) = \int_0^1 F\left[\alpha, x, B(\alpha|x, I), B^{(1)}(\alpha|x, I);\ I \in \mathcal{I}\right] d\alpha, \quad \theta = \int_{\mathcal{X}} \theta(x)\, dx, \quad (2.13)$$
where $F(\alpha, x, b_I, b_I^{(1)};\ I \in \mathcal{I})$ is a real-valued continuous function. Three illustrative examples are as follows.

Example 1: CRRA risk aversion. For symmetric risk-averse bidders with a concave utility function $U(\cdot)$, the best-response condition (2.2) becomes
$$\frac{\partial}{\partial a} \left\{ U\left(V(\alpha|x, I) - B(a|x, I)\right) a^{I-1} \right\} \Big|_{a=\alpha} = 0,$$
which gives $V(\alpha|x, I) = B(\alpha|x, I) + \lambda^{-1}\left( \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1} \right)$, where $\lambda(\cdot) = U(\cdot)/U'(\cdot)$. For risk-averse bidders with a CRRA utility function $U(t) = t^{\theta}$, arguing as for Proposition 1 shows
$$V(\alpha|x, I) = B(\alpha|x, I) + \theta\, \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}, \quad (2.14)$$
$$B(\alpha|x, I) = \frac{(I-1)/\theta}{\alpha^{(I-1)/\theta}} \int_0^{\alpha} t^{(I-1)/\theta - 1}\, V(t|x, I)\, dt.$$
These two formulas show that the stability implications of Proposition 1 for linear private value and bid quantile functions are preserved under CRRA. Assuming as in GPV09 that the number of bidders is exogenous, i.e. $V(\alpha|x, I) = V(\alpha|x)$ for all $I$, gives, for any pair $I_1 \neq I_2$,
$$\theta = \frac{\theta_n}{\theta_d} = \frac{\int_{\mathcal{X}} \left[ \int_0^1 \left( B(\alpha|x, I_2) - B(\alpha|x, I_1) \right) \left( \frac{\alpha B^{(1)}(\alpha|x, I_1)}{I_1 - 1} - \frac{\alpha B^{(1)}(\alpha|x, I_2)}{I_2 - 1} \right) d\alpha \right] dx}{\int_{\mathcal{X}} \left[ \int_0^1 \left( \frac{\alpha B^{(1)}(\alpha|x, I_1)}{I_1 - 1} - \frac{\alpha B^{(1)}(\alpha|x, I_2)}{I_2 - 1} \right)^2 d\alpha \right] dx}, \quad (2.15)$$
a formula which shows that the CRRA risk aversion parameter can be easily identified from first-price auctions.
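A plug-in evaluation of the identification formula (2.15) can be sketched without covariates, assuming exogenous participation. All choices below are illustrative: $V(\alpha) = \alpha$, a true $\theta = 0.5$, and bid quantile functions computed by quadrature from the CRRA analogue of (2.3); the ratio then recovers $\theta$.

```python
import numpy as np

# Sketch of (2.15) without covariates: V(alpha) = alpha, theta = 0.5.
theta_true = 0.5
V = lambda a: a

def B(alpha, I):
    c = (I - 1) / theta_true                       # exponent (I - 1) / theta
    t = np.linspace(1e-9, alpha, 4_000)
    y = t ** (c - 1) * V(t)
    integral = float(np.sum((y[:-1] + y[1:]) * np.diff(t)) / 2)
    return c * integral / alpha ** c

def B1(alpha, I, eps=1e-5):                        # quantile-level derivative
    return (B(alpha + eps, I) - B(alpha - eps, I)) / (2 * eps)

I1, I2 = 2, 3
grid = np.linspace(0.05, 0.95, 91)
d = np.array([a * B1(a, I1) / (I1 - 1) - a * B1(a, I2) / (I2 - 1) for a in grid])
num = np.mean(np.array([B(a, I2) - B(a, I1) for a in grid]) * d)
den = np.mean(d ** 2)
print(num / den)  # approximately 0.5
```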
Following Lu and Perrigne (2008), the risk-aversion parameter $\theta$ can also be identified by combining ascending and first-price auction data. As seen from Gimenes (2017), the private value quantile function $V_{asc}(\alpha|x, I)$ can be easily estimated from ascending auctions. Equating $V_{asc}(\alpha|x, I)$ to $V(\alpha|x, I)$ in (2.14) gives that $\theta$ satisfies
$$\theta = \frac{\int_{\mathcal{X}} \left[ \int_0^1 \left( V_{asc}(\alpha|x, I) - B(\alpha|x, I) \right) \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}\, d\alpha \right] dx}{\int_{\mathcal{X}} \left[ \int_0^1 \left( \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1} \right)^2 d\alpha \right] dx}. \quad (2.16)$$

Example 2: Expected revenue. Suppose that the seller decides to reject bids lower than a reserve price $R$ and let $\alpha_R = \alpha_R(x, I)$ be the associated screening level, i.e. $\alpha_R = F(R|x, I)$. For CRRA bidders, the first-price auction seller's expected revenue is
$$ER_{\theta}(\alpha_R|x, I) = \frac{\theta\, I\, V(\alpha_R|x, I)}{(I-1)(\theta-1) + \theta}\, \alpha_R^{\frac{I-1}{\theta}} \left( 1 - \alpha_R^{\,I - \frac{I-1}{\theta}} \right) + \frac{I(I-1)}{(I-1)(\theta-1) + \theta} \int_{\alpha_R}^1 t^{\frac{I-1}{\theta} - 1} \left( 1 - t^{\,I - \frac{I-1}{\theta}} \right) V(t|x, I)\, dt. \quad (2.17)$$
This expression includes an integral term
$$\theta_1(x; \alpha_R) = \int_{\alpha_R}^1 t^{\frac{I-1}{\theta} - 1} \left( 1 - t^{\,I - \frac{I-1}{\theta}} \right) V(t|x, I)\, dt,$$
which can be estimated by plugging in a risk aversion estimator $\widehat{\theta}$ and an estimator $\widehat{V}(\alpha|x, I)$ of the private value quantile function, or estimators of the bid quantile function and its derivative by (2.4).

Example 3: Private value distribution. Chernozhukov et al. (2010) have used the rearrangement formula to invert a monotonic function.
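Returning to Example 2, the expected revenue formula (2.17) can be evaluated by quadrature. A sketch with illustrative choices ($V(\alpha) = \alpha$, no reserve, function names are assumptions): under risk neutrality $\theta = 1$ and $\alpha_R = 0$, the formula reduces to the expected second-highest of $I$ uniform values, $(I-1)/(I+1)$, a revenue-equivalence sanity check.

```python
import numpy as np

# Quadrature evaluation of the expected revenue (2.17); illustrative sketch.
def expected_revenue(theta, alpha_R, V, I, n=20_000):
    d = (I - 1) * (theta - 1) + theta               # common denominator in (2.17)
    c = (I - 1) / theta
    t = np.linspace(max(alpha_R, 1e-9), 1.0, n)
    y = t ** (c - 1) * (1 - t ** (I - c)) * V(t)
    integral = float(np.sum((y[:-1] + y[1:]) * np.diff(t)) / 2)
    first = theta * I * V(alpha_R) * alpha_R ** c * (1 - alpha_R ** (I - c)) / d
    return first + I * (I - 1) * integral / d

I = 4
print(expected_revenue(1.0, 0.0, lambda a: a, I))  # approximately (I-1)/(I+1) = 0.6
```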
In our case, the conditional private value cdf satisfies
$$F(v|x, I) = E\left[ \mathbb{I}\left[ V(A|x, I) \leq v \right] \,\big|\, x, I \right] = \int_0^1 \mathbb{I}\left[ V(\alpha|x, I) \leq v \right] d\alpha, \quad A \sim U[0, 1].$$
Dette and Volgushev (2008) have considered a smoothed version $I_{\eta}(\cdot)$ of the indicator function,
$$F_{\eta}(v|x, I) = \int_0^1 I_{\eta}\left[ v - V(\alpha|x, I) \right] d\alpha,$$
where $I_{\eta}(t) = \int_{-\infty}^{t/\eta} k(u)\, du$, $k(\cdot)$ being a kernel function and $\eta$ a bandwidth parameter. Differentiating $F_{\eta}(v|x, I)$ gives
$$f_{\eta}(v|x, I) = \frac{1}{\eta} \int_0^1 k\left( \frac{v - V(\alpha|x, I)}{\eta} \right) d\alpha,$$
which converges to the private value pdf when $\eta$ goes to 0. Note that $F_{\eta}(v|x, I)$ and $f_{\eta}(v|x, I)$ can be estimated by plugging in an estimator $\widehat{V}(\alpha|x, I)$ of $V(\alpha|x, I)$. The resulting cdf and pdf estimators are expected to inherit the dimension reduction property of this procedure, as the private value estimator $\widehat{V}(\alpha|x, I)$ proposed in the next section is consistent over the whole $[0, 1]$ quantile interval.

[Footnote: It is assumed for the sake of brevity that the seller's value for the good is 0. The expected revenue formula for the general case follows from Gimenes (2017). Under risk neutrality, integrating by parts gives
$$\int_{\alpha_R}^1 B^{(1)}(\alpha|x, I)\, \alpha^{I-1} (1 - \alpha)\, d\alpha = -B(\alpha_R|x, I)\, \alpha_R^{I-1} (1 - \alpha_R) - \int_{\alpha_R}^1 B(\alpha|x, I)\, \alpha^{I-2} \left( (I-1) - I\alpha \right) d\alpha,$$
so that estimation of $\theta_1(x; \alpha_R)$ can also be done using only a bid quantile estimator.]

Proposition 1 suggests basing the estimation of the private value quantile function on estimates of $B(\alpha|x, I)$ and of its derivative $B^{(1)}(\alpha|x, I)$ with respect to $\alpha$. While there is an important literature on the estimation of a conditional quantile function, estimating the first derivative of a quantile function has received much less attention. The augmented methodology applies a local polynomial expansion with respect to $\alpha$ for joint estimation of $B(\alpha|x, I)$ and $B^{(1)}(\alpha|x, I)$. Sieve methods can be used for the covariate.
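The rearrangement and smoothing formulas of Example 3 can be illustrated numerically. Assumptions in this sketch: $V(\alpha) = \alpha^2$, whose cdf on $[0, 1]$ is $F(v) = \sqrt{v}$, and a uniform kernel $k$ on $[-1, 1]$, so that $I_\eta$ has a simple closed form.

```python
import numpy as np

# Rearrangement of a quantile function into a cdf, plus its smoothed version.
V = lambda a: a ** 2
alphas = np.linspace(0.0, 1.0, 200_001)

def F(v):                       # F(v) = integral of 1{V(alpha) <= v} d alpha
    return float(np.mean(V(alphas) <= v))

def F_smooth(v, eta=0.01):      # I_eta: integrated uniform kernel on [-1, 1]
    u = (v - V(alphas)) / eta
    i_eta = np.clip((u + 1) / 2, 0.0, 1.0)
    return float(np.mean(i_eta))

print(F(0.25), F_smooth(0.25))  # both approximately sqrt(0.25) = 0.5
```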
To ensure comparability with the literature, we assume that the private value quantile function $V(\alpha|x, I)$ has $s+1$ continuous derivatives with respect to $\alpha$. As seen from (2.3), this implies that the bid quantile function $B(\alpha|x, I)$ has $s+2$ continuous derivatives with respect to $\alpha > 0$. This justifies the order $s+1$ for the local polynomial estimator considered here.

The no covariate case. Consider $L$ iid first-price auctions $(I_{\ell}, x_{\ell}, B_{i\ell}, i = 1, \ldots, I_{\ell})$. To introduce our estimation strategy, assume first that $V(\alpha|x, I) = V(\alpha|I)$ and $B(\alpha|x, I) = B(\alpha|I)$. Let $\rho_{\alpha}(\cdot)$ be the check function, $\rho_{\alpha}(q) = q\left( \alpha - \mathbb{I}(q \leq 0) \right)$, $\mathbb{I}(\cdot)$ being the indicator function with $\mathbb{I}(q \leq 0) = 1$ for $q \leq 0$. Then
$$B(\alpha|I) = \arg\min_q E\left[ \mathbb{I}(I_{\ell} = I)\, \rho_{\alpha}(B_{i\ell} - q) \right], \quad \alpha \in (0, 1).$$
Estimating the derivative $B^{(1)}(\alpha|I)$ can be done by introducing local variation of the quantile level in the vicinity of $\alpha$. Let $K(\cdot) \geq 0$ be a kernel function supported on $[-1, 1]$ and let $h = h_L$ be a positive bandwidth parameter going to 0 with the sample size. Then it follows that
$$\left\{ B(a|I),\ a \in [\alpha - h, \alpha + h] \cap [0, 1] \right\} = \arg\min_{q(\cdot)} \int E\left[ \mathbb{I}(I_{\ell} = I)\, \rho_a\left( B_{i\ell} - q(a) \right) \right] \frac{1}{h} K\left( \frac{a - \alpha}{h} \right) da, \quad (3.1)$$
where the minimization is performed over the set of functions $q(a)$ which are continuous on $[\alpha - h, \alpha + h] \cap [0, 1]$. A polynomial approximation of $B(a|I)$ over $[\alpha - h, \alpha + h]$ is given by the Taylor expansion
$$B(a|I) = B(\alpha|I) + B^{(1)}(\alpha|I)(a - \alpha) + \cdots + B^{(s+1)}(\alpha|I)\, \frac{(a - \alpha)^{s+1}}{(s+1)!} + O\left( h^{s+2} \right).$$
Let $b = (\beta_0, \ldots, \beta_{s+1})'$ be the generic coefficients of such a polynomial function and
$$\pi(a) = \left[ 1, a, \frac{a^2}{2}, \ldots, \frac{a^{s+1}}{(s+1)!} \right]'.$$
The objective function is
$$\widehat{R}(b; \alpha, I) = \frac{1}{LI} \sum_{\ell=1}^{L} \mathbb{I}(I_{\ell} = I) \sum_{i=1}^{I_{\ell}} \int \rho_a\left( B_{i\ell} - \pi(a - \alpha)' b \right) \frac{1}{h} K\left( \frac{a - \alpha}{h} \right) da = \frac{1}{LI} \sum_{\ell=1}^{L} \mathbb{I}(I_{\ell} = I) \sum_{i=1}^{I_{\ell}} \int_{\left(-\frac{\alpha}{h}\right) \vee (-1)}^{\left(\frac{1-\alpha}{h}\right) \wedge 1} \rho_{\alpha + ht}\left( B_{i\ell} - \pi(ht)' b \right) K(t)\, dt.$$
The augmented quantile estimator is $\widehat{b}(\alpha|I) = \arg\min_b \widehat{R}(b; \alpha, I)$, $\widehat{\beta}_0(\alpha|I)$ and $\widehat{\beta}_1(\alpha|I)$ being estimators of $B(\alpha|I)$ and its first derivative $B^{(1)}(\alpha|I)$, respectively. The estimator of the private value quantile is
$$\widehat{V}(\alpha|I) = \widehat{\beta}_0(\alpha|I) + \frac{\alpha \widehat{\beta}_1(\alpha|I)}{I-1}.$$

Augmented quantile regression. A first extension of this procedure is the augmented quantile regression estimator, AQR hereafter, which considers the private value quantile regression specification $V(\alpha|x, I) = [1, x']\, \gamma(\alpha|I)$. When the private value distribution does not depend upon $I$, the bid quantile functions $B(\cdot|I)$ are such that the derivatives
$$\frac{\partial^j}{\partial \alpha^j} \left[ B(\alpha|I) + \frac{\alpha B^{(1)}(\alpha|I)}{I-1} \right] = \left( 1 + \frac{j}{I-1} \right) B^{(j)}(\alpha|I) + \frac{\alpha B^{(j+1)}(\alpha|I)}{I-1}$$
do not depend upon $I$, as they are equal to $V^{(j)}(\alpha|I) = V^{(j)}(\alpha)$, $j = 0, \ldots, s+1$. These constraints can be used to estimate $V(\alpha)$ using the parameters $\gamma = (\gamma_0, \ldots, \gamma_s)$, $\delta = (\delta_2, \ldots, \delta_{\bar{I}})$, where $\gamma_j$ is for $V^{(j)}(\alpha)$ and $\delta_I$ for the derivative $B^{(s+1)}(\alpha|I)$, $I = 2, \ldots, \bar{I}$, and $b_I(\gamma, \delta) = [b_{0,I}, \ldots, b_{s,I}, \delta_I]'$ with
$$b_{s,I} = \left( 1 + \frac{s}{I-1} \right)^{-1} \left( \gamma_s - \frac{\alpha}{I-1}\, \delta_I \right),$$
the other $b_{j,I}$'s being computed recursively using
$$b_{j,I} = \left( 1 + \frac{j}{I-1} \right)^{-1} \left( \gamma_j - \frac{\alpha}{I-1}\, b_{j+1,I} \right), \quad j = 0, \ldots, s.$$
The estimator of $V(\alpha)$ is $\widehat{\gamma}_0$, where $\left( \widehat{\gamma}, \widehat{\delta} \right) = \arg\min_{\gamma, \delta} \sum_{I=2}^{\bar{I}} \widehat{R}\left( b_I(\gamma, \delta); \alpha, I \right)$.
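The no-covariate step above can be mimicked with a least-squares surrogate (an assumption made here for brevity: the check-function objective is replaced by a kernel-weighted polynomial fit of the empirical bid quantile function, which conveys the same idea of reading $B(\alpha)$ and $B^{(1)}(\alpha)$ off the local polynomial coefficients). Illustrative true model: $I = 2$ bidders, $V(\alpha) = \alpha$, so $B(\alpha) = \alpha/2$ by (2.3).

```python
import numpy as np

# Least-squares surrogate of the augmented local polynomial step (illustration).
rng = np.random.default_rng(2)
n = 100_000
bids = np.sort(rng.uniform(size=n) / 2)            # B_i = B(A_i) = A_i / 2
levels = (np.arange(1, n + 1) - 0.5) / n           # quantile levels of order stats

alpha, h = 0.6, 0.1
t = (levels - alpha) / h
w = np.clip(1.0 - t ** 2, 0.0, None)               # Epanechnikov-type weights
keep = w > 0
d = levels[keep] - alpha
X = np.column_stack([np.ones(d.size), d, d ** 2 / 2])   # pi(a - alpha), order s + 1 = 2
sw = np.sqrt(w[keep])
coef, *_ = np.linalg.lstsq(X * sw[:, None], bids[keep] * sw, rcond=None)

B_hat, B1_hat = coef[0], coef[1]
V_hat = B_hat + alpha * B1_hat / (2 - 1)           # private value quantile by (2.4)
print(B_hat, B1_hat, V_hat)  # approximately 0.3, 0.5, 0.6
```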
[Footnote: Although not considered here, the augmented quantile estimation procedure can be used to estimate the pdf $f(v|I)$ of the private values using $f(v|I) = 1/V^{(1)}[F(v|I)|I]$. An estimator for $F(\cdot|I)$ is $\widehat{V}^{-1}(\cdot|I)$. Set $\widehat{V}^{(1)}(\alpha|I) = \left(1 + \frac{1}{I-1}\right)\widehat{\beta}_1(\alpha|I) + \alpha \widehat{\beta}_2(\alpha|I)/(I-1)$ and $\widehat{f}(v|I) = 1/\widehat{V}^{(1)}\left[ \widehat{F}(v|I) \,\big|\, I \right]$. This pdf estimator can account for covariates by using the AQR and ASQR procedures introduced below.]

In the second extension, the augmented sieve quantile regression (ASQR), the private value quantile function $V(\alpha|x, I)$ is equal to $P(x)'\gamma(\alpha|I)$ up to an approximation error, where $P(x)$ stacks the sieve functions $P_k(x)$, $k = 1, \ldots, K$. The AQR and ASQR approaches can be grouped by setting $P(x) = [1, x']'$ for the AQR. The bid quantile function satisfies $B(\alpha|x, I) = P(x)'\beta(\alpha|I)$ by (2.6), with $\gamma(\alpha|I) = \beta(\alpha|I) + \alpha \beta^{(1)}(\alpha|I)/(I-1)$ by (2.7), up to an approximation error in the ASQR case. Define now the parameter $b = \left[ \beta_0', \beta_1', \ldots, \beta_{s+1}' \right]'$, where all the $\beta_j$ have the same dimension $D+1$, and
$$P(x, t) = \pi(t) \otimes P(x),$$
which is such that the Taylor expansion of $B(\alpha|x, I)$ writes, in the AQR case,
$$B(\alpha + ht|x, I) = P(x, ht)'\, b(\alpha|I) + O\left( h^{s+2} \right),$$
where $b(\alpha|I)$ stacks $\beta(\alpha|I)$ and its successive derivatives $\beta^{(1)}(\alpha|I), \ldots, \beta^{(s+1)}(\alpha|I)$.
The objective function of the estimation procedure becomes
$$\widehat R(b;\alpha,I)=\frac{1}{L_I}\sum_{\ell=1}^{L}\mathbb I(I_\ell=I)\sum_{i=1}^{I_\ell}\int \rho_a\left(B_{i\ell}-P(x_\ell,a-\alpha)'b\right)\frac{1}{h}K\left(\frac{a-\alpha}{h}\right)da$$
$$=\frac{1}{L_I}\sum_{\ell=1}^{L}\mathbb I(I_\ell=I)\sum_{i=1}^{I_\ell}\int_{-\alpha/h}^{(1-\alpha)/h}\rho_{\alpha+ht}\left(B_{i\ell}-P(x_\ell,ht)'b\right)K(t)\,dt \qquad (3.2)$$
which accounts for the covariate $x_\ell$. The estimator of $b(\alpha|I)$ is $\widehat b(\alpha|I)=\arg\min_b\widehat R(b;\alpha,I)$, and the private value quantile regression estimator is $\widehat V(\alpha|x,I)=P(x)'\widehat\gamma(\alpha|I)$ with
$$\widehat\gamma(\alpha|I)=\widehat\beta(\alpha|I)+\frac{\alpha\widehat\beta^{(1)}(\alpha|I)}{I-1}.$$
The bid quantile function and its derivative can be estimated using $\widehat B(\alpha|x,I)=P(x)'\widehat\beta(\alpha|I)$ and $\widehat B^{(1)}(\alpha|x,I)=P(x)'\widehat\beta^{(1)}(\alpha|I)$. The rearrangement method of Chernozhukov et al. (2010) can be used to obtain increasing quantile estimators.

Bassett and Koenker (1982) report that standard quantile regression estimators are not defined for the extreme quantile levels $\alpha=0$ or $\alpha=1$, or even nearby. The augmented procedures proposed here are better behaved for extreme quantiles because the objective function $\widehat R(\cdot;\alpha,I)$ averages the check function $\rho_a(\cdot)$ over quantile levels $a$ in $[\alpha-h,\alpha+h]\cap[0,1]$. For $\alpha=1$ and $h\le 1$, $\widehat R(b;1,I)$ averages $\rho_{1+ht}\left(B_{i\ell}-P(x_\ell,ht)'b\right)$ over $t$ in $[-1,0]$, so that $\widehat R(b;1,I)$ will be large if $b$ is too large.
Figure 1 below shows indeed that $\widehat R(b;1,I)$ has no flat part when $b$ grows, contrasting with the standard quantile regression objective functions.

Figure 1: A path of the objective function $\widehat R(b;1,I)$ (solid line) of the augmented quantile regression estimator and of the objective function of the standard quantile regression estimator (dotted line) when $b$ varies in the direction $[1,\ldots,1]'$.

This averaging effect requires that $t\mapsto P(x_\ell,ht)'b$ is not constant, meaning that the derivative components of $b$ should not vanish. The augmented estimators are therefore better behaved near the extreme quantile levels $\alpha=0$ and $\alpha=1$ than the standard quantile regression estimator. This is especially relevant for estimating auction models, as the winner is expected to belong to the upper tail as soon as the number of bidders is large enough. In fact, it follows from the theoretical study of the objective function $\widehat R(\cdot;\cdot,I)$ that the AQR and ASQR estimators are uniquely defined for all quantile levels with a large probability. As a result of a smooth objective function, the AQR and ASQR estimators are also smoother than standard quantile regression ones, see for instance Figure 4 in the Application Section.

The notations $a\vee b$ and $a\wedge b$ are used instead of $\max(a,b)$ and $\min(a,b)$. Recall $a_L\asymp b_L$ means that both $a_L/b_L=O(1)$ and $b_L/a_L=O(1)$. The norm $\|\cdot\|$ is the Euclidean one, i.e. $\|e\|=(e'e)^{1/2}$.

Assumption A (i) The auction variables $(I_\ell,x_\ell,V_{i\ell},B_{i\ell},\ i=1,\ldots,I_\ell)$ are iid across $\ell$. The pdf $f(x|I)$ of the covariates $x_\ell$ given $I_\ell=I$ is continuous and bounded away from $0$ over its bounded support $\mathcal X$, which has a non-empty interior and does not depend upon $I$. The actual number of bidders $I_\ell$ belongs to a finite set $\mathcal I$ of integer numbers larger or equal to $2$.
(ii) Given $(x_\ell,I_\ell)=(x,I)$, the private values $V_{i\ell}$, $i=1,\ldots$
$,I_\ell$, are iid with a conditional quantile function $V(\alpha|x,I)$, which is continuously differentiable over $[0,1]\times\mathcal X$ with
$$\inf_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}V^{(1)}(\alpha|x,I)>0\quad\text{and}\quad\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}V^{(1)}(\alpha|x,I)<\infty.$$
(iii) (2.3) holds with $B(0|x,I)=V(0|x,I)$ for all $(x,I)\in\mathcal X\times\mathcal I$. See the discussion following Theorem C.4 in Appendix C for a formal argument.

Assumption S For some $s\ge 1$ and each $I\in\mathcal I$, $V(\alpha|x,I)$ is $(s+1)$-times continuously differentiable over $[0,1]\times\mathcal X$ with either: (i) $D_M=0$, in which case $V(\alpha|x,I)=X'\gamma(\alpha|I)$ as in (2.5); (ii) $D_M>0$, in which case $V(\alpha|x,I)$ has $D_M$ interactions as in (2.9).

Assumption H The kernel function $K(\cdot)$, with support $(-1,1)$, is symmetric, continuously differentiable over the real line, and strictly positive over $(-1,1)$. The positive bandwidth $h$ goes to $0$ with $\lim_{L\to\infty}\log L/(Lh^{2(D_M+1)})=0$. For the ASQR estimator, $P(x)=[P_1(x),\ldots,P_K(x)]'$, where $P_k(x)=P_{hk}(x)$ and $K\asymp h^{-D_M}$. The retained sieve satisfies the high-level Assumption R stated in Appendix A.

Assumption F For all $x$ in $\mathcal X$ and $\alpha$ in $[0,1]$, the function $F[\alpha,x,b_I,b_I^{(1)};I\in\mathcal I]$ is twice differentiable with respect to the $b_I$ and $b_I^{(1)}$, $I$ in $\mathcal I$. The partial derivatives of order 1 and 2 are continuous with respect to $\alpha$, $x$, $B_I$ and $B_I^{(1)}$, $I$ in $\mathcal I$.

Assumption A recalls the quantile implications of Bayesian Nash equilibrium bidding under symmetric IPV, see Assumption A-(iii). In Assumption A-(i), the existence of a conditional pdf for the covariate $x_\ell$ is only used for the infinite dimensional quantile regression specification. For a standard quantile regression specification, it is sufficient to assume that the matrix $E[\mathbb I(I_\ell=I)X_\ell X_\ell']$ has an inverse for all $I\in\mathcal I$, as recalled in Assumption R-(i) in Appendix A. Note that, as all along this paper, private values and number of bidders can be dependent.
A discussion of such dependence, in relation with an entry stage preliminary to the auction, can be found in Marmer, Shneyerov and Xu (2013a). For Assumption A-(ii), recall that
$$V^{(1)}(\alpha|x,I)=\frac{1}{f(V(\alpha|x,I)|x,I)},\qquad(4.3)$$
where $f(v|x,I)$ is the conditional private value pdf. Hence Assumption A-(ii) amounts to assuming that $f(v|x,I)$ is bounded away from $0$ and infinity on its support $[V(0|x,I),V(1|x,I)]$, as assumed for instance in Riley and Samuelson (1981), Maskin and Riley (1984) or GPV. The condition $0<f(v|x,I)<\infty$ is also used for asymptotic normality of quantile regression estimators, see Koenker (2005). Assumption S combines a standard smoothness assumption with interaction restrictions.

Assumption H restricts the rate at which the bandwidth can go to $0$. In the AQR case, it writes $\lim_{L\to\infty}\log L/(Lh^2)=0$, which is slightly more restrictive than the condition $\lim_{L\to\infty}\log L/(Lh)=0$ used in nonparametric estimation. This rate restriction is specific to the quantile approach used here. The restriction $K\asymp h^{-D_M}$ and the choice of a sieve satisfying the high-level Assumption R of Appendix A are discussed in the next section. Assumption F holds for most of the examples of functionals above. A notable exception is the cdf $F(v|x,I)$ in Example 3 when expressed using the rearrangement method of Chernozhukov et al. (2010), which involves an indicator function which is not smooth. However it holds for the smoothed approximation $F_\eta(v|x,I)$ of the cdf, although Assumption F implicitly rules out a vanishing bandwidth $\eta$ in Example 3.

The last stage of our procedure is the choice of a suitable sieve in (2.10), when a quantile regression specification cannot be used and more flexibility is needed. While the high-level Assumption R of Appendix A mentioned in Assumption H describes some key theoretical properties used in the main results, the focus is set here on suitable sieves.
The most important requirement is that the sieve has good approximation properties, as detailed in Appendix A. Although not strictly necessary, the sieve functions $P_k(\cdot)$ in the private value quantile expansion (2.10) should be localized, i.e. the number of $P_{k'}(\cdot)$ such that $P_k(\cdot)P_{k'}(\cdot)$ does not vanish must be bounded. These two requirements are typically satisfied by sieves building on cardinal spline bases or wavelets, as detailed now.

Consider first the spline example of sieves. Assume that $\mathcal X=[0,1]^D$ for the sake of brevity. For $m\ge s+2$, set $(t)_+^{m-1}=t^{m-1}$ if $t>0$ and $(t)_+^{m-1}=0$ otherwise. The considered spline sieve is based upon the uniformly spaced simple knots B-spline function of order $m$,
$$q(t)=\sum_{i=0}^{m}(-1)^i\binom{m}{i}\frac{(t-i)_+^{m-1}}{(m-1)!},$$
which has $m-2$ continuous derivatives and support $[0,m]$. The baseline B-spline function $q(\cdot)$ generates the rescaled functions $p_{\kappa h}(\cdot)=p_\kappa(\cdot)$,
$$p_\kappa(t)=\frac{1}{\sqrt h}\,q\left(\frac{t-(\kappa-m)h}{h}\right),\qquad\kappa=1,\ldots,\bar\kappa,$$
where $\bar\kappa=\bar\kappa_h=O(1/h)$ is the largest integer number such that $(\bar\kappa-m)h\le 1\le\bar\kappa h$. Theorem 6.20 in Schumaker (2007) implies that each function $v(\cdot)$ with $s+1$ continuous derivatives can be approximated uniformly over $[0,1]$ with a linear combination of the $p_\kappa(\cdot)$'s up to an error $O\left(h^{s+1}\right)$. The $p_\kappa(\cdot)$'s are also localized, with $\int p_\kappa^2(t)dt=O(1)$ uniformly in $\kappa$ and $h$. Similarly, additive quantile functions as in (2.8) can be approximated using the sieve $\{p_\kappa(x_1),\ldots,p_\kappa(x_D),\ \kappa=1,\ldots,\bar\kappa\}$. A suitable sieve for additive interactive quantile functions of order $D_M$ as in (2.9) is
$$\left\{\prod_{\delta=1}^{D_M}p_{\kappa_\delta}(x_{j_\delta}),\ \text{all }(\kappa_\delta,j_\delta)\text{ with }1\le\kappa_1,\ldots,\kappa_{D_M}\le\bar\kappa,\ 1\le j_1<\cdots<j_{D_M}\le D\right\}.\qquad(4.4)$$
The set (4.4) can be written as a collection $\{P_k(x),\ k=1,\ldots$
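The truncated-power formula for the cardinal B-spline $q(\cdot)$ is straightforward to evaluate directly. The sketch below (with function names of our own choosing, for illustration) checks the partition-of-unity property of the integer translates of $q(\cdot)$, a standard consequence of the localization of the basis.

```python
from math import comb, factorial

def cardinal_bspline(t, m):
    """Order-m cardinal B-spline with support [0, m]:
    q(t) = sum_{i=0}^m (-1)^i C(m, i) (t - i)_+^(m-1) / (m - 1)!."""
    total = 0.0
    for i in range(m + 1):
        u = t - i
        if u > 0.0:
            total += (-1) ** i * comb(m, i) * u ** (m - 1)
    return total / factorial(m - 1)

# Integer translates of q sum to one: for t in (0, 1), the only
# non-vanishing translates are q(t), q(t + 1), ..., q(t + m - 1).
m = 4
for t in (0.3, 0.55, 0.9):
    total = sum(cardinal_bspline(t + k, m) for k in range(m))
    assert abs(total - 1.0) < 1e-12
```

For $m=2$, `cardinal_bspline` is the familiar hat function peaking at $q(1)=1$.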
$,K\}$ with $K=O\left(h^{-D_M}\right)$ localized functions satisfying $\int_{\mathcal X}P_k^2(x)dx=O(1)$ uniformly in $k$ and $h$.

Similar localized sieves can be obtained using wavelets on the interval $[0,1]$. Let $\varphi(\cdot)$ and $\psi(\cdot)$ be the father and mother wavelets of order $s+1$, i.e. $\int t^r\psi(t)dt=0$ for $r=1,\ldots,s+1$. A wavelet sieve similar to (4.4) is given by the collection of functions
$$\prod_{\delta=1}^{D_M}2^{H_0/2}\varphi\left(\frac{x_{j_\delta}-2^{-H_0}\kappa_\delta}{2^{-H_0}}\right)\quad\text{and}\quad\prod_{\delta=1}^{D_M}2^{H/2}\psi\left(\frac{x_{j_\delta}-2^{-H}\kappa_\delta}{2^{-H}}\right),\qquad H_0\le H\le H_1,$$
where $H_0$ and $H_1$ are two diverging integer numbers with $2^{-H_1}\asymp h$, and $\kappa_\delta$ and $j_\delta$ as in (4.4).

The next sections give our theoretical results for the integrated mean squared error and the asymptotic distribution of the augmented estimator $\widehat V(\cdot|x,I)$. Theorem A.1 in Appendix A also gives uniform consistency rates of similar interest.

Recall $P(x_\ell)=[1,x_\ell']'$ is of the constant dimension $K=D+1$ in the AQR case. Let $\mathbf s$ be the $1\times(s+2)$ selection vector $(0,1,0,\ldots,0)$, which is such that $(\mathbf s\otimes\mathrm{Id}_K)\widehat b(\alpha|I)=\widehat\beta^{(1)}(\alpha|I)$ is the estimator of the sieve coefficient derivative $\beta^{(1)}(\alpha)$. Let $\Pi(\alpha)$ be the second column of the inverse of $\int\pi(t)\pi(t)'K(t)dt$, i.e.
$$\Pi(\alpha)=\left(\int\pi(t)\pi(t)'K(t)dt\right)^{-1}\mathbf s',$$
and consider the variance terms
$$v(\alpha)=\Pi(\alpha)'\int\int\pi(t_1)\pi(t_2)'\min(t_1,t_2)K(t_1)K(t_2)\,dt_1dt_2\,\Pi(\alpha),$$
$$\Sigma(\alpha|I)=\frac{\alpha^2 v(\alpha)}{(I-1)^2}E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)\right]E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$\Sigma_{IL}=\int_{\mathcal X}\int_0^1 P(x)'\Sigma(\alpha|I)P(x)\,d\alpha\,dx.$$
That $v(\alpha)$, and then $\Sigma_{IL}$, is strictly positive follows from the proof of Theorem 2 below, see in particular Lemma B.5 in Appendix B. The bias of the estimator will depend upon
$$\mathrm{Bias}(\alpha|I)=\frac{\alpha}{I-1}\,\mathbf s\left(\int\pi(t)\pi(t)'K(t)dt\right)^{-1}\int\frac{t^{s+2}\pi(t)}{(s+2)!}K(t)dt$$
$$\times\ E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)\frac{\alpha B^{(s+2)}(\alpha|x_\ell,I_\ell)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$\mathrm{Bias}^2_{IL}=\int_{\mathcal X}\int_0^1\left(P(x)'\mathrm{Bias}(\alpha|I)\right)^2d\alpha\,dx.$$

Theorem 2 Suppose that the private value conditional quantile function $V(\cdot|\cdot)$ is a quantile regression (2.5), for which $D_M=0$, or a sieve quantile regression (2.10) with $D_M$ interactions. Then under Assumptions A, H, S with $s\ge D_M/2$, there exists an approximation $\widehat v(\alpha|x,I)$ of $\widehat V(\alpha|x,I)$ such that
$$E\left[\int_{\mathcal X}\int_0^1\left(\widehat v(\alpha|x,I)-V(\alpha|x,I)\right)^2d\alpha\,dx\right]=h^{2(s+1)}\mathrm{Bias}^2_{IL}+\frac{\Sigma_{IL}}{LIh^{D_M+1}}+o\left(h^{2(s+1)}+\frac{1}{Lh^{D_M+1}}\right)$$
where $\mathrm{Bias}^2_{IL}=O(1)$, $\Sigma_{IL}=O(1)$ and
$$\int_{\mathcal X}\int_0^1\left(\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)\right)^2d\alpha\,dx=o_P\left(\frac{1}{Lh^{D_M+1}}\right).\qquad(4.5)$$
The quantile estimator $\widehat V(\alpha|x,I)$ is nonlinear and defined in an implicit way, so that attempting a direct computation of its IMSE is difficult.
Its approximation $\widehat v(\alpha|x,I)$ follows from a Bahadur linearization argument, see Theorem D.1 and (E.1) in Appendices D and E. The rate in equation (4.5) is negligible with respect to the IMSE of $\widehat v(\alpha|x,I)$, showing that it is fair to replace $\widehat V(\alpha|x,I)$ by $\widehat v(\alpha|x,I)$ to picture the IMSE of $\widehat V(\alpha|x,I)$.

Note that Theorem 2 holds over the full quantile level range $[0,1]$. The term $\alpha B^{(1)}(\alpha|x,I)$ in $V(\alpha|x,I)=B(\alpha|x,I)+\alpha B^{(1)}(\alpha|x,I)/(I-1)$ is $(s+1)$-times continuously differentiable, which gives the order $h^{s+1}$ for the bias and the order $1/\left(Lh^{D_M+1}\right)^{1/2}$ for the variance. The bias component due to the estimation of $B(\alpha|x,I)$ is of the negligible order $h^{s+2}$, except perhaps over a small vicinity of $0$ where it is $o(h^{s+1})$. The asymptotic variance order $\Sigma_{IL}/\left(LIh^{D_M+1}\right)$ is similar to the asymptotic variance obtained for kernel estimation of a conditional pdf with $D_M$ covariates. Indeed, the bid quantile derivative is homogeneous to a conditional pdf, since
$$B^{(1)}(\alpha|x,I)=\frac{1}{g\left[B(\alpha|x,I)|x,I\right]},$$
where $g(\cdot|\cdot)$ is the bid conditional pdf. The bid quantile function is homogeneous to a cdf and converges with a faster rate. Note that the asymptotic variance term $\Sigma_{IL}/\left(LIh^{D_M+1}\right)$ depends upon the number of interactions $D_M$ and not the dimension of the covariate $D$. Hence Theorem 2 illustrates the dimension reduction features of the procedure. In particular, the variance term is of order $1/(Lh)$ in the AQR case independently of the dimension of the covariate $D$, which therefore can be large.

Minimizing the leading term of the IMSE yields the optimal bandwidth
$$h^*=\left(\frac{(D_M+1)\Sigma_{IL}}{2(s+1)\mathrm{Bias}^2_{IL}LI}\right)^{\frac{1}{2s+D_M+3}}.\qquad(4.6)$$
As in kernel estimation, a pilot bandwidth can be computed using a simple private value quantile regression model to proxy $\Sigma_{IL}$ and $\mathrm{Bias}^2_{IL}$ in a parametric way.
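The closed form (4.6) can be sanity-checked numerically: for any positive constants, the minimizer of the leading IMSE term $h^{2(s+1)}\mathrm{Bias}^2_{IL}+\Sigma_{IL}/(LIh^{D_M+1})$ matches the formula. The constants below are hypothetical, chosen only for illustration.

```python
def h_star(bias2, sigma, L, I, s, d_m):
    """Optimal bandwidth (4.6) minimizing the leading IMSE term."""
    return ((d_m + 1) * sigma / (2 * (s + 1) * bias2 * L * I)) ** (1.0 / (2 * s + d_m + 3))

# hypothetical constants, for illustration only
bias2, sigma, L, I, s, d_m = 1.3, 0.7, 500, 2, 1, 0

def imse(h):
    """Leading IMSE term: squared bias plus variance."""
    return h ** (2 * (s + 1)) * bias2 + sigma / (L * I * h ** (d_m + 1))

# brute-force minimization over a fine grid agrees with the closed form
grid = [0.001 + 0.0005 * k for k in range(2000)]
h_num = min(grid, key=imse)
assert abs(h_num - h_star(bias2, sigma, L, I, s, d_m)) < 1e-3
```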
The corresponding IMSE rate is
$$L^{-\frac{2(s+1)}{2s+D_M+3}},$$
which decreases with the number of interactions $D_M$ but does not depend upon the dimension $D$ of the covariate. In the AQR case with $D_M=0$, the IMSE rate $L^{-\frac{2(s+1)}{2s+3}}$ is, as expected, the optimal rate for estimating the marginal pdf of a real random variable. For $s=1$, it is equal to $L^{-4/5}$ independently of the dimension $D$ of the covariate, which is close to the parametric rate $L^{-1}$.

Two assumptions limit the use of the optimal bandwidth (4.6). First, Theorem 2 assumes $s\ge D_M/2$, which requires more smoothness when the number of interactions $D_M$ is larger; in GPV the dimension $D$ of the covariate replaces $D_M$ but plays a similar role. Aryal et al. (2016) however use a condition $s+1>D$ to study a GMM version of GPV based on a local polynomial estimation of the private value.

This section states a Central Limit Theorem for $\widehat V(\alpha|x,I)$, Theorem 3, which illustrates the good pointwise properties of $\widehat V(\alpha|x,I)$ near or at the upper boundary $\alpha=1$. Let $\mathbf s$ be the selection vector defined earlier and
$$\Pi_h(\alpha)=\left(\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)\pi(t)'K(t)dt\right)^{-1}\mathbf s',$$
$$v_h(\alpha)=\Pi_h(\alpha)'\int_{-\alpha/h}^{(1-\alpha)/h}\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t_1)\pi(t_2)'\min(t_1,t_2)K(t_1)K(t_2)\,dt_1dt_2\,\Pi_h(\alpha),$$
$$\Sigma_h(\alpha|I)=\frac{\alpha^2 v_h(\alpha)}{(I-1)^2}E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)\right]E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],\qquad(4.7)$$
$$\mathrm{Bias}_h(\alpha|I)=\frac{\alpha}{I-1}\,\mathbf s\left(\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)\pi(t)'K(t)dt\right)^{-1}\int_{-\alpha/h}^{(1-\alpha)/h}\frac{t^{s+2}\pi(t)}{(s+2)!}K(t)dt$$
$$\times\ E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)\frac{\alpha B^{(s+2)}(\alpha|x_\ell,I)}{B^{(1)}(\alpha|x_\ell,I)}\right].$$
(4.8)

Theorem 3 Suppose that the private value conditional quantile function $V(\cdot|\cdot)$ is a quantile regression (2.5) or a sieve quantile regression (2.10) with $D_M$ interactions. Then under Assumptions A, H, S with $s\ge D_M/2$ and $\log L/(Lh^{D_M+1+(1\vee D_M)})=o(1)$, it holds for $\alpha$ in $(0,1]$ and all $x$ in $\mathcal X$ that
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\left(\widehat V(\alpha|x,I)-V(\alpha|x,I)-h^{s+1}P(x)'\mathrm{Bias}_h(\alpha|I)+o\left(h^{s+1}\right)\right)$$
converges in distribution to a standard normal. Moreover $P(x)'\Sigma_h(\alpha|I)P(x)\asymp\alpha h^{-D_M}$ and
$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'\mathrm{Bias}_h(\alpha|I)\right|=O(1).$$

Theorem 3 shows that the asymptotic variance of $\widehat V(\alpha|x,I)$ is of order $\alpha/\left(Lh^{D_M+1}\right)$ for $\alpha>0$. For $\alpha=0$, $\widehat V(\alpha|x,I)=\widehat B(\alpha|x,I)$ has an asymptotic variance of order $1/\left(Lh^{D_M}\right)$, and a corresponding CLT using this standardization also holds. For other quantile levels the private value conditional quantile estimator depends upon $\widehat B^{(1)}(\alpha|x,I)$, so that the asymptotic variance of $\widehat V(\alpha|x,I)$ has the larger order $1/\left(Lh^{D_M+1}\right)$, which also holds in Theorem 2. The expression of the asymptotic variance of $\widehat V(\alpha|x,I)$ is quite typical of quantile regression estimators, up to the factor $v_h(\alpha)$, which is due to $\widehat B^{(1)}(\alpha|x,I)$.

It follows from Theorem 3 that the private value conditional quantile estimator is consistent for all quantile levels, including $\alpha=1$. The potential boundary effects only appear through the bias and variance factors $\mathrm{Bias}_h(\alpha|I)$ and $\Sigma_h(\alpha|I)$.
Since the support of the kernel is $[-1,1]$, $\mathrm{Bias}_h(\alpha|I)=\mathrm{Bias}(\alpha|I)$ and $\Sigma_h(\alpha|I)=\Sigma(\alpha|I)$ for all $\alpha$ in $[h,1-h]$, where $\mathrm{Bias}(\alpha|I)$ and $\Sigma(\alpha|I)$ are defined before Theorem 2, allowing in principle to implement simple pilot bandwidths for quantile levels inside $[h,1-h]$. Boundary effects can occur when $\alpha$ lies in $(0,h]$ or $[1-h,1]$. It is commonly believed that the variance factor is inflated near the boundaries, but there is no clear result for the bias factor, see Fan and Gijbels (1996) and the references therein.

4.3 Functional estimation

The plug-in estimators of $\theta(x)$ and $\theta$ in (2.13) are
$$\widehat\theta(x)=\int_0^1 F\left[\alpha,x,\widehat B(\alpha|x,I),\widehat B^{(1)}(\alpha|x,I);I\in\mathcal I\right]d\alpha,\qquad\widehat\theta=\int_{\mathcal X}\widehat\theta(x)dx,$$
with AQR or ASQR $\widehat B(\alpha|x,I)$ and $\widehat B^{(1)}(\alpha|x,I)$. Alternatively, $\theta$ can be estimated using $\sum_{\ell=1}^L\widehat\theta(x_\ell)/L$. Let us now introduce the asymptotic variances of $\widehat\theta(x)$ and $\widehat\theta$. The variances depend upon the matrices
$$\mathcal P(I)=E\left[\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'\right],\qquad\mathcal P(\alpha|I)=E\left[\mathbb I(I_\ell=I)\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
and of the functions, recalling that $b_I$ and $b_I^{(1)}$ stand for $B(\alpha|x,I)$ and $B^{(1)}(\alpha|x,I)$ respectively,
$$\varphi_{0I}(\alpha,x)=\frac{\partial F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]}{\partial b_I},\qquad\varphi_{1I}(\alpha,x)=\frac{\partial F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]}{\partial b_I^{(1)}}.$$
Let $A$ be a random variable with the uniform distribution over $[0,1]$ and define
$$\sigma_L^2(x|I)=I\,\mathrm{Tr}\left\{\mathrm{Var}\left[\left(\int_A^1\left\{\varphi_{0I}(\alpha|x)-\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\right\}\mathcal P(\alpha|I)^{-1}d\alpha\right)\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\},$$
$$\sigma_L^2(I)=I\,\mathrm{Tr}\left\{\mathrm{Var}\left[\int_A^1\left(\int_{\mathcal X}\left\{\varphi_{0I}(\alpha|x)-\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\right\}\mathcal P(\alpha|I)^{-1}\mathcal P^{1/2}(I)P(x)dx\right)d\alpha\right]\right\},$$
$$\sigma_L^2(x)=\sum_{I\in\mathcal I}\sigma_L^2(x|I),\qquad\sigma_L^2=\sum_{I\in\mathcal I}\sigma_L^2(I).$$
The proof of Theorem 4 in Appendix E shows that the asymptotic variances of $\widehat\theta(x)$ and $\widehat\theta$ are $\sigma_L^2(x)/\left(Lh^{D_M}\right)$ and $\sigma_L^2/L$ respectively, provided
$$\varphi_{0I}(\alpha|x)\ne\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\qquad(4.9)$$
for some $\alpha$, $x$ and $I$ of $[0,1]\times\mathcal X\times\mathcal I$. Indeed, if $\varphi_{0I}(\alpha|x)=\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ for all $\alpha$ and $I$, then $\sigma_L^2(x|I)=0$ and, if this also holds for all $x$, $\sigma_L^2=0$, in which case $\widehat\theta(x)$ and $\widehat\theta$ can converge to $\theta(x)$ and $\theta$ with "superefficient" rates, faster than $\left(Lh^{D_M}\right)^{-1/2}$ and $L^{-1/2}$ respectively. In the case of density based functionals, Laurent (1997) similarly obtained asymptotic variances that can vanish. Why this is possible is better understood in our quantile context, through an example of functionals for which (4.9) does not hold. Consider, for some $I$ of $\mathcal I$,
$$F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]=2B(\alpha|x,I)B^{(1)}(\alpha|x,I),$$
which gives $\left(\varphi_{0I}(\alpha|x),\varphi_{1I}(\alpha|x)\right)=2\left(B^{(1)}(\alpha|x,I),B(\alpha|x,I)\right)$. Hence $\varphi_{0I}(\alpha|x)=\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ for all $(\alpha,x,I)$, so that (4.9) does not hold and $\sigma_L^2(x)=\sigma_L^2=0$. Why $\widehat\theta(x)$ and $\widehat\theta$ can converge with superefficient rates for these functionals is in fact not surprising, observing that they estimate
$$\theta(x)=B^2(1|x,I)-B^2(0|x,I),\qquad\theta=\int_{\mathcal X}\theta(x)dx,$$
respectively.
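The superefficiency example can be checked by direct integration: with $F=2BB^{(1)}$, the functional $\int_0^1 F\,d\alpha$ equals $B^2(1)-B^2(0)$ whatever the bid quantile function. A sketch with an arbitrary illustrative $B$ (not taken from the paper):

```python
# theta = integral over [0,1] of 2 B(a) B'(a) equals B(1)^2 - B(0)^2,
# so it only depends on the two extreme quantile levels.
B = lambda a: 0.5 * a + 0.1 * a ** 2     # illustrative bid quantile function
B1 = lambda a: 0.5 + 0.2 * a             # its derivative

# midpoint-rule Riemann sum of the functional
n = 100000
da = 1.0 / n
theta = sum(2 * B((k + 0.5) * da) * B1((k + 0.5) * da) for k in range(n)) * da
assert abs(theta - (B(1.0) ** 2 - B(0.0) ** 2)) < 1e-8
```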
Hence, for these examples, the parameters of interest only depend upon extreme quantiles, in which case superefficient estimation is possible, see e.g. Hirano and Porter (2003) and the references therein. A role of the new Condition (4.9) is to exclude such functionals. The next Theorem establishes the asymptotic normality of $\widehat\theta(x)$ and $\widehat\theta$.

Theorem 4 Suppose Assumptions A, F, H, S and R hold with $s\ge D_M/2$. Then $\sigma_L^2(x)$ and $\sigma_L^2$ are bounded away from $0$ and infinity if (4.9) holds for some $(\alpha,I)$ in $[0,1]\times\mathcal I$ and for some $(\alpha,x,I)$ in $[0,1]\times\mathcal X\times\mathcal I$ respectively. Moreover
i. If $\log L/(Lh^{D_M+2+(D_M\vee 1)})=o(1)$, $\sqrt{Lh^{D_M}}\left(\widehat\theta(x)-\theta(x)-\mathrm{bias}_{L,\theta(x)}\right)/\sigma_L(x)$ converges in distribution to a standard normal, where $\mathrm{bias}_{L,\theta(x)}$ is a $o(h^s)$ bias term.
ii. If $\log L/(Lh^{D_M+1+(D_M\vee 1)})=o(1)$, $\sqrt L\left(\widehat\theta-\theta-\mathrm{bias}_{L,\theta}\right)/\sigma_L$ converges in distribution to a standard normal, where $\mathrm{bias}_{L,\theta}$ is a $o(h^s)$ bias term.

A more systematic study is out of the scope of the present paper, as is the issue of semiparametric efficiency.

When $F(\cdot)$ depends upon $\alpha B^{(1)}(\alpha|x,I)$, as in all the Examples, the exact order of the bias term is $h^{s+1}$, with
$$\mathrm{bias}_{L,\theta(x)}=h^{s+1}(1+o(1))\sum_{I\in\mathcal I}\int_0^1 G_{b_I}\left[\alpha,x,B(\alpha|x,I),\alpha B^{(1)}(\alpha|x,I);I\in\mathcal I\right]\times P(x)'\mathrm{Bias}_h(\alpha|x,I)\,d\alpha$$
and $\mathrm{bias}_{L,\theta}=\int_{\mathcal X}\mathrm{bias}_{L,\theta(x)}dx$, where $\mathrm{Bias}_h(\alpha|x,I)$ is as in (4.8) and $G_{b_I}(\cdot)$ is the partial derivative of $F(\cdot)$ with respect to $\alpha B^{(1)}(\alpha|x,I)$. $\widehat\theta(x)$ or $\widehat\theta$ are therefore asymptotically unbiased if $h^{s+1}\sqrt{Lh^{D_M}}=o(1)$ or $h^{s+1}\sqrt L=o(1)$ respectively. The items $\mathrm{Bias}_h(\alpha|x,I)$ in the integral expression of $\mathrm{bias}_{L,\theta(x)}$ can be replaced with their limits $\mathrm{Bias}(\alpha|x,I)$ defined before Theorem 2. Theorem 4 applies to our functional Examples as follows.

Example 1 (cont'd).
Let $\widehat\theta=\widehat\theta_n/\widehat\theta_d$ be the CRRA risk aversion plug-in estimator derived from (2.15). Under the bandwidth condition of Theorem 4-(ii), $\widehat\theta_n=\theta_n+\mathrm{bias}_{L,\theta_n}+O_P\left(L^{-1/2}\right)$ and $\widehat\theta_d=\theta_d+\mathrm{bias}_{L,\theta_d}+O_P\left(L^{-1/2}\right)$. A standard linearization argument then gives that the asymptotic distribution of
$$\sqrt L\left(\widehat\theta-\theta-\frac{\theta_d\,\mathrm{bias}_{L,\theta_n}-\theta_n\,\mathrm{bias}_{L,\theta_d}}{\theta_d^2}\right)$$
is the one of
$$\frac{\theta_d\sqrt L\left(\widehat\theta_n-\theta_n\right)-\theta_n\sqrt L\left(\widehat\theta_d-\theta_d\right)}{\theta_d^2},$$
which is normal, applying Theorem 4-(ii) with, for the two bidder numbers $I_1<I_2$ used in (2.15),
$$F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]=\frac{B(\alpha|x,I_2)-B(\alpha|x,I_1)}{\theta_d}-\frac{\theta_n}{\theta_d^2}\left(\frac{\alpha B^{(1)}(\alpha|x,I_1)}{I_1-1}-\frac{\alpha B^{(1)}(\alpha|x,I_2)}{I_2-1}\right).$$
The terms $\varphi_{0I}(\alpha|x)-\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ appearing in the asymptotic variances then involve the bid quantile functions $B(\alpha|x,I_1)$ and $B(\alpha|x,I_2)$, their first derivatives, and their second derivatives through $\alpha B^{(2)}(\alpha|x,I)$, which is well defined over $[0,1]$ by (2.3). Using these expressions to estimate the asymptotic variance of the CRRA risk-aversion estimator $\widehat\theta$ is difficult due to the second derivative $B^{(2)}(\alpha|x,I)$, which is difficult to estimate. Although not formally studied here, using a bootstrap procedure may be more appropriate.

Example 2 (cont'd). Theorem 4-(i) together with Theorem 3 are useful to study the plug-in estimator $\widehat{ER}(\alpha_R|x,I)$ derived from (2.17).
Theorem 4-(i) gives that the estimator of the integral component $\theta(x;\alpha_R)$ satisfies $\widehat\theta(x;\alpha_R)=\theta(x;\alpha_R)+O(h^{s+1})+O_P\left(1/\sqrt{Lh^{D_M}}\right)$, while Theorem 3 ensures that $\widehat V(\alpha|x,I)=V(\alpha|x,I)+O(h^{s+1})+O_P\left(1/\sqrt{Lh^{D_M+1}}\right)$. As the $O(h^{s+1})$ items correspond to bias terms and the $O_P(\cdot)$ ones are given by the estimation stochastic component, both $\widehat\theta(x;\alpha_R)$ and $\widehat V(\alpha_R|x,I)$ contribute to the bias of $\widehat{ER}(\alpha_R|x,I)$. The asymptotic distribution of the bias-centered $\sqrt{Lh^{D_M+1}}\left(\widehat{ER}(\alpha_R|x,I)-ER(\alpha_R|x,I)\right)$ is the one of $I\alpha_R^{I-1}(1-\alpha_R)\sqrt{Lh^{D_M+1}}\left(\widehat V(\alpha_R|x,I)-V(\alpha_R|x,I)\right)$, which follows from Theorem 3. The uniform consistency Theorem A.1 in Appendix A can be used to study the estimated screening level $\widehat\alpha_R(x,I)$ and reserve price $\widehat V(\widehat\alpha_R(x,I)|x,I)$ obtained by maximizing $\widehat{ER}(\alpha_R|x,I)$.

Example 3 (cont'd). Theorem 4-(i) is also useful to study the private value cdf and pdf estimators from Example 3, with a fixed bandwidth $\eta$. The proof carries over if $\eta$ goes to $0$ with $h=o(\eta)$, and the order of the variance given by Theorem 4-(i) is correct if $h$ is of the order of $\eta$. For the cdf estimator $\widehat F_\eta(v|x,I)=\int_0^1\mathcal I_\eta\left[v-\widehat V(\alpha|x,I)\right]d\alpha$,
$$\varphi_{0I}(\alpha|x)=-\frac{1}{\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right),\qquad\varphi_{1I}(\alpha|x)=-\frac{\alpha}{(I-1)\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right),$$
$$\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}=-\frac{1}{(I-1)\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right)+\frac{\alpha}{(I-1)\eta^2}k^{(1)}\left(\frac{v-V(\alpha|x,I)}{\eta}\right)V^{(1)}(\alpha|x,I).$$
When $\eta$ goes to $0$, the dominant part of the variance is, for inner $v$, integrating by parts and setting $V_{x,I}=V(A|x,I)$,
$$\frac{I}{Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\left(\int_A^1\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\mathcal P(\alpha|I)^{-1}d\alpha\right)\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I}{Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\varphi_{1I}(A|x)\frac{\partial\mathcal P(A|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I}{(I-1)^2Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\frac{F(V_{x,I}|x,I)}{f(V_{x,I}|x,I)}\frac{k\left(\frac{v-V_{x,I}}{\eta}\right)}{\eta}\frac{\partial\mathcal P(F(V_{x,I}|x,I)|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I\int k^2(t)dt}{(I-1)^2L\eta h^{D_M}}\left(\frac{F(v|x,I)}{f(v|x,I)}\right)^2\mathrm{Tr}\left\{\frac{\partial\mathcal P(F(v|x,I)|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M}P(x)P(x)'\mathcal P(I)^{1/2}\frac{\partial\mathcal P(F(v|x,I)|I)^{-1}}{\partial\alpha}\right\}.$$
Hence the order of the variance of $\widehat F_\eta(v|x,I)$ is $1/\left(L\eta h^{D_M}\right)$. Its bias as an estimator of $F(v|x,I)$ has two components: the first is $\mathrm{bias}_{L,F_\eta(v|x,I)}$, due to the bias of $\widehat V(\alpha|x,I)$, and is of order $O(h^{s+1})$, while the second is $F_\eta(v|x,I)-F(v|x,I)=O(\eta^{s+1})$ if $k(\cdot)$ is a kernel of order $s+1$. It follows that the optimal bandwidths $h$ and $\eta$ must have the same order $L^{-1/(2s+D_M+3)}$, which gives the consistency rate $L^{-(s+1)/(2s+D_M+3)}$. Repeating these steps for the pdf estimator $\widehat f_\eta(v|x,I)$ gives the optimal consistency rate $L^{-s/(2s+D_M+3)}$ which, up to a logarithmic term, corresponds to the GPV optimal minimax rate in the presence of $D_M$ covariates.

5 Simulation experiments

This section reports the results of a simulation experiment for the AQR estimation of the private value quantile function, the expected revenue and the optimal reserve price under risk neutrality, from first-price auctions with $I=2$.
A second simulation experiment considersestimation of risk aversion based on comparison of first-price auctions with I = 2 and I = 3as in (2.15) and on comparison with first-price and ascending auctions with I = 2. In eachcase, the considered number of auctions is L = 100 and the number of replications is 1 , αB (1) ( α | x, I ) / ( I − I = 2 corresponds to a worst case scenario. By contrast,the simulation experiment in GPV considers I = 5 while I = 3 or 5 in Marmer and Shneyerov(2012) and Ma, Marmer and Shneyerov (2018). The number of bids in these references rangefrom 1 , 000 for GPV to 4 , 200 for Marmer and Shneyerov (2012). In a simulation experi-ment focused on the nonparametric estimation of the utility function of risk averse bidders,Zincenko (2018) considers I = 2 with L = 300 and I = 4 with L = 150. Our simulationexperiment is therefore more focused on small samples. We also use three covariate whilethe aforementioned simulation experiments do not consider covariate, with the exception ofZincenko (2018) who increases the number of auctions to L = 900 for one or two covariatesto cope with the curse of dimensionality. The private value quantile function is given by a quantile regression model with an interceptand three independent covariates with the uniform distribution over [0 , V ( α | x ) = γ ( α ) + γ ( α ) x + γ ( α ) x + γ ( α ) x γ ( α ) = 1 + 0 . α − , γ ( α ) = 1 ,γ ( α ) = 0 . − exp( − α )) , γ ( α ) = 0 . . π + 1) α + cos(2 πα )) . The coefficient γ ( · ) is flat near 0 and fastly increases near 1, as observed in the applicationdisplayed in the next section, while γ ( · ) fastly increases near 0 and is flat after. Thederivative of γ ( · ) has some oscillating patterns.The expected revenue ER ( α ) is computed from (2.17) setting the intercept, x and x to 0 and taking x = 0 . 8. This choice gives a unique optimal reserve price achieved for α = . 
3, which is not too close to the boundaries so that the expected revenue function hasa substantial concave shape which is suppose to make estimation more difficult. The private value quantile regression is estimated from a sample of 100 first-price auctionswith two bids over the estimation grid α = 0 , . , . . . , . , (cid:98) V ( α | x ) of order 2 and kernel K ( t ) = 6 t (1 − t ) I ( t ∈ [0 , (cid:100) ER ( α ) plugs 0 . (cid:98) γ ( α ) into (2.17) using Riemann sums to com-pute integrals. The optimal screening level (cid:98) α ∗ maximizes (cid:100) ER ( α ) over the grid and is usedto compute the estimated optimal reserve price (cid:98) R ∗ = . (cid:98) γ ( (cid:98) α ∗ ) and the estimated optimalrevenue (cid:100) ER ∗ = (cid:100) ER ( (cid:98) α ∗ ).Table 1 summarizes the simulation results for the estimation of the private value quantilefunction, the expected revenue and the optimal reserve price. The Bias and Square RootIntegrated Mean Squared Error (RIMSE) lines for (cid:98) V ( ·|· ) gives the simulation counterpartsof, respectively (cid:32) (cid:88) j =0 (cid:90) ( E [ (cid:98) γ j ( α )] − γ j ( α )) dα (cid:33) / and (cid:32) (cid:88) j =0 (cid:90) E (cid:2) ( (cid:98) γ j ( α ) − γ j ( α )) (cid:3) dα (cid:33) / . . , . , . . . , . h . . . . . . . . (cid:98) V ( ·|· ) Bias .131 .141 .143 .145 .150 .159 .166 .176RIMSE .433 .386 .355 .332 .322 .309 .303 .305 (cid:100) ER ( · ) Bias .036 .044 .049 .050 .051 .049 .047 .045RIMSE .109 .104 .102 .100 .099 .098 .097 .096 (cid:98) R ∗ Bias -.036 -.031 -.014 -.002 .009 .022 .037 .043RMSE .129 .099 .075 .067 .062 .064 .066 .066Table 1: Private value quantile function, expected revenue, and optimal reserve priceFigure 2: Private value quantile estimation for h = 0 . h = 0 . V ( α | x ) = γ ( α ) + ( γ ( α ) + γ ( α ) + γ ( α )) / . − . 
5% and 97.5% quantiles of V̂(α|x) across 1,000 simulations.

Estimation of the private value slope coefficients seems much more sensitive to the bandwidth parameter than the expected revenue or the optimal reserve price, and it also has a much higher RIMSE. The bandwidth behavior of V̂(α|x) is illustrated in Figure 2, which considers a small bandwidth together with the large bandwidth h = 0.8.

Figure 3: Expected revenue estimation for a small and a large bandwidth. Full black line: true ER(α|x); dashed red line: average estimate; dotted red lines: pointwise 2.5% and 97.5% quantiles of ÊR(α|x) across 1,000 simulations.

As expected from Theorem 3, the variance of V̂(α|x) increases with α and decreases with h, while the bias increases with both α and h. Figure 2 also suggests that choosing a large bandwidth, as recommended by Table 1, may lead to important bias issues, including underestimation of the private value quantile function for high α. This contrasts with the estimation of the expected revenue and optimal reserve price, which seems mostly unaffected by the bandwidth. This is because the expected revenue depends upon (1 − α)V(α|x): multiplying the private value quantile function by (1 − α) mitigates the larger bias and variance near the boundary α = 1; see also Figure 3. For the considered experiment, the true expected revenue always lies within the 95% band of Figure 3, while the true private value quantile function lies outside it for large α when h = 0.8.

CRRA risk aversion

Two risk aversion estimators are considered. The first estimator θ̂_fp is based upon (2.15) and uses two independent samples of size L = 100 with 2 and 3 bidders from the model above, which corresponds to a CRRA utility function x^θ with θ = 1. Integrals with respect to α are computed using Riemann sums, whereas integrals with respect to x are replaced with sample means over the two auction samples.
The second estimator θ̂_asc is based upon (2.16) and uses an additional sample of size L = 100 of ascending auctions with two bidders. In this case it is possible to consider various values of θ, and the simulation experiment considers several values starting with 0.2. (The optimal bid functions can be computed explicitly in the risk neutral case θ = 1; considering other values of θ would require numerical computation of the bid functions.) If B(α|x) is the first-price auction bid quantile function with I = 2, the observed bids drawn from B(α|x) are rationalized by a CRRA utility function x^θ if the private value quantile function is set to V_θ(α|x) = B(α|x, 2) + θαB^(1)(α|x, 2), which satisfies V_θ^(1)(·|x) > 0, as seen from Campo et al. (2011) and (2.14) here. As V_θ^(1)(·|·) > 0, it is possible to draw from V_θ(α|x) to generate two ascending bids for each auction. Following Gimenes (2017), V_θ(α|x) can be estimated from the winning bids in these ascending auctions using AQR at quantile level 2α − α² instead of α.

The performance of the two estimators is summarized in Table 2.

[Table 2: Bias and RMSE of θ̂_fp and θ̂_asc across bandwidths h.]

Table 2 shows that θ̂_asc, which combines first-price and ascending auctions as in Lu and Perrigne (2008), dominates θ̂_fp in this experiment. While the RMSE and bias of θ̂_asc do not seem sensitive to h, this is not the case for θ̂_fp, which has a large downward bias, and hence RMSE, for small h. Further investigation suggests this is due to an unbalanced variable issue: the difference B̂(α|x, 3) − B̂(α|x, 2) is very smooth, while α(B̂^(1)(α|x, 3)/2 − B̂^(1)(α|x, 2)) is more erratic, especially when α is close to 1. This issue is addressed in the application by restricting α to [0, 0.8] for risk aversion estimation.

This section illustrates empirically the methodology using data from timber auctions run by the US Forest Service (USFS).
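The quantile-level change used for the ascending auctions can be sketched numerically. With two bidders the winning bid equals the lower of the two private values, whose rank A = min(U_1, U_2) has cdf 2a − a², so the value quantile at level α equals the winning-bid quantile at level 2α − α². The bid quantile function B and its derivative below are illustrative placeholders, not the paper's estimates.

```python
import numpy as np

def B(alpha):            # placeholder first-price bid quantile function, I = 2
    return 0.5 * alpha + 0.25 * alpha**2

def dB(alpha):           # its derivative with respect to the quantile level
    return 0.5 + 0.5 * alpha

def V_theta(alpha, theta, I=2):
    # Private values rationalizing the bids under CRRA utility x**theta:
    # V_theta(a|x) = B(a|x, I) + theta * a * B^(1)(a|x, I) / (I - 1), cf. (2.14).
    return B(alpha) + theta * alpha * dB(alpha) / (I - 1)

# Simulate ascending auctions: the winning bid is the lower of the two values,
# i.e. V_theta evaluated at the lower of two independent uniform ranks.
rng = np.random.default_rng(1)
L, theta = 5000, 0.5      # large L for illustration (the paper uses L = 100)
a_win = np.minimum(rng.uniform(size=L), rng.uniform(size=L))
winning_bids = V_theta(a_win, theta)

# Recover V_theta(alpha) from the winning-bid quantile at level 2*alpha - alpha**2.
alpha = 0.5
v_hat = np.quantile(winning_bids, 2 * alpha - alpha**2)
```

Here v_hat should be close to V_theta(0.5, 0.5) = 0.5; in the paper the same quantile-level change is applied within AQR rather than to raw empirical quantiles.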
Timber auction data have been used in several empirical studies (see Athey and Levin (2001), Athey, Levin and Seira (2011), Li and Zheng (2012), Aradillas-Lopez, Gandhi and Quint (2013), among others). Other works have investigated risk aversion in timber auctions (e.g., Lu and Perrigne (2008), Athey and Levin (2001), Campo et al. (2011)). The data set used here is from Lu and Perrigne (2008) and Campo et al. (2011), and aggregates auctions held in 1979 in the states covering the western half of the United States (regions 1–6 as labeled by the USFS). It contains the bids and a set of variables characterizing each timber tract, including the estimated volume of the timber, measured in thousands of board feet (mbf), and its estimated appraisal value, given in dollars per unit of volume. We consider the 107 first-price auctions with two bidders, the first-price auctions with three bidders (L = 108), and the ascending auctions with two bidders (L = 241). The considered covariates are the appraisal value and the timber volume, both taken in log. The rest of the application uses a quantile regression model for the private value, which is estimated via AQR of order 2 with kernel K(t) = 6t(1 − t)I(t ∈ [0, 1]) and bandwidth h in {0.1, 0.2, ..., 0.8}. Confidence intervals are computed using a pairwise bootstrap.

Bid quantile functions. Table 3 gives the coefficients of a regression of the bids on these variables; the dependent variables are the bids for the first-price auctions, while the winning bid is used for the ascending auctions.

[Table 3: Bid regressions. Columns: Intercept, Volume, Appraisal value, R². Rows: first-price with I = 2, first-price with I = 3, ascending with I = 2; standard errors in parentheses.]

The appraisal value coefficient is close to 1 in all auctions, but it differs when comparing the first-price auction with I = 2 with the one with I = 3 and with the ascending auction. Similarly, the volume coefficient of the first-price auction with I = 2 differs from the one with I = 3 at the 10% level, and also at the 5% level when using a one-sided test.
These findings are consistent with a quantile regression specification with non-constant coefficients for these two variables. The intercept coefficients of the first-price auctions with I = 2 and I = 3 are not statistically distinct at the 5% level. Such a dependence of the slope coefficients on I is not compatible with the homogenized bid regression model V = γ_0 + X′γ + v with v independent of X: for this model, the volume and appraisal value coefficients obtained from a bid regression should not depend upon I under entry exogeneity, as discussed in Section 2.2.

Figure 4 sums up the quantile regression analysis of the first-price auction bids with I = 2. The difference between the AQR volume slope and the regression coefficient is consistently outside the pointwise 90% bootstrap confidence interval, a finding which holds for all bandwidths. The case of the appraisal value is more difficult. The difference between the estimated regression coefficient and the AQR lies outside the confidence bands between α = 90% and α = 1, due to a strong increase of the AQR, but this only holds for the smallest bandwidths. Figure 4 also reports the standard quantile regression, which exhibits a similar pattern. This suggests a potential AQR bias issue for larger h. The intercept does not seem to depend upon α, as suggested by the standard QR estimation. Therefore, the intercept will be kept constant and set to its estimated value from Table 3 in the rest of the application. Comparison of the augmented …

Figure 4: Two-bidder first-price auction bid quantile slope coefficients: intercept (left), volume (center) and appraisal value (right). AQR with h = 0.…

Risk aversion. The two risk aversion estimators look insensitive to the bandwidth, producing a risk aversion estimate around 0.85 for θ̂_fp and 0.… for θ̂_asc. The bootstrap median of θ̂_fp reported in Table 4 suggests that the distribution of θ̂_fp is asymmetric, with a median around 0.75, slightly above that of θ̂_asc.
These risk aversion estimates are similar to those obtained by Lu and Perrigne (2008) and Campo et al. (2011). The bootstrap 90% confidence intervals in Table 4 suggest a much higher dispersion than the ones reported by these authors from asymptotic variance estimations. In particular, it is not possible to reject risk neutrality.

[Table 4: 5%, 50% and 95% bootstrap quantiles of θ̂_fp and θ̂_asc, h = 0.1, ..., 0.8.]

Figure 5: AQR estimation (full line), regression (full straight line) and 5% and 95% bootstrap quantiles, h = 0.3.

Private value quantile function and expected revenue. This section reports estimation results under risk neutrality for first-price auctions with two and three bidders and for ascending auctions. Figure 5 gives the private value slope functions of the volume and appraisal variables. The volume slope functions differ from the corresponding OLS coefficients for all auctions and all the considered bandwidths. Their shape however varies across auctions: while convex and in the [20, …] range in the first-price case, it is in the [8, …] range for the ascending auctions. The appraisal value slope is close to 1 for α near 0, suggesting that low type bidder valuations of timber lots are very close to the appraisal value. This contrasts with high type bidders with higher α, whose markup can be very high, in a significant way in the case of the ascending auction. This illustrates again the important difference between low type and high type bidders. A possible discrepancy between first-price and ascending auctions with two bidders also appears in the expected revenue, computed for median values of the two explanatory variables; see Figure 6.

Figure 6: Estimated expected revenue for first-price (full line) and ascending (diamond) auctions with two bidders, with pointwise 5% and 95% bootstrap quantiles in dashed lines.

The ascending auction expected revenue is always below the first-price one. This seems statistically significant for high quantile levels. This may not be relevant for the seller, as the optimal revenue is achieved over a wide range [0, 0.5] of quantile levels over which the two expected revenue curves seem flat. This feature, which appears for all the considered bandwidths, suggests again that the private value quantile function of bidders participating in first-price auctions is higher than the one for ascending auctions. Note also that the bootstrap confidence bands for the first-price auctions are larger than for the ascending ones, as for all the estimations reported in this application.

This paper has presented a quantile regression modeling strategy for first-price auctions with risk-neutral bidders under the independent private value paradigm. For a conditional private value quantile function given by a quantile regression, the conditional bid quantile function is also a quantile regression. Detecting the quantile regression slopes which are not constant can be done by looking at the corresponding bid quantile regression slopes or, with less rigor, at the variation of the corresponding homogenized bid regression coefficient with respect to the number of bidders, which is also a consistent estimator of a constant private value slope. Non-constant private value slope functions can be recovered from their bid counterparts and their derivatives with respect to the quantile level. The latter can be estimated using the augmented quantile regression proposed in this paper, which applies local polynomial techniques to estimate jointly the bid quantile slopes and their derivatives. This approach is found to work well both in simulations and in a timber auction application, where a strong low type/high type bidder heterogeneity is detected. This can be interpreted as caused by heterogeneous bidder abilities to transform the auctioned timber lots into more valuable goods.
An empirical finding is that the seller expected revenue in a median auction is higher in first-price than in ascending auctions. The estimated expected revenue curves look flat for reserve prices below a quite large threshold, including the optimal ones. This suggests that the choice of a reserve price may not be important, at least for the median auction considered in the application. A new local polynomial estimation procedure for bid quantile regression and its quantile level derivatives is proposed to implement this strategy. It is based on a smoothed objective function which produces smooth estimates, as illustrated in the simulations and the empirical application. The auction modeling strategy also applies to unspecified quantile functions thanks to linear sieve methods. This also allows one to consider flexible and parsimonious specifications such as additive quantile functions. The proposed private value quantile estimator converges with nonparametric rates, mimicking the fast optimal ones achieved in the absence of covariates for a quantile regression, or with a univariate covariate for an additive quantile specification. Various functionals of the private value quantile function are considered, such as the expected revenue, the private value conditional cdf and pdf, or risk aversion for bidders with a common CRRA utility function. Much work remains to be done. The asymptotic distributions derived for the proposed estimators often have a complicated variance, so it may be wiser to use bootstrap inference, as in the empirical application. The risk aversion estimator exhibits a quite large variance, suggesting that a better understanding of efficiency issues is needed. Various extensions can also be considered. The quantile approach can be extended to exchangeable affiliated values as considered in Hubbard, Li and Paarsch (2012). The quantile regression with unobserved variables estimation method of Wei and Carroll (2009) can be used to tackle unobserved heterogeneity as in Krasnokutskaya (2011).
The quantile identification and estimation strategy can also be modified to deal with endogenous entry, caused for instance by a reserve price as in Guerre, Perrigne and Vuong (2000) or by entry costs as considered by Marmer, Shneyerov and Xu (2013a) or Gentry and Li (2014).

References

[1] Andrews, D.W.K. & Y.J. Whang (1990). Additive interactive regression models: circumventing the curse of dimensionality. Econometric Theory, 6, 466–479.
[2] Athey, S. & J. Levin (2001). Information and competition in U.S. Forest Service timber auctions. Journal of Political Economy, 375–417.
[3] Athey, S., J. Levin & E. Seira (2011). Comparing open and sealed bid auctions: evidence from timber auctions. The Quarterly Journal of Economics, 207–257.
[4] Aradillas-Lopez, A., A. Gandhi & D. Quint (2013). Identification and inference in ascending auctions with correlated private values. Econometrica.
[5] Aryal, G., M.F. Gabrielli & Q. Vuong (2016). Semiparametric estimation of first-price auction models. CONICET and Universidad Nacional de Cuyo, University of Virginia.
[6] Bassett, G. & R. Koenker (1982). An empirical quantile function for linear models with iid errors. Journal of the American Statistical Association, 407–415.
[7] Belloni, A., V. Chernozhukov, D. Chetverikov & I. Fernández-Val (2017). Conditional quantile processes based on series or many regressors. arXiv:1105.6154v3.
[8] Campo, S., E. Guerre, I. Perrigne & Q. Vuong (2011). Semiparametric estimation of first-price auctions with risk-averse bidders. Review of Economic Studies, 78, 112–147.
[9] Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. Chap. 76 in Handbook of Econometrics, Vol. 6B. Elsevier.
[10] Chernozhukov, V., I. Fernández-Val & A. Galichon (2010). Quantile and probability curves without crossing. Econometrica, 1093–1125.
[11] Daubechies, I. (1992). Ten lectures on wavelets. SIAM.
[12] Dette, H. & S. Volgushev (2008). Non-crossing non-parametric estimates of quantile curves.
Journal of the Royal Statistical Society: Series B , 609–627.4513] Enache, A. & J.P. Florens (2015). A quantile approach for the estimation of first-price private value auction. Working Paper, Paris School of Economics.[14] Fan, J. & I. Gijbels (1996). Local polynomial modeling and its applications. Chapmanand Hall/CRC.[15] Gentry, M. & T. Li (2014). Identification in auctions with selective entry. Econo-metrica , 315–344.[16] Gimenes, N. (2017). Econometrics of ascending auction by quantile regression. Reviewof Economics and Statistics , 944–953.[17] Guerre, E., I. Perrigne & Q. Vuong (2000). Optimal nonparametric estimationof first-price auctions. Econometrica , 525–574.[18] Guerre, E., I. Perrigne & Q. Vuong (2009). Nonparametric identification of riskaversion in first-price auctions under exclusion restrictions. Econometrica , 1193–1227.[19] Guerre, E. & C. Sabbah (2012). Uniform bias study and Bahadur representation forlocal polynomial estimators of the conditional quantile function. Econometric Theory , 87–129.[20] H¨ardle, W., G. Kerkyacharian, D. Picard & A. Tsybakov (1998). Wavelets,approximation and statistical applications. Springer.[21] Haile, P.A., H. Hong & M. Shum (2003). Nonparametric tests for common valuesin first-price sealed-bid auctions. Cowles Foundation discussion paper.[22] Hickman, B.R. & T.P. Hubbard (2015). Replacing sample trimming with bound-ary correction in nonparametric estimation of first-price auctions. Journal of AppliedEconometrics , , 736-762. 4623] Hirano, K. & J.R. Porter (2003). Asymptotic efficiency in parameter structuralmodels with parameter-dependent support. Econometrica , , 1307–1338.[24] Horowitz, J.L. & S. Lee (2005). Nonparametric estimation of an additive quantileregression model. Journal of the American Statistical Association , 1238–1249.[25] Hubbard, T.P., T. Li & H.J. Paarsch (2012). Semiparametric estimation in modelsof first-price, sealed-bid auctions with affiliation. 
Journal of Econometrics , , 4–16.[26] Koenker, R. (2005). Quantile regression . Cambridge University Press.[27] Koenker, R. & G. Bassett (1978). Regression quantiles. Econometrica, , 33–50.[28] Krasnokutskaya, E. (2011). Identification and estimation of auctions models withunobserved heterogeneity. Review of Economic Studies, , 293–327.[29] Laurent, B. (1997). Estimation of integral functionals of a density and its derivatives. Bernoulli, , 181–211[30] Li, T., I. Perrigne & Q. Vuong (2003). Semiparametric estimation of the optimalreserve price in first-price Auctions, Journal of Business & Economic Statistics Li, T. & X. Zheng (2012). Information acquisition and/or bid preparation: A struc-tural analysis of entry and bidding in timber sale auctions. Journal of Econometrics , 29–46.[32] Liu, N. & Y. Luo (2017). A nonparametric test of exogenous participation in first-priceauctions. International Economic Review , 857–887[33] Liu, N. & Q. Vuong (2018). Nonparametric test of monotonicity of bidding strategyin first price auctions. Working paper. 4734] Lu, J. & I. Perrigne (2008). Estimating risk aversion from ascending and sealed-bid auctions: the case of timber auction data. Journal of Applied Econometrics ,871–896.[35] Luo, Y. & Y. Wan (2018). Integrated-Quantile-Based Estimation for First-Price Auc-tion Models. Journal of Business & Economic Statistics , 173-180.[36] Ma, J., Marmer, V., & A. Shneyerov (2018). Inference for first-price auctionswith Guerre, Perrigne and Vuong’s estimator. Working paper,[37] Marmer, V., & A. Shneyerov (2012). Quantile-based nonparametric inference forfirst-price auctions. Journal of Econometrics , 345–357.[38] Marmer, V., A. Shneyerov & P. Xu (2013a). What model for entry in first-priceauctions? A nonparametric approach. The Journal of Econometrics , , 46–58.[39] Marmer, V., A. Shneyerov & P. Xu (2013b). What model for entry in first-priceauctions? A nonparametric approach. Supplementary material. 
Website of The Journalof Econometrics .[40] Maskin, E. & J.G. Riley (1984). Optimal auctions with risk averse buyers. Econo-metrica Menzel, K. & P. Morganti (2013). Large sample properties for estimators basedon the order statistics approach in auctions. Quantitative Economics, Milgrom, P.R. (2001). Putting auction theory to work. Cambridge University Press.[43] Milgrom, P.R. & R.J. Weber (1982). A theory of auctions and competitive bidding. Econometrica , , 1089–1122.[44] Paarsch, H.J. & H. Hong (2006). An introduction to the structural econometrics ofauction data . MIT Press. 4845] Rezende, L. (2008). Econometrics of auctions by least squares. Journal of AppliedEconometrics , 925–948.[46] Schumaker, L.L. (2007). Spline functions: basic theory. Cambridge University Press.[47] Wei, Y. & R.J. Carroll (2009). Quantile regression with measurement error. Jour-nal of the American Statistical Association , 1129–1143.[48] Zincenko, F. (2018). Nonparametric estimation of first-price auctions with risk-aversebidders. Journal of Econometrics , 303–335.49 nline Appendix A: Sieve assumption and uniform con-sistency resultsA.1 High-level sieve assumption Section 4.1.2 suggests to use spline or wavelet but our results hold for more general sievechoices satisfying the high level Assumption R. The first key condition is the followingapproximation property. Approximation property S . 
For each function V(α; x) with D_M interactions as in (2.9), (s + 1)-times continuously differentiable over [0, 1] × X, there exist coefficients γ_k(·) = γ_kK(·), (s + 1)-times continuously differentiable over [0, 1] with equicontinuous γ_kK^(s+1)(·), such that

sup_{(α,x) ∈ [0,1]×X} | V(α; x) − Σ_{k=1}^{K} γ_k(α) P_k(x) | = o(K^{−(s+1)/D_M}),   (A.1.1)

sup_{(α,x) ∈ [0,1]×X} | ∂^p V(α; x)/∂α^p − Σ_{k=1}^{K} γ_k^(p)(α) P_k(x) | = o(1),   p = 1, . . . , s + 1.   (A.1.2)

Note that K ≍ h^{−D_M} under Assumption H. Chen (2007) gives a O(K^{−(s+1)/D_M}) rate for standard sieve methods and functions with s + 1 bounded derivatives, which is comparable to the rate in (A.1.1). The rate o(K^{−(s+1)/D_M}) holds for functions with continuous derivatives of order s + 1 for multivariate B-splines (Schumaker, 2007) of order s + 1 as in (4.4), or multivariate wavelets generated by a father wavelet function p(·) of order s + 1; see Härdle et al. (1998), Chen (2007) and the references therein, in particular Daubechies (1992). These two sieves also satisfy (A.1.2), as the corresponding coefficients γ_k(·) can be written as ∫_X λ_k(x) V(α; x) dx for well-chosen λ_k(·) = λ_kK(·) satisfying sup_K ∫_X |λ_k(x)| dx < ∞. The high-level sieve assumption considered in our results is as follows.

Assumption R The sieve satisfies the Approximation property S. In the AQR case the matrices E[I(I_ℓ = I) X_ℓ X_ℓ′], I in I, are full rank, and in the ASQR case (i) the eigenvalues of the Gram matrix ∫_X P(x) P(x)′ dx stay bounded away from 0 and infinity when the dimension K of P(·) increases, and max_{x∈X} ‖P(x)‖ = O(K^{1/2}).
(ii) The sieve {P_k, 1 ≤ k ≤ K} is composed of localized functions, in the sense that there is a c > 0 such that P_{k_1}(·) P_{k_2}(·) = 0 as soon as |k_1 − k_2| > c, with

max_{k ≤ K} { ∫_X |P_k(x)| dx } = O(K^{−1/2}).

(iii) For some η in (0, 1] and K_L with log K_L = O(log L), it holds that ‖P(x) − P(x′)‖ ≤ K_L ‖x − x′‖^η for all x, x′ of X.

Assumption R first imposes well-conditioned matrices, E[I(I_ℓ = I) X_ℓ X_ℓ′] for the AQR case and ∫_X P(x) P(x)′ dx for the ASQR case. The rest of Assumption R holds for the sieve (4.4), as

max_{x∈X} ‖P(x)‖ = O(h^{−D_M/2}),   max_{k ≤ K} { ∫_X |P_k(x)| dx } = O(h^{D_M/2}),

with K ≍ h^{−D_M} by Assumption H. Assumption R-(iii) holds when the order K of the sieve (4.4) grows at a polynomial rate, provided q(·) is Hölder with exponent η. This allows for cardinal B-splines, for which η = 1, but also for wavelets, which are not always differentiable but Hölder with η < 1; see Daubechies (1992).

A.2 Uniform consistency rates

The next theorem deals with the uniform consistency of the ASQR procedure.

Theorem A.1 Suppose that the private value conditional quantile function V(·|·) is a quantile regression (2.5) or a sieve quantile regression (2.10) with D_M interactions. Then, under Assumptions A, H, S and R with s ≥ D_M/2 and log L / (L h^{D_M+1+(D_M∨1)}) = O(1), it holds that

sup_{(α,x,I) ∈ [0,1]×X×I} | V̂(α|x, I) − V(α|x, I) | = O_P( (log L / (L h^{D_M+1}))^{1/2} + h^{s+1} ),

sup_{(α,x,I) ∈ [0,1]×X×I} | B̂(α|x, I) − B(α|x, I) | = O_P( (log L / (L h^{D_M}))^{1/2} ) + o(h^{s+1}).
The bandwidth condition used in Theorem A.1 is similar to the one of Theorem 3 and allows an optimal bandwidth of order (log L/L)^{1/(2s+D_M+3)} provided the smoothness s is large enough. Under this condition the uniform consistency rate of the private value conditional quantile estimator is

(log L/L)^{(s+1)/(2s+D_M+3)},

which coincides with the GPV optimal minimax uniform consistency rate for the estimation of the private value conditional cdf in the presence of D_M covariates. (GPV consider the pdf, but the rate for the cdf or the quantile function can be derived similarly.) Theorem A.1 also includes a uniform consistency rate for the bid conditional quantile function estimator, which can be used to estimate the bidders' signals and private values.

References

[1] Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. Chap. 76 in Handbook of Econometrics, Vol. 6B. Elsevier.
[2] Daubechies, I. (1992). Ten lectures on wavelets. SIAM.
[3] Härdle, W., G. Kerkyacharian, D. Picard & A. Tsybakov (1998). Wavelets, approximation and statistical applications. Springer.
[4] Schumaker, L.L. (2007). Spline functions: basic theory. Cambridge University Press.

Online Appendix B: Notations and intermediary results

We start with additional notations used all along the proofs and some preliminary lemmas which are established in Appendix F. In what follows,

P(x) = [1, x′]′ in the AQR case (K = D + 1), and P(x) = [P_1(x), . . . , P_K(x)]′ in the ASQR case,

allowing a unified treatment of the two estimators, although the proofs focus on the more difficult ASQR case.
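In the ASQR case, P(x) can be built from tensor products of univariate cardinal B-splines, as in the sieve (4.4). A self-contained numpy sketch of the univariate basis via the Cox–de Boor recursion follows; the knot layout and spline order are illustrative choices, not the paper's.

```python
import numpy as np

def bspline_basis(x, knots, order):
    # Cox-de Boor recursion: values of all B-splines of the given order
    # (degree = order - 1) at points x, for a non-decreasing knot vector.
    x = np.atleast_1d(x)
    B = ((knots[:-1][None, :] <= x[:, None]) &
         (x[:, None] < knots[1:][None, :])).astype(float)   # order-1 indicators
    for k in range(2, order + 1):
        new = np.zeros((x.size, len(knots) - k))
        for j in range(len(knots) - k):
            left = knots[j + k - 1] - knots[j]
            right = knots[j + k] - knots[j + 1]
            if left > 0:
                new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                new[:, j] += (knots[j + k] - x) / right * B[:, j + 1]
        B = new
    return B

# Cardinal cubic B-spline sieve on [0, 1]: equally spaced interior knots,
# clamped (repeated) boundary knots.
m, order = 8, 4                       # m subintervals, cubic splines
knots = np.concatenate([np.zeros(order - 1), np.linspace(0.0, 1.0, m + 1),
                        np.ones(order - 1)])
P = bspline_basis(np.linspace(0.0, 0.999, 200), knots, order)  # (200, K) design
```

The resulting K = m + order − 1 basis functions are nonnegative, locally supported (so products of distant basis functions vanish, as in Assumption R-(ii)), and sum to one on [0, 1).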
Recall that (cid:107) P ( x ) (cid:107) = (cid:0) P ( x ) (cid:48) P ( x ) (cid:1) / is the standard Euclidean normand that, under Assumptions R-(i) and H-(ii),max x ∈X (cid:107) P ( x ) (cid:107) = O (cid:0) K / (cid:1) = O (cid:0) h − D M / (cid:1) , max ( x,t ) ∈X × [ − , (cid:107) P ( x, t ) (cid:107) = O (cid:0) h − D M / (cid:1) , with D M = 0 in the AQR case. Recall that P ( x, ht ) = π ( ht ) ⊗ P ( x ) , π ( ht ) (cid:48) = (cid:34) , ht, . . . , ( ht ) s +1 ( s + 1)! (cid:35) so that the “design” matrix E (cid:2) P ( x (cid:96) , ht ) P ( x (cid:96) , ht ) (cid:48) (cid:3) degenerates asymptotically. To avoidthis, consider the change of parameters b = Hb with H = Diag (1 , . . . , h s +1 ) ⊗ Id K , b = β , , . . . , β ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) = β (cid:48) , hβ , , . . . , hβ ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) = hβ (cid:48) , . . . , h s +1 β s +1 , , . . . , h s +1 β s +1 ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) s +1 = h s +1 β s +1 (B.1)54o that P ( x (cid:96) , ht ) (cid:48) β = P ( x (cid:96) , t ) (cid:48) b . Define accordingly (cid:98) R ( b ; α, I ) = 1 LIh L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) ρ a (cid:18) B i(cid:96) − P (cid:18) x (cid:96) , a − αh (cid:19) (cid:48) b (cid:19) K (cid:18) a − αh (cid:19) da = 1 LI L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) − α h − α h ρ a + ht (cid:0) B i(cid:96) − P ( x (cid:96) , t ) (cid:48) b (cid:1) K ( t ) dt, R ( b ; α, I ) = E (cid:104)(cid:98) R ( b ; α, I ) (cid:105) . Note that b → (cid:82) − α h − α h ρ a + ht (cid:0) B i(cid:96) − P ( x (cid:96) , t ) (cid:48) b (cid:1) K ( t ) dt is convex as an integral of convex func-tions. 
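The smoothed objective R̂(b; α, I) above integrates the check function ρ_a(u) = u(a − 1{u < 0}) over kernel-shifted quantile levels. A Python sketch for the covariate-free case P(x) = 1, with the kernel K(t) = 6t(1 − t) on [0, 1] used in the simulations and a generic optimizer in place of the paper's computational strategy, is:

```python
import numpy as np
from math import factorial
from scipy.optimize import minimize

def rho(a, u):
    # Koenker-Bassett check function rho_a(u) = u * (a - 1{u < 0})
    return u * (a - (u < 0))

def aqr_objective(b, bids, alpha, h, s, n_grid=41):
    # Covariate-free version of R-hat(b; alpha, I): average over bids i and
    # levels t of rho_{alpha+ht}(B_i - pi(t)'b) K(t), with
    # pi(t) = (1, t, ..., t^{s+1}/(s+1)!) and K(t) = 6t(1-t) on [0, 1].
    t = np.linspace(0.0, min(1.0, (1.0 - alpha) / h), n_grid)
    K = 6.0 * t * (1.0 - t)
    pi = np.vstack([t**p / factorial(p) for p in range(s + 2)]).T
    u = bids[:, None] - (pi @ b)[None, :]
    return np.mean(rho(alpha + h * t, u) * K)

# Toy check: uniform bids have bid quantile function B(a) = a, so the fitted
# intercept should be close to B(alpha) = alpha.
rng = np.random.default_rng(2)
bids = rng.uniform(size=2000)
res = minimize(aqr_objective, x0=np.array([np.median(bids), 0.0, 0.0]),
               args=(bids, 0.5, 0.2, 1), method="Nelder-Mead")
b_hat = res.x
```

The objective is convex in b, being a nonnegatively weighted average of check functions of affine arguments; the slope block of the minimizer estimates hB^(1)(α), consistent with the rescaling b = Hβ above.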
It follows that (cid:98) R ( b ; α, I ) and R ( b ; α, I ) have minimizers, (cid:98) b ( α | I ) = arg min b (cid:98) R ( b ; α, I ) = H (cid:98) β ( α | I ) , b ( α | I ) = arg min b R ( b ; α, I ) , which uniqueness will be established in the next section. Set b ( α | I ) = H − b ( α | I ) recalling b ( α | I ) = (cid:104) β ( α | I ) (cid:48) , . . . , β (cid:48) s +1 ( α | I ) (cid:105) (cid:48) and define B ( α | x, I ) = P ( x ) (cid:48) β ( α | I ) ,γ ( α | I ) = β ( α | I ) + αβ ( α | I ) I − , V ( α | x, I ) = P ( x ) (cid:48) γ ( α | I ) . By Proposition C.1 and its proof, there exists some β ∗ ( ·|· ) grouping the entries in (2.11)such that sup ( α,x ) ∈ [0 , ×X | P ( x ) β ∗ ( α | I ) − B ( α | x, I ) | = o (cid:16) K − s +1 D M (cid:17) = o (cid:0) h s +1 (cid:1) . Let b ∗ ( ·|· ) and b ∗ ( ·|· ) = Hb ∗ ( ·|· ) with β ∗ ( α | I ) (cid:48) = (cid:2) β ∗ ( α | I ) (cid:48) , β ∗ ( α | I ) (cid:48) , . . . , β ∗ s +1 ( α | I ) (cid:48) (cid:3) ,β ∗ p ( α | I ) = (cid:104) β ( p ) k ( α | I ) , ≤ k ≤ K (cid:105) as in (2.11), p = 0 , . . . , s + 1.The next notations deal with the differentiability of the objective functions (cid:98) R ( · ; α, I ).55ince ∂ρ α + ht (cid:0) B − P ( x (cid:96) , t ) (cid:48) b (cid:1) ∂ b (cid:48)(cid:48) = (cid:8) I (cid:0) B i(cid:96) ≤ P ( x (cid:96) , t ) (cid:48) b (cid:1) − ( α + ht ) (cid:9) P ( x (cid:96) , t ) , almost everywhere, it follows that (cid:98) R ( · ; α, I ) is differentiable with (cid:98) R (1) ( b ; α, I ) = 1 LI L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) − α h − α h (cid:8) I (cid:0) B i(cid:96) ≤ P ( x (cid:96) , t ) (cid:48) b (cid:1) − ( α + ht ) (cid:9) P ( x (cid:96) , t ) K ( t ) dt and R (1) ( b ; α, I ) = E (cid:104)(cid:98) R (1) ( b ; α, I ) (cid:105) by the Dominated Convergence Theorem. 
When b = b ∗ ( α | I ), P ( x, t ) (cid:48) b ∗ ( α | I ) = P ( x, ht ) (cid:48) β ∗ ( α | I ) is close to B ( α + ht | x, I ), which inverse as a functionof t in I α,h = (cid:2) I α,h , I α,h (cid:3) = (cid:20) − min (cid:16) , αh (cid:17) , min (cid:18) , − αh (cid:19)(cid:21) = [ − , ∩ (cid:20) − αh , − αh (cid:21) is G ( u | x, I ) − αh , u ∈ (cid:2) B (cid:0) α + hI α,h | x, I (cid:1) , B (cid:0) α + hI α,h | x, I (cid:1)(cid:3) . When h is small enough, it will be shown in the proof of Lemma B.1 below that ∂∂t (cid:2) P ( x, ht ) (cid:48) b ∗ ( α | I ) (cid:3) = h (cid:2) π (1) ( ht ) ⊗ P ( x ) (cid:3) (cid:48) b ∗ ( α | I )= hP ( x ) (cid:48) β ∗ ( α | I ) + O (cid:0) h (cid:1) uniformly since π (1) ( ht ) (cid:48) = [0 , , ht, . . . , ( ht ) s /s !] and that P ( x ) (cid:48) β ∗ ( α | I ) converges uni-formly to B (1) ( α | x, I ) when K diverges and is therefore positive, so that P ( x, t ) (cid:48) b ∗ ( α | I ) isan increasing function of t in I α,h for h small enough. Since max ( x,t ) ∈X × [ − , (cid:107) P ( x, t ) (cid:107) = O (cid:0) h − D M / (cid:1) , t → P ( x, t ) (cid:48) b is also strictly increasing provided b is close enough to b ∗ ( α | I ).56n such case, it is convenient to redefine P ( x, t ) (cid:48) b as follows Ψ ( t | x, b ) = P (cid:0) x, I α,h (cid:1) (cid:48) b t > I α,h P ( x, t ) (cid:48) b t ∈ I α,h P (cid:0) x, I α,h (cid:1) (cid:48) b t < I α,h . When Ψ ( ·| x, b ) has an inverse, defineΦ ( u | x, b ) = α + hI α,h u > Ψ (cid:0) I α,h | x, b (cid:1) α + h Ψ − ( u | x, b ) u ∈ Ψ ( I α,h | x, b ) α + hI α,h u < Ψ (cid:0) I α,h | x, b (cid:1) , ∆ ( u | x, b ) = Φ ( u | x, b ) − αh = I α,h u > Ψ (cid:0) I α,h | x, b (cid:1) Ψ − ( u | x, b ) u ∈ Ψ ( I α,h | x, b ) I α,h u < Ψ (cid:0) I α,h | x, b (cid:1) , which is such that, as seen above, the central part of Φ ( u | x, b ∗ ( α | I )) is close to G ( u | x, I )when u is in Ψ ( I α,h | x, b ). 
Observe now that, provided Ψ ( ·| x, b ) is increasing and since thesupport of K ( · ) is [ − , (cid:90) I α,h I α,h { I ( B i(cid:96) ≤ Ψ ( t | x (cid:96) , b )) − ( α + ht ) } P ( x (cid:96) , t ) K ( t ) dt = (cid:90) I α,h I α,h (cid:26) I (cid:18) Φ ( B i(cid:96) | x (cid:96) , b ) − αh ≤ t (cid:19) − ( α + ht ) (cid:27) P ( x (cid:96) , t ) K ( t ) dt = (cid:90) I α,h Φ ( Bi(cid:96) | x(cid:96), b ) − αh P ( x (cid:96) , t ) K ( t ) dt − (cid:90) I α,h I α,h ( α + ht ) P ( x (cid:96) , t ) K ( t ) dt which is differentiable with respect to b , with for B i(cid:96) in Ψ ( I α,h | x, b ) ∂ Φ ( B i(cid:96) | x (cid:96) , b ) ∂ b (cid:48) = − P ( x, ∆ ( B i(cid:96) | x (cid:96) , b ))Ψ (1) (∆ ( B i(cid:96) | x (cid:96) , b ) | x (cid:96) , b ) /h I [ B i(cid:96) ∈ Ψ ( I α,h | x (cid:96) , b )] . In principle Ψ ( ·|· ) should be denoted Ψ α,h ( ·|· ) to acknowledge that its definition depends upon α and h . Instead, t is restricted to lie in I α,h in the sequel. The same comment applies for the functions Ψ ( ·|· )and ∆ ( ·|· ) introduced below. h small enough and for b in the vicinity of b ∗ ( α | I ), (cid:98) R ( b ; α, I ) and R ( b ; α, I ) aretwice continuously differentiable with, (cid:98) R (2) ( b ; α, I ) = 1 LIh L (cid:88) (cid:96) =1 I (cid:96) (cid:88) i =1 I [ B i(cid:96) ∈ Ψ ( I α,h | x (cid:96) , b ) , I (cid:96) = I ] P ( x (cid:96) , ∆ ( B i(cid:96) | x (cid:96) , b )) P ( x (cid:96) , ∆ ( B i(cid:96) | x (cid:96) , b )) (cid:48) Ψ (1) (∆ ( B i(cid:96) | x (cid:96) , b ) | x (cid:96) , b ) /h K (∆ ( B i(cid:96) | x (cid:96) , b )) , R (2) ( b ; α, I ) = E (cid:104)(cid:98) R (2) ( b ; α, I ) (cid:105) . The next lemma details some properties of the functions Ψ ( ·| x, b ) and Φ ( ·| x, b ) that werebriefly sketched above. 
Define
\[
\mathcal B^1_{\alpha,h}=\left\{b;\ \min_{(t,x)\in\mathcal I_{\alpha,h}\times\mathcal X}\frac{\partial\Psi(t|x,b)}{\partial t}>0\right\},\quad
\mathcal B^2_{\alpha,h}=\left\{b;\ \min_{(t,x)\in\mathcal I_{\alpha,h}\times\mathcal X}\frac{\partial\Psi(t|x,b)}{\partial t}>\frac{h}{f_0},\ \max_{p=1,\ldots,s+1}\left(\max_{x\in\mathcal X}\frac{|P(x)'b_p|}{h}\right)<f_1\right\},
\]
recalling that $b=[b_0',\ldots,b_{s+1}']'$ and where $f_0$ and $f_1$ will be taken large enough. While $\mathcal B^1_{\alpha,h}$ is used to bound the first derivative of $\Psi(\cdot|x,b)$ away from 0, $\mathcal B^2_{\alpha,h}$ is used to bound the successive derivatives $\Psi^{(p)}(\cdot|x,b)$, $p=1,\ldots,s+1$, away from infinity. As made possible by Lemma B.1-(i) below, an Euclidean ball $B(b^*(\alpha|I),Ch^{D_M/2})$ with a small enough constant $C>0$ is contained in $\mathcal B^1_{\alpha,h}$ and $\mathcal B^2_{\alpha,h}$.

Lemma B.1 Suppose Assumptions A and S hold with $\max_{x\in\mathcal X}\|P(x)\|=O(K^{1/2})$, $K=h^{-D_M}$, and that $f_0$ and $f_1$ are large enough. Then, for $h$ small enough and all $I$ in $\mathcal I$:

i. $b^*(\alpha|I)$ belongs to $\mathcal B^2_{\alpha,h}\subset\mathcal B^1_{\alpha,h}$ and, for $C$ small enough, $B(b^*(\alpha|I),Ch^{D_M/2})$ is a subset of $\mathcal B^2_{\alpha,h}$, for all $\alpha$ in $[0,1]$.

ii. For all $b$ in $\mathcal B^1_{\alpha,h}$ and all $u$ in $\Psi(\mathcal I_{\alpha,h}|x,b)$,
\[
\frac{\partial\Phi(u|x,b)}{\partial b'}=-\frac{P(x,\Delta(u|x,b))'}{\Psi^{(1)}(\Delta(u|x,b)|x,b)/h},\qquad
\frac{\partial\Phi(u|x,b)}{\partial u}=\frac{1}{\Psi^{(1)}(\Delta(u|x,b)|x,b)/h}.
\]
iii. It holds that
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-\Psi(t|x,b^*(\alpha|I))\right)-\frac{(ht)^{s+2}}{(s+2)!}\alpha B^{(s+2)}(\alpha|x,I)\right|=o\left(h^{s+2}\right),
\]
and, recalling $b_1^*(\alpha|I)=h\beta_1^*(\alpha|I)$,
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha P(x)'\beta_1^*(\alpha|I)-\alpha B^{(1)}(\alpha|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b^*(\alpha|I)]}\left|\Phi(u|x,b^*(\alpha|I))-G(u|x,I)\right|=o\left(h^{s+1}\right).
\]
iv.
There is a $C>0$ such that, for any $b_1$ and $b_2$ in $\mathcal B^1_{\alpha,h}$ and all $\alpha$ in $[0,1]$,
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b_1)-\Psi(t|x,b_2)\right|,\quad
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\Phi(u|x,b_1)-\Phi(u|x,b_2)\right|,
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\frac{\partial\Phi}{\partial u}(u|x,b_1)-\frac{\partial\Phi}{\partial u}(u|x,b_2)\right|,\quad
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\Psi^{(1)}(\Delta(u|x,b_1)|x,b_1)-\Psi^{(1)}(\Delta(u|x,b_2)|x,b_2)\right|,
\]
are all smaller than or equal to $Ch^{-D_M/2}\|b_1-b_2\|$.

Let $\Omega_h(\alpha)$, $\Omega(0)$, $\Omega(1)$, $\Omega=\Omega(0)+\Omega(1)$ and $\Omega_{1h}(\alpha)$ be the $(s+2)\times(s+2)$ matrices
\[
\Omega_h(\alpha)=\int_{\mathcal I_{\alpha,h}}\pi(t)\pi(t)'K(t)\,dt=\left[\int_{-\min(1,\alpha/h)}^{\min(1,(1-\alpha)/h)}\frac{t^{p_1+p_2}}{p_1!\,p_2!}K(t)\,dt,\ 0\le p_1,p_2\le s+1\right],
\]
\[
\Omega(0)=\int_{-1}^{0}\pi(t)\pi(t)'K(t)\,dt,\quad\Omega(1)=\int_{0}^{1}\pi(t)\pi(t)'K(t)\,dt,\quad
\Omega_{1h}(\alpha)=\int_{\mathcal I_{\alpha,h}}t\,\pi(t)\pi(t)'K(t)\,dt.
\]
While $\Omega_h(\alpha)\preceq\Omega$ for all $\alpha$ and $h$, it holds for $h$ small enough that $\Omega_h(\alpha)\succeq\Omega(1)$ for all $\alpha$ in $[0,1/2]$ and $\Omega_h(\alpha)\succeq\Omega(0)$ for all $\alpha$ in $[1/2,1]$.

Lemma B.2 Suppose Assumptions A, R-(i) and S hold, and that $f_0$ and $f_1$ are large enough. Then, for $K^{-1/D_M}=O(h)$, $h$ small enough, all $I$ in $\mathcal I$, and any $C>0$ small enough: (i) $R^{(2)}(\cdot;\alpha,I)$ is continuously differentiable over $B(b^*(\alpha|I),Ch^{D_M/2})$ with
\[
\max_{\alpha\in[0,1]}\max_{b_1,b_2\in B(b^*(\alpha|I),Ch^{D_M/2})}\frac{\left\|R^{(2)}(b_1;\alpha,I)-R^{(2)}(b_2;\alpha,I)\right\|}{\|b_1-b_2\|/(\alpha(1-\alpha)+h)}=O\left(h^{-D_M/2}\right).
\]
(ii) The eigenvalues of $R^{(2)}[b^*(\alpha|I);\alpha,I]$ belong to $[1/C,C]$ for a large enough $C$, for all $\alpha$ in $[0,1]$ and $h$ small enough, with
\[
\max_{\alpha\in[0,1]}\left\|R^{(2)}[b^*(\alpha|I);\alpha,I]-\Omega_h(\alpha)\otimes E\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]+h\,\Omega_{1h}(\alpha)\otimes E\left[\frac{\mathbb I(I_\ell=I)B^{(2)}(\alpha|x_\ell,I_\ell)P(x_\ell)P(x_\ell)'}{\left(B^{(1)}(\alpha|x_\ell,I_\ell)\right)^2}\right]\right\|=o(h).
\]
(iii) For any $C>0$,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{s+1})}\left\|R^{(2)}(b;\alpha,I)-R^{(2)}(b^*(\alpha|I);\alpha,I)\right\|=O\left(h^{s+1-D_M/2}\right)\quad\text{if }h^{s+1}=o\left(h^{D_M/2}\right),
\]
\[
\max_{\alpha\in[0,1]}\max_{b\in B\left(b^*(\alpha|I),C\left(\frac{\log L}{L(\alpha(1-\alpha)+h)}\right)^{1/2}\right)}\frac{\left\|R^{(2)}(b;\alpha,I)-R^{(2)}(b^*(\alpha|I);\alpha,I)\right\|}{\left(\frac{\log L}{L(\alpha(1-\alpha)+h)}\right)^{1/2}}=O\left(h^{-D_M/2}\right)\quad\text{if }\left(\frac{\log L}{L}\right)^{1/2}=o\left(h^{D_M/2}\right).
\]
It then follows that the eigenvalues of $R^{(2)}(b;\alpha,I)$ stay bounded away from 0 and infinity, uniformly in $\alpha$ and in $b$ over the two neighborhoods considered above, under the corresponding bandwidth assumption. The next two Lemmas study the first and second derivatives of $\widehat R(\cdot;\alpha,I)$ in a shrinking vicinity of $b^*(\alpha|I)$. In particular, Lemma B.3 implies that $\widehat R(\cdot;\alpha,I)$ is strictly convex over such a vicinity with a probability tending to 1.
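The kernel moment matrices $\Omega_h(\alpha)$, $\Omega(0)$ and $\Omega(1)$ above can be checked numerically. The sketch below is our own illustration, with an Epanechnikov kernel and the convention $\pi(t)'=[1,t,\ldots,t^{s+1}/(s+1)!]$ assumed; it verifies that $\Omega_h(\alpha)$ coincides with $\Omega=\Omega(0)+\Omega(1)$ at interior quantile levels, and that truncating only the left part of the support preserves $\Omega_h(\alpha)\succeq\Omega(1)$.

```python
import numpy as np
from math import factorial

def omega(lo, up, s=1, n=200_000):
    """Riemann-sum approximation of the moment matrix int pi(t) pi(t)' K(t) dt
    over [lo, up], with pi(t)' = [1, t, ..., t^{s+1}/(s+1)!] and Epanechnikov K."""
    t = np.linspace(lo, up, n)
    K = 0.75 * (1.0 - t ** 2)
    powers = np.arange(s + 2)
    facts = np.array([factorial(p) for p in powers], dtype=float)
    pi = t[None, :] ** powers[:, None] / facts[:, None]
    return (pi * K) @ pi.T * (up - lo) / n

# Interior quantile level: the range is all of [-1, 1], so Omega_h = Omega(0) + Omega(1).
full = omega(-1.0, 1.0)
halves = omega(-1.0, 0.0) + omega(0.0, 1.0)
assert np.allclose(full, halves, atol=1e-4)
# Left-truncated range (e.g. alpha/h = 0.3) still contains [0, 1], so
# Omega_h(alpha) - Omega(1), an integral of PSD matrices, stays PSD.
assert np.linalg.eigvalsh(omega(-0.3, 1.0) - omega(0.0, 1.0)).min() > -1e-3
```

The same computation with a right-truncated range illustrates the symmetric claim $\Omega_h(\alpha)\succeq\Omega(0)$ for $\alpha$ near 1.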
Lemma B.3 Suppose Assumptions A, R-(i,ii) and S hold, and $\log L/(Lh^{D_M+1})=o(1)$. Then, for any $C>0$ small enough,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2})}\left\|\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right).
\]
Lemma B.4 Suppose Assumptions A, R-(i,ii) and S hold, and $\log L/(Lh^{D_M+1})=o(1)$. Then, for any $C>0$,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2})}\left\|\frac{\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
Since $R^{(1)}(\overline b(\alpha|I);\alpha,I)=0$ and, assuming $h^{s+1}=O(h^{D_M/2})$, $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$ as established in (C.3), it holds that
\[
\max_{\alpha\in[0,1]}\left\|\frac{\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
The next Lemma studies the leading term $\widehat e(\alpha|I)$ of $\widehat b(\alpha|I)-\overline b(\alpha|I)$,
\[
\widehat e(\alpha|I)=-\left[R^{(2)}(\overline b(\alpha|I);\alpha,I)\right]^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I),
\]
see Theorem D.1 below. Note that $R^{(2)}(\overline b(\alpha|I);\alpha,I)$ is not necessarily defined and invertible unless $h^{s+1}=O(h^{D_M/2})$ and $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$, as therefore assumed and established in the proof of Theorem C.4 below, see (C.3).

Lemma B.5 Suppose Assumptions A, H, R and S hold, and $\log L/(Lh^{D_M+1})=o(1)$, $s\ge D_M/2$ and $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$.
Then (i) uniformly in $(\alpha,x)$ in $[0,1]\times\mathcal X$,
\[
\mathrm{Var}\left[P(x)'S_0\widehat e(\alpha|I)\right]=O\left(\frac{1}{Lh^{D_M}}\right)\quad\text{and}\quad
\mathrm{Var}\left[\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right]=O\left(\frac{1}{Lh^{D_M+1}}\right),
\]
with the normalized variance $Lh^{D_M+1}\,\mathrm{Var}\left[S_1\widehat e(\alpha|I)/h\right]$ having the expansion
\[
v_h(\alpha)\,E^{-1}\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'\right]E^{-1}\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]+o(1).
\]
(ii) It also holds that
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S_0\widehat e(\alpha|I)\right|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right),\quad
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right|=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right).
\]

Online Appendix C: Asymptotic bias

Our bias results for the bid quantile function are based on the next Proposition, which states bid implications of Assumption S.

Proposition C.1 Assume the approximation property S holds. Suppose that $V(\alpha|x,I)$ is an $(s+1)$th continuously differentiable function over $[0,1]\times\mathcal X$ satisfying $\inf_{(\alpha,x)\in[0,1]\times\mathcal X}V^{(1)}(\alpha|x,I)>0$ and $\sup_{(\alpha,x)\in[0,1]\times\mathcal X}V^{(1)}(\alpha|x,I)<\infty$. Then, for $B(\alpha|x,I)$ as in (2.3) and sieve coefficients $\{\gamma_k(\alpha|I),1\le k\le K\}$ of $V(\alpha|x,I)$ as in Property S:

i. $\min_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)>0$, $\max_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)<\infty$, and $B(\alpha|x,I)$ is $(s+2)$th continuously differentiable over $(0,1]$ with $\lim_{\alpha\to0}\sup_{(x,I)\in\mathcal X\times\mathcal I}\left|\alpha B^{(s+2)}(\alpha|x,I)\right|=0$.

ii.
The coefficients $\{\beta_k(\alpha|I),1\le k\le K\}$ from (2.11) are $(s+1)$th continuously differentiable and satisfy
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B(\alpha|x,I)-\sum_{k=1}^{K}\beta_k(\alpha|I)P_k(x)\right|=o\left(K^{-\frac{s+1}{D_M}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|=o(1),\quad p=1,\ldots,s+1.
\]
iii. Moreover, $\alpha\beta_k^{(1)}(\alpha|I)=(I-1)[\gamma_k(\alpha|I)-\beta_k(\alpha|I)]$ and is therefore $(s+1)$th continuously differentiable for all $1\le k\le K$. In addition,
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha B^{(1)}(\alpha|x,I)-\sum_{k=1}^{K}\alpha\beta_k^{(1)}(\alpha|I)P_k(x)\right|=o\left(K^{-\frac{s+1}{D_M}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{\partial^p\left[\alpha B^{(1)}(\alpha|x,I)\right]}{\partial\alpha^p}-\sum_{k=1}^{K}\frac{\partial^p\left[\alpha\beta_k^{(1)}(\alpha|I)\right]}{\partial\alpha^p}P_k(x)\right|=o(1),\quad p=1,\ldots,s+1.
\]
Proof of Proposition C.1. By (2.3), $B(\alpha|x,I)=(I-1)\int_0^1u^{I-2}V(\alpha u|x,I)\,du$, so that $B^{(1)}(\alpha|x,I)=(I-1)\int_0^1u^{I-1}V^{(1)}(\alpha u|x,I)\,du$, which implies the first two statements in (i), about lower and upper bounds for $B^{(1)}(\alpha|x,I)$, and that $B(\cdot|\cdot,I)$ is $(s+1)$th continuously differentiable. That $B(\cdot|x,I)$ is $(s+2)$th continuously differentiable over $(0,1]$ follows from its integral expression (2.3). Observe now that, for $p=1,\ldots,s+2$,
\[
\frac{\partial^p\left[\alpha B(\alpha|x,I)\right]}{\partial\alpha^p}=\alpha B^{(p)}(\alpha|x,I)+pB^{(p-1)}(\alpha|x,I),
\]
with, for $p=1,\ldots$
$\ldots,s+1$,
\[
B^{(p)}(\alpha|x,I)=(I-1)\int_0^1u^{I-2+p}V^{(p)}(\alpha u|x,I)\,du=\frac{I-1}{\alpha^{I-1+p}}\int_0^\alpha t^{I-2+p}V^{(p)}(t|x,I)\,dt,
\]
\[
B^{(p+1)}(\alpha|x,I)=-\frac{(I-1)(I-1+p)}{\alpha^{I+p}}\int_0^\alpha t^{I-2+p}V^{(p)}(t|x,I)\,dt+\frac{(I-1)V^{(p)}(\alpha|x,I)}{\alpha}=-\frac{I-1+p}{\alpha}B^{(p)}(\alpha|x,I)+\frac{(I-1)V^{(p)}(\alpha|x,I)}{\alpha}.
\]
Hence, when $\alpha$ goes to 0,
\[
\alpha B^{(s+2)}(\alpha|x,I)=-(I+s)B^{(s+1)}(0|x,I)+(I-1)V^{(s+1)}(0|x,I)+o(1)
=-(I+s)(I-1)\int_0^1u^{I+s-1}V^{(s+1)}(0|x,I)\,du+(I-1)V^{(s+1)}(0|x,I)+o(1)=o(1)
\]
uniformly in $x$. For (ii), consider a sequence $\{\gamma_k(\alpha|I),k\le K\}$ approximating $V(\alpha|x,I)$ and its derivatives as in Property S. For $\{\beta_k(\alpha|I),k\le K\}$ as in (2.11),
\[
\beta_k^{(p)}(\alpha|I)=(I-1)\int_0^1u^{I+p-2}\gamma_k^{(p)}(\alpha u|I)\,du,\quad p=0,\ldots,s+1,
\]
and
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|
=\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|(I-1)\int_0^1u^{I+p-2}\left(V^{(p)}(\alpha u|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha u|I)P_k(x)\right)du\right|
\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|,
\]
which gives the sieve approximation results for $B(\alpha|x,I)$ in (ii). Now, for $\alpha B^{(1)}(\alpha|x,I)$, observe that $\alpha B^{(1)}(\alpha|x,I)=(I-1)[V(\alpha|x,I)-B(\alpha|x,I)]$ and
\[
\alpha\beta_k^{(1)}(\alpha|I)=\alpha\times\left(-\frac{(I-1)^2}{\alpha^{I}}\int_0^\alpha t^{I-2}\gamma_k(t|I)\,dt+\frac{(I-1)\gamma_k(\alpha|I)}{\alpha}\right)=(I-1)\left[\gamma_k(\alpha|I)-\beta_k(\alpha|I)\right].
\]
It follows that
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{\partial^p\left[\alpha B^{(1)}(\alpha|x,I)\right]}{\partial\alpha^p}-\sum_{k=1}^{K}\frac{\partial^p\left[\alpha\beta_k^{(1)}(\alpha|I)\right]}{\partial\alpha^p}P_k(x)\right|
\le(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|+(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|
\]
\[
\le2(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|,
\]
which gives the approximation results for $\alpha B^{(1)}(\alpha|x,I)$ in (iii). $\Box$

The study of the biases $\overline V(\alpha|x,I)-V(\alpha|x,I)$ and $\overline B(\alpha|x,I)-B(\alpha|x,I)$ is based on the following Lemma, which is a consequence of the Newton-Kantorovich Theorem, see e.g. Gragg and Tapia (1974).

Lemma C.2 Let $F(\cdot):\mathbb R^D\to\mathbb R$ be a function. Suppose that there are $x^*\in\mathbb R^D$ and some real numbers $\epsilon>0$ and $C_0>0$ such that $F(\cdot)$ is twice differentiable on $B(x^*,C_0\epsilon)=\{x\in\mathbb R^D;\|x-x^*\|<C_0\epsilon\}$. If, in addition:

i. $\|F^{(1)}(x^*)\|\le\epsilon$ and $\left\|\left[F^{(2)}(x^*)\right]^{-1}\right\|\le C_0$;

ii. there is a $C_2>0$ such that $\|F^{(2)}(x)-F^{(2)}(x')\|\le C_2\|x-x'\|$ for all $x,x'\in B(x^*,C_0\epsilon)$;

iii. $C_0^2C_2\epsilon\le1/2$.

Then there is a unique $\overline x$ such that $\|\overline x-x^*\|<C_0\epsilon$ and $F^{(1)}(\overline x)=0$.

The next lemma, established in Appendix F, will be used at the end of the proof of Theorem C.4 below.

Lemma C.3 Suppose Assumptions A, S and R-(ii) hold.
Then the $\ell^1$ norms of the columns of the matrix
\[
A_{\alpha,h}=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]
\]
are bounded independently of $L$ and $\alpha$. That is, if $A_{\alpha,h}=[A_{\alpha,h}(j_1,j_2),1\le j_1,j_2\le(s+2)K]$,
\[
\max_{L}\max_{\alpha\in[0,1]}\max_{1\le j_2\le(s+2)K}\sum_{j_1=1}^{(s+2)K}|A_{\alpha,h}(j_1,j_2)|<\infty.
\]
In the next theorem,
\[
\mathrm{bias}_h(\alpha|I)=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]\times E\left[\frac{\mathbb I(I_\ell=I)B^{(s+2)}(\alpha|x_\ell,I_\ell)\int_{\mathcal I_{\alpha,h}}t^{s+2}P(x_\ell,t)K(t)\,dt}{(s+2)!\,B^{(1)}(\alpha|x_\ell,I_\ell)}\right],
\]
with $\mathrm{bias}_h(\alpha|I)=\left[\mathrm{bias}_{0h}(\alpha|I)',\ldots,\mathrm{bias}_{s+1,h}(\alpha|I)'\right]'$, where the subvectors $\mathrm{bias}_{ph}(\alpha|I)$ are of dimension $K$. While $\mathrm{bias}_h(\alpha|I)$ may not exist for $\alpha=0$, the function $\mathrm{Bias}_h(\alpha|I)=\alpha\,\mathrm{bias}_h(\alpha|I)$ in (4.8) can be set to 0 when $\alpha=0$ by Proposition C.1-(i).

Theorem C.4 Suppose that Assumptions A, H and R hold with $s\ge D_M/2$. Then, for $h$ small enough, $\overline b(\alpha|I)=\arg\min_bR(b;\alpha,I)$ is unique for all $\alpha$ in $[0,1]$ and
\[
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline V(\alpha|x,I)-V(\alpha|x,I)-\frac{h^{s+1}P(x)'\alpha\,\mathrm{bias}_{1h}(\alpha|I)}{I-1}\right|=o\left(h^{s+1}\right)
\]
with $\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|P(x)'\alpha\,\mathrm{bias}_{1h}(\alpha|I)\right|=O(1)$. Moreover,
\[
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline B(\alpha|x,I)-B(\alpha|x,I)\right|=o\left(h^{s+1}\right),\quad
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right|=o\left(h^{s}\right).
\]
The proof of Theorem C.4 establishes that $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$, see (C.3), an intermediary result which will be used all along the proof. If $D_M/2\le s$ and $\log L/(Lh^{D_M+1})=o(1)$, then by Lemma B.3 and a second-order Taylor expansion,
\[
\sup_{\alpha\in[0,1]}\sup_{b\in B(\overline b(\alpha|I),Ch^{s+1})}\left|h^{-2(s+1)}\left\{\widehat R(b;\alpha,I)-\widehat R(\overline b(\alpha|I);\alpha,I)-(b-\overline b(\alpha|I))'\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)\right\}-\frac{h^{-2(s+1)}}{2}(b-\overline b(\alpha|I))'R^{(2)}(\overline b(\alpha|I);\alpha,I)(b-\overline b(\alpha|I))\right|=o_P(1).
\]
Then, by Lemma B.2 and the Argmax Theorem, $\widehat R(\cdot;\alpha,I)$ has a unique minimizer over $b\in B(\overline b(\alpha|I),Ch^{s+1})$ for each $\alpha$, with a probability tending to 1. Since $\widehat R(\cdot;\alpha,I)$ is convex, a local minimum is also a global one. This implies that the AQR or ASQR estimators $\widehat\beta(\alpha|I)=H^{-1}\widehat b(\alpha|I)$ are unique for all $\alpha$ in $[0,1]$ with a probability tending to 1.

Proof of Theorem C.4. Consider (ii) and (iii), the proof of (i) being similar, as detailed below. The proof works by establishing that there is a solution of the first-order condition in an open ball where $R(b;\alpha,I)$ is strictly convex, by checking the conditions of Lemma C.2, which also gives the rate stated in the Theorem and the uniqueness of $\overline b(\alpha|I)$. It is first claimed that
\[
\max_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|R^{(1)}(b^*(\alpha|I);\alpha,I)\right\|=\epsilon_L\quad\text{with (C.1)}
\]
\[
\epsilon_L=O\left(\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right|\right)=o\left(h^{s+1}\right),
\]
where $\epsilon_L=o(h^{s+1})$ follows from Lemma B.1-(iii).
To see that (C.1) holds, observe that
\[
\left\|R^{(1)}(b^*(\alpha|I);\alpha,I)\right\|=\max_{\theta;\theta'\theta=1}\left|\theta'R^{(1)}(b^*(\alpha|I);\alpha,I)\right|.\quad\text{(C.2)}
\]
But uniformly in $\alpha\in[0,1]$ and by Assumption R-(i) and Lemma B.1-(iii),
\[
\left|\theta'R^{(1)}(b^*(\alpha|I);\alpha,I)\right|=\left|E\left[\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}\left\{G(\Psi(t|x_\ell,b^*(\alpha|I))|x_\ell,I_\ell)-G(B(\alpha+ht|x_\ell,I)|x_\ell,I_\ell)\right\}\theta'(\pi(t)\otimes P(x_\ell))K(t)\,dt\right]\right|
\le C\epsilon_LE^{1/2}\left[\int_{-1}^{1}\left(\theta'(\pi(t)\otimes P(x_\ell))\right)^2dt\right]\le C\epsilon_L(\theta'\theta)^{1/2}=C\epsilon_L.
\]
Hence (C.1) holds, which is the first part of Condition (i) in Lemma C.2. The second part of Condition (i) follows from Lemma B.2-(ii), which ensures that there is a $C_0>0$ such that, for $L$ large enough,
\[
\sup_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|\left[R^{(2)}(b^*(\alpha|I);\alpha,I)\right]^{-1}\right\|\le C_0.
\]
Note that $s\ge D_M/2$ and $\epsilon_L=o(h^{s+1})$ give that $B(b^*(\alpha|I),C_0\epsilon_L)\subset B(b^*(\alpha|I),Ch^{D_M/2})$ for all $C_0,C>0$ provided $L$ is large enough, for all $\alpha$ and all $I$. Condition (ii) in Lemma C.2 follows from Lemma B.2-(i), which ensures that, for some $C_{2L}=O(h^{-D_M/2})$,
\[
\left\|R^{(2)}(b_1;\alpha,I)-R^{(2)}(b_2;\alpha,I)\right\|\le C_{2L}\|b_1-b_2\|
\]
for all $b_1,b_2$ in $B(b^*(\alpha|I),C_0\epsilon_L)$ and all $\alpha$, $I$.
For Condition (iii) in Lemma C.2, $\epsilon_L=o(h^{s+1})$ and $s\ge D_M/2$ give
\[
C_0^2C_{2L}\epsilon_L=o\left(h^{s+1-D_M/2}\right)=o(1)\le1/2
\]
for $L$ large enough. Hence Lemma C.2 ensures that, for $L$ large enough, all $\alpha$ and all $I$, there is a unique $\overline b(\alpha|I)$ in $B(b^*(\alpha|I),C_0\epsilon_L)$ such that
\[
R^{(1)}(\overline b(\alpha|I);\alpha,I)=0,
\]
and $\overline b(\alpha|I)$ is therefore the unique minimizer of $R(\cdot;\alpha,I)$ over $B(b^*(\alpha|I),C_0\epsilon_L)$. Since the convex function $R(\cdot;\alpha,I)$ cannot have several local minimizers, $\overline b(\alpha|I)$ is also its unique global minimizer. Since $\epsilon_L=o(h^{s+1})$, it follows that
\[
\sup_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|\overline b(\alpha|I)-b^*(\alpha|I)\right\|=o\left(h^{s+1}\right).\quad\text{(C.3)}
\]
Consider now $\alpha\overline b(\alpha|I)-\alpha b^*(\alpha|I)$. Define
\[
\overline g(\alpha|t,x,I)=\int_0^1g\left(\Psi(t|x,\overline b(\alpha|I))+u\left(B(\alpha+ht|x,I)-\Psi(t|x,\overline b(\alpha|I))\right)\Big|x,I\right)du,
\]
which is such that, uniformly in $\alpha$ in $[3h,1-3h]$, $x$ in $\mathcal X$ and $t$ in $[-1,1]$,
\[
\overline g(\alpha|t,x,I)=\int_0^1g\left(B(\alpha+ht|x,I)+o\left(h^{s+1-D_M/2}\right)\Big|x,I\right)du\ge(1+o(1))\min_{y\in[B(2h|x,I),B(1-2h|x,I)]}g(y|x,I)\ge C''>0,
\]
using $o(h^{s+1-D_M/2})=o(h)$ and Proposition C.1-(i).
Now $R^{(1)}(\overline b(\alpha|I);\alpha,I)=0$ gives
\[
0=\int\left(\int_{\mathcal I_{\alpha,h}}\left\{G\left[\Psi(t|x,\overline b(\alpha|I))|x,I\right]-(\alpha+ht)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\left\{G\left[\Psi(t|x,\overline b(\alpha|I))|x,I\right]-G\left[B(\alpha+ht|x,I)|x,I\right]\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,\overline b(\alpha|I))-B(\alpha+ht|x,I)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,\overline b(\alpha|I))-\Psi(t|x,b^*(\alpha|I))\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
+\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx.
\]
Since $\left\{\Psi(t|x,\overline b(\alpha|I))-\Psi(t|x,b^*(\alpha|I))\right\}P(x,t)=P(x,t)P(x,t)'\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)$ by Assumption R-(i), and because $\overline g(\alpha|t,x,I)$ and $f(x,I)$ are bounded away from 0 and infinity,
\[
\alpha\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)=\left[\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)P(x,t)P(x,t)'K(t)\,dt\right)f(x,I)\,dx\right]^{-1}\times\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\frac{(ht)^{s+2}}{(s+2)!}\alpha B^{(s+2)}(\alpha|x,I)+o\left(h^{s+2}\right)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx,
\]
uniformly in $\alpha$ in $[0,1]$ by Lemma B.1-(iii).
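The proof pins down $\overline b(\alpha|I)$ by verifying conditions (i)-(iii) of the Newton-Kantorovich Lemma C.2. A one-dimensional toy illustration of the same logic (entirely our own example, not from the paper): take $F(x)=1-\cos x$, so $F^{(1)}=\sin$, and expand around $x^*=0.05$; then $\epsilon=\sin(0.05)\approx0.05$, one can take $C_0\approx1.01$ and Lipschitz constant $C_2=1$, condition (iii) holds, and Newton's iteration locates the unique critical point inside $B(x^*,C_0\epsilon)$.

```python
import math

def newton_root_of_gradient(F1, F2, x0, tol=1e-12, max_iter=50):
    """Newton iteration on the first-order condition F1(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = F1(x) / F2(x)
        x -= step
        if abs(step) < tol:
            break
    return x

F1, F2 = math.sin, math.cos            # gradient and Hessian of F(x) = 1 - cos(x)
x_star = 0.05
eps = abs(F1(x_star))                  # gradient bound of Condition (i), ~0.05
assert 1.01 ** 2 * 1.0 * eps <= 0.5    # Condition (iii): C0^2 * C2 * eps <= 1/2
root = newton_root_of_gradient(F1, F2, x_star)
assert abs(F1(root)) < 1e-10           # the first-order condition holds at the root
assert abs(root - x_star) < 1.01 * eps # the root lies in the ball B(x*, C0 * eps)
```

In the proof, the role of $x^*$ is played by the sieve pseudo-true value $b^*(\alpha|I)$, with $\epsilon=\epsilon_L=o(h^{s+1})$, which is what delivers (C.3).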
By Assumption R-(ii), which implies in particular $\left\|\int\left(\int_{\mathcal I_{\alpha,h}}|P(x,t)|K(t)\,dt\right)dx\right\|=O(1)$, it follows that
\[
\overline b(\alpha|I)-b^*(\alpha|I)=o\left(h^{s+1}\right)E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right],
\]
\[
\alpha\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)=h^{s+2}\alpha\,\mathrm{bias}_h(\alpha|I)+o\left(h^{s+2}\right)E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right],\quad\text{(C.4)}
\]
uniformly over $[0,1]$. Let
\[
A=A_{\alpha,h}=[A_1,\ldots,A_{J_L}]=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]
\]
be a $J_L\times J_L$ matrix with columns $A_j$, $j=1,\ldots,J_L$, let $|A_j|_1$ be the associated $\ell^1$ norm and $|A|_{1,\infty}=\max_{j\le J_L}|A_j|_1$, let $S$ be a selection matrix which selects some columns of $A$, let $a$ and $b$ be some conformable vectors and $|a|_\infty$ the largest entry of $a$. Then
\[
|a'SAb|=\left|\sum_jb_ja'[SA]_j\right|\le\sum_j|b_j|\max_j\left|a'[SA]_j\right|\le|b|_1|A|_{1,\infty}|a|_\infty.
\]
This gives, since $\max_{\alpha,L}|A|_{1,\infty}<\infty$ by Lemma C.3 and by Assumption R-(ii),
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S\,\mathrm{bias}_h(\alpha|I)\right|\le C\left(\max_{x\in\mathcal X}\sum_{k=1}^{K}|P_k(x)|\right)\times\max_{1\le k\le K}\int|P_k(x)|\,dx=O(1),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'SA\,E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right]\right|\le C\left(\max_{x\in\mathcal X}\|P(x)\|\right)\times\max_{1\le k\le K}\int|P_k(x)|\,dx=O(1).
\]
Let $S_0$ and $S_1$ be the selection matrices such that $S_0b=\beta_0$ and $S_1b=h\beta_1$, so that $\overline B(\alpha|x,I)=P(x)'S_0\overline b(\alpha|I)$ and $\overline B^{(1)}(\alpha|x,I)=P(x)'S_1\overline b(\alpha|I)/h$. Then (C.3), (C.4), Lemma B.1-(iii) and the above imply
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\overline B(\alpha|x,I)-B(\alpha|x,I)\right|\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S_0\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\Psi(0|x,b^*(\alpha|I))-B(\alpha|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right|=o\left(h^{s}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha\left(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right)-h^{s+1}P(x)'\alpha S_1\mathrm{bias}_h(\alpha|I)\right|
\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\frac{1}{h}\left|\alpha P(x)'S_1\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)-h^{s+2}\alpha P(x)'S_1\mathrm{bias}_h(\alpha|I)\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\frac{1}{h}\left|\alpha\left(P(x)'S_1b^*(\alpha|I)-hB^{(1)}(\alpha|x,I)\right)\right|=o\left(h^{s+1}\right).
\]
This ends the proof of the Theorem since $V(\alpha|x,I)=B(\alpha|x,I)+\alpha B^{(1)}(\alpha|x,I)/(I-1)$. $\Box$

References

[1] Gragg, W.B. & R.A. Tapia (1974). Optimal error bounds for the Newton-Kantorovich Theorem.
SIAM Journal on Numerical Analysis, 11, 10-13.

Online Appendix D: Bahadur representation

Let $\widehat e(\alpha|I)$ be a candidate linearization leading term for $\widehat b(\alpha|I)-\overline b(\alpha|I)$ and $\widehat d(\alpha|I)$ the associated linearization error term, or Bahadur remainder term,
\[
\widehat e(\alpha|I)=-\left(R^{(2)}(\overline b(\alpha|I);\alpha,I)\right)^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I),\quad\text{(D.1)}
\]
\[
\widehat d(\alpha|I)=\widehat b(\alpha|I)-\overline b(\alpha|I)-\widehat e(\alpha|I).\quad\text{(D.2)}
\]
This section's goal is to study the magnitude of $\widehat d(\alpha|I)$ and, in the ASQR case, the magnitudes of $P(x)'S_0\widehat d(\alpha|I)$ and $P(x)'S_1\widehat d(\alpha|I)/h$.

Theorem D.1 Suppose Assumptions A, R-(i,ii) and S hold, $s\ge D_M/2$ and $\log L/(Lh^{D_M+1})=o(1)$. Then
\[
\max_{\alpha\in[0,1]}\left\|\frac{Lh^{D_M+(D_M\vee1)/2}}{(h+\alpha(1-\alpha))^{1/2}\log L}\left\{\widehat b(\alpha|I)-\overline b(\alpha|I)+\left(R^{(2)}(\overline b(\alpha|I);\alpha,I)\right)^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)\right\}\right\|=O_P(1)
\]
with a diverging normalization term $Lh^{D_M+(D_M\vee1)/2}/\log L$. Moreover, for $\widehat d(\alpha|I)$ as in (D.2),
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left(Lh^{D_M+1}\right)^{1/2}\left|P(x)'S_0\widehat d(\alpha|I)\right|=O_P\left(\frac{h^{1/2}\log L}{\left(Lh^{2D_M+(D_M\vee1)}\right)^{1/2}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left(Lh^{D_M+1}\right)^{1/2}\left|\frac{P(x)'S_1\widehat d(\alpha|I)}{h}\right|=O_P\left(\frac{\log L}{\left(Lh^{2D_M+1+(D_M\vee1)}\right)^{1/2}}\right).
\]
Proof of Theorem D.1. We first introduce some renormalizations.
Let, for $\widehat e(\alpha|I)$ as in (D.1),
\[
\varrho_{\alpha L}=\frac{(h+\alpha(1-\alpha))^{1/2}\log L}{Lh^{D_M+(D_M\vee1)/2}},\quad
\widetilde R(d;\alpha,I)=\widehat R\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+\varrho_{\alpha L}d;\alpha,I\right)-\widehat R\left(\overline b(\alpha|I)+\widehat e(\alpha|I);\alpha,I\right),
\]
which is such that $\varrho_{\alpha L}=o(1)$ by $\log L/(Lh^{D_M+1})=o(1)$, and
\[
\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}=\arg\min_d\widetilde R(d;\alpha,I).
\]
It follows that
\[
\left\{\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}=\bigcup_{\alpha\in[0,1]}\left\{\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}\subset\bigcup_{\alpha\in[0,1]}\left\{\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le\inf_{\|d\|\le t}\widetilde R(d;\alpha,I)\right\}\subset\bigcup_{\alpha\in[0,1]}\left\{\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le0\right\}
\]
since $\inf_{\|d\|\le t}\widetilde R(d;\alpha,I)\le\widetilde R(0;\alpha,I)=0$. The next step uses a convexity argument that can be found in Pollard (1991). For any $d$ with $\|d\|\ge t$, $td/\|d\|$ is a convex combination of $d$ and $0$ with weights $t/\|d\|$ and $1-t/\|d\|$, so convexity yields
\[
\widetilde R\left(\frac{td}{\|d\|};\alpha,I\right)\le\frac{t}{\|d\|}\widetilde R(d;\alpha,I)+\left(1-\frac{t}{\|d\|}\right)\widetilde R(0;\alpha,I),\quad\text{that is}\quad
\widetilde R(d;\alpha,I)\ge\frac{\|d\|}{t}\widetilde R\left(\frac{td}{\|d\|};\alpha,I\right),
\]
so that $\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le0$ implies $\inf_{\|d\|=t}\widetilde R(d;\alpha,I)\le0$, and
\[
\left\{\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}\subset\left\{\inf_{\alpha\in[0,1]}\inf_{\|d\|=t}\widetilde R(d;\alpha,I)\le0\right\}.\quad\text{(D.3)}
\]
Thus it is sufficient to consider those $d$ with $\|d\|=t$.
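Pollard's (1991) convexity inequality used above, namely $\widetilde R(d)\ge(\|d\|/t)\widetilde R(td/\|d\|)$ for convex $\widetilde R$ with $\widetilde R(0)=0$ and $\|d\|\ge t$, can be sanity-checked on any convex function vanishing at the origin. The toy quadratic below is our own choice, not an object from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Q = A @ A.T + np.eye(3)          # positive definite, so R(d) = d'Qd is convex with R(0) = 0
R = lambda d: d @ Q @ d
t = 1.0
for _ in range(1000):
    d = rng.normal(size=3)
    norm = np.linalg.norm(d)
    if norm >= t:
        # Pollard's bound: the value outside the sphere dominates the
        # rescaled value on the sphere of radius t.
        assert R(d) >= (norm / t) * R(t * d / norm) - 1e-9
```

This is what reduces the uniform-in-$\alpha$ control of the minimizer to a control of $\widetilde R$ on the sphere $\|d\|=t$ only.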
The expression of $\widetilde R(d;\alpha,I)$ gives, using two Taylor expansions with integral remainder,
\[
\widetilde R(d;\alpha,I)=\varrho_{\alpha L}d'\widehat R^{(1)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I);\alpha,I\right)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d
\]
\[
=\varrho_{\alpha L}d'\widehat R^{(1)}\left(\overline b(\alpha|I);\alpha,I\right)+\varrho_{\alpha L}d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)du\right]\widehat e(\alpha|I)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d.
\]
Since $\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)+R^{(2)}(\overline b(\alpha|I);\alpha,I)\widehat e(\alpha|I)=0$ by (D.1), it follows that
\[
\widetilde R(d;\alpha,I)=\varrho_{\alpha L}d'\left[\int_0^1\left\{\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)\right\}du\right]\widehat e(\alpha|I)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d.
\]
Lemma B.4 and (C.3) with $s\ge D_M/2$, $\log L/(Lh^{D_M+1})=o(1)$ and Lemma B.2-(ii) give
\[
\sup_{\alpha\in[0,1]}\left\|\frac{\widehat e(\alpha|I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)=o_P\left(h^{D_M/2}\right).
\]
Lemmas B.3 and B.2-(i) then imply, for the first term in $\widetilde R(d;\alpha,I)$, uniformly in $\alpha$ and in $d$ with $\|d\|=t$,
\[
\left|\varrho_{\alpha L}d'\left[\int_0^1\left\{\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)\right\}du\right]\widehat e(\alpha|I)\right|
\]
\[
=\left|\varrho_{\alpha L}d'\left[\int_0^1\left\{R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}du\right]\widehat e(\alpha|I)\right|
\]
\[
=\left|\varrho_{\alpha L}d'\left[O_P\left(h^{-D_M/2}\right)\left\|\widehat e(\alpha|I)\right\|+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right]\widehat e(\alpha|I)\right|
\]
\[
=t\left|\varrho_{\alpha L}\left[O_P\left(\left(\frac{\log L}{Lh^{2D_M}}\right)^{1/2}\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right]O_P\left(\left(\frac{(h+\alpha(1-\alpha))\log L}{Lh^{D_M}}\right)^{1/2}\right)\right|
\]
\[
=t\varrho_{\alpha L}O_P\left(\frac{(h+\alpha(1-\alpha))^{1/2}\log L}{Lh^{D_M+(D_M\vee1)/2}}\right)=t\varrho_{\alpha L}^2O_P(1).
\]
Observe that the condition $\log L/(Lh^{D_M+1})=o(1)$ implies
\[
\frac{\log L}{Lh^{D_M\vee1}}=o(1)\quad\text{and then}\quad\varrho_{\alpha L}=o\left(\left(\frac{(h+\alpha(1-\alpha))\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
Lemmas B.3 and B.2 then imply, for the second term of $\widehat R(d;\alpha,I)$, uniformly in $\alpha$ and $d$ with $\|d\|=t$,
\begin{align*}
&\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\big(b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\big)(1-u)\,du\right]d\\
&=\varrho_{\alpha L}^2d'\left[\int_0^1\left\{R^{(2)}\big(b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\big)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}(1-u)\,du\right]d\\
&=\varrho_{\alpha L}^2d'\left[\int_0^1\left\{R^{(2)}\big(b(\alpha|I);\alpha,I\big)+tO_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}(1-u)\,du\right]d\\
&\ge C\varrho_{\alpha L}^2t^2\big(1+to_P(1)\big).
\end{align*}
Hence, for some $O_P(1)$ and $o_P(1)$ terms which are uniform in $\alpha$,
\begin{align*}
P\left(\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right)&\le P\left(\inf_{\alpha\in[0,1]}\left\{C\varrho_{\alpha L}^2t^2\big(1+to_P(1)\big)+t\varrho_{\alpha L}^2O_P(1)\right\}\le0\right)\\
&=P\big(Ct\big(1+to_P(1)\big)+O_P(1)\le0\big)\le P\big(t\big(1+to_P(1)\big)\le|O_P(1)|\big),
\end{align*}
which can be made as small as needed asymptotically by increasing $t$. This gives the first result of the Theorem.
For the second and third results, observe that $\max_{\alpha\in[0,1]}\varrho_{\alpha L}=\big(\log L/(Lh^{D_M+(D_M\vee1)})\big)^{1/2}$, so that, uniformly in $\alpha$ and $x$,
\begin{align*}
\left|\big(Lh^{D_M+1}\big)^{1/2}P(x)'\widehat d(\alpha|I)\right|&\le(Lh)^{1/2}h^{D_M/2}\max_{x\in\mathcal X}\|P(x)\|\,\big\|\widehat d(\alpha|I)\big\|=O_P\big((Lh)^{1/2}\varrho_{\alpha L}\big)=O_P\left(\frac{h^{1/2}\log L}{(Lh^{D_M+(D_M\vee1)})^{1/2}}\right),\\
\left|\frac{\big(Lh^{D_M+1}\big)^{1/2}P(x)'\widehat d(\alpha|I)}{h}\right|&=O_P\left(\left(\frac{L}{h}\right)^{1/2}\varrho_{\alpha L}\right)=O_P\left(\frac{\log L}{(Lh^{D_M+1+(D_M\vee1)})^{1/2}}\right).
\end{align*}
This ends the proof of the Theorem. $\square$

References

[1] Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7, 186-199.

Online Appendix E: Proofs of main results

E.1 Proof of Theorem 2

Recall that $s_1$ is the row vector $[0,1,0,\ldots,0]$ of dimension $s+2$, let $s_0=[1,0,\ldots,0]$, $S_0=s_0\otimes\mathrm{Id}_K$ and $S_1=s_1\otimes\mathrm{Id}_K$, so that $\widehat\beta_j(\alpha|I)=S_j\widehat\beta(\alpha|I)$, $j=0,1$,
$$\widehat V(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat b(\alpha|I),\qquad \overline V(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]b(\alpha|I).$$
Define, for $\widehat e(\alpha|I)$ as in (D.1),
$$\widehat v(\alpha|x,I)=\overline V(\alpha|x,I)+P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\qquad(\mathrm{E.1})$$
which is such that, for $\widehat d(\alpha|I)$ as in (D.2),
$$\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat d(\alpha|I).$$
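Bandwidth conditions of the form $\log L/(Lh^{D_M+1})=o(1)$, as required by Assumption H, can be checked numerically for a given bandwidth rule. The sketch below is illustrative only: the power $-0.2$ is a hypothetical choice of bandwidth exponent, not one prescribed by the paper.

```python
import math

def h_condition(L, D_M=1, power=0.2):
    # hypothetical bandwidth h = L^(-power); the condition
    # log L / (L h^(D_M + 1)) -> 0 holds iff power * (D_M + 1) < 1
    h = L ** (-power)
    return math.log(L) / (L * h ** (D_M + 1))

# the ratio shrinks along L = 10^3, ..., 10^8 for this choice
ratios = [h_condition(10 ** k) for k in range(3, 9)]
```

With `D_M = 1` and `power = 0.2`, the ratio behaves like $\log L/L^{0.6}$ and so vanishes; choosing `power` too large (above $1/(D_M+1)$) would make it diverge instead.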
As the eigenvalues of $\int_{\mathcal X}P(x)P(x)'\,dx$ are bounded away from infinity under Assumption R-(i),
$$\int_{\mathcal X}\int_0^1\big(\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)\big)^2\,d\alpha\,dx=O\left(\frac{\sup_{\alpha\in[0,1]}\big\|\widehat d(\alpha|I)\big\|^2}{h}\right)=O_P\left(\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}\right)$$
by Theorem D.1, which gives (4.5) since Assumption H ensures
$$\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}=o(1).$$
That $\mathrm{bias}_{IL}=O(1)$ and $\Sigma_{IL}=O(1)$ similarly follows from Assumption R-(i) and Proposition C.1-(i). Since $E[\widehat e(\alpha|I)]=-\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}R^{(1)}\big(b(\alpha|I);\alpha,I\big)=0$ for all $\alpha$ in $[0,1]$, it holds
$$E\left[\int_{\mathcal X}\int_0^1\big(\widehat v(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx\right]=\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx+\int_{\mathcal X}\int_0^1E\left[\left(P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\right)^2\right]d\alpha\,dx.$$
For the bias part, Theorem C.4 gives
\begin{align*}
\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx&=\int_{\mathcal X}\int_0^1\left(h^{s+1}P(x)'\frac{\alpha\,\mathrm{bias}_h(\alpha|I)}{I-1}+o\big(h^{s+1}\big)\right)^2d\alpha\,dx\\
&=h^{2(s+1)}\int_{\mathcal X}\int_0^1\left(P(x)'\frac{\alpha\,\mathrm{bias}_h(\alpha|I)}{I-1}\right)^2d\alpha\,dx+o\big(h^{2(s+1)}\big).
\end{align*}
Since $\alpha\,\mathrm{bias}_h(\alpha|I)/(I-1)$ differs from $\mathrm{bias}(\alpha|I)$ only for $\alpha$ in $[0,h]$ or $[1-h,1]$,
$$\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx=h^{2(s+1)}\int_{\mathcal X}\int_0^1\big(P(x)'\mathrm{bias}(\alpha|I)\big)^2\,d\alpha\,dx+o\big(h^{2(s+1)}\big)=h^{2(s+1)}\mathrm{bias}_{IL}^2+o\big(h^{2(s+1)}\big).$$
Arguing similarly with Lemma B.5-(i) yields
$$\int_{\mathcal X}\int_0^1E\left[\left(P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\right)^2\right]d\alpha\,dx=\int_{\mathcal X}\int_0^1E\left[\left(\frac{P(x)'\alpha S_1\widehat e(\alpha|I)}{h(I-1)}\right)^2\right]d\alpha\,dx+O\left(\frac{1}{Lh^{D_M}}\right)=\frac{\sigma_{LI}^2}{LIh^{D_M+1}}+o\left(\frac{1}{Lh^{D_M+1}}\right).$$
Substituting in the bias-variance decomposition of the integrated mean squared error ends the proof of the Theorem. $\square$

E.2 Proof of Theorem 3

Assumption R-(i) and Proposition C.1-(i) imply that $P(x)'\Sigma_h(\alpha|I)P(x)=0$ holds only if $P(x)=0$, which is impossible in the AQR case. In the ASQR case, if $P(x)=0$ for some $x\in\mathcal X$ and all $K$ large enough, the approximation property S cannot hold, contradicting Assumption S-(ii). Assumptions R-(i), H and Proposition C.1-(i) imply
$$\max_{x\in\mathcal X}\big(P(x)'\Sigma_h(\alpha|I)P(x)\big)=O\left(\max_{x\in\mathcal X}\|P(x)\|^2\right)=O\big(h^{-D_M}\big).$$
By Theorem D.1, Lemma B.5, Assumptions R-(i) and H, and using the same notations as in the proof of Theorem 2,
\begin{align*}
&\big(Lh^{D_M+1}\big)^{1/2}\left(\widehat V(\alpha|x,I)-V(\alpha|x,I)-\frac{P'(x)\alpha S_1\widehat e(\alpha|I)}{h(I-1)}-\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)\right)\\
&=\big(Lh^{D_M+1}\big)^{1/2}\left\{P'(x)S_0\widehat e(\alpha|I)+P'(x)\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat d(\alpha|I)\right\}\\
&=\big(Lh^{D_M+1}\big)^{1/2}O_P\left(\frac{1}{(Lh^{D_M})^{1/2}}\right)+O\left(\frac{\big\|P(x)'\widehat d(\alpha|I)\big\|}{h}\right)\big(Lh^{D_M+1}\big)^{1/2}\\
&=O_P\left(h^{1/2}+\left(\frac{\log L}{Lh^{D_M-(D_M\vee1)}}\right)^{1/2}\right)=o_P(1).
\end{align*}
Since $\overline V(\alpha|x,I)-V(\alpha|x,I)=h^{s+1}P(x)'\mathrm{Bias}_h(\alpha|I)+o(h^{s+1})$, it remains to show that
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\frac{\alpha P(x)'S_1\widehat e(\alpha|I)}{h(I-1)}\stackrel{d}{\longrightarrow}N(0,1).$$
Write
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\frac{\alpha P(x)'S_1\widehat e(\alpha|I)}{h(I-1)}=\sum_{\ell=1}^Lr_\ell(\alpha|x,I)$$
with $r_\ell(\alpha|x,I)=\mathbb{I}(I_\ell=I)\sum_{i=1}^{I_\ell}r_{i\ell}(\alpha|x,I)$ and
\begin{align*}
r_{i\ell}(\alpha|x,I)&=\left(\frac{\alpha^2}{LIh(I-1)^2}\right)^{1/2}\frac{P(x)'}{\big(P(x)'\Sigma_h(\alpha|I)P(x)\big)^{1/2}}\,S_1\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\\
&\quad\times\int_{-\alpha/h}^{(1-\alpha)/h}\big\{\mathbb{I}\big(B_{i\ell}\le P(x_\ell,t)'b(\alpha|I)\big)-(\alpha+ht)\big\}P(x_\ell,t)K(t)\,dt.
\end{align*}
Since $E[r_\ell(\alpha|x,I)]=0$ and $\max_{1\le\ell\le L}\big|L\,\mathrm{Var}(r_\ell(\alpha|x,I))-1\big|=o(1)$, it is sufficient to show that $\max_{1\le\ell\le L}E\big[|r_\ell(\alpha|x,I)|^3\big]=o(1)$ holds, see e.g. the Central Limit Theorem on p. 179 in Pollard (2002). But by Assumption R-(i), Proposition C.1-(i), Lemma B.2 and (C.3),
$$|r_{i\ell}(\alpha|x,I)|\le\frac{C}{(Lh)^{1/2}}\,\frac{\|P(x)\|}{\|P(x)\|}\times\max_{x\in\mathcal X}\|P(x)\|=O\left(\frac{1}{(Lh^{D_M+1})^{1/2}}\right).$$
It follows by Assumption H that
$$\max_{1\le\ell\le L}E\big[|r_\ell(\alpha|x,I)|^3\big]\le I\max_{1\le\ell\le L,\,1\le i\le I_\ell}|r_{i\ell}(\alpha|x,I)|\max_{1\le\ell\le L}E\big[r_\ell^2(\alpha|x,I)\big]=O\left(\frac{1}{(Lh^{D_M+1})^{1/2}}\right)=o(1).$$
This ends the proof of the Theorem. $\square$

E.3 Proof of Theorem 4

The proof of Theorem 4 requires some specific additional results. The next Lemma gives an expansion for
$$C_h=\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\left\{\int\int\frac1h\pi\left(\frac{a_1-\alpha_1}{h}\right)K\left(\frac{a_1-\alpha_1}{h}\right)\times\frac1h\pi\left(\frac{a_2-\alpha_2}{h}\right)'K\left(\frac{a_2-\alpha_2}{h}\right)[a_1\wedge a_2-a_1a_2]\,da_1\,da_2\right\}d\alpha_1\,d\alpha_2.$$
Below, $s_0'=[1,0,\ldots,0]$, $s_1'=[0,1,0,\ldots,0]$ and $s_2'=[0,0,1,0,\ldots,0]$ are vectors of dimension $s+2$.
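The argument above is a central limit theorem for a standardized sum of independent, bounded, mean-zero terms $r_\ell$. A small simulation (an illustrative sketch with arbitrary bounded summands, not the paper's $r_\ell$) shows such a standardized sum behaving like a standard normal:

```python
import random
import statistics

random.seed(0)
L = 400      # number of summands per sum
reps = 2000  # number of simulated standardized sums

def standardized_sum():
    # bounded, mean-zero summands: uniform on [-1, 1], variance 1/3 each,
    # so the sum has variance L/3 and is standardized by (L/3)^(1/2)
    draws = [random.uniform(-1.0, 1.0) for _ in range(L)]
    return sum(draws) / (L / 3) ** 0.5

z = [standardized_sum() for _ in range(reps)]
m = statistics.mean(z)
v = statistics.variance(z)
```

The sample mean of the standardized sums is close to 0 and their sample variance close to 1, as the normal limit predicts.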
Lemma E.1 Suppose that Assumption H holds. Assume that $f(\cdot)=f_h(\cdot)$ and $g(\cdot)=g_h(\cdot)$ are continuously differentiable functions with, when $h$ goes to $0$,
$$\sup_{\alpha\in[0,1]}|f(\alpha)|=O(1)\ \text{and}\ \sup_{\alpha\in[0,1]}|g(\alpha)|=O(1),\qquad \sup_{\alpha\in[h,1-h]}\big|f^{(1)}(\alpha)\big|=O(1)\ \text{and}\ \sup_{\alpha\in[h,1-h]}\big|g^{(1)}(\alpha)\big|=O(1),$$
$$\sup_{\alpha\in[0,h]\cup[1-h,1]}\big|f^{(1)}(\alpha)\big|=O\left(\frac1h\right)\ \text{and}\ \sup_{\alpha\in[0,h]\cup[1-h,1]}\big|g^{(1)}(\alpha)\big|=O\left(\frac1h\right).$$
Then, if $A$ is a random variable with a uniform distribution over $[0,1]$,
\begin{align*}
C_h&=\mathrm{Cov}\left(\left[\int_0^Ag(a)\Omega_h(a)\,da\right]s_0,\left[\int_0^Af(a)\Omega_h(a)\,da\right]s_0\right)\\
&\quad+h\left\{\mathrm{Cov}\left(g(A)\Omega_h(A)s_1,\left[\int_0^Af(a)\Omega_h(a)\,da\right]s_0\right)+\mathrm{Cov}\left(\left[\int_0^Ag(a)\Omega_h(a)\,da\right]s_0,f(A)\Omega_h(A)s_1\right)\right\}\\
&\quad+h^2\,\mathrm{Cov}\big(g(A)\Omega_h(A)s_1,f(A)\Omega_h(A)s_1\big)-h^2E\big[f(A)\Omega_h(A)\big[s_2s_0'+s_0s_2'\big]g(A)\Omega_h(A)\big]+o\big(h^2\big).
\end{align*}
Proof of Lemma E.1: see Appendix F.

Consider two functions $\varphi_0(\alpha|x)$ and $\varphi_1(\alpha|x)$ and define
$$\widehat I_\varphi(x|I)=\int_0^1\left[\varphi_0(\alpha|x)s_0'+\frac{\varphi_1(\alpha|x)s_1'}{h}\right]\otimes P'(x)\times\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\,d\alpha.$$
The purpose of the next Lemma is to compute the variance of this integral.
Define for this purpose
$$P_0=P_0(I)=E\big[\mathbb{I}(I_\ell=I)P(x_\ell)P(x_\ell)'\big],\qquad P_1(\alpha)=P_1(\alpha|I)=E\left[\frac{P(x_\ell)P(x_\ell)'\,\mathbb{I}(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$P_2(\alpha)=P_2(\alpha|I)=-E\left[\frac{P(x_\ell)P(x_\ell)'\,\mathbb{I}(I_\ell=I)\,B^{(2)}(\alpha|x_\ell,I_\ell)}{\big(B^{(1)}(\alpha|x_\ell,I_\ell)\big)^2}\right],$$
and set $M_1(\alpha)=\Omega_h(\alpha)\otimes P_1(\alpha)$, $M_2(\alpha)=\Omega_h(\alpha)\otimes P_2(\alpha)$.

Lemma E.2 Suppose $s\ge D_M/2$, and that Assumptions A, H, S and R hold. Assume that $\varphi_0(\alpha|x)$, $\varphi_1(\alpha|x)$ and $\partial\varphi_1(\alpha|x)/\partial\alpha$ are continuous functions in $(\alpha,x)\in[0,1]\times\mathcal X$. Let $A$ be a random variable with a uniform distribution over $[0,1]$. Then $\mathrm{Var}\big(\sqrt{LIh^{D_M}}\,\widehat I_\varphi(x|I)\big)=\sigma_L^2(x|I)+\big\|h^{D_M/2}P(x)\big\|^2o(1)$ with
$$\sigma_L^2(x|I)=\mathrm{Var}\left[h^{D_M/2}P'(x)\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha|I)^{-1}\,d\alpha\;P_0(I)^{1/2}\right]$$
and $\mathrm{Var}\big(\sqrt{LI}\int_{\mathcal X}\widehat I_\varphi(x|I)\,dx\big)=\sigma_L^2(I)+o(1)$ with
$$\sigma_L^2(I)=\mathrm{Var}\left[\int_0^A\left\{\int_{\mathcal X}P'(x)\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)dx\right\}P_1(\alpha|I)^{-1}\,d\alpha\;P_0^{1/2}(I)\right].$$

Proof of Lemma E.2. Abbreviate $R^{(2)}\big(b(\alpha|I);\alpha,I\big)$ and $\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)$ into $R^{(2)}(\alpha)$ and $\widehat R^{(1)}(\alpha)$ respectively. We now give a suitable expansion for $R^{(2)}(\alpha)^{-1}$. From the end of the proof of Lemma B.2 and Theorem C.4, it holds
$$R^{(2)}(\alpha)=\int\left[\int_{I_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'K(t)\,g\big[B(\alpha+ht|x,I)+o\big(h^{s+1}\big)\,\big|\,x,I\big]\,dt\right]\otimes P(x)P(x)'\,f(x,I)\,dx.$$
Since $s\ge1$, $B^{(1)}(\cdot|x,I)$ is continuously differentiable. A first-order Taylor expansion gives that, uniformly,
$$R^{(2)}(\alpha)=M_1(\alpha)+hM_2(\alpha)+o(h).$$
It then follows, uniformly over $[0,1]$,
$$\big[R^{(2)}(\alpha)\big]^{-1}=\big[\mathrm{Id}+hM_1(\alpha)^{-1}M_2(\alpha)+o(h)\,\mathrm{Id}\big]^{-1}M_1(\alpha)^{-1}=M_1(\alpha)^{-1}-hM_1(\alpha)^{-1}M_2(\alpha)M_1(\alpha)^{-1}+o(h)\,\mathrm{Id}.$$
Now $M_1(\alpha)^{-1}=\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}$ and
$$M_1(\alpha)^{-1}M_2(\alpha)M_1(\alpha)^{-1}=\big[\Omega_h(\alpha)^{-1}\Omega_h(\alpha)\Omega_h(\alpha)^{-1}\big]\otimes\big[P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]$$
with $s_1'\Omega_h(\alpha)^{-1}\Omega_h(\alpha)=s_1'+c(\alpha)s_p'$, where $c(\alpha)=c_h(\alpha)$ and the entries of $\Omega_h(\alpha)^{-1}$ satisfy the smoothness conditions of Lemma E.1. This gives, since the eigenvalues of $\Omega_h(\alpha)^{-1}$ and $P_1(\alpha)^{-1}$ are bounded away from infinity uniformly in $\alpha$,
$$\mathrm{Var}^{1/2}\big(\sqrt{LI}\,\widehat I_\varphi(x|I)\big)=\mathrm{Var}^{1/2}\big(\widehat I_0(x|I)+\widehat I_1(x|I)+\widehat I_2(x|I)+\widehat I_p(x|I)\big)+o(1)\,\|P(x)\|\left\|\mathrm{Var}^{1/2}\left(\sqrt{LI}\int_0^1\widehat R^{(1)}(\alpha)\,d\alpha\right)\right\|$$
with
\begin{align*}
\widehat I_0(x|I)&=\sqrt{LI}\int_0^1\varphi_0(\alpha|x)\,[s_0\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_1(x|I)&=-h\sqrt{LI}\int_0^1\varphi_0(\alpha|x)\,[s_0\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_2(x|I)&=\sqrt{LI}\int_0^1\varphi_1(\alpha|x)\left[\frac{s_1}{h}\otimes P(x)\right]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_p(x|I)&=\sqrt{LI}\int_0^1\varphi_1(\alpha|x)\,c(\alpha)\,[s_p\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha.
\end{align*}
Observe now that, for any functions $f(\cdot)$ and $g(\cdot)$ satisfying the conditions of Lemma E.1,
\begin{align*}
C_h(f,g)&=E\bigg[\mathbb{I}(I_\ell=I)\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\\
&\quad\times\bigg\{G\bigg[\min\bigg(P\Big(x_\ell,\frac{a_1-\alpha_1}{h}\Big)'b(\alpha_1|I),P\Big(x_\ell,\frac{a_2-\alpha_2}{h}\Big)'b(\alpha_2|I)\bigg)\bigg|x_\ell,I\bigg]\\
&\qquad-G\bigg[P\Big(x_\ell,\frac{a_1-\alpha_1}{h}\Big)'b(\alpha_1|I)\bigg|x_\ell,I\bigg]G\bigg[P\Big(x_\ell,\frac{a_2-\alpha_2}{h}\Big)'b(\alpha_2|I)\bigg|x_\ell,I\bigg]\bigg\}\\
&\quad\times R^{(2)}(\alpha_1)^{-1}\bigg\{\bigg[\pi\Big(\frac{a_1-\alpha_1}{h}\Big)\pi\Big(\frac{a_2-\alpha_2}{h}\Big)'\bigg]\otimes\big[P(x_\ell)P(x_\ell)'\big]\bigg\}R^{(2)}(\alpha_2)^{-1}\\
&\quad\times\frac{1}{h^2}K\Big(\frac{a_1-\alpha_1}{h}\Big)K\Big(\frac{a_2-\alpha_2}{h}\Big)\,d\alpha_1\,d\alpha_2\bigg].
\end{align*}
Now (C.3), $\max_{(x,t)\in\mathcal X\times[-1,1]}\|P(x,t)\|=O\big(h^{-D_M/2}\big)$ and Lemma B.1-(iii) give
$$P\Big(x_\ell,\frac{a-\alpha}{h}\Big)'b(\alpha|I)=B(a|x_\ell,I)+o\big(h^{s+1-D_M/2}\big)$$
uniformly in $a$, $\alpha$ and $x_\ell$ with $(a-\alpha)/h$ in the support of $K(\cdot)$, $|a-\alpha|\le h$. Since $s+1-D_M/2\ge1$,
\begin{align*}
C_h(f,g)&=\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\{a_1\wedge a_2-a_1a_2\}\\
&\quad\times R^{(2)}(\alpha_1)^{-1}\bigg\{\bigg[\pi\Big(\frac{a_1-\alpha_1}{h}\Big)\pi\Big(\frac{a_2-\alpha_2}{h}\Big)'\bigg]\otimes P_0\bigg\}R^{(2)}(\alpha_2)^{-1}\\
&\quad\times\frac{1}{h^2}K\Big(\frac{a_1-\alpha_1}{h}\Big)K\Big(\frac{a_2-\alpha_2}{h}\Big)\,d\alpha_1\,d\alpha_2\,dx_1\,dx_2+o(1)\,\mathrm{Id}.
\end{align*}
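The kernel $a_1\wedge a_2-a_1a_2$ appearing in $C_h(f,g)$ is the covariance of a Brownian bridge, i.e. $\mathrm{Cov}\big(\mathbb{I}(A\le a_1),\mathbb{I}(A\le a_2)\big)$ for $A$ uniform on $[0,1]$, and its double integral over the unit square equals $1/12$. A small quadrature sketch (illustrative only, not part of the proof) confirms this value:

```python
def bridge_double_integral(n=400):
    # midpoint-rule approximation of the double integral of
    # min(a1, a2) - a1 * a2 over the unit square [0, 1]^2
    total = 0.0
    for i in range(n):
        a1 = (i + 0.5) / n
        for j in range(n):
            a2 = (j + 0.5) / n
            total += min(a1, a2) - a1 * a2
    return total / n ** 2

val = bridge_double_integral()
```

The grid value agrees with the exact answer $\int_0^1\int_0^1 a_1\wedge a_2\,da_1da_2-1/4=1/3-1/4=1/12$.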
Now applying Lemma E.1 gives, since $p\ge2$,
$$\mathrm{Var}\big(\widehat I_p(x|I)\big)=\|P(x)\|^2o\big(h^2\big),\qquad \mathrm{Cov}\big(\widehat I_p(x|I),\widehat I_j(x|I)\big)=\|P(x)\|^2o(1),\ j=0,1,2,$$
$\big\|\mathrm{Var}\big(\sqrt{LI}\int_0^1\widehat R^{(1)}(\alpha)\,d\alpha\big)\big\|=O(1)$, and
\begin{align*}
&\mathrm{Var}\big(\widehat I_0(x|I)+\widehat I_1(x|I)+\widehat I_2(x|I)\big)\\
&=P'(x)\bigg\{\mathrm{Var}\left[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha\;P_0^{1/2}\right]\\
&\qquad-2\,\mathrm{Cov}\left[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha\;P_0^{1/2},\varphi_1(A|x)P_1(A)^{-1}P_0^{1/2}\right]\\
&\qquad+\mathrm{Var}\big[\varphi_1(A|x)P_1(A)^{-1}P_0^{1/2}\big]\bigg\}P(x)+o(1)\,\|P(x)\|^2\\
&=\mathrm{Var}\bigg\{P'(x)\bigg[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha-\varphi_1(A|x)P_1(A)^{-1}\bigg]P_0^{1/2}\bigg\}+o(1)\,\|P(x)\|^2.
\end{align*}
Observe now that
$$\frac{\partial}{\partial\alpha}\big[\varphi_1(\alpha|x)P_1(\alpha)^{-1}\big]=\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}$$
so that
$$\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha-\varphi_1(A|x)P_1(A)^{-1}=\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha)^{-1}\,d\alpha-\varphi_1(0|x)P_1(0)^{-1}.$$
This gives
$$\mathrm{Var}\big(\sqrt{LI}\,\widehat I_\varphi(x|I)\big)=\mathrm{Var}\left\{P'(x)\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha)^{-1}P_0^{1/2}\,d\alpha\right\}+o(1)\,\|P(x)\|^2$$
as stated in the first result of the Lemma. The second similarly follows, observing that $\big\|\int_{\mathcal X}\varphi_j(\alpha|x)P(x)\,dx\big\|=O(1)$, $j=0,1$. $\square$

Consider two real-valued continuous functions $F_0(b_0,b_1)$ and $F_1(b_0,b_1)$.
Define
$$\varphi_0(\alpha|x,I)=F_0\big(B(\alpha|x,I),B^{(1)}(\alpha|x,I)\big),\qquad \varphi_1(\alpha|x,I)=F_1\big(B(\alpha|x,I),B^{(1)}(\alpha|x,I)\big),$$
$$\widehat I_F(x|I)=\int_0^1\left[\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right]\otimes P'(x)\times\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\,d\alpha.$$
A condition ensuring that the variances $\sigma_L^2(x|I)$ and $\sigma_L^2(I)$ of Lemma E.2 do not vanish is (4.9), that is
$$\varphi_0(\alpha|x,I)-\frac{\partial\varphi_1(\alpha|x,I)}{\partial\alpha}\ne0.$$

Proposition E.3 Suppose $s\ge D_M/2$, and that Assumptions A, H, S and R hold. Assume that $\varphi_0(\alpha|x)$, $\varphi_1(\alpha|x)$ and $\partial\varphi_1(\alpha|x)/\partial\alpha$ are continuous functions in $(\alpha,x)\in[0,1]\times\mathcal X$. Let $\sigma_L^2(x|I)$ and $\sigma_L^2(I)$ be as in Lemma E.2. Then, if (4.9) holds for some $\alpha$ of $[0,1]$ and if $Lh^{D_M+2}$ diverges, $\sqrt{LIh^{D_M}}\,\widehat I_F(x|I)/\sigma_L(x|I)$ converges in distribution to a standard normal. If (4.9) holds for some $(\alpha,x)$ of $[0,1]\times\mathcal X$ and $Lh^2$ diverges, $\sqrt{LI}\int_{\mathcal X}\widehat I_F(x|I)\,dx/\sigma_L(I)$ converges in distribution to a standard normal.

Proof of Proposition E.3. The eigenvalues of $P_1(\alpha)^{-1}$, $P_2(\alpha)$ and $P_0$ are bounded uniformly in $K$ and $\alpha$ by Assumptions R and S, and $\big\|h^{D_M/2}P(x)\big\|$ is bounded away from 0 and infinity by Assumptions R and H. Then, if (4.9) holds for some $\alpha$, $\sigma_L^2(x|I)$ is bounded away from 0 and infinity and the exact order of $\mathrm{Var}\big(\widehat I_F(x|I)\big)$ is $1/(LIh^{D_M})$. We now check the Lyapounov condition. Write $\widehat R^{(1)}(\alpha)=\frac{1}{LI}\sum_{\ell=1}^L\mathbb{I}[I_\ell=I]r_\ell(\alpha)$, with
$$r_\ell(\alpha)=\sum_{i=1}^{I_\ell}\int_{-\alpha/h}^{(1-\alpha)/h}\big\{\mathbb{I}\big(B_{i\ell}\le P(x_\ell,t)'b(\alpha|I)\big)-(\alpha+ht)\big\}\,\pi(t)\otimes P(x_\ell)\,K(t)\,dt.$$
This gives, since the eigenvalues of $R^{(2)}(\alpha)$ are asymptotically bounded away from 0 by Lemma B.2 and (C.3),
$$E\left[\left|\int_0^1\left[\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right]\otimes P'(x)\big[R^{(2)}(\alpha)\big]^{-1}\frac{r_\ell(\alpha)-E[r_\ell(\alpha)]}{LI}\,d\alpha\right|^3\right]\le C\frac{h^{-1}\max_{x\in\mathcal X}\|P(x)\|^2}{LI}\cdot\frac{1}{LI}\,\mathrm{Var}\big(\widehat I_F(x|I)\big)=\frac{C}{L^2h^{D_M+1}}\,\mathrm{Var}\big(\widehat I_F(x|I)\big).$$
That $Lh^{D_M+2}$ diverges implies that the Lyapounov condition holds since
$$\frac{C}{L^2h^{D_M+1}}\cdot\frac{L\,\mathrm{Var}\big(\widehat I_F(x|I)\big)}{\mathrm{Var}^{3/2}\big(\widehat I_F(x|I)\big)}=\frac{C}{Lh^{D_M+1}\,\mathrm{Var}^{1/2}\big(\widehat I_F(x|I)\big)}=O\left(\frac{1}{(Lh^{D_M+2})^{1/2}}\right)\to0,$$
hence $\widehat I_F(x|I)/\mathrm{Var}^{1/2}\big(\widehat I_F(x|I)\big)$ is asymptotically $N(0,1)$. For $\sqrt{LI}\int_{\mathcal X}\widehat I_F(x|I)\,dx$, recall that $\big\|\int_{\mathcal X}|P(x)|\,dx\big\|=O(1)$ by Assumption R. This also gives
$$E\left[\left|\int_0^1\left[\int_{\mathcal X}\left(\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right)\otimes P'(x)\,dx\right]\big[R^{(2)}(\alpha)\big]^{-1}\frac{r_\ell(\alpha)-E[r_\ell(\alpha)]}{LI}\,d\alpha\right|^3\right]\le C\frac{h^{-1}}{LI}\cdot\frac{1}{LI}\,\mathrm{Var}\left(\int_{\mathcal X}\widehat I_F(x|I)\,dx\right)=\frac{C}{L^2h}\,\mathrm{Var}\left(\int_{\mathcal X}\widehat I_F(x|I)\,dx\right).$$
The Lyapounov condition holds when $Lh^2$ diverges, because
$$\frac{C}{L^2h}\cdot\frac{L\,\mathrm{Var}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}{\mathrm{Var}^{3/2}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}=\frac{C}{Lh\,\mathrm{Var}^{1/2}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}=O\left(\frac{1}{(Lh^2)^{1/2}}\right)\to0.\ \square$$

Proof of Theorem 4. Let $\widehat d(\alpha|I)$ and $\widehat e(\alpha|I)$ be as in (D.2) and (D.1),
$$\widehat e(\alpha|I)=-\big(R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big)^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big),\qquad \widehat d(\alpha|I)=\widehat b(\alpha|I)-b(\alpha|I)-\widehat e(\alpha|I).$$
Let $\widehat I_F(x|I)$ be as above, replacing $\varphi_j(\cdot)$ with $\varphi_{jI}(\cdot)$, $j=0,1$.
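The Lyapounov ratio above is of order $(Lh^{D_M+2})^{-1/2}$, so it vanishes exactly when $Lh^{D_M+2}$ diverges. A numerical sketch (with a hypothetical bandwidth exponent, chosen only for illustration) makes the rate concrete:

```python
def lyapunov_ratio(L, D_M=1, power=0.25):
    # hypothetical h = L^(-power); the ratio is of order (L h^(D_M+2))^(-1/2),
    # which vanishes iff power * (D_M + 2) < 1
    h = L ** (-power)
    return (L * h ** (D_M + 2)) ** -0.5

# with D_M = 1 and power = 0.25, the ratio behaves like L^(-1/8)
r = [lyapunov_ratio(10 ** k) for k in range(3, 9)]
```

The monotone decay of `r` illustrates the condition; with `power` above $1/(D_M+2)$ the ratio would not vanish and the normal approximation argument would fail.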
Then the second-order Taylor inequality gives
\begin{align*}
\widehat\theta(x)-\theta(x)&=\sum_{I\in\mathcal I}\int_0^1\Big[\varphi_{0I}(\alpha,x)\big(\overline B(\alpha|x,I)-B(\alpha|x,I)\big)+\varphi_{1I}(\alpha,x)\big(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\big)\Big]\,d\alpha+\sum_{I\in\mathcal I}\widehat I_F(x|I)\\
&\quad+\sum_{I\in\mathcal I}\int_0^1\left[\left(\varphi_{0I}(\alpha,x)s_0'+\frac{\varphi_{1I}(\alpha,x)s_1'}{h}\right)\otimes P'(x)\right]\widehat d(\alpha|I)\,d\alpha\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\Big[\big(\overline B(\alpha|x,I)-B(\alpha|x,I)\big)^2+\big(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\big)^2\Big]\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left[\big([s_0'\otimes P'(x)]\widehat e(\alpha|I)\big)^2+\left(\left[\frac{s_1'}{h}\otimes P'(x)\right]\widehat e(\alpha|I)\right)^2\right]\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left[\big([s_0'\otimes P'(x)]\widehat d(\alpha|I)\big)^2+\left(\left[\frac{s_1'}{h}\otimes P'(x)\right]\widehat d(\alpha|I)\right)^2\right].
\end{align*}
Hence
$$\widehat\theta(x)-\theta(x)=o(h^s)+\sum_{I\in\mathcal I}\widehat I_F(x|I)+\frac{1}{(Lh^{D_M})^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2+(D_M\vee1)})^{1/2}}+\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)=o(h^s)+\sum_{I\in\mathcal I}\widehat I_F(x|I)+o_P\left(\frac{1}{(Lh^{D_M})^{1/2}}\right).$$
Proposition E.3 then gives the result since the $\widehat I_F(x|I)$ are independent. The asymptotic normality of $\widehat\theta$ similarly follows from Assumption R, which gives $\big\|\int_{\mathcal X}|P(x)|\,dx\big\|=O(1)$, and Theorem D.1, which implies
\begin{align*}
\widehat\theta-\theta&=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+O\left(\sup_{\alpha\in[0,1]}\frac{\big\|\widehat d(\alpha|I)\big\|}{h}\right)+\frac{1}{L^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)\\
&=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+\frac{1}{L^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+o_P\left(\frac{1}{L^{1/2}}\right).
\end{align*}
$\square$

E.4 Proof of Theorem A.1

By Theorems C.4 and D.1, Lemma B.5 and using the notations of the proof of Theorem 2,
\begin{align*}
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\widehat B(\alpha|x,I)-B(\alpha|x,I)\big|&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\big[\widehat b(\alpha|I)-b(\alpha|I)\big]\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\overline B(\alpha|x,I)-B(\alpha|x,I)\big|\\
&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\widehat e(\alpha|I)\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big\|P(x)'S_0\widehat d(\alpha|I)\big\|+o\big(h^{s+1}\big)\\
&=O_P\left[\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\left\{1+\left(\frac{\log L}{Lh^{D_M+(D_M\vee1)}}\right)^{1/2}\right\}\right]+o\big(h^{s+1}\big)\\
&=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)+o\big(h^{s+1}\big),
\end{align*}
\begin{align*}
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\widehat V(\alpha|x,I)-V(\alpha|x,I)\big|&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'\left(S_0+\frac{\alpha}{h}S_1\right)\big[\widehat b(\alpha|I)-b(\alpha|I)\big]\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\overline V(\alpha|x,I)-V(\alpha|x,I)\big|\\
&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\widehat e(\alpha|I)\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right|\\
&\quad+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left\|P(x)'\left(S_0\widehat d(\alpha|I)+\frac{\alpha S_1\widehat d(\alpha|I)}{h}\right)\right\|+O\big(h^{s+1}\big)\\
&=O_P\left[\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\left\{1+\left(\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}\right)^{1/2}\right\}\right]+O\big(h^{s+1}\big)\\
&=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)+O\big(h^{s+1}\big).
\end{align*}
This ends the proof of the Theorem.
$\square$

Online Appendix F: Proofs of intermediary results

F.1 Lemmas B.1, B.2 and C.3

Proof of Lemma B.1. Consider the harder ASQR case. (i) It holds that, for $\beta_k(\cdot|\cdot)$ as in (2.11),
\begin{align*}
B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)+\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)-\sum_{k=1}^KP_k(x)\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\\
&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\sum_{k=1}^KP_k(x)\left(\beta_k(\alpha+ht|I)-\sum_{p=0}^{s}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\right)-\frac{(ht)^{s+1}}{(s+1)!}\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I).
\end{align*}
A Taylor expansion with integral remainder gives
$$\beta_k(\alpha+ht|I)-\sum_{p=0}^{s}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)=\frac{(ht)^{s+1}}{s!}\int_0^1\beta_k^{(s+1)}(\alpha+uht|I)(1-u)^s\,du$$
so that
\begin{align*}
B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\frac{(ht)^{s+1}}{s!}\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha+uht|I)-B^{(s+1)}(\alpha+uht|x,I)\right\}(1-u)^s\,du\\
&\quad+\frac{(ht)^{s+1}}{s!}\int_0^1\big\{B^{(s+1)}(\alpha+uht|x,I)-B^{(s+1)}(\alpha|x,I)\big\}(1-u)^s\,du\\
&\quad+\frac{(ht)^{s+1}}{(s+1)!}\left\{B^{(s+1)}(\alpha|x,I)-\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I)\right\}.
\end{align*}
Hence, since $B^{(s+1)}(\alpha|x,I)$ is continuous, by Property S and Proposition C.1,
$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)\big|=o\big(h^{s+1}\big)+o\Big(K^{-\frac{s+1}{D_M}}\Big)=o\big(h^{s+1}\big)\qquad(\mathrm{F.1})$$
since $K^{-1/D_M}=O(h)$.
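The order-$(s+1)$ Taylor expansion with integral remainder used for $\beta_k$ is the general identity $g(\alpha+\delta)-\sum_{p\le s}\frac{\delta^p}{p!}g^{(p)}(\alpha)=\frac{\delta^{s+1}}{s!}\int_0^1g^{(s+1)}(\alpha+u\delta)(1-u)^s\,du$. A numerical sketch for $s=2$ and $g=\exp$ (illustrative only, not part of the proof):

```python
import math

def taylor_order_s(alpha, delta, s=2, n=20000):
    # for g = exp every derivative is exp, so the identity reads
    # exp(alpha+delta) = sum_{p<=s} delta^p exp(alpha)/p!
    #   + delta^(s+1)/s! * ∫_0^1 exp(alpha + u*delta) (1-u)^s du
    partial = math.exp(alpha) * sum(delta ** p / math.factorial(p)
                                    for p in range(s + 1))
    integral = sum(math.exp(alpha + (k + 0.5) / n * delta)
                   * (1 - (k + 0.5) / n) ** s for k in range(n)) / n
    return partial + delta ** (s + 1) / math.factorial(s) * integral

approx = taylor_order_s(0.2, 0.5)
exact = math.exp(0.7)
```

The midpoint-rule remainder reproduces $\exp(0.7)$ up to quadrature error, which is far below the tolerance checked.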
Observe also that, uniformly in $\alpha$, $x$ and $t$ as above,
\begin{align*}
\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]&=\sum_{p=1}^{s+1}\frac{h^pt^{p-1}}{(p-1)!}\sum_{k=1}^KP_k(x)\beta_k^{(p)}(\alpha|I)\\
&=h\big(B^{(1)}(\alpha|x,I)+o(1)\big)+h\left(\sum_{p=2}^{s+1}\frac{h^{p-1}t^{p-1}}{(p-1)!}B^{(p)}(\alpha|x,I)+o(1)\right)=hB^{(1)}(\alpha|x,I)+o(h)
\end{align*}
by Property S, which also gives
$$\max_{p=1,\ldots,s+1}\left(\max_{x\in\mathcal X}\frac{\big|P(x)'b_p^*(\alpha|I)\big|}{h^p}\right)=\max_{p=1,\ldots,s+1}\max_{(\alpha,x)\in[0,1]\times\mathcal X}h^{p-1}\big|B^{(p)}(\alpha|x,I)+o(1)\big|=\max_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)+o(1)\le\overline f$$
if $\overline f$ is large enough and $h$ small enough, so that $b^*(\alpha|I)$ is in $\mathcal B_{I\alpha,h}$ since $B^{(1)}(\cdot|\cdot,\cdot)$ is bounded away from 0 and infinity by Proposition C.1. Suppose now that $\|b-b^*(\alpha|I)\|\le Ch/K^{1/2}=Ch^{1+D_M/2}$. Then
$$\left|\frac{\partial}{\partial t}\big[P(x,t)'b\big]\right|\ge\left|\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]\right|-h\,\|b-b^*(\alpha|I)\|\,\|P(x)\|\ge\left|\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]\right|-O(h),$$
$$\big|P(x)'b_p\big|\le\big|P(x)'b_p^*(\alpha|I)\big|+\|b-b^*(\alpha|I)\|\,\|P(x)\|\le\big|P(x)'b_p^*(\alpha|I)\big|+Ch,\quad p=1,\ldots,s+1,$$
and $\mathcal B\big(b^*(\alpha|I),Ch^{1+D_M/2}\big)\subset\mathcal B_{I\alpha,h}$ when $h$ is small enough, provided $C$ is small enough. Hence (i) holds. (ii) follows from the Implicit Function Theorem and the definition of $\mathcal B_{I\alpha,h}$. The first equality of (iii) is (F.1). For the second, note that $\alpha+ht\ge h$ for all $t$ in $I_{\alpha,h}$.
It holds
$$B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)+\sum_{k=1}^KP_k(x)\left(\beta_k(\alpha+ht|I)-\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\right)$$
with
$$\beta_k(\alpha+ht|I)-\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)=\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\beta_k^{(s+2)}(\alpha+uht|I)(1-u)^{s+1}\,du,$$
recalling, as established in the proof of Proposition C.1-(i) for $\alpha>0$,
$$\beta_k^{(s+2)}(\alpha|I)=\frac1\alpha\Big((I-1)\gamma_k^{(s+1)}(\alpha|I)-(I+s)\beta_k^{(s+1)}(\alpha|I)\Big),\qquad B^{(s+2)}(\alpha|x,I)=\frac1\alpha\Big((I-1)V^{(s+1)}(\alpha|x,I)-(I+s)B^{(s+1)}(\alpha|x,I)\Big).\qquad(\mathrm{F.2})$$
Hence
\begin{align*}
&B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)-\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\\
&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+2)}(\alpha+uht|I)-B^{(s+2)}(\alpha+uht|x,I)\right\}(1-u)^{s+1}\,du\\
&\quad+\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\big\{B^{(s+2)}(\alpha+uht|x,I)-B^{(s+2)}(\alpha|x,I)\big\}(1-u)^{s+1}\,du,
\end{align*}
with, using the expressions of $\beta_k^{(s+2)}(\cdot|\cdot)$ and $B^{(s+2)}(\cdot|\cdot)$ of the proof of Proposition C.1,
$$\max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\right)\right|=h\,o\Big(K^{-\frac{s+1}{D_M}}\Big)=o\big(h^{s+2}\big),$$
\begin{align*}
&\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+2)}(\alpha+uht|I)-B^{(s+2)}(\alpha+uht|x,I)\right\}(1-u)^{s+1}\,du\right|\\
&\le C\max_{(\alpha,x)\in[2h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left\{\frac{\alpha}{\alpha-h}\left|\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I)-B^{(s+1)}(\alpha|x,I)\right|\right\}\\
&\quad+C\max_{(\alpha,x)\in[2h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left\{\frac{\alpha}{\alpha-h}\left|\sum_{k=1}^KP_k(x)\gamma_k^{(s+1)}(\alpha|I)-V^{(s+1)}(\alpha|x,I)\right|\right\}=o(1),
\end{align*}
$$\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\int_0^1\big\{B^{(s+2)}(\alpha+uht|x,I)-B^{(s+2)}(\alpha|x,I)\big\}(1-u)^{s+1}\,du\right|=o(1).$$
Substituting gives
$$\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)-\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\right)\right|=o\big(h^{s+2}\big),$$
$$\max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|\alpha\big(B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)\big)\big|=o\big(h^{s+2}\big),\qquad \max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\right|=o\big(h^{s+2}\big).$$
The third result in (iii) follows from Proposition C.1-(iii). The fourth equality of (iii) follows from
\begin{align*}
o\big(h^{s+1}\big)&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\big|\Psi[\Delta(u|x,b^*(\alpha|I))|x,b^*(\alpha|I)]-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\big|u-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\left|B\left[\alpha+h\frac{G(u|x,I)-\alpha}{h}\Big|x,I\right]-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\right|\\
&\ge Ch\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\left|\frac{G(u|x,I)-\alpha}{h}-\frac{\Phi(u|x,b^*(\alpha|I))-\alpha}{h}\right|
\end{align*}
by Proposition C.1-(i). Consider now (iv). The first bound follows from the Cauchy-Schwarz inequality. This bound implies, for all $u$ in $\Psi[I_{\alpha,h}|x,b_1]\cap\Psi[I_{\alpha,h}|x,b_2]$,
$$\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_1)|x,b_1]\big|=\big|\Psi[\Delta(u|x,b_2)|x,b_1]-u\big|=\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_2)|x,b_2]\big|\le Ch^{-D_M/2}\|b_1-b_2\|.$$
By definition of $\mathcal B_{I\alpha,h}$,
$$\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_1)|x,b_1]\big|\ge Ch\big|\Delta(u|x,b_2)-\Delta(u|x,b_1)\big|=C\big|\Phi(u|x,b_2)-\Phi(u|x,b_1)\big|$$
and substituting shows that the second bound of (iv) holds.
For the third bound in (iv), it holds uniformly in $\alpha$, $x$, $u$, $b_1$ and $b_2$ that

$$\bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg|$$
$$\le \bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_1]\bigg| + \bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg|$$
$$\le \max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial^2\Psi(t\,|\,x,b_1)}{\partial t^2}\bigg|\,\frac{\big|\Phi(u|x,b_1)-\Phi(u|x,b_2)\big|}{h} + \max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial P(x,t)'}{\partial t}(b_1-b_2)\bigg|.$$

But, by definition of $\mathcal B_{\mathcal I_{\alpha,h}}$,

$$\max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial^2\Psi(t\,|\,x,b_1)}{\partial t^2}\bigg| \le Ch\max_{p=2,\ldots,s+1}\bigg|\frac{P(x)'b_{1p}}{h}\bigg| = O(h),$$

so that substituting and the bound for $\Phi(u|x,b_1)-\Phi(u|x,b_2)$ give, uniformly in $\alpha$, $x$, $u$, $b_1$ and $b_2$,

$$\bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg| \le Ch^{-D_M/2}\|b_1-b_2\|,$$

which is the fourth inequality. The expression in (ii) of $\Phi(\cdot)$ and the definition of $\mathcal B_{\mathcal I_{\alpha,h}}$ yield the third inequality. $\Box$

Proof of Lemma B.2. It holds

$$R^{(2)}(b;\alpha,I) = \mathbb E\Bigg[\mathbb I\big[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}\,|\,x_\ell,b),\,I_\ell=I\big]\,\frac{P(x_\ell,\Delta(B_{i\ell}|x_\ell,b))\,P(x_\ell,\Delta(B_{i\ell}|x_\ell,b))'}{\Psi^{(1)}(\Delta(B_{i\ell}|x_\ell,b)\,|\,x_\ell,b)}\,K(\Delta(B_{i\ell}|x_\ell,b))\Bigg]$$
$$= \int_{\mathcal X}\Bigg[\int_{\Psi(\underline{\mathcal I}_{\alpha,h}|x,b)\,\vee\,B(0|x,I)}^{\Psi(\overline{\mathcal I}_{\alpha,h}|x,b)\,\wedge\,B(1|x,I)}\frac{P(x,\Delta(y|x,b))\,P(x,\Delta(y|x,b))'}{\Psi^{(1)}(\Delta(y|x,b)\,|\,x,b)}\,K(\Delta(y|x,b))\,g(y,x,I)\,dy\Bigg]dx.$$
Recall $\Delta[\Psi(t|x,b)\,|\,x,b]=t$ for all $t$ in $\mathcal I_{\alpha,h}$ and let

$$\overline{\mathcal I}_{\alpha,h}(x,I;b) = \overline{\mathcal I}_{\alpha,h}\wedge\Delta[B(1|x,I)\,|\,x,b],\qquad \underline{\mathcal I}_{\alpha,h}(x,I;b) = \underline{\mathcal I}_{\alpha,h}\vee\Delta[B(0|x,I)\,|\,x,b].$$

The change of variable $y=\Psi(t|x,b)$ yields that

$$R^{(2)}(b;\alpha,I) = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}(x,I;b)}^{\overline{\mathcal I}_{\alpha,h}(x,I;b)} P(x,t)P(x,t)'\,K(t)\,g(\Psi(t|x,b),x,I)\,dt\Bigg]dx.$$

The Dominated Convergence Theorem and Proposition C.1-(i), $s\ge 1$, yield that $R^{(2)}(\cdot;\alpha,I)$ is continuously differentiable over $\mathcal B_{\mathcal I_{\alpha,h}}$ with, by the Leibniz integral rule,

$$R^{(3)}(b;\alpha,I)[d] = R_0^{(3)}(b;\alpha,I)[d] + R_1^{(3)}(b;\alpha,I)[d] - R_2^{(3)}(b;\alpha,I)[d],$$

$$R_0^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}(x,I;b)}^{\overline{\mathcal I}_{\alpha,h}(x,I;b)} P(x,t)P(x,t)'\,K(t)\,g^{(1)}(\Psi(t|x,b),x,I)\,[d'P(x,t)]\,dt\Bigg]dx,$$

$$R_1^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X} P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,K\big(\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)\,g\big(\Psi(\overline{\mathcal I}_{\alpha,h}(x,I;b)\,|\,x,b),x,I\big)\bigg[d'\,\frac{\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'}\bigg]dx,$$

$$R_2^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X} P\big(x,\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,K\big(\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)\,g\big(\Psi(\underline{\mathcal I}_{\alpha,h}(x,I;b)\,|\,x,b),x,I\big)\bigg[d'\,\frac{\partial\underline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'}\bigg]dx.$$

Recall that $g(\cdot|\cdot,I)$ is bounded away from 0 and infinity. Hence

$$\big\|R_0^{(3)}(b;\alpha,I)[d]\big\| \le C\max_{x\in\mathcal X}\|P(x)\|\,\|d\| \le Ch^{-D_M/2}\|d\|.$$

The operators $R_i^{(3)}(b;\alpha,I)[d]$, $i=1,2$, can be studied in a similar way, so that only $i=1$ is considered.
Observe

$$\frac{\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'} = \begin{cases} 0 & \text{if } \overline{\mathcal I}_{\alpha,h}\le\Delta[B(1|x,I)\,|\,x,b],\\[6pt] \dfrac{\partial\Delta[B(1|x,I)\,|\,x,b]}{\partial b'} = -\dfrac{P(x,\Delta(B(1|x,I)|x,b))'}{\Psi^{(1)}(\Delta(B(1|x,I)|x,b)\,|\,x,b)} & \text{if } \overline{\mathcal I}_{\alpha,h}>\Delta[B(1|x,I)\,|\,x,b].\end{cases}$$

But, for $h$ small enough,

$$\Delta[B(1|x,I)\,|\,x,b] = \frac{\Phi[B(1|x,I)\,|\,x,b]-\alpha}{h} = \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,\Phi[B(1|x,I)\,|\,x,b]\big\}-\alpha}{h}$$
$$\ge \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,\Phi[B(1|x,I)\,|\,x,b^*(\alpha|I)]-Ch^{-D_M/2}\|b-b^*(\alpha|I)\|\big\}-\alpha}{h}$$
$$\ge \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,G[B(1|x,I)\,|\,x,I]-Ch^{s+1}-Ch\big\}-\alpha}{h} \ge \frac{\min\big\{\alpha+h\min\big(\tfrac{1-\alpha}{h},1\big),\,1-Ch\big\}-\alpha}{h}$$

uniformly in $\alpha$, $x$ and $b$ in $B\big(b^*(\alpha|I),\,Ch^{D_M/2}h\big)$ by Lemma B.1. Hence, if $\alpha\le 1-C'h$ with $C'\ge C+1$,

$$\Delta[B(1|x,I)\,|\,x,b] \ge \frac{\min\{\alpha+h,\,1-Ch\}-\alpha}{h} \ge 1 \ge \overline{\mathcal I}_{\alpha,h},$$

so that $\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)/\partial b'=0$. Hence, since $B\big(b^*(\alpha|I),Ch^{D_M/2}h\big)\subset\mathcal B_{\mathcal I_{\alpha,h}}$ and by definition of $\mathcal B_{\mathcal I_{\alpha,h}}$,

$$\big\|R_1^{(3)}(b;\alpha,I)[d]\big\| \le C\,\mathbb I[\alpha\ge 1-C'h]\,\Bigg\|\int_{\mathcal X} P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,\frac{d'P(x,\Delta(B(1|x,I)|x,b))}{\Psi^{(1)}(\Delta(B(1|x,I)|x,b)\,|\,x,b)}\,dx\Bigg\|$$
$$\le \frac{C}{h}\,\mathbb I[\alpha\ge 1-C'h]\,\max_{x\in\mathcal X}\|P(x)\|\,\|d\| \le \frac{C}{h}\,h^{-D_M/2}\,\|d\|\,\mathbb I[\alpha\ge 1-C'h] \le \frac{C\,h^{-D_M/2}}{\alpha(1-\alpha)+h}\,\|d\|.$$

Substituting in the expression of $R^{(3)}(b;\alpha,I)[d]$ then gives, uniformly in $d$,

$$\max_{\alpha\in[0,1]}\;\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}\big(\alpha(1-\alpha)+h\big)\,\big\|R^{(3)}(b;\alpha,I)[d]\big\| \le Ch^{-D_M/2}\|d\|.$$
The Taylor inequality shows that (i) holds.

For (ii), the expression of $R^{(2)}(b;\alpha,I)$, Assumptions A and R-(i), Proposition C.1-(i) — which imply that the eigenvalues of $\int P(x)P(x)'\,g[B(\alpha|x,I),x,I]\,dx$ stay bounded away from 0 and infinity — Lemma B.1-(iii) and Proposition C.1-(i) give that, uniformly in $\alpha$ and $x$,

$$\overline{\mathcal I}_{\alpha,h}[x,I;b^*(\alpha|I)] = \overline{\mathcal I}_{\alpha,h}\wedge\frac{\Phi[B(1|x,I)\,|\,x,b^*(\alpha|I)]-\alpha}{h} = \overline{\mathcal I}_{\alpha,h}\wedge\frac{1+o(h^{s+1})-\alpha}{h} = \overline{\mathcal I}_{\alpha,h}+o(h^s),$$
$$\underline{\mathcal I}_{\alpha,h}[x,I;b^*(\alpha|I)] = \underline{\mathcal I}_{\alpha,h}+o(h^s),$$

and

$$R^{(2)}(b^*(\alpha|I);\alpha,I) = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\,g\big(\Psi(t|x,b^*(\alpha|I))\,\big|\,x,I\big)\,dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\,g\big[B(\alpha+ht|x,I)+o(h^{s+1})\,\big|\,x,I\big]\,dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\bigg(\frac{1}{B^{(1)}(\alpha+ht|x,I)}+o(h^{s+1})\bigg)dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\bigg(\frac{1}{B^{(1)}(\alpha|x,I)}-ht\,\frac{B^{(2)}(\alpha|x,I)}{(B^{(1)}(\alpha|x,I))^2}+o(h)\bigg)dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Omega_h(\alpha)\otimes\frac{P(x)P(x)'}{B^{(1)}(\alpha|x,I)}\,f(x,I)\,dx - h\int_{\mathcal X}\Omega_{1h}(\alpha)\otimes P(x)P(x)'\,\frac{B^{(2)}(\alpha|x,I)}{(B^{(1)}(\alpha|x,I))^2}\,f(x,I)\,dx + o(h),$$

where the last $o(h)$ term is with respect to the matrix norm. This, together with the fact that the eigenvalues of the matrices $\Omega_h(\alpha)$ and $\int_{\mathcal X}P(x)P(x)'\,dx$ are bounded away from 0 and infinity, and the fact that $B^{(1)}(\alpha|x,I)$ is bounded away from 0 and infinity, shows that (ii) holds. $\Box$

Proof of Lemma C.3.
Write $A_{\alpha,h}^{-1}=D_{\alpha,h}+B_{\alpha,h}$, where $D_{\alpha,h}$ is the diagonal of $A_{\alpha,h}^{-1}$ and $B_{\alpha,h}=A_{\alpha,h}^{-1}-D_{\alpha,h}$. Provided the series converges,

$$A_{\alpha,h} = D_{\alpha,h}^{-1/2}\Bigg\{\sum_{n=0}^{\infty}\Big(-D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}\Big)^n\Bigg\}D_{\alpha,h}^{-1/2}.$$

The entries of $D_{\alpha,h}^{-1/2}$ are bounded in absolute value by $C<\infty$ for all $\alpha$ and $L$. It also gives

$$\Bigg|\frac{\mathbb E\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_1}(x_\ell)\pi_{p_1}(t)\,P_{k_2}(x_\ell)\pi_{p_2}(t)\,K(t)\,dt\Big]}{\mathbb E^{1/2}\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_1}^2(x_\ell)\pi_{p_1}^2(t)K(t)\,dt\Big]\,\mathbb E^{1/2}\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_2}^2(x_\ell)\pi_{p_2}^2(t)K(t)\,dt\Big]}\Bigg| \le \varrho < 1$$

for all $1\le k_1,k_2\le K$ and $0\le p_1,p_2\le s+1$ with $(k_1,p_1)\neq(k_2,p_2)$; that is, all the entries of $D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}$ are bounded by $\varrho$ in absolute value. By Assumption R-(ii), the entries of $D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}$ are bounded by those of $\varrho\,\mathrm{Id}\otimes(T'+T)$, where $T$ is a lower triangular $c(s+2)$-band matrix. Hence the absolute values of the entries of $A_{\alpha,h}$ are bounded by the entries of

$$C\,\mathrm{Id}\otimes\Bigg(\sum_{n=0}^{\infty}\varrho^n\Big(T^{n\prime}+T^n\Big)\Bigg).$$

Since $T$ is a triangular $c$-band nilpotent matrix, it follows that $|A_{\alpha,h}(j_1,j_2)|\le C\rho^{|j_1-j_2|}$, with $0<\varrho\le\rho<1$, for all $\alpha$ and $L$. It follows that

$$\max_L\;\max_{\alpha\in[0,1]}\;\max_{1\le j_1\le(s+2)K}\sum_{j_2=1}^{(s+2)K}|A_{\alpha,h}(j_1,j_2)| \le C\sum_{n\ge 0}\rho^n < \infty,$$

which ends the proof of the Lemma. $\Box$

F.2 Lemmas B.3, B.4 and B.5

The proofs of the lemmas grouped here make use of a deviation inequality from Massart (2007). Consider $n$ independent random variables $Z_\ell$ and, for a known real function $\xi(z,\theta)$, separable with respect to $\theta\in\Theta$, let $Z_\ell(\theta)=\xi(Z_\ell,\theta)$, where $\theta$ is a parameter.
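The geometric off-diagonal decay of the inverse used in the proof of Lemma C.3 above can be illustrated numerically. The following sketch is an illustration only, not part of the argument: the matrix below is an arbitrary well-conditioned symmetric band matrix chosen for the example, not the $A_{\alpha,h}^{-1}$ of the lemma.

```python
import numpy as np

# Illustration of |A(j1, j2)| <= C * rho**|j1 - j2| for the inverse A of a
# well-conditioned band matrix. Matrix, band width and size are arbitrary.
rng = np.random.default_rng(0)
n, c = 60, 2
M = np.zeros((n, n))
for k in range(1, c + 1):
    off = 0.3 * rng.uniform(0.5, 1.0, size=n - k)
    M += np.diag(off, k) + np.diag(off, -k)   # symmetric c-band structure
M += np.diag(np.full(n, 2.0))                 # diagonal dominance: eigenvalues bounded away from 0
A = np.linalg.inv(M)

# Largest entry at each off-diagonal distance, and a fitted decay rate rho.
dists = np.arange(n - 10)
max_by_dist = np.array([np.abs(np.diag(A, int(d))).max() for d in dists])
slope = np.polyfit(dists, np.log(max_by_dist), 1)[0]
rho = np.exp(slope)
print(0.0 < rho < 1.0)
```

With a diagonally dominant band matrix the fitted `rho` falls strictly inside $(0,1)$, matching the $C\rho^{|j_1-j_2|}$ bound.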
Let $\underline\xi(\cdot)\le\overline\xi(\cdot)$ be two functions. A bracket $[\underline\xi,\overline\xi]$ is the set of all functions $\xi(\cdot)$ such that $\underline\xi(z)\le\xi(z)\le\overline\xi(z)$ for all $z$. The next proposition follows from Massart (2007, Theorem 6.8 and Corollary 6.9).

Proposition F.1 Assume that $\sup_{\theta\in\Theta}|Z_\ell(\theta)|\le M_\infty$ and $\sup_{\theta\in\Theta}\mathrm{Var}(Z_\ell(\theta))\le M_2$ for all $\ell$, and that for any $\epsilon>0$ there exist brackets $[\underline\xi_j,\overline\xi_j]\subset[-b,b]$, $j=1,\ldots,\exp(H(\epsilon))$, such that

$$\mathbb E\Big[\big(\overline\xi_j(Z_i)-\underline\xi_j(Z_i)\big)^2\Big]\le\epsilon^2 \quad\text{and}\quad \{\xi(z,\theta),\,\theta\in\Theta\}\subset\bigcup_{j=1}^{\exp(H(\epsilon))}\big[\underline\xi_j,\overline\xi_j\big].$$

Let

$$\mathcal H_L = 54\int_0^{M_2^{1/2}}\sqrt{\min(L,H(\epsilon))}\,d\epsilon + 2\,(M_\infty+M_2)\,\frac{H(M_2)}{L^{1/2}}.$$

Then, for any $t\in\big[0,L^{1/2}M_2/M_\infty\big]$,

$$\mathbb P\Bigg(\sup_{\theta\in\Theta}\Bigg|\sum_{\ell=1}^n\big\{Z_\ell(\theta)-\mathbb E[Z_\ell(\theta)]\big\}\Bigg|\ge L^{1/2}\{\mathcal H_L+t\}\Bigg) \le \exp\big(-t^2\big).$$

Proof of Lemma B.3. Note that $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$ is a $c(s+2)$-band matrix, so that the order of its matrix norm is the same as the order of its largest entry. The generic entry of $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$ can be written as

$$\widehat r(b;\alpha,I) = \frac{1}{Lh^{(D_M+1)/2}}\sum_{\ell=1}^{L}\xi_\ell(b;\alpha)$$

where the $\xi_\ell(b;\alpha)$ are centered iid with

$$\xi_\ell(b;\alpha) = \sum_{i=1}^{I_\ell}\Big\{\mathbb I\big[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}|x_\ell,b),\,I_\ell=I\big]\,\xi_{i\ell}(b) - \mathbb E\big[\mathbb I[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}|x_\ell,b),\,I_\ell=I]\,\xi_{i\ell}(b)\big]\Big\},$$
$$\xi_{i\ell}(b) = h^{D_M/2}h^{1/2}\,\frac{P_{k_1}(x_\ell)P_{k_2}(x_\ell)}{\Psi^{(1)}(\Delta(B_{i\ell}|x_\ell,b)\,|\,x_\ell,b)/h}\,K_{p_1p_2}(\Delta(B_{i\ell}|x_\ell,b)),$$
$$K_{p_1p_2}(\Delta(B_{i\ell}|x_\ell,b)) = \frac{\Delta^{p_1+p_2}(B_{i\ell}|x_\ell,b)}{p_1!\,p_2!}\,K(\Delta(B_{i\ell}|x_\ell,b)).$$
This yields

$$|\xi_\ell(b;\alpha)| \le C\,h^{D_M/2}\max_{x\in\mathcal X}\|P(x)\|^2\,h^{1/2} \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2},$$

for all $\alpha$ in $[0,1]$ and all admissible $b$. For the variance, Lemma B.1-(iii,iv) gives

$$|\Delta(B_{i\ell}|x_\ell,b)| = \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b)-\alpha}{h}\bigg| \le \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b^*(\alpha|I_\ell))-G(B_{i\ell}|x_\ell,I_\ell)}{h}\bigg| + \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b)-\Phi(B_{i\ell}|x_\ell,b^*(\alpha|I_\ell))}{h}\bigg|$$
$$\le \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + o(h^s) + O\bigg(\frac{h^{-D_M/2}\times h^{D_M/2}h}{h}\bigg) = \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + O(1)$$

uniformly. It follows that, $U_{i\ell}=G(B_{i\ell}|x_\ell,I_\ell)$ being a uniform random variable independent of $(x_\ell,I_\ell)$,

$$\mathrm{Var}(\xi_\ell(b;\alpha)) \le C\,I^2\,h^{D_M}\max_{x\in\mathcal X}\|P(x)\|^2\int_{\mathcal X}|P_{k_1}(x)P_{k_2}(x)|\,dx\int\mathbb I_{[-C,C]}\Big(\frac{u-\alpha}{h}\Big)\frac{du}{h}$$
$$\le C\,I^2\,h^{D_M}\max_{x\in\mathcal X}\|P(x)\|^2\bigg(\int_{\mathcal X}P_{k_1}^2(x)\,dx\bigg)^{1/2}\bigg(\int_{\mathcal X}P_{k_2}^2(x)\,dx\bigg)^{1/2} \le M_2$$

with $M_2<\infty$ under Assumption R, uniformly in $b$ and $\alpha$.

Consider now the brackets covering. The key observation is that $\xi_\ell(b;\alpha)$ only depends on a finite-dimensional subvector of $b$, $b^{(k_1,k_2)}$, which groups the entries of $b$ corresponding to those $P_k(\cdot)$ such that $P_kP_{k_1}\not\equiv 0$ or $P_kP_{k_2}\not\equiv 0$, so that the dimension of $b^{(k_1,k_2)}$ is less than $c(s+2)$ under Assumption R-(ii).
Consequently the class to be bracketed is

$$\mathcal F = \Big\{\xi_\ell\big(b^{(k_1,k_2)};\alpha\big);\ \alpha\in[0,1],\ b^{(k_1,k_2)}\in B\big(b^{(k_1,k_2)*}(\alpha|I),\,Ch^{D_M/2}h\big)\Big\}.$$

Since $1/(Lh^{D_M+1})=o(1)$, van de Geer (1999, p.20) and arguing as in Guerre and Sabbah (2012, 2014) imply that $\mathcal F$ can be bracketed with a number of brackets

$$\exp(H_L(\epsilon)) \asymp \Big(\frac{L^C}{\epsilon}\Big)^C,$$

so that

$$\int_0^{M_2^{1/2}}\sqrt{\min(L,H_L(\epsilon))}\,d\epsilon \le M_2^{1/4}\bigg(\int_0^{M_2^{1/2}}H_L(\epsilon)\,d\epsilon\bigg)^{1/2} = O(\log L)^{1/2}$$

and, for the term $\mathcal H_L$ of Proposition F.1,

$$\mathcal H_L = O(\log L)^{1/2} + O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}$$

since $1/(Lh^{D_M+1})$ is bounded. Hence, by Proposition F.1, for $t\le L^{1/2}M_2/M_\infty$, which diverges,

$$\mathbb P\Bigg(\big(Lh^{D_M+1}\big)^{1/2}\sup_{\alpha\in[0,1]}\ \sup_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}|\widehat r(b;\alpha,I)| \ge C\big\{\log^{1/2}L+t\big\}\Bigg) \le \exp\big(-t^2\big)$$

uniformly over all the non-zero entries $\widehat r(b;\alpha,I)$ of the band matrix $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$. This gives, by the Bonferroni inequality,

$$\mathbb P\Bigg(\sup_{\alpha\in[0,1]}\ \sup_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}\big\|\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)\big\| \ge C\,\frac{\log^{1/2}L+t}{(Lh^{D_M+1})^{1/2}}\Bigg) \le CK\exp\big(-t^2\big),$$

which implies the result of the lemma since $t\le L^{1/2}M_2/M_\infty = O\big(Lh^{D_M+1}\big)^{1/2}$ can be set to $t=\tau\log^{1/2}L$ for an arbitrarily large $\tau$, as $\log L/(Lh^{D_M+1})=o(1)$. $\Box$

Proof of Lemma B.4. The proof of Lemma B.4 is similar to the one of Lemma B.3. The generic entry of $\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)$ writes

$$\widehat r(b;\alpha,I) = \frac{1}{L}\sum_{\ell=1}^{L}\xi_\ell(b;\alpha)$$

where the $\xi_\ell(b;\alpha)$ are centered iid with, for $K_p(t)=t^pK(t)/p!$,

$$\xi_\ell(b;\alpha) = \sum_{i=1}^{I_\ell}\big(\mathbb I(I_\ell=I)\,\xi_{i\ell}(b;\alpha)-\mathbb E[\mathbb I(I_\ell=I)\,\xi_{i\ell}(b;\alpha)]\big),$$
$$\xi_{i\ell}(b;\alpha) = P_k(x_\ell)\Bigg\{\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\big\{\mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b)]-(\alpha+ht)\big\}\,K_p(t)\,dt\Bigg\}.$$

This gives

$$\Bigg|\frac{\xi_\ell(b;\alpha)}{(h+\alpha(1-\alpha))^{1/2}}\Bigg| \le Ch^{-1/2}\max_{x\in\mathcal X}\|P(x)\| \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2}.$$

For the computation of the variance, Lemma B.1-(iii,iv) and Proposition C.1-(i) give, uniformly in $\alpha$, $t$ in $\mathcal I_{\alpha,h}$, the admissible $b$ and $x_\ell$, and for the uniform $U_{i\ell}=G(B_{i\ell}|x_\ell,I_\ell)$,

$$\mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b)] = \mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b^*(\alpha|I))+O(h)] = \mathbb I[B(U_{i\ell}|x_\ell,I_\ell)\le B(\alpha+ht|x_\ell,I_\ell)+O(h)]$$
$$= \mathbb I\big[U_{i\ell}\le G[B(\alpha+ht|x_\ell,I_\ell)+O(h)\,|\,x_\ell,I_\ell]\big] = \mathbb I[U_{i\ell}\le\alpha+ht+O(h)].$$
Since $U_{i\ell}$ is independent of $(x_\ell,I_\ell)$,

$$\mathbb E\big[\xi_{i\ell}^2(b;\alpha)\,\big|\,I_\ell\big] \le \mathbb E\Bigg[P_k^2(x_\ell)\int\!\!\int_{\mathcal I_{\alpha,h}^2}\mathbb I[U_{i\ell}\le\alpha+h(t_1\wedge t_2)+O(h)]\,K_p(t_1)K_p(t_2)\,dt_1dt_2\;\Big|\;I_\ell\Bigg]$$
$$\quad - 2\,\mathbb E\Bigg[P_k^2(x_\ell)\int\!\!\int_{\mathcal I_{\alpha,h}^2}\mathbb I[U_{i\ell}\le\alpha+ht_1+O(h)]\,(\alpha+ht_2)\,K_p(t_1)K_p(t_2)\,dt_1dt_2\;\Big|\;I_\ell\Bigg]$$
$$\quad + \mathbb E\big[P_k^2(x_\ell)\,|\,I_\ell\big]\int\!\!\int_{\mathcal I_{\alpha,h}^2}(\alpha+ht_1)(\alpha+ht_2)\,K_p(t_1)K_p(t_2)\,dt_1dt_2$$
$$= \mathbb E\big[P_k^2(x_\ell)\,|\,I_\ell\big]\int\!\!\int_{\mathcal I_{\alpha,h}^2}\big\{\alpha+O(h)-\alpha^2\big\}\,K_p(t_1)K_p(t_2)\,dt_1dt_2 \le C\big(h+\alpha(1-\alpha)\big)$$

uniformly in $\alpha$ and $b$. Hence, uniformly in $\alpha$ and $b$,

$$\mathrm{Var}\Bigg(\frac{\xi_\ell(b;\alpha)}{(h+\alpha(1-\alpha))^{1/2}}\Bigg) \le M_2 \quad\text{with } M_2<\infty.$$

The bracketing part of the proof is similar to the one of Lemma B.3 and gives

$$\mathcal H_L = O(\log L)^{1/2} + O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}.$$

Arguing with Proposition F.1 then shows that the order of the largest entry of $\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)$ is $O_{\mathbb P}(\log L/L)^{1/2}$, which gives uniformly

$$\big\|\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)\big\| = K^{1/2}\,O_{\mathbb P}\Big(\frac{\log L}{L}\Big)^{1/2} = O_{\mathbb P}\Big(\frac{\log L}{Lh^{D_M}}\Big)^{1/2}$$

and the Lemma is proved. $\Box$

Proof of Lemma B.5. For (i), define

$$P_0 = \mathbb E\big[\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\big],\quad P_1 = \mathbb E\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg],\quad P_2 = \mathbb E\bigg[\mathbb I(I_\ell=I)\,\frac{B^{(2)}(\alpha|x_\ell,I_\ell)\,P(x_\ell)P(x_\ell)'}{(B^{(1)}(\alpha|x_\ell,I_\ell))^2}\bigg],$$

and abbreviate $\Omega_h(\alpha)$, $\Omega_{1h}(\alpha)$ into $\Omega_0$, $\Omega_1$.
It holds

$$\mathrm{Var}(\widehat e(\alpha|I)) = \big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\,\mathrm{Var}\big[\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\big]\,\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}$$

with, by Lemma B.2,

$$\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1} = \big[\Omega_0\otimes P_1 - h\,\Omega_1\otimes P_2 + o(h)\big]^{-1} = \big[\mathrm{Id}-h\big(\Omega_0^{-1}\Omega_1\big)\otimes\big(P_1^{-1}P_2\big)+o(h)\big]^{-1}\,\Omega_0^{-1}\otimes P_1^{-1}$$
$$= \Omega_0^{-1}\otimes P_1^{-1} + h\big(\Omega_0^{-1}\Omega_1\Omega_0^{-1}\big)\otimes\big(P_1^{-1}P_2P_1^{-1}\big) + o(h)$$

uniformly in $\alpha$, where the remainder term $o(h)$ is with respect to the matrix norm. For $\mathrm{Var}\big[\widehat R^{(1)}(b(\alpha|I);\alpha,I)\big]$, define

$$\omega_0 = \int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\pi(t)K(t)\,dt,\qquad \omega_1 = \int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}t\,\pi(t)K(t)\,dt,\qquad \Pi_m = \int\!\!\int_{\mathcal I_{\alpha,h}^2}\min(t_1,t_2)\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2.$$

Then $(LI)\,\mathrm{Var}\big[\widehat R^{(1)}(b(\alpha|I);\alpha,I)\big]$ admits the expansion, with uniform remainder terms,

$$\mathbb E\Bigg[\int\!\!\int_{\mathcal I_{\alpha,h}^2}\Big\{G\big[B(\alpha+ht_1|x_\ell,I_\ell)\wedge B(\alpha+ht_2|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big]$$
$$\quad - G\big[B(\alpha+ht_1|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big](\alpha+ht_2) - G\big[B(\alpha+ht_2|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big](\alpha+ht_1)$$
$$\quad + (\alpha+ht_1)(\alpha+ht_2)\Big\}\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2\otimes\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\Bigg]$$
$$= \int\!\!\int_{\mathcal I_{\alpha,h}^2}\big\{\alpha+h(t_1\wedge t_2)-\alpha^2-h\alpha(t_1+t_2)\big\}\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2\otimes P_0 + o(h)$$
$$= \alpha(1-\alpha)\,\omega_0\omega_0'\otimes P_0 + h\,\big\{\Pi_m-\alpha\big(\omega_0\omega_1'+\omega_1\omega_0'\big)\big\}\otimes P_0 + o(h).$$
Hence an elementary expansion gives, uniformly in $\alpha\in[0,1]$, $\mathrm{Var}(\widehat e(\alpha|I)) = V_e/(LI)+o(h)$ with

$$V_e = \alpha(1-\alpha)\big[\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big] + h\,\alpha(1-\alpha)\big[\Omega_0^{-1}\Omega_1\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_2P_1^{-1}P_0P_1^{-1}\big]$$
$$\quad + h\,\alpha(1-\alpha)\big[\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\Omega_1\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}P_2P_1^{-1}\big] + h\,\big[\Omega_0^{-1}\big(\Pi_m-\alpha(\omega_0\omega_1'+\omega_1\omega_0')\big)\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big].$$

Observe now that $\Omega_0^{-1}\omega_0=s_0$, $\Omega_0^{-1}\omega_1=s_1$ and $\Omega_0^{-1}\Omega_1\Omega_0^{-1}\omega_0 = \Omega_0^{-1}\Omega_1s_0 = \Omega_0^{-1}\omega_1 = s_1$. This gives

$$V_e = \alpha(1-\alpha)\,[s_0s_0']\otimes\big[P_1^{-1}P_0P_1^{-1}\big] + h\,\alpha(1-\alpha)\,[s_1s_0']\otimes\big[P_1^{-1}P_2P_1^{-1}P_0P_1^{-1}\big]$$
$$\quad + h\,\alpha(1-\alpha)\,[s_0s_1']\otimes\big[P_1^{-1}P_0P_1^{-1}P_2P_1^{-1}\big] + h\,\big[\Omega_0^{-1}\Pi_m\Omega_0^{-1}-\alpha(s_0s_1'+s_1s_0')\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big].$$

Since $P_1^{-1}$, $P_0$, $P_2$, $\Omega_0^{-1}$ and $\Omega_1$ are bounded away from infinity uniformly in $\alpha$, it follows that $\max_{\alpha\in[0,1]}\|\mathrm{Var}(\widehat e(\alpha|I))\| = O(1/L)$ and then

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\mathrm{Var}\big(P(x)'\widehat e(\alpha|I)\big) = O\bigg(\frac{\max_{x\in\mathcal X}\|P(x)\|^2}{L}\bigg) = O\Big(\frac{1}{Lh^{D_M}}\Big).$$

For $\mathrm{Var}(\widehat e_1(\alpha|I)/h)$, observe that $\widehat e_1(\alpha|I)=S_1\widehat e(\alpha|I)$ with $S_1=s_1'\otimes\mathrm{Id}$. It holds

$$S_1V_eS_1' = h\,\big(s_1'\Omega_0^{-1}\Pi_m\Omega_0^{-1}s_1\big)\,\big(P_1^{-1}P_0P_1^{-1}\big)$$
$$= h\,v_h(\alpha)\,\mathbb E^{-1}\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg]\,\mathbb E\big[\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\big]\,\mathbb E^{-1}\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg]$$

with $v_h(\alpha)=s_1'\Omega_0^{-1}\Pi_m\Omega_0^{-1}s_1$.
This gives the result for $\mathrm{Var}(\widehat e_1(\alpha|I)/h)$ and $\mathrm{Var}\big(P(x)'\widehat e_1(\alpha|I)/h\big)$.

For (ii), we just show that $\max_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'\widehat e_1(\alpha|I)/h\big| = O_{\mathbb P}\big((\log L/(Lh^{D_M+1}))^{1/2}\big)$. Since $\max_{x\in\mathcal X}\|P(x)\| = O\big(h^{-D_M/2}\big)$ and

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h}\bigg| \le \bigg(\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)}\bigg|\bigg)\times h^{-1/2}\Big(1+\max_{x\in\mathcal X}\|P(x)\|\Big),$$

it is sufficient to show

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)}\bigg| = O_{\mathbb P}\Bigg(\Big(\frac{\log L}{L}\Big)^{1/2}\Bigg). \tag{F.3}$$

Write

$$\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)} = \frac{1}{L}\sum_{\ell=1}^{L}\xi_\ell(\alpha,x)$$

with

$$\xi_\ell(\alpha,x) = \sum_{i=1}^{I_\ell}\big(\mathbb I(I_\ell=I)\,\xi_{i\ell}(\alpha,x)-\mathbb E[\mathbb I(I_\ell=I)\,\xi_{i\ell}(\alpha,x)]\big),$$
$$\xi_{i\ell}(\alpha,x) = \frac{P(x)'S_1\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}P(x_\ell)}{h^{1/2}\,(1+\|P(x)\|)}\Bigg\{\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\big\{\mathbb I\big[B_{i\ell}\le\Psi\big(t\,|\,x_\ell,b(\alpha|I)\big)\big]-(\alpha+ht)\big\}\,K(t)\,dt\Bigg\}.$$

This gives, for all $(\alpha,x)\in[0,1]\times\mathcal X$,

$$|\xi_\ell(\alpha,x)| \le Ch^{-1/2}\max_{x\in\mathcal X}\|P(x)\| \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2},$$
$$\mathrm{Var}(\xi_\ell(\alpha,x)) \le \frac{C\,(\max_{x\in\mathcal X}\|P(x)\|)^2}{(1+\max_{x\in\mathcal X}\|P(x)\|)^2} \le M_2 \quad\text{with } M_2\asymp 1.$$
The Implicit Function Theorem, the first-order condition $R^{(1)}\big(b(\alpha|I);\alpha,I\big)=0$, and Lemma B.2 with (C.3) and $s\ge D_M/2$ imply that $\alpha\mapsto b(\alpha|I)$ is $\|\cdot\|$-Lipschitz with a Lipschitz constant of order $L^C$, as are $\alpha\mapsto\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}$ and $x\mapsto P(x)/(1+\|P(x)\|)$. Lemma B.1-(iii), $1/(Lh^{D_M+1})=O(1)$, van de Geer (1999, p.20) and arguing as in Guerre and Sabbah (2012, 2014) imply that $\{\xi_\ell(\alpha,x);\,(\alpha,x)\in[0,1]\times\mathcal X\}$ can be bracketed with a number of brackets

$$\exp(H_L(\epsilon)) \asymp \Big(\frac{L^C}{\epsilon}\Big)^C.$$

Arguing as in the proof of Lemma B.3 gives, for the term $\mathcal H_L$ of Proposition F.1,

$$\mathcal H_L = O(\log L)^{1/2}+O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}$$

and then (F.3) holds. $\Box$

F.3 Lemma E.1

The proof of Lemma E.1 is based on the following lemma.

Lemma F.2 Let $k_1(\cdot)$ and $k_2(\cdot)$ be two functions over $[0,1]$ with primitives $K_1(\cdot)$ and $K_2(\cdot)$. Then, if $A$ is a random variable with a uniform distribution over $[0,1]$ and for any choice of the primitives $K_1(\cdot)$ and $K_2(\cdot)$,

$$\int_0^1\!\!\int_0^1 k_1(a_1)k_2(a_2)\,[a_1\wedge a_2-a_1a_2]\,da_1da_2 = -\int_0^1 k_1(a_1)\bigg\{\int_0^{a_1}\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg\}da_1.$$

Proof of Lemma F.2. Observe that

$$\int_0^1\!\!\int_0^1 k_1(a_1)k_2(a_2)\,[a_1\wedge a_2-a_1a_2]\,da_1da_2$$
$$= \mathbb E\bigg[\int_0^1 k_1(a_1)\,\mathbb I[A\le a_1]\,da_1\int_0^1 k_2(a_2)\,\mathbb I[A\le a_2]\,da_2\bigg] - \mathbb E\bigg[\int_0^1 k_1(a_1)\,\mathbb I[A\le a_1]\,da_1\bigg]\,\mathbb E\bigg[\int_0^1 k_2(a_2)\,\mathbb I[A\le a_2]\,da_2\bigg]$$
$$= \mathrm{Cov}\bigg(\int_A^1k_1(a_1)\,da_1,\;\int_A^1k_2(a_2)\,da_2\bigg) = \mathrm{Cov}\big(K_1(A),K_2(A)\big),$$

which does not depend upon the choice of the primitives.
Integrating by parts now gives

$$\mathrm{Cov}\big(K_1(A),K_2(A)\big) = \int_0^1K_1(a)\big(K_2(a)-\mathbb E[K_2(A)]\big)\,da = \int_0^1K_1(a)\,d\bigg[\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg]$$
$$= -\int_0^1k_1(a)\bigg\{\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg\}da$$

since $\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)da_2$ vanishes for $a=0$ and $a=1$. $\Box$

Proof of Lemma E.1. It is assumed that $h<1/2$. Define

$$k_h(a;\alpha) = \frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big) \quad\text{and}\quad K_h(a;\alpha) = \int_{-\infty}^{a}k_h(a_1;\alpha)\,da_1.$$

It follows from Lemma F.2 that

$$C_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\bigg\{\int_0^1\!\!\int_0^1 k_h(a_1;\alpha_2)\,k_h(a_2;\alpha_1)'\,[a_1\wedge a_2-a_1a_2]\,da_1da_2\bigg\}d\alpha_1d\alpha_2$$
$$= -\int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a_1;\alpha_2)\bigg\{\int_0^{a_1}\big(K_h(a_2;\alpha_1)-\mathbb E[K_h(A;\alpha_1)]\big)'\,da_2\bigg\}da_1\,d\alpha_1d\alpha_2 = -\mathcal I_h+\mathcal J_h$$

with

$$\mathcal I_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a_1;\alpha_2)\bigg\{\int_0^{a_1}K_h(a_2;\alpha_1)'\,da_2\bigg\}da_1\,d\alpha_1d\alpha_2$$
$$= \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1\frac1h\,\pi\Big(\frac{a_1-\alpha_2}{h}\Big)K\Big(\frac{a_1-\alpha_2}{h}\Big)\Bigg\{\int_0^{a_1}\bigg[\int_{-\infty}^{a_2}\frac1h\,\pi\Big(\frac{a_3-\alpha_1}{h}\Big)'K\Big(\frac{a_3-\alpha_1}{h}\Big)da_3\bigg]da_2\Bigg\}da_1\,d\alpha_1d\alpha_2,$$

$$\mathcal J_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a;\alpha_2)\,a\,\mathbb E[K_h(A;\alpha_1)]'\,da\,d\alpha_1d\alpha_2$$
$$= \bigg[\int_0^1 g(\alpha)\bigg(\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg)d\alpha\bigg]\times\Bigg[\int_0^1f(\alpha)\int_0^1\bigg(\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg)da_1\,d\alpha\Bigg].$$

We now study $\mathcal J_h$.
The change of variable $a=\alpha+ht$ and the definition of $\Omega_h(\alpha)$ give

$$\int_0^1 g(\alpha)\bigg[\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg]d\alpha = \int_0^1g(\alpha)\Bigg[\int_{-\alpha/h}^{(1-\alpha)/h}(\alpha+ht)\,\pi(t)K(t)\,dt\Bigg]d\alpha$$
$$= \int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)s_0\,d\alpha + h\int_0^1g(\alpha)\,\Omega_h(\alpha)s_1\,d\alpha.$$

For the second item in $\mathcal J_h$, integrating by parts gives

$$\int_0^1\bigg[\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg]da_1 = \int_{-\infty}^{1}\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)da - \int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\,a\,da.$$

This gives

$$\int_0^1f(\alpha)\int_0^1\bigg[\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg]da_1\,d\alpha$$
$$= \int_0^1f(\alpha)\bigg[\bigg(\int_{-\infty}^{0}+\int_0^1\bigg)\bigg\{\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\bigg\}da\bigg]d\alpha - \int_0^1f(\alpha)\bigg[\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg]d\alpha$$
$$= \int_0^1f(\alpha)\Bigg[\int_{-\infty}^{-\alpha/h}+\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)'K(t)\,dt\Bigg]d\alpha - \int_0^1f(\alpha)\Bigg[\int_{-\alpha/h}^{(1-\alpha)/h}(\alpha+ht)\,\pi(t)'K(t)\,dt\Bigg]d\alpha$$
$$= \int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)s_0\,d\alpha - h\int_0^1f(\alpha)\,\Omega_h(\alpha)s_1\,d\alpha + \int_0^1f(\alpha)\Bigg[\int_{-\infty}^{-\alpha/h}\pi(t)K(t)\,dt\Bigg]d\alpha.$$
Hence

$$\mathcal J_h = \bigg[\int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_0s_0'\bigg[\int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg] + h\bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_1s_0'\bigg[\int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]$$
$$\quad - h\bigg[\int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_0s_1'\bigg[\int_0^1f(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg] - h^2\bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_1s_1'\bigg[\int_0^1f(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]$$
$$\quad + \bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,[\alpha s_0+hs_1]\,d\alpha\bigg]\Bigg[\int_0^1f(\alpha)\bigg(\int_{-\infty}^{-\alpha/h}\pi(t)'K(t)\,dt\bigg)d\alpha\Bigg].$$

Consider now $\mathcal I_h$, which satisfies, with the change of variable $a_1=\alpha_2+ht$,

$$\mathcal I_h = \int_0^1\!\!\int_0^1f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}\pi(t)K(t)\Bigg\{\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{a_2}\frac1h\,\pi\Big(\frac{a_3-\alpha_1}{h}\Big)'K\Big(\frac{a_3-\alpha_1}{h}\Big)da_3\bigg]da_2\Bigg\}dt\,d\alpha_1d\alpha_2$$
$$= \int_0^1\!\!\int_0^1f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}\pi(t)K(t)\Bigg\{\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da_2\Bigg\}dt\,d\alpha_1d\alpha_2.$$
Now

$$\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da = \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + \int_{\alpha_2}^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]d[a-\alpha_2-ht]$$
$$= \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + ht\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1 - \int_{\alpha_2}^{\alpha_2+ht}(a-\alpha_2-ht)\,\frac1h\,\pi\Big(\frac{a-\alpha_1}{h}\Big)'K\Big(\frac{a-\alpha_1}{h}\Big)da$$
$$= \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + ht\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1 - h^2t^2\int_0^1(1-u)\,\frac1h\,\pi\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)du.$$

It follows that $\mathcal I_h = \mathcal I_1 + h\,\mathcal I_2 - h\,\mathcal I_3$ with

$$\mathcal I_1 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\,\Omega_h(\alpha_2)s_0\Bigg\{\int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da\Bigg\}d\alpha_1d\alpha_2,$$

$$\mathcal I_2 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\,\Omega_h(\alpha_2)s_1\Bigg\{\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg\}d\alpha_1d\alpha_2,$$

$$\mathcal I_3 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}t^2\,\pi(t)K(t)\Bigg\{\int_0^1(1-u)\,\frac1h\,\pi\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)du\Bigg\}dt\,d\alpha_1d\alpha_2.$$

Consider first $\mathcal I_1$. Integrating by parts gives

$$\mathcal I_1 = \int_0^1 f(\alpha_1)\Bigg\{\int_0^1\Bigg(\int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)K(t_1)\,dt_1\bigg]da\Bigg)\,d\bigg[-\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg]'\Bigg\}'d\alpha_1$$
$$= \int_0^1 f(\alpha_1)\Bigg\{\int_0^1\bigg(\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg)\Bigg(\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg)d\alpha_2\Bigg\}d\alpha_1.$$
Another integration by parts gives

$$\int_0^1\bigg(\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg)\Bigg(\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg)d\alpha_2$$
$$= \bigg[\int_0^1\!\!\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\,d\alpha_2\bigg]\int_{-\infty}^{-\alpha_1/h}\pi(t_1)'K(t_1)\,dt_1 + \int_0^1\bigg[\int_{\alpha_2}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\frac1h\,\pi\Big(\frac{\alpha_2-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2-\alpha_1}{h}\Big)d\alpha_2.$$

It holds, for the second item,

$$\int_0^1\bigg[\int_{\alpha_2}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\frac1h\,\pi\Big(\frac{\alpha_2-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2-\alpha_1}{h}\Big)d\alpha_2 = \int_{-\alpha_1/h}^{(1-\alpha_1)/h}\bigg[\int_{\alpha_1+ht}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\pi(t)'K(t)\,dt$$
$$= \bigg[\int_{\alpha_1}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]s_0'\Omega_h(\alpha_1) - h\bigg[\int_{\alpha_1}^1g(a)\,\Omega_h(a)s_0\,da\bigg]s_1'\Omega_h(\alpha_1) + h^2\,g(\alpha_1)\,\Omega_h(\alpha_1)s_0\,s_2'\Omega_h(\alpha_1) + o\big(h^2\big),$$

where the $o(h^2)$ term is uniform over $[h,1-h]$ and is $O(h^2)$ uniformly over $[0,h]$ and $[1-h,1]$ under the conditions on $f(\cdot)$ and $g(\cdot)$, in which case it contributes an $o(h^2)$ when integrated out over $\alpha_1$. Note that

$$\int_0^1\bigg[\int_{\alpha}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]s_0'\Omega_h(\alpha)f(\alpha)\,d\alpha = \int_0^1\bigg[\int_{\alpha}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\,d\bigg[\int_0^{\alpha}s_0'\Omega_h(a)f(a)\,da\bigg]$$
$$= \int_0^1\bigg[\int_\alpha^1g(a)\,\Omega_h(a)\,da\bigg]s_0s_0'\bigg[\int_0^\alpha\Omega_h(a)f(a)\,da\bigg]d\alpha.$$
Since $\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha$, it follows that
\begin{align*}
I_1&=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
-h\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)f(\alpha)\,d\alpha
\\
&\quad+h^2\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o\!\left(h^2\right)
+\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
Consider now $I_2$. Integrating by parts gives
\begin{align*}
I_2&=\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi'(t)K(t)\,dt\right]
d\left[-\int_{\alpha_2}^1 g(a)\Omega_h(a)s\,da\right]\right\}d\alpha_1
\\
&=\left[\int_0^1 g(a)\Omega_h(a)\,da\right]s\int_0^1 f(\alpha_1)\left[\int_{-\infty}^{-\frac{\alpha_1}{h}}\pi'(t)K(t)\,dt\right]d\alpha_1
+\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{\alpha_2}^1 g(a)\Omega_h(a)\,da\right]s\,
\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2-\alpha_1}{h}\right)d\alpha_2\right\}d\alpha_1
\end{align*}
with
\begin{align*}
&\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{\alpha_2}^1 g(a)\Omega_h(a)\,da\right]s\,
\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2-\alpha_1}{h}\right)d\alpha_2\right\}d\alpha_1
=\int_0^1 f(\alpha_1)\left\{\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}\left[\int_{\alpha_1+ht}^1 g(a)\Omega_h(a)\,da\right]s\,\pi'(t)K(t)\,dt\right\}d\alpha_1
\\
&\quad=\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\Omega_h(\alpha)\,d\alpha
-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(h).
\end{align*}
Hence
\begin{align*}
I_2=\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\Omega_h(\alpha)\,d\alpha
-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(h)
+\left[\int_0^1 g(a)\Omega_h(a)\,da\right]s
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
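The interchange-of-integration identity invoked at the start of the display above, $\int_0^1[\int_\alpha^1 g(a)\,da]\,d\alpha=\int_0^1\alpha\,g(\alpha)\,d\alpha$, can be verified numerically. A minimal sketch, with a hypothetical scalar $g$ standing in for the vector-valued $g(\cdot)\Omega_h(\cdot)$ and a composite midpoint quadrature rule:

```python
import math

def integrate(fun, lo, hi, n=1000):
    # composite midpoint rule for the integral of fun over [lo, hi]
    step = (hi - lo) / n
    return sum(fun(lo + (i + 0.5) * step) for i in range(n)) * step

# hypothetical smooth integrand standing in for g(.)Omega_h(.)
g = lambda a: math.exp(a) * (1.0 + a)

# left side: integral over alpha in [0,1] of the tail integral of g over [alpha, 1]
lhs = integrate(lambda alpha: integrate(g, alpha, 1.0), 0.0, 1.0)
# right side: integral over [0,1] of alpha * g(alpha)
rhs = integrate(lambda alpha: alpha * g(alpha), 0.0, 1.0)

assert abs(lhs - rhs) < 1e-5  # equal up to quadrature error, by Fubini
```

The same Fubini argument delivers the companion identity $\int_0^1[\int_0^\alpha f(a)\,da]\,d\alpha=\int_0^1 f(\alpha)(1-\alpha)\,d\alpha$ used later in the proof.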
For $I_3$, the change of variable $\alpha_2=\alpha_1+h\tau$, Assumption H and the conditions on $f(\cdot)$ and $g(\cdot)$ give
\begin{align*}
I_3&=\int_0^1 f(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}g(\alpha_1+h\tau)
\int_{-\frac{\alpha_1}{h}-\tau}^{\frac{1-\alpha_1}{h}-\tau}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\,\pi'(tu+\tau)K(tu+\tau)\,du\right\}dt\,d\tau\,d\alpha_1
\\
&=\int_0^1 f(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}g(\alpha_1)
\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\,\pi'(tu+\tau)K(tu+\tau)\,du\right\}dt\,d\tau\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_0^1\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2+htu-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2+htu-\alpha_1}{h}\right)d\alpha_2\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_{-\frac{\alpha_1}{h}+tu}^{\frac{1-\alpha_1}{h}+tu}\pi'(\tau)K(\tau)\,d\tau\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}\pi'(\tau)K(\tau)\,d\tau\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\frac{1}{2}\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(1).
\end{align*}
Now $I_h=I_1+hI_2-hI_3$ and the expressions of $I_1$, $I_2$ and $I_3$ give
\begin{align*}
I_h&=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
+h\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]\left[s\,s'-s\,s'\right]\Omega_h(\alpha)\,d\alpha
\\
&\quad-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha
+h^2\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)\left[s\,s'+s\,s'\right]\Omega_h(\alpha)\,d\alpha+o\!\left(h^2\right)
\\
&\quad+\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\left[\alpha s+hs\right]d\alpha\right]
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
We now prepare to compute the expansion of $J_h-I_h$.
Observe that $\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha$, so that
\begin{align*}
&\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha\right]
-\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\alpha\,f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
+\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
\\
&\qquad-\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^1\Omega_h(a)f(a)\,da\right]d\alpha
+\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_{\alpha}^1\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_{\alpha}^1\Omega_h(a)f(a)\,da\right]d\alpha
-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\alpha\,f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
\\
&\quad=\operatorname{Cov}\left(\int_A^1 g(a)\Omega_h(a)s\,da,\ \int_A^1 f(a)\Omega_h(a)s\,da\right).
\end{align*}
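The covariance rewriting above can be checked numerically in scalar form, assuming (as the unit-interval $d\alpha$ integrals suggest) that $A$ is uniform on $[0,1]$. A sketch with hypothetical scalar $f$ and $g$ standing in for $f(\cdot)\Omega_h(\cdot)s$ and $g(\cdot)\Omega_h(\cdot)s$, comparing the integral expression with a Monte Carlo estimate of $\operatorname{Cov}(G(A),F(A))$ where $G(\alpha)=\int_\alpha^1 g$ and $F(\alpha)=\int_\alpha^1 f$:

```python
import math
import random

# hypothetical scalar stand-ins for g(.)Omega_h(.)s and f(.)Omega_h(.)s
g = lambda a: math.exp(a)
f = lambda a: 1.0 + a * a

def tail_integral(fun, lo, n=60):
    # midpoint rule for the integral of fun over [lo, 1]
    step = (1.0 - lo) / n
    return sum(fun(lo + (i + 0.5) * step) for i in range(n)) * step

def unit_integral(fun, n=2000):
    # midpoint rule for the integral of fun over [0, 1]
    step = 1.0 / n
    return sum(fun((i + 0.5) * step) for i in range(n)) * step

G = lambda a: tail_integral(g, a)
F = lambda a: tail_integral(f, a)

# integral form: E[G(A)F(A)] - E[G(A)] E[F(A)], using E[G(A)] = int alpha*g(alpha)
lhs = unit_integral(lambda a: G(a) * F(a)) \
    - unit_integral(lambda a: a * g(a)) * unit_integral(lambda a: a * f(a))

# Monte Carlo covariance with A ~ Uniform[0, 1]
random.seed(0)
draws = [random.random() for _ in range(50000)]
x = [G(a) for a in draws]
y = [F(a) for a in draws]
mx, my = sum(x) / len(x), sum(y) / len(y)
mc_cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

assert abs(lhs - mc_cov) < 0.02  # agree up to Monte Carlo error
```

The tolerance is loose only because of simulation noise; the identity itself is exact.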
Similarly, $\int_0^1\left[\int_0^{\alpha}f(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha$ gives, after an integration by parts,
\begin{align*}
&\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha\right]
-\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)\,d\alpha
\\
&\quad=\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\left(\int_0^{\alpha}\Omega_h(a)f(a)\,da\right)d\alpha\right]
-\int_0^1 g(\alpha)\Omega_h(\alpha)s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=-\operatorname{Cov}\left(g(A)\Omega_h(A)s,\ \left[\int_0^A f(a)\Omega_h(a)\,da\right]s\right)
=\operatorname{Cov}\left(g(A)\Omega_h(A)s,\ \left[\int_A^1 f(a)\Omega_h(a)\,da\right]s\right),
\end{align*}
\begin{align*}
\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)\,d\alpha
-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
=\operatorname{Cov}\left(\left[\int_A^1 g(a)\Omega_h(a)\,da\right]s,\ f(A)\Omega_h(A)s\right),
\end{align*}
and, for any conformable $u$ and $v$,
\begin{align*}
\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)\left[u\,v'\right]\Omega_h(\alpha)\,d\alpha
-\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]\left[u\,v'\right]\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
=\operatorname{Cov}\left(g(A)\Omega_h(A)u,\ f(A)\Omega_h(A)v\right).
\end{align*}
Collecting these items gives the expansion of $C_h$ stated in the Lemma. $\Box$