Quantile regression methods for first-price auctions
Nathalie Gimenes, Department of Economics, PUC-Rio, Brazil. [email protected]
Emmanuel Guerre, School of Economics, University of Kent, United Kingdom. [email protected]
September 2019

Abstract

The paper proposes a sieve quantile regression approach for first-price auctions with symmetric risk-neutral bidders under the independent private value paradigm. It is first shown that a private value quantile regression model generates a quantile regression for the bids. The private value quantile regression can be easily estimated from the bid quantile regression and its derivative with respect to the quantile level. A new local polynomial technique is proposed to estimate the latter over the whole quantile level interval. Plug-in estimation of functionals is also considered, as needed for the expected revenue or the case of CRRA risk-averse bidders, which is amenable to our framework. A quantile regression analysis of USFS timber auctions is found more appropriate than the homogenized bid methodology and illustrates the contribution of each explanatory variable to the private value distribution.
JEL: C14, L70
Keywords: First-price auction; independent private value; dimension reduction; quantile regression; local polynomial estimation; sieve estimation; boundary correction.
A previous version of this paper has been circulated under the title "Quantile regression methods for first-price auction: a signal approach". The authors acknowledge useful discussions and comments from Xiaohong Chen, Valentina Corradi, Yanqin Fan, Phil Haile, Xavier d'Haultfoeuille, Vadim Marmer, Isabelle Perrigne, Martin Pesendorfer and Quang Vuong, and the audience of many conferences and seminars. Nathalie Gimenes also thanks Ying Fan and Ginger Jin for encouragements. All remaining errors are our responsibility. Both authors would like to thank the School of Economics and Finance, Queen Mary University of London, for generous funding.
Introduction
Various quantile approaches have been recently proposed for the Econometrics of Auctions. Haile, Hong and Shum (2003, HHS hereafter) have used monotonicity of the bidding strategy to build a quantile test of the independent private value null hypothesis. Milgrom (2001, Theorem 4.7) reformulates the identification relation of Guerre, Perrigne and Vuong (2000, GPV afterwards) using quantile functions. The risk aversion identification result of Guerre, Perrigne and Vuong (2009, GPV09 hereafter) heavily relies on the bid quantile function in first-price auctions. Zincenko (2018) develops a corresponding nonparametric estimation method. Liu and Luo (2017) and Liu and Vuong (2018) have respectively developed quantile-based tests for the null of exogenous participation and for monotonicity of the bidding strategy. Other authors have considered quantile-based estimation of the private value distribution. Gimenes (2017) has implemented a quantile regression approach for ascending auctions. See also Menzel and Morganti (2013), who proposed an order statistics approach. For first-price auctions, Marmer and Shneyerov (2012) have proposed a quantile-based estimator of the private value probability density function (pdf), which is an alternative to the two-step GPV method. Guerre and Sabbah (2012) have noted that the private value quantile function can be estimated using a one-step procedure from the estimation of the bid quantile function and its first derivative. Enache and Florens (2015) have developed an inverse problem approach. The two-step method of GPV focuses on the private value pdf, which is quite hard to estimate. Estimating the pdf is useful for descriptive purposes and for the computation of important moments, such as the expected revenue. But the latter can also be achieved using quantile functions, as moments are easily computed by integrating them.
As noted in Milgrom (2001), in the independent private value setting, the value function of a bidder observing a uniform signal is nothing else than the private value quantile function, so that a quantile approach is especially relevant in auction settings. Nonparametric density estimation is notoriously affected by the curse of dimensionality, and parsimonious models addressing this issue for densities are less rich than for quantile functions, where both single-index modelling, as already used in an auction framework by Marmer, Shneyerov and Xu (2013b), and additive specifications are available. A simpler specification is the homogenized bid model of HHS, which postulates a regression model with iid residuals for the private values. As shown in our empirical application and in Gimenes (2017) for ascending auctions, it may fail to capture nonlinear dependence of the private values on the auction covariates. In addition, it still involves a GPV step that may not perform well in small samples.

The present paper develops a quantile regression methodology for first-price auctions, which includes parsimonious but flexible models suitable for moderate samples. The parameter of interest is the private value conditional quantile function given some auction-specific covariates, which can be estimated faster than the conditional pdf. A key aspect of our approach is that the bid conditional quantile function is a linear functional of the private value one. It follows that the popular quantile regression model of Koenker and Bassett (1978) can play a central role in our methodology, as it enjoys an important stability property: a private value quantile regression model generates a bid quantile regression model. The private value quantile function is a linear combination of the bid quantile function and its first derivative with respect to the quantile level, a simple identification result which is the basis of our estimation procedure.
This also applies to the linear sieve quantile regression of Belloni, Chernozhukov, Chetverikov and Fernández-Val (2017). Following Horowitz and Lee (2005), the latter can be tailored to additive quantile models, which can be better estimated than saturated sieve models. Higher-order covariate interactions can also be considered, giving a class of flexible models which can be tailored to each specific dataset.

An important challenge is raised by the estimation of the bid quantile derivative with respect to the quantile level α. This was considered by Guerre and Sabbah (2012) and the references therein. We propose instead a new local polynomial approach which applies to quantile levels and aims to jointly estimate the bid quantile function and its derivatives. An unexpected feature is that it performs well for extreme quantile levels, producing consistent estimators for α = 0 and 1. The latter upper quantile levels are particularly important for auctions, as private values of winners are expected to be in the top of the distribution. Recent work focusing on boundary issues includes Aryal, Gabrielli and Vuong (2016) in a semiparametric framework and Hickman and Hubbard (2015). Our theoretical results include a Central Limit Theorem for the private value quantile estimator which holds for extreme quantiles, and a bias-variance decomposition for its Integrated Mean Squared Error (IMSE). The latter allows in particular for bandwidth choice based on a pilot quantile model.

A second family of parameters of interest consists of integral functionals of the bid quantile function and its quantile level first derivative. A first example is the parameter of Constant Relative Risk Aversion (CRRA) utility functions. CRRA risk aversion indeed preserves the quantile linearity features which are important for our quantile regression methodology.
The risk aversion parameter can be estimated using bidder variations as in GPV09, but also by combining first-price and ascending auctions as in Lu and Perrigne (2008). A second example is the expected revenue, which falls into this family as it is a functional of the private value quantile function (Gimenes, 2017); see also Li, Perrigne and Vuong (2003). A third example covers the conditional private value cumulative distribution function and pdf. Indeed, the rearrangement formula of Chernozhukov, Fernández-Val and Galichon (2010) expresses the cdf as an integral functional of the private value quantile function. Differentiating a smooth version of this functional proposed in Dette and Volgushev (2008) gives a pdf estimator which fits in our framework and differs from Marmer and Shneyerov (2012). These distribution estimators are useful for dimension reduction purposes.

Our theoretical results are illustrated with a simulation experiment and an application to USFS first-price auctions. A preliminary quantile regression analysis of the bid quantile function suggests that the homogenized bid technique should not be applied here because the quantile regression slopes are not constant. The private value quantile regression slope functions reveal the impact of the covariate, and how strongly bidders in the top of the distribution can differ from the bottom. CRRA risk-aversion estimation using the approaches of GPV09 and Lu and Perrigne (2008) is also considered.

The rest of the paper is organized as follows. The next section introduces our quantile identification approach and the functionals of interest. Section 3 introduces our local polynomial estimation framework. Section 4 groups our main theoretical results for the private value quantile function and its functionals. Our simulation results are in Section 5 and the application can be found in Section 6. Section 7 summarizes the estimation strategy and the empirical application findings, and describes some possible extensions.
All the proofs are gathered in six Online Appendices.

A single and indivisible object with some characteristic $x \in \mathbb{R}^D$ is auctioned to $I \geq 2$ potential buyers; $I$ and $x$ are known to the bidders and the econometrician. Bids are sealed, so that a bidder does not know the others' bids when forming his own bid. The object is sold to the highest bidder, who pays his bid $B_i$ to the seller. Under the symmetric IPV paradigm, each potential bidder is assumed to have a private value $V_i$, $i = 1, \ldots, I$, for the auctioned object. A buyer knows his private value but not the private values of the other bidders, but the joint distribution of the $V_i$ is common knowledge. The private values are independently and identically drawn from a distribution given $(x, I)$ with a compactly supported cdf $F(\cdot|x, I)$, or equivalently with conditional quantile function
$$V(\alpha|x, I) = F^{-1}(\alpha|x, I), \quad \alpha \in [0, 1].$$
The private value quantile function is the first parameter of interest of the present paper, to be estimated from bids $B_i$ from the symmetric Bayesian Nash equilibrium. Section 2.4 below considers a second set of parameters of interest derived from $V(\cdot|\cdot,\cdot)$, such as the cdf $F(\cdot|\cdot,\cdot)$ or the associated pdf $f(\cdot|\cdot,\cdot)$.

It is well known that the bidder $i$ private value rank $A_i = F(V_i|x, I)$ has a uniform distribution over $[0, 1]$ and is independent of $x$ and $I$. It also follows from the IPV paradigm that the private value ranks $A_i$, $i = 1, \ldots, I$, are independent. The dependence between the private value $V_i$ and the auction covariates $x$ and $I$ is therefore fully captured by the non-separable quantile representation
$$V_i = V(A_i|x, I), \quad A_i \overset{iid}{\sim} U[0, 1] \perp (x, I).$$
Following Milgrom and Weber (1982) or Milgrom (2001), $V(\cdot|x, I)$ can also be viewed as a valuation function, the private value rank $A_i$ being the associated signal. In what follows, $G(\cdot|x, I)$ and $g(\cdot|x, I)$ stand for the bid conditional cdf and pdf, respectively.

Maskin and Riley (1984) have shown that Bayesian Nash equilibrium bids $B_i = \sigma(V_i; x, I)$ of symmetric risk-averse or risk-neutral bidders must strictly and continuously increase with the private values under the IPV paradigm. It follows that $B_i = B(A_i|x, I)$, where $B(\cdot|x, I) = \sigma(F^{-1}(\cdot|x, I); x, I)$ can be viewed as a bidding strategy depending upon the rank $A_i$. If $F(\cdot|x, I)$ is also strictly increasing, so is $B(\cdot|x, I)$, and since $A_i$ is uniform it holds that
$$G(b|x, I) = P\left[B(A_i|x, I) \leq b \,|\, x, I\right] = P\left[A_i \leq B^{-1}(b|x, I) \,|\, x, I\right] = B^{-1}(b|x, I),$$
showing that the bidding strategy $B(\cdot|x, I)$ is also the bid quantile function.

A standard best-response argument will show how to identify the private value quantile function $V(\cdot|x, I)$ from $B(\cdot|x, I)$. Suppose bidder $i$'s signal $A_i$ is equal to $\alpha$, but that her bid is a suboptimal $B(a|x, I)$, all other bidders bidding $B(A_j|x, I)$. Then the probability that bidder $i$ wins the auction is
$$P\left[B(a|x, I) > \max_{1 \leq j \neq i \leq I} B(A_j|x, I) \,\Big|\, A_i = \alpha, x, I\right] = P\left[a > \max_{1 \leq j \neq i \leq I} A_j \,\Big|\, A_i = \alpha, x, I\right] = a^{I-1} \quad (2.1)$$
because the $A_j$'s are independent $U[0, 1]$, independent of $x$ and $I$.
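The last equality in (2.1) can be checked by simulation. A minimal sketch, assuming uniform ranks; the helper `win_probability` is illustrative and not part of the paper's procedure.

```python
import numpy as np

# With I - 1 independent U[0, 1] opponent ranks, the probability that a fixed
# level a exceeds all of them is a**(I - 1), independently of x and I.
def win_probability(a, I, n_sims=200_000, seed=0):
    rng = np.random.default_rng(seed)
    opponents = rng.uniform(size=(n_sims, I - 1))   # ranks A_j of the I - 1 rivals
    return float(np.mean(a > opponents.max(axis=1)))

I, a = 4, 0.7
print(win_probability(a, I), a ** (I - 1))  # both close to 0.343
```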
It follows that the expected revenue of such a bid is, for a risk-neutral bidder, $(V(\alpha|x, I) - B(a|x, I))\, a^{I-1}$. If $B(\cdot|x, I)$ is a best-response bidding strategy, the optimal bid of a bidder with signal $\alpha$ is $B(\alpha|x, I)$, that is,
$$\alpha = \arg\max_a \left\{ (V(\alpha|x, I) - B(a|x, I))\, a^{I-1} \right\}.$$
As $B(\cdot|x, I)$ is continuously differentiable, it follows that
$$\frac{\partial}{\partial a} \left\{ (V(\alpha|x, I) - B(a|x, I))\, a^{I-1} \right\} \Big|_{a=\alpha} = 0 \quad (2.2)$$
or equivalently $\frac{d}{d\alpha}\left[\alpha^{I-1} B(\alpha|x, I)\right] = (I-1)\, \alpha^{I-2}\, V(\alpha|x, I)$. Solving with the initial condition $B(0|x, I) = V(0|x, I)$ and rearranging the equation above gives Proposition 1, which is the cornerstone of our estimation method. From now on, $B^{(1)}(\alpha|x, I) = \frac{d}{d\alpha} B(\alpha|x, I)$.

Proposition 1
Consider a given $(x, I)$, $I \geq 2$, for which $\alpha \in [0, 1] \mapsto V(\alpha|x, I)$ is continuously differentiable with a derivative $V^{(1)}(\cdot|x, I) > 0$. Suppose the bids are drawn from the symmetric differentiable Bayesian Nash equilibrium. Then,

i. The conditional equilibrium quantile function $B(\cdot|x, I)$ of the $I$ iid optimal bids $B_i$ satisfies
$$B(\alpha|x, I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, V(a|x, I)\, da. \quad (2.3)$$

ii. The bid quantile function $B(\alpha|x, I)$ is continuously differentiable over $[0, 1]$ and it holds that
$$V(\alpha|x, I) = B(\alpha|x, I) + \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}. \quad (2.4)$$

A key feature is the linearity of the private value to bid quantile function mapping (2.3), which implies that a private value quantile linear model is mapped into a similar bid linear model, as detailed below for the well-known quantile regression. Proposition 1-(ii) shows that the private value quantile function is identified from the bid quantile function and its derivative, as noted in Guerre and Sabbah (2012). It is a quantile version of the identification strategy of GPV, based on the computation of the private value from the bid:
$$V_i = B_i + \frac{1}{I-1} \frac{G(B_i|x, I)}{g(B_i|x, I)}.$$
Versions of (2.4) with $B^{(1)}(\alpha|x, I)$ changed into $1/g(B(\alpha|x, I)|x, I)$ can be found in Milgrom (2001, Theorem 4.7), Liu and Luo (2014), Enache and Florens (2015), Liu and Vuong (2016) and Luo and Wan (2016) and, under risk aversion, in GPV09 and Campo, Guerre, Perrigne and Vuong (2011). As developed in Section 2.4 below, Proposition 1 can be extended to the case of symmetric risk-averse bidders with a CRRA utility function.

Private value quantile regression.
The linearity of (2.3) with respect to the private value quantile function has remained unnoticed with very few exceptions, although it has important model stability implications useful for practical implementation. Consider for instance a private value quantile function given by the quantile regression specification
$$V(\alpha|x, I) = \gamma_0(\alpha|I) + x'\gamma_1(\alpha|I) = [1, x']\, \gamma(\alpha|I). \quad (2.5)$$
Proposition 1-(i) implies that the conditional bid quantile function satisfies
$$B(\alpha|x, I) = [1, x']\, \beta(\alpha|I) \quad \text{with} \quad \beta(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} t^{I-2}\, \gamma(t|I)\, dt, \quad (2.6)$$
showing that $B(\alpha|x, I)$ belongs to the quantile regression specification. It follows from (2.4) that
$$\gamma(\alpha|I) = \beta(\alpha|I) + \frac{\alpha \beta^{(1)}(\alpha|I)}{I-1}, \quad (2.7)$$

[Footnote: This can be recovered from (2.4) by taking $\alpha = A_i$, since $V_i = V(A_i|x, I)$ and $B_i = B(A_i|x, I)$, implying that $A_i = G(B_i|x, I)$ and $B^{(1)}(A_i|x, I) = 1/g(B(A_i|x, I)|x, I) = 1/g(B_i|x, I)$.]
so that $\gamma(\alpha|I)$ can easily be estimated from estimators of $\beta(\alpha|I)$ and $\beta^{(1)}(\alpha|I)$. It then follows that the quantile regression specification is stable, i.e. a quantile regression specification for the private values is equivalent to a quantile regression specification for the bids. Hence testing the correct specification of a bid quantile regression model is equivalent to testing the correct specification of a private value quantile specification. The expressions (2.6) and (2.7) show that significance testing can be done through the bid quantile regression, as $\gamma_j(\cdot|I) = 0$ is equivalent to $\beta_j(\cdot|I) = 0$, or more generally $e'\gamma(\cdot|I) = c$ is equivalent to $e'\beta(\cdot|I) = c$ for any conformable $e$ and $c$.

Bid homogenization and quantile regression.
HHS have noted that a translation of the private values results in a similar translation of the bids, an invariance property that they use in their bid homogenization technique. The latter can be interpreted as the use of a regression model for the private values, $V_i = \gamma_0 + x'\gamma_1 + v_i$, with an error term $v_i$ independent of $x$, as also proposed by Rezende (2008). This amounts to assuming that the slope function $\gamma_1(\cdot|I)$ in (2.5) does not depend upon the quantile level. The regression model of HHS and Rezende (2008) is indeed equivalent to the quantile regression specification
$$V(\alpha|x) = \gamma_0 + x'\gamma_1 + v(\alpha),$$
where $v(\alpha)$ is the quantile function of $v_i$. Since $\frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, da = 1$, it follows that the associated bid quantile function is, by (2.3),
$$B(\alpha|x, I) = \gamma_0 + x'\gamma_1 + b(\alpha|I), \quad \text{where} \quad b(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, v(a)\, da.$$
This gives the bid regression model
$$B_i = \beta_0(I) + x'\gamma_1 + b_i, \quad \beta_0(I) = \gamma_0 + E\left[b(A_i|I)\right],$$
where the regression error term $b_i = b(A_i|I) - E[b(A_i|I)]$ is centered and independent of $x$. Following these authors, the coefficient $\gamma_1$ can be estimated by regressing the bids on $[1, x']$, and the distribution of $v_i$ can be estimated by applying the GPV two-step method to the homogenized bids, which are the residuals $B_i - x'\widehat{\gamma}_1$. However, this approach requires independence between the regression error term $v_i$ and the covariate $x$, an assumption which may be too restrictive in practice, as found by Gimenes (2017) and in the application below.
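The coefficient mapping (2.6)-(2.7), of which the homogenized-bid model is the constant-slope special case, can be checked numerically. A minimal sketch, assuming an illustrative scalar slope $\gamma(a) = 1 + a^2$ and $I = 3$ bidders; with these choices $\beta(\alpha) = 1 + \alpha^2/2$ in closed form, and $\gamma$ is recovered as $\beta + \alpha\beta^{(1)}/(I-1)$.

```python
import numpy as np

# Numerical check of the mapping (2.6)-(2.7) with a scalar illustrative slope.
I = 3
gamma = lambda a: 1.0 + a ** 2

def trapezoid(y, x):
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2)

def beta(alpha, n=20_000):
    a = np.linspace(1e-9, alpha, n)                   # quadrature grid on (0, alpha]
    return (I - 1) * trapezoid(a ** (I - 2) * gamma(a), a) / alpha ** (I - 1)

alpha, eps = 0.5, 1e-4
beta1 = (beta(alpha + eps) - beta(alpha - eps)) / (2 * eps)   # d beta / d alpha
recovered = beta(alpha) + alpha * beta1 / (I - 1)             # relation (2.7)
print(beta(alpha), recovered, gamma(alpha))  # approximately 1.125, 1.25, 1.25
```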
When $\gamma_1(\cdot)$ is not constant, regressing $B(\alpha|x, I)$ on $[1, x']$ gives $B_i = \beta_0(I) + x'\beta_1(I) + b(A_i|x, I)$ with a slope coefficient satisfying
$$\beta_1(I) = \int_0^1 \left( \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} a^{I-2}\, \gamma_1(a)\, da \right) d\alpha = \int_0^1 \gamma_1(\alpha)\, d\alpha - \int_0^1 \left( \int_0^{\alpha} \left(\frac{a}{\alpha}\right)^{I-1} \gamma_1^{(1)}(a)\, da \right) d\alpha$$
and a residual term $b(A_i|x, I) = v(A_i) + x'\left( \frac{I-1}{A_i^{I-1}} \int_0^{A_i} a^{I-2}\, \gamma_1(a)\, da - \beta_1(I) \right)$, which now depends upon $x$, so that the homogenized bid approach does not apply. Using variation of $I$ can be useful to detect such a situation, because observing variation of $\beta_1(I)$ implies that $\gamma_1(\cdot)$ is not constant. In particular, if the entries of $\gamma_1^{(1)}(\cdot)$ are nonnegative, the entries of $\beta_1(I)$ must increase with $I$. Similar features hold for centered bids $B_i - E[B_i|I]$ when the homogenized bid regression is replaced by a nonparametric regression: the regression function $E[B_i - E[B_i|I] \,|\, x, I]$ should not depend upon $I$ if $V_i = m(x_i) + v_i$, as for the single-index regression specification considered in Paarsch and Hong (2006).

Flexible interactive specifications.
The private value quantile regression model (2.5) assumes linearity of the private value quantile function with respect to the covariate $x$. This may be too strong, and it can be relaxed using a nonparametric additive quantile specification, as considered in Horowitz and Lee (2005). Recall that $x = (x_1, \ldots, x_D)$ and consider the additive quantile function
$$V(\alpha|x, I) = \sum_{j=1}^{D} V_j(\alpha; x_j, I), \quad (2.8)$$
where each function $V_j(\alpha; x_j, I)$ is specific to the entry $x_j$. Since such quantile specifications are obtained by summing univariate functions, the effective dimension involved in the nonparametric part of this model is 1, because it can be estimated at the same rate as a nonparametric model with a unique covariate, as shown in Horowitz and Lee (2005). This parsimonious model can be generalized following Andrews and Whang (1990) to allow for more covariate interactions. This leads to the additive interactive quantile specification with $D_M$ interactions
$$V(\alpha|x, I) = \sum_{\delta=1}^{D_M} \sum_{1 \leq j_1 < \cdots < j_\delta \leq D} V_{j_1, \ldots, j_\delta}(\alpha; x_{j_1}, \ldots, x_{j_\delta}, I). \quad (2.9)$$

The interactive quantile specification (2.9) can be estimated using a sieve expansion, as in Horowitz and Lee (2005) or Andrews and Whang (1990). Consider a sieve $\{P_k(x), 1 \leq k \leq K\}$, a family of functions $P_k(\cdot) = P_{kK}(\cdot)$ allowing for at most $D_M$ interactions, and suppose that there are some sieve coefficients $\gamma_k(\cdot|I) = \gamma_{kK}(\cdot|I)$ such that for all $\alpha$
$$V(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \gamma_k(\alpha|I)\, P_k(x). \quad (2.10)$$
The expression (2.10) can be viewed as a sieve extension of the quantile regression, a sieve quantile regression. It follows from Proposition 1-(i,ii) that, provided the limit in (2.10) holds uniformly with respect to $\alpha$,
$$B(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \beta_k(\alpha|I)\, P_k(x), \quad \beta_k(\alpha|I) = \frac{I-1}{\alpha^{I-1}} \int_0^{\alpha} t^{I-2}\, \gamma_k(t|I)\, dt, \quad (2.11)$$
$$V(\alpha|x, I) = \lim_{K \to \infty} \sum_{k=1}^{K} \left( \beta_k(\alpha|I) + \frac{\alpha \beta_k^{(1)}(\alpha|I)}{I-1} \right) P_k(x). \quad (2.12)$$
Hence estimating the private value sieve quantile regression can proceed from estimating the coefficients of the bid sieve quantile regression in (2.11) and their first derivatives.

Many auction parameters of interest can be written using the private value quantile function, or equivalently the bid quantile function and its quantile derivative by (2.4). We focus here on the conditional and unconditional integral functionals
$$\theta(x) = \int_0^1 F\left[\alpha, x, B(\alpha|x, I), B^{(1)}(\alpha|x, I);\ I \in \mathcal{I}\right] d\alpha, \quad \theta = \int_{\mathcal{X}} \theta(x)\, dx, \quad (2.13)$$
where $F(\alpha, x, b_I, b_I^{(1)};\ I \in \mathcal{I})$ is a real-valued continuous function. Three illustrative examples are as follows.

Example 1: CRRA risk aversion. For symmetric risk-averse bidders with a concave utility function $U(\cdot)$, the best-response condition (2.2) becomes
$$\frac{\partial}{\partial a} \left\{ U\left(V(\alpha|x, I) - B(a|x, I)\right) a^{I-1} \right\} \Big|_{a=\alpha} = 0,$$
which gives $V(\alpha|x, I) = B(\alpha|x, I) + \lambda^{-1}\left( \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1} \right)$, where $\lambda(\cdot) = U(\cdot)/U'(\cdot)$. For risk-averse bidders with a CRRA utility function $U(t) = t^{\theta}$, arguing as for Proposition 1 shows
$$V(\alpha|x, I) = B(\alpha|x, I) + \theta\, \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}, \quad (2.14)$$
$$B(\alpha|x, I) = \frac{(I-1)/\theta}{\alpha^{(I-1)/\theta}} \int_0^{\alpha} t^{(I-1)/\theta - 1}\, V(t|x, I)\, dt.$$
These two formulas show that the stability implications of Proposition 1 for linear private value and bid quantile functions are preserved under CRRA. Assuming as in GPV09 that the number of bidders is exogenous, i.e. $V(\alpha|x, I) = V(\alpha|x)$ for all $I$, gives, for any pair $I_1 \neq I_2$,
$$\theta = \frac{\theta_n}{\theta_d} = \frac{\int_{\mathcal{X}} \left[ \int_0^1 \left( B(\alpha|x, I_2) - B(\alpha|x, I_1) \right) \left( \frac{\alpha B^{(1)}(\alpha|x, I_1)}{I_1 - 1} - \frac{\alpha B^{(1)}(\alpha|x, I_2)}{I_2 - 1} \right) d\alpha \right] dx}{\int_{\mathcal{X}} \left[ \int_0^1 \left( \frac{\alpha B^{(1)}(\alpha|x, I_1)}{I_1 - 1} - \frac{\alpha B^{(1)}(\alpha|x, I_2)}{I_2 - 1} \right)^2 d\alpha \right] dx}, \quad (2.15)$$
a formula which shows that the CRRA risk aversion parameter can be easily identified from first-price auctions.
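A plug-in evaluation of the identification formula (2.15) can be sketched without covariates, assuming exogenous participation. All choices below are illustrative: $V(\alpha) = \alpha$, a true $\theta = 0.5$, and bid quantile functions computed by quadrature from the CRRA analogue of (2.3); the ratio then recovers $\theta$.

```python
import numpy as np

# Sketch of (2.15) without covariates: V(alpha) = alpha, theta = 0.5.
theta_true = 0.5
V = lambda a: a

def B(alpha, I):
    c = (I - 1) / theta_true                       # exponent (I - 1) / theta
    t = np.linspace(1e-9, alpha, 4_000)
    y = t ** (c - 1) * V(t)
    integral = float(np.sum((y[:-1] + y[1:]) * np.diff(t)) / 2)
    return c * integral / alpha ** c

def B1(alpha, I, eps=1e-5):                        # quantile-level derivative
    return (B(alpha + eps, I) - B(alpha - eps, I)) / (2 * eps)

I1, I2 = 2, 3
grid = np.linspace(0.05, 0.95, 91)
d = np.array([a * B1(a, I1) / (I1 - 1) - a * B1(a, I2) / (I2 - 1) for a in grid])
num = np.mean(np.array([B(a, I2) - B(a, I1) for a in grid]) * d)
den = np.mean(d ** 2)
print(num / den)  # approximately 0.5
```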
Following Lu and Perrigne (2008), the risk-aversion parameter $\theta$ can also be identified by combining ascending and first-price auction data. As seen from Gimenes (2017), the private value quantile function $V_{asc}(\alpha|x, I)$ can be easily estimated from ascending auctions. Equating $V_{asc}(\alpha|x, I)$ to $V(\alpha|x, I)$ in (2.14) gives that $\theta$ satisfies
$$\theta = \frac{\int_{\mathcal{X}} \left[ \int_0^1 \left( V_{asc}(\alpha|x, I) - B(\alpha|x, I) \right) \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1}\, d\alpha \right] dx}{\int_{\mathcal{X}} \left[ \int_0^1 \left( \frac{\alpha B^{(1)}(\alpha|x, I)}{I-1} \right)^2 d\alpha \right] dx}. \quad (2.16)$$

Example 2: Expected revenue. Suppose that the seller decides to reject bids lower than a reserve price $R$ and let $\alpha_R = \alpha_R(x, I)$ be the associated screening level, i.e. $\alpha_R = F(R|x, I)$. For CRRA bidders, the first-price auction seller's expected revenue is
$$ER_{\theta}(\alpha_R|x, I) = \frac{\theta\, I\, V(\alpha_R|x, I)}{(I-1)(\theta-1) + \theta}\, \alpha_R^{\frac{I-1}{\theta}} \left( 1 - \alpha_R^{\,I - \frac{I-1}{\theta}} \right) + \frac{I(I-1)}{(I-1)(\theta-1) + \theta} \int_{\alpha_R}^1 t^{\frac{I-1}{\theta} - 1} \left( 1 - t^{\,I - \frac{I-1}{\theta}} \right) V(t|x, I)\, dt. \quad (2.17)$$
This expression includes an integral term
$$\theta_1(x; \alpha_R) = \int_{\alpha_R}^1 t^{\frac{I-1}{\theta} - 1} \left( 1 - t^{\,I - \frac{I-1}{\theta}} \right) V(t|x, I)\, dt,$$
which can be estimated by plugging in a risk aversion estimator $\widehat{\theta}$ and an estimator $\widehat{V}(\alpha|x, I)$ of the private value quantile function, or estimators of the bid quantile function and its derivative by (2.4).

Example 3: Private value distribution. Chernozhukov et al. (2010) have used the rearrangement formula to invert a monotonic function.
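Returning to Example 2, the expected revenue formula (2.17) can be evaluated by quadrature. A sketch with illustrative choices ($V(\alpha) = \alpha$, no reserve, function names are assumptions): under risk neutrality $\theta = 1$ and $\alpha_R = 0$, the formula reduces to the expected second-highest of $I$ uniform values, $(I-1)/(I+1)$, a revenue-equivalence sanity check.

```python
import numpy as np

# Quadrature evaluation of the expected revenue (2.17); illustrative sketch.
def expected_revenue(theta, alpha_R, V, I, n=20_000):
    d = (I - 1) * (theta - 1) + theta               # common denominator in (2.17)
    c = (I - 1) / theta
    t = np.linspace(max(alpha_R, 1e-9), 1.0, n)
    y = t ** (c - 1) * (1 - t ** (I - c)) * V(t)
    integral = float(np.sum((y[:-1] + y[1:]) * np.diff(t)) / 2)
    first = theta * I * V(alpha_R) * alpha_R ** c * (1 - alpha_R ** (I - c)) / d
    return first + I * (I - 1) * integral / d

I = 4
print(expected_revenue(1.0, 0.0, lambda a: a, I))  # approximately (I-1)/(I+1) = 0.6
```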
In our case, the conditional private value cdf satisfies
$$F(v|x, I) = E\left[ \mathbb{I}\left[ V(A|x, I) \leq v \right] \,\big|\, x, I \right] = \int_0^1 \mathbb{I}\left[ V(\alpha|x, I) \leq v \right] d\alpha, \quad A \sim U[0, 1].$$
Dette and Volgushev (2008) have considered a smoothed version $I_{\eta}(\cdot)$ of the indicator function,
$$F_{\eta}(v|x, I) = \int_0^1 I_{\eta}\left[ v - V(\alpha|x, I) \right] d\alpha,$$
where $I_{\eta}(t) = \int_{-\infty}^{t/\eta} k(u)\, du$, $k(\cdot)$ being a kernel function and $\eta$ a bandwidth parameter. Differentiating $F_{\eta}(v|x, I)$ gives
$$f_{\eta}(v|x, I) = \frac{1}{\eta} \int_0^1 k\left( \frac{v - V(\alpha|x, I)}{\eta} \right) d\alpha,$$
which converges to the private value pdf when $\eta$ goes to 0. Note that $F_{\eta}(v|x, I)$ and $f_{\eta}(v|x, I)$ can be estimated by plugging in an estimator $\widehat{V}(\alpha|x, I)$ of $V(\alpha|x, I)$. The resulting cdf and pdf estimators are expected to inherit the dimension reduction property of this procedure, as the private value estimator $\widehat{V}(\alpha|x, I)$ proposed in the next section is consistent over the whole $[0, 1]$ quantile interval.

[Footnote: It is assumed for the sake of brevity that the seller's value for the good is 0. The expected revenue formula for the general case follows from Gimenes (2017). Under risk neutrality, integrating by parts gives
$$\int_{\alpha_R}^1 B^{(1)}(\alpha|x, I)\, \alpha^{I-1} (1 - \alpha)\, d\alpha = -B(\alpha_R|x, I)\, \alpha_R^{I-1} (1 - \alpha_R) - \int_{\alpha_R}^1 B(\alpha|x, I)\, \alpha^{I-2} \left( (I-1) - I\alpha \right) d\alpha,$$
so that estimation of $\theta_1(x; \alpha_R)$ can also be done using only a bid quantile estimator.]

Proposition 1 suggests basing the estimation of the private value quantile function on estimates of $B(\alpha|x, I)$ and of its derivative $B^{(1)}(\alpha|x, I)$ with respect to $\alpha$. While there is an important literature on the estimation of a conditional quantile function, estimating the first derivative of a quantile function has received much less attention. The augmented methodology applies a local polynomial expansion with respect to $\alpha$ for joint estimation of $B(\alpha|x, I)$ and $B^{(1)}(\alpha|x, I)$. Sieve methods can be used for the covariate.
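The rearrangement and smoothing formulas of Example 3 can be illustrated numerically. Assumptions in this sketch: $V(\alpha) = \alpha^2$, whose cdf on $[0, 1]$ is $F(v) = \sqrt{v}$, and a uniform kernel $k$ on $[-1, 1]$, so that $I_\eta$ has a simple closed form.

```python
import numpy as np

# Rearrangement of a quantile function into a cdf, plus its smoothed version.
V = lambda a: a ** 2
alphas = np.linspace(0.0, 1.0, 200_001)

def F(v):                       # F(v) = integral of 1{V(alpha) <= v} d alpha
    return float(np.mean(V(alphas) <= v))

def F_smooth(v, eta=0.01):      # I_eta: integrated uniform kernel on [-1, 1]
    u = (v - V(alphas)) / eta
    i_eta = np.clip((u + 1) / 2, 0.0, 1.0)
    return float(np.mean(i_eta))

print(F(0.25), F_smooth(0.25))  # both approximately sqrt(0.25) = 0.5
```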
To ensure comparability with the literature, we assume that the private value quantile function $V(\alpha|x, I)$ has $s+1$ continuous derivatives with respect to $\alpha$. As seen from (2.3), this implies that the bid quantile function $B(\alpha|x, I)$ has $s+2$ continuous derivatives with respect to $\alpha > 0$. This justifies the order $s+1$ for the local polynomial estimator considered here.

The no covariate case. Consider $L$ iid first-price auctions $(I_{\ell}, x_{\ell}, B_{i\ell}, i = 1, \ldots, I_{\ell})$. To introduce our estimation strategy, assume first that $V(\alpha|x, I) = V(\alpha|I)$ and $B(\alpha|x, I) = B(\alpha|I)$. Let $\rho_{\alpha}(\cdot)$ be the check function, $\rho_{\alpha}(q) = q\left( \alpha - \mathbb{I}(q \leq 0) \right)$, $\mathbb{I}(\cdot)$ being the indicator function with $\mathbb{I}(q \leq 0) = 1$ for $q \leq 0$. Then
$$B(\alpha|I) = \arg\min_q E\left[ \mathbb{I}(I_{\ell} = I)\, \rho_{\alpha}(B_{i\ell} - q) \right], \quad \alpha \in (0, 1).$$
Estimating the derivative $B^{(1)}(\alpha|I)$ can be done by introducing local variation of the quantile level in the vicinity of $\alpha$. Let $K(\cdot) \geq 0$ be a kernel function supported on $[-1, 1]$ and let $h = h_L$ be a positive bandwidth parameter going to 0 with the sample size. Then it follows that
$$\left\{ B(a|I),\ a \in [\alpha - h, \alpha + h] \cap [0, 1] \right\} = \arg\min_{q(\cdot)} \int E\left[ \mathbb{I}(I_{\ell} = I)\, \rho_a\left( B_{i\ell} - q(a) \right) \right] \frac{1}{h} K\left( \frac{a - \alpha}{h} \right) da, \quad (3.1)$$
where the minimization is performed over the set of functions $q(a)$ which are continuous on $[\alpha - h, \alpha + h] \cap [0, 1]$. A polynomial approximation of $B(a|I)$ over $[\alpha - h, \alpha + h]$ is given by the Taylor expansion
$$B(a|I) = B(\alpha|I) + B^{(1)}(\alpha|I)(a - \alpha) + \cdots + B^{(s+1)}(\alpha|I)\, \frac{(a - \alpha)^{s+1}}{(s+1)!} + O\left( h^{s+2} \right).$$
Let $b = (\beta_0, \ldots, \beta_{s+1})'$ be the generic coefficients of such a polynomial function and
$$\pi(a) = \left[ 1, a, \frac{a^2}{2}, \ldots, \frac{a^{s+1}}{(s+1)!} \right]'.$$
The objective function is
$$\widehat{R}(b; \alpha, I) = \frac{1}{LI} \sum_{\ell=1}^{L} \mathbb{I}(I_{\ell} = I) \sum_{i=1}^{I_{\ell}} \int \rho_a\left( B_{i\ell} - \pi(a - \alpha)' b \right) \frac{1}{h} K\left( \frac{a - \alpha}{h} \right) da = \frac{1}{LI} \sum_{\ell=1}^{L} \mathbb{I}(I_{\ell} = I) \sum_{i=1}^{I_{\ell}} \int_{\left(-\frac{\alpha}{h}\right) \vee (-1)}^{\left(\frac{1-\alpha}{h}\right) \wedge 1} \rho_{\alpha + ht}\left( B_{i\ell} - \pi(ht)' b \right) K(t)\, dt.$$
The augmented quantile estimator is $\widehat{b}(\alpha|I) = \arg\min_b \widehat{R}(b; \alpha, I)$, $\widehat{\beta}_0(\alpha|I)$ and $\widehat{\beta}_1(\alpha|I)$ being estimators of $B(\alpha|I)$ and its first derivative $B^{(1)}(\alpha|I)$, respectively. The estimator of the private value quantile is
$$\widehat{V}(\alpha|I) = \widehat{\beta}_0(\alpha|I) + \frac{\alpha \widehat{\beta}_1(\alpha|I)}{I-1}.$$

Augmented quantile regression. A first extension of this procedure is the augmented quantile regression estimator, AQR hereafter, which considers the private value quantile regression specification $V(\alpha|x, I) = [1, x']\, \gamma(\alpha|I)$. When the private value distribution does not depend upon $I$, the bid quantile functions $B(\cdot|I)$ are such that the derivatives
$$\frac{\partial^j}{\partial \alpha^j} \left[ B(\alpha|I) + \frac{\alpha B^{(1)}(\alpha|I)}{I-1} \right] = \left( 1 + \frac{j}{I-1} \right) B^{(j)}(\alpha|I) + \frac{\alpha B^{(j+1)}(\alpha|I)}{I-1}$$
do not depend upon $I$, as they are equal to $V^{(j)}(\alpha|I) = V^{(j)}(\alpha)$, $j = 0, \ldots, s+1$. These constraints can be used to estimate $V(\alpha)$ using the parameters $\gamma = (\gamma_0, \ldots, \gamma_s)$, $\delta = (\delta_2, \ldots, \delta_{\bar{I}})$, where $\gamma_j$ is for $V^{(j)}(\alpha)$ and $\delta_I$ for the derivative $B^{(s+1)}(\alpha|I)$, $I = 2, \ldots, \bar{I}$, and $b_I(\gamma, \delta) = [b_{0,I}, \ldots, b_{s,I}, \delta_I]'$ with
$$b_{s,I} = \left( 1 + \frac{s}{I-1} \right)^{-1} \left( \gamma_s - \frac{\alpha}{I-1}\, \delta_I \right),$$
the other $b_{j,I}$'s being computed recursively using
$$b_{j,I} = \left( 1 + \frac{j}{I-1} \right)^{-1} \left( \gamma_j - \frac{\alpha}{I-1}\, b_{j+1,I} \right), \quad j = 0, \ldots, s.$$
The estimator of $V(\alpha)$ is $\widehat{\gamma}_0$, where $\left( \widehat{\gamma}, \widehat{\delta} \right) = \arg\min_{\gamma, \delta} \sum_{I=2}^{\bar{I}} \widehat{R}\left( b_I(\gamma, \delta); \alpha, I \right)$.
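The no-covariate step above can be mimicked with a least-squares surrogate (an assumption made here for brevity: the check-function objective is replaced by a kernel-weighted polynomial fit of the empirical bid quantile function, which conveys the same idea of reading $B(\alpha)$ and $B^{(1)}(\alpha)$ off the local polynomial coefficients). Illustrative true model: $I = 2$ bidders, $V(\alpha) = \alpha$, so $B(\alpha) = \alpha/2$ by (2.3).

```python
import numpy as np

# Least-squares surrogate of the augmented local polynomial step (illustration).
rng = np.random.default_rng(2)
n = 100_000
bids = np.sort(rng.uniform(size=n) / 2)            # B_i = B(A_i) = A_i / 2
levels = (np.arange(1, n + 1) - 0.5) / n           # quantile levels of order stats

alpha, h = 0.6, 0.1
t = (levels - alpha) / h
w = np.clip(1.0 - t ** 2, 0.0, None)               # Epanechnikov-type weights
keep = w > 0
d = levels[keep] - alpha
X = np.column_stack([np.ones(d.size), d, d ** 2 / 2])   # pi(a - alpha), order s + 1 = 2
sw = np.sqrt(w[keep])
coef, *_ = np.linalg.lstsq(X * sw[:, None], bids[keep] * sw, rcond=None)

B_hat, B1_hat = coef[0], coef[1]
V_hat = B_hat + alpha * B1_hat / (2 - 1)           # private value quantile by (2.4)
print(B_hat, B1_hat, V_hat)  # approximately 0.3, 0.5, 0.6
```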
[Footnote: Although not considered here, the augmented quantile estimation procedure can be used to estimate the pdf $f(v|I)$ of the private values using $f(v|I) = 1/V^{(1)}[F(v|I)|I]$. An estimator for $F(\cdot|I)$ is $\widehat{V}^{-1}(\cdot|I)$. Set $\widehat{V}^{(1)}(\alpha|I) = \left(1 + \frac{1}{I-1}\right)\widehat{\beta}_1(\alpha|I) + \alpha \widehat{\beta}_2(\alpha|I)/(I-1)$ and $\widehat{f}(v|I) = 1/\widehat{V}^{(1)}\left[ \widehat{F}(v|I) \,\big|\, I \right]$. This pdf estimator can account for covariates by using the AQR and ASQR procedures introduced below.]

In the second extension, the augmented sieve quantile regression (ASQR), the private value quantile function $V(\alpha|x, I)$ is equal to $P(x)'\gamma(\alpha|I)$ up to an approximation error, where $P(x)$ stacks the sieve functions $P_k(x)$, $k = 1, \ldots, K$. The AQR and ASQR approaches can be grouped by setting $P(x) = [1, x']'$ for the AQR. The bid quantile function satisfies $B(\alpha|x, I) = P(x)'\beta(\alpha|I)$ by (2.6), with $\gamma(\alpha|I) = \beta(\alpha|I) + \alpha \beta^{(1)}(\alpha|I)/(I-1)$ by (2.7), up to an approximation error in the ASQR case. Define now the parameter $b = \left[ \beta_0', \beta_1', \ldots, \beta_{s+1}' \right]'$, where all the $\beta_j$ have the same dimension $D+1$, and
$$P(x, t) = \pi(t) \otimes P(x),$$
which is such that the Taylor expansion of $B(\alpha|x, I)$ writes, in the AQR case,
$$B(\alpha + ht|x, I) = P(x, ht)'\, b(\alpha|I) + O\left( h^{s+2} \right),$$
where $b(\alpha|I)$ stacks $\beta(\alpha|I)$ and its successive derivatives $\beta^{(1)}(\alpha|I), \ldots, \beta^{(s+1)}(\alpha|I)$.
The objective function of the estimation procedure becomes
$$\widehat R(b;\alpha,I)=\frac{1}{L_I}\sum_{\ell=1}^{L}\mathbb I(I_\ell=I)\sum_{i=1}^{I_\ell}\int \rho_a\left(B_{i\ell}-P(x_\ell,a-\alpha)'b\right)\frac{1}{h}K\left(\frac{a-\alpha}{h}\right)da$$
$$=\frac{1}{L_I}\sum_{\ell=1}^{L}\mathbb I(I_\ell=I)\sum_{i=1}^{I_\ell}\int_{-\alpha/h}^{(1-\alpha)/h}\rho_{\alpha+ht}\left(B_{i\ell}-P(x_\ell,ht)'b\right)K(t)\,dt \qquad (3.2)$$
which accounts for the covariate $x_\ell$. The estimator of $b(\alpha|I)$ is $\widehat b(\alpha|I)=\arg\min_b\widehat R(b;\alpha,I)$, and the private value quantile regression estimator is $\widehat V(\alpha|x,I)=P(x)'\widehat\gamma(\alpha|I)$ with
$$\widehat\gamma(\alpha|I)=\widehat\beta(\alpha|I)+\frac{\alpha\widehat\beta^{(1)}(\alpha|I)}{I-1}.$$
The bid quantile function and its derivative can be estimated using $\widehat B(\alpha|x,I)=P(x)'\widehat\beta(\alpha|I)$ and $\widehat B^{(1)}(\alpha|x,I)=P(x)'\widehat\beta^{(1)}(\alpha|I)$. The rearrangement method of Chernozhukov et al. (2010) can be used to obtain increasing quantile estimators.

Bassett and Koenker (1982) report that standard quantile regression estimators are not defined for the extreme quantile levels $\alpha=0$ or $\alpha=1$, or even nearby. The augmented procedures proposed here are better behaved for extreme quantiles because the objective function $\widehat R(\cdot;\alpha,I)$ averages the check function $\rho_a(\cdot)$ over quantile levels $a$ in $[\alpha-h,\alpha+h]\cap[0,1]$. For $\alpha=1$ and $h\le 1$, $\widehat R(b;1,I)$ averages $\rho_{1+ht}\left(B_{i\ell}-P(x_\ell,ht)'b\right)$ over $t$ in $[-1,0]$, so that $\widehat R(b;1,I)$ will be large if $b$ is too large.
Figure 1 below shows indeed that $\widehat R(b;1,I)$ has no flat part when $b$ grows, contrasting with the standard quantile regression objective functions.

Figure 1: A path of the objective function $\widehat R(b;1,I)$ (solid line) of the augmented quantile regression estimator and of the objective function of the standard quantile regression estimator (dotted line) when $b$ varies in the direction $[1,\ldots,1]'$.

This averaging effect requires that $t\mapsto P(x_\ell,ht)'b$ is not constant, meaning that the derivative components of $b$ should not vanish. The augmented estimators are therefore better behaved near the extreme quantile levels $\alpha=0$ and $\alpha=1$ than the standard quantile regression estimator. This is especially relevant for estimating auction models, as the winner is expected to belong to the upper tail as soon as the number of bidders is large enough. In fact, it follows from the theoretical study of the objective function $\widehat R(\cdot;\cdot,I)$ that the AQR and ASQR estimators are uniquely defined for all quantile levels with a large probability. As a result of a smooth objective function, the AQR and ASQR estimators are also smoother than standard quantile regression ones, see for instance Figure 4 in the Application Section.

The notations $a\vee b$ and $a\wedge b$ are used instead of $\max(a,b)$ and $\min(a,b)$. Recall $a_L\asymp b_L$ means that both $a_L/b_L=O(1)$ and $b_L/a_L=O(1)$. The norm $\|\cdot\|$ is the Euclidean one, i.e. $\|e\|=(e'e)^{1/2}$.

Assumption A (i) The auction variables $(I_\ell,x_\ell,V_{i\ell},B_{i\ell},\ i=1,\ldots,I_\ell)$ are iid across $\ell$. The pdf $f(x|I)$ of the covariates $x_\ell$ given $I_\ell=I$ is continuous and bounded away from $0$ over its bounded support $\mathcal X$, which has a non-empty interior and does not depend upon $I$. The actual number of bidders $I_\ell$ belongs to a finite set $\mathcal I$ of integer numbers larger or equal to $2$.
(ii) Given $(x_\ell,I_\ell)=(x,I)$, the private values $V_{i\ell}$, $i=1,\ldots$
$,I_\ell$, are iid with a conditional quantile function $V(\alpha|x,I)$, which is continuously differentiable over $[0,1]\times\mathcal X$ with
$$\inf_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}V^{(1)}(\alpha|x,I)>0\quad\text{and}\quad\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}V^{(1)}(\alpha|x,I)<\infty.$$
(iii) (2.3) holds with $B(0|x,I)=V(0|x,I)$ for all $(x,I)\in\mathcal X\times\mathcal I$. See the discussion following Theorem C.4 in Appendix C for a formal argument.

Assumption S For some $s\ge 1$ and each $I\in\mathcal I$, $V(\alpha|x,I)$ is $(s+1)$-times continuously differentiable over $[0,1]\times\mathcal X$ with either: (i) $D_M=0$, in which case $V(\alpha|x,I)=X'\gamma(\alpha|I)$ as in (2.5); (ii) $D_M>0$, in which case $V(\alpha|x,I)$ has $D_M$ interactions as in (2.9).

Assumption H The kernel function $K(\cdot)$, with support $(-1,1)$, is symmetric, continuously differentiable over the real line, and strictly positive over $(-1,1)$. The positive bandwidth $h$ goes to $0$ with $\lim_{L\to\infty}\log L/(Lh^{2(D_M+1)})=0$. For the ASQR estimator, $P(x)=[P_1(x),\ldots,P_K(x)]'$, where $P_k(x)=P_{hk}(x)$ and $K\asymp h^{-D_M}$. The retained sieve satisfies the high-level Assumption R stated in Appendix A.

Assumption F For all $x$ in $\mathcal X$ and $\alpha$ in $[0,1]$, the function $F[\alpha,x,b_I,b_I^{(1)};I\in\mathcal I]$ is twice differentiable with respect to the $b_I$ and $b_I^{(1)}$, $I$ in $\mathcal I$. The partial derivatives of order 1 and 2 are continuous with respect to $\alpha$, $x$, $B_I$ and $B_I^{(1)}$, $I$ in $\mathcal I$.

Assumption A recalls the quantile implications of Bayesian Nash equilibrium bidding under symmetric IPV, see Assumption A-(iii). In Assumption A-(i), the existence of a conditional pdf for the covariate $x_\ell$ is only used for the infinite dimensional quantile regression specification. For a standard quantile regression specification, it is sufficient to assume that the matrix $E[\mathbb I(I_\ell=I)X_\ell X_\ell']$ has an inverse for all $I\in\mathcal I$, as recalled in Assumption R-(i) in Appendix A. Note that, as all along this paper, private values and number of bidders can be dependent.
A discussion of such dependence, in relation with an entry stage preliminary to the auction, can be found in Marmer, Shneyerov and Xu (2013a). For Assumption A-(ii), recall that
$$V^{(1)}(\alpha|x,I)=\frac{1}{f(V(\alpha|x,I)|x,I)},\qquad(4.3)$$
where $f(v|x,I)$ is the conditional private value pdf. Hence Assumption A-(ii) amounts to assuming that $f(v|x,I)$ is bounded away from $0$ and infinity on its support $[V(0|x,I),V(1|x,I)]$, as assumed for instance in Riley and Samuelson (1981), Maskin and Riley (1984) or GPV. The condition $0<f(v|x,I)<\infty$ is also used for asymptotic normality of quantile regression estimators, see Koenker (2005). Assumption S combines a standard smoothness assumption with interaction restrictions.

Assumption H restricts the rate at which the bandwidth can go to $0$. In the AQR case, it writes $\lim_{L\to\infty}\log L/(Lh^2)=0$, which is slightly more restrictive than the condition $\lim_{L\to\infty}\log L/(Lh)=0$ used in nonparametric estimation. This rate restriction is specific to the quantile approach used here. The restriction $K\asymp h^{-D_M}$ and the choice of a sieve satisfying the high-level Assumption R of Appendix A are discussed in the next section. Assumption F holds for most of the examples of functionals above. A notable exception is the cdf $F(v|x,I)$ in Example 3 when expressed using the rearrangement method of Chernozhukov et al. (2010), which involves an indicator function which is not smooth. However it holds for the smoothed approximation $F_\eta(v|x,I)$ of the cdf, although Assumption F implicitly rules out a vanishing bandwidth $\eta$ in Example 3.

The last stage of our procedure is the choice of a suitable sieve in (2.10), when a quantile regression specification cannot be used and more flexibility is needed. While the high-level Assumption R of Appendix A mentioned in Assumption H describes some key theoretical properties used in the main results, the focus is set here on suitable sieves.
The most important requirement is that the sieve has good approximation properties, as detailed in Appendix A. Although not strictly necessary, the sieve functions $P_k(\cdot)$ in the private value quantile expansion (2.10) should be localized, i.e. the number of $P_{k'}(\cdot)$ such that $P_k(\cdot)P_{k'}(\cdot)$ does not vanish must be bounded. These two requirements are typically satisfied by sieves building on cardinal spline bases or wavelets, as detailed now.

Consider first the spline example of sieves. Assume that $\mathcal X=[0,1]^D$ for the sake of brevity. For $m\ge s+2$, set $(t)_+^{m-1}=t^{m-1}$ if $t>0$ and $(t)_+^{m-1}=0$ otherwise. The considered spline sieve is based upon the uniformly spaced simple knots B-spline function of order $m$,
$$q(t)=\sum_{i=0}^{m}(-1)^i\binom{m}{i}\frac{(t-i)_+^{m-1}}{(m-1)!},$$
which has $m-2$ continuous derivatives and support $[0,m]$. The baseline B-spline function $q(\cdot)$ generates the rescaled functions $p_{\kappa h}(\cdot)=p_\kappa(\cdot)$,
$$p_\kappa(t)=\frac{1}{\sqrt h}\,q\left(\frac{t-(\kappa-m)h}{h}\right),\qquad\kappa=1,\ldots,\bar\kappa,$$
where $\bar\kappa=\bar\kappa_h=O(1/h)$ is the largest integer number such that $(\bar\kappa-m)h\le 1\le\bar\kappa h$. Theorem 6.20 in Schumaker (2007) implies that each function $v(\cdot)$ with $s+1$ continuous derivatives can be approximated uniformly over $[0,1]$ with a linear combination of the $p_\kappa(\cdot)$'s up to an error $O\left(h^{s+1}\right)$. The $p_\kappa(\cdot)$'s are also localized, with $\int p_\kappa^2(t)dt=O(1)$ uniformly in $\kappa$ and $h$. Similarly, additive quantile functions as in (2.8) can be approximated using the sieve $\{p_\kappa(x_1),\ldots,p_\kappa(x_D),\ \kappa=1,\ldots,\bar\kappa\}$. A suitable sieve for additive interactive quantile functions of order $D_M$ as in (2.9) is
$$\left\{\prod_{\delta=1}^{D_M}p_{\kappa_\delta}(x_{j_\delta}),\ \text{all }(\kappa_\delta,j_\delta)\text{ with }1\le\kappa_1,\ldots,\kappa_{D_M}\le\bar\kappa,\ 1\le j_1<\cdots<j_{D_M}\le D\right\}.\qquad(4.4)$$
The set (4.4) can be written as a collection $\{P_k(x),\ k=1,\ldots$
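The truncated-power formula for the cardinal B-spline $q(\cdot)$ is straightforward to evaluate directly. The sketch below (with function names of our own choosing, for illustration) checks the partition-of-unity property of the integer translates of $q(\cdot)$, a standard consequence of the localization of the basis.

```python
from math import comb, factorial

def cardinal_bspline(t, m):
    """Order-m cardinal B-spline with support [0, m]:
    q(t) = sum_{i=0}^m (-1)^i C(m, i) (t - i)_+^(m-1) / (m - 1)!."""
    total = 0.0
    for i in range(m + 1):
        u = t - i
        if u > 0.0:
            total += (-1) ** i * comb(m, i) * u ** (m - 1)
    return total / factorial(m - 1)

# Integer translates of q sum to one: for t in (0, 1), the only
# non-vanishing translates are q(t), q(t + 1), ..., q(t + m - 1).
m = 4
for t in (0.3, 0.55, 0.9):
    total = sum(cardinal_bspline(t + k, m) for k in range(m))
    assert abs(total - 1.0) < 1e-12
```

For $m=2$, `cardinal_bspline` is the familiar hat function peaking at $q(1)=1$.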
$,K\}$ with $K=O\left(h^{-D_M}\right)$ localized functions satisfying $\int_{\mathcal X}P_k^2(x)dx=O(1)$ uniformly in $k$ and $h$.

Similar localized sieves can be obtained using wavelets on the interval $[0,1]$. Let $\varphi(\cdot)$ and $\psi(\cdot)$ be the father and mother wavelets of order $s+1$, i.e. $\int t^r\psi(t)dt=0$ for $r=1,\ldots,s+1$. A wavelet sieve similar to (4.4) is given by the collection of functions
$$\prod_{\delta=1}^{D_M}2^{H_0/2}\varphi\left(\frac{x_{j_\delta}-2^{-H_0}\kappa_\delta}{2^{-H_0}}\right)\quad\text{and}\quad\prod_{\delta=1}^{D_M}2^{H/2}\psi\left(\frac{x_{j_\delta}-2^{-H}\kappa_\delta}{2^{-H}}\right),\qquad H_0\le H\le H_1,$$
where $H_0$ and $H_1$ are two diverging integer numbers with $2^{-H_1}\asymp h$, and $\kappa_\delta$ and $j_\delta$ as in (4.4).

The next sections give our theoretical results for the integrated mean squared error and the asymptotic distribution of the augmented estimator $\widehat V(\cdot|x,I)$. Theorem A.1 in Appendix A also gives uniform consistency rates of similar interest.

Recall $P(x_\ell)=[1,x_\ell']'$ is of the constant dimension $K=D+1$ in the AQR case. Let $\mathbf s$ be the $1\times(s+2)$ selection vector $(0,1,0,\ldots,0)$, which is such that $(\mathbf s\otimes\mathrm{Id}_K)\widehat b(\alpha|I)=\widehat\beta^{(1)}(\alpha|I)$ is the estimator of the sieve coefficient derivative $\beta^{(1)}(\alpha)$. Let $\Pi(\alpha)$ be the second column of the inverse of $\int\pi(t)\pi(t)'K(t)dt$, i.e.
$$\Pi(\alpha)=\left(\int\pi(t)\pi(t)'K(t)dt\right)^{-1}\mathbf s',$$
and consider the variance terms
$$v(\alpha)=\Pi(\alpha)'\int\int\pi(t_1)\pi(t_2)'\min(t_1,t_2)K(t_1)K(t_2)\,dt_1dt_2\,\Pi(\alpha),$$
$$\Sigma(\alpha|I)=\frac{\alpha^2 v(\alpha)}{(I-1)^2}E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)\right]E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$\Sigma_{IL}=\int_{\mathcal X}\int_0^1 P(x)'\Sigma(\alpha|I)P(x)\,d\alpha\,dx.$$
That $v(\alpha)$, and then $\Sigma_{IL}$, is strictly positive follows from the proof of Theorem 2 below, see in particular Lemma B.5 in Appendix B. The bias of the estimator will depend upon
$$\mathrm{Bias}(\alpha|I)=\frac{\alpha}{I-1}\,\mathbf s\left(\int\pi(t)\pi(t)'K(t)dt\right)^{-1}\int\frac{t^{s+2}\pi(t)}{(s+2)!}K(t)dt$$
$$\times\ E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)\frac{\alpha B^{(s+2)}(\alpha|x_\ell,I_\ell)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$\mathrm{Bias}^2_{IL}=\int_{\mathcal X}\int_0^1\left(P(x)'\mathrm{Bias}(\alpha|I)\right)^2d\alpha\,dx.$$

Theorem 2 Suppose that the private value conditional quantile function $V(\cdot|\cdot)$ is a quantile regression (2.5), for which $D_M=0$, or a sieve quantile regression (2.10) with $D_M$ interactions. Then under Assumptions A, H, S with $s\ge D_M/2$, there exists an approximation $\widehat v(\alpha|x,I)$ of $\widehat V(\alpha|x,I)$ such that
$$E\left[\int_{\mathcal X}\int_0^1\left(\widehat v(\alpha|x,I)-V(\alpha|x,I)\right)^2d\alpha\,dx\right]=h^{2(s+1)}\mathrm{Bias}^2_{IL}+\frac{\Sigma_{IL}}{LIh^{D_M+1}}+o\left(h^{2(s+1)}+\frac{1}{Lh^{D_M+1}}\right)$$
where $\mathrm{Bias}^2_{IL}=O(1)$, $\Sigma_{IL}=O(1)$ and
$$\int_{\mathcal X}\int_0^1\left(\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)\right)^2d\alpha\,dx=o_P\left(\frac{1}{Lh^{D_M+1}}\right).\qquad(4.5)$$
The quantile estimator $\widehat V(\alpha|x,I)$ is nonlinear and defined in an implicit way, so that attempting a direct computation of its IMSE is difficult.
Its approximation $\widehat v(\alpha|x,I)$ follows from a Bahadur linearization argument, see Theorem D.1 and (E.1) in Appendices D and E. The rate in equation (4.5) is negligible with respect to the IMSE of $\widehat v(\alpha|x,I)$, showing that it is fair to replace $\widehat V(\alpha|x,I)$ by $\widehat v(\alpha|x,I)$ to picture the IMSE of $\widehat V(\alpha|x,I)$.

Note that Theorem 2 holds over the full quantile level range $[0,1]$. The term $\alpha B^{(1)}(\alpha|x,I)$ in $V(\alpha|x,I)=B(\alpha|x,I)+\alpha B^{(1)}(\alpha|x,I)/(I-1)$ is $(s+1)$-times continuously differentiable, which gives the order $h^{s+1}$ for the bias and the order $1/\left(Lh^{D_M+1}\right)^{1/2}$ for the variance. The bias component due to the estimation of $B(\alpha|x,I)$ is of the negligible order $h^{s+2}$, except perhaps over a small vicinity of $0$ where it is $o(h^{s+1})$. The asymptotic variance order $\Sigma_{IL}/\left(LIh^{D_M+1}\right)$ is similar to the asymptotic variance obtained for kernel estimation of a conditional pdf with $D_M$ covariates. Indeed, the bid quantile derivative is homogeneous to a conditional pdf, since
$$B^{(1)}(\alpha|x,I)=\frac{1}{g\left[B(\alpha|x,I)|x,I\right]},$$
where $g(\cdot|\cdot)$ is the bid conditional pdf. The bid quantile function is homogeneous to a cdf and converges with a faster rate. Note that the asymptotic variance term $\Sigma_{IL}/\left(LIh^{D_M+1}\right)$ depends upon the number of interactions $D_M$ and not the dimension of the covariate $D$. Hence Theorem 2 illustrates the dimension reduction features of the procedure. In particular, the variance term is of order $1/(Lh)$ in the AQR case independently of the dimension of the covariate $D$, which therefore can be large.

Minimizing the leading term of the IMSE yields the optimal bandwidth
$$h^*=\left(\frac{(D_M+1)\Sigma_{IL}}{2(s+1)\mathrm{Bias}^2_{IL}LI}\right)^{\frac{1}{2s+D_M+3}}.\qquad(4.6)$$
As in kernel estimation, a pilot bandwidth can be computed using a simple private value quantile regression model to proxy $\Sigma_{IL}$ and $\mathrm{Bias}^2_{IL}$ in a parametric way.
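The closed form (4.6) can be sanity-checked numerically: for any positive constants, the minimizer of the leading IMSE term $h^{2(s+1)}\mathrm{Bias}^2_{IL}+\Sigma_{IL}/(LIh^{D_M+1})$ matches the formula. The constants below are hypothetical, chosen only for illustration.

```python
def h_star(bias2, sigma, L, I, s, d_m):
    """Optimal bandwidth (4.6) minimizing the leading IMSE term."""
    return ((d_m + 1) * sigma / (2 * (s + 1) * bias2 * L * I)) ** (1.0 / (2 * s + d_m + 3))

# hypothetical constants, for illustration only
bias2, sigma, L, I, s, d_m = 1.3, 0.7, 500, 2, 1, 0

def imse(h):
    """Leading IMSE term: squared bias plus variance."""
    return h ** (2 * (s + 1)) * bias2 + sigma / (L * I * h ** (d_m + 1))

# brute-force minimization over a fine grid agrees with the closed form
grid = [0.001 + 0.0005 * k for k in range(2000)]
h_num = min(grid, key=imse)
assert abs(h_num - h_star(bias2, sigma, L, I, s, d_m)) < 1e-3
```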
The corresponding IMSE rate is
$$L^{-\frac{2(s+1)}{2s+D_M+3}},$$
which decreases with the number of interactions $D_M$ but does not depend upon the dimension $D$ of the covariate. In the AQR case with $D_M=0$, the IMSE rate $L^{-\frac{2(s+1)}{2s+3}}$ is, as expected, the optimal rate for estimating the marginal pdf of a real random variable. For $s=1$, it is equal to $L^{-4/5}$ independently of the dimension $D$ of the covariate, which is close to the parametric rate $L^{-1}$.

Two assumptions limit the use of the optimal bandwidth (4.6). First, Theorem 2 assumes $s\ge D_M/2$, which requires more smoothness when the number of interactions $D_M$ is larger; in GPV the dimension $D$ of the covariate replaces $D_M$ but plays a similar role. Aryal et al. (2016) however use a condition $s+1>D$ to study a GMM version of GPV based on a local polynomial estimation of the private value.

This section states a Central Limit Theorem for $\widehat V(\alpha|x,I)$, Theorem 3, which illustrates the good pointwise properties of $\widehat V(\alpha|x,I)$ near or at the upper boundary $\alpha=1$. Let $\mathbf s$ be the selection vector defined earlier and
$$\Pi_h(\alpha)=\left(\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)\pi(t)'K(t)dt\right)^{-1}\mathbf s',$$
$$v_h(\alpha)=\Pi_h(\alpha)'\int_{-\alpha/h}^{(1-\alpha)/h}\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t_1)\pi(t_2)'\min(t_1,t_2)K(t_1)K(t_2)\,dt_1dt_2\,\Pi_h(\alpha),$$
$$\Sigma_h(\alpha|I)=\frac{\alpha^2 v_h(\alpha)}{(I-1)^2}E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)\right]E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],\qquad(4.7)$$
$$\mathrm{Bias}_h(\alpha|I)=\frac{\alpha}{I-1}\,\mathbf s\left(\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)\pi(t)'K(t)dt\right)^{-1}\int_{-\alpha/h}^{(1-\alpha)/h}\frac{t^{s+2}\pi(t)}{(s+2)!}K(t)dt$$
$$\times\ E^{-1}\left[\frac{P(x_\ell)P(x_\ell)'\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)\frac{\alpha B^{(s+2)}(\alpha|x_\ell,I)}{B^{(1)}(\alpha|x_\ell,I)}\right].$$
(4.8)

Theorem 3 Suppose that the private value conditional quantile function $V(\cdot|\cdot)$ is a quantile regression (2.5) or a sieve quantile regression (2.10) with $D_M$ interactions. Then under Assumptions A, H, S with $s\ge D_M/2$ and $\log L/(Lh^{D_M+1+(1\vee D_M)})=o(1)$, it holds for $\alpha$ in $(0,1]$ and all $x$ in $\mathcal X$ that
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\left(\widehat V(\alpha|x,I)-V(\alpha|x,I)-h^{s+1}P(x)'\mathrm{Bias}_h(\alpha|I)+o\left(h^{s+1}\right)\right)$$
converges in distribution to a standard normal. Moreover $P(x)'\Sigma_h(\alpha|I)P(x)\asymp\alpha h^{-D_M}$ and
$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'\mathrm{Bias}_h(\alpha|I)\right|=O(1).$$

Theorem 3 shows that the asymptotic variance of $\widehat V(\alpha|x,I)$ is of order $\alpha/\left(Lh^{D_M+1}\right)$ for $\alpha>0$. For $\alpha=0$, $\widehat V(\alpha|x,I)=\widehat B(\alpha|x,I)$ has an asymptotic variance of order $1/\left(Lh^{D_M}\right)$, and a corresponding CLT using this standardization also holds. For other quantile levels the private value conditional quantile estimator depends upon $\widehat B^{(1)}(\alpha|x,I)$, so that the asymptotic variance of $\widehat V(\alpha|x,I)$ has the larger order $1/\left(Lh^{D_M+1}\right)$, which also holds in Theorem 2. The expression of the asymptotic variance of $\widehat V(\alpha|x,I)$ is quite typical of quantile regression estimators, up to the factor $v_h(\alpha)$, which is due to $\widehat B^{(1)}(\alpha|x,I)$.

It follows from Theorem 3 that the private value conditional quantile estimator is consistent for all quantile levels, including $\alpha=1$. The potential boundary effects only appear through the bias and variance factors $\mathrm{Bias}_h(\alpha|I)$ and $\Sigma_h(\alpha|I)$.
Since the support of the kernel is $[-1,1]$, $\mathrm{Bias}_h(\alpha|I)=\mathrm{Bias}(\alpha|I)$ and $\Sigma_h(\alpha|I)=\Sigma(\alpha|I)$ for all $\alpha$ in $[h,1-h]$, where $\mathrm{Bias}(\alpha|I)$ and $\Sigma(\alpha|I)$ are defined before Theorem 2, allowing in principle to implement simple pilot bandwidths for quantile levels inside $[h,1-h]$. Boundary effects can occur when $\alpha$ lies in $(0,h]$ or $[1-h,1]$. It is commonly believed that the variance factor is inflated near the boundaries, but there is no clear result for the bias factor, see Fan and Gijbels (1996) and the references therein.

4.3 Functional estimation

The plug-in estimators of $\theta(x)$ and $\theta$ in (2.13) are
$$\widehat\theta(x)=\int_0^1 F\left[\alpha,x,\widehat B(\alpha|x,I),\widehat B^{(1)}(\alpha|x,I);I\in\mathcal I\right]d\alpha,\qquad\widehat\theta=\int_{\mathcal X}\widehat\theta(x)dx,$$
with AQR or ASQR $\widehat B(\alpha|x,I)$ and $\widehat B^{(1)}(\alpha|x,I)$. Alternatively, $\theta$ can be estimated using $\sum_{\ell=1}^L\widehat\theta(x_\ell)/L$. Let us now introduce the asymptotic variances of $\widehat\theta(x)$ and $\widehat\theta$. The variances depend upon the matrices
$$\mathcal P(I)=E\left[\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'\right],\qquad\mathcal P(\alpha|I)=E\left[\mathbb I(I_\ell=I)\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
and of the functions, recalling that $b_I$ and $b_I^{(1)}$ stand for $B(\alpha|x,I)$ and $B^{(1)}(\alpha|x,I)$ respectively,
$$\varphi_{0I}(\alpha,x)=\frac{\partial F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]}{\partial b_I},\qquad\varphi_{1I}(\alpha,x)=\frac{\partial F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]}{\partial b_I^{(1)}}.$$
Let $A$ be a random variable with the uniform distribution over $[0,1]$ and define
$$\sigma_L^2(x|I)=I\,\mathrm{Tr}\left\{\mathrm{Var}\left[\left(\int_A^1\left\{\varphi_{0I}(\alpha|x)-\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\right\}\mathcal P(\alpha|I)^{-1}d\alpha\right)\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\},$$
$$\sigma_L^2(I)=I\,\mathrm{Tr}\left\{\mathrm{Var}\left[\int_A^1\left(\int_{\mathcal X}\left\{\varphi_{0I}(\alpha|x)-\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\right\}\mathcal P(\alpha|I)^{-1}\mathcal P^{1/2}(I)P(x)dx\right)d\alpha\right]\right\},$$
$$\sigma_L^2(x)=\sum_{I\in\mathcal I}\sigma_L^2(x|I),\qquad\sigma_L^2=\sum_{I\in\mathcal I}\sigma_L^2(I).$$
The proof of Theorem 4 in Appendix E shows that the asymptotic variances of $\widehat\theta(x)$ and $\widehat\theta$ are $\sigma_L^2(x)/\left(Lh^{D_M}\right)$ and $\sigma_L^2/L$ respectively, provided
$$\varphi_{0I}(\alpha|x)\ne\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\qquad(4.9)$$
for some $\alpha$, $x$ and $I$ of $[0,1]\times\mathcal X\times\mathcal I$. Indeed, if $\varphi_{0I}(\alpha|x)=\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ for all $\alpha$ and $I$, then $\sigma_L^2(x|I)=0$ and, if this also holds for all $x$, $\sigma_L^2=0$, in which case $\widehat\theta(x)$ and $\widehat\theta$ can converge to $\theta(x)$ and $\theta$ with "superefficient" rates, faster than $\left(Lh^{D_M}\right)^{-1/2}$ and $L^{-1/2}$ respectively. In the case of density based functionals, Laurent (1997) similarly obtained asymptotic variances that can vanish. Why this is possible is better understood in our quantile context, through an example of functionals for which (4.9) does not hold. Consider, for some $I$ of $\mathcal I$,
$$F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]=2B(\alpha|x,I)B^{(1)}(\alpha|x,I),$$
which gives $\left(\varphi_{0I}(\alpha|x),\varphi_{1I}(\alpha|x)\right)=2\left(B^{(1)}(\alpha|x,I),B(\alpha|x,I)\right)$. Hence $\varphi_{0I}(\alpha|x)=\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ for all $(\alpha,x,I)$, so that (4.9) does not hold and $\sigma_L^2(x)=\sigma_L^2=0$. Why $\widehat\theta(x)$ and $\widehat\theta$ can converge with superefficient rates for these functionals is in fact not surprising, observing that they estimate
$$\theta(x)=B^2(1|x,I)-B^2(0|x,I),\qquad\theta=\int_{\mathcal X}\theta(x)dx,$$
respectively.
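The superefficiency example can be checked by direct integration: with $F=2BB^{(1)}$, the functional $\int_0^1 F\,d\alpha$ equals $B^2(1)-B^2(0)$ whatever the bid quantile function. A sketch with an arbitrary illustrative $B$ (not taken from the paper):

```python
# theta = integral over [0,1] of 2 B(a) B'(a) equals B(1)^2 - B(0)^2,
# so it only depends on the two extreme quantile levels.
B = lambda a: 0.5 * a + 0.1 * a ** 2     # illustrative bid quantile function
B1 = lambda a: 0.5 + 0.2 * a             # its derivative

# midpoint-rule Riemann sum of the functional
n = 100000
da = 1.0 / n
theta = sum(2 * B((k + 0.5) * da) * B1((k + 0.5) * da) for k in range(n)) * da
assert abs(theta - (B(1.0) ** 2 - B(0.0) ** 2)) < 1e-8
```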
Hence, for these examples, the parameters of interest only depend upon extreme quantiles, in which case superefficient estimation is possible, see e.g. Hirano and Porter (2003) and the references therein. A role of the new Condition (4.9) is to exclude such functionals. The next Theorem establishes the asymptotic normality of $\widehat\theta(x)$ and $\widehat\theta$.

Theorem 4 Suppose Assumptions A, F, H, S and R hold with $s\ge D_M/2$. Then $\sigma_L^2(x)$ and $\sigma_L^2$ are bounded away from $0$ and infinity if (4.9) holds for some $(\alpha,I)$ in $[0,1]\times\mathcal I$ and for some $(\alpha,x,I)$ in $[0,1]\times\mathcal X\times\mathcal I$ respectively. Moreover
i. If $\log L/(Lh^{D_M+2+(D_M\vee 1)})=o(1)$, $\sqrt{Lh^{D_M}}\left(\widehat\theta(x)-\theta(x)-\mathrm{bias}_{L,\theta(x)}\right)/\sigma_L(x)$ converges in distribution to a standard normal, where $\mathrm{bias}_{L,\theta(x)}$ is a $o(h^s)$ bias term.
ii. If $\log L/(Lh^{D_M+1+(D_M\vee 1)})=o(1)$, $\sqrt L\left(\widehat\theta-\theta-\mathrm{bias}_{L,\theta}\right)/\sigma_L$ converges in distribution to a standard normal, where $\mathrm{bias}_{L,\theta}$ is a $o(h^s)$ bias term.

A more systematic study is out of the scope of the present paper, as is the issue of semiparametric efficiency.

When $F(\cdot)$ depends upon $\alpha B^{(1)}(\alpha|x,I)$, as in all the Examples, the exact order of the bias term is $h^{s+1}$, with
$$\mathrm{bias}_{L,\theta(x)}=h^{s+1}(1+o(1))\sum_{I\in\mathcal I}\int_0^1 G_{b_I}\left[\alpha,x,B(\alpha|x,I),\alpha B^{(1)}(\alpha|x,I);I\in\mathcal I\right]\times P(x)'\mathrm{Bias}_h(\alpha|x,I)\,d\alpha$$
and $\mathrm{bias}_{L,\theta}=\int_{\mathcal X}\mathrm{bias}_{L,\theta(x)}dx$, where $\mathrm{Bias}_h(\alpha|x,I)$ is as in (4.8) and $G_{b_I}(\cdot)$ is the partial derivative of $F(\cdot)$ with respect to $\alpha B^{(1)}(\alpha|x,I)$. $\widehat\theta(x)$ or $\widehat\theta$ are therefore asymptotically unbiased if $h^{s+1}\sqrt{Lh^{D_M}}=o(1)$ or $h^{s+1}\sqrt L=o(1)$ respectively. The items $\mathrm{Bias}_h(\alpha|x,I)$ in the integral expression of $\mathrm{bias}_{L,\theta(x)}$ can be replaced with their limits $\mathrm{Bias}(\alpha|x,I)$ defined before Theorem 2. Theorem 4 applies to our functional Examples as follows.

Example 1 (cont'd).
Let $\widehat\theta=\widehat\theta_n/\widehat\theta_d$ be the CRRA risk aversion plug-in estimator derived from (2.15). Under the bandwidth condition of Theorem 4-(ii), $\widehat\theta_n=\theta_n+\mathrm{bias}_{L,\theta_n}+O_P\left(L^{-1/2}\right)$ and $\widehat\theta_d=\theta_d+\mathrm{bias}_{L,\theta_d}+O_P\left(L^{-1/2}\right)$. A standard linearization argument then gives that the asymptotic distribution of
$$\sqrt L\left(\widehat\theta-\theta-\frac{\theta_d\,\mathrm{bias}_{L,\theta_n}-\theta_n\,\mathrm{bias}_{L,\theta_d}}{\theta_d^2}\right)$$
is the one of
$$\frac{\theta_d\sqrt L\left(\widehat\theta_n-\theta_n\right)-\theta_n\sqrt L\left(\widehat\theta_d-\theta_d\right)}{\theta_d^2},$$
which is normal, applying Theorem 4-(ii) with, for the two bidder numbers $I_1<I_2$ used in (2.15),
$$F\left[\alpha,x,B(\alpha|x,I),B^{(1)}(\alpha|x,I);I\in\mathcal I\right]=\frac{B(\alpha|x,I_2)-B(\alpha|x,I_1)}{\theta_d}-\frac{\theta_n}{\theta_d^2}\left(\frac{\alpha B^{(1)}(\alpha|x,I_1)}{I_1-1}-\frac{\alpha B^{(1)}(\alpha|x,I_2)}{I_2-1}\right).$$
The terms $\varphi_{0I}(\alpha|x)-\partial\varphi_{1I}(\alpha|x)/\partial\alpha$ appearing in the asymptotic variances then involve the bid quantile functions $B(\alpha|x,I_1)$ and $B(\alpha|x,I_2)$, their first derivatives, and their second derivatives through $\alpha B^{(2)}(\alpha|x,I)$, which is well defined over $[0,1]$ by (2.3). Using these expressions to estimate the asymptotic variance of the CRRA risk-aversion estimator $\widehat\theta$ is difficult due to the second derivative $B^{(2)}(\alpha|x,I)$, which is difficult to estimate. Although not formally studied here, using a bootstrap procedure may be more appropriate.

Example 2 (cont'd). Theorem 4-(i) together with Theorem 3 are useful to study the plug-in estimator $\widehat{ER}(\alpha_R|x,I)$ derived from (2.17).
Theorem 4-(i) gives that the estimator of the integral component $\theta(x;\alpha_R)$ satisfies $\widehat\theta(x;\alpha_R)=\theta(x;\alpha_R)+O(h^{s+1})+O_P\left(1/\sqrt{Lh^{D_M}}\right)$, while Theorem 3 ensures that $\widehat V(\alpha|x,I)=V(\alpha|x,I)+O(h^{s+1})+O_P\left(1/\sqrt{Lh^{D_M+1}}\right)$. As the $O(h^{s+1})$ items correspond to bias terms and the $O_P(\cdot)$ ones are given by the estimation stochastic component, both $\widehat\theta(x;\alpha_R)$ and $\widehat V(\alpha_R|x,I)$ contribute to the bias of $\widehat{ER}(\alpha_R|x,I)$. The asymptotic distribution of the bias-centered $\sqrt{Lh^{D_M+1}}\left(\widehat{ER}(\alpha_R|x,I)-ER(\alpha_R|x,I)\right)$ is the one of $I\alpha_R^{I-1}(1-\alpha_R)\sqrt{Lh^{D_M+1}}\left(\widehat V(\alpha_R|x,I)-V(\alpha_R|x,I)\right)$, which follows from Theorem 3. The uniform consistency Theorem A.1 in Appendix A can be used to study the estimated screening level $\widehat\alpha_R(x,I)$ and reserve price $\widehat V(\widehat\alpha_R(x,I)|x,I)$ obtained by maximizing $\widehat{ER}(\alpha_R|x,I)$.

Example 3 (cont'd). Theorem 4-(i) is also useful to study the private value cdf and pdf estimators from Example 3, with a fixed bandwidth $\eta$. The proof carries over if $\eta$ goes to $0$ with $h=o(\eta)$, and the order of the variance given by Theorem 4-(i) is correct if $h$ is of the order of $\eta$. For the cdf estimator $\widehat F_\eta(v|x,I)=\int_0^1\mathcal I_\eta\left[v-\widehat V(\alpha|x,I)\right]d\alpha$,
$$\varphi_{0I}(\alpha|x)=-\frac{1}{\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right),\qquad\varphi_{1I}(\alpha|x)=-\frac{\alpha}{(I-1)\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right),$$
$$\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}=-\frac{1}{(I-1)\eta}k\left(\frac{v-V(\alpha|x,I)}{\eta}\right)+\frac{\alpha}{(I-1)\eta^2}k^{(1)}\left(\frac{v-V(\alpha|x,I)}{\eta}\right)V^{(1)}(\alpha|x,I).$$
When $\eta$ goes to $0$, the dominant part of the variance is, for inner $v$, integrating by parts and setting $V_{x,I}=V(A|x,I)$,
$$\frac{I}{Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\left(\int_A^1\frac{\partial\varphi_{1I}(\alpha|x)}{\partial\alpha}\mathcal P(\alpha|I)^{-1}d\alpha\right)\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I}{Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\varphi_{1I}(A|x)\frac{\partial\mathcal P(A|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I}{(I-1)^2Lh^{D_M}}\mathrm{Tr}\left\{\mathrm{Var}\left[\frac{F(V_{x,I}|x,I)}{f(V_{x,I}|x,I)}\frac{k\left(\frac{v-V_{x,I}}{\eta}\right)}{\eta}\frac{\partial\mathcal P(F(V_{x,I}|x,I)|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M/2}P(x)\right]\right\}$$
$$=(1+o(1))\frac{I\int k^2(t)dt}{(I-1)^2L\eta h^{D_M}}\left(\frac{F(v|x,I)}{f(v|x,I)}\right)^2\mathrm{Tr}\left\{\frac{\partial\mathcal P(F(v|x,I)|I)^{-1}}{\partial\alpha}\mathcal P(I)^{1/2}h^{D_M}P(x)P(x)'\mathcal P(I)^{1/2}\frac{\partial\mathcal P(F(v|x,I)|I)^{-1}}{\partial\alpha}\right\}.$$
Hence the order of the variance of $\widehat F_\eta(v|x,I)$ is $1/\left(L\eta h^{D_M}\right)$. Its bias as an estimator of $F(v|x,I)$ has two components: the first is $\mathrm{bias}_{L,F_\eta(v|x,I)}$, due to the bias of $\widehat V(\alpha|x,I)$, and is of order $O(h^{s+1})$, while the second is $F_\eta(v|x,I)-F(v|x,I)=O(\eta^{s+1})$ if $k(\cdot)$ is a kernel of order $s+1$. It follows that the optimal bandwidths $h$ and $\eta$ must have the same order $L^{-1/(2s+D_M+3)}$, which gives the consistency rate $L^{-(s+1)/(2s+D_M+3)}$. Repeating these steps for the pdf estimator $\widehat f_\eta(v|x,I)$ gives the optimal consistency rate $L^{-s/(2s+D_M+3)}$ which, up to a logarithmic term, corresponds to the GPV optimal minimax rate in the presence of $D_M$ covariates.

5 Simulation experiments

This section reports the results of a simulation experiment for the AQR estimation of the private value quantile function, the expected revenue and the optimal reserve price under risk neutrality, from first-price auctions with $I=2$.
A second simulation experiment considersestimation of risk aversion based on comparison of first-price auctions with I = 2 and I = 3as in (2.15) and on comparison with first-price and ascending auctions with I = 2. In eachcase, the considered number of auctions is L = 100 and the number of replications is 1 , αB (1) ( α | x, I ) / ( I − I = 2 corresponds to a worst case scenario. By contrast,the simulation experiment in GPV considers I = 5 while I = 3 or 5 in Marmer and Shneyerov(2012) and Ma, Marmer and Shneyerov (2018). The number of bids in these references rangefrom 1 , 000 for GPV to 4 , 200 for Marmer and Shneyerov (2012). In a simulation experi-ment focused on the nonparametric estimation of the utility function of risk averse bidders,Zincenko (2018) considers I = 2 with L = 300 and I = 4 with L = 150. Our simulationexperiment is therefore more focused on small samples. We also use three covariate whilethe aforementioned simulation experiments do not consider covariate, with the exception ofZincenko (2018) who increases the number of auctions to L = 900 for one or two covariatesto cope with the curse of dimensionality. The private value quantile function is given by a quantile regression model with an interceptand three independent covariates with the uniform distribution over [0 , V ( α | x ) = γ ( α ) + γ ( α ) x + γ ( α ) x + γ ( α ) x γ ( α ) = 1 + 0 . α − , γ ( α ) = 1 ,γ ( α ) = 0 . − exp( − α )) , γ ( α ) = 0 . . π + 1) α + cos(2 πα )) . The coefficient γ ( · ) is flat near 0 and fastly increases near 1, as observed in the applicationdisplayed in the next section, while γ ( · ) fastly increases near 0 and is flat after. Thederivative of γ ( · ) has some oscillating patterns.The expected revenue ER ( α ) is computed from (2.17) setting the intercept, x and x to 0 and taking x = 0 . 8. This choice gives a unique optimal reserve price achieved for α = . 
3, which is not too close to the boundaries so that the expected revenue function hasa substantial concave shape which is suppose to make estimation more difficult. The private value quantile regression is estimated from a sample of 100 first-price auctionswith two bids over the estimation grid α = 0 , . , . . . , . , (cid:98) V ( α | x ) of order 2 and kernel K ( t ) = 6 t (1 − t ) I ( t ∈ [0 , (cid:100) ER ( α ) plugs 0 . (cid:98) γ ( α ) into (2.17) using Riemann sums to com-pute integrals. The optimal screening level (cid:98) α ∗ maximizes (cid:100) ER ( α ) over the grid and is usedto compute the estimated optimal reserve price (cid:98) R ∗ = . (cid:98) γ ( (cid:98) α ∗ ) and the estimated optimalrevenue (cid:100) ER ∗ = (cid:100) ER ( (cid:98) α ∗ ).Table 1 summarizes the simulation results for the estimation of the private value quantilefunction, the expected revenue and the optimal reserve price. The Bias and Square RootIntegrated Mean Squared Error (RIMSE) lines for (cid:98) V ( ·|· ) gives the simulation counterpartsof, respectively (cid:32) (cid:88) j =0 (cid:90) ( E [ (cid:98) γ j ( α )] − γ j ( α )) dα (cid:33) / and (cid:32) (cid:88) j =0 (cid:90) E (cid:2) ( (cid:98) γ j ( α ) − γ j ( α )) (cid:3) dα (cid:33) / . . , . , . . . , . h . . . . . . . . (cid:98) V ( ·|· ) Bias .131 .141 .143 .145 .150 .159 .166 .176RIMSE .433 .386 .355 .332 .322 .309 .303 .305 (cid:100) ER ( · ) Bias .036 .044 .049 .050 .051 .049 .047 .045RIMSE .109 .104 .102 .100 .099 .098 .097 .096 (cid:98) R ∗ Bias -.036 -.031 -.014 -.002 .009 .022 .037 .043RMSE .129 .099 .075 .067 .062 .064 .066 .066Table 1: Private value quantile function, expected revenue, and optimal reserve priceFigure 2: Private value quantile estimation for h = 0 . h = 0 . V ( α | x ) = γ ( α ) + ( γ ( α ) + γ ( α ) + γ ( α )) / . − . 
5% and 97.5% quantiles of V̂(α|x) across 1,000 simulations.

Estimation of the private value slope coefficients seems much more sensitive to the bandwidth parameter than the expected revenue or the optimal reserve price, and it also has a much higher RIMSE. The bandwidth behavior of V̂(α|x) is illustrated in Figure 2, which considers a small bandwidth together with the large bandwidth h = 0.8.

Figure 3: Expected revenue estimation for a small and a large bandwidth. Full black line: true ER(α|x); dashed red line: average estimate; dotted red lines: pointwise 2.5% and 97.5% quantiles of ÊR(α|x) across 1,000 simulations.

As expected from Theorem 3, the variance of V̂(α|x) increases with α and decreases with h, while the bias increases with both α and h. Figure 2 also suggests that choosing a large bandwidth, as recommended by Table 1, may lead to important bias issues, including underestimation of the private value quantile function for high α. This contrasts with the estimation of the expected revenue and optimal reserve price, which seems mostly unaffected by the bandwidth. This is because the expected revenue depends upon (1 − α)V(α|x): multiplying the private value quantile function by (1 − α) mitigates the larger bias and variance near the boundary α = 1; see also Figure 3. For the considered experiment, the true expected revenue always lies within the 95% band of Figure 3, while the true private value quantile function lies outside it for large α when h = 0.8.

CRRA risk aversion

Two risk aversion estimators are considered. The first estimator θ̂_fp is based upon (2.15) and uses two independent samples of size L = 100 with 2 and 3 bidders from the model above, which corresponds to a CRRA utility function x^θ with θ = 1. Integrals with respect to α are computed using Riemann sums, whereas integrals with respect to x are replaced with sample means over the two auction samples.
The second estimator θ̂_asc is based upon (2.16) and uses an additional sample of size L = 100 of ascending auctions with two bidders. In this case it is possible to consider various values of θ, and the simulation experiment considers several values starting with 0.2. (The optimal bid functions can be computed explicitly in the risk neutral case θ = 1; considering other values of θ would require numerical computation of the bid functions.) If B(α|x) is the first-price auction bid quantile function with I = 2, the observed bids drawn from B(α|x) are rationalized by a CRRA utility function x^θ if the private value quantile function is set to V_θ(α|x) = B(α|x, 2) + θαB^(1)(α|x, 2), which satisfies V_θ^(1)(·|x) > 0, as seen from Campo et al. (2011) and (2.14) here. As V_θ^(1)(·|·) > 0, it is possible to draw from V_θ(α|x) to generate two ascending bids for each auction. Following Gimenes (2017), V_θ(α|x) can be estimated from the winning bids in these ascending auctions using AQR at quantile level 2α − α² instead of α.

The performance of the two estimators is summarized in Table 2.

[Table 2: Bias and RMSE of θ̂_fp and θ̂_asc across bandwidths h.]

Table 2 shows that θ̂_asc, which combines first-price and ascending auctions as in Lu and Perrigne (2008), dominates θ̂_fp in this experiment. While the RMSE and bias of θ̂_asc do not seem sensitive to h, this is not the case for θ̂_fp, which has a large downward bias, and hence RMSE, for small h. Further investigation suggests this is due to an unbalanced variable issue: the difference B̂(α|x, 3) − B̂(α|x, 2) is very smooth, while α(B̂^(1)(α|x, 3)/2 − B̂^(1)(α|x, 2)) is more erratic, especially when α is close to 1. This issue is addressed in the application by restricting α to [0, 0.8] for risk aversion estimation.

This section illustrates empirically the methodology using data from timber auctions run by the US Forest Service (USFS).
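The quantile-level change used for the ascending auctions can be sketched numerically. With two bidders the winning bid equals the lower of the two private values, whose rank A = min(U_1, U_2) has cdf 2a − a², so the value quantile at level α equals the winning-bid quantile at level 2α − α². The bid quantile function B and its derivative below are illustrative placeholders, not the paper's estimates.

```python
import numpy as np

def B(alpha):            # placeholder first-price bid quantile function, I = 2
    return 0.5 * alpha + 0.25 * alpha**2

def dB(alpha):           # its derivative with respect to the quantile level
    return 0.5 + 0.5 * alpha

def V_theta(alpha, theta, I=2):
    # Private values rationalizing the bids under CRRA utility x**theta:
    # V_theta(a|x) = B(a|x, I) + theta * a * B^(1)(a|x, I) / (I - 1), cf. (2.14).
    return B(alpha) + theta * alpha * dB(alpha) / (I - 1)

# Simulate ascending auctions: the winning bid is the lower of the two values,
# i.e. V_theta evaluated at the lower of two independent uniform ranks.
rng = np.random.default_rng(1)
L, theta = 5000, 0.5      # large L for illustration (the paper uses L = 100)
a_win = np.minimum(rng.uniform(size=L), rng.uniform(size=L))
winning_bids = V_theta(a_win, theta)

# Recover V_theta(alpha) from the winning-bid quantile at level 2*alpha - alpha**2.
alpha = 0.5
v_hat = np.quantile(winning_bids, 2 * alpha - alpha**2)
```

Here v_hat should be close to V_theta(0.5, 0.5) = 0.5; in the paper the same quantile-level change is applied within AQR rather than to raw empirical quantiles.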
Timber auction data have been used in several empirical studies (see Athey and Levin (2001), Athey, Levin and Seira (2011), Li and Zheng (2012), Aradillas-Lopez, Gandhi and Quint (2013), among others). Other works have investigated risk aversion in timber auctions (e.g., Lu and Perrigne (2008), Athey and Levin (2001), Campo et al. (2011)). The data set used here is from Lu and Perrigne (2008) and Campo et al. (2011), and aggregates auctions held in 1979 in the states covering the western half of the United States (regions 1–6 as labeled by the USFS). It contains the bids and a set of variables characterizing each timber tract, including the estimated volume of the timber, measured in thousands of board feet (mbf), and its estimated appraisal value, given in dollars per unit of volume. We consider the 107 first-price auctions with two bidders, the first-price auctions with three bidders (L = 108), and the ascending auctions with two bidders (L = 241). The considered covariates are the appraisal value and the timber volume, both taken in log. The rest of the application uses a quantile regression model for the private value, which is estimated via AQR of order 2 with kernel K(t) = 6t(1 − t)I(t ∈ [0, 1]) and bandwidth h in {0.1, 0.2, ..., 0.8}. Confidence intervals are computed using a pairwise bootstrap.

Bid quantile functions. Table 3 gives the coefficients of a regression of the bids on these variables; the dependent variables are the bids for the first-price auctions, while the winning bid is used for the ascending auctions.

[Table 3: Bid regressions. Columns: Intercept, Volume, Appraisal value, R². Rows: first-price with I = 2, first-price with I = 3, ascending with I = 2; standard errors in parentheses.]

The appraisal value coefficient is close to 1 in all auctions, but it differs when comparing the first-price auction with I = 2 with the one with I = 3 and with the ascending auction. Similarly, the volume coefficient of the first-price auction with I = 2 differs from the one with I = 3 at the 10% level, and also at the 5% level when using a one-sided test.
These findings are consistent with a quantile regression specification with non-constant coefficients for these two variables. The intercept coefficients of the first-price auctions with I = 2 and I = 3 are not statistically distinct at the 5% level. Such a dependence of the slope coefficients on I is not compatible with the homogenized bid regression model V = γ_0 + X′γ + v with v independent of X: for this model, the volume and appraisal value coefficients obtained from a bid regression should not depend upon I under entry exogeneity, as discussed in Section 2.2.

Figure 4 sums up the quantile regression analysis of the first-price auction bids with I = 2. The difference between the AQR volume slope and the regression coefficient is consistently outside the pointwise 90% bootstrap confidence interval, a finding which holds for all bandwidths. The case of the appraisal value is more difficult. The difference between the estimated regression coefficient and the AQR lies outside the confidence bands between α = 90% and α = 1, due to a strong increase of the AQR, but this only holds for the smallest bandwidths. Figure 4 also reports the standard quantile regression, which exhibits a similar pattern. This suggests a potential AQR bias issue for larger h. The intercept does not seem to depend upon α, as suggested by the standard QR estimation. Therefore, the intercept will be kept constant and set to its estimated value from Table 3 in the rest of the application. Comparison of the augmented …

Figure 4: Two-bidder first-price auction bid quantile slope coefficients: intercept (left), volume (center) and appraisal value (right). AQR with h = 0.…

Risk aversion. The two risk aversion estimators look insensitive to the bandwidth, producing a risk aversion estimate around 0.85 for θ̂_fp and 0.… for θ̂_asc. The bootstrap median of θ̂_fp reported in Table 4 suggests that the distribution of θ̂_fp is asymmetric, with a median around 0.75, slightly above that of θ̂_asc.
These risk aversion estimates are similar to those obtained by Lu and Perrigne (2008) and Campo et al. (2011). The bootstrap 90% confidence intervals in Table 4 suggest a much higher dispersion than the ones reported by these authors from asymptotic variance estimations. In particular, it is not possible to reject risk neutrality.

[Table 4: 5%, 50% and 95% bootstrap quantiles of θ̂_fp and θ̂_asc, h = 0.1, ..., 0.8.]

Figure 5: AQR estimation (full line), regression (full straight line) and 5% and 95% bootstrap quantiles, h = 0.3.

Private value quantile function and expected revenue. This section reports estimation results under risk neutrality for first-price auctions with two and three bidders and for ascending auctions. Figure 5 gives the private value slope functions of the volume and appraisal variables. The volume slope functions differ from the corresponding OLS coefficients for all auctions and all the considered bandwidths. Their shape however varies across auctions: while convex and in the [20, …] range in the first-price case, it is in the [8, …] range for the ascending auctions. The appraisal value slope is close to 1 for α near 0, suggesting that low type bidder valuations of timber lots are very close to the appraisal value. This contrasts with high type bidders with higher α, whose markup can be very high, in a significant way in the case of the ascending auction. This illustrates again the important difference between low type and high type bidders. A possible discrepancy between first-price and ascending auctions with two bidders also appears in the expected revenue, computed for median values of the two explanatory variables; see Figure 6.

Figure 6: Estimated expected revenue for first-price (full line) and ascending (diamond) auctions with two bidders, with pointwise 5% and 95% bootstrap quantiles in dashed lines.

The ascending auction expected revenue is always below the first-price one. This seems statistically significant for high quantile levels. This may not be relevant for the seller, as the optimal revenue is achieved over a wide range [0, 0.5] of quantile levels over which the two expected revenue curves seem flat. This feature, which appears for all the considered bandwidths, suggests again that the private value quantile function of bidders participating in first-price auctions is higher than the one for ascending auctions. Note also that the bootstrap confidence bands for the first-price auctions are larger than for the ascending ones, as for all the estimations reported in this application.

This paper has presented a quantile regression modeling strategy for first-price auctions with risk-neutral bidders under the independent private value paradigm. For a conditional private value quantile function given by a quantile regression, the conditional bid quantile function is also a quantile regression. Detecting the quantile regression slopes which are not constant can be done by looking at the corresponding bid quantile regression slopes or, with less rigor, at the variation of the corresponding homogenized bid regression coefficient with respect to the number of bidders, which is also a consistent estimator of a constant private value slope. Non-constant private value slope functions can be recovered from their bid counterparts and their derivatives with respect to the quantile level. The latter can be estimated using the augmented quantile regression proposed in this paper, which applies local polynomial techniques to estimate jointly the bid quantile slopes and their derivatives. This approach is found to work well both in simulations and in a timber auction application, where a strong low type/high type bidder heterogeneity is detected. This can be interpreted as caused by heterogeneous bidder abilities to transform the auctioned timber lots into more valuable goods.
An empirical finding is that the seller expected revenue in a median auction is higher in first-price than in ascending auctions. The estimated expected revenue curves look flat for reserve prices below a quite large threshold, including the optimal ones. This suggests that the choice of a reserve price may not be important, at least for the median auction considered in the application. A new local polynomial estimation procedure for bid quantile regression and its quantile level derivatives is proposed to implement this strategy. It is based on a smoothed objective function which produces smooth estimates, as illustrated in the simulations and the empirical application. The auction modeling strategy also applies to unspecified quantile functions thanks to linear sieve methods. This also allows one to consider flexible and parsimonious specifications such as additive quantile functions. The proposed private value quantile estimator converges with nonparametric rates, mimicking the fast optimal ones achieved in the absence of covariates for a quantile regression, or with a univariate covariate for an additive quantile specification. Various functionals of the private value quantile function are considered, such as the expected revenue, the private value conditional cdf and pdf, or risk aversion for bidders with a common CRRA utility function. Much work remains to be done. The asymptotic distributions derived for the proposed estimators often have a complicated variance, so it may be wiser to use bootstrap inference, as in the empirical application. The risk aversion estimator exhibits a quite large variance, suggesting that a better understanding of efficiency issues is needed. Various extensions can also be considered. The quantile approach can be extended to exchangeable affiliated values as considered in Hubbard, Li and Paarsch (2012). The quantile regression with unobserved variables estimation method of Wei and Carroll (2009) can be used to tackle unobserved heterogeneity as in Krasnokutskaya (2011).
The quantile identification and estimation strategy can also be modified to deal with endogenous entry, caused for instance by a reserve price as in Guerre, Perrigne and Vuong (2000) or by entry costs as considered by Marmer, Shneyerov and Xu (2013a) or Gentry and Li (2014).

References

[1] Andrews, D.W.K. & Y.J. Whang (1990). Additive interactive regression models: circumventing the curse of dimensionality. Econometric Theory, 6, 466–479.
[2] Athey, S. & J. Levin (2001). Information and competition in U.S. Forest Service timber auctions. Journal of Political Economy, 375–417.
[3] Athey, S., J. Levin & E. Seira (2011). Comparing open and sealed bid auctions: evidence from timber auctions. The Quarterly Journal of Economics, 207–257.
[4] Aradillas-Lopez, A., A. Gandhi & D. Quint (2013). Identification and inference in ascending auctions with correlated private values. Econometrica.
[5] Aryal, G., M.F. Gabrielli & Q. Vuong (2016). Semiparametric estimation of first-price auction models. CONICET and Universidad Nacional de Cuyo, University of Virginia.
[6] Bassett, G. & R. Koenker (1982). An empirical quantile function for linear models with iid errors. Journal of the American Statistical Association, 407–415.
[7] Belloni, A., V. Chernozhukov, D. Chetverikov & I. Fernández-Val (2017). Conditional quantile processes based on series or many regressors. arXiv:1105.6154v3.
[8] Campo, S., E. Guerre, I. Perrigne & Q. Vuong (2011). Semiparametric estimation of first-price auctions with risk-averse bidders. Review of Economic Studies, 78, 112–147.
[9] Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. Chap. 76 in Handbook of Econometrics, Vol. 6B. Elsevier.
[10] Chernozhukov, V., I. Fernández-Val & A. Galichon (2010). Quantile and probability curves without crossing. Econometrica, 1093–1125.
[11] Daubechies, I. (1992). Ten lectures on wavelets. SIAM.
[12] Dette, H. & S. Volgushev (2008). Non-crossing non-parametric estimates of quantile curves.
Journal of the Royal Statistical Society: Series B , 609–627.4513] Enache, A. & J.P. Florens (2015). A quantile approach for the estimation of first-price private value auction. Working Paper, Paris School of Economics.[14] Fan, J. & I. Gijbels (1996). Local polynomial modeling and its applications. Chapmanand Hall/CRC.[15] Gentry, M. & T. Li (2014). Identification in auctions with selective entry. Econo-metrica , 315–344.[16] Gimenes, N. (2017). Econometrics of ascending auction by quantile regression. Reviewof Economics and Statistics , 944–953.[17] Guerre, E., I. Perrigne & Q. Vuong (2000). Optimal nonparametric estimationof first-price auctions. Econometrica , 525–574.[18] Guerre, E., I. Perrigne & Q. Vuong (2009). Nonparametric identification of riskaversion in first-price auctions under exclusion restrictions. Econometrica , 1193–1227.[19] Guerre, E. & C. Sabbah (2012). Uniform bias study and Bahadur representation forlocal polynomial estimators of the conditional quantile function. Econometric Theory , 87–129.[20] H¨ardle, W., G. Kerkyacharian, D. Picard & A. Tsybakov (1998). Wavelets,approximation and statistical applications. Springer.[21] Haile, P.A., H. Hong & M. Shum (2003). Nonparametric tests for common valuesin first-price sealed-bid auctions. Cowles Foundation discussion paper.[22] Hickman, B.R. & T.P. Hubbard (2015). Replacing sample trimming with bound-ary correction in nonparametric estimation of first-price auctions. Journal of AppliedEconometrics , , 736-762. 4623] Hirano, K. & J.R. Porter (2003). Asymptotic efficiency in parameter structuralmodels with parameter-dependent support. Econometrica , , 1307–1338.[24] Horowitz, J.L. & S. Lee (2005). Nonparametric estimation of an additive quantileregression model. Journal of the American Statistical Association , 1238–1249.[25] Hubbard, T.P., T. Li & H.J. Paarsch (2012). Semiparametric estimation in modelsof first-price, sealed-bid auctions with affiliation. 
Journal of Econometrics , , 4–16.[26] Koenker, R. (2005). Quantile regression . Cambridge University Press.[27] Koenker, R. & G. Bassett (1978). Regression quantiles. Econometrica, , 33–50.[28] Krasnokutskaya, E. (2011). Identification and estimation of auctions models withunobserved heterogeneity. Review of Economic Studies, , 293–327.[29] Laurent, B. (1997). Estimation of integral functionals of a density and its derivatives. Bernoulli, , 181–211[30] Li, T., I. Perrigne & Q. Vuong (2003). Semiparametric estimation of the optimalreserve price in first-price Auctions, Journal of Business & Economic Statistics Li, T. & X. Zheng (2012). Information acquisition and/or bid preparation: A struc-tural analysis of entry and bidding in timber sale auctions. Journal of Econometrics , 29–46.[32] Liu, N. & Y. Luo (2017). A nonparametric test of exogenous participation in first-priceauctions. International Economic Review , 857–887[33] Liu, N. & Q. Vuong (2018). Nonparametric test of monotonicity of bidding strategyin first price auctions. Working paper. 4734] Lu, J. & I. Perrigne (2008). Estimating risk aversion from ascending and sealed-bid auctions: the case of timber auction data. Journal of Applied Econometrics ,871–896.[35] Luo, Y. & Y. Wan (2018). Integrated-Quantile-Based Estimation for First-Price Auc-tion Models. Journal of Business & Economic Statistics , 173-180.[36] Ma, J., Marmer, V., & A. Shneyerov (2018). Inference for first-price auctionswith Guerre, Perrigne and Vuong’s estimator. Working paper,[37] Marmer, V., & A. Shneyerov (2012). Quantile-based nonparametric inference forfirst-price auctions. Journal of Econometrics , 345–357.[38] Marmer, V., A. Shneyerov & P. Xu (2013a). What model for entry in first-priceauctions? A nonparametric approach. The Journal of Econometrics , , 46–58.[39] Marmer, V., A. Shneyerov & P. Xu (2013b). What model for entry in first-priceauctions? A nonparametric approach. Supplementary material. 
Website of The Journalof Econometrics .[40] Maskin, E. & J.G. Riley (1984). Optimal auctions with risk averse buyers. Econo-metrica Menzel, K. & P. Morganti (2013). Large sample properties for estimators basedon the order statistics approach in auctions. Quantitative Economics, Milgrom, P.R. (2001). Putting auction theory to work. Cambridge University Press.[43] Milgrom, P.R. & R.J. Weber (1982). A theory of auctions and competitive bidding. Econometrica , , 1089–1122.[44] Paarsch, H.J. & H. Hong (2006). An introduction to the structural econometrics ofauction data . MIT Press. 4845] Rezende, L. (2008). Econometrics of auctions by least squares. Journal of AppliedEconometrics , 925–948.[46] Schumaker, L.L. (2007). Spline functions: basic theory. Cambridge University Press.[47] Wei, Y. & R.J. Carroll (2009). Quantile regression with measurement error. Jour-nal of the American Statistical Association , 1129–1143.[48] Zincenko, F. (2018). Nonparametric estimation of first-price auctions with risk-aversebidders. Journal of Econometrics , 303–335.49 nline Appendix A: Sieve assumption and uniform con-sistency resultsA.1 High-level sieve assumption Section 4.1.2 suggests to use spline or wavelet but our results hold for more general sievechoices satisfying the high level Assumption R. The first key condition is the followingapproximation property. Approximation property S . 
For each function V(α; x) with D_M interactions as in (2.9), (s + 1)-times continuously differentiable over [0, 1] × X, there exist coefficients γ_k(·) = γ_kK(·), (s + 1)-times continuously differentiable over [0, 1] with equicontinuous γ_kK^(s+1)(·), such that

sup_{(α,x) ∈ [0,1]×X} | V(α; x) − Σ_{k=1}^{K} γ_k(α) P_k(x) | = o(K^{−(s+1)/D_M}),   (A.1.1)

sup_{(α,x) ∈ [0,1]×X} | ∂^p V(α; x)/∂α^p − Σ_{k=1}^{K} γ_k^(p)(α) P_k(x) | = o(1),   p = 1, . . . , s + 1.   (A.1.2)

Note that K ≍ h^{−D_M} under Assumption H. Chen (2007) gives a O(K^{−(s+1)/D_M}) rate for standard sieve methods and functions with s + 1 bounded derivatives, which is comparable to the rate in (A.1.1). The rate o(K^{−(s+1)/D_M}) holds for functions with continuous derivatives of order s + 1 for multivariate B-splines (Schumaker, 2007) of order s + 1 as in (4.4), or multivariate wavelets generated by a father wavelet function p(·) of order s + 1; see Härdle et al. (1998), Chen (2007) and the references therein, in particular Daubechies (1992). These two sieves also satisfy (A.1.2), as the corresponding coefficients γ_k(·) can be written as ∫_X λ_k(x) V(α; x) dx for well-chosen λ_k(·) = λ_kK(·) satisfying sup_K ∫_X |λ_k(x)| dx < ∞. The high-level sieve assumption considered in our results is as follows.

Assumption R The sieve satisfies the Approximation property S. In the AQR case the matrices E[I(I_ℓ = I) X_ℓ X_ℓ′], I in I, are full rank, and in the ASQR case (i) the eigenvalues of the Gram matrix ∫_X P(x) P(x)′ dx stay bounded away from 0 and infinity when the dimension K of P(·) increases, and max_{x∈X} ‖P(x)‖ = O(K^{1/2}).
(ii) The sieve {P_k, 1 ≤ k ≤ K} is composed of localized functions, in the sense that there is a c > 0 such that P_{k_1}(·) P_{k_2}(·) = 0 as soon as |k_1 − k_2| > c, with

max_{k ≤ K} { ∫_X |P_k(x)| dx } = O(K^{−1/2}).

(iii) For some η in (0, 1] and K_L with log K_L = O(log L), it holds that ‖P(x) − P(x′)‖ ≤ K_L ‖x − x′‖^η for all x, x′ of X.

Assumption R first imposes well-conditioned matrices, E[I(I_ℓ = I) X_ℓ X_ℓ′] for the AQR case and ∫_X P(x) P(x)′ dx for the ASQR case. The rest of Assumption R holds for the sieve (4.4), as

max_{x∈X} ‖P(x)‖ = O(h^{−D_M/2}),   max_{k ≤ K} { ∫_X |P_k(x)| dx } = O(h^{D_M/2}),

with K ≍ h^{−D_M} by Assumption H. Assumption R-(iii) holds when the order K of the sieve (4.4) grows at a polynomial rate, provided q(·) is Hölder with exponent η. This allows for cardinal B-splines, for which η = 1, but also for wavelets, which are not always differentiable but Hölder with η < 1; see Daubechies (1992).

A.2 Uniform consistency rates

The next theorem deals with the uniform consistency of the ASQR procedure.

Theorem A.1 Suppose that the private value conditional quantile function V(·|·) is a quantile regression (2.5) or a sieve quantile regression (2.10) with D_M interactions. Then, under Assumptions A, H, S and R with s ≥ D_M/2 and log L / (L h^{D_M+1+(D_M∨1)}) = O(1), it holds that

sup_{(α,x,I) ∈ [0,1]×X×I} | V̂(α|x, I) − V(α|x, I) | = O_P( (log L / (L h^{D_M+1}))^{1/2} + h^{s+1} ),

sup_{(α,x,I) ∈ [0,1]×X×I} | B̂(α|x, I) − B(α|x, I) | = O_P( (log L / (L h^{D_M}))^{1/2} ) + o(h^{s+1}).
The bandwidth condition used in Theorem A.1 is similar to the one of Theorem 3 and allows an optimal bandwidth of order (log L/L)^{1/(2s+D_M+3)} provided the smoothness s is large enough. Under this condition the uniform consistency rate of the private value conditional quantile estimator is

(log L/L)^{(s+1)/(2s+D_M+3)},

which coincides with the GPV optimal minimax uniform consistency rate for the estimation of the private value conditional cdf in the presence of D_M covariates. (GPV consider the pdf, but the rate for the cdf or the quantile function can be derived similarly.) Theorem A.1 also includes a uniform consistency rate for the bid conditional quantile function estimator, which can be used to estimate the bidders' signals and private values.

References

[1] Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. Chap. 76 in Handbook of Econometrics, Vol. 6B. Elsevier.
[2] Daubechies, I. (1992). Ten lectures on wavelets. SIAM.
[3] Härdle, W., G. Kerkyacharian, D. Picard & A. Tsybakov (1998). Wavelets, approximation and statistical applications. Springer.
[4] Schumaker, L.L. (2007). Spline functions: basic theory. Cambridge University Press.

Online Appendix B: Notations and intermediary results

We start with additional notations used all along the proofs and some preliminary lemmas which are established in Appendix F. In what follows,

P(x) = [1, x′]′ in the AQR case (K = D + 1), and P(x) = [P_1(x), . . . , P_K(x)]′ in the ASQR case,

allowing a unified treatment of the two estimators, although the proofs focus on the more difficult ASQR case.
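In the ASQR case, P(x) can be built from tensor products of univariate cardinal B-splines, as in the sieve (4.4). A self-contained numpy sketch of the univariate basis via the Cox–de Boor recursion follows; the knot layout and spline order are illustrative choices, not the paper's.

```python
import numpy as np

def bspline_basis(x, knots, order):
    # Cox-de Boor recursion: values of all B-splines of the given order
    # (degree = order - 1) at points x, for a non-decreasing knot vector.
    x = np.atleast_1d(x)
    B = ((knots[:-1][None, :] <= x[:, None]) &
         (x[:, None] < knots[1:][None, :])).astype(float)   # order-1 indicators
    for k in range(2, order + 1):
        new = np.zeros((x.size, len(knots) - k))
        for j in range(len(knots) - k):
            left = knots[j + k - 1] - knots[j]
            right = knots[j + k] - knots[j + 1]
            if left > 0:
                new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                new[:, j] += (knots[j + k] - x) / right * B[:, j + 1]
        B = new
    return B

# Cardinal cubic B-spline sieve on [0, 1]: equally spaced interior knots,
# clamped (repeated) boundary knots.
m, order = 8, 4                       # m subintervals, cubic splines
knots = np.concatenate([np.zeros(order - 1), np.linspace(0.0, 1.0, m + 1),
                        np.ones(order - 1)])
P = bspline_basis(np.linspace(0.0, 0.999, 200), knots, order)  # (200, K) design
```

The resulting K = m + order − 1 basis functions are nonnegative, locally supported (so products of distant basis functions vanish, as in Assumption R-(ii)), and sum to one on [0, 1).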
Recall that (cid:107) P ( x ) (cid:107) = (cid:0) P ( x ) (cid:48) P ( x ) (cid:1) / is the standard Euclidean normand that, under Assumptions R-(i) and H-(ii),max x ∈X (cid:107) P ( x ) (cid:107) = O (cid:0) K / (cid:1) = O (cid:0) h − D M / (cid:1) , max ( x,t ) ∈X × [ − , (cid:107) P ( x, t ) (cid:107) = O (cid:0) h − D M / (cid:1) , with D M = 0 in the AQR case. Recall that P ( x, ht ) = π ( ht ) ⊗ P ( x ) , π ( ht ) (cid:48) = (cid:34) , ht, . . . , ( ht ) s +1 ( s + 1)! (cid:35) so that the “design” matrix E (cid:2) P ( x (cid:96) , ht ) P ( x (cid:96) , ht ) (cid:48) (cid:3) degenerates asymptotically. To avoidthis, consider the change of parameters b = Hb with H = Diag (1 , . . . , h s +1 ) ⊗ Id K , b = β , , . . . , β ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) = β (cid:48) , hβ , , . . . , hβ ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) = hβ (cid:48) , . . . , h s +1 β s +1 , , . . . , h s +1 β s +1 ,K (cid:124) (cid:123)(cid:122) (cid:125) b (cid:48) s +1 = h s +1 β s +1 (B.1)54o that P ( x (cid:96) , ht ) (cid:48) β = P ( x (cid:96) , t ) (cid:48) b . Define accordingly (cid:98) R ( b ; α, I ) = 1 LIh L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) ρ a (cid:18) B i(cid:96) − P (cid:18) x (cid:96) , a − αh (cid:19) (cid:48) b (cid:19) K (cid:18) a − αh (cid:19) da = 1 LI L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) − α h − α h ρ a + ht (cid:0) B i(cid:96) − P ( x (cid:96) , t ) (cid:48) b (cid:1) K ( t ) dt, R ( b ; α, I ) = E (cid:104)(cid:98) R ( b ; α, I ) (cid:105) . Note that b → (cid:82) − α h − α h ρ a + ht (cid:0) B i(cid:96) − P ( x (cid:96) , t ) (cid:48) b (cid:1) K ( t ) dt is convex as an integral of convex func-tions. 
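The smoothed objective R̂(b; α, I) above integrates the check function ρ_a(u) = u(a − 1{u < 0}) over kernel-shifted quantile levels. A Python sketch for the covariate-free case P(x) = 1, with the kernel K(t) = 6t(1 − t) on [0, 1] used in the simulations and a generic optimizer in place of the paper's computational strategy, is:

```python
import numpy as np
from math import factorial
from scipy.optimize import minimize

def rho(a, u):
    # Koenker-Bassett check function rho_a(u) = u * (a - 1{u < 0})
    return u * (a - (u < 0))

def aqr_objective(b, bids, alpha, h, s, n_grid=41):
    # Covariate-free version of R-hat(b; alpha, I): average over bids i and
    # levels t of rho_{alpha+ht}(B_i - pi(t)'b) K(t), with
    # pi(t) = (1, t, ..., t^{s+1}/(s+1)!) and K(t) = 6t(1-t) on [0, 1].
    t = np.linspace(0.0, min(1.0, (1.0 - alpha) / h), n_grid)
    K = 6.0 * t * (1.0 - t)
    pi = np.vstack([t**p / factorial(p) for p in range(s + 2)]).T
    u = bids[:, None] - (pi @ b)[None, :]
    return np.mean(rho(alpha + h * t, u) * K)

# Toy check: uniform bids have bid quantile function B(a) = a, so the fitted
# intercept should be close to B(alpha) = alpha.
rng = np.random.default_rng(2)
bids = rng.uniform(size=2000)
res = minimize(aqr_objective, x0=np.array([np.median(bids), 0.0, 0.0]),
               args=(bids, 0.5, 0.2, 1), method="Nelder-Mead")
b_hat = res.x
```

The objective is convex in b, being a nonnegatively weighted average of check functions of affine arguments; the slope block of the minimizer estimates hB^(1)(α), consistent with the rescaling b = Hβ above.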
It follows that (cid:98) R ( b ; α, I ) and R ( b ; α, I ) have minimizers, (cid:98) b ( α | I ) = arg min b (cid:98) R ( b ; α, I ) = H (cid:98) β ( α | I ) , b ( α | I ) = arg min b R ( b ; α, I ) , which uniqueness will be established in the next section. Set b ( α | I ) = H − b ( α | I ) recalling b ( α | I ) = (cid:104) β ( α | I ) (cid:48) , . . . , β (cid:48) s +1 ( α | I ) (cid:105) (cid:48) and define B ( α | x, I ) = P ( x ) (cid:48) β ( α | I ) ,γ ( α | I ) = β ( α | I ) + αβ ( α | I ) I − , V ( α | x, I ) = P ( x ) (cid:48) γ ( α | I ) . By Proposition C.1 and its proof, there exists some β ∗ ( ·|· ) grouping the entries in (2.11)such that sup ( α,x ) ∈ [0 , ×X | P ( x ) β ∗ ( α | I ) − B ( α | x, I ) | = o (cid:16) K − s +1 D M (cid:17) = o (cid:0) h s +1 (cid:1) . Let b ∗ ( ·|· ) and b ∗ ( ·|· ) = Hb ∗ ( ·|· ) with β ∗ ( α | I ) (cid:48) = (cid:2) β ∗ ( α | I ) (cid:48) , β ∗ ( α | I ) (cid:48) , . . . , β ∗ s +1 ( α | I ) (cid:48) (cid:3) ,β ∗ p ( α | I ) = (cid:104) β ( p ) k ( α | I ) , ≤ k ≤ K (cid:105) as in (2.11), p = 0 , . . . , s + 1.The next notations deal with the differentiability of the objective functions (cid:98) R ( · ; α, I ).55ince ∂ρ α + ht (cid:0) B − P ( x (cid:96) , t ) (cid:48) b (cid:1) ∂ b (cid:48)(cid:48) = (cid:8) I (cid:0) B i(cid:96) ≤ P ( x (cid:96) , t ) (cid:48) b (cid:1) − ( α + ht ) (cid:9) P ( x (cid:96) , t ) , almost everywhere, it follows that (cid:98) R ( · ; α, I ) is differentiable with (cid:98) R (1) ( b ; α, I ) = 1 LI L (cid:88) (cid:96) =1 I ( I (cid:96) = I ) I (cid:96) (cid:88) i =1 (cid:90) − α h − α h (cid:8) I (cid:0) B i(cid:96) ≤ P ( x (cid:96) , t ) (cid:48) b (cid:1) − ( α + ht ) (cid:9) P ( x (cid:96) , t ) K ( t ) dt and R (1) ( b ; α, I ) = E (cid:104)(cid:98) R (1) ( b ; α, I ) (cid:105) by the Dominated Convergence Theorem. 
When b = b ∗ ( α | I ), P ( x, t ) (cid:48) b ∗ ( α | I ) = P ( x, ht ) (cid:48) β ∗ ( α | I ) is close to B ( α + ht | x, I ), which inverse as a functionof t in I α,h = (cid:2) I α,h , I α,h (cid:3) = (cid:20) − min (cid:16) , αh (cid:17) , min (cid:18) , − αh (cid:19)(cid:21) = [ − , ∩ (cid:20) − αh , − αh (cid:21) is G ( u | x, I ) − αh , u ∈ (cid:2) B (cid:0) α + hI α,h | x, I (cid:1) , B (cid:0) α + hI α,h | x, I (cid:1)(cid:3) . When h is small enough, it will be shown in the proof of Lemma B.1 below that ∂∂t (cid:2) P ( x, ht ) (cid:48) b ∗ ( α | I ) (cid:3) = h (cid:2) π (1) ( ht ) ⊗ P ( x ) (cid:3) (cid:48) b ∗ ( α | I )= hP ( x ) (cid:48) β ∗ ( α | I ) + O (cid:0) h (cid:1) uniformly since π (1) ( ht ) (cid:48) = [0 , , ht, . . . , ( ht ) s /s !] and that P ( x ) (cid:48) β ∗ ( α | I ) converges uni-formly to B (1) ( α | x, I ) when K diverges and is therefore positive, so that P ( x, t ) (cid:48) b ∗ ( α | I ) isan increasing function of t in I α,h for h small enough. Since max ( x,t ) ∈X × [ − , (cid:107) P ( x, t ) (cid:107) = O (cid:0) h − D M / (cid:1) , t → P ( x, t ) (cid:48) b is also strictly increasing provided b is close enough to b ∗ ( α | I ).56n such case, it is convenient to redefine P ( x, t ) (cid:48) b as follows Ψ ( t | x, b ) = P (cid:0) x, I α,h (cid:1) (cid:48) b t > I α,h P ( x, t ) (cid:48) b t ∈ I α,h P (cid:0) x, I α,h (cid:1) (cid:48) b t < I α,h . When Ψ ( ·| x, b ) has an inverse, defineΦ ( u | x, b ) = α + hI α,h u > Ψ (cid:0) I α,h | x, b (cid:1) α + h Ψ − ( u | x, b ) u ∈ Ψ ( I α,h | x, b ) α + hI α,h u < Ψ (cid:0) I α,h | x, b (cid:1) , ∆ ( u | x, b ) = Φ ( u | x, b ) − αh = I α,h u > Ψ (cid:0) I α,h | x, b (cid:1) Ψ − ( u | x, b ) u ∈ Ψ ( I α,h | x, b ) I α,h u < Ψ (cid:0) I α,h | x, b (cid:1) , which is such that, as seen above, the central part of Φ ( u | x, b ∗ ( α | I )) is close to G ( u | x, I )when u is in Ψ ( I α,h | x, b ). 
Observe now that, provided Ψ ( ·| x, b ) is increasing and since thesupport of K ( · ) is [ − , (cid:90) I α,h I α,h { I ( B i(cid:96) ≤ Ψ ( t | x (cid:96) , b )) − ( α + ht ) } P ( x (cid:96) , t ) K ( t ) dt = (cid:90) I α,h I α,h (cid:26) I (cid:18) Φ ( B i(cid:96) | x (cid:96) , b ) − αh ≤ t (cid:19) − ( α + ht ) (cid:27) P ( x (cid:96) , t ) K ( t ) dt = (cid:90) I α,h Φ ( Bi(cid:96) | x(cid:96), b ) − αh P ( x (cid:96) , t ) K ( t ) dt − (cid:90) I α,h I α,h ( α + ht ) P ( x (cid:96) , t ) K ( t ) dt which is differentiable with respect to b , with for B i(cid:96) in Ψ ( I α,h | x, b ) ∂ Φ ( B i(cid:96) | x (cid:96) , b ) ∂ b (cid:48) = − P ( x, ∆ ( B i(cid:96) | x (cid:96) , b ))Ψ (1) (∆ ( B i(cid:96) | x (cid:96) , b ) | x (cid:96) , b ) /h I [ B i(cid:96) ∈ Ψ ( I α,h | x (cid:96) , b )] . In principle Ψ ( ·|· ) should be denoted Ψ α,h ( ·|· ) to acknowledge that its definition depends upon α and h . Instead, t is restricted to lie in I α,h in the sequel. The same comment applies for the functions Ψ ( ·|· )and ∆ ( ·|· ) introduced below. h small enough and for b in the vicinity of b ∗ ( α | I ), (cid:98) R ( b ; α, I ) and R ( b ; α, I ) aretwice continuously differentiable with, (cid:98) R (2) ( b ; α, I ) = 1 LIh L (cid:88) (cid:96) =1 I (cid:96) (cid:88) i =1 I [ B i(cid:96) ∈ Ψ ( I α,h | x (cid:96) , b ) , I (cid:96) = I ] P ( x (cid:96) , ∆ ( B i(cid:96) | x (cid:96) , b )) P ( x (cid:96) , ∆ ( B i(cid:96) | x (cid:96) , b )) (cid:48) Ψ (1) (∆ ( B i(cid:96) | x (cid:96) , b ) | x (cid:96) , b ) /h K (∆ ( B i(cid:96) | x (cid:96) , b )) , R (2) ( b ; α, I ) = E (cid:104)(cid:98) R (2) ( b ; α, I ) (cid:105) . The next lemma details some properties of the functions Ψ ( ·| x, b ) and Φ ( ·| x, b ) that werebriefly sketched above. 
Define
\[
\mathcal B^1_{\alpha,h}=\left\{b;\ \min_{(t,x)\in\mathcal I_{\alpha,h}\times\mathcal X}\frac{\partial\Psi(t|x,b)}{\partial t}>0\right\},\quad
\mathcal B^2_{\alpha,h}=\left\{b;\ \min_{(t,x)\in\mathcal I_{\alpha,h}\times\mathcal X}\frac{\partial\Psi(t|x,b)}{\partial t}>\frac{h}{f_0},\ \max_{p=1,\ldots,s+1}\left(\max_{x\in\mathcal X}\frac{|P(x)'b_p|}{h}\right)<f_1\right\},
\]
recalling that $b=[b_0',\ldots,b_{s+1}']'$ and where $f_0$ and $f_1$ will be taken large enough. While $\mathcal B^1_{\alpha,h}$ is used to bound the first derivative of $\Psi(\cdot|x,b)$ away from 0, $\mathcal B^2_{\alpha,h}$ is used to bound the successive derivatives $\Psi^{(p)}(\cdot|x,b)$, $p=1,\ldots,s+1$, away from infinity. As made possible by Lemma B.1-(i) below, an Euclidean ball $B(b^*(\alpha|I),Ch^{D_M/2})$ with a small enough constant $C>0$ is contained in $\mathcal B^1_{\alpha,h}$ and $\mathcal B^2_{\alpha,h}$.

Lemma B.1 Suppose Assumptions A and S hold with $\max_{x\in\mathcal X}\|P(x)\|=O(K^{1/2})$, $K=h^{-D_M}$, and that $f_0$ and $f_1$ are large enough. Then, for $h$ small enough and all $I$ in $\mathcal I$:

i. $b^*(\alpha|I)$ belongs to $\mathcal B^2_{\alpha,h}\subset\mathcal B^1_{\alpha,h}$ and, for $C$ small enough, $B(b^*(\alpha|I),Ch^{D_M/2})$ is a subset of $\mathcal B^2_{\alpha,h}$, for all $\alpha$ in $[0,1]$.

ii. For all $b$ in $\mathcal B^1_{\alpha,h}$ and all $u$ in $\Psi(\mathcal I_{\alpha,h}|x,b)$,
\[
\frac{\partial\Phi(u|x,b)}{\partial b'}=-\frac{P(x,\Delta(u|x,b))'}{\Psi^{(1)}(\Delta(u|x,b)|x,b)/h},\qquad
\frac{\partial\Phi(u|x,b)}{\partial u}=\frac{1}{\Psi^{(1)}(\Delta(u|x,b)|x,b)/h}.
\]
iii. It holds that
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-\Psi(t|x,b^*(\alpha|I))\right)-\frac{(ht)^{s+2}}{(s+2)!}\alpha B^{(s+2)}(\alpha|x,I)\right|=o\left(h^{s+2}\right),
\]
and, recalling $b_1^*(\alpha|I)=h\beta_1^*(\alpha|I)$,
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha P(x)'\beta_1^*(\alpha|I)-\alpha B^{(1)}(\alpha|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b^*(\alpha|I)]}\left|\Phi(u|x,b^*(\alpha|I))-G(u|x,I)\right|=o\left(h^{s+1}\right).
\]
iv.
There is a $C>0$ such that, for any $b_1$ and $b_2$ in $\mathcal B^1_{\alpha,h}$ and all $\alpha$ in $[0,1]$,
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b_1)-\Psi(t|x,b_2)\right|,\quad
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\Phi(u|x,b_1)-\Phi(u|x,b_2)\right|,
\]
\[
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\frac{\partial\Phi}{\partial u}(u|x,b_1)-\frac{\partial\Phi}{\partial u}(u|x,b_2)\right|,\quad
\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[\mathcal I_{\alpha,h}|x,b_1]\cap\Psi[\mathcal I_{\alpha,h}|x,b_2]}\left|\Psi^{(1)}(\Delta(u|x,b_1)|x,b_1)-\Psi^{(1)}(\Delta(u|x,b_2)|x,b_2)\right|,
\]
are all smaller than or equal to $Ch^{-D_M/2}\|b_1-b_2\|$.

Let $\Omega_h(\alpha)$, $\Omega(0)$, $\Omega(1)$, $\Omega=\Omega(0)+\Omega(1)$ and $\Omega_{1h}(\alpha)$ be the $(s+2)\times(s+2)$ matrices
\[
\Omega_h(\alpha)=\int_{\mathcal I_{\alpha,h}}\pi(t)\pi(t)'K(t)\,dt=\left[\int_{-\min(1,\alpha/h)}^{\min(1,(1-\alpha)/h)}\frac{t^{p_1+p_2}}{p_1!\,p_2!}K(t)\,dt,\ 0\le p_1,p_2\le s+1\right],
\]
\[
\Omega(0)=\int_{-1}^{0}\pi(t)\pi(t)'K(t)\,dt,\quad\Omega(1)=\int_{0}^{1}\pi(t)\pi(t)'K(t)\,dt,\quad
\Omega_{1h}(\alpha)=\int_{\mathcal I_{\alpha,h}}t\,\pi(t)\pi(t)'K(t)\,dt.
\]
While $\Omega_h(\alpha)\preceq\Omega$ for all $\alpha$ and $h$, it holds for $h$ small enough that $\Omega_h(\alpha)\succeq\Omega(1)$ for all $\alpha$ in $[0,1/2]$ and $\Omega_h(\alpha)\succeq\Omega(0)$ for all $\alpha$ in $[1/2,1]$.

Lemma B.2 Suppose Assumptions A, R-(i) and S hold, and that $f_0$ and $f_1$ are large enough. Then, for $K^{-1/D_M}=O(h)$, $h$ small enough, all $I$ in $\mathcal I$, and any $C>0$ small enough: (i) $R^{(2)}(\cdot;\alpha,I)$ is continuously differentiable over $B(b^*(\alpha|I),Ch^{D_M/2})$ with
\[
\max_{\alpha\in[0,1]}\max_{b_1,b_2\in B(b^*(\alpha|I),Ch^{D_M/2})}\frac{\left\|R^{(2)}(b_1;\alpha,I)-R^{(2)}(b_2;\alpha,I)\right\|}{\|b_1-b_2\|/(\alpha(1-\alpha)+h)}=O\left(h^{-D_M/2}\right).
\]
(ii) The eigenvalues of $R^{(2)}[b^*(\alpha|I);\alpha,I]$ belong to $[1/C,C]$ for a large enough $C$, for all $\alpha$ in $[0,1]$ and $h$ small enough, with
\[
\max_{\alpha\in[0,1]}\left\|R^{(2)}[b^*(\alpha|I);\alpha,I]-\Omega_h(\alpha)\otimes E\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]+h\,\Omega_{1h}(\alpha)\otimes E\left[\frac{\mathbb I(I_\ell=I)B^{(2)}(\alpha|x_\ell,I_\ell)P(x_\ell)P(x_\ell)'}{\left(B^{(1)}(\alpha|x_\ell,I_\ell)\right)^2}\right]\right\|=o(h).
\]
(iii) For any $C>0$,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{s+1})}\left\|R^{(2)}(b;\alpha,I)-R^{(2)}(b^*(\alpha|I);\alpha,I)\right\|=O\left(h^{s+1-D_M/2}\right)\quad\text{if }h^{s+1}=o\left(h^{D_M/2}\right),
\]
\[
\max_{\alpha\in[0,1]}\max_{b\in B\left(b^*(\alpha|I),C\left(\frac{\log L}{L(\alpha(1-\alpha)+h)}\right)^{1/2}\right)}\frac{\left\|R^{(2)}(b;\alpha,I)-R^{(2)}(b^*(\alpha|I);\alpha,I)\right\|}{\left(\frac{\log L}{L(\alpha(1-\alpha)+h)}\right)^{1/2}}=O\left(h^{-D_M/2}\right)\quad\text{if }\left(\frac{\log L}{L}\right)^{1/2}=o\left(h^{D_M/2}\right).
\]
It then follows that the eigenvalues of $R^{(2)}(b;\alpha,I)$ stay bounded away from 0 and infinity, uniformly in $\alpha$ and in $b$ over the two neighborhoods considered above, under the corresponding bandwidth assumption. The next two Lemmas study the first and second derivatives of $\widehat R(\cdot;\alpha,I)$ in a shrinking vicinity of $b^*(\alpha|I)$. In particular, Lemma B.3 implies that $\widehat R(\cdot;\alpha,I)$ is strictly convex over such a vicinity with a probability tending to 1.
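The kernel moment matrices $\Omega_h(\alpha)$, $\Omega(0)$ and $\Omega(1)$ above can be checked numerically. The sketch below is our own illustration, with an Epanechnikov kernel and the convention $\pi(t)'=[1,t,\ldots,t^{s+1}/(s+1)!]$ assumed; it verifies that $\Omega_h(\alpha)$ coincides with $\Omega=\Omega(0)+\Omega(1)$ at interior quantile levels, and that truncating only the left part of the support preserves $\Omega_h(\alpha)\succeq\Omega(1)$.

```python
import numpy as np
from math import factorial

def omega(lo, up, s=1, n=200_000):
    """Riemann-sum approximation of the moment matrix int pi(t) pi(t)' K(t) dt
    over [lo, up], with pi(t)' = [1, t, ..., t^{s+1}/(s+1)!] and Epanechnikov K."""
    t = np.linspace(lo, up, n)
    K = 0.75 * (1.0 - t ** 2)
    powers = np.arange(s + 2)
    facts = np.array([factorial(p) for p in powers], dtype=float)
    pi = t[None, :] ** powers[:, None] / facts[:, None]
    return (pi * K) @ pi.T * (up - lo) / n

# Interior quantile level: the range is all of [-1, 1], so Omega_h = Omega(0) + Omega(1).
full = omega(-1.0, 1.0)
halves = omega(-1.0, 0.0) + omega(0.0, 1.0)
assert np.allclose(full, halves, atol=1e-4)
# Left-truncated range (e.g. alpha/h = 0.3) still contains [0, 1], so
# Omega_h(alpha) - Omega(1), an integral of PSD matrices, stays PSD.
assert np.linalg.eigvalsh(omega(-0.3, 1.0) - omega(0.0, 1.0)).min() > -1e-3
```

The same computation with a right-truncated range illustrates the symmetric claim $\Omega_h(\alpha)\succeq\Omega(0)$ for $\alpha$ near 1.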
Lemma B.3 Suppose Assumptions A, R-(i,ii) and S hold, and $\log L/(Lh^{D_M+1})=o(1)$. Then, for any $C>0$ small enough,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2})}\left\|\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right).
\]
Lemma B.4 Suppose Assumptions A, R-(i,ii) and S hold, and $\log L/(Lh^{D_M+1})=o(1)$. Then, for any $C>0$,
\[
\max_{\alpha\in[0,1]}\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2})}\left\|\frac{\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
Since $R^{(1)}(\overline b(\alpha|I);\alpha,I)=0$ and, assuming $h^{s+1}=O(h^{D_M/2})$, $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$ as established in (C.3), it holds that
\[
\max_{\alpha\in[0,1]}\left\|\frac{\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
The next Lemma studies the leading term $\widehat e(\alpha|I)$ of $\widehat b(\alpha|I)-\overline b(\alpha|I)$,
\[
\widehat e(\alpha|I)=-\left[R^{(2)}(\overline b(\alpha|I);\alpha,I)\right]^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I),
\]
see Theorem D.1 below. Note that $R^{(2)}(\overline b(\alpha|I);\alpha,I)$ is not necessarily defined and invertible unless $h^{s+1}=O(h^{D_M/2})$ and $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$, as therefore assumed and established in the proof of Theorem C.4 below, see (C.3).

Lemma B.5 Suppose Assumptions A, H, R and S hold, and $\log L/(Lh^{D_M+1})=o(1)$, $s\ge D_M/2$ and $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$.
Then (i) uniformly in $(\alpha,x)$ in $[0,1]\times\mathcal X$,
\[
\mathrm{Var}\left[P(x)'S_0\widehat e(\alpha|I)\right]=O\left(\frac{1}{Lh^{D_M}}\right)\quad\text{and}\quad
\mathrm{Var}\left[\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right]=O\left(\frac{1}{Lh^{D_M+1}}\right),
\]
with the normalized variance $Lh^{D_M+1}\,\mathrm{Var}\left[S_1\widehat e(\alpha|I)/h\right]$ having the expansion
\[
v_h(\alpha)\,E^{-1}\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'\right]E^{-1}\left[\frac{\mathbb I(I_\ell=I)P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]+o(1).
\]
(ii) It also holds that
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S_0\widehat e(\alpha|I)\right|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right),\quad
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right|=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right).
\]

Online Appendix C: Asymptotic bias

Our bias results for the bid quantile function are based on the next Proposition, which states bid implications of Assumption S.

Proposition C.1 Assume the approximation property S holds. Suppose that $V(\alpha|x,I)$ is an $(s+1)$th continuously differentiable function over $[0,1]\times\mathcal X$ satisfying $\inf_{(\alpha,x)\in[0,1]\times\mathcal X}V^{(1)}(\alpha|x,I)>0$ and $\sup_{(\alpha,x)\in[0,1]\times\mathcal X}V^{(1)}(\alpha|x,I)<\infty$. Then, for $B(\alpha|x,I)$ as in (2.3) and sieve coefficients $\{\gamma_k(\alpha|I),1\le k\le K\}$ of $V(\alpha|x,I)$ as in Property S:

i. $\min_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)>0$, $\max_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)<\infty$, and $B(\alpha|x,I)$ is $(s+2)$th continuously differentiable over $(0,1]$ with $\lim_{\alpha\to0}\sup_{(x,I)\in\mathcal X\times\mathcal I}\left|\alpha B^{(s+2)}(\alpha|x,I)\right|=0$.

ii.
The coefficients $\{\beta_k(\alpha|I),1\le k\le K\}$ from (2.11) are $(s+1)$th continuously differentiable and satisfy
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B(\alpha|x,I)-\sum_{k=1}^{K}\beta_k(\alpha|I)P_k(x)\right|=o\left(K^{-\frac{s+1}{D_M}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|=o(1),\quad p=1,\ldots,s+1.
\]
iii. Moreover, $\alpha\beta_k^{(1)}(\alpha|I)=(I-1)[\gamma_k(\alpha|I)-\beta_k(\alpha|I)]$ and is therefore $(s+1)$th continuously differentiable for all $1\le k\le K$. In addition,
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha B^{(1)}(\alpha|x,I)-\sum_{k=1}^{K}\alpha\beta_k^{(1)}(\alpha|I)P_k(x)\right|=o\left(K^{-\frac{s+1}{D_M}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{\partial^p\left[\alpha B^{(1)}(\alpha|x,I)\right]}{\partial\alpha^p}-\sum_{k=1}^{K}\frac{\partial^p\left[\alpha\beta_k^{(1)}(\alpha|I)\right]}{\partial\alpha^p}P_k(x)\right|=o(1),\quad p=1,\ldots,s+1.
\]
Proof of Proposition C.1. By (2.3), $B(\alpha|x,I)=(I-1)\int_0^1u^{I-2}V(\alpha u|x,I)\,du$, so that $B^{(1)}(\alpha|x,I)=(I-1)\int_0^1u^{I-1}V^{(1)}(\alpha u|x,I)\,du$, which implies the first two statements in (i), about lower and upper bounds for $B^{(1)}(\alpha|x,I)$, and that $B(\cdot|\cdot,I)$ is $(s+1)$th continuously differentiable. That $B(\cdot|x,I)$ is $(s+2)$th continuously differentiable over $(0,1]$ follows from its integral expression (2.3). Observe now that, for $p=1,\ldots,s+2$,
\[
\frac{\partial^p\left[\alpha B(\alpha|x,I)\right]}{\partial\alpha^p}=\alpha B^{(p)}(\alpha|x,I)+pB^{(p-1)}(\alpha|x,I),
\]
with, for $p=1,\ldots$
$\ldots,s+1$,
\[
B^{(p)}(\alpha|x,I)=(I-1)\int_0^1u^{I-2+p}V^{(p)}(\alpha u|x,I)\,du=\frac{I-1}{\alpha^{I-1+p}}\int_0^\alpha t^{I-2+p}V^{(p)}(t|x,I)\,dt,
\]
\[
B^{(p+1)}(\alpha|x,I)=-\frac{(I-1)(I-1+p)}{\alpha^{I+p}}\int_0^\alpha t^{I-2+p}V^{(p)}(t|x,I)\,dt+\frac{(I-1)V^{(p)}(\alpha|x,I)}{\alpha}=-\frac{I-1+p}{\alpha}B^{(p)}(\alpha|x,I)+\frac{(I-1)V^{(p)}(\alpha|x,I)}{\alpha}.
\]
Hence, when $\alpha$ goes to 0,
\[
\alpha B^{(s+2)}(\alpha|x,I)=-(I+s)B^{(s+1)}(0|x,I)+(I-1)V^{(s+1)}(0|x,I)+o(1)
=-(I+s)(I-1)\int_0^1u^{I+s-1}V^{(s+1)}(0|x,I)\,du+(I-1)V^{(s+1)}(0|x,I)+o(1)=o(1)
\]
uniformly in $x$. For (ii), consider a sequence $\{\gamma_k(\alpha|I),k\le K\}$ approximating $V(\alpha|x,I)$ and its derivatives as in Property S. For $\{\beta_k(\alpha|I),k\le K\}$ as in (2.11),
\[
\beta_k^{(p)}(\alpha|I)=(I-1)\int_0^1u^{I+p-2}\gamma_k^{(p)}(\alpha u|I)\,du,\quad p=0,\ldots,s+1,
\]
and
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|
=\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|(I-1)\int_0^1u^{I+p-2}\left(V^{(p)}(\alpha u|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha u|I)P_k(x)\right)du\right|
\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|,
\]
which gives the sieve approximation results for $B(\alpha|x,I)$ in (ii). Now, for $\alpha B^{(1)}(\alpha|x,I)$, observe that $\alpha B^{(1)}(\alpha|x,I)=(I-1)[V(\alpha|x,I)-B(\alpha|x,I)]$ and
\[
\alpha\beta_k^{(1)}(\alpha|I)=\alpha\times\left(-\frac{(I-1)^2}{\alpha^{I}}\int_0^\alpha t^{I-2}\gamma_k(t|I)\,dt+\frac{(I-1)\gamma_k(\alpha|I)}{\alpha}\right)=(I-1)\left[\gamma_k(\alpha|I)-\beta_k(\alpha|I)\right].
\]
It follows that
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{\partial^p\left[\alpha B^{(1)}(\alpha|x,I)\right]}{\partial\alpha^p}-\sum_{k=1}^{K}\frac{\partial^p\left[\alpha\beta_k^{(1)}(\alpha|I)\right]}{\partial\alpha^p}P_k(x)\right|
\le(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|+(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|B^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\beta_k^{(p)}(\alpha|I)P_k(x)\right|
\]
\[
\le2(I-1)\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|V^{(p)}(\alpha|x,I)-\sum_{k=1}^{K}\gamma_k^{(p)}(\alpha|I)P_k(x)\right|,
\]
which gives the approximation results for $\alpha B^{(1)}(\alpha|x,I)$ in (iii). $\Box$

The study of the biases $\overline V(\alpha|x,I)-V(\alpha|x,I)$ and $\overline B(\alpha|x,I)-B(\alpha|x,I)$ is based on the following Lemma, which is a consequence of the Newton-Kantorovich Theorem, see e.g. Gragg and Tapia (1974).

Lemma C.2 Let $F(\cdot):\mathbb R^D\to\mathbb R$ be a function. Suppose that there are $x^*\in\mathbb R^D$ and some real numbers $\epsilon>0$ and $C_0>0$ such that $F(\cdot)$ is twice differentiable on $B(x^*,C_0\epsilon)=\{x\in\mathbb R^D;\|x-x^*\|<C_0\epsilon\}$. If, in addition:

i. $\|F^{(1)}(x^*)\|\le\epsilon$ and $\left\|\left[F^{(2)}(x^*)\right]^{-1}\right\|\le C_0$;

ii. there is a $C_2>0$ such that $\|F^{(2)}(x)-F^{(2)}(x')\|\le C_2\|x-x'\|$ for all $x,x'\in B(x^*,C_0\epsilon)$;

iii. $C_0^2C_2\epsilon\le1/2$.

Then there is a unique $\overline x$ such that $\|\overline x-x^*\|<C_0\epsilon$ and $F^{(1)}(\overline x)=0$.

The next lemma, established in Appendix F, will be used at the end of the proof of Theorem C.4 below.

Lemma C.3 Suppose Assumptions A, S and R-(ii) hold.
Then the $\ell^1$ norms of the columns of the matrix
\[
A_{\alpha,h}=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]
\]
are bounded independently of $L$ and $\alpha$. That is, if $A_{\alpha,h}=[A_{\alpha,h}(j_1,j_2),1\le j_1,j_2\le(s+2)K]$,
\[
\max_{L}\max_{\alpha\in[0,1]}\max_{1\le j_2\le(s+2)K}\sum_{j_1=1}^{(s+2)K}|A_{\alpha,h}(j_1,j_2)|<\infty.
\]
In the next theorem,
\[
\mathrm{bias}_h(\alpha|I)=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]\times E\left[\frac{\mathbb I(I_\ell=I)B^{(s+2)}(\alpha|x_\ell,I_\ell)\int_{\mathcal I_{\alpha,h}}t^{s+2}P(x_\ell,t)K(t)\,dt}{(s+2)!\,B^{(1)}(\alpha|x_\ell,I_\ell)}\right],
\]
with $\mathrm{bias}_h(\alpha|I)=\left[\mathrm{bias}_{0h}(\alpha|I)',\ldots,\mathrm{bias}_{s+1,h}(\alpha|I)'\right]'$, where the subvectors $\mathrm{bias}_{ph}(\alpha|I)$ are of dimension $K$. While $\mathrm{bias}_h(\alpha|I)$ may not exist for $\alpha=0$, the function $\mathrm{Bias}_h(\alpha|I)=\alpha\,\mathrm{bias}_h(\alpha|I)$ in (4.8) can be set to 0 when $\alpha=0$ by Proposition C.1-(i).

Theorem C.4 Suppose that Assumptions A, H and R hold with $s\ge D_M/2$. Then, for $h$ small enough, $\overline b(\alpha|I)=\arg\min_bR(b;\alpha,I)$ is unique for all $\alpha$ in $[0,1]$ and
\[
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline V(\alpha|x,I)-V(\alpha|x,I)-\frac{h^{s+1}P(x)'\alpha\,\mathrm{bias}_{1h}(\alpha|I)}{I-1}\right|=o\left(h^{s+1}\right)
\]
with $\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|P(x)'\alpha\,\mathrm{bias}_{1h}(\alpha|I)\right|=O(1)$. Moreover,
\[
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline B(\alpha|x,I)-B(\alpha|x,I)\right|=o\left(h^{s+1}\right),\quad
\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left|\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right|=o\left(h^{s}\right).
\]
The proof of Theorem C.4 establishes that $\sup_{\alpha\in[0,1]}\|\overline b(\alpha|I)-b^*(\alpha|I)\|=o(h^{s+1})$, see (C.3), an intermediary result which will be used all along the proof. If $D_M/2\le s$ and $\log L/(Lh^{D_M+1})=o(1)$, then by Lemma B.3 and a second-order Taylor expansion,
\[
\sup_{\alpha\in[0,1]}\sup_{b\in B(\overline b(\alpha|I),Ch^{s+1})}\left|h^{-2(s+1)}\left\{\widehat R(b;\alpha,I)-\widehat R(\overline b(\alpha|I);\alpha,I)-(b-\overline b(\alpha|I))'\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)\right\}-\frac{h^{-2(s+1)}}{2}(b-\overline b(\alpha|I))'R^{(2)}(\overline b(\alpha|I);\alpha,I)(b-\overline b(\alpha|I))\right|=o_P(1).
\]
Then, by Lemma B.2 and the Argmax Theorem, $\widehat R(\cdot;\alpha,I)$ has a unique minimizer over $b\in B(\overline b(\alpha|I),Ch^{s+1})$ for each $\alpha$, with a probability tending to 1. Since $\widehat R(\cdot;\alpha,I)$ is convex, a local minimum is also a global one. This implies that the AQR or ASQR estimators $\widehat\beta(\alpha|I)=H^{-1}\widehat b(\alpha|I)$ are unique for all $\alpha$ in $[0,1]$ with a probability tending to 1.

Proof of Theorem C.4. Consider (ii) and (iii), the proof of (i) being similar, as detailed below. The proof works by establishing that there is a solution of the first-order condition in an open ball where $R(b;\alpha,I)$ is strictly convex, by checking the conditions of Lemma C.2, which also gives the rate stated in the Theorem and the uniqueness of $\overline b(\alpha|I)$. It is first claimed that
\[
\max_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|R^{(1)}(b^*(\alpha|I);\alpha,I)\right\|=\epsilon_L\quad\text{with (C.1)}
\]
\[
\epsilon_L=O\left(\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in\mathcal I_{\alpha,h}}\left|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right|\right)=o\left(h^{s+1}\right),
\]
where $\epsilon_L=o(h^{s+1})$ follows from Lemma B.1-(iii).
To see that (C.1) holds, observe that
\[
\left\|R^{(1)}(b^*(\alpha|I);\alpha,I)\right\|=\max_{\theta;\theta'\theta=1}\left|\theta'R^{(1)}(b^*(\alpha|I);\alpha,I)\right|.\quad\text{(C.2)}
\]
But uniformly in $\alpha\in[0,1]$ and by Assumption R-(i) and Lemma B.1-(iii),
\[
\left|\theta'R^{(1)}(b^*(\alpha|I);\alpha,I)\right|=\left|E\left[\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}\left\{G(\Psi(t|x_\ell,b^*(\alpha|I))|x_\ell,I_\ell)-G(B(\alpha+ht|x_\ell,I)|x_\ell,I_\ell)\right\}\theta'(\pi(t)\otimes P(x_\ell))K(t)\,dt\right]\right|
\le C\epsilon_LE^{1/2}\left[\int_{-1}^{1}\left(\theta'(\pi(t)\otimes P(x_\ell))\right)^2dt\right]\le C\epsilon_L(\theta'\theta)^{1/2}=C\epsilon_L.
\]
Hence (C.1) holds, which is the first part of Condition (i) in Lemma C.2. The second part of Condition (i) follows from Lemma B.2-(ii), which ensures that there is a $C_0>0$ such that, for $L$ large enough,
\[
\sup_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|\left[R^{(2)}(b^*(\alpha|I);\alpha,I)\right]^{-1}\right\|\le C_0.
\]
Note that $s\ge D_M/2$ and $\epsilon_L=o(h^{s+1})$ give that $B(b^*(\alpha|I),C_0\epsilon_L)\subset B(b^*(\alpha|I),Ch^{D_M/2})$ for all $C_0,C>0$ provided $L$ is large enough, for all $\alpha$ and all $I$. Condition (ii) in Lemma C.2 follows from Lemma B.2-(i), which ensures that, for some $C_{2L}=O(h^{-D_M/2})$,
\[
\left\|R^{(2)}(b_1;\alpha,I)-R^{(2)}(b_2;\alpha,I)\right\|\le C_{2L}\|b_1-b_2\|
\]
for all $b_1,b_2$ in $B(b^*(\alpha|I),C_0\epsilon_L)$ and all $\alpha$, $I$.
For Condition (iii) in Lemma C.2, $\epsilon_L=o(h^{s+1})$ and $s\ge D_M/2$ give
\[
C_0^2C_{2L}\epsilon_L=o\left(h^{s+1-D_M/2}\right)=o(1)\le1/2
\]
for $L$ large enough. Hence Lemma C.2 ensures that, for $L$ large enough, all $\alpha$ and all $I$, there is a unique $\overline b(\alpha|I)$ in $B(b^*(\alpha|I),C_0\epsilon_L)$ such that
\[
R^{(1)}(\overline b(\alpha|I);\alpha,I)=0,
\]
and $\overline b(\alpha|I)$ is therefore the unique minimizer of $R(\cdot;\alpha,I)$ over $B(b^*(\alpha|I),C_0\epsilon_L)$. Since the convex function $R(\cdot;\alpha,I)$ cannot have several local minimizers, $\overline b(\alpha|I)$ is also its unique global minimizer. Since $\epsilon_L=o(h^{s+1})$, it follows that
\[
\sup_{(\alpha,I)\in[0,1]\times\mathcal I}\left\|\overline b(\alpha|I)-b^*(\alpha|I)\right\|=o\left(h^{s+1}\right).\quad\text{(C.3)}
\]
Consider now $\alpha\overline b(\alpha|I)-\alpha b^*(\alpha|I)$. Define
\[
\overline g(\alpha|t,x,I)=\int_0^1g\left(\Psi(t|x,\overline b(\alpha|I))+u\left(B(\alpha+ht|x,I)-\Psi(t|x,\overline b(\alpha|I))\right)\Big|x,I\right)du,
\]
which is such that, uniformly in $\alpha$ in $[3h,1-3h]$, $x$ in $\mathcal X$ and $t$ in $[-1,1]$,
\[
\overline g(\alpha|t,x,I)=\int_0^1g\left(B(\alpha+ht|x,I)+o\left(h^{s+1-D_M/2}\right)\Big|x,I\right)du\ge(1+o(1))\min_{y\in[B(2h|x,I),B(1-2h|x,I)]}g(y|x,I)\ge C''>0,
\]
using $o(h^{s+1-D_M/2})=o(h)$ and Proposition C.1-(i).
Now $R^{(1)}(\overline b(\alpha|I);\alpha,I)=0$ gives
\[
0=\int\left(\int_{\mathcal I_{\alpha,h}}\left\{G\left[\Psi(t|x,\overline b(\alpha|I))|x,I\right]-(\alpha+ht)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\left\{G\left[\Psi(t|x,\overline b(\alpha|I))|x,I\right]-G\left[B(\alpha+ht|x,I)|x,I\right]\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,\overline b(\alpha|I))-B(\alpha+ht|x,I)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
\]
\[
=\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,\overline b(\alpha|I))-\Psi(t|x,b^*(\alpha|I))\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx
+\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx.
\]
Since $\left\{\Psi(t|x,\overline b(\alpha|I))-\Psi(t|x,b^*(\alpha|I))\right\}P(x,t)=P(x,t)P(x,t)'\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)$ by Assumption R-(i), and because $\overline g(\alpha|t,x,I)$ and $f(x,I)$ are bounded away from 0 and infinity,
\[
\alpha\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)=\left[\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)P(x,t)P(x,t)'K(t)\,dt\right)f(x,I)\,dx\right]^{-1}\times\int\left(\int_{\mathcal I_{\alpha,h}}\overline g(\alpha|t,x,I)\left\{\frac{(ht)^{s+2}}{(s+2)!}\alpha B^{(s+2)}(\alpha|x,I)+o\left(h^{s+2}\right)\right\}P(x,t)K(t)\,dt\right)f(x,I)\,dx,
\]
uniformly in $\alpha$ in $[0,1]$ by Lemma B.1-(iii).
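The proof pins down $\overline b(\alpha|I)$ by verifying conditions (i)-(iii) of the Newton-Kantorovich Lemma C.2. A one-dimensional toy illustration of the same logic (entirely our own example, not from the paper): take $F(x)=1-\cos x$, so $F^{(1)}=\sin$, and expand around $x^*=0.05$; then $\epsilon=\sin(0.05)\approx0.05$, one can take $C_0\approx1.01$ and Lipschitz constant $C_2=1$, condition (iii) holds, and Newton's iteration locates the unique critical point inside $B(x^*,C_0\epsilon)$.

```python
import math

def newton_root_of_gradient(F1, F2, x0, tol=1e-12, max_iter=50):
    """Newton iteration on the first-order condition F1(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = F1(x) / F2(x)
        x -= step
        if abs(step) < tol:
            break
    return x

F1, F2 = math.sin, math.cos            # gradient and Hessian of F(x) = 1 - cos(x)
x_star = 0.05
eps = abs(F1(x_star))                  # gradient bound of Condition (i), ~0.05
assert 1.01 ** 2 * 1.0 * eps <= 0.5    # Condition (iii): C0^2 * C2 * eps <= 1/2
root = newton_root_of_gradient(F1, F2, x_star)
assert abs(F1(root)) < 1e-10           # the first-order condition holds at the root
assert abs(root - x_star) < 1.01 * eps # the root lies in the ball B(x*, C0 * eps)
```

In the proof, the role of $x^*$ is played by the sieve pseudo-true value $b^*(\alpha|I)$, with $\epsilon=\epsilon_L=o(h^{s+1})$, which is what delivers (C.3).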
By Assumption R-(ii), which implies in particular $\left\|\int\left(\int_{\mathcal I_{\alpha,h}}|P(x,t)|K(t)\,dt\right)dx\right\|=O(1)$, it follows that
\[
\overline b(\alpha|I)-b^*(\alpha|I)=o\left(h^{s+1}\right)E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right],
\]
\[
\alpha\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)=h^{s+2}\alpha\,\mathrm{bias}_h(\alpha|I)+o\left(h^{s+2}\right)E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right],\quad\text{(C.4)}
\]
uniformly over $[0,1]$. Let
\[
A=A_{\alpha,h}=[A_1,\ldots,A_{J_L}]=E^{-1}\left[\frac{\mathbb I(I_\ell=I)\int_{\mathcal I_{\alpha,h}}P(x_\ell,t)P(x_\ell,t)'K(t)\,dt}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right]
\]
be a $J_L\times J_L$ matrix with columns $A_j$, $j=1,\ldots,J_L$, let $|A_j|_1$ be the associated $\ell^1$ norm and $|A|_{1,\infty}=\max_{j\le J_L}|A_j|_1$, let $S$ be a selection matrix which selects some columns of $A$, let $a$ and $b$ be some conformable vectors and $|a|_\infty$ the largest entry of $a$. Then
\[
|a'SAb|=\left|\sum_jb_ja'[SA]_j\right|\le\sum_j|b_j|\max_j\left|a'[SA]_j\right|\le|b|_1|A|_{1,\infty}|a|_\infty.
\]
This gives, since $\max_{\alpha,L}|A|_{1,\infty}<\infty$ by Lemma C.3 and by Assumption R-(ii),
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S\,\mathrm{bias}_h(\alpha|I)\right|\le C\left(\max_{x\in\mathcal X}\sum_{k=1}^{K}|P_k(x)|\right)\times\max_{1\le k\le K}\int|P_k(x)|\,dx=O(1),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'SA\,E\left[\int_{\mathcal I_{\alpha,h}}|P(x_\ell,t)|K(t)\,dt\right]\right|\le C\left(\max_{x\in\mathcal X}\|P(x)\|\right)\times\max_{1\le k\le K}\int|P_k(x)|\,dx=O(1).
\]
Let $S_0$ and $S_1$ be the selection matrices such that $S_0b=\beta_0$ and $S_1b=h\beta_1$, so that $\overline B(\alpha|x,I)=P(x)'S_0\overline b(\alpha|I)$ and $\overline B^{(1)}(\alpha|x,I)=P(x)'S_1\overline b(\alpha|I)/h$. Then (C.3), (C.4), Lemma B.1-(iii) and the above imply
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\overline B(\alpha|x,I)-B(\alpha|x,I)\right|\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'S_0\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\Psi(0|x,b^*(\alpha|I))-B(\alpha|x,I)\right|=o\left(h^{s+1}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right|=o\left(h^{s}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\alpha\left(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\right)-h^{s+1}P(x)'\alpha S_1\mathrm{bias}_h(\alpha|I)\right|
\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\frac{1}{h}\left|\alpha P(x)'S_1\left(\overline b(\alpha|I)-b^*(\alpha|I)\right)-h^{s+2}\alpha P(x)'S_1\mathrm{bias}_h(\alpha|I)\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\frac{1}{h}\left|\alpha\left(P(x)'S_1b^*(\alpha|I)-hB^{(1)}(\alpha|x,I)\right)\right|=o\left(h^{s+1}\right).
\]
This ends the proof of the Theorem since $V(\alpha|x,I)=B(\alpha|x,I)+\alpha B^{(1)}(\alpha|x,I)/(I-1)$. $\Box$

References

[1] Gragg, W.B. & R.A. Tapia (1974). Optimal error bounds for the Newton-Kantorovich Theorem.
SIAM Journal on Numerical Analysis, 11, 10-13.

Online Appendix D: Bahadur representation

Let $\widehat e(\alpha|I)$ be a candidate linearization leading term for $\widehat b(\alpha|I)-\overline b(\alpha|I)$ and $\widehat d(\alpha|I)$ the associated linearization error term, or Bahadur remainder term,
\[
\widehat e(\alpha|I)=-\left(R^{(2)}(\overline b(\alpha|I);\alpha,I)\right)^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I),\quad\text{(D.1)}
\]
\[
\widehat d(\alpha|I)=\widehat b(\alpha|I)-\overline b(\alpha|I)-\widehat e(\alpha|I).\quad\text{(D.2)}
\]
This section's goal is to study the magnitude of $\widehat d(\alpha|I)$ and, in the ASQR case, the magnitudes of $P(x)'S_0\widehat d(\alpha|I)$ and $P(x)'S_1\widehat d(\alpha|I)/h$.

Theorem D.1 Suppose Assumptions A, R-(i,ii) and S hold, $s\ge D_M/2$ and $\log L/(Lh^{D_M+1})=o(1)$. Then
\[
\max_{\alpha\in[0,1]}\left\|\frac{Lh^{D_M+(D_M\vee1)/2}}{(h+\alpha(1-\alpha))^{1/2}\log L}\left\{\widehat b(\alpha|I)-\overline b(\alpha|I)+\left(R^{(2)}(\overline b(\alpha|I);\alpha,I)\right)^{-1}\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)\right\}\right\|=O_P(1)
\]
with a diverging normalization term $Lh^{D_M+(D_M\vee1)/2}/\log L$. Moreover, for $\widehat d(\alpha|I)$ as in (D.2),
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left(Lh^{D_M+1}\right)^{1/2}\left|P(x)'S_0\widehat d(\alpha|I)\right|=O_P\left(\frac{h^{1/2}\log L}{\left(Lh^{2D_M+(D_M\vee1)}\right)^{1/2}}\right),
\]
\[
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left(Lh^{D_M+1}\right)^{1/2}\left|\frac{P(x)'S_1\widehat d(\alpha|I)}{h}\right|=O_P\left(\frac{\log L}{\left(Lh^{2D_M+1+(D_M\vee1)}\right)^{1/2}}\right).
\]
Proof of Theorem D.1. We first introduce some renormalizations.
Let, for $\widehat e(\alpha|I)$ as in (D.1),
\[
\varrho_{\alpha L}=\frac{(h+\alpha(1-\alpha))^{1/2}\log L}{Lh^{D_M+(D_M\vee1)/2}},\quad
\widetilde R(d;\alpha,I)=\widehat R\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+\varrho_{\alpha L}d;\alpha,I\right)-\widehat R\left(\overline b(\alpha|I)+\widehat e(\alpha|I);\alpha,I\right),
\]
which is such that $\varrho_{\alpha L}=o(1)$ by $\log L/(Lh^{D_M+1})=o(1)$, and
\[
\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}=\arg\min_d\widetilde R(d;\alpha,I).
\]
It follows that
\[
\left\{\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}=\bigcup_{\alpha\in[0,1]}\left\{\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}\subset\bigcup_{\alpha\in[0,1]}\left\{\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le\inf_{\|d\|\le t}\widetilde R(d;\alpha,I)\right\}\subset\bigcup_{\alpha\in[0,1]}\left\{\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le0\right\}
\]
since $\inf_{\|d\|\le t}\widetilde R(d;\alpha,I)\le\widetilde R(0;\alpha,I)=0$. The next step uses a convexity argument that can be found in Pollard (1991). For any $d$ with $\|d\|\ge t$, $td/\|d\|$ is a convex combination of $d$ and $0$ with weights $t/\|d\|$ and $1-t/\|d\|$, so convexity yields
\[
\widetilde R\left(\frac{td}{\|d\|};\alpha,I\right)\le\frac{t}{\|d\|}\widetilde R(d;\alpha,I)+\left(1-\frac{t}{\|d\|}\right)\widetilde R(0;\alpha,I),\quad\text{that is}\quad
\widetilde R(d;\alpha,I)\ge\frac{\|d\|}{t}\widetilde R\left(\frac{td}{\|d\|};\alpha,I\right),
\]
so that $\inf_{\|d\|\ge t}\widetilde R(d;\alpha,I)\le0$ implies $\inf_{\|d\|=t}\widetilde R(d;\alpha,I)\le0$, and
\[
\left\{\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right\}\subset\left\{\inf_{\alpha\in[0,1]}\inf_{\|d\|=t}\widetilde R(d;\alpha,I)\le0\right\}.\quad\text{(D.3)}
\]
Thus it is sufficient to consider those $d$ with $\|d\|=t$.
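Pollard's (1991) convexity inequality used above, namely $\widetilde R(d)\ge(\|d\|/t)\widetilde R(td/\|d\|)$ for convex $\widetilde R$ with $\widetilde R(0)=0$ and $\|d\|\ge t$, can be sanity-checked on any convex function vanishing at the origin. The toy quadratic below is our own choice, not an object from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Q = A @ A.T + np.eye(3)          # positive definite, so R(d) = d'Qd is convex with R(0) = 0
R = lambda d: d @ Q @ d
t = 1.0
for _ in range(1000):
    d = rng.normal(size=3)
    norm = np.linalg.norm(d)
    if norm >= t:
        # Pollard's bound: the value outside the sphere dominates the
        # rescaled value on the sphere of radius t.
        assert R(d) >= (norm / t) * R(t * d / norm) - 1e-9
```

This is what reduces the uniform-in-$\alpha$ control of the minimizer to a control of $\widetilde R$ on the sphere $\|d\|=t$ only.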
The expression of $\widetilde R(d;\alpha,I)$ gives, using two Taylor expansions with integral remainder,
\[
\widetilde R(d;\alpha,I)=\varrho_{\alpha L}d'\widehat R^{(1)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I);\alpha,I\right)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d
\]
\[
=\varrho_{\alpha L}d'\widehat R^{(1)}\left(\overline b(\alpha|I);\alpha,I\right)+\varrho_{\alpha L}d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)du\right]\widehat e(\alpha|I)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d.
\]
Since $\widehat R^{(1)}(\overline b(\alpha|I);\alpha,I)+R^{(2)}(\overline b(\alpha|I);\alpha,I)\widehat e(\alpha|I)=0$ by (D.1), it follows that
\[
\widetilde R(d;\alpha,I)=\varrho_{\alpha L}d'\left[\int_0^1\left\{\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)\right\}du\right]\widehat e(\alpha|I)+\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\left(\overline b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\right)(1-u)\,du\right]d.
\]
Lemma B.4 and (C.3) with $s\ge D_M/2$, $\log L/(Lh^{D_M+1})=o(1)$ and Lemma B.2-(ii) give
\[
\sup_{\alpha\in[0,1]}\left\|\frac{\widehat e(\alpha|I)}{(h+\alpha(1-\alpha))^{1/2}}\right\|=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)=o_P\left(h^{D_M/2}\right).
\]
Lemmas B.3 and B.2-(i) then imply, for the first term in $\widetilde R(d;\alpha,I)$, uniformly in $\alpha$ and in $d$ with $\|d\|=t$,
\[
\left|\varrho_{\alpha L}d'\left[\int_0^1\left\{\widehat R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)\right\}du\right]\widehat e(\alpha|I)\right|
\]
\[
=\left|\varrho_{\alpha L}d'\left[\int_0^1\left\{R^{(2)}\left(\overline b(\alpha|I)+u\widehat e(\alpha|I);\alpha,I\right)-R^{(2)}\left(\overline b(\alpha|I);\alpha,I\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}du\right]\widehat e(\alpha|I)\right|
\]
\[
=\left|\varrho_{\alpha L}d'\left[O_P\left(h^{-D_M/2}\right)\left\|\widehat e(\alpha|I)\right\|+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right]\widehat e(\alpha|I)\right|
\]
\[
=t\left|\varrho_{\alpha L}\left[O_P\left(\left(\frac{\log L}{Lh^{2D_M}}\right)^{1/2}\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right]O_P\left(\left(\frac{(h+\alpha(1-\alpha))\log L}{Lh^{D_M}}\right)^{1/2}\right)\right|
\]
\[
=t\varrho_{\alpha L}O_P\left(\frac{(h+\alpha(1-\alpha))^{1/2}\log L}{Lh^{D_M+(D_M\vee1)/2}}\right)=t\varrho_{\alpha L}^2O_P(1).
\]
Observe that the condition $\log L/(Lh^{D_M+1})=o(1)$ implies
\[
\frac{\log L}{Lh^{D_M\vee1}}=o(1)\quad\text{and then}\quad\varrho_{\alpha L}=o\left(\left(\frac{(h+\alpha(1-\alpha))\log L}{Lh^{D_M}}\right)^{1/2}\right).
\]
Lemmas B.3 and B.2 then imply, for the second term of $\widehat R(d;\alpha,I)$, uniformly in $\alpha$ and $d$ with $\|d\|=t$,
\begin{align*}
&\varrho_{\alpha L}^2d'\left[\int_0^1\widehat R^{(2)}\big(b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\big)(1-u)\,du\right]d\\
&=\varrho_{\alpha L}^2d'\left[\int_0^1\left\{R^{(2)}\big(b(\alpha|I)+\widehat e(\alpha|I)+u\varrho_{\alpha L}d;\alpha,I\big)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}(1-u)\,du\right]d\\
&=\varrho_{\alpha L}^2d'\left[\int_0^1\left\{R^{(2)}\big(b(\alpha|I);\alpha,I\big)+tO_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)+O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)\right\}(1-u)\,du\right]d\\
&\ge C\varrho_{\alpha L}^2t^2\big(1+to_P(1)\big).
\end{align*}
Hence, for some $O_P(1)$ and $o_P(1)$ terms which are uniform in $\alpha$,
\begin{align*}
P\left(\sup_{\alpha\in[0,1]}\left\|\frac{\widehat d(\alpha|I)}{\varrho_{\alpha L}}\right\|\ge t\right)&\le P\left(\inf_{\alpha\in[0,1]}\left\{C\varrho_{\alpha L}^2t^2\big(1+to_P(1)\big)+t\varrho_{\alpha L}^2O_P(1)\right\}\le0\right)\\
&=P\big(Ct\big(1+to_P(1)\big)+O_P(1)\le0\big)\le P\big(t\big(1+to_P(1)\big)\le|O_P(1)|\big),
\end{align*}
which can be made as small as needed asymptotically by increasing $t$. This gives the first result of the Theorem.
For the second and third results, observe that $\max_{\alpha\in[0,1]}\varrho_{\alpha L}=\big(\log L/(Lh^{D_M+(D_M\vee1)})\big)^{1/2}$, so that, uniformly in $\alpha$ and $x$,
\begin{align*}
\left|\big(Lh^{D_M+1}\big)^{1/2}P(x)'\widehat d(\alpha|I)\right|&\le(Lh)^{1/2}h^{D_M/2}\max_{x\in\mathcal X}\|P(x)\|\,\big\|\widehat d(\alpha|I)\big\|=O_P\big((Lh)^{1/2}\varrho_{\alpha L}\big)=O_P\left(\frac{h^{1/2}\log L}{(Lh^{D_M+(D_M\vee1)})^{1/2}}\right),\\
\left|\frac{\big(Lh^{D_M+1}\big)^{1/2}P(x)'\widehat d(\alpha|I)}{h}\right|&=O_P\left(\left(\frac{L}{h}\right)^{1/2}\varrho_{\alpha L}\right)=O_P\left(\frac{\log L}{(Lh^{D_M+1+(D_M\vee1)})^{1/2}}\right).
\end{align*}
This ends the proof of the Theorem. $\square$

References

[1] Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7, 186-199.

Online Appendix E: Proofs of main results

E.1 Proof of Theorem 2

Recall that $s_1$ is the row vector $[0,1,0,\ldots,0]$ of dimension $s+2$, let $s_0=[1,0,\ldots,0]$, $S_0=s_0\otimes\mathrm{Id}_K$ and $S_1=s_1\otimes\mathrm{Id}_K$, so that $\widehat\beta_j(\alpha|I)=S_j\widehat\beta(\alpha|I)$, $j=0,1$,
$$\widehat V(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat b(\alpha|I),\qquad \overline V(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]b(\alpha|I).$$
Define, for $\widehat e(\alpha|I)$ as in (D.1),
$$\widehat v(\alpha|x,I)=\overline V(\alpha|x,I)+P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\qquad(\mathrm{E.1})$$
which is such that, for $\widehat d(\alpha|I)$ as in (D.2),
$$\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)=P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat d(\alpha|I).$$
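Bandwidth conditions of the form $\log L/(Lh^{D_M+1})=o(1)$, as required by Assumption H, can be checked numerically for a given bandwidth rule. The sketch below is illustrative only: the power $-0.2$ is a hypothetical choice of bandwidth exponent, not one prescribed by the paper.

```python
import math

def h_condition(L, D_M=1, power=0.2):
    # hypothetical bandwidth h = L^(-power); the condition
    # log L / (L h^(D_M + 1)) -> 0 holds iff power * (D_M + 1) < 1
    h = L ** (-power)
    return math.log(L) / (L * h ** (D_M + 1))

# the ratio shrinks along L = 10^3, ..., 10^8 for this choice
ratios = [h_condition(10 ** k) for k in range(3, 9)]
```

With `D_M = 1` and `power = 0.2`, the ratio behaves like $\log L/L^{0.6}$ and so vanishes; choosing `power` too large (above $1/(D_M+1)$) would make it diverge instead.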
As the eigenvalues of $\int_{\mathcal X}P(x)P(x)'\,dx$ are bounded away from infinity under Assumption R-(i),
$$\int_{\mathcal X}\int_0^1\big(\widehat V(\alpha|x,I)-\widehat v(\alpha|x,I)\big)^2\,d\alpha\,dx=O\left(\frac{\sup_{\alpha\in[0,1]}\big\|\widehat d(\alpha|I)\big\|^2}{h}\right)=O_P\left(\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}\right)$$
by Theorem D.1, which gives (4.5) since Assumption H ensures
$$\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}=o(1).$$
That $\mathrm{bias}_{IL}=O(1)$ and $\Sigma_{IL}=O(1)$ similarly follows from Assumption R-(i) and Proposition C.1-(i). Since $E[\widehat e(\alpha|I)]=-\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}R^{(1)}\big(b(\alpha|I);\alpha,I\big)=0$ for all $\alpha$ in $[0,1]$, it holds
$$E\left[\int_{\mathcal X}\int_0^1\big(\widehat v(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx\right]=\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx+\int_{\mathcal X}\int_0^1E\left[\left(P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\right)^2\right]d\alpha\,dx.$$
For the bias part, Theorem C.4 gives
\begin{align*}
\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx&=\int_{\mathcal X}\int_0^1\left(h^{s+1}P(x)'\frac{\alpha\,\mathrm{bias}_h(\alpha|I)}{I-1}+o\big(h^{s+1}\big)\right)^2d\alpha\,dx\\
&=h^{2(s+1)}\int_{\mathcal X}\int_0^1\left(P(x)'\frac{\alpha\,\mathrm{bias}_h(\alpha|I)}{I-1}\right)^2d\alpha\,dx+o\big(h^{2(s+1)}\big).
\end{align*}
Since $\alpha\,\mathrm{bias}_h(\alpha|I)/(I-1)$ differs from $\mathrm{bias}(\alpha|I)$ only for $\alpha$ in $[0,h]$ or $[1-h,1]$,
$$\int_{\mathcal X}\int_0^1\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)^2\,d\alpha\,dx=h^{2(s+1)}\int_{\mathcal X}\int_0^1\big(P(x)'\mathrm{bias}(\alpha|I)\big)^2\,d\alpha\,dx+o\big(h^{2(s+1)}\big)=h^{2(s+1)}\mathrm{bias}_{IL}^2+o\big(h^{2(s+1)}\big).$$
Arguing similarly with Lemma B.5-(i) yields
$$\int_{\mathcal X}\int_0^1E\left[\left(P(x)'\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat e(\alpha|I)\right)^2\right]d\alpha\,dx=\int_{\mathcal X}\int_0^1E\left[\left(\frac{P(x)'\alpha S_1\widehat e(\alpha|I)}{h(I-1)}\right)^2\right]d\alpha\,dx+O\left(\frac{1}{Lh^{D_M}}\right)=\frac{\sigma_{LI}^2}{LIh^{D_M+1}}+o\left(\frac{1}{Lh^{D_M+1}}\right).$$
Substituting in the bias-variance decomposition of the integrated mean squared error ends the proof of the Theorem. $\square$

E.2 Proof of Theorem 3

Assumption R-(i) and Proposition C.1-(i) imply that $P(x)'\Sigma_h(\alpha|I)P(x)=0$ holds only if $P(x)=0$, which is impossible in the AQR case. In the ASQR case, if $P(x)=0$ for some $x\in\mathcal X$ and all $K$ large enough, the approximation property S cannot hold, contradicting Assumption S-(ii). Assumptions R-(i), H and Proposition C.1-(i) imply
$$\max_{x\in\mathcal X}\big(P(x)'\Sigma_h(\alpha|I)P(x)\big)=O\left(\max_{x\in\mathcal X}\|P(x)\|^2\right)=O\big(h^{-D_M}\big).$$
By Theorem D.1, Lemma B.5, Assumptions R-(i) and H, and using the same notations as in the proof of Theorem 2,
\begin{align*}
&\big(Lh^{D_M+1}\big)^{1/2}\left(\widehat V(\alpha|x,I)-V(\alpha|x,I)-\frac{P'(x)\alpha S_1\widehat e(\alpha|I)}{h(I-1)}-\big(\overline V(\alpha|x,I)-V(\alpha|x,I)\big)\right)\\
&=\big(Lh^{D_M+1}\big)^{1/2}\left\{P'(x)S_0\widehat e(\alpha|I)+P'(x)\left[S_0+\frac{\alpha S_1}{h(I-1)}\right]\widehat d(\alpha|I)\right\}\\
&=\big(Lh^{D_M+1}\big)^{1/2}O_P\left(\frac{1}{(Lh^{D_M})^{1/2}}\right)+O\left(\frac{\big\|P(x)'\widehat d(\alpha|I)\big\|}{h}\right)\big(Lh^{D_M+1}\big)^{1/2}\\
&=O_P\left(h^{1/2}+\left(\frac{\log L}{Lh^{D_M-(D_M\vee1)}}\right)^{1/2}\right)=o_P(1).
\end{align*}
Since $\overline V(\alpha|x,I)-V(\alpha|x,I)=h^{s+1}P(x)'\mathrm{Bias}_h(\alpha|I)+o(h^{s+1})$, it remains to show that
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\frac{\alpha P(x)'S_1\widehat e(\alpha|I)}{h(I-1)}\stackrel{d}{\longrightarrow}N(0,1).$$
Write
$$\left(\frac{LIh}{P(x)'\Sigma_h(\alpha|I)P(x)}\right)^{1/2}\frac{\alpha P(x)'S_1\widehat e(\alpha|I)}{h(I-1)}=\sum_{\ell=1}^Lr_\ell(\alpha|x,I)$$
with $r_\ell(\alpha|x,I)=\mathbb{I}(I_\ell=I)\sum_{i=1}^{I_\ell}r_{i\ell}(\alpha|x,I)$ and
\begin{align*}
r_{i\ell}(\alpha|x,I)&=\left(\frac{\alpha^2}{LIh(I-1)^2}\right)^{1/2}\frac{P(x)'}{\big(P(x)'\Sigma_h(\alpha|I)P(x)\big)^{1/2}}\,S_1\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\\
&\quad\times\int_{-\alpha/h}^{(1-\alpha)/h}\big\{\mathbb{I}\big(B_{i\ell}\le P(x_\ell,t)'b(\alpha|I)\big)-(\alpha+ht)\big\}P(x_\ell,t)K(t)\,dt.
\end{align*}
Since $E[r_\ell(\alpha|x,I)]=0$ and $\max_{1\le\ell\le L}\big|L\,\mathrm{Var}(r_\ell(\alpha|x,I))-1\big|=o(1)$, it is sufficient to show that $\max_{1\le\ell\le L}E\big[|r_\ell(\alpha|x,I)|^3\big]=o(1)$ holds, see e.g. the Central Limit Theorem on p. 179 in Pollard (2002). But by Assumption R-(i), Proposition C.1-(i), Lemma B.2 and (C.3),
$$|r_{i\ell}(\alpha|x,I)|\le\frac{C}{(Lh)^{1/2}}\,\frac{\|P(x)\|}{\|P(x)\|}\times\max_{x\in\mathcal X}\|P(x)\|=O\left(\frac{1}{(Lh^{D_M+1})^{1/2}}\right).$$
It follows by Assumption H that
$$\max_{1\le\ell\le L}E\big[|r_\ell(\alpha|x,I)|^3\big]\le I\max_{1\le\ell\le L,\,1\le i\le I_\ell}|r_{i\ell}(\alpha|x,I)|\max_{1\le\ell\le L}E\big[r_\ell^2(\alpha|x,I)\big]=O\left(\frac{1}{(Lh^{D_M+1})^{1/2}}\right)=o(1).$$
This ends the proof of the Theorem. $\square$

E.3 Proof of Theorem 4

The proof of Theorem 4 requires some specific additional results. The next Lemma gives an expansion for
$$C_h=\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\left\{\int\int\frac1h\pi\left(\frac{a_1-\alpha_1}{h}\right)K\left(\frac{a_1-\alpha_1}{h}\right)\times\frac1h\pi\left(\frac{a_2-\alpha_2}{h}\right)'K\left(\frac{a_2-\alpha_2}{h}\right)[a_1\wedge a_2-a_1a_2]\,da_1\,da_2\right\}d\alpha_1\,d\alpha_2.$$
Below, $s_0'=[1,0,\ldots,0]$, $s_1'=[0,1,0,\ldots,0]$ and $s_2'=[0,0,1,0,\ldots,0]$ are vectors of dimension $s+2$.
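The argument above is a central limit theorem for a standardized sum of independent, bounded, mean-zero terms $r_\ell$. A small simulation (an illustrative sketch with arbitrary bounded summands, not the paper's $r_\ell$) shows such a standardized sum behaving like a standard normal:

```python
import random
import statistics

random.seed(0)
L = 400      # number of summands per sum
reps = 2000  # number of simulated standardized sums

def standardized_sum():
    # bounded, mean-zero summands: uniform on [-1, 1], variance 1/3 each,
    # so the sum has variance L/3 and is standardized by (L/3)^(1/2)
    draws = [random.uniform(-1.0, 1.0) for _ in range(L)]
    return sum(draws) / (L / 3) ** 0.5

z = [standardized_sum() for _ in range(reps)]
m = statistics.mean(z)
v = statistics.variance(z)
```

The sample mean of the standardized sums is close to 0 and their sample variance close to 1, as the normal limit predicts.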
Lemma E.1 Suppose that Assumption H holds. Assume that $f(\cdot)=f_h(\cdot)$ and $g(\cdot)=g_h(\cdot)$ are continuously differentiable functions with, when $h$ goes to $0$,
$$\sup_{\alpha\in[0,1]}|f(\alpha)|=O(1)\ \text{and}\ \sup_{\alpha\in[0,1]}|g(\alpha)|=O(1),\qquad \sup_{\alpha\in[h,1-h]}\big|f^{(1)}(\alpha)\big|=O(1)\ \text{and}\ \sup_{\alpha\in[h,1-h]}\big|g^{(1)}(\alpha)\big|=O(1),$$
$$\sup_{\alpha\in[0,h]\cup[1-h,1]}\big|f^{(1)}(\alpha)\big|=O\left(\frac1h\right)\ \text{and}\ \sup_{\alpha\in[0,h]\cup[1-h,1]}\big|g^{(1)}(\alpha)\big|=O\left(\frac1h\right).$$
Then, if $A$ is a random variable with a uniform distribution over $[0,1]$,
\begin{align*}
C_h&=\mathrm{Cov}\left(\left[\int_0^Ag(a)\Omega_h(a)\,da\right]s_0,\left[\int_0^Af(a)\Omega_h(a)\,da\right]s_0\right)\\
&\quad+h\left\{\mathrm{Cov}\left(g(A)\Omega_h(A)s_1,\left[\int_0^Af(a)\Omega_h(a)\,da\right]s_0\right)+\mathrm{Cov}\left(\left[\int_0^Ag(a)\Omega_h(a)\,da\right]s_0,f(A)\Omega_h(A)s_1\right)\right\}\\
&\quad+h^2\,\mathrm{Cov}\big(g(A)\Omega_h(A)s_1,f(A)\Omega_h(A)s_1\big)-h^2E\big[f(A)\Omega_h(A)\big[s_2s_0'+s_0s_2'\big]g(A)\Omega_h(A)\big]+o\big(h^2\big).
\end{align*}
Proof of Lemma E.1: see Appendix F.

Consider two functions $\varphi_0(\alpha|x)$ and $\varphi_1(\alpha|x)$ and define
$$\widehat I_\varphi(x|I)=\int_0^1\left[\varphi_0(\alpha|x)s_0'+\frac{\varphi_1(\alpha|x)s_1'}{h}\right]\otimes P'(x)\times\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\,d\alpha.$$
The purpose of the next Lemma is to compute the variance of this integral.
Define for this purpose
$$P_0=P_0(I)=E\big[\mathbb{I}(I_\ell=I)P(x_\ell)P(x_\ell)'\big],\qquad P_1(\alpha)=P_1(\alpha|I)=E\left[\frac{P(x_\ell)P(x_\ell)'\,\mathbb{I}(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\right],$$
$$P_2(\alpha)=P_2(\alpha|I)=-E\left[\frac{P(x_\ell)P(x_\ell)'\,\mathbb{I}(I_\ell=I)\,B^{(2)}(\alpha|x_\ell,I_\ell)}{\big(B^{(1)}(\alpha|x_\ell,I_\ell)\big)^2}\right],$$
and set $M_1(\alpha)=\Omega_h(\alpha)\otimes P_1(\alpha)$, $M_2(\alpha)=\Omega_h(\alpha)\otimes P_2(\alpha)$.

Lemma E.2 Suppose $s\ge D_M/2$, and that Assumptions A, H, S and R hold. Assume that $\varphi_0(\alpha|x)$, $\varphi_1(\alpha|x)$ and $\partial\varphi_1(\alpha|x)/\partial\alpha$ are continuous functions in $(\alpha,x)\in[0,1]\times\mathcal X$. Let $A$ be a random variable with a uniform distribution over $[0,1]$. Then $\mathrm{Var}\big(\sqrt{LIh^{D_M}}\,\widehat I_\varphi(x|I)\big)=\sigma_L^2(x|I)+\big\|h^{D_M/2}P(x)\big\|^2o(1)$ with
$$\sigma_L^2(x|I)=\mathrm{Var}\left[h^{D_M/2}P'(x)\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha|I)^{-1}\,d\alpha\;P_0(I)^{1/2}\right]$$
and $\mathrm{Var}\big(\sqrt{LI}\int_{\mathcal X}\widehat I_\varphi(x|I)\,dx\big)=\sigma_L^2(I)+o(1)$ with
$$\sigma_L^2(I)=\mathrm{Var}\left[\int_0^A\left\{\int_{\mathcal X}P'(x)\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)dx\right\}P_1(\alpha|I)^{-1}\,d\alpha\;P_0^{1/2}(I)\right].$$

Proof of Lemma E.2. Abbreviate $R^{(2)}\big(b(\alpha|I);\alpha,I\big)$ and $\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)$ into $R^{(2)}(\alpha)$ and $\widehat R^{(1)}(\alpha)$ respectively. We now give a suitable expansion for $R^{(2)}(\alpha)^{-1}$. From the end of the proof of Lemma B.2 and Theorem C.4, it holds
$$R^{(2)}(\alpha)=\int\left[\int_{I_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'K(t)\,g\big[B(\alpha+ht|x,I)+o\big(h^{s+1}\big)\,\big|\,x,I\big]\,dt\right]\otimes P(x)P(x)'\,f(x,I)\,dx.$$
Since $s\ge1$, $B^{(1)}(\cdot|x,I)$ is continuously differentiable. A first-order Taylor expansion gives that, uniformly,
$$R^{(2)}(\alpha)=M_1(\alpha)+hM_2(\alpha)+o(h).$$
It then follows, uniformly over $[0,1]$,
$$\big[R^{(2)}(\alpha)\big]^{-1}=\big[\mathrm{Id}+hM_1(\alpha)^{-1}M_2(\alpha)+o(h)\,\mathrm{Id}\big]^{-1}M_1(\alpha)^{-1}=M_1(\alpha)^{-1}-hM_1(\alpha)^{-1}M_2(\alpha)M_1(\alpha)^{-1}+o(h)\,\mathrm{Id}.$$
Now $M_1(\alpha)^{-1}=\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}$ and
$$M_1(\alpha)^{-1}M_2(\alpha)M_1(\alpha)^{-1}=\big[\Omega_h(\alpha)^{-1}\Omega_h(\alpha)\Omega_h(\alpha)^{-1}\big]\otimes\big[P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]$$
with $s_1'\Omega_h(\alpha)^{-1}\Omega_h(\alpha)=s_1'+c(\alpha)s_p'$, where $c(\alpha)=c_h(\alpha)$ and the entries of $\Omega_h(\alpha)^{-1}$ satisfy the smoothness conditions of Lemma E.1. This gives, since the eigenvalues of $\Omega_h(\alpha)^{-1}$ and $P_1(\alpha)^{-1}$ are bounded away from infinity uniformly in $\alpha$,
$$\mathrm{Var}^{1/2}\big(\sqrt{LI}\,\widehat I_\varphi(x|I)\big)=\mathrm{Var}^{1/2}\big(\widehat I_0(x|I)+\widehat I_1(x|I)+\widehat I_2(x|I)+\widehat I_p(x|I)\big)+o(1)\,\|P(x)\|\left\|\mathrm{Var}^{1/2}\left(\sqrt{LI}\int_0^1\widehat R^{(1)}(\alpha)\,d\alpha\right)\right\|$$
with
\begin{align*}
\widehat I_0(x|I)&=\sqrt{LI}\int_0^1\varphi_0(\alpha|x)\,[s_0\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_1(x|I)&=-h\sqrt{LI}\int_0^1\varphi_0(\alpha|x)\,[s_0\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_2(x|I)&=\sqrt{LI}\int_0^1\varphi_1(\alpha|x)\left[\frac{s_1}{h}\otimes P(x)\right]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha,\\
\widehat I_p(x|I)&=\sqrt{LI}\int_0^1\varphi_1(\alpha|x)\,c(\alpha)\,[s_p\otimes P(x)]'\big[\Omega_h(\alpha)^{-1}\otimes P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big]\widehat R^{(1)}(\alpha)\,d\alpha.
\end{align*}
Observe now that, for any functions $f(\cdot)$ and $g(\cdot)$ satisfying the conditions of Lemma E.1,
\begin{align*}
C_h(f,g)&=E\bigg[\mathbb{I}(I_\ell=I)\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\\
&\quad\times\bigg\{G\bigg[\min\bigg(P\Big(x_\ell,\frac{a_1-\alpha_1}{h}\Big)'b(\alpha_1|I),P\Big(x_\ell,\frac{a_2-\alpha_2}{h}\Big)'b(\alpha_2|I)\bigg)\bigg|x_\ell,I\bigg]\\
&\qquad-G\bigg[P\Big(x_\ell,\frac{a_1-\alpha_1}{h}\Big)'b(\alpha_1|I)\bigg|x_\ell,I\bigg]G\bigg[P\Big(x_\ell,\frac{a_2-\alpha_2}{h}\Big)'b(\alpha_2|I)\bigg|x_\ell,I\bigg]\bigg\}\\
&\quad\times R^{(2)}(\alpha_1)^{-1}\bigg\{\bigg[\pi\Big(\frac{a_1-\alpha_1}{h}\Big)\pi\Big(\frac{a_2-\alpha_2}{h}\Big)'\bigg]\otimes\big[P(x_\ell)P(x_\ell)'\big]\bigg\}R^{(2)}(\alpha_2)^{-1}\\
&\quad\times\frac{1}{h^2}K\Big(\frac{a_1-\alpha_1}{h}\Big)K\Big(\frac{a_2-\alpha_2}{h}\Big)\,d\alpha_1\,d\alpha_2\bigg].
\end{align*}
Now (C.3), $\max_{(x,t)\in\mathcal X\times[-1,1]}\|P(x,t)\|=O\big(h^{-D_M/2}\big)$ and Lemma B.1-(iii) give
$$P\Big(x_\ell,\frac{a-\alpha}{h}\Big)'b(\alpha|I)=B(a|x_\ell,I)+o\big(h^{s+1-D_M/2}\big)$$
uniformly in $a$, $\alpha$ and $x_\ell$ with $(a-\alpha)/h$ in the support of $K(\cdot)$, $|a-\alpha|\le h$. Since $s+1-D_M/2\ge1$,
\begin{align*}
C_h(f,g)&=\int_0^1\int_0^1f(\alpha_1)g(\alpha_2)\{a_1\wedge a_2-a_1a_2\}\\
&\quad\times R^{(2)}(\alpha_1)^{-1}\bigg\{\bigg[\pi\Big(\frac{a_1-\alpha_1}{h}\Big)\pi\Big(\frac{a_2-\alpha_2}{h}\Big)'\bigg]\otimes P_0\bigg\}R^{(2)}(\alpha_2)^{-1}\\
&\quad\times\frac{1}{h^2}K\Big(\frac{a_1-\alpha_1}{h}\Big)K\Big(\frac{a_2-\alpha_2}{h}\Big)\,d\alpha_1\,d\alpha_2\,dx_1\,dx_2+o(1)\,\mathrm{Id}.
\end{align*}
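The kernel $a_1\wedge a_2-a_1a_2$ appearing in $C_h(f,g)$ is the covariance of a Brownian bridge, i.e. $\mathrm{Cov}\big(\mathbb{I}(A\le a_1),\mathbb{I}(A\le a_2)\big)$ for $A$ uniform on $[0,1]$, and its double integral over the unit square equals $1/12$. A small quadrature sketch (illustrative only, not part of the proof) confirms this value:

```python
def bridge_double_integral(n=400):
    # midpoint-rule approximation of the double integral of
    # min(a1, a2) - a1 * a2 over the unit square [0, 1]^2
    total = 0.0
    for i in range(n):
        a1 = (i + 0.5) / n
        for j in range(n):
            a2 = (j + 0.5) / n
            total += min(a1, a2) - a1 * a2
    return total / n ** 2

val = bridge_double_integral()
```

The grid value agrees with the exact answer $\int_0^1\int_0^1 a_1\wedge a_2\,da_1da_2-1/4=1/3-1/4=1/12$.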
Now applying Lemma E.1 gives, since $p\ge2$,
$$\mathrm{Var}\big(\widehat I_p(x|I)\big)=\|P(x)\|^2o\big(h^2\big),\qquad \mathrm{Cov}\big(\widehat I_p(x|I),\widehat I_j(x|I)\big)=\|P(x)\|^2o(1),\ j=0,1,2,$$
$\big\|\mathrm{Var}\big(\sqrt{LI}\int_0^1\widehat R^{(1)}(\alpha)\,d\alpha\big)\big\|=O(1)$, and
\begin{align*}
&\mathrm{Var}\big(\widehat I_0(x|I)+\widehat I_1(x|I)+\widehat I_2(x|I)\big)\\
&=P'(x)\bigg\{\mathrm{Var}\left[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha\;P_0^{1/2}\right]\\
&\qquad-2\,\mathrm{Cov}\left[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha\;P_0^{1/2},\varphi_1(A|x)P_1(A)^{-1}P_0^{1/2}\right]\\
&\qquad+\mathrm{Var}\big[\varphi_1(A|x)P_1(A)^{-1}P_0^{1/2}\big]\bigg\}P(x)+o(1)\,\|P(x)\|^2\\
&=\mathrm{Var}\bigg\{P'(x)\bigg[\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha-\varphi_1(A|x)P_1(A)^{-1}\bigg]P_0^{1/2}\bigg\}+o(1)\,\|P(x)\|^2.
\end{align*}
Observe now that
$$\frac{\partial}{\partial\alpha}\big[\varphi_1(\alpha|x)P_1(\alpha)^{-1}\big]=\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}$$
so that
$$\int_0^A\big(\varphi_0(\alpha|x)P_1(\alpha)^{-1}-\varphi_1(\alpha|x)P_1(\alpha)^{-1}P_2(\alpha)P_1(\alpha)^{-1}\big)\,d\alpha-\varphi_1(A|x)P_1(A)^{-1}=\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha)^{-1}\,d\alpha-\varphi_1(0|x)P_1(0)^{-1}.$$
This gives
$$\mathrm{Var}\big(\sqrt{LI}\,\widehat I_\varphi(x|I)\big)=\mathrm{Var}\left\{P'(x)\int_0^A\left(\varphi_0(\alpha|x)-\frac{\partial\varphi_1(\alpha|x)}{\partial\alpha}\right)P_1(\alpha)^{-1}P_0^{1/2}\,d\alpha\right\}+o(1)\,\|P(x)\|^2$$
as stated in the first result of the Lemma. The second similarly follows, observing that $\big\|\int_{\mathcal X}\varphi_j(\alpha|x)P(x)\,dx\big\|=O(1)$, $j=0,1$. $\square$

Consider two real-valued continuous functions $F_0(b_0,b_1)$ and $F_1(b_0,b_1)$.
Define
$$\varphi_0(\alpha|x,I)=F_0\big(B(\alpha|x,I),B^{(1)}(\alpha|x,I)\big),\qquad \varphi_1(\alpha|x,I)=F_1\big(B(\alpha|x,I),B^{(1)}(\alpha|x,I)\big),$$
$$\widehat I_F(x|I)=\int_0^1\left[\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right]\otimes P'(x)\times\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\,d\alpha.$$
A condition ensuring that the variances $\sigma_L^2(x|I)$ and $\sigma_L^2(I)$ of Lemma E.2 do not vanish is (4.9), that is
$$\varphi_0(\alpha|x,I)-\frac{\partial\varphi_1(\alpha|x,I)}{\partial\alpha}\ne0.$$

Proposition E.3 Suppose $s\ge D_M/2$, and that Assumptions A, H, S and R hold. Assume that $\varphi_0(\alpha|x)$, $\varphi_1(\alpha|x)$ and $\partial\varphi_1(\alpha|x)/\partial\alpha$ are continuous functions in $(\alpha,x)\in[0,1]\times\mathcal X$. Let $\sigma_L^2(x|I)$ and $\sigma_L^2(I)$ be as in Lemma E.2. Then, if (4.9) holds for some $\alpha$ of $[0,1]$ and if $Lh^{D_M+2}$ diverges, $\sqrt{LIh^{D_M}}\,\widehat I_F(x|I)/\sigma_L(x|I)$ converges in distribution to a standard normal. If (4.9) holds for some $(\alpha,x)$ of $[0,1]\times\mathcal X$ and $Lh^2$ diverges, $\sqrt{LI}\int_{\mathcal X}\widehat I_F(x|I)\,dx/\sigma_L(I)$ converges in distribution to a standard normal.

Proof of Proposition E.3. The eigenvalues of $P_1(\alpha)^{-1}$, $P_2(\alpha)$ and $P_0$ are bounded uniformly in $K$ and $\alpha$ by Assumptions R and S, and $\big\|h^{D_M/2}P(x)\big\|$ is bounded away from 0 and infinity by Assumptions R and H. Then, if (4.9) holds for some $\alpha$, $\sigma_L^2(x|I)$ is bounded away from 0 and infinity and the exact order of $\mathrm{Var}\big(\widehat I_F(x|I)\big)$ is $1/(LIh^{D_M})$. We now check the Lyapounov condition. Write $\widehat R^{(1)}(\alpha)=\frac{1}{LI}\sum_{\ell=1}^L\mathbb{I}[I_\ell=I]r_\ell(\alpha)$, with
$$r_\ell(\alpha)=\sum_{i=1}^{I_\ell}\int_{-\alpha/h}^{(1-\alpha)/h}\big\{\mathbb{I}\big(B_{i\ell}\le P(x_\ell,t)'b(\alpha|I)\big)-(\alpha+ht)\big\}\,\pi(t)\otimes P(x_\ell)\,K(t)\,dt.$$
This gives, since the eigenvalues of $R^{(2)}(\alpha)$ are asymptotically bounded away from 0 by Lemma B.2 and (C.3),
$$E\left[\left|\int_0^1\left[\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right]\otimes P'(x)\big[R^{(2)}(\alpha)\big]^{-1}\frac{r_\ell(\alpha)-E[r_\ell(\alpha)]}{LI}\,d\alpha\right|^3\right]\le C\frac{h^{-1}\max_{x\in\mathcal X}\|P(x)\|^2}{LI}\cdot\frac{1}{LI}\,\mathrm{Var}\big(\widehat I_F(x|I)\big)=\frac{C}{L^2h^{D_M+1}}\,\mathrm{Var}\big(\widehat I_F(x|I)\big).$$
That $Lh^{D_M+2}$ diverges implies that the Lyapounov condition holds since
$$\frac{C}{L^2h^{D_M+1}}\cdot\frac{L\,\mathrm{Var}\big(\widehat I_F(x|I)\big)}{\mathrm{Var}^{3/2}\big(\widehat I_F(x|I)\big)}=\frac{C}{Lh^{D_M+1}\,\mathrm{Var}^{1/2}\big(\widehat I_F(x|I)\big)}=O\left(\frac{1}{(Lh^{D_M+2})^{1/2}}\right)\to0,$$
hence $\widehat I_F(x|I)/\mathrm{Var}^{1/2}\big(\widehat I_F(x|I)\big)$ is asymptotically $N(0,1)$. For $\sqrt{LI}\int_{\mathcal X}\widehat I_F(x|I)\,dx$, recall that $\big\|\int_{\mathcal X}|P(x)|\,dx\big\|=O(1)$ by Assumption R. This also gives
$$E\left[\left|\int_0^1\left[\int_{\mathcal X}\left(\varphi_0(\alpha|x,I)s_0'+\frac{\varphi_1(\alpha|x,I)s_1'}{h}\right)\otimes P'(x)\,dx\right]\big[R^{(2)}(\alpha)\big]^{-1}\frac{r_\ell(\alpha)-E[r_\ell(\alpha)]}{LI}\,d\alpha\right|^3\right]\le C\frac{h^{-1}}{LI}\cdot\frac{1}{LI}\,\mathrm{Var}\left(\int_{\mathcal X}\widehat I_F(x|I)\,dx\right)=\frac{C}{L^2h}\,\mathrm{Var}\left(\int_{\mathcal X}\widehat I_F(x|I)\,dx\right).$$
The Lyapounov condition holds when $Lh^2$ diverges, because
$$\frac{C}{L^2h}\cdot\frac{L\,\mathrm{Var}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}{\mathrm{Var}^{3/2}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}=\frac{C}{Lh\,\mathrm{Var}^{1/2}\big(\int_{\mathcal X}\widehat I_F(x|I)\,dx\big)}=O\left(\frac{1}{(Lh^2)^{1/2}}\right)\to0.\ \square$$

Proof of Theorem 4. Let $\widehat d(\alpha|I)$ and $\widehat e(\alpha|I)$ be as in (D.2) and (D.1),
$$\widehat e(\alpha|I)=-\big(R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big)^{-1}\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big),\qquad \widehat d(\alpha|I)=\widehat b(\alpha|I)-b(\alpha|I)-\widehat e(\alpha|I).$$
Let $\widehat I_F(x|I)$ be as above, replacing $\varphi_j(\cdot)$ with $\varphi_{jI}(\cdot)$, $j=0,1$.
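The Lyapounov ratio above is of order $(Lh^{D_M+2})^{-1/2}$, so it vanishes exactly when $Lh^{D_M+2}$ diverges. A numerical sketch (with a hypothetical bandwidth exponent, chosen only for illustration) makes the rate concrete:

```python
def lyapunov_ratio(L, D_M=1, power=0.25):
    # hypothetical h = L^(-power); the ratio is of order (L h^(D_M+2))^(-1/2),
    # which vanishes iff power * (D_M + 2) < 1
    h = L ** (-power)
    return (L * h ** (D_M + 2)) ** -0.5

# with D_M = 1 and power = 0.25, the ratio behaves like L^(-1/8)
r = [lyapunov_ratio(10 ** k) for k in range(3, 9)]
```

The monotone decay of `r` illustrates the condition; with `power` above $1/(D_M+2)$ the ratio would not vanish and the normal approximation argument would fail.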
Then the second-order Taylor inequality gives
\begin{align*}
\widehat\theta(x)-\theta(x)&=\sum_{I\in\mathcal I}\int_0^1\Big[\varphi_{0I}(\alpha,x)\big(\overline B(\alpha|x,I)-B(\alpha|x,I)\big)+\varphi_{1I}(\alpha,x)\big(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\big)\Big]\,d\alpha+\sum_{I\in\mathcal I}\widehat I_F(x|I)\\
&\quad+\sum_{I\in\mathcal I}\int_0^1\left[\left(\varphi_{0I}(\alpha,x)s_0'+\frac{\varphi_{1I}(\alpha,x)s_1'}{h}\right)\otimes P'(x)\right]\widehat d(\alpha|I)\,d\alpha\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\Big[\big(\overline B(\alpha|x,I)-B(\alpha|x,I)\big)^2+\big(\overline B^{(1)}(\alpha|x,I)-B^{(1)}(\alpha|x,I)\big)^2\Big]\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left[\big([s_0'\otimes P'(x)]\widehat e(\alpha|I)\big)^2+\left(\left[\frac{s_1'}{h}\otimes P'(x)\right]\widehat e(\alpha|I)\right)^2\right]\\
&\quad+O(1)\sup_{(\alpha,x,I)\in[0,1]\times\mathcal X\times\mathcal I}\left[\big([s_0'\otimes P'(x)]\widehat d(\alpha|I)\big)^2+\left(\left[\frac{s_1'}{h}\otimes P'(x)\right]\widehat d(\alpha|I)\right)^2\right].
\end{align*}
Hence
$$\widehat\theta(x)-\theta(x)=o(h^s)+\sum_{I\in\mathcal I}\widehat I_F(x|I)+\frac{1}{(Lh^{D_M})^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2+(D_M\vee1)})^{1/2}}+\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)=o(h^s)+\sum_{I\in\mathcal I}\widehat I_F(x|I)+o_P\left(\frac{1}{(Lh^{D_M})^{1/2}}\right).$$
Proposition E.3 then gives the result since the $\widehat I_F(x|I)$ are independent. The asymptotic normality of $\widehat\theta$ similarly follows from Assumption R, which gives $\big\|\int_{\mathcal X}|P(x)|\,dx\big\|=O(1)$, and Theorem D.1, which implies
\begin{align*}
\widehat\theta-\theta&=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+O\left(\sup_{\alpha\in[0,1]}\frac{\big\|\widehat d(\alpha|I)\big\|}{h}\right)+\frac{1}{L^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)\\
&=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+\frac{1}{L^{1/2}}O_P\left(\frac{\log L}{(Lh^{D_M+2})^{1/2}}\right)=o(h^s)+\sum_{I\in\mathcal I}\int_{\mathcal X}\widehat I_F(x|I)\,dx+o_P\left(\frac{1}{L^{1/2}}\right).
\end{align*}
$\square$

E.4 Proof of Theorem A.1

By Theorems C.4 and D.1, Lemma B.5 and using the notations of the proof of Theorem 2,
\begin{align*}
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\widehat B(\alpha|x,I)-B(\alpha|x,I)\big|&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\big[\widehat b(\alpha|I)-b(\alpha|I)\big]\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\overline B(\alpha|x,I)-B(\alpha|x,I)\big|\\
&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\widehat e(\alpha|I)\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big\|P(x)'S_0\widehat d(\alpha|I)\big\|+o\big(h^{s+1}\big)\\
&=O_P\left[\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\left\{1+\left(\frac{\log L}{Lh^{D_M+(D_M\vee1)}}\right)^{1/2}\right\}\right]+o\big(h^{s+1}\big)\\
&=O_P\left(\left(\frac{\log L}{Lh^{D_M}}\right)^{1/2}\right)+o\big(h^{s+1}\big),
\end{align*}
\begin{align*}
\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\widehat V(\alpha|x,I)-V(\alpha|x,I)\big|&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|P(x)'\left(S_0+\frac{\alpha}{h}S_1\right)\big[\widehat b(\alpha|I)-b(\alpha|I)\big]\right|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|\overline V(\alpha|x,I)-V(\alpha|x,I)\big|\\
&\le\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'S_0\widehat e(\alpha|I)\big|+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left|\frac{P(x)'S_1\widehat e(\alpha|I)}{h}\right|\\
&\quad+\sup_{(\alpha,x)\in[0,1]\times\mathcal X}\left\|P(x)'\left(S_0\widehat d(\alpha|I)+\frac{\alpha S_1\widehat d(\alpha|I)}{h}\right)\right\|+O\big(h^{s+1}\big)\\
&=O_P\left[\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\left\{1+\left(\frac{\log L}{Lh^{D_M+1+(D_M\vee1)}}\right)^{1/2}\right\}\right]+O\big(h^{s+1}\big)\\
&=O_P\left(\left(\frac{\log L}{Lh^{D_M+1}}\right)^{1/2}\right)+O\big(h^{s+1}\big).
\end{align*}
This ends the proof of the Theorem.
$\square$

Online Appendix F: Proofs of intermediary results

F.1 Lemmas B.1, B.2 and C.3

Proof of Lemma B.1. Consider the harder ASQR case. (i) It holds that, for $\beta_k(\cdot|\cdot)$ as in (2.11),
\begin{align*}
B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)+\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)-\sum_{k=1}^KP_k(x)\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\\
&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\sum_{k=1}^KP_k(x)\left(\beta_k(\alpha+ht|I)-\sum_{p=0}^{s}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\right)-\frac{(ht)^{s+1}}{(s+1)!}\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I).
\end{align*}
A Taylor expansion with integral remainder gives
$$\beta_k(\alpha+ht|I)-\sum_{p=0}^{s}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)=\frac{(ht)^{s+1}}{s!}\int_0^1\beta_k^{(s+1)}(\alpha+uht|I)(1-u)^s\,du$$
so that
\begin{align*}
B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\frac{(ht)^{s+1}}{s!}\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha+uht|I)-B^{(s+1)}(\alpha+uht|x,I)\right\}(1-u)^s\,du\\
&\quad+\frac{(ht)^{s+1}}{s!}\int_0^1\big\{B^{(s+1)}(\alpha+uht|x,I)-B^{(s+1)}(\alpha|x,I)\big\}(1-u)^s\,du\\
&\quad+\frac{(ht)^{s+1}}{(s+1)!}\left\{B^{(s+1)}(\alpha|x,I)-\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I)\right\}.
\end{align*}
Hence, since $B^{(s+1)}(\alpha|x,I)$ is continuous, by Property S and Proposition C.1,
$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)\big|=o\big(h^{s+1}\big)+o\Big(K^{-\frac{s+1}{D_M}}\Big)=o\big(h^{s+1}\big)\qquad(\mathrm{F.1})$$
since $K^{-1/D_M}=O(h)$.
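The order-$(s+1)$ Taylor expansion with integral remainder used for $\beta_k$ is the general identity $g(\alpha+\delta)-\sum_{p\le s}\frac{\delta^p}{p!}g^{(p)}(\alpha)=\frac{\delta^{s+1}}{s!}\int_0^1g^{(s+1)}(\alpha+u\delta)(1-u)^s\,du$. A numerical sketch for $s=2$ and $g=\exp$ (illustrative only, not part of the proof):

```python
import math

def taylor_order_s(alpha, delta, s=2, n=20000):
    # for g = exp every derivative is exp, so the identity reads
    # exp(alpha+delta) = sum_{p<=s} delta^p exp(alpha)/p!
    #   + delta^(s+1)/s! * ∫_0^1 exp(alpha + u*delta) (1-u)^s du
    partial = math.exp(alpha) * sum(delta ** p / math.factorial(p)
                                    for p in range(s + 1))
    integral = sum(math.exp(alpha + (k + 0.5) / n * delta)
                   * (1 - (k + 0.5) / n) ** s for k in range(n)) / n
    return partial + delta ** (s + 1) / math.factorial(s) * integral

approx = taylor_order_s(0.2, 0.5)
exact = math.exp(0.7)
```

The midpoint-rule remainder reproduces $\exp(0.7)$ up to quadrature error, which is far below the tolerance checked.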
Observe also that, uniformly in $\alpha$, $x$ and $t$ as above,
\begin{align*}
\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]&=\sum_{p=1}^{s+1}\frac{h^pt^{p-1}}{(p-1)!}\sum_{k=1}^KP_k(x)\beta_k^{(p)}(\alpha|I)\\
&=h\big(B^{(1)}(\alpha|x,I)+o(1)\big)+h\left(\sum_{p=2}^{s+1}\frac{h^{p-1}t^{p-1}}{(p-1)!}B^{(p)}(\alpha|x,I)+o(1)\right)=hB^{(1)}(\alpha|x,I)+o(h)
\end{align*}
by Property S, which also gives
$$\max_{p=1,\ldots,s+1}\left(\max_{x\in\mathcal X}\frac{\big|P(x)'b_p^*(\alpha|I)\big|}{h^p}\right)=\max_{p=1,\ldots,s+1}\max_{(\alpha,x)\in[0,1]\times\mathcal X}h^{p-1}\big|B^{(p)}(\alpha|x,I)+o(1)\big|=\max_{(\alpha,x)\in[0,1]\times\mathcal X}B^{(1)}(\alpha|x,I)+o(1)\le\overline f$$
if $\overline f$ is large enough and $h$ small enough, so that $b^*(\alpha|I)$ is in $\mathcal B_{I\alpha,h}$ since $B^{(1)}(\cdot|\cdot,\cdot)$ is bounded away from 0 and infinity by Proposition C.1. Suppose now that $\|b-b^*(\alpha|I)\|\le Ch/K^{1/2}=Ch^{1+D_M/2}$. Then
$$\left|\frac{\partial}{\partial t}\big[P(x,t)'b\big]\right|\ge\left|\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]\right|-h\,\|b-b^*(\alpha|I)\|\,\|P(x)\|\ge\left|\frac{\partial}{\partial t}\big[P(x,t)'b^*(\alpha|I)\big]\right|-O(h),$$
$$\big|P(x)'b_p\big|\le\big|P(x)'b_p^*(\alpha|I)\big|+\|b-b^*(\alpha|I)\|\,\|P(x)\|\le\big|P(x)'b_p^*(\alpha|I)\big|+Ch,\quad p=1,\ldots,s+1,$$
and $\mathcal B\big(b^*(\alpha|I),Ch^{1+D_M/2}\big)\subset\mathcal B_{I\alpha,h}$ when $h$ is small enough, provided $C$ is small enough. Hence (i) holds. (ii) follows from the Implicit Function Theorem and the definition of $\mathcal B_{I\alpha,h}$. The first equality of (iii) is (F.1). For the second, note that $\alpha+ht\ge h$ for all $t$ in $I_{\alpha,h}$.
It holds
$$B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)+\sum_{k=1}^KP_k(x)\left(\beta_k(\alpha+ht|I)-\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)\right)$$
with
$$\beta_k(\alpha+ht|I)-\sum_{p=0}^{s+1}\frac{(ht)^p}{p!}\beta_k^{(p)}(\alpha|I)=\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\beta_k^{(s+2)}(\alpha+uht|I)(1-u)^{s+1}\,du,$$
recalling, as established in the proof of Proposition C.1-(i) for $\alpha>0$,
$$\beta_k^{(s+2)}(\alpha|I)=\frac1\alpha\Big((I-1)\gamma_k^{(s+1)}(\alpha|I)-(I+s)\beta_k^{(s+1)}(\alpha|I)\Big),\qquad B^{(s+2)}(\alpha|x,I)=\frac1\alpha\Big((I-1)V^{(s+1)}(\alpha|x,I)-(I+s)B^{(s+1)}(\alpha|x,I)\Big).\qquad(\mathrm{F.2})$$
Hence
\begin{align*}
&B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)-\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\\
&=B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\\
&\quad+\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+2)}(\alpha+uht|I)-B^{(s+2)}(\alpha+uht|x,I)\right\}(1-u)^{s+1}\,du\\
&\quad+\frac{(ht)^{s+2}}{(s+1)!}\int_0^1\big\{B^{(s+2)}(\alpha+uht|x,I)-B^{(s+2)}(\alpha|x,I)\big\}(1-u)^{s+1}\,du,
\end{align*}
with, using the expressions of $\beta_k^{(s+2)}(\cdot|\cdot)$ and $B^{(s+2)}(\cdot|\cdot)$ of the proof of Proposition C.1,
$$\max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-\sum_{k=1}^KP_k(x)\beta_k(\alpha+ht|I)\right)\right|=h\,o\Big(K^{-\frac{s+1}{D_M}}\Big)=o\big(h^{s+2}\big),$$
\begin{align*}
&\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\int_0^1\left\{\sum_{k=1}^KP_k(x)\beta_k^{(s+2)}(\alpha+uht|I)-B^{(s+2)}(\alpha+uht|x,I)\right\}(1-u)^{s+1}\,du\right|\\
&\le C\max_{(\alpha,x)\in[2h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left\{\frac{\alpha}{\alpha-h}\left|\sum_{k=1}^KP_k(x)\beta_k^{(s+1)}(\alpha|I)-B^{(s+1)}(\alpha|x,I)\right|\right\}\\
&\quad+C\max_{(\alpha,x)\in[2h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left\{\frac{\alpha}{\alpha-h}\left|\sum_{k=1}^KP_k(x)\gamma_k^{(s+1)}(\alpha|I)-V^{(s+1)}(\alpha|x,I)\right|\right\}=o(1),
\end{align*}
$$\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\int_0^1\big\{B^{(s+2)}(\alpha+uht|x,I)-B^{(s+2)}(\alpha|x,I)\big\}(1-u)^{s+1}\,du\right|=o(1).$$
Substituting gives
$$\max_{(\alpha,x)\in[3h,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\left(B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)-\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\right)\right|=o\big(h^{s+2}\big),$$
$$\max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|\alpha\big(B(\alpha+ht|x,I)-P(x,t)'b^*(\alpha|I)\big)\big|=o\big(h^{s+2}\big),\qquad \max_{(\alpha,x)\in[0,3h]\times\mathcal X}\max_{t\in I_{\alpha,h}}\left|\alpha\frac{(ht)^{s+2}}{(s+2)!}B^{(s+2)}(\alpha|x,I)\right|=o\big(h^{s+2}\big).$$
The third result in (iii) follows from Proposition C.1-(iii). The fourth equality of (iii) follows from
\begin{align*}
o\big(h^{s+1}\big)&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{t\in I_{\alpha,h}}\big|\Psi(t|x,b^*(\alpha|I))-B(\alpha+ht|x,I)\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\big|\Psi[\Delta(u|x,b^*(\alpha|I))|x,b^*(\alpha|I)]-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\big|u-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\big|\\
&=\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\left|B\left[\alpha+h\frac{G(u|x,I)-\alpha}{h}\Big|x,I\right]-B[\alpha+h\Delta(u|x,b^*(\alpha|I))|x,I]\right|\\
&\ge Ch\max_{(\alpha,x)\in[0,1]\times\mathcal X}\max_{u\in\Psi[I_{\alpha,h}|x,b^*(\alpha|I)]}\left|\frac{G(u|x,I)-\alpha}{h}-\frac{\Phi(u|x,b^*(\alpha|I))-\alpha}{h}\right|
\end{align*}
by Proposition C.1-(i). Consider now (iv). The first bound follows from the Cauchy-Schwarz inequality. This bound implies, for all $u$ in $\Psi[I_{\alpha,h}|x,b_1]\cap\Psi[I_{\alpha,h}|x,b_2]$,
$$\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_1)|x,b_1]\big|=\big|\Psi[\Delta(u|x,b_2)|x,b_1]-u\big|=\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_2)|x,b_2]\big|\le Ch^{-D_M/2}\|b_1-b_2\|.$$
By definition of $\mathcal B_{I\alpha,h}$,
$$\big|\Psi[\Delta(u|x,b_2)|x,b_1]-\Psi[\Delta(u|x,b_1)|x,b_1]\big|\ge Ch\big|\Delta(u|x,b_2)-\Delta(u|x,b_1)\big|=C\big|\Phi(u|x,b_2)-\Phi(u|x,b_1)\big|$$
and substituting shows that the second bound of (iv) holds.
For the third bound in (iv), it holds uniformly in $\alpha$, $x$, $u$, $b_1$ and $b_2$ that

$$\bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg|$$
$$\le \bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_1]\bigg| + \bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg|$$
$$\le \max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial^2\Psi(t\,|\,x,b_1)}{\partial t^2}\bigg|\,\frac{\big|\Phi(u|x,b_1)-\Phi(u|x,b_2)\big|}{h} + \max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial P(x,t)'}{\partial t}(b_1-b_2)\bigg|.$$

But, by definition of $\mathcal B_{\mathcal I_{\alpha,h}}$,

$$\max_{t\in\mathcal I_{\alpha,h}}\bigg|\frac{\partial^2\Psi(t\,|\,x,b_1)}{\partial t^2}\bigg| \le Ch\max_{p=2,\ldots,s+1}\bigg|\frac{P(x)'b_{1p}}{h}\bigg| = O(h),$$

so that substituting and the bound for $\Phi(u|x,b_1)-\Phi(u|x,b_2)$ give, uniformly in $\alpha$, $x$, $u$, $b_1$ and $b_2$,

$$\bigg|\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_1)\,|\,x,b_1]-\frac{\partial\Psi}{\partial t}[\Delta(u|x,b_2)\,|\,x,b_2]\bigg| \le Ch^{-D_M/2}\|b_1-b_2\|,$$

which is the fourth inequality. The expression in (ii) of $\Phi(\cdot)$ and the definition of $\mathcal B_{\mathcal I_{\alpha,h}}$ yield the third inequality. $\Box$

Proof of Lemma B.2. It holds

$$R^{(2)}(b;\alpha,I) = \mathbb E\Bigg[\mathbb I\big[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}\,|\,x_\ell,b),\,I_\ell=I\big]\,\frac{P(x_\ell,\Delta(B_{i\ell}|x_\ell,b))\,P(x_\ell,\Delta(B_{i\ell}|x_\ell,b))'}{\Psi^{(1)}(\Delta(B_{i\ell}|x_\ell,b)\,|\,x_\ell,b)}\,K(\Delta(B_{i\ell}|x_\ell,b))\Bigg]$$
$$= \int_{\mathcal X}\Bigg[\int_{\Psi(\underline{\mathcal I}_{\alpha,h}|x,b)\,\vee\,B(0|x,I)}^{\Psi(\overline{\mathcal I}_{\alpha,h}|x,b)\,\wedge\,B(1|x,I)}\frac{P(x,\Delta(y|x,b))\,P(x,\Delta(y|x,b))'}{\Psi^{(1)}(\Delta(y|x,b)\,|\,x,b)}\,K(\Delta(y|x,b))\,g(y,x,I)\,dy\Bigg]dx.$$
Recall $\Delta[\Psi(t|x,b)\,|\,x,b]=t$ for all $t$ in $\mathcal I_{\alpha,h}$ and let

$$\overline{\mathcal I}_{\alpha,h}(x,I;b) = \overline{\mathcal I}_{\alpha,h}\wedge\Delta[B(1|x,I)\,|\,x,b],\qquad \underline{\mathcal I}_{\alpha,h}(x,I;b) = \underline{\mathcal I}_{\alpha,h}\vee\Delta[B(0|x,I)\,|\,x,b].$$

The change of variable $y=\Psi(t|x,b)$ yields that

$$R^{(2)}(b;\alpha,I) = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}(x,I;b)}^{\overline{\mathcal I}_{\alpha,h}(x,I;b)} P(x,t)P(x,t)'\,K(t)\,g(\Psi(t|x,b),x,I)\,dt\Bigg]dx.$$

The Dominated Convergence Theorem and Proposition C.1-(i), $s\ge 1$, yield that $R^{(2)}(\cdot;\alpha,I)$ is continuously differentiable over $\mathcal B_{\mathcal I_{\alpha,h}}$ with, by the Leibniz integral rule,

$$R^{(3)}(b;\alpha,I)[d] = R_0^{(3)}(b;\alpha,I)[d] + R_1^{(3)}(b;\alpha,I)[d] - R_2^{(3)}(b;\alpha,I)[d],$$

$$R_0^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}(x,I;b)}^{\overline{\mathcal I}_{\alpha,h}(x,I;b)} P(x,t)P(x,t)'\,K(t)\,g^{(1)}(\Psi(t|x,b),x,I)\,[d'P(x,t)]\,dt\Bigg]dx,$$

$$R_1^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X} P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,K\big(\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)\,g\big(\Psi(\overline{\mathcal I}_{\alpha,h}(x,I;b)\,|\,x,b),x,I\big)\bigg[d'\,\frac{\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'}\bigg]dx,$$

$$R_2^{(3)}(b;\alpha,I)[d] = \int_{\mathcal X} P\big(x,\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,K\big(\underline{\mathcal I}_{\alpha,h}(x,I;b)\big)\,g\big(\Psi(\underline{\mathcal I}_{\alpha,h}(x,I;b)\,|\,x,b),x,I\big)\bigg[d'\,\frac{\partial\underline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'}\bigg]dx.$$

Recall that $g(\cdot|\cdot,I)$ is bounded away from 0 and infinity. Hence

$$\big\|R_0^{(3)}(b;\alpha,I)[d]\big\| \le C\max_{x\in\mathcal X}\|P(x)\|\,\|d\| \le Ch^{-D_M/2}\|d\|.$$

The operators $R_i^{(3)}(b;\alpha,I)[d]$, $i=1,2$, can be studied in a similar way, so that only $i=1$ is considered.
Observe

$$\frac{\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)}{\partial b'} = \begin{cases} 0 & \text{if } \overline{\mathcal I}_{\alpha,h}\le\Delta[B(1|x,I)\,|\,x,b],\\[6pt] \dfrac{\partial\Delta[B(1|x,I)\,|\,x,b]}{\partial b'} = -\dfrac{P(x,\Delta(B(1|x,I)|x,b))'}{\Psi^{(1)}(\Delta(B(1|x,I)|x,b)\,|\,x,b)} & \text{if } \overline{\mathcal I}_{\alpha,h}>\Delta[B(1|x,I)\,|\,x,b].\end{cases}$$

But, for $h$ small enough,

$$\Delta[B(1|x,I)\,|\,x,b] = \frac{\Phi[B(1|x,I)\,|\,x,b]-\alpha}{h} = \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,\Phi[B(1|x,I)\,|\,x,b]\big\}-\alpha}{h}$$
$$\ge \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,\Phi[B(1|x,I)\,|\,x,b^*(\alpha|I)]-Ch^{-D_M/2}\|b-b^*(\alpha|I)\|\big\}-\alpha}{h}$$
$$\ge \frac{\min\big\{\alpha+h\overline{\mathcal I}_{\alpha,h},\,G[B(1|x,I)\,|\,x,I]-Ch^{s+1}-Ch\big\}-\alpha}{h} \ge \frac{\min\big\{\alpha+h\min\big(\tfrac{1-\alpha}{h},1\big),\,1-Ch\big\}-\alpha}{h}$$

uniformly in $\alpha$, $x$ and $b$ in $B\big(b^*(\alpha|I),\,Ch^{D_M/2}h\big)$ by Lemma B.1. Hence, if $\alpha\le 1-C'h$ with $C'\ge C+1$,

$$\Delta[B(1|x,I)\,|\,x,b] \ge \frac{\min\{\alpha+h,\,1-Ch\}-\alpha}{h} \ge 1 \ge \overline{\mathcal I}_{\alpha,h},$$

so that $\partial\overline{\mathcal I}_{\alpha,h}(x,I;b)/\partial b'=0$. Hence, since $B\big(b^*(\alpha|I),Ch^{D_M/2}h\big)\subset\mathcal B_{\mathcal I_{\alpha,h}}$ and by definition of $\mathcal B_{\mathcal I_{\alpha,h}}$,

$$\big\|R_1^{(3)}(b;\alpha,I)[d]\big\| \le C\,\mathbb I[\alpha\ge 1-C'h]\,\Bigg\|\int_{\mathcal X} P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)P\big(x,\overline{\mathcal I}_{\alpha,h}(x,I;b)\big)'\,\frac{d'P(x,\Delta(B(1|x,I)|x,b))}{\Psi^{(1)}(\Delta(B(1|x,I)|x,b)\,|\,x,b)}\,dx\Bigg\|$$
$$\le \frac{C}{h}\,\mathbb I[\alpha\ge 1-C'h]\,\max_{x\in\mathcal X}\|P(x)\|\,\|d\| \le \frac{C}{h}\,h^{-D_M/2}\,\|d\|\,\mathbb I[\alpha\ge 1-C'h] \le \frac{C\,h^{-D_M/2}}{\alpha(1-\alpha)+h}\,\|d\|.$$

Substituting in the expression of $R^{(3)}(b;\alpha,I)[d]$ then gives, uniformly in $d$,

$$\max_{\alpha\in[0,1]}\;\max_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}\big(\alpha(1-\alpha)+h\big)\,\big\|R^{(3)}(b;\alpha,I)[d]\big\| \le Ch^{-D_M/2}\|d\|.$$
The Taylor inequality shows that (i) holds.

For (ii), the expression of $R^{(2)}(b;\alpha,I)$, Assumptions A and R-(i), Proposition C.1-(i) — which imply that the eigenvalues of $\int P(x)P(x)'\,g[B(\alpha|x,I),x,I]\,dx$ stay bounded away from 0 and infinity — Lemma B.1-(iii) and Proposition C.1-(i) give that, uniformly in $\alpha$ and $x$,

$$\overline{\mathcal I}_{\alpha,h}[x,I;b^*(\alpha|I)] = \overline{\mathcal I}_{\alpha,h}\wedge\frac{\Phi[B(1|x,I)\,|\,x,b^*(\alpha|I)]-\alpha}{h} = \overline{\mathcal I}_{\alpha,h}\wedge\frac{1+o(h^{s+1})-\alpha}{h} = \overline{\mathcal I}_{\alpha,h}+o(h^s),$$
$$\underline{\mathcal I}_{\alpha,h}[x,I;b^*(\alpha|I)] = \underline{\mathcal I}_{\alpha,h}+o(h^s),$$

and

$$R^{(2)}(b^*(\alpha|I);\alpha,I) = \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\,g\big(\Psi(t|x,b^*(\alpha|I))\,\big|\,x,I\big)\,dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\,g\big[B(\alpha+ht|x,I)+o(h^{s+1})\,\big|\,x,I\big]\,dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\bigg(\frac{1}{B^{(1)}(\alpha+ht|x,I)}+o(h^{s+1})\bigg)dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Bigg[\int_{\underline{\mathcal I}_{\alpha,h}+o(h^s)}^{\overline{\mathcal I}_{\alpha,h}+o(h^s)}\pi(t)\pi(t)'\,K(t)\bigg(\frac{1}{B^{(1)}(\alpha|x,I)}-ht\,\frac{B^{(2)}(\alpha|x,I)}{(B^{(1)}(\alpha|x,I))^2}+o(h)\bigg)dt\Bigg]\otimes P(x)P(x)'\,f(x,I)\,dx$$
$$= \int_{\mathcal X}\Omega_h(\alpha)\otimes\frac{P(x)P(x)'}{B^{(1)}(\alpha|x,I)}\,f(x,I)\,dx - h\int_{\mathcal X}\Omega_{1h}(\alpha)\otimes P(x)P(x)'\,\frac{B^{(2)}(\alpha|x,I)}{(B^{(1)}(\alpha|x,I))^2}\,f(x,I)\,dx + o(h),$$

where the last $o(h)$ term is with respect to the matrix norm. This, together with the fact that the eigenvalues of the matrices $\Omega_h(\alpha)$ and $\int_{\mathcal X}P(x)P(x)'\,dx$ are bounded away from 0 and infinity, and the fact that $B^{(1)}(\alpha|x,I)$ is bounded away from 0 and infinity, shows that (ii) holds. $\Box$

Proof of Lemma C.3.
Write $A_{\alpha,h}^{-1}=D_{\alpha,h}+B_{\alpha,h}$, where $D_{\alpha,h}$ is the diagonal of $A_{\alpha,h}^{-1}$ and $B_{\alpha,h}=A_{\alpha,h}^{-1}-D_{\alpha,h}$. Provided the series converges,

$$A_{\alpha,h} = D_{\alpha,h}^{-1/2}\Bigg\{\sum_{n=0}^{\infty}\Big(-D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}\Big)^n\Bigg\}D_{\alpha,h}^{-1/2}.$$

The entries of $D_{\alpha,h}^{-1/2}$ are bounded in absolute value by $C<\infty$ for all $\alpha$ and $L$. It also gives

$$\Bigg|\frac{\mathbb E\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_1}(x_\ell)\pi_{p_1}(t)\,P_{k_2}(x_\ell)\pi_{p_2}(t)\,K(t)\,dt\Big]}{\mathbb E^{1/2}\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_1}^2(x_\ell)\pi_{p_1}^2(t)K(t)\,dt\Big]\,\mathbb E^{1/2}\Big[\frac{\mathbb I(I_\ell=I)}{B^{(1)}(\alpha|x_\ell,I_\ell)}\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}P_{k_2}^2(x_\ell)\pi_{p_2}^2(t)K(t)\,dt\Big]}\Bigg| \le \varrho < 1$$

for all $1\le k_1,k_2\le K$ and $0\le p_1,p_2\le s+1$ with $(k_1,p_1)\neq(k_2,p_2)$; that is, all the entries of $D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}$ are bounded by $\varrho$ in absolute value. By Assumption R-(ii), the entries of $D_{\alpha,h}^{-1/2}B_{\alpha,h}D_{\alpha,h}^{-1/2}$ are bounded by those of $\varrho\,\mathrm{Id}\otimes(T'+T)$, where $T$ is a lower triangular $c(s+2)$-band matrix. Hence the absolute values of the entries of $A_{\alpha,h}$ are bounded by the entries of

$$C\,\mathrm{Id}\otimes\Bigg(\sum_{n=0}^{\infty}\varrho^n\Big(T^{n\prime}+T^n\Big)\Bigg).$$

Since $T$ is a triangular $c$-band nilpotent matrix, it follows that $|A_{\alpha,h}(j_1,j_2)|\le C\rho^{|j_1-j_2|}$, with $0<\varrho\le\rho<1$, for all $\alpha$ and $L$. It follows that

$$\max_L\;\max_{\alpha\in[0,1]}\;\max_{1\le j_1\le(s+2)K}\sum_{j_2=1}^{(s+2)K}|A_{\alpha,h}(j_1,j_2)| \le C\sum_{n\ge 0}\rho^n < \infty,$$

which ends the proof of the Lemma. $\Box$

F.2 Lemmas B.3, B.4 and B.5

The proofs of the lemmas grouped here make use of a deviation inequality from Massart (2007). Consider $n$ independent random variables $Z_\ell$ and, for a known real function $\xi(z,\theta)$, separable with respect to $\theta\in\Theta$, let $Z_\ell(\theta)=\xi(Z_\ell,\theta)$, where $\theta$ is a parameter.
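The geometric off-diagonal decay of the inverse used in the proof of Lemma C.3 above can be illustrated numerically. The following sketch is an illustration only, not part of the argument: the matrix below is an arbitrary well-conditioned symmetric band matrix chosen for the example, not the $A_{\alpha,h}^{-1}$ of the lemma.

```python
import numpy as np

# Illustration of |A(j1, j2)| <= C * rho**|j1 - j2| for the inverse A of a
# well-conditioned band matrix. Matrix, band width and size are arbitrary.
rng = np.random.default_rng(0)
n, c = 60, 2
M = np.zeros((n, n))
for k in range(1, c + 1):
    off = 0.3 * rng.uniform(0.5, 1.0, size=n - k)
    M += np.diag(off, k) + np.diag(off, -k)   # symmetric c-band structure
M += np.diag(np.full(n, 2.0))                 # diagonal dominance: eigenvalues bounded away from 0
A = np.linalg.inv(M)

# Largest entry at each off-diagonal distance, and a fitted decay rate rho.
dists = np.arange(n - 10)
max_by_dist = np.array([np.abs(np.diag(A, int(d))).max() for d in dists])
slope = np.polyfit(dists, np.log(max_by_dist), 1)[0]
rho = np.exp(slope)
print(0.0 < rho < 1.0)
```

With a diagonally dominant band matrix the fitted `rho` falls strictly inside $(0,1)$, matching the $C\rho^{|j_1-j_2|}$ bound.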
Let $\underline\xi(\cdot)\le\overline\xi(\cdot)$ be two functions. A bracket $[\underline\xi,\overline\xi]$ is the set of all functions $\xi(\cdot)$ such that $\underline\xi(z)\le\xi(z)\le\overline\xi(z)$ for all $z$. The next proposition follows from Massart (2007, Theorem 6.8 and Corollary 6.9).

Proposition F.1 Assume that $\sup_{\theta\in\Theta}|Z_\ell(\theta)|\le M_\infty$ and $\sup_{\theta\in\Theta}\mathrm{Var}(Z_\ell(\theta))\le M_2$ for all $\ell$, and that for any $\epsilon>0$ there exist brackets $[\underline\xi_j,\overline\xi_j]\subset[-b,b]$, $j=1,\ldots,\exp(H(\epsilon))$, such that

$$\mathbb E\Big[\big(\overline\xi_j(Z_i)-\underline\xi_j(Z_i)\big)^2\Big]\le\epsilon^2 \quad\text{and}\quad \{\xi(z,\theta),\,\theta\in\Theta\}\subset\bigcup_{j=1}^{\exp(H(\epsilon))}\big[\underline\xi_j,\overline\xi_j\big].$$

Let

$$\mathcal H_L = 54\int_0^{M_2^{1/2}}\sqrt{\min(L,H(\epsilon))}\,d\epsilon + 2\,(M_\infty+M_2)\,\frac{H(M_2)}{L^{1/2}}.$$

Then, for any $t\in\big[0,L^{1/2}M_2/M_\infty\big]$,

$$\mathbb P\Bigg(\sup_{\theta\in\Theta}\Bigg|\sum_{\ell=1}^n\big\{Z_\ell(\theta)-\mathbb E[Z_\ell(\theta)]\big\}\Bigg|\ge L^{1/2}\{\mathcal H_L+t\}\Bigg) \le \exp\big(-t^2\big).$$

Proof of Lemma B.3. Note that $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$ is a $c(s+2)$-band matrix, so that the order of its matrix norm is the same as the order of its largest entry. The generic entry of $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$ can be written as

$$\widehat r(b;\alpha,I) = \frac{1}{Lh^{(D_M+1)/2}}\sum_{\ell=1}^{L}\xi_\ell(b;\alpha)$$

where the $\xi_\ell(b;\alpha)$ are centered iid with

$$\xi_\ell(b;\alpha) = \sum_{i=1}^{I_\ell}\Big\{\mathbb I\big[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}|x_\ell,b),\,I_\ell=I\big]\,\xi_{i\ell}(b) - \mathbb E\big[\mathbb I[B_{i\ell}\in\Psi(\mathcal I_{\alpha,h}|x_\ell,b),\,I_\ell=I]\,\xi_{i\ell}(b)\big]\Big\},$$
$$\xi_{i\ell}(b) = h^{D_M/2}h^{1/2}\,\frac{P_{k_1}(x_\ell)P_{k_2}(x_\ell)}{\Psi^{(1)}(\Delta(B_{i\ell}|x_\ell,b)\,|\,x_\ell,b)/h}\,K_{p_1p_2}(\Delta(B_{i\ell}|x_\ell,b)),$$
$$K_{p_1p_2}(\Delta(B_{i\ell}|x_\ell,b)) = \frac{\Delta^{p_1+p_2}(B_{i\ell}|x_\ell,b)}{p_1!\,p_2!}\,K(\Delta(B_{i\ell}|x_\ell,b)).$$
This yields

$$|\xi_\ell(b;\alpha)| \le C\,h^{D_M/2}\max_{x\in\mathcal X}\|P(x)\|^2\,h^{1/2} \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2},$$

for all $\alpha$ in $[0,1]$ and all admissible $b$. For the variance, Lemma B.1-(iii,iv) gives

$$|\Delta(B_{i\ell}|x_\ell,b)| = \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b)-\alpha}{h}\bigg| \le \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b^*(\alpha|I_\ell))-G(B_{i\ell}|x_\ell,I_\ell)}{h}\bigg| + \bigg|\frac{\Phi(B_{i\ell}|x_\ell,b)-\Phi(B_{i\ell}|x_\ell,b^*(\alpha|I_\ell))}{h}\bigg|$$
$$\le \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + o(h^s) + O\bigg(\frac{h^{-D_M/2}\times h^{D_M/2}h}{h}\bigg) = \bigg|\frac{G(B_{i\ell}|x_\ell,I_\ell)-\alpha}{h}\bigg| + O(1)$$

uniformly. It follows that, $U_{i\ell}=G(B_{i\ell}|x_\ell,I_\ell)$ being a uniform random variable independent of $(x_\ell,I_\ell)$,

$$\mathrm{Var}(\xi_\ell(b;\alpha)) \le C\,I^2\,h^{D_M}\max_{x\in\mathcal X}\|P(x)\|^2\int_{\mathcal X}|P_{k_1}(x)P_{k_2}(x)|\,dx\int\mathbb I_{[-C,C]}\Big(\frac{u-\alpha}{h}\Big)\frac{du}{h}$$
$$\le C\,I^2\,h^{D_M}\max_{x\in\mathcal X}\|P(x)\|^2\bigg(\int_{\mathcal X}P_{k_1}^2(x)\,dx\bigg)^{1/2}\bigg(\int_{\mathcal X}P_{k_2}^2(x)\,dx\bigg)^{1/2} \le M_2$$

with $M_2<\infty$ under Assumption R, uniformly in $b$ and $\alpha$.

Consider now the brackets covering. The key observation is that $\xi_\ell(b;\alpha)$ only depends on a finite-dimensional subvector of $b$, $b^{(k_1,k_2)}$, which groups the entries of $b$ corresponding to those $P_k(\cdot)$ such that $P_kP_{k_1}\not\equiv 0$ or $P_kP_{k_2}\not\equiv 0$, so that the dimension of $b^{(k_1,k_2)}$ is less than $c(s+2)$ under Assumption R-(ii).
Consequently the class to be bracketed is

$$\mathcal F = \Big\{\xi_\ell\big(b^{(k_1,k_2)};\alpha\big);\ \alpha\in[0,1],\ b^{(k_1,k_2)}\in B\big(b^{(k_1,k_2)*}(\alpha|I),\,Ch^{D_M/2}h\big)\Big\}.$$

Since $1/(Lh^{D_M+1})=o(1)$, van de Geer (1999, p.20) and arguing as in Guerre and Sabbah (2012, 2014) imply that $\mathcal F$ can be bracketed with a number of brackets

$$\exp(H_L(\epsilon)) \asymp \Big(\frac{L^C}{\epsilon}\Big)^C,$$

so that

$$\int_0^{M_2^{1/2}}\sqrt{\min(L,H_L(\epsilon))}\,d\epsilon \le M_2^{1/4}\bigg(\int_0^{M_2^{1/2}}H_L(\epsilon)\,d\epsilon\bigg)^{1/2} = O(\log L)^{1/2}$$

and, for the term $\mathcal H_L$ of Proposition F.1,

$$\mathcal H_L = O(\log L)^{1/2} + O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}$$

since $1/(Lh^{D_M+1})$ is bounded. Hence, by Proposition F.1, for $t\le L^{1/2}M_2/M_\infty$, which diverges,

$$\mathbb P\Bigg(\big(Lh^{D_M+1}\big)^{1/2}\sup_{\alpha\in[0,1]}\ \sup_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}|\widehat r(b;\alpha,I)| \ge C\big\{\log^{1/2}L+t\big\}\Bigg) \le \exp\big(-t^2\big)$$

uniformly over all the non-zero entries $\widehat r(b;\alpha,I)$ of the band matrix $\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)$. This gives, by the Bonferroni inequality,

$$\mathbb P\Bigg(\sup_{\alpha\in[0,1]}\ \sup_{b\in B(b^*(\alpha|I),Ch^{D_M/2}h)}\big\|\widehat R^{(2)}(b;\alpha,I)-R^{(2)}(b;\alpha,I)\big\| \ge C\,\frac{\log^{1/2}L+t}{(Lh^{D_M+1})^{1/2}}\Bigg) \le CK\exp\big(-t^2\big),$$

which implies the result of the lemma since $t\le L^{1/2}M_2/M_\infty = O\big(Lh^{D_M+1}\big)^{1/2}$ can be set to $t=\tau\log^{1/2}L$ for an arbitrarily large $\tau$, as $\log L/(Lh^{D_M+1})=o(1)$. $\Box$

Proof of Lemma B.4. The proof of Lemma B.4 is similar to the one of Lemma B.3. The generic entry of $\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)$ writes

$$\widehat r(b;\alpha,I) = \frac{1}{L}\sum_{\ell=1}^{L}\xi_\ell(b;\alpha)$$

where the $\xi_\ell(b;\alpha)$ are centered iid with, for $K_p(t)=t^pK(t)/p!$,

$$\xi_\ell(b;\alpha) = \sum_{i=1}^{I_\ell}\big(\mathbb I(I_\ell=I)\,\xi_{i\ell}(b;\alpha)-\mathbb E[\mathbb I(I_\ell=I)\,\xi_{i\ell}(b;\alpha)]\big),$$
$$\xi_{i\ell}(b;\alpha) = P_k(x_\ell)\Bigg\{\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\big\{\mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b)]-(\alpha+ht)\big\}\,K_p(t)\,dt\Bigg\}.$$

This gives

$$\Bigg|\frac{\xi_\ell(b;\alpha)}{(h+\alpha(1-\alpha))^{1/2}}\Bigg| \le Ch^{-1/2}\max_{x\in\mathcal X}\|P(x)\| \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2}.$$

For the computation of the variance, Lemma B.1-(iii,iv) and Proposition C.1-(i) give, uniformly in $\alpha$, $t$ in $\mathcal I_{\alpha,h}$, the admissible $b$ and $x_\ell$, and for the uniform $U_{i\ell}=G(B_{i\ell}|x_\ell,I_\ell)$,

$$\mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b)] = \mathbb I[B_{i\ell}\le\Psi(t|x_\ell,b^*(\alpha|I))+O(h)] = \mathbb I[B(U_{i\ell}|x_\ell,I_\ell)\le B(\alpha+ht|x_\ell,I_\ell)+O(h)]$$
$$= \mathbb I\big[U_{i\ell}\le G[B(\alpha+ht|x_\ell,I_\ell)+O(h)\,|\,x_\ell,I_\ell]\big] = \mathbb I[U_{i\ell}\le\alpha+ht+O(h)].$$
Since $U_{i\ell}$ is independent of $(x_\ell,I_\ell)$,

$$\mathbb E\big[\xi_{i\ell}^2(b;\alpha)\,\big|\,I_\ell\big] \le \mathbb E\Bigg[P_k^2(x_\ell)\int\!\!\int_{\mathcal I_{\alpha,h}^2}\mathbb I[U_{i\ell}\le\alpha+h(t_1\wedge t_2)+O(h)]\,K_p(t_1)K_p(t_2)\,dt_1dt_2\;\Big|\;I_\ell\Bigg]$$
$$\quad - 2\,\mathbb E\Bigg[P_k^2(x_\ell)\int\!\!\int_{\mathcal I_{\alpha,h}^2}\mathbb I[U_{i\ell}\le\alpha+ht_1+O(h)]\,(\alpha+ht_2)\,K_p(t_1)K_p(t_2)\,dt_1dt_2\;\Big|\;I_\ell\Bigg]$$
$$\quad + \mathbb E\big[P_k^2(x_\ell)\,|\,I_\ell\big]\int\!\!\int_{\mathcal I_{\alpha,h}^2}(\alpha+ht_1)(\alpha+ht_2)\,K_p(t_1)K_p(t_2)\,dt_1dt_2$$
$$= \mathbb E\big[P_k^2(x_\ell)\,|\,I_\ell\big]\int\!\!\int_{\mathcal I_{\alpha,h}^2}\big\{\alpha+O(h)-\alpha^2\big\}\,K_p(t_1)K_p(t_2)\,dt_1dt_2 \le C\big(h+\alpha(1-\alpha)\big)$$

uniformly in $\alpha$ and $b$. Hence, uniformly in $\alpha$ and $b$,

$$\mathrm{Var}\Bigg(\frac{\xi_\ell(b;\alpha)}{(h+\alpha(1-\alpha))^{1/2}}\Bigg) \le M_2 \quad\text{with } M_2<\infty.$$

The bracketing part of the proof is similar to the one of Lemma B.3 and gives

$$\mathcal H_L = O(\log L)^{1/2} + O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}.$$

Arguing with Proposition F.1 then shows that the order of the largest entry of $\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)$ is $O_{\mathbb P}(\log L/L)^{1/2}$, which gives uniformly

$$\big\|\widehat R^{(1)}(b;\alpha,I)-R^{(1)}(b;\alpha,I)\big\| = K^{1/2}\,O_{\mathbb P}\Big(\frac{\log L}{L}\Big)^{1/2} = O_{\mathbb P}\Big(\frac{\log L}{Lh^{D_M}}\Big)^{1/2}$$

and the Lemma is proved. $\Box$

Proof of Lemma B.5. For (i), define

$$P_0 = \mathbb E\big[\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\big],\quad P_1 = \mathbb E\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg],\quad P_2 = \mathbb E\bigg[\mathbb I(I_\ell=I)\,\frac{B^{(2)}(\alpha|x_\ell,I_\ell)\,P(x_\ell)P(x_\ell)'}{(B^{(1)}(\alpha|x_\ell,I_\ell))^2}\bigg],$$

and abbreviate $\Omega_h(\alpha)$, $\Omega_{1h}(\alpha)$ into $\Omega_0$, $\Omega_1$.
It holds

$$\mathrm{Var}(\widehat e(\alpha|I)) = \big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}\,\mathrm{Var}\big[\widehat R^{(1)}\big(b(\alpha|I);\alpha,I\big)\big]\,\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}$$

with, by Lemma B.2,

$$\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1} = \big[\Omega_0\otimes P_1 - h\,\Omega_1\otimes P_2 + o(h)\big]^{-1} = \big[\mathrm{Id}-h\big(\Omega_0^{-1}\Omega_1\big)\otimes\big(P_1^{-1}P_2\big)+o(h)\big]^{-1}\,\Omega_0^{-1}\otimes P_1^{-1}$$
$$= \Omega_0^{-1}\otimes P_1^{-1} + h\big(\Omega_0^{-1}\Omega_1\Omega_0^{-1}\big)\otimes\big(P_1^{-1}P_2P_1^{-1}\big) + o(h)$$

uniformly in $\alpha$, where the remainder term $o(h)$ is with respect to the matrix norm. For $\mathrm{Var}\big[\widehat R^{(1)}(b(\alpha|I);\alpha,I)\big]$, define

$$\omega_0 = \int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\pi(t)K(t)\,dt,\qquad \omega_1 = \int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}t\,\pi(t)K(t)\,dt,\qquad \Pi_m = \int\!\!\int_{\mathcal I_{\alpha,h}^2}\min(t_1,t_2)\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2.$$

Then $(LI)\,\mathrm{Var}\big[\widehat R^{(1)}(b(\alpha|I);\alpha,I)\big]$ admits the expansion, with uniform remainder terms,

$$\mathbb E\Bigg[\int\!\!\int_{\mathcal I_{\alpha,h}^2}\Big\{G\big[B(\alpha+ht_1|x_\ell,I_\ell)\wedge B(\alpha+ht_2|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big]$$
$$\quad - G\big[B(\alpha+ht_1|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big](\alpha+ht_2) - G\big[B(\alpha+ht_2|x_\ell,I_\ell)+o(h)\,\big|\,x_\ell,I_\ell\big](\alpha+ht_1)$$
$$\quad + (\alpha+ht_1)(\alpha+ht_2)\Big\}\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2\otimes\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\Bigg]$$
$$= \int\!\!\int_{\mathcal I_{\alpha,h}^2}\big\{\alpha+h(t_1\wedge t_2)-\alpha^2-h\alpha(t_1+t_2)\big\}\,\pi(t_1)\pi(t_2)'\,K(t_1)K(t_2)\,dt_1dt_2\otimes P_0 + o(h)$$
$$= \alpha(1-\alpha)\,\omega_0\omega_0'\otimes P_0 + h\,\big\{\Pi_m-\alpha\big(\omega_0\omega_1'+\omega_1\omega_0'\big)\big\}\otimes P_0 + o(h).$$
Hence an elementary expansion gives, uniformly in $\alpha\in[0,1]$, $\mathrm{Var}(\widehat e(\alpha|I)) = V_e/(LI)+o(h)$ with

$$V_e = \alpha(1-\alpha)\big[\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big] + h\,\alpha(1-\alpha)\big[\Omega_0^{-1}\Omega_1\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_2P_1^{-1}P_0P_1^{-1}\big]$$
$$\quad + h\,\alpha(1-\alpha)\big[\Omega_0^{-1}\omega_0\omega_0'\Omega_0^{-1}\Omega_1\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}P_2P_1^{-1}\big] + h\,\big[\Omega_0^{-1}\big(\Pi_m-\alpha(\omega_0\omega_1'+\omega_1\omega_0')\big)\Omega_0^{-1}\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big].$$

Observe now that $\Omega_0^{-1}\omega_0=s_0$, $\Omega_0^{-1}\omega_1=s_1$ and $\Omega_0^{-1}\Omega_1\Omega_0^{-1}\omega_0 = \Omega_0^{-1}\Omega_1s_0 = \Omega_0^{-1}\omega_1 = s_1$. This gives

$$V_e = \alpha(1-\alpha)\,[s_0s_0']\otimes\big[P_1^{-1}P_0P_1^{-1}\big] + h\,\alpha(1-\alpha)\,[s_1s_0']\otimes\big[P_1^{-1}P_2P_1^{-1}P_0P_1^{-1}\big]$$
$$\quad + h\,\alpha(1-\alpha)\,[s_0s_1']\otimes\big[P_1^{-1}P_0P_1^{-1}P_2P_1^{-1}\big] + h\,\big[\Omega_0^{-1}\Pi_m\Omega_0^{-1}-\alpha(s_0s_1'+s_1s_0')\big]\otimes\big[P_1^{-1}P_0P_1^{-1}\big].$$

Since $P_1^{-1}$, $P_0$, $P_2$, $\Omega_0^{-1}$ and $\Omega_1$ are bounded away from infinity uniformly in $\alpha$, it follows that $\max_{\alpha\in[0,1]}\|\mathrm{Var}(\widehat e(\alpha|I))\| = O(1/L)$ and then

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\mathrm{Var}\big(P(x)'\widehat e(\alpha|I)\big) = O\bigg(\frac{\max_{x\in\mathcal X}\|P(x)\|^2}{L}\bigg) = O\Big(\frac{1}{Lh^{D_M}}\Big).$$

For $\mathrm{Var}(\widehat e_1(\alpha|I)/h)$, observe that $\widehat e_1(\alpha|I)=S_1\widehat e(\alpha|I)$ with $S_1=s_1'\otimes\mathrm{Id}$. It holds

$$S_1V_eS_1' = h\,\big(s_1'\Omega_0^{-1}\Pi_m\Omega_0^{-1}s_1\big)\,\big(P_1^{-1}P_0P_1^{-1}\big)$$
$$= h\,v_h(\alpha)\,\mathbb E^{-1}\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg]\,\mathbb E\big[\mathbb I(I_\ell=I)\,P(x_\ell)P(x_\ell)'\big]\,\mathbb E^{-1}\bigg[\mathbb I(I_\ell=I)\,\frac{P(x_\ell)P(x_\ell)'}{B^{(1)}(\alpha|x_\ell,I_\ell)}\bigg]$$

with $v_h(\alpha)=s_1'\Omega_0^{-1}\Pi_m\Omega_0^{-1}s_1$.
This gives the result for $\mathrm{Var}(\widehat e_1(\alpha|I)/h)$ and $\mathrm{Var}\big(P(x)'\widehat e_1(\alpha|I)/h\big)$.

For (ii), we just show that $\max_{(\alpha,x)\in[0,1]\times\mathcal X}\big|P(x)'\widehat e_1(\alpha|I)/h\big| = O_{\mathbb P}\big((\log L/(Lh^{D_M+1}))^{1/2}\big)$. Since $\max_{x\in\mathcal X}\|P(x)\| = O\big(h^{-D_M/2}\big)$ and

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h}\bigg| \le \bigg(\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)}\bigg|\bigg)\times h^{-1/2}\Big(1+\max_{x\in\mathcal X}\|P(x)\|\Big),$$

it is sufficient to show

$$\max_{(\alpha,x)\in[0,1]\times\mathcal X}\bigg|\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)}\bigg| = O_{\mathbb P}\Bigg(\Big(\frac{\log L}{L}\Big)^{1/2}\Bigg). \tag{F.3}$$

Write

$$\frac{P(x)'\widehat e_1(\alpha|I)}{h^{1/2}\,(1+\|P(x)\|)} = \frac{1}{L}\sum_{\ell=1}^{L}\xi_\ell(\alpha,x)$$

with

$$\xi_\ell(\alpha,x) = \sum_{i=1}^{I_\ell}\big(\mathbb I(I_\ell=I)\,\xi_{i\ell}(\alpha,x)-\mathbb E[\mathbb I(I_\ell=I)\,\xi_{i\ell}(\alpha,x)]\big),$$
$$\xi_{i\ell}(\alpha,x) = \frac{P(x)'S_1\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}P(x_\ell)}{h^{1/2}\,(1+\|P(x)\|)}\Bigg\{\int_{\underline{\mathcal I}_{\alpha,h}}^{\overline{\mathcal I}_{\alpha,h}}\big\{\mathbb I\big[B_{i\ell}\le\Psi\big(t\,|\,x_\ell,b(\alpha|I)\big)\big]-(\alpha+ht)\big\}\,K(t)\,dt\Bigg\}.$$

This gives, for all $(\alpha,x)\in[0,1]\times\mathcal X$,

$$|\xi_\ell(\alpha,x)| \le Ch^{-1/2}\max_{x\in\mathcal X}\|P(x)\| \le M_\infty \quad\text{with } M_\infty\asymp h^{-(D_M+1)/2},$$
$$\mathrm{Var}(\xi_\ell(\alpha,x)) \le \frac{C\,(\max_{x\in\mathcal X}\|P(x)\|)^2}{(1+\max_{x\in\mathcal X}\|P(x)\|)^2} \le M_2 \quad\text{with } M_2\asymp 1.$$
The Implicit Function Theorem, the first-order condition $R^{(1)}\big(b(\alpha|I);\alpha,I\big)=0$, and Lemma B.2 with (C.3) and $s\ge D_M/2$ imply that $\alpha\mapsto b(\alpha|I)$ is $\|\cdot\|$-Lipschitz with a Lipschitz constant of order $L^C$, as are $\alpha\mapsto\big[R^{(2)}\big(b(\alpha|I);\alpha,I\big)\big]^{-1}$ and $x\mapsto P(x)/(1+\|P(x)\|)$. Lemma B.1-(iii), $1/(Lh^{D_M+1})=O(1)$, van de Geer (1999, p.20) and arguing as in Guerre and Sabbah (2012, 2014) imply that $\{\xi_\ell(\alpha,x);\,(\alpha,x)\in[0,1]\times\mathcal X\}$ can be bracketed with a number of brackets

$$\exp(H_L(\epsilon)) \asymp \Big(\frac{L^C}{\epsilon}\Big)^C.$$

Arguing as in the proof of Lemma B.3 gives, for the term $\mathcal H_L$ of Proposition F.1,

$$\mathcal H_L = O(\log L)^{1/2}+O\Big(\frac{\log L}{Lh^{D_M+1}}\Big)^{1/2} = O(\log L)^{1/2}$$

and then (F.3) holds. $\Box$

F.3 Lemma E.1

The proof of Lemma E.1 is based on the following lemma.

Lemma F.2 Let $k_1(\cdot)$ and $k_2(\cdot)$ be two functions over $[0,1]$ with primitives $K_1(\cdot)$ and $K_2(\cdot)$. Then, if $A$ is a random variable with a uniform distribution over $[0,1]$ and for any choice of the primitives $K_1(\cdot)$ and $K_2(\cdot)$,

$$\int_0^1\!\!\int_0^1 k_1(a_1)k_2(a_2)\,[a_1\wedge a_2-a_1a_2]\,da_1da_2 = -\int_0^1 k_1(a_1)\bigg\{\int_0^{a_1}\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg\}da_1.$$

Proof of Lemma F.2. Observe that

$$\int_0^1\!\!\int_0^1 k_1(a_1)k_2(a_2)\,[a_1\wedge a_2-a_1a_2]\,da_1da_2$$
$$= \mathbb E\bigg[\int_0^1 k_1(a_1)\,\mathbb I[A\le a_1]\,da_1\int_0^1 k_2(a_2)\,\mathbb I[A\le a_2]\,da_2\bigg] - \mathbb E\bigg[\int_0^1 k_1(a_1)\,\mathbb I[A\le a_1]\,da_1\bigg]\,\mathbb E\bigg[\int_0^1 k_2(a_2)\,\mathbb I[A\le a_2]\,da_2\bigg]$$
$$= \mathrm{Cov}\bigg(\int_A^1k_1(a_1)\,da_1,\;\int_A^1k_2(a_2)\,da_2\bigg) = \mathrm{Cov}\big(K_1(A),K_2(A)\big),$$

which does not depend upon the choice of the primitives.
Integrating by parts now gives

$$\mathrm{Cov}\big(K_1(A),K_2(A)\big) = \int_0^1K_1(a)\big(K_2(a)-\mathbb E[K_2(A)]\big)\,da = \int_0^1K_1(a)\,d\bigg[\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg]$$
$$= -\int_0^1k_1(a)\bigg\{\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)\,da_2\bigg\}da$$

since $\int_0^a\big(K_2(a_2)-\mathbb E[K_2(A)]\big)da_2$ vanishes for $a=0$ and $a=1$. $\Box$

Proof of Lemma E.1. It is assumed that $h<1/2$. Define

$$k_h(a;\alpha) = \frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big) \quad\text{and}\quad K_h(a;\alpha) = \int_{-\infty}^{a}k_h(a_1;\alpha)\,da_1.$$

It follows from Lemma F.2 that

$$C_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\bigg\{\int_0^1\!\!\int_0^1 k_h(a_1;\alpha_2)\,k_h(a_2;\alpha_1)'\,[a_1\wedge a_2-a_1a_2]\,da_1da_2\bigg\}d\alpha_1d\alpha_2$$
$$= -\int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a_1;\alpha_2)\bigg\{\int_0^{a_1}\big(K_h(a_2;\alpha_1)-\mathbb E[K_h(A;\alpha_1)]\big)'\,da_2\bigg\}da_1\,d\alpha_1d\alpha_2 = -\mathcal I_h+\mathcal J_h$$

with

$$\mathcal I_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a_1;\alpha_2)\bigg\{\int_0^{a_1}K_h(a_2;\alpha_1)'\,da_2\bigg\}da_1\,d\alpha_1d\alpha_2$$
$$= \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1\frac1h\,\pi\Big(\frac{a_1-\alpha_2}{h}\Big)K\Big(\frac{a_1-\alpha_2}{h}\Big)\Bigg\{\int_0^{a_1}\bigg[\int_{-\infty}^{a_2}\frac1h\,\pi\Big(\frac{a_3-\alpha_1}{h}\Big)'K\Big(\frac{a_3-\alpha_1}{h}\Big)da_3\bigg]da_2\Bigg\}da_1\,d\alpha_1d\alpha_2,$$

$$\mathcal J_h = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_0^1k_h(a;\alpha_2)\,a\,\mathbb E[K_h(A;\alpha_1)]'\,da\,d\alpha_1d\alpha_2$$
$$= \bigg[\int_0^1 g(\alpha)\bigg(\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg)d\alpha\bigg]\times\Bigg[\int_0^1f(\alpha)\int_0^1\bigg(\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg)da_1\,d\alpha\Bigg].$$

We now study $\mathcal J_h$.
The change of variable $a=\alpha+ht$ and the definition of $\Omega_h(\alpha)$ give

$$\int_0^1 g(\alpha)\bigg[\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg]d\alpha = \int_0^1g(\alpha)\Bigg[\int_{-\alpha/h}^{(1-\alpha)/h}(\alpha+ht)\,\pi(t)K(t)\,dt\Bigg]d\alpha$$
$$= \int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)s_0\,d\alpha + h\int_0^1g(\alpha)\,\Omega_h(\alpha)s_1\,d\alpha.$$

For the second item in $\mathcal J_h$, integrating by parts gives

$$\int_0^1\bigg[\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg]da_1 = \int_{-\infty}^{1}\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)da - \int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\,a\,da.$$

This gives

$$\int_0^1f(\alpha)\int_0^1\bigg[\int_{-\infty}^{a_1}\frac1h\,\pi\Big(\frac{a_2-\alpha}{h}\Big)'K\Big(\frac{a_2-\alpha}{h}\Big)da_2\bigg]da_1\,d\alpha$$
$$= \int_0^1f(\alpha)\bigg[\bigg(\int_{-\infty}^{0}+\int_0^1\bigg)\bigg\{\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\bigg\}da\bigg]d\alpha - \int_0^1f(\alpha)\bigg[\int_0^1\frac1h\,\pi\Big(\frac{a-\alpha}{h}\Big)'K\Big(\frac{a-\alpha}{h}\Big)\,a\,da\bigg]d\alpha$$
$$= \int_0^1f(\alpha)\Bigg[\int_{-\infty}^{-\alpha/h}+\int_{-\alpha/h}^{(1-\alpha)/h}\pi(t)'K(t)\,dt\Bigg]d\alpha - \int_0^1f(\alpha)\Bigg[\int_{-\alpha/h}^{(1-\alpha)/h}(\alpha+ht)\,\pi(t)'K(t)\,dt\Bigg]d\alpha$$
$$= \int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)s_0\,d\alpha - h\int_0^1f(\alpha)\,\Omega_h(\alpha)s_1\,d\alpha + \int_0^1f(\alpha)\Bigg[\int_{-\infty}^{-\alpha/h}\pi(t)K(t)\,dt\Bigg]d\alpha.$$
Hence

$$\mathcal J_h = \bigg[\int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_0s_0'\bigg[\int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg] + h\bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_1s_0'\bigg[\int_0^1f(\alpha)(1-\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]$$
$$\quad - h\bigg[\int_0^1\alpha g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_0s_1'\bigg[\int_0^1f(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg] - h^2\bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]s_1s_1'\bigg[\int_0^1f(\alpha)\,\Omega_h(\alpha)\,d\alpha\bigg]$$
$$\quad + \bigg[\int_0^1g(\alpha)\,\Omega_h(\alpha)\,[\alpha s_0+hs_1]\,d\alpha\bigg]\Bigg[\int_0^1f(\alpha)\bigg(\int_{-\infty}^{-\alpha/h}\pi(t)'K(t)\,dt\bigg)d\alpha\Bigg].$$

Consider now $\mathcal I_h$, which satisfies, with the change of variable $a_1=\alpha_2+ht$,

$$\mathcal I_h = \int_0^1\!\!\int_0^1f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}\pi(t)K(t)\Bigg\{\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{a_2}\frac1h\,\pi\Big(\frac{a_3-\alpha_1}{h}\Big)'K\Big(\frac{a_3-\alpha_1}{h}\Big)da_3\bigg]da_2\Bigg\}dt\,d\alpha_1d\alpha_2$$
$$= \int_0^1\!\!\int_0^1f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}\pi(t)K(t)\Bigg\{\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da_2\Bigg\}dt\,d\alpha_1d\alpha_2.$$
Now

$$\int_0^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da = \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + \int_{\alpha_2}^{\alpha_2+ht}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]d[a-\alpha_2-ht]$$
$$= \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + ht\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1 - \int_{\alpha_2}^{\alpha_2+ht}(a-\alpha_2-ht)\,\frac1h\,\pi\Big(\frac{a-\alpha_1}{h}\Big)'K\Big(\frac{a-\alpha_1}{h}\Big)da$$
$$= \int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da + ht\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1 - h^2t^2\int_0^1(1-u)\,\frac1h\,\pi\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)du.$$

It follows that $\mathcal I_h = \mathcal I_1 + h\,\mathcal I_2 - h\,\mathcal I_3$ with

$$\mathcal I_1 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\,\Omega_h(\alpha_2)s_0\Bigg\{\int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\bigg]da\Bigg\}d\alpha_1d\alpha_2,$$

$$\mathcal I_2 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\,\Omega_h(\alpha_2)s_1\Bigg\{\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg\}d\alpha_1d\alpha_2,$$

$$\mathcal I_3 = \int_0^1\!\!\int_0^1 f(\alpha_1)g(\alpha_2)\int_{-\alpha_2/h}^{(1-\alpha_2)/h}t^2\,\pi(t)K(t)\Bigg\{\int_0^1(1-u)\,\frac1h\,\pi\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2+htu-\alpha_1}{h}\Big)du\Bigg\}dt\,d\alpha_1d\alpha_2.$$

Consider first $\mathcal I_1$. Integrating by parts gives

$$\mathcal I_1 = \int_0^1 f(\alpha_1)\Bigg\{\int_0^1\Bigg(\int_0^{\alpha_2}\bigg[\int_{-\infty}^{\frac{a-\alpha_1}{h}}\pi(t_1)K(t_1)\,dt_1\bigg]da\Bigg)\,d\bigg[-\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg]'\Bigg\}'d\alpha_1$$
$$= \int_0^1 f(\alpha_1)\Bigg\{\int_0^1\bigg(\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg)\Bigg(\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg)d\alpha_2\Bigg\}d\alpha_1.$$
Another integration by parts gives

$$\int_0^1\bigg(\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\bigg)\Bigg(\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi(t_1)'K(t_1)\,dt_1\Bigg)d\alpha_2$$
$$= \bigg[\int_0^1\!\!\int_{\alpha_2}^1g(a)\,\Omega_h(a)s_0\,da\,d\alpha_2\bigg]\int_{-\infty}^{-\alpha_1/h}\pi(t_1)'K(t_1)\,dt_1 + \int_0^1\bigg[\int_{\alpha_2}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\frac1h\,\pi\Big(\frac{\alpha_2-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2-\alpha_1}{h}\Big)d\alpha_2.$$

It holds, for the second item,

$$\int_0^1\bigg[\int_{\alpha_2}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\frac1h\,\pi\Big(\frac{\alpha_2-\alpha_1}{h}\Big)'K\Big(\frac{\alpha_2-\alpha_1}{h}\Big)d\alpha_2 = \int_{-\alpha_1/h}^{(1-\alpha_1)/h}\bigg[\int_{\alpha_1+ht}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\pi(t)'K(t)\,dt$$
$$= \bigg[\int_{\alpha_1}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]s_0'\Omega_h(\alpha_1) - h\bigg[\int_{\alpha_1}^1g(a)\,\Omega_h(a)s_0\,da\bigg]s_1'\Omega_h(\alpha_1) + h^2\,g(\alpha_1)\,\Omega_h(\alpha_1)s_0\,s_2'\Omega_h(\alpha_1) + o\big(h^2\big),$$

where the $o(h^2)$ term is uniform over $[h,1-h]$ and is $O(h^2)$ uniformly over $[0,h]$ and $[1-h,1]$ under the conditions on $f(\cdot)$ and $g(\cdot)$, in which case it contributes an $o(h^2)$ when integrated out over $\alpha_1$. Note that

$$\int_0^1\bigg[\int_{\alpha}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]s_0'\Omega_h(\alpha)f(\alpha)\,d\alpha = \int_0^1\bigg[\int_{\alpha}^1\!\!\int_a^1g(a_1)\,\Omega_h(a_1)s_0\,da_1\,da\bigg]\,d\bigg[\int_0^{\alpha}s_0'\Omega_h(a)f(a)\,da\bigg]$$
$$= \int_0^1\bigg[\int_\alpha^1g(a)\,\Omega_h(a)\,da\bigg]s_0s_0'\bigg[\int_0^\alpha\Omega_h(a)f(a)\,da\bigg]d\alpha.$$
Since $\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha$, it follows that
\begin{align*}
I_1&=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
-h\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)f(\alpha)\,d\alpha
\\
&\quad+h^2\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o\!\left(h^2\right)
+\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
Consider now $I_2$. Integrating by parts gives
\begin{align*}
I_2&=\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{-\infty}^{\frac{\alpha_2-\alpha_1}{h}}\pi'(t)K(t)\,dt\right]
d\left[-\int_{\alpha_2}^1 g(a)\Omega_h(a)s\,da\right]\right\}d\alpha_1
\\
&=\left[\int_0^1 g(a)\Omega_h(a)\,da\right]s\int_0^1 f(\alpha_1)\left[\int_{-\infty}^{-\frac{\alpha_1}{h}}\pi'(t)K(t)\,dt\right]d\alpha_1
+\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{\alpha_2}^1 g(a)\Omega_h(a)\,da\right]s\,
\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2-\alpha_1}{h}\right)d\alpha_2\right\}d\alpha_1
\end{align*}
with
\begin{align*}
&\int_0^1 f(\alpha_1)\left\{\int_0^1\left[\int_{\alpha_2}^1 g(a)\Omega_h(a)\,da\right]s\,
\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2-\alpha_1}{h}\right)d\alpha_2\right\}d\alpha_1
=\int_0^1 f(\alpha_1)\left\{\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}\left[\int_{\alpha_1+ht}^1 g(a)\Omega_h(a)\,da\right]s\,\pi'(t)K(t)\,dt\right\}d\alpha_1
\\
&\quad=\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\Omega_h(\alpha)\,d\alpha
-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(h).
\end{align*}
Hence
\begin{align*}
I_2=\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\Omega_h(\alpha)\,d\alpha
-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(h)
+\left[\int_0^1 g(a)\Omega_h(a)\,da\right]s
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
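The interchange-of-integration identity invoked at the start of the display above, $\int_0^1[\int_\alpha^1 g(a)\,da]\,d\alpha=\int_0^1\alpha\,g(\alpha)\,d\alpha$, can be verified numerically. A minimal sketch, with a hypothetical scalar $g$ standing in for the vector-valued $g(\cdot)\Omega_h(\cdot)$ and a composite midpoint quadrature rule:

```python
import math

def integrate(fun, lo, hi, n=1000):
    # composite midpoint rule for the integral of fun over [lo, hi]
    step = (hi - lo) / n
    return sum(fun(lo + (i + 0.5) * step) for i in range(n)) * step

# hypothetical smooth integrand standing in for g(.)Omega_h(.)
g = lambda a: math.exp(a) * (1.0 + a)

# left side: integral over alpha in [0,1] of the tail integral of g over [alpha, 1]
lhs = integrate(lambda alpha: integrate(g, alpha, 1.0), 0.0, 1.0)
# right side: integral over [0,1] of alpha * g(alpha)
rhs = integrate(lambda alpha: alpha * g(alpha), 0.0, 1.0)

assert abs(lhs - rhs) < 1e-5  # equal up to quadrature error, by Fubini
```

The same Fubini argument delivers the companion identity $\int_0^1[\int_0^\alpha f(a)\,da]\,d\alpha=\int_0^1 f(\alpha)(1-\alpha)\,d\alpha$ used later in the proof.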
For $I_3$, the change of variable $\alpha_2=\alpha_1+h\tau$, Assumption H and the conditions on $f(\cdot)$ and $g(\cdot)$ give
\begin{align*}
I_3&=\int_0^1 f(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}g(\alpha_1+h\tau)
\int_{-\frac{\alpha_1}{h}-\tau}^{\frac{1-\alpha_1}{h}-\tau}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\,\pi'(tu+\tau)K(tu+\tau)\,du\right\}dt\,d\tau\,d\alpha_1
\\
&=\int_0^1 f(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}g(\alpha_1)
\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\,\pi'(tu+\tau)K(tu+\tau)\,du\right\}dt\,d\tau\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_0^1\frac{1}{h}\,\pi'\!\left(\frac{\alpha_2+htu-\alpha_1}{h}\right)K\!\left(\frac{\alpha_2+htu-\alpha_1}{h}\right)d\alpha_2\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_{-\frac{\alpha_1}{h}+tu}^{\frac{1-\alpha_1}{h}+tu}\pi'(\tau)K(\tau)\,d\tau\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\int_0^1 f(\alpha_1)g(\alpha_1)\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}t\,\pi(t)K(t)
\left\{\int_0^1(1-u)\left[\int_{-\frac{\alpha_1}{h}}^{\frac{1-\alpha_1}{h}}\pi'(\tau)K(\tau)\,d\tau\right]du\right\}dt\,d\alpha_1+o(1)
\\
&=\frac{1}{2}\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha+o(1).
\end{align*}
Now $I_h=I_1+hI_2-hI_3$ and the expressions of $I_1$, $I_2$ and $I_3$ give
\begin{align*}
I_h&=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
+h\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]\left[s\,s'-s\,s'\right]\Omega_h(\alpha)\,d\alpha
\\
&\quad-h\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)s\,s'\Omega_h(\alpha)\,d\alpha
+h^2\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)\left[s\,s'+s\,s'\right]\Omega_h(\alpha)\,d\alpha+o\!\left(h^2\right)
\\
&\quad+\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\left[\alpha s+hs\right]d\alpha\right]
\left[\int_0^1 f(\alpha)\left[\int_{-\infty}^{-\frac{\alpha}{h}}\pi'(t)K(t)\,dt\right]d\alpha\right].
\end{align*}
We now prepare to compute the expansion of $J_h-I_h$.
Observe that $\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha$, so that
\begin{align*}
&\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha\right]
-\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\alpha\,f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
+\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
\\
&\qquad-\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_0^1\Omega_h(a)f(a)\,da\right]d\alpha
+\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_{\alpha}^1\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=\int_0^1\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\left[\int_{\alpha}^1\Omega_h(a)f(a)\,da\right]d\alpha
-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\alpha\,f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
\\
&\quad=\operatorname{Cov}\left(\int_A^1 g(a)\Omega_h(a)s\,da,\ \int_A^1 f(a)\Omega_h(a)s\,da\right).
\end{align*}
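The covariance rewriting above can be checked numerically in scalar form, assuming (as the unit-interval $d\alpha$ integrals suggest) that $A$ is uniform on $[0,1]$. A sketch with hypothetical scalar $f$ and $g$ standing in for $f(\cdot)\Omega_h(\cdot)s$ and $g(\cdot)\Omega_h(\cdot)s$, comparing the integral expression with a Monte Carlo estimate of $\operatorname{Cov}(G(A),F(A))$ where $G(\alpha)=\int_\alpha^1 g$ and $F(\alpha)=\int_\alpha^1 f$:

```python
import math
import random

# hypothetical scalar stand-ins for g(.)Omega_h(.)s and f(.)Omega_h(.)s
g = lambda a: math.exp(a)
f = lambda a: 1.0 + a * a

def tail_integral(fun, lo, n=60):
    # midpoint rule for the integral of fun over [lo, 1]
    step = (1.0 - lo) / n
    return sum(fun(lo + (i + 0.5) * step) for i in range(n)) * step

def unit_integral(fun, n=2000):
    # midpoint rule for the integral of fun over [0, 1]
    step = 1.0 / n
    return sum(fun((i + 0.5) * step) for i in range(n)) * step

G = lambda a: tail_integral(g, a)
F = lambda a: tail_integral(f, a)

# integral form: E[G(A)F(A)] - E[G(A)] E[F(A)], using E[G(A)] = int alpha*g(alpha)
lhs = unit_integral(lambda a: G(a) * F(a)) \
    - unit_integral(lambda a: a * g(a)) * unit_integral(lambda a: a * f(a))

# Monte Carlo covariance with A ~ Uniform[0, 1]
random.seed(0)
draws = [random.random() for _ in range(50000)]
x = [G(a) for a in draws]
y = [F(a) for a in draws]
mx, my = sum(x) / len(x), sum(y) / len(y)
mc_cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

assert abs(lhs - mc_cov) < 0.02  # agree up to Monte Carlo error
```

The tolerance is loose only because of simulation noise; the identity itself is exact.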
Similarly, $\int_0^1\left[\int_0^{\alpha}f(a)\Omega_h(a)\,da\right]d\alpha=\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha$ gives, after an integration by parts,
\begin{align*}
&\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)(1-\alpha)\Omega_h(\alpha)\,d\alpha\right]
-\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)\,d\alpha
\\
&\quad=\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1\left(\int_0^{\alpha}\Omega_h(a)f(a)\,da\right)d\alpha\right]
-\int_0^1 g(\alpha)\Omega_h(\alpha)s\,s'\left[\int_0^{\alpha}\Omega_h(a)f(a)\,da\right]d\alpha
\\
&\quad=-\operatorname{Cov}\left(g(A)\Omega_h(A)s,\ \left[\int_0^A f(a)\Omega_h(a)\,da\right]s\right)
=\operatorname{Cov}\left(g(A)\Omega_h(A)s,\ \left[\int_A^1 f(a)\Omega_h(a)\,da\right]s\right),
\end{align*}
\begin{align*}
\int_0^1 f(\alpha)\left[\int_{\alpha}^1 g(a)\Omega_h(a)\,da\right]s\,s'\,\Omega_h(\alpha)\,d\alpha
-\left[\int_0^1\alpha\,g(\alpha)\Omega_h(\alpha)\,d\alpha\right]s\,s'\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
=\operatorname{Cov}\left(\left[\int_A^1 g(a)\Omega_h(a)\,da\right]s,\ f(A)\Omega_h(A)s\right),
\end{align*}
and, for any conformable $u$ and $v$,
\begin{align*}
\int_0^1 f(\alpha)g(\alpha)\Omega_h(\alpha)\left[u\,v'\right]\Omega_h(\alpha)\,d\alpha
-\left[\int_0^1 g(\alpha)\Omega_h(\alpha)\,d\alpha\right]\left[u\,v'\right]\left[\int_0^1 f(\alpha)\Omega_h(\alpha)\,d\alpha\right]
=\operatorname{Cov}\left(g(A)\Omega_h(A)u,\ f(A)\Omega_h(A)v\right).
\end{align*}
Collecting these items gives the expansion of $C_h$ stated in the Lemma. $\Box$