A Consistent LM Type Specification Test for Semiparametric Panel Data Models
Ivan Korolev

September 12, 2019
Abstract
This paper develops a consistent series-based specification test for semiparametric panel data models with fixed effects. The test statistic resembles the Lagrange Multiplier (LM) test statistic in parametric models and is based on a quadratic form in the restricted model residuals. The use of series methods facilitates both estimation of the null model and computation of the test statistic. The asymptotic distribution of the test statistic is standard normal, so that appropriate critical values can easily be computed. The projection property of series estimators allows me to develop a degrees of freedom correction. This correction makes it possible to account for the estimation variance and obtain refined asymptotic results. It also substantially improves the finite sample performance of the test.

* I thank seminar participants at several universities and conferences for helpful comments. All remaining errors are mine.
† Department of Economics, Binghamton University. E-mail: [email protected]. Website: https://sites.google.com/view/ivan-korolev/home

1 Introduction
Panel data allows researchers to better account for individual heterogeneity and estimate richer models than cross-sectional data. Traditionally, the literature on panel data models focused on fully parametric models. Popular textbooks written by Arellano (2003), Hsiao (2003), and Baltagi (2013) give excellent overviews of such models. However, parametric models may be misspecified, and more flexible panel data models may be needed.

Semiparametric models, such as partially linear or varying coefficient models, serve as an attractive alternative to fully parametric models. While being more flexible than parametric models, they are more tractable than fully nonparametric models and alleviate the curse of dimensionality. Ai and Li (2008), Su and Ullah (2011), Rodriguez-Poo and Soberon (2017), and Parmeter and Racine (2018) provide excellent surveys of recent developments in semiparametric panel data models.

While there is a growing literature on estimation of semiparametric panel data models, the literature on specification testing in this setting remains scarce. There are three possible reasons for this. First, in the presence of fixed effects, one needs to transform the data to eliminate them before the model can be estimated. Estimating the transformed model in itself can be challenging when kernel methods are used.

Second, the asymptotic theory for consistent specification tests in semiparametric panel data models can be challenging. For instance, Henderson et al. (2008) propose kernel-based specification tests both for parametric and semiparametric fixed effects panel data models and suggest using the bootstrap to obtain critical values, but they do not derive asymptotic properties of their tests. In turn, Lin et al.
(2014) develop an asymptotic theory for kernel-based specification tests for panel data models with fixed effects, but they only consider parametric models.

Third, it has long been known in the literature on consistent specification tests that asymptotic approximations often do not work well in finite samples even with cross-sectional data (see, e.g., Li and Wang (1998)). The bootstrap is typically used to improve the finite sample performance of consistent specification tests, but it may be computationally costly.

In this paper I rely on the results on series estimation of fixed effects panel data models from Baltagi and Li (2002) and An et al. (2016) and develop a consistent Lagrange Multiplier (LM) type specification test for semiparametric panel data models. My test overcomes all three challenges described above.

First, the use of series methods leads to a model that is linear in parameters. As a result, transforming the data, e.g. applying the within transformation or taking first differences, to eliminate fixed effects is straightforward. Thus, the test is simple to implement.

Second, as in the cross-sectional models in Korolev (2019), the projection property of series estimators allows me to develop a degrees of freedom correction. Intuitively, when series methods are used, the restricted residuals are orthogonal to the series terms included in the restricted model. This means that even under the alternative, only a subset of moment conditions, rather than all of them, can be violated, which in turn affects the normalization of the test statistic. This degrees of freedom correction has two important consequences. From the theoretical point of view, it leads to a tractable asymptotic theory for the test and allows me to obtain refined asymptotic results. In my asymptotic analysis, I decompose the test statistic into the leading term and the remainder.
By relying on the projection nature of series estimators, I can directly account for the estimation variance, so that only bias enters the remainder term. Because of this, I only need to control the rate at which the bias goes to zero to bound the remainder term, while the variance can remain large. As a result, I can derive the asymptotic distribution of the test statistic under fairly weak rate conditions. From the practical point of view, the degrees of freedom correction substantially improves the finite sample performance of the test. While I propose a wild bootstrap procedure and establish its asymptotic validity, I show using simulations that the asymptotic version of the proposed test with the degrees of freedom correction performs almost as well as its wild bootstrap version. Hence, the degrees of freedom correction serves as a computationally attractive analytical way to obtain a test with good small sample behavior.

The remainder of the paper is organized as follows. Section 2 introduces the model and describes how to construct the series-based specification test for semiparametric fixed effects models. Section 3 develops the asymptotic theory for the proposed test. Section 4 studies the behavior of the proposed test in simulations. Section 5 applies my test to the data from Cornwell and Rupert (1988) and Baltagi and Khanti-Akom (1990). Section 6 concludes. Appendix A collects all tables and figures. Appendix B contains proofs of my results.

2 The Model and the Test
I consider a general nonparametric panel data model with fixed effects:

$Y_{it} = g(X_{it}) + u_{it} = g(X_{it}) + \mu_i + \varepsilon_{it}, \quad E[\varepsilon_{it} | X_i, \mu_i] = 0$,   (2.1)

where $X_i = (X_{i1}, ..., X_{iT})'$, $t = 1, ..., T$, and $i = 1, ..., n$. $\mu_i$ denotes the fixed effect, which captures unobserved heterogeneity and may be correlated with the regressors $X_i$. In my asymptotic analysis, I will assume that $T$ is fixed while $n$ grows to infinity.

The goal of this paper is to test that the true model is semiparametric, i.e. that

$H_{SP}: P_X(g(X_{it}) = f(X_{it}, \theta_0, h_0)) = 1$ for some $\theta_0 \in \Theta$, $h_0 \in \mathcal{H}$,   (2.2)

where $f: \mathcal{X} \times \Theta \times \mathcal{H} \to \mathbb{R}$ is a known function, $\theta_0 \in \Theta \subset \mathbb{R}^d$ is a finite-dimensional parameter, and $h_0 \in \mathcal{H} = \mathcal{H}_1 \times ... \times \mathcal{H}_q$ is a vector of unknown functions. For instance, if the semiparametric model is partially linear, then $f(X_{it}, \theta, h) = X_{1it}'\theta + h(X_{2it})$, where $X_{it} = (X_{1it}', X_{2it}')'$. Many other semiparametric models can also be written in this form. The global alternative is

$H_1: P_X(g(X_{it}) \neq f(X_{it}, \theta, h)) > 0$ for all $\theta \in \Theta$, $h \in \mathcal{H}$.   (2.3)

2.1 Series Estimators

As in Korolev (2019), I use series methods to replace unknown functions with their finite series expansions. Namely, for any variable $z$, let $Q^{a_n}(z) = (q_1(z), ..., q_{a_n}(z))'$ be an $a_n$-dimensional vector of approximating functions of $z$, where the number of series terms $a_n$ is allowed to grow with the sample size $n$. Then an unknown function $g(z)$ can be approximated as $g(z) \approx \sum_{j=1}^{a_n} q_j(z)\gamma_j = Q^{a_n}(z)'\gamma$.
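The series approximation $g(z) \approx Q^{a_n}(z)'\gamma$ can be illustrated numerically. The helper below is a hypothetical sketch (not the paper's code) that builds a power-series basis and fits an unknown function by least squares; the paper also allows splines.

```python
import numpy as np

def power_series_basis(z, a_n):
    """Q^{a_n}(z) = (q_1(z), ..., q_{a_n}(z))': the first a_n power-series
    terms 1, z, z^2, ... of a scalar regressor z."""
    z = np.asarray(z, dtype=float)
    # Column j holds z**j, j = 0, ..., a_n - 1; the first term is the constant.
    return np.column_stack([z**j for j in range(a_n)])

# Approximate an "unknown" function g by least squares on the basis.
rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=500)
g = np.exp(z)                        # the function to approximate
Q = power_series_basis(z, 4)         # a_n = 4 terms: 1, z, z^2, z^3
gamma = np.linalg.lstsq(Q, g, rcond=None)[0]
approx_err = np.max(np.abs(Q @ gamma - g))   # sup-norm error on the sample
```

As $a_n$ grows, this approximation error shrinks at a rate governed by the smoothness of the underlying function, in the spirit of Assumption 4 below.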
I replace all unknown functions in $f(X_{it}, \theta, h)$ with their finite series expansions and write the semiparametric model in a series form as

$Y_{it} = W_{it}'\beta_1 + R_{it} + \mu_i + \varepsilon_{it}$,   (2.4)

where $W_{it} := W^{m_n}(X_{it}) := (W_1(X_{it}), ..., W_{m_n}(X_{it}))'$ are appropriate regressors or basis functions, such as power series or splines, $m_n$ is the number of parameters in the semiparametric null model, and $R_{it} = f(X_{it}, \theta_0, h_0) - W_{it}'\beta_1$ is the approximation error.

To construct a specification test, I include additional series terms, $Z_{it} := Z^{r_n}(X_{it}) := (Z_1(X_{it}), ..., Z_{r_n}(X_{it}))'$, that capture possible deviations from the null hypothesis:

$Y_{it} = W_{it}'\beta_1 + Z_{it}'\beta_2 + R_{it} + \mu_i + \varepsilon_{it} = P_{it}'\beta + R_{it} + \mu_i + \varepsilon_{it}$,   (2.5)

where $P_{it} := P^{k_n}(X_{it}) := (W_{it}', Z_{it}')'$, $k_n = m_n + r_n$ is the total number of parameters, and $\beta = (\beta_1', \beta_2')'$. For instance, in the partially linear model example above, the additional series terms can include nonlinear terms in $X_{1it}$ and interactions between $X_{1it}$ and $X_{2it}$.

Due to the presence of fixed effects $\mu_i$ that may be correlated with $X_{it}$, it is problematic to estimate or test this model directly. Instead, I use first differencing or the within transformation to get rid of the fixed effects. The model becomes

$\hat{Y}_{it} = \hat{W}_{it}'\beta_1 + \hat{Z}_{it}'\beta_2 + \hat{R}_{it} + \hat{\varepsilon}_{it} = \hat{P}_{it}'\beta + \hat{R}_{it} + \hat{\varepsilon}_{it}$,

where, for any variable $A_{it}$, in the former case $\hat{A}_{it} = A_{it} - A_{i,t-1}$, and in the latter case $\hat{A}_{it} = A_{it} - \frac{1}{T}\sum_{s=1}^T A_{is}$. The specification test reduces to testing the hypothesis $\beta_2 = 0$. For any variable $A_{it}$, let $A_i = (A_{i1}, ..., A_{iT})'$ and $A = (A_1', ..., A_n')'$.
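Both transformations that eliminate the fixed effects are simple array operations. A minimal sketch, assuming the panel is stored as an $n \times T$ array (hypothetical helper names):

```python
import numpy as np

def within_transform(A):
    """Within transformation: A_it - (1/T) * sum_s A_is, removing the unit mean."""
    return A - A.mean(axis=1, keepdims=True)

def first_difference(A):
    """First differencing: A_it - A_{i,t-1}; the output has T - 1 periods."""
    return A[:, 1:] - A[:, :-1]

# Either transformation wipes out an additive fixed effect mu_i.
rng = np.random.default_rng(1)
n, T = 4, 3
mu = rng.normal(size=(n, 1))       # fixed effects, constant over t
eps = rng.normal(size=(n, T))
Y = mu + eps                       # Y_it = mu_i + eps_it
ok_within = np.allclose(within_transform(Y), within_transform(eps))
ok_fd = np.allclose(first_difference(Y), first_difference(eps))
```

Because the transformed model is linear in the series coefficients, either choice leaves a standard linear regression to estimate, which is what makes the series approach convenient here.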
The restricted estimate of $\beta_1$ is obtained from the regression of $\hat{Y}$ on $\hat{W}$ and is given by $\tilde{\beta}_1 = (\hat{W}'\hat{W})^{-1}\hat{W}'\hat{Y}$, and the restricted residuals are $\tilde{e} = \hat{Y} - \hat{W}(\hat{W}'\hat{W})^{-1}\hat{W}'\hat{Y} = M_{\hat{W}}\hat{Y}$, where $M_{\hat{W}} = I - \hat{W}(\hat{W}'\hat{W})^{-1}\hat{W}'$. If the null is true, it can be shown that $\tilde{e} = M_{\hat{W}}\hat{\varepsilon} + M_{\hat{W}}\hat{R}$.

The test will be based on the moment condition $E[\hat{P}_i'\hat{\varepsilon}_i] = 0$. The sample analog of this moment condition is $\sum_{i=1}^n \hat{P}_i'\tilde{e}_i/n$. Note that $\sum_{i=1}^n \hat{W}_i'\tilde{e}_i/n = 0$, so the test is essentially based on $\sum_{i=1}^n \hat{Z}_i'\tilde{e}_i/n$.

Let $\tilde{Z} = M_{\hat{W}}\hat{Z}$. The LM type test statistic is given by

$\xi_{HC} = \left(\sum_{i=1}^n \tilde{e}_i'\tilde{Z}_i\right)\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\tilde{e}_i'\tilde{Z}_i\right)^{-1}\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\right)$.   (2.6)

Alternatively, in the homoskedastic case, it can be simplified as follows:

$\xi = \left(\sum_{i=1}^n \tilde{e}_i'\tilde{Z}_i\right)\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{\Sigma}_T\tilde{Z}_i\right)^{-1}\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\right)$,   (2.7)

where $\tilde{\Sigma}_T = \frac{1}{n}\sum_{i=1}^n \tilde{e}_i\tilde{e}_i'$.

These two test statistics resemble the parametric LM test statistic. However, the number of restrictions $r_n$ is allowed to grow to infinity. Thus, in order to obtain convergence in distribution, a normalization is needed. The normalized test statistics are given by

$t_{HC} = \frac{\xi_{HC} - r_n}{\sqrt{2 r_n}}$ and $t = \frac{\xi - r_n}{\sqrt{2 r_n}}$.   (2.8)

I will show in the next section that under appropriate conditions, the normalized test statistics are asymptotically standard normal.

3 Asymptotic Theory

In this section, I develop the asymptotic theory for the proposed specification test. I analyze its behavior under the null hypothesis and under a fixed alternative.

3.1 Asymptotic Behavior under $H_{SP}$

This section derives the asymptotic distribution of the test statistic when the semiparametric model is correctly specified.
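Before turning to the assumptions, the construction of $\xi_{HC}$ and $t_{HC}$ above can be summarized in a few lines of linear algebra. The sketch below is a hypothetical implementation that folds the $T$ dimension into the unit index (i.e. treats each transformed observation as one independent unit), a simplification of the blockwise sums over $i$:

```python
import numpy as np

def lm_test(Y_hat, W_hat, Z_hat):
    """Return (xi_HC, t_HC) from transformed data.

    Y_hat: (N,) transformed outcomes; W_hat: (N, m_n) null-model series
    terms; Z_hat: (N, r_n) additional series terms.  Rows are treated as
    independent units here, a simplification of the paper's setup.
    """
    r_n = Z_hat.shape[1]
    # Restricted residuals e~ = M_W Y and projected extra terms Z~ = M_W Z.
    e = Y_hat - W_hat @ np.linalg.lstsq(W_hat, Y_hat, rcond=None)[0]
    Zt = Z_hat - W_hat @ np.linalg.lstsq(W_hat, Z_hat, rcond=None)[0]
    s = Zt * e[:, None]                       # rows Z~_i e_i
    score = s.sum(axis=0)                     # sum_i Z~_i' e_i
    omega = s.T @ s                           # sum_i Z~_i' e_i e_i' Z~_i
    xi_hc = score @ np.linalg.solve(omega, score)
    t_hc = (xi_hc - r_n) / np.sqrt(2.0 * r_n)  # degrees of freedom correction
    return xi_hc, t_hc

# Under a correctly specified linear null, t_HC should be moderate.
rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=n)
W = np.column_stack([np.ones(n), x])          # null model: linear in x
Z = np.column_stack([x**2, x**3])             # extra terms capturing deviations
Y = 1.0 + 2.0 * x + rng.normal(size=n)
xi_hc, t_hc = lm_test(Y, W, Z)
```

Note that the residuals are exactly orthogonal to the columns of `W`, which is the projection property underlying the normalization by $r_n$ rather than $k_n$.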
I start with my assumptions. First, I impose some regularity conditions on the data generating process.
Assumption 1. $(Y_{it}, X_{it}')' \in \mathbb{R}^{1+d_x}$, $d_x \in \mathbb{N}$, $i = 1, ..., n$, are independent across individuals, i.e. $(Y_i', X_i')'$ are i.i.d. random draws of the random variables $(Y', X')'$, and the support of $X$, $\mathcal{X}$, is a compact subset of $\mathbb{R}^{d_x}$.

Assumption 2.
Let $\varepsilon_i = Y_i - E[Y_i|X_i]$. The following two conditions hold:

(a) $\Sigma(x) = E[\varepsilon_i\varepsilon_i' | X_i = x]$ is bounded.

(b) $E[\varepsilon_{it}^4 | X_i]$ is bounded.

The following assumption deals with the behavior of the approximating series functions. From now on, let $\|A\| = [\mathrm{tr}(A'A)]^{1/2}$ be the Euclidean norm of a matrix $A$. Let $x \in \mathbb{R}^{d_x}$ be a realization of the random variable $X_{it}$.

Assumption 3. For each $m$, $r$, and $k$ there are matrices $B_1$ and $B_2$ such that, for $\bar{W}^m(x) = B_1\hat{W}^m(x)$, $\bar{Z}^r(x) = B_2\hat{Z}^r(x)$, and $\bar{P}^k(x) = (\bar{W}^m(x)', \bar{Z}^r(x)')'$:

(a) There exists a sequence of constants $\zeta(\cdot)$ that satisfies the conditions $\sup_{x\in\mathcal{X}}\|\bar{W}^m(x)\| \leq \zeta(m)$, $\sup_{x\in\mathcal{X}}\|\bar{Z}^r(x)\| \leq \zeta(r)$, and $\sup_{x\in\mathcal{X}}\|\bar{P}^k(x)\| \leq \zeta(k)$.

(b) The smallest eigenvalue of $E[\bar{P}^k(X_{it})\bar{P}^k(X_{it})']$ is bounded away from zero uniformly in $k$.

Assumption 4.
Suppose that $H_{SP}$ holds. There exist $\alpha > 0$ and $\beta_1 \in \mathbb{R}^{m_n}$ such that

$\sup_{x\in\mathcal{X}} |f(x, \theta_0, h_0) - W^{m_n}(x)'\beta_1| = O(m_n^{-\alpha})$.

$\beta_1$ in this assumption can be defined in various ways. One natural definition is via the projection: $\beta_1 = E[\hat{W}_{it}\hat{W}_{it}']^{-1}E[\hat{W}_{it}\hat{f}(X_i, \theta_0, h_0)]$, where $\hat{f}(\cdot)$ is the appropriate transformation of $f(\cdot)$.

Theorem 1.
Assume that Assumptions 1, 2, 3, and 4 are satisfied, and the following rate conditions hold:

$(m_n/n + m_n^{-2\alpha})\,\zeta(r_n)^2\,r_n^{1/2} \to 0$,   (3.1)

$\zeta(r_n)\,r_n/n^{1/2} \to 0$,   (3.2)

$\zeta(k_n)\,m_n^{1/2}k_n^{1/2}/n^{1/2} \to 0$,   (3.3)

$n\,m_n^{-2\alpha}/r_n^{1/2} \to 0$,   (3.4)

$\zeta(r_n)^2/n^{1/2} \to 0$.   (3.5)

Also assume that $\|\hat{\Omega} - \tilde{\Omega}\| = o_p(r_n^{-1/2})$, where

$\tilde{\Omega} = n^{-1}\sum_{i=1}^n \hat{Z}_i'\tilde{e}_i\tilde{e}_i'\hat{Z}_i$ and $\hat{\Omega} = n^{-1}\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\tilde{e}_i'\tilde{Z}_i$.

Then under $H_{SP}$,

$t_{HC} = \frac{\xi_{HC} - r_n}{\sqrt{2 r_n}} \xrightarrow{d} N(0, 1)$,   (3.6)

where $\xi_{HC}$ is as in Equation (2.6).

If, in addition to the assumptions above, $\Sigma(x) \equiv \Sigma$ and $\|\hat{\Omega} - \tilde{\Omega}\| = o_p(r_n^{-1/2})$, where

$\tilde{\Omega} = n^{-1}\sum_{i=1}^n \hat{Z}_i'\tilde{\Sigma}_T\hat{Z}_i$ and $\hat{\Omega} = n^{-1}\sum_{i=1}^n \tilde{Z}_i'\tilde{\Sigma}_T\tilde{Z}_i$,

then

$t = \frac{\xi - r_n}{\sqrt{2 r_n}} \xrightarrow{d} N(0, 1)$,

where $\xi$ is as in Equation (2.7).

The normalization I use, $r_n$, differs from the normalization used in most series-based specification tests for parametric models with cross-sectional data, which use the total number of parameters in the nonparametric model, $k_n$ (see equations (2.1) and (2.2) in Hong and White (1995) and Lemma 6.2 in Donald et al. (2003)). This difference can be viewed as a degrees of freedom correction.

The fact that I am dealing with semiparametric, as opposed to parametric, models requires me to modify the key step of my proof, going from the transformed semiparametric regression residuals $\tilde{e}$ to the transformed true errors $\hat{\varepsilon}$. My approach relies on the projection property of series estimators to eliminate the estimation variance and hence only needs to deal with the approximation bias. Specifically, it uses the equality $\tilde{e} = M_{\hat{W}}\hat{\varepsilon} + M_{\hat{W}}\hat{R}$, applies a central limit theorem for $U$-statistics to the quadratic form in $M_{\hat{W}}\hat{\varepsilon}$, and bounds the remainder terms by requiring the approximation error $\hat{R}$ to be small.

The conventional approach does not impose any special structure on the model residuals and uses the equality $\tilde{e} = \hat{\varepsilon} + (\hat{g} - \tilde{g})$.
In parametric models, $\hat{g} - \tilde{g} = \hat{X}'(\beta_0 - \hat{\beta})$, and $\hat{\beta}$ is $\sqrt{n}$-consistent. This makes it possible to apply a central limit theorem for $U$-statistics to the quadratic form in $\hat{\varepsilon}$ and bound the remainder terms that depend on $\hat{X}'(\beta_0 - \hat{\beta})$. However, in semiparametric models this approach needs to deal with both the bias and the variance of semiparametric estimators. Specifically, $\hat{g} - \tilde{g} = \hat{R} + \hat{W}'(\beta_1 - \tilde{\beta}_1)$, where $\hat{R}$ can be viewed as the bias term and $\hat{W}'(\beta_1 - \tilde{\beta}_1)$ as the variance term. Thus, in order for $(\hat{g} - \tilde{g})$ to be small, both the bias and the variance need to vanish sufficiently fast, and the resulting rate conditions turn out to be very restrictive. To see this, it is useful to look at the rates that would be permissible with and without the degrees of freedom correction.

Usually $\zeta(k) = O(k^{1/2})$ for splines and $\zeta(k) = O(k)$ for power series. With the degrees of freedom correction, polynomial rates of growth for $k_n$, $r_n$, and $m_n$ are permissible under moderate requirements on the smoothness index $\alpha$, with weaker requirements for splines than for power series. Without the degrees of freedom correction, in order for the test to be asymptotically valid, $m_n$ typically has to be of the order $o(k_n^{1/2})$, so that $m_n$ must grow much more slowly and a substantially larger smoothness index $\alpha$ is required.

3.2 Bootstrap

In this section I propose a wild bootstrap procedure that can be used to obtain critical values for my test and establish its asymptotic validity. I will compare the small sample behavior of the asymptotic and bootstrap versions of the test in simulations.

Because I am interested in approximating the asymptotic distribution of the test under the null hypothesis, the bootstrap data generating process should satisfy the null.
Moreover, because my test is robust to heteroskedasticity, the bootstrap data generating process should be able to accommodate heteroskedastic errors. Finally, because the errors in panel data models may be correlated over time (but not across units), the bootstrap procedure should take this into account. The wild bootstrap can satisfy all of these requirements.

The bootstrap procedure will be based on the residuals from the transformed data, $\tilde{e}_{it} = \hat{Y}_{it} - \hat{W}_{it}'\tilde{\beta}_1$. I require the bootstrap errors to satisfy the following two requirements: (i) $E^*[\hat{\varepsilon}_i^*] = 0$, (ii) $E^*[\hat{\varepsilon}_i^*\hat{\varepsilon}_i^{*\prime}] = \tilde{e}_i\tilde{e}_i'$, where $E^*[\cdot] = E[\cdot|\mathcal{Z}_{n,T}]$ is the expectation conditional on the data $\mathcal{Z}_{n,T} = \{(Y_{it}, X_{it}')'\}_{i=1,t=1}^{n,T}$. To satisfy these requirements, I let $\hat{\varepsilon}_i^* = V_i^*\tilde{e}_i$, where $V_i^*$ is a draw from a two-point distribution. Note that I use the same $V_i^*$ for all time periods for a given $i$. By doing so, I maintain the intertemporal correlation of the transformed errors and residuals in the original sample.

Various choices of $V_i^*$ are possible. One popular option is Mammen's two-point distribution, originally introduced in Mammen (1993):

$V_i^* = (1 - \sqrt{5})/2$ with probability $(\sqrt{5} + 1)/(2\sqrt{5})$, and $V_i^* = (1 + \sqrt{5})/2$ with probability $(\sqrt{5} - 1)/(2\sqrt{5})$.

Another possible choice is the Rademacher distribution, as suggested in Davidson and Flachaire (2008):

$V_i^* = -1$ with probability $1/2$, and $V_i^* = 1$ with probability $1/2$.

The wild bootstrap procedure then works as follows:

1. Obtain the estimates $\tilde{\beta}_1$ and residuals $\tilde{e}_i$ from the restricted model $\hat{Y}_{it} = \hat{W}_{it}'\beta_1 + \hat{e}_{it}$.

2. Generate the wild bootstrap errors $\hat{\varepsilon}_i^* = V_i^*\tilde{e}_i$.

3. Obtain $\hat{Y}_{it}^* = \hat{W}_{it}'\tilde{\beta}_1 + \hat{\varepsilon}_{it}^*$. Then estimate the restricted model and obtain the restricted bootstrap residuals $\tilde{e}_{it}^*$ using the bootstrap sample $\{(\hat{Y}_{it}^*, \hat{W}_{it}')'\}_{i=1,t=1}^{n,T}$.

4. Use $\tilde{e}_{it}^*$ in place of $\tilde{e}_{it}$ to compute the bootstrap test statistic $t_{HC,r_n}^*$ or $t_{r_n}^*$.
5. Repeat steps 2–4 $B$ times (e.g. $B = 399$) and obtain the empirical distribution of the $B$ test statistics $t_{r_n}^*$ or $t_{HC,r_n}^*$. Use this empirical distribution to compute the bootstrap critical values or the bootstrap $p$-values.
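Steps 1–5 can be sketched in code. The implementation below is a simplified, hypothetical stand-in (Rademacher weights, the homoskedastic form of the statistic, and transformed observations treated as independent units), not the paper's exact procedure:

```python
import numpy as np

def lm_stat(e, Zt):
    """Homoskedastic-form LM statistic (e'Z~)(sigma2 Z~'Z~)^{-1}(Z~'e)."""
    score = Zt.T @ e
    return score @ np.linalg.solve(np.mean(e**2) * (Zt.T @ Zt), score)

def wild_bootstrap(Y, W, Z, B=199, seed=0):
    """Wild bootstrap p-value for the LM statistic, one V*_i per unit."""
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(W, Y, rcond=None)[0]          # step 1: restricted fit
    e = Y - W @ beta                                     # restricted residuals
    Zt = Z - W @ np.linalg.lstsq(W, Z, rcond=None)[0]    # Z~ = M_W Z
    stat = lm_stat(e, Zt)
    boot = np.empty(B)
    for b in range(B):
        v = rng.choice([-1.0, 1.0], size=Y.shape[0])     # step 2: Rademacher V*_i
        Ystar = W @ beta + v * e                         # step 3: bootstrap sample
        estar = Ystar - W @ np.linalg.lstsq(W, Ystar, rcond=None)[0]
        boot[b] = lm_stat(estar, Zt)                     # step 4: bootstrap statistic
    pval = (1 + np.sum(boot >= stat)) / (B + 1)          # step 5: bootstrap p-value
    return stat, pval

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
W = np.column_stack([np.ones(n), x])
Z = np.column_stack([x**2, x**3])
Y = 1.0 + 2.0 * x + rng.normal(size=n)   # the null (linearity) holds
stat, pval = wild_bootstrap(Y, W, Z)
```

With panel data, the same $V_i^*$ would multiply all $T$ transformed residuals of unit $i$, preserving their intertemporal correlation as described above.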
Then the following is true.

Theorem 2. Assume that the assumptions of Theorem 1 hold. Let $\mathcal{Z}_{n,T} = \{(Y_{it}, X_{it}')'\}_{i=1,t=1}^{n,T}$. Then

$F_{HC,n}^*(t) \to \Phi(t)$ in probability, for all $t$, as $n \to \infty$,

where $F_{HC,n}^*(t)$ is the bootstrap distribution of $t_{HC,r_n}^*|\mathcal{Z}_{n,T}$ and $\Phi(\cdot)$ is the standard normal CDF.

A similar result can be obtained for the homoskedastic test statistic $t_{r_n}^*$. It is omitted for brevity.

3.3 Asymptotic Behavior under a Fixed Alternative

This section discusses the behavior of the test statistic under a fixed alternative. First, a cautionary note is in order. Note that the null hypothesis concerns the model

$Y_{it} = g(X_{it}) + \mu_i + \varepsilon_{it}$,

while the semiparametric series estimation method is based on the transformed model

$\hat{Y}_{it} = \hat{g}(X_i) + \hat{\varepsilon}_{it}$,

where the model is transformed by taking first differences or using the within transformation. (Because of this, $\hat{g}(\cdot)$ may depend on all elements of $X_i$, not just $X_{it}$.) In particular, the model is estimated based on the series form

$\hat{Y}_{it} = \hat{W}_{it}'\beta_1 + \hat{e}_{it}$.

Because of this, my test will only be able to detect specification errors that are present
in the transformed model. In other words, the null hypothesis essentially becomes $H_0: P(\hat{g}(X_i) = \hat{f}(X_i, \theta_0, h_0)) = 1$ for some $\theta_0$ and $h_0$. Because most of the time researchers work with transformed models when they deal with fixed effects, I believe this is a reasonable hypothesis to test. Other specification tests for fixed effects panel data models, e.g. in Lin et al. (2014), are also usually based on the transformed residuals. As long as the transformation used does not eliminate the specification error in the original model, the test will be consistent for the original model.

Assumption 5. (Donald et al. (2003), Assumption 1) Assume that $E[\hat{P}_{it}\hat{P}_{it}']$ is finite for all $k$, and for any $a(x)$ with $E[a(X_i)^2] < \infty$ there are $k \times 1$ vectors $\gamma_k$ such that, as $k \to \infty$, $E[(a(X_i) - \hat{P}_{it}'\gamma_k)^2] \to 0$.

Lemma A.3 in the appendix shows that when this assumption is satisfied, the conditional moment restriction $E[\varepsilon_i|X_i] = 0$ is equivalent to a growing number of unconditional moment restrictions. The class of functions $a(x)$ for which the equivalence between the conditional and unconditional restrictions holds consists of functions that can be approximated (in the mean squared sense) using series as the number of series terms grows. While it is difficult to give a necessary and sufficient primitive condition that would describe this class of functions, the test will likely be consistent against continuous and smooth alternatives, while it may not be consistent against alternatives that exhibit jumps.

This is a population result in the sense that it does not involve the sample size $n$. In order to use this result in practice, I require the number of series terms used to construct the test statistic, $k_n$, to grow with the sample size. By doing so, I ensure that the unconditional moment restriction $E[\hat{P}_i'\hat{\varepsilon}_i] = 0$, on which the test is based, is equivalent to the conditional moment restriction $E[\hat{\varepsilon}_i|X_i] = 0$.
Thus, the test will be consistent against a wide class of alternatives satisfying Assumption 5.

In order to analyze the behavior of the test under a fixed alternative, I introduce some notation first. The true model is nonparametric:

$Y_{it} = g(X_{it}) + \mu_i + \varepsilon_{it}$, $E[\varepsilon_i|X_i] = 0$.

An alternative way to write this model is

$Y_{it} = f(X_{it}, \theta^*, h^*) + \mu_i + \varepsilon_{it}^*$,

where $\theta^*$ and $h^*$ are the pseudo-true parameter values and

$\varepsilon_{it}^* = \varepsilon_{it} + (g(X_{it}) - f(X_{it}, \theta^*, h^*)) = \varepsilon_{it} + d(X_{it})$

is a composite error term. The pseudo-true parameter values minimize $E[(g(X_{it}) - f(X_{it}, \theta, h))^2]$ over a suitable parameter space.

Note that the model can be written as

$Y_{it} = W_{it}'\beta_1^* + \mu_i + \varepsilon_{it}^* + R_{it}^*$,

where $R_{it}^* = f(X_{it}, \theta^*, h^*) - W_{it}'\beta_1^*$. After transforming the data, the model becomes

$\hat{Y}_{it} = \hat{W}_{it}'\beta_1^* + \hat{\varepsilon}_{it}^* + \hat{R}_{it}^*$.

The pseudo-true parameter value $\beta_1^*$ solves the moment condition $E[\hat{W}_{it}(\hat{Y}_{it} - \hat{W}_{it}'\beta_1^*)] = 0$, and the semiparametric estimator $\tilde{\beta}_1$ solves its sample analog $\hat{W}'(\hat{Y} - \hat{W}\tilde{\beta}_1)/n = 0$.

The following theorem provides the divergence rate of the test statistic under the fixed alternative.

Theorem 3. Let $\Omega^* = E[\hat{Z}_i'\hat{\varepsilon}_i\hat{\varepsilon}_i'\hat{Z}_i]$. In the heteroskedastic case, let

$\hat{\Omega} = n^{-1}\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\tilde{e}_i'\tilde{Z}_i$,

and in the homoskedastic case let

$\hat{\Omega} = n^{-1}\sum_{i=1}^n \tilde{Z}_i'\tilde{\Sigma}_T\tilde{Z}_i$.

Suppose that there exists $\beta_1^*$ such that $\sup_{x\in\mathcal{X}} |f(x, \theta^*, h^*) - W^{m_n}(x)'\beta_1^*| \to 0$, $\|\hat{\Omega} - \Omega^*\| \xrightarrow{p} 0$, the smallest eigenvalue of $\Omega^*$ is bounded away from zero, $m_n \to \infty$, $r_n \to \infty$, $r_n/n \to 0$, and $E[\hat{\varepsilon}_i^{*\prime}T_i]\Omega^{*-1}E[T_i'\hat{\varepsilon}_i^*] \to \Delta$, where $\Delta$ is a constant.
Then under homoskedasticity

$\frac{\sqrt{r_n}}{n}\cdot\frac{\xi - r_n}{\sqrt{2 r_n}} \xrightarrow{p} \Delta/\sqrt{2}$,

and under heteroskedasticity

$\frac{\sqrt{r_n}}{n}\cdot\frac{\xi_{HC} - r_n}{\sqrt{2 r_n}} \xrightarrow{p} \Delta/\sqrt{2}$.

4 Simulations

In this section, I study the finite sample performance of the proposed test using simulations. I have several goals: first, to illustrate the importance of the degrees of freedom correction; second, to study the sensitivity of the test to the choice of basis functions and tuning parameters; third, to compare the asymptotic version of my test with its bootstrap version; finally, to study the effect of the sample size on the test behavior.

The setup I use resembles the one in Korolev (2019) but includes fixed effects:

$Y_{it} = \mu_i + X_{1it}\beta + g(X_{2it}) + \varepsilon_{it}$.

Here $\varepsilon_{it}$ are independent across individuals $i$ and time $t$, while $\mu_i$ are fixed effects that are correlated with both the regressors and error terms for individual $i$. More specifically, $\mu_i = \nu_i + \mu_{X,i}$, where $\nu_i$ are i.i.d. normal and $\mu_{X,i}$ is a linear combination of $X_{1it}$ and $X_{2it}$ summed over $t = 1, ..., T$, so that the fixed effect is correlated with the regressors. In this setting, estimating the model $Y_{it} = 2X_{1it} + g(X_{2it}) + e_{it}$, where $e_{it} = \mu_i + \varepsilon_{it}$, would result in inconsistent estimates, so it is crucial to account for the panel nature of the data and for the presence of fixed effects. To achieve this, I use the within transformation. (I have tried using first differencing instead of the within transformation and obtained similar results.) After that, I estimate the model and compute the proposed test statistic.

I test the following null hypothesis:

$H_{SP}: P(E[Y_{it}|\mu_i, X_{it}] = \mu_i + X_{1it}\beta + g(X_{2it})) = 1$ for some $\beta$, $g(\cdot)$,

against the alternative

$H_1: P(E[Y_{it}|\mu_i, X_{it}] \neq \mu_i + X_{1it}\beta + g(X_{2it})) > 0$ for all $\beta$, $g(\cdot)$.

I use two data generating processes:

1. Semiparametric partially linear, which corresponds to $H_{SP}$:

$Y_{it} = \mu_i + 2X_{1it} + g(X_{2it}) + \varepsilon_{it}$, $g(X_{2it}) = 3 + 2(\exp(X_{2it}) - X_{2it} + 3)$.   (4.1)

2. Nonparametric, which corresponds to $H_1$:

$Y_{it} = \mu_i + 2X_{1it} + g(X_{2it}) + h(X_{1it}, X_{2it}) + \varepsilon_{it}$, $h(X_{1it}, X_{2it}) = h_1(X_{1it})h_2(X_{2it})$, $h_1(X_{1it}) = 1.25\cos(X_{1it} - 2)$, $h_2(X_{2it}) = \sin(0.5X_{2it})$.   (4.2)

I consider four setups: Setup 1 with $(n = 250, T = 2)$, Setup 2 with $(n = 250, T = 4)$, Setup 3 with $(n = 500, T = 2)$, and Setup 4 with $(n = 500, T = 4)$. I separately consider two settings: with homoskedastic errors and with heteroskedastic errors.

To implement the test, I use both power series and cubic splines as basis functions due to their popularity. Instead of studying the behavior of my test for a given (arbitrary) number of series terms, I vary the number of terms in univariate series expansions to investigate how the behavior of the test changes as a result. The total number of parameters $k_n$ ranges from 15 to 39 in Setups 1 and 2 and to 52 in Setups 3 and 4.

4.1 Homoskedastic Errors

First, I investigate the performance of the test when the errors are homoskedastic. The errors are normally distributed and independent across both $i$ and $t$: $\varepsilon_{it} \sim$ i.i.d. $N(0, 1)$. I consider tests based both on the LM type test statistic

$\xi = \left(\sum_{i=1}^n \tilde{e}_i'\tilde{Z}_i\right)\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{\Sigma}_T\tilde{Z}_i\right)^{-1}\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\right) \stackrel{a}{\sim} \chi^2(\tau_n)$

and on the normalized statistic $t_{\tau_n} = \frac{\xi - \tau_n}{\sqrt{2\tau_n}} \stackrel{a}{\sim} N(0, 1)$.

I start by looking at the simulated size of the test at the nominal 5% level. Figures 1, 2, 3, and 4 plot the simulated size as a function of the number of series terms in univariate series expansions $a_n$ (including the constant term) for the four setups I consider. The upper panels of these figures use the LM type statistic $\xi$, while the bottom panels use the normalized test statistic $t$. The left panels use power series and the right panels use splines.
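A data generating process in the spirit of this setup can be simulated as follows. The constants and functional forms below are illustrative stand-ins rather than the paper's exact choices; the key feature is that the fixed effect is correlated with the regressors through their time sums:

```python
import numpy as np

def simulate_panel(n=250, T=2, null=True, seed=0):
    """Simulate Y_it = mu_i + 2 X1_it + g(X2_it) [+ h(X1_it, X2_it)] + eps_it.

    g and h are stand-in smooth functions; under null=False the product
    deviation h is added, so the partially linear null is violated.
    """
    rng = np.random.default_rng(seed)
    X1 = rng.uniform(-2.0, 2.0, size=(n, T))
    X2 = rng.uniform(-2.0, 2.0, size=(n, T))
    nu = rng.normal(0.0, 0.5, size=(n, 1))
    # Fixed effect correlated with the regressors (illustrative weights).
    mu = nu + 0.25 * (X1 + X2).sum(axis=1, keepdims=True)
    eps = rng.normal(size=(n, T))
    g = np.exp(X2)                                   # smooth stand-in for g
    h = 1.25 * np.cos(X1 - 2.0) * np.sin(0.5 * X2)   # stand-in deviation under H1
    return mu + 2.0 * X1 + g + (0.0 if null else 1.0) * h + eps, X1, X2

Y0, X1, X2 = simulate_panel(null=True, seed=0)
Y1, _, _ = simulate_panel(null=False, seed=0)   # same draws, deviation added
```

Applying the within transformation to such data before estimation removes `mu` exactly, as in the simulation design described above.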
I consider four versions of the test: the asymptotic version with $\tau_n = r_n$ (red solid lines), the asymptotic version with $\tau_n = k_n$ (magenta solid lines), and the wild bootstrap versions with the Rademacher and Mammen distributions. (For more details, see the online supplement to Korolev (2019).)

We can see that the asymptotic test without the degrees of freedom correction (i.e. with $\tau_n = k_n$) is severely undersized. In turn, the asymptotic test with the degrees of freedom correction (i.e. with $\tau_n = r_n$) based on the $t$ statistic is slightly oversized, while the asymptotic test based on the $\xi$ statistic controls size very well. Depending on the setup, its performance is either very close to, or even better than, that of the wild bootstrap tests. We can also see that the performance of the test is fairly robust to the choice of basis functions and tuning parameters.

Next, I turn to the test power. Figures 5, 6, 7, and 8 plot the simulated power of the nominal 5% level test as a function of the number of series terms in univariate series expansions $a_n$. Given that the asymptotic test without the degrees of freedom correction is undersized, it is not surprising that it also has very low power. In turn, the power of the asymptotic version of the test with the degrees of freedom correction is very similar to the power of the wild bootstrap tests. As could be expected, the power increases as the sample size (the number of units $n$ or the number of periods $T$) increases. Finally, the power decreases as the number of series terms grows. This is due to the fact that the alternative is smooth and can be captured by the first few series terms. I will turn to a data-driven method to choose tuning parameters later.

4.2 Heteroskedastic Errors

In this section I investigate the performance of the test when the errors are heteroskedastic. The errors are normally distributed and independent, but not identically distributed, across both $i$ and $t$: $\varepsilon_{it} \sim$ i.n.i.d. $N(0, 0.75\exp(0.5X_{1it} + X_{2it}))$. I consider tests based both on the heteroskedasticity-robust LM type test statistic

$\xi_{HC} = \left(\sum_{i=1}^n \tilde{e}_i'\tilde{Z}_i\right)\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\tilde{e}_i'\tilde{Z}_i\right)^{-1}\left(\sum_{i=1}^n \tilde{Z}_i'\tilde{e}_i\right)$

and on the normalized statistic $t_{\tau_n,HC} = \frac{\xi_{HC} - \tau_n}{\sqrt{2\tau_n}} \stackrel{a}{\sim} N(0, 1)$.

First, I look at the simulated size of the test at the nominal 5% level. Figures 9, 10, 11, and 12 plot the simulated size as a function of the number of series terms in univariate series expansions $a_n$. We can see that the asymptotic test without the degrees of freedom correction is again severely undersized. The asymptotic test with the degrees of freedom correction based on the $\xi_{HC}$ statistic is also undersized, though its size becomes closer to the nominal level as the sample size grows. In turn, the simulated size of the asymptotic test with the degrees of freedom correction based on the $t_{HC}$ statistic is pretty close to the nominal level. In fact, in Setups 1 and 2, when splines are used, it controls size even better than the wild bootstrap tests.

Next, I turn to the test power. Figures 13, 14, 15, and 16 plot the simulated power of the nominal 5% level test as a function of the number of series terms in univariate series expansions $a_n$. The asymptotic test without the degrees of freedom correction has low power in all setups. The asymptotic test with the degrees of freedom correction based on the $\xi_{HC}$ test statistic is less powerful than the wild bootstrap tests, but the power loss decreases as the sample size grows.
Finally, the power of the asymptotic test with the degrees of freedom correction based on the t_HC statistic is fairly close to the power of the bootstrap tests, especially with larger sample sizes.

To summarize, even though the performance of the asymptotic test with the degrees of freedom correction deteriorates when the errors are heteroskedastic, as opposed to homoskedastic, it nevertheless comes close to the wild bootstrap tests in most setups. With larger sample sizes, its performance is almost indistinguishable from that of the bootstrap tests.

In the simulations presented above, the alternative was smooth, and the power of the test declined as the number of series terms increased. However, this is not always the case. There exist alternatives that are orthogonal to the first few series terms, and in order to detect such alternatives, one needs to include higher order series terms. In this section, I investigate the finite sample performance of a data-driven method to select tuning parameters.

In order to simplify the problem, I abstract away from the task of selecting the number of series terms under the null and consider a linear univariate null model:

$$Y_{it} = \mu_i + 2 X_{it} + \varepsilon_{it}.$$

The smooth alternative is given by

$$Y_{it} = \mu_i + 2 X_{it} + \cos(X_{it} - 2) + \varepsilon_{it}.$$

I also consider an alternative that is orthogonal to the first four power terms in X. A data-driven test should be able to adapt to a wide class of alternatives and choose tuning parameters appropriately.

I use a modified version of the approach proposed in Guay and Guerre (2006). I use the ξ test statistic and pick the value of r_n that maximizes

$$\xi(r_n) - r_n - \gamma_n \sqrt{2(r_n - r_{n,\min})},$$

where $\gamma_n = c \sqrt{2 \log \operatorname{Card} r_n}$, c is a constant that satisfies c ≥ ε for some ε > 0, Card r_n is the cardinality of the set of possible numbers of restrictions, and r_{n,min} is the lowest possible number of restrictions across different choices of r_n. The notation ξ(r_n) emphasizes the dependence of the test statistic

$$\xi = \left( \sum_{i=1}^{n} \tilde{e}_i' \tilde{Z}_i \right) \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde\Sigma_T \tilde{Z}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \right)$$

on the number r_n of elements in Z̃. Intuitively, r_n is the centering term for ξ(r_n), while $\gamma_n \sqrt{2(r_n - r_{n,\min})}$ is the penalty term that rewards simpler alternatives. In my analysis, I set c = 5.

Table 1 presents the results. I report the simulated size, power against the standard alternative, and power against the orthogonal alternative for the data-driven test and the test with the fixed number of series terms equal to a_n = 4 and a_n = 9 (including the constant term). The former choice of a_n is typically optimal under the regular alternative but has no power against the orthogonal alternative. The latter choice of a_n typically leads to good power against the orthogonal alternative but results in the loss of power against the well-behaved alternative.

As we can see, the data-driven test is slightly oversized in the first three setups and is slightly undersized in the last setup, but overall its size is close to the nominal level. Moreover, it has excellent power against the standard alternative and pretty good power against the orthogonal alternative.
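The selection rule above can be sketched in a few lines. This is an illustration only: the exact form of the penalty constant γ_n follows my reading of the garbled display (γ_n = c√(2 log Card r_n), with penalty γ_n√(2(r_n − r_{n,min}))), so treat both formulas and the function name as assumptions.

```python
import numpy as np

def select_r_n(xi_by_r, c=5.0):
    """Pick the number of restrictions r_n by maximizing the penalized statistic
    xi(r) - r - gamma_n * sqrt(2 * (r - r_min)), a Guay-Guerre style criterion.

    xi_by_r : dict mapping candidate r_n -> value of the xi statistic
    c       : penalty constant (the paper uses c = 5)
    """
    candidates = sorted(xi_by_r)
    r_min = candidates[0]
    # gamma_n = c * sqrt(2 * log(cardinality of the candidate set))
    gamma_n = c * np.sqrt(2.0 * np.log(len(candidates)))

    def penalized(r):
        return xi_by_r[r] - r - gamma_n * np.sqrt(2.0 * (r - r_min))

    return max(candidates, key=penalized)
```

The penalty grows with r_n, so a larger alternative is selected only if its statistic exceeds that of a simpler one by enough to pay the penalty.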
Even though a more careful investigation of data-driven specification tests for panel data models is beyond the scope of this paper, my simulations suggest that the proposed procedure performs well in finite samples.

Empirical Application

In this section, I apply my test to the PSID data that was used in Cornwell and Rupert (1988) and Baltagi and Khanti-Akom (1990). The dataset contains 7 years of observations on 595 heads of household who were between the ages of 18 and 65 in 1976 and had a positive reported wage in some private, non-farm employment for all 7 years. Among other models, the authors estimated the following wage equation with fixed effects:

$$LWAGE_{it} = \alpha WKS_{it} + \delta EXP_{it} + \lambda EXP_{it}^2 + D_{it}' \gamma + \mu_i + \varepsilon_{it}, \quad E[\varepsilon_i \mid WKS_i, EXP_i, D_i, \mu_i] = 0, \quad (5.1)$$

where LWAGE_it is the natural logarithm of the wage of individual i in year t, WKS is weeks worked,
EXP is experience, and D_it includes the following dummy variables: occupation (OCC = 1 if the individual has a blue-collar occupation), industry (IND = 1 if the individual works in a manufacturing industry), residence (SOUTH = 1 if the individual resides in the South, SMSA = 1 if the individual resides in a standard metropolitan statistical area), marital status (MS = 1 if the individual is married), and union coverage (UNION = 1 if the individual's wage is set by a union contract). The data are available at http://bcs.wiley.com/he-bcs/Books?action=resource&bcsId=4338&itemId=1118672321&resourceId=13452.

While my test is fairly general and applies to semiparametric as well as parametric models, I focus on a parametric model because parametric panel data models are prevalent in applications. I test this parametric model against the alternative which is fully nonparametric in weeks worked and experience but is parametric in the dummy variables:
$$LWAGE_{it} = g(WKS_{it}, EXP_{it}) + D_{it}' \gamma + \mu_i + \varepsilon_{it},$$

where D_it includes the six dummy variables listed above. Due to the number of dummy variables, considering the alternative which is fully nonparametric appears infeasible, as it would essentially require me to split the dataset into 2^6 = 64 bins and estimate the model within each bin separately.

In order to implement the test, I need to select the basis functions and the number of series terms. I use both power series and splines and utilize a data-driven procedure to select the number of series terms. Note that the number of terms under the null is fixed because the null model is parametric. Thus, I only need to choose the number of series terms under the alternative. Following the approach discussed in Section 4.3, I vary the number of series terms in univariate series expansions in WKS and
EXP from 3 to 8 (not including the constant) and pick the value of r_n that maximizes

$$\xi_{HC}(r_n) - r_n - \gamma_n \sqrt{2(r_n - r_{n,\min})}.$$

I find that the optimal number of terms is equal to 3 (not including the constant term), i.e. that a cubic polynomial should be used. In this case, power series and splines coincide, as there are no knots yet. The resulting number of restrictions is r_n = 12. The upper panel of Table 2 reports the heteroskedastic test statistic ξ_HC as well as the standardized statistic t_HC. As we can see, the null hypothesis that the model is correctly specified is not rejected at the 5% level, but it is rejected at the 10% level.

Next, I repeat this exercise for the specification that drops the quadratic term in experience. I use the same nonparametric alternative as before. Because the null model has one regressor fewer than before, I am testing r_n = 13 restrictions. The middle panel of Table 2 reports the results. All four types of the test reject the null hypothesis at any conventional significance level.

Finally, I estimate a semiparametric model that is nonparametric in experience but parametric in the remaining variables:

$$LWAGE_{it} = \alpha WKS_{it} + g(EXP_{it}) + D_{it}' \gamma + \mu_i + \varepsilon_{it}. \quad (5.2)$$

I estimate this semiparametric model using power series with up to cubic terms. The semiparametric model leads to r_n = 11 restrictions. Figure 17 plots the estimated effects of experience for the linear, quadratic, and semiparametric models. As we can see, the semiparametric model appears to be pretty similar to the quadratic model, and the linear model is not too far off. However, specification testing draws a somewhat different picture.
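For the bivariate nonparametric part g(WKS, EXP), one simple way to build the series terms of the alternative is to take all tensor products of univariate power terms up to a given degree. This is a sketch of that construction only; the paper's exact choice of cross-terms and spline knots is not shown here, and the function name is my own.

```python
import numpy as np

def tensor_power_terms(x1, x2, a_n):
    """All products x1**p * x2**q with 0 <= p, q <= a_n, excluding the constant.

    x1, x2 : (N,) arrays, e.g. (rescaled) WKS and EXP
    a_n    : number of univariate power terms beyond the constant
    Returns an (N, (a_n + 1)**2 - 1) design matrix of series terms.
    """
    cols = [x1 ** p * x2 ** q
            for p in range(a_n + 1) for q in range(a_n + 1)
            if p + q > 0]
    return np.column_stack(cols)
```

The test functions would then be the columns of this matrix that are not already spanned by the null model's regressors, after applying the same fixed-effects transformation.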
As we can see from the bottom panel of Table 2, while the linear model is overwhelmingly rejected and the quadratic model is rejected at the 10% level, there is no evidence against the semiparametric model.

Because in this paper I develop a specification test and not a model selection procedure, one should be careful with applying my test to several models sequentially. However, it appears that there is substantial evidence against the linear model, while there is little evidence against the quadratic specification employed by Cornwell and Rupert (1988) and Baltagi and Khanti-Akom (1990). If the researcher worries about the borderline results and wants to be on the safe side, it may be advisable to use a more flexible semiparametric model that is fully nonparametric in experience.

Conclusion
In this paper, I develop a Lagrange Multiplier type specification test for semiparametric panel data models with fixed effects. The test achieves consistency by turning a conditional moment restriction into a growing number of unconditional moment restrictions. Unlike in the traditional parametric Lagrange Multiplier test, both the number of parameters and the number of restrictions are allowed to grow with the sample size. I develop an asymptotic theory that explicitly takes this into account and prove that the normalized test statistic converges in distribution to the standard normal.

My test has several attractive features. First, fixed effects panel data models typically require researchers to transform their data, by taking first differences or applying the within transformation. This makes semiparametric estimation and specification testing that involve kernel methods problematic, as it is difficult to impose the additive structure on kernel estimators. In contrast, with series methods, the transformed model remains linear in parameters, and the proposed test is very simple to implement.

Second, the projection property of series estimators allows me to develop a degrees of freedom correction, which explicitly accounts for the variance of semiparametric estimators. Thus, I only need to control the bias, and my rate conditions are relatively mild. Moreover, the degrees of freedom correction results in good performance of the test in simulations.

In future research, I plan to extend the proposed test to semiparametric dynamic panel data models. The presence of endogenous variables calls for the use of instrumental variables. Estimation of such models, with endogeneity only in the parametric part, has been studied in Baltagi and Li (2002) and An et al. (2016). A possible concern for specification testing in these models is that nonparametric instrumental variables models are subject to the ill-posed inverse problem, so the unrestricted nonparametric model may not be identified.
It remains to be seen whether this identification problem poses a challenge for specification testing in dynamic panel data models.

Appendix A Tables and Figures
Figure 1: Simulated Size of the Test, n = 250, T = 2. This figure plots the simulated size of the nominal 5% test against the number of series terms in univariate series expansions, a_n (including the constant). The left panel uses power series. The right panel uses splines. The upper panel uses the ξ test statistic from Equation 2.7. The lower panel uses the t test statistic from Equation 2.8. The red solid line corresponds to the test that uses the asymptotic critical values and normalization τ_n = r_n. The magenta solid line corresponds to the test that uses the asymptotic critical values and normalization τ_n = k_n. The cyan dash-dotted line corresponds to the test that uses the wild bootstrap critical values based on the Rademacher distribution. The blue dashed line corresponds to the test that uses the wild bootstrap critical values based on Mammen's distribution. The results are based on M = 1,000 simulations and B = 399 bootstrap iterations.

Figure 2: Simulated Size of the Test, n = 250, T = 4. Same layout and legend as Figure 1.

Figure 3: Simulated Size of the Test, n = 500, T = 2. Same layout and legend as Figure 1.

Figure 4: Simulated Size of the Test, n = 500, T = 4. Same layout and legend as Figure 1.
Figure 5: Simulated Power of the Test, n = 250, T = 2. This figure plots the simulated power of the nominal 5% test against the number of series terms in univariate series expansions, a_n (including the constant); the layout and legend are otherwise the same as in Figure 1.

Figure 6: Simulated Power of the Test, n = 250, T = 4. Same layout and legend as Figure 5.

Figure 7: Simulated Power of the Test, n = 500, T = 2. Same layout and legend as Figure 5.

Figure 8: Simulated Power of the Test, n = 500, T = 4. Same layout and legend as Figure 5.
Figure 9: Simulated Size of the Test, n = 250, T = 2. Same layout and legend as Figure 1, except that the upper panel uses the ξ_HC test statistic from Equation 2.6.

Figure 10: Simulated Size of the Test, n = 250, T = 4. Same layout and legend as Figure 9.

Figure 11: Simulated Size of the Test, n = 500, T = 2. Same layout and legend as Figure 9.

Figure 12: Simulated Size of the Test, n = 500, T = 4. Same layout and legend as Figure 9.
Figure 13: Simulated Power of the Test, n = 250, T = 2. Same layout and legend as Figure 1, except that the upper panel uses the ξ_HC test statistic from Equation 2.6.

Figure 14: Simulated Power of the Test, n = 250, T = 4. Same layout and legend as Figure 13.

Figure 15: Simulated Power of the Test, n = 500, T = 2. Same layout and legend as Figure 13.

Figure 16: Simulated Power of the Test, n = 500, T = 4. Same layout and legend as Figure 13.

Figure 17: Estimated Effects of Experience. This figure plots the estimated effects of experience from different models. The dash-dotted line shows the estimated experience effect from the linear model.
The dashed line shows the estimated experience effect from the quadratic model. The solid line shows the estimated experience effect from the semiparametric model. The x axis plots experience. The y axis plots the logarithm of wage.

Table 1: Performance of the Data-Driven Test

                                Power Series                        Splines
                     Data-Driven  a_n = 4  a_n = 9    Data-Driven  a_n = 4  a_n = 9
n = 250, T = 2
  Size                     0.055    0.046    0.049          0.054    0.046    0.047
  Power, regular           0.369    0.363    0.196          0.369    0.363    0.201
  Power, orthogonal        0.105    0.051    0.224          0.096    0.051    0.225
n = 250, T = 4
  Size                     0.064    0.061    0.041          0.063    0.061    0.044
  Power, regular           0.825    0.825    0.622          0.825    0.825    0.621
  Power, orthogonal        0.447    0.062    0.654          0.440    0.062    0.644
n = 500, T = 2
  Size                     0.058    0.054    0.055          0.056    0.054    0.054
  Power, regular           0.673    0.671    0.444          0.673    0.671    0.448
  Power, orthogonal        0.238    0.048    0.421          0.241    0.048    0.426
n = 500, T = 4
  Size                     0.041    0.037    0.052          0.041    0.037    0.051
  Power, regular           0.990    0.990    0.935          0.990    0.990    0.932
  Power, orthogonal        0.856    0.033    0.933          0.843    0.033    0.939
The table reports the simulated size, power against the regular alternative, and power against the orthogonal alternative of the test based on the test statistic ξ. The left panel uses power series and the right panel uses splines. The column called "Data-Driven" uses the data-driven value of the tuning parameter r_n as described in Section 4.3. The other columns use the fixed number of series terms a_n equal to 4 or 9 (including the constant). The results are based on M = 1,000 simulations.

Table 2: Specification Testing Results

                          Test Statistic       5% Critical Value      10% Critical Value
                          ξ_HC      t_HC       ξ_HC      t_HC         ξ_HC      t_HC
Quadratic Model
Linear Model
Semiparametric Model

The table reports the values of the test statistics $\xi_{HC} \stackrel{a}{\sim} \chi^2(r_n)$ and $t_{HC} \stackrel{a}{\sim} N(0,1)$ for the quadratic specification from Equation 5.1, the linear specification, and the semiparametric specification from Equation 5.2. The number of restrictions is r_n = 12, r_n = 13, and r_n = 11, respectively. The corresponding critical values are shown together with the test statistics.

Appendix B Proofs

B.1 Proof of Theorem 1
The homoskedastic and heteroskedastic test statistics have different expressions, and it is convenient to introduce some notation that allows me to write them in a similar form. Denote $\hat\Omega = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \tilde{e}_i' \tilde{Z}_i$ in the heteroskedastic case or $\hat\Omega = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde\Sigma_T \tilde{Z}_i$ in the homoskedastic case. Then both test statistics can be written as

$$\xi = \left( \sum_{i=1}^{n} \tilde{e}_i' \tilde{Z}_i \right) \left( n \hat\Omega \right)^{-1} \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \right) = \tilde{e}' \tilde{Z} \left( n \hat\Omega \right)^{-1} \tilde{Z}' \tilde{e} = (\hat\varepsilon + \hat{R})' M_{\hat{W}} \tilde{Z} \left( n \hat\Omega \right)^{-1} \tilde{Z}' M_{\hat{W}} (\hat\varepsilon + \hat{R}).$$

Because $\tilde{Z} = M_{\hat{W}} \hat{Z}$ and $M_{\hat{W}} \tilde{Z} = M_{\hat{W}} M_{\hat{W}} \hat{Z} = M_{\hat{W}} \hat{Z} = \tilde{Z}$, the test statistic can be rewritten as

$$\xi = (\hat\varepsilon + \hat{R})' \tilde{Z} \left( n \hat\Omega \right)^{-1} \tilde{Z}' (\hat\varepsilon + \hat{R}).$$

The proof consists of several steps.

Step 1. Decompose the test statistic and bound the remainder terms.

$$(\hat\varepsilon + \hat{R})' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' (\hat\varepsilon + \hat{R}) = \hat\varepsilon' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat\varepsilon + \hat{R}' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat{R} + 2 \hat{R}' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat\varepsilon.$$

By Lemma A.4, the smallest and largest eigenvalues of $\tilde{Z}' \tilde{Z} / (nT)$ converge to one. Because $\tilde{Z}' \tilde{Z} / (nT)$ and $\tilde{Z} \tilde{Z}' / (nT)$ have the same nonzero eigenvalues, $\lambda_{\max}(\tilde{Z} \tilde{Z}' / (nT))$ converges in probability to 1.
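The projection identities used here ($\tilde Z = M_{\hat W} \hat Z$ and $M_{\hat W} \tilde Z = \tilde Z$, since $M_{\hat W}$ is idempotent) are easy to verify numerically. The helper below is my own sketch, not the author's code; it applies the residual maker $M_{\hat W}$ without ever forming the $N \times N$ projection matrix.

```python
import numpy as np

def residual_maker_apply(W_hat, A):
    """Apply M_W = I - W (W'W)^{-1} W' to the columns of A without forming M_W.

    W_hat : (N, m) regressors of the restricted model (after the fixed-effects
            transformation)
    A     : (N, k) matrix to project off W_hat, e.g. the test functions Z_hat
    """
    # lstsq solves min_b ||W b - A||, so W @ b is the projection P_W A
    b, *_ = np.linalg.lstsq(W_hat, A, rcond=None)
    return A - W_hat @ b  # M_W A

# Idempotence used in the proof: applying M_W to M_W Z leaves it unchanged,
# and M_W Z is orthogonal to the columns of W.
```

Computing $\tilde Z$ this way is numerically more stable than inverting $\hat W'\hat W$ directly when the series terms are nearly collinear.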
Moreover, the eigenvalues of $\hat\Omega$ are bounded below and above. Thus, by Assumption 4,

$$\hat{R}' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat{R} \leq T C \hat{R}' \left( (nT)^{-1} \tilde{Z} \tilde{Z}' \right) \hat{R} \leq C \hat{R}' \hat{R} = O_p(n m_n^{-2\alpha}),$$

where T is absorbed by C because the length of the panel is fixed. Next,

$$\left| \hat{R}' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat\varepsilon \right| \leq \left| T C \lambda_{\max}(\tilde{Z} \tilde{Z}' / (nT)) \hat{R}' \hat\varepsilon \right| \leq \left| C \hat{R}' \hat\varepsilon \right| = O_p(n^{1/2} m_n^{-\alpha}).$$

Thus,

$$(\hat\varepsilon + \hat{R})' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' (\hat\varepsilon + \hat{R}) = \hat\varepsilon' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat\varepsilon + O_p(n m_n^{-2\alpha}) + O_p(n^{1/2} m_n^{-\alpha}).$$

Step 2. Replace $\tilde{Z}$ with $\hat{Z}$ in the leading term. Because $\tilde{Z} = \hat{Z} - P_{\hat{W}} \hat{Z}$,

$$\hat\varepsilon' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' \hat\varepsilon = \hat\varepsilon' \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon - 2 \hat\varepsilon' P_{\hat{W}} \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon + \hat\varepsilon' P_{\hat{W}} \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' P_{\hat{W}} \hat\varepsilon.$$

Let $\chi_i = \hat{Z}_i' \hat\varepsilon_i$ and $\Omega = E[\hat{Z}_i' \hat\varepsilon_i \hat\varepsilon_i' \hat{Z}_i] = E[\chi_i \chi_i']$. Note that

$$E[(n^{-1} \hat\varepsilon' \hat{Z}) \Omega^{-1} (n^{-1} \hat{Z}' \hat\varepsilon)] = E[\chi_i' \Omega^{-1} \chi_i] / n = E[\operatorname{tr}(\Omega^{-1} \chi_i \chi_i')] / n = \operatorname{tr}(\Omega^{-1} E[\chi_i \chi_i']) / n = \operatorname{tr}(I_{r_n}) / n = r_n / n.$$

Thus, by Markov's inequality,

$$\left\| \Omega^{-1/2} (n^{-1} \hat{Z}' \hat\varepsilon) \right\| \leq C \sqrt{(n^{-1} \hat\varepsilon' \hat{Z}) \Omega^{-1} (n^{-1} \hat{Z}' \hat\varepsilon)} = O_p(\sqrt{r_n / n}).$$

Because the eigenvalues of $\Omega$ are bounded below and above w.p.a. 1, it is also true that $\| n^{-1} \hat{Z}' \hat\varepsilon \| = O_p(\sqrt{r_n / n})$. Similarly, $\| n^{-1} \hat{W}' \hat\varepsilon \| = O_p(\sqrt{m_n / n})$.
Using this result and the inequality $\|AB\| \leq \|A\| \|B\|$, we get

$$\left\| \hat\varepsilon' P_{\hat{W}} \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon \right\| = \left\| \hat\varepsilon' \hat{W} (\hat{W}' \hat{W})^{-1} \hat{W}' \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon \right\| \leq C n \left\| \hat\varepsilon' \hat{W} / (nT) \right\| \left\| \hat{W}' \hat{Z} / (nT) \right\| \left\| \hat{Z}' \hat\varepsilon / (nT) \right\|$$
$$= n \, O_p(\sqrt{m_n / n}) \, O_p(\zeta(k_n) \sqrt{k_n / n}) \, O_p(\sqrt{r_n / n}) = O_p(\zeta(k_n) \sqrt{m_n k_n r_n / n}).$$

In turn,

$$\left\| \hat\varepsilon' P_{\hat{W}} \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' P_{\hat{W}} \hat\varepsilon \right\| = \left\| \hat\varepsilon' \hat{W} (\hat{W}' \hat{W})^{-1} \hat{W}' \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat{W} (\hat{W}' \hat{W})^{-1} \hat{W}' \hat\varepsilon \right\| \leq C n \left\| \hat\varepsilon' \hat{W} / (nT) \right\| \left\| \hat{W}' \hat{Z} / (nT) \right\| \left\| \hat{Z}' \hat{W} / (nT) \right\| \left\| \hat{W}' \hat\varepsilon / (nT) \right\|$$
$$= n \, O_p(\sqrt{m_n / n}) \, O_p(\zeta(k_n) \sqrt{k_n / n}) \, O_p(\zeta(k_n) \sqrt{k_n / n}) \, O_p(\sqrt{m_n / n}) = O_p(\zeta(k_n)^2 m_n k_n / n).$$

Combining the bounds,

$$(\hat\varepsilon + \hat{R})' \tilde{Z} ( n \hat\Omega )^{-1} \tilde{Z}' (\hat\varepsilon + \hat{R}) = \hat\varepsilon' \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon + O_p(n m_n^{-2\alpha}) + O_p(n^{1/2} m_n^{-\alpha}) + O_p(\zeta(k_n) \sqrt{m_n k_n r_n / n}) + O_p(\zeta(k_n)^2 m_n k_n / n) = \hat\varepsilon' \hat{Z} ( n \hat\Omega )^{-1} \hat{Z}' \hat\varepsilon + o_p(\sqrt{r_n}). \quad (A.1)$$

Step 3. Deal with the leading term. As shown in Lemma A.6,

$$\frac{n (n^{-1} \hat\varepsilon' \hat{Z}) \hat\Omega^{-1} (n^{-1} \hat{Z}' \hat\varepsilon) - n (n^{-1} \hat\varepsilon' \hat{Z}) \Omega^{-1} (n^{-1} \hat{Z}' \hat\varepsilon)}{\sqrt{r_n}} \stackrel{p}{\to} 0. \quad (A.2)$$

Note that

$$\hat\varepsilon' \hat{Z} ( n \Omega )^{-1} \hat{Z}' \hat\varepsilon = n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_j' \hat\varepsilon_j = n^{-1} \sum_{i=1}^{n} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_i' \hat\varepsilon_i + 2 n^{-1} \sum_{i=1}^{n-1} \sum_{j>i} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_j' \hat\varepsilon_j.$$

Thus,

$$t^* = \frac{\xi^* - r_n}{\sqrt{2 r_n}} = \frac{n^{-1} \sum_{i=1}^{n} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_i' \hat\varepsilon_i - r_n}{\sqrt{2 r_n}} + \frac{\sqrt{2} \sum_{i=1}^{n-1} \sum_{j>i} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_j' \hat\varepsilon_j}{n \sqrt{r_n}} = t_1 + t_2, \quad (A.3)$$

where

$$t_1 = \frac{n^{-1} \sum_{i=1}^{n} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_i' \hat\varepsilon_i - r_n}{\sqrt{2 r_n}}, \qquad t_2 = \frac{\sqrt{2} \sum_{i=1}^{n-1} \sum_{j>i} \hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_j' \hat\varepsilon_j}{n \sqrt{r_n}}.$$

Note that

$$E[t_1] = \frac{E[\hat\varepsilon_i' \hat{Z}_i \Omega^{-1} \hat{Z}_i' \hat\varepsilon_i] - r_n}{\sqrt{2 r_n}} = \frac{E[\chi_i' \Omega^{-1} \chi_i] - r_n}{\sqrt{2 r_n}} = 0,$$

because $E[\chi_i' \Omega^{-1} \chi_i] = r_n$. Next,
V ar ( t ) ≤ E [( χ (cid:48) i Ω − χ i ) ] / (2 nr n ) ≤ CE [ (cid:107) χ i (cid:107) ] / ( nr n ) ≤ CE [ (cid:107) ˆ ε i (cid:107) (cid:107) ˆ Z i (cid:107) ] / ( nr n ) ≤ CE [ (cid:107) ˆ Z i (cid:107) ] / ( nr n ) ≤ Cζ ( r n ) r n / (2 nr n ) = Cζ ( r n ) /n → , so, by Markov’s inequality, t p → . 45ext, note that t = (cid:80) n − i =1 (cid:80) j>i H n ( χ i , χ j ) , where H n ( χ i , χ j ) = (cid:114) n r n χ (cid:48) i Ω − χ j = (cid:114) n r n ˆ ε (cid:48) i ˆ Z i Ω − ˆ Z (cid:48) j ˆ ε j Then G n ( u, v ) = E [ H n ( χ , u ) H n ( χ , v )] = 2 n r n E [ χ (cid:48) Ω − uχ (cid:48) Ω − v ]= 2 n r n E [ u (cid:48) Ω − χ χ (cid:48) Ω − v ] = 2 n r n u (cid:48) Ω − v = (cid:114) n r n H n ( u, v ) Note that E [ H n ( χ , χ ) | χ ] = χ (cid:48) Ω E [ χ ] = 0 and that E [ H n ( χ , χ ) ] = 2 n r n E [ χ (cid:48) Ω − χ χ (cid:48) Ω − χ ] = 2 n r n E [ χ (cid:48) Ω − χ χ (cid:48) Ω − χ ] = 2 n r n E [ χ (cid:48) Ω − χ ]= 2 n r n E [ tr ( χ (cid:48) Ω − χ )] = 2 n r n E [ tr ( χ χ (cid:48) Ω − )] = 2 n Thus, we have: E [ G n ( χ , χ ) ] { E [ H n ( χ , χ ) ] } = (2 /n r n )(2 /n )(4 /n ) = 1 r n → and, using the Cauchy-Schwartz inequality, n − E [ H n ( χ , χ ) ] { E [ H n ( χ , χ ) ] } = (4 /n r n ) E [( χ (cid:48) Ω − χ ) ](4 /n ) ≤ E [( χ (cid:48) Ω − χ ) ( χ (cid:48) Ω − χ ) ] nr n = E [( χ (cid:48) Ω − χ ) ] nr n = (cid:20) E [( χ (cid:48) Ω − χ ) ] r n √ n (cid:21) → Thus, the conditions of Theorem 1 in Hall (1984) hold, and t = (cid:114) n r n n − (cid:88) i =1 (cid:88) j>i ˆ ε (cid:48) i ˆ Z i Ω − ˆ Z (cid:48) j ˆ ε j d → N (0 , (A.4)The result of the theorem now follows from equations A.1, A.2, A.3, and A.4. 
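The application of Hall's (1984) central limit theorem above can be illustrated with a small simulation. The sketch below is purely illustrative and not part of the argument: it takes $\Omega = I_r$ and i.i.d. standard normal $\chi_i$ (my simplifying assumptions, so that $t_2$ can be computed directly), and checks that the simulated mean and variance of the off-diagonal statistic $t_2$ are close to the $N(0,1)$ limit.

```python
import numpy as np

def t2_statistic(chi):
    """t2 = sqrt(2/(n^2 r)) * sum_{i<j} chi_i' Omega^{-1} chi_j, with Omega = I_r."""
    n, r = chi.shape
    s = chi.sum(axis=0)
    # sum_{i<j} chi_i' chi_j = (||sum_i chi_i||^2 - sum_i ||chi_i||^2) / 2
    cross = (s @ s - (chi * chi).sum()) / 2.0
    return np.sqrt(2.0 / (n**2 * r)) * cross

rng = np.random.default_rng(0)
n, r, reps = 200, 5, 2000
draws = np.array([t2_statistic(rng.standard_normal((n, r))) for _ in range(reps)])
# The sample mean and variance should be close to 0 and 1, respectively.
print(draws.mean(), draws.var())
```

Even though each summand $\chi_i'\chi_j$ is a degenerate (conditionally mean-zero) kernel, the normalized double sum is approximately standard normal for moderate $n$, which is exactly what the test statistic's null distribution relies on.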
B.2 Proof of Theorem 2

The wild bootstrap test statistic is given by
\[
\xi^*_{HC} = \left( \sum_{i=1}^{n} \tilde{e}_i^{*\prime} \tilde{Z}_i \right) \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i^* \tilde{e}_i^{*\prime} \tilde{Z}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i^* \right) = \tilde{e}^{*\prime} \tilde{Z} \left( n \hat{\Omega}^* \right)^{-1} \tilde{Z}' \tilde{e}^*,
\]
where $\hat{\Omega}^* = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i^* \tilde{e}_i^{*\prime} \tilde{Z}_i$.

Note that the bootstrap data is generated as
\[
\hat{Y}^* = \hat{W} \tilde{\beta} + \hat{\varepsilon}^*.
\]
The bootstrap residuals are given by $\tilde{e}^* = M_{\hat{W}} \hat{Y}^* = M_{\hat{W}} \hat{\varepsilon}^*$. Thus, the bootstrap test statistic can be rewritten as
\[
\xi^*_{HC} = \hat{\varepsilon}^{*\prime} M_{\hat{W}} \tilde{Z} \left( n \hat{\Omega}^* \right)^{-1} \tilde{Z}' M_{\hat{W}} \hat{\varepsilon}^*.
\]
The rest of the proof is very similar to the proof of Theorem 1, so I only provide a sketch here. First, one can show that
\[
\frac{ \hat{\varepsilon}^{*\prime} M_{\hat{W}} \tilde{Z} \left( n \hat{\Omega}^* \right)^{-1} \tilde{Z}' M_{\hat{W}} \hat{\varepsilon}^* - \hat{\varepsilon}^{*\prime} \hat{Z} (n \Omega^*)^{-1} \hat{Z}' \hat{\varepsilon}^* }{ \sqrt{r_n} } \overset{p}{\to} 0,
\]
where $\Omega^* = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \tilde{e}_i \tilde{e}_i' \hat{Z}_i$. Next, one can deal with the leading term $\hat{\varepsilon}^{*\prime} \hat{Z} (n \Omega^*)^{-1} \hat{Z}' \hat{\varepsilon}^*$ as in Step 3 of the proof of Theorem 1, but now conditional on the data $Z_{n,T}$. The result of the theorem follows. $\blacksquare$

B.3 Proof of Theorem 3
Recall that $\hat{\Omega} = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{\Sigma}_T \tilde{Z}_i$ in the homoskedastic case and $\hat{\Omega} = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \tilde{e}_i' \tilde{Z}_i$ in the heteroskedastic case. Next, note that
\[
\sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i = \tilde{Z}' \tilde{e},
\]
where $\tilde{Z} = (\tilde{Z}_1', ..., \tilde{Z}_n')'$ is $nT \times r_n$ and $\tilde{e} = (\tilde{e}_1', ..., \tilde{e}_n')'$ is $nT \times 1$.

Then under homoskedasticity
\[
\xi = \left( \sum_{i=1}^{n} \tilde{e}_i' \tilde{Z}_i \right) \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{\Sigma}_T \tilde{Z}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \right) = n \left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right),
\]
while under heteroskedasticity
\[
\xi_{HC} = \left( \sum_{i=1}^{n} \tilde{e}_i' \tilde{Z}_i \right) \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \tilde{e}_i' \tilde{Z}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \right) = n \left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right).
\]
Also note that
\[
\frac{\sqrt{r_n}}{n} \cdot \frac{ n \left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right) - r_n }{ \sqrt{2 r_n} } = \frac{1}{\sqrt{2}} \left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right) + T_1,
\]
where $T_1 = - r_n / (n \sqrt{2}) \to 0$. Hence, it suffices to show that $\left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right) \overset{p}{\to} \Delta$.

Next, note that due to the projection nature of the series estimators,
\[
\tilde{e} = M_{\hat{W}} \hat{Y} = M_{\hat{W}} \hat{\varepsilon}^* + M_{\hat{W}} \hat{R}^*.
\]
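As an aside, the projection property invoked here is easy to verify numerically. The sketch below is illustrative only: a generic disturbance vector u stands in for the non-$\hat{W}$ component of $\hat{Y}$, and the least squares residuals of Y on W are compared with the annihilator $M_W$ applied to u directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 5
W = rng.standard_normal((n, m))
beta = rng.standard_normal(m)
u = rng.standard_normal(n)  # plays the role of the non-W component of Y
Y = W @ beta + u

# Residuals from the least squares fit of Y on W ...
e = Y - W @ np.linalg.lstsq(W, Y, rcond=None)[0]
# ... equal the annihilator M_W applied to u alone, because M_W W = 0.
M = np.eye(n) - W @ np.linalg.solve(W.T @ W, W.T)
print(np.abs(e - M @ u).max())  # should be numerically zero
```

This is the mechanical fact that lets the residual vector be decomposed exactly into the projected error and projected approximation-bias components.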
Hence,
\[
\left( n^{-1} \tilde{e}' \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' \tilde{e} \right)
= \left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \tilde{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \tilde{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right)
= \left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right).
\]
Thus, it suffices to show that
\[
\left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right) \overset{p}{\to} \Delta.
\]
Next,
\[
\begin{aligned}
\left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right)
&= \left( n^{-1} \hat{\varepsilon}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{\varepsilon}^* \right) \\
&\quad + 2 \left( n^{-1} \hat{R}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{\varepsilon}^* \right)
+ \left( n^{-1} \hat{R}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{R}^* \right).
\end{aligned}
\]
Similarly to the proof of Theorem 1, but using the fact that $\sup_{x \in \mathcal{X}} |R^*(x)| = o(1)$ instead of $\sup_{x \in \mathcal{X}} |R(x)| = O(m_n^{-\alpha})$,
\[
\left( n^{-1} \hat{R}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{\varepsilon}^* \right) \leq C \hat{R}^{*\prime} \hat{\varepsilon}^* / (n \tilde{\sigma}^2) = O_p(n^{-1/2}) o_p(1) = o_p(1),
\]
\[
\left( n^{-1} \hat{R}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{R}^* \right) \leq C \hat{R}^{*\prime} \hat{R}^* / (n \tilde{\sigma}^2) = o_p(1).
\]
Thus,
\[
\left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right) = \left( n^{-1} \hat{\varepsilon}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{\varepsilon}^* \right) + o_p(1).
\]
Next, given that, as shown in the proof of Theorem 1, $M_{\hat{W}} \hat{Z} = \hat{Z} + o_p(1)$ and the eigenvalues of $\hat{\Omega}$ are bounded above and below w.p.a. 1,
\[
\left( n^{-1} \hat{\varepsilon}^{*\prime} M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} \hat{\varepsilon}^* \right) = \left( n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* \right) + o_p(1).
\]
Next,
\[
\begin{aligned}
\left| (n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) (\hat{\Omega}^{-1} - \Omega^{*-1}) (n^{-1} \hat{Z}' \hat{\varepsilon}^*) \right|
&\leq \left| (n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) \left( \Omega^{*-1} (\hat{\Omega} - \Omega^*) \hat{\Omega}^{-1} (\hat{\Omega} - \Omega^*) \Omega^{*-1} \right) (n^{-1} \hat{Z}' \hat{\varepsilon}^*) \right| \\
&\quad + \left| (n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) \left( \Omega^{*-1} (\hat{\Omega} - \Omega^*) \Omega^{*-1} \right) (n^{-1} \hat{Z}' \hat{\varepsilon}^*) \right| \\
&\leq \left\| \Omega^{*-1} n^{-1} \hat{Z}' \hat{\varepsilon}^* \right\|^2 \left( \|\hat{\Omega} - \Omega^*\| + C \|\hat{\Omega} - \Omega^*\|^2 \right) = o_p(1).
\end{aligned}
\]
Thus, $(n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) \hat{\Omega}^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}^*) = (n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) \Omega^{*-1} (n^{-1} \hat{Z}' \hat{\varepsilon}^*) + o_p(1)$.

To complete the proof, note that $Var(\hat{Z}_i' \hat{\varepsilon}_i^*) \leq \Omega^*$, because $\Omega^* = E[\hat{Z}_i' \hat{\varepsilon}_i^* \hat{\varepsilon}_i^{*\prime} \hat{Z}_i]$. Then
\[
\begin{aligned}
E&\left[ \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right)' \Omega^{*-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) \right] \\
&\leq E\left[ \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right)' Var(\hat{Z}_i' \hat{\varepsilon}_i^*)^{-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) \right] \\
&= E\left[ tr\left( Var(\hat{Z}_i' \hat{\varepsilon}_i^*)^{-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right)' \right) \right] = tr(I_{r_n})/n = r_n/n \to 0.
\end{aligned}
\]
Thus,
\[
\begin{aligned}
&\left| (n^{-1} \hat{\varepsilon}^{*\prime} \hat{Z}) \Omega^{*-1} (n^{-1} \hat{Z}' \hat{\varepsilon}^*) - E[\hat{\varepsilon}_i^{*\prime} \hat{Z}_i] \Omega^{*-1} E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right| \\
&\leq \left| \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right)' \Omega^{*-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) \right| + 2 \left| E[\hat{\varepsilon}_i^{*\prime} \hat{Z}_i] \Omega^{*-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) \right| \\
&\leq o_p(1) + 2 \sqrt{ E[\hat{\varepsilon}_i^{*\prime} \hat{Z}_i] \Omega^{*-1} E[\hat{Z}_i' \hat{\varepsilon}_i^*] } \sqrt{ \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right)' \Omega^{*-1} \left( n^{-1} \hat{Z}' \hat{\varepsilon}^* - E[\hat{Z}_i' \hat{\varepsilon}_i^*] \right) }
= o_p(1) + 2 \sqrt{\Delta}\, o_p(1) = o_p(1).
\end{aligned}
\]
Combining the results above,
\[
\left( n^{-1} (\hat{\varepsilon}^* + \hat{R}^*)' M_{\hat{W}} \hat{Z} \right) \hat{\Omega}^{-1} \left( n^{-1} \hat{Z}' M_{\hat{W}} (\hat{\varepsilon}^* + \hat{R}^*) \right) \overset{p}{\to} \Delta. \qquad \blacksquare
\]

B.4 Auxiliary Lemmas

Lemma A.1 (Donald et al. (2003), Lemma A.2). If Assumption 3 is satisfied, then it can be assumed without loss of generality that $\hat{P}^k(x) = \bar{P}^k(x)$ and that $E[\bar{P}^k(X_{it}) \bar{P}^k(X_{it})'] = I_k$.

Remark A.1.
This normalization is common in the literature on series estimation when all elements of $P^k(X_i)$ are used to estimate the model. In my setting, $P^k(X_i)$ is partitioned into $W^m(X_i)$, used in estimation, and $T^r(X_i)$, used in testing. The normalization implies that $W_i$ and $T_i$ are orthogonal to each other. This can be justified as follows. Suppose that $W_i$ and $T_i$ are not orthogonal. Then one can take all elements of $(W_i', T_i')'$ and apply the Gram-Schmidt process to them. Because the orthogonalization process is sequential, it will yield a vector $(\bar{W}_i', \bar{T}_i')'$ such that $\bar{W}_i$ spans the same space as $W_i$, $\bar{T}_i$ spans the same space as $T_i$, and $\bar{W}_i$ and $\bar{T}_i$ are orthogonal. Thus, the normalization is indeed without loss of generality.

Lemma A.2 (Baltagi and Li (2002), Theorem 2.2). Let $f(x) = f(x, \theta_0, h_0)$, $f_i = f(X_i)$, $\tilde{f}(x) = f(x, \tilde{\theta}, \tilde{h}) = W^{m_n}(x)' \tilde{\beta}$, and $\tilde{f}_i = \tilde{f}(X_i) = W_i' \tilde{\beta}$. Under Assumptions 1, 3, and 4, the following is true:
\[
n^{-1} \sum_{i=1}^{n} ( \tilde{f}_i - f_i )^2 = O_p( m_n/n + m_n^{-2\alpha} ).
\]

Lemma A.3. Suppose that Assumption 5 is satisfied and $E[\hat{\varepsilon}_i \hat{\varepsilon}_i']$ is finite. If $E[\hat{\varepsilon}_i \,|\, \hat{X}_i] = 0$, then $E[\hat{P}_i' \hat{\varepsilon}_i] = 0$ for all $k$. Furthermore, if $E[\hat{\varepsilon}_i \,|\, \hat{X}_i] \neq 0$, then $E[\hat{P}_i' \hat{\varepsilon}_i] \neq 0$ for all $k$ large enough.

Proof. The proof is similar to the proof of Lemma 2.1 in Donald et al. (2003) and is thus omitted. $\blacksquare$
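The sequential orthogonalization argument in Remark A.1 can also be illustrated numerically. The sketch below is illustrative only (the dimensions are arbitrary, and a QR decomposition stands in for the Gram-Schmidt process, which it implements up to signs): orthonormalizing the stacked regressors [W, T] leaves the first block spanning the column space of W while making the two blocks exactly orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 500, 4, 3
W = rng.standard_normal((n, m))
T = rng.standard_normal((n, r))

# QR on the stacked matrix [W, T] orthonormalizes the columns sequentially,
# so the first m columns of Q depend only on W (W = Q[:, :m] @ R[:m, :m]).
Q, _ = np.linalg.qr(np.hstack([W, T]))
W_bar, T_bar = Q[:, :m], Q[:, m:]

# The two blocks are orthogonal to each other ...
print(np.abs(W_bar.T @ T_bar).max())  # should be numerically zero
# ... and W_bar spans the same space as W: projecting W on W_bar recovers W.
print(np.abs(W_bar @ (W_bar.T @ W) - W).max())  # should be numerically zero
```

Because the first block's span is unchanged, the restricted (null-model) fit is unaffected by the normalization, which is why it is without loss of generality.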
Lemma A.4.
If the Assumptions of Theorem 1 hold, then $\| \hat{P}' \hat{P}/(nT) - I_{k_n} \| = O_p(\zeta(k_n) \sqrt{k_n/n})$, $\| \hat{W}' \hat{W}/(nT) - I_{m_n} \| = O_p(\zeta(m_n) \sqrt{m_n/n})$, and $\| \hat{Z}' \hat{Z}/(nT) - I_{r_n} \| = O_p(\zeta(r_n) \sqrt{r_n/n})$. Moreover, $\| \hat{W}' \hat{Z}/(nT) \| = O_p(\zeta(k_n) \sqrt{k_n/n})$.

Proof. By Lemma A.1 in Baltagi and Li (2002), $\| \hat{P}' \hat{P}/(nT) - I_{k_n} \| = O_p(\zeta(k_n) \sqrt{k_n/n})$. Similarly, it can be shown that $\| \hat{W}' \hat{W}/(nT) - I_{m_n} \| = O_p(\zeta(m_n) \sqrt{m_n/n})$ and $\| \hat{Z}' \hat{Z}/(nT) - I_{r_n} \| = O_p(\zeta(r_n) \sqrt{r_n/n})$. Moreover, note that
\[
\hat{P}' \hat{P}/(nT) - I_{k_n} = \begin{pmatrix} \hat{W}' \hat{W}/(nT) & \hat{W}' \hat{Z}/(nT) \\ \hat{Z}' \hat{W}/(nT) & \hat{Z}' \hat{Z}/(nT) \end{pmatrix} - \begin{pmatrix} I_{m_n} & 0_{m_n \times r_n} \\ 0_{r_n \times m_n} & I_{r_n} \end{pmatrix}.
\]
Hence, $\| \hat{W}' \hat{Z}/(nT) \| = O_p(\zeta(k_n) \sqrt{k_n/n})$. $\blacksquare$

Lemma A.5. In the homoskedastic case, let $\tilde{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \tilde{\Sigma}_T \hat{Z}_i$ and $\breve{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \Sigma_T \hat{Z}_i$. In the heteroskedastic case, let $\tilde{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \tilde{e}_i \tilde{e}_i' \hat{Z}_i$, $\breve{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \hat{\varepsilon}_i \hat{\varepsilon}_i' \hat{Z}_i$, and $\bar{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \Sigma_i \hat{Z}_i$, where $\Sigma_i = E[\hat{\varepsilon}_i \hat{\varepsilon}_i' \,|\, X_i]$. Let $\Omega = E[\hat{Z}_i' \Sigma_i \hat{Z}_i]$.

Suppose that Assumptions 2(ii), 3, and 4 are satisfied. Then
\[
\| \tilde{\Omega} - \breve{\Omega} \| = O_p\!\left( \zeta(r_n)^2 ( m_n/n + m_n^{-2\alpha} ) \right), \qquad
\| \breve{\Omega} - \bar{\Omega} \| = O_p( \zeta(r_n) r_n^{1/2}/n^{1/2} ), \qquad
\| \bar{\Omega} - \Omega \| = O_p( \zeta(r_n) r_n^{1/2}/n^{1/2} ).
\]
If Assumption 2(i) is also satisfied, then $1/C \leq \lambda_{min}(\Omega) \leq \lambda_{max}(\Omega) \leq C$, and if $\zeta(r_n)^2 ( m_n/n + m_n^{-2\alpha} ) \to 0$ and $\zeta(r_n) r_n^{1/2}/n^{1/2} \to 0$, then w.p.a. 1, $1/C \leq \lambda_{min}(\tilde{\Omega}) \leq \lambda_{max}(\tilde{\Omega}) \leq C$ and $1/C \leq \lambda_{min}(\bar{\Omega}) \leq \lambda_{max}(\bar{\Omega}) \leq C$. Moreover, if $\| \hat{\Omega} - \tilde{\Omega} \| = o_p(1)$, then w.p.a. 1, $1/C \leq \lambda_{min}(\hat{\Omega}) \leq \lambda_{max}(\hat{\Omega}) \leq C$.

Sketch of the Proof of Lemma A.5. Note that in the homoskedastic case, all matrices involved in the lemma can be written as
\[
\hat{\Omega} = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{\Sigma}_T \tilde{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} \tilde{\sigma}_{ts} \, n^{-1} \sum_{i=1}^{n} \tilde{Z}_{it} \tilde{Z}_{is}', \qquad
\tilde{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \tilde{\Sigma}_T \hat{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} \tilde{\sigma}_{ts} \, n^{-1} \sum_{i=1}^{n} \hat{Z}_{it} \hat{Z}_{is}',
\]
\[
\breve{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \Sigma_T \hat{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} \sigma_{ts} \, n^{-1} \sum_{i=1}^{n} \hat{Z}_{it} \hat{Z}_{is}'.
\]
In the heteroskedastic case, all matrices involved in the lemma can be written as
\[
\hat{\Omega} = n^{-1} \sum_{i=1}^{n} \tilde{Z}_i' \tilde{e}_i \tilde{e}_i' \tilde{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} n^{-1} \sum_{i=1}^{n} \tilde{e}_{it} \tilde{e}_{is} \tilde{Z}_{it} \tilde{Z}_{is}', \qquad
\tilde{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \tilde{e}_i \tilde{e}_i' \hat{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} n^{-1} \sum_{i=1}^{n} \tilde{e}_{it} \tilde{e}_{is} \hat{Z}_{it} \hat{Z}_{is}',
\]
\[
\breve{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \hat{\varepsilon}_i \hat{\varepsilon}_i' \hat{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} n^{-1} \sum_{i=1}^{n} \hat{\varepsilon}_{it} \hat{\varepsilon}_{is} \hat{Z}_{it} \hat{Z}_{is}', \qquad
\bar{\Omega} = n^{-1} \sum_{i=1}^{n} \hat{Z}_i' \Sigma_i \hat{Z}_i = \sum_{t=1}^{T} \sum_{s=1}^{T} n^{-1} \sum_{i=1}^{n} \sigma_{i,ts} \hat{Z}_{it} \hat{Z}_{is}'.
\]
Because $T$ is finite, Lemma A.5 can be proved by applying the results from Lemmas A.5 and A.6 in Korolev (2019) element by element. $\blacksquare$

Lemma A.6.
If the Assumptions of Theorem 1 hold, then
\[
\frac{ n (n^{-1} \hat{\varepsilon}' \hat{Z}) \hat{\Omega}^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}) - n (n^{-1} \hat{\varepsilon}' \hat{Z}) \Omega^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}) }{ \sqrt{r_n} } \overset{p}{\to} 0.
\]

Proof. Note that $\hat{\varepsilon}' \hat{Z} (n \hat{\Omega})^{-1} \hat{Z}' \hat{\varepsilon} = n (n^{-1} \hat{\varepsilon}' \hat{Z}) \hat{\Omega}^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon})$. Then
\[
\left| \frac{ n (n^{-1} \hat{\varepsilon}' \hat{Z}) \hat{\Omega}^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}) }{ \sqrt{r_n} } - \frac{ n (n^{-1} \hat{\varepsilon}' \hat{Z}) \Omega^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}) }{ \sqrt{r_n} } \right|
= \left| \frac{ n (n^{-1} \hat{\varepsilon}' \hat{Z}) (\hat{\Omega}^{-1} - \Omega^{-1}) (n^{-1} \hat{Z}' \hat{\varepsilon}) }{ \sqrt{r_n} } \right|
\leq \frac{ n \| \Omega^{-1} n^{-1} \hat{Z}' \hat{\varepsilon} \|^2 \left( \| \hat{\Omega} - \Omega \| + C \| \hat{\Omega} - \Omega \|^2 \right) }{ \sqrt{r_n} }.
\]
As shown above, $\| \Omega^{-1} (n^{-1} \hat{Z}' \hat{\varepsilon}) \| = O_p(\sqrt{r_n/n})$. Then
\[
\frac{ n \| \Omega^{-1} n^{-1} \hat{Z}' \hat{\varepsilon} \|^2 \left( \| \hat{\Omega} - \Omega \| + C \| \hat{\Omega} - \Omega \|^2 \right) }{ \sqrt{r_n} } = \frac{ n \, O_p(r_n/n) \, o_p(1/\sqrt{r_n}) }{ \sqrt{r_n} } = \frac{ o_p(\sqrt{r_n}) }{ \sqrt{r_n} } = o_p(1),
\]
provided that $\| \hat{\Omega} - \Omega \| = o_p(1/\sqrt{r_n})$, which holds under the rate conditions 3.1–3.5 and the high level assumption that $\| \hat{\Omega} - \tilde{\Omega} \| = o_p(r_n^{-1/2})$. $\blacksquare$

References

Ai, C. and Q. Li (2008): “Semi-parametric and Non-parametric Methods in Panel Data Models,” in
The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice, ed. by L. Mátyás and P. Sevestre, Berlin, Heidelberg: Springer Berlin Heidelberg, 451–478.
An, Y., C. Hsiao, and D. Li (2016): “Semiparametric Estimation of Partially Linear Varying Coefficient Panel Data Models,” in
Essays in Honor of Aman Ullah (Advances in Econometrics), ed. by G. Gonzalez-Rivera, R. C. Hill, and T.-H. Lee, Emerald Group Publishing Limited, vol. 36, chap. 2, 47–65.
Arellano, M. (2003):
Panel Data Econometrics, Oxford University Press.
Baltagi, B. (2013):
Econometric Analysis of Panel Data, John Wiley & Sons, 5 ed.
Baltagi, B. H. and S. Khanti-Akom (1990): “On efficient estimation with panel data: An empirical comparison of instrumental variables estimators,”
Journal of Applied Econometrics, 5, 401–406.

Baltagi, B. H. and D. Li (2002): “Series Estimation of Partially Linear Panel Data Models with Fixed Effects,”
Annals of Economics and Finance, 3, 103–116.
Cornwell, C. and P. Rupert (1988): “Efficient estimation with panel data: An empirical comparison of instrumental variables estimators,”
Journal of Applied Econometrics, 3, 149–155.
Davidson, R. and E. Flachaire (2008): “The wild bootstrap, tamed at last,”
Journal of Econometrics, 146, 162–169.
Donald, S. G., G. W. Imbens, and W. K. Newey (2003): “Empirical likelihood estimation and consistent tests with conditional moment restrictions,”
Journal of Econometrics, 117, 55–93.
Guay, A. and E. Guerre (2006): “A Data-Driven Nonparametric Specification Test for Dynamic Regression Models,”
Econometric Theory, 22, 543–586.
Hall, P. (1984): “Central limit theorem for integrated square error of multivariate nonparametric density estimators,”
Journal of Multivariate Analysis, 14, 1–16.
Henderson, D. J., R. J. Carroll, and Q. Li (2008): “Nonparametric estimation and testing of fixed effects panel data models,”
Journal of Econometrics, 144, 257–275.

Hong, Y. and H. White (1995): “Consistent Specification Testing Via Nonparametric Series Regression,”
Econometrica, 63, 1133–1159.
Hsiao, C. (2003):
Analysis of Panel Data, Econometric Society Monographs, Cambridge University Press, 2 ed.
Korolev, I. (2019): “A Consistent LM Type Specification Test for Semiparametric Models,”
ArXiv e-prints.

Li, Q. and S. Wang (1998): “A simple consistent bootstrap test for a parametric regression function,”
Journal of Econometrics, 87, 145–165.
Lin, Z., Q. Li, and Y. Sun (2014): “A consistent nonparametric test of parametric regression functional form in fixed effects panel data models,”
Journal of Econometrics, 178, 167–179.
Mammen, E. (1993): “Bootstrap and Wild Bootstrap for High Dimensional Linear Models,”
The Annals of Statistics, 21, 255–285.
Parmeter, C. F. and J. S. Racine (2018): “Nonparametric Estimation and Inference for Panel Data Models,” Department of Economics Working Papers 2018-02, McMaster University.
Rodriguez-Poo, J. M. and A. Soberon (2017): “Nonparametric and semiparametric panel data models: Recent developments,”
Journal of Economic Surveys, 31, 923–960.
Su, L. and A. Ullah (2011): “Nonparametric and semiparametric panel econometric models: estimation and testing,” in