Heteroscedastic stratified two-way EC models of single equations and SUR systems

Abstract

A relevant issue in panel data estimation is heteroscedasticity, which often occurs when the sample is large and individual units are of varying size. Furthermore, many of the available panel data sets are unbalanced in nature, because of attrition or accretion, and micro-econometric models applied to panel data are frequently multi-equation models. This paper considers the generalized least squares estimation of the heteroscedastic stratified two-way error component (EC) models of both single equations and seemingly unrelated regressions (SUR) systems (with cross-equation restrictions) on unbalanced panel data. The derived heteroscedastic estimators of both single equations and SUR systems improve the estimation efficiency.


arXiv: [...] [stat.ME], Aug [...]

Silvia Platoni, Laura Barbieri, Daniele Moro, and Paolo Sckokai

Dipartimento di Scienze economiche e sociali and Dipartimento di Economia agro-alimentare, Università Cattolica del Sacro Cuore, Piacenza, Italy

KEYWORDS. Unbalanced panel, EC model, SUR, heteroscedasticity.

JEL CLASSIFICATION. C13, C23, C33.

1. INTRODUCTION

In applied econometrics, there is an increasing use of panel data, which Baltagi (2013, page 1) defines as ‘the pooling of observations on a cross-section of households, countries, firms, etc. over several time periods’. The reason for this increasing use is that panel data sets are more informative, since they often provide richer and more disaggregated information. Furthermore, they allow modeling individual heterogeneity and addressing aggregation issues. Finally, since they span several time periods, they also allow describing the dynamics of the phenomena under study.

The error component (EC) model is the standard approach to the estimation of individual and time effects in econometric single-equation models based on panel data (see Baltagi, 2013, for a review of the methods). Many of the available data sets are unbalanced in nature, that is, not all the individuals are observed over the whole time period. Several different reasons, such as attrition or accretion, may produce an incomplete panel data set. Therefore, standard single-equation EC models have been extended to the econometric treatment of unbalanced panel data: Biørn (1981) and Baltagi (1985) dealt with the one-way EC model, and Wansbeek and Kapteyn (1989) and Davis (2002) extended such estimation method to the two-way and multi-way cases.

Footnote: Address correspondence to Silvia Platoni, Dipartimento di Scienze economiche e sociali, Università Cattolica del Sacro Cuore, via Emilia Parmense 84, 29122 Piacenza, Italy; [email protected]; tel. +390523599337; fax +390523599303.

Although often disregarded in empirical applications, a relevant issue in panel data estimation is heteroscedasticity, which often occurs when the sample is large and observations differ in “size characteristic” (i.e., the level of the variables). From this perspective, heteroscedasticity arises from the fact that the degree to which a relationship may explain actual observations is likely to depend on individual-specific characteristics.
On the other hand, the error variance may also vary systematically across observations of similar size and, in practice, the two different sources of heteroscedasticity may be simultaneously present (see Lejeune, 1996, 2004). This means that heteroscedasticity is the rule rather than the exception when dealing with individual data concerning households or firms. Assuming homoscedastic disturbances when heteroscedasticity is present will still result in consistent estimates of the regression coefficients, but these estimates will not be efficient. Also, the standard errors of the fixed-effect (FE) estimates will be biased, and robust standard errors should be computed in order to correct for the possible presence of heteroscedasticity.

Several authors have analyzed the problem of heteroscedasticity in balanced panel data, usually considering a single-equation regression model with one-way disturbances $\varepsilon_{it} = \mu_i + u_{it}$. Baltagi and Griffin (1988) are concerned with the estimation of a random-effect (RE) model allowing for heteroscedasticity on the individual-specific error term, $\mathrm{var}(\mu_i) = \varphi_i$. In contrast, Rao et al. (1981), Magnus (1982), Baltagi (1988), and Wansbeek (1989) adopt a symmetrically opposite specification allowing for heteroscedasticity on the remainder error term, $\mathrm{var}(u_{it}) = \psi_i$.

As Mazodier and Trognon (1978) pointed out, if the $\varphi_i$'s are unknown, then there is no hope of estimating them from the data: even if the $\mu_i$'s were observed, it would be impossible to estimate their variances from only one observation on each individual disturbance. Therefore, the model proposed by Baltagi and Griffin (1988) suffers from the incidental parameters problem (see Phillips, 2003; Baltagi, 2013).
Furthermore, the models allowing for heteroscedasticity on the remainder error term $u_{it}$ also suffer from the incidental parameters problem when the time dimension of the panel is short.

There are two possible solutions to avoid the incidental parameters problem (see Baltagi, 2013): either to allow the variances to change across strata (i.e., stratified EC models) or, if the variables that determine heteroscedasticity are known, to specify parametric variance functions (i.e., adaptive estimation of heteroscedasticity of unknown form).

Footnote: While all these papers assume constant slope coefficients, Bresson et al. (2006, 2011) allow variations in parameters across cross-sectional units in order to take into account the between-individual heterogeneity. Hence, these authors derive a hierarchical Bayesian panel data estimator for a random coefficient model (RCM), where heteroscedasticity is modeled following both the RCMs on panel data proposed by Hsiao and Pesaran (2004) and Chib (2008) and the general heteroscedastic one-way EC model proposed by Randolph (1988), who assumes that both the individual-specific term $\mu_i$ and the remainder error term $u_{it}$ are heteroscedastic.

Footnote: Neyman and Scott (1948) study maximum likelihood (ML) estimation of models having both structural and incidental parameters: while the structural parameters can be consistently estimated, the incidental parameters cannot. These authors show that ML estimation is inconsistent (or partially inconsistent) if the model contains nuisance or incidental parameters which increase in number with the sample size.

Mazodier and Trognon (1978) consider a stratified two-way EC model, i.e., $\varepsilon_{it} = \mu_i + \nu_t + u_{it}$, on balanced panels in which both the individual-specific effect $\mu_i$ and time-specific effect $\nu_t$ variances are constant within subsets of observations (or strata), but are allowed to change across strata. More recently, Phillips (2003) considers a stratified one-way EC model, again on balanced panels, where the variances of the individual-specific effect $\mu_i$ are allowed to change not across individuals but across strata, and provides an expectation-maximization (EM) algorithm to estimate the model's parameters.

Li and Stengos (1994) derive an adaptive estimator for the heteroscedastic one-way EC model using balanced panel data where heteroscedasticity is placed on the remainder error term, and hence $\mathrm{var}(u_{it} \mid x_{it}) = \psi(x_{it}) \equiv \psi_{it}$. Later, Roy (2002) derives a similar adaptive estimator where heteroscedasticity is placed on the individual-specific term rather than the remainder disturbance, and hence $\mathrm{var}(\mu_i \mid \bar{x}_{i\cdot}) = \varphi(\bar{x}_{i\cdot}) \equiv \varphi_i$. Baltagi et al. (2005) check the sensitivity of these two adaptive heteroscedastic estimators to misspecification of the form of heteroscedasticity, showing that misleading inference may occur when heteroscedasticity is present in both components.
Therefore, accounting for both sources of heteroscedasticity seems to be very important in empirical work.

Indeed, if heteroscedasticity is due to differences in size characteristic across statistical units (i.e., individuals, households, firms or countries), then both error components are expected to be heteroscedastic, and it may be difficult to argue that only one component of the error term is heteroscedastic but not the other (see Bresson et al., 2006, 2011). To this end, Randolph (1988), working on unbalanced panel data, allows for a more general heteroscedastic single-equation one-way EC model, assuming that both the individual-specific and remainder error terms are heteroscedastic, i.e., $\mathrm{var}(\mu_i) = \varphi_i$ and $E(uu^T) = \mathrm{diag}(\psi_{it})$. Lejeune (1996, 2004) is concerned with the estimation and specification testing of a full heteroscedastic one-way EC model, in the spirit of Randolph (1988) and Baltagi et al. (2005), and specifies parametrically the variance functions. Baltagi et al. (2006), in the spirit of Randolph (1988) and Lejeune (1996, 2004), derive a joint Lagrange multiplier (LM) test for homoscedasticity against the alternative of heteroscedasticity both in the individual-specific term $\mu_i$ and in the remainder error term $u_{it}$.

Micro-econometric models applied to panel data are often multi-equation models. Primal and dual production models are a common case, when systems of input demands and/or output supply equations have to be estimated; the same is true for systems of demand equations in consumer analysis. Baltagi (1980) and Magnus (1982) extended the estimation procedure of the single-equation model to the case of seemingly unrelated regressions (SURs) for balanced panels; Biørn (2004) proposed a parsimonious technique to estimate one-way SUR systems on unbalanced panel data; Platoni et al. (2012) extended the procedure suggested by Biørn (2004) to the two-way case. Although heteroscedasticity is a frequent and relevant issue also in the multi-equation models applied to (unbalanced) panel data, to our knowledge very few papers concerning heteroscedastic SUR systems have been published. A relevant exception is Verbon (1980), who derived an LM test for heteroscedasticity in a model of SUR equations for [...].

Footnote: Throughout the paper, all vectors and matrices are in non-italics.

This paper considers the GLS estimation of the heteroscedastic stratified two-way EC model, i.e., $\varepsilon_{it} = \mu_i + \nu_t + u_{it}$, on unbalanced panel data in the case of both single equations and SUR systems (with cross-equation restrictions). The variances and covariances of the individual-specific effect $\mu_i$ and of the remainder error term $u_{it}$ are constant within strata, but they are allowed to change across strata. Indeed, the variance and covariance estimations in two-way SUR systems are implemented starting from the extension of the two-way single-equation EC model from the homoscedastic to the heteroscedastic stratified case. Moreover, the estimation is implemented by two methods: the quadratic unbiased estimation (QUE) procedure suggested by Wansbeek and Kapteyn (1989) and the within-between (WB) procedure proposed by Biørn (2004).

The remainder of the paper proceeds as follows. Section 2 describes the heteroscedastic two-way estimation for single equations, and Section 3 extends the analysis to the corresponding estimation for SUR systems. Finally, Section 4 draws some conclusions.

2. HETEROSCEDASTIC SINGLE-EQUATION TWO-WAY EC MODEL

We start by considering an unbalanced panel characterized by a total of $n$ observations, with $N$ individuals (indexed $i = 1, \dots, N$) observed over $T$ periods (indexed $t = 1, \dots, T$). Let $T_i$ denote the number of times the individual $i$ is observed and $N_t$ the number of individuals observed in period $t$. Hence, $\sum_i T_i = \sum_t N_t = n$.

Footnote: The estimation procedures proposed here can also be applied to balanced panel data.

In the following we consider the regression model:

$$y_{it} = x_{it}^T \beta + \mu_i + \nu_t + u_{it} = x_{it}^T \beta + \varepsilon_{it}, \tag{1}$$

where $x_{it}$ is a $k \times 1$ vector of explanatory variables, $\beta$ a $k \times 1$ vector of parameters, $\mu_i$ is the individual-specific effect, $\nu_t$ the time-specific effect, and $u_{it}$ the remainder error term; in the RE model $\varepsilon_{it}$ is the composite error term.

Using the $n \times N$ matrix $\Delta_\mu$ and the $n \times T$ matrix $\Delta_\nu$, which are matrices of indicator variables denoting observations on individuals and time periods respectively, we can define the $N \times N$ diagonal matrix $\Delta_N \equiv \Delta_\mu^T \Delta_\mu$ (diagonal elements correspond to the $T_i$'s) and the $T \times T$ diagonal matrix $\Delta_T \equiv \Delta_\nu^T \Delta_\nu$ (diagonal elements correspond to the $N_t$'s), as well as the $T \times N$ matrix of zeros and ones $\Delta_{TN} \equiv \Delta_\nu^T \Delta_\mu$, indicating the absence or presence of an individual in a certain time period. Hence, using matrix notation, we can write:

$$y = X\beta + \Delta_\mu \mu + \Delta_\nu \nu + u = X\beta + \varepsilon, \tag{2}$$

where $X$ is an $n \times k$ matrix of explanatory variables.

Let us assume there exists a meaningful stratification of observations. Hence, the unbalanced panel can also be characterized by $A$ strata (indexed $a = 1, \dots, A$), with $N_a$ the number of individuals belonging to stratum $a$. Moreover, the number of observations related to stratum $a$ is $n_a = \sum_{i \in I_a} T_i$, with $I_a$ the set of individuals belonging to stratum $a$.

Footnote: In empirical work the number of strata is unidentified. Therefore, it is necessary to use a selection procedure, such as the Akaike (1974) information criterion, to determine the number of strata.
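As an illustration of the notation above, the following minimal Python sketch builds the indicator matrices for a small unbalanced panel and checks the counting identities. It is not taken from the paper: the observation pattern, the stratum assignment, and all variable names (`obs`, `D_mu`, `D_nu`, `stratum`, and so on) are assumptions made purely for the example.

```python
import numpy as np

# Hypothetical unbalanced panel: the (individual i, period t) pairs observed.
obs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 1), (2, 2)]
ind = np.array([i for i, _ in obs])
per = np.array([t for _, t in obs])
n, N, T = len(obs), ind.max() + 1, per.max() + 1

# Indicator matrices: one row per observation.
D_mu = np.eye(N, dtype=int)[ind]   # n x N, observations on individuals
D_nu = np.eye(T, dtype=int)[per]   # n x T, observations on time periods

D_N = D_mu.T @ D_mu    # N x N diagonal, T_i on the diagonal
D_T = D_nu.T @ D_nu    # T x T diagonal, N_t on the diagonal
D_TN = D_nu.T @ D_mu   # T x N, presence (1) or absence (0) of i in period t

T_i = np.diag(D_N)     # number of times each individual is observed
N_t = np.diag(D_T)     # number of individuals observed in each period

# Assumed stratification: individuals 0 and 1 in stratum 0, individual 2 in stratum 1.
stratum = np.array([0, 0, 1])
A = stratum.max() + 1
N_a = np.bincount(stratum, minlength=A)        # individuals per stratum
n_a = np.bincount(stratum[ind], minlength=A)   # observations per stratum

print(T_i, N_t, n_a)
```

In this toy panel the sums of the $T_i$'s, of the $N_t$'s, and of the $n_a$'s all equal $n = 7$, as the identities in the text require.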
Using the $n \times A$ matrix $\Delta_\alpha$ of indicator variables denoting observations on strata, we can define the $A \times A$ diagonal matrix $\Delta_A \equiv \Delta_\alpha^T \Delta_\alpha$ (diagonal elements correspond to the $n_a$'s) and the $A \times N$ matrix of zeros and ones $\Delta_{AN} \equiv \Delta_\alpha^T \Delta_\mu \Delta_N^{-1}$, indicating the absence or presence of an individual in a certain stratum (notice that $\Delta_\alpha^T \Delta_\mu$ is a matrix of zeros and $T_i$'s for $i \in I_a$).

Like Mazodier and Trognon (1978) and Phillips (2003), we assume that the individual-specific error and remainder error variances are constant within each stratum but change across strata. Hence, heteroscedasticity on the individual-specific disturbance implies $\mu_i \sim (0, \varphi_a)$, while heteroscedasticity on the remainder error term implies $u_{it} \sim (0, \psi_a)$.

2.1 Robust two-way FE

In the FE model the individual-specific term $\mu_i$ and the time-specific term $\nu_t$ are parameters to be estimated. Therefore, heteroscedasticity is placed only on the remainder error $u_{it}$ by assuming $u_{it} \sim (0, \psi_a)$. The Within (W) estimator is:

$$\hat\beta_W = \left(X^T Q_\Delta X\right)^{-1} X^T Q_\Delta y, \tag{3}$$

where the $n \times n$ matrix $Q_\Delta$ on which the two-way EC model transformation is based is:

$$Q_\Delta = Q_A - P_B = Q_A - Q_A \Delta_\nu Q^{-} \Delta_\nu^T Q_A, \tag{4}$$

with $Q_A = I_n - P_A$, $P_A = \Delta_\mu \Delta_N^{-1} \Delta_\mu^T$, $Q = \Delta_\nu^T Q_A \Delta_\nu$, and $Q^{-}$ the generalized inverse (see Wansbeek and Kapteyn, 1989; Davis, 2002).

Footnote: Note that $\sum_{a=1}^{A} N_a = N$ and $\sum_{a=1}^{A} n_a = n$.

Footnote: The number of explanatory variables, obviously without the intercept, is $k - 1$.

Under the assumptions of strict exogeneity, consistency, homoscedasticity and no serial correlation (see assumptions FE.1-FE.3 in Appendix A of Platoni et al., 2012), the W estimator is consistent and asymptotically normal (see Wooldridge, 2010) with

$$\mathrm{var}\left(\hat\beta_W\right) = \hat\sigma_u^2 \left(X^T Q_\Delta X\right)^{-1}, \tag{5}$$

where $\hat\sigma_u^2$ is the estimator of $\sigma_u^2$. However, relaxing the homoscedasticity assumption (see assumption FE.3 in Appendix A), the expression (5) gives an improper variance-covariance matrix estimator (see Wooldridge, 2010).

Footnote: For an FE model the number of fixed-effect parameters $\mu_1, \dots, \mu_N$ and $\nu_1, \dots, \nu_T$ increases with the number of individuals $N$ and periods $T$, respectively. Hence, the conventional asymptotic result cannot be applied: if $N \to \infty$, then estimates of the parameters $\mu_1, \dots, \mu_N$ are necessarily inconsistent for a fixed $T$ (see Wang and Ho, 2010), and if $T \to \infty$, then estimates of the parameters $\nu_1, \dots, \nu_T$ are necessarily inconsistent for a fixed $N$. Therefore, when the time dimension of the panel is short, the noise in the estimation of the incidental parameters $\mu_i$ contaminates the ML estimates of the structural parameters (see Bester and Hansen, 2016). The literature proposes some solutions to the incidental parameters problem for some of the models, usually relying on removing the incidental parameters before estimation (see Wang and Ho, 2010). One popular approach, widely used in linear models, is to transform the model by the W transformation (i.e., $y_{it}$ and the $(k-1) \times 1$ vector $x_{it}$ are demeaned), as we have done in deriving our estimation.

To obtain robust standard errors we follow the simple method suggested by Arellano (1987) for the one-way EC model, and proposed also by Baltagi (2013). If we stack the observations for each individual $i$, we can write:

$$\tilde y_i = \left(E_{T_i} - E_{T_i} D_i Q^{-} D_i^T E_{T_i}\right) y_i, \qquad \tilde X_i = \left(E_{T_i} - E_{T_i} D_i Q^{-} D_i^T E_{T_i}\right) X_i, \tag{6}$$

where $E_{T_i} = I_{T_i} - \bar J_{T_i}$, with $I_{T_i}$ an identity matrix of dimension $T_i$, $\bar J_{T_i} = J_{T_i} / T_i$, and $J_{T_i}$ a matrix of ones of dimension $T_i$, and $D_i$ is the $T_i \times T$ matrix obtained from the $T \times T$ identity matrix $I_T$ by omitting the rows corresponding to periods in which the individual $i$ is not observed. Therefore, we can compute the $T_i \times 1$ residuals $\tilde e_i = \tilde y_i - \tilde X_i \hat\beta_W$, and the robust asymptotic variance-covariance matrix of $\hat\beta_W$ is:

$$\mathrm{var}\left(\hat\beta_W\right) = \left(X^T Q_\Delta X\right)^{-1} \sum_{i=1}^{N} \tilde X_i^T \tilde e_i \tilde e_i^T \tilde X_i \left(X^T Q_\Delta X\right)^{-1}. \tag{7}$$

However, since $u_{it} \sim (0, \psi_a)$, it is possible to obtain robust standard errors also by stacking the observations for each stratum $a$, as described later in Appendix C.

2.2 GLS estimation

In the RE model, not only the remainder error $u_{it}$, but also the individual-specific error $\mu_i$ and the time-specific error $\nu_t$ are random variables.

If we assume that the variances of $\mu_i$, $\nu_t$, and $u_{it}$ are known, then the generalized least squares (GLS) estimator for $\beta$, obtained by minimizing $\varepsilon^T \Omega^{-1} \varepsilon$ where $\Omega$ is the $n \times n$ variance-covariance matrix, is given by:

$$\hat\beta_{GLS} = \left(X^T \Omega^{-1} X\right)^{-1} X^T \Omega^{-1} y. \tag{8}$$

Assuming homoscedasticity and no serial correlation (i.e., the assumption RE.3 in Appendix B of Platoni et al., 2012), the variance-covariance matrix $\Omega$ has the following form:

$$\Omega = \sigma_u^2 I_n + \sigma_\mu^2 \Delta_\mu \Delta_\mu^T + \sigma_\nu^2 \Delta_\nu \Delta_\nu^T, \tag{9}$$

and the GLS estimator in (8) is efficient. However, assuming homoscedastic $\mu_i$ and/or $u_{it}$ when heteroscedasticity is present will still result in consistent estimates of the regression coefficients, but these estimates will not be efficient.

With general heteroscedasticity (see assumption RE.3 in Appendix B), that is $\mu_i \sim (0, \varphi_a)$ and $u_{it} \sim (0, \psi_a)$, the matrix $\Omega$ in (9) is modified to:

$$\Omega = \Psi + \Delta_\mu \Phi \Delta_\mu^T + \sigma_\nu^2 \Delta_\nu \Delta_\nu^T, \tag{10}$$

with the $n \times n$ matrix $\Psi = \mathrm{diag}\left(\Delta_\mu \Delta_{AN}^T \psi\right)$, the $A \times 1$ vector $\psi = (\psi_1, \psi_2, \dots, \psi_A)^T$, the $N \times N$ matrix $\Phi = \mathrm{diag}\left(\Delta_{AN}^T \varphi\right)$, and the $A \times 1$ vector $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_A)^T$.

The ANOVA-type quadratic unbiased estimator of the variance components based on the W residuals in the homoscedastic case (9) is determined in Wansbeek and Kapteyn (1989) and Davis (2002). The estimation of the components of the variance-covariance matrix $\Omega$ in the heteroscedastic case (10) can be obtained by modifying the QUE procedure suggested by Wansbeek and Kapteyn (1989).

This latter procedure considers the $n \times 1$ residuals $e \equiv y - X \hat\beta_W$ from the W estimator in (3), where $X$ is a matrix of dimension $n \times (k - 1)$, since it does not include the intercept.
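The objects defined so far can be sketched numerically. The Python fragment below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a small hypothetical panel, treats the stratum variances `psi`, `phi` and the time variance `sigma2_nu` as known (made-up values), and uses the Moore-Penrose pseudo-inverse as one possible choice of generalized inverse $Q^{-}$. It computes the W transformation (4), the W estimator (3) with its residuals, the heteroscedastic $\Omega$ in (10), and the GLS estimator (8).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unbalanced panel: observed (individual, period) pairs.
obs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2),
       (2, 1), (2, 2), (3, 0), (3, 1), (3, 2)]
ind = np.array([i for i, _ in obs])
per = np.array([t for _, t in obs])
n, N, T = len(obs), ind.max() + 1, per.max() + 1
stratum = np.array([0, 0, 1, 1])   # assumed stratum of each individual
A = stratum.max() + 1

D_mu = np.eye(N)[ind]              # n x N individual indicators
D_nu = np.eye(T)[per]              # n x T period indicators
D_AN = np.eye(A)[stratum].T        # A x N stratum membership (zeros and ones)

# Within transformation (4); pinv is one choice of generalized inverse.
D_N = D_mu.T @ D_mu
Q_A = np.eye(n) - D_mu @ np.linalg.inv(D_N) @ D_mu.T
Q = D_nu.T @ Q_A @ D_nu
Q_delta = Q_A - Q_A @ D_nu @ np.linalg.pinv(Q) @ D_nu.T @ Q_A

# Assumed (made-up) parameter values used only to simulate data.
beta = np.array([1.5, -0.5])       # slopes of the k - 1 = 2 regressors
psi = np.array([1.0, 4.0])         # remainder-error variances by stratum
phi = np.array([0.5, 2.0])         # individual-effect variances by stratum
sigma2_nu = 0.3                    # homoscedastic time variance

X = rng.normal(size=(n, 2))        # regressors without the intercept
mu = rng.normal(size=N) * np.sqrt(D_AN.T @ phi)
nu = rng.normal(size=T) * np.sqrt(sigma2_nu)
u = rng.normal(size=n) * np.sqrt(D_mu @ D_AN.T @ psi)
y = X @ beta + D_mu @ mu + D_nu @ nu + u

# W estimator (3) and the residuals e = y - X beta_W.
XtQ = X.T @ Q_delta
beta_W = np.linalg.solve(XtQ @ X, XtQ @ y)
e = y - X @ beta_W

# Heteroscedastic covariance matrix (10) and GLS estimator (8).
Psi = np.diag(D_mu @ D_AN.T @ psi)
Phi = np.diag(D_AN.T @ phi)
Omega = Psi + D_mu @ Phi @ D_mu.T + sigma2_nu * D_nu @ D_nu.T
Xc = np.column_stack([np.ones(n), X])   # n x k matrix including the intercept
Omega_inv = np.linalg.inv(Omega)
beta_GLS = np.linalg.solve(Xc.T @ Omega_inv @ Xc, Xc.T @ Omega_inv @ y)
```

In practice the variances are unknown and would be replaced by estimates, which is what the QUE procedure described next provides; forming the full $n \times n$ matrix $\Omega$ is only sensible for small illustrative panels such as this one.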
Given that the $n \times k$ matrix $X$ in (8) contains a vector of ones, we have to define the $n \times 1$ residuals $f \equiv E_n e = e - \bar e$, where $E_n = I_n - \bar J_n$, with $I_n$ being an identity matrix of dimension $n$, $\bar J_n = J_n / n$, and $J_n$ a matrix of ones of dimension $n$. Moreover, we also have to define the $n_a \times 1$ residuals $f_a = H_a f$, with $H_a$ the $n_a \times n$ matrix obtained from the identity matrix $I_n$ by omitting the rows referring to observations not related to stratum $a$, and the matrix $\bar J_{n_a} = J_{n_a} / n_a$, with $J_{n_a}$ a matrix of ones of dimension $n_a$.

Footnote: Note that the number of explanatory variables, obviously including the intercept, is $k$.

Footnote: The matrix $\Psi$ and the related vector $\psi$ have already been defined in Appendix A.

The adapted QUEs for $\Psi$, $\Phi$, and $\sigma_\nu^2$ are obtained by equating:

$$
\begin{aligned}
q_{n_a} &\equiv f^T Q_\Delta H_a^T H_a Q_\Delta f \;\rightarrow\; \sum_{a=1}^{A} q_{n_a} = q_n \equiv f^T Q_\Delta f, \\
q_{N_a} &\equiv f_a^T \bar J_{n_a} f_a \;\rightarrow\; \sum_{a=1}^{A} q_{N_a} = q_N \equiv f^T \Delta_\mu \Delta_N^{-1} \Delta_\mu^T f, \\
q_T &\equiv f^T \Delta_\nu \Delta_T^{-1} \Delta_\nu^T f,
\end{aligned}
\tag{11}
$$

to their expected values. For more details on the identities in (11), see the formula (37) in Appendix D.

Hence, the estimator of $\psi_a$ is:

$$\hat\psi_a = \frac{q_{n_a} + k_a \hat\sigma_u^2}{n_a - N_a - \tau_a}, \tag{12}$$

where $k_a \equiv \mathrm{tr}\left[(X^T Q_\Delta X)^{-1} X^T Q_\Delta H_a^T H_a Q_\Delta X\right]$, with $\sum_{a=1}^{A} k_a = k - 1$, and $\tau_a \equiv n_a - N_a - \mathrm{tr}\left(H_a Q_\Delta H_a^T\right)$, with $\sum_{a=1}^{A} \tau_a = T - 1$. The estimated variance $\hat\sigma_u^2$ is obtained by equating $q_n$ to its expected value (see Wansbeek and Kapteyn, 1989). Furthermore, the estimator of $\varphi_a$ is:

$$
\hat\varphi_a = \frac{q_{N_a} - \left(N_a - \frac{n_a}{n}\right) \hat\psi_a - \left(k_{N_a} - k_{[\ldots]a} + \frac{n_a}{n} k_{[\ldots]} + \frac{n_a}{n}\right) \hat\sigma_u^2}{n_a - \lambda_{\mu_a}} + \frac{-\frac{n_a}{n} \lambda_\mu \hat\sigma_\mu^2 - \left(N_a - \lambda_{\nu_a} + \frac{n_a}{n} \lambda_\nu\right) \hat\sigma_\nu^2}{n_a - \lambda_{\mu_a}}, \tag{13}
$$

where $k_{N_a} \equiv \mathrm{tr}\left[(X^T Q_\Delta X)^{-1} X_a^T \bar J_{n_a} X_a\right]$, $k_{[\ldots]} \equiv \iota_n^T X (X^T Q_\Delta X)^{-1} X^T \iota_n / n$, $k_{[\ldots]a} \equiv \iota_n^T X (X^T Q_\Delta X)^{-1} X_a^T \iota_{n_a} / n = \iota_{n_a}^T X_a (X^T Q_\Delta X)^{-1} X^T \iota_n / n$, with $\iota_n$ and $\iota_{n_a}$ vectors of ones of dimension $n$ and $n_a$ respectively, $\lambda_\mu \equiv \iota_n^T \Delta_\mu \Delta_\mu^T \iota_n / n = \sum_{i=1}^{N} T_i^2 / n$, $\lambda_{\mu_a} \varphi_a \equiv \iota_n^T \Delta_\mu \Phi \Delta_\mu^T H_a^T \iota_{n_a} / n = \sum_{i \in I_a} T_i^2 \varphi_a / n$, $\lambda_\nu \equiv \iota_n^T \Delta_\nu \Delta_\nu^T \iota_n / n = \sum_{t=1}^{T} N_t^2 / n$, $\lambda_{\nu_a} \equiv \iota_n^T \Delta_\nu \Delta_\nu^T H_a^T \iota_{n_a} / n = \sum_{t \in J_a} N_t / n$, with $J_a$ the set of periods in which individuals belonging to stratum $a$ are observed. (The subscripts marked $[\ldots]$, and possibly some numerical coefficients in (13), are missing.) The estimated variances $\hat\sigma_\mu^2$ and $\hat\sigma_\nu^2$ are obtained jointly by equating $q_N$ and $q_T$ to their expected values (see Wansbeek and Kapteyn, 1989).

Simpler heteroscedastic schemes (i.e., heteroscedasticity only on the individual-specific disturbance or on the remainder error) can be obtained by combining results for the general scheme with those for the homoscedastic case, although when we consider the case of heteroscedasticity only on the individual-specific disturbance the expected value of $q_{N_a}$ and the estimated variance $\hat\varphi_a$ are obtained differently, as detailed in equations (42)-(43) in Appendix D.

2.3 Monte Carlo experiment – single-equation case

In order to analyze the performance of the proposed techniques, we develop a simple simulation on $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$, where $\beta_0 = [\ldots]$, $\beta_1 = -[\ldots]$, $\beta_2 = 8$, and $\beta_3 = -[\ldots]$. The experiment considers $N = 250$ and $N = 500$ individuals and $T = 12$ periods, [...] EC model is the appropriate one.

Moreover, the experiment is implemented by considering as strata the deciles of the independent variable $x_{[\ldots]}$. The homoscedastic time variance is $\sigma_\nu^2 = [\ldots]$. The heteroscedastic individual-specific variance is $\varphi_a = \sigma_\mu^2 (1 + \lambda \bar x_a)^2$, where $\sigma_\mu^2 = [\ldots]$, and the heteroscedastic remainder error variance is $\psi_a = \sigma_u^2 (1 + \lambda \bar x_a)^2$, where $\sigma_u^2 = [\ldots]$. The parameter $\lambda$ is assigned values 0, 1, and 2, where $\lambda = 0$ denotes the homoscedastic case and the degree of heteroscedasticity increases as $\lambda$ becomes larger. Finally, the independent variables' values $x_{k,it}$ ($k = 1, 2, 3$) are generated according to a modified version of the scheme introduced by Nerlove (1971) and used, among others, by Baltagi (1981), Wansbeek and Kapteyn (1989), and Platoni et al. (2012):

$$x_{k,it} = [\ldots]\, t + [\ldots]\, x_{k,it-1} + \omega_{k,it}, \qquad k = 1, 2, 3,$$

with $\omega_{k,it}$ following the uniform distribution on $[-[\ldots], [\ldots]]$ and $x_{k,i0} = [\ldots] + [\ldots]\, \omega_{k,i0}$.

In order to construct the unbalanced panels, we adopt the procedure currently used for rotating panels, in which we have approximately the same number of individuals every time period: a fixed percentage of individuals (20% in our case) is replaced each time period, but they can re-enter the sample in later periods. Thus, if the number of individuals is $N = 250$ then the number of observations is $n = [\ldots]$, and if $N = 500$ then the number of observations is $n = [\ldots]$. The simulation results are shown in Table 1 and Table 2.

Footnote: The simulations have been implemented with the econometric software TSP version 5.1.

Footnote: Whereas data have been generated by specifying the same parametric variance functions as in Li and Stengos (1994) and Roy (2002), the proposed estimation method proves to be effective also in the case of heteroscedasticity of unknown form, if the strata are identified by using a proper selection procedure, such as the Akaike (1974) information criterion.

Footnote: Also in Wansbeek and Kapteyn (1989) each period 20% of the households in the panel is removed randomly.

Footnote: With $N = 250$ the average numbers of observations for each stratum $a$ are $\bar n_{a=1} = [\ldots]78$, $\bar n_{a=2}, \dots, \bar n_{a=7} = [\ldots]$, $\bar n_{a=8} = [\ldots]77$, $\bar n_{a=9} = [\ldots]51$, and $\bar n_{a=10} = [\ldots]$; with $N = 500$ they are [...].

Footnote: As in Baltagi and Griffin (1988) and Phillips (2003), negative variance estimates are replaced by zero.

Footnote: Whereas data have been generated such that the individual-specific error $\mu_i$ and the time-specific error $\nu_t$ are random variables, Table 2 also displays the results of the two-way FE and robust two-way FE estimations to check the method suggested in subsection 2.1. Moreover, note that the two-way FE residuals are used in the QUE procedure of the GLS estimation (and both in the QUE and WB procedures of the SUR systems estimation in the following section 3).

Table 1 reports the estimated variances $\hat\psi_a$ and $\hat\varphi_a$, the latter being computed on the basis of a remainder error either homoscedastic ($\hat\sigma_u^2$) or heteroscedastic ($\hat\psi_a$). As one can notice right away, if $\lambda$ is equal to 1 or 2 (i.e., in the heteroscedastic cases) the estimated variance $\hat\varphi_a(\hat\psi_a)$ is closer than the estimated variance $\hat\varphi_a(\hat\sigma_u^2)$ to the true value $\varphi_a$. Moreover, when $\lambda$ is equal to 0 (homoscedastic case), the heteroscedastic procedures yield estimated variances $\hat\psi_a$ and $\hat\varphi_a$ that do not substantially vary among strata, and that are very close to the estimated values $\hat\sigma_u^2$ and $\hat\sigma_\mu^2$ obtained through the homoscedastic procedure (also reported in Table 1).

TABLE 1. Simulation results on single-equation two-way EC model: estimated variances $\hat\psi_a$ and $\hat\varphi_a$

Columns 3 to 7 refer to $N = 250$, $T = 12$, and $n = [\ldots]$; columns 8 to 12 refer to $N = 500$, $T = 12$, and $n = [\ldots]$.

| $\lambda$ | $a$ | $\psi_a$ | $\hat\psi_a$ | $\varphi_a$ | $\hat\varphi_a(\hat\sigma_u^2)$ | $\hat\varphi_a(\hat\psi_a)$ | $\psi_a$ | $\hat\psi_a$ | $\varphi_a$ | $\hat\varphi_a(\hat\sigma_u^2)$ | $\hat\varphi_a(\hat\psi_a)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 6.039 | 6.045 | 6.488 | 6.589 | 6.589 | 6.039 | 6.019 | 6.488 | 6.560 | 6.569 |
| 0 | 2 | 6.039 | 6.043 | 6.488 | 6.509 | 6.510 | 6.039 | 6.046 | 6.488 | 6.538 | 6.537 |
| 0 | 3 | 6.039 | 6.059 | 6.488 | 6.513 | 6.511 | 6.039 | 6.051 | 6.488 | 6.522 | 6.521 |
| 0 | 4 | 6.039 | 6.012 | 6.488 | 6.536 | 6.542 | 6.039 | 6.060 | 6.488 | 6.499 | 6.496 |
| 0 | 5 | 6.039 | 6.038 | 6.488 | 6.534 | 6.535 | 6.039 | 6.032 | 6.488 | 6.525 | 6.527 |
| 0 | 6 | 6.039 | 6.050 | 6.488 | 6.478 | 6.476 | 6.039 | 6.040 | 6.488 | 6.488 | 6.489 |
| 0 | 7 | 6.039 | 6.073 | 6.488 | 6.451 | 6.444 | 6.039 | 6.044 | 6.488 | 6.528 | 6.528 |
| 0 | 8 | 6.039 | 6.061 | 6.488 | 6.532 | 6.527 | 6.039 | 6.051 | 6.488 | 6.529 | 6.526 |
| 0 | 9 | 6.039 | 6.046 | 6.488 | 6.530 | 6.536 | 6.039 | 6.042 | 6.488 | 6.527 | 6.528 |
| 0 | 10 | 6.039 | 5.962 | 6.488 | 6.617 | 6.881 | 6.039 | 5.969 | 6.488 | 6.561 | 6.652 |
| 1 | 1 | 12.352 | 12.947 | 13.270 | 5.400 | 13.220 | 12.296 | 12.554 | 13.211 | 4.796 | 13.213 |
| 1 | 2 | 19.770 | 20.189 | 21.239 | 16.807 | 21.114 | 19.760 | 19.982 | 21.229 | 16.930 | 21.242 |
| 1 | 3 | 25.800 | 26.164 | 27.718 | 25.154 | 27.650 | 25.775 | 25.964 | 27.692 | 25.277 | 27.787 |
| 1 | 4 | 31.743 | 31.750 | 34.103 | 32.944 | 34.274 | 31.741 | 31.922 | 34.101 | 32.748 | 34.049 |
| 1 | 5 | 38.086 | 38.108 | 40.918 | 40.964 | 41.173 | 38.081 | 38.048 | 40.912 | 40.855 | 41.077 |
| 1 | 6 | 45.119 | 45.088 | 48.473 | 49.268 | 48.212 | 45.104 | 45.053 | 48.458 | 49.350 | 48.321 |
| 1 | 7 | 53.775 | 53.783 | 57.773 | 60.349 | 57.110 | 53.741 | 53.623 | 57.737 | 61.068 | 57.940 |
| 1 | 8 | 67.934 | 67.518 | 72.985 | 82.875 | 73.373 | 67.800 | 67.596 | 72.841 | 82.462 | 73.091 |
| 1 | 9 | 94.070 | 92.953 | 101.064 | 128.090 | 101.610 | 93.710 | 93.079 | 100.677 | 127.137 | 101.043 |
| 1 | 10 | 152.259 | 147.422 | 163.580 | 255.287 | 173.531 | 152.315 | 149.183 | 163.639 | 253.722 | 167.770 |
| 2 | 1 | 20.935 | 22.737 | 22.491 | 3.298 | 22.052 | 20.774 | 21.616 | 22.319 | 1.632 | 22.117 |
| 2 | 2 | 41.435 | 42.753 | 44.516 | 30.440 | 44.059 | 41.396 | 42.077 | 44.474 | 30.685 | 44.367 |
| 2 | 3 | 59.331 | 60.436 | 63.743 | 55.292 | 63.446 | 59.246 | 59.811 | 63.651 | 55.632 | 63.835 |
| 2 | 4 | 77.650 | 77.814 | 83.423 | 79.246 | 83.749 | 77.633 | 78.145 | 83.405 | 78.762 | 83.211 |
| 2 | 5 | 97.740 | 97.849 | 105.007 | 104.667 | 105.633 | 97.713 | 97.656 | 104.977 | 104.342 | 105.354 |
| 2 | 6 | 120.505 | 120.402 | 129.465 | 131.767 | 128.658 | 120.449 | 120.301 | 129.404 | 131.993 | 128.970 |
| 2 | 7 | 149.079 | 148.994 | 160.163 | 168.423 | 158.219 | 148.955 | 148.567 | 160.030 | 170.403 | 160.534 |
| 2 | 8 | 196.797 | 195.348 | 211.429 | 243.594 | 212.610 | 196.321 | 195.606 | 210.917 | 242.139 | 211.593 |
| 2 | 9 | 287.036 | 283.267 | 308.377 | 398.811 | 309.990 | 285.749 | 283.619 | 306.995 | 395.537 | 308.077 |
| 2 | 10 | 493.850 | 477.513 | 530.568 | 847.322 | 563.254 | 494.017 | 483.568 | 530.747 | 841.774 | 544.214 |

For $\lambda = 0$ the table also has a row reporting $\sigma_u^2$, $\hat\sigma_u^2$, $\sigma_\mu^2$, and $\hat\sigma_\mu^2$ for each sample size: [...].

Note: $\psi_a$ and $\varphi_a$ are the true values of the variances, $\hat\psi_a$ are the estimated variances of the remainder error $u_{it}$, $\hat\varphi_a$ are the estimated variances of the individual-specific error $\mu_i$ computed on the basis of a remainder error either homoscedastic ($\hat\sigma_u^2$) or heteroscedastic ($\hat\psi_a$).

Table 2 shows that the heteroscedastic procedures yield standard errors lower than those obtained through the homoscedastic procedure if $\lambda = 1, 2$, but higher standard errors if $\lambda = 0$. However, in the latter case (i.e., the homoscedastic case) if the number of individuals (and thus the number of observations) increases, then the standard errors computed with the heteroscedastic procedures become closer to the standard errors computed with the homoscedastic procedure.

TABLE 2. Simulation results on single-equation two-way EC model: standard errors of the estimated parameters and (average) estimated variances of the error components

Panels: $N = 250$, $T = 12$, and $n = [\ldots]$; $N = 500$, $T = 12$, and $n = [\ldots]$. Columns: true value; (a) FE; (b) robust FE; (c) homoscedasticity; heteroscedasticity on (d) $u_{it}$, (e) $\mu_i$, (f) $u_{it}$ and $\mu_i$. Rows, for each of $\lambda = 0, 1, 2$: the four $\beta$ parameters, $\varphi$, $\sigma_\nu^2$, and $\psi$. Table entries: [...].

Note: Parameter estimation based on (a-b) the estimated homoscedastic variance $\hat\sigma_u^2$; (c) the estimated homoscedastic variances $\hat\sigma_\nu^2$, $\hat\sigma_\mu^2$, and $\hat\sigma_u^2$; (d) the estimated homoscedastic variances $\hat\sigma_\nu^2$ and $\hat\sigma_\mu^2$ and heteroscedastic variances $\hat\psi_a$, whose average value is $\hat\psi$; (e) the estimated homoscedastic variances $\hat\sigma_\nu^2$ and $\hat\sigma_u^2$ and heteroscedastic variances $\hat\varphi_a(\hat\sigma_u^2)$, whose average value is $\hat\varphi$; (f) the estimated homoscedastic variance $\hat\sigma_\nu^2$ and heteroscedastic variances $\hat\psi_a$ and $\hat\varphi_a(\hat\psi_a)$.

Focusing on the heteroscedastic cases, considering heteroscedasticity only on the remainder error (columns (d)) yields standard errors that are lower than the standard errors obtained considering heteroscedasticity only on the individual-specific effect (columns (e)). In other words, misspecifying the form of heteroscedasticity can be costly when heteroscedasticity is assumed only on the individual-specific effect; this loss in efficiency is smaller when heteroscedasticity is assumed only on the remainder error. These findings confirm the conclusions in Baltagi et al. (2005). Obviously, the smallest standard errors are obtained by implementing the estimation procedure which considers both heteroscedasticity types (columns (f)).

As in Li and Stengos (1994), Roy (2002), and Baltagi et al. (2005), we consider the relative efficiency of the different estimators, computed as the ratio of the mean square error (MSE) of the estimator under consideration to the MSE of the true GLS estimator. Results are reported in Table 3.

TABLE 3. Relative efficiency of the single-equation two-way EC model

Panels: $N = 250$, $T = 12$, and $n = [\ldots]$; $N = 500$, $T = 12$, and $n = [\ldots]$. Columns: homoscedasticity; heteroscedasticity on $u_{it}$, on $\mu_i$, and on both $u_{it}$ and $\mu_i$. Rows: $\lambda = 0, 1, 2$. Table entries: [...].

Note: Relative efficiency is defined as the ratio of the MSE of the estimator under consideration to the MSE of the true GLS estimator (computed considering the true variances $\psi_a$, $\varphi_a$, and $\sigma_\nu^2$). Note that values of the ratio both larger and smaller than 1 indicate a loss in efficiency: if the ratio is larger than 1, then the absolute value of the composite error term $\varepsilon_{it} = \mu_i + \nu_t + u_{it}$ is larger than the true value; and if the ratio is smaller than 1, then the absolute value of the composite error term $\varepsilon_{it}$ is smaller than the true value.

We see that there are improvements in relative MSE numbers as the sample size increases, especially when we refer to the homoscedastic estimator. Furthermore, confirming our previous remarks, misspecifying the form of heteroscedasticity may be costly when only the individual-specific effect is considered heteroscedastic, especially if the sample size is small. Besides, as already observed in the comments to Table 2, the most efficient estimator is the one that considers both the remainder error and the individual-specific effect heteroscedastic.

3. HETEROSCEDASTIC TWO-WAY SUR SYSTEMS

When systems of equations have to be estimated, as is the case for SUR systems, single-equation estimation techniques are not appropriate. In order to estimate heteroscedastic two-way SUR systems we extend the procedure in Biørn (2004), with individuals grouped according to the number of times they are observed.

3.1 Model and notation

Let $N_p$ denote the number of individuals observed in exactly $p$ periods, with $p = 1, \ldots, T$. Hence $\sum_p N_p = N$ and $\sum_p (N_p\, p) = n$. Moreover, let $N_{a,p}$ denote the number of individuals belonging to stratum $a$ and observed in $p$ periods; therefore, $\sum_a N_{a,p} = N_p$ and $\sum_p \sum_a N_{a,p} = N$.

We assume that the $T$ groups of individuals are ordered such that the $N_1$ individuals observed once come first, the $N_2$ individuals observed twice come second, etc. Hence, with $C_p = \sum_{h=1}^{p} N_h$ being the cumulated number of individuals observed at most $p$ times, the index sets of the individuals observed exactly $p$ times can be written as $I_p = \{C_{p-1}+1, \ldots, C_p\}$. Note that $I_1$ may be considered as a pure cross section and $I_p$, with $p \ge 2$, as a pseudo-balanced panel with $p$ observations for each individual. This structure allows us to use a number of results derived for the two-way SUR systems.

If $k_m$ is the number of regressors for equation $m$, the total number of regressors for the system is $K = \sum_{m=1}^{M} k_m$. Stacking the $M$ equations, indexed $m = 1, \ldots, M$, for the observation $(i,t)$ we have:

$$y_{it} = X_{it}\beta + \mu_i + \nu_t + u_{it} = X_{it}\beta + \varepsilon_{it}, \tag{14}$$

where the $M \times K$ matrix of explanatory variables is $X_{it} = \mathrm{diag}(x_{1it}^T, \ldots, x_{Mit}^T)$, the $K \times 1$ vector of parameters is $\beta = (\beta_1^T, \ldots, \beta_M^T)^T$, and where $\mu_i \equiv (\mu_{1i}, \ldots, \mu_{Mi})^T$, $\nu_t \equiv (\nu_{1t}, \ldots, \nu_{Mt})^T$, and $u_{it} \equiv (u_{1it}, \ldots, u_{Mit})^T$. If we do not have cross-equation restrictions, we can assume $E(u_{mit} \mid x_{1it}^T, x_{2it}^T, \ldots, x_{Mit}^T) = 0$, and then $E(y_{mit} \mid x_{1it}^T, x_{2it}^T, \ldots, x_{Mit}^T) = E(y_{mit} \mid x_{mit}^T) = x_{mit}^T\beta_m$. On the contrary, if we have cross-equation restrictions, we can only assume $E(u_{it} \mid x_{it}^T) = 0$, where $x_{it} \equiv (x_{1it}^T, x_{2it}^T, \ldots, x_{Mit}^T)^T$.

With heteroscedasticity on both the individual-specific disturbance and the remainder error, for $i \in I_a$ and across the regression equations $m$ and $j$, we assume that:

$$E(\mu_{mi}\mu_{ji'}) = \begin{cases} \varphi_{a,mj} & i = i' \\ 0 & i \ne i' \end{cases}, \quad E(\nu_{mt}\nu_{jt'}) = \begin{cases} \sigma_{\nu,mj} & t = t' \\ 0 & t \ne t' \end{cases}, \quad E(u_{mit}u_{ji't'}) = \begin{cases} \psi_{a,mj} & i = i' \text{ and } t = t' \\ 0 & i \ne i' \text{ and/or } t \ne t' \end{cases}. \tag{15}$$

Let us consider the $NM \times 1$ vector $\mu \equiv (\mu_1^T, \ldots, \mu_N^T)^T$, the $TM \times 1$ vector $\nu \equiv (\nu_1^T, \ldots, \nu_T^T)^T$, and the $nM \times 1$ vector $u \equiv (u_{11}^T, u_{12}^T, \ldots, u_{1T_1}^T, u_{21}^T, \ldots, u_{NT_N}^T)^T$. Since the $M \times 1$ vector $u_{it} \sim (0, \Psi_a)$, the $M \times 1$ vector $\mu_i \sim (0, \Phi_a)$, and the $M \times 1$ vector $\nu_t \sim (0, \Sigma_\nu)$, with the $M \times M$ matrices $\Psi_a = [\psi_{a,mj}]$, $\Phi_a = [\varphi_{a,mj}]$, and $\Sigma_\nu = [\sigma_{\nu,mj}]$, the expected values of the vectors $u_{it}$, $\mu_i$, and $\nu_t$ are zero and their covariance matrices are equal to $\Psi_a$, $\Phi_a$, and $\Sigma_\nu$. It follows that $E(\varepsilon_{it}\varepsilon_{i't'}^T) = \delta_{ii'}\Phi_a + \delta_{tt'}\Sigma_\nu + \delta_{ii'}\delta_{tt'}\Psi_a$, with $\delta_{ii'} = 1$ for $i = i'$ and $\delta_{ii'} = 0$ for $i \ne i'$, $\delta_{tt'} = 1$ for $t = t'$ and $\delta_{tt'} = 0$ for $t \ne t'$.

As in Biørn (2004), let us consider the $pM \times 1$ vector of dependent variables $y_{i(p)} \equiv (y_{i1}^T, \ldots, y_{ip}^T)^T$, the $pM \times K$ matrix of explanatory variables $X_{i(p)} \equiv (X_{i1}^T, \ldots, X_{ip}^T)^T$, and the $pM \times 1$ vector of disturbances $\varepsilon_{i(p)} \equiv (\varepsilon_{i1}^T, \ldots, \varepsilon_{ip}^T)^T$ for $i \in I_p$.
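The grouping of individuals by number of observations can be sketched in a few lines. The snippet below is only an illustration of the definitions of $N_p$, $C_p$, and $I_p$ given above; the function name `group_by_observation_count` and the toy input are hypothetical and not part of the paper.

```python
from collections import Counter

def group_by_observation_count(obs_counts, T):
    """Return N_p, C_p and the index sets I_p (1-based, after sorting by p).

    obs_counts[i] is the number of periods in which individual i is observed.
    """
    counts = Counter(obs_counts)
    N = {p: counts.get(p, 0) for p in range(1, T + 1)}   # N_p
    C = {0: 0}                                           # C_0 = 0 by convention
    for p in range(1, T + 1):
        C[p] = C[p - 1] + N[p]                           # C_p = sum_{h<=p} N_h
    # I_p = {C_{p-1} + 1, ..., C_p}
    I = {p: list(range(C[p - 1] + 1, C[p] + 1)) for p in range(1, T + 1)}
    return N, C, I

# Hypothetical toy panel: six individuals, at most T = 3 periods.
obs_counts = [3, 1, 2, 3, 1, 3]
N, C, I = group_by_observation_count(obs_counts, T=3)
```

In this toy panel the two individuals observed once form the pure cross section $I_1$, while $I_2$ and $I_3$ are pseudo-balanced panels.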
If we define the $pM \times TM$ matrix $\Delta_{i(p)}$, indicating in which period $t$ the individual $i$ of the group $p$ is observed, and if we consider the $TM \times 1$ vector $\nu$, for the individual $i \in I_p$ we can define the $pM \times 1$ vector $\nu_{i(p)} \equiv \Delta_{i(p)}\nu$ and write the model:

$$y_{i(p)} = X_{i(p)}\beta + (\iota_p \otimes \mu_i) + \nu_{i(p)} + u_{i(p)} = X_{i(p)}\beta + \varepsilon_{i(p)}, \tag{16}$$

where $\iota_p$ is a $p \times 1$ vector of ones. The $pM \times pM$ heteroscedastic variance-covariance matrix of the $pM \times 1$ vector $\varepsilon_{i(a,p)}$ for the individual $i \in I_{a,p}$, with $I_{a,p} = I_a \cap I_p$ the set of individuals belonging to stratum $a$ and observed in $p$ periods, is given by:

$$\Omega_{a,p} = E_p \otimes (\Psi_a + \Sigma_\nu) + \bar J_p \otimes (\Psi_a + \Sigma_\nu + p\,\Phi_a), \tag{17}$$

where $E_p = I_p - \bar J_p$ (with $I_p$ the identity matrix of dimension $p$) and $\bar J_p = J_p / p$ (with $J_p$ the matrix of ones of dimension $p$). Since $E_p$ and $\bar J_p$ are symmetric, idempotent, and mutually orthogonal, the inverse of the variance-covariance matrix of the individuals belonging to stratum $a$ and group $p$ is:

$$\Omega_{a,p}^{-1} = E_p \otimes (\Psi_a + \Sigma_\nu)^{-1} + \bar J_p \otimes (\Psi_a + \Sigma_\nu + p\,\Phi_a)^{-1}. \tag{18}$$

This specification nests simpler heteroscedastic schemes as well as the homoscedastic case by replacing $\Phi_a$ with $\Sigma_\mu$ and/or $\Psi_a$ with $\Sigma_u$.

Footnote: As Biørn (2004) suggests, with cross-equation restrictions we can redefine $\beta$ as the complete $K \times 1$ coefficient vector (without duplication) and the $M \times K$ regression matrix as $X_{it} = (x_{1it}, x_{2it}, \ldots, x_{Mit})^T$, where the $k$th element of the vector $x_{mit}$ either contains the observation on the variable in the $m$th equation which corresponds to the $k$th coefficient in $\beta$ or is zero if the $k$th coefficient does not occur in the $m$th equation.

If we assume that $\Psi_a$, $\Phi_a$, and $\Sigma_\nu$ are known, then in the heteroscedastic case we can write the GLS estimator for the $K \times 1$ vector of parameters $\beta$ as the problem of minimizing:

$$\sum_{p=1}^{T} \sum_{a=1}^{A} \sum_{i \in I_{a,p}} \varepsilon_i^T \Omega_{a,p}^{-1} \varepsilon_i, \tag{19}$$

where, for the sake of simplicity and since there is no risk of ambiguity, $\varepsilon_i$ is used instead of $\varepsilon_{i(a,p)}$.

If we apply GLS on the observations for the individuals observed $p$ times we obtain:

$$\hat\beta_p^{GLS} = \left( \sum_{a=1}^{A} \sum_{i \in I_{a,p}} X_i^T \Omega_{a,p}^{-1} X_i \right)^{-1} \sum_{a=1}^{A} \sum_{i \in I_{a,p}} X_i^T \Omega_{a,p}^{-1} y_i, \tag{20}$$

while the full GLS estimator is:

$$\hat\beta^{GLS} = \left( \sum_{p=1}^{T} \sum_{a=1}^{A} \sum_{i \in I_{a,p}} X_i^T \Omega_{a,p}^{-1} X_i \right)^{-1} \sum_{p=1}^{T} \sum_{a=1}^{A} \sum_{i \in I_{a,p}} X_i^T \Omega_{a,p}^{-1} y_i, \tag{21}$$

where $X_i$ is the $pM \times K$ matrix of explanatory variables related to individual $i \in I_{a,p}$.

3.2 Estimation of the covariance matrices
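Before turning to the estimation of the covariance components, it may help to check numerically the structure that Section 3.1 relies on. The sketch below builds $\Omega_{a,p}$ as in (17), its inverse as in (18), and accumulates the full GLS estimator as in (21). All names (`omega`, `omega_inv`, `gls`) and all numerical inputs are hypothetical, and the covariance matrices are treated as known.

```python
import numpy as np

def omega(p, Psi, Phi, Sigma_nu):
    """Omega_{a,p} as in (17) for an individual observed p times."""
    J_bar = np.full((p, p), 1.0 / p)      # averaging matrix J_p / p
    E = np.eye(p) - J_bar                 # deviation-from-mean projector
    return np.kron(E, Psi + Sigma_nu) + np.kron(J_bar, Psi + Sigma_nu + p * Phi)

def omega_inv(p, Psi, Phi, Sigma_nu):
    """Inverse as in (18): only two M x M matrices are inverted."""
    J_bar = np.full((p, p), 1.0 / p)
    E = np.eye(p) - J_bar
    return (np.kron(E, np.linalg.inv(Psi + Sigma_nu))
            + np.kron(J_bar, np.linalg.inv(Psi + Sigma_nu + p * Phi)))

def gls(blocks):
    """Full GLS as in (21); blocks is a list of (X_i, y_i, Omega_inv_i)."""
    K = blocks[0][0].shape[1]
    XtOX = np.zeros((K, K))
    XtOy = np.zeros(K)
    for X, y, O_inv in blocks:
        XtOX += X.T @ O_inv @ X
        XtOy += X.T @ O_inv @ y
    return np.linalg.solve(XtOX, XtOy)

# Hypothetical inputs for a single stratum: M = 2 equations, K = 3 coefficients.
Psi = np.array([[1.0, 0.3], [0.3, 1.0]])
Phi = np.array([[0.5, 0.1], [0.1, 0.4]])
Sigma_nu = np.array([[0.2, 0.05], [0.05, 0.3]])
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
blocks = []
for p in (2, 3):                          # two groups of individuals
    for _ in range(4):
        X = rng.normal(size=(p * 2, 3))   # pM x K regressor block
        blocks.append((X, X @ beta, omega_inv(p, Psi, Phi, Sigma_nu)))
beta_hat = gls(blocks)                    # y is noise-free, so beta is recovered
```

The point of (18) is computational: whatever the value of $p$, only the two $M \times M$ matrices $\Psi_a + \Sigma_\nu$ and $\Psi_a + \Sigma_\nu + p\,\Phi_a$ have to be inverted.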

The next step is to find an appropriate technique to estimate the components of the variance-covariance matrices of the two-way SUR system, $\Psi_a$, $\Phi_a$, and $\Sigma_\nu$. This can be achieved by adopting either the QUE procedure suggested by Wansbeek and Kapteyn (1989) for the homoscedastic single-equation case or the within-between (WB) procedure suggested by Biørn (2004) for the homoscedastic one-way SUR system. In the following sub-sections we modify both procedures to make them suitable for the heteroscedastic two-way SUR system.

The QUE procedure

The QUE procedure considers the $n \times 1$ vector of residuals $e_m \equiv y_m - X_m\hat\beta_m^W$ from the W estimator in (3) for the equation $m = 1, \ldots, M$, where $X_m$ is a matrix of dimension $n \times (k_m - 1)$. If we assume that the $n \times k_m$ matrix $X_m$ contains a vector of ones, then we have to define the $n \times 1$ vector of centered residuals $f_m \equiv E_n e_m = e_m - \bar e_m$.

With heteroscedasticity, we can obtain the adapted QUEs for $\psi_{a,mj}$, $\varphi_{a,mj}$, and $\sigma_{\nu,mj}$ by equating:

$$\begin{aligned}
q_{n_a,mj} &\equiv f_j^T Q_\Delta H_a^T H_a Q_\Delta f_m \;\rightarrow\; \sum_{a=1}^{A} q_{n_a,mj} = q_{n,mj} \equiv f_j^T Q_\Delta f_m, \\
q_{N_a,mj} &\equiv f_{a,j}^T \bar J_{N_a} f_{a,m} \;\rightarrow\; \sum_{a=1}^{A} q_{N_a,mj} = q_{N,mj} \equiv f_j^T \Delta_\mu \Delta_N^{-1} \Delta_\mu^T f_m, \\
q_{T,mj} &\equiv f_j^T \Delta_\nu \Delta_T^{-1} \Delta_\nu^T f_m,
\end{aligned} \tag{22}$$

to their expected values. The identities in (22) can be further detailed as already done in formula (37), Appendix D, for the identities in (11).

Hence, the estimator of $\psi_{a,mj}$ is:

$$\hat\psi_{a,mj} = \frac{q_{n_a,mj} + (k_{a,m} + k_{a,j} - k_{a,mj})\,\hat\sigma_{u,mj}}{n_a - N_a - \tau_a}, \tag{23}$$

where $k_{a,mj} \equiv \mathrm{tr}[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta H_a^T H_a Q_\Delta X_m]$, with $\sum_{a=1}^{A} k_{a,mj} = k_{mj}$ and $k_{mj} \equiv \mathrm{tr}[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta X_m]$. The estimated variance-covariance $\hat\sigma_{u,mj}$ is obtained by equating $q_{n,mj}$ to its expected value (see Platoni et al., 2012). Furthermore, the estimator of $\varphi_{a,mj}$ is:

$$\hat\varphi_{a,mj} = \frac{q_{N_a,mj} - \left(N_a - 2\frac{n_a}{n}\right)\hat\psi_{a,mj} - \left(k_{N_a,mj} - k_{0_a,mj} + \frac{n_a}{n}k_{0,mj} + \frac{n_a}{n}\right)\hat\sigma_{u,mj}}{n_a - 2\lambda_{\mu_a}} + \frac{-\frac{n_a}{n}\lambda_\mu\,\hat\sigma_{\mu,mj} - \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\hat\sigma_{\nu,mj}}{n_a - 2\lambda_{\mu_a}}, \tag{24}$$

where

$$\begin{aligned}
k_{N_a,mj} &\equiv \mathrm{tr}[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_{a,j}^T \bar J_{N_a} X_{a,m}], \\
k_{0_a,mj} &\equiv \frac{\iota_{N_a}^T X_{a,m} (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T \iota_n}{n} + \frac{\iota_n^T X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_{a,j}^T \iota_{N_a}}{n}, \\
k_{0,mj} &\equiv \frac{\iota_n^T X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T \iota_n}{n}.
\end{aligned}$$
The estimated variance-covariance $\hat\sigma_{\mu,mj}$ is obtained jointly with $\hat\sigma_{\nu,mj}$ by equating $q_{N,mj}$ and $q_{T,mj}$ to their expected values (see Platoni et al., 2012).

As in the single-equation case, simpler heteroscedastic schemes (i.e., heteroscedasticity only on the individual-specific disturbance or only on the remainder error) can be obtained by combining the results for the general scheme with those for the homoscedastic case, although with heteroscedasticity only on the individual-specific disturbance the expected value of $q_{N_a,mj}$ and the estimated variance-covariance $\hat\varphi_{a,mj}$ are obtained differently (see equations (48)-(49) in Appendix D).

The WB procedure

With heteroscedastic two-way systems of equations, the $M \times M$ matrices of within individuals, between individuals, and between times (co)variations in the $\varepsilon$'s of the $M$ equations are the following:

$$\begin{aligned}
W_\varepsilon &= \sum_{a=1}^{A} W_{\varepsilon_a} = \sum_{a=1}^{A} \sum_{i \in I_a} \sum_{t=1}^{T_i} (\varepsilon_{it} - \bar\varepsilon_{i\cdot} - \bar\varepsilon_{\cdot t})(\varepsilon_{it} - \bar\varepsilon_{i\cdot} - \bar\varepsilon_{\cdot t})^T, \\
B_\varepsilon^C &= \sum_{a=1}^{A} B_{\varepsilon_a}^C = \sum_{a=1}^{A} \sum_{i \in I_a} T_i (\bar\varepsilon_{i\cdot} - \bar\varepsilon)(\bar\varepsilon_{i\cdot} - \bar\varepsilon)^T, \\
B_\varepsilon^T &= \sum_{t=1}^{T} N_t (\bar\varepsilon_{\cdot t} - \bar\varepsilon)(\bar\varepsilon_{\cdot t} - \bar\varepsilon)^T,
\end{aligned} \tag{25}$$

where for each equation $m$ we have $\bar\varepsilon_{mi\cdot} = \sum_{t=1}^{T_i} \varepsilon_{mit} / T_i$, $\bar\varepsilon_{m\cdot t} = \sum_{i=1}^{N_t} \varepsilon_{mit} / N_t$, and $\bar\varepsilon_m = \sum_{i=1}^{N} \sum_{t=1}^{T_i} \varepsilon_{mit} / n = \sum_{i=1}^{N} (T_i \bar\varepsilon_{mi\cdot}) / n$ or $\bar\varepsilon_m = \sum_{t=1}^{T} \sum_{i=1}^{N_t} \varepsilon_{mit} / n = \sum_{t=1}^{T} (N_t \bar\varepsilon_{m\cdot t}) / n$.

Because the $u_{it}$'s, the $\mu_i$'s, and the $\nu_t$'s are independent, from the equations in (25) we can write:

$$E(W_{\varepsilon_a}) = E(W_{u_a}), \quad E(B_{\varepsilon_a}^C) = E(B_{\mu_a}^C) + E(B_{u_a}^C), \quad E(B_\varepsilon^T) = E(B_\nu^T) + E(B_u^T), \tag{26}$$

where the within individuals (co)variation is:

$$W_{u_a} = \sum_{i \in I_a} \sum_{t=1}^{T_i} (u_{it} - \bar u_{i\cdot} - \bar u_{\cdot t})(u_{it} - \bar u_{i\cdot} - \bar u_{\cdot t})^T = \sum_{i \in I_a} \sum_{t=1}^{T_i} u_{it}u_{it}^T - \sum_{i \in I_a} T_i \bar u_{i\cdot}\bar u_{i\cdot}^T - \sum_{i \in I_a} \sum_{t=1}^{T_i} \bar u_{\cdot t}\bar u_{\cdot t}^T, \tag{27}$$

the between individuals (co)variations are:

$$\begin{aligned}
B_{\mu_a}^C &= \sum_{i \in I_a} T_i (\mu_i - \bar\mu)(\mu_i - \bar\mu)^T = \sum_{i \in I_a} T_i \mu_i\mu_i^T - \sum_{i \in I_a} T_i \bar\mu\bar\mu^T, \\
B_{u_a}^C &= \sum_{i \in I_a} T_i (\bar u_{i\cdot} - \bar u)(\bar u_{i\cdot} - \bar u)^T = \sum_{i \in I_a} T_i \bar u_{i\cdot}\bar u_{i\cdot}^T - \sum_{i \in I_a} T_i \bar u\bar u^T,
\end{aligned} \tag{28}$$

and the between times (co)variations, as in the homoscedastic case, are:

$$\begin{aligned}
B_\nu^T &= \sum_{t=1}^{T} N_t (\nu_t - \bar\nu)(\nu_t - \bar\nu)^T = \sum_{t=1}^{T} N_t \nu_t\nu_t^T - n\,\bar\nu\bar\nu^T, \\
B_u^T &= \sum_{t=1}^{T} N_t (\bar u_{\cdot t} - \bar u)(\bar u_{\cdot t} - \bar u)^T = \sum_{t=1}^{T} N_t \bar u_{\cdot t}\bar u_{\cdot t}^T - n\,\bar u\bar u^T,
\end{aligned} \tag{29}$$

where $\bar u_{mi\cdot} = \sum_{t=1}^{T_i} u_{mit} / T_i$, $\bar u_{m\cdot t} = \sum_{i=1}^{N_t} u_{mit} / N_t$, $\bar u_m = \sum_{i=1}^{N} \sum_{t=1}^{T_i} u_{mit} / n = \sum_{i=1}^{N} (T_i \bar u_{mi\cdot}) / n$ or $\bar u_m = \sum_{t=1}^{T} \sum_{i=1}^{N_t} u_{mit} / n = \sum_{t=1}^{T} (N_t \bar u_{m\cdot t}) / n$, $\bar\mu_m = \sum_{i=1}^{N} (T_i \mu_{mi}) / n$, and $\bar\nu_m = \sum_{t=1}^{T} (N_t \nu_{mt}) / n$ (see Biørn, 2004; Platoni et al., 2012).

Since for $i \in I_a$ we have $E(\varepsilon_{it}\varepsilon_{i't'}^T) = \delta_{ii'}\Phi_a + \delta_{tt'}\Sigma_\nu + \delta_{ii'}\delta_{tt'}\Psi_a$, where $E(u_{it}u_{i't'}^T) = \delta_{ii'}\delta_{tt'}\Psi_a$, $E(\mu_i\mu_{i'}^T) = \delta_{ii'}\Phi_a$, and $E(\nu_t\nu_{t'}^T) = \delta_{tt'}\Sigma_\nu$, it follows that $E(\bar u_{i\cdot}\bar u_{i\cdot}^T) = \Psi_a / T_i$, $E(\bar u_{\cdot t}\bar u_{\cdot t}^T) = \sum_{i \in I_t} \Psi_a / N_t^2 \simeq \bar\Psi / N_t \approx \Sigma_u / N_t$, with $I_t$ the set of individuals observed in period $t$, $E(\bar u\bar u^T) = \sum_{i=1}^{N} (T_i \Psi_a) / n^2 = \bar\Psi / n \approx \Sigma_u / n$, $E(\bar\mu\bar\mu^T) = \sum_{i=1}^{N} (T_i^2 \Phi_a) / (\sum_{i=1}^{N} T_i)^2 = (\sum_{i=1}^{N} T_i^2 / n^2)\,\bar\Phi \approx (\sum_{i=1}^{N} T_i^2 / n^2)\,\Sigma_\mu$, and $E(\bar\nu\bar\nu^T) = (\sum_{t=1}^{T} N_t^2 / n^2)\,\Sigma_\nu$.

Hence, the $M \times M$ matrices

$$\hat\Psi_a = \frac{W_{\varepsilon_a} + \sum_{i \in I_a} \sum_{t=1}^{T_i} N_t^{-1}\,\hat\Sigma_u}{n_a - N_a}, \tag{30}$$

with $\sum_{a=1}^{A} \sum_{i \in I_a} \sum_{t=1}^{T_i} N_t^{-1} = T$, and

$$\hat\Phi_a = \frac{B_{\varepsilon_a}^C + \frac{\sum_{i \in I_a} T_i}{n} \frac{\sum_{j=1}^{N} T_j^2}{n}\,\hat\Sigma_\mu - N_a \hat\Psi_a + \frac{\sum_{i \in I_a} T_i}{n}\,\hat\Sigma_u}{\sum_{i \in I_a} T_i} \tag{31}$$

would be unbiased estimators of $\Psi_a$ and $\Phi_a$ if the $\varepsilon$'s were known.
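The (co)variation matrices in (25) are straightforward to compute on an unbalanced panel. The following is a minimal sketch, assuming the disturbances are stored row-wise with integer individual and period labels; the function name `variation_matrices`, the label arrays, and the toy data are all hypothetical.

```python
import numpy as np

def variation_matrices(eps, ind, per, stratum):
    """Within (per stratum), between-individuals (per stratum) and
    between-times M x M (co)variation matrices, following (25).

    eps: (n, M) array; ind, per: length-n integer labels;
    stratum: dict mapping each individual label to its stratum.
    """
    n, M = eps.shape
    mean_i = {i: eps[ind == i].mean(axis=0) for i in np.unique(ind)}
    mean_t = {t: eps[per == t].mean(axis=0) for t in np.unique(per)}
    mean_all = eps.mean(axis=0)
    strata = sorted(set(stratum.values()))
    W = {a: np.zeros((M, M)) for a in strata}
    BC = {a: np.zeros((M, M)) for a in strata}
    for r in range(n):                       # within individuals
        d = eps[r] - mean_i[ind[r]] - mean_t[per[r]]
        W[stratum[int(ind[r])]] += np.outer(d, d)
    for i, m in mean_i.items():              # between individuals
        T_i = int((ind == i).sum())
        d = m - mean_all
        BC[stratum[int(i)]] += T_i * np.outer(d, d)
    BT = np.zeros((M, M))
    for t, m in mean_t.items():              # between times
        N_t = int((per == t).sum())
        d = m - mean_all
        BT += N_t * np.outer(d, d)
    return W, BC, BT

# Hypothetical unbalanced panel: 4 individuals, 3 periods, M = 2 equations.
ind = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3])
per = np.array([0, 1, 2, 0, 1, 0, 1, 2, 2])
stratum = {0: "a", 1: "a", 2: "b", 3: "b"}
eps = np.random.default_rng(1).normal(size=(9, 2))
W, BC, BT = variation_matrices(eps, ind, per, stratum)
```

Summing the stratum-specific matrices over $a$ gives the totals $W_\varepsilon$ and $B_\varepsilon^C$ in (25).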
Both the estimators of $\Sigma_u$ and $\Sigma_\mu$ and the estimator of $\Sigma_\nu$ are derived as in the homoscedastic case:

$$\hat\Sigma_u = \frac{W_\varepsilon}{n - N - T}, \quad \hat\Sigma_\mu = \frac{B_\varepsilon^C - (N - 1)\,\hat\Sigma_u}{n - \sum_{i=1}^{N} T_i^2 / n}, \quad \text{and} \quad \hat\Sigma_\nu = \frac{B_\varepsilon^T - (T - 1)\,\hat\Sigma_u}{n - \sum_{t=1}^{T} N_t^2 / n}, \tag{32}$$

which would be unbiased estimators of $\Sigma_u$, $\Sigma_\mu$, and $\Sigma_\nu$ if the $\varepsilon$'s were known (see Biørn, 2004; Platoni et al., 2012).

Again, a simpler heteroscedastic scheme (i.e., heteroscedasticity only on the individual-specific disturbance or only on the remainder error) can be obtained by combining the results for the general scheme with those for the homoscedastic case, although with heteroscedasticity only on the individual-specific disturbance the estimator $\hat\Phi_a$ is obtained differently (see equation (50) in Appendix E).

As Biørn (2004) suggested, in empirical applications consistent residuals can replace the $\varepsilon$'s in (25) to obtain consistent estimates of $\Psi_a$, $\Phi_a$, and $\Sigma_\nu$. Since the QUE procedure is based on the W residuals, for coherence in the WB procedure we also consider the $M \times 1$ vector of residuals $e_{it} \equiv y_{it} - X_{it}\hat\beta^W$ from the W estimator in (3) for the individual $i$ in period $t$, where $X_{it}$ is a matrix of dimension $M \times (K - M)$. As above, if we assume that the $M \times K$ matrix $X_{it}$ in (14) always contains $M$ vectors of ones (a vector of ones for each equation $m$), then we have to define the $M \times 1$ vector of centered residuals $f_{it} = e_{it} - \bar e$, where $\bar e_m = \sum_{i=1}^{N} \sum_{t=1}^{T_i} e_{mit} / n = \sum_{t=1}^{T} \sum_{i=1}^{N_t} e_{mit} / n$.
Therefore, the $M \times M$ matrices of within individuals, between individuals, and between times (co)variations in the $f$'s of the different $M$ equations are the following:

$$\begin{aligned}
W_f &= \sum_{a=1}^{A} W_{f_a} = \sum_{a=1}^{A} \sum_{i \in I_a} \sum_{t=1}^{T_i} (f_{it} - \bar f_{i\cdot} - \bar f_{\cdot t})(f_{it} - \bar f_{i\cdot} - \bar f_{\cdot t})^T, \\
B_f^C &= \sum_{a=1}^{A} B_{f_a}^C = \sum_{a=1}^{A} \sum_{i \in I_a} T_i (\bar f_{i\cdot} - \bar f)(\bar f_{i\cdot} - \bar f)^T, \\
B_f^T &= \sum_{t=1}^{T} N_t (\bar f_{\cdot t} - \bar f)(\bar f_{\cdot t} - \bar f)^T,
\end{aligned} \tag{33}$$

where for each equation $m$ we have $\bar f_{mi\cdot} = \sum_{t=1}^{T_i} f_{mit} / T_i$, $\bar f_{m\cdot t} = \sum_{i=1}^{N_t} f_{mit} / N_t$, and $\bar f_m = \sum_{i=1}^{N} \sum_{t=1}^{T_i} f_{mit} / n = \sum_{i=1}^{N} (T_i \bar f_{mi\cdot}) / n$ or $\bar f_m = \sum_{t=1}^{T} \sum_{i=1}^{N_t} f_{mit} / n = \sum_{t=1}^{T} (N_t \bar f_{m\cdot t}) / n$. Given that:

$$\begin{aligned}
E(W_{f_a}) &= (n_a - N_a)\,\Psi_a - \sum_{i \in I_a} \sum_{t \in J_i} N_t^{-1}\,\bar\Psi, \\
E(B_{f_a}^C) &= \sum_{i \in I_a} T_i\,\Phi_a - \frac{\sum_{i \in I_a} T_i}{n} \frac{\sum_{j=1}^{N} T_j^2}{n}\,\bar\Phi + N_a \Psi_a - \frac{\sum_{i \in I_a} T_i}{n}\,\bar\Psi, \\
E(B_f^T) &= \left( n - \frac{\sum_{t=1}^{T} N_t^2}{n} \right) \Sigma_\nu + (T - 1)\,\bar\Psi,
\end{aligned} \tag{34}$$

where $J_i$ is the set of periods in which individual $i$ is observed and with $\bar\Psi \approx \Sigma_u$ and $\bar\Phi \approx \Sigma_\mu$, we can conclude that the estimators in (30) and (31), with $W_{f_a}$ instead of $W_{\varepsilon_a}$ and $B_{f_a}^C$ instead of $B_{\varepsilon_a}^C$ respectively, are consistent estimators of $\Psi_a$ and $\Phi_a$. As mentioned above, both the consistent estimators of $\Sigma_u$ and $\Sigma_\mu$ and the consistent estimator of $\Sigma_\nu$ are derived as in the homoscedastic case (see Biørn, 2004; Platoni et al., 2012). Finally, with heteroscedasticity only on the individual-specific disturbance, the expected value $E(B_{f_a}^C)$ is given by equation (51) in Appendix E.

3.3 Monte Carlo experiment – SUR system case

In order to analyze the performances of the proposed techniques, we develop a simplesimulation on a three-equation system ( M = y = β + β x + β x + ε , y = β + β x + β x + β x + ε , y = β + β x + β x + ε , where β = ( , , − ) T , β = ( , − , , − ) T , and β = ( , − , ) T . Then wealso allow the cross equations restrictions β = β and β = β .The independent variables’ values x kit ( k = , ,

3) have been generated and the un-balanced panel has been constructed according to the same

DGP of the single-equation Note that the second equation is the same equation of the single-equation case in subsection 3. . This should mimic a real world situation of a large unbalanced panel for whichthe two-way SUR system is the appropriate model. Moreover, as in the single-equationcase, the experiment is implemented by considering as strata the deciles of the inde-pendent variable x . The homoscedastic time variance-covariance matrix is: Σ ν = " , while the heteroscedastic variances-covariances ϕ a , m j and ψ a , m j have been generatedfrom the matrices: Σ µ = " and Σ u = " with ϕ a , m j = σ µ , m j ( + λ ¯ x a ) and ψ a , m j = σ u , m j ( + λ ¯ x a ) , where σ µ , m j and σ u , m j areelements of the matrices Σ µ and Σ u respectively and ¯ x a is the mean of the independentvariable x over the decile/stratum a . The results of a 2000-run simulation are shown in Tables 4 and 5.
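The construction of the strata and of the stratum-specific covariance matrices can be sketched as follows. This is only an illustration under stated assumptions: the base matrix `Sigma`, the regressor draws, and the function names are hypothetical, and the scaling factor is taken to be $1 + \lambda\bar x_a$ as printed above, so that $\lambda = 0$ gives the homoscedastic case.

```python
import numpy as np

def decile_strata(x, n_strata=10):
    """Assign each value of x to a quantile-based stratum 0..n_strata-1."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_strata + 1)[1:-1])
    return np.searchsorted(edges, x, side="right")

def stratum_covariances(x, Sigma, lam, n_strata=10):
    """Scale a base covariance matrix by (1 + lam * x_bar_a) in each stratum."""
    labels = decile_strata(x, n_strata)
    x_bar = np.array([x[labels == a].mean() for a in range(n_strata)])
    scaled = np.array([Sigma * (1.0 + lam * xb) for xb in x_bar])
    return labels, x_bar, scaled

# Hypothetical inputs: positive regressor values and a 2 x 2 base matrix.
x = np.random.default_rng(2).uniform(1.0, 10.0, size=200)
Sigma = np.array([[1.0, -0.2], [-0.2, 0.8]])
labels, x_bar, Phi_a = stratum_covariances(x, Sigma, lam=1.0)
```

Because the stratum means $\bar x_a$ increase with $a$, the scaled variances grow across strata whenever $\lambda > 0$ and the regressor is positive.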

Tables 4 and 5 show that, contrary to the single-equation case, the heteroscedastic procedures yield standard errors lower than those obtained through the homoscedastic procedure in all cases, i.e., not only in the heteroscedastic cases $\lambda = 1$ and $\lambda = 2$, but also in the homoscedastic case $\lambda = 0$. However, in the homoscedastic case (i.e., with $\lambda = 0$) the standard errors computed with the heteroscedastic procedures are very close to the standard errors computed with the homoscedastic procedure.

Focusing on the heteroscedastic cases (i.e., with $\lambda = 1$ and $\lambda = 2$):
• the smallest standard errors are obtained when the estimation procedure that accounts for both kinds of heteroscedasticity is implemented;
• however, unlike in the single-equation estimation, there is no evident difference in the loss in efficiency due to the misspecification of the form of heteroscedasticity.

Finally, comparing the standard errors obtained with the QUE procedure (displayed in Table 4) and those obtained with the WB procedure (displayed in Table 5), the QUE procedure yields lower standard errors than the WB procedure.

With N =

250 the numbers of individuals for each group p are N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = =

6, and N p = =

5; and with N =

500 they are N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = = N p = =

12, and N p = = The correlation among equations veriﬁes the null hypothesis of the Breusch and Pagan (1979) test at n = , N = As in Baltagi and Grifﬁn (1988) and Phillips (2003), negative variance estimates are replaced by zero. Table 7 in Appendix F displays the estimated variances-covariances for the stratum a = ABLE Simulation results on two-way

SUR systems -

QUE procedure:standard errors of the estimated parameters and (average) estimated variances andcovariances of the error components N = T =

12, and n = N = T =

12, and n = u it µ it µ it , u it value homosc. u it µ it µ it , u it (a) (b) (c) (d) (a) (b) (c) (d) λ = β β β ϕ ϕ -1.048 -1.067 -1.067 -1.067 -1.069 -1.048 -1.044 -1.044 -1.044 -1.044¯ ϕ σ ν , σ ν , σ ν , -1.107 -1.124 -1.107 -1.113¯ ψ ψ ψ β β β β ϕ ϕ σ ν , σ ν , ψ ψ -1.232 -1.238 -1.236 -1.238 -1.236 -1.232 -1.228 -1.229 -1.228 -1.229 β β β ϕ σ ν , ψ λ = β β β ϕ ϕ -7.437 -7.101 -7.101 -7.096 -7.566 -7.442 -6.950 -6.950 -6.947 -7.363¯ ϕ σ ν , σ ν , σ ν , -1.107 -1.119 -1.107 -1.112¯ ψ ψ ψ a) (b) (c) (d) (a) (b) (c) (d) β β β β ϕ ϕ σ ν , σ ν , ψ ψ -8.743 -8.040 -8.690 -8.040 -8.690 -8.748 -7.988 -8.725 -7.988 -8.725 β β β ϕ σ ν , ψ λ = β β β ϕ ϕ -20.075 -18.830 -18.830 -18.811 -20.426 -20.091 -18.425 -18.425 -18.416 -19.852¯ ϕ σ ν , σ ν , σ ν , -1.107 -1.111 -1.107 -1.111¯ ψ ψ ψ β β β β ϕ ϕ σ ν , σ ν , ψ ψ -23.599 -21.142 -23.393 -21.142 -23.393 -23.618 -21.021 -23.558 -21.021 -23.558 β β β ϕ σ ν , ψ Note : Parameters estimation based on (a) the estimated homoscedastic vars-Covs ˆ σ ν , mj , ˆ σ µ , mj , ˆ σ u , mj ; (b) the estimatedhomoscedastic vars-Covs ˆ σ ν , mj and ˆ σ µ , mj and heteroscedastic vars-Covs ˆ ψ a , mj , whose the average value is ˆ ψ mj ; (c) theestimated homoscedastic vars-Covs ˆ σ ν , mj and ˆ σ u , mj and heteroscedastic vars-Covs ˆ ϕ a , mj ( ˆ σ u , mj ) , whose the average value isˆ ϕ mj ; (d) the estimated homoscedastic vars-Covs ˆ σ ν , mj and heteroscedastic vars-Covs ˆ ψ a , mj and ˆ ϕ a , mj ( ˆ ψ a , mj ) . ABLE Simulation results on two-way

SUR systems - WB procedure:standard errors of the estimated parameters and (average) estimated variances andcovariances of the error components N = T =

12, and n = N = T =

12, and n = u it µ it µ it , u it value homosc. u it µ it µ it , u it (a) (b) (c) (d) (a) (b) (c) (d) λ = β β β ϕ ϕ -1.048 -0.990 -0.990 -0.985 -1.059 -1.048 -0.968 -0.968 -0.966 -1.024¯ ϕ σ ν , σ ν , σ ν , -1.107 -1.113 -1.107 -1.107¯ ψ ψ ψ β β β β ϕ ϕ σ ν , σ ν , ψ ψ -1.232 -0.949 -0.821 -0.949 -0.821 -1.232 -0.937 -0.819 -0.937 -0.819 β β β ϕ σ ν , ψ λ = β β β ϕ ϕ -7.437 -7.005 -7.005 -6.970 -7.518 -7.442 -6.863 -6.863 -6.846 -7.319¯ ϕ σ ν , σ ν , σ ν , -1.107 -1.060 -1.107 -1.081¯ ψ ψ ψ a) (b) (c) (d) (a) (b) (c) (d) β β β β ϕ ϕ σ ν , σ ν , ψ ψ -8.743 -7.749 -8.283 -7.749 -8.283 -8.748 -7.694 -8.315 -7.694 -8.315 β β β ϕ σ ν , ψ λ = β β β ϕ ϕ -20.075 -18.695 -18.695 -18.601 -20.302 -20.091 -18.318 -18.318 -18.272 -19.761¯ ϕ σ ν , σ ν , σ ν , -1.107 -0.954 -1.107 -1.029¯ ψ ψ ψ β β β β ϕ ϕ σ ν , σ ν , ψ ψ -23.599 -20.845 -22.997 -20.845 -22.997 -23.618 -20.721 -23.144 -20.721 -23.144 β β β ϕ σ ν , ψ Note : Parameters estimation based on (a) the estimated homoscedastic vars-Covs ˆ σ ν , mj , ˆ σ µ , mj , ˆ σ u , mj ; (b) the estimatedhomoscedastic vars-Covs ˆ σ ν , mj and ˆ σ µ , mj and heteroscedastic vars-Covs ˆ ψ a , mj , whose the average value is ˆ ψ mj ; (c) theestimated homoscedastic vars-Covs ˆ σ ν , mj and ˆ σ u , mj and heteroscedastic vars-Covs ˆ ϕ a , mj ( ˆ σ u , mj ) , whose the average value isˆ ϕ mj ; (d) the estimated homoscedastic vars-Covs ˆ σ ν , mj and heteroscedastic vars-Covs ˆ ψ a , mj and ˆ ϕ a , mj ( ˆ ψ a , mj ) .

Like Table 3, Table 6 displays the ratios of the MSE of the estimators under consideration to the MSE of the true GLS estimator, i.e., it displays the measures of relative efficiency of the different estimators.

TABLE 6. Relative efficiency of two-way SUR systems

[Table 6: the numerical entries are not recoverable. Two panels with $T = 12$; columns for homoscedasticity and for heteroscedasticity on $u_{it}$, on $\mu_i$, and on both $u_{it}$ and $\mu_i$; rows $y_1$, $y_2$, $y_3$ for each value of $\lambda$, first for the QUE procedure and then for the WB procedure.]

Note: Relative efficiency is defined as the ratio of the MSE of the estimator under consideration to the MSE of the true GLS estimator (computed considering the true variances-covariances $\psi_{a,mj}$, $\varphi_{a,mj}$, and $\sigma_{\nu,mj}$). Note that values of the ratio both larger and smaller than 1 indicate a loss in efficiency: if the ratio is larger than 1, then the absolute value of the composite error term $\varepsilon_{mit} = \mu_{mi} + \nu_{mt} + u_{mit}$ is larger than the true value; and if the ratio is smaller than 1, then the absolute value of the composite error term $\varepsilon_{mit}$ is smaller than the true value.

This table highlights that, as expected, with $\lambda =$ […] (i.e., with $\lambda = 1$) the WB procedure is more efficient than the QUE procedure, whereas if the heteroscedasticity is high (i.e., with $\lambda = 2$) the QUE procedure is more efficient than the WB procedure.

4. CONCLUSION

The use of panel data is becoming very popular in applied econometrics, since large data sets including many individuals observed for several periods are increasingly accessible and manageable. Most of these data sets are unbalanced panels, since very often not all the individuals are observed over the whole time period. In estimating single-equation or system-of-equations EC models on these data, the heteroscedasticity problem may be very common, especially when individuals differ in size.

In this paper, we have derived suitable EC model estimators for heteroscedastic two-way single equations and SUR systems (with cross-equations restrictions) on unbalanced panel data. Our simulations show that such estimators substantially improve estimation efficiency as compared to the case where heteroscedasticity is not taken into account, especially when both the individual-specific and remainder error components are heteroscedastic.

APPENDIX A: FIXED EFFECTS ESTIMATION ASSUMPTIONS

In the FE estimation the following assumptions are made.

FE.1 STRICT EXOGENEITY. The set of $(k-1)T_i$ explanatory variables for each individual $x_{i\circ} \equiv (x_{i1}, x_{i2}, \ldots, x_{iT_i})$ is uncorrelated with the idiosyncratic error $u_{it}$ and the set of $(k-1)N_t$ explanatory variables in each time period $x_{\circ t} \equiv (x_{1t}, x_{2t}, \ldots, x_{N_t t})$ is also uncorrelated with the same idiosyncratic error $u_{it}$:

$$E(u_{it} \mid x, \mu_i, \nu_t) = E(u_{it} \mid x_{i\circ}, \mu_i, \nu_t) = E(u_{it} \mid x_{\circ t}, \mu_i, \nu_t) = 0,$$

with $x \equiv (x_{11}, \ldots, x_{1T_1}, x_{21}, \ldots, x_{2T_2}, \ldots, x_{N1}, \ldots, x_{NT_N})$.

FE.2 CONSISTENCY. The W estimator in (3) is asymptotically well behaved, in the sense that the "adjusted" $(k-1) \times (k-1)$ outer product matrix $X^T Q_{[\Delta]} X$ has the appropriate rank:

$$\mathrm{rank}\left(X^T Q_{[\Delta]} X\right) = k - 1.$$

FE.3 NO SERIAL CORRELATION. For each stratum $a$ the conditional variance-covariance matrix of the idiosyncratic error terms $u_{it}$ coincides with the unconditional one, and it is characterized by constant variances and zero covariances:

$$E\left(u_a u_a^T \mid x_a, \mu_{i(a)}, \nu_t\right) = \psi_a I_{n_a}.$$

Hence, given the $A \times 1$ vector $\psi = (\psi_1, \psi_2, \ldots, \psi_A)^T$, we can define the $n \times n$ matrix $\Psi = \mathrm{diag}\left(\Delta_\mu \Delta_{AN}^T \psi\right)$ and the conditional variance-covariance matrix of $u_{it}$ is $E\left(uu^T \mid x, \mu_i, \nu_t\right) = \Psi$.

Footnote: Details on the assumptions FE.1 and FE.2 can be found in Appendix A of Platoni et al. (2012).

APPENDIX B: RANDOM EFFECTS ESTIMATION ASSUMPTIONS

In the RE estimation the following assumptions are made.

RE.1.a STRICT EXOGENEITY. The set of $kT_i$ explanatory variables for each individual $x_{i\circ} \equiv (x_{i1}, x_{i2}, \ldots, x_{iT_i})$ is uncorrelated with the idiosyncratic error $u_{it}$ and the set of $kN_t$ explanatory variables in each time period $x_{\circ t} \equiv (x_{1t}, x_{2t}, \ldots, x_{N_t t})$ is also uncorrelated with the same idiosyncratic error $u_{it}$:

$$E(u_{it} \mid x, \mu_i, \nu_t) = E(u_{it} \mid x_{i\circ}, \mu_i, \nu_t) = E(u_{it} \mid x_{\circ t}, \mu_i, \nu_t) = 0,$$

with $x \equiv (x_{11}, \ldots, x_{1T_1}, x_{21}, \ldots, x_{2T_2}, \ldots, x_{N1}, \ldots, x_{NT_N})$.

RE.1.b AND RE.1.c ORTHOGONALITY CONDITIONS. Both $\mu_i$ and $\nu_t$ are orthogonal to the corresponding sets of explanatory variables, that is the $kT_i$ explanatory variables for each individual $x_{i\circ}$ and the $kN_t$ explanatory variables in each time period $x_{\circ t}$:

$$E(\mu_i \mid x_{i\circ}) = E(\mu_i) = 0, \qquad E(\nu_t \mid x_{\circ t}) = E(\nu_t) = 0.$$

RE.2 RANK CONDITION. The $k \times k$ weighted outer product matrix $X^T \Omega^{-1} X$ has the appropriate rank, ensuring the GLS estimator in (8) is consistent:

$$\mathrm{rank}\left(X^T \Omega^{-1} X\right) = k.$$

RE.3 NO SERIAL CORRELATION. For each stratum $a$ the conditional variance-covariance matrix of the idiosyncratic error terms $u_{it}$ is characterized by constant variances and zero covariances; in addition, whereas the variance of the time-specific effect $\nu_t$ is constant across strata, the variance of the individual-specific effect $\mu_i$ is constant within each stratum $a$:

a. $E\left(u_a u_a^T \mid x_a, \mu_{i(a)}, \nu_t\right) = \psi_a I_{n_a}$,
b. $E\left(\mu_{i(a)}^2 \mid x_{i(a)}\right) = \varphi_a$,
c. $E\left(\nu_t^2 \mid x_t\right) = \sigma^2_\nu$.

APPENDIX C: ALTERNATIVE ROBUST STANDARD ERRORS

Let us re-index the individuals belonging to stratum a as i a = a , . . . , N a , so that T i a refers to the number of times the individual i of the stratum a is observed.Since u it ∼ (cid:0) , ψ a (cid:1) , it is possible to obtain robust standard errors also by stacking Details on the assumptions RE.1 and RE.2 can be found in Appendix B of Platoni et al. (2012). a , and then by writing: e y a = h diag (cid:0) E T ia (cid:1) − (cid:0) E T a D a , . . . , E T Na D N a (cid:1) T Q − (cid:0) D T1 a E T a , . . . , D T N a E T Na (cid:1)i y a , e X a = h diag (cid:0) E T ia (cid:1) − (cid:0) E T a D a , . . . , E T Na D N a (cid:1) T Q − (cid:0) D T1 a E T a , . . . , D T N a E T Na (cid:1)i X a . (35)Therefore, we can compute the N a × e e a = e y a − e X a ˆ β W and the robust asymptoticvariance-covariance matrix of ˆ β W is estimated by:var (cid:0) ˆ β W (cid:1) = (cid:0) X T Q ∆ X (cid:1) − A ∑ a = (cid:16)e X T a e e a e e T a e X a (cid:17) (cid:0) X T Q ∆ X (cid:1) − . (36)A PPENDIX

D: T

ECHNICAL APPENDIX ON

QUE

PROCEDURES

D.1

Adapted QUEs in (11)

The identities in (11) can be further detailed as: q n a ≡ h f T a − ¯f T N (cid:5) ∆ T µ a − (cid:0) ¯f T (cid:5) T ∆ T − ¯f T N (cid:5) ∆ T TN (cid:1) Q − (cid:0) ∆ ν a − ∆ µ a ∆ − N ∆ T TN (cid:1) T ih f a − ∆ µ a ¯f N (cid:5) − (cid:0) ∆ ν a − ∆ µ a ∆ − N ∆ T TN (cid:1) Q − (cid:0) ¯f T (cid:5) T ∆ T − ¯f T N (cid:5) ∆ T TN (cid:1) T i , q n ≡ f T1 × n f n × − ¯f T N (cid:5) × N ∆ NN × N ¯f N (cid:5) N × − ¯f T (cid:5) T × T ∆ TT × T − ¯f T N (cid:5) × N ∆ T TNN × T ! Q − T × T ¯f T (cid:5) T × T ∆ TT × T − ¯f T N (cid:5) × N ∆ T TNN × T ! T , q N a ≡ ∑ i ∈ I a T i ¯ f i (cid:5) , q N ≡ ¯f T N (cid:5) × N ∆ NN × N ¯f N (cid:5) N × = N ∑ i = T i ¯ f i (cid:5) = A ∑ a = ∑ i ∈ I a T i ¯ f i (cid:5) , q T ≡ ¯f T (cid:5) T × T ∆ TT × T ¯f (cid:5) TT × = T ∑ t = N t ¯ f (cid:5) t , (37)where the elements of the N × N (cid:5) are ¯ f i (cid:5) = ∑ Tit = f it T i , the elements of the T × (cid:5) T are ¯ f (cid:5) t = ∑ Nti = f it N t , ∆ µ a = H a ∆ µ , and ∆ ν a = H a ∆ ν .D.2 Expected values in the single-equation case

Referring to the identities in (11), and considering the $n \times n$ matrix $M \equiv I_n - X(X^T Q_\Delta X)^{-1} X^T Q_\Delta$ (and then by definition $e = My = M\varepsilon$ and $ff^T = E_n ee^T E_n = E_n M \Omega M^T E_n$), the expected value of $q_{n_a}$ is:

$$E(q_{n_a}) = \mathrm{tr}\left(H_a Q_\Delta E_n M \Omega M^T E_n Q_\Delta H_a^T\right) = (n_a - N_a - \tau_a)\,\psi_a - k_a \bar\psi, \tag{38}$$

where $\tau_a \equiv n_a - N_a - \mathrm{tr}(H_a Q_\Delta H_a^T)$, $k_a \equiv \mathrm{tr}[(X^T Q_\Delta X)^{-1} X^T Q_\Delta H_a^T H_a Q_\Delta X]$, and $\bar\psi \approx \sigma^2_u$ is obtained by equating $q_n$ to its expected value (see Wansbeek and Kapteyn, 1989; Davis, 2002), that is:

$$E(q_n) = [n - N - (T - 1) - (k - 1)]\,\sigma^2_u. \tag{39}$$

Moreover, the expected value of $q_{N_a}$ is:

$$E(q_{N_a}) = \mathrm{tr}\left(\bar J_{n_a} H_a E_n M \Omega M^T E_n H_a^T\right) = \left(N_a - 2\frac{n_a}{n}\right)\psi_a + \left(k_{N_a} - k_{0_a} + \frac{n_a}{n}k_0 + \frac{n_a}{n}\right)\bar\psi + \left(n_a - 2\lambda_{\mu_a}\right)\varphi_a + \frac{n_a}{n}\lambda_\mu \bar\varphi + \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\sigma^2_\nu, \tag{40}$$

where $\bar\varphi \approx \sigma^2_\mu$ is obtained jointly with $\sigma^2_\nu$ by equating $q_N$ and $q_T$ to their expected values, that is:

$$\begin{aligned}
E(q_N) &= (N + k_N - k_0 - 1)\,\sigma^2_u + (n - \lambda_\mu)\,\sigma^2_\mu + (N - \lambda_\nu)\,\sigma^2_\nu, \\
E(q_T) &= (T + k_T - k_0 - 1)\,\sigma^2_u + (T - \lambda_\mu)\,\sigma^2_\mu + (n - \lambda_\nu)\,\sigma^2_\nu,
\end{aligned} \tag{41}$$

with $k_N \equiv \mathrm{tr}[(X^T Q_\Delta X)^{-1} X^T \Delta_\mu \Delta_N^{-1} \Delta_\mu^T X]$ and $k_T \equiv \mathrm{tr}[(X^T Q_\Delta X)^{-1} X^T \Delta_\nu \Delta_T^{-1} \Delta_\nu^T X]$.

In the case heteroscedasticity is only on the individual-specific disturbance, the expected value of $q_{N_a}$ is obtained as follows:

$$E(q_{N_a}) = \left(N_a + k_{N_a} - k_{0_a} + \frac{n_a}{n}k_0 - \frac{n_a}{n}\right)\sigma^2_u + \left(n_a - 2\lambda_{\mu_a}\right)\varphi_a + \frac{n_a}{n}\lambda_\mu \bar\varphi + \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\sigma^2_\nu, \tag{42}$$

and, therefore,

$$\hat\varphi_a = \frac{q_{N_a} - \left(N_a + k_{N_a} - k_{0_a} + \frac{n_a}{n}k_0 - \frac{n_a}{n}\right)\hat\sigma^2_u - \frac{n_a}{n}\lambda_\mu\,\hat\sigma^2_\mu}{n_a - 2\lambda_{\mu_a}} + \frac{-\left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\hat\sigma^2_\nu}{n_a - 2\lambda_{\mu_a}}. \tag{43}$$

D.3 Expected values in the SUR systems case

Referring to the identities in (22), and considering the $n \times n$ matrix $M_m \equiv I_n - X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta$ (and then by definition $e_m = M_m y_m = M_m \varepsilon_m$ and $f_m f_j^T = E_n e_m e_j^T E_n = E_n M_m \Omega_{mj} M_j^T E_n$), the expected value of $q_{n_a,mj}$ is:

$$E(q_{n_a,mj}) = \mathrm{tr}\left(H_a Q_\Delta E_n M_m \Omega_{mj} M_j^T E_n Q_\Delta H_a^T\right) = (n_a - N_a - \tau_a)\,\psi_{a,mj} - (k_{a,m} + k_{a,j} - k_{a,mj})\,\bar\psi_{mj}, \tag{44}$$

where

$$k_{a,mj} \equiv \mathrm{tr}\left[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta H_a^T H_a Q_\Delta X_m\right]$$

and

$$k_{mj} \equiv \mathrm{tr}\left[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta X_m\right],$$

and $\bar\psi_{mj} \approx \sigma_{u,mj}$ is obtained by equating $q_{n,mj}$ to its expected value (see Platoni et al., 2012):

$$E(q_{n,mj}) = \left[n - N - (T - 1) - (k_m - 1) - (k_j - 1) + k_{mj}\right]\sigma_{u,mj}. \tag{45}$$

Moreover, the expected value of $q_{N_a,mj}$ is:

$$\begin{aligned}
E(q_{N_a,mj}) &= \mathrm{tr}\left(\bar J_{N_a} H_a E_n M_m \Omega_{mj} M_j^T E_n H_a^T\right) \\
&= \left(N_a - 2\frac{n_a}{n}\right)\psi_{a,mj} + \left(k_{N_a,mj} - k_{0_a,mj} + \frac{n_a}{n} k_{0,mj} + \frac{n_a}{n}\right)\bar\psi_{mj} \\
&\quad + \left(n_a - 2\lambda_{\mu_a}\right)\phi_{a,mj} + \frac{n_a}{n}\lambda_\mu \bar\phi_{mj} + \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\sigma_{\nu,mj},
\end{aligned} \tag{46}$$

where

$$k_{N_a,mj} \equiv \mathrm{tr}\left[(X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_{aj}^T \bar J_{N_a} X_{am}\right],$$

$$k_{0_a,mj} \equiv \frac{\iota_{N_a}^T X_{am} (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T \iota_n + \iota_n^T X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_{aj}^T \iota_{N_a}}{n},$$

$$k_{0,mj} \equiv \frac{\iota_n^T X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T Q_\Delta X_j (X_j^T Q_\Delta X_j)^{-1} X_j^T \iota_n}{n},$$

and $\bar\phi_{mj} \approx \sigma_{\mu,mj}$ is obtained jointly with $\sigma_{\nu,mj}$ by equating $q_{N,mj}$ and $q_{T,mj}$ to their expected values (see Platoni et al., 2012):

$$\begin{aligned}
E(q_{N,mj}) &= \left(N + k_{N,mj} - k_{0,mj} - 1\right)\sigma_{u,mj} + \left(n - \lambda_\mu\right)\sigma_{\mu,mj} + \left(N - \lambda_\nu\right)\sigma_{\nu,mj}, \\
E(q_{T,mj}) &= \left(T + k_{T,mj} - k_{0,mj} - 1\right)\sigma_{u,mj} + \left(T - \lambda_\mu\right)\sigma_{\mu,mj} + \left(n - \lambda_\nu\right)\sigma_{\nu,mj},
\end{aligned} \tag{47}$$

with $k_{N,mj} \equiv \mathrm{tr}\left[(X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T \Delta_\mu \Delta_N \Delta_\mu^T X_j\right]$ and $k_{T,mj} \equiv \mathrm{tr}\left[(X_j^T Q_\Delta X_j)^{-1} X_j^T Q_\Delta X_m (X_m^T Q_\Delta X_m)^{-1} X_m^T \Delta_\nu \Delta_T \Delta_\nu^T X_j\right]$.

In the case heteroscedasticity is only on the individual-specific disturbance, the expected value of $q_{N_a,mj}$ is obtained differently as:

$$\begin{aligned}
E(q_{N_a,mj}) &= \left(N_a + k_{N_a,mj} - k_{0_a,mj} + \frac{n_a}{n} k_{0,mj} - \frac{n_a}{n}\right)\sigma_{u,mj} \\
&\quad + \left(n_a - 2\lambda_{\mu_a}\right)\phi_{a,mj} + \frac{n_a}{n}\lambda_\mu \bar\phi_{mj} + \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\sigma_{\nu,mj}
\end{aligned} \tag{48}$$

and, therefore,

$$\hat\phi_{a,mj} = \frac{q_{N_a,mj} - \left(N_a + k_{N_a,mj} - k_{0_a,mj} + \frac{n_a}{n} k_{0,mj} - \frac{n_a}{n}\right)\hat\sigma_{u,mj}}{n_a - 2\lambda_{\mu_a}} + \frac{-\frac{n_a}{n}\lambda_\mu \hat\sigma_{\mu,mj} - \left(N_a - 2\lambda_{\nu_a} + \frac{n_a}{n}\lambda_\nu\right)\hat\sigma_{\nu,mj}}{n_a - 2\lambda_{\mu_a}}. \tag{49}$$
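The homoscedastic moment conditions (45) and (47) are linear in the three unknowns $\sigma_{u,mj}$, $\sigma_{\mu,mj}$ and $\sigma_{\nu,mj}$, so for a given pair of equations $(m, j)$ they can be solved in two steps: (45) gives $\hat\sigma_{u,mj}$ directly, and (47) then reduces to a $2 \times 2$ system. The following is a minimal sketch of that step, not the authors' code. The function names and the numerical inputs are hypothetical, the constants $k_{0,mj}$, $k_{N,mj}$, $k_{T,mj}$, $\lambda_\mu$ and $\lambda_\nu$ are assumed to have been computed beforehand, and the integer offsets in the coefficients (the terms $-1$) follow the Wansbeek and Kapteyn (1989) form of these conditions.

```python
def moment_coefficients(n, N, T, k_m, k_j, k_mj, k_N, k_T, k_0, lam_mu, lam_nu):
    """Coefficients of (sigma_u, sigma_mu, sigma_nu) in E(q_n), E(q_N), E(q_T).

    Hypothetical helper: all arguments are scalars for one (m, j) pair.
    """
    return [
        [n - N - (T - 1) - (k_m - 1) - (k_j - 1) + k_mj, 0.0, 0.0],  # (45)
        [N + k_N - k_0 - 1, n - lam_mu, N - lam_nu],                 # (47), q_N
        [T + k_T - k_0 - 1, T - lam_mu, n - lam_nu],                 # (47), q_T
    ]


def solve_components(q_n, q_N, q_T, coef):
    """Equate q_n, q_N, q_T to their expected values and solve for the sigmas."""
    (c_n, _, _), (a_u, a_mu, a_nu), (b_u, b_mu, b_nu) = coef
    sigma_u = q_n / c_n                      # from (45)
    r_N = q_N - a_u * sigma_u                # move the sigma_u terms of (47)
    r_T = q_T - b_u * sigma_u                # to the left-hand side
    det = a_mu * b_nu - a_nu * b_mu          # determinant of the 2x2 system
    sigma_mu = (r_N * b_nu - a_nu * r_T) / det
    sigma_nu = (a_mu * r_T - b_mu * r_N) / det
    return sigma_u, sigma_mu, sigma_nu


# Round-trip check with made-up numbers: build the q's from known sigmas,
# then recover the sigmas from the q's.
coef = moment_coefficients(n=100, N=20, T=6, k_m=3, k_j=4, k_mj=1.5,
                           k_N=2.0, k_T=1.0, k_0=0.5, lam_mu=5.2, lam_nu=17.0)
true_sigmas = (2.0, 3.0, 0.5)
q_n, q_N, q_T = (sum(c * s for c, s in zip(row, true_sigmas)) for row in coef)
print(solve_components(q_n, q_N, q_T, coef))
```

In the stratified case the same values $\hat\sigma_{u,mj}$, $\hat\sigma_{\mu,mj}$ and $\hat\sigma_{\nu,mj}$ would then enter the stratum-specific expression (49).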

APPENDIX E: TECHNICAL APPENDIX ON WB PROCEDURE

In case of heteroscedasticity only on the individual-specific disturbance the estimator is:

$$\hat\Phi_a = \frac{B^C_{\varepsilon_a} + \sum_{i \in I_a} \frac{T_i}{n} \sum_{j=1}^{N} \frac{T_j^2}{n}\,\hat\Sigma_\mu - \left(N_a - \sum_{i \in I_a} \frac{T_i}{n}\right)\hat\Sigma_u}{\sum_{i \in I_a} T_i}, \tag{50}$$

that would be an unbiased estimator of $\Phi_a$ if the $\varepsilon$'s were known.

Using the centered residuals from the W estimation, the expected value of the between individuals (co)variations is:

$$E\left(B^C_{f_a}\right) = \sum_{i \in I_a} T_i\,\Phi_a - \sum_{i \in I_a} \frac{T_i}{n} \sum_{j=1}^{N} \frac{T_j^2}{n}\,\bar\Phi + \left(N_a - \sum_{i \in I_a} \frac{T_i}{n}\right)\Sigma_u, \tag{51}$$

and therefore the estimator in (50), with $B^C_{f_a}$ instead of $B^C_{\varepsilon_a}$, is a consistent estimator of $\Phi_a$.
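Because the weights in (50) and (51) are scalars, the estimator can be applied entry by entry to the $(m, j)$ elements of the matrices involved. The sketch below illustrates this for a single entry. It is only an illustration under stated assumptions: the function names are hypothetical, the inner sum is taken to be $\sum_j T_j^2 / n$ (so that it coincides with $\lambda_\mu$ in (49)), and $\hat\Sigma_\mu$ is used as the estimate of $\bar\Phi$.

```python
def phi_hat_entry(B_a, T_a, T_all, sigma_mu_hat, sigma_u_hat):
    """One (m, j) entry of the stratum estimator in (50).

    B_a          : between-individual (co)variation for stratum a
    T_a, T_all   : numbers of periods T_i for stratum a and for all individuals
    sigma_mu_hat : (m, j) entry of the estimated Sigma_mu
    sigma_u_hat  : (m, j) entry of the estimated Sigma_u
    """
    n = sum(T_all)                            # total number of observations
    n_a = sum(T_a)                            # observations in stratum a
    N_a = len(T_a)                            # individuals in stratum a
    lam_mu = sum(T * T for T in T_all) / n    # assumed inner sum: sum_j T_j^2 / n
    share = n_a / n                           # sum over i in I_a of T_i / n
    return (B_a + share * lam_mu * sigma_mu_hat
            - (N_a - share) * sigma_u_hat) / n_a


def expected_between(phi_a, phi_bar, sigma_u, T_a, T_all):
    """Expected between-individual (co)variation as in (51), same assumptions."""
    n = sum(T_all)
    n_a = sum(T_a)
    lam_mu = sum(T * T for T in T_all) / n
    share = n_a / n
    return n_a * phi_a - share * lam_mu * phi_bar + (len(T_a) - share) * sigma_u


# Made-up unbalanced panel: six individuals, the first three form stratum a.
T_all = [3, 4, 5, 6, 2, 4]
T_a = T_all[:3]
B_a = expected_between(phi_a=1.8, phi_bar=1.2, sigma_u=0.7, T_a=T_a, T_all=T_all)
print(phi_hat_entry(B_a, T_a, T_all, sigma_mu_hat=1.2, sigma_u_hat=0.7))
```

Plugging the expected value (51) into (50) with the true $\bar\Phi$ and $\Sigma_u$ returns $\Phi_a$, which is the sense in which (50) recovers the stratum-specific (co)variance.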

APPENDIX F: ADDITIONAL TABLES

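Due to the space limit it would be impossible (and unnecessary) to display 480 variance-covariance matrices as done in Table 1 for the single-equation case. Table 7 displays the estimated variances-covariances for a single stratum $a$.

TABLE 7. Simulation results on two-way SUR systems: estimated variances-covariances $\hat\psi_{a,mj}$ and $\hat\phi_{a,mj}$

First panel ($N$ = …, $T = 12$, and $n$ = …)

| $\lambda$ | $mj$ | $\psi_{a,mj}$ (true) | $\phi_{a,mj}$ (true) | $\hat\psi_{a,mj}$ | $\hat\phi_{a,mj}$ on $\hat\sigma_{u,mj}$ | $\hat\phi_{a,mj}$ on $\hat\psi_{a,mj}$ | $\hat\psi_{a,mj}$ | $\hat\phi_{a,mj}$ on $\hat\sigma_{u,mj}$ | $\hat\phi_{a,mj}$ on $\hat\psi_{a,mj}$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11 | 6.544 | 9.377 | 6.547 | 9.373 | 9.372 | 7.338 | 9.625 | 9.735 |
| 0 | 12 | 0.738 | -1.048 | 0.741 | -1.079 | -1.079 | 0.805 | -1.031 | -1.019 |
| 0 | 13 | 0.881 | 1.276 | 0.872 | 1.212 | 1.214 | 0.758 | 1.141 | 1.124 |
| 0 | 22 | 6.039 | 6.488 | 6.038 | 6.534 | 6.535 | 6.802 | 6.813 | 6.924 |
| 0 | 23 | -1.232 | 0.710 | -1.235 | 0.730 | 0.729 | -1.072 | 0.791 | 0.812 |
| 0 | 33 | 9.489 | 6.207 | 9.434 | 6.156 | 6.166 | 10.570 | 6.610 | 6.783 |
| 1 | 11 | 41.271 | 59.138 | 41.325 | 58.882 | 59.093 | 42.697 | 58.738 | 59.075 |
| 1 | 12 | 4.654 | -6.609 | 4.681 | -6.784 | -6.756 | 4.713 | -6.682 | -6.644 |
| 1 | 13 | 5.556 | 8.047 | 5.499 | 7.606 | 7.644 | 5.454 | 7.480 | 7.504 |
| 1 | 22 | 38.086 | 40.918 | 38.108 | 40.964 | 41.173 | 39.251 | 41.011 | 41.343 |
| 1 | 23 | -7.770 | 4.478 | -7.794 | 4.643 | 4.599 | -7.609 | 4.663 | 4.638 |
| 1 | 33 | 59.844 | 39.146 | 59.545 | 38.501 | 38.873 | 61.184 | 38.723 | 39.274 |
| 2 | 11 | 105.914 | 151.765 | 106.110 | 150.610 | 151.619 | 108.699 | 149.776 | 150.938 |
| 2 | 12 | 11.944 | -16.962 | 12.021 | -17.445 | -17.319 | 11.993 | -17.245 | -17.112 |
| 2 | 13 | 14.259 | 20.652 | 14.120 | 19.452 | 19.614 | 14.219 | 19.232 | 19.384 |
| 2 | 22 | 97.740 | 105.007 | 97.849 | 104.667 | 105.633 | 99.808 | 104.325 | 105.437 |
| 2 | 23 | -19.940 | 11.491 | -20.012 | 12.023 | 11.821 | -19.786 | 11.966 | 11.779 |
| 2 | 33 | 153.578 | 100.459 | 152.893 | 98.094 | 99.722 | 155.594 | 97.926 | 99.757 |

Second panel ($N$ = …, $T = 12$, and $n$ = …)

| $\lambda$ | $mj$ | $\psi_{a,mj}$ (true) | $\phi_{a,mj}$ (true) | $\hat\psi_{a,mj}$ | $\hat\phi_{a,mj}$ on $\hat\sigma_{u,mj}$ | $\hat\phi_{a,mj}$ on $\hat\psi_{a,mj}$ | $\hat\psi_{a,mj}$ | $\hat\phi_{a,mj}$ on $\hat\sigma_{u,mj}$ | $\hat\phi_{a,mj}$ on $\hat\psi_{a,mj}$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11 | 6.544 | 9.377 | 6.563 | 9.414 | 9.412 | 7.293 | 9.708 | 9.816 |
| 0 | 12 | 0.738 | -1.048 | 0.735 | -1.045 | -1.044 | 0.806 | -1.003 | -0.992 |
| 0 | 13 | 0.881 | 1.276 | 0.884 | 1.315 | 1.315 | 0.765 | 1.249 | 1.230 |
| 0 | 22 | 6.039 | 6.488 | 6.032 | 6.525 | 6.527 | 6.754 | 6.829 | 6.942 |
| 0 | 23 | -1.232 | 0.710 | -1.213 | 0.746 | 0.743 | -1.060 | 0.810 | 0.831 |
| 0 | 33 | 9.489 | 6.207 | 9.490 | 6.249 | 6.248 | 10.557 | 6.725 | 6.890 |
| 1 | 11 | 41.265 | 59.129 | 41.401 | 59.116 | 59.331 | 42.418 | 59.208 | 59.541 |
| 1 | 12 | 4.654 | -6.608 | 4.638 | -6.630 | -6.597 | 4.684 | -6.561 | -6.517 |
| 1 | 13 | 5.555 | 8.046 | 5.573 | 8.170 | 8.198 | 5.480 | 8.076 | 8.085 |
| 1 | 22 | 38.081 | 40.912 | 38.048 | 40.855 | 41.077 | 38.962 | 41.042 | 41.382 |
| 1 | 23 | -7.769 | 4.477 | -7.655 | 4.683 | 4.624 | -7.489 | 4.725 | 4.688 |
| 1 | 33 | 59.836 | 39.140 | 59.857 | 39.049 | 39.377 | 61.157 | 39.405 | 39.910 |
| 2 | 11 | 105.884 | 151.722 | 106.262 | 151.181 | 152.214 | 107.884 | 150.922 | 152.084 |
| 2 | 12 | 11.941 | -16.957 | 11.903 | -17.074 | -16.934 | 11.903 | -16.954 | -16.804 |
| 2 | 13 | 14.255 | 20.646 | 14.300 | 20.841 | 20.974 | 14.268 | 20.697 | 20.814 |
| 2 | 22 | 97.713 | 104.977 | 97.656 | 104.342 | 105.354 | 98.982 | 104.333 | 105.473 |
| 2 | 23 | -19.934 | 11.488 | -19.647 | 12.075 | 11.835 | -19.464 | 12.077 | 11.857 |
| 2 | 33 | 153.534 | 100.431 | 153.627 | 99.466 | 101.008 | 155.420 | 99.621 | 101.356 |

Note: $\psi_{a,mj}$ and $\phi_{a,mj}$ are the true values of the variances-covariances, $\hat\psi_{a,mj}$ are the estimated variances-covariances of the remainder error $u_{it}$, and $\hat\phi_{a,mj}$ are the estimated variances-covariances of the individual-specific error $\mu_i$, computed on the basis of a remainder error either homoscedastic ($\hat\sigma_{u,mj}$) or heteroscedastic ($\hat\psi_{a,mj}$).

REFERENCES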

Akaike, H. (1974), “A new look at the statistical model identification.” IEEE Transactions on Automatic Control, 19 (6), 716-723.

Arellano, M. (1987), “Computing robust standard errors for within groups estimators.” Oxford Bulletin of Economics and Statistics, 49 (4), 431-434.

Baltagi, B. H. (1980), “On seemingly unrelated regressions with error components.” Econometrica, 48 (6), 1547-1551.

Baltagi, B. H. (1981), “Pooling: an experimental study of alternative testing and estimation procedures in a two-way error component model.” Journal of Econometrics, 17 (1), 21-49.

Baltagi, B. H. (1985), “Pooling cross-sections with unequal time series lengths.” Economics Letters, 18 (2-3), 133-136.

Baltagi, B. H. (1988), “An alternative heteroscedastic error components model (problem 88.2.2.).” Econometric Theory, 4 (2), 349-350.

Baltagi, B. H. (2013). Econometric Analysis of Panel Data, 5th edition. Wiley and Sons, Chichester (UK).

Baltagi, B. H., G. Bresson, and A. Pirotte (2005), “Adaptive estimation of heteroskedastic error component models.” Econometric Reviews, 24 (1), 39-58.

Baltagi, B. H., G. Bresson, and A. Pirotte (2006), “Joint LM test for homoskedasticity in a one-way error component model.” Journal of Econometrics, 134 (2), 401-417.

Baltagi, B. H. and J. M. Griffin (1988), “A generalized error component model with heteroscedastic disturbances.” International Economic Review, 29 (4), 745-753.

Bester, C. A. and C. B. Hansen (2016), “Grouped effects estimators in fixed effects models.” Journal of Econometrics, 190 (1), 197-208.

Biørn, E. (1981), “Estimating economic relations from incomplete cross-section/time-series data.” Journal of Econometrics, 16 (2), 221-236.

Biørn, E. (2004), “Regression systems for unbalanced panel data: a stepwise maximum likelihood procedure.” Journal of Econometrics, 122 (2), 281-291.

Bresson, G., C. Hsiao, and A. Pirotte (2006), “Heteroskedasticity and random coefficient model on panel data.” Working Papers ERMES, No. 0601, 51 p.

Bresson, G., C. Hsiao, and A. Pirotte (2011), “Assessing the contribution of R&D to total factor productivity: a Bayesian approach to account for heterogeneity and heteroskedasticity.” Advances in Statistical Analysis, 95 (4), 435-452.

Breusch, T. S. and A. R. Pagan (1979), “A simple test for heteroscedasticity and random coefficient variation.” Econometrica, 47 (5), 1287-1294.

Chib, S. (2008), “Panel data modeling and inference: a Bayesian primer.” In The Econometrics of Panel Data (Mátyás, L. and P. Sevestre, eds.), book series Advanced Studies in Theoretical and Applied Econometrics, Vol. 46, Chapter 15, 479-515, 3rd edition. Springer-Verlag, Berlin (GE).

Davis, P. (2002), “Estimating multi-way error components models with unbalanced data structures.” Journal of Econometrics, 106 (1), 67-95.

Hsiao, C. and M. H. Pesaran (2004), “Random coefficient panel data models.” IZA Discussion Paper Series, No. 1236, 39 p.

Lejeune, B. (1996), “A full heteroscedastic one-way error components model for incomplete panel: Maximum likelihood estimation and Lagrange multiplier testing.” CORE Discussion Paper, Université Catholique de Louvain, No. 1996/006, 28 p.

Lejeune, B. (2004), “A full heteroscedastic one-way error components model allowing for unbalanced panel: pseudo-maximum likelihood estimation and specification testing.” CORE Discussion Paper, Université Catholique de Louvain, No. 2004/76, 37 p.

Li, Q. and T. Stengos (1994), “Adaptive estimation in the panel data error component model with heteroskedasticity of unknown form.” International Economic Review, 35 (4), 981-1000.

Magnus, J. R. (1982), “Multivariate error components analysis of linear and non-linear regression models by maximum likelihood.” Journal of Econometrics, 19 (2-3), 239-285.

Mazodier, P. and A. Trognon (1978), “Heteroscedasticity and stratification in error components models.” Annales de l’INSEE, 30-31, 451-482.

Nerlove, M. (1971), “Further evidence on the estimation of dynamic economic relations from a time series of cross sections.” Econometrica, 39 (2), 359-382.

Neyman, J. and E. L. Scott (1948), “Consistent estimates based on partially consistent observations.” Econometrica, 16 (1), 1-32.

Phillips, R. F. (2003), “Estimation of a stratified error-components model.” International Economic Review, 44 (2), 501-521.

Platoni, S., P. Sckokai, and D. Moro (2012), “A note on two-way ECM estimation of SUR systems on unbalanced panel data.” Econometric Reviews, 31 (2), 119-141.

Randolph, W. C. (1988), “A transformation for heteroscedastic error components regression models.” Economics Letters, 27 (4), 349-354.

Rao, P. S. R. S., J. Kaplan, and W. C. Cochran (1981), “Estimators for the one-way random effects model with unequal error variances.” Journal of the American Statistical Association, 76 (373), 89-97.

Roy, N. (2002), “Is adaptive estimation useful for panel models with heteroskedasticity in the individual specific error component? Some Monte Carlo evidence.” Econometric Reviews, 21 (2), 189-203.

Verbon, H. A. A. (1980), “Testing for heteroscedasticity in a model of seemingly unrelated regression equations with variance component.” Economics Letters, 5 (2), 149-153.

Wang, H.-J. and C.-W. Ho (2010), “Estimating fixed-effect panel stochastic frontier models by model transformation.” Journal of Econometrics, 157 (2), 286-296.

Wansbeek, T. (1989), “An alternative heteroscedastic error components model (problem 88.2.2.).” Econometric Theory, 5 (2), 326.

Wansbeek, T. and A. Kapteyn (1989), “Estimation of the error-components model with incomplete panels.” Journal of Econometrics, 41 (3), 341-361.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data, 2nd