Identification of Time-Varying Transformation Models with Fixed Effects, with an Application to Unobserved Heterogeneity in Resource Shares
Intertemporal Collective Household Models: Identification in Short Panels with Unobserved Heterogeneity in Resource Shares
Irene Botosaru, Chris Muris, and Krishna Pendakur ∗ August 14, 2020
Abstract
We provide a new full-commitment intertemporal collective household model to estimate resource shares, defined as the fraction of household expenditure enjoyed by household members. Our model implies nonlinear time-varying household quantity demand functions that depend on fixed effects.
We provide new econometric results showing identification of a large class of models that includes our household model. We cover fixed-$T$ panel models where the response variable is an unknown monotonic function of a linear latent variable with fixed effects, regressors, and a nonparametric error term. The function may be weakly monotonic and time-varying, and the fixed effects are unrestricted. We identify the structural parameters and features of the distribution of fixed effects. In our household model, these correspond to features of the distribution of resource shares.
Using Bangladeshi data, we show: women’s resource shares decline with household budgets; and, half the variation in women’s resource shares is due to unobserved heterogeneity.
∗ Email: [email protected], [email protected], [email protected]. We would like to thank Stéphane Bonhomme, P.A. Chiappori, Ian Crawford, Iván Fernández-Val, Dalia Ghanem, Hiro Kasahara, Shakeeb Khan, Valerie Lechene, Arthur Lewbel, Corrine Low, Maurizio Mazzocco, Jack Porter and Elie Tamer for useful discussions and suggestions; conference participants at the ASSA 2018, Berkeley-Stanford Jamboree 2017, IWEEE 2018, Seattle-Vancouver Econometrics Conference 2016 and 2017, and seminar participants at Cornell, CREST, Harvard, Oxford, Penn State, Queen’s University, Rotterdam, Tilburg, University of Bristol, TSE, UBC, UCL, UC Davis, UCLA, UPenn and Vanderbilt for comments and suggestions. We thank Shirleen Manzur for her research assistance on this project, and gratefully acknowledge the financial support of SSHRC under grants IDG 430-2015-00073 and IG 435-2020-0026 and of the Institute for Advanced Study at the University of Bristol.
JEL codes: C14; C23; C41
Keywords: panel data, fixed effects, incidental parameter, time-varying transformation function, collective household, full commitment, resource shares, gender inequality
Introduction
Standard poverty and inequality measures, based on per-capita household income or expenditure, assume that household resources are distributed equally across household members. These may be misleading if some members, such as women, have poor access to household resources. We study resource shares, defined as the fraction of the total expenditure of a household consumed by one of its members. Resource shares are not directly observable, but are important because unequal resource shares across household members signal within-household inequality. In this paper, we provide a new intertemporal collective household model that permits the use of short panel data to estimate resource shares within households.
Compared to many cross-sectional approaches, resource shares in our model may be arbitrarily correlated with observed variables, such as the household budget, and may depend on unobserved household-level heterogeneity, such as unobserved bargaining power shifters. Our household model embeds resource shares inside quantity demand functions, where they appear as fixed effects. The nonlinear quantity demand functions in our model are time-varying because quantity demands depend on prices, and prices are not observed but do vary across the waves of our panel.
We show point-identification of a general class of fixed-$T$ time-varying nonlinear panel models, which includes our household model. This class has a response variable equal to a time-varying weakly monotonic transformation function of a linear index of regressors, fixed effects, and error terms. In contrast, almost all existing results for this class of models require time-invariant transformation functions. Our theorems imply novel identification results for time-varying versions of some commonly used models, e.g., the time-varying ordered logit and multiple-spell GAFT models.
We point-identify regression coefficients, transformation functions, and the mean (up to location) and variance of the distribution of fixed effects. 
The latter two are of specific relevance to our collective household model.
Ours are the first empirical estimates of a full-commitment collective household model in a short panel, and we demonstrate the importance of accounting for both observed and unobserved heterogeneity in women’s resource shares. Using a two-period Bangladeshi panel dataset on household expenditures, we show that less than half of the variation in women’s resource shares can be explained by observed covariates. This means that there is much more inequality within households than previously thought. We also find that women’s resource shares are negatively correlated with household budgets, so that women in poorer households have larger resource shares. This means that women face less economic inequality than would be revealed by looking at the distribution of their household budgets.
Our micro-economic contribution begins with a new model of an efficient full-commitment intertemporal collective (FIC) household that builds on Browning et al. (2013) and Chiappori and Mazzocco (2017). Collective household models posit that household behavior is driven by the preferences of the individuals who comprise the household. In efficient models, the individuals together reach the Pareto frontier. In full-commitment intertemporal models, the individuals in the household face uncertainty, and are able to insure each other against risk by making state-contingent binding commitments over future actions. We show that quantity demand equations in efficient FIC household models must in general depend on this initial commitment, a time-invariant feature of the household.
We then give a parametric model of individual utilities in our FIC household model that delivers quantity demand equations. We model the demand for food by adult women in the household. These demand equations are time-varying monotonic nonlinear functions of a linear index of fixed effects, logged household budgets, and an error term. 
Here, the fixed effects have an interpretation: they equal the log of the resource share of the woman in the household. They may be correlated with household budgets, and depend on household-level unobserved heterogeneity.
Many strategies to identify resource shares with cross-sectional data require that they are conditionally independent of the household budget. We relax that restriction, and find evidence that resource shares exhibit a slight negative conditional dependence on household budgets.
Our econometric contribution is to provide sufficient conditions for point-identification in a large class of models, with the outcome $Y_{it}$ given by:
\[ Y_{it} = h_t(\alpha_i + X_{it}\beta - U_{it}) \tag{1.1} \]
for $i = 1, \ldots, n$ and time periods $t = 1, \ldots, T$ (with $T \geq 2$), where $h_t$ is an unknown time-varying monotonic function, $\alpha_i$ are fixed effects, $X_{it}$ is a vector of regressors with coefficients $\beta$, and $U_{it}$ is an error term drawn from a stationary distribution.
Our setting has the following four features:
1. a fixed-$T$ setting; in fact $T = 2$ is sufficient for our results;
2. the monotonic transformation function $h_t$ can be time-varying and weakly monotonic;
3. the functions $h_t$, and the distribution of the error term $U_{it}$, may be nonparametric;
4. and, the fixed effects $\alpha_i$ are unrestricted.
We provide sufficient conditions for the identification of $h_t$ and $\beta$ for two cases, where the error term $U_{it}$ is distributed as logistic and where it follows an unknown distribution. The panel model literature is very large, with many papers showing identification of $\beta$ (and sometimes $h_t$) in models with two or three of these features. However, ours is the first paper to cover all four features. An immediate implication of our work is that extensions to time-varying and/or nonparametric counterparts of well-known models, such as ordered choice, censored regression, and duration modeling, can now be shown to be identified. 
And, of course, our collective household model is in the class and therefore identified.
For the case where $h_t$ is strictly monotonic we provide additional results, identifying the conditional mean (up to location) and conditional variance of the distribution of fixed effects. This relates directly to our microeconomic model, because in that model fixed effects have a clear economic interpretation: they are logged resource shares. To the best of our knowledge, there are no results in the fixed-$T$, fixed-effects, nonlinear panel literature that cover this aspect of model identification.
In particular, we show identification of the response of the conditional mean of $\alpha_i$ to observed covariates, and provide additional sufficient conditions for the identification of the conditional variance of $\alpha_i$. The former corresponds to identification of coefficients in the regression of logged resource shares on covariates; the latter corresponds to identification of inequality in resource shares.
Section 2 provides a review of the related literature. In Sections 3 and 4, we provide our main identification results. Section 5 introduces our collective household model, which uses data described in Section 6. We present estimates of women’s resource shares in Section 7. All proofs, descriptive statistics for the data, additional robustness results and estimation details are in the Appendix.
Existing Literature
Following the terminology in Abrevaya (1999), we refer to the model in this paper as the fixed-effects linear transformation (FELT) model. Note that we can express the outcome equation in (1.1) using the latent variable notation
\[ Y_{it} = h_t(Y^*_{it}) = h_t(\alpha_i + X_{it}\beta - U_{it}), \tag{2.1} \]
where $Y^*_{it}$ is a latent variable, $\alpha_i$ are fixed effects and $U_{it}$ are stationary errors.
Because $h_t$ can be weakly monotonic, FELT includes many previously studied discrete choice models, such as binary choice, ordered choice and censored models. When $h_t$ is strictly monotonic, it covers other previously studied models, such as duration and Box-Cox regression models. As described below, our results provide identification of some extensions of these models that were not previously shown to be identified. Since $h_t$ can be time-varying, our framework generalizes typical discrete- and continuous-choice models where the transformation function $h_t$ is fixed over time.
Our approach builds on classic results in binary choice, and extends those results to the entire FELT class. FELT nests binary choice models with time effects as a special case when $h_t(Y^*_{it}) = 1\{Y^*_{it} \geq \lambda_t\}$. In these models, the parameter $\beta$ is known to be identified for $U_{it}$ parametric or nonparametric, see, e.g., Rasch (1960), Chamberlain (1980), Manski (1987), Magnac (2004), and Chamberlain (2010). We use the insights of Chamberlain (1980) and Manski (1987) about binary choice models where the error $U_{it}$ is logistic or nonparametric, respectively, to show identification of all models nested in FELT.
Our theoretical work connects two sets of classic results with two new contributions. The first established result we invoke comes from the cross-sectional work of Doksum and Gasko (1990) and Chen (2002) that shows that transformation models can in general be binarized into a set of related binary choice models. 
The second established result we invoke comes from Chamberlain (1980) and Manski (1987) who show that fixed effects binary choice models are identified.
We begin by showing that we can binarize in a panel data setting, even if the transformation functions $h_t$ vary with time. Given the results of Chamberlain (1980) and Manski (1987), this yields identification of $\beta$ in the FELT model. Our first contribution is to show that we can re-assemble the identified binarized models to obtain identification of the transformation functions $h_t$ in the FELT model. Our second contribution is specific to the case where $h_t$ is strictly monotonic. Here, we derive sufficient conditions for the identification of the conditional variance of fixed effects, and for the response of the conditional mean of fixed effects to observed covariates.
Footnote: This setting excludes some important types of models: dynamic models (e.g., Honoré and Kyriazidou (2000), Aguirregabiria et al. (forthcoming), Khan et al. (2020)); and models of multinomial choice (e.g., Shi et al. (2018)).
Fixed-$T$ Nonlinear Panel Models with Fixed Effects
The literature on panel data methods is vast. There are excellent reviews of the literature, e.g., Arellano and Honoré (2001), Arellano (2003), and Arellano and Bonhomme (2011). Despite the vastness of this literature, we are not aware of any paper that delivers all four features discussed above, demanded by our empirical application. We outline connections to the literature in the context of our list of four model features. Below, we highlight the key differences between our approach and approaches in the literature that lack one or more of our key features.
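The conditioning idea behind the logistic binary choice results cited above (Chamberlain (1980)) can be illustrated numerically. The sketch below is only an illustration, not the paper's estimator. It assumes a two-period binary choice model $D_t = 1\{\alpha + X_t\beta - U_t \geq \lambda_t\}$ with independent standard logistic errors. The function names and the numerical values are hypothetical. It computes the probability of switching from 0 to 1, conditional on exactly one success, and shows that the fixed effect $\alpha$ drops out.

```python
import math


def logistic_cdf(v):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-v))


def switch_probability(alpha, x1, x2, beta, lam1, lam2):
    """P(D_2 = 1 | D_1 + D_2 = 1) in a two-period logit with time effects.

    D_t = 1{alpha + x_t * beta - U_t >= lam_t}, with U_1 and U_2 independent
    standard logistic (an assumption of this sketch).
    """
    p1 = logistic_cdf(alpha + x1 * beta - lam1)  # P(D_1 = 1)
    p2 = logistic_cdf(alpha + x2 * beta - lam2)  # P(D_2 = 1)
    up = (1.0 - p1) * p2      # P(D_1 = 0, D_2 = 1)
    down = p1 * (1.0 - p2)    # P(D_1 = 1, D_2 = 0)
    return up / (up + down)


# The conditional probability equals Lambda((x2 - x1) * beta - (lam2 - lam1))
# for every value of the fixed effect alpha.
closed_form = logistic_cdf((1.5 - 0.5) * 0.8 - (-0.4 - 0.1))
for alpha in (-2.0, 0.0, 3.0):
    print(alpha, switch_probability(alpha, 0.5, 1.5, 0.8, 0.1, -0.4), closed_form)
```

Because the conditional probability depends only on differences in regressors and thresholds, it carries information about $\beta$ and $\lambda_2 - \lambda_1$ without any restriction on $\alpha$.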
Feature 1: We show identification in fixed-$T$ panel models. The incidental parameter problem occurs in fixed-effect panel models with a finite number of time periods, see Neyman and Scott (1948). Essentially, the problem arises from the fact that the $n$ fixed effects $\alpha_i$ cannot be consistently estimated if $T$ does not tend to infinity. Thus, identification of the parameters common across individuals must be shown in a context where the incidental parameters $\alpha_i$ cannot be identified or consistently estimated.
Ours is a fixed-$T$ approach, with $n \to \infty$, and works even if $T = 2$. A large literature analyzes the behavior of fixed effects procedures under the alternative assumption that the number of time periods goes to infinity, e.g., Hahn and Newey (2004), Arellano and Hahn (2007), Arellano and Bonhomme (2009), Fernández-Val (2009), Fernández-Val and Weidner (2016), and Chernozhukov et al. (2018). In this setting, it is generally possible to identify each fixed effect, and consequently, the distribution of fixed effects. In our model, we show identification of specific moments of this distribution even though the number of time periods is fixed.
Feature 2: We allow weakly monotonic time-varying transformation functions $h_t$. Abrevaya (1999) provides a consistent estimator of $\beta$ (the “leapfrog” estimator) in the FELT model under the restriction that the transformation functions are strictly monotonic. However, because he differences out the transformation functions $h_t$, his focus does not extend to their identification. Athey and Imbens (2006) propose a “changes in changes” estimator, which is a generalization of the linear differences-in-differences estimator, in both a cross-sectional and a panel data setting. Their panel data fixed-effects setting is a potential outcomes analog to our model with strictly monotonic transformations. They show identification of the average treatment effect, but not identification of $h_t$ or the distribution of $\alpha_i$. In comparison, we cover the weakly monotone case, and identify the transformation function $h_t$. In the strictly monotone case, we additionally identify moments of the distribution of fixed effects $\alpha_i$.
Abrevaya (2000) considers a model that allows for weak monotonicity but restricts the transformation functions to be time-invariant (and allows for nonseparable errors). He provides a consistent estimator for $\beta$ only. A literature on duration models also considers time-invariant transformation functions that are weakly monotonic due to censoring, e.g., Lee (2008), Khan and Tamer (2007), Chen (2010a,b), Chen (2012), and Chen and Zhou (2012); we review this below.
A more recent literature has focused on identification issues in a class of panel models with potentially non-monotonic but time-invariant structural functions (or strong assumptions on how those functions vary over time), e.g., Hoderlein and White (2012), Chernozhukov et al. (2013), Chernozhukov et al. (2015). These papers focus on (partial) identification of partial effects. But, because they don’t impose monotonicity, these approaches preclude identification of the structural function(s) or of the distribution of fixed effects.
Feature 3: We allow nonparametric transformation functions and nonparametric errors. Bonhomme (2012) proposes a general-purpose likelihood-based approach to obtain identification for models with parametric $h_t$ and parametric $U_{it}$, even allowing for dynamics. These results exploit the fact that a likelihood function can be constructed for such models and show identification in the presence of fixed effects in a fixed-$T$ setting. (Earlier work for the same setting by Lancaster (2002) requires $T \to \infty$, sacrificing feature 1.) Our model requires strictly exogenous regressors, precluding many dynamic structures. But, our Theorems 1 and 2 apply even when $h_t$ is nonparametric, $U_{it}$ is nonparametric, or both are nonparametric.
Footnote: Chen (2010c) in Remark 6 discusses a version of Abrevaya (1999) that allows for some weak monotonicity due to censoring. He focuses on $\beta$ and does not discuss identification of $h_t$, although his Remark 1 sketches an approach for estimation of $h_t = h$ for all $t$.
Footnote: Chernozhukov et al. (2018) use a distribution regression technique that is closely related to our binarization approach, and consequently accommodates weakly monotonic transformation functions. However, theirs is a large-$T$ setting.
The setting with parametric transformation functions and parametric errors covers many models previously shown to be identified, including the time-invariant fixed-effect panel versions of: binary choice (e.g., Rasch (1960), Chamberlain (1980), Magnac (2004), and Chamberlain (2010)); the linear regression model with normal errors; and the ordered logit model (e.g., Das and van Soest (1999); Baetschmann et al. (2015); Muris (2017)). Application of our results immediately shows identification of the time-varying versions of these models. 
This result is novel for the ordered logit model, where our results imply identification of time-varying thresholds.
Parametric transformation models with nonparametric errors are widely studied, starting with Manski (1987) for the binary choice fixed effects model. (Aristodemou (forthcoming) provides partial identification results for ordered choice with nonparametric errors.) Parametric panel data censored regression models also fit into our framework, and were studied intensively starting with Honoré (1992) (e.g., Charlier et al. (2000), Honoré and Kyriazidou (2000), Chen (2012)). These papers show identification of the regression coefficient $\beta$ for the linear model with time-invariant censoring and nonparametric errors. In this context, our results show identification of models that were not previously known to be identified. In particular, the model is identified even if the transformation function is nonparametric (as opposed to linear or Box-Cox) and time-varying and/or where the censoring cutoff is time-varying.
Footnote: Many papers in the literature on censored regression have focused on endogeneity. For example, Honoré and Hu (2004) allow for endogenous covariates, and Khan et al. (2016) study the case of endogenous censoring cutoffs. Our results do not cover the case of endogenous regressors or cutoffs. Horowitz and Lee (2004) and Lee (2008) consider dependent censoring, where the censoring cutoff depends on observed covariates and the error term follows a parametric distribution. We do not consider dependent censoring.
Duration models can be recast as transformation models like ours, with nonparametric transformation functions (see Ridder (1990)). Consequently, the large literature on identification of duration models is related to our work. A very common feature in this literature is the use of error terms following the type 1 extreme value distribution (EV1).
Consider the multiple-spell mixed proportional hazards (MPH) model with spell-specific baseline hazards. This model can be obtained from FELT by letting (i) $h_t^{-}(v) = \log\left\{\int_0^v \lambda_t(u)\,du\right\}$, where $\lambda_t$ is the baseline hazard for spell $t$, (ii) $\alpha_i$ and $U_{it}$ are independent across $t$, and (iii) $U_{it}$ is independent of $X_i$ and distributed as EV1. Honoré (1993) derives sufficient conditions for the identification of this model (Lee (2008) provides consistent estimators under other parametric error distributions). Our theorems immediately provide the novel result that this model is identified when the error terms are drawn from a nonparametric distribution.
Consider the single-spell generalized accelerated failure time (GAFT) model introduced by Ridder (1990) (see also van den Berg (2001)) that has non-EV1 errors, and is consistent with a duration model. Just like the MPH model, it can be extended to a multiple-spell setting (e.g., Evdokimov (2011)). Abrevaya (1999) shows that the common parameter vector $\beta$ in the multiple-spell GAFT model can be consistently estimated. However, he does not show identification of the transformation function $h_t$, which can be seen as dual to identification of the spell-specific baseline hazard function. Evdokimov (2011) considers identification of a related version of the multiple-spell GAFT with spell-specific baseline hazard, but he requires continuity of $\alpha_i$ and at least 3 spells ($T \geq 3$). Our results identify $\beta$ and $h_t$ in the multiple-spell GAFT model, imposing no restrictions on $\alpha_i$ and requiring just 2 spells ($T = 2$).
Feature 4: We allow for unrestricted fixed effects.
This contrasts with identification strategies based on special regressors and with the literature on the identification of correlated random effects models. Special regressor approaches (see the review in Lewbel (2014)) have identifying power in transformation models with fixed effects. They require the availability of a continuous variable that is independent of the fixed effects. With such a variable, one can show identification of transformation models in the cross-sectional case (Chiappori et al. (2015)) and in the panel data case, e.g., Honoré and Lewbel (2002), Ai and Gan (2010), Lewbel and Yang (2016), Chen et al. (2019). Our results do not invoke a special regressor. Further, we are not aware of any special regressor-based papers that identify time-varying transformation functions.
Footnote: Horowitz and Lee (2004) show identification of this model under the restriction that the baseline hazard is the same for all spells, analogous to time-invariant $h_t$. Chen (2010b) considers the same model and relaxes the restriction that errors are type 1 EV, but shows identification of only the common parameter vector $\beta$.
Footnote: Khan and Tamer (2007) establish consistency of an estimator of the regression coefficient in GAFT under the restriction that the baseline hazard is the same for all spells, analogous to time-invariant $h_t$.
A related literature considers restrictions on the joint distribution of $(\alpha_i, X_{i1}, \ldots, X_{iT})$. For example, Altonji and Matzkin (2005) impose exchangeability on this joint distribution, and Bester and Hansen (2009) restrict the dependence of $\alpha_i$ on $(X_{i1}, \ldots, X_{iT})$ to be finite-dimensional. 
In our model this joint distribution is unrestricted.
A further group of papers establishes identification of panel models, including the distribution of $\alpha_i$, by using techniques from the measurement error literature that: (i) impose various assumptions on $\alpha_i$, such as full support and/or continuous distribution; (ii) assume serial independence of $U_{it}$; and (iii) restrict the conditional distribution of $(\alpha_i, U_{i1}, \ldots, U_{iT})$ conditional on $(X_{i1}, \ldots, X_{iT})$, see Evdokimov (2010), Evdokimov (2011), Wilhelm (2015), and Freyberger (2018). In contrast, our results on the identification of the conditional variance of $\alpha_i$ do not require (i). All our other results, including identification of the dependence of $\alpha_i$ on observed covariates, are free of assumptions like (i), (ii) and (iii).
We also show identification of some aspects of the distribution of fixed effects.
As we noted above, correlated random effects models identify the distribution of individual effects, but at the cost of restricting their distribution. To our knowledge, we are the first to show identification of moments of this distribution in a nonlinear panel model, when that distribution is unrestricted.
We show the practical importance of these innovations in our empirical work below. Identification of the conditional mean and variance of the distribution of fixed effects in a context with time-varying transformation functions is essential to our investigation of women’s access to household resources in rural Bangladesh.
Dating back at least to Becker (1962), collective household models are those in which the household is characterized as a collection of individuals, each of whom has a well-defined objective function, and who interact to generate household-level decisions such as consumption expenditures.
Efficient collective household models are those in which the individuals in the household are assumed to reach the (household) Pareto frontier.
Footnote: We conjecture that the existence of a special regressor would be sufficient to identify time-varying nonparametric transformation functions, and, with strict monotonicity, the distribution of fixed effects. However, we think that a setting with completely unrestricted fixed effects is useful in a variety of empirical applications, including our own.
Pareto weights. This in turn implies that the household-level allocation problem is observationally equivalent to a decentralized, person-level allocation problem.
In this decentralized allocation, each household member is assigned a shadow budget. They then demand a vector of consumption quantities given their preferences and their personal shadow budget, and the household purchases the sum of these demanded quantities (adjusted for shareability/economies of scale and for public goods within the household).
Resource shares, defined as the ratio of each person’s shadow budget to the overall household budget, are useful measures of individual consumption expenditures. If there is intra-household inequality, these resource shares would be unequal. Consequently, standard per-capita calculations (assigning equal resource shares to all household members) would yield invalid measures of individual consumption and poverty (see, e.g., Dunbar et al. (2013)). In this paper, we show identification of the conditional mean (up to location) and conditional variance of the distribution of resource shares in a panel data context.
The early literature on these models, including Bourguignon et al. (1993); Browning and Chiappori (1998); Vermeulen (2002); Chiappori and Ekeland (2006), constrains goods to be either purely private or purely public within a household. These papers show that one can generally identify the response of resource shares to changes in observed variables such as distribution factors. Like those papers, we can identify the response of resource shares to observed variables, but we also can account for unobserved household-level variables through the inclusion of fixed effects.
We work with a more general model of sharing and scale economies based on Browning et al. (2013). This model allows some or all goods to be partly or fully shared, and the authors show that there is a one-to-one correspondence between Pareto weights and resource shares. Dunbar et al. (2013) use this model, and show how assignable goods, defined as goods consumed exclusively by a single known household member, may be used to identify resource shares (see also Chiappori and Ekeland (2009)). Like them, we use an assignable good to support identification.
A key identifying assumption in Dunbar et al. (2013) is that resource shares are independent of household budgets in a cross-sectional sense. This identifying restriction has been used to estimate resource shares, within-household inequality and individual-level poverty in many countries (DLP and DLP2 in Malawi; Bargain et al. (2014) in Côte d’Ivoire; Calvi (2019) in India; Vreyer and Lambert (2016) in Senegal; Bargain et al. (2018) in Bangladesh). In our model, we show identification of the response of the conditional mean of resource shares to observed covariates, even if resource shares are correlated with (lifetime) household budgets. Consequently, we can test this identifying restriction.
Using cross-sectional data, Menon et al. (2012) and Cherchye et al. (2015) test whether resource shares are correlated with household budgets, and find no evidence that they are. But, their estimators don’t have much power. In our empirical work with panel data, we get a fairly precise estimate of this correlation, allowing us to detect even a small dependence. We find evidence that women’s resource shares are slightly negatively correlated with household budgets (conditional on other observed variables). This finding suggests that the cross-sectional identification strategy proposed by Dunbar et al. (2013) may come at a cost that is not faced in a panel data setting.
Dunbar et al. (2013) do not accommodate unobserved heterogeneity in resource shares. Two newer papers, Chiappori and Kim (2017) and Dunbar et al. (2019), consider identification in cross-sectional data with unobserved household-level heterogeneity in resource shares. Like Chiappori and Kim (2017) and Theorem 1 in Dunbar et al. (2019), our work investigates identification of the distribution of resource shares up to an unknown normalization. However, the results in those papers are of the random effects type. 
That is, the authors impose the restriction that the conditional distribution of resource shares is independent of the household budget. In this paper, we consider a panel data setting with household-level unobserved heterogeneity in resource shares, without any restriction on the distribution of resource shares. Further, we show sufficient conditions for identification of the conditional variance of (logged) resource shares.
Sokullu and Valente (2019) use a one-period micro-economic model similar to Dunbar et al. (2013), and estimate it on three waves of a Mexican panel dataset. In contrast, our micro-economic model considers choice in the presence of unobserved household-level heterogeneity, and over many periods under uncertainty.
The literature cited above considered one-period micro-economic models. But, many interesting questions about households, and the distribution of resources within households, are dynamic in nature. For example: how do household members share risk? How do household investments relate to individual consumption? How can we use information from multiple time periods to estimate resource shares when there is unobserved household-level heterogeneity?
Chiappori and Mazzocco (2017) give a lovely review of the literature on collective household models in an intertemporal setting. These models generally come in two flavours, limited commitment or full commitment, depending on whether or not the household can commit to a permanent Pareto weight at the moment of household formation. Full-commitment models answer “yes”, and limited-commitment models answer “no”. Limited-commitment models have commanded the most theoretical attention. Much effort has gone into testing the full-commitment model against a limited-commitment alternative, e.g., Ligon (1998); Mazzocco (2007); Mazzocco et al. (2014); Voena (2015).
Fewer papers study the identification of Pareto weights or resource shares in an intertemporal context. 
Lise and Yamada (2019) use a long panel of Japanese household consumption data to estimate how Pareto weights (which are dual to resource shares) depend on observed covariates and on unanticipated shocks. They find evidence that the full-commitment model does not hold in Japan. Our model is one of full commitment and our data are a short (two-period) panel, so we provide a complement to the approach of Lise and Yamada (2019) for cases where the data are not rich enough to estimate a limited-commitment model.
Full-commitment models are more restrictive, but may be useful nonetheless. Chiappori and Mazzocco (2017) write “In more traditional environments (such as rural societies in many developing countries), renegotiation may be less frequent since the cost of divorce is relatively high, threats of ending a marriage are therefore less credible, and noncooperation is less appealing since households members are bound to spend a lifetime together.” We use a full-commitment setting to estimate resource shares for rural Bangladeshi households.
In this paper, we adapt the general full-commitment framework of Chiappori and Mazzocco (2017) to the scale economy and sharing model of Browning et al. (2013). Then, like Dunbar et al. (2013) do in their cross-sectional analysis, we identify resource shares on the basis of household-level demand functions for assignable goods. In our general model, observed household-level quantity demand functions depend on resource shares, and resource shares depend on a time-invariant factor (a fixed effect) representing the initial (and permanent) Pareto weights of household members.
We then provide a parametric form for utility functions that results in demand equations for assignable goods that are nonlinear in shadow budgets, and have logged shadow budgets that are linear in logged household budgets and a fixed effect. Further, demand equations are time-varying because prices vary over time. 
Such demand equations fall into the FELT class, and are therefore identified in our short-panel setting. The parametric model also gives meaning to the fixed effect: it equals a logged resource share, so its distribution is an economically interesting object. So, our micro-economic theory demands an econometric model that allows for time-varying transformation functions and that can identify moments of the conditional distribution of fixed effects.
Dropping the $i$ subscript, let $Y = (Y_1, \ldots, Y_T)'$ and $X = (X_1', \ldots, X_T')'$. We write FELT as a latent variable model using the notation in (2.1). For $t = 1, \ldots, T$ and for all $\alpha$ and $X$,

$$Y_t = h_t(Y_t^*) = h_t(\alpha + X_t\beta - U_t), \qquad U_t \mid \alpha, X \sim F_t(u \mid \alpha, X). \tag{3.1}$$

Denote the supports of $Y_t$, $Y_t^*$, $X_t$ by $\mathcal{Y} \subseteq \mathbb{R}$, $\mathcal{Y}^* = \mathbb{R}$, and $\mathcal{X} \subseteq \mathbb{R}^K$, respectively.

Footnote: The supports may be indexed by $t$. We omit this index here for the sake of concise notation.

We provide sufficient conditions for identification of $(\beta, h_t)$. We consider two non-nested cases. The first case allows for nonparametric $F_t(u \mid \alpha, X)$, requiring only that it is conditionally stationary. In this case, the idiosyncratic errors may be serially dependent. The second case assumes that $U_t$, $t = 1, \ldots, T$, are serially independent, standard logistic, and strictly exogenous. Of course, the second case has an error distribution that is nested within that of the first case. But the second case requires weaker assumptions on the distribution of the regressors (cf. Assumption 3 below). For both cases, we maintain the assumption below.

Footnote: Botosaru and Muris (2017) introduce four estimators, depending on whether the outcome variable is discrete or continuous, and on whether the stationary distribution of the error term is nonparametric or logistic.

Assumption 1. [Weak monotonicity] For each $t$, the transformation function $h_t : \mathcal{Y}^* \to \mathcal{Y}$ is unknown, non-decreasing, right continuous, and non-degenerate.

Define the generalized inverse $h_t^{-} : \mathcal{Y} \to \mathcal{Y}^*$ as

$$h_t^{-}(y) \equiv \inf\{y^* \in \mathcal{Y}^* : y \le h_t(y^*)\},$$

with the convention that $\inf(\emptyset) = \inf(\mathcal{Y})$. Additionally, let $\underline{\mathcal{Y}} \equiv \mathcal{Y} \setminus \inf \mathcal{Y}$. For an arbitrary $y \in \underline{\mathcal{Y}}$, define the binary random variable

$$D_t(y) \equiv 1\{Y_t \ge y\} = 1\{U_t \le \alpha + X_t\beta - h_t^{-}(y)\}, \tag{3.2}$$

where the equality follows from specification (3.1) and weak monotonicity.
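To make the binarization in (3.2) concrete, the following minimal Python sketch checks the equivalence numerically for one weakly monotone transformation. The censoring function $h(y^*) = \max(0, y^*)$, the parameter values, and all function names are illustrative assumptions rather than anything taken from the paper. For this choice of $h$, the generalized inverse at any threshold $y > 0$ is $h^{-}(y) = y$.

```python
import random


def h(ystar):
    # Illustrative weakly monotone transformation: censoring at zero.
    return max(0.0, ystar)


def h_ginv(y):
    # Generalized inverse of the censoring function for thresholds y > 0:
    # inf{y* : y <= max(0, y*)} = y.
    return y


def count_mismatches(n=10000, beta=1.0, y=0.7, seed=0):
    """Count draws where 1{Y >= y} differs from 1{Y* >= h^-(y)}."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(n):
        x = rng.gauss(0, 1)
        alpha = 0.5 * x + rng.gauss(0, 1)  # fixed effect correlated with x
        u = rng.gauss(0, 1)
        ystar = alpha + x * beta - u       # latent index
        d_outcome = 1 if h(ystar) >= y else 0
        # Same event as U <= alpha + X*beta - h^-(y), written via the latent index.
        d_latent = 1 if ystar >= h_ginv(y) else 0
        mismatches += int(d_outcome != d_latent)
    return mismatches


mismatches = count_mismatches()
```

Under these assumptions the two indicators coincide draw by draw, even though the outcome is censored for a large share of observations.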
Here, we use $\underline{\mathcal{Y}}$ instead of $\mathcal{Y}$ because $D_t(\inf \mathcal{Y}) = 1$ almost surely for all $t$.

The key insight of our identification argument is to allow the threshold $y$ in (3.2) to be different across time periods, in addition to allowing it to vary across $\mathcal{Y}$, the support of the observed outcome. We thus compare outcomes observed at $t = 1$ with threshold $y_1$ and outcomes observed at $t = 2$ with threshold $y_2$, where $y_1 \neq y_2$ and $(y_1, y_2) \in \underline{\mathcal{Y}}^2$. This allows us to group individuals into switchers and non-switchers, where an individual is a switcher provided that $D_1(y_1) + D_2(y_2) = 1$. It is the existence of switchers that informs our identification of the time-varying transformation function. In other words, in comparison with previous results where the threshold is the same in each time period, using different thresholds in each time period reveals new information about the response functions. It is this new information that enables us to identify the time-dependence of $h_t$.

Footnote: To the best of our knowledge, previous papers, see, e.g., Chen (2002), Chen (2010a); Chernozhukov et al. (2018), restrict the two thresholds to be equal to each other. This is relevant since this restriction essentially prevents identification of time-effects or time-varying transformation functions $h_t$.

Two time periods are sufficient for our identification results, so we let $T = 2$ in what follows. For $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, define the following vector of binary variables:

$$D(y_1, y_2) \equiv (D_1(y_1), D_2(y_2)).$$

Our identification strategy for $(\beta, h_1, h_2)$, which we call binarization, is based on the observation that the 2-vector $D(y_1, y_2)$ follows a panel data binary choice model for any $(y_1, y_2) \in \underline{\mathcal{Y}}^2$. This result is summarized in Lemma 1 below.

The identification proof proceeds in three steps. First, we show identification of $\beta$ and of $h_2^{-}(y_2) - h_1^{-}(y_1)$ for arbitrary $(y_1, y_2) \in \underline{\mathcal{Y}}^2$.
In the resulting binary choice model, the difference $h_2^{-}(y_2) - h_1^{-}(y_1)$ is the coefficient on the differenced time dummy, and $\beta$ is the regression coefficient on $X_2 - X_1$. For a given binary choice model, identification of $\beta$ and of $h_2^{-}(y_2) - h_1^{-}(y_1)$ follows Manski (1987) for the nonparametric version of our model, and Chamberlain (2010) for the logistic version. This result is summarized in Theorem 1 below.

Second, we show that varying the pair $(y_1, y_2)$ over $\underline{\mathcal{Y}}^2$ obtains identification of

$$\{h_2^{-}(y_2) - h_1^{-}(y_1),\ (y_1, y_2) \in \underline{\mathcal{Y}}^2\}.$$

Third, we show that identification of this set of differences obtains identification of the functions $h_1$ and $h_2$ under a normalization assumption on $h_1^{-}$. That is, for an arbitrary $y_0 \in \underline{\mathcal{Y}}$, $h_1^{-}(y_0) = 0$. This type of assumption is customarily made in the literature on transformation models. This result is presented in Theorem 2.

In summary, we show that FELT can be converted into a collection of binary choice models, which allows us to identify the transformation functions $h_t$. Omitting the fact that FELT can be transformed into many binary choice models obtains identification of $\beta$ only.

Figures 3.1 and 3.2 illustrate the intuition behind our identification strategy for two arbitrary functions, $h_1$ and $h_2$, both accommodated by FELT. The line with kinks and a flat part represents an arbitrary function $h_1$, while the solid curve represents an arbitrary function $h_2$. Consider Figure 3.1. Pick a $y_1 \in \underline{\mathcal{Y}}$ on the vertical axis. For all $y \le y_1$, $h_1(y)$ gets mapped to zero, while for all $y > y_1$, it gets mapped to one. Now pick a $y_2 \in \underline{\mathcal{Y}}$. For all $y \le y_2$, $h_2(y)$ gets mapped to zero, while for all $y > y_2$, it gets mapped to one. This gives rise to a fixed effects binary model for $(D_1(y_1), D_2(y_2))$, also plotted in the figure as the grey solid lines.

Figure 3.1: FELT functions $h_1$ and $h_2$.
Figure 3.2: Normalization and tracing.

Our first result in Theorem 1 identifies the difference $h_2^{-}(y_2) - h_1^{-}(y_1)$ at arbitrary points $(y_1, y_2)$, as well as the coefficient $\beta$.
It is clear that normalizing $h_1^{-}(\cdot)$ at an arbitrary point identifies the function $h_2^{-}(y_2)$ at an arbitrary point $y_2$. This is captured in Figure 3.2. There, for an arbitrary $y_0$, $h_1^{-}(y_0) = 0$. Then, as $y_2$ is arbitrary, Figure 3.2 shows that moving $y_2$ over its support traces out the generalized inverse $h_2^{-}$ on its domain. Theorem 2 wraps up this argument by showing that $h_1$ and $h_2$ are identified from their generalized inverses.

In this section, we provide nonparametric identification results for $(\beta, h_1, h_2)$. Parts of our identification proof build on Manski (1987), who in turn builds on Manski (1975, 1985).

Assumption 2. [Error terms]
(i) $F_1(u \mid \alpha, X) = F_2(u \mid \alpha, X) \equiv F(u \mid \alpha, X)$ for all $(\alpha, X)$;
(ii) The support of $F(u \mid \alpha, X)$ is $\mathbb{R}$ for all $(\alpha, X)$.

Assumption 2 places no parametric restrictions on the distribution of $U_{it}$ and allows the stochastic errors $U_{it}$ to be correlated across time. The first part of the assumption, 2(i), is a stationarity assumption, requiring time-invariance of the distribution of the error terms conditional on the trajectory of the observed regressors and on the unobserved heterogeneity. This assumption excludes lagged dependent variables as covariates. Additionally, as noted by, e.g., Chamberlain (2010), although it allows for heteroskedasticity, it restricts the relationship between the observed regressors and $U_{it}$ by requiring that even when $x_1 \neq x_2$, $U_1$ and $U_2$ have equal skedasticities. This type of stationarity assumption is common in linear and nonlinear panel models, e.g., Chernozhukov et al. (2013) and references therein.

Assumption 2(ii) requires full support of the error terms. It guarantees that, for any pair $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, the probability of being a switcher is positive. In our context, being a switcher refers to the event $D_1(y_1) + D_2(y_2) = 1$, so that Assumption 2 guarantees that $P(D_1(y_1) + D_2(y_2) = 1) > 0$. This assumption is similar to Assumption 1 in Manski (1987).

Let $\Delta X \equiv X_2 - X_1$ and, for an arbitrary pair $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, define

$$\gamma(y_1, y_2) \equiv h_2^{-}(y_2) - h_1^{-}(y_1). \tag{3.3}$$

Lemma 1.
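Suppose that $(Y, X)$ follows the model in (3.1). Let Assumptions 1 and 2 hold. Then for all $(y_1, y_2) \in \underline{\mathcal{Y}}^2$,

$$\operatorname{med}\left(D_2(y_2) - D_1(y_1) \mid X,\ D_1(y_1) + D_2(y_2) = 1\right) = \operatorname{sgn}\left(\Delta X\beta - \gamma(y_1, y_2)\right). \tag{3.4}$$

Proof.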
The proof builds on Manski (1985, 1987), and is presented in Appendix A.1.

Let $W \equiv (\Delta X, -1)$ and $\theta(y_1, y_2) \equiv (\beta', \gamma(y_1, y_2))'$, so that (3.4) can be written as

$$\operatorname{med}\left(D_2(y_2) - D_1(y_1) \mid X,\ D_1(y_1) + D_2(y_2) = 1\right) = \operatorname{sgn}\left(W\theta(y_1, y_2)\right).$$

For identification of $\theta(y_1, y_2)$ we impose the following additional assumptions.

Assumption 3. [Covariates]
(i) The distribution of $\Delta X$ is such that at least one component of $\Delta X$ has positive Lebesgue density on $\mathbb{R}$ conditional on all the other components of $\Delta X$ with probability one. The corresponding component of $\beta$ is non-zero.
(ii) The support of $W$ is not contained in any proper linear subspace of $\mathbb{R}^{K+1}$.

Assumption 3(i) requires that the change in one of the regressors be continuously distributed conditional on the other components. Assumption 3(ii) is a full rank assumption. These assumptions are standard in the binary choice literature concerned with point identification of the parameters.

Assumption 3 resembles Assumption 2 in Manski (1987), the difference being that our assumption concerns $W$, which includes a constant that captures a time trend. The presence of this constant requires sufficient variation in $X_t$ over time. No linear combination of the components of $X_t$ can equal the time trend.

Assumption 4. [Normalization-$\beta$] For any $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, $\theta(y_1, y_2) \in \Theta = B \times \mathbb{R}$, where $B = \{\beta : \beta \in \mathbb{R}^K, \|\beta\| = 1\}$.

Assumption 4 imposes a normalization on $\beta$, namely that the norm of the regression coefficient equals 1. Scale normalizations are standard in the binary choice literature, and are necessary for point identification when the distribution of the error terms is not parameterized. Normalizing $\beta$ (instead of $\theta$) avoids a normalization that would otherwise depend on the choice of $(y_1, y_2)$. In this way, the scale of $\beta$ remains constant across different choices of $(y_1, y_2)$. Alternatively, one can normalize the coefficient on the continuous covariate (cf. Assumption 3(i)) to be equal to one. In our economic model in Section 5 the latter assumption holds automatically.

Theorem 1.
Suppose that $(Y, X)$ follows the model in (3.1), and let the distribution of $(Y, X)$ be observed. Let Assumptions 1, 2, 3, and 4 hold. Then, for an arbitrary pair $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, $\theta(y_1, y_2)$ is identified.

Proof. The proof proceeds by showing that FELT can be converted into a binary choice model for an arbitrary pair $(y_1, y_2)$, and then builds on Theorem 1 in Manski (1987), which in turn uses results in Manski (1985). See Appendix A.2.

So far, we have identified the regression coefficient $\beta$ and the difference in the generalized inverses at arbitrary pairs $(y_1, y_2)$. We consider now identification of the functions $h_1$ and $h_2$ on $\mathcal{Y}$.

Assumption 5. [Normalization-$h$] For some $y_0 \in \underline{\mathcal{Y}}$, $h_1^{-}(y_0) = 0$.

Such a normalization is standard in transformation models, see, e.g., Horowitz (1996). Without this normalization, all identification results hold up to $h_1^{-}(y_0)$. We normalize the function in the first time period only, imposing no restrictions on the function in the second time period.

Footnote: There are models with sufficient structure on the transformation function $h_t$ where identification is possible without a normalization on the regression coefficient. Examples include the linear regression model, the censored linear regression model in Honoré (1992), and the interval-censored regression model in Abrevaya and Muris (2020).

Theorem 2.
Suppose that $(Y, X)$ follows the model in (3.1), and let the distribution of $(Y, X)$ be observed. Under Assumptions 1, 2, 3, 4, and 5, the transformation functions $h_1$ and $h_2$ are identified.

Proof. The proof proceeds by identifying the generalized inverses of monotone functions, which obtains identification of the pre-images of $h_1$ and $h_2$. This obtains identification of the functions themselves. See Appendix A.3.

In this section, we show identification of $(\beta, h_1, h_2)$ when the error terms are assumed to follow the standard logistic distribution. The logistic case is not nested in the nonparametric case. In particular, when the errors are logistic, we do not require a continuous regressor. However, we require conditional serial independence of the error terms.

Assumption 6. [Logit]
(i) $F_1(u \mid \alpha, X) = F_2(u \mid \alpha, X) = \Lambda(u) = \dfrac{\exp(u)}{1 + \exp(u)}$, and $U_1$ and $U_2$ are independent;
(ii) $E(W'W)$ is invertible.

Assumption 6(i) strengthens Assumption 2 by requiring the errors to follow the standard logistic distribution and to be serially independent. Note that one consequence of this assumption, which fixes the scale of the error terms (the standard logistic distribution has variance $\pi^2/3$), is to eliminate the need to normalize $\beta$. On the other hand, Assumption 6(ii) imposes weaker restrictions on the observed covariates relative to Assumption 3, since it does not require the existence of a continuous covariate. Sufficient variation in $\Delta X$ is enough to obtain identification of the vector $\beta$ when the error terms follow the standard logistic distribution.

Theorem 3.
Suppose that $(Y, X)$ follows the model in (3.1), and let the distribution of $(Y, X)$ be observed. Let Assumptions 1 and 6 hold. Then, for an arbitrary pair $(y_1, y_2) \in \underline{\mathcal{Y}}^2$, $\theta(y_1, y_2)$ is identified. Additionally, letting Assumption 5 hold, the transformation functions $h_1(\cdot)$ and $h_2(\cdot)$ are identified.

Footnote: See Chamberlain (2010) and Magnac (2004) for more details about identification under nonparametric versus logistic errors in the panel data binary choice context.

Proof. See Appendix A.4.
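The logistic case can be illustrated with a small simulation. The sketch below is not the authors' estimator. It assumes illustrative transformations $h_1(v) = v$ and $h_2(v) = 2v + 1$, a scalar regressor, thresholds $y_1 = 0.3$ and $y_2 = 2$, and hypothetical function names. Under Assumption 6, the probability that $D_2(y_2) = 1$ among switchers is $\Lambda(\Delta X\beta - \gamma(y_1, y_2))$, so a logit of $D_2$ on $W = (\Delta X, -1)$ in the switcher subsample should recover $\beta = 1$ and $\gamma = h_2^{-}(2) - h_1^{-}(0.3) = 0.5 - 0.3 = 0.2$ up to sampling error.

```python
import math
import random


def logistic_draw(rng):
    # Inverse-CDF draw from the standard logistic distribution.
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return math.log(u / (1.0 - u))


def h1(v):
    return v              # illustrative period-1 transformation


def h2(v):
    return 2.0 * v + 1.0  # illustrative period-2 transformation


def simulate_switchers(n, beta, y1, y2, seed=0):
    """Return (delta_x, d2) for units with D_1(y1) + D_2(y2) = 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        alpha = 0.5 * (x1 + x2) + rng.gauss(0, 1)  # unrestricted fixed effect
        d1 = 1 if h1(alpha + beta * x1 - logistic_draw(rng)) >= y1 else 0
        d2 = 1 if h2(alpha + beta * x2 - logistic_draw(rng)) >= y2 else 0
        if d1 + d2 == 1:
            rows.append((x2 - x1, d2))
    return rows


def fit_switcher_logit(rows, iterations=25):
    """Newton iterations for a logit with index delta_x * b - g."""
    b, g = 0.0, 0.0
    for _ in range(iterations):
        s_b = s_g = h_bb = h_bg = h_gg = 0.0
        for dx, d2 in rows:
            p = 1.0 / (1.0 + math.exp(-(dx * b - g)))
            resid, weight = d2 - p, p * (1.0 - p)
            s_b += resid * dx            # score w.r.t. b (regressor dx)
            s_g += -resid                # score w.r.t. g (regressor -1)
            h_bb += weight * dx * dx
            h_bg += -weight * dx
            h_gg += weight
        det = h_bb * h_gg - h_bg * h_bg
        b += (h_gg * s_b - h_bg * s_g) / det
        g += (h_bb * s_g - h_bg * s_b) / det
    return b, g


rows = simulate_switchers(n=20000, beta=1.0, y1=0.3, y2=2.0)
beta_hat, gamma_hat = fit_switcher_logit(rows)
```

Repeating the fit over a grid of $(y_1, y_2)$ pairs would trace out $\gamma(y_1, y_2)$, which is the second step of the identification argument. The fixed effect never enters the switcher likelihood, which is why it can be left unrestricted in the simulation.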
If $(h_1, h_2)$ are invertible, we can use the previous identification theorem to identify features of the distribution of the fixed effects conditional on observed regressors. These features are the change in the conditional mean function of $\alpha$ and the conditional variance of $\alpha$ conditional on $X_1, X_2$. These results are relevant since in our collective household model, the fixed effects represent the log of resource shares, and both the standard deviation of these resource shares and the response of their conditional mean to covariates are key parameters of interest in the empirical literature. As this is relevant to our application, we note here that a normalization assumption, such as Assumption 5, on the demand function in the first period is not necessary for these results on the resource shares because, e.g., we only need their deviation with respect to the mean of the fixed effects.

In this section, we provide sufficient conditions for the identification of the change in the conditional mean function of the fixed effects, defined as

$$\mu(x) \equiv E[\alpha \mid X = x], \quad \text{for all } x \in \mathcal{X}, \tag{4.1}$$

as well as for the conditional variance of the fixed effects. For these results, the normalization in Assumption 5 is not necessary. To provide intuition for this, let

$$c \equiv h_1^{-}(y_0)$$

at an arbitrary $y_0 \in \underline{\mathcal{Y}}$, and let $g_t(y) \equiv h_t^{-}(y) - c$ for all $y \in \underline{\mathcal{Y}}$. Note that Theorem 1 recovers

$$\tilde{U}_t \equiv \alpha - U_t = h_t^{-}(Y_t) - X_t\beta$$

up to $c$, so that the joint distribution of $(\tilde{U}_1, \tilde{U}_2, X_1, X_2)$ is identified up to $c$. By placing restrictions on the distribution of $(\alpha, U_1, U_2, X_1, X_2)$, we can then recover our features of interest.

Theorem 4. Let the assumptions of Theorem 1 hold, and additionally assume that (4a) $(h_1, h_2)$ are strictly increasing, and (4b) there is an unknown constant $m \in \mathbb{R}$ such that $E(U_t \mid X = x) = m$ for all $x \in \mathcal{X}$. Then, for any $x, x' \in \mathcal{X}$, the change in the conditional mean function $\mu(x) - \mu(x')$ is identified and given by

$$\mu(x) - \mu(x') = E[g_t(Y_t) - X_t\beta \mid X = x] - E[g_t(Y_t) - X_t\beta \mid X = x'].$$

Proof.
See Appendix A.5.
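The role of the unknown constants in Theorem 4 can be seen in a short simulation. The sketch below assumes, purely for illustration, that $\beta$ and $h_2^{-}$ are already known (as they would be after Theorems 1 and 2), that $h_2(v) = 2v + 1$, that $X_1, X_2 \in \{0, 1\}$, and that $E[\alpha \mid X] = 0.8(X_1 + X_2)$. The location constant `C` and the error mean 0.3 are arbitrary and cancel in the difference. All names are hypothetical.

```python
import random

BETA = 1.0  # regression coefficient, treated as already identified
C = 0.7     # arbitrary location constant c; unknown in practice


def h2(v):
    return 2.0 * v + 1.0          # illustrative strictly increasing h_2


def g2(y):
    return (y - 1.0) / 2.0 - C    # g_2(y) = h_2^{-1}(y) - c


def conditional_mean_shift(n=40000, seed=1):
    """Estimate mu((1,1)) - mu((0,0)) from g_2(Y_2) - X_2 * beta."""
    rng = random.Random(seed)
    totals = {(0, 0): 0.0, (1, 1): 0.0}
    counts = {(0, 0): 0, (1, 1): 0}
    for _ in range(n):
        x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
        alpha = 0.8 * (x1 + x2) + rng.gauss(0, 0.5)
        u2 = 0.3 + rng.gauss(0, 1)             # E(U_t | X) = m = 0.3
        y2 = h2(alpha + BETA * x2 - u2)        # observed outcome in period 2
        key = (x1, x2)
        if key in totals:
            totals[key] += g2(y2) - BETA * x2  # equals alpha - U_2 - c
            counts[key] += 1
    mean_11 = totals[(1, 1)] / counts[(1, 1)]
    mean_00 = totals[(0, 0)] / counts[(0, 0)]
    return mean_11 - mean_00


shift = conditional_mean_shift()
```

In this design the true difference is $0.8 \times 2 = 1.6$. Each cell mean is off by $m + c$, but the difference between cells is not, which is the sense in which only changes in $\mu$ are identified without a normalization.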
Remark. As opposed to our main identification result in Theorem 2, Theorem 4 does not use a normalization on the functions $(h_1, h_2)$. If we were to impose the normalization in Assumption 5, the conditional mean function $\mu(x)$ would be identified for all $x \in \mathcal{X}$. This result provides justification for nonparametric regression of $\alpha$ on observables (up to location).

Remark. Under slightly weaker conditions, we can obtain the projection coefficients of $\alpha$ on $X_t$. This is of interest for our empirical application. Recall that the joint distribution of $(\tilde{U}_1, \tilde{U}_2, X_1, X_2)$ is identified up to $c$. Then, assuming $\operatorname{Cov}(U_s, X_t) = 0$, we can identify the projection coefficient of $\alpha$ on $X_t$ from

$$[\operatorname{Var}(X_t)]^{-1}\operatorname{Cov}(\alpha, X_t) = [\operatorname{Var}(X_t)]^{-1}\operatorname{Cov}(\alpha - U_s, X_t) = [\operatorname{Var}(X_t)]^{-1}\operatorname{Cov}(\tilde{U}_s, X_t).$$

Second, define the conditional variance of the fixed effects as

$$\sigma_\alpha^2(x) \equiv \operatorname{Var}[\alpha \mid X = x], \quad \text{for all } x \in \mathcal{X}. \tag{4.2}$$

For this second result, we strengthen our assumptions to include, among others, serial independence of the error term. This allows us to pin the persistence in unit $i$'s time series on $\alpha_i$ instead of on serial dependence in the errors.

Theorem 5.
Let the assumptions of Theorem 1 hold, and assume that (5a) $(h_1, h_2)$ are strictly increasing, (5b) $\operatorname{Cov}[\alpha, U_t \mid X = x] = 0$ for all $x \in \mathcal{X}$ and $t$, and (5c) $\operatorname{Cov}[U_1, U_2 \mid X = x] = 0$ for all $x \in \mathcal{X}$. Then for all $x \in \mathcal{X}$, the conditional variance function $\sigma_\alpha^2(x)$ is identified and given by

$$\sigma_\alpha^2(x) = \operatorname{Cov}\left(g_1(Y_1) - X_1\beta,\ g_2(Y_2) - X_2\beta \mid X = x\right).$$

Proof. See Appendix A.5.

It may be possible to obtain the entire conditional distribution of the fixed effects under the assumption that $(\alpha, U_1, U_2)$ are mutually independent by using arguments similar to those in Arellano and Bonhomme (2012).

In this section, we construct a new model of an efficient full-commitment intertemporal collective (FIC) household. Essentially, we combine the models of Browning et al. (2013) and Chiappori and Mazzocco (2017) to generate an empirically practical model that allows identification of resource shares. Chiappori and Mazzocco (2017) write their model in terms of pure public and pure private goods. We instead adapt that model to the more general sharing model given in the collective household model of Browning et al. (2013).

A feature of efficient models like this is that the household-level problem can be decentralized into an observationally equivalent set of individual decision problems. Each individual problem is to choose demands based on an individual-level constraint defined by a shadow price vector and a shadow budget constraint.

We use subscripts $i, j, t$. Let $i = 1, \ldots, n$ index households and assume the household has a time-invariant composition, with $N_{ij}$ members of type $j$. Let $j = m, f, c$ for men, women and children. Let $t = 1, 2$. Let $z$ be a vector of time-varying household-level demographic characteristics, and let the numbers of household members of each type, $N_{im}$, $N_{if}$ and $N_{ic}$, be (time-invariant) elements of $z_{it}$. Like Chiappori and Mazzocco (2017), this is a model with uncertainty, so we use the superscript $s = 1, 2$ to index states.

Indirect utility, $V_j(p, x, z)$, is the maximized value of utility given a budget constraint defined by prices $p$ and budget $x$, given characteristics $z$. Let $V_j$ be strictly concave in the budget $x$. Indirect utility depends on time only through its dependence on the budget constraint and time-varying demographics $z$. Let $v_{ijt} \equiv V_j(p_t, x_{it}, z_{it})$ denote the utility level of a person of type $j$ in household $i$ in period $t$.

Browning et al. (2013) model sharing and household scale economies via a household consumption function that reflects the fact that shareable goods feel “cheap” within the household. This is embodied in a shadow price vector for consumption within the household that is weakly smaller than the market price vector $p_t$ faced by single individuals, because singles cannot take advantage of scale economies in household consumption. For example, goods that are not shareable at all, for which there are no scale economies in household consumption, have shadow prices equal to the market price. Goods that are fully shareable, so that each person in the household can enjoy an effective consumption equal to the amount purchased by the household, have a shadow price equal to the market price divided by the number of members.

Let $A_{it} \equiv A(z_{it})$ be a diagonal matrix that gives the shareability of each good, and let it depend on demographics $z_{it}$ (including the numbers of household members). For nonshareable goods, the corresponding element of $A_{it}$ equals 1; for shareable goods, it is less than 1, possibly as small as $1/N_i$, where $N_i$ is the number of household members. Goods may be partly shareable, with an element of $A_{it}$ between $1/N_i$ and 1.
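As a small numerical illustration of the shareability matrix described above, the following Python sketch computes within-household shadow prices for a hypothetical four-person household facing three goods. The prices, the shareability values, and the function names are made up for illustration. Because $A_{it}$ is diagonal, multiplying it by the price vector reduces to an elementwise product.

```python
def check_shareability(diagonal, n_members):
    # Each diagonal element must lie between 1/N_i (fully shareable)
    # and 1 (not shareable at all).
    for a in diagonal:
        if not (1.0 / n_members <= a <= 1.0):
            raise ValueError("shareability element outside [1/N, 1]")


def shadow_prices(market_prices, diagonal):
    # A_it is diagonal, so A_it p_t is an elementwise product.
    return [a * p for a, p in zip(diagonal, market_prices)]


n_members = 4
market_prices = [2.0, 10.0, 6.0]       # hypothetical market prices
a_diag = [1.0, 1.0 / n_members, 0.5]   # non-shareable, fully shareable, partly shareable
check_shareability(a_diag, n_members)
shadow = shadow_prices(market_prices, a_diag)
```

The first good keeps its market price, the fully shareable good costs each member a quarter of the market price, and the partly shareable good falls in between.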
With market prices $p_t$, within-household shadow prices are given by the linear transformation $A_{it} p_t$. Shadow prices are the same for all household members $j$.

Browning et al. (2013) also allow for inequality in the distribution of household resources. Let $\eta_{ijt}$ be the resource share of type $j$ in household $i$ in time period $t$. It gives the fraction of the household budget consumed by that type. Each of the $N_{ij}$ people of type $j$ consumes $\eta_{ijt}/N_{ij}$ of the household budget $x_{it}$, so they each have a budget of $\eta_{ijt} x_{it}/N_{ij}$. As we will see below, the resource share is a choice variable for the household.

The resource shares and shadow price vector together define the decentralized shadow budget constraints faced by each household member. The model has each household member facing a shadow budget of $\eta_{ijt} x_{it}/N_{ij}$ and shadow prices of $A_{it} p_t$, so that, within the household, utility $v_{ijt}$ is given by

$$v_{ijt} = V_j(A_{it} p_t,\ \eta_{ijt} x_{it}/N_{ij},\ z_{it}). \tag{5.1}$$

Let $V_{xj}(p, w, z) \equiv \partial V_j(p, w, z)/\partial w$ be the monotonically decreasing marginal utility of person $j$ with respect to their (shadow) budget. Then, $V_{xj}(A_{it} p_t, \eta_{ijt} x_{it}/N_{ij}, z_{it})$ is the value of their marginal utility evaluated at their shadow budget constraint.

Let $p_2^s$, $z_{i2}^s$, $\bar{x}_i^s$ for $s = 1, 2$ denote the state-specific values of prices, demographics and lifetime wealth, and let the two states occur with probabilities $\pi_i^1$ and $\pi_i^2$ (which sum to 1). Here, $\bar{x}_i^s$ is the state-specific lifetime wealth of household $i$, revealed in period 2. Information about the joint distribution of these unobserved state-dependent variables is embodied in $\Phi_i$, the information set available in period 1 to household $i$. The information set can also include unobserved time-invariant features of the household. Pareto weights $\phi_{ij} = \phi_j(\Phi_i)$ depend on the information set $\Phi_i$, which varies arbitrarily across households.
This household-level time-invariant variable will form the basis of our fixed-effects variation.

Chiappori and Mazzocco (2017) let individuals have expected lifetime utilities given by the sum of period 1 utility and the discounted probability-weighted sum of state-specific period 2 utility. Using the Pareto weights, they then write the Bergson-Samuelson Welfare Function, $W_i$, for the household as

$$W_i \equiv \sum_{j=1}^{J} N_{ij}\phi_{ij}\left[v_{ij1} + \sum_{s=1}^{2} \rho_i \pi_i^s v_{ij2}^s\right]. \tag{5.2}$$

The term in square brackets is the expected lifetime utility of each member of type $j$ in household $i$. Each member of type $j$ gets the Pareto weight $\phi_{ij} = \phi_j(\Phi_i)$.

Footnote: If the membership of the household changed over time, or if the household was choosing its membership, we would need a household welfare function that used some kind of population ethics principle (see Blackorby et al. (2005)). It is for this reason that we focus on households with fixed membership.

Footnote: Here, we only consider egotistic preferences. However, this is without loss of generality: “It is important to point out, however, that the model with egotistical preferences ... plays a special role. The reason for this is that the solution to the collective model with caring preferences must also be a solution of the collective model ... with egotistical preferences.” (Chiappori and Mazzocco (2017), page 21).

Footnote: The restriction that there are only two periods and only two states is for convenience. None of our conclusions about resource shares depend on it.

Next, substitute indirect utility (5.1) for utility $v_{ij1}$ and $v_{ij2}^s$ into (5.2), and form the Lagrangian using the intertemporal budget constraint with interest rate $\tau$, $x_{i1} + x_{i2}^s/(1+\tau) = \bar{x}_i^s$, and the adding-up constraints on resource shares, $\sum_j \eta_{ij1} = \sum_j \eta_{ij2}^s = 1$.

Footnote: In contrast, Chiappori and Mazzocco (2017) substitute direct utility for utility $v_{ij1}$ and $v_{ij2}^s$ using a model of pure private and pure public goods. In that model, each individual's utility is given by their direct utility function, which is a function of their (unobserved) consumption of a vector of private goods and their (observed) consumption of a vector of public goods.

Each household $i$ chooses $x_{i1}$, $\eta_{ij1}$ and $\eta_{ij2}^s$ to maximize

$$W_i = \sum_{j=1}^{J} N_{ij}\phi_{ij}\left[V_j(A_{i1} p_1,\ \eta_{ij1} x_{i1}/N_{ij},\ z_{i1}) + \sum_{s=1}^{2} \rho_i \pi_i^s V_j(A_{i2}^s p_2^s,\ \eta_{ij2}^s x_{i2}^s/N_{ij},\ z_{i2}^s)\right]$$
$$\qquad - \sum_{s=1}^{2} \kappa_s\left(x_{i1} + x_{i2}^s/(1+\tau) - \bar{x}_i^s\right) - \lambda_1\left(\sum_{j=1}^{J} \eta_{ij1} - 1\right) - \sum_{s=1}^{2} \lambda_2^s\left(\sum_{j=1}^{J} \eta_{ij2}^s - 1\right).$$

Because the optimand is additively separable across periods, it can be thought of as a two-stage budgeting problem, where the household first chooses the period budgets, $x_{i1}$ and $x_{i2}^s$, and then chooses resource shares conditional on this allocation of budget. First-order conditions for $\eta_{ijt}$ in each period and each state are given by

$$\phi_{ij} V_{xj}(A_{i1} p_1,\ \eta_{ij1} x_{i1}/N_{ij},\ z_{i1})\, x_{i1} - \lambda_1 = 0,$$
$$\rho_i \pi_i^s \phi_{ij} V_{xj}(A_{i2}^s p_2^s,\ \eta_{ij2}^s x_{i2}^s/N_{ij},\ z_{i2}^s)\, x_{i2}^s - \lambda_2^s = 0,$$

for $s = 1, 2$. Thus for any two types, $j$ and $k$, we have the following equality:

$$\frac{\phi_{ij}}{\phi_{ik}} = \frac{V_{xk}(A_{i1} p_1,\ \eta_{ik1} x_{i1}/N_{ik},\ z_{i1})}{V_{xj}(A_{i1} p_1,\ \eta_{ij1} x_{i1}/N_{ij},\ z_{i1})} = \frac{V_{xk}(A_{i2}^s p_2^s,\ \eta_{ik2}^s x_{i2}^s/N_{ik},\ z_{i2}^s)}{V_{xj}(A_{i2}^s p_2^s,\ \eta_{ij2}^s x_{i2}^s/N_{ij},\ z_{i2}^s)}, \tag{5.3}$$

for $s = 1, 2$ and all $i$. That is, the household chooses resource shares so as to equate ratios of marginal utilities with ratios of Pareto weights. There is a unique solution to this problem because each person $j$ has a utility function strictly concave in the shadow budget.

Resource shares are implicitly determined by (5.3) and depend on the Pareto weights $\phi_{i1}, \ldots, \phi_{iJ}$. Because we are in a full-commitment world, these Pareto weights are time-invariant. And, because there are both observed and unobserved household-level shifters to Pareto weights, the Pareto weights are heterogeneous across observably identical households. Consequently, the Pareto weights are fixed effects hiding inside the resource share functions.

Household quantity demands given the sharing model of Browning et al. (2013) are very simple: the household purchases the sum of what all the individuals would demand if they faced the within-household shadow price vector $A_{it} p_t$ and had their shadow budget $\eta_{ijt} x_{it}/N_{ij}$, adjusted for sharing as defined by $A_{it}$. A key feature here is that the household demand for a non-shareable good does not have to be adjusted for shareability: it is just the sum of what each individual would demand.

An assignable good is one where we observe the consumption of that good by a specific person (or type of person). Assuming the existence of a scalar-valued demand function $q_j(p, x, z)$ for an assignable and non-shareable good (e.g., food or clothing) for a person of type $j$, the household's quantity demand, $Q_{ijt}$, for the assignable good for each of the $N_{ij}$ people of type $j$ is given by

$$Q_{ijt} = q_j(A_{it} p_t,\ \eta_{ijt} x_{it}/N_{ij},\ z_{it}).$$
Because only people of type $j$ purchase this good, the household does not sum over the demand of other household members. Assuming that the assignable good is a normal good implies that $q_j$ is strictly increasing in its second argument, and is therefore strictly monotonic.

Suppose $p_t$ is unobserved, but varies over time. Then, we may express the household demand for the assignable good of a member of type $j$ as a time-varying function of observed data. Defining $\tilde{q}_{jt}(x, z) \equiv q_j(A(z) p_t, x, z)$, we have

$$Q_{ijt} = \tilde{q}_{jt}(\eta_{ijt} x_{it}/N_{ij},\ z_{it}). \tag{5.4}$$

This is the structural demand equation that we ultimately bring to the data.

This model is very general. It assumes only that the household satisfies the intertemporal budget constraint under uncertainty, can fully commit to future actions, reaches the Pareto frontier, and has scale economies embodied in the shareability matrix $A(z)$. It places no additional restrictions on utility functions or the bargaining model. It implies that quantity demands for assignable goods are time-varying functions of resource shares, and that resource shares depend on Pareto weights that are fixed over time (aka: fixed effects).

The model above has resource shares depend on a fixed effect, but expresses those resource shares as a vector of implicit functions, which may be hard to work with. To make the model tractable, we impose sufficient structure on utility functions to find closed forms for resource shares.

In our empirical example below, we work with data that have time-invariant demographic characteristics, so let $z_{it} = z_i$ be fixed over time. This implies that the shareability of goods embodied in $A$ is time-invariant: $A_{it} = A_i = A(z_i)$.

Let indirect utilities be in the price-independent generalized logarithmic (PIGL) class (Muellbauer (1975, 1976)) given by

$$V_j(p, x, z) = C_j(p, z) + \frac{(B(p, z)\,x)^{r(z)}}{r(z)}. \tag{5.5}$$

Here, $V_j$ is homogeneous of degree 0 in $p, x$ if $C_j$ is homogeneous of degree 0 in $p$ and $B$ is homogeneous of degree $-1$ in $p$. $V_j$ is increasing in $x$ if $B(p, z)$ is positive, and $V_j$ is concave in $x$ if $r(z) < 1$. In terms of preferences, this class is reasonably wide. It gives quasihomothetic preferences if $r(z) = 1$, and PIGLOG preferences as $r(z) \to 0$. The functions $C_j$ vary across types $j$, and so the model allows for preference heterogeneity between types, e.g., between men and women. The restrictions that $B(p, z)$ and $r(z)$ don't vary across $j$ and that $r(z)$ does not depend on prices $p$ are important: as we see below, they imply that resource shares are constant over time.

Substituting the BCL model, observed demographics and period $t$ budgets, we have the utility of person $j$ in household $i$ in period $t$ as

$$v_{ijt} = V_j(A_i p_t,\ \eta_{ijt} x_{it}/N_{ij},\ z_i) = C_j(A_i p_t, z_i) + B(A_i p_t, z_i)^{r(z_i)}\,\frac{(\eta_{ijt} x_{it}/N_{ij})^{r(z_i)}}{r(z_i)},$$

and thus marginal utilities are given by

$$V_{xj}(A_i p_t,\ \eta_{ijt} x_{it}/N_{ij},\ z_i) = B(A_i p_t, z_i)^{r(z_i)}\,(\eta_{ijt} x_{it}/N_{ij})^{r(z_i) - 1}.$$

For $r(z_i) \neq 1$, and for any pair of types $j, k$, we substitute into (5.3) and cancel terms:

$$\frac{\phi_{ij}}{\phi_{ik}} = \frac{V_{xk}(A_i p_t,\ \eta_{ikt} x_{it}/N_{ik},\ z_i)}{V_{xj}(A_i p_t,\ \eta_{ijt} x_{it}/N_{ij},\ z_i)} = \left(\frac{\eta_{ikt}/N_{ik}}{\eta_{ijt}/N_{ij}}\right)^{r(z_i) - 1}.$$

Rearranging, we get

$$\frac{\eta_{ikt}}{\eta_{ijt}} = \frac{N_{ik}}{N_{ij}}\left(\frac{\phi_{ij}}{\phi_{ik}}\right)^{1/(r(z_i) - 1)}. \tag{5.6}$$

The household chooses resource shares in each period and each state to satisfy (5.6). Since the right-hand side has no variation over time or state, this implies that, given PIGL utilities (5.5), the resource shares in a given household $i$ are independent of period $t$ and state $s$. However, resource shares do vary with both observed and unobserved variables across households $i$. Let the fixed resource shares that solve the first-order conditions with PIGL demands be denoted $\eta_{ij}$.

We will estimate the demand equation for women's food in nuclear households comprised of 1 man, 1 woman and 1 to 4 children, so $N_{if} = N_{im} = 1$.
Let the resource share for adult women be $\eta_{if}$ and define

$$\alpha_i \equiv \ln(\eta_{if}/N_{if}) = \ln \eta_{if}, \tag{5.7}$$

equal to the logged resource share of the woman in the household.

Let there be a multiplicative Berkson (1950) measurement error denoted $\exp(-U)$ which multiplies the budget, so that if we observe $x$, the actual budget is $x/\exp(U)$. The measurement error is i.i.d. across time and households. Here, the measurement error does not affect resource shares, but does affect the distribution of observed quantity demands. Plugging this measurement error and the PIGL form for resource shares given by (5.7) into the assignable goods demand equation (5.4) yields a household demand for women's food, $Q_{ift}$, given by

$$Q_{ift} = \tilde{q}_{ft}\left(\exp(\alpha_i)\, x_{it}/\exp(U_{it}),\ z_i\right).$$

This is a FELT model, conditional on covariates:

$$Y_{it} = h_t(Y_{it}^*, z_i), \tag{5.8}$$

where $h_t(Y_{it}^*, z_i) = \tilde{q}_{ft}(\exp(Y_{it}^*), z_i)$ and

$$Y_{it}^* = \alpha_i + X_{it} - U_{it}, \tag{5.9}$$

and $X_{it} = \ln x_{it}$ is the logged household budget.

The assumption that the assignable good is normal means that the time-varying functions $h_t$ are strictly monotonic in $Y_{it}^*$. One could additionally impose that the demand functions $\tilde{q}_{ft}$ come from the application of Roy's Identity to the indirect utility function (5.5). These demand functions equal a coefficient times the shadow budget plus a coefficient times the shadow budget raised to a power, where the coefficients are time-varying and depend on $z_i$. We do not impose that additional structure here; instead, we show in Section 7 that the estimated demand curves given by FELT are close to the PIGL shape restrictions.

Here, the time-dependence of $h_t$ is economically important; it is driven by the price-dependence of preferences and by the fact that prices are common to all households $i$ but vary over time $t$.
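The pieces of the parametric household model can be put together in a short numerical sketch. It rests on assumptions that go beyond the text: the closed form $\eta_{ij} \propto N_{ij}\,\phi_{ij}^{1/(1 - r)}$ is what (5.6) implies once the shares are required to sum to one; the demand function uses the shape described above (a coefficient times the shadow budget plus a coefficient times the shadow budget raised to the power $1 - r$); and the Pareto weights, coefficients, budget and function names are all hypothetical.

```python
import math


def resource_shares(pareto_weights, counts, r):
    # (5.6) plus adding-up: eta_j is proportional to N_j * phi_j ** (1 / (1 - r)).
    raw = {j: counts[j] * pareto_weights[j] ** (1.0 / (1.0 - r))
           for j in pareto_weights}
    total = sum(raw.values())
    return {j: value / total for j, value in raw.items()}


def demand(ystar, c_t, b_t, r):
    # h_t(Y*): demand as a function of the logged shadow budget Y*.
    shadow_budget = math.exp(ystar)
    return c_t * shadow_budget ** (1.0 - r) + b_t * shadow_budget


def invert_demand(q, c_t, b_t, r, lo=-20.0, hi=20.0, tol=1e-10):
    # Bisection works because demand is strictly increasing in Y*
    # when c_t, b_t and 1 - r are all positive.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if demand(mid, c_t, b_t, r) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r = 0.5
phi = {"m": 0.5, "f": 0.3, "c": 0.2}   # hypothetical Pareto weights
n = {"m": 1, "f": 1, "c": 2}           # one man, one woman, two children
eta = resource_shares(phi, n, r)

alpha = math.log(eta["f"] / n["f"])    # fixed effect: logged women's share
x_log, u = math.log(1200.0), 0.1       # logged budget and measurement error
ystar = alpha + x_log - u              # latent index as in (5.9)
q_obs = demand(ystar, c_t=0.8, b_t=0.05, r=r)
ystar_back = invert_demand(q_obs, c_t=0.8, b_t=0.05, r=r)
```

Subtracting the logged budget from `ystar_back` gives $\alpha_i - U_{it}$, which is the quantity whose conditional mean and covariance are used to learn about the distribution of logged resource shares.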
Further, the fixed effects $\alpha_i$ are economically meaningful parameters: they are equal to the logged women's resource shares in each household. The standard deviation of the logs is a common inequality measure, and the standard deviation of $\alpha_i$ is identified by FELT given strict monotonicity, as we show in Section 4. Further, the covariation of $\alpha_i$ with observed regressors is identified.

This model is useful to answer two important questions. First, are fixed effects (resource shares) fully explained by observed demographics and budgets, or do we need to appeal to unobserved heterogeneity? Second, are fixed effects correlated with log-budgets $X_{it}$? Some results concerning identification of collective household models in cross-sectional data rely on the assumption of independence.

We use data from the 2012 and 2015 Bangladesh Integrated Household Surveys. This data set is a household survey panel conducted jointly by the International Food Policy Research Institute and the World Bank. In this survey, a detailed questionnaire was administered to a sample of rural Bangladeshi households. This data set has two useful features for our purposes: 1) it includes person-level data on food intakes and household-level data on total household expenditures; and 2) it is a panel, following roughly 6000 households over two (nonconsecutive) years. The former allows us to use food as the assignable good to identify our collective household model parameters.

Individual demands are derived by the application of Roy's Identity to (5.5), and are: $q_{jt}(x, z) = c_{jt}(z)\, x^{1 - r(z)} + b_t(z)\, x$, where $c_{jt}(z) = -\nabla_p C_j(p_t, z) / B(p_t, z)^{r(z)}$ and $b_t(z) = -\nabla_p \ln B(p_t, z)$. This notation makes clear that we have time-varying demand functions, due to the fact that prices vary over time. In our application, prices in each period are not observed, so we allow the ($z$-dependent) functions $c_{jt}$ and $b_t$ to vary over time.
We require that the assignable good be normal, meaning that its demand function is globally increasing in $x$. This form for demand functions is globally increasing if $c_{jt}(z)$, $b_t(z)$ and $1 - r(z)$ are all positive.

There are 1920 households whose composition is unchanged between 2012 and 2015. Roughly half of these households have more than one adult man or more than one adult woman. To simplify the interpretation of estimated resource shares we focus on nuclear households. This leaves 871 nuclear households comprised of one man, one woman and 1 to 4 children, where children are defined to be 14 years old or younger.

The assignable good, $Y_{it}$, is annual consumption of food by the woman. The surveys contain 7-day recall data on household-level quantities (measured in kilograms) of food consumption in 7 categories: Cereals; Pulses; Oils; Vegetables; Fruits; Proteins; Drinks and Others. These consumption quantities include home-produced food, purchased food and gifts. They include both food consumed in the home (both cooked at home and prepared ready-to-eat food) and food consumed outside the home (at food carts or restaurants).
These weekly quantities are grossed up to annual consumption expenditure by multiplying by 52 and multiplying by estimated village-level unit-values (following Deaton (1997)).

Our household-level annual consumption, $x_{it}$, is the sum of total expenditure on, and imputed home-produced consumption of, the following categories of consumption: rent, food, clothing, footwear, bedding, non-rent housing expense, medical expenses, education, remittances, religious food and other offerings (jakat/fitra/daan/sodka/kurbani/milad/other), entertainment, fines and legal expenses, utensils, furniture, personal items, lights, fuel and lighting energy, personal care, cleaning, transport

That is, we exclude households with births, deaths, new members by marriage or adoption, etc. Although a full-commitment model can accommodate such changes in household composition, it is easier to think through the meaning of a person's resource share if the composition is held constant.

$X_{it} = \ln x_{it}$, the natural logarithm of annual consumption.

Our model is also conditioned on a set of time-invariant demographic variables $z_i$. We include several types of observed covariates in $z_i$ that may affect both preferences and resource shares: 1) the age in 2012 of the adult male; 2) the age in 2012 of the adult female; 3) the average age in 2012 of the children; 4) the average education in years of the adult male; 5) the average education in years of the adult female; 6) an indicator that the household has 2 children; 7) an indicator that the household has 3 or 4 children; and 8) the fraction of children that are girls.

For the first five of these demographic variables, in order to reduce the support of the regressors, we top- and bottom-code each variable so that values above (below) the 95th (5th) percentiles equal the 95th (5th) percentile values.
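The top- and bottom-coding step described above can be sketched as follows. The function name `top_bottom_code` and the simulated `ages` vector are hypothetical and used only for illustration; the percentiles are computed with NumPy's default interpolation rule, which may differ from the convention used by the authors.

```python
import numpy as np

def top_bottom_code(v, lower=5.0, upper=95.0):
    """Replace values below the lower (above the upper) percentile
    by that percentile value; values in between are left unchanged."""
    lo, hi = np.percentile(v, [lower, upper])
    return np.clip(v, lo, hi)

# Hypothetical ages, for illustration only.
rng = np.random.default_rng(0)
ages = rng.normal(35.0, 10.0, 500)
coded = top_bottom_code(ages)

lo, hi = np.percentile(ages, [5.0, 95.0])
inside = (ages >= lo) & (ages <= hi)
print(coded.min(), coded.max())
```

After coding, the support of the variable is the interval between the two percentile values, while observations already inside that interval are unaffected.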
For all seven of these variables, we standardize the location and scale so that their support is [0 ,

Following our identification results, estimation could be based on composite versions of the maximum score estimator or the conditional logit estimator (see Botosaru and Muris (2017)). Here, we instead follow a sieve GMM approach that facilitates the inclusion of a large vector of demographic conditioning variables $z$ and the imposition of strict monotonicity on the demand functions (that is, normality of the assignable good).

Since household membership is fixed for all households in our sample, age, number and gender composition are time-invariant by construction. However, education level of men and women are time-varying in roughly 20% of households. For our time-invariant education variables, we use the average education across the two observed years.

Denote the inverse demand functions $g_t(Y_{it}, z_{it}) = h_t^{-1}(\cdot, z_{it})$. Given (5.8), a two-period setting with $t = 1, 2$, and time-invariant demographics $z_{it} = z_i$, we have

$$\alpha_i + X_{it} - U_{it} = g_t(Y_{it}, z_i), \tag{7.1}$$

implying the conditional moment condition

$$E\left[g_2(Y_{i2}, z_i) - g_1(Y_{i1}, z_i) - \Delta X_{i2} \mid X_{i1}, X_{i2}, z_i\right] = 0, \tag{7.2}$$

where $\Delta X_{i2} = X_{i2} - X_{i1}$.

We provide a detailed description of our GMM estimator in the Appendix. Briefly, we approximate the inverse demand functions, $g_t$, $t = 1, 2$, using Bernstein polynomials. In the main text, we use 8th order Bernstein polynomials restricted so that estimated demand curves are strictly monotonically increasing. In the Appendix, we provide estimates for other orders.

We characterize several interesting features of the distribution of resource shares. Recall from Theorems 4 and 5 that identification of features of this distribution does not impose a normalization on assignable good demand functions, and only identifies the distribution of logged resource shares (fixed effects) up to location. Consequently, we only identify features of the resource share distribution up to a scale normalization.

Let $\hat g_{it} = \hat g_t(Y_{it}, z_i)$ equal the predicted values of the inverse demand functions at the observed data. Recall that $g_t(Y_{it}, z_i) = \alpha_i + X_{it} - U_{it}$, so we can think of $\hat g_{it} - X_{it}$ as a prediction of $\alpha_i - U_{it}$. We then compute the following summary statistics of interest, leaving the dependence of $\hat g_{it}$, $t = 1, 2$, on $z_i$ implicit:

1. an estimate of the standard deviation of $\alpha_i$ given by
$$\widehat{\mathrm{std}}(\alpha) = \sqrt{\widehat{\mathrm{cov}}\left((\hat g_{i1} - X_{i1}),\ (\hat g_{i2} - X_{i2})\right)},$$
where $\widehat{\mathrm{cov}}$ denotes the sample covariance. The standard deviation of logs is a standard (scale-free) inequality measure. So this gives a direct measure of inter-household variation in women's resource shares.

2. an estimate of the standard deviation of the projection error, $e_i$, of $\alpha_i$ on $\bar X_i = \tfrac{1}{2}(X_{i1} + X_{i2})$ and $Z_i$. Consider the projection
$$\alpha_i = \gamma_1 \bar X_i + \gamma_2 Z_i + e_i,$$
where $Z_i$ contains a constant. We are interested in the standard deviation of $e_i$.
To obtain this parameter, we compute estimators for $\gamma_1, \gamma_2$ from the pooled linear regression of $\hat g_{it} - X_{it}$ on $\bar X_i$ and $Z_i$. Call these estimators $\hat\gamma_1, \hat\gamma_2$. Then, as in $\widehat{\mathrm{std}}(\alpha)$ in (1), an estimate of the standard deviation of $e_i$ is given by:
$$\widehat{\mathrm{std}}(e_i) = \sqrt{\widehat{\mathrm{cov}}\left(\left(\hat g_{i1} - X_{i1} - \hat\gamma_1 \bar X_i - \hat\gamma_2 Z_i\right),\ \left(\hat g_{i2} - X_{i2} - \hat\gamma_1 \bar X_i - \hat\gamma_2 Z_i\right)\right)}.$$
This object measures the amount of variation in $\alpha_i$ that cannot be explained with observed regressors. If it is zero, then we do not really need to account for household-level unobserved heterogeneity in resource shares. If it is much larger than zero, then accounting for household-level unobserved heterogeneity is important.

3. an estimate of the standard deviation of $\alpha_i + X_{it}$ for $t = 1, 2$, computed as
$$\widehat{\mathrm{std}}(\alpha_i + X_{it}) = \sqrt{\widehat{\mathrm{var}}(\alpha_i) + \widehat{\mathrm{var}}(X_{it}) + 2\,\widehat{\mathrm{cov}}(\alpha_i, X_{it})},$$
where $\widehat{\mathrm{cov}}(\alpha_i, X_{i1}) = \mathrm{cov}(\hat g_{i1} - X_{i1}, X_{i1})$, $\widehat{\mathrm{cov}}(\alpha_i, X_{i2}) = \mathrm{cov}(\hat g_{i2} - X_{i2}, X_{i2})$, $\widehat{\mathrm{var}}(\alpha_i) = \left(\widehat{\mathrm{std}}(\alpha)\right)^2$, and $\widehat{\mathrm{var}}(X_{it})$, $t = 1, 2$, is observed in the data. Since $\alpha_i + X_{it}$ is a measure of the woman's shadow budget, $\widehat{\mathrm{std}}(\alpha_i + X_{it})$ is a measure of inter-household inequality in women's shadow budgets. This inequality measure is directly comparable to the standard deviation of $X_i$ (shown in Table 1), which measures inequality in household budgets.

4. an estimate of the covariance of $\alpha_i$ and $X_{it}$ for $t = 1, 2$, denoted $\widehat{\mathrm{cov}}(\alpha_i, X_{it})$. This object is of direct interest to applied researchers using cross-sectional data to identify resource shares. If this covariance is non-zero, then the independence of resource shares and household budgets is cast into doubt, and identification strategies based on this restriction are threatened.

Of these, the first 2 summary statistics are about the variance of fixed effects, and are computed using data from both years. Their validity requires serial independence of the measurement errors $U_{it}$. In contrast, the second 2 summary statistics are about the correlation of fixed effects with the household budget, and are computed at the year level. They are valid with stationary $U_{it}$, even in the presence of serial correlation.

We also consider the multivariate relationship between resource shares, household budgets and demographics. Recall that the fixed effect $\alpha_i$ is subject to a location normalization; this means that resource shares are subject to a scale normalization. So, we construct an estimate of the woman's resource share in each household as
$$\hat\eta_i = \exp\left(\tfrac{1}{2}\left[(\hat g_{i1} - X_{i1}) + (\hat g_{i2} - X_{i2})\right]\right),$$
normalized to have an average value of 0.33. We regress $\hat\eta_i$ on $\bar X_i$ and $Z_i$, and present the estimated regression coefficients, which may be directly compared with similar estimates in the cross-sectional literature.

The estimated coefficient on $X_i$ gives the conditional dependence of resource shares on household budgets, and therefore speaks to the reasonableness of the restriction that resource shares are independent of those budgets (an identifying restriction used in the cross-sectional literature). Finally, using the estimate of the variance of fixed effects, we construct an estimate of $R^2$ in the regression of resource shares on observed covariates. This provides an estimate of how much unobserved heterogeneity matters in the overall variation of resource shares.
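The four summary statistics and the resource-share estimate described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `summary_stats`, the simulated data, and all parameter values are hypothetical, and the simulated $\hat g_{it}$ are set equal to the true $\alpha_i + X_{it} - U_{it}$, as if the inverse demand functions had been estimated without error. The half-weight average inside the exponential and the mean normalization follow the definition of $\hat\eta_i$ as reconstructed from the text.

```python
import numpy as np

def sample_cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]

def summary_stats(g1, g2, X1, X2, Z, mean_share=0.33):
    a1, a2 = g1 - X1, g2 - X2                  # predictions of alpha_i - U_it
    std_alpha = np.sqrt(sample_cov(a1, a2))    # item 1
    # Item 2: pooled regression of g_it - X_it on a constant, Xbar_i and Z_i.
    Xbar = 0.5 * (X1 + X2)
    W = np.column_stack([np.ones_like(Xbar), Xbar, Z])
    gamma, *_ = np.linalg.lstsq(np.vstack([W, W]),
                                np.concatenate([a1, a2]), rcond=None)
    std_e = np.sqrt(sample_cov(a1 - W @ gamma, a2 - W @ gamma))
    # Items 4 and 3, computed period by period.
    cov_aX = np.array([sample_cov(a1, X1), sample_cov(a2, X2)])
    var_X = np.array([X1.var(ddof=1), X2.var(ddof=1)])
    std_shadow = np.sqrt(std_alpha ** 2 + var_X + 2.0 * cov_aX)
    # Resource-share estimate, rescaled to the chosen mean.
    eta_hat = np.exp(0.5 * (a1 + a2))
    eta_hat *= mean_share / eta_hat.mean()
    return {"std_alpha": std_alpha, "std_e": std_e, "cov_alpha_X": cov_aX,
            "std_shadow": std_shadow, "eta_hat": eta_hat}

# Hypothetical data-generating process, for illustration only.
rng = np.random.default_rng(1)
n = 20000
X1 = rng.normal(11.0, 0.5, n)
X2 = X1 + rng.normal(0.0, 0.2, n)
Z = rng.uniform(0.0, 1.0, n)
e = rng.normal(0.0, 0.16, n)                    # unobserved heterogeneity
alpha = -1.0 - 0.3 * (0.5 * (X1 + X2) - 11.0) + 0.2 * Z + e
U1, U2 = rng.normal(0.0, 0.2, n), rng.normal(0.0, 0.2, n)  # serially independent
g1, g2 = alpha + X1 - U1, alpha + X2 - U2       # stand-in for estimated g_t

stats = summary_stats(g1, g2, X1, X2, Z)
print({k: v for k, v in stats.items() if k != "eta_hat"})
```

The cross-period covariance is what removes the measurement error: $U_{i1}$ and $U_{i2}$ are independent in this sketch, so the covariance of $\hat g_{i1} - X_{i1}$ and $\hat g_{i2} - X_{i2}$ isolates the variance of $\alpha_i$, which is why the first two statistics rely on serial independence of $U_{it}$.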
Figure 8.1 shows our estimates of $h_1$ and $h_2$ (or, equivalently, of $g_1$ and $g_2$) for $K = 8$, for a family with two children with mean values, $\bar z$, of the other demographics. The figures have food quantities $q_t$ on the vertical axis and $\hat g_t(q_t)$ on the horizontal axis, so the horizontal axis is like a predicted logged household budget. Solid lines give the nonparametric estimates, and 95% pointwise confidence bands for the nonparametric estimates are denoted by dotted lines. Additionally, to provide reassurance that the PIGL utility model (which implies the FELT demand curves) fits the data adequately, we display the PIGL demand curve closest to the FELT estimates in each time period with dashed lines.

Figure 8.1: Estimated demand functions (horizontal axis: log budget; vertical axis: demand for food). Solid line is the nonparametric estimate, evaluated at the mean value of the demographics. The dotted lines indicate the 95% confidence interval. The dashed line is the PIGL closest to the nonparametric estimate. Left panel is for period 1, right panel is for period 2.

Note that since $g_t$ are identified only up to location (of $g_1$), we normalize the average of $\hat g_t$ to half the geometric mean of household budgets at $t = 1$, $x_1$. Because estimated nonparametric regression functions can be ill-behaved near their boundaries, we truncate the estimated functions at the 5th and 95th percentiles of the distribution of $q_t$ in each $t$. The key message from Figure 8.1 is that these estimated demand curves are somewhat nonlinear, estimated reasonably precisely, and not too far from PIGL. The estimated PIGL curvature parameter is $r(z) = 0.06$, which means that food demands are close to PIGLOG (as in Banks et al. (1997)).

Table 2 gives our summary statistics (items 1-4 above), with bootstrapped 95% confidence intervals, for our estimates with 8 Bernstein polynomials (see the Appendix for other lengths of the Bernstein sieve). In the lower panel, we provide estimated regression coefficients, also with bootstrapped 95% confidence intervals, where we regress estimated resource shares $\hat\eta_i$ on log-budgets $X_i$ and demographics $z_i$.

We compute these PIGL demand curves by nonlinear least squares estimation of a pooled $q_{it}$ on $\hat g_{it}$, where the demand curves have the form $q_{it} = c_t x^{1-r} + b_t x$. We estimate the model on a grid of 198 points, one for each interior percentile of $q_{it}$ in each period $t = 1, 2$.

| | Estimate | 95% CI lower | 95% CI upper |
|---|---|---|---|
| Variability of fixed effects $\alpha_i$ | | | |
| $\widehat{\mathrm{std}}(\alpha_i)$ | 0.2647 | 0.1518 | 0.3707 |
| $\widehat{\mathrm{std}}(e_i)$ | 0.1637 | 0.1262 | 0.1931 |
| $\widehat{\mathrm{cov}}(\alpha_i, X_{i1})$ | -0.0901 | -0.1292 | -0.0429 |
| $\widehat{\mathrm{cov}}(\alpha_i, X_{i2})$ | -0.1034 | -0.1418 | -0.0561 |
| $\widehat{\mathrm{std}}(\alpha_i + X_{i1})$ | 0.3537 | 0.2720 | 0.4605 |
| $\widehat{\mathrm{std}}(\alpha_i + X_{i2})$ | 0.3763 | 0.3010 | 0.4760 |
| Regression estimates | | | |
| $R^2$: $X_i$, $z_i$ on $\eta_i$ | | | |
| $X_i$ | -0.0452 | -0.0700 | -0.0196 |
| age–woman | -0.2513 | -0.4437 | -0.0545 |
| age–man | 0.2845 | 0.1389 | 0.4191 |
| 2 children | -0.0502 | -0.1694 | 0.0413 |
| 3 or 4 children | -0.1248 | -0.2407 | 0.0140 |
| avg age of children | -0.0095 | -0.1979 | 0.2151 |
| fraction girl children | 0.0136 | -0.0881 | 0.1246 |
| education–woman | -0.0325 | -0.1519 | 0.1198 |
| education–men | -0.1330 | -0.2589 | -0.0015 |

Table 1: Estimates.

Starting with the top panel of Table 1, the standard deviation of $\alpha_i$ is a measure of inter-household dispersion in women's resource shares. If this dispersion is very small, then variation in resource shares does not induce much inequality, and we can reasonably use the household-level income distribution as a proxy for person-level inequality. However, if the dispersion is large, then household-level measures of inequality leave out a lot of the action.

The estimated value is roughly 0.26, with a 95% confidence interval covering roughly 0.15 to 0.37. To get a sense of the magnitude for the standard deviation of logged resource shares, suppose that women's resource shares were lognormally distributed. Then our estimated standard deviation of 0.26 is consistent with 95% of the distribution of the resource shares lying in the range [0 . , .

$\alpha_i$ we can explain with observed covariates. The standard deviation of $e_i$ gives a measure of the unexplained variation, and gives us an idea of whether household-level unobserved heterogeneity is an important feature of the data. If the standard deviation of $e_i$
is very small, then fixed effects are not needed: conditioning on observed covariates would be sufficient. Our estimate of the standard deviation of the unexplained variation in $\alpha_i$ is about 0.16. This is large relative to the overall estimated standard deviation of 0.26, and suggests that accounting for household-level unobserved heterogeneity is quite important.

The next two rows give the covariance of $\alpha_i$ and $X_{it}$. Here, we see that log resource shares $\alpha_i$ strongly and statistically significantly negatively covary with observed household budgets (the implied correlation coefficients are close to − .

$\alpha_i$ (which corresponds to a scale normalization of shadow budgets). The estimated standard deviations are 0.35 and 0.38 in the two periods, respectively. We can compare these with the standard deviation of log-budgets, reported in Table 1, of 0.49 and 0.53. The point estimates suggest that there is less inequality in women's shadow budgets than in household budgets. Although the confidence intervals are large, the test of the hypothesis that the standard deviation of log-budgets equals the standard deviation of log-shadow budgets rejects in both years.

Thus, if we take these results at face value, there is less consumption inequality among women than household-level analysis would suggest. However, another implication of this is that there is more gender inequality than household-level data would suggest. The reason is that household-level analysis of gender inequality pins gender inequality on over-representation of one gender in poorer households. In our data, all households have 1 man and 1 woman, so household-level analysis of gender inequality would show zero gender inequality. But, because women in richer households have smaller resource shares, this induces gender inequality even in these data.

Finding correlation between $\alpha_i$ and household budgets is not sufficient to invalidate previous identification strategies for cross-sectional settings that rely on independence between resource shares and household budgets. The reason is that the independence

For $H_0: \mathrm{Var}(X_{it}) - \mathrm{Var}(\alpha_i + X_{it}) = 0$, we have the following estimated test statistics and (confidence intervals). Period 1: 0.110 (0.0236, 0.165); Period 2: 0.137 (0.0527, 0.191).

0.33) estimated resource shares $\hat\eta_i$ on log-budgets $X_i$ and other covariates $z_i$.

The figure below shows the scatterplot of predicted resource shares versus the log household budget. Here, we see a lot of variation in resource shares, and it is clearly correlated with household budgets. The overall variation here provides an estimate of the explained sum of squares in an infeasible regression of true resource shares $\eta_i$ on $X_i$ and $z_i$. We may construct an estimate of the total sum of squares of resource shares from our estimate of the standard deviation of $\alpha_i$. This yields an estimate of $R^2$ in the infeasible regression, which we interpret as the fraction of variation in resource shares explained by observables.

In the first row of the bottom panel, we see that observed variables explain roughly half the variation in resource shares (the estimate of $R^2$ is 0 .

This artificial regression is infeasible because we observe (through $g_{it}$) a prediction of $\alpha_i - U_{it}$, not of $\alpha_i$ itself. However, because we have an estimate of the variance of $\alpha_i$, we can construct an estimate of the variance of $\eta_i$ (subject to the scale normalization that it has a mean of 0.33).

$g_{it}$, which is a prediction of $\eta_i u_{it}$. Since $u_{it}$ are uncorrelated with $X_i$ and $z$ by assumption, the explained sum of squares from this regression applies to $\eta_i$, and we can use it to form an estimate of $R^2$, which we report, along with a bootstrapped confidence interval.

$X_i$. The estimated coefficient is $-0.045$ and is statistically significantly different from 0. This means that, even after conditioning on other covariates (many of which are highly correlated with the budget), we still see a significant relationship between resource shares and household budgets.

However, the magnitude of this effect is small. Conditional on $z_i$, the standard deviation of $X_{it}$ is 0.43 in year 1 and 0.47 in year 2. Thus, comparing two households with identical $z$ but which are one standard deviation apart in terms of the household budget, we would expect the woman in the poorer household to have a resource share 2 percentage points higher than the woman in the richer household. Thus, the bulk of the variation that makes the standard deviation of women's shadow budgets smaller than that of household budgets is not running through the dependence of resource shares on household budgets, but rather through the dependence of resource shares on other covariates that are correlated with household budgets.

We get a very precise estimate of the conditional dependence of resource shares on household budgets. Overall, then, we see that women's resource shares are statistically significantly correlated with household budgets, even conditional on other observed characteristics. But the estimated difference in resource shares at different household budgets is quite small. So, we take this as evidence that the identifying restrictions used by Dunbar et al. (2013) (and Dunbar et al. (2019)) may be false, though perhaps not very false. It does suggest that alternative identifying restrictions, such as those developed here with a panel data model, may be useful.

The rows of Table 1 give several other coefficients that are comparable to other estimates in the literature. Calvi (2019) finds that women's resource shares in India decline with the age of the woman. In these Bangladeshi data, we find evidence that women's resource shares are strongly negatively correlated with the age of women and positively correlated with the age of men.

Dunbar et al. (2013) find that women's resource shares in Malawi decline with the number of children. Here, we also see that pattern: households with 2 children have women's resource shares 5 percentage points less than households with 1 child; households with 3 or 4 children have resource shares 12 percentage points less. Dunbar et al.
(2013) also find that Malawian women's resource shares are higher in households with girls than households with boys. We do not see evidence of this in rural Bangladesh: the estimated coefficient on the fraction of children that are girls is statistically insignificantly different from 0.

In the Appendix, we also provide estimates analogous to Table 1 using a different assignable good: clothing. Under the model, using different assignable goods should yield the same estimates of resource shares. This is roughly what we find in our estimates using women's clothing.

Our estimates use 8th order Bernstein polynomials to approximate the inverse demand functions. In Appendix B, we present estimates using Bernstein polynomials of order $K = 1, \ldots, 10$ and show that our finite-dimensional parameter estimates have roughly the same value for $K \geq$

References
Abrevaya, J. (1999). Leapfrog estimation of a fixed-effects model with unknown transformation of the dependent variable. Journal of Econometrics, 93(2):203–228.
Abrevaya, J. (2000). Rank estimation of a generalized fixed-effects regression model. Journal of Econometrics, 95(1):1–23.
Abrevaya, J. and Muris, C. (2020). Interval censored regression with fixed effects. Journal of Applied Econometrics, 35(2):198–216.
Aguirregabiria, V., Gu, J., and Luo, Y. (forthcoming). Sufficient statistics for unobserved heterogeneity in dynamic structural logit models. Journal of Econometrics.
Ai, C. and Gan, L. (2010). An alternative root-n consistent estimator for panel data binary choice models. Journal of Econometrics, 157(1):93–100.
Altonji, J. G. and Matzkin, R. L. (2005). Cross section and panel data estimators for nonseparable models with endogenous regressors. Econometrica, 73(4):1053–1102.

Food is a plausible assignable good (because if one person eats it, nobody else can), but it may not be non-shareable (because there may be scale economies in cooking). In contrast, clothing may be plausibly non-shareable, but it may not be assignable (because, e.g., mothers and daughters might wear each others' clothes). See the Appendix for details on clothing estimates.
Econometrica, 67:1341–1383.
Andrews, D. W. (2000). Inconsistency of the bootstrap when a parameter is on the boundary of the parameter space. Econometrica, 68:399–405.
Arellano, M. (2003). Panel Data Econometrics. Oxford University Press.
Arellano, M. and Bonhomme, S. (2009). Robust priors in nonlinear panel data models. Econometrica, 77(2):489–536.
Arellano, M. and Bonhomme, S. (2011). Nonlinear panel data analysis. Annual Review of Economics, 3(1):395–424.
Arellano, M. and Bonhomme, S. (2012). Identifying distributional characteristics in random coefficients panel data models. Review of Economic Studies, 79(3):987–1020.
Arellano, M. and Hahn, J. (2007). Understanding bias in nonlinear panel models: Some recent developments. In Blundell, R., Newey, W., and Persson, T., editors, Advances in Economics and Econometrics, pages 381–409.
Arellano, M. and Honoré, B. (2001). Panel data models: Recent developments. In Heckman, J. and Leamer, E., editors, Handbook of Econometrics, volume 5, chapter 53, pages 3219–3296. Elsevier.
Aristodemou, E. (forthcoming). Semiparametric identification in panel data discrete response models. Journal of Econometrics.
Athey, S. and Imbens, G. (2006). Identification and inference in nonlinear difference-in-differences model. Econometrica, 74(2):431–497.
Baetschmann, G., Staub, K. E., and Winkelmann, R. (2015). Consistent estimation of the fixed effects ordered logit model. Journal of the Royal Statistical Society A, 178(3):685–703.
Banks, J., Blundell, R., and Lewbel, A. (1997). Quadratic Engel curves and consumer demand. Review of Economics and Statistics, 79(4):527–539.
Bargain, O., Donni, O., and Kwenda, P. (2014). Intrahousehold distribution and poverty: Evidence from Côte d'Ivoire. Journal of Development Economics, 107:262–276.
Bargain, O., Lacroix, G., and Tiberti, L. (2018). Validating the collective model of household consumption using direct evidence on sharing. Partnership for Economic Policy Working Paper 2018-06.
Becker, G. S. (1962). Investment in human capital: A theoretical analysis. Journal of Political Economy, 70(5):9–49.
Berkson, J. (1950). Are there two regressions? Journal of the American Statistical Association, 45(250):164–180.
Bester, C. A. and Hansen, C. (2009). Identification of marginal effects in a nonparametric correlated random effects model. Journal of Business and Economic Statistics, 27(2):235–250.
Blackorby, C., Bossert, W., and Donaldson, D. (2005). Population Issues in Social Choice Theory, Welfare Economics, and Ethics. Cambridge University Press.
Bonhomme, S. (2012). Functional differencing. Econometrica, 80(4):1337–1385.
Botosaru, I. and Muris, C. (2017). Binarization for panel models with fixed effects. cemmap working paper CWP31/17.
Bourguignon, F., Browning, M., Chiappori, P. A., and Lechene, V. (1993). Intra household allocation of consumption: A model and some evidence from French data. Annales d'Économie et de Statistique, (29):137–156.
Browning, M. and Chiappori, P. A. (1998). Efficient intra-household allocations: a general characterization and empirical tests. Econometrica, 66(6):1241–1278.
Browning, M., Chiappori, P. A., and Lewbel, A. (2013). Estimating consumption economies of scale, adult equivalence scales, and household bargaining power. Review of Economic Studies, 80(4):1267–1303.
Calvi, R. (2019). Why are older women missing in India? The age profile of bargaining power and poverty. Journal of Political Economy. Forthcoming.
Chamberlain, G. (1980). Analysis of covariance with qualitative data. Review of Economic Studies, 47(1):225–238.
Chamberlain, G. (2010). Binary response models for panel data: Identification and information. Econometrica, 78(1):159–168.
Charlier, E., Melenberg, B., and van Soest, A. (2000). Estimation of a censored regression panel data model using conditional moment restrictions efficiently. Journal of Econometrics, 95:25–56.
Chen, S. (2002). Rank estimation of transformation models. Econometrica, 70(4):1683–1697.
Chen, S. (2010a). An integrated maximum score estimator for a generalized censored quantile regression model. Journal of Econometrics, 155(1):90–98.
Chen, S. (2010b). Non-parametric identification and estimation of truncated regression models. Review of Economic Studies, 77(1):127–153.
Chen, S. (2010c). Root-N-consistent estimation of fixed-effect panel data transformation models with censoring. Journal of Econometrics, 159(1):222–234.
Chen, S. (2012). Distribution-free estimation of the Box-Cox regression model with censoring. Econometric Theory, 28(3):680–695.
Chen, S., Khan, S., and Tang, X. (2019). Exclusion Restrictions in Dynamic Binary Choice Panel Data Models: Comment on "Semiparametric Binary Choice Panel Data Models Without Strictly Exogenous Regressors". Econometrica, 87(5):1781–1785.
Chen, S. and Zhou, X. (2012). Semiparametric estimation of a truncated regression model. Journal of Econometrics, 167(2):297–304.
Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. In Handbook of Econometrics, volume 6B, chapter 76, pages 5549–5632. Elsevier.
Cherchye, L., De Rock, B., Lewbel, A., and Vermeulen, F. (2015). Sharing rule identification for general collective consumption models. Econometrica, 83(5):2001–2041.
Chernozhukov, V., Fernández-Val, I., Hahn, J., and Newey, W. (2013). Average and quantile effects in nonseparable panel models. Econometrica, 81(2):535–580.
Chernozhukov, V., Fernández-Val, I., Hoderlein, S., Holzmann, H., and Newey, W. (2015). Nonparametric identification in panels using quantiles. Journal of Econometrics, 188(2):378–392.
Chernozhukov, V., Fernández-Val, I., and Weidner, M. (2018). Network and panel quantile effects via distribution regression. arXiv:1803.08154.
Chiappori, P., Komunjer, I., and Kristensen, D. (2015). Nonparametric identification and estimation of transformation models. Journal of Econometrics, 188:22–39.
Chiappori, P. A. (1988). Rational household labor supply. Econometrica, 56(1):63–90.
Chiappori, P. A. (1992). Collective labor supply and welfare. Journal of Political Economy, 100(3):437–467.
Chiappori, P. A. and Ekeland, I. (2006). The micro economics of group behavior: General characterization. Journal of Economic Theory, 130(1):1–26.
Chiappori, P. A. and Ekeland, I. (2009). The microeconomics of efficient group behavior: Identification. Econometrica, 77(3):763–799.
Chiappori, P. A. and Kim, J. H. (2017). A note on identifying heterogeneous sharing rules. Quantitative Economics, 8(1):201–218.
Chiappori, P.-A. and Mazzocco, M. (2017). Static and intertemporal household decisions. Journal of Economic Literature, 55(3):985–1045.
Das, M. and van Soest, A. (1999). A panel data model for subjective information on household income growth. Journal of Economic Behavior & Organization, 40(4):409–426.
Deaton, A. and Muellbauer, J. (1980). Economics and consumer behavior. University Press, Cambridge.
Deaton, A. S. (1997). The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Washington, D.C.: The World Bank.
Doksum, K. A. and Gasko, M. (1990). On a correspondence between models in binary regression analysis and in survival analysis. International Statistical Review, 58:243–252.
Dunbar, G. R., Lewbel, A., and Pendakur, K. (2013). Children's resources in collective households: identification, estimation, and an application to child poverty in Malawi. American Economic Review, 103(1):438–471.
Dunbar, G. R., Lewbel, A., and Pendakur, K. (2019). Identification of random resource shares in collective households without preference similarity restrictions. The Journal of Business and Economic Statistics.
Evdokimov, K. (2010). Identification and estimation of a nonparametric panel data model with unobserved heterogeneity. Working paper.
Evdokimov, K. (2011). Nonparametric identification of a nonlinear panel model with application to duration analysis with multiple spells. Working paper.
Fernández-Val, I. (2009). Fixed effects estimation of structural parameters and marginal effects in panel probit models. Journal of Econometrics, 150(1):71–85.
Fernández-Val, I. and Weidner, M. (2016). Individual and time effects in nonlinear panel data models with large N, T. Journal of Econometrics, 192(1):291–312.
Freyberger, J. (2018). Nonparametric panel data models with interactive fixed effects. The Review of Economic Studies, 85(3):1824–1851.
Hahn, J. and Newey, W. (2004). Jackknife and analytical bias reduction for nonlinear panel models. Econometrica, 72(4):1295–1319.
Hoderlein, S. and White, H. (2012). Nonparametric identification in nonseparable panel data models with generalized fixed effects. Journal of Econometrics, 168(2):300–314.
Honoré, B. (1992). Trimmed LAD and least squares estimation of truncated and censored regression models with fixed effects. Econometrica, 60(3):533–565.
Honoré, B. (1993). Identification results for duration models with multiple spells. Review of Economic Studies, 60(1):241–246.
Honoré, B. and Hu, L. (2004). Estimation of cross sectional and panel data censored regression models with endogeneity. Journal of Econometrics, 122(2):293–316.
Honoré, B. and Kyriazidou, E. (2000). Panel discrete choice models with lagged dependent variables. Econometrica, 68(4):839–874.
Honoré, B. and Lewbel, A. (2002). Semiparametric binary choice panel data models without strictly exogeneous regressors. Econometrica, 70(5):2053–2063.
Horowitz, J. L. (1996). Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64(1):103–137.
Horowitz, J. L. and Lee, S. (2004). Semiparametric estimation of a panel data proportional hazards model with fixed effects. Journal of Econometrics, 119(1):155–198.
Khan, S., Ponomareva, M., and Tamer, E. (2016). Identification of panel data models with endogenous censoring. Journal of Econometrics, 194(1):57–75.
Khan, S., Ponomareva, M., and Tamer, E. (2020). Identification of dynamic binary response models. Working paper.
Khan, S. and Tamer, E. (2007). Partial rank estimation of duration models with general forms of censoring. Journal of Econometrics, 136(1):251–280.
Lancaster, T. (2002). Orthogonal parameters and panel data. The Review of Economic Studies, 69(3):647–666.
Lee, S. (2008). Estimating panel data duration models with censored data. Econometric Theory, 24(5):1254–1276.
Lewbel, A. (2014). An overview of the special regressor method. In Oxford Handbook of Applied Nonparametric and Semiparametric Econometrics and Statistics, pages 38–62. Oxford University Press.
Lewbel, A. and Yang, T. (2016). Identifying the average treatment effect in ordered treatment models without unconfoundedness.
Journal of Econometrics , 195:1–22.Ligon, E. (1998). Risk sharing and information in village economies.
The Review ofEconomic Studies , 65(4):847–864.Lise, J. and Yamada, K. (2019). Household sharing and commitment: Evidence frompanel data on individual expenditures and time use.
The Review of EconomicStudies , 86(5):2184–2219. 45agnac, T. (2004). Panel binary variables and suficiency: Generaling conditionallogit.
Econometrica , 72(6):1859–1876.Manski, C. (1975). Maximum score estimation of the stochastic utility model ofchoice.
Journal of Econometrics , 3(3):205–228.Manski, C. (1985). Semiparametric analysis of discrete response: Asymptotic prop-erties of the maximum score estimator.
Journal of Econometrics , 27(3):313–333.Manski, C. F. (1987). Semiparametric analysis of random effects linear models frombinary panel data.
Econometrica , 55(2):357–362.Mazzocco, M. (2007). Household intertemporal behaviour: A collective characteriza-tion and a test of commitment.
The Review of Economic Studies , 74(3):857–895.Mazzocco, M., Ruiz, C., and Yamaguchi, S. (2014). Labor supply and householddynamics.
American Economic Review , 104(5):354–59.Menon, M., Pendakur, K., and Perali, F. (2012). On the expenditure-dependence ofchildren’s resource shares.
Economics Letters , 117(3):739–742.Muellbauer, J. (1975). Aggregation, income distribution and consumer demand.
TheReview of Economic Studies , 42(4):525–543.Muellbauer, J. (1976). Community preferences and the representative consumer.
Econometrica , (5):979–999.Muris, C. (2017). Estimation in the fixed effects ordered logit model.
The Review ofEconomics and Statistics , 99(3):465–477.Neyman, J. and Scott, E. (1948). Consistent estimates based on partially consistentobservations.
Econometrica , 16(1):1–32.Rasch, G. (1960).
Probabilistic Models for Some Intelligence and Attainment Tests .Copenhagen:Denmark Paedagogiske Institut.Ridder, G. (1990). The non-parametric identification of generalized acceleratedfailure-time models.
Review of Economic Studies , 57(2):167–181.Shi, X., Shum, M., and Song, W. (2018). Estimating semi-parametric panel multino-mial choice models using cyclic monotonicity.
Econometrica , 86(2):737–761.Sokullu, S. and Valente, C. (2019). Individual consumption in collective households:Identification using repeated observations with an application to progresa.van den Berg, G. J. (2001). Duration models: specification, identification and multipledurations. In Heckman, J. and Leamer, E., editors,
Handbook of Econometrics ,volume 5, chapter 55, pages 3381–3460. Elsevier.Vermeulen, F. (2002). Collective household models: principles and main results.
Journal of Economic Surveys , 16(4):533–564.Voena, A. (2015). Yours, mine and ours: Do divorce laws affect the intertemporalbehavior of married couples?
American Economic Review , 105(8):2295–2332.Vreyer, P. D. and Lambert, S. (2016). Intra-household inequalities and poverty insenegal. mimeo, Paris School of Economics.Wang, J. and Ghosh, S. (2012). Shape restricted nonparametric regression withbernstein polynomials.
Computational Statistics and Data Analysis , 56:2729–2741.Wilhelm, D. (2015). Identification and estimation of nonparametric panel data re-46ressions with measurement error. Cemmap working paper CWP34/15.47
ONLINE APPENDICES

A Proofs
A.1 Proof of Lemma (1)
Proof.
Define $\bar{D} = 1\{D_1(y_1) + D_2(y_2) = 1\}$. The proof consists in showing the following:

\begin{align}
&\operatorname{med}\big(D_2(y_2) - D_1(y_1) \mid X, \bar{D} = 1\big) \tag{A.1} \\
&= \operatorname{sgn}\big(P(D(y_1,y_2) = (0,1) \mid X, \bar{D} = 1) - P(D(y_1,y_2) = (1,0) \mid X, \bar{D} = 1)\big) \tag{A.2} \\
&= \operatorname{sgn}\left(\frac{P(D(y_1,y_2) = (0,1), \bar{D} = 1 \mid X)}{P(\bar{D} = 1 \mid X)} - \frac{P(D(y_1,y_2) = (1,0), \bar{D} = 1 \mid X)}{P(\bar{D} = 1 \mid X)}\right) \tag{A.3} \\
&= \operatorname{sgn}\big(P(D(y_1,y_2) = (0,1), \bar{D} = 1 \mid X) - P(D(y_1,y_2) = (1,0), \bar{D} = 1 \mid X)\big) \tag{A.4} \\
&= \operatorname{sgn}\big(P(D(y_1,y_2) = (0,1) \mid X) - P(D(y_1,y_2) = (1,0) \mid X)\big) \tag{A.5} \\
&= \operatorname{sgn}\big(P(D_2(y_2) = 1 \mid X) - P(D_1(y_1) = 1 \mid X)\big) \tag{A.6} \\
&= \operatorname{sgn}\big(\Delta X \beta - \gamma(y_1,y_2)\big), \tag{A.7}
\end{align}

where (A.2) follows since the random variable $D_2(y_2) - D_1(y_1) \in \{-1, 1\}$ when $\bar{D} = 1$, which implies that

$$
\operatorname{med}\big(D_2(y_2) - D_1(y_1) \mid X, \bar{D} = 1\big) =
\begin{cases}
1 & \text{if } P(D(y_1,y_2) = (0,1) \mid X, \bar{D} = 1) > P(D(y_1,y_2) = (1,0) \mid X, \bar{D} = 1), \\
-1 & \text{if } P(D(y_1,y_2) = (0,1) \mid X, \bar{D} = 1) < P(D(y_1,y_2) = (1,0) \mid X, \bar{D} = 1);
\end{cases}
$$

(A.3) follows from the definition of conditional probability; (A.4) follows since the sign function is not affected by scaling both quantities by the same positive factor (the denominator); (A.5) follows by the definition of $\bar{D}$; and (A.6) follows since

$$
\begin{aligned}
P(D_2(y_2) = 1 \mid X) &= P(D(y_1,y_2) = (0,1) \mid X) + P(D(y_1,y_2) = (1,1) \mid X), \\
P(D_1(y_1) = 1 \mid X) &= P(D(y_1,y_2) = (1,0) \mid X) + P(D(y_1,y_2) = (1,1) \mid X).
\end{aligned}
$$

Finally, (A.7) follows from Assumption 2(ii), which implies that, e.g.,

$$
P(D_2(y_2) = 1 \mid \alpha, X) > P(D_1(y_1) = 1 \mid \alpha, X) \;\Leftrightarrow\; \alpha + X_2\beta - h_2^{-}(y_2) > \alpha + X_1\beta - h_1^{-}(y_1).
$$

Integrating both sides over the conditional distribution of $\alpha$ given $X$ obtains:

$$
P(D_2(y_2) = 1 \mid X) > P(D_1(y_1) = 1 \mid X) \;\Leftrightarrow\; X_2\beta - h_2^{-}(y_2) > X_1\beta - h_1^{-}(y_1) \;\Leftrightarrow\; \Delta X\beta - \gamma(y_1,y_2) > 0.
$$

The lemma now follows.
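As an informal numerical check of the step from (A.5) to (A.7), the sketch below evaluates both signs exactly for one hypothetical design. The logistic error CDF, the three-point distribution of the fixed effect, the function names, and all parameter values are illustrative assumptions and are not taken from the paper; any strictly increasing error CDF would serve the same purpose.

```python
import math

def logistic_cdf(u):
    """Strictly increasing CDF for U_t (an illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-u))

def sign(v):
    """Return -1, 0 or 1."""
    return (v > 0) - (v < 0)

def marginal_prob(index, alphas, weights):
    """P(D_t(y_t) = 1 | X): average of F(alpha + index) over alpha given X."""
    return sum(w * logistic_cdf(a + index) for a, w in zip(alphas, weights))

def lemma1_sides(x1b, x2b, h1_inv, h2_inv, alphas, weights):
    """Return (sgn(P(D_2=1|X) - P(D_1=1|X)), sgn(dX*beta - gamma))."""
    p1 = marginal_prob(x1b - h1_inv, alphas, weights)
    p2 = marginal_prob(x2b - h2_inv, alphas, weights)
    gamma = h2_inv - h1_inv  # gamma(y1, y2) = h_2^-(y_2) - h_1^-(y_1)
    return sign(p2 - p1), sign((x2b - x1b) - gamma)

# Hypothetical distribution of the fixed effect given X (three support points).
alphas, weights = [-2.0, 0.3, 1.5], [0.2, 0.5, 0.3]

# Grid of hypothetical values for X_1*beta, X_2*beta, h_1^-(y_1), h_2^-(y_2).
checks = [
    lemma1_sides(x1b, x2b, h1, h2, alphas, weights)
    for x1b in (-1.0, 0.0, 0.7)
    for x2b in (-0.5, 0.2, 1.1)
    for h1 in (-0.4, 0.6)
    for h2 in (-0.1, 0.9)
]
all_agree = all(lhs == rhs for lhs, rhs in checks)
```

The two signs coincide at every grid point because the fixed effect shifts both indices by the same amount, so averaging over its distribution cannot reverse the ordering of the two probabilities.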
A.2 Proof of Theorem 1
Proof.
Following Manski (1985), it suffices to show that for an arbitrary $\theta \in \Theta$, $\theta \neq \theta_0 \equiv \theta_0(y_1, y_2)$,

$$
P(W\theta < 0 \leq W\theta_0) + P(W\theta_0 < 0 \leq W\theta) > 0. \tag{A.8}
$$

Our proof follows very closely that in Manski (1985), with $W\theta$ taking the role of $xb$ and $W\theta_0$ taking the role of $x\beta$. However, our scale normalization is different.

Without loss of generality, let $X_K$ be the continuous regressor in Assumption 3(i). Separate $\Delta X = (\Delta X_{-K}, \Delta X_K)$, where the first component $\Delta X_{-K}$ represents all covariates except the $K$-th one. Similarly, for any $\theta = (\beta, \gamma) \in \Theta$, separate $\beta = (\beta_{-K}, \beta_K)$. Furthermore, denote $W_{-K} = (\Delta X_{-K}, -1)$ and $\theta_{-K} = (\beta_{-K}, \gamma)$.

Assume that the associated regression coefficient $\beta_{0,K} > 0$. The case $\beta_{0,K} < 0$ is handled analogously. Take any $\theta = (\beta, \gamma) \in \Theta$, $\theta \neq \theta_0$. As in Manski (1985, p. 318), consider three cases: (i) $\beta_K < 0$; (ii) $\beta_K = 0$; (iii) $\beta_K > 0$.

Cases (i) and (ii). $\beta_K \leq 0$. The proof is identical to that in Manski (1985), with $X\beta$ replaced by $W\theta$. The fact that we use a different scale normalization does not come into play.

Case (iii). $\beta_K > 0$. Note that

$$
\begin{aligned}
P(W\theta < 0 \leq W\theta_0) &= P\left(-\frac{W_{-K}\theta_{0,-K}}{\beta_{0,K}} < \Delta X_K < -\frac{W_{-K}\theta_{-K}}{\beta_K}\right), \\
P(W\theta_0 < 0 \leq W\theta) &= P\left(-\frac{W_{-K}\theta_{-K}}{\beta_K} < \Delta X_K < -\frac{W_{-K}\theta_{0,-K}}{\beta_{0,K}}\right).
\end{aligned}
$$

By Assumption 4, $\frac{\beta_{-K}}{\beta_K} \neq \frac{\beta_{0,-K}}{\beta_{0,K}}$, which shows that the first $K$ components of the vector $\theta$ are not a scalar multiple of the first $K$ components of the vector $\theta_0$. Therefore, $\theta$ is not a scalar multiple of $\theta_0$. In particular, $\frac{\theta_{0,-K}}{\beta_{0,K}} \neq \frac{\theta_{-K}}{\beta_K}$. Additionally, Assumption 3(ii) implies that

$$
P\left(\frac{W_{-K}\theta_{0,-K}}{\beta_{0,K}} \neq \frac{W_{-K}\theta_{-K}}{\beta_K}\right) > 0.
$$

Hence at least one of the two probabilities above is positive, so that (A.8) holds.
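To make criterion (A.8) concrete, the sketch below estimates the disagreement probability by simulation for one hypothetical design. The Gaussian regressor differences, the parameter values, and the function name are illustrative assumptions; the sketch does not impose the paper's scale normalization and is not part of the proof.

```python
import random

def disagreement_rate(theta, theta0, draws):
    """Share of draws W with W*theta < 0 <= W*theta0 or W*theta0 < 0 <= W*theta."""
    def index(w, t):
        return sum(wj * tj for wj, tj in zip(w, t))

    count = 0
    for w in draws:
        a, b = index(w, theta), index(w, theta0)
        if (a < 0 <= b) or (b < 0 <= a):
            count += 1
    return count / len(draws)

rng = random.Random(0)
# W = (dX_1, dX_2, -1): two continuous regressor differences and the constant -1.
draws = [(rng.gauss(0, 1), rng.gauss(0, 1), -1.0) for _ in range(5000)]

theta0 = (1.0, 0.5, 0.2)      # hypothetical true (beta_1, beta_2, gamma)
theta_alt = (1.0, -0.5, 0.2)  # candidate that is not a scalar multiple of theta0

rate_same = disagreement_rate(theta0, theta0, draws)    # no sign disagreement
rate_alt = disagreement_rate(theta_alt, theta0, draws)  # positive in this design
```

A candidate that produces the same index as the true parameter never disagrees in sign, whereas the non-proportional candidate disagrees on a non-negligible share of draws, which is the content of (A.8).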
A.3 Proof of Theorem 2
Proof.
Under Assumption 5, $h_1^{-}(y_0) = 0$. Using the pair $(y_0, y_2)$ for binarization thus obtains identification of

$$
\gamma(y_0, y_2) = h_2^{-}(y_2) - h_1^{-}(y_0) = h_2^{-}(y_2).
$$

By varying $y_2 \in \mathcal{Y}$, we identify the function $h_2^{-}$ from the binary choice models associated with $\{D(y_0, y_2) = (D_1(y_0), D_2(y_2)),\ y_2 \in \mathcal{Y}\}$.

The pairs $(y_0, y_2)$ and $(y_1, y_2)$ identify the difference

$$
\gamma(y_0, y_2) - \gamma(y_1, y_2) = \big(h_2^{-}(y_2) - h_1^{-}(y_0)\big) - \big(h_2^{-}(y_2) - h_1^{-}(y_1)\big) = h_1^{-}(y_1).
$$

By varying $y_1 \in \mathcal{Y}$ we therefore identify $h_1^{-}$.

Thus, the functions $h_1^{-}$ and $h_2^{-}$ are identified. Because of monotonicity of $h_t$ (Assumption 1), and because $\mathcal{Y}$ is known, $h_t^{-}$ contains all the information about the pre-image of $h_t$. Knowledge of the pre-image of a function is equivalent to knowledge of the function itself. Therefore, $h_t$ can be identified from $h_t^{-}$.

A.4 Proof of Theorem (3)
Proof.
For the panel data binary choice model with logit errors, we obtain

\begin{align}
&P\big(D_2(y_2) = 1 \mid \bar{D}(y_1,y_2) = 1, X, \alpha\big) \tag{A.9} \\
&= \frac{P\big(D_2(y_2) = 1, \bar{D}(y_1,y_2) = 1 \mid X, \alpha\big)}{P\big(\bar{D}(y_1,y_2) = 1 \mid X, \alpha\big)} \tag{A.10} \\
&= \frac{P\big(D_1(y_1) = 0, D_2(y_2) = 1 \mid X, \alpha\big)}{P\big(\bar{D}(y_1,y_2) = 1 \mid X, \alpha\big)} \tag{A.11} \\
&= \frac{P\big(D_1(y_1) = 0, D_2(y_2) = 1 \mid X, \alpha\big)}{P\big(D_1(y_1) = 0, D_2(y_2) = 1 \mid X, \alpha\big) + P\big(D_1(y_1) = 1, D_2(y_2) = 0 \mid X, \alpha\big)} \tag{A.12} \\
&= \frac{1}{1 + \dfrac{P(D_1(y_1) = 1, D_2(y_2) = 0 \mid X, \alpha)}{P(D_1(y_1) = 0, D_2(y_2) = 1 \mid X, \alpha)}} \tag{A.13} \\
&= \Lambda\big(\Delta X\beta - \gamma(y_1,y_2)\big), \tag{A.14}
\end{align}

where A.10 follows from the definition of a conditional probability; A.11 follows because $D_2 = 1$ and $\bar{D} = 1$ are equivalent to $D_1 = 0$ and $D_2 = 1$; and A.12 follows because $D_1 + D_2 = 1$ happens precisely when either $(D_1, D_2) = (1, 0)$ or $(D_1, D_2) = (0, 1)$. The ratio $\frac{P(D_1(y_1) = 1, D_2(y_2) = 0 \mid X, \alpha)}{P(D_1(y_1) = 0, D_2(y_2) = 1 \mid X, \alpha)}$ equals

\begin{align}
&\frac{P(D_1(y_1) = 1 \mid X, \alpha)\, P(D_2(y_2) = 0 \mid X, \alpha)}{P(D_1(y_1) = 0 \mid X, \alpha)\, P(D_2(y_2) = 1 \mid X, \alpha)} \tag{A.15} \\
&= \frac{\Lambda\big(\alpha + X_1\beta - h_1^{-}(y_1)\big)\big[1 - \Lambda\big(\alpha + X_2\beta - h_2^{-}(y_2)\big)\big]}{\big[1 - \Lambda\big(\alpha + X_1\beta - h_1^{-}(y_1)\big)\big]\Lambda\big(\alpha + X_2\beta - h_2^{-}(y_2)\big)} \tag{A.16} \\
&= \frac{\exp\big(\alpha + X_1\beta - h_1^{-}(y_1)\big)}{\exp\big(\alpha + X_2\beta - h_2^{-}(y_2)\big)} \tag{A.17} \\
&= \exp\big((X_1 - X_2)\beta - \big(h_1^{-}(y_1) - h_2^{-}(y_2)\big)\big), \tag{A.18}
\end{align}

where A.15 follows from serial independence of $(U_1, U_2)$ conditional on $(X, \alpha)$; A.16 from the logit model specification; and A.17 follows from

$$
\Lambda(u) / (1 - \Lambda(u)) = \exp(u).
$$

Since A.18 equals $\exp\big(-(\Delta X\beta - \gamma(y_1,y_2))\big)$, substituting it into A.13 gives A.14. The discussion above implies that A.9 does not depend on $\alpha$. Hence,

$$
\begin{aligned}
p(X, y_1, y_2) &\equiv P\big(D_2(y_2) = 1 \mid \bar{D}(y_1,y_2) = 1, X\big) \\
&= \Lambda\big(\Delta X\beta - \gamma(y_1,y_2)\big) \\
&= \Lambda\big(W\theta(y_1,y_2)\big),
\end{aligned}
$$

and note that $p(X, y_1, y_2)$ is identified from the distribution of $(Y, X)$, which is assumed to be observed. Then

$$
\theta(y_1, y_2) = \big[E(W'W)\big]^{-1} E\big(W'\Lambda^{-1}(p(X, y_1, y_2))\big)
$$

by invertibility of $\Lambda$ and the full rank assumption on $E[W'W]$. This establishes identification of $\beta$ and $\gamma(y_1, y_2)$. The proof in Section A.3 applies, which shows the identification of $h_1$ and $h_2$.

A.5 Proof of Theorem 4
Proof.
(a) Note that, without Assumption 5, it follows immediately from the proof of Theorem 2 that we can only identify $\{h_t^{-}(y) - c,\ y \in \mathcal{Y},\ t = 1, 2\}$, i.e. we identify the functions $g_t(y)$, $t = 1, 2$.

Because the functions $g_1, g_2$ are identified, and because the distribution of $(Y, X)$ is observable, we can identify the distribution of the left hand side of the relation below:

$$
g_t(Y_t) - X_t\beta = h_t^{-}(Y_t) - X_t\beta - c = \alpha - U_t - c, \quad t = 1, 2.
$$

It follows that, for all $x \in \mathcal{X}$,

$$
\begin{aligned}
\mu(x) &= E[\alpha \mid X = x] \\
&= E[\alpha - U_t \mid X = x] + E[U_t \mid X = x] \\
&= E\big[h_t^{-}(Y_t) - X_t\beta \mid X = x\big] + m \\
&= E[g_t(Y_t) - X_t\beta \mid X = x] + c + m
\end{aligned}
$$

is identified up to the constants $c$ and $m$.

The difference in conditional means at any two values $x, x' \in \mathcal{X}$ is therefore identified and given by:

$$
\begin{aligned}
\mu(x) - \mu(x') &= \big(E[g_t(Y_t) - X_t\beta \mid X = x] + c + m\big) - \big(E[g_t(Y_t) - X_t\beta \mid X = x'] + c + m\big) \\
&= E[g_t(Y_t) - X_t\beta \mid X = x] - E[g_t(Y_t) - X_t\beta \mid X = x'].
\end{aligned}
$$

(b) To see that the conditional variance is identified, note that for all $x \in \mathcal{X}$,

$$
\begin{aligned}
&\operatorname{Cov}\big(g_1(Y_1) - X_1\beta,\ g_2(Y_2) - X_2\beta \mid X = x\big) \\
&= \operatorname{Cov}\big(h_1^{-}(Y_1) - X_1\beta - c,\ h_2^{-}(Y_2) - X_2\beta - c \mid X = x\big) \\
&= \operatorname{Cov}\big(\alpha - U_1 - c,\ \alpha - U_2 - c \mid X = x\big) \\
&= \operatorname{Var}(\alpha \mid X = x) - \operatorname{Cov}(\alpha, U_2 \mid X = x) - \operatorname{Cov}(\alpha, U_1 \mid X = x) + \operatorname{Cov}(U_1, U_2 \mid X = x) \\
&= \operatorname{Var}(\alpha \mid X = x) = \sigma_\alpha^2(x),
\end{aligned}
$$

where the first equality follows from the definition of $g_t$; the second from the model; the third equality follows from the linearity of the covariance; and the fourth equality uses Assumptions (4c) and (4d).

B GMM Estimator
The women's food demand equation (5.8) is a FELT model, conditional on observed covariates $z$. Denote the inverse demand functions $g_t(Y_{it}, z_{it}) = h_t^{-}(\cdot, z_{it})$. Given (5.8) and a two-period setting with $t = 1, 2$, and time-invariant demographics $z_{it} = z_i$, we have

$$
\alpha_i + X_{it} - U_{it} = g_t(Y_{it}, z_i), \tag{B.1}
$$

implying the conditional moment condition

$$
E\big[g_2(Y_{i2}, z_i) - g_1(Y_{i1}, z_i) - \Delta X_{it} \mid X_{i1}, X_{i2}, z_i\big] = 0. \tag{B.2}
$$

Sieve estimators. We approximate the inverse demand functions, $g_t$, $t = 1, 2$, using Bernstein polynomials. This allows us to impose monotonicity in a straightforward way, see, e.g., Wang and Ghosh (2012). Let $k = 0, \ldots, K$ index univariate Bernstein functions denoted as $B_k(\cdot, K)$, where $K$ is the degree of the Bernstein polynomial and where the Bernstein functions are given by:

$$
B_k(u, K) = \binom{K}{k} u^k (1 - u)^{K - k}, \quad u \in [0, 1].
$$

Let $l = 0, \ldots, L$ index the elements of $z_i$, and let the first (index 0) element of $z_i$ be a constant equal to 1, that is $z_i = [1, z_{i1}, \ldots, z_{iL}]$.

Our approximation to $g_t(Y_{it}, z_i)$ is given by:

$$
g_t(Y_{it}, z_i) \approx \sum_{k=0}^{K} \beta_{kt}(z_i) B_k(Y_{it}, K) \approx \sum_{l=0}^{L} \sum_{k=0}^{K} z_i^{(l)} \beta_{kt}^{(l)} B_k(Y_{it}, K).
$$

For example, when there are no covariates, $L = 0$, the expression above reduces to the standard Bernstein polynomial approximation $g_t(Y_{it}) \approx \sum_{k=0}^{K} \beta_{kt}^{(0)} B_k(Y_{it}, K)$. The Bernstein coefficients are linear functions of the demographics, and the dependence of the Bernstein coefficients on the demographics is allowed to vary with time. In this way, the relationship between the (nonlinear) demand $Y_{it}$ and the latent budget $Y_{it}^{*}$ depends on demographic characteristics and on prices, through $t$. Since Bernstein polynomials are defined on the unit interval, we normalize $Y_{it}$ to be uniform on $[0, 1]$.

(Footnote: We use this normalization for the estimation of the Bernstein coefficients, but we present our results in terms of untransformed $Y_{it}$. These results are obtained by applying the inverse transformation to the function estimated with transformed data.)

Unconditional moments. To form GMM estimators, we construct the following unconditional moments:

$$
E\left[\left(\sum_{l=0}^{L} \sum_{k=0}^{K} z_i^{(l)} \Big(\beta_{k2}^{(l)} B_k(Y_{i2}, K) - \beta_{k1}^{(l)} B_k(Y_{i1}, K)\Big) - \Delta X_{it}\right) B_{k'}(x_{it}, K)\, z_i^{(l')}\right] = 0
$$

for $k' = 0, \ldots, K$, $l' = 0, \ldots, L$, and $t = 1, 2$, and

$$
E\left[\left(\sum_{l=0}^{L} \sum_{k=0}^{K} z_i^{(l)} \Big(\beta_{k2}^{(l)} B_k(Y_{i2}, K) - \beta_{k1}^{(l)} B_k(Y_{i1}, K)\Big) - \Delta X_{it}\right) X_{it}\, z_i^{(l')}\right] = 0,
$$

for $l' = 0, \ldots, L$ and $t = 1, 2$. We include the second condition (where the logged household budget $X_{it}$ is exogenous) because we ultimately wish to consider the correlation of $\alpha_i$ and $X_{it}$. For a given order of the sieve approximation, $K$, the equations above are linear in the parameters. We impose monotonicity on $g_1$ and $g_2$ by imposing $\beta_{kt}(z_i) \geq \beta_{k-1,t}(z_i)$, for all $z_i$, for all $t$, and for $k \geq 2$. This results in a quadratic programming problem with linear inequality restrictions, which we implement in R using the quadprog package.

Degree of Bernstein polynomial. Implementing this method requires the selection of the degree of the Bernstein polynomial, $K$. While developing a formal selection rule for this parameter would be desirable, it is beyond the scope of the present paper. Nonetheless, we adopt an informal selection rule for the number of Bernstein basis functions (the smoothing parameter) based on the following observation. In our semiparametric setting, the estimators are known to have the same asymptotic distribution for a range of smoothing parameters (see, e.g., Chen (2007)). When the number of Bernstein basis functions is small, the bias dominates and the estimates exhibit a decreasing bias as the number of terms increases. On the other hand, when the number of basis functions is large, the statistical noise dominates. We implement our estimation method over a range of values of the smoothing parameter $K$, in search of a region where the estimates are not very sensitive to small variations in the smoothing parameter. We select the mid-point of that region, so our main results use $K = 8$. We present results for four values of $K$ in the Appendix. In the Appendix table, "X" denotes a case where an estimated variance is negative.

Confidence bands. We compute confidence bands via the nonparametric bootstrap, although we do not provide a formal justification for it in this setting. We use 1,000 bootstrap replications for all our results. We report pointwise 95% confidence bands. All reported estimates in the main text are bias-corrected using the bootstrap.
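The Bernstein sieve described in this appendix can be illustrated with a short sketch. It only evaluates the basis and checks the ordering restriction on the coefficients; it does not solve the quadratic program, which the text implements in R with quadprog. The function names and the coefficient values are hypothetical, and the check is applied to every adjacent pair of coefficients, which is an assumption of this sketch.

```python
from math import comb

def bernstein_basis(u, K):
    """Return [B_0(u, K), ..., B_K(u, K)] for u in [0, 1]."""
    return [comb(K, k) * u**k * (1 - u) ** (K - k) for k in range(K + 1)]

def sieve_value(u, coefs):
    """Evaluate sum_k coefs[k] * B_k(u, K) with K = len(coefs) - 1."""
    K = len(coefs) - 1
    return sum(c * b for c, b in zip(coefs, bernstein_basis(u, K)))

def is_nondecreasing(coefs):
    """Check the linear restrictions coefs[k] >= coefs[k - 1] for adjacent pairs."""
    return all(b >= a for a, b in zip(coefs, coefs[1:]))

# Hypothetical Bernstein coefficients for one period and no covariates (K = 4).
coefs = [-1.0, -0.4, -0.4, 0.3, 1.2]

# Evaluate the approximating function on a grid of normalized outcomes in [0, 1].
grid = [j / 50 for j in range(51)]
values = [sieve_value(u, coefs) for u in grid]
```

Because the derivative of a Bernstein polynomial is again a Bernstein polynomial whose coefficients are proportional to the first differences of the original coefficients, ordered coefficients are sufficient for a nondecreasing approximation. This is why monotonicity reduces to linear inequality restrictions.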
Estimates and parameters of interest. We present estimates for the demand functions, $\hat{h}_t(x, \bar{z}) = \hat{g}_t^{-}(\cdot, \bar{z})$ for $t = 1, 2$, where $\bar{z}$ represents a household with 1 child and has other observed characteristics that are the average of those of 1-child households. The functions $h_t$ (demand functions) are not of direct interest, but identification of the $h_t$ supports identification of moments of the distribution of fixed effects $\alpha_i$. In our context, $\alpha_i$ equals the log of the resource share of the woman in household $i$.

We characterize the following interesting features of the distribution of resource shares. Recall from Theorems 4 and 5 that identification of features of this distribution does not impose a normalization on assignable good demand functions, and only identifies the distribution of logged resource shares (fixed effects) up to location. Consequently, we only identify features of the resource share distribution up to a scale normalization.

Let $\hat{g}_{it} = \hat{g}_t(Y_{it}, z_i)$ equal the predicted values of the inverse demand functions at the observed data. Recall that $g_t(Y_{it}, z_i) = \alpha_i + X_{it} - U_{it}$, so we can think of $\hat{g}_{it} - X_{it}$ as a prediction of $\alpha_i - U_{it}$. We then compute the following summary statistics, leaving the dependence of $\hat{g}_{it}$, $t = 1, 2$, on $z_i$ implicit:

1. an estimate of the standard deviation of $\alpha_i$ given by
$$
\widehat{std}(\alpha) = \sqrt{\widehat{cov}\big((\hat{g}_{i1} - X_{i1}), (\hat{g}_{i2} - X_{i2})\big)},
$$
where $\widehat{cov}$ denotes the sample covariance. The standard deviation of logs is a standard (scale-free) inequality measure. So this gives a direct measure of inter-household variation in women's resource shares.

2. an estimate of the standard deviation of the projection error, $e_i$, of $\alpha_i$ on $\bar{X}_i = \frac{1}{2}(X_{i1} + X_{i2})$ and $Z_i$. Consider the projection
$$
\alpha_i = \gamma_1 \bar{X}_i + \gamma_2 Z_i + e_i,
$$
where $Z_i$ contains a constant. We are interested in the standard deviation of $e_i$. To obtain this parameter, we compute estimators for $\gamma_1, \gamma_2$ from the pooled linear regression of $\hat{g}_{it} - X_{it}$ on $\bar{X}_i$ and $Z_i$. Call these estimators $\hat{\gamma}_1, \hat{\gamma}_2$. Then, as in $\widehat{std}(\alpha)$ in (1), an estimate of the standard deviation of $e_i$ is given by:
$$
\widehat{std}(e_i) = \sqrt{\widehat{cov}\big((\hat{g}_{i1} - X_{i1} - \hat{\gamma}_1 \bar{X}_i - \hat{\gamma}_2 Z_i), (\hat{g}_{i2} - X_{i2} - \hat{\gamma}_1 \bar{X}_i - \hat{\gamma}_2 Z_i)\big)}.
$$
This object measures the amount of variation in $\alpha_i$ that cannot be explained with observed regressors. If it is zero, then we don't really need to account for household-level unobserved heterogeneity in resource shares. If it is much larger than zero, then accounting for household-level unobserved heterogeneity is important.

3. an estimate of the standard deviation of $\alpha_i + X_{it}$ for $t = 1, 2$, computed as
$$
\widehat{std}(\alpha_i + X_{it}) = \sqrt{\widehat{var}(\alpha_i) + \widehat{var}(X_{it}) + 2\,\widehat{cov}(\alpha_i, X_{it})},
$$
where
$$
\widehat{cov}(\alpha_i, X_{i1}) = cov(\hat{g}_{i1} - X_{i1}, X_{i1}), \qquad \widehat{cov}(\alpha_i, X_{i2}) = cov(\hat{g}_{i2} - X_{i2}, X_{i2}),
$$
and $\widehat{var}(\alpha_i) = \big(\widehat{std}(\alpha)\big)^2$, and $\widehat{var}(X_{it})$, $t = 1, 2$, is observed in the data. Since $\alpha_i + X_{it}$ is a measure of the woman's shadow budget, $\widehat{std}(\alpha_i + X_{it})$ is a measure of inter-household inequality in women's shadow budgets. This inequality measure is directly comparable to the standard deviation of $X_{it}$ (shown in Table 1), which measures inequality in household budgets.

4. an estimate of the covariance of $\alpha_i$ and $X_{it}$ for $t = 1, 2$, denoted $\widehat{cov}(\alpha_i, X_{it})$. This object is of direct interest to applied researchers using cross-sectional data to identify resource shares. If this covariance is non-zero, then the independence of resource shares and household budgets is cast into doubt, and identification strategies based on this restriction are threatened.

(Footnote: When the parameter is on the boundary and the bootstrap is not consistent, it is common in the literature to perform an $m$-out-of-$n$ bootstrap, see, e.g., Andrews (1999, 2000).)

Of these, the first 2 summary statistics are about the variance of fixed effects, and are computed using data from both years. Their validity requires serial independence of the measurement errors $U_{it}$. In contrast, the last 2 summary statistics are about the correlation of fixed effects with the household budget, and are computed at the year level. They are valid with stationary $U_{it}$, even in the presence of serial correlation.

We also consider the multivariate relationship between resource shares, household budgets and demographics. Recall that the fixed effect $\alpha_i$ is subject to a location normalization; this means that resource shares are subject to a scale normalization. So, we construct an estimate of the woman's resource share in each household as
$$
\hat{\eta}_i = \exp\Big(\tfrac{1}{2}(\hat{g}_{i1} - X_{i1}) + \tfrac{1}{2}(\hat{g}_{i2} - X_{i2})\Big),
$$
normalized so that its average equals a fixed constant. We then regress $\hat{\eta}_i$ on $\bar{X}_i$ and $Z_i$, and present the estimated regression coefficients, which may be directly compared with similar estimates in the cross-sectional literature.

The estimated coefficient on $\bar{X}_i$ gives the conditional dependence of resource shares on household budgets, and therefore speaks to the reasonableness of the restriction that resource shares are independent of those budgets (an identifying restriction used in the cross-sectional literature). Finally, using the estimate of the variance of fixed effects, we construct an estimate of $R^2$ in the regression of resource shares on observed covariates.
This $R^2$ provides an estimate of how much unobserved heterogeneity matters in the overall variation of resource shares.

We also provide estimates that use women's clothing as the assignable non-shareable good. Here, clothing expenditure is equal to four times the reported three-month recall expenditure on the following female-specific clothing items: Saree; Blouse/petticoat; Salwar kameez; and Orna. We note that although clothing is a semi-durable good, there are 3 years between the waves of the panel. Consequently, we do not think that the demands across periods will be strongly correlated due to the durability of clothing purchased.

Appendix table "Additional Estimates" gives results for four values of $K$ for food (top panel) and clothing (bottom panel). In the table, "X" denotes a case where an estimated variance is negative.

C Descriptive Statistics
Table 2: Descriptive Statistics. Column groups: raw data (Mean, Std Dev, Min, Max); top- and bottom-coded and normalized (Mean, Std Dev). Rows: $X_{i1}$, $X_{i2}$, $\Delta X_i$, $Y_{i1}$, $Y_{i2}$. [Numerical entries not recoverable.]

Additional estimates

Table: Summary statistics for log resource shares and their projection on demographics and budget. Columns: Estimate and two quantiles, q(.) and q(.), repeated for each of four values of $K$. Panels: Variability in unobservables: Food; Regression coefficients: Food; Variability in unobservables: Cloth; Regression coefficients: Cloth. Rows in the variability panels: sd a, sd e, sd ax (two rows), cov ax (two rows). Rows in the regression panels: $R^2$ ($X_i$, $z_i$ on $\eta_i$), $X_i$, age women, age men, children, children p, age children, frac girl, edu women, edu men. [Numerical entries not recoverable; "X" marks a negative estimated variance.]