Identifying and Estimating Perceived Returns to Binary Investments ∗
Clint Harris †
January 27, 2021
Abstract
I describe a method for estimating agents’ perceived returns to investments that relies on cross-sectional data containing binary choices and prices, where prices may be imperfectly known to agents. This method identifies the scale of perceived returns by assuming agent knowledge of an identity that relates profits, revenues, and costs rather than by eliciting or assuming agent beliefs about structural parameters that are estimated by researchers. With this assumption, modest adjustments to standard binary choice estimators enable consistent estimation of perceived returns when using price instruments that are uncorrelated with unobserved determinants of agents’ price misperceptions as well as other unobserved determinants of their perceived returns. I demonstrate the method, and the importance of using price variation that is known to agents, in a series of data simulations.

JEL Codes: C31, D84, D61
Keywords: Biased Beliefs, Returns to Investments, Revealed Preference, Subsidies, Taxes

∗ I thank Mary Kate Batistich, Trevor Gallen, Kendall Kennedy, Soojin Kim, Dan Millimet, Kevin Mumford, Victoria Prowse, and Miguel Sarzosa as well as seminar participants at Case Western Reserve University, The European Association of Labor Economists Meeting, Kansas State University, The Midwest Economics Association Meeting, The National Tax Association Meeting, Purdue University, The Southern Economic Association Meeting, and The US Census Bureau for helpful comments.
† Wisconsin Institute for Discovery, University of Wisconsin-Madison, 330 N Orchard Street, Madison, WI 53715 USA.

1. Introduction

In this paper I describe a method for estimating distributions of perceived private returns to binary investments. These structural perceived returns estimates are of distributions of agents’ compensating variation associated with a binary choice that condition on observables.
This method complements program evaluation methods that estimate effects of specific policy shocks on binary choices by allowing for predictions of counterfactual policies that differ from past policies in magnitude or targeted population. For instance, Harris (2020) applies this method to estimate perceived returns to college, allowing for counterfactual predictions of targeted college attendance subsidies (and taxes) for diverse groups of individuals. Identification is achieved by assuming common agent knowledge of an identity that relates prices to returns, while also using instruments that are de facto known to agents, in the sense that they shift perceived prices the same amount that they shift actual prices, in addition to satisfying the traditional exclusion restriction.

This paper presents a special case of a general method for identifying the scale of binary choice models by assuming agent beliefs about a variable observed by the researcher and agent beliefs about the mapping between that variable and the perceived return latent variable. Existing work that makes such assumptions includes Cunha, Heckman, and Navarro (2005), who assume agent knowledge of their lifetime pecuniary return to college insofar as it is attributable to explanatory variables observed by the researcher, and Dickstein and Morales (2018), who assume partial agent knowledge of trade revenues and agent knowledge of an estimated demand elasticity parameter. The present paper assumes partial agent knowledge of prices in the sense of Dickstein and Morales (2018) while assuming agent knowledge that prices causally decrease returns dollar for dollar in accordance with an identity that relates profits, revenues, and costs. The use of this identity imposes a theoretical restriction on a structural parameter (the coefficient on price in the binary choice latent variable equation) without requiring its estimation by researchers or agents.
Avoiding the assumption that agents obtain the same estimate of a parameter as researchers improves robustness to the concerns articulated by Manski (1993, 2004) about the pitfalls of making incorrect assumptions on agents’ knowledge of structural models.

The method in the present paper avoids assuming rational expectations on any model objects, instead assuming that the variation in prices associated with chosen instruments is known to agents regardless of whether agents are correct about prices on average. This makes it particularly attractive in applications where rational expectations assumptions in general are suspect, but the researcher can credibly argue that a particular price shock is nonetheless known to agents. Considering the example of college attendance, exogenous policy shocks may shift prices more than they shift perceived prices, as with Pell grants (Hansen, 1983; Kane, 1995); they may shift perceived prices more than they shift prices, as with the Michigan HAIL policy (Dynarski, Libassi, Michelmore, and Owen, 2018); or they may shift prices and perceived prices the same amount, as with the Social Security Student Benefit termination (Dynarski, 2003). Of these sources of variation, only the last would be appropriate for estimating the model presented in this paper. In addition to college attendance, attractive targets for this method include healthcare, home purchases, R&D, and export decisions due to the substantial information frictions on prices in these settings.

In addition to considerations regarding the relative credibility of different assumptions on agent beliefs, applications also differ in data availability. The method described in this paper relies on cross-sectional data that contains binary choices on investments and prices associated with those investments.
Methods that rely on rational expectations on ex post returns to investments require longitudinal data (without requiring data on prices), as in Cunha, Heckman, and Navarro (2005) and related research surveyed by Cunha and Heckman (2007). Meanwhile, inferring beliefs by eliciting them directly from agents requires surveys that contain this information, as in Jensen (2010), Wiswall and Zafar (2015), and Bleemer and Zafar (2018). The method described in this paper is thus useful in settings where there is no clear winner in terms of assumption validity, but where longitudinal data and data on agent perceptions are unavailable.

I describe how to estimate perceived returns when prices are known to agents and exogenous, and how to overcome violations of these conditions using instrumental variables. I compare performance of these methods with valid and invalid instruments across data generating processes that differ in the assumptions on agent knowledge of prices. In the most realistic settings, methods that make no use of instruments, or which use instruments that are correlated with agent misperceptions, perform poorly compared to those that use instruments that are de facto known to agents.

The plan of the rest of this paper is as follows. Section 2 introduces the empirical model. Section 3 describes the econometric strategy and the assumptions required for identification. Section 4 evaluates the robustness of various methods and instruments to various empirical challenges in a series of simulated data exercises. Section 5 concludes.
2. Model
I assume that agents choose whether to make an investment based on their beliefs about discounted net incomes and costs associated with choices, which I present as a two-sector generalized Roy (1951) model. Agents choose to select the investment, $S_i = 1$, or to not do so, $S_i = 0$, which is observed by the researcher. I define $\tilde{Y}_{1,i}$ as agent $i$’s perceived discounted present value of lifetime income associated with choosing the investment and $\tilde{Y}_{0,i}$ as their perceived discounted present value of lifetime income associated with not doing so. I further define $\tilde{C}_i$ as their perceived net present value cost of making the investment, which includes prices paid and nonpecuniary costs expressed in monetary values. Unlike common applications of the Roy model, none of $\tilde{Y}_{1,i}$, $\tilde{Y}_{0,i}$, and $\tilde{C}_i$ are observed by the researcher for any individual because they represent agent perceptions.

I express the perceived potential incomes and costs for individual $i$ with the following linear-in-parameters production functions,

$$
\begin{aligned}
\tilde{Y}_{1,i} &= X_i\beta_1 + \tilde{\epsilon}_{1,i} \\
\tilde{Y}_{0,i} &= X_i\beta_0 + \tilde{\epsilon}_{0,i} \\
\tilde{C}_i &= X_i\beta_C + \widetilde{Price}_i + \tilde{\epsilon}_{Ci}.
\end{aligned} \tag{1}
$$

Here, $X_i$ are variables observed by the researcher that determine potential incomes and costs. The parameters $\{\beta\}$ capture the extent to which these variables drive beliefs about potential outcomes regardless of whether they are known to agents. $\widetilde{Price}_i$ is the agent’s perceived price for the investment, which is known to agents but not to researchers. Importantly, it is assumed to only affect costs and has a coefficient that is normalized to unity. Finally, $\tilde{\epsilon}_{1,i}$, $\tilde{\epsilon}_{0,i}$, and $\tilde{\epsilon}_{Ci}$ represent idiosyncratic perceived returns to investment that are known to agents but not to the researcher.

I assume that agents maximize expected wealth independently of how they consume it, as in the case of perfect credit markets.
It follows that the perceived net return/profit, $\tilde{\pi}_i$, is sufficient to determine agents’ decisions in accordance with the rule

$$
S_i = \begin{cases} 1, & \text{if } \tilde{\pi}_i \geq 0, \\ 0, & \text{otherwise.} \end{cases} \tag{2}
$$

I further assume that the definition of profit, $\pi_i \equiv Revenue_i - Cost_i$, is known to agents in the sense that it holds for their beliefs as well, such that

$$
\tilde{\pi}_i = \widetilde{Revenue}_i - \widetilde{Cost}_i = \tilde{Y}_{1,i} - (\tilde{Y}_{0,i} + \tilde{C}_i), \tag{3}
$$

where $\widetilde{Revenue}_i$ denotes the agent’s perceived income and $\widetilde{Cost}_i$ denotes the agent’s perceived opportunity cost, which includes $\tilde{Y}_{0,i}$. It follows that the agent’s decision rule can be expressed in terms of potential outcomes as

$$
S_i = \begin{cases} 1, & \text{if } \tilde{Y}_{1,i} - \tilde{Y}_{0,i} - \tilde{C}_i \geq 0, \\ 0, & \text{otherwise.} \end{cases} \tag{4}
$$

Defining the net marginal effects $\beta \equiv \beta_1 - \beta_0 - \beta_C$ and the net idiosyncratic component of perceived outcomes $\tilde{\epsilon}_i \equiv \tilde{\epsilon}_{1,i} - \tilde{\epsilon}_{0,i} - \tilde{\epsilon}_{Ci}$, we can combine (1) with (4) to write the perceived return latent variable as

$$
\tilde{\pi}_i = X_i\beta - \widetilde{Price}_i + \tilde{\epsilon}_i. \tag{5}
$$

Importantly, the assumptions given result in the latent variable being linear in perceived prices, with a marginal effect ($-1$) that is known to both agents and the researcher. The expression of perceived returns as a latent variable in a binary choice problem with a single known marginal effect is the starting point of the estimation procedures described below.

Footnote: I avoid denoting agents’ beliefs with conditional expectations over realized values, as is common in the literature, to avoid the implication of rational expectations which follows from the law of iterated expectations.
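To make the decision rule concrete, the following is a minimal Python sketch that draws perceived returns from (5) and applies rule (2). It is an illustration only: the function name `simulate_choices`, the choice of $X_i$ as a constant, and the standard normal distributions for perceived prices and $\tilde{\epsilon}_i$ are assumptions made for the example, not part of the model above.

```python
import random

def simulate_choices(n, beta=1.0, seed=0):
    """Simulate decision rule (2) from latent perceived returns (5).

    Assumes X_i is a constant and uses hypothetical standard normal draws
    for perceived prices and idiosyncratic perceived returns.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        perceived_price = rng.gauss(0.0, 1.0)  # hypothetical distribution
        eps = rng.gauss(0.0, 1.0)              # hypothetical distribution
        perceived_return = beta - perceived_price + eps  # equation (5)
        invest = 1 if perceived_return >= 0 else 0       # decision rule (2)
        rows.append((perceived_price, perceived_return, invest))
    return rows

rows = simulate_choices(2000)
share_investing = sum(r[2] for r in rows) / len(rows)
print(f"share investing: {share_investing:.3f}")
```

Because the coefficient on the perceived price is fixed at $-1$, a one dollar increase in the perceived price lowers the perceived return by exactly one dollar in this sketch, which is what pins down the monetary scale of the latent variable.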
3. Empirical Strategy
It follows from the model that latent perceived returns are identified by $\beta$, $\widetilde{Price}_i$, and $\tilde{\epsilon}_i$, given the observed $X_i$. The lack of observation of $\tilde{\epsilon}_i$ is a common problem that will be addressed with commonly used binary choice estimation techniques. In this section I will describe adjustments to these estimators that leverage the assumptions described above to permit identification of $\beta$ and the scale of the distribution of $\tilde{\epsilon}_i$ in the context of the researcher’s failure to observe agents’ perceived prices. To preface, these adjustments address challenges that arise due to perceived costs having a causal effect on perceived returns in the identity given in (3).

The econometric methods described below establish conditions under which the assumed coefficient on perceived prices from (5) exactly determines the marginal effect of realized prices on perceived returns in a binary choice model. Omitted variable bias and measurement error in prices as measures of perceived prices threaten the validity of this assumption. It follows that methods which address omitted variable bias and measurement error will validate the assumption on the marginal effect of realized prices on perceived returns. To clarify, consider the following expression for agents’ beliefs about prices,

$$
\widetilde{Price}_i = Price_i + X_i\alpha + \nu_i, \tag{6}
$$

where the realized price, $Price_i$, is observed by the researcher, $\alpha$ gives the effect of explanatory variables on price misperceptions, and $\nu_i$ is the idiosyncratic component of agent $i$’s misperception of prices. Here, realized prices are assumed to increase agents’ beliefs about prices at a known marginal rate of unity insofar as they are known to agents.

Footnote: The researcher constraining the price coefficient to the value used by agents is key to identification, not the researcher or agents being correct about its value.

This expression allows us to present an empirically tractable version of perceived returns,

$$
\begin{aligned}
\tilde{\pi}_i &= X_i\beta - \widetilde{Price}_i + \tilde{\epsilon}_i \\
&= X_i\beta - Price_i - X_i\alpha - \nu_i + \tilde{\epsilon}_i \\
&= X_i\theta - Price_i - \nu_i + \tilde{\epsilon}_i,
\end{aligned} \tag{7}
$$

by substituting in prices observed by the researcher for agents’ unobserved perceived prices and defining $\theta = \beta - \alpha$. This representation presents the unexplained price misperception as an omitted variable, which will produce problems if $Price_i$ is correlated with $\nu_i$. Natural examples of problematic correlations between prices and price misperceptions include agents systematically over-reacting or under-reacting to price predictors that are unobserved by the researcher. The extreme case of under-reaction is that in which an unobserved predictor of realized price variation is ignored by or unknown to agents altogether, which amounts to classical measurement error in realized prices as measures of perceived prices.

In what follows, I first consider a benchmark case in which unobserved components of price misperceptions are mean independent of realized prices and prices are uncorrelated with unobserved determinants of perceived returns. Though agents may be mistaken about prices, actual prices can stand in for perceived prices because any systematic price [...]

Footnote: The distinction between the extent to which each control contributes to misperceptions in prices, $\alpha$, and to other components of perceived returns, $\beta$, is presented to emphasize that the methods in this paper are robust to systematic bias in perceptions associated with explanatory variables, even though they are not separately identified.

Here, I describe a benchmark procedure for estimating perceived returns with a simple adjustment to a common binary choice method. This procedure will provide consistent estimates of the perceived returns distribution under two assumptions that are likely to be violated in applications. First, this method assumes that prices and the unobserved component of perceived returns are uncorrelated. Second, it assumes that unobserved components of price misperceptions are mean independent of prices conditional on $X_i$, the simplest case of which is agents having perfect information on prices.

With the decision rule in (4) and the expression of perceived returns in (7), an assumption on the distribution of $-\nu_i + \tilde{\epsilon}_i$ is sufficient to consistently estimate perceived returns by maximum likelihood.
I assume the composite unobserved component of perceived returns in (7) is normally distributed as

$$
-\nu_i + \tilde{\epsilon}_i \mid X_i, Price_i \sim N(0, \sigma^2). \tag{8}
$$

The assumption of normality is chosen for convenience, and is not necessary for the estimation procedures in this paper. Defining $(\beta^*, \theta^*, \gamma^*) = (\beta/\sigma, \theta/\sigma, 1/\sigma)$ for notational convenience, the probability of selection is given by

$$
\Pr(S_i = 1 \mid X_i, Price_i) = \Phi(X_i\theta^* - Price_i\gamma^*), \tag{9}
$$

where $\Phi(\cdot)$ denotes the standard normal CDF.

The parameters $(\theta^*, \gamma^*)$ are estimated as the values that maximize the log-likelihood

$$
\mathcal{L}(\theta^*, \gamma^* \mid X_i, Price_i) = \sum_i S_i \log\left[\Phi\left(X_i\theta^* - Price_i\gamma^*\right)\right] + (1 - S_i)\log\left[1 - \Phi\left(X_i\theta^* - Price_i\gamma^*\right)\right]. \tag{10}
$$

The estimates of perceived returns are then given by

$$
\hat{\tilde{\pi}}_i \mid X_i, Price_i \sim N(X_i\hat{\theta} - Price_i, \hat{\sigma}^2), \tag{11}
$$

where imposing the constraint $\gamma^* = 1/\sigma$ (rather than the standard constraint $\sigma = 1$) is the only difference from a standard probit. Importantly, the assumption that $\gamma^* = 1/\sigma$ is only valid under the assumptions described in Section 2 when realized prices are uncorrelated with unobserved components of price misperceptions and perceived returns conditional on $X_i$. As this generally will not be the case, this assumption is not an innocuous normalization.

Here, I describe a control function approach that addresses correlation between prices and unobserved components of perceived returns as well as arbitrary correlation between prices and misperceptions on prices. In Appendix A, I discuss a method developed by Dickstein and Morales (2018) that performs well in this model when agents under-react to price variation, such as when they form rational expectations on prices based on known price predictors and only a subset of price predictors are known to them.
The method in this section uses an established estimator, but adds the assumption that instruments are uncorrelated with unobserved components of price misperceptions in addition to the more commonly invoked assumption that instruments are uncorrelated with other unobserved idiosyncratic components of perceived returns. This additional assumption contributes to credibility for predictions of responses to counterfactual price changes that are known to agents, without changing the asymptotic or finite sample properties of the estimator.

The control function approach uses the following system of equations, with reference to the expression of perceived returns in (7),

$$
\begin{aligned}
\tilde{\pi}_i &= X_i\theta - Price_i - \nu_i + \tilde{\epsilon}_i \\
Price_i &= Z_i\delta + u_i,
\end{aligned} \tag{12}
$$

where I have left unobserved price misperceptions and other unobserved components of perceived returns separate for clarity. Here, I introduce the instruments, $Z_i$, where $X_i \subset Z_i$, that are assumed to be conditionally uncorrelated with $-\nu_i + \tilde{\epsilon}_i$ and strongly correlated with observed prices. With some loss of generality, I will refer to instruments that satisfy this condition as “known and exogenous” for brevity. With valid instruments, the price residual $u_i$ contains all components of prices that are correlated with idiosyncratic components of price misperceptions or other unobserved components of perceived returns.

Given the above, I estimate the following equation,

$$
\begin{aligned}
\tilde{\pi}_i &= X_i\theta - Price_i - \nu_i + \tilde{\epsilon}_i \\
&= X_i\theta - Price_i + u_i\rho + \xi_i \\
&= X_i\theta - Price_i + \hat{u}_i\rho + \zeta_i.
\end{aligned} \tag{13}
$$

The first line follows directly from the representation of perceived returns in (7). The second line substitutes in the linear projection of the composite error $-\nu_i + \tilde{\epsilon}_i$ on the first stage error $u_i$, wherein $\rho = E[u_i(-\nu_i + \tilde{\epsilon}_i)]/E[u_i^2]$ and $\xi_i$ is the residual when controlling for $u_i$.
The third line substitutes the estimated residuals from the first stage regression of $Price_i$ on $Z_i$ in for their unobserved true values, generating a new error, $\zeta_i = \xi_i + (u_i - \hat{u}_i)\rho$. This new error will converge asymptotically to $\xi_i$, but will differ in small samples due to sampling error in the estimation of the residual from the first stage, $\hat{u}_i$.

Footnote: It is not necessary that agents know the instruments in $Z_i$, but only that they know the variation in prices that is attributable to $Z_i$. For example, agents need not know about a tax or subsidy shock to the price of investment, so long as they are aware of the change in price that arises from the policy shock. Furthermore, the language that instruments are known and exogenous suggests that $Cov(Z_i, \nu_i) = Cov(Z_i, \tilde{\epsilon}_i) = 0$, while these are sufficient but not necessary for the less intuitive condition $Cov(Z_i, \tilde{\epsilon}_i - \nu_i) = 0$, which accommodates the knife-edge case of the two sources of bias cancelling out.
To estimate perceived returns, I assume that the new error in the perceived returns control function expression is normally distributed,

$$
\zeta_i \mid X_i, Price_i, \hat{u}_i \sim N(0, \sigma_\zeta^2), \tag{14}
$$

noting that the variance of $\zeta_i$ will differ from that of $\tilde{\epsilon}_i$ if $\rho \neq 0$. I estimate perceived returns using two-stage conditional maximum likelihood, following Rivers and Vuong (1988), while correcting for the inclusion of estimated regressors, following Murphy and Topel (1985), though other estimators will also provide consistent estimates. Defining $(\theta^*_\zeta, \gamma^*_\zeta, \rho^*_\zeta) = (\theta/\sigma_\zeta, 1/\sigma_\zeta, \rho/\sigma_\zeta)$, the log-likelihood for the second stage of the control function approach is given by

$$
\mathcal{L}\left(\theta^*_\zeta, \gamma^*_\zeta, \rho^*_\zeta \mid X_i, \hat{u}_i\right) = \sum_i S_i \log\left[\Phi\left(X_i\theta^*_\zeta - Price_i\gamma^*_\zeta + \hat{u}_i\rho^*_\zeta\right)\right] + (1 - S_i)\log\left[1 - \Phi\left(X_i\theta^*_\zeta - Price_i\gamma^*_\zeta + \hat{u}_i\rho^*_\zeta\right)\right]. \tag{15}
$$

Estimates of perceived returns are obtained by plugging the estimated parameters and the assumed coefficient on perceived prices into the latent variable equation,

$$
\tilde{\pi}_i \mid X_i, \hat{u}_i \sim N\left(X_i\hat{\theta} - Price_i + \hat{u}_i\hat{\rho}, \hat{\sigma}_\zeta^2\right). \tag{16}
$$
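The role of the unit price coefficient can be seen by writing the second-stage log-likelihood (15) directly in monetary units. The Python sketch below is illustrative only: the function name `second_stage_loglik` and the tuple layout of `data` are assumptions made for the example, and the first-stage residuals are taken as given rather than estimated.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def second_stage_loglik(theta, sigma_zeta, rho, data):
    """Log-likelihood (15) parameterized by (theta, sigma_zeta, rho).

    data: iterable of (x, price, u_hat, s), where x is a list of controls,
    u_hat is a first-stage residual, and s is the observed 0/1 choice.
    The price coefficient is fixed at -1 before dividing by sigma_zeta.
    """
    if sigma_zeta <= 0:
        raise ValueError("sigma_zeta must be positive")
    total = 0.0
    for x, price, u_hat, s in data:
        mean_return = sum(xj * tj for xj, tj in zip(x, theta)) - price + u_hat * rho
        index = mean_return / sigma_zeta
        prob = norm_cdf(index) if s == 1 else norm_cdf(-index)
        total += math.log(max(prob, 1e-300))  # guard against log(0)
    return total

# Hypothetical observations: (controls, price, first-stage residual, choice).
sample = [([1.0], 0.5, 0.2, 1), ([1.0], 2.0, -0.1, 0), ([1.0], 1.0, 0.0, 1)]
print(second_stage_loglik([1.0], 1.0, 0.5, sample))
```

In a standard probit the scale is not identified, so multiplying every coefficient and the error standard deviation by a constant leaves the likelihood unchanged. Here the price coefficient cannot be rescaled, so the likelihood varies with `sigma_zeta` on its own.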
4. Simulations
In this section I apply the methods described above to simulated datasets to compare their performance. The important considerations involve agent beliefs about prices, price endogeneity, and instruments being known and/or exogenous to agents. Because the estimators used are standard, I stop short of performing full Monte Carlo simulations, [...]

$$
\begin{aligned}
\tilde{\pi}_i &= X_i\beta - \widetilde{Price}_i + \tilde{\epsilon}_i \\
\widetilde{Price}_i &= Price_i + \nu_i = Z_i\delta + u_i + \nu_i,
\end{aligned} \tag{17}
$$

where the nature of the covariance of $(Z_i, u_i, \nu_i, \tilde{\epsilon}_i)$ will determine the performance of various estimation approaches. Both the probit and the control function method will obtain estimates of $\beta$, while the probit will estimate

$$
\sigma^2 = Var(-\nu_i + \tilde{\epsilon}_i) \tag{18}
$$

and the control function method will estimate

$$
\rho = E[u_i(-\nu_i + \tilde{\epsilon}_i)]/E[u_i^2], \qquad \sigma_\zeta = \sqrt{Var(\zeta_i)} = \sqrt{Var(-\nu_i + \tilde{\epsilon}_i - \hat{u}_i\rho)}. \tag{19}
$$

Each DGP comprises $N = 10{,}000$ observations of agents whose decisions are governed by their perceived returns to investment.

Footnote: As a closely related alternative, we could perform an instrumental variables probit to obtain identical estimates of $\theta$. The control function method has the advantage of conditioning on the variation in prices that isn’t used in identifying the effect on perceived returns, which permits more precise counterfactual predictions for policies that are targeted on observables.

4.1. Simulation with Known, Exogenous Prices
I begin with a well-behaved benchmark DGP that corresponds to the setting described in Section 3.1. I generate data according to

$$
(z_{1,i}, u_i, \nu_i, \tilde{\epsilon}_i)' \sim N(\mathbf{0}, \Sigma); \qquad \Sigma = [\ldots]. \tag{20}
$$

I construct the instrument vector as $Z_i = [X_i \; z_{1,i}]$ where $X_i$ includes only a constant, and $\alpha = 0$ such that $\theta = \beta$. Finally, I set $\beta = 1$ and $\delta = [0 \; 1]'$. Although I set $Var(\nu_i) = 2$, I describe prices as known in this setting because the price misperception is uncorrelated with prices.

Table 1 shows perceived returns estimates for one simulation of this DGP using the methods from Section 3.1 and Section 3.2. Figure 1 shows the distributions implied by the estimates for each method. In this case, the lack of correlation between prices and unobserved components of perceived returns, including price misperceptions, means that both methods will provide consistent estimates of perceived returns.
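A benchmark of this kind can be sketched in Python as follows. This is not the code behind Table 1 (which was produced in Stata): the variances below are assumed for illustration because the entries of $\Sigma$ are not reproduced here, the helper names are hypothetical, and the sketch reports point estimates only, without the Murphy and Topel (1985) standard error correction. Both estimators fit a standard probit by Newton-Raphson and then divide by minus the price coefficient to express parameters in monetary units.

```python
import math
import random

def norm_cdf(x):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def solve(a, b):
    # Solve the small linear system a x = b by Gaussian elimination.
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def probit(rows, choices, iters=30):
    # Newton-Raphson maximum likelihood for a standard probit (unit variance).
    k = len(rows[0])
    b = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, s in zip(rows, choices):
            q = 2 * s - 1
            w = sum(xj * bj for xj, bj in zip(x, b))
            lam = q * norm_pdf(w) / max(norm_cdf(q * w), 1e-300)
            h = lam * (lam + w)  # minus the second derivative of the log-likelihood
            for i in range(k):
                grad[i] += lam * x[i]
                for j in range(k):
                    info[i][j] += h * x[i] * x[j]
        step = solve(info, grad)
        b = [bi + di for bi, di in zip(b, step)]
        if max(abs(d) for d in step) < 1e-9:
            break
    return b

def benchmark(n=5000, seed=1234):
    rng = random.Random(seed)
    z, price, choice = [], [], []
    for _ in range(n):
        zi, ui = rng.gauss(0, 1), rng.gauss(0, 1)  # assumed unit variances
        nu = rng.gauss(0, math.sqrt(2.0))          # misperception, independent of price
        eps = rng.gauss(0, 1)                      # assumed unit variance
        p = zi + ui                                # Price = Z delta + u, delta = [0, 1]
        perceived_return = 1.0 - (p + nu) + eps    # beta = 1, alpha = 0
        z.append(zi)
        price.append(p)
        choice.append(1 if perceived_return >= 0 else 0)
    # Probit: rescale so the price coefficient equals -1.
    a, g = probit([[1.0, p] for p in price], choice)
    probit_est = {"theta": -a / g, "sigma": -1.0 / g}
    # Control function: first-stage OLS of price on (1, z), then probit with residual.
    zbar, pbar = sum(z) / n, sum(price) / n
    sxy = sum((zi - zbar) * (p - pbar) for zi, p in zip(z, price))
    d1 = sxy / sum((zi - zbar) ** 2 for zi in z)
    d0 = pbar - d1 * zbar
    resid = [p - d0 - d1 * zi for zi, p in zip(z, price)]
    a, g, r = probit([[1.0, p, u] for p, u in zip(price, resid)], choice)
    cf_est = {"theta": -a / g, "sigma_zeta": -1.0 / g, "rho": -r / g}
    return probit_est, cf_est

probit_est, cf_est = benchmark()
print(probit_est)
print(cf_est)
```

Under these assumed variances the composite error has variance 3, so both methods should return a constant near 1 and a scale near $\sqrt{3}$, with the control function estimate of $\rho$ near zero.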
Footnote: This setting is one in which agents are wrong about prices in ways that are unrelated to price determinants. This sort of price misperception is plausible in cases where prices change frequently according to a distribution that is de facto known to agents, such as frequently repeated investments.

[Table 1: rows $\sigma$, $\sigma_\zeta$, $\rho$; values not recoverable]
Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11) and (16), respectively. All data is generated in Stata using random seed 1234.

Figure 1: Simulation 1, Implied Perceived Returns Distributions
Notes: Estimated densities of perceived returns given by the probit method using expression (11), and the control function method using expression (16).

In this simulation, I consider a DGP that corresponds to the setting described in Section 3.2 in which agents systematically misperceive prices in ways that are not accounted for by observables, and prices are correlated with unobserved components of perceived returns. I also compare the performance of an instrument that is exogenous but unknown to one [...]

$$
(z_{1,i}, z_{2,i}, u_i, \nu_i, \tilde{\epsilon}_i)' \sim N(\mathbf{0}, \Sigma); \qquad \Sigma = [\ldots]. \tag{21}
$$

I construct the instrument vector as $Z_i = [X_i \; z_{1,i} \; z_{2,i}]$ where $X_i$ includes only a constant, and $\alpha = 0$ such that $\theta = \beta$. Finally, I set $\beta = 1$ and $\delta = [0 \; 1 \; 1]'$.

In this case, there is positive correlation between $u_i$ and $\tilde{\epsilon}_i$ such that individuals who face idiosyncratically high prices also have high perceived returns, as may occur with price discrimination. Additionally, there is negative correlation between $u_i$ and $\nu_i$ such that individuals systematically underestimate the extent to which their price deviates from the average, as may occur if agents form rational expectations on prices conditional on an incomplete set of price determinants. Finally, this DGP includes two potential instruments: $z_{1,i}$, which is exogenous but not fully known to agents, as in the case of a poorly publicized policy shock, and $z_{2,i}$, which is both exogenous and known to agents. Because $z_{1,i}$ is correlated with $\nu_i$, it is not a valid instrument for the purposes of this paper.

For the control function estimates of $\rho$ and $\sigma_\zeta$, I use $u_{2,i}$ in place of $u_i$, where $u_{2,i} = z_{1,i}\delta_1 + u_i$. In applications with many valid instruments, including different combinations of instruments will result in different estimates $\hat{u}_i$, $\hat{\rho}$ and $\hat{\sigma}_\zeta$, while nonetheless all returning consistent estimates of perceived returns. For comparisons between instruments, the complete distribution of perceived returns (succinctly described by the figures) and the estimated coefficients on $X_i$ will be correct for all valid instruments.

Table 2 shows the estimates for one simulation of this DGP using both methods, and also using each instrumental variable individually. Figure 2 shows the distributions implied by the estimates for each method.
Because $z_{1,i}$ is correlated with misperceptions, it is not a valid instrument, and results in an estimated perceived returns distribution that is no better than that obtained when using no instruments.

Table 2: Simulation 2, Perceived Returns Estimates

                          (1)        (2)                    (3)
               Target     Probit     Control Function z_1   Control Function z_2
Constant       1          1.528      1.659                  0.863
                          (0.102)    (0.127)                (0.086)
σ
σ_ζ
ρ              .5         .          -0.097                 0.507
                                     (0.046)                (0.018)
Observations              10000      10000                  10000

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11) and (16), respectively. All data is generated in Stata using random seed 1234.
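The distinction between a known and an unknown instrument can be checked directly in simulated data, where perceived prices are observable to the simulator. The Python sketch below is a hypothetical illustration rather than the DGP in (21): the function name `perceived_price_slopes`, the parameter `kappa` (the share of $z_1$-driven price variation that agents miss), and all variances are assumptions. Both instruments shift actual prices one for one, so an instrument is de facto known when the slope of perceived prices on it is also one.

```python
import random

def perceived_price_slopes(n=5000, kappa=0.5, seed=0):
    """Slope of perceived prices on each instrument, with delta = [0, 1, 1]."""
    rng = random.Random(seed)
    z1, z2, perceived = [], [], []
    for _ in range(n):
        a, b, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        price = a + b + u                   # both instruments shift prices by 1
        nu = -kappa * a + rng.gauss(0, 1)   # agents miss a share kappa of z1's effect
        z1.append(a)
        z2.append(b)
        perceived.append(price + nu)        # equation (6) with alpha = 0

    def slope(x, y):
        # Bivariate OLS slope of y on x.
        xbar, ybar = sum(x) / n, sum(y) / n
        sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        sxx = sum((xi - xbar) ** 2 for xi in x)
        return sxy / sxx

    return slope(z1, perceived), slope(z2, perceived)

slope_z1, slope_z2 = perceived_price_slopes()
print(f"perceived price response to z1: {slope_z1:.2f}, to z2: {slope_z2:.2f}")
```

With `kappa = 0.5`, perceived prices move by roughly half as much as actual prices in response to $z_1$, so using $z_1$ while imposing a unit price coefficient misstates the scale of perceived returns, whereas $z_2$ satisfies the known-and-exogenous condition.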
Figure 2: Simulation 2, Implied Perceived Returns Distributions
Notes: Estimated densities of perceived returns given by the probit method using expression (11), and the control function method using expression (16). The unknown IV is $z_{1,i}$ and the valid IV is $z_{2,i}$, where each IV is excluded from the estimation model when the other is used.

Footnote: For estimating instrument-specific intent to treat effects of prices on investment, which would be sufficient for determining the performance of a particular policy in the context of its actual implementation, instruments such as $z_{1,i}$ are valid. They nonetheless fail to provide credible insight into counterfactual policy changes that are well-publicized.

5. Conclusions

In this paper I describe how to estimate perceived returns to investments by assuming agent knowledge of an intuitive identity and modestly altering common estimation techniques. The assumption on agent knowledge may be preferable to rational expectations or related assumptions in applications. I further describe the econometric challenges that arise from the assumption and how to overcome them with careful choice of instruments that are not only exogenous to agents, but are also de facto known to them.

This method is relevant in many empirical questions, especially those subject to substantial information frictions on prices such as college attendance, firm R&D, automobile purchases, home purchases, and healthcare. While the estimation techniques used in this paper are restricted to a probit and a control function probit, the general insights are relevant to more sophisticated models that involve responses to prices.
Implementation of the identity relating perceived returns and prices used in this paper in the context of more sophisticated models, such as Berry, Levinsohn, and Pakes (1995) and its extensions, is left to future work.

In terms of policy implications, the methods described in this paper are relevant for constructing credible counterfactuals for well-publicized price changes, which are relevant for taxes and subsidies on investments including those associated with education and healthcare. The general insight is to avoid being too quick to assume that agents have rational expectations on model objects when alternative assumptions may be more defensible. Relatedly, the insights here also caution against extrapolating effects of counterfactual policies when the policy effects are estimated using a source of variation in prices that may not be known to agents. In practice, applied researchers should justify that sources of variation used for estimating treatment effects are known to agents just as they justify that they are exogenous to agents when making counterfactual predictions.

References
Andrews, D. W. K., and G. Soares (2010): “Inference for Parameters Defined by Moment Inequalities Using Generalized Moment Selection,” Econometrica, 78(1), 119–157.

Berry, S., J. Levinsohn, and A. Pakes (1995): “Automobile Prices in Market Equilibrium,” Econometrica: Journal of the Econometric Society, pp. 841–890.

Bleemer, Z., and B. Zafar (2018): “Intended College Attendance: Evidence from an Experiment on College Returns and Costs,” Journal of Public Economics, 157, 184–211.

Cunha, F., J. Heckman, and S. Navarro (2005): “Separating Uncertainty from Heterogeneity in Life Cycle Earnings,” Oxford Economic Papers, 57(2), 191–261.

Cunha, F., and J. J. Heckman (2007): “Identifying and estimating the distributions of ex post and ex ante returns to schooling,” Labour Economics, 14(6), 870–893.

Dickstein, M. J., and E. Morales (2018): “What Do Exporters Know?,” The Quarterly Journal of Economics, 133(4), 1753–1801.

Dynarski, S., C. Libassi, K. Michelmore, and S. Owen (2018): “Closing the gap: The effect of a targeted, tuition-free promise on college choices of high-achieving, low-income students,” Discussion paper, National Bureau of Economic Research.

Dynarski, S. M. (2003): “Does Aid Matter? Measuring the Effect of Student Aid on College Attendance and Completion,” American Economic Review, 93(1), 279–288.

Hansen, W. L. (1983): “Impact of student financial aid on access,” Proceedings of the Academy of Political Science, 35(2), 84–96.

Harris, C. M. (2020): “Estimating the perceived returns to college,” Available at SSRN 3577816.

Jensen, R. (2010): “The (Perceived) Returns to Education and the Demand for Schooling,” The Quarterly Journal of Economics, 125(2), 515–548.

Kane, T. J. (1995): “Rising public college tuition and college entry: How well do public subsidies promote access to college?,” Discussion paper, National Bureau of Economic Research.

Manski, C. F. (1993): “Adolescent econometricians: How do youth infer the returns to schooling?,” in Studies of supply and demand in higher education, pp. 43–60. University of Chicago Press.

Manski, C. F. (2004): “Measuring expectations,” Econometrica, 72(5), 1329–1376.

Murphy, K. M., and R. H. Topel (1985): “Least Squares with Estimated Regressors,” Journal of Business and Economic Statistics.

Rivers, D., and Q. H. Vuong (1988): “Limited Information Estimators and Exogeneity Tests for Simultaneous Probit Models,” Journal of Econometrics, 39(3), 347–366.

Roy, A. D. (1951): “Some Thoughts on the Distribution of Earnings,” Oxford Economic Papers, 3(2), 135–146.

Wiswall, M., and B. Zafar (2015): “How Do College Students Respond to Public Information about Earnings?,” Journal of Human Capital, 9(2), 117–169.

Yatchew, A., and Z. Griliches (1985): “Specification error in probit models,” The Review of Economics and Statistics, pp. 134–139.

Appendix A: Moment Inequalities
This section describes how to adapt the moment inequality method developed by Dickstein and Morales (2018) (DM) to the setting described in this paper. The setting of DM involves trade revenues that are partially observed by firms, which have a structural relationship with profits that is assumed to be known to agents. The type of information friction described in DM is a special case of those described in the present paper, in which some sources of variation in the treatment variable are unknown to agents. In the context of the model presented in equation (6), this involves negative correlation between $\nu_i$ and $Price_i$ such that prices are a mean-preserving spread of perceived prices. Furthermore, the DM method assumes that $\tilde\epsilon_i$ is independent of other determinants of perceived returns.

This method makes use of instruments, $Z_i$, that are independent of $(\tilde\epsilon_i, \nu_i)$. For $X_i \subset Z_i$, this implies that the expectation of (6) conditional on $Z_i$ gives
$$E[\widetilde{Price}_i \mid Z_i] = E[Price_i \mid Z_i] + X_i\alpha. \tag{A.1}$$
Additionally, it makes a distributional assumption on unobserved perceived returns such as
$$\tilde\epsilon_i \mid X_i, \widetilde{Price}_i \sim N(0, \sigma_{\tilde\epsilon}^2), \tag{A.2}$$
where the assumption of normality is unnecessary, but there are some restrictions on the assumed distribution which I discuss below. The method uses two types of moment inequalities to obtain bounds on the parameters of perceived returns, $(\theta, \sigma_{\tilde\epsilon})$. I will present the inequalities and provide a brief discussion here. For the derivation and further discussion of the moment inequalities, see DM.

A.1. Revealed Preference Moment Inequalities

Defining $(\beta^*_{\tilde\epsilon}, \theta^*_{\tilde\epsilon}, \gamma^*_{\tilde\epsilon}) = \left(\frac{\beta}{\sigma_{\tilde\epsilon}}, \frac{\theta}{\sigma_{\tilde\epsilon}}, \frac{1}{\sigma_{\tilde\epsilon}}\right)$ for notational convenience, the conditional revealed preference moment inequalities are
$$
\begin{aligned}
E\left[ S_i (X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon}) + (1-S_i)\frac{\phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})}{1-\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})} \,\middle|\, Z_i \right] &\geq 0, \\
E\left[ -(1-S_i)(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon}) + S_i\frac{\phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})}{\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})} \,\middle|\, Z_i \right] &\geq 0.
\end{aligned} \tag{A.3}
$$
These inequalities are consistent with the revealed preference argument that perceived returns are positive for those who select the investment and negative for those who do not. Here, I provide an overview of the intuition.

Regarding the first inequality, consider an agent that selects the investment such that $S_i = 1$. Following the revealed preference argument articulated in (4) and the representation of perceived returns in (7), it follows that this individual's perceived return is positive, such that
$$S_i (X_i\theta - Price_i - \nu_i + \tilde\epsilon_i) \geq 0. \tag{A.4}$$
This expression cannot be computed directly because researchers do not observe $\nu_i$ or $\tilde\epsilon_i$. However, as the inequality holds for all $i$, it follows that it holds in expectation conditional on $Z_i$,
$$E[S_i (X_i\theta - Price_i - \nu_i + \tilde\epsilon_i) \mid Z_i] \geq 0. \tag{A.5}$$
Finally, the Law of Iterated Expectations, the assumption that $\nu_i$ is unknown to agents and therefore not acted upon, and the assumption that $Z_i$ is uncorrelated with $\nu_i$ imply $E[S_i\nu_i \mid Z_i] = E[S_i E[\nu_i \mid S_i, Z_i] \mid Z_i] = 0$, yielding
$$E[S_i (X_i\theta - Price_i + \tilde\epsilon_i) \mid Z_i] \geq 0. \tag{A.6}$$
The first inequality in (A.3) is derived from this inequality, where its second term is a positively biased approximation of $E[S_i\tilde\epsilon_i \mid Z_i]$ that exploits the closed form for $E[S_i\tilde\epsilon_i \mid X_i, \widetilde{Price}_i]$ under the normality assumption on $\tilde\epsilon_i$.

Heuristically, if observed prices are a mean-preserving spread of perceived prices, substituting them in place of perceived prices will mistakenly increase expected perceived returns unconditional on selection for some agents and decrease them for others. For the agents for whom this expectation increases, the expectation of the error conditional on selection approaches zero. For those for whom it decreases, the expectation of the error conditional on selection approaches positive infinity. In many cases, this second effect will dominate the overall expectation of the error conditional on selection. The second inequality follows from similar intuition applied to individuals who do not select the investment.
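As a rough illustration of how the terms inside the expectations in (A.3) might be evaluated at a candidate parameter value, the following Python sketch computes them observation by observation. It is only a sketch under stated assumptions: the function names (`rp_terms`, `rp_cell_means`), the toy data, and the use of discrete instrument cells as a stand-in for conditioning on $Z_i$ are all hypothetical and are not taken from the paper or from DM.

```python
import math

def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rp_terms(s, index):
    """Per-observation terms inside the two expectations in (A.3).

    s     : 1 if the agent selects the investment, 0 otherwise
    index : X_i theta* - Price_i gamma*, evaluated at candidate parameters
    Not numerically guarded for extreme index values (Phi near 0 or 1).
    """
    pdf, cdf = norm_pdf(index), norm_cdf(index)
    first = s * index + (1 - s) * pdf / (1.0 - cdf)
    second = -(1 - s) * index + s * pdf / cdf
    return first, second

def rp_cell_means(S, index, cells):
    """Average both terms within each instrument cell.

    Assumption for illustration: Z_i is discrete, so a within-cell mean
    is the sample analogue of the expectation conditional on Z_i.
    """
    sums = {}
    for s, idx, cell in zip(S, index, cells):
        first, second = rp_terms(s, idx)
        total = sums.setdefault(cell, [0.0, 0.0, 0])
        total[0] += first
        total[1] += second
        total[2] += 1
    return {cell: (t[0] / t[2], t[1] / t[2]) for cell, t in sums.items()}

# Hypothetical toy data: choices, index values, and instrument cells.
S = [1, 0, 1, 0]
index = [0.8, -0.3, 0.1, -1.2]
cells = ["low", "low", "high", "high"]
means = rp_cell_means(S, index, cells)
for cell, (m1, m2) in means.items():
    print(cell, round(m1, 4), round(m2, 4))
```

A candidate parameter value would be treated as consistent with (A.3) when these sample means are not significantly negative; the formal test is the one described in Appendix B.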
A.1.1 Odds-Based Moment Inequalities

The conditional odds-based moment inequalities are
$$
\begin{aligned}
E\left[ S_i \frac{1-\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})}{\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})} - (1-S_i) \,\middle|\, Z_i \right] &\geq 0, \\
E\left[ (1-S_i) \frac{\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})}{1-\Phi(X_i\theta^*_{\tilde\epsilon} - Price_i\gamma^*_{\tilde\epsilon})} - S_i \,\middle|\, Z_i \right] &\geq 0.
\end{aligned} \tag{A.7}
$$
They are derived from the unobservable conditional score equation,
$$E\left[ S_i \frac{\phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})}{\Phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})} - (1-S_i) \frac{\phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})}{1-\Phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})} \,\middle|\, X_i, \widetilde{Price}_i \right] = 0, \tag{A.8}$$
which can be transformed into
$$E\left[ (1-S_i) \frac{\Phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})}{1-\Phi(X_i\beta^*_{\tilde\epsilon} - \widetilde{Price}_i\gamma^*_{\tilde\epsilon})} - S_i \,\middle|\, X_i, \widetilde{Price}_i \right] = 0. \tag{A.9}$$
The advantage of this transformation is that the odds ratio is globally convex in its arguments. Replacing the unobserved $\widetilde{Price}_i$ with $Price_i$ changes the equation into an inequality by application of Jensen's inequality due to the global convexity of the odds ratio. As the index of the odds ratio increases, the model-predicted odds of a given outcome approach positive infinity, while the odds approach zero as the index decreases. When the index is replaced with a mean-preserving spread of itself (via replacing perceived prices with prices), this first effect will usually dominate the second regardless of the distributional assumption. This inequality holds when taking its expectation conditional on $Z_i$ by the law of iterated expectations. The first inequality follows from similar intuition for those who do not select the investment.

Footnotes: The bias makes substitution of prices for perceived prices nontrivial, and contributes to the inequality. Global convexity of $E[\tilde\epsilon_i \mid \tilde\epsilon_i < \kappa]$ in $\kappa$ is necessary for the inequalities to hold regardless of the value of $\kappa$ and the variance of the misperception term. This condition is satisfied by both the normal and logistic distributions.

A.1.2 Estimation Using Moment Inequalities
Under the information assumptions provided, the true parameters $\psi = (\theta, \sigma_{\tilde\epsilon})$ will be contained within the set of parameters that satisfy the inequalities, which I define as $\Psi$. First, because it is computationally expensive to compute the inequalities conditional on $Z_i$, I will instead use unconditional inequalities that are consistent with the conditional inequalities described above. Additionally, in small samples it is possible that the true parameters will not strictly satisfy these inequalities, so it is necessary to construct a test of the hypothesis that a given value $\psi_p = (\theta_p, \sigma_{\tilde\epsilon,p})$ is consistent with the inequalities. To do this I employ the modified method of moments procedure described by Andrews and Soares (2010), which yields a confidence set $\hat\Psi$ of parameters for which I fail to reject consistency with the inequalities, where an element of this set is given by $(\hat\theta_p, \hat\sigma_{\tilde\epsilon,p})$.

Footnote: Global convexity of the odds ratio is necessary for this condition to hold for all values of the index and for all magnitudes of mean-preserving spreads. This condition is satisfied by log-concave distributions, such as the normal and logistic.

Given $(\beta, \sigma_{\tilde\epsilon})$, perceived returns are given by
$$Y_i \mid X_i, \widetilde{Price}_i \sim N(X_i\beta - \widetilde{Price}_i, \sigma_{\tilde\epsilon}^2). \tag{A.10}$$
Thus, even given the true $(\beta, \sigma_{\tilde\epsilon})$, the problem remains that we do not observe $\widetilde{Price}_i$ in the data. However, it is possible to bound perceived returns at the true parameter values using $Price_i$ and $E[Price_i \mid Z_i]$, which we do have access to.

For valid $Z_i$, equation (6) implies that it is possible to approximate $\widetilde{Price}_i$ with $\varphi Price_i + (1-\varphi) E[Price_i \mid Z_i]$, where $\varphi$ minimizes $E\left[\left(\widetilde{Price}_i - (\varphi Price_i + (1-\varphi) E[Price_i \mid Z_i])\right)^2\right]$. It must be that $\varphi \in [0, 1]$. For any $\varphi \in [0, 1]$, the distribution of perceived returns implied by a given $(\hat\theta_p, \hat\sigma_{\tilde\epsilon,p})$ can be constructed using
$$Y_i \mid X_i, Price_i, Z_i \sim N\left(X_i\hat\theta_p - \varphi Price_i - (1-\varphi) E[Price_i \mid Z_i], \hat\sigma_{\tilde\epsilon,p}^2\right). \tag{A.11}$$
Note that the PDF of this distribution is non-monotonic in $\varphi$, so setting $\varphi = 0$ and $\varphi = 1$ will not bound its PDF across its entire support. Computing the distribution for all $\varphi \in [0, 1]$ for each $(\hat\theta_p, \hat\sigma_{\tilde\epsilon,p}) \in \hat\Psi$ is necessary to provide bounds for the perceived returns distribution. In practice, choosing any set of values between zero and one, including zero and one, will approximate these bounds. DM describe an alternative method that can be used to bound the CDF of perceived returns.

Appendix B: Moment Inequality Estimation

I closely follow Appendices A.5 and A.7 in Dickstein and Morales (2018) to estimate the moment inequalities' confidence set for the true parameter $\psi$. Adapting DM's procedure to the current setting would account for imputation of prices. I use a simplified version of their procedure, because I assume that prices are observed for all individuals, regardless of whether they select the investment. This assumption is irrelevant to the contributions of this paper, as each method admits imputation. I also deviate from DM in how I conduct the grid search over potential parameters in order to speed computation in the absence of parallelization.

The confidence set is obtained by applying the Andrews and Soares (2010) modified method of moments (MMM). This method follows the intuition of the generalized method of moments, but only penalizes moment deviations that violate the inequality while adjusting the hypothesis testing procedure to accommodate this change. I index the moment inequalities used in estimation by $\ell = 1, ..., L$ and denote them
$$\bar m_\ell(\psi) \equiv \frac{1}{N} \sum_{i=1}^{N} m_\ell(Z_i, \psi), \quad \ell = 1, ..., L,$$
where $N$ is the sample size. The MMM test statistic
$$Q(\psi) = \sum_{\ell=1}^{L} \left[\min\left(\sqrt{N} \frac{\bar m_\ell(\psi)}{\hat\sigma_\ell(\psi)}, 0\right)\right]^2 \tag{B.1}$$
gives the sum of squared inequality violations, where
$$\hat\sigma_\ell(\psi) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(m_\ell(Z_i, \psi) - \bar m_\ell(\psi)\right)^2}.$$
Note that as in Section A, $X_i \subset Z_i$. $m_\ell(\cdot)$ is a conditional revealed preference or odds-based moment inequality constructed as described in DM, Appendix A.5.
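The statistic in (B.1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's or DM's implementation: the helper names (`odds_terms`, `mmm_statistic`) and the toy data are hypothetical, the moments are the raw odds-based terms from (A.7) without interaction with instrument functions of $Z_i$, and every moment is assumed to have strictly positive sample variance.

```python
import math

def norm_cdf(x):
    """Standard normal CDF Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def odds_terms(s, index):
    """Per-observation odds-based terms from (A.7).

    s     : 1 if the agent selects the investment, 0 otherwise
    index : X_i theta* - Price_i gamma*, evaluated at candidate parameters
    """
    cdf = norm_cdf(index)
    first = s * (1.0 - cdf) / cdf - (1 - s)
    second = (1 - s) * cdf / (1.0 - cdf) - s
    return [first, second]

def mmm_statistic(rows):
    """Q = sum over moments of min(sqrt(N) * mean / sd, 0)^2, as in (B.1).

    rows : list of per-observation moment vectors (N rows, L entries each).
    Uses the 1/N variance from the text; assumes each sd is positive.
    """
    n = len(rows)
    n_moments = len(rows[0])
    q = 0.0
    for l in range(n_moments):
        column = [row[l] for row in rows]
        m_bar = sum(column) / n
        sigma = math.sqrt(sum((m - m_bar) ** 2 for m in column) / n)
        t = math.sqrt(n) * m_bar / sigma
        q += min(t, 0.0) ** 2  # only negative (violating) moments contribute
    return q

# Hypothetical toy data: choices and index values at a candidate psi.
S = [1, 0, 1, 1, 0]
index = [0.4, -0.9, 1.1, -0.2, 0.3]
rows = [odds_terms(s, idx) for s, idx in zip(S, index)]
q_value = mmm_statistic(rows)
print(q_value)
```

Moments with a nonnegative sample mean contribute nothing to the statistic, which is the sense in which MMM penalizes only violations of the inequalities.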
I compute a confidence set for the true parameter $\psi$ using the following steps, closely following DM.

Step 1: define a grid $\Psi_g$ that overlaps with the confidence set. I define this grid as a $K$-dimensional orthotope, where $K$ is the number of scalars indexed by $k = 1, ..., K$ within the parameter vector $\psi$. To define this grid, I choose $\hat\psi_{\min}$ to minimize $Q(\psi)$, initializing the minimization with the control function estimates $\hat\psi_{CF} \equiv (\hat\theta_{CF}, \hat\sigma_{\zeta,CF})$, which in simulations are typically near a minimum (zero) of $Q(\psi)$. The moment inequality confidence set encompasses the control function estimates in simulations included later in Appendix D when they provide consistent bounds, and there is good reason to believe that this will be the case generally (see Appendix C). Because $Q(\hat\psi_{\min})$ will be close to zero, $\hat\psi_{\min}$ is likely to be within the 95% confidence set, $\hat\Psi$, if this set is nonempty. I create boundaries in dimension $k$ by multiplying the standard error of the $k$th parameter by a large number, and adding and subtracting this value from the parameter to form bounds in the $k$th dimension. (As there are negligible computational disadvantages from having a very large initial grid, I multiply the standard errors by 20.) I repeat this for each of the $K$ parameters to obtain bounds on a $K$-dimensional initial grid $\Psi_g$. I fill this grid with $10^K$ equidistant points.

Step 2: choose a point $\psi_p \in \Psi_g$. For speed, I test points in ascending order of their Euclidean distance from $\hat\psi_{\min}$. With $\psi_p$, I test the hypothesis that $\psi_p = \psi$:
$$H_0: \psi = \psi_p \quad \text{vs.} \quad H_1: \psi \neq \psi_p.$$

Step 3: evaluate the MMM test statistic at $\psi_p$:
$$Q(\psi_p) = \sum_{\ell=1}^{L} \left[\min\left(\sqrt{N} \frac{\bar m_\ell(\psi_p)}{\hat\sigma_\ell(\psi_p)}, 0\right)\right]^2. \tag{B.2}$$

Step 4: compute the correlation matrix of the moments evaluated at $\psi_p$:
$$\hat\Omega(\psi_p) = \mathrm{Diag}^{-1/2}(\hat\Sigma(\psi_p)) \, \hat\Sigma(\psi_p) \, \mathrm{Diag}^{-1/2}(\hat\Sigma(\psi_p)),$$
where $\mathrm{Diag}(\hat\Sigma(\psi_p))$ is the $L \times L$ diagonal matrix that shares diagonal elements with $\hat\Sigma(\psi_p)$. $\mathrm{Diag}^{-1/2}(\hat\Sigma(\psi_p))$ satisfies $\mathrm{Diag}^{-1/2}(\hat\Sigma(\psi_p)) \, \mathrm{Diag}^{-1/2}(\hat\Sigma(\psi_p)) = \mathrm{Diag}^{-1}(\hat\Sigma(\psi_p))$, where
$$\hat\Sigma(\psi_p) = \frac{1}{N} \sum_{i=1}^{N} \left(m(Z_i, \psi_p) - \bar m(\psi_p)\right)\left(m(Z_i, \psi_p) - \bar m(\psi_p)\right)',$$
$m(Z_i, \psi_p) = (m_1(Z_i, \psi_p), ..., m_L(Z_i, \psi_p))$, and $\bar m(\psi_p) = (\bar m_1(\psi_p), ..., \bar m_L(\psi_p))$, where
$$\bar m_\ell(\psi_p) \equiv \frac{1}{N} \sum_{i=1}^{N} m_\ell(Z_i, \psi_p), \quad \forall \ell = 1, ..., L.$$

Step 5: simulate the asymptotic distribution of $Q(\psi_p)$. Take $R = 1000$ draws from the multivariate normal distribution $N(0_L, I_L)$, where $0_L$ is a vector of zeros and $I_L$ is an $L$-dimensional identity matrix. Denote each of these draws as $\chi_r$. Define the criterion function $Q^{AA}_{N,r}(\psi_p)$ as
$$Q^{AA}_{N,r}(\psi_p) = \sum_{\ell=1}^{L} \left[ \left(\min\left([\hat\Omega^{1/2}(\psi_p)\chi_r]_\ell, 0\right)\right)^2 \times \mathbb{1}\left\{ \sqrt{N} \frac{\bar m_\ell(\psi_p)}{\hat\sigma_\ell(\psi_p)} \leq \sqrt{\ln N} \right\} \right],$$
where $[\hat\Omega^{1/2}(\psi_p)\chi_r]_\ell$ is the $\ell$th element of the vector $\hat\Omega^{1/2}(\psi_p)\chi_r$.

Step 6: compute the critical value.
The critical value $\hat c^{AA}_N(\psi_p, 1-\alpha)$ is the $(1-\alpha)$-quantile of the distribution of $Q^{AA}_{N,r}(\psi_p)$ across the $R$ draws taken in step 5.

Step 7: reject or fail to reject $\psi_p$. If $Q(\psi_p) \leq \hat c^{AA}_N(\psi_p, 1-\alpha)$, include $\psi_p$ in the estimated $(1-\alpha)$% confidence set, $\hat\Psi^{1-\alpha}$, and the (initially empty) grid $\Psi_{g'}$ that will contain the confidence set.

Step 8: repeat steps 2 through 7 until a $\psi_p$ is not rejected. This will likely occur at the first point checked, $\hat\psi_{\min}$, as this parameter minimizes $Q(\psi_p)$, though it does not maximize $\hat c^{AA}_N(\psi_p, 1-\alpha)$.

Step 9: form a small grid around each $\psi_p$ in $\hat\Psi^{1-\alpha}$. Form $\Psi_{g,p}$, a local $K$-dimensional orthotope with 3 equidistant points in each dimension (with distance between points defined as in step 1), centered around $\psi_p$ for each $\psi_p$ in $\hat\Psi^{1-\alpha}$. Add $\Psi_{g,p}$ to the grid $\Psi_{g'}$ that will contain the confidence set.

Step 10: repeat steps 3 through 7 for every point in $\Psi_{g'}$ that has not yet been checked.

Step 11: iterate on steps 9 and 10 until all points in $\Psi_{g'}$ have been checked.

Step 12: ensure desired grid fineness. If the number of elements of the set $\hat\Psi^{1-\alpha}$ is below the desired minimum number, set the distance between grid points at one-half of the current value and repeat step 11. Repeat this step until the number of elements of $\hat\Psi^{1-\alpha}$ exceeds the desired number of such elements.

Appendix C: Moment Inequalities and Endogeneity

For proofs of the validity of the moment inequalities for providing a confidence set that consistently bounds the true parameter vector, $(\theta, \sigma_\epsilon)$, in the context of the setting presented in Section A, see DM. The inequalities also appear to consistently bound perceived returns in simulations when there is correlation between perceived prices and the unobserved error in perceived returns and correlation between information frictions and the unobserved error in perceived returns under the assumption $\frac{\rho}{1-\rho} \geq 0$, which is weaker than the assumption described in Section A. I provide proofs of consistency here for the revealed preference moment inequalities, and arguments for consistency for the odds-based moment inequalities, borrowing from the proofs provided by DM. Note that the parameters relevant to this section are $(\theta, \sigma_\xi)$, not those used in Section A. I use the notation $(\theta^*_\xi, \gamma^*_\xi, \rho^*_\xi) = \left(\frac{\theta}{\sigma_\xi}, \frac{1}{\sigma_\xi}, \frac{\rho}{\sigma_\xi}\right)$ throughout the following while assuming
$$\xi_i \mid X_i, Price_i, u_i \sim N(0, \sigma_\xi^2). \tag{C.1}$$
The condition $\frac{\rho}{1-\rho} \geq 0$ [...] $u_i$ with a multiplier of $-(1-\rho)$. Note that $\frac{\rho}{1-\rho} \geq 0 \implies \rho \in [0, 1)$ [...] $\rho \geq$ [...] $\rho \leq$ [...] $\sigma_\xi$, but not necessarily $\sigma$. As these parameters serve the same function, this has no effect on the predictive capacity of any resulting estimates of perceived returns.

I begin by presenting a lemma that will be useful in the subsequent proofs. It also serves as the main point of departure from the proofs provided by DM.

Lemma 1
If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1-\rho} \geq 0$, then
$$E[u_i(1-\rho) \mid S_i = 0, Z_i] \geq 0 \geq E[u_i(1-\rho) \mid S_i = 1, Z_i]. \tag{C.2}$$

Proof:
From the definition of $S_i$ given in (4) and (5), substituting in the expression for perceived returns in (13) implies
$$E[u_i(1-\rho) \mid S_i = 0, Z_i] = E[u_i(1-\rho) \mid X_i\theta - Price_i + u_i\rho + \xi_i \leq 0, Z_i]. \tag{C.3}$$
Substituting in the definition of $Price_i$ provided in (12) and rearranging the conditioning inequality implies
$$
\begin{aligned}
E[u_i(1-\rho) \mid X_i\theta - Price_i + u_i\rho + \xi_i \leq 0, Z_i]
&= E[u_i(1-\rho) \mid X_i\theta - Z_i\delta - u_i(1-\rho) + \xi_i \leq 0, Z_i] \\
&= E[u_i(1-\rho) \mid u_i(1-\rho) \geq (X_i\theta - Z_i\delta + \xi_i), Z_i].
\end{aligned} \tag{C.4}
$$
Given the property of expectations of truncated variables that $E[X \mid X \geq Y] \geq E[X]$, it follows that
$$E[u_i(1-\rho) \mid u_i(1-\rho) \geq (X_i\theta - Z_i\delta + \xi_i), Z_i] \geq E[u_i(1-\rho) \mid Z_i] = 0, \tag{C.5}$$
where the last equality follows from the definition of $u_i$ given in (12). The definition of $S_i$ given in (4) and (5), substituting in the expression of perceived returns in (13), also implies
$$E[u_i(1-\rho) \mid S_i = 1, Z_i] = E[u_i(1-\rho) \mid X_i\theta - Price_i + u_i\rho + \xi_i \geq 0, Z_i]. \tag{C.6}$$
Substituting in the definition of $Price_i$ provided in (12) and rearranging the conditioning inequality implies
$$
\begin{aligned}
E[u_i(1-\rho) \mid X_i\theta - Price_i + u_i\rho + \xi_i \geq 0, Z_i]
&= E[u_i(1-\rho) \mid X_i\theta - Z_i\delta - u_i(1-\rho) + \xi_i \geq 0, Z_i] \\
&= E[u_i(1-\rho) \mid u_i(1-\rho) \leq (X_i\theta - Z_i\delta + \xi_i), Z_i].
\end{aligned} \tag{C.7}
$$
Given the property of expectations of truncated variables that $E[X \mid X \leq Y] \leq E[X]$, it follows that
$$E[u_i(1-\rho) \mid u_i(1-\rho) \leq (X_i\theta - Z_i\delta + \xi_i), Z_i] \leq E[u_i(1-\rho) \mid Z_i] = 0, \tag{C.8}$$
where the last equality follows from the definition of $u_i$ given in (12). Substituting (C.5) into (C.3) and (C.8) into (C.6) implies (C.2). ∎

C.1. Proof of Revealed Preference Inequality Robustness to Endogeneity
Lemma 2
Suppose equations (4), (5), and (13) hold. Then
$$E\left[ S_i \left(X_i\theta - Price_i + u_i\rho + \xi_i\right) \,\middle|\, Z_i \right] \geq 0. \tag{C.9}$$

Proof:
From equations (4), (5), and (13),
$$S_i = \mathbb{1}\{X_i\theta - Price_i + u_i\rho + \xi_i \geq 0\}. \tag{C.10}$$
This implies
$$S_i \left(X_i\theta - Price_i + u_i\rho + \xi_i\right) \geq 0. \tag{C.11}$$
This inequality holds for every individual $i$, therefore it will hold in expectation conditional on $Z_i$. ∎

Lemma 3
Equations (4), (5), (12), (13), and (C.1) imply that
$$E\left[ S_i \left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi\right) + (1-S_i)\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \right) \,\middle|\, Z_i \right] \geq 0. \tag{C.12}$$

Proof:
Equation (C.9) and the definition of $Price_i$ from equation (12) imply
$$E[S_i(X_i\theta - Z_i\delta) \mid Z_i] - E[S_i u_i(1-\rho) \mid Z_i] + E[S_i\xi_i \mid Z_i] \geq 0. \tag{C.13}$$
The assumption in (12) implies that $E[u_i \mid Z_i] = 0$, so it follows that
$$E[S_i u_i(1-\rho) + (1-S_i) u_i(1-\rho) \mid Z_i] = 0.$$
Expression (C.1) implies that $E[\xi_i \mid X_i, Price_i, u_i] = 0$, which implies
$$E[S_i\xi_i + (1-S_i)\xi_i \mid X_i, Price_i, u_i] = 0.$$
Assuming the distribution of $Z_i$ conditional on $X_i, Price_i, u_i$ is degenerate and applying the law of iterated expectations, the preceding two equations allow us to rewrite equation (C.13) as
$$E[S_i(X_i\theta - Z_i\delta) \mid Z_i] + E[(1-S_i) u_i(1-\rho) \mid Z_i] - E[(1-S_i)\xi_i \mid Z_i] \geq 0. \tag{C.14}$$
Assuming the distribution of $Z_i$ conditional on $X_i, Price_i, u_i$ is degenerate and applying the law of iterated expectations also implies
$$
\begin{aligned}
E[(1-S_i)\xi_i \mid Z_i]
&= E\big[ E[(1-S_i)\xi_i \mid S_i, X_i, Price_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[(1-S_i) \mid X_i, Price_i, u_i] \, E[\xi_i \mid S_i, X_i, Price_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 1 \mid X_i, Price_i, u_i) \times 0 \times E[\xi_i \mid S_i = 1, X_i, Price_i, u_i] \\
&\qquad + P(S_i = 0 \mid X_i, Price_i, u_i) \times 1 \times E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 0 \mid X_i, Price_i, u_i) \, E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[(1-S_i) \mid X_i, Price_i, u_i] \, E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \,\big|\, Z_i \big] \\
&= E\Big[ E\big[ (1-S_i) E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \,\big|\, X_i, Price_i, u_i \big] \,\Big|\, Z_i \Big] \\
&= E\big[ (1-S_i) E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \,\big|\, Z_i \big].
\end{aligned}
$$
This allows us to rewrite equation (C.14) as
$$E\big[ S_i(X_i\theta - Z_i\delta) + (1-S_i)\big( u_i(1-\rho) - E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] \big) \,\big|\, Z_i \big] \geq 0. \tag{C.15}$$
Using the definition of $S_i$ from equation (4) and substituting in equations (5) and (13), it follows that
$$
\begin{aligned}
E[\xi_i \mid S_i = 0, X_i, Price_i, u_i]
&= E\big[ \xi_i \,\big|\, (-\xi_i \geq X_i\theta - Price_i + u_i\rho), X_i, Price_i, u_i \big] \\
&= -E\big[ -\xi_i \,\big|\, (-\xi_i \geq X_i\theta - Price_i + u_i\rho), X_i, Price_i, u_i \big],
\end{aligned}
$$
which allows us to rewrite
$$E[\xi_i \mid S_i = 0, X_i, Price_i, u_i] = -\sigma_\xi \frac{\phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \tag{C.16}$$
using Expression (C.1) and applying the symmetry of the normal distribution. Equation (C.12) follows by applying this equality to (C.15) and dividing each side of the resulting inequality by $\sigma_\xi$. ∎

Lemma 4
Given $\frac{\rho}{1-\rho} \geq 0$, equations (4), (5), (12), and (13) imply
$$
\begin{aligned}
&E\left[ (1-S_i)\left( u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi\right)} \right) \,\middle|\, Z_i \right] \\
&\quad \geq E\left[ (1-S_i)\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \right) \,\middle|\, Z_i \right].
\end{aligned} \tag{C.17}
$$

Proof:
Using the definition of $Price_i$ from equation (12), it follows that
$$
\begin{aligned}
&E\left[ (1-S_i)\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Price_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \right) \,\middle|\, Z_i \right] \\
&\quad = E\left[ (1-S_i)\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \right) \,\middle|\, Z_i \right].
\end{aligned} \tag{C.18}
$$
The law of iterated expectations and $S_i \in \{0, 1\}$ imply
$$
\begin{aligned}
&E\left[ (1-S_i)\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \right) \,\middle|\, Z_i \right] \\
&\quad = E\left[ (1-S_i) \, E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right].
\end{aligned} \tag{C.19}
$$
Because $\frac{\partial}{\partial x}\frac{\phi(-x)}{1-\Phi(-x)} \in (-1, 0)$, it follows that the expression
$$u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \tag{C.20}$$
is monotonically increasing in $u_i(\gamma^*_\xi - \rho^*_\xi)$. It follows then that adding a positive value to this value will increase the value of the function. From (C.2) and the condition $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} \geq 0$, we have $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \geq 0 \ \forall i$, so it follows that
$$
\begin{aligned}
&E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad = E\left[ \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \right) \right. \\
&\qquad\quad \left. + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \right)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \right)\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad \geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right],
\end{aligned} \tag{C.21}
$$
where the second line relates to the third by this addition, and the first relates to the second by algebraic simplifications. Finally, because the term
$$u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i]$$
and the term
$$\frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}$$
are globally convex in $-u_i$, the entire function is globally convex in $-u_i$. It follows that
$$
\begin{aligned}
&E\left[ u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad \geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)} \,\middle|\, S_i = 0, Z_i \right]
\end{aligned} \tag{C.22}
$$
by Jensen's inequality. Combining this inequality with that in (C.21) yields the result
$$
\begin{aligned}
&E\left[ u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad \geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right].
\end{aligned} \tag{C.23}
$$
It follows immediately that
$$
\begin{aligned}
&E\left[ (1-S_i) \, E\left[ u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right] \\
&\quad \geq E\left[ (1-S_i) \, E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right].
\end{aligned} \tag{C.24}
$$
Equation (C.17) follows from this by substituting in the definition of $Price_i$ from (12) and applying the law of iterated expectations. ∎

Corollary 1
Given (C.12), (C.17), and the definition of $Price_i$ given in (12), it follows that
$$E\left[ S_i (X_i\theta^*_\xi - Price_i\gamma^*_\xi) + (1-S_i) \frac{\phi(X_i\theta^*_\xi - Price_i\gamma^*_\xi)}{1-\Phi(X_i\theta^*_\xi - Price_i\gamma^*_\xi)} \,\middle|\, Z_i \right] \geq 0. \tag{C.25}$$

Proof:
The result follows from equations (C.12) and (C.17), substituting $E[-S_i u_i \mid Z_i] = E[(1-S_i) u_i \mid Z_i]$, and substituting in the definition of $Price_i$ given in (12). ∎

Lemma 5
Suppose equations (4), (5), and (13) hold. Then
$$E\left[ -(1-S_i) \left(X_i\theta - Price_i + u_i\rho + \xi_i\right) \,\middle|\, Z_i \right] \geq 0. \tag{C.26}$$

Proof:
From equations (4), (5), and (13),
$$S_i = \mathbb{1}\{X_i\theta - Price_i + u_i\rho + \xi_i \geq 0\}. \tag{C.27}$$
This implies
$$-(1-S_i) \left(X_i\theta - Price_i + u_i\rho + \xi_i\right) \geq 0. \tag{C.28}$$
This inequality holds for every individual $i$, therefore it will hold in expectation conditional on $Z_i$. ∎

Lemma 6
Equations (4) , (5) , (12) , (13) , and ( C. imply that E (cid:34) − (1 − S i ) (cid:0) X i θ ∗ ξ − Z i δγ ∗ ξ (cid:1) + S i (cid:32) − u i ( γ ∗ ξ − ρ ∗ ξ ) + φ (cid:0) X i θ ∗ ξ − P rice i γ ∗ ξ + u i ρ ∗ ξ (cid:1) Φ (cid:0) X i θ ∗ ξ − P rice i γ ∗ ξ + u i ρ ∗ ξ (cid:1) (cid:33)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z i (cid:35) ≥ Proof:
Equation (C.26) and the definition of
P rice i from equation (12) imply E [ − (1 − S i )( X i θ − Z i δ ) | Z i ] + E [(1 − S i ) u i (1 − ρ ) | Z i (cid:3) − E [(1 − S i ) ξ i | Z i ] ≥ . (C.30)The assumption in (12) implies that E [ u i | Z i ] = 0, so it follows that E [ S i u i (1 − ρ ) + (1 − S i ) u i (1 − ρ ) | Z i ] = 0 . Expression (C.1) implies that E [ ξ i | X i , P rice i , u i ] = 0, which implies E [ S i ξ i + (1 − S i ) ξ i | X i , P rice i , u i ] = 0 . Assuming the distribution of Z i conditional on X i , P rice i , u i is degenerate and applyingthe law of iterated expectations, the preceding two equations allow us to rewrite equation(C.30) as E [ − (1 − S i )( X i θ − Z i δ ) | Z i ] − E [ S i u i (1 − ρ ) | Z i ] + E (cid:2) S i ξ i (cid:12)(cid:12) Z i (cid:3) ≥ . (C.31)Assuming the distribution of Z i conditional on X i , P rice i , u i is degenerate and applying37he law of iterated expectations also implies E [ S i ξ i | Z i ] = E (cid:2) E [ S i ξ i | S i , X i , P rice i , u i ] (cid:12)(cid:12) Z i (cid:3) = E (cid:2) E [ S i | X i , P rice i , u i ] E [ ξ i | S i , X i , P rice i , u i ] (cid:12)(cid:12) Z i (cid:3) = E (cid:2) P (cid:0) S i = 1 | X i , P rice i , u i (cid:1) × × E [ ξ i | S i = 1 , X i , P rice i , u i ]+ P (cid:0) S i = 0 | X i , P rice i , u i (cid:1) × × E [ ξ i | S i = 0 , X i , P rice i , u i ] (cid:12)(cid:12) Z i (cid:3) = E (cid:2) P (cid:0) S i = 1 | X i , P rice i , u i (cid:1) E [ ξ i | S i = 1 , X i , P rice i , u i ] (cid:12)(cid:12) Z i (cid:3) = E (cid:2) E [ S i | X i , P rice i , u i ] E [ ξ i | S i = 1 , X i , P rice i , u i ] (cid:12)(cid:12) Z i (cid:3) = E (cid:104) E (cid:2) S i E [ ξ i | S i = 1 , X i , P rice i , u i ] (cid:12)(cid:12) X i , P rice i , u i ] (cid:12)(cid:12)(cid:12) Z i (cid:105) = E (cid:2) S i E [ ξ i | S i = 1 , X i , P rice i , u i ] (cid:12)(cid:12) Z i ] . 
This allows us to rewrite equation (C.31) as
$$E\big[-(1-S_i)(X_i\theta - Z_i\delta) + S_i\big(-u_i(1-\rho) + E[\xi_i \mid S_i=1, X_i, \text{Price}_i, u_i]\big) \,\big|\, Z_i\big] \ge 0. \tag{C.32}$$
Using the definition of $S_i$ from equation (4) and substituting in equations (5) and (13), it follows that
$$\begin{aligned}
E[\xi_i \mid S_i=1, X_i, \text{Price}_i, u_i] &= E\big[\xi_i \,\big|\, (-\xi_i \le X_i\theta - \text{Price}_i + u_i\rho), X_i, \text{Price}_i, u_i\big] \\
&= -E\big[-\xi_i \,\big|\, (-\xi_i \le X_i\theta - \text{Price}_i + u_i\rho), X_i, \text{Price}_i, u_i\big],
\end{aligned}$$
which allows us to rewrite
$$E[\xi_i \mid S_i=1, X_i, \text{Price}_i, u_i] = \sigma_\xi\, \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \tag{C.33}$$
using Expression (C.1) and applying the symmetry of the normal distribution. Equation (C.29) follows by applying this equality to (C.32) and dividing each side of the resulting inequality by $\sigma_\xi$. ■
Lemma 7 Given $\frac{\rho}{1-\rho} \ge 0$, equations (4), (5), (12), and (13) imply
$$E\left[S_i\left(-u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}\right)\Bigg|\, Z_i\right] \ge E\left[S_i\left(-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\right)\Bigg|\, Z_i\right] \tag{C.34}$$
Proof:
Using the definition of $\text{Price}_i$ from equation (12), it follows that
$$E\left[S_i\left(-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\right)\Bigg|\, Z_i\right] = E\left[S_i\left(-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\right)\Bigg|\, Z_i\right]. \tag{C.35}$$
The law of iterated expectations and $S_i \in \{0,1\}$ imply
$$E\left[S_i\left(-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\right)\Bigg|\, Z_i\right] = E\left[S_i\, E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right]\Bigg|\, Z_i\right]. \tag{C.36}$$
Because $\frac{\partial}{\partial x}\frac{\phi(-x)}{\Phi(-x)} \in (0,1)$, it follows that the expression
$$-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)} \tag{C.37}$$
is monotonically decreasing in $u_i(\gamma^*_\xi-\rho^*_\xi)$. It follows then that adding a negative value to $u_i(\gamma^*_\xi-\rho^*_\xi)$ will increase the value of the expression.
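The monotonicity step above can be checked numerically. The following is a minimal sketch, not part of the paper's procedure: it evaluates the derivative of $\phi(-x)/\Phi(-x)$ on a grid using the standard identity $\lambda'(z) = -\lambda(z)(z + \lambda(z))$ for the inverse Mills ratio $\lambda = \phi/\Phi$, and then evaluates an expression of the form (C.37). The function names, the grid, and the value `index=0.5` (standing in for $X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi$) are illustrative assumptions.

```python
import math


def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal CDF Phi(x), computed with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def mills(z):
    """Inverse Mills ratio phi(z) / Phi(z)."""
    return norm_pdf(z) / norm_cdf(z)


def d_reflected_mills(x):
    """Derivative of phi(-x)/Phi(-x) with respect to x.

    Uses lambda'(z) = -lambda(z) * (z + lambda(z)) evaluated at z = -x.
    """
    m = mills(-x)
    return m * (m - x)


def expression_c37(t, index=0.5):
    """-t + phi(index - t)/Phi(index - t), where t stands in for u_i(gamma - rho).

    `index` is a hypothetical value for X_i theta - Z_i delta gamma.
    """
    return -t + mills(index - t)


# Hypothetical evaluation grid from -8 to 8 in steps of 0.1.
grid = [k / 10.0 for k in range(-80, 81)]
slopes = [d_reflected_mills(x) for x in grid]
values = [expression_c37(t) for t in grid]

print("slope range:", min(slopes), max(slopes))
print("expression decreasing on grid:", all(b < a for a, b in zip(values, values[1:])))
```

On this grid the slopes stay strictly inside the unit interval, so the expression falls as its argument rises, which is the property the proof relies on.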
From (C.2) and the condition $\frac{\rho^*_\xi}{\gamma^*_\xi-\rho^*_\xi} \ge 0$, we have $\frac{\rho^*_\xi}{\gamma^*_\xi-\rho^*_\xi} E[u_i(\gamma^*_\xi-\rho^*_\xi) \mid S_i=1, Z_i] \le 0 \;\forall i$, so it follows that
$$\begin{aligned}
&E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}\,\Bigg|\, S_i=1, Z_i\right] \\
&= E\left[-\left(u_i(\gamma^*_\xi-\rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi-\rho^*_\xi} E[u_i(\gamma^*_\xi-\rho^*_\xi) \mid S_i=1, Z_i]\right) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left(u_i(\gamma^*_\xi-\rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi-\rho^*_\xi} E[u_i(\gamma^*_\xi-\rho^*_\xi) \mid S_i=1, Z_i]\right)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left(u_i(\gamma^*_\xi-\rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi-\rho^*_\xi} E[u_i(\gamma^*_\xi-\rho^*_\xi) \mid S_i=1, Z_i]\right)\right)}\,\Bigg|\, S_i=1, Z_i\right] \\
&\ge E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right],
\end{aligned} \tag{C.38}$$
where the second line relates to the third by this addition, and the first relates to the second by algebraic simplifications. Finally, because the term
$$\frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}$$
is globally convex in $-u_i$, and the remaining terms are linear in $-u_i$, the function inside the expectation is globally convex in $-u_i$.
It follows that
$$E\left[-u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right] \ge E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}\,\Bigg|\, S_i=1, Z_i\right] \tag{C.39}$$
by Jensen's inequality. Combining this inequality with that in (C.38) yields the result
$$E\left[-u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right] \ge E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right]. \tag{C.40}$$
It follows immediately that
$$E\left[S_i\, E\left[-u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right]\Bigg|\, Z_i\right] \ge E\left[S_i\, E\left[-u_i(\gamma^*_\xi-\rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right]\Bigg|\, Z_i\right]. \tag{C.41}$$
Equation (C.34) follows from this by substituting the definition of $\text{Price}_i$ from (12) and applying the law of iterated expectations. ■
Corollary 2
Given (C.29), (C.34), and the definition of $\text{Price}_i$ given in (12), it follows that
$$E\left[-(1-S_i)\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right) + S_i\,\frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}\,\Bigg|\, Z_i\right] \ge 0. \tag{C.42}$$
Proof: The result follows from equations (C.29) and (C.34), the fact that $E[S_iu_i \mid Z_i] = -E[(1-S_i)u_i \mid Z_i]$, and the definition of $\text{Price}_i$ given in (12). ■
Proof of Robustness of Revealed Preference Inequalities to Endogeneity: Combining equations (C.25) and (C.42) provides both inequalities defined in equation (A.3). ■
C.2. Argument for Odds-Based Inequality Robustness to Endogeneity
The following argument is constructed as a proof, where the components of the argument that do not meet the standards of a proof are discussed as they arise.
Lemma 8 Equations (4), (5), (13), (C. , and the assumption that the distribution of $Z_i$ is degenerate conditional on $(X_i, \text{Price}_i, u_i)$ imply that
$$E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} - (1-S_i)\,\Bigg|\, Z_i\right] \ge 0. \tag{C.43}$$
Proof:
Expression (C.1) implies that
$$S_i - \mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \ge 0\} \ge 0,$$
or, equivalently,
$$1 - \mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \ge 0\} - (1-S_i) \ge 0,$$
$$\mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \le 0\} - (1-S_i) \ge 0,$$
for all $i$. Given that this inequality holds for all individuals, it will also hold in expectation, conditional on any set of variables, across individuals. It follows that
$$E\big[\mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \le 0\} - (1-S_i) \,\big|\, X_i, \text{Price}_i, u_i\big] \ge 0.$$
The distributional assumption in (C.1) implies
$$E\big[1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi) - (1-S_i) \,\big|\, X_i, \text{Price}_i, u_i\big] \ge 0.$$
Dividing through by $\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)$ yields
$$E\left[\frac{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \frac{1-S_i}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0.$$
Adding and subtracting $1-S_i$ gives
$$E\left[\frac{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left(1 - 1 + \frac{1}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\right)(1-S_i)\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0,$$
which we can rearrange into
$$E\left[\frac{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left(1 + \frac{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\right)(1-S_i)\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0,$$
which can then be rearranged into
$$E\left[S_i\,\frac{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - (1-S_i)\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0.$$
Equation (C.43) follows from the law of iterated expectations and the assumption that the distribution of $Z_i$ conditional on $(X_i, \text{Price}_i, u_i)$ is degenerate. ■
Lemma 9
If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1-\rho} \ge 0$, then
$$E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}\,\Bigg|\, Z_i\right] \ge E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\,\Bigg|\, Z_i\right]. \tag{C.44}$$
Argument: Substituting the definition of $\text{Price}_i$ from equation (12), we have that
$$E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, Z_i\right] = E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\,\Bigg|\, Z_i\right].$$
Because $S_i \in \{0,1\}$, it follows that
$$E\left[S_i\,\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, Z_i\right] \ge 0 \iff E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right] \ge 0.$$
Proving the argument requires that
$$E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right] \ge E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right].$$
Given that $\frac{1-\Phi(x)}{\Phi(x)}$ is globally convex in $x$, Jensen's inequality implies that
$$E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right] \ge E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}\,\Bigg|\, S_i=1, Z_i\right].$$
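The convexity used in this Jensen step can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not the paper's argument: it checks that second differences of $(1-\Phi(x))/\Phi(x)$ are positive on a grid, and that averaging the function over simulated draws is at least the function evaluated at the average draw. The values `base` and `rho_star`, the seed, and the standard normal draws standing in for $u_i$ are hypothetical choices.

```python
import math
import random


def norm_cdf(x):
    """Standard normal CDF Phi(x), computed with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def odds_against(x):
    """(1 - Phi(x)) / Phi(x), using 1 - Phi(x) = Phi(-x)."""
    return norm_cdf(-x) / norm_cdf(x)


def second_difference(f, x, step=1e-3):
    """Central second difference, a finite-difference proxy for f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / (step * step)


# Convexity check on a hypothetical grid from -5 to 5 in steps of 0.1.
grid = [k / 10.0 for k in range(-50, 51)]
second_diffs = [second_difference(odds_against, x) for x in grid]

# Jensen check with simulated draws standing in for u_i (illustrative values only).
rng = random.Random(1234)
draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]
base = 0.3       # hypothetical value of the non-random part of the index
rho_star = 0.4   # hypothetical value of the scaled control function coefficient
mean_u = sum(draws) / len(draws)

jensen_lhs = sum(odds_against(base - rho_star * u) for u in draws) / len(draws)
jensen_rhs = odds_against(base - rho_star * mean_u)

print("smallest second difference:", min(second_diffs))
print("mean of function:", jensen_lhs, "function of mean:", jensen_rhs)
```

The gap between the two printed quantities grows with the dispersion of the draws, which is the sense in which the first inequality in the argument is driven by a conditional variance.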
Meanwhile, (C.2) implies
$$E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}\,\Bigg|\, S_i=1, Z_i\right] \le E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right].$$
Combining these inequalities yields
$$\begin{aligned}
&E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=1, Z_i\right] \\
&\ge E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=1, Z_i]\right)}\,\Bigg|\, S_i=1, Z_i\right] \\
&\le E\left[\frac{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=1, Z_i\right].
\end{aligned}$$
Thus, the argument holds if the first inequality dominates the second. There is good reason to believe that this will be the case. The first inequality arises from $\text{Var}(u_i \mid S_i=1, Z_i)$ (through Jensen's inequality), while the second arises from $E[u_i \mid S_i=1, Z_i]$. Three points are salient here. First, given that $E[u_i \mid Z_i] = 0$, the value of $E[u_i \mid S_i=1, Z_i]$ is a monotonic function of $\text{Var}(u_i \mid Z_i)$. Second, the function $(1-\Phi(x))/\Phi(x)$ has a very large second derivative for most of its support, such that the application of Jensen's inequality will have a large effect on the inequality. Third, because $\rho^*_\xi$ is constrained to be small relative to $\gamma^*_\xi$, $E[u_i \mid S_i=1, Z_i]$ is likely to have a relatively small effect on the inequality.
Equation (C.44) follows from the preceding inequalities if the first inequality dominates the second by performing simple algebraic manipulations and applying the definition of $\text{Price}_i$ given in (12). ■
Lemma 10
Equations (4), (5), (13), (C. , and the assumption that the distribution of $Z_i$ is degenerate conditional on $(X_i, \text{Price}_i, u_i)$ imply that
$$E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} - S_i\,\Bigg|\, Z_i\right] \ge 0. \tag{C.45}$$
Proof: Expression (C.1) implies that
$$\mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \ge 0\} - S_i \ge 0.$$
Given that this inequality holds for all individuals, it will also hold in expectation, conditional on any set of variables, across individuals. It follows that
$$E\big[\mathbf{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \ge 0\} - S_i \,\big|\, X_i, \text{Price}_i, u_i\big] \ge 0.$$
The distributional assumption in (C.1) implies
$$E\big[\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi) - S_i \,\big|\, X_i, \text{Price}_i, u_i\big] \ge 0.$$
Dividing through by $1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)$ yields
$$E\left[\frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \frac{S_i}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0.$$
Adding and subtracting $S_i$ gives
$$E\left[\frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left(1 - 1 + \frac{1}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\right)S_i\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0,$$
which we can rearrange into
$$E\left[\frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left(1 + \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}\right)S_i\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0,$$
which is straightforward to rearrange into
$$E\left[(1-S_i)\,\frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1-\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - S_i\,\Bigg|\, X_i, \text{Price}_i, u_i\right] \ge 0.$$
Equation (C.45) follows from the law of iterated expectations and the assumption that the distribution of $Z_i$ conditional on $(X_i, \text{Price}_i, u_i)$ is degenerate. ■
Lemma 11
If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1-\rho} \ge 0$, then
$$E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}\,\Bigg|\, Z_i\right] \ge E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\,\Bigg|\, Z_i\right]. \tag{C.46}$$
Argument: Substituting the definition of $\text{Price}_i$ from equation (12), we have that
$$E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, Z_i\right] = E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}\,\Bigg|\, Z_i\right].$$
Because $S_i \in \{0,1\}$, it follows that
$$E\left[(1-S_i)\,\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, Z_i\right] \ge 0 \iff E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=0, Z_i\right] \ge 0.$$
Proving the argument requires that
$$E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=0, Z_i\right] \ge E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=0, Z_i\right].$$
Given that $\frac{\Phi(x)}{1-\Phi(x)}$ is globally convex in $x$, Jensen's inequality implies that
$$E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=0, Z_i\right] \ge E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}\,\Bigg|\, S_i=0, Z_i\right].$$
Meanwhile, (C.2) implies
$$E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}\,\Bigg|\, S_i=0, Z_i\right] \le E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=0, Z_i\right].$$
Combining these inequalities yields
$$\begin{aligned}
&E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - u_i\rho^*_\xi\right)}\,\Bigg|\, S_i=0, Z_i\right] \\
&\ge E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i=0, Z_i]\right)}\,\Bigg|\, S_i=0, Z_i\right] \\
&\le E\left[\frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}{1-\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi-\rho^*_\xi)\right)}\,\Bigg|\, S_i=0, Z_i\right].
\end{aligned}$$
Thus, the argument holds if the first inequality dominates the second. There is good reason to believe that this will be the case. The first inequality arises from $\text{Var}(u_i \mid S_i=0, Z_i)$ (through Jensen's inequality), while the second arises from $E[u_i \mid S_i=0, Z_i]$. Three points are salient here. First, given that $E[u_i \mid Z_i] = 0$, the value of $E[u_i \mid S_i=0, Z_i]$ is a monotonic function of $\text{Var}(u_i \mid Z_i)$. Second, the function $\Phi(x)/(1-\Phi(x))$ has a very large second derivative for most of its support, such that the application of Jensen's inequality will have a large effect on the inequality. Third, because $\rho^*_\xi$ is constrained to be small relative to $\gamma^*_\xi$, $E[u_i \mid S_i=0, Z_i]$ is likely to have a relatively small effect on the inequality.
Equation (C.46) follows from the immediately preceding inequalities if the first inequality dominates the second by performing simple algebraic manipulations and applying the definition of $\text{Price}_i$ given in (12). ■
Argument for Odds Based Inequality Robustness to Endogeneity:
Substituting equation (C.44) into (C.43) and equation (C.46) into (C.45) provides the inequalities defined in equation (A.7).
Appendix D: Additional Simulations
This section presents additional simulations that include estimated bounds on parameters using the moment inequality method described in Appendix A. I present a series of variations on the setting described in Section 3.2, where the magnitudes and directions of selection and misperception biases vary. These simulations demonstrate the robustness of the control function method to a wide variety of empirical settings, while also demonstrating the performance of the moment inequalities in settings other than that described in Section A. I also present simulations with additional explanatory variables in order to demonstrate the computational performance of the different estimators.
As in the body of the paper, I use the following DGP,
$$\begin{aligned}
Y_i &= X_i\beta - \widetilde{\text{Price}}_i + \epsilon_i \\
\widetilde{\text{Price}}_i &= \text{Price}_i + \nu_i \\
\text{Price}_i &= Z_i\delta + u_i,
\end{aligned} \tag{D.1}$$
where for these simulations $Z_i$ is always uncorrelated with $\epsilon_i$, $\nu_i$, and $u_i$, and the nature of the covariance structure on these error terms will determine which methods will and will not provide consistent estimates of perceived returns. Because perceived prices only differ from realized prices in idiosyncratic ways, $\beta = \theta$ in all the following DGPs. Finally, I note that the probit will estimate $(\theta, \sigma)$, the control function method will estimate $(\theta, \rho, \sigma_\zeta)$, and the moment inequalities will bound $(\theta, \sigma_\epsilon)$ under the assumptions in Appendix A or $(\theta, \sigma_\xi)$ under the assumptions in Appendix C, where these are defined in sections 3.1, 3.2, and Appendix A.
Each DGP is comprised of $N = 10{,}000$ observations of agents whose decisions are governed by their perceived returns to selection. I construct the instrument vector as $Z_i = [X_i \;\, z_i]$ where $X_i$ always includes only a constant unless otherwise stated, and $z_i$ is a single known and exogenous instrument. Finally, I assume the constant $\beta = 1$ and $\delta = [0 \;\, 1]'$ for all DGPs.
D.1. Known, Exogenous Prices
I begin with a well-behaved benchmark DGP that corresponds to the setting described in Section 3.1. I generate data according to
$$\begin{pmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{pmatrix} \sim N(\mathbf{0}, \Sigma); \quad \Sigma = [\ldots], \tag{D.2}$$
where I include $\nu_i$ with a variance of zero such that agents have perfect information on prices.
Table D.1 shows perceived returns estimates for one simulation of this DGP using all three methods. Figure D.1 shows the distributions implied by the estimates for each method. Because this DGP is particularly well-behaved, all three methods' estimates are very close to the data-generating parameters. Additionally, the moment inequalities provide very tight bounds here because the first-stage error has relatively low variance such that making use of $E[\text{Price}_i \mid Z_i]$ in place of $\widetilde{\text{Price}}_i$ introduces little uncertainty into the estimated perceived returns.
D.2. Mean-reverting Misperceptions of Exogenous Prices
In this simulation, I consider a DGP that corresponds to the setting described in Section A in which agents do not precisely forecast prices such that $\text{Price}_i \neq \widetilde{\text{Price}}_i$. Specifically, price misperceptions move in the opposite direction of prices such that $\text{Cov}(\text{Price}_i, \nu_i) < 0$, as in the case when agents form rational expectations on prices using a strict subset of relevant forecasting variables. This causes agents to tend to believe their price is closer to the average than it actually is. I generate data according to
$$\begin{pmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{pmatrix} \sim N(\mathbf{0}, \Sigma); \quad \Sigma = [\ldots]. \tag{D.3}$$
Table D.2 shows the estimates for one simulation of this DGP using all three methods. Figure D.2 shows the distributions implied by the estimates for each method. The control function and moment inequality estimates are close to the true parameters. The control function estimates are significantly more precise than those of the moment inequalities. The probit's estimates are biased upward as expected, given the normality assumptions on the errors (Yatchew and Griliches, 1985).

Table D.1: Perceived Returns Estimates, Known Exogenous Prices

| | Target | (1) Probit | (2) Control Function | (3) Moment Inequalities |
|---|---|---|---|---|
| Constant | 1 | 0.986 | 0.997 | [0.906, 1.066] |
| | | (0.033) | (0.034) | N/A |
| $\sigma$ | | | | |
| $\sigma_\zeta$ | | | | |
| $\rho$ | | | | |
| $(\sigma_\epsilon, \sigma_\xi)$ | (2,2) | . | . | [1.976, 2.236] |
| | | | | N/A |
| Observations | | 10000 | 10000 | 10000 |

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234.

Figure D.1: Perceived Returns Distribution, Known Exogenous Prices
Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown using $\varphi = [0, 1]$ with steps of 1/

Table D.2: Perceived Returns Estimates, Mean-Reverting Misperceptions

| | Target | (1) Probit | (2) Control Function | (3) Moment Inequalities |
|---|---|---|---|---|
| Constant | 1 | 2.753 | 0.896 | [-0.819, 2.611] |
| | | (0.160) | (0.071) | N/A |
| $\sigma$ | | | | |
| $\sigma_\zeta$ | | | | |
| $\rho$ | | | | |
| $(\sigma_\epsilon, \sigma_\xi)$ | (2,3) | . | . | [1.119, 4.919] |
| | | | | N/A |
| Observations | | 10000 | 10000 | 10000 |

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234.

D.3. Known, Positively Selected Prices
In this simulation, I consider a DGP in which prices are known, but $u_i$ and $\epsilon_i$ are positively correlated, such as in the case of price discrimination. This setting is one case of that described in Section 3.2. I generate data according to
$$\begin{pmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{pmatrix} \sim N(\mathbf{0}, \Sigma); \quad \Sigma = [\ldots]. \tag{D.4}$$
Table D.3 shows the estimates for one simulation of this DGP using all three methods. Figure D.3 shows the distributions implied by the estimates for each method. The control function estimates are close to the true parameter values, while the moment inequalities also bound the true parameters. The probit estimates are biased, as expected given the price endogeneity.
In this case $\rho \in [0,$ [...] $\rho$, it is worth noting that the sign is determined by $E[u_i(-\nu_i + \epsilon_i)]$, such that negative (positive) correlation between $u_i$ and $\nu_i$ will produce a situation equivalent to positive (negative) correlation between $u_i$ and $\epsilon_i$.

Figure D.2: Perceived Returns Distributions, Mean-Reverting Misperceptions
Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown using $\varphi = [0, 1]$ with steps of 1/

Table D.3: Perceived Returns Estimates, Positively Selected Prices

| | Target | (1) Probit | (2) Control Function | (3) Moment Inequalities |
|---|---|---|---|---|
| Constant | 1 | 2.932 | 0.989 | [-0.483, 2.757] |
| | | (0.168) | (0.074) | N/A |
| $\sigma$ | | | | |
| $\sigma_\zeta$ | | | | |
| $\rho$ | | | | |
| $(\sigma_\epsilon, \sigma_\xi)$ | (4,3) | . | . | [1.271, 4.889] |
| | | | | N/A |
| Observations | | 10000 | 10000 | 10000 |

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234.

Figure D.3: Perceived Returns Distributions, Positively Selected Prices
Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown using $\varphi = [0, 1]$ with steps of 1/

D.4. Known, Negatively Selected Prices
In this simulation, I consider a DGP in which $u_i$ and $\epsilon_i$ are negatively correlated. This setting is a case of the one described in Section 3.2. I generate data according to
$$\begin{pmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{pmatrix} \sim N(\mathbf{0}, \Sigma); \quad \Sigma = [\ldots], \tag{D.5}$$
where I include $\nu_i$ with a variance of zero to emphasize that there are no price misperceptions in this case.
Table D.4 shows the estimates for one simulation of this DGP using all three methods. Figure D.4 shows the distributions implied by the estimates for each method. The control function method estimates are close to the true parameter values, while the other methods perform poorly. In the case of the probit, there is nothing to address endogeneity, while the moment inequalities address positive correlation between prices and the composite idiosyncratic preference term $\nu_i + \epsilon_i$, but not negative correlation.

Table D.4: Perceived Returns Estimates, Negatively Selected Prices

| | Target | (1) Probit | (2) Control Function | (3) Moment Inequalities |
|---|---|---|---|---|
| Constant | 1 | 0.739 | 1.100 | [-1.753, 2.654] |
| | | (0.034) | (0.088) | N/A |
| $\sigma$ | | | | |
| $\sigma_\zeta$ | | | | |
| $\rho$ | -.5 | . | -0.497 | . |
| | | | (0.059) | |
| $(\sigma_\epsilon, \sigma_\xi)$ | (3,2) | . | . | [0.131, 0.922] |
| | | | | N/A |
| Observations | | 10000 | 10000 | 10000 |

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234.

Figure D.4
Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown using $\varphi = [0, 1]$ with steps of 1/

D.5. Mean-reverting Misperceptions of Positively Selected Prices
In this simulation, I consider a DGP in which u i is positively correlated with (cid:15) i andnegative correlated with ν i . This case would occur in a setting in which there is pricediscrimination on unobserved components of preferences, and agents are only aware ofa subset of price determinants and form rational expectations based on known pricedeterminants. I generate data according to z i u i ν i (cid:15) i ∼ N ( , Σ); Σ = − − . (D.6)This setting corresponds to the one described in Section 3.2. This setting is likely themost realistic, insofar as mean-reverting price misperceptions and positive selection onprices are likely. In this case, Section 3.2 and Appendix C suggest that the controlfunction method and the moment inequality method will consistently estimate perceivedreturns, but the probit will not. 57able D.5 shows the estimates for one simulation of this DGP using all three methods.Figure D.5 shows the distributions implied by the estimates for each method. The controlfunction method estimates are close to the true parameter values, while the momentinequalities bound the true values. The probit estimates are biased away from zero,because variation in prices predicts relatively modest changes in investment, as not allprice variation is known to agents and because price variation is accompanied by higheridiosyncratic preferences for investment.Table D.5: Perceived Returns Estimates, Positively Selected Partially Known Prices(1) (2) (3)Target Probit Control Function Moment InequalitiesConstant 1 3.076 0.978 [-0.785, 2.740](0.173) (0.073) N/A σ σ ζ ρ σ (cid:15) , σ ξ ) (2,3) . . [1.107, 4.970](.) N/AObservations 10000 10000 10000 Notes:
Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for σ_ε under the assumptions in Appendix A and σ_ξ under those in Appendix C. All data is generated in Stata using random seed 1234.

D.6. Computational Comparison with Controls
Next, I present two simulations that include additional explanatory variables. This exercise is intended to provide a computational comparison of the control function method and the moment inequality method, so the tables report the computation time taken to complete each procedure. These simulations use the DGP described in Section D.5 with the addition of the variable x_1 in the first simulation, and x_1 and x_2 in the second. I set x_1 ∼ N(0, 4) and x_2 ∼ N(0, 4) with coefficients of zero. The results are shown in Table D.6 and Table D.7, respectively, where graphs of implied perceived returns are omitted because they are visually indistinguishable from Figure D.5 (given the zero coefficients on the new variables). All simulations are performed on a Linux server with two Intel Xeon X5550 CPUs and 48GB of RAM. Note that the run times in seconds for the moment inequalities are orders of magnitude higher than those of the other methods in both simulations, and that this difference is increasing in the number of variables.

Figure D.5: Perceived Returns Distributions, Positively Selected Partially Known Prices
Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities’ 95% confidence set are shown using ϕ = [0, 1] with steps of 1/

Table D.6: Perceived Returns Estimates, 1 Control

                              (1)       (2)                (3)
                  Target      Probit    Control Function   Moment Inequalities
Constant          1           3.075     0.977              [-6.956, 8.029]
                              (0.173)   (0.073)            N/A
x_1
σ
σ_ζ
ρ
(σ_ε, σ_ξ)        (2,3)       .         .                  [1.106, 4.970]
                                                           N/A
Observations                  10000     10000              10000
Computation Time              0         2                  1017

Notes:
Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for σ_ε under the assumptions in Appendix A and σ_ξ under those in Appendix C. All data is generated in Stata using random seed 1234. Computation time is rounded to the nearest whole second.

Table D.7: Perceived Returns Estimates, 2 Controls

                              (1)       (2)                (3)
                  Target      Probit    Control Function   Moment Inequalities
Constant          1           3.076     0.977              [-14.890, 16.550]
                              (0.173)   (0.073)            N/A
x_1
x_2
σ
σ_ζ
ρ
(σ_ε, σ_ξ)        (2,3)       .         .                  [0.677, 4.970]
                                                           N/A
Observations                  10000     10000              10000
Computation Time              1         2                  21085

Notes:
Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for σ_ε under the assumptions in Appendix A and σ_ξ under those in Appendix C. All data is generated in Stata using random seed 1234. Computation time is rounded to the nearest whole second.
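The computation times reported in Tables D.6 and D.7 can be compared directly. The following Python sketch is an illustration only (the simulations themselves are run in Stata). It copies the rounded run times from the two tables and computes how much slower the moment inequalities are than the control function, and how the moment-inequality run time grows when the second control is added. The names run_times, slowdown, ratios, and growth are hypothetical and chosen for this example.

```python
# Run times in seconds, copied from Tables D.6 and D.7 (rounded to whole seconds).
run_times = {
    "1 control": {"probit": 0, "control_function": 2, "moment_inequalities": 1017},
    "2 controls": {"probit": 1, "control_function": 2, "moment_inequalities": 21085},
}


def slowdown(times, slow="moment_inequalities", fast="control_function"):
    """Ratio of the slower method's run time to the faster method's run time."""
    return times[slow] / times[fast]


# Moment inequalities relative to the control function, for each simulation.
ratios = {label: slowdown(times) for label, times in run_times.items()}

# Growth in the moment-inequality run time when going from one control to two.
growth = (
    run_times["2 controls"]["moment_inequalities"]
    / run_times["1 control"]["moment_inequalities"]
)

for label, ratio in ratios.items():
    print(f"{label}: moment inequalities / control function = {ratio:.1f}")
print(f"Moment-inequality run time growth from 1 to 2 controls: {growth:.1f}")
```

With the reported times, the moment inequalities take on the order of 500 times as long as the control function with one control and on the order of 10,000 times as long with two, and adding the second control raises the moment-inequality run time roughly twentyfold. Because the reported times are rounded to whole seconds, these ratios are approximate.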