Identifying and Estimating Perceived Returns to Binary Investments

Abstract

I describe a method for estimating agents' perceived returns to investments that relies on cross-sectional data containing binary choices and prices, where prices may be imperfectly known to agents. This method identifies the scale of perceived returns by assuming agent knowledge of an identity that relates profits, revenues, and costs rather than by eliciting or assuming agent beliefs about structural parameters that are estimated by researchers. With this assumption, modest adjustments to standard binary choice estimators enable consistent estimation of perceived returns when using price instruments that are uncorrelated with unobserved determinants of agents' price misperceptions as well as other unobserved determinants of their perceived returns. I demonstrate the method, and the importance of using price variation that is known to agents, in a series of data simulations.


Clint Harris ∗ †
January 27, 2021

JEL Codes: C31, D84, D61
Keywords: Biased Beliefs, Returns to Investments, Revealed Preference, Subsidies, Taxes

∗ I thank Mary Kate Batistich, Trevor Gallen, Kendall Kennedy, Soojin Kim, Dan Millimet, Kevin Mumford, Victoria Prowse, and Miguel Sarzosa as well as seminar participants at Case Western Reserve University, The European Association of Labor Economists Meeting, Kansas State University, The Midwest Economics Association Meeting, The National Tax Association Meeting, Purdue University, The Southern Economic Association Meeting, and The US Census Bureau for helpful comments.
† Wisconsin Institute for Discovery, University of Wisconsin-Madison, 330 N Orchard Street, Madison, WI 53715 USA; email: [email protected]

arXiv [econ.EM]

1. Introduction

In this paper I describe a method for estimating distributions of perceived private returns to binary investments. These structural estimates describe the distribution of agents' compensating variation associated with a binary choice, conditional on observables.
This method complements program evaluation methods that estimate effects of specific policy shocks on binary choices by allowing for predictions of counterfactual policies that differ from past policies in magnitude or targeted population. For instance, Harris (2020) applies this method to estimate perceived returns to college, allowing for counterfactual predictions of targeted college attendance subsidies (and taxes) for diverse groups of individuals. Identification is achieved by assuming common agent knowledge of an identity that relates prices to returns, while also using instruments that are de facto known to agents, in the sense that they shift perceived prices the same amount that they shift actual prices, in addition to satisfying the traditional exclusion restriction.

This paper presents a special case of a general method for identifying the scale of binary choice models by assuming agent beliefs about a variable observed by the researcher and agent beliefs about the mapping between that variable and the perceived return latent variable. Existing work that makes such assumptions includes Cunha, Heckman, and Navarro (2005), who assume agent knowledge of their lifetime pecuniary return to college insofar as it is attributable to explanatory variables observed by the researcher, and Dickstein and Morales (2018), who assume partial agent knowledge of trade revenues and agent knowledge of an estimated demand elasticity parameter. The present paper assumes partial agent knowledge of prices in the sense of Dickstein and Morales (2018) while assuming agent knowledge that prices causally decrease returns dollar for dollar in accordance with an identity that relates profits, revenues, and costs. The use of this identity imposes a theoretical restriction on a structural parameter (the coefficient on price in the binary choice latent variable equation) without requiring its estimation by researchers or agents.
Avoiding the assumption that agents obtain the same estimate of a parameter as researchers improves robustness to the concerns articulated by Manski (1993, 2004) about the pitfalls of making incorrect assumptions on agents' knowledge of structural models.

The method in the present paper avoids assuming rational expectations on any model objects, instead assuming that the variation in prices associated with chosen instruments is known to agents regardless of whether agents are correct about prices on average. This makes it particularly attractive in applications where rational expectations assumptions in general are suspect, but the researcher can credibly argue that a particular price shock is nonetheless known to agents. Considering the example of college attendance, it is possible that exogenous policy shocks may shift prices more than they shift perceived prices, as with Pell grants (Hansen, 1983; Kane, 1995), they may shift perceived prices more than they shift prices, as with the Michigan HAIL policy (Dynarski, Libassi, Michelmore, and Owen, 2018), or they may shift prices and perceived prices the same amount, as with the Social Security Student Benefit termination (Dynarski, 2003). Of these sources of variation, only the last would be appropriate for estimating the model presented in this paper. In addition to college attendance, attractive targets for this method include healthcare, home purchases, R&D, and export decisions due to the substantial information frictions on prices in these settings.

In addition to considerations regarding the relative credibility of different assumptions on agent beliefs, applications also differ in data availability. The method described in this paper relies on cross-sectional data that contain binary choices on investments and prices associated with those investments.
Methods that rely on rational expectations on ex post returns to investments require longitudinal data (without requiring data on prices), as in Cunha, Heckman, and Navarro (2005) and related research surveyed by Cunha and Heckman (2007). Meanwhile, inferring beliefs by eliciting them directly from agents requires surveys that contain this information, as in Jensen (2010), Wiswall and Zafar (2015), and Bleemer and Zafar (2018). The method described in this paper is thus useful in settings where there is no clear winner in terms of assumption validity, but where longitudinal data and data on agent perceptions are unavailable.

I describe how to estimate perceived returns when prices are known to agents and exogenous, and how to overcome violations of these conditions using instrumental variables. I compare performance of these methods with valid and invalid instruments across data generating processes that differ in the assumptions on agent knowledge of prices. In the most realistic settings, methods that make no use of instruments, or which use instruments that are correlated with agent misperceptions, perform poorly compared to those that use instruments that are de facto known to agents.

The plan of the rest of this paper is as follows. Section 2 introduces the empirical model. Section 3 describes the econometric strategy and the assumptions required for identification. Section 4 evaluates the robustness of various methods and instruments to various empirical challenges in a series of simulated data exercises. Section 5 concludes.

2. Model

I assume that agents choose whether to make an investment based on their beliefs about discounted net incomes and costs associated with choices, which I present as a two-sector generalized Roy (1951) model. Agents choose to select the investment, $S_i = 1$, or to not do so, $S_i = 0$, which is observed by the researcher. I define $\tilde{Y}_{1,i}$ as agent $i$'s perceived discounted present value of lifetime income associated with choosing the investment and $\tilde{Y}_{0,i}$ as their perceived discounted present value of lifetime income associated with not doing so. I further define $\tilde{C}_i$ as their perceived net present value cost of making the investment, which includes prices paid and nonpecuniary costs expressed in monetary values. Unlike common applications of the Roy model, none of $\tilde{Y}_{1,i}$, $\tilde{Y}_{0,i}$, and $\tilde{C}_i$ are observed by the researcher for any individual because they represent agent perceptions.

I express the perceived potential incomes and costs for individual $i$ with the following linear-in-parameters production functions,

$$
\begin{aligned}
\tilde{Y}_{1,i} &= X_i \beta_1 + \tilde{\epsilon}_{1,i} \\
\tilde{Y}_{0,i} &= X_i \beta_0 + \tilde{\epsilon}_{0,i} \\
\tilde{C}_i &= X_i \beta_C + \widetilde{Price}_i + \tilde{\epsilon}_{C,i}.
\end{aligned}
\tag{1}
$$

Here, $X_i$ are variables observed by the researcher that determine potential incomes and costs. The parameters $\{\beta\}$ capture the extent to which these variables drive beliefs about potential outcomes regardless of whether they are known to agents. $\widetilde{Price}_i$ is the agent's perceived price for the investment, which is known to agents but not to researchers. Importantly, it is assumed to only affect costs and has a coefficient that is normalized to unity. Finally, $\tilde{\epsilon}_{1,i}$, $\tilde{\epsilon}_{0,i}$, and $\tilde{\epsilon}_{C,i}$ represent idiosyncratic perceived returns to investment that are known to agents but not to the researcher.

I assume that agents maximize expected wealth independently of how they consume it, as in the case of perfect credit markets.
It follows that the perceived net return/profit, $\tilde{\pi}_i$, is sufficient to determine agents' decisions in accordance with the rule

$$
S_i =
\begin{cases}
1, & \text{if } \tilde{\pi}_i \geq 0, \\
0, & \text{otherwise.}
\end{cases}
\tag{2}
$$

I further assume that the definition of profit, $\pi_i \equiv Revenue_i - Cost_i$, is known to agents in the sense that it holds for their beliefs as well, such that

$$
\tilde{\pi}_i = \widetilde{Revenue}_i - \widetilde{Cost}_i = \tilde{Y}_{1,i} - (\tilde{Y}_{0,i} + \tilde{C}_i),
\tag{3}
$$

where $\widetilde{Revenue}_i$ denotes the agent's perceived income and $\widetilde{Cost}_i$ denotes the agent's perceived opportunity cost, which includes $\tilde{Y}_{0,i}$. It follows that the agent's decision rule can be expressed in terms of potential outcomes as

$$
S_i =
\begin{cases}
1, & \text{if } \tilde{Y}_{1,i} - \tilde{Y}_{0,i} - \tilde{C}_i \geq 0, \\
0, & \text{otherwise.}
\end{cases}
\tag{4}
$$

Defining the net marginal effects $\beta \equiv \beta_1 - \beta_0 - \beta_C$ and the net idiosyncratic component of perceived outcomes $\tilde{\epsilon}_i \equiv \tilde{\epsilon}_{1,i} - \tilde{\epsilon}_{0,i} - \tilde{\epsilon}_{C,i}$, we can combine (1) with (4) to write the perceived return latent variable as

$$
\tilde{\pi}_i = X_i \beta - \widetilde{Price}_i + \tilde{\epsilon}_i.
\tag{5}
$$

Importantly, the assumptions given result in the latent variable being linear in perceived prices, with a marginal effect ($-1$) that is known to both agents and the researcher. The expression of perceived returns as a latent variable in a binary choice problem with a single known marginal effect is the starting point of the estimation procedures described below.

(Footnote: I avoid denoting agents' beliefs with conditional expectations over realized values, as is common in the literature, to avoid the implication of rational expectations which follows from the law of iterated expectations.)
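As a small illustration of the latent variable in (5) and the decision rule in (2), the following Python sketch draws perceived returns and the implied binary choices. It is a sketch under assumptions that are not taken from the paper: $X_i$ is a constant with $\beta = 1$, perceived prices and $\tilde{\epsilon}_i$ are independent standard normal draws, and the function name `simulate_choices` is hypothetical.

```python
import random


def simulate_choices(n=10_000, beta=1.0, seed=0):
    """Draw perceived returns from (5) and apply the decision rule (2).

    Illustrative assumptions: X_i is a constant, and perceived prices and
    the net idiosyncratic component are independent standard normals.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        perceived_price = rng.gauss(0.0, 1.0)  # known to the agent only
        eps = rng.gauss(0.0, 1.0)              # net idiosyncratic component
        # Equation (5): the price coefficient is fixed at -1 by the identity.
        perceived_return = beta - perceived_price + eps
        # Equation (2): invest if and only if the perceived return is >= 0.
        invest = 1 if perceived_return >= 0 else 0
        rows.append((perceived_price, perceived_return, invest))
    return rows


rows = simulate_choices()
share = sum(invest for _, _, invest in rows) / len(rows)
print(f"share choosing the investment: {share:.3f}")
```

The researcher would observe only the last element of each tuple (together with realized prices), which is why the scale of the latent variable has to come from the known coefficient on price.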

3. Empirical Strategy

It follows from the model that latent perceived returns are identified by $\beta$, $\widetilde{Price}_i$, and $\tilde{\epsilon}_i$, given the observed $X_i$. The lack of observation of $\tilde{\epsilon}_i$ is a common problem that will be addressed with commonly used binary choice estimation techniques. In this section I will describe adjustments to these estimators that leverage the assumptions described above to permit identification of $\beta$ and the scale of the distribution of $\tilde{\epsilon}_i$ in the context of the researcher's failure to observe agents' perceived prices. To preface, these adjustments address challenges that arise due to perceived costs having a causal effect on perceived returns in the identity given in (3).

The econometric methods described below establish conditions under which the assumed coefficient on perceived prices from (5) exactly determines the marginal effect of realized prices on perceived returns in a binary choice model. Omitted variable bias and measurement error in prices as measures of perceived prices threaten the validity of this assumption. It follows that methods which address omitted variable bias and measurement error will validate the assumption on the marginal effect of realized prices on perceived returns. To clarify, consider the following expression for agents' beliefs about prices,

$$
\widetilde{Price}_i = Price_i + X_i \alpha + \nu_i,
\tag{6}
$$

where the realized price, $Price_i$, is observed by the researcher, $\alpha$ gives the effect of explanatory variables on price misperceptions, and $\nu_i$ is the idiosyncratic component of agent $i$'s misperception of prices. Here, realized prices are assumed to increase agents' beliefs about prices at a known marginal rate of unity insofar as they are known to agents.

(Footnote: The researcher constraining the price coefficient to the value used by agents is key to identification, not the researcher or agents being correct about its value.)

This expression allows us to present an empirically tractable version of perceived returns,

$$
\begin{aligned}
\tilde{\pi}_i &= X_i \beta - \widetilde{Price}_i + \tilde{\epsilon}_i \\
&= X_i \beta - Price_i - X_i \alpha - \nu_i + \tilde{\epsilon}_i,
\end{aligned}
\tag{7}
$$

by substituting in prices observed by the researcher for agents' unobserved perceived prices and defining $\theta = \beta - \alpha$. This representation presents the unexplained price misperception as an omitted variable, which will produce problems if $Price_i$ is correlated with $\nu_i$. Natural examples of problematic correlations between prices and price misperceptions include agents systematically over-reacting or under-reacting to price predictors that are unobserved by the researcher. The extreme case of under-reaction is that in which an unobserved predictor of realized price variation is ignored by or unknown to agents altogether, which amounts to classical measurement error in realized prices as measures of perceived prices.

(Footnote: The distinction between the extent to which each control contributes to misperceptions in prices, $\alpha$, and to other components of perceived returns, $\beta$, is presented to emphasize that the methods in this paper are robust to systematic bias in perceptions associated with explanatory variables, even though they are not separately identified.)

In what follows, I first consider a benchmark case in which unobserved components of price misperceptions are mean independent of realized prices and prices are uncorrelated with unobserved determinants of perceived returns. Though agents may be mistaken about prices, actual prices can stand in for perceived prices because any systematic price …

Here, I describe a benchmark procedure for estimating perceived returns with a simple adjustment to a common binary choice method. This procedure will provide consistent estimates of the perceived returns distribution under two assumptions that are likely to be violated in applications. First, this method assumes that prices and the unobserved component of perceived returns are uncorrelated. Second, it assumes that unobserved components of price misperceptions are mean independent of prices conditional on $X_i$, the simplest case of which is agents having perfect information on prices.

With the decision rule in (4) and the expression of perceived returns in (7), an assumption on the distribution of $-\nu_i + \tilde{\epsilon}_i$ is sufficient to consistently estimate perceived returns by maximum likelihood.
I assume the composite unobserved component of perceived returns in (7) is normally distributed as

$$
-\nu_i + \tilde{\epsilon}_i \mid X_i, Price_i \sim N(0, \sigma^2).
\tag{8}
$$

The assumption of normality is chosen for convenience, and is not necessary for the estimation procedures in this paper. Defining $(\beta^*, \theta^*, \gamma^*) = (\beta/\sigma, \theta/\sigma, 1/\sigma)$ for notational convenience, the probability of selection is given by

$$
Pr(S_i = 1 \mid X_i, Price_i) = \Phi(X_i \theta^* - Price_i \gamma^*),
\tag{9}
$$

where $\Phi(\cdot)$ denotes the standard normal CDF.

The parameters $(\theta^*, \gamma^*)$ are the values that maximize the log-likelihood

$$
L(\theta^*, \gamma^* \mid X_i, Price_i) = \sum_i S_i \log\left[\Phi\left(X_i \theta^* - Price_i \gamma^*\right)\right] + (1 - S_i) \log\left[1 - \Phi\left(X_i \theta^* - Price_i \gamma^*\right)\right].
\tag{10}
$$

The estimates of perceived returns are then given by

$$
\hat{\tilde{\pi}}_i \mid X_i, Price_i \sim N(X_i \hat{\theta} - Price_i, \hat{\sigma}^2),
\tag{11}
$$

where imposing the constraint $\gamma^* = 1/\sigma$ (rather than the standard constraint $\sigma = 1$) is the only difference from a standard probit, so that $\hat{\sigma} = 1/\hat{\gamma}^*$ and $\hat{\theta} = \hat{\theta}^*/\hat{\gamma}^*$. Importantly, the assumption that $\gamma^* = 1/\sigma$ is only valid under the assumptions described in Section 2 when realized prices are uncorrelated with unobserved components of price misperceptions and perceived returns conditional on $X_i$. As this generally will not be the case, this assumption is not an innocuous normalization.

Here, I describe a control function approach that addresses correlation between prices and unobserved components of perceived returns as well as arbitrary correlation between prices and misperceptions on prices. In Appendix A, I discuss a method developed by Dickstein and Morales (2018) that performs well in this model when agents under-react to price variation, such as when they form rational expectations on prices based on known price predictors and only a subset of price predictors are known to them.
The method in this section uses an established estimator, but adds the assumption that instruments are uncorrelated with unobserved components of price misperceptions in addition to the more commonly invoked assumption that instruments are uncorrelated with other unobserved idiosyncratic components of perceived returns. This additional assumption contributes to credibility for predictions of responses to counterfactual price changes that are known to agents, without changing the asymptotic or finite sample properties of the estimator.

The control function approach uses the following system of equations, with reference to the expression of perceived returns in (7),

$$
\begin{aligned}
\tilde{\pi}_i &= X_i \theta - Price_i - \nu_i + \tilde{\epsilon}_i \\
Price_i &= Z_i \delta + u_i,
\end{aligned}
\tag{12}
$$

where I have left unobserved price misperceptions and other unobserved components of perceived returns separate for clarity. Here, I introduce the instruments, $Z_i$, where $X_i \subset Z_i$, that are assumed to be conditionally uncorrelated with $-\nu_i + \tilde{\epsilon}_i$ and strongly correlated with observed prices. With some loss of generality, I will refer to instruments that satisfy this condition as "known and exogenous" for brevity. With valid instruments, the price residual $u_i$ contains all components of prices that are correlated with idiosyncratic components of price misperceptions or other unobserved components of perceived returns.

Given the above, I estimate the following equation,

$$
\begin{aligned}
\tilde{\pi}_i &= X_i \theta - Price_i - \nu_i + \tilde{\epsilon}_i \\
&= X_i \theta - Price_i + u_i \rho + \xi_i \\
&= X_i \theta - Price_i + \hat{u}_i \rho + \zeta_i.
\end{aligned}
\tag{13}
$$

The first line follows directly from the representation of perceived returns in (7). The second line substitutes in the linear projection of the composite error $-\nu_i + \tilde{\epsilon}_i$ on the first stage error $u_i$, wherein $\rho = E[u_i(-\nu_i + \tilde{\epsilon}_i)] / E[u_i^2]$ and $\xi_i$ is the residual when controlling for $u_i$.
The third line substitutes the estimated residuals from the first stage regression of $Price_i$ on $Z_i$ in for their unobserved true values, generating a new error, $\zeta_i = \xi_i + (u_i - \hat{u}_i)\rho$. This new error will converge asymptotically to $\xi_i$, but will differ in small samples due to sampling error in the estimation of the residual from the first stage, $\hat{u}_i$.

(Footnote: It is not necessary that agents know the instruments in $Z_i$, but only that they know the variation in prices that is attributable to $Z_i$. For example, agents need not know about a tax or subsidy shock to the price of investment, so long as they are aware of the change in price that arises from the policy shock. Furthermore, the language that instruments are known and exogenous suggests that $Cov(Z_i, \nu_i) = Cov(Z_i, \tilde{\epsilon}_i) = 0$, while these are sufficient but not necessary for the less intuitive condition $Cov(Z_i, \tilde{\epsilon}_i - \nu_i) = 0$, which accommodates the knife-edge case of the two sources of bias cancelling out.)
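The linear projection in the second line of (13) can be illustrated numerically. The sketch below is only an illustration under assumed values that do not come from the paper: the loadings of $\nu_i$ and $\tilde{\epsilon}_i$ on $u_i$ ($-0.5$ and $0.25$), the unit variances, and the function name `projection_demo` are all hypothetical. It computes the sample analogue of $\rho = E[u_i(-\nu_i + \tilde{\epsilon}_i)] / E[u_i^2]$ and confirms that the projection residual $\xi_i$ is orthogonal to $u_i$ by construction.

```python
import random


def projection_demo(n=20_000, seed=1):
    """Sample analogue of rho in (13) and a check that xi is orthogonal to u."""
    rng = random.Random(seed)
    u, composite = [], []
    for _ in range(n):
        u_i = rng.gauss(0.0, 1.0)                  # first-stage price residual
        nu_i = -0.5 * u_i + rng.gauss(0.0, 1.0)    # assumed: agents under-react to u_i
        eps_i = 0.25 * u_i + rng.gauss(0.0, 1.0)   # assumed: price endogeneity
        u.append(u_i)
        composite.append(-nu_i + eps_i)            # composite error in (12)
    # rho = E[u * (-nu + eps)] / E[u^2], using uncentered sample moments.
    rho = sum(a * c for a, c in zip(u, composite)) / sum(a * a for a in u)
    xi = [c - rho * a for a, c in zip(u, composite)]
    cov_u_xi = sum(a * x for a, x in zip(u, xi)) / n
    return rho, cov_u_xi


rho, cov_u_xi = projection_demo()
print(f"rho = {rho:.3f}, sample E[u * xi] = {cov_u_xi:.2e}")
```

Under these assumed loadings the population value of $\rho$ is $0.5 + 0.25 = 0.75$, so both under-reaction to prices and price endogeneity load onto the single control function coefficient.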

To estimate perceived returns, I assume that the new error in the perceived returns control function expression is normally distributed,

$$
\zeta_i \mid X_i, Price_i, \hat{u}_i \sim N(0, \sigma_\zeta^2),
\tag{14}
$$

noting that the variance of $\zeta_i$ will differ from that of $\tilde{\epsilon}_i$ if $\rho \neq 0$. I estimate perceived returns using two-stage conditional maximum likelihood, following Rivers and Vuong (1988), while correcting for the inclusion of estimated regressors, following Murphy and Topel (1985), though other estimators will also provide consistent estimates. Defining $(\theta^*_\zeta, \gamma^*_\zeta, \rho^*_\zeta) = (\theta/\sigma_\zeta, 1/\sigma_\zeta, \rho/\sigma_\zeta)$, the log-likelihood for the second stage of the control function approach is given by

$$
L\left(\theta^*_\zeta, \gamma^*_\zeta, \rho^*_\zeta \mid X_i, \hat{u}_i\right) = \sum_i S_i \log\left[\Phi\left(X_i \theta^*_\zeta - Price_i \gamma^*_\zeta + \hat{u}_i \rho^*_\zeta\right)\right] + (1 - S_i) \log\left[1 - \Phi\left(X_i \theta^*_\zeta - Price_i \gamma^*_\zeta + \hat{u}_i \rho^*_\zeta\right)\right].
\tag{15}
$$

Estimates of perceived returns are obtained by plugging the estimated parameters and the assumed coefficient on perceived prices into the latent variable equation,

$$
\tilde{\pi}_i \mid X_i, \hat{u}_i \sim N\left(X_i \hat{\theta} - Price_i + \hat{u}_i \hat{\rho}, \hat{\sigma}_\zeta^2\right).
\tag{16}
$$
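The two-step procedure in (12) through (16) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the implementation behind the paper's results: the data generating process (a constant $X_i$, one instrument, $\beta = 1$, $\rho = 0.5$, unit variances), the Fisher scoring routine, and the names `solve`, `probit`, `simulate`, and `control_function` are all hypothetical. The Murphy and Topel (1985) standard error correction is omitted.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def probit(rows, choices, iters=15):
    """Probit maximum likelihood by Fisher scoring; returns index coefficients."""
    k = len(rows[0])
    b = [0.0] * k
    for _ in range(iters):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, s in zip(rows, choices):
            index = sum(xj * bj for xj, bj in zip(x, b))
            p = min(max(STD_NORMAL.cdf(index), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(index)
            w = d / (p * (1 - p))
            for j in range(k):
                score[j] += x[j] * (s - p) * w
                for l in range(k):
                    info[j][l] += x[j] * x[l] * d * w
        b = [bj + sj for bj, sj in zip(b, solve(info, score))]
    return b


def simulate(n=5_000, beta=1.0, rho=0.5, seed=2):
    """Assumed DGP: Price = z + u and composite error -nu + eps = rho*u + xi."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z, u = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        price = z + u
        composite = rho * u + rng.gauss(0.0, 1.0)
        data.append((z, price, 1 if beta - price + composite >= 0 else 0))
    return data


def control_function(data):
    """First-stage OLS of price on (1, z), then the probit in (15), rescaled."""
    n = len(data)
    z_bar = sum(z for z, _, _ in data) / n
    p_bar = sum(p for _, p, _ in data) / n
    slope = (sum((z - z_bar) * (p - p_bar) for z, p, _ in data)
             / sum((z - z_bar) ** 2 for z, _, _ in data))
    u_hat = [p - p_bar - slope * (z - z_bar) for z, p, _ in data]
    rows = [(1.0, -p, uh) for (_, p, _), uh in zip(data, u_hat)]
    theta_s, gamma_s, rho_s = probit(rows, [s for _, _, s in data])
    # The price coefficient is known to be -1, so 1/gamma* recovers the scale.
    return theta_s / gamma_s, 1.0 / gamma_s, rho_s / gamma_s


theta_hat, sigma_hat, rho_hat = control_function(simulate())
print(f"theta = {theta_hat:.3f}, sigma_zeta = {sigma_hat:.3f}, rho = {rho_hat:.3f}")
```

Dropping the `u_hat` column from `rows` gives the benchmark estimator of (9) through (11), since the only departure from a standard probit in either case is dividing every coefficient by the estimated coefficient on $-Price_i$ rather than fixing the error variance at one.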

4. Simulations

In this section I apply the methods described above to simulated datasets to compare their performance. The important considerations involve agent beliefs about prices, price endogeneity, and instruments being known and/or exogenous to agents. Because the estimators used are standard, I stop short of performing full Monte Carlo simulations, …

(Footnote: As a closely related alternative, we could perform an instrumental variables probit to obtain identical estimates of $\theta$. The control function method has the advantage of conditioning on the variation in prices that isn't used in identifying the effect on perceived returns, which permits more precise counterfactual predictions for policies that are targeted on observables.)

$$
\begin{aligned}
\tilde{\pi}_i &= X_i \beta - \widetilde{Price}_i + \tilde{\epsilon}_i \\
\widetilde{Price}_i &= Price_i + \nu_i = Z_i \delta + u_i + \nu_i,
\end{aligned}
\tag{17}
$$

where the nature of the covariance of $(Z_i, u_i, \nu_i, \tilde{\epsilon}_i)$ will determine the performance of various estimation approaches. Both the probit and the control function method will obtain estimates of $\beta$, while the probit will estimate

$$
\sigma^2 = Var(-\nu_i + \tilde{\epsilon}_i)
\tag{18}
$$

and the control function method will estimate

$$
\rho = E[u_i(-\nu_i + \tilde{\epsilon}_i)] / E[u_i^2], \qquad \sigma_\zeta = \sqrt{Var(\zeta_i)} = \sqrt{Var(-\nu_i + \tilde{\epsilon}_i - \hat{u}_i \rho)}.
\tag{19}
$$

Each DGP comprises $N = 10{,}000$ observations of agents whose decisions are governed by their perceived returns to investment.

4.1. Simulation with Known, Exogenous Prices

I begin with a well-behaved benchmark DGP that corresponds to the setting described in Section 3.1. I generate data according to

$$
(z_i, u_i, \nu_i, \tilde{\epsilon}_i)' \sim N(0, \Sigma); \qquad \Sigma = [\,\ldots\,].
\tag{20}
$$

I construct the instrument vector as $Z_i = [X_i \;\; z_i]$ where $X_i$ includes only a constant, and $\alpha = 0$ such that $\theta = \beta$. Finally, I set $\beta = 1$ and $\delta = [0 \;\; 1]'$. Although I set $Var(\nu_i) = 2$, I describe prices as known in this setting because the price misperception is uncorrelated with prices.

Table 1 shows perceived returns estimates for one simulation of this DGP using the methods from Section 3.1 and Section 3.2. Figure 1 shows the distributions implied by the estimates for each method. In this case, the lack of correlation between prices and unobserved components of perceived returns, including price misperceptions, means that both methods will provide consistent estimates of perceived returns.

(Footnote: This setting is one in which agents are wrong about prices in ways that are unrelated to price determinants. This sort of price misperception is plausible in cases where prices change frequently according to a distribution that is de facto known to agents, such as frequently repeated investments.)

Table 1 (rows: $\sigma$, $\sigma_\zeta$, $\rho$; values missing)

Figure 1: Simulation 1, Implied Perceived Returns Distributions

Notes: Estimated densities of perceived returns given by the probit method using expression (11), and the control function method using expression (16).

In this simulation, I consider a DGP that corresponds to the setting described in Section 3.2 in which agents systematically misperceive prices in ways that are not accounted for by observables, and prices are correlated with unobserved components of perceived returns. I also compare the performance of an instrument that is exogenous but unknown to one …

$$
(z_{1,i}, z_{2,i}, u_i, \nu_i, \tilde{\epsilon}_i)' \sim N(0, \Sigma); \qquad \Sigma = [\,\ldots\,].
\tag{21}
$$

I construct the instrument vector as $Z_i = [X_i \;\; z_{1,i} \;\; z_{2,i}]$ where $X_i$ includes only a constant, and $\alpha = 0$ such that $\theta = \beta$. Finally, I set $\beta = 1$ and $\delta = [0 \;\; 1 \;\; 1]'$.

In this case, there is positive correlation between $u_i$ and $\tilde{\epsilon}_i$ such that individuals who face idiosyncratically high prices also have high perceived returns, as may occur with price discrimination. Additionally, there is negative correlation between $u_i$ and $\nu_i$ such that individuals systematically underestimate the extent to which their price deviates from the average, as may occur if agents form rational expectations on prices conditional on an incomplete set of price determinants. Finally, this DGP includes two potential instruments: $z_{1,i}$, which is exogenous but not fully known to agents, as in the case of a poorly publicized policy shock, and $z_{2,i}$, which is both exogenous and known to agents. Because $z_{1,i}$ is correlated with $\nu_i$, it is not a valid instrument for the purposes of this paper.

For the control function estimates of $\rho$ and $\sigma_\zeta$, I use $u_{2,i}$ in place of $u_i$, where $u_{2,i} = z_{1,i}\delta_1 + u_i$. In applications with many valid instruments, including different combinations of instruments will result in different estimates $\hat{u}_i$, $\hat{\rho}$ and $\hat{\sigma}_\zeta$, while nonetheless all returning consistent estimates of perceived returns. For comparisons between instruments, the complete distribution of perceived returns (succinctly described by the figures) and the estimated coefficients on $X_i$ will be correct for all valid instruments.

Table 2 shows the estimates for one simulation of this DGP using both methods, and also using each instrumental variable individually. Figure 2 shows the distributions implied by the estimates for each method. Because $z_{1,i}$ is correlated with misperceptions, it is not a valid instrument, and results in an estimated perceived returns distribution that is no better than that obtained when using no instruments.

Table 2: Simulation 2, Perceived Returns Estimates

                           (1)        (2)                       (3)
                Target     Probit     Control Function z_1      Control Function z_2
Constant        1          1.528      1.659                     0.863
                           (0.102)    (0.127)                   (0.086)
σ               …
σ_ζ             …
ρ               .5         .          -0.097                    0.507
                                      (0.046)                   (0.018)
Observations               10000      10000                     10000

Figure 2: Simulation 2, Implied Perceived Returns Distributions

Notes: Estimated densities of perceived returns given by the probit method using expression (11), and the control function method using expression (16). The unknown IV is $z_{1,i}$ and the valid IV is $z_{2,i}$, where each IV is excluded from the estimation model when the other is used.

(Footnote: For estimating instrument-specific intent to treat effects of prices on investment, which would be sufficient for determining the performance of a particular policy in the context of its actual implementation, instruments such as $z_{1,i}$ are valid. They nonetheless fail to provide credible insight into counterfactual policy changes that are well-publicized.)

5. Conclusions

In this paper I describe how to estimate perceived returns to investments by assuming agent knowledge of an intuitive identity and modestly altering common estimation techniques. The assumption on agent knowledge may be preferable to rational expectations or related assumptions in applications. I further describe the econometric challenges that arise from the assumption and how to overcome them with careful choice of instruments that are not only exogenous to agents, but are also de facto known to them.

This method is relevant in many empirical questions, especially those subject to substantial information frictions on prices such as college attendance, firm R&D, automobile purchases, home purchases, and healthcare. While the estimation techniques used in this paper are restricted to a probit and a control function probit, the general insights are relevant to more sophisticated models that involve responses to prices.
Implementation of the identity relating perceived returns and prices used in this paper in the context of more sophisticated models, such as Berry, Levinsohn, and Pakes (1995) and its extensions, is left to future work.

In terms of policy implications, the methods described in this paper are relevant for constructing credible counterfactuals for well-publicized price changes, which are relevant for taxes and subsidies on investments including those associated with education and healthcare. The general insight is to avoid being too quick to assume that agents have rational expectations on model objects when alternative assumptions may be more defensible. Relatedly, the insights here also caution against extrapolating effects of counterfactual policies when the policy effects are estimated using a source of variation in prices that may not be known to agents. In practice, applied researchers should justify that sources of variation used for estimating treatment effects are known to agents just as they justify that they are exogenous to agents when making counterfactual predictions.

References

Andrews, D. W. K., and G. Soares (2010): "Inference for Parameters Defined by Moment Inequalities Using Generalized Moment Selection," Econometrica, 78(1), 119–157.

Berry, S., J. Levinsohn, and A. Pakes (1995): "Automobile Prices in Market Equilibrium," Econometrica: Journal of the Econometric Society, pp. 841–890.

Bleemer, Z., and B. Zafar (2018): "Intended College Attendance: Evidence from an Experiment on College Returns and Costs," Journal of Public Economics, 157, 184–211.

Cunha, F., J. Heckman, and S. Navarro (2005): "Separating Uncertainty from Heterogeneity in Life Cycle Earnings," Oxford Economic Papers, 57(2), 191–261.

Cunha, F., and J. J. Heckman (2007): "Identifying and estimating the distributions of ex post and ex ante returns to schooling," Labour Economics, 14(6), 870–893.

Dickstein, M. J., and E. Morales (2018): "What Do Exporters Know?," The Quarterly Journal of Economics, 133(4), 1753–1801.

Dynarski, S., C. Libassi, K. Michelmore, and S. Owen (2018): "Closing the gap: The effect of a targeted, tuition-free promise on college choices of high-achieving, low-income students," Discussion paper, National Bureau of Economic Research.

Dynarski, S. M. (2003): "Does Aid Matter? Measuring the Effect of Student Aid on College Attendance and Completion," American Economic Review, 93(1), 279–288.

Hansen, W. L. (1983): "Impact of student financial aid on access," Proceedings of the Academy of Political Science, 35(2), 84–96.

Harris, C. M. (2020): "Estimating the perceived returns to college," Available at SSRN 3577816.

Jensen, R. (2010): "The (Perceived) Returns to Education and the Demand for Schooling," The Quarterly Journal of Economics, 125(2), 515–548.

Kane, T. J. (1995): "Rising public college tuition and college entry: How well do public subsidies promote access to college?," Discussion paper, National Bureau of Economic Research.

Manski, C. F. (1993): "Adolescent econometricians: How do youth infer the returns to schooling?," in Studies of supply and demand in higher education, pp. 43–60. University of Chicago Press.

Manski, C. F. (2004): "Measuring expectations," Econometrica, 72(5), 1329–1376.

Murphy, K. M., and R. H. Topel (1985): "Least Squares with Estimated Regressors," Journal of Business and Economic Statistics.

Rivers, D., and Q. H. Vuong (1988): "Limited Information Estimators and Exogeneity Tests for Simultaneous Probit Models," Journal of Econometrics, 39(3), 347–366.

Roy, A. D. (1951): "Some Thoughts on the Distribution of Earnings," Oxford Economic Papers, 3(2), 135–146.

Wiswall, M., and B. Zafar (2015): "How Do College Students Respond to Public Information about Earnings?," Journal of Human Capital, 9(2), 117–169.

Yatchew, A., and Z. Griliches (1985): "Specification error in probit models," The Review of Economics and Statistics, pp. 134–139.

Appendix A: Moment Inequalities

This section describes how to adapt the moment inequality method developed by Dickstein and Morales (2018) (DM) to the setting described in this paper. The setting of DM involves trade revenues that are partially observed by firms, which have a structural relationship with profits that is assumed to be known to agents. The type of information frictions described in DM are a special case of those described in the present paper, in which some sources of variation in the treatment variable are unknown to agents. In the context of the model presented in equation (6), this involves negative correlation between $\nu_i$ and $\mathit{Price}_i$ such that prices are a mean-preserving spread of perceived prices. Furthermore, the DM method assumes that $\tilde{\epsilon}_i$ is independent of other determinants of perceived returns.

This method makes use of instruments $Z_i$ that are independent of $(\tilde{\epsilon}_i, \nu_i)$. For $X_i \subset Z_i$, this implies that the expectation of (6) conditional on $Z_i$ gives

$$E[\widetilde{\mathit{Price}}_i \mid Z_i] = E[\mathit{Price}_i \mid Z_i] + X_i\alpha. \tag{A.1}$$

Additionally, it makes a distributional assumption on unobserved perceived returns such as

$$\tilde{\epsilon}_i \mid X_i, \widetilde{\mathit{Price}}_i \sim N(0, \sigma_{\tilde{\epsilon}}^2), \tag{A.2}$$

where the assumption of normality is unnecessary, but there are some restrictions on the assumed distribution which I discuss below. The method uses two types of moment inequalities to obtain bounds on the parameters of perceived returns, $(\theta, \sigma_{\tilde{\epsilon}})$. I will present the inequalities and provide a brief discussion here. For the derivation and further discussion of the moment inequalities, see DM.

A.1. Revealed Preference Moment Inequalities

Defining $(\beta^*_{\tilde{\epsilon}}, \theta^*_{\tilde{\epsilon}}, \gamma^*_{\tilde{\epsilon}}) = \left(\frac{\beta}{\sigma_{\tilde{\epsilon}}}, \frac{\theta}{\sigma_{\tilde{\epsilon}}}, \frac{1}{\sigma_{\tilde{\epsilon}}}\right)$ for notational convenience, the conditional revealed preference moment inequalities are

$$\begin{aligned}
E\left[ S_i (X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}}) + (1 - S_i) \frac{\phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})}{1 - \Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})} \,\middle|\, Z_i \right] &\geq 0, \\
E\left[ -(1 - S_i)(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}}) + S_i \frac{\phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})}{\Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})} \,\middle|\, Z_i \right] &\geq 0.
\end{aligned} \tag{A.3}$$

These inequalities are consistent with the revealed preference argument that perceived returns are positive for those who select the investment and negative for those who do not. Here, I provide an overview of the intuition.

Regarding the first inequality, consider an agent that selects the investment such that $S_i = 1$. Following the revealed preference argument articulated in (4) and the representation of perceived returns in (7), it follows that this individual's perceived return is positive, such that

$$S_i (X_i\theta - \mathit{Price}_i - \nu_i + \tilde{\epsilon}_i) \geq 0. \tag{A.4}$$

This expression cannot be computed directly because researchers do not observe $\nu_i$ or $\tilde{\epsilon}_i$. However, as the inequality holds for all $i$, it follows that it holds in expectation conditional on $Z_i$,

$$E[S_i (X_i\theta - \mathit{Price}_i - \nu_i + \tilde{\epsilon}_i) \mid Z_i] \geq 0. \tag{A.5}$$

Finally, the Law of Iterated Expectations, the assumption that $\nu_i$ is unknown to agents and therefore not acted upon, and the assumption that $Z_i$ is uncorrelated with $\nu_i$ together imply $E[S_i\nu_i \mid Z_i] = E[S_i E[\nu_i \mid S_i, Z_i] \mid Z_i] = 0$, yielding

$$E[S_i (X_i\theta - \mathit{Price}_i + \tilde{\epsilon}_i) \mid Z_i] \geq 0. \tag{A.6}$$

The first inequality in (A.3) is derived from this inequality, where its second term is a positively biased approximation of $E[S_i\tilde{\epsilon}_i \mid Z_i]$ that exploits the closed form for $E[S_i\tilde{\epsilon}_i \mid X_i, \widetilde{\mathit{Price}}_i]$ under the normality assumption on $\tilde{\epsilon}_i$. Heuristically, if observed prices are a mean-preserving spread of perceived prices, substituting them in place of perceived prices will mistakenly increase expected perceived returns unconditional on selection for some agents and decrease them for others. For the agents for whom this expectation increases, the expectation of the error conditional on selection approaches zero. For those for whom it decreases, the expectation of the error conditional on selection approaches positive infinity. In many cases, this second effect will dominate the overall expectation of the error conditional on selection. The second inequality follows from similar intuition applied to individuals who do not select the investment.
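As a rough illustration of how the sample analogs of the terms in (A.3) behave, the following Python sketch evaluates them on simulated data. The data-generating process, the function names, and the parameter values ($\theta = 0.5$, $\sigma_{\tilde{\epsilon}} = 1$, a constant as the only covariate) are assumptions made here for illustration and are not the simulations reported in this paper. Agents know a perceived price, while the observed price adds noise that they do not know, so observed prices are a mean-preserving spread of perceived prices.

```python
import math
import random


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal CDF; erfc keeps precision in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def revealed_preference_terms(s, index):
    """Per-observation terms inside the two expectations in (A.3).

    s     : 1 if the agent selects the investment, 0 otherwise
    index : X_i theta* - Price_i gamma* (parameters already divided by sigma)
    """
    upper_tail = norm_cdf(-index)  # equals 1 - Phi(index) by symmetry
    m1 = s * index + (1 - s) * norm_pdf(index) / upper_tail
    m2 = -(1 - s) * index + s * norm_pdf(index) / norm_cdf(index)
    return m1, m2


def simulate(n, theta=0.5, seed=0):
    """Hypothetical DGP: agents act on the perceived price only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        perceived = rng.gauss(0.0, 1.0)            # known to the agent
        price = perceived + rng.gauss(0.0, 0.5)    # observed price adds unknown noise
        eps = rng.gauss(0.0, 1.0)                  # sigma = 1, so theta* = theta, gamma* = 1
        s = 1 if theta - perceived + eps >= 0 else 0
        data.append((s, price))
    return data


def sample_moments(data, theta_star, gamma_star=1.0):
    """Unconditional sample means of the two revealed preference terms."""
    terms = [revealed_preference_terms(s, theta_star - gamma_star * p)
             for s, p in data]
    n = len(terms)
    return sum(t[0] for t in terms) / n, sum(t[1] for t in terms) / n
```

Calling `sample_moments(simulate(20000), 0.5)` evaluates both means at the parameter value used to generate the data, where both should be nonnegative; a value far from it, such as `theta_star = -2.0`, should push the first mean below zero, which is the sense in which the inequalities bound the parameters.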

A.1.1 Odds-Based Moment Inequalities

The conditional odds-based moment inequalities are

$$\begin{aligned}
E\left[ S_i \frac{1 - \Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})}{\Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})} - (1 - S_i) \,\middle|\, Z_i \right] &\geq 0, \\
E\left[ (1 - S_i) \frac{\Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})}{1 - \Phi(X_i\theta^*_{\tilde{\epsilon}} - \mathit{Price}_i\gamma^*_{\tilde{\epsilon}})} - S_i \,\middle|\, Z_i \right] &\geq 0.
\end{aligned} \tag{A.7}$$

They are derived from the unobservable conditional score equation,

$$E\left[ S_i \frac{\phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})}{\Phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})} - (1 - S_i) \frac{\phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})}{1 - \Phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})} \,\middle|\, X_i, \widetilde{\mathit{Price}}_i \right] = 0. \tag{A.8}$$

Multiplying (A.8) by $-\Phi(\cdot)/\phi(\cdot)$, which is a function of the conditioning variables only, gives

$$E\left[ (1 - S_i) \frac{\Phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})}{1 - \Phi(X_i\beta^*_{\tilde{\epsilon}} - \widetilde{\mathit{Price}}_i\gamma^*_{\tilde{\epsilon}})} - S_i \,\middle|\, X_i, \widetilde{\mathit{Price}}_i \right] = 0. \tag{A.9}$$

The advantage of this transformation is that the odds ratio is globally convex in its arguments. Replacing the unobserved $\widetilde{\mathit{Price}}_i$ with $\mathit{Price}_i$ changes the equation into an inequality by application of Jensen's inequality due to the global convexity of the odds ratio.

Footnote: The bias makes substitution of prices for perceived prices nontrivial, and contributes to the inequality.

Footnote: Global convexity of $E[\tilde{\epsilon}_i \mid \tilde{\epsilon}_i < \kappa]$ in $\kappa$ is necessary for the inequalities to hold regardless of the value of $\kappa$ and the variance of the misperception term. This condition is satisfied by both the normal and logistic distributions.
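The role of Jensen's inequality can be checked numerically. The sketch below is an illustration under assumed values (a hypothetical design with $\theta^* = 0$, $\gamma^* = 1$, and bounded uniform prices so that the odds stay finite), not the paper's own simulation. It evaluates the two odds-based terms once at the perceived price, where the score equality implies means near zero, and once at the observed price, which adds noise unknown to agents.

```python
import math
import random


def norm_cdf(x):
    """Standard normal CDF; erfc keeps precision in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def odds_based_terms(s, index):
    """Per-observation terms inside the two expectations in (A.7)."""
    p = norm_cdf(index)
    o1 = s * (1.0 - p) / p - (1 - s)
    o2 = (1 - s) * p / (1.0 - p) - s
    return o1, o2


def odds_moment_demo(n=50000, seed=1):
    """Return sample means of (o1, o2) at perceived and at observed prices."""
    rng = random.Random(seed)
    sums_perceived = [0.0, 0.0]
    sums_actual = [0.0, 0.0]
    for _ in range(n):
        perceived_price = rng.uniform(-1.0, 1.0)                 # known to the agent
        actual_price = perceived_price + rng.uniform(-1.0, 1.0)  # adds unknown noise
        eps = rng.gauss(0.0, 1.0)
        s = 1 if -perceived_price + eps >= 0 else 0              # choice uses perceived price
        for sums, price in ((sums_perceived, perceived_price),
                            (sums_actual, actual_price)):
            o1, o2 = odds_based_terms(s, -price)                 # index = 0 - price
            sums[0] += o1
            sums[1] += o2
    return ([v / n for v in sums_perceived], [v / n for v in sums_actual])
```

In this design the means at the perceived price should be close to zero, while the means at the observed price should be strictly positive, reflecting the convexity of the odds ratio under a mean-preserving spread of the index.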
As the index of the odds ratio increases, the model-predicted odds of a given outcome approach positive infinity, while the odds approach zero as the index decreases. When the index is replaced with a mean-preserving spread of itself (via replacing perceived prices with prices), this first effect will usually dominate the second regardless of the distributional assumption. This inequality holds when taking its expectation conditional on $Z_i$ by the law of iterated expectations. The first inequality follows from similar intuition for those who do not select the investment.

A.1.2 Estimation Using Moment Inequalities

Under the information assumptions provided, the true parameters $\psi = (\theta, \sigma_{\tilde{\epsilon}})$ will be contained within the set of parameters that satisfy the inequalities, which I define as $\Psi$. First, because it is computationally expensive to compute the inequalities conditional on $Z_i$, I will instead use unconditional inequalities that are consistent with the conditional inequalities described above. Additionally, in small samples it is possible that the true parameters will not strictly satisfy these inequalities, so it is necessary to construct a test of the hypothesis that a given value $\psi_p = (\theta_p, \sigma_{\tilde{\epsilon},p})$ is consistent with the inequalities. To do this I employ the modified method of moments procedure described by Andrews and Soares (2010), which yields a confidence set $\hat{\Psi}$ of parameters for which I fail to reject consistency with the inequalities, where an element of this set is given by $(\hat{\theta}_p, \hat{\sigma}_{\tilde{\epsilon},p})$.

Footnote: Global convexity of the odds ratio is necessary for this condition to hold for all values of the index and for all magnitudes of mean-preserving spreads. This condition is satisfied by log-concave distributions, such as the normal and logistic.

Given $(\beta, \sigma_{\tilde{\epsilon}})$, perceived returns are given by

$$Y_i \mid X_i, \widetilde{\mathit{Price}}_i \sim N(X_i\beta - \widetilde{\mathit{Price}}_i, \sigma_{\tilde{\epsilon}}^2). \tag{A.10}$$

Thus, even given the true $(\beta, \sigma_{\tilde{\epsilon}})$, the problem remains that we do not observe $\widetilde{\mathit{Price}}_i$ in the data.
However, it is possible to bound perceived returns at the true parameter values using $\mathit{Price}_i$ and $E[\mathit{Price}_i \mid Z_i]$, which we do have access to.

For valid $Z_i$, equation (6) implies that it is possible to approximate $\widetilde{\mathit{Price}}_i$ with $\varphi\,\mathit{Price}_i + (1 - \varphi) E[\mathit{Price}_i \mid Z_i]$, where $\varphi$ minimizes $E\left[\left(\widetilde{\mathit{Price}}_i - (\varphi\,\mathit{Price}_i + (1 - \varphi) E[\mathit{Price}_i \mid Z_i])\right)^2\right]$. It must be that $\varphi \in [0, 1]$. For a given $\varphi \in [0, 1]$, the distribution of perceived returns at $(\hat{\theta}_p, \hat{\sigma}_{\tilde{\epsilon},p})$ can be constructed using

$$Y_i \mid X_i, \mathit{Price}_i, Z_i \sim N\left(X_i\hat{\theta}_p - \varphi\,\mathit{Price}_i - (1 - \varphi) E[\mathit{Price}_i \mid Z_i],\ \hat{\sigma}_{\tilde{\epsilon},p}^2\right). \tag{A.11}$$

Note that the PDF of this distribution is non-monotonic in $\varphi$, so setting $\varphi = 0$ and $\varphi = 1$ will not bound its PDF across its entire support. Computing the distribution for all $\varphi \in [0, 1]$ for each $(\hat{\theta}_p, \hat{\sigma}_{\tilde{\epsilon},p}) \in \hat{\Psi}$ is necessary to provide bounds for the perceived returns distribution.

Footnote: In practice, choosing any set of values between zero and one, including zero and one, will approximate these bounds. DM describe an alternative method that can be used to bound the CDF of perceived returns.

Appendix B: Moment Inequality Estimation

I closely follow appendices A.5 and A.7 in Dickstein and Morales (2018) to estimate the moment inequalities' confidence set for the true parameter $\psi$. Adapting DM's procedure to the current setting would account for imputation of prices. I use a simplified version of their procedure, because I assume that prices are observed for all individuals, regardless of whether they select the investment. This assumption is irrelevant to the contributions of this paper, as each method admits imputation. I also deviate from DM in how I conduct the grid search over potential parameters in order to speed computation in the absence of parallelization.

The confidence set is obtained by applying the Andrews and Soares (2010) modified method of moments (MMM). This method follows the intuition of the generalized method of moments, but only penalizes moment deviations that violate the inequality while adjusting the hypothesis testing procedure to accommodate this change. I index the moment inequalities used in estimation by $\ell = 1, ..., L$ and denote them

$$\bar{m}_\ell(\psi) \equiv \frac{1}{N} \sum_{i=1}^{N} m_\ell(Z_i, \psi), \quad \ell = 1, ..., L,$$

where $N$ is the sample size. The MMM test statistic

$$Q(\psi) = \sum_{\ell=1}^{L} \left[ \min\left( \sqrt{N} \frac{\bar{m}_\ell(\psi)}{\hat{\sigma}_\ell(\psi)}, 0 \right) \right]^2 \tag{B.1}$$

gives the sum of squared inequality violations, where

$$\hat{\sigma}_\ell(\psi) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( m_\ell(Z_i, \psi) - \bar{m}_\ell(\psi) \right)^2}.$$

Note that as in Section A, $X_i \subset Z_i$. Each $m_\ell(\cdot)$ is a conditional revealed preference or odds-based moment inequality constructed as described in DM, Appendix A.5.
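The statistic in (B.1) can be sketched in a few lines of Python. This is a minimal illustration that assumes the moment functions $m_\ell(Z_i, \psi)$ have already been evaluated at a candidate $\psi$ and stored row by row; `mmm_statistic` is a hypothetical helper name, not part of any library.

```python
import math


def mmm_statistic(moments):
    """Sum of squared, studentized inequality violations as in (B.1).

    moments : list of N rows; row i holds the L values m_l(Z_i, psi).
              The inequalities are written as E[m_l] >= 0.
    Assumes each moment has nonzero sample variance.
    """
    n = len(moments)
    n_ineq = len(moments[0])
    q = 0.0
    for l in range(n_ineq):
        column = [row[l] for row in moments]
        m_bar = sum(column) / n
        # 1/N normalization, matching the definition of sigma-hat in the text
        sigma_hat = math.sqrt(sum((m - m_bar) ** 2 for m in column) / n)
        t_stat = math.sqrt(n) * m_bar / sigma_hat
        q += min(t_stat, 0.0) ** 2  # only violated inequalities contribute
    return q
```

For example, `mmm_statistic([[1.0, -1.0], [3.0, -3.0]])` penalizes only the second moment, whose sample mean is negative; the first moment satisfies its inequality and contributes nothing.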
I compute a confidence set for the true parameter $\psi$ using the following steps, closely following DM.

Step 1: define a grid $\Psi_g$ that overlaps with the confidence set. I define this grid as a $K$-dimensional orthotope where $K$ is the number of scalars indexed by $k = 1, ..., K$ within the parameter vector $\psi$. To define this grid, I choose $\psi_{min}$ to minimize $Q(\psi)$, initializing the minimization with the control function estimates $\hat{\psi}_{CF} \equiv (\hat{\theta}_{CF}, \hat{\sigma}_{\zeta,CF})$, which in simulations are typically near a minimum (zero) of $Q(\psi)$. The moment inequality confidence set encompasses the control function estimates in simulations included later in Appendix D when they provide consistent bounds, and there is good reason to believe that this will be the case generally (see Appendix C). Because $Q(\hat{\psi}_{min})$ will be close to zero, $\hat{\psi}_{min}$ is likely to be within the 95% confidence set, $\hat{\Psi}$, if this set is nonempty. I create boundaries in dimension $k$ by multiplying the standard error of the $k$th parameter by a large number, and adding and subtracting this value from the parameter to form bounds in the $k$th dimension. I repeat this for each of the $K$ parameters to obtain bounds on a $K$-dimensional initial grid $\Psi_g$. I fill this grid with $10^K$ equidistant points.

Footnote: As there are negligible computational disadvantages from having a very large initial grid, I multiply the standard errors by 20.

Step 2: choose a point $\psi_p \in \Psi_g$. For speed, I test points in ascending order of their Euclidean distance from $\hat{\psi}_{min}$. With $\psi_p$, I test the hypothesis that $\psi_p = \psi$:

$$H_0: \psi = \psi_p \quad \text{vs.} \quad H_1: \psi \neq \psi_p.$$

Step 3: evaluate the MMM test statistic at $\psi_p$:

$$Q(\psi_p) = \sum_{\ell=1}^{L} \left[ \min\left( \sqrt{N} \frac{\bar{m}_\ell(\psi_p)}{\hat{\sigma}_\ell(\psi_p)}, 0 \right) \right]^2. \tag{B.2}$$

Step 4: compute the correlation matrix of the moments evaluated at $\psi_p$:

$$\hat{\Omega}(\psi_p) = \mathrm{Diag}^{-\frac{1}{2}}(\hat{\Sigma}(\psi_p))\, \hat{\Sigma}(\psi_p)\, \mathrm{Diag}^{-\frac{1}{2}}(\hat{\Sigma}(\psi_p)),$$

where $\mathrm{Diag}(\hat{\Sigma}(\psi_p))$ is the $L \times L$ diagonal matrix that shares diagonal elements with $\hat{\Sigma}(\psi_p)$.
$\mathrm{Diag}^{-\frac{1}{2}}(\hat{\Sigma}(\psi_p))$ satisfies $\mathrm{Diag}^{-\frac{1}{2}}(\hat{\Sigma}(\psi_p))\, \mathrm{Diag}^{-\frac{1}{2}}(\hat{\Sigma}(\psi_p)) = \mathrm{Diag}^{-1}(\hat{\Sigma}(\psi_p))$, where

$$\hat{\Sigma}(\psi_p) = \frac{1}{N} \sum_{i=1}^{N} (m(Z_i, \psi_p) - \bar{m}(\psi_p))(m(Z_i, \psi_p) - \bar{m}(\psi_p))',$$

$m(Z_i, \psi_p) = (m_1(Z_i, \psi_p), ..., m_L(Z_i, \psi_p))$, and $\bar{m}(\psi_p) = (\bar{m}_1(\psi_p), ..., \bar{m}_L(\psi_p))$, where

$$\bar{m}_\ell(\psi_p) \equiv \frac{1}{N} \sum_{i=1}^{N} m_\ell(Z_i, \psi_p), \quad \forall \ell = 1, ..., L.$$

Step 5: simulate the asymptotic distribution of $Q(\psi_p)$. Take $R = 1000$ draws from the multivariate normal distribution $N(0_L, I_L)$ where $0_L$ is a vector of zeros and $I_L$ is an $L$-dimensional identity matrix. Denote each of these draws as $\chi_r$. Define the criterion function $Q^{AA}_{N,r}(\psi_p)$ as

$$Q^{AA}_{N,r}(\psi_p) = \sum_{\ell=1}^{L} \left[ \left( \min\left( [\hat{\Omega}^{\frac{1}{2}}(\psi_p)\chi_r]_\ell, 0 \right) \right)^2 \times \mathbb{1}\left( \sqrt{N} \frac{\bar{m}_\ell(\psi_p)}{\hat{\sigma}_\ell(\psi_p)} \leq \sqrt{\ln N} \right) \right],$$

where $[\hat{\Omega}^{\frac{1}{2}}(\psi_p)\chi_r]_\ell$ is the $\ell$th element of the vector $\hat{\Omega}^{\frac{1}{2}}(\psi_p)\chi_r$.

Step 6: compute the critical value. The critical value $\hat{c}^{AA}_N(\psi_p, 1 - \alpha)$ is the $(1 - \alpha)$-quantile of the distribution of $Q^{AA}_{N,r}(\psi_p)$ across the $R$ draws taken in step 5.

Step 7: reject or fail to reject $\psi_p$. If $Q(\psi_p) \leq \hat{c}^{AA}_N(\psi_p, 1 - \alpha)$, include $\psi_p$ in the estimated $100(1 - \alpha)\%$ confidence set, $\hat{\Psi}_{1-\alpha}$, and in the (initially empty) grid $\Psi_{g'}$ that will contain the confidence set.

Step 8: repeat steps 2 through 7 until a $\psi_p$ is not rejected. This will likely occur at the first point checked, $\psi_{min}$, as this parameter minimizes $Q(\psi_p)$, though it does not maximize $\hat{c}^{AA}_N(\psi_p, 1 - \alpha)$.

Step 9: form a small grid around each $\psi_p$ in $\hat{\Psi}_{1-\alpha}$. Form $\Psi_{g,p}$, a local $K$-dimensional orthotope with 3 equidistant points in each dimension (with distance between points defined as in step 1), centered around $\psi_p$ for each $\psi_p$ in $\hat{\Psi}_{1-\alpha}$. Add $\Psi_{g,p}$ to the grid $\Psi_{g'}$ that will contain the confidence set.

Step 10: repeat steps 3 through 7 for every point in $\Psi_{g'}$ that has not yet been checked.

Step 11: iterate on steps 9 and 10 until all points in $\Psi_{g'}$ have been checked.

Step 12: ensure desired grid fineness. If the number of elements of the set $\hat{\Psi}_{1-\alpha}$ is below the desired minimum number, set the distance between grid points at one-half of the current value and repeat step 11. Repeat this step until the number of elements of $\hat{\Psi}_{1-\alpha}$ exceeds the desired number of such elements.

Appendix C: Moment Inequalities and Endogeneity

For proofs of the validity of the moment inequalities for providing a confidence set that consistently bounds the true parameter vector, $(\theta, \sigma_{\tilde{\epsilon}})$, in the context of the setting presented in Section A, see DM. The inequalities also appear to consistently bound perceived returns in simulations when there is correlation between perceived prices and the unobserved error in perceived returns and correlation between information frictions and the unobserved error in perceived returns under the assumption $\frac{\rho}{1-\rho} \geq 0$, which is weaker than the assumption described in Section A. I provide proofs of consistency here for the revealed preference moment inequalities, and arguments for consistency for the odds-based moment inequalities, borrowing from the proofs provided by DM. Note that the parameters relevant to this section are $(\theta, \sigma_\xi)$, not those used in Section A. I use the notation $(\theta^*_\xi, \gamma^*_\xi, \rho^*_\xi) = \left(\frac{\theta}{\sigma_\xi}, \frac{1}{\sigma_\xi}, \frac{\rho}{\sigma_\xi}\right)$ throughout the following while assuming

$$\xi_i \mid X_i, \mathit{Price}_i, u_i \sim N(0, \sigma_\xi^2). \tag{C.1}$$

The condition $\frac{\rho}{1-\rho} \geq 0$ [...] $u_i$ with a multiplier of $-(1 - \rho)$. Note that $\frac{\rho}{1-\rho} \geq 0$ [...] $\rho \in [0, 1)$ [...] $\sigma_\xi$, but not necessarily $\sigma$. As these parameters serve the same function, this has no effect on the predictive capacity of any resulting estimates of perceived returns.

I begin by presenting a lemma that will be useful in the subsequent proofs. It also serves as the main point of departure from the proofs provided by DM.

Lemma 1

If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1-\rho} \geq 0$, then

$$E[u_i(1 - \rho) \mid S_i = 0, Z_i] \geq 0 \geq E[u_i(1 - \rho) \mid S_i = 1, Z_i]. \tag{C.2}$$

Proof:

From the definition of $S_i$ given in (4) and (5), substituting in the expression for perceived returns in (13) implies

$$E[u_i(1 - \rho) \mid S_i = 0, Z_i] = E[u_i(1 - \rho) \mid X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \leq 0, Z_i]. \tag{C.3}$$

Substituting in the definition of $\mathit{Price}_i$ provided in (12) and rearranging the conditioning inequality implies

$$\begin{aligned}
E[u_i(1 - \rho) \mid X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \leq 0, Z_i]
&= E[u_i(1 - \rho) \mid X_i\theta - Z_i\delta - u_i(1 - \rho) + \xi_i \leq 0, Z_i] \\
&= E[u_i(1 - \rho) \mid u_i(1 - \rho) \geq (X_i\theta - Z_i\delta + \xi_i), Z_i].
\end{aligned} \tag{C.4}$$

Given the property of expectations of truncated variables that $E[X \mid X \geq Y] \geq E[X]$, it follows that

$$E[u_i(1 - \rho) \mid u_i(1 - \rho) \geq (X_i\theta - Z_i\delta + \xi_i), Z_i] \geq E[u_i(1 - \rho) \mid Z_i] = 0, \tag{C.5}$$

where the last equality follows from the definition of $u_i$ given in (12). The definition of $S_i$ given in (4) and (5), substituting in the expression of perceived returns in (13), also implies

$$E[u_i(1 - \rho) \mid S_i = 1, Z_i] = E[u_i(1 - \rho) \mid X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \geq 0, Z_i]. \tag{C.6}$$

Substituting in the definition of $\mathit{Price}_i$ provided in (12) and rearranging the conditioning inequality implies

$$\begin{aligned}
E[u_i(1 - \rho) \mid X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \geq 0, Z_i]
&= E[u_i(1 - \rho) \mid X_i\theta - Z_i\delta - u_i(1 - \rho) + \xi_i \geq 0, Z_i] \\
&= E[u_i(1 - \rho) \mid u_i(1 - \rho) \leq (X_i\theta - Z_i\delta + \xi_i), Z_i].
\end{aligned} \tag{C.7}$$

Given the property of expectations of truncated variables that $E[X \mid X \leq Y] \leq E[X]$, it follows that

$$E[u_i(1 - \rho) \mid u_i(1 - \rho) \leq (X_i\theta - Z_i\delta + \xi_i), Z_i] \leq E[u_i(1 - \rho) \mid Z_i] = 0, \tag{C.8}$$

where the last equality follows from the definition of $u_i$ given in (12). Substituting (C.5) into (C.3) and (C.8) into (C.6) implies (C.2). ∎

C.1. Proof of Revealed Preference Inequality Robustness to Endogeneity

Lemma 2

Suppose equations (4), (5), and (13) hold. Then

$$E\left[ S_i (X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i) \,\middle|\, Z_i \right] \geq 0. \tag{C.9}$$

Proof:

From equations (4), (5), and (13),

$$S_i = \mathbb{1}\{X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \geq 0\}. \tag{C.10}$$

This implies

$$S_i (X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i) \geq 0. \tag{C.11}$$

This inequality holds for every individual $i$, therefore it will hold in expectation conditional on $Z_i$. ∎

Lemma 3

Equations (4), (5), (12), (13), and (C.1) imply that

$$E\left[ S_i (X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi) + (1 - S_i) \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) \,\middle|\, Z_i \right] \geq 0. \tag{C.12}$$

Proof:

Equation (C.9) and the definition of $\mathit{Price}_i$ from equation (12) imply

$$E[S_i (X_i\theta - Z_i\delta) \mid Z_i] - E[S_i u_i(1 - \rho) \mid Z_i] + E[S_i\xi_i \mid Z_i] \geq 0. \tag{C.13}$$

The assumption in (12) implies that $E[u_i \mid Z_i] = 0$, so it follows that

$$E[S_i u_i(1 - \rho) + (1 - S_i) u_i(1 - \rho) \mid Z_i] = 0.$$

Expression (C.1) implies that $E[\xi_i \mid X_i, \mathit{Price}_i, u_i] = 0$, which implies

$$E[S_i\xi_i + (1 - S_i)\xi_i \mid X_i, \mathit{Price}_i, u_i] = 0.$$

Assuming the distribution of $Z_i$ conditional on $X_i, \mathit{Price}_i, u_i$ is degenerate and applying the law of iterated expectations, the preceding two equations allow us to rewrite equation (C.13) as

$$E[S_i (X_i\theta - Z_i\delta) \mid Z_i] + E[(1 - S_i) u_i(1 - \rho) \mid Z_i] - E[(1 - S_i)\xi_i \mid Z_i] \geq 0. \tag{C.14}$$

Assuming the distribution of $Z_i$ conditional on $X_i, \mathit{Price}_i, u_i$ is degenerate and applying the law of iterated expectations also implies

$$\begin{aligned}
E[(1 - S_i)\xi_i \mid Z_i]
&= E\big[ E[(1 - S_i)\xi_i \mid S_i, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[(1 - S_i) \mid X_i, \mathit{Price}_i, u_i]\, E[\xi_i \mid S_i, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 1 \mid X_i, \mathit{Price}_i, u_i) \times 0 \times E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \\
&\qquad + P(S_i = 0 \mid X_i, \mathit{Price}_i, u_i) \times 1 \times E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 0 \mid X_i, \mathit{Price}_i, u_i)\, E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[(1 - S_i) \mid X_i, \mathit{Price}_i, u_i]\, E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\Big[ E\big[ (1 - S_i) E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, X_i, \mathit{Price}_i, u_i \big] \,\Big|\, Z_i \Big] \\
&= E\big[ (1 - S_i) E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big].
\end{aligned}$$
This allows us to rewrite equation (C.14) as

$$E\big[ S_i (X_i\theta - Z_i\delta) + (1 - S_i) \big( u_i(1 - \rho) - E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \big) \,\big|\, Z_i \big] \geq 0. \tag{C.15}$$

Using the definition of $S_i$ from equation (4) and substituting in equations (5) and (13), it follows that

$$\begin{aligned}
E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i]
&= E\big[ \xi_i \,\big|\, (-\xi_i \geq X_i\theta - \mathit{Price}_i + u_i\rho), X_i, \mathit{Price}_i, u_i \big] \\
&= -E\big[ -\xi_i \,\big|\, (-\xi_i \geq X_i\theta - \mathit{Price}_i + u_i\rho), X_i, \mathit{Price}_i, u_i \big],
\end{aligned}$$

which allows us to rewrite

$$E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] = -\sigma_\xi \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \tag{C.16}$$

using Expression (C.1) and applying the symmetry of the normal distribution. Equation (C.12) follows by applying this equality to (C.15) and dividing each side of the resulting inequality by $\sigma_\xi$. ∎

Lemma 4

Given $\frac{\rho}{1-\rho} \geq 0$, equations (4), (5), (12), and (13) imply

$$\begin{aligned}
&E\left[ (1 - S_i) \left( u_i\gamma^*_\xi + \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi)} \right) \,\middle|\, Z_i \right] \\
&\quad \geq E\left[ (1 - S_i) \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) \,\middle|\, Z_i \right].
\end{aligned} \tag{C.17}$$

Proof:

Using the definition of $\mathit{Price}_i$ from equation (12), it follows that

$$\begin{aligned}
&E\left[ (1 - S_i) \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) \,\middle|\, Z_i \right] \\
&\quad = E\left[ (1 - S_i) \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))} \right) \,\middle|\, Z_i \right].
\end{aligned} \tag{C.18}$$

The law of iterated expectations and $S_i \in \{0, 1\}$ imply

$$\begin{aligned}
&E\left[ (1 - S_i) \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))} \right) \,\middle|\, Z_i \right] \\
&\quad = E\left[ (1 - S_i)\, E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right].
\end{aligned} \tag{C.19}$$

Because $\frac{\partial}{\partial x} \frac{\phi(-x)}{1 - \Phi(-x)} \in (-1, 0)$, it follows that the expression

$$u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi))} \tag{C.20}$$

is monotonically increasing in $u_i(\gamma^*_\xi - \rho^*_\xi)$. It follows then that adding a positive value to this value will increase the value of the function.
From (C.2) and the condition $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} \geq 0$, we have $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \geq 0 \ \forall i$, so it follows that

$$\begin{aligned}
&E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i] + \frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)} \,\middle|\, S_i = 0, Z_i \right] \\
&= E\left[ \Big( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \Big) + \frac{\phi\Big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \big( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \big)\Big)}{1 - \Phi\Big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \big( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 0, Z_i] \big)\Big)} \,\middle|\, S_i = 0, Z_i \right] \\
&\geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)} \,\middle|\, S_i = 0, Z_i \right],
\end{aligned} \tag{C.21}$$

where the second line relates to the third by this addition, and the first relates to the second by algebraic simplifications. Finally, because the term

$$u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i]$$

and the term

$$\frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)}$$

are globally convex in $-u_i$, the entire function is globally convex in $-u_i$.
It follows that

$$\begin{aligned}
&E\left[ u_i\gamma^*_\xi + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad \geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + E[u_i\rho^*_\xi \mid S_i = 0, Z_i] + \frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\big)} \,\middle|\, S_i = 0, Z_i \right]
\end{aligned} \tag{C.22}$$

by Jensen's inequality. Combining this inequality with that in (C.21) yields the result

$$\begin{aligned}
&E\left[ u_i\gamma^*_\xi + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)} \,\middle|\, S_i = 0, Z_i \right] \\
&\quad \geq E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)} \,\middle|\, S_i = 0, Z_i \right].
\end{aligned} \tag{C.23}$$

It follows immediately that

$$\begin{aligned}
&E\left[ (1 - S_i)\, E\left[ u_i\gamma^*_\xi + \frac{\phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi)} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right] \\
&\quad \geq E\left[ (1 - S_i)\, E\left[ u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)}{1 - \Phi\big(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\big)} \,\middle|\, S_i = 0, Z_i \right] \,\middle|\, Z_i \right].
\end{aligned} \tag{C.24}$$

Equation (C.17) follows from this by substituting in the definition of $\mathit{Price}_i$ from (12) and applying the law of iterated expectations. ∎

Corollary 1

Given (C.12), (C.17), and the definition of $\mathit{Price}_i$ given in (12), it follows that

$$E\left[ S_i (X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi) + (1 - S_i) \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi)} \,\middle|\, Z_i \right] \geq 0. \tag{C.25}$$

Proof:

The result follows from equations (C.12) and (C.17), substituting $E[-S_i u_i \mid Z_i] = E[(1 - S_i) u_i \mid Z_i]$, which holds because $E[u_i \mid Z_i] = 0$, and substituting in the definition of $\mathit{Price}_i$ given in (12). ∎

Lemma 5

Suppose equations (4), (5), and (13) hold. Then

$$E\left[ -(1 - S_i) (X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i) \,\middle|\, Z_i \right] \geq 0. \tag{C.26}$$

Proof:

From equations (4), (5), and (13),

$$S_i = \mathbb{1}\{X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i \geq 0\}. \tag{C.27}$$

This implies

$$-(1 - S_i) (X_i\theta - \mathit{Price}_i + u_i\rho + \xi_i) \geq 0. \tag{C.28}$$

This inequality holds for every individual $i$, therefore it will hold in expectation conditional on $Z_i$. ∎

Lemma 6

Equations (4), (5), (12), (13), and (C.1) imply that

$$E\left[ -(1 - S_i) (X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi) + S_i \left( -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \mathit{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) \,\middle|\, Z_i \right] \geq 0. \tag{C.29}$$

Proof:

Equation (C.26) and the definition of $\mathit{Price}_i$ from equation (12) imply

$$E[-(1 - S_i)(X_i\theta - Z_i\delta) \mid Z_i] + E[(1 - S_i) u_i(1 - \rho) \mid Z_i] - E[(1 - S_i)\xi_i \mid Z_i] \geq 0. \tag{C.30}$$

The assumption in (12) implies that $E[u_i \mid Z_i] = 0$, so it follows that

$$E[S_i u_i(1 - \rho) + (1 - S_i) u_i(1 - \rho) \mid Z_i] = 0.$$

Expression (C.1) implies that $E[\xi_i \mid X_i, \mathit{Price}_i, u_i] = 0$, which implies

$$E[S_i\xi_i + (1 - S_i)\xi_i \mid X_i, \mathit{Price}_i, u_i] = 0.$$

Assuming the distribution of $Z_i$ conditional on $X_i, \mathit{Price}_i, u_i$ is degenerate and applying the law of iterated expectations, the preceding two equations allow us to rewrite equation (C.30) as

$$E[-(1 - S_i)(X_i\theta - Z_i\delta) \mid Z_i] - E[S_i u_i(1 - \rho) \mid Z_i] + E[S_i\xi_i \mid Z_i] \geq 0. \tag{C.31}$$

Assuming the distribution of $Z_i$ conditional on $X_i, \mathit{Price}_i, u_i$ is degenerate and applying the law of iterated expectations also implies

$$\begin{aligned}
E[S_i\xi_i \mid Z_i]
&= E\big[ E[S_i\xi_i \mid S_i, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[S_i \mid X_i, \mathit{Price}_i, u_i]\, E[\xi_i \mid S_i, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 1 \mid X_i, \mathit{Price}_i, u_i) \times 1 \times E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \\
&\qquad + P(S_i = 0 \mid X_i, \mathit{Price}_i, u_i) \times 0 \times E[\xi_i \mid S_i = 0, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ P(S_i = 1 \mid X_i, \mathit{Price}_i, u_i)\, E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\big[ E[S_i \mid X_i, \mathit{Price}_i, u_i]\, E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big] \\
&= E\Big[ E\big[ S_i E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \,\big|\, X_i, \mathit{Price}_i, u_i \big] \,\Big|\, Z_i \Big] \\
&= E\big[ S_i E[\xi_i \mid S_i = 1, X_i, \mathit{Price}_i, u_i] \,\big|\, Z_i \big].
\end{aligned}$$
This allows us to rewrite equation (C.31) as

$$E\big[ -(1 - S_i)(X_i\theta - Z_i\delta) + S_i\big( -u_i(1 - \rho) + E[\xi_i \mid S_i = 1, X_i, \text{Price}_i, u_i] \big) \,\big|\, Z_i \big] \geq 0. \tag{C.32}$$

Using the definition of $S_i$ from equation (4) and substituting in equations (5) and (13), it follows that

$$\begin{aligned}
E[\xi_i \mid S_i = 1, X_i, \text{Price}_i, u_i] &= E\big[ \xi_i \,\big|\, (-\xi_i \leq X_i\theta - \text{Price}_i + u_i\rho), X_i, \text{Price}_i, u_i \big] \\
&= -E\big[ -\xi_i \,\big|\, (-\xi_i \leq X_i\theta - \text{Price}_i + u_i\rho), X_i, \text{Price}_i, u_i \big],
\end{aligned}$$

which allows us to rewrite

$$E[\xi_i \mid S_i = 1, X_i, \text{Price}_i, u_i] = \sigma_\xi \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \tag{C.33}$$

using Expression (C.1) and applying the symmetry of the normal distribution. Equation (C.29) follows by applying this equality to (C.32) and dividing each side of the resulting inequality by $\sigma_\xi$. ∎

Lemma 7

Given $\frac{\rho}{1 - \rho} \geq 0$, equations (4), (5), (12), and (13) imply

$$E\left[ S_i\left( -u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)} \right) \,\middle|\, Z_i \right] \geq E\left[ S_i\left( -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \right) \,\middle|\, Z_i \right] \tag{C.34}$$

Proof:

Using the definition of $\text{Price}_i$ from equation (12), it follows that

$$E\left[ S_i\left( -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \right) \,\middle|\, Z_i \right] = E\left[ S_i\left( -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \right) \,\middle|\, Z_i \right]. \tag{C.35}$$

The law of iterated expectations and $S_i \in \{0, 1\}$ imply

$$E\left[ S_i\left( -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \right) \,\middle|\, Z_i \right] = E\left[ S_i\, E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right] \,\middle|\, Z_i \right]. \tag{C.36}$$

Because $\frac{\partial}{\partial x}\frac{\phi(-x)}{\Phi(-x)} \in (0, 1)$, it follows that the expression

$$-u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \tag{C.37}$$

is monotonically decreasing in $u_i(\gamma^*_\xi - \rho^*_\xi)$. It follows then that adding a negative value to this value will increase the value of the function.
From (C.2) and the condition $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} \geq 0$, we have $\frac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 1, Z_i] \leq 0 \ \forall i$, so it follows that

$$\begin{aligned}
& E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)} \,\middle|\, S_i = 1, Z_i \right] \\
&= E\left[ -\left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 1, Z_i] \right) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 1, Z_i] \right)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - \left( u_i(\gamma^*_\xi - \rho^*_\xi) + \tfrac{\rho^*_\xi}{\gamma^*_\xi - \rho^*_\xi} E[u_i(\gamma^*_\xi - \rho^*_\xi) \mid S_i = 1, Z_i] \right)\right)} \,\middle|\, S_i = 1, Z_i \right] \\
&\geq E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right],
\end{aligned} \tag{C.38}$$

where the second line relates to the third by this addition, and the first relates to the second by algebraic simplifications. Finally, because the term

$$\frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}$$

is globally convex in $-u_i$, the function is globally convex in $-u_i$.
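The two properties of the inverse Mills ratio invoked in this proof, that $\phi(-x)/\Phi(-x)$ has a derivative in $(0, 1)$ and that $\phi(z)/\Phi(z)$ is convex, can be checked numerically. The following is a minimal sketch on a finite grid. The function names, the grid, and the index value `c` are illustrative assumptions, and a grid check is a sanity check rather than a proof.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def inv_mills(z):
    """Inverse Mills ratio phi(z) / Phi(z)."""
    return norm_pdf(z) / norm_cdf(z)

def d_inv_mills_neg(x):
    """Analytic derivative of phi(-x) / Phi(-x) with respect to x."""
    lam = inv_mills(-x)
    return lam * (lam - x)

grid = [i / 10.0 for i in range(-60, 61)]  # x in [-6, 6]

# Claim 1: the derivative of phi(-x)/Phi(-x) lies strictly inside (0, 1).
derivs = [d_inv_mills_neg(x) for x in grid]
deriv_in_unit_interval = all(0.0 < d < 1.0 for d in derivs)

# Consequence: -t + phi(c - t)/Phi(c - t) is decreasing in t for a fixed index c.
c = 0.7  # arbitrary illustrative index value (assumption)
vals = [-t + inv_mills(c - t) for t in grid]
decreasing = all(b < a for a, b in zip(vals, vals[1:]))

# Claim 2: phi(z)/Phi(z) is convex, so second differences are positive.
h = 0.1
second_diffs = [inv_mills(z + h) - 2.0 * inv_mills(z) + inv_mills(z - h) for z in grid]
convex = all(d > 0.0 for d in second_diffs)

print(deriv_in_unit_interval, decreasing, convex)
```

Here `t` plays the role of $u_i(\gamma^*_\xi - \rho^*_\xi)$ and `c` the role of $X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi$ in expression (C.37).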
It follows that

$$E\left[ -u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \geq E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i] + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)} \,\middle|\, S_i = 1, Z_i \right] \tag{C.39}$$

by Jensen's inequality. Combining this inequality with that in (C.38) yields the result

$$E\left[ -u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \geq E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right]. \tag{C.40}$$

It follows immediately that

$$E\left[ S_i\, E\left[ -u_i\gamma^*_\xi + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i\gamma^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \,\middle|\, Z_i \right] \geq E\left[ S_i\, E\left[ -u_i(\gamma^*_\xi - \rho^*_\xi) + \frac{\phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right] \,\middle|\, Z_i \right]. \tag{C.41}$$

Equation (C.34) follows from this by substituting the definition of $\text{Price}_i$ from (12) and applying the law of iterated expectations. ∎

Corollary 2

Given (C.29), (C.34), and the definition of $\text{Price}_i$ given in (12), it follows that

$$E\left[ -(1 - S_i)\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right) + S_i \frac{\phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)} \,\middle|\, Z_i \right] \geq 0. \tag{C.42}$$

Proof: The result follows from equations (C.29) and (C.34), the fact that $E[S_i u_i \mid Z_i] = -E[(1 - S_i)u_i \mid Z_i]$, and the definition of $\text{Price}_i$ given in (12). ∎

Proof of Robustness of Revealed Preference Inequalities to Endogeneity: Combining equations (C.25) and (C.42) provides both inequalities defined in equation (A.3). ∎

C.2. Argument for Odds-Based Inequality Robustness to Endogeneity

The following argument is constructed as a proof, where the components of the argument that do not meet the standards of a proof are discussed as they arise.
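The argument in this subsection leans on the convexity of the odds functions $(1 - \Phi(x))/\Phi(x)$ and $\Phi(x)/(1 - \Phi(x))$ and on Jensen's inequality. As a hedged illustration, the sketch below checks convexity on a finite grid and shows the Jensen step by Monte Carlo. The index value `c`, the dispersion of the shock, the sample size, and the seed are all illustrative assumptions rather than quantities from the paper.

```python
import math
import random

def norm_cdf(z):
    """Standard normal CDF via erfc, accurate in both tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def odds_not_invest(x):
    """(1 - Phi(x)) / Phi(x), using 1 - Phi(x) = Phi(-x)."""
    return norm_cdf(-x) / norm_cdf(x)

def odds_invest(x):
    """Phi(x) / (1 - Phi(x))."""
    return norm_cdf(x) / norm_cdf(-x)

def second_diff(f, x, h):
    """Central second difference, positive for a strictly convex f."""
    return f(x + h) - 2.0 * f(x) + f(x - h)

# Convexity check on a grid over [-5, 5].
h = 0.1
grid = [i / 10.0 for i in range(-50, 51)]
convex_a = all(second_diff(odds_not_invest, x, h) > 0.0 for x in grid)
convex_b = all(second_diff(odds_invest, x, h) > 0.0 for x in grid)

# Jensen step: E[f(c - w)] >= f(c - E[w]) for convex f and a random shock w.
rng = random.Random(0)
c = 0.5  # illustrative index value (assumption)
shocks = [rng.gauss(0.0, 0.4) for _ in range(20000)]
mean_shock = sum(shocks) / len(shocks)
lhs = sum(odds_not_invest(c - w) for w in shocks) / len(shocks)
rhs = odds_not_invest(c - mean_shock)
jensen_holds = lhs >= rhs

print(convex_a, convex_b, jensen_holds)
```

The gap `lhs - rhs` grows with the variance of the shock, which is the sense in which the Jensen inequality is driven by $Var(u_i \mid S_i, Z_i)$ in the argument.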

Lemma 8

Equations (4), (5), (13), (C.1), and the assumption that the distribution of $Z_i$ is degenerate conditional on $(X_i, \text{Price}_i, u_i)$ imply that

$$E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} - (1 - S_i) \,\middle|\, Z_i \right] \geq 0. \tag{C.43}$$

Proof: Expression (C.1) implies that

$$S_i - \mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \geq 0\} \geq 0,$$

or, equivalently,

$$1 - \mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \geq 0\} - (1 - S_i) \geq 0,$$

$$\mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \leq 0\} - (1 - S_i) \geq 0,$$

for all $i$. Given that this inequality holds for all individuals, it will also hold in expectation, conditional on any set of variables, across individuals. It follows that

$$E\big[ \mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \leq 0\} - (1 - S_i) \,\big|\, X_i, \text{Price}_i, u_i \big] \geq 0.$$

The distributional assumption in (C.1) implies

$$E\big[ 1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi) - (1 - S_i) \,\big|\, X_i, \text{Price}_i, u_i \big] \geq 0.$$

Dividing through by $\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)$ yields

$$E\left[ \frac{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \frac{1 - S_i}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0.$$

Adding and subtracting $1 - S_i$ gives

$$E\left[ \frac{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left( 1 - 1 + \frac{1}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right)(1 - S_i) \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0,$$

which we can rearrange into

$$E\left[ \frac{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left( 1 + \frac{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right)(1 - S_i) \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0,$$

which can then be rearranged into

$$E\left[ S_i \frac{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - (1 - S_i) \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0.$$

Equation (C.43) follows from the law of iterated expectations and the assumption that the distribution of $Z_i$ conditional on $(X_i, \text{Price}_i, u_i)$ is degenerate. ∎

Lemma 9

If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1 - \rho} \geq 0$, then

$$E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)} \,\middle|\, Z_i \right] \geq E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \,\middle|\, Z_i \right]. \tag{C.44}$$

Argument: Substituting the definition of $\text{Price}_i$ from equation (12), we have that

$$E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, Z_i \right] = E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \,\middle|\, Z_i \right].$$

Because $S_i \in \{0, 1\}$, it follows that

$$E\left[ S_i \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, Z_i \right] \geq 0 \iff E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right] \geq 0.$$

Proving the argument requires that

$$E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \geq E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right].$$

Given that $\frac{1 - \Phi(x)}{\Phi(x)}$ is globally convex in $x$, Jensen's inequality implies that

$$E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \geq E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)} \,\middle|\, S_i = 1, Z_i \right].$$

Meanwhile, (C.2) implies

$$E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)} \,\middle|\, S_i = 1, Z_i \right] \leq E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right].$$

Combining these inequalities yields

$$\begin{aligned}
& E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 1, Z_i \right] \\
&\geq E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 1, Z_i]\right)} \,\middle|\, S_i = 1, Z_i \right] \\
&\leq E\left[ \frac{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 1, Z_i \right].
\end{aligned}$$

Thus, the argument holds if the first inequality dominates the second. There is good reason to believe that this will be the case. The first inequality arises from $Var(u_i \mid S_i = 1, Z_i)$ (through Jensen's inequality), while the second arises from $E[u_i \mid S_i = 1, Z_i]$. Three points are salient here. First, given that $E[u_i \mid Z_i] = 0$, the value of $E[u_i \mid S_i = 1, Z_i]$ is a monotonic function of $Var(u_i \mid Z_i)$. Second, the function $(1 - \Phi(x))/\Phi(x)$ has a very large second derivative for most of its support, such that the application of Jensen's inequality will have a large effect on the inequality. Third, because $\rho^*_\xi$ is constrained to be small relative to $\gamma^*_\xi$, $E[u_i \mid S_i = 1, Z_i]$ is likely to have a relatively small effect on the inequality.

Equation (C.44) follows from the preceding inequalities if the first inequality dominates the second by performing simple algebraic manipulations and applying the definition of $\text{Price}_i$ given in (12). ∎

Lemma 10

Equations (4), (5), (13), (C.1), and the assumption that the distribution of $Z_i$ is degenerate conditional on $(X_i, \text{Price}_i, u_i)$ imply that

$$E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} - S_i \,\middle|\, Z_i \right] \geq 0. \tag{C.45}$$

Proof: Expression (C.1) implies that

$$\mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \geq 0\} - S_i \geq 0.$$

Given that this inequality holds for all individuals, it will also hold in expectation, conditional on any set of variables, across individuals. It follows that

$$E\big[ \mathbb{1}\{X_i\theta - \text{Price}_i + u_i\rho + \xi_i \geq 0\} - S_i \,\big|\, X_i, \text{Price}_i, u_i \big] \geq 0.$$

The distributional assumption in (C.1) implies

$$E\big[ \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi) - S_i \,\big|\, X_i, \text{Price}_i, u_i \big] \geq 0.$$

Dividing through by $1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)$ yields

$$E\left[ \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \frac{S_i}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0.$$

Adding and subtracting $S_i$ gives

$$E\left[ \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left( 1 - 1 + \frac{1}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) S_i \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0,$$

which we can rearrange into

$$E\left[ \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - \left( 1 + \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} \right) S_i \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0,$$

which is straightforward to rearrange into

$$E\left[ (1 - S_i) \frac{\Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)}{1 - \Phi(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi)} - S_i \,\middle|\, X_i, \text{Price}_i, u_i \right] \geq 0.$$

Equation (C.45) follows from the law of iterated expectations and the assumption that the distribution of $Z_i$ conditional on $(X_i, \text{Price}_i, u_i)$ is degenerate. ∎

Lemma 11

If equations (4), (5), (12), and (13) hold and $\frac{\rho}{1 - \rho} \geq 0$, then

$$E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi\right)} \,\middle|\, Z_i \right] \geq E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \,\middle|\, Z_i \right]. \tag{C.46}$$

Argument: Substituting the definition of $\text{Price}_i$ from equation (12), we have that

$$E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, Z_i \right] = E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - \text{Price}_i\gamma^*_\xi + u_i\rho^*_\xi\right)} \,\middle|\, Z_i \right].$$

Because $S_i \in \{0, 1\}$, it follows that

$$E\left[ (1 - S_i) \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, Z_i \right] \geq 0 \iff E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right] \geq 0.$$

Proving the argument requires that

$$E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \geq E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right].$$

Given that $\frac{\Phi(x)}{1 - \Phi(x)}$ is globally convex in $x$, Jensen's inequality implies that

$$E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \geq E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)} \,\middle|\, S_i = 0, Z_i \right].$$

Meanwhile, (C.2) implies

$$E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)} \,\middle|\, S_i = 0, Z_i \right] \leq E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right].$$

Combining these inequalities yields

$$\begin{aligned}
& E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - u_i\rho^*_\xi\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\geq E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi) - E[u_i\rho^*_\xi \mid S_i = 0, Z_i]\right)} \,\middle|\, S_i = 0, Z_i \right] \\
&\leq E\left[ \frac{\Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)}{1 - \Phi\left(X_i\theta^*_\xi - Z_i\delta\gamma^*_\xi - u_i(\gamma^*_\xi - \rho^*_\xi)\right)} \,\middle|\, S_i = 0, Z_i \right].
\end{aligned}$$

Thus, the argument holds if the first inequality dominates the second. There is good reason to believe that this will be the case. The first inequality arises from $Var(u_i \mid S_i = 0, Z_i)$ (through Jensen's inequality), while the second arises from $E[u_i \mid S_i = 0, Z_i]$. Three points are salient here. First, given that $E[u_i \mid Z_i] = 0$, the value of $E[u_i \mid S_i = 0, Z_i]$ is a monotonic function of $Var(u_i \mid Z_i)$. Second, the function $\Phi(x)/(1 - \Phi(x))$ has a very large second derivative for most of its support, such that the application of Jensen's inequality will have a large effect on the inequality. Third, because $\rho^*_\xi$ is constrained to be small relative to $\gamma^*_\xi$, $E[u_i \mid S_i = 0, Z_i]$ is likely to have a relatively small effect on the inequality.

Equation (C.46) follows from the immediately preceding inequalities if the first inequality dominates the second by performing simple algebraic manipulations and applying the definition of $\text{Price}_i$ given in (12). ∎

Argument for Odds-Based Inequality Robustness to Endogeneity: Substituting equation (C.44) into (C.43) and equation (C.46) into (C.45) provides the inequalities defined in equation (A.7).

Appendix D: Additional Simulations

This section presents additional simulations that include estimated bounds on parameters using the moment inequality method described in Appendix A. I present a series of variations on the setting described in Section 3.2, where the magnitudes and directions of selection and misperception biases vary. These simulations demonstrate the robustness of the control function method to a wide variety of empirical settings, while also demonstrating the performance of the moment inequalities in settings other than that described in Section A. I also present simulations with additional explanatory variables in order to demonstrate the computational performance of the different estimators.

As in the body of the paper, I use the following DGP,

$$\begin{aligned}
Y_i &= X_i\beta - \widetilde{\text{Price}}_i + \epsilon_i \\
\widetilde{\text{Price}}_i &= \text{Price}_i + \nu_i \\
\text{Price}_i &= Z_i\delta + u_i,
\end{aligned} \tag{D.1}$$

where for these simulations $Z_i$ is always uncorrelated with $\epsilon_i$, $\nu_i$, and $u_i$, and the nature of the covariance structure on these error terms will determine which methods will and will not provide consistent estimates of perceived returns. Because perceived prices only differ from realized prices in idiosyncratic ways, $\beta = \theta$ in all the following DGPs. Finally, I note that the probit will estimate $(\theta, \sigma)$, the control function method will estimate $(\theta, \rho, \sigma_\zeta)$, and the moment inequalities will bound $(\theta, \sigma_\epsilon)$ under the assumptions in Appendix A or $(\theta, \sigma_\xi)$ under the assumptions in Appendix C, where these are defined in Sections 3.1, 3.2, and Appendix A.

Each DGP comprises $N = 10{,}000$ observations of agents whose decisions are governed by their perceived returns to selection. I construct the instrument vector as $Z_i = [X_i \; z_i]$, where $X_i$ always includes only a constant unless otherwise stated, and $z_i$ is a single known and exogenous instrument. Finally, I assume the constant $\beta = 1$ and $\delta = [0 \; 1]'$ for all DGPs.

D.1. Known, Exogenous Prices

I begin with a well-behaved benchmark DGP that corresponds to the setting described in Section 3.1. I generate data according to

$$\begin{bmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{bmatrix} \sim N(0, \Sigma); \quad \Sigma = [\text{entries missing}], \tag{D.2}$$

where I include $\nu_i$ with a variance of zero such that agents have perfect information on prices.

Table D.1 shows perceived returns estimates for one simulation of this DGP using all three methods. Figure D.1 shows the distributions implied by the estimates for each method. Because this DGP is particularly well-behaved, all three methods' estimates are very close to the data-generating parameters. Additionally, the moment inequalities provide very tight bounds here because the first-stage error has relatively low variance such that making use of $E[\text{Price}_i \mid Z_i]$ in place of $\widetilde{\text{Price}}_i$ introduces little uncertainty into the estimated perceived returns.

Table D.1: Perceived Returns Estimates, Known Exogenous Prices

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          0.986      0.997               [0.906, 1.066]
                               (0.033)    (0.034)             N/A
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (2,2)      .          .                   [1.976, 2.236]
                                                              N/A
Observations                   10000      10000               10000

[Figure D.1. Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown over a grid of $\varphi$ on $[0, 1]$.]

D.2. Mean-reverting Misperceptions of Exogenous Prices

In this simulation, I consider a DGP that corresponds to the setting described in Section A in which agents do not precisely forecast prices such that $\text{Price}_i \neq \widetilde{\text{Price}}_i$. Specifically, price misperceptions move in the opposite direction of prices such that $Cov(\text{Price}_i, \nu_i) < 0$, as in the case when agents form rational expectations on prices using a strict subset of relevant forecasting variables. This causes agents to tend to believe their price is closer to the average than it actually is. I generate data according to

$$\begin{bmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{bmatrix} \sim N(0, \Sigma); \quad \Sigma = [\text{entries missing}]. \tag{D.3}$$

Table D.2 shows the estimates for one simulation of this DGP using all three methods. Figure D.2 shows the distributions implied by the estimates for each method. The control function and moment inequality estimates are close to the true parameters. The control function estimates are significantly more precise than those of the moment inequalities. The probit's estimates are biased upward as expected, given the normality assumptions on the errors (Yatchew and Griliches, 1985).

Table D.2: Perceived Returns Estimates, Mean-Reverting Misperceptions

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          2.753      0.896               [-0.819, 2.611]
                               (0.160)    (0.071)             N/A
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (2,3)      .          .                   [1.119, 4.919]
                                                              N/A
Observations                   10000      10000               10000
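The mean-reverting misperceptions in this DGP can be illustrated with a small simulation. The sketch below assumes a price made up of one component that agents observe and one that they do not, with agents forming rational expectations from the observed component only. The component variances, sample size, and seed are illustrative assumptions, not the values used in (D.3).

```python
import random

rng = random.Random(1234)
n = 20000

# Hypothetical price determinants: agents observe `known` but not `unknown`.
known = [rng.gauss(0.0, 1.0) for _ in range(n)]
unknown = [rng.gauss(0.0, 1.0) for _ in range(n)]

price = [k + m for k, m in zip(known, unknown)]
# Rational expectation given the known component only (unknown has mean zero).
perceived = list(known)
# Misperception nu_i, defined so that perceived price = price + nu.
nu = [p_hat - p for p_hat, p in zip(perceived, price)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

cov_price_nu = cov(price, nu)              # negative: misperceptions offset price movements
var_price = cov(price, price)
var_perceived = cov(perceived, perceived)  # smaller: perceived prices sit closer to the mean

print(cov_price_nu < 0, var_perceived < var_price)
```

Under these assumptions $Cov(\text{Price}_i, \nu_i)$ equals minus the variance of the unobserved component, so perceived prices are less dispersed than realized prices.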

[Figure D.2: Perceived Returns Distributions, Mean-Reverting Misperceptions. Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown over a grid of $\varphi$ on $[0, 1]$.]

D.3. Known, Positively Selected Prices

In this simulation, I consider a DGP in which prices are known, but $u_i$ and $\epsilon_i$ are positively correlated, such as in the case of price discrimination. This setting is one case of that described in Section 3.2. I generate data according to

$$\begin{bmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{bmatrix} \sim N(0, \Sigma); \quad \Sigma = [\text{entries missing}]. \tag{D.4}$$

Table D.3 shows the estimates for one simulation of this DGP using all three methods. Figure D.3 shows the distributions implied by the estimates for each method. The control function estimates are close to the true parameter values, while the moment inequalities also bound the true parameters. The probit estimates are biased, as expected given the price endogeneity.

In this case $\rho \in [0, 1)$. Regarding $\rho$, it is worth noting that its sign is determined by $E[u_i(-\nu_i + \epsilon_i)]$, such that negative (positive) correlation between $u_i$ and $\nu_i$ will produce a situation equivalent to positive (negative) correlation between $u_i$ and $\epsilon_i$.

Table D.3: Perceived Returns Estimates, Positively Selected Prices

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          2.932      0.989               [-0.483, 2.757]
                               (0.168)    (0.074)             N/A
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (4,3)      .          .                   [1.271, 4.889]
                                                              N/A
Observations                   10000      10000               10000

[Figure D.3. Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown over a grid of $\varphi$ on $[0, 1]$.]

D.4. Known, Negatively Selected Prices

In this simulation, I consider a DGP in which $u_i$ and $\epsilon_i$ are negatively correlated. This setting is a case of the one described in Section 3.2. I generate data according to

$$\begin{bmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{bmatrix} \sim N(0, \Sigma); \quad \Sigma = [\text{entries missing}], \tag{D.5}$$

where I include $\nu_i$ with a variance of zero to emphasize that there are no price misperceptions in this case.

Table D.4 shows the estimates for one simulation of this DGP using all three methods. Figure D.4 shows the distributions implied by the estimates for each method. The control function method estimates are close to the true parameter values, while the other methods perform poorly. In the case of the probit, there is nothing to address endogeneity, while the moment inequalities address positive correlation between prices and the composite idiosyncratic preference term $-\nu_i + \epsilon_i$, but not negative correlation.

Table D.4: Perceived Returns Estimates, Negatively Selected Prices

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          0.739      1.100               [-1.753, 2.654]
                               (0.034)    (0.088)             N/A
σ
σ_ζ
ρ                   -.5        .          -0.497              .
                                          (0.059)
(σ_ε, σ_ξ)          (3,2)      .          .                   [0.131, 0.922]
                                                              N/A
Observations                   10000      10000               10000

[Figure D.4. Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown over a grid of $\varphi$ on $[0, 1]$.]

D.5. Mean-reverting Misperceptions of Positively Selected Prices

In this simulation, I consider a DGP in which $u_i$ is positively correlated with $\epsilon_i$ and negatively correlated with $\nu_i$. This case would occur in a setting in which there is price discrimination on unobserved components of preferences, and agents are only aware of a subset of price determinants and form rational expectations based on known price determinants. I generate data according to

$$\begin{bmatrix} z_i \\ u_i \\ \nu_i \\ \epsilon_i \end{bmatrix} \sim N(0, \Sigma); \quad \Sigma = [\text{entries missing}]. \tag{D.6}$$

This setting corresponds to the one described in Section 3.2. This setting is likely the most realistic, insofar as mean-reverting price misperceptions and positive selection on prices are likely. In this case, Section 3.2 and Appendix C suggest that the control function method and the moment inequality method will consistently estimate perceived returns, but the probit will not.

Table D.5 shows the estimates for one simulation of this DGP using all three methods. Figure D.5 shows the distributions implied by the estimates for each method. The control function method estimates are close to the true parameter values, while the moment inequalities bound the true values. The probit estimates are biased away from zero, because variation in prices predicts relatively modest changes in investment, as not all price variation is known to agents and because price variation is accompanied by higher idiosyncratic preferences for investment.

Table D.5: Perceived Returns Estimates, Positively Selected Partially Known Prices

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          3.076      0.978               [-0.785, 2.740]
                               (0.173)    (0.073)             N/A
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (2,3)      .          .                   [1.107, 4.970]
                                                              N/A
Observations                   10000      10000               10000

[Figure D.5: Perceived Returns Distributions, Positively Selected Partially Known Prices. Notes: Estimated densities of perceived returns given by each method. Densities for each parameter vector in the moment inequalities' 95% confidence set are shown over a grid of $\varphi$ on $[0, 1]$.]

Next, I present two simulations which include additional explanatory variables. This exercise is intended to provide a computational comparison of the control function method and the moment inequality method, so they include the computation time taken to complete each procedure. These simulations use the DGP described in Section D.5 with the addition of the variable $x_1$ in the first simulation, and $x_1$ and $x_2$ in the second. I set $x_1 \sim N(0, 4)$ and $x_2 \sim N(0, 4)$ with coefficients of zero. The results are shown in Table D.6 and Table D.7, respectively, where graphs of implied perceived returns are omitted because they are visually indistinguishable from Figure D.5 (given the zero coefficients on the new variables). All simulations are performed on a Linux server with two Intel Xeon X5550 CPUs and 48GB of RAM. Note that the run times in seconds for the moment inequalities are orders of magnitude higher than the other methods for both simulations, and that this difference is increasing in the number of variables.

Table D.6: Perceived Returns Estimates, 1 Control

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          3.075      0.977               [-6.956, 8.029]
                               (0.173)    (0.073)             N/A
x_1
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (2,3)      .          .                   [1.106, 4.970]
                                                              N/A
Observations                   10000      10000               10000
Computation Time               0          2                   1017

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234. Computation time is rounded to the nearest whole second.

Table D.7: Perceived Returns Estimates, 2 Controls

                               (1)        (2)                 (3)
                    Target     Probit     Control Function    Moment Inequalities
Constant            1          3.076      0.977               [-14.890, 16.550]
                               (0.173)    (0.073)             N/A
x_1
x_2
σ
σ_ζ
ρ
(σ_ε, σ_ξ)          (2,3)      .          .                   [0.677, 4.970]
                                                              N/A
Observations                   10000      10000               10000
Computation Time               1          2                   21085

Notes: Standard errors in parentheses, corrected for the inclusion of estimated regressors following Murphy and Topel (1985) in the case of the control function. Parameters are in monetary units. Estimates relate to expressions (11), (16), and (A.11), respectively. The moment inequalities estimate bounds for $\sigma_\epsilon$ under the assumptions in Appendix A and $\sigma_\xi$ under those in Appendix C. All data is generated in Stata using random seed 1234. Computation time is rounded to the nearest whole second.
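For readers who want to experiment with the DGP in (D.1), the following is a minimal, standard-library Python sketch of the two-step control function idea compared in these tables: a first-stage regression of price on the instrument, followed by a probit of the investment decision on price and the first-stage residual, with the scale recovered from the price coefficient. It is not the author's Stata code. The covariance structure, seed, and helper names (`solve3`, `probit`) are assumptions chosen for illustration, so the output is not comparable to the tables above.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x

def probit(rows, y, iters=50):
    """Probit maximum likelihood by Fisher scoring for three regressors."""
    b = [0.0, 0.0, 0.0]
    for _ in range(iters):
        score = [0.0, 0.0, 0.0]
        info = [[0.0] * 3 for _ in range(3)]
        for x, s in zip(rows, y):
            idx = sum(bj * xj for bj, xj in zip(b, x))
            p = min(max(norm_cdf(idx), 1e-12), 1.0 - 1e-12)
            d = norm_pdf(idx)
            g = d * (s - p) / (p * (1.0 - p))   # score weight
            w = d * d / (p * (1.0 - p))         # expected information weight
            for j in range(3):
                score[j] += g * x[j]
                for k in range(3):
                    info[j][k] += w * x[j] * x[k]
        step = solve3(info, score)
        b = [bj + sj for bj, sj in zip(b, step)]
        if max(abs(sj) for sj in step) < 1e-8:
            break
    return b

# Simulate the DGP in (D.1) under an assumed (illustrative) error structure.
rng = random.Random(1234)
n = 10000
beta, delta = 1.0, 1.0
z, price, invest = [], [], []
for _ in range(n):
    z_i = rng.gauss(0.0, 2.0)
    u_i = rng.gauss(0.0, 1.0)
    nu_i = -0.25 * u_i + rng.gauss(0.0, 0.5)   # mean-reverting misperception
    eps_i = 0.25 * u_i + rng.gauss(0.0, 1.0)   # positively selected prices
    p_i = delta * z_i + u_i
    y_i = beta - (p_i + nu_i) + eps_i          # perceived return
    z.append(z_i)
    price.append(p_i)
    invest.append(1.0 if y_i >= 0.0 else 0.0)

# First stage: OLS of price on a constant and z; residuals are the control function.
z_bar, p_bar = sum(z) / n, sum(price) / n
d_hat = sum((a - z_bar) * (b - p_bar) for a, b in zip(z, price)) / sum((a - z_bar) ** 2 for a in z)
c_hat = p_bar - d_hat * z_bar
u_hat = [p - c_hat - d_hat * a for p, a in zip(price, z)]

# Second stage: probit on (1, price, u_hat); the price coefficient is -1/sigma_xi.
coef = probit([[1.0, p, u] for p, u in zip(price, u_hat)], invest)
sigma_xi_hat = -1.0 / coef[1]
theta_hat = coef[0] * sigma_xi_hat
rho_hat = coef[2] * sigma_xi_hat

print(round(theta_hat, 2), round(sigma_xi_hat, 2), round(rho_hat, 2))
```

Under the assumed error structure the targets are $\theta = 1$, $\rho = 0.5$, and $\sigma_\xi = \sqrt{1.25} \approx 1.12$, so the printed estimates should land near those values up to sampling error. Standard errors, which in the paper are corrected following Murphy and Topel (1985), are not computed here.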