Counterfactual and Welfare Analysis with an Approximate Model
Roy Allen
Department of Economics
University of Western [email protected]

John Rehbeck
Department of Economics
The Ohio State [email protected]

9, 2020
Abstract
We propose a conceptual framework for counterfactual and welfare analysis for approximate models. Our key assumption is that model approximation error is the same magnitude at new choices as the observed data. Applying the framework to quasilinear utility, we obtain bounds on quantities at new prices using an approximate law of demand. We then bound utility differences between bundles and welfare differences between prices. All bounds are computable as linear programs. We provide detailed analytical results describing how the data map to the bounds including shape restrictions that provide a foundation for plug-in estimation. An application to gasoline demand illustrates the methodology.

∗ We thank Victor Aguiar, Victor Aguirregabiria, Lars Hansen, Nail Kashaev, Lance Lochner, Nirav Mehta, Magne Mogstad, Ismael Mourifié, Salvador Navarro, Andres Santos, and participants at the Banff Empirical Microeconomics Workshop, the Western Conference on Counterfactuals with Economic Restrictions, the University of Chicago, and the University of Toronto for helpful comments.

Introduction
Models are generally viewed as approximations. A common intuition in empirical work is that conclusions of a model are robust to “small” amounts of approximation error. Unfortunately, this intuition does not apply to many standard frameworks. For example, when performing a revealed preference analysis [Varian, 1982], if a dataset is inconsistent with a model, then counterfactual predictions described by certain inequalities cross. Thus, if there is any violation of the model (no matter how “small”), then the model fails to generate coherent counterfactual or welfare statements.

Alternatively, one can formally acknowledge approximation error throughout the analysis. Rather than treat approximation error as nonexistent, one can place restrictions on the magnitude of the approximation error. This paper does so for counterfactual and welfare analysis, with the following assumption on this magnitude.
Assumption 1.
When making counterfactual predictions or measuring welfare changes, we assume the approximation error of the model on the counterfactual predictions is the same as the approximation error of the model on the observed dataset.
This assumption is a natural extension of the standard approach to generate counterfactual predictions that assumes that both the observed data and counterfactual predictions are consistent with a model. We present a framework in which a model can be used even though it is not exactly consistent with observed data. In particular, this paper assumes that the approximation error of the model on observed and unobserved situations has the same magnitude. We call the counterfactuals that are consistent with Assumption 1 adaptive counterfactuals because they adapt to approximation error present in the observed dataset. This assumption can be questioned, especially when the counterfactual setting is significantly different from the observed data, yet it provides a way to conduct counterfactual analysis taking approximation error seriously.

The conceptual framework of this paper is general and can be applied to different settings. In this paper, we formalize how to generate counterfactual predictions and

Footnote: Related concerns have been raised in the econometric literature on partial identification [Ponomareva and Tamer, 2011, Müller and Norets, 2016].

The most common criticism is that the model does not allow income effects. In addition, the model neglects dynamics, limited consideration, and peer effects, among many other omitted features. We present a framework in which one does not need to pick a single story why the baseline model is imperfect when generating counterfactual predictions on quantities. However, to make welfare comparisons we take a stand on the interpretation of the approximation error: the individual ranks bundles according to a quasilinear utility function but, for reasons we do not model explicitly, the choices do not exactly maximize the function.
Overall, the framework we propose permits many reasons why the baseline model is wrong, provided the approximation error is the same magnitude in the counterfactual setting.

Despite being viewed as an approximation, the quasilinear model is widely used. Examples include work in insurance choice [Einav et al., 2010, Bundorf et al., 2012, Tebaldi et al., 2018] and public health [Cohen et al., 2010]. In addition, the quasilinear structure is closely related to a large class of latent utility models (e.g. McFadden [1981], Allen and Rehbeck [2019a]), and so the insights of this paper are directly relevant beyond a setting with just prices and quantities. In particular, many latent utility models used in applied work involve characteristics other than prices that shift the desirability of goods but not the budget constraint.

We now describe the framework in more detail. We begin by studying counterfactual bounds for the demand of goods at new prices. We construct the counterfactual bounds by looking for the maximal and minimal demand for each good in the pres-

Footnote: A notable exception that studies the approximation error explicitly is Willig [1976].

Footnote: Our analysis is also relevant for specifications in which latent utilities depend on a nonlinear function of prices. For example, Berry et al. [1995] specifies that the utility of alternative $j$ depends on several observables including a term $\ln(m - p_j)$, where $p_j$ is the price of good $j$ and $m$ is income. This is a quasilinear model in the variable $\tilde p_j = \ln(m - p_j)$.

prices.
Taking into account these wedges, we present bounds on differences in utility over consumption bundles and robust consumer surplus bounds involving price changes. We present computational results for both bounds, as well as analytical results designed to interpret specifically how the data are used to measure welfare changes. These bounds generalize existing work in several directions: first, and most importantly, they are valid with approximation error; second, they apply to finite datasets rather than requiring demand functions; third, the bounds apply to the (approximate) indirect utility at a new price without needing to first bound the quantity at that price. In particular, we show that the bounds on (approximate) indirect utility are a generalization of the standard integral definition of consumer surplus and we

Footnote: We thus complement the core analysis of Bernheim and Rangel [2009], which focuses on recovering ordinal information on preferences over consumption bundles and does not distinguish between these wedges.

Finally, we establish several convexity properties describing how quantities data map to the bounds. To our knowledge these results are all new even under correct specification for quasilinear utility.

Our analysis also establishes that the bounds satisfy a key continuity property: as the degree of approximation error limits to 0, our analysis limits to the analysis under correct specification. In fact, we show a stronger property that the counterfactual and approximate indirect utility bounds are jointly continuous when viewed as a function of the quantities in the data and the degree of approximation error. This facilitates plug-in estimation of the bounds in which we replace true quantities with estimated quantities. This is needed to cover our empirical application in which we apply the framework with data on gasoline purchases used previously in Blundell et al. [2012]. The data is a single cross section, and we pre-process the data as in Blundell et al. [2012] by kernel smoothing. We conduct a representative agent analysis with quantities (conditional means) estimated from the kernel smoothed data. Like Blundell et al. [2012], we find that for several natural choices of the bandwidth,

Footnote: More specifically, convexity holds for the money metric version of the utility function. In general one can only obtain quasiconvexity.

Footnote: The closest work appears to be a computational approach to describing bounds for models related to quasilinear utility, without describing detailed shape restrictions [Chiong et al., 2017, Tebaldi et al., 2018, Allen and Rehbeck, 2019a].
This paper is part of a long literature that uses the revealed preference approach to do counterfactual and welfare analysis. The primary model used is the general model of utility maximization subject to a budget constraint, whose empirical content has been characterized in Afriat [1967], Diewert [1973], and Varian [1982]. Recent econometric work considering counterfactual or welfare bounds includes Blundell et al. [2003], Blundell et al. [2008], Blundell et al. [2012], Blundell et al. [2014], Hoderlein and Stoye [2015], Kline and Tartari [2016], Blundell et al. [2017], Cosaert and Demuynck [2018], Aguiar and Kashaev [2018], Adams [2019], Cherchye et al. [2019], and Kitamura and Stoye [2019]. Several proposals have been made to assess the fit of a model using revealed preference tools, including Afriat [1973], Houtman and Maks [1985], Varian [1990], and Echenique, Lee, and Shum [2011]. Other papers outside of the revealed preference literature that discuss fit of an approximate model include Kydland and Prescott [1982], Vuong [1989], and Hansen and Jagannathan [1997]. The primary way in which we differ from existing work is that we use a measure of fit to adjust bounds on counterfactuals and welfare. In addition, relative to the general model with income effects, which has been the focus of the revealed preference literature, we conduct counterfactual analysis fixing prices at a new value without also fixing expenditure.

Footnote: See Allen and Rehbeck [2019b] for additional references and discussion of units.

Footnote: Work that uses revealed preference techniques to go beyond measuring the fit of the model includes Varian [1990], Halevy et al. [2018], and Gauthier [2019], which study parameter recoverability. See also Chetty [2012] for a related approach.

a priori.

The goal of this paper is to provide a framework where a researcher begins with a baseline model that is taken seriously as an approximation.
Since the model is an approximation, the researcher does not expect all data to be consistent with the model. Nonetheless, the researcher may want to use the model for counterfactual and welfare analysis. We present an adaptive framework for this by operationalizing Assumption 1 for the quasilinear utility model. Recall that Assumption 1 maintains that a researcher considers counterfactuals that are “no worse” than the observed data. To do this, we enlarge the baseline model to fit the data, which introduces a measurement wedge and a prediction wedge. The measurement wedge concerns limits on what an analyst can learn due to approximation error for objects defined in the existing dataset (e.g. utility functions). The prediction wedge describes limits on what can be said in new settings where the model may not be perfect (e.g. counterfactual quantities). We formalize Assumption 1 by making these wedges as small as possible while still fitting the observed data, using the notion of approximation error from Allen and Rehbeck [2020]. We describe the framework in more detail below.

We formalize the baseline model of quasilinear utility. A consumption bundle $(x, y) \in \mathbb{R}^K_+ \times \mathbb{R}$ is evaluated according to $u(x) + y$, where $u : \mathbb{R}^K_+ \to \mathbb{R}$ is a utility function over bundles $x$. The numeraire good is given by $y$ and has a price of one. Given prices $p \in \mathbb{R}^K_{++}$ and income $I \in \mathbb{R}$, decisions in a quasilinear utility model follow

$$\max_{x \in \mathbb{R}^K_+,\, y \in \mathbb{R}} u(x) + y \quad \text{s.t.} \quad p \cdot x + y \leq I \quad \iff \quad \max_{x \in \mathbb{R}^K_+} u(x) + I - p \cdot x,$$

where consumption of the numeraire good is allowed to be negative for unobserved borrowing. We study the quasilinear utility model since it is regularly used in applications described in the Introduction, including adaptations to handle non-price characteristics. In addition, it has a tractable notion of welfare in terms of units of the numeraire.

Now we present an enlargement of the baseline model that relaxes the assumption of exact maximization to a notion of approximate optimization.
Because we focus on the empirical analysis, we define the enlargement in terms of a finite dataset of the form $\{(x^t, p^t)\}_{t=1}^T$. There are $T$ observations, quantities are weakly positive, $x^t \in \mathbb{R}^K_+$, and prices are strictly positive, $p^t \in \mathbb{R}^K_{++}$. Importantly, quantities can be discrete or continuous, and 0 quantities are permitted in this framework.

Footnote: Allowing negative expenditure also avoids boundary issues for chosen consumption bundles.

Footnote: Allen and Rehbeck [2019a] show many applications including the additive random utility model [McFadden, 1981] are quasilinear models with utility indices playing the role of prices.

Definition 1. A dataset $\{(x^t, p^t)\}_{t=1}^T$ is $\varepsilon$-rationalized by quasilinear utility for $\varepsilon \geq 0$ if there exists a utility function $u : \mathbb{R}^K_+ \to \mathbb{R}$ such that for all $t \in \{1, \ldots, T\}$ and for all $x \in \mathbb{R}^K_+$, the following inequality holds:

$$u(x^t) - p^t \cdot x^t \geq u(x) - p^t \cdot x - \varepsilon.$$

When $\varepsilon$ equals zero, we say the dataset is quasilinear rationalized.

The value $\varepsilon$ is in the same units as the price of the numeraire good, e.g. dollars per time period. When $\varepsilon > 0$, the observed bundles are within $\varepsilon$ dollars of the maximum utility possible at a given price. One interpretation is that $\varepsilon$ captures “unstructured” deviations from the quasilinear utility model (cf. Chetty [2012], Hansen and Sargent [2018]), without a single interpretation of the nature of the deviations. Instead, the magnitude of the deviations is controlled. This interpretation can be pursued when making counterfactual predictions. In contrast, to make welfare predictions an interpretation of the model is crucial. When discussing welfare, we follow Allen and Rehbeck [2020] and interpret the value $\varepsilon$ as a level of satisficing in the spirit of Simon [1947]. In this case, a higher value of $\varepsilon$ means there is a larger set of consumption bundles that are “good enough” to be chosen.

When using the model for counterfactual or welfare analysis, a measurement wedge and a prediction wedge arise. These concepts will become clearer when we turn to specific analysis below, but we first provide an overview. When observed data is not exactly consistent with a quasilinear utility model, the econometrician knows there is no utility function that rationalizes the entire dataset, so at some observation the quantity is not optimal. Thus, if a researcher still wants to use the quasilinear model even when data is inconsistent with the baseline quasilinear model, then there is a measurement wedge when trying to recover information about candidate utility functions and indirect utility. Second, even after the econometrician has a set of candidate utility functions that match the original dataset, the econometrician cannot know that counterfactual choices will exactly maximize a candidate utility function. Thus, there is a prediction wedge when forecasting even after recovering information on the utility function.

We now discuss how to formalize Assumption 1 for the approximate quasilinear utility model in relation to the measurement wedge and prediction wedge.
Let $\varepsilon_M$ denote the measurement wedge and $\varepsilon_P$ denote the prediction wedge. In principle, the two wedges may not be the same, but Assumption 1 allows us to treat these as equal. To further formalize Assumption 1, we introduce $\varepsilon^*$ as the smallest value of $\varepsilon$ such that the dataset is $\varepsilon^*$-rationalized by quasilinear utility. We also refer to $\varepsilon^*$ as the level of approximation error or approximation error of the quasilinear utility model for the observed dataset.

Proposition 1 (Allen and Rehbeck [2020]). Let $\varepsilon^* \geq 0$ be the smallest value such that for all $\varepsilon \geq \varepsilon^*$ the dataset $\{(x^t, p^t)\}_{t=1}^T$ is $\varepsilon$-rationalized by quasilinear utility. The value $\varepsilon^*$ exists and is obtained by a linear program.

We note that $\varepsilon^*$ is a function of the dataset to a number, so for a dataset $D = \{(x^t, p^t)\}_{t=1}^T$ we can write $\varepsilon^*(D)$. When we discuss only a single dataset, we typically drop dependence on $D$. The value $\varepsilon^*$ will be used in our framework to place restrictions on the magnitude of the measurement and prediction wedges. Setting $\varepsilon_M \geq \varepsilon^*$ formalizes that the measurement wedge is large enough to explain the data we have seen. Similarly, setting $\varepsilon_P \geq \varepsilon^*$ formalizes that the model is no better at predicting in new settings than the data we have seen. We make these bounds as tight as possible, and formalize Assumption 1 for this setting as follows.

Assumption 1. When performing counterfactual analysis, the measurement wedge, prediction wedge, and approximation error of the model are equal, $\varepsilon_M = \varepsilon_P = \varepsilon^*$.

This is a direct generalization of the standard approach to counterfactual and welfare analysis, which sets $\varepsilon_M = \varepsilon_P = 0$. The conceptual framework of the standard approach only applies to models that perfectly fit the data, which translates to $\varepsilon^* = \varepsilon_M = \varepsilon_P$. For notational convenience, we will let $\varepsilon$ denote the common value. Assumption 1 is the special case where $\varepsilon = \varepsilon^*$.

Footnote: See Appendix B for additional discussion on this case and a more formal treatment of the measurement and prediction wedges.

Counterfactuals
For the quasilinear utility model with approximation error that does not exceed $\varepsilon$, we consider sharp counterfactual bounds. More formally, we consider when the measurement and prediction wedges are both equal to a single value $\varepsilon$. We impose $\varepsilon = \varepsilon^*$ to implement Assumption 1 and describe some properties of the counterfactual bounds. In Section 3.1, we provide graphical intuition for the bounds. In Section 3.2, we describe how to compute bounds on quantities fixing a price. In Section 3.3 we describe additional restrictions that can be imposed to tighten the bounds, such as a priori bounds on expenditure at a new price.

To that end, for notational convenience, let $D = \{(x^t, p^t)\}_{t=1}^T$ denote the observed dataset. Suppose we have a candidate quantity-price tuple $(\tilde x, \tilde p)$. We can add this to the original dataset to form an augmented dataset $D \cup (\tilde x, \tilde p)$. We consider candidates such that the approximation error of the augmented dataset is bounded by $\varepsilon$. In particular, the set of consistent demands and prices for the level of approximation error $\varepsilon$ is given by

$$C(D, \varepsilon) = \left\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid \varepsilon^*(D \cup (\tilde x, \tilde p)) \leq \varepsilon \right\}.$$

To check whether a candidate tuple $(\tilde x, \tilde p)$ is in $C(D, \varepsilon)$, one can calculate $\varepsilon^*$ for the augmented dataset using Proposition 1. If this measure of approximation error for the augmented dataset is at most $\varepsilon$, then the candidate tuple is in the set $C(D, \varepsilon)$.

Our framework imposes Assumption 1 to generate counterfactual predictions assuming the level of approximation error does not get worse. This amounts to setting $\varepsilon$ equal to the approximation error of the observed dataset. In particular, we focus on the adaptive counterfactual set

$$AC(D) = \left\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid \varepsilon^*(D \cup (\tilde x, \tilde p)) \leq \varepsilon^*(D) \right\} = C(D, \varepsilon^*(D)).$$

We collect some facts about $AC(\cdot)$ and $C(\cdot)$.

Fact 1 (Constant Approximation Error). For any $(\tilde x, \tilde p) \in AC(D)$, we have $\varepsilon^*(D \cup (\tilde x, \tilde p)) = \varepsilon^*(D)$.
Thus, when a candidate observation in $AC(D)$ is added to $D$, the measure of approximation stays the same. This follows from the construction of $AC(D)$. This equality does not hold for all measures of model approximation error. For example, if we had chosen to take $\varepsilon^*$ divided by the number of observations $T$ as the measure of approximation error, then Fact 1 would not hold in general since the measure of approximation error for the augmented dataset would divide by $T + 1$.

Fact 2 (Monotonicity). If $\varepsilon < \varepsilon'$, then $C(D, \varepsilon) \subseteq C(D, \varepsilon')$.

Higher values of $\varepsilon$ correspond to less informative counterfactual predictions. This follows from the fact that if a dataset is $\varepsilon$-rationalized by quasilinear utility, then it is also $\varepsilon'$-rationalized for $\varepsilon < \varepsilon'$.

Fact 3 (Nonemptiness). $C(D, \varepsilon)$ is nonempty if and only if $\varepsilon \geq \varepsilon^*(D)$.

This states that the observed data places a lower bound on the minimal amount of approximation error needed to conduct counterfactual analysis. If $\varepsilon \geq \varepsilon^*$, nonemptiness of $C(D, \varepsilon)$ is guaranteed by considering $\tilde p$ sufficiently high along each dimension and $\tilde x = 0$. Alternatively, for the dataset $D$, when $\varepsilon < \varepsilon^*$ even observations within the dataset cannot be $\varepsilon$-quasilinear rationalized.

Fact 4 (Minimality). $AC(D)$ is obtained from the smallest $\varepsilon$ such that $C(D, \varepsilon)$ is nonempty.

This formalizes that setting $\varepsilon = \varepsilon^*$ for counterfactual values obtains the sharpest restrictions under Assumption 1 subject to the constraint that counterfactuals are nonempty. This follows from the previous facts. To perform a sensitivity analysis, one could examine any $\varepsilon > \varepsilon^*$ and use $C(D, \varepsilon)$ as the counterfactual set. Our framework allows this yet focuses on $\varepsilon = \varepsilon^*$.

To gain intuition on the “shape” of the counterfactual sets $C(D, \varepsilon)$ and $AC(D)$, we present a graphical description of the restrictions on counterfactuals. For exposition we focus on some of the restrictions rather than all of them. First we describe a restriction that must hold for a dataset to be $\varepsilon$-rationalized. At price $p^r$ we must have

$$u(x^r) - p^r \cdot x^r \geq u(x^s) - p^r \cdot x^s - \varepsilon$$

for some unknown function $u$. This states that $x^s$ cannot be much better than $x^r$ at price $p^r$. Flipping the role of observations $r$ and $s$ gives a second inequality; adding the two cancels the utility terms, and basic algebra yields

$$\frac{1}{2} (p^s - p^r) \cdot (x^s - x^r) \leq \varepsilon. \quad (1)$$

This is a multivariate approximate law of demand. The usual multivariate law of demand obtains when $\varepsilon = 0$. For a given value $\varepsilon \geq 0$, this also places restrictions on counterfactual demand $\tilde x$ at prices $\tilde p$ since for any $r \in \{1, \ldots, T\}$, a potential counterfactual tuple must satisfy

$$\frac{1}{2} (\tilde p - p^r) \cdot (\tilde x - x^r) \leq \varepsilon. \quad (2)$$

When we apply Assumption 1, we evaluate counterfactuals at $\varepsilon = \varepsilon^*$. This inequality places a restriction on candidate quantity-price tuples when compared with any observation in the dataset. In the one-dimensional case ($K = 1$), if the price increases from $p^r$ to $\tilde p$, then demand cannot increase by too much. The bound on the increase in quantities is inversely related to the magnitude of the price increase. That is, when $\tilde p - p^r > 0$,

$$\tilde x \leq x^r + \frac{2 \varepsilon}{\tilde p - p^r}.$$

[Figure 1: Restrictions of Law of Demand. Panel (a): Law of Demand. Panel (b): Approximate Law of Demand. Axes: quantity x and price p.]

We illustrate these bounds in two example datasets displayed in Figure 1. Each dataset has four observations represented as black dots. The gray area denotes the set of quantity-price tuples $(\tilde x, \tilde p)$ that are consistent with the existing dataset with minimal level of approximation error $\varepsilon = \varepsilon^*$. In panel (a), the observed dataset is exactly consistent with quasilinear utility and the counterfactual set has $\varepsilon = 0$. In this case, the constructed bounds have the property that when price increases quantity cannot increase. This leads to the “rectangular” bounds in panel (a). Note that for low values of prices, quantity has a lower bound but not an upper bound. Similarly, when prices are higher than any observed data the lower bound on counterfactual demand is zero.

In panel (b), the dataset is not consistent with quasilinear utility, because there is an instance in which price goes up and quantity goes up. Here we graphically obtain the counterfactual restrictions using the approximate law of demand constructed from Equation 2, setting $\varepsilon = \varepsilon^*$. This approach leads to “hyperbolic” bounds, in contrast with the rectangular bounds in panel (a). The fact that $\varepsilon = \varepsilon^*$ is the minimal approximation error needed to rationalize the data is demonstrated on the graph by two points touching dashed hyperbolas.

The sets $C(\cdot)$ and $AC(\cdot)$ completely describe counterfactuals. An analyst may not be interested in the entire set of counterfactual quantity-price tuples, but rather certain features of it. For example, an analyst may only be interested in quantities at a fixed counterfactual price $\tilde p \in \mathbb{R}^K_{++}$ allowing approximation error $\varepsilon$. This set may be written

$$X(\tilde p, D, \varepsilon) = \left\{ \tilde x \in \mathbb{R}^K_+ \mid (\tilde x, \tilde p) \in C(D, \varepsilon) \right\}.$$

Our first question is when this set is nonempty, i.e. when can we conduct counterfactual analysis.
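As a rough illustration of the approximate law of demand in Equations 1 and 2, the following Python sketch computes the largest pairwise violation in a dataset and screens a candidate quantity-price tuple against every observation. The function names and the list-of-pairs data layout are illustrative assumptions, not part of the paper. The pairwise violation is only a lower bound on $\varepsilon^*$, and passing the screen is necessary but not sufficient for membership in $C(D, \varepsilon)$, since the general framework uses additional inequalities.

```python
def half_cross_term(x_a, p_a, x_b, p_b):
    """Return (1/2) * (p_b - p_a) . (x_b - x_a), the left side of Equations 1 and 2."""
    return 0.5 * sum(
        (pb - pa) * (xb - xa) for xa, pa, xb, pb in zip(x_a, p_a, x_b, p_b)
    )


def pairwise_eps_lower_bound(data):
    """Largest pairwise violation of the law of demand (hypothetical helper).

    `data` is a list of (x, p) pairs of equal-length tuples. Equation 1 is a
    necessary condition for epsilon-rationalization, so the returned value is
    a lower bound on epsilon*, not epsilon* itself.
    """
    worst = 0.0
    for i, (x_r, p_r) in enumerate(data):
        for x_s, p_s in data[i + 1:]:
            worst = max(worst, half_cross_term(x_r, p_r, x_s, p_s))
    return worst


def passes_pairwise_screen(x_new, p_new, data, eps):
    """Check Equation 2 for a candidate (x_new, p_new) against every observation."""
    return all(
        half_cross_term(x_r, p_r, x_new, p_new) <= eps for x_r, p_r in data
    )


# One good, two observations that obey the law of demand.
consistent = [((3.0,), (1.0,)), ((1.0,), (2.0,))]
# One good, price and quantity both rise between the observations.
violating = [((1.0,), (1.0,)), ((2.0,), (2.0,))]

print(pairwise_eps_lower_bound(consistent))
print(pairwise_eps_lower_bound(violating))
print(passes_pairwise_screen((4.0,), (1.5,), consistent, 0.0))
print(passes_pairwise_screen((4.0,), (1.5,), consistent, 0.25))
```

In the usage lines, the candidate quantity 4 at price 1.5 fails the screen at $\varepsilon = 0$ because quantity rises with price relative to the first observation, and passes once $\varepsilon = 0.25$ is allowed.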
Proposition 2.
For a dataset $D$ and counterfactual price $\tilde p$, the set $X(\tilde p, D, \varepsilon)$ is nonempty if and only if $\varepsilon \geq \varepsilon^*$. Moreover, when $\varepsilon \geq \varepsilon^*$ there is a concave, strictly increasing, continuous utility function $u : \mathbb{R}^K_+ \to \mathbb{R}$ that $\varepsilon$-rationalizes the dataset and has an exact maximizer for each $p \in \mathbb{R}^K_{++}$.

Footnote: There are additional restrictions beyond Equation 2; here we provide a graphical illustration but the general framework uses additional inequalities.

This shows that by allowing enough approximation error, we can find counterfactual quantities for any price. This is stronger than Fact 3 because it gives nonemptiness of the counterfactual quantity set for any price. Existing work has studied when observed datasets can be rationalized by quasilinear utility (Brown and Calsamiglia [2007] for $\varepsilon = 0$ [...] $\varepsilon \geq 0$).

We now discuss additional properties of $X(\tilde p, D, \varepsilon)$.

Proposition 3.
For a dataset $D = \{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. The set $X(\tilde p, D, \varepsilon)$ is a closed, convex polyhedron. In particular, $\tilde x \in X(\tilde p, D, \varepsilon)$ if and only if the inequalities

$$(\tilde p - p^{t_M}) \cdot \tilde x \leq (M + 1) \varepsilon + \tilde p \cdot x^{t_1} - p^{t_M} \cdot x^{t_M} - \sum_{m=1}^{M-1} p^{t_m} \cdot (x^{t_m} - x^{t_{m+1}}) \quad (3)$$

hold for all finite sequences $\{t_m\}_{m=1}^M$ without cycles where $t_m \in \{1, \ldots, T\}$ and $M \geq 1$.

When $M = 1$, the inequalities in Equation 3 yield the approximate law of demand described in Equation 2. In this case we compare an observation in the dataset with a conjectured counterfactual tuple $(\tilde x, \tilde p)$, which leads to two instances of $\varepsilon$ in Equation 3, just like the approximate law of demand. Proposition 3 shows there are other restrictions imposed on counterfactuals beyond the law of demand by considering more than one observation at a time ($M \geq 2$).

Footnote: By strictly increasing we mean the usual definition, i.e. if each component of $x$ is weakly greater than each component of $z$, then $u(x) \geq u(z)$, and if in addition some component of $x$ is strictly greater than the corresponding component of $z$, then $u(x) > u(z)$.

Footnote: See the proof of Proposition 13. See also Aguiar et al. [2020] for recent work concerning emptiness of counterfactual sets when using the weak axiom of revealed preference.

$k$-th good at a price $\tilde p$, allowing up to $\varepsilon$ approximation error. These bounds are extrema of $X(\tilde p, D, \varepsilon)$ along the $k$-th dimension. That is, they are the extreme points of the set

$$X_k(\tilde p, D, \varepsilon) = \left\{ x_k \in \mathbb{R}_+ \mid \text{there is some } \tilde x \in X(\tilde p, D, \varepsilon) \text{ with } x_k = \tilde x_k \right\}.$$

The following proposition discusses the bounds for the $k$-th good. In particular, when the bounds exist they can be computed by a linear program and the bounds satisfy monotonicity properties with respect to the approximation error $\varepsilon$.

Proposition 4.
For a dataset $D$, let $\varepsilon \geq \varepsilon^*$. The bounds

$$\overline{x}_k(\tilde p, \varepsilon) = \sup_{x_k \in X_k(\tilde p, D, \varepsilon)} x_k, \qquad \underline{x}_k(\tilde p, \varepsilon) = \inf_{x_k \in X_k(\tilde p, D, \varepsilon)} x_k$$

can each be computed as a linear program whenever they are finite. Under Assumption 1 ($\varepsilon = \varepsilon^*$), these bounds cannot be improved.

The details on the linear program to compute bounds are found in Proposition A.1 of Appendix A. Recall $X_k(\tilde p, D, \varepsilon)$ is convex from Proposition 3. Thus, any quantity between $\underline{x}_k(\tilde p, \varepsilon)$ and $\overline{x}_k(\tilde p, \varepsilon)$ is a candidate counterfactual quantity for good $k$.

Next we elaborate on when these bounds are finite. We show the lower bound is always finite but the upper bound is finite only when prices are sufficiently high. To formalize this define the upper comprehensive convex hull of a finite set $\{z^\ell\}_{\ell=1}^L$ as

$$\mathrm{CCo}(\{z^\ell\}_{\ell=1}^L) = \left\{ z \in \mathbb{R}^K \;\middle|\; z \geq \sum_{\ell=1}^L \alpha_\ell z^\ell \text{ for some nonnegative } \alpha_1, \ldots, \alpha_L \text{ such that } \sum_{\ell=1}^L \alpha_\ell = 1 \right\}.$$

The inequality in the definition here is componentwise. In addition, let $\operatorname{int} A$ denote the interior of a set $A$.

Proposition 5.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. The upper bound $\overline{x}_k(\tilde p, \varepsilon)$ is finite if and only if $\tilde p \in \operatorname{int} \mathrm{CCo}(\{p^t\}_{t=1}^T)$. The lower bound $\underline{x}_k(\tilde p, \varepsilon)$ is always finite. The upper bound $\overline{x}_k(\tilde p, \varepsilon)$ is weakly increasing in $\varepsilon$ and the lower bound $\underline{x}_k(\tilde p, \varepsilon)$ is weakly decreasing in $\varepsilon$.

Finally, we show that for one good ($K = 1$), the bounds on demand are downward sloping in own-price.

Proposition 6 (Univariate Monotonicity). For a dataset $D$, let $\varepsilon \geq \varepsilon^*$ and suppose there is a single good ($K = 1$). For any pair of prices $\tilde p', \tilde p'' \in \mathbb{R}_{++}$, it follows that

$$\left( \overline{x}(\tilde p'', \varepsilon) - \overline{x}(\tilde p', \varepsilon) \right) (\tilde p'' - \tilde p') \leq 0$$

and

$$\left( \underline{x}(\tilde p'', \varepsilon) - \underline{x}(\tilde p', \varepsilon) \right) (\tilde p'' - \tilde p') \leq 0.$$

When $\varepsilon = 0$, the dataset satisfies the exact law of demand. When $\varepsilon^* > 0$ (and $K = 1$), there is some pair $r, s \in \{1, \ldots, T\}$ that violates the law of demand, so

$$(x^r - x^s)(p^r - p^s) > 0.$$

Proposition 6 shows that while such violations can occur in the data, the bounds themselves satisfy the exact law of demand.
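For a single good, each inequality in Equation 3 bounds the candidate quantity from one side, so the bounds of Proposition 4 can be sketched by brute-force enumeration rather than a linear program. The Python sketch below assumes $K = 1$, a small number of observations (it enumerates every sequence without repeated observations, which grows factorially in $T$), and that the caller supplies some $\varepsilon \geq \varepsilon^*$. The function name is hypothetical and this is not the linear program of Proposition A.1.

```python
from itertools import permutations


def quantity_bounds_1d(prices, quantities, p_new, eps):
    """Return (lower, upper) bounds on demand at p_new for a single good.

    Each sequence of distinct observations gives one inequality of the form
    (p_new - p_last) * x <= rhs, as in Equation 3. Assumes eps >= epsilon*.
    """
    T = len(prices)
    lower, upper = 0.0, float("inf")  # quantities are nonnegative
    for M in range(1, T + 1):
        for seq in permutations(range(T), M):
            first, last = seq[0], seq[-1]
            rhs = (
                (M + 1) * eps
                + p_new * quantities[first]
                - prices[last] * quantities[last]
            )
            rhs -= sum(
                prices[seq[m]] * (quantities[seq[m]] - quantities[seq[m + 1]])
                for m in range(M - 1)
            )
            slope = p_new - prices[last]
            if slope > 0:
                upper = min(upper, rhs / slope)
            elif slope < 0:
                lower = max(lower, rhs / slope)
            # slope == 0 puts no restriction on the candidate quantity
    return lower, upper


# Two observations that obey the law of demand: (x, p) = (3, 1) and (1, 2).
prices = [1.0, 2.0]
quantities = [3.0, 1.0]
print(quantity_bounds_1d(prices, quantities, 1.5, 0.0))
print(quantity_bounds_1d(prices, quantities, 1.5, 0.25))
print(quantity_bounds_1d(prices, quantities, 3.0, 0.0))
```

With $\varepsilon = 0$ the sketch returns the rectangular bounds implied by the exact law of demand, a larger $\varepsilon$ widens them, and the upper bound falls as the counterfactual price rises, in line with Propositions 5 and 6.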
Remark. If an analyst is unsure what is a sensible choice of $\varepsilon$ (other than the requirement $\varepsilon \geq \varepsilon^*$), then it is possible to perform sensitivity analysis of $\overline{x}_k(\tilde p, \varepsilon)$ and $\underline{x}_k(\tilde p, \varepsilon)$ as $\varepsilon$ varies. A specific question is the largest amount of approximation error in which one can still bound the quantity of the $k$-th good by a pre-specified value, e.g.

$$\sup \left\{ \varepsilon \geq \varepsilon^* \mid \overline{x}_k(\tilde p, \varepsilon) \leq q_k \right\}.$$

This bound is related to the analysis of breakdown frontiers of Masten and Poirier [2019], which involve the weakest assumptions under which one can reach a conclusion. Here, weakest assumption translates to most approximation error.
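Because the upper bound is weakly increasing in $\varepsilon$ (Proposition 5), the supremum in this remark can be approximated by bisection. The Python sketch below assumes a user-supplied function `upper_bound(eps)`, a search cap `eps_max`, and a tolerance `tol`, all of which are hypothetical choices made for illustration. The toy bound in the usage lines is the one-good pairwise bound implied by Equation 2, not the sharp bound of Proposition 4.

```python
def largest_eps(upper_bound, eps_star, q_cap, eps_max, tol=1e-9):
    """Approximate sup{eps in [eps_star, eps_max] : upper_bound(eps) <= q_cap}.

    Relies only on upper_bound being weakly increasing in eps. Returns None
    when the conclusion already fails at eps_star, and eps_max when the cap
    is never binding on the searched interval.
    """
    if upper_bound(eps_star) > q_cap:
        return None
    if upper_bound(eps_max) <= q_cap:
        return eps_max
    lo, hi = eps_star, eps_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_bound(mid) <= q_cap:
            lo = mid  # conclusion still holds at mid
        else:
            hi = mid  # conclusion breaks down at mid
    return lo


def toy_upper_bound(eps, x_r=3.0, p_r=1.0, p_new=1.5):
    """Pairwise one-good bound x_r + 2 * eps / (p_new - p_r), valid for p_new > p_r."""
    return x_r + 2.0 * eps / (p_new - p_r)


print(largest_eps(toy_upper_bound, 0.0, 4.0, 1.0))
```

In the usage line the toy bound equals $3 + 4\varepsilon$, so the conclusion that quantity is at most 4 survives up to roughly $\varepsilon = 0.25$.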
Remark. It is straightforward to generalize Proposition 4 to bound certain linear combinations of the candidate demand vector $\tilde x$. Bounds on such linear combinations may be computed as the value of a linear programming problem. One interesting linear combination is $\tilde p \cdot \tilde x$, which is the expenditure on the $K$ goods. Sharp bounds on general functionals $f(\tilde x)$ can also be described as the value of a constrained optimization problem. For example, an upper bound is given by

$$\sup_{\tilde x \in X(\tilde p, D, \varepsilon)} f(\tilde x).$$

Recall that Proposition 3 states this constraint set is a closed convex polyhedron. This can facilitate computation though we do not formally study computation for general $f$.

Additional assumptions can tighten the bounds on quantities in Proposition 4. For example, one can assume that expenditure is the same at the counterfactual value as the last period of data, so $\tilde p \cdot \tilde x = p^T \cdot x^T$. Alternatively, one could place bounds on the expenditures so that $\underline{m} \leq \tilde p \cdot \tilde x \leq \overline{m}$. One may also impose a priori bounds on the quantities of other goods. These bounds can considerably shrink the counterfactual set, especially when there are multiple goods. In addition, computation with these additional restrictions is not challenging because these are inequality constraints that can be appended to the original linear program. When adding these additional constraints, however, it is possible that the counterfactual set can be empty.

We emphasize that in general, such expenditure bounds are not needed to deliver nontrivial counterfactual bounds. It is helpful to contrast our approach with the general model of utility maximization subject to a budget constraint, with preferences that need not be quasilinear. In the general model, even under correct specification the sharp bounds on quantities of each good at a given price are the trivial bounds $[0, \infty)$, unless the analyst places a priori bounds on expenditure at the new price.
This is because the general model does not rule out expenditure of 0 or arbitrarily high values at counterfactuals when we only fix prices.

Footnote: The bounds $[0, \infty)$ are for bounding one good at a time (similar to $\underline{x}_k$ and $\overline{x}_k$ above). There are nontrivial restrictions on the entire demand tuple.

Footnote: The results in Deb et al. [2018] can be used to show nontrivial bounds are possible in the general model when income is always the same value (inside and outside the dataset) and there is an unobserved good whose price is fixed.

Welfare
To study welfare, we must take a stand on the interpretation of approximation error. For this section, we follow Allen and Rehbeck [2020] and treat the approximation error as arising from satisficing in the spirit of Simon [1947]. In particular, an individual has a utility function that describes the ranking over goods, but satisfices by choosing bundles that are “good enough.”

We now discuss how satisficing relates to the measurement and prediction wedges. When trying to learn about utility from data, a measurement wedge arises since observed choices may not be optimal. When trying to predict welfare for a price change, the prediction wedge occurs since we only know the region of bundles that are “good enough.” Assumption 1 means that the measurement and prediction wedges are the same size as the smallest amount of satisficing needed to describe the data. We note that one can also apply the satisficing interpretation to counterfactual quantities, but it is not necessary. For this reason we did not distinguish between these wedges in Section 3. Appendix B provides additional discussion.

Since we are studying quasilinear utility there are two natural welfare objects. We look at differences in utility over consumption bundles and differences in (approximate) indirect utility over prices. An important asymmetry arises: learning about differences in utility only involves the measurement wedge, because it does not involve choices in new situations. In contrast, differences in (approximate) indirect utility over prices involve both the measurement wedge and the prediction wedge, because one must consider choices in new settings. We elaborate more below.

Our first goal is to learn about the unknown utility function over consumption bundles using data. This is helpful when considering policies involving the direct distribution of goods.

In general, there is a collection of utility functions that can $\varepsilon$-rationalize a dataset $\{(x^t, p^t)\}_{t=1}^T$.
We study bounds on utility differences between consumption bundles. Specifically, given two consumption bundles $\tilde{x}^1, \tilde{x}^0 \in \mathbb{R}^K_+$ we consider the upper and lower bounds
$$\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) = \sup_{\{u \,\mid\, u \ \varepsilon\text{-rationalizes } \{(x^t, p^t)\}_{t=1}^T\}} \left\{ u(\tilde{x}^1) - u(\tilde{x}^0) \right\}$$
$$\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) = \inf_{\{u \,\mid\, u \ \varepsilon\text{-rationalizes } \{(x^t, p^t)\}_{t=1}^T\}} \left\{ u(\tilde{x}^1) - u(\tilde{x}^0) \right\}.$$
Here, we consider all possible utility functions $u : \mathbb{R}^K_+ \to \mathbb{R}$ without additional restrictions such as monotonicity or concavity. A utility function $u$ is said to $\varepsilon$-rationalize the dataset $\{(x^t, p^t)\}_{t=1}^T$ when for every $t \in \{1, \ldots, T\}$ the inequality
$$u(x^t) - p^t \cdot x^t \geq u(x) - p^t \cdot x - \varepsilon$$
holds for every $x \in \mathbb{R}^K_+$.

To interpret these bounds, suppose for example that $\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) < 0$. We conclude that the individual ranks $\tilde{x}^0$ above $\tilde{x}^1$, even when the individual's choices do not exactly maximize utility. Thus, there is no ambiguity in the ranking of these bundles according to the unknown utility function $u$. If $\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) > 0$, then it is possible that the individual ranks $\tilde{x}^1$ above $\tilde{x}^0$. Lastly, if $\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) > 0$, then we conclude the individual ranks $\tilde{x}^1$ above $\tilde{x}^0$. More broadly, these bounds provide cardinal information on utility differences, in units of the price of the numeraire.

To gain some intuition for how bounds on differences of utility are informed by data, consider two bundles $x^r$ and $x^s$ in the dataset. Since $x^r$ is approximately optimal given prices $p^r$, we have the restriction
$$u(x^r) - p^r \cdot x^r \geq u(x^s) - p^r \cdot x^s - \varepsilon,$$
which rearranges to
$$u(x^s) - u(x^r) \leq p^r \cdot (x^s - x^r) + \varepsilon. \tag{4}$$
Differences in utility are thus bounded by changes in expenditure. Here, price is fixed and a change in quantity determines the magnitude of the expenditure change. The inequality in (4) arises because the point in the data $x^r$ was approximately optimal at prices $p^r$. Thus, $\varepsilon$ here directly involves the observed data and is part of the measurement wedge. There is no prediction wedge because an analyst does not contemplate choices in new situations.

We first formalize computation of the bounds before providing additional interpretation. We show the bounds can be calculated as a linear program. An explicit description is relegated to Proposition A.2 in Appendix A.

Proposition 7.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. If $\tilde{x}^0$ is in the dataset, i.e. $\tilde{x}^0 = x^S$ for some $S \in \{1, \ldots, T\}$, then $\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon)$ is finite and can be calculated as a linear program. If $\tilde{x}^1$ is in the dataset, i.e. $\tilde{x}^1 = x^F$ for some $F \in \{1, \ldots, T\}$, then $\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon)$ is finite and can be calculated as a linear program. Under Assumption 1 ($\varepsilon = \varepsilon^*$), these bounds cannot be improved.

Note that the set $\{u \mid u \ \varepsilon\text{-rationalizes } \{(x^t, p^t)\}_{t=1}^T\}$ is convex in the sense that if $u_a$ and $u_b$ each $\varepsilon$-rationalize the dataset, then so does $\alpha u_a + (1 - \alpha) u_b$ for $\alpha \in [0, 1]$. This follows from inspecting inequalities such as
$$u(x^t) - p^t \cdot x^t \geq u(x) - p^t \cdot x - \varepsilon$$
that define $\varepsilon$-rationalizability by a utility function $u$. This means that any value between $\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon)$ and $\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon)$ can be attained.

To gain further intuition for how data bound utility differences, we provide an analytical characterization. This characterization builds on inequalities such as (4) above, yet uses longer sequences (rather than just pairs) of observations to describe the tightest possible bounds. This parallels the analysis of counterfactuals, where restrictions other than the law of demand arise by considering sequences of observations.

Proposition 8.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. If $\tilde{x}^0$ is in the dataset, i.e. $\tilde{x}^0 = x^S$ for some $S \in \{1, \ldots, T\}$, then for any $\tilde{x}^1 \in \mathbb{R}^K_+$ with $\tilde{x}^1 \neq x^S$, the upper bound on utility differences is given by
$$\overline{u}(\tilde{x}^1, x^S, \varepsilon) = \min_{\sigma \in \Sigma_S} \left\{ p^{\sigma(M)} \cdot (\tilde{x}^1 - x^{\sigma(M)}) + \sum_{m=1}^{M-1} p^{\sigma(m)} \cdot (x^{\sigma(m+1)} - x^{\sigma(m)}) + M \varepsilon \right\},$$
where $\Sigma_S$ is the set of sequences that start with $\sigma(1) = S$, have no cycles, and have length $M \geq 1$. Moreover, the function $\overline{u}$ is strictly increasing and continuous in $(\tilde{x}^1, \varepsilon)$ over the region that satisfies $\varepsilon \geq \varepsilon^*$ and excludes $\tilde{x}^1 = x^S$.

The sums inside the minimum are closely related to sums discussed in Proposition 3 for counterfactuals. The sums differ because Proposition 3 constructs sequences making a cycle (to remove the unknown utility numbers). In contrast, Proposition 8 considers sequences that do not make a cycle because the goal is to examine differences of utility numbers.

Continuity and concavity fail at $\tilde{x}^1 = x^S$ (when $\varepsilon > 0$) because the difference in utility is zero when the quantity is the same. Since $\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) = -\overline{u}(\tilde{x}^0, \tilde{x}^1, \varepsilon)$, analogous results hold for the lower bound $\underline{u}(x^F, \tilde{x}^0, \varepsilon)$ if the first argument $x^F$ is in the dataset. See Proposition A.2 for formal results.

An important feature for practical application is that the bounds on utilities are trivial unless an appropriate quantity is in the dataset. We formalize this as follows.

Proposition 9.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. If $\tilde{x}^0$ is not in the dataset, i.e. $\tilde{x}^0 \neq x^r$ for every $r \in \{1, \ldots, T\}$, then $\overline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) = \infty$. If $\tilde{x}^1$ is not in the dataset, then $\underline{u}(\tilde{x}^1, \tilde{x}^0, \varepsilon) = -\infty$.

Recall that Proposition 8 shows that $\overline{u}$ is strictly increasing and continuous over a region. Thus, the upper bound on utility differences has some shape restrictions like a “nice” utility function. Despite this, the bound is not concave/continuous in the first argument at $\tilde{x}^1 = x^S$ (when $\varepsilon > 0$). Imposing these (or other) shape restrictions is important if one wishes to bound utility when neither quantity is in the dataset, since from Proposition 9 we know the bounds are trivial without more structure.

Footnote: Continuity and concavity do not tighten the bounds when $\varepsilon = 0$: $\overline{u}$ is then continuous and concave for all values of $\tilde{x}^1$. See the proof of Proposition A.2 for more details.

4.2 Recoverability of Approximate Indirect Utility

We now turn to welfare analysis concerning price changes. Here both the measurement and prediction wedges play a role. Recall that a “measurement wedge” shows up for bounds on the utility over bundles as in Section 4.1 since observations may not exactly maximize a quasilinear utility function. Here the prediction wedge also arises when $\varepsilon^* > 0$.

The indirect utility function associated with a utility function $u : \mathbb{R}^K_+ \to \mathbb{R}$ is given by
$$V_u(p) = \sup_{x \in \mathbb{R}^K_+} u(x) - x \cdot p.$$
Since the researcher does not know the individual's utility a priori, we consider indirect utility associated with candidate utility functions.

We show how indirect utility interacts with the measurement wedge. If $x^t$ is within $\varepsilon$ of the maximum utility possible at price $p^t$, then we can write
$$V_u(p^t) \leq u(x^t) - p^t \cdot x^t + \varepsilon.$$
The definition of the indirect utility yields, for arbitrary $p \in \mathbb{R}^K_{++}$, the inequality
$$V_u(p) \geq u(x^t) - p \cdot x^t.$$
Differencing these, we obtain
$$V_u(p^t) - V_u(p) \leq x^t \cdot (p - p^t) + \varepsilon. \tag{5}$$
Here, $\varepsilon$ arises because the observed choices need not be exact maximizers and thus is part of the measurement wedge. With a restriction on the magnitude of $\varepsilon$, we can use observations of $x^t$ and $p^t$ to bound differences in indirect utility.

We now introduce the prediction wedge. This wedge arises because raw differences in indirect utility are not the natural welfare object for a price change in our setting, since we focus on ex ante policy evaluation. Instead, we take into account that when $\varepsilon > 0$, an individual may choose bundles with different utility when facing the same prices. This is because we assume an individual satisfices.

The utility the individual attains for a given price and choice of consumption bundle is the approximate indirect utility. For observation $t \in \{1, \ldots, T\}$, the approximate indirect utility is
$$u(x^t) - p^t \cdot x^t.$$
At price $p$, the approximate indirect utility is restricted to be somewhere in the interval
$$[V_u(p) - \varepsilon, \ V_u(p)].$$
In fact, (weakly) further restrictions take into account that the approximate indirect utility attained is bounded below by
$$\underline{V}_{u,A}(p, \varepsilon) = \inf_{x \in \mathbb{R}^K_+} u(x) - p \cdot x \quad \text{s.t.} \quad u(x) - p \cdot x \geq V_u(p) - \varepsilon,$$
while the upper bound is the indirect utility. The lower bound on approximate indirect utility is $\underline{V}_{u,A}(p, \varepsilon)$, while the upper bound on approximate indirect utility is $\overline{V}_{u,A}(p, \varepsilon) = V_u(p)$.

Now suppose we wish to bound the change in approximate indirect utility between prices $\tilde{p}^0$ and $\tilde{p}^1$. If the utility $u$ and level of satisficing $\varepsilon$ were known, then the welfare bounds would be
$$\left[ \underline{V}_{u,A}(\tilde{p}^1, \varepsilon) - \overline{V}_{u,A}(\tilde{p}^0, \varepsilon), \ \overline{V}_{u,A}(\tilde{p}^1, \varepsilon) - \underline{V}_{u,A}(\tilde{p}^0, \varepsilon) \right].$$
Fixing $u$, this interval becomes wider when $\varepsilon$ increases. In general, $\varepsilon$ controls the prediction wedge, which arises even if we knew $u$ because we would not know what is chosen by the satisficer.

Since we do not know the utility function a priori, we consider bounds involving the smallest and largest changes in approximate indirect utility among all utility functions that $\varepsilon$-rationalize the dataset:
$$\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) = \sup_{\{u \,\mid\, u \ \varepsilon\text{-rationalizes } \{(x^t, p^t)\}_{t=1}^T\}} \left\{ \overline{V}_{u,A}(\tilde{p}^1, \varepsilon) - \underline{V}_{u,A}(\tilde{p}^0, \varepsilon) \right\}$$
$$\underline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) = \inf_{\{u \,\mid\, u \ \varepsilon\text{-rationalizes } \{(x^t, p^t)\}_{t=1}^T\}} \left\{ \underline{V}_{u,A}(\tilde{p}^1, \varepsilon) - \overline{V}_{u,A}(\tilde{p}^0, \varepsilon) \right\}.$$
These bounds incorporate both the measurement and prediction wedges.
The measurement wedge shows up when considering $u$ that $\varepsilon$-rationalize the data, while the prediction wedge arises when defining the approximate indirect utility. We use the same value $\varepsilon$ for both since we maintain Assumption 1.

These bounds can inform a researcher about changes in welfare even in the presence of satisficing. If $\underline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) > 0$, then we can conclude that given a price change from $\tilde{p}^0$ to $\tilde{p}^1$ the individual is better off at $\tilde{p}^1$. If $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) < 0$, then the price change from $\tilde{p}^0$ to $\tilde{p}^1$ makes the individual worse off. In contrast, ambiguity arises when $\underline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) < 0$ and $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) > 0$. In this case an individual may be better or worse off given the price change, but the data alone are inconclusive.

We now state a computational result for the bounds. A specific description of the linear program is given in Proposition A.3 in Appendix A.
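Before the formal result, the prediction wedge can be illustrated numerically. The sketch below is not the linear program of Proposition A.3. It assumes, purely for illustration, that the utility function is known and that bundles are restricted to a finite grid, and it then computes the interval of approximate indirect utility at a price together with the implied welfare interval for a price change. The names `approx_indirect_utility_interval`, `welfare_interval`, `classify_welfare_change`, `example_utility`, and `EXAMPLE_GRID` are hypothetical, and treating a bound equal to zero as inconclusive is a convention adopted here rather than one stated in the text.

```python
import math
from typing import Callable, List, Sequence, Tuple

Bundle = Sequence[float]


def net_utility(u: Callable[[Bundle], float], x: Bundle, p: Sequence[float]) -> float:
    """Quasilinear payoff u(x) - p . x."""
    return u(x) - sum(pk * xk for pk, xk in zip(p, x))


def approx_indirect_utility_interval(
    u: Callable[[Bundle], float], grid: List[Bundle], p: Sequence[float], eps: float
) -> Tuple[float, float]:
    """Lower and upper bounds on approximate indirect utility at price p.

    The upper bound is the indirect utility (best payoff on the grid).
    The lower bound is the worst payoff among bundles within eps of the best.
    """
    values = [net_utility(u, x, p) for x in grid]
    v_max = max(values)
    v_min = min(v for v in values if v >= v_max - eps)
    return v_min, v_max


def welfare_interval(u, grid, p_old, p_new, eps) -> Tuple[float, float]:
    """Interval for the change in approximate indirect utility from p_old to p_new."""
    lo_old, hi_old = approx_indirect_utility_interval(u, grid, p_old, eps)
    lo_new, hi_new = approx_indirect_utility_interval(u, grid, p_new, eps)
    return lo_new - hi_old, hi_new - lo_old


def classify_welfare_change(lower: float, upper: float) -> str:
    """Sign rule: conclusive only when the whole interval lies on one side of zero."""
    if lower > 0:
        return "better off"
    if upper < 0:
        return "worse off"
    return "inconclusive"


def example_utility(x: Bundle) -> float:
    """Hypothetical concave utility for a single good."""
    return 2.0 * math.sqrt(x[0])


# Hypothetical grid of candidate bundles: 0, 0.25, ..., 4.
EXAMPLE_GRID = [[0.25 * i] for i in range(17)]

if __name__ == "__main__":
    for eps in (0.0, 0.1):
        lo, hi = welfare_interval(example_utility, EXAMPLE_GRID, [1.0], [0.5], eps)
        print(eps, lo, hi, classify_welfare_change(lo, hi))
```

With a known utility the interval collapses to a point when `eps` is zero and widens as `eps` grows, which is the prediction wedge in isolation. The bounds studied in the text additionally range over all utility functions that $\varepsilon$-rationalize the data.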
Proposition 10.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. The bounds on approximate indirect utility $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ and $\underline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ can each be computed as a linear program whenever they are finite. Under Assumption 1 ($\varepsilon = \varepsilon^*$), these bounds cannot be improved.

When $\varepsilon = \varepsilon^* = 0$, the approximate indirect utility equals the indirect utility, and these are the sharp bounds on consumer surplus with limited price variation. When we set $\varepsilon = \varepsilon^*$, these are the adaptive consumer surplus bounds. These bounds may be used for arbitrary prices $\tilde{p}^1, \tilde{p}^0$, not only at prices in $\{p^t\}_{t=1}^T$. In particular, these bounds provide welfare bounds at new prices without needing to first provide bounds on the quantities at the prices.

Footnote: Formally, the supremum and the infimum are each taken over those $u$ for which the relevant approximate indirect utility is not $\infty$.

Recall from (5) that for $p^t$ in the dataset,
$$V_u(p^t) - V_u(p) \leq x^t \cdot (p - p^t) + \varepsilon.$$
This states that differences in indirect utility are bounded by changes in expenditure. Here, the change in expenditure involves keeping the quantity fixed and changing prices. We present lower and upper bounds on $\overline{V}$ that build on this inequality. To state the result, first suppose $\tilde{p}^1 = p^S$ is in the dataset. Define
$$h(\tilde{p}^0) = \min_{\sigma \in \Sigma_S} \left\{ x^{\sigma(M)} \cdot (\tilde{p}^0 - p^{\sigma(M)}) + \sum_{m=1}^{M-1} x^{\sigma(m)} \cdot (p^{\sigma(m+1)} - p^{\sigma(m)}) + M \varepsilon \right\},$$
where $\Sigma_S$ is the set of sequences that start with $\sigma(1) = S$, have no cycles, and have length $M \geq 1$.

Proposition 11.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. If $\tilde{p}^1$ is in the dataset, i.e. $\tilde{p}^1 = p^S$ for some $S \in \{1, \ldots, T\}$, then for $\tilde{p}^0 \neq \tilde{p}^1$,
$$h(\tilde{p}^0) - \varepsilon \leq \overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) \leq h(\tilde{p}^0) + \varepsilon,$$
and for $\tilde{p}^0 = \tilde{p}^1$, $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) = \varepsilon$.

This result is established by leveraging duality results we present in Appendix C.3. Analogous results exist for $\underline{V}$ because $\underline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) = -\overline{V}(\tilde{p}^0, \tilde{p}^1, \varepsilon)$, and are omitted for brevity.

We reiterate that Proposition 10 establishes that $\overline{V}$ can be computed exactly as a linear program. The goal of Proposition 11 is to make this process less of a “black box.” Note that when $\varepsilon = 0$, the lower and upper bounds coincide and we characterize $\overline{V}(\tilde{p}^1, \tilde{p}^0, 0)$. We recognize $h$ as a function closely related to the construction of the Riemann integral, since it computes the sum of the area of certain rectangles. We may view $h$ as a “discrete” analogue of the consumer surplus formula, which states that differences in indirect utility are the area under a demand function. In fact, this integration intuition can be formalized in the special case of a single good ($K = 1$), when $\varepsilon = 0$.

Proposition 12.
Suppose there is a single good ( K “ ), the dataset tp x t , p t u Tt “ is exactly consistent with quasilinear utility ( ε ˚ “ ), and we set ε “ . If ˜ p ą t p , . . . , p T , ˜ p u , then V p ˜ p , ˜ p , q “ ż x p t ˜ p ` p ´ t q ˜ p , qp ˜ p ´ ˜ p q dt. Proposition 12 shows that in a certain case, there is a tight connection betweenbounds on quantities and welfare bounds. Further relationships between welfare andcounterfactual quantities are left for future work.To close this section, we present shape restrictions on V that hold for all ε ě Proposition 13.
For a dataset $\{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \geq \varepsilon^*$. $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ is convex, weakly decreasing, and lower semicontinuous in $\tilde{p}^1$, and weakly increasing in $\varepsilon$. If $\tilde{p}^1 \in \mathrm{CCo}(\{p^t\}_{t=1}^T)$, then $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ is finite. If $\tilde{p}^1 \notin \mathrm{CCo}(\{p^t\}_{t=1}^T \cup \{\tilde{p}^0\})$, then $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon) = \infty$.

The shape restrictions in Proposition 13 are those of an indirect utility function. We do not obtain global continuity here because the welfare bounds can be infinite. However, $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ is continuous in $\tilde{p}^1$ over the relative interior of $\mathrm{CCo}(\{p^t\}_{t=1}^T)$ because it is convex and finite over this set (Rockafellar [2015], Theorem 10.1).

Recall that Proposition 9 shows that bounds on utility differences are trivial when a quantity is not in the dataset. In contrast, Proposition 13 shows that the bound on approximate indirect utility $\overline{V}(\tilde{p}^1, \tilde{p}^0, \varepsilon)$ is typically finite provided $\tilde{p}^1$ is not too low. In particular, neither $\tilde{p}^1$ nor $\tilde{p}^0$ need be in the dataset. The reason we obtain these contrasting results is that indirect utility functions must satisfy certain shape restrictions, while we consider utility functions that need not satisfy shape restrictions such as concavity or monotonicity.

Footnote: A function $f : H \to \mathbb{R} \cup \{\infty, -\infty\}$ is lower semicontinuous if for any $a \in \mathbb{R}$ the set $\{x \in H \mid f(x) \leq a\}$ is closed in the topology on $H$.

5 Continuity and Convexity in Quantities and Approximation Error
In classic revealed preference, a small amount of measurement error can lead to refutation of the model. In this case, there is no way to use the model for counterfactual or welfare analysis. Below we show continuity of the welfare/counterfactual bounds in both quantities and degree of approximation error. Thus, we provide a way to still conduct analysis when the model is not perfect, and do so in a way that is a continuous enlargement of the standard conceptual framework.

Footnote: To be clear, results in this paper are also new under $\varepsilon = 0$.

In more detail, here we study the joint mapping from quantities and approximation error to the bounds analyzed previously. One motivation for this is that in applications, an analyst may not observe a dataset of interest $\{(x^t, p^t)\}_{t=1}^T$ exactly, and may instead only have an estimate of the quantities. We show below that if we can consistently estimate quantities, then we can consistently estimate the bounds.

For concreteness, suppose an analyst is conducting a representative agent analysis, and quantities are mean quantities from a population at each time period. We examine the mean demand vector at period $t$ so that $x^t = E[X_{i,t}]$, where $X_{i,t}$ is demand for individual $i$ at time $t$. Here $X_{i,t}$ is treated as a random variable that is identically distributed across individuals. An analyst constructs an estimate $\widehat{E}[X_{i,t}]$ from a cross-sectional dataset in which individuals at each time period face the same prices. For example, the estimator could be the sample average of demands at time $t$ across many individuals.

We now turn to the formal results. While we allow estimation error associated with quantities, here we take each price $p^t$ as nonrandom and measured exactly. Recall that $\overline{x}_k(\tilde{p}, \varepsilon)$ is the maximal quantity of good $k$ at counterfactual price $\tilde{p}$ assuming approximation error is no greater than $\varepsilon$. The definition of $\overline{x}_k$ is presented in Proposition 4. We now treat $\overline{x}_k$ as a function of the dataset of quantities, and with minor abuse of notation we write $\overline{x}_k(d, \tilde{p}, \varepsilon)$, where $d = (d^1, \ldots, d^T) \in \mathbb{R}^{K \times T}_+$ denotes quantities across all goods at all time periods. This allows us to study how the bound depends on quantities in the dataset while keeping prices, $\{p^t\}_{t=1}^T$, fixed. Similarly, $\underline{x}_k(d, \tilde{p}, \varepsilon)$ denotes the lower bound. Finally, let $A \subseteq \mathbb{R}^{K \times T}_+ \times \mathbb{R}_+$ denote combinations of quantities and approximation error such that the counterfactual/welfare objects are defined, i.e.
$$A = \left\{ (d, \varepsilon) \in \mathbb{R}^{K \times T}_+ \times \mathbb{R}_+ \ \middle| \ \varepsilon \geq \varepsilon^*\left( \{(d^t, p^t)\}_{t=1}^T \right) \right\}.$$

Proposition 14.
Fix a price $\tilde{p}$ where we wish to bound counterfactual quantities and assume the dataset of prices $\{p^t\}_{t=1}^T$ is fixed. The set $A$ is convex. The mapping $\overline{x}_k(\cdot, \tilde{p}, \cdot) : A \to \mathbb{R}_+ \cup \{\infty\}$ is concave in $(d, \varepsilon)$, and is continuous in $(d, \varepsilon)$ at any point where it is finite. The mapping $\underline{x}_k(\cdot, \tilde{p}, \cdot) : A \to \mathbb{R}_+$ is convex and continuous in $(d, \varepsilon)$.

We obtain a similar result for the bounds on approximate indirect utility $\overline{V}$ and $\underline{V}$ when we view them as functions of the dataset of quantities. To formalize this, with minor abuse of notation let $\overline{V}(d, \tilde{p}^1, \tilde{p}^0, \varepsilon)$ describe the upper bound as a mapping of the quantities $d \in \mathbb{R}^{K \times T}_+$ in a dataset. Similarly, $\underline{V}(d, \tilde{p}^1, \tilde{p}^0, \varepsilon)$ denotes the lower bound.

Proposition 15.
Fix a price pair $\tilde{p}^1$ and $\tilde{p}^0$ where we wish to bound the difference in approximate indirect utility, and assume the dataset of prices $\{p^t\}_{t=1}^T$ is fixed. The mapping $\overline{V}(\cdot, \tilde{p}^1, \tilde{p}^0, \cdot) : A \to \mathbb{R} \cup \{\infty\}$ is concave in $(d, \varepsilon)$, and is continuous in $(d, \varepsilon)$ at any point where it is finite. The mapping $\underline{V}(\cdot, \tilde{p}^1, \tilde{p}^0, \cdot) : A \to \mathbb{R} \cup \{-\infty\}$ is convex in $(d, \varepsilon)$, and is continuous in $(d, \varepsilon)$ at any point where it is finite.

Recall Proposition 13 shows that when $\tilde{p}^1 \in \mathrm{CCo}(\{p^t\}_{t=1}^T)$, $\overline{V}(d, \tilde{p}^1, \tilde{p}^0, \varepsilon)$ is finite for any $(d, \varepsilon) \in A$.

We need one more result. Here, we interpret the minimal approximation error $\varepsilon^*$ as a function of quantities for fixed prices $\{p^t\}_{t=1}^T$.

Proposition 16 (Allen and Rehbeck [2020]). The mapping $\varepsilon^* : \mathbb{R}^{K \times T}_+ \to \mathbb{R}_+$ is convex and continuous.

The previous continuity results imply the following consistency results.
Corollary 1.
Suppose we have some estimator of the quantities that satisfies $\hat{d}_n \xrightarrow{p} d$. Then
$$\overline{x}_k\left(\hat{d}_n, \tilde{p}, \varepsilon^*(\hat{d}_n)\right) \xrightarrow{p} \overline{x}_k\left(d, \tilde{p}, \varepsilon^*(d)\right)$$
$$\underline{x}_k\left(\hat{d}_n, \tilde{p}, \varepsilon^*(\hat{d}_n)\right) \xrightarrow{p} \underline{x}_k\left(d, \tilde{p}, \varepsilon^*(d)\right)$$
$$\overline{V}\left(\hat{d}_n, \tilde{p}^1, \tilde{p}^0, \varepsilon^*(\hat{d}_n)\right) \xrightarrow{p} \overline{V}\left(d, \tilde{p}^1, \tilde{p}^0, \varepsilon^*(d)\right)$$
$$\underline{V}\left(\hat{d}_n, \tilde{p}^1, \tilde{p}^0, \varepsilon^*(\hat{d}_n)\right) \xrightarrow{p} \underline{V}\left(d, \tilde{p}^1, \tilde{p}^0, \varepsilon^*(d)\right),$$
where each result holds whenever the right hand side is finite.

This provides a theoretical foundation for plug-in estimation. We omit a formal description of the sampling scheme since the result applies to any collection of random variables $\{\hat{d}_n\}$ that converges in probability to $d$. For example, if we have panel data and the quantities $(X_{i,t})_{t=1}^T$ are independent and identically distributed across individuals, one can use sample averages so that $\hat{d}_n = \left( \frac{1}{n} \sum_{i=1}^n X_{i,t} \right)_{t=1}^T$ when estimating $d = (E[X_{i,t}])_{t=1}^T$.

Finally, we consider shape restrictions for the bounds on utility differences $\overline{u}$ and $\underline{u}$, viewed as functions of quantities. As before, fixing prices $\{p^t\}_{t=1}^T$, we study dependence on the quantities $d \in \mathbb{R}^{K \times T}_+$. With minor abuse of notation write $\overline{u}(d, \tilde{x}^1, \tilde{x}^0, \varepsilon)$ as a function of quantities and approximation error, and similarly for $\underline{u}$.

Recall that Proposition 8 shows that when $\tilde{x}^0$ is in the dataset of quantities $d$, $\overline{u}(d, \tilde{x}^1, \tilde{x}^0, \varepsilon)$ is finite provided $\varepsilon \geq \varepsilon^*$. In contrast, Proposition 9 shows that whenever $\tilde{x}^0$ is outside the dataset, $\overline{u}(d, \tilde{x}^1, \tilde{x}^0, \varepsilon) = \infty$. We conclude that when viewed as a mapping of quantities, $\overline{u}$ is no longer continuous. It is, however, continuous over a certain subset of $A$. To describe this, for a vector $\tilde{x} \in \mathbb{R}^K_+$ let $A(\tilde{x}) = \{(d, \varepsilon) \in A \mid d^1 = \tilde{x}\}$. This restricts attention to datasets of quantities that all contain a certain vector $\tilde{x}$ as the first component. We formalize continuity and concavity results as follows.

Proposition 17.
Fix a quantity pair $\tilde{x}^1 \neq \tilde{x}^0$ where we wish to bound the difference in utility, and assume the dataset of prices $\{p^t\}_{t=1}^T$ is fixed. The set $A(\tilde{x})$ is convex for any $\tilde{x} \in \mathbb{R}^K_+$. The mapping $\overline{u}(\cdot, \tilde{x}^1, \tilde{x}^0, \cdot) : A \to \mathbb{R} \cup \{\infty\}$ is concave and continuous in $(d, \varepsilon)$ over the region $A(\tilde{x}^0)$. The mapping $\underline{u}(\cdot, \tilde{x}^1, \tilde{x}^0, \cdot) : A \to \mathbb{R} \cup \{-\infty\}$ is convex and continuous in $(d, \varepsilon)$ over the region $A(\tilde{x}^1)$.

Continuity over all of $A$ does not hold and so one cannot directly apply the continuous mapping theorem to establish a consistency result like Corollary 1. If some quantity vector $d^1$ is measured without error, however, then it is possible to consistently estimate the bounds on utility differences between $d^1$ and other bundles, though we omit details for brevity.

We now illustrate the results in the paper with data on the demand for gasoline. Data are from the 2001 United States National Household Travel Survey, and have previously been used in Blundell et al. [2012]. The data are from a single cross-section. For brevity we refer to Blundell et al. [2012] for additional details, including construction of the particular sample.

The primary observables of interest are quantities and prices. Quantities are annual gasoline consumption, which is constructed from odometer readings and an estimate of fuel efficiency. Prices are the average tax-inclusive price per gallon in the county where the individual lives.

First note it is possible to map the raw quantities and prices to a dataset $\{(X_i, P_i)\}_{i=1}^n$, and then apply our previous analysis. Here $i$ denotes the individual and $n$ denotes the sample size. We use this notation rather than $t$ and $T$ to emphasize we have a cross-section. We use upper case $X_i$ and $P_i$ to denote that these are random variables.

We do not use the raw dataset, and instead “pre-process” it to map to our framework. We do so because we have a cross-section of individuals.
We wish both to diminish the impact of sampling variability and to incorporate heterogeneity along observable variables. As in Blundell et al. [2012], we pre-process by first considering a partially linear model given by
$$X = g(P, Y) + \beta' W + U,$$
where $P$ is price, $Y$ is income, $W$ are observed covariates, and $U$ is unobservable.

Footnote: Allen and Rehbeck [2020] study how stochastic shocks and approximation error can be studied in a common framework. That paper provides several aggregation theorems, and discusses a representative agent in this setting.

Rather than interpret $g$ as a demand curve for a representative agent for the general model of utility maximization subject to a budget constraint, here we have a different interpretation. We interpret $g(\cdot, Y)$ as the demand curve for the representative agent with income level $Y$; $Y$ thus serves as a demographic characteristic that alters the shape of the demand curve. We close the model with the restriction
$$E[U \mid P = p, Y = y, W = w] = 0.$$
This specification allows price sensitivity to depend on the level of income of an individual. For each level of income $\tilde{y}$, we consider a dataset (in the sense used previously in the paper) of the form
$$D(\tilde{y}) = \left\{ \left( \hat{g}(P_i, \tilde{y}), P_i \right) \right\}_{i=1}^{\tilde{n}},$$
where $\hat{g}$ is an estimator of $g$ described below. Thus, $\hat{g}(P_i, \tilde{y})$ is akin to the structural quantity $x^t$ in the previous notation, and $P_i$ is akin to $p^t$. Like Blundell et al. [2012], we consider prices between the 5th and 95th quantiles to mitigate endpoint issues, so $\tilde{n}$ enumerates these observations.

The estimator $\hat{g}$ is constructed similarly to Blundell et al. [2012]. We first estimate $\hat{\beta}$ by a double residual regression as in Robinson [1988]. Then we set
$$\hat{g}(P_i, \tilde{y}) = \frac{\sum_{j=1}^N \left( X_j - \hat{\beta}' W_j \right) K_{h_p}(P_j - P_i) \, K_{h_y}(Y_j - \tilde{y})}{\sum_{j=1}^N K_{h_p}(P_j - P_i) \, K_{h_y}(Y_j - \tilde{y})},$$
where $K_h$ is a kernel with bandwidth $h$. Following Blundell et al. [2012] we use the biweight kernel.
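The kernel-weighted average above can be sketched in a few lines. This is a minimal illustration and not the code used for the application: it assumes the residuals $X_j - \hat{\beta}' W_j$ have already been computed, and it assumes the common scaling $K_h(z) = K(z/h)/h$, which the text does not spell out (the factor $1/h$ cancels in the ratio in any case). The function names `biweight_kernel`, `scaled_kernel`, and `g_hat` are hypothetical.

```python
from typing import Sequence


def biweight_kernel(u: float) -> float:
    """Biweight (quartic) kernel: (15/16) * (1 - u^2)^2 on |u| <= 1, zero elsewhere."""
    if abs(u) > 1.0:
        return 0.0
    return 15.0 / 16.0 * (1.0 - u * u) ** 2


def scaled_kernel(z: float, h: float) -> float:
    """Kernel with bandwidth h, assuming K_h(z) = K(z / h) / h."""
    return biweight_kernel(z / h) / h


def g_hat(
    p_eval: float,
    y_eval: float,
    residuals: Sequence[float],  # X_j minus the fitted linear part, computed beforehand
    prices: Sequence[float],
    incomes: Sequence[float],
    h_p: float,
    h_y: float,
) -> float:
    """Product-kernel weighted average of residuals at the point (p_eval, y_eval)."""
    numerator = 0.0
    denominator = 0.0
    for r_j, p_j, y_j in zip(residuals, prices, incomes):
        weight = scaled_kernel(p_j - p_eval, h_p) * scaled_kernel(y_j - y_eval, h_y)
        numerator += r_j * weight
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no observations fall inside the kernel window")
    return numerator / denominator
```

Evaluating `g_hat` at each retained price $P_i$ with income fixed at $\tilde{y}$ would produce the quantity column of a smoothed dataset of the form $D(\tilde{y})$. The explicit error for an empty kernel window is a design choice of this sketch, since the biweight kernel has bounded support.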
Throughout, the bandwidths $h_p$ and $h_y$ are chosen so that
$$\frac{h_p}{h_y} = \frac{\hat{\sigma}_P}{\hat{\sigma}_Y},$$
where $\hat{\sigma}_P^2 = \frac{1}{n} \sum_{i=1}^n (P_i - \bar{P})^2$, $\bar{P} = \frac{1}{n} \sum_{i=1}^n P_i$, $\hat{\sigma}_Y^2 = \frac{1}{n} \sum_{i=1}^n (Y_i - \bar{Y})^2$, and $\bar{Y} = \frac{1}{n} \sum_{i=1}^n Y_i$. Note that these are all constructed with all observations.

Figure 2 presents analysis with two choices of bandwidths. These correspond to the ad hoc choices .75 and 1 after standardizing. The top two panels display the kernel-smoothed “dataset”
$$D(\bar{Y}) = \left\{ \left( \hat{g}(P_i, \bar{Y}), P_i \right) \right\}_{i=1}^{\tilde{n}}$$
as well as counterfactual bounds, where $\hat{g}(P_i, \bar{Y})$ is interpreted as a quantity for observation $i$ facing prices $P_i$. Recall $\tilde{n}$ denotes the middle 90% of observations in terms of price, where we drop the lower and upper 5% to mitigate endpoint issues. Income is evaluated at the sample mean $\bar{Y}$. The welfare bounds for approximate indirect utility are displayed in the middle panels. The bounds are evaluated relative to the mean price $\bar{P}$. The lower panels display bounds on utility for quantities in $\{\hat{g}(P_i, \bar{Y})\}_{i=1}^{\tilde{n}}$. Recall from Proposition 9 that comparisons in utility when one quantity is not in the dataset will have at least one trivial bound. This is why we restrict attention to comparisons in which both quantities are in the dataset. It is important to note that in practice, simply bounding quantities over a grid will lead to trivial bounds for many points in the grid.

Footnote: We use the biweight kernel with ad hoc bandwidth .75 after standardizing the data.

[Figure 2, panels (a) to (f)]
Figure 2: Quantity, Approximate Indirect Utility, and Utility Bounds at Two Bandwidths
Notes: The top panels depict quantity bounds at new prices. The middle panels depictbounds on approximate indirect utility relative to the mean price. The lower panels depictbounds on utility at certain quantities, relative to the median quantity in the dataset.
As can be seen from the figures, the choice of bandwidth noticeably alters the informativeness of the counterfactual bounds (upper panels). In contrast, the bounds on approximate indirect utility (middle panels) and utility (lower panels) are relatively narrow for both bandwidths. Similar results obtain for alternative bandwidths and are available upon request.

In the lower panels, a contrast emerges between the lower and upper bounds on utility. Recall that these bounds are for $u(\tilde{x}^1) - u(\tilde{x}^0)$, where $\tilde{x}^0$ is the median quantity, i.e. in the dataset. Because the second argument is in the dataset, Proposition 8 applies to the upper bounds and establishes monotonicity in the argument $\tilde{x}^1$. The graphs are consistent with this, since the upper bounds are monotone in quantities. In contrast, the lower bounds are not monotone. In order to make lower bounds on utilities monotone, it would be necessary to have the first argument ($\tilde{x}^1$) be in the dataset and fixed. See Proposition A.2(vi) in Appendix A.

This paper provides a conceptual framework for counterfactual and welfare analysis for approximate models. Our main conceptual assumption is that model approximation error has the same magnitude in new settings as in the data we have seen. We formalize this for the quasilinear utility model. This assumption is portable to other settings, and generalizes the standard approach that requires correct specification in both the data we have seen and at hypothetical values.

Engaging with the possibility that a model may not perfectly match data is especially important when using the nonparametric revealed preference approach. Indeed, a natural intuition is that if approximation error is “small,” then it is second order and we can ignore it for certain questions. Unfortunately, this intuition is false in the standard approach used in the revealed preference literature, since small violations of the model mean it cannot be used for counterfactual or welfare analysis. This paper presents an adaptive approach allowing the analyst to use the model while formally viewing it as an approximation. Moreover, our counterfactual/welfare bounds are continuous in the degree of approximation error, and so they continuously transition to the standard framework when approximation error is negligible.

Appendix A Proofs of Main Results
This appendix provides proofs of the results in the main text. It also provides explicit descriptions of the linear programs mentioned in the main text. Some of the proofs require additional lemmas contained in Supplemental Appendix C.
A.1 Proofs for Section 3
Proof of Facts 1-4.
The proofs are in the main text.
Proof of Proposition 2.
Emptiness of $X(\tilde{p}, D, \varepsilon)$ when $\varepsilon < \varepsilon^*$ is immediate from Fact 3. It remains to show that when $\varepsilon \geq \varepsilon^*$, the set $X(\tilde{p}, D, \varepsilon)$ is nonempty.

First, fix $(x^1, p^1) \in \{(x^t, p^t)\}_{t=1}^T$ and let $\Sigma$ denote the set of finite sequences of $t \in \{1, \ldots, T\}$ with no cycles that begin at $(x^1, p^1)$. Define
$$U(x) = \min_{\sigma \in \Sigma} \left\{ p^{\sigma(M)} \cdot \left( x - x^{\sigma(M)} \right) + \sum_{m=1}^{M-1} p^{\sigma(m)} \cdot (x^{\sigma(m+1)} - x^{\sigma(m)}) + M \varepsilon \right\},$$
where $\sigma \in \Sigma$ is a sequence of length $M$, for all $m \in \{1, \ldots, M\}$ it follows that $\sigma(m) \in \{1, \ldots, T\}$, and $\sigma(1) = 1$. Allen and Rehbeck [2020] have shown that for $\varepsilon \geq \varepsilon^*$, this function $\varepsilon$-rationalizes the data in the sense that for each $t \in \{1, \ldots, T\}$ and each $x \in \mathbb{R}^K_+$,
$$U(x^t) - p^t \cdot x^t \geq U(x) - p^t \cdot x - \varepsilon.$$
The function $U(x)$ need not induce an $\varepsilon$-maximizer when prices take low values. However, the constructed utility can be modified to guarantee maximizers exist.

To that end, let
$$\overline{U} = \sup_{x \in \mathrm{Co}(\{x^t\}_{t=1}^T)} U(x),$$
where $\mathrm{Co}(\{x^t\}_{t=1}^T)$ denotes the convex hull, i.e. the smallest convex set containing $\{x^t\}_{t=1}^T$. We see $\overline{U} < \infty$ since $U$ is continuous and $\mathrm{Co}(\{x^t\}_{t=1}^T)$ is compact. Define $f : \mathbb{R}^K_+ \to \mathbb{R}$ by
$$f(x) = \frac{\sum_{k=1}^K x_k}{1 + \sum_{k=1}^K x_k} + \overline{U},$$
which is bounded and concave. To see this, note the function $h : \mathbb{R}_+ \to \mathbb{R}$ given by $h(z) = z/(1 + z)$ is concave by inspecting derivatives. Since $f$ is a composition of a concave function and an affine and strictly increasing function it is concave.

Now construct
$$\tilde{U}(x) = \min\{U(x), f(x)\}.$$
This function rationalizes the data since for $t \in \{1, \ldots, T\}$ and $x \in \mathbb{R}^K_+$ we have
$$\tilde{U}(x^t) - p^t \cdot x^t = U(x^t) - p^t \cdot x^t \geq U(x) - p^t \cdot x - \varepsilon \geq \tilde{U}(x) - p^t \cdot x - \varepsilon,$$
where the equality uses $U(x^t) \leq \overline{U} \leq f(x^t)$. In addition, $\tilde{U}$ is concave since it is the minimum of concave functions. Similarly, $\tilde{U}$ is continuous and strictly increasing as it is the minimum of finitely many continuous and strictly increasing functions. It remains to show this utility admits an $\varepsilon$-maximizer for all prices $p \in \mathbb{R}^K_{++}$.

To that end, note the indirect utility of $f$, denoted
$$V_f(p) = \sup_{x \in \mathbb{R}^K_+} f(x) - p \cdot x,$$
is everywhere finite over the region $p \in \mathbb{R}^K_{++}$ because $f$ is bounded between $\overline{U}$ and $\overline{U} + 1$. Moreover, since $\tilde{U} \leq f$ pointwise, we also have $V_{\tilde{U}}(p) \leq V_f(p)$, so that $V_{\tilde{U}}$ is finite for any $p \in \mathbb{R}^K_{++}$. Since $\mathbb{R}^K_{++}$ is open, from Lemma C.2 we conclude that $\tilde{U}(x) - p \cdot x$ admits an exact maximizer in $x$ for any $p \in \mathbb{R}^K_{++}$. In particular, it admits $\varepsilon$-maximizers, completing the proof.

Proof of Proposition 3.
An equivalent definition of $X(\tilde p, D, \varepsilon)$ is that $\tilde x \in X(\tilde p, D, \varepsilon)$ if and only if the augmented dataset $D \cup (\tilde x, \tilde p)$ is $\varepsilon$-rationalized by quasilinear utility. From the characterization in Lemma C.1(iii), this is equivalent to showing that certain sequences satisfy an inequality. For each sequence involving the augmented dataset there are two cases. If the sequence does not contain $(\tilde x, \tilde p)$, then the inequality in Lemma C.1(iii) is satisfied because we assume $\varepsilon \ge \varepsilon^*(D)$. (Note that $\varepsilon^*(D)$ is constructed to have this property.) It remains to check sequences involving $(\tilde x, \tilde p)$. Rearranging the inequality of Lemma C.1(iii), we see that
\[
\left(\tilde p - p^{t_M}\right) \cdot \tilde x \le (M+1)\varepsilon + \tilde p \cdot x^{t_1} - p^{t_M} \cdot x^{t_M} - \sum_{m=1}^{M-1} p^{t_m} \cdot \left(x^{t_m} - x^{t_{m+1}}\right) \tag{6}
\]
must hold for all finite sequences $\{t_m\}_{m=1}^M$ without cycles where $t_m \in \{1, \ldots, T\}$ and $M \ge 1$. Here we have the term $(M+1)\varepsilon$ since the sequences include the counterfactual observation and a length $M$ sequence. This characterizes the set $X(\tilde p, D, \varepsilon)$ as an intersection of finitely many half-spaces. Thus, $X(\tilde p, D, \varepsilon)$ is a closed, convex polyhedron.

To prove Proposition 4, we prove a stronger result that explicitly describes the linear program.

Proposition A.1.
For a dataset $D = \{(x^t, p^t)\}_{t=1}^T$, let $\varepsilon \ge \varepsilon^*$. Then whenever $X_k(\tilde p, D, \varepsilon)$ is bounded above, its maximum is given by the linear program
\[
\begin{aligned}
\overline{x}_k(\tilde p, \varepsilon) = \max_{\tilde x \in \mathbb{R}^K_+,\; u_1, \ldots, u_T, \tilde u \in \mathbb{R}_+} \quad & \tilde x_k \\
\text{s.t.} \quad & u_s \le u_r + p^r \cdot (x^s - x^r) + \varepsilon && \text{for all } r, s \in \{1, \ldots, T\} \\
& \tilde u \le u_r + p^r \cdot (\tilde x - x^r) + \varepsilon && \text{for all } r \in \{1, \ldots, T\} \\
& u_r \le \tilde u + \tilde p \cdot (x^r - \tilde x) + \varepsilon && \text{for all } r \in \{1, \ldots, T\}.
\end{aligned}
\]
The upper bound $\overline{x}_k(\tilde p, \varepsilon)$ may be equivalently calculated as
\[
\begin{aligned}
\overline{x}_k(\tilde p, \varepsilon) = \max_{\tilde x \in \mathbb{R}^K_+} \quad & \tilde x_k \\
\text{s.t.} \quad & \left(\tilde p - p^{t_M}\right) \cdot \tilde x \le (M+1)\varepsilon + \tilde p \cdot x^{t_1} - p^{t_M} \cdot x^{t_M} - \sum_{m=1}^{M-1} p^{t_m} \cdot \left(x^{t_m} - x^{t_{m+1}}\right),
\end{aligned}
\]
where this inequality must hold for all finite sequences $\{t_m\}_{m=1}^M$ with $t_m \in \{1, \ldots, T\}$ and $M \ge 1$. The value of the lower bound $\underline{x}_k(\tilde p, \varepsilon)$ is calculated as the minimum of the objective with either constraint set of the above linear programs. The value $\overline{x}_k(\tilde p, \varepsilon)$ is weakly increasing in $\varepsilon$ over the region $\varepsilon \ge \varepsilon^*$, and $\underline{x}_k(\tilde p, \varepsilon)$ is weakly decreasing in $\varepsilon$ over the region $\varepsilon \ge \varepsilon^*$.

The first linear program is easy to implement as it has order $(T+1)^2$ constraints and $T + 1 + K$ unknowns. The second linear program is useful to understand the mapping from data to bounds. However, directly operationalizing the second linear program would require enumerating all finite sequences of the dataset that do not contain cycles, which is computationally costly.

Related bounds have appeared in Chiong et al. [2017] and Allen and Rehbeck [2019a], which focus on latent utility models with observable characteristics of goods other than prices. The result here differs since $\varepsilon$ can be nonzero and the first formulation directly describes a convenient linear program used to compute bounds. We take $\overline{x}_k$ to be positive infinity when there is no finite upper bound.

Proof of Proposition A.1 (and Proposition 4).
The linear programming formulations are immediate from Lemma C.1 and the proof of Proposition 3. From Lemma C.3, the maximum is attained because the linear program has a bounded value function by construction. Recall that for the second formulation, we only need to consider cycles involving the counterfactual quantity-price tuple because we have assumed $\varepsilon \ge \varepsilon^*$. Cycles that do not involve the counterfactual quantity satisfy the inequality of Lemma C.1(iii) whenever $\varepsilon \ge \varepsilon^*$, and so they do not bind. We leveraged these properties in Proposition 3 already. The proof for $\underline{x}_k$ is analogous and is omitted.

Now we argue that these bounds cannot be improved under Assumption 1. To see this, note from Proposition 5 that the bounds are weakly monotone in $\varepsilon$. Moreover, when $\varepsilon < \varepsilon^*$ we know the set $X(\tilde p, D, \varepsilon)$ (and hence $X_k(\tilde p, D, \varepsilon)$) is empty from Proposition 2.

Proof of Proposition 5.
We begin by showing that $\overline{x}_k(\tilde p, \varepsilon)$ is finite if and only if $\tilde p \in \mathrm{int\,CCo}(\{p^t\}_{t=1}^T)$. First, let $\tilde p \in \mathrm{int\,CCo}(\{p^t\}_{t=1}^T)$ so that $\tilde p > \sum_{t=1}^T \alpha_t p^t$ for some nonnegative $\alpha_t$ such that $\sum_{t=1}^T \alpha_t = 1$. Note that for each $t \in \{1, \ldots, T\}$, the approximate law of demand yields
\[
(\tilde p - p^t) \cdot (\tilde x - x^t) \le 2\varepsilon.
\]
Multiplying each inequality by $\alpha_t$ and summing up the inequalities gives that
\[
\left(\tilde p - \sum_{t=1}^T \alpha_t p^t\right) \cdot \left(\tilde x - \sum_{t=1}^T \alpha_t x^t\right) \le 2\varepsilon.
\]
Thus,
\[
\left(\tilde p - \sum_{t=1}^T \alpha_t p^t\right) \cdot \tilde x \le 2\varepsilon + \left(\tilde p - \sum_{t=1}^T \alpha_t p^t\right) \cdot \sum_{t=1}^T \alpha_t x^t.
\]
Since $\tilde p - \sum_{t=1}^T \alpha_t p^t > 0$ and $\tilde x \in \mathbb{R}^K_+$, one can bound the values on each dimension of $\tilde x$ so that
\[
\tilde x \in \prod_{k=1}^K \left[ 0, \; \frac{2\varepsilon + \left(\tilde p - \sum_{t=1}^T \alpha_t p^t\right) \cdot \sum_{t=1}^T \alpha_t x^t}{\tilde p_k - \sum_{t=1}^T \alpha_t p^t_k} \right].
\]
This shows that $\overline{x}_k(\tilde p, \varepsilon)$ is finite when $\tilde p \in \mathrm{int\,CCo}(\{p^t\}_{t=1}^T)$.

Next, we show that when $\tilde p \notin \mathrm{int\,CCo}(\{p^t\}_{t=1}^T)$, $\overline{x}_k(\tilde p, \varepsilon)$ is unbounded. Suppose that $\tilde p \notin \mathrm{int\,CCo}(\{p^t\}_{t=1}^T)$. This means for all $t \in \{1, \ldots, T\}$ that $\tilde p \le p^t$. From Proposition 3 we know $\tilde x \in X(\tilde p, D, \varepsilon)$ if and only if for any sequence $\{t_m\}_{m=1}^M$ with $M \ge 1$,
\[
\left(\tilde p - p^{t_M}\right) \cdot \left(\tilde x - x^{t_1}\right) \le (M+1)\varepsilon + p^{t_M} \cdot x^{t_1} - p^{t_M} \cdot x^{t_M} - \sum_{m=1}^{M-1} p^{t_m} \cdot \left(x^{t_m} - x^{t_{m+1}}\right).
\]
Note that the right hand side of the expression is always weakly positive. Moreover, $\tilde p - p^{t_M} \le 0$, so that one can choose arbitrarily positive amounts of every good. If there is a dimension $k$ such that $\tilde p_k - p^{t_M}_k < 0$, then one can choose arbitrarily high amounts of $\tilde x_k$ to satisfy all such inequalities. This establishes that $\overline{x}_k(\tilde p, \varepsilon)$ is unbounded above.

Note that the lower bound $\underline{x}_k(\tilde p, \varepsilon)$ is always finite because it is bounded below by 0. To show monotonicity in $\varepsilon$, note that the feasibility region is weakly increasing (with regard to set inclusion) as $\varepsilon$ increases. Thus, $\overline{x}_k$ is weakly increasing in $\varepsilon$, and $\underline{x}_k$ is weakly decreasing in $\varepsilon$.

Footnote: Chiong et al. [2017] essentially start with the second formulation of the bounds (in terms of cycles) and show that while there are many cycles, only a certain number are effectively binding. The first formulation of Proposition A.1 complements their analysis by describing an explicit linear program with order $(T+1)^2$ scalar inequalities. Allen and Rehbeck [2019a] describe bounds in certain models with characteristics in place of prices, and use a characterization similar to the cycles condition, but do not study extreme points or describe computations.

Proof of Proposition 6.
First, let Σ be the set of sequences t t m u Mm “ that contain nocycles where t m P t , . . . , T u . From Proposition 3, the counterfactual bounds on40emand are given by inequalities of the form p ˜ p ´ p t M q ˜ x ď p M ` q ε ` ˜ px t ´ p t M x t M ´ M ´ ÿ m “ p t m p x t m ´ x t m ` q , (7)which much hold for every sequence in Σ. Dot products are removed since all objectsare one-dimensional.Whether a certain inequality of the form (7) provides an upper bound or lower boundon ˜ x depends on the sign of ˜ p ´ p t M . To see this, note that rearranging Equation 7when ˜ p ą p t M yields˜ x ď p M ` q ε ` ˜ px t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q ˜ p ´ p t M “ x t ` p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q ˜ p ´ p t M . (8)Note the the expression in Equation 8 that is divided by ˜ p ´ p t m is positive since theterms above are those for a cycle of length M . To see this, note that p t M p x t M ´ x t q ` M ´ ÿ m “ p t m p x t m ´ x t m ` q ď M ε ď p M ` q ε, where the first inequality holds because ε ě ε ˚ and the left hand side is a sequenceof length M . Thus, such sequences constitute upper bounds.When instead ˜ p ă p t M , the sequence yields a lower bound since one is dividing bya negative number. Since the sign of the difference matters, we partition the set ofsequences in Σ as follows. We consider the counterfactual prices where ˜ p ă ˜ p withoutloss of generality. Let t t m u Mm “ “ σ P Σ when p t M ď ˜ p ă ˜ p . Let t t m u Mm “ “ σ P Σ when ˜ p ă p t M ă ˜ p . Lastly, let t t m u Mm “ “ σ P Σ ´ when ˜ p ă ˜ p ď p t M .Upper bounds on counterfactual demand for the price ˜ p involve sequences where p t M ď ˜ p (i.e. sequences in Σ ). Upper bounds on counterfactual demand for theprice ˜ p involve sequences where p t M ď ˜ p (i.e. sequences in Σ Y Σ ). 
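To make Equation 8 concrete, the following Python sketch enumerates sequences by brute force and returns the tightest upper bound on counterfactual demand for a single good. It is an illustration only and not part of the original analysis: the function name `upper_bound_1d`, the data layout, and the reading of "no cycles" as "no observation is repeated" are assumptions made here, and the caller is assumed to supply a value of ε with ε ≥ ε*. Enumeration grows very quickly in T, so this is practical only for small datasets.

```python
from itertools import permutations


def upper_bound_1d(p_tilde, data, eps):
    """Tightest upper bound on counterfactual demand from Equation 8 (one good).

    data: list of (x_t, p_t) pairs of observed quantities and prices.
    eps:  approximation error, assumed by the caller to satisfy eps >= eps*.
    Returns float("inf") when no observed price lies strictly below p_tilde.
    """
    best = float("inf")
    n_obs = len(data)
    for length in range(1, n_obs + 1):
        # Assumption: "no cycles" is read as "no observation is repeated".
        for seq in permutations(range(n_obs), length):
            x_first = data[seq[0]][0]
            x_last, p_last = data[seq[-1]]
            if p_tilde <= p_last:
                continue  # only sequences with p_tilde > p^{t_M} give upper bounds
            # sum over m = 1..M-1 of p^{t_m} * (x^{t_m} - x^{t_{m+1}})
            chain = sum(
                data[seq[m]][1] * (data[seq[m]][0] - data[seq[m + 1]][0])
                for m in range(length - 1)
            )
            numerator = (length + 1) * eps + p_last * (x_first - x_last) - chain
            best = min(best, x_first + numerator / (p_tilde - p_last))
    return best


# Two hypothetical observations (quantity, price); demand falls as price rises.
observations = [(4.0, 1.0), (2.0, 3.0)]
bound_exact = upper_bound_1d(2.0, observations, 0.0)
bound_approx = upper_bound_1d(2.0, observations, 0.5)
```

In this hypothetical example the bound at price 2 is 4 when ε = 0 and 5 when ε = 0.5; in both cases the binding sequence is the length-one sequence consisting of the observation with price 1.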
We denote theupper bound inequalities by U B p ˜ p q “ t ˜ x P R ` | Equation 7 holds for sequences σ P Σ with ˜ p “ ˜ p u U B p ˜ p q “ t ˜ x P R ` | Equation 7 holds for sequences σ P Σ Y Σ with ˜ p “ ˜ p u . We use Equation (8) to show that
U B p ˜ p q Ď U B p ˜ p q . First, if ˜ x P U B p ˜ p q then forevery sequence t t m u Mm “ “ σ P Σ with p t M ă ˜ p ă ˜ p it follows that˜ x ď x t ` p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q ˜ p ´ p t M ď x t ` p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q ˜ p ´ p t M where the second inequality holds since the numerator is positive and 0 ă ˜ p ´ p t M ă ˜ p ´ p t M . If ˜ p “ p t M for a sequence σ P Σ , then there is no restriction on thecounterfactual demands. Since U B p ˜ p q only is restricted by sequences in Σ while U B p ˜ p q is restricted by sequences in Σ and Σ , this shows U B p ˜ p q Ď U B p ˜ p q . Thisproves that ¯ x p ˜ p , ε q ď ¯ x p ˜ p , ε q since ¯ x is the maximum, the upper bounds satisfy U B p ˜ p q Ď U B p ˜ p q , and a maximum over a larger set is weakly larger.Next note that the lower bounds on counterfactual demand ˜ x are given by the fol-lowing LB p ˜ p q “ t ˜ x P R ` | Equation 7 holds for sequences σ P Σ ´ Y Σ with ˜ p “ ˜ p u and LB p ˜ p q “ t ˜ x P R ` | Equation 7 holds for sequences σ P Σ ´ with ˜ p “ ˜ p u . To see this, note that rearranging Equation 7 when ˜ p ă p t M yields˜ x ě x t ´ p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q p t M ´ ˜ p . We now show that LB p ˜ p q Ď LB p ˜ p q . If ˜ x P LB p ˜ p q , then for every sequence42 t m u Mm “ “ σ P Σ ´ with ˜ p ă ˜ p ă p t M it follows that˜ x ě x t ´ p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q p t M ´ ˜ p ě x t ´ p M ` q ε ` p t M x t ´ p t M x t M ´ ř M ´ m “ p t m p x t m ´ x t m ` q p t M ´ ˜ p since the term being subtracted weakly increases when dividing by a smaller differencesince 0 ă p t M ´ ˜ p ă p t M ´ ˜ p . (Recall the numerator in each fraction is positive.)When the sequence σ P Σ ´ has ˜ p “ p t M there is no restriction on counterfactualdemands. 
Since LB p ˜ p q only is restricted from sequences in Σ ´ while LB p ˜ p q isrestricted by sequences in Σ ´ and Σ , this shows LB p ˜ p q Ď LB p ˜ p q . This also showsthat x p ˜ p , ε q ď x p ˜ p , ε q since x is a minimum, the constraint set on the lower bounds LB p ˜ p q Ď LB p ˜ p q , and a minimum over a smaller set is weakly larger. A.2 Proofs for Section 4
A.2.1 Proofs for Section 4.1
Propositions 7 and 8 are proven together in the following result.
Proposition A.2.
For a dataset tp x t , p t qu Tt “ , let ε ě ε ˚ .i. If ˜ x is in the dataset, i.e. ˜ x “ x S for some S P t , . . . , T u , and ˜ x ‰ ˜ x , then u p ˜ x , ˜ x , ε q “ max u ,...,u T , ˜ u P R ` ˜ u ´ u S s.t. u s ď u r ` p r ¨ p x s ´ x r q ` ε for all r, s P t , . . . , T u ˜ u ď u r ` p r ¨ p ˜ x ´ x r q ` ε for all r P t , . . . , T u ˜ u “ u r for all r P t , . . . , T u with ˜ x “ x r . ii. If ˜ x “ x S is in the dataset and ˜ x ‰ ˜ x , then the upper bound is equivalently iven by u p ˜ x , ˜ x , ε q “ min σ P Σ S p σ p M q ¨ p ˜ x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + , where Σ S is the set of sequences that start with σ p q “ S , have no cycles, andhave length at least M ě .iii. If ˜ x “ x S is in the dataset, the function u is strictly increasing and continuous in p ˜ x , ε q over the region that excludes ˜ x “ ˜ x . In particular, under Assumption 1the bound cannot be improved.iv. If ˜ x is in the dataset, i.e. ˜ x “ x F for some F P t , . . . , T u , and ˜ x ‰ ˜ x , then u p ˜ x , ˜ x , ε q “ min u ,...,u T , ˜ u P R ` u F ´ ˜ u s.t. u s ď u r ` p r ¨ p x s ´ x r q ` ε for all r, s P t , . . . , T u ˜ u ď u r ` p r ¨ p ˜ x ´ x r q ` ε for all r P t , . . . , T u ˜ u “ u r for all r P t , . . . , T u with ˜ x “ x r . v. If ˜ x “ x F is in the dataset and ˜ x ‰ ˜ x , then the lower bound is equivalentlygiven by u p ˜ x , ˜ x , ε q “ max σ P Σ F p σ p M q ¨ p x σ p M q ´ ˜ x q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m q ´ x σ p m ` q q ´ M ε + . vi. If ˜ x “ x F is in the dataset, the function u is strictly decreasing and continuous in p ˜ x , ε q over the region that excludes ˜ x “ ˜ x . In particular, under Assumption 1the bound cannot be improved. Parts (i) and (iv) describe the linear programs used for computation and stated asProposition 7 in the main text. Note that parts (ii) and (v) show that the boundsare finite, as claimed in Proposition 7. The other parts cover Proposition 8 stated inthe main text. 
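The sequence formula in part (ii) can be evaluated directly for small datasets. The Python sketch below is an illustration only, under assumptions made here: the function names, the 0-based index `start` standing in for S, the data layout, and the reading of "no cycles" as "no observation is repeated" are not from the original text. It returns the minimum over sequences beginning at observation S of the expression in part (ii), which bounds the utility difference between an arbitrary bundle and the observed bundle $x^S$ from above.

```python
from itertools import permutations


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def utility_difference_upper_bound(x_new, start, data, eps):
    """Evaluate the minimum in part (ii) over sequences starting at `start`.

    data:  list of (x_t, p_t) with x_t and p_t tuples of equal length K.
    start: 0-based index of the observed bundle x^S.
    eps:   approximation error, assumed to satisfy eps >= eps*.
    """
    others = [t for t in range(len(data)) if t != start]
    best = float("inf")
    for extra in range(len(others) + 1):
        for tail in permutations(others, extra):
            seq = (start,) + tail
            total = len(seq) * eps  # the M * eps term
            for m in range(len(seq) - 1):
                x_m, p_m = data[seq[m]]
                x_next, _ = data[seq[m + 1]]
                total += dot(p_m, [a - b for a, b in zip(x_next, x_m)])
            x_last, p_last = data[seq[-1]]
            total += dot(p_last, [a - b for a, b in zip(x_new, x_last)])
            best = min(best, total)
    return best


# Two hypothetical observations with K = 2 goods: (bundle, price vector).
observations = [((2.0, 1.0), (1.0, 2.0)), ((1.0, 2.0), (2.0, 1.0))]
bound = utility_difference_upper_bound((1.0, 1.0), 0, observations, 0.0)
bound_eps = utility_difference_upper_bound((1.0, 1.0), 0, observations, 0.5)
```

Because the minimum runs over finitely many affine functions of the new bundle and ε, the returned value is continuous and increasing in both, which is the shape restriction described in part (iii).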
Parts (ii) and (v) provide analytical characterizations of the bounds on utility differences. Parts (iii) and (vi) describe shape restrictions of the bounds.

Proof of Proposition A.2.
We first prove parts (i) and (ii).The definition of ε -rationalizability yields u p ˜ x , x S , ε q ď sup u ,...,u T , ˜ u P R ` ˜ u ´ u S s.t. u s ď u r ` p r ¨ p x s ´ x r q ` ε for all r, s P t , . . . , T u ˜ u ď u r ` p r ¨ p ˜ x ´ x r q ` ε for all r P t , . . . , T u ˜ u “ u r for all r P t , . . . , T u with ˜ x “ x r . We shall show the opposite inequality holds to prove (i), and in doing so characterizethe maximum as stated in part (ii). First, note that the problem on the right handside is feasible since for the dataset D “ tp x t , p t qu Tt “ , we assumed ε ě ε ˚ p D q . Weshow that there is a utility function ˜ u such that for any u , . . . , u T , ˜ u P R ` that arefeasible, ˜ u ´ u S ď ˜ u p ˜ x q ´ ˜ u p x S q . To that end, first consider feasible values u , . . . , u T , ˜ u . For any sequence that beginsat σ p q “ S , we can sum up the inequalities in the program to obtain˜ u ´ u S ď p σ p M q ¨ p ˜ x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε.
Thus,˜ u ´ u S ď min σ P Σ S p σ p M q ¨ p ˜ x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + , where Σ S is the set of sequences with σ p q “ S , have no cycles, and have length atleast M ě
1. We show in particular that provided ˜ x ‰ x S , the upper bound on theright hand side can be attained by the utility function ˜ u , defined for x ‰ x S by˜ u p x q “ min σ P Σ S p σ p M q ¨ p x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + , and defined for x S by ˜ u p x S q “
0. Note that the summation on the right side defining˜ u p x q is zero whenever M “ u is not continuous at x S , which is key for our arguments.We show that this utility function rationalizes the data. For any x P R K ` , it followsthat for any t P t , . . . , T u such that x t ‰ x S ,˜ u p x q ´ p t ¨ x ď p t ¨ p x ´ x t q ` p σ ˚ ,t p M ˚ ,t q ¨ p x t ´ x σ ˚ ,t p M ˚ ,t q q` M ˚ ,t ´ ÿ m “ p σ ˚ ,t p m q ¨ p x σ ˚ ,t p m ` q ´ x σ ˚ ,t p m q q ` p M ˚ ,t ` q ε ´ p t ¨ x “ p σ ˚ ,t p M ˚ ,t q ¨ p x t ´ x σ ˚ ,t p M ˚ ,t q q` M ˚ ,t ´ ÿ m “ p σ ˚ ,t p m q ¨ p x σ ˚ ,t p m ` q ´ x σ ˚ ,t p m q q ` p M ˚ ,t ` q ε ´ p t ¨ x t “ ˜ u p x t q ´ p t ¨ x t ` ε where σ ˚ ,t P Σ S is a sequence that obtains the minimum of ˜ u p x t q and M ˚ ,t is thelength of that sequence.Lastly, consider the observation S P t , . . . , T u . For any x P R K ` , it follows that˜ u p x q ´ p S ¨ x ď p S ¨ p x ´ x S q ` ε ´ p S ¨ x “ ˜ u p x S q ´ p S ¨ x S ` ε where the inequality follows by looking at the sequence length one which only hasobservation S and the equality follows since ˜ u p x S q “ u p ˜ x q ´ ˜ u p x S q “ min σ P Σ S p σ p M q ¨ p ˜ x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + ď u p ˜ x , x S , ε q . The inequality holds because ˜ u ε -rationalizes the dataset. The first part of the proofof the proposition established u p ˜ x , x S , ε q ď ˜ u p ˜ x q ´ ˜ u p x S q . This proves part (ii).To prove part (i), note that we can use the function ˜ u p x q to generate utility num-bers that satisfy the inequality and equality conditions in the linear programmingformulation. Indeed, set u t “ ˜ u p x t q ´ min x Pt x s u Ts “ Y ˜ x t ˜ u p x qu for t P t , . . . , T u and46 u “ ˜ u p ˜ x q ´ min x Pt x s u Ts “ Y ˜ x t ˜ u p x qu . We subtract the minimum because ˜ u p x q can benegative. 
Here the values u t and ˜ u are weakly positive and satisfy the inequalities.To prove (iii), recall that the upper bound is the minimum of finitely many affinefunctions as shown in (ii), except at ˜ x “ x S when ε ą
0. Each such function isstrictly increasing and continuous in p ˜ x , ε q . From this we conclude that u is strictlyincreasing and continuous in p ˜ x , ε q (except at ˜ x “ x S when ε ą ε subject to the constraint that the bound is defined, we see thatunder Assumption 1 the bound cannot be tightened. Note that here we use that forthe special case ˜ x “ ˜ x , u p ˜ x , ˜ x , ε q “
0, which is also the tightest possible underAssumption 1 .The proofs for (iv)-(vi) are analogous since u p ˜ x , ˜ x , ε q “ ´ u p ˜ x , ˜ x , ε q , and are omit-ted. Proof of Proposition 9.
First suppose ˜ x is not in the dataset. Since ε ě ε ˚ thereis some utility function u that ε -rationalizes the dataset. If we modify the utilityfunction to make u p ˜ x q arbitrarily negative, the modified function still rationalizesthe dataset. Note that this modified function satisfies local nonsatiation, but is not(globally) strictly increasing or concave. This proves u p ˜ x , ˜ x , ε q “ 8 .Now instead suppose ˜ x is not in the dataset. For any utility function that ε -rationalizes the dataset, we can modify u p ˜ x q to be arbitrariliy negative. Such amodified utility function still rationalizes the dataset, and so u p ˜ x , ˜ x , ε q “ ´8 . A.3 Proofs for Section 4.2
We prove a stronger and more formal version of Proposition 10, explicitly describinga tractable linear program. To state this result, relative to the main text we useargument ˜ p T ` in place of ˜ p and ˜ p T ` in place of ˜ p . We use this notation becausein the proofs it is helpful to think of these as extra observations relative to a datasetof T observations. Proposition A.3.
For a dataset tp x t , p t qu Tt “ , let ε ě ε ˚ . Whenever V p ˜ p T ` , ˜ p T ` , ε q is finite, the change in the approximate indirect utility can be bounded by the following inear program: V p ˜ p T ` , ˜ p T ` , ε q “ max ˜ x T ` , ˜ x T ` P R K ` u ,...,u T , ˜ u T ` , ˜ u T ` P R ` ˜ u T ` ´ ˜ p T ` ¨ ˜ x T ` ´ ˜ u T ` ` ˜ p T ` ¨ ˜ x T ` s.t. u s ď u r ` p r ¨ p x s ´ x r q ` ε for all r, s P t , . . . , T u ˜ u T ` ď u r ` p r ¨ p ˜ x T ` ´ x r q ` ε for all r P t , . . . , T u ˜ u T ` ď u r ` p r ¨ p ˜ x T ` ´ x r q ` ε for all r P t , . . . , T u u r ď ˜ u T ` ` ˜ p T ` ¨ p x r ´ ˜ x T ` q ` ε for all r P t , . . . , T u u r ď ˜ u T ` ` ˜ p T ` ¨ p x r ´ ˜ x T ` q ` ε for all r P t , . . . , T u ˜ u T ` ď ˜ u T ` ` ˜ p T ` ¨ p ˜ x T ` ´ ˜ x T ` q ` ε ˜ u T ` ď ˜ u T ` ` ˜ p T ` ¨ p ˜ x T ` ´ ˜ x T ` q ` ε. Moreover, when V p ˜ p T ` , ˜ p T ` , ε q is finite it is the minimum of the same problem.Under Assumption 1 ( ε “ ε ˚ ), these bounds cannot be improved. Note that we do not impose the constraint that if ˜ p T ` “ p t for some t , then ˜ x T ` “ x t .This is because the observed demand x t is not known to exactly maximize utility atthe price p t so we must account for the fact that ˜ x T ` can differ. Proof of Proposition A.3 (and Proposition 10).
Recall V p ˜ p T ` , ˜ p T ` , ε q “ sup t u | u ε ´ rationalizes tp x t ,p t qu Tt “ u (cid:32) V u,A p ˜ p T ` , ε q ´ V u,A p ˜ p T ` , ε q ( . For a utility function u , price ˜ p , and bound on approximate optimization given by ε ,let the set of approximate optimizers be given by AO u p ˜ p, ε q “ t x P R K ` | u p x q ´ p ¨ x ě V u p p q ´ ε u .
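The set of approximate optimizers can be visualized numerically. The following Python sketch is a rough illustration under assumptions made here: it treats a single good, replaces the supremum defining the indirect utility with a maximum over a finite grid, and uses a hypothetical concave utility $u(x) = 2\sqrt{x}$. None of the names below appear in the original text.

```python
import math


def approximate_optimizers(u, price, eps, grid):
    """Grid points whose quasilinear payoff u(x) - price * x is within eps
    of the best payoff found on the grid (a stand-in for V_u(price))."""
    payoffs = [(x, u(x) - price * x) for x in grid]
    best = max(value for _, value in payoffs)
    return [x for x, value in payoffs if value >= best - eps]


def concave_utility(x):
    """Hypothetical utility used only for this illustration."""
    return 2.0 * math.sqrt(x)


grid = [0.5 * i for i in range(6)]  # 0.0, 0.5, ..., 2.5
exact = approximate_optimizers(concave_utility, 1.0, 0.0, grid)
near_optimal = approximate_optimizers(concave_utility, 1.0, 0.3, grid)
```

With ε = 0 only the exact maximizer on the grid survives, while a positive ε admits a whole interval of bundles. The upper and lower approximate indirect utilities correspond to the largest and smallest payoffs attained on this set.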
48e can write V p ˜ p T ` , ˜ p T ` , ε q “ sup t u | u ε ´ rationalizes tp x t ,p t qu Tt “ u sup ˜ x T ` P AO u p ˜ p T ` ,ε q ` u p ˜ x T ` q ´ ˜ p T ` ¨ ˜ x T ` ˘ ´ inf ˜ x T ` P AO u p ˜ p T ` ,ε q ` u p ˜ x T ` q ´ ˜ p T ` ¨ ˜ x T ` ˘ + . We can write the difference as V p ˜ p T ` , ˜ p T ` , ε q ď sup u ˜ x T ` , ˜ x T ` P R K ` u p ˜ x T ` q ´ ˜ p T ` ¨ ˜ x T ` ´ u p ˜ x T ` q ` ˜ p T ` ¨ ˜ x T ` s.t. u p x t q ´ p t ¨ x t ě sup x P R K ` u p x q ´ p t ¨ x ´ ε @ t P t , . . . , T u u p ˜ x T ` q ´ ˜ p T ` ¨ ˜ x T ` ě sup x P R K ` u p x q ´ ˜ p T ` ¨ x ´ εu p ˜ x T ` q ´ ˜ p T ` ¨ ˜ x T ` ě sup x P R K ` u p x q ´ ˜ p T ` ¨ x ´ ε. The first inequality constraint imposes the requirement that u ε -rationalizes thedataset. The second inequality constraint only involves the variables ˜ x T ` , andso when we take a supremum this is the upper approximate indirect utility V u,A p ˜ p T ` , ε q “ V u p ˜ p T ` q . The third inequality constraint has infimum (over ˜ x T ` )at the lower approximate indirect utility V u,A p ˜ p T ` , ε q .Consider the feasibility region of this problem. Checking the inequalities for all x isweakly more restrictive than checking for x P t x , . . . , x T , ˜ x T ` , ˜ x T ` u . Thus, we willreplace the suprema over all x P R K ` with a finite collection of inequalities involving t x , . . . , x T , ˜ x T ` , ˜ x T ` u . In addition, searching over all utility functions to satisfythese inequalities is weakly more restrictive than searching over all utility numbers.From these two monotonicity observations, and the fact that the value function is49onotone in its feasibility region (with regard to set inclusion), we obtain V p ˜ p , ˜ p , ε q ď sup ˜ x T ` , ˜ x T ` P R K ` u ,...,u T , ˜ u T ` , ˜ u T ` P R ` ˜ u T ` ´ ˜ p T ` ¨ ˜ x T ` ´ ˜ u T ` ` ˜ p T ` ¨ ˜ x T ` s.t. u s ď u r ` p r ¨ p x s ´ x r q ` ε for all r, s P t , . . . , T u ˜ u T ` ď u r ` p r ¨ p ˜ x T ` ´ x r q ` ε for all r P t , . . . 
, T u ˜ u T ` ď u r ` p r ¨ p ˜ x T ` ´ x r q ` ε for all r P t , . . . , T u u r ď ˜ u T ` ` ˜ p T ` ¨ p x r ´ ˜ x T ` q ` ε for all r P t , . . . , T u u r ď ˜ u T ` ` ˜ p T ` ¨ p x r ´ ˜ x T ` q ` ε for all r P t , . . . , T u ˜ u T ` ď ˜ u T ` ` ˜ p T ` ¨ p ˜ x T ` ´ ˜ x T ` q ` ε ˜ u T ` ď ˜ u T ` ` ˜ p T ` ¨ p ˜ x T ` ´ ˜ x T ` q ` ε. We will now show the opposite inequality holds. First, recall that Proposition 2shows that this program is feasible provided ε ě ε ˚ . Let u , . . . , u T , ˜ u T ` , ˜ u T ` , ˜ x T ` ,and ˜ x T ` denote some values that are feasible. Construct the augmented dataset tp x t , p t qu T ` t “ that has p x T ` , p T ` q “ p ˜ x T ` , ˜ p T ` q and p x T ` , p T ` q “ p ˜ x T ` , ˜ p T ` q .Construct a utility function as˜ u p x q “ min σ P Σ T ` p σ p M q ¨ p x ´ x σ p M q q ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + , for x ‰ x T ` , where Σ T ` is the set of sequences in the augmented dataset that startwith σ p q “ T `
1, have no cycles, and have length at least M ě
1. Finally, set˜ u p x T ` q “
0. The proof of Proposition A.2 shows that this function ε -rationalizes theaugmented dataset tp x t , p t qu T ` t “ . Moreover, Proposition A.2 also established˜ u T ` ´ ˜ u T ` ď ˜ u p x T ` q ´ ˜ u p x T ` q . (9)Recall we set p x T ` , p T ` q “ p ˜ x T ` , ˜ p T ` q and p x T ` , p T ` q “ p ˜ x T ` , ˜ p T ` q for hypo-50heticals. We conclude u T ` ´ p T ` ¨ x T ` ´ p u T ` ´ p T ` ¨ x T ` qď ˜ u p x T ` q ´ p T ` ¨ x T ` ´ p ˜ u p x T ` q ´ p T ` ¨ x T ` qď V ˜ u,A p p T ` , ε q ´ V ˜ u,A p p T ` , ε qď V p ˜ p T ` , ˜ p T ` , ε q . The first inequality uses (9). The second inequality holds because for ˜ u , x T ` is anapproximate optimizer given p T ` and similarly for x T ` . The third inequality is thedefinition of the bounds on approximate indirect utility. Since this is true for anyfeasible values, we conclude that V p ˜ p T ` , ˜ p T ` , ε q is obtained by the linear programdescribed in the proposition. Recall that while we have used suprema throughout, inthis last step since we have established V p ˜ p T ` , ˜ p T ` , ε q as the (bounded) value of alinear program, we know the supremum is attained by Lemma C.4.Finally, we note that the bounds are weakly monotone in ε by Proposition 13. For ε ă ε ˚ we know the bounds are not defined because no utility function ε -rationalizesthe dataset. Thus, the bounds are the tightest possible under Assumption 1 ( ε “ ε ˚ ). Proof of Proposition 11.
Step 1 provides the upper bound on V . Step 2 provides thelower bound on V . Recall that ˜ p “ P S for some S P t , . . . , T u . Step 1.
Recall from Equation 5 that for any r P t , . . . , T u and p P R K `` , V u p p r q ´ V u p p q ď x r ¨ p p ´ p r q ` ε. By summing up such inequalities over sequences, we obtain the upper bound V u p p S q ´ V u p ˜ p q ď min σ P Σ S x σ p M q ¨ p ˜ p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + . Since V u,A p p S , ε q ´ V u,A p ˜ p , ε q ď V u p p S q ´ V u p ˜ p q ` ε
51y construction, we prove that V p p S , ˜ p , ε q ď min σ P Σ S x σ p M q ¨ p ˜ p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + ` ε. Step 2.
We now establish the lower bound for V p p S , ˜ p , ε q . Define V p p q “ ´ min σ P Σ S x σ p M q ¨ p p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + . Let V denote the minimum of V p p q over the convex hull of t p t u Tt “ , which is attainedand finite because V is continuous and the convex hull here is compact. Define V p p q “ max t V p p q , V u .First we show that V satisfies a set of inequalities that are a dual version of ε -rationalizability. Duality is considered in more detail in the Supplemental Ap-pendix C.3, and we use several results from that Appendix.For some t P t , . . . , T u , let σ ˚ ,t P Σ S be a sequence that obtains the minimum of V p p t q , and let M ˚ ,t be the length of that sequence. We have ´ V p p q ´ x t ¨ p ď x t ¨ p p ´ p t q ` x σ ˚ ,t p M ˚ ,t q ¨ p p t ´ p σ ˚ ,t p M ˚ ,t q q` M ˚ ,t ´ ÿ m “ p σ ˚ ,t p m q ¨ p x σ ˚ ,t p m ` q ´ x σ ˚ ,t p m q q ` p M ˚ ,t ` q ε ´ x t ¨ p “ ´ V p p t q ´ x t ¨ p t ` ε. Recall that V p p q ě V p p q and V p p t q “ V p p t q for t P t , . . . , T u . This implies V p p q ě V p p t q ´ x t ¨ p p ´ p t q ´ ε (10)for any t P t , . . . , T u and any p P R K .Define u V : R K Ñ R Yt´8u by u V p x q “ inf p P R K ` V p p q` p ¨ x . Proposition C.1(ii) showsthat u V ε -rationalizes the dataset. Note that since u V p x q ě V , u V is everywherefinite.The function V is the maximum of finitely many affine functions, each weakly de-52reasing in p , and is hence continuous, weakly decreasing, and convex. We concludefrom Lemma C.7 that V “ V u V . In particular, V is the indirect utility function for u V and we have established that u V ε -rationalizes the dataset. We conclude that V p p S q ´ V p ˜ p q ď V u V ,A p p S , ε q ´ V u V ,A p ˜ p , ε q ď V p p S , ˜ p , ε q . (11)We now characterize V p p S q ´ V p ˜ p q to state the lower bound on the proposition.Since p S is in the convex hull of prices, V p p S q “ V p p S q . 
Note that V p p S q ď ε -rationalized by quasilinear utility (see Lemma C.1); thisrelies on the fact that the sum of each sequence defining V makes a cycle because itbegins and ends at p S . In addition, by considering a sequence of length 1, V p p S q ě´ ` x S ¨ p p S ´ p S q ` ε ˘ “ ´ ε . Thus, V p p S q ´ V p ˜ p q ě ´ ε ´ V p ˜ p q“ min σ P Σ S x σ p M q ¨ p ˜ p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + ´ ε. So from (11), V p p S , ˜ p , ε q ě min σ P Σ S x σ p M q ¨ p ˜ p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + ´ ε “ h p ˜ p q ´ ε, establishing the lower bound.It is worth noting that the arguments above rely on the inequality ´ ε ď V p p S q ď V p p S q we actually prove the stronger result53ere that V p p S , ˜ p , ε q ě min σ P Σ S x σ p M q ¨ p ˜ p ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + ´ min σ P Σ S x σ p M q ¨ p p S ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + “ h p ˜ p q´ min σ P Σ S x σ p M q ¨ p p S ´ p σ p M q q ` M ´ ÿ m “ x σ p m q ¨ p p σ p m ` q ´ p σ p m q q ` M ε + . Thus, the lower bound h p ˜ p q stated in the proposition can be tightened a bit. Proof of Proposition 12.
Recall for this proof we consider when K “ p ą min t p , . . . , p T , ˜ p u , V p ˜ p , ˜ p , q ď ż x p t ˜ p ` p ´ t q ˜ p , q ` ˜ p ´ ˜ p ˘ dt. For any utility u function that admits maximizers over p ą min t p , . . . , p T , ˜ p u , write x u p p q P argmax x P R ` u p x q ´ px for some selector from the argmax correspondence. We first argue V p ˜ p , ˜ p , q “ sup t u P U | u rationalizes tp x t ,p t qu Tt “ u ż x u ` t ˜ p ` p ´ t q ˜ p ˘ ` ˜ p ´ ˜ p ˘ dt where the supremum is over the set of utility functions U that admit a maximizerover p ą min t p , . . . , p T , ˜ p u .To that end, note that ˜ p P CCo pt p t u Tt “ Y ˜ p q . From Proposition 13, V p ˜ p , ˜ p , q canbe written as a supremum of differences in indirect utility, where each indirect utility V u is finite for p ě t p , . . . , p T , ˜ p u . Each V u is convex, and for such functions, thesubdifferential B V u p p q “ t x P R ` | V u p ˜ p q ě V u p p q ` x p ˜ p ´ p q @ ˜ p P R `` u
54s nonempty for any p ą min t p , . . . , p T , ˜ p u . See the proof of Lemma C.2. Wecan construct a function ˜ x V u by selecting from the subdifferential, so that for each p ą min t p , . . . , p T , ˜ p u ˜ x V u p p q P B V u p p q . Any such function satisfies the formula V u p ˜ p q ´ V u p ˜ p q “ ż ˜ x V u ` t ˜ p ` p ´ t q ˜ p ˘ ` ˜ p ´ ˜ p ˘ dt. (12)See for example Rockafellar [2015], Corollary 24.2.1, or Chambers and Echenique[2017], Theorem 2. (The selector ˜ x V u p p q is always Reimann integrable.)We note that for any V u such that u ε -rationalizes the dataset, the utility function u V u p x q “ inf p P R K V p p q` px also ε -rationalizes the dataset from Lemma C.1. Moreover, u V u is concave, weakly increasing, and satisfies V u Vu “ V u from Lemmas C.6 and C.7.From this and the proof of Lemma C.2 we conclude that ˜ x V u is a maximizer of u V u .Summing up, we conclude that it is without loss of generality to consider utilityfunctions U that induce a maximizer for all p ą min t p , . . . , p T , ˜ p u ..Putting these arguments together, we conclude that V p ˜ p , ˜ p , q “ sup t u | u rationalizes tp x t ,p t qu Tt “ u (cid:32) V u,A p ˜ p , q ´ V u,A p ˜ p , q ( “ sup t u | u rationalizes tp x t ,p t qu Tt “ u (cid:32) V u p ˜ p q ´ V u p ˜ p q ( “ sup t u P U | u rationalizes tp x t ,p t qu Tt “ u (cid:32) V u p ˜ p q ´ V u p ˜ p q ( “ sup t u P U | u rationalizes tp x t ,p t qu Tt “ u ż x u ` t ˜ p ` p ´ t q ˜ p ˘ ` ˜ p ´ ˜ p ˘ dt ď ż x p t ˜ p ` p ´ t q ˜ p , q ` ˜ p ´ ˜ p ˘ dt. The first equality is the definition. The second equality uses the fact that when ε “ p ą min t p , . . . , p T , ˜ p u . The fourth equalityuses (12). The first inequality uses the fact that x is a pointwise maximizer of55emand induced by u that are ε -rationalized by the dataset.Now it remains to show the opposite inequality. We first show that x is the demandinduced by some quasilinear utility function. When K “ ε “
0, for p ą min t p , . . . , p T , ˜ p u , it follows that x p p q is finite from Proposition 5. Consider thesets E “ tp x p p q , p qu p ą min t p ,...,p T , ˜ p u and E “ tp x t , p t qu Tt “ . We argue that for e “p e x , e p q P t E Y E u and e “ p e x , e p q P t E Y E u , the inequality p e x ´ e x qp e p ´ e p q ď e, e P E this follows from Proposition 6.When e, e P E , the inequality holds by Lemma C.1 and the fact that ε “
0. When e P E and e P E , the result follows from Equation 3. This covers all cases.Now consider the set E “ E Y E . From Rockafellar [2015], p. 240 the set E is thegraph of a multivalued mapping that is cyclically monotononically decreasing. FromRockafellar [2015], Theorem 24.3 there is some lower semicontinuous convex function f such that for any e “ p e x , e p q P E , ´ e x P B f p e p q , where B f denotes the subdifferential of f . Let f ˚ denote the convex conjugate of f , which is defined in Appendix C.3. We conclude from Rockafellar [2015], Theorem23.5 that for each e “ p e x , e p q P E , ´ e x P argmax x xe p ´ f ˚ p x q . Since e x ě
0, we conclude via a change in variables that e x P argmax x ě ´ xe p ´ f ˚ p´ x q . We conclude that by setting ˜ u p x q “ ´ f ˚ p´ x q , we have for each e P E , the price e p induces the demand e x as some exact maximizer of a quasilinear utility function. Note that we differ from the statement of Theorem 24.3 in Rockafellar [2015] because we considercyclically monotonically decreasing mappings while that result considers increasing mappings; thisis why we need to take a negative involving e x .
56n particular, recalling E “ E Y E and using that E is the graph of x over p ą min t p , . . . , p T , ˜ p u , we conclude that x is a demand function generated byquasilinear utility with ˜ u . Moreover, recall ˜ u rationalizes the dataset. Thus, ż x p t ˜ p ` p ´ t q ˜ p q dt ` ˜ p ´ ˜ p ˘ ď sup t u P U | u rationalizes tp x t ,p t qu Tt “ u ż x u ` t ˜ p ` p ´ t q ˜ p ˘ ` ˜ p ´ ˜ p ˘ dt “ V p ˜ p , ˜ p , q . Proof of Proposition 13.
To show shape restrictions on V p ˜ p , ˜ p , ε q , recall V p ˜ p , ˜ p , ε q “ sup t u | u ε ´ rationalizes tp x t ,p t qu Tt “ u (cid:32) V u,A p ˜ p , ε q ´ V u,A p ˜ p , ε q ( , where the supremum is over u such that V u,A p ˜ p , ε q ‰ 8 . For any u that ε -rationalizesthe data, we can add or subtract a constant to u and the new utility function ratio-nalizes the data as well. In addition, if we let u ` a be the utility u plus the constant a , then V u,A p ˜ p , ε q ´ V u,A p ˜ p , ε q “ V u ` a,A p ˜ p , ε q ´ V u ` a,A p ˜ p , ε q . Thus, it is thus without loss of generality to restrict u such that V u,A p ˜ p , ε q “
0. Nowrecall the upper approximate indirect utility satisfies V u,A p ˜ p , ε q “ V u p ˜ p q . Combiningthese arguments we can write V p ˜ p , ˜ p , ε q “ sup t u | u ε ´ rationalizes tp x t ,p t qu Tt “ u (cid:32) V u p ˜ p q ´ ( , where the supremum is over u such that V u,A p ˜ p , ε q “
0. We know that each V u isconvex, weakly decreasing, and lower semicontinuous by Lemma C.6. Note that thisis true for any u , regardless of whether it is concave or upper semicontinuous.We conclude that when viewing V p ˜ p , ˜ p , ε q only as a function of ˜ p , it is the supremum(over u ) of convex, weakly decreasing, lower semicontinuous functions. It is thereforeconvex, weakly decreasing, and lower semicontinuous in ˜ p from Rockafellar [2015],57heorems 5.5 and 9.4.To see that V p ˜ p , ˜ p , ε q is weakly increasing in ε , we recall the characterization asa linear program in Proposition A.3. The feasibility region of the linear program isweakly increasing (with regard to set inclusion) in ε , and so the value function of theproblem is weakly increasing in ε .We now establish the finiteness properties in the proposition. Let ˜ p P CCo pt p t u Tt “ q .We show V p ˜ p , ˜ p , ε q is finite. We can write˜ p ě T ÿ t “ α t p t for some nonnegative α , . . . , α T that sum to 1 where the inequality holds componen-twise. We have V u p ˜ p q ď V u ˜ T ÿ t “ α t p t ¸ ď T ÿ t “ α t V u p p t q , where the first inequality follows because V u is weakly decreasing, and the secondinequality follows because V u is convex. Recall from (5) in the main text that for any u that ε -rationalizes the dataset, we have V u,A p p t , ε q ´ V u,A p ˜ p , ε q ď V u p p t q ´ V u p ˜ p q ` ε ď p x t ¨ p ˜ p ´ p t q ` ε q ` ε for any t P t , . . . , T u . Combining the previous steps we obtain V p ˜ p , ˜ p , ε q ď T ÿ t “ α t x t ¨ p ˜ p ´ p t q ` ε ă 8 . Now we show that if ˜ p R CCo pt p t u Tt “ Y ˜ p q , then V p ˜ p , ˜ p , ε q “ 8 . First note thatfrom Proposition 2, there is some utility function that ε -rationalizes the dataset andhas a maximizer at the price ˜ p . Let ˜ x denote such a maximizer. Now constructthe augmented dataset tp x t , p t qu T ` t “ , where p x T ` , p T ` q “ p ˜ x , ˜ p q . 
Note that by con-struction, there is some utility function u that ε -rationalizes the augmented dataset58 p x t , p t qu T ` t “ . For any such function u , the indirect utility satisfies V u p p q ă 8 forany p P CCo pt p t u Tt “ Y ˜ p q from the finiteness arguments above. Thus, it remainsto show that when p R CCo pt p t u Tt “ Y ˜ p q , there is some u that ε -rationalizes theaugmented dataset and satisfies V u p p q “ 8 . That end, fix p x , p q P tp x t , p t qu T ` t “ andlet Σ denote the set of finite sequences of t P t , . . . , T u with no cycles that beginsat σ p q “
1. Define U p x q “ min σ P Σ p σ p M q ¨ ` x ´ x σ p M q ˘ ` M ´ ÿ m “ p σ p m q ¨ p x σ p m ` q ´ x σ p m q q ` M ε + , where M corresponds to the length of a particular sequence. Allen and Rehbeck[2020] have shown that for ε ě ε ˚ , this function ε -rationalizes the augmented dataset tp x t , p t qu T ` t “ .Since ˜ p R CCo pt p t u Tt “ , ˜ p q from the separating hyerplane theorem, there is some x P R K with x ‰ p p ´ ˜ p q ¨ x ą p P CCo pt p t u Tt “ Y ˜ p q . We argue by contradiction that x contains no negative components. Indeed, supposeit does so that x k ă k . Since ˜ p P R K `` and CCo pt p t u Tt “ Y ˜ p q is uppercomprehensive, we can find some p P CCo pt p t u Tt “ Y ˜ p q with p k high enough so that p p ´ ˜ p q ¨ x ă
0. We reach a contradiction and conclude x P R K ` and x ‰ U p x q , the minimum is taken over certain functions thatinvolve p t ¨ x for some t , plus a constant. Thus, for shorthand write U p x q “ min t Pt ,...,T ` u min a P A t t p t ¨ x ` a u for certain finite sets A t corresponding to the sums in the construction of U p x q . Weconclude thatlim λ Ñ8 U p λx q ´ ˜ p ¨ λx “ lim λ Ñ8 min t Pt ,...,T ` u min a P A t (cid:32) p p t ´ ˜ p q ¨ λx ` a ( “ 8 . This establishes that V U p ˜ p q “ 8 and so since V U p ˜ p q ă 8 we conclude V p ˜ p , ˜ p , ε q ě V U p ˜ p q ´ V U p ˜ p q “ 8 . 59 .4 Proofs for Section 5 In order to prove Proposition 14, we first prove a lemma. The lemma establishes aconvexity property of the set of counterfactual quantities at a given price, X p ˜ p, D, ε q “ t ˜ x | p ˜ x, ˜ p q P C p D, ε qu . Lemma A.1.
Let $D_0 = \{(d_{0,t}, p_t)\}_{t=1}^T$ and $D_1 = \{(d_{1,t}, p_t)\}_{t=1}^T$ differ only for quantities. If $\tilde d_j \in X(\tilde p, D_j, \varepsilon_j)$ for $j \in \{0, 1\}$, then for any $\alpha \in [0, 1]$,
\[ \alpha \tilde d_0 + (1-\alpha) \tilde d_1 \in X\bigl(\tilde p, \alpha D_0 + (1-\alpha) D_1, \alpha \varepsilon_0 + (1-\alpha) \varepsilon_1\bigr). \]
In addition, the set $A$ described in Proposition 14 is convex.

Proof. The sets $C(D, \varepsilon)$ and $X(\tilde p, D, \varepsilon)$ can be characterized by using any of the equivalent statements of Lemma C.1, applied to the counterfactual-augmented dataset. In particular, for an arbitrary (hypothetical) dataset $D_j = \{(d_{j,t}, p_t)\}_{t=1}^T$, $\tilde d_j \in X(\tilde p, D_j, \varepsilon_j)$ means there are numbers $u_{j,1}, \dots, u_{j,T}, \tilde u_j \in \mathbb{R}_+$ such that
\[ u_{j,s} \le u_{j,r} + p_r \cdot (d_{j,s} - d_{j,r}) + \varepsilon_j \quad \text{for all } r, s \in \{1, \dots, T\}, \]
\[ \tilde u_j \le u_{j,r} + p_r \cdot (\tilde d_j - d_{j,r}) + \varepsilon_j \quad \text{for all } r \in \{1, \dots, T\}, \]
\[ u_{j,r} \le \tilde u_j + \tilde p \cdot (d_{j,r} - \tilde d_j) + \varepsilon_j \quad \text{for all } r \in \{1, \dots, T\}. \]
In addition, $\tilde d_j$ must be non-negative. We can take a convex combination of the values for $j = 0$ and $j = 1$. For example, for any $r, s$ we have
\[ \alpha u_{0,s} + (1-\alpha) u_{1,s} \le \alpha u_{0,r} + (1-\alpha) u_{1,r} + p_r \cdot \bigl(\alpha d_{0,s} + (1-\alpha) d_{1,s} - (\alpha d_{0,r} + (1-\alpha) d_{1,r})\bigr) + \alpha \varepsilon_0 + (1-\alpha) \varepsilon_1. \]
Since by Lemma C.1 the inequalities displayed above are the only ones we need to check, we obtain that
\[ \alpha \tilde d_0 + (1-\alpha) \tilde d_1 \in X\bigl(\tilde p, \alpha D_0 + (1-\alpha) D_1, \alpha \varepsilon_0 + (1-\alpha) \varepsilon_1\bigr). \]
Finally, convexity of $A$ follows from similar averaging of the inequalities characterizing $\varepsilon$-rationalizability in Lemma C.1(iii).

Proof of Proposition 14.
Recall we now include quantities as arguments of the bounds. In more detail, write
\[ \overline{x}_k(d, \tilde p, \varepsilon) = \max_{\tilde x \in X(\tilde p, D, \varepsilon)} \tilde x_k, \]
where $D = \{(d_t, p_t)\}_{t=1}^T$ and we work in the extended reals so that $\overline{x}_k(d, \tilde p, \varepsilon)$ may be $\infty$. Addition is defined as $a + \infty = \infty$ provided $a$ is not $-\infty$.

Let $\tilde d_0, \tilde d_1 \in \mathbb{R}^{K \times T}$ be arbitrary quantities datasets. Since $\overline{x}_k$ is a maximum, we obtain
\[ \overline{x}_k\bigl(\alpha \tilde d_0 + (1-\alpha) \tilde d_1, \tilde p, \alpha \varepsilon_0 + (1-\alpha) \varepsilon_1\bigr) \ge \alpha\, \overline{x}_k(\tilde d_0, \tilde p, \varepsilon_0) + (1-\alpha)\, \overline{x}_k(\tilde d_1, \tilde p, \varepsilon_1), \]
because from Lemma A.1 the weighted average of the values is feasible. This establishes concavity of $\overline{x}_k$ in its non-price arguments. Since $\underline{x}_k$ is a minimum we obtain
\[ \underline{x}_k\bigl(\alpha \tilde d_0 + (1-\alpha) \tilde d_1, \tilde p, \alpha \varepsilon_0 + (1-\alpha) \varepsilon_1\bigr) \le \alpha\, \underline{x}_k(\tilde d_0, \tilde p, \varepsilon_0) + (1-\alpha)\, \underline{x}_k(\tilde d_1, \tilde p, \varepsilon_1), \]
and so the lower bound is convex in its non-price arguments.

To establish continuity of $\overline{x}_k(\cdot, \tilde p, \cdot)$ as stated in the proposition, recall the linear programming formulation in A.1. We see that quantities $d$ and approximation error $\varepsilon$ enter additively (relative to the choice variables) in the inequalities describing the feasibility region. That is, they are part of the "b" in the canonical linear programming formulation from Appendix C.2. Continuity then follows from Lemma C.4.

Proof of Proposition 15.
The proof is analogous to the proof of Proposition 14 and so we only outline it. Recall that $\overline{V}$ and $\underline{V}$ are described by a linear program in Proposition 10. By inspecting the feasibility region of this program we see convexity holds similar to Lemma A.1. From this, we conclude that $\overline{V}$ satisfies the concavity property in the proposition because it is a maximum, and $\underline{V}$ satisfies the convexity property in the proposition because it is a minimum.

Continuity of $\overline{V}$ and $\underline{V}$ in $(d, \varepsilon)$ over the region where these bounds are finite follows from the linear programming formulation in Proposition A.3 and Lemma C.4.

Proof of Proposition 16.
Convexity is established in Proposition 3 in Allen and Rehbeck [2020]. Continuity follows from the linear programming characterization of $\varepsilon^*$ in Proposition 2 in Allen and Rehbeck [2020]. Indeed, $\varepsilon^*$ can be written as a maximum of finitely many functions that are affine in quantities $d$.

Proof of Corollary 1.
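By assumption $\overline{x}_k(d, \tilde p, \varepsilon^*(d))$ is finite. By construction, $(\hat d_n, \varepsilon(\hat d_n)) \in A$ for each $n$. Then from Propositions 14, 16, and the continuous mapping theorem,
\[ \overline{x}_k\bigl(\hat d_n, \tilde p, \varepsilon(\hat d_n)\bigr) \xrightarrow{p} \overline{x}_k(d, \tilde p, \varepsilon^*(d)). \]
The arguments for $\underline{x}_k$, $\overline{V}$, and $\underline{V}$ are analogous.

Proof of Proposition 17.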
First note that the set A p ˜ x q is convex for any ˜ x P R K ` be-cause it is a projection of the set A , which is convex by Lemma A.1. The proofis analogous to the proofs of Propositions 14 and 15. The feasibility region of theprogram describing u and u is given in Proposition A.2. This feasibility regions of u and u are convex in quantities and degree of approximation error over the sets A p ˜ x q and A p ˜ x q respectively, similar to Lemma A.1. Since u is a supremum it is concave,and since u is an infimum it is convex.Continuity of u and u in p d, ε q over the region stated in the proposition follows fromthe characterization of the bounds in Proposition A.2(ii) and (iii). Indeed, u is themaximum of finitely many functions that are each affine in p d, ε q , and u is the mini-mum. Appendix B Alternative Approaches
Assumption 1 is the key conceptual assumption for this paper, which posits that approximation error is the same in new settings as in the data we have seen. We have operationalized this for counterfactual and welfare analysis with a number controlling approximation error as in Allen and Rehbeck [2020]. We now describe other potential ways to conduct counterfactual or welfare analysis. We also elaborate on the measurement and prediction wedges.

This paper focuses on approximation error being controlled by a single scalar. An alternative approach is to consider a multidimensional notion along the lines of Afriat [1972], Varian [1990], Varian [1991], Halevy et al. [2018], and Masten and Poirier [2018b]. We pursue this by allowing each observation to have its own value $\varepsilon^t_V$ of approximation error relative to exact optimization.

Definition B.1.
A dataset $\{(x_t, p_t)\}_{t=1}^T$ is $\varepsilon_V$-rationalized by quasilinear utility for $\varepsilon_V = (\varepsilon^1_V, \dots, \varepsilon^T_V) \in \mathbb{R}^T_+$ if there exists a utility function $u : \mathbb{R}^K_+ \to \mathbb{R}$ such that for all $t \in \{1, \dots, T\}$ and for all $x \in \mathbb{R}^K_+$, the following inequality holds:
\[ u(x_t) - p_t \cdot x_t \ge u(x) - p_t \cdot x - \varepsilon^t_V. \]
We also refer to the above by saying a dataset is $\varepsilon_V$-quasilinear rationalized.

We can apply this concept to counterfactual analysis, as in the main text, by considering datasets in which the last observation is the hypothetical. That is, for a dataset $\{(x_t, p_t)\}_{t=1}^{T+1}$, interpret the first $T$ observations as data we have seen and the last, $(T+1)$-th, observation as the counterfactual. The measurement wedges are controlled by the collection $(\varepsilon^1_V, \dots, \varepsilon^T_V)$ of values for the observed data, while the prediction wedge is the scalar $\varepsilon^{T+1}_V$. In principle we can consider counterfactuals involving several observations such as $T+1$ and $T+2$. We focus on the case of a single counterfactual for brevity.

In the main text, we set a single number controlling the measurement wedge and the prediction wedge. The full strength of Assumption 1 also imposes that these are equal to the minimal approximation error needed to explain the data we have seen. In this appendix we drop the assumptions that these wedges are the same, which shows how to generalize our framework when Assumption 1 is relaxed.

B.1 Counterfactuals with Multidimensional Approximation Error

Because $\varepsilon_V$-rationalization is multidimensional, in general there is no single "smallest" vector $\varepsilon_V$ such that the dataset is $\varepsilon_V$-rationalized by quasilinear utility. Nonetheless, we can define a set of such rationalizing vectors via
\[ E(\tilde D) = \bigl\{ \varepsilon_V \in \mathbb{R}^T_+ \mid \tilde D \text{ is } \varepsilon_V\text{-quasilinear rationalized} \bigr\}, \]
where we let $\tilde D = ((x_1, p_1), \dots, (x_T, p_T))$ be the observed dataset written as an ordered tuple. Note that we switch from an unordered dataset to an ordered tuple. The reason we care about the order of observations now is that the $t$-th dimension of $E(\tilde D)$ corresponds to approximation error associated with the $t$-th observation. We can conduct counterfactual analysis as before by considering the set of quantity-price tuples that do not make approximation error worse.

Formalizing "worse" here leads to some ambiguity since we do not have a total order on vectors. We consider two possibilities. To formalize these, let $\pi_T : \mathbb{R}^{T+1} \to \mathbb{R}^T$ denote the projection onto the first $T$ components. We can then define the sets of lower and upper approximate counterfactuals by
\[ \underline{AC}(\tilde D) = \bigl\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid \pi_T\bigl(E(\tilde D \times (\tilde x, \tilde p))\bigr) = E(\tilde D) \bigr\}, \]
\[ \overline{AC}(\tilde D) = \bigl\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid \pi_T\bigl(E(\tilde D \times (\tilde x, \tilde p))\bigr) \cap E(\tilde D) \ne \emptyset \bigr\}. \]
With minor abuse of notation we define $E(\cdot)$ in the obvious way for datasets of different dimensions. Clearly, $\underline{AC}(\tilde D) \subseteq \overline{AC}(\tilde D)$.
The smaller set formalizes that counterfactuals do not change the potential $\varepsilon_V$ vectors that rationalize the data we see when we add an observation to the existing data. This smaller set is conceptually closer to the original adaptive counterfactual set $AC(\cdot)$. The larger set formalizes that there is some $\varepsilon_V$ vector that $\varepsilon_V$-rationalizes both the original dataset and the counterfactual-augmented dataset $\tilde D \times (\tilde x, \tilde p)$.

While the sets $\underline{AC}$ and $\overline{AC}$ may appear to be intuitive alternatives to the adaptive approach presented in the main text, unfortunately these sets are trivial. To see this, note that the set $\underline{AC}(\tilde D)$ allows the prediction wedge $\varepsilon^{T+1}_V$ for the counterfactual value to be unbounded. This leads to trivial restrictions for both $\underline{AC}(\tilde D)$ and the larger set $\overline{AC}(\tilde D)$. However, nontriviality can be restored if we modify the sets by placing an a priori bound on approximation error at the new observation. To formalize this, let $\pi : \mathbb{R}^{T+1} \to \mathbb{R}$ denote the projection onto the last component. Then we can modify the smaller set via
\[ AC'(\tilde D, \varepsilon^{T+1}_V) = \bigl\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid \pi_T\bigl(E(\tilde D \times (\tilde x, \tilde p))\bigr) = E(\tilde D),\ \pi\bigl(E(\tilde D \times (\tilde x, \tilde p))\bigr) \le \varepsilon^{T+1}_V \bigr\}. \]
Thus, the prediction wedge is restricted by the number $\varepsilon^{T+1}_V$. This set is not data-adaptive in the sense that $\varepsilon^{T+1}_V$ needs to be chosen by the researcher. However, one can make this data-adaptive by using information obtained from other measures of approximation error discussed below.

B.2 Other Measures of Approximation Error
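As a rough preview of the three aggregators this subsection works with (the maximum, the sum, and the average of the observation-specific errors $\varepsilon^t_V$), the following Python sketch shows how each reacts when a counterfactual observation with its own prediction wedge is appended. The function names and the numbers are illustrative assumptions, not taken from the paper. The comparison also uses one fixed error vector, which is a simplification: the formal measures take an infimum over all rationalizing vectors.

```python
def agg_max(eps):
    """Max aggregator: the largest observation-specific error."""
    return max(eps)


def agg_sum(eps):
    """Sum aggregator: total of the observation-specific errors."""
    return sum(eps)


def agg_avg(eps):
    """Average aggregator: the sum divided by the number of observations."""
    return sum(eps) / len(eps)


# Hypothetical error vector for T = 2 observed data points (illustrative values).
observed = [0.25, 0.5]
# Hypothetical prediction wedge attached to a counterfactual observation T + 1.
prediction_wedge = 0.25
augmented = observed + [prediction_wedge]

# Print each aggregator before and after appending the counterfactual.
for name, agg in [("max", agg_max), ("sum", agg_sum), ("avg", agg_avg)]:
    print(name, agg(observed), agg(augmented))
```

In this example the max aggregator is unchanged because the wedge does not exceed the largest observed error, the sum aggregator rises with any positive wedge, and the average falls. This mirrors the discussion of which aggregators leave room for a prediction wedge.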
It is natural to wonder, for the multidimensional vector of approximation errors $\varepsilon_V$, whether other intuitive one-dimensional summaries can be used for counterfactual analysis. In fact, there can be many ways to do this depending on how one aggregates the approximation error. We consider general aggregators of the elements of multidimensional approximation error $\varepsilon_V$ that turn it into a one-dimensional measure of approximation error. Formally, an aggregator can be written $e^T : \mathbb{R}^T_+ \to [0, \infty)$. Higher values of the aggregator can be interpreted as more approximation error. Varian [1990] and Halevy et al. [2018] consider a related notion in the standard consumer problem for general utility maximization.

Footnote: Many of the convenient shape restrictions we obtain in this paper do not hold for the case of general utility maximization since the constraint set of consistent utility indices is non-convex.

Given a dataset $\tilde D = ((x_1, p_1), \dots, (x_T, p_T))$ and an aggregator $e^T$, we can define a measure of approximation error via
\[ e^{T*}(\tilde D) = \inf_{\varepsilon_V \in E(\tilde D)} e^T(\varepsilon_V). \]
For example, the max aggregator is $e^T_M(\varepsilon_V) = \max_{t \in \{1, \dots, T\}} \varepsilon^t_V$. The measure of approximation error for the max aggregator agrees with the one presented in the main text, i.e. $e^{T*}_M = \varepsilon^*$. In general, an aggregator can depend on the sample size. For example, consider the average approximation error aggregator $e^T_A(\varepsilon_V) = \frac{1}{T} \sum_{t=1}^T \varepsilon^t_V$. Our leading measure, $\varepsilon^*$, does not depend on $T$.

For an arbitrary aggregator, similar to how $AC$ was constructed with the measure $\varepsilon^*$, we can define a set of counterfactuals such that approximation error does not get worse:
\[ \bigl\{ (\tilde x, \tilde p) \in \mathbb{R}^K_+ \times \mathbb{R}^K_{++} \mid e^{(T+1)*}\bigl(\tilde D \times (\tilde x, \tilde p)\bigr) \le e^{T*}(\tilde D) \bigr\}. \tag{13} \]
This construction does not separately control the prediction wedge and approximation wedge as in $AC'$ described at the end of the previous subsection.
Instead, it lumps together both prediction and approximation wedges via the aggregators $e^{(T+1)*}$ and $e^{T*}$.

To understand properties of this set, consider an aggregator that sums up the observation-specific bounds on approximate optimization, $e^T_S(\varepsilon_V) = \sum_{t=1}^T \varepsilon^t_V$. With this choice of aggregator, each conjectured observation in (13) must be perfectly consistent with the model, i.e. $\varepsilon^{T+1}_V = 0$, for approximation error to not be made worse. In other words, for each element of (13), there must exist some utility function that approximately explains the existing dataset $\tilde D$, but exactly explains the counterfactual. Thus, there is no prediction wedge. This property may be desirable when one thinks the observed dataset $\tilde D$ comes from a "true" dataset that is generated by the quasilinear model but has been measured with error. Using the previous terminology, in this case we may wish to conduct counterfactual analysis without a prediction wedge. If instead we think approximation error propagates to new settings, then we may wish to allow the prediction wedge. This is one motivation for $\varepsilon^*$ and the adaptive set $AC$.

One potential way to address this limitation of the sum-type aggregator $e^T_S$ is to adjust it by dividing by the sample size to obtain the average approximation error aggregator so that $\tilde e^{T*}_A = \frac{1}{T} e^{T*}_S$. This division allows one to construct a set analogously to (13) that allows a prediction wedge when generating counterfactual information.

Appendix C Supplemental Appendix

This appendix contains additional results needed for proofs of the main results. Section C.1 contains miscellaneous lemmas, Section C.2 presents lemmas specifically for linear programming results, and Section C.3 presents duality results used in proofs for approximate indirect utility in Section 4.2.
C.1 Miscellaneous Lemmas
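For readers who want to experiment with the cycle condition in Lemma C.1(iii), stated in this subsection, here is a minimal Python sketch. It is not the authors' implementation (the paper computes its bounds by linear programming). It assumes a dataset stored as a list of `(x, p)` tuples of floats, and it brute-forces all simple cycles, so it is only practical for small $T$. All function names are hypothetical.

```python
from itertools import permutations


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cycle_mean(data, cycle):
    """Average of p_{t_m} . (x_{t_m} - x_{t_{m+1}}) around a closed cycle of indices."""
    M = len(cycle)
    total = 0.0
    for m in range(M):
        x_cur, p_cur = data[cycle[m]]
        x_next, _ = data[cycle[(m + 1) % M]]  # wrap around to close the cycle
        total += dot(p_cur, [a - b for a, b in zip(x_cur, x_next)])
    return total / M


def min_epsilon(data):
    """Smallest eps >= 0 satisfying condition (iii), by brute force.

    Checking simple cycles is enough: any closed walk splits into simple
    cycles, so its mean is a weighted average of simple-cycle means.
    """
    T = len(data)
    best = 0.0  # a length-one cycle contributes exactly zero
    for M in range(2, T + 1):
        for cycle in permutations(range(T), M):
            best = max(best, cycle_mean(data, cycle))
    return best


def is_eps_rationalized(data, eps, tol=1e-12):
    """True if every cycle mean is at most eps (up to a small tolerance)."""
    return min_epsilon(data) <= eps + tol


# One good (K = 1). Quantity falls as price rises: consistent with the model.
consistent = [((2.0,), (1.0,)), ((1.0,), (2.0,))]
# Quantity rises with price: violates the law of demand.
violating = [((1.0,), (1.0,)), ((2.0,), (2.0,))]

print(min_epsilon(consistent), min_epsilon(violating))
```

The enumeration grows factorially in $T$. For larger datasets the inequalities in part (ii) of the lemma form a linear program, which is the more practical route.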
Lemma C.1 (Allen and Rehbeck [2020]). For any dataset $\{(x_t, p_t)\}_{t=1}^T$ and $\varepsilon \ge 0$, the following are equivalent:

(i) $\{(x_t, p_t)\}_{t=1}^T$ is $\varepsilon$-rationalized by quasilinear utility.

(ii) There exist numbers $\{u_t\}_{t=1}^T$ that satisfy the following inequalities for all $r, s \in \{1, \dots, T\}$:
\[ u_s \le u_r + p_r \cdot (x_s - x_r) + \varepsilon. \]

(iii) For all finite sequences $\{t_m\}_{m=1}^M$ with $t_m \in \{1, \dots, T\}$ and $M \ge 1$, the inequality
\[ \frac{1}{M} \sum_{m=1}^M p_{t_m} \cdot (x_{t_m} - x_{t_{m+1}}) \le \varepsilon \]
holds, where $(x_{t_{M+1}}, p_{t_{M+1}}) = (x_{t_1}, p_{t_1})$.

We require a lemma that will be used in the proof of Proposition 2 to ensure a maximizer exists. In contrast with models with compact budget constraints, continuity of the utility function $u$ is not enough to ensure a maximizer exists, which is why we require the following lemma. To state the lemma, recall that for a utility function $u$, the indirect utility is defined as
\[ V_u(p) = \sup_{x \in \mathbb{R}^K_+} u(x) - p \cdot x. \]

Lemma C.2.
Suppose $u : \mathbb{R}^K_+ \to \mathbb{R}$ is concave, monotonically increasing, and continuous. Moreover, suppose $V_u(p)$ is finite over some open set $O \subseteq \mathbb{R}^K$. It follows that for any price $p \in \mathrm{ri}(\mathrm{Co}(O))$, $u(x) - p \cdot x$ admits a maximizer for $x \in \mathbb{R}^K_+$.

Proof of Lemma C.2. Since $V_u$ is convex and finite over $O$, $V_u$ is finite on $\mathrm{ri}(\mathrm{Co}(O))$. Thus, the subdifferential
\[ \partial V_u(p) = \bigl\{ x \mid V_u(\tilde p) \ge V_u(p) + x \cdot (\tilde p - p) \ \ \forall \tilde p \in \mathbb{R}^K \bigr\} \]
is nonempty for any $p \in \mathrm{ri}(\mathrm{Co}(O))$ by Rockafellar [2015], Theorem 23.4. Extend $u$ to all of $\mathbb{R}^K$ by setting $u(x) = -\infty$ for any $x \in \mathbb{R}^K \setminus \mathbb{R}^K_+$. Recall that the original $u$ defined on $\mathbb{R}^K_+$ is continuous, so that it is upper semicontinuous and $\{x \mid u(x) \ge a\}$ is closed for any $a \in \mathbb{R}$ by Theorem 7.1 in Rockafellar [2015]. Note that the extension is also upper semicontinuous because it does not change the topological properties of the upper contour sets for all $a \in \mathbb{R}$. Since $u$ is upper semicontinuous and concave, we conclude from Rockafellar [2015], Theorem 23.5 parts (b) and (a*) that for any $p \in \mathrm{ri}(\mathrm{Co}(O))$ there is some $x^* \in \mathbb{R}^K$ such that
\[ u(x) - p \cdot x \]
is maximized over $x \in \mathbb{R}^K$ at $x^*$ since $\partial V_u(p)$ is nonempty. Since $u$ is $-\infty$ outside of $\mathbb{R}^K_+$, we conclude $x^* \in \mathbb{R}^K_+$. This establishes existence of an exact maximizer for the utility function $u$ over the region $p \in \mathrm{ri}(\mathrm{Co}(O))$, which completes the proof.

Footnote: For a set $S \subseteq \mathbb{R}^K$, $\mathrm{ri}(S)$ gives the relative interior of the set $S$ as defined in Rockafellar [2015].

C.2 Linear Programming Lemmas

We require some existing results from the theory of linear programming. In canonical form, a linear program is written as
\[ \max_{x \in \mathbb{R}^J} c \cdot x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0. \]
Here, $c, x \in \mathbb{R}^J$, and $b \in \mathbb{R}^J$ are vectors and $A \in \mathbb{R}^{J \times J}$ is a matrix.

This is written as a maximum rather than a supremum because provided the supremum is finite, the maximum is attained as we formalize now.

Lemma C.3.
If the value function of a linear program is finite, then the maximum is attained.

Proof. See e.g. Bertsekas [2009], Proposition 1.4.12.

Fixing all other variables, let $B \subseteq \mathbb{R}^J$ be the set of $b$ where the linear program is bounded. Write the value function as a function of $b$ so that
\[ G(b) = \sup_x c \cdot x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0. \]

Lemma C.4.
Let $b_m \to b^*$ where $b^* \in B$ and for each $b_m$, the set $\{x \mid Ax \le b_m, x \ge 0\}$ is nonempty. It follows that $G(b_m) \to G(b^*)$.

Proof. Let $\eta > 0$ and define
\[ \tilde G(b) = \min\{G(b), G(b^*) + \eta\}. \]
$\tilde G(b)$ is the value function of a linear program defined by $G(b)$ that appends the inequality constraint $c \cdot x \le G(b^*) + \eta$. Because $G(b^*) + \eta$ is finite, $\tilde G(b)$ is finite for any feasible $b$. Böhm [1975], Theorem 1 states that $\tilde G(b)$ is continuous over the set of $b$ such that the feasibility region is nonempty. Since $b_m \to b^*$ and each $b_m$ and $b^* \in B$ are feasible, we then obtain $\tilde G(b_m) \to \tilde G(b^*) = G(b^*)$. Because $G(b^*) < G(b^*) + \eta$, for $m$ large enough $\tilde G(b_m) < G(b^*) + \eta$ and hence $\tilde G(b_m) = G(b_m)$, so $G(b_m) \to G(b^*)$.

C.3 Duality
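Before the formal statements, a one-good example may help fix ideas about the two operations used in this section, the indirect utility $V_u(p) = \sup_{x \ge 0} u(x) - p x$ and the dual utility $u_V(x) = \inf_{p \ge 0} V(p) + p x$. The utility $u(x) = 2\sqrt{x}$ is our own illustrative choice, not an example from the paper. It is concave, weakly increasing, continuous, and finite at $0$, so the round trip should return $u$.

```latex
% Illustrative example with K = 1 and u(x) = 2\sqrt{x} (an assumed utility).
\begin{align*}
V_u(p) &= \sup_{x \ge 0} \; 2\sqrt{x} - p x
        = \frac{1}{p}
        && \text{for } p > 0, \text{ attained at } x = \frac{1}{p^2}, \\
V_u(0) &= \infty, \\
u_{V_u}(x) &= \inf_{p \ge 0} \; \frac{1}{p} + p x
        = 2\sqrt{x}
        && \text{for } x > 0, \text{ attained at } p = \frac{1}{\sqrt{x}}, \\
u_{V_u}(0) &= \inf_{p > 0} \frac{1}{p} = 0 = u(0).
\end{align*}
```

Here $V_u$ is convex and weakly decreasing in $p$, and applying the second operation to $V_u$ recovers the original $u$, which is the pattern the results in this section formalize.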
The focus of the paper is on counterfactuals with approximate utility maximization. In other words, we consider utility functions $u$ such that the inequality
\[ u(x_t) - p_t \cdot x_t \ge u(x) - p_t \cdot x - \varepsilon \]
holds for every $t \in \{1, \dots, T\}$ and $x \in \mathbb{R}^K_+$. In this supplement, we consider a dual approach involving functions $V$ such that
\[ V(p) \ge V(p_t) - x_t \cdot (p - p_t) - \varepsilon \]
holds for every $t \in \{1, \dots, T\}$ and $p \in \mathbb{R}^K_{++}$. We also mention some results from convex analysis. Results from this section are used to prove several results in the main text.

Our first result formalizes that finding a utility function $u$ that satisfies the first set of inequalities is equivalent to finding a $V$ function that satisfies the second set of inequalities.

To state the result, recall the indirect utility function of $u$ is given by $V_u : \mathbb{R}^K_+ \to \mathbb{R} \cup \{-\infty, \infty\}$,
\[ V_u(p) = \sup_{x \in \mathbb{R}^K_+} u(x) - p \cdot x. \]
We make use of a "dual" utility function $u_V : \mathbb{R}^K_+ \to \mathbb{R} \cup \{-\infty, \infty\}$, constructed via
\[ u_V(x) = \inf_{p \in \mathbb{R}^K_+} V(p) + p \cdot x. \]
These operations can be defined for any extended real-valued functions $u : \mathbb{R}^K \to \mathbb{R} \cup \{-\infty, \infty\}$ and $V : \mathbb{R}^K \to \mathbb{R} \cup \{-\infty, \infty\}$.

Proposition C.1. Let $\varepsilon \ge 0$ and let $\{(x_t, p_t)\}_{t=1}^T$ be an arbitrary dataset.

i. Suppose $u : \mathbb{R}^K_+ \to \mathbb{R}$ satisfies
\[ u(x_t) - p_t \cdot x_t \ge u(x) - p_t \cdot x - \varepsilon \]
for every $t \in \{1, \dots, T\}$ and every $x \in \mathbb{R}^K_+$. It follows that $V_u$ satisfies
\[ V_u(p) \ge V_u(p_t) - x_t \cdot (p - p_t) - \varepsilon \]
for every $t \in \{1, \dots, T\}$ and every $p \in \mathbb{R}^K_+$.

ii. Suppose $V : \mathbb{R}^K_+ \to \mathbb{R}$ satisfies
\[ V(p) \ge V(p_t) - x_t \cdot (p - p_t) - \varepsilon \]
for every $t \in \{1, \dots, T\}$ and every $p \in \mathbb{R}^K_+$. It follows that $u_V$ satisfies
\[ u_V(x_t) - p_t \cdot x_t \ge u_V(x) - p_t \cdot x - \varepsilon \]
for every $t \in \{1, \dots, T\}$ and every $x \in \mathbb{R}^K_+$.

Proof. First we show (i). For arbitrary $t \in \{1, \dots, T\}$, write
\[ V_u(p_t) = u(x_t) - p_t \cdot x_t + \delta_t \tag{14} \]
where $\delta_t = V_u(p_t) - u(x_t) + p_t \cdot x_t \ge 0$. Moreover, $\delta_t \le \varepsilon$ since the observed quantities are only approximately optimal.
For arbitrary $p \in \mathbb{R}^K_+$ we have
\[ V_u(p) \ge u(x_t) - p \cdot x_t. \]
Differencing yields
\[ V_u(p) - V_u(p_t) \ge -x_t \cdot (p - p_t) - \delta_t. \]
The term $V_u(p)$ may equal $\infty$, in which case we define $\infty - a = \infty$ for any finite $a$. Here, $V_u(p_t)$ is finite from (14) and the fact that $u$ is finite. We know for all $t \in \{1, \dots, T\}$ that $0 \le \delta_t \le \varepsilon$ by assumption, so (i) is established.

Now we show (ii). As before, write
\[ u_V(x_t) = V(p_t) + p_t \cdot x_t - \delta_t \]
where $\delta_t = V(p_t) + p_t \cdot x_t - u_V(x_t) \ge 0$. Moreover, $\delta_t \le \varepsilon$ since the observed quantities and prices are only supposed to satisfy the inequality in (ii) for $V$. For arbitrary $x \in \mathbb{R}^K_+$ we have
\[ u_V(x) \le V(p_t) + p_t \cdot x, \]
and so
\[ u_V(x_t) - p_t \cdot x_t \ge u_V(x) - p_t \cdot x - \delta_t. \]
As before, $u_V(x)$ can equal $-\infty$, but $u_V(x_t)$ is always finite. Recall, for all $t \in \{1, \dots, T\}$ that $0 \le \delta_t \le \varepsilon$ by assumption, and so (ii) is established.

The mappings $u \to V_u$ and $V \to u_V$ are closely related to convex conjugates, and we can adapt existing results from convex analysis. Recall that for a function $f : \mathbb{R}^K \to \mathbb{R} \cup \{-\infty, \infty\}$, the convex conjugate is given by
\[ f^*(p) = \sup_{x \in \mathbb{R}^K} p \cdot x - f(x). \]
The monotone conjugate is given by
\[ f^+(p) = \sup_{x \in \mathbb{R}^K_+} p \cdot x - f(x). \]
Let the function $\tilde f$ equal $f$ over $x \in \mathbb{R}^K_+$, and $\infty$ otherwise. It follows that $\tilde f^*(p) = f^+(p)$.

We now formalize the relationships between $u_V$ and $V_u$ and monotone conjugates. Following this, we present some immediate consequences.

Lemma C.5. $u_V(x) = -V^+(-x)$ and $V_u(p) = (-u)^+(-p)$.

Proof.
\[ u_V(x) = \inf_{p \in \mathbb{R}^K_+} V(p) + p \cdot x = -\sup_{p \in \mathbb{R}^K_+} \bigl(-V(p) - p \cdot x\bigr) = -V^+(-x), \]
and
\[ V_u(p) = \sup_{x \in \mathbb{R}^K_+} u(x) - p \cdot x = \sup_{x \in \mathbb{R}^K_+} -(-u(x)) + (-p) \cdot x = (-u)^+(-p). \]

Lemma C.6.
The function $u_V$ is concave, weakly increasing, and upper semicontinuous. The function $V_u$ is convex, weakly decreasing, and lower semicontinuous.

Proof. To see that $V_u$ is weakly decreasing, consider $p_a, p_b$ with $p_a \ge p_b$. For $x \in \mathbb{R}^K_+$ we have $p_a \cdot x \ge p_b \cdot x$ and so
\[ V_u(p_a) = \sup_{x \in \mathbb{R}^K_+} u(x) - p_a \cdot x \le \sup_{x \in \mathbb{R}^K_+} u(x) - p_b \cdot x = V_u(p_b). \]
Also, $V_u$ is convex and lower semicontinuous from Bertsekas [2009], p. 83. The arguments for $u_V$ are analogous by applying Lemma C.5.

Lemma C.7.
i. Suppose $u : \mathbb{R}^K_+ \to \mathbb{R} \cup \{-\infty, \infty\}$ is concave, weakly increasing, upper semicontinuous, and finite at $0$. Then $u_{V_u} = u$.
ii. Suppose $V : \mathbb{R}^K_+ \to \mathbb{R} \cup \{-\infty, \infty\}$ is convex, weakly decreasing, lower semicontinuous, and finite at $0$. Then $V_{u_V} = V$.

Proof. Write $f^{++}$ as the monotone conjugate of $f^+$. Rockafellar [2015], Theorem 12.4 states $V^{++} = V$ and $(-u)^{++} = -u$. From Lemma C.5 we conclude
\[ -u_{V_u}(x) = (V_u)^+(-x) = (-u)^{++}(x) = -u(x) \]
and
\[ V_{u_V}(p) = (-u_V)^+(-p) = V^{++}(p) = V(p). \]

References

Abi Adams. Mutually consistent revealed preference demand predictions.
AmericanEconomic Journal: Microeconomics , 2019. Forthcoming.Sidney N Afriat. The construction of utility functions from expenditure data.
Inter-national economic review , 8(1):67–77, 1967.Sidney N Afriat. Efficiency estimation of production functions.
International eco-nomic review , pages 568–598, 1972.Sydney N Afriat. On a system of inequalities in demand analysis: an extension ofthe classical method.
International Economic Review , pages 460–472, 1973.Victor Aguiar, Per Hjertstrand, and Roberto Serrano. A rationalization of the weakaxiom of revealed preference. 2020.Victor H. Aguiar and Nail Kashaev. Stochastic revealed preferences with measure-ment error. 2018. Working Paper.Roy Allen and John Rehbeck. Identification with additively separable heterogeneity.
Econometrica , 87(3):1021–1054, 2019a.Roy Allen and John Rehbeck. Measuring rationality: Percentages vs expenditures.
Available at SSRN 3399065 , 2019b.Roy Allen and John Rehbeck. Satisficing, aggregation, and quasilinear utility. 2020.Donald WK Andrews and Soonwoo Kwon. Inference in moment inequality modelsthat is robust to spurious precision under model misspecification. 2019.Isaiah Andrews, Matthew Gentzkow, and Jesse M Shapiro. Measuring the sensi-tivity of parameter estimates to estimation moments.
The Quarterly Journal ofEconomics , 132(4):1553–1592, 2017.Timothy Armstrong and Michal Koles´ar. Sensitivity analysis using approximate mo-ment condition models. 2018.B Douglas Bernheim. The good, the bad, and the ugly: a unified approach to behav-ioral welfare economics.
Journal of Benefit-Cost Analysis , 7(1):12–68, 2016.74 Douglas Bernheim and Antonio Rangel. Beyond revealed preference: choice-theoretic foundations for behavioral welfare economics.
The Quarterly Journalof Economics , 124(1):51–104, 2009.B Douglas Bernheim and Dmitry Taubinsky. Behavioral public economics. In
Hand-book of Behavioral Economics: Applications and Foundations 1 , volume 1, pages381–516. Elsevier, 2018.Steven Berry, James Levinsohn, and Ariel Pakes. Automobile prices in market equi-librium.
Econometrica: Journal of the Econometric Society , pages 841–890, 1995.Dimitri P Bertsekas.
Convex optimization theory . Athena Scientific Belmont, 2009.Richard Blundell, Martin Browning, and Ian Crawford. Best nonparametric boundson demand responses.
Econometrica , 76(6):1227–1262, 2008.Richard Blundell, Joel L Horowitz, and Matthias Parey. Measuring the price re-sponsiveness of gasoline demand: Economic shape restrictions and nonparametricdemand estimation.
Quantitative Economics , 3(1):29–51, 2012.Richard Blundell, Dennis Kristensen, and Rosa Matzkin. Bounding quantile demandfunctions using revealed preference inequalities.
Journal of Econometrics , 179(2):112–127, 2014.Richard W Blundell, Martin Browning, and Ian A Crawford. Nonparametric engelcurves and revealed preference.
Econometrica , 71(1):205–240, 2003.Richard W Blundell, Dennis Kristensen, and Rosa Liliana Matzkin. Individualcounterfactuals with multidimensional unobserved heterogeneity. Technical report,cemmap working paper, 2017.Volker B¨ohm. On the continuity of the optimal policy set for linear programs.
SIAMJournal on Applied Mathematics , 28(2):303–306, 1975.St´ephane Bonhomme and Martin Weidner. Minimizing sensitivity to model misspec-ification. arXiv preprint arXiv:1807.02161 , 2018.Donald J Brown and Caterina Calsamiglia. The nonparametric approach to appliedwelfare analysis.
Economic Theory , 31(1):183–188, 2007.75 Kate Bundorf, Jonathan Levin, and Neale Mahoney. Pricing and welfare in healthplan choice.
American Economic Review , 102(7):3214–48, 2012.Christopher P Chambers and Federico Echenique. A characterization of combinatorialdemand.
Mathematics of Operations Research, 2017.
Laurens Cherchye, Thomas Demuynck, and Bram De Rock. Bounding counterfactual demand with unobserved heterogeneity and endogenous expenditures. Journal of Econometrics, 2019.
Raj Chetty. Bounds on elasticities with optimization frictions: A synthesis of micro and macro evidence on labor supply. Econometrica, 80(3):969–1018, 2012.
K Chiong, YW Hsieh, and Matthew Shum. Counterfactual estimation in semiparametric discrete choice models. URL https://ssrn.com/abstract, 2979446, 2017.
Timothy M Christensen and Benjamin Connault. Counterfactual sensitivity and robustness. arXiv preprint arXiv:1807.02161, 2018.
Jessica Cohen, Pascaline Dupas, et al. Free distribution or cost-sharing? Evidence from a randomized malaria prevention experiment.
Quarterly Journal of Economics, 125(1):1, 2010.
Timothy G Conley, Christian B Hansen, and Peter E Rossi. Plausibly exogenous. Review of Economics and Statistics, 94(1):260–272, 2012.
Sam Cosaert and Thomas Demuynck. Nonparametric welfare and demand analysis with unobserved individual heterogeneity. Review of Economics and Statistics, 100(2):349–361, 2018.
Rahul Deb, Yuichi Kitamura, John Quah, and Jörg Stoye. Revealed price preference: Theory and stochastic testing. 2018. Working Paper.
Xavier d’Haultfoeuille, Christophe Gaillac, and Arnaud Maurel. Rationalizing rational expectations? Tests and deviations. Technical report, National Bureau of Economic Research, 2018.
W Erwin Diewert. Afriat and revealed preference theory.
The Review of Economic Studies, 40(3):419–425, 1973.
Federico Echenique, Sangmok Lee, and Matthew Shum. The money pump as a measure of revealed preference violations. Journal of Political Economy, 119(6):1201–1223, 2011.
Liran Einav, Amy Finkelstein, and Mark R Cullen. Estimating welfare in insurance markets using variation in prices. The Quarterly Journal of Economics, 125(3):877–921, 2010.
Pirmin Fessler and Maximilian Kasy. How to use economic theory to improve estimators: Shrinking toward theoretical restrictions. Review of Economics and Statistics, 101(4):681–698, 2019.
Charles Gauthier. Nonparametric identification of discount factors under partial efficiency. 2019. Working Paper.
Yoram Halevy, Dotan Persitz, and Lanny Zrill. Parametric recoverability of preferences.
Journal of Political Economy, 126(4):1558–1593, 2018.
Lars Peter Hansen and Ravi Jagannathan. Assessing specification errors in stochastic discount factor models. The Journal of Finance, 52(2):557–590, 1997.
Lars Peter Hansen and Thomas J Sargent. Robustness. Princeton University Press, 2008.
Lars Peter Hansen and Thomas J Sargent. Structured uncertainty and model misspecification. University of Chicago, Becker Friedman Institute for Economics Working Paper, (2018-77), 2018.
Stefan Hoderlein and Jörg Stoye. Testing stochastic rationality and predicting stochastic demand: the case of two goods.
Economic Theory Bulletin, 3(2):313–328, 2015.
Martijn Houtman and J Maks. Determining all maximal data subsets consistent with revealed preference. Kwantitatieve Methoden, 19(1):89–104, 1985.
Guido W Imbens. Sensitivity to exogeneity assumptions in program evaluation. American Economic Review, 93(2):126–132, 2003.
Yuichi Kitamura and Jörg Stoye. Nonparametric counterfactuals in random utility models. arXiv preprint arXiv:1902.08350, 2019.
Patrick Kline and Andres Santos. Sensitivity to missing data assumptions: Theory and an evaluation of the US wage structure. Quantitative Economics, 4(2):231–267, 2013.
Patrick Kline and Melissa Tartari. Bounding the labor supply responses to a randomized welfare experiment: A revealed preference approach.
American Economic Review, 106(4):972–1014, 2016.
Finn E Kydland and Edward C Prescott. Time to build and aggregate fluctuations. Econometrica, pages 1345–1370, 1982.
Charles F Manski and John V Pepper. How do right-to-carry laws affect crime rates? Coping with ambiguity using bounded-variation assumptions. Review of Economics and Statistics, 100(2):232–244, 2018.
Matthew Masten and Alexandre Poirier. Inference on breakdown frontiers. 2019.
Matthew A Masten and Alexandre Poirier. Identification of treatment effects under conditional partial independence. Econometrica, 86(1):317–351, 2018a.
Matthew A Masten and Alexandre Poirier. Salvaging falsified instrumental variable models. arXiv preprint arXiv:1812.11598, 2018b.
Daniel McFadden. Econometric models of probabilistic choice.
Structural analysis of discrete data with econometric applications, 198–272, 1981.
Ulrich K Müller and Andriy Norets. Credibility of confidence sets in nonstandard econometric problems. Econometrica, 84(6):2183–2213, 2016.
Maria Ponomareva and Elie Tamer. Misspecification in moment inequality models: Back to moment equalities? The Econometrics Journal, 14(2):186–203, 2011.
Peter M Robinson. Root-n-consistent semiparametric regression. Econometrica: Journal of the Econometric Society, pages 931–954, 1988.
Ralph Tyrell Rockafellar.
Convex analysis. Princeton University Press, 2015.
Bernard Salanié and Frank A Wolak. Fast, “robust,” and approximately correct: estimating mixed demand systems. Technical report, National Bureau of Economic Research, 2019.
Herbert A Simon. Administrative Behavior. Macmillan, 1947.
Pietro Tebaldi, Alexander Torgovitsky, and Hanbin Yang. Nonparametric estimates of demand in the California health insurance exchange. Technical report, Working Paper, 2018.
Hal R Varian. The nonparametric approach to demand analysis. Econometrica, pages 945–973, 1982.
Hal R Varian. Goodness-of-fit in optimizing models. Journal of Econometrics, 46(1-2):125–140, 1990.
Hal R Varian. Goodness-of-fit for revealed preference tests. Technical report, 1991.
Quang H Vuong. Likelihood ratio tests for model selection and non-nested hypotheses.