Dilation bootstrap

Abstract

We propose a methodology for constructing confidence regions with partially identified models of general form. The region is obtained by inverting a test of internal consistency of the econometric structure. We develop a dilation bootstrap methodology to deal with sampling uncertainty without reference to the hypothesized economic structure. It requires bootstrapping the quantile process for univariate data and a novel generalization of the latter to higher dimensions. Once the dilation is chosen to control the confidence level, the unknown true distribution of the observed data can be replaced by the known empirical distribution and confidence regions can then be obtained as in Galichon and Henry (2011) and Beresteanu, Molchanov and Molinari (2011).

DILATION BOOTSTRAP

ALFRED GALICHON AND MARC HENRY

JEL Classification: C15, C31.

Keywords: Partial identification, dilation bootstrap, quantile process, optimal matching.

Date: First version: May 2006. This version: October 22, 2012. Correspondence address: Alfred Galichon, École polytechnique, Département d'économie, 91128 Palaiseau, France. Marc Henry, Université de Montréal, Département de sciences économiques, 3150, Jean-Brillant, Montréal, Québec H3C 3J7, Canada.

Introduction

In several rapidly expanding areas of economic research, the identification problem is steadily becoming more acute. In policy and program evaluation (Manski, 1990) and more general contexts with censored or missing data (Shaikh and Vytlacil, 2010; Magnac and Maurin, 2008) and measurement error (Chen et al., 2005), ad hoc imputation rules lead to fragile inference. In demand estimation based on revealed preference (Blundell et al., 2008) the data is generically insufficient for identification. In the analysis of social interactions (Brock and Durlauf, 2001; Manski, 2004), complex strategies to reduce the large dimensionality of the correlation structure are needed. In the estimation of models with complex strategic interactions and multiple equilibria (Bjorn and Vuong, 1985; Tamer, 2003), assumptions on equilibrium selection mechanisms may not be available or acceptable.

More generally, in all areas of investigation with structural data insufficiencies or incompletely specified economic mechanisms, the hypothesized structure fails to identify a unique possible data generating mechanism for the data that is actually observed. In such cases, many traditional estimation and testing techniques become inapplicable and a framework for inference in incomplete models is developing, with an initial focus on estimation of the set of structural parameters compatible with the true data distribution (hereafter identified set). A question of particular relevance in applied work is how to construct valid confidence regions for the identified set.

Formal methodological proposals abound since the seminal work of Chernozhukov et al. (2007), but computational efficiency is still a major concern.

In the present work, we propose a methodology that clearly distinguishes how to deal with sampling uncertainty on the one hand, and model uncertainty on the other, so that unlike previous methodological proposals, search in the parameter space is conducted only once, thereby greatly reducing the computational burden. The key to this separation is to deal with sampling variability without any reference to the hypothesized structure, using a methodology we call the dilation method. This consists in dilating each point in the space of observable variables in such a way that the empirical probability (which is known) of a dilated set dominates the true probability (which is unknown) of the original set (before dilation). The unknown true probability (i.e. the true data generating mechanism) is then removed from the analysis, and we can proceed as if the problem were purely deterministic, hence apply the methods proposed in Galichon and Henry (2008) and Beresteanu et al. (2008).

To construct confidence regions of level $1-\alpha$ for the identified set, such a dilation $y \rightrightarrows J(y)$ (where $\rightrightarrows$ denotes a one-to-many map) must satisfy $\tilde Y^* \in J(\tilde Y)$ a.s. for some pair of random vectors $(\tilde Y^*, \tilde Y)$, with probability $1-\alpha$, where $\tilde Y$ is drawn from the true distribution of observable variables and $\tilde Y^*$ is drawn from the empirical distribution relative to the observed sample. We propose a dilation bootstrap procedure to construct $J$, in which bootstrap realizations $Y_j^b$, $j = 1,\ldots,n$, are matched one-to-one with the original sample points $Y_j$, $j = 1,\ldots,n$, so as to minimize $\eta_n^b = \max_{j=1,\ldots,n} \| Y_j^b - Y_{\sigma(j)} \|$, where the permutation $\sigma$ defines the matching. The $\alpha$ quantile of the distribution of $\eta_n^b$ then defines the radius of the dilation.

When the observable $Y$ is a random variable, the dilation bootstrap relies on bootstrapping the quantile process, as proposed by Doss and Gill (1992). However, bootstrapping the quantile process relies on order statistics and had no higher dimensional generalization to date. This is now provided by the dilation bootstrap, which removes the constraint on dimension through the appeal to optimal matching. Although the problem of finding minimum cost matchings (called the assignment or marriage problem) is very familiar to economists, as far as we know, its application within an inference procedure is unprecedented.

The rest of the paper is organized as follows. The next section describes the econometric framework and introduces the Composition Theorem and the dilation method the latter justifies. Section 2.1 discusses the application of the Composition Theorem to constructing confidence regions for partially identified parameters. Section 2.3 presents the bootstrap feasible dilation and its theoretical underpinnings. Section 3 presents simulation evidence on the performance of the dilation bootstrap in comparison with alternative methods. Section 4 explains how the method extends to higher dimensions and discrete choice and the last section concludes.

1. Dilation method and Composition Theorem

We consider the problem of inference on the structural parameters of an economic model, when the latter are (possibly) only partially identified. The economic structure is defined as in Jovanovic (1989), which generalizes Koopmans and Reiersøl (1950). Variables under consideration are divided into two groups. Latent variables $U$ capture unobserved heterogeneity in the model. They are typically not observed by the analyst, but some of their components may be observed by the economic actors. Observable variables $Y$ include outcome variables and other observable heterogeneity. They are observed by the analyst and the economic actors. We call observable distribution $P$ the true probability distribution generating the observable variables, and denote by $\nu$ the probability distribution that generated the latent variables $U$. The econometric structure under consideration is given by a binary relation between observable and latent variables, i.e. a subset of $\mathcal{Y} \times \mathcal{U}$, which can be written without loss of generality as a correspondence from $\mathcal{U}$ to $\mathcal{Y}$.

Assumption 1 (Econometric specification). Observable variables $Y$, with realizations $y \in \mathcal{Y} \subseteq \mathbb{R}^{d_y}$, and latent variables $U$, with realizations $u \in \mathcal{U} \subseteq \mathbb{R}^{d_u}$, are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and satisfy the relation $Y \in G(U) \subseteq \mathcal{Y}$ almost surely.

Example 1 (Revealed Preferences). This approach is particularly well suited to revealed preference analysis. Suppose $X$ is the vector of observed choices made by an agent, possibly over several periods. Let $Z$ be a vector of observable variables defining the environment in which the agent made their choices. Call $Y = (X, Z)$ the vector of all observable variables. Suppose the agent maximized a utility $u(X, Z, U | \theta)$ under constraints $g(X, Z, U | \theta) \leq 0$ (budget constraints, etc.), where $\theta$ is a vector of structural parameters (including elasticities, risk aversion, etc.) and $U$ a random vector describing unobserved heterogeneity. Call $D(U, X | \theta)$ the demand correspondence, i.e. the set of utility maximizing choices. Then we can define $G(U | \theta)$ by $Y \in G(U | \theta)$ if and only if $X \in D(U, X | \theta)$, and $G(U | \theta)$ exhausts all the information embodied in the utility maximization model.

Example 2 (Games). Another family of examples of our framework arises with parametric games. Let $N$ players with observable characteristics $X = (X_1, \ldots, X_N)$ and unobservable characteristics $U = (U_1, \ldots, U_N)$ have strategies $Z = (Z_1, \ldots, Z_N)$ and payoffs parameterized by $X$, $U$, $Z$ and $\theta$. For a given choice of equilibrium concept in pure strategies, call $C(X, U, \theta)$ the equilibrium correspondence, i.e. the set of pure strategy equilibrium profiles. Then the empirical content of the game is characterized by $Z \in C(X, U, \theta)$, which can be equivalently rewritten $Y \in G(U; \theta)$ with $Y = (Z, X)$.

We assume a parametric structure for the unobserved heterogeneity and the model linking unobserved heterogeneity variables to observable ones.

Assumption 2 (Correspondence). The correspondence $G : \mathcal{U} \rightrightarrows \mathcal{Y}$ is known by the analyst up to a finite dimensional vector of parameters $\theta \in \Theta \subseteq \mathbb{R}^{d_\theta}$. It is denoted $G(\cdot\,; \theta)$. For all $\theta \in \Theta$, $G(\cdot\,; \theta)$ is measurable (i.e. the set $\{u : G(u; \theta) \cap A \neq \emptyset\}$ is measurable for each open subset $A$ of $\mathcal{Y}$) and has non empty and closed values.

Note that the measurability and closed values assumptions are very mild conditions. The assumption that the correspondence is non-empty, however, may be restrictive. In the revealed preferences example, we require that the demand correspondence be non empty. In the games example, we require existence of equilibrium.

Assumption 3 (Latent variables). The distribution $\nu$ of the unobservable variables $U$ is assumed to belong to a parametric family $\nu(\cdot | \theta)$, $\theta \in \Theta$. The same notation is used for the parameters of $\nu$ and $G$ to highlight the fact that they may have components in common.

The pair of random vectors $(Y, U)$ involved in the model is generated by a probability distribution, that we denote $\pi$. Since the vector $U$ is unobservable, the probability distribution $\pi$ is not directly identifiable from the data. However, the econometric model imposes restrictions on $\pi$. The distribution of its component $Y$ is the observable distribution $P$. The distribution of its component $U$ is the hypothesized probability distribution $\nu(\cdot | \theta)$. Finally, the joint distribution is further restricted by the fact that it gives mass 0 to the event that the relation $Y \in G(U | \theta)$ is violated. For any given value of the structural parameter vector $\theta$, a joint distribution satisfying all these restrictions may or may not exist. If it does, it is generally non unique. The identified set $\Theta_I$ is the collection of values of the structural parameter vector $\theta$ for which such a joint probability distribution does indeed exist.

• If $\Theta_I = \emptyset$, the model is rejected.
• If $\Theta_I$ is a singleton, the parameter vector $\theta$ is point identified.
• Otherwise, the parameter $\theta$ is set identified.

The set $\Theta_I$, first formalized in this way in Galichon and Henry (2006), is sometimes called "sharp identification region" to emphasize the fact that it exhausts all the information on the parameter available in the model. No value $\theta \in \Theta_I$ could be rejected on the basis of the knowledge of the model and the observable distribution $P$ only. Take a parameter value $\theta \in \Theta$. It belongs to the identified set $\Theta_I$ if and only if there exists a joint distribution satisfying the required restrictions, in other words, if and only if there exists a "version" of $U$, i.e. a random vector $\tilde U$ with the same distribution as $U$, namely $\nu(\cdot | \theta)$, such that $Y \in G(\tilde U | \theta)$ with probability 1. Hence, denoting by $X \sim \mu$ the statement "the random vector $X$ has probability distribution $\mu$," we can characterize the identified set in the following way, which we take as our formal definition.

Definition 1 (Identified set).

$$\Theta_I = \left\{ \theta \in \Theta \;\middle|\; \exists\, \tilde Y \sim P,\ \tilde U \sim \nu(\cdot | \theta) : \mathbb{P}(\tilde Y \notin G(\tilde U | \theta)) = 0 \right\}.$$
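When $\mathcal{Y}$ and $\mathcal{U}$ are finite, membership in $\Theta_I$ can be checked by brute force. The sketch below is an illustration only, not the authors' procedure. It relies on the characterization invoked in the proof of Theorem 1 (Proposition 1 of Galichon and Henry, 2009): a coupling supported on the graph of $G$ exists if and only if $P(A) \leq \nu(\{u : G(u|\theta) \cap A \neq \emptyset\} | \theta)$ for every set $A$. The function name `in_identified_set` and the dictionary encoding of $P$, $\nu$ and $G$ are assumptions made for the example.

```python
from itertools import combinations


def in_identified_set(p, nu, G, tol=1e-12):
    """Check P(A) <= nu({u : G(u) meets A}) for every nonempty subset A.

    p  : dict mapping each outcome y to its probability under P
    nu : dict mapping each latent value u to its probability under nu(.|theta)
    G  : dict mapping each latent value u to the set G(u|theta) of outcomes
    (All three encodings are hypothetical conventions for this sketch.)
    """
    outcomes = list(p)
    for size in range(1, len(outcomes) + 1):
        for subset in combinations(outcomes, size):
            A = set(subset)
            prob_A = sum(p[y] for y in A)
            # nu-mass of latent values whose image under G intersects A
            mass_hitting_A = sum(w for u, w in nu.items() if G[u] & A)
            if prob_A > mass_hitting_A + tol:
                return False  # this A rules out the parameter value
    return True


if __name__ == "__main__":
    p = {0: 0.5, 1: 0.5}
    G = {"a": {0}, "b": {0, 1}}
    print(in_identified_set(p, {"a": 0.5, "b": 0.5}, G))
    print(in_identified_set(p, {"a": 0.7, "b": 0.3}, G))
```

Enumerating every subset costs $2^{|\mathcal{Y}|}$ checks, so this only suits toy examples; for efficient computation the paper points to Galichon and Henry (2008) and Beresteanu et al. (2008).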

Our inference method on the identified set will be based on a general way of combining sources of uncertainty (sampling uncertainty or data incompleteness) by composition of correspondences. Suppose the probability measure $Q$ on $\mathcal{Y}$ is the known distribution of a random vector $Z$ and that it is related to the true unknown distribution $P$ of the observed variables $Y$ by the following relation:

Assumption 4 (Dilation). There exists a correspondence $J : \mathcal{Y} \rightrightarrows \mathcal{Y}$ such that $\mathbb{P}(\tilde Z \notin J(\tilde Y)) \leq \beta$ for some $\tilde Z \sim Q$, $\tilde Y \sim P$ and $0 \leq \beta < 1$.

Assumption 4 characterizes the additional level of indeterminacy the analyst faces. The structural model is incomplete in the sense that the relation between unobserved heterogeneity $U$ and outcomes $Y$ is a many-to-many mapping. In addition, due to observability issues or sampling uncertainty, the distribution of outcomes $P$ is unknown and the relation between true outcome $Y$ and a variable $Z$ that we can simulate is also many-to-many.

Example 3 (Measurement error). Suppose true outcome $Y$ is mismeasured as $Z = Y + \epsilon$ and nothing is known about measurement error $\epsilon$ except that it is small, i.e. $\|\epsilon\| \leq \eta$ for some $\eta > 0$, with a degree of confidence $1 - \beta$. In that case, Assumption 4 holds with the correspondence $J$ defined by $J(y) = B(y, \eta)$ for all $y \in \mathcal{Y}$, where $B(y, \eta)$ is the closed ball centered at $y$ with radius $\eta$.

Example 4 (Censored outcomes). Suppose the true outcome $Y$ is reported with censoring as $Z = J(Y)$, where $J(y)$ returns the minimum of $y$ and an upper bound $B > 0$. Assumption 4 is satisfied with $\beta = 0$.

The following theorem shows how the two levels of uncertainty can be combined without loss of information.

Theorem 1 (Composition Theorem). Under assumptions 1 to 4, there exist $\tilde Z \sim Q$ and $\tilde U \sim \nu$ such that $\mathbb{P}(\tilde Z \notin J \circ G(\tilde U | \theta)) \leq \beta$.

Theorem 1 implies that when the distribution $P$ of outcomes is unknown, the infeasible identified set $\Theta_I$ can be replaced by a feasible identified set

$$\tilde\Theta_I = \left\{ \theta \in \Theta \;\middle|\; \exists\, \tilde Z \sim Q,\ \tilde U \sim \nu(\cdot | \theta) : \mathbb{P}(\tilde Z \notin J \circ G(\tilde U | \theta)) \leq \beta \right\}.$$

Proof of Theorem 1.

Under Assumptions 1 and 3, there is a pair $(Y, U)$ such that $Y \sim P$ and $U \sim \nu(\cdot | \theta)$ and $Y \in G(U | \theta)$ almost surely. Equivalently, the minimum over all pairs $(\tilde Y, \tilde U)$, with $\tilde Y \sim P$ and $\tilde U \sim \nu(\cdot | \theta)$, of the quantity $E(1\{\tilde Y \notin G(\tilde U | \theta)\})$ is zero. By Proposition 1 of Galichon and Henry (2009) (hereafter denoted P1), the latter is equivalent to

$$\sup_A \big( P(A) - \nu(\{u \in \mathcal{U} : G(u | \theta) \cap A \neq \emptyset\} | \theta) \big) = 0, \tag{1.1}$$

where the sup is over all Borel subsets $A$ of $\mathcal{Y}$. Similarly, by Assumption 4, the minimum over all pairs $(\tilde Z, \tilde Y)$, with $\tilde Z \sim Q$ and $\tilde Y \sim P$, of the quantity $E(1\{\tilde Z \notin J(\tilde Y)\})$ is smaller than or equal to $\beta$. By P1 the latter is equivalent to

$$\sup_A \big( Q(A) - P(\{y \in \mathcal{Y} : J(y) \cap A \neq \emptyset\}) \big) \leq \beta. \tag{1.2}$$

Denote $J^{-1}(A) = \{y \in \mathcal{Y} : J(y) \cap A \neq \emptyset\}$. By (1.1), we have $P(J^{-1}(A)) \leq \nu(\{u \in \mathcal{U} : G(u | \theta) \cap J^{-1}(A) \neq \emptyset\} | \theta)$ for all Borel subsets $A$ of $\mathcal{Y}$. Hence, (1.2) yields

$$\sup_A \big( Q(A) - \nu(\{u \in \mathcal{U} : G(u | \theta) \cap J^{-1}(A) \neq \emptyset\} | \theta) \big) \leq \beta.$$

Hence

$$\sup_A \big( Q(A) - \nu(\{u \in \mathcal{U} : J \circ G(u | \theta) \cap A \neq \emptyset\} | \theta) \big) \leq \beta, \tag{1.3}$$

since $G(u | \theta) \cap J^{-1}(A) \neq \emptyset$ and $J \circ G(u | \theta) \cap A \neq \emptyset$ are equivalent. Finally, by a third application of P1, (1.3) is equivalent to $\beta$ weakly dominating the minimum of the quantity $E(1\{\tilde Z \notin J \circ G(\tilde U | \theta)\})$ over all pairs $(\tilde Z, \tilde U)$ with $\tilde Z \sim Q$ and $\tilde U \sim \nu(\cdot | \theta)$ and the result follows. $\square$

Footnote to the proof of Theorem 1: The current proof, suggested by Alexei Onatski, is shorter and simpler than our original proof in previous versions of the paper. We are responsible for any remaining errors.

To illustrate the composition theorem, consider a special case of the revealed preference Example 1 combined with measurement error, as in Example 3. Suppose we observe the share $Y$ of risky assets in the portfolio of investors, who are assumed to maximize the expectation of a CARA utility function $u(Y, A; U) = \exp(-U[(1 - Y) + YA])$, hence they are assumed to maximize $Y E(A) - U Y^2 \mathrm{var}(A)/2$, where $E(A)$ is the perceived mean of the risky asset $A$ and $\mathrm{var}(A)$ its perceived variance. We further suppose investors differ by their risk aversion $U$, for which the analyst hypothesizes an exponential distribution ($F_U(u; \theta) = \mathbb{P}(U \leq u; \theta) = 1 - e^{-\theta u}$) and by their perception of the riskiness of the asset, and all the analyst knows is a pair of bounds $(\underline\lambda, \bar\lambda)$ such that $E(A)/\mathrm{var}(A) \in [\underline\lambda, \bar\lambda]$. The investor's maximization yields $Y = E(A)/(U \mathrm{var}(A))$, so that the model can be summarized by $Y \in G(U) = [\underline\lambda/U, \bar\lambda/U]$. Values $\underline\lambda = 50\%$ and $\bar\lambda = 200\%$ can be calibrated according to Weitzman (2007). The true distribution of income $Y$ is unknown, but the true cumulative distribution of a mismeasured version $Z = Y + \epsilon$, with $\|\epsilon\| \leq \eta$ a.s., is $F_Z(z) = \mathbb{P}(Z \leq z) = \exp(-1/z)$. By Theorem 1, the identified set $\tilde\Theta_I$ can be derived from the composed correspondence $J \circ G : u \rightrightarrows J \circ G(u) = [\underline\lambda/u - \eta, \bar\lambda/u + \eta]$, where $J : y \rightrightarrows J(y) = B(y, \eta)$ is a dilation satisfying Assumption 4. The cumulative distribution of risk aversion satisfies

$$1 - e^{-\theta u} = \mathbb{P}(U \leq u) \in \big[ \mathbb{P}(\bar\lambda/u + \eta \leq Z),\ \mathbb{P}(\underline\lambda/u - \eta \leq Z) \big] = \big[ 1 - e^{-(\bar\lambda/u + \eta)^{-1}},\ 1 - e^{-(\underline\lambda/u - \eta)^{-1}} \big].$$

Hence, for all $u > 0$, $(\bar\lambda + \eta u)^{-1} \leq \theta \leq (\underline\lambda - \eta u)^{-1}$. Therefore, the identified set can be derived as $\tilde\Theta_I = [1/\bar\lambda, 1/\underline\lambda]$.

2. Dilation method and sampling uncertainty

2.1. Confidence regions.

The main application of the Composition Theorem (Theorem 1) that we consider here is the construction of valid confidence regions for partially identified models, based on a sample of realizations of the observable variables.

Assumption 5 (Sampling). Let $(Y_1, \ldots, Y_n)$ be a sample of independent and identically distributed random vectors with distribution $P$ and let $P_n = \frac{1}{n}\sum_{j=1}^n \delta_{Y_j}$ be the empirical distribution associated with the sample.

We propose a new method to construct a confidence region for the identified set $\Theta_I$ of Definition 1.

Definition 2 (Confidence region). A valid $\alpha$-confidence region for the identified set $\Theta_I$ is a sequence of random regions $\Theta_n^\alpha$ satisfying $\liminf_n \mathbb{P}(\Theta_I \subseteq \Theta_n^\alpha) \geq 1 - \alpha$.

As noted in Imbens and Manski (2004), this is not the only way to define confidence regions in a partially identified setting, as one might also consider coverage (pointwise or uniform) of each value within the identified set. Here we concentrate on a situation where one cannot assume that any value within the identified set can be construed as the true value, so that the whole set is the object of interest. Moreover, a confidence region for the identified set is also a uniform confidence region for each of its elements. The construction of the confidence region is based on a new nonparametric way of controlling sampling uncertainty and its validity relies on a corollary to the Composition Theorem (Theorem 1 of Section 1). We construct sample based sets $J_n^\alpha$, where $\alpha \in (0, 1)$ is the desired confidence level, to account for the discrepancy between the empirical distribution $P_n$ associated with the sample and the true observable distribution $P$. We thereby obtain an analogue of Assumption 4:

Assumption 6 (Sample dilation). With probability $1 - \alpha_n$ such that $\limsup_n \alpha_n \leq \alpha$, conditionally on the sample $(Y_1, \ldots, Y_n)$, the sequence of correspondences $J_n^\alpha$ satisfies $\tilde Y \in J_n^\alpha(\tilde Y^*)$ almost surely for some $\tilde Y \sim P$, $\tilde Y^* \sim P_n$.

Heuristically, the region $J_n^\alpha$ satisfying Assumption 6 ensures that with suitable confidence, the realizations of the empirical distribution are caught by the enlarged realizations of the true distribution $J_n^\alpha(\tilde Y)$. Once the dilation $J_n^\alpha$ is obtained, the Composition Theorem can be applied to prove the following:

Theorem 2. Under assumptions 1, 2, 3, 5 and 6, $\Theta_n^\alpha := \{\theta \in \Theta \mid \exists\, \tilde Y^* \sim P_n,\ \tilde U \sim \nu(\cdot | \theta) : \mathbb{P}(\tilde Y^* \notin J_n^\alpha \circ G(\tilde U | \theta)) = 0\}$ is a valid $\alpha$-confidence region for the identified set $\Theta_I$.

The dilation $J_n^\alpha$ is chosen to control the confidence level: indeed, by Proposition 1 of Galichon and Henry (2009) (called P1 in the proof of Theorem 1), the statement $\exists\, \tilde Y^* \sim P_n,\ \tilde U \sim \nu(\cdot | \theta) : \mathbb{P}(\tilde Y^* \notin J_n^\alpha \circ G(\tilde U | \theta)) = 0$ is equivalent to $P(A) \leq P_n(J_n^\alpha(A))$ for all Borel subsets $A$ of $\mathcal{Y}$. Hence, the unknown distribution $P$ of an event $A$ is dominated by the empirical distribution of the dilation $J_n^\alpha(A)$ of the event $A$. As both $P_n$ and $\nu(\cdot | \theta)$ are known, the construction of $\Theta_n^\alpha$ is feasible and efficient methods to compute it were proposed in Galichon and Henry (2008) and Beresteanu et al. (2008).

2.2. Oracle dilation.

We now turn to the question of how to construct the dilation $J_n^\alpha$ that satisfies Assumption 6. When $Y$ is a random variable, such dilation will be obtained from uniform confidence bands for the quantile process.

Definition 3 (Quantile process). Let $F$ be the cumulative distribution of $Y$. Let $Q(t)$, $t \in [0, 1]$, be the quantile function of $Y$, defined by $Q(t) = \inf\{y \in [0, 1] : F(y) \geq t\}$. Call $Q_n$ the empirical quantile relative to the sample $(Y_1, \ldots, Y_n)$. It is defined by $Q_n(t) = Y_{(j)}$ for $j - 1 < nt \leq j$ for each $j$, with $Y_{(j)}$ denoting the $j$-th order statistic. The quantile process is defined as $q_n(t) := \sqrt{n}\,(Q_n(t) - Q(t))$.

The idea of the construction of dilations satisfying Assumption 6 is based on the quantile transformation. Indeed, letting $Z$ be a uniform random variable on $[0, 1]$ and defining $\tilde Y = Q(Z)$ and $\tilde Y^* = Q_n(Z)$, we have a pair of random variables $\tilde Y$ and $\tilde Y^*$ with respective probability distributions $P$ and $P_n$. Suppose a uniform confidence band is available for the quantile function of the form $\mathbb{P}(\eta_n := \sup_{0 \leq t \leq 1} |q_n(t)| \leq \tilde c_n(\alpha)) = 1 - \alpha_n$. Then, with probability $1 - \alpha_n$, we have $|\tilde Y^* - \tilde Y| = |Q_n(Z) - Q(Z)| \leq \tilde c_n(\alpha)/\sqrt{n}$ almost surely. Hence, the dilation $J_n^\alpha$ defined for all $y$ by $J_n^\alpha(y) = B(y, \tilde c_n(\alpha)/\sqrt{n})$ satisfies Assumption 6. Moreover, the choice of dilation $J_n^\alpha(y) = B(y, \tilde c_n(\alpha)/\sqrt{n})$ is optimal in the sense that, under the regularity conditions of Assumption 7, $|Q_n(Z) - Q(Z)|$ achieves the minimum of $|\tilde Y^* - \tilde Y|$ when $\tilde Y^*$ (respectively $\tilde Y$) ranges over the set of random variables with distribution $P_n$ (respectively $P$). Note that smaller dilations are desirable, as they maximize informativeness of the resulting confidence region.

The following conditions guarantee the existence of such uniform confidence bands for the quantile process.

Assumption 7 (Uniform quantile bands). The sample $\{Y_1, \ldots, Y_n\}$ is an iid sample of random variables with cumulative distribution function $F$ satisfying:
(i) $F(y)$ is twice continuously differentiable on its support $(a, b)$.
(ii) $F' = f > 0$ on $(a, b)$.
(iii) For some $\gamma > 0$, $\sup_{y \in (a,b)} F(y)(1 - F(y))\,|f'(y)|/f^2(y) \leq \gamma$.
(iv) $\limsup_{y \downarrow a} f(y) < \infty$ and $\limsup_{y \uparrow b} f(y) < \infty$.
(v) $f$ is nondecreasing (resp. nonincreasing) on an interval to the right of $a$ (resp. to the left of $b$).

A distribution function $F$ satisfying Assumption 7 is called tail monotonic with index $\gamma$ by Parzen (1979). To indicate the mildness of Assumption 7, Parzen (1979) gives the following example where it fails: $1 - F(y) = \exp(-y - C \sin y)$ with $0.[\ldots] < C < 1$. As shown below, under Assumption 7, asymptotic results on the empirical quantile process allow us to derive a dilation $J_n^\alpha$ that satisfies Assumption 6 for all $\alpha \in (0, 1)$. Define $c(\alpha)$ implicitly by $\mathbb{P}(\sup_{0 \leq t \leq 1} |B(t)| \leq c(\alpha)) = 1 - \alpha$, where $B(t)$ is a Gaussian process called a Brownian bridge. For any $\alpha \in (0, 1)$, […]

Proposition 1 (Oracle dilation). Under assumptions 5 and 7, the dilation $J_n^\alpha$ defined for each $y$ by $J_n^\alpha(y) = \big[ y - c(\alpha)/(\sqrt{n} f(y)),\ y + c(\alpha)/(\sqrt{n} f(y)) \big]$ satisfies Assumption 6.
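As a numerical aside on the constant $c(\alpha)$ in Proposition 1: the supremum of the absolute value of a Brownian bridge follows the classical Kolmogorov distribution, with $\mathbb{P}(\sup_t |B(t)| \leq x) = 1 - 2\sum_{k \geq 1} (-1)^{k-1} e^{-2k^2x^2}$. The sketch below, which is not part of the paper, inverts a truncation of that series by bisection. The function names, the truncation at 100 terms and the search bracket $[0.2, 5]$ are assumptions of this example; the truncated series is only reliable for $x$ of roughly $0.1$ or more. The sketch does nothing about the unknown density $f$, which remains the reason the oracle dilation is infeasible.

```python
import math


def kolmogorov_cdf(x, terms=100):
    """P(sup_{0<=t<=1} |B(t)| <= x) for a Brownian bridge B.

    Uses the alternating series 1 - 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 x^2),
    truncated after `terms` terms (adequate for x >= 0.1, inaccurate below).
    """
    if x <= 0:
        return 0.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
        for k in range(1, terms + 1)
    )
    return 1.0 - 2.0 * series


def critical_value(alpha, lo=0.2, hi=5.0, tol=1e-10):
    """Solve kolmogorov_cdf(c) = 1 - alpha for c by bisection on [lo, hi]."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < target:
            lo = mid  # c(alpha) lies to the right of mid
        else:
            hi = mid  # c(alpha) lies to the left of mid (or at mid)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for level in (0.01, 0.05, 0.10):
        print(level, round(critical_value(level), 4))
```

For $\alpha = 0.05$ this gives a value close to $1.358$, the familiar Kolmogorov-Smirnov critical constant.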

Proof of Proposition 1.

Under Assumption 7, we have the following strong approximation result in Csörgő (1983), Theorem 4.1.2, page 31:

$$\sup_{0 \leq t \leq 1} |f(Q(t))\, q_n(t) - B_n(t)| = O(n^{-1/2 + \epsilon}) \quad \text{a.s.}$$

for $\epsilon > 0$, where $B_n(t)$ is a sequence of Brownian bridges. Hence, the interval

$$Q_n(t) - \frac{c(\alpha)}{\sqrt{n}\, f(Q(t))} \leq Q(t) \leq Q_n(t) + \frac{c(\alpha)}{\sqrt{n}\, f(Q(t))}$$

is an asymptotically valid uniform confidence band for $Q(t)$, $0 \leq t \leq 1$, of level $1 - \alpha$. Take $Z$ a uniform random variable on $[0, 1]$. Define $\tilde Y^* := Q_n(Z)$ and $\tilde Y := Q(Z)$. By the quantile transform, $\tilde Y^*$ has distribution $P_n$ and $\tilde Y$ has distribution $P$. Therefore, with probability tending to $1 - \alpha$, there exist $\tilde Y^* \sim P_n$ and $\tilde Y \sim P$ such that $\tilde Y - c(\alpha)/(\sqrt{n} f(\tilde Y)) \leq \tilde Y^* \leq \tilde Y + c(\alpha)/(\sqrt{n} f(\tilde Y))$ almost surely, and the result follows. $\square$

The dilation in Proposition 1 is infeasible, as it depends on the unknown $f$ and it relies on quantiles $c(\alpha)$ that are difficult to compute. We develop a feasible alternative in our dilation bootstrap procedure in Section 2.3. We resort to a bootstrap matching algorithm to construct feasible versions of the dilation above.

2.3. Bootstrap dilation.

To introduce the simple idea underlying the method, consider the sample $(Y_1, \ldots, Y_n)$ and a given bootstrap realization $(Y_1^b, \ldots, Y_n^b)$ as in Figure 1. As before, $(Y_{(1)}, \ldots, Y_{(n)})$ are the order statistics associated with the sample and $(Y^b_{(1)}, \ldots, Y^b_{(n)})$ are the order statistics associated with the bootstrap realization (with arbitrary ranking of the ties). In the illustrative example of Figure 1, the smallest observation of the initial sample $Y_{(1)}$ was drawn once in the bootstrap sample, the second smallest was not drawn, the third smallest was drawn once, the fourth smallest twice, and the largest $Y_{(n)}$ was drawn twice. The arrows in the figure represent the bijection that matches the $j$-th order statistic of the initial sample $Y_{(j)}$ with the $j$-th order statistic of the bootstrap sample $Y^b_{(j)}$ for each $j = 1, \ldots, n$.

Figure 1. Bootstrap Quantile Matching.

To achieve a bootstrap analog of Assumption 6, we need a dilation $J_n^b$ and a permutation $\sigma$ of $\{1, \ldots, n\}$ such that $Y_{(j)} \in J_n^b(Y^b_{\sigma(j)})$ for all $j = 1, \ldots, n$. One such permutation matches the order statistics of the initial sample with the order statistics of the bootstrap sample. In this matching in the example of Figure 1, $Y_{(1)}$ is matched with $Y^b_{(1)}$, namely with itself. $Y_{(2)}$ was not drawn in the bootstrap sample, so it is matched with $Y^b_{(2)}$, which is equal to $Y_{(3)}$, for whom $Y_{(2)}$ is the second closest neighbor in Euclidean distance. $Y_{(3)}$ is the nearest neighbor of its match $Y^b_{(3)} = Y_{(4)}$, $Y_{(4)}$ is matched with itself, $Y_{(n-1)}$ is the nearest neighbor of its match $Y^b_{(n-1)} = Y_{(n)}$ and finally $Y_{(n)}$ is matched with itself. The longest distance between two matches is $\eta_n^b = |Y^b_{(n-1)} - Y_{(n-1)}| = |Y_{(n)} - Y_{(n-1)}|$. Hence, if $J_n^b(y) = B(y, \eta_n^b)$, $Y^b_{(j)} \in J_n^b(Y_{(j)})$ will be satisfied for all $j = 1, \ldots, n$ in this particular bootstrap sample realization. The chosen matching in Figure 1 characterizes the bootstrap quantile process (see Definition 4) and it minimizes the largest deviation $\eta_n^b$, and hence produces the smallest dilation.

Definition 4 (Bootstrap quantile process). A bootstrap sample is a sample $(Y_1^b, \ldots, Y_n^b)$ of i.i.d. variables with distribution $P_n$. The quantile function of the distribution of the bootstrap sample (bootstrap quantile) is defined for each $t \in [0, 1]$ by $Q_n^b(t) = Y^b_{(j)}$ for $j - 1 < nt \leq j$. The bootstrap quantile process is defined as $q_n^b(t) := \sqrt{n}\,(Q_n^b(t) - Q_n(t))$. Call $\eta_n^b$ the maximum of the bootstrap quantile process.

In the illustrative example of Figure 1, the bootstrap quantile process attains its maximum over $t \in [0, 1]$ at $t$ such that $n - 2 < nt \leq n - 1$, with $\eta_n^b = Y^b_{(n-1)} - Y_{(n-1)} = Y_{(n)} - Y_{(n-1)}$. In the population of bootstrap realizations, $\eta_n^b$ has distribution with $1 - \alpha$ quantile $c_n^*(\alpha)$. The latter can be approximated by simulation with a large number $B$ of bootstrap replications. We obtain $\eta_n^b$ for each $b = 1, \ldots, B$. Call $\hat c_n^*(\alpha)$ the $[B\alpha]$-th largest among the $\eta_n^b$'s (where $[\,\cdot\,]$ denotes integer part) and $\hat J_n^{\alpha,*}(y) = B(y, \hat c_n^*(\alpha))$; then by construction, a proportion $1 - \alpha$ of the bootstrap samples indexed by $b = 1, \ldots, B$ will satisfy $Y^b_{(j)} \in \hat J_n^{\alpha,*}(Y_{(j)})$ for all $j = 1, \ldots, n$. By Theorem 2 of Singh (1981) (see also Theorem 5.1 of Bickel and Freedman, 1981), the bootstrap quantile process $(q_n^b(t))_{t \in [0,1]}$ has almost surely the same uniform weak limit as the empirical quantile process $(q_n(t))_{t \in [0,1]}$ and we therefore have the following result on the validity of the bootstrap dilation.

Proposition 2 (Bootstrap dilation). Let $c_n^*(\alpha)$ be the $1 - \alpha$ quantile of the supremum $\eta_n^b$ of the bootstrap quantile process $(q_n^b(t))_{t \in [0,1]}$. Under assumptions 5 and 7, the dilation defined for each $y$ by $J_n^{\alpha,*}(y) = B(y, c_n^*(\alpha)/\sqrt{n})$ satisfies Assumption 6 almost surely.

Note that in the univariate case, the simulation approximation $\hat c_n^*(\alpha)$ to the quantile $c_n^*(\alpha)$ is very simple to derive. The simplest algorithm requires ordering the initial sample and each of the bootstrap samples and computing the maximum of $|Y^b_{(j)} - Y_{(j)}|$ over $j = 1, \ldots, n$. However, we have introduced, with Figure 1 and the discussion above, an equivalent algorithm, which runs as follows: for each $b = 1, \ldots, B$, find the permutation $\sigma$ over $\{1, \ldots, n\}$ which minimizes the quantity $\max_j |Y_j^b - Y_{\sigma(j)}|$. Unlike the algorithm based on the order statistics, such an optimal matching or optimal assignment procedure can be performed regardless of dimension and efficient algorithms and implementations are available.
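The two univariate algorithms just described can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the radius is left on the unscaled scale of $\max_j |Y^b_{(j)} - Y_{(j)}|$ (no $\sqrt{n}$ factor), and the brute-force search over permutations stands in for a proper assignment solver, so it is usable only for very small $n$.

```python
import itertools
import random


def max_dev_sorted(sample, boot):
    """Largest gap when the j-th order statistics of both samples are paired."""
    return max(abs(b - y) for y, b in zip(sorted(sample), sorted(boot)))


def max_dev_bruteforce(sample, boot):
    """Smallest achievable largest gap over all one-to-one matchings.

    Enumerates every permutation of the sample, so only for tiny n.
    """
    return min(
        max(abs(b - y) for b, y in zip(boot, perm))
        for perm in itertools.permutations(sample)
    )


def bootstrap_etas(sample, n_boot=999, seed=0):
    """Draw n_boot bootstrap samples and return eta for each of them."""
    rng = random.Random(seed)
    n = len(sample)
    return [
        max_dev_sorted(sample, [rng.choice(sample) for _ in range(n)])
        for _ in range(n_boot)
    ]


def radius_from_etas(etas, alpha):
    """The [B * alpha]-th largest eta (at least the largest when B * alpha < 1)."""
    k = max(1, int(len(etas) * alpha))
    return sorted(etas, reverse=True)[k - 1]


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [rng.gauss(0.0, 1.0) for _ in range(50)]
    etas = bootstrap_etas(sample, n_boot=999, seed=1)
    print("dilation radius:", radius_from_etas(etas, alpha=0.05))
```

On small samples `max_dev_sorted` and `max_dev_bruteforce` return the same value, which is the equivalence the text relies on. In higher dimensions the sorted version is unavailable and the brute-force search would be replaced by a bottleneck assignment algorithm.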

3. Simulation evidence

We assess the small sample performance of the dilation bootstrap on the following simulation design. Observable variables $Y$ have a standard normal distribution, while unobserved heterogeneity variable $U$ is assumed to follow a normal distribution with mean $\theta$ and variance 1. The cumulative distribution of $U$ is denoted $F_U$. The model correspondence $G$ is defined for each $u$ by $G(u) = [u - 1, u + 1]$, so that the model is characterized by the relation $Y \in G(U) = [U - 1, U + 1]$. Therefore, the identified set can be immediately derived as $\Theta_I = [-1, 1]$. We use […],000 initial samples of size $n = 50$, $100$ and $500$. For each empirical distribution $P_n$, we compute $\hat c_n^*(\alpha)$ with 5,000 bootstrap replications, and use the dilation $\hat J_n^{\alpha,*}(y) = B(y, \hat c_n^*(\alpha))$, so that a parameter value $\theta$ belongs to the $(1 - \alpha)$-confidence region $\Theta_{CR}$ for $\Theta_I$ if and only if there exist $\tilde Y^* \sim P_n$ and $\tilde U \sim F_U(\cdot\,; \theta)$ such that $P^*(\tilde Y^* \in \hat J_n^{\alpha,*} \circ G(\tilde U)) = 1$. Since $P_n$ and $F_U$ are known, the latter condition can be checked efficiently with the core determining class method of Galichon and Henry (2008), Section 2.3. We report Monte Carlo coverage probabilities in case of significance level $\alpha = 0.01$, $0.05$ and $0.10$.

Table 1. Rejection levels from the dilation bootstrap procedure.

Sample size      50        100       500
α = 0.01         0.0122    0.0118    0.0108
α = 0.05         0.0324    0.0364    0.0438
α = 0.10         0.0590    0.0648    0.0754

The most notable feature is the tendency to under-reject in small samples, especially for true size $\alpha = 0.10$ but also for true size $\alpha = 0.05$. For true size $\alpha = 0.01$, on the other hand, the procedure displays slight over-rejection in small samples. For comparison purposes, we also report coverage probabilities from the generic subsampling procedure for set coverage in Chernozhukov et al., 2007, based on the criterion function $\sqrt{n}\,\max\big(\max_{j=1,\dots,n}[F_n(Y_j) + F_U(Y_j+1)],\ \max_{j=1,\dots,n}[-F_n(Y_j) + F_U(Y_j-1)],\ \dots\big)$. Results for subsample sizes […], […], 48 when $n = 50$, 85, […], 95 for $n = 100$ and 425, […], 475 for $n = 500$ are reported in Table 2. We find that the procedure over-rejects in all but one case, and there is moderate dependence on the choice of subsample size.

4. Extensions

The dilation method and dilation bootstrap have natural extensions to the case where observable variables $Y$ are multivariate and to the case where $Y$ is discrete. We consider both extensions in the following subsections.

4.1. Multivariate extension. Consider first the case where the random vector of observable variables $Y$ has dimension $d \ge 2$. This extension allows the consideration of multiple equations models. Moreover, it is particularly relevant in this partially


Table 2. Rejection levels from the infeasible CHT procedure. (Columns: sample size, subsample size, α = 0.01, α = 0.05, α = 0.10.)

Example 5 (Single equation model with endogeneity). Suppose the econometric model under consideration is $Z = f(X, U; \theta)$, where $Z$ and $X$ are observed random variables, $U$ is unobserved heterogeneity and $f$ is a function parameterized by $\theta$. Suppose no assumption is made on the dependence between $X$ and $U$. Define $Y = (X, Z)'$. Define the correspondence $G$ for each $u$ by $(x, z) \in G(u; \theta)$ if and only if $z = f(x, u; \theta)$. Then the model can be rewritten $Y \in G(U; \theta)$ as in Assumption 1.

In case $Y$ is multivariate, although Theorem 2 holds irrespective of dimension, the construction of a dilation satisfying Assumption 6 can no longer rely on the traditional quantile process as in Propositions 1 and 2. However, the quantity $\eta_n = \inf\{\|\tilde Y^* - \tilde Y\|_\infty : \tilde Y^* \sim P_n,\ \tilde Y \sim P\}$ is still well defined. When attained, it is achieved by a pair of random vectors $(\tilde Y^*, \tilde Y)$ with marginal distributions $P_n$ and $P$, which minimizes the largest deviation. Equivalently, there exist $\tilde Y^* \sim P_n$ and $\tilde Y \sim P$ such that $\tilde Y^*$ belongs to a closed ball $B(\tilde Y, \eta_n)$ centered on $\tilde Y$ and with radius $\eta_n$, i.e. such that $E[1\{\tilde Y^* \notin B(\tilde Y, \eta_n)\}] = 0$.

When $Y$ is uniformly distributed on the unit cube $[0,1]^d$, the quantity $\eta_n$ is well studied in the probability literature. Hence, using asymptotic results on the quantity $\eta_n$ in the literature, specifically Leighton and Shor, 1989 for the case $d = 2$ and Shor and Yukich, 1991 for the case $d \ge 3$, we can derive analytical formulae for the dilation $J_n$:

Proposition 3 (Minimax matchings). There exist a constant $c > 0$ and a function $c_d > 0$ of the dimension $d$ of $Y$ such that $J_n(y) = B(y, c\,(\ln n)^{3/4}/\sqrt{n})$ satisfies Assumption 6 with $\alpha_n = n^{-c\sqrt{\ln n}}$ when $d = 2$ and $J_n(y) = B(y, c_d\,(\ln n/n)^{1/d})$ satisfies Assumption 6 for any $\alpha \in [0,1]$ when $d \ge 3$.

However, the results of Proposition 3 only pertain to the uniform case and produce conservative confidence regions. More generally, we propose constructing suitable dilations based on the distribution of $\eta_n$.

Definition 5 (Minimax matching). Call $c_n(\alpha)$ the $1-\alpha$ quantile of the distribution of $\eta_n = \inf\{\|\tilde Y^* - \tilde Y\|_\infty : \tilde Y^* \sim P_n,\ \tilde Y \sim P\}$.

By construction, we then see that the ball $B(y, c_n(\alpha))$ is a suitable dilation, in the sense that it satisfies Assumption 6.

Proposition 4 (Multivariate oracle dilation). The dilation $J^\alpha_n$ defined for each $y$ by $J^\alpha_n(y) = B(y, c_n(\alpha))$ satisfies Assumption 6.

As for the approximation of $c_n(\alpha)$ to obtain a feasible dilation, once again, although the quantile process is no longer defined, the matching algorithm described in Section 2.3 is easily generalizable and delivers a bootstrap dilation approximation of $J^\alpha_n$. The general procedure is described as follows.

Bootstrap Algorithm:
• Consider bootstrap samples $(Y^b_1, \dots, Y^b_n)$, $b = 1, \dots, B$, drawn from $P_n$ and call $P^b_n$ the empirical distribution of sample $b$.
• For each bootstrap replication $b$, define $\eta^b_n = \min_\sigma \max_{j \in \{1,\dots,n\}} \|Y^b_j - Y_{\sigma(j)}\|$, where $\sigma$ ranges over all permutations of $\{1, \dots, n\}$.
• Let $\hat c^*_n(\alpha)$ be the $[B\alpha]$-th largest among the $\eta^b_n$, $b = 1, \dots, B$, and for each $y$, set $\hat J^{\alpha,*}_n(y) = B(y, \hat c^*_n(\alpha))$.

The problem of finding the permutation that achieves $\eta^b_n$ is called bottleneck bipartite matching in the combinatorial optimization and operations research literature.

4.2. Case of discrete choice.

We now turn to the case of aggregate data from discrete choice. To fix ideas, consider a voting model, where $K$ parties are represented in $n$ electoral districts and observations $\hat p_{i,k}$, $i = 1, \dots, n$ and $k = 1, \dots, K$, are reported shares of votes for party $k$ in district $i$. Voter $l$ chooses the party that maximizes their utility $u^l_{i,k}(\theta) + \rho_{i,k} + \epsilon^l_{i,k}$, where $u_{i,k}(\theta)$ is a deterministic function of (observed covariates and) the unknown parameter $\theta$, $\rho_{i,k}$ are random district-party effects (independent of voters) and the $\epsilon^l_{i,k}$'s are i.i.d. type I extreme value random utilities. True vote shares for party $k$ in district $i$ satisfy
$$\ln p^*_{i,k}(\rho_{i,k}) = u_{i,k}(\theta) + \rho_{i,k} - \ln \sum_{k'} \exp(u_{i,k'}(\theta) + \rho_{i,k'}).$$
True shares $p^*_{i,k}$ are unobserved, however, due to the possibility of electoral fraud. Reported shares $p_{i,k}$ are assumed to satisfy $p_{i,k} \ge p^*_{i,k}$ when a representative of party $k$ is present during the vote count in district $i$. In districts where no party representative is present, the situation is equivalent to missing data on vote shares. Let $X_{i,k}$ be equal to 1 if a representative of party $k$ is present in district $i$ during vote count, and zero otherwise. We assume $X = (X_{i,k})_{i=1,\dots,n;\,k=1,\dots,K}$ is exogenous. The correspondence characterizing the model is
$$G\big((\rho_{i,k})_{k=1}^K \mid X; \theta\big) = \Big\{ (p_{i,k})_{k=1}^K : \sum_{k=1}^K p_{i,k} = 1;\ p_{i,k} \ge p^*_{i,k}(\rho_{i,k})\, X_{i,k},\ \text{each } k \Big\}.$$
District $i$ has $n_i$ voters. Call $\hat p_{i,k}$ the proportion of votes in district $i$ reported as going to party $k$ and write $\hat p_i = (\hat p_{i,k})_{k=1,\dots,K}$. By the central limit theorem, $\sqrt{n_i}(\hat p_i - p_i)$ has Gaussian limiting distribution with zero mean and covariance matrix $V_i$, with diagonal elements $p_{i,k}(1 - p_{i,k})$ and off-diagonal elements $-p_{i,k}\, p_{i,k'}$.

Call $Z_i$ a random vector with distribution $N(0, V_i/n_i)$ and let $\eta_i$ be such that $P(Z_i \notin B(0, \eta_i)) = \alpha_i$, where $B(0, \eta_i)$ is the open ball centered at zero with radius $\eta_i$. Define the dilation $J^{\alpha_i}_{n_i}$ for each $p$ by $J^{\alpha_i}_{n_i}(p) = B(p, \eta_i)$. Then $J^\alpha_n(p) = \bigcup_i J^{\alpha_i}_{n_i}(p) = B(p, \max_i \eta_i)$ satisfies Assumption 6 for $\alpha = \limsup_n \prod_{i=1}^n \alpha_i$.

In the two-party case, call $\hat p_i$ the reported share of votes for party 1, $p_i$ the true or population reported share and $p^*_i$ the true share (absent reporting fraud). The true share satisfies
$$\ln p^*_i(\rho_i) = u_{i,1}(\theta) + \rho_{i,1} - \ln\big(\exp(u_{i,1}(\theta) + \rho_{i,1}) + \exp(u_{i,2}(\theta) + \rho_{i,2})\big).$$
Because of fraud issues, all we know about the relation between $p_i$ and $p^*_i$ is the following:

$p_i \ge p^*_i$ if party 1 places an observer in district $i$.
$p_i \le p^*_i$ if party 2 places an observer in district $i$.

Note that reported vote shares are equal to true vote shares in case both parties have observers present for vote count. Letting $X_{i,k}$ take value 1 if party $k$ places an observer in district $i$ and zero otherwise, the correspondence characterizing the model is
$$G(\rho_i \mid X, \theta) = \{ p_i : p_i \ge p^*_i(\rho_i)\, X_{i,1} \text{ and } (1 - p_i) \ge (1 - p^*_i(\rho_i))\, X_{i,2} \}$$
$$= \left[ \frac{X_{i,1} \exp(u_{i,1}(\theta) + \rho_{i,1})}{\exp(u_{i,1}(\theta) + \rho_{i,1}) + \exp(u_{i,2}(\theta) + \rho_{i,2})},\ 1 - \frac{X_{i,2} \exp(u_{i,2}(\theta) + \rho_{i,2})}{\exp(u_{i,1}(\theta) + \rho_{i,1}) + \exp(u_{i,2}(\theta) + \rho_{i,2})} \right].$$
By the central limit theorem, $\sqrt{n_i}(\hat p_i - p_i)$ has Gaussian limiting distribution with zero mean and variance $p_i(1 - p_i)$. Call $c_{\alpha_i/2}$ the quantile of level $1 - \alpha_i/2$ of the standard normal distribution and let $\eta = \max_i \eta_i$ with $\eta_i = c_{\alpha_i/2} \sqrt{p_i(1 - p_i)/n_i}$. Then the dilation defined for each $p$ by $J^\alpha_n(p) = [p - \eta, p + \eta]$ satisfies Assumption 6 with $\alpha = \limsup_n \prod_i \alpha_i$. The composition of the dilation $J^\alpha_n$ and the correspondence $G$ yields
$$J^\alpha_n \circ G(\rho \mid X; \theta) = [X_1\, p^*(\rho) - \eta,\ 1 - X_2 (1 - p^*(\rho)) + \eta].$$
The region $\tilde\Theta_I$ containing all $\theta$ such that $\hat p \in J^\alpha_n \circ G(\rho \mid X, \theta)$ a.s. is therefore a valid confidence region for the identified set and can be computed efficiently using methods proposed in Galichon and Henry, 2008.
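As a numerical illustration of the two-party construction, the following Python sketch computes the radius $\eta = \max_i c_{\alpha_i/2}\sqrt{p_i(1-p_i)/n_i}$ and the endpoints of the dilated interval $[X_1 p^* - \eta,\ 1 - X_2(1 - p^*) + \eta]$. The function names and the example inputs are hypothetical, and the caller is assumed to supply values for the shares $p_i$ (in practice one might plug in reported shares, which is an assumption of this sketch rather than a step stated in the text).

```python
from math import sqrt
from statistics import NormalDist


def two_party_dilation_radius(p, n_voters, alpha_i):
    """eta = max_i c_{alpha_i/2} * sqrt(p_i * (1 - p_i) / n_i).

    p, n_voters and alpha_i are sequences with one entry per district.
    """
    etas = []
    for p_i, n_i, a_i in zip(p, n_voters, alpha_i):
        # Standard normal quantile of level 1 - alpha_i / 2.
        c = NormalDist().inv_cdf(1.0 - a_i / 2.0)
        etas.append(c * sqrt(p_i * (1.0 - p_i) / n_i))
    return max(etas)


def dilated_interval(p_star, x1, x2, eta):
    """Endpoints of J o G = [X1 p* - eta, 1 - X2 (1 - p*) + eta].

    x1 and x2 are the 0/1 observer indicators for parties 1 and 2.
    """
    lower = x1 * p_star - eta
    upper = 1.0 - x2 * (1.0 - p_star) + eta
    return lower, upper
```

When both parties have observers (`x1 = x2 = 1`), the interval collapses to $[p^* - \eta, p^* + \eta]$; when neither does, it becomes $[-\eta, 1 + \eta]$, reflecting the missing-data case.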

Conclusion

We have proposed a method to combine several sources of uncertainty, such as missing or corrupted data and structural incompleteness in the model, through a composition of correspondences. We show that our composition theorem applies in particular to the construction of confidence regions in partially identified models of general form. In that case, the composition theorem is applied to the composition of the correspondence that defines the econometric structure and a dilation of the sample space that controls the significance level and allows the unknown distribution of observable data to be replaced by the empirical distribution of the sample in the characterization of compatibility between model and data. An important computational advantage of this method over previously proposed confidence regions for partially identified parameters is that the dilation is performed independently of the structural parameter, hence needs to be performed only once. The remaining search over the parameter space is purely deterministic. The dilation is obtained through a minimax matching procedure. It is equivalent to a uniform confidence band for the quantile process when the dimension of the endogenous variable is one; however, it has no parallel in higher dimensions. The method is shown to perform well in simulation experiments.

Acknowledgements

We thank Christian Bontemps, Gary Chamberlain, Victor Chernozhukov, Pierre-André Chiappori, Ivar Ekeland, Rustam Ibragimov, Guido Imbens, Thierry Magnac, Francesca Molinari, Alexei Onatski, Geert Ridder, Bernard Salanié, participants at the “Semiparametric and Nonparametric Methods in Econometrics” conference in Oberwolfach and seminar participants at BU, Brown, CalTech, Chicago, École polytechnique, Harvard, MIT Sloan, Northwestern, Toulouse, UCLA, UCSD and Yale for helpful comments (with the usual disclaimer). Both authors gratefully acknowledge financial support from NSF grant SES 0532398 and from Chaire AXA “Assurance des Risques Majeurs” and Chaire Société Générale “Risques Financiers”. Galichon’s research is partly supported by Chaire EDF-Calyon “Finance and Développement Durable” and FiME, Laboratoire de Finance des Marchés de l’Energie. Henry’s research is also partly supported by SSHRC Grant 410-2010-242.

References

Beresteanu, A., Molchanov, I., & Molinari, F. (2008). Sharp identification regions in models with convex predictions: Games, individual choice, and incomplete data [cemmap working paper CWP27/09].
Bickel, P., & Freedman, D. (1981). Some asymptotic theory for the bootstrap. Annals of Statistics, 1196–1217.
Bjorn, P., & Vuong, Q. (1985). Simultaneous equations models for dummy endogenous variables [Caltech Working Paper 537].
Blundell, R., Browning, M., & Crawford, I. (2008). Best nonparametric bounds on demand responses. Econometrica, 1227–1262.
Brock, W., & Durlauf, S. (2001). Discrete choice with social interactions. Review of Economic Studies, 235–265.
Chen, X., Hong, H., & Tamer, E. (2005). Measurement error models with auxiliary data. Review of Economic Studies, 343–366.
Chernozhukov, V., Hong, H., & Tamer, E. (2007). Estimation and confidence regions for parameter sets in econometric models. Econometrica, 1243–1285.
Csörgő, M. (1983). Quantile processes with statistical applications. Regional Conference Series in Applied Mathematics.
Doss, H., & Gill, R. (1992). An elementary approach to weak convergence for quantile processes, with applications to censored survival data. Journal of the American Statistical Association, 869–877.
Galichon, A., & Henry, M. (2006). Inference in incomplete models [available from SSRN at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=886907].
Galichon, A., & Henry, M. (2008). Set identification in models with multiple equilibria [forthcoming in the Review of Economic Studies].
Galichon, A., & Henry, M. (2009). A test of non-identifying restrictions and confidence regions for partially identified parameters. Journal of Econometrics, 186–196.
Imbens, G., & Manski, C. (2004). Confidence intervals for partially identified parameters. Econometrica, 1845–1859.
Jovanovic, B. (1989). Observable implications of models with multiple equilibria. Econometrica, 1431–1437.
Koopmans, T., & Reiersøl, O. (1950). The identification of structural characteristics. Annals of Mathematical Statistics, 165–181.
Leighton, T., & Shor, P. (1989). Tight bounds for minimax grid matching with applications to the average case analysis of algorithms. Combinatorica, 161–187.
Magnac, T., & Maurin, E. (2008). Partial identification in monotone binary models: Discrete regressors and interval data. Review of Economic Studies, 835–864.
Manski, C. (1990). Nonparametric bounds on treatment effects. American Economic Review, 319–323.
Manski, C. (2004). Social learning from private experiences: The dynamics of the selection problem. Review of Economic Studies, 443–458.
Parzen, E. (1979). Non parametric statistical data modeling. Journal of the American Statistical Association, 105–131.
Shaikh, A., & Vytlacil, E. (2010). Partial identification in triangular systems of equations with binary dependent variables [forthcoming in Econometrica].
Shor, P., & Yukich, J. (1991). Minimax grid matching and empirical measures. Annals of Probability, 1338–1348.
Singh, K. (1981). On the asymptotic accuracy of Efron’s bootstrap. Annals of Statistics, 1187–1195.
Tamer, E. (2003). Incomplete simultaneous discrete response model with multiple equilibria. Review of Economic Studies, 147–165.
Weitzman, M. (2007). Subjective expectations and asset return puzzles. American Economic Review, 97