Games of Incomplete Information Played By Statisticians

Abstract

Players are statistical learners who learn about payoffs from data. They may interpret the same data differently, but have common knowledge of a class of learning procedures. I propose a metric for the analyst's "confidence" in a strategic prediction, based on the probability that the prediction is consistent with the realized data. The main results characterize the analyst's confidence in a given prediction as the quantity of data grows large, and provide bounds for small datasets. The approach generates new predictions, e.g. that speculative trade is more likely given high-dimensional data, and that coordination is less likely given noisy data.


Annie Liang ∗ July 13, 2020


∗ Department of Economics, University of Pennsylvania. Email: [email protected]. I am especially grateful to Drew Fudenberg for his guidance on this paper. This paper also benefitted from useful comments and suggestions by Jetlir Duraj, Siddharth George, Ben Golub, Jerry Green, Philippe Jehiel, Scott Kominers, David Laibson, Jonathan Libgober, Erik Madsen, Stephen Morris, Sendhil Mullainathan, Mariann Ollar, Harry Pei, Andrei Shleifer, Dov Samet, Tomasz Strzalecki, Satoru Takahashi, Anton Tsoy, and Muhamet Yildiz.

Predictions of play in incomplete information games depend crucially on the beliefs of the agents, but we rarely know what those beliefs are. A standard approach to modeling beliefs assumes that players share a common prior belief over states of the world, and form posterior beliefs using Bayesian updating. Under this approach, posterior beliefs that are commonly known must be identical (Aumann, 1976), and repeated communication of beliefs eventually leads to agreement (Geanakoplos and Polemarchakis, 1982). These implications conflict not only with considerable empirical evidence of public and persistent disagreement, but also with the more basic experience that individuals interpret the same information in different ways.

This paper relaxes the assumption of a common prior by supposing that players are statistical learners: they form beliefs about payoff-relevant parameters based on data, but potentially disagree on how to interpret that data. I define a learning rule to be any function that maps data (a sequence of signals) into a belief distribution over payoff-relevant parameters (a first-order belief). Players have common knowledge of some set of reasonable learning rules. For example, these learning rules may correspond to Bayesian updating from a set of prior beliefs, or they may be maps from data to beliefs based on frequentist estimates for the unknown parameter.
The special case of a singleton Bayesian learning rule returns the common prior assumption, but in general, the set of learning rules will produce different beliefs from the same data, which I interpret as the set of plausible beliefs. I impose a key restriction to structure the approach: for any realization of the data, each player's own belief about the parameter is a plausible belief; they assign probability 1 to all other players holding plausible beliefs; and so forth. Since the set of plausible beliefs is endogenous to the (random) data, so too are the strategic predictions that are consistent with this belief restriction.

The main contribution of the paper is a proposed metric for the analyst's "confidence" in a strategic prediction in this game. Specifically, consider the prediction that a given action is rationalizable. I quantify the analyst's confidence in this prediction via a confidence set: the upper bound of the confidence set is the probability that the action is rationalizable given some belief satisfying the belief restriction, and the lower bound is the probability that the prediction holds for all beliefs satisfying the restriction. Thus, if both of these probabilities are equal to one, the analyst has maximal certainty that the action is rationalizable, and if they are both zero, he has maximal certainty that it is not. In the intermediate cases, there is uncertainty about whether the action is rationalizable, and the confidence sets present a way to quantify the extent of that uncertainty.

Footnote on public disagreement: in financial markets, for example, individuals publicly disagree in their interpretations of earnings announcements (Kandel and Pearson, 1995), valuations of financial assets (Carlin et al., 2013), and forecasts for inflation (Mankiw et al., 2004).

The main results in this paper characterize various properties of these confidence sets, beginning with their asymptotic behaviors as the quantity of data grows large. I first show that even if an action is strictly rationalizable in the limiting infinite-data game, the analyst's confidence sets may be very different from $\{1\}$ for arbitrarily large quantities of data. Roughly, this is because the rate of convergence under different learning rules cannot be uniformly bounded, so it is always possible that some learning rule produces a belief that is very different from the others. If, however, the set of learning rules satisfies a uniform convergence property that I describe, then the following statements hold: if an action is strictly rationalizable at the limit, then the analyst's confidence set must converge to $\{1\}$ as the quantity of data gets large, and if an action is not rationalizable in the limit, the analyst's confidence set converges to $\{0\}$. (The intermediate case, in which actions are rationalizable but not strictly rationalizable, is more subtle; see Section 4 for further detail.)

Next, I consider the setting of small sample sizes, and bound the extent to which the analyst's confidence set differs from its asymptotic limit. These bounds depend on properties of the learning environment (specifically, the quantity of data, and how fast the different learning rules jointly recover the payoff-relevant parameter) as well as on a cardinal measure for how strict the solution is at the limit. I apply these bounds to characterize confidence sets for example games and sets of learning rules. They allow us to obtain specific, quantitative statements about confidence away from the limit of infinite data.

In some cases it is possible to go beyond these bounds, and to fully characterize the confidence set. Two such examples are given in Section 2, where I consider a trade game and a coordination game. I show that the proposed approach generates novel comparative statics in these games: speculative trade is predicted to be more plausible when agents learn from higher-dimensional data, and coordination is predicted to be less likely when agents observe noisy data.
These predictions, which hold even though data is public and common, are difficult to produce under the assumption of a common prior.

This paper builds on a literature on the role of the common prior assumption in economic theory. (See Morris (1995) for a survey of key conceptual points.) Here I focus on an argument that even if learning does produce common priors in the long run, this does not imply that we should see common priors given a finite quantity of data, especially if that data is complex and hard to interpret. Rather than taking the limiting common prior as one that is already reached, I ask what predictions we can make while data is still being accumulated. The confidence sets introduced in this paper provide a quantitative account of whether predictions implied by a (limiting) common prior also hold for small data sets.

This paper also contributes to a literature on the robustness of strategic predictions to the specification of player beliefs (Rubinstein, 1989; Dekel et al., 2006; Weinstein and Yildiz, 2007; Chen et al., 2010; Ely and Peski, 2011) and equilibrium selection in incomplete information games (Carlsson and van Damme, 1993; Kajii and Morris, 1997). At a technical level, the actions that are rationalizable under the belief restriction that I impose are ∆-rationalizable strategies of Battigalli and Siniscalchi (2003), where the set ∆ of first-order belief restrictions is endogenous to a learning process.
The permitted types converge in the uniform-weak topology, as proposed and characterized in Chen, di Tillio, Faingold, and Xiong (2010) and Chen, di Tillio, Faingold, and Xiong (2017), and I use results about this topology to prove several of the main results.

Conceptually, the goals of the present paper differ from the previous literature in several respects. First, my focus here is not on equilibrium selection (choosing one equilibrium from a set of many) but rather on providing a metric for confidence in a given prediction. Second, in contrast to the many binary or "qualitative" notions of robustness that have been proposed, this paper delivers a quantitative metric. Third, while the literature has primarily considered robustness to perturbations of beliefs, I am interested here also in predictions that we may make for beliefs that are far from the limiting beliefs. To discipline these beliefs, I endogenize the type space using a statistical learning foundation for belief formation. This aspect of the paper, which combines learning foundations with game theoretic implications, connects to papers such as Dekel et al. (2004), Esponda (2013), and Steiner and Stewart (2008), among others. The modeling of agents as "statisticians" or "machine learners" relates to a growing literature in decision theory (Gilboa and Schmeidler, 2003; Gayer et al., 2007; Al-Najjar, 2009; Al-Najjar and Pai, 2014) and game theory (Jehiel, 2005; Spiegler, 2016; Olea et al., 2017; Salant and Cherry, 2020; Haghtalab et al., 2020). Of these papers, …

Footnote: Other reasons that the common prior is tenuous include that the data itself may lead to incomplete learning if it is endogenously acquired, and that convergence of individual beliefs need not imply convergence in beliefs about beliefs (Cripps et al., 2008; Acemoglu et al., 2015).

Footnote: Such belief restrictions have also been usefully applied in the work of Ollar and Penta (2017), among others.

Footnote: Brandenburger et al. (2008) and Battigalli and Prestipino (2011) also motivate small type structures as emerging from learning, although they do not explicitly model a dynamic learning process.

In this section, I use the proposed approach to revisit two classic examples: a two-player coordination game and a two-player trading game. In each of these games, the assumption that players share a common prior has strong implications for strategic play. I show how we can relax this assumption by endogenizing disagreement based on two new primitives: a data-generating process and a set of learning rules. This approach allows us to relate confidence in a strategic prediction to primitives of the learning environment.

A Seller owns a good of unknown value $v \in \{0, 1\}$. He can either enter a market at cost $c$, or exit and keep the good. Entering leads to a simultaneous interaction with a Buyer, where the Seller chooses whether to sell the good at a (pre-set) posted price $p$, and the Buyer chooses whether to purchase the good at that price. The game and its payoffs are described in Figure 1 below.

Seller exits: payoffs (Buyer, Seller) are $(0, v)$.
Seller enters: payoffs (Buyer, Seller) are given by the following matrix, with the Buyer choosing the row and the Seller choosing the column.

                 Sell                  Don't Sell
    Buy          $v - p,\ p - c$       $0,\ v - c$
    Don't Buy    $0,\ v - c$           $0,\ v - c$

Figure 1: Description of Game

Suppose that the cost $c$ and price $p$ satisfy $0 < c < p < 1$, so the Seller prefers to sell at the low value and prefers to keep the good at the high value. If players share a common prior about $v$, then entering is not rationalizable for the Seller in this game, so trade will not occur (similar to the no-trade theorem of Milgrom and Stokey (1982)).

I suppose instead that players form beliefs based on a common data set of past goods and their valuations, but draw inferences from this data in different ways. Each good in the data is described by $m$ observable attributes with values normalized to lie in the interval $[-1, 1]$. Typical attributes are denoted $x \in X := [-1, 1]^m$, and the value of a good is a deterministic function $f : X \to \{0, 1\}$ of its attributes. The public data $z_n = \{(x_i, f(x_i))\}_{i=1}^{n}$ is a sequence of $n$ goods with attributes $x_i$ drawn from a uniform distribution on $X$, and the values of these goods.

Players have common knowledge that $f$ belongs to a certain family of functions $F$, which we can think of as the relevant set of models. For simplicity, let $F$ be the set of rectangular classification rules, i.e. functions $f_R(x) = \mathbb{1}(x \in R)$ indexed to hyper-rectangles $R$ in $[-1, 1]^m$. The attributes of the Seller's good are known to be the zero vector, denoted $x_S$, so the agents' beliefs over $F$ determine their beliefs about $v = f(x_S)$, the value of the Seller's good.

Agents may have different prior beliefs $\pi \in \Delta(F)$ over the set of models. Fixing a prior $\pi$, the posterior belief $\pi(f \mid z_n)$ given data $z_n$ is a re-normalization of the prior over all rules consistent with the observed data (see Figure 2), and the posterior probability assigned to the Seller's good having a high value is $\pi(\{f : f(x_S) = 1\} \mid z_n)$.
Define $B(z_n) \subseteq \Delta(\{0, 1\})$ to be the set of all posterior beliefs about $v$ that are consistent with Bayesian updating from some prior $\pi \in \Delta(F)$ and the data $z_n$.

Without making detailed assumptions about the priors that players hold, or the beliefs that players have about the priors of other players, I place the following restriction.

Assumption 1 (Restriction on Beliefs). For every realized $z_n$, players have common certainty in the event that all players have a first-order belief in $B(z_n)$.

That is, players are assumed to have beliefs consistent with Bayesian updating from some prior on $F$; they assign probability 1 to the other player having a posterior belief in this set, and so on.

Footnote on the no-trade result: If trade does not occur subsequently, then the Seller receives $v - c$ from entering but $v > v - c$ from exiting. Thus, entering can be rationalized only if trade subsequently occurs. But trade can occur only if the Buyer believes that $E(v) \geq p$ while the Seller believes that $E(v) \leq p$, implying $E(v) = p$ under their shared belief. The Seller can improve on his expected payoff of $p - c$ by choosing to exit.

Footnote on rectangular classification rules: For example, whether all attributes fall into an "acceptable" range, as judged by a downstream buyer.

Footnote on the state space: In more detail, the state space is $\Omega = F \times (X \times \{0, 1\})$ and the payoff-relevant parameter is $f(x_S)$. Agents have a prior belief $q$ over $\Omega$, where $\mathrm{marg}_F\, q$ has support on the set of models $\mathcal{R}$. Conditional on the true model being $f$, the data-generating process over $(X \times \{0, 1\})$ is $q((x, y) \mid f) = g(x) \cdot \mathbb{1}(y = f(x))$, where $g$ is the uniform density on $X$.

Figure 2: The circles represent the observed data $(x_i, f(x_i))$. Each good is described by a vector in $[-1, 1] \times [-1, 1]$. The circle is black if its valuation is 1 ($f(x_i) = 1$) and gray if its valuation is low ($f(x_i) = 0$). A rule is consistent with the data if it correctly predicts the valuation for each observation. Two rectangular classification rules are depicted: each predicts '1' for goods in the shaded region and '0' for goods outside. Both are consistent with the observed data.

The analyst's (ex-ante) confidence in predicting that entering is rationalizable for the Seller is defined as follows. For every number of observations $n$, let $\overline{p}_n$ be the measure of data sets $z_n$ such that entering is rationalizable for some belief satisfying Assumption 1, and let $\underline{p}_n$ be the measure of data sets $z_n$ such that entering is rationalizable for all beliefs satisfying Assumption 1. Then the set $[\underline{p}_n, \overline{p}_n]$ represents a "confidence set" that describes how certain the analyst should be in predicting that entering is rationalizable, when agents observe $n$ common data points. In the extreme case $\underline{p}_n = \overline{p}_n = 1$, the analyst is certain that entering is rationalizable; in the extreme case $\underline{p}_n = \overline{p}_n = 0$, the analyst is certain that it is not.

Claim 1.

The probability $\underline{p}_n = 0$ for every $n \in \mathbb{Z}_+$. Additionally:
(a) Fixing any number of attributes $m \in \mathbb{Z}_+$, the probability $\overline{p}_n \to 0$ as $n \to \infty$.
(b) Fixing any number of data observations $n \in \mathbb{Z}_+$, the probability $\overline{p}_n$ is increasing in the number of attributes $m$, and $\overline{p}_n \to 1$ as $m \to \infty$.

Footnote to Assumption 1: See Section 3.2 for the precise definition of common certainty in such an event.

For every $n$, the probability that entering is rationalizable for all beliefs satisfying Assumption 1 is zero. But the probability that entering is rationalizable for some belief satisfying Assumption 1 varies depending on $n$ and $m$. Part (a) says that as the number of observations $n$ grows large, the probability $\overline{p}_n$ vanishes to zero, implying that the confidence set converges to a degenerate interval at zero. The infinite-data limit thus returns the prediction of "no trade" consistent with the common prior assumption. But if the quantity of data is finite, and the number of attributes is large, then $\overline{p}_n$ can be substantially greater than zero. Indeed, Part (b) of the claim says that this probability $\overline{p}_n$ can be made arbitrarily close to 1 by increasing the dimensionality of the learning problem via choice of large $m$. This reflects that in a high-dimensional learning problem, many classification rules are likely to be consistent with the data, including some that yield conflicting predictions. Thus, "rational" disagreement is possible and even likely.

An exact characterization of $[\underline{p}_n, \overline{p}_n]$ is given in Lemma 3 in the appendix. Using this lemma, I plot in Figure 3 the behavior of these confidence sets for different values of $m$, assuming the true function is the indicator of a fixed hyper-rectangle in $[-1, 1]^m$. Although trade is not predicted in the limiting game, it is a plausible outcome if the number of data observations is small and the number of attributes is large. As one specific example: if there are 10 attributes and players observe only 20 goods, then the lower bound $\underline{p}_n$ is 0 and the upper bound $\overline{p}_n$ is close to 1.
That is, with near certainty, entering will be rationalizable for the Seller given the realized data for some belief satisfying Assumption 1. While I chose a simple set of learning rules for the purpose of obtaining exact expressions for the confidence set, in practice confidence sets like those depicted in Figure 3 can be simulated for more complex kinds of learning procedures.
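As a rough illustration of how such a simulation might look, the Python sketch below estimates how often two rectangular rules that are both consistent with the data disagree about the Seller's good at the origin. Treating this disagreement event as a stand-in for "entering is rationalizable for some belief" is an assumption made here for illustration; it is not the exact characterization from the appendix. The function names, the true rule $\mathbb{1}(x \in [-a, a]^m)$ with `half_width` $a = 0.1$, and the trial counts are likewise hypothetical.

```python
import random


def disagreement_possible(data):
    """Return True if two consistent rectangles disagree at the origin.

    data: list of (x, y) pairs; x is a list of floats in [-1, 1], y is 0 or 1.
    Assumes (for this sketch) that the true rule is a rectangle containing
    the origin, so a consistent rule predicting 1 at the origin always exists.
    """
    positives = [x for x, y in data if y == 1]
    if not positives:
        # No high-value goods observed: an empty rectangle (predicting 0 at
        # the origin) is also consistent with the data.
        return True
    # Every consistent rectangle contains the bounding box of the positives,
    # so a rule predicting 0 at the origin exists only if the origin lies
    # outside that bounding box in at least one coordinate.
    for k in range(len(positives[0])):
        coords = [x[k] for x in positives]
        if not (min(coords) <= 0.0 <= max(coords)):
            return True
    return False


def estimate_upper_bound(n, m, half_width=0.1, trials=2000, seed=0):
    """Monte Carlo estimate of the probability that disagreement is possible."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        data = []
        for _ in range(n):
            x = [rng.uniform(-1.0, 1.0) for _ in range(m)]
            # Hypothetical true rule: indicator of [-half_width, half_width]^m.
            y = int(all(abs(c) <= half_width for c in x))
            data.append((x, y))
        hits += disagreement_possible(data)
    return hits / trials


if __name__ == "__main__":
    for m in (1, 10):
        for n in (20, 100):
            print(f"m={m:3d} n={n:4d} estimate={estimate_upper_bound(n, m):.3f}")
```

Under these assumptions the estimate falls as $n$ grows with $m$ fixed and rises with $m$ for fixed $n$, which is the qualitative pattern stated in Claim 1.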

Figure 3: The shaded area depicts confidence sets $[\underline{p}_n, \overline{p}_n]$ for the rationalizability of entering given $n$ common observations. Panels: (a) $m = 1$, (b) $m = 10$, (c) $m = 100$. Horizontal axis: number of observations $n$.

2.2 Coordination

I now turn to a second classic game, where (in contrast to the previous example) the prediction of interest holds in the infinite-data limit. I characterize the robustness of that prediction when players have beliefs based on small quantities of data.

A contagious disease spreads across a population at an unknown speed. Two states are connected by travel, and their governors choose between implementing a strong or a weak lockdown policy in their states to slow the spread of the disease. Implementation of the strong lockdown policy entails a large economic cost, but if the states coordinate on this policy, then the disease will be suppressed with certainty.

The two governors form beliefs about the growth rate of the disease based on a public data set $\{(t, y_t)\}_{t=1}^{n}$, which consists of the number of reported cases of the disease, $y_t$, on days $t = 1, 2, \dots, n$. The number of reported cases grows exponentially according to

$\log y_t = \beta t + \varepsilon_t$

where the noise term $\varepsilon_t$ has a normal distribution with known parameters $\mu = 0$ and $\sigma > 0$, while the growth rate $\beta$ is not known. Payoffs are given by the following matrix:

                Strong                      Weak
    Strong      $-1,\ -1$                   $-1 - \beta,\ -\beta$
    Weak        $-\beta,\ -1 - \beta$       $-\beta,\ -\beta$

The economic cost of the strong lockdown is normalized to 1, and the cost of letting the disease progress without a strong lockdown is given by its growth rate $\beta$. A weak lockdown is strictly dominant if $\beta < 1$, but coordination on the strong lockdown is the Pareto-dominant Nash equilibrium if $\beta > 1$. Assuming a common prior over $\beta$ restricts agents to identical posterior beliefs. This rules out, for example, the possibility that a governor chooses the weak policy, not because he infers from the data that the disease is low-risk, but because he believes that the other governor may make such an inference.

In practice, agents use statistical procedures to infer $\beta$ from the data. I thus relax the assumption of a common prior in the following way. Define $\hat{\beta}(z_n)$ to be the ordinary least-squares estimate of $\beta$ from the data $z_n := \{(t, \log y_t)\}_{t=1}^{n}$, and let $\phi_n$ be the constant such that

$C(z_n) = \{\beta : |\beta - \hat{\beta}(z_n)| \leq \phi_n\}$

is a $(1 - \alpha)$-confidence interval for $\beta$, where $\alpha \in (0, 1)$ is fixed. Suppose the governors have common certainty in the event that both of their first-order beliefs have support on $C(z_n)$; that is, each governor's own belief about the value of $\beta$ has support in this confidence interval; they assign probability 1 to the other agent having such a belief; and so forth. This allows players to disagree given the data, but requires the size of that disagreement to be confined within the confidence interval. (The assumption that players assign common certainty is not crucial. Similar results go through if the players have common $p$-belief (Monderer and Samet, 1989) in the set for large $p$, so long as we bound the parameter space for $\beta$. See Section 6.)

Claim 2.

Suppose that the actual growth rate is fast ($\beta > 1$), so that the strong lockdown is rationalizable given complete information of the payoffs. Then, for every $\sigma > 0$, both $\underline{p}_n$ and $\overline{p}_n$ are increasing in $n$, while for every $n$, both $\underline{p}_n$ and $\overline{p}_n$ are decreasing in $\sigma$.

That is, the analyst gains confidence in predicting that the strong lockdown is rationalizable as the reporting noise $\sigma$ decreases, and the number of observations $n$ increases. Lemma 3 in the appendix explicitly characterizes $\underline{p}_n$ and $\overline{p}_n$. Using the expressions in this lemma, I plot in Figure 4 the behavior of these confidence sets for different levels of reporting noise $\sigma$.
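The comparative statics in Claim 2 can be checked numerically in a simplified version of this example. The sketch below is not the appendix lemma; it rests on several assumptions made here: $\sigma$ is the standard deviation of $\varepsilon_t$, the noise has mean zero, $\hat{\beta}$ is least squares through the origin (so $\hat{\beta} \sim N(\beta, \sigma^2 / \sum_t t^2)$), and $\phi_n$ is the usual normal critical value times the standard error. It reads "rationalizable for all beliefs" as the event that $C(z_n)$ lies entirely above 1 and "rationalizable for some belief" as the event that $C(z_n)$ reaches above 1. The function name and the values $\beta = 1.5$, $\alpha = 0.05$ are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_bounds(beta, sigma, n, alpha=0.05):
    """Return (lower, upper) probabilities under the assumptions stated above.

    lower: P(beta_hat - phi_n > 1), i.e. the whole interval C(z_n) exceeds 1.
    upper: P(beta_hat + phi_n > 1), i.e. some point of C(z_n) exceeds 1.
    """
    std_normal = NormalDist()
    # Sum of t^2 for t = 1..n; OLS through the origin has variance sigma^2 / sum.
    sum_t_squared = n * (n + 1) * (2 * n + 1) / 6
    standard_error = sigma / sqrt(sum_t_squared)
    critical = std_normal.inv_cdf(1 - alpha / 2)  # phi_n = critical * standard_error
    # Distance from the true beta to the threshold 1, in standard-error units.
    threshold = (1.0 - beta) / standard_error
    lower = 1.0 - std_normal.cdf(threshold + critical)
    upper = 1.0 - std_normal.cdf(threshold - critical)
    return lower, upper


if __name__ == "__main__":
    for sigma in (10, 100, 1000):
        for n in (20, 200, 1000):
            lo, hi = confidence_bounds(beta=1.5, sigma=sigma, n=n)
            print(f"sigma={sigma:5d} n={n:5d} lower={lo:.3f} upper={hi:.3f}")
```

With $\beta > 1$, both returned probabilities rise with $n$ and fall with $\sigma$ in this sketch; with $\beta < 1$ the directions reverse.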

Figure 4: The shaded area depicts confidence sets $[\underline{p}_n, \overline{p}_n]$ for the rationalizability of the strong lockdown given $n$ common observations, and allowing the reporting noise $\sigma$ to vary. Panels: (a) $\sigma = 10$, (b) $\sigma = 100$, (c) $\sigma = 1000$. Horizontal axis: number of observations $n$. In all panels, $\beta$ and $\alpha$ are held fixed.

As $n$ grows, both $\underline{p}_n$ and $\overline{p}_n$ increase as well. If the number of observations is large relative to the reporting noise, then the analyst should have high confidence in the prediction: for $\sigma = 10$ and sufficiently many observations, the confidence set is nearly degenerate at certainty. On the other hand, if the reporting noise is large and the number of observations is relatively small, then the strong lockdown is likely to be rationalizable for some permitted types, but not for all of them. In such cases the confidence set is a wide interval, suggesting substantial ambiguity regarding whether the strong lockdown is a good prediction of play.

Footnote to Claim 2: If instead $\beta < 1$, then the reverse statements hold; that is, the probabilities $\underline{p}_n$ and $\overline{p}_n$ are decreasing in $n$ and increasing in $\sigma$.

Subsequently, I generalize the approach described in these two examples.

Basic Game.

There is a finite set $I$ of players and a finite set of actions $A_i$ for each player $i$. The set of action profiles is $A = \prod_{i \in I} A_i$, and the set of possible games is identified with $U := \mathbb{R}^{|I| \times |A|}$. Agents have beliefs over a set of payoff-relevant parameters $\Theta$, which is a compact and convex subset of finite-dimensional Euclidean space. It is possible to take $\Theta$ to be a subset of $U$, so that each $\theta$ is itself a game, or to define beliefs over a lower-dimensional set of payoff-relevant parameters as in Section 2. In either case, the parameters in $\Theta$ are assumed to be related to payoffs by a bounded and Lipschitz continuous embedding $g : \Theta \to U$ (assuming the sup-norm on both spaces).

Beliefs.

For each player $i$, let $X_i^0 = \Theta$, $X_i^1 = X_i^0 \times \prod_{j \neq i} \Delta(X_j^0)$, ..., $X_i^n = X_i^{n-1} \times \prod_{j \neq i} \Delta(X_j^{n-1})$, etc., so that each $X_i^k$ is the set of possible $k$-th order beliefs for player $i$. Define $T_i^0 = \prod_{n=0}^{\infty} \Delta(X_i^n)$. An element $(t_i^1, t_i^2, \dots) \in T_i^0$ is a hierarchy of beliefs over $\Theta$ (describing the player's uncertainty over $\Theta$, his uncertainty over his opponents' uncertainty over $\Theta$, and so forth), and is referred to simply as a belief or type. There is a subset of types $T_i^*$ (that satisfy the property of coherency and common knowledge of coherency) and a function $\kappa_i^* : T_i^* \to \Delta(\Theta \times T_{-i}^*)$ such that $\kappa_i^*(t_i)$ preserves the beliefs in $t_i$; that is, $\mathrm{marg}_{X_i^{n-1}}\, \kappa_i^*(t_i) = t_i^n$ for every $n$ (Mertens and Zamir, 1985; Brandenburger and Dekel). The tuple $(T_i^*, \kappa_i^*)_{i \in I}$ is the universal type space. Subsequently I will develop smaller type spaces $(T_i, \kappa_i)_{i \in I}$ where each $T_i \subseteq T_i^*$ and $\kappa_i : T_i \to \kappa_i^*(T_i)$ is the restriction of $\kappa_i^*$ to $T_i$.

Footnote on the embedding: The map $g$ can be interpreted as capturing the known information about the structure of payoffs.

Footnote on types: Types are sometimes modeled as encompassing all uncertainty in the game. In the present paper, types describe players' structural uncertainty over payoffs, but not their strategic uncertainty over opponent actions.

Footnote on coherency: $\mathrm{marg}_{X_i^{n-2}}\, t_i^n = t_i^{n-1}$, so that $(t_i^1, t_i^2, \dots)$ is a consistent stochastic process.

The proposed approach endogenizes the type space based on two new primitives: a data-generating process, and a set of rules for how to extrapolate beliefs from realized data. Formally, let $(Z_t)_{t \in \mathbb{Z}_+}$ be a stochastic process where the random variables $Z_t$ take value in a common set $Z$, and the typical sample path is denoted $z = (z_1, z_2, \dots)$. The data-generating process is a measure $P$ over the set $\mathcal{Z}$ of all (infinite) sample paths. Let $P^n$ denote the induced measure on the first $n$ variables.
A data set $z_n$ of size $n$ is the restriction of $z$ to its first $n$ coordinates, and $\mathcal{Z}^n$ is the set of all length-$n$ data sets. I use $Z^n = (Z_1, \dots, Z_n)$ to denote the random initial sequence of length $n$.

A learning rule is defined to be any map from data sets into first-order beliefs:

$\mu : \bigcup_{n=1}^{\infty} \mathcal{Z}^n \to \Delta(\Theta).$

The special case of a Bayesian learning rule is identified with a distribution on $\Theta$ along with a family of distributions $(P_\theta)_{\theta \in \Theta}$, where each $P_\theta \in \Delta(\mathcal{Z})$ is the data-generating distribution given parameter $\theta$. The realized data $z_n$ determines a posterior distribution on $\Theta \times \mathcal{Z}$, and the learning rule maps this data into the marginal posterior distribution on $\Theta$. Other learning rules may map the data $z_n$ to a degenerate belief at a sample statistic (such as the empirical average) or to a distribution over various point-estimates for $\theta$.

Players have common knowledge of a set $M$ of learning rules. The set of plausible beliefs given this set $M$ and realized data set $z_n$ is defined as

$B(z_n) = \{\mu(z_n) : \mu \in M\} \subseteq \Delta(\Theta). \quad (1)$

Footnote: The notation $T_{-i}^*$ is used throughout the paper to denote the set of profiles of opponent types, $\prod_{j \neq i} T_j^*$.

Footnote: Throughout, symbols such as $Z_t$ and $Z^n$ denote random variables, whereas lowercase symbols such as $z_n$ are particular, constant values.

Footnote: In general, making the mapping deterministic may require choosing a conditional probability if there are multiple ones consistent with Bayes' rule; here and elsewhere in the paper, I implicitly assume that the updating rule specifies such a choice when Bayesian rules are mentioned.

Footnote: Note that the set $B(z_n)$ is common across players.

I consider sets $M$ that satisfy the following condition:

Assumption 2 (Common Limiting Belief). There is a limiting belief $\mu^\infty$ such that

$\lim_{n \to \infty} d_P(\mu(Z^n), \mu^\infty) = 0 \quad P\text{-a.s.} \quad \forall \mu \in M,$

where $d_P$ is the Prokhorov metric on $\Theta$.

This assumption requires that all learning rules in $M$ return the same limiting belief $\mu^\infty$ as the quantity of data $n$ grows large.
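To make the set of plausible beliefs and the common-limit requirement concrete, here is a minimal sketch under assumptions made only for illustration: signals are real numbers drawn i.i.d. from a standard normal distribution, and the set of learning rules contains just two point-estimate rules, the sample mean and the sample median (both of which converge to 0 for this distribution). The names `LEARNING_RULES`, `plausible_beliefs`, `disagreement`, and `simulate_path` are hypothetical.

```python
import random
from statistics import mean, median

# Hypothetical set M: each rule maps a data set to a point-mass belief,
# represented here by the location of the point mass.
LEARNING_RULES = {"mean": mean, "median": median}


def plausible_beliefs(data):
    """Return {mu(z_n) : mu in M} as a dict from rule name to point estimate."""
    return {name: rule(data) for name, rule in LEARNING_RULES.items()}


def disagreement(data):
    """Largest gap between two plausible point beliefs for this data set."""
    values = list(plausible_beliefs(data).values())
    return max(values) - min(values)


def simulate_path(n, seed=0):
    """Draw the first n signals of a sample path (i.i.d. standard normal)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    path = simulate_path(10_000)
    for n in (10, 100, 10_000):
        print(n, plausible_beliefs(path[:n]), round(disagreement(path[:n]), 4))
```

For small $n$ the two rules typically return different beliefs from the same data, while the gap shrinks as $n$ grows, which is the kind of environment the common-limit assumption describes.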
It is not critical that all differences in beliefs are removed in the limit (see Section 6). But maintaining Assumption 2 in the main text allows us to explore more precisely the scope for disagreement in an environment in which learning is feasible, but not immediate. While learning has not ceased, the available information may permit multiple different interpretations, producing differences in beliefs.

I use the sets of plausible beliefs $B(z_n)$ to impose a restriction on hierarchies of beliefs, as in Battigalli and Siniscalchi (2003). Specifically, players are assumed to have common certainty in the event that all players have first-order beliefs in $B(z_n)$; that is, they have a first-order belief in this set, believe with probability 1 that all other players have a first-order belief in this set, and so forth (Monderer and Samet, 1989). Formally, for any set $B \subseteq \Delta(\Theta)$, and for any player $i$, define

$B^{1,i}(B) := \{t_i \in T_i^* : \mathrm{marg}_\Theta\, \kappa_i^*(t_i) \in B\}$

to be the set of player $i$ types whose marginal beliefs over $\Theta$ belong to the set $B$. For each $k > 1$, and again for each player $i$, recursively define

$B^{k,i}(B) = \left\{ t_i \in T_i^* : \kappa_i^*(t_i)\left( \Theta \times \prod_{j \neq i} B^{k-1,j}(B) \right) = 1 \right\}.$

Then $T_i^B = \bigcap_{k \geq 1} B^{k,i}(B)$ is the set of player $i$ types that have common certainty in the event that all players' first-order beliefs belong to $B$.

Footnote on the Prokhorov metric: For any $\nu, \nu' \in \Delta(\Theta)$, the Prokhorov distance between these measures is $d_P(\nu, \nu') = \inf\{\epsilon > 0 : \nu(A) \leq \nu'(A^\epsilon) + \epsilon \text{ for all Borel-measurable } A \subseteq \Theta\}$, where $A^\epsilon$ denotes the $\epsilon$-neighborhood of $A$ in the sup-norm.

Footnote on the limiting belief: The limiting belief in Assumption 2 can be interpreted as a common prior, following what Morris (1995) calls the "frequentist justification" for assumption of a common prior.

Footnote on common certainty: As I discuss in Section 6, it is not critical that players have common certainty in this event, and the restriction can be relaxed to common $p$-belief for large $p$.

Footnote on the recursion: For example, $B^{2,i}(B)$ is the set of player $i$ types that assign probability 1 to all other players assigning probability 1 to $B$.

Definition 1.

For every $z_n$, the induced type space is $\left( T_i^{B(z_n)}, \kappa_i^{B(z_n)} \right)_{i \in I}$, where $\kappa_i^{B(z_n)} : T_i^{B(z_n)} \to \kappa_i^*\left( T_i^{B(z_n)} \right)$ is the restriction of $\kappa_i^*$ to $T_i^{B(z_n)}$. The type $t_i$ is permitted for player $i$ if $t_i \in T_i^{B(z_n)}$.

This type space includes all type profiles where each player $i$ has common certainty in the event that all players have first-order beliefs in $B(z_n)$. Note that the type space permits common knowledge disagreement; that is, player $i$ can believe with probability 1 that (all believe with probability 1 that...) players hold different first-order beliefs. Such types are precluded under the common prior assumption not only in the present setting of common data, but also if we were to allow for private and different information (Aumann, 1976). The induced type spaces in Definition 1 thus correspond to a relaxation of the common prior assumption, where the permitted extent of disagreement is governed by the set of learning rules $M$. In the special case in which $M$ consists of a singleton Bayesian rule, we return the common prior assumption.

In this approach, restrictions are placed only on the beliefs that players hold on the exogenous parameter space $\Theta$, and not on how they came to form those beliefs. For example, each of the following is consistent with the restriction in Definition 1:

• (Common Knowledge) Each player $i$ is associated with a player-specific learning rule $\mu_i \in M$. It is common knowledge that given any data set $z_n$, each player $i$'s first-order belief is $\mu_i(z_n)$.
• (Randomization) Each player $i$ randomizes over learning rules in $M$ using a player-specific distribution $Q_i \in \Delta(M)$, and applies the randomly drawn learning rule to the realized data to form a first-order belief. The distributions $(Q_i)_{i \in I}$ are common knowledge, although the realized random rule is privately known.
• (Misspecification) Player $j$ believes with probability 1 that player $i$'s first-order belief is $\mu(z_n)$ for every data set $z_n$, whereas in fact player $i$'s first-order belief is $\mu'(z_n)$ for a different learning rule $\mu' \in M$.

Footnote to Definition 1: It is straightforward to show that the type sets $T_i^{B(z_n)}$ are belief-closed; that is, $\kappa_i^{B(z_n)}(t_i)\left( \Theta \times T_{-i}^{B(z_n)} \right) = 1$ for every $t_i \in T_i^{B(z_n)}$.

It is possible sometimes to take $B(z_n)$ as the primitive without explicitly defining $M$ (as in the coordination example in Section 2.2), so that each measure $P^n$ directly defines a measure over sets of first-order beliefs. In other cases, it is more natural to define $M$. Some examples for this set include:

Bayesian Updating with Different Priors. Each learning rule $\mu_\pi \in M$ is identified with a prior distribution $\pi \in \Delta(\Theta \times Z)$. For any realized data $z_n$, the belief $\mu_\pi(z_n)$ is the marginal over $\Theta$ of the posterior belief associated with the corresponding prior $\pi$.

Sample Statistics. The set $M$ consists of learning rules that map the data to different point-estimates for the payoff-relevant parameter. For example, $M$ might consist of the two learning rules $\mu_{\text{mean}}$ and $\mu_{\text{median}}$, where for any data set $z_n$, $\mu_{\text{mean}}(z_n)$ is a point-mass belief on the mean realization in $z_n$ (as in Jehiel (2018)), and $\mu_{\text{median}}(z_n)$ is a point-mass belief on the median realization.

Linear Regression. Suppose that $X \subseteq \mathbb{R}^p$, $p < \infty$, is a set of attributes that determine the value of a parameter in $\Theta$ (e.g. physical covariates of a patient seeking health insurance, and medical outcomes for those patients). The observations in $z_n$ are pairs $(x, \theta)$, and the payoff-relevant unknown is the parameter associated with some new $x^*$ (e.g. the outcome for a new patient with characteristics $x^*$).

Each learning rule $\mu \in M$ corresponds to a different regression model based on a subset of attributes $I_\mu \subseteq \{1, \dots, p\}$ (as for example in Olea et al. (2017)). Write $x_\mu = (x_i)_{i \in I_\mu}$ for the coordinates of $x$ at those indices. Then, $\hat{f}^{OLS}_\mu[z_n](x) = \beta^{OLS}_\mu \cdot x_\mu$ is the linear function of the attributes in $I_\mu$ that best fits the observed data $z_n = \{(x^k, \theta_k)\}_{k=1}^n$; that is,
$$\beta^{OLS}_\mu = \operatorname*{argmin}_{\beta \in \mathbb{R}^{|I_\mu|}} \frac{1}{n} \sum_{k=1}^{n} \big(\beta \cdot x^k_\mu - \theta_k\big)^2.$$
For every $z_n$, the learning rule $\mu$ maps $z_n$ into a point-mass belief on the corresponding prediction $\hat{f}^{OLS}_\mu[z_n](x^*)$.

Case-Based Learning with Different Similarity Functions. As in the previous example, suppose that the observations are pairs $(x, \theta) \in X \times \Theta$, and the payoff-relevant parameter is the outcome at some new $x^*$. Each "case-based" learning rule $\mu \in M$ is identified with a real number $\lambda \in \mathbb{R}_+$ (to be interpreted momentarily) and maps the historical data $z_n = \{(x^k, \theta_k)\}_{k=1}^n$ into a weighted average of the observed parameter values $\theta_k$ (Gilboa and Schmeidler, 1995).

The observed parameters at $x$-values "more similar" to $x^*$ are weighted more heavily. Formally, let $g : X \times X \to \mathbb{R}_+$ be a similarity function on attributes, where $g(x, x')$ describes the distance between attribute vectors $x$ and $x'$. The learning rule with parameter $\lambda$ maps $z_n$ to a point mass on the weighted average
$$\sum_{k=1}^{n} \theta_k \cdot \frac{e^{-\lambda g(x^k, x^*)}}{\sum_{k'=1}^{n} e^{-\lambda g(x^{k'}, x^*)}}.$$
The parameter $\lambda$ controls the degree to which similar observations are weighted more heavily than dissimilar observations. For example, $\lambda = 0$ weights all observations equally, while $\lambda \to \infty$ returns the observed state at the most similar attribute vector.

Footnote: The definition of $B(z_n)$ must, however, respect Assumption 2. That is, there must exist some set of learning rules $M$ that could be defined, which respects Assumption 2 and gives rise to these sets $B(z_n)$. A sufficient condition is for there to exist a $P$-measure 1 set of sequences $Z^* \subseteq Z$ such that for every $z \in Z^*$, and every sequence $(\nu_n)_{n=1}^{\infty}$ satisfying $\nu_n \in B(z_n)$ for each $n$, it holds that $\lim_{n \to \infty} d_P(\nu_n, \mu^\infty) = 0$.
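The learning rules listed above can be sketched in a few lines of code. The following is a minimal illustration, not the paper's implementation: the function names are hypothetical, the regression rule is restricted to a single attribute with no intercept, and the similarity function $g$ is assumed to be Euclidean distance. Each rule maps the same data set to a point estimate, and the collection of these estimates plays the role of the plausible set $B(z_n)$.

```python
import math
import statistics

# Each observation is a pair (x, theta): an attribute tuple and an observed parameter value.

def mean_rule(data):
    """Point-mass belief on the sample mean of the observed parameter values."""
    return statistics.fmean(theta for _, theta in data)

def median_rule(data):
    """Point-mass belief on the sample median of the observed parameter values."""
    return statistics.median(theta for _, theta in data)

def ols_rule(data, x_star, j):
    """OLS through the origin on the single attribute j (assumed subset I_mu = {j})."""
    sxx = sum(x[j] ** 2 for x, _ in data)
    sxt = sum(x[j] * theta for x, theta in data)
    return (sxt / sxx) * x_star[j]

def case_based_rule(data, x_star, lam):
    """Similarity-weighted average with weights exp(-lam * g(x_k, x_star))."""
    dists = [math.dist(x, x_star) for x, _ in data]  # g assumed to be Euclidean distance
    d_min = min(dists)  # common rescaling of all weights; cancels in the ratio
    weights = [math.exp(-lam * (d - d_min)) for d in dists]
    return sum(w * theta for w, (_, theta) in zip(weights, data)) / sum(weights)

def plausible_beliefs(data, x_star, lams=(0.0, 1.0, 10.0)):
    """Point estimates produced by a small, hypothetical set M of learning rules."""
    beliefs = {"mean": mean_rule(data), "median": median_rule(data)}
    for j in range(len(x_star)):
        beliefs[f"ols[{j}]"] = ols_rule(data, x_star, j)
    for lam in lams:
        beliefs[f"case[{lam}]"] = case_based_rule(data, x_star, lam)
    return beliefs

data = [((1.0, 0.0), 1.0), ((0.0, 1.0), 2.0), ((1.0, 1.0), 3.0)]
x_star = (1.0, 0.9)
for name, value in plausible_beliefs(data, x_star).items():
    print(f"{name}: {value:.3f}")
```

Even on three observations the rules disagree, which is the sense in which players who see the same data may hold different first-order beliefs.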

I now use the proposed framework to construct a quantitative metric for the analyst's confidence in a strategic prediction, focusing on the prediction that an action is interim-correlated rationalizable (Dekel et al., 2007; Weinstein and Yildiz, 2017). The use of this particular solution concept is not critical to the approach (for example, we could study the analyst's confidence in prediction of Bayesian Nash equilibria; see Section 6), but interim-correlated rationalizability is well-suited to the present setting, where agents may have common knowledge disagreement. Its definition is reviewed here:

For every player $i$ and type $t_i \in T^*_i$, set $S^0_i[t_i] = A_i$, and define $S^k_i[t_i]$ for $k \geq 1$ such that $a_i \in S^k_i[t_i]$ if and only if $a_i$ is a best reply to some $\pi \in \Delta(\Theta \times T^*_{-i} \times A_{-i})$ satisfying (1) $\operatorname{marg}_{\Theta \times T^*_{-i}} \pi = \kappa^*_i(t_i)$ and (2) $\operatorname{marg}_{A_{-i} \times T_{-i}} \pi\big(\{(a_{-i}, t_{-i}) \mid a_{-i} \in S^{k-1}_{-i}[t_{-i}]\}\big) = 1$, where $S^{k-1}_{-i}[t_{-i}] = \prod_{j \neq i} S^{k-1}_j[t_j]$. We can interpret $\pi$ to be an extension of type $t_i$'s belief $\kappa^*_i(t_i)$ onto the space $\Delta(\Theta \times T_{-i} \times A_{-i})$, with support in the set of actions that survive $k - 1$ rounds of elimination for types in $T_{-i}$. For every $i$, the actions in $S^\infty_i[t_i] = \bigcap_{k=0}^{\infty} S^k_i[t_i]$ are interim correlated rationalizable for type $t_i$, or (henceforth) simply rationalizable. Say that $a_i$ is strictly rationalizable for type $t_i$ if the best reply conditions above are strengthened to strict best replies.

For any set of beliefs $B \subseteq \Delta(\Theta)$, say that action $a_i$ is strongly $B$-rationalizable if it is rationalizable for player $i$ with any type $t_i \in T^B_i$, and it is weakly $B$-rationalizable if it is rationalizable for player $i$ with some type $t_i \in T^B_i$. Strong and weak $B$-rationalizability represent (respectively) maximally stringent and maximally lenient approaches for determining whether $a_i$ constitutes a "reasonable" prediction in the interim type space $\big(T^B_i, \kappa^B_i\big)_{i \in I}$.

Footnote: Equilibrium notions are known to lead to potentially counterintuitive predictions when players have common knowledge disagreement. For example, consider a matching pennies game where player 1 receives $\theta$ if players match and $-\theta$ otherwise, and player 2 receives $-\theta$ if the players match and $\theta$ otherwise. Let $\theta \in \{-1, 1\}$. Then if player 1 assigns probability 1 to $\theta = 1$ and player 2 assigns probability 1 to $\theta = -1$, it is (somewhat counterintuitively) a Bayesian Nash equilibrium for both players to choose match. See Dekel et al. (2004) for an extended discussion.

The main concept of a confidence set is now defined.

Deﬁnition 2.

For every $n \in \mathbb{Z}_+$, define $\underline{p}_n(a_i)$ to be the probability (over possible datasets $z_n$) that action $a_i$ is rationalizable for every type in $T^{B(z_n)}_i$; that is,
$$\underline{p}_n(a_i) = P^n\big(\{z_n : a_i \text{ is strongly } B(z_n)\text{-rationalizable}\}\big). \tag{2}$$
Define $\overline{p}_n(a_i)$ to be the probability (over possible datasets $z_n$) that action $a_i$ is rationalizable for some type $t_i \in T^{B(z_n)}_i$; that is,
$$\overline{p}_n(a_i) = P^n\big(\{z_n : a_i \text{ is weakly } B(z_n)\text{-rationalizable}\}\big). \tag{3}$$
The confidence set for rationalizability of $a_i$ given $n$ observations is $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$.

The larger $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ are, the more confident an analyst should be in predicting that $a_i$ is rationalizable. At the extremes: if $\underline{p}_n(a_i) = \overline{p}_n(a_i) = 1$, then given observation of $n$ random samples, action $a_i$ is guaranteed to be rationalizable for player $i$ (for all permitted types). If $\underline{p}_n(a_i) = \overline{p}_n(a_i) = 0$, then action $a_i$ is guaranteed to not be rationalizable for player $i$ (for any permitted types). In the intermediate cases, if $0 < \underline{p}_n(a_i) = \overline{p}_n(a_i) < 1$, then whether $a_i$ is rationalizable depends on the specific realization of the data, and if $\underline{p}_n(a_i) < \overline{p}_n(a_i)$, then the prediction requires assumptions on the details of the agent's belief beyond Assumption 1.

Observation 1. For every player $i$ and action $a_i \in A_i$:

(a) $\underline{p}_n(a_i) \leq \overline{p}_n(a_i)$ for every $n \in \mathbb{Z}_+$.

(c) If $M$ consists of a single learning rule, then $\underline{p}_n(a_i) = \overline{p}_n(a_i)$ for every $n \in \mathbb{Z}_+$.

In the special case in which agents have a common prior, the definitions of $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ have the following familiar interpretation:

Remark 1. (Common Prior.) Suppose that players share a common prior over $\Theta \times Z$ and for simplicity let $Z$ be finite. Write $\mu$ for the learning rule that maps $z_n$ into the induced posterior belief over $\Theta$ under the common prior. Then, each realization $z_n$ determines an interim game, where players all have common certainty in the posterior belief. Moreover, the common prior determines a distribution over $z_n$, and hence over possible interim games. For any player $i$ and action $a_i$, the probabilities $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ coincide, and are equal to the measure of size-$n$ datasets $z_n$ (under the common prior) with the property that action $a_i$ is rationalizable for player $i$ in the corresponding interim game.

Footnote: Whether an action $a_i$ is interim-correlated rationalizable for some type $t_i$ does not depend on the description of the underlying type space (Dekel et al., 2007). Hence we only need to define rationalizability for the universal type space, even though the type spaces that we will work with are the smaller type spaces $\big(T^{B(z_n)}_i, \kappa^{B(z_n)}_i\big)$, and they vary depending on the data $z_n$.

Footnote: Given the restriction in Definition 1, the weakly $B$-rationalizable strategies are the $\Delta$-rationalizable strategies of Battigalli and Siniscalchi (2003), where $\Delta = (\Delta_i)_{i \in I}$ and each $\Delta_i = \{\nu \in \Delta(\Theta \times T_{-i} \times A_{-i}) \mid \operatorname{marg}_\Theta \nu \in B\}$ encodes the belief restriction that first-order beliefs belong to $B$. The concept of strong $B$-rationalizability can be interpreted as a "robust" version of $\Delta$-rationalizability.

Footnote: I do not comment here on what further assumptions may be imposed, interpreting this case simply as one in which the prediction is tenuous.
In the above approach, the common prior serves multiple roles: it determines the true distribution over the data that agents might see, and also determines how agents update from that data. When we separate these roles, we can still use an objective data-generating process to define a measure over interim games, as I do here. In this way, the probabilities $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ are a natural generalization of a standard measure of the typicality of a strategic prediction, in the absence of a common prior.

Footnote: A small difference in the formulations is that $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ are defined using the "true" probability measure $P \in \Delta(Z)$ in the present approach, instead of a measure $Q \in \Delta(\Theta \times Z)$.

Footnote: This approach is used for example in Kajii and Morris (1997) (if we re-interpret the histories $z_n$ as the states), where an incomplete information game is "close" to a complete information game if the payoffs of the complete information game occur with high probability under the prior.

The subsequent sections study how confidence sets depend on the underlying learning environment and the game in question. I first consider the limiting behavior of the probabilities $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ as the quantity of data $n$ gets large. Recall that by Assumption 2, the beliefs induced by the different learning rules converge to a limiting belief $\mu^\infty$. Thus, the $n = \infty$ limit corresponds to an incomplete information game in which players have common certainty in the event that every player has first-order belief $\mu^\infty$. Whether the probabilities $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ are continuous at $n = \infty$ tells us how sensitive rationalizability of $a_i$ is to an assumption that agents have coordinated their beliefs using infinite data. When these probabilities are discontinuous at $n = \infty$, the infinite-data prediction is fragile: the analyst would make a different prediction for arbitrarily large but finite quantities of data.

Formally, let $t_i^\infty$ be the player $i$ type with common certainty in the event that each player's first-order belief is $\mu^\infty$. Then define $\underline{p}_\infty(a_i) = \overline{p}_\infty(a_i) = 1$ if $a_i$ is rationalizable for type $t_i^\infty$, and define $\underline{p}_\infty(a_i) = \overline{p}_\infty(a_i) = 0$ otherwise.

Definition 3.

The confidence set for action $a_i$ is asymptotically continuous if
$$\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = [\underline{p}_\infty(a_i), \overline{p}_\infty(a_i)].$$

Whether confidence sets are asymptotically continuous depends crucially on whether the beliefs induced by the different learning rules converge uniformly to $\mu^\infty$.

Assumption 3 (Uniform Convergence). $\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty) = 0$ $P$-a.s., where $d_P$ is the Prokhorov metric on $\Delta(\Theta)$.

Assumption 2 already implies that for each learning rule $\mu \in M$, the (random) induced belief $\mu(Z_n)$ almost surely converges to $\mu^\infty$ as the quantity of data $n$ grows large. Assumption 3 strengthens this by requiring additionally that the speed of convergence does not vary too much across the different learning rules in $M$. Specifically, the sequence of beliefs $\{\mu(Z_n)\}$ must converge to $\mu^\infty$ (as $n \to \infty$) uniformly across $\mu \in M$.

A sufficient condition for Assumption 3 to hold is that the set of learning rules $M$ is finite. But failures of Assumption 3 occur for classes of learning rules that we may consider plausible. In particular, Assumption 3 fails if the class $M$ is too rich, as in the following example:

Example 1 (Rich Sets of Priors and Likelihoods). An unknown parameter $v$ takes values in $\{0, 1\}$. Players commonly observe a sequence of realizations from the set $Z = \{0, 1\}$. Learning rules $\mu_{\pi,q} \in M$ are indexed by parameters $\pi \in (0, 1)$ and $q \in (1/2, 1)$, where the parameter $\pi$ is the prior probability of value 1, and $q$ identifies the following signal structure:
$$\begin{array}{c|cc} & z = 0 & z = 1 \\ \hline v = 0 & q & 1 - q \\ v = 1 & 1 - q & q \end{array}$$
Each rule $\mu_{\pi,q}$ is identified with prior $\pi$ and signal structure $q$, and maps the observed signal outcomes into the posterior belief over $\{0, 1\}$. Assume that the true data-generating process belongs to this class; that is, there exists some $q^* \in (1/2, 1)$ such that the distribution over the signal set $\{0, 1\}$ is $(q^*, 1 - q^*)$ when $v = 0$, and the distribution is $(1 - q^*, q^*)$ when $v = 1$.
In this example, all learning rules lead to the same belief in the limit (that is, there is asymptotic agreement in the sense of Acemoglu et al. (2015)). But because the rate of this convergence cannot be uniformly bounded across the different learning rules, it is possible for the confidence set to be discontinuous at $n = \infty$.

Claim 3. Consider the trading game described in Section 2, and the data-generating process and set of learning rules from Example 1. Then, $\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = [0, 1]$, while $[\underline{p}_\infty(a_i), \overline{p}_\infty(a_i)] = \{0\}$, so the prediction that entering is not rationalizable for the Seller is not asymptotically continuous.

The claim tells us that although trade will not occur in the limiting game, this prediction is sensitive to the assumption that agents have indeed coordinated their priors using infinite data. Even if the amount of data that players commonly observed were to be arbitrarily large, the analyst should nevertheless consider trade to be a plausible outcome.
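The failure of uniform convergence behind this claim can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: the true state is taken to be $v = 1$, each signal is assumed to match the state with probability $q$, the rule's signal structure is held fixed at the true value, and a rule is parametrized by its prior log-odds $\log(\pi / (1 - \pi))$ rather than by $\pi$ directly. The function names and the compact prior range $[1/4, 3/4]$ are hypothetical. For any $n$, a sufficiently skeptical prior in $(0, 1)$ keeps the posterior far from the truth, whereas over a compact range of priors the worst-case gap vanishes.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def posterior_v1(prior_log_odds, q, ones, n):
    """Posterior probability of v = 1 after n signals, `ones` of which equal 1."""
    llr = (2 * ones - n) * math.log(q / (1 - q))  # log-likelihood ratio of the data
    return sigmoid(prior_log_odds + llr)

def gap_rich(q, ones, n, margin=5.0):
    """Gap 1 - posterior under an adversarially chosen prior from the rich class (0, 1)."""
    llr = (2 * ones - n) * math.log(q / (1 - q))
    return 1.0 - posterior_v1(-(llr + margin), q, ones, n)

def gap_compact(q, ones, n, pi_min=0.25):
    """Largest gap when priors are restricted to [pi_min, 1 - pi_min] (assumed bounds)."""
    return 1.0 - posterior_v1(math.log(pi_min / (1 - pi_min)), q, ones, n)

q_star = 0.7
for n in (10, 100, 1000):
    ones = round(q_star * n)  # a typical data set when v = 1
    print(n, round(gap_rich(q_star, ones, n), 4), gap_compact(q_star, ones, n))
```

The first gap stays near 1 however large $n$ is, while the second shrinks rapidly, which mirrors the contrast between Example 1 and a class of rules satisfying Assumption 3.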

In contrast, when the assumption of uniform convergence is satisﬁed, then the limitingconﬁdence sets can be tightly linked to predictions in the limiting game.

Theorem 1.

Suppose Assumption 3 is satisfied.

(a) If $a_i$ is strictly rationalizable for player $i$ of type $t_i^\infty$, then $\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = \{1\}$.

(b) If $a_i$ is not rationalizable for player $i$ of type $t_i^\infty$, then $\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = \{0\}$.

This theorem says that if an action $a_i$ is strictly rationalizable for player $i$ given infinite data, then $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ both converge to 1 as $n$ grows large. Thus, when agents observe sufficiently large quantities of public data, the analyst should be arbitrarily confident in predicting that $a_i$ is rationalizable. On the other hand, if action $a_i$ is not rationalizable given infinite data, then $\underline{p}_n(a_i)$ and $\overline{p}_n(a_i)$ both converge to 0, so the analyst should be arbitrarily confident in predicting that $a_i$ is not rationalizable for large data sets.

Theorem 1 builds on results from the literature on topologies on the universal type space. Consider any sequence of types $(t^n_i)_{n=1}^{\infty}$ where each $t^n_i \in T^{B(z_n)}_i$. Under Assumption 3, as the quantity of data $n$ gets large, the types $t^n_i$ (almost surely) have common certainty that first-order beliefs lie in an arbitrarily small neighborhood of the limiting belief $\mu^\infty$. Thus, the sequence $(t^n_i)$ can be shown to converge to $t_i^\infty$ in the uniform-weak topology (Chen et al., 2010) on the universal type space (see Lemma 4). Since rationalizability is upper hemi-continuous in the uniform-weak topology (Chen et al., 2010), Part (b) of the theorem follows.

Part (a) of the theorem is related to lower hemi-continuity of strict rationalizability in the uniform-weak topology (as shown in Chen et al. (2010)), but this property is not sufficient. Lower hemi-continuity guarantees that for any sequence of types $(t^n_i)_{n=1}^{\infty}$ from $T^{B(z_n)}_i$, the action $a_i$ must eventually be rationalizable along the sequence, but the rates of this convergence can differ substantially across different sequences. For eventual strong $B(z_n)$-rationalizability, we need that $a_i$ is rationalizable for all types from $T^{B(z_n)}_i$ when $n$ is sufficiently large. To establish this, I show that there is a $P$-measure 1 set of sequences along which the sets $\big(T^{B(z_n)}_i\big)_{n=1}^{\infty}$ converge to the singleton set $\{t_i^\infty\}$ in the Hausdorff metric induced by the uniform-weak metric. The key lemma underlying this result, Lemma 6, relates the degree of "strictness" of rationalizability of action $a_i$ at the limiting type $t_i^\infty$ to the size of the neighborhood around $\mu^\infty$ such that common certainty of that neighborhood implies rationalizability of $a_i$. The stronger property that types converge uniformly over the set $T^{B(z_n)}_i$ delivers the desired result.

Footnote: If the limiting belief $\mu^\infty$ is degenerate at a limiting parameter $\theta^\infty$, and players have common certainty that players' first-order beliefs have support in a shrinking neighborhood of $\theta^\infty$ (see Section 5.1 for a more formal development), then the property that $\underline{p}_n(a_i) \to 1$ amounts to the requirement that $a_i$ is robustly rationalizable, as defined in Morris et al. (2012), with the small difference that Morris et al. (2012) consider almost common belief in the exact parameter $\theta^\infty$, while I consider common certainty in a neighborhood of $\theta^\infty$. As Proposition 1 in Morris et al. (2012) shows, strict rationalizability is a sufficient condition for robust rationalizability. See also Kajii and Morris (2020) for related results.

Footnote: The intermediate case in which $a_i$ is rationalizable for player $i$ given infinite data, but not strictly rationalizable, is subtle and depends on details of the game. See Online Appendix O.4 for examples in which $\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = \{1\}$ and in which $\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = [0, 1]$. Note that the latter corresponds to a maximally ambiguous outcome: no amount of data is decisive on whether or not the action should be considered rationalizable.

Footnote: It is crucial that convergence occurs in this topology and not simply the product topology, as otherwise the negative results of Weinstein and Yildiz (2007) would apply.

The previous section characterized confidence sets given large numbers of common observations. I now focus on the setting of small $n$, and bound the extent to which the analyst's confidence set $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$ diverges from its asymptotic limit $[\underline{p}_\infty(a_i), \overline{p}_\infty(a_i)]$. Throughout this section, I impose the simplifying assumptions that observations are i.i.d., and that they take values from a finite set $Z$:

Assumption 4. $Z_1, \dots, Z_n \sim_{\text{i.i.d.}} Q$.

Assumption 5. $|Z| < \infty$.

In some cases, as in the examples in Section 2, the confidence set can be exactly characterized. In what follows, I provide bounds for the confidence set that can be easier to derive in certain cases.
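When neither an exact characterization nor an analytic bound is convenient, the confidence set can also be approximated by simulation under Assumptions 4 and 5. The sketch below is a heavily simplified illustration, not the paper's procedure: it assumes that checking rationalizability reduces to a threshold test on each plausible point belief (which need not hold in a general game), and all function names, the Beta priors, and the threshold are hypothetical. The lower probability counts data sets on which every plausible belief supports the action, and the upper probability counts data sets on which at least one does.

```python
import random

def estimate_confidence_set(sample_dataset, plausible_beliefs, supports_action,
                            n, draws=2000, seed=0):
    """Monte Carlo estimate of the pair (lower, upper) for data sets of size n."""
    rng = random.Random(seed)
    strong = weak = 0
    for _ in range(draws):
        z = sample_dataset(rng, n)
        checks = [supports_action(b) for b in plausible_beliefs(z)]
        strong += all(checks)  # stand-in for strong B(z_n)-rationalizability
        weak += any(checks)    # stand-in for weak B(z_n)-rationalizability
    return strong / draws, weak / draws

def bernoulli_dataset(rng, n, q=0.7):
    """n i.i.d. binary signals, each equal to 1 with (assumed) probability q."""
    return [1 if rng.random() < q else 0 for _ in range(n)]

def beta_posterior_means(z, priors=((1, 1), (1, 3), (3, 1))):
    """Posterior means under a few Beta(a, b) priors: a small hypothetical set M."""
    k, n = sum(z), len(z)
    return [(k + a) / (n + a + b) for a, b in priors]

def favours_action(belief, threshold=0.5):
    """Assumed rule: the action is supported when the belief exceeds the threshold."""
    return belief > threshold

for n in (5, 50, 200):
    lo, hi = estimate_confidence_set(bernoulli_dataset, beta_posterior_means,
                                     favours_action, n)
    print(n, lo, hi)
```

In this toy setting the interval is wide for small $n$, where the priors still matter, and collapses toward 1 as $n$ grows, in line with the qualitative message of Theorem 1.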

First consider an action $a_i$ that is strictly rationalizable for player $i$ of type $t_i^\infty$. By Theorem 1, the analyst's confidence set $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$ converges to a degenerate interval at 1. Proposition 1, below, provides a lower bound on $\underline{p}_n(a_i)$, which informs how fast this convergence occurs.

A key input into the bound is the "degree" to which $a_i$ is strictly rationalizable for the limiting type $t_i^\infty$. Say that a family of sets $(R_j[t_j])_{j \in I,\, t_j \in T_j}$, where each $R_j[t_j] \subseteq A_j$, has the $\delta$-strict best reply property if for each $i \in I$, type $t_i \in T_i$, and action $a_i \in R_i[t_i]$ there is a conjecture $\sigma_{-i} : \Theta \times T_{-i} \to \Delta(A_{-i})$ to which $a_i$ is a $\delta$-strict best reply for $t_i$; that is,
$$\int_{\Theta \times T_{-i}} u_i(a_i, \sigma_{-i}(\theta, t_{-i}), \theta)\, t_i[d\theta \times dt_{-i}] - \int_{\Theta \times T_{-i}} u_i(a'_i, \sigma_{-i}(\theta, t_{-i}), \theta)\, t_i[d\theta \times dt_{-i}] \geq \delta \quad \forall a'_i \neq a_i.$$
Say that an action $a_i$ is $\delta$-strict rationalizable for type $t_i$ if there exists a family of sets $(R_j[t_j])_{j \in I,\, t_j \in T_j}$ with the $\delta$-strict best reply property, where $a_i \in R_i[t_i]$.

Then, if $a_i$ is strictly rationalizable for the limiting type $t_i^\infty$, and players have commonly observed $n$ realizations, the probability that $a_i$ is rationalizable for all permitted types can be lower bounded as follows.

Proposition 1.

Suppose $a_i$ is strictly rationalizable for type $t_i^\infty$, and define
$$\delta^\infty := \sup\{\delta : a_i \text{ is } \delta\text{-strictly rationalizable for type } t_i^\infty\} \tag{4}$$
noting that this quantity is strictly positive. Further define
$$\xi := \sup_{\theta, \theta' \in \Theta} \|\theta - \theta'\|. \tag{5}$$
Then, for every $n \geq 1$,
$$\underline{p}_n(a_i) \geq 1 - \frac{K\xi}{\delta^\infty}\, \mathbb{E}\Big(\sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty)\Big) \tag{6}$$
where $K$ is the Lipschitz constant of the map $g : \Theta \to U$.

Footnote: The notion of $\delta$-strict rationalizability is equivalent to $\gamma$-rationalizability from Dekel et al. (2007), where $\gamma = -\delta$.

Recalling that $\overline{p}_n(a_i) \geq \underline{p}_n(a_i)$ for every $n$, this proposition allows us to lower bound the confidence set $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$.

The expression in (6) is increasing in $\delta^\infty$, so the "more strictly-rationalizable" the action is for the limiting type, the fewer observations are necessary for the prediction to hold. The bound is decreasing in $\mathbb{E}(\sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty))$, which is the expected distance from the limiting belief $\mu^\infty$ to the farthest belief in the plausible set $B(Z_n)$. When Assumption 3 is satisfied, then $\mathbb{E}(\sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty)) \to 0$ as $n \to \infty$, and the speed of this convergence can be interpreted as the speed at which players commonly learn (Cripps et al., 2008). Thus, Proposition 1 suggests that the quicker players commonly learn, the fewer observations are necessary for limiting predictions to carry over to small-data settings.

In an important special case, the limiting belief $\mu^\infty$ is a point mass at some $\theta^\infty$, and the sets $B(z_n)$ consist of beliefs with support on shrinking neighborhoods of $\theta^\infty$. Formally, let
$$C(z_n) := \bigcup_{\mu \in M} \operatorname{supp} \mu(z_n) \quad \forall z_n \in Z^n$$
with the implication that every $\mu(z_n)$, $\mu \in M$, assigns probability 1 to $C(z_n)$. If $C(z_n)$ collapses to the singleton set $\{\theta^\infty\}$ as $n \to \infty$, then the bound in Proposition 1 can be simplified as follows.

Assumption 6. $\sup_{\theta \in C(Z_n)} \|\theta - \theta^\infty\|$ converges to zero $P$-almost surely.

Proposition 2. Suppose Assumption 6 holds, and the action $a_i$ is strictly rationalizable for type $t_i^\infty$. Then, for every $n \geq 1$,
$$\underline{p}_n(a_i) \geq 1 - \frac{K}{\delta^\infty}\, \mathbb{E}\Big(\sup_{\theta \in C(Z_n)} \|\theta - \theta^\infty\|\Big) \tag{7}$$
where $K$ is the Lipschitz constant of the map $g : \Theta \to U$.

The expressions in Propositions 1 and 2 can be used to derive quantitative bounds for specific sets of learning rules, as in the following example:

Example 2.

Consider the payoﬀ matrix from Section 2.2 with unknown parameter β P R . Suppose that players commonly observe n public signals z t “ β ` ε t , with standardnormal error terms ε t that are i.i.d. across observations. The set of learning rules is M “t µ x u x Pr´ η,η s , where each learning rule µ x is identiﬁed with the prior belief β „ N p x, q , andmaps data into a point mass at the posterior expectation of β . The set C p z n q thus consistsof the posterior expectations under the diﬀerent priors, and players have common certaintyin the event that all players have ﬁrst-order beliefs with support on C p z n q . Let the true valueof β satisfy β ą . Then, applying Proposition 2: Corollary 1.

For each n ě , p n p strong q ě ´ β ´ ˜c πn ` β ` ηn ` ¸ The bound in Corollary 1 is decreasing in η (the size of the model class), increasing in n (thenumber of observations), and increasing in β ´ (the strictness of the solution at the limit). That is, there is a P -measure 1 set of (inﬁnite) sequences such that sup θ P C p z n q } θ ´ θ } Ñ n Ñ 8 for each sequence z in this set. .2 Upper Bound Now suppose that the action a i is not rationalizable for player i of type t i . We know fromPart (c) of Theorem 1 that in this case, the analyst’s conﬁdence set r p n p a i q , p n p a i qs convergesto a degenerate interval at zero. But given small quantities of data n , the action a i may stillconstitute a plausible prediction of play, as in the trading game studied in Section 2. Claim3, below, provides an upper bound on p n p a i q , which informs whether the analyst shouldconsider a i a plausible prediction away from the limit.To deﬁne this bound, a few intermediate deﬁnitions are needed. Let Z a i be all data sets z n given which the action a i is weakly B p z n q -rationalizable. (This set must be determined ona case-by-case basis.) Let p Q z n P ∆ p Z q be the empirical measure associated with data set z n .The Kullback-Leibler divergence between p Q z n and the actual data-generating distribution Q is D KL p p Q z n } Q q “ ř z P Z Q p z q log ´ p Q z n p z q Q p z q ¯ . Deﬁne Q ˚ n “ argmin p Q P t z n P Z nai u D KL p p Q z n } Q q to be the empirical measure (associated with a data set in Z a i ) that is closest in Kullback-Leibler divergence to Q . Application of Sanov’s theorem directly gives the following result. Proposition 3.

Suppose $a_i$ is not rationalizable for type $t_i^\infty$; then, for every $n \geq 1$,
$$\overline{p}_n(a_i) \leq (n + 1)^{|Z|}\, 2^{-n D_{KL}(Q^*_n \,\|\, Q)}.$$
Recalling that $\overline{p}_n(a_i) \geq \underline{p}_n(a_i)$ for every $n$, this proposition allows us to upper bound the confidence set $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$. The proposition is applied below in an example setting: Example 3.

Consider the trading game from Section 2 and the learning rules described inExample 1, but suppose that the domain of q is r { , s and the domain of π is r { , { s ,so that Assumption 3 is satisﬁed. Let the true signal structure be identiﬁed with q ˚ “ { .and suppose the posted price is p “ { . Theorem 1 implies that entering will fail to berationalizable when players have observed suﬃcient data. Nevertheless, the action may berationalizable for a permitted belief if players have observed a small number of data points.The corollary below quantiﬁes this probability. orollary 2. For each n ě , p n p enter q ď p n ` q ´ r n n where r n “ ´ log p n q ´ log ´ t n ` log p q log p q u ¯¯ ` ´ log p n q ´ log ´ t n ´ log p q log p q u ¯¯ . Asymptotic Disagreement.

In the main text, I imposed an assumption which guaranteed that beliefs produced by learning rules in $M$ uniformly converge to a common limiting belief $\mu^\infty$. This implies that learning eventually removes all differences in beliefs. It is possible to replace Assumption 3 with the following, weaker condition, which allows players to have heterogeneous beliefs even in the limit: For any $\epsilon \geq 0$, say that the class of learning rules $M$ satisfies $\epsilon$-Uniform Convergence if
$$\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty) \leq \epsilon \quad P\text{-a.s.}$$
This requires that the set of expected parameters converges to an $\epsilon$-neighborhood of $\mu^\infty$. Then, Theorem 1 holds as long as the set of learning rules $M$ satisfies $\epsilon$-Uniform Convergence for some $\epsilon \leq \delta^\infty / (K\xi)$. The rate results do not change.

Approximate Common Certainty.

Suppose that instead of imposing common certainty in $B(z_n)$, as we have done in the main text, players have common $p$-belief in $B(z_n)$. Formally, for any probability $p \in [0, 1]$, player $i$, and set $B \subseteq \Delta(\Theta)$, define
$$B^{1,p}_i(B) := \{t_i \in T^*_i : \operatorname{marg}_\Theta \kappa^*_i(t_i) \in B\}.$$
For each $k > 1$, and again for each player $i$, recursively define
$$B^{k,p}_i(B) = \Big\{ t_i \in T^*_i : \kappa^*_i(t_i)\Big( \Theta \times \prod_{j \neq i} B^{k-1,p}_j(B) \Big) \geq p \Big\}.$$
The set $T^{B,p}_i = \bigcap_{k \geq 1} B^{k,p}_i(B)$ is the set of player $i$ types that have common $p$-belief in the event that all players' first-order beliefs belong to $B$. There exists a threshold $p^*$ such that so long as players have common $p$-belief in the event that all players' first-order beliefs belong to $B(z_n)$, where $p > p^*$, then Theorem 1 holds as stated. Rate results similar to those in Section 5 can also be obtained (see Online Appendix O.5 for details). Both extensions rely on boundedness of the payoff range.

Footnote: The set $B^{1,p}_i(B)$ has the same definition as $B^{1,1}_i$ from the main text. It is possible to relax the assumptions further, so that $B^{1,p}_i(B) := \{t_i \in T^*_i : \sup_{\nu \in B} d_P(\nu, \operatorname{marg}_\Theta \kappa^*_i(t_i)) \leq 1 - p\}$, but this does not correspond to any standard definitions.

Confidence Sets for Equilibrium.

The proposed approach can be paired with solution concepts besides rationalizability. For example, suppose we are interested in evaluating an analyst's confidence in predicting that the action profile $a \in A$ is part of a (pure-strategy) Bayesian Nash equilibrium. The analogous confidence set is $[\underline{p}_n(a), \overline{p}_n(a)]$, where the lower bound $\underline{p}_n(a)$ is the probability (over possible datasets $z_n$) that $a_i$ is a best reply to $a_{-i}$ for every player $i$ of any type $t_i \in T^{B(z_n)}_i$. The upper bound $\overline{p}_n(a)$ is the probability that there exists a belief-closed type space $(T_i, \kappa_i)_{i \in I}$ where each $T_i \subseteq T^{B(z_n)}_i$, and the strategy profile $\sigma$ with $\sigma_i(t_i) = a_i$ for all $i$, $t_i \in T_i$ is a Bayesian Nash equilibrium. Then, Theorem 1 holds with "strict rationalizability" replaced with "strict equilibrium" in the limiting game, and the rate results provided in Proposition 1 hold when $\delta^\infty$ is replaced with an analogous notion for the strictness of the equilibrium in the limiting game.

Economists make predictions in incomplete information games based on models of unobservable beliefs. A large literature on the robustness of strategic predictions to the specification of agent beliefs provides guidance regarding whether these predictions should be trusted. These robustness notions tend to be qualitative: we learn whether the prediction is or isn't robust to perturbations in the agents' beliefs. Here I offer a different perspective, namely a quantitative metric for how robust the prediction is. The metric depends on the quantity of data that agents get to see. Predictions that hold given infinite quantities of data may not hold given large quantities of data, and those that hold given large quantities of data may not hold in environments where agents see only a few observations. Likewise, predictions that don't hold at the limit may nevertheless be plausible when agents' beliefs are coordinated by a small number of observations. The proposed framework provides a way of formalizing this, generating new comparative statics for how the analyst's confidence in a strategic prediction varies with primitives of the learning environment.

References

Acemoglu, D., V. Chernozhukov, and M. Yildiz (2015): “Fragility of Asymptotic Agreement under Bayesian Learning,” Theoretical Economics, 11, 187–225.

Al-Najjar, N. (2009): “Decisionmakers as Statisticians: Diversity, Ambiguity, and Learning,” Econometrica, 77, 1371–1401.

Al-Najjar, N. and M. Pai (2014): “Coarse Decision Making and Overfitting,” Journal of Economic Theory, 150, 467–486.

Aumann, R. J. (1976): “Agreeing to Disagree,” The Annals of Statistics, 4, 1236–1239.

Battigalli, P. and A. Prestipino (2011): “Transparent Restrictions on Beliefs and Forward Induction Reasoning in Games with Asymmetric Information,” The B.E. Journal of Theoretical Economics, 13, 79–130.

Battigalli, P. and M. Siniscalchi (2003): “Rationalization and Incomplete Information,” Advances in Theoretical Economics, 3, 1534–5963.

Brandenburger, A. and E. Dekel (1993): “Hierarchies of Belief and Common Knowledge,” Journal of Economic Theory, 59, 189–198.

Brandenburger, A., A. Friedenberg, and J. Keisler (2008): “Admissibility in Games,” Econometrica, 76, 307–352.

Carlin, B. I., S. Kogan, and R. Lowery (2013): “Trading Complex Assets,” The Journal of Finance, 68, 1937–1960.

Carlsson, H. and E. van Damme (1993): “Global Games and Equilibrium Selection,” Econometrica, 61, 989–1018.

Chen, Y.-C., A. di Tillio, E. Faingold, and S. Xiong (2010): “Uniform topologies on types,” Theoretical Economics, 5, 445–478.

Chen, Y.-C., A. di Tillio, E. Faingold, and S. Xiong (2017): “Characterizing the Strategic Impact of Misspecified Beliefs,” Review of Economic Studies, 84, 1424–1471.

Chen, Y.-C. and S. Takahashi (2020): “On Robust Selection and Robust Rationalizability,” Working Paper.

Cripps, M., J. Ely, G. Mailath, and L. Samuelson (2008): “Common Learning,” Econometrica, 76, 909–933.

Dekel, E., D. Fudenberg, and D. Levine (2004): “Learning to Play Bayesian Games,” Games and Economic Behavior, 46, 282–303.

Dekel, E., D. Fudenberg, and S. Morris (2006): “Topologies on Types,” Theoretical Economics, 1, 275–309.

Dekel, E., D. Fudenberg, and S. Morris (2007): “Interim Correlated Rationalizability,” Theoretical Economics, 2, 15–40.

Ely, J. and M. Peski (2011): “Critical Types,” Review of Economic Studies, 78, 907–937.

Esponda, I. (2013): “Rationalizable Conjectural Equilibrium: A Framework for Robust Predictions,” Theoretical Economics, 8, 467–501.

Gayer, G., I. Gilboa, and O. Lieberman (2007): “Rule-Based and Case-Based Reasoning in Housing Prices,” The B.E. Journal of Theoretical Economics, 7, 1–37.

Geanakoplos, J. and H. Polemarchakis (1982): “We Can’t Disagree Forever,” Journal of Economic Theory, 28, 192–200.

Geary, R. C. (1935): “The Ratio of the Mean Deviation to the Standard Deviation as a Test for Normality,” Biometrika, 27, 310–332.

Gibbs, A. and F. Su (2002): “On Choosing and Bounding Probability Metrics,” International Statistical Review.

Gilboa, I. and D. Schmeidler (1995): “Case-Based Decision Theory,” The Quarterly Journal of Economics, 110, 605–639.

Gilboa, I. and D. Schmeidler (2003): “Inductive Inference: An Axiomatic Approach,” Econometrica, 71, 1–26.

Haghtalab, N., M. O. Jackson, and A. D. Procaccia (2020): “Belief Polarization in a Complex World: A Learning Theory Perspective,” Working Paper.

Hastie, T., R. Tibshirani, and J. Friedman (2009): The Elements of Statistical Learning, Springer.

Jehiel, P. (2005): “Analogy-Based Expectation Equilibrium,” Journal of Economic Theory, 123, 81–104.

Jehiel, P. (2018): “Investment Strategy and Selection Bias: An Equilibrium Perspective on Overoptimism,” American Economic Review, 108, 1582–1597.

Kajii, A. and S. Morris (1997): “The Robustness of Equilibria to Incomplete Information,” Econometrica, 65, 1283–1309.

Kajii, A. and S. Morris (2020): “Refinements and Higher-Order Beliefs: A Unified Survey,” The Japanese Economic Review, 71, 7–34.

Kandel, E. and N. Pearson (1995): “Differential Interpretation of Information and Trade in Speculative Markets,” Journal of Political Economy, 103, 831–872.

Mankiw, G., R. Reis, and J. Wolfers (2004): “Disagreement about inflation expectations,” NBER Macroeconomics Annual 2003.

Mertens, J.-F. and S. Zamir (1985): “Formulation of Bayesian Analysis for Games with Incomplete Information,” International Journal of Game Theory, 14, 1–29.

Milgrom, P. and N. Stokey (1982): “Information, Trade, and Common Knowledge,” Journal of Economic Theory, 26, 17–27.

Monderer, D. and D. Samet (1989): “Approximating Common Knowledge with Common Beliefs,” Games and Economic Behavior, 1, 170–190.

Morris, S. (1995): “The Common Prior Assumption in Economic Theory,” Economics and Philosophy, 11, 227–253.

Morris, S., S. Takahashi, and O. Tercieux (2012): “Robust Rationalizability Under Almost Common Certainty of Payoffs,” The Japanese Economic Review, 63, 57–67.

Olea, J. M., P. Ortoleva, M. Pai, and A. Prat (2017): “Competing Models,” Working Paper.

Ollar, M. and A. Penta (2017): “Full Implementation and Belief Restrictions,” American Economic Review, 107, 2243–2277.

Rubinstein, A. (1989): “The Electronic Mail Game: Strategic Behavior Under ‘Almost Common Knowledge’,” American Economic Review, 79, 385–391.

Salant, Y. and J. Cherry (2020): “Statistical Inference in Games,” Working Paper.

Spiegler, R. (2016): “Bayesian Networks and Boundedly Rational Expectations,” Quarterly Journal of Economics, 131, 1243–1290.

Steiner, J. and C. Stewart (2008): “Contagion through Learning,” Theoretical Economics, 3, 431–458.

Weinstein, J. and M. Yildiz (2007): “A Structure Theorem for Rationalizability with Application to Robust Prediction of Refinements,” Econometrica, 75, 365–400.

Weinstein, J. and M. Yildiz (2017): “Interim Correlated Rationalizability in Infinite Games,” Journal of Mathematical Economics, 72, 82–87.

Appendix

A Proofs for Section 2

A.1 Proof of Claim 1

Suppose that $f(x_S) = 1$, so that the Seller’s good has a high value. (The proof follows along similar lines in the other case.) I will first show that $\underline{p}^n = 0$ for every $n$. Let $\pi$ be a point mass at $f$. An agent with this prior assigns probability 1 to $v = 1$, and this degenerate belief belongs to $\mathcal{B}(z_n)$ for every $z_n$, so common certainty in $v = 1$ is consistent with Assumption 1. But entering is not rationalizable for the Seller with this belief, implying $\underline{p}^n = 0$.

To characterize $\overline{p}^n$, I first show that entering is rationalizable for some type satisfying Assumption 1 if and only if there exist $\tilde{f}, \tilde{f}' \in \mathcal{F}$ that are consistent with the data, and which make conflicting predictions for the Seller’s good $x_S$ (Lemma 1). I characterize the probability of this event in Lemma 2, from which the comparative statics for $\overline{p}^n$ follow directly.

Lemma 1.

Fix an arbitrary data set $z_n = \{(x_i, f(x_i))\}_{i=1}^n$. Entering is rationalizable for the Seller with a belief satisfying Assumption 1 if and only if there exist $\tilde{f}, \tilde{f}' \in \mathcal{F}$ where

(1) $\tilde{f}(x_i) = \tilde{f}'(x_i) = f(x_i)$ for each observation $i = 1, \dots, n$;

(2) $\tilde{f}(x_S) = 1$ while $\tilde{f}'(x_S) = 0$.

Proof. Suppose there exists a pair $\tilde{f}, \tilde{f}'$ satisfying (1) and (2), and define $\pi_{\tilde{f}}, \pi_{\tilde{f}'} \in \Delta(\mathcal{F})$ to be point masses on $\tilde{f}$ and $\tilde{f}'$. Since these rules are consistent with the data by (1), the posterior beliefs updated to $z_n$ are likewise degenerate at $\tilde{f}$ and $\tilde{f}'$, and thus assign (respectively) probability 1 to $v = 1$ and $v = 0$. This implies that degenerate distributions at 1 and 0 both belong to $\mathcal{B}(z_n)$. Entering is rationalizable for a Seller whose first-order belief is one of these degenerate distributions and who is certain that the Buyer holds the other; both beliefs belong to $\mathcal{B}(z_n)$, so this type is consistent with Assumption 1.

Now suppose that no such pair $\tilde{f}, \tilde{f}'$ exists, implying either that every $\tilde{f} \in \mathcal{F}$ consistent with the data predicts $\tilde{f}(x_S) = 0$, or that every $\tilde{f} \in \mathcal{F}$ consistent with the data predicts $\tilde{f}(x_S) = 1$. Then either $\mathcal{B}(z_n)$ is the singleton set consisting of a degenerate distribution at 1, or it is the singleton set consisting of a degenerate distribution at 0. If the former, the only type satisfying Assumption 1 is the one with common certainty in $v = 1$, and if the latter, the only type satisfying Assumption 1 is the one with common certainty in $v = 0$. Entering is not rationalizable for the Seller with either of these beliefs.

Here, and elsewhere in the proof, "type $t_i$ has common certainty in $v = 1$" is shorthand for common certainty in the event $\{f \mid f(x_S) = 1\} \times (X \times \{0, 1\}) \times T^*_{-i}$.

Lemma 2. Suppose the true function is $f(x) = \mathbb{1}(x \in R)$ where $R = [-\underline{r}_1, \overline{r}_1] \times [-\underline{r}_2, \overline{r}_2] \times \dots \times [-\underline{r}_m, \overline{r}_m]$ for a sequence of constants $\underline{r}_1, \overline{r}_1, \dots, \underline{r}_m, \overline{r}_m \in (0, 1)$. Then

$$\overline{p}^n(a_i) = 1 - \prod_{k=1}^m \left(1 - \left(\tfrac{1}{2}\right)^n \left[(2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n\right]\right).$$

Proof.

From Lemma 1, the probability $\overline{p}^n$ is equal to the measure of data sets $z_n$ given which there exist $\tilde{f}, \tilde{f}' \in \mathcal{F}$ that are consistent with $z_n$, and which make conflicting predictions at the input $x_S$. The true classification rule $f$ is always consistent with the data, and predicts $f(x_S) = 1$, so a pair of such rules exists if we can additionally find a rule $\tilde{f} \in \mathcal{F}$ consistent with the data that predicts $\tilde{f}(x_S) = 0$. Such a rule exists when there is a dimension $k$ on which either every observation $x_i$ satisfies $x_i^k < 0$, or every $x_i$ satisfies $0 < x_i^k$. This allows some $\tilde{f} \in \mathcal{F}$ to be consistent with the data, but to predict 0 at the zero vector.

For each dimension $k$, the probability that there is at least one observation $x_i$ with $x_i^k \in [-\underline{r}_k, 0)$ and at least one observation $x_j$ with $x_j^k \in (0, \overline{r}_k]$ is

$$1 - \left(\tfrac{1}{2}\right)^n \left[(2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n\right].$$

Observe that attribute values are independent across dimensions. So the probability that for every dimension $k$, there is at least one observation $x_i$ with $x_i^k \in [-\underline{r}_k, 0)$ and at least one observation $x_j$ with $x_j^k \in (0, \overline{r}_k]$, is

$$\prod_{k=1}^m \left(1 - \left(\tfrac{1}{2}\right)^n \left[(2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n\right]\right).$$

The desired probability is the complement of this event, which yields the expression in the lemma.

The following functional form is used in the main text:

Corollary 3. In the special case in which the true function is $f(x) = \mathbb{1}(x \in R)$ where $R = [-a, a]^m$ for some $a \in (0, 1)$, we have

$$\overline{p}^n(a_i) = 1 - \left[1 - \left(2\left(1 - \tfrac{a}{2}\right)^n - (1 - a)^n\right)\right]^m.$$

A.2 Proof of Claim 2

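I first demonstrate the following lemma, which characterizes the probabilities $\underline{p}^n$ and $\overline{p}^n$.

Lemma 3. For every sample size $n$,

$$\underline{p}^n = 1 - \Phi\left(z_\alpha - \frac{\beta - 1}{\sigma}\sqrt{\frac{n^2 - 1}{12}}\right) \quad \text{while} \quad \overline{p}^n = 1 - \Phi\left(-z_\alpha - \frac{\beta - 1}{\sigma}\sqrt{\frac{n^2 - 1}{12}}\right)$$

where $z_\alpha = -\Phi^{-1}(\alpha/2)$ with $\Phi$ denoting the CDF of the standard normal distribution. Since $\beta > 1$, both probabilities are decreasing in $\sigma$ and increasing in $n$. Thus Claim 2 follows.

Towards this lemma, I first prove the following intermediate result:

Lemma 4.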

Write $T_i^C$ for the set of player $i$ types with common certainty in the event that all players have first-order beliefs that assign probability 1 to $C$.

(a) The strong policy is rationalizable for all types $t_i \in T_i^C$ if and only if $C \subseteq [1, \infty)$.

(b) The strong policy is rationalizable for some type $t_i \in T_i^C$ if and only if $C \cap [1, \infty) \neq \emptyset$.

Proof. (a) $C \subseteq [1, \infty)$ is a necessary condition, as otherwise the strong policy is not rationalizable for any type with common certainty in $\beta \in C \setminus [1, \infty)$. Suppose $C \subseteq [1, \infty)$ and choose any $t_i \in T_i^C$. For each $\beta \in C$, $u_i(\text{strong}, \text{strong}, \beta) = -1$ while $u_i(\text{weak}, \text{strong}, \beta) = -\beta \leq -1$. So

$$\int u_i(\text{strong}, \text{strong}, \beta)\, t_i^1(\beta)\, d\beta = -1 \geq \int u_i(\text{weak}, \text{strong}, \beta)\, t_i^1(\beta)\, d\beta$$

where $t_i^1$ denotes the first-order belief of type $t_i$. Thus the family of sets $(R_1, R_2)$ with $R_1 = R_2 = \{\text{strong}\}$ is closed under best reply, and rationalizability of the strong policy follows.

(b) Suppose $C \cap [1, \infty) = \emptyset$. Then for every $\beta \in C$, $u_i(\text{strong}, \text{strong}, \beta) = -1 \leq -\beta = u_i(\text{weak}, \text{strong}, \beta)$. So the strong policy is strictly dominated (and hence not rationalizable) for player $i$ given any type $t_i \in T_i^C$. If instead $C \cap [1, \infty) \neq \emptyset$, then the strong policy is rationalizable for any type with common certainty in some $\beta$ in this intersection. So the strong policy is rationalizable for at least one type $t_i \in T_i^C$, as desired.

I now prove Lemma 3.

Proof.

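Using standard results for ordinary least-squares (Hastie et al., 2009), the distribution of the OLS estimator $\hat{\beta}$ is

$$\hat{\beta} \sim N\left(\beta, \frac{\sigma^2}{\frac{1}{n}\sum_{t=1}^n (t - \bar{t})^2}\right)$$

where $\bar{t} = \frac{1}{n}\sum_{t=1}^n t$. Since

$$\frac{1}{n}\sum_{t=1}^n (t - \bar{t})^2 = \frac{1}{n}\left(\sum_{t=1}^n t^2 - 2\bar{t}\sum_{t=1}^n t + \sum_{t=1}^n \bar{t}^2\right) = \frac{1}{6}(n+1)(2n+1) - \frac{1}{2}(n+1)^2 + \left(\frac{n+1}{2}\right)^2 = \frac{1}{12}(n^2 - 1)$$

we can simplify the variance of $\hat{\beta}$ to $\frac{12\sigma^2}{n^2 - 1}$. The $(1 - \alpha)$-confidence interval for $\beta$ given data $z_n$ is thus

$$C(z_n) = \left[\hat{\beta}(z_n) - z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}},\ \hat{\beta}(z_n) + z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}}\right] \tag{8}$$

where $\hat{\beta}(z_n)$ is the OLS estimate of $\beta$ given the data $z_n$, and $z_\alpha = -\Phi^{-1}(\alpha/2)$ is the critical value associated with the $(1 - \alpha)$-confidence level. The probability that the interval in (8) is contained in $[1, \infty)$ is

$$\Pr\left(\hat{\beta}(z_n) > 1 + z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}}\right)$$

which is in turn equal to

$$1 - \Phi\left(z_\alpha - \frac{\beta - 1}{\sigma}\sqrt{\frac{n^2 - 1}{12}}\right). \tag{9}$$

By Part (a) of Lemma 4, $\underline{p}^n$ is equal to (9), delivering the first part of the lemma.

The probability that the interval in (8) has nonempty intersection with $[1, \infty)$ is given by

$$\Pr\left(\hat{\beta}(z_n) > 1 - z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}}\right)$$

which is equal to

$$1 - \Phi\left(-z_\alpha - \frac{\beta - 1}{\sigma}\sqrt{\frac{n^2 - 1}{12}}\right). \tag{10}$$

By Part (b) of Lemma 4, $\overline{p}^n$ is equal to (10), concluding the proof.

B Proofs for Main Results (Sections 4 and 5)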

B.1 Proof of Theorem 1 Part (a)

Recall that $\Theta$ and $U$ are both endowed with the sup-norm, and the map $g : \Theta \to U$ has Lipschitz constant $K$. The set of probability measures $\Delta(\Theta)$ is endowed with the Prokhorov metric $d_P$. The Wasserstein distance on $\Delta(\Theta)$ is

$$d_W(\nu, \nu') = \sup\left\{\int h\, d\nu - \int h\, d\nu' : \|h\|_L \leq 1\right\}$$

where $\|h\|_L$ is the Lipschitz constant of the function $h : \Theta \to \mathbb{R}$.

Lemma 5. Fix any player $i$, action $a_i \in A_i$, mixed strategy $\alpha_i \in \Delta(A_i)$, and set $R_{-i} \subseteq A_{-i}$. Let $a_{-i}(\theta) : \Theta \to \Delta(A_{-i})$ be any function satisfying

$$a_{-i}(\theta) \in \operatorname*{argmax}_{a_{-i} \in R_{-i}} \left(u_i(a_i, a_{-i}, \theta) - u_i(\alpha_i, a_{-i}, \theta)\right) \quad \forall \theta \in \Theta$$

and define $h : \Theta \to \mathbb{R}$ by

$$h(\theta) = u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta).$$

Then, the function $h$ is Lipschitz continuous with Lipschitz constant $2K$.

Proof. Choose any $\theta, \theta' \in \Theta$, and without loss of generality, suppose $h(\theta) \geq h(\theta')$. Then

$$\begin{aligned}
|h(\theta) - h(\theta')| &= \left|\left(u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta)\right) - \left(u_i(a_i, a_{-i}(\theta'), \theta') - u_i(\alpha_i, a_{-i}(\theta'), \theta')\right)\right| \\
&\leq \left|\left(u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta)\right) - \left(u_i(a_i, a_{-i}(\theta), \theta') - u_i(\alpha_i, a_{-i}(\theta), \theta')\right)\right| \\
&\leq \left|u_i(a_i, a_{-i}(\theta), \theta) - u_i(a_i, a_{-i}(\theta), \theta')\right| + \left|u_i(\alpha_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta')\right| \\
&\leq 2\|g(\theta) - g(\theta')\|_\infty \leq 2K\|\theta - \theta'\|_\infty
\end{aligned}$$

using in the final inequality that $g : \Theta \to U$ has Lipschitz constant $K$.

Below, let $F^\epsilon$ denote the $\epsilon$-neighborhood of the set $F$.

Lemma 6.

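Suppose $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$, where $\delta > 0$. Let $\mathcal{B}$ be any subset of $\{\mu^\infty\}^{\delta/(2K\xi)}$, where $K$ is the Lipschitz constant of $g : \Theta \to U$, and $\xi$ is as defined in (5). Then, $a_i$ is rationalizable for all types $t_i \in T_i^{\mathcal{B}}$.

Footnote: Chen et al. (2010) demonstrate a similar result for finite state spaces $\Theta$ (see their Proposition 2). I use ideas from their proof here, but consider a more general environment, replacing finiteness of $\Theta$ with Lipschitz continuity on $g : \Theta \to U$.

Proof. Fix $\epsilon > 0$, and consider an arbitrary set $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$. I will show that $a_i$ is rationalizable for all types $t_i \in T_i^{\mathcal{B}}$ when $\epsilon$ is sufficiently small.

To show this, I use Proposition 1 from Chen et al. (2010):

Proposition 4 (Chen et al. (2010)). For each $k \geq 1$, player $i \in I$, type $t_i \in T_i$, and action $a_i \in A_i$, we have $a_i \in S_i^k[t_i]$ if and only if for each $\alpha_i \in \Delta(A_i \setminus \{a_i\})$, there exists a measurable $\sigma_{-i} : \Theta \times T_{-i} \to \Delta(A_{-i})$ with

$$\operatorname{supp} \sigma_{-i}(\theta, t_{-i}) \subseteq S_{-i}^{k-1}[t_{-i}] \quad \forall (\theta, t_{-i}) \in \Theta \times T_{-i}$$

such that

$$\int_{\Theta \times T_{-i}} \left[u_i(a_i, \sigma_{-i}(\theta, t_{-i}), \theta) - u_i(\alpha_i, \sigma_{-i}(\theta, t_{-i}), \theta)\right] t_i[d\theta \times dt_{-i}] \geq 0.$$

Footnote: Proposition 1 from Chen et al. (2010) characterizes $\gamma$-rationalizability for arbitrary $\gamma \in \mathbb{R}$. For the purposes of this proof, it is sufficient to set $\gamma = 0$.

By assumption, there exists $\delta \in \mathbb{R}_{++}$ such that $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$. This implies that there exists a family of sets $(R_j)_{j \in I} \subseteq \prod_{j \in I} A_j$, where $a_i \in R_i$, and for every $a_j \in R_j$ there exists a $\sigma_j : \Theta \to \Delta(A_{-j})$ satisfying

$$\operatorname{supp} \sigma_j(\theta) \subseteq R_{-j} \quad \forall \theta \in \Theta$$

and

$$\int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu^\infty - \int_\Theta u_j(a_j', \sigma_j(\theta), \theta)\, d\mu^\infty \geq \delta \quad \forall a_j' \neq a_j. \tag{11}$$

I will show that for each $k \geq 1$, player $j$, type $t_j \in T_j^{\mathcal{B}}$, action $a_j \in R_j$, and mixed strategy $\alpha_j \in \Delta(A_j \setminus \{a_j\})$, there exists a measurable $\sigma_{-j} : \Theta \times T_{-j}^{\mathcal{B}} \to \Delta(A_{-j})$ with

$$\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j} \quad \forall (\theta, t_{-j}) \in \Theta \times T_{-j}^{\mathcal{B}}$$

and

$$\int_{\Theta \times T_{-j}^{\mathcal{B}}} \left[u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta) - u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\right] t_j[d\theta \times dt_{-j}] \geq 0. \tag{12}$$

Since $a_i \in R_i$ by design, it follows from Proposition 4 that for any type $t_i \in T_i^{\mathcal{B}}$, the action $a_i \in S_i^k[t_i]$ for every $k$, and hence $a_i \in S_i^\infty[t_i]$, as desired.

Fix an arbitrary player $j$, $a_j \in R_j$, type $t_j \in T_j^{\mathcal{B}}$, and $\alpha_j \in \Delta(A_j \setminus \{a_j\})$. Define $a_{-j} : \Theta \to A_{-j}$ to satisfy

$$a_{-j}(\theta) \in \operatorname*{argmax}_{a_{-j} \in R_{-j}} \left(u_j(a_j, a_{-j}, \theta) - u_j(\alpha_j, a_{-j}, \theta)\right) \quad \forall \theta \in \Theta$$

and define $\sigma_{-j} : \Theta \times T_{-j}^{\mathcal{B}} \to \Delta(A_{-j})$ so that each $\sigma_{-j}(\theta, t_{-j})$ is a point mass at $a_{-j}(\theta)$. Then by definition $\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j}$ for all $(\theta, t_{-j}) \in \Theta \times T_{-j}^{\mathcal{B}}$. Further define

$$h(\theta) := u_j(a_j, a_{-j}(\theta), \theta) - u_j(\alpha_j, a_{-j}(\theta), \theta) \quad \forall \theta \in \Theta.$$

For notational ease, write $\nu \in \Delta(\Theta)$ for the first-order belief of type $t_j$. Then

$$\begin{aligned}
&\int_{\Theta \times T_{-j}^{\mathcal{B}}} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^{\mathcal{B}}} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \\
&\quad = \int_\Theta u_j(a_j, a_{-j}(\theta), \theta)\, \nu[d\theta] - \int_\Theta u_j(\alpha_j, a_{-j}(\theta), \theta)\, \nu[d\theta] = \int_\Theta h(\theta)\, \nu[d\theta]
\end{aligned}$$

so the desired condition in (12) follows if we can show that $\int_\Theta h(\theta)\, \nu[d\theta] \geq 0$.

By Lemma 5, the function $h : \Theta \to \mathbb{R}$ has Lipschitz constant $2K$, so

$$\left|\int_\Theta h(\theta)\, d\nu - \int_\Theta h(\theta)\, d\mu^\infty\right| \leq 2K \cdot d_W(\nu, \mu^\infty)$$

where $d_W$ is the Wasserstein distance on $\Delta(\Theta)$. This implies

$$\int_\Theta h(\theta)\, d\nu \geq \int_\Theta h(\theta)\, d\mu^\infty - 2K \cdot d_W(\nu, \mu^\infty).$$

Applying Theorem 2 in Gibbs and Su (2002), $d_W(\nu, \mu^\infty) \leq \xi \cdot d_P(\nu, \mu^\infty)$, where $d_P$ is the Prokhorov distance on $\Delta(\Theta)$ and $\xi$ is as defined in (5). So

$$\int_\Theta h(\theta)\, d\nu \geq \int_\Theta h(\theta)\, d\mu^\infty - 2K\xi \cdot d_P(\nu, \mu^\infty). \tag{13}$$

It follows from the inequality in (11) that

$$\int_\Theta h(\theta)\, d\mu^\infty = \int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu^\infty - \int_\Theta u_j(\alpha_j, \sigma_j(\theta), \theta)\, d\mu^\infty \geq \delta,$$

so (13) implies

$$\int_\Theta h(\theta)\, d\nu \geq \delta - 2K\xi \cdot d_P(\nu, \mu^\infty).$$

Finally, by assumption that $t_j \in T_j^{\mathcal{B}}$ for some $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$, the Prokhorov distance between the first-order belief $\nu$ of type $t_j$ and the limiting belief $\mu^\infty$ is $d_P(\nu, \mu^\infty) \leq \epsilon$. So

$$\int_\Theta h(\theta)\, d\nu \geq \delta - 2K\xi\epsilon.$$

It follows that $\epsilon \leq \delta/(2K\xi)$ is a sufficient condition for the constructed $\sigma_{-j}$ to satisfy the desired condition in (12).

Since $a_i$ is strictly rationalizable for type $t_i^\infty$ (by assumption), there exists a $\delta \in \mathbb{R}_{++}$ for which $a_i$ is $\delta$-strictly rationalizable. Assumption 3 implies that

$$\lim_{n \to \infty} P^n\left(\left\{z_n \,\middle|\, \sup_{\mu \in M} d_P(\mu(z_n), \mu^\infty) \leq \epsilon\right\}\right) = 1 \quad \forall \epsilon > 0,$$

which further implies

$$\lim_{n \to \infty} P^n\left(\left\{z_n \,\middle|\, \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta/(2K\xi)}\right\}\right) = 1.$$

By Lemma 6,

$$\underline{p}^n(a_i) \geq P^n\left(\left\{z_n \,\middle|\, \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta/(2K\xi)}\right\}\right) \quad \forall n \geq 1,$$

so $\underline{p}^n(a_i) \to 1$. Theorem 1 Part (a) follows.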
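The last step of the argument above is simple arithmetic: a type whose first-order belief lies within Prokhorov distance $\epsilon$ of the limiting belief keeps a payoff margin of at least $\delta - 2K\xi\epsilon$, which is nonnegative exactly when $\epsilon \leq \delta/(2K\xi)$. The following is a minimal sketch of that bookkeeping, not part of the original proof. The function names `payoff_margin` and `safe_radius` and the numerical values are hypothetical, chosen only for illustration.

```python
def payoff_margin(delta: float, K: float, xi: float, eps: float) -> float:
    """Lower bound delta - 2*K*xi*eps on the expected payoff gap for a type
    whose first-order belief is within Prokhorov distance eps of the limit.

    delta: strictness of rationalizability in the limiting game (assumed > 0)
    K:     Lipschitz constant of the payoff map g (assumed > 0)
    xi:    constant relating Wasserstein and Prokhorov distances (assumed > 0)
    """
    if delta <= 0 or K <= 0 or xi <= 0 or eps < 0:
        raise ValueError("delta, K, xi must be positive and eps nonnegative")
    return delta - 2.0 * K * xi * eps


def safe_radius(delta: float, K: float, xi: float) -> float:
    """Largest eps for which the margin above stays nonnegative."""
    if delta <= 0 or K <= 0 or xi <= 0:
        raise ValueError("delta, K, xi must be positive")
    return delta / (2.0 * K * xi)


if __name__ == "__main__":
    # Illustrative (hypothetical) values only.
    delta, K, xi = 0.5, 1.0, 2.0
    r = safe_radius(delta, K, xi)
    print("safe radius:", r)
    print("margin at half the radius:", payoff_margin(delta, K, xi, r / 2))
```

Doubling either $K$ or $\xi$ halves the radius, so the neighborhood of the limiting belief on which the prediction is guaranteed shrinks as payoffs become more sensitive to the parameter.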

B.2 Proof of Theorem 1 Part (b)

I begin by reviewing definitions from Chen et al. (2010) that will be used in the proof. For each player $i$, let $X_i^0 = \Theta$, and recursively for $k \geq 1$, define $X_i^k = \Theta \times \prod_{j \neq i} \Delta(X_j^{k-1})$. The space of $k$-th order beliefs for player $i$ is defined $T_i^k := \Delta(X_i^{k-1})$, noting that each $T_i^k = \Delta(\Theta \times T_{-i}^{k-1})$. (The sets $X^k$ defined in Section 3.1 can be identified with the sets $X^k$ defined in this way.) The uniform-weak metric on the universal type space $T_i^*$ is

$$d_i^{UW}(s_i, t_i) = \sup_{k \geq 1} d_i^k(s_i, t_i) \quad \forall s_i, t_i \in T_i^*$$

where $d^0$ is the supremum norm on $\Theta$ and recursively for $k \geq 1$, $d_i^k$ is the Prokhorov distance on $\Delta(\Theta \times T_{-i}^{k-1})$ induced by the metric $\max\{d^0, d_{-i}^{k-1}\}$ on $\Theta \times T_{-i}^{k-1}$. The uniform-weak topology on the universal type space is the metric topology induced by $d_i^{UW}$.

Footnote: This definition is slightly modified from Chen et al. (2010), where $d^0$ was the discrete metric on $\Theta$. The change reflects the difference that $\Theta$ was taken to be a finite set in Chen et al. (2010), while it is a compact and convex subset of Euclidean space here.

Lemma 7. Let $\mathcal{B}$ be a subset of $\{\mu^\infty\}^\epsilon$, and choose any $s_i \in T_i^{\mathcal{B}}$. Then, $d_i^{UW}(t_i^\infty, s_i) \leq \epsilon$.

Proof. For simplicity of notation, write $t_i$ for $t_i^\infty$. Here and elsewhere, $t_i^k$ denotes the $k$-th order belief of type $t_i$. It will be useful to define

$$T_i^{\mathcal{B},k} = \left\{s_i^k \in T_i^k \,\middle|\, s_i \in T_i^{\mathcal{B}}\right\}$$

for the set of all $k$-th order beliefs that are consistent with some type $s_i \in T_i^{\mathcal{B}}$. I will show that

$$d_P\left(T_i^{\mathcal{B},k}, t_i^k\right) := \sup_{s_i^k \in T_i^{\mathcal{B},k}} d_P(s_i^k, t_i^k) \leq \epsilon \quad \forall k \geq 1. \tag{15}$$

By definition $T_i^{\mathcal{B},1} = \mathcal{B}$, so the assumption $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$ immediately implies (15) for $k = 1$. Now suppose $d_P\left(T_i^{\mathcal{B},k}, t_i^k\right) \leq \epsilon$, and consider any measurable set $E \subseteq T^k$. If $t_i^k \in E$, then $t_i^{k+1}(E) = 1$ by definition of $t_i$. Also,

$$s_i^{k+1}(E^\epsilon) \geq s_i^{k+1}\left(\{t_i^k\}^\epsilon\right) \geq s_i^{k+1}\left(T_i^{\mathcal{B},k}\right) = 1$$

since $s_i \in T_i^{\mathcal{B}}$. So

$$t_i^{k+1}(E) \leq s_i^{k+1}(E^\epsilon) + \epsilon. \tag{16}$$

If $t_i^k \notin E$, then $t_i^{k+1}(E) = 0$ (again by definition of $t_i$), so (16) follows trivially. Thus

$$d_i^{k+1}(t_i, s_i) = \inf\left\{\delta \,\middle|\, t_i^{k+1}(E) \leq s_i^{k+1}(E^\delta) + \delta \ \ \forall \text{ measurable } E \subseteq T_i^k\right\} \leq \epsilon$$

and so $d_i^{UW}(t_i, s_i) = \sup_{k \geq 1} d_i^k(t_i, s_i) \leq \epsilon$, as desired.

Lemma 7 implies the subsequent corollary.

Corollary 4. Suppose Assumption 3 holds. Consider any sequence $z \in Z$ satisfying

$$\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(z_n), \mu^\infty) = 0$$

and choose any sequence of types $(s_i^n)_{n=1}^\infty$ with $s_i^n \in T_i^{\mathcal{B}(z_n)}$ for each $n \geq 1$. Then $\lim_{n \to \infty} d_i^{UW}(t_i^\infty, s_i^n) = 0$.

Now we will complete the proof of Theorem 1 Part (b). By Assumption 3, there is a set $Z^* \subseteq Z$ of $P$-measure 1 such that

$$\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(z_n), \mu^\infty) = 0 \quad \forall z \in Z^*. \tag{18}$$

Suppose towards contradiction that $\overline{p}^n(a_i) \not\to 0$. Then, there is a set $\widehat{Z} \subseteq Z$ with strictly positive $P$-measure such that for every $z \in \widehat{Z}$, there is a sequence of types $(t_i^n(z))_{n=1}^\infty$ where $t_i^n(z) \in T_i^{\mathcal{B}(z_n)}$ for every $n \geq 1$, and $a_i \in S_i^\infty[t_i^n(z)]$ for all $n$ sufficiently large. But since $Z^*$ has $P$-measure 1, it must be that $\widehat{Z} \cap Z^* \neq \emptyset$. Choose any $z$ from this intersection. Then, Corollary 4 and the display in (18) imply that $t_i^n(z) \to t_i^\infty$ in the uniform-weak topology. But rationalizability is upper hemi-continuous in the uniform-weak topology (Theorem 1, Chen et al. (2010)). So $a_i \notin S_i^\infty[t_i^\infty]$ implies $a_i \notin S_i^\infty[t_i^n(z)]$ for infinitely many $n$, a contradiction.

B.3 Proof of Proposition 5

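By assumption, $a_i$ is strictly rationalizable for type $t_i^\infty$, so $\delta^\infty > 0$. Applying Lemma 6,

$$\begin{aligned}
\underline{p}^n(a_i) &\geq P^n\left(\left\{z_n : \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta^\infty/(2K\xi)}\right\}\right) \\
&= P^n\left(\left\{z_n : \sup_{\mu \in M} d_P(\mu(z_n), \mu^\infty) \leq \delta^\infty/(2K\xi)\right\}\right) \\
&\geq 1 - \frac{2K\xi}{\delta^\infty}\, E\left(\sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty)\right)
\end{aligned}$$

using Markov’s inequality in the final line.

B.4 Proof of Proposition 2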

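Suppose $a_i$ is strictly rationalizable for player $i$ in the complete information game $\theta^\infty$, and let $\delta^\infty$ be as defined in (4). Then, there exists a family of sets $(R_j)_{j \in I}$ with $a_i \in R_i$, where for each player $j$ and action $a_j \in R_j$, there is a mixed strategy $\sigma_{-j} \in \Delta(A_{-j})$ satisfying $\sigma_{-j}[R_{-j}] = 1$, and

$$u_j(a_j, \sigma_{-j}, \theta^\infty) - u_j(a_j', \sigma_{-j}, \theta^\infty) \geq \delta^\infty \quad \forall a_j' \neq a_j.$$

Now consider an arbitrary set $C \subseteq \{\theta^\infty\}^\epsilon$ and a type $t_j$ with common certainty in the event that every player’s first-order belief assigns probability 1 to $C$. Write $\nu \in \Delta(\Theta)$ for the first-order belief of type $t_j$, and $\mu^\infty$ for the point mass at $\theta^\infty$. For any action $a_j' \neq a_j$,

$$\begin{aligned}
&\int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j', \sigma_{-j}, \theta)\, d\nu \\
&\quad = \int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty + \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty \\
&\qquad + \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\nu \\
&\quad \geq \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty - \left|\int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty\right| \\
&\qquad - \left|\int u_j(a_j', \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty\right| \\
&\quad \geq \delta^\infty - 2K \cdot d_P(\nu, \mu^\infty) \geq \delta^\infty - 2K\epsilon
\end{aligned}$$

using in the penultimate inequality that $g : \Theta \to U$ has Lipschitz constant $K$. Since this bound on the payoff difference holds across all actions $a_j' \neq a_j$, the action $a_j$ is a best reply to belief $\nu$ whenever $\epsilon \leq \delta^\infty/(2K)$.

This allows us to construct the lower bound

$$\begin{aligned}
\underline{p}^n(a_i) &\geq Q^n\left(\left\{z_n : C(z_n) \subseteq \{\theta^\infty\}^{\delta^\infty/(2K)}\right\}\right) \\
&= Q^n\left(\left\{z_n : \sup_{\theta \in C(z_n)} \|\theta - \theta^\infty\|_\infty \leq \delta^\infty/(2K)\right\}\right) \\
&\geq 1 - \frac{2K}{\delta^\infty}\, E\left(\sup_{\theta \in C(Z_n)} \|\theta - \theta^\infty\|_\infty\right)
\end{aligned}$$

using Markov’s inequality in the final line.

For Online Publication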

O.1 Proof of Claim 3

Fix an arbitrary $(\pi, q) \in (0, 1) \times (1/2, 1)$. Given data $z_n$, the posterior belief $\mu_{\pi,q}(z_n)$ assigns probability

$$\hat{v}(\pi, q, z_n) := 1 \Big/ \left(1 + \frac{1 - \pi}{\pi}\left(\frac{1 - q}{q}\right)^{n(2\bar{z}_n - 1)}\right) \tag{19}$$

to $v = 1$, where $\bar{z}_n = \frac{1}{n}\sum_{k=1}^n z_k$ denotes the average realization in the sequence $z_n$.

Suppose without loss that $v = 1$, and let $q^* \in (1/2, 1)$ be the true frequency of $z = 1$. By the strong Law of Large Numbers, there is a measure 1 set of sequences $Z^*$ satisfying $\lim_{n \to \infty} \left(\frac{1}{n}\sum_{k=1}^n z_k\right) = q^*$ for every $z = (z_1, z_2, \dots) \in Z^*$. The expression in (19) converges to 1 on this set for every $(\pi, q) \in (0, 1) \times (1/2, 1)$. So Assumption 2 is satisfied, and the limiting belief $\mu^\infty$ assigns probability 1 to $v = 1$. Since entering is not rationalizable for the Seller given common certainty in the event that all players assign probability 1 to $v = 1$, it follows that $\underline{p}(\infty) = \overline{p}(\infty) = 0$.

I now show that $\overline{p}(n)$ converges to 1 as $n \to \infty$. Fix an arbitrary $n$, and define

$$Z_n^\dagger = \{z_n \mid \bar{z}_n > 1/2\}$$

to be the set of length-$n$ sequences with majority realizations of $z = 1$. For every $z_n \in Z_n^\dagger$, the expression $((1 - q)/q)^{n(2\bar{z}_n - 1)}$ lies strictly between 0 and 1 on the domain $q \in (1/2, 1)$, while the image of $(1 - \pi)/\pi$ is all of $\mathbb{R}_+$. Thus, the display in (19) ranges from zero to 1; that is,

$$\{\hat{v}(\pi, q, z_n) : \pi \in (0, 1),\ q \in (1/2, 1)\} = (0, 1) \quad \forall z_n \in Z_n^\dagger.$$

It follows that for every $z_n \in Z_n^\dagger$, there exist pairs $(\pi, q), (\pi', q') \in (0, 1) \times (1/2, 1)$ satisfying $\hat{v}(\pi, q, z_n) < p < \hat{v}(\pi', q', z_n)$. Entering is rationalizable for the Seller with a type that assigns probability $\hat{v}(\pi', q', z_n)$ to the high value, and which assigns probability 1 to the Buyer assigning probability $\hat{v}(\pi, q, z_n)$ to the high value. So entering is weakly $\mathcal{B}(z_n)$-rationalizable for every $z_n \in Z_n^\dagger$, implying $\overline{p}^n(a_i) \geq P^n(Z_n^\dagger)$.

Again by the law of large numbers, the measure of datasets with majority realizations of $z = 1$ converges to 1 as $n \to \infty$; that is, $P^n(Z_n^\dagger) \to 1$. So $\lim_{n \to \infty} \overline{p}^n(a_i) = 1$, as desired.
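The closed form in (19) can be checked numerically against a direct application of Bayes' rule. The sketch below is illustrative only and makes one assumption explicit: each signal equals 1 with probability $q$ when $v = 1$ and with probability $1 - q$ when $v = 0$, which is the symmetric binary structure that the exponent $n(2\bar{z}_n - 1)$ suggests. The function names are hypothetical.

```python
def posterior_high(pi: float, q: float, n: int, zbar: float) -> float:
    """Posterior probability of v = 1 from the closed form in (19).

    pi:   prior probability of v = 1, in (0, 1)
    q:    assumed signal precision, in (1/2, 1)
    n:    number of signals
    zbar: average realization of the n binary signals
    """
    exponent = n * (2.0 * zbar - 1.0)
    return 1.0 / (1.0 + (1.0 - pi) / pi * ((1.0 - q) / q) ** exponent)


def posterior_direct(pi: float, q: float, ones: int, zeros: int) -> float:
    """Same posterior computed directly from Bayes' rule, for comparison."""
    like_high = q ** ones * (1.0 - q) ** zeros   # likelihood when v = 1
    like_low = (1.0 - q) ** ones * q ** zeros    # likelihood when v = 0
    return pi * like_high / (pi * like_high + (1.0 - pi) * like_low)


if __name__ == "__main__":
    # Six ones out of ten signals: a majority of z = 1.
    n, ones = 10, 6
    zbar = ones / n
    print(posterior_high(0.3, 0.7, n, zbar), posterior_direct(0.3, 0.7, ones, n - ones))
    # Holding the data fixed, extreme priors push the posterior to either side
    # of an intermediate price, which is the disagreement used in the proof.
    print(posterior_high(0.01, 0.7, n, zbar), posterior_high(0.99, 0.7, n, zbar))
```

With the data held fixed, varying the prior alone moves the posterior across any price $p \in (0, 1)$, which is why both sides of the inequality $\hat{v}(\pi, q, z_n) < p < \hat{v}(\pi', q', z_n)$ can be satisfied whenever a majority of signals equal 1.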

O.2 Proof of Corollary 1

First observe that δ “ β ´

1, since the action

Strong is δ -strictly rationalizable for every δ ă β ´ δ ě β ´

1. It remains to determine E ” sup θ P C p Z n q } θ ´ θ } ı . Write Z n for the(random) empirical mean of n signal realizations, and ˆ β x p z n q for the expectation of β given signals n and prior β „ N p x, q . Then, using standard formulas for updating to Gaussian signals: E ˜ sup x Pr´ η,η s | β ´ ˆ β x p Z n q| ¸ “ E „ max x Pr´ η,η s ˆˇˇˇˇ β ´ x ` nZ n n ` ˇˇˇˇ˙ We can further bound the RHS as follows: E „ max x Pr´ η,η s ˆˇˇˇˇ β ´ x ` nZ n n ` ˇˇˇˇ˙ ď E ˆˇˇˇˇ β ´ nZ n n ` ˇˇˇˇ˙ ` max x Pr´ η,η s ˇˇˇˇ xn ` ˇˇˇˇ “ E ˆˇˇˇˇ β ´ nZ n n ` ˇˇˇˇ˙ ` η {p n ` qď E `ˇˇ β ´ Z n ˇˇ˘ ` E ˆ Z n n ` ˙ ` η {p n ` q“ c nπ ` β ` ηn ` n observations froma Gaussian distribution (Geary, 1935). Finally, the map g : Θ Ñ U has Lipschitz constant 1.Applying Proposition 2, we have the desired bound. O.3 Proof of Corollary 2

Fix arbitrary $\underline{\pi}, \overline{\pi}, \underline{q}, \overline{q}$ satisfying $0 < \underline{\pi} < \overline{\pi} < 1/2 < \underline{q} < \overline{q} < 1$, and let $M$ be the set of learning rules identified with $(\pi, q) \in [\underline{\pi}, \overline{\pi}] \times [\underline{q}, \overline{q}]$. Entering is rationalizable for a Seller with common certainty that all players have first-order beliefs in $\mathcal{B}(z_n)$ if and only if there exist $\pi, \pi' \in [\underline{\pi}, \overline{\pi}]$ and $q, q' \in [\underline{q}, \overline{q}]$ satisfying

$$\hat{v}(\pi, q, z_n) < p < \hat{v}(\pi', q', z_n) \tag{20}$$

where $\hat{v}(\pi, q, z_n)$ is as defined in (19). Let $Z_n^*$ denote the set of all sequences $z_n$ satisfying (20).

Since the state space is binary, each empirical measure $\widehat{Q}(z_n) \in \Delta(\{0, 1\})$ can be identified with its average realization $\bar{z}_n$, which is also the probability assigned to $z = 1$. The KL-divergence between $\widehat{Q}(z_n)$ and the actual signal-generating distribution $Q = (q^*, 1 - q^*)$ is

$$D_{KL}(\widehat{Q}(z_n) \,\|\, Q) = q^* \log\left(\frac{q^*}{\bar{z}_n}\right) + (1 - q^*) \log\left(\frac{1 - q^*}{1 - \bar{z}_n}\right)$$

and this expression is monotonically increasing in $|\bar{z}_n - q^*|$. Thus, to minimize the KL-divergence, we seek the value of $\bar{z}_n$ closest to $q^*$ for which (20) is satisfied.

Suppose $\bar{z}_n > 1/2$. By assumption, $\overline{\pi} > p$ and $\overline{q} > 1/2$, so $\hat{v}(\overline{\pi}, \overline{q}, z_n) > p$. It remains to determine when $\hat{v}(\pi, q, z_n) < p$ is satisfied for some other $(\pi, q) \in M$. Since $\hat{v}(\pi, q, z_n)$ is monotonically increasing in both $\pi$ and $q$ for sequences $z_n$ satisfying $\bar{z}_n > 1/2$ (and is thus minimized at $(\underline{\pi}, \underline{q})$), a necessary and sufficient condition is $\hat{v}(\underline{\pi}, \underline{q}, z_n) < p$. Using (19), this inequality requires

$$1 \Big/ \left(1 + \frac{1 - \underline{\pi}}{\underline{\pi}}\left(\frac{1 - \underline{q}}{\underline{q}}\right)^{n(2\bar{z}_n - 1)}\right) < p$$

which can be rewritten

$$\bar{z}_n \leq \frac{1}{2}\left(1 + \frac{1}{n}\log_{(1 - \underline{q})/\underline{q}}\left(\frac{\underline{\pi}}{1 - \underline{\pi}} \cdot \frac{1 - p}{p}\right)\right) := z_n^*.$$

Since $z_n^* \cdot n$ need not be an integer, the distribution $(z_n^*, 1 - z_n^*)$ may not be achievable by any empirical measure $\widehat{Q}_n$ for finite $n$. Thus, $Q_n^*$ is instead given by $\left(\lfloor z_n^* \cdot n \rfloor / n,\ 1 - \lfloor z_n^* \cdot n \rfloor / n\right)$, and

$$D_{KL}(Q_n^* \,\|\, Q) = q^* \log\left(\frac{q^*}{\lfloor z_n^* \cdot n \rfloor / n}\right) + (1 - q^*) \log\left(\frac{1 - q^*}{1 - \lfloor z_n^* \cdot n \rfloor / n}\right).$$

Plugging in the given parameter values, and applying Proposition 3, yields the expression in the corollary.
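As a rough illustration of the last two steps, the sketch below computes the threshold $z_n^*$ and the resulting divergence for one set of made-up parameter values. It assumes the rounding of $z_n^* \cdot n$ is downward (so that the rounded frequency still satisfies the threshold), and it assumes $\underline{\pi} < p$ so that the threshold exceeds $1/2$. All function names and numbers are hypothetical and are not the parameter values used in the corollary.

```python
import math


def z_star(pi_low: float, q_low: float, p: float, n: int) -> float:
    """Threshold on the average signal below which some admissible
    posterior falls under the price p (assumes pi_low < p)."""
    base = (1.0 - q_low) / q_low                      # in (0, 1) since q_low > 1/2
    arg = (pi_low / (1.0 - pi_low)) * ((1.0 - p) / p)
    log_in_base = math.log(arg) / math.log(base)      # change of base
    return 0.5 * (1.0 + log_in_base / n)


def kl_to_truth(q_true: float, pi_low: float, q_low: float, p: float, n: int) -> float:
    """Divergence expression from the text, evaluated at the largest
    achievable empirical frequency not exceeding z_star."""
    k = math.floor(z_star(pi_low, q_low, p, n) * n)
    z = k / n
    if not 0.0 < z < 1.0:
        raise ValueError("rounded frequency must lie strictly inside (0, 1)")
    return q_true * math.log(q_true / z) + (1.0 - q_true) * math.log((1.0 - q_true) / (1.0 - z))


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print("threshold:", z_star(pi_low=0.2, q_low=0.6, p=0.4, n=20))
    print("divergence:", kl_to_truth(q_true=0.7, pi_low=0.2, q_low=0.6, p=0.4, n=20))
```

For these values the threshold lies between 11/20 and 12/20, so the rounded frequency is 11/20; a dataset with eleven ones out of twenty supports a posterior below the price, while one with twelve does not.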

O.4 Examples Related to Theorem 1

Part (a) of Theorem 1 provides a sufficient condition for the confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$ to converge to certainty (namely, that $a_i$ is strictly rationalizable for type $t_i^\infty$), and Part (b) of Theorem 1 provides a necessary condition (namely, that $a_i$ is rationalizable for type $t_i^\infty$). The condition that $a_i$ is strictly rationalizable is not necessary, as I demonstrate in Section O.4.1, and the condition that $a_i$ is rationalizable is not sufficient, as I demonstrate in Section O.4.2.

In each of these examples, I assume (as in Section 5.1) that the limiting belief $\mu^\infty$ is degenerate at a limiting parameter $\theta^\infty$, and players have common certainty of shrinking neighborhoods of this parameter. That is, for every realization $z_n$, players have common certainty in the event that players have first-order beliefs with support on $C(z_n)$, where the support sets $C(z_n)$ satisfy Assumption 6.

O.4.1 Strict Rationalizability is Not Necessary

Consider the following complete information game

[Payoff matrix with entries depending on $\theta$; the entries are not recoverable.]

and suppose that the limiting belief is degenerate at $\theta^\infty = 1$. Then, the action $a$ is strictly dominant for player 1 in the limiting complete information game, and also for all types with common certainty in the event that players have first-order beliefs with support on a small enough neighborhood of $\theta^\infty$. So Assumption 3 implies

$$\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)] = \{1\}.$$

But action $a$ is not strictly rationalizable for type $t_i^\infty$.

O.4.2 Rationalizability is not Sufficient

I show next that rationalizability of $a_i$ for type $t_i^\infty$ is not sufficient for the analyst’s confidence set for $a_i$ to converge to certainty. Section O.4.3 provides a simple example to this effect. Define $\Theta_{a_i}$ to be the set of parameter values $\theta$ such that $a_i$ is rationalizable for player $i$ in the complete information game indexed to $\theta$. If $\theta^\infty$ is on the boundary of $\Theta_{a_i}$, then common certainty of shrinking neighborhoods around $\theta^\infty$ does not guarantee rationalizability of $a_i$. More surprisingly, common certainty in arbitrarily small open sets within the interior of $\Theta_{a_i}$ also does not guarantee rationalizability of $a_i$, and I provide an example of this in Section O.4.4. (See also the working paper of Chen and Takahashi (2020) for a nice two-player example to this effect.)

O.4.3 $\theta^\infty$ is on the Boundary of $\Theta_{a_i}$

Consider the following two-player game, parametrized by $\theta \in [\underline{\theta}, \overline{\theta}]$ for some $\underline{\theta} < 0 < \overline{\theta}$:

[Payoff matrix over actions $\{a, b\}$ for each player; the profile $(a, a)$ yields payoffs $(\theta, \theta)$, and the remaining entries are not recoverable.]

Suppose $\theta^\infty = 0$, so that $a$ is rationalizable in the limiting complete information game, but not strictly rationalizable. It is straightforward to see that common certainty of shrinking neighborhoods of $\theta^\infty$ does not guarantee rationalizability of action $a$: for the type with common certainty of any $\theta < 0$, action $a$ is strictly dominated.

O.4.4 $\theta^\infty$ is in the Interior of $\Theta_{a_i}$

But even if $\theta^\infty$ is not on the boundary of the set $\Theta_{a_i}$, it may be that common certainty of a shrinking neighborhood of $\theta^\infty$ does not guarantee rationalizability of $a_i$. Consider the following four-player game. Players 1 and 2 choose between actions in $\{a, b\}$, and player 3 chooses between matrices from $\{l, r\}$. Their payoffs are:

[Two payoff matrices, labeled $(l)$ and $(r)$, over actions $\{a, b\}$ for players 1 and 2; the entries are not recoverable.]

A fourth player predicts whether players 1 and 2 chose matching actions or mismatching actions. He receives a payoff of 1 if he predicts correctly (and 0 otherwise). Player 4’s action does not affect the payoffs of the other three players. In more detail: player 4 chooses between $\{\text{Match}, \text{Mismatch}\}$. His payoff from Match is 1 if players 1 and 2 choose the same action (both $a$ or both $b$) and 0 otherwise; his payoff from Mismatch is 1 if players 1 and 2 choose different actions ($a$ and $b$, or flipped), and 0 otherwise.

Let the state space $\Theta$, a Euclidean space, be the set of all payoff matrices given these actions, where the payoffs described above are a particular $\theta^\infty$. Match is clearly rationalizable for player 4 at $\theta^\infty$; it is also rationalizable for player 4 on a neighborhood of $\theta^\infty$ (in the Euclidean metric). To see this, suppose neither $l$ nor $r$ is strictly dominated for player 3; then, all actions are rationalizable for players 1-3, so Match is rationalizable for player 4. If either $l$ or $r$ is strictly dominated for player 3, then one of the following will be a rationalizable family: $\{l\} \times \{a\} \times \{a\} \times \{\text{Match}\}$, $\{l\} \times \{a, b\} \times \{a, b\} \times \{\text{Match}\}$, $\{r\} \times \{b\} \times \{b\} \times \{\text{Match}\}$, or $\{r\} \times \{a, b\} \times \{a, b\} \times \{\text{Match}\}$. Thus, Match is rationalizable for player 4.

Nevertheless, I will show existence of a sequence of types for player 4 with common certainty in increasingly small neighborhoods of $\theta^\infty$, given which Match fails to be rationalizable. Along this sequence, player 4 believes that $a$ is uniquely rationalizable for player 1, while $b$ is uniquely rationalizable for player 2, so the action Match is strictly dominated.

Define $\theta_\varepsilon^1$ to be the following perturbation of the payoff matrix $\theta^\infty$ (with player 4’s payoffs unchanged):

[Perturbed payoff matrices $(l)$ and $(r)$, display (22), with some entries reduced by $\varepsilon$; the entries are not recoverable.]

Let $\theta_\varepsilon^2$ correspond to the following payoff matrix (again with player 4’s payoffs unchanged):

[Perturbed payoff matrices $(l)$ and $(r)$ with some entries reduced by $\varepsilon$; the entries are not recoverable.]

Let $\varepsilon > 0$. If player 1 has common certainty in the state $\theta_\varepsilon^1$, then $a$ is his uniquely rationalizable action: $l$ strictly dominates $r$ for player 3, given which $a$ strictly dominates $b$ for player 1. By a similar argument, if player 2 has common certainty in the state $\theta_\varepsilon^2$, then $b$ is his uniquely rationalizable action. These statements hold for $\varepsilon$ arbitrarily small. Construct a sequence of types $(t^{\varepsilon_n})$ for player 4, where each type $t^{\varepsilon_n}$ has common certainty that player 1 has common certainty in the state $\theta_{\varepsilon_n}^1$ and player 2 has common certainty in the state $\theta_{\varepsilon_n}^2$. Then, player 4 of type $t^{\varepsilon_n}$ has common certainty in an $\varepsilon_n$-neighborhood of $\theta^\infty$, but only one rationalizable action: Mismatch. Take $\varepsilon_n \to 0$ (with each $\varepsilon_n > 0$) and the desired conclusion obtains: rationalizability of Match holds at $\lim_{n \to \infty} \varepsilon_n$ but fails to hold arbitrarily far out along the sequence $\varepsilon_n$.

O.5 Extension to Common p-Belief

For each $q \in [0, 1]$, define:

$$\underline{p}^{n,q}(a_i) = P^n\left(\left\{z_n : a_i \in S_i^\infty[t_i]\ \ \forall t_i \in T_i^{\mathcal{B},q}\right\}\right)$$

where $q = 1$ corresponds to the definition of $\underline{p}^n(a_i)$ given in the main text.

Proposition 5.

Suppose $a_i$ is strictly rationalizable for type $t_i$, and define

$$\delta := \sup\{\delta' : a_i \text{ is } \delta'\text{-strictly rationalizable for type } t_i\},$$

noting that this quantity is strictly positive. Define

$$M := \sup_{a, a' \in A,\ \theta, \theta' \in \Theta,\ j \in I} \left| u_j(a, \theta) - u_j(a', \theta') \right|. \tag{24}$$

Then, for every $n \geq 1$ and $q > M/(\delta + M)$,

$$p^{n,q}(a_i) \geq 1 - \frac{M \xi q}{\delta q - (1 - q) M}\, E\left( \sup_{\mu \in \mathcal{M}} d_P(\mu(Z_n), \mu) \right).$$

Proof.

I first demonstrate a lemma analogous to Lemma 6.

Lemma 8.

Suppose $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i$. Let $B \subseteq \Delta(\Theta)$ be any set satisfying

$$\sup_{\nu \in B} d_P(\nu, \mu) \leq \frac{\delta q - (1 - q) M}{M \xi q},$$

where $M$ is as defined in (24) and $\xi$ is as defined in (5). Then $a_i$ is rationalizable for all types $t_i \in T_i^{B,q}$.

Proof. The proof follows along similar lines to the proof of Lemma 6. Fix $\epsilon > 0$, and consider an arbitrary set $B \subseteq \{\mu\}_{\epsilon}$. I will show that $a_i$ is rationalizable for all types $t_i \in T_i^{B,q}$ when $\epsilon$ is sufficiently small and $q$ is sufficiently large.

By assumption, action $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i$. This implies that there exists a family of sets $(R_j)_{j \in I} \subseteq \prod_{j \in I} A_j$, where $a_i \in R_i$, and for every $a_j \in R_j$ there exists a $\sigma_j : \Theta \to \Delta(A_{-j})$ satisfying $\operatorname{supp} \sigma_j(\theta) \subseteq R_{-j}$ for all $\theta \in \Theta$ and

$$\int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu - \int_\Theta u_j(a_j', \sigma_j(\theta), \theta)\, d\mu \geq \delta \quad \forall a_j' \neq a_j. \tag{25}$$

Partition the set of types $T_j^{B,q}$ into those types whose first-order beliefs belong to $B$,

$$T_j^1 := \left\{ t_j \in T_j^{B,q} \mid \text{the first-order belief of } t_j \text{ belongs to } B \right\} \quad \forall j \in I,$$

and all remaining types $T_j^2 := T_j^{B,q} \setminus T_j^1$. By construction, every type in $T_j^{B,q}$ assigns probability at least $q$ to $T_{-j}^1$.

I will now show that there exists a family of sets $(V_j[t_j])_{j \in I,\, t_j \in T_j^{B,q}}$ with the property that for each $k \geq 1$, player $j$, type $t_j \in T_j^{B,q}$, action $a_j \in R_j$, and mixed strategy $\alpha_j \in \Delta(A_j \setminus \{a_j\})$, there exists a measurable $\sigma_{-j} : \Theta \times T_{-j}^{B,q} \to \Delta(A_{-j})$ with

(1) $\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq V_{-j}[t_{-j}]$ for all $(\theta, t_{-j}) \in \Theta \times T_{-j}^{B,q}$;

(2) $\int_{\Theta \times T_{-j}^{B,q}} \left[ u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta) - u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta) \right] t_j[d\theta \times dt_{-j}] \geq 0$;

and such that $V_j[t_j] = R_j$ for every player $j$ and type $t_j \in T_j^1$. Since $a_i \in R_i$ by design, it follows from Proposition 4 that for any type $t_i \in T_i^{B,q}$, the action $a_i \in S_i^k[t_i]$ for every $k$, and hence $a_i \in S_i[t_i]$, as desired.

Fix an arbitrary player $j$, $a_j \in R_j$, type $t_j \in T_j^{B,q}$, and $\alpha_j \in \Delta(A_j \setminus \{a_j\})$. Define $a_{-j} : \Theta \to A_{-j}$ to satisfy

$$a_{-j}(\theta) \in \operatorname*{argmax}_{a_{-j}' \in R_{-j}} \left( u_j(a_j, a_{-j}', \theta) - u_j(\alpha_j, a_{-j}', \theta) \right) \quad \forall \theta \in \Theta.$$

Further define

$$h(\theta) := u_j(a_j, a_{-j}(\theta), \theta) - u_j(\alpha_j, a_{-j}(\theta), \theta) \quad \forall \theta \in \Theta. \tag{26}$$

Let $\sigma_{-j} : \Theta \times T_{-j}^{B,q} \to \Delta(A_{-j})$ be any conjecture with the property that $\sigma_{-j}(\theta, t_{-j})$ is a point mass at $a_{-j}(\theta)$ for every $(\theta, t_{-j}) \in \Theta \times T_{-j}^1$. The conjectures $\sigma_{-j}(\theta, t_{-j})$ for $(\theta, t_{-j}) \notin \Theta \times T_{-j}^1$ are not explicitly specified. By definition,

$$\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j} \quad \forall (\theta, t_{-j}) \in \Theta \times T_{-j}^1.$$
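Then, splitting the domain into $\Theta \times T_{-j}^1$ (opponent types whose first-order beliefs belong to $B$) and $\Theta \times T_{-j}^2$ (the remaining types),

$$\begin{aligned}
& \int_{\Theta \times T_{-j}^{B,q}} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^{B,q}} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \\
&= \left( \int_{\Theta \times T_{-j}^1} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] + \int_{\Theta \times T_{-j}^2} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \right) \\
&\quad - \left( \int_{\Theta \times T_{-j}^1} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] + \int_{\Theta \times T_{-j}^2} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \right) \\
&= \left( \int_{\Theta \times T_{-j}^1} u_j(a_j, a_{-j}(\theta), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^1} u_j(\alpha_j, a_{-j}(\theta), \theta)\, t_j[d\theta \times dt_{-j}] \right) \\
&\quad + \left( \int_{\Theta \times T_{-j}^2} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^2} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \right) \\
&= \int_{\Theta \times T_{-j}^1} h(\theta)\, t_j[d\theta \times dt_{-j}] \\
&\quad + \left( \int_{\Theta \times T_{-j}^2} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^2} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] \right) \\
&\geq \int_{\Theta \times T_{-j}^1} h(\theta)\, t_j[d\theta \times dt_{-j}] - M \int_{\Theta \times T_{-j}^2} t_j[d\theta \times dt_{-j}],
\end{aligned}$$

where the final inequality follows from the definitions of $h$ (as given in (26)) and $M$ (as given in (24)).

In the proof of Lemma 6, we showed that the inequality in (25) implies $\int_\Theta h(\theta)\, t_j[d\theta] \geq \delta - M \xi \epsilon$. Since moreover $t_j$ assigns probability at least $q$ to the set $T_{-j}^1$, we can further bound

$$\int_{\Theta \times T_{-j}^1} h(\theta)\, t_j[d\theta \times dt_{-j}] - M \int_{\Theta \times T_{-j}^2} t_j[d\theta \times dt_{-j}] \geq q(\delta - M \xi \epsilon) - (1 - q) M.$$

Thus, $a_j$ is a best reply for type $t_j$ so long as $q(\delta - M \xi \epsilon) - (1 - q) M \geq 0$, that is, so long as

$$\epsilon \leq \frac{\delta q - (1 - q) M}{M \xi q}.$$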
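As a check on the rearrangement in the last step, the threshold on $\epsilon$ and the restriction on $q$ in Proposition 5 can be derived in a few lines. This is a sketch that uses only the symbols already defined ($\delta$, $M$, $\xi$, $q$, $\epsilon$) and assumes $M \xi q > 0$:

```latex
\begin{align*}
q(\delta - M\xi\epsilon) - (1-q)M \ge 0
  &\iff q M \xi \epsilon \le \delta q - (1-q) M \\
  &\iff \epsilon \le \frac{\delta q - (1-q) M}{M \xi q}, \\[4pt]
\delta q - (1-q) M > 0
  &\iff q(\delta + M) > M
  \iff q > \frac{M}{\delta + M}.
\end{align*}
```

The second line shows why the proposition requires $q > M/(\delta + M)$: below that level the threshold is not positive, so no $\epsilon > 0$ satisfies it. At $q = 1$ the threshold reduces to $\delta / (M \xi)$, which is the condition $\delta - M \xi \epsilon \geq 0$.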

This bound holds across all players $j$ and actions $a_j \in R_j$. Thus

$$\begin{aligned}
p^{n,q}(a_i) &\geq P^n\left( \left\{ z_n : \sup_{\mu \in \mathcal{M}} d_P(\mu(z_n), \mu) \leq \frac{\delta q - (1 - q) M}{M \xi q} \right\} \right) \\
&\geq 1 - \frac{M \xi q}{\delta q - (1 - q) M}\, E\left( \sup_{\mu \in \mathcal{M}} d_P(\mu(Z_n), \mu) \right),
\end{aligned}$$

using Markov's inequality in the final line.
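To get a feel for the magnitudes involved, the lower bound can be evaluated numerically. The following is a minimal Python sketch and not part of the paper: the function name `confidence_lower_bound` and its parameter names are illustrative, the expected distance $E(\sup_{\mu} d_P(\mu(Z_n), \mu))$ is taken as a given input rather than computed, and the clipping at zero is added here because the bound is vacuous when it is negative.

```python
def confidence_lower_bound(delta, M, xi, q, expected_distance):
    """Evaluate 1 - M*xi*q / (delta*q - (1 - q)*M) * expected_distance.

    delta: degree of strict rationalizability (delta > 0)
    M: bound on payoff differences, as in (24)
    xi: the constant referred to as (5) in the text
    q: level of common q-belief, must exceed M / (delta + M)
    expected_distance: assumed value of E[sup_mu d_P(mu(Z_n), mu)]
    """
    denominator = delta * q - (1.0 - q) * M
    if denominator <= 0:
        # Equivalent to q <= M / (delta + M): the bound is not defined here.
        raise ValueError("requires q > M / (delta + M)")
    bound = 1.0 - (M * xi * q / denominator) * expected_distance
    # A probability is never below zero, so a negative bound carries no information.
    return max(0.0, bound)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for q in (0.7, 0.9, 1.0):
        print(q, confidence_lower_bound(0.5, 1.0, 2.0, q, 0.01))
```

With the other inputs held fixed, the bound weakens as $q$ falls toward $M/(\delta + M)$, since the denominator $\delta q - (1 - q) M$ shrinks to zero.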