Games of Incomplete Information Played By Statisticians
Annie Liang ∗ July 13, 2020
Abstract
Players are statistical learners who learn about payoffs from data. They may interpret the same data differently, but have common knowledge of a class of learning procedures. I propose a metric for the analyst’s “confidence” in a strategic prediction, based on the probability that the prediction is consistent with the realized data. The main results characterize the analyst’s confidence in a given prediction as the quantity of data grows large, and provide bounds for small datasets. The approach generates new predictions, e.g. that speculative trade is more likely given high-dimensional data, and that coordination is less likely given noisy data.
∗ Department of Economics, University of Pennsylvania. I am especially grateful to Drew Fudenberg for his guidance on this paper. This paper also benefitted from useful comments and suggestions by Jetlir Duraj, Siddharth George, Ben Golub, Jerry Green, Philippe Jehiel, Scott Kominers, David Laibson, Jonathan Libgober, Erik Madsen, Stephen Morris, Sendhil Mullainathan, Mariann Ollar, Harry Pei, Andrei Shleifer, Dov Samet, Tomasz Strzalecki, Satoru Takahashi, Anton Tsoy, and Muhamet Yildiz.

Predictions of play in incomplete information games depend crucially on the beliefs of the agents, but we rarely know what those beliefs are. A standard approach to modeling beliefs assumes that players share a common prior belief over states of the world, and form posterior beliefs using Bayesian updating. Under this approach, posterior beliefs that are commonly known must be identical (Aumann, 1976), and repeated communication of beliefs eventually leads to agreement (Geanakoplos and Polemarchakis, 1982). These implications conflict not only with considerable empirical evidence of public and persistent disagreement, but also with the more basic experience that individuals interpret the same information in different ways.

Footnote: For example, in financial markets, individuals publicly disagree in their interpretations of earnings announcements (Kandel and Pearson, 1995), valuations of financial assets (Carlin et al., 2013), and forecasts for inflation (Mankiw et al., 2004).

This paper relaxes the assumption of a common prior by supposing that players are statistical learners: they form beliefs about payoff-relevant parameters based on data, but potentially disagree on how to interpret that data. I define a learning rule to be any function that maps data (a sequence of signals) into a belief distribution over payoff-relevant parameters (a first-order belief). Players have common knowledge of some set of reasonable learning rules; for example, these learning rules may correspond to Bayesian updating from a set of prior beliefs, or they may be maps from data to beliefs based on frequentist estimates for the unknown parameter. The special case of a singleton Bayesian learning rule returns the common prior assumption, but in general, the set of learning rules will produce different beliefs from the same data, which I interpret as the set of plausible beliefs. I impose a key restriction to structure the approach: for any realization of the data, each player’s own belief about the parameter is a plausible belief; they assign probability 1 to all other players holding plausible beliefs; and so forth. Since the set of plausible beliefs is endogenous to the (random) data, so too are the strategic predictions that are consistent with this belief restriction.

The main contribution of the paper is a proposed metric for the analyst’s “confidence” in a strategic prediction in this game. Specifically, consider the prediction that a given action is rationalizable. I quantify the analyst’s confidence in this prediction via a confidence set: the upper bound of the confidence set is the probability that the action is rationalizable given some belief satisfying the belief restriction, and the lower bound is the probability that the prediction holds for all beliefs satisfying the restriction. Thus, if both of these probabilities are equal to one, the analyst has maximal certainty that the action is rationalizable, and if they are both zero, he has maximal certainty that it is not. In the intermediate cases, there is uncertainty about whether the action is rationalizable, and the confidence sets present a way to quantify the extent of that uncertainty.

The main results in this paper characterize various properties of these confidence sets, beginning with their asymptotic behaviors as the quantity of data grows large. I first show that even if an action is strictly rationalizable in the limiting infinite-data game, the analyst’s confidence sets may be very different from $\{1\}$ for arbitrarily large quantities of data. Roughly, this is because the rate of convergence under different learning rules cannot be uniformly bounded, so it is always possible that some learning rule produces a belief that is very different from the others. If, however, the set of learning rules satisfies a uniform convergence property that I describe, then the following statements hold: if an action is strictly rationalizable at the limit, then the analyst’s confidence set must converge to $\{1\}$ as the quantity of data gets large, and if an action is not rationalizable in the limit, the analyst’s confidence set converges to $\{0\}$. (The intermediate case, in which actions are rationalizable but not strictly rationalizable, is more subtle; see Section 4 for further detail.)

Next, I consider the setting of small sample sizes, and bound the extent to which the analyst’s confidence set differs from its asymptotic limit. These bounds depend on properties of the learning environment (specifically, the quantity of data, and how fast the different learning rules jointly recover the payoff-relevant parameter) as well as on a cardinal measure for how strict the solution is at the limit. I apply these bounds to characterize confidence sets for example games and sets of learning rules. They allow us to obtain specific, quantitative statements about confidence away from the limit of infinite data.

In some cases it is possible to go beyond these bounds, and to fully characterize the confidence set. Two such examples are given in Section 2, where I consider a trade game and a coordination game. I show that the proposed approach generates novel comparative statics in these games: speculative trade is predicted to be more plausible when agents learn from higher-dimensional data, and coordination is predicted to be less likely when agents observe noisy data.
These predictions, which hold even though data is public and common, are difficult to produce under the assumption of a common prior.

This paper builds on a literature on the role of the common prior assumption in economic theory. (See Morris (1995) for a survey of key conceptual points.) Here I focus on an argument that even if learning does produce common priors in the long run, this does not imply that we should see common priors given a finite quantity of data, especially if that data is complex and hard to interpret. Rather than taking the limiting common prior as one that is already reached, I ask what predictions we can make while data is still being accumulated. The confidence sets introduced in this paper provide a quantitative account of whether predictions implied by a (limiting) common prior also hold for small data sets.

Footnote: Other reasons that the common prior is tenuous include that the data itself may lead to incomplete learning if it is endogenously acquired, and that convergence of individual beliefs need not imply convergence in beliefs about beliefs (Cripps et al., 2008; Acemoglu et al., 2015).

This paper also contributes to a literature on the robustness of strategic predictions to the specification of player beliefs (Rubinstein, 1989; Dekel et al., 2006; Weinstein and Yildiz, 2007; Chen et al., 2010; Ely and Peski, 2011) and equilibrium selection in incomplete information games (Carlsson and van Damme, 1993; Kajii and Morris, 1997). At a technical level, the actions that are rationalizable under the belief restriction that I impose are $\Delta$-rationalizable strategies of Battigalli and Siniscalchi (2003), where the set $\Delta$ of first-order belief restrictions is endogenous to a learning process. The permitted types converge in the uniform-weak topology, as proposed and characterized in Chen, di Tillio, Faingold, and Xiong (2010) and Chen, di Tillio, Faingold, and Xiong (2017), and I use results about this topology to prove several of the main results.

Footnote: Such belief restrictions have also been usefully applied in the work of Ollar and Penta (2017), among others.

Conceptually, the goals of the present paper differ from the previous literature in several respects. First, my focus here is not on equilibrium selection (choosing one equilibrium from a set of many) but rather on providing a metric for confidence in a given prediction. Second, in contrast to the many binary or “qualitative” notions of robustness that have been proposed, this paper delivers a quantitative metric. Third, while the literature has primarily considered robustness to perturbations of beliefs, I am interested here also in predictions that we may make for beliefs that are far from the limiting beliefs. To discipline these beliefs, I endogenize the type space using a statistical learning foundation for belief formation. This aspect of the paper, which combines learning foundations with game theoretic implications, connects to papers such as Dekel et al. (2004), Esponda (2013), and Steiner and Stewart (2008), among others. The modeling of agents as “statisticians” or “machine learners” relates to a growing literature in decision theory (Gilboa and Schmeidler, 2003; Gayer et al., 2007; Al-Najjar, 2009; Al-Najjar and Pai, 2014) and game theory (Jehiel, 2005; Spiegler, 2016; Olea et al., 2017; Salant and Cherry, 2020; Haghtalab et al., 2020).

Footnote: Brandenburger et al. (2008) and Battigalli and Prestipino (2011) also motivate small type structures as emerging from learning, although they do not explicitly model a dynamic learning process.
In this section, I use the proposed approach to revisit some classic examples: a two-player coordination game and a two-player trading game. In each of these games, the assumption that players share a common prior has strong implications for strategic play. I show how we can relax this assumption by endogenizing disagreement based on two new primitives: a data-generating process and a set of learning rules. This approach allows us to relate confidence in a strategic prediction to primitives of the learning environment.
A Seller owns a good of unknown value $v \in \{0, 1\}$. He can either enter a market at cost $c$, or exit and keep the good. Entering leads to a simultaneous interaction with a Buyer, where the Seller chooses whether to sell the good at a (pre-set) posted price $p$, and the Buyer chooses whether to purchase the good at that price. The game and its payoffs are described in Figure 1 below.

The Seller first chooses Enter or Exit. Exit ends the game with payoffs $(0, v)$. Enter leads to the following simultaneous-move game (payoffs listed as Buyer, Seller):

                Sell                 Don’t Sell
Buy             $v - p,\ p - c$      $0,\ v - c$
Don’t Buy       $0,\ v - c$          $0,\ v - c$

Figure 1: Description of Game

Suppose that the cost $c$ and price $p$ satisfy $0 < c < p < 1$, so the Seller prefers to sell at the low value and prefers to keep the good at the high value. If players share a common prior about $v$, then entering is not rationalizable for the Seller in this game, so trade will not occur (similar to the no-trade theorem of Milgrom and Stokey (1982)).

I suppose instead that players form beliefs based on a common data set of past goods and their valuations, but draw inferences from this data in different ways. Each good in the data is described by $m$ observable attributes with values normalized to lie in the interval $[-1, 1]$. Typical attributes are denoted $x \in X := [-1, 1]^m$, and the value of a good is a deterministic function $f : X \to \{0, 1\}$ of its attributes. The public data $z_n = \{(x_i, f(x_i))\}_{i=1}^{n}$ is a sequence of $n$ goods with attributes $x_i$ drawn from a uniform distribution on $X$, and the values of these goods.

Players have common knowledge that $f$ belongs to a certain family of functions $F$, which we can think of as the relevant set of models. For simplicity, let $F$ be the set of rectangular classification rules, i.e. functions $f_R(x) = \mathbb{1}(x \in R)$ indexed to hyper-rectangles $R$ in $[-1, 1]^m$. The attributes of the Seller’s good are known to be the zero-vector, denoted $x_S$, so the agents’ beliefs over $F$ determine their beliefs about $v = f(x_S)$, the value of the Seller’s good.

Agents may have different prior beliefs $\pi \in \Delta(F)$ over the set of models. Fixing a prior $\pi$, the posterior belief $\pi(f \mid z_n)$ given data $z_n$ is a re-normalization of the prior over all rules consistent with the observed data (see Figure 2), and the posterior probability assigned to the Seller’s good having a high value is $\pi(\{f : f(x_S) = 1\} \mid z_n)$.
Define $B(z_n) \subseteq \Delta(\{0, 1\})$ to be the set of all posterior beliefs about $v$ that are consistent with Bayesian updating from some prior $\pi \in \Delta(F)$ and the data $z_n$.

Footnote (on the no-trade result): If trade does not occur subsequently, then the Seller receives $v - c$ from entering but $v > v - c$ from exiting. Thus, entering can be rationalized only if trade subsequently occurs. But trade can occur only if the Buyer believes that $E(v) \geq p$ while the Seller believes that $E(v) \leq p$, implying $E(v) = p$ under their shared belief. The Seller can improve on his expected payoff of $p - c$ by choosing to exit.

Footnote (on rectangular classification rules): For example, whether all attributes fall into an “acceptable” range, as judged by a downstream buyer.

Footnote (on prior beliefs): In more detail, the state space is $\Omega = F \times (X \times \{0, 1\})$ and the payoff-relevant parameter is $f(x_S)$. Agents have a prior belief $q$ over $\Omega$, where $\mathrm{marg}_F\, q$ has support on the set of rectangular rules. Conditional on the true model being $f$, the data-generating process over $X \times \{0, 1\}$ is $q((x, y) \mid f) = g(x) \cdot \mathbb{1}(y = f(x))$, where $g$ is the uniform density on $X$.

Figure 2: The circles represent the observed data. Each good is described by a vector in $[-1, 1] \times [-1, 1]$. The circle is black if its valuation is 1 and gray if its valuation is 0. A rule is consistent with the data if it correctly predicts the valuation for each observation. Two rectangular classification rules are depicted: each predicts ‘1’ for goods in the shaded region and ‘0’ for goods outside. Both are consistent with the observed data.

Without making detailed assumptions about the priors that players hold, or the beliefs that players have about the priors of other players, I place the following restriction.

Assumption 1 (Restriction on Beliefs). For every realized $z_n$, players have common certainty in the event that all players have a first-order belief in $B(z_n)$.

That is, players are assumed to have beliefs consistent with Bayesian updating from some prior on $F$; they assign probability 1 to the other player having a posterior belief in this set, and so on.

The analyst’s (ex-ante) confidence in predicting that entering is rationalizable for the Seller is defined as follows. For every number of observations $n$, let $\overline{p}_n$ be the measure of data sets $z_n$ such that entering is rationalizable for some belief satisfying Assumption 1, and let $\underline{p}_n$ be the measure of data sets $z_n$ such that entering is rationalizable for all beliefs satisfying Assumption 1. Then the set $[\underline{p}_n, \overline{p}_n]$ represents a “confidence set” that describes how certain the analyst should be in predicting that entering is rationalizable, when agents observe $n$ common data points. The extreme case of $\underline{p}_n = \overline{p}_n = 1$ corresponds to certainty that entering is rationalizable, and $\underline{p}_n = \overline{p}_n = 0$ to certainty that it is not.

Claim 1.
The probability $\underline{p}_n = 0$ for every $n \in \mathbb{Z}_+$. Additionally:
(a) Fixing any number of attributes $m \in \mathbb{Z}_+$, the probability $\overline{p}_n \to 0$ as $n \to \infty$.
(b) Fixing any number of data observations $n \in \mathbb{Z}_+$, the probability $\overline{p}_n$ is increasing in the number of attributes $m$, and $\overline{p}_n \to 1$ as $m \to \infty$.

Footnote (on Assumption 1): See Section 3.2 for the precise definition of common certainty in such an event.

For every $n$, the probability that entering is rationalizable for all beliefs satisfying Assumption 1 is zero. But the probability that entering is rationalizable for some belief satisfying Assumption 1 varies depending on $n$ and $m$. Part (a) says that as the number of observations $n$ grows large, the probability $\overline{p}_n$ vanishes to zero, implying that the confidence set converges to a degenerate interval at zero. The infinite-data limit thus returns the prediction of “no trade” consistent with the common prior assumption. But if the quantity of data is finite, and the number of attributes is large, then $\overline{p}_n$ can be substantially greater than zero. Indeed, Part (b) of the claim says that this probability $\overline{p}_n$ can be made arbitrarily close to 1 by increasing the dimensionality of the learning problem via choice of large $m$. This reflects that in a high-dimensional learning problem, many classification rules are likely to be consistent with the data, including some that yield conflicting predictions. Thus, “rational” disagreement is possible and even likely.

An exact characterization of $[\underline{p}_n, \overline{p}_n]$ is given in Lemma 3 in the appendix. Using this lemma, I plot in Figure 3 the behavior of these confidence sets for different values of $m$, assuming the true function to be the indicator $\mathbb{1}(x \in R)$ of a fixed hyper-rectangle $R \subseteq [-1, 1]^m$. Although trade is not predicted in the limiting game, it is a plausible outcome if the number of data observations is small and the number of attributes is large. As one specific example: if there are 10 attributes and players observe only 20 goods, then the confidence set $[\underline{p}_n, \overline{p}_n]$ has lower bound 0 and an upper bound close to 1. That is, with near certainty, entering will be rationalizable for the Seller given the realized data for some belief satisfying Assumption 1. While I chose a simple set of learning rules for the purpose of obtaining exact expressions for the confidence set, in practice confidence sets like those depicted in Figure 3 can be simulated for more complex kinds of learning procedures.
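As a rough illustration of how such a simulation might look, the sketch below estimates by Monte Carlo how often the data fail to pin down the value of the Seller's good. It rests on assumptions that are not taken from the paper: the true rule is the rectangle $[-a, a]^m$ with a hypothetical half-width `half_width`, and the frequency of the "value is disputed" event is used as a stand-in for the upper bound of the confidence set rather than the appendix characterization. Under these assumptions the true rule always predicts 1 at the origin, so a consistent rule predicting 0 exists exactly when the bounding box of the high-value observations excludes the origin.

```python
import random


def value_is_disputed(points, labels):
    """Return True if some consistent rectangle excludes the origin.

    Assumes the true rectangle contains the origin, so a consistent rule
    predicting 1 at the origin always exists. The smallest consistent
    rectangle is the bounding box of the high-value observations.
    """
    positives = [x for x, y in zip(points, labels) if y == 1]
    if not positives:
        return True  # an arbitrarily small rectangle away from the origin fits
    for j in range(len(positives[0])):
        lo = min(x[j] for x in positives)
        hi = max(x[j] for x in positives)
        if lo > 0 or hi < 0:
            return True  # bounding box misses the origin in coordinate j
    return False


def disagreement_probability(n, m, half_width=0.5, trials=2000, seed=0):
    """Monte Carlo frequency of data sets that leave the value disputed.

    half_width is a hypothetical parameter of this sketch: the true rule
    is taken to be the indicator of [-half_width, half_width]^m.
    """
    rng = random.Random(seed)
    disputed = 0
    for _ in range(trials):
        points = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
        labels = [int(all(abs(c) <= half_width for c in x)) for x in points]
        disputed += value_is_disputed(points, labels)
    return disputed / trials
```

Calling `disagreement_probability(20, 10)` and `disagreement_probability(200, 1)` gives a feel for the comparative statics in Claim 1: the estimated frequency should be high with many attributes and few observations, and close to zero with one attribute and many observations.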
Figure 3: The shaded area depicts confidence sets $[\underline{p}_n, \overline{p}_n]$ for the rationalizability of entering given $n$ common observations. Panels: (a) $m = 1$, (b) $m = 10$, (c) $m = 100$; the horizontal axis is the number of observations $n$.

2.2 Coordination

I now turn to a second classic game, where (in contrast to the previous example) the prediction of interest holds in the infinite-data limit. I characterize the robustness of that prediction when players have beliefs based on small quantities of data.

A contagious disease spreads across a population at an unknown speed. Two states are connected by travel, and their governors choose between implementing a strong or a weak lockdown policy in their states to slow the spread of the disease. Implementation of the strong lockdown policy entails a large economic cost, but if the states coordinate on this policy, then the disease will be suppressed with certainty.

The two governors form beliefs about the growth rate of the disease based on a public data set $\{(t, y_t)\}_{t=1}^{n}$, which consists of the number of reported cases of the disease, $y_t$, on days $t = 1, 2, \ldots, n$. The number of reported cases grows exponentially according to

$$\log y_t = \beta t + \varepsilon_t$$

where the noise term $\varepsilon_t$ has a normal distribution with known mean $\mu$ and known standard deviation $\sigma > 0$, while the growth rate $\beta$ is not known. Payoffs are given by the following matrix:

            Strong                    Weak
Strong      $-1,\ -1$                 $-1 - \beta,\ -\beta$
Weak        $-\beta,\ -1 - \beta$     $-\beta,\ -\beta$

The economic cost of the strong lockdown is normalized to 1, and the cost of letting the disease progress without a strong lockdown is given by its growth rate $\beta$. A weak lockdown is strictly dominant if $\beta < 1$, but coordination on the strong lockdown is the Pareto-dominant Nash equilibrium if $\beta > 1$. The assumption of a common prior over $\beta$ restricts agents to identical posterior beliefs. This rules out, for example, the possibility that a governor chooses the weak policy, not because he infers from the data that the disease is low-risk, but because he believes that the other governor may make such an inference.

In practice, agents use statistical procedures to infer $\beta$ from the data. I thus relax the assumption of a common prior in the following way. Define $\hat{\beta}(z_n)$ to be the ordinary least-squares estimate of $\beta$ from the data $z_n := \{(t, \log y_t)\}_{t=1}^{n}$, and let $\phi_n$ be the constant such that

$$C(z_n) = \{\beta : |\beta - \hat{\beta}(z_n)| \leq \phi_n\}$$

is a $(1 - \alpha)$-confidence interval for $\beta$, where $\alpha \in (0, 1)$ is fixed. Suppose the governors have common certainty in the event that both of their first-order beliefs have support on $C(z_n)$; that is, each governor’s own belief about the value of $\beta$ has support in this confidence interval; they assign probability 1 to the other agent having such a belief; and so forth. This allows players to disagree given the data, but requires the size of that disagreement to be confined within the confidence interval. (The assumption that players assign common certainty is not crucial. Similar results go through if the players have common $p$-belief (Monderer and Samet, 1989) in the set for large $p$, so long as we bound the parameter space for $\beta$. See Section 6.)

Claim 2.
Suppose that the actual growth rate is fast ($\beta > 1$), so that the strong lockdown is rationalizable given complete information of the payoffs. Then, for every $\sigma > 0$, both $\underline{p}_n$ and $\overline{p}_n$ are increasing in $n$, while for every $n$, both $\underline{p}_n$ and $\overline{p}_n$ are decreasing in $\sigma$.

That is, the analyst gains confidence in predicting that the strong lockdown is rationalizable as the reporting noise $\sigma$ decreases, and the number of observations $n$ increases. Lemma 3 in the appendix explicitly characterizes $\underline{p}_n$ and $\overline{p}_n$. Using the expressions in this lemma, I plot in Figure 4 the behavior of these confidence sets for different levels of reporting noise $\sigma$.
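The comparative statics in Claim 2 can be illustrated with a short calculation. The sketch below is not the characterization from the appendix lemma; it is a minimal version under assumptions of my own: the noise has mean zero, the regression has no intercept, the interval $C(z_n)$ is the usual two-sided normal interval with known $\sigma$, and the strong lockdown is treated as rationalizable for some permitted belief when the upper endpoint of $C(z_n)$ exceeds 1, and for all permitted beliefs when the lower endpoint exceeds 1. The function name `confidence_set` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_set(n, sigma, beta, alpha=0.05):
    """Return (lower, upper) probabilities under the assumptions stated above.

    With log y_t = beta * t + eps_t and eps_t ~ Normal(0, sigma^2), the
    no-intercept OLS slope is Normal(beta, sigma^2 / sum_t t^2).
    """
    std_normal = NormalDist()
    sum_t_sq = n * (n + 1) * (2 * n + 1) / 6   # sum of t^2 for t = 1..n
    se = sigma / sqrt(sum_t_sq)                # std. deviation of the OLS slope
    z = std_normal.inv_cdf(1 - alpha / 2)      # two-sided critical value
    phi_n = z * se                             # half-width of C(z_n)
    # P(beta_hat - phi_n > 1): the whole interval lies above 1
    lower = 1 - std_normal.cdf((1 + phi_n - beta) / se)
    # P(beta_hat + phi_n > 1): some point of the interval lies above 1
    upper = 1 - std_normal.cdf((1 - phi_n - beta) / se)
    return lower, upper
```

Evaluating `confidence_set(n, sigma, beta=1.5)` on a grid of `n` and `sigma` values reproduces the qualitative pattern described in the claim: both endpoints rise with `n` and fall with `sigma`. The value `beta=1.5` is only an illustrative choice.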
Figure 4: The shaded area depicts confidence sets $[\underline{p}_n, \overline{p}_n]$ for the rationalizability of the strong lockdown given $n$ common observations, and allowing the reporting noise $\sigma$ to vary. Panels: (a) $\sigma = 10$, (b) $\sigma = 100$, (c) $\sigma = 1000$; the horizontal axis is the number of observations $n$. The same values of $\beta$ and $\alpha$ are used in all panels.

As $n$ grows, both $\underline{p}_n$ and $\overline{p}_n$ increase as well. If the number of observations is large relative to the reporting noise, then the analyst should have high confidence that the strong lockdown is rationalizable: for example, with $\sigma = 10$ and enough observations, the confidence set is nearly degenerate at certainty. On the other hand, if the reporting noise is large and the number of observations is relatively small, then the strong lockdown is likely to be rationalizable for some permitted types, but not for all of them. The confidence set is then wide, suggesting substantial ambiguity regarding whether the strong lockdown is a good prediction of play.

Footnote (on Claim 2): If instead $\beta < 1$, then the reverse statements hold; that is, the probabilities $\underline{p}_n$ and $\overline{p}_n$ are decreasing in $n$ and increasing in $\sigma$.

Subsequently, I generalize the approach described in these two examples.

Basic Game.
There is a finite set $I$ of players and a finite set of actions $A_i$ for each player $i$. The set of action profiles is $A = \prod_{i \in I} A_i$, and the set of possible games is identified with $U := \mathbb{R}^{|I| \times |A|}$. Agents have beliefs over a set of payoff-relevant parameters $\Theta$, which is a compact and convex subset of finite-dimensional Euclidean space. It is possible to take $\Theta$ to be a subset of $U$, so that each $\theta$ is itself a game, or to define beliefs over a lower-dimensional set of payoff-relevant parameters as in Section 2. In either case, the parameters in $\Theta$ are assumed to be related to payoffs by a bounded and Lipschitz continuous embedding $g : \Theta \to U$ (assuming the sup-norm on both spaces).

Footnote: The map $g$ can be interpreted as capturing the known information about the structure of payoffs.

Beliefs. For each player $i$, let $X_i^0 = \Theta$, $X_i^1 = X_i^0 \times \prod_{j \neq i} \Delta(X_j^0)$, \ldots, $X_i^n = X_i^{n-1} \times \prod_{j \neq i} \Delta(X_j^{n-1})$, etc., so that each $X_i^k$ is the set of possible $k$-th order beliefs for player $i$. Define $T_i^0 = \prod_{n=0}^{\infty} \Delta(X_i^n)$. An element $(t_i^1, t_i^2, \ldots) \in T_i^0$ is a hierarchy of beliefs over $\Theta$ (describing the player’s uncertainty over $\Theta$, his uncertainty over his opponents’ uncertainty over $\Theta$, and so forth), and is referred to simply as a belief or type. There is a subset of types $T_i^*$ (that satisfy the property of coherency and common knowledge of coherency) and a function $\kappa_i^* : T_i^* \to \Delta(\Theta \times T_{-i}^*)$ such that $\kappa_i^*(t_i)$ preserves the beliefs in $t_i$; that is, $\mathrm{marg}_{X_i^{n-1}} \kappa_i^*(t_i) = t_i^n$ for every $n$ (Mertens and Zamir, 1985; Brandenburger and Dekel, 1993). The tuple $(T_i^*, \kappa_i^*)_{i \in I}$ is the universal type space. Subsequently I will develop smaller type spaces $(T_i, \kappa_i)_{i \in I}$ where each $T_i \subseteq T_i^*$ and $\kappa_i : T_i \to \kappa_i^*(T_i)$ is the restriction of $\kappa_i^*$ to $T_i$.

Footnote: Types are sometimes modeled as encompassing all uncertainty in the game. In the present paper, types describe players’ structural uncertainty over payoffs, but not their strategic uncertainty over opponent actions.

Footnote: Coherency requires that $\mathrm{marg}_{X_i^{n-2}} t_i^n = t_i^{n-1}$, so that $(t_i^1, t_i^2, \ldots)$ is a consistent stochastic process.

Footnote: The notation $T_{-i}^*$ is used throughout the paper to denote the set of profiles of opponent types, $\prod_{j \neq i} T_j^*$.

The proposed approach endogenizes the type space based on two new primitives: a data-generating process, and a set of rules for how to extrapolate beliefs from realized data. Formally, let $(Z_t)_{t \in \mathbb{Z}_+}$ be a stochastic process where the random variables $Z_t$ take value in a common set $\mathcal{Z}$, and the typical sample path is denoted $z = (z_1, z_2, \ldots)$. The data-generating process is a measure $P$ over the set $\mathcal{Z}^\infty$ of all (infinite) sample paths. Let $P^n$ denote the induced measure on the first $n$ variables. A data set $z_n$ of size $n$ is the restriction of $z$ to its first $n$ coordinates, and $\mathcal{Z}^n$ is the set of all length-$n$ data sets. I use $Z^n = (Z_1, \ldots, Z_n)$ to denote the random initial sequence of length $n$.

Footnote: Throughout, symbols such as $Z_t$ and $Z^n$ denote random variables, whereas lowercase symbols such as $z_n$ are particular, constant values.

A learning rule is defined to be any map from data sets into first-order beliefs:

$$\mu : \bigcup_{n=1}^{\infty} \mathcal{Z}^n \to \Delta(\Theta).$$

The special case of a Bayesian learning rule is identified with a distribution on $\Theta$ along with a family of distributions $(P_\theta)_{\theta \in \Theta}$, where each $P_\theta \in \Delta(\mathcal{Z}^\infty)$ is the data-generating distribution given parameter $\theta$. The realized data $z_n$ determines a posterior distribution on $\Theta \times \mathcal{Z}^\infty$, and the learning rule maps this data into the marginal posterior distribution on $\Theta$. Other learning rules may map the data $z_n$ to a degenerate belief at a sample statistic (such as the empirical average) or to a distribution over various point-estimates for $\theta$.

Footnote: In general, making the mapping deterministic may require choosing a conditional probability if there are multiple ones consistent with Bayes’ rule; here and elsewhere in the paper, I implicitly assume that the updating rule specifies such a choice when Bayesian rules are mentioned.

Players have common knowledge of a set $M$ of learning rules. The set of plausible beliefs given this set $M$ and realized data set $z_n$ is defined as

$$B(z_n) = \{\mu(z_n) : \mu \in M\} \subseteq \Delta(\Theta). \qquad (1)$$

Footnote: Note that this set is common across players.

I consider sets of learning rules $M$ that satisfy the following condition:

Assumption 2 (Common Limiting Belief). There is a limiting belief $\mu^\infty$ such that

$$\lim_{n \to \infty} d_P(\mu(Z^n), \mu^\infty) = 0 \quad P\text{-a.s.} \quad \forall \mu \in M,$$

where $d_P$ is the Prokhorov metric on $\Delta(\Theta)$.

This assumption requires that all learning rules in $M$ return the same limiting belief $\mu^\infty$ as the quantity of data $n$ grows large.
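To make the objects in (1) and Assumption 2 concrete, here is a small sketch of a set of Bayesian learning rules applied to the same data. Everything specific in it is an assumption made for illustration: the parameter is the success probability of a coin, each learning rule is a Beta prior updated on Bernoulli draws, the three priors are hypothetical, and each belief is summarized by its posterior mean rather than by the full posterior distribution.

```python
import random


def posterior_mean(prior_a, prior_b, data):
    """Posterior mean of a Bernoulli parameter under a Beta(prior_a, prior_b) prior."""
    successes = sum(data)
    return (prior_a + successes) / (prior_a + prior_b + len(data))


def plausible_beliefs(priors, data):
    """Summary of B(z_n): one posterior mean per learning rule (prior) in M."""
    return [posterior_mean(a, b, data) for a, b in priors]


def spread(beliefs):
    """Gap between the most optimistic and most pessimistic plausible belief."""
    return max(beliefs) - min(beliefs)


# Hypothetical set M of learning rules, each identified with a Beta prior.
priors = [(1, 1), (5, 1), (1, 5)]

# One simulated sample path; the true parameter 0.7 is an illustrative choice.
rng = random.Random(0)
sample_path = [int(rng.random() < 0.7) for _ in range(2000)]

for n in (10, 100, 1000):
    beliefs = plausible_beliefs(priors, sample_path[:n])
    print(n, round(spread(beliefs), 4))
```

For these three priors the spread equals $4/(6+n)$ whatever the data, so the plausible beliefs differ noticeably at $n = 10$ and nearly coincide at $n = 1000$, which is the pattern a common limiting belief requires.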
It is not critical that all differences in beliefs are removed in the limit (see Section 6). But maintaining Assumption 2 in the main text allows us to explore more precisely the scope for disagreement in an environment in which learning is feasible, but not immediate. While learning has not ceased, the available information may permit multiple different interpretations, producing differences in beliefs.

Footnote: For any $\nu, \nu' \in \Delta(\Theta)$, the Prokhorov distance between these measures is $d_P(\nu, \nu') = \inf\{\epsilon > 0 : \nu(A) \leq \nu'(A^\epsilon) + \epsilon \text{ for all Borel-measurable } A \subseteq \Theta\}$, where $A^\epsilon$ denotes the $\epsilon$-neighborhood of $A$ in the sup-norm.

Footnote: This limiting belief $\mu^\infty$ can be interpreted as a common prior, following what Morris (1995) calls the “frequentist justification” for the assumption of a common prior.

I use the sets of plausible beliefs $B(z_n)$ to impose a restriction on hierarchies of beliefs, as in Battigalli and Siniscalchi (2003). Specifically, players are assumed to have common certainty in the event that all players have first-order beliefs in $B(z_n)$; that is, they have a first-order belief in this set, believe with probability 1 that all other players have a first-order belief in this set, and so forth (Monderer and Samet, 1989).

Footnote: As I discuss in Section 6, it is not critical that players have common certainty in this event, and the restriction can be relaxed to common $p$-belief for large $p$.

Formally, for any set $B \subseteq \Delta(\Theta)$, and for any player $i$, define

$$B_i^{1,1}(B) := \{t_i \in T_i^* : \mathrm{marg}_\Theta\, \kappa_i^*(t_i) \in B\}$$

to be the set of player $i$ types whose marginal beliefs over $\Theta$ belong to the set $B$. For each $k > 1$, and again for each player $i$, recursively define

$$B_i^{k,1}(B) = \left\{ t_i \in T_i^* : \kappa_i^*(t_i)\left(\Theta \times \prod_{j \neq i} B_j^{k-1,1}(B)\right) = 1 \right\}.$$

The set $T_i^B = \bigcap_{k \geq 1} B_i^{k,1}(B)$ is the set of player $i$ types that have common certainty in the event that all players’ first-order beliefs belong to $B$.

Footnote: For example, $B_i^{2,1}(B)$ is the set of player $i$ types that assign probability 1 to all other players assigning probability 1 to $B$.

Definition 1. For every $z_n$, the induced type space is $\left(T_i^{B(z_n)}, \kappa_i^{B(z_n)}\right)_{i \in I}$, where $\kappa_i^{B(z_n)} : T_i^{B(z_n)} \to \kappa_i^*\left(T_i^{B(z_n)}\right)$ is the restriction of $\kappa_i^*$ to $T_i^{B(z_n)}$. The type $t_i$ is permitted for player $i$ if $t_i \in T_i^{B(z_n)}$.

This type space includes all type profiles where each player $i$ has common certainty in the event that all players have first-order beliefs in $B(z_n)$. Note that the type space permits common knowledge disagreement; that is, player $i$ can believe with probability 1 that (all believe with probability 1 that...) players hold different first-order beliefs. Such types are precluded under the common prior assumption not only in the present setting of common data, but also if we were to allow for private and different information (Aumann, 1976). The induced type spaces in Definition 1 thus correspond to a relaxation of the common prior assumption, where the permitted extent of disagreement is governed by the set of learning rules $M$. In the special case in which $M$ consists of a singleton Bayesian rule, we return the common prior assumption.

Footnote: It is straightforward to show that the type sets $T_i^{B(z_n)}$ are belief-closed; that is, $\kappa_i^{B(z_n)}(t_i)\left(\Theta \times T_{-i}^{B(z_n)}\right) = 1$ for every $t_i \in T_i^{B(z_n)}$.

In this approach, restrictions are placed only on the beliefs that players hold on the exogenous parameter space $\Theta$, and not on how they came to form those beliefs. For example, each of the following is consistent with the restriction in Definition 1:

• (Common Knowledge) Each player $i$ is associated with a player-specific learning rule $\mu_i \in M$. It is common knowledge that given any data set $z_n$, each player $i$’s first-order belief is $\mu_i(z_n)$.

• (Randomization) Each player $i$ randomizes over learning rules in $M$ using a player-specific distribution $Q_i \in \Delta(M)$, and applies the randomly drawn learning rule to the realized data to form a first-order belief. The distributions $(Q_i)_{i \in I}$ are common knowledge, although the realized random rule is privately known.

• (Misspecification) Player $j$ believes with probability 1 that player $i$’s first-order belief is $\mu(z_n)$ for every data set $z_n$, whereas in fact player $i$’s first-order belief is $\mu'(z_n)$ for a different learning rule $\mu' \in M$.
It is sometimes possible to take $B(z^n)$ as the primitive without explicitly defining $M$ (as in the coordination example in Section 2.2), so that each measure $P^n$ directly defines a measure over sets of first-order beliefs. In other cases, it is more natural to define $M$. Some examples for this set include:

Footnote: The definition of $B(z^n)$ must, however, respect Assumption 2: there must exist some set of learning rules $M$ that could be defined, which respects Assumption 2 and gives rise to these sets $B(z^n)$. A sufficient condition is for there to exist a $P$-measure 1 set of sequences $Z^* \subseteq Z^\infty$ such that for every $z \in Z^*$, and every sequence $(\nu_n)_{n=1}^\infty$ satisfying $\nu_n \in B(z^n)$ for each $n$, it holds that $\lim_{n \to \infty} d_P(\nu_n, \mu^\infty) = 0$.

Bayesian Updating with Different Priors. Each learning rule $\mu_\pi \in M$ is identified with a prior distribution $\pi \in \Delta(\Theta \times Z^\infty)$. For any realized data $z^n$, the belief $\mu_\pi(z^n)$ is the marginal over $\Theta$ of the posterior belief associated with the corresponding prior $\pi$.

Sample Statistics. The set $M$ consists of learning rules that map the data to different point-estimates for the payoff-relevant parameter. For example, $M$ might consist of the two learning rules $\mu_{\text{mean}}$ and $\mu_{\text{median}}$, where for any data set $z^n$, $\mu_{\text{mean}}(z^n)$ is a point-mass belief on the mean realization in $z^n$ (as in Jehiel (2018)), and $\mu_{\text{median}}(z^n)$ is a point-mass belief on the median realization.

Linear Regression. Suppose that $X \subseteq \mathbb{R}^p$, $p < \infty$, is a set of attributes that determine the value of a parameter in $\Theta$ (e.g. physical covariates of a patient seeking health insurance, and medical outcomes for those patients). The observations in $z^n$ are pairs $(x, \theta)$, and the payoff-relevant unknown is the parameter associated with some new $x^*$ (e.g. the outcome for a new patient with characteristics $x^*$).

Each learning rule $\mu \in M$ corresponds to a different regression model based on a subset of attributes $I_\mu \subseteq \{1, \dots, p\}$ (as for example in Olea et al. (2017)). Write $x_\mu = (x_i)_{i \in I_\mu}$ for the coordinates of $x$ at those indices. Then, $\hat{f}^{OLS}_\mu[z^n](x) = \beta^{OLS}_\mu \cdot x_\mu$ is the linear function of the attributes in $I_\mu$ that best fits the observed data $z^n = \{(x_k, \theta_k)\}_{k=1}^n$; that is,

$$\beta^{OLS}_\mu = \operatorname*{argmin}_{\beta \in \mathbb{R}^{|I_\mu|}} \frac{1}{n} \sum_{k=1}^n \left(\beta \cdot x^k_\mu - \theta_k\right)^2.$$

For every $z^n$, the learning rule $\mu$ maps $z^n$ into a point-mass belief on the corresponding prediction $\hat{f}^{OLS}_\mu[z^n](x^*)$.

Case-Based Learning with Different Similarity Functions. As in the previous example, suppose that the observations are pairs $(x, \theta) \in X \times \Theta$, and the payoff-relevant parameter is the outcome at some new $x^*$. Each "case-based" learning rule $\mu \in M$ is identified with a real number $\lambda \in \mathbb{R}_+$ (to be interpreted momentarily) and maps the historical data $z^n = \{(x_k, \theta_k)\}_{k=1}^n$ into a weighted average of the observed parameter values $\theta_k$ (Gilboa and Schmeidler, 1995).

The observed parameters at $x$-values "more similar" to $x^*$ are weighted more heavily. Formally, let $g : X \times X \to \mathbb{R}_+$ be a similarity function on attributes, where $g(x, x')$ describes the distance between attribute vectors $x$ and $x'$. The learning rule with parameter $\lambda$ maps $z^n$ to a point mass on the weighted average

$$\sum_{k=1}^n \theta_k \cdot \frac{e^{-\lambda g(x_k, x^*)}}{\sum_{k'=1}^n e^{-\lambda g(x_{k'}, x^*)}}.$$

The parameter $\lambda$ controls the degree to which similar observations are weighted more heavily than dissimilar observations. For example, $\lambda = 0$ weights all observations equally and so returns the simple average of the observed values, while $\lambda \to \infty$ returns the observed state at the most similar attribute vector.
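The sample-statistic, regression, and case-based rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data set, the use of Euclidean distance for the similarity function $g$, and the omission of an intercept in the regression are choices made here for concreteness and are not taken from the paper. Each function returns the location of the point-mass belief that the rule assigns to the outcome at $x^*$.

```python
import numpy as np

def mean_rule(theta):
    """Point-mass belief on the mean of the observed parameter values."""
    return float(np.mean(theta))

def median_rule(theta):
    """Point-mass belief on the median of the observed parameter values."""
    return float(np.median(theta))

def ols_rule(X, theta, x_star, idx):
    """Least-squares prediction at x_star using only the attributes in idx (no intercept)."""
    beta, *_ = np.linalg.lstsq(X[:, idx], theta, rcond=None)
    return float(beta @ x_star[idx])

def case_based_rule(X, theta, x_star, lam):
    """Similarity-weighted average of theta with weights exp(-lam * g(x_k, x_star))."""
    dist = np.linalg.norm(X - x_star, axis=1)  # assumed g: Euclidean distance
    # Subtracting the minimum distance leaves the normalized weights unchanged
    # and avoids all weights underflowing to zero when lam is large.
    w = np.exp(-lam * (dist - dist.min()))
    return float(w @ theta / w.sum())

# Hypothetical data: three observations (x_k, theta_k) with p = 2 attributes.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.array([1.0, 2.0, 3.0])
x_star = np.array([1.0, 2.0])

# Beliefs induced on the same data by a small, hypothetical class M of rules.
B = {
    "mean": mean_rule(theta),
    "median": median_rule(theta),
    "ols_all": ols_rule(X, theta, x_star, [0, 1]),
    "ols_first": ols_rule(X, theta, x_star, [0]),
    "case_lam0": case_based_rule(X, theta, x_star, 0.0),
    "case_lam50": case_based_rule(X, theta, x_star, 50.0),
}
```

On this toy data the rules disagree: the two regression models predict different outcomes at $x^*$, and the case-based rule moves from the simple average toward the outcome at the nearest attribute vector as $\lambda$ grows. The collection `B` plays the role of the set of first-order beliefs $B(z^n)$ for this hypothetical $M$.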
I now use the proposed framework to construct a quantitative metric for the analyst's confidence in a strategic prediction, focusing on the prediction that an action is interim-correlated rationalizable (Dekel et al., 2007; Weinstein and Yildiz, 2017). The use of this particular solution concept is not critical to the approach (for example, we could study the analyst's confidence in predictions of Bayesian Nash equilibria; see Section 6), but interim-correlated rationalizability is well-suited to the present setting, where agents may have common knowledge disagreement. Its definition is reviewed here:

Footnote: Equilibrium notions are known to lead to potentially counterintuitive predictions when players have common knowledge disagreement. For example, consider a matching pennies game where player 1 receives $\theta$ if players match and $-\theta$ otherwise, and player 2 receives $-\theta$ if the players match and $\theta$ otherwise. Let $\theta \in \{-1, 1\}$. Then if player 1 assigns probability 1 to $\theta = 1$ and player 2 assigns probability 1 to $\theta = -1$, it is (somewhat counterintuitively) a Bayesian Nash equilibrium for both players to choose match. See Dekel et al. (2004) for an extended discussion.

For every player $i$ and type $t_i \in T_i^*$, set $S_i^0[t_i] = A_i$, and define $S_i^k[t_i]$ for $k \geq 1$ such that $a_i \in S_i^k[t_i]$ if and only if $a_i$ is a best reply to some $\pi \in \Delta(\Theta \times T_{-i}^* \times A_{-i})$ satisfying (1) $\operatorname{marg}_{\Theta \times T_{-i}^*} \pi = \kappa_i^*(t_i)$ and (2) $\operatorname{marg}_{A_{-i} \times T_{-i}} \pi\left(\{(a_{-i}, t_{-i}) \mid a_{-i} \in S_{-i}^{k-1}[t_{-i}]\}\right) = 1$, where $S_{-i}^{k-1}[t_{-i}] = \prod_{j \neq i} S_j^{k-1}[t_j]$. We can interpret $\pi$ to be an extension of type $t_i$'s belief $\kappa_i^*(t_i)$ onto the space $\Delta(\Theta \times T_{-i} \times A_{-i})$, with support in the set of actions that survive $k-1$ rounds of elimination for types in $T_{-i}$. For every $i$, the actions in $S_i^\infty[t_i] = \bigcap_{k=0}^\infty S_i^k[t_i]$ are interim correlated rationalizable for type $t_i$, or (henceforth) simply rationalizable. Say that $a_i$ is strictly rationalizable for type $t_i$ if the best reply conditions above are strengthened to strict best replies.

For any set of beliefs $B \subseteq \Delta(\Theta)$, say that action $a_i$ is strongly $B$-rationalizable if it is rationalizable for player $i$ with any type $t_i \in T_i^B$, and it is weakly $B$-rationalizable if it is rationalizable for player $i$ with some type $t_i \in T_i^B$. Strong and weak $B$-rationalizability represent (respectively) maximally stringent and maximally lenient approaches for determining whether $a_i$ constitutes a "reasonable" prediction in the interim type space $\left(T_i^B, \kappa_i^B\right)_{i \in I}$. The main concept of a confidence set is now defined.
Definition 2. For every $n \in \mathbb{Z}_+$, define $\underline{p}^n(a_i)$ to be the probability (over possible datasets $z^n$) that action $a_i$ is rationalizable for every type in $T_i^{B(z^n)}$; that is,

$$\underline{p}^n(a_i) = P^n\left(\{z^n : a_i \text{ is strongly } B(z^n)\text{-rationalizable}\}\right). \tag{2}$$

Define $\overline{p}^n(a_i)$ to be the probability (over possible datasets $z^n$) that action $a_i$ is rationalizable for some type $t_i \in T_i^{B(z^n)}$; that is,

$$\overline{p}^n(a_i) = P^n\left(\{z^n : a_i \text{ is weakly } B(z^n)\text{-rationalizable}\}\right). \tag{3}$$

The confidence set for rationalizability of $a_i$ given $n$ observations is $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$.

The larger $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ are, the more confident an analyst should be in predicting that $a_i$ is rationalizable. At the extremes: if $\underline{p}^n(a_i) = \overline{p}^n(a_i) = 1$, then given observation of $n$ random samples, action $a_i$ is guaranteed to be rationalizable for player $i$ (for all permitted types). If $\underline{p}^n(a_i) = \overline{p}^n(a_i) = 0$, then action $a_i$ is guaranteed to not be rationalizable for player $i$ (for any permitted types). In the intermediate cases, if $0 < \underline{p}^n(a_i) = \overline{p}^n(a_i) < 1$, then rationalizability of $a_i$ depends on the specific realization of the data, and if $\underline{p}^n(a_i) < \overline{p}^n(a_i)$, then the prediction requires assumptions on the details of the agent's belief beyond Assumption 1.

Footnote: Whether an action $a_i$ is interim-correlated rationalizable for some type $t_i$ does not depend on the description of the underlying type space (Dekel et al., 2007). Hence we only need to define rationalizability for the universal type space, even though the type spaces that we will work with are the smaller type spaces $\left(T_i^{B(z^n)}, \kappa_i^{B(z^n)}\right)$, and they vary depending on the data $z^n$.

Footnote: Given the restriction in Definition 1, the weakly $B$-rationalizable strategies are the $\Delta$-rationalizable strategies of Battigalli and Siniscalchi (2003), where $\Delta = (\Delta_i)_{i \in I}$ and each $\Delta_i = \{\nu \in \Delta(\Theta \times T_{-i} \times A_{-i}) \mid \operatorname{marg}_\Theta \nu \in B\}$ encodes the belief restriction that first-order beliefs belong to $B$. The concept of strong $B$-rationalizability can be interpreted as a "robust" version of $\Delta$-rationalizability.

Footnote: I do not comment here on what further assumptions may be imposed, interpreting this case simply as one in which the prediction is tenuous.

Observation 1. For every player $i$ and action $a_i \in A_i$:
(a) $\underline{p}^n(a_i) \leq \overline{p}^n(a_i)$ for every $n \in \mathbb{Z}_+$.
(c) If $M$ consists of a single learning rule, then $\underline{p}^n(a_i) = \overline{p}^n(a_i)$ for every $n \in \mathbb{Z}_+$.

In the special case in which agents have a common prior, the definitions of $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ have the following familiar interpretation:

Remark 1. (Common Prior.) Suppose that players share a common prior over $\Theta \times Z^\infty$ and for simplicity let $Z$ be finite. Write $\mu$ for the learning rule that maps $z^n$ into the induced posterior belief over $\Theta$ under the common prior. Then, each realization $z^n$ determines an interim game, where players all have common certainty in the posterior belief. Moreover, the common prior determines a distribution over $z^n$, and hence over possible interim games. For any player $i$ and action $a_i$, the probabilities $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ coincide, and are equal to the measure of size-$n$ datasets $z^n$ (under the common prior) with the property that action $a_i$ is rationalizable for player $i$ in the corresponding interim game.
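To make the two probabilities in Definition 2 concrete, the following Python sketch computes them exactly in a deliberately simple special case that is an assumption of this illustration, not an example from the paper: a single player chooses between a hypothetical action "invest", which pays $\theta$ minus a cost, and "abstain", which pays 0; the data are i.i.d. Bernoulli draws; and each learning rule maps the data to a point mass on the posterior mean under a Beta prior. With one player, an action is rationalizable for a type exactly when it is a best reply to that type's first-order belief, so strong (weak) rationalizability reduces to "invest" being optimal under every (some) belief in $B(z^n)$.

```python
from math import comb

def posterior_mean(k, n, a, b):
    """Posterior mean of a Bernoulli parameter under a Beta(a, b) prior after k successes in n draws."""
    return (a + k) / (a + b + n)

def confidence_set(n, q_true, priors, cost):
    """Return (p_lower, p_upper) for the hypothetical action 'invest'.

    'invest' pays theta - cost and 'abstain' pays 0, so 'invest' is a best
    reply to a point-mass belief at m if and only if m >= cost.
    """
    p_lower = 0.0
    p_upper = 0.0
    for k in range(n + 1):
        # Probability of observing k successes under the true data-generating process.
        prob_k = comb(n, k) * q_true**k * (1 - q_true) ** (n - k)
        optimal = [posterior_mean(k, n, a, b) >= cost for (a, b) in priors]
        if all(optimal):   # best reply under every belief in B(z^n)
            p_lower += prob_k
        if any(optimal):   # best reply under some belief in B(z^n)
            p_upper += prob_k
    return p_lower, p_upper

# Hypothetical class of three Beta priors, true success probability 0.7, cost 0.5.
priors = [(1, 1), (1, 5), (5, 1)]
small_n = confidence_set(10, 0.7, priors, 0.5)
large_n = confidence_set(200, 0.7, priors, 0.5)
single_rule = confidence_set(10, 0.7, [(1, 1)], 0.5)
```

In this sketch the interval is wide for ten observations and collapses toward 1 for two hundred, and with a single learning rule the two endpoints coincide, in line with Observation 1.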
In the above approach, the common prior serves multiple roles: it determines the true distribution over the data that agents might see, and also determines how agents update from that data. When we separate these roles, we can still use an objective data-generating process to define a measure over interim games, as I do here. In this way, the probabilities $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ are a natural generalization of a standard measure of the typicality of a strategic prediction, in the absence of a common prior.

Footnote: A small difference in the formulations is that $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ are defined using the "true" probability measure $P \in \Delta(Z^\infty)$ in the present approach, instead of a measure $Q \in \Delta(\Theta \times Z^\infty)$.

Footnote: This approach is used for example in Kajii and Morris (1997) (if we re-interpret the histories $z^n$ as the states), where an incomplete information game is "close" to a complete information game if the payoffs of the complete information game occur with high probability under the prior.

The subsequent sections study how confidence sets depend on the underlying learning environment and the game in question. I first consider the limiting behavior of the probabilities $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ as the quantity of data $n$ gets large. Recall that by Assumption 2, the beliefs induced by the different learning rules converge to a limiting belief $\mu^\infty$. Thus, the $n = \infty$ limit corresponds to an incomplete information game in which players have common certainty in the event that every player has first-order belief $\mu^\infty$. Whether the probabilities $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ are continuous at $n = \infty$ tells us how sensitive rationalizability of $a_i$ is to an assumption that agents have coordinated their beliefs using infinite data. When these probabilities are discontinuous at $n = \infty$, the infinite-data prediction is fragile; that is, the analyst would make a different prediction for arbitrarily large but finite quantities of data.

Formally, let $t_i^\infty$ be the player $i$ type with common certainty in the event that each player's first-order belief is $\mu^\infty$. Then define $\underline{p}^\infty(a_i) = \overline{p}^\infty(a_i) = 1$ if $a_i$ is rationalizable for type $t_i^\infty$, and define $\underline{p}^\infty(a_i) = \overline{p}^\infty(a_i) = 0$ otherwise.

Definition 3.
The confidence set for action $a_i$ is asymptotically continuous if

$$\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)] = [\underline{p}^\infty(a_i), \overline{p}^\infty(a_i)].$$

Whether confidence sets are asymptotically continuous depends crucially on whether the beliefs induced by the different learning rules converge uniformly to $\mu^\infty$.

Assumption 3 (Uniform Convergence). $\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(Z^n), \mu^\infty) = 0$ $P$-a.s., where $d_P$ is the Prokhorov metric on $\Delta(\Theta)$.

Assumption 2 already implies that for each learning rule $\mu \in M$, the (random) induced belief $\mu(Z^n)$ almost surely converges to $\mu^\infty$ as the quantity of data $n$ grows large. Assumption 3 strengthens this by requiring additionally that the speed of convergence does not vary too much across the different learning rules in $M$. Specifically, the sequence of beliefs $\{\mu(Z^n)\}$ must converge to $\mu^\infty$ (as $n \to \infty$) uniformly across $\mu \in M$.

A sufficient condition for Assumption 3 to hold is that the set of learning rules $M$ is finite. But failures of Assumption 3 occur for classes of learning rules that we may consider plausible. In particular, Assumption 3 fails if the class $M$ is too rich, as in the following example:

Example 1 (Rich Sets of Priors and Likelihoods). An unknown parameter $v$ takes values in $\{0, 1\}$. Players commonly observe a sequence of realizations from the set $Z = \{0, 1\}$. Learning rules $\mu_{\pi,q} \in M$ are indexed to parameters $\pi \in (0, 1)$ and $q \in (1/2, 1)$, where the parameter $\pi$ is the prior probability of value 1, and $q$ identifies the following signal structure:

|         | $z = 1$ | $z = 0$ |
|---------|---------|---------|
| $v = 1$ | $q$     | $1 - q$ |
| $v = 0$ | $1 - q$ | $q$     |

Each rule $\mu_{\pi,q}$ is identified with prior $\pi$ and signal structure $q$, and maps the observed signal outcomes into the posterior belief over $\{0, 1\}$. Assume that the true data-generating process belongs to this class; that is, there exists some $q^* \in (1/2, 1)$ such that the distribution over the signals $(z = 1, z = 0)$ is $(q^*, 1 - q^*)$ when $v = 1$, and the distribution is $(1 - q^*, q^*)$ when $v = 0$.
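The failure of uniform convergence in Example 1 can be seen numerically. The Python sketch below rests on assumptions made for this illustration: the signal is taken to equal $v$ with probability $q$, the true state is $v = 1$ with a hypothetical $q^* = 0.75$, and a "typical" data set of size $n$ is taken to contain a fraction $q^*$ of ones. It compares one fixed rule with a rule whose signal precision is chosen closer to $1/2$ as $n$ grows, and reports how far each posterior remains from certainty that $v = 1$.

```python
import math

def posterior_v1(prior, q, ones, zeros):
    """Posterior probability of v = 1 under rule (prior, q), assuming each signal equals v with probability q."""
    log_odds = math.log(prior / (1 - prior)) + (ones - zeros) * math.log(q / (1 - q))
    return 1.0 / (1.0 + math.exp(-log_odds))

def gaps(n, q_true=0.75):
    """Distance from the limiting belief (certainty of v = 1) on a typical data set of size n."""
    ones = round(q_true * n)
    zeros = n - ones
    # A fixed rule in M: prior 1/2 and signal precision 0.75.
    fixed = 1.0 - posterior_v1(0.5, 0.75, ones, zeros)
    # Another rule in M, picked with n in mind: precision just above 1/2.
    slow = 1.0 - posterior_v1(0.5, 0.5 + 1.0 / (4 * n), ones, zeros)
    return fixed, slow

results = {n: gaps(n) for n in (10, 100, 1000)}
```

For the fixed rule the gap shrinks rapidly with $n$, while for each $n$ the class contains a rule whose posterior stays well away from the limit. The supremum over $M$ of the distance to the limiting belief therefore does not vanish on such data sets, which is the sense in which the convergence is not uniform.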
In this example, all learning rules lead to the same limiting belief (that is, there is asymptotic agreement in the sense of Acemoglu et al. (2015)). But because the rate of this convergence cannot be uniformly bounded across the different learning rules, it is possible for the confidence set to be discontinuous at $n = \infty$.

Claim 3. Consider the trading game described in Section 2, and the data-generating process and set of learning rules from Example 1. Then, $\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)] = [0, 1]$, while $[\underline{p}^\infty(a_i), \overline{p}^\infty(a_i)] = \{0\}$, so the prediction that entering is not rationalizable for the Seller is not asymptotically continuous.

The claim tells us that although trade will not occur in the limiting game, this prediction is sensitive to the assumption that agents have indeed coordinated their priors using infinite data. Even if the amount of data that players commonly observed were to be arbitrarily large, the analyst should nevertheless consider trade to be a plausible outcome.
In contrast, when the assumption of uniform convergence is satisfied, then the limitingconfidence sets can be tightly linked to predictions in the limiting game.
Theorem 1.
Suppose Assumption 3 is satisfied.
(a) If $a_i$ is strictly rationalizable for player $i$ of type $t_i^\infty$, then $\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)] = \{1\}$.
(b) If $a_i$ is not rationalizable for player $i$ of type $t_i^\infty$, then $\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)] = \{0\}$.

This theorem says that if an action $a_i$ is strictly rationalizable for player $i$ given infinite data, then $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ both converge to 1 as $n$ grows large. Thus, when agents observe sufficiently large quantities of public data, the analyst should be arbitrarily confident in predicting that $a_i$ is rationalizable. On the other hand, if action $a_i$ is not rationalizable given infinite data, then $\underline{p}^n(a_i)$ and $\overline{p}^n(a_i)$ both converge to 0, so the analyst should be arbitrarily confident in predicting that $a_i$ is not rationalizable for large data sets.

Footnote: If the limiting belief $\mu^\infty$ is degenerate at a limiting parameter $\theta^\infty$, and players have common certainty that players' first-order beliefs have support in a shrinking neighborhood of $\theta^\infty$ (see Section 5.1 for a more formal development), then the property that $\underline{p}^n(a_i) \to 1$ is closely related to $a_i$ being robustly rationalizable, as defined in Morris et al. (2012), with the small difference that Morris et al. (2012) consider almost common belief in the exact parameter $\theta^\infty$, while I consider common certainty in a neighborhood of $\theta^\infty$. As Proposition 1 in Morris et al. (2012) shows, strict rationalizability is a sufficient condition for robust rationalizability. See also Kajii and Morris (2020) for related results.

Footnote: The intermediate case in which $a_i$ is rationalizable for player $i$ given infinite data, but not strictly rationalizable, is subtle and depends on details of the game. See Online Appendix O.4 for examples in which $\lim_{n \to \infty} [\underline{p}^n(a_i), \overline{p}^n(a_i)]$ is a degenerate interval and in which it equals $[0, 1]$. Note that the latter corresponds to a maximally ambiguous outcome: no amount of data is decisive on whether or not the action should be considered rationalizable.

Theorem 1 builds on results from the literature on topologies on the universal type space. Consider any sequence of types $(t_i^n)_{n=1}^\infty$ where each $t_i^n \in T_i^{B(z^n)}$. Under Assumption 3, as the quantity of data $n$ gets large, the types $t_i^n$ (almost surely) have common certainty that first-order beliefs lie in an arbitrarily small neighborhood of the limiting belief $\mu^\infty$. Thus, the sequence $(t_i^n)$ can be shown to converge to $t_i^\infty$ in the uniform-weak topology (Chen et al., 2010) on the universal type space (see Lemma 4). Since rationalizability is upper hemi-continuous in the uniform-weak topology (Chen et al., 2010), Part (b) of the theorem follows.

Footnote: It is crucial that convergence occurs in this topology and not simply the product topology, as otherwise the negative results of Weinstein and Yildiz (2007) would apply.

Part (a) of the theorem is related to lower hemi-continuity of strict rationalizability in the uniform-weak topology (as shown in Chen et al. (2010)), but this property is not sufficient. Lower hemi-continuity guarantees that for any sequence of types $(t_i^n)_{n=1}^\infty$ from $T_i^{B(z^n)}$, the action $a_i$ must eventually be rationalizable along the sequence, but the rates of this convergence can differ substantially across different sequences. For eventual strong $B(z^n)$-rationalizability, we need that $a_i$ is rationalizable for all types from $T_i^{B(z^n)}$ when $n$ is sufficiently large. To establish this, I show that there is a $P$-measure 1 set of sequences along which the sets $\left(T_i^{B(z^n)}\right)_{n=1}^\infty$ converge to the singleton set $\{t_i^\infty\}$ in the Hausdorff metric induced by the uniform-weak metric. The key lemma underlying this result, Lemma 6, relates the degree of "strictness" of rationalizability of action $a_i$ at the limiting type $t_i^\infty$ to the size of the neighborhood around $\mu^\infty$ such that common certainty of that neighborhood implies rationalizability of $a_i$. The stronger property that types converge uniformly over the set $T_i^{B(z^n)}$ delivers the desired result.

The previous section characterized confidence sets given large numbers of common observations. I now focus on the setting of small $n$, and bound the extent to which the analyst's confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$ diverges from its asymptotic limit $[\underline{p}^\infty(a_i), \overline{p}^\infty(a_i)]$. Throughout this section, I impose the simplifying assumptions that observations are i.i.d., and that they take values from a finite set $Z$:

Assumption 4. $Z_1, \dots, Z_n \sim_{\text{i.i.d.}} Q$.

Assumption 5. $|Z| < \infty$.

In some cases, as in the examples in Section 2, the confidence set can be exactly characterized. In what follows, I provide bounds for the confidence set that can be easier to derive in certain cases.
First consider an action $a_i$ that is strictly rationalizable for player $i$ of type $t_i^\infty$. By Theorem 1, the analyst's confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$ converges to a degenerate interval at 1. Proposition 1, below, provides a lower bound on $\underline{p}^n(a_i)$, which informs how fast this convergence occurs.

A key input into the bound is the "degree" to which $a_i$ is strictly rationalizable for the limiting type $t_i^\infty$. Say that a family of sets $(R_i[t_i])_{t_i \in T_i}$, where each $R_j[t_j] \subseteq A_j$, has the $\delta$-strict best reply property if for each $i \in I$, type $t_i \in T_i$, and action $a_i \in R_i[t_i]$ there is a conjecture $\sigma_{-i} : \Theta \times T_{-i} \to \Delta(A_{-i})$ to which $a_i$ is a $\delta$-strict best reply for $t_i$; that is,

$$\int_{\Theta \times T_{-i}} u_i(a_i, \sigma_{-i}(\theta, t_{-i}), \theta) \, t_i[d\theta \times dt_{-i}] - \int_{\Theta \times T_{-i}} u_i(a_i', \sigma_{-i}(\theta, t_{-i}), \theta) \, t_i[d\theta \times dt_{-i}] \geq \delta \quad \forall a_i' \neq a_i.$$

Say that an action $a_i$ is $\delta$-strict rationalizable for type $t_i$ if there exists a family of sets $(R_j[t_j])_{t_j \in T_j}$ with the $\delta$-strict best reply property, where $a_i \in R_i[t_i]$.

Footnote: This is equivalent to $\gamma$-rationalizability from Dekel et al. (2007), where $\gamma = -\delta$.

Then, if $a_i$ is strictly rationalizable for the limiting type $t_i^\infty$, and players have commonly observed $n$ realizations, the probability that $a_i$ is rationalizable for all permitted types can be lower bounded as follows.

Proposition 1. Suppose $a_i$ is strictly rationalizable for type $t_i^\infty$, and define

$$\delta^\infty := \sup\{\delta : a_i \text{ is } \delta\text{-strictly rationalizable for type } t_i^\infty\} \tag{4}$$

noting that this quantity is strictly positive. Further define

$$\xi := \sup_{\theta, \theta' \in \Theta} \|\theta - \theta'\|. \tag{5}$$

Then, for every $n \geq 1$,

$$\underline{p}^n(a_i) \geq 1 - \frac{K\xi}{\delta^\infty} \, \mathbb{E}\left(\sup_{\mu \in M} d_P(\mu(Z^n), \mu^\infty)\right) \tag{6}$$

where $K$ is the Lipschitz constant of the map $g : \Theta \to U$.

Recalling that $\overline{p}^n(a_i) \geq \underline{p}^n(a_i)$ for every $n$, this proposition allows us to lower bound the confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$.

The expression in (6) is increasing in $\delta^\infty$, so the "more strictly rationalizable" the action is for the limiting type, the fewer observations are necessary for the prediction to hold. The bound is decreasing in $\mathbb{E}\left(\sup_{\mu \in M} d_P(\mu(Z^n), \mu^\infty)\right)$, which is the expected distance from the limiting belief $\mu^\infty$ to the farthest belief in the plausible set $B(Z^n)$. When Assumption 3 is satisfied, then $\mathbb{E}\left(\sup_{\mu \in M} d_P(\mu(Z^n), \mu^\infty)\right) \to 0$ as $n \to \infty$, and the speed of this convergence can be interpreted as the speed at which players commonly learn (Cripps et al., 2008). Thus, Proposition 1 suggests that the quicker players commonly learn, the fewer observations are necessary for limiting predictions to carry over to small-data settings.

In an important special case, the limiting belief $\mu^\infty$ is a point mass at some $\theta^\infty$, and the sets $B(z^n)$ consist of beliefs with support on shrinking neighborhoods of $\theta^\infty$. Formally, let

$$C(z^n) := \bigcup_{\mu \in M} \operatorname{supp} \mu(z^n) \quad \forall z^n \in Z^n$$

with the implication that every $\mu(z^n)$, $\mu \in M$, assigns probability 1 to $C(z^n)$. If $C(z^n)$ collapses to the singleton set $\{\theta^\infty\}$ as $n \to \infty$, then the bound in Proposition 1 can be simplified as follows.
Assumption 6. $\sup_{\theta \in C(Z^n)} \|\theta - \theta^\infty\|$ converges to zero $P$-almost surely.

Proposition 2. Suppose Assumption 6 holds, and the action $a_i$ is strictly rationalizable for type $t_i^\infty$. Then, for every $n \geq 1$,

$$\underline{p}^n(a_i) \geq 1 - \frac{K}{\delta^\infty} \, \mathbb{E}\left(\sup_{\theta \in C(Z^n)} \|\theta - \theta^\infty\|\right) \tag{7}$$

where $K$ is the Lipschitz constant of the map $g : \Theta \to U$.

The expressions in Propositions 1 and 2 can be used to derive quantitative bounds for specific sets of learning rules, as in the following example:
Example 2.
Consider the payoff matrix from Section 2.2 with unknown parameter β P R . Suppose that players commonly observe n public signals z t “ β ` ε t , with standardnormal error terms ε t that are i.i.d. across observations. The set of learning rules is M “t µ x u x Pr´ η,η s , where each learning rule µ x is identified with the prior belief β „ N p x, q , andmaps data into a point mass at the posterior expectation of β . The set C p z n q thus consistsof the posterior expectations under the different priors, and players have common certaintyin the event that all players have first-order beliefs with support on C p z n q . Let the true valueof β satisfy β ą . Then, applying Proposition 2: Corollary 1.
For each n ě , p n p strong q ě ´ β ´ ˜c πn ` β ` ηn ` ¸ The bound in Corollary 1 is decreasing in η (the size of the model class), increasing in n (the number of observations), and increasing in β ´ (the strictness of the solution at the limit).

Footnote: That is, there is a $P$-measure 1 set of (infinite) sequences such that $\sup_{\theta \in C(z^n)} \|\theta - \theta^\infty\| \to 0$ as $n \to \infty$ for each sequence $z$ in this set.

5.2 Upper Bound

Now suppose that the action $a_i$ is not rationalizable for player $i$ of type $t_i^\infty$. We know from Part (b) of Theorem 1 that in this case, the analyst's confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$ converges to a degenerate interval at zero. But given small quantities of data $n$, the action $a_i$ may still constitute a plausible prediction of play, as in the trading game studied in Section 2. Proposition 3, below, provides an upper bound on $\overline{p}^n(a_i)$, which informs whether the analyst should consider $a_i$ a plausible prediction away from the limit.

To define this bound, a few intermediate definitions are needed. Let $Z^n_{a_i}$ be all data sets $z^n$ given which the action $a_i$ is weakly $B(z^n)$-rationalizable. (This set must be determined on a case-by-case basis.) Let $\hat{Q}_{z^n} \in \Delta(Z)$ be the empirical measure associated with data set $z^n$. The Kullback-Leibler divergence between $\hat{Q}_{z^n}$ and the actual data-generating distribution $Q$ is

$$D_{KL}(\hat{Q}_{z^n} \,\|\, Q) = \sum_{z \in Z} \hat{Q}_{z^n}(z) \log\left(\frac{\hat{Q}_{z^n}(z)}{Q(z)}\right).$$

Define

$$Q^*_n = \operatorname*{argmin}_{\hat{Q}_{z^n} \,:\, z^n \in Z^n_{a_i}} D_{KL}(\hat{Q}_{z^n} \,\|\, Q)$$

to be the empirical measure (associated with a data set in $Z^n_{a_i}$) that is closest in Kullback-Leibler divergence to $Q$. Application of Sanov's theorem directly gives the following result.

Proposition 3. Suppose $a_i$ is not rationalizable for type $t_i^\infty$; then, for every $n \geq 1$,

$$\overline{p}^n(a_i) \leq (n + 1)^{|Z|} \, 2^{-n D_{KL}(Q^*_n \| Q)}.$$

Recalling that $\overline{p}^n(a_i) \geq \underline{p}^n(a_i)$ for every $n$, this proposition allows us to upper bound the confidence set $[\underline{p}^n(a_i), \overline{p}^n(a_i)]$. The proposition is applied below in an example setting:

Example 3.
Consider the trading game from Section 2 and the learning rules described inExample 1, but suppose that the domain of q is r { , s and the domain of π is r { , { s ,so that Assumption 3 is satisfied. Let the true signal structure be identified with q ˚ “ { .and suppose the posted price is p “ { . Theorem 1 implies that entering will fail to berationalizable when players have observed sufficient data. Nevertheless, the action may berationalizable for a permitted belief if players have observed a small number of data points.The corollary below quantifies this probability. orollary 2. For each n ě , p n p enter q ď p n ` q ´ r n n where r n “ ´ log p n q ´ log ´ t n ` log p q log p q u ¯¯ ` ´ log p n q ´ log ´ t n ´ log p q log p q u ¯¯ . Asymptotic Disagreement.
In the main text, I imposed an assumption which guaranteed that beliefs produced by learning rules in $M$ uniformly converge to a common limiting belief $\mu^\infty$. This implies that learning eventually removes all differences in beliefs. It is possible to replace Assumption 3 with the following, weaker condition, which allows players to have heterogeneous beliefs even in the limit: For any $\epsilon \geq 0$, say that the class of learning rules $M$ satisfies $\epsilon$-Uniform Convergence if

$$\lim_{n \to \infty} \sup_{\mu \in M} d_P(\mu(Z^n), \mu^\infty) \leq \epsilon \quad P\text{-a.s.}$$

This requires that the set of expected parameters converges to an $\epsilon$-neighborhood of $\mu^\infty$. Then, Theorem 1 holds as long as the set of learning rules $M$ satisfies $\epsilon$-Uniform Convergence for some $\epsilon \leq \delta^\infty / (K\xi)$. The rate results do not change.

Approximate Common Certainty.
Suppose that instead of imposing common certainty in $B(z^n)$, as we have done in the main text, players have common $p$-belief in $B(z^n)$. Formally, for any probability $p \in [0, 1]$, player $i$, and set $B \subseteq \Delta(\Theta)$, define

$$B_i^{1,p}(B) := \{t_i \in T_i^* : \operatorname{marg}_\Theta \kappa_i^*(t_i) \in B\}.$$

For each $k > 1$, and again for each player $i$, recursively define

$$B_i^{k,p}(B) = \left\{ t_i \in T_i^* : \kappa_i^*(t_i)\left(\Theta \times \prod_{j \neq i} B_j^{k-1,p}(B)\right) \geq p \right\}.$$

Footnote: The set $B_i^{1,p}(B)$ has the same definition as $B_i^{1,1}$ from the main text. It is possible to relax the assumptions further, so that $B_i^{1,p}(B) := \{t_i \in T_i^* : \sup_{\nu \in B} d_P(\nu, \operatorname{marg}_\Theta \kappa_i^*(t_i)) \leq 1 - p\}$, but this does not correspond to any standard definitions.

The set $T_i^{B,p} = \bigcap_{k \geq 1} B_i^{k,p}(B)$ is the set of player $i$ types that have common $p$-belief in the event that all players' first-order beliefs belong to $B$. There exists a $\underline{p}$ such that so long as players have common $p$-belief in the event that all players' first-order beliefs belong to $B(z^n)$, where $p > \underline{p}$, then Theorem 1 holds as stated. Rate results similar to those in Section 5 can also be obtained (see Online Appendix O.5 for details). Both extensions rely on boundedness of the payoff range.

Confidence Sets for Equilibrium.
The proposed approach can be paired with solution concepts besides rationalizability. For example, suppose we are interested in evaluating an analyst's confidence in predicting that the action profile $a \in A$ is part of a (pure-strategy) Bayesian Nash equilibrium. The analogous confidence set is $[\underline{p}^n(a), \overline{p}^n(a)]$, where the lower bound $\underline{p}^n(a)$ is the probability (over possible datasets $z^n$) that $a_i$ is a best reply to $a_{-i}$ for every player $i$ of any type $t_i \in T_i^{B(z^n)}$. The upper bound $\overline{p}^n(a)$ is the probability that there exists a belief-closed type space $(T_i, \kappa_i)_{i \in I}$ where each $T_i \subseteq T_i^{B(z^n)}$, and the strategy profile $\sigma$ with $\sigma_i(t_i) = a_i$ for all $i$, $t_i \in T_i$ is a Bayesian Nash equilibrium. Then, Theorem 1 holds with "strict rationalizability" replaced with "strict equilibrium" in the limiting game, and the rate results provided in Proposition 1 hold when $\delta^\infty$ is replaced with an analogous notion for the strictness of the equilibrium in the limiting game.

Economists make predictions in incomplete information games based on models of unobservable beliefs. A large literature on the robustness of strategic predictions to the specification of agent beliefs provides guidance regarding whether these predictions should be trusted. These robustness notions tend to be qualitative: we learn whether the prediction is or isn't robust to perturbations in the agents' beliefs. Here I offer a different perspective, namely a quantitative metric for how robust the prediction is. The metric depends on the quantity of data that agents get to see. Predictions that hold given infinite quantities of data may not hold given large quantities of data, and those that hold given large quantities of data may not hold in environments where agents see only a few observations. Likewise, predictions that don't hold at the limit may nevertheless be plausible when agents' beliefs are coordinated by a small number of observations. The proposed framework provides a way of formalizing this, generating new comparative statics for how the analyst's confidence in a strategic prediction varies with primitives of the learning environment.
References
Acemoglu, D., V. Chernozhukov, and M. Yildiz (2015): “Fragility of AsymptoticAgreement under Bayesian Learning,”
Theoretical Economics , 11, 187–225.
Al-Najjar, N. (2009): “Decisionmakers as Statisticians: Diversity, Ambiguity, and Learn-ing,”
Econometrica , 77, 1371–1401.
Al-Najjar, N. and M. Pai (2014): “Coarse Decision Making and Overfitting,”
Journalof Economic Theory , 150, 467–486.
Aumann, R. J. (1976): “Agreeing to Disagree,”
The Annals of Statistics , 4, 1236–1239.
Battigalli, P. and A. Prestipino (2011): “Transparent Restrictions on Beliefs andForward Induction Reasoning in Games with Asymmetric Information,”
The B.E. Journalof Theoretical Economics , 13, 79–130.
Battigalli, P. and M. Siniscalchi (2003): “Rationalization and Incomplete Informa-tion,”
Advances in Theoretical Economics , 3, 1534–5963.
Brandenburger, A. and E. Dekel (1993): “Hierarchies of Belief and Common Knowl-edge,”
Journal of Economic Theory , 59, 189–198.
Brandenburger, A., A. Friedenberg, and J. Kiesler (2008): “Admissibility inGames,”
Econometrica , 76, 307–352.
Carlin, B. I., S. Kogan, and R. Lowery (2013): “Trading Complex Assets,”
TheJournal of Finance , 68, 1937–1960.
Carlsson, H. and E. van Damme (1993): “Global Games and Equilibrium Selection,”
Econometrica , 61, 989–1018.
Chen, Y.-C., A. di Tillio, E. Faingold, and S. Xiong (2010): “Uniform topologieson types,”
Theoretical Economics , 5, 445–478.——— (2017): “Characterizing the Strategic Impact of Misspecified Beliefs,”
Review ofEconomic Studies , 84, 1424–1471.
Chen, Y.-C. and S. Takahashi (2020): “On Robust Selection and Robust Rationaliz-ability,” Working Paper.
Cripps, M., J. Ely, G. Mailath, and L. Samuelson (2008): “Common Learning,” Econometrica, 76, 909–933.
Dekel, E., D. Fudenberg, and D. Levine (2004): “Learning to Play Bayesian Games,” Games and Economic Behavior, 46, 282–303.
Dekel, E., D. Fudenberg, and S. Morris (2006): “Topologies on Types,” Theoretical Economics, 1, 275–309.
——— (2007): “Interim Correlated Rationalizability,” Theoretical Economics, 2, 15–40.
Ely, J. and M. Peski (2011): “Critical Types,” Review of Economic Studies, 78, 907–937.
Esponda, I. (2013): “Rationalizable Conjectural Equilibrium: A Framework for Robust Predictions,” Theoretical Economics, 8, 467–501.
Gayer, G., I. Gilboa, and O. Lieberman (2007): “Rule-Based and Case-Based Reasoning in Housing Prices,” The B.E. Journal of Theoretical Economics, 7, 1–37.
Geanakoplos, J. and H. Polemarchakis (1982): “We Can’t Disagree Forever,” Journal of Economic Theory, 28, 192–200.
Geary, R. C. (1935): “The Ratio of the Mean Deviation to the Standard Deviation as a Test for Normality,” Biometrika, 27, 310–332.
Gibbs, A. and F. Su (2002): “On Choosing and Bounding Probability Metrics,” International Statistical Review.
Gilboa, I. and D. Schmeidler (1995): “Case-Based Decision Theory,” The Quarterly Journal of Economics, 110, 605–639.
——— (2003): “Inductive Inference: An Axiomatic Approach,” Econometrica, 71, 1–26.
Haghtalab, N., M. O. Jackson, and A. D. Procaccia (2020): “Belief Polarization in a Complex World: A Learning Theory Perspective,” Working Paper.
Hastie, T., R. Tibshirani, and J. Friedman (2009): The Elements of Statistical Learning, Springer.
Jehiel, P. (2005): “Analogy-Based Expectation Equilibrium,” Journal of Economic Theory, 123, 81–104.
——— (2018): “Investment Strategy and Selection Bias: An Equilibrium Perspective on Overoptimism,” American Economic Review, 108, 1582–1597.
Kajii, A. and S. Morris (1997): “The Robustness of Equilibria to Incomplete Information,” Econometrica, 65, 1283–1309.
——— (2020): “Refinements and Higher-Order Beliefs: A Unified Survey,” The Japanese Economic Review, 71, 7–34.
Kandel, E. and N. Pearson (1995): “Differential Interpretation of Information and Trade in Speculative Markets,” Journal of Political Economy, 103, 831–872.
Mankiw, G., R. Reis, and J. Wolfers (2004): “Disagreement about inflation expectations,” NBER Macroeconomics Annual 2003.
Mertens, J.-F. and S. Zamir (1985): “Formulation of Bayesian Analysis for Games with Incomplete Information,” International Journal of Game Theory, 14, 1–29.
Milgrom, P. and N. Stokey (1982): “Information, Trade, and Common Knowledge,” Journal of Economic Theory, 26, 17–27.
Monderer, D. and D. Samet (1989): “Approximating Common Knowledge with Common Beliefs,” Games and Economic Behavior, 1, 170–190.
Morris, S. (1995): “The Common Prior Assumption in Economic Theory,” Economics and Philosophy, 11, 227–253.
Morris, S., S. Takahashi, and O. Tercieux (2012): “Robust Rationalizability Under Almost Common Certainty of Payoffs,” The Japanese Economic Review, 63, 57–67.
Olea, J. M., P. Ortoleva, M. Pai, and A. Prat (2017): “Competing Models,” Working Paper.
Ollar, M. and A. Penta (2017): “Full Implementation and Belief Restrictions,” American Economic Review, 107, 2243–2277.
Rubinstein, A. (1989): “The Electronic Mail Game: Strategic Behavior Under ‘Almost Common Knowledge’,” American Economic Review, 79, 385–391.
Salant, Y. and J. Cherry (2020): “Statistical Inference in Games,” Working Paper.
Spiegler, R. (2016): “Bayesian Networks and Boundedly Rational Expectations,” Quarterly Journal of Economics, 131, 1243–1290.
Steiner, J. and C. Stewart (2008): “Contagion through Learning,” Theoretical Economics, 3, 431–458.
Weinstein, J. and M. Yildiz (2007): “A Structure Theorem for Rationalizability with Application to Robust Prediction of Refinements,” Econometrica, 75, 365–400.
——— (2017): “Interim Correlated Rationalizability in Infinite Games,” Journal of Mathematical Economics, 72, 82–87.
Appendix
A Proofs for Section 2
A.1 Proof of Claim 1
Suppose that $f(x_S) = 1$, so that the Seller's good has a high value. (The proof follows along similar lines in the other case.) I will first show that $\underline{p}_n = 0$ for every $n$. Let $\pi$ be a point mass at $f$. An agent with this prior assigns probability 1 to $v = 1$, and this degenerate belief belongs to $\mathcal{B}(z_n)$ for every $z_n$, so the type with common certainty in $v = 1$ satisfies Assumption 1. But entering is not rationalizable for the Seller with this belief, implying $\underline{p}_n = 0$.
Turning to $\overline{p}_n$, I first show that entering is rationalizable for some type satisfying Assumption 1 if and only if there exist $\tilde{f}_1, \tilde{f}_2 \in \mathcal{F}$ that are consistent with the data, and which make conflicting predictions for the Seller's good $x_S$ (Lemma 1). I characterize the probability of this event in Lemma 3, from which the comparative statics for $\overline{p}_n$ follow directly.
Lemma 1.
Fix an arbitrary data set $z_n = \{(x_i, f(x_i))\}_{i=1}^n$. Entering is rationalizable for the Seller with a belief satisfying Assumption 1 if and only if there exist $\tilde{f}_1, \tilde{f}_2 \in \mathcal{F}$ where
(1) $\tilde{f}_1(x_i) = \tilde{f}_2(x_i) = f(x_i)$ for each observation $i = 1, \dots, n$;
(2) $\tilde{f}_1(x_S) = 1$ while $\tilde{f}_2(x_S) = 0$.
Proof.
Suppose there exists a pair $\tilde{f}_1, \tilde{f}_2$ satisfying (1) and (2), and define $\pi_{\tilde{f}_1}, \pi_{\tilde{f}_2} \in \Delta(\mathcal{F})$ to be point masses on $\tilde{f}_1$ and $\tilde{f}_2$. Since these rules are consistent with the data by (1), the posterior beliefs updated to $z_n$ are likewise degenerate at $\tilde{f}_1$ and $\tilde{f}_2$, and thus assign (respectively) probability 1 to $v = 1$ and $v = 0$. This implies that degenerate distributions at 1 and 0 both belong to $\mathcal{B}(z_n)$. Entering is rationalizable for the Seller who believes that $v = 1$ and that the Buyer believes $v = 0$, and both of these beliefs belong to $\mathcal{B}(z_n)$.
Now suppose that no such pair $\tilde{f}_1, \tilde{f}_2$ exists, implying either that every $\tilde{f} \in \mathcal{F}$ consistent with the data predicts $f(x_S) = 0$, or that every $\tilde{f} \in \mathcal{F}$ consistent with the data predicts $f(x_S) = 1$. Then either $\mathcal{B}(z_n)$ is the singleton set consisting of a degenerate distribution at 1, or it is the singleton set consisting of a degenerate distribution at 0. If the former, the only type satisfying Assumption 1 is the one with common certainty in $v = 1$, and if the latter, the only type satisfying Assumption 1 is the one with common certainty in $v = 0$. Entering is not rationalizable for the Seller with either of these beliefs.
(Here, and elsewhere in the proof, type $t_i$ has common certainty in $v = 1$ if it has common certainty in the event $\{f \mid f(x_S) = 1\} \times (X \times \{0,1\}) \times T^*_{-i}$.)
Lemma 2. Suppose the true function is $f(x) = \mathbb{1}(x \in R)$ where $R = [-\underline{r}_1, \overline{r}_1] \times [-\underline{r}_2, \overline{r}_2] \times \dots \times [-\underline{r}_m, \overline{r}_m]$ for a sequence of constants $\underline{r}_1, \overline{r}_1, \dots, \underline{r}_m, \overline{r}_m \in (0,1)$. Then
$$\overline{p}_n(a_i) = 1 - \prod_{k=1}^{m} \left( 1 - \left(\tfrac{1}{2}\right)^n \left[ (2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n \right] \right).$$
Proof.
From Lemma 1, the probability $\overline{p}_n$ is equal to the measure of data sets $z_n$ given which there exist $\tilde{f}_1, \tilde{f}_2 \in \mathcal{F}$ that are consistent with $z_n$, and which make conflicting predictions at the input $x_S$. The true classification rule $f$ is always consistent with the data, and predicts $f(x_S) = 1$, so a pair of such rules exists if we can additionally find a rule $\tilde{f} \in \mathcal{F}$ consistent with the data that predicts $\tilde{f}(x_S) = 0$. This is possible when there exists a dimension $k$ on which either every observation $x_i$ satisfies $x_i^k < 0$, or every $x_i$ satisfies $0 < x_i^k$. This allows some $\tilde{f} \in \mathcal{F}$ to be consistent with the data, but to predict 0 at the zero vector.
For each dimension $k$, the probability that there is at least one observation $x_i$ with $x_i^k \in [-\underline{r}_k, 0)$ and at least one observation $x_j$ with $x_j^k \in (0, \overline{r}_k]$ is
$$1 - \left(\tfrac{1}{2}\right)^n \left[ (2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n \right].$$
Observe that attribute values are independent across dimensions. So the probability that for every dimension $k$, there is at least one observation $x_i$ with $x_i^k \in [-\underline{r}_k, 0)$ and at least one observation $x_j$ with $x_j^k \in (0, \overline{r}_k]$, is
$$\prod_{k=1}^{m} \left( 1 - \left(\tfrac{1}{2}\right)^n \left[ (2 - \underline{r}_k)^n + (2 - \overline{r}_k)^n - (2 - (\underline{r}_k + \overline{r}_k))^n \right] \right).$$
The desired probability is the complement of this event, which yields the expression in the lemma.
The following functional form is used in the main text:
Corollary 3. In the special case in which the true function is $f(x) = \mathbb{1}(x \in R)$ where $R = [-a, a]^m$ for some $a \in (0,1)$,
$$\overline{p}_n(a_i) = 1 - \left[ 1 - \left( 2\left(1 - \tfrac{a}{2}\right)^n - (1 - a)^n \right) \right]^m.$$
A.2 Proof of Claim 2
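Before turning to Claim 2, the closed forms of Lemma 2 and Corollary 3 from Section A.1 can be checked against each other numerically. The sketch below is only an illustration: the function names are hypothetical, and the formula is taken exactly as stated in Lemma 2 (its factor $(1/2)^n$ suggests attributes drawn uniformly from $[-1, 1]$ in each dimension, which is an assumption here rather than something restated in this appendix).

```python
def upper_confidence(n, r_lower, r_upper):
    """Closed form of Lemma 2 (hypothetical helper name).

    r_lower[k] and r_upper[k] are the constants describing the true
    rectangle R in dimension k, each in (0, 1); n is the number of
    observations.
    """
    prob_every_dim_pinned = 1.0
    for lo, hi in zip(r_lower, r_upper):
        # Probability that dimension k lacks an observation on at least
        # one side of zero inside the rectangle's range.
        miss = 0.5 ** n * ((2 - lo) ** n + (2 - hi) ** n - (2 - (lo + hi)) ** n)
        prob_every_dim_pinned *= 1.0 - miss
    # Complement: some dimension leaves room for a conflicting rule.
    return 1.0 - prob_every_dim_pinned


def upper_confidence_symmetric(n, a, m):
    """Corollary 3: the special case R = [-a, a]^m."""
    return 1.0 - (1.0 - (2 * (1 - a / 2) ** n - (1 - a) ** n)) ** m


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, upper_confidence_symmetric(10, 0.3, m))
```

With these expressions the value rises with the number of dimensions $m$ and falls with the number of observations $n$, which is the comparative static the main text draws on.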
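I first demonstrate the following lemma, which characterizes the probabilities $\underline{p}_n$ and $\overline{p}_n$.
Lemma 3. For every $n \ge 2$,
$$\underline{p}_n = 1 - \Phi\left( z_\alpha - \frac{\beta - 1}{\sigma} \sqrt{\frac{n^2 - 1}{12}} \right) \quad \text{while} \quad \overline{p}_n = 1 - \Phi\left( -z_\alpha - \frac{\beta - 1}{\sigma} \sqrt{\frac{n^2 - 1}{12}} \right),$$
where $z_\alpha = -\Phi^{-1}(\alpha/2)$ with $\Phi$ denoting the CDF of the standard normal distribution.
Since $\beta > 1$, both probabilities are decreasing in $\sigma$ and increasing in $n$. Thus Claim 2 follows.
Towards this lemma, I first prove the following intermediate result:
Lemma 4.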
Write $T_i^C$ for the set of player $i$ types with common certainty in the event that all players have first-order beliefs that assign probability 1 to $C$.
(a) The strong policy is rationalizable for all types $t_i \in T_i^C$ if and only if $C \subseteq [1, \infty)$.
(b) The strong policy is rationalizable for some type $t_i \in T_i^C$ if and only if $C \cap [1, \infty) \neq \emptyset$.
Proof. (a) $C \subseteq [1, \infty)$ is a necessary condition, as otherwise the strong policy is not rationalizable for any type with common certainty in $\beta \in C \setminus [1, \infty)$. Suppose $C \subseteq [1, \infty)$ and choose any $t_i \in T_i^C$. For each $\beta \in C$, $u_i(\text{strong}, \text{strong}, \beta) = -1$ and $u_i(\text{weak}, \text{strong}, \beta) = -\beta \le -1$. So
$$\int u_i(\text{strong}, \text{strong}, \beta)\, t_i^1(\beta)\, d\beta = -1 \ge \int u_i(\text{weak}, \text{strong}, \beta)\, t_i^1(\beta)\, d\beta$$
where $t_i^1$ denotes the first-order belief of type $t_i$. Thus the family of sets $(R_1, R_2)$ with $R_1 = R_2 = \{\text{strong}\}$ is closed under best reply, and rationalizability of the strong policy follows.
(b) Suppose $C \cap [1, \infty) = \emptyset$. Then for every $\beta \in C$, $u_i(\text{strong}, \text{strong}, \beta) = -1 < -\beta = u_i(\text{weak}, \text{strong}, \beta)$. So the strong policy is strictly dominated (and hence not rationalizable) for player $i$ given any type $t_i \in T_i^C$. If instead $C \cap [1, \infty) \neq \emptyset$, then the strong policy is rationalizable for any type with common certainty in some $\beta$ in this intersection. So the strong policy is rationalizable for at least one type $t_i \in T_i^C$, as desired.
I now prove Lemma 3.
Proof.
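Using standard results for ordinary least-squares (Hastie et al., 2009), the distribution of the OLS estimator $\hat{\beta}$ is
$$\hat{\beta} \sim N\left( \beta, \frac{\sigma^2}{\frac{1}{n} \sum_{t=1}^{n} (t - \bar{t})^2} \right)$$
where $\bar{t} = \frac{1}{n} \sum_{t=1}^{n} t$. Since
$$\frac{1}{n} \sum_{t=1}^{n} (t - \bar{t})^2 = \frac{1}{n} \left( \sum_{t=1}^{n} t^2 - 2\bar{t} \sum_{t=1}^{n} t + \sum_{t=1}^{n} \bar{t}^2 \right) = \frac{1}{6}(n+1)(2n+1) - \frac{1}{2}(n+1)^2 + \left( \frac{n+1}{2} \right)^2 = \frac{1}{12}(n^2 - 1)$$
we can simplify the variance of $\hat{\beta}$ to $\frac{12\sigma^2}{n^2 - 1}$. The $(1 - \alpha)$-confidence interval for $\beta$ given data $z_n$ is thus
$$C(z_n) = \left[ \hat{\beta}(z_n) - z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}},\ \hat{\beta}(z_n) + z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}} \right] \quad (8)$$
where $\hat{\beta}(z_n)$ is the OLS estimate of $\beta$ given the data $z_n$, and $z_\alpha = -\Phi^{-1}(\alpha/2)$ is the critical value associated with the $(1 - \alpha)$-confidence level. The probability that the interval in (8) is contained in $[1, \infty)$ is
$$\Pr\left( \hat{\beta}(z_n) > 1 + z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}} \right),$$
which is in turn equal to
$$1 - \Phi\left( z_\alpha - \frac{\beta - 1}{\sigma} \sqrt{\frac{n^2 - 1}{12}} \right). \quad (9)$$
By Part (a) of Lemma 4, $\underline{p}_n$ is equal to (9), delivering the first part of the lemma.
The probability that the interval in (8) has nonempty intersection with $[1, \infty)$ is given by
$$\Pr\left( \hat{\beta}(z_n) > 1 - z_\alpha \cdot \sigma \cdot \sqrt{\frac{12}{n^2 - 1}} \right),$$
which is equal to
$$1 - \Phi\left( -z_\alpha - \frac{\beta - 1}{\sigma} \sqrt{\frac{n^2 - 1}{12}} \right). \quad (10)$$
By Part (b) of Lemma 4, $\overline{p}_n$ is equal to (10), concluding the proof.
B Proofs for Main Results (Sections 4 and 5)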
B.1 Proof of Theorem 1 Part (a)
Recall that $\Theta$ and $U$ are both endowed with the sup-norm, and the map $g : \Theta \to U$ has Lipschitz constant $K$. The set of probability measures $\Delta(\Theta)$ is endowed with the Prokhorov metric $d_P$. The Wasserstein distance on $\Delta(\Theta)$ is
$$d_W(\nu, \nu') = \sup\left\{ \int h\, d\nu - \int h\, d\nu' : \|h\|_L \le 1 \right\}$$
where $\|h\|_L$ is the Lipschitz constant of the function $h : \Theta \to \mathbb{R}$.
Lemma 5. Fix any player $i$, action $a_i \in A_i$, mixed strategy $\alpha_i \in \Delta(A_i)$, and set $R_{-i} \subseteq A_{-i}$. Let $a_{-i}(\theta) : \Theta \to \Delta(A_{-i})$ be any function satisfying
$$a_{-i}(\theta) \in \operatorname*{argmax}_{a_{-i} \in R_{-i}} \left( u_i(a_i, a_{-i}, \theta) - u_i(\alpha_i, a_{-i}, \theta) \right) \quad \forall \theta \in \Theta$$
and define $h : \Theta \to \mathbb{R}$ by
$$h(\theta) = u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta).$$
Then, the function $h$ is Lipschitz continuous with Lipschitz constant $2K$.
Proof. Choose any $\theta, \theta' \in \Theta$, and without loss of generality, suppose $h(\theta) \ge h(\theta')$. Then
$$|h(\theta) - h(\theta')| = \left| \left( u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta) \right) - \left( u_i(a_i, a_{-i}(\theta'), \theta') - u_i(\alpha_i, a_{-i}(\theta'), \theta') \right) \right|$$
$$\le \left| \left( u_i(a_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta) \right) - \left( u_i(a_i, a_{-i}(\theta), \theta') - u_i(\alpha_i, a_{-i}(\theta), \theta') \right) \right|$$
$$\le \left| u_i(a_i, a_{-i}(\theta), \theta) - u_i(a_i, a_{-i}(\theta), \theta') \right| + \left| u_i(\alpha_i, a_{-i}(\theta), \theta) - u_i(\alpha_i, a_{-i}(\theta), \theta') \right|$$
$$\le 2 \|g(\theta) - g(\theta')\|_\infty \le 2K \|\theta - \theta'\|_\infty$$
using in the second line that $a_{-i}(\theta')$ maximizes the payoff difference at $\theta'$, and in the final inequality that $g : \Theta \to U$ has Lipschitz constant $K$.
Below, let $F^\epsilon$ denote the $\epsilon$-neighborhood of the set $F$.
Lemma 6.
Suppose $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$, where $\delta > 0$. Let $\mathcal{B}$ be any subset of $\{\mu^\infty\}^{\delta/(2K\xi)}$, where $K$ is the Lipschitz constant of $g : \Theta \to U$, and $\xi$ is as defined in (5). Then, $a_i$ is rationalizable for all types $t_i \in T_i^{\mathcal{B}}$.
Proof.
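Fix $\epsilon > 0$, and consider an arbitrary set $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$. I will show that $a_i$ is rationalizable for all types $t_i \in T_i^{\mathcal{B}}$ when $\epsilon$ is sufficiently small.
To show this, I use Proposition 1 from Chen et al. (2010):
Proposition 4 (Chen et al. (2010)). For each $k \ge 1$, player $i \in I$, type $t_i \in T_i$, and action $a_i \in A_i$, we have $a_i \in S_i^k[t_i]$ if and only if for each $\alpha_i \in \Delta(A_i \setminus \{a_i\})$, there exists a measurable $\sigma_{-i} : \Theta \times T_{-i} \to \Delta(A_{-i})$ with
$$\operatorname{supp} \sigma_{-i}(\theta, t_{-i}) \subseteq S_{-i}^{k-1}[t_{-i}] \quad \forall (\theta, t_{-i}) \in \Theta \times T_{-i}$$
such that
$$\int_{\Theta \times T_{-i}} \left[ u_i(a_i, \sigma_{-i}(\theta, t_{-i}), \theta) - u_i(\alpha_i, \sigma_{-i}(\theta, t_{-i}), \theta) \right] t_i[d\theta \times dt_{-i}] \ge 0.$$
(Chen et al. (2010) demonstrate a similar result for finite state spaces $\Theta$; see their Proposition 2. I use ideas from their proof here, but consider a more general environment, replacing finiteness of $\Theta$ with Lipschitz continuity of $g : \Theta \to U$. Proposition 1 from Chen et al. (2010) characterizes $\gamma$-rationalizability for arbitrary $\gamma \in \mathbb{R}$. For the purposes of this proof, it is sufficient to set $\gamma = 0$.)
By assumption, there exists $\delta \in \mathbb{R}_{++}$ such that $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$. This implies that there exists a family of sets $(R_j)_{j \in I} \subseteq \prod_{j \in I} A_j$, where $a_i \in R_i$, and for every $a_j \in R_j$ there exists a $\sigma_j : \Theta \to \Delta(A_{-j})$ satisfying
$$\operatorname{supp} \sigma_j(\theta) \subseteq R_{-j} \quad \forall \theta \in \Theta$$
and
$$\int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu^\infty - \int_\Theta u_j(a_j', \sigma_j(\theta), \theta)\, d\mu^\infty \ge \delta \quad \forall a_j' \neq a_j. \quad (11)$$
I will show that for each $k \ge 1$, player $j$, type $t_j \in T_j^{\mathcal{B}}$, action $a_j \in R_j$, and mixed strategy $\alpha_j \in \Delta(A_j \setminus \{a_j\})$, there exists a measurable $\sigma_{-j} : \Theta \times T_{-j}^{\mathcal{B}} \to \Delta(A_{-j})$ with
$$\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j} \quad \forall (\theta, t_{-j}) \in \Theta \times T_{-j}^{\mathcal{B}}$$
and
$$\int_{\Theta \times T_{-j}^{\mathcal{B}}} \left[ u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta) - u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta) \right] t_j[d\theta \times dt_{-j}] \ge 0. \quad (12)$$
Since $a_i \in R_i$ by design, it follows from Proposition 4 that for any type $t_i \in T_i^{\mathcal{B}}$, the action $a_i \in S_i^k[t_i]$ for every $k$, and hence $a_i \in S_i^\infty[t_i]$, as desired.
Fix an arbitrary player $j$, $a_j \in R_j$, type $t_j \in T_j^{\mathcal{B}}$, and $\alpha_j \in \Delta(A_j \setminus \{a_j\})$. Define $a_{-j} : \Theta \to A_{-j}$ to satisfy
$$a_{-j}(\theta) \in \operatorname*{argmax}_{a_{-j} \in R_{-j}} \left( u_j(a_j, a_{-j}, \theta) - u_j(\alpha_j, a_{-j}, \theta) \right) \quad \forall \theta \in \Theta$$
and define $\sigma_{-j} : \Theta \times T_{-j}^{\mathcal{B}} \to \Delta(A_{-j})$ so that each $\sigma_{-j}(\theta, t_{-j})$ is a point mass at $a_{-j}(\theta)$. Then by definition
$$\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j} \quad \forall (\theta, t_{-j}) \in \Theta \times T_{-j}^{\mathcal{B}}.$$
Further define
$$h(\theta) := u_j(a_j, a_{-j}(\theta), \theta) - u_j(\alpha_j, a_{-j}(\theta), \theta) \quad \forall \theta \in \Theta.$$
For notational ease, write $\nu \in \Delta(\Theta)$ for the first-order belief of type $t_j$. Then
$$\int_{\Theta \times T_{-j}^{\mathcal{B}}} u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}] - \int_{\Theta \times T_{-j}^{\mathcal{B}}} u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta)\, t_j[d\theta \times dt_{-j}]$$
$$= \int_\Theta u_j(a_j, a_{-j}(\theta), \theta)\, \nu[d\theta] - \int_\Theta u_j(\alpha_j, a_{-j}(\theta), \theta)\, \nu[d\theta] = \int_\Theta h(\theta)\, \nu[d\theta]$$
so the desired condition in (12) follows if we can show that $\int_\Theta h(\theta)\, \nu[d\theta] \ge 0$.
By Lemma 5, the function $h : \Theta \to \mathbb{R}$ has Lipschitz constant $2K$, so
$$\left| \int_\Theta h(\theta)\, d\nu - \int_\Theta h(\theta)\, d\mu^\infty \right| \le 2K \cdot d_W(\nu, \mu^\infty)$$
where $d_W$ is the Wasserstein distance on $\Delta(\Theta)$. This implies
$$\int_\Theta h(\theta)\, d\nu \ge \int_\Theta h(\theta)\, d\mu^\infty - 2K \cdot d_W(\nu, \mu^\infty).$$
Applying Theorem 2 in Gibbs and Su (2002), $d_W(\nu, \mu^\infty) \le \xi \cdot d_P(\nu, \mu^\infty)$, where $d_P$ is the Prokhorov distance on $\Delta(\Theta)$ and $\xi$ is as defined in (5). So
$$\int_\Theta h(\theta)\, d\nu \ge \int_\Theta h(\theta)\, d\mu^\infty - 2K\xi \cdot d_P(\nu, \mu^\infty). \quad (13)$$
It follows from the inequality in (11), together with the fact that $a_{-j}(\theta)$ maximizes the payoff difference over $R_{-j}$, that
$$\int_\Theta h(\theta)\, d\mu^\infty \ge \int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu^\infty - \int_\Theta u_j(\alpha_j, \sigma_j(\theta), \theta)\, d\mu^\infty \ge \delta,$$
so (13) implies
$$\int_\Theta h(\theta)\, d\nu \ge \delta - 2K\xi \cdot d_P(\nu, \mu^\infty).$$
Finally, by assumption that $t_j \in T_j^{\mathcal{B}}$ for some $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$, the Prokhorov distance between the first-order belief $\nu$ of type $t_j$ and the limiting belief $\mu^\infty$ is $d_P(\nu, \mu^\infty) \le \epsilon$. So
$$\int_\Theta h(\theta)\, d\nu \ge \delta - 2K\xi\epsilon.$$
It follows that $\epsilon \le \delta/(2K\xi)$ is a sufficient condition for the constructed $\sigma_{-j}$ to satisfy the desired condition in (12).
Since $a_i$ is strictly rationalizable for type $t_i^\infty$ (by assumption), there exists a $\delta \in \mathbb{R}_{++}$ for which $a_i$ is $\delta$-strictly rationalizable. Assumption 3 implies that
$$\lim_{n \to \infty} P^n\left( \left\{ z_n \,\middle|\, \sup_{\mu \in \mathcal{M}} d_P(\mu(z_n), \mu^\infty) \le \epsilon \right\} \right) = 1 \quad \forall \epsilon > 0,$$
which further implies
$$\lim_{n \to \infty} P^n\left( \left\{ z_n \,\middle|\, \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta/(2K\xi)} \right\} \right) = 1.$$
By Lemma 6,
$$\underline{p}_n(a_i) \ge P^n\left( \left\{ z_n \,\middle|\, \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta/(2K\xi)} \right\} \right) \quad \forall n \ge 1,$$
so $\underline{p}_n(a_i) \to 1$. Theorem 1 Part (a) follows.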
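The step in the proof of Lemma 6 that bounds $|\int h\, d\nu - \int h\, d\mu|$ by the Lipschitz constant of $h$ times the Wasserstein distance can be illustrated on a toy example. The sketch below is not part of the proof: it assumes a one-dimensional parameter, two empirical measures with equally many atoms (for which the Wasserstein-1 distance is the mean gap between sorted atoms), a hypothetical value of $K$, and a made-up $2K$-Lipschitz function standing in for $h$.

```python
import random


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two empirical measures on the real
    line that have the same number of equally weighted atoms."""
    assert len(xs) == len(ys)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def expectation(h, atoms):
    """Integral of h against the empirical measure on `atoms`."""
    return sum(h(x) for x in atoms) / len(atoms)


K = 1.5  # hypothetical Lipschitz constant of g


def h(theta):
    # Hypothetical stand-in for the payoff difference h; it is 2K-Lipschitz.
    return 2 * K * abs(theta - 0.3)


rng = random.Random(0)
mu = [rng.uniform(0.0, 1.0) for _ in range(200)]       # "limiting" belief
nu = [x + rng.uniform(-0.05, 0.05) for x in mu]        # nearby belief

gap = abs(expectation(h, nu) - expectation(h, mu))
bound = 2 * K * wasserstein_1d(nu, mu)
print(gap, bound)  # the gap never exceeds the bound
```

In the proof this bound is then combined with $d_W \le \xi \cdot d_P$, so that a payoff margin of $\delta$ under the limiting belief survives whenever the Prokhorov distance is at most $\delta/(2K\xi)$.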
B.2 Proof of Theorem 1 Part (b)
I begin by reviewing definitions from Chen et al. (2010) that will be used in the proof. For each player $i$, let $X_i^0 = \Theta$, and recursively for $k \ge 1$, define $X_i^k = \Theta \times \prod_{j \neq i} \Delta(X_j^{k-1})$. The space of $k$-th order beliefs for player $i$ is defined $T_i^k := \Delta(X_i^{k-1})$, noting that each $T_i^k = \Delta(\Theta \times T_{-i}^{k-1})$. (The sets $X^k$ defined in Section 3.1 can be identified with the sets $X^k$ defined in this way.) The uniform-weak metric on the universal type space $T_i^*$ is
$$d_i^{UW}(s_i, t_i) = \sup_{k \ge 1} d_i^k(s_i, t_i) \quad \forall s_i, t_i \in T_i^*$$
where $d^0$ is the supremum norm on $\Theta$ and recursively for $k \ge 1$, $d_i^k$ is the Prokhorov distance on $\Delta(\Theta \times T_{-i}^{k-1})$ induced by the metric $\max\{d^0, d_{-i}^{k-1}\}$ on $\Theta \times T_{-i}^{k-1}$. The uniform-weak topology on the universal type space is the metric topology induced by $d_i^{UW}$.
Lemma 7.
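Let $\mathcal{B}$ be a subset of $\{\mu^\infty\}^\epsilon$, and choose any $s_i \in T_i^{\mathcal{B}}$. Then, $d_i^{UW}(t_i^\infty, s_i) \le \epsilon$.
Proof. For simplicity of notation, write $t_i$ for $t_i^\infty$. It will be useful to define
$$T_i^{\mathcal{B},k} = \left\{ s_i^k \in T_i^k \,\middle|\, s_i \in T_i^{\mathcal{B}} \right\}$$
for the set of all $k$-th order beliefs that are consistent with some type $s_i \in T_i^{\mathcal{B}}$. I will show that
$$d_P\left( T_i^{\mathcal{B},k}, t_i^k \right) := \sup_{s_i^k \in T_i^{\mathcal{B},k}} d_P(s_i^k, t_i^k) \le \epsilon \quad \forall k \ge 1. \quad (15)$$
By definition, $T_i^{\mathcal{B},1} = \mathcal{B}$, so the assumption $\mathcal{B} \subseteq \{\mu^\infty\}^\epsilon$ immediately implies (15) for $k = 1$. Proceeding by induction, suppose $d_P\left( T_i^{\mathcal{B},k}, t_i^k \right) \le \epsilon$, and consider any measurable set $E \subseteq T^k$. If $t_i^k \in E$, then $t_i^{k+1}(E) = 1$ by definition of $t_i$. Also,
$$s_i^{k+1}(E^\epsilon) \ge s_i^{k+1}\left( \{t_i^k\}^\epsilon \right) \ge s_i^{k+1}\left( T_i^{\mathcal{B},k} \right) = 1$$
since $s_i \in T_i^{\mathcal{B}}$. So
$$t_i^{k+1}(E) \le s_i^{k+1}(E^\epsilon) + \epsilon. \quad (16)$$
If $t_i^k \notin E$, then $t_i^{k+1}(E) = 0$ (again by definition of $t_i$), so (16) follows trivially. Thus
$$d_i^{k+1}(t_i, s_i) = \inf\left\{ \delta \,\middle|\, t_i^{k+1}(E) \le s_i^{k+1}(E^\delta) + \delta \ \ \forall \text{ measurable } E \subseteq T_i^k \right\} \le \epsilon$$
and so
$$d_i^{UW}(t_i, s_i) = \sup_{k \ge 1} d_i^k(t_i, s_i) \le \epsilon$$
as desired.
Lemma 7 implies the subsequent corollary.
Corollary 4.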
Suppose Assumption 3 holds. Consider any sequence $z \in Z$ satisfying
$$\lim_{n \to \infty} \sup_{\mu \in \mathcal{M}} d_P(\mu(z_n), \mu^\infty) = 0$$
and choose any sequence of types $(s_i^n)_{n=1}^\infty$ with $s_i^n \in T_i^{\mathcal{B}(z_n)}$ for each $n \ge 1$. Then
$$\lim_{n \to \infty} d_i^{UW}(t_i^\infty, s_i^n) = 0.$$
(The definition of the uniform-weak metric above is slightly modified from Chen et al. (2010), where $d^0$ was the discrete metric on $\Theta$. The change reflects the difference that $\Theta$ was taken to be a finite set in Chen et al. (2010), while it is a compact and convex subset of Euclidean space here. Here and elsewhere, $t_i^k$ denotes the $k$-th order belief of type $t_i$.)
Now we will complete the proof of Theorem 1 Part (b). By Assumption 3, there is a set $Z^* \subseteq Z$ of $P$-measure 1 such that
$$\lim_{n \to \infty} \sup_{\mu \in \mathcal{M}} d_P(\mu(z_n), \mu^\infty) = 0 \quad \forall z \in Z^*. \quad (18)$$
Suppose towards contradiction that $\overline{p}_n(a_i) \not\to 0$. Then, there is a set $\widehat{Z} \subseteq Z$ with strictly positive $P$-measure such that for every $z \in \widehat{Z}$, there is a sequence of types $(t_i^n(z))_{n=1}^\infty$ where $t_i^n(z) \in T_i^{\mathcal{B}(z_n)}$ for every $n \ge 1$, and $a_i \in S_i^\infty[t_i^n(z)]$ for all $n$ sufficiently large.
But since $Z^*$ has $P$-measure 1, it must be that $\widehat{Z} \cap Z^* \neq \emptyset$. Choose any $z$ from this intersection. Then, Corollary 4 and the display in (18) imply that $t_i^n(z) \to t_i^\infty$ in the uniform-weak topology. But rationalizability is upper hemi-continuous in the uniform-weak topology (Theorem 1, Chen et al. (2010)). So $a_i \notin S_i^\infty[t_i^\infty]$ implies $a_i \notin S_i^\infty[t_i^n(z)]$ for infinitely many $n$, a contradiction.
B.3 Proof of Proposition 5
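By assumption, $a_i$ is strictly rationalizable for type $t_i^\infty$, so $\delta^\infty > 0$. Applying Lemma 6,
$$\underline{p}_n(a_i) \ge P^n\left( \left\{ z_n : \mathcal{B}(z_n) \subseteq \{\mu^\infty\}^{\delta^\infty/(2K\xi)} \right\} \right) = P^n\left( \left\{ z_n : \sup_{\mu \in \mathcal{M}} d_P(\mu(z_n), \mu^\infty) \le \delta^\infty/(2K\xi) \right\} \right) \ge 1 - \frac{2K\xi}{\delta^\infty}\, E\left( \sup_{\mu \in \mathcal{M}} d_P(\mu(Z_n), \mu^\infty) \right)$$
using Markov's inequality in the final line.
B.4 Proof of Proposition 2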
Suppose $a_i$ is strictly rationalizable for player $i$ in the complete information game $\theta^\infty$, and let $\delta^\infty$ be as defined in (4). Then, there exists a family of sets $(R_j)_{j \in I}$ with $a_i \in R_i$, where for each player $j$ and action $a_j \in R_j$, there is a mixed strategy $\sigma_{-j} \in \Delta(A_{-j})$ satisfying $\sigma_{-j}[R_{-j}] = 1$, and
$$u_j(a_j, \sigma_{-j}, \theta^\infty) - u_j(a_j', \sigma_{-j}, \theta^\infty) \ge \delta^\infty \quad \forall a_j' \neq a_j.$$
Now consider an arbitrary set $C \subseteq \{\theta^\infty\}^\epsilon$ and a type $t_i$ with common certainty in the event that every player's first-order belief assigns probability 1 to $C$. Write $\nu \in \Delta(\Theta)$ for the first-order belief of type $t_i$, and $\mu^\infty$ for the belief degenerate at $\theta^\infty$. For any action $a_j' \neq a_j$,
$$\int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j', \sigma_{-j}, \theta)\, d\nu$$
$$= \int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty + \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty + \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\nu$$
$$\ge \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty - \left| \int u_j(a_j, \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j, \sigma_{-j}, \theta)\, d\mu^\infty \right| - \left| \int u_j(a_j', \sigma_{-j}, \theta)\, d\nu - \int u_j(a_j', \sigma_{-j}, \theta)\, d\mu^\infty \right|$$
$$\ge \delta^\infty - 2K \cdot d_P(\nu, \mu^\infty) \ge \delta^\infty - 2K\epsilon$$
using in the penultimate inequality that $g : \Theta \to U$ has Lipschitz constant $K$. Since this bound on the payoff difference holds across all actions $a_j' \neq a_j$, the action $a_j$ is a best reply to belief $\nu$ whenever $\epsilon \le \delta^\infty/(2K)$.
This allows us to construct the lower bound
$$\underline{p}_n(a_i) \ge Q^n\left( \left\{ z_n : C(z_n) \subseteq \{\theta^\infty\}^{\delta^\infty/(2K)} \right\} \right) = Q^n\left( \left\{ z_n : \sup_{\theta \in C(z_n)} \|\theta - \theta^\infty\|_\infty \le \delta^\infty/(2K) \right\} \right) \ge 1 - \frac{2K}{\delta^\infty}\, E\left( \sup_{\theta \in C(Z_n)} \|\theta - \theta^\infty\|_\infty \right)$$
using Markov's inequality in the final line.
For Online Publication
O.1 Proof of Claim 3
Fix an arbitrary $(\pi, q) \in (0,1) \times (1/2, 1)$. Given data $z_n$, the posterior belief $\mu_{\pi,q}(z_n)$ assigns probability
$$\hat{v}(\pi, q, z_n) := 1 \Big/ \left( 1 + \frac{1 - \pi}{\pi} \left( \frac{1 - q}{q} \right)^{n(2\bar{z}_n - 1)} \right) \quad (19)$$
to $v = 1$, where $\bar{z}_n = \frac{1}{n} \sum_{i=1}^{n} z_i$ denotes the average realization in the sequence $z_n$.
Suppose without loss that $v = 1$, and let $q^* \in (1/2, 1)$ be the true frequency of $z = 1$. By the strong Law of Large Numbers, there is a measure-1 set of sequences $Z^*$ satisfying $\lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} z_i \right) = q^*$ for every $z = (z_1, z_2, \dots) \in Z^*$. The expression in (19) converges to 1 on this set for every $(\pi, q) \in (0,1) \times (1/2, 1)$. So Assumption 2 is satisfied, and the limiting belief $\mu^\infty$ assigns probability 1 to $v = 1$. Since entering is not rationalizable for the Seller given common certainty in the event that all players assign probability 1 to $v = 1$, it follows that $\underline{p}(\infty) = \overline{p}(\infty) = 0$.
I now show that $\overline{p}(n)$ converges to 1 as $n \to \infty$. Fix an arbitrary $n$, and define $Z_n^\dagger = \{ z_n \mid \bar{z}_n > 1/2 \}$ to be the set of length-$n$ sequences with majority realizations of $z = 1$. For every $z_n \in Z_n^\dagger$, the expression $((1 - q)/q)^{n(2\bar{z}_n - 1)}$ is bounded between 0 and 1 on the domain $q \in (1/2, 1)$, while the image of $(1 - \pi)/\pi$ is all of $\mathbb{R}_{++}$. Thus, the display in (19) ranges from zero to 1; that is,
$$\{ \hat{v}(\pi, q, z_n) : \pi \in (0,1),\ q \in (1/2, 1) \} = (0, 1) \quad \forall z_n \in Z_n^\dagger.$$
It follows that for every $z_n \in Z_n^\dagger$, there exist pairs $(\pi, q), (\pi', q') \in (0,1) \times (1/2, 1)$ satisfying $\hat{v}(\pi, q, z_n) < p < \hat{v}(\pi', q', z_n)$. Entering is rationalizable for the Seller with a type that assigns probability $\hat{v}(\pi', q', z_n)$ to the high value, and which assigns probability 1 to the Buyer assigning probability $\hat{v}(\pi, q, z_n)$ to the high value. So entering is weakly $\mathcal{B}(z_n)$-rationalizable for every $z_n \in Z_n^\dagger$, implying $\overline{p}_n(a_i) \ge P^n(Z_n^\dagger)$.
Again by the law of large numbers, the measure of datasets with majority realizations of $z = 1$ converges to 1 as $n \to \infty$; that is, $P^n(Z_n^\dagger) \to 1$. So $\lim_{n \to \infty} \overline{p}_n(a_i) = 1$, as desired.
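The posterior formula (19) and the range argument above can be checked directly. The following is a minimal sketch with hypothetical function names; it assumes binary signals that equal 1 with probability $q$ when $v = 1$ and with probability $1 - q$ when $v = 0$, which is the symmetric structure that (19) implies.

```python
def posterior_high(pi, q, zs):
    """Posterior probability of v = 1 as in (19).

    pi: prior probability of v = 1; q: signal precision in (1/2, 1);
    zs: list of 0/1 signal realizations.
    """
    n = len(zs)
    # n * (2 * zbar - 1) equals (#ones - #zeros); computed with integers.
    exponent = 2 * sum(zs) - n
    odds_against = (1 - pi) / pi * ((1 - q) / q) ** exponent
    return 1.0 / (1.0 + odds_against)


def posterior_high_bayes(pi, q, zs):
    """The same posterior computed from Bayes' rule, as a cross-check."""
    ones = sum(zs)
    zeros = len(zs) - ones
    like_high = q ** ones * (1 - q) ** zeros   # likelihood when v = 1
    like_low = (1 - q) ** ones * q ** zeros    # likelihood when v = 0
    return pi * like_high / (pi * like_high + (1 - pi) * like_low)


if __name__ == "__main__":
    data = [1, 1, 0, 1, 1]  # a majority of realizations equal to 1
    for prior in (1e-9, 0.5, 1 - 1e-9):
        print(prior, posterior_high(prior, 0.6, data))
```

Holding a majority-ones dataset fixed, pushing the prior toward 0 or 1 moves the posterior toward 0 or 1, which is the sense in which some pair of learning rules always straddles the price $p$.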
O.2 Proof of Corollary 1
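First observe that $\delta^\infty = \beta - 1$, since the action Strong is $\delta$-strictly rationalizable for every $\delta < \beta - 1$, but not for any $\delta \ge \beta - 1$. It remains to determine $E\left[ \sup_{\theta \in C(Z_n)} \|\theta - \theta^\infty\|_\infty \right]$. Write $\bar{Z}_n$ for the (random) empirical mean of $n$ signal realizations, and $\hat{\beta}_x(z_n)$ for the expectation of $\beta$ given signals $z_n$ and prior $\beta \sim N(x, 1)$. Then, using standard formulas for updating to Gaussian signals:
$$E\left( \sup_{x \in [-\eta, \eta]} |\beta - \hat{\beta}_x(Z_n)| \right) = E\left[ \max_{x \in [-\eta, \eta]} \left| \beta - \frac{x + n\bar{Z}_n}{n + 1} \right| \right]$$
We can further bound the RHS as follows:
$$E\left[ \max_{x \in [-\eta, \eta]} \left| \beta - \frac{x + n\bar{Z}_n}{n + 1} \right| \right] \le E\left( \left| \beta - \frac{n\bar{Z}_n}{n + 1} \right| \right) + \max_{x \in [-\eta, \eta]} \left| \frac{x}{n + 1} \right|$$
$$= E\left( \left| \beta - \frac{n\bar{Z}_n}{n + 1} \right| \right) + \frac{\eta}{n + 1}$$
$$\le E\left( \left| \beta - \bar{Z}_n \right| \right) + E\left( \frac{\bar{Z}_n}{n + 1} \right) + \frac{\eta}{n + 1}$$
$$= \sqrt{\frac{2}{n\pi}} + \frac{\beta + \eta}{n + 1}$$
using in the final line the mean absolute deviation of the average of $n$ observations from a Gaussian distribution (Geary, 1935). Finally, the map $g : \Theta \to U$ has Lipschitz constant 1. Applying Proposition 2, we have the desired bound.
O.3 Proof of Corollary 2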
Fix arbitrary $\underline{\pi}, \overline{\pi}, \underline{q}, \overline{q}$ satisfying $0 < \underline{\pi} < \overline{\pi} < 1$ and $1/2 < \underline{q} < \overline{q} < 1$, and let $\mathcal{M}$ be the set of learning rules identified with $(\pi, q) \in [\underline{\pi}, \overline{\pi}] \times [\underline{q}, \overline{q}]$. Entering is rationalizable for a Seller with common certainty that all players have first-order beliefs in $\mathcal{B}(z_n)$ if and only if there exist $\pi, \pi' \in [\underline{\pi}, \overline{\pi}]$ and $q, q' \in [\underline{q}, \overline{q}]$ satisfying
$$\hat{v}(\pi, q, z_n) < p < \hat{v}(\pi', q', z_n) \quad (20)$$
where $\hat{v}(\pi, q, z_n)$ is as defined in (19). Let $Z_n^*$ denote the set of all sequences $z_n$ satisfying (20).
Since the signal space is binary, each empirical measure $\widehat{Q}(z_n) \in \Delta(\{0, 1\})$ can be identified with its average realization $\bar{z}_n$, which is also the probability assigned to $z = 1$. The KL-divergence between $\widehat{Q}(z_n)$ and the actual signal-generating distribution $Q = (q^*, 1 - q^*)$ is
$$D_{KL}(\widehat{Q}(z_n) \,\|\, Q) = q^* \log\left( \frac{q^*}{\bar{z}_n} \right) + (1 - q^*) \log\left( \frac{1 - q^*}{1 - \bar{z}_n} \right)$$
and this expression is monotonically increasing in $|\bar{z}_n - q^*|$. Thus, to minimize the KL-divergence, we seek the value of $\bar{z}_n$ closest to $q^*$ for which (20) is satisfied.
Suppose $\bar{z}_n > 1/2$. By assumption, $\overline{\pi} > p$ and $\overline{q} > 1/2$, so $\hat{v}(\overline{\pi}, \overline{q}, z_n) > p$. It remains to determine when $\hat{v}(\pi, q, z_n) < p$ is satisfied for some other $(\pi, q) \in \mathcal{M}$. Since $\hat{v}(\pi, q, z_n)$ is monotonically increasing in both $\pi$ and $q$ for sequences $z_n$ satisfying $\bar{z}_n > 1/2$ (and is thus minimized at $(\underline{\pi}, \underline{q})$), a necessary and sufficient condition is $\hat{v}(\underline{\pi}, \underline{q}, z_n) < p$. Using (19), this inequality requires
$$1 \Big/ \left( 1 + \frac{1 - \underline{\pi}}{\underline{\pi}} \left( \frac{1 - \underline{q}}{\underline{q}} \right)^{n(2\bar{z}_n - 1)} \right) < p$$
which can be rewritten
$$\bar{z}_n \le \frac{1}{2} \left( 1 + \frac{1}{n} \log_{(1 - \underline{q})/\underline{q}} \left( \frac{\underline{\pi}}{1 - \underline{\pi}} \cdot \frac{1 - p}{p} \right) \right) := z_n^*.$$
Since $z_n^* \cdot n$ need not be an integer, the distribution $(z_n^*, 1 - z_n^*)$ may not be achievable by any empirical measure $\widehat{Q}_n$ for finite $n$. Thus, $Q_n^*$ is instead given by $\left( \lfloor z_n^* \cdot n \rfloor / n,\ 1 - \lfloor z_n^* \cdot n \rfloor / n \right)$, and
$$D_{KL}(Q_n^* \,\|\, Q) = q^* \log\left( \frac{q^*}{\lfloor z_n^* \cdot n \rfloor / n} \right) + (1 - q^*) \log\left( \frac{1 - q^*}{1 - \lfloor z_n^* \cdot n \rfloor / n} \right).$$
Plugging in the given parameter values, and applying Proposition 3, yields the expression in the corollary.
O.4 Examples Related to Theorem 1
Part (a) of Theorem 1 provides a sufficient condition for the confidence set $[\underline{p}_n(a_i), \overline{p}_n(a_i)]$ to converge to certainty (that $a_i$ is strictly rationalizable for type $t_i^\infty$), and Part (b) of Theorem 1 provides a necessary condition (that $a_i$ is rationalizable for type $t_i^\infty$). The condition that $a_i$ is strictly rationalizable is not necessary, as I demonstrate in Section O.4.1, and the condition that $a_i$ is rationalizable is not sufficient, as I demonstrate in Section O.4.2.
In each of these examples, I assume (as in Section 5.1) that the limiting belief $\mu^\infty$ is degenerate at a limiting parameter $\theta^\infty$, and players have common certainty of shrinking neighborhoods of this parameter. That is, for every realization $z_n$, players have common certainty in the event that players have first-order beliefs with support on $C(z_n)$, where the support sets $C(z_n)$ satisfy Assumption 6.
O.4.1 Strict Rationalizability is Not Necessary
Consider the following complete information game
[Payoff matrix not recoverable from the source; in the surviving fragment, player 1's payoff in the first row is $\theta$ in both columns.]
and suppose that the limiting belief is degenerate at $\theta^\infty = 1$. Then, the action $a_1$ is strictly dominant for player 1 in the limiting complete information game, and also for all types with common certainty in the event that players have first-order beliefs with support on a small enough neighborhood of $\theta^\infty$. So Assumption 3 implies
$$\lim_{n \to \infty} [\underline{p}_n(a_i), \overline{p}_n(a_i)] = \{1\}.$$
But action $a_1$ is not strictly rationalizable for type $t_i^\infty$.
O.4.2 Rationalizability is not Sufficient
I show next that rationalizability of $a_i$ for type $t_i^\infty$ is not sufficient for the analyst's confidence set for $a_i$ to converge to certainty. Section O.4.3 provides a simple example to this effect. Define $\Theta_{a_i}$ to be the set of parameter values $\theta$ such that $a_i$ is rationalizable for player $i$ in the complete information game indexed to $\theta$. If $\theta^\infty$ is on the boundary of $\Theta_{a_i}$, then common certainty of shrinking neighborhoods around $\theta^\infty$ does not guarantee rationalizability of $a_i$. More surprisingly, common certainty in arbitrarily small open sets within the interior of $\Theta_{a_i}$ also does not guarantee rationalizability of $a_i$, and I provide an example of this in Section O.4.4. (See also the working paper of Chen and Takahashi (2020) for a nice two-player example to this effect.)
O.4.3 $\theta^\infty$ is on the Boundary of $\Theta_{a_i}$
Consider the following two-player game, parametrized by $\theta \in [\underline{\theta}, \overline{\theta}]$ for some $\underline{\theta} < 0 < \overline{\theta}$:
[Payoff matrix over actions $a$ and $b$ not recoverable from the source; in the surviving fragment, both players receive $\theta$ at the profile $(a, a)$.]
Suppose that $\theta^\infty = 0$, so that $a$ is rationalizable in the limiting complete information game, but not strictly rationalizable. It is straightforward to see that common certainty of shrinking neighborhoods of $\theta^\infty$ does not guarantee rationalizability of action $a$, as the type with common certainty of any $\theta < 0$ considers $a$ to be strictly dominated.
O.4.4 $\theta^\infty$ is in the Interior of $\Theta_{a_i}$
But even if $\theta^\infty$ is not on the boundary of the set $\Theta_{a_i}$, it may be that common certainty of a shrinking neighborhood of $\theta^\infty$ does not guarantee rationalizability of $a_i$. Consider the following four-player game. Players 1 and 2 choose between actions in $\{a, b\}$, and player 3 chooses between matrices from $\{l, r\}$. Their payoffs are:
[Payoff matrices $(l)$ and $(r)$ not recoverable from the source.]
A fourth player predicts whether players 1 and 2 chose matching actions or mismatching actions. He receives a payoff of 1 if he predicts correctly (and 0 otherwise). Player 4's action does not affect the payoffs of the other three players.
Let the state space $\Theta$ be the set of all payoff matrices given these actions, where the payoffs described above are a particular $\theta^\infty$. Match is clearly rationalizable for player 4 at $\theta^\infty$; it is also rationalizable for player 4 on a neighborhood of $\theta^\infty$ (in the Euclidean metric). Nevertheless, I will show existence of a sequence of types for player 4 with common certainty in increasingly small neighborhoods of $\theta^\infty$, given which Match fails to be rationalizable.
Along this sequence, player 4 believes that $a$ is uniquely rationalizable for player 1, while $b$ is uniquely rationalizable for player 2, so the action Match is strictly dominated.
Define $\theta_\varepsilon^1$ to be the following perturbation of the payoff matrix $\theta^\infty$ (with player 4's payoffs unchanged):
[Perturbed payoff matrices $(l)$ and $(r)$ not recoverable from the source; the surviving entries show payoffs lowered by $\varepsilon$.] (22)
Let $\theta_\varepsilon^2$ correspond to the following payoff matrix (again with player 4's payoffs unchanged):
[Perturbed payoff matrices $(l)$ and $(r)$ not recoverable from the source; the surviving entries show payoffs lowered by $\varepsilon$.]
(In more detail: player 4 chooses between $\{\text{Match}, \text{Mismatch}\}$. His payoff from Match is 1 if players 1 and 2 choose the same action (both $a$ or both $b$) and 0 otherwise; his payoff from Mismatch is 1 if players 1 and 2 chose different actions ($a$ and $b$ or flipped), and 0 otherwise.)
(On rationalizability of Match near $\theta^\infty$: suppose neither $l$ nor $r$ is strictly dominated for player 3; then, all actions are rationalizable for players 1-3, so Match is rationalizable for player 4. If either $l$ or $r$ is strictly dominated for player 3, then one of the following will be a rationalizable family: $\{l\} \times \{a\} \times \{a\} \times \{\text{Match}\}$, $\{l\} \times \{a, b\} \times \{a, b\} \times \{\text{Match}\}$, $\{r\} \times \{b\} \times \{b\} \times \{\text{Match}\}$, or $\{r\} \times \{a, b\} \times \{a, b\} \times \{\text{Match}\}$. Thus, Match is rationalizable for player 4.)
Let $\varepsilon > 0$. If player 1 has common certainty in the state $\theta_\varepsilon^1$, then $a$ is his uniquely rationalizable action: $l$ strictly dominates $r$ for player 3, given which $a$ strictly dominates $b$ for player 1. By a similar argument, if player 2 has common certainty in the state $\theta_\varepsilon^2$, then $b$ is his uniquely rationalizable action. These statements hold for $\varepsilon$ arbitrarily small. Construct a sequence of types $(t^{\varepsilon_n})$ for player 4, where each type $t^{\varepsilon_n}$ has common certainty that player 1 has common certainty in the state $\theta_{\varepsilon_n}^1$ and player 2 has common certainty in the state $\theta_{\varepsilon_n}^2$. Then, player 4 of type $t^{\varepsilon_n}$ has common certainty in an $\varepsilon_n$-neighborhood of $\theta^\infty$, but only one rationalizable action: Mismatch. Take $\varepsilon_n \to 0$ (with each $\varepsilon_n > 0$) and the desired conclusion obtains: rationalizability of Match holds at $\lim_{n \to \infty} \varepsilon_n$ but fails to hold arbitrarily far out along the sequence $\varepsilon_n$.
O.5 Extension to Common p-Belief
For each $q \in [0, 1]$, define:
$$\underline{p}_{n,q}(a_i) = P^n\left( \left\{ z_n : a_i \in S_i^\infty[t_i] \ \ \forall t_i \in T_i^{\mathcal{B},q} \right\} \right),$$
where $q = 1$ returns the definition of $\underline{p}_n(a_i)$ given in the main text.
Proposition 5.
Suppose $a_i$ is strictly rationalizable for type $t_i^\infty$, and define
$$\delta^\infty := \sup\{\delta : a_i \text{ is } \delta\text{-strictly rationalizable for type } t_i^\infty\},$$
noting that this quantity is strictly positive. Define
$$M := \sup_{a, a' \in A,\ \theta, \theta' \in \Theta,\ j \in I} |u_j(a, \theta) - u_j(a', \theta')|. \tag{24}$$
Then, for every $n \geq 1$ and $q > M/(\delta^\infty + M)$,
$$p^{n,q}(a_i) \geq 1 - \frac{M \xi q}{\delta^\infty q - (1 - q) M}\, E\left( \sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty) \right).$$

Proof.
I first demonstrate a lemma analogous to Lemma 6.
Lemma 8.
Suppose $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$. Let $B \subseteq \Delta(\Theta)$ be any set satisfying
$$\sup_{\nu \in B} d_P(\nu, \mu^\infty) \leq \frac{\delta q - (1 - q) M}{M \xi q},$$
where $M$ is as defined in (24) and $\xi$ is as defined in (5). Then, $a_i$ is rationalizable for all types $t_i \in T^{B,q}_i$.

Proof. The proof follows along similar lines to the proof of Lemma 6. Fix $\varepsilon > 0$, and consider an arbitrary set $B \subseteq \{\mu^\infty\}^{\varepsilon}$. I will show that $a_i$ is rationalizable for all types $t_i \in T^{B,q}_i$ when $\varepsilon$ is sufficiently small and $q$ is sufficiently large.

By assumption, action $a_i$ is $\delta$-strictly rationalizable for player $i$ of type $t_i^\infty$. This implies that there exists a family of sets $(R_j)_{j \in I} \subseteq \prod_{j \in I} A_j$, where $a_i \in R_i$, and for every $a_j \in R_j$ there exists a $\sigma_j : \Theta \to \Delta(A_{-j})$ satisfying
$$\operatorname{supp} \sigma_j(\theta) \subseteq R_{-j} \quad \forall \theta \in \Theta$$
and
$$\int_\Theta u_j(a_j, \sigma_j(\theta), \theta)\, d\mu^\infty - \int_\Theta u_j(a_j', \sigma_j(\theta), \theta)\, d\mu^\infty \geq \delta \quad \forall a_j' \neq a_j. \tag{25}$$

Partition the set of types $T^{B,q}_j$ into those types whose first-order beliefs belong to $B$,
$$T^1_j := \left\{ t_j \in T^{B,q}_j \mid t^1_j \in B \right\} \quad \forall j \in I,$$
and all remaining types $T^2_j := T^{B,q}_j \setminus T^1_j$. By construction, every type in $T^{B,q}_j$ assigns probability at least $q$ to $T^1_{-j}$.

I will now show that there exists a family of sets $(V_j[t_j])_{j \in I,\, t_j \in T^{B,q}_j}$ with the property that for each $k \geq 1$, player $j$, type $t_j \in T^{B,q}_j$, action $a_j \in R_j$, and mixed strategy $\alpha_j \in \Delta(A_j \setminus \{a_j\})$, there exists a measurable $\sigma_{-j} : \Theta \times T^{B,q}_{-j} \to \Delta(A_{-j})$ with

(1) $\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq V_{-j}[t_{-j}] \quad \forall (\theta, t_{-j}) \in \Theta \times T^{B,q}_{-j}$

(2) $\int_{\Theta \times T^{B,q}_{-j}} \left[ u_j(a_j, \sigma_{-j}(\theta, t_{-j}), \theta) - u_j(\alpha_j, \sigma_{-j}(\theta, t_{-j}), \theta) \right] t_j[d\theta \times dt_{-j}] \geq 0$

and moreover $V_j[t_j] = R_j$ for every player $j$ and type $t_j \in T^1_j$. Since $a_i \in R_i$ by design, it follows from Proposition 4 that for any type $t_i \in T^{B,q}_i$, the action $a_i \in S^k_i[t_i]$ for every $k$, and hence $a_i \in S^\infty_i[t_i]$, as desired.

Fix an arbitrary player $j$, $a_j \in R_j$, type $t_j \in T^{B,q}_j$, and $\alpha_j \in \Delta(A_j \setminus \{a_j\})$. Define $a_{-j} : \Theta \to A_{-j}$ to satisfy
$$a_{-j}(\theta) \in \operatorname*{argmax}_{a_{-j} \in R_{-j}} \left( u_j(a_j, a_{-j}, \theta) - u_j(\alpha_j, a_{-j}, \theta) \right) \quad \forall \theta \in \Theta.$$
Further define
$$h(\theta) := u_j(a_j, a_{-j}(\theta), \theta) - u_j(\alpha_j, a_{-j}(\theta), \theta) \quad \forall \theta \in \Theta. \tag{26}$$
Let $\sigma_{-j} : \Theta \times T^{B,q}_{-j} \to \Delta(A_{-j})$ be any conjecture with the property that $\sigma_{-j}(\theta, t_{-j})$ is a point mass at $a_{-j}(\theta)$ for every $(\theta, t_{-j}) \in \Theta \times T^1_{-j}$. The conjectures $\sigma_{-j}(\theta, t_{-j})$ for $(\theta, t_{-j}) \notin \Theta \times T^1_{-j}$ are not explicitly specified. By definition,
$$\operatorname{supp} \sigma_{-j}(\theta, t_{-j}) \subseteq R_{-j} \quad \forall (\theta, t_{-j}) \in \Theta \times T^1_{-j}.$$
Then, writing $\sigma_{-j}$ for $\sigma_{-j}(\theta, t_{-j})$ and $dt_j$ for $t_j[d\theta \times dt_{-j}]$ to shorten the display,
$$
\begin{aligned}
&\int_{\Theta \times T^{B,q}_{-j}} u_j(a_j, \sigma_{-j}, \theta)\, dt_j - \int_{\Theta \times T^{B,q}_{-j}} u_j(\alpha_j, \sigma_{-j}, \theta)\, dt_j \\
&\quad = \left( \int_{\Theta \times T^1_{-j}} u_j(a_j, \sigma_{-j}, \theta)\, dt_j + \int_{\Theta \times T^2_{-j}} u_j(a_j, \sigma_{-j}, \theta)\, dt_j \right) \\
&\qquad - \left( \int_{\Theta \times T^1_{-j}} u_j(\alpha_j, \sigma_{-j}, \theta)\, dt_j + \int_{\Theta \times T^2_{-j}} u_j(\alpha_j, \sigma_{-j}, \theta)\, dt_j \right) \\
&\quad = \left( \int_{\Theta \times T^1_{-j}} u_j(a_j, a_{-j}(\theta), \theta)\, dt_j - \int_{\Theta \times T^1_{-j}} u_j(\alpha_j, a_{-j}(\theta), \theta)\, dt_j \right) \\
&\qquad + \left( \int_{\Theta \times T^2_{-j}} u_j(a_j, \sigma_{-j}, \theta)\, dt_j - \int_{\Theta \times T^2_{-j}} u_j(\alpha_j, \sigma_{-j}, \theta)\, dt_j \right) \\
&\quad = \int_{\Theta \times T^1_{-j}} h(\theta)\, dt_j + \left( \int_{\Theta \times T^2_{-j}} u_j(a_j, \sigma_{-j}, \theta)\, dt_j - \int_{\Theta \times T^2_{-j}} u_j(\alpha_j, \sigma_{-j}, \theta)\, dt_j \right) \\
&\quad \geq \int_{\Theta \times T^1_{-j}} h(\theta)\, dt_j - M \int_{\Theta \times T^2_{-j}} dt_j,
\end{aligned}
$$
where the final inequality follows from the definitions of $h$ (as given in (26)) and $M$ (as given in (24)).

In the proof of Lemma 6, we showed that the inequality in (25) implies $\int_\Theta h(\theta)\, t_j[d\theta] \geq \delta - M \xi \varepsilon$. Since moreover $t_j$ assigns probability at least $q$ to the set $T^1_{-j}$, we can further bound
$$\int_{\Theta \times T^1_{-j}} h(\theta)\, t_j[d\theta \times dt_{-j}] - M \int_{\Theta \times T^2_{-j}} t_j[d\theta \times dt_{-j}] \geq q(\delta - M \xi \varepsilon) - (1 - q) M.$$
Thus, $a_j$ is a best reply for type $t_j$ so long as $q(\delta - M \xi \varepsilon) - (1 - q) M \geq 0$, or equivalently
$$\varepsilon \leq \frac{\delta q - (1 - q) M}{M \xi q}.$$
This bound holds across all players $j$ and actions $a_j \in R_j$.

Thus
$$
\begin{aligned}
p^{n,q}(a_i) &\geq P^n\left( \left\{ z_n : \sup_{\mu \in M} d_P(\mu(z_n), \mu^\infty) \leq \frac{\delta^\infty q - (1 - q) M}{M \xi q} \right\} \right) \\
&\geq 1 - \frac{M \xi q}{\delta^\infty q - (1 - q) M}\, E\left( \sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty) \right),
\end{aligned}
$$
using Markov's inequality in the final line.
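The final Markov step, and the role of the restriction on $q$ in Proposition 5, can be spelled out as follows. This is only a sketch. The shorthand $X_n$ and $c$ is introduced here for illustration and does not appear in the original; $\delta^\infty$ denotes the supremum defined in Proposition 5 and $\mu^\infty$ the limiting belief. It assumes $M, \xi, q > 0$, and it takes the constants as they appear in this section.

```latex
% Shorthand (hypothetical, for illustration only):
%   X_n := \sup_{\mu \in M} d_P(\mu(Z_n), \mu^\infty)   % nonnegative, since d_P is a metric
%   c   := \frac{\delta^\infty q - (1-q) M}{M \xi q}     % the threshold from Lemma 8
\begin{align*}
c > 0
  &\iff \delta^\infty q - (1-q) M > 0
   \iff q\,(\delta^\infty + M) > M
   \iff q > \frac{M}{\delta^\infty + M}, \\[4pt]
P^n(X_n \le c)
  &= 1 - P^n(X_n > c)
   \;\ge\; 1 - P^n(X_n \ge c)
   \;\ge\; 1 - \frac{E(X_n)}{c}            % Markov: X_n \ge 0 and c > 0
   \;=\; 1 - \frac{M \xi q}{\delta^\infty q - (1-q) M}\, E(X_n).
\end{align*}
```

The first line shows that the condition $q > M/(\delta^\infty + M)$ is exactly what makes the threshold $c$ positive, which Markov's inequality requires. Setting $q = 1$ gives $c = \delta^\infty/(M\xi)$, so the bound reduces to $1 - (M\xi/\delta^\infty)\, E(X_n)$, the common-certainty case.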