M Equilibrium: A theory of beliefs and choices in games
M Equilibrium
A Dual Theory of Beliefs and Choices in Games
Jacob K. Goeree and Philippos Louis November 11, 2018 (First version: December 5, 2017)
Abstract
We introduce a set-valued generalization of Nash equilibrium, called M equilibrium, which is based on ordinal monotonicity – players' choice probabilities are ranked the same as the expected payoffs based on their beliefs – and ordinal consistency – players' beliefs yield the same ranking of expected payoffs as their choices. Using results from semi-algebraic geometry, we prove there exist a finite number of M equilibria, each consisting of a finite number of connected components. Generically, M equilibria can be "color coded" by their ranks in the sense that choices and beliefs belonging to the same M equilibrium have the same color. We show that colorable M equilibria are behaviorally stable, a concept that strengthens strategic stability. Furthermore, set-valued and parameter-free M equilibrium envelopes various parametric models based on fixed points, including QRE as well as a new and computationally simpler class of models called µ Equilibrium.

We report the results of several experiments designed to contrast M-equilibrium predictions with those of existing behavioral game-theory models. A first experiment considers five variations of an asymmetric matching-pennies game that leave the predictions of Nash, various versions of QRE, and level-k unaltered. However, observed choice frequencies differ substantially and significantly across games, as do players' beliefs. Moreover, beliefs and choices are heterogeneous and beliefs do not match choices in any of the games. These findings contradict existing behavioral game-theory models but accord well with the unique M equilibrium. Follow-up experiments employ 3 × M equilibria. The belief and choice data exhibit coordination problems that could not be anticipated through the lens of existing behavioral game-theory models.

Keywords: M Equilibrium, µ Equilibrium, Behavioral Stability

Goeree: AGORA Center for Market Design, UNSW. Louis: Department of Economics, University of Cyprus. We gratefully acknowledge funding from the Australian Research Council (DP150104491). We thank colleagues and seminar participants at Monash University, UNSW, UTS, and the University of Cyprus for useful comments, and MSM and Dylan Bøwie for their hospitality. Inspiration for the title can be found at https://en.wikipedia.org/wiki/M-theory.

"Only now (1943) am I conscious my work has just been 'drawing' with paint. Drawing emphasizes lines, but in painting, lines arise as the abstract limitations of planes to conserve their great value."
– Piet Mondrian (1872 – 1944)
1. Introduction
The central solution concept of game theory, the Nash equilibrium, rests on the assumption of mutually consistent behavior: each player's choice is optimal given others' choices. While the Nash equilibrium is defined in terms of a fixed-point condition on choices (Nash, 1950), it is silent about how they come about. Presumably, players form beliefs about others' behavior and use these to optimize their choices. Seen this way, the Nash assumption of mutually consistent behavior is equivalent to the following two conditions: players' choices are best responses to their beliefs and players' beliefs are correct.

Observed choices conform to Nash-equilibrium predictions in some settings, but a large body of work in experimental game theory has documented systematic departures from Nash equilibrium (e.g. Goeree and Holt, 2001). Nasar (1998) describes how the lack of experimental support for his equilibrium concept caused Nash to lose confidence in the relevance of game theory, after which he turned to pure mathematics in his later research. Selten, who shared the 1994 Nobel Prize with Nash, likewise concluded that "game theory is for proving theorems, not for playing games."

This paper is motivated by the desire for an empirically relevant game theory. We introduce a novel solution concept, M equilibrium, which replaces the assumption of perfect maximization ("no mistakes") with an ordinal monotonicity condition – players' choice probabilities are ranked the same as their associated expected payoffs – and the assumption of perfect beliefs ("no surprises") with an ordinal consistency condition – players' beliefs yield the same ranking of expected payoffs as their choices. If the latter condition holds, which does not require beliefs to be homogeneous or correct, we say that players' beliefs support their choices.

M equilibrium is a set-valued solution concept that puts beliefs and choices on an equal footing.
Specifically, M equilibrium considers all beliefs that support a certain equilibrium choice and, dually, it considers all choices supported by a certain equilibrium belief. An M … M equilibrium for any normal-form game. We explore the geometry of M-equilibrium sets, which are examples of semi-algebraic sets (see e.g. Coste, 2002). Borrowing results from semi-algebraic geometry, we show there are a finite number of M equilibria, each consisting of a finite number of connected components. Generically, there can be an even or odd number of M equilibria, unlike fixed-point theories, such as Nash equilibrium, that generically yield an odd number of equilibria (e.g. Harsanyi, 1973). We show that Nash equilibria arise as limit points of the M-equilibrium sets. There may be fewer, as many, or more M equilibria than Nash equilibria. Moreover, an M equilibrium may contain zero, one, or more Nash equilibria. Importantly, we show the measure of the M-equilibrium choice set falls quickly with the number of players and the number of possible choices.

Footnote: The epigraph is from Mondrian's letter to art critic James Sweeney, see Bax (2001, p.20).

Generically, any M equilibrium can be "color coded" by the ranks of the equilibrium-choice profiles in the sense that all choices and beliefs that belong to the same M equilibrium must have the same color. To illustrate, consider the symmetric two-player game in Table 1 where choice and belief profiles are compositions of red (R), yellow (Y), and blue (B). This "Mondrian game" has three symmetric M equilibria: the colored "planes" in the left panel of Figure 1 show the M-equilibrium choice sets, and the right panel shows the corresponding M-equilibrium belief sets. Beliefs of a certain color support actions with the same color. The thick interior lines reflecting payoff indifferences form the boundaries of the colored planes.
In line with Mondrian's quote, drawing these lines is somewhat superfluous as all relevant information is encoded by the planes, including the Nash equilibria. For example, since all three colored planes include a vertex of the choice simplex, each color corresponds to a pure-strategy Nash equilibrium. Furthermore, the point on the edge of the choice simplex where the red (or yellow) plane borders the white plane corresponds to a degenerate mixed-strategy Nash equilibrium. These are the five symmetric Nash equilibria of the Mondrian game.

Footnote: The boundaries of the colored planes are formed together with the three lines (not shown) that divide the simplex in six equal parts, i.e. the three lines where two of the three choice probabilities are equal.

Footnote: If $(p_R, p_B, p_Y)$ denotes the symmetric equilibrium profile then the red boundary point is $(2/3, 1/3, 0)$ and the yellow boundary point is $(1/6, 0, 5/6)$.

          R       B       Y
   R    9, 9    6, 8    4, 4
   B    8, 6    8, 8    2, 4
   Y    4, 4    4, 2    5, 5

Table 1: Mondrian's "composition of Red, Yellow, and Blue" game.

Figure 1: The symmetric M-equilibrium choice sets (left) and belief sets (right) for the Mondrian game in Table 1.

We show that, generically, the interior of the M-equilibrium sets consists of choices and beliefs that are behaviorally stable. Roughly speaking, an M-equilibrium profile is behaviorally stable when small errors in implementation or perception do not destroy its equilibrium nature. In other words, an M-equilibrium profile is behaviorally stable when the profile is also an M-equilibrium profile for all nearby games. The concept of behavioral stability builds on ideas from the literature on refinements of Nash equilibrium (e.g. Van Damme, 1991), in particular Kohlberg and Mertens' (1986) "strategic stability." The latter requires that nearby games have a Nash-equilibrium profile that is close to, but not necessarily equal to, the Nash-equilibrium profile of the original game. Behavioral stability strengthens this requirement by insisting that perturbations of the game do not change the set of M-equilibrium profiles. We believe this stronger requirement is needed for M equilibrium to be empirically relevant.

We establish M equilibrium as a "meta theory" of various parametric models that rely on fixed-point conditions. In particular, we introduce a class of µ-Equilibrium models in which choice probabilities are parametric functions of the ranks of their associated expected payoffs. µ-Equilibrium choices follow from a fixed-point condition and µ-equilibrium beliefs support the fixed-point choice profile. We show that µ-equilibrium choices are easy to compute and typically supported by a continuous set of beliefs. Moreover, we prove that an M equilibrium of a given color contains all different µ-equilibrium fixed points supported by beliefs of the same color. Conversely, by varying µ, the different µ-equilibrium fixed points "fill out" the M-equilibrium choice set.

We report the results of a series of laboratory experiments that test M equilibrium.
The first experiment considers five variations of an asymmetric matching-pennies game that leave the predictions of various behavioral game-theory models (Nash, QRE, and level-k) unaltered. However, observed choice frequencies differ substantially and significantly across games, as do players' beliefs. Moreover, in none of the games do beliefs match choices, and beliefs and choices are heterogeneous in all of them. These findings contradict the behavioral game-theory models but accord well with the unique M equilibrium.

Follow-up experiments exploit the fact that there can be multiple M equilibria in games with a unique pure-strategy Nash equilibrium. In particular, the experiment employs variations of 3 × M equilibria. The belief and choice data reveal the resulting coordination problems that play no role in traditional fixed-point theories such as Nash and QRE (which both yield unique predictions in these games).

1.1. Prior Approaches
A distinct feature of M equilibrium is that beliefs and choices play a dual role. Choices satisfy an ordinal monotonicity condition, which determines the largest set of choices supported by a certain belief, and beliefs satisfy an ordinal consistency condition, which determines the largest set of beliefs that support a certain choice. In particular, M-equilibrium beliefs may differ across players and may differ from M-equilibrium choices. In other words, M equilibrium allows for "surprises" unlike Nash and QRE, which assume correct beliefs.

1.1.1. Quantal Response Equilibrium
McKelvey and Palfrey's (1995) Quantal Response Equilibrium (QRE) incorporates the possibility of errors into an equilibrium framework. In particular, the Nash best-response correspondences are replaced by smooth and increasing response functions, known as the "quantal response" or "better response" functions. This is the "QR" part. In addition, QRE retains the Nash-equilibrium assumption that beliefs are correct so that choice probabilities are derived relative to the true expected payoffs, taking into account others' error-prone behavior. This is the "E" part.

          L       R
   U    1, 6    2, 7
   D    4, 5    3, 8

Table 2: A simple dominance-solvable game.

A prominent behavioral economist once quipped "I like the Q and the R but not the E." To illustrate why QRE's equilibrium assumption is problematic, consider the dominance-solvable game in Table 2. If $p$ ($q$) denotes the probability with which Column (Row) chooses $L$ ($U$) then $p^* = q^* = 0$ in the unique Nash equilibrium. Logit QRE choice and belief probabilities, in contrast, follow from

$$q = \frac{e^{\lambda_R \pi_R(U)}}{e^{\lambda_R \pi_R(U)} + e^{\lambda_R \pi_R(D)}} = \frac{1}{1 + e^{\lambda_R (1 + 2p)}}, \qquad p = \frac{e^{\lambda_C \pi_C(L)}}{e^{\lambda_C \pi_C(L)} + e^{\lambda_C \pi_C(R)}} = \frac{1}{1 + e^{\lambda_C (3 - 2q)}}$$

with $\lambda_R$ and $\lambda_C$ non-negative "precision" or "rationality" parameters determining the sensitivity of choices with respect to expected payoffs. The exponents are the payoff differences implied by Table 2: $\pi_R(D) - \pi_R(U) = 1 + 2p$ and $\pi_C(R) - \pi_C(L) = 3 - 2q$. Note that the choice probabilities on the left are the same as those used to compute expected payoffs on the right. In other words, QRE beliefs (and choices) are fixed points of a set of transcendental equations.

First, if QRE is seen as a model of boundedly-rational players who make mistakes, it seems inconsistent to assume they are able to solve the above fixed-point equations. Indeed, not even fully rational players can solve these equations, which admit approximate numerical solutions only (and only after picking specific values for $\lambda_R$ and $\lambda_C$). Second, while players' choices may differ in equilibrium, players must have identical beliefs about the likelihood of each choice, i.e. all players' beliefs must coincide with a single (fixed) point in the probability simplex. This level of homogeneity in beliefs is unrealistic and will be falsified by any experiment that elicits beliefs as elements of the simplex, see Section 5.

Importantly, the "E" in QRE is far more demanding than the equilibrium assumption underlying Nash equilibrium. Not only does it involve solving transcendental equations rather than simply checking for consistency (of the type "if Row chooses $D$ then Column chooses $R$, and if Column chooses $R$ then Row chooses $D$"), beliefs also have to be extremely precise. Any small change in beliefs results in a different choice profile, causing the QRE fixed-point conditions to be violated.
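To make the numerical nature of the logit QRE conditions concrete, the following is a minimal sketch that solves the two equations for the game in Table 2 by simple fixed-point iteration. The function name, the starting point, the tolerance, and the iteration scheme are illustrative assumptions, not part of the paper; the payoff differences $1 + 2p$ and $3 - 2q$ are computed directly from Table 2.

```python
import math


def logit_qre_table2(lam_row, lam_col, tol=1e-12, max_iter=10_000):
    """Approximate the logit QRE of the Table 2 game by fixed-point iteration.

    p is the probability that Column chooses L, q the probability that Row
    chooses U. From Table 2: pi_R(D) - pi_R(U) = 1 + 2p and
    pi_C(R) - pi_C(L) = 3 - 2q. The iteration scheme is an assumption of this
    sketch and is only meant for moderate precision parameters.
    """
    p, q = 0.5, 0.5  # start from uniform randomization
    for _ in range(max_iter):
        q_new = 1.0 / (1.0 + math.exp(lam_row * (1.0 + 2.0 * p)))
        p_new = 1.0 / (1.0 + math.exp(lam_col * (3.0 - 2.0 * q_new)))
        if abs(p_new - p) < tol and abs(q_new - q) < tol:
            return p_new, q_new
        p, q = p_new, q_new
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    for lam in (0.0, 1.0, 10.0):
        p, q = logit_qre_table2(lam, lam)
        print(f"lambda = {lam}: p = {p:.6f}, q = {q:.6f}")
```

With zero precision both players randomize uniformly, and as the precision parameters grow the solution approaches the Nash equilibrium $p^* = q^* = 0$. Each pair $(\lambda_R, \lambda_C)$ yields a single point, which is the homogeneity in beliefs discussed in the text.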
In contrast, the Nash equilibrium is often robust to small, and possibly heterogeneous, variations in beliefs. For the game in Table 2, for instance, Row's and Column's best responses coincide with Nash-equilibrium choices for any beliefs they hold. In other words, even when Row and Column have different and incorrect beliefs about what choices may transpire, their best responses to their beliefs support the Nash equilibrium in this game.

We are not suggesting that the Nash equilibrium is a more robust or a more reliable predictor of behavior than QRE. For that, its degenerate predictions (in case of a pure-strategy Nash equilibrium) are too easily falsified by a single deviant choice. QRE avoids such zero-likelihood problems by offering a statistical theory of behavior in games. But QRE is too demanding about the "E" part by insisting that beliefs are homogeneous – even when players have different rationality parameters – and correct – even when that requires solving transcendental equations. While the "QR" part avoids zero-likelihood problems when QRE is applied to choice data, its "E" part will surely be rejected by any data on beliefs (see Section 5).

Approaches that do allow for surprises can generally be divided into "equilibrium" and "non-equilibrium" models.

1.1.2. Equilibrium Belief-Based Models
An early example of an equilibrium model of surprises is Random Belief Equilibrium (Friedman and Mezzetti, 2005). In this model, players' beliefs are draws from a distribution around a central strategy profile, called the "focus." Players best respond to their beliefs and the equilibrium condition is that expected choices coincide with the focus of the belief distributions.

Rogers, Palfrey, and Camerer (2008) introduce a family of equilibrium models where player $i$'s choice probabilities follow from a logit quantal response function with rationality parameter $\lambda_i$. In the most general formulation, called Subjective Quantal Response Equilibrium (SQRE), players have subjective beliefs about the distributions from which others' rationality parameters are drawn. The (Bayes-Nash) equilibrium condition is on players' strategies, i.e. how rationality parameters map into choices, rather than on the choices themselves.

This general model can be restricted in two possible ways to establish a connection with related models. First, in Truncated Quantal Response Equilibrium (TQRE), players have downward-looking beliefs about others' rationality parameters. Rogers, Palfrey, and Camerer (2008) show that the (non-equilibrium) Cognitive Hierarchy model can be seen as a limit case of discretized TQRE. Second, in Heterogeneous Quantal Response Equilibrium (HQRE), players have common and correct beliefs about the distributions of the rationality parameters. In this case, the equilibrium condition is on choices as in standard QRE (which arises as a limit case when the rationality-parameter distributions are degenerate).

1.1.3. Non-Equilibrium Belief-Based Models
Prominent examples include the level-k model (Stahl and Wilson, 1994, 1995; Nagel, 1995) and the related Cognitive Hierarchy model (Camerer, Ho and Chong, 2004). In these models, players differ in skill and their beliefs about others' skill levels are "downward looking." In the level-k model, level-0 randomizes or makes a "non-strategic" choice (if such a choice is easily identified). Level-1 players best respond to level-0 choices, level-2 players best respond to level-1 choices, etc. In the Cognitive Hierarchy model, level-0 randomizes while level-k players, with $k \geq 1$, assume others' skill levels follow a truncated Poisson distribution over $0, \ldots, k-1$.

Footnote: Goeree and Holt (2004) show that Noisy Introspection can be interpreted as a stochastic version of rationalizability (Bernheim, 1984; Pearce, 1984).

1.1.4. Differences with M Equilibrium

M equilibrium differs from these prior approaches in several ways. First, M equilibrium is a set-valued, rather than a fixed-point, solution concept. As a result, it is typically simpler to compute. Second, M equilibrium is a parameter-free theory, unlike the aforementioned models that require specific parametric assumptions for the quantal responses (e.g. logit-based SQRE), the belief distributions (e.g. Dirichlet-based RBE), the distribution of levels (e.g. Poisson-based CH), etc. As a result, its predictions can be confronted with (experimental) data without the need to estimate parameters. Third, M equilibrium treats beliefs in a way that does not neatly fit the "equilibrium" versus "non-equilibrium" classification. Unlike level-k type models, M equilibrium does not make ad hoc assumptions about the belief-formation process to arrive at a specific model for disequilibrium beliefs (whether it be in terms of others' "levels of strategic thinking," as in level-k and Cognitive Hierarchy, or in terms of others' "rationality parameters," as in SQRE and its descendants). And unlike RBE or QRE type models, M equilibrium does not assume correct beliefs. Rather, M equilibrium includes all beliefs that yield the same ranking of expected payoffs. The rationale is that all those beliefs support the same set of choices, and, in this sense, they sustain an equilibrium situation in which there is "no need for change."

1.2. Organization
The next section introduces the rank correspondence, which replaces the best-response correspondence and is used to implement the ordinal monotonicity condition that options with higher expected payoffs are more likely to be chosen, without specifying by how much. Stated differently, choice probabilities are ranked the same as expected payoffs, a requirement sometimes referred to as "stochastic rationality." Section 3 pairs the ordinal monotonicity condition for choices with the ordinal consistency condition for beliefs to define an M equilibrium. Existence for general normal-form games is proven and several properties and examples are discussed. Section 4 introduces a parametric class of µ-Equilibrium models where choices follow from fixed-point conditions. We show that M equilibrium "minimally envelopes" these parametric models. Section 5 reports results from various experiments designed to contrast M-equilibrium predictions with those of the existing behavioral game-theory models. Section 6 offers a summary of our results, and the concluding Section 7 discusses several extensions of the basic M-equilibrium approach and the value of M equilibrium both when its predictions accord well with the data and when they do not. The Appendices contain the proofs not shown in the main text, results from statistical tests based on the experimental data, as well as additional details about the methods used in the data analysis.
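The requirement that choice probabilities be ranked the same as expected payoffs can be illustrated on the Mondrian game of Table 1. The sketch below is a hedged illustration only: it covers symmetric profiles, takes beliefs to be correct (belief equal to choice), and ignores ties. The names `expected_payoffs`, `ordering`, and `is_ordinally_monotone` are hypothetical helpers introduced here, not notation from the paper.

```python
from fractions import Fraction as F

# Row player's payoffs in the Mondrian game (Table 1).
# Rows are the player's own choice, columns the opponent's choice, ordered R, B, Y.
PAYOFF = [
    [9, 6, 4],  # R
    [8, 8, 2],  # B
    [4, 4, 5],  # Y
]


def expected_payoffs(belief):
    """Expected payoff of R, B and Y against a belief (p_R, p_B, p_Y)."""
    return [sum(u * b for u, b in zip(row, belief)) for row in PAYOFF]


def ordering(values):
    """Indices sorted from lowest to highest value (this sketch assumes no ties)."""
    return sorted(range(len(values)), key=lambda k: values[k])


def is_ordinally_monotone(choice):
    """True if a symmetric profile is ranked the same as its own expected payoffs."""
    return ordering(choice) == ordering(expected_payoffs(choice))


if __name__ == "__main__":
    mostly_red = [F(6, 10), F(3, 10), F(1, 10)]
    reversed_profile = [F(1, 10), F(3, 10), F(6, 10)]
    print(is_ordinally_monotone(mostly_red))        # ranks of choices and payoffs agree
    print(is_ordinally_monotone(reversed_profile))  # ranks disagree
    # R and B are payoff-indifferent at (2/3, 1/3, 0), a point on the simplex' edge.
    print(expected_payoffs([F(2, 3), F(1, 3), F(0)]))
```

Exact fractions are used so that payoff comparisons are not affected by floating-point rounding. The check only covers the case of correct beliefs; an M equilibrium additionally admits all beliefs that produce the same payoff ranking.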
2. Preliminaries
Consider a finite normal-form game $G = (N, \{C_i\}_{i=1}^n, \{\Pi_i\}_{i=1}^n)$, where $N = \{1, \ldots, n\}$ is the set of players, and for $i \in N$, $C_i = \{c_{i1}, \ldots, c_{iK_i}\}$ is player $i$'s choice set and $\Pi_i : C \to \mathbb{R}$, with $C = \prod_{j=1}^n C_j$, is player $i$'s payoff function. Let $\Sigma_i$ denote the set of probability distributions over $C_i$. An element $\sigma^c_i \in \Sigma_i$ is player $i$'s choice profile, which is a mapping from $C_i$ to $\Sigma_i$, where $\sigma^c_i(c_i)$ is the probability that player $i$ chooses $c_i \in C_i$. Player $i$'s beliefs about player $j \in N$ are represented by $\sigma^b_{ij}$, which is a mapping from $C_j$ to $\Sigma_j$, where $\sigma^b_{ij}(c_j)$ is the probability that $i$ assigns to player $j$ choosing $c_j \in C_j$. The concatenation of the $\sigma^b_{ij}$ is player $i$'s belief profile, $\sigma^b_i$, which is a mapping from $C$ to $\Sigma = \prod_{j=1}^n \Sigma_j$. Player $i$'s expected payoff given belief $\sigma^b_i$ is $\pi_i(\sigma^b_i) = \sum_{c \in C} p_i(c) \Pi_i(c)$ where $p_i(c) = \prod_{j=1}^n \sigma^b_{ij}(c_j)$. Choosing $c_{ik}$ for sure is represented by the choice profile $e_{ik} = (0, \ldots, 0, 1, 0, \ldots, 0) \in \Sigma_i$ and the associated expected payoff, given belief $\sigma^b_i$ (with $\sigma^b_{ii} = e_{ik}$), is denoted $\pi_{ik}(\sigma^b_i)$. Finally, $E_i = \{e_{ik}\}_{k=1}^{K_i}$ represents all of player $i$'s pure choices and $\pi_i = (\pi_{i1}, \ldots, \pi_{iK_i}) \in \mathbb{R}^{K_i}$ denotes the vector of associated expected payoffs.

Footnote: Including player $i$'s beliefs about her own choices in $\sigma^b_i$ is done for notational convenience only. Throughout we assume player $i$'s beliefs about her own choices are correct, i.e. $\sigma^b_{ii} = \sigma^c_i$ for $i \in N$.

A player's best-response correspondence maps the vector of expected payoffs to a player's choice profile. This mapping may be point or set valued, e.g. the best-response correspondence assigns probability 1 to the choice with the strictly highest payoff but it allows for mixtures when there is a tie for the highest payoff. Player $i$'s best-response correspondence can generally be defined as the convex hull of the elements in $E_i$ that yield the highest expected payoff.
$$BR_i(\pi_i) = \mathrm{Conv}\big(\{\sigma_i \in E_i \mid \forall k, \ell \text{ such that } \sigma_{ik} > \sigma_{i\ell} \Rightarrow \pi_{ik} \geq \pi_{i\ell}\}\big) \qquad (1)$$

For the case of $K_i = 3$ possible actions, the best-response correspondence is illustrated in the top panel of Figure 2. The left-most simplex pertains to the case when there is a strict highest payoff, in which case the best response is one of the vertices of the simplex. In the middle panel, two options tie for the highest payoff and now the best-response correspondence produces an edge of the simplex. Finally, in the right panel all three options have the same payoff and the best-response correspondence is the entire simplex.

To allow for the possibility of suboptimal choices, we soften the payoff maximization rule that underlies the best-response correspondence (as in QRE) but in a manner that retains its ordinal nature (unlike QRE). Let $r_i = (1, 2, \ldots, K_i) / (\tfrac{1}{2} K_i (K_i + 1)) \in \Sigma_i$, where the normalization ensures that the entries sum to one, and let $R_i = \{r_{ik}\}_{k=1}^{K_i!}$ denote the set of vectors that result by permuting all elements of $r_i$. Then player $i$'s rank correspondence is defined as:

$$\mathrm{rank}_i(\pi_i) = \mathrm{Conv}\big(\{\sigma_i \in R_i \mid \forall k, \ell \text{ such that } \sigma_{ik} > \sigma_{i\ell} \Rightarrow \pi_{ik} \geq \pi_{i\ell}\}\big) \qquad (2)$$

The bottom panel of Figure 2 illustrates the rank correspondence for the case $K_i = 3$. If there is no tie the rank correspondence yields one of the six points in the left panel, when two payoffs tie the rank correspondence produces one of the six line segments in the middle panel, and when all three payoffs tie the rank correspondence produces the hexagon in the right panel.

Figure 2: The image of the BR correspondence (top panel) and the rank correspondence (bottom panel) for different $\pi \in \mathbb{R}^3$. If there are no ties between elements in $\pi$ then both BR and rank return a single point in the simplex. The three (six) possible such cases are depicted in the left panels. In case of exactly one tie, BR yields one of the three edges of the simplex while rank yields one of six line segments in the simplex, as shown by the middle panels. Finally, if all elements in $\pi$ are equal, then BR equals the entire simplex while rank returns a hexagon in the simplex' interior, see the right panels.

The best-response and rank correspondences share some important features. First, their images are closed and convex sets (see e.g. Figure 2), i.e. both are upper-hemicontinuous correspondences. Second, both define idempotent mappings in the sense that $BR_i \circ BR_i = BR_i$ and $\mathrm{rank}_i \circ \mathrm{rank}_i = \mathrm{rank}_i$. This result is intuitive: ranking alternatives that were already ranked results in the same outcome.

Footnote: Consider, for instance, the two-dimensional case: $BR_i(x, y) = (1, 0)$ when $x > y$, $BR_i(x, y) = (0, 1)$ when $x < y$, and $BR_i(x, y) = \bigcup_{0 \leq p \leq 1} (p, 1 - p)$ when $x = y$. Then $BR_i(1, 0) = (1, 0)$, $BR_i(0, 1) = (0, 1)$, and $\bigcup_{0 \leq p \leq 1} BR_i(p, 1 - p) = (1, 0) \cup \bigcup_{0 \leq q \leq 1} (q, 1 - q) \cup (0, 1)$, where the middle term arises at $p = 1/2$. Hence, $BR_i(BR_i(x, y)) = BR_i(x, y)$ for all $(x, y) \in \mathbb{R}^2$. A similar argument establishes $\mathrm{rank}_i(\mathrm{rank}_i(x, y)) = \mathrm{rank}_i(x, y)$ for all $(x, y) \in \mathbb{R}^2$.

The best-response and rank correspondences also differ in important ways. First, the image of the rank correspondence is contained in the interior of the simplex, i.e. all options are assigned strictly positive probability, while the best-response correspondence may assign zero probability to one or more options. Second, non-optimal options matter for the rank correspondence but not for the best-response correspondence (which is why there are only three vertices in the top-left simplex compared to six vertices in the lower-left simplex of Figure 2). Stated differently, ranking options retains all ordinal information about their expected payoffs, while some information is lost when picking only the best. As a result, $BR \circ \mathrm{rank} = BR$, but not necessarily $\mathrm{rank} \circ BR = \mathrm{rank}$.

3. M Equilibrium
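Before turning to the definition, the rank correspondence of Section 2 can be sketched in code for the tie-free case, where it returns a single point. This is an illustrative sketch under stated assumptions: the function names are hypothetical, ties (which produce line segments or polygons rather than points) are rejected rather than handled, and the rank vector is normalized by $K_i(K_i+1)/2$ so that its entries sum to one.

```python
from fractions import Fraction


def rank_vector(payoffs):
    """Tie-free rank correspondence: the option with the j-th lowest payoff
    receives probability j / (K (K + 1) / 2)."""
    K = len(payoffs)
    if len(set(payoffs)) != K:
        raise ValueError("ties yield a set of profiles; not handled in this sketch")
    total = K * (K + 1) // 2
    order = sorted(range(K), key=lambda k: payoffs[k])  # lowest payoff first
    probs = [Fraction(0)] * K
    for position, k in enumerate(order, start=1):
        probs[k] = Fraction(position, total)
    return probs


def best_response(payoffs):
    """Best response when the highest payoff is unique: a vertex of the simplex."""
    top = max(payoffs)
    if list(payoffs).count(top) != 1:
        raise ValueError("a tie for the highest payoff yields a set; not handled here")
    return [Fraction(int(v == top)) for v in payoffs]


if __name__ == "__main__":
    payoffs = [5, 9, 7]
    ranked = rank_vector(payoffs)
    print(ranked)                                         # all options get positive probability
    print(rank_vector(ranked) == ranked)                  # rank o rank = rank
    print(best_response(ranked) == best_response(payoffs))  # BR o rank = BR
```

Applying `rank_vector` to the output of `best_response` raises an error here because the best response puts equal (zero) weight on the non-optimal options, so ranking it produces a set rather than a point. This is the information loss behind the remark that $\mathrm{rank} \circ BR$ need not equal $\mathrm{rank}$.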
Let $\sigma^c$ and $\sigma^b$ denote the concatenations of players' choice and belief profiles respectively, and let $\mathrm{rank}$ and $\pi$ denote the concatenations of players' rank correspondences and profit functions. We write $\pi(\sigma^b)$ for the profile of expected payoffs based on players' beliefs and $\pi(\sigma^c)$ for the profile of expected payoffs when beliefs are correct, i.e. $\sigma^b_i = \sigma^c$ for $i \in N$. The set of possible choice profiles is $\Sigma = \prod_{i \in N} \Sigma_i$ and the set of possible belief profiles is $\Sigma^n$.

Definition 1. We say $(M^c, M^b) \subset \Sigma \times \Sigma^n$ form an M Equilibrium if they are the closures of the largest non-empty sets $M^c$ and $M^b$ that satisfy

$$\mathrm{rank}(\sigma^c) \subseteq \mathrm{rank}(\pi(\sigma^b)) = \mathrm{rank}(\pi(\sigma^c)) \qquad (3)$$

for all $\sigma^c \in M^c$, $\sigma^b \in M^b$. The set of M equilibria of $G$ is denoted $M(G) = (M^c(G), M^b(G))$.

The characterization of M equilibrium in (3) provides an intuitive generalization of Nash equilibrium. The assumption of perfect maximization, $\sigma^c \in BR(\pi(\sigma^b))$, is replaced with an ordinal monotonicity condition, $\mathrm{rank}(\sigma^c) \subseteq \mathrm{rank}(\pi(\sigma^b))$, and the assumption of perfect beliefs, $\sigma^b = \sigma^c$, is replaced with an ordinal consistency condition, $\mathrm{rank}(\pi(\sigma^b)) = \mathrm{rank}(\pi(\sigma^c))$.

Proposition 1. $M(G)$ is non-empty for any normal-form game.

Proof. Recall from Section 2 that the rank correspondence is upper-hemicontinuous and idempotent. Kakutani's (1941) fixed-point theorem implies existence of a profile, $\sigma$, that satisfies $\sigma \in \mathrm{rank}(\pi(\sigma))$, and since rank is idempotent we have $\mathrm{rank}(\sigma) \subseteq \mathrm{rank}(\pi(\sigma))$. Hence, $M^c \supseteq \{\sigma\}$ and $M^b \supseteq \{\sigma\} \times \cdots \times \{\sigma\}$.

Example 1. Consider a matching-pennies game where a Row and Column player choose Heads or Tails. Row receives 1 if their choices match and loses 1 if they don't. Column's payoffs are the negative of Row's payoffs. Let $\sigma^u$ denote the profile where each player randomizes uniformly over Heads and Tails. Players' expected payoffs are the same when evaluated at $\sigma^u$, which is thus a Nash-equilibrium profile. Moreover, $\mathrm{rank}(BR(\pi(\sigma^u))) = \mathrm{rank}(\pi(\sigma^u))$, so the Nash-equilibrium condition, $\sigma^u \in BR(\pi(\sigma^u))$, implies the M-equilibrium condition, $\mathrm{rank}(\sigma^u) \subseteq \mathrm{rank}(\pi(\sigma^u))$. It is readily verified that $\sigma^u$ is the only profile such that $\mathrm{rank}(\sigma^u) \subseteq \mathrm{rank}(\pi(\sigma^u))$. Hence, $M(G)$ consists of a single Nash-equilibrium profile in this (non-generic) example.

In general, $\mathrm{rank} \circ BR \neq \mathrm{rank}$, so the argument in Example 1 cannot be applied to show that any Nash equilibrium is an M equilibrium. For instance, for a pure-strategy Nash-equilibrium profile $\sigma$ in a game with three or more options, $\mathrm{rank}(\sigma)$ is multi-valued while $\mathrm{rank}(\pi(\sigma))$ is single-valued when the expected payoffs of non-optimal choices are unequal. Hence, $\mathrm{rank}(\sigma) \neq BR(\mathrm{rank}(\sigma))$ and $\mathrm{rank}(\sigma) \not\subseteq \mathrm{rank}(\pi(\sigma))$, so pure-strategy Nash-equilibrium profiles are generally not examples of M equilibria. Instead, they arise as boundary points of the M-equilibrium sets. The same is true for degenerate mixed-strategy Nash-equilibrium profiles that lie on the boundary of the simplex. We next present a slightly relaxed definition of M equilibrium that allows for the inclusion of these boundary cases more directly.

3.1. Semi-Algebraic Geometry of M Equilibrium
Here we present an alternative definition of M equilibrium that highlights its connection with semi-algebraic geometry. Recall that a semi-algebraic set is defined by a finite number of polynomial equalities and inequalities. Let $R = \prod_{i=1}^n R_i$ denote the set of all possible permutations of players' rank vectors. For $r = (r_{1k_1}, \ldots, r_{nk_n}) \in R$ let $\Sigma_r = \{\sigma \in \Sigma \mid r \in \mathrm{rank}(\sigma)\}$.

Footnote: Note that $\mathrm{rank}(BR(\pi(\sigma^c))) = \mathrm{rank}(\pi(\sigma^c))$ for any non-degenerate mixed-strategy Nash-equilibrium profile $\sigma^c$; such profiles are thus always part of an M equilibrium (see also Example 1).

   Left panel                          Right panel
          R       B       Y                   R       B        Y
   R    6, 6    2, 6    1, 1           R    3, 3    6, 6     9, 5
   B    6, 2    3, 3    1, 8           B    6, 6    3, 3     9, 2
   Y    1, 1    8, 1    9, 9           Y    5, 9    2, 9    10, 10

Table 3: Examples of non-generic symmetric two-player games with three possible choices. For the game in the left panel, a payoff-indifference line lies on the simplex' boundary. For the game in the right panel, a payoff-indifference line lies on one of the simplex' diagonals.

Definition 2
Fix $r \in R$. We say $(M^c_r, M^b_r) \subset \Sigma_r \times \Sigma^n$ form an $M_r$ Equilibrium if they are the closures of the largest non-empty sets $M^c_r$ and $M^b_r$ that satisfy

$$\begin{array}{lcl}
(\sigma^c_{ik} - \sigma^c_{i\ell})(\pi_{ik}(\sigma^c) - \pi_{i\ell}(\sigma^c)) > 0 & \text{or} & \sigma^c_{ik}(\pi_{ik}(\sigma^c) - \pi_{i\ell}(\sigma^c)) = 0 \\
(\pi_{ik}(\sigma^c) - \pi_{i\ell}(\sigma^c))(\pi_{ik}(\sigma^b) - \pi_{i\ell}(\sigma^b)) > 0 & \text{or} & \sigma^c_{ik}(\pi_{ik}(\sigma^b) - \pi_{i\ell}(\sigma^b)) = 0
\end{array} \qquad (4)$$

for all $k \neq \ell \in \{1, \ldots, K_i\}$, $i \in N$, $\sigma^c \in M^c_r$, and $\sigma^b \in M^b_r$. The set of M equilibria of $G$ is $M(G) = \bigcup_{r \in R} (M^c_r(G), M^b_r(G))$.

Remark 1. Definitions 1 and 2 are equivalent except for profiles that lie on the simplex' boundary. For these boundary profiles, Definition 2 relaxes the constraint that $\sigma^c$ cannot have more equal elements than $\pi(\sigma^c)$, as implied by $\mathrm{rank}(\sigma^c) \subseteq \mathrm{rank}(\pi(\sigma^c))$ in Definition 1. As a result, degenerate Nash profiles also satisfy (4) while they generally do not satisfy (3). Non-emptiness of $M(G)$ under Definition 2 thus follows from existence of Nash equilibrium.

Definition 2 shows that M equilibrium is generally characterized by a mixture of inequalities and equalities. If inequalities alone suffice to describe the M equilibrium then its choice and belief sets have full dimension: if the inequalities hold for some choice and belief profiles then, by continuity of expected payoffs, they hold for choice and belief profiles that are sufficiently close. In general, however, semi-algebraic sets can have components of various dimensions.

Example 2.
The top-left panel of Figure 3 shows the different components of the symmetric M-equilibrium choice set for the game in the left panel of Table 3. The colored lines in Figure 3 correspond to cases where an equality in (4) holds, either because two expected payoffs are equal

Footnote: Each of the $\Sigma_r$ corresponds to one of the $|R| = \prod_{i \in N} K_i!$ equally-sized parts of $\Sigma$, and $\Sigma = \cup_{r \in R} \Sigma_r$.

Figure 3: The left panel shows M-equilibrium choices and the right panel shows M-equilibrium beliefs. The top and middle panels correspond to the non-generic games in Table 3 and the bottom panel to the generic game in Table 4. The green curves indicate M-equilibrium profiles where two expected payoffs are equal, the red lines indicate M-equilibrium profiles contained in the simplex' boundary, and the grey lines indicate M-equilibrium profiles contained in one of the simplex' diagonals. The blue shaded areas correspond to interior M-equilibrium profiles.

R B Y
R B YR
14 12 30 B
12 4 32 Y
30 32 10
R B YR
32 17 22 B
17 36 8 Y
22 8 40
R B YR B
26 32 20 Y
23 20 6Table 4: A generic symmetric three-player game with three possible choices.(green) or because the profile lies on the simplex’ boundary (red). The grey lines correspondto the simplex’ diagonals. For this non-generic game, there are four M equilibria: two havedimension zero (the Nash-equilibrium profiles σ c = (1 , ,
0) and σ c = ( , , )), one hasdimension one (the closure of the set of profiles σ c = ( p, , − p ) for 0 < p < ), and one hasdimension two (the closure of the set of profiles σ c = ( σ c , σ c , σ c ) that satisfy 0 < σ c < σ c < σ c and σ c + σ c + σ c = 1). Except for the Nash-equilibrium profile σ c = (1 , , M -equilibria under Definition 1. The middle-panel shows a case where the symmetric M -equilibrium choice and belief sets are one dimensional and have measure zero. The bottompanel shows that the M -equilibrium sets can, generically, have multiple connected components. Proposition 2
For any normal-form game $G$, there is at least one and at most a finite number of M equilibria, each of which consists of a finite number of components.

Proof.
That there is at least one M equilibrium was shown in Proposition 1. That there are finitely many M equilibria follows from basic results in semi-algebraic geometry; see e.g. Coste (2002).

Coloring of M Equilibrium
The lower-dimensional M-equilibrium components shown in the top and middle panels of Figure 3 arise when a special condition is met: a payoff-indifference line coincides with the boundary of the simplex or one of its diagonals. Games for which this occurs are non-generic in the sense that if the game is perturbed slightly then these lower-dimensional components disappear and only the sets of full dimension remain. Full-dimensional sets do not disappear because the rank correspondence is single-valued on the interior of these sets and, hence, the expected payoffs can be strictly ranked. By continuity of expected payoffs, this strict ranking remains the same if the game is slightly perturbed. Also, the expected payoffs based on the supporting beliefs must have the same unique ranks, which allows us to “color” the M-equilibrium choice and belief sets by their rank vector.

Proposition 3
Generically,
(i) an M equilibrium is a set of positive measure in $\Sigma \times \Sigma^n$,
(ii) an M equilibrium is characterized or “colored” by the rank vector $r = (r^1_{k_1}, \ldots, r^n_{k_n}) \in R$,
(iii) Nash-equilibrium profiles and $\sigma^u$ are boundary points of some M-equilibrium choice set.

Example 3.
To illustrate, consider the four 2 × σ R = ( q, − q ) and σ C = ( p, − p ) denote Row’s and Column’s choice profiles, where q and p are the probabilitiesthat Row and Column choose A respectively. Then, for instance, rank ( σ R ) = ( , ) when q < and rank ( σ R ) = ( , ) when q > . Since the entries in the rank vectors add up to 1, we cancharacterize any M equilibrium by the first entries of the players’ rank vectors. That leavesfour possible M equilibria that can be color coded: ( , ) (red), ( , ) (grey), ( , ) (blue), and( , ) (yellow).In Figure 4, the left panels show the actions sets, Σ c , and the right panels show the belief sets,Σ b , for the four games in Table 5. In the left panels, the square at ( , ) indicates the uniformrandomization profile, σ u , and the disks indicate Nash equilibria. For each of the four games,the beliefs sets are shown in the same color as the actions they support. For example, in thesymmetric coordination game, A has a higher expected payoff only when a player believes thechance the other plays A exceeds . The requirement rank ( σ c ) = rank ( π ( σ b )) = rank ( π ( σ c ))then implies Σ c = Σ b = [ , as shown by the same-sized yellow squares in the top panels ofFigure 4. In contrast, B has a higher expected payoff when a player believes the chance theother plays A does not exceed . Now, the requirement rank ( σ c ) = rank ( π ( σ b )) = rank ( π ( σ c ))implies Σ c = [0 , ] and Σ b = [0 , ] as shown by the red squares in the top-left and top-rightpanels of Figure 4 respectively. To be precise, the choice sets are based on only the first entries of σ R and σ C , i.e. q for Row and p forColumn (as the two entries of the σ ’s add up to 1). Correspondingly, the belief sets are based on Row’s beliefabout p and Column’s belief about q . This allows us to depict Σ c and Σ b in two-dimensional graphs. BA
        A        B
A     2, 2     2, 1
B     1, 2     4, 4

        A        B
A     0, 0     6, 1
B     1, 10    2, 2

        A        B
A     1, 0     0, 5
B     0, 1     5, 0

        A        B
A     2, 2     0, 0
B     2, 2     1, 1

Table 5: A symmetric coordination game, an asymmetric game of chicken, an asymmetric matching-pennies game, and a non-generic game with a continuum of Nash equilibria.
R B YR
6, 6 7, 1 2, 7 B
1, 7 6, 6 7, 0 Y
7, 2 0, 7 6, 6
R B YR
6, 6 2, 20 2, 0 B
20, 2 0, 0 0, 5 Y
0, 2 5, 0 5, 5
R B YR
6, 6 0, 2 1, 2 B
2, 0 5, 5 1, 3 Y
2, 1 3, 1 4, 4Table 6: A symmetric 3 × M equilibria for the asymmetric game of chicken and the asymmetric matching-penniesgame can be worked out similarly. In the game on the far-right in Table 5, B is the weaklydominant choice for Row and A is the dominant choice for Column. The Nash equilibria are p = 1 and q ∈ [0 ,
1] and the trembling-hand-perfect equilibrium is p = 1 and q = 0. In thisnon-generic example, there is an M -equilibrium of full dimension, i.e. Σ c = [ , × [0 , ] andΣ b = [0 , , and a non-colorable M -equilibrium of lower dimension, i.e. Σ c = Σ b = { } × [0 , Example 4.
Next, consider the three symmetric 3 × M -equilibrium sets are shown in Figure 5.Different from the 2 × M -equilibrium choice sets for the 3 × Themiddle panels show a case where both players choosing “yellow” is the unique pure-strategyNash equilibrium. Most of the belief set is “blue,” however, and these beliefs support choiceprofiles where “blue”’ is the most likely choice. We will discuss this possibility in more detail in the next section where we compare M equilibrium withparametric theories such as QRE. M -equilibrium choice sets and the right panels show the M -equilibrium belief sets for the four games in Table 5. The belief sets are shown in the samecolor as the actions they support. In the left panels, the square at the center indicates theuniform randomization profile and the disks / line indicate Nash equilibria.18igure 5: The left panels show the M -equilibrium choice sets and the right panels show the M -equilibrium belief sets for the three games in Table 6. The belief sets are shown in the samecolor as the actions they support. In the left panels, the square at the center indicates theuniform randomization profile and the disks indicate Nash equilibria.19 roposition 4 Generically,(i) |M ( G ) | may be even or odd, and may be less than, equal, or greater than the number ofNash equilibria.(ii) An M -equilibrium may contain zero, one, or multiple Nash equilibria.(iii) The measure of an M -equilibrium choice set is bounded by (cid:81) ni =1 / | C i | ! (iv) In contrast, an M -equilibrium belief set may have full measure. Proof.
Properties (i)-(ii) are demonstrated by Figures 4 and 5. Property (iv) holds, for instance, for any dominance-solvable game. Property (iii) follows since each $C_i$ can be partitioned into $|C_i|!$ equally-sized subsets indexed by the ranks of the entries of the choice profiles it contains. Since $\mathrm{rank}_i(\sigma^c_i)$ must be constant on an M-equilibrium choice set for all $i \in N$, the M-equilibrium choice set must be contained in the Cartesian product of a single such subset for each player. Hence, its size cannot be larger than $\prod_{i=1}^{n} 1/|C_i|!$.

A priori, the set-valued nature of M equilibrium might have been considered a drawback as it might render its predictions non-falsifiable. Proposition 4 shows this is not the case. The size of an M-equilibrium choice set falls quickly with the number of players and the number of possible choices: the bound shrinks exponentially in the former and factorially in the latter.

Our interest in colorable M-equilibria is based on the intuition that they are the empirically relevant ones. To make this precise, we define a new stability notion called behavioral stability. For $\varepsilon > 0$, let $G(\varepsilon)$ denote the set of normal-form games that result from $G$ by perturbing any of its payoff numbers by at most $\varepsilon$.

Definition 3
We say that $(\sigma^c, \sigma^b) \in \Sigma \times \Sigma^n$ is a behaviorally stable profile of the normal-form game $G$ if there exists an $\varepsilon > 0$ such that $(\sigma^c, \sigma^b) \in \mathcal{M}(G')$ for all $G' \in G(\varepsilon)$. Let $B(G)$ denote the closure of the set of behaviorally stable profiles of $G$.

In words, a choice-belief profile is behaviorally stable for the game $G$ if it is an M-equilibrium profile for $G$ as well as for all nearby games. Note that behavioral stability is a sharpening of the “strategic stability” criterion introduced in Kohlberg and Mertens (1986). The latter requires a strategically-stable Nash profile to be “close to” the Nash equilibrium of the perturbed game. Behavioral stability, in contrast, requires the M-equilibrium profile to be an M-equilibrium of the perturbed game.

Proposition 5 If $M \in \mathcal{M}(G)$ is colorable then $M \subseteq B(G)$.

Proof.
Profiles in the interior of a colorable M-equilibrium are those for which the rank correspondence is single valued, i.e. that can be characterized by a single vector $r \in R$. This means that at those profiles, expected payoffs can be strictly ranked. Since expected payoffs are continuous in the payoff numbers, they will be ranked the same for games that are sufficiently close. Hence, the interior of $M$ is contained in $B(G)$, and since $B(G)$ is closed, $M \subseteq B(G)$.

Remark 2.
Generically, only colorable M-equilibrium profiles are behaviorally stable. For instance, for the game in the left panel of Table 3 only the profiles in the full-dimensional set in the top panel of Figure 3 are behaviorally stable. However, it is readily verified that for the matching-pennies game of Example 1, the behaviorally stable set consists of a single non-colorable M-equilibrium profile.
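The color-matching requirement of Example 3 can be checked numerically. The following is a minimal sketch, assuming the payoffs of the symmetric coordination game as read from Table 5 (A yields 2 regardless of the other's choice, B yields 1 against A and 4 against B) and considering only strictly ranked profiles. The function names are hypothetical and are not taken from the paper.

```python
from fractions import Fraction as F

# Own payoffs in the symmetric coordination game (as read from Table 5).
# PAYOFF[own][other]; index 0 is A, index 1 is B. The game is symmetric, so the
# same matrix applies to both players from their own perspective.
PAYOFF = [[F(2), F(2)],
          [F(1), F(4)]]

def expected_payoffs(prob_other_A):
    """Expected payoffs of (A, B) given the chance the other player chooses A."""
    x = prob_other_A
    return [row[0] * x + row[1] * (1 - x) for row in PAYOFF]

def order(values):
    """Strict ranking of A versus B: 'A', 'B', or None when there is a tie."""
    if values[0] > values[1]:
        return "A"
    if values[0] < values[1]:
        return "B"
    return None

def same_color(p, q, belief_about_p, belief_about_q):
    """Check rank(choices) = rank(payoffs at beliefs) = rank(payoffs at choices)
    for both players, restricted to strictly ranked profiles.
    p, q: chance that Column, Row choose A. Row holds the belief about p and
    Column holds the belief about q."""
    row = (order([q, 1 - q]),
           order(expected_payoffs(belief_about_p)),
           order(expected_payoffs(p)))
    col = (order([p, 1 - p]),
           order(expected_payoffs(belief_about_q)),
           order(expected_payoffs(q)))
    return all(t[0] is not None and t[0] == t[1] == t[2] for t in (row, col))
```

Under these payoffs, B's expected payoff is 4 - 3x when the other player chooses A with chance x, so A is strictly better exactly when x exceeds 2/3. A call such as `same_color(F(3, 4), F(3, 4), F(4, 5), F(9, 10))` therefore tests a profile in which both players favor A in choices, in beliefs, and in the payoffs induced by actual choices.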
4. Parametric Models of Stochastic Choice
Like M equilibrium, the parametric models introduced in this section obey the ordinal monotonicity condition that choice probabilities are ordered the same as their associated expected payoffs. Unlike set-valued M equilibrium, however, their predictions are based on fixed-points.

For $i \in N$, let $\mu_i \in \Sigma_i$ and let $R_i(\mu_i) = \{\mu_{ik}\}_{k=1}^{K_i!}$ denote the set of vectors that result from permuting the elements of $\mu_i$. For $i \in N$, we define player $i$'s $\mathrm{rank}^{\mu_i}_i$ correspondence as follows

$$\mathrm{rank}^{\mu_i}_i(\pi_i) = \mathrm{Conv}\big(\{\sigma_i \in R_i(\mu_i) \mid \forall k, \ell \text{ such that } \sigma_{ik} > \sigma_{i\ell} \Rightarrow \pi_{ik} \geq \pi_{i\ell}\}\big) \qquad (5)$$

The $\mathrm{rank}^{\mu_i}_i$ correspondence includes random behavior and best response behavior as special cases and satisfies a generalized idempotence condition.

Proposition 6
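For all $i \in N$,
(i) $\mathrm{rank}^{\mu_i}_i(\pi_i) = \sigma^u_i$ for all $\pi_i \in \mathbb{R}^{K_i}$ when $\mu_i = \sigma^u_i$.
(ii) $\mathrm{rank}^{\mu_i}_i(\pi_i) = \mathrm{rank}_i(\pi_i)$ for any $\mu_i \in R_i$.
(iii) $\mathrm{rank}^{\mu_i}_i(\pi_i) = BR_i(\pi_i)$ for any $\mu_i \in E_i$.
(iv) $\mathrm{rank}^{\mu_i}_i \circ \mathrm{rank}^{\mu'_i}_i = \mathrm{rank}^{\mu_i}_i$ for any $\mu_i, \mu'_i \in \Sigma_i$ such that $\mathrm{rank}_i(\mu'_i)$ is single valued.
(v) $\mathrm{rank}^{\mu_i}_i(\mu_i) = \mu_i$ for any $\mu_i \in \Sigma_i$.

For $\mu = (\mu_1, \ldots, \mu_n) \in \Sigma$, let $\mathrm{rank}^{\mu}$ denote the concatenation of the $\mathrm{rank}^{\mu_i}_i$ for $i \in N$.

Definition 4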
We say $(\sigma^c, \Sigma^b)$ is a µ-Equilibrium of the normal-form game $G$ if, for all $\sigma^b \in \Sigma^b$,

$$\sigma^c \in \mathrm{rank}^{\mu}(\pi(\sigma^b)) \qquad (6)$$

and $\Sigma^b$ is the closure of the largest set of belief profiles $\sigma^b$ for which $\mathrm{rank}^{\mu}(\pi(\sigma^b)) = \mathrm{rank}^{\mu}(\pi(\sigma^c))$. The set of all µ-equilibria of $G$ is denoted $E^{\mu}(G)$.

The image of the $\mathrm{rank}^{\mu}$ correspondence is a closed and convex set (see e.g. Figure 2 for the case of $\mu \in R$ and $\mu \in E$). Existence of a µ-equilibrium thus follows from Kakutani's (1941) fixed-point theorem. This also implies that $\Sigma^b$ is non-empty, as $\Sigma^b \ni \sigma^c$ if $\sigma^c \in \mathrm{rank}^{\mu}(\pi(\sigma^c))$.

Proposition 7 $E^{\mu}(G)$ is non-empty for any $\mu \in \Sigma$ and any normal-form game $G$.

While µ-equilibrium choice profiles are defined as fixed-points, they are easy to compute. The reason is that the right side of (6) does not vary continuously with players' beliefs but, instead, is a piecewise-constant function over the simplex that takes on only finitely many values.

Example 5.
To illustrate the simplicity of µ-equilibrium computations, consider the asymmetric game of chicken in the second panel of Table 5. Let $\sigma_C = (p, 1-p)$ and $\sigma_R = (q, 1-q)$ where $p$ ($q$) denotes the probability with which Column (Row) chooses A. To obtain a parsimonious model, let $\mu_R = \mu_C = (1^\rho, 2^\rho)/(1^\rho + 2^\rho)$ where $\rho \geq 0$. The $\mathrm{rank}^{\mu}$ responses are

$$\mathrm{rank}^{\mu}_R(p) = \begin{cases} 1 - \frac{1}{1+2^\rho} & \text{if } p < \frac{4}{5} \\ \big[\frac{1}{1+2^\rho},\, 1 - \frac{1}{1+2^\rho}\big] & \text{if } p = \frac{4}{5} \\ \frac{1}{1+2^\rho} & \text{if } p > \frac{4}{5} \end{cases} \qquad \text{and} \qquad \mathrm{rank}^{\mu}_C(q) = \begin{cases} 1 - \frac{1}{1+2^\rho} & \text{if } q < \frac{8}{9} \\ \big[\frac{1}{1+2^\rho},\, 1 - \frac{1}{1+2^\rho}\big] & \text{if } q = \frac{8}{9} \\ \frac{1}{1+2^\rho} & \text{if } q > \frac{8}{9} \end{cases}$$

The $\mathrm{rank}^{\mu}$ responses are “flat,” i.e. $\mathrm{rank}^{\mu}_R(p) = \mathrm{rank}^{\mu}_C(q) = \frac{1}{2}$, when $\rho = 0$, and they limit to standard best responses when $\rho = \infty$. Figure 6 shows Row's and Column's $\mathrm{rank}^{\mu}$ responses when $\rho$ increases from 0 in the top-left panel to 5 in the bottom-right panel. The intersection of the $\mathrm{rank}^{\mu}$ correspondences typically consists of an odd number of points (1 or 3) except at $\rho = 2$, in which case it contains $p = \frac{4}{5}$ and any $q \in [\frac{1}{5}, \frac{4}{5}]$, and at $\rho = 3$, when a bifurcation occurs and the intersection contains $q = \frac{8}{9}$ and any $p \in [\frac{1}{9}, \frac{4}{5}]$.

Figure 6: Row's (solid) and Column's (dashed) ordinal responses for the asymmetric game of chicken in the second panel of Table 5 when $\rho \in \{0, 1, 2, 3, 4, 5\}$ as indicated in the panels.

Figure 7: In the left panel, the colored (dashed) curves show the µ-equilibrium (logit-QRE) choice correspondence for the asymmetric game of chicken in the second panel of Table 5 for $0 \leq \rho \leq \infty$. The square at the center corresponds to random behavior ($\rho = 0$) and the disks correspond to Nash equilibria ($\rho = \infty$). In the right panel, the colored areas correspond to the supporting beliefs, $\Sigma^b$, for the different ranges of $\rho$.

More generally, let $\Gamma_\mu(\rho)$ denote the µ-equilibrium correspondence, which consists of a choice part and an associated supporting belief part: $\Gamma_\mu(\rho) = (\Gamma^c_\mu(\rho), \Gamma^b_\mu(\rho))$. The colored curves in the left panel of Figure 7 show the µ-equilibrium correspondence while the dashed curves show the logit-QRE correspondence. There are some similarities.
Each has a “principal branch” that starts at the center when $\rho = 0$ and ends at the pure-strategy Nash equilibrium ($p^* = 1$, $q^* = 0$) when $\rho = \infty$. And each has an additional branch that connects the other pure-strategy Nash equilibrium, ($p^* = 0$, $q^* = 1$), with the mixed equilibrium, ($p^* = \frac{4}{5}$, $q^* = \frac{8}{9}$).

There are also some differences. First, the µ-equilibrium correspondence can be computed easily and characterized analytically. More importantly, µ-equilibrium choices are generally supported by a range of beliefs. In the right panel of Figure 7, the three differently colored rectangles show the supporting beliefs that correspond to the part of the µ-equilibrium correspondence with the same color in the left panel. In contrast, the beliefs that support logit-equilibrium choices are forced to be correct and, hence, have zero measure; see the dashed curves in the right panel of Figure 7.

We next generalize the findings in Example 5. For this we need to define the two possible limit cases, i.e. $\rho = 0$ and $\rho = \infty$, more generally. The former case corresponds to random behavior, which leads us to define: $\mathrm{RAND} = (\sigma^u, \{\sigma^b \mid \mathrm{rank}(\pi(\sigma^b)) = \mathrm{rank}(\pi(\sigma^u))\})$. The latter case corresponds to best-response behavior, which leads to the following definition.

Definition 5
We say ( σ c , Σ b ) form a belief-augmented Nash equilibrium (BEAUNE) if σ c is a best response to any belief in Σ b , i.e. for all σ b ∈ Σ b , σ c ∈ BR ( π ( σ b )) (7) and Σ b is the closure of the largest set, Σ b , such that BR ( π ( σ b )) = BR ( π ( σ c )) for all σ b ∈ Σ b . The full description of the µ -equilibrium correspondence for the asymmetric game of chicken in the secondpanel of Table 5 is:Γ µ ( ρ ) = (cid:0) { − ρ , − ρ } , [0 , ] × [0 , ] (cid:1) if 0 ≤ ρ < (cid:0) { } × [ , ] , [0 , ] × { } (cid:1) if ρ = 2 (cid:0) { − ρ , ρ } , [0 , ] × [ , (cid:1) if 2 < ρ < (cid:0) { , } , [0 , ] × [ , (cid:1) ∪ (cid:0) [ , ] × { } , { } × [0 , ] (cid:1) if ρ = 3 (cid:0) { − ρ , ρ } , [ , × [0 , ] (cid:1) ∪ (cid:0) { , } , { , } (cid:1) ∪ (cid:0) { ρ , − ρ } , [0 , ] × [ , (cid:1) if ρ > other player. In the aboveexpressions, { x, y } denotes a point and [ x, y ] an interval. b = [0 , contains all possiblebeliefs, reflecting the dominant-strategy nature of the game. In contrast, the non-degeneratemixed-strategy Nash-equilibrium profile for the asymmetric matching-pennies game in Table 5is supported only by the profile itself. Proposition 8
For $i \in N$, define $\mu_i(\rho) = (1^\rho, 2^\rho, \ldots, K_i^\rho)/\sum_k k^\rho$ where $\rho \geq 0$, and let $\mu(\rho)$ denote their concatenation. The µ-equilibrium correspondence $\Gamma_\mu : \rho \rightrightarrows E^{\mu(\rho)}(G)$ has the following properties for generic games:
(i) $\Gamma_\mu(\rho)$ is upper hemicontinuous.
(ii) $|\Gamma^c_\mu(\rho)|$ is odd and $\Gamma^b_\mu(\rho)$ has strictly positive measure for almost all $\rho$.
(iii) $\Gamma_\mu(\rho)$ limits to a BEAUNE when $\rho \to \infty$ and to RAND when $\rho \to 0$.
(iv) $\Gamma_\mu(\rho)$ has a principal branch that connects RAND to exactly one BEAUNE. The other BEAUNE are connected as pairs.
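The computations of Example 5 can be sketched in a few lines. The code below is an illustrative sketch, not the authors' implementation: it assumes the chicken payoffs as read from the second panel of Table 5, handles only generic values of $\rho$ (it skips the non-generic cases $\rho = 2$ and $\rho = 3$, where the intersection contains a line segment), and uses hypothetical function names.

```python
# Asymmetric game of chicken as read from the second panel of Table 5.
# Each cell is (Row payoff, Column payoff); index 0 is A, index 1 is B.
GAME = [[(0, 0), (6, 1)],
        [(1, 10), (2, 2)]]

def payoff_gap(game, player, x):
    """Expected payoff of A minus that of B for `player` (0 = Row, 1 = Column)
    when the other player chooses A with chance x."""
    if player == 0:
        a_vs_a, a_vs_b = game[0][0][0], game[0][1][0]
        b_vs_a, b_vs_b = game[1][0][0], game[1][1][0]
    else:
        a_vs_a, a_vs_b = game[0][0][1], game[1][0][1]
        b_vs_a, b_vs_b = game[0][1][1], game[1][1][1]
    return (a_vs_a - b_vs_a) * x + (a_vs_b - b_vs_b) * (1 - x)

def indifference_point(game, player):
    """Chance x of the other player choosing A at which `player` is indifferent."""
    g0, g1 = payoff_gap(game, player, 0), payoff_gap(game, player, 1)
    return g0 / (g0 - g1)

def mu_equilibrium_choices(rho, game=GAME):
    """Choice profiles (p, q), with p and q the chance that Column and Row
    choose A, for mu_R = mu_C = (1, 2**rho) / (1 + 2**rho) and generic rho."""
    lo = 1 / (1 + 2 ** rho)
    hi = 1 - lo
    found = []
    # Pure-rank candidates: each player puts weight hi on the action with the
    # strictly higher expected payoff and weight lo on the other action.
    for col_a_high in (True, False):
        for row_a_high in (True, False):
            p = hi if col_a_high else lo
            q = hi if row_a_high else lo
            d_row, d_col = payoff_gap(game, 0, p), payoff_gap(game, 1, q)
            if d_row != 0 and d_col != 0 \
                    and (d_row > 0) == row_a_high and (d_col > 0) == col_a_high:
                found.append((p, q))
    # Mixed candidate: both players indifferent, so any weight in [lo, hi] is
    # a rank-mu response; it must contain the other's indifference point.
    p_star = indifference_point(game, 0)   # makes Row indifferent
    q_star = indifference_point(game, 1)   # makes Column indifferent
    if lo < p_star < hi and lo < q_star < hi:
        found.append((p_star, q_star))
    return found
```

Because the rank-µ responses are piecewise constant, the search reduces to checking four candidate profiles plus the interior indifference point, rather than solving a continuous fixed-point problem as in logit-QRE.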
Remark 3.
These properties are standard and mimic those of the logit-QRE correspondence; see McKelvey and Palfrey (1995, 1996). They point out that not all Nash equilibria arise as limit points of QRE, and they call those that do “approachable.” Likewise, not all BEAUNE arise as limit points of $\Gamma_\mu(\rho)$ when $\rho \to \infty$. We will see, however, that colorable BEAUNE, i.e. those that are boundary points of a colorable M-equilibrium, are approachable.

The main interest in Proposition 8 is that it offers a single-parameter alternative to logit-QRE that does not restrict beliefs to be correct. It would be more realistic, however, to assume that players' rationality parameters differ, which raises the question of what choices and beliefs occur under a heterogeneous µ-equilibrium. We next show that by varying µ, the µ-equilibrium models “fill out” the set of M equilibria.

M Equilibrium as a Meta Theory
We first compare M equilibrium with the Luce-QRE model for which analytical solutions exist for simple games.

Example 6. To glean some intuition, consider the asymmetric matching-pennies game in the third panel of Table 5. Suppose players' choice probabilities follow from expected payoffs using a Luce choice rule, e.g. the chances that Column and Row choose A are

$$p = \frac{\pi_{CA}^{\rho_C}}{\pi_{CA}^{\rho_C} + \pi_{CB}^{\rho_C}} \qquad\qquad q = \frac{\pi_{RA}^{\rho_R}}{\pi_{RA}^{\rho_R} + \pi_{RB}^{\rho_R}}$$

where $\rho_R \geq 0$ and $\rho_C \geq 0$. The resulting fixed point is

$$p^{Luce}(\rho_R, \rho_C) = \frac{1}{1 + 5^{\frac{\rho_C(1-\rho_R)}{\rho_C\rho_R+1}}} \qquad\qquad q^{Luce}(\rho_R, \rho_C) = \frac{1}{1 + 5^{\frac{\rho_R(1+\rho_C)}{\rho_C\rho_R+1}}}$$

Given these expressions, it is straightforward to show that

$$\overline{\bigcup_{\rho_R \geq 0,\, \rho_C \geq 0} \big\{ p^{Luce}(\rho_R, \rho_C),\, q^{Luce}(\rho_R, \rho_C) \big\}} = [0, \tfrac{1}{2}] \times [\tfrac{1}{6}, \tfrac{1}{2}] \,\cup\, [\tfrac{1}{2}, \tfrac{5}{6}] \times [0, \tfrac{1}{6}]$$

i.e. the closure of the set of all Luce-QRE fixed-points equals the union of the “red” and “blue” M-equilibrium choice sets in Figure 4.

While the M-equilibrium choice set envelops all Luce-QRE profiles, the modeling of beliefs differs sharply in the two models. M-equilibrium choices in the “red” set are supported by “red” beliefs while “blue” actions are supported by “blue” beliefs. In other words, beliefs and choices do not necessarily match; they just have the same color. Also, players may hold different beliefs as long as the “color” of their choices matches that of their beliefs. In contrast, the “colorblind” Luce-QRE model assumes that beliefs are homogeneous across players and correct. The same is true more generally for QRE models: if $\sigma^c$ denotes a QRE action profile then $\sigma^b_i = \sigma^c$ for all $i \in N$. As a result, the collection of all QRE beliefs will have measure zero in $\Sigma^n$ and the collection of all QRE choices can only be compared with the union of M-equilibrium choice sets irrespective of their color.

In contrast, µ-equilibrium choices are colorable in the sense that they are supported by a set of beliefs of the same color. We next show that µ-equilibrium choices of a certain color fill out the M-equilibrium choice set with the same color.

Proposition 9
Let $M \in \mathcal{M}(G)$ be colorable, i.e. characterized by a single rank vector $r \in R$, then

$$M = \overline{\bigcup_{\mathrm{rank}(\mu) = r} E^{\mu}(G)} \qquad (8)$$

BEAUNE that are boundary points of a colorable M-equilibrium are approachable, as is RAND.

Proof. Suppose $(\sigma^c, \sigma^b) \in E^{\mu}(G)$ for some $\mu$ with $\mathrm{rank}(\mu) = r$; then $\sigma^c \in \mathrm{rank}^{\mu}(\pi(\sigma^b)) = \mathrm{rank}^{\mu}(\pi(\sigma^c))$. Applying rank and using the properties of Proposition 6 yields $\mathrm{rank}(\sigma^c) \subseteq \mathrm{rank}(\pi(\sigma^b)) = \mathrm{rank}(\pi(\sigma^c))$, i.e. $(\sigma^c, \sigma^b) \in M$. This shows that $M$ contains the union of the $E^{\mu}(G)$ and, since $M$ is closed, it follows that

$$M \supseteq \overline{\bigcup_{\mathrm{rank}(\mu) = r} E^{\mu}(G)}$$

Conversely, if $(\sigma^c, \sigma^b)$ is in the interior of $M$ then we have $\mathrm{rank}(\sigma^c) = r$ and $\mathrm{rank}(\sigma^c) \subseteq \mathrm{rank}(\pi(\sigma^b)) = \mathrm{rank}(\pi(\sigma^c))$. Applying $\mathrm{rank}^{\sigma^c}$ and using the properties of Proposition 6 yields $\sigma^c \in \mathrm{rank}^{\sigma^c}(\pi(\sigma^b)) = \mathrm{rank}^{\sigma^c}(\pi(\sigma^c))$, i.e. $(\sigma^c, \sigma^b) \in E^{\sigma^c}(G)$. This shows that the interior of $M$ is contained in the union of the $E^{\mu}(G)$. Since $A \subseteq B$ implies $\overline{A} \subseteq \overline{B}$, it follows that

$$M \subseteq \overline{\bigcup_{\mathrm{rank}(\mu) = r} E^{\mu}(G)}$$

which completes the proof of (8). Any BEAUNE $(\sigma^c, \Sigma^b)$ that is a boundary point of $M$ can thus be obtained as the limit of $E^{\mu}(G)$ when $\mu \to \sigma^c$, while RAND follows when $\mu \to \sigma^u$.

Remark 4.
The proof uses property (iv) of Proposition 6, which only applies if the $\mathrm{rank}^{\mu}$ correspondence does not “lose any information,” which explains the restriction to $\mathrm{rank}(\mu)$ being single valued. We conjecture that non-colorable M equilibria follow by considering $\mu$ such that $\mathrm{rank}(\mu)$ is multi-valued. For instance, for the matching-pennies game of Example 1, the unique M-equilibrium is given by $M = E^{\sigma^u}(G)$. More generally, we conjecture that

$$\mathcal{M}(G) = \bigcup_{\mu \in \Sigma} E^{\mu}(G)$$

But our main interest is in the colorable M equilibria since, generically, these are the only ones that are behaviorally stable.

A similar comparison of M equilibrium to the union of QRE, i.e. for different formulations of the quantal response functions, is complicated by the fact that QRE assumes correct point beliefs. As a result, the set of beliefs traced out by different QRE models has the same dimension as the set of choice profiles, $\Sigma$, which has zero measure in the set of belief profiles, $\Sigma^n$. Varying over all possible QRE models will, therefore, not “fill out” the M-equilibrium belief sets.

However, we can establish an equivalence result for the union of M-equilibrium choice sets. Imposing correct beliefs, $\sigma^b_i = \sigma^c$ for $i \in N$, reduces (3) to $\mathrm{rank}(\sigma^c) = \mathrm{rank}(\pi(\sigma^c))$. Let $\mathcal{R}$ denote the infinite-dimensional space of regular quantal responses (see e.g. Goeree, Holt, and Palfrey, 2016) with typical element $R \in \mathcal{R}$, and let $QRE_R(G)$ denote the set of quantal response equilibria of $G$ with respect to $R$. Moreover, let $\widetilde{M}^c(G)$ denote the closure of the intersection of $M^c(G)$, i.e. the union of all M-equilibrium choice sets, with the simplex interior. Goeree et al. (2018) show that

$$\widetilde{M}^c(G) = \bigcup_{R \in \mathcal{R}} QRE_R(G)$$

In other words, by varying over the infinite-dimensional space of all possible quantal response functions, the QRE fixed-points essentially “fill out” the union of M-equilibrium choice sets. However, unlike µ-equilibrium (see Proposition 9), it is not possible to do this “color by color,” i.e. to fill out individual M-equilibrium choice sets. Further care must be taken in interpreting the QRE “equivalence” result, as the next example shows.

Example 7.
Consider the middle game in Table 6, in which R is dominated by a fifty-fifty combination of Y and B. Using the logit formulation, the probability a player chooses R is given by

$$p_R = \frac{1}{1 + e^{\lambda(\pi_Y - \pi_R)} + e^{\lambda(\pi_B - \pi_R)}} \leq \frac{1}{1 + 2e^{\lambda((\pi_Y + \pi_B)/2 - \pi_R)}} \leq \frac{1}{3}$$

where the first inequality follows from the convexity of the exponential function and the second from the fact that R is worse than randomizing uniformly over B and Y. Note that the 1/3 bound holds for any value of $\lambda$, so logit QRE cannot produce choice profiles with $p_R > \frac{1}{3}$. However, there is an M-equilibrium choice set with $p_B > p_R > \frac{1}{3}$; see the blue set in the middle-left panel of Figure 5. While this finding does not invalidate the result that QRE fixed-points fill out the union of M-equilibrium choice sets (it just means other-than-logit quantal responses are needed to produce blue QRE choices), it underscores the impossibility for QRE to fill out individual M-equilibrium sets “color by color.” By varying the two players' rationality parameters, logit QRE fills out the green and cyan M-equilibrium choice sets but it does not produce any choice profiles in the blue M-equilibrium choice set. Given the size of its belief set (see the middle-right panel of Figure 5), the blue M equilibrium is likely to be relevant empirically even though it cannot be reached by logit QRE. We confirm this conjecture in the next section using a variant of the game in Table 6 with an even larger blue choice and belief set. The main reason QRE is falsified, however, is that it assumes homogeneous and correct beliefs. In the experiments reported below, beliefs generally differ across subjects and beliefs differ from choices.

Footnote: The logit QRE correspondence consists of a single curve that starts at the random choice profile and crosses only the cyan and green sets to end at the unique Nash equilibrium.

Footnote: Expected payoffs are $\pi_R = \pi_B + 2 - 16p_R$ and $\pi_R = \pi_Y + 9p_R - 3$, so $\mathrm{rank}(\sigma^c) = \mathrm{rank}(\pi(\sigma^c))$ for $\sigma^c = (p_R, p_B, p_Y)$ with $p_B > p_R > \frac{1}{3}$.
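The two claims of Example 7 can be checked numerically. The sketch below assumes the row player's payoffs of the middle game as read from Table 6 and a standard logit response with a single sensitivity parameter; the function names and the sample profile are illustrative choices, not taken from the paper.

```python
import math

# Row player's payoffs in the middle game of Table 6 (as read from the table).
# Rows are own actions R, B, Y; columns are the opponent's actions R, B, Y.
PAYOFF = [[6, 2, 2],
          [20, 0, 0],
          [0, 5, 5]]

def expected_payoffs(sigma):
    """Expected payoffs of (R, B, Y) against the opponent profile sigma."""
    return [sum(u * s for u, s in zip(row, sigma)) for row in PAYOFF]

def logit_response(sigma, lam):
    """Logit choice probabilities for (R, B, Y) with sensitivity lam >= 0."""
    pi = expected_payoffs(sigma)
    top = max(pi)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (x - top)) for x in pi]
    total = sum(weights)
    return [w / total for w in weights]

def ranking(values):
    """Indices ordered from the highest to the lowest value (assumes no ties)."""
    return sorted(range(len(values)), key=lambda k: -values[k])

# A hypothetical profile with p_B > p_R > 1/3, chosen only for illustration.
SAMPLE = [0.40, 0.45, 0.15]
```

For the sample profile, the expected payoffs are ordered B, R, Y, which is also the order of the choice probabilities, so the profile satisfies $\mathrm{rank}(\sigma^c) = \mathrm{rank}(\pi(\sigma^c))$ even though no logit response assigns R a probability above one third.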
5. An Experiment
We report the results from a series of experiments to illustrate how M equilibrium provides alens through which to better understand strategic behavior in games. The experiments sharedsome common features. Sixteen participants joined each experimental session. Subjects werefirst given instructions in a power-point presentation that was read aloud. Subjects played two-player matrix games with either two or three possible choices. Each game was played for either8 rounds (the 2 × × subjects were also asked to submit their beliefs about their opponent’s choice in termsof “percentage chances.” Belief elicitation was incentivized using a generalization of a methodproposed by Wilson and Vespa (2016), which is an implementation of Hossain and Okuis’ (2013) binarized scoring rule (BSR). The BSR is incentive compatible for general risk-preferences and,hence, avoids issues of risk-aversion that plague other scoring rules (e.g. quadratic scoringrule). Wilson and Vespa’s (2016) method operationalizes the BSR for binary-choice settings All subjects played by choosing a row. For asymmetric games this simply means that Column played asRow with a transposed payoff matrix. See Schotter and Trevino (2014) and Schlag et al. (2014) for discussions of belief elicitation methods.
in a way that is simple to explain. Subjects submit the “percentage chance” with which they believe their opponent chooses each action by moving a single slider between endpoints labeled A and B. Any point on the slider corresponds to a unique chance of A being played (and B with complementary chance) with the A endpoint (B endpoint) corresponding to the belief that the opponent chooses A (B) for sure. The point chosen by the subject is then compared with two computer-generated random points on the slider. If the chosen point is closer to the opponent's actual choice (one of the endpoints) than at least one of the two randomly-drawn points then the subject receives a fixed prize. The next proposition generalizes this method so that it can be used for general normal-form games.

Proposition 10
Consider the elicitation mechanism where player $i \in N$ uses $S_i = \sum_{j \neq i} K_j$ sliders, labeled $S_{jk}$ for $j \neq i$ and $k = 1, \ldots, K_j$, to report beliefs $q_{jk}$ with $q_{jk} \geq 0$ and $\sum_{k=1}^{K_j} q_{jk} = 1$. For each slider, two (uniform) random numbers are drawn and player $i$'s belief for that slider is “correct” if the reported belief is closer to the actual outcome (0 or 1) than at least one of the random draws. If players are risk-neutral then the elicitation mechanism is incentive compatible when a prize, $P \geq 0$, is paid for all correct beliefs for any randomly selected subset of sliders. If players are not risk neutral then the elicitation mechanism is incentive compatible if a prize is paid when the stated belief is correct for a single randomly selected slider.

Proof. Let $u_i(x)$ denote player $i$'s utility of being paid a prize amount $x$ (with $u_i(0) = 0$) and let $p$ and $q$ denote the concatenations of player $i$'s true and reported beliefs respectively. Player $i$ wins a prize $P$ for slider $S_{jk}$ with chance

$$P_{jk} = p_{jk}\big(1 - (1 - q_{jk})^2\big) + (1 - p_{jk})\big(1 - q_{jk}^2\big)$$

and gets 0 with complementary probability. If all correct beliefs for a random subset $S \subseteq S_i$ of sliders pay a prize $P$ then player $i$'s expected utility of reporting $q$ when her true beliefs are $p$ is

$$U_i(p, q) = \sum_{W \subseteq S} u_i(P|W|) \binom{|S|}{|W|} \prod_{S_{jk} \in W} P_{jk} \prod_{S_{jk} \notin W} (1 - P_{jk}) \qquad (9)$$

where $W$ is the subset of selected sliders for which player $i$ wins the prize $P$. If player $i$ is risk neutral, i.e. $u_i(x) = x$, then this reduces to the expected number of wins

$$U_i(p, q) = P \frac{|S|}{|S_i|} \sum_{S_{jk} \in S_i} P_{jk}$$

Differentiating with respect to $q$ yields

$$\nabla_q U_i(p, q) = 2P \frac{|S|}{|S_i|} (p - q)$$

so truthful reporting is optimal. If player $i$ is not risk neutral then there can only be two possible payoff outcomes, 0 and $P$, for the elicitation mechanism to be incentive compatible. In other words, $|S| = 1$ and (9) reduces to

$$U_i(p, q) = \frac{u_i(P)}{|S_i|} \sum_{S_{jk} \in S_i} P_{jk}$$

and truthful reporting is again optimal.

After all subjects had submitted their choices and beliefs, they were shown their opponent's choice, the points on the belief slider chosen by the computer, and whether their submitted beliefs were close enough to their opponent's actual play to win the prize. Their payoff in each round was randomly selected to be either their payoff from the game or their payoff from the belief elicitation task. This was done to avoid subjects hedging between the two tasks. At the end of the experiment, subjects were informed of their total earnings and paid.

Heterogeneous and Incorrect Beliefs
First, consider the 2 × 2 asymmetric matching pennies (AMP) games in Table 7, which are all derived from the same parametric form shown in the top-left panel. The use of a common parametric form guarantees that the best response structure is identical across games. Hence, all games in this family have the same unique mixed-strategy Nash equilibrium. Let $p$ and $q$ denote the probability with which Column and Row choose A respectively. The Nash equilibrium for any such game is $(p^*, q^*) = (\frac{1}{2}, \frac{1}{6})$. The M-equilibrium action set is shown as the red colored area in the left panel of Figure 8. The red colored area in the right panel shows the M-equilibrium belief set.

The numbered points in the top-left (top-right) panel of Figure 8 show the average choices (beliefs) in games 1-5. The ellipse around each point represents the 95% confidence interval for the sample means.

Footnote: The study of such games has been instrumental in the development of alternative models of strategic behavior. See, for instance, Ochs (1995), Erev and Roth (1998), Goeree, Holt, and Palfrey (2003), Selten and Chmura (2008), as well as the comment by Brunner et al. (2010) and the reply by Selten et al. (2010).
Parametric form
         A              B
A    X + 10, Z      W, Z + 50
B    X, Y + 10      W + 10, Y

AMP1
         A          B
A     20, 10     10, 60
B     10, 20     20, 10

AMP2
         A          B
A     60, 10     10, 60
B     50, 20     20, 10

AMP3
         A          B
A     60, 50     10, 100
B     50, 20     20, 10

AMP4
         A          B
A     60, 50     10, 100
B     50, 60     20, 50

AMP5
         A          B
A     60, 50     50, 100
B     50, 60     60, 50

Table 7: Asymmetric Matching Pennies (AMP) games.
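The claim that all five AMP games share one mixed-strategy Nash equilibrium can be checked directly from the parametric form. The following Python sketch is illustrative only: the function names and the parameter dictionary are hypothetical labels introduced here, and the values of (X, W, Z, Y) are read off Table 7. It solves the two indifference conditions for each game.

```python
from fractions import Fraction

# (X, W, Z, Y) for each game, read off Table 7 (labels are illustrative).
AMP_PARAMS = {
    "AMP1": (10, 10, 10, 10),
    "AMP2": (50, 10, 10, 10),
    "AMP3": (50, 10, 50, 10),
    "AMP4": (50, 10, 50, 50),
    "AMP5": (50, 50, 50, 50),
}


def payoff_matrices(X, W, Z, Y):
    """Return (row, col) payoffs indexed [row action][column action], A=0, B=1."""
    row = [[X + 10, W], [X, W + 10]]
    col = [[Z, Z + 50], [Y + 10, Y]]
    return row, col


def mixed_nash(row, col):
    """Solve the indifference conditions of a 2x2 game with an interior equilibrium.

    p = probability that Column plays A (makes Row indifferent),
    q = probability that Row plays A (makes Column indifferent).
    """
    # Row: p*row[0][0] + (1-p)*row[0][1] = p*row[1][0] + (1-p)*row[1][1]
    gain_a = row[0][0] - row[1][0]  # Row's gain from A when Column plays A
    gain_b = row[1][1] - row[0][1]  # Row's gain from B when Column plays B
    p = Fraction(gain_b, gain_a + gain_b)
    # Column: q*col[0][0] + (1-q)*col[1][0] = q*col[0][1] + (1-q)*col[1][1]
    gain_b_col = col[0][1] - col[0][0]  # Column's gain from B when Row plays A
    gain_a_col = col[1][0] - col[1][1]  # Column's gain from A when Row plays B
    q = Fraction(gain_a_col, gain_a_col + gain_b_col)
    return p, q


equilibria = {
    name: mixed_nash(*payoff_matrices(*params))
    for name, params in AMP_PARAMS.items()
}

for name, (p, q) in equilibria.items():
    print(name, "p* =", p, "q* =", q)
```

Because X, W, Z and Y cancel from both indifference conditions, every game returns the same pair, namely p* = 1/2 and q* = 1/6.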
Result 1
In the AMP games:
(i) Subjects’ choices differ across games.
(ii) Subjects’ beliefs differ across games.
(iii) Subjects’ choices differ from their beliefs in all of the games.
(iv) Subjects’ choices are heterogeneous.
(v) Subjects’ beliefs are heterogeneous.
Support for these findings can be found in the Appendix. They are obviously at odds with Nash and logit-QRE, which both predict that choices and beliefs are identical and homogeneous across all five games and that beliefs match choices. (In addition, the Nash-equilibrium prediction, $(p^*, q^*) = (1/2, 1/6)$, is far from observed choice averages.) To allow for heterogeneity within the QRE framework, Rogers, Palfrey and Camerer (2009) propose several generalizations. Heterogeneous Quantal Response Equilibrium (HQRE) allows for heterogeneous choices by assuming that players’ rationality parameters, which determine the sensitivity of their logit quantal responses with respect to expected payoffs, are draws from commonly known distributions. In HQRE, players’ beliefs are thus assumed to be correct. The HQRE model is at odds with findings (i)-(iii) and (v). The most general QRE model that allows for heterogeneity in choices and beliefs is Subjective Quantal Response Equilibrium (SQRE). SQRE assumes that players have subjective beliefs about the distributions that others’ rationality parameters are drawn from. SQRE is therefore not at odds with findings (iii)-(iv), but since it is based on logit quantal responses, it predicts no change in choices and beliefs across games, which is refuted by findings (i) and (ii). Similarly, the level-k and Cognitive Hierarchy (CH) models, which are based on best responses, yield identical predictions for choices and beliefs across games, contradicting findings (i) and (ii).

Figure 8: Choices (left) and beliefs (right) in each of the five AMP games.

Result 2
The findings in Result 1 contradict the predictions of Nash equilibrium, QRE, HQRE, SQRE, level-k, and Cognitive Hierarchy, but accord well with M-equilibrium predictions.

Support: Set-valued M equilibrium easily accommodates the variations in choices and beliefs across games as well as the fact that beliefs differ from choices. As can be seen from Figure 8, average choices and beliefs fall within the M-equilibrium sets.

M equilibrium relies on the assumption of stochastic rationality. This posits that subjects will more often choose the alternative that is best given their beliefs. In Table 8 we report the fraction of best responses given stated beliefs for the five games. As can be seen, these range between .55 and .75, which is in accordance with stochastic rationality.

AMP game     1     2     3     4     5    average
Row         .61   .56   .55   .60   .58    .58
Column      .75   .73   .66   .72   .75    .72
average     .68   .65   .61   .66   .67    .65

Table 8: Fraction of best responses for each role in each of the five AMP games.

DS1
         A          B          C
A     80, 80    30, 160     20, 10
B
C     10, 20     40, 10     30, 30

DS2
         A          B          C
A     75, 75     5, 155     190, 5
B
C     5, 190    15, 180    200, 200

NL
         A          B          C
A     70, 70    60, 500     10, 50
B
C     50, 10     61, 0      30, 30

KM
         A          B          C
A
B     60, 90     90, 90     60, 90
C

Table 9: The 3 × 3 games DS1, DS2, NL, and KM.

5.2. M Equilibrium Multiplicity
Next consider the two symmetric 3 × 3 games DS1 and DS2 in Table 9. One game can be obtained from the other by adding a constant to each column (row) of Row’s (Column’s) payoffs. As a result, both have the same best-response structure and a unique Nash equilibrium in pure strategies: {C, C}. In fact, note that these games are dominance solvable. One might reasonably expect Nash to predict well and behavior to be homogeneous across the two games.

Subjects’ behavior does not support these predictions. The top-left (top-right) panel of Figure 9 shows average choices (beliefs) in the two experiments. As in the 2 × 2 games, choices and beliefs differ across games. The M equilibria for these games allow for a better understanding of these results.

The M-equilibrium choice and belief sets for these games are shown in Figure 9. Notice that there are four M-equilibrium sets. The Nash equilibrium is part of the yellow set, which is supported by a relatively small M-equilibrium belief set. The blue choice set is the furthest from Nash but is supported by the largest belief set. Multiplicity introduces the issue of strategic coordination. Two players may both be playing M-equilibrium strategies supported by the corresponding M-equilibrium beliefs, but the two could be different. In DS1, subjects seem to coordinate on the blue equilibrium, as both average choices and beliefs fall within the corresponding M-equilibrium sets. This is not the case for DS2, where average choices fall outside of any M-equilibrium choice set.

Our explanation is that even though the games share the same best-response structure, the differences in the payoff matrices induce different beliefs in the two games. In DS2, predicting the opponent’s play is harder, leading to a higher degree of strategic mis-coordination. To see why, notice that in DS1, options A and B offer an attractive “upside” and the same “downside” as C. This makes the upper-left sub-matrix focal.
In DS2, the high payoff at {C, C} makes the Nash equilibrium more salient, but the upper-left sub-matrix also remains focal because it provides safer options.

To examine these assertions, we provide a more detailed analysis of the beliefs and their nexus to choices in the experiment. We separate elicited beliefs in each of the two games into different clusters using the k-means algorithm (MacQueen, 1967). We then take the average of the choices corresponding to beliefs in each cluster. The middle and bottom panels of Figure 9 show the results of this exercise.

As one can see from the middle panels of Figure 9, in DS1 elicited beliefs are mainly concentrated in the lower side of the blue belief set. The corresponding average choices are also in the blue choice set, indicating that subjects in this game are mostly playing the blue M equilibrium. Interestingly, in the two small clusters where beliefs are outside of the blue set, the corresponding average choices are very close to the set of the same color, indicating that in some instances some subjects did play a different M equilibrium.

The bottom panels of Figure 9 correspond to DS2. As we expected, elicited beliefs here are more spread out, with a substantial number lying outside of the blue set. Nevertheless, except for two clusters (depicted by the cyan and magenta colored circles), in all other cases average choices lie within or very close to the set of the same color as the corresponding beliefs. In contrast to DS1, the blue set contains the average choices of only two of these clusters, while one of the clusters is essentially playing the M equilibrium that includes Nash.

Figure 9: The colored sets indicate the four M equilibria for the DS1 and DS2 games and the black circle at the top indicates the unique Nash equilibrium. In the top panels, the colored circles indicate the average choices (left) and beliefs (right) for DS1 (red) and DS2 (green). The middle (DS1) and bottom (DS2) panels show a k-means analysis for the DS games. The k-means algorithm was performed on the elicited belief data. Each colored circle corresponds to the average choices (left) and elicited beliefs (right) within each cluster. The size of the circles is proportional to the number of observations belonging to the particular cluster.

5.3. No Logit
We obtain similar results for the symmetric 3 × 3 “no Logit” (NL) game in Table 9. Here {C, C} is also the unique Nash equilibrium and there are again four M equilibria. These are depicted in the graphs of Figure 10.

The two with the largest belief sets also have the largest choice sets and are therefore expected to be empirically most relevant. An interesting feature of this game is that the profiles in the blue equilibrium cannot be part of any logit-QRE (see also Example 7 above). Nevertheless, as can be seen from the experimental results depicted in Figure 10, both the blue and the cyan M equilibria are empirically relevant. A k-means clustering analysis again shows strong evidence of strategic mis-coordination and reveals that the majority of choices (62%) are in the “no Logit” blue region.

As pointed out above, the issue of equilibrium multiplicity only arises in the context of M equilibrium – the “no Logit” game has a unique Nash equilibrium as well as a unique QRE. The strategic coordination problems uncovered by the choice and belief data cannot be explained through the lens of existing behavioral game-theory models.

Figure 10: “No Logit” game. The colored sets correspond to the M equilibrium sets for this game. The two graphs on the top show average choices (left) and elicited beliefs (right). The two bottom graphs show the data organized in clusters obtained using the k-means algorithm on the elicited beliefs data. Each colored circle corresponds to the average choices (left) and elicited beliefs (right) within each cluster. The size of the circles is proportional to the number of observations belonging to the particular cluster.

5.4. Stability

Next we take up the issue of stability. Definitions of stability abound, but generally the concept tries to capture the robustness of some equilibria to perturbations of game form perception (“strategic stability”). It follows that stable equilibria are the empirically valid ones. As mentioned before, profiles in the interior of M equilibrium sets are behaviorally stable, which is a stronger notion than strategic stability.

When multiple M equilibria exist, as is the case for the two DS games, stability cannot help predict the empirical relevance of any of them. Here instead, we look at a game with multiple Nash equilibria (bottom left of Table 9), identified by McLennan (2016) as one where “observed behavior will not be characterized by repetition of any one of the equilibria.” This prediction follows from the application of his index +1 principle, introduced in that paper, and because the six pure strategy Nash equilibria can be arranged in a circle, where mixtures of any adjacent pure equilibria are mixed equilibria. The KM game was originally introduced in Kohlberg and Mertens (1986). In stark contrast to this prediction of instability we find the existence of a unique symmetric and behaviorally stable M equilibrium.

Footnote: As McLennan (2016) points out, a more precise (albeit less catchy) version of the “index +1 principle” is the “Euler characteristic equals index principle.” Demichelis and Ritzberger (2003) show that the latter condition is necessary for any “natural” dynamic process of adjustment to converge. For the KM game, the Euler characteristic of the set of Nash equilibria is zero, while its index is +1.

Figure 11: KM game. The colored sets correspond to the unique symmetric M equilibrium sets for the game. The colored circles indicate the average choices (left) and beliefs (right) for each of six clusters obtained through the k-means algorithm. Each circle’s size is proportional to the number of observations in the respective cluster.

Figure 11 shows the M equilibrium for the KM game. The data from the experiment are shown here already broken down into clusters, using the same procedure as before. It is clear from the picture that the unique symmetric M equilibrium captures almost the entirety of subjects’ behavior, both in terms of choices and beliefs. This provides further support for M equilibrium, and behavioral stability, as an empirically relevant theory.
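The cluster analysis behind Figures 9 to 11 (k-means on the elicited beliefs, followed by averaging the choices within each cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the paper: it relies on a plain Lloyd's-algorithm implementation, the data and the number of clusters are hypothetical, and all function names are introduced here for illustration only.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length vectors.

    Returns (centroids, labels). Initial centroids are k sampled data points.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid in squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


def average_choices_by_cluster(beliefs, choices, k, seed=0):
    """Cluster belief vectors, then average the choice indicators per cluster."""
    centroids, labels = kmeans(beliefs, k, seed=seed)
    n_actions = len(choices[0])
    summary = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue
        avg_choice = [
            sum(choices[i][a] for i in members) / len(members)
            for a in range(n_actions)
        ]
        summary.append({
            "size": len(members),          # number of observations in the cluster
            "avg_belief": centroids[c],    # mean elicited belief over (A, B, C)
            "avg_choice": avg_choice,      # empirical choice frequencies
        })
    return summary


# Hypothetical data: 50 observations, beliefs over the opponent's (A, B, C)
# and a one-hot indicator of the subject's own choice.
_rng = random.Random(1)


def noisy_belief(center):
    raw = [max(1e-6, c + _rng.uniform(-0.05, 0.05)) for c in center]
    total = sum(raw)
    return [r / total for r in raw]


beliefs = ([noisy_belief([0.6, 0.3, 0.1]) for _ in range(30)]
           + [noisy_belief([0.1, 0.2, 0.7]) for _ in range(20)])
choices = [[1, 0, 0]] * 30 + [[0, 0, 1]] * 20

summary = average_choices_by_cluster(beliefs, choices, k=2)
for cluster in summary:
    print(cluster)
```

Each entry of the summary corresponds to one colored circle in the figures: its size field plays the role of the circle's area, and the average belief and average choice give the circle's position in the belief and choice simplices, respectively.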
6. Summary
This paper introduces a novel set-valued solution concept, M equilibrium, which is based on an ordinal monotonicity condition – players’ choice probabilities are ranked the same as their associated expected payoffs – and an ordinal consistency condition – players’ beliefs result in the same ranking of expected payoffs as their choices.

The first condition, also known as stochastic rationality, captures the idea that unbiased mistakes dilute choice probability away from better to worse options but not to the extent that they overturn their ranking. The rationale behind the second condition is that there is no reason to improve beliefs when doing so leaves choices unaffected. For instance, in the dominance-solvable game of the Introduction (see Table 2), all beliefs result in the same ranking of choices and there is no need to correctly anticipate the other’s behavior. The ordinal consistency condition underlying M equilibrium captures this “no-need-for-change” intuition.

We prove existence of a finite number of M equilibria, each with a finite number of components, for any normal-form game (Propositions 1 and 2). We show that Nash equilibria are typically not examples of M equilibria. Rather, Nash equilibria arise as “abstract limit points of the M-equilibrium planes,” to paraphrase Mondrian. There may be an even or odd number of M equilibria, fewer or more M equilibria than Nash equilibria, and an M equilibrium may contain any number of Nash equilibria. Importantly, the measure of any M-equilibrium choice set falls quickly with the number of players and the number of possible choices (Proposition 4). We show that, generically, M equilibria can be “color coded” by a single rank vector in the sense that choices and supporting beliefs are of the same color. We introduce the concept of behavioral stability, which strengthens Kohlberg and Mertens’ (1986) strategic stability, and show that colorable M equilibria are the behaviorally stable ones (Proposition 5).

We introduce a new class of parametric µ-equilibrium models, which replace players’ best-response correspondences with rank-based correspondences. We prove existence of µ equilibrium for any normal-form game (Proposition 7) and show that, unlike QRE, the µ-equilibrium choices can be easily and analytically determined. Importantly, µ-equilibrium choices are supported by a set of beliefs. As a result, µ-equilibrium beliefs may be heterogeneous and incorrect. In other words, µ equilibrium allows for “mistakes” as well as “surprises,” again unlike QRE, which assumes homogeneous and correct beliefs.

A common criticism of parametric models like µ equilibrium and QRE is that they introduce “degrees of freedom” into the theory. QRE, for instance, requires the specification of players’ quantal responses, which are elements of an infinite-dimensional space. This begs the question whether QRE can be falsified. Haile, Hortacsu, and Kosenok (2008) suggest the answer is “no,” although their proof uses quantal responses that violate stochastic monotonicity. In contrast, most of the QRE literature considers regular quantal responses that obey stochastic monotonicity, see e.g. Goeree, Holt, and Palfrey (2016). Nonetheless, the set of regular quantal responses is also infinite dimensional, so the question remains whether QRE is falsifiable.

We answer this question through the lens of M equilibrium. In particular, we show that M equilibrium, a parameter-free and set-valued solution concept, is a meta theory that minimally envelops various parametric fixed-point models, including µ equilibrium and QRE (see Proposition 9 and subsequent discussion) and other models obeying stochastic rationality.
Proposition 4 shows that the measure of any M-equilibrium choice set falls factorially fast with the number of players and possible choices, confirming that the parametric models can (easily) be falsified.

Footnote: Furthermore, Example 7 shows there exist M-equilibrium sets that contain no logit-QRE at all. The “no Logit” experiment was inspired by this example. We find that the majority of the choice data (62%) is in the “no Logit” region.

We test M equilibrium in a series of experiments. Five versions of an asymmetric matching-pennies game, which differ only by additive constants to the payoffs, show significantly different choices and beliefs. This finding is problematic for all existing behavioral game-theory models (level-k, SQRE and its HQRE/TQRE/QRE descendants, Cognitive Hierarchy), but is easily accommodated by set-valued M equilibrium.

Further experiments based on dominance-solvable 3 × 3 games with multiple M equilibria raise the possibility of mis-coordination and highlight the role of beliefs. In one version of a dominance-solvable game, elicited beliefs predict that choices are mainly in the M-equilibrium set furthest away from the unique Nash equilibrium – and they are. In an “equivalent” version, beliefs predict that choices are scattered over the four M equilibria – and they are. See Figure 9. In another experiment, beliefs predict that choices are mostly in the region that logit-QRE cannot attain – and they are. See Figure 10. A final experiment, predicted to result in unstable actions and beliefs, shows the empirical relevance of behavioral stability. The KM game has a unique M equilibrium to which both choice and belief data conform. See Figure 11. Combined, these findings confirm the potential of M equilibrium as an empirically relevant game theory.

7. Outlook

Nasar’s (1998) book “A Beautiful Mind” details how Nash was disappointed by the lack of empirical support for his solution concept, which led him to return to doing research in pure mathematics. Interestingly, some of the machinery Nash developed underlies the alternative approach pursued in this paper. For example, an M equilibrium is a semi-algebraic set with finitely many components of different dimensions (see Section 3.1, in particular, Figure 3), which are now known as “Nash cells” (e.g. Coste, 2005).

In retrospect, von Neumann’s reaction when Nash introduced him to his solution concept – “that’s just a fixed-point” – may have foreshadowed its empirical weakness. The Nash equilibrium predicts certain choice profiles without detailing how they come about – they are simply solutions to some fixed-point equations. Quantal Response Equilibrium (QRE), developed almost [...] As the experimental results reported in this paper highlight, and as von Neumann intuited, the assumption of homogeneous and correct beliefs that drives fixed-point theories like Nash and QRE is untenable. In none of the experiments reported in this paper are beliefs homogeneous or correct.

Yet choices and beliefs are more likely to be “right” than “wrong.” This is the simple premise underlying M equilibrium: players’ choice probabilities are ranked the same as the expected payoffs based on their beliefs, and players’ beliefs yield the same ranking of expected payoffs as their choices. The mathematical consequences of this simple premise are governed by semi-algebraic geometry, a field in mathematics that Nash made distinguished contributions to.
Importantly, the empirical consequences of this simple premise are corroborated by belief and choice data from several of our experiments.

Of course, M equilibrium will not be universally correct. There are well-documented cases where behavioral factors (e.g. other-regarding preferences, risk aversion, etc.) play an important role. These factors can be incorporated in an extension of the theory by replacing expected payoffs with expected utilities. But even when accounting for behavioral elements, M equilibrium is unlikely to always be correct. Given the minimal assumptions that M equilibrium imposes, the reason it fails then offers important insights about behavior. Is stochastic rationality violated or is it that beliefs do not satisfy a minimal consistency condition? Whether correct or not, M equilibrium offers a novel and promising approach toward an empirically relevant game theory.

Footnote: In Crawford’s (2018) words “QRE is a fixed point in a high-dimensional space of distributions, making its thinking justification cognitively far more demanding than for Nash equilibrium.”

References

Bax, M. (2001) Mondriaan Compleet, Atrium, Alphen aan den Rijn.
Bernheim, D.B. (1984) “Rationalizable Strategic Behavior,” Econometrica, 52(4), 1007–1028.
Brunner, C., Camerer, C.F., and Goeree, J.K. (2010) “Stationary Concepts for 2 × 2 Games: Comment,” American Economic Review, 101(2), 1029–1040.
Camerer, C.F., Ho, T.H. and Chong, J.K. (2004) “A Cognitive Hierarchy Model of Games,” The Quarterly Journal of Economics, 119(3), 861–898.
Coste, M. (2002) “An Introduction to Semi-Algebraic Geometry,” Lecture Notes, University of Rennes.
Coste, M. (2005) “Real Algebraic Sets,” Lecture Notes, University of Rennes.
Crawford, V.P. (2018) “Experiments on Cognition, Communication, Coordination, and Cooperation in Relationships,” Prepared for the Annual Review of Economics, Volume 11.
Demichelis, S. and Ritzberger, K. (2003) “From Evolutionary to Strategic Stability,” Journal of Economic Theory, 113, 51–75.
Friedman, J.W., and Mezzetti, C. (2005) “Random Belief Equilibrium in Normal-Form Games,” Games and Economic Behavior, 51(2), 296–323.
Goeree, J.K. and Holt, C.A. (2001) “Ten Little Treasures of Game Theory and Ten Intuitive Contradictions,” American Economic Review, 91(5), 1402–1422.
Goeree, J.K. and Holt, C.A. (2004) “A Model of Noisy Introspection,” Games and Economic Behavior, 46(2), 365–382.
Goeree, J.K., Holt, C.A., and Palfrey, T.R. (2003) “Risk Averse Behavior in Generalized Matching Pennies Games,” Games and Economic Behavior, 45, 97–113.
Goeree, J.K., Holt, C.A., and Palfrey, T.R. (2005) “Regular Quantal Response Equilibrium,” Experimental Economics, 8, 347–367.
Goeree, J.K., Holt, C.A., and Palfrey, T.R. (2016) Quantal Response Equilibrium: A Stochastic Theory of Games, Princeton University Press.
Goeree, J.K., Louis, P., and Zhang, J. (2018) “Noisy Introspection in the 11-20 Game,” Economic Journal, 128(611), 1509–1530.
Goeree, J.K., Holt, C.A., Louis, P., Palfrey, T.R., and Rogers, B. (2018) “Rank-Dependent Choice Equilibrium: A Non-Parametric Generalization of QRE,” in the Handbook of Research Methods and Applications in Experimental Economics, Eds. A. Schram and A. Ule, Edward Elgar Publishers, forthcoming.
Haile, P.A., Hortacsu, A., and Kosenok, G. (2008) “On the Empirical Content of Quantal Response Equilibrium,” American Economic Review, 98, 180–200.
Harsanyi, J.C. (1973) “Oddness of the Number of Equilibrium Points: A New Proof,” International Journal of Game Theory, 2, 235–250.
Kakutani, S. (1941) “A Generalization of Brouwer’s Fixed-Point Theorem,” Duke Mathematical Journal, 8(3), 457–459.
Kohlberg, E. and Mertens, J.-F. (1986) “On the Strategic Stability of Equilibria,” Econometrica, 54(5), 1003–1037.
MacQueen, J. (1967) “Some Methods for Classification and Analysis of Multivariate Observations,” Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability.
McKelvey, R.D. and Palfrey, T.R. (1995) “Quantal Response Equilibria for Normal Form Games,” Games and Economic Behavior, 10(1), 6–38.
McKelvey, R.D. and Palfrey, T.R. (1996) “A Statistical Theory of Equilibrium in Games,” Japanese Economic Review, 47(2), 186–209.
McLennan, A. (2016) “The Index +1 Principle,” Mimeo.
Nagel, R. (1995) “Unraveling in Guessing Games: An Experimental Study,” American Economic Review, 85(5), 1313–1326.
Nasar, S. (1998) A Beautiful Mind, Simon and Schuster, New York.
Nash, J.F. (1950) “Equilibrium Points in n-Person Games,” Proceedings of the National Academy of Sciences, 36(1), 48–49.
Nyarko, Y. and Schotter, A. (2002) “An Experimental Study of Belief Learning Using Elicited Beliefs,” Econometrica, 70(3), 971–1005.
Ochs, J. (1995) “Games with Unique Mixed-Strategy Equilibria: An Experimental Study,” Games and Economic Behavior, 10, 202–217.
Pearce, D.G. (1984) “Rationalizable Strategic Behavior and the Problem of Perfection,” Econometrica, 52(4), 1029–1050.
Rogers, B.W., Palfrey, T.R. and Camerer, C.F. (2009) “Heterogeneous Quantal Response Equilibrium and Cognitive Hierarchies,” Journal of Economic Theory, 144(4), 1440–1467.
Schlag, K.H., Tremewan, J. and Van der Weele, J.J. (2015) “A Penny for Your Thoughts: A Survey of Methods for Eliciting Beliefs,” Experimental Economics, 18(3), 457–490.
Schotter, A. and Trevino, I. (2014) “Belief Elicitation in the Laboratory,” Annual Review of Economics, 6(1), 103–128.
Selten, R. (1975) “A Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games,” International Journal of Game Theory, 4(1), 25–55.
Selten, R. and Chmura, T. (2008) “Stationary Concepts for 2 × 2 Games,” American Economic Review, 98(3), 938–966.
Selten, R., Chmura, T., and Goerg, S. (2010) “Stationary Concepts for 2 × 2 Games: Reply,” American Economic Review, 101(2), 1041–1044.
Stahl, D.O. and Wilson, P.W. (1994) “Experimental Evidence on Players’ Models of Other Players,” Journal of Economic Behavior and Organization, 25(3), 309–327.
Stahl, D.O. and Wilson, P.W. (1995) “On Players’ Models of Other Players: Theory and Experimental Evidence,” Games and Economic Behavior, 10(1), 218–254.
Van Damme, E.E.C. (1991) Stability and Perfection of Nash Equilibria, Second Edition, Springer-Verlag, Berlin.
Wilson, A.J. and Vespa, E. (2016) “Paired Uniformed Scoring: Implementing a Binarized Scoring Rule with Non-Mathematical Language,” working paper.

Support for Result 1